C++ How to get sizeof(enum) == sizeof(char)? - c++

I would like to know how.
I have looked at this topic, and I understand that "The choice of type is implementation-defined.", but I am curious to know how to get 1 instead of 4.

C++11 introduced a way to change the underlying type of an enum.
enum foo : char { ... };
enum class foo : char { ... };
Still, you're probably better off with using the default int in most cases.

On GCC, you can also use the 'packed' attribute to tell the compiler you care more about space than word alignment / access speed:
enum foo { ... } __attribute__((packed));
There are similar hints for other compilers too.
(This is useful when trying to avoid any C++11 features that aren't yet supported by your compiler or IDE.)

Related

std::optional & compilers - shouldn't this crash?

I'm trying to create a hiearchy of classes in which parts are optional. I would like things to automatically be created as soon as variables are set.
For that I use C++17 std::optional feature.
Now in the example below I forgot to set the "parent" (test2_inst) first, yet g++, clang and msvc all compile and run fine altough with the "not set" output.
My questions now are: am I indeed doing the wrong thing in this example? and what would the proper way of resolving this?
Or are the compilers doing the wrong thing?
#include <optional>
class test1 {
public:
class test2 {
public:
int a, b;
class test3 {
public:
int c, d;
};
test3 test3_inst;
};
std::optional<test2> test2_inst;
};
int main(int argc, char *argv[])
{
test1 *test1_inst = new test1();
// can set value
test1_inst->test2_inst->test3_inst.c = 3;
// yet optional says it is note set?
if (test1_inst->test2_inst.has_value())
printf("set\n");
else
printf("not set\n");
return 0;
}
The behaviour of optional::operator* and optional::operator-> is undefined if the optional does not contain a value.
Accesses the contained value.
Returns a pointer to the contained value.
Returns a reference to the contained value.
The behavior is undefined if *this does not contain a value.
Source: https://en.cppreference.com/w/cpp/utility/optional/operator*
shouldn't this crash?
Could. Undefined behavior can do anything. Crashing is one possibility. Not crashing and appearing to work is also a possibility.
am I indeed doing the wrong thing in this example?
Yes.
what would the proper way of resolving this?
Depends what you are trying to do. Check the optional...
if (test1_inst->test2_inst)
test1_inst->test2_inst->test3_inst.c = 3;
Or, assign its value...
test1_inst->test2_inst = test1::test2{1, 2, {3, 4}};
Or are the compilers doing the wrong thing?
No, the C++ standard gives the compilers a lot of latitude.
C++ is not a nanny language, and gives programmers enough rope to shoot themselves in the foot.

Casting a pointer to a struct into another struct type with a smaller number of fields

Basic problem
I'm in a tricky situation that requires taking a pointer to a struct mainset and turning this into a pointer to a struct subset, whose fields are a contiguous subset of the fields of mainset, starting from the first. Is such a thing possible, with well-defined behavior? I realize that this is a pretty terrible thing to do, but I have good and frustrating reasons to do it [explained at the bottom for patient readers].
My attempt an an implementation seems to work, on OS X with the clang compiler:
#include <iostream>
struct mainset {
size_t size;
uint32_t reflex_size;
};
struct subset {
size_t size;
};
using namespace std;
int main(int argc, char *argv[]) {
mainset test = {1, 1};
subset* stest = reinterpret_cast<subset*>(&test);
std::cout << stest->size << std::endl;
}
The output is indeed 1, as I expect. However, I wonder: am I just getting lucky with a particular compiler and a simple case (in reality my structs are more complicated), or will this work in general?
Also, a follow-up question: for other annoying reasons, I worry that I might need to make my larger struct
struct mainset {
uint32_t reflex_size;
size_t size;
};
instead, with the extra field coming at the front. Could my implementation be extended to work in this case? I tried replacing &test with &test+sizeof(test.reflex_size) but this didn't work; the output of the cout statement was 0.
Explanation of why I have to do this
My project uses the GSL library for linear algebra. This library makes use of structs of the form
struct gsl_block {
size_t size;
double* data;
}
and similar structs like gsl_vector and gsl_matrix. So, I've used these structs as members of my C++ classes; no problem. A recently demanded feature for my project, however, is to enable reflection to my classes with the Reflex tool, part of the ROOT ecosystem. To enable reflection for a struct like this in Reflex, I must add an annotation like
struct gsl_block {
size_t size;
double* data; //[size]
}
This annotation tells Reflex that that the length of the array is provided by the field size of the same struct. Normally that would be that, but Reflex and ROOT have a very unfortunate limitation: the length field must be 32 bit. Having been informed that this limitation won't be fixed anytime soon, and not having the time/resources to fix it myself, I'm looking for workarounds. My idea is to somehow embed a struct bit-compatible with gsl_block within a larger struct:
struct extended_gsl_block {
size_t size;
double* data; //[reflex_size]
uint32_t reflex_size;
}
and similar things for gsl_vector and gsl_matrix; I can ensure that reflex_size and size are always equal (neither is ever bigger than ~50) and Reflex will be able to parse this header correctly (I hope; if reflex_size is required to precede data as a field something more difficult would be required). Since GSL routines work with pointers to these structs, my idea is this: given a pointer extended_gsl_block*, somehow get a pointer to just the fields size and data and reinterpret_cast this into a gsl_block*.
You are in luck.
The classes you show as an example conform to the requirements of standard layout types.
You can read more here:
http://en.cppreference.com/w/cpp/language/data_members#Standard_layout
You can test this premise in the compiler with:
static_assert(std::is_standard_layout<gsl_block>::value, "not a standard layout");

C++ Help to define and use a vector of vector of a struct

My intention is to build such a data structure in C++:
struct callbackDataUnit {
std::string columnName;
std::string columnData;
};
std::vector<callbackDataUnit> callbackRow;
std::vector<callbackRow> callbackSet; <--- Invalid... It needs a type here
The compiler first complains about the lack os static on callbackRow. Even if I use static there, it still does not compile as the structure is naturally invalid
I would like to take this opportunity to understand a little more about C++ (I´m a beginner on that area), so here goes my questions:
a) Why do we need the static qualifier here ?
b) How can I solve this matrix of the first variables ? I could create 3 classes here (CallbackDataUnit, CallbackRow and CallbackSet) but I feel I would be missing real C++ power here. Would it make sense to make callbackRow a single element struct, so that it can be added to callbackSet ?
Thanks for helping
I think you want define new types, not variables.
To do that, you can use typedef or using.
typedef std::vector<callbackDataUnit> callbackRow;
typedef std::vector<callbackRow> callbackSet;
using callbackRow = std::vector<callbackDataUnit>;
using callbackSet = std::vector<callbackRow>;
If you want to just define variables, you can use:
std::vector<callbackDataUnit> callbackRow;
std::vector<decltype(callbackRow)> callbackSet;

Should I use an enum or multiple const for non-sequential constants in c++?

I'm writing porting file-io set of functions from c into a c++ class. "Magic numbers" (unnamed constants) abound.
The functions read a file header which has a number of specific entries whose locations are currently denoted by magic numbers.
I was taught by a veteran programmer a couple years back that using "magic numbers" is inherently evil, and thus, I have since tried to avoid using unnamed constants in my port. So I want to create some sort of list of constants of where the entries are stored.
So far I've come up with two solutions that seem relatively safe -- use a namespace enclosed set of constants or a namespace enclosed enum.
Can I use either solution safely? Would there be any advantages to one over the other?
e.g.
OPTION 1
namespace hdr_pos {
const unsigned int item_1_pos=4;
const unsigned int item_2_pos=8;
const unsigned int item_3_pos=12;
const unsigned int item_4_pos=24;
const unsigned int item_5_pos=32;
};
OPTION 2
namespace hdr_pos {
enum e {
item_1_pos=4,
item_2_pos=8,
item_3_pos=12,
item_4_pos=24,
item_5_pos=32
};
};
Is there anyway to prevent duplicates, to catch if I change the positions due to a future update to the file header, but forget to change one of them?
Please keep this factual and non-subjective. If there is no advantage you know of, feel free to answer that.
Note: I would use more descriptive names, of course, in my actual implementation; I just called things item_<#>_... for examples sake...
I can see two advantages to using an enum. First, some debuggers can translate constants back into enum variable names (which can make debugging easier in some cases). Also, you can declare a variable of an enumerated type which can only hold a value from that enumeration. This can give you an additional form of type checking that you wouldn't have simply by using constants.
Checking to see if a value is duplicated might depend on your particular compiler. The easiest way to do so would probably be to write an external script that will parse your enum definition and report whether or not a value is duplicated (you can run this as part of your build process if you like).
I've dealt with this situation before, for error codes.
I have seen people using enums for error codes, and this pose some issues:
you can assign an int to the enum that doesn't not correspond to any value (too bad)
the value itself is declared in a header, meaning that error code reassignment (this happens...) breaks code compatibility, you also have to take care when adding elements...
you have to define all codes in the same header, even if often times some code are naturally restricted to a small portion of the application, because enums cannot be "extended"
there is no check that a same code is not assigned twice
you cannot iterate over the various fields of an enum
When designing my error codes solution, I thus chose another road: constants in a namespace, defined in source files, which address points 2 and 3. To gain in type safety though, the constants are not int, but a specific Code class:
namespace error { class Code; }
Then I can define several error files:
// error/common.hpp
namespace error
{
extern Code const Unknown;
extern Code const LostDatabaseConnection;
extern Code const LostNASConnection;
}
// error/service1.hpp
// error/service2.hpp
I didn't solved the arbitrary cast issue though (constructor is explicit, but public), because in my case I was required to forward error codes returned by other servers, and I certainly didn't want to have to know them all (that would have been too brittle)
However I did thought about it, by making the required constructor private and enforcing the use of a builder, we're even going to get 4. and 5. in a swoop:
// error/code.hpp
namespace error
{
class Code;
template <size_t constant> Code const& Make(); // not defined here
class Code: boost::totally_ordered<Code>
{
public:
Code(): m(0) {} // Default Construction is useful, 0 is therefore invalid
bool operator<(Code const& rhs) const { return m < rhs.m; }
bool operator==(Code const& rhs) const { return m == rhs.m; }
private:
template <size_t> friend Code const& Make();
explicit Code(size_t c): m(c) { assert(c && "Code - 0 means invalid"); }
size_t m;
};
std::set<Code> const& Codes();
}
// error/privateheader.hpp (inaccessible to clients)
namespace error
{
std::set<Code>& PrivateCodes() { static std::set<Code> Set; return Set; }
std::set<Code> const& Codes() { return PrivateCodes(); }
template <size_t constant>
Code const& Make()
{
static std::pair< std::set<Code>::iterator, bool > r
= PrivateCodes().insert(Code(constant));
assert(r.second && "Make - same code redeclared");
return *(r.first);
}
}
//
// We use a macro trick to create a function whose name depends
// on the code therefore, if the same value is assigned twice, the
// linker should complain about two functions having the same name
// at the condition that both are located into the same namespace
//
#define MAKE_NEW_ERROR_CODE(name, value) \
Make<value>(); void _make_new_code_##value ();
// error/common.cpp
#include "error/common.hpp"
#include "privateheader.hpp"
namespace error
{
Code const Unkown = MAKE_NEW_ERROR_CODE(1)
/// ....
}
A tad more work (for the framework), and only link-time/run-time check of the same assignment check. Though it's easy to diagnose duplicates simply by scanning for the pattern MAKE_NEW_ERROR_CODE
Have fun!
The title of your question suggests that the main reason you have doubts about using a enum is that your constants are non-iterative. But in C++ enum types are non-iterative already. You have to jump through quite a few hoops to make an iterative enum type.
I'd say that if your constants are related by nature, then enum is a pretty good idea, regardless of whether the constants are iterative or not. The main disadvantage of enums though is total lack of type control. In many cases you might prefer to have strict control over the types of your constant values (like have them unsigned) and that's something enum can't help you with (at least yet).
One thing to keep in mind is that you can't take the address of an enum:
const unsigned* my_arbitrary_item = &item_1_pos;
If they're purely constants and require no run-time stuff (like can't init enum with non-enum value) then they should just be const unsigned ints. Of course, the enum is less typing, but that's besides the point.

Accessing any structs members at run-time

Is it possible to get access to an individual member of a struct or class without knowing the names of its member variables?
I would like to do an "offsetof(struct, tyname)" without having the struct name or member variable name hard coded amoungst other things.
thanks.
Sure. If you have a struct and you know the offset and the type of the member variable, you can access it using pointers.
struct my_struct {
int member1;
char member2;
short member3;
char member4;
}
...
struct my_struct obj;
short member3 = *((short*)((char*)&obj + 5));
That'll get the value of member3, which is 5 bytes on from the start of obj on an x86 computer. However, you want to be careful. First of all, if the struct changes, your data will be garbage. We're casting all over the place, so you get no type safety, and the compiler won't warn you if something's awry. You'll also need to make sure the compiler's not packing the struct to align variables to word boundaries, or the offset will change.
This isn't a pleasant thing to do, and I'd avoid it if I were you, but yes, it can be done.
C and C++ are compiled languages without built-in "reflection" features. This means that regardless of what you do and how you do it, one way or another the path will always start from an explicit hard-coded value, be that a member name or an compile-time offset value. That means that if you want to select a struct member based on some run-time key, you have no other choice but to manually create a mapping of some kind that would map the key value to something that identifies a concrete struct member.
In C++ in order to identify a struct member at run-time you can use such feature as pointers-to-members. In C your only choice is to use an offset value.
Another issue is, of course, specifying the type of the members, if your members can have different types. But you provided no details about that, so I can't say whether you need to deal with it or not.
We had a similar problem some years ago: A huge struct of configuration information that we wanted to reflect on. So we wrote a Perl script to find the struct, parse its members, and output a C++ file that looked like:
struct ConfField
{ const char* name;
int type;
size_t offset;
};
ConfField confFields[] = {
{ "version", eUInt32, 0 },
{ "seqID", eUInt32, 4 },
{ "timestamp", eUInt64, 8 },
// ... lots more ...
{ 0, 0, 0 }
};
And we'd feed the script with the output from gcc -E.
Nowadays, I understand that gccxml can output an XML file representing any C++ source that gcc can compile, since it actually uses the g++ front end to do the parsing. So I'd recommend pairing it with an XML-parsing script (I'd use Python with the lxml library) to find out everything you ever wanted to know about your C++ source.
Somewhere in your code you need to reference the data member in the struct. However you can create a variable that is a pointer to a struct data member and from then on you no longer need to reference it by name.
struct foo
{
int member1;
int member2;
};
typedef int (foo::*intMemberOfFoo);
intMemberOfFoo getMember()
{
if (rand() > RAND_MAX / 2) return &foo::member1;
else return &foo::member2;
}
foo f;
void do_somthing()
{
intMemberOfFoo m = getMember();
f.*m = 0;
}
The technical answer is 'yes' because C++ is Turing-complete and you can do almost anything if you try hard enough. The more practical answer is probably 'no' since there is no safe and easy way of doing exactly what you want.
I agree with GMan. What exactly are you trying to do that makes you think you need this technique?
Well you will have to set up some stuff first, but it can be done. Expanding on Samir's response
struct my_struct {
int member1;
char member2;
short member3;
char member4;
}
you can create a table of offsets:
my_struct tmp;
int my_struct_offsets[4]={
0,
(char*)&(tmp.member2)-(char*)&(tmp.member1),
(char*)&(tmp.member3)-(char*)&(tmp.member1),
(char*)&(tmp.member4)-(char*)&(tmp.member1)
}
this will take into account different alignments on different systems