What does union in C++ do in this case? - c++

In one of the classes that I am working I found some thing like this in the header file:
// Flags
union
{
DWORD _flags;
struct {
unsigned _fVar1:1;
unsigned _fVar2:1;
unsigned _fVar3:1;
unsigned _fVar4:1;
};
};
In some of the class's member functions, I have seen _flags being set directly like _flags = 3;.
I have also seen the members in the struct being set directly, like _fVar1 = 0 and being compared against.
I am trying to remove _fVar1, I am not sure what it will do to other places where _flags and other _fVar# are accessed or set.
For instance, does setting _flags = 3 means that _fVar1 and _fVar2 will be 1 and _fVar3 and _fVar4 will be 0? Would removing or adding to the struct means I have to make corresponding changes to codes that touches any of the other members in the union?

Anonymous member structs (classes) are not allowed in C++, so the program is ill-formed as far as the standard is concerned.
Accessing non-active member of a union has undefined behaviour.
So in short: Whatever it does is up to the compiler.
Both of those are allowed in C (the former wasn't allowed until C11, the latter until C99), and by some compilers, as an extension in C++ (and as an extension in earlier versions of C). Let us assume that you use such compiler.
For instance, does setting _flags = 3 means that _fVar1 and _fVar2 will be 1 and _fVar3 and _fVar4 will be 0?
That is probably the intention. However, the behaviour depends on the representation that the compiler has chosen for the bit fields.
Without making assumptions about the representation, the only sensible thing that you can use the union for is to set all flags to 0 (_flags = 0), or all flags to 1 (_flags = -1).
Would removing or adding to the struct means I have to make corresponding changes to codes that touches any of the other members in the union?
Yes, unless the code touches all of the members equally, like the two examples above.

There is nothing very special about it. This is simply a union of an integer variable and a struct with bit fields.
Every bitfield in the struct is one bit length, so it can be used to access individual bits in the integer.

Related

The reason for explicitly start C/C++ enum at 0

I know that c and c++ standards state that if you don't specify first element's value a start value of enum will default to 0.
But e.g. in Linux kernel sources I faced strange declarations dozens of times. e.g. numa_faults_stats:
enum numa_faults_stats {
NUMA_MEM = 0,
NUMA_CPU,
NUMA_MEMBUF,
NUMA_CPUBUF
};
What is the need for explicitly set first element of this enum to 0?
Related post.
There are very many rules for various things in C and C++: this being one of them. Sometimes it's nice to be explicit, for clarity.
Another common one is to use variable names in function prototypes (only the types are needed). Yet another is a return 0; in main in either language. The explicit use of public and private in a C++ class or struct is another.
You can use enums without care its value like only using it comprasions with each other. But sometimes its value is important. You may use is as an index of an array. eg.
struct NUMA Numa[N];
Numa[NUMA_MEM];
Numa[NUMA_CPU];
In this case it is definitly good idea explicitly assing value even it is default equal. You emphasize that its value has usage in code.

Do the C++ standards guarantee that unused private fields will influence sizeof?

Consider the following struct:
class Foo {
int a;
};
Testing in g++, I get that sizeof(Foo) == 4 but is that guaranteed by the standard? Would a compiler be allowed to notice that a is an unused private field and remove it from the in-memory representation of the class (leading to a smaller sizeof)?
I don't expect any compilers to actually do that kind of optimization but this question popped up in a language lawyering discussion so now I'm curious.
The C++ standard doesn't define a lot about memory layouts. The fundamental rule for this case is item 4 under section 9 Classes:
4 Complete objects and member subobjects of class type shall have nonzero size. [ Note: Class objects can be assigned, passed as arguments to functions, and returned by functions (except objects of classes for which copying or moving has been restricted; see 12.8). Other plausible operators, such as equality comparison, can be defined by the user; see 13.5. — end note ]
Now there is one more restriction, though: Standard-layout classes. (no static elements, no virtuals, same visibility for all members) Section 9.2 Class members requires layout compatibility between different classes for standard-layout classes. This prevents elimination of members from such classes.
For non-trivial non-standard-layout classes I see no further restriction in the standard. The exact behavior of sizeof(), reinterpret_cast(), ... are implementation defined (i.e. 5.2.10 "The mapping function is implementation-defined.").
The answer is yes and no. A compiler could not exhibit exactly that behaviour within the standard, but it could do so partly.
There is no reason at all why a compiler could not optimise away the storage for the struct if that storage is never referenced. If the compiler gets its analysis right, then no program that you could write would ever be able to tell whether the storage exists or not.
However, the compiler cannot report a smaller sizeof() thereby. The standard is pretty clear that objects have to be big enough to hold the bits and bytes they contain (see for example 3.9/4 in N3797), and to report a sizeof smaller than that required to hold an int would be wrong.
At N3797 5.3.2:
The sizeof operator yields the number of bytes in the object
representation of its operand
I do not se that 'representation' can change according to whether the struct or member is referenced.
As another way of looking at it:
struct A {
int i;
};
struct B {
int i;
};
A a;
a.i = 0;
assert(sizeof(A)==sizeof(B));
I do not see that this assert can be allowed to fail in a standards-conforming implementation.
If you look at templates, you'll notice that "optimization" of such often ends up with nearly nothing in the output even though the template files may be thousands of lines...
I think that the optimization you are talking about will nearly always occur in a function when the object is used on the stack and the object doesn't get copied or passed down to another function and the private field is never accessed (not even initialized... which could be viewed as a bug!)

What is the difference between a proper defined union and a reinterpret_cast?

Can you propose at least 1 scenario where there is a substantial difference between
union {
T var_1;
U var_2;
}
and
var_2 = reinterpret_cast<U> (var_1)
?
The more i think about this, the more they look like the same thing to me, at least from a practical viewpoint.
One difference that I found is that while the union size is big as the biggest data type in terms of size, the reinterpret_cast as described in this post can lead to a truncation, so the plain old C-style union is even safer than a newer C++ casting.
Can you outline the differences between this 2 ?
Contrary to what the other answers state, from a practical point of view there is a huge difference, although there might not be such a difference in the standard.
From the standard point of view, reinterpret_cast is only guaranteed to work for roundtrip conversions and only if the alignment requirements of the intermediate pointer type are not stronger than those of the source type. You are not allowed (*) to read through one pointer and read from another pointer type.
At the same time, the standard requires similar behavior from unions, it is undefined behavior to read out of a union member other than the active one (the member that was last written to)(+).
Yet compilers often provide additional guarantees for the union case, and all compilers I know of (VS, g++, clang++, xlC_r, intel, Solaris CC) guarantee that you can read out of an union through an inactive member and that it will produce a value with exactly the same bits set as those that were written through the active member.
This is particularly important with high optimizations when reading from network:
double ntohdouble(const char *buffer) { // [1]
union {
int64_t i;
double f;
} data;
memcpy(&data.i, buffer, sizeof(int64_t));
data.i = ntohll(data.i);
return data.f;
}
double ntohdouble(const char *buffer) { // [2]
int64_t data;
double dbl;
memcpy(&data, buffer, sizeof(int64_t));
data = ntohll(data);
dbl = *reinterpret_cast<double*>(&data);
return dbl;
}
The implementation in [1] is sanctioned by all compilers I know (gcc, clang, VS, sun, ibm, hp), while the implementation in [2] is not and will fail horribly in some of them when aggressive optimizations are used. In particular, I have seen gcc reorder the instructions and read into the dbl variable before evaluating ntohl, thus producing the wrong results.
(*) With the exception that you are always allowed to read from a [signed|unsigned] char* regardless of that the real object (original pointer type) was.
(+) Again with some exceptions, if the active member shares a common prefix with another member, you can read through the compatible member that prefix.
There are some technical differences between a proper union and a (let's assume) a proper and safe reinterpret_cast. However, I can't think of any of these differences which cannot be overcome.
The real reason to prefer a union over reinterpret_cast in my opinion isn't a technical one. It's for documentation.
Supposing you are designing a bunch of classes to represent a wire protocol (which I guess is the most common reason to use type-punning in the first place), and that wire protocol consists of many messages, submessages and fields. If some of those fields are common, such as msg type, seq#, etc, using a union simplifies tying these elements together and helps to document exactly how the protocol appears on the wire.
Using reinterpret_cast does the same thing, obviously, but in order to really know what's going on you have to examine the code that advances from one packet to the next. Using a union you can just take a look at the header and get an idea what's going on.
In C++11, union is class type, you can an hold a member with non-trivial member functions. You can't simply cast from one member to another.
§ 9.5.3
[ Example: Consider the following union:
union U {
int i;
float f;
std::string s;
};
Since std::string (21.3) declares non-trivial versions of all of the special member functions, U will have
an implicitly deleted default constructor, copy/move constructor, copy/move assignment operator, and destructor. To use U, some or all of these member functions must be user-provided. — end example ]
From a practical point of view, they're most probably 100% identical, at least on real, non-fictional computers. You take the binary representation of one type and stuff it into another type.
From a language lawyer point of view, using reinterpret_cast is well-defined for some occasions (e.g. pointer to integer conversions) and implementation-specific otherwise.
Union type punning, on the other hand is very clearly undefined behaviour, always (though undefined does not necessarily mean "doesn't work"). The standard says that the value of at most one of the non-static data members can be stored in a union at any time. This means that if you set var1 then var1 is valid, but var2 is not.
However, since var1 and var2 are stored at the same memory location, you can of course still read and write any of the types as you like, and assuming they have the same storage size, no bits are "lost".

Casting big POD into small POD - guaranteed to work?

Suppose I've a POD struct which has more than 40 members. These members are not built-in types, rather most of them are POD structs, which in turn has lots of members, most of which are POD struct again. This pattern goes up to many levels - POD has POD has POD and so on - up to even 10 or so levels.
I cannot post the actual code, so here is one very simple example of this pattern:
//POD struct
struct Big
{
A a[2]; //POD has POD
B b; //POD has POD
double dar[6];
int m;
bool is;
double d;
char c[10];
};
And A and B are defined as:
struct A
{
int i;
int j;
int k;
};
struct B
{
A a; //POD has POD
double x;
double y;
double z;
char *s;
};
It's really very simplified version of the actual code which was written (in C) almost 20 years back by Citrix Systems when they came up with ICA protocol. Over the years, the code has been changed a lot. Now we've the source code, but we cannot know which code is being used in the current version of ICA, and which has been discarded, as the discarded part is also present in the source code.
That is the background of the problem. The problem is: now we've the source code, and we're building a system on the top of ICA protocol, for which at some point we need to know the values of few members of the big struct. Few members, not all. Fortunately, those members appear in the beginning of the struct, so can we write a struct which is part of the big struct, as:
//Part of struct B
//Whatever members it has, they are in the same order as they appear in Big.
struct partBig
{
A a[2];
B b;
double dar[6];
//rest is ignored
};
Now suppose, we know pointer to Big struct (that we know by deciphering the protocol and data streams), then can we write this:
Big *pBig = GetBig();
partBig *part = (partBig*)pBig; //Is this safe?
/*here can we pretend that part is actually Big, so as to access
first few members safely (using part pointer), namely those
which are defined in partBig?*/
I don't want to define the entire Big struct in our code, as it has too many POD members and if I define the struct entirely, I've to first define hundreds of other structs. I don't want to that, as even if I do, I doubt if I can do that correctly, as I don't know all the structs correctly (as to which version is being used today, and which is discarded).
I've done the casting already, and it seems to work for last one year, I didn't see any problem with that. But now I thought why not start a topic and ask everyone. Maybe, I'll have a better solution, or at least, will make some important notes.
Relevant references from the language specification will be appreciated. :-)
Here is a demo of such casting: http://ideone.com/c7SWr
The language specification has some features that are similar to what you are trying to do: the 6.5.2.3/5 states that if you have several structs that share common initial sequence of members, you are allowed to inspect these common members through any of the structs in the union, regardless of which one is currently active. So, from this guarantee one can easily derive that structs with common initial sequence of members should have identical memory layout for these common members.
However, the language does not seem to explicitly allow doing what you are doing, i.e. reinterpreting one struct object as another unrelated struct object through a pointer cast. (The language allows you to access the first member of a struct object through a pointer cast, but not what you do in your code). So, from that perspective, what you are doing might be a strict-aliasing violation (see 6.5/7). I'm not sure about this though, since it is not immediately clear to me whether the intent of 6.5/7 was to outlaw this kind of access.
However, in any case, I don't think that compilers that use strict-aliasing rules for optimization do it at aggregate type level. Most likely they only do it at fundamental type level, meaning that your code should work fine in practice.
Yes, assuming your small structure is laid out with the same packing/padding rules as the big one, you'll be fine. The memory layout order of fields in a structure is well-defined by the language specifications.
The only way to get BigPart from Big is using reinterpret_cast (or C style, as in your question), and the standard does not specify what happens in that case. However, it should work as expected. It would probably fail only for some super exotic platforms.
The relevant type-aliasing rule is 3.10/10: "If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: ...".
The list contains two cases that are relevant to us:
a type similar (as defined in 4.4) to the dynamic type of the object
an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union)
The first case isn't sufficient: it merely covers cv-qualifications. The second case allows at the union hack that AndreyT mentions (the common initial sequence), but not anything else.

How are structs laid out in memory in C++?

Is the way C++ structs are laid out set by the standard, or at least common across compilers?
I have a struct where one of its members needs to be aligned on 16 byte boundaries, and this would be easier if I can guarantee the ordering of the fields.
Also, for non-virtual classes, is the address of the first element also likely to be the address of the struct?
I'm most interested in GCC and MSVS.
C and C++ both guarantee that fields will be laid out in memory in the same order as you define them. For C++ that's only guaranteed for a POD type1 (anything that would be legitimate as a C struct [Edit: C89/90 -- not, for example, a C99 VLA] will also qualify as a POD).
The compiler is free to insert padding between members and/or at the end of the struct. Most compilers give you some way to control that (e.g., #pragma pack(N)), but it does vary between compilers.
1Well, there is one corner case they didn't think of, where it isn't guaranteed for a POD type -- an access specifier breaks the ordering guarantee:
struct x {
int x;
int y;
public:
int z;
};
This is a POD type, but the public: between y and z means they could theoretically be re-ordered. I'm pretty sure this is purely theoretical though -- I don't know of any compiler that does reorder the members in this situation (and unless memory fails me even worse than usual today, this is fixed in C++0x).
Edit: the relevant parts of the standard (at least most of them) are §9/4:
A POD-struct is an aggregate class that has no non-volatile
data members of type pointer to member, non-POD-struct, non-
POD-union (or array of such types) or reference, and has no
user-defined copy assignment operator and no user-defined
destructor.
and §8.5.1/1:
An aggregate is an array or a class (clause 9) with no user-
declared constructors (12.1), no private or protected non-
static data members (clause 11), no base classes (clause 10)
and no virtual functions (10.3).
and §9.2/12:
...the order of allocation of non-static data members
separated by an access-specifier is unspecified (11.1).
Though that's restricted somewhat by §9.2/17:
A pointer to a POD-struct object, suitably converted using
a reinterpret_cast, points to its initial member...
Therefore, (even if preceded by a public:, the first member you define must come first in memory. Other members separated by public: specifiers could theoretically be rearranged.
I should also point out that there's some room for argument about this. In particular, there's also a rule in §9.2/14:
Two POD-struct (clause 9) types are layout-compatible if they have the same number of nonstatic data members, and corresponding nonstatic data members (in order) have layout-compatible types (3.9).
Therefore, if you have something like:
struct A {
int x;
public:
int y;
public:
int z;
};
It is required to be layout compatible with:
struct B {
int x;
int y;
int z;
};
I'm pretty sure this is/was intended to mean that the members of the two structs must be laid out the same way in memory. Since the second one clearly can't have its members rearranged, the first one shouldn't be either. Unfortunately, the standard never really defines what "layout compatible" means, rendering the argument rather weak at best.
C++ inherits from c the need/desire to work efficiently on many platforms, and therefore leaves certain things up to the compiler. Aside from requiring elements to appear in the specified order, this is one of them.
But many compilers support #pragmas and options to let you establish come control of the packing. See your compiler's documentation.
Is the way C++ structs are laid out
set by the standard, or at least
common across compilers?
There is no guarantee in the standard. I would not depend on compilers having the same alignment
I have a struct where one of its
members needs to be aligned on 16 byte
boundaries, and this would be easier
if I can guarantee the ordering of the
fields.
Compilers will not change the order of the fields.
Here are some links on how to set them for both GCC and MSVC:
For GCC: http://developer.apple.com/mac/library/documentation/DeveloperTools/gcc-4.0.1/gcc/Structure_002dPacking-Pragmas.html
MSVC: http://msdn.microsoft.com/en-us/library/ms253935(VS.80).aspx and http://msdn.microsoft.com/en-us/library/2e70t5y1(VS.80).aspx
I would keep them as structures and use an extern "C" to ensure that it works properly. Maybe not needed but that would definitely work.
C structs (and C++ PODs, for compatibility) are required by the standard to have sequential layout.
The only difference between compilers is alignment, but fortunately, both MSVC and GCC support #pragma pack.
If you want to see the memory layout of all class types in memory under MSVC, add the /d1reportAllClassLayout switch to the compiler command line.
If you want to see what one class is layed out like, add /d1reportSingleClassLayoutNNNNN where NNNNN is the name of the class.
Structures are laid out sequentially in memory. However, the way they're aligned in memory varies based on the operating system.
For things bigger than 4 bytes, there's a difference between Windows and Linux. Linux aligns them as if they're 4 bytes, so for example a double (8 bytes) could start at p+4, p+8, p+12, etc. where p is the start of the structure. In Windows, a double (8 bytes) needs to start at an address that's a multiple of 8 so p+8, p+16, p+24, etc.