Casting big POD into small POD - guaranteed to work? - c++

Suppose I've a POD struct which has more than 40 members. These members are not built-in types, rather most of them are POD structs, which in turn has lots of members, most of which are POD struct again. This pattern goes up to many levels - POD has POD has POD and so on - up to even 10 or so levels.
I cannot post the actual code, so here is one very simple example of this pattern:
//POD struct
struct Big
{
A a[2]; //POD has POD
B b; //POD has POD
double dar[6];
int m;
bool is;
double d;
char c[10];
};
And A and B are defined as:
struct A
{
int i;
int j;
int k;
};
struct B
{
A a; //POD has POD
double x;
double y;
double z;
char *s;
};
It's really very simplified version of the actual code which was written (in C) almost 20 years back by Citrix Systems when they came up with ICA protocol. Over the years, the code has been changed a lot. Now we've the source code, but we cannot know which code is being used in the current version of ICA, and which has been discarded, as the discarded part is also present in the source code.
That is the background of the problem. The problem is: now we've the source code, and we're building a system on the top of ICA protocol, for which at some point we need to know the values of few members of the big struct. Few members, not all. Fortunately, those members appear in the beginning of the struct, so can we write a struct which is part of the big struct, as:
//Part of struct B
//Whatever members it has, they are in the same order as they appear in Big.
struct partBig
{
A a[2];
B b;
double dar[6];
//rest is ignored
};
Now suppose, we know pointer to Big struct (that we know by deciphering the protocol and data streams), then can we write this:
Big *pBig = GetBig();
partBig *part = (partBig*)pBig; //Is this safe?
/*here can we pretend that part is actually Big, so as to access
first few members safely (using part pointer), namely those
which are defined in partBig?*/
I don't want to define the entire Big struct in our code, as it has too many POD members and if I define the struct entirely, I've to first define hundreds of other structs. I don't want to that, as even if I do, I doubt if I can do that correctly, as I don't know all the structs correctly (as to which version is being used today, and which is discarded).
I've done the casting already, and it seems to work for last one year, I didn't see any problem with that. But now I thought why not start a topic and ask everyone. Maybe, I'll have a better solution, or at least, will make some important notes.
Relevant references from the language specification will be appreciated. :-)
Here is a demo of such casting: http://ideone.com/c7SWr

The language specification has some features that are similar to what you are trying to do: the 6.5.2.3/5 states that if you have several structs that share common initial sequence of members, you are allowed to inspect these common members through any of the structs in the union, regardless of which one is currently active. So, from this guarantee one can easily derive that structs with common initial sequence of members should have identical memory layout for these common members.
However, the language does not seem to explicitly allow doing what you are doing, i.e. reinterpreting one struct object as another unrelated struct object through a pointer cast. (The language allows you to access the first member of a struct object through a pointer cast, but not what you do in your code). So, from that perspective, what you are doing might be a strict-aliasing violation (see 6.5/7). I'm not sure about this though, since it is not immediately clear to me whether the intent of 6.5/7 was to outlaw this kind of access.
However, in any case, I don't think that compilers that use strict-aliasing rules for optimization do it at aggregate type level. Most likely they only do it at fundamental type level, meaning that your code should work fine in practice.

Yes, assuming your small structure is laid out with the same packing/padding rules as the big one, you'll be fine. The memory layout order of fields in a structure is well-defined by the language specifications.

The only way to get BigPart from Big is using reinterpret_cast (or C style, as in your question), and the standard does not specify what happens in that case. However, it should work as expected. It would probably fail only for some super exotic platforms.

The relevant type-aliasing rule is 3.10/10: "If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: ...".
The list contains two cases that are relevant to us:
a type similar (as defined in 4.4) to the dynamic type of the object
an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union)
The first case isn't sufficient: it merely covers cv-qualifications. The second case allows at the union hack that AndreyT mentions (the common initial sequence), but not anything else.

Related

Do the C++ standards guarantee that unused private fields will influence sizeof?

Consider the following struct:
class Foo {
int a;
};
Testing in g++, I get that sizeof(Foo) == 4 but is that guaranteed by the standard? Would a compiler be allowed to notice that a is an unused private field and remove it from the in-memory representation of the class (leading to a smaller sizeof)?
I don't expect any compilers to actually do that kind of optimization but this question popped up in a language lawyering discussion so now I'm curious.
The C++ standard doesn't define a lot about memory layouts. The fundamental rule for this case is item 4 under section 9 Classes:
4 Complete objects and member subobjects of class type shall have nonzero size. [ Note: Class objects can be assigned, passed as arguments to functions, and returned by functions (except objects of classes for which copying or moving has been restricted; see 12.8). Other plausible operators, such as equality comparison, can be defined by the user; see 13.5. — end note ]
Now there is one more restriction, though: Standard-layout classes. (no static elements, no virtuals, same visibility for all members) Section 9.2 Class members requires layout compatibility between different classes for standard-layout classes. This prevents elimination of members from such classes.
For non-trivial non-standard-layout classes I see no further restriction in the standard. The exact behavior of sizeof(), reinterpret_cast(), ... are implementation defined (i.e. 5.2.10 "The mapping function is implementation-defined.").
The answer is yes and no. A compiler could not exhibit exactly that behaviour within the standard, but it could do so partly.
There is no reason at all why a compiler could not optimise away the storage for the struct if that storage is never referenced. If the compiler gets its analysis right, then no program that you could write would ever be able to tell whether the storage exists or not.
However, the compiler cannot report a smaller sizeof() thereby. The standard is pretty clear that objects have to be big enough to hold the bits and bytes they contain (see for example 3.9/4 in N3797), and to report a sizeof smaller than that required to hold an int would be wrong.
At N3797 5.3.2:
The sizeof operator yields the number of bytes in the object
representation of its operand
I do not se that 'representation' can change according to whether the struct or member is referenced.
As another way of looking at it:
struct A {
int i;
};
struct B {
int i;
};
A a;
a.i = 0;
assert(sizeof(A)==sizeof(B));
I do not see that this assert can be allowed to fail in a standards-conforming implementation.
If you look at templates, you'll notice that "optimization" of such often ends up with nearly nothing in the output even though the template files may be thousands of lines...
I think that the optimization you are talking about will nearly always occur in a function when the object is used on the stack and the object doesn't get copied or passed down to another function and the private field is never accessed (not even initialized... which could be viewed as a bug!)

What is the difference between a proper defined union and a reinterpret_cast?

Can you propose at least 1 scenario where there is a substantial difference between
union {
T var_1;
U var_2;
}
and
var_2 = reinterpret_cast<U> (var_1)
?
The more i think about this, the more they look like the same thing to me, at least from a practical viewpoint.
One difference that I found is that while the union size is big as the biggest data type in terms of size, the reinterpret_cast as described in this post can lead to a truncation, so the plain old C-style union is even safer than a newer C++ casting.
Can you outline the differences between this 2 ?
Contrary to what the other answers state, from a practical point of view there is a huge difference, although there might not be such a difference in the standard.
From the standard point of view, reinterpret_cast is only guaranteed to work for roundtrip conversions and only if the alignment requirements of the intermediate pointer type are not stronger than those of the source type. You are not allowed (*) to read through one pointer and read from another pointer type.
At the same time, the standard requires similar behavior from unions, it is undefined behavior to read out of a union member other than the active one (the member that was last written to)(+).
Yet compilers often provide additional guarantees for the union case, and all compilers I know of (VS, g++, clang++, xlC_r, intel, Solaris CC) guarantee that you can read out of an union through an inactive member and that it will produce a value with exactly the same bits set as those that were written through the active member.
This is particularly important with high optimizations when reading from network:
double ntohdouble(const char *buffer) { // [1]
union {
int64_t i;
double f;
} data;
memcpy(&data.i, buffer, sizeof(int64_t));
data.i = ntohll(data.i);
return data.f;
}
double ntohdouble(const char *buffer) { // [2]
int64_t data;
double dbl;
memcpy(&data, buffer, sizeof(int64_t));
data = ntohll(data);
dbl = *reinterpret_cast<double*>(&data);
return dbl;
}
The implementation in [1] is sanctioned by all compilers I know (gcc, clang, VS, sun, ibm, hp), while the implementation in [2] is not and will fail horribly in some of them when aggressive optimizations are used. In particular, I have seen gcc reorder the instructions and read into the dbl variable before evaluating ntohl, thus producing the wrong results.
(*) With the exception that you are always allowed to read from a [signed|unsigned] char* regardless of that the real object (original pointer type) was.
(+) Again with some exceptions, if the active member shares a common prefix with another member, you can read through the compatible member that prefix.
There are some technical differences between a proper union and a (let's assume) a proper and safe reinterpret_cast. However, I can't think of any of these differences which cannot be overcome.
The real reason to prefer a union over reinterpret_cast in my opinion isn't a technical one. It's for documentation.
Supposing you are designing a bunch of classes to represent a wire protocol (which I guess is the most common reason to use type-punning in the first place), and that wire protocol consists of many messages, submessages and fields. If some of those fields are common, such as msg type, seq#, etc, using a union simplifies tying these elements together and helps to document exactly how the protocol appears on the wire.
Using reinterpret_cast does the same thing, obviously, but in order to really know what's going on you have to examine the code that advances from one packet to the next. Using a union you can just take a look at the header and get an idea what's going on.
In C++11, union is class type, you can an hold a member with non-trivial member functions. You can't simply cast from one member to another.
§ 9.5.3
[ Example: Consider the following union:
union U {
int i;
float f;
std::string s;
};
Since std::string (21.3) declares non-trivial versions of all of the special member functions, U will have
an implicitly deleted default constructor, copy/move constructor, copy/move assignment operator, and destructor. To use U, some or all of these member functions must be user-provided. — end example ]
From a practical point of view, they're most probably 100% identical, at least on real, non-fictional computers. You take the binary representation of one type and stuff it into another type.
From a language lawyer point of view, using reinterpret_cast is well-defined for some occasions (e.g. pointer to integer conversions) and implementation-specific otherwise.
Union type punning, on the other hand is very clearly undefined behaviour, always (though undefined does not necessarily mean "doesn't work"). The standard says that the value of at most one of the non-static data members can be stored in a union at any time. This means that if you set var1 then var1 is valid, but var2 is not.
However, since var1 and var2 are stored at the same memory location, you can of course still read and write any of the types as you like, and assuming they have the same storage size, no bits are "lost".

Unions as Base Class

The standard defines that Unions cannot be used as Base class, but is there any specific reasoning for this? As far as I understand Unions can have constructors, destructors, also member variables, and methods to operate on those varibales. In short a Union can encapsulate a datatype and state which might be accessed through member functions. Thus it in most common terms qualifies for being a class and if it can act as a class then why is it restricted from acting as a base class?
Edit: Though the answers try to explain the reasoning I still do not understand how Union as a Derived class is worst than when Union as just a class. So in hope of getting more concrete answer and reasoning I will push this one for a bounty. No offence to the already posted answers, Thanks for those!
Tony Park gave an answer which is pretty close to the truth. The C++ committee basically didn't think it was worth the effort to make unions a strong part of C++, similarly to the treatment of arrays as legacy stuff we had to inherit from C but didn't really want.
Unions have problems: if we allow non-POD types in unions, how do they get constructed? It can certainly be done, but not necessarily safely, and any consideration would require committee resources. And the final result would be less than satisfactory, because what is really required in a sane language is discriminated unions, and bare C unions could never be elevated to discriminated unions in way compatible with C (that I can imagine, anyhow).
To elaborate on the technical issues: since you can wrap a POD-component only union in a struct without losing anything, there's no advantage allowing unions as bases. With POD-only union components, there's no problem with explicit constructors simply assigning one of the components, nor with using a bitblit (memcpy) for compiler generated copy constructor (or assignment).
Such unions, however, aren't useful enough to bother with except to retain them so existing C code can be considered valid C++. These POD-only unions are broken in C++ because they fail to retain a vital invariant they possess in C: any data type can be used as a component type.
To make unions useful, we must allow constructable types as members. This is significant because it is not acceptable to merely assign a component in a constructor body, either of the union itself, or any enclosing struct: you cannot, for example, assign a string to an uninitialised string component.
It follows one must invent some rules for initialising union component with mem-initialisers, for example:
union X { string a; string b; X(string q) : a(q) {} };
But now the question is: what is the rule? Normally the rule is you must initialise every member and base of a class, if you do not do so explicitly, the default constructor is used for the remainder, and if one type which is not explicitly initialised does not have a default constructor, it's an error [Exception: copy constructors, the default is the member copy constructor].
Clearly this rule can't work for unions: the rule has to be instead: if the union has at least one non-POD member, you must explicitly initialise exactly one member in a constructor. In this case, no default constructor, copy constructor, assignment operator, or destructor will be generated and if any of these members are actually used, they must be explicitly supplied.
So now the question becomes: how would you write, say, a copy constructor? It is, of course quite possible to do and get right if you design your union the way, say, X-Windows event unions are designed: with the discriminant tag in each component, but you will have to use placement operator new to do it, and you will have to break the rule I wrote above which appeared at first glance to be correct!
What about default constructor? If you don't have one of those, you can't declare an uninitialised variable.
There are other cases where you can determine the component externally and use placement new to manage a union externally, but that isn't a copy constructor. The fact is, if you have N components you'd need N constructors, and C++ has a broken idea that constructors use the class name, which leaves you rather short of names and forces you to use phantom types to allow overloading to choose the right constructor .. and you can't do that for the copy constructor since its signature is fixed.
Ok, so are there alternatives? Probably, yes, but they're not so easy to dream up, and harder to convince over 100 people that it's worthwhile to think about in a three day meeting crammed with other issues.
It is a pity the committee did not implement the rule above: unions are mandatory for aligning arbitrary data and external management of the components is not really that hard to do manually, and trivial and completely safe when the code is generated by a suitable algorithm, in other words, the rule is mandatory if you want to use C++ as a compiler target language and still generate readable, portable code. Such unions with constructable members have many uses but the most important one is to represent the stack frame of a function containing nested blocks: each block has local data in a struct, and each struct is a union component, there is no need for any constructors or such, the compiler will just use placement new. The union provides alignment and size, and cast free component access.
[And there is no other conforming way to get the right alignment!]
Therefore the answer to your question is: you're asking the wrong question. There's no advantage to POD-only unions being bases, and they certainly can't be derived classes because then they wouldn't be PODs. To make them useful, some time is required to understand why one should follow the principle used everywhere else in C++: missing bits aren't an error unless you try to use them.
Union is a type that can be used as any one of its members depending on which member has been set - only that member can be later read.
When you derive from a type the derived type inherits the base type - the derived type can be used wherever the base type could be. If you could derive from a union the derived class could be used (not implicitly, but explicitly through naming the member) wherever any of the union members could be used, but among those members only one member could be legally accessed. The problem is the data on which member has been set is not stored in the union.
To avoid this subtle yet dangerous contradiction that in fact subverts a type system deriving from a union is not allowed.
Bjarne Stroustrup said 'there seems little reason for it' in The Annotated C++ Reference Manual.
The title asks why unions can't be a base class, but the question appears to be about unions as a derived class. So, which is it?
There's no technical reason why unions can't be a base class; it's just not allowed. A reasonable interpretation would be to think of the union as a struct whose members happen to potentially overlap in memory, and consider the derived class as a class that inherits from this (rather odd) struct. If you need that functionality, you can usually persuade most compilers to accept an anonymous union as a member of a struct. Here's an example, that's suitable for use as a base class. (And there's an anonymous struct in the union for good measure.)
struct V3 {
union {
struct {
float x,y,z;
};
float f[3];
};
};
The rationale for unions as a derived class is probably simpler: the result wouldn't be a union. Unions would have to be the union of all their members, and all of their bases. That's fair enough, and might open up some interesting template possibilities, but you'd have a number of limitations (all bases and members would have to be POD -- and would you be able to inherit twice, because a derived type is inherently non-POD?), this type of inheritance would be different from the other type the language sports (OK, not that this has stopped C++ before) and it's sort of redundant anyway -- the existing union functionality would do just as well.
Stroustrup says this in the D&E book:
As with void *, programmers should know that unions ... are inherently dangerous, should be avoided wherever possible, and should be handled with special care when actually needed.
(The elision doesn't change the meaning.)
So I imagine the decision is arbitrary, and he just saw no reason to change the union functionality (it works fine as-is with the C subset of C++), and so didn't design any integration with the new C++ features. And when the wind changed, it got stuck that way.
I think you got the answer yourself in your comments on EJP's answer.
I think unions are only included in C++ at all in order to be backwards compatible with C. I guess unions seemed like a good idea in 1970, on systems with tiny memory spaces. By the time C++ came along I imagine unions were already looking less useful.
Given that unions are pretty dangerous anyway, and not terribly useful, the vast new opportunities for creating bugs that inheriting from unions would create probably just didn't seem like a good idea :-)
Here's my guess for C++ 03.
As per $9.5/1, In C++ 03, Unions can not have virtual functions. The whole point of a meaningful derivation is to be able to override behaviors in the derived class. If a union cannot have virtual functions, that means that there is no point in deriving from a union.
Hence the rule.
You can inherit the data layout of a union using the anonymous union feature from C++11.
#include <cstddef>
template <size_t N,typename T>
struct VecData {
union {
struct {
float x;
float y;
float z;
};
float a[N];
};
};
template <size_t N, typename T>
class Vec : public VecData<N,T> {
//methods..
};
In general its almost always better not work with unions directly but enclose them within a struct or class. Then you can base your inheritance off the struct outer layer and use unions within if you need to.

Why union can't be used in Inheritance?

I saw below thing in c++ standard (§9.5/1):
A union shall not have base classes. A union shall not
be used as a base class.
A union can have member functions
(including constructors and
destructors), but not virtual (10.3)
functions
From above, union can have constructor and destructor as well.
So why is it not allowed in Inheritance ?
EDIT: For answering comments:
If union is allowed as a base class, its data can be used by a derived class. If derived class is interested to use only one member of union, this way can be used for saving memory. I think, this is improper inheritance. Is it better to have union inside derived class in that case ?
If union is allowed as derived class, it can use services of base class. For example, if Union has multiple types of data. As we know, only one type of data can be used. For each type of data, a base class is present to offer services for that particular type. In this case, multiple inheritance can be used to get services of all base classes for all types of data in Union. This also i feel as improper usage of inheritance. But is there any equivalent concept to achieve content in this point?
Just my thoughts...
Here is a simple workaround:
struct InheritedUnion
{
union {
type1 member1;
type2 member2;
};
};
struct InheritsUnion : InheritedUnion
{};
By making the union anonymous, it works just as if the base type were actually a union.
(This answer was written for C++03, the situation may have changed since C++11)
I can't imagine any compelling reason for excluding it... more that there's no particularly good reason for including it. Unions just aren't used enough to matter that much, and there are heavy restrictions on the type of members in the union, restrictions that prevent OO practices. Those restrictions - and the burden of manually tracking the particular variable(s) in a union that have validity - mean pretty much the first thing you want to do with a union is encapsulate it in a class, then you can inherit from that anyway. Letting the union be part of the inheritance chain just spreads the pain around.
So, anything that might give casual C++ users the impression that unions could be mixed into their OO-centric code would cause more trouble than good. Unions are basically a hack for memory conservation, fast-but-nasty data reinterpretation and implementing variants, and tend to be an implementation detail rather than an interface one.
You've added a bit to your question with your edit... thoughts follow.
If union is allowed as a base class, its data can be used by a derived class. If derived class is interested to use only one member of union, this way can be used for saving memory. I think, this is improper inheritance. Is it better to have union inside derived class in that case ?
So, we're discussing:
union U { int i; double d; };
struct D : U { };
U's data members would either have to be - implicitly or explicitly - public, protected or private. If they're public or protected, then they're not encapsulated, and we're back in the "share the pain" scenario mentioned above. U has less ability to encapsulate them than a class containing a union. If they're private, then they could just as easily be a union data member in a class.
If union is allowed as derived class, it can use services of base class. For example, if Union has multiple types of data. As we know, only one type of data can be used. For each type of data, a base class is present to offer services for that particular type. In this case, multiple inheritance can be used to get services of all base classes for all types of data in Union. This also i feel as improper usage of inheritance. But is there any equivalent concept to achieve content in this point?
Ok, this is where things get weird. So, we've got:
union U : public A, private B { };
Firstly, let me be a bit more explicit about something. Unions can't contain complex, encapsulated objects - they're the antithesis of encapsulation. You're pretty much limited to POD data, and can't have non-default constructors etc.. Note I'm talking about the data members of the union, and not the union itself. So, A and B would - if those union-content rules remain - be very limited, and the ability of U to derive not particularly useful.
That leads into the question of why unions can't manage more complex objects in some safe way. Well, how could they do it? Obvious answer is add a hidden enum to say which one is valid, and some rules about which type should be constructed by default, invoking the destructor first when a different enum field is assigned to, etc etc etc.... Maybe they should throw if someone does something to the member that's not currently constructed? Sounds ok?
Well, firstly, the overhead of an enum might not be necessary, as client code might use one member then another in a known-appopriate order. Checks and exceptions in the language itself are a similar overhead... they may be able to be factored down to a single check before multiple uses if left to the client code. In both cases, you'd be paying for some management overhead that only some applications need - a very un-C++ approach.
Ignoring that, unions aren't that simple anyway. Like enums, their use is designed to be flexible and would be hard to communicate to the compiler so it could clean up, check and automate it. You might think "huh? enums are complex?", but each enum is - conceptually generalised - effectively an arbitrary set of independent bitfields of varying width. It's non-trivial to describe which ones are meant to be mutually exclusive or dependent etc.. The compiler doesn't buy into the problem space at all. Similarly, a union may have one or more concurrently legitimate views on the same data, while others are invalid, with subtleties to boot. For example: a union of int64_t, double, and char[4] could always be read as an int64_t or char[4] after being set as a double, but the other way around might read an invalid double and cause undefined behaviour, unless you're re-reading values that came from a double at some earlier time, perhaps in a serialisation/deserialisation library. The compiler doesn't want to buy into the management of that, which is what it would have to do to ensure object members of unions obeyed the promises implicit in their own encapsulation.
You ask "is it better to have union inside derived class?"... no - doesn't generally work as most objects can't be put into a union (see above). You hit this problem whether the union is inside the derived class, or - through the new language features you postulate - the union actually is the derived class.
I do understand your frustration though. In practice, people do sometimes want unions of arbitrary non-trivial objects, so they hack it up the hard and dirty way using reinterpret cast or similar, managing memory alignment of some space big enough for the set of objects they'll support (or - easier but slower - a pointer to them). You'll find this sort of thing in the boost variant and any libraries. But, you can't derive from them... the knowledge about appropriate usage, the extent of safety checks etc. just isn't deducible or expressible in C++. The compiler's not about to do that for you.

Do class/struct members always get created in memory in the order they were declared?

This is a question that was sparked by Rob Walker's answer here.
Suppose I declare a class/struct like so:
struct
{
char A;
int B;
char C;
int D;
};
Is it safe to assume that these members will be declared in exactly that order in memory, or is this a compiler dependent thing? I'm asking because I had always assumed that the compiler can do whatever it wants with them.
This leads into my next question. If the above example causes memory alignment issues, why can the compiler not just turn that into something like this implicitly:
struct
{
char A;
char C;
int B;
int D;
};
(I'm primarily asking about C++, but I'd be interested to hear the C answer as well)
Related topics
Why doesn't GCC optimize structs?
C99 §6.7.2.1 clause 13 states:
Within a structure object, the
non-bit-field members and the units in
which bit-fields reside have addresses
that increase in the order in which
they are declared.
and goes on to say a bit more about padding and addresses. The C89 equivalent section is §6.5.2.1.
C++ is a bit more complicated. In the 1998 and 2003 standards, there is §9.2 clause 12 (clause 15 in C++11):
Nonstatic data members of a
(non-union) class declared without an
intervening access-specifier are
allocated so that later members have
higher addresses within a class
object. The order of allocation of
nonstatic data members separated by an
access-specifier is unspecified
(11.1). Implementation alignment
requirements might cause two adjacent
members not to be allocated
immediately after each other; so might
requirements for space for managing
virtual functions (10.3) and virtual
base classes (10.1).
The data members are arranged in the order declared. The compiler is free to intersperse padding to arrange the memory alignment it likes (and you'll find that many compilers have a boatload a alignment specification options---useful if mixing bits compiled by different programs.).
See also Why doesn't GCC optimize structs?.
It appears that this answer is somewhat obsolete for C++. You learn something everyday. Thanks aib, Nemanja.
I cannot speak for C++, but in C the order is guaranteed to be the same order in memory as declared in the struct.
Basically, you can count on that only for the classes with a standard layout. Strictly speaking, standard layout is a C++0x thing, but it is really just standardizing existing practice/
Aside from padding for alignment, no structure optimization is allowed by any compiler (that I am aware of) for C or C++. I can't speak for C++ classes, as they may be another beast entirely.
Consider your program is interfacing with system/library code on Windows but you want to use GCC. You would have to verify that GCC used an identical layout-optimization algorithm so all your structures would be packed correctly before sending them to the MS-compiled code.
While browsing the related topics at the right, I looked at this question. I figure this may be an interesting corner case when thinking about these issues (unless it's more common than I realize).
To paraphrase, if you have a struct in C that looks something like this:
struct foo{};
and subclass it like so in C++ (using a separate compilation unit):
extern "C" foo;
struct bar: public foo{};
Then the memory alignment won't necessarily be the same for the reasons aib mentions (even amongst compilers from the same vendor).