Do the C++ standards guarantee that unused private fields will influence sizeof? - c++

Consider the following struct:
class Foo {
int a;
};
Testing in g++, I get that sizeof(Foo) == 4 but is that guaranteed by the standard? Would a compiler be allowed to notice that a is an unused private field and remove it from the in-memory representation of the class (leading to a smaller sizeof)?
I don't expect any compilers to actually do that kind of optimization but this question popped up in a language lawyering discussion so now I'm curious.

The C++ standard doesn't define a lot about memory layouts. The fundamental rule for this case is item 4 under section 9 Classes:
4 Complete objects and member subobjects of class type shall have nonzero size. [ Note: Class objects can be assigned, passed as arguments to functions, and returned by functions (except objects of classes for which copying or moving has been restricted; see 12.8). Other plausible operators, such as equality comparison, can be defined by the user; see 13.5. — end note ]
Now there is one more restriction, though: Standard-layout classes. (no static elements, no virtuals, same visibility for all members) Section 9.2 Class members requires layout compatibility between different classes for standard-layout classes. This prevents elimination of members from such classes.
For non-trivial non-standard-layout classes I see no further restriction in the standard. The exact behavior of sizeof(), reinterpret_cast(), ... are implementation defined (i.e. 5.2.10 "The mapping function is implementation-defined.").

The answer is yes and no. A compiler could not exhibit exactly that behaviour within the standard, but it could do so partly.
There is no reason at all why a compiler could not optimise away the storage for the struct if that storage is never referenced. If the compiler gets its analysis right, then no program that you could write would ever be able to tell whether the storage exists or not.
However, the compiler cannot report a smaller sizeof() thereby. The standard is pretty clear that objects have to be big enough to hold the bits and bytes they contain (see for example 3.9/4 in N3797), and to report a sizeof smaller than that required to hold an int would be wrong.
At N3797 5.3.2:
The sizeof operator yields the number of bytes in the object
representation of its operand
I do not se that 'representation' can change according to whether the struct or member is referenced.
As another way of looking at it:
struct A {
int i;
};
struct B {
int i;
};
A a;
a.i = 0;
assert(sizeof(A)==sizeof(B));
I do not see that this assert can be allowed to fail in a standards-conforming implementation.

If you look at templates, you'll notice that "optimization" of such often ends up with nearly nothing in the output even though the template files may be thousands of lines...
I think that the optimization you are talking about will nearly always occur in a function when the object is used on the stack and the object doesn't get copied or passed down to another function and the private field is never accessed (not even initialized... which could be viewed as a bug!)

Related

Rationale for non-virtual derived class not being pointer-interconvertible with its first base

There are a few questions and answers on the site already concerning pointer-interconvertibility of structs and their first member variable, as well as structs and their first public base. This question is one of them, for example.
However, what I'm interested in is not the fact that it's undefined behavior to reinterpret_cast (or static_cast through a void *) between a non-standard-layout struct and its public base, but rather the reasoning why the C++ standard currently forbids such casts. The existing questions and answers don't cover this aspect.
Consider the following example in particular (Godbolt):
#include <type_traits>
struct Base {
int m_base_var = 1;
};
struct Derived: public Base {
int m_derived_var = 2;
};
Derived g_derived;
constexpr Derived *g_pDerived = &g_derived;
constexpr Base *g_pBase = &g_derived;
constexpr void *g_pvDerived = &g_derived;
//These assertions all hold
static_assert(!std::is_pointer_interconvertible_base_of_v<Base, Derived>);
static_assert((void *)g_pDerived == (void *)g_pBase);
static_assert((void *)g_pDerived == g_pvDerived);
static_assert((void *)g_pBase == g_pvDerived);
//This is well-defined and returns &g_derived
Derived * getDerived() {
return static_cast<Derived *>(g_pvDerived);
}
//This is also well-defined; outer static_cast added to illustrate the sequence of conversions
Base * getBase() {
return static_cast<Base *>(static_cast<Derived *>(g_pvDerived));
}
//This is UB due to the first static_assert!
Base * getBaseUB() {
return static_cast<Base *>(g_pvDerived);
}
As you can see from the Godbolt link, all three functions compile to the exact same assembly on x86-64 GCC. However, the standard forbids the third variant since Base is not a pointer-interconvertible base of Derived.
My question is: Is there an obvious reason why the standard forbids this kind of cast? In particular, on all implementations that I know of, the value of a pointer to the Base subobject is the same as that of the pointer to the whole Derived, and I don't see a particular reason why Derived should not be considered standard-layout anymore. (In other words, Base lives at offset zero within Derived.) Would it be legal for a C++ implementation to place Base at a non-zero offset within Derived? (Is there maybe an implementation already that does this?)
Note that this question is only about cases without virtual member functions / virtual inheritance / multiple inheritance.
This is really two question:
What is standard layout, like really?
Why is pointer-interconvertibility linked to standard layout?
Standard layout was constructed out of one half of the pre-C++11 concept of "plain old data" types. The other half is trivial copyability (ie: memcpying an instance of the object is just as good as a copy constructor). These two halves didn't really interact, but POD required both.
From a purely standard C++ perspective, standard layout is a requirement for the ability to construct a type whose layout matches an existing type, such that if you shove both of those types into a union, you can access subobjects of the non-active members. This is the core functionality that standard layout enabled within the language since it was invented in C++11.
This is why standard layout only allows one member in the class hierarchy to have non-static data members. If only one class has NSDMs, then there's no question about the ordering between NSDMs of different base classes and so forth. And being able to know a priori what that ordering is is vital for being able to know that two types match.
This is also useful for communicating across languages.
Once standard layout was defined however, it started getting used for things that were... less clearly part of its domain.
For example, standard layout became the determinator for whether offsetof was valid behavior. This is due to offsetof oroginally being based on being a POD type, so when the layout part was spun off, offsetof was updated to use that. However, this was suboptimal, as the only layout-based thing that would break offsetof is having virtual base classes (the offset of members of a virtual base class depends on the most-derived class, which depends on the runtime type of the object). Now, the new restriction was still was better than POD, but it could have been expanded to include more stuff. But that would mean coming up with a new definition.
Something similar likely goes with pointer-interconvertibility. This concept was invented in C++17 to resolve various issues with the object model. There is no evidence in the papers explaining why they picked standard layout to hang pointer-interconvertibility on. But it was an existing tool with well-defined rules that already had well-defined rules on what the "first subobject" was for any given type.
Expanding the rules like you wants requires creating a new definition for "first suboject".
Is a base class pointer-interconvertible to a particular derived class? Well, that depends on what other classes that derived class inherits from. Is the first NSDM pointer-interconvertible to the class it is a member of? That depends on what other classes are involved in the inheritance diagram.
These dependencies already exist, but they all key off of a specific, pre-existing rule. What you want requires creating a new rule that's more complex. It will have to copy 90% of the existing standard-layout rules (forbidding virtual, public/private members, etc) and then add its own rules. The first NSDM is pointer-interconvertible unless any base classes are non-empty. Any particular base class is pointer-interconvertible only so long as all previous base classes in declaration order are empty of NSDMs.
It's so much easier to just piggyback off of the standard layout rules and say "a standard-layout type is pointer-interconvertible with its first NSDM and all of its base classes."
Having an extra rule also imposes some burden on the specification. It's a partial redundancy, and that breeds errors. For example, C++23 is on track to expand standard layout types by removing the forbidding of mixing public and private members, forcing layout to be ordered strictly by declaration. If pointer-interconvertibility had its own rules, it would have been possible to update standard-layout but not pointer-interconvertibility.
Both C and C++ were defined by tradition before standards were written. When the first standards were written, it was more important to avoid requiring that any implementations break programs that existed for them, than to forbid implementations that would processing various programs from changing in such a fashion as to break them.
Although it would be typical for a derived class object to store all of the parent data in a sequence of consecutive bytes, followed by the remainder of the data for the object, it could sometimes be useful for implementations to deviate from that. If a base class had a char field, a derived class had an int, and sub-derived class had another five char fields, an implementation that nestles one of the sub-derived class fields between the base-class field and first-derived class might be able to store data more compactly than one which places everything from each layer in a single sequence of consecutive bytes.
Note that the C++ Standard expressly waives jurisdiction over what C++ programs should be viewed as "conforming". As such, there was no perceived need to ensure that it didn't classify as UB any programs which most implementations should process usefully. If the authors had foreseen the way compilers would treat the Committee's decisions to waive jurisdiction over various constructs as an invitation to process them nonsensically, they probably would have defined behavior in many more corner cases than they actually did.

What is the size limit for a class?

I was wondering what the size limit for a class is. I did a simple test:
#define CLS(name,other) \
class name\
{\
public: \
name() {};\
other a;\
other b;\
other c;\
other d;\
other e;\
other f;\
other g;\
other h;\
other i;\
other j;\
other k;\
};
class A{
int k;
public:
A(){};
};
CLS(B,A);
CLS(C,B);
CLS(D,C);
CLS(E,D);
CLS(F,E);
CLS(G,F);
CLS(H,G);
CLS(I,H);
CLS(J,I);
It fails to compile with
"'J' : class is too large"
If I remove the final declaration - CLS(J,I);, it all compiles fine.
Is this a compiler-imposed restriction, or is it somewhere in the standard?
In C++11 this is Annex B. Implementations can impose limits, but they should be at least:
Size of an object [262 144].
Data members in a single class [16 384].
Members declared in a single class [4 096].
The third one isn't directly relevant to the kind of construction you're using, I mention it just because it indicates that the second one is indeed the total members, presumably including those in bases and I'm not sure about members-of-members. But it's not just about the members listed in a single class definition.
Your implementation appears to have given up either 2^31 data members, or at size 2^32, since it accepts I but not J. It's fairly obviously reasonable for a compiler to refuse to consider classes with size greater than SIZE_MAX, even if the program happens not to instantiate it or use sizeof on the type. So even with the best possible effort on the part of the compiler I wouldn't ever expect this to work on a 32 bit implementation.
Note that "these quantities are only guidelines and do not determine compliance", so a conforming implication can impose an arbitrary smaller limit even where it has sufficient resources to compile a program that uses larger numbers. There's no minimum limit for conformance.
There are various opportunities in the C++ standard for a conforming implementation to be useless due to ridiculously small resource limits, so there's no additional harm done if this is another one.
C++03 is more-or-less the same:
Size of an object [262 144].
Data members in a single class, structure, or union [16 384].
Members declared in a single class [4 096].
I wanted to mention another place in which class size limit is mentioned, which is in section 1.2 of the Itanium C++ ABI draft
Various representations specified by this ABI impose limitations on
conforming user programs. These include, for the 64-bit Itanium ABI:
The offset of a non-virtual base subobject in the full object
containing it must be representable by a 56-bit signed integer (due to
RTTI implementation). This implies a practical limit of 2**55 bytes on
the size of a class.
I'm sure its compiler dependent. You can run your compiler in a preprocess only mode to see what the generated output is if you're curious. You might also want to look at template expansion rather than macros.

Casting big POD into small POD - guaranteed to work?

Suppose I've a POD struct which has more than 40 members. These members are not built-in types, rather most of them are POD structs, which in turn has lots of members, most of which are POD struct again. This pattern goes up to many levels - POD has POD has POD and so on - up to even 10 or so levels.
I cannot post the actual code, so here is one very simple example of this pattern:
//POD struct
struct Big
{
A a[2]; //POD has POD
B b; //POD has POD
double dar[6];
int m;
bool is;
double d;
char c[10];
};
And A and B are defined as:
struct A
{
int i;
int j;
int k;
};
struct B
{
A a; //POD has POD
double x;
double y;
double z;
char *s;
};
It's really very simplified version of the actual code which was written (in C) almost 20 years back by Citrix Systems when they came up with ICA protocol. Over the years, the code has been changed a lot. Now we've the source code, but we cannot know which code is being used in the current version of ICA, and which has been discarded, as the discarded part is also present in the source code.
That is the background of the problem. The problem is: now we've the source code, and we're building a system on the top of ICA protocol, for which at some point we need to know the values of few members of the big struct. Few members, not all. Fortunately, those members appear in the beginning of the struct, so can we write a struct which is part of the big struct, as:
//Part of struct B
//Whatever members it has, they are in the same order as they appear in Big.
struct partBig
{
A a[2];
B b;
double dar[6];
//rest is ignored
};
Now suppose, we know pointer to Big struct (that we know by deciphering the protocol and data streams), then can we write this:
Big *pBig = GetBig();
partBig *part = (partBig*)pBig; //Is this safe?
/*here can we pretend that part is actually Big, so as to access
first few members safely (using part pointer), namely those
which are defined in partBig?*/
I don't want to define the entire Big struct in our code, as it has too many POD members and if I define the struct entirely, I've to first define hundreds of other structs. I don't want to that, as even if I do, I doubt if I can do that correctly, as I don't know all the structs correctly (as to which version is being used today, and which is discarded).
I've done the casting already, and it seems to work for last one year, I didn't see any problem with that. But now I thought why not start a topic and ask everyone. Maybe, I'll have a better solution, or at least, will make some important notes.
Relevant references from the language specification will be appreciated. :-)
Here is a demo of such casting: http://ideone.com/c7SWr
The language specification has some features that are similar to what you are trying to do: the 6.5.2.3/5 states that if you have several structs that share common initial sequence of members, you are allowed to inspect these common members through any of the structs in the union, regardless of which one is currently active. So, from this guarantee one can easily derive that structs with common initial sequence of members should have identical memory layout for these common members.
However, the language does not seem to explicitly allow doing what you are doing, i.e. reinterpreting one struct object as another unrelated struct object through a pointer cast. (The language allows you to access the first member of a struct object through a pointer cast, but not what you do in your code). So, from that perspective, what you are doing might be a strict-aliasing violation (see 6.5/7). I'm not sure about this though, since it is not immediately clear to me whether the intent of 6.5/7 was to outlaw this kind of access.
However, in any case, I don't think that compilers that use strict-aliasing rules for optimization do it at aggregate type level. Most likely they only do it at fundamental type level, meaning that your code should work fine in practice.
Yes, assuming your small structure is laid out with the same packing/padding rules as the big one, you'll be fine. The memory layout order of fields in a structure is well-defined by the language specifications.
The only way to get BigPart from Big is using reinterpret_cast (or C style, as in your question), and the standard does not specify what happens in that case. However, it should work as expected. It would probably fail only for some super exotic platforms.
The relevant type-aliasing rule is 3.10/10: "If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: ...".
The list contains two cases that are relevant to us:
a type similar (as defined in 4.4) to the dynamic type of the object
an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union)
The first case isn't sufficient: it merely covers cv-qualifications. The second case allows at the union hack that AndreyT mentions (the common initial sequence), but not anything else.

How are structs laid out in memory in C++?

Is the way C++ structs are laid out set by the standard, or at least common across compilers?
I have a struct where one of its members needs to be aligned on 16 byte boundaries, and this would be easier if I can guarantee the ordering of the fields.
Also, for non-virtual classes, is the address of the first element also likely to be the address of the struct?
I'm most interested in GCC and MSVS.
C and C++ both guarantee that fields will be laid out in memory in the same order as you define them. For C++ that's only guaranteed for a POD type1 (anything that would be legitimate as a C struct [Edit: C89/90 -- not, for example, a C99 VLA] will also qualify as a POD).
The compiler is free to insert padding between members and/or at the end of the struct. Most compilers give you some way to control that (e.g., #pragma pack(N)), but it does vary between compilers.
1Well, there is one corner case they didn't think of, where it isn't guaranteed for a POD type -- an access specifier breaks the ordering guarantee:
struct x {
int x;
int y;
public:
int z;
};
This is a POD type, but the public: between y and z means they could theoretically be re-ordered. I'm pretty sure this is purely theoretical though -- I don't know of any compiler that does reorder the members in this situation (and unless memory fails me even worse than usual today, this is fixed in C++0x).
Edit: the relevant parts of the standard (at least most of them) are §9/4:
A POD-struct is an aggregate class that has no non-volatile
data members of type pointer to member, non-POD-struct, non-
POD-union (or array of such types) or reference, and has no
user-defined copy assignment operator and no user-defined
destructor.
and §8.5.1/1:
An aggregate is an array or a class (clause 9) with no user-
declared constructors (12.1), no private or protected non-
static data members (clause 11), no base classes (clause 10)
and no virtual functions (10.3).
and §9.2/12:
...the order of allocation of non-static data members
separated by an access-specifier is unspecified (11.1).
Though that's restricted somewhat by §9.2/17:
A pointer to a POD-struct object, suitably converted using
a reinterpret_cast, points to its initial member...
Therefore, (even if preceded by a public:, the first member you define must come first in memory. Other members separated by public: specifiers could theoretically be rearranged.
I should also point out that there's some room for argument about this. In particular, there's also a rule in §9.2/14:
Two POD-struct (clause 9) types are layout-compatible if they have the same number of nonstatic data members, and corresponding nonstatic data members (in order) have layout-compatible types (3.9).
Therefore, if you have something like:
struct A {
int x;
public:
int y;
public:
int z;
};
It is required to be layout compatible with:
struct B {
int x;
int y;
int z;
};
I'm pretty sure this is/was intended to mean that the members of the two structs must be laid out the same way in memory. Since the second one clearly can't have its members rearranged, the first one shouldn't be either. Unfortunately, the standard never really defines what "layout compatible" means, rendering the argument rather weak at best.
C++ inherits from c the need/desire to work efficiently on many platforms, and therefore leaves certain things up to the compiler. Aside from requiring elements to appear in the specified order, this is one of them.
But many compilers support #pragmas and options to let you establish come control of the packing. See your compiler's documentation.
Is the way C++ structs are laid out
set by the standard, or at least
common across compilers?
There is no guarantee in the standard. I would not depend on compilers having the same alignment
I have a struct where one of its
members needs to be aligned on 16 byte
boundaries, and this would be easier
if I can guarantee the ordering of the
fields.
Compilers will not change the order of the fields.
Here are some links on how to set them for both GCC and MSVC:
For GCC: http://developer.apple.com/mac/library/documentation/DeveloperTools/gcc-4.0.1/gcc/Structure_002dPacking-Pragmas.html
MSVC: http://msdn.microsoft.com/en-us/library/ms253935(VS.80).aspx and http://msdn.microsoft.com/en-us/library/2e70t5y1(VS.80).aspx
I would keep them as structures and use an extern "C" to ensure that it works properly. Maybe not needed but that would definitely work.
C structs (and C++ PODs, for compatibility) are required by the standard to have sequential layout.
The only difference between compilers is alignment, but fortunately, both MSVC and GCC support #pragma pack.
If you want to see the memory layout of all class types in memory under MSVC, add the /d1reportAllClassLayout switch to the compiler command line.
If you want to see what one class is layed out like, add /d1reportSingleClassLayoutNNNNN where NNNNN is the name of the class.
Structures are laid out sequentially in memory. However, the way they're aligned in memory varies based on the operating system.
For things bigger than 4 bytes, there's a difference between Windows and Linux. Linux aligns them as if they're 4 bytes, so for example a double (8 bytes) could start at p+4, p+8, p+12, etc. where p is the start of the structure. In Windows, a double (8 bytes) needs to start at an address that's a multiple of 8 so p+8, p+16, p+24, etc.

Do class/struct members always get created in memory in the order they were declared?

This is a question that was sparked by Rob Walker's answer here.
Suppose I declare a class/struct like so:
struct
{
char A;
int B;
char C;
int D;
};
Is it safe to assume that these members will be declared in exactly that order in memory, or is this a compiler dependent thing? I'm asking because I had always assumed that the compiler can do whatever it wants with them.
This leads into my next question. If the above example causes memory alignment issues, why can the compiler not just turn that into something like this implicitly:
struct
{
char A;
char C;
int B;
int D;
};
(I'm primarily asking about C++, but I'd be interested to hear the C answer as well)
Related topics
Why doesn't GCC optimize structs?
C99 §6.7.2.1 clause 13 states:
Within a structure object, the
non-bit-field members and the units in
which bit-fields reside have addresses
that increase in the order in which
they are declared.
and goes on to say a bit more about padding and addresses. The C89 equivalent section is §6.5.2.1.
C++ is a bit more complicated. In the 1998 and 2003 standards, there is §9.2 clause 12 (clause 15 in C++11):
Nonstatic data members of a
(non-union) class declared without an
intervening access-specifier are
allocated so that later members have
higher addresses within a class
object. The order of allocation of
nonstatic data members separated by an
access-specifier is unspecified
(11.1). Implementation alignment
requirements might cause two adjacent
members not to be allocated
immediately after each other; so might
requirements for space for managing
virtual functions (10.3) and virtual
base classes (10.1).
The data members are arranged in the order declared. The compiler is free to intersperse padding to arrange the memory alignment it likes (and you'll find that many compilers have a boatload a alignment specification options---useful if mixing bits compiled by different programs.).
See also Why doesn't GCC optimize structs?.
It appears that this answer is somewhat obsolete for C++. You learn something everyday. Thanks aib, Nemanja.
I cannot speak for C++, but in C the order is guaranteed to be the same order in memory as declared in the struct.
Basically, you can count on that only for the classes with a standard layout. Strictly speaking, standard layout is a C++0x thing, but it is really just standardizing existing practice/
Aside from padding for alignment, no structure optimization is allowed by any compiler (that I am aware of) for C or C++. I can't speak for C++ classes, as they may be another beast entirely.
Consider your program is interfacing with system/library code on Windows but you want to use GCC. You would have to verify that GCC used an identical layout-optimization algorithm so all your structures would be packed correctly before sending them to the MS-compiled code.
While browsing the related topics at the right, I looked at this question. I figure this may be an interesting corner case when thinking about these issues (unless it's more common than I realize).
To paraphrase, if you have a struct in C that looks something like this:
struct foo{};
and subclass it like so in C++ (using a separate compilation unit):
extern "C" foo;
struct bar: public foo{};
Then the memory alignment won't necessarily be the same for the reasons aib mentions (even amongst compilers from the same vendor).