Related
This FAQ is about Aggregates and PODs and covers the following material:
What are Aggregates?
What are PODs (Plain Old Data)?
How are they related?
How and why are they special?
What changes for C++11?
How to read:
This article is rather long. If you want to know about both aggregates and PODs (Plain Old Data) take time and read it. If you are interested just in aggregates, read only the first part. If you are interested only in PODs then you must first read the definition, implications, and examples of aggregates and then you may jump to PODs but I would still recommend reading the first part in its entirety. The notion of aggregates is essential for defining PODs. If you find any errors (even minor, including grammar, stylistics, formatting, syntax, etc.) please leave a comment, I'll edit.
This answer applies to C++03. For other C++ standards see:
C++11 changes
C++14 changes
C++17 changes
C++20 changes
What are aggregates and why they are special
Formal definition from the C++ standard (C++03 8.5.1 §1):
An aggregate is an array or a class (clause 9) with no user-declared
constructors (12.1), no private or protected non-static data members (clause 11),
no base classes (clause 10), and no virtual functions (10.3).
So, OK, let's parse this definition. First of all, any array is an aggregate. A class can also be an aggregate if… wait! nothing is said about structs or unions, can't they be aggregates? Yes, they can. In C++, the term class refers to all classes, structs, and unions. So, a class (or struct, or union) is an aggregate if and only if it satisfies the criteria from the above definitions. What do these criteria imply?
This does not mean an aggregate class cannot have constructors, in fact it can have a default constructor and/or a copy constructor as long as they are implicitly declared by the compiler, and not explicitly by the user
No private or protected non-static data members. You can have as many private and protected member functions (but not constructors) as well as as many private or protected static data members and member functions as you like and not violate the rules for aggregate classes
An aggregate class can have a user-declared/user-defined copy-assignment operator and/or destructor
An array is an aggregate even if it is an array of non-aggregate class type.
Now let's look at some examples:
class NotAggregate1
{
virtual void f() {} //remember? no virtual functions
};
class NotAggregate2
{
int x; //x is private by default and non-static
};
class NotAggregate3
{
public:
NotAggregate3(int) {} //oops, user-defined constructor
};
class Aggregate1
{
public:
NotAggregate1 member1; //ok, public member
Aggregate1& operator=(Aggregate1 const & rhs) {/* */} //ok, copy-assignment
private:
void f() {} // ok, just a private function
};
You get the idea. Now let's see how aggregates are special. They, unlike non-aggregate classes, can be initialized with curly braces {}. This initialization syntax is commonly known for arrays, and we just learnt that these are aggregates. So, let's start with them.
Type array_name[n] = {a1, a2, …, am};
if(m == n)
the ith element of the array is initialized with ai
else if(m < n)
the first m elements of the array are initialized with a1, a2, …, am and the other n - m elements are, if possible, value-initialized (see below for the explanation of the term)
else if(m > n)
the compiler will issue an error
else (this is the case when n isn't specified at all like int a[] = {1, 2, 3};)
the size of the array (n) is assumed to be equal to m, so int a[] = {1, 2, 3}; is equivalent to int a[3] = {1, 2, 3};
When an object of scalar type (bool, int, char, double, pointers, etc.) is value-initialized it means it is initialized with 0 for that type (false for bool, 0.0 for double, etc.). When an object of class type with a user-declared default constructor is value-initialized its default constructor is called. If the default constructor is implicitly defined then all nonstatic members are recursively value-initialized. This definition is imprecise and a bit incorrect but it should give you the basic idea. A reference cannot be value-initialized. Value-initialization for a non-aggregate class can fail if, for example, the class has no appropriate default constructor.
Examples of array initialization:
class A
{
public:
A(int) {} //no default constructor
};
class B
{
public:
B() {} //default constructor available
};
int main()
{
A a1[3] = {A(2), A(1), A(14)}; //OK n == m
A a2[3] = {A(2)}; //ERROR A has no default constructor. Unable to value-initialize a2[1] and a2[2]
B b1[3] = {B()}; //OK b1[1] and b1[2] are value initialized, in this case with the default-ctor
int Array1[1000] = {0}; //All elements are initialized with 0;
int Array2[1000] = {1}; //Attention: only the first element is 1, the rest are 0;
bool Array3[1000] = {}; //the braces can be empty too. All elements initialized with false
int Array4[1000]; //no initializer. This is different from an empty {} initializer in that
//the elements in this case are not value-initialized, but have indeterminate values
//(unless, of course, Array4 is a global array)
int array[2] = {1, 2, 3, 4}; //ERROR, too many initializers
}
Now let's see how aggregate classes can be initialized with braces. Pretty much the same way. Instead of the array elements we will initialize the non-static data members in the order of their appearance in the class definition (they are all public by definition). If there are fewer initializers than members, the rest are value-initialized. If it is impossible to value-initialize one of the members which were not explicitly initialized, we get a compile-time error. If there are more initializers than necessary, we get a compile-time error as well.
struct X
{
int i1;
int i2;
};
struct Y
{
char c;
X x;
int i[2];
float f;
protected:
static double d;
private:
void g(){}
};
Y y = {'a', {10, 20}, {20, 30}};
In the above example y.c is initialized with 'a', y.x.i1 with 10, y.x.i2 with 20, y.i[0] with 20, y.i[1] with 30 and y.f is value-initialized, that is, initialized with 0.0. The protected static member d is not initialized at all, because it is static.
Aggregate unions are different in that you may initialize only their first member with braces. I think that if you are advanced enough in C++ to even consider using unions (their use may be very dangerous and must be thought of carefully), you could look up the rules for unions in the standard yourself :).
Now that we know what's special about aggregates, let's try to understand the restrictions on classes; that is, why they are there. We should understand that memberwise initialization with braces implies that the class is nothing more than the sum of its members. If a user-defined constructor is present, it means that the user needs to do some extra work to initialize the members therefore brace initialization would be incorrect. If virtual functions are present, it means that the objects of this class have (on most implementations) a pointer to the so-called vtable of the class, which is set in the constructor, so brace-initialization would be insufficient. You could figure out the rest of the restrictions in a similar manner as an exercise :).
So enough about the aggregates. Now we can define a stricter set of types, to wit, PODs
What are PODs and why they are special
Formal definition from the C++ standard (C++03 9 §4):
A POD-struct is an aggregate class
that has no non-static data members of
type non-POD-struct, non-POD-union (or
array of such types) or reference, and
has no user-defined copy assignment
operator and no user-defined
destructor. Similarly, a POD-union is
an aggregate union that has no
non-static data members of type
non-POD-struct, non-POD-union (or
array of such types) or reference, and
has no user-defined copy assignment
operator and no user-defined
destructor. A POD class is a class
that is either a POD-struct or a
POD-union.
Wow, this one's tougher to parse, isn't it? :) Let's leave unions out (on the same grounds as above) and rephrase in a bit clearer way:
An aggregate class is called a POD if
it has no user-defined copy-assignment
operator and destructor and none of
its nonstatic members is a non-POD
class, array of non-POD, or a
reference.
What does this definition imply? (Did I mention POD stands for Plain Old Data?)
All POD classes are aggregates, or, to put it the other way around, if a class is not an aggregate then it is sure not a POD
Classes, just like structs, can be PODs even though the standard term is POD-struct for both cases
Just like in the case of aggregates, it doesn't matter what static members the class has
Examples:
struct POD
{
int x;
char y;
void f() {} //no harm if there's a function
static std::vector<char> v; //static members do not matter
};
struct AggregateButNotPOD1
{
int x;
~AggregateButNotPOD1() {} //user-defined destructor
};
struct AggregateButNotPOD2
{
AggregateButNotPOD1 arrOfNonPod[3]; //array of non-POD class
};
POD-classes, POD-unions, scalar types, and arrays of such types are collectively called POD-types.
PODs are special in many ways. I'll provide just some examples.
POD-classes are the closest to C structs. Unlike them, PODs can have member functions and arbitrary static members, but neither of these two change the memory layout of the object. So if you want to write a more or less portable dynamic library that can be used from C and even .NET, you should try to make all your exported functions take and return only parameters of POD-types.
The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.
For objects of POD types it is guaranteed by the standard that when you memcpy the contents of your object into an array of char or unsigned char, and then memcpy the contents back into your object, the object will hold its original value. Do note that there is no such guarantee for objects of non-POD types. Also, you can safely copy POD objects with memcpy. The following example assumes T is a POD-type:
#define N sizeof(T)
char buf[N];
T obj; // obj initialized to its original value
memcpy(buf, &obj, N); // between these two calls to memcpy,
// obj might be modified
memcpy(&obj, buf, N); // at this point, each subobject of obj of scalar type
// holds its original value
goto statement. As you may know, it is illegal (the compiler should issue an error) to make a jump via goto from a point where some variable was not yet in scope to a point where it is already in scope. This restriction applies only if the variable is of non-POD type. In the following example f() is ill-formed whereas g() is well-formed. Note that Microsoft's compiler is too liberal with this rule—it just issues a warning in both cases.
int f()
{
struct NonPOD {NonPOD() {}};
goto label;
NonPOD x;
label:
return 0;
}
int g()
{
struct POD {int i; char c;};
goto label;
POD x;
label:
return 0;
}
It is guaranteed that there will be no padding in the beginning of a POD object. In other words, if a POD-class A's first member is of type T, you can safely reinterpret_cast from A* to T* and get the pointer to the first member and vice versa.
The list goes on and on…
Conclusion
It is important to understand what exactly a POD is because many language features, as you see, behave differently for them.
What changes for C++11?
Aggregates
The standard definition of an aggregate has changed slightly, but it's still pretty much the same:
An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1),
no brace-or-equal-initializers for non-static data members (9.2), no private or protected
non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).
Ok, what changed?
Previously, an aggregate could have no user-declared constructors, but now it can't have user-provided constructors. Is there a difference? Yes, there is, because now you can declare constructors and default them:
struct Aggregate {
Aggregate() = default; // asks the compiler to generate the default implementation
};
This is still an aggregate because a constructor (or any special member function) that is defaulted on the first declaration is not user-provided.
Now an aggregate cannot have any brace-or-equal-initializers for non-static data members. What does this mean? Well, this is just because with this new standard, we can initialize members directly in the class like this:
struct NotAggregate {
int x = 5; // valid in C++11
std::vector<int> s{1,2,3}; // also valid
};
Using this feature makes the class no longer an aggregate because it's basically equivalent to providing your own default constructor.
So, what is an aggregate didn't change much at all. It's still the same basic idea, adapted to the new features.
What about PODs?
PODs went through a lot of changes. Lots of previous rules about PODs were relaxed in this new standard, and the way the definition is provided in the standard was radically changed.
The idea of a POD is to capture basically two distinct properties:
It supports static initialization, and
Compiling a POD in C++ gives you the same memory layout as a struct compiled in C.
Because of this, the definition has been split into two distinct concepts: trivial classes and standard-layout classes, because these are more useful than POD. The standard now rarely uses the term POD, preferring the more specific trivial and standard-layout concepts.
The new definition basically says that a POD is a class that is both trivial and has standard-layout, and this property must hold recursively for all non-static data members:
A POD struct is a non-union class that is both a trivial class and a standard-layout class,
and has no non-static data members of type non-POD struct, non-POD union (or array of such types).
Similarly, a POD union is a union that is both a trivial class and a standard layout class, and has
no non-static data members of type non-POD struct, non-POD union (or array of such types).
A POD class is a class that is either a POD struct or a POD union.
Let's go over each of these two properties in detail separately.
Trivial classes
Trivial is the first property mentioned above: trivial classes support static initialization.
If a class is trivially copyable (a superset of trivial classes), it is ok to copy its representation over the place with things like memcpy and expect the result to be the same.
The standard defines a trivial class as follows:
A trivially copyable class is a class that:
— has no non-trivial copy constructors (12.8),
— has no non-trivial move constructors (12.8),
— has no non-trivial copy assignment operators (13.5.3, 12.8),
— has no non-trivial move assignment operators (13.5.3, 12.8), and
— has a trivial destructor (12.4).
A trivial class is a class that has a trivial default constructor (12.1) and is trivially copyable.
[ Note: In particular, a trivially copyable or trivial class does not have virtual functions
or virtual base classes.—end note ]
So, what are all those trivial and non-trivial things?
A copy/move constructor for class X is trivial if it is not user-provided and if
— class X has no virtual functions (10.3) and no virtual base classes (10.1), and
— the constructor selected to copy/move each direct base class subobject is trivial, and
— for each non-static data member of X that is of class type (or array thereof), the constructor
selected to copy/move that member is trivial;
otherwise the copy/move constructor is non-trivial.
Basically this means that a copy or move constructor is trivial if it is not user-provided, the class has nothing virtual in it, and this property holds recursively for all the members of the class and for the base class.
The definition of a trivial copy/move assignment operator is very similar, simply replacing the word "constructor" with "assignment operator".
A trivial destructor also has a similar definition, with the added constraint that it can't be virtual.
And yet another similar rule exists for trivial default constructors, with the addition that a default constructor is not-trivial if the class has non-static data members with brace-or-equal-initializers, which we've seen above.
Here are some examples to clear everything up:
// empty classes are trivial
struct Trivial1 {};
// all special members are implicit
struct Trivial2 {
int x;
};
struct Trivial3 : Trivial2 { // base class is trivial
Trivial3() = default; // not a user-provided ctor
int y;
};
struct Trivial4 {
public:
int a;
private: // no restrictions on access modifiers
int b;
};
struct Trivial5 {
Trivial1 a;
Trivial2 b;
Trivial3 c;
Trivial4 d;
};
struct Trivial6 {
Trivial2 a[23];
};
struct Trivial7 {
Trivial6 c;
void f(); // it's okay to have non-virtual functions
};
struct Trivial8 {
int x;
static NonTrivial1 y; // no restrictions on static members
};
struct Trivial9 {
Trivial9() = default; // not user-provided
// a regular constructor is okay because we still have default ctor
Trivial9(int x) : x(x) {};
int x;
};
struct NonTrivial1 : Trivial3 {
virtual void f(); // virtual members make non-trivial ctors
};
struct NonTrivial2 {
NonTrivial2() : z(42) {} // user-provided ctor
int z;
};
struct NonTrivial3 {
NonTrivial3(); // user-provided ctor
int w;
};
NonTrivial3::NonTrivial3() = default; // defaulted but not on first declaration
// still counts as user-provided
struct NonTrivial5 {
virtual ~NonTrivial5(); // virtual destructors are not trivial
};
Standard-layout
Standard-layout is the second property. The standard mentions that these are useful for communicating with other languages, and that's because a standard-layout class has the same memory layout of the equivalent C struct or union.
This is another property that must hold recursively for members and all base classes. And as usual, no virtual functions or virtual base classes are allowed. That would make the layout incompatible with C.
A relaxed rule here is that standard-layout classes must have all non-static data members with the same access control. Previously these had to be all public, but now you can make them private or protected, as long as they are all private or all protected.
When using inheritance, only one class in the whole inheritance tree can have non-static data members, and the first non-static data member cannot be of a base class type (this could break aliasing rules), otherwise, it's not a standard-layout class.
This is how the definition goes in the standard text:
A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or array of such types)
or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and at most one base class with
non-static data members, or has no base classes with non-static data members, and
— has no base classes of the same type as the first non-static data member.
A standard-layout struct is a standard-layout class defined with the class-key struct or
the class-key class.
A standard-layout union is a standard-layout class defined with the class-key union.
[ Note: Standard-layout classes are useful for communicating with code written in other programming languages. Their layout is specified in 9.2.—end note ]
And let's see a few examples.
// empty classes have standard-layout
struct StandardLayout1 {};
struct StandardLayout2 {
int x;
};
struct StandardLayout3 {
private: // both are private, so it's ok
int x;
int y;
};
struct StandardLayout4 : StandardLayout1 {
int x;
int y;
void f(); // perfectly fine to have non-virtual functions
};
struct StandardLayout5 : StandardLayout1 {
int x;
StandardLayout1 y; // can have members of base type if they're not the first
};
struct StandardLayout6 : StandardLayout1, StandardLayout5 {
// can use multiple inheritance as long only
// one class in the hierarchy has non-static data members
};
struct StandardLayout7 {
int x;
int y;
StandardLayout7(int x, int y) : x(x), y(y) {} // user-provided ctors are ok
};
struct StandardLayout8 {
public:
StandardLayout8(int x) : x(x) {} // user-provided ctors are ok
// ok to have non-static data members and other members with different access
private:
int x;
};
struct StandardLayout9 {
int x;
static NonStandardLayout1 y; // no restrictions on static members
};
struct NonStandardLayout1 {
virtual f(); // cannot have virtual functions
};
struct NonStandardLayout2 {
NonStandardLayout1 X; // has non-standard-layout member
};
struct NonStandardLayout3 : StandardLayout1 {
StandardLayout1 x; // first member cannot be of the same type as base
};
struct NonStandardLayout4 : StandardLayout3 {
int z; // more than one class has non-static data members
};
struct NonStandardLayout5 : NonStandardLayout3 {}; // has a non-standard-layout base class
Conclusion
With these new rules a lot more types can be PODs now. And even if a type is not POD, we can take advantage of some of the POD properties separately (if it is only one of trivial or standard-layout).
The standard library has traits to test these properties in the header <type_traits>:
template <typename T>
struct std::is_pod;
template <typename T>
struct std::is_trivial;
template <typename T>
struct std::is_trivially_copyable;
template <typename T>
struct std::is_standard_layout;
What has changed for C++14
We can refer to the Draft C++14 standard for reference.
Aggregates
This is covered in section 8.5.1 Aggregates which gives us the following definition:
An aggregate is an array or a class (Clause 9) with no user-provided
constructors (12.1), no private or protected non-static data members
(Clause 11), no base classes (Clause 10), and no virtual functions
(10.3).
The only change is now adding in-class member initializers does not make a class a non-aggregate. So the following example from C++11 aggregate initialization for classes with member in-place initializers:
struct A
{
int a = 3;
int b = 3;
};
was not an aggregate in C++11 but it is in C++14. This change is covered in N3605: Member initializers and aggregates, which has the following abstract:
Bjarne Stroustrup and Richard Smith raised an issue about aggregate
initialization and member-initializers not working together. This
paper proposes to fix the issue by adopting Smith's proposed wording
that removes a restriction that aggregates can't have
member-initializers.
POD stays the same
The definition for POD(plain old data) struct is covered in section 9 Classes which says:
A POD struct110 is a non-union class that is both a trivial class and
a standard-layout class, and has no non-static data members of type
non-POD struct, non-POD union (or array of such types). Similarly, a
POD union is a union that is both a trivial class and a
standard-layout class, and has no non-static data members of type
non-POD struct, non-POD union (or array of such types). A POD class is
a class that is either a POD struct or a POD union.
which is the same wording as C++11.
Standard-Layout Changes for C++14
As noted in the comments pod relies on the definition of standard-layout and that did change for C++14 but this was via defect reports that were applied to C++14 after the fact.
There were three DRs:
DR 1672
DR 1813
DR 2120
So standard-layout went from this Pre C++14:
A standard-layout class is a class that:
(7.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(7.2) has no virtual functions ([class.virtual]) and no virtual base classes ([class.mi]),
(7.3) has the same access control (Clause [class.access]) for all non-static data members,
(7.4) has no non-standard-layout base classes,
(7.5) either has no non-static data members in the most derived class and at most one base class with non-static data members, or has
no base classes with non-static data members, and
(7.6) has no base classes of the same type as the first non-static data member.109
To this in C++14:
A class S is a standard-layout class if it:
(3.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(3.2) has no virtual functions and no virtual base classes,
(3.3) has the same access control for all non-static data members,
(3.4) has no non-standard-layout base classes,
(3.5) has at most one base class subobject of any given type,
(3.6) has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
(3.7) has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows.104
[ Note: M(X) is the set of the types of all non-base-class subobjects that may be at a zero offset in X.
— end note
]
(3.7.1) If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
(3.7.2) If X is a non-union class type with a non-static data member of type X0 that is either of zero size or is the first
non-static data member of X (where said member may be an anonymous
union), the set M(X) consists of X0 and the elements of M(X0).
(3.7.3) If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the
ith non-static data member of X.
(3.7.4) If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
(3.7.5) If X is a non-class, non-array type, the set M(X) is empty.
Changes in C++17
Download the C++17 International Standard final draft here.
Aggregates
C++17 expands and enhances aggregates and aggregate initialization. The standard library also now includes an std::is_aggregate type trait class. Here is the formal definition from section 11.6.1.1 and 11.6.1.2 (internal references elided):
An aggregate is an array or a class with
— no user-provided, explicit, or inherited constructors,
— no private or protected non-static data members,
— no virtual functions, and
— no virtual, private, or protected base classes.
[ Note: Aggregate initialization does not allow accessing protected and private base class’ members or constructors. —end note ]
The elements of an aggregate are:
— for an array, the array elements in increasing subscript order, or
— for a class, the direct base classes in declaration order, followed by the direct non-static data members that are not members of an anonymous union, in declaration order.
What changed?
Aggregates can now have public, non-virtual base classes. Furthermore, it is not a requirement that base classes be aggregates. If they are not aggregates, they are list-initialized.
struct B1 // not a aggregate
{
int i1;
B1(int a) : i1(a) { }
};
struct B2
{
int i2;
B2() = default;
};
struct M // not an aggregate
{
int m;
M(int a) : m(a) { }
};
struct C : B1, B2
{
int j;
M m;
C() = default;
};
C c { { 1 }, { 2 }, 3, { 4 } };
cout
<< "is C aggregate?: " << (std::is_aggregate<C>::value ? 'Y' : 'N')
<< " i1: " << c.i1 << " i2: " << c.i2
<< " j: " << c.j << " m.m: " << c.m.m << endl;
//stdout: is C aggregate?: Y, i1=1 i2=2 j=3 m.m=4
Explicit defaulted constructors are disallowed
struct D // not an aggregate
{
int i = 0;
D() = default;
explicit D(D const&) = default;
};
Inheriting constructors are disallowed
struct B1
{
int i1;
B1() : i1(0) { }
};
struct C : B1 // not an aggregate
{
using B1::B1;
};
Trivial Classes
The definition of trivial class was reworked in C++17 to address several defects that were not addressed in C++14. The changes were technical in nature. Here is the new definition at 12.0.6 (internal references elided):
A trivially copyable class is a class:
— where each copy constructor, move constructor, copy assignment operator, and move assignment operator is either deleted or trivial,
— that has at least one non-deleted copy constructor, move constructor, copy assignment operator, or move assignment operator, and
— that has a trivial, non-deleted destructor.
A trivial class is a class that is trivially copyable and has one or more default constructors, all of which are either trivial or deleted and at least one of which is not deleted. [ Note: In particular, a trivially copyable
or trivial class does not have virtual functions or virtual base classes.—end note ]
Changes:
Under C++14, for a class to be trivial, the class could not have any copy/move constructor/assignment operators that were non-trivial. However, then an implicitly declared as defaulted constructor/operator could be non-trivial and yet defined as deleted because, for example, the class contained a subobject of class type that could not be copied/moved. The presence of such non-trivial, defined-as-deleted constructor/operator would cause the whole class to be non-trivial. A similar problem existed with destructors. C++17 clarifies that the presence of such constructor/operators does not cause the class to be non-trivially copyable, hence non-trivial, and that a trivially copyable class must have a trivial, non-deleted destructor. DR1734, DR1928
C++14 allowed a trivially copyable class, hence a trivial class, to have every copy/move constructor/assignment operator declared as deleted. If such as class was also standard layout, it could, however, be legally copied/moved with std::memcpy. This was a semantic contradiction, because, by defining as deleted all constructor/assignment operators, the creator of the class clearly intended that the class could not be copied/moved, yet the class still met the definition of a trivially copyable class. Hence in C++17 we have a new clause stating that trivially copyable class must have at least one trivial, non-deleted (though not necessarily publicly accessible) copy/move constructor/assignment operator. See N4148, DR1734
The third technical change concerns a similar problem with default constructors. Under C++14, a class could have trivial default constructors that were implicitly defined as deleted, yet still be a trivial class. The new definition clarifies that a trivial class must have a least one trivial, non-deleted default constructor. See DR1496
Standard-layout Classes
The definition of standard-layout was also reworked to address defect reports. Again the changes were technical in nature. Here is the text from the standard (12.0.7). As before, internal references are elided:
A class S is a standard-layout class if it:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions and no virtual base classes,
— has the same access control for all non-static data members,
— has no non-standard-layout base classes,
— has at most one base class subobject of any given type,
— has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
— has no element of the set M(S) of types (defined below) as a base class.108
M(X) is defined as follows:
— If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
— If X is a non-union class type whose first non-static data member has type X0 (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0).
— If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the ith non-static data member of X.
— If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
— If X is a non-class, non-array type, the set M(X) is empty.
[ Note: M(X) is the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset in X. —end note ]
[ Example:
struct B { int i; }; // standard-layout class
struct C : B { }; // standard-layout class
struct D : C { }; // standard-layout class
struct E : D { char : 4; }; // not a standard-layout class
struct Q {};
struct S : Q { };
struct T : Q { };
struct U : S, T { }; // not a standard-layout class
—end example ]
108) This ensures that two subobjects that have the same class type and that belong to the same most derived object are not allocated at the same address.
Changes:
Clarified that the requirement that only one class in the derivation tree "has" non-static data members refers to a class where such data members are first declared, not classes where they may be inherited, and extended this requirement to non-static bit fields. Also clarified that a standard-layout class "has at most one base class subobject of any given type." See DR1813, DR1881
The definition of standard-layout has never allowed the type of any base class to be the same type as the first non-static data member. It is to avoid a situation where a data member at offset zero has the same type as any base class. The C++17 standard provides a more rigorous, recursive definition of "the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset" so as to prohibit such types from being the type of any base class. See DR1672, DR2120.
Note: The C++ standards committee intended the above changes based on defect reports to apply to C++14, though the new language is not in the published C++14 standard. It is in the C++17 standard.
POD in C++11 was basically split into two different axes here: triviality and layout. Triviality is about the relationship between an object's conceptual value and the bits of data within its storage. Layout is about... well, the layout of an object's subobjects. Only class types have layout, while all types have triviality relationships.
So here is what the triviality axis is about:
Non-trivially copyable: The value of objects of such types may be more than just the binary data that are stored directly within the object.
For example, unique_ptr<T> stores a T*; that is the totality of the binary data within the object. But that's not the totality of the value of a unique_ptr<T>. A unique_ptr<T> stores either a nullptr or a pointer to an object whose lifetime is managed by the unique_ptr<T> instance. That management is part of the value of a unique_ptr<T>. And that value is not part of the binary data of the object; it is created by the various member functions of that object.
For example, to assign nullptr to a unique_ptr<T> is to do more than just change the bits stored in the object. Such an assignment must destroy any object managed by the unique_ptr. To manipulate the internal storage of a unique_ptr without going through its member functions would damage this mechanism, to change its internal T* without destroying the object it currently manages, would violate the conceptual value that the object possesses.
Trivially copyable: The value of such objects are exactly and only the contents of their binary storage. This is what makes it reasonable to allow copying that binary storage to be equivalent to copying the object itself.
The specific rules that define trivial copyability (trivial destructor, trivial/deleted copy/move constructors/assignment) are what is required for a type to be binary-value-only. An object's destructor can participate in defining the "value" of an object, as in the case with unique_ptr. If that destructor is trivial, then it doesn't participate in defining the object's value.
Specialized copy/move operations also can participate in an object's value. unique_ptr's move constructor modifies the source of the move operation by null-ing it out. This is what ensures that the value of a unique_ptr is unique. Trivial copy/move operations mean that such object value shenanigans are not being played, so the object's value can only be the binary data it stores.
Trivial: This object is considered to have a functional value for any bits that it stores. Trivially copyable defines the meaning of the data store of an object as being just that data. But such types can still control how data gets there (to some extent). Such a type can have default member initializers and/or a default constructor that ensures that a particular member always has a particular value. And thus, the conceptual value of the object can be restricted to a subset of the binary data that it could store.
Performing default initialization on a type that has a trivial default constructor will leave that object with completely uninitialized values. As such, a type with a trivial default constructor is logically valid with any binary data in its data storage.
The layout axis is really quite simple. Compilers are given a lot of leeway in deciding how the subobjects of a class are stored within the class's storage. However, there are some cases where this leeway is not necessary, and having more rigid ordering guarantees is useful.
Such types are standard layout types. And the C++ standard doesn't even really do much with saying what that layout is specifically. It basically says three things about standard layout types:
The first subobject is at the same address as the object itself.
You can use offsetof to get a byte offset from the outer object to one of its member subobjects.
unions get to play some games with accessing subobjects through an inactive member of a union if the active member is (at least partially) using the same layout as the inactive one being accessed.
Compilers generally permit standard layout objects to map to struct types with the same members in C. But there is no statement of that in the C++ standard; that's just what compilers feel like doing.
POD is basically a useless term at this point. It is just the intersection of trivial copyability (the value is only its binary data) and standard layout (the order of its subobjects is more well-defined). One can infer from such things that the type is C-like and could map to similar C objects. But the standard has no statements to that effect.
can you please elaborate following rules:
I'll try:
a) standard-layout classes must have all non-static data members with the same access control
That's simple: all non-static data members must all be public, private, or protected. You can't have some public and some private.
The reasoning for them goes to the reasoning for having a distinction between "standard layout" and "not standard layout" at all. Namely, to give the compiler the freedom to choose how to put things into memory. It's not just about vtable pointers.
Back when they standardized C++ in 98, they had to basically predict how people would implement it. While they had quite a bit of implementation experience with various flavors of C++, they weren't certain about things. So they decided to be cautious: give the compilers as much freedom as possible.
That's why the definition of POD in C++98 is so strict. It gave C++ compilers great latitude on member layout for most classes. Basically, POD types were intended to be special cases, something you specifically wrote for a reason.
When C++11 was being worked on, they had a lot more experience with compilers. And they realized that... C++ compiler writers are really lazy. They had all this freedom, but they didn't do anything with it.
The rules of standard layout are more or less codifying common practice: most compilers didn't really have to change much if anything at all to implement them (outside of maybe some stuff for the corresponding type traits).
Now, when it came to public/private, things are different. The freedom to reorder which members are public vs. private actually can matter to the compiler, particularly in debugging builds. And since the point of standard layout is that there is compatibility with other languages, you can't have the layout be different in debug vs. release.
Then there's the fact that it doesn't really hurt the user. If you're making an encapsulated class, odds are good that all of your data members will be private anyway. You generally don't expose public data members on fully encapsulated types. So this would only be a problem for those few users who do want to do that, who want that division.
So it's no big loss.
b) only one class in the whole inheritance tree can have non-static data members,
The reason for this one comes back to why they standardized standard layout again: common practice.
There's no common practice when it comes to having two members of an inheritance tree that actually store things. Some put the base class before the derived, others do it the other way. Which way do you order the members if they come from two base classes? And so on. Compilers diverge greatly on these questions.
Also, thanks to the zero/one/infinity rule, once you say you can have two classes with members, you can say as many as you want. This requires adding a lot of layout rules for how to handle this. You have to say how multiple inheritance works, which classes put their data before other classes, etc. That's a lot of rules, for very little material gain.
You can't make everything that doesn't have virtual functions and a default constructor standard layout.
and the first non-static data member cannot be of a base class type (this could break aliasing rules).
I can't really speak to this one. I'm not educated enough in C++'s aliasing rules to really understand it. But it has something to do with the fact that the base member will share the same address as the base class itself. That is:
struct Base {};
struct Derived : Base { Base b; };
Derived d;
static_cast<Base*>(&d) == &d.b;
And that's probably against C++'s aliasing rules. In some way.
However, consider this: how useful could having the ability to do this ever actually be? Since only one class can have non-static data members, then Derived must be that class (since it has a Base as a member). So Base must be empty (of data). And if Base is empty, as well as a base class... why have a data member of it at all?
Since Base is empty, it has no state. So any non-static member functions will do what they do based on their parameters, not their this pointer.
So again: no big loss.
What changes in c++20
Following the rest of the clear theme of this question, the meaning and use of aggregates continues to change with every standard. There are several key changes on the horizon.
Types with user-declared constructors P1008
In C++17, this type is still an aggregate:
struct X {
X() = delete;
};
And hence, X{} still compiles because that is aggregate initialization - not a constructor invocation. See also: When is a private constructor not a private constructor?
In C++20, the restriction will change from requiring:
no user-provided, explicit, or inherited constructors
to
no user-declared or inherited constructors
This has been adopted into the C++20 working draft. Neither the X here nor the C in the linked question will be aggregates in C++20.
This also makes for a yo-yo effect with the following example:
class A { protected: A() { }; };
struct B : A { B() = default; };
auto x = B{};
In C++11/14, B was not an aggregate due to the base class, so B{} performs value-initialization which calls B::B() which calls A::A(), at a point where it is accessible. This was well-formed.
In C++17, B became an aggregate because base classes were allowed, which made B{} aggregate-initialization. This requires copy-list-initializing an A from {}, but from outside the context of B, where it is not accessible. In C++17, this is ill-formed (auto x = B(); would be fine though).
In C++20 now, because of the above rule change, B once again ceases to be an aggregate (not because of the base class, but because of the user-declared default constructor - even though it's defaulted). So we're back to going through B's constructor, and this snippet becomes well-formed.
Initializing aggregates from a parenthesized list of values P960
A common issue that comes up is wanting to use emplace()-style constructors with aggregates:
struct X { int a, b; };
std::vector<X> xs;
xs.emplace_back(1, 2); // error
This does not work, because emplace will try to effectively perform the initialization X(1, 2), which is not valid. The typical solution is to add a constructor to X, but with this proposal (currently working its way through Core), aggregates will effectively have synthesized constructors which do the right thing - and behave like regular constructors. The above code will compile as-is in C++20.
Class Template Argument Deduction (CTAD) for Aggregates P1021 (specifically P1816)
In C++17, this does not compile:
template <typename T>
struct Point {
T x, y;
};
Point p{1, 2}; // error
Users would have to write their own deduction guide for all aggregate templates:
template <typename T> Point(T, T) -> Point<T>;
But as this is in some sense "the obvious thing" to do, and is basically just boilerplate, the language will do this for you. This example will compile in C++20 (without the need for the user-provided deduction guide).
C++ allows instantiating a class, setting the public members' value via an initialization list, as the example below shows for b1:
class B {
public:
int i;
std::string str;
};
B b1{ 42,"foo" };
B b2();
; however, if I provide a constructor
B(int k) { }
, it won't compile.
So, what's going on behind the hood?
Is it that when no constructor is provide, the compiler will provide one? But how could it provide one with initialization list? I thought it will just provide a "blank" constructor taking no input, as the example shows for b2. Or does it provide both?
But how could it provide one with initialization list?
No, it won't.
Note that for list initialization B b1{ 42,"foo" };, aggregate initialization is performed.
If T is an aggregate type, aggregate initialization is performed.
And B is an aggregate type,
An aggregate is one of the following types:
array type
class type (typically, struct or union), that has
no private or protected non-static data members
no user-provided, inherited, or explicit constructors (explicitly defaulted or deleted constructors are allowed)
no virtual, private, or protected base classes
no virtual member functions
That's why B b1{ 42,"foo" }; works well.
And If you provide a user-defined constructor, B becomes non-aggregate type, then aggregate initialization won't work again. In this case, you can only initialize B like B b1{42}; or B b2(42);, which will call the appropriate constructor.
BTW: After providing a user-defined constructor (taking one parameter), the implicitly-declared default constructor won't be declared by the compiler again. It means B b2; or B b2{}; won't work again.
BTW2: B b2(); might be a function declaration. See Most vexing parse.
Here is an example of my question:
class MyBaseClass
{
public:
MyBaseClass(): my_bool(false), my_value(0)
{}
MyBaseClass(bool b, int i): my_bool(b), my_value(i)
{}
private:
bool my_bool;
int my_value;
}
class MyDerivedClass1 : public ::MyBaseClass
{
public:
MyDerivedClass1(double d): my_double(d)
{}
private:
double my_double;
}
class MyDerivedClass2 : public ::MyBaseClass
{
public:
MyDerivedClass2(double d): MyBaseClass(), my_double(d)
{}
private:
double my_double;
}
Why isn't the MyDerivedClass1 an ok way to initialize my derived class versus having to explicitly initialize the base class like in MyDerivedClass2?
I guess I don't understand why I can't just rely on C++ calling my base constructor? I know if I wanted them initialized to something different I'd have to call the other constructor in my initialization list, but all I want is the base constructor to be called anyway.
There is no difference between providing a default-constructed base class in the initializer list or not providing it. What you use if entirely your style. (or the company)
In general, I would say you have 2 options (let's keep constructors out of scope), assuming you always initialize your members.
Option 1: Initialize all members in the init-list.
This option should be used if you are C++98-compatible. It has as advantage that you are defining all construction parameters in a single list, making it easy to search through. (Unless you have 50+ members)
The disadvantage of this option is a lot of duplication when you have multiple constructors.
With this, you can have 3 variants for them:
Skip the default-initialized classes, this makes the list shorter though it's hard to check if you have intended to 'forget' it. (Think replacing a class by a ptr to it)
Default initialize all members, this makes the list longer, though indicates clearly the intent
Explicitly provide the class you are initializing, by copy contructing with a tempuary
Option 2: Initialize all members on declaration, except for constructor parameters
This option assumes you initialize everything in the class declaration. Yet again you can explicitly call the default constructor or not, preferably with braced init, as round braces are interpreted as a function declaration.
In the initializer list you only have to put the members which are linked to the construction parameters.
The advantage of this option is readability (especially for large classes). The disadvantage is that this limits your compiler options to 'modern' compilers.
Base classes
If we again consider base classes and look at them as if they were members, the consistent way would be to declare them explicitely for option 1 and don't write them for option 2.
Personnally I like option 2, as I too often encounter classes with too many members and checking if all members are initialized is easier by hand with this one.
Option 1 however is often used because of legacy in order to keep the code consistent.
POD
Important to mention are the PODs (Plain old datatype), like int, double ...
If you don't initialize them you get some random data. This makes it important to explicitly intialize them, regardless of the method you use to do so.
Conclusion
So in the end, it's all a matter of style with no functional difference.
Although in most cases there isn't a semantic difference and the issue is mostly one of style, there is one case where it is a difference: when the default constructor of the base class is not user-provided.
The difference comes from the fact that a base that is not in the member initializer list is default-initialized, while explicitly writing Base() value-initializes it, which performs zero-initialization if the default constructor isn't user-provided.
Thus:
struct A { int i; };
struct B : A {
B() : j(i) {} // may be undefined behavior
int j;
};
struct C : A {
C() : A(), j(i) {} // OK; both i and j are zero
int j;
};
struct B {
int b1, b2;
B(int, int);
};
struct D : B {
int d1, d2;
// which is technically better ?
D (int i, int j, int k, int l) : B(i,j), d1(k), d2(l) {} // 1st Base
// or
D (int i, int j, int k, int l) : d1(k), d2(l), B(i,j) {} // last Base
};
Above is just a pseudo code. In actual, I wanted to know that does the order of calling base constructor matter?
Are there any bad behaviors (especially corner cases) caused by any of the cases? My question is on more technical aspect and not on coding styles.
The order you refer in your question is not the "order of calling base constructor". In fact, you can't call a constructor. Constructors are not callable by the user. Only compiler can call constructors.
What you can do is to specify initializers. In this case (constructor initializer list) you are specifying initializers for subobjects of some larger object. The order in which you specify these initializers does not matter: the compiler will call the constructors in a very specific order defined by the language specification, regardless of the order in which you specify the initializers. The base class constructors are always called first (in the order in which the base classes are listed in class definition), then the constructors of member subobjects are called (again, in the order in which these members are listed in the class definition).
(There are some peculiarities in this rule when it comes to virtual base classes, but I decided not to include them here.)
As for the bad behaviors... Of course there is a potential for "bad behaviors" here. If you assume that the order of initialization depends on the order you used in the constructor initializer list, you will most likely eventually run into an unpleasant surprise, when you discover that the compiler completely ignores that order and uses its own order (the order of declaration) instead. For example, the author of this code
struct S {
int b, a;
S() : a(5), b(a) {}
};
might expect a to be initialized first, and b to receive the initial value of 5 from a, but in reality this won't happen since b is initialized before a.
The order is well defined. It does not depend on how you specify them while initializtion.
Base class constructor B will be called first and then the member variables(d1 & d2) in the order in which they are declared.
To explain the comment in #Andrey T's answer.
class MyClass1: public MyClass2, public virtual MyClass3
{
};
The order of calling the Base class constructors is well defined by the standard and will be:
MyClass3
MyClass2
MyClass1
The virtual Base class MyClass3 is given preference over Base Class MyClass2.
The order things appear in the initialisation list is not significant. In your case, the base object will always be initialised first, followed by d1 and d2, in that order. Initialisation is performed in order of derivation and in order members appear in the class definition.
Having said that, it is normally considered good style to write the initialisation list in the order of initialisation, and some compilers will issue a warning if you don't do this.
This FAQ is about Aggregates and PODs and covers the following material:
What are Aggregates?
What are PODs (Plain Old Data)?
How are they related?
How and why are they special?
What changes for C++11?
How to read:
This article is rather long. If you want to know about both aggregates and PODs (Plain Old Data) take time and read it. If you are interested just in aggregates, read only the first part. If you are interested only in PODs then you must first read the definition, implications, and examples of aggregates and then you may jump to PODs but I would still recommend reading the first part in its entirety. The notion of aggregates is essential for defining PODs. If you find any errors (even minor, including grammar, stylistics, formatting, syntax, etc.) please leave a comment, I'll edit.
This answer applies to C++03. For other C++ standards see:
C++11 changes
C++14 changes
C++17 changes
C++20 changes
What are aggregates and why they are special
Formal definition from the C++ standard (C++03 8.5.1 §1):
An aggregate is an array or a class (clause 9) with no user-declared
constructors (12.1), no private or protected non-static data members (clause 11),
no base classes (clause 10), and no virtual functions (10.3).
So, OK, let's parse this definition. First of all, any array is an aggregate. A class can also be an aggregate if… wait! nothing is said about structs or unions, can't they be aggregates? Yes, they can. In C++, the term class refers to all classes, structs, and unions. So, a class (or struct, or union) is an aggregate if and only if it satisfies the criteria from the above definitions. What do these criteria imply?
This does not mean an aggregate class cannot have constructors, in fact it can have a default constructor and/or a copy constructor as long as they are implicitly declared by the compiler, and not explicitly by the user
No private or protected non-static data members. You can have as many private and protected member functions (but not constructors) as well as as many private or protected static data members and member functions as you like and not violate the rules for aggregate classes
An aggregate class can have a user-declared/user-defined copy-assignment operator and/or destructor
An array is an aggregate even if it is an array of non-aggregate class type.
Now let's look at some examples:
class NotAggregate1
{
virtual void f() {} //remember? no virtual functions
};
class NotAggregate2
{
int x; //x is private by default and non-static
};
class NotAggregate3
{
public:
NotAggregate3(int) {} //oops, user-defined constructor
};
class Aggregate1
{
public:
NotAggregate1 member1; //ok, public member
Aggregate1& operator=(Aggregate1 const & rhs) {/* */} //ok, copy-assignment
private:
void f() {} // ok, just a private function
};
You get the idea. Now let's see how aggregates are special. They, unlike non-aggregate classes, can be initialized with curly braces {}. This initialization syntax is commonly known for arrays, and we just learnt that these are aggregates. So, let's start with them.
Type array_name[n] = {a1, a2, …, am};
if(m == n)
the ith element of the array is initialized with ai
else if(m < n)
the first m elements of the array are initialized with a1, a2, …, am and the other n - m elements are, if possible, value-initialized (see below for the explanation of the term)
else if(m > n)
the compiler will issue an error
else (this is the case when n isn't specified at all like int a[] = {1, 2, 3};)
the size of the array (n) is assumed to be equal to m, so int a[] = {1, 2, 3}; is equivalent to int a[3] = {1, 2, 3};
When an object of scalar type (bool, int, char, double, pointers, etc.) is value-initialized it means it is initialized with 0 for that type (false for bool, 0.0 for double, etc.). When an object of class type with a user-declared default constructor is value-initialized its default constructor is called. If the default constructor is implicitly defined then all nonstatic members are recursively value-initialized. This definition is imprecise and a bit incorrect but it should give you the basic idea. A reference cannot be value-initialized. Value-initialization for a non-aggregate class can fail if, for example, the class has no appropriate default constructor.
Examples of array initialization:
class A
{
public:
A(int) {} //no default constructor
};
class B
{
public:
B() {} //default constructor available
};
int main()
{
A a1[3] = {A(2), A(1), A(14)}; //OK n == m
A a2[3] = {A(2)}; //ERROR A has no default constructor. Unable to value-initialize a2[1] and a2[2]
B b1[3] = {B()}; //OK b1[1] and b1[2] are value initialized, in this case with the default-ctor
int Array1[1000] = {0}; //All elements are initialized with 0;
int Array2[1000] = {1}; //Attention: only the first element is 1, the rest are 0;
bool Array3[1000] = {}; //the braces can be empty too. All elements initialized with false
int Array4[1000]; //no initializer. This is different from an empty {} initializer in that
//the elements in this case are not value-initialized, but have indeterminate values
//(unless, of course, Array4 is a global array)
int array[2] = {1, 2, 3, 4}; //ERROR, too many initializers
}
Now let's see how aggregate classes can be initialized with braces. Pretty much the same way. Instead of the array elements we will initialize the non-static data members in the order of their appearance in the class definition (they are all public by definition). If there are fewer initializers than members, the rest are value-initialized. If it is impossible to value-initialize one of the members which were not explicitly initialized, we get a compile-time error. If there are more initializers than necessary, we get a compile-time error as well.
struct X
{
int i1;
int i2;
};
struct Y
{
char c;
X x;
int i[2];
float f;
protected:
static double d;
private:
void g(){}
};
Y y = {'a', {10, 20}, {20, 30}};
In the above example y.c is initialized with 'a', y.x.i1 with 10, y.x.i2 with 20, y.i[0] with 20, y.i[1] with 30 and y.f is value-initialized, that is, initialized with 0.0. The protected static member d is not initialized at all, because it is static.
Aggregate unions are different in that you may initialize only their first member with braces. I think that if you are advanced enough in C++ to even consider using unions (their use may be very dangerous and must be thought of carefully), you could look up the rules for unions in the standard yourself :).
Now that we know what's special about aggregates, let's try to understand the restrictions on classes; that is, why they are there. We should understand that memberwise initialization with braces implies that the class is nothing more than the sum of its members. If a user-defined constructor is present, it means that the user needs to do some extra work to initialize the members therefore brace initialization would be incorrect. If virtual functions are present, it means that the objects of this class have (on most implementations) a pointer to the so-called vtable of the class, which is set in the constructor, so brace-initialization would be insufficient. You could figure out the rest of the restrictions in a similar manner as an exercise :).
So enough about the aggregates. Now we can define a stricter set of types, to wit, PODs
What are PODs and why they are special
Formal definition from the C++ standard (C++03 9 §4):
A POD-struct is an aggregate class
that has no non-static data members of
type non-POD-struct, non-POD-union (or
array of such types) or reference, and
has no user-defined copy assignment
operator and no user-defined
destructor. Similarly, a POD-union is
an aggregate union that has no
non-static data members of type
non-POD-struct, non-POD-union (or
array of such types) or reference, and
has no user-defined copy assignment
operator and no user-defined
destructor. A POD class is a class
that is either a POD-struct or a
POD-union.
Wow, this one's tougher to parse, isn't it? :) Let's leave unions out (on the same grounds as above) and rephrase in a bit clearer way:
An aggregate class is called a POD if
it has no user-defined copy-assignment
operator and destructor and none of
its nonstatic members is a non-POD
class, array of non-POD, or a
reference.
What does this definition imply? (Did I mention POD stands for Plain Old Data?)
All POD classes are aggregates, or, to put it the other way around, if a class is not an aggregate then it is sure not a POD
Classes, just like structs, can be PODs even though the standard term is POD-struct for both cases
Just like in the case of aggregates, it doesn't matter what static members the class has
Examples:
struct POD
{
int x;
char y;
void f() {} //no harm if there's a function
static std::vector<char> v; //static members do not matter
};
struct AggregateButNotPOD1
{
int x;
~AggregateButNotPOD1() {} //user-defined destructor
};
struct AggregateButNotPOD2
{
AggregateButNotPOD1 arrOfNonPod[3]; //array of non-POD class
};
POD-classes, POD-unions, scalar types, and arrays of such types are collectively called POD-types.
PODs are special in many ways. I'll provide just some examples.
POD-classes are the closest to C structs. Unlike them, PODs can have member functions and arbitrary static members, but neither of these two change the memory layout of the object. So if you want to write a more or less portable dynamic library that can be used from C and even .NET, you should try to make all your exported functions take and return only parameters of POD-types.
The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.
For objects of POD types it is guaranteed by the standard that when you memcpy the contents of your object into an array of char or unsigned char, and then memcpy the contents back into your object, the object will hold its original value. Do note that there is no such guarantee for objects of non-POD types. Also, you can safely copy POD objects with memcpy. The following example assumes T is a POD-type:
#define N sizeof(T)
char buf[N];
T obj; // obj initialized to its original value
memcpy(buf, &obj, N); // between these two calls to memcpy,
// obj might be modified
memcpy(&obj, buf, N); // at this point, each subobject of obj of scalar type
// holds its original value
goto statement. As you may know, it is illegal (the compiler should issue an error) to make a jump via goto from a point where some variable was not yet in scope to a point where it is already in scope. This restriction applies only if the variable is of non-POD type. In the following example f() is ill-formed whereas g() is well-formed. Note that Microsoft's compiler is too liberal with this rule—it just issues a warning in both cases.
int f()
{
struct NonPOD {NonPOD() {}};
goto label;
NonPOD x;
label:
return 0;
}
int g()
{
struct POD {int i; char c;};
goto label;
POD x;
label:
return 0;
}
It is guaranteed that there will be no padding in the beginning of a POD object. In other words, if a POD-class A's first member is of type T, you can safely reinterpret_cast from A* to T* and get the pointer to the first member and vice versa.
The list goes on and on…
Conclusion
It is important to understand what exactly a POD is because many language features, as you see, behave differently for them.
What changes for C++11?
Aggregates
The standard definition of an aggregate has changed slightly, but it's still pretty much the same:
An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1),
no brace-or-equal-initializers for non-static data members (9.2), no private or protected
non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).
Ok, what changed?
Previously, an aggregate could have no user-declared constructors, but now it can't have user-provided constructors. Is there a difference? Yes, there is, because now you can declare constructors and default them:
struct Aggregate {
Aggregate() = default; // asks the compiler to generate the default implementation
};
This is still an aggregate because a constructor (or any special member function) that is defaulted on the first declaration is not user-provided.
Now an aggregate cannot have any brace-or-equal-initializers for non-static data members. What does this mean? Well, this is just because with this new standard, we can initialize members directly in the class like this:
struct NotAggregate {
int x = 5; // valid in C++11
std::vector<int> s{1,2,3}; // also valid
};
Using this feature makes the class no longer an aggregate because it's basically equivalent to providing your own default constructor.
So, what is an aggregate didn't change much at all. It's still the same basic idea, adapted to the new features.
What about PODs?
PODs went through a lot of changes. Lots of previous rules about PODs were relaxed in this new standard, and the way the definition is provided in the standard was radically changed.
The idea of a POD is to capture basically two distinct properties:
It supports static initialization, and
Compiling a POD in C++ gives you the same memory layout as a struct compiled in C.
Because of this, the definition has been split into two distinct concepts: trivial classes and standard-layout classes, because these are more useful than POD. The standard now rarely uses the term POD, preferring the more specific trivial and standard-layout concepts.
The new definition basically says that a POD is a class that is both trivial and has standard-layout, and this property must hold recursively for all non-static data members:
A POD struct is a non-union class that is both a trivial class and a standard-layout class,
and has no non-static data members of type non-POD struct, non-POD union (or array of such types).
Similarly, a POD union is a union that is both a trivial class and a standard layout class, and has
no non-static data members of type non-POD struct, non-POD union (or array of such types).
A POD class is a class that is either a POD struct or a POD union.
Let's go over each of these two properties in detail separately.
Trivial classes
Trivial is the first property mentioned above: trivial classes support static initialization.
If a class is trivially copyable (a superset of trivial classes), it is ok to copy its representation over the place with things like memcpy and expect the result to be the same.
The standard defines a trivial class as follows:
A trivially copyable class is a class that:
— has no non-trivial copy constructors (12.8),
— has no non-trivial move constructors (12.8),
— has no non-trivial copy assignment operators (13.5.3, 12.8),
— has no non-trivial move assignment operators (13.5.3, 12.8), and
— has a trivial destructor (12.4).
A trivial class is a class that has a trivial default constructor (12.1) and is trivially copyable.
[ Note: In particular, a trivially copyable or trivial class does not have virtual functions
or virtual base classes.—end note ]
So, what are all those trivial and non-trivial things?
A copy/move constructor for class X is trivial if it is not user-provided and if
— class X has no virtual functions (10.3) and no virtual base classes (10.1), and
— the constructor selected to copy/move each direct base class subobject is trivial, and
— for each non-static data member of X that is of class type (or array thereof), the constructor
selected to copy/move that member is trivial;
otherwise the copy/move constructor is non-trivial.
Basically this means that a copy or move constructor is trivial if it is not user-provided, the class has nothing virtual in it, and this property holds recursively for all the members of the class and for the base class.
The definition of a trivial copy/move assignment operator is very similar, simply replacing the word "constructor" with "assignment operator".
A trivial destructor also has a similar definition, with the added constraint that it can't be virtual.
And yet another similar rule exists for trivial default constructors, with the addition that a default constructor is not-trivial if the class has non-static data members with brace-or-equal-initializers, which we've seen above.
Here are some examples to clear everything up:
// empty classes are trivial
struct Trivial1 {};
// all special members are implicit
struct Trivial2 {
int x;
};
struct Trivial3 : Trivial2 { // base class is trivial
Trivial3() = default; // not a user-provided ctor
int y;
};
struct Trivial4 {
public:
int a;
private: // no restrictions on access modifiers
int b;
};
struct Trivial5 {
Trivial1 a;
Trivial2 b;
Trivial3 c;
Trivial4 d;
};
struct Trivial6 {
Trivial2 a[23];
};
struct Trivial7 {
Trivial6 c;
void f(); // it's okay to have non-virtual functions
};
struct Trivial8 {
int x;
static NonTrivial1 y; // no restrictions on static members
};
struct Trivial9 {
Trivial9() = default; // not user-provided
// a regular constructor is okay because we still have default ctor
Trivial9(int x) : x(x) {};
int x;
};
struct NonTrivial1 : Trivial3 {
virtual void f(); // virtual members make non-trivial ctors
};
struct NonTrivial2 {
NonTrivial2() : z(42) {} // user-provided ctor
int z;
};
struct NonTrivial3 {
NonTrivial3(); // user-provided ctor
int w;
};
NonTrivial3::NonTrivial3() = default; // defaulted but not on first declaration
// still counts as user-provided
struct NonTrivial5 {
virtual ~NonTrivial5(); // virtual destructors are not trivial
};
Standard-layout
Standard-layout is the second property. The standard mentions that these are useful for communicating with other languages, and that's because a standard-layout class has the same memory layout of the equivalent C struct or union.
This is another property that must hold recursively for members and all base classes. And as usual, no virtual functions or virtual base classes are allowed. That would make the layout incompatible with C.
A relaxed rule here is that standard-layout classes must have all non-static data members with the same access control. Previously these had to be all public, but now you can make them private or protected, as long as they are all private or all protected.
When using inheritance, only one class in the whole inheritance tree can have non-static data members, and the first non-static data member cannot be of a base class type (this could break aliasing rules), otherwise, it's not a standard-layout class.
This is how the definition goes in the standard text:
A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or array of such types)
or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and at most one base class with
non-static data members, or has no base classes with non-static data members, and
— has no base classes of the same type as the first non-static data member.
A standard-layout struct is a standard-layout class defined with the class-key struct or
the class-key class.
A standard-layout union is a standard-layout class defined with the class-key union.
[ Note: Standard-layout classes are useful for communicating with code written in other programming languages. Their layout is specified in 9.2.—end note ]
And let's see a few examples.
// empty classes have standard-layout
struct StandardLayout1 {};
struct StandardLayout2 {
int x;
};
struct StandardLayout3 {
private: // both are private, so it's ok
int x;
int y;
};
struct StandardLayout4 : StandardLayout1 {
int x;
int y;
void f(); // perfectly fine to have non-virtual functions
};
struct StandardLayout5 : StandardLayout1 {
int x;
StandardLayout1 y; // can have members of base type if they're not the first
};
struct StandardLayout6 : StandardLayout1, StandardLayout5 {
// can use multiple inheritance as long only
// one class in the hierarchy has non-static data members
};
struct StandardLayout7 {
int x;
int y;
StandardLayout7(int x, int y) : x(x), y(y) {} // user-provided ctors are ok
};
struct StandardLayout8 {
public:
StandardLayout8(int x) : x(x) {} // user-provided ctors are ok
// ok to have non-static data members and other members with different access
private:
int x;
};
struct StandardLayout9 {
int x;
static NonStandardLayout1 y; // no restrictions on static members
};
struct NonStandardLayout1 {
virtual f(); // cannot have virtual functions
};
struct NonStandardLayout2 {
NonStandardLayout1 X; // has non-standard-layout member
};
struct NonStandardLayout3 : StandardLayout1 {
StandardLayout1 x; // first member cannot be of the same type as base
};
struct NonStandardLayout4 : StandardLayout3 {
int z; // more than one class has non-static data members
};
struct NonStandardLayout5 : NonStandardLayout3 {}; // has a non-standard-layout base class
Conclusion
With these new rules a lot more types can be PODs now. And even if a type is not POD, we can take advantage of some of the POD properties separately (if it is only one of trivial or standard-layout).
The standard library has traits to test these properties in the header <type_traits>:
template <typename T>
struct std::is_pod;
template <typename T>
struct std::is_trivial;
template <typename T>
struct std::is_trivially_copyable;
template <typename T>
struct std::is_standard_layout;
What has changed for C++14
We can refer to the Draft C++14 standard for reference.
Aggregates
This is covered in section 8.5.1 Aggregates which gives us the following definition:
An aggregate is an array or a class (Clause 9) with no user-provided
constructors (12.1), no private or protected non-static data members
(Clause 11), no base classes (Clause 10), and no virtual functions
(10.3).
The only change is now adding in-class member initializers does not make a class a non-aggregate. So the following example from C++11 aggregate initialization for classes with member in-place initializers:
struct A
{
int a = 3;
int b = 3;
};
was not an aggregate in C++11 but it is in C++14. This change is covered in N3605: Member initializers and aggregates, which has the following abstract:
Bjarne Stroustrup and Richard Smith raised an issue about aggregate
initialization and member-initializers not working together. This
paper proposes to fix the issue by adopting Smith's proposed wording
that removes a restriction that aggregates can't have
member-initializers.
POD stays the same
The definition for POD(plain old data) struct is covered in section 9 Classes which says:
A POD struct110 is a non-union class that is both a trivial class and
a standard-layout class, and has no non-static data members of type
non-POD struct, non-POD union (or array of such types). Similarly, a
POD union is a union that is both a trivial class and a
standard-layout class, and has no non-static data members of type
non-POD struct, non-POD union (or array of such types). A POD class is
a class that is either a POD struct or a POD union.
which is the same wording as C++11.
Standard-Layout Changes for C++14
As noted in the comments pod relies on the definition of standard-layout and that did change for C++14 but this was via defect reports that were applied to C++14 after the fact.
There were three DRs:
DR 1672
DR 1813
DR 2120
So standard-layout went from this Pre C++14:
A standard-layout class is a class that:
(7.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(7.2) has no virtual functions ([class.virtual]) and no virtual base classes ([class.mi]),
(7.3) has the same access control (Clause [class.access]) for all non-static data members,
(7.4) has no non-standard-layout base classes,
(7.5) either has no non-static data members in the most derived class and at most one base class with non-static data members, or has
no base classes with non-static data members, and
(7.6) has no base classes of the same type as the first non-static data member.109
To this in C++14:
A class S is a standard-layout class if it:
(3.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(3.2) has no virtual functions and no virtual base classes,
(3.3) has the same access control for all non-static data members,
(3.4) has no non-standard-layout base classes,
(3.5) has at most one base class subobject of any given type,
(3.6) has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
(3.7) has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows.104
[ Note: M(X) is the set of the types of all non-base-class subobjects that may be at a zero offset in X.
— end note
]
(3.7.1) If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
(3.7.2) If X is a non-union class type with a non-static data member of type X0 that is either of zero size or is the first
non-static data member of X (where said member may be an anonymous
union), the set M(X) consists of X0 and the elements of M(X0).
(3.7.3) If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the
ith non-static data member of X.
(3.7.4) If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
(3.7.5) If X is a non-class, non-array type, the set M(X) is empty.
Changes in C++17
Download the C++17 International Standard final draft here.
Aggregates
C++17 expands and enhances aggregates and aggregate initialization. The standard library also now includes an std::is_aggregate type trait class. Here is the formal definition from section 11.6.1.1 and 11.6.1.2 (internal references elided):
An aggregate is an array or a class with
— no user-provided, explicit, or inherited constructors,
— no private or protected non-static data members,
— no virtual functions, and
— no virtual, private, or protected base classes.
[ Note: Aggregate initialization does not allow accessing protected and private base class’ members or constructors. —end note ]
The elements of an aggregate are:
— for an array, the array elements in increasing subscript order, or
— for a class, the direct base classes in declaration order, followed by the direct non-static data members that are not members of an anonymous union, in declaration order.
What changed?
Aggregates can now have public, non-virtual base classes. Furthermore, it is not a requirement that base classes be aggregates. If they are not aggregates, they are list-initialized.
struct B1 // not a aggregate
{
int i1;
B1(int a) : i1(a) { }
};
struct B2
{
int i2;
B2() = default;
};
struct M // not an aggregate
{
int m;
M(int a) : m(a) { }
};
struct C : B1, B2
{
int j;
M m;
C() = default;
};
C c { { 1 }, { 2 }, 3, { 4 } };
cout
<< "is C aggregate?: " << (std::is_aggregate<C>::value ? 'Y' : 'N')
<< " i1: " << c.i1 << " i2: " << c.i2
<< " j: " << c.j << " m.m: " << c.m.m << endl;
//stdout: is C aggregate?: Y, i1=1 i2=2 j=3 m.m=4
Explicit defaulted constructors are disallowed
struct D // not an aggregate
{
int i = 0;
D() = default;
explicit D(D const&) = default;
};
Inheriting constructors are disallowed
struct B1
{
int i1;
B1() : i1(0) { }
};
struct C : B1 // not an aggregate
{
using B1::B1;
};
Trivial Classes
The definition of trivial class was reworked in C++17 to address several defects that were not addressed in C++14. The changes were technical in nature. Here is the new definition at 12.0.6 (internal references elided):
A trivially copyable class is a class:
— where each copy constructor, move constructor, copy assignment operator, and move assignment operator is either deleted or trivial,
— that has at least one non-deleted copy constructor, move constructor, copy assignment operator, or move assignment operator, and
— that has a trivial, non-deleted destructor.
A trivial class is a class that is trivially copyable and has one or more default constructors, all of which are either trivial or deleted and at least one of which is not deleted. [ Note: In particular, a trivially copyable
or trivial class does not have virtual functions or virtual base classes.—end note ]
Changes:
Under C++14, for a class to be trivial, the class could not have any copy/move constructor/assignment operators that were non-trivial. However, then an implicitly declared as defaulted constructor/operator could be non-trivial and yet defined as deleted because, for example, the class contained a subobject of class type that could not be copied/moved. The presence of such non-trivial, defined-as-deleted constructor/operator would cause the whole class to be non-trivial. A similar problem existed with destructors. C++17 clarifies that the presence of such constructor/operators does not cause the class to be non-trivially copyable, hence non-trivial, and that a trivially copyable class must have a trivial, non-deleted destructor. DR1734, DR1928
C++14 allowed a trivially copyable class, hence a trivial class, to have every copy/move constructor/assignment operator declared as deleted. If such as class was also standard layout, it could, however, be legally copied/moved with std::memcpy. This was a semantic contradiction, because, by defining as deleted all constructor/assignment operators, the creator of the class clearly intended that the class could not be copied/moved, yet the class still met the definition of a trivially copyable class. Hence in C++17 we have a new clause stating that trivially copyable class must have at least one trivial, non-deleted (though not necessarily publicly accessible) copy/move constructor/assignment operator. See N4148, DR1734
The third technical change concerns a similar problem with default constructors. Under C++14, a class could have trivial default constructors that were implicitly defined as deleted, yet still be a trivial class. The new definition clarifies that a trivial class must have a least one trivial, non-deleted default constructor. See DR1496
Standard-layout Classes
The definition of standard-layout was also reworked to address defect reports. Again the changes were technical in nature. Here is the text from the standard (12.0.7). As before, internal references are elided:
A class S is a standard-layout class if it:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions and no virtual base classes,
— has the same access control for all non-static data members,
— has no non-standard-layout base classes,
— has at most one base class subobject of any given type,
— has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
— has no element of the set M(S) of types (defined below) as a base class.108
M(X) is defined as follows:
— If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
— If X is a non-union class type whose first non-static data member has type X0 (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0).
— If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the ith non-static data member of X.
— If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
— If X is a non-class, non-array type, the set M(X) is empty.
[ Note: M(X) is the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset in X. —end note ]
[ Example:
struct B { int i; }; // standard-layout class
struct C : B { }; // standard-layout class
struct D : C { }; // standard-layout class
struct E : D { char : 4; }; // not a standard-layout class
struct Q {};
struct S : Q { };
struct T : Q { };
struct U : S, T { }; // not a standard-layout class
—end example ]
108) This ensures that two subobjects that have the same class type and that belong to the same most derived object are not allocated at the same address.
Changes:
Clarified that the requirement that only one class in the derivation tree "has" non-static data members refers to a class where such data members are first declared, not classes where they may be inherited, and extended this requirement to non-static bit fields. Also clarified that a standard-layout class "has at most one base class subobject of any given type." See DR1813, DR1881
The definition of standard-layout has never allowed the type of any base class to be the same type as the first non-static data member. It is to avoid a situation where a data member at offset zero has the same type as any base class. The C++17 standard provides a more rigorous, recursive definition of "the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset" so as to prohibit such types from being the type of any base class. See DR1672, DR2120.
Note: The C++ standards committee intended the above changes based on defect reports to apply to C++14, though the new language is not in the published C++14 standard. It is in the C++17 standard.
POD in C++11 was basically split into two different axes here: triviality and layout. Triviality is about the relationship between an object's conceptual value and the bits of data within its storage. Layout is about... well, the layout of an object's subobjects. Only class types have layout, while all types have triviality relationships.
So here is what the triviality axis is about:
Non-trivially copyable: The value of objects of such types may be more than just the binary data that are stored directly within the object.
For example, unique_ptr<T> stores a T*; that is the totality of the binary data within the object. But that's not the totality of the value of a unique_ptr<T>. A unique_ptr<T> stores either a nullptr or a pointer to an object whose lifetime is managed by the unique_ptr<T> instance. That management is part of the value of a unique_ptr<T>. And that value is not part of the binary data of the object; it is created by the various member functions of that object.
For example, to assign nullptr to a unique_ptr<T> is to do more than just change the bits stored in the object. Such an assignment must destroy any object managed by the unique_ptr. To manipulate the internal storage of a unique_ptr without going through its member functions would damage this mechanism, to change its internal T* without destroying the object it currently manages, would violate the conceptual value that the object possesses.
Trivially copyable: The value of such objects are exactly and only the contents of their binary storage. This is what makes it reasonable to allow copying that binary storage to be equivalent to copying the object itself.
The specific rules that define trivial copyability (trivial destructor, trivial/deleted copy/move constructors/assignment) are what is required for a type to be binary-value-only. An object's destructor can participate in defining the "value" of an object, as in the case with unique_ptr. If that destructor is trivial, then it doesn't participate in defining the object's value.
Specialized copy/move operations also can participate in an object's value. unique_ptr's move constructor modifies the source of the move operation by null-ing it out. This is what ensures that the value of a unique_ptr is unique. Trivial copy/move operations mean that such object value shenanigans are not being played, so the object's value can only be the binary data it stores.
Trivial: This object is considered to have a functional value for any bits that it stores. Trivially copyable defines the meaning of the data store of an object as being just that data. But such types can still control how data gets there (to some extent). Such a type can have default member initializers and/or a default constructor that ensures that a particular member always has a particular value. And thus, the conceptual value of the object can be restricted to a subset of the binary data that it could store.
Performing default initialization on a type that has a trivial default constructor will leave that object with completely uninitialized values. As such, a type with a trivial default constructor is logically valid with any binary data in its data storage.
The layout axis is really quite simple. Compilers are given a lot of leeway in deciding how the subobjects of a class are stored within the class's storage. However, there are some cases where this leeway is not necessary, and having more rigid ordering guarantees is useful.
Such types are standard layout types. And the C++ standard doesn't even really do much with saying what that layout is specifically. It basically says three things about standard layout types:
The first subobject is at the same address as the object itself.
You can use offsetof to get a byte offset from the outer object to one of its member subobjects.
unions get to play some games with accessing subobjects through an inactive member of a union if the active member is (at least partially) using the same layout as the inactive one being accessed.
Compilers generally permit standard layout objects to map to struct types with the same members in C. But there is no statement of that in the C++ standard; that's just what compilers feel like doing.
POD is basically a useless term at this point. It is just the intersection of trivial copyability (the value is only its binary data) and standard layout (the order of its subobjects is more well-defined). One can infer from such things that the type is C-like and could map to similar C objects. But the standard has no statements to that effect.
can you please elaborate following rules:
I'll try:
a) standard-layout classes must have all non-static data members with the same access control
That's simple: all non-static data members must all be public, private, or protected. You can't have some public and some private.
The reasoning for them goes to the reasoning for having a distinction between "standard layout" and "not standard layout" at all. Namely, to give the compiler the freedom to choose how to put things into memory. It's not just about vtable pointers.
Back when they standardized C++ in 98, they had to basically predict how people would implement it. While they had quite a bit of implementation experience with various flavors of C++, they weren't certain about things. So they decided to be cautious: give the compilers as much freedom as possible.
That's why the definition of POD in C++98 is so strict. It gave C++ compilers great latitude on member layout for most classes. Basically, POD types were intended to be special cases, something you specifically wrote for a reason.
When C++11 was being worked on, they had a lot more experience with compilers. And they realized that... C++ compiler writers are really lazy. They had all this freedom, but they didn't do anything with it.
The rules of standard layout are more or less codifying common practice: most compilers didn't really have to change much if anything at all to implement them (outside of maybe some stuff for the corresponding type traits).
Now, when it came to public/private, things are different. The freedom to reorder which members are public vs. private actually can matter to the compiler, particularly in debugging builds. And since the point of standard layout is that there is compatibility with other languages, you can't have the layout be different in debug vs. release.
Then there's the fact that it doesn't really hurt the user. If you're making an encapsulated class, odds are good that all of your data members will be private anyway. You generally don't expose public data members on fully encapsulated types. So this would only be a problem for those few users who do want to do that, who want that division.
So it's no big loss.
b) only one class in the whole inheritance tree can have non-static data members,
The reason for this one comes back to why they standardized standard layout again: common practice.
There's no common practice when it comes to having two members of an inheritance tree that actually store things. Some put the base class before the derived, others do it the other way. Which way do you order the members if they come from two base classes? And so on. Compilers diverge greatly on these questions.
Also, thanks to the zero/one/infinity rule, once you say you can have two classes with members, you can say as many as you want. This requires adding a lot of layout rules for how to handle this. You have to say how multiple inheritance works, which classes put their data before other classes, etc. That's a lot of rules, for very little material gain.
You can't make everything that doesn't have virtual functions and a default constructor standard layout.
and the first non-static data member cannot be of a base class type (this could break aliasing rules).
I can't really speak to this one. I'm not educated enough in C++'s aliasing rules to really understand it. But it has something to do with the fact that the base member will share the same address as the base class itself. That is:
struct Base {};
struct Derived : Base { Base b; };
Derived d;
static_cast<Base*>(&d) == &d.b;
And that's probably against C++'s aliasing rules. In some way.
However, consider this: how useful could having the ability to do this ever actually be? Since only one class can have non-static data members, then Derived must be that class (since it has a Base as a member). So Base must be empty (of data). And if Base is empty, as well as a base class... why have a data member of it at all?
Since Base is empty, it has no state. So any non-static member functions will do what they do based on their parameters, not their this pointer.
So again: no big loss.
What changes in c++20
Following the rest of the clear theme of this question, the meaning and use of aggregates continues to change with every standard. There are several key changes on the horizon.
Types with user-declared constructors P1008
In C++17, this type is still an aggregate:
struct X {
X() = delete;
};
And hence, X{} still compiles because that is aggregate initialization - not a constructor invocation. See also: When is a private constructor not a private constructor?
In C++20, the restriction will change from requiring:
no user-provided, explicit, or inherited constructors
to
no user-declared or inherited constructors
This has been adopted into the C++20 working draft. Neither the X here nor the C in the linked question will be aggregates in C++20.
This also makes for a yo-yo effect with the following example:
class A { protected: A() { }; };
struct B : A { B() = default; };
auto x = B{};
In C++11/14, B was not an aggregate due to the base class, so B{} performs value-initialization which calls B::B() which calls A::A(), at a point where it is accessible. This was well-formed.
In C++17, B became an aggregate because base classes were allowed, which made B{} aggregate-initialization. This requires copy-list-initializing an A from {}, but from outside the context of B, where it is not accessible. In C++17, this is ill-formed (auto x = B(); would be fine though).
In C++20 now, because of the above rule change, B once again ceases to be an aggregate (not because of the base class, but because of the user-declared default constructor - even though it's defaulted). So we're back to going through B's constructor, and this snippet becomes well-formed.
Initializing aggregates from a parenthesized list of values P960
A common issue that comes up is wanting to use emplace()-style constructors with aggregates:
struct X { int a, b; };
std::vector<X> xs;
xs.emplace_back(1, 2); // error
This does not work, because emplace will try to effectively perform the initialization X(1, 2), which is not valid. The typical solution is to add a constructor to X, but with this proposal (currently working its way through Core), aggregates will effectively have synthesized constructors which do the right thing - and behave like regular constructors. The above code will compile as-is in C++20.
Class Template Argument Deduction (CTAD) for Aggregates P1021 (specifically P1816)
In C++17, this does not compile:
template <typename T>
struct Point {
T x, y;
};
Point p{1, 2}; // error
Users would have to write their own deduction guide for all aggregate templates:
template <typename T> Point(T, T) -> Point<T>;
But as this is in some sense "the obvious thing" to do, and is basically just boilerplate, the language will do this for you. This example will compile in C++20 (without the need for the user-provided deduction guide).