struct alignment C/C++

In C/C++ (I am assuming they are the same in this regard), if I have the following:
struct S {
T a;
.
.
.
} s;
Is the following guaranteed to be true?
(void*)&s == (void*)&s.a;
Or in other words, is there any kind of guarantee that there will be no padding before the first member?

In C, yes, they're the same address. Simple, and straightforward.
In C++, not necessarily. Base classes can (and I would suspect, do) come before all members, and virtual member functions usually add hidden data to the struct somewhere. Even more confusing, a C++ compiler may also reorder members that have different access control, unless the class is a standard-layout type (though I don't know of any compiler that does so).
Finally, if the C++ struct is composed of standard layout types, contains no base classes nor virtual functions and all members have the same visibility, and possibly other limitations I forgot, then it falls back on the C rules, and requires the first member to be at the same address as the object itself.
§ 9.2/7
A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
— has no base classes of the same type as the first non-static data member.
§ 9.2/20
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. —end note ]
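The guarantee in 9.2/20 can be checked directly. Here is a minimal sketch, assuming a hypothetical S whose first member is a double (the question leaves T and the remaining members unspecified). The helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the struct S in the question.
struct S {
    double a;
    char   b;
    int    c;
};

// The guarantee only applies to standard-layout classes, so check that first.
static_assert(std::is_standard_layout<S>::value, "S must be standard-layout");
static_assert(offsetof(S, a) == 0, "no padding before the first member");

// True when the object and its first member share an address.
bool first_member_shares_address(const S& s) {
    return static_cast<const void*>(&s) == static_cast<const void*>(&s.a);
}

// Converts S* to a pointer to the first member and back, as 9.2/20 permits.
bool round_trip_ok(S& s) {
    double* first = reinterpret_cast<double*>(&s);
    *first = 1.5;                       // writes s.a through the converted pointer
    assert(s.a == 1.5);
    return reinterpret_cast<S*>(first) == &s;
}
```

If S gains a virtual function or members with mixed access control, the first static_assert fails at compile time, which is the point of the guard.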

Yes, it is.
It is guaranteed there is no padding before the first struct member in C and in C++ (if it is a POD).
C quote:
(C11, 6.7.2.1p15) "There may be unnamed padding within a structure object, but not at its beginning."
C++ quote:
(C++11, 9.2p20) "There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment"

Related

will the padding of base class be copied into the derived class?

Recently, I've been reading "Inside the C++ Object Model". It says that the padding used in a base class should also be copied into the derived class, in case you want to assign the base class to the derived class. So I ran a test on a 64-bit machine:
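#include <cstdio>
#include <iostream>
class A {
public:
    int valA;
    char a;
};
class B : public A {
public:
    char b;
};
class C : public B {
public:
    char c;
};
int main() {
    std::cout << sizeof(A) << " " << sizeof(B) << " " << sizeof(C)
              << std::endl;
    C c;
    // %p expects void*, so cast each address explicitly.
    std::printf("%p\n%p\n%p\n", static_cast<void*>(&c),
                static_cast<void*>(&c.b), static_cast<void*>(&c.c));
}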
Here is the result:
8 12 12
0x7ffd22c5072c
0x7ffd22c50734
0x7ffd22c50735
So why is C the same size of B? Although it seems that B used the 3 byte padding in A.
So why is C the same size of B?
Because the trailing padding of B was reused for C::c. The padding can be reused because B is not a POD (plain old data) class (because it is not a standard-layout class).
Although it seems that B used the 3 byte padding in A.
The padding of A cannot be reused for other sub objects of B, because A is a standard layout class and is trivially copyable i.e. A is a POD class.
will the padding of base class be copied into the derived class?
I suppose that you didn't mean to ask about copying, but rather whether the base class subobject of a derived class will have the same padding as the standalone type.
The answer is, as might be deduced from the above: the padding will be the same, except that the trailing padding may be reused for other subobjects, unless the base class is a POD, in which case its padding cannot be reused.
In the case where the padding may be reused, whether it will be is not specified by the standard, and there are differences between compilers.
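To see which of the classes from the question are standard-layout, and where the members actually land, something like the following can be used. It is only a sketch: byte_offset and print_layout are hypothetical helpers, and the printed sizes and offsets depend on the compiler and ABI, so none are asserted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <type_traits>

class A { public: int valA; char a; };
class B : public A { public: char b; };
class C : public B { public: char c; };

static_assert(std::is_standard_layout<A>::value &&
              std::is_trivially_copyable<A>::value,
              "A is standard-layout and trivially copyable");
static_assert(!std::is_standard_layout<B>::value,
              "B and its base A both declare data members");
static_assert(!std::is_standard_layout<C>::value,
              "C has a non-standard-layout base");

// Byte offset of a subobject within its enclosing object.
template <typename Whole, typename Part>
std::ptrdiff_t byte_offset(const Whole& whole, const Part& part) {
    const char* base = reinterpret_cast<const char*>(&whole);
    const char* sub  = reinterpret_cast<const char*>(&part);
    assert(sub >= base && sub < base + sizeof(Whole));
    return sub - base;
}

// Prints ABI-dependent sizes and offsets; the values vary between compilers.
void print_layout() {
    C c{};
    std::printf("sizeof: A=%zu B=%zu C=%zu\n", sizeof(A), sizeof(B), sizeof(C));
    std::printf("offsets in C: b=%td c=%td\n",
                byte_offset(c, c.b), byte_offset(c, c.c));
}
```

Calling print_layout() from main on the asker's platform should reproduce the 8 12 12 pattern, with c placed in what would otherwise be B's trailing padding.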
Please explain or link to a definition of "standard layout types".
Current standard draft:
[basic.types]
... Scalar types, standard-layout class types ([class.prop]), arrays of such types and cv-qualified versions of these types are collectively called standard-layout types.
[class.prop] (in older versions of the standard, these may be found under [class] directly)
A class S is a standard-layout class if it:
(3.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(3.2) has no virtual functions and no virtual base classes,
(3.3) has the same access control for all non-static data members,
(3.4) has no non-standard-layout base classes,
(3.5) has at most one base class subobject of any given type,
(3.6) has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
(3.7) has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows. [ Note: M(X) is the
set of the types of all non-base-class subobjects that may be at a
zero offset in X. — end note]
(3.7.1) If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
(3.7.2) If X is a non-union class type with a non-static data member of type X0 that is either of zero size or is the first
non-static data member of X (where said member may be an anonymous
union), the set M(X) consists of X0 and the elements of M(X0).
(3.7.3) If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the
ith non-static data member of X.
(3.7.4) If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
(3.7.5) If X is a non-class, non-array type, the set M(X) is empty.
Item (3.6) applies in this case. Some members of B are not first declared in B. In particular, B::A::valA, and B::A::a are declared first in A. A friendlier way to describe the rule is: The class must have either no direct members, or none of its ancestors must have members. In this case both the base and the derived class have members, so it is not standard layout.
C is the same size as B because on your platform the ABI chooses to use the padding in B to store the 1-byte member C::c. B has 3 bytes of padding at the end because the entire B object has alignment 4 (due to the int member in A).
B is not the same size as A, however, because in this case the ABI apparently does not allow storing B::b in the padding of A even though there is room. This happens when all of A's members are public, as they are in your example: if you make any member private, the sizes of A, B and C will all be 8. I believe this may be for ABI backwards compatibility, rather than motivated by any language in the standard.
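The private-member variant mentioned above can be tried with a sketch like this one. A2, B2, C2 and print_sizes are hypothetical names, and the accessor is only there so the private member is not flagged as unused. The sizes are ABI-dependent, so the code prints them rather than asserting specific values.

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>

// Same members as A in the question, but valA is now private.
class A2 {
    int valA;
public:
    char a;
    int value() const { return valA; }  // non-virtual, does not affect layout
};
class B2 : public A2 { public: char b; };
class C2 : public B2 { public: char c; };

// Mixed access control means A2 is no longer a standard-layout class.
static_assert(!std::is_standard_layout<A2>::value,
              "A2 mixes private and public data members");

// Prints the sizes; whether they are all equal depends on the ABI.
void print_sizes() {
    assert(sizeof(C2) >= sizeof(A2));
    std::printf("sizeof: A2=%zu B2=%zu C2=%zu\n",
                sizeof(A2), sizeof(B2), sizeof(C2));
}
```

On an ABI that reuses tail padding for non-POD bases, all three sizes would be expected to match; other ABIs may differ.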
I don't know if there is language in the standard which directly allows this (but there doesn't need to be), but it certainly seems that this type of padding re-use is contemplated in the case of inheritance. For example, the documentation for std::memcpy says:
If the objects are potentially-overlapping or not TriviallyCopyable, the behavior of memcpy is not specified and may be undefined.
It goes on to define potentially-overlapping:
A subobject is potentially overlapping if it is either
a base class subobject, or
a non-static data member declared with the [[no_unique_address]] attribute.
The second condition applies only in C++20.
This seems to be written to allow padding to be shared: if this clause didn't exist, memcpy through a pointer to the B subobject of a C would overwrite the value of C::c, which is stored in what is usually padding for B.

What is the purpose of bullet point (7.5) in [class]/7, in C++14?

This is basically a continuation of my prior question.
This is [class]/7 in C++14:
A standard-layout class is a class that:
(7.1) — has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(7.2) — has no virtual functions (10.3) and no virtual base classes (10.1),
(7.3) — has the same access control (Clause 11) for all non-static data members,
(7.4) — has no non-standard-layout base classes,
(7.5) — either has no non-static data members in the most derived class and at most one base class with
non-static data members, or has no base classes with non-static data members, and
(7.6) — has no base classes of the same type as the first non-static data member.
Consider the following snippet:
struct B{ int i; };
struct A : B{ int j; };
A satisfies bullet points (7.1) thru (7.4), but doesn't satisfy (7.5), as A has a non-static data member and has a base class with a non-static data member.
What is the problem with A being a standard-layout class?
Edit
As far as I can understand the accepted answer to the question of which this is being considered a dupe, the snippet above would have undefined behavior, if I tried to cast a pointer to A to the first data member of the base class B and back, because of this sentence written by the OP:
Within a class, members are allocated in increasing addresses according to the declaration order. However C++ doesn't dictate the order of allocation for data members across classes.
But that doesn't seem to answer my question. Suppose for example that in a certain compiler implementation, base B would follow struct A in memory, instead of preceding it. But this would contradict the fact that there is an implicit conversion, from a pointer to a derived class, to a pointer to a base class, according to [conv.ptr]/3:
A prvalue of type “pointer to cv D”, where D is a class type, can be
converted to a prvalue of type “pointer to cv B”, where B is a base
class (Clause 10) of D.
That is, if the base B followed struct A in memory, the above implicit conversion would be invalid.
Directly answering the question as phrased:
The purpose of this bullet is to allow very simple cases of inheritance where only one of the classes has data members.
The data layout under inheritance is unspecified, so the standard could simply have excluded every class with a base class from being standard-layout. Instead, it makes an exception: if only one class in the hierarchy has data members, the result is still treated as standard-layout.
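The effect of the bullet can be seen with std::is_standard_layout. This is a small sketch: B and A are the structs from the question, while Empty, DerivedHasData, BaseHasData and first_member_at_start are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct Empty {};
struct B { int i; };

struct DerivedHasData : Empty { int j; };  // only the derived class has data
struct BaseHasData : B {};                 // only the base class has data
struct A : B { int j; };                   // base and derived both have data

static_assert(std::is_standard_layout<DerivedHasData>::value,
              "data in the most derived class only");
static_assert(std::is_standard_layout<BaseHasData>::value,
              "data in a single base class only");
static_assert(!std::is_standard_layout<A>::value,
              "data split across base and derived");

// With all data in one class, the object and its first member share an address.
bool first_member_at_start(const BaseHasData& d) {
    return static_cast<const void*>(&d) == static_cast<const void*>(&d.i);
}
```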

Using offsetof for template classes

From the C++ standard:
A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or
array of such types) or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and
at most one base class with non-static data members, or has no base
classes with non-static data members, and
— has no base classes of the same type as the first non-static data
member
The macro offsetof(type, member-designator) accepts a restricted set
of type arguments in this International Standard. If type is not a
standard-layout class (Clause 9), the results are undefined
Considering these statements is there any safe way of using offsetof for members that depend on template parameters? If not, how may I get the offset of a member in template classes? What might be unsafe when using something like:
//MS Visual Studio 2013 definition
#define offsetof(s,m) (size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
on non standard layout classes?
The following sample is NOT SAFE according to the standard:
#include <cstddef>
#include <iostream>
template<typename T>
struct Test
{
    int a;
    T b;
};
struct NonStdLayout
{
    virtual void f() {}
};
int main()
{
    std::cout << offsetof(Test<int>, b) << std::endl;
    std::cout << offsetof(Test<NonStdLayout>, b) << std::endl;
    return 0;
}
offsetof cannot be used on non-standard-layout classes simply because their layout in memory is unknown. For example, the standard does not specify how virtual member functions are implemented. One common way of doing so is to add a pointer to the vtable as the first data member of a class, but it's not the only way.
As to your definition of offsetof: there is no guarantee that a null pointer converts to a 0 via reinterpret_cast (or via a C-style cast), neither is there any semantics specified for other values of pointers cast to integers.
So if you know that your definition makes sense in the underlying addressing scheme used by your compiler for your platform, it can work. But it's an if you have to be aware of.
The answer is that it is perfectly safe to use offsetof in templates. No harm will be done thereby. However, if you choose to do so then you impose a restriction on the type of parameter for the template. It will work correctly for standard layout classes, and in principle at least the compiler should tell you when the parameter is of a type for which it won't work.
There is no way under the standard to obtain the offset for a member of a non-standard-layout class, regardless of whether any template is involved. It will probably work in individual compilers, but it may not. It is likely to work on all non-virtual classes (although this is not a requirement of the standard). Maybe you just have to experiment.
We are frequently forced to write non-standards compliant code to solve problems like this, so we test it carefully on individual compilers. It just means more hard work in research and testing.
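One way to make the restriction on the template parameter explicit is to guard offsetof with a static_assert. This is a sketch, not a standard facility: offset_of_b and offset_matches_address are hypothetical helpers, and Test and NonStdLayout are taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template <typename T>
struct Test {
    int a;
    T b;
};

struct NonStdLayout {
    virtual void f() {}
};

// Offset of Test<T>::b, rejected at compile time unless Test<T> is standard-layout.
template <typename T>
std::size_t offset_of_b() {
    static_assert(std::is_standard_layout<Test<T>>::value,
                  "offsetof requires a standard-layout class");
    return offsetof(Test<T>, b);
}

// Cross-checks offsetof against the address of the member in a real object.
template <typename T>
bool offset_matches_address() {
    Test<T> t{};
    const char* base   = reinterpret_cast<const char*>(&t);
    const char* member = reinterpret_cast<const char*>(&t.b);
    assert(member >= base);
    return static_cast<std::size_t>(member - base) == offset_of_b<T>();
}

// offset_of_b<NonStdLayout>();  // would fail to compile: static_assert fires
```

Note that offsetof is a macro, so a type with a comma in its template argument list would need a typedef or alias before being passed to it.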

Does a nested enum inside a POD class make it not POD?

By definition here, POD is a simple class with no user-defined constructors, non static members, and containing simple data types only.
The question is, will these 2 classes below be equivalent as POD types (in terms of memory footprint):
class pod
{
public:
    int x;
    double y;
};
class pod1
{
public:
    int x;
    double y;
    enum POD_TYPE
    {
        POD1 = 0,
        POD = 1
    };
};
In other words, does adding an enum to the class only affect the scope resolution of the enum, without affecting the properties of the class itself? By observation, it seems that the class is still POD, but I would like to confirm based on the standard.
Yes, that type is still POD. The definition is given by C++11 9/10:
A POD struct is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types).
Trivial means that it doesn't do any funny business when creating, destroying or copying objects. Standard-layout means that it doesn't do any funny business with the layout of data members: no polymorphism, and restrictions on what you can do with access specifiers and inheritance. These terms are fully defined in C++11 9/6 and 9/7, if you want more detail.
Nested types (such as your enumeration), static data members and non-virtual member functions (apart from constructors etc. which would make it non-trivial) will not affect any of those things, so it is still POD.
UPDATE: Since you say you're interested in historic definitions, C++03 defined:
9/4 A POD-struct is an aggregate class that has no non-static members of type non-POD-struct, non-POD-union (or array of such types), and has no user-defined copy assignment operator and no user-defined destructor.
8.5.1/1 An aggregate is an array or class with no user-declared constructors, no private or protected non-static data members, no base classes and no virtual functions.
So there were more restrictions; but nested types were still allowed. I don't have a copy of C++98, but I'm sure that would be identical to C++03.
It doesn't make any difference regarding POD status because defining a nested enum does not add data members to the class. In fact, you could also define a nested class that is not POD inside pod1 and it would still not make a difference regarding the PODness of pod1.
The enum POD_TYPE is a type and won't affect the layout. We can see this from the draft C++ standard, section 9.2 Class members, paragraph 1, which says:
[...]Members of a class are data members, member functions (9.3), nested types, and
enumerators. Data members and member functions are static or non-static; see 9.4. Nested types are classes (9.1, 9.7) and enumerations (7.2) [...]
as opposed to a data member. We can further see from paragraph 16 that layout compatibility depends only on non-static data members:
Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types (3.9).
and, going back to section 9 Classes, paragraph 10 says (emphasis mine):
A POD struct is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). Similarly, a POD union is a union that is both a trivial class and a standard layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). A POD class is a class that is either a POD struct or a POD union.
As far as I can tell, the pre-C++11 standard does not differ much in the above items.
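The two classes from the question can be compared with type traits. In this sketch, is_trivial_and_standard_layout and same_footprint are hypothetical helpers; the trait combination mirrors the C++11 wording quoted above rather than using std::is_pod.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

class pod {
public:
    int x;
    double y;
};

class pod1 {
public:
    int x;
    double y;
    enum POD_TYPE { POD1 = 0, POD = 1 };  // nested type, not a data member
};

// "Trivial class and standard-layout class", as in the C++11 POD definition.
template <typename T>
struct is_trivial_and_standard_layout
    : std::integral_constant<bool, std::is_trivial<T>::value &&
                                   std::is_standard_layout<T>::value> {};

static_assert(is_trivial_and_standard_layout<pod>::value, "pod is POD");
static_assert(is_trivial_and_standard_layout<pod1>::value, "pod1 is still POD");
static_assert(offsetof(pod, y) == offsetof(pod1, y), "same member offsets");

// True when both classes have the same size and alignment.
bool same_footprint() {
    return sizeof(pod) == sizeof(pod1) && alignof(pod) == alignof(pod1);
}
```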

C++ optimise away private variable

Does ISO C++ (11) permit a private non-static class member variable to be optimised away?
This could be detected:
class X { int x; };
assert (sizeof(X) >= sizeof(int));
but I am not aware of a clause that demands the assertion above.
To clarify: (a) Is there a clause in the C++ Standard that ensures the assertion above?
(b) Can anyone think of any other way to detect the elision of x?
[offsetof?]
(c) Is the optimisation permitted anyhow, despite (a) and (b)?
I have a feeling the optimisation could be possible if the class is local to a function but not otherwise (but I'd like to have a definitive citation).
I do not think it is forbidden, but I think it is impractical.
§9 Classes [class]
7/ A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
8/ A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class.
... thus class X { int x; }; is a standard-layout struct.
§9.2 Class members [class.mem]
16/ Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types (3.9).
... thus class X { int x; }; is layout-compatible with struct Y { int y; };.
The unfortunate thing is that layout-compatible is not formally defined in the Standard. However given the use of the word layout it seems the intent is to declare that two layout-compatible types should have the same underlying representation.
Therefore, to be able to remove the x in X one would have to prove that all structures that are layout-compatible (such as Y) are amenable to the same optimization (to keep the layout compatibility). It seems quite... improbable... in any non-trivial program.
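The checks discussed above can be written as compile-time assertions. This is only a sketch: the size comparisons reflect what mainstream compilers do and what layout compatibility suggests, not a clause that spells them out, and sizes_consistent is a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>

class X { int x; };      // the class from the question
struct Y { int y; };     // layout-compatible counterpart from the answer

static_assert(std::is_standard_layout<X>::value, "X is a standard-layout struct");
static_assert(std::is_standard_layout<Y>::value, "Y is a standard-layout struct");

// Assumption: an implementation that elided X::x would fail these checks.
static_assert(sizeof(X) >= sizeof(int), "X still has room for its int");
static_assert(sizeof(X) == sizeof(Y), "X and Y have the same size");

// Runtime view of the same facts, usable from a test.
bool sizes_consistent() {
    return sizeof(X) >= sizeof(int) && sizeof(X) == sizeof(Y);
}
```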