How is the memory layout of a class vs. a struct - c++

I come from C programming where the data in a struct is laid out with the top variable first, then the second, third and so on.
I am now programming in C++ and I am using a class instead. I basically want to achieve the same, but I also want get/set methods and maybe other methods (I also want to try to do it in a C++ style and maybe learn something new).
Is there a guarantee e.g. that the public variables will be first in memory then the private variable?

Is there a guarantee e.g. that the public variables will be first in
memory then the private variable?
No, such a guarantee is not made - C++11 standard, [class.mem]/14:
Nonstatic data members of a (non-union) class with the same access
control (Clause 11) are allocated so that later members have higher
addresses within a class object. The order of allocation of non-static
data members with different access control is unspecified (11).
So
#include <string>
struct A
{
    int i, j;
    std::string str;
private:
    float f;
protected:
    double d;
};
It is only guaranteed that, for a given object of type A,
i has a smaller address than j and
j has a smaller address than str
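As a rough illustration of that guarantee, the sketch below compares member addresses at run time. The helper names stored_before and public_members_in_declaration_order are made up for this example; only struct A comes from the answer. It checks just the ordering that the quoted C++11 wording promises and says nothing about where f and d end up (as far as I know, C++23 later tightened the rule to declaration order regardless of access control, but that is outside the text quoted here).

```cpp
#include <functional>
#include <string>

struct A
{
    int i, j;
    std::string str;
private:
    float f;
protected:
    double d;
};

// Hypothetical helper: true when `first` lies at a lower address than `second`.
// std::less gives a well-defined total order over pointers.
inline bool stored_before(const void* first, const void* second)
{
    return std::less<const void*>()(first, second);
}

// Checks the only ordering promised for A: i, then j, then str.
inline bool public_members_in_declaration_order()
{
    A a;  // only addresses are taken, the int values are never read
    return stored_before(&a.i, &a.j) && stored_before(&a.j, &a.str);
}
```

Calling public_members_in_declaration_order() should return true on any conforming compiler; nothing comparable can be asserted about f or d under the C++11 rule.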
Note that the class-keys struct and class make no difference whatsoever to layout: they differ only in the default access for members and base classes, and access control exists only at compile time.
It only says the order, but not that the first variable actually start
at the "first address"? Let's assume a class without inheritance.
Yes, but only for standard-layout classes. There is a list of requirements a class must satisfy to be a standard-layout class, one of them being that all non-static data members have the same access control.
Quoting C++14 (the same applies for C++11, but the wording is more indirect), [class.mem]/19:
If a standard-layout class object has any non-static data members, its
address is the same as the address of its first non-static data
member. Otherwise, its address is the same as the address of its first
base class subobject (if any). [ Note: There might therefore be
unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. — end note ]
[class]/7:
A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base
classes with non-static data members, and
has no base classes of the same type as the first non-static data member. 110
110) This ensures that two subobjects that have the same class type and that belong to the same most derived object are not
allocated at the same address (5.10).
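A minimal sketch of how this could be checked in code, assuming C++11 and the <type_traits> header. The type names AllPublic and WithVirtual and the function object_starts_at_first_member are invented for illustration.

```cpp
#include <type_traits>

struct AllPublic          // hypothetical: all data members share one access level
{
    int first;
    double second;
};

struct WithVirtual        // hypothetical: a virtual function rules out standard layout
{
    virtual ~WithVirtual() {}
    int first;
};

static_assert(std::is_standard_layout<AllPublic>::value,
              "no virtuals, no bases, one access level");
static_assert(!std::is_standard_layout<WithVirtual>::value,
              "virtual functions break standard layout");

// For a standard-layout object, the object and its first member share an address.
inline bool object_starts_at_first_member()
{
    AllPublic obj{};
    return static_cast<void*>(&obj) == static_cast<void*>(&obj.first);
}
```

For WithVirtual no such address guarantee exists, since an implementation is free to place bookkeeping data for virtual dispatch ahead of the first member.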

First things first: class and struct in C++ are very much the same. The only difference is default access: members declared before the first access specifier (and base classes without one) are private in a class and public in a struct.
Is there a guarantee e.g. that the public variables will be first in memory then the private variable?
There is no such guarantee. When there is no inheritance, the memory will be allocated to class members in the order in which you declare them within the same access group. It is up to the compiler to decide if the public member variables should be placed ahead of the private / protected ones or vice versa. Like C, C++ can add padding in between class members.
Inheritance makes things more complicated, because data members of the base class need to be placed within the derived class as well. On top of that, there is virtual inheritance and multiple inheritance, with complex rules.
I basically want to achieve the same [layout], but I also want get/set methods and also maybe other methods.
If you make all data members of your class private, and add accessor member functions (that's what C++ calls "methods" from other languages) you would achieve this effect.
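Here is a small sketch of that suggestion, assuming C++11. PointC, Point and sum_after_set are hypothetical names used only for this example. Because every data member of Point is private, it has a single access level and can still qualify as standard layout.

```cpp
#include <type_traits>

struct PointC             // hypothetical C-style struct
{
    int x;
    int y;
};

class Point               // hypothetical C++ counterpart with accessors
{
public:
    int  x() const { return x_; }
    int  y() const { return y_; }
    void set_x(int value) { x_ = value; }
    void set_y(int value) { y_ = value; }
private:
    int x_ = 0;           // all data members private: one access level
    int y_ = 0;
};

static_assert(std::is_standard_layout<Point>::value,
              "private data members only, so members keep declaration order");
// Member functions add no per-object storage; equal sizes are what mainstream
// compilers produce here, though the standard does not spell this out.
static_assert(sizeof(Point) == sizeof(PointC), "accessors do not grow the object");

// Small usage helper for the example.
inline int sum_after_set(int a, int b)
{
    Point p;
    p.set_x(a);
    p.set_y(b);
    return p.x() + p.y();
}
```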

Related

Why is std::string a standard layout type?

Taking the example from here: trivial vs. standard layout vs. POD
The following code passes:
#include <type_traits>
struct T {
public:
    int i;
private:
    int j;
};
static_assert(! std::is_standard_layout<T>::value, "");
Yet the following does not:
static_assert(! std::is_standard_layout<std::string>::value, "");
So if that is all it takes for a type not to be standard layout, how could std::string possibly be one?
Let's look at the actual rules for standard layout:
[C++14: 9/7]: A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
std::string probably has no public data members (what would they be?), which is where you tripped up with your T: it has both private and public data members, so it violates the requirement that all non-static data members have the same access control.
But as far as I can tell there is no actual requirement for std::string to be standard layout. That's just how your implementation has done it.
According to the requirement of StandardLayoutType:
Requirements
All non-static data members have the same access control
...
That's why T fails to be a standard-layout type. std::string simply satisfies the requirements.
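To make the comparison concrete, the sketch below puts T next to a hypothetical variant U whose data members are all private. The constant names are invented for this example. The result for std::string is deliberately not asserted, because, as noted above, it depends on the library implementation.

```cpp
#include <string>
#include <type_traits>

struct T                  // from the question: public and private data members
{
public:
    int i;
private:
    int j;
};

struct U                  // hypothetical variant: one access level only
{
private:
    int i;
    int j;
};

// Mixed access control, so T is not standard layout.
constexpr bool t_is_standard_layout = std::is_standard_layout<T>::value;
// Same access control everywhere, so U is standard layout.
constexpr bool u_is_standard_layout = std::is_standard_layout<U>::value;
// Implementation-specific: depends on how the library lays out std::string.
constexpr bool string_is_standard_layout =
    std::is_standard_layout<std::string>::value;
```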

Does inheriting from an empty class increase that class's size?

According to C++11 9/7 (draft n3376), a standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
It follows that an empty class is a standard-layout class, and that another class with an empty class as a base is also a standard-layout class, provided the first non-static data member of that class is not of the same type as the base.
Furthermore, 9.2/19 states that:
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. —end note]
This seems to imply that the Empty Base Class Optimization is now a mandatory optimization, at least for standard-layout classes. My point is that if the empty base optimization weren't mandated, the layout of a standard-layout class would not be standard at all, but would depend on whether the implementation performs that optimization. Is my reasoning correct, or am I missing something?
Yes, you're correct, that was pointed out in the "PODs revisited" proposals: http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2007/n2342.htm#ABI
The Embarcadero compiler docs also state it: http://docwiki.embarcadero.com/RADStudio/en/Is_standard_layout
Another key point is [class.mem]/16
Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types (3.9).
Note that only data members affect layout compatibility, not base classes, so these two standard layout classes are layout-compatible:
struct empty { };
struct stdlayout1 : empty { int i; };
struct stdlayout2 { int j; };
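The three types above can be checked with a short sketch like the following (C++11 assumed). The function names are hypothetical. The address check follows from the 9.2/19 wording quoted earlier; the size comparison is what mainstream compilers produce, not something the quoted text states directly.

```cpp
#include <type_traits>

struct empty { };
struct stdlayout1 : empty { int i; };
struct stdlayout2 { int j; };

static_assert(std::is_standard_layout<stdlayout1>::value,
              "empty base plus data in the derived class is still standard layout");
static_assert(std::is_standard_layout<stdlayout2>::value, "plain struct");

// The first member of a standard-layout object lives at the object's address,
// so the empty base cannot occupy bytes in front of `i`.
inline bool empty_base_takes_no_leading_space()
{
    stdlayout1 s{};
    return static_cast<void*>(&s) == static_cast<void*>(&s.i);
}

// Expected to hold on mainstream compilers.
inline bool same_size()
{
    return sizeof(stdlayout1) == sizeof(stdlayout2);
}
```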

Order of storage inside a structure / object

Consider these two cases :
struct customType
{
dataType1 var1;
dataType2 var2;
dataType3 var3;
} ;
customType instance1;
// Assume var1, var2 and var3 were initialized to some valid values.
customType * instance2 = &instance1;
dataType1 firstMemberInsideStruct = (dataType1)(*instance2);
class CustomType
{
public:
    dataType1 member1;
    dataType2 member2;
    returnType1 memberFunction1();
private:
    dataType3 member3;
    dataType4 member4;
    returnType2 memberFunction2();
};
CustomType object;
// Assume member1, member2, member3 and member4 were initialized to some valid values.
CustomType *pointerToAnObject = &object ;
dataType1 firstMemberInTheObject = (dataType1) (*pointerToAnObject);
Is it always safe to do this ?
I want to know if standard specifies any order of storage among -
The elements inside a C structure.
Data members inside an object of a C++ class.
C99 and C++ differ a bit on this.
The C99 standard guarantees that the fields of a struct will be laid out in memory in the order they are declared, and that the fields of two identical structs will have the same offsets. See this question for the relevant sections of the C99 standard. To summarize: the offset of the first field is specified to be zero, but the offsets after that are not specified by the standard. This is to allow C compilers to adjust the offsets of each field so the field will satisfy any memory alignment requirements of the architecture. Because this is implementation-dependent, C provides a standard way to determine the offset of each field using the offsetof macro.
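The same macro is available in C++ through <cstddef>. The sketch below uses a hypothetical struct Record to show the two properties described above: the first field sits at offset zero, and later fields get strictly larger offsets whose exact values are left to the implementation.

```cpp
#include <cstddef>

struct Record             // hypothetical struct, members in declaration order
{
    char   tag;
    int    count;
    double value;
};

static_assert(offsetof(Record, tag) == 0, "first field sits at offset zero");
static_assert(offsetof(Record, tag) < offsetof(Record, count),
              "later fields have larger offsets");
static_assert(offsetof(Record, count) < offsetof(Record, value),
              "later fields have larger offsets");

// The exact value is implementation-defined; query it instead of assuming it.
inline std::size_t count_offset()
{
    return offsetof(Record, count);
}
```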
C++ offers this guarantee only for plain old data (POD) types; since C++11 the relevant category is standard-layout classes. C++ classes that do not qualify cannot be treated like this. The standard gives the C++ compiler quite a bit of freedom in how it organizes a class when the class uses multiple inheritance, has non-public fields or members, or contains virtual members.
What this means for your examples:
dataType1 firstMemberInsideStruct = (dataType1)(*instance2);
This line is okay only if dataType1, dataType2, and dataType3 are plain old data. If any of them are not, then the customType struct may not have a trivial constructor (or destructor) and this assumption may not hold.
dataType1 firstMemberInTheObject = (dataType1) (*pointerToAnObject);
This line is not safe regardless of whether dataType1, dataType2, and dataType3 are POD, because the CustomType class has private instance variables. This makes it not a POD class, and so you cannot assume that its first instance variable will be ordered in a particular way.
9.0.7
A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
9.2.14
Nonstatic data members of a (non-union) class with the same access
control (Clause 11) are allocated so that later members have higher
addresses within a class object. The order of allocation of non-static
data members with different access control is unspecified (11).
Implementation alignment requirements might cause two adjacent members
not to be allocated immediately after each other; so might
requirements for space for managing virtual functions (10.3) and
virtual base classes (10.1).
9.2.20
A pointer to a standard-layout struct object, suitably converted using
a reinterpret_cast, points to its initial member (or if that member is
a bit-field, then to the unit in which it resides) and vice versa. [
Note: There might therefore be unnamed padding within a
standard-layout struct object, but not at its beginning, as necessary
to achieve appropriate alignment. — end note ]
It's not always safe to do so. If the classes have virtual methods, it most definitely is not. Data members are guaranteed to appear in the same order for the same access level chunk, but these groups can be reordered.
To be safe with this type of cast, you should provide a conversion constructor or a cast operator rather than rely on implementation details.
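One possible shape for such a cast operator is sketched below, assuming C++11. The class Temperature and the function read_value are hypothetical and not taken from the question. The point is that the class itself hands out the value, so callers never depend on where the member is stored.

```cpp
class Temperature         // hypothetical class with mixed access control
{
public:
    explicit Temperature(double celsius) : celsius_(celsius), samples_(1) {}

    // Conversion operator: returns the value without exposing the layout.
    explicit operator double() const { return celsius_; }

    int samples() const { return samples_; }

    int label = 0;        // public member, so the class is not standard layout
private:
    double celsius_;
    int samples_;
};

// Usage: an explicit static_cast goes through the operator above.
inline double read_value(const Temperature& t)
{
    return static_cast<double>(t);
}
```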
Typically in a C struct members are stored in the order that they are declared. However the elements must be aligned properly. Wikipedia has a good example of how this works.
I will re-iterate here:
If you have the following struct
struct MixedData
{
char Data1;
short Data2;
int Data3;
char Data4;
};
padding will be inserted between members of differing types in order to ensure proper byte alignment. Typically chars are 1-byte aligned, shorts are 2-byte aligned, ints are 4-byte aligned, etc.
Thus, to make Data2 2-byte aligned, 1 byte of padding is inserted between Data1 and Data2.
It is also worth mentioning that there are mechanisms that can change the packing alignment. See #pragma pack.
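The padding can be observed directly with offsetof and sizeof. This is a sketch assuming C++11; the function padding_bytes is a made-up helper. On a typical platform with a 2-byte short and a 4-byte int, one byte of padding follows Data1 and three bytes follow Data4, but the code below only asserts what alignment requires.

```cpp
#include <cstddef>

struct MixedData
{
    char  Data1;
    short Data2;
    int   Data3;
    char  Data4;
};

// Each member lands on a multiple of its own alignment.
static_assert(offsetof(MixedData, Data2) % alignof(short) == 0, "Data2 aligned");
static_assert(offsetof(MixedData, Data3) % alignof(int) == 0, "Data3 aligned");
// The total size is a multiple of the struct's alignment, which is why
// trailing padding may appear after Data4.
static_assert(sizeof(MixedData) % alignof(MixedData) == 0, "size is aligned");

// Bytes in the struct that belong to no member.
inline std::size_t padding_bytes()
{
    return sizeof(MixedData)
         - (sizeof(char) + sizeof(short) + sizeof(int) + sizeof(char));
}
```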

Safety of casting between pointers of two identical classes?

Let's say I have two different classes, both represent 2D coordinate data in the same internal way like the following:
class LibA_Vertex{
public:
    // ... constructors and various methods, operator overloads
    float x, y;
};
class LibB_Vertex{
public:
    // ... same usage and internal data as LibA, but with different methods
    float x, y;
};
void foobar(){
    LibA_Vertex * verticesA = new LibA_Vertex[1000];
    verticesA[50].y = 9;
    LibB_Vertex * verticesB = reinterpret_cast<LibB_Vertex*>( verticesA );
    print(verticesB[50].y); // should output a "9"
}
Given the two classes being identical and the function above, can I reliably count on this pointer conversion working as expected in every case?
(The background, is that I need an easy way of trading vertex arrays between two separate libraries that have identical Vertex classes, and I want to avoid needlessly copying arrays).
C++11 added a concept called layout-compatible which applies here.
Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types (3.9).
where
A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class.
A standard-layout union is a standard-layout class defined with the class-key union.
Finally
Pointers to cv-qualified and cv-unqualified versions (3.9.3) of layout-compatible
types shall have the same value representation and alignment requirements (3.11).
Which guarantees that reinterpret_cast can turn a pointer to one type into a pointer to any layout-compatible type.
I would wrap that conversion up in a class (so that if you need to change platform or something, it's at least localised in one spot) but yes it should be possible.
You'll want to use reinterpret_cast, not static_cast as well.
Theoretically this is undefined behavior. However, it may work on certain systems/platforms.
I would suggest that you try to merge the two classes into one, i.e.
class Lib_Vertex{
// data (which is exactly same for both classes)
public:
// methods for LibA_Vertex
// methods for LibB_Vertex
};
Adding methods into a class will not affect its size. You may have to change your design a bit but it's worth it.
Technically this is undefined behavior. In reality, if the same compiler was used to compile both classes, they'll have the same layout in memory if the fields are declared in the same order, have the same types and the same access level.
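For readers who want to stay on the conservative side, here is a sketch of what could be verified at compile time, plus a conversion that copies bytes instead of accessing one type through a pointer to the other. It assumes C++11 and simplified versions of the two classes. The function to_lib_b is hypothetical, and it does copy, so it trades the zero-copy goal for not relying on the cast.

```cpp
#include <cstring>
#include <type_traits>

class LibA_Vertex { public: float x, y; };   // simplified stand-ins for the
class LibB_Vertex { public: float x, y; };   // two library classes

static_assert(std::is_standard_layout<LibA_Vertex>::value &&
              std::is_standard_layout<LibB_Vertex>::value,
              "both types must be standard layout");
static_assert(std::is_trivially_copyable<LibA_Vertex>::value &&
              std::is_trivially_copyable<LibB_Vertex>::value,
              "byte-wise copying requires trivially copyable types");
static_assert(sizeof(LibA_Vertex) == sizeof(LibB_Vertex),
              "sizes must match for a byte-wise copy");

// Copies the bytes of one vertex into an object of the other type.
inline LibB_Vertex to_lib_b(const LibA_Vertex& a)
{
    LibB_Vertex b;
    std::memcpy(&b, &a, sizeof b);
    return b;
}
```

If either static_assert fires after a library update, the assumption that the two classes match has stopped holding, which is exactly the situation the reinterpret_cast version would silently get wrong.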

In a class with no virtual methods or superclass, is it safe to assume (address of first member variable) == this?

I made a private API that assumes that the address of the first member-object in the class will be the same as the class's this-pointer... that way the member-object can trivially derive a pointer to the object that it is a member of, without having to store a pointer explicitly.
Given that I am willing to make sure that the container class won't inherit from any superclass, won't have any virtual methods, and that the member-object that does this trick will be the first member object declared, will that assumption hold valid for any C++ compiler, or do I need to use the offsetof() operator (or similar) to guarantee correctness?
To put it another way, the code below does what I expect under g++, but will it work everywhere?
class MyContainer
{
public:
    MyContainer() {}
    ~MyContainer() {} // non-virtual dtor
private:
    class MyContained
    {
    public:
        MyContained() {}
        ~MyContained() {}
        // Given that the only place Contained objects are declared is m_contained
        // (below), will this work as expected on any C++ compiler?
        MyContainer * GetPointerToMyContainer()
        {
            return reinterpret_cast<MyContainer *>(this);
        }
    };
    MyContained m_contained; // MUST BE FIRST MEMBER ITEM DECLARED IN MyContainer
    int m_foo; // other member items may be declared after m_contained
    float m_bar;
};
It seems the current standard guarantees this only for POD types.
9.2.17
A pointer to a POD-struct object,
suitably converted, points to its
initial member (or if that member is a
bit-field, then to the unit in which
it resides) and vice versa. [Note:
There might therefore be unnamed
padding within a POD-struct object,
but not at its beginning, as necessary
to achieve appropriate alignment. ]
However, the C++0x standard seems to extend this guarantee to "standard-layout struct object"
A standard-layout class is a class
that:
— has no non-static data members of
type non-standard-layout class (or
array of such types) or reference,
— has no virtual functions (10.3) and
no virtual base classes (10.1),
— has the same access control (Clause
11) for all non-static data members,
— has no non-standard-layout base
classes,
— either has no non-static data
members in the most-derived class and
at most one base class with non-static
data members, or has no base classes
with non-static data members, and
— has no base classes of the same type
as the first non-static data member.
A standard-layout struct is a
standard-layout class defined with the
class-key struct or the class-key
class.
The assumption probably holds in practice (the earlier standard simply lacked these distinctions, though this may well have been the intent).
It is not guaranteed for non-POD types. C++ Standard 9.2/12:
Nonstatic data members of a (non-union) class declared without an
intervening access-specifier are allocated so that later members have
higher addresses within a class object. The order of allocation of
nonstatic data members separated by an access-specifier is unspecified
(11.1). Implementation alignment requirements might cause two
adjacent members not to be allocated immediately after each other; so might
requirements for space for managing virtual functions (10.3) and virtual
base classes (10.1).
In your case you have a non-POD type, since it contains a user-defined destructor. You can read more about POD types here.
The latest C++ spec draft says this is ok, as long as the class qualifies as a standard layout class, which just requires
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most-derived class and at most one base class with
non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
Depending on the definition of MyContained, your class might or might not be standard layout.
Note that POD classes are the intersection of standard-layout and trivial classes.
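A compile-time check can make the assumption explicit. The sketch below is a hypothetical reduction of the question's classes (Container, Contained, owner and trick_works are names invented here), assuming C++11. User-provided constructors and destructors do not affect standard layout, so the static_assert passes, and the first-member guarantee then covers the reinterpret_cast.

```cpp
#include <type_traits>

class Container           // hypothetical stand-in for MyContainer
{
public:
    class Contained
    {
    public:
        Contained() {}
        ~Contained() {}
        // Relies on Contained being the first data member of a
        // standard-layout Container.
        Container* owner() { return reinterpret_cast<Container*>(this); }
    };

    Container() : m_foo(0), m_bar(0.0f) {}
    ~Container() {}       // non-virtual destructor

    bool  trick_works() { return m_contained.owner() == this; }
    float total() const { return m_foo + m_bar; }

private:
    Contained m_contained; // must stay the first data member
    int m_foo;
    float m_bar;
};

// Fails to compile if someone later adds a virtual function, mixes access
// levels among the data members, or otherwise breaks standard layout.
static_assert(std::is_standard_layout<Container>::value,
              "the owner() trick needs a standard-layout Container");
```

The static_assert does not verify that m_contained is still declared first; that part remains a convention the comment has to carry.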