Not Understanding C++ Memory Alignment - c++

I understand that in C++, if we have a struct like this:
struct x_
{
char a; // 1 byte
int b; // 4 bytes
short c; // 2 bytes
char d; // 1 byte
} MyStruct;
Memory structure will look like this due to compiler padding:
struct x_
{
char a; // 1 byte
char _pad0[3]; // padding to put 'b' on 4-byte boundary
int b; // 4 bytes
short c; // 2 bytes
char d; // 1 byte
char _pad1[1]; // padding to make sizeof(x_) multiple of 4
}
Can somebody please help me understand why sizeof(x_) must be a multiple of 4, and not any other number?

At least, a class must have the same alignment largest alignment of any of its non-static data member. So x_ must be aligned as b. If x_ had a lower alignment requirement, then b could be misaligned. Imagine an object x_ at address 3 then its b member would also be at an address 3+alignof(int). If alignof(int)==4 then b would be at address 7, so it would be misaligned. So following this example, alignof(x_)==4;
Then the size of an object must be a multiple of the alignment because elements of an array are required to be contiguous: a pointer to the end of element n must be a pointer to the element n+1. So in your case x size must be a multiple of 4.

Related

Trailing padding in C/C++ in nested structures - is it neccesary?

This is more of a theoretical question.
I'm familiar with how padding and trailing padding works.
struct myStruct{
uint32_t x;
char* p;
char c;
};
// myStruct layout will compile to
// x: 4 Bytes
// padding: 4 Bytes
// *p: 8 Bytes
// c: 1 Byte
// padding: 7 Bytes
// Total: 24 Bytes
There needs to be padding after x, so that *p is aligned, and there needs to be trailing padding after c so that the whole struct size is divisible by 8 (in order to get the right stride length).
But consider this example:
struct A{
uint64_t x;
uint8_t y;
};
struct B{
struct A myStruct;
uint32_t c;
};
// Based on all information I read on internet, and based on my tinkering
// with both GCC and Clang, the layout of struct B will look like:
// myStruct.x: 8 Bytes
// myStruct.y: 1 Byte
// myStruct.padding: 7 Bytes
// c: 4 Bytes
// padding: 4 Bytes
// total size: 24 Bytes
// total padding: 11 Bytes
// padding overhead: 45%
// my question is, why struct A does not get "inlined" into struct B,
// and therefore why the final layout of struct B does not look like this:
// myStruct.x: 8 Bytes
// myStruct.y: 1 Byte
// padding 3 Bytes
// c: 4 Bytes
// total size: 16 Bytes
// total padding: 3 Bytes
// padding overhead: 19%
Both layouts satisfy alignments of all variables. Both layouts have the same order of variables. In both layouts struct B has correct stride length (divisible by 8 Bytes). Only thing that differs (besides 33% smaller size), is that struct A does not have correct stride length in layout 2, but that should not matter, since clearly there is no array of struct As.
I checked this layout in GCC with -O3 and -g, struct B has 24 Bytes.
My question is - is there some reason why this optimization is not applied? Is there some layout requirement in C/C++ that forbids this? Or is there some compilation flag I'm missing? Or is this an ABI thing?
EDIT:
Answered.
See answer from #dbush on why compiler cannot emit this layout on it's own.
The following code example uses GCC pragmas packed and aligned (as suggested by #jaskij) to manualy enforce the more optimized layout. Struct B_packed has only 16 Bytes instead of 24 Bytes (note that this code might cause issues/run slow when there is an array of structs B_packed, be aware and don't blindly copy this code):
struct __attribute__ ((__packed__)) A_packed{
uint64_t x;
uint8_t y;
};
struct __attribute__ ((__packed__)) B_packed{
struct A_packed myStruct;
uint32_t c __attribute__ ((aligned(4)));
};
// Layout of B_packed will be
// myStruct.x: 8 Bytes
// myStruct.y: 1 Byte
// padding for c: 3 Bytes
// c: 4 Bytes
// total size: 16 Bytes
// total padding: 3 Bytes
// padding overhead: 19%
is there some reason why this optimization is not applied
If this were allowed, the value of sizeof(struct B) would be ambiguous.
Suppose you did this:
struct B b;
struct A a = { 1, 2 };
b.c = 0x12345678;
memcpy(&b.myStruct, &a, sizeof(struct A));
You'd be overwriting the value of b.c.
Padding is used to force alignment. Now if you have an array of struct myStruct, then there is a rule that array elements follow each other without any padding. In your case, without padding inside myStruct after the last field, the second myStruct in an array wouldn't be properly aligned. Therefore it is necessary that sizeof(myStruct) is a multiple of the alignment of myStruct, and for that you may need enough padding at the end.

Why the size of my Person is 10 bytes, and not 16 ? [duplicate]

Someone explain me how does the order of the member declaration inside a class determines the size of that class.
For Example :
class temp
{
public:
int i;
short s;
char c;
};
The size of above class is 8 bytes.
But when the order of the member declaration is changed as below
class temp
{
public:
char c;
int i;
short s;
};
then the size of class is 12 bytes.
How?
The reason behind above behavior is data structure alignment and padding. Basically if you are creating a 4 byte variable e.g. int, it will be aligned to a four byte boundary i.e. it will start from an address in memory, which is multiple of 4. Same applies to other data types. 2 byte short should start from even memory address and so on.
Hence if you have a 1 byte character declared before the int (assume 4 byte here), there will be 3 free bytes left in between. The common term used for them is 'padded'.
Data structure alignment
Another good pictorial explanation
Reason for alignment
Padding allows faster memory access i.e. for cpu, accessing memory areas that are aligned is faster e.g. reading a 4 byte aligned integer might take a single read call where as if an integer is located at a non aligned address range (say address 0x0002 - 0x0006), then it would take two memory reads to get this integer.
One way to force compiler to avoid alignment is (specific to gcc/g++) to use keyword 'packed' with the structure attribute. packed keyword Also the link specifies how to enforce alignment by a specific boundary of your choice (2, 4, 8 etc.) using the aligned keyword.
Best practice
It is always a good idea to structure your class/struct in a way that variables are already aligned with minimum padding. This reduces the size of the class overall plus it reduces the amount of work done by the compiler i.e. no rearrangement of structure. Also one should always access member variables by their names in the code, rather than trying to read a specific byte from structure assuming a value would be located at that byte.
Another useful SO question on performance advantage of alignment
For the sake of completion, following would still have a size of 8 bytes in your scenario (32 bit machine), but it won't get any better since full 8 bytes are now occupied, and there is no padding.
class temp
{
public:
int i;
short s;
char c;
char c2;
};
class temp
{
public:
int i; //size 4 alignment 4
short s; //size 2 alignment 2
char c; //size 1 alignment 1
}; //Size 8 alignment max(4,2,1)=4
temp[i[0-4];s[4-2];c[6-7]]] -> 8
Padding in (7-8)
class temp
{
public:
char c; //size 1 alignment 1
int i; //size 4 alignment 4
short s; //size 2 alignment 2
};//Size 12 alignment max(4,2,1)=4
temp[c[0-1];i[4-8];s[8-10]]] -> 12
Padding in (1-4) and (10-12)

Structure alignment padding, largest size of padding, and order of struct members

I've been learning about structure data padding since I found out my sizeof() operator wasn't returning what I expected. According to the pattern that I've observed, it aligns structure members with the largest data type. So for example...
struct MyStruct1
{
char a; // 1 byte
char b; // 1 byte
char c; // 1 byte
char d; // 1 byte
char e; // 1 byte
// Total 5 Bytes
//Total size of struct = 5 (no padding)
};
struct MyStruct2
{
char a; // 1 byte
char b; // 1 byte
char c; // 1 byte
char d; // 1 byte
char e; // 1 byte
short f; // 2 bytes
// Total 7 Bytes
//Total size of struct = 8 (1 byte of padding between char e and short f
};
struct MyStruct3
{
char a; // 1 byte
char b; // 1 byte
char c; // 1 byte
char d; // 1 byte
char e; // 1 byte
int f; // 4 bytes
// Total 9 bytes
//Total size of struct = 12 (3 bytes of padding between char e and int f
};
However if make the last member an 8 byte data type, for example a long long, it still only adds 3 bytes of padding, making a four-byte aligned structure. However if I build in 64 bit mode, it does in fact align for 8 bytes (the biggest data type). My first question is, am I wrong in saying it aligns the members with the largest data type? This statement seems correct for a 64 bit build, but only true up to 4 byte data types in a 32 bit build. Has this to do with the native 'word' size of the CPU? Or the program itself?
My second question is, would the following be an entire waste of space and bad programming?
struct MyBadStruct
{
char a; // 1 byte
unsigned int b; // 4 bytes
UINT8 c; // 1 byte
long d; // 4 bytes
UCHAR e; // 1 byte
char* f; // 4 bytes
char g; // 1 byte
// Total of 16 bytes
//Total size of struct = 28 bytes (12 bytes of padding, wasted)
};
How padding is done, is not part of the standard. So it can be done differently on different systems and compilers. It is often done so that variables are aligned at there size, i.e. size=1 -> no alignment, size=2 -> 2 byte alignment, size=4 -> 4 byte alignment and so on. For size=8, it is normally 4 or 8 bytes aligned. The struct it self is normally 4 or 8 bytes aligned. But - just to repeat - it is system/compiler dependent.
In your case it seems to follow the pattern above.
So
char a;
int b;
will give 3 bytes padding to 4 byte align the int.
and
char a1;
int b1;
char a2;
int b2;
char a3;
int b3;
char a4;
int b4;
will end up as 32 byte (again to 4 byte align the int).
But
int b1;
int b2;
int b3;
int b4;
char a1;
char a2;
char a3;
char a4;
will be just 20 as the int is already aligned.
So if memory matters, put the largest members first.
However, if memory doesn't matter (e.g. because the struct isn't used that much), it may be better to keep things in a logical order so that the code is easy to read for humans.
Typically the best way to reduce the amount of padding inserted by the compiler is to sort the data members inside of your struct from largest to smallest:
struct MyNotSOBadStruct
{
long d; // 4 bytes
char* f; // 4 bytes
unsigned int b; // 4 bytes
char a; // 1 byte
UINT8 c; // 1 byte
UCHAR e; // 1 byte
char g; // 1 byte
// Total of 16 bytes
};
the size may vary depending on 32 vs 64 bit os because the size of a pointer will change
live version: http://coliru.stacked-crooked.com/a/aee33c64192f2fe0
i get size = 24
All of the following is implementation dependent. Do not rely on this for the correctness of your programs (but by all means make use of it for debugging or improving performance).
In general each datatype has a preferred alignment. This is never larger than the size of the type, but it can be smaller.
It appears that your compiler is aligning 64-bit integers on a 32-bit boundary when compiling in 32-bit mode, but on a 64-bit boundary in 64-bit mode.
As to your question about MyBadStruct: In general, write your code to be simple and easy to understand; only do anything else if you know (through measurement) that you have a problem. Having said that, if you sort your member variables by size (largest first), you will minimize padding space.

When the compiler decides to pad a struct

Let's say we have:
struct A{
char a1;
char a2;
};
struct B{
int b1;
char b2;
};
struct C{
char C1;
int C2;
};
I know that because of padding to a multiple of the word size (assuming word size=4), sizeof(C)==8 although sizeof(char)==1 and sizeof(int)==4.
I would expect that sizeof(B)==5 instead of sizeof(B)==8.
But if sizeof(B)==8 I would expect that sizeof(A)==4 instead of sizeof(A)==2.
Could anyone please explain why the padding and the aligning are working differently in those cases?
A common padding scheme is to pad structs so that each member starts at an even multiple of the size of that member or to the machine word size (whichever is smaller). The entire struct is padded following the same rule.
Assuming such a padding scheme I would expect:
The biggest member in struct A has size 1, so no padding is used.
In struct B, the size of 5 is padded to 8, because one member has size 4.
The layout would be:
int 4
char 1
padding 3
In struct C, some padding is inserted before the int, so that it starts at an address divisible by 4.
The layout would be:
char 1
padding 3
int 4
It's up to the compiler to decide how best to pad the struct. For some reason, it decided that in struct B that char b2 was more optimally aligned on a 4 byte boundary. Additionally, the specific architecture may have requirements/behaviors that the compiler takes into account when deciding how to pad structs.
If you 'pack' the struct, then you'd see the sizes you expect (although that is not portable and may have performance penalties and other issues depending on the architecture).
structs in general will be aligned to boundaries based on the largest type contained. Consider an array of struct B myarray[5];
struct B must be aligned to 8 bytes so that it's b1 member is always on a 4 byte boundary. myarray[1].b1 can't start at the 6th byte into the array, which is what you would have if sizeof(B) == 5.

Why does the size of a class depends on the order of the member declaration? and How?

Someone explain me how does the order of the member declaration inside a class determines the size of that class.
For Example :
class temp
{
public:
int i;
short s;
char c;
};
The size of above class is 8 bytes.
But when the order of the member declaration is changed as below
class temp
{
public:
char c;
int i;
short s;
};
then the size of class is 12 bytes.
How?
The reason behind above behavior is data structure alignment and padding. Basically if you are creating a 4 byte variable e.g. int, it will be aligned to a four byte boundary i.e. it will start from an address in memory, which is multiple of 4. Same applies to other data types. 2 byte short should start from even memory address and so on.
Hence if you have a 1 byte character declared before the int (assume 4 byte here), there will be 3 free bytes left in between. The common term used for them is 'padded'.
Data structure alignment
Another good pictorial explanation
Reason for alignment
Padding allows faster memory access i.e. for cpu, accessing memory areas that are aligned is faster e.g. reading a 4 byte aligned integer might take a single read call where as if an integer is located at a non aligned address range (say address 0x0002 - 0x0006), then it would take two memory reads to get this integer.
One way to force compiler to avoid alignment is (specific to gcc/g++) to use keyword 'packed' with the structure attribute. packed keyword Also the link specifies how to enforce alignment by a specific boundary of your choice (2, 4, 8 etc.) using the aligned keyword.
Best practice
It is always a good idea to structure your class/struct in a way that variables are already aligned with minimum padding. This reduces the size of the class overall plus it reduces the amount of work done by the compiler i.e. no rearrangement of structure. Also one should always access member variables by their names in the code, rather than trying to read a specific byte from structure assuming a value would be located at that byte.
Another useful SO question on performance advantage of alignment
For the sake of completion, following would still have a size of 8 bytes in your scenario (32 bit machine), but it won't get any better since full 8 bytes are now occupied, and there is no padding.
class temp
{
public:
int i;
short s;
char c;
char c2;
};
class temp
{
public:
int i; //size 4 alignment 4
short s; //size 2 alignment 2
char c; //size 1 alignment 1
}; //Size 8 alignment max(4,2,1)=4
temp[i[0-4];s[4-2];c[6-7]]] -> 8
Padding in (7-8)
class temp
{
public:
char c; //size 1 alignment 1
int i; //size 4 alignment 4
short s; //size 2 alignment 2
};//Size 12 alignment max(4,2,1)=4
temp[c[0-1];i[4-8];s[8-10]]] -> 12
Padding in (1-4) and (10-12)