Why is the size of class cl1 in the following code 8 rather than 5, while the size of class cl2 is 1?
class cl1 {
public:
int n;
char cb;
cl1();
~cl1();
};
class cl2 {
public:
char cb;
cl2();
~cl2();
};
The compiler is free to insert padding in between and after class members in order to ensure that variables are properly aligned, etc. Exactly what padding is inserted is up to the implementation. In this case, I'd guess that the compiler is adding 3 bytes of padding after cl1::cb, perhaps to ensure that the next variable in memory is aligned on a 4-byte boundary.
The largest member of cl1 (n) is 4 bytes, so the size of cl1 is padded up to the nearest 4 bytes (8 in this case) so that an array of cl1 objects does not create n members which are not aligned to 4-byte addresses. Most processors really hate misaligned multi-byte values, either suffering performance losses (two memory cycles to access one value) or outright crashes (alignment exceptions).
There is no guarantee that this will be consistent from compiler to compiler or platform to platform.
This is all due to padding.
The thing is that the addresses of both the object and its members should be properly aligned for OS and Hardware - specific reasons. Thus the result. The problem of padding is complicated by the fact that objects in an array must be located consecutively, without any space in between, and ALL should be properly aligned.
It is because of structure padding by the compiler. If you want to remove the padding, try #pragma pack(1) and you should get 5 and 1 as expected.
While you're exploring the size of structs and how padding is done, here is an interesting point: the size of a struct depends not only on its members, but also on the order of their declaration. For example, the following structs have different sizes even though both have the same number of members of the same types. The only difference is the order of declaration.
struct A
{
int a;
char b;
char c;
};
struct B
{
char b;
int a;
char c;
};
cout << "sizeof(A) = " << sizeof(A) << endl;
cout << "sizeof(B) = " << sizeof(B) << endl;
Output:
sizeof(A) = 8
sizeof(B) = 12
Online Demo at Ideone: http://www.ideone.com/8OoxX
I understand sizeof empty structs is 1, but when combined with templates, this can cause scenarios where the sizeof a class may be misleading.
For example, in the below code, imagine I am coding to a binary protocol where there are certain important fields, followed by an optional struct.
After creating the message, we do a memcpy using sizeof(Message), but we get a total of 2 and tried to send 2 bytes, despite only truly having 1 byte of a message. This is dangerous and led to some address-sanitizer issues.
I've looked at empty base optimization, but that would only work if the optional field is in the beginning. Even if we don't use sizeof in this particular instance, sizeof is commonly used elsewhere in other generic code to handle messages like these.
struct EmptyClass{
// empty class is 1 byte
};
struct NonEmptyClass{
uint32_t j = 1; // non-empty class is 4 bytes
};
#pragma pack(push, 1) // exact fit - no padding
template <typename A>
struct Message{
bool i = true; // example of an important field of 1 byte
A a; // 1 byte if empty, 4 bytes if full
};
#pragma pack(pop)
int main() {
Message<NonEmptyClass> a;
std::cout << "Size of nonEmpty a: " << sizeof(a) << std::endl; // 5
Message<EmptyClass> b;
std::cout << "Size of Empty b: " << sizeof(b) << std::endl; // 2
// memcpy b results in address-sanitizer issues, and likely garbage values for the second byte
return 0;
}
In general, does this mean the only way to resolve these kind of issues is to have a custom (compile-time) size operator since sizeof isn't overridable?
Is sizeof(Type) safe to use for memcpying said Type?
For all trivially copyable types, yes. More specifically, it is safe to copy padding bytes.
But the representation of a class is not necessarily same across separate systems. Classes are not an ideal way to represent structure of a binary communication protocol.
Note that there is a potential problem in your example if you expand it to actually use a or b. a.a and a.b are uninitialised, so behaviour of reading their value is undefined. memcpy itself is actually safe, but interpreting the copied data as EmptyClass or NonEmptyClass is not.
I was testing a class alignment and found strange behavior. I tested it with VS2012 compiler setting 4 and 8 bytes alignment setting but in each case output is same.
class Alignemnt{
public:
Alignemnt():a(){}
int a;
};
class Alignemnt_1{
public:
int a;
char array[2];
};
class Alignemnt_2{
public:
int a;
char array[2];
int x;
};
std::cout << "Sizeof(Alignemnt) :" <<sizeof(Alignemnt) << std::endl;
std::cout << "Sizeof(Alignemnt_1) :" <<sizeof(Alignemnt_1) << std::endl;
std::cout << "Sizeof(Alignemnt_2) :" <<sizeof(Alignemnt_2) << std::endl;
Every time output is:
Sizeof(Alignemnt) : 4
Sizeof(Alignemnt_1) : 8
Sizeof(Alignemnt_2) : 12
I think, Alignemnt_2 size should be 16 byte.
I assume you are referring to the /Zp switch, which lets you control maximum struct member alignment:
When you specify this option, each structure member after the first is stored on either the size of the member type or n-byte boundaries (where n is 1, 2, 4, 8, or 16), whichever is smaller.
Since you are not using a struct member with an alignment of more than 4 bytes (sizeof(int) and alignof(int) are both 4), all settings of 4 bytes and above will lead to exactly the same behavior.
If you want to specify the exact alignment of a structure member, consider using the standard C++11 alignas specifier, which lets you request a stricter alignment for a type or member. VS 2012 may not support alignas; if it does not, MSVC's __declspec(align(n)) is the compiler-specific equivalent.
Alignment shouldn't change the size of your object, just the starting address of the objects. For instance, an 8 byte aligned object could be at address 0x100000 or 0x100008, or really any address ending in 0 or 8 when written in hex but not 0x100004.
No.
Alignemnt_2 is fine.
You have 2 chars next to each other, which means 2 out of 4 bytes are used. The compiler pads them to 4 bytes, and you end up with 4 + 4 + 4 = 12.
If you want to end up to 16 you can try the following declaration :
struct S
{
char a;
int b;
char c;
int d;
};
Let's say we have:
struct A{
char a1;
char a2;
};
struct B{
int b1;
char b2;
};
struct C{
char C1;
int C2;
};
I know that because of padding to a multiple of the word size (assuming word size=4), sizeof(C)==8 although sizeof(char)==1 and sizeof(int)==4.
I would expect that sizeof(B)==5 instead of sizeof(B)==8.
But if sizeof(B)==8 I would expect that sizeof(A)==4 instead of sizeof(A)==2.
Could anyone please explain why the padding and the aligning are working differently in those cases?
A common padding scheme is to pad structs so that each member starts at an even multiple of the size of that member or to the machine word size (whichever is smaller). The entire struct is padded following the same rule.
Assuming such a padding scheme I would expect:
The biggest member in struct A has size 1, so no padding is used.
In struct B, the size of 5 is padded to 8, because one member has size 4.
The layout would be:
int 4
char 1
padding 3
In struct C, some padding is inserted before the int, so that it starts at an address divisible by 4.
The layout would be:
char 1
padding 3
int 4
It's up to the compiler to decide how best to pad the struct. For some reason, it decided that in struct B that char b2 was more optimally aligned on a 4 byte boundary. Additionally, the specific architecture may have requirements/behaviors that the compiler takes into account when deciding how to pad structs.
If you 'pack' the struct, then you'd see the sizes you expect (although that is not portable and may have performance penalties and other issues depending on the architecture).
structs in general will be aligned to boundaries based on the largest type contained. Consider an array of struct B myarray[5];
struct B must be padded to 8 bytes so that its b1 member is always on a 4-byte boundary in every array element. myarray[1].b1 can't start at the 6th byte into the array, which is what you would have if sizeof(B) == 5.
According to MSDN, the /Zp command defaults to 8, which means 64-bit alignment boundaries are used. I have always assumed that for 32-bit applications, the MSVC compiler will use 32-bit boundaries. For example:
struct Test
{
char foo;
int bar;
};
The compiler will pad it like so:
struct Test
{
char foo;
char padding[3];
int bar;
};
So, since /Zp8 is used by default, does that mean my padding becomes 7+4 bytes using the same example above:
struct Test
{
char foo;
char padding1[7];
int bar;
char padding2[4];
}; // Structure has 16 bytes, ending on an 8-byte boundary
This is a bit ridiculous isn't it? Am I misunderstanding? Why is such a large padding used, it seems like a waste of space. Most types on a 32-bit system aren't even going to use 64-bits, so the majority of variables would have padding (probably over 80%).
That's not how it works. Members are aligned to a multiple of their size. Char to 1 byte, short to 2, int to 4, double to 8. The structure is padded at the end to ensure the members still align correctly when the struct is used in an array.
A packing of 8 means it stops trying to align members that are larger than 8. Which is a practical limit, the memory allocator doesn't return addresses aligned better than 8. And double is brutally expensive if it isn't aligned properly and ends up straddling a cache line. But otherwise a headache if you write SIMD code, it requires 16 byte alignment.
That does not mean every member is aligned on an 8byte boundary. Read a little more carefully:
the smaller member type or n-byte boundaries
The key here is the first part- "smaller member type". That means that members with less alignment might be aligned less, effectively.
struct x {
char c;
int y;
};
std::cout << sizeof(x) << '\n';
std::cout << "offsetof(x, c) = " << offsetof(x, c) << '\n';
std::cout << "offsetof(x, y) = " << offsetof(x, y) << '\n';
This yields 8, 0, 4- meaning that in fact, the int is only padded to a 4byte alignment.
Are the members of a structure packed in C/C++?
By packed I mean that they are compact and among the fields there aren't memory spaces.
That isn't what aligned means, and no, no particular alignment or packing is guaranteed. The elements will be in order, but the compiler can insert padding where it chooses. This actually creates (useful) alignment. E.g., on x86:
struct s
{
char c;
int i;
};
there will probably (but not necessarily) be three bytes between c and i. This allows i to be aligned on a word boundary, which can provide much faster memory access (on some architectures, it's required).
From C99 §6.7.2.1:
Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.
What you are asking about is packing, which is different from alignment. Both are outside the scope of the language and are specific to each implementation.
Generally not.
Depending on the compiler, you can introduce pragmas to help (from the link above):
#pragma pack(push) /* push current alignment to stack */
#pragma pack(1) /* set alignment to 1 byte boundary */
struct MyPackedData
{
char Data1;
long Data2;
char Data3;
};
#pragma pack(pop) /* restore original alignment from stack */
Typically (but under no guarantees), members of a struct are word-aligned. This means that a field less than the size of a word will be padded to take up an entire word.
However, when the next member of the struct can also fit inside the same word, then the compiler will put both members into the same word. This is more efficient space-wise, but depending on your platform, retrieving said members might be more expensive computationally.
On my 32-bit system using GCC under Cygwin, this program...
#include <iostream>
struct foo
{
char a;
int b;
char c;
};
int main(int argc, char** argv)
{
std::cout << sizeof(foo) << std::endl;
}
outputs '12': a is followed by 3 bytes of padding so that b is 4-byte aligned, and c is followed by 3 bytes of tail padding so that the struct size is a multiple of 4.
However, switch the struct to
struct foo
{
char a;
char c;
int b;
};
and the output is '8' because both chars next to each other can fit in a single word.
It is possible to pack members in order to conserve memory. For instance, pack(2) caps member alignment at two bytes, so members longer than a byte are placed on two-byte boundaries and at most one byte of padding is inserted before any member. Sometimes packing is used as part of a standard communication protocol where a certain size is expected. Here is what Wikipedia has to say about C/C++ and padding:
Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the amount of padding required to maintain alignment. For example, if members are sorted by ascending or descending alignment requirements a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure. Computing the maximum amount of padding required is more complicated, but is always less than the sum of the alignment requirements for all members minus twice the sum of the alignment requirements for the least aligned half of the structure members.
Although C and C++ do not allow the compiler to reorder structure members to save space, other languages might.
Since the compiler lays out struct members on aligned boundaries, care must be taken if you rely on a struct being a certain size, for instance when mixing char and int members.
They are not packed by default. Instead, they are word-aligned depending on how your machine is set up. If you do want them to be packed, you can use __attribute__((__packed__)) at the end of your struct declaration like this:
struct abc {
char a;
int b;
char c;
}__attribute__((__packed__));
Then, for
struct abc _abc;
_abc will be packed.
Reference: Specific structure packing when using the GNU C Compiler
Seeing the sizes of some variations of the same structure may give a clue about what is going on. If I have understood it correctly, small types are padded out to a word boundary when they are followed by a larger type.
struct Foo {
char x ; // 1 byte
int y ; // 4 byte
char z ; // 1 byte
int w ; // 4 byte
};
struct FooOrdered {
char x ; // 1 byte
char z ; // 1 byte
int y ; // 4 byte
int w ; // 4 byte
};
struct Bar {
char x ; // 1 byte
int w ; // 4 byte
};
struct BarSingleType {
char x ; // 1 byte
};
int main(int argc, char const *argv[]) {
cout << sizeof(Foo) << endl;
cout << sizeof(FooOrdered) << endl;
cout << sizeof(Bar) << endl;
cout << sizeof(BarSingleType) << endl;
return 0;
}
In my environment output was like this:
16
12
8
1