MSVC 2008 16 Bytes structure members alignment weirdness - c++

Can anybody please explain what's going on?
My MSVC 2008 project's structure member alignment setting is set to 16 bytes (/Zp16) alignment, however one of the following structures is being aligned by 16 bytes and another is aligned only by 8 bytes... WHY?!!!
struct HashData
{
void *pData;
const char* pName;
int crc;
bool bModified;
}; // sizeof (HashData) == 4 + 4 + 4 + 1 + padding = 16 bytes, ok
class StringHash
{
HashData data[1024];
int mask;
int size;
}; // sizeof(StringHash) == 1024 * 16 + 4 + 4 + 0 = 16392 bytes, why not 16400 bytes?
This may not look like a big deal, but it's a big problem for me, since I am forced to emulate the MSVC structures alignment in GCC and specifying the aligned(16) attribute makes the sizeof (StringHash) == 16400!
Please tell me, when and why MSVC overrides the /Zp16 setting, I absolutely can't fathom it...

I think you misunderstood the /Zp16 option.
MSDN says,
When you specify this option, each structure member after the first is
stored on either the size of the member type or n-byte boundaries
(where n is 1, 2, 4, 8, or 16), whichever is smaller.
Please read the "whichever is smaller". It doesn't say that the struct will be padded by 16. It rather defines the boundary of each member relative to each other, starting from the first member.
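The "whichever is smaller" rule can be modelled in a few lines. This is only a sketch of the documented rule, not MSVC's actual implementation; `Member`, `round_up` and `packed_size` are hypothetical names, and the member sizes assume the 32-bit target from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>

// Hypothetical description of one struct member: its size and natural alignment.
struct Member {
    std::size_t size;
    std::size_t align;
};

// Round `value` up to the next multiple of `align`.
constexpr std::size_t round_up(std::size_t value, std::size_t align) {
    return (value + align - 1) / align * align;
}

// Model of the /ZpN rule: each member is aligned to min(natural alignment, N),
// and the struct is padded to the largest effective member alignment.
constexpr std::size_t packed_size(std::initializer_list<Member> members,
                                  std::size_t pack) {
    std::size_t offset = 0;
    std::size_t struct_align = 1;
    for (const Member& m : members) {
        const std::size_t effective = std::min(m.align, pack);
        offset = round_up(offset, effective) + m.size;
        struct_align = std::max(struct_align, effective);
    }
    return round_up(offset, struct_align);
}

// HashData on a 32-bit target: two pointers, an int and a bool.
static_assert(packed_size({{4, 4}, {4, 4}, {4, 4}, {1, 1}}, 16) == 16, "HashData");
// StringHash: a 16384-byte array with 4-byte alignment, then two ints.
static_assert(packed_size({{16384, 4}, {4, 4}, {4, 4}}, 16) == 16392, "StringHash");
```

No member here has a natural alignment above 4, so the limit of 16 never comes into play, which is why the result is 16392 rather than 16400.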
What you basically want is align (C++) attribute, which says
Use __declspec(align(#)) to precisely control the alignment of user-defined data
So try this:
__declspec(align(16)) struct StringHash
{
HashData data[1024];
int mask;
int size;
};
std::cout << sizeof(StringHash) << std::endl;
It should print what you expect.
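Note that #pragma pack(16) is not an alternative here: like /Zp16, it only caps member alignment and will not pad the struct to a multiple of 16.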
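For comparison, standard C++11 `alignas` plays the same role as `__declspec(align(16))`. The sketch below assumes a mainstream compiler and models the two pointers as 32-bit integers so that the numbers match the 32-bit build in the question; `HashData32`, `StringHashPlain` and `StringHashAligned` are made-up names.

```cpp
#include <cstdint>

// Stand-in for the question's HashData on a 32-bit target:
// the two pointers are modelled as 32-bit integers (an assumption).
struct HashData32 {
    std::uint32_t pData;
    std::uint32_t pName;
    std::int32_t crc;
    bool bModified;
};

// Natural layout: alignment 4, so the size is only rounded up to a multiple of 4.
struct StringHashPlain {
    HashData32 data[1024];
    std::int32_t mask;
    std::int32_t size;
};

// Explicit 16-byte alignment: the size is rounded up to a multiple of 16.
struct alignas(16) StringHashAligned {
    HashData32 data[1024];
    std::int32_t mask;
    std::int32_t size;
};

static_assert(sizeof(HashData32) == 16, "4 + 4 + 4 + 1 + 3 bytes of padding");
static_assert(sizeof(StringHashPlain) == 16392, "same as the MSVC /Zp16 result");
static_assert(sizeof(StringHashAligned) == 16400, "same as GCC with aligned(16)");
```

Read the other way round, this suggests that reproducing MSVC's 16392 under GCC means leaving the aligned(16) attribute off rather than adding it.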

Consider using the pack pragma directive:
// Set packing to 16 byte alignment
#pragma pack(16)
struct HashData
{
void *pData;
const char* pName;
int crc;
bool bModified;
};
class StringHash
{
HashData data[1024];
int mask;
int size;
};
// Restore default packing
#pragma pack()
See: pack and Working with Packing Structures

Related

Packing unions/structure to avoid padding

I have a structure that looks like this:
struct vdata {
static_assert(sizeof(uint8_t *) == 8L, "size of pointer must be 8");
union union_data {
uint8_t * A; // 8 bytes
uint8_t B[12]; // 12 bytes
} u;
int16_t C; // 2 bytes
int16_t D; // 2 bytes
};
I would like to make this 16 bytes, but GCC is telling me it is 24, as the union is padding to 16 bytes.
I would like to put vdata into a large std::vector. From my understanding, there should be no issue with alignment if this were 16 bytes, since the pointer would always be 8 byte aligned.
I understand that I can force this to be packed using __attribute__((__packed__)) in GCC. But I would like to know if there is a portable and standard compliant way to get this to be 16 bytes?
Edit: Ideas
Idea 1: split up the B array.
struct vdata {
union union_data {
uint8_t * A; // 8 bytes
uint8_t B[8]; // 8 bytes
} u;
uint8_t B2[4]; // 4 bytes
int16_t C; // 2 bytes
int16_t D; // 2 bytes
};
Could B2 elements be reliably accessed from a pointer of B? Is that defined behavior?
Idea 2: store pointer as byte array and memcpy as necessary (@Eljay)
struct vdata {
union union_data {
std::byte A[sizeof(uint8_t*)]; // 8 bytes
uint8_t B[12]; // 12 bytes
} u;
int16_t C; // 2 bytes
int16_t D; // 2 bytes
};
Would there be a performance penalty for accessing the pointer, or would it be optimized out? (Assuming GCC x86).
You could change A to std::byte A[sizeof(uint8_t*)]; and then std::memcpy the pointer into A and out of A.
Worth commenting as to what is going on, and that these extra hoops are to avoid padding bytes.
Also adding a set_A setter and get_A getter may be very helpful.
struct vdata {
union union_data {
std::byte A[sizeof(uint8_t*)]; // 8 bytes
uint8_t B[12]; // 12 bytes
} u;
int16_t C; // 2 bytes
int16_t D; // 2 bytes
void set_A(uint8_t* p) {
std::memcpy(u.A, &p, sizeof p);
}
uint8_t* get_A() {
uint8_t* result;
std::memcpy(&result, u.A, sizeof result);
return result;
}
};
Store C+D in the union's array, and provide method access to them:
struct vdata {
static_assert(sizeof(uint8_t *) == 8L, "size of pointer must be 8");
union union_data {
uint8_t * A; // 8 bytes
uint8_t B[16]; // 12 + 2*2 bytes
} u;
int16_t& C() {
return *reinterpret_cast<int16_t*>(static_cast<void*>(&u.B[12]));
}
int16_t& D() {
return *reinterpret_cast<int16_t*>(static_cast<void*>(&u.B[14]));
}
};
Demo (with zero warnings for strict aliasing violations and run-time address sanitization enabled)
Keep in mind that there's no strict aliasing violation when the buffer is char* i.e. single byte type like uint8_t - I mean thankfully because otherwise it would be impossible to create memory pools. If it makes things clearer/safer you can even have an explicit char array buffer:
struct vdata {
union union_data {
uint8_t * A; // 8 bytes
uint8_t B[12]; // 12 bytes
char buf[16]; // 16 bytes - could be std::byte buf[16]
} u;
int16_t& C() { return *(int16_t*)(&u.buf[12]); }
int16_t& D() { return *(int16_t*)(&u.buf[14]); }
};
Regarding alignment: the array is 8-aligned because it sits at the address of the union, so positions 12 and 14 are guaranteed to be 2-aligned, which is the requirement for int16_t (even though the code indexes through u.B).
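If you would rather sidestep the aliasing question entirely, the same 16-byte layout can be kept while C and D are read and written with std::memcpy instead of a cast. This is only a sketch under the same 64-bit assumption as above; `vdata_bytes`, `load`, `store` and the getters and setters are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(std::uint8_t*) == 8, "sketch assumes 64-bit pointers");

struct vdata_bytes {
    union union_data {
        std::uint8_t* A;     // 8 bytes
        std::uint8_t B[16];  // 12 payload bytes + 2 * 2 bytes for C and D
    } u;

    // Copy an int16_t out of the byte array at the given offset.
    std::int16_t load(std::size_t offset) const {
        std::int16_t value;
        std::memcpy(&value, u.B + offset, sizeof value);
        return value;
    }

    // Copy an int16_t into the byte array at the given offset.
    void store(std::size_t offset, std::int16_t value) {
        std::memcpy(u.B + offset, &value, sizeof value);
    }

    std::int16_t get_C() const { return load(12); }
    std::int16_t get_D() const { return load(14); }
    void set_C(std::int16_t v) { store(12, v); }
    void set_D(std::int16_t v) { store(14, v); }
};

static_assert(sizeof(vdata_bytes) == 16, "no padding beyond the union");
```

Compilers typically turn such fixed-size memcpy calls into plain loads and stores, so the accessors should not cost anything in optimized builds.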
Alternatively, you can force the alignment of the structure. The C++ alignas specifier would not be valid here because you want to lower the alignment of your structure, but a pragma directive can get you back to 16 bytes:
#pragma pack(4)
struct vdata {
static_assert(sizeof(uint8_t *) == 8L, "size of pointer must be 8");
union union_data {
uint8_t * A; // 8 bytes
uint8_t B[12]; // 12 bytes
} u;
int16_t C; // 2 bytes
int16_t D; // 2 bytes
};
#pragma pack() // restore default packing
Demo
I'm fairly certain that this one will cause problems.
As far as I understand, the following code would be the safest one.
The data that specifies the type is in the common initial sequence. Thus you can access it either way (by using cda.C or cdb.C), so it is perfect for determining the type.
Then putting everything in a struct for both cases ensures that each struct layout is independent (thus B can start before the next 8-byte alignment boundary).
#include <cstddef>
#include <cstdint>
#include <iostream>
struct CDA
{
int16_t C; // 2 bytes
int16_t D; // 2 bytes
uint8_t* A; // 8 bytes
};
struct CDB
{
int16_t C; // 2 bytes
int16_t D; // 2 bytes
uint8_t B[12]; // 12 bytes
};
struct vdata {
union union_data {
CDA cda;
CDB cdb;
} u;
};
static_assert(sizeof(uint8_t*) == 8);
static_assert(sizeof(CDA) == 16);
static_assert(sizeof(CDB) == 16);
static_assert(offsetof(vdata::union_data, cda) == offsetof(vdata::union_data, cdb));
static_assert(offsetof(CDA, C) == offsetof(CDB, C));
static_assert(offsetof(CDA, C) == 0);
static_assert(sizeof(vdata) == 16);
int main()
{
std::cout << "sizeof(CDA) : " << sizeof(CDA) << std::endl;
std::cout << "sizeof(CDB) : " << sizeof(CDB) << std::endl;
std::cout << "sizeof(vdata) : " << sizeof(vdata) << std::endl;
}
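As a usage sketch for the layout above: because C sits in the common initial sequence of both structs, it can act as the discriminator whichever member is active. The convention that C == 0 means "pointer form" and the helper names `holds_pointer` and `payload` are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <cstdint>

struct CDA {
    std::int16_t C;   // 2 bytes, shared with CDB
    std::int16_t D;   // 2 bytes, shared with CDB
    std::uint8_t* A;  // 8 bytes (assuming 64-bit pointers)
};

struct CDB {
    std::int16_t C;      // 2 bytes, shared with CDA
    std::int16_t D;      // 2 bytes, shared with CDA
    std::uint8_t B[12];  // 12 bytes
};

struct vdata {
    union union_data {
        CDA cda;
        CDB cdb;
    } u;
};

// Hypothetical convention: C == 0 marks the pointer form, anything else the inline bytes.
inline bool holds_pointer(const vdata& v) {
    return v.u.cda.C == 0;  // reading C through cda is fine whichever struct is active
}

// Return the start of the payload for whichever form is stored.
inline const std::uint8_t* payload(const vdata& v) {
    return holds_pointer(v) ? v.u.cda.A : v.u.cdb.B;
}

static_assert(sizeof(vdata) == 16, "both alternatives fit in 16 bytes");
```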
Useful sources of information:
CppCon 2017: Scott Schurr “Type Punning in C++17: Avoiding Pun-defined Behavior”
Union declaration
std::launder
std::variant
How to decide?
If the size optimization is not that important, I would recommend using std::variant.
If the size is important but the order is not, then the current solution might be the best choice.
If portability is not so important, then the pragma pack solution might be appropriate (remember to reset the alignment after the struct definition).
Otherwise, if you really need layout control, then use either:
std::byte array and memcpy (access data with functions)
placement new and std::launder.
In all cases, be sure to have appropriate assertions that verify the assumptions you make. I have put many in my sample code, but you can adjust them depending on your needs.
Also, unless you have millions of vdata items or you are on an embedded device, then using 24 bytes instead of 16 might not be a big deal.
You might also use conditional defines to optimize only for your current compiler. This could be useful to ensure that you have working code (though maybe less optimal) for every target, or it can allow you to depend on behavior that is undefined by the standard but defined by your compiler.

why size of structure does not change when use 24 bit integer

I am trying to port embedded code to the Windows platform.
I have come across the problem below; I am posting sample code here.
Even after I use INT24, the size remains 12 bytes on Windows. Why?
struct INT24
{
INT32 data : 24;
};
struct myStruct
{
INT32 a;
INT32 b;
INT24 c;
};
int _tmain(int argc, _TCHAR* argv[])
{
unsigned char myArr[11] = { 0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00, 0xFF,0xFF,0xFF };
myStruct *p = (myStruct*)myArr;
cout << sizeof(*p);
}
There are two reasons, each of which would be enough by themselves.
Presumably the size of INT32 is 4 bytes. The size of INT24 is also 4 bytes, because it contains an INT32 bit field. Since myStruct contains 3 members of size 4, its size must therefore be at least 12.
Presumably the alignment requirement of INT32 is 4. So, even if the size of INT24 were 3, the size of myStruct would still have to be 12, because it must have at least the alignment requirement of INT32 and therefore the size of myStruct must be padded to the nearest multiple of 4.
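Both reasons can be checked at compile time. The sketch below uses std::int32_t as a stand-in for the Windows INT32 typedef and assumes a mainstream ABI; `Int24Field` and `Record` are hypothetical names for the question's INT24 and myStruct.

```cpp
#include <cstdint>

// A 24-bit bit-field still lives inside a full 32-bit storage unit.
struct Int24Field {
    std::int32_t data : 24;
};

struct Record {
    std::int32_t a;
    std::int32_t b;
    Int24Field c;
};

// Reason 1: the bit-field struct has the size of its underlying type.
static_assert(sizeof(Int24Field) == 4, "bit-field occupies a whole int32_t");
// Reason 2: it also inherits that type's alignment requirement.
static_assert(alignof(Int24Field) == 4, "aligned like int32_t");
// Three 4-byte members therefore give 12 bytes.
static_assert(sizeof(Record) == 12, "4 + 4 + 4");
```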
Is there any way or workaround?
This is implementation specific, but the following hack may work for some compilers/cpu combinations. See the manual of your compiler for the syntax for similar feature, and the manual for your target cpu whether it supports unaligned memory access. Also do realize that unaligned memory access does have a performance penalty.
#pragma pack(push, 1)
struct INT24
{
INT32 data : 24;
};
#pragma pack(pop)
#pragma pack(push, 1)
struct myStruct
{
INT32 a;
INT32 b;
INT24 c;
};
#pragma pack(pop)
Packing a bit field might not work the same in all compilers. Be sure to check how yours behaves.
I think that a standard compliant way would be to store char arrays of sizes 3 and 4, and whenever you need to read or write one of the integers, you'd have to std::memcpy the value. That would be a bit burdensome to implement and possibly also slower than the #pragma pack hack.
Sadly for you, the compiler in optimising the code for a particular architecture reserves the right to pad out the structure by inserting spaces between members and even at the end of the structure.
Using a bit field does not reduce the size of the struct; you still get the whole of the "fielded" type in the struct.
The standard guarantees that the address of the first member of a struct is the same as the address of the struct, unless it's a polymorphic type.
But all is not lost: you can rely on the fact that an array of char will always be contiguous and contain no packing.
If CHAR_BIT is defined as 8 on your system (it probably is), you can model an array of 24 bit types on an array of char. If it's not 8 then even this approach will not work: I'd then suggest resorting to inline assembly.
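A minimal sketch of the char-array approach suggested above, assuming CHAR_BIT == 8. It uses shifts in place of std::memcpy so that the stored byte order is little-endian regardless of the host; `Int24`, `store_int24`, `load_int24` and `Record` are hypothetical names.

```cpp
#include <cstdint>

// Three raw bytes holding a signed 24-bit value, least significant byte first.
struct Int24 {
    unsigned char bytes[3];
};

// Keep the low 24 bits of `value` (two's complement), one byte at a time.
constexpr Int24 store_int24(std::int32_t value) {
    return Int24{{
        static_cast<unsigned char>(static_cast<std::uint32_t>(value) & 0xFFu),
        static_cast<unsigned char>((static_cast<std::uint32_t>(value) >> 8) & 0xFFu),
        static_cast<unsigned char>((static_cast<std::uint32_t>(value) >> 16) & 0xFFu)}};
}

// Reassemble the bytes and sign-extend from bit 23.
constexpr std::int32_t load_int24(const Int24& v) {
    return (static_cast<std::int32_t>(v.bytes[0]) |
            (static_cast<std::int32_t>(v.bytes[1]) << 8) |
            (static_cast<std::int32_t>(v.bytes[2]) << 16)) -
           ((v.bytes[2] & 0x80) ? 0x1000000 : 0);
}

// To reach 11 bytes, the 32-bit fields have to be byte arrays as well;
// a real int32_t member would bring back 4-byte alignment and padding.
struct Record {
    unsigned char a[4];
    unsigned char b[4];
    Int24 c;
};

static_assert(sizeof(Int24) == 3, "three bytes, no padding");
static_assert(sizeof(Record) == 11, "expected on mainstream compilers");
```

The standard does not strictly forbid a compiler from padding such structs, so the size assertions are worth keeping in real code.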

Struct size stays the same even after adding a new member to it

When I simply execute
cout << sizeof(string);
I get 8 as the answer.
Now I have a structure
typedef struct {
int a;
string str;
} myType;
and I execute
cout << sizeof(myType);
I get 16 as the answer.
Now I make a change to my structure
typedef struct {
int a, b;
string str;
} myType;
and I execute
cout << sizeof(myType);
I get 16 as the answer again! How? What is happening?
Perhaps padding is happening. E.g. sizeof(int) can be 4 bytes and the compiler can add 4 bytes after a for the sake of data alignment. The layout could be like this:
typedef struct {
int a; // 4 bytes
// 4 bytes for padding
string str; // 8 bytes
} myType;
typedef struct {
int a; // 4 bytes
int b; // 4 bytes
string str; // 8 bytes
} myType;
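The two layouts can be verified with a stand-in type. Since sizeof(std::string) differs between standard libraries, the sketch uses a hypothetical `Str8` that is 8 bytes large and 8-byte aligned, as reported in the question; `OneInt` and `TwoInts` are likewise made-up names.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the 8-byte, 8-byte-aligned string object from the question.
struct Str8 {
    alignas(8) unsigned char storage[8];
};

struct OneInt {
    std::int32_t a;  // 4 bytes, followed by 4 bytes of padding
    Str8 str;        // must start on an 8-byte boundary
};

struct TwoInts {
    std::int32_t a;  // 4 bytes
    std::int32_t b;  // 4 bytes, fills what was padding before
    Str8 str;        // still starts at offset 8
};

static_assert(offsetof(OneInt, str) == 8, "padding after a");
static_assert(offsetof(TwoInts, str) == 8, "b took the padding slot");
static_assert(sizeof(OneInt) == sizeof(TwoInts), "both are 16 bytes");
```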
Looks like 8 byte alignment.
So a member smaller than 8 bytes that is followed by an 8-byte-aligned member is padded out to the next 8-byte boundary.
I assume the pointer is 8 byte, whereas the ints are only 4 bytes each.
You can force 1 byte alignment using code like outlined here Struct one-byte alignment conflicted with alignment requirement of the architecture? . You should then get different size for first case.
It's called structure packing to achieve optimal memory alignment.
See The Lost Art of C Structure Packing to understand the how and why. It's done the same way in both C and C++.
In C/C++ structs are "packed" in byte chunks. You can specify the size to which your structs should be packed.
Here a reference: http://msdn.microsoft.com/en-us/library/2e70t5y1.aspx

Why does an uint64_t needs more memory than 2 uint32_t's when used in a class? And how to prevent this?

I have made the following code as an example.
#include <cstdint>
#include <iostream>
struct class1
{
uint8_t a;
uint8_t b;
uint16_t c;
uint32_t d;
uint32_t e;
uint32_t f;
uint32_t g;
};
struct class2
{
uint8_t a;
uint8_t b;
uint16_t c;
uint32_t d;
uint32_t e;
uint64_t f;
};
int main(){
std::cout << sizeof(class1) << std::endl;
std::cout << sizeof(class2) << std::endl;
std::cout << sizeof(uint64_t) << std::endl;
std::cout << sizeof(uint32_t) << std::endl;
}
prints
20
24
8
4
So it's fairly simple to see that one uint64_t is as large as two uint32_t's. Why would class2 have 4 extra bytes, if they are the same except for the substitution of one uint64_t for two uint32_t's?
As it was pointed out, this is due to padding.
To prevent this, you may use
#pragma pack(push, 1)
class ... {
};
#pragma pack(pop)
It tells your compiler to align not to 8 bytes, but to one byte. The pop command restores the previous packing (this is very important: if you change the packing in a header and somebody includes your header, very weird errors may occur).
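Applied to the struct from the question, a sketch could look like this, assuming a compiler that supports the push/pop form of the pragma (MSVC, GCC and Clang do). `class2_packed` and `after_pop` are hypothetical names; the second struct is only there to show that normal padding is back after the pop.

```cpp
#include <cstdint>

#pragma pack(push, 1)  // save the current packing, then pack to 1 byte
struct class2_packed {
    std::uint8_t a;
    std::uint8_t b;
    std::uint16_t c;
    std::uint32_t d;
    std::uint32_t e;
    std::uint64_t f;  // now at offset 12, no padding in front of it
};
#pragma pack(pop)  // restore whatever packing was active before

// Declared after the pop, so it gets the usual padding again.
struct after_pop {
    std::uint8_t a;
    std::uint64_t f;
};

static_assert(sizeof(class2_packed) == 20, "1 + 1 + 2 + 4 + 4 + 8");
static_assert(sizeof(after_pop) == 16, "7 bytes of padding after a");
```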
Why does an uint64_t needs more memory than 2 uint32_t's when used in a class?
The reason is padding due to alignment requirements.
On most 64-bit architectures uint8_t has an alignment requirement of 1, uint16_t has an alignment requirement of 2, uint32_t has an alignment requirement of 4 and uint64_t has an alignment requirement of 8. The compiler must ensure that all members in a structure are correctly aligned and that the size of a structure is a multiple of its overall alignment requirement. Furthermore the compiler is not allowed to re-order members.
So your structs end up laid out as follows
struct class1
{
uint8_t a; //offset 0
uint8_t b; //offset 1
uint16_t c; //offset 2
uint32_t d; //offset 4
uint32_t e; //offset 8
uint32_t f; //offset 12
uint32_t g; //offset 16
}; //overall alignment requirement 4, overall size 20.
struct class2
{
uint8_t a; //offset 0
uint8_t b; //offset 1
uint16_t c; //offset 2
uint32_t d; //offset 4
uint32_t e; //offset 8
// 4 bytes of padding because f has an alignment requirement of 8
uint64_t f; //offset 16
}; //overall alignment requirement 8, overall size 24
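The offsets in those comments can be confirmed with offsetof and static_assert. The values below are what a typical 64-bit ABI produces; other targets may differ.

```cpp
#include <cstddef>
#include <cstdint>

struct class1 {
    std::uint8_t a;
    std::uint8_t b;
    std::uint16_t c;
    std::uint32_t d;
    std::uint32_t e;
    std::uint32_t f;
    std::uint32_t g;
};

struct class2 {
    std::uint8_t a;
    std::uint8_t b;
    std::uint16_t c;
    std::uint32_t d;
    std::uint32_t e;
    std::uint64_t f;
};

// class1: every member already falls on a multiple of its alignment.
static_assert(offsetof(class1, f) == 12, "no padding needed before f");
static_assert(alignof(class1) == 4 && sizeof(class1) == 20, "class1");

// class2: f needs 8-byte alignment, so it moves from offset 12 to 16.
static_assert(offsetof(class2, f) == 16, "4 bytes of padding before f");
static_assert(alignof(class2) == 8 && sizeof(class2) == 24, "class2");
```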
And how to prevent this?
Unfortunately there is no good general solution.
Sometimes it is possible to reduce the amount of padding by re-ordering fields, but that doesn't help in your case. It just moves the padding around in the structure. A structure with a field requiring 8 byte alignment will always have a size that is a multiple of 8. Therefore no matter how much you rearrange the fields your structure will always have a size of at least 24.
You can use compiler-specific features such as #pragma pack or __attribute__((packed)) to force the compiler to pack the structure more tightly than normal alignment requirements would allow. However, as well as limiting portability, this creates a problem when taking the address of a member or binding a reference to the member. The resulting pointer or reference may not satisfy the alignment requirements and therefore may not be safe to use.
Different compilers vary in how they handle this problem. From some playing around on godbolt:
g++ 9 through 11 will refuse to bind a reference to a packed member and give a warning when taking the address.
clang 4 through 11 will give a warning when taking the address, but will silently bind a reference and pass that reference across a compilation unit boundary.
Clang 3.9 and earlier will take the address and bind a reference silently.
g++ 8 and earlier and clang 3.9 and earlier (down to the oldest version on godbolt) will also refuse to bind a reference, but will take the address with no warning.
icc will bind a pointer or take the address without producing any warnings in either case (though to be fair intel processors support unaligned access in hardware).
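One way to stay clear of the problem is to never hand out a pointer or reference to a packed member and to go through by-value accessors instead. This is a sketch of that idea rather than a recommendation from any compiler's documentation; `PackedRecord`, `get_value` and `set_value` are hypothetical names.

```cpp
#include <cstdint>

#pragma pack(push, 1)
struct PackedRecord {
    std::uint8_t tag;
    std::uint64_t value;  // at offset 1, so misaligned for a uint64_t
};
#pragma pack(pop)

// Access the member through the struct: the compiler knows it is packed
// and can emit a load that tolerates the misalignment.
inline std::uint64_t get_value(const PackedRecord& r) {
    return r.value;
}

inline void set_value(PackedRecord& r, std::uint64_t v) {
    r.value = v;
}

// What to avoid: `std::uint64_t* p = &r.value;` or `std::uint64_t& ref = r.value;`
// Code using p or ref would assume 8-byte alignment that is not there.

static_assert(sizeof(PackedRecord) == 9, "1 + 8, no padding");
```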
The rule for alignment (on x86 and x86_64) is generally to align a variable on its size.
In other words, 32-bit variables are aligned on 4 bytes, 64-bit variables on 8 bytes, etc.
The offset of f is 12, so in case of uint32_t f no padding is needed, but when f is an uint64_t, 4 bytes of padding are added to get f to align on 8 bytes.
For this reason it is better to order data members from largest to smallest. Then there wouldn't be any need for padding or packing (except possibly at the end of the structure).
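A small illustration of the ordering advice, with sizes as seen on a typical 64-bit ABI. `Scattered` and `Ordered` are made-up structs, not the ones from the question (where, as noted in the other answer, reordering cannot get below 24 bytes).

```cpp
#include <cstdint>

// Small members on both sides of the large one: padding twice.
struct Scattered {
    std::uint8_t a;   // offset 0, then 7 bytes of padding
    std::uint64_t b;  // offset 8
    std::uint8_t c;   // offset 16, then 7 bytes of tail padding
};

// Largest member first: only tail padding remains.
struct Ordered {
    std::uint64_t b;  // offset 0
    std::uint8_t a;   // offset 8
    std::uint8_t c;   // offset 9, then 6 bytes of tail padding
};

static_assert(sizeof(Scattered) == 24, "1 + 7 + 8 + 1 + 7");
static_assert(sizeof(Ordered) == 16, "8 + 1 + 1 + 6");
```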

MSVC default memory alignment of 8

According to MSDN, the /Zp command defaults to 8, which means 64-bit alignment boundaries are used. I have always assumed that for 32-bit applications, the MSVC compiler will use 32-bit boundaries. For example:
struct Test
{
char foo;
int bar;
};
The compiler will pad it like so:
struct Test
{
char foo;
char padding[3];
int bar;
};
So, since /Zp8 is used by default, does that mean my padding becomes 7+4 bytes using the same example above:
struct Test
{
char foo;
char padding1[7];
int bar;
char padding2[4];
}; // Structure has 16 bytes, ending on an 8-byte boundary
This is a bit ridiculous isn't it? Am I misunderstanding? Why is such a large padding used, it seems like a waste of space. Most types on a 32-bit system aren't even going to use 64-bits, so the majority of variables would have padding (probably over 80%).
That's not how it works. Members are aligned to a multiple of their size. Char to 1 byte, short to 2, int to 4, double to 8. The structure is padded at the end to ensure the members still align correctly when the struct is used in an array.
A packing of 8 means the compiler stops increasing the alignment for members that would need more than 8. This is a practical limit, since the memory allocator doesn't return addresses aligned better than 8. And a double is brutally expensive if it isn't aligned properly and ends up straddling a cache line. The limit is a headache, however, if you write SIMD code, which requires 16-byte alignment.
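To see the limit actually take effect, compare a struct holding a double under the default packing and under a packing of 4. The sketch assumes a typical 64-bit target where double is naturally 8-byte aligned; `WithDouble` and `WithDoublePack4` are hypothetical names.

```cpp
#include <cstddef>

// Default packing: the double gets its natural 8-byte alignment.
struct WithDouble {
    char c;
    double d;
};

#pragma pack(push, 4)  // cap member alignment at 4 bytes
struct WithDoublePack4 {
    char c;
    double d;  // would like 8-byte alignment, only gets 4
};
#pragma pack(pop)

static_assert(offsetof(WithDouble, d) == 8, "7 bytes of padding after c");
static_assert(sizeof(WithDouble) == 16, "1 + 7 + 8");
static_assert(offsetof(WithDoublePack4, d) == 4, "3 bytes of padding after c");
static_assert(sizeof(WithDoublePack4) == 12, "1 + 3 + 8");
```

A char or an int is laid out identically in both structs; only members whose natural alignment exceeds the packing value are affected.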
That does not mean every member is aligned on an 8byte boundary. Read a little more carefully:
the smaller member type or n-byte boundaries
The key here is the first part: "smaller member type". That means that members with a smaller natural alignment are aligned to that smaller value.
struct x {
char c;
int y;
};
std::cout << sizeof(x) << '\n';
std::cout << "offsetof(x, c) = " << offsetof(x, c) << '\n';
std::cout << "offsetof(x, y) = " << offsetof(x, y) << '\n';
This yields 8, 0, 4, meaning that the int is in fact only padded to a 4-byte alignment.