I have a struct that is supposed to be 8 bytes in size.
struct Slot {
uint8_t T;
uint8_t S;
uint32_t O : 24;
uint32_t L : 24;
};
However, sizeof(Slot) tells me the size is 12 bytes.
So the compiler pads the data even though it shouldn't be necessary (probably because the 24-bit fields cannot be aligned properly).
A hacky solution would be to use 3 one-byte fields instead of a single three-byte field:
struct Slot2 {
uint8_t T;
uint8_t S;
uint8_t O1;
uint8_t O2;
uint8_t O3;
uint8_t L1;
uint8_t L2;
uint8_t L3;
}; // sizeof(Slot2) = 8
Is there any other way to achieve this?
This gives a size of 8 bytes on MSVC, without a packing pragma.
struct Slot {
uint32_t O : 24;
uint32_t T : 8;
uint32_t L : 24;
uint32_t S : 8;
};
There is no way anyone can tell what your code will do or how the data will end up in memory, because the behavior of bit fields is poorly specified by the C standard.
It is not even specified what happens when you use a uint32_t for a bit field; allowing bit-field types other than int is implementation-defined.
You can't know if there will be padding bits.
You can't know if there will be padding bytes.
You can't know where padding bits or bytes will end up.
You can't know whether the second 24-bit chunk starts immediately after the previous data or is aligned to the next 32-bit unit.
You can't know which bit is the MSB and which is the LSB.
Endianness will cause problems.
The solution is to not use bit fields at all. Use the bitwise operators instead.
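For example, here is a minimal sketch of that approach for the layout above, assuming T and S come first and O and L are stored least significant byte first (the byte order is an assumption; the real layout is presumably dictated by whatever produces the data):

#include <cstdint>

// Pack T, S and the 24-bit O and L into an 8-byte buffer.
void slotPack(uint8_t buf[8], uint8_t T, uint8_t S, uint32_t O, uint32_t L)
{
    buf[0] = T;
    buf[1] = S;
    buf[2] = static_cast<uint8_t>(O);        // O, least significant byte first
    buf[3] = static_cast<uint8_t>(O >> 8);
    buf[4] = static_cast<uint8_t>(O >> 16);
    buf[5] = static_cast<uint8_t>(L);        // L, least significant byte first
    buf[6] = static_cast<uint8_t>(L >> 8);
    buf[7] = static_cast<uint8_t>(L >> 16);
}

// Read the 24-bit O back out of the buffer.
uint32_t slotUnpackO(const uint8_t buf[8])
{
    return static_cast<uint32_t>(buf[2])
         | (static_cast<uint32_t>(buf[3]) << 8)
         | (static_cast<uint32_t>(buf[4]) << 16);
}

This is fully portable: the layout of the buffer no longer depends on the compiler at all.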
Your "hack" solution is exactly the right one. I suspect that the layout is determined by some outside factors, so you won't be able to map this to a struct in any better way. I suspect the order of bytes in your 24 bit numbers is also determined by the outside, and not by your compiler.
To handle that kind of situation, a struct of bytes or just an array of bytes is the easiest and portable solution.
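For example, reading the 24-bit O back out of Slot2 takes a couple of shifts (a sketch assuming the external format stores the least significant byte first; swap the shifts if it doesn't):

uint32_t slot2_O(const Slot2& s)
{
    return static_cast<uint32_t>(s.O1)
         | (static_cast<uint32_t>(s.O2) << 8)
         | (static_cast<uint32_t>(s.O3) << 16);
}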
I think what you want, 8 bytes, is not something that the C standard can guarantee with your first definition.
Related, from the C11 standard, §6.7.2.1:
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined.
You have a way out, however: if you can rearrange the variables so that they fit neatly into 32-bit units, then, since
24 + 8 + 24 + 8 = 64 bits = 8 bytes,
you can have a structure of size 8 bytes.
With this compiler-dependent solution (works with gcc and MSVC) the struct will be 8 bytes:
#pragma pack(push, 1)
struct Slot {
uint8_t T;
uint8_t S;
uint32_t O : 24;
uint32_t L : 24;
};
#pragma pack(pop)
This will set the alignment of the struct to 1 byte.
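Since this relies on compiler-specific behaviour, it is worth guarding with a compile-time check (static_assert in C++11, _Static_assert in C11), e.g.:

static_assert(sizeof(Slot) == 8, "Slot must be exactly 8 bytes");

That way a compiler that lays the struct out differently produces an error instead of silently misreading your data.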
On MSVC the following works and keeps your variable orders the same:
struct Slot {
uint64_t T : 8;
uint64_t S : 8;
uint64_t O : 24;
uint64_t L : 24;
};
This layout is not guaranteed by the standard, though; YMMV on other compilers.
Try something like the following:
struct Slot {
uint32_t O : 24;
uint8_t T;
uint32_t L : 24;
uint8_t S;
};
Related
Why do the sizes of these two structs differ?
#pragma pack(push, 1)
struct WordA
{
uint32_t address : 8;
uint32_t data : 20;
uint32_t sign : 1;
uint32_t stateMatrix : 2;
uint32_t parity : 1;
};
struct WordB
{
uint8_t address;
uint32_t data : 20;
uint8_t sign : 1;
uint8_t stateMatrix : 2;
uint8_t parity : 1;
};
#pragma pack(pop)
Somehow WordB occupies 6 bytes instead of four, while WordA occupies exactly 32 bits.
I assumed that, since the sum of bits used is the same in both structs, they would be the same size. Apparently I am wrong, but I cannot find an explanation why.
The Bit fields page shows only examples where all of the struct members have the same type, which is the case for WordA.
Can anybody explain, why the sizes don't match and if it is according to the standard or implementation-defined?
Why can't a bit field be split between different underlying types?
It can, in the sense that the standard allows it.
It wasn't split here because that's what the language implementer (or rather, the designer of the ABI) chose. This decision may have been preferred because it can make the program faster or the compiler easier to implement.
Here is the standard quote:
[class.bit]
... Allocation of bit-fields within a class object is implementation-defined.
Alignment of bit-fields is implementation-defined.
Bit-fields are packed into some addressable allocation unit.
I played with https://gcc.godbolt.org/ and this code seems to always return 1.
However, I wonder if the standard guarantees this.
#include <cstdint>
struct S{
uint8_t a : 6;
uint8_t b : 2;
};
int main(){
return sizeof(S);
}
My real example is the following:
struct Pair{
uint64_t created; // 8 bytes
uint32_t expires; // 4 bytes
uint16_t keylen; // 2 bytes: 4 bits are used for vallen, 2 bits are reserved for future versions, 10 bits for keylen.
uint16_t vallen; // 2 bytes
};
Currently I do some bitmasks and shifts.
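Something along these lines, given a Pair p in the current layout (a sketch; which end of the 16-bit word each piece occupies is my assumption):

uint16_t raw     = p.keylen;             // the packed 16-bit field
uint16_t keylen  =  raw        & 0x03FF; // low 10 bits
uint8_t  future  = (raw >> 10) & 0x03;   // next 2 bits
uint8_t  vallen2 = (raw >> 12) & 0x0F;   // top 4 bits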
Can I do it like this? Will it always work (any "normal" compiler)?
How does keylen behave on big endian? Using shifts, I always use the first 4 bits.
struct Pair{
uint64_t created;
uint32_t expires;
uint8_t vallen2 : 4;
uint8_t future : 2;
uint16_t keylen : 10;
uint16_t vallen;
};
Most compilers will give you bit fields that are packed together, although the behaviour is implementation-defined.
One way to check would be to add a static_assert comparing the sizes of your struct definitions with and without the bit fields. That way you get a nice compile-time error as soon as an implementation doesn't do what you expect.
#include <cstdint>
struct Pair_size_check{
uint64_t created; // 8 bytes
uint32_t expires; // 4 bytes
uint16_t keylen; // 2 bytes: 4 bits are used for vallen, 2 bits are reserved for future versions, 10 bits for keylen.
uint16_t vallen; // 2 bytes
};
struct Pair{
uint64_t created;
uint32_t expires;
uint8_t vallen2 : 4;
uint8_t future : 2;
uint16_t keylen : 10;
uint16_t vallen;
};
static_assert(sizeof(Pair) == sizeof(Pair_size_check));
I have made the following code as an example.
#include <cstdint>
#include <iostream>
struct class1
{
uint8_t a;
uint8_t b;
uint16_t c;
uint32_t d;
uint32_t e;
uint32_t f;
uint32_t g;
};
struct class2
{
uint8_t a;
uint8_t b;
uint16_t c;
uint32_t d;
uint32_t e;
uint64_t f;
};
int main(){
std::cout << sizeof(class1) << std::endl;
std::cout << sizeof(class2) << std::endl;
std::cout << sizeof(uint64_t) << std::endl;
std::cout << sizeof(uint32_t) << std::endl;
}
prints
20
24
8
4
So it's fairly simple to see that one uint64_t is as large as two uint32_t's. Why, then, does class2 have 4 extra bytes, if the two classes are identical except for the substitution of one uint64_t for two uint32_t's?
As it was pointed out, this is due to padding.
To prevent this, you may use
#pragma pack(push, 1)
class ... {
};
#pragma pack(pop)
This tells your compiler to align not to 8 bytes, but to one byte. The pop switches it back off (this is very important: if you change the packing in a header and somebody includes your header, very weird errors may occur).
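Applied to class2 from the question, a sketch looks like this; with 1-byte packing the size drops to the raw 1 + 1 + 2 + 4 + 4 + 8 = 20 bytes:

#include <cstdint>

#pragma pack(push, 1)
struct class2_packed {
    uint8_t  a;
    uint8_t  b;
    uint16_t c;
    uint32_t d;
    uint32_t e;
    uint64_t f;
};
#pragma pack(pop)

static_assert(sizeof(class2_packed) == 20, "no padding anywhere");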
Why does a uint64_t need more memory than two uint32_t's when used in a class?
The reason is padding due to alignment requirements.
On most 64-bit architectures uint8_t has an alignment requirement of 1, uint16_t of 2, uint32_t of 4 and uint64_t of 8. The compiler must ensure that all members of a structure are correctly aligned and that the size of a structure is a multiple of its overall alignment requirement. Furthermore, the compiler is not allowed to re-order members.
So your structs end up laid out as follows
struct class1
{
uint8_t a; //offset 0
uint8_t b; //offset 1
uint16_t c; //offset 2
uint32_t d; //offset 4
uint32_t e; //offset 8
uint32_t f; //offset 12
uint32_t g; //offset 16
}; //overall alignment requirement 4, overall size 20.
struct class2
{
uint8_t a; //offset 0
uint8_t b; //offset 1
uint16_t c; //offset 2
uint32_t d; //offset 4
uint32_t e; //offset 8
// 4 bytes of padding because f has an alignment requirement of 8
uint64_t f; //offset 16
}; //overall alignment requirement 8, overall size 24
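These offsets can be checked at compile time with offsetof (a quick sketch):

#include <cstddef>

static_assert(offsetof(class2, e) == 8,  "e directly after d");
static_assert(offsetof(class2, f) == 16, "4 bytes of padding before f");
static_assert(sizeof(class2) == 24,      "overall size 24");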
And how to prevent this?
Unfortunately there is no good general solution.
Sometimes it is possible to reduce the amount of padding by re-ordering fields, but that doesn't help in your case; it just moves the padding around within the structure. A structure with a field requiring 8-byte alignment will always have a size that is a multiple of 8, so no matter how you rearrange the fields, your structure will have a size of at least 24.
You can use compiler-specific features such as #pragma pack or __attribute((packed)) to force the compiler to pack the structure more tightly than normal alignment requirements would allow. However, as well as limiting portability, this creates a problem when taking the address of a member or binding a reference to the member. The resulting pointer or reference may not satisfy the alignment requirements and therefore may not be safe to use.
Different compilers vary in how they handle this problem. From some playing around on godbolt (the kind of code being tested is sketched after this list):
g++ 9 through 11 will refuse to bind a reference to a packed member and give a warning when taking the address.
clang 4 through 11 will give a warning when taking the address, but will silently bind a reference and pass that reference across a compilation unit boundary.
Clang 3.9 and earlier will take the address and bind a reference silently.
g++ 8 and earlier (down to the oldest version on godbolt) will also refuse to bind a reference, but will take the address with no warning.
icc will bind a reference or take the address without producing any warnings in either case (though, to be fair, Intel processors support unaligned access in hardware).
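The scenario being tested is essentially this (a sketch using the GCC/Clang packed attribute; as described above, newer g++ rejects the reference line outright):

#include <cstdint>

struct __attribute__((packed)) Packed {
    uint8_t  a;
    uint32_t b;              // offset 1: deliberately misaligned
};

void use(Packed& p)
{
    uint32_t* q = &p.b;      // taking the address of a packed member
    uint32_t& r = p.b;       // binding a reference to a packed member
    (void)q; (void)r;
}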
The rule for alignment (on x86 and x86_64) is generally to align a variable on its size.
In other words, 32-bit variables are aligned on 4 bytes, 64-bit variables on 8 bytes, etc.
The offset of f is 12, so in case of uint32_t f no padding is needed, but when f is an uint64_t, 4 bytes of padding are added to get f to align on 8 bytes.
For this reason it is better to order data members from largest to smallest. Then there wouldn't be any need for padding or packing (except possibly at the end of the structure).
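For example (sizes assume a typical 64-bit ABI):

#include <cstdint>

struct Wasteful {            // 1 + 7 (pad) + 8 + 1 + 7 (pad) = 24 bytes
    uint8_t  a;
    uint64_t b;
    uint8_t  c;
};

struct Compact {             // 8 + 1 + 1 + 6 (pad at the end) = 16 bytes
    uint64_t b;
    uint8_t  a;
    uint8_t  c;
};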
I'm trying to interface with Ada code using C++, so I'm defining a struct using bit fields, so that all the data is in the same place in both languages. The following is not precisely what I'm doing, but outlines the problem. The following is also a console application in VS2008, but that's not super relevant.
using namespace System;
int main() {
int array1[2] = {0, 0};
int *array2 = new int[2]();
array2[0] = 0;
array2[1] = 0;
#pragma pack(1)
struct testStruct {
// Word 0 (desired)
unsigned a : 8;
unsigned b : 1;
bool c : 1;
unsigned d : 21;
bool e : 1;
// Word 1 (desired)
int f : 32;
// Words 2-3 (desired)
int g[2]; //Cannot assign bit field but takes 64 bits in my compiler
};
testStruct test;
Console::WriteLine("size of char: {0:D}", sizeof(char) * 8);
Console::WriteLine("size of short: {0:D}", sizeof(short) * 8);
Console::WriteLine("size of int: {0:D}", sizeof(int) * 8);
Console::WriteLine("size of unsigned: {0:D}", sizeof(unsigned) * 8);
Console::WriteLine("size of long: {0:D}", sizeof(long) * 8);
Console::WriteLine("size of long long: {0:D}", sizeof(long long) * 8);
Console::WriteLine("size of bool: {0:D}", sizeof(bool) * 8);
Console::WriteLine("size of int[2]: {0:D}", sizeof(array1) * 8);
Console::WriteLine("size of int*: {0:D}", sizeof(array2) * 8);
Console::WriteLine("size of testStruct: {0:D}", sizeof(testStruct) * 8);
Console::WriteLine("size of test: {0:D}", sizeof(test) * 8);
Console::ReadKey(true);
delete[] array2;
return 0;
}
(If it wasn't clear, in the real program, the basic idea is that the program gets a void* from something communicating with the Ada code and casts it to a testStruct* to access the data.)
With #pragma pack(1) commented out, the output is:
size of char: 8
size of short: 16
size of int: 32
size of unsigned: 32
size of long: 32
size of long long: 64
size of bool: 8
size of int[2]: 64
size of int*: 32
size of testStruct: 224
size of test: 224
Obviously 4 words (indexed 0-3) should be 32*4 = 128 bits, not 224. The other output lines were there to help confirm the sizes of the types under the VS2008 compiler.
With #pragma pack(1) uncommented, that number (on the last two lines of output) is reduced to 176, which is still greater than 128. It seems that the bools aren't being packed together with the unsigned ints in "Word 0".
Note: a and b (sharing one word), then c, d, e and f each in their own word, would be 5 words, +2 for the array = 7 words, times 32 bits = 224, the number we get with #pragma pack(1) commented out. If c and e (the bools) instead take up 8 bits each, as opposed to 32, we get 176, which is the number we get with #pragma pack(1) uncommented. It seems #pragma pack(1) only allows the bools to be packed into single bytes by themselves, not together with the unsigned ints.
So my question, in one sentence: is there a way to force the compiler to pack a through e into one word? Related is this question: C++ bitfield packing with bools, but it doesn't answer my question; it only points out the behavior I'm trying to force to go away.
If there is literally no way to do this, does anyone have any ideas for workarounds? I'm at a loss, because:
I was asked to avoid changing the struct format that I'm copying (no re-ordering).
I don't want to change the bools to unsigned ints because it may cause problems down the road with constantly having to re-cast it to bool and maybe accidentally using the wrong version of an overloaded function, not to mention making the code more obscure for others who read it later.
I don't want to declare them as private unsigned ints and then write public accessors, because all other members of all other structs in the project are accessed directly without () afterward, so it would seem hacky and obtuse, and one would almost NEED IntelliSense or trial-and-error to remember which needs () and which doesn't.
I would like to avoid creating another struct type just for the data conversion (and e.g. make a constructor for testStruct that takes in a single testStructImport-type object) because the actual struct is very long with lots of bit-field-specified variables.
I recommend that you create a "normal" structure without any bit packing. Use default POD types for the members.
Create interface functions for loading the "normal" fields from a buffer (uint8_t), and storing to a buffer.
This will allow you to use the data members in a sane method in your program. The bit packing and unpacking will be handled by the interface function. The bit twiddling should use bitwise AND and bitwise OR functions and not rely on the bit field notation in a structure. This will allow you to adjust the bit twiddling and be more portable among compilers.
This is how I designed my protocol classes. And I don't have to worry about bit field positioning, Endianess or things of that sort.
Also, I can use block I/O for reading and writing the buffer.
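A minimal sketch of such interface functions, using Word 0 from the question and assuming bit 0 is the least significant bit (the real bit numbering would come from the Ada interface spec):

#include <cstdint>

struct Word0 {               // plain POD members, no bit fields
    uint8_t  a;
    uint8_t  b;
    bool     c;
    uint32_t d;              // only 21 bits are meaningful
    bool     e;
};

// Unpack a 32-bit word: a = bits 0-7, b = bit 8, c = bit 9,
// d = bits 10-30, e = bit 31 (this numbering is an assumption).
Word0 loadWord0(uint32_t w)
{
    Word0 r;
    r.a = static_cast<uint8_t>(w & 0xFF);
    r.b = static_cast<uint8_t>((w >> 8) & 0x1);
    r.c = ((w >> 9) & 0x1) != 0;
    r.d = (w >> 10) & 0x1FFFFF;          // 21 bits
    r.e = ((w >> 31) & 0x1) != 0;
    return r;
}

// Pack it back into a 32-bit word.
uint32_t storeWord0(const Word0& r)
{
    return  static_cast<uint32_t>(r.a)
         | (static_cast<uint32_t>(r.b & 0x1) << 8)
         | (static_cast<uint32_t>(r.c ? 1 : 0) << 9)
         | ((r.d & 0x1FFFFF) << 10)
         | (static_cast<uint32_t>(r.e ? 1 : 0) << 31);
}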
Try packing in this way:
#pragma pack( push, 1 )
struct testStruct {
// Word 0 (desired)
unsigned a : 8;
unsigned b : 1;
unsigned c : 1;
unsigned d : 21;
unsigned e : 1;
// Word 1 (desired)
unsigned f : 32;
// Words 2-3 (desired)
unsigned g[2]; //Cannot assign bit field but takes 64 bits in my compiler
};
#pragma pack(pop)
There is no easy, elegant method without using accessors or an interface layer. Unfortunately, there is no #pragma or similar trick to fix this. I ended up just converting the bools to unsigned int and renaming the variables from e.g. f to f_flag or f_bool to encourage correct usage and make it clear what the variables contain. It's lower-effort than Thomas's solution, but not as robust, obviously, and it still avoids some of the main drawbacks of the easier methods.
Years after I posted this question, user #WaltK added this comment to the linked, related question:
"If you want to have more control over the layout of bit field
structures in memory, consider using this bit field facility,
implemented as a library header file."
Possible Duplicate:
Why isn't sizeof for a struct equal to the sum of sizeof of each member?
I was trying to understand the concept of bit fields.
But I am not able to work out why the size of the following structure in CASE III comes out as 8 bytes.
CASE I:
struct B
{
unsigned char c; // +8 bits
} b;
sizeof(b); // Output: 1 (because unsigned char takes 1 byte on my system)
CASE II:
struct B
{
unsigned b: 1;
} b;
sizeof(b); // Output: 4 (because unsigned takes 4 bytes on my system)
CASE III:
struct B
{
unsigned char c; // +8 bits
unsigned b: 1; // +1 bit
} b;
sizeof(b); // Output: 8
I don't understand why the output for case III comes as 8. I was expecting 1(char) + 4(unsigned) = 5.
You can check the layout of the struct by using offsetof, but it will be something along the lines of:
struct B
{
unsigned char c; // +8 bits
unsigned char pad[3]; //padding
unsigned int bint; //your b:1 will be the first byte of this one
} b;
Now it is obvious that (on a 32-bit architecture) sizeof(b) will be 8, isn't it?
The question is: why 3 bytes of padding, and not more or less?
The answer is that the offset of a field within a struct has the same alignment requirement as the type of the field itself. On your architecture, integers are 4-byte-aligned, so offsetof(b, bint) must be a multiple of 4. It cannot be 0, because c comes first, so it will be 4. If the field bint starts at offset 4 and is 4 bytes long, the size of the struct is 8.
Another way to look at it is that the alignment requirement of a struct is the largest alignment requirement of any of its fields, so this B will be 4-byte-aligned (as is your bit field). But the size of a type must be a multiple of its alignment; 4 is not enough, so it will be 8.
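A quick way to verify this reasoning, using the reconstructed layout from above (renamed here to avoid clashing with the original B):

#include <cstddef>
#include <cstdio>

struct BLayout {
    unsigned char c;
    unsigned char pad[3];
    unsigned int  bint;
};

int main()
{
    std::printf("%zu\n", offsetof(BLayout, bint)); // 4 on a typical 32-bit ABI
    std::printf("%zu\n", sizeof(BLayout));         // 8
}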
I think you're seeing an alignment effect here.
Many architectures require integers to be stored at addresses in memory that are multiple of the word size.
This is why the char in your third struct is being padded with three more bytes, so that the following unsigned integer starts at an address that is a multiple of the word size.
A char is by definition one byte. An int is 4 bytes on a 32-bit system, and the struct is padded out to a multiple of 4.
See http://en.wikipedia.org/wiki/Data_structure_alignment#Typical_alignment_of_C_structs_on_x86 for some explanation of padding
To keep memory accesses aligned, the compiler adds padding. If you pack the structure, it will not add the padding.
I took another look at this and here's what I found.
From the C book, "Almost everything about fields is implementation-dependent."
On my machine:
struct B {
unsigned c: 8;
unsigned b: 1;
}b;
printf("%lu\n", sizeof(b));
prints 4, because both bit fields now fit into a single unsigned storage unit.
You were mixing bit fields with regular struct members.
BTW, a bit field is defined as "a set of adjacent bits within a single implementation-defined storage unit". So I'm not even sure that the :8 does what you want; that would seem not to be in the spirit of bit fields (it's not a single bit any more).
The alignment and total size of the struct are platform- and compiler-specific. You cannot expect straightforward and predictable answers here; the compiler can always have some special idea of its own. For example:
struct B
{
unsigned b0: 1; // +1 bit
unsigned char c; // +8 bits
unsigned b1: 1; // +1 bit
};
The compiler may merge fields b0 and b1 into one integer, or it may not; it is up to the compiler. Some compilers have command-line switches that control this, some do not. Another example:
struct B
{
unsigned short c, d, e;
};
It is up to the compiler whether to pack the fields of this struct (assuming a 32-bit platform). The layout of the struct can even differ between DEBUG and RELEASE builds.
I would recommend using only the following pattern:
struct B
{
unsigned b0: 1;
unsigned b1: 7;
unsigned b2: 2;
};
When you have a sequence of bit fields that share the same type, the compiler will put them into one int. Otherwise various aspects can kick in. Also take into account that in a big project, you write a piece of code and somebody else will write and rewrite the makefile, or move your code from one DLL into another. At that point compiler flags will be set and changed; there is a 99% chance that those people will have no idea of the alignment requirements of your struct. They will never even open your file.