Reordering bit-fields mysteriously changes size of struct - c++

For some reason I have a struct that needs to keep track of 56 bits of information ordered as 4 packs of 12 bits and 2 packs of 4 bits. This comes out to 7 bytes of information total.
I tried a bit-field like so:
struct foo {
    uint16_t R : 12;
    uint16_t G : 12;
    uint16_t B : 12;
    uint16_t A : 12;
    uint8_t X : 4;
    uint8_t Y : 4;
};
and was surprised to see sizeof(foo) evaluate to 10 on my machine (a Linux x86_64 box) with g++ version 12.1. I tried reordering the fields like so:
struct foo2 {
    uint8_t X : 4;
    uint16_t R : 12;
    uint16_t G : 12;
    uint16_t B : 12;
    uint16_t A : 12;
    uint8_t Y : 4;
};
and was surprised to see that the size is now 8 bytes, which is what I originally expected. It's the same size as the struct I expected the first ordering to effectively produce:
struct baseline {
    uint16_t first;
    uint16_t second;
    uint16_t third;
    uint8_t single;
};
I am aware of alignment and structure packing, but I am really stumped as to why the first ordering adds 2 extra bytes. There is no reason to add more than one byte of padding, since the 56 bits I requested can be contained exactly in 7 bytes.
Minimal working example: Try it on Wandbox.
What am I missing?
PS: none of this changes if we change uint8_t to uint16_t

If we create an instance of struct foo, zero it out, set all bits in one field, and print the bytes, repeating for each field, we see the following:
R: ff 0f 00 00 00 00 00 00 00 00
G: 00 00 ff 0f 00 00 00 00 00 00
B: 00 00 00 00 ff 0f 00 00 00 00
A: 00 00 00 00 00 00 ff 0f 00 00
X: 00 00 00 00 00 00 00 f0 00 00
Y: 00 00 00 00 00 00 00 00 0f 00
So what appears to be happening is that each 12-bit field starts in a new 16-bit storage unit. The first 4-bit field then fills out the remaining bits of the prior 16-bit unit, and the last field takes up 4 bits of the next unit. This occupies 9 bytes. And since the largest field, in this case a bit-field storage unit, is 2 bytes wide, one byte of padding is added at the end.
So it appears that a 12-bit field, which has a 16-bit base type, is kept within a single 16-bit storage unit instead of being split between multiple storage units.
If we do the same for the modified struct:
X: 0f 00 00 00 00 00 00 00
R: f0 ff 00 00 00 00 00 00
G: 00 00 ff 0f 00 00 00 00
B: 00 00 00 00 ff 0f 00 00
A: 00 00 00 00 00 00 ff 0f
Y: 00 00 00 00 00 00 00 f0
We see that X takes up 4 bits of the first 16-bit storage unit, then R takes up the remaining 12 bits. The rest of the fields fill out as before. This results in 8 bytes being used, so no additional padding is required.
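For reference, here is a minimal sketch of the kind of code that can produce dumps like these (an assumption on my part; any byte-printing approach works, and the foo2 case is analogous):
#include <cstdint>
#include <cstdio>
#include <cstring>

struct foo {
    uint16_t R : 12;
    uint16_t G : 12;
    uint16_t B : 12;
    uint16_t A : 12;
    uint8_t X : 4;
    uint8_t Y : 4;
};

// Zero an instance, saturate one field via the setter, then print the raw bytes.
template <typename Struct, typename Setter>
void dump(const char* name, Setter set) {
    Struct s{};                        // value-initialization zeroes every field
    set(s);
    unsigned char bytes[sizeof s];
    std::memcpy(bytes, &s, sizeof s);  // copy out the object representation
    std::printf("%s:", name);
    for (unsigned char b : bytes) std::printf(" %02x", b);
    std::printf("\n");
}

int main() {
    dump<foo>("R", [](foo& f) { f.R = 0xfff; });
    dump<foo>("G", [](foo& f) { f.G = 0xfff; });
    dump<foo>("B", [](foo& f) { f.B = 0xfff; });
    dump<foo>("A", [](foo& f) { f.A = 0xfff; });
    dump<foo>("X", [](foo& f) { f.X = 0xf; });
    dump<foo>("Y", [](foo& f) { f.Y = 0xf; });
}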
While the exact details of the ordering of bit-fields are implementation-defined, the C standard does set a few rules.
From section 6.7.2.1p11:
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
And 6.7.2.1p15:
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.
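Given these rules, the layouts observed above are internally consistent, but they remain implementation-specific. As a sketch of a compile-time check (assumption: g++ 12 on x86_64 Linux; other compilers and ABIs may legitimately produce different sizes):
// These numbers encode this implementation's choices, not guarantees.
static_assert(sizeof(foo) == 10, "R..A in four 16-bit units, X fills the last, Y spills into a 9th byte, padded to 10");
static_assert(sizeof(foo2) == 8, "X packs before R, so everything fits in four 16-bit units");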

Related

Using CMemoryState to detect memory leaks in C++/MFC app

I have a C++/MFC app that has a memory leak, running under 64-bit Windows 10 using Visual Studio 2019. Using the CMemoryState class, I'm trying to find the exact location of the leak using code like this:
CMemoryState oldState;
oldState.Checkpoint();
// code from my app runs here
CMemoryState newState;
newState.Checkpoint();
// Compare to previous
CMemoryState diff;
BOOL differs = diff.Difference(oldState, newState);
if (differs)
{
    diff.DumpStatistics();
    diff.DumpAllObjectsSince();
}
This produces output like
0 bytes in 0 Free Blocks.
19239 bytes in 472 Normal Blocks.
176 bytes in 2 CRT Blocks.
0 bytes in 0 Ignore Blocks.
272 bytes in 2 Client Blocks.
Largest number used: 43459889 bytes.
Total allocations: 215080520 bytes.
Dumping objects ->
{1314867} normal block at 0x00000181073C31C0, 16 bytes long.
Data: < s > D0 7F DD AC 73 00 00 00 00 00 00 00 00 00 00 00
{1272136} normal block at 0x0000018179C19750, 16 bytes long.
Data: < s > D0 7F DD AC 73 00 00 00 00 00 00 00 00 00 00 00
{1265825} normal block at 0x0000018179C18CB0, 16 bytes long.
Data: < _ s > B0 5F DD AC 73 00 00 00 00 00 00 00 00 00 00 00
with thousands of lines like the above. Are these significant, i.e., an actual memory leak, or are they some artifact of MFC?

C++ Initialize Variable in .data Section

I'm asking myself how this code could be improved. I've encountered the following problem:
int i = 10;
int s = i * 12;
int main(){ }
When you look at the code in the PE format, you notice that neither i nor s is placed in the .data segment, despite the fact that their values could have been precalculated. They get initialized at runtime.
Here is another example of code showing the same phenomenon; the values were declared as in the example above.
View in Debugger before passing EntryPoint:
0133BF7C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0133BF8C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
After passing the EntryPoint:
0133BF7C E9 01 00 00 DF 02 00 00 64 00 00 00 00 00 00 00 é...ß...d.......
0133BF8C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
How can I declare a variable with an initial value in the .data section in C++, to save the precious computation time?
Or is the .data section never initialized with anything before execution starts?
If these are compile-time constants, use constexpr.
constexpr int i = 10;
constexpr int s = i * 12;
Starting in C++17 these are also inline variables so you can declare them in a header file and not worry about having multiple definitions.
If these are not constants, but you want them to be constant initialized, then you can at least use static to make them have internal linkage which makes the optimization more likely to happen. Starting in C++20 you can use constinit to specify that the variable is to have static initialization, but can be changed later on in the program.
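For illustration, a minimal sketch of the constinit variant (assumes a C++20 compiler; variable names mirror the question):
#include <iostream>

constexpr int i = 10;      // compile-time constant
constinit int s = i * 12;  // guaranteed static initialization: 120 is precomputed
                           // at compile time (typically placed in .data)

int main() {
    s += 1;                  // allowed: constinit, unlike constexpr, stays mutable
    std::cout << s << '\n';  // prints 121
}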

The sizes of these structs are different in a file but the same in program memory

Consider the following POD struct:
struct MessageWithArray {
    uint32_t raw;
    uint32_t myArray[10];
    //MessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 } { };
};
Running the following:
#include <cstdint>
#include <type_traits>
#include <iostream>
#include <fstream>
#include <string>

struct MessageWithArray {
    uint32_t raw;
    uint32_t myArray[10];
    //MessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 } { };
};

//https://stackoverflow.com/questions/46108877/exact-definition-of-as-bytes-function
template <class T>
char* as_bytes(T& x) {
    return &reinterpret_cast<char&>(x);
    // or:
    // return reinterpret_cast<char*>(std::addressof(x));
}

int main() {
    MessageWithArray msg = { 0, {0,1,2,3,4,5,6,7,8,9} };
    std::cout << "Size of MessageWithArray struct: " << sizeof(msg) << std::endl;
    std::cout << "Is a POD? " << std::is_pod<MessageWithArray>() << std::endl;
    std::ofstream buffer("message.txt");
    buffer.write(as_bytes(msg), sizeof(msg));
    return 0;
}
Gives the following output:
Size of MessageWithArray struct: 44
Is a POD? 1
A hex dump of the "message.txt" file looks like this:
00 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00
03 00 00 00 04 00 00 00 05 00 00 00 06 00 00 00
07 00 00 00 08 00 00 00 09 00 00 00
Now if I uncomment the constructor (so that MessageWithArray has a zero-argument constructor), MessageWithArray becomes a non-POD struct. Then I use the constructor to initialize instead. This results in the following changes in the code:
....
struct MessageWithArray {
.....
MessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 }{ };
};
....
int main(){
MessageWithArray msg;
....
}
Running this code, I get:
Size of MessageWithArray struct: 44
Is a POD? 0
A hex dump of the "message.txt" file looks like this:
00 00 00 00 0D 0A 00 00 00 14 00 00 00 1E 00 00
00 28 00 00 00 32 00 00 00 3C 00 00 00 46 00 00
00 50 00 00 00 5A 00 00 00 64 00 00 00
Now, I'm not so interested in the actual hex values; what I'm curious about is why there is one more byte in the non-POD struct dump than in the POD struct dump, when sizeof() declares they are the same number of bytes. Is it possible that, because the constructor makes the struct non-POD, something hidden has been added to the struct? sizeof() should be an accurate compile-time check, correct? Is something possibly avoiding being measured by sizeof()?
Specifications: I am running this in an empty project in Visual Studio 2017 version 15.7.5, Microsoft Visual C++ 2017, on a Windows 10 machine.
Intel Core i7-4600M CPU
64-bit Operating System, x64-based processor
EDIT: I decided to initialize the struct to avoid undefined behaviour, and because the question is still valid with the initialization. Initializing it with values that contain no byte equal to 10 preserves the behaviour I observed initially, because the data in the array never happened to contain any 10s (even when it was garbage, and random).
It has nothing to do with POD-ness.
Your ofstream is opened in text mode (rather than binary mode). On Windows this means that \n gets converted to \r\n.
In the second case there happened to be one 0x0A (\n) byte in the struct, which became 0x0D 0x0A (\r\n). That's why you see an extra byte.
Also, using uninitialized variables in the first case leads to undefined behaviour, which in this case didn't manifest itself.
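As a minimal sketch of the fix (assuming the struct and constructor from the question stay unchanged), open the stream in binary mode:
#include <cstdint>
#include <fstream>

struct MessageWithArray {
    uint32_t raw;
    uint32_t myArray[10];
    MessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 } {}
};

int main() {
    MessageWithArray msg;
    // std::ios::binary suppresses the \n -> \r\n translation on Windows,
    // so the file contains exactly sizeof(msg) == 44 bytes.
    std::ofstream buffer("message.txt", std::ios::binary);
    buffer.write(reinterpret_cast<const char*>(&msg), sizeof(msg));
    return 0;
}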
The other answer explains the problem with writing binary data to a stream opened in text mode; however, this code is fundamentally wrong. There is no need to dump anything: the proper way to check the sizes of those structures and verify that they are equal is a static_assert:
struct MessageWithArray {
    uint32_t raw;
    uint32_t myArray[10];
};

struct NonPodMessageWithArray {
    uint32_t raw;
    uint32_t myArray[10];
    NonPodMessageWithArray() : raw(0), myArray{ 10,20,30,40,50,60,70,80,90,100 } {}
};

static_assert(sizeof(MessageWithArray) == sizeof(NonPodMessageWithArray));

(C++) Weird bitmap issue - Colors in grayscale

I have a weird issue with creating a bitmap in C++. I'm using the BITMAPFILEHEADER and BITMAPINFOHEADER structures to create an 8-bit grayscale image. The bitmap data comes from a camera over DMA as unsigned char and has exactly the expected length. But after saving the image and opening it, it contains colors?!
The way it should be: http://www.freeimagehosting.net/qd1ku
The way it is: http://www.freeimagehosting.net/83r1s
Do you have any idea where this is coming from?
The Header of the bitmap is:
42 4D 36 00 04 00 00 00 00 00 36 00 00 00 28 00
00 00 00 02 00 00 00 02 00 00 01 00 08 00 00 00
00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00
Header breakdown:
42 4D          It's a bitmap
36 00 04 00    File size = 0x00040036; minus the 0x36 header bytes = 512x512
00 00 00 00    Reserved
36 00 00 00    Offset = sizeof(BITMAPFILEHEADER) + sizeof(BITMAPINFOHEADER)
28 00 00 00    sizeof(BITMAPINFOHEADER)
00 02 00 00    = 0x200 = 512 px
00 02 00 00    same
01 00          = 1 plane. Standard; not used anymore.
08 00          Color depth = 8 bit
00 00 00 00    Compression: 0 = none
00 00 00 00    Image data size; may be zero for uncompressed bitmaps
00 00 00 00    X dots per meter; may be left 0
00 00 00 00    Y dots per meter; may be left 0
00 00 00 00    Colors used; if zero, all 256 colors are used
00 00 00 00    Important colors; zero means all are important
Under Windows, if you do not supply a palette for your 8-bit image, a system default one is provided for you. I do not recall offhand the Win32 way to add a palette, but it should be as simple as creating a 256-entry table where the value of each entry is the same as its index, writing it out to your file at the appropriate point, and updating the offset parameter, etc.
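A minimal sketch of such a palette (my assumptions: the color table of an 8-bit BMP consists of 256 4-byte RGBQUAD entries written directly after the BITMAPINFOHEADER, and the file header's bfOffBits and bfSize fields must grow by 256 * 4 = 1024 bytes):
#include <cstdint>
#include <fstream>

// Write 256 RGBQUAD entries with blue = green = red = index,
// i.e. a linear grayscale ramp.
void writeGrayscalePalette(std::ofstream& out) {
    for (int i = 0; i < 256; ++i) {
        const uint8_t entry[4] = {
            static_cast<uint8_t>(i),  // blue
            static_cast<uint8_t>(i),  // green
            static_cast<uint8_t>(i),  // red
            0                         // reserved
        };
        out.write(reinterpret_cast<const char*>(entry), sizeof entry);
    }
}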

Accessing specific binary information based on binary format documentation

I have a binary file and documentation of the format the information is stored in. I'm trying to write a simple C++ program that pulls a specific piece of information from the file, but I'm missing something, since the output isn't what I expect.
The documentation is as follows:
Half-word   Field Name      Type    Units     Range         Precision
10          Block Divider   INT*2   N/A       -1            N/A
11-12       Latitude        INT*4   Degrees   -90 to +90    0.001
There are other items in the file obviously but for this case I'm just trying to get the Latitude value.
My code is:
#include <cstdlib>
#include <iostream>
#include <fstream>

using namespace std;

int main(int argc, char* argv[])
{
    const char* dataFileLocation = "testfile.bin";
    ifstream dataFile(dataFileLocation, ios::in | ios::binary);
    if (dataFile.is_open())
    {
        char* buffer = new char[32768];
        dataFile.seekg(10, ios::beg);
        dataFile.read(buffer, 4);
        dataFile.close();
        cout << "value is " << (int)(buffer[0] & 255);
    }
}
The result of which is "value is 226" which is not in the allowed range.
I'm quite new to this, and here's what my intentions were when writing the above code:
Open file in binary mode
Seek to the 11th byte from the start of the file
Read in 4 bytes from that point
Close the file
Output those 4 bytes as an integer.
If someone could point out where I'm going wrong, I'd sure appreciate it. I don't really understand the (buffer[0] & 255) part (I took that from some example code), so layman's terms for that would be greatly appreciated.
Hex Dump of the first 100 bytes:
testfile.bin 98,402 bytes 11/16/2011 9:01:52
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
00000000- 00 5F 3B BF 00 00 C4 17 00 00 00 E2 2E E0 00 00 [._;.............]
00000001- 00 03 FF FF 00 00 94 70 FF FE 81 30 00 00 00 5F [.......p...0..._]
00000002- 00 02 00 00 00 00 00 00 3B BF 00 00 C4 17 3B BF [........;.....;.]
00000003- 00 00 C4 17 00 00 00 00 00 00 00 00 80 02 00 00 [................]
00000004- 00 05 00 0A 00 0F 00 14 00 19 00 1E 00 23 00 28 [.............#.(]
00000005- 00 2D 00 32 00 37 00 3C 00 41 00 46 00 00 00 00 [.-.2.7.<.A.F....]
00000006- 00 00 00 00 [.... ]
Since the documentation lists the field as an integer but shows the precision to be 0.001, I would assume that the actual value is the stored value multiplied by 0.001. The integer range would be -90000 to 90000.
The 4 bytes must be combined into a single integer. There are two ways to do this, big endian and little endian, and which you need depends on the machine that wrote the file. x86 PCs for example are little endian.
int little_endian = buffer[0] | buffer[1]<<8 | buffer[2]<<16 | buffer[3]<<24;
int big_endian = buffer[0]<<24 | buffer[1]<<16 | buffer[2]<<8 | buffer[3];
The &255 is used to remove the sign extension that occurs when you convert a signed char to a signed integer. Use unsigned char instead and you probably won't need it.
Edit: I think "half-word" refers to 2 bytes, so you'll need to skip 20 bytes instead of 10.