Why is the size of the union greater than expected? - c++

#include <iostream>
typedef union dbits {
double d;
struct {
unsigned int M1: 20;
unsigned int M2: 20;
unsigned int M3: 12;
unsigned int E: 11;
unsigned int s: 1;
};
};
int main(){
std::cout << "sizeof(dbits) = " << sizeof(dbits) << '\n';
}
output: sizeof(dbits) = 16, but if
typedef union dbits {
double d;
struct {
unsigned int M1: 12;
unsigned int M2: 20;
unsigned int M3: 20;
unsigned int E: 11;
unsigned int s: 1;
};
};
Output: sizeof(dbits) = 8
Why does the size of the union increase?
In the first and second union, the same number of bits in the bit fields in the structure, why the different size?
I would like to write like this:
typedef union dbits {
double d;
struct {
unsigned long long M: 52;
unsigned int E: 11;
unsigned int s: 1;
};
};
But, sizeof(dbits) = 16, but not 8, Why?
And how convenient it is to use bit fields in structures to parse bit in double?

members of a bit field will not cross boundaries of the specified storage type. So
unsigned int M1: 20;
unsigned int M2: 20;
will be 2 unsigned int using 20 out of 32 bit each.
In your second case 12 + 20 == 32 fits in a single unsigned int.
As for your last case members with different storage type can never share. So you get one unsigned long long and one unsigned int instead of a single unsigned long long as you desired.
You should use uint64_t so you get exact bit counts. unsigned int could e anything from 16 to 128 (or more) bit.
Note: bitfields are highly implementation defined, this is just the common way it usually works.

Related

Size of structure - padding and alignment

I have an explicitly sized structure as follow:
typedef struct
{
unsigned long A : 4;
unsigned long B : 12;
union
{
unsigned long C1 : 8;
unsigned long C2 : 8;
unsigned long C3 : 8;
};
unsigned long D : 8;
}FooStruct;
The total size of this struct should be 32bit (4 bytes) in theory. However, I get a 12-byte size using sizeof so there should be some padding and alignment happening here.
I just don't see why and where. Can someone explain to me how this structure takes 12 bytes in memory?
The union forces the start of a new unsigned long, and the member after the union yet another unsigned long. Assuming long is 4 bytes that means your struct will have 3 unsigned longs for a total of 12 bytes. Although a union with three equally sized members also seems odd.
If you want this to have a size of 4 bytes why not change it to:
typedef struct
{
unsigned short A : 4;
unsigned short B : 12;
union
{
unsigned char C1 : 8;
unsigned char C2 : 8;
unsigned char C3 : 8;
};
unsigned char D : 8;
}FooStruct;
Additionally if you are using gcc and want to disable structure padding, you can use __attribute__((packed)):
struct FooStruct
{
unsigned long A : 4;
unsigned long B : 12;
union
{
unsigned long C1 : 8;
unsigned long C2 : 8;
unsigned long C3 : 8;
} __attribute__((packed)) C;
unsigned long D : 8;
} __attribute__((packed));
But beware that some architectures may have penalities on unalligned data access or not allow it at all.

Creating a generic payload class

my task is to place specified bits in an array of 8 bytes (not all 64 bits).
This can be done by using a struct:
struct Date {
unsigned int nWeekDay : 3;
unsigned int nMonthDay : 6;
unsigned int reserved1 : 10;
unsigned int nMonth : 5;
unsigned int nYear : 8;
};
I would like to write a generic class that gets value, start bits and length and place the the value a the correct position.
Could someone point me to an implementation of such a class\function?

C++ bits operation. 6 integers into one unsigned long long

I would like to put 6 ints into one unsigned long long variable. Then, I would like to read these integers from long long variable bits range. I wrote something like this but it returns negative output
unsigned long long encode(int caller, int caller_zone,
int callee, int callee_zone,
int duration, int tariff) {
struct CallInfo
{
int caller : 17;
int caller_zone : 7;
int callee : 17;
int callee_zone : 7;
int duration : 13;
int tariff : 3;
};
CallInfo info = { caller, caller_zone, callee, callee_zone, duration, tariff};
cout << info.caller << endl;
cout << info.caller_zone << endl;
}
It's much easier to use bit fields for this, e.g.
struct CallInfo
{
unsigned int caller : 17;
unsigned int caller_zone : 7;
unsigned int callee : 17;
unsigned int callee_zone : 7;
unsigned int duration : 13;
unsigned int tariff : 3;
};
You would not really need an encode function, as you could just write, e.g.
CallInfo info = { /* ... initialise fields here ... */ };
and then access fields in the normal way:
info.caller = 0;
info.caller_zone = info.callee_zone;
// ...

C++ Bits in 64 bit integer

Hello I have a struct here that is 7 bytes and I'd like to write it to a 64 bit integer. Next, I'd like to extract out this struct later from the 64 bit integer.
Any ideas on this?
#include "stdafx.h"
struct myStruct
{
unsigned char a;
unsigned char b;
unsigned char b;
unsigned int someNumber;
};
int _tmain(int argc, _TCHAR* argv[])
{
myStruct * m = new myStruct();
m->a = 11;
m->b = 8;
m->c = 12;
m->someNumber = 30;
printf("\n%s\t\t%i\t%i\t%i\t%i\n\n", "struct", m->a, m->b, m->c, m->someNumber);
unsigned long num = 0;
// todo: use bitwise operations from m into num (total of 7 bytes)
printf("%s\t\t%i\n\n", "ulong", num);
m = new myStruct();
// todo: use bitwise operations from num into m;
printf("%s\t\t%i\t%i\t%i\t%i\n\n", "struct", m->a, m->b, m->c, m->someNumber);
return 0;
}
You should to do something like this:
class structured_uint64
{
uint64_t data;
public:
structured_uint64(uint64_t x = 0):data(x) {}
operator uint64_t&() { return data; }
unsigned uint8_t low_byte(size_t n) const { return data >> (n * 8); }
void low_byte(size_t n, uint8_t val) {
uint64_t mask = static_cast<uint64_t>(0xff) << (8 * n);
data = (data & ~mask) | (static_cast<uint64_t>(val) << (8 * n));
}
unsigned uint32_t hi_word() const { return (data >> 24); }
// et cetera
};
(there is, of course, lots of room for variation on the details of the interface and where among the 64 bits the constituents are placed)
Using different types to alias the same portion of memory is a generally bad idea. The thing is, it's very valuable for the optimizer to be able to use reasoning like:
"Okay, I've read a uint64_t at the start of this block, and nowhere in the middle does the program write to any uint64_ts, therefore the value must be unchanged!"
which means it will get the wrong answer if you tried to change the value of the uint64_t object through a uint32_t reference. And as this is very dependent what optimizations are possible and done, it is actually pretty easy to never run across the problem in test cases, but see it in the real program you're trying to write -- and you'll spend forever trying to find the bug because you convinced yourself it's not this problem.
So, you really should do the insertion/extraction of the fields with bit twiddling (or intrinsics, if profiling shows that this is a performance issue and there are useful ones available) rather than trying to set up a clever struct.
If you really know what you're doing, you can make the aliasing work, I believe. But it should only be done if you really know what you're doing, and that includes knowing relevant rules from the standard inside and out (which I don't, and so I can't advise you on how to make it work). And even then you probably shouldn't do it.
Also, if you intend your integral types to be a specific size, you should really use the correct types. For example, never use unsigned int for an integer that is supposed to be exactly 32 bits. Instead use uint32_t. Not only is it self-documenting, but you won't run into a nasty surprise when you try to build your program in an environment where unsigned int is not 32 bits.
Use a union. Each element of a union occupies the same address space. The struct is one element, the unsigned long long is another.
#include <stdio.h>
union data
{
struct
{
unsigned char a;
unsigned char b;
unsigned char c;
unsigned int d;
} e;
unsigned long long f;
};
int main()
{
data dat;
dat.f = 0xFFFFFFFFFFFFFFFF;
dat.e.a = 1;
dat.e.b = 2;
dat.e.c = 3;
dat.e.d = 4;
printf("f=%016llX\n",dat.f);
printf("%02X %02X %02X %08X\n",dat.e.a,dat.e.b,dat.e.c,dat.e.d);
return 0;
}
Output, but note one byte of the original unsigned long long remains. Compilers like to align data such as 4-byte integers on addresses divisible by 4, so three bytes, then a pad byte so the integer is at offset 4 and the struct has a total size of 8.
f=00000004FF030201
01 02 03 00000004
This can be controlled in compiler-dependent fashion. Below is for Microsoft C++:
#include <stdio.h>
#pragma pack(push,1)
union data
{
struct
{
unsigned char a;
unsigned char b;
unsigned char c;
unsigned int d;
} e;
unsigned long long f;
};
#pragma pack(pop)
int main()
{
data dat;
dat.f = 0xFFFFFFFFFFFFFFFF;
dat.e.a = 1;
dat.e.b = 2;
dat.e.c = 3;
dat.e.d = 4;
printf("f=%016llX\n",dat.f);
printf("%02X %02X %02X %08X\n",dat.e.a,dat.e.b,dat.e.c,dat.e.d);
return 0;
}
Note the struct occupies seven bytes now and the highest byte of the unsigned long long is now unchanged:
f=FF00000004030201
01 02 03 00000004
Got it.
static unsigned long long compress(char a, char b, char c, unsigned int someNumber)
{
unsigned long long x = 0;
x = x | a;
x = x << 8;
x = x | b;
x = x << 8;
x = x | c;
x = x << 32;
x = x | someNumber;
return x;
}
myStruct * decompress(unsigned long long x)
{
printBinary(x);
myStruct * m = new myStruct();
m->someNumber = x | 4294967296;
x = x >> 32;
m->c = x | 256;
x = x >> 8;
m->b = x | 256;
x = x >> 8;
m->a = x | 256;
return m;
}

sizeof(struct) returns unexpected value

This should be simple but I have no clue where to look for the issue:
I have a struct:
struct region
{
public:
long long int x;
long long int y;
long long int width;
long long int height;
unsigned char scale;
};
When I do sizeof(region) it gives me 40 when I am expecting 33.
Any ideas?
(mingw gcc, win x64 os)
It's padding the struct to fit an 8-byte boundary. So it actually is taking 40 bytes in memory - sizeof is returning the correct value.
If you want it to only take 33 bytes then specify the packed attribute:
struct region
{
public:
long long int x;
long long int y;
long long int width;
long long int height;
unsigned char scale;
} __attribute__ ((packed));
long long int values are 8 bytes each. scale is only 1 byte but is padded for alignments, so it effectively takes up 8 bytes too. 5*8 = 40.
As others said, structs are padded for alignments, and such padding not only depends on the type of the members, but also on the order of the members in which they're defined.
For example, consider these two structs A and B as defined below. Both structs are identical in terms of members and types; the only difference is that the order in which members are defined isn't same:
struct A
{
int i;
int j;
char c;
char d;
};
struct B
{
int i;
char c;
int j;
char d;
};
Would the sizeof(A) be equal to sizeof(B) just because they've same number of members of same type? No. Try printing the size of each:
cout << "sizeof(A) = "<< sizeof(A) << endl;
cout << "sizeof(B) = "<< sizeof(B) << endl;
Output:
sizeof(A) = 12
sizeof(B) = 16
Surprised? See the output yourself : http://ideone.com/yCX4S