I wish to create a Block struct for use in a voxel game I am building (just background context), however I have run into issues with my saving and loading.
I can either represent a block as a single Uint16 and shift the bits to get the different elements such as blockID and health, or I can use a bitfield such as the one below:
struct Block
{
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
};
With the first method, when I wish to save the Block data I can simply convert the value of the Uint16 into a hex value and write it to a file. With loading I can simply read and convert the number back, then go back to reading the individual bits with manual bit shifting.
My issue is that I cannot figure out how to get the whole value of the Uint16 I am using with the bitfields method, which means I cannot save the block data as a single hex value.
So, the question is how do I go about getting the actual single Uint16 stored inside my block struct that is made up from the different bit fields. If it is not possible then that is fine, as I have already stated my manual bit shifting approach works just fine. I just wanted to profile to see which method of storing and modifying data is faster really.
If I have missed out a key detail or there is any extra information you need to help me out here, by all means do ask.
A union is probably the cleanest way:
#include <iostream>
typedef unsigned short Uint16;
struct S {
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
};
union U {
Uint16 asInt;
S asStruct;
};
int main() {
U u;
u.asStruct.id = 0xAB;
u.asStruct.health = 0xF;
u.asStruct.visible = 1;
u.asStruct.structural = 1;
std::cout << std::hex << u.asInt << std::endl;
}
This prints out cfab.
Update:
After further consideration and reading more deeply about this, I have decided that any kind of type punning is bad. Instead I would recommend just biting the bullet and explicitly doing the bit-twiddling to construct your value for serialization:
#include <iostream>
typedef unsigned short Uint16;
struct Block
{
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
operator Uint16() {
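return structural | visible << 1 | health << 2 | id << 8;
}
};
int main() {
Block b{0xAB, 0xF, 1, 1};
std::cout << std::hex << Uint16(b) << std::endl;
}
This prints ab3f. It has the further bonus that the layout of the value no longer depends on how the compiler allocates the bit-fields: id always occupies the top 8 bits, followed by the 6 bits of health, then visible and structural, which matches the initializer order.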
If you are worried about performance, instead of using the operator member function you could have a function that the compiler optimizes away:
...
constexpr Uint16 serialize(const Block& b) {
return b.structural | b.visible << 1 | b.health << 2 | b.id << 8;
}
int main() {
Block b{0xAB, 0xF, 1, 1};
std::cout << std::hex << serialize(b) << std::endl;
}
And finally if speed is more important than memory, I would recommend getting rid of the bit fields:
struct Block
{
Uint16 id;
Uint16 health;
Uint16 visible;
Uint16 structural;
};
There are at least two methods for what you want:
Bit Shifting
Casting
Bit Shifting
You can build a uint16_t from your structure by shifting the bit fields into a uint16_t:
uint16_t halfword;
struct Bit_Fields my_struct;
halfword = my_struct.id << 8;
halfword = halfword | (my_struct.health << 2);
halfword = halfword | (my_struct.visible << 1);
halfword = halfword | (my_struct.structural);
Casting
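Another method is to reinterpret the bytes of the structure as a uint16_t. A plain cast such as (uint16_t) my_struct does not compile, because a structure cannot be converted to an integer, so copy the bytes with memcpy (from <string.h>) instead. This assumes the structure is exactly two bytes in size:
uint16_t halfword;
struct Bit_Fields my_struct;
memcpy(&halfword, &my_struct, sizeof halfword);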
Endianness
One issue of concern is endianness, the byte ordering of multi-byte values. It may affect where the bits lie within the 16-bit unit.
Living on the edge (of undefined-behavior)..
The naive solution would be to reinterpret_cast a reference to the object to the underlying type of your bit-field, abusing the fact that the first non-static data-member of a standard-layout class is located at the same address as the object itself.
struct A {
uint16_t id : 8;
uint16_t health : 6;
uint16_t visible : 1;
uint16_t structural : 1;
};
A a { 0, 0, 0, 1 };
uint16_t x = reinterpret_cast<uint16_t const&> (a);
The above might look accurate, and it will often (not always) yield the expected result - but it suffers from two big problems:
The allocation of bit-fields within an object is implementation-defined, and;
the class type must be standard-layout.
There is nothing saying that the bit-fields will, physically, be stored in the order you declare them, and even if that was the case a compiler might insert padding between every bit-field (as this is allowed).
To sum things up: how bit-fields end up in memory is highly implementation-defined, and reasoning about the behavior requires you to look into your implementation's documentation on the matter.
What about using a union?
Accessing inactive union member - undefined?
Recommendation
Stick with the bit-fiddling approach, unless you can absolutely prove that every implementation on which the code is run handles it the way you would want it to.
What does the standard (N4296) say?
9.6p1 Bit-fields [class.bit]
[...] Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. [...]
9.2p20 Classes [class]
If a standard-layout class object has any non-static data members, its address is the same as the address of its first non-static data member. [...]
This isn't a good usage of bit fields (and really, there are very few).
There is no guarantee that the order of your bit fields will be the same as the order they're declared in; it could change between builds of your application.
You'll have to manually store your members in a uint16_t using the shift and bitwise-or operators. As a general rule, you should never just dump or blindly copy data when dealing with external storage; you should manually serialize/deserialize it, to ensure it's in the format you expect.
You can use union:
typedef union
{
struct
{
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
} Bits;
Uint16 Val;
} TMyStruct;
Related
Why does the sizes of these two structs differ?
#pragma pack(push, 1)
struct WordA
{
uint32_t address : 8;
uint32_t data : 20;
uint32_t sign : 1;
uint32_t stateMatrix : 2;
uint32_t parity : 1;
};
struct WordB
{
uint8_t address;
uint32_t data : 20;
uint8_t sign : 1;
uint8_t stateMatrix : 2;
uint8_t parity : 1;
};
#pragma pack(pop)
Somehow WordB occupies 6 bytes instead of four, while WordA occupies exactly 32 bits.
I assumed that, because the used bits add up to the same total in both structs, the two would have the same size. Apparently I am wrong, but I cannot find an explanation why.
Bit fields page shows only examples when all of the struct members are of the same type, which is a case of WordA.
Can anybody explain, why the sizes don't match and if it is according to the standard or implementation-defined?
Why can't a bit field be split between different underlying types?
It can, in the sense that the standard allows it.
Here it wasn't, because that is what the language implementer (or rather, the designer of the ABI) chose. That choice may have been preferred because it can make the program faster or the compiler easier to implement.
Here is the standard quote:
[class.bit]
... Allocation of bit-fields within a class object is implementation-defined.
Alignment of bit-fields is implementation-defined.
Bit-fields are packed into some addressable allocation unit.
I currently have code that looks like this:
union {
struct {
void* buffer;
uint64_t n : 63;
uint64_t flag : 1;
} a;
struct {
unsigned char buffer[15];
unsigned char n : 7;
unsigned char flag : 1;
} b;
} data;
It is part of an attempted implementation of a data structure that does small-size optimization. Although it works on my machine with the compiler I am using, I am aware that there is no guarantee that the two flag bits from each of the structs actually end up in the same bit. Even if they did, it would still technically be undefined behavior to read it from the struct that wasn't most recently written. I would like to use this bit to discriminate between which of the two types is currently stored.
Is there a safe and portable way to achieve the same thing without increasing the size of the union? For our purpose, it can not be larger than 16 bytes.
If not, could it be achieved by sacrificing an entire byte (of n in the first struct and of buffer in the second), instead of a bit?
This is my problem, I have a structure (that I cannot change) like the following:
struct X
{
uint8_t fieldAB;
uint8_t fieldCDE;
uint8_t fieldFGH;
...
};
Each field of this structure contains different values packed using a bitmask (bitfield), that is for example fieldAB contains two different values (A and B) in the hi/lo nibbles, while fieldCDE contains three different values (C, D and E with the following bit mask: bit 7-6, bit 5-4-3, bit 2-1-0) and so on...
I would like to write a simple API to read and write this value using enum, that allows to easily access to values of each field:
uint8_t getValue(valueTypeEnum typeOfValue, X & data);
void setValue(valueTypeEnum typeOfValue, X & data, uint8_t value);
Where the enum valueTypeEnum is something like this:
enum valueTypeEnum
{
A,
B,
C,
D,
E,
...
};
My idea was to use a map (dictionary) that given valueTypeEnum returns the bitmask to use and the offset for access to the right field of the structure, but I think it's a little tricky and not so elegant.
What are your suggestions?
I can think of a few ways this could be done, the simplest is to use bitfields directly in your struct:
struct X {
uint32_t A : 4; // 4 bits for A.
uint32_t B : 4;
uint32_t C : 4;
uint32_t D : 4;
uint8_t E : 7;
uint8_t F : 1;
};
Then you can easily get or set the values using for instance:
X x;
x.A = 0xF;
Another way could be to encode it directly in macros or inline functions, but I guess what you are looking for is probably the bitfield.
As pointed out in the comments, the actual behaviour of bit-fields may depend on your platform, so if space is of the essence, you should check that it behaves as you expect. Also see here for more information on bit-fields in C++.
I'll dig into the bitfields a little more :
Your X structure is left unchanged :
struct X
{
uint8_t fieldAB;
uint8_t fieldCDE;
uint8_t fieldFGH;
};
Let's define an union for easy translation :
union Xunion {
X x;
struct Fields { // Named in case you need to sizeof() it
uint8_t A : 4;
uint8_t B : 4;
uint8_t C : 2;
uint8_t D : 3;
uint8_t E : 3;
uint8_t F : 2;
uint8_t G : 3;
uint8_t H : 3;
} fields; // the member through which the bit-fields are accessed
};
And now you can access these bitfields conveniently.
Before anyone tries to skin me alive, note that this is in no way portable, nor even defined by the C++ standard. But it'll do what you expect on any sane compiler.
You may want to add a compiler-specific packing directive (e.g. GCC's __attribute__((packed))) to the Fields struct, as well as a static_assert ensuring that both union members have exactly the same size.
Your best bet IMO is to forget using a structure at all, or if so use a union of a structure and a byte array. Then have your access functions use the byte array, for which it's easy to calculate offsets and so on. Doing this you are I believe guaranteed to access the bytes you want.
The downside is that you will have to re-assemble 16 and 32 bit values, if any, found in the structure, and do so bearing in mind any endianness issues. If you know that any such values are found on 16 or 32 bit address boundaries, you could use union'd short and long arrays to do this which would probably be best, though somewhat opaque.
HTH
Maybe I found a solution.
I can create an API like the following:
uint8_t getValue(valueTypeEnum typeOfValue, X * data)
{
uint8_t bitmask;
uint8_t * field = getBitmaskAndFieldPtr(typeOfValue, data, &bitmask);
return ((*field) & bitmask) >> ...;
}
void setValue(valueTypeEnum typeOfValue, X * data, uint8_t value)
{
uint8_t bitmask;
uint8_t * field = getBitmaskAndFieldPtr(typeOfValue, data, &bitmask);
*field = ((*field) & ~bitmask) | (value << ...);
}
uint8_t * getBitmaskAndFieldPtr(valueTypeEnum typeOfValue, X * data, uint8_t * bitmask)
{
uint8_t * fieldPtr = 0;
switch(typeOfValue)
{
case A:
{
*bitmask = 0x0F; // I can use also a dictionary here
fieldPtr = &data->fieldAB;
break;
}
case B:
{
*bitmask = 0xF0; // I can use also a dictionary here
fieldPtr = &data->fieldAB;
break;
}
case C:
{
*bitmask = 0xC0; // I can use also a dictionary here
fieldPtr = &data->fieldCDE;
break;
}
...
}
return fieldPtr;
}
I know that the switch-case is not so elegant and has to be updated each time the structure changes, but I don't see an automatic way (maybe using reflection) to solve this problem.
Now I have a struct looking like this:
struct Struct {
uint8_t val1 : 2;
uint8_t val2 : 2;
uint8_t val3 : 2;
uint8_t val4 : 2;
} __attribute__((packed));
Is there a way to make all the vals a single array? The point is not the space taken, but the location of the values: I need them to be in memory without padding, each occupying 2 bits. It does not have to be an array; any other data structure with simple access by index will do, and it does not matter whether it is plain C or C++. Read/write performance is important: it should be the same as (or similar to) the simple bit operations that are used now for indexed access.
Update:
What I want exactly can be described as
struct Struct {
uint8_t val[4] : 2;
} __attribute__((packed));
No, C only supports bitfields as structure members, and you cannot have arrays of them. I don't think you can do:
struct twobit {
uint8_t val : 2;
} __attribute__((packed));
and then do:
struct twobit array[32];
and expect array to consist of 32 2-bit integers, i.e. 8 bytes. A single char in memory cannot contain parts of different structs, I think. I don't have the paragraph and verse handy right now though.
You're going to have to do it yourself, typically using macros and/or inline functions to do the indexing.
You have to manually do the bit stuff that's going on right now:
constexpr uint8_t get_mask(const uint8_t n)
{
return ~(((uint8_t)0x3)<<(2*n));
}
struct Struct2
{
uint8_t val;
inline void set_val(uint8_t v,uint8_t n)
{
val = (val&get_mask(n))|(v<<(2*n));
}
inline uint8_t get_val(uint8_t n)
{
return (val&~get_mask(n))>>(2*n);
}
//note, return type only, assignment WONT work.
inline uint8_t operator[](uint8_t n)
{
return get_val(n);
}
};
Note that you may be able to get better performance if you use actual assembly instructions.
Also note that, (almost) no matter what, a uint8_t [4] will have better performance than this, and a processor aligned type (uint32_t) may have even better performance.
I have a struct with bit-fields (32 bits wide in total) and I have a 32-bit variable. When I try to assign the variable's value to my struct, I get an error:
error: conversion from ‘uint32_t {aka unsigned int}’ to non-scalar type ‘main()::CPUID’ requested.
struct CPUIDregs
{
uint32_t EAXBuf;
};
CPUIDregs CPUIDregsoutput;
int main () {
struct CPUID
{
uint32_t Stepping : 4;
uint32_t Model : 4;
uint32_t FamilyID : 4;
uint32_t Type : 2;
uint32_t Reserved1 : 2;
uint32_t ExtendedModel : 4;
uint32_t ExtendedFamilyID : 8;
uint32_t Reserved2 : 4;
};
CPUID CPUIDoutput = CPUIDregsoutput.EAXBuf; // this line triggers the error
}
Do you have any idea how to do it in the shortest way? Thanks
P.S. Of course I have more appropriate value of EAX in real code, but I guess it doesn't affect here.
You should never rely on how the compiler lays out your structure in memory. There are ways to do what you want with a single assignment, but I will neither recommend nor tell you.
The best way to do the assignment would be the following:
static inline void to_id(struct CPUID *id, uint32_t value)
{
id->Stepping = value & 0xf;
id->Model = value >> 4 & 0xf;
id->FamilyID = value >> 8 & 0xf;
id->Type = value >> 12 & 0x3;
id->Reserved1 = value >> 14 & 0x3;
id->ExtendedModel = value >> 16 & 0xf;
id->ExtendedFamilyID = value >> 20 & 0xff;
id->Reserved2 = value >> 28 & 0xf;
}
And the opposite
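static inline uint32_t from_id(const struct CPUID *id)
{
/* Cast before shifting: a narrow bit-field promotes to int, and shifting a set bit into bit 31 of an int overflows. */
return (uint32_t)id->Stepping
| (uint32_t)id->Model << 4
| (uint32_t)id->FamilyID << 8
| (uint32_t)id->Type << 12
| (uint32_t)id->Reserved1 << 14
| (uint32_t)id->ExtendedModel << 16
| (uint32_t)id->ExtendedFamilyID << 20
| (uint32_t)id->Reserved2 << 28;
}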
Use a union.
union foo {
struct {
uint8_t a : 4;
uint8_t b : 4;
uint8_t c : 4;
uint8_t d : 4;
uint16_t e;
};
uint32_t allfields;
};
int main(void) {
union foo a;
a.allfields = 0;
a.b = 3;
return 0;
}
In case somebody is interested, I've found a better solution to my own question:
*(reinterpret_cast<uint32_t *> (&CPUIDoutput)) = CPUIDregsoutput.EAXBuf;
These are struct members, so you need to assign directly to them, or make sure the RHS of your assignment is a value of type CPUID. Not sure why you expect to be able to assign to the struct from an integer.
The facts that the struct contains bitfields, and that the sum of the bits happens to be the same as the number of bits in the integer you're trying to assign, mean nothing. They're still not compatible types, for assignment purposes.
If this was too vague, consider showing more/better code.
I wanted to add a couple of things here in case someone needs the information. As Shahbaz has pointed out, there is one way of SAFELY doing this. Here are the issues with the other ones I see.
The union has the problem of endianness. When you assign a uint32_t value through a union onto bit-fields, a big-endian machine will have your bits in one order while a little-endian machine will store the bytes in reverse order. When you think you are assigning one value, you could instead be assigning wrong values to wrong bits. This is therefore NOT portable code.
When you change the pointer type of anything on the LHS of the assignment, assume you are doing something wrong. This is my mantra. Sure, it works, here. Again, this becomes dependent on endian-ness, as well as compiler settings and memory allocation. I can remap any chunk of memory into a different structure, but if I'm doing that then how do I ensure that the 2 structures are of the same size and order? What if down the road you decide to optimize your code and make one a uint16_t, or reorder the bits on a structure? This is a maintenance NIGHTMARE. Just... don't do it. If not for yourself, then for the next individual who is going to have to maintain your code.
Hope this helps explain some of the answers that get the job done but aren't the way to do it.