How to extract values from a C++ structure with bitfields

This is my problem, I have a structure (that I cannot change) like the following:
struct X
{
uint8_t fieldAB;
uint8_t fieldCDE;
uint8_t fieldFGH;
...
}
Each field of this structure contains several values packed with bitmasks. For example, fieldAB contains two values (A and B) in the high and low nibbles, while fieldCDE contains three values (C, D and E, in bits 7-6, bits 5-3 and bits 2-0 respectively), and so on.
I would like to write a simple API that reads and writes these values through an enum, giving easy access to each value in each field:
getValue(valueTypeEnum typeOfValue, X & data);
setValue(valueTypeEnum typeOfValue, X & data, uint8_t value);
Where the enum valueTypeEnum is something like this:
enum valueTypeEnum
{
A,
B,
C,
D,
E,
...
}
My idea was to use a map (dictionary) that, given a valueTypeEnum, returns the bitmask to use and the offset of the right field in the structure, but I think it's a little tricky and not very elegant.
What are your suggestions?

I can think of a few ways this could be done. The simplest is to use bitfields directly in your struct:
struct X {
uint32_t A : 4; // 4 bits for A.
uint32_t B : 4;
uint32_t C : 4;
uint32_t D : 4;
uint8_t E : 7;
uint8_t F : 1;
};
Then you can easily get or set the values using for instance:
X x;
x.A = 0xF;
Another way could be to encode it directly in macros or inline functions, but I guess what you are looking for is probably the bitfield.
As pointed out in the comments, the actual behaviour of bit-fields may depend on your platform, so if space is of the essence, you should check that it behaves as you expect. Also see here for more information on bit-fields in C++.

I'll dig into the bitfields a little more.
Your X structure is left unchanged:
struct X
{
uint8_t fieldAB;
uint8_t fieldCDE;
uint8_t fieldFGH;
};
Let's define a union for easy translation:
union Xunion {
X x;
struct Fields { // Named in case you need to sizeof() it
uint8_t A : 4;
uint8_t B : 4;
uint8_t C : 2;
uint8_t D : 3;
uint8_t E : 3;
uint8_t F : 2;
uint8_t G : 3;
uint8_t H : 3;
} fields; // member name added: without one, the union would contain only x
};
And now you can access these bitfields conveniently.
Before anyone tries to skin me alive, note that this is in no way portable, nor even defined by the C++ standard. But it'll do what you expect on any sane compiler.
You may want to add a compiler-specific packing directive (e.g. GCC's __attribute__((packed))) to the Fields struct, as well as a static_assert ensuring that the sizes of both union members are equal.
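As a rough sketch of the size check suggested above (the member name `fields` and the assertion messages are my own, not from the question), something like the following could be used. Whether the assertions hold is up to the compiler, and reading one union member after writing the other remains non-portable, as noted.

```cpp
#include <cassert>
#include <cstdint>

struct X {
    std::uint8_t fieldAB;
    std::uint8_t fieldCDE;
    std::uint8_t fieldFGH;
};

// Bit-field view of the same three bytes. The position of each bit-field
// inside its byte is implementation-defined.
struct Fields {
    std::uint8_t A : 4;
    std::uint8_t B : 4;
    std::uint8_t C : 2;
    std::uint8_t D : 3;
    std::uint8_t E : 3;
    std::uint8_t F : 2;
    std::uint8_t G : 3;
    std::uint8_t H : 3;
};

union Xunion {
    X x;
    Fields fields; // hypothetical member name
};

// Compilation fails if the compiler pads or splits the bit-fields in a way
// that makes the two views differ in size.
static_assert(sizeof(Fields) == sizeof(X), "Fields must overlay X exactly");
static_assert(sizeof(Xunion) == sizeof(X), "the union must not add padding");
```

The assertions only catch size mismatches; they say nothing about which bits of a byte A and B end up in, so that still has to be checked against the compiler's documentation.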

Your best bet IMO is to forget using a structure at all or, if you must, to use a union of a structure and a byte array. Then have your access functions use the byte array, for which it's easy to calculate offsets and so on. Doing this, you are, I believe, guaranteed to access the bytes you want.
The downside is that you will have to re-assemble any 16- and 32-bit values found in the structure, bearing in mind any endianness issues. If you know that such values lie on 16- or 32-bit address boundaries, you could use union'd short and long arrays to do this, which would probably be best, though somewhat opaque.
HTH
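To illustrate the re-assembly step, here is a minimal sketch that reads and writes multi-byte values through a byte array. It assumes the data is stored little-endian; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read a 16-bit little-endian value starting at bytes[offset].
inline std::uint16_t read_u16_le(const std::uint8_t* bytes, std::size_t offset) {
    return static_cast<std::uint16_t>(bytes[offset] | (bytes[offset + 1] << 8));
}

// Read a 32-bit little-endian value starting at bytes[offset].
inline std::uint32_t read_u32_le(const std::uint8_t* bytes, std::size_t offset) {
    return  static_cast<std::uint32_t>(bytes[offset])
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}

// Write a 16-bit value as two bytes, least significant byte first.
inline void write_u16_le(std::uint8_t* bytes, std::size_t offset, std::uint16_t v) {
    bytes[offset]     = static_cast<std::uint8_t>(v & 0xFF);
    bytes[offset + 1] = static_cast<std::uint8_t>(v >> 8);
}
```

Because the bytes are combined with shifts, the result is the same on big- and little-endian hosts, and no alignment is required of the buffer.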

Maybe I found a solution.
I can create an API like the following:
uint8_t getValue(valueTypeEnum typeOfValue, X * data)
{
uint8_t bitmask;
uint8_t * field = getBitmaskAndFieldPtr(typeOfValue, data, &bitmask);
return ((*field) & bitmask) >> ...;
}
void setValue(valueTypeEnum typeOfValue, X * data, uint8_t value)
{
uint8_t bitmask;
uint8_t * field = getBitmaskAndFieldPtr(typeOfValue, data, &bitmask);
*field = ((*field) & ~bitmask) | (value << ...); // ~ (bitwise NOT), not ! (logical NOT)
}
uint8_t * getBitmaskAndFieldPtr(valueTypeEnum typeOfValue, X * data, uint8_t * bitmask)
{
uint8_t * fieldPtr = 0;
switch(typeOfValue)
{
case A:
{
*bitmask = 0x0F; // a dictionary could also be used here
fieldPtr = &data->fieldAB; // data is a pointer, and the member is named fieldAB
break;
}
case B:
{
*bitmask = 0xF0; // a dictionary could also be used here
fieldPtr = &data->fieldAB;
break;
}
case C:
{
*bitmask = 0xC0; // a dictionary could also be used here
fieldPtr = &data->fieldCDE;
break;
}
...
}
return fieldPtr;
}
I know that the switch-case is not very elegant and has to be updated each time the structure changes, but I don't see an automatic way (maybe using reflection) to solve this problem.
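For comparison, the dictionary idea from the question can be sketched as a constant table indexed by the enum. This is only a sketch under assumptions: the masks for A, B and C follow the switch above, those for D and E are derived from the bit ranges given in the question, and FieldInfo, kFields, valueTypeCount and the shift values are names and numbers I introduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct X {
    std::uint8_t fieldAB;
    std::uint8_t fieldCDE;
    std::uint8_t fieldFGH;
};

enum valueTypeEnum { A, B, C, D, E, valueTypeCount };

// Hypothetical descriptor: which byte holds the value, which bits it
// occupies, and how far those bits are shifted from bit 0.
struct FieldInfo {
    std::uint8_t X::*member;
    std::uint8_t mask;
    std::uint8_t shift;
};

// One row per enumerator, in enumerator order.
constexpr FieldInfo kFields[valueTypeCount] = {
    { &X::fieldAB,  0x0F, 0 }, // A
    { &X::fieldAB,  0xF0, 4 }, // B
    { &X::fieldCDE, 0xC0, 6 }, // C: bits 7-6
    { &X::fieldCDE, 0x38, 3 }, // D: bits 5-3
    { &X::fieldCDE, 0x07, 0 }, // E: bits 2-0
};

inline std::uint8_t getValue(valueTypeEnum type, const X& data) {
    const FieldInfo& f = kFields[type];
    return static_cast<std::uint8_t>((data.*(f.member) & f.mask) >> f.shift);
}

inline void setValue(valueTypeEnum type, X& data, std::uint8_t value) {
    const FieldInfo& f = kFields[type];
    std::uint8_t& byte = data.*(f.member);
    // Clear the target bits, then OR in the new value, truncated to the mask.
    byte = static_cast<std::uint8_t>((byte & ~f.mask) | ((value << f.shift) & f.mask));
}
```

The table still has to be kept in sync with the structure by hand, but adding a value becomes a one-line change instead of a new case block.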

Related

Using a Color struct in a buffer/array

I'm working on a library that handles (among others) images. For interoperability with OGL I need (at least) to write to BGRA buffers (byte order -> B first)
For that I'm designing a ColorARGB class that should represent a color. I have two approaches in mind: the first stores the color value as a uint32_t and provides conversion methods to byte buffers, and the second stores the components as bytes so that it can be used directly. In code:
struct ColorARGB
{
uint32_t clrValue;
ColorARGB(uint32_t clrValue):clrValue(clrValue){}
ColorARGB(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
clrValue = uint32_t(a) << 24 | uint32_t(r) << 16 | uint32_t(g) << 8 | b; // casts avoid shifting into the sign bit of int
}
static ColorARGB fromBGRA(const uint8_t* ptr)
{
return fromBGRA(reinterpret_cast<const uint32_t*>(ptr));
}
static ColorARGB fromBGRA(const uint32_t* ptr)
{
// This is little endian BGRA word format
return ColorARGB(boost::endian::little_to_native(*ptr));
}
// Similar functions for toBGRA
};
or:
struct ColorARGB2{
uint8_t clrValues[4]; // Or maybe: uint8_t b, g, r, a;
ColorARGB2(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
clrValues[0] = b; clrValues[1] = g;
clrValues[2] = r; clrValues[3] = a;
}
};
The second version should allow a std::vector<ColorARGB2> to serve as the buffer, while the first one has the problem that on big-endian machines the buffer would be ARGB, not BGRA. I can also simply reinterpret_cast a byte buffer to ColorARGB2*, which isn't possible for ColorARGB because of endianness.
Is there anything wrong with ColorARGB2? Would I run into possible alignment issues, especially when handling byte buffers (uint8_t*)? Can I simply implement the comparison as *reinterpret_cast<const uint32_t*>(&lhs) == *reinterpret_cast<const uint32_t*>(&rhs), or could this fail due to alignment?
Update (not a real answer but helps):
I found the following of use in the boost src code:
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386) || defined(_M_IX86)
// On x86 (which is little endian), unaligned loads are permitted
# define RTTR_USE_UNALIGNED_ACCESS 1
#endif
In theory you could hit alignment issues, but in practice, I've never seen it happen for a situation like this.
There is also a 3rd option you haven't enumerated, and that is to do both by using a union. Something along the lines of:
typedef union ARGBPixel {
uint32_t colorValue;
uint8_t components[4]; // <- Or an existing struct with separate a,r,g,b
} ARGBPixel;
With the above union the same piece of memory can be addressed as either a uint32_t or an array or struct of uint8_ts.
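Regarding the comparison asked about in the question, one way to sidestep the alignment concern is to copy the four bytes into a uint32_t with memcpy rather than casting the pointer. The following is a sketch along those lines; packedValue is a name I made up, and the struct is the question's ColorARGB2 reduced to what the example needs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct ColorARGB2 {
    std::uint8_t clrValues[4]; // stored in memory as B, G, R, A
    ColorARGB2(std::uint8_t a, std::uint8_t r, std::uint8_t g, std::uint8_t b) {
        clrValues[0] = b; clrValues[1] = g;
        clrValues[2] = r; clrValues[3] = a;
    }
};

static_assert(sizeof(ColorARGB2) == 4, "ColorARGB2 must be exactly four bytes");

// Copy the four component bytes into a uint32_t. memcpy places no alignment
// requirement on its source, unlike dereferencing a casted uint32_t pointer.
inline std::uint32_t packedValue(const ColorARGB2& c) {
    std::uint32_t v;
    std::memcpy(&v, c.clrValues, sizeof v);
    return v;
}

inline bool operator==(const ColorARGB2& lhs, const ColorARGB2& rhs) {
    return packedValue(lhs) == packedValue(rhs);
}

inline bool operator!=(const ColorARGB2& lhs, const ColorARGB2& rhs) {
    return !(lhs == rhs);
}
```

The numeric value returned by packedValue differs between big- and little-endian hosts, but that does not matter for an equality test since both sides are packed the same way.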

Getting entire value from bitfields

I wish to create a Block struct for use in a voxel game I am building (just background context), however I have run into issues with my saving and loading.
I can either represent a block as a single Uint16 and shift the bits to get the different elements such as blockID and health, or I can use a bitfield such as the one below:
struct Block
{
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
};
With the first method, when I wish to save the Block data I can simply convert the value of the Uint16 into a hex value and write it to a file. With loading I can simply read and convert the number back, then go back to reading the individual bits with manual bit shifting.
My issue is that I cannot figure out how to get the whole value of the Uint16 I am using with the bitfields method, which means I cannot save the block data as a single hex value.
So, the question is how do I go about getting the actual single Uint16 stored inside my block struct that is made up from the different bit fields. If it is not possible then that is fine, as I have already stated my manual bit shifting approach works just fine. I just wanted to profile to see which method of storing and modifying data is faster really.
If I have missed out a key detail or there is any extra information you need to help me out here, by all means do ask.
A union is probably the cleanest way:
#include <iostream>
typedef unsigned short Uint16;
struct S {
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
};
union U {
Uint16 asInt;
S asStruct;
};
int main() {
U u;
u.asStruct.id = 0xAB;
u.asStruct.health = 0xF;
u.asStruct.visible = 1;
u.asStruct.structural = 1;
std::cout << std::hex << u.asInt << std::endl;
}
This prints out cfab here, although the exact value depends on how the compiler allocates the bit-fields.
Update:
After further consideration and reading more deeply about this I have decided that any kind of type punning is bad. Instead I would recommend just biting the bullet and explicitly do the bit-twiddling to construct your value for serialization:
#include <iostream>
typedef unsigned short Uint16;
struct Block
{
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
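operator Uint16() const {
// id: bits 15-8, health: bits 7-2, visible: bit 1, structural: bit 0
return structural | visible << 1 | health << 2 | id << 8;
}
};
int main() {
Block b{0xAB, 0xF, 1, 1};
std::cout << std::hex << Uint16(b) << std::endl;
}
This has the further bonus that it prints ab3f: the fields are packed in declaration order from the most significant bit down, regardless of how the compiler lays out the bit-fields.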
If you are worried about performance, instead of using the operator member function you could have a function that the compiler optimizes away:
...
constexpr Uint16 serialize(const Block& b) {
return b.structural | b.visible << 1 | b.health << 2 | b.id << 8; // shifts match the 1/1/6/8-bit field widths
}
int main() {
Block b{0xAB, 0xF, 1, 1};
std::cout << std::hex << serialize(b) << std::endl;
}
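For loading, the inverse operation is needed as well. Here is a minimal sketch of a matching pair, assuming a 16-bit layout of my own choosing (id in bits 15-8, health in bits 7-2, visible in bit 1, structural in bit 0); deserialize is a name I introduced.

```cpp
#include <cassert>
#include <cstdint>

struct Block {
    std::uint16_t id         : 8;
    std::uint16_t health     : 6;
    std::uint16_t visible    : 1;
    std::uint16_t structural : 1;
};

// Pack the fields into one 16-bit value using the assumed layout:
// id in bits 15-8, health in bits 7-2, visible in bit 1, structural in bit 0.
inline std::uint16_t serialize(const Block& b) {
    return static_cast<std::uint16_t>(
        b.id << 8 | b.health << 2 | b.visible << 1 | b.structural);
}

// Unpack a value produced by serialize() back into a Block.
inline Block deserialize(std::uint16_t v) {
    Block b{};
    b.id         = (v >> 8) & 0xFF;
    b.health     = (v >> 2) & 0x3F;
    b.visible    = (v >> 1) & 0x1;
    b.structural = v & 0x1;
    return b;
}
```

Since both functions use explicit shifts and masks, the saved value does not depend on how the compiler arranges the bit-fields inside Block.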
And finally if speed is more important than memory, I would recommend getting rid of the bit fields:
struct Block
{
Uint16 id;
Uint16 health;
Uint16 visible;
Uint16 structural;
};
There are at least two methods for what you want: bit shifting and casting.
Bit Shifting
You can build a uint16_t from your structure by shifting the bit fields into a uint16_t:
uint16_t halfword;
struct Bit_Fields my_struct;
halfword = my_struct.id << 8;
halfword = halfword | (my_struct.health << 2);
halfword = halfword | (my_struct.visible << 1);
halfword = halfword | (my_struct.structural);
Casting
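Another method is to reinterpret the bytes of the structure as a uint16_t. A structure cannot be cast to an integer directly, so copy its bytes instead (memcpy is declared in <cstring>; the result depends on the compiler's bit-field layout):
uint16_t halfword;
struct Bit_Fields my_struct;
memcpy(&halfword, &my_struct, sizeof halfword);
Endianness
One issue of concern is endianness, the byte ordering of multi-byte values. This may affect where the bits lie within the 16-bit unit.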
Living on the edge (of undefined-behavior)..
The naive solution would be to reinterpret_cast a reference to the object to the underlying type of your bit-field, abusing the fact that the first non-static data-member of a standard-layout class is located at the same address as the object itself.
struct A {
uint16_t id : 8;
uint16_t health : 6;
uint16_t visible : 1;
uint16_t structural : 1;
};
A a { 0, 0, 0, 1 };
uint16_t x = reinterpret_cast<uint16_t const&> (a);
The above might look accurate, and it will often (not always) yield the expected result - but it suffers from two big problems:
The allocation of bit-fields within an object is implementation-defined, and;
the class type must be standard-layout.
There is nothing saying that the bit-fields will, physically, be stored in the order you declare them, and even if that was the case a compiler might insert padding between every bit-field (as this is allowed).
To sum things up: how bit-fields end up in memory is highly implementation-defined, and reasoning about the behavior requires you to look into your implementation's documentation on the matter.
What about using a union?
See the question "Accessing inactive union member - undefined?".
Recommendation
Stick with the bit-fiddling approach, unless you can absolutely prove that every implementation on which the code is run handles it the way you want it to.
What does the standard (N4296) say?
9.6p1 Bit-fields [class.bit]
[...] Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. [...]
9.2p20 Classes [class]
If a standard-layout class object has any non-static data members, its address is the same as the address of its first non-static data member. [...]
This isn't a good usage of bit fields (and really, there are very few).
There is no guarantee that the order of your bit fields will be the same as the order they're declared in; it could differ between compilers, or between versions and settings of the same compiler.
You'll have to manually store your members in a uint16_t using the shift and bitwise-or operators. As a general rule, you should never just dump or blindly copy data when dealing with external storage; you should manually serialize/deserialize it, to ensure it's in the format you expect.
You can use union:
typedef union
{
struct
{
Uint16 id : 8;
Uint16 health : 6;
Uint16 visible : 1;
Uint16 structural : 1;
} Bits;
Uint16 Val;
} TMyStruct;

Specifying bit size of array elements in a struct

Now I have a struct looking like this:
struct Struct {
uint8_t val1 : 2;
uint8_t val2 : 2;
uint8_t val3 : 2;
uint8_t val4 : 2;
} __attribute__((packed));
Is there a way to make all the vals a single array? The point is not space taken, but the location of all the values: I need them to be in memory without padding, and each occupying 2 bits. It's not important to have array, any other data structure with simple access by index will be ok, and not matter if it's plain C or C++. Read/write performance is important - it should be same (similar to) as simple bit operations, which are used now for indexed access.
Update:
What I want exactly can be described as
struct Struct {
uint8_t val[4] : 2;
} __attribute__((packed));
No, C only supports bitfields as structure members, and you cannot have arrays of them. I don't think you can do:
struct twobit {
uint8_t val : 2;
} __attribute__((packed));
and then do:
struct twobit array[32];
and expect array to consist of 32 2-bit integers, i.e. 8 bytes. A single char in memory cannot contain parts of different structs, I think. I don't have the paragraph and verse handy right now though.
You're going to have to do it yourself, typically using macros and/or inline functions to do the indexing.
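As a sketch of the do-it-yourself indexing, a small class can pack four 2-bit values per byte. TwoBitArray and its member names are made up for this example, and placing element 0 in the lowest bits of byte 0 is my own choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: N two-bit values packed four per byte, no padding
// between values. Element i lives in byte i / 4, at bit offset 2 * (i % 4).
template <std::size_t N>
class TwoBitArray {
public:
    std::uint8_t get(std::size_t i) const {
        assert(i < N);
        return static_cast<std::uint8_t>((bytes_[i / 4] >> shift(i)) & 0x3);
    }

    void set(std::size_t i, std::uint8_t v) {
        assert(i < N);
        std::uint8_t& byte = bytes_[i / 4];
        // Clear the two target bits, then OR in the new value.
        byte = static_cast<std::uint8_t>(
            (byte & ~(0x3 << shift(i))) | ((v & 0x3) << shift(i)));
    }

private:
    static unsigned shift(std::size_t i) {
        return static_cast<unsigned>(2 * (i % 4));
    }

    std::uint8_t bytes_[(N + 3) / 4] = {};
};
```

With this layout a TwoBitArray<32> occupies 8 bytes, which is what the array of struct twobit above cannot achieve.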
You have to manually do the bit stuff that's going on right now:
constexpr uint8_t get_mask(const uint8_t n)
{
return ~(((uint8_t)0x3)<<(2*n));
}
struct Struct2
{
uint8_t val;
inline void set_val(uint8_t v,uint8_t n)
{
// mask v to two bits so it cannot overwrite neighbouring values
val = (val&get_mask(n))|((v&0x3)<<(2*n));
}
inline uint8_t get_val(uint8_t n) const
{
return (val&~get_mask(n))>>(2*n);
}
//note: returns by value only, so assignment through [] WON'T work.
inline uint8_t operator[](uint8_t n) const
{
return get_val(n);
}
};
Note that you may be able to get better performance if you use actual assembly commands.
Also note that, (almost) no matter what, a uint8_t [4] will have better performance than this, and a processor aligned type (uint32_t) may have even better performance.
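If assignment through operator[] is wanted, one common technique is to return a small proxy object instead of a plain value. The sketch below is not from the answer above; the nested Ref class is a hypothetical name, and the struct is reduced to what the example needs.

```cpp
#include <cassert>
#include <cstdint>

struct Struct2 {
    std::uint8_t val = 0;

    // Proxy returned by the non-const operator[]: it converts to the 2-bit
    // value when read and rewrites only the selected bits when assigned to.
    class Ref {
    public:
        Ref(std::uint8_t& byte, unsigned n) : byte_(byte), shift_(2 * n) {}

        operator std::uint8_t() const {
            return static_cast<std::uint8_t>((byte_ >> shift_) & 0x3);
        }

        Ref& operator=(std::uint8_t v) {
            byte_ = static_cast<std::uint8_t>(
                (byte_ & ~(0x3 << shift_)) | ((v & 0x3) << shift_));
            return *this;
        }

    private:
        std::uint8_t& byte_;
        unsigned shift_;
    };

    Ref operator[](unsigned n) {
        assert(n < 4);
        return Ref(val, n);
    }

    std::uint8_t operator[](unsigned n) const {
        assert(n < 4);
        return static_cast<std::uint8_t>((val >> (2 * n)) & 0x3);
    }
};
```

With this in place, `s[1] = 2;` updates bits 3-2 of val, and `std::uint8_t v = s[1];` reads them back. The proxy adds no storage to Struct2 itself, which stays one byte.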

2 bits size variable

I need to define a struct which has data members of size 2 bits and 6 bits.
Should I use the char type for each member? Or, in order not to waste memory, can I use something like the :2 / :6 notation?
How can I do that?
Can I define a typedef for 2 or 6 bits type?
You can use something like:
typedef struct {
unsigned char SixBits:6;
unsigned char TwoBits:2;
} tEightBits;
and then use:
tEightBits eight;
eight.SixBits = 31;
eight.TwoBits = 3;
But, to be honest, unless you're having to comply with packed data external to your application, or you're in a very memory constrained situation, this sort of memory saving is not usually worth it. You'll find your code is a lot faster if it's not having to pack and unpack data all the time with bitwise and bitshift operations.
Also keep in mind that support for any bit-field type other than _Bool, signed int or unsigned int is implementation-defined in C. Specifically, unsigned char may not work everywhere.
It's probably best to use uint8_t for something like this. And yes, use bit fields:
struct tiny_fields
{
uint8_t twobits : 2;
uint8_t sixbits : 6;
};
I don't think you can be sure that the compiler will pack this into a single byte, though. Also, you can't know how the bits are ordered within the byte(s) that values of the struct type occupy. It's often better to use explicit masks if you want more control.
Personally I prefer shift operators and some macros over bit fields, so there's no "magic" left for the compiler. This is usual practice in the embedded world.
#define SET_VAL2BIT(_var, _val) ( (_var) = ((_var) & ~3) | ((_val) & 3) )
#define SET_VAL6BIT(_var, _val) ( (_var) = ((_var) & 3) | (((_val) & 63) << 2) )
#define GET_VAL2BIT(_var) ( (_var) & 3 )
#define GET_VAL6BIT(_var) ( ((_var) >> 2) & 63 )
static uint8_t my_var;
<...>
SET_VAL2BIT(my_var, 1);
SET_VAL6BIT(my_var, 5);
int a = GET_VAL2BIT(my_var); /* a == 1 */
int b = GET_VAL6BIT(my_var); /* b == 5 */
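In C++ the same thing can be written with inline functions instead of macros, which adds type checking and avoids evaluating arguments twice. This is a sketch only: it assumes the same layout as the macros above (2-bit value in bits 1-0, 6-bit value in bits 7-2), and the function names are my own.

```cpp
#include <cassert>
#include <cstdint>

// Return var with its low two bits replaced by val (truncated to 2 bits).
inline std::uint8_t set2(std::uint8_t var, std::uint8_t val) {
    return static_cast<std::uint8_t>((var & ~0x03) | (val & 0x03));
}

// Return var with its upper six bits replaced by val (truncated to 6 bits).
inline std::uint8_t set6(std::uint8_t var, std::uint8_t val) {
    return static_cast<std::uint8_t>((var & 0x03) | ((val & 0x3F) << 2));
}

// Extract the 2-bit value from bits 1-0.
inline std::uint8_t get2(std::uint8_t var) {
    return static_cast<std::uint8_t>(var & 0x03);
}

// Extract the 6-bit value from bits 7-2.
inline std::uint8_t get6(std::uint8_t var) {
    return static_cast<std::uint8_t>((var >> 2) & 0x3F);
}
```

These return the updated byte rather than modifying it in place, so a call looks like `my_var = set2(my_var, 1);`.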

How to assign value to a struct with bit-fields?

I have a struct with bit-fields (32 bits wide in total) and a 32-bit variable. When I try to assign the variable's value to my struct, I get an error:
error: conversion from ‘uint32_t {aka unsigned int}’ to non-scalar type ‘main()::CPUID’ requested.
struct CPUIDregs
{
uint32_t EAXBuf;
};
CPUIDregs CPUIDregsoutput;
int main () {
struct CPUID
{
uint32_t Stepping : 4;
uint32_t Model : 4;
uint32_t FamilyID : 4;
uint32_t Type : 2;
uint32_t Reserved1 : 2;
uint32_t ExtendedModel : 4;
uint32_t ExtendedFamilyID : 8;
uint32_t Reserved2 : 4;
};
CPUID CPUIDoutput = CPUIDregsoutput.EAXBuf;
Do you have any idea how to do it in the shortest way? Thanks
P.S. Of course I have more appropriate value of EAX in real code, but I guess it doesn't affect here.
You should never rely on how the compiler lays out your structure in memory. There are ways to do what you want with a single assignment, but I will neither recommend nor tell you.
The best way to do the assignment would be the following:
static inline void to_id(struct CPUID *id, uint32_t value)
{
id->Stepping = value & 0xf;
id->Model = value >> 4 & 0xf;
id->FamilyID = value >> 8 & 0xf;
id->Type = value >> 12 & 0x3;
id->Reserved1 = value >> 14 & 0x3;
id->ExtendedModel = value >> 16 & 0xf;
id->ExtendedFamilyID = value >> 20 & 0xff;
id->Reserved2 = value >> 28 & 0xf;
}
And the opposite
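static inline uint32_t from_id(const struct CPUID *id)
{
/* cast before shifting: the bit-fields promote to int, and shifting a set bit into bit 31 of an int must be avoided */
return (uint32_t)id->Stepping
| (uint32_t)id->Model << 4
| (uint32_t)id->FamilyID << 8
| (uint32_t)id->Type << 12
| (uint32_t)id->Reserved1 << 14
| (uint32_t)id->ExtendedModel << 16
| (uint32_t)id->ExtendedFamilyID << 20
| (uint32_t)id->Reserved2 << 28;
}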
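The repeated shift-and-mask pattern can also be factored into one helper. The sketch below decodes into plain unsigned members rather than bit-fields; extract_bits, CpuSignature and decode_eax are names I invented, and the bit positions are the ones used in to_id above.

```cpp
#include <cassert>
#include <cstdint>

// Return `width` bits of `value`, starting at bit `lsb`.
// Assumes 0 < width < 32 and lsb + width <= 32.
constexpr std::uint32_t extract_bits(std::uint32_t value, unsigned lsb, unsigned width) {
    return (value >> lsb) & ((std::uint32_t{1} << width) - 1u);
}

// Hypothetical decoded form with ordinary members instead of bit-fields.
struct CpuSignature {
    unsigned stepping;
    unsigned model;
    unsigned familyId;
    unsigned type;
    unsigned extendedModel;
    unsigned extendedFamilyId;
};

inline CpuSignature decode_eax(std::uint32_t eax) {
    CpuSignature s;
    s.stepping         = extract_bits(eax, 0, 4);
    s.model            = extract_bits(eax, 4, 4);
    s.familyId         = extract_bits(eax, 8, 4);
    s.type             = extract_bits(eax, 12, 2);
    s.extendedModel    = extract_bits(eax, 16, 4);
    s.extendedFamilyId = extract_bits(eax, 20, 8);
    return s;
}
```

For example, decoding the value 0x000306A9 gives stepping 9, model 0xA, family 6 and extended model 3, with the remaining fields zero.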
Use a union.
union foo {
struct {
uint8_t a : 4;
uint8_t b : 4;
uint8_t c : 4;
uint8_t d : 4;
uint16_t e;
};
uint32_t allfields;
};
int main(void) {
union foo a;
a.allfields = 0;
a.b = 3;
return 0;
}
In case somebody's interested, I've got a better solution for my own question:
*(reinterpret_cast<uint32_t *> (&CPUIDoutput)) = CPUIDregsoutput.EAXBuf;
These are struct members, so you need to assign directly to them, or make sure the RHS of your assignment is a value of type CPUID. Not sure why you expect to be able to assign to the struct from an integer.
The facts that the struct contains bitfields, and that the sum of the bits happens to be the same as the number of bits in the integer you're trying to assign, mean nothing. They're still not compatible types, for assignment purposes.
If this was too vague, consider showing more/better code.
I wanted to add a couple of things here in case someone needs the information. As Shahbaz has pointed out, there is one way of SAFELY doing this. Here are the issues with the other ones I see.
The union has the problem of endianness. When you have a uint32_t value assigned to a bitfield, big endian will have your bits in one order while little endian will store your bytes of bits in reverse order. When you think you are assigning it one value, you could instead be assigning wrong values to wrong bits. This is therefore NOT portable code.
When you change the pointer type of anything on the LHS of the assignment, assume you are doing something wrong. This is my mantra. Sure, it works, here. Again, this becomes dependent on endian-ness, as well as compiler settings and memory allocation. I can remap any chunk of memory into a different structure, but if I'm doing that then how do I ensure that the 2 structures are of the same size and order? What if down the road you decide to optimize your code and make one a uint16_t, or reorder the bits on a structure? This is a maintenance NIGHTMARE. Just... don't do it. If not for yourself, then for the next individual who is going to have to maintain your code.
Hope this helps explain some of the answers that get the job done but aren't the way to do it.