How can I refer to sub-elements in a particular variable? - c++

My problem is as follows: I have a 64 bit variable, of type uint64_t (so I know it's specified to be at least 64 bits wide).
I want to be able to refer to different parts of it, for example breaking it down into two uint32_ts, four uint16_ts or eight uint8_ts. Is there a standards compliant way to do it that doesn't rely on undefined behavior?
My approach is as follows:
class Buffer
{
uint64_t m_64BitBuffer;
public:
uint64_t & R64() { return m_64BitBuffer; }
uint32_t & R32(R32::Part part) { return *(reinterpret_cast<uint32_t*>(&m_64BitBuffer)+part); }
uint16_t & R16(R16::Part part) { return *(reinterpret_cast<uint16_t*>(&m_64BitBuffer)+part); }
uint8_t & R8(R8::Part part) { return *(reinterpret_cast<uint8_t*>(&m_64BitBuffer)+part); }
};
Where R32::Part, R16::Part and R8::Part are enums that define values between 0 and 1, 0 and 3 and 0 and 7 respectively.
I imagine this should be ok. There should be no issues with alignment, for example. I'd like to know if I'm breaking any rules, and if so, how to do this properly.

Type-punning through a union is allowed by some compilers, so you could simply have the following anonymous union member:
union {
uint64_t val;
struct { uint32_t as32[2]; };
struct { uint16_t as16[4]; };
struct { uint8_t as8[8]; };
} u;
Access to each part is as easy as reading from the appropriate member.

Related

Are structs that contain packed structs packed themselves?

Suppose I have a struct which contains other structs, which are not packed:
struct ContainerOfNonPacked
{
NonPacked1 first;
NonPacked2 second;
};
And then I have a struct which contains other structs which are packed:
struct ContainerOfPacked
{
Packed1 first; // declaration of struct had __attribute__((packed))
Packed2 second; // ditto
};
The first is not going to be packed by the compiler (i.e. there's no guarantee that there will be no "holes" inside the struct). It might coincidentally have no holes, but that's not what the question is about.
What about the second container, that contains packed structs? Is there any guarantee that a struct consisting solely of packed structs as its fields is itself packed?
Edit: it is up to the compiler and is not a behaviour one can depend on.
For the case of GCC and clang, structs that contain only packed structs are themselves "packed". They will not have any padding "holes" in them.
Here is an example to illustrate (godbolt link: https://godbolt.org/g/2noSpP):
We have four structs, Packed1, Packed2, NonPacked1, and NonPacked2.
#include <cstdint>
struct Packed1
{
uint32_t a;
uint64_t b;
} __attribute__((packed));
struct NonPacked1
{
uint16_t a;
// Padding of 2 bytes
uint32_t b;
uint8_t c;
};
// We have two structs of the same size, one packed and one not
static_assert(sizeof(NonPacked1) == 12);
static_assert(sizeof(Packed1) == 12);
struct Packed2
{
uint64_t a;
uint64_t b;
} __attribute__((packed)); // packing has no effect, but better for illustration
struct NonPacked2
{
uint32_t a;
// Padding of 4 bytes
uint64_t b;
};
// And again, two structs of the same size
static_assert(sizeof(Packed2) == 16);
static_assert(sizeof(NonPacked2) == 16);
Packed1 and Packed2 go into a ContainerOfPacked struct, and the other two into a ContainerOfNonPacked.
struct ContainerOfNonPacked
{
NonPacked1 first; // 12 bytes
// Padding of 4 bytes between the non-packed struct-fields
NonPacked2 second; // 16 bytes
};
struct ContainerOfPacked
{
Packed1 first; // 12
// No padding between the packed struct-fields
Packed2 second; // 16
};
static_assert(sizeof(ContainerOfNonPacked) == 32);
static_assert(sizeof(ContainerOfPacked) == 28);
As the code comments and static asserts show, the container of packed structs does not have padding holes inside it. It behaves as if it is itself "packed".
Although this is an answer, I'm also looking for answers which can cite the relevant parts of the Standard or some other authoritative/theoretical explanations.

How to extract values from C++ structure with bitfields

This is my problem, I have a structure (that I cannot change) like the following:
struct X
{
uint8_t fieldAB;
uint8_t fieldCDE;
uint8_t fieldFGH;
...
}
Each field of this structure contains different values packed using a bitmask (bitfield), that is for example fieldAB contains two different values (A and B) in the hi/lo nibbles, while fieldCDE contains three different values (C, D and E with the following bit mask: bit 7-6, bit 5-4-3, bit 2-1-0) and so on...
I would like to write a simple API to read and write this value using enum, that allows to easily access to values of each field:
getValue(valueTypeEnum typeOfValue, X & data);
setValue(valueTypeEnum typeOfValue, X & data, uint8_t value);
Where the enum valueTypeEnum is something like this:
enum valueTypeEnum
{
A,
B,
C,
D,
E,
...
}
My idea was to use a map (dictionary) that given valueTypeEnum returns the bitmask to use and the offset for access to the right field of the structure, but I think it's a little tricky and not so elegant.
What are your suggestions?
I can think of a few ways this could be done, the simplest is to use bitfields directly in your struct:
struct X {
uint32_t A : 4; // 4 bits for A.
uint32_t B : 4;
uint32_t C : 4;
uint32_t D : 4;
uint8_t E : 7;
uint8_t F : 1;
};
Then you can easily get or set the values using for instance:
X x;
x.A = 0xF;
Another way could be to encode it directly in macros or inline functions, but I guess what you are looking for is probably the bitfield.
As pointed out in the comments, the actual behaviour of bit-fields may depend on your platform, so if space is of the essence, you should check that it behaves as you expect. Also see here for more information on bit-fields in C++.
I'll dig into the bitfields a little more :
Your X structure is left unchanged :
struct X
{
uint8_t fieldAB;
uint8_t fieldCDE;
uint8_t fieldFGH;
};
Let's define an union for easy translation :
union Xunion {
X x;
struct Fields { // Named in case you need to sizeof() it
uint8_t A : 4;
uint8_t B : 4;
uint8_t C : 2;
uint8_t D : 3;
uint8_t E : 3;
uint8_t F : 2;
uint8_t G : 3;
uint8_t H : 3;
};
};
And now you can access these bitfields conveniently.
Before anyone tries to skin me alive, note that this is in no way portable, nor even defined by the C++ standard. But it'll do what you expect on any sane compiler.
You may want to add a compiler-specific packing directive (e.g GCC's __attribute__((packed))) to the Fields struct, as well as a static_assert ensuring the sizeof both union members are strictly equal.
Your best bet IMO is to forget using a structure at all, or if so use a union of a structure and a byte array. Then have your access functions use the byte array, for which it's easy to calculate offsets and so on. Doing this you are I believe guaranteed to access the bytes you want.
The downside is that you will have to re-assemble 16 and 32 bit values, if any, found in the structure, and do so bearing in mind any endianness issues. If you know that any such values are found on 16 or 32 bit address boundaries, you could use union'd short and long arrays to do this which would probably be best, though somewhat opaque.
HTH
Maybe I found a solution.
I can create an API like the following:
uint8_t getValue(valueTypeEnum typeOfValue, X * data)
{
uint8_t bitmask;
uint8_t * field = getBitmaskAndFieldPtr(typeOfValue, data, &bitmask);
return ((*field) & bitmask) >> ...;
}
void setValue(valueTypeEnum typeOfValue, X * data, uint8_t value)
{
uint8_t bitmask;
uint8_t * field = getBitmaskAndFieldPtr(typeOfValue, data, &bitmask);
*field = ((*field) & !bitmask) | (value << ...);
}
uint8_t * getBitmaskAndFieldPtr(valueTypeEnum typeOfValue, X * data, uint8_t * bitmask)
{
uint8_t * fieldPtr = 0;
switch(typeOfValue)
{
case A:
{
*bitmask = 0x0F; // I can use also a dictionary here
fieldPtr = &data.AB;
break;
}
case B:
{
*bitmask = 0xF0; // I can use also a dictionary here
fieldPtr = &data.AB;
break;
}
case C:
{
*bitmask = 0xC0; // I can use also a dictionary here
fieldPtr = &data.CDE;
break;
}
...
}
return fieldPtr;
}
I know that the switch-case is not so elegant and require to be updated each time the structure is changed, but I don't see an automatic way (maybe using reflection) to solve this problem.

Specifying bit size of array elements in a struct

Now I have a struct looking like this:
struct Struct {
uint8_t val1 : 2;
uint8_t val2 : 2;
uint8_t val3 : 2;
uint8_t val4 : 2;
} __attribute__((packed));
Is there a way to make all the vals a single array? The point is not space taken, but the location of all the values: I need them to be in memory without padding, and each occupying 2 bits. It's not important to have array, any other data structure with simple access by index will be ok, and not matter if it's plain C or C++. Read/write performance is important - it should be same (similar to) as simple bit operations, which are used now for indexed access.
Update:
What I want exactly can be described as
struct Struct {
uint8_t val[4] : 2;
} __attribute__((packed));
No, C only supports bitfields as structure members, and you cannot have arrays of them. I don't think you can do:
struct twobit {
uint8_t val : 2;
} __attribute__((packed));
and then do:
struct twobit array[32];
and expect array to consist of 32 2-bit integers, i.e. 8 bytes. A single char in memory cannot contain parts of different structs, I think. I don't have the paragraph and verse handy right now though.
You're going to have to do it yourself, typically using macros and/or inline functions to do the indexing.
You have to manually do the bit stuff that's going on right now:
constexpr uint8_t get_mask(const uint8_t n)
{
return ~(((uint8_t)0x3)<<(2*n));
}
struct Struct2
{
uint8_t val;
inline void set_val(uint8_t v,uint8_t n)
{
val = (val&get_mask(n))|(v<<(2*n));
}
inline uint8_t get_val(uint8_t n)
{
return (val&~get_mask(n))>>(2*n);
}
//note, return type only, assignment WONT work.
inline uint8_t operator[](uint8_t n)
{
return get_val(n);
}
};
Note that you may be able to get better performance if you use actual assembly commands.
Also note that, (almost) no matter what, a uint8_t [4] will have better performance than this, and a processor aligned type (uint32_t) may have even better performance.

Define an enum to be smaller than one byte / Why is this struct larger than one byte?

I'd like to define an enum to be smaller than one byte while maintaining type safety.
Defining an enum as:
enum MyEnum : unsigned char
{
i ,j, k, w
};
I can shrink it to one byte, however I'd like to make it use only 2 bits since I will at most have 4 values in it. Can this be done?
In my struct where I use the enum, the following does not work
struct MyStruct
{
MyEnum mEnum : 2; // This will be 4 bytes in size
};
Thanks!
Update:
The questions comes from this scenario:
enum MyEnum : unsigned char
{
i ,j, k, w
};
struct MyStruct
{
union
{
signed int mXa:3;
unsigned int mXb:3;
};
union
{
signed int mYa:3;
unsigned int mYb:3;
};
MyEnum mEnum:2;
};
sizeof(MyStruct) is showing 9 bytes. Ideally I'd like the struct to be 1 bytes in size.
Update for implemented solution:
This struct is one byte and offers the same functionality and type safety:
enum MyEnum :unsigned char
{
i,j,k,w
};
struct MyStruct
{
union
{
struct { MyEnum mEnum:2; char mXa:3; char mXb:3;};
struct { MyEnum mEnum:2; unsigned char mYa:3; unsigned char mYb:3;};
};
};
As per standard definition, a types sizeof must be at least 1 byte. This is the smallest addressable unit of memory.
The feature of bitfields you are mentioning allows to define members of structures to have smaller sizes, but the struct itself may not be because
It must be of at least 1 byte too
Alignment considerations might need it to be even bigger
additionally you may not take the address of bitfield members, since as said above, a byte is the smallest addressable unit of memory (You can already see that by sizeofactually returning the number of bytes, not bits, so if you expected less than CHAR_BIT bits, sizeof would not even be able to express it).
bitfields can only share space if they use the same underlying type. And any unused bits are actually left unused; if the sum of bits in an unsigned int bitfield is 3 bits, it still takes 4 bytes total. Since both enums have unsigned int members, they're both 4 bytes, but since they are bitfields, they have an alignment of one. So the first enum is 4 bytes, and the second is four bytes, then the MyEnum is 1 byte. Since all of those have an alignment of one, no padding is needed.
Unfortunately, union doesn't really work with bitfields really at all. Bitfields are for integer types only. The most I could get your data to without serious redesign is 3 bytes: http://coliru.stacked-crooked.com/view?id=c6ad03c93d7893ca2095fabc7f72ca48-e54ee7a04e4b807da0930236d4cc94dc
enum MyEnum : unsigned char
{
i ,j, k, w
};
union MyUnion
{
signed char ma:3; //char to save memory
unsigned char mb:3;
};
struct MyStruct
{
MyUnion X;
MyUnion Y;
MyEnum mEnum;
}; //this structure is three bytes
In the complete redesign category, you have this: http://coliru.stacked-crooked.com/view?id=58269eef03981e5c219bf86167972906-e54ee7a04e4b807da0930236d4cc94dc
No. C++ defines "char" to be the smallest addressable unit of memory for the platform. You can't address 2 bits.
Bit packing 'Works for me'
#include <iostream>
enum MyEnum : unsigned char
{
i ,j, k, w
};
struct MyStruct
{
MyEnum mEnum : 2;
unsigned char val : 6;
};
int main()
{
std::cout << sizeof(MyStruct);
}
prints out 1. How / what are you measuring?
Edit: Live link
Are you doing something like having a pointer as the next thing in the struct? In which case, you'll have 30bits of dead space as pointers must be 4 byte aligned on most 32bit systems.
Edit: With your updated example, its the unions which are breaking you
enum MyEnum : unsigned char
{
i ,j, k, w
};
struct MyStruct
{
unsigned char mXb:3;
unsigned char mYb:3;
MyEnum mEnum:2;
};
Has size 1. I'm not sure how unions and bit packing work together though, so I'm no more help.

Warning: cast increases required alignment

I'm recently working on this platform for which a legacy codebase issues a large number of "cast increases required alignment to N" warnings, where N is the size of the target of the cast.
struct Message
{
int32_t id;
int32_t type;
int8_t data[16];
};
int32_t GetMessageInt(const Message& m)
{
return *reinterpret_cast<int32_t*>(&data[0]);
}
Hopefully it's obvious that a "real" implementation would be a bit more complex, but the basic point is that I've got data coming from somewhere, I know that it's aligned (because I need the id and type to be aligned), and yet I get the message that the cast is increasing the alignment, in the example case, to 4.
Now I know that I can suppress the warning with an argument to the compiler, and I know that I can cast the bit inside the parentheses to void* first, but I don't really want to go through every bit of code that needs this sort of manipulation (there's a lot because we load a lot of data off of disk, and that data comes in as char buffers so that we can easily pointer-advance), but can anyone give me any other thoughts on this problem? I mean, to me it seems like such an important and common option that you wouldn't want to warn, and if there is actually the possibility of doing it wrong then suppressing the warning isn't going to help. Finally, can't the compiler know as I do how the object in question is actually aligned in the structure, so it should be able to not worry about the alignment on that particular object unless it got bumped a byte or two?
One possible alternative might be:
int32_t GetMessageInt(const Message& m)
{
int32_t value;
memcpy(&value, &(data[0]), sizeof(int32_t));
return value;
}
For x86 architecture, the alignment isn't going to matter that much, it's more a performance issue that isn't really relevant for the code you have provided. For other architectures (eg MIPS) misaligned accesses cause CPU exceptions.
OK, here's another alternative:
struct Message
{
int32_t id;
int32_t type;
union
{
int8_t data[16];
int32_t data_as_int32[16 * sizeof(int8_t) / sizeof(int32_t)];
// Others as required
};
};
int32_t GetMessageInt(const Message& m)
{
return m.data_as_int32[0];
}
Here's variation on the above that includes the suggestions from cpstubing06:
template <size_t N>
struct Message
{
int32_t id;
int32_t type;
union
{
int8_t data[N];
int32_t data_as_int32[N * sizeof(int8_t) / sizeof(int32_t)];
// Others as required
};
static_assert((N * sizeof(int8_t) % sizeof(int32_t)) == 0,
"N is not a multiple of sizeof(int32_t)");
};
int32_t GetMessageInt(const Message<16>& m)
{
return m.data_as_int32[0];
}
// Runtime size checks
template <size_t N>
void CheckSize()
{
assert(sizeof(Message<N>) == N * sizeof(int8_t) + 2 * sizeof(int32_t));
}
void CheckSizes()
{
CheckSize<8>();
CheckSize<16>();
// Others as required
}