Creating a constructor for a struct (union) in C++

What is the best way to create a constructor for a struct (which has a union member; does that matter?) that converts a uint8_t into the struct?
Here is my example to clarify more:
struct twoSixByte
{
union {
uint8_t fullByte;
struct
{
uint8_t twoPart : 2;
uint8_t sixPart : 6;
} bits;
};
};
uint32_t extractByte(twoSixByte mixedByte){
return mixedByte.bits.twoPart * mixedByte.bits.sixPart;
}
uint8_t tnum = 182;
print(extractByte(tnum)); // must print 2 * 54 = 108
P.S.
From the comments and answers, I learned that type punning through a union is not allowed in C++ (reading a member other than the one last written is undefined behavior).
The solutions given are a little complicated, especially when there are lots of these structures in the code. There are even situations where a byte is divided into more than two bit parts. Giving up unions and using bitsets and bit shifting instead adds a lot of burden to the code.
Instead, I settled on a much simpler solution: I convert the type before passing it to the function. Here is the fixed code:
struct twoSixByte
{
union {
uint8_t fullByte;
struct
{
uint8_t twoPart : 2;
uint8_t sixPart : 6;
} bits;
};
};
uint32_t extractByte(twoSixByte mixedByte){
return mixedByte.bits.twoPart * mixedByte.bits.sixPart;
}
uint8_t tnum = 182;
twoSixByte mixedType;
mixedType.fullByte = tnum;
print(extractByte(mixedType)); // must print 2 * 54 = 108

Unless there is a pressing need for you to use a union, don't use it. Simplify your class to:
struct twoSixByte
{
twoSixByte(uint8_t in) : twoPart((in & 0xC0) >> 6), sixPart(in & 0x3F) {}
uint8_t twoPart : 2;
uint8_t sixPart : 6;
};
If there is a need to get the full byte, you can use:
uint8_t fullByte(twoSixByte mixedByte)
{
return ((mixedByte.twoPart << 6) | mixedByte.sixPart);
}

You could avoid the union and type punning and use a struct with the relevant member function. Note that we don't need a constructor if the struct is regarded as an aggregate to be initialized:
#include <cstdint>
struct twoSixByte {
uint8_t fullByte; // no constructor needed, initializing as an aggregate
uint32_t extractByte(){
return ((fullByte & 0b1100'0000) >> 6) * (fullByte & 0b0011'1111);
}
};
int main()
{
twoSixByte tnum{182};
auto test = tnum.extractByte(); // test == 2 * 54 == 108
}

Related

Standard way of overlay flexible array member

The server sends the data as packed structures, so all I need to do to decode it is overlay a structure pointer on the buffer. However, one of the structures contains a dynamically sized array, and I learned that a flexible array member is not a standard C++ feature. How can I do this in standard C++ without copying the data, as a vector would?
// on wire format: | field a | length | length of struct b |
// the structs are defined packed
__pragma(pack(1))
struct B {
//...
};
struct Msg {
int32_t a;
uint32_t length;
B *data; // how to declare this?
};
__pragma(pack())
char *buf = readIO();
// overlay, without copy and assignments of each field
const Msg *m = reinterpret_cast<const Msg *>(buf);
// access m->data[i] from 0 to length
The common way to do this in C was to declare data as an array of length one as the last struct member. You then allocate the space needed as if the array was larger.
Seems to work fine in C++ as well. You should perhaps wrap access to the data in a span or equivalent, so the implementation details don't leak outside your class.
#include <cstddef>
#include <span>
struct B {
float x;
float y;
};
struct Msg {
int a;
std::size_t length;
B data[1];
};
char* readIO()
{
constexpr int numData = 3;
char* out = new char[sizeof(Msg) + sizeof(B) * (numData - 1)];
return out;
}
int main(){
char *buf = readIO();
// overlay, without copy and assignments of each field
const Msg *m = reinterpret_cast<const Msg *>(buf);
// access m->data[i] from 0 to length
std::span<const B> data(m->data, m->length);
for(auto& b: data)
{
// do something
}
return 0;
}
https://godbolt.org/z/EoMbeE8or
A standard solution is to not represent the array as a member of the message, but rather as a separate object.
struct Msg {
int a;
size_t length;
};
const Msg& m = *reinterpret_cast<const Msg*>(buf);
span<const B> data = {
reinterpret_cast<const B*>(buf + sizeof(Msg)),
m.length,
};
Note that reinterpretation / copying of bytes is not portable between systems with different representations (byte endianness, integer sizes, alignments, subobject packing etc.), and same representation is typically not something that can be assumed in network communication.
// on wire format: | field a | length | length of struct b |
You can't overlay the struct, because you can't guarantee that the binary representation of Msg will match the on-wire format. Also, int is only guaranteed to be at least 16 bits wide, and the size of size_t varies by architecture.
Write actual accessors to the data and use fixed-width integer types. This only works if the buffer actually points to a properly aligned region. The approach also lets you write assertions and throw exceptions when something goes wrong (for example, you can throw on out-of-bounds access to the array).
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
struct Msg {
constexpr static size_t your_required_alignment = alignof(uint32_t);
char *buf;
Msg (char *buf) : buf(buf) {
assert((uintptr_t)buf % your_required_alignment == 0);
}
int32_t& get_a() { return *reinterpret_cast<int32_t*>(buf); }
uint32_t& length() { return *reinterpret_cast<uint32_t *>(buf + sizeof(int32_t)); }
struct Barray {
char *buf;
Barray(char *buf) : buf(buf) {}
int16_t &operator[](size_t idx) {
return *reinterpret_cast<int16_t*>(buf + idx * sizeof(int16_t));
}
};
Barray data() {
return buf + sizeof(int32_t) + sizeof(uint32_t);
}
};
int main() {
Msg msg(readIO());
std::cout << msg.get_a() << msg.length();
msg.data()[1] = 5;
// or maybe even implement straight operator[]:
// msg[1] = 5;
}
If the data do not point to a properly aligned region, you have to copy the data; there is no way to access them through any type other than char.

C++ Same Name Different Type

I have an anonymous union which contain a uint8_t and a bit field struct.
union
{
struct
{
uint8_t ID : 1;
uint8_t C : 4;
uint8_t RM : 1;
uint8_t SWRST : 1;
uint8_t STS : 1;
} ControlByte1;
uint8_t ControlByte1;
};
As you can see, both the uint8_t and the struct instance are called ControlByte1. What I'm trying to do is have ControlByte1 be a uint8_t while still being able to access its bits with ControlByte1.ID, ControlByte1.C, ControlByte1.RM, ControlByte1.SWRST and ControlByte1.STS. As expected, when I write ControlByte1 = ...;, ... = ControlByte1; or function(ControlByte1);, the compiler has no way of knowing whether I mean ControlByte1 as a uint8_t or as a struct. Is there any way to do this?
Thanks!
Edit 1:
I would just like them to have the same name because ControlByte1 is a one-byte I2C control byte which contains the ID, C, RM, SWRST and STS fields. I want to be able to access ControlByte1 as a byte and also access its fields individually.
Edit 2:
I may have found a way by overloading the implicit type casting operator.
struct
{
uint8_t ID : 1;
uint8_t C : 4;
uint8_t RM : 1;
uint8_t SWRST : 1;
uint8_t STS : 1;
operator uint8_t*() {return (uint8_t *)this;}
operator uint8_t() {return *(uint8_t *)this;}
} ControlByte1;
I'm not sure if I'm doing the overloading correctly but it does compile, I haven't tried running my code yet.
uint8_t var1 = ControlByte1; // Compile
uint8_t *ptr1 = ControlByte1; // Compile
uint8_t *ptr2 = &ControlByte1; // Does not compile!!!
var1 = ControlByte1; // Compile
ptr1 = ControlByte1; // Compile
ptr2 = &ControlByte1; // Does not compile!!!
I'm guessing that &ControlByte1 does not compile because I need to overload the & operator too? There is no need to overload assignment operator because I do not assign anything directly to ControlByte1.
Edit 3:
This makes more sense.
struct
{
uint8_t ID : 1;
uint8_t C : 4;
uint8_t RM : 1;
uint8_t SWRST : 1;
uint8_t STS : 1;
// For implicit casting to uint8_t
operator uint8_t() const
{
return(*(uint8_t *)this);
}
// For implicit casting to uint8_t *
uint8_t * operator &() const
{
return((uint8_t *)this);
}
} ControlByte1;

Is there a way I can use a 2-bit size type instead of an int, by just plugging in the new type name instead of int?

I have an application where I need to save as much memory as possible. I need to store a large amount of data that can take exactly three possible values. So, I have been trying to use a 2-bit type.
One possibility is using bit fields. I could do
struct myType {
uint8_t twoBits : 2;
};
This is a suggestion from this thread.
However, everywhere where I have used int variables prior to this, I would need to change their usage by appending a .twoBits. I checked if I can create a bit field outside of a struct, such as
uint8_t twoBits : 2;
but this thread says it is not possible. However, that thread is specific to C, so I am not sure whether it applies to C++.
Is there a clean way I can define a 2-bit type, so that by simply replacing int with my type, I can run the program correctly? Or is using bit fields the only possible way?
The CPU, and thus the memory, the bus, and the compiler, work only with bytes or groups of bytes. There is no way to store a 2-bit type without also storing the remaining 6 bits.
What you can do is define a struct that only uses some bits, but be aware that this alone will not save memory.
You can pack several x-bit fields into a struct, as you already know, or you can use bit operations to pack and unpack them into an integer type.
Is there a clean way I can define a 2-bit type, so that by simply
replacing int with my type, I can run the program correctly? Or is
using bit fields the only possible way?
You can try to make the struct as transparent as possible by providing implicit conversion operators and constructors:
#include <cstddef>
#include <cstdint>
#include <iostream>
template <std::size_t N, typename T = unsigned>
struct bit_field {
T rep : N;
operator T() { return rep; }
bit_field(T i) : rep{ i } { }
bit_field() = default;
};
using myType = bit_field<2, std::uint8_t>;
int main() {
myType mt;
mt = 3;
// cast so the value prints as a number; std::uint8_t would print as a character
std::cout << static_cast<unsigned>(mt) << "\n";
}
So objects of type myType behave somewhat like real 2-bit unsigned integers, despite occupying more than 2 bits.
Of course, the residual bits are unused, but as single bits are not addressable on most systems, this is the best way to go.
I'm not convinced that you will save anything with your existing structure, as the surrounding structure still gets rounded up to a whole number of bytes.
You can write the following to squeeze 4 2-bit counters into 1 byte, but as you say, you have to name them myInst.f0:
struct MyStruct
{
ubyte_t f0:2,
f1:2,
f2:2,
f3:2;
} myInst;
In C and C++98, you can declare this struct anonymously, but this usage is deprecated. You can then access the 4 values directly by name:
struct
{ // deprecated!
ubyte_t f0:2,
f1:2,
f2:2,
f3:2;
};
You could declare some sort of template that wraps a single instance with an operator int and operator =(int), and then define a union to put the 4 instances at the same location, but again anonymous unions are deprecated. However you could then declare references to your 4 values, but then you are paying for the references, which are bigger than the bytes you were trying to save!
template <class Size,int offset,int bits>
struct Bitz
{
Size ignore : offset,
value : bits;
operator Size()const { return value; }
Size operator = (Size val) { return (value = val); }
};
template <class Size,int bits>
struct Bitz0
{ // I know this can be done better
Size value : bits;
operator Size()const { return value; }
Size operator = (Size val) { return (value = val); }
};
static union
{ // Still deprecated!
Bitz0<char, 2> F0;
Bitz<char, 2, 2> F1;
Bitz<char, 4, 2> F2;
Bitz<char, 6, 2> F3;
};
union
{
Bitz0<char, 2> F0;
Bitz<char, 2, 2> F1;
Bitz<char, 4, 2> F2;
Bitz<char, 6, 2> F3;
} bitz;
Bitz0<char, 2>& F0 = bitz.F0; /// etc...
Alternatively, you could simply declare macros to replace the dotted name with a simple name (how 1970s):
#define myF0 myInst.f0
Note that you can't pass bitfields by reference or pointer, as they don't have a byte address, only by value and assignment.
A very minimal example of a bit array with a proxy class that looks (for the most part) like you were dealing with an array of very small integers.
#include <cstdint>
#include <iostream>
#include <vector>
class proxy
{
uint8_t & byte;
unsigned int shift;
public:
proxy(uint8_t & byte,
unsigned int shift):
byte(byte),
shift(shift)
{
}
proxy(const proxy & src):
byte(src.byte),
shift(src.shift)
{
}
proxy & operator=(const proxy &) = delete;
proxy & operator=(unsigned int val)
{
if (val <=3)
{
uint8_t wipe = 3 << shift;
byte &= ~wipe;
byte |= val << shift;
}
// might want to throw std::out_of_range here
return *this;
}
operator int() const
{
return (byte >> shift) &0x03;
}
};
Proxy holds a reference to a byte and knows how to extract two specific bits and look like an int to anyone who uses it.
If we wrap an array of bits packed into bytes with a class that returns this proxy object wrapped around the appropriate byte, we now have something that looks a lot like an array of very small ints.
class bitarray
{
size_t size;
std::vector<uint8_t> data;
public:
bitarray(size_t size):
size(size),
data((size + 3) / 4)
{
}
proxy operator[](size_t index)
{
return proxy(data[index/4], (index % 4) * 2);
}
};
If you want to extend this and go the distance, Writing your own STL Container should help you make a fully armed and operational bit-packed array.
There's room for abuse here. The caller can hold onto a proxy and get up to whatever manner of evil this allows.
Use of this primitive example:
int main()
{
bitarray arr(10);
arr[0] = 1;
arr[1] = 2;
arr[2] = 3;
arr[3] = 1;
arr[4] = 2;
arr[5] = 3;
arr[6] = 1;
arr[7] = 2;
arr[8] = 3;
arr[9] = 1;
std::cout << arr[0] << std::endl;
std::cout << arr[1] << std::endl;
std::cout << arr[2] << std::endl;
std::cout << arr[3] << std::endl;
std::cout << arr[4] << std::endl;
std::cout << arr[5] << std::endl;
std::cout << arr[6] << std::endl;
std::cout << arr[7] << std::endl;
std::cout << arr[8] << std::endl;
std::cout << arr[9] << std::endl;
}
Simply, build on top of bitset, something like:
#include<bitset>
#include<cstdint>
#include<cstdlib>
#include<exception>
#include<iostream>
using namespace std;
template<int N>
class mydoublebitset
{
public:
uint_least8_t operator[](size_t index)
{
return 2 * b[index * 2 + 1] + b[index * 2 ];
}
void set(size_t index, uint_least8_t store)
{
switch (store)
{
case 3:
b[index * 2] = 1;
b[index * 2 + 1] = 1;
break;
case 2:
b[index * 2] = 0;
b[index * 2 + 1] = 1;
break;
case 1:
b[index * 2] = 1;
b[index * 2 + 1] = 0;
break;
case 0:
b[index * 2] = 0;
b[index * 2 + 1] = 0;
break;
default:
throw exception();
}
}
private:
bitset<N * 2> b;
};
int main()
{
mydoublebitset<12> mydata;
mydata.set(0, 0);
mydata.set(1, 2);
mydata.set(2, 2);
cout << (unsigned int)mydata[0] << (unsigned int)mydata[1] << (unsigned int)mydata[2] << endl;
system("pause");
return 0;
}
Basically, use a bitset with twice the size and index it accordingly. It's simpler and as memory efficient as you require.

Getting a int32_t or a int64_t value from a char array

An operation I need to perform requires me to get one int32_t value and 2 int64_t values from a char array
The first 4 bytes of the char array contain the int32_t value, the next 8 bytes contain the first int64_t value, and the next 8 bytes contain the second. I can't figure out how to get to these values. I have tried:
int32_t firstValue = (int32_t)charArray[0];
int64_t firstValue = (int64_t)charArray[1];
int64_t firstValue = (int64_t)charArray[3];
int32_t *firstArray = reinterpet_cast<int32_t*>(charArray);
int32_t num = firstArray[0];
int64_t *secondArray = reinterpet_cast<int64_t*>(charArray);
int64_t secondNum = secondArray[0];
I'm just grasping at straws. Any help is appreciated.
Quick and dirty solution:
int32_t value1 = *(int32_t*)(charArray + 0);
int64_t value2 = *(int64_t*)(charArray + 4);
int64_t value3 = *(int64_t*)(charArray + 12);
Note that this could potentially cause misaligned memory accesses. So it may not always work.
A more robust solution that doesn't violate strict-aliasing and won't have alignment issues:
int32_t value1;
int64_t value2;
int64_t value3;
memcpy(&value1,charArray + 0,sizeof(int32_t));
memcpy(&value2,charArray + 4,sizeof(int64_t));
memcpy(&value3,charArray + 12,sizeof(int64_t));
Try this:
typedef struct {
int32_t firstValue;
int64_t secondValue;
int64_t thirdValue;
} hd;
hd* p = reinterpret_cast<hd*>(charArray);
Now you can access the values, e.g. p->firstValue.
EDIT: Make sure the struct is packed on byte boundaries; e.g. with Visual Studio you write #pragma pack(1) before the struct.
To avoid any alignment concerns, the ideal solution is to copy the bytes out of the buffer into the target objects. To do this, you can use some helpful utilities:
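#include <algorithm>
typedef unsigned char* byte_iterator;
typedef unsigned char const* const_byte_iterator;
template <typename T>
byte_iterator begin_bytes(T& x)
{
return reinterpret_cast<byte_iterator>(&x);
}
template <typename T>
byte_iterator end_bytes(T& x)
{
return reinterpret_cast<byte_iterator>(&x + 1);
}
// Takes void const* so that char, signed char and unsigned char buffers all work.
template <typename T>
T safe_reinterpret_as(void const* p)
{
const_byte_iterator const it = static_cast<const_byte_iterator>(p);
T o;
std::copy(it, it + sizeof(T), ::begin_bytes(o)); // destination must be writable
return o;
}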
Then your problem is rather simple:
int32_t firstValue = safe_reinterpret_as<int32_t>(charArray);
int64_t secondValue = safe_reinterpret_as<int64_t>(charArray + 4);
int64_t thirdValue = safe_reinterpret_as<int64_t>(charArray + 12);
If charArray is a 1-byte char type, then you need to use offsets 4 and 12 for your 2nd and 3rd values.

Data structures with different sized bit fields

If I have a requirement to create a data structure that has the following fields:
16-bit Size field
3-bit Version field
1-bit CRC field
How would I code this struct? I know the Size field would be an unsigned short type, but what about the other two fields?
First, unsigned short isn't guaranteed to be only 16 bits, just at least 16 bits.
You could do this:
struct Data
{
unsigned short size : 16;
unsigned char version : 3;
unsigned char crc : 1;
};
Assuming you want no padding between the fields, you'll have to issue the appropriate instructions to your compiler. With gcc, you can decorate the structure with __attribute__((packed)):
struct Data
{
// ...
} __attribute__((packed));
In Visual C++, you can use #pragma pack:
#pragma pack(push, 1)
struct Data
{
// ...
};
#pragma pack(pop)
The following class implements the fields you are looking for as a kind of bitfields.
struct Identifier
{
unsigned int a; // only bits 0-19 are used
unsigned int getSize() const {
return a & 0xFFFF; // access bits 0-15
}
unsigned int getVersion() const {
return (a >> 16) & 7; // access bits 16-18
}
unsigned int getCrc() const {
return (a >> 19) & 1; // access bit 19
}
void setSize(unsigned int size) {
a = a - (a & 0xFFFF) + (size & 0xFFFF);
}
void setVersion(unsigned int version) {
a = a - (a & (7<<16)) + ((version & 7) << 16);
}
void setCrc(unsigned int crc) {
a = a - (a & (1<<19)) + ((crc & 1) << 19);
}
};