Weird Data Types - C++

I know this sounds like a silly question, but I would like to know whether it is possible in any way to make a custom variable size like this, rather than using the plain 8-, 16-, 32- and 64-bit integers:
uint15_t var; //as an example
I did some research online and found nothing that works for all sizes (only things like 12-bit and 24-bit). (I was also wondering whether bit-fields would work with other data sizes.)
And yes, I know you could use a 16-bit integer to store a 15-bit value, but I just want to know whether something like this is possible.
How can I make and implement custom integer sizes?

Inside a struct or class, you can use the bit-fields feature to declare integers of the size you want for a member variable:
unsigned int var : 15;
... it won't be very CPU-efficient (since the compiler will have to generate bitwise-operations on most accesses to that variable) but it will give you the desired 15-bit behavior.
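For example, a minimal compilable sketch (the struct and member names here are just for illustration):
#include <cassert>

struct Packet
{
    unsigned int var : 15; // 15-bit member: holds 0..32767
};

int main()
{
    Packet p;
    p.var = 32767; // maximum 15-bit value
    p.var += 1;    // unsigned bit-fields wrap around: 32768 mod 2^15 == 0
    assert(p.var == 0);
    return 0;
}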

To be able to use your bit-field int as a normal int, but still get the behavior of a 15-bit int, you can do it like this:
#include <cassert>
#include <cstddef>

template<typename type_t, std::size_t N>
struct bits_t final
{
    bits_t() = default;
    ~bits_t() = default;

    // intentionally implicit, so it converts automatically from the underlying type
    bits_t(const type_t& value) :
        m_value{ value }
    {
    }

    // implicit conversion back to the underlying type
    operator type_t() const
    {
        return m_value;
    }

private:
    type_t m_value : N;
};

int main()
{
    bits_t<int, 15> value;
    value = 16383; // 0x3FFF, the largest positive 15-bit value
    assert(value == 16383);

    // overflow now happens at bit 15 :)
    value = 16384; // 0x4000: the sign bit of the 15-bit field is set
    assert(value == -16384);
    return 0;
}

The bit-fields feature will do the trick, as long as the variable is a member of a struct or class:
uint32_t customInt : 15;

You can try using bit-fields, as many people have mentioned. However, bit-fields don't have a proper type of their own. If you want to make your arbitrary-sized integers object-oriented, you can stuff the bit-field into a template:
template <int size> struct my_uint
{
    uint32_t value : size;
};
typedef my_uint<13> uint13_t; // some people use "using" syntax to do this
typedef my_uint<14> uint14_t;
typedef my_uint<15> uint15_t;
However, you have now lost the arithmetic operators, and you have to implement (overload) them yourself; a sketch of one such overload follows below. You have to ask yourself many questions about what you really want to do with these new types:
Do you want to overload operators like +, *, etc? Which ones?
Do you want to support arrays?
What is the maximum size you want to support? In my example, it's 32.
Do you want to support implicit constructors, e.g. uint15_t(uint32_t)?
How do you want to handle overflow?
There is no way to make your new types behave like built-in types - you can come close but cannot quite do it. That is, if you write a big program where you work with uint15_t and later you decide to switch to uint16_t, there will be subtle changes caused by uint16_t being a built-in type (e.g. consider rules about implicit conversions).
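As an illustration of the kind of boilerplate involved, here is a minimal sketch of just one of those overloads for the template above (truncate-on-assign is one possible overflow policy, not the only one):
#include <cstdint>

template <int size> struct my_uint
{
    uint32_t value : size;
};

// One way to overload operator+: add the raw values and let the
// bit-field truncate the result back to `size` bits on assignment.
template <int size>
my_uint<size> operator+(my_uint<size> a, my_uint<size> b)
{
    my_uint<size> result;
    result.value = a.value + b.value; // implicitly truncated to `size` bits
    return result;
}

int main()
{
    my_uint<15> x{16383};
    my_uint<15> y{1};
    my_uint<15> z = x + y; // 16384 doesn't fit in 15 bits, so this wraps to 0
    return z.value == 0 ? 0 : 1;
}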

Related

Filling an std::vector with raw data

I need to fill a vector with raw data, sometimes 2 bytes, sometimes 8... I ended up with this template function:
#include <cstdint>
#include <vector>

template <typename T>
void fillVector(std::vector<uint8_t>& dest, T t)
{
    auto ptr = reinterpret_cast<uint8_t*>(&t);
    dest.insert(dest.end(), ptr, ptr + sizeof(t));
}
With this I can fill the vector like so:
fillVector<uint32_t>(dst, my32bitdata);
fillVector<uint16_t>(dst, my16bitdata);
I was wondering if something similar already exists in the standard library?
No, there's nothing in the standard library to achieve what you are after. So your solution is pretty much what you can currently go with (assuming your goal is to do some form of serialization).
The only point of improvement is that you are assuming uint8_t is a type that may be used to alias an object and inspect its bytes. That need not be the case. The only such types in C++11 are char and unsigned char. While uint8_t usually aliases the latter on most modern architectures, that's not a hard requirement; it could alias a platform-specific 8-bit unsigned integer type (the merits of that are outside the scope of this question). So to be standard-conforming, either guard against it:
static_assert(std::is_same<unsigned char, std::uint8_t>::value, "Oops!");
Or use your own alias for a valid "byte" type:
namespace myapp { using byte = unsigned char; }
and deal in std::vector<myapp::byte>.
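Putting the two together, a sketch of the guarded version of the original helper (the payload values here are made up):
#include <cstdint>
#include <type_traits>
#include <vector>

static_assert(std::is_same<unsigned char, std::uint8_t>::value, "Oops!");

template <typename T>
void fillVector(std::vector<std::uint8_t>& dest, T t)
{
    auto ptr = reinterpret_cast<std::uint8_t*>(&t);
    dest.insert(dest.end(), ptr, ptr + sizeof(t));
}

int main()
{
    std::vector<std::uint8_t> dst;
    fillVector<std::uint32_t>(dst, 0xDEADBEEFu); // appends 4 bytes
    fillVector<std::uint16_t>(dst, 0xCAFEu);     // appends 2 more
    return dst.size() == 6 ? 0 : 1;
}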

Portable bit fields for Handles

I want to use and store "Handles" to data in an object buffer to reduce allocation overhead. The handle is simply an index into an array of objects. However, I need to detect use-after-reallocation, as this could slip in quite easily. The common approach seems to be using bit fields. However, this leads to 2 problems:
Bit fields are implementation defined
Bit shifting is not portable across big/little endian machines.
What I need:
Store handle to file (file handler can manage either integer types (byte swapping) or byte arrays)
Store 2 values in the handle with minimum space
What I got:
template<class T_HandleDef, typename T_Storage = uint32_t>
struct Handle
{
    typedef T_HandleDef HandleDef;
    typedef T_Storage Storage;
    Handle(): handle_(0){}
private:
    const T_Storage handle_;
};

template<unsigned T_numIndexBits = 16, typename T_Tag = void>
struct HandleDef{
    static const unsigned numIndexBits = T_numIndexBits;
};

template<class T_Handle>
struct HandleAccessor{
    typedef typename T_Handle::Storage Storage;
    typedef typename T_Handle::HandleDef HandleDef;
    static const unsigned numIndexBits = HandleDef::numIndexBits;
    static const unsigned numMagicBits = sizeof(Storage) * 8 - numIndexBits;
    /// "Magic" struct that splits the handle into values
    union HandleData{
        struct
        {
            Storage index : numIndexBits;
            Storage magic : numMagicBits;
        };
        T_Handle handle;
    };
};
A usage would be for example:
typedef Handle<HandleDef<24> > FooHandle;

FooHandle Create(unsigned idx, unsigned m){
    HandleAccessor<FooHandle>::HandleData data;
    data.index = idx;
    data.magic = m;
    return data.handle;
}
My goal was to keep the handle as opaque as possible, add a bool check but nothing else. Users of the handle should not be able to do anything with it but passing it around.
So problems I run into:
Type-punning through the union is UB -> replace its T_Handle by Storage and add a ctor to Handle from Storage
How does the compiler lay out the bit field? I fill the whole union/type, so there should be no padding. So probably the only thing that can differ is which member comes first, depending on endianness, correct?
How can I store handle_ to a file and load it on a machine with possibly different endianness and still have index and magic be correct? I think I can store the containing Storage 'endian-correct' and get correct values, IF both members occupy exactly half the space (2 shorts in a uint), but I always want more space for the index than for the magic value.
Note: There are already questions about bitfields and unions. Summary:
Bitfields may have unexpected padding (impossible here, as the whole type is occupied)
Order of "members" depends on the compiler (only 2 possible ways here; it should be safe to assume the order depends entirely on endianness, so this may or may not actually help here)
A specific binary layout of bits can be achieved by manual shifting (or e.g. wrappers: http://blog.codef00.com/2014/12/06/portable-bitfields-using-c11/) -> not an answer here, as I also need a specific layout of the values IN the bitfield. So I'm not sure what I get if I e.g. create a handle as handle = (magic << numIndexBits) | index and save/load it as binary (no endianness conversion). I'm missing a big-endian machine for testing.
Note: No C++11, but boost is allowed.
The answer is pretty simple (based on another question I forgot the link to, and comments by @Jeremy Friesner):
As "numbers" are already an abstraction in C++, one can be sure to always get the same value representation when the variable is in a CPU register (i.e. whenever it is used for anything calculation-like). Bit shifts in C++ are also defined in an endian-independent way: x << 1 always equals x * 2, regardless of byte order.
The only time one gets endianness problems is when saving to a file, sending/receiving over a network, or accessing the memory differently (e.g. via pointers...).
One cannot use C++ bitfields here, as one cannot be 100% sure about the order of the "entries". Bitfield containers might be OK, if they allow access to the data as a "number".
Safest is (still) using bit shifts, which are very simple in this case (only 2 values). During storing/serialization the number must then be stored in an endian-agnostic way, as sketched below.
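A minimal sketch of that approach (the 24-bit index width follows the FooHandle example above; fixing little-endian as the storage byte order is an arbitrary but consistent choice):
#include <cstdint>

static const unsigned numIndexBits = 24;
static const uint32_t indexMask = (1u << numIndexBits) - 1;

// Shifts operate on values, so these behave identically on
// big-endian and little-endian CPUs.
uint32_t makeHandle(uint32_t index, uint32_t magic)
{
    return (magic << numIndexBits) | (index & indexMask);
}

uint32_t indexOf(uint32_t handle) { return handle & indexMask; }
uint32_t magicOf(uint32_t handle) { return handle >> numIndexBits; }

// Endian-agnostic serialization: always emit bytes in one fixed order,
// byte by byte, instead of dumping raw memory.
void storeHandle(uint32_t handle, unsigned char out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = (unsigned char)((handle >> (8 * i)) & 0xFFu);
}

uint32_t loadHandle(const unsigned char in[4])
{
    uint32_t handle = 0;
    for (int i = 0; i < 4; ++i)
        handle |= (uint32_t)in[i] << (8 * i);
    return handle;
}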

How should I define a set number of bits in a typedef?

The Problem
I'm currently trying to simulate some firmware in C++11. In the firmware we have a fixed data length of 32 bits; we split these 32 bits into smaller packets, e.g. we have a packet with a size of 9 bits and another of 6, which get packed into the 32-bit word.
In C++ I want to ensure the data I type in is of those lengths. I don't care if I overflow, just that only the 9 bits are operated on or passed on to another function.
Ideally I'd like some simple typedef like:
only_18_bits some_value;
My Attempt
#include <cstdint>

struct sel_vals{
    int_fast32_t m_val : 18;
    int_fast8_t c_val : 5;
};
But this is a little annoying as I'd have to do this whenever I want to use it:
sel_vals somevals;
somevals.m_val = 5;
Seems a little verbose to me, plus I have to declare the struct first.
Also for obvious reasons, I can't just do something like:
typedef sel_vals.m_val sel_vals_m_t;
typedef std::vector<sel_vals_m_t>;
I could use std::bitset<9>, but whenever I want to do some maths I have to convert it to unsigned; it just gets a little messy. I want to avoid mess.
Any ideas?
I would suggest a wrapper facade, something along these lines:
#include <cstdint>

template<int nbits> class bits {
    uint64_t value;
    static const uint64_t mask = (~(uint64_t)0) >> (64 - nbits);
public:
    bits(uint64_t initValue = 0) : value(initValue & mask) {}

    bits &operator=(uint64_t newValue)
    {
        value = newValue & mask;
        return *this; // allow chained assignment; operator= must return a value
    }

    operator uint64_t() const { return value; }
};

bits<19> only_19_bits_of_precision;
With a little bit of work, you can define math operator overloads that directly operate on these templates.
With a little more work, you could extend this template to pick a smaller internal type, uint32_t, uint16_t, or uint8_t, whenever the nbits template parameter is small enough; one possible sketch follows.
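For that last point, one possible sketch (storage_t is a made-up name; this assumes C++11 alias templates and std::conditional):
#include <cstdint>
#include <type_traits>

// Pick the smallest unsigned type wide enough for nbits.
template<int nbits>
using storage_t =
    typename std::conditional<nbits <= 8,  uint8_t,
    typename std::conditional<nbits <= 16, uint16_t,
    typename std::conditional<nbits <= 32, uint32_t,
                                           uint64_t>::type>::type>::type;

int main()
{
    static_assert(sizeof(storage_t<9>)  == 2, "9 bits fit in uint16_t");
    static_assert(sizeof(storage_t<19>) == 4, "19 bits fit in uint32_t");
    static_assert(sizeof(storage_t<33>) == 8, "33 bits need uint64_t");
    return 0;
}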

Comparing enums to integers

I've read that you shouldn't rely on the underlying type of an enum being either signed or unsigned. From this I have concluded that you should always cast the enum value to the type it's being compared against, like this:
enum MyEnum { MY_ENUM_VALUE = 0 };

int i = 1;
if (i > static_cast<int>(MY_ENUM_VALUE))
{
    // do stuff
}

unsigned int u = 2;
if (u > static_cast<unsigned int>(MY_ENUM_VALUE))
{
    // do more stuff
}
Is this the best practice?
Edit: Does the situation change if the enum is anonymous?
A plain enum converts to an integer, so you can compare it against any other integer, and even against floats. The compiler automatically converts both operands to the wider integer type, or converts the enum to a double, before the comparison.
Now, if your enumeration is not supposed to represent a number per se, you may want to consider creating a class instead:
enum class some_name { MY_ENUM_VALUE, ... };

int i;
if(i == static_cast<int>(some_name::MY_ENUM_VALUE))
{
    ...
}
In that case you need a cast because an enum class is not viewed as an integer by default. This helps quite a bit to avoid bugs in case you were to misuse an enum value...
Update: also, you can now specify the underlying integer type of an enum. This was available in older compilers too, but it often did not work quite right (in my own experience).
enum class some_name : uint8_t { ... };
That means the enumeration uses uint8_t to store those values. This is practical if you are using enumeration values in a structure used to send data over a network or to save in a binary file, where you need to know the exact size of the data.
When not specified, the type defaults to int.
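For example, a small sketch of that serialization scenario (the message and header names are made up):
#include <cstdint>

enum class MessageType : uint8_t { Ping, Pong, Data };

// A header meant to be written byte-for-byte to a socket or file;
// the enum member occupies exactly one byte.
struct PacketHeader
{
    MessageType type;
    uint8_t     flags;
    uint16_t    length;
};

static_assert(sizeof(MessageType) == 1, "must be one byte on the wire");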
As brought up by others, if the point of using enum is just to declare numbers, then using constexpr is probably better.
constexpr int MY_CONSTANT_VALUE = 0;
This has the same effect, only the type of MY_CONSTANT_VALUE is now an int. You could go a little further and use typedef as in:
typedef int my_type_t;
constexpr my_type_t MY_CONSTANT_VALUE = 0;
I often use an enum even for a single value, when the value is not generally considered an integer. There is no set-in-stone rule in this case.
Short answer: Yes
A plain enum has an implementation-defined underlying integer type, and its values are implicitly converted in mixed comparisons. Your compiler might warn without an explicit cast, but such comparisons are still very commonly written; you should cast explicitly to make the intent clear to maintainers.
And of course, an explicit cast is a must when it's a strongly typed enum.
Best practice is not to write
int i = 1;
if (i > static_cast<int>(MY_ENUM_VALUE))
{
    // do stuff
}
instead write
MyEnum i = MY_ENUM_VALUE;
...
if ( i > MY_ENUM_VALUE ) {..}
But if, as in your example, you only have one value in your enum, it is better to declare it as a constant instead of an enum.

Custom byte size?

So, you know how the primitive type char has a size of 1 byte? How would I make a primitive with a custom size? So, like, instead of an int with a size of 4 bytes, I make one with a size of, let's say, 16.
Is there a way to do this? Is there a way around it?
It depends on why you are doing this. Usually, you can't use types of less than 8 bits, because that is the addressable unit for the architecture. You can use structs, however, to define different lengths:
struct s {
    unsigned int a : 4;  // a is 4 bits
    unsigned int b : 4;  // b is 4 bits
    unsigned int c : 16; // c is 16 bits
};
However, there is no guarantee that the struct will be 24 bits long. Also, this can cause endianness issues. Where you can, it's best to use system-independent types, such as uint16_t, etc. You can also use bitwise operators and bit shifts to twiddle things very specifically, as sketched below.
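For example, a small sketch of packing the same a/b/c fields by hand with masks and shifts (the bit positions chosen here are arbitrary, but explicit and portable):
#include <cstdint>

// a in bits 0-3, b in bits 4-7, c in bits 8-23.
uint32_t pack(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & 0xFu) | ((b & 0xFu) << 4) | ((c & 0xFFFFu) << 8);
}

uint32_t get_a(uint32_t word) { return word & 0xFu; }
uint32_t get_b(uint32_t word) { return (word >> 4) & 0xFu; }
uint32_t get_c(uint32_t word) { return (word >> 8) & 0xFFFFu; }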
Normally you'd just make a struct that represents the data in which you're interested. If it's 16 bytes of data, either it's an aggregate of a number of smaller types or you're working on a processor that has a native 16-byte integral type.
If you're trying to represent extremely large numbers, you may need to find a special library that handles arbitrarily-sized numbers.
In C++11, there is an excellent solution for this: std::aligned_storage.
#include <iostream>
#include <type_traits>

int main()
{
    typedef std::aligned_storage<sizeof(int)>::type memory_type;
    memory_type i;
    reinterpret_cast<int&>(i) = 5;
    std::cout << reinterpret_cast<int&>(i) << std::endl;
    return 0;
}
It allows you to declare a block of uninitialized storage on the stack.
If you want to make a new type, typedef it. If you want it to be 16 bytes in size, typedef a struct that has 16 bytes of member data within it. Just beware that quite often compilers will pad things on you to match your system's alignment needs. A 1-byte struct rarely remains 1 byte without care.
You could just static_cast to and from std::string. I don't know enough C++ to give an example, but I think this would be pretty intuitive.