Is memcpy the standard way to pack float into uint32? - c++

Is the following the best way to pack a float's bits into a uint32? This might be a fast and easy yes, but I want to make sure there's no better way, or that exchanging the value between processes doesn't introduce a weird wrinkle.
"Best" in my case, is that it won't ever break on a compliant C++ compiler (given the static assert), can be packed and unpacked between two processes on the same computer, and is as fast as copying a uint32 into another uint32.
Process A:
static_assert(sizeof(float) == sizeof(uint32) && alignof(float) == alignof(uint32), "no");
...
float f = 0.5f;
uint32 buffer[128];
memcpy(buffer + 41, &f, sizeof(uint32)); // packing
Process B:
uint32 * buffer = thisUint32Is_ReadFromProcessA(); // reads "buffer" from process A
...
memcpy(&f, buffer + 41, sizeof(uint32)); // unpacking
assert(f == 0.5f);

Yes, this is the standard way to do type punning. Cppreference's page on memcpy even includes an example showing how you can use it to reinterpret a double as an int64_t:
#include <iostream>
#include <cstdint>
#include <cstring>

int main()
{
    // simple usage
    char source[] = "once upon a midnight dreary...", dest[4];
    std::memcpy(dest, source, sizeof dest);
    for (char c : dest)
        std::cout << c << '\n';

    // reinterpreting
    double d = 0.1;
    // std::int64_t n = *reinterpret_cast<std::int64_t*>(&d); // aliasing violation
    std::int64_t n;
    std::memcpy(&n, &d, sizeof d); // OK
    std::cout << std::hexfloat << d << " is " << std::hex << n
              << " as an std::int64_t\n";
}
Output:
o
n
c
e
0x1.999999999999ap-4 is 3fb999999999999a as an std::int64_t
As long as the asserts pass (you are writing and reading the correct number of bytes), the operation is safe. You can't pack a 64-bit object into a 32-bit object, but you can pack one 32-bit object into another 32-bit object, as long as both types are trivially copyable.
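If you can use C++20, std::bit_cast from <bit> expresses the same well-defined bit copy in a single expression; it requires the source and destination types to have the same size and be trivially copyable, so the size check is enforced by the cast itself. A minimal sketch:

#include <bit>
#include <cstdint>

static_assert(sizeof(float) == sizeof(std::uint32_t));

// C++20 alternative to the memcpy: std::bit_cast performs the same
// size-checked, well-defined bit copy (and is usable in constexpr contexts).
std::uint32_t pack(float f)
{
    return std::bit_cast<std::uint32_t>(f);   // same bits as the memcpy version
}

float unpack(std::uint32_t bits)
{
    return std::bit_cast<float>(bits);
}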

Or this:
union TheUnion {
uint32 theInt;
float theFloat;
};
TheUnion converter;
converter.theFloat = myFloatValue;
uint32 myIntRep = converter.theInt;
I don't know if this is better, but it's a different way to look at it. Note that reading a union member other than the one most recently written is technically undefined behaviour in C++ (unlike in C), even though most compilers support it as an extension, so the memcpy approach is the safer choice for portable code.

Related

Qt: from a fixed number of bytes to an integer

Using Qt 5.4, I built the function generateRandomIDOver2Bytes. It generates a random number and puts it into a variable that occupies exactly two bytes.
QByteArray generateRandomIDOver2Bytes() {
QString randomValue = QString::number(qrand() % 65535);
QByteArray x;
x.setRawData(randomValue.toLocal8Bit().constData(), 2);
return x;
}
My issue is converting the value generated this way back into an integer.
The following minimum example actually does not work:
QByteArray tmp = generateRandomIDOver2Bytes(); //for example, the value 27458
int value = tmp.toUInt();
qDebug() << value; //it prints always 9
Any idea?
A 16 bit integer can be split into individual bytes by bit operations.
This way, it can be stored into a QByteArray.
From Qt doc. of QByteArray:
QByteArray can be used to store both raw bytes (including '\0's) and traditional 8-bit '\0'-terminated strings.
For recovering, bit operations can be used as well.
The contents of the QByteArray do not necessarily correspond to printable characters, but that is not (or should not be) required in this case.
testQByteArrayWithUShort.cc:
#include <QtCore>

int main()
{
    quint16 r = 65534; //qrand() % 65535;
    qDebug() << "r:" << r;

    // storing r in QByteArray (little endian)
    QByteArray qBytes(2, 0); // reserve space for two bytes explicitly
    qBytes[0] = (uchar)r;
    qBytes[1] = (uchar)(r >> 8);
    qDebug() << "qBytes:" << qBytes;

    // recovering r
    quint16 rr = qBytes[0] | qBytes[1] << 8;
    qDebug() << "rr:" << rr;
}
Output:
r: 65534
qBytes: "\xFE\xFF"
rr: 65534
Given the random value 27458, when you do this:
x.setRawData(randomValue.toLocal8Bit().constData(), 2);
you're filling the array with the first two bytes of this string: "27458".
And here:
int value = tmp.toUInt();
the byte array is implicitly cast to a string ("27"), which in turn is converted to a numeric value (an unsigned integer).
Let's try something different, that maybe suits your need.
First, store the value in a numeric variable, possibly of the desired size (16 bits, 2 bytes):
ushort randomValue = qrand() % 65535;
then just return a byte array, built using a pointer to the ushort, cast to char * (don't use setRawData, because it doesn't copy the bytes you pass in, as explained here):
return QByteArray(reinterpret_cast<char *>(&randomValue), 2);
To get back to the value:
QByteArray tmp = generateRandomIDOver2Bytes(); //for example, the value 27458
ushort value;
memcpy(&value, tmp.data(), 2);
Please notice: types matter here. You wrote a ushort into the byte array, so you must read a ushort out of it.
All this can be generalized in a class like:
template <typename T>
class Value
{
    QByteArray bytes;
public:
    Value(T t) : bytes(reinterpret_cast<char*>(&t), sizeof(T)) {}

    T read() const
    {
        T t;
        memcpy(&t, bytes.data(), sizeof(T));
        return t;
    }
};
so you can have a generic function like:
template<typename T>
Value<T> generateRandomIDOverNBytes()
{
    T value = qrand() % 65535;
    qDebug() << value;
    return Value<T>(value);
}
and safely use whichever type you prefer to store the random value:
Value<ushort> value16 = generateRandomIDOverNBytes<ushort>();
qDebug() << value16.read();
Value<int> value32 = generateRandomIDOverNBytes<int>();
qDebug() << value32.read();
Value<long long> value64 = generateRandomIDOverNBytes<long long>();
qDebug() << value64.read();
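If the byte array might also travel between machines, a variant of the same idea (a sketch of mine, not part of the answer) is to pin the byte order explicitly with Qt's <QtEndian> helpers:

#include <QtCore>
#include <QtEndian>

QByteArray packUShort(quint16 v)
{
    QByteArray bytes(int(sizeof(v)), '\0');
    // write v as little-endian regardless of the host byte order
    qToLittleEndian(v, reinterpret_cast<uchar*>(bytes.data()));
    return bytes;
}

quint16 unpackUShort(const QByteArray& bytes)
{
    return qFromLittleEndian<quint16>(reinterpret_cast<const uchar*>(bytes.constData()));
}

Within a single process or machine the memcpy version above is equivalent; the endian helpers only matter once the bytes leave the machine.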

How to convert a list of uint8 to their corresponding signed value?

Say that I have an array of bytes:
std::array<std::uint8_t, 4> list
and I want to convert these to their corresponding signed value after concatenating the bits contained in list. For the case of list for example, and since it is an array of size 4, this would translate to an int32. What is the "correct" way of doing this in C++ that would not result in undefined or compiler specific behavior? Would doing something like this be correct and not considered undefined or compiler specific?:
std::uint32_t sum = list[0];
sum = sum + (static_cast<std::uint32_t>(list[1]) << 8);
sum = sum + (static_cast<std::uint32_t>(list[2]) << 16);
sum = sum + (static_cast<std::uint32_t>(list[3]) << 24);
std::int32_t sum_int32 = static_cast<std::int32_t>(sum);
In other words sum is meant to hold the 32bit representation of the value in two's complement.
If you insist on converting these "to their corresponding signed value", that is, to int32_t, then the following should be fast, safe, short, portable, and easy, because the Boost code used is header-only (no need to build Boost):
#include <array>
#include <iostream>
#include <boost/numeric/conversion/cast.hpp>

#ifdef __linux__
#include <arpa/inet.h>
#elif _WIN32
#include <winsock.h>
#else
// ...
#endif

int main()
{
    using boost::numeric_cast;
    using boost::numeric::bad_numeric_cast;
    using boost::numeric::positive_overflow;
    using boost::numeric::negative_overflow;

    std::array<uint8_t, 4> a = { 1,2,3,4 }; // big-endian
    uint32_t ui = ntohl(*reinterpret_cast<uint32_t*>(a.data())); // convert to host specific byte order
    std::cout << std::hex << ui << std::endl;

    try
    {
        int32_t si = numeric_cast<int32_t>(ui); // This conversion succeeds (is in range)
        std::cout << std::hex << si << std::endl;
    }
    catch (negative_overflow& e) {
        std::cout << e.what();
    }
    catch (positive_overflow& e) {
        std::cout << e.what();
    }
}
Good, but you get implementation defined behaviour if the unsigned value cannot be represented in the signed one (cf, for example, this online c++ standard draft):
4.7 Integral conversions
3) If the destination type is signed, the value is unchanged if it can
be represented in the destination type (and bit-field width);
otherwise, the value is implementation-defined.
To overcome this, you could restrict the most significant byte to 7 bits:
sum = sum + (static_cast<std::uint32_t>(list[3]) & 0x7F) <<24;
if (static_cast<std::uint32_t>(list[3]) & 0x80) {
sum = -sum;
}
Note that only 31 bits can be used for the signed value's content. You actually cannot rely on a two's complement representation, but with the sum = -sum notation you should be safe.
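For completeness: since C++20 the direct static_cast is fully defined (signed integers are required to be two's complement and the conversion is defined modulo 2^32). For earlier standards, a small helper can produce the two's-complement value using only well-defined arithmetic; this is a sketch of mine, not taken from either answer:

#include <cstdint>

// Interpret the 32-bit pattern in 'u' as a two's-complement int32_t without
// relying on implementation-defined narrowing conversions.
std::int32_t to_int32(std::uint32_t u)
{
    if (u <= 0x7FFFFFFFu)                     // non-negative values map directly
        return static_cast<std::int32_t>(u);
    // For u >= 2^31 the two's-complement value is u - 2^32: drop 2^31 while
    // still unsigned, then subtract the remaining 2^31 in signed arithmetic.
    return static_cast<std::int32_t>(u - 0x80000000u) - 0x7FFFFFFF - 1;
}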

Why does std::bitset only support integral data types? Why is float not supported?

On trying to generate the bit pattern of a float as follows:
std::cout << std::bitset<32>(32.5) << std::endl;
the compiler generates this warning:
warning: implicit conversion from 'double' to 'unsigned long long' changes value
from 32.5 to 32 [-Wliteral-conversion]
std::cout << std::bitset<32>(32.5) << std::endl;
Output on ignoring warning :) :
00000000000000000000000000100000
Why can't bitset detect floats and output the bit sequence correctly, when casting to char* and walking the memory does show the correct sequence?
This works, but is machine dependent on byte ordering and mostly unreadable:
template <typename T>
void printMemory(const T& data) {
    const char* begin = reinterpret_cast<const char*>(&data);
    const char* end = begin + sizeof(data);
    while (begin != end)
        std::cout << std::bitset<CHAR_BIT>(*begin++) << " ";
    std::cout << std::endl;
}
Output:
00000000 00000000 00000010 01000010
Is there a reason not to support floats? Is there an alternative for floats?
What would you expect to appear in your bitset if you supplied a float? Presumably some sort of representation of an IEEE-754 binary32 floating point number in big-endian format? What about platforms that don't represent their floats in a way that's even remotely similar to that? Should the implementation bend over backwards to (probably lossily) convert the float supplied to what you want?
The reason it doesn't is that there is no standard defined format for floats. They don't even have to be 32 bits. They just usually are on most platforms.
C++ and C will run on very tiny and/or odd platforms. The standard can't count on what's 'usually the case'. There were/are C/C++ compilers for 8/16-bit 6502 systems whose sorry excuse for a native floating point format was (I think) a 6-byte entity that used packed BCD encoding.
This is the same reason that signed integers are also unsupported. Two's complement is not universal, just almost universal. :-)
With all the usual warnings about floating point formats not being standardised, endianness, and so on, here is code that will probably work, at least on x86 hardware.
#include <bitset>
#include <iostream>
#include <type_traits>
#include <cstring>
#include <cstdint> // for std::uint32_t / std::uint64_t
#include <memory>  // for std::addressof

constexpr std::uint32_t float_to_bits(float in)
{
    std::uint32_t result = 0;
    static_assert(sizeof(float) == sizeof(result), "float is not 32 bits");

    constexpr auto size = sizeof(float);
    std::uint8_t buffer[size] = {};

    // note - memcpy through a byte buffer to satisfy the
    // strict aliasing rule.
    // note that this has no detrimental effect on performance
    // since memcpy is 'magic'
    std::memcpy(buffer, std::addressof(in), size);
    std::memcpy(std::addressof(result), buffer, size);
    return result;
}

constexpr std::uint64_t float_to_bits(double in)
{
    std::uint64_t result = 0;
    static_assert(sizeof(double) == sizeof(result), "double is not 64 bits");

    constexpr auto size = sizeof(double);
    std::uint8_t buffer[size] = {};

    std::memcpy(buffer, std::addressof(in), size);
    std::memcpy(std::addressof(result), buffer, size);
    return result;
}

int main()
{
    std::cout << std::bitset<32>(float_to_bits(float(32.5))) << std::endl;
    std::cout << std::bitset<64>(float_to_bits(32.5)) << std::endl;
}
example output:
01000010000000100000000000000000
0100000001000000010000000000000000000000000000000000000000000000
#include <iostream>
#include <bitset>
#include <climits>
#include <iomanip>

using namespace std;

template<class T>
auto toBitset(T x) -> bitset<sizeof(T) * CHAR_BIT>
{
    return bitset<sizeof(T) * CHAR_BIT>{ *reinterpret_cast<unsigned long long int *>(&x) };
}

int main()
{
    double x;
    while (cin >> x) {
        cout << setw(14) << x << " " << toBitset(x) << endl;
    }
    return 0;
}
https://wandbox.org/permlink/tCz5WwHqu2X4CV1E
Sadly this fails if the argument type is bigger than unsigned long long; for example, it will fail for long double. This is a limitation of the bitset constructor.
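A variant that copies through a byte buffer sidesteps both the aliasing cast and the unsigned long long limit, at the cost of an explicit loop. This is a sketch of mine (not from the answers above), and it assumes the usual little-endian layout when ordering the bits:

#include <bitset>
#include <climits>
#include <cstddef>
#include <cstring>
#include <iostream>

template <class T>
auto toBitsetBytes(const T& x) -> std::bitset<sizeof(T) * CHAR_BIT>
{
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &x, sizeof(T));               // well-defined, no aliasing cast
    std::bitset<sizeof(T) * CHAR_BIT> result;
    for (std::size_t i = 0; i < sizeof(T); ++i)      // byte i supplies bits [8*i, 8*i + 8)
        for (std::size_t b = 0; b < CHAR_BIT; ++b)
            result[i * CHAR_BIT + b] = (bytes[i] >> b) & 1u;
    return result;
}

int main()
{
    std::cout << toBitsetBytes(32.5f) << '\n';
    std::cout << toBitsetBytes(32.5)  << '\n';
    std::cout << toBitsetBytes(32.5L) << '\n';       // long double works too
}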

C++ manipulating Raw Data of a struct

I have a simple struct that looks like this:
struct Object
{
int x_;
double y_;
};
I am trying to manipulate the raw data of an Object, this is what I've done:
int main()
{
    Object my_object;
    unsigned char* raw_data = reinterpret_cast<unsigned char*>(&my_object);

    int x = 10;
    memcpy(raw_data, &x, sizeof(x));
    raw_data += sizeof(x);

    double y = 20.1;
    memcpy(raw_data, &y, sizeof(y));

    Object* my_object_ptr = reinterpret_cast<Object *>(raw_data);
    std::cout << my_object_ptr->x_ << std::endl; //prints 20 (expected 10)
    std::cout << my_object_ptr->y_ << std::endl; //prints Rubbish (expected 20.1)
}
I was expecting that the above code would work.
What is the real problem? Is this even possible?
You need to use the offsetof macro. There were a few more problems too; most importantly, you modified the raw_data pointer and then cast the modified value back to an Object* pointer, resulting in undefined behavior. I chose to remove the raw_data modification (an alternative would have been to not cast it back, but to just inspect my_object directly). Here's a fixed version for you, with explanations in comments:
#include <iostream>
#include <cstring> // for memcpy
#include <cstddef> // for offsetof macro

struct Object
{
    int x_;
    double y_;
};

int main()
{
    Object my_object;
    unsigned char* raw_data = reinterpret_cast<unsigned char*>(&my_object);

    int x = 10;
    // 1st memcpy fixed to calculate offset of x_ (even though it is probably 0)
    memcpy(raw_data + offsetof(Object, x_), &x, sizeof(x));
    //raw_data += offsetof(Object, y_); // if used, add offset of y_ instead of sizeof x

    double y = 20.1;
    // 2nd memcpy fixed to calculate offset of y_ (offset could be 4 or 8, depends on packing, sizeof int, etc)
    memcpy(raw_data + offsetof(Object, y_), &y, sizeof(y));

    // cast back to Object* pointer
    Object* my_object_ptr = reinterpret_cast<Object *>(raw_data);
    std::cout << my_object_ptr->x_ << std::endl; //prints 10
    std::cout << my_object_ptr->y_ << std::endl; //prints 20.1
}
This is probably a structure padding issue. If you had double y_ as the first member, you'd probably have seen what you expected. The compiler will pad the structure with extra bytes to make the alignment correct in case the struct is used in an array. Try
#pragma pack(4)
before your struct definition.
The #pragma pack reference for Visual Studio: http://msdn.microsoft.com/en-us/library/2e70t5y1.aspx Your struct is packed to 8 bytes by default, so there's a 4 byte pad between x_ and y_.
Read http://www.catb.org/esr/structure-packing/ to really understand what's going on.
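To see the padding concretely, it is often enough to print the member offsets and the total size. This small sketch (my addition, not part of the answers) uses the same Object definition as above:

#include <cstddef>
#include <iostream>

struct Object
{
    int x_;
    double y_;
};

int main()
{
    std::cout << "offsetof(Object, x_) = " << offsetof(Object, x_) << '\n'; // typically 0
    std::cout << "offsetof(Object, y_) = " << offsetof(Object, y_) << '\n'; // typically 8: double wants 8-byte alignment
    std::cout << "sizeof(Object)       = " << sizeof(Object) << '\n';       // typically 16, i.e. 4 bytes of padding after x_
}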

C/C++ efficient bit array

Can you recommend efficient/clean way to manipulate arbitrary length bit array?
Right now I am using a regular int/char bitmask, but those are not very clean when the array length is greater than the data type length.
std::vector<bool> is not available for me.
Since you mention C as well as C++, I'll assume that a C++-oriented solution like boost::dynamic_bitset might not be applicable, and talk about a low-level C implementation instead. Note that if something like boost::dynamic_bitset works for you, or there's a pre-existing C library you can find, then using them can be better than rolling your own.
Warning: None of the following code has been tested or even compiled, but it should be very close to what you'd need.
To start, assume you have a fixed bitset size N. Then something like the following works:
typedef uint32_t word_t;
enum { WORD_SIZE = sizeof(word_t) * 8 };
word_t data[N / WORD_SIZE + 1];

inline int bindex(int b) { return b / WORD_SIZE; }
inline int boffset(int b) { return b % WORD_SIZE; }

void set_bit(int b) {
    data[bindex(b)] |= 1 << (boffset(b));
}
void clear_bit(int b) {
    data[bindex(b)] &= ~(1 << (boffset(b)));
}
int get_bit(int b) {
    return data[bindex(b)] & (1 << boffset(b));
}
void clear_all() { /* set all elements of data to zero */ }
void set_all() { /* set all elements of data to one */ }
As written, this is a bit crude since it implements only a single global bitset with a fixed size. To address these problems, you want to start with a data structure something like the following:
struct bitset { word_t *words; int nwords; };
and then write functions to create and destroy these bitsets.
struct bitset *bitset_alloc(int nbits) {
    struct bitset *bitset = malloc(sizeof(*bitset));
    bitset->nwords = (nbits / WORD_SIZE + 1);
    bitset->words = malloc(sizeof(*bitset->words) * bitset->nwords);
    bitset_clear(bitset);
    return bitset;
}

void bitset_free(struct bitset *bitset) {
    free(bitset->words);
    free(bitset);
}
Now, it's relatively straightforward to modify the previous functions to take a struct bitset * parameter. There's still no way to re-size a bitset during its lifetime, nor is there any bounds checking, but neither would be hard to add at this point.
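A minimal sketch (mine, following the answer's outline, with illustrative names) of the accessors rewritten to take a struct bitset * parameter; it compiles as C or C++:

#include <stdint.h>

typedef uint32_t word_t;
enum { WORD_SIZE = sizeof(word_t) * 8 };

struct bitset { word_t *words; int nwords; };

static int bindex(int b)  { return b / WORD_SIZE; }
static int boffset(int b) { return b % WORD_SIZE; }

/* Same masking and shifting as above, just operating on the passed-in bitset. */
void bitset_set_bit(struct bitset *bs, int b) {
    bs->words[bindex(b)] |= (word_t)1 << boffset(b);
}
void bitset_clear_bit(struct bitset *bs, int b) {
    bs->words[bindex(b)] &= ~((word_t)1 << boffset(b));
}
int bitset_get_bit(const struct bitset *bs, int b) {
    return (bs->words[bindex(b)] >> boffset(b)) & 1;
}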
boost::dynamic_bitset if the length is only known at run time.
std::bitset if the length is known at compile time (although it can be arbitrary).
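For reference, a minimal boost::dynamic_bitset usage sketch (my example, not from the answer):

#include <boost/dynamic_bitset.hpp>
#include <iostream>

int main()
{
    std::size_t n = 1000;              // length known only at run time
    boost::dynamic_bitset<> bits(n);   // all bits start at 0
    bits.set(41);
    bits[7] = true;
    std::cout << bits.test(41) << ' ' << bits.count() << '\n'; // prints "1 2"
}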
I've written a working implementation based off Dale Hagglund's response to provide a bit array in C (BSD license).
https://github.com/noporpoise/BitArray/
Please let me know what you think / give suggestions. I hope people looking for a response to this question find it useful.
This posting is rather old, but there is an efficient bit array suite in C in my ALFLB library.
For many microcontrollers without a hardware-division opcode, this library is EFFICIENT because it doesn't use division: instead, masking and bit-shifting are used. (Yes, I know some compilers will convert division by 8 to a shift, but this varies from compiler to compiler.)
It has been tested on arrays up to 2^32-2 bits (about 4 billion bits stored in 536 MBytes), although the last 2 bits should be accessible if not used in a for-loop in your application.
See below for an extract from the documentation. The documentation is at http://alfredo4570.net/src/alflb_doco/alflb.pdf, and the library is at http://alfredo4570.net/src/alflb.zip
Enjoy,
Alf
//------------------------------------------------------------------
BM_DECLARE( arrayName, bitmax);
Macro to instantiate an array to hold bitmax bits.
//------------------------------------------------------------------
UCHAR *BM_ALLOC( BM_SIZE_T bitmax);
mallocs an array (of unsigned char) to hold bitmax bits.
Returns: NULL if memory could not be allocated.
//------------------------------------------------------------------
void BM_SET( UCHAR *bit_array, BM_SIZE_T bit_index);
Sets a bit to 1.
//------------------------------------------------------------------
void BM_CLR( UCHAR *bit_array, BM_SIZE_T bit_index);
Clears a bit to 0.
//------------------------------------------------------------------
int BM_TEST( UCHAR *bit_array, BM_SIZE_T bit_index);
Returns: TRUE (1) or FALSE (0) depending on a bit.
//------------------------------------------------------------------
int BM_ANY( UCHAR *bit_array, int value, BM_SIZE_T bitmax);
Returns: TRUE (1) if array contains the requested value (i.e. 0 or 1).
//------------------------------------------------------------------
UCHAR *BM_ALL( UCHAR *bit_array, int value, BM_SIZE_T bitmax);
Sets or clears all elements of a bit array to your value. Typically used after a BM_ALLOC.
Returns: Copy of address of bit array
//------------------------------------------------------------------
void BM_ASSIGN( UCHAR *bit_array, int value, BM_SIZE_T bit_index);
Sets or clears one element of your bit array to your value.
//------------------------------------------------------------------
BM_MAX_BYTES( int bit_max);
Utility macro to calculate the number of bytes to store bitmax bits.
Returns: A number specifying the number of bytes required to hold bitmax bits.
//------------------------------------------------------------------
You can use std::bitset:
#include <bitset>
#include <iostream>
using namespace std;

int main() {
    const bitset<12> mask(2730ul);
    cout << "mask = " << mask << endl;

    bitset<12> x;
    cout << "Enter a 12-bit bitset in binary: " << flush;
    if (cin >> x) {
        cout << "x = " << x << endl;
        cout << "As ulong: " << x.to_ulong() << endl;
        cout << "And with mask: " << (x & mask) << endl;
        cout << "Or with mask: " << (x | mask) << endl;
    }
}
I know it's an old post but I came here to find a simple C bitset implementation and none of the answers quite matched what I was looking for, so I implemented my own based on Dale Hagglund's answer. Here it is :)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t word_t;
enum { BITS_PER_WORD = 32 };

struct bitv { word_t *words; int nwords; int nbits; };

struct bitv* bitv_alloc(int bits) {
    struct bitv *b = malloc(sizeof(struct bitv));
    if (b == NULL) {
        fprintf(stderr, "Failed to alloc bitv\n");
        exit(1);
    }

    b->nwords = (bits >> 5) + 1;
    b->nbits  = bits;
    b->words  = malloc(sizeof(*b->words) * b->nwords);
    if (b->words == NULL) {
        fprintf(stderr, "Failed to alloc bitv->words\n");
        exit(1);
    }

    memset(b->words, 0, sizeof(*b->words) * b->nwords);
    return b;
}

static inline void check_bounds(struct bitv *b, int bit) {
    if (b->nbits < bit) {
        fprintf(stderr, "Attempted to access a bit out of range\n");
        exit(1);
    }
}

void bitv_set(struct bitv *b, int bit) {
    check_bounds(b, bit);
    b->words[bit >> 5] |= 1 << (bit % BITS_PER_WORD);
}

void bitv_clear(struct bitv *b, int bit) {
    check_bounds(b, bit);
    b->words[bit >> 5] &= ~(1 << (bit % BITS_PER_WORD));
}

int bitv_test(struct bitv *b, int bit) {
    check_bounds(b, bit);
    return b->words[bit >> 5] & (1 << (bit % BITS_PER_WORD));
}

void bitv_free(struct bitv *b) {
    if (b != NULL) {
        if (b->words != NULL) free(b->words);
        free(b);
    }
}

void bitv_dump(struct bitv *b) {
    if (b == NULL) return;

    for (int i = 0; i < b->nwords; i++) {
        word_t w = b->words[i];
        for (int j = 0; j < BITS_PER_WORD; j++) {
            printf("%d", w & 1);
            w >>= 1;
        }
        printf(" ");
    }
    printf("\n");
}

void test(struct bitv *b, int bit) {
    if (bitv_test(b, bit)) printf("Bit %d is set!\n", bit);
    else                   printf("Bit %d is not set!\n", bit);
}

int main(int argc, char *argv[]) {
    struct bitv *b = bitv_alloc(32);

    bitv_set(b, 1);
    bitv_set(b, 3);
    bitv_set(b, 5);
    bitv_set(b, 7);
    bitv_set(b, 9);
    bitv_set(b, 32);

    bitv_dump(b);
    bitv_free(b);
    return 0;
}
I use this one:
//#include <bitset>
#include <iostream>
//source http://stackoverflow.com/questions/47981/how-do-you-set-clear-and-toggle-a-single-bit-in-c
#define BIT_SET(a,b) ((a) |= (1<<(b)))
#define BIT_CLEAR(a,b) ((a) &= ~(1<<(b)))
#define BIT_FLIP(a,b) ((a) ^= (1<<(b)))
#define BIT_CHECK(a,b) ((a) & (1<<(b)))
/* x=target variable, y=mask */
#define BITMASK_SET(x,y) ((x) |= (y))
#define BITMASK_CLEAR(x,y) ((x) &= (~(y)))
#define BITMASK_FLIP(x,y) ((x) ^= (y))
#define BITMASK_CHECK(x,y) ((x) & (y))
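A short usage sketch (my addition) showing the macros above in action, assuming they are defined in the same file:

#include <cstdio>
#include <cstdint>

int main()
{
    std::uint32_t flags = 0;
    BIT_SET(flags, 3);          // flags == 0b1000 == 8
    BIT_FLIP(flags, 0);         // flags == 0b1001 == 9
    std::printf("%u %d\n", (unsigned)flags, BIT_CHECK(flags, 3) != 0); // prints "9 1"

    BITMASK_CLEAR(flags, 0x1u); // clear bit 0 via a mask
    std::printf("%u\n", (unsigned)flags); // prints "8"
}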
I have recently released BITSCAN, a C++ bit string library which is specifically oriented towards fast bit scanning operations. BITSCAN is available here. It is in alpha but still pretty well tested since I have used it in recent years for research in combinatorial optimization (e.g. in BBMC, a state of the art exact maximum clique algorithm). A comparison with other well known C++ implementations (STL or BOOST) may be found here.
I hope you find it useful. Any feedback is welcome.
In microcontroller development, we sometimes need a 2-dimensional array (matrix) whose elements can only take the values 0 or 1. If we use 1 byte per element, it wastes a lot of memory (and microcontroller memory is very limited). The proposed solution is to use a 1-bit matrix (each element is a single bit).
http://htvdanh.blogspot.com/2016/09/one-bit-matrix-for-cc-programming.html
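For the curious, here is a minimal sketch (mine, not from the linked post) of such a 1-bit matrix: ROWS x COLS bits packed into a byte array, indexed row-major:

#include <cstdint>
#include <cstring>

struct BitMatrix {
    static const int ROWS = 16, COLS = 16;
    std::uint8_t bytes[(ROWS * COLS + 7) / 8];   // 1 bit per element

    void clear()                  { std::memset(bytes, 0, sizeof bytes); }
    void set(int r, int c)        { int i = r * COLS + c; bytes[i / 8] |=  (std::uint8_t)(1u << (i % 8)); }
    void reset(int r, int c)      { int i = r * COLS + c; bytes[i / 8] &= (std::uint8_t)~(1u << (i % 8)); }
    bool test(int r, int c) const { int i = r * COLS + c; return (bytes[i / 8] >> (i % 8)) & 1u; }
};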
I recently implemented a small header-only library called BitContainer just for this purpose.
It focuses on expressiveness and compile-time abilities and can be found here:
https://github.com/EddyXorb/BitContainer
It is certainly not the classical way to look at bit arrays, but it can come in handy for strong-typing purposes and a memory-efficient representation of named properties.
Example:
constexpr Props props(Prop::isHigh(), Prop::isLow()); // initialize a BitContainer of type Props with strong-type Prop
constexpr bool result1 = props.contains(Prop::isTiny()); // false
constexpr bool result2 = props.contains(Prop::isLow());  // true