Qt: from a fixed number of bytes to an integer - c++

Using Qt 5.4, I built the function generateRandomIDOver2Bytes. It generates a random number and stores it in a variable that occupies exactly two bytes.
QByteArray generateRandomIDOver2Bytes() {
QString randomValue = QString::number(qrand() % 65535);
QByteArray x;
x.setRawData(randomValue.toLocal8Bit().constData(), 2);
return x;
}
My issue is converting the generated value back into an integer.
The following minimal example does not work:
QByteArray tmp = generateRandomIDOver2Bytes(); //for example, the value 27458
int value = tmp.toUInt();
qDebug() << value; //it prints always 9
Any idea?

A 16 bit integer can be split into individual bytes by bit operations.
This way, it can be stored into a QByteArray.
From Qt doc. of QByteArray:
QByteArray can be used to store both raw bytes (including '\0's) and traditional 8-bit '\0'-terminated strings.
For recovering, bit operations can be used as well.
The contents of the QByteArray do not necessarily form printable characters, but that may not (or should not) be required in this case.
testQByteArrayWithUShort.cc:
#include <QtCore>
int main()
{
quint16 r = 65534;//qrand() % 65535;
qDebug() << "r:" << r;
// storing r in QByteArray (little endian)
QByteArray qBytes(2, 0); // reserve space for two bytes explicitly
qBytes[0] = (uchar)r;
qBytes[1] = (uchar)(r >> 8);
qDebug() << "qBytes:" << qBytes;
// recovering r
quint16 rr = (uchar)qBytes.at(0) | ((uchar)qBytes.at(1) << 8); // cast to uchar first so a byte >= 0x80 is not sign-extended
qDebug() << "rr:" << rr;
}
Output:
r: 65534
qBytes: "\xFE\xFF"
rr: 65534
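For readers without Qt at hand, the same split-and-recombine idea can be sketched in plain C++. This is only a sketch: the helper names to_le_bytes and from_le_bytes are made up for illustration and are not part of Qt. The bytes are held as unsigned char so that a byte of 0x80 or above is not sign-extended when it is widened for the OR.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: split a 16-bit value into two bytes, low byte first.
std::array<unsigned char, 2> to_le_bytes(std::uint16_t v)
{
    return { static_cast<unsigned char>(v & 0xFF),
             static_cast<unsigned char>(v >> 8) };
}

// Hypothetical helper: rebuild the value from the two bytes.
// unsigned char promotes to a non-negative int, so no sign bits leak in.
std::uint16_t from_le_bytes(const std::array<unsigned char, 2> &b)
{
    return static_cast<std::uint16_t>(b[0] | (b[1] << 8));
}
```

With these helpers, from_le_bytes(to_le_bytes(27458)) gives back 27458.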

Given the random value 27458, when you do this:
x.setRawData(randomValue.toLocal8Bit().constData(), 2);
you're filling the array with the first two bytes of this string: "27458".
And here:
int value = tmp.toUInt();
the byte array is implicitly cast to a string ("27"), which in turn is converted to a numeric value (an unsigned integer).
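The effect described here can be reproduced with the standard library alone, leaving aside the separate lifetime problem of setRawData. In this sketch the hypothetical helper first_two_digits (not from the question) keeps two characters of the decimal text and parses them back, assuming those two bytes are still readable:

```cpp
#include <string>

// Hypothetical helper: keep only the first two characters of the decimal
// text of `value`, then parse them back as a number.
unsigned long first_two_digits(unsigned value)
{
    const std::string text = std::to_string(value);   // e.g. "27458"
    return std::stoul(text.substr(0, 2));             // "27" parses to 27
}
```

The digits beyond the second are simply lost, which is why the text form cannot carry a value up to 65534 in two bytes.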
Let's try something different, that maybe suits your need.
First, store the value in a numeric variable, possibly of the desired size (16 bits, 2 bytes):
ushort randomValue = qrand() % 65535;
then just return a byte array, built using a pointer to the ushort, cast to char * (don't use setRawData, because it does not copy the bytes you pass in; it only keeps the pointer, which in the original code referred to a temporary):
return QByteArray(reinterpret_cast<char *>(&randomValue), 2);
To get back to the value:
QByteArray tmp = generateRandomIDOver2Bytes(); //for example, the value 27458
ushort value;
memcpy(&value, tmp.data(), 2);
Please notice: types do matter here. You wrote a ushort into the byte array, so you must read a ushort out of it.
All this can be generalized in a class like:
template <typename T>
class Value
{
QByteArray bytes;
public:
Value(T t) : bytes(reinterpret_cast<char*>(&t), sizeof(T)) {}
T read() const
{
T t;
memcpy(&t, bytes.data(), sizeof(T));
return t;
}
};
so you can have a generic function like:
template<typename T>
Value<T> generateRandomIDOverNBytes()
{
T value = qrand() % 65535;
qDebug() << value;
return Value<T>(value);
}
and safely use the type you prefer to store the random value:
Value<ushort> value16 = generateRandomIDOverNBytes<ushort>();
qDebug() << value16.read();
Value<int> value32 = generateRandomIDOverNBytes<int>();
qDebug() << value32.read();
Value<long long> value64 = generateRandomIDOverNBytes<long long>();
qDebug() << value64.read();

Related

Is memcpy the standard way to pack float into uint32?

Is the following the best way to pack a float's bits into a uint32? This might be a fast and easy yes, but I want to make sure there's no better way, or that exchanging the value between processes doesn't introduce a weird wrinkle.
"Best" in my case, is that it won't ever break on a compliant C++ compiler (given the static assert), can be packed and unpacked between two processes on the same computer, and is as fast as copying a uint32 into another uint32.
Process A:
static_assert(sizeof(float) == sizeof(uint32) && alignof(float) == alignof(uint32), "no");
...
float f = 0.5f;
uint32 buffer[128];
memcpy(buffer + 41, &f, sizeof(uint32)); // packing
Process B:
uint32 * buffer = thisUint32Is_ReadFromProcessA(); // reads "buffer" from process A
...
memcpy(&f, buffer + 41, sizeof(uint32)); // unpacking
assert(f == 0.5f);
Yes, this is the standard way to do type punning. cppreference's page on memcpy even includes an example showing how you can use it to reinterpret a double as an int64_t:
#include <iostream>
#include <cstdint>
#include <cstring>
int main()
{
// simple usage
char source[] = "once upon a midnight dreary...", dest[4];
std::memcpy(dest, source, sizeof dest);
for (char c : dest)
std::cout << c << '\n';
// reinterpreting
double d = 0.1;
// std::int64_t n = *reinterpret_cast<std::int64_t*>(&d); // aliasing violation
std::int64_t n;
std::memcpy(&n, &d, sizeof d); // OK
std::cout << std::hexfloat << d << " is " << std::hex << n
<< " as an std::int64_t\n";
}
Output:
o
n
c
e
0x1.999999999999ap-4 is 3fb999999999999a as an std::int64_t
As long as the asserts pass (you are writing and reading the correct number of bytes), the operation is safe. You can't pack a 64-bit object into a 32-bit object, but you can pack one 32-bit object into another 32-bit object, as long as they are trivially copyable.
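A minimal sketch of the round trip discussed above, assuming float is a 32-bit IEEE 754 type (the static_assert checks only the size). The names float_to_bits and bits_to_float are illustrative and do not come from the question. From C++20 on, std::bit_cast in <bit> expresses the same conversion directly.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");

// Hypothetical helper: copy the object representation of a float into a uint32_t.
std::uint32_t float_to_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Hypothetical helper: the inverse copy, from the integer back into a float.
float bits_to_float(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Compilers typically turn a fixed-size memcpy like this into a plain register move, so it should cost about the same as copying one uint32 into another.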
Or this:
union TheUnion {
uint32 theInt;
float theFloat;
};
TheUnion converter;
converter.theFloat = myFloatValue;
uint32 myIntRep = converter.theInt;
I don't know if this is better, but it's a different way to look at it.

Is there a way I can use a 2-bit size type instead of an int, by just plugging in the new type name instead of int?

I have an application where I need to save as much memory as possible. I need to store a large amount of data in which each item takes exactly one of three possible values. So I have been trying to use a 2-bit type.
One possibility is using bit fields. I could do
struct myType {
uint8_t twoBits : 2;
};
This is a suggestion from this thread.
However, everywhere where I have used int variables prior to this, I would need to change their usage by appending a .twoBits. I checked if I can create a bit field outside of a struct, such as
uint8_t twoBits : 2;
but this thread says it is not possible. However, that thread is specific to C, so I am not sure whether it applies to C++.
Is there a clean way I can define a 2-bit type, so that by simply replacing int with my type, I can run the program correctly? Or is using bit fields the only possible way?
The CPU, and thus the memory, the bus, and the compiler too, work only with bytes or groups of bytes. There's no way to store a 2-bit type without also storing the remaining 6 bits.
What you can do is define a struct that only uses some bits. But be aware that it will not save memory.
You can pack several x-bit types in a struct, as you already know. Or you can use bit operations to pack/unpack them into an integer type.
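As a rough sketch of the bit-operation route, four 2-bit values can share one byte. The helper names set2 and get2, and the convention that slot 0 occupies the lowest two bits, are assumptions made for this example only:

```cpp
#include <cstdint>

// Hypothetical helper: return `packed` with the 2-bit field `slot` (0..3)
// replaced by the low two bits of `value`.
std::uint8_t set2(std::uint8_t packed, unsigned slot, unsigned value)
{
    const unsigned shift = (slot & 3u) * 2u;
    const unsigned cleared = packed & ~(3u << shift);   // wipe the old field
    return static_cast<std::uint8_t>(cleared | ((value & 3u) << shift));
}

// Hypothetical helper: read the 2-bit field `slot` (0..3).
unsigned get2(std::uint8_t packed, unsigned slot)
{
    return (packed >> ((slot & 3u) * 2u)) & 3u;
}
```

An array of such bytes stores four values per byte, at the cost of a shift and a mask on every access.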
Is there a clean way I can define a 2-bit type, so that by simply
replacing int with my type, I can run the program correctly? Or is
using bit fields the only possible way?
You can try to make the struct as transparent as possible by providing implicit conversion operators and constructors:
#include <cstdint>
#include <iostream>
template <std::size_t N, typename T = unsigned>
struct bit_field {
T rep : N;
operator T() { return rep; }
bit_field(T i) : rep{ i } { }
bit_field() = default;
};
using myType = bit_field<2, std::uint8_t>;
int main() {
myType mt;
mt = 3;
std::cout << static_cast<unsigned>(mt) << "\n"; // cast so the uint8_t prints as a number, not as a character
}
So objects of type myType somewhat behave like real 2-bit unsigned integers, despite having more than 2 bits.
Of course, the residual bits are unused, but as single bits are not addressable on most systems, this is the best way to go.
I'm not convinced that you will save anything with your existing structure, as the surrounding structure still gets rounded up to a whole number of bytes.
You can write the following to squeeze 4 2-bit counters into 1 byte, but as you say, you have to name them myInst.f0:
struct MyStruct
{
ubyte_t f0:2,
f1:2,
f2:2,
f3:2;
} myInst;
In c and c++98, you can declare this anonymous, but this usage is deprecated. You can now access the 4 values directly by name:
struct
{ // deprecated!
ubyte_t f0:2,
f1:2,
f2:2,
f3:2;
};
You could declare some sort of template that wraps a single instance with an operator int and operator =(int), and then define a union to put the 4 instances at the same location, but again anonymous unions are deprecated. However you could then declare references to your 4 values, but then you are paying for the references, which are bigger than the bytes you were trying to save!
template <class Size,int offset,int bits>
struct Bitz
{
Size ignore : offset,
value : bits;
operator Size()const { return value; }
Size operator = (Size val) { return (value = val); }
};
template <class Size,int bits>
struct Bitz0
{ // I know this can be done better
Size value : bits;
operator Size()const { return value; }
Size operator = (Size val) { return (value = val); }
};
static union
{ // Still deprecated!
Bitz0<char, 2> F0;
Bitz<char, 2, 2> F1;
Bitz<char, 4, 2> F2;
Bitz<char, 6, 2> F3;
};
union
{
Bitz0<char, 2> F0;
Bitz<char, 2, 2> F1;
Bitz<char, 4, 2> F2;
Bitz<char, 6, 2> F3;
} bitz;
Bitz0<char, 2>& F0 = bitz.F0; /// etc...
Alternatively, you could simply declare macros to replace the dotted name with a simple name (how 1970s):
#define myF0 myInst.f0
Note that you can't pass bitfields by reference or pointer, as they don't have a byte address, only by value and assignment.
A very minimal example of a bit array with a proxy class that looks (for the most part) like you were dealing with an array of very small integers.
#include <cstdint>
#include <iostream>
#include <vector>
class proxy
{
uint8_t & byte;
unsigned int shift;
public:
proxy(uint8_t & byte,
unsigned int shift):
byte(byte),
shift(shift)
{
}
proxy(const proxy & src):
byte(src.byte),
shift(src.shift)
{
}
proxy & operator=(const proxy &) = delete;
proxy & operator=(unsigned int val)
{
if (val <=3)
{
uint8_t wipe = 3 << shift;
byte &= ~wipe;
byte |= val << shift;
}
// might want to throw std::out_of_range here
return *this;
}
operator int() const
{
return (byte >> shift) &0x03;
}
};
Proxy holds a reference to a byte and knows how to extract two specific bits and look like an int to anyone who uses it.
If we wrap an array of bits packed into bytes with a class that returns this proxy object wrapped around the appropriate byte, we now have something that looks a lot like an array of very small ints.
class bitarray
{
size_t size;
std::vector<uint8_t> data;
public:
bitarray(size_t size):
size(size),
data((size + 3) / 4)
{
}
proxy operator[](size_t index)
{
return proxy(data[index/4], (index % 4) * 2);
}
};
If you want to extend this and go the distance, Writing your own STL Container should help you make a fully armed and operational bit-packed array.
There's room for abuse here. The caller can hold onto a proxy and get up to whatever manner of evil this allows.
Use of this primitive example:
int main()
{
bitarray arr(10);
arr[0] = 1;
arr[1] = 2;
arr[2] = 3;
arr[3] = 1;
arr[4] = 2;
arr[5] = 3;
arr[6] = 1;
arr[7] = 2;
arr[8] = 3;
arr[9] = 1;
std::cout << arr[0] << std::endl;
std::cout << arr[1] << std::endl;
std::cout << arr[2] << std::endl;
std::cout << arr[3] << std::endl;
std::cout << arr[4] << std::endl;
std::cout << arr[5] << std::endl;
std::cout << arr[6] << std::endl;
std::cout << arr[7] << std::endl;
std::cout << arr[8] << std::endl;
std::cout << arr[9] << std::endl;
}
Simply, build on top of bitset, something like:
#include<bitset>
#include<iostream>
using namespace std;
template<int N>
class mydoublebitset
{
public:
uint_least8_t operator[](size_t index)
{
return 2 * b[index * 2 + 1] + b[index * 2 ];
}
void set(size_t index, uint_least8_t store)
{
switch (store)
{
case 3:
b[index * 2] = 1;
b[index * 2 + 1] = 1;
break;
case 2:
b[index * 2] = 0;
b[index * 2 + 1] = 1;
break;
case 1:
b[index * 2] = 1;
b[index * 2 + 1] = 0;
break;
case 0:
b[index * 2] = 0;
b[index * 2 + 1] = 0;
break;
default:
throw exception();
}
}
private:
bitset<N * 2> b;
};
int main()
{
mydoublebitset<12> mydata;
mydata.set(0, 0);
mydata.set(1, 2);
mydata.set(2, 2);
cout << (unsigned int)mydata[0] << (unsigned int)mydata[1] << (unsigned int)mydata[2] << endl;
system("pause");
return 0;
}
Basically, use a bitset with twice the size and index it accordingly. It's simpler and as memory-efficient as you require.

C++ What should we pass in MurmurHash3 parameters?

I am confused about which parameters I should provide to MurmurHash3_x86_128(). The MurmurHash3 code can be found at https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp. The method definition is given below.
void MurmurHash3_x86_128 ( const void * key, const int len,
uint32_t seed, void * out )
I have passed the following values to the above method, but I get a segmentation fault. What am I doing wrong?
int main()
{
uint64_t seed = 1;
uint64_t *hash_otpt;
const char *key = "hi";
MurmurHash3_x64_128(key, (uint64_t)strlen(key), seed, hash_otpt);
cout << "hashed" << hash_otpt << endl;
return 0;
}
This function puts its hash in 128 bits of memory.
What you are doing is passing it a pointer that does not yet point to allocated memory.
The correct usage would be something like this:
int main()
{
uint64_t seed = 1;
uint64_t hash_otpt[2]; // allocate 128 bits
const char *key = "hi";
MurmurHash3_x64_128(key, (uint64_t)strlen(key), seed, hash_otpt);
cout << "hashed" << hash_otpt[0] << hash_otpt[1] << endl;
return 0;
}
You could have noticed that by looking at how MurmurHash3_x64_128 fills the out parameter:
((uint64_t*)out)[0] = h1;
((uint64_t*)out)[1] = h2;
hash_otpt is a pointer to nothing, but the function expects the fourth argument to be a pointer to some memory as it writes its output into this memory. In your example, it attempts a write operation, but fails (there's nowhere to write to as the pointer is not initialized). This gives you a SegmentationFault.
Figure out how many uint64_ts the hash fits into (2, because the output's size is 128 bits, and the size of a uint64_t is 64 bits) and allocate the memory:
hash_otpt = new uint64_t [2];
If you look at the documentation, you can see
MurmurHash3_x64_128 ... It has a 128-bit output.
So, your code can be something like this
uint64_t hash_otpt[2]; // This is 128 bits
MurmurHash3_x64_128(key, (uint64_t)strlen(key), seed, hash_otpt);
Note that you don't have to dynamically allocate the output at all.
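The ownership rule behind these answers can be sketched without the real library. In the block below, fake_hash128 is a hypothetical stand-in that only mimics the out-parameter convention (it writes exactly 16 bytes to out); it is NOT MurmurHash3 and computes nothing useful. The helper call_with_buffer is likewise invented for the example.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in with the same out-parameter convention as the real
// function: it writes exactly 16 bytes to `out`. It is not a hash.
void fake_hash128(const void *key, int len, std::uint32_t seed, void *out)
{
    (void)key;                                   // a real hash would read the key bytes
    const std::uint64_t h[2] = { seed, static_cast<std::uint64_t>(len) };
    std::memcpy(out, h, sizeof h);               // the caller must own these 16 bytes
}

// The caller provides the storage, so the callee has somewhere valid to write.
std::array<std::uint64_t, 2> call_with_buffer(const char *key)
{
    std::array<std::uint64_t, 2> result{};       // 128 bits on the stack
    fake_hash128(key, static_cast<int>(std::strlen(key)), 1u, result.data());
    return result;
}
```

Passing an uninitialized uint64_t * instead of result.data() would hand the callee an arbitrary address, which is the situation in the question.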

How to byteswap a double?

I'm trying to write a byteswap routine for a C++ program running on Win XP. I'm compiling with Visual Studio 2008. This is what I've come up with:
int byteswap(int v) // This is good
{
return _byteswap_ulong(v);
}
double byteswap(double v) // This doesn't work for some values
{
union { // This trick is first used in Quake2 source I believe :D
__int64 i;
double d;
} conv;
conv.d = v;
conv.i = _byteswap_uint64(conv.i);
return conv.d;
}
And a function to test:
void testit() {
double a, b, c;
CString str;
for (a = -100; a < 100; a += 0.01) {
b = byteswap(a);
c = byteswap(b);
if (a != c) {
str.Format("%15.15f %15.15f %15.15f", a, c, a - c);
}
}
}
Getting these numbers not matching:
-76.789999999988126 -76.790000000017230 0.000000000029104
-30.499999999987718 -30.499999999994994 0.000000000007276
41.790000000014508 41.790000000029060 -0.000000000014552
90.330000000023560 90.330000000052664 -0.000000000029104
This is after having read through:
How do I convert between big-endian and little-endian values in C++?
Little Endian - Big Endian Problem
You can't use << and >> on double, by the way (unless I'm mistaken?)
Although a double in main memory is 64 bits, the x87 floating-point registers on x86 CPUs are 80 bits wide. So if one of your values is stored in a register throughout, but the other makes a round-trip through main memory and is truncated to 64 bits, this could explain the small differences you're seeing.
Maybe you can force variables to live in main memory by taking their address (and printing it, to prevent the compiler from optimizing it out), but I'm not certain that this is guaranteed to work.
b = byteswap(a);
That's a problem. After swapping the bytes, the value is no longer a proper double. Storing it back to a double is going to cause subtle problems when the FPU normalizes the value. You have to store it back into an __int64 (long long). Modify the return type of the method.
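One way to follow this advice is to keep the swapped bit pattern in an integer and only turn it back into a double after swapping again. The sketch below assumes a 64-bit double and uses portable code instead of _byteswap_uint64; the names swap64, double_to_swapped and swapped_to_double are illustrative.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reverse the byte order of a 64-bit integer.
std::uint64_t swap64(std::uint64_t v)
{
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xFF);   // move the lowest byte of v to the end of r
        v >>= 8;
    }
    return r;
}

// Byte-swapped image of a double, kept as an integer so that it is never
// loaded into a floating-point register.
std::uint64_t double_to_swapped(double d)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits wide");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return swap64(bits);
}

// Undo the swap first, then reinterpret the bits as a double.
double swapped_to_double(std::uint64_t swapped)
{
    const std::uint64_t bits = swap64(swapped);
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Because the swapped value only ever exists as a uint64_t, the round trip restores every bit of the original double.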
Try 3
Okay, found out there's a better way. The other way you have to worry about the order you pack/unpack stuff. This way you don't:
// int and float
static void swap4(void *v)
{
char in[4], out[4];
memcpy(in, v, 4);
out[0] = in[3];
out[1] = in[2];
out[2] = in[1];
out[3] = in[0];
memcpy(v, out, 4);
}
// double
static void swap8(void *v)
{
char in[8], out[8];
memcpy(in, v, 8);
out[0] = in[7];
out[1] = in[6];
out[2] = in[5];
out[3] = in[4];
out[4] = in[3];
out[5] = in[2];
out[6] = in[1];
out[7] = in[0];
memcpy(v, out, 8);
}
typedef struct
{
int theint;
float thefloat;
double thedouble;
} mystruct;
static void swap_mystruct(void *buf)
{
mystruct *ps = (mystruct *) buf;
swap4(&ps->theint);
swap4(&ps->thefloat);
swap8(&ps->thedouble);
}
Send:
char buf[sizeof (mystruct)];
memcpy(buf, &s, sizeof (mystruct));
swap_mystruct(buf);
Recv:
mystruct s;
swap_mystruct(buf);
memcpy(&s, buf, sizeof (mystruct));
Try 2
Okay, got it working! Hans Passant was right. They got me thinking with the "no longer a proper double" comment. So you can't byteswap a float into another float because then it might be in an improper format, so you have to byteswap to a char array and unswap back. This is the code I used:
int pack(int value, char *buf)
{
union temp {
int value;
char c[4];
} in, out;
in.value = value;
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(buf, out.c, 4);
return 4;
}
int pack(float value, char *buf)
{
union temp {
float value;
char c[4];
} in, out;
in.value = value;
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(buf, out.c, 4);
return 4;
}
int pack(double value, char *buf)
{
union temp {
double value;
char c[8];
} in, out;
in.value = value;
out.c[0] = in.c[7];
out.c[1] = in.c[6];
out.c[2] = in.c[5];
out.c[3] = in.c[4];
out.c[4] = in.c[3];
out.c[5] = in.c[2];
out.c[6] = in.c[1];
out.c[7] = in.c[0];
memcpy(buf, out.c, 8);
return 8;
}
int unpack(char *buf, int *value)
{
union temp {
int value;
char c[4];
} in, out;
memcpy(in.c, buf, 4);
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(value, &out.value, 4);
return 4;
}
int unpack(char *buf, float *value)
{
union temp {
float value;
char c[4];
} in, out;
memcpy(in.c, buf, 4);
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(value, &out.value, 4);
return 4;
}
int unpack(char *buf, double *value)
{
union temp {
double value;
char c[8];
} in, out;
memcpy(in.c, buf, 8);
out.c[0] = in.c[7];
out.c[1] = in.c[6];
out.c[2] = in.c[5];
out.c[3] = in.c[4];
out.c[4] = in.c[3];
out.c[5] = in.c[2];
out.c[6] = in.c[1];
out.c[7] = in.c[0];
memcpy(value, &out.value, 8);
return 8;
}
And a simple test function:
typedef struct
{
int theint;
float thefloat;
double thedouble;
} mystruct;
void PackStruct()
{
char buf[sizeof (mystruct)];
char *p;
p = buf;
mystruct foo, foo2;
foo.theint = 1;
foo.thefloat = 3.14f;
foo.thedouble = 400.5;
p += pack(foo.theint, p);
p += pack(foo.thefloat, p);
p += pack(foo.thedouble, p);
// Send or recv char array
p = buf;
p += unpack(p, &foo2.theint);
p += unpack(p, &foo2.thefloat);
p += unpack(p, &foo2.thedouble);
}
How to swap the bytes in any basic data type or array of bytes
ie: How to swap the bytes in place in any array, variable, or any other memory block, such as an int16_t, uint16_t, uint32_t, float, double, etc.:
Here's a way to improve the efficiency from 3 entire copy operations of the array to 1.5 entire copy operations of the array. See also the comments I left under your answer. I said:
Get rid of this: memcpy(in, v, 4); and just copy-swap straight into out from v, then memcpy the swapped values back from out into v. This saves you an entire unnecessary copy, reducing your copies of the entire array from 3 to 2.
There's also a further optimization to reduce the copies of the entire array from 2 to 1.5: copy the left half of the array into temporary variables, and the right half of the array straight into the left half, swapping as appropriate. Then copy from the temporary variables, which contain the old left half of the array, into the right half of the array, swapping as appropriate. This results in the equivalent of only 1.5 copy operations of the entire array, which is more efficient. Do all this in place in the original array, aside from the temp variables you require for half of the array.
1. Here is my general C and C++ solution:
/// \brief Swap all the bytes in an array to convert from little-endian
/// byte order to big-endian byte order, or vice versa.
/// \note Works for arrays of any size. Swaps the bytes **in place**
/// in the array.
/// \param[in,out] byte_array The array in which to swap the bytes in-place.
/// \param[in] len The length (in bytes) of the array.
/// \return None
void swap_bytes_in_array(uint8_t * byte_array, size_t len)
{
if (len < 2) return; // nothing to swap; also keeps len - 1 from wrapping around when len == 0
size_t i_left = 0; // index for left side of the array
size_t i_right = len - 1; // index for right side of the array
while (i_left < i_right)
{
// swap left and right bytes
uint8_t left_copy = byte_array[i_left];
byte_array[i_left] = byte_array[i_right];
byte_array[i_right] = left_copy;
i_left++;
i_right--;
}
}
Usage:
// array of bytes
uint8_t bytes_array[16];
// Swap the bytes in this array of bytes in place
swap_bytes_in_array(bytes_array, sizeof(bytes_array));
double d;
// Swap the bytes in the double in place
swap_bytes_in_array((uint8_t*)(&d), sizeof(d));
uint64_t u64;
// swap the bytes in a uint64_t in place
swap_bytes_in_array((uint8_t*)(&u64), sizeof(u64));
2. And here is an optional C++ template wrapper around that to make it even easier to use in C++:
template <typename T>
void swap_bytes(T *var)
{
// Note that `sizeof(*var)` is the exact same thing as `sizeof(T)`
swap_bytes_in_array((uint8_t*)var, sizeof(*var));
}
Usage:
double d;
// Swap the bytes in the double in place
swap_bytes(&d);
uint64_t u64;
// swap the bytes in a uint64_t in place
swap_bytes(&u64);
Notes & unanswered questions
Note, however, that @Hans Passant seems to be onto something here. Although the above works perfectly on any signed or unsigned integer type, and seems to work on float and double for me too, it seems to be broken on long double. I think it's because when I store the swapped long double back into a long double variable, if it is determined to be not-a-valid long double representation anymore, something automatically changes a few of the swapped bytes or something. I'm not entirely sure.
On many 64-bit systems, long double is 16 bytes, so perhaps the solution is to keep the swapped version of the long double inside a 16-byte array and NOT attempt to use it or cast it back to a long double from the uint8_t 16-byte array until either A) it has been sent to the receiver (where the endianness of the system is opposite, so it's in good shape now) and/or B) byte-swapped back again so it's a valid long double again.
Keep the above in mind in case you see problems with float or double types too, as I see with only long double types.
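A possible shape for that workaround is sketched below: the swapped image lives in a byte array and is never read as a floating-point value until it has been swapped back. The function names to_swapped_bytes and from_swapped_bytes are assumptions for this example, and the sketch has not been checked on every platform's long double layout.

```cpp
#include <algorithm>
#include <array>
#include <cstring>

// Hypothetical helper: copy the bytes of `value` and reverse them.
// The result is just bytes; it is never interpreted as a T.
template <typename T>
std::array<unsigned char, sizeof(T)> to_swapped_bytes(const T &value)
{
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &value, sizeof(T));
    std::reverse(bytes.begin(), bytes.end());
    return bytes;
}

// Hypothetical helper: reverse the bytes again, then copy them into a T.
template <typename T>
T from_swapped_bytes(std::array<unsigned char, sizeof(T)> bytes)
{
    std::reverse(bytes.begin(), bytes.end());
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

The type has to be named when unpacking, for example from_swapped_bytes<long double>(bytes), because it cannot be deduced from the array alone.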
Linux byteswap and endianness and host-to-network byte order utilities
Linux also has a bunch of built-in utilities via gcc GNU extensions that you can use. See:
https://man7.org/linux/man-pages/man3/bswap.3.html - #include <byteswap.h>
https://man7.org/linux/man-pages/man3/endian.3.html - #include <endian.h>
https://man7.org/linux/man-pages/man3/byteorder.3.html - #include <arpa/inet.h> - generally used for network sockets (Ethernet packets) and things; inet stands for "internet"
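As a small illustration of the last entry, htonl and ntohl from <arpa/inet.h> can be used to serialize a 32-bit value in network (big-endian) order regardless of the host's endianness. The wrapper names to_network_bytes and from_network_bytes are invented for this sketch, which assumes a POSIX system.

```cpp
#include <arpa/inet.h>   // htonl, ntohl (POSIX)
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: produce the four bytes of `host_value` in network order.
std::array<unsigned char, 4> to_network_bytes(std::uint32_t host_value)
{
    const std::uint32_t net = htonl(host_value);
    std::array<unsigned char, 4> bytes;
    std::memcpy(bytes.data(), &net, sizeof net);
    return bytes;
}

// Hypothetical helper: read four network-order bytes back into a host value.
std::uint32_t from_network_bytes(const std::array<unsigned char, 4> &bytes)
{
    std::uint32_t net;
    std::memcpy(&net, bytes.data(), sizeof net);
    return ntohl(net);
}
```

On a big-endian host both calls are no-ops, so the same code can be compiled unchanged on either kind of machine.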

C/C++ efficient bit array

Can you recommend efficient/clean way to manipulate arbitrary length bit array?
Right now I am using regular int/char bitmask, but those are not very clean when array length is greater than datatype length.
std vector<bool> is not available for me.
Since you mention C as well as C++, I'll assume that a C++-oriented solution like boost::dynamic_bitset might not be applicable, and talk about a low-level C implementation instead. Note that if something like boost::dynamic_bitset works for you, or there's a pre-existing C library you can find, then using them can be better than rolling your own.
Warning: None of the following code has been tested or even compiled, but it should be very close to what you'd need.
To start, assume you have a fixed bitset size N. Then something like the following works:
typedef uint32_t word_t;
enum { WORD_SIZE = sizeof(word_t) * 8 };
word_t data[N / 32 + 1];
inline int bindex(int b) { return b / WORD_SIZE; }
inline int boffset(int b) { return b % WORD_SIZE; }
void set_bit(int b) {
data[bindex(b)] |= (word_t)1 << boffset(b);
}
void clear_bit(int b) {
data[bindex(b)] &= ~((word_t)1 << boffset(b));
}
int get_bit(int b) {
return (data[bindex(b)] & ((word_t)1 << boffset(b))) != 0;
}
void clear_all() { /* set all elements of data to zero */ }
void set_all() { /* set all bits of data to one */ }
As written, this is a bit crude since it implements only a single global bitset with a fixed size. To address these problems, you want to start with a data structure something like the following:
struct bitset { word_t *words; int nwords; };
and then write functions to create and destroy these bitsets.
struct bitset *bitset_alloc(int nbits) {
struct bitset *bitset = malloc(sizeof(*bitset));
bitset->nwords = (nbits / WORD_SIZE + 1);
bitset->words = malloc(sizeof(*bitset->words) * bitset->nwords);
bitset_clear(bitset);
return bitset;
}
void bitset_free(struct bitset *bitset) {
free(bitset->words);
free(bitset);
}
Now, it's relatively straightforward to modify the previous functions to take a struct bitset * parameter. There's still no way to re-size a bitset during its lifetime, nor is there any bounds checking, but neither would be hard to add at this point.
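For comparison, here is a rough C++ sketch of where those two extensions could lead: a run-time sized bit set with bounds checking. The class name BitSet, its members and the choice to throw std::out_of_range are all assumptions of this example, not part of the C design above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr std::size_t kWordBits = 32;   // bits per storage word

// Hypothetical run-time sized bit set with bounds checking.
class BitSet
{
    std::size_t nbits_;
    std::vector<std::uint32_t> words_;

    void check(std::size_t b) const
    {
        if (b >= nbits_) throw std::out_of_range("bit index out of range");
    }

public:
    // Round the word count up so that every bit has storage; all bits start at 0.
    explicit BitSet(std::size_t nbits)
        : nbits_(nbits), words_((nbits + kWordBits - 1) / kWordBits, 0u) {}

    void set(std::size_t b)
    {
        check(b);
        words_[b / kWordBits] |= std::uint32_t{1} << (b % kWordBits);
    }
    void clear(std::size_t b)
    {
        check(b);
        words_[b / kWordBits] &= ~(std::uint32_t{1} << (b % kWordBits));
    }
    bool test(std::size_t b) const
    {
        check(b);
        return (words_[b / kWordBits] >> (b % kWordBits)) & 1u;
    }
    std::size_t size() const { return nbits_; }
};
```

Using std::vector for the words removes the need for separate alloc and free functions, which is the main difference from the C version.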
boost::dynamic_bitset if the length is only known in run time.
std::bitset if the length is known in compile time (although arbitrary).
I've written a working implementation based off Dale Hagglund's response to provide a bit array in C (BSD license).
https://github.com/noporpoise/BitArray/
Please let me know what you think / give suggestions. I hope people looking for a response to this question find it useful.
This posting is rather old, but there is an efficient bit array suite in C in my ALFLB library.
For many microcontrollers without a hardware-division opcode, this library is EFFICIENT because it doesn't use division: instead, masking and bit-shifting are used. (Yes, I know some compilers will convert division by 8 to a shift, but this varies from compiler to compiler.)
It has been tested on arrays up to 2^32-2 bits (about 4 billion bits stored in 536 MBytes), although last 2 bits should be accessible if not used in a for-loop in your application.
See below for an extract from the doco. Doco is http://alfredo4570.net/src/alflb_doco/alflb.pdf, library is http://alfredo4570.net/src/alflb.zip
Enjoy,
Alf
//------------------------------------------------------------------
BM_DECLARE( arrayName, bitmax);
Macro to instantiate an array to hold bitmax bits.
//------------------------------------------------------------------
UCHAR *BM_ALLOC( BM_SIZE_T bitmax);
mallocs an array (of unsigned char) to hold bitmax bits.
Returns: NULL if memory could not be allocated.
//------------------------------------------------------------------
void BM_SET( UCHAR *bit_array, BM_SIZE_T bit_index);
Sets a bit to 1.
//------------------------------------------------------------------
void BM_CLR( UCHAR *bit_array, BM_SIZE_T bit_index);
Clears a bit to 0.
//------------------------------------------------------------------
int BM_TEST( UCHAR *bit_array, BM_SIZE_T bit_index);
Returns: TRUE (1) or FALSE (0) depending on a bit.
//------------------------------------------------------------------
int BM_ANY( UCHAR *bit_array, int value, BM_SIZE_T bitmax);
Returns: TRUE (1) if array contains the requested value (i.e. 0 or 1).
//------------------------------------------------------------------
UCHAR *BM_ALL( UCHAR *bit_array, int value, BM_SIZE_T bitmax);
Sets or clears all elements of a bit array to your value. Typically used after a BM_ALLOC.
Returns: Copy of address of bit array
//------------------------------------------------------------------
void BM_ASSIGN( UCHAR *bit_array, int value, BM_SIZE_T bit_index);
Sets or clears one element of your bit array to your value.
//------------------------------------------------------------------
BM_MAX_BYTES( int bit_max);
Utility macro to calculate the number of bytes to store bitmax bits.
Returns: A number specifying the number of bytes required to hold bitmax bits.
//------------------------------------------------------------------
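The division-free indexing mentioned in this answer can be illustrated as follows. This is not the ALFLB source, which is not reproduced here; the function names bit_set, bit_clear and bit_test are hypothetical. The byte index is the bit index shifted right by 3, and the position inside the byte is the bit index masked with 7.

```cpp
#include <cstddef>

// Hypothetical helpers, not the ALFLB API.
inline void bit_set(unsigned char *bits, std::size_t index)
{
    bits[index >> 3] |= static_cast<unsigned char>(1u << (index & 7u));
}

inline void bit_clear(unsigned char *bits, std::size_t index)
{
    bits[index >> 3] &= static_cast<unsigned char>(~(1u << (index & 7u)));
}

inline bool bit_test(const unsigned char *bits, std::size_t index)
{
    return (bits[index >> 3] >> (index & 7u)) & 1u;
}
```

None of these helpers checks bounds, so the caller is responsible for allocating at least (bitmax + 7) / 8 bytes.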
You can use std::bitset
int main() {
const bitset<12> mask(2730ul);
cout << "mask = " << mask << endl;
bitset<12> x;
cout << "Enter a 12-bit bitset in binary: " << flush;
if (cin >> x) {
cout << "x = " << x << endl;
cout << "As ulong: " << x.to_ulong() << endl;
cout << "And with mask: " << (x & mask) << endl;
cout << "Or with mask: " << (x | mask) << endl;
}
}
I know it's an old post but I came here to find a simple C bitset implementation and none of the answers quite matched what I was looking for, so I implemented my own based on Dale Hagglund's answer. Here it is :)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
typedef uint32_t word_t;
enum { BITS_PER_WORD = 32 };
struct bitv { word_t *words; int nwords; int nbits; };
struct bitv* bitv_alloc(int bits) {
struct bitv *b = malloc(sizeof(struct bitv));
if (b == NULL) {
fprintf(stderr, "Failed to alloc bitv\n");
exit(1);
}
b->nwords = (bits >> 5) + 1;
b->nbits = bits;
b->words = malloc(sizeof(*b->words) * b->nwords);
if (b->words == NULL) {
fprintf(stderr, "Failed to alloc bitv->words\n");
exit(1);
}
memset(b->words, 0, sizeof(*b->words) * b->nwords);
return b;
}
static inline void check_bounds(struct bitv *b, int bit) {
if (b->nbits < bit) {
fprintf(stderr, "Attempted to access a bit out of range\n");
exit(1);
}
}
void bitv_set(struct bitv *b, int bit) {
check_bounds(b, bit);
b->words[bit >> 5] |= 1 << (bit % BITS_PER_WORD);
}
void bitv_clear(struct bitv *b, int bit) {
check_bounds(b, bit);
b->words[bit >> 5] &= ~(1 << (bit % BITS_PER_WORD));
}
int bitv_test(struct bitv *b, int bit) {
check_bounds(b, bit);
return b->words[bit >> 5] & (1 << (bit % BITS_PER_WORD));
}
void bitv_free(struct bitv *b) {
if (b != NULL) {
if (b->words != NULL) free(b->words);
free(b);
}
}
void bitv_dump(struct bitv *b) {
if (b == NULL) return;
for(int i = 0; i < b->nwords; i++) {
word_t w = b->words[i];
for (int j = 0; j < BITS_PER_WORD; j++) {
printf("%d", w & 1);
w >>= 1;
}
printf(" ");
}
printf("\n");
}
void test(struct bitv *b, int bit) {
if (bitv_test(b, bit)) printf("Bit %d is set!\n", bit);
else printf("Bit %d is not set!\n", bit);
}
int main(int argc, char *argv[]) {
struct bitv *b = bitv_alloc(32);
bitv_set(b, 1);
bitv_set(b, 3);
bitv_set(b, 5);
bitv_set(b, 7);
bitv_set(b, 9);
bitv_set(b, 32);
bitv_dump(b);
bitv_free(b);
return 0;
}
I use this one:
//#include <bitset>
#include <iostream>
//source http://stackoverflow.com/questions/47981/how-do-you-set-clear-and-toggle-a-single-bit-in-c
#define BIT_SET(a,b) ((a) |= (1<<(b)))
#define BIT_CLEAR(a,b) ((a) &= ~(1<<(b)))
#define BIT_FLIP(a,b) ((a) ^= (1<<(b)))
#define BIT_CHECK(a,b) ((a) & (1<<(b)))
/* x=target variable, y=mask */
#define BITMASK_SET(x,y) ((x) |= (y))
#define BITMASK_CLEAR(x,y) ((x) &= (~(y)))
#define BITMASK_FLIP(x,y) ((x) ^= (y))
#define BITMASK_CHECK(x,y) ((x) & (y))
I have recently released BITSCAN, a C++ bit string library which is specifically oriented towards fast bit scanning operations. BITSCAN is available here. It is in alpha but still pretty well tested since I have used it in recent years for research in combinatorial optimization (e.g. in BBMC, a state of the art exact maximum clique algorithm). A comparison with other well known C++ implementations (STL or BOOST) may be found here.
I hope you find it useful. Any feedback is welcome.
In microcontroller development, we sometimes need to use a
2-dimensional array (matrix) whose elements take only the values 0 or 1. That
means that using 1 byte per element wastes a great deal of memory
(the memory of a microcontroller is very limited). The proposed solution is
to use a 1-bit matrix (the element type is 1 bit).
http://htvdanh.blogspot.com/2016/09/one-bit-matrix-for-cc-programming.html
I recently implemented a small header-only library called BitContainer just for this purpose.
It focuses on expressiveness and compiletime abilities and can be found here:
https://github.com/EddyXorb/BitContainer
It is for sure not the classical way to look at bitarrays but can come in handy for strong-typing purposes and memory efficient representation of named properties.
Example:
constexpr Props props(Prop::isHigh(),Prop::isLow()); // initialize BitContainer of type Props with strong-type Prop
constexpr bool result1 = props.contains(Prop::isTiny()); // false
constexpr bool result2 = props.contains(Prop::isLow()); // true