I am dealing with a very large list of booleans in C++: around 2^N items of N booleans each. Because memory is critical with this kind of exponential growth, I would like to build an N-bit variable to store each element.
For small N, for example 24, I just use unsigned long int. It takes 64MB ((2^24)*32/8/1024/1024). But I need to go up to 36. The only option among the built-in types is unsigned long long int, but it takes 512GB ((2^36)*64/8/1024/1024/1024), which is a bit too much.
With a 36-bits variable, it would work for me because the size drops to 288GB ((2^36)*36/8/1024/1024/1024), which fits on a node of my supercomputer.
I tried std::bitset, but std::bitset< N > creates an element of at least 8B.
So a list of std::bitset< 1 > is much larger than a list of unsigned long int.
This is because std::bitset only changes the representation, not the container.
I also tried boost::dynamic_bitset<> from Boost, but the result is even worse (at least 32B!), for the same reason.
I know one option is to write all elements as one chain of booleans, 2473901162496 (2^36*36) of them, and then store them in 38654705664 (2473901162496/64) unsigned long long int, which gives 288GB (38654705664*64/8/1024/1024/1024). Accessing an element is then just a matter of finding which words hold its 36 bits (either one or two). But that means rewriting a lot of the existing code (3000 lines), because mapping becomes impossible and because adding and deleting items during execution in some functions will surely be complicated and confusing, and the result will most likely not be efficient.
How to build a N-bits variable in C++?
How about a struct with 5 chars (and perhaps some fancy operator overloading as needed to keep it compatible to the existing code)? A struct with a long and a char probably won't work because of padding / alignment...
Basically your own mini BitSet optimized for size:
struct Bitset40 {
    unsigned char data[5];
    // Read the bit at position index (0..39).
    bool getBit(int index) const {
        return (data[index / 8] & (1 << (index % 8))) != 0;
    }
    // Set or clear the bit at position index (0..39).
    void setBit(int index, bool newVal) {
        if (newVal) {
            data[index / 8] |= (1 << (index % 8));
        } else {
            data[index / 8] &= ~(1 << (index % 8));
        }
    }
};
Edit: As geza has also pointed out in the comments, the "trick" here is to get as close as possible to the minimum number of bytes needed (without wasting memory by triggering alignment losses, padding or pointer indirection, see http://www.catb.org/esr/structure-packing/).
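To keep such a 5-byte struct compatible with code that currently works on integers, it could convert to and from a 64-bit value. The following is only a minimal sketch: `Packed40`, `fromU64` and `toU64` are names made up for illustration, and it assumes 8-bit bytes stored least significant byte first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 5-byte value type: room for 40 bits, of which 36 are needed.
struct Packed40 {
    unsigned char data[5];

    // Store the low 40 bits of v, least significant byte first.
    static Packed40 fromU64(std::uint64_t v) {
        Packed40 p{};
        for (int i = 0; i < 5; ++i) {
            p.data[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
        }
        return p;
    }

    // Rebuild the integer from the five stored bytes.
    std::uint64_t toU64() const {
        std::uint64_t v = 0;
        for (int i = 0; i < 5; ++i) {
            v |= static_cast<std::uint64_t>(data[i]) << (8 * i);
        }
        return v;
    }
};

// An array of unsigned char has alignment 1, so no padding is expected.
static_assert(sizeof(Packed40) == 5, "Packed40 should occupy exactly 5 bytes");
```

With 2^36 elements this costs 2^36 * 5 bytes = 320GB, slightly more than the 288GB of exact 36-bit packing, so whether it fits on the node has to be checked.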
Edit 2: If you feel adventurous, you could also try a bit field (and please let us know how much space it actually consumes):
struct Bitset36 {
    unsigned long long data:36;
};
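A quick way to answer the space question on a given compiler is to print `sizeof`. Bit-field layout is implementation-defined; on common 64-bit compilers I would expect the struct to be padded up to the size of `unsigned long long` (8 bytes), but that is an assumption to verify, not a guarantee. `Bitset36Field` and `bitset36FieldSize` are hypothetical names used only in this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct Bitset36Field {
    unsigned long long data : 36;  // only the low 36 bits are stored
};

// Report how many bytes the compiler really uses for one element.
std::size_t bitset36FieldSize() {
    std::printf("sizeof(Bitset36Field) = %zu bytes\n", sizeof(Bitset36Field));
    return sizeof(Bitset36Field);
}
```

If this reports 8, the bit field saves nothing over a plain unsigned long long in an array, and a byte array or a manually packed word array is the way to go.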
I'm not an expert, but this is what I would "try". Find the bytes for the smallest type your compiler supports (should be char). You can check with sizeof and you should get 1. That means 1 byte, so 8 bits.
So if you wanted a 24-bit type, you would need 3 chars. For 36 bits you would need a 5-char array, with 4 bits of wasted padding at the end. This could easily be accounted for.
i.e.
unsigned char typeSize[3] = {0}; // should hold 24 bits
Now make a bit mask to access each position of typeSize.
const unsigned char one = 0b0000'0001;
const unsigned char two = 0b0000'0010;
const unsigned char three = 0b0000'0100;
const unsigned char four = 0b0000'1000;
const unsigned char five = 0b0001'0000;
const unsigned char six = 0b0010'0000;
const unsigned char seven = 0b0100'0000;
const unsigned char eight = 0b1000'0000;
Now you can use the bitwise OR to set bits to 1 where needed:
typeSize[1] |= four;
typeSize[0] |= (four | five);
To turn off bits, use the & operator with the inverted mask:
typeSize[0] &= ~four;
typeSize[2] &= ~(four | five);
You can read the position of each bit with the & operator.
typeSize[0] & four
Bear in mind, I don't have a compiler handy to try this out so hopefully this is a useful approach to your problem.
Good luck ;-)
You can use an array of unsigned long int and store and retrieve the needed bit chains with bitwise operations. This approach avoids space overhead.
Simplified example for unsigned byte array B[] and 12-bit variables V (represented as ushort):
Set V[0]:
B[0] = V & 0xFF; //low byte
B[1] = B[1] & 0xF0; // clear low nibble
B[1] = B[1] | (V >> 8); //fill low nibble of the second byte with the highest nibble of V
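Generalized from 12-bit values in a byte array to 36-bit values in 64-bit words, the same idea might look like the sketch below. `PackedArray` and its members are hypothetical names, not an existing library; the sketch assumes 1 <= Bits <= 63 and handles the case where a value straddles two words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-width packed array: element i occupies bits
// [i*Bits, (i+1)*Bits) of one long bit string stored in 64-bit words.
template <unsigned Bits>
class PackedArray {
    static_assert(Bits >= 1 && Bits <= 63, "sketch supports 1..63 bits");
    static constexpr std::uint64_t kMask = (std::uint64_t{1} << Bits) - 1;
    std::vector<std::uint64_t> words_;
    std::size_t size_;

public:
    explicit PackedArray(std::size_t n)
        : words_((n * Bits + 63) / 64), size_(n) {}  // zero-initialized words

    std::size_t size() const { return size_; }

    std::uint64_t get(std::size_t i) const {
        assert(i < size_);
        const std::size_t bit = i * Bits;
        const std::size_t word = bit / 64;
        const unsigned shift = static_cast<unsigned>(bit % 64);
        std::uint64_t v = words_[word] >> shift;
        if (shift + Bits > 64) {  // value continues in the next word
            v |= words_[word + 1] << (64 - shift);
        }
        return v & kMask;
    }

    void set(std::size_t i, std::uint64_t value) {
        assert(i < size_ && value <= kMask);
        const std::size_t bit = i * Bits;
        const std::size_t word = bit / 64;
        const unsigned shift = static_cast<unsigned>(bit % 64);
        words_[word] = (words_[word] & ~(kMask << shift)) | (value << shift);
        if (shift + Bits > 64) {
            const unsigned fits = 64 - shift;  // bits that fit in the first word
            words_[word + 1] =
                (words_[word + 1] & ~(kMask >> fits)) | (value >> fits);
        }
    }
};
```

Usage would be along the lines of `PackedArray<36> a(n); a.set(i, v); auto x = a.get(i);`. Inserting or deleting in the middle still means shifting everything after that position, which is the cost the question worries about.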
I have a 64-bit unsigned integer. I want to check the 6th bit of each byte and return a single byte representing those 6th bits.
The obvious, "brute force" solution is:
inline const unsigned char Get6thBits(unsigned long long num) {
    unsigned char byte(0);
    for (int i = 7; i >= 0; --i) {
        byte <<= 1;
        // 0x20ULL keeps the shift in 64 bits; a plain int 0x20 overflows for i >= 4
        byte |= bool((0x20ULL << 8 * i) & num);
    }
    return byte;
}
I could unroll the loop into a bunch of concatenated | statements to avoid the int allocation, but that's still pretty ugly.
Is there a faster, more clever way to do it? Maybe use a bitmask to get the 6th bits, 0x2020202020202020 and then somehow convert that to a byte?
If _pext_u64 is a possibility (this will work on Haswell and newer, it's very slow on Ryzen though), you could write this:
int extracted = _pext_u64(num, 0x2020202020202020);
This is a really literal way to implement it. pext takes a value (first argument) and a mask (second argument), at every position that the mask has a set bit it takes the corresponding bit from the value, and all bits are concatenated.
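For readers without BMI2 hardware, the semantics described above can be mirrored in portable code. This is only an illustrative sketch: the name `pextSoft` is made up here, and it is far slower than the real instruction.

```cpp
#include <cassert>
#include <cstdint>

// Portable reference for what pext computes (hypothetical helper): walk the
// mask from its lowest set bit upward and append the matching bit of value
// to the result.
std::uint64_t pextSoft(std::uint64_t value, std::uint64_t mask) {
    std::uint64_t result = 0;
    unsigned out = 0;  // next free bit position in result
    while (mask != 0) {
        const std::uint64_t lowest = mask & (~mask + 1);  // isolate lowest set bit
        if (value & lowest) {
            result |= std::uint64_t{1} << out;
        }
        ++out;
        mask &= mask - 1;  // clear that bit and continue
    }
    return result;
}
```

With the mask 0x2020202020202020 this yields one result bit per input byte, with the lowest byte ending up in bit 0.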
_mm_movemask_epi8 is more widely usable, you could use it like this:
__m128i n = _mm_set_epi64x(0, num);
int extracted = _mm_movemask_epi8(_mm_slli_epi64(n, 2));
pmovmskb takes the high bit of every byte in its input vector and concatenates them. The bits we want are not the high bit of every byte, so I move them up two positions with psllq (of course you could shift num directly). The _mm_set_epi64x is just some way to get num into a vector.
Don't forget to #include <intrin.h> (that is the MSVC header; GCC and Clang provide these intrinsics through <immintrin.h>), and none of this was tested.
Codegen seems reasonable enough
A weirder option is gathering the bits with a multiplication: (only slightly tested)
int extracted = (num & 0x2020202020202020) * 0x08102040810204 >> 56;
The idea here is that num & 0x2020202020202020 only has very few bits set, so we can arrange a product that never carries into bits that we need (or indeed at all). The multiplier is constructed to do this:
a0000000b0000000c0000000d0000000e0000000f0000000g0000000h0000000 +
0b0000000c0000000d0000000e0000000f0000000g0000000h00000000000000 +
00c0000000d0000000e0000000f0000000g0000000h000000000000000000000 etc..
Then the top byte will have all the bits "compacted" together. The lower bytes actually have something like that too, but they're missing bits that would have to come from "higher" (bits can only move to the left in a multiplication).
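To gain some confidence in the multiplier, the trick can be checked against a straightforward loop. A small sketch, with `sixthBitsLoop` and `sixthBitsMul` as hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// Reference: collect bit 5 of each byte; byte i contributes result bit i.
unsigned char sixthBitsLoop(std::uint64_t num) {
    unsigned char out = 0;
    for (int i = 0; i < 8; ++i) {
        out |= static_cast<unsigned char>(((num >> (8 * i + 5)) & 1u) << i);
    }
    return out;
}

// Multiplication-based gather: each term of the multiplier shifts one
// masked bit into the top byte, and no two partial products collide.
unsigned char sixthBitsMul(std::uint64_t num) {
    const std::uint64_t masked = num & 0x2020202020202020ULL;
    return static_cast<unsigned char>((masked * 0x08102040810204ULL) >> 56);
}
```

The multiplier has its set bits at positions 2, 9, 16, 23, 30, 37, 44 and 51, so the masked bit at position 5 + 8k is moved to position 56 + k.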
In a single nibble (0-F) I can store one number from 0 to 15. In one byte, I can store a single number from 0 to 255 (00 - FF).
Can I use a byte (00-FF) to store two different numbers each in the range 0-127 (00 - 7F)?
The answer to your question is NO. You can split a single byte into two numbers, but the sum of the bits in the two numbers must be <= 8. Since the range 0-127 requires 7 bits, the other number in the byte can only be 1 bit, i.e. 0-1.
For obvious cardinality reasons, you cannot store two small integers in the 0 ... 127 range in one byte of 0 ... 255 range. In other words the cartesian product [0;127]×[0;127] has 2^14 elements which is bigger than 2^8 (the cardinal of the [0;255] interval, for bytes)
(If you can afford losing precision - which you didn't tell - you could, e.g. by storing only the highest bits ...)
Perhaps your question is: could I store two small integers from [0;15] in a byte? Then of course you could:
#include <cassert>
#include <cstdint>
typedef unsigned unibble_t; // unsigned nibble in [0;15]
uint8_t make_from_two_nibbles(unibble_t l, unibble_t r) {
    assert(l<=15);
    assert(r<=15);
    return (l<<4) | r;
}
unibble_t left_nibble (uint8_t x) { return x >> 4; }
unibble_t right_nibble (uint8_t x) { return x & 0xf; }
But I don't think you should always do that. First, you might use bit fields in a struct. Then (and most importantly) dealing with nibbles that way might be less efficient and less readable than using bytes.
And updating a single nibble, e.g. with
void update_left_nibble (uint8_t*p, unibble_t l) {
assert (p);
assert (l<=15);
*p = ((l<<4) | ((*p) & 0xf));
}
is sometimes expensive (it involves a memory load and a memory store, so uses the CPU cache and cache coherence machinery), and most importantly is generally a non-atomic operation (what would happen if two different threads are calling simultaneously update_left_nibble on the same address p -i.e. with pointer aliasing- is undefined behavior).
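If several threads really must update nibbles of the same byte, one option is to make the read-modify-write atomic with a compare-and-swap loop. This is a sketch under that assumption, not a recommendation; `updateLeftNibbleAtomic` is a hypothetical name, and it requires the byte to be declared as `std::atomic<std::uint8_t>`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical thread-safe variant of update_left_nibble: retry until the
// byte was not changed by another thread between our load and our store.
void updateLeftNibbleAtomic(std::atomic<std::uint8_t>& byte, unsigned l) {
    assert(l <= 15);
    std::uint8_t old = byte.load(std::memory_order_relaxed);
    std::uint8_t desired;
    do {
        desired = static_cast<std::uint8_t>((l << 4) | (old & 0x0Fu));
        // On failure, compare_exchange_weak refreshes `old` with the current value.
    } while (!byte.compare_exchange_weak(old, desired));
}
```

The retry loop adds to the cost mentioned above, which is one more reason to pack nibbles only when the memory saving matters.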
As a rule of thumb, avoid packing more than one data item in a byte unless you are sure it is worthwhile (e.g. you have a billion of such data items).
One byte is not enough for two values in 0…127, because each of those values needs log2(128) = 7 bits, for a total of 14, but a byte is only 8 bits.
You can declare variables with bit-packed storage using the C and C++ bitfield syntax:
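struct packed_values {
    uint16_t first : 7;
    uint16_t second : 7;
    uint16_t third : 2;
};
In this example, sizeof(packed_values) is typically 2 because only 16 bits are used, despite having three fields. The exact layout is implementation-defined: if the fields are declared as uint8_t instead, common compilers do not let a field straddle a byte boundary, and the struct grows to 3 bytes.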
This is simpler than using bitwise arithmetic with << and & operators, but it's still not quite the same as ordinary variables: bit-fields have no addresses, so you can't have a pointer (or C++ reference) to one.
Can I use a byte to store two numbers in the range 0-127?
Of course you can:
uint8_t storeTwoNumbers(unsigned a, unsigned b) {
    return ((a >> 4) & 0x0f) | (b & 0xf0);
}
void retrieveTwoNumbers(uint8_t byte, unsigned *a, unsigned *b) {
    *b = byte & 0xf0;
    *a = (byte & 0x0f) << 4;
}
Numbers are still in range 0...127 (0...255, actually). You just lose some precision, similar to floating point types. Their values increment in steps of 16.
You can store two values in the range 0-15 in a single byte, but you should not (one variable = one value is a better design).
If you must, you can use bit masks and bit shifts to access the two values in your variable.
uint8_t var; /* range 0-255 */
data1 = (var & 0x0F); /* range 0-15 */
data2 = (var & 0xF0) >> 4; /* range 0-15 */
I have the question of the title, but if not, how could I get away with using only 4 bits to represent an integer?
EDIT: Really, my question is how. I am aware that there are 1-byte data structures in a language like C, but how could I use something like a char to store two integers?
In C or C++ you can use a struct to allocate the required number of bits to a variable as given below:
#include <stdio.h>
struct packed {
    unsigned char a:4, b:4;
};
int main() {
    struct packed p;
    p.a = 10;
    p.b = 20; /* does not fit in 4 bits */
    printf("p.a %d p.b %d size %zu\n", p.a, p.b, sizeof(struct packed));
    return 0;
}
The output is p.a 10 p.b 4 size 1, showing that p takes only 1 byte to store, and that numbers with more than 4 bits (larger than 15) get truncated, so 20 (0x14) becomes 4. This is simpler to use than the manual bitshifting and masking used in the other answer, but it is probably not any faster.
You can store two 4-bit numbers in one byte (call it b which is an unsigned char).
Using hex is easy to see that: in b=0xAE the two numbers are A and E.
Use a mask to isolate them:
a = (b & 0xF0) >> 4
and
e = b & 0x0F
You can easily define functions to set/get both numbers in the proper portion of the byte.
Note: if the 4-bit numbers need to have a sign, things can become a tad more complicated since the sign must be extended correctly when packing/unpacking.
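For the signed case, a possible sketch is shown below. It assumes two's complement 4-bit values in the range [-8, 7]; the function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Pack two signed values in [-8, 7] as 4-bit two's complement nibbles.
std::uint8_t packSignedNibbles(int hi, int lo) {
    assert(hi >= -8 && hi <= 7 && lo >= -8 && lo <= 7);
    return static_cast<std::uint8_t>(((hi & 0x0F) << 4) | (lo & 0x0F));
}

// Sign-extend a 4-bit two's complement value held in the low nibble:
// flipping the sign bit and subtracting 8 maps 0..15 onto 0..7, -8..-1.
int signExtendNibble(unsigned nibble) {
    nibble &= 0x0Fu;
    return static_cast<int>(nibble ^ 0x08u) - 8;
}

int highSigned(std::uint8_t b) { return signExtendNibble(b >> 4); }
int lowSigned(std::uint8_t b) { return signExtendNibble(b & 0x0F); }
```

For example, packing -3 and 7 gives the byte 0xD7, and unpacking returns -3 and 7 again.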
I am just wondering how this works, and to be clear as to whether it actually does work.
If you have a 32-bit int and an 8-bit int array of size 4, can you assign the 32-bit int to the 0th index of the 8-bit int array and effectively have the same value, bitwise?
Also if you then wanted to convert it back I presume you could fill up the 32 bit int with the array and appropriate bit shifts.
int32 bigVbl = 20;
int8 smallVbl[4];
smallVbl[0] = bigVbl;
I expect the smallVbl array to hold the entirety of bigVbl.
Assigning to a narrower integer type truncates the most significant bits and retains the least significant ones. Other arithmetic operations truncate the result too, if it overflows. That is what lets you extend the maths (and many other operations) to big integers easily. Without truncation, how could you cram 32 bits into 8 bits?
To copy the 32-bit int into an array of four 8-bit chars, the easiest way is to copy the whole number into the array (for example with memcpy). Another way is to assign element by element:
smallVbl[0] = bigVbl & 0xff; // the & 0xff is not really needed
smallVbl[1] = (bigVbl >> 8) & 0xff;
smallVbl[2] = (bigVbl >> 16) & 0xff;
smallVbl[3] = (bigVbl >> 24) & 0xff;
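Going back from the four bytes to the 32-bit value uses the same shifts in reverse. A minimal sketch, using unsigned fixed-width types instead of the question's int32/int8 so that the shifts are well defined; `splitBytes` and `joinBytes` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Split a 32-bit value into four bytes, least significant byte first.
void splitBytes(std::uint32_t value, std::uint8_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}

// Rebuild the value; each byte is widened to 32 bits before it is shifted.
std::uint32_t joinBytes(const std::uint8_t in[4]) {
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        value |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    }
    return value;
}
```

Because the byte order is fixed by the shifts, the result does not depend on the endianness of the machine.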
There are a couple of ways of doing it, the simplest probably being to use std::copy_n to copy the integer into the array:
std::copy_n(reinterpret_cast<int8*>(&bigVbl), // Source to copy from
std::min(sizeof(smallVbl), sizeof(bigVbl)), // Number of bytes to copy
smallVbl); // Destination to copy to
To copy in the opposite direction, swap the roles of source and destination in the above call (the reinterpret_cast then goes on the destination).
My machine is 64 bit. My code as below:
unsigned long long periodpackcount=*(mBuffer+offset)<<32|*(mBuffer+offset+1)<<24|* (mBuffer+offset+2)<<16|*(mBuffer+offset+3)<<8|*(mBuffer+offset+4);
mBuffer is unsigned char*. I want to get 5 bytes data and transform the data to host byte-order.
How can I avoid this warning?
Sometimes it's best to break apart into a few lines in order to avoid issues. You have a 5 byte integer you want to read.
// Create the number to read into.
uint64_t number = 0; // uint64_t is in <cstdint>
char *ptr = (char *)&number;
// Copy from the buffer. Plus 3 to leave three leading zero bytes.
memcpy(ptr + 3, mBuffer + offset, 5);
// Reverse the byte order.
std::reverse(ptr, ptr + 8); // Can bit shift here instead
This is probably not the best byte swap ever (bit shifting is faster), and my logic might be off for the offsetting, but something along those lines should work.
The other thing you may want to do is cast each byte before shifting. As written, you leave it to the compiler to pick the type of the shifted operand: *(mBuffer + offset) is an unsigned char, which is promoted to int, so shifting it by 32 is too wide for a 32-bit int. Casting to a larger type first, e.g. static_cast<uint64_t>(*(mBuffer + offset)) << 32, avoids that.
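Putting the cast-before-shift advice into a small helper could look like the sketch below. It assumes the buffer holds the value most significant byte first (network order), as the shifts in the question suggest; `readBigEndian` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read `count` bytes (at most 8) stored most significant byte first.
// Every byte is widened to 64 bits before it takes part in a shift.
std::uint64_t readBigEndian(const unsigned char* p, std::size_t count) {
    assert(p != nullptr && count <= 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        value = (value << 8) | static_cast<std::uint64_t>(p[i]);
    }
    return value;
}
```

The call from the question would then become `unsigned long long periodpackcount = readBigEndian(mBuffer + offset, 5);`, and the result is the same on little-endian and big-endian hosts.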