Best way to convert 8 boolean to one byte? - c++

I want to pack 8 booleans into one byte and then save it to a file (this has to be done for a very large amount of data). I've used the following code, but I'm not sure it is the best one (in terms of speed and space):
int bits[]={1,0,0,0,0,1,1,1};
char a='\0';
for (int i=0;i<8;i++){
    a=a<<1;
    a+=bits[i];
}
//and then save "a"
Can anyone give me better (faster) code?

If you don't mind using SSE intrinsics, then _mm_movemask_epi8 is an excellent fit. It uses 16 bytes, but you can just set the others to zero.
For example (not tested)
__m128i values = _mm_loadl_epi64((__m128i*)array);
__m128i order = _mm_set_epi8(0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0, 1, 2, 3, 4, 5, 6, 7);
values = _mm_shuffle_epi8(values, order);
int result = _mm_movemask_epi8(_mm_slli_epi32(values, 7));
This assumes the array is an array of chars, and that SSSE3 is available (_mm_shuffle_epi8 needs it). If you can't make that happen, it takes some more loads and packs and it becomes a bit annoying.

“can anyone give me a better code (more speed)?”
You should measure. Most of the impact on the speed of serializing to a file is I/O speed. What you do with the bits will likely have an unmeasurably small impact, but if it has any impact, then that is likely mostly influenced by your original representation of the sequence of booleans.
Now regarding the given code
int bits[]={1,0,0,0,0,1,1,1};
char a='\0';
for (int i=0;i<8;i++){
    a=a<<1;
    a+=bits[i];
}
//and then save "a"
Use unsigned char as byte type, just on principle.
Use bitlevel OR, the | operator, again just on principle.
Use prefix ++, yes, also that just on principle.
The “on principle” for the first point is because in practice your code will not run on any machine with sign-and-magnitude or one's complement representation of signed integers, where char is signed. But I think it's generally a good idea to express in the code exactly what one intends doing, instead of rewriting it as something slightly different. And the intention here is to deal with bits, an unsigned byte.
The “on principle” for the bitlevel OR is because for this particular case there's no practical difference between bitlevel OR and addition. But in general it's a good idea to write in code what one means to express. And then it's no good to write a bitlevel OR as an addition: it might even trip you up, bite you in the a**, in some other context.
The “on principle” for the prefix ++ is because in practice the compiler will optimize prefix and postfix ++ for a basic type, when the expression result isn't used, to the very same machine code. But again it's generally better to write what one intends to express. Asking for an original value (the postfix ++) is just misleading a reader of the code when you're not ever using that original value – and as with the bitlevel OR expressed as addition, the pure increment expressed as postfix ++ might trip you up, bite you in the a**, in some other context, e.g. with iterators.
The general approach of explicitly coding up shifting and ORing appears to me to be fine because std::bitset does not support initialization from a sequence of booleans (only initialization from a text string), so it doesn't save you any work. But generally it's a good idea to check the standard library, whether it supports whatever one wants to do. It might even happen that someone else chimes in here with some standard library based approach that I didn't think of! ;-)
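The three points above can be put together in a minimal sketch. The helper name pack8 is made up for illustration, and it assumes the first array element should end up in the most significant bit:

```cpp
#include <cstddef>

// pack8 is a hypothetical helper name, not from the original post.
// Packs eight truth values into one byte; element 0 ends up in the
// most significant bit, element 7 in the least significant bit.
inline unsigned char pack8(const int (&bits)[8])
{
    unsigned char byte = 0;                       // unsigned byte type
    for (std::size_t i = 0; i < 8; ++i) {         // prefix ++
        const unsigned bit = (bits[i] != 0) ? 1u : 0u;
        byte = static_cast<unsigned char>((byte << 1) | bit);  // bitlevel OR
    }
    return byte;
}
```

With the array from the question, {1,0,0,0,0,1,1,1}, this should give 0x87.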

Replace the += operator by |=, which is the bit-wise operation (and actually what you want to do here).
Use unsigned char for your truth values, if possible.
Unless you want to hand-unroll your loops and/or use SIMD intrinsics, that would be the most compiler-optimizable solution, I guess.
There's another trick: structs can have bit-fields, and you can put them in a union to (mis)use them as ints.
By the way: your code is buggy. You shift first, then write; you use addition, but on a signed char, which will definitely go wrong for the 7th and 8th bits (given that you erroneously shift too early; if you did that properly, only the 8th bit would cause trouble).


Can I use data types like bool to compress data while improving readability?

My official question will be: "Is there a clean way to use data types to "encode and compress" data rather than using messy bit masking." The hopes would be to save space in the case of compressing, and I would like to use native data types, structures, and arrays in order to improve readability over bit masking. I am proficient in bit masking from my assembly background but I am learning C++ and OOP. We can store so much information in a 32 bit register by using individual bits and I feel that I am trying to get back to that low level environment while having the readability of C++ code.
I am attempting to save some space because I am working with huge resource requirements. I am still learning more about how c++ treats the bool data type. I realize that memory is stored in byte chunks and not individual bits. I believe that a bool usually uses one byte and is masked somehow. In my head I could use 8 bool values in one byte.
If I malloc an array of 2 bool elements in C++, does it allocate two bytes or just one?
Example: We will use DNA as an example, since each base can be encoded in two bits to represent A, C, G and T. Suppose I make a struct with two bools called DNA_Base, and then make an array of those (7 in the code below, one per base of GATTACA).
struct DNA_Base{ bool Bit_1; bool Bit_2; };
DNA_Base DNA_Sequence[7] = {false};
cout << sizeof(DNA_Base)<<sizeof(DNA_Sequence)<<endl;
//Yields a 2 and a 14.
//I would like this to say 1 and 2.
In my example I would also show the case where the DNA sequence can be 20 bases long which would require 40 bits to encode. GATTACA could only take up a maximum of 2 bytes? I suppose an alternative question would have been "How to make C++ do the bit masking for me in a more readable way" or should I just make my own data type and classes and implement the bit masking using classes and operator overloading.
Not fully what you want, but you can use bit-fields:
struct DNA_Base
{
    unsigned char Bit_1 : 1;
    unsigned char Bit_2 : 1;
};
DNA_Base DNA_Sequence[7];
So sizeof(DNA_Base) == 1 and sizeof(DNA_Sequence) == 7.
So you have to pack several bases into one struct to avoid losing space to padding, something like:
struct DNA_Base_4
{
    unsigned char base1 : 2; // may have value 0 1 2 or 3
    unsigned char base2 : 2;
    unsigned char base3 : 2;
    unsigned char base4 : 2;
};
So sizeof(DNA_Base_4) == 1.
std::bitset is another alternative, but you have to do the interpretation job yourself.
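To give an idea of what "doing the interpretation yourself" could look like with std::bitset, here is a rough sketch. The mapping A=0, C=1, G=2, T=3, the fixed length of 7 bases and the names encode/decode are all assumptions made for the example:

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Assumed for this sketch: 7 bases ("GATTACA"), two bits per base,
// with A=0, C=1, G=2, T=3.
constexpr std::size_t kBases = 7;

inline unsigned encode_base(char c)
{
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;   // 'T'; anything else is treated as T here
    }
}

// Stores base i in bits 2*i (low bit) and 2*i+1 (high bit).
inline std::bitset<2 * kBases> encode(const std::string& seq)
{
    std::bitset<2 * kBases> bits;
    for (std::size_t i = 0; i < kBases && i < seq.size(); ++i) {
        const unsigned code = encode_base(seq[i]);
        bits[2 * i]     = (code & 1u) != 0;
        bits[2 * i + 1] = (code & 2u) != 0;
    }
    return bits;
}

// Reads the two bits of every base back and maps them to a letter.
inline std::string decode(const std::bitset<2 * kBases>& bits)
{
    std::string seq;
    for (std::size_t i = 0; i < kBases; ++i) {
        const unsigned code = (bits[2 * i] ? 1u : 0u) | (bits[2 * i + 1] ? 2u : 0u);
        seq += "ACGT"[code];
    }
    return seq;
}
```

Keep in mind that common implementations round the storage of a std::bitset up to whole machine words, so a 14-bit bitset will usually be bigger than 2 bytes.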
An array of bools will be N-elements x sizeof(bool).
If your goal is to save space in registers, don't bother: it is actually more efficient to use the processor's word size than a single byte, and the compiler will widen a bool to a full 32-bit or 64-bit native word when it works with it in a register anyway.
Now, if you like to save room on disk, or in RAM, due to needing to store LOTS of bools, go ahead, but it isn't going to save room in all cases unless you actually pack the structure, and on some architectures packing can also have performance impact because the CPU will have to perform unaligned or byte-by-byte access.
A bitmask (or bitfield), on the other hand, is performant and efficient and as dense as possible, and uses a single bitwise operation. I would look at one of the abstract data types that provide bit fields.
The standard library has bitset which can be as long as you want.
Boost also has something I'm sure.
Unless you are on a 4 bit machine, the final result will be using bit arithmetic. Whether you do it explicitly, have the compiler do it via bit fields, or use a bit container, there will be bit manipulation.
I suggest the following:
Use existing compression libraries.
Use the method that is most readable or understood by people other than yourself.
Use the method that is most productive (talking about development).
Use the method with which you will inject the fewest defects.
Edit 1:
Write each method up as a separate function.
Tell the compiler to generate the assembly language for each function.
Compare the assembly language of each function to each other.
My belief is that they will be very similar, enough that wasting time discussing them is not worthwhile.
You can't operate on bits directly, but you can treat the smallest unit available to you as a multiple data store, and define:
enum class DNAx4 : uint8_t {
    AAAA = 0x00, AAAC = 0x01, AAAG = 0x02, AAAT = 0x03,
    // .... And the rest of them
    TTTA = 0xFC, TTTC = 0xFD, TTTG = 0xFE, TTTT = 0xFF
};
I'd actually go further, and create a structure DNAx16 or DNAx32 to efficiently use the native word size on your machine.
You can then define functions on the data type, which will have to use the underlying bit representation, but at least it allows you to encapsulate this and build higher level operations from these primitives.
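As a rough illustration of that encapsulation, a DNAx16 type might look like the sketch below. The layout (base 0 in the lowest two bits), the encoding A=0, C=1, G=2, T=3 and the member names get/set are assumptions for the example, not something fixed by the enum above:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical DNAx16: sixteen 2-bit bases in one 32-bit word.
// Assumed encoding: A=0, C=1, G=2, T=3; base 0 lives in bits 0-1.
struct DNAx16 {
    std::uint32_t word = 0;

    // Returns the 2-bit code of the base at position index (0..15).
    unsigned get(unsigned index) const
    {
        assert(index < 16);
        return (word >> (2 * index)) & 0x3u;
    }

    // Overwrites the base at position index with code (0..3).
    void set(unsigned index, unsigned code)
    {
        assert(index < 16 && code < 4);
        const std::uint32_t mask = static_cast<std::uint32_t>(0x3u) << (2 * index);
        word = (word & ~mask) | (static_cast<std::uint32_t>(code) << (2 * index));
    }
};
```

The bit masking is still there, but it sits in two small functions, and the calling code only deals with positions and base codes.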

Explain Bit Test macro in C++

I'm trying to figure out how does this code work, but I can't manage to get a single answer.
#define testbit(x, y) ( ( ((const char*) & (x))[(y)>>3] & 0x80 >> ((y)&0x07)) >> (7-((y)&0x07) ) )
I'm new at pointers, so if you can figure out a way to explain this in simplified english, I would really appreciate it.
It belongs to a segment of code for an X-Plane Plug-in found at line=19
The macro tests the value of the y-th bit in x. You can't directly address bits, so the code starts by treating x as an array of bytes (the const char* cast).
It then looks up the byte where the bit lives. There are 8 bits in a byte, so it divides by 8. Chasing performance, instead of simply dividing by 8, the code uses the binary trick of shifting right 3 places. In general, for unsigned x and y, x >> y = x/2^y, and x << y = x*2^y.
At this point you need to test the bit within the byte, so you get the remainder of y/8. Yet another bit trick, using y & 7 instead of the clearer y % 8.
With this information you can make a mask, a single on bit, 0x80 and shift it into position to test the y%8-th bit. The mask is ANDed against the byte and a non-zero result here means the bit was set to 1, otherwise 0.
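If it helps, the same steps can be written out as an ordinary function, one step per line. This is only a sketch: the name test_bit is made up, it reads the bytes as unsigned char, and it turns the masked value into 0 or 1 with a comparison rather than with the macro's final shift:

```cpp
#include <cstddef>

// test_bit is a hypothetical name; it mirrors the macro's steps.
// Bit 0 is the most significant bit of the first byte, as in the macro.
inline int test_bit(const void* object, std::size_t bit_index)
{
    // Treat the object as an array of bytes.
    const unsigned char* bytes = static_cast<const unsigned char*>(object);
    // Find the byte the bit lives in: bit_index / 8.
    const unsigned char byte = bytes[bit_index >> 3];
    // Position of the bit inside that byte: bit_index % 8.
    const unsigned offset = static_cast<unsigned>(bit_index & 0x07);
    // Single-bit mask, counted from the left-most bit of the byte.
    const unsigned mask = 0x80u >> offset;
    // Non-zero after masking means the bit was set.
    return (byte & mask) != 0 ? 1 : 0;
}
```

For a two-byte array {0x80, 0x01}, bit 0 and bit 15 are set and every other bit is clear.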
Completing @RhythmicFistman's answer
@RhythmicFistman's answer is missing one small part: the last step in the shifts.
The >> (7-((y)&0x07)) step ensures that you only ever get a result of 1 or 0. With this code it is safe to do comparisons like:
if (testbit(variable, 6) == 1) {
    // do something
}
Without that step, testbit would return a bit mask in which the 6th bit would be set to 1 or 0 and all the other bits are always set to 0. That is the intent, but it is not implemented in what is considered a portable way; see Warning 3 below.
Possible issues with using this code
Now to add something to the other answers. The other answers have not pointed out some keywords that should be mentioned here and they are strict aliasing and shift arithmetic right. My elaboration will come in the form of warnings below.
Warning 1: Endianness
This code assumes that you are using a big endian architecture or only wish to get the correct bit from an array of chars.
The reason is that if you convert an int into an array of chars (bytes) you will get different results on a big endian machine vs a little endian machine.
Warning 2: Strict Aliasing
The macro makes use of a cast (const char*) &(x) which is designed to change the type, a.k.a. alias, of (x) so that it is easier to get to the correct bits.
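This is dangerous in general, and the reason why is explained beautifully in this SO answer. The short version is that if you compile such code with optimisations strange things can happen. (Access through a char pointer, as here, is the one case the aliasing rules explicitly allow, so the danger lies in reusing the pattern with other pointer types.)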
The wikipedia pages on Aliasing and Pointer Aliasing are also useful and should be read.
Warning 3: Shift Arithmetic Right
In addition to this there could be a potential issue with the way this code uses the right shift operator >>. This operator has two different behaviors depending on whether the variable it is operating on is signed or unsigned. So long as you never use negative numbers you will be safe but this code will not protect you against that mistake. I suspect though, that you're less likely to make such a mistake anyway so it should be ok to use it.
Also worth mentioning: the macro uses plain char, which may be signed, and shifts it right. Though this works, I would prefer unsigned char, which would improve portability because it will not risk generating an arithmetic shift right when char and int are the same width (which is almost never the case in practice, granted). This works because char is promoted to int for the shift; see this SO answer for an explanation.
What you see is a macro that does the following, in order:
Shifts y right by 3 (that is, divides it by 8).
Takes the address of x and picks the char at that position (treating x as an array of chars).
ANDs the selected char with 0x80 shifted right by (y & 0x07).
Shifts the previous result right by 7 - (y & 0x07).
Does this help you? I don't think so!
That is because this macro is clearly improper, and kind of tricky.
Bit masks, binary operations, binary shifts...
If you can explain more precisely what you want to understand here, maybe I can be more helpful.

Why use the '+' operator when '|' is perfectly good?

This is more of a philosophical question, but I've seen this a bunch of times in codebases here and there and do not really understand how this programming method came to be.
Suppose you have to set bits 2 and 3 to some value x without changing the other values in the uint. Doing so is pretty trivial and a common task, and I would be inclined to do it this way:
uint8_t someval = 0xFF; //some random previous value
uint8_t x = 0x2; //some random value to assign.
someval = (someval & ~0xC) | (x << 2); //Set the value to 0x2 for bits 2-3
I've seen code that instead of using '|' uses '+':
uint8_t someval = 0xFF; //some random previous value
uint8_t x = 0x2; //some random value to assign.
someval = (someval & ~0xC) + (x << 2); //Set the value to 0x2 for bits 2-3
Are they equivalent?
Is one better than the other?
Only if your hardware doesn't have a bitwise OR instruction, but I have never ever ever seen a processor that didn't have a bitwise OR (even small PIC10 processors have an OR instruction).
So why would some programmers be inclined to use '+' instead of '|'? Am I missing some really obvious, uber powerful optimization here?
If you want to perform bitwise operations, use bitwise operators.
If you want to perform arithmetic operations, use arithmetic operators.
It's true that for some values some arithmetic operations can be implemented as simple bitwise operations, but that's essentially an implementation detail to which you should never expose your readers. First and foremost the logic of the code should be clear and if possible self-explanatory. The compiler will choose appropriate low-level operations for you to implement your desire.
That's being philanthropic.
Are they equivalent?
Yes, as long as the bitfield being written to is clear beforehand. Otherwise, they'll go wrong in slightly different ways.
Is one better than the other?
No, although some would say that bitwise operations express the intent more clearly.
So why would some programmers be inclined to use '+' instead of '|'?
Because they're equivalent, and neither is particularly better than the other.
Am I missing some really obvious, uber powerful optimization here?
So why would some programmers be inclined to use '+' instead of '|'?
+ could bring out logical bugs faster. a | a would appear to work, whereas a simple a + a definitely wouldn't (of course, depends on the logic, but the + version is more error-prone).
Of course you should stick to the standard way of doing things (use bitwise operations when you want a bitwise operation, and arithmetic operations when you want to do math).
It's just a question of style. Any modern CPU will complete both operations in the same number of cycles (typically 1). Personally I prefer | in these cases since it more explicitly states to the code reader that you're doing bit twiddling instead of arithmetic.
If you have a bug in your code, then using + could lead to strange behavior, whereas using | would tend to mask the bug. For example, if you accidentally include the same bit twice, ORing it again is a no-op, but adding it will clear the bit and carry up into the next bit (and possibly farther, if more bits are set). So that would usually lead to fail-fast behavior instead of failure-masking behavior, which is generally preferable.
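A small sketch of both situations, using the masks from the question. The function names are invented for the example, and x is additionally masked to two bits so that it cannot spill outside the field:

```cpp
#include <cstdint>

// Writes the 2-bit value x into bits 2-3 of value, once with | ...
inline std::uint8_t set_field_or(std::uint8_t value, std::uint8_t x)
{
    return static_cast<std::uint8_t>((value & ~0xC) | ((x & 0x3) << 2));
}

// ... and once with +. The field is cleared first, so no carry can occur
// and both versions give the same result.
inline std::uint8_t set_field_add(std::uint8_t value, std::uint8_t x)
{
    return static_cast<std::uint8_t>((value & ~0xC) + ((x & 0x3) << 2));
}

// The buggy case: the same flag is accidentally applied twice.
// OR leaves the flag as it was; addition carries into the next bit.
inline std::uint8_t twice_or(std::uint8_t flag)
{
    return static_cast<std::uint8_t>(flag | flag);
}

inline std::uint8_t twice_add(std::uint8_t flag)
{
    return static_cast<std::uint8_t>(flag + flag);
}
```

With someval = 0xFF and x = 0x2, both set_field versions give 0xFB. With the flag 0x04 applied twice, the OR version still gives 0x04 while the addition gives 0x08, which is the kind of visible breakage described above.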

Do bit operations cause programs to run slower?

I'm dealing with a problem which needs to work with a lot of data. Currently its values are represented as an unsigned int. I know that real values do not exceed a limit of 1000.
I can use unsigned short to store it. An upside to this is that it'll use less storage space to store the value. Will performance suffer?
If I decided to store data as short but all the calling functions use int, it's recognized that I need to convert between these datatypes when storing or extracting values. Will performance suffer? Will the loss in performance be dramatic?
If I decided to not use short but just 10 bits packed into an array of unsigned int. What will happen in this case comparing with previous ones?
This all depends on the architecture. Bit-fields are generally slower, but if you are able to significantly cut down memory usage with them, you can even gain in performance due to better CPU caching and similar things. Likewise with short (though it is not dramatic in any case).
The best way is to make your source code able to switch representation easily (at compilation time, of course). Then you will be able to test and profile different implementations in your specific circumstances just by, say, changing one #define.
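One possible shape for such a switch is sketched below. The macro VALUE_REPR_SHORT, the alias stored_t and the class ValueStore are all hypothetical names chosen for the example:

```cpp
#include <cstddef>
#include <vector>

// VALUE_REPR_SHORT is a hypothetical switch: define it (for example with
// -DVALUE_REPR_SHORT) to store values as unsigned short instead of unsigned int.
#ifdef VALUE_REPR_SHORT
using stored_t = unsigned short;
#else
using stored_t = unsigned int;
#endif

// Callers only see unsigned int; the stored representation is hidden here.
class ValueStore {
public:
    explicit ValueStore(std::size_t count) : data_(count) {}

    unsigned int get(std::size_t i) const { return data_[i]; }
    void set(std::size_t i, unsigned int v) { data_[i] = static_cast<stored_t>(v); }

    // Storage used by the elements, handy when comparing representations.
    std::size_t bytes() const { return data_.size() * sizeof(stored_t); }

private:
    std::vector<stored_t> data_;
};
```

Because the rest of the program only talks to get and set, a bit-packed variant could later be dropped in behind the same interface and profiled the same way.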
Also, don't forget the rule about premature optimization. Make it work first. Only if it turns out to be too slow should you try to speed it up.
I can use unsigned short to store it.
Yes, you can use unsigned short (assuming (sizeof(unsigned short) * CHAR_BIT) >= 10).
An upside to this is that it'll use less storage space to store the value.
Less than what? Less than int? That depends on what sizeof(int) is on your system.
Will performance suffer?
Depends. The type int is supposed to be the most efficient integer type for your system so potentially using short may affect your performance. Whether it does will depend on the system. Time it and find out.
If I decided to store data as short but all the calling functions use int, it's recognized that I need to convert between these datatypes when storing or extracting values.
Yes. But the compiler will do the conversion automatically. One thing you need to watch though is conversion between signed and unsigned types. If the value does not fit the exact result may be implementation defined.
Will performance suffer?
Maybe. If sizeof(unsigned int) == sizeof(unsigned short), then probably not. Time it and see.
Will the loss in performance be dramatic?
Time it and see.
If I decided to not use short but just 10 bits packed into an array of unsigned int. What will happen in this case comparing with previous ones?
Time it and see.
A good compromise for you is probably packing three values into a 32 bit int (with two bits unused). Untangling 10 bits from a bit array is a lot more expensive, and doesn't save much space. You can either use bit fields, or do it by hand yourself:
(i&0x3FF) // Get i[0]
(i>>10)&0x3FF // Get i[1]
(i>>20)&0x3FF // Get i[2]
i = (i&0x3FFFFC00) | (j&0x3FF) // Set i[0] to j
i = (i&0x3FF003FF) | ((j&0x3FF)<<10) // Set i[1] to j
i = (i&0xFFFFF) | ((j&0x3FF)<<20) // Set i[2] to j
You can see here how much extra expense it is: a bit operation and 2/3 of a shift (on average) for get, and three bit operations and 2/3 of a shift (on average) to set. Probably not too bad, especially if you're mostly getting the values not setting them.
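The same get and set operations can be wrapped into two small helpers that take the slot number as a parameter. This is just a sketch with made-up names (get10, set10); unlike the hand-written masks above, it leaves the two unused top bits untouched instead of clearing them:

```cpp
#include <cassert>
#include <cstdint>

// Three 10-bit values in one 32-bit word (bits 30-31 unused).
// get10 and set10 are hypothetical helper names.
inline std::uint32_t get10(std::uint32_t word, unsigned slot)
{
    assert(slot < 3);
    return (word >> (10 * slot)) & 0x3FFu;
}

// Returns a copy of word with the given slot replaced by value (low 10 bits).
inline std::uint32_t set10(std::uint32_t word, unsigned slot, std::uint32_t value)
{
    assert(slot < 3);
    const std::uint32_t mask = static_cast<std::uint32_t>(0x3FFu) << (10 * slot);
    return (word & ~mask) | ((value & 0x3FFu) << (10 * slot));
}
```

With a variable slot the shift is always executed, so the "2/3 of a shift" average only applies when the slot is a compile-time constant and slot 0 needs no shift at all.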

Is there any advantage to using '<< 1' instead of '* 2'?

I've seen this a couple of times, but it seems to me that using the bitwise shift left hinders readability. Why is it used? Is it faster than just multiplying by 2?
You should use * when you are multiplying, and << when you are bit shifting. They are mathematically equivalent, but have different semantic meanings. If you are building a flag field, for example, use bit shifting. If you are calculating a total, use multiplication.
It is faster with old compilers that don't optimize * 2 into a left shift instruction. That optimization is really easy to perform, and any decent compiler already does it.
If it affects readability, then don't use it. Always write your code in the most clear and concise fashion first, then if you have speed problems go back and profile and do hand optimizations.
It's used when you're concerned with the individual bits of the data you're working with. For example, if you want to set the upper byte of a word to 0x9A, you would not write
n |= 0x9A * 256
You'd write:
n |= 0x9A << 8
This makes it clearer that you're working with bits, rather than the data they represent.
For some architectures, bit shifting is faster than multiplying. However, any compiler worth its salt will optimize *2 (or any multiplication by a power of 2) to a left bit shift (when a bit shift would be faster).
For readability of values used as bitfields:
enum Flags { UP = (1<<0),
             DOWN = (1<<1),
             STRANGE = (1<<2),
             CHARM = (1<<3),
};
which I think is preferable to either '=1,...,=2,...=4' or '=1,...=2, =2*2,...=2*2*2' especially if you have 8+ flags.
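For completeness, here is roughly how such flags are combined and tested. The flag names follow the enum above; the fixed unsigned underlying type and the helper has_flag are additions made for this sketch:

```cpp
// Flag names follow the enum in the answer; the unsigned underlying type
// and has_flag are assumptions added for this sketch.
enum Flags : unsigned {
    UP      = 1u << 0,
    DOWN    = 1u << 1,
    STRANGE = 1u << 2,
    CHARM   = 1u << 3
};

// True if the given flag is present in the combined flag set.
inline bool has_flag(unsigned flag_set, Flags flag)
{
    return (flag_set & flag) != 0;
}
```

A set such as UP | CHARM then reads as "bit 0 and bit 3", which is harder to see from the decimal value 9.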
If you are using an old C compiler, it is preferable to use the bitwise shift. For readability you can comment your code, though.