Efficient bit operations - C++

In C++ I want to encode the bits of 3 unsigned variables into one. More precisely, when the three variables are:
A: a3 a2 a1 a0
B: b3 b2 b1 b0
C: c3 c2 c1 c0
then the output variable shall contain such triples:
D: a3 b3 c3 a2 b2 c2 a1 b1 c1 a0 b0 c0
Let's assume that the output variable is large enough for all used bits. I have come up with
unsigned long long result(0);
unsigned a, b, c; // Some numbers to be encoded
for (int level = 0; level < numLevels; ++level)
{
    int q(1 << level);                  // Search bit q: 1 << level
    int baseShift((3 * level) - level); // 0, 2, 4, 6
    result |= ((a & q) << (baseShift + 2)) | ((b & q) << (baseShift + 1)) | ((c & q) << baseShift);
}
...and it works sufficiently. But I wonder if there is a solution that does not require a loop that iterates over all bits separately.

Define a table mapping all or part of your bits to where they end up. Shift values appropriately.
unsigned long long encoder(unsigned a, unsigned b, unsigned c) {
    static unsigned const encoding[16] = {
        0b0000000000,
        0b0000000001,
        0b0000001000,
        0b0000001001,
        0b0001000000,
        0b0001000001,
        0b0001001000,
        0b0001001001,
        0b1000000000,
        0b1000000001,
        0b1000001000,
        0b1000001001,
        0b1001000000,
        0b1001000001,
        0b1001001000,
        0b1001001001,
    };
    unsigned long long result(0);
    int shift = 0;
    do {
        // Widen to unsigned long long before applying the shift; otherwise bits are
        // lost (and shifts of 32 or more are undefined) once more than 8 input bits
        // have been processed.
        result += static_cast<unsigned long long>(
                      (encoding[a & 0xF] << 2) | (encoding[b & 0xF] << 1) | encoding[c & 0xF]) << shift;
        shift += 12;
        a >>= 4;
        b >>= 4;
        c >>= 4;
    } while (a || b || c);
    return result;
}
encoding defines a table that maps 4 bits to their encoded locations. It is used directly for c, and shifted left by 1 or 2 bits for b and a. If you have more than 4 bits to process, the next 4 bits in the source values are offset 12 bits further to the left. Keep doing this until all nonzero bits have been processed.
This could use a while loop instead of a do/while, but checking for zero before starting is useless unless most of the encodings are of all-zero values.
If you frequently use more than 4 bits, the encoding table can be expanded and appropriate changes made to the loop to process more than 4 bits at a time.
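As a quick sanity check of the encoder above (the input values below are my own, not taken from the question), the bits of the three 4-bit inputs should interleave into a3 b3 c3 ... a0 b0 c0:
#include <cassert>

int main() {
    // a = 1010, b = 0110, c = 0011 interleave to 100 010 111 001.
    assert(encoder(0b1010, 0b0110, 0b0011) == 0b100010111001ULL);
    return 0;
}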

Related

Concatenate Bits from 3 characters, taken from different locations in the bitset

I am trying to concatenate the bits of 3 characters a, b and c into a bitset of 16 bits. The constraints are the following:
Concatenate the last 2 bits of a into newVal1
Concatenate the 8 bits of b into newVal1
Concatenate the first 2 bits of c into newVal1
On paper I get 1111111111110000, which matches the result. But I am not sure about the way I am concatenating the bits: first I shift character a left by 14, then I shift character b left by 6, and finally, since there is no space left for character c, I shift it right by 2. Is there a better way to do it? It's already confusing for me.
#include <iostream>
#include <bitset>
#include <cstdint> // for uint16_t

int main() {
    int a = 0b11111111 & 0b00000011;
    int b = 0b11111111;
    int c = 0b11111111 & 0b11000000;
    uint16_t newVal1 = (a << 14) + (b << 6) + (c >> 2);
    std::cout << std::bitset<16>(newVal1).to_string() << std::endl;
    return 0;
}
First of all you need to consider the signed and unsigned integer problem. With signed integers you can get unexpected sign extensions, adding all ones at the top. And possible overflow will lead to undefined behavior.
So the first thing I would do is to use all unsigned integer values.
Then to make it clear and simple, my suggestion is that you do all the shifting on newVal1 instead, and just do bitwise OR into it:
unsigned a = /* value of a */;
unsigned b = /* value of b */;
unsigned c = /* value of c */;
unsigned newVal1 = 0;
newVal1 |= a & 0x03;        // Get the lowest two bits of a
newVal1 <<= 8;              // Make space for the next eight bits
newVal1 |= b & 0xffu;       // "Concatenate" eight bits from b
newVal1 <<= 2;              // Make space for the next two bits
newVal1 |= (c >> 6) & 0x03; // Get the two "top" bits of c
Now the lowest twelve bits of newVal1 should follow the three rules set up for your assignment. The top four bits will be all zero.
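Putting it together as a complete program, with the same test values as in the question (this is only a sketch of the approach above):
#include <bitset>
#include <iostream>

int main() {
    unsigned a = 0b11111111 & 0b00000011;
    unsigned b = 0b11111111;
    unsigned c = 0b11111111 & 0b11000000;

    unsigned newVal1 = 0;
    newVal1 |= a & 0x03;        // lowest two bits of a
    newVal1 <<= 8;
    newVal1 |= b & 0xffu;       // eight bits of b
    newVal1 <<= 2;
    newVal1 |= (c >> 6) & 0x03; // top two bits of c

    std::cout << std::bitset<16>(newVal1) << std::endl; // prints 0000111111111111
    return 0;
}
Note that the packed value sits in the low twelve bits here, whereas the question's own attempt left it in the top twelve bits.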

What is this C++ code using operator ^?

What does this code mean?
int possible = 1;
for (int i = 0; i < n - 1; ++i) {
    possible += (n_size[i] ^ n_size[i + 1]) < 0;
}
I think this ^ is XOR, but how is it working in this code? That seems strange, because I thought that when we use XOR we just get 0 or 1.
Please help me to understand.
Let's look at this line:
possible += (n_size[i] ^ n_size[i + 1]) < 0;
We don't know what n_size is, but I'll suppose it is an array of n ints. So we XOR (bitwise) two consecutive terms of n_size and determine the sign of the result (by comparing it to 0).
Bitwise XOR operates bit by bit, so ABCD = 1011 ^ 0101 means A = 1 ^ 0, B = 0 ^ 1, C = 1 ^ 0, D = 1 ^ 1.
ints are encoded in such a way that the most significant bit carries the sign (in the number 0bX???????, if X = 1 the number is negative, otherwise it is positive).
So (A ^ B) < 0 is equivalent to (A < 0) ^ (B < 0).
So this line increments possible when two consecutive terms do not have the same sign.
Finally, possible counts the number of sign changes between consecutive terms (plus one, since it starts at 1).
PS: note that float and double also keep their sign in the most significant bit, but bitwise ^ cannot be applied directly to floating-point operands in C++, so for an array of float or double you would have to compare the signs another way (e.g. with std::signbit).
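Here is a small self-contained example of how possible accumulates; the array values are my own, just for illustration:
#include <iostream>

int main() {
    int n_size[] = {3, -1, 4, -5, -2, 6}; // signs: + - + - - +
    int n = 6;

    int possible = 1;
    for (int i = 0; i < n - 1; ++i) {
        possible += (n_size[i] ^ n_size[i + 1]) < 0;
    }

    // Sign changes at 3->-1, -1->4, 4->-5 and -2->6, so possible == 1 + 4 == 5.
    std::cout << possible << std::endl; // prints 5
    return 0;
}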
As coderedoc was a little short in his comment: ^ is a bitwise operator, just like | and &. These operators are applied to every pair of corresponding bits (and for each such pair, XOR works exactly as you understood it) in the two variables:
unsigned char c1 = 0b11001010;
unsigned char c2 = 0b10101100;
unsigned char c3 = c1 ^ c2; // == 0b01100110
unsigned char c4 = c1 & c2; // == 0b10001000
unsigned char c5 = c1 | c2; // == 0b11101110

Convert every 5 bits into integer values in C++

Firstly, if anyone has a better title for me, let me know.
Here is an example of the process I am trying to automate with C++
I have an array of values that appear in this format:
9C07 9385 9BC7 00 9BC3 9BC7 9385
I need to convert them to binary and then convert every 5 bits to decimal, with the last bit being a flag, like so:
I'll do this with only the first word here.
9C07
10011 | 10000 | 00011 | 1
19 | 16 | 3
These are actually x, y, z coordinates, and the final bit determines the order they are in: a '0' would make it x=19, y=16, z=3, and a '1' makes it x=16, y=3, z=19.
I already have a buffer filled with these hex values, but I have no idea where to go from here.
I assume these are integer literals, not strings?
The way to do this is with bitwise right shift (>>) and bitwise AND (&):
#include <cstdint>

struct Coordinate {
    std::uint8_t x;
    std::uint8_t y;
    std::uint8_t z;

    constexpr Coordinate(std::uint16_t n) noexcept
    {
        if (n & 1) { // flag
            x = (n >> 6) & 0x1F; // 1 1111
            y = (n >> 1) & 0x1F;
            z = n >> 11;
        } else {
            x = n >> 11;
            y = (n >> 6) & 0x1F;
            z = (n >> 1) & 0x1F;
        }
    }
};
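As a quick check, applying this to the question's first word 0x9C07 (whose flag bit is 1):
#include <iostream>

int main() {
    Coordinate coord(0x9C07); // 10011 10000 00011 1
    // With the flag set the order is x=16, y=3, z=19, as described in the question.
    std::cout << int(coord.x) << ' ' << int(coord.y) << ' ' << int(coord.z) << '\n';
    return 0;
}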
The following code would extract the three coordinates and the flag from the 16 least significant bits of value (i.e. its least significant word).
int flag = value & 1; // keep only the least significant bit
value >>= 1; // shift right by one bit
int third_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int second_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int first_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits (only useful if there are other words in "value")
What you need is most likely some loop doing this on each word of your array.
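For example, a minimal sketch of such a loop, assuming the buffer is already available as an array of 16-bit words (decode_words, buffer and count are placeholder names):
#include <cstddef>
#include <cstdint>
#include <iostream>

void decode_words(const std::uint16_t* buffer, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        std::uint16_t value = buffer[i];

        int flag = value & 1;             // least significant bit is the flag
        value >>= 1;
        int third_integer = value & 0x1f; // lowest five remaining bits
        value >>= 5;
        int second_integer = value & 0x1f;
        value >>= 5;
        int first_integer = value & 0x1f;

        std::cout << first_integer << ' ' << second_integer << ' '
                  << third_integer << " flag=" << flag << '\n';
    }
}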

Select spans of set bits in a bitmask that overlap with a 1-bit in a selector bitmap

Given:
A bitmask a (say, std::uint64_t), which contains at least one set (1) bit.
A selector bitmask b which is a subset of a (i.e. a & b == b), and has at least one bit set.
I want to select spans of contiguous 1-bits in a which overlap with a bit in b:
a = 0b1111001110001100;
b = 0b0000001010001000;
//c=0b0000001110001100
//    XXXX  YYY   ZZ
The XXXX group is 0 in c because b & XXXX is false. The ZZ group is copied because b has one of the Z bits set. The YYY group is also set in c for the same reason. Notice that b can have multiple set bits in a single group in a.
So for every contiguous group of 1s in a, set all of those bits in c if b has a 1 in any of those positions. A more complex example:
std::uint64_t a = 0b1101110110101;
std::uint64_t b = 0b0001010010001;
// desired c == 0b0001110110001
// contiguous groups ^^^ ^^   ^ that overlap with a 1 in b
assert((a & b) == b); // b is a subset of a
std::uint64_t c = some_magic_operation(a, b);
assert(c == 0b0001110110001);
Are there any bit-logic instructions/intrinsics (MMX, SSE, AVX, BMI1/BMI2) or bit-manipulation tricks which allow me to calculate c from a and b efficiently, i.e. without loops?
ADDITIONAL:
Using the hint from Denis' answer, I can only come up with a loop-based algorithm:
std::uint64_t a = 0b0110111001001101;
std::uint64_t b = 0b0100101000001101;
assert((a & b) == b); // subset
std::cout << std::bitset< 16 >(a) << std::endl;
std::cout << std::bitset< 16 >(b) << std::endl;
std::uint64_t x = (a + b) & ~a;
std::uint64_t c = 0;
while ((x = (a & (x >> 1)))) { // length of longest 1-series times
c |= x;
}
std::cout << std::bitset< 16 >(c) << std::endl;
In the case of uint64_t you can do this trick:
Let's set a = 0b11011101101. Having at least one 0 bit above each 1-filled area is important, so each carry has somewhere to land. The bitmask has 4 separate areas filled with 1-bits. If you compute c = a + (a & b), then each 1-filled area overflows into the 0 bit just above it if at least one bit of b in that area is set. So you can then check which areas overflowed. For example, if you want to test for 1-bits of b in the 2nd and 3rd areas of a, you can check:
assert(c & 0b00100010000);
//              ^^^ ^^ these segments overflow
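Here is a short self-contained demonstration of that trick; the value of b is my own choice, with one bit in the 2nd area and one in the 3rd:
#include <cassert>
#include <cstdint>

int main() {
    std::uint64_t a = 0b11011101101; // four 1-filled areas
    std::uint64_t b = 0b00001000100; // one bit in the 2nd area, one in the 3rd
    assert((a & b) == b);            // b is a subset of a

    std::uint64_t c = a + (a & b);   // each area containing a b-bit carries into the 0 above it
    assert(c & (1u << 4));           // the 2nd area (bits 3..2) overflowed into bit 4
    assert(c & (1u << 8));           // the 3rd area (bits 7..5) overflowed into bit 8
    return 0;
}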

What is this C++ define doing? How can I write it in Python?

I have this C++ define
#define CSYNC_VERSION_INT(a, b, c) ((a) << 16 | (b) << 8 | (c))
I need to define the same in Python. What is this doing? How can I do the same in Python?
The equivalent would be
def CSYNC_VERSION_INT(a, b, c):
    return a << 16 | b << 8 | c
It shifts a left by 16 bits, b left by 8, and leaves c intact; then all these numbers are bitwise ORed together. It thus packs a, b and c into three (of the four) bytes of an integer, so that the lowest byte is the value of c, the second lowest is b, and the topmost bytes are the a value.
CSYNC_VERSION_INT(3, 2, 8) is equal to 0x30208 in hex, or 197128 in decimal.
I want to add to Antti Haapala's answer what that macro does: it creates an int from three bytes, which are a, b and c.
Example:
int main()
{
    unsigned int a = 0x02;
    unsigned int b = 0xf4;
    unsigned int c = 0x56;
    unsigned int p = CSYNC_VERSION_INT(a, b, c);
    // now p == 0x02f456
}
It is using bit-shifts to store a version number in a single int. It will store the "major" version in the upper 16 bits, the "minor" version in the first 8 bits of the lower 16, and the "revision" number in the lowest 8 bits.
It will not work well if the inputs are too large (e.g. if a is outside the valid range for an unsigned short, or if b or c are outside the range of an unsigned char). Since it has no type-safety, a better approach would be to make an inline function that does the same operation with the appropriate types:
inline unsigned long MakeVersion(unsigned short major, unsigned char minor, unsigned char revision)
{
    unsigned long l = (static_cast<unsigned long>(major) << 16) | (static_cast<unsigned long>(minor) << 8) | static_cast<unsigned long>(revision);
    return l;
}
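A quick usage check of MakeVersion above (version numbers chosen arbitrarily):
#include <cassert>

int main() {
    // Major 1, minor 2, revision 3 packs into 0x010203.
    assert(MakeVersion(1, 2, 3) == 0x010203UL);
    return 0;
}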
Since Python has the same bit-shift and bitwise OR operators, you should be able to use the same expression to accomplish the same task.
You can write this in Python with the same meaning:
res = ((a) << 16 | (b) << 8 | (c))
Assuming you have a 1-byte data type (like char) and you want to store all the data in a bigger data type (>= 3 bytes), you can use this shift, so for
a = 01001110
b = 11010001
c = 00100011
will be
res= 01001110 11010001 00100011
(dump, all in binary)
'<<' means bitwise left shift
'|' means bitwise OR (logical OR on every bit)
You can also go the opposite way, recovering a, b and c from res:
a = (res >> 16) & 0xFF
b = (res >> 8) & 0xFF
c = res & 0xFF
So shift out what you need and then select only the last byte and store it.
Very useful when making a calculator with unlimited precision :)
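Here is a small C++ round-trip check of the packing and unpacking, using the binary values from the example above:
#include <cassert>

#define CSYNC_VERSION_INT(a, b, c) ((a) << 16 | (b) << 8 | (c))

int main() {
    unsigned res = CSYNC_VERSION_INT(0b01001110u, 0b11010001u, 0b00100011u);

    unsigned a = (res >> 16) & 0xFF; // recover the original bytes
    unsigned b = (res >> 8) & 0xFF;
    unsigned c = res & 0xFF;

    assert(a == 0b01001110 && b == 0b11010001 && c == 0b00100011);
    return 0;
}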