Problem
Suppose I have a bit mask mask and an input n, such as
mask = 0x10f3 (0001 0000 1111 0011)
n = 0xda4d (1101 1010 0100 1101)
I want to 1) isolate the masked bits (remove bits from n not in mask)
masked_n = 0x10f3 & 0xda4d = 0x1041 (0001 0000 0100 0001)
and 2) "flatten" them (get rid of the zero bits in mask and apply those same shifts to masked_n)?
flattened_mask = 0x007f (0000 0000 0111 1111)
bits to discard (___1 ____ 0100 __01)
first shift ( __ _1__ __01 0001)
second shift ( __ _101 0001)
result = 0x0051 (0000 0000 0101 0001)
Tried solutions
a) For this case, one could craft an ad hoc series of bit shifts:
result = (n & 0b11) | (n & 0b11110000) >> 2 | (n & 0b1000000000000) >> 6
b) More generically, one could also iterate over each bit of mask and calculate result one bit at a time.
for (auto i = 0, pos = 0; i < 16; i++) {
if (mask & (1<<i)) {
if (n & (1<<i)) {
result |= (1<<pos);
}
pos++;
}
}
Question
Is there a more efficient way of doing this generically, or at the very least, ad hoc but with a fixed number of operations regardless of bit placement?
A more efficient generic approach is to loop over only the bits that are set in the mask, removing the if (mask & (1<<i)) test from your loop and iterating only 7 times instead of 16 for your example mask. In each iteration of the loop, find the rightmost set bit of the mask, test it against n, set the corresponding bit in the result, and then remove it from the mask.
int mask = 0x10f3;
int n = 0xda4d;
int result = 0;
int m = mask, pos = 1;
while(m != 0)
{
// find rightmost bit in m:
int bit = m & -m;
if (n & bit)
result |= pos;
pos <<= 1;
m &= ~bit; // remove the rightmost bit from m
}
printf("%04x %04x %04x\n", mask, n, result);
Output:
10f3 da4d 0051
Or, perhaps less readably but without the bit temp variable (this replaces the body of the loop above):
if (n & -m & m)
result |= pos;
pos <<= 1;
m &= m-1;
How does it work? First, consider why m &= m-1 clears the rightmost (least significant) set bit. Your (non-zero) mask m is going to be made up of a certain number of bits, then a 1 in the least significant set place, then zero or more 0s:
e.g:
xxxxxxxxxxxx1000
Subtracting 1 gives:
xxxxxxxxxxxx0111
So all the bits higher than the least significant set bit will be unchanged (so ANDing them together leaves them unchanged), the least significant set bit changes from a 1 to a 0, and the less significant bits were all 0s beforehand so ANDing them with all 1s leaves them unchanged. Net result: least significant set bit is cleared and the rest of the word stays the same.
To understand why m & -m gives the least significant set bit, combine the above with the fact that in 2s complement, -x = ~(x-1). For the example above, m = xxxxxxxxxxxx1000 gives -m = ~(xxxxxxxxxxxx0111) = yyyyyyyyyyyy1000, where each y is the complement of the corresponding x; ANDing m with -m therefore clears everything except that least significant set bit.
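As an aside, if you can assume an x86 CPU with BMI2 (an assumption the question doesn't state), the _pext_u32 intrinsic from <immintrin.h> performs exactly this kind of bit packing in a fixed number of operations regardless of bit placement. A minimal sketch:

#include <cstdio>
#include <immintrin.h>   // _pext_u32; requires BMI2 (compile with e.g. -mbmi2)

int main() {
    unsigned mask = 0x10f3, n = 0xda4d;
    unsigned result = _pext_u32(n, mask);          // gather the masked bits of n at the low end
    printf("%04x %04x %04x\n", mask, n, result);   // 10f3 da4d 0051
}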
Related
I currently have an unsigned int of 64 bits that contains:
0100
0100
0100
0100
0000...
And I would like to change it to:
01000
01000
01000
01000
00000...
Is there a way to do that?
Thanks
📎 Hi! It looks like you are trying to expand 4-bit nibbles into 5-bit groups.
In general, you can do it like this
uint64_t value = YOUR_DATA; //this has your bits.
for (int i = 0; i < (int)sizeof(value)*2; i++) {
uint8_t nibble = (value & 0xF);
nibble <<= 1; //shift left 1 bit, add 0 to end.
STORE(nibble, i);
value>>=4; //advance to next nibble
}
This will call STORE once for each group of 4 bits. The arguments to STORE are the "expanded" 5 bit value, and the nibble counter, where 0 represents the least significant 4 bits.
The design question to answer is how to store the result. 64 bits / 4 * 5 = 80 bits, so you either need 2 words, or you have to throw away the data at one end.
Assuming 2 words with the anchor at the LSB, STORE could look like
static uint64_t result[2] = {0,0};
void STORE(uint64_t result[], uint8_t value, int n) {
int idx = (n>12); //which result word?
result[idx] |= (uint64_t)value << ( n*5 - idx*64 ); // shift in 64-bit arithmetic
if (n==12) result[1] |= value>>4; //65th bit goes into 2nd result word
}
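Putting the loop and STORE together into one runnable sketch (the sample input value below is just an assumption for illustration):

#include <cstdint>
#include <cstdio>

static uint64_t result[2] = {0, 0};

// STORE as above, with the shift done in 64-bit arithmetic
static void STORE(uint64_t out[], uint8_t value, int n) {
    int idx = (n > 12);                               // which result word?
    out[idx] |= (uint64_t)value << (n * 5 - idx * 64);
    if (n == 12) out[1] |= value >> 4;                // 65th bit goes into the 2nd word
}

int main() {
    uint64_t value = 0x4444444444444444ULL;           // sample input: every nibble is 0100
    for (int i = 0; i < (int)sizeof(value) * 2; i++) {
        uint8_t nibble = value & 0xF;
        STORE(result, (uint8_t)(nibble << 1), i);     // expand 0100 -> 01000
        value >>= 4;
    }
    printf("%016llx %016llx\n", (unsigned long long)result[1], (unsigned long long)result[0]);
}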
Omit the leading 0, it serves no purpose => shift left one bit
Firstly, if anyone has a better title for me, let me know.
Here is an example of the process I am trying to automate with C++
I have an array of values that appear in this format:
9C07 9385 9BC7 00 9BC3 9BC7 9385
I need to convert them to binary and then convert every 5 bits to decimal, like so, with the last bit being a flag:
I'll do this with only the first word here.
9C07
10011 | 10000 | 00011 | 1
19 | 16 | 3
These are actually x, y, z coordinates, and the final bit determines the order they are in: a '0' would make it x=19, y=16, z=3, and a '1' makes it x=16, y=3, z=19.
I already have a buffer filled with these hex values, but I have no idea where to go from here.
I assume these are integer literals, not strings?
The way to do this is with bitwise right shift (>>) and bitwise AND (&)
#include <cstdint>
struct Coordinate {
std::uint8_t x;
std::uint8_t y;
std::uint8_t z;
constexpr Coordinate(std::uint16_t n) noexcept
{
if (n & 1) { // flag
x = (n >> 6) & 0x1F; // 1 1111
y = (n >> 1) & 0x1F;
z = n >> 11;
} else {
x = n >> 11;
y = (n >> 6) & 0x1F;
z = (n >> 1) & 0x1F;
}
}
};
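A quick usage check against the example word from the question, assuming the Coordinate struct above is in scope:

#include <cstdio>

int main() {
    Coordinate c(0x9C07);                 // flag bit is set, so x = 16, y = 3, z = 19
    printf("%d %d %d\n", c.x, c.y, c.z);
}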
The following code would extract the three coordinates and the flag from the 16 least significant bits of value (i.e., its least significant word).
int flag = value & 1; // keep only the least significant bit
value >>= 1; // shift right by one bit
int third_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int second_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int first_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits (only useful if there are other words in "value")
What you need is most likely some loop doing this on each word of your array.
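For example, a sketch of that loop (the buffer name, element type, and count parameter are assumptions):

#include <cstddef>
#include <cstdint>
#include <cstdio>

// Apply the shift-and-mask extraction above to every 16-bit word of a buffer.
void decode_words(const uint16_t *words, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        uint16_t value = words[i];
        int flag = value & 1;        // least significant bit
        value >>= 1;
        int third = value & 0x1f;    // next five bits
        value >>= 5;
        int second = value & 0x1f;
        value >>= 5;
        int first = value & 0x1f;
        printf("%d %d %d (flag %d)\n", first, second, third, flag);
    }
}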
I have a variable mask of type std::bitset<8> as
std::string bit_string = "00101100";
std::bitset<8> mask(bit_string);
Is there an efficient way to quickly mask out the corresponding (three) bits of another given std::bitset<8> input and move all those masked-out bits to the rightmost positions? E.g., if input is 10100101, then I would like to quickly get 00000101, which equals 5 in decimal. Then I can use vect[5] to quickly index the 6th element of vect, which is a std::vector<int> of size 8.
Or rather, can I quickly get the decimal value of the masked out bits (with their relative positions retained)? Or I can't?
I guess in my case the advantage to be taken is the bitset<8> mask I already have, and I'm supposed to manipulate it somehow to do the work fast.
I see it like this (added by Spektre):
mask 00101100b
input 10100101b
---------------
& ??1?01??b
>> 101b
5
First things first: you can't avoid O(n) complexity, with n being the number of mask bits, if your mask is only available in binary form. However, if your mask is constant for multiple inputs, you can preprocess it into a series of m mask-and-shift transformations, where m is less than or equal to the number of 1 bits in your mask. If you know the mask at compile time, you can even preconstruct the transformations, and then you get your O(m).
To apply this idea, you need to create a sub-mask for each group of 1 bits in your mask and combine it with a shift information. The shift information is constructed by counting the number of zeroes to the right of the current group.
Example:
mask = 00101100b
// first group of ones
submask1 = 00001100b
// number of zeroes to the right of the group
subshift1 = 2
submask2 = 00100000b
subshift2 = 3
// Apply:
input = 10100101b
transformed1 = (input & submask1) >> subshift1   // = 00000001b
transformed2 = (input & submask2) >> subshift2   // = 00000100b
result = transformed1 | transformed2             // = 00000101b
If you make the sub-transforms into an array, you can easily apply them in a loop.
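For example, a sketch in C++ (SubTransform, build_transforms and apply are names of my choosing, not from the answer):

#include <cstdio>
#include <vector>

struct SubTransform { unsigned submask; int shift; };

// Precompute one (submask, shift) pair per group of contiguous 1-bits in the mask.
std::vector<SubTransform> build_transforms(unsigned mask, int width = 8) {
    std::vector<SubTransform> t;
    int ones_below = 0;                                  // set bits already consumed
    for (int pos = 0; pos < width; ) {
        if (!(mask & (1u << pos))) { ++pos; continue; }
        int start = pos;
        unsigned group = 0;
        while (pos < width && (mask & (1u << pos))) { group |= 1u << pos; ++pos; }
        t.push_back({group, start - ones_below});        // shift = zeros to the right of the group
        ones_below += pos - start;
    }
    return t;
}

unsigned apply(const std::vector<SubTransform>& t, unsigned input) {
    unsigned result = 0;
    for (const auto& s : t) result |= (input & s.submask) >> s.shift;
    return result;
}

int main() {
    auto t = build_transforms(0x2C);        // mask 00101100b
    printf("%u\n", apply(t, 0xA5));         // input 10100101b -> 5
}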
Your domain is small enough that you can brute-force this. Trivially, an unsigned char LUT[256][256] can store all possible outcomes in just 64 KB.
I understand that the mask has at most 3 set bits, so you can restrict the lookup table size in that dimension to [225] (the highest such mask is 11100000, i.e. 224). And since f(input, mask) == f(input&mask, mask) you can in fact reduce the LUT to unsigned char[225][225].
A further size reduction is possible by realizing that the highest mask is 11100000, but you can just test the lowest bit of the mask. When mask is even, f(input, mask) == f((input&mask)/2, mask/2). The highest odd mask is only 11000001, i.e. 193. This reduces your LUT further, to [194][194].
A more space-efficient algorithm splits input and mask into 2 nibbles (4 bits). You now have a very simple LUT[16][16] in which you look up the high and low parts:
int himask = mask >> 4, lomask = mask & 0xF;
int hiinp = input >> 4, loinp = input & 0xF;
unsigned char hiout = LUT[himask][hiinp];
unsigned char loout = LUT[lomask][loinp];
return hiout << bitsIn[lomask] | loout;
This shows that you need another table, char bitsIn[16].
Taking the example :
mask 0010 1100b
input 1010 0101b
himask = 0010
hiinp = 1010
hiout = 0001
lomask = 1100
loinp = 0101
loout = 0001
bitsIn[lomask 1100] = 2
return (0001 << 2) | (0001)
Note that this generalizes fairly easily to more than 8 bits:
int bitsSoFar = 0;
int retval = 0;
while(mask) { // Until we've looked up all bits.
int mask4 = mask & 0xF;
int input4 = input & 0xF;
retval |= LUT[mask4][input4] << bitsSoFar;
bitsSoFar += bitsIn[mask4];
mask >>= 4;
input >>= 4;
}
Since this LUT only holds nibbles, you could reduce it to 16*16/2 bytes, but I suspect that's not worth the effort.
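A sketch of how LUT and bitsIn could be filled in and used with the loop above (init_tables and extract are made-up names):

#include <cstdio>

static unsigned char LUT[16][16];   // LUT[mask4][input4] = packed masked bits
static unsigned char bitsIn[16];    // number of set bits in the 4-bit mask

void init_tables() {
    for (int m = 0; m < 16; ++m) {
        int count = 0;
        for (int b = 0; b < 4; ++b) if (m & (1 << b)) ++count;
        bitsIn[m] = (unsigned char)count;
        for (int in = 0; in < 16; ++in) {
            int packed = 0, pos = 0;
            for (int b = 0; b < 4; ++b) {
                if (m & (1 << b)) {
                    if (in & (1 << b)) packed |= 1 << pos;
                    ++pos;
                }
            }
            LUT[m][in] = (unsigned char)packed;
        }
    }
}

unsigned extract(unsigned input, unsigned mask) {
    int bitsSoFar = 0;
    unsigned retval = 0;
    while (mask) {                         // same loop as shown above
        int mask4 = mask & 0xF, input4 = input & 0xF;
        retval |= (unsigned)LUT[mask4][input4] << bitsSoFar;
        bitsSoFar += bitsIn[mask4];
        mask >>= 4;
        input >>= 4;
    }
    return retval;
}

int main() {
    init_tables();
    printf("%u\n", extract(0xA5, 0x2C));   // 10100101b / 00101100b -> 5
}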
I see it like this:
mask 00101100b
input 10100101b
---------------
& ??1?01??b
>> 101b
5
I would create a bit weight table for each set bit in mask by scanning the bits from the LSB, adding weights 1,2,4,8,16,... for set bits and leaving zero for the rest, so:
MSB LSB
--------------------------
mask 0 0 1 0 1 1 0 0 bin
--------------------------
weight 0 0 4 0 2 1 0 0 dec (A)
input 1 0 1 0 0 1 0 1 bin (B)
--------------------------
(A.B) 0*1+0*0+4*1+0*0+2*0+1*1+0*0+0*1 // this is dot product ...
4 + 1
--------------------------
5 dec
--------------------------
Sorry I do not code in Python at all so no code ... I still think using integral types for this directly would be better but that is probably just my low level C++ thinking ...
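For what it's worth, here is a rough C++ sketch of the weight-table idea above (an 8-bit version; pack_bits_by_weight is a made-up name):

#include <cstdio>

unsigned pack_bits_by_weight(unsigned input, unsigned mask) {
    unsigned weight[8] = {0};
    unsigned w = 1;
    for (int i = 0; i < 8; ++i) {                    // build the weight table from the mask
        if (mask & (1u << i)) { weight[i] = w; w <<= 1; }
    }
    unsigned result = 0;
    for (int i = 0; i < 8; ++i)                      // dot product with the input bits
        if (input & (1u << i)) result += weight[i];
    return result;
}

int main() {
    printf("%u\n", pack_bits_by_weight(0xA5, 0x2C)); // 10100101b, 00101100b -> 5
}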
I've got an interesting problem that has me looking for a more efficient way of doing things.
Let's say we have a value (in binary)
(VALUE) 10110001
(MASK) 00110010
----------------
(AND) 00110000
Now, I need to be able to XOR any bits from the (AND) value that are set in the (MASK) value (always lowest to highest bit):
(RESULT) AND1(0) xor AND4(1) xor AND5(1) = 0
Now, on paper, this is certainly quick since I can see which bits are set in the mask. It seems to me that programmatically I would need to keep right shifting the MASK until I found a set bit, XOR it with a separate value, and loop until the entire byte is complete.
Can anyone think of a faster way? I'm looking for the way to do this with the least number of operations and stored values.
If I understood this question correctly, what you want is to get every bit from VALUE that is set in the MASK, and compute the XOR of those bits.
First of all, note that XOR'ing a value with 0 will not change the result. So, to ignore some bits, we can treat them as zeros.
So, XORing the bits set in VALUE that are in MASK is equivalent to XORing the bits in VALUE&MASK.
Now note that the result is 0 if the number of set bits is even, 1 if it is odd.
That means we want to count the number of set bits. Some architectures/compilers have ways to quickly compute this value. For instance, on GCC this can be obtained with __builtin_popcount.
So on GCC, this can be computed with:
int set_bits = __builtin_popcount(value & mask);
return set_bits % 2;
If you want the code to be portable, then this won't do. However, a comment in this answer suggests that some compilers can inline std::bitset::count to efficiently obtain the same result.
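A portable sketch along those lines, using std::bitset::count (the 32-bit width is an assumption):

#include <bitset>
#include <cstdio>

int masked_parity(unsigned value, unsigned mask) {
    // XOR of the masked bits == parity of the popcount of value & mask
    return (int)(std::bitset<32>(value & mask).count() & 1u);
}

int main() {
    printf("%d\n", masked_parity(0xB1, 0x32));   // 10110001b & 00110010b -> parity 0
}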
If I'm understanding you right, you have
result = value & mask
and you want to XOR the 1 bits of mask & result together. The XOR of a series of bits is the same as counting the number of bits and checking if that count is even or odd. If it's odd, the XOR would be 1; if even, XOR would give 0.
count_bits(mask & result) % 2 != 0
mask & result can be simplified to simply result. You don't need to AND it with mask again. The % 2 != 0 can be alternately written as & 1.
count_bits(result) & 1
As far as how to count bits, the Bit Twiddling Hacks web page gives a number of bit counting algorithms.
Counting bits set, Brian Kernighan's way
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
v &= v - 1; // clear the least significant bit set
}
Brian Kernighan's method goes through as many iterations as there are set bits. So if we have a 32-bit word with only the high bit set, then it will only go once through the loop.
If you were to use that implementation, you could optimize it a bit further. If you think about it, you don't need the full count of bits. You only need to track their parity. Instead of counting bits you could just flip c each iteration.
unsigned bit_parity(unsigned v) {
    unsigned c;
    for (c = 0; v; c ^= 1) {
        v &= v - 1; // clear the least significant bit set
    }
    return c;
}
(Thanks to Slava for the suggestion.)
Using the fact that XOR with 0 doesn't change anything, it's OK to apply the mask and then unconditionally XOR all bits together, which can be done in a parallel-prefix way. So, something like this (not tested):
x = m & v;
x ^= x >> 16;
x ^= x >> 8;
x ^= x >> 4;
x ^= x >> 2;
x ^= x >> 1;
result = x & 1;
You can use more (or fewer) steps as needed, this is for 32 bits.
One significant issue to be aware of when using v &= v - 1 in the main body of your code is that it changes the value of v to 0 in the course of the count. With other methods, the value ends up changed to the number of 1's. While counting logic is generally wrapped in a function, where that is no longer a concern, if you are required to put your counting logic in the main body of your code, you must preserve a copy of v if that value is needed again.
In addition to the other two methods presented, the following is another favorite from bit-twiddling hacks that generally has a bit better performance than the loop method for larger numbers:
/* get the population 1's in the binary representation of a number */
unsigned getn1s (unsigned int v)
{
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
v = (v + (v >> 4)) & 0x0F0F0F0F;
v = v + (v << 8);
v = v + (v << 16);
return v >> 24;
}
A solution to this question is given on the GeeksforGeeks website.
I wish to know whether there exists a better and simpler solution. The linked one is a bit complicated to understand. Just an algorithm will be fine.
I am pretty sure this algorithm is as efficient as, and easier to understand than, your linked algorithm.
The strategy here is to understand that the only way to make a number bigger without increasing its number of 1's is to carry a 1, but if you carry multiple 1's then you must add them back in.
Given a number 1001 1100
Right shift it until the value is odd, 0010 0111. Remember the number of shifts: shifts = 2;
Right shift it until the value is even, 0000 0100. Remember the number of shifts performed and bits consumed. shifts += 3; bits = 3;
So far, we have taken 5 shifts and 3 bits from the algorithm to carry the lowest digit possible. Now we pay it back.
Make the rightmost bit 1. 0000 0101. We now owe it 2 bits. bits -= 1
Shift left 3 times to add the 0's: 0010 1000. We do it three times because shifts - bits == 3. shifts -= 3
Now we owe the number two bits and two shifts. So shift it left twice, setting the rightmost bit to 1 each time: 1010 0011. We've paid back all the bits and all the shifts. bits -= 2; shifts -= 2; bits == 0; shifts == 0
Here are a few other examples; each step is shown as current_val, shifts_owed, bits_owed. (A code sketch of these steps follows the examples.)
0000 0110
0000 0110, 0, 0 # Start
0000 0011, 1, 0 # Shift right till odd
0000 0000, 3, 2 # Shift right till even
0000 0001, 3, 1 # Set LSB
0000 0100, 1, 1 # Shift left 0's
0000 1001, 0, 0 # Shift left 1's
0011 0011
0011 0011, 0, 0 # Start
0011 0011, 0, 0 # Shift right till odd
0000 1100, 2, 2 # Shift right till even
0000 1101, 2, 1 # Set LSB
0001 1010, 1, 1 # Shift left 0's
0011 0101, 0, 0 # Shift left 1's
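A direct transcription of those steps as a sketch (it assumes a non-zero input whose successor still fits in an unsigned int; next_same_popcount is a made-up name):

#include <cstdio>

unsigned next_same_popcount(unsigned x) {
    int shifts = 0, bits = 0;
    while ((x & 1) == 0) { x >>= 1; ++shifts; }          // shift right till odd
    while (x & 1)        { x >>= 1; ++shifts; ++bits; }  // shift right till even
    x |= 1;                                              // set LSB, paying back one bit
    --bits;
    x <<= (shifts - bits);                               // shift the 0's back in
    while (bits--) x = (x << 1) | 1;                     // shift the remaining 1's back in
    return x;
}

int main() {
    printf("%x %x\n", next_same_popcount(0x9C), next_same_popcount(0x33)); // a3 35
}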
There is a simpler, though definitely less efficient one. It follows:
Count the number of bits in your number (right shift your number until it reaches zero, and count the number of times the rightmost bit is 1).
Increment the number until you get the same result.
Of course it is extremely inefficient. Consider a number that's a power of 2 (having 1 bit set). You'll have to double this number to get your answer, incrementing the number by 1 in each iteration. Of course it won't work.
If you want a simpler efficient algorithm, I don't think there is one. In fact, it seems pretty simple and straightforward to me.
Edit: By "simpler", I mean it's mpre straightforward to implement, and possibly has a little less code lines.
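For illustration, the brute-force approach as a sketch (popcount_naive and next_same_popcount_naive are made-up names):

#include <cstdio>

static int popcount_naive(unsigned v) {
    int c = 0;
    for (; v; v >>= 1) c += v & 1;   // count the 1 bits by shifting
    return c;
}

unsigned next_same_popcount_naive(unsigned x) {
    int target = popcount_naive(x);
    unsigned n = x;
    do { ++n; } while (popcount_naive(n) != target);   // increment until the counts match
    return n;
}

int main() {
    printf("%x\n", next_same_popcount_naive(0x9C));    // a3
}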
Based on some code I happened to have kicking around which is quite similar to the geeksforgeeks solution (see this answer: https://stackoverflow.com/a/14717440/1566221), and on a highly optimized version of @QuestionC's answer which avoids some of the shifting, I concluded that division is slow enough on some CPUs (that is, on my Intel i5 laptop) that looping actually wins out.
However, it is possible to replace the division in the g-for-g solution with a shift loop, and that turned out to be the fastest algorithm, again just on my machine. I'm pasting the code here for anyone who wants to test it.
For any implementation, there are two annoying corner cases: one is where the given integer is 0; the other is where the integer is the largest possible value. The following functions all have the same behaviour: if given the largest integer with k bits, they return the smallest integer with k bits, thereby restarting the loop. (That works for 0, too: it means that given 0, the functions return 0.)
Bit-hack solution with division:
template<typename UnsignedInteger>
UnsignedInteger next_combination_1(UnsignedInteger comb) {
UnsignedInteger last_one = comb & -comb;
UnsignedInteger last_zero = (comb + last_one) &~ comb;
if (last_zero)
return comb + last_one + ((last_zero / last_one) >> 1) - 1;
else if (last_one)
return UnsignedInteger(-1) / last_one;
else
return 0;
}
Bit-hack solution with division replaced by a shift loop
template<typename UnsignedInteger>
UnsignedInteger next_combination_2(UnsignedInteger comb) {
UnsignedInteger last_one = comb & -comb;
UnsignedInteger last_zero = (comb + last_one) &~ comb;
UnsignedInteger ones = (last_zero - 1) & ~(last_one - 1);
if (ones) while (!(ones & 1)) ones >>= 1;
comb += last_one;
if (comb) comb += ones >> 1; else comb = ones;
return comb;
}
Optimized shifting solution
template<typename UnsignedInteger>
UnsignedInteger next_combination_3(UnsignedInteger comb) {
if (comb) {
// Shift the trailing zeros, keeping a count.
int zeros = 0; for (; !(comb & 1); comb >>= 1, ++zeros);
// Adding one at this point turns all the trailing ones into
// trailing zeros, and also changes the 0 before them into a 1.
// In effect, this is steps 3, 4 and 5 of QuestionC's solution,
// without actually shifting the 1s.
UnsignedInteger res = comb + 1U;
// We need to put some ones back on the end of the value.
// The ones to put back are precisely the ones which were at
// the end of the value before we added 1, except we want to
// put back one less (because the 1 we added counts). We get
// the old trailing ones with a bit-hack.
UnsignedInteger ones = comb &~ res;
// Now, we finish shifting the result back to the left
res <<= zeros;
// And we add the trailing ones. If res is 0 at this point,
// we started with the largest value, and ones is the smallest
// value.
if (res) res += ones >> 1;
else res = ones;
comb = res;
}
return comb;
}
(Some would say that the above is yet another bit-hack, and I won't argue.)
Highly non-representative benchmark
I tested this by running through all 32-bit numbers. (That is, I create the smallest pattern with i ones and then cycle through all the possibilities, for each value of i from 0 to 32.):
#include <cstdint>
#include <iostream>
int main(int argc, char** argv) {
uint64_t count = 0;
for (int i = 0; i <= 32; ++i) {
unsigned comb = (1ULL << i) - 1;
unsigned start = comb;
do {
comb = next_combination_x(comb);
++count;
} while (comb != start);
}
std::cout << "Found " << count << " combinations; expected " << (1ULL << 32) << '\n';
return 0;
}
The result:
1. Bit-hack with division: 43.6 seconds
2. Bit-hack with shifting: 15.5 seconds
3. Shifting algorithm: 19.0 seconds