Here is the code:
unsigned int v; // word value to compute the parity of
v ^= v >> 16;
v ^= v >> 8;
v ^= v >> 4;
v &= 0xf;
return (0x6996 >> v) & 1;
It computes the parity of a given word, v. What is the meaning of 0x6996?
The number 0x6996 in binary is 0110 1001 1001 0110 (bit 0 is the rightmost digit).
The first four lines convert v to a 4-bit number (0 to 15) that has the same parity as the original. The 16-bit number 0x6996 contains the parity of all the numbers from 0 to 15, and the right-shift is used to select the correct bit. It is similar to using a lookup table:
//This array contains the parity of the numbers 0 to 15
char parities[16] = {0,1,1,0,1,0,0,1,1,0,0,1,0,1,1,0};
return parities[v];
Note that the array entries are the same as the bits of 0x6996. Using (0x6996 >> v) & 1 gives the same result, but doesn't require the memory access.
The algorithm compresses the 32-bit int into a 4-bit value of the same parity by successive bitwise XORs, then ANDs with 0xf so that only the least-significant 4 bits can be set. In other words, after line 5, v will be an int between 0 and 15 inclusive.
It then shifts that magic number (0x6996) to the right by this 0-15 value and returns only the least significant bit (& 1).
That means that if there is a 1 in the v bit position of 0x6996 then the computed parity bit is 1, otherwise it's 0. For example, if in line 5 v is calculated as 2 then 1 is returned; if it was 3 then 0 would be returned.
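To check the explanation above, here is a minimal sketch that wraps the snippet in a function and compares it with a bit-by-bit reference. The function names parity32 and parity32_naive are made up for this illustration, and a 32-bit word is assumed.

```c
#include <stdint.h>

/* Parity of a 32-bit word: fold with XOR down to 4 bits,
   then use 0x6996 as a 16-entry table of 1-bit parities. */
unsigned parity32(uint32_t v)
{
    v ^= v >> 16;   /* fold the upper 16 bits into the lower 16 */
    v ^= v >> 8;    /* fold 16 bits into 8 */
    v ^= v >> 4;    /* fold 8 bits into 4 */
    v &= 0xf;       /* keep the 4-bit value, 0 to 15 */
    return (0x6996u >> v) & 1u;   /* bit v of 0x6996 is the parity of v */
}

/* Reference version: XOR the bits together one at a time. */
unsigned parity32_naive(uint32_t v)
{
    unsigned p = 0;
    while (v) {
        p ^= v & 1u;
        v >>= 1;
    }
    return p;
}
```

Both functions should agree for every input; the folded version just gets there in a fixed number of operations.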
Problem
Suppose I have a bit mask mask and an input n, such as
mask = 0x10f3 (0001 0000 1111 0011)
n = 0xda4d (1101 1010 0100 1101)
I want to 1) isolate the masked bits (remove bits from n not in mask)
masked_n = 0x10f3 & 0xda4d = 0x1041 (0001 0000 0100 0001)
and 2) "flatten" them (get rid of the zero bits in mask and apply those same shifts to masked_n).
flattened_mask = 0x007f (0000 0000 0111 1111)
bits to discard (___1 ____ 0100 __01)
first shift ( __ _1__ __01 0001)
second shift ( __ _101 0001)
result = 0x0051 (0000 0000 0101 0001)
Tried solutions
a) For this case, one could craft an ad hoc series of bit shifts:
result = (n & 0b11) | (n & 0b11110000) >> 2 | (n & 0b1000000000000) >> 6
b) More generically, one could also iterate over each bit of mask and calculate result one bit at a time.
for (auto i = 0, pos = 0; i < 16; i++) {
if (mask & (1<<i)) {
if (n & (1<<i)) {
result |= (1<<pos);
}
pos++;
}
}
Question
Is there a more efficient way of doing this generically, or at the very least, ad hoc but with a fixed number of operations regardless of bit placement?
A more efficient generic approach would be to loop over the bits but only process the number of bits that are in the mask, removing the if (mask & (1<<i)) test from your loop and looping only 7 times instead of 16 for your example mask. In each iteration of the loop find the rightmost bit of the mask, test it with n, set the corresponding bit in the result and then remove it from the mask.
int mask = 0x10f3;
int n = 0xda4d;
int result = 0;
int m = mask, pos = 1;
while(m != 0)
{
// find rightmost bit in m:
int bit = m & -m;
if (n & bit)
result |= pos;
pos <<= 1;
m &= ~bit; // remove the rightmost bit from m
}
printf("%04x %04x %04x\n", mask, n, result);
Output:
10f3 da4d 0051
Or, perhaps less readably but without the bit temp variable:
if (n & -m & m)
result |= pos;
pos <<= 1;
m &= m-1;
How does it work? First, consider why m &= m-1 clears the rightmost (least significant) set bit. Your (non-zero) mask m is going to be made up of a certain number of bits, then a 1 in the least significant set place, then zero or more 0s:
e.g:
xxxxxxxxxxxx1000
Subtracting 1 gives:
xxxxxxxxxxxx0111
So all the bits higher than the least significant set bit will be unchanged (so ANDing them together leaves them unchanged), the least significant set bit changes from a 1 to a 0, and the less significant bits were all 0s beforehand so ANDing them with all 1s leaves them unchanged. Net result: least significant set bit is cleared and the rest of the word stays the same.
To understand why m & -m gives the least significant set bit, combine the above with the knowledge that in 2s complement, -x = ~(x-1)
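For reference, the two identities and the loop can be packaged as follows. This is only a sketch: the helper names (lowest_set_bit, clear_lowest_set_bit, compress_bits) are invented here, and unsigned 32-bit types are assumed so that the negation is well defined.

```c
#include <stdint.h>

/* Isolate the least significant set bit of m (0 if m is 0).
   0u - m is the unsigned equivalent of -m. */
uint32_t lowest_set_bit(uint32_t m)
{
    return m & (0u - m);
}

/* Clear the least significant set bit of m. */
uint32_t clear_lowest_set_bit(uint32_t m)
{
    return m & (m - 1u);
}

/* Gather the bits of n selected by mask into the low bits of the result.
   The loop runs once per set bit in mask. */
uint32_t compress_bits(uint32_t n, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t pos = 1;   /* next output bit to fill */
    for (uint32_t m = mask; m != 0; m = clear_lowest_set_bit(m)) {
        if (n & lowest_set_bit(m))
            result |= pos;
        pos <<= 1;
    }
    return result;
}
```

With the values from the question, compress_bits(0xda4d, 0x10f3) gives 0x51.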
This question comes from Programming Pearls, Question 2, and I am having trouble understanding its solution.
Here is the solution written in the book.
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT]&=~(1<<(i & MASK)); }
int test(int i) { return a[i>>SHIFT]&(1<<(i & MASK)); }
I have run this in my compiler and I have looked at another question that talks about this problem, but I still don't understand how this solution works.
Why does it do a[i>>SHIFT]? Why can't it just be a[i]=1? Why does i need to be shifted right by 5?
32 is 2^5, so a right-shift of 5 bits is equivalent to dividing by 32. So by doing a[i>>5], you are dividing i by 32 to figure out which element of the array contains bit i -- there are 32 bits per element.
Meanwhile & MASK is equivalent to mod 32, so 1<<(i & MASK) builds a 1-bit mask for the particular bit within the word.
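A small sketch of those two equivalences, assuming a non-negative index held in an unsigned type. The names word_index and bit_offset are just for illustration and do not appear in the book's code.

```c
#include <stdint.h>

/* Which array element holds bit i: same as i / 32. */
uint32_t word_index(uint32_t i)
{
    return i >> 5;
}

/* Which bit inside that element: same as i % 32. */
uint32_t bit_offset(uint32_t i)
{
    return i & 0x1F;
}
```

For example, i = 37 lands in element 1 at bit 5, since 37 = 1 * 32 + 5.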
Divide the 32 bits of int i (starting from bit 0 to bit 31) into two parts.
First part is the most significant bits 31 to 5. Use this part to find the index in the array of ints (called a[] here) that you are using to implement the bit array. Initially, the entire array of ints is zeroed out.
Since every int in a[] is 32 bits, it can keep track of 32 ints with those 32 bits. We divide every input i with 32 to find the int in a[] that is supposed to keep track of this i.
Every time a number is divided by 2, it is effectively right shifted once. To divide a number by 32, you simply right shift it 5 times. And that is exactly what we get by filtering out the first part.
Second part is the least significant bits 0 to 4. After a number has been binned into the correct index, use this part to set the specific bit of the zero stored in a[] at this index. Obviously, if some bit of the zero at this index has already been set, the value at that index will not be zero anymore.
How to get the first part? Right shifting i by 5 (i.e. i >> SHIFT).
How to get the second part? Do a bitwise AND of i with binary 11111. Binary 11111 is 0x1F, defined as MASK. So, i & MASK will give the integer value represented by the last 5 bits of i.
The last 5 bits tell you how many bits to go inside the number in a[]. For example, if i is 5, you want to set the bit in the index 0 of a[] and you specifically want to set the 5th bit of the int value a[0].
Index to set = 5 / 32 = (0101 >> 5) = 0000 = 0.
Bit to set = 5th bit inside a[0]
= a[0] & (1 << 5)
= a[0] & (1 << (00101 & 11111)).
Setting the bit for given i
Get the int to set by a[i >> 5]
Get the bit to set by pushing a 1 a total of i % 32 times to the left i.e. 1 << (i & 0x1F)
Simply set the bit as a[i >> 5] = a[i >> 5] | (1 << (i & 0x1F));
That can be shortened to a[i >> 5] |= (1 << (i & 0x1F));
Getting/Testing the bit for given i
Get the int where the desired bit lies by a[i >> 5]
Generate a number where all bits except for the i & 0x1F bit are 0. That is simply 1 << (i & 0x1F); no inversion is needed for testing.
AND the number generated above with the value stored at this index in a[]. If the value is 0, this particular bit was 0. If the value is non-zero, this bit was 1.
In code you would simply write return (a[i >> 5] & (1 << (i & 0x1F))) != 0; (the outer parentheses matter, because != binds more tightly than &).
Clearing the bit for given i: It means setting the bit for that i to 0.
Get the int where the bit lies by a[i >> 5]
Get the bit by 1 << (i & 0x1F)
Invert all the bits of 1 << (i & 0x1F) so that the i's bit is 0.
AND the number at this index and the number generated in step 3. That will clear i's bit, leaving all other bits intact.
In code, this would be: a[i >> 5] &= ~(1 << (i & 0x1F));
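Putting the three operations together, here is a hedged variant of the book's code. It differs from the original in two ways that are my own assumptions: the array uses uint32_t instead of int so that shifting a 1 into bit 31 is well defined, and the functions are renamed set_bit, clr_bit and test_bit. test_bit also normalizes its result to 0 or 1.

```c
#include <stdint.h>

#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000

/* One bit per possible value of i; static storage starts out all zero. */
static uint32_t a[1 + N / BITSPERWORD];

/* Set bit i: OR a single 1 into the right word. */
void set_bit(uint32_t i)
{
    a[i >> SHIFT] |= (UINT32_C(1) << (i & MASK));
}

/* Clear bit i: AND with a word that is all 1s except at that position. */
void clr_bit(uint32_t i)
{
    a[i >> SHIFT] &= ~(UINT32_C(1) << (i & MASK));
}

/* Test bit i: returns 1 if set, 0 otherwise. */
int test_bit(uint32_t i)
{
    return (int)((a[i >> SHIFT] >> (i & MASK)) & 1u);
}
```

Calling set_bit(37) touches only a[1], and a following test_bit(37) returns 1 while test_bit(36) still returns 0.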
I have a variable mask of type std::bitset<8> as
std::string bit_string = "00101100";
std::bitset<8> mask(bit_string);
Is there an efficient way to quickly mask out the corresponding (three) bits of another given std::bitset<8> input and move all those masked out bits to the rightmost? E.g., if input is 10100101, then I would like to quickly get 00000101 which equals 5 in decimal. Then I can vect[5] to quickly index the 6th element of vect which is std::vector<int> of size 8.
Or rather, can I quickly get the decimal value of the masked out bits (with their relative positions retained)? Or I can't?
I guess in my case the advantage that can be taken is the bitset<8> mask I have. And I'm supposed to manipulate it somehow to do the work fast.
I see it like this (added by Spektre):
mask 00101100b
input 10100101b
---------------
& ??1?01??b
>> 101b
5
First things first: you can't avoid O(n) complexity with n being the number of mask bits if your mask is available as binary. However, if your mask is constant for multiple inputs, you can preprocess the mask into a series of m mask&shift transformations where m is less or equal to your number of value 1 mask bits. If you know the mask at compile time, you can even preconstruct the transformations and then you get your O(m).
To apply this idea, you need to create a sub-mask for each group of 1 bits in your mask and combine it with a shift information. The shift information is constructed by counting the number of zeroes to the right of the current group.
Example:
mask = 00101100b
// first group of ones
submask1 = 00001100b
// number of zeroes to the right of the group
subshift1 = 2
submask2 = 00100000b
subshift2 = 3
// Apply:
input = 10100101b
transformed = (input & submask1) >> subshift1 // = 00000001b
transformed = (input & submask2) >> subshift2 // = 00000100b
+ transformed // = 00000101b
If you make the sub-transforms into an array, you can easily apply them in a loop.
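One way the preprocessing and the loop might look for an 8-bit mask is sketched below. The struct SubTransform and the function names are hypothetical, chosen only for this example; an 8-bit mask can have at most 4 separate groups of 1 bits, which is why the arrays have 4 entries.

```c
#include <stdint.h>

/* One mask-and-shift step: a group of adjacent 1 bits and how far to move it. */
typedef struct {
    uint8_t submask;
    uint8_t shift;
} SubTransform;

/* Split mask into groups of adjacent 1 bits. The shift for each group is the
   number of 0 bits in mask to the right of it. Returns the number of groups. */
int build_transforms(uint8_t mask, SubTransform out[4])
{
    int count = 0;
    unsigned zeros = 0;
    unsigned bit = 0;
    while (bit < 8) {
        if (((mask >> bit) & 1u) == 0) {
            zeros++;
            bit++;
            continue;
        }
        unsigned sub = 0;
        while (bit < 8 && ((mask >> bit) & 1u)) {
            sub |= 1u << bit;
            bit++;
        }
        out[count].submask = (uint8_t)sub;
        out[count].shift = (uint8_t)zeros;
        count++;
    }
    return count;
}

/* Apply the precomputed steps to one input. */
uint8_t apply_transforms(uint8_t input, const SubTransform *t, int count)
{
    unsigned result = 0;
    for (int k = 0; k < count; k++)
        result |= (unsigned)(input & t[k].submask) >> t[k].shift;
    return (uint8_t)result;
}

/* Convenience wrapper that does both steps; in real use you would build once
   and apply many times. */
uint8_t extract_with_transforms(uint8_t input, uint8_t mask)
{
    SubTransform t[4];
    int count = build_transforms(mask, t);
    return apply_transforms(input, t, count);
}
```

For mask 00101100b this produces the two steps from the example (0x0C with shift 2, 0x20 with shift 3), and input 10100101b maps to 5.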
Your domain is small enough that you can brute-force this. Trivially, an unsigned char LUT[256][256] can store all possible outcomes in just 64 KB.
I understand that the mask has at most 3 bits set, so the highest possible mask is 11100000 (224) and you can restrict the lookup table in that dimension to 225 entries (indices 0 to 224). And since f(input, mask) == f(input&mask, mask), the input index never exceeds the mask, so you can in fact reduce the LUT to unsigned char[225][225].
A further size reduction is possible by testing the lowest bit of the mask. When mask is even, f(input, mask) == f((input&mask)/2, mask/2), so you can halve both until the mask is odd. The highest odd mask is only 11000001, or 193. This reduces your LUT further, to [194][194].
A more space-efficient algorithm splits input and mask into 2 nibbles (4 bits). You now have a very simple LUT[16][16] in which you look up the high and low parts:
int himask = mask >> 4, lomask = mask & 0xF;
int hiinp = input >> 4, loinp = input & 0xF;
unsigned char hiout = LUT[himask][hiinp];
unsigned char loout = LUT[lomask][loinp];
return hiout << bitsIn[lomask] | loout;
This shows that you need another table, char bitsIn[16], holding the number of set bits in each possible nibble.
Taking the example :
mask 0010 1100b
input 1010 0101b
himask = 0010
hiinp = 1010
hiout = 0001
lomask = 1100
loinp = 0101
loout = 0001
bitsIn[lomask 1100] = 2
return (0001 << 2) | (0001)
Note that this generalizes fairly easily to more than 8 bits:
int bitsSoFar = 0;
int retval = 0;
while(mask) { // Until we've looked up all bits.
int mask4 = mask & 0xF;
int input4 = input & 0xF;
retval |= LUT[mask4][input4] << bitsSoFar;
bitsSoFar += bitsIn[mask4];
mask >>= 4;
input >>= 4;
}
Since this LUT only holds nibbles, you could pack two entries per byte and reduce it to 16*16/2 bytes, but I suspect that's not worth the effort.
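The answer leaves the table contents implicit, so here is one possible way to fill LUT and bitsIn and then run the generalized loop. Treat it as a sketch: init_tables, extract_lut and the lazy-initialization flag are my own additions, and 32-bit inputs are assumed.

```c
#include <stdint.h>

static uint8_t LUT[16][16];   /* LUT[mask4][input4] = selected bits, packed low */
static uint8_t bitsIn[16];    /* number of set bits in each 4-bit mask */

/* Fill both tables by brute force over all nibble pairs. */
void init_tables(void)
{
    for (unsigned m = 0; m < 16; m++) {
        unsigned count = 0;
        for (unsigned b = 0; b < 4; b++)
            count += (m >> b) & 1u;
        bitsIn[m] = (uint8_t)count;

        for (unsigned in = 0; in < 16; in++) {
            unsigned out = 0, pos = 0;
            for (unsigned b = 0; b < 4; b++) {
                if (m & (1u << b)) {
                    if (in & (1u << b))
                        out |= 1u << pos;
                    pos++;
                }
            }
            LUT[m][in] = (uint8_t)out;
        }
    }
}

/* Process mask and input one nibble at a time, low nibble first. */
uint32_t extract_lut(uint32_t input, uint32_t mask)
{
    static int ready = 0;
    if (!ready) {               /* build the tables on first use */
        init_tables();
        ready = 1;
    }
    uint32_t retval = 0;
    unsigned bitsSoFar = 0;
    while (mask) {
        unsigned mask4 = mask & 0xF;
        unsigned input4 = input & 0xF;
        retval |= (uint32_t)LUT[mask4][input4] << bitsSoFar;
        bitsSoFar += bitsIn[mask4];
        mask >>= 4;
        input >>= 4;
    }
    return retval;
}
```

With mask 00101100b and input 10100101b this returns 5, matching the worked example.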
I see it like this:
mask 00101100b
input 10100101b
---------------
& ??1?01??b
>> 101b
5
I would create a bit weight table for the mask by scanning its bits from the LSB, assigning weights 1,2,4,8,16... to the set bits in order and leaving zero for the rest, so:
MSB LSB
--------------------------
mask 0 0 1 0 1 1 0 0 bin
--------------------------
weight 0 0 4 0 2 1 0 0 dec (A)
input 1 0 1 0 0 1 0 1 bin (B)
--------------------------
(A.B) 0*1+0*0+4*1+0*0+2*0+1*1+0*0+0*1 // this is dot product ...
4 + 1
--------------------------
5 dec
--------------------------
Sorry I do not code in Python at all so no code ... I still think using integral types for this directly would be better but that is probably just my low level C++ thinking ...
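For what it is worth, the weight-table idea could be sketched in C roughly like this. The names build_weights and dot_extract are invented for the example, and an 8-bit mask and input are assumed as in the diagram.

```c
#include <stdint.h>

/* weight[b] is 0 where mask bit b is clear, and 1, 2, 4, ... in order
   for the set bits of mask, scanning from the LSB. */
void build_weights(uint8_t mask, unsigned weight[8])
{
    unsigned next = 1;
    for (int b = 0; b < 8; b++) {
        if (mask & (1u << b)) {
            weight[b] = next;
            next <<= 1;
        } else {
            weight[b] = 0;
        }
    }
}

/* Dot product of the weight table with the bits of input. */
uint8_t dot_extract(uint8_t input, uint8_t mask)
{
    unsigned weight[8];
    unsigned sum = 0;
    build_weights(mask, weight);
    for (int b = 0; b < 8; b++)
        sum += weight[b] * ((input >> b) & 1u);
    return (uint8_t)sum;
}
```

If the mask stays fixed, the weight table only has to be built once and the dot product is all that remains per input.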
I've got an interesting problem that has me looking for a more efficient way of doing things.
Let's say we have a value (in binary)
(VALUE) 10110001
(MASK) 00110010
----------------
(AND) 00110000
Now, I need to be able to XOR any bits from the (AND) value that are set in the (MASK) value (always lowest to highest bit):
(RESULT) AND1(0) xor AND4(1) xor AND5(1) = 0
Now, on paper, this is certainly quick since I can see which bits are set in the mask. It seems to me that programmatically I would need to keep right shifting the MASK until I found a set bit, XOR it with a separate value, and loop until the entire byte is complete.
Can anyone think of a faster way? I'm looking for the way to do this with the least number of operations and stored values.
If I understood this question correctly, what you want is to get every bit from VALUE that is set in the MASK, and compute the XOR of those bits.
First of all, note that XOR'ing a value with 0 will not change the result. So, to ignore some bits, we can treat them as zeros.
So, XORing the bits set in VALUE that are in MASK is equivalent to XORing the bits in VALUE&MASK.
Now note that the result is 0 if the number of set bits is even, 1 if it is odd.
That means we want to count the number of set bits. Some architectures/compilers have ways to quickly compute this value. For instance, on GCC this can be obtained with __builtin_popcount.
So on GCC, this can be computed with:
int set_bits = __builtin_popcount(value & mask);
return set_bits % 2;
If you want the code to be portable, then this won't do. However, a comment in this answer suggests that some compilers can inline std::bitset::count to efficiently obtain the same result.
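As a rough sketch of how the two cases could be combined, the function below uses GCC's __builtin_parity (which returns the popcount modulo 2) when it is available and falls back to a plain loop otherwise. The function name masked_parity is made up, and unsigned int is assumed to be 32 bits wide.

```c
#include <stdint.h>

/* XOR of all bits of value that are selected by mask. */
unsigned masked_parity(uint32_t value, uint32_t mask)
{
#if defined(__GNUC__) || defined(__clang__)
    /* GCC and Clang builtin: 1 if an odd number of bits is set. */
    return (unsigned)__builtin_parity(value & mask);
#else
    /* Portable fallback: flip the result once per set bit. */
    uint32_t x = value & mask;
    unsigned p = 0;
    while (x) {
        p ^= 1u;
        x &= x - 1;
    }
    return p;
#endif
}
```

With the values from the question (VALUE 10110001, MASK 00110010) the AND has two set bits, so the result is 0.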
If I'm understanding you right, you have
result = value & mask
and you want to XOR the 1 bits of mask & result together. The XOR of a series of bits is the same as counting the number of bits and checking if that count is even or odd. If it's odd, the XOR would be 1; if even, XOR would give 0.
count_bits(mask & result) % 2 != 0
mask & result can be simplified to simply result. You don't need to AND it with mask again. The % 2 != 0 can be alternately written as & 1.
count_bits(result) & 1
As far as how to count bits, the Bit Twiddling Hacks web page gives a number of bit counting algorithms.
Counting bits set, Brian Kernighan's way
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
v &= v - 1; // clear the least significant bit set
}
Brian Kernighan's method goes through as many iterations as there are
set bits. So if we have a 32-bit word with only the high bit set, then
it will only go once through the loop.
If you were to use that implementation, you could optimize it a bit further. If you think about it, you don't need the full count of bits. You only need to track their parity. Instead of counting bits you could just flip c each iteration.
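unsigned bit_parity(unsigned v) {
    unsigned c;
    // flip c once for every set bit cleared from v
    for (c = 0; v; c ^= 1) {
        v &= v - 1;
    }
    return c; // 1 if the number of set bits was odd, 0 if even
}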
(Thanks to Slava for the suggestion.)
Using that the XOR with 0 doesn't change anything, it's OK to apply the mask and then unconditionally XOR all bits together, which can be done in a parallel-prefix way. So something like this (not tested):
x = m & v;
x ^= x >> 16;
x ^= x >> 8;
x ^= x >> 4;
x ^= x >> 2;
x ^= x >> 1;
result = x & 1;
You can use more (or fewer) steps as needed, this is for 32 bits.
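Since the snippet is marked as untested, here is a small sketch that wraps it in a function together with a slow reference to compare against. The names masked_parity_fold and masked_parity_slow are my own, and 32-bit operands are assumed.

```c
#include <stdint.h>

/* Parallel-prefix version: each step XORs the upper half of the
   remaining bits onto the lower half. */
unsigned masked_parity_fold(uint32_t v, uint32_t m)
{
    uint32_t x = m & v;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* Reference: walk the mask bit by bit and XOR the selected bits. */
unsigned masked_parity_slow(uint32_t v, uint32_t m)
{
    unsigned p = 0;
    for (int b = 0; b < 32; b++) {
        if ((m >> b) & 1u)
            p ^= (v >> b) & 1u;
    }
    return p;
}
```

For an 8-bit value such as the one in the question, the first two steps are harmless no-ops because the upper bits are already zero.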
One significant issue to be aware of if using v &= v - 1 in the main body of your code is it will change the value of v to 0 in conducting the count. With other methods, the value is changed to the number of 1's. While count logic is generally wrapped as a function, where that is no longer a concern, if you are required to present your counting logic in the main body of your code, you must preserve a copy of v if that value is needed again.
In addition to the other two methods presented, the following is another favorite from bit-twiddling hacks that generally has a bit better performance than the loop method for larger numbers:
/* get the population 1's in the binary representation of a number */
unsigned getn1s (unsigned int v)
{
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
v = (v + (v >> 4)) & 0x0F0F0F0F;
v = v + (v << 8);
v = v + (v << 16);
return v >> 24;
}
I have an array of size 32. Each element in the array is a 0 or 1. I want to be able to store them into the bit positions of a 32-bit integer, and perform bit-wise operations on it. How can I do this?
Also, if I have two arrays of size 32, and I want to do bitwise operations on the elements with the same index all at once, could I do this ?
op_and[31:0] = ip_1[31:0] & ip_2 [31:0];
I am using the gcc compiler.
You can use the or operator | and bitshifting ( << and >> ).
uint32_t myInt = 0;
for( int i=0; i < 32; i++ )
{
    // cast first so that shifting a 1 into bit 31 is well defined
    myInt |= ( (uint32_t)arrayOf32Ints[i] << i );
}
This example assumes that the values of arrayOf32Ints are either 0 or 1 as per your question.
If they may contain any nonzero value for true, normalize each element to 0 or 1 first. In C, !!x is guaranteed to yield exactly 0 or 1; the conditional operator does the same job more explicitly.
The line would then be
myInt |= ( (uint32_t)(arrayOf32Ints[i] ? 1 : 0) << i );
In the case you want to set individual bits on or off, you can do:
myInt |= (1u<<3); // Sets bit 3 by shifting 1 three bits up (1 becomes 8) and ORing it into myInt.
myInt |= 8; // Same thing: sets bit 3 by ORing 8 (the binary form of 8 is 1000) into myInt.
myInt ^= (1u<<5); // Toggles bit 5 by XORing (XOR flips exactly the bits that are 1 in the mask). To turn bit 5 OFF regardless of its current value, use myInt &= ~(1u<<5);
myInt ^= 32; // Same toggle of bit 5, written as a constant (32 is 100000 in binary)
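To cover the second part of the question (operating on two arrays at once), a minimal sketch could pack each array into a uint32_t, apply the bitwise operator once, and unpack if needed. The function names pack_bits, unpack_bits and and_arrays are hypothetical; index 0 of the array is assumed to map to bit 0.

```c
#include <stdint.h>

/* Pack 32 array elements (zero or nonzero) into one word, element i -> bit i. */
uint32_t pack_bits(const int bits[32])
{
    uint32_t word = 0;
    for (int i = 0; i < 32; i++) {
        if (bits[i])
            word |= UINT32_C(1) << i;
    }
    return word;
}

/* Expand a word back into 32 array elements, each 0 or 1. */
void unpack_bits(uint32_t word, int bits[32])
{
    for (int i = 0; i < 32; i++)
        bits[i] = (int)((word >> i) & 1u);
}

/* AND of two arrays, all 32 positions handled by a single & operation. */
uint32_t and_arrays(const int ip_1[32], const int ip_2[32])
{
    return pack_bits(ip_1) & pack_bits(ip_2);
}
```

The same pattern works for | and ^; only the single operator inside and_arrays changes.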