Fastest Way to XOR all bits from value based on bitmask? - c++

I've got an interesting problem that has me looking for a more efficient way of doing things.
Let's say we have a value (in binary)
(VALUE) 10110001
(MASK) 00110010
----------------
(AND) 00110000
Now, I need to be able to XOR any bits from the (AND) value that are set in the (MASK) value (always lowest to highest bit):
(RESULT) AND1(0) xor AND4(1) xor AND5(1) = 0
Now, on paper, this is certainly quick since I can see which bits are set in the mask. It seems to me that programmatically I would need to keep right shifting the MASK until I found a set bit, XOR it with a separate value, and loop until the entire byte is complete.
Can anyone think of a faster way? I'm looking for the way to do this with the least number of operations and stored values.

If I understood this question correctly, what you want is to get every bit from VALUE that is set in the MASK, and compute the XOR of those bits.
First of all, note that XOR'ing a value with 0 will not change the result. So, to ignore some bits, we can treat them as zeros.
So, XORing the bits set in VALUE that are in MASK is equivalent to XORing the bits in VALUE&MASK.
Now note that the result is 0 if the number of set bits is even, 1 if it is odd.
That means we want to count the number of set bits. Some architectures/compilers have ways to quickly compute this value. For instance, on GCC this can be obtained with __builtin_popcount.
So on GCC, this can be computed with:
int set_bits = __builtin_popcount(value & mask);
return set_bits % 2;
If you want the code to be portable, then this won't do. However, a comment in this answer suggests that some compilers can inline std::bitset::count to efficiently obtain the same result.
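For a portable version, here is a minimal sketch using std::bitset::count, along the lines of the comment mentioned above. The function name masked_parity and the fixed 32-bit width are assumptions made for illustration. Whether count() compiles down to a single population-count instruction depends on the compiler and target.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper: XOR of all bits of `value` that are selected by `mask`.
// Returns 1 if an odd number of selected bits are set, 0 otherwise.
unsigned masked_parity(std::uint32_t value, std::uint32_t mask) {
    std::bitset<32> selected(value & mask);                // keep only the masked bits
    return static_cast<unsigned>(selected.count() & 1u);   // odd count -> 1
}
```

With the values from the question, masked_parity(0xB1, 0x32) gives 0, since two of the selected bits are set.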

If I'm understanding you right, you have
result = value & mask
and you want to XOR the 1 bits of mask & result together. The XOR of a series of bits is the same as counting the number of bits and checking if that count is even or odd. If it's odd, the XOR would be 1; if even, XOR would give 0.
count_bits(mask & result) % 2 != 0
mask & result can be simplified to simply result. You don't need to AND it with mask again. The % 2 != 0 can be alternately written as & 1.
count_bits(result) & 1
As far as how to count bits, the Bit Twiddling Hacks web page gives a number of bit counting algorithms.
Counting bits set, Brian Kernighan's way
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
v &= v - 1; // clear the least significant bit set
}
Brian Kernighan's method goes through as many iterations as there are
set bits. So if we have a 32-bit word with only the high bit set, then
it will only go once through the loop.
If you were to use that implementation, you could optimize it a bit further. If you think about it, you don't need the full count of bits. You only need to track their parity. Instead of counting bits you could just flip c each iteration.
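unsigned bit_parity(unsigned v) {
    unsigned c;
    for (c = 0; v; c ^= 1) {
        v &= v - 1; // clear the least significant set bit
    }
    return c; // 1 if an odd number of bits were set, 0 otherwise
}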
(Thanks to Slava for the suggestion.)

Since XOR with 0 doesn't change anything, it's OK to apply the mask and then unconditionally XOR all bits together, which can be done in a parallel-prefix way. So something like this (not tested):
x = m & v;
x ^= x >> 16;
x ^= x >> 8;
x ^= x >> 4;
x ^= x >> 2;
x ^= x >> 1;
result = x & 1;
You can use more (or fewer) steps as needed; this version is for 32 bits.
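To illustrate the "more or fewer steps" remark, here is a sketch of the same folding idea for an 8-bit value (as in the question) and for a 64-bit value. The function names are hypothetical; only the number of shift steps differs from the 32-bit version above.

```cpp
#include <cstdint>

// 8-bit variant: three folding steps are enough.
unsigned masked_parity8(std::uint8_t v, std::uint8_t m) {
    unsigned x = v & m;   // at most 8 significant bits
    x ^= x >> 4;          // fold the high nibble into the low nibble
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;        // bit 0 now holds the XOR of all selected bits
}

// 64-bit variant: one extra step in front of the 32-bit sequence.
unsigned masked_parity64(std::uint64_t v, std::uint64_t m) {
    std::uint64_t x = v & m;
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return static_cast<unsigned>(x & 1u);
}
```

For the question's example, masked_parity8(0xB1, 0x32) evaluates to 0.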

One significant issue to be aware of if using v &= v - 1 in the main body of your code is it will change the value of v to 0 in conducting the count. With other methods, the value is changed to the number of 1's. While count logic is generally wrapped as a function, where that is no longer a concern, if you are required to present your counting logic in the main body of your code, you must preserve a copy of v if that value is needed again.
In addition to the other two methods presented, the following is another favorite from bit-twiddling hacks that generally has a bit better performance than the loop method for larger numbers:
/* get the population 1's in the binary representation of a number */
unsigned getn1s (unsigned int v)
{
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
v = (v + (v >> 4)) & 0x0F0F0F0F;
v = v + (v << 8);
v = v + (v << 16);
return v >> 24;
}

Related

Getting exponent value using bit shifts (C, C++) [duplicate]

Note - This is NOT a duplicate of this question - Count the consecutive zero bits (trailing) on the right in parallel: an explanation? . The linked question has a different context; it only asks about the purpose of signed() being used. DO NOT mark this question as duplicate.
I've been looking for a way to get the number of trailing zeros in a number. I found a Stanford University bit-twiddling write-up that gives the following code.
unsigned int v; // 32-bit word input to count zero bits on right
unsigned int c = 32; // c will be the number of zero bits on the right
v &= -signed(v);
if (v) c--;
if (v & 0x0000FFFF) c -= 16;
if (v & 0x00FF00FF) c -= 8;
if (v & 0x0F0F0F0F) c -= 4;
if (v & 0x33333333) c -= 2;
if (v & 0x55555555) c -= 1;
Why does this end up working? I understand how hex numbers are represented in binary and how the bitwise operators work, but I am unable to figure out the intuition behind this. What is the working mechanism?
The code is broken (undefined behavior is present). Here is a fixed version which is also slightly easier to understand (and probably faster):
uint32_t v; // 32-bit word input to count zero bits on right
unsigned c; // c will be the number of zero bits on the right
if (v) {
v &= -v; // keep rightmost set bit (the one that determines the answer) clear all others
c = 0;
if (v & 0xAAAAAAAAu) c |= 1; // binary 10..1010
if (v & 0xCCCCCCCCu) c |= 2; // binary 1100..11001100
if (v & 0xF0F0F0F0u) c |= 4;
if (v & 0xFF00FF00u) c |= 8;
if (v & 0xFFFF0000u) c |= 16;
}
else c = 32;
Once we know only one bit is set, we determine one bit of the result at a time, by simultaneously testing all bits where the result is odd, then all bits where the result has the 2's-place set, etc.
The original code worked in reverse, starting with all bits of the result set (after the if (v) c--;) and then determining which needed to be zero and clearing them.
Since we are learning one bit of the output at a time, I think it's more clear to build the output using bit operations not arithmetic.
This code (from the net) is mostly C, although v &= -signed(v); isn't correct C. The intent is for it to behave as v &= ~v + 1;
First, if v is zero, then it remains zero after the & operation, and all of the if statements are skipped, so you get 32.
Otherwise, the & operation (when corrected) clears all bits to the left of the rightmost 1, so at that point v contains a single 1 bit. Then c is decremented to 31, i.e. all 1 bits within the possible result range.
The if statements then determine its numeric position one bit at a time (one bit of the position number, not of v), clearing the bits that should be 0.
The code first transforms v in such a way that it is entirely zero, except for the rightmost one, which remains. Then, it determines the position of this one.
First let's see how we suppress all ones but the rightmost one.
Assume that k is the position of the rightmost one in v. v=(vn-1,vn-2,..vk+1,1,0,..0).
-v is the number that, added to v, gives 0 (actually it gives 2^n, but bit 2^n is ignored if we only keep the n least significant bits).
What must the bits of -v be so that v+-v=0?
Obviously, bits k-1..0 of -v must be 0 so that, added to the trailing zeros in v, they give zero.
Bit k must be 1. Added to the one in vk, it gives a zero and a carry of one into order k+1.
Bit k+1 of -v is added to vk+1 and to the carry generated at step k. It must be the logical complement of vk+1. So whatever the value of vk+1, we have 1+0+1 if vk+1=0 (or 1+1+0 if vk+1=1), and the result is 0 at order k+1 with a carry generated into order k+2.
The same holds for bits n-1..k+2: they must all be the logical complement of the corresponding bit in v.
Hence, we get the well-known result that to get -v, one must
leave unchanged all trailing zeros of v
leave unchanged the rightmost one of v
complement all the other bits.
If we compute v&-v, we have
v      vn-1  vn-2  ... vk+1  1 0 0 ... 0
-v     ~vn-1 ~vn-2 ... ~vk+1 1 0 0 ... 0
v&-v   0     0     ... 0     1 0 0 ... 0
So v&-v only keeps the rightmost one in v.
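The result of this derivation can be checked with a short sketch. The helper name lowest_set_bit is hypothetical. It writes the negation as ~v + 1 on an unsigned value, which is well defined, so no conversion to a signed type is needed.

```cpp
#include <cstdint>

// Isolate the rightmost set bit of v (returns 0 when v is 0).
// ~v + 1 is the two's-complement negation of v: it keeps the trailing zeros
// and the rightmost one, and complements every bit above that one.
std::uint32_t lowest_set_bit(std::uint32_t v) {
    return v & (~v + 1u);
}
```

For example, 20 is 10100 in binary, and lowest_set_bit(20) yields 4 (00100).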
To find the position of this remaining one, look at the code:
if (v) c--; // v == 0: no one at all, so c stays 32 (32 trailing zeros).
// Otherwise the position is in the range c..0 = 31..0.
if (v & 0x0000FFFF) c -= 16; // If the one is in the lower half of v, the range
// of possible positions is 15..0.
// Otherwise the range must be 31..16.
// In both cases the remaining range is c..c-15.
if (v & 0x00FF00FF) c -= 8; // If the one is in byte 0 (c=15) or byte 2 (c=31),
// it is in the lower half of the range,
// so we subtract 8 from the bounds of the range.
// Otherwise it is in the upper half.
// The range of possible positions is now c..c-7.
if (v & 0x0F0F0F0F) c -= 4; // Do the same for the remaining bits.
if (v & 0x33333333) c -= 2;
if (v & 0x55555555) c -= 1;
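The decrementing scheme walked through above can be wrapped in a function so that it is easy to try out. This is a sketch, not the code from the write-up: the name count_trailing_zeros is an assumption, and the signed negation is replaced by ~v + 1 on the unsigned value.

```cpp
#include <cstdint>

// Number of trailing zero bits in v; returns 32 for v == 0.
unsigned count_trailing_zeros(std::uint32_t v) {
    unsigned c = 32;                 // answer if no bit is set
    v &= ~v + 1u;                    // keep only the rightmost set bit
    if (v) c--;                      // position is now somewhere in 31..0
    if (v & 0x0000FFFFu) c -= 16;    // bit is in the lower 16 bits
    if (v & 0x00FF00FFu) c -= 8;     // bit is in the lower byte of its half
    if (v & 0x0F0F0F0Fu) c -= 4;     // bit is in the lower nibble of its byte
    if (v & 0x33333333u) c -= 2;     // bit is in the lower pair of its nibble
    if (v & 0x55555555u) c -= 1;     // bit is the lower bit of its pair
    return c;
}
```

For v = 12 (binary 1100) the isolated bit is 4, and c goes 31, 15, 7, 3, 2, which is the position of that bit.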

Isolating bits and flattening them [duplicate]

This question already has an answer here:
I want to pack the bits based on arbitrary mask
(1 answer)
Closed 5 years ago.
Problem
Suppose I have a bit mask mask and an input n, such as
mask = 0x10f3 (0001 0000 1111 0011)
n = 0xda4d (1101 1010 0100 1101)
I want to 1) isolate the masked bits (remove bits from n not in mask)
masked_n = 0x10f3 & 0xda4d = 0x1041 (0001 0000 0100 0001)
and 2) "flatten" them (get rid of the zero bits in mask and apply those same shifts to masked_n)?
flattened_mask = 0x007f (0000 0000 0111 1111)
bits to discard (___1 ____ 0100 __01)
first shift ( __ _1__ __01 0001)
second shift ( __ _101 0001)
result = 0x0051 (0000 0000 0101 0001)
Tried solutions
a) For this case, one could craft an ad hoc series of bit shifts:
result = (n & 0b11) | (n & 0b11110000) >> 2 | (n & 0b1000000000000) >> 6;
b) More generically, one could also iterate over each bit of mask and calculate result one bit at a time.
for (auto i = 0, pos = 0; i < 16; i++) {
if (mask & (1<<i)) {
if (n & (1<<i)) {
result |= (1<<pos);
}
pos++;
}
}
Question
Is there a more efficient way of doing this generically, or at the very least, ad hoc but with a fixed number of operations regardless of bit placement?
A more efficient generic approach would be to loop over the bits but only process the number of bits that are in the mask, removing the if (mask & (1<<i)) test from your loop and looping only 7 times instead of 16 for your example mask. In each iteration of the loop find the rightmost bit of the mask, test it with n, set the corresponding bit in the result and then remove it from the mask.
int mask = 0x10f3;
int n = 0xda4d;
int result = 0;
int m = mask, pos = 1;
while(m != 0)
{
// find rightmost bit in m:
int bit = m & -m;
if (n & bit)
result |= pos;
pos <<= 1;
m &= ~bit; // remove the rightmost bit from m
}
printf("%04x %04x %04x\n", mask, n, result);
Output:
10f3 da4d 0051
Or, perhaps less readably but without the bit temp variable:
if (n & -m & m)
result |= pos;
pos <<= 1;
m &= m-1;
How does it work? First, consider why m &= m-1 clears the rightmost (least significant) set bit. Your (non-zero) mask m is going to be made up of a certain number of bits, then a 1 in the least significant set place, then zero or more 0s:
e.g:
xxxxxxxxxxxx1000
Subtracting 1 gives:
xxxxxxxxxxxx0111
So all the bits higher than the least significant set bit will be unchanged (so ANDing them together leaves them unchanged), the least significant set bit changes from a 1 to a 0, and the less significant bits were all 0s beforehand so ANDing them with all 1s leaves them unchanged. Net result: least significant set bit is cleared and the rest of the word stays the same.
To understand why m & -m gives the least significant set bit, combine the above with the knowledge that in 2s complement, -x = ~(x-1)
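As a sketch, the loop from this answer can be wrapped into a reusable function. The name compress_bits and the use of 32-bit unsigned types are assumptions for illustration; unsigned arithmetic is used so that the negation is written as ~mask + 1 without relying on signed behaviour.

```cpp
#include <cstdint>

// Gather the bits of n selected by mask into the low bits of the result,
// preserving their order. Runs one iteration per set bit in mask.
std::uint32_t compress_bits(std::uint32_t n, std::uint32_t mask) {
    std::uint32_t result = 0;
    std::uint32_t pos = 1;                           // next output bit to fill
    while (mask != 0) {
        std::uint32_t bit = mask & (~mask + 1u);     // rightmost set bit of mask
        if (n & bit) {
            result |= pos;
        }
        pos <<= 1;
        mask &= mask - 1;                            // clear that bit from mask
    }
    return result;
}
```

With the question's values, compress_bits(0xda4d, 0x10f3) gives 0x0051. On x86 CPUs that support BMI2, the PEXT instruction (available as the _pext_u32 intrinsic) performs this same bit-gathering operation in hardware; the sketch above does not depend on it.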

How to set the highest-valued 1 bit to 0 , prefferably in c++ [duplicate]

This question already has answers here:
What's the best way to toggle the MSB?
(4 answers)
Closed 8 years ago.
If, for example, I have the number 20:
0001 0100
I want to set the highest valued 1 bit, the left-most, to 0.
So
0001 0100
will become
0000 0100
I was wondering which is the most efficient way to achieve this.
Preferably in C++.
I tried subtracting the largest power of two from the original number, like this:
unsigned long long int originalNumber;
unsigned long long int x=originalNumber;
x--;
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x++;
x >>= 1;
originalNumber ^= x;
but I need something more efficient.
The tricky part is finding the most significant bit, or counting the number of leading zeroes. Everything else can be done more or less trivially with left shifting 1 (by one less), subtracting 1 followed by negation (building an inverse mask) and the & operator.
The well-known bit hacks site has several implementations for the problem of finding the most significant bit, but it is also worth looking into compiler intrinsics, as all mainstream compilers have an intrinsic for this purpose, which they implement as efficiently as the target architecture will allow (I tested this a few years ago using GCC on x86, came out as single instruction). Which is fastest is impossible to tell without profiling on your target architecture (fewer lines of code, or fewer assembly instructions are not always faster!), but it is a fair assumption that compilers implement these intrinsics not much worse than you'll be able to implement them, and likely faster.
Using an intrinsic with a somewhat intelligible name may also turn out easier to comprehend than some bit hack when you look at it 5 years from now.
Unluckily, although a not entirely uncommon thing, this is not a standardized function which you'd expect to find in the C or C++ libraries, at least there is no standard function that I'm aware of.
For GCC, you're looking for __builtin_clz, VisualStudio calls it _BitScanReverse, and Intel's compiler calls it _bit_scan_reverse.
Alternatively to counting leading zeroes, you may look into what the same Bit Twiddling site has under "Round up to the next power of two", which you would only need to follow up with a right shift by 1, and a NAND operation. Note that the 5-step implementation given on the site is for 32-bit integers, you would have to double the number of steps for 64-bit wide values.
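Here is a minimal sketch of the bit-smearing alternative for 32-bit values. It is a variation on the "round up to the next power of two" sequence rather than a copy of it: the decrement and increment are dropped, so the smeared value is shifted right once and ANDed with the input directly. The function name clear_highest_bit is hypothetical.

```cpp
#include <cstdint>

// Clear the most significant set bit of x (returns 0 for x == 0).
std::uint32_t clear_highest_bit(std::uint32_t x) {
    std::uint32_t below = x;
    below |= below >> 1;    // smear the highest set bit to the right...
    below |= below >> 2;
    below |= below >> 4;
    below |= below >> 8;
    below |= below >> 16;   // ...until every bit from it downward is 1
    below >>= 1;            // ones strictly below the highest set bit
    return x & below;       // keep everything except the highest set bit
}
```

For the question's example, clear_highest_bit(20) gives 4. A 64-bit version would need one more smearing step with a shift by 32.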
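#include <limits.h>
#include <stdint.h>
uint32_t unsetHighestBit(uint32_t val) {
    // i must be signed: with an unsigned counter, i >= 0 is always true
    for (int i = (int)(sizeof(uint32_t) * CHAR_BIT) - 1; i >= 0; i--) {
        if (val & (UINT32_C(1) << i)) {
            val &= ~(UINT32_C(1) << i);
            break;
        }
    }
    return val;
}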
Explanation
Here we take the size of the type uint32_t, which is 4 bytes. Each byte has 8 bits, so we iterate 32 times starting with i having values 31 to 0.
In each iteration we shift the value 1 by i to the left and then bitwise-and (&) it with our value. If this returns a value != 0, the bit at i is set. Once we find a bit that is set, we bitwise-and (&) our initial value with the bitwise negation (~) of the bit that is set.
For example if we have the number 44, its binary representation would be 0010 1100. The first set bit that we find is bit 5, resulting in the mask 0010 0000. The bitwise negation of this mask is 1101 1111. Now when bitwise and-ing & the initial value with this mask, we get the value 0000 1100.
In C++ with templates
This is an example of how this can be solved in C++ using a template:
#include <limits>
template<typename T> T unsetHighestBit(T val) {
    // T is assumed to be an unsigned integer type
    for (int i = std::numeric_limits<T>::digits - 1; i >= 0; i--) {
        if (val & (T(1) << i)) {
            val &= ~(T(1) << i);
            break;
        }
    }
    return val;
}
If you're constrained to 8 bits (as in your example), then just precalculate all possible values in an array (byte[256]) using any algorithm, or just type it in by hand.
Then you just look up the desired value:
x = lookup[originalNumber]
Can't be much faster than that. :-)
UPDATE: so I read the question wrong.
But if using 64-bit values, then break the value apart into 8 bytes, maybe by casting it to a byte[8] or overlaying it in a union or something more clever. After that, find the first byte that is not zero and do as in my answer above with that particular byte. Not as efficient I'm afraid, but still it is at most 8 tests (and on average 4.5) + one lookup.
Of course, creating a byte[65536] lookup will double the speed.
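The 8-bit lookup idea could be sketched as follows. The names make_table and clear_highest_bit8 are hypothetical, and the table is filled by a simple loop at first use rather than typed in by hand.

```cpp
#include <array>
#include <cstdint>

// Build a 256-entry table: table[v] is v with its highest set bit cleared.
std::array<std::uint8_t, 256> make_table() {
    std::array<std::uint8_t, 256> table{};       // table[0] stays 0
    for (unsigned v = 1; v < 256; ++v) {
        unsigned high = 0x80;
        while (!(v & high)) {
            high >>= 1;                          // find the highest set bit of v
        }
        table[v] = static_cast<std::uint8_t>(v & ~high);
    }
    return table;
}

// Look up the answer for an 8-bit value.
std::uint8_t clear_highest_bit8(std::uint8_t v) {
    static const std::array<std::uint8_t, 256> table = make_table();
    return table[v];
}
```

After the table has been built once, each call is a single array access.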
The following code will turn off the leftmost (most significant) set bit:
bool found = false;
int bitCounter = 31;
while (!found) {
    // 1u avoids shifting a signed 1 into the sign bit
    unsigned bit = x & (1u << bitCounter);
    if (bit != 0) {
        x &= ~(1u << bitCounter);
        found = true;
    }
    else if (bitCounter == 0)
        found = true;
    else
        bitCounter--;
}
I know a method to set the rightmost nonzero bit to 0.
a & (a - 1)
It is from the book: Warren H.S., Jr. - Hacker's Delight.
You can reverse your bits, set the rightmost one to zero and reverse back. But I do not know an efficient way to reverse the bits in your case.

How to set specific bits?

Let's say I've got a uint16_t variable where I must set specific bits.
Example:
uint16_t field = 0;
That would mean the bits are all zero: 0000 0000 0000 0000
Now I get some values that I need to set at specific positions.
val1=1; val2=2, val3=0, val4=4, val5=0;
The structure how to set the bits is the following
0|000| 0000| 0000 000|0
val1 should be set at the first bit on the left, so it's only one or zero.
val2 should be set at the next three bits, val3 on the next four bits, val4 on the next seven bits and val5 on the last bit.
The result would be this:
1010 0000 0000 1000
I only found out how to set one specific bit, but not 'groups' (shift or bitset).
Does anyone have an idea how to solve this issue?
There are (at least) two basic approaches. One would be to create a struct with some bitfields:
struct bits {
unsigned a : 1;
unsigned b : 7;
unsigned c : 4;
unsigned d : 3;
unsigned e : 1;
};
bits b;
// a is declared first; this assumes the compiler places it in the least significant bit
b.a = val5;
b.b = val4;
b.c = val3;
b.d = val2;
b.e = val1;
To get the 16-bit value, you could (for one example) create a union of that struct with a uint16_t. Just one minor problem: the standard doesn't guarantee what order the bit fields will end up in when you look at the 16-bit value. Just for example, you might need to reverse the order I've given above to get the order from most to least significant bits that you really want (but changing compilers might muck things up again).
The other obvious possibility would be to use shifting and masking to put the pieces together into a number:
uint16_t result = val5 | (val4 << 1) | (val3 << 8) | (val2 << 12) | (val1 << 15);
For the moment, I've assumed each of the inputs starts out in the correct range (i.e., has a value that can be represented in the chosen number of bits). If there's a possibility that could be wrong, you'd want to mask it to the correct number of bits first. The usual way to do that is something like:
uint16_t result = input & ((1 << num_bits) - 1);
In case you're curious about the math there, it works like this. Let's assume we want to ensure an input fits in 4 bits. Shifting 1 left 4 bits produces 00010000 (in binary). Subtracting one from that then clears the one bit that's set, and sets all the bits less significant than that, giving 00001111 for our example. That gives us a mask with the four least significant bits set. When we do a bit-wise AND between that and the input, any higher bits that were set in the input are cleared in the result.
One of the solutions would be to set a K-bit value starting at the N-th bit of field as:
uint16_t value_mask = ((1<<K)-1) << N; // for K=4 and N=3 will be 00..01111000
field = field & ~value_mask; // zero the corresponding bits inside the field
field = field | ((value << N) & value_mask); // AND with value_mask is for extra safety
Or, if you can use a struct instead of uint16_t, you can use bit fields and let the compiler perform all these actions for you.
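The mask-clear-or steps above can be collected into a small helper. This is a sketch under stated assumptions: the name set_field and its parameter order are hypothetical, k is assumed to be between 1 and 16, and n + k is assumed not to exceed 16. pack_example applies it to the layout from the question.

```cpp
#include <cstdint>

// Write the low k bits of value into field, starting at bit n (bit 0 = LSB).
std::uint16_t set_field(std::uint16_t field, unsigned value, unsigned n, unsigned k) {
    const std::uint32_t mask = ((std::uint32_t{1} << k) - 1u) << n;   // k ones at bit n
    const std::uint32_t cleared = field & ~mask;                      // zero the target bits
    const std::uint32_t shifted = (static_cast<std::uint32_t>(value) << n) & mask;
    return static_cast<std::uint16_t>(cleared | shifted);
}

// The question's layout: 1 bit, 3 bits, 4 bits, 7 bits, 1 bit, from the left.
std::uint16_t pack_example() {
    std::uint16_t field = 0;
    field = set_field(field, 1, 15, 1);   // val1
    field = set_field(field, 2, 12, 3);   // val2
    field = set_field(field, 0, 8, 4);    // val3
    field = set_field(field, 4, 1, 7);    // val4
    field = set_field(field, 0, 0, 1);    // val5
    return field;
}
```

pack_example() returns 0xA008, which is 1010 0000 0000 1000 in binary, the result the question expects.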
finalvle = 0;
finalvle = (val1&0x01)<<15;
finalvle += (val2&0x07)<<12;
finalvle += (val3&0x0f)<<8;
finalvle += (val4&0x7f)<<1; // val4 occupies seven bits
finalvle += (val5&0x01);
You can use the bitwise or and shift operators to achieve this.
Use shift << to 'move bits to the left':
int i = 1; // ...0001
int j = i << 3; // ...1000
You can then use bitwise or | to put it at the right place (assuming you have all zeros at the bits you are trying to overwrite).
int k = 0; // ...0000
k |= i; // ...0001
k |= j; // ...1001
Edit: Note that @Inspired's answer also explains zeroing out a certain range of bits. Overall, it explains how you would go about implementing this properly.
try this code:
uint16_t shift(uint16_t num, int shift)
{
return num | (uint16_t)(1u << shift); // 1u << shift is 2^shift, without floating-point pow()
}
where shift is the position of the bit that you want to set

Find "edges" in 32 bits word bitpattern

I'm trying to find the most efficient algorithm to count "edges" in a bit pattern. An edge means a change from 0 to 1 or 1 to 0. I am sampling each bit every 250 us and shifting it into a 32-bit unsigned variable.
This is my algorithm so far
void CountEdges(void)
{
uint_least32_t feedback_samples_copy = feedback_samples;
signal_edges = 0;
while (feedback_samples_copy > 0)
{
uint_least8_t flank_information = (feedback_samples_copy & 0x03);
if (flank_information == 0x01 || flank_information == 0x02)
{
signal_edges++;
}
feedback_samples_copy >>= 1;
}
}
It needs to be at least 2 or 3 times as fast.
You should be able to bitwise XOR the value with a copy of itself shifted by one bit to get a bit pattern representing the flipped bits. Then use one of the bit counting tricks on this page: http://graphics.stanford.edu/~seander/bithacks.html to count how many 1's there are in the result.
One thing that may help is to precompute the edge count for all possible 8-bit values (a 512-entry lookup table, since you have to include the bit that precedes each value) and then sum up the counts one byte at a time.
// prevBit is the last bit of the previous 32-bit word
// edgeLut is a 512 entry precomputed edge count table
// Some of the shifts and & are extraneous, but there for clarity
edgeCount =
edgeLut[(prevBit << 8) | (feedback_samples >> 24) & 0xFF] +
edgeLut[(feedback_samples >> 16) & 0x1FF] +
edgeLut[(feedback_samples >> 8) & 0x1FF] +
edgeLut[(feedback_samples >> 0) & 0x1FF];
prevBit = feedback_samples & 0x1;
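The answer does not show how the table is filled, so here is one possible sketch. The helper names make_edge_lut and count_edges_lut are hypothetical. Each 9-bit index is read as the preceding bit (bit 8) followed by one byte (bits 7..0), matching the indexing in the snippet above.

```cpp
#include <array>
#include <cstdint>

// edgeLut[i] = number of adjacent-bit transitions within the 9-bit pattern i.
std::array<std::uint8_t, 512> make_edge_lut() {
    std::array<std::uint8_t, 512> lut{};
    for (unsigned i = 0; i < 512; ++i) {
        unsigned diff = (i ^ (i >> 1)) & 0xFFu;   // bit k set if bits k and k+1 differ
        unsigned count = 0;
        for (; diff; diff &= diff - 1) {
            ++count;                              // one iteration per set bit
        }
        lut[i] = static_cast<std::uint8_t>(count);
    }
    return lut;
}

// Edge count of one 32-bit word, including the edge against prevBit.
unsigned count_edges_lut(std::uint32_t samples, unsigned prevBit) {
    static const std::array<std::uint8_t, 512> edgeLut = make_edge_lut();
    return edgeLut[((prevBit & 1u) << 8) | (samples >> 24)]
         + edgeLut[(samples >> 16) & 0x1FFu]
         + edgeLut[(samples >> 8) & 0x1FFu]
         + edgeLut[samples & 0x1FFu];
}
```

Because each lookup covers one byte plus the bit above it, the four lookups together cover all 32 transitions without counting any of them twice.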
My suggestion:
copy your input value to a temp variable, left shifted by one
copy the LSB of your input to the LSB of your temp variable
XOR the two values. Every bit set in the result value represents one edge.
use this algorithm to count the number of bits set.
This might be the code for the first 3 steps:
uint32 input; //some value
uint32 temp = (input << 1) | (input & 0x00000001);
uint32 result = input ^ temp;
//continue to count the bits set in result
//...
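Putting the four steps together, a complete function might look like the following sketch. The name count_edges is hypothetical, and the final bit count uses the clear-lowest-set-bit loop as one possible choice.

```cpp
#include <cstdint>

// Count transitions between adjacent bits of a 32-bit sample word.
unsigned count_edges(std::uint32_t input) {
    // Shift left by one and copy the LSB, so bit 0 compares with itself.
    std::uint32_t shifted = (input << 1) | (input & 1u);
    std::uint32_t changes = input ^ shifted;     // bit i set if bits i and i-1 differ
    unsigned edges = 0;
    for (; changes; changes &= changes - 1) {
        ++edges;                                 // one iteration per set bit
    }
    return edges;
}
```

Unlike the loop in the question, this counts only the 31 transitions between adjacent sample bits. It does not count a transition from bit 31 to an implied zero above it.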
Create a look-up table so you can get the transitions within a byte or 16-bit value in one shot - then all you need to do is look at the differences in the 'edge' bits between bytes (or 16-bit values).
You are looking at only 2 bits during every iteration.
The fastest algorithm would probably be to build a lookup table for all possible values. Since there are 2^32 values, that is not the best idea.
But why don't you look at 3, 4, 5 ... bits in one step? You can for instance precalculate for all 4 bit combinations your edgecount. Just take care of possible edges between the pieces.
you could always use a lookup table for say 8 bits at a time
this way you get a speed improvement of around 8 times
don't forget to check for bits in between those 8 bits though. These then have to be checked 'manually'