How would I check for allevenbits in bitwise operations? - bit-manipulation

Using bitwise operations exclusively, how would I set y to 1 if all even-numbered bits of x are 1, and otherwise y is set to 0 (maximum of 8-bits)?
So far I have as follows:
p = ~x + 1
a = p >> 2
b = p >> 4
c = p >> 6
d = p >> 8
y = [insert code here]
Permitted: 12 operations (may use !, ~, +, -, <<, >>, &, ^, |) and up to 8-bit constants.

You could shift down all bits to the first bit and do a bitwise AND:
#even
y = 1 & (x >> 6) & (x >> 4) & (x >> 2) & x
#odd
y = (x >> 7) & (x >> 5) & (x >> 3) & (x >> 1)
Note the 1 & part for the even case since x >> 6 leaves the 2 MSbs in the result and you only want the low one.
Here's a demonstrative program written in C:
Demo
Since you mentioned that you may use ! (NOT), a simpler version may be:
#even
!((x & 0b01010101) ^ 0b01010101)
#odd
!((x & 0b10101010) ^ 0b10101010)
Here the constant first filters out the even (or odd) bits, and the result is then XORed with the same constant. This produces 0 only if all of the selected bits were set. Apply NOT and you get 1 if they were all set and 0 if they were not.
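As a rough sketch, the two even-bit variants above can be put side by side in C. The function names are made up for illustration, and the constant is written as 0x55 because the 0b binary literal syntax is not available in every C compiler:

```c
#include <stdint.h>

/* Shift-and-AND version: fold bits 0, 2, 4 and 6 down onto bit 0. */
int all_even_shift(uint8_t x)
{
    return 1 & (x >> 6) & (x >> 4) & (x >> 2) & x;
}

/* Mask-and-XOR version: 0x55 is 01010101 in binary, the even-numbered bits. */
int all_even_mask(uint8_t x)
{
    return !((x & 0x55) ^ 0x55);
}
```

Both should return 1 for inputs such as 0x55 or 0xFF and 0 for inputs such as 0x54 or 0xAA.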
Demo

Related

Trouble understanding piece of code. Bitwise operations in c

I have the following segment of code and am having trouble deciphering what it does.
/* assume 0 <= n <=3 and 0 <= m <=3 */
int n8= n <<3;
int m8 = m <<3;
int n_mask = 0xff << n8;
int m_mask = 0xff << m8; // left bitshifts 255 by the value of m8
int n_byte = ((x & n_mask) >> n8) & 0xff;
int m_byte = ((x & m_mask) >> m8) & 0xff;
int bytes_mask = n_mask | m_mask ;
int leftover = x & ~bytes_mask;
return ( leftover | (n_byte <<m8)| (m_byte << n8) );
It swaps the nth and mth bytes.
The start has two parallel computations, one sequence with n and one sequence with m, that select the nth and mth byte like this:
Step 1: 0xff << n8
0x000000ff << 0 = 0x000000ff
.. 8 = 0x0000ff00
.. 16 = 0x00ff0000
.. 24 = 0xff000000
Step 2: x & n_mask
x = 0xDDCCBBAA
x & 0x000000ff = 0x000000AA
x & 0x0000ff00 = 0x0000BB00
x & 0x00ff0000 = 0x00CC0000
x & 0xff000000 = 0xDD000000
Step 3: ((x & n_mask) >> n8) & 0xff (note: & 0xff is required because the right shift is likely to be an arithmetic right shift, it would not be required if the code worked with unsigned integers)
n = 0: 0x000000AA
1: 0x000000BB
2: 0x000000CC
3: 0x000000DD
So it extracts the nth byte and puts it at the bottom of the integer.
The same thing is done for m.
leftover is the other (2 or 3) bytes, the ones not extracted by the previous process. There may be 3 bytes left over, because n and m can be the same.
Finally the last step is to put it all back together, but with the byte extracted from the nth position shifted to the mth position, and the mth byte shifted to the nth position, so they switch places.
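The steps above can be sketched with unsigned types, which sidesteps the arithmetic-shift concern mentioned in Step 3. This is a reworking for illustration, not the original code; the name swap_bytes and the use of uint32_t are assumptions:

```c
#include <stdint.h>

/* Swap byte n and byte m of x (byte 0 is the least significant).
   Assumes 0 <= n <= 3 and 0 <= m <= 3, as in the question. */
uint32_t swap_bytes(uint32_t x, unsigned n, unsigned m)
{
    unsigned n8 = n << 3;                    /* bit offset of byte n */
    unsigned m8 = m << 3;                    /* bit offset of byte m */
    uint32_t n_mask = (uint32_t)0xff << n8;  /* selects byte n */
    uint32_t m_mask = (uint32_t)0xff << m8;  /* selects byte m */
    uint32_t n_byte = (x & n_mask) >> n8;    /* logical shift: no extra & 0xff */
    uint32_t m_byte = (x & m_mask) >> m8;
    uint32_t leftover = x & ~(n_mask | m_mask);
    return leftover | (n_byte << m8) | (m_byte << n8);
}
```

With x = 0xDDCCBBAA, swapping bytes 0 and 3 should give 0xAACCBBDD, and swapping a byte with itself leaves x unchanged.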

Replicating the function of a for loop using only bitwise operators

I'm trying to replicate the function of a loop using only bitwise and certain operators including ! ~ & ^ | + << >>
int loop(int x) {
for (int i = 1; i < 32; i += 2)
if ((x & (1 << i)) == 0)
return 0;
return 1;
}
I'm unsure, however, how to replicate the accumulating nature of a loop using just these operators. I understand that shifting with << and >> will allow me to multiply and divide. However, manipulation using ! ~ & ^ | has proven more difficult. Any tips?
http://www.tutorialspoint.com/cprogramming/c_operators.htm
Edit:
I understand how the addition of bits can be achieved, however not how such an output can be achieved without first calling a while or for loop.
Maybe this can help:
int loop(int x) {
x = x & 0xaaaaaaaa; // Set all even numbered bits in x to zero
x = x ^ 0xaaaaaaaa; // If all odd numbered bits in x are 1, x becomes zero
x = !x; // The operation returns 1 if x is zero - otherwise 0
return x;
}
Your code tests all odd bits and returns 1 if all of them are set. You can use this bitmask: ...1010 1010 1010
Which, for 32 bits, is 0xAAAAAAAA.
Then you take your value and bitwise-AND it with the mask. If the result is the same as your mask, it means all odd bits are set.
int testOddBits(int x) {
return (x & 0xAAAAAAAA) == 0xAAAAAAAA;
}
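To see that the mask approach matches the original loop, here is a small sketch that keeps both. It uses uint32_t rather than int (an assumption made here to avoid shifting into the sign bit), and the function names are illustrative:

```c
#include <stdint.h>

/* Reference: the loop from the question, on an unsigned value. */
int odd_bits_loop(uint32_t x)
{
    for (int i = 1; i < 32; i += 2)
        if ((x & ((uint32_t)1 << i)) == 0)
            return 0;
    return 1;
}

/* Loop-free version: 0xAAAAAAAA has exactly the odd-numbered bits set. */
int odd_bits_mask(uint32_t x)
{
    const uint32_t mask = 0xAAAAAAAAu;
    return !((x & mask) ^ mask);   /* uses only &, ^ and ! */
}
```

The two functions should agree on every input; for instance, both return 1 for 0xAAAAAAAA and 0 for 0x2AAAAAAA.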

8-digit BCD check

I have an 8-digit BCD number and need to check whether it is a valid BCD number. How can I do this programmatically (C/C++)?
Ex: 0x12345678 is valid, but 0x00f00abc isn't.
Thanks in advance!
You need to check each 4-bit quantity to make sure it's less than 10. For efficiency you want to work on as many bits as you can at a single time.
Here I break the digits apart to leave a zero between each one, then add 6 to each and check for overflow.
uint32_t highs = (value & 0xf0f0f0f0) >> 4;
uint32_t lows = value & 0x0f0f0f0f;
bool invalid = (((highs + 0x06060606) | (lows + 0x06060606)) & 0xf0f0f0f0) != 0;
Edit: actually we can do slightly better. It doesn't take 4 bits to detect overflow, only 1. If we divide all the digits by 2, it frees a bit and we can check all the digits at once.
uint32_t halfdigits = (value >> 1) & 0x77777777;
bool invalid = ((halfdigits + 0x33333333) & 0x88888888) != 0;
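A minimal sketch that wraps the halved-digit check in a function and pairs it with a digit-by-digit reference for comparison. The function names are assumptions made for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Straightforward check: every 4-bit digit must be below 10. */
bool bcd_valid_loop(uint32_t value)
{
    for (int i = 0; i < 8; i++, value >>= 4)
        if ((value & 0xf) >= 10)
            return false;
    return true;
}

/* Parallel check: halve each digit (giving 0..7), add 3, and test
   bit 3 of every nibble. Bit 3 is set only when the digit was >= 10. */
bool bcd_valid_parallel(uint32_t value)
{
    uint32_t halfdigits = (value >> 1) & 0x77777777u;
    return ((halfdigits + 0x33333333u) & 0x88888888u) == 0;
}
```

Since each halved digit is at most 7, adding 3 never carries into the neighbouring nibble, which is what lets all eight digits be tested at once.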
The obvious way to do this is:
/* returns 1 if x is valid BCD */
int
isvalidbcd (uint32_t x)
{
for (; x; x = x>>4)
{
if ((x & 0xf) >= 0xa)
return 0;
}
return 1;
}
This link tells you all about BCD, and recommends something like this as a more optimised solution (reworked to check all the digits, hence the 64-bit data type; untested):
/* returns 1 if x is valid BCD */
int
isvalidbcd (uint32_t x)
{
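/* sum ^ x ^ addend recovers the carry into each bit; a carry out of any digit means that digit was >= 10 */
return !((((uint64_t)x + 0x66666666ULL) ^ (uint64_t)x ^ 0x66666666ULL) & 0x111111110ULL);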
}
For a digit to be invalid, it needs to be 10-15. That in turn means 8 + 4 or 8+2 - the low bit doesn't matter at all.
So:
long mask8 = value & 0x88888888;
long mask4 = value & 0x44444444;
long mask2 = value & 0x22222222;
return ((mask8 >> 2) & ((mask4 >> 1) | mask2)) == 0;
Slightly less obvious:
long mask8 = (value>>2);
long mask42 = (value | (value >> 1));
return (mask8 & mask42 & 0x22222222) == 0;
By shifting before masking, we don't need 3 different masks.
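For illustration, the single-mask variant could be written as a complete function like this. The name bcd_valid_bits and the uint32_t type are choices made here, not part of the answer above:

```c
#include <stdbool.h>
#include <stdint.h>

/* A digit is invalid when its 8 bit is set together with its 4 bit
   or its 2 bit, that is, when it is 10..15. */
bool bcd_valid_bits(uint32_t value)
{
    uint32_t has8  = value >> 2;            /* 8 bit moved to the 2 position */
    uint32_t has42 = value | (value >> 1);  /* 2 bit OR 4 bit, at the 2 position */
    return (has8 & has42 & 0x22222222u) == 0;
}
```

Digits 8 (1000) and 9 (1001) pass because neither has its 4 or 2 bit set, while 10 (1010) and 12 (1100) are rejected.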
Inspired by @Mark Ransom
bool invalid = (0x88888888 & (((value & 0xEEEEEEEE) >> 1) + (0x66666666 >> 1))) != 0;
// or
bool valid = !((((value & 0xEEEEEEEEu) >> 1) + 0x33333333) & 0x88888888);
Mask off each BCD digit's 1's place, shift right, then add 3 (that is, 6 >> 1) and check for BCD digit overflow.
How this works:
By adding +6 to each digit, we look for an overflow * of the 4-bit sum.
   abcd
 +  110
  -----
  *efgd
But the bit value of d does not contribute to the sum, so first mask off that bit and shift right. Now the overflow bit is in the 8's place. This all is done in parallel and we mask these carry bits with 0x88888888 and test if any are set.
   0abc
 +   11
  -----
   *efg

How does this algorithm to count the number of set bits in a 32-bit integer work?

int SWAR(unsigned int i)
{
i = i - ((i >> 1) & 0x55555555);
i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
I have seen this code that counts the number of bits equal to 1 in a 32-bit integer, and I noticed that its performance is better than __builtin_popcount, but I can't understand the way it works.
Can someone give a detailed explanation of how this code works?
OK, let's go through the code line by line:
Line 1:
i = i - ((i >> 1) & 0x55555555);
First of all, the significance of the constant 0x55555555 is that, written using the Java / GCC style binary literal notation,
0x55555555 = 0b01010101010101010101010101010101
That is, all its odd-numbered bits (counting the lowest bit as bit 1 = odd) are 1, and all the even-numbered bits are 0.
The expression ((i >> 1) & 0x55555555) thus shifts the bits of i right by one, and then sets all the even-numbered bits to zero. (Equivalently, we could've first set all the odd-numbered bits of i to zero with & 0xAAAAAAAA and then shifted the result right by one bit.) For convenience, let's call this intermediate value j.
What happens when we subtract this j from the original i? Well, let's see what would happen if i had only two bits:
i           j           i - j
----------------------------------
0 = 0b00    0 = 0b00    0 = 0b00
1 = 0b01    0 = 0b00    1 = 0b01
2 = 0b10    1 = 0b01    1 = 0b01
3 = 0b11    1 = 0b01    2 = 0b10
Hey! We've managed to count the bits of our two-bit number!
OK, but what if i has more than two bits set? In fact, it's pretty easy to check that the lowest two bits of i - j will still be given by the table above, and so will the third and fourth bits, and the fifth and sixth bits, and so on. In particular:
despite the >> 1, the lowest two bits of i - j are not affected by the third or higher bits of i, since they'll be masked out of j by the & 0x55555555; and
since the lowest two bits of j can never have a greater numerical value than those of i, the subtraction will never borrow from the third bit of i: thus, the lowest two bits of i also cannot affect the third or higher bits of i - j.
In fact, by repeating the same argument, we can see that the calculation on this line, in effect, applies the table above to each of the 16 two-bit blocks in i in parallel. That is, after executing this line, the lowest two bits of the new value of i will now contain the number of bits set among the corresponding bits in the original value of i, and so will the next two bits, and so on.
Line 2:
i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
Compared to the first line, this one's quite simple. First, note that
0x33333333 = 0b00110011001100110011001100110011
Thus, i & 0x33333333 takes the two-bit counts calculated above and throws away every second one of them, while (i >> 2) & 0x33333333 does the same after shifting i right by two bits. Then we add the results together.
Thus, in effect, what this line does is take the bitcounts of the lowest two and the second-lowest two bits of the original input, computed on the previous line, and add them together to give the bitcount of the lowest four bits of the input. And, again, it does this in parallel for all the 8 four-bit blocks (= hex digits) of the input.
Line 3:
return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
OK, what's going on here?
Well, first of all, (i + (i >> 4)) & 0x0F0F0F0F does exactly the same as the previous line, except it adds the adjacent four-bit bitcounts together to give the bitcounts of each eight-bit block (i.e. byte) of the input. (Here, unlike on the previous line, we can get away with moving the & outside the addition, since we know that the eight-bit bitcount can never exceed 8, and therefore will fit inside four bits without overflowing.)
Now we have a 32-bit number consisting of four 8-bit bytes, each byte holding the number of 1-bits in that byte of the original input. (Let's call these bytes A, B, C and D.) So what happens when we multiply this value (let's call it k) by 0x01010101?
Well, since 0x01010101 = (1 << 24) + (1 << 16) + (1 << 8) + 1, we have:
k * 0x01010101 = (k << 24) + (k << 16) + (k << 8) + k
Thus, the highest byte of the result ends up being the sum of:
its original value, due to the k term, plus
the value of the next lower byte, due to the k << 8 term, plus
the value of the second lower byte, due to the k << 16 term, plus
the value of the fourth and lowest byte, due to the k << 24 term.
(In general, there could also be carries from lower bytes, but since we know the value of each byte is at most 8, we know the addition will never overflow and create a carry.)
That is, the highest byte of k * 0x01010101 ends up being the sum of the bitcounts of all the bytes of the input, i.e. the total bitcount of the 32-bit input number. The final >> 24 then simply shifts this value down from the highest byte to the lowest.
P.S. This code could easily be extended to 64-bit integers by widening the mask constants to 64 bits, changing the 0x01010101 to 0x0101010101010101 and the >> 24 to >> 56. Indeed, the same method would even work for 128-bit integers; 256 bits would require adding one extra shift / add / mask step, however, since the number 256 no longer quite fits into an 8-bit byte.
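As a quick sanity check of the walkthrough above, the routine can be compared against a bit-by-bit count. This is only a sketch: it uses uint32_t instead of unsigned int to pin the width at 32 bits, and the function names are made up here:

```c
#include <stdint.h>

/* Reference: count set bits one at a time. */
int popcount_naive(uint32_t i)
{
    int n = 0;
    for (; i; i >>= 1)
        n += (int)(i & 1);
    return n;
}

/* The SWAR routine from the question, annotated per step. */
int popcount_swar(uint32_t i)
{
    i = i - ((i >> 1) & 0x55555555u);                 /* 16 two-bit counts */
    i = (i & 0x33333333u) + ((i >> 2) & 0x33333333u); /* 8 four-bit counts */
    i = (i + (i >> 4)) & 0x0F0F0F0Fu;                 /* 4 per-byte counts */
    return (int)((i * 0x01010101u) >> 24);            /* sum lands in top byte */
}
```

For example, 0x12345678 has 13 bits set, and both functions should report that.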
I prefer this one, it's much easier to understand.
x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x & 0x0f0f0f0f) + ((x >> 4) & 0x0f0f0f0f);
x = (x & 0x00ff00ff) + ((x >> 8) & 0x00ff00ff);
x = (x & 0x0000ffff) + ((x >> 16) &0x0000ffff);
This is a comment to Ilamari's answer.
I put it as an answer because of format issues:
Line 1:
i = i - ((i >> 1) & 0x55555555); // (1)
This line is derived from this easier to understand line:
i = (i & 0x55555555) + ((i >> 1) & 0x55555555); // (2)
If we call
i = input value
j0 = i & 0x55555555
j1 = (i >> 1) & 0x55555555
k = output value
We can rewrite (1) and (2) to make the explanation clearer:
k = i - j1; // (3)
k = j0 + j1; // (4)
We want to demonstrate that (3) can be derived from (4).
i can be written as the addition of its even and odd bits (counting the lowest bit as bit 1 = odd):
i = iodd + ieven =
= (i & 0x55555555) + (i & 0xAAAAAAAA) =
= (i & modd) + (i & meven)
Since the meven mask clears the last bit of i,
the last equality can be written this way:
i = (i & modd) + (((i >> 1) & modd) << 1) =
= j0 + 2*j1
That is:
j0 = i - 2*j1 (5)
Finally, substituting (5) into (4) we arrive at (3):
k = j0 + j1 = i - 2*j1 + j1 = i - j1
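The identity can also be spot-checked numerically. The sketch below simply implements forms (1) and (2) as two functions (names invented for this example) so their outputs can be compared:

```c
#include <stdint.h>

/* Pairwise counts via subtraction, as in line (1): i - j1. */
uint32_t pair_counts_sub(uint32_t i)
{
    return i - ((i >> 1) & 0x55555555u);
}

/* Pairwise counts via addition, as in line (2): j0 + j1. */
uint32_t pair_counts_add(uint32_t i)
{
    return (i & 0x55555555u) + ((i >> 1) & 0x55555555u);
}
```

For an all-ones input, each two-bit group holds the count 2 (binary 10), so both functions return 0xAAAAAAAA.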
This is an explanation of yeer's answer:
int SWAR(unsigned int i) {
i = (i & 0x55555555) + ((i >> 1) & 0x55555555); // A
i = (i & 0x33333333) + ((i >> 2) & 0x33333333); // B
i = (i & 0x0f0f0f0f) + ((i >> 4) & 0x0f0f0f0f); // C
i = (i & 0x00ff00ff) + ((i >> 8) & 0x00ff00ff); // D
i = (i & 0x0000ffff) + ((i >> 16) &0x0000ffff); // E
return i;
}
Let's use Line A as the basis of my explanation.
i = (i & 0x55555555) + ((i >> 1) & 0x55555555)
Let's rename the above expression as follows:
i = (i & mask) + ((i >> 1) & mask)
= A1 + A2
First, think of i not as 32 bits, but rather as an array of 16 groups, 2 bits each. A1 is the count array of size 16, each group containing the count of 1s at the right-most bit of the corresponding group in i:
i = yx yx yx yx yx yx yx yx yx yx yx yx yx yx yx yx
mask = 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
i & mask = 0x 0x 0x 0x 0x 0x 0x 0x 0x 0x 0x 0x 0x 0x 0x 0x
Similarly, A2 is "counting" the left-most bit for each group in i. Note that I can rewrite A2 = (i >> 1) & mask as A2 = (i & mask2) >> 1:
i = yx yx yx yx yx yx yx yx yx yx yx yx yx yx yx yx
mask2 = 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
(i & mask2) = y0 y0 y0 y0 y0 y0 y0 y0 y0 y0 y0 y0 y0 y0 y0 y0
(i & mask2) >> 1 = 0y 0y 0y 0y 0y 0y 0y 0y 0y 0y 0y 0y 0y 0y 0y 0y
(Note that mask2 = 0xaaaaaaaa)
Thus, A1 + A2 adds the counts of the A1 array and A2 array, resulting in an array of 16 groups, each group now contains the count of bits in each group.
Moving onto Line B, we can rename the line as follows:
i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
= (i & mask) + ((i >> 2) & mask)
= B1 + B2
B1 + B2 follows the same "form" as A1 + A2 from before. Think of i no longer as 16 groups of 2 bits, but rather as 8 groups of 4 bits. So similar to before, B1 + B2 adds the counts of B1 and B2 together, where B1 is the counts of 1s in the right side of the group, and B2 is the counts of the left side of the group. B1 + B2 is thus the counts of bits in each group.
Lines C through E now become more easily understandable:
int SWAR(unsigned int i) {
// A: 16 groups of 2 bits, each group contains number of 1s in that group.
i = (i & 0x55555555) + ((i >> 1) & 0x55555555);
// B: 8 groups of 4 bits, each group contains number of 1s in that group.
i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
// C: 4 groups of 8 bits, each group contains number of 1s in that group.
i = (i & 0x0f0f0f0f) + ((i >> 4) & 0x0f0f0f0f);
// D: 2 groups of 16 bits, each group contains number of 1s in that group.
i = (i & 0x00ff00ff) + ((i >> 8) & 0x00ff00ff);
// E: 1 group of 32 bits, containing the number of 1s in that group.
i = (i & 0x0000ffff) + ((i >> 16) &0x0000ffff);
return i;
}

Swapping bits at a given point between two bytes

Let's say I have these two numbers:
x = 0xB7
y = 0xD9
Their binary representations are:
x = 1011 0111
y = 1101 1001
Now I want to crossover (GA) at a given point, say from position 4 onwards.
The expected result should be:
x = 1011 1001
y = 1101 0111
Bitwise, how can I achieve this?
I would just use bitwise operators:
t = (x & 0x0f)
x = (x & 0xf0) | (y & 0x0f)
y = (y & 0xf0) | t
That would work for that specific case. In order to make it more adaptable, I'd put it in a function, something like (pseudo-code, with &, | and ! representing bitwise "and", "or", and "not" respectively):
def swapBits (x, y, s, e):
    lookup = [255,127,63,31,15,7,3,1]
    mask = lookup[s] & !lookup[e]
    t = x & mask
    x = (x & !mask) | (y & mask)
    y = (y & !mask) | t
    return (x,y)
The lookup values allow you to specify which bits to swap. Let's take the values xxxxxxxx for x and yyyyyyyy for y along with start bit s of 2 and end bit e of 6 (bit numbers start at zero on the left in this scenario):
x        y        s e t        mask     !mask    execute
-------- -------- - - -------- -------- -------- -------
xxxxxxxx yyyyyyyy 2 6                            starting point
                               00111111          mask = lookup[2](00111111)
                               00111100               & !lookup[6](11111100)
                      00xxxx00                   t = x & mask
xx0000xx                                         x = x & !mask(11000011)
xxyyyyxx                                           | y & mask(00111100)
         yy0000yy                                y = y & !mask(11000011)
         yyxxxxyy                                  | t(00xxxx00)
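A possible C rendering of the pseudo-code, offered as a sketch. The ninth lookup entry (0) is an addition made here so that e = 8 can mean "through the last bit", and the function name swap_bit_range is likewise an assumption:

```c
/* lookup[k] has the bits from position k to the end set, where position 0
   is the leftmost (most significant) bit. The final 0 entry is an extra,
   hypothetical entry that allows e = 8. */
static const unsigned char lookup[9] = {255, 127, 63, 31, 15, 7, 3, 1, 0};

/* Swap bit positions s..e-1 (counted from the left) between *x and *y.
   Assumes 0 <= s <= e <= 8. */
void swap_bit_range(unsigned char *x, unsigned char *y, unsigned s, unsigned e)
{
    unsigned char mask = lookup[s] & (unsigned char)~lookup[e];
    unsigned char t = *x & mask;                       /* save x's bits */
    *x = (unsigned char)((*x & ~mask) | (*y & mask));  /* take y's bits */
    *y = (unsigned char)((*y & ~mask) | t);            /* give y the saved bits */
}
```

With x = 0xB7, y = 0xD9, s = 4 and e = 8, the mask is 0x0F and the call should leave x = 0xB9 and y = 0xD7, matching the expected result in the question.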
If a bit position is the same in both values, no change is needed in either. If it's opposite, they both need to invert.
XOR with 1 flips a bit; XOR with 0 is a no-op.
So what we want is a value that has a 1 everywhere there's a bit-difference between the inputs, and a 0 everywhere else. That's exactly what a XOR b does.
Simply mask this bit-difference to only keep the differences in the bits we want to swap, and we have a bit-swap in 3 XORs + 1 AND.
Your mask is (1UL << position) - 1. One less than a power of 2 has all the bits below that set. Or more generally with a high and low position for your bit-range: (1UL << highpos) - (1UL << lowpos). Whether a lookup-table is faster than bit-set / sub depends on the compiler and hardware. (See @PaxDiablo's answer for the LUT suggestion).
// Portable C:
//static inline
void swapBits_char(unsigned char *A, unsigned char *B)
{
const unsigned highpos = 4, lowpos=0; // function args if you like
const unsigned char mask = (1UL << highpos) - (1UL << lowpos);
unsigned char tmpA = *A, tmpB = *B; // read into locals in case A==B
unsigned char bitdiff = tmpA ^ tmpB;
bitdiff &= mask; // clear all but the selected bits
*A = tmpA ^ bitdiff; // flip bits that differed
*B = tmpB ^ bitdiff;
}
//static inline
void swapBit_uint(unsigned *A, unsigned *B, unsigned mask)
{
unsigned tmpA = *A, tmpB = *B;
unsigned bitdiff = tmpA ^ tmpB;
bitdiff &= mask; // clear all but the selected bits
*A = tmpA ^ bitdiff;
*B = tmpB ^ bitdiff;
}
(Godbolt compiler explorer with gcc for x86-64 and ARM)
This is not an xor-swap. It does use temporary storage. As @chux's answer on a near-duplicate question demonstrates, a masked xor-swap requires 3 AND operations as well as 3 XOR. (And defeats the only benefit of XOR-swap by requiring a temporary register or other storage for the & results.) This answer is a modified copy of my answer on that other question.
This version only requires 1 AND. Also, the last two XORs are independent of each other, so total latency from inputs to both outputs is only 3 operations. (Typically 3 cycles).
For an x86 asm example of this, see this code-golf Exchange capitalization of two strings in 14 bytes of x86-64 machine code (with commented asm source)
Swapping individual bits with XOR
unsigned int i, j; // positions of bit sequences to swap
unsigned int n; // number of consecutive bits in each sequence
unsigned int b; // bits to swap reside in b
unsigned int r; // bit-swapped result goes here
unsigned int x = ((b >> i) ^ (b >> j)) & ((1U << n) - 1); // XOR temporary
r = b ^ ((x << i) | (x << j));
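Wrapped in a function, the snippet above might look like the following sketch. The name swap_fields is invented for this example, and it assumes the two bit fields do not overlap and that n is smaller than the width of unsigned int:

```c
/* Swap the n-bit field starting at bit i with the n-bit field starting
   at bit j inside b (bit 0 is the least significant). */
unsigned swap_fields(unsigned b, unsigned i, unsigned j, unsigned n)
{
    unsigned x = ((b >> i) ^ (b >> j)) & ((1U << n) - 1); /* XOR of the two fields */
    return b ^ ((x << i) | (x << j));                     /* flip only differing bits */
}
```

For b = 0x2F (00101111), i = 1, j = 5 and n = 3, the result should be 0xE3 (11100011): the field 111 at bits 1..3 trades places with the field 001 at bits 5..7.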