Meaning of XOR shift - c++

I came across an algorithm that uses a lot of XOR shift like so:
std::uint16_t a = ...;
std::uint16_t b = a ^ ( a >> 4 );
I read XOR is used for all kind of good stuff, like finding parity, determining odd/even counts etc. So I'm wondering: Does this operation (on its own) have a certain meaning? Is it a common pattern? Or is it just unique to this algorithm?
No, I'm not talking about the xorshift pseudorandom number generator.

Let's take a look at what it produces given input 0xIJKL:
0xIJKL
^ 0x0IJK
--------
0xI???
Doesn't seem very meaningful to me by itself, but this pattern seems to be used as a substep in many parity bit twiddles. For example, a twiddle for calculating the parity bit of a word (from https://graphics.stanford.edu/~seander/bithacks.html):
unsigned int v; // word value to compute the parity of
v ^= v >> 16;
v ^= v >> 8;
v ^= v >> 4;
v &= 0xf;
return (0x6996 >> v) & 1;
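To see that the folding steps really preserve parity, here is a minimal sketch that wraps the snippet above in a function and puts a naive bit-by-bit version next to it for comparison. The names parity_fold and parity_naive are made up for this illustration, and the code assumes a 32-bit input.

```cpp
#include <cstdint>

// Parity via XOR-shift folding: each step XORs the upper half of the
// remaining bits onto the lower half, so the low bits keep the parity.
inline unsigned parity_fold(std::uint32_t v)
{
    v ^= v >> 16;  // low 16 bits now carry the parity of all 32
    v ^= v >> 8;   // low 8 bits
    v ^= v >> 4;   // low 4 bits
    v &= 0xf;
    return (0x6996u >> v) & 1u;  // 16-entry parity table packed into a constant
}

// Reference version: XOR every bit together, one at a time.
inline unsigned parity_naive(std::uint32_t v)
{
    unsigned p = 0;
    for (; v != 0; v >>= 1)
        p ^= v & 1u;
    return p;
}
```

Both functions should return the same value for every input; the folded version just gets there in a fixed number of steps.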

Related

How to find a position of Least Significant Bit (LSB) in an int? c++

I have to write a c++ function which swaps nth and least significant bit of an int. I found some examples and did this:
v1 = v1 ^ ((((v1&1) ^ (v1>>n)&1) << LSBpos) | (((v1&1) ^ (v1>>n)&1) << n));
cout<<v1;
v1 is an int.
v1&1 is the value of LSB.
LSBpos should be the position of LSB but I don't know how to get it. There are explanations of how to get the position of LSB that is set or clear, but I just need that position, whether it is set or not.
You don't need to know the position of the LSB. For shifts and masks it is always bit 0, so x & 1 reads it no matter how the bytes are laid out in memory.
Let us find some help: How do you set, clear, and toggle a single bit?:
Checking a bit
You didn't ask for this, but I might as well add it.
To check a bit, shift the number n to the right, then bitwise AND it:
bit = (number >> n) & 1U;
Changing the nth bit to x
Setting the nth bit to either 1 or 0 can be achieved with the following on a 2's complement C++ implementation:
number ^= (-x ^ number) & (1UL << n);
And go for it!
int swap_nth_and_lsb(int x, int n)
{
    // left to the reader: check the validity of n
    // read LSB and nth bit
    int const lsb_value = x & 1;
    int const nth_value = (x >> n) & 1;
    // swap: write the old LSB into bit n, then the old nth bit into bit 0
    x ^= (-lsb_value ^ x) & (1UL << n);
    x ^= (-nth_value ^ x) & 1;
    return x;
}
In the comment, OP said "I have to write one line of code for getting result". Well, if the one-line condition holds, you can start from the above solution and turn it into a one-liner step by step.
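One possible one-liner, sketched under a few assumptions: it works on unsigned instead of int to avoid sign issues, it assumes 0 <= n < 32 with a 32-bit unsigned, and it uses a different trick from the answer above (flip both bits when they differ, rather than setting each bit in turn). The function name is invented for this example.

```cpp
// If bit 0 and bit n differ, flip both; otherwise leave x unchanged.
// Assumes 0 <= n < 32 and a 32-bit unsigned.
inline unsigned swap_nth_and_lsb_oneline(unsigned x, unsigned n)
{
    return x ^ (((x ^ (x >> n)) & 1u) * ((1u << n) | 1u));
}
```

The factor ((x ^ (x >> n)) & 1u) is 1 exactly when the two bits differ, so the multiplication either keeps or zeroes the two-bit flip mask.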

Convert every 5 bits into integer values in C++

Firstly, if anyone has a better title for me, let me know.
Here is an example of the process I am trying to automate with C++
I have an array of values that appear in this format:
9C07 9385 9BC7 00 9BC3 9BC7 9385
I need to convert them to binary and then convert every 5 bits to decimal like so with the last bit being a flag:
I'll do this with only the first word here.
9C07
10011 | 10000 | 00011 | 1
19 | 16 | 3
These are actually x, y, z coordinates, and the final bit determines their order: a '0' means x=19 y=16 z=3, and a '1' means x=16 y=3 z=19.
I already have a buffer filled with these hex values, but I have no idea where to go from here.
I assume these are integer literals, not strings?
The way to do this is with bitwise right shift (>>) and bitwise AND (&)
#include <cstdint>
struct Coordinate {
std::uint8_t x = 0; // default initializers keep the constexpr constructor valid before C++20
std::uint8_t y = 0;
std::uint8_t z = 0;
constexpr Coordinate(std::uint16_t n) noexcept
{
if (n & 1) { // flag
x = (n >> 6) & 0x1F; // 1 1111
y = (n >> 1) & 0x1F;
z = n >> 11;
} else {
x = n >> 11;
y = (n >> 6) & 0x1F;
z = (n >> 1) & 0x1F;
}
}
};
The following code would extract the three coordinates and the flag from the 16 least significant bits of value (i.e. its least significant word).
int flag = value & 1; // keep only the least significant bit
value >>= 1; // shift right by one bit
int third_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int second_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits
int first_integer = value & 0x1f; // keep only the five least significant bits
value >>= 5; // shift right by five bits (only useful if there are other words in "value")
What you need is most likely some loop doing this on each word of your array.
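Such a loop might look like the sketch below. The Point type and the function names are hypothetical, and the field order for flag 1 follows the question's description (x=16 y=3 z=19 for 9C07).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch.
struct Point {
    int x;
    int y;
    int z;
};

// Decode one 16-bit word: three 5-bit fields followed by a 1-bit flag.
inline Point decode_word(std::uint16_t word)
{
    const int flag   = word & 1;
    const int third  = (word >> 1) & 0x1f;
    const int second = (word >> 6) & 0x1f;
    const int first  = (word >> 11) & 0x1f;
    // Flag 0: the fields are x, y, z in that order.
    // Flag 1: x = second, y = third, z = first, as in the question.
    if (flag)
        return Point{second, third, first};
    return Point{first, second, third};
}

// Decode every word in a buffer.
inline std::vector<Point> decode_all(const std::vector<std::uint16_t>& words)
{
    std::vector<Point> out;
    out.reserve(words.size());
    for (std::uint16_t w : words)
        out.push_back(decode_word(w));
    return out;
}
```

If the buffer actually holds text such as "9C07", each token would first need to be parsed as a hexadecimal number before being passed to decode_word.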

Fastest Way to XOR all bits from value based on bitmask?

I've got an interesting problem that has me looking for a more efficient way of doing things.
Let's say we have a value (in binary)
(VALUE) 10110001
(MASK) 00110010
----------------
(AND) 00110000
Now, I need to be able to XOR any bits from the (AND) value that are set in the (MASK) value (always lowest to highest bit):
(RESULT) AND1(0) xor AND4(1) xor AND5(1) = 0
Now, on paper, this is certainly quick since I can see which bits are set in the mask. It seems to me that programmatically I would need to keep right shifting the MASK until I found a set bit, XOR it with a separate value, and loop until the entire byte is complete.
Can anyone think of a faster way? I'm looking for the way to do this with the least number of operations and stored values.
If I understood this question correctly, what you want is to get every bit from VALUE that is set in the MASK, and compute the XOR of those bits.
First of all, note that XOR'ing a value with 0 will not change the result. So, to ignore some bits, we can treat them as zeros.
So, XORing the bits set in VALUE that are in MASK is equivalent to XORing the bits in VALUE&MASK.
Now note that the result is 0 if the number of set bits is even, 1 if it is odd.
That means we want to count the number of set bits. Some architectures/compilers have ways to quickly compute this value. For instance, on GCC this can be obtained with __builtin_popcount.
So on GCC, this can be computed with:
int set_bits = __builtin_popcount(value & mask);
return set_bits % 2;
If you want the code to be portable, then this won't do. However, a comment in this answer suggests that some compilers can inline std::bitset::count to efficiently obtain the same result.
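A portable variant along those lines could look like this. It is only a sketch: the function name is invented, a 32-bit value is assumed, and whether count() turns into a single popcount instruction depends on the compiler and target.

```cpp
#include <bitset>
#include <cstdint>

// XOR of the bits of `value` selected by `mask`, via a population count.
inline unsigned masked_parity_bitset(std::uint32_t value, std::uint32_t mask)
{
    return static_cast<unsigned>(std::bitset<32>(value & mask).count() & 1u);
}
```

With the question's numbers, VALUE 10110001 and MASK 00110010 leave two set bits after the AND, so the result is 0. In C++20, std::popcount from the bit header is another standard option.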
If I'm understanding you right, you have
result = value & mask
and you want to XOR the 1 bits of mask & result together. The XOR of a series of bits is the same as counting the number of bits and checking if that count is even or odd. If it's odd, the XOR would be 1; if even, XOR would give 0.
count_bits(mask & result) % 2 != 0
mask & result can be simplified to simply result. You don't need to AND it with mask again. The % 2 != 0 can be alternately written as & 1.
count_bits(result) & 1
As far as how to count bits, the Bit Twiddling Hacks web page gives a number of bit counting algorithms.
Counting bits set, Brian Kernighan's way
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
v &= v - 1; // clear the least significant bit set
}
Brian Kernighan's method goes through as many iterations as there are
set bits. So if we have a 32-bit word with only the high bit set, then
it will only go once through the loop.
If you were to use that implementation, you could optimize it a bit further. If you think about it, you don't need the full count of bits. You only need to track their parity. Instead of counting bits you could just flip c each iteration.
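unsigned bit_parity(unsigned v) {
    unsigned c;
    for (c = 0; v; c ^= 1) {
        v &= v - 1; // clear the least significant set bit
    }
    return c; // 1 if an odd number of bits were set, 0 otherwise
}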
(Thanks to Slava for the suggestion.)
Since XOR with 0 doesn't change anything, it's OK to apply the mask and then unconditionally XOR all bits together, which can be done in a parallel-prefix way. So something like this (not tested):
x = m & v;
x ^= x >> 16;
x ^= x >> 8;
x ^= x >> 4;
x ^= x >> 2;
x ^= x >> 1;
result = x & 1;
You can use more (or fewer) steps as needed, this is for 32 bits.
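For the single byte in the question, for instance, three folding steps are enough. A small sketch, with a function name made up for this example:

```cpp
#include <cstdint>

// XOR-fold for one byte: shifts of 4, 2 and 1 cover all 8 bits.
inline unsigned masked_parity8(std::uint8_t v, std::uint8_t m)
{
    unsigned x = v & m;  // bits outside the mask become 0 and drop out of the XOR
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;       // bit 0 now holds the XOR of all masked bits
}
```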
One significant issue to be aware of if using v &= v - 1 in the main body of your code is it will change the value of v to 0 in conducting the count. With other methods, the value is changed to the number of 1's. While count logic is generally wrapped as a function, where that is no longer a concern, if you are required to present your counting logic in the main body of your code, you must preserve a copy of v if that value is needed again.
In addition to the other two methods presented, the following is another favorite from bit-twiddling hacks that generally has a bit better performance than the loop method for larger numbers:
/* get the population 1's in the binary representation of a number */
unsigned getn1s (unsigned int v)
{
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
v = (v + (v >> 4)) & 0x0F0F0F0F;
v = v + (v << 8);
v = v + (v << 16);
return v >> 24;
}

How to manipulate and represent binary numbers in C++

I'm currently trying to build a lookup table for a Huffman tree using a pretty simple preorder traversal algorithm, but I'm getting stuck carrying out very basic bitwise operations. The pseudocode follows:
void preOrder(huffNode *node, int bit) //not sure how to represent bit
{
if (node == NULL)
return;
(1) bit = bit + 0; //I basically want to add a 0 onto this number (01 would go to 010)
preOrder(node->getLeft(), bit);
(2) bit = bit - 0 + 1; //This should subtract the last 0 and add a 1 (010 would go to 011)
preOrder(node->getRight());
}
I'm getting quite confused about how to carry out the operations defined on lines (1) and (2)
What data type does one use to represent and print binary numbers? In the above example I have the number represented as an int, but I'm pretty sure that is incorrect. Also, how do you add or subtract values? I understand how & and | logic works, but I'm getting confused as to how one carries out these sorts of operations in code.
Could anyone post some very simple examples?
Here's some basic examples of binary operations. I've used mostly in-place operations here.
int bit = 0x02; // 0010
bit |= 1; // OR 0001 -> 0011
bit ^= 1; // XOR 0001 -> 0010
bit ^= 7; // XOR 0111 -> 0101
bit &= 14; // AND 1110 -> 0100
bit <<= 1; // LSHIFT 1 -> 1000
bit >>= 2; // RSHIFT 2 -> 0010
bit = ~bit; // COMPLEMENT -> 1101
If you want to print a binary number you need to do it yourself... Here's one slightly inefficient, but moderately readable, way to do it:
char bitstr[33] = {0};
for( int b = 0; b < 32; b++ ) {
if( bit & (1u << (31-b)) ) // 1u: shifting a signed 1 into bit 31 would overflow
bitstr[b] = '1';
else
bitstr[b] = '0';
}
printf( "%s\n", bitstr );
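As an alternative to the hand-written loop, the standard library's std::bitset can produce the same kind of string. A minimal sketch, with a helper name invented for this example and a fixed width of 32 bits assumed:

```cpp
#include <bitset>
#include <string>

// Render the low 32 bits of a value as '0'/'1' characters,
// most significant bit first.
inline std::string to_binary32(unsigned long value)
{
    return std::bitset<32>(value).to_string();
}
```

The result is always 32 characters long, so leading zeros are kept.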
[edit] If I wanted faster code, I might pre-generate (or hardcode) a lookup table with the 8-bit sequences for all numbers from 0-255.
// This turns a 32-bit integer into a binary string.
char lookup[256][9] = {
"00000000",
"00000001",
"00000010",
"00000011",
// ... etc (you don't want to do this by hand)
"11111111"
};
char * lolo = lookup[val & 0xff];
char * lohi = lookup[(val>>8) & 0xff];
char * hilo = lookup[(val>>16) & 0xff];
char * hihi = lookup[(val>>24) & 0xff];
// This part is maybe a bit lazy =)
char bitstr[33];
sprintf( bitstr, "%s%s%s%s", hihi, hilo, lohi, lolo );
Instead, you could do this:
char *bits = bitstr;
while( *hihi ) *bits++ = *hihi++;
while( *hilo ) *bits++ = *hilo++;
while( *lohi ) *bits++ = *lohi++;
while( *lolo ) *bits++ = *lolo++;
*bits = 0;
Or just unroll the whole thing. ;-)
char bitstr[33] = {
hihi[0], hihi[1], hihi[2], hihi[3], hihi[4], hihi[5], hihi[6], hihi[7],
hilo[0], hilo[1], hilo[2], hilo[3], hilo[4], hilo[5], hilo[6], hilo[7],
lohi[0], lohi[1], lohi[2], lohi[3], lohi[4], lohi[5], lohi[6], lohi[7],
lolo[0], lolo[1], lolo[2], lolo[3], lolo[4], lolo[5], lolo[6], lolo[7],
0 };
Of course, those 8 bytes in the lookup are the same length as a 64-bit integer... So what about this? Much faster than all that pointless meandering through character arrays.
char bitstr[33];
__int64 * intbits = (__int64*)bitstr;
intbits[0] = *(__int64*)lookup[(val >> 24) & 0xff];
intbits[1] = *(__int64*)lookup[(val >> 16) & 0xff];
intbits[2] = *(__int64*)lookup[(val >> 8) & 0xff];
intbits[3] = *(__int64*)lookup[val & 0xff];
bitstr[32] = 0;
Naturally, in the above code you would represent your lookup values as int64 instead of strings.
Anyway, just pointing out that you can write it however is appropriate for your purposes. If you need to optimize, things get fun, but for most practical applications such optimizations are negligible or pointless.
Unless your binary sequences will get longer than the number of bits in an int, you can just use an int.
To add a 0 to the end of the current representation of a, you can use
a << 1
To replace a 0 at the end of the current representation of a with a 1, you can use
a ^= 1
Note that to use an int in this way, you will also need to keep track of where in the int your bits start, so that if you have e.g., the value 0x0, you can know which of 0, 00, 000, ... it is.
Operations in your code:
(1) bit = bit << 1;
(2) bit = bit|1;
However, you must also keep the length of the sequence.
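Putting the two operations and the length together, a traversal could be sketched as follows. The Node type, the function name and the std::map output are assumptions made for this example, since the question's huffNode class is not shown.

```cpp
#include <map>
#include <string>

// Hypothetical node type for this sketch.
struct Node {
    char symbol;        // meaningful only for leaves
    const Node* left;
    const Node* right;
};

// Preorder walk. `code` holds the path so far (0 = left, 1 = right) and
// `length` says how many of its low bits are in use, so that 0, 00 and 000
// stay distinguishable.
inline void build_table(const Node* node, unsigned code, int length,
                        std::map<char, std::string>& table)
{
    if (node == nullptr)
        return;
    if (node->left == nullptr && node->right == nullptr) {
        std::string bits;
        for (int i = length - 1; i >= 0; --i)
            bits += ((code >> i) & 1u) ? '1' : '0';
        table[node->symbol] = bits;
        return;
    }
    build_table(node->left,  code << 1,        length + 1, table);  // append a 0
    build_table(node->right, (code << 1) | 1u, length + 1, table);  // append a 1
}
```

Passing code and length by value means each recursive call gets its own copy, so nothing has to be undone when the traversal returns from the left subtree.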
If the length of an int is good enough for you, there's no reason not to use it. However, in the Huffman algorithm it really depends on the data. C++ programmers should use boost::dynamic_bitset for bit sequences of arbitrary length. It also supports the bit operations above. http://www.boost.org/doc/libs/1_42_0/libs/dynamic_bitset/dynamic_bitset.html

Question from bit twiddling site

Here is the code:
unsigned int v; // word value to compute the parity of
v ^= v >> 16;
v ^= v >> 8;
v ^= v >> 4;
v &= 0xf;
return (0x6996 >> v) & 1;
It computes the parity of given word, v. What is the meaning of 0x6996?
The number 0x6996 in binary is 0110 1001 1001 0110.
The first four lines convert v to a 4-bit number (0 to 15) that has the same parity as the original. The 16-bit number 0x6996 contains the parity of all the numbers from 0 to 15, and the right-shift is used to select the correct bit. It is similar to using a lookup table:
//This array contains the parity of the numbers 0 to 15
char parities[16] = {0,1,1,0,1,0,0,1,1,0,0,1,0,1,1,0};
return parities[v];
Note that the array entries are the same as the bits of 0x6996. Using (0x6996 >> v) & 1 gives the same result, but doesn't require the memory access.
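One way to convince yourself of this is to rebuild the constant from the definition of parity. A small sketch, with a function name invented for this example:

```cpp
// Build the 16-bit constant whose bit v is the parity of v, for v = 0..15.
inline unsigned build_parity_constant()
{
    unsigned magic = 0;
    for (unsigned v = 0; v < 16; ++v) {
        unsigned parity = 0;
        for (unsigned bits = v; bits != 0; bits >>= 1)
            parity ^= bits & 1u;   // XOR the bits of v together
        magic |= parity << v;      // store the result at bit position v
    }
    return magic;
}
```

The value it returns is 0x6996, the same constant used in the bit hack.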
Well, the algorithm compresses the 32-bit int into a 4-bit value of the same parity by successive bitwise XORs, then ANDs with 0xf so that only the least-significant 4 bits can be set. In other words, after line 5, v will be an int between 0 and 15 inclusive.
It then shifts that magic number (0x6996) to the right by this 0-15 value and returns only the least significant bit (& 1).
That means that if there is a 1 in the v bit position of 0x6996 then the computed parity bit is 1, otherwise it's 0. For example, if in line 5 v is calculated as 2 then 1 is returned; if it was 3 then 0 would be returned.