Masking and shifting - C++

Should we first shift the value and then mask, or the other way around? And what is the risk in masking before shifting?
((loc_Task_value_avg >> 8) & 0x00FF)
OR
((loc_Task_value_avg & 0xFF00) >> 8)

Try working through these kinds of examples with real numbers. In this case, you'll find they do not always produce the same output.
We'll use two examples.
First, suppose loc_Task_value_avg is equal to 0x1234
((loc_Task_value_avg >> 8) & 0x00FF)
((0x1234 >> 8) & 0x00FF)
(0x0012 & 0x00FF)
0x0012
vs
((loc_Task_value_avg & 0xFF00) >> 8)
((0x1234 & 0xFF00) >> 8)
(0x0012 >> 8)
0x0012
The danger comes when we are using signed values. Let's use 0xFEDC, treating it as a signed 16-bit value (so the arithmetic stays in 16 bits).
((loc_Task_value_avg >> 8) & 0x00FF)
((0xFEDC >> 8) & 0x00FF)
(0xFFFE & 0x00FF)
0x00FE
vs
((loc_Task_value_avg & 0xFF00) >> 8)
((0xFEDC & 0xFF00) >> 8)
(0xFE00 >> 8)
0xFFFE
The reason we get two different outputs is that with signed values (two's complement), shifting right from the high-order bits toward the low-order bits may extend the sign bit into the result. Whether this happens depends on whether the compiler emits an arithmetic (signed) or logical (unsigned) shift instruction.

It depends on the size of the value you are shifting, the number of bits in the mask, and whether the underlying value is signed or unsigned.
A right shift by one is a divide by 2. On a signed value this means the sign bit will be preserved (because the underlying representation is almost certainly two's complement). If your shift is large enough to shift copied sign bits into the masked result, it will make a difference.
If the underlying value is unsigned, it doesn't matter whether you shift then mask or mask then shift.
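Here is a minimal sketch of that difference, assuming a 16-bit int16_t value and a compiler that implements >> on negative values as an arithmetic shift (implementation-defined, but near-universal). The cast keeps the mask-then-shift intermediate in 16 bits, matching the walk-through above:

#include <cstdint>
#include <cstdio>

int main() {
    int16_t v = (int16_t)0xFEDC;               // -292 in two's complement

    // Shift first, then mask: any sign bits copied in by the
    // arithmetic shift are cleared by the mask.
    int shiftThenMask = (v >> 8) & 0x00FF;     // 0x00FE

    // Mask first, then shift, keeping the intermediate in 16 bits:
    // 0xFE00 is still negative as an int16_t, so the arithmetic
    // shift drags the sign bit back in.
    int16_t masked = (int16_t)(v & 0xFF00);    // 0xFE00
    int maskThenShift = masked >> 8;           // 0xFFFE in 16 bits

    printf("%04X\n", shiftThenMask & 0xFFFF);  // prints 00FE
    printf("%04X\n", maskThenShift & 0xFFFF);  // prints FFFE
}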

Each order does something different.
Take the value with bits 1101.
If I mask off the second bit (the zero) and then shift right by one, I get the value 0.
On the other hand, if I shift right by one first and then apply the same mask for the second bit, I am now testing what was originally the third bit, which is 1.
It is important to clearly identify what exactly you intend to do, and then go about it from there.
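A minimal sketch of that pitfall (hypothetical variable names; note how the same mask selects different original bits depending on the order):

#include <cstdio>

int main() {
    unsigned x = 0xDu;                         // binary 1101
    unsigned maskThenShift = (x & 0x2u) >> 1;  // bit 1 of x is 0 -> 0
    unsigned shiftThenMask = (x >> 1) & 0x2u;  // now tests the original
                                               // bit 2, which is set -> 2
    printf("%u %u\n", maskThenShift, shiftThenMask);  // prints 0 2
}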

Related

Counting the number of one bits in a byte in O(1) time - C++ code

I've searched for an algorithm that counts the number of one bits in a byte with time complexity O(1),
and this is what I found on Google:
// C++ implementation of the approach
#include <bits/stdc++.h>
using namespace std;

int BitsSetTable256[256];

// Function to initialise the lookup table
void initialize()
{
    // To initially generate the
    // table algorithmically
    BitsSetTable256[0] = 0;
    for (int i = 0; i < 256; i++)
    {
        BitsSetTable256[i] = (i & 1) +
                             BitsSetTable256[i / 2];
    }
}

// Function to return the count
// of set bits in n
int countSetBits(int n)
{
    return (BitsSetTable256[n & 0xff] +
            BitsSetTable256[(n >> 8) & 0xff] +
            BitsSetTable256[(n >> 16) & 0xff] +
            BitsSetTable256[n >> 24]);
}

// Driver code
int main()
{
    // Initialise the lookup table
    initialize();
    int n = 9;
    cout << countSetBits(n);
}
I understand why I need an array of size 256 (in other words, the size of the lookup table) for indexing from 0 to 255, which are all the values a byte can represent!
But in the function initialize I didn't understand the expression inside the for loop:
BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
Why am I doing that?! I don't understand the purpose of this line inside the for loop.
In addition, the function countSetBits returns:
return (BitsSetTable256[n & 0xff] +
        BitsSetTable256[(n >> 8) & 0xff] +
        BitsSetTable256[(n >> 16) & 0xff] +
        BitsSetTable256[n >> 24]);
I didn't understand at all what this is doing, the bitwise AND with 0xff, or why there are right shifts.
Could anyone please explain the concept? I also didn't understand why, in countSetBits, at BitsSetTable256[n >> 24], we didn't AND with 0xff.
I understand why I need the lookup table of size 2^8, but could someone please explain the other lines I mentioned above in simple words? And what is the purpose of counting the number of ones in a byte?
Thanks a lot, guys!
Concerning the first part of the question:
// Function to initialise the lookup table
void initialize()
{
    // To initially generate the
    // table algorithmically
    BitsSetTable256[0] = 0;
    for (int i = 0; i < 256; i++)
    {
        BitsSetTable256[i] = (i & 1) +
                             BitsSetTable256[i / 2];
    }
}
This is a neat kind of recursion. (Please note I don't mean a "recursive function" but recursion in a more mathematical sense.)
The seed is BitsSetTable256[0] = 0;
Then every element is initialized using the (already computed) result for i / 2, adding 1 or 0 for the last bit. Thereby,
1 is added if the last bit of index i is 1,
0 is added if the last bit of index i is 0.
To get the value of the last bit of i, i & 1 is the usual C/C++ bit-mask trick.
Why is the result of BitsSetTable256[i / 2] a value to build upon?
The result of BitsSetTable256[i / 2] is the number of set bits in i, the last bit excluded.
Please note that i / 2 and i >> 1 (the bits shifted right by 1, whereby the least significant bit is dropped) are equivalent expressions (for non-negative numbers in the respective range, edge cases excluded).
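To make the recursion concrete, here is how the first few entries are filled (a sketch of the loop above, not code from the question):

BitsSetTable256[0] = 0
BitsSetTable256[1] = (1 & 1) + BitsSetTable256[0] = 1   // 1 = 0b1
BitsSetTable256[2] = (2 & 1) + BitsSetTable256[1] = 1   // 2 = 0b10
BitsSetTable256[3] = (3 & 1) + BitsSetTable256[1] = 2   // 3 = 0b11
BitsSetTable256[4] = (4 & 1) + BitsSetTable256[2] = 1   // 4 = 0b100
BitsSetTable256[5] = (5 & 1) + BitsSetTable256[2] = 2   // 5 = 0b101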
Concerning the other part of the question:
return (BitsSetTable256[n & 0xff] +
BitsSetTable256[(n >> 8) & 0xff] +
BitsSetTable256[(n >> 16) & 0xff] +
BitsSetTable256[n >> 24]);
n & 0xff masks out the upper bits, isolating the lower 8 bits.
(n >> 8) & 0xff shifts the value of n 8 bits to the right (whereby the 8 least significant bits are dropped) and then masks out the upper bits, isolating the lower 8 bits.
(n >> 16) & 0xff shifts the value of n 16 bits to the right (whereby the 16 least significant bits are dropped) and then masks out the upper bits, isolating the lower 8 bits.
n >> 24 shifts the value of n 24 bits to the right (whereby the 24 least significant bits are dropped), which should effectively make the upper 8 bits the lower 8 bits.
Assuming that int and unsigned int have 32 bits on today's common platforms, this covers all bits of n.
Please note that the right shift of a negative value is implementation-defined.
(I recalled Bitwise shift operators to be sure.)
So, a right shift of a negative value may fill all the upper bits with 1s.
That can break BitsSetTable256[n >> 24]: for negative n, n >> 24 then yields a negative index (rather than one in 0..255), making BitsSetTable256[n >> 24] an out-of-bounds access.
The better solution would've been:
return (BitsSetTable256[n & 0xff] +
BitsSetTable256[(n >> 8) & 0xff] +
BitsSetTable256[(n >> 16) & 0xff] +
BitsSetTable256[(n >> 24) & 0xff]);
BitsSetTable256[0] = 0;
...
BitsSetTable256[i] = (i & 1) +
                     BitsSetTable256[i / 2];
The above code seeds the look-up table so that each index holds the number of one bits in the number used as the index. It works as follows:
(i & 1) gives 1 for odd numbers, otherwise 0.
An even number has as many binary 1s as that number divided by 2.
An odd number has one more binary 1 than that number divided by 2.
Examples:
If i==8 (1000b), then (i & 1) + BitsSetTable256[i / 2] ->
0 + BitsSetTable256[8 / 2] = 0 + the value at index 4 (0100b) = 0 + 1.
If i==7 (0111b), then 1 + BitsSetTable256[7 / 2] = 1 + BitsSetTable256[3] = 1 + the value at index 3 (0011b) = 1 + 2.
If you want some formal mathematical proof why this is so, then I'm not the right person to ask, I'd poke one of the math sites for that.
As for the shift part, it's just the normal way of splitting up a 32-bit value into 4 x 8 bits, portably and without caring about endianness (any other method of doing that is highly questionable). If we un-sloppify the code, we get this:
BitsSetTable256[(n >> 0) & 0xFFu] +
BitsSetTable256[(n >> 8) & 0xFFu] +
BitsSetTable256[(n >> 16) & 0xFFu] +
BitsSetTable256[(n >> 24) & 0xFFu] ;
Each byte is shifted into the LS byte position, then masked out with a & 0xFFu byte mask.
Using bit shifts on a plain int is, however, a code smell and potentially buggy. To avoid poorly defined behavior, you need to change the function to this:
#include <stdint.h>
uint32_t countSetBits (uint32_t n);
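A minimal sketch of the reworked function body, reusing the BitsSetTable256 table from above. With unsigned arithmetic every shift is well defined, and the final byte needs no mask because the shift fills with zeros:

#include <stdint.h>

uint32_t countSetBits(uint32_t n)
{
    return (uint32_t)BitsSetTable256[n & 0xFFu]
         + (uint32_t)BitsSetTable256[(n >> 8) & 0xFFu]
         + (uint32_t)BitsSetTable256[(n >> 16) & 0xFFu]
         + (uint32_t)BitsSetTable256[n >> 24];   // always within 0..255
}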
The code in countSetBits takes an int as an argument; apparently 32 bits are assumed. The implementation extracts four single bytes from n by shifting and masking; for these four separated bytes, the lookup table is used, and the per-byte bit counts are added to yield the result.
The initialization of the lookup table is a bit more tricky and can be seen as a form of dynamic programming. The entries are filled in order of increasing index. The first expression masks out the least significant bit and counts it; the second expression halves the argument (which could also be done by shifting). The halved argument is smaller, so it is correctly assumed that its value is already available in the lookup table.
For the access to the lookup table, consider the following example:
input value (contains 5 ones):
01010000 00000010 00000100 00010000
input value, shifting is not necessary
masked with 0xff (11111111)
00000000 00000000 00000000 00010000 (contains 1 one)
input value shifted by 8
00000000 01010000 00000010 00000100
and masked with 0xff (11111111)
00000000 00000000 00000000 00000100 (contains 1 one)
input value shifted by 16
00000000 00000000 01010000 00000010
and masked with 0xff (11111111)
00000000 00000000 00000000 00000010 (contains 1 one)
input value shifted by 24,
masking is not necessary
00000000 00000000 00000000 01010000 (contains 2 ones)
The extracted values have only the lowermost 8 bits set, which means the corresponding entries are available in the lookup table. The entries from the lookup table are added. The underlying idea is that the number of ones in the argument can be calculated byte-wise (in fact, any partition into bit strings would be suitable).
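As a quick check, the input value in the walk-through above is 0x50020410; a hypothetical driver using the question's initialize() and countSetBits() should print 5:

#include <cstdio>

int main()
{
    initialize();                      // fill the lookup table first
    int n = 0x50020410;                // the bit pattern shown above
    printf("%d\n", countSetBits(n));   // prints 5
}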

Split parts of a uint32_t hex value into smaller parts in C++

I have a uint32_t as follows:
uint32_t midiData=0x9FCC00;
I need to separate this uint32_t into smaller parts so that 9 becomes its own entity, F becomes its own entity, and CC becomes its own entity. If you're wondering what I am doing, I am trying to break up the parts of a MIDI message so that they are easier to manage in my program.
I found this solution, but the problem is I don't know how to apply it to the CC section, and I am not sure this method works in C++.
Here is what I have so far:
uint32_t midiData=0x9FCC00;
uint32_t status = 0x0FFFFF & midiData; // Retrieve 9
uint32_t channel = (0xF0FFFF & midiData)>>4; //Retrieve F
uint32_t note = (0xFF00FF & midiData) >> 8; //Retrieve CC
Is this correct for C++? The reason I ask is that I have never used C++ before, and its syntax using >> and << has always confused me (which is why I tend to avoid it).
You can use bit shift operator >> and bit masking operator & in C++ as well.
There are, however, some issues on how you use it:
The operator v1 & v2 gives a number built from the bits that are set in both v1 and v2; for example, 0x12 & 0xF0 gives 0x10, not 0x02. Further, the bit shift operator takes a number of bits, and a single digit of a hex number (usually called a nibble) consists of 4 bits (0x0..0xF requires 4 bits). So, if you have 0x12 and want to get 0x01, you have to write 0x12 >> 4.
Hence, your shifts need to be adapted, too:
#define BITS_OF_A_NIBBLE 4
unsigned char status = (midiData & 0x00F00000) >> (5*BITS_OF_A_NIBBLE);
unsigned char channel = (midiData & 0x000F0000) >> (4*BITS_OF_A_NIBBLE);
unsigned char note = (midiData & 0x0000FF00) >> (2*BITS_OF_A_NIBBLE);
unsigned char theRest = (midiData & 0x000000FF);
You have it backwards, in a way.
In boolean logic (the & is a bitwise-AND), ANDing something with 0 will exclude it. Knowing that F in hex is 1111 in binary, a line like 0x9FCC00 & 0x0FFFFF will give you all the hex digits EXCEPT the 9, the opposite of what you want.
So, for status:
uint32_t status = 0xF00000 & midiData; // Retrieve 9
Actually, this will give you 0x900000. If you want 0x9 (also 9 in decimal), you need to bitshift the result over.
Now, the right bitshift operator (say, X >> 4) means move X 4 bits to the right; dividing by 16. That is 4 bits, not 4 hex digits. 1 hex digit == 4 bits, so to get 9 from 0x900000, you need 0x900000 >> 20.
So, to put them together, to get a status of 9:
uint32_t status = (0xF00000 & midiData) >> 20;
A similar process will get you the remaining values you want.
In general I'd recommend shift first, then mask - it's less error prone:
uint8_t cmd = (midiData >> 16) & 0xff;
uint8_t note = (midiData >> 8) & 0x7f; // MSB can't be set
uint8_t velocity = (midiData >> 0) & 0x7f; // ditto
and then split the cmd variable:
uint8_t status = (cmd & 0xf0); // range 0x00 .. 0xf0
uint8_t channel = (cmd & 0x0f); // range 0 .. 15
I personally wouldn't bother mapping the status value back into the range 0 .. 15 - it's commonly understood that e.g. 0x90 is a "note on", and not the plain value 9.
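As a sketch, here is the shift-first approach applied to the question's value. (Note that 0xCC has its top bit set, which a real MIDI data byte would not, so the 0x7f mask clears it here.)

#include <cstdint>
#include <cstdio>

int main() {
    uint32_t midiData = 0x9FCC00u;
    uint8_t cmd      = (midiData >> 16) & 0xff;  // 0x9F
    uint8_t note     = (midiData >> 8) & 0x7f;   // 0x4C (0xCC with MSB cleared)
    uint8_t velocity = (midiData >> 0) & 0x7f;   // 0x00
    uint8_t status   = cmd & 0xf0;               // 0x90: note on
    uint8_t channel  = cmd & 0x0f;               // 0x0F: channel 15
    printf("status=%02X channel=%u note=%02X velocity=%02X\n",
           status, channel, note, velocity);
}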

Shifting more than 8 bits - leading to wrong output

I am trying to perform this operation, and I'm getting the wrong output.
signed char temp3[3] = {0x0D, 0xFF, 0xC0};
double temp = ((temp3[0] & 0x03) << 10) | (temp3[1]) | ((temp3[2] & 0xC0) >> 6);
I am trying to form a 12-bit number: get the last 2 bits of 0x0D, all 8 bits of 0xFF, and the first 2 bits of 0xC0 to form the binary number 011111111111 = 2047. However, I am getting -1. When I break out the first mask and shift by 10, I get 0. I don't know if my problem is trying to shift an 8-bit char by 10 bits.
When bit twiddling, always use unsigned numbers.
Change the array to unsigned char.
Add the 'U' suffix to each constant, because each constant is a signed integer by default.
BTW, right shifting a negative value is implementation-defined for signed integers.
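A minimal sketch of that advice applied to the question's data. Note it also shifts the middle byte into place (<< 2), which the original expression was missing; the result should be 2047:

#include <cstdio>

int main() {
    unsigned char temp3[3] = {0x0DU, 0xFFU, 0xC0U};
    unsigned int temp = ((temp3[0] & 0x03U) << 10)   // top 2 bits
                      | ((unsigned)temp3[1] << 2)    // middle 8 bits
                      | ((temp3[2] & 0xC0U) >> 6);   // bottom 2 bits
    printf("%u\n", temp);                            // prints 2047
}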
There are a few things you need to address.
First up, C++ doesn't have 12-bit numbers; the closest built-in type is 16 bits, with the top bit representing the sign in two's-complement form.
You also need to be very careful about the type of the value you are shifting. In your example, the signed char elements are promoted to int before any shift, and a negative element such as 0xFF becomes -1, whose sign extension swamps the OR with all-ones.
The following example gives a correct implementation (for signed 12-bit numbers). There are no doubt more efficient ones.
// shift in top 2 bits
signed short test = static_cast<signed short>(temp3[0] & 0x03) << 10;
// shift in middle 8 bits
test |= (static_cast<signed short>(temp3[1]) << 2) & 0x03FC;
// right-shift, mask and append lower 2 bits
test |= (static_cast<signed short>(temp3[2]) >> 6) & 0x0003;
// sign extend top bits from 12 bits to 16 bits
test |= (temp3[0] & 0x02) == 0 ? 0x0000 : 0xF000;
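For reference, a self-contained version of that implementation with the question's data; assuming two's-complement signed char and arithmetic shifts on negative values (implementation-defined, and before C++20 technically undefined for <<, but what mainstream compilers do), it should print 2047:

#include <cstdio>

int main() {
    signed char temp3[3] = {0x0D, static_cast<signed char>(0xFF),
                            static_cast<signed char>(0xC0)};
    signed short test = static_cast<signed short>(temp3[0] & 0x03) << 10;
    test |= (static_cast<signed short>(temp3[1]) << 2) & 0x03FC;
    test |= (static_cast<signed short>(temp3[2]) >> 6) & 0x0003;
    test |= (temp3[0] & 0x02) == 0 ? 0x0000 : 0xF000;
    printf("%d\n", test);   // prints 2047
}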

What is this doing: "input >> 4 & 0x0F"?

I don't understand what this code is doing at all, could someone please explain it?
long input; //just here to show the type, assume it has a value stored
unsigned int output( input >> 4 & 0x0F );
Thanks
It bit-shifts the input 4 bits to the right, then masks off everything but the lower 4 bits.
Take this example 16 bit number: (the dots are just for visual separation)
1001.1111.1101.1001 >> 4 = 0000.1001.1111.1101
0000.1001.1111.1101 & 0x0F = 1101 (or 0000.0000.0000.1101 to be more explicit)
& is the bitwise AND operator. "& 0x0F" is often used to zero the upper bits and keep only the lowest (rightmost) 4 bits of a value.
0x0f = 00001111. So a bitwise & operation of 0x0f with any other bit pattern will retain only the rightmost 4 bits, clearing the left 4 bits.
If the input has a value of 01010001, after doing &0x0F, we'll get 00000001 - which is a pattern we get after clearing the left 4 bits.
Just as another example, this is a code I've used in a project:
Just as another example, here is some code I've used in a project:
Byte verflag = (Byte)(bIsAck & 0x0f) | ((version << 4) & 0xf0);
Here I'm combining two values into a single Byte value to save space, because it's used in a packet header structure. bIsAck is a BOOL and version is a Byte whose value is very small, so both values can be contained in a single Byte variable.
The first nibble of the result contains the value of version, and the second nibble contains the value of bIsAck. At the receiving end, I can retrieve the values into separate variables by shifting right 4 bits to take the value of version.
Hope this is somewhere near what you asked for.
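A small sketch of that round trip, with hypothetical stand-ins for the Byte and BOOL types and values:

#include <cstdint>
#include <cstdio>

int main() {
    uint8_t bIsAck  = 1;   // stand-in for the BOOL flag
    uint8_t version = 3;   // stand-in for the small version number

    // Pack: version in the high nibble, bIsAck in the low nibble.
    uint8_t verflag = (uint8_t)((bIsAck & 0x0f) | ((version << 4) & 0xf0));

    // Unpack on the receiving side.
    uint8_t gotVersion = (verflag >> 4) & 0x0f;
    uint8_t gotAck     = verflag & 0x0f;

    printf("%u %u\n", gotVersion, gotAck);   // prints 3 1
}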
That is doing a bitwise right shift of the contents of "input" by 4 bits, then doing a bitwise AND of the result with 0x0F (binary 1111).
What it does depends on the type and contents of "input". Here it is declared as a long, so the shift and bitwise AND are plain integer operations.
Google for "c++ bitwise operations" for more details on what's going on under the hood.
Additionally, look at C++ operator precedence because the C/C++ precedence is not exactly the same as in many other languages.
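On the precedence point, >> binds tighter than &, so the unparenthesized expression in the question already groups the intended way. A quick sketch using the 16-bit example value from earlier:

#include <cstdio>

int main() {
    long input = 0x9FD9;                   // 1001.1111.1101.1001
    unsigned int a = input >> 4 & 0x0F;    // parsed as (input >> 4) & 0x0F
    unsigned int b = (input >> 4) & 0x0F;  // explicit parentheses
    printf("%X %X\n", a, b);               // prints D D
}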

Find all the 2-bit values that match against another binary pattern and then sum them

First Value:
I have a binary value which is actually a compact series of 2-bit values. (That is, each 2 bits in the binary value represents 0, 1, 2, or 3.) So, for example, 0, 3, 1, 2 becomes 00110110. In this binary string, all I care about are the 3's (or alternately, I could flip the bits and only care about the 0's, if that makes your answer easier). All the other numbers are irrelevant (for reasons we'll get into in a bit).
Second Value:
I have a second binary value which is also a compacted series of 2-bit values represented the same way. It has an identical length to the First Value.
Math:
I want the sum of the 2-bit numbers in the Second Value that have the same position as a 3 from the First Value. In other words, if I have:
First: 11000011
Second: 01111101
Then my answer would be "2" (I added the first number and the last number from "Second" together, because those were the only ones that had a "11" in the First Value that matched them.)
I want to do this in as few clock cycles as possible (either on a GPU or on an x86 architecture). However, I'm generally looking for an algorithm, not an assembler solution. Is there any way faster than masking off two bits at a time from each number and running several loops?
Sure.
// the two numbers
unsigned int a;
unsigned int b;
Now create a mask from a that has a '1' in the low bit of each 2-bit field exactly where that field of a was '11':
unsigned int mask = a & (a >> 1) & 0x55555555;
Expand it to get the '11' pattern back:
mask = mask | (mask << 1);
So now if a was 1101100011, mask is 1100000011.
Then mask b with the mask:
b = b & mask;
You can then perform the addition of (masked) numbers from b in parallel:
b = (b & 0x33333333) + ((b & 0xcccccccc) >> 2);
b = (b & 0x0f0f0f0f) + ((b & 0xf0f0f0f0) >> 4);
b = (b & 0x00ff00ff) + ((b & 0xff00ff00) >> 8);
b = (b & 0x0000ffff) + ((b & 0xffff0000) >> 16);
For a 32-bit number, the sum is now in the lowest bits of b. This is a commonly known pattern for parallel addition of bit fields. For larger numbers, you would add one more round for 64-bit integers and two more rounds for 128-bit integers.
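Putting the whole answer together as a sketch (the function name sumMatchingFields is mine, and the driver uses the question's 8-bit example, zero-extended to 32 bits):

#include <cstdint>
#include <cstdio>

// Sum the 2-bit fields of b wherever the matching 2-bit field of a is 11.
uint32_t sumMatchingFields(uint32_t a, uint32_t b) {
    // Low bit of each 2-bit field is set only where that field of a is 11.
    uint32_t mask = a & (a >> 1) & 0x55555555u;
    mask |= mask << 1;                     // expand back to full '11' fields
    b &= mask;                             // keep only the selected fields
    // Parallel (tree) addition of the 2-bit fields.
    b = (b & 0x33333333u) + ((b & 0xccccccccu) >> 2);
    b = (b & 0x0f0f0f0fu) + ((b & 0xf0f0f0f0u) >> 4);
    b = (b & 0x00ff00ffu) + ((b & 0xff00ff00u) >> 8);
    b = (b & 0x0000ffffu) + ((b & 0xffff0000u) >> 16);
    return b;
}

int main() {
    // First: 11000011, Second: 01111101 -> expected sum 2.
    printf("%u\n", sumMatchingFields(0xC3u, 0x7Du));   // prints 2
}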