How to implement bit vectors with bitwise operations? - c++

I am studying a question in the book Programming Pearls, and they recommended this function to set a bit in a bit vector. I'm a bit confused as to what it does.
#define BITSPERWORD 32
#define MASK 0x1F
#define SHIFT 5
#define N 1000000
int a[1 + N/BITSPERWORD];
void set(int i) {
    a[i >> SHIFT] |= (1 << (i & MASK));
}
Here is my (probably wrong) interpretation of this code.
if i = 64,
1) first, it takes i and shifts it to the right by SHIFT (which is 5) bits. This is equivalent to DIVIDING (not multiplying, as I first thought) i by 2^5. So if i is 64, the index of a is 2 (64 / 2^5)
2) a[2] |= (1 << (64 & MASK))
64 & 1 = 1000000 & 01 = 1000001.
So 1 gets left shifted how many bits????

The way this method works (even though I feel like there are better ways to set a bit) is this: to find the word holding the ith bit, it essentially divides i by 32, because that is the number of bits per word.
Since the operator used here is |, the function is setting the bit to one, not toggling it.
0x1F is actually 31, and when ANDed with i you get the remainder of dividing i by 32 (not sure why they just didn't use %).
And lastly the shift takes the 1 to the proper location, and it's ORed into the right slot in the vector.
If you are planning to use this code, you could write it a lot more clearly without the defines, using more obvious ways of doing it; I doubt it would make a difference in speed.
Also, you should probably just use std::bitset.
The use of the mask to get the remainder particularly annoyed me, because I'm pretty sure it would not necessarily work for every number; 31 happens to work because it's all 1's.
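For example, a clearer version might look like this (a minimal sketch; the name setBit and the std::bitset line are my own illustration, not from the book):

#include <bitset>

const int N = 1000000;
const int BITSPERWORD = 32;
int a[1 + N / BITSPERWORD];

// Same operation as the book's set(), spelled with / and % instead of shifts and masks.
void setBit(int i) {
    a[i / BITSPERWORD] |= 1 << (i % BITSPERWORD);
}

// Or let the standard library manage the words and masks entirely:
std::bitset<N> bits;   // bits.set(i) sets the ith bit

To finish the worked example from the question: for i = 64, 64 >> 5 is 2 and 64 & 0x1F is 0 (not 1), so the 1 is shifted left by 0 bits and bit 0 of a[2] is set.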

Related

Keep every n-th bit and collapse them into the least significant bits

I have a 32-bit integer that I treat as a bitfield. I'm interested in the value of the bits with an index of the form 3n, where n ranges from 0 to 6 (every third bit between 0 and 18). I'm not interested in the bits with an index of the form 3n+1 or 3n+2.
I can easily use the bitwise AND operator to keep the bits I'm interested in and set all the other bits to zero.
I would also need to "pack" the bits I'm interested in into the 7 least significant bit positions. So the bit at position 0 stays at position 0, but the bit at position 3 moves to position 1, the bit at position 6 moves to position 2, and so on.
I would like to do this in an efficient way, ideally without using a loop. Is there a combination of operations I could apply to an integer to achieve this?
Since we're only talking about integer arithmetic here, I don't think the programming language I plan to use matters. But if you need to know:
I'm gonna use JavaScript.
If the order of the bits is not important, they can be packed into bits 0-6 like this:
function packbits(a)
{
    // mask out the bits we're not interested in:
    var b = a & 299593; // 1001001001001001001 in binary
    // pack into the lower 7 bits:
    return (b | (b >> 8) | (b >> 13)) & 127;
}
If the initial bit ordering is like this:
bit 31                     bit 0
xxxxxxxxxxxxxGxxFxxExxDxxCxxBxxA
Then the packed ordering is like this:
bit 7 bit 0
0CGEBFDA
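For reference, a C++ rendering of the same masking and shifting, with a small check (the test value and expected result are my own worked example, not from the answer):

#include <cassert>
#include <cstdint>

// Same idea as the JavaScript packbits() above.
uint32_t packbits(uint32_t a) {
    uint32_t b = a & 0x49249u;   // 1001001001001001001 in binary (299593)
    return (b | (b >> 8) | (b >> 13)) & 127u;
}

int main() {
    // A (bit 0), C (bit 6) and E (bit 12) set -> packed result has bits 0 (A), 6 (C) and 4 (E) set
    assert(packbits((1u << 0) | (1u << 6) | (1u << 12)) == 81u);   // 81 == 0b1010001
    return 0;
}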

How to set the highest-valued 1 bit to 0, preferably in C++ [duplicate]

If, for example, I have the number 20:
0001 0100
I want to set the highest valued 1 bit, the left-most, to 0.
So
0001 0100
will become
0000 0100
I was wondering which is the most efficient way to achieve this.
Preferably in C++.
I tried subtracting the largest power of two from the original number, like this:
unsigned long long int originalNumber;
unsigned long long int x=originalNumber;
x--;
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x++;
x >>= 1;
originalNumber ^= x;
but I need something more efficient.
The tricky part is finding the most significant bit, or counting the number of leading zeroes. Everything else can be done more or less trivially by left-shifting 1 (by one less), subtracting 1 followed by negation (building an inverse mask), and the & operator.
The well-known bit hacks site has several implementations for the problem of finding the most significant bit, but it is also worth looking into compiler intrinsics, as all mainstream compilers have an intrinsic for this purpose, which they implement as efficiently as the target architecture will allow (I tested this a few years ago using GCC on x86; it came out as a single instruction). Which is fastest is impossible to tell without profiling on your target architecture (fewer lines of code, or fewer assembly instructions, are not always faster!), but it is a fair assumption that compilers implement these intrinsics no worse than you'll be able to, and likely faster.
Using an intrinsic with a somewhat intelligible name may also turn out easier to comprehend than some bit hack when you look at it 5 years from now.
Unluckily, although not an entirely uncommon thing, this is not a standardized function which you'd expect to find in the C or C++ libraries; at least, there is no standard function that I'm aware of.
For GCC, you're looking for __builtin_clz, VisualStudio calls it _BitScanReverse, and Intel's compiler calls it _bit_scan_reverse.
Alternatively to counting leading zeroes, you may look into what the same Bit Twiddling site has under "Round up to the next power of two", which you would only need to follow up with a right shift by 1 and an AND with the inverse of that mask. Note that the 5-step implementation given on the site is for 32-bit integers; for 64-bit wide values you would need one extra shift (by 32).
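As an illustration, here is a minimal sketch built on the GCC/Clang builtin named above (it assumes the input is non-zero, since the builtin is undefined for 0):

#include <cstdint>

// Clear the most significant set bit of a non-zero 64-bit value.
uint64_t clearHighestBit(uint64_t v) {
    unsigned msb = 63u - (unsigned)__builtin_clzll(v);   // index of the highest set bit
    return v & ~(1ULL << msb);                           // build an inverse mask and AND it in
}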
#include <limits.h>
#include <stdint.h>

uint32_t unsetHighestBit(uint32_t val) {
    // i must be signed: an unsigned counter would never drop below 0 and the loop would not terminate
    for (int i = sizeof(uint32_t) * CHAR_BIT - 1; i >= 0; i--) {
        if (val & (1u << i)) {
            val &= ~(1u << i);
            break;
        }
    }
    return val;
}
Explanation
Here we take the size of the type uint32_t, which is 4 bytes. Each byte has 8 bits, so we iterate 32 times starting with i having values 31 to 0.
In each iteration we shift the value 1 by i to the left and then bitwise-and (&) it with our value. If this returns a value != 0, the bit at i is set. Once we find a bit that is set, we bitwise-and (&) our initial value with the bitwise negation (~) of the bit that is set.
For example if we have the number 44, its binary representation would be 0010 1100. The first set bit that we find is bit 5, resulting in the mask 0010 0000. The bitwise negation of this mask is 1101 1111. Now when bitwise and-ing & the initial value with this mask, we get the value 0000 1100.
In C++ with templates
This is an example of how this can be solved in C++ using a template:
#include <limits>

template<typename T> T unsetHighestBit(T val) {
    // note: numeric_limits<char>::digits is 7 when char is signed, so use unsigned char;
    // the counter is signed so it can drop below 0, and the 1 is widened to T so the
    // shift also works for types wider than int
    for (int i = sizeof(T) * std::numeric_limits<unsigned char>::digits - 1; i >= 0; i--) {
        if (val & (T(1) << i)) {
            val &= ~(T(1) << i);
            break;
        }
    }
    return val;
}
If you're constrained to 8 bits (as in your example), then just precalculate all possible values in an array (byte[256]) using any algorithm, or just type it in by hand.
Then you just look up the desired value:
x = lookup[originalNumber]
Can't be much faster than that. :-)
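A sketch of how such a table might be filled at startup (the array and function names are my own illustration, not from the answer):

#include <cstdint>

uint8_t clearMsbTable[256];

// Entry i holds i with its highest set bit cleared (entry 0 stays 0).
void buildClearMsbTable() {
    clearMsbTable[0] = 0;
    for (int i = 1; i < 256; ++i) {
        int msb = 7;
        while (!(i & (1 << msb))) --msb;   // find the highest set bit of i
        clearMsbTable[i] = static_cast<uint8_t>(i & ~(1 << msb));
    }
}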
UPDATE: so I read the question wrong.
But if using 64-bit values, then break it apart into 8 bytes, maybe by casting it to a byte[8], overlaying it in a union, or something more clever. After that, find the first (most significant) byte which is not zero and do as in my answer above with that particular byte. Not as efficient, I'm afraid, but still it is at most 8 tests (and on average 4.5) plus one lookup.
Of course, creating a byte[65536] lookup will double the speed.
The following code will turn off the highest (left-most) set bit:
bool found = false;
unsigned bit;
int bitCounter = 31;
while (!found) {
    bit = x & (1u << bitCounter);   // 1u: shifting a signed 1 into bit 31 would be undefined
    if (bit != 0) {
        x &= ~(1u << bitCounter);
        found = true;
    }
    else if (bitCounter == 0)
        found = true;
    else
        bitCounter--;
}
I know a method to set the right-most non-zero bit to 0:
a & (a - 1)
It is from the book: Warren H. S., Jr. - Hacker's Delight.
You can reverse your bits, set the right-most bit to zero, and reverse back. But I do not know an efficient way to reverse the bits in your case.
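For illustration (my own worked example, not from the book):

#include <cstdint>

// a & (a - 1) clears the lowest set bit: 44 (0b101100) becomes 40 (0b101000).
uint32_t clearLowestBit(uint32_t a) {
    return a & (a - 1);
}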

Finding the dominating bit

I'm trying to determine if a bitstring, say 64 bit long, is at least 50% ones. I've searched around and looked at the great http://graphics.stanford.edu/~seander/bithacks.html, but I haven't found anything specifically for this problem.
I can split the string up into 8-bit chunks, pre-calculate the number of 1s in each, and then find the result in 8 lookups and 7 additions.
Example of bytewise approach:
10001000 10000010 00111001 00001111 01011010 11001100 00001111 11110111
2 + 2 + 4 + 4 + 4 + 4 + 4 + 7 = 31
hence 0 dominates.
I just feel like there must be a better way given I just want to find the dominator. Maybe I'm just using the wrong name?
You can use the divide and conquer solution here, which is easily adapted to 64-bit. Or maybe just a popcnt instruction, depending on your hardware. Then you just check whether that count is less than 32: if so, 0s dominate, otherwise 1s dominate.
The code from the link adapted to 64-bit and with the domination logic inserted:
(I've shifted right by an extra 5 bits at the end, so that the same shift also checks whether the number of set bits is greater than 31.)
int AtLeastHalfOnes(unsigned long long i) {   // unsigned: avoids implementation-defined shifts and signed overflow
    i = i - ((i >> 1) & 0x5555555555555555ULL);
    i = (i & 0x3333333333333333ULL) + ((i >> 2) & 0x3333333333333333ULL);
    return (((i + (i >> 4)) & 0x0F0F0F0F0F0F0F0FULL) * 0x0101010101010101ULL) >> 61;
}
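For example, applying it to the 64-bit string from the question (31 one-bits; the hexadecimal constant is my own transcription of that bit pattern):

#include <cstdio>

int main() {
    // 10001000 10000010 00111001 00001111 01011010 11001100 00001111 11110111
    unsigned long long x = 0x8882390F5ACC0FF7ULL;
    std::printf("%d\n", AtLeastHalfOnes(x));   // prints 0: fewer than 32 bits are set, so 0s dominate
    return 0;
}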
I think it is better to use a stack data structure. When your input bit is 1, push(1); otherwise pop() from the top of your stack. Finally, if your stack is not empty, I think your problem is solved.

Sieve of Eratosthenes: bitwise optimized

After searching the net I came to know that the bitwise version of the Sieve of Eratosthenes is pretty efficient.
The problem is I am unable to understand the math/method it is using.
The version that I have been busy with looks like this:
#define MAX 100000000
#define LIM 10000
unsigned flag[MAX>>6]={0};
#define ifc(n) (flag[n>>6]&(1<<((n>>1)&31))) //LINE 1
#define isc(n) (flag[n>>6]|=(1<<((n>>1)&31))) //LINE 2
void sieve() {
    unsigned i, j, k;
    for (i = 3; i < LIM; i += 2)
        if (!ifc(i))
            for (j = i*i, k = i << 1; j < LIM*LIM; j += k)
                isc(j);
}
Points that I understood (Please correct me if I am wrong):
Statement in line 1 checks if the number is composite.
Statement in line 2 marks the number 'n' as composite.
The program is storing the value 0 or 1 at a bit of an int. This tends to reduce the memory usage to x/32. (x is the size that would have been used had an int been used to store the 0 or 1 instead of a bit like in my solution above)
Points that are going over my head as of now:
How does the macro in LINE 1 function? How does it determine whether the number is composite or not?
How does the macro in LINE 2 set the bit?
I also came to know that the bitwise sieve is time-efficient as well. Is it only because of the use of bitwise operators, or is something else contributing to it as well?
Any ideas or suggestions?
Technically, there is a bug in the code as well:
unsigned flag[MAX>>6]={0};
divides MAX by 64, but if MAX is not an exact multiple of 64, the array is one element short.
Line 1: Let's pick it apart:
(flag[n>>6]&(1<<((n>>1)&31)))
The flag[n>>6] (n >> 6 = n / 64) gives the 32-bit integer that holds the bit value for n / 2.
Since only "Odd" numbers are possible primes, divide n by two: (n>>1).
The 1<<((n>>1)&31) gives us the bit corresponding to n/2 within the range 0..31 (& 31 makes sure that it's "in range").
Finally, use & to combine the value on the left with the value on the right.
So, the result is nonzero if the element for n has bit number (n/2) modulo 32 set.
The second line is essentially the same concept, just that it uses |= (or equal) to set the bit corresponding to the multiple.
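Putting it together: once sieve() has run, the same macro can answer primality queries for numbers below MAX. A minimal sketch (the isPrime wrapper is my own addition, not part of the original code):

// Assumes the flag array, the ifc macro and sieve() from the question are in scope
// and that sieve() has already been called.
int isPrime(unsigned n) {
    if (n < 2)  return 0;
    if (n == 2) return 1;
    if (!(n & 1)) return 0;   // even numbers greater than 2 are composite
    return !ifc(n);           // an odd n is prime iff its bit was never set
}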

Swapping pairs of bits in a byte

I have an arbitrary 8-bit binary number e.g., 11101101
I have to swap all the pairs of bits, like:
Before swapping: 11-10-11-01
After swapping: 11-01-11-10
I was asked this in an interview !
In pseudo-code:
x = ((x & 0b10101010) >> 1) | ((x & 0b01010101) << 1)
It works by handling the low bits and high bits of each bit-pair separately and then combining the result:
The expression x & 0b10101010 extracts the high bit from each pair, and then >> 1 shifts it to the low bit position.
Similarly the expression (x & 0b01010101) << 1 extracts the low bit from each pair and shifts it to the high bit position.
The two parts are then combined using bitwise-OR.
Since not all languages allow you to write binary literals directly, you could write them in for example hexadecimal:
Binary      Hexadecimal  Decimal
0b10101010  0xaa         170
0b01010101  0x55         85
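As a concrete C++ sketch for a single byte, using those hexadecimal masks (the function name is mine):

#include <cstdint>

// Swap each adjacent pair of bits in a byte: 11-10-11-01 becomes 11-01-11-10.
uint8_t swapBitPairs(uint8_t x) {
    return static_cast<uint8_t>(((x & 0xAA) >> 1) | ((x & 0x55) << 1));
}
// e.g. swapBitPairs(0b11101101) == 0b11011110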
Make two bit masks, one containing all the even bits and one containing the uneven bits (10101010 and 01010101).
Use bitwise-and to filter the input into two numbers, one having all the even bits zeroed, the other having all the uneven bits zeroed.
Shift the number that contains only even bits one bit to the left, and the other one one bit to the right
Use bitwise-or to combine them back together.
Example for 16 bits (not actual code):
short swap_bit_pair(short i) {
    return ((i & 0b1010101010101010) >> 1) | ((i & 0b0101010101010101) << 1);
}
b = ((a & 170) >> 1) | ((a & 85) << 1)
The most elegant and flexible solution is, as others have said, to apply a 'comb' mask to both the even and odd bits separately and then, having shifted them left and right respectively by one place, combine them using bitwise OR.
One other solution you may want to think about takes advantage of the relatively small size of your datatype. You can create a look-up table of 256 values which is statically initialised to the value you want as output for each input:
const unsigned char lookup[] = { 0x00, 0x02, 0x01, 0x03, 0x08, 0x0A, 0x09, 0x0B ...
Each value is placed in the array to represent the transformation of the index. So if you then do this:
unsigned char out = lookup[ 0xAA ];
out will contain 0x55
This is more cumbersome and less flexible than the first approach (what if you want to move from 8 bits to 16?), but it does have the advantage that it will be measurably faster if you are performing a large number of these operations.
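If you'd rather not type the 256 entries by hand, the table can also be generated once at startup from the shift-and-mask expression; a sketch (array and function names are my own illustration):

#include <cstdint>

uint8_t pairSwapTable[256];

// Fill the table so that pairSwapTable[i] is i with each adjacent pair of bits swapped.
void buildPairSwapTable() {
    for (int i = 0; i < 256; ++i)
        pairSwapTable[i] = static_cast<uint8_t>(((i & 0xAA) >> 1) | ((i & 0x55) << 1));
}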
Suppose your number is num.
First, find the even-position bits:
num & 0xAAAAAAAA
Second, find the odd-position bits:
num & 0x55555555
Third, move the odd-position bits to the even positions and the even-position bits to the odd positions:
Even = (num & 0xAAAAAAAA) >> 1
Odd = (num & 0x55555555) << 1
Last step: result = Even | Odd
Print result
I would first code it 'longhand' - that is to say, in several obvious, explicit stages - and use that to validate that the unit tests I had in place were functioning correctly, and then only move to more esoteric bit-manipulation solutions if I had a need for performance (and if that extra performance was delivered by said improvements).
Code for people first, computers second.