implementation of sets using bits - c++

I am reading about sets represented as bit vectors at the following location:
http://www.brpreiss.com/books/opus4/html/page395.html
class SetAsBitVector : public Set
{
    typedef unsigned int Word;
    enum { wordBits = bitsizeof (Word) };
    Array<Word> vector;
public:
    SetAsBitVector (unsigned int);
    // ...
};

SetAsBitVector::SetAsBitVector (unsigned int n) :
    Set (n),
    vector ((n + wordBits - 1U) / wordBits)
{
    // Question here?
    for (unsigned int i = 0; i < vector.Length (); ++i)
        vector [i] = 0;
}

void SetAsBitVector::Insert (Object& object)
{
    unsigned int const item = dynamic_cast<Element&> (object);
    vector [item / wordBits] |= 1 << item % wordBits;
    // Question here
}
To insert an item into the set, we need to change the appropriate bit in the array of bits to one. The ith bit of the bit array is bit i mod w of word floor(i/w). Thus, the Insert function is implemented using a bitwise OR operation to change the ith bit to one, as shown in the program above. Even though it is slightly more complicated than the corresponding operation for the SetAsArray class, the running time of this operation is still O(1). Since w = wordBits is a power of two, it is possible to replace the division and modulo operations, / and %, with shifts and masks like this:
vector [item >> shift] |= 1 << (item & mask);
Depending on the compiler and machine architecture, doing so may improve the performance of the Insert operation by a constant factor.
Questions
My first question: in the constructor, why does the author add wordBits to n and subtract 1, instead of using n / wordBits directly?
Second question: what does the author mean by the statement "Since w = wordBits is a power of two, it is possible to replace the division and modulo operations, / and %, with shifts and masks like this:
vector [item >> shift] |= 1 << (item & mask);"?
Request: please give an example of the above scenario showing what the values of shift and mask would be.
Why does the author say that, depending on the architecture and compiler, this improves performance?

I re-tagged this as C++, since it's clearly not C.
To round up: consider what happens if you call it with n smaller than wordBits, for instance. The generic formula is exactly the one being used, i.e. b = (a + Q - 1) / Q makes sure b * Q is at least a.
Basic binary arithmetic: division by two is equivalent to shifting right by one, and so on; likewise, taking a value modulo a power of two is equivalent to masking off the low bits.
On some machines, bitwise operations like shifts and masks are faster than divisions and modulos.
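To make the answers above concrete, here is a minimal sketch of my own, assuming unsigned int is 32 bits, so wordBits = 32, shift = 5, and mask = 31; for a 64-bit word it would be shift = 6 and mask = 63, i.e. shift = log2(wordBits) and mask = wordBits - 1:

#include <cassert>
#include <cstdint>
#include <vector>

int main() {
    const unsigned int wordBits = 32;
    const unsigned int shift = 5;            // log2(32)
    const unsigned int mask = wordBits - 1;  // 31 = 0x1F

    const unsigned int n = 100;              // set can hold items 0..99
    // (n + wordBits - 1) / wordBits rounds up: 100 bits need 4 words, not 100/32 = 3.
    std::vector<std::uint32_t> vec((n + wordBits - 1) / wordBits, 0);

    unsigned int item = 37;
    vec[item >> shift] |= 1u << (item & mask);                  // shift/mask version
    assert((vec[item / wordBits] >> (item % wordBits)) & 1u);   // same bit as the /, % version
    return 0;
}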

Related

The fastest way to swap the two lowest bits in an unsigned int in C++

Assume that I have:
unsigned int x = 883621;
which in binary is :
00000000000011010111101110100101
I need the fastest way to swap the two lowest bits:
00000000000011010111101110100110
Note: To clarify: If x is 7 (0b111), the output should be still 7.
If you have few bytes of memory to spare, I would start with a lookup table:
constexpr unsigned int table[] = {0b00, 0b10, 0b01, 0b11};
unsigned int func(unsigned int x){
    auto y = (x & (~0b11)) | (table[x & 0b11]);
    return y;
}
Quickbench -O3 of all the answers so far.
Quickbench -Ofast of all the answers so far.
(Plus my ifelse naive idea.)
[Feel free to add yourself and edit my answer].
Please do correct me if you believe the benchmark is incorrect, I am not an expert in reading assembly. So hopefully volatile x prevented caching the result between loops.
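As a quick sanity check of the lookup-table approach against the examples in the question (this little main is my addition, not part of the benchmark):

#include <cassert>

constexpr unsigned int table[] = {0b00, 0b10, 0b01, 0b11};

unsigned int func(unsigned int x) {
    // Replace the two lowest bits with their swapped value from the table.
    return (x & ~0b11u) | table[x & 0b11];
}

int main() {
    assert(func(883621) == 883622);  // ...100101 -> ...100110
    assert(func(7) == 7);            // 0b111 stays 0b111
    return 0;
}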
I'll ignore the top bits for a second - there's a trick using multiplication. Multiplication is really a convolution operation, and you can use that to shuffle bits.
In particular, assume the two lower bits are AB. Multiply that by 0b0101, and you get ABAB. You'll see that the swapped bits BA are the middle bits.
Hence,
x = (x & ~3U) | ((((x&3)*5)>>1)&3)
[edit] The &3 is needed to strip the top A bit, but with std::uint32_t you can use overflow to lose that bit for free - multiplication then gets you the result BAB0'0000'0000'0000'0000'0000'0000'0000:
x = (x & ~3U) | ((((x&3)*0xA0000000)>>30));
I would use
x = (x & ~0b11) | ((x & 0b10) >> 1) | ((x & 0b01) << 1);
Inspired by the table idea, but with the table as a simple constant instead of an array. We just need mask(00)==00, mask(01)==11, mask(10)==11, mask(11)==00.
constexpr unsigned int table = 0b00111100;
unsigned int func(unsigned int x) {
    auto xormask = (table >> ((x & 3) * 2)) & 3;
    x ^= xormask;
    return x;
}
This also uses the xor-trick from dyungwang to avoid isolating the top bits.
Another idea, to avoid stripping the top bits. Assume x has the bits XXXXAB, then we want to x-or it with 0000(A^B)(A^B). Thus
auto t = x^(x>>1); // Last bit is now A^B
t &=1; // take just that bit
t *= 3; // Put in the last two positions
x ^= t; // Change A to B and B to A.
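A minimal self-contained check of this trick against the examples from the question (the function name swap_low_bits is mine):

#include <cassert>

unsigned int swap_low_bits(unsigned int x) {
    unsigned int t = x ^ (x >> 1);  // bit 0 of t is A^B
    t &= 1;                         // keep just that bit
    t *= 3;                         // spread it to the two lowest positions
    return x ^ t;                   // flips both bits only when they differ
}

int main() {
    assert(swap_low_bits(883621) == 883622);  // ...100101 -> ...100110
    assert(swap_low_bits(7) == 7);            // 0b111 is unchanged
    return 0;
}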
Just looking from a mathematical point of view, I would start with a rotate_left() function, which rotates a list of bits one place to the left (011 becomes 110, then 101, and then back 011), and use this as follows:
int func(int input){
    return rotate_left(rotate_left(input / 4)) + rotate_left(input % 4);
}
Using this on the author's example 11010111101110100101:
input = 11010111101110100101;
input / 4 = 110101111011101001;
rotate_left(input / 4) = 1101011110111010010;
rotate_left(rotate_left(input / 4)) = 11010111101110100100;
input % 4 = 01;
rotate_left(input % 4) = 10;
return 11010111101110100110;
There is also a shift() function, which can be used (twice!) for replacing the integer division.

Fastest Way to XOR all bits from value based on bitmask?

I've got an interesting problem that has me looking for a more efficient way of doing things.
Let's say we have a value (in binary)
(VALUE) 10110001
(MASK) 00110010
----------------
(AND) 00110000
Now, I need to be able to XOR any bits from the (AND) value that are set in the (MASK) value (always lowest to highest bit):
(RESULT) AND1(0) xor AND4(1) xor AND5(1) = 0
Now, on paper, this is certainly quick since I can see which bits are set in the mask. It seems to me that programmatically I would need to keep right shifting the MASK until I found a set bit, XOR it with a separate value, and loop until the entire byte is complete.
Can anyone think of a faster way? I'm looking for the way to do this with the least number of operations and stored values.
If I understood this question correctly, what you want is to get every bit from VALUE that is set in the MASK, and compute the XOR of those bits.
First of all, note that XOR'ing a value with 0 will not change the result. So, to ignore some bits, we can treat them as zeros.
So, XORing the bits set in VALUE that are in MASK is equivalent to XORing the bits in VALUE&MASK.
Now note that the result is 0 if the number of set bits is even, 1 if it is odd.
That means we want to count the number of set bits. Some architectures/compilers have ways to quickly compute this value. For instance, on GCC this can be obtained with __builtin_popcount.
So on GCC, this can be computed with:
int set_bits = __builtin_popcount(value & mask);
return set_bits % 2;
If you want the code to be portable, then this won't do. However, a comment in this answer suggests that some compilers can inline std::bitset::count to efficiently obtain the same result.
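For a portable variant along those lines, std::bitset::count can stand in for the builtin; a minimal sketch (the function name masked_parity is mine):

#include <bitset>
#include <cassert>

// Parity of the masked bits; many compilers lower bitset::count to a popcount instruction.
int masked_parity(unsigned value, unsigned mask) {
    return static_cast<int>(std::bitset<32>(value & mask).count() & 1);
}

int main() {
    // Question's example: 10110001 & 00110010 = 00110000 -> two bits set -> result 0.
    assert(masked_parity(0b10110001, 0b00110010) == 0);
    return 0;
}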
If I'm understanding you right, you have
result = value & mask
and you want to XOR the 1 bits of mask & result together. The XOR of a series of bits is the same as counting the number of bits and checking if that count is even or odd. If it's odd, the XOR would be 1; if even, XOR would give 0.
count_bits(mask & result) % 2 != 0
mask & result can be simplified to simply result. You don't need to AND it with mask again. The % 2 != 0 can be alternately written as & 1.
count_bits(result) & 1
As far as how to count bits, the Bit Twiddling Hacks web page gives a number of bit counting algorithms.
Counting bits set, Brian Kernighan's way
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
    v &= v - 1; // clear the least significant bit set
}
Brian Kernighan's method goes through as many iterations as there are
set bits. So if we have a 32-bit word with only the high bit set, then
it will only go once through the loop.
If you were to use that implementation, you could optimize it a bit further. If you think about it, you don't need the full count of bits. You only need to track their parity. Instead of counting bits you could just flip c each iteration.
unsigned bit_parity(unsigned v) {
    unsigned c;
    for (c = 0; v; c ^= 1) {
        v &= v - 1;  // clear the lowest set bit, flipping the parity each time
    }
    return c;
}
(Thanks to Slava for the suggestion.)
Using the fact that XOR with 0 doesn't change anything, it's OK to apply the mask and then unconditionally XOR all bits together, which can be done in a parallel-prefix way. So something like this (not tested):
x = m & v;
x ^= x >> 16;
x ^= x >> 8;
x ^= x >> 4;
x ^= x >> 2;
x ^= x >> 1;
result = x & 1;
You can use more (or fewer) steps as needed, this is for 32 bits.
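Wrapped up as a function and checked against the example in the question (a sketch of my own, assuming 32-bit unsigned int):

#include <cassert>

// XOR-fold all masked bits down to bit 0 (parallel-prefix reduction, 32 bits).
unsigned masked_xor(unsigned v, unsigned m) {
    unsigned x = m & v;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1;
}

int main() {
    // VALUE = 10110001, MASK = 00110010 -> AND = 00110000, two set bits -> XOR of them is 0.
    assert(masked_xor(0b10110001, 0b00110010) == 0);
    return 0;
}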
One significant issue to be aware of if you use v &= v - 1 in the main body of your code is that it changes the value of v to 0 while conducting the count. With other methods, such as the one below, v ends up holding the number of 1's instead. Counting logic is usually wrapped in a function, where this is no longer a concern, but if you must place the counting logic in the main body of your code, you should preserve a copy of v if that value is needed again.
In addition to the other two methods presented, the following is another favorite from bit-twiddling hacks that generally has a bit better performance than the loop method for larger numbers:
/* get the population of 1's in the binary representation of a number */
unsigned getn1s (unsigned int v)
{
    v = v - ((v >> 1) & 0x55555555);
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
    v = (v + (v >> 4)) & 0x0F0F0F0F;
    v = v + (v << 8);
    v = v + (v << 16);
    return v >> 24;
}

C++: Binary to Decimal Conversion

I am trying to convert a binary array to decimal in the following way:
uint8_t array[8] = {1,1,1,1,0,1,1,1};
int decimal = 0;
for (int i = 0; i < 8; i++)
    decimal = (decimal << 1) + array[i];
Actually I have to convert a 64-bit binary array to decimal, and I have to do it a million times.
Can anybody help me - is there any faster way to do the above? Or is the above one fine?
Your method is adequate; to call it nice, I would just not mix bitwise operations with the "mathematical" way of converting to decimal, i.e. use either
decimal = decimal << 1 | array[i];
or
decimal = decimal * 2 + array[i];
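For reference, here is a minimal self-contained version of that loop, checked against the OP's example array (which encodes 0b11110111 = 247); this little test harness is my addition:

#include <cassert>
#include <cstdint>

int main() {
    uint8_t array[8] = {1,1,1,1,0,1,1,1};
    int decimal = 0;
    for (int i = 0; i < 8; i++)
        decimal = decimal << 1 | array[i];  // array[0] ends up as the most significant bit
    assert(decimal == 247);                 // 0b11110111
    return 0;
}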
It is important, before attempting any optimisation, to profile the code. Time it, look at the code being generated, and optimise only when you understand what is going on.
And as already pointed out, the best optimisation is to not do something, but to make a higher level change that removes the need.
However...
Most changes you might want to trivially make here, are likely to be things the compiler has already done (a shift is the same as a multiply to the compiler). Some may actually prevent the compiler from making an optimisation (changing an add to an or will restrict the compiler - there are more ways to add numbers, and only you know that in this case the result will be the same).
Pointer arithmetic may be better, but the compiler is not stupid - it ought to already be producing decent code for dereferencing the array, so you need to check that you have not in fact made matters worse by introducing an additional variable.
In this case the loop count is well defined and limited, so unrolling probably makes sense.
Furthermore, it depends on how dependent you want the result to be on your target architecture. If you want portability, it is hard(er) to optimise.
For example, the following produces better code here:
unsigned int x0 = *(unsigned int *)array;
unsigned int x1 = *(unsigned int *)(array+4);
int decimal = ((x0 * 0x8040201) >> 20) + ((x1 * 0x8040201) >> 24);
I could probably also roll a 64-bit version that did 8 bits at a time instead of 4.
But it is very definitely not portable code. I might use that locally if I knew what I was running on and I just wanted to crunch numbers quickly. But I probably wouldn't put it in production code. Certainly not without documenting what it did, and without the accompanying unit test that checks that it actually works.
The binary 'compression' can be generalized as a problem of weighted sum -- and for that there are some interesting techniques.
X mod 255 essentially means summing all of the independent 8-bit numbers (bytes), since 256 mod 255 = 1.
X mod 254 means summing each digit with a doubling weight, since 1 mod 254 = 1, 256 mod 254 = 2, 256*256 mod 254 = 2*2 = 4, etc.
If the encoding were big endian, then *(unsigned long long *)array % 254 would produce a weighted sum (with a truncated range of 0..253). Then removing the element with weight 2 and adding it back manually produces the correct result:
uint64_t a = *(uint64_t *)array;
return (a & ~256) % 254 + ((a >> 7) & 2);
Another mechanism to get the weights is to premultiply each binary digit by 255 and mask out the correct bit:
uint64_t a = (*(uint64_t *)array * 255) & 0x0102040810204080ULL; // little endian
uint64_t a = (*(uint64_t *)array * 255) & 0x8040201008040201ULL; // big endian
In both cases one can then take the remainder modulo 255 (and correct the element with weight 1 manually):
return (a & 0x00ffffffffffffff) % 255 + (a>>56); // little endian, or
return (a & ~1) % 255 + (a&1);
For the sceptical mind: I actually did profile the modulus version to be (slightly) faster than iteration on x64.
To continue from the answer of JasonD, parallel bit selection can be iteratively utilized.
But first expressing the equation in full form would help the compiler to remove the artificial dependency created by the iterative approach using accumulation:
ret = ((a[0]<<7) | (a[1]<<6) | (a[2]<<5) | (a[3]<<4) |
(a[4]<<3) | (a[5]<<2) | (a[6]<<1) | (a[7]<<0));
vs.
uint32_t HI = *(uint32_t *)array, LO = *(uint32_t *)&array[4];
LO |= (HI<<4); // The HI dword has a weight 16 relative to Lo bytes
LO |= (LO>>14); // High word has 4x weight compared to low word
LO |= (LO>>9); // high byte has 2x weight compared to lower byte
return LO & 255;
One more interesting technique would be to utilize crc32 as a compression function; then it just happens that the result would be LookUpTable[crc32(array) & 255]; as there is no collision with this given small subset of 256 distinct arrays. However to apply that, one has already chosen the road of even less portability and could as well end up using SSE intrinsics.
You could use std::accumulate with an initial value of 0 and a doubling-and-adding binary operation:
int doubleSumAndAdd(const int& sum, const int& next) {
    return (sum * 2) + next;
}
int decimal = accumulate(array, array + ARRAY_SIZE, 0,
                         doubleSumAndAdd);
This produces big-endian integers, whereas OP code produces little-endian.
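For completeness, a compilable sketch of this approach (ARRAY_SIZE is assumed to be 8 here to match the OP's array; the lambda is equivalent to doubleSumAndAdd above):

#include <cassert>
#include <cstdint>
#include <numeric>

int main() {
    const int ARRAY_SIZE = 8;
    uint8_t array[ARRAY_SIZE] = {1,1,1,1,0,1,1,1};
    int decimal = std::accumulate(array, array + ARRAY_SIZE, 0,
        [](int sum, int next) { return sum * 2 + next; });
    assert(decimal == 247);  // 0b11110111
    return 0;
}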
Try this; I used it to convert a binary number of up to 1020 digits:
#include <sstream>
#include <string>
#include <math.h>
#include <iostream>
using namespace std;
long binary_decimal(string num) /* Function to convert binary to decimal */
{
    long dec = 0, n = 1, exp = 0;
    string bin = num;
    if (bin.length() > 1020) {
        cout << "Binary Digit too large" << endl;
    }
    else {
        // Note: long and double-precision pow() limit the exactly representable
        // range to far fewer than 1020 digits in practice.
        for (int i = bin.length() - 1; i > -1; i--)
        {
            n = pow(2, exp++);
            if (bin.at(i) == '1')
                dec += n;
        }
    }
    return dec;
}
Theoretically this method will work for a binary number of infinite length.
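As an aside (my addition, not part of the answer above): for inputs that fit in 64 bits, the standard library can do this conversion directly with std::stoull and base 2:

#include <cassert>
#include <string>

int main() {
    std::string bin = "11110111";
    // Parse the string as a base-2 number.
    unsigned long long dec = std::stoull(bin, nullptr, 2);
    assert(dec == 247);
    return 0;
}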

bitwise bitmanipulation puzzle

Hello, I have a question for a school assignment. I need to:
read a whole number, whose internal binary code has bit 0 on the right and bit 7 on the left.
Now I need to swap:
bit 0 with bit 7
bit 1 with bit 6
bit 2 with bit 5
bit 3 with bit 4
For example:
hex F703 becomes F7C0,
because 03 = 0000 0011 and C0 = 1100 0000
(only the right byte (8 bits) needs to be switched).
The lesson was about bit manipulation, but I can't find a way to make it work for all 16 hex digits.
I have been puzzling over this for a while now.
I am thinking of using an array for this problem, or can someone tell me whether it can be done with only the bitwise ^, &, ~, <<, >> operators?
Study the following two functions:
bool GetBit(int value, int bit_position)
{
    return value & (1 << bit_position);
}

void SetBit(int& value, int bit_position, bool new_bit_value)
{
    if (new_bit_value)
        value |= (1 << bit_position);
    else
        value &= ~(1 << bit_position);
}
So now we can read and write arbitrary bits just like an array.
1 << N
gives you:
000...0001000...000
Where the 1 is in the Nth position.
So
1 << 0 == 0000...0000001
1 << 1 == 0000...0000010
1 << 2 == 0000...0000100
1 << 3 == 0000...0001000
...
and so on.
Now what happens if I BINARY AND one of the above numbers with some other number Y?
X = 1 << N
Z = X & Y
What is Z going to look like? Well, every bit apart from the Nth is definitely going to be 0, isn't it? Because those bits are 0 in X.
What will the Nth bit of Z be? It depends on the value of the Nth bit of Y, doesn't it? So under what circumstances is Z zero? Precisely when the Nth bit of Y is 0. So by converting Z to a bool we can separate out the value of the Nth bit of Y. Take another look at the GetBit function above; this is exactly what it is doing.
Now that's reading bits. How do we set a bit? Well, if we want to set a bit on, we can use BINARY OR with one of the (1 << N) numbers from above:
X = 1 << N
Z = Y | X
What is Z going to be here? Well, every bit is going to be the same as Y except the Nth, right? And the Nth bit is always going to be 1. So we have set the Nth bit on.
What about setting a bit to zero? What we want to do is take a number like 11111011111 where just the Nth bit is off and then use BINARY AND. To get such a number we just use BINARY NOT:
X = 1 << N // 000010000
W = ~X // 111101111
Z = W & Y
So all the bits in Z apart from the Nth will be copies of Y. The Nth will always be off. So we have effectively set the Nth bit to 0.
Using the above two techniques is how we have implemented SetBit.
So now we can read and write arbitrary bits. Now we can reverse the bits of the number just like it was an array:
int ReverseBits(int input)
{
    int output = 0;
    // N is the number of bits to reverse (e.g. 8 if only one byte is being reversed).
    for (int i = 0; i < N; i++)
    {
        bool bit = GetBit(input, i);   // read ith bit
        SetBit(output, N-i-1, bit);    // write (N-i-1)th bit
    }
    return output;
}
Please make sure you understand all this. Once you have understood this all, please close the page and implement and test them without looking at it.
If you enjoyed this than try some of these:
http://graphics.stanford.edu/~seander/bithacks.html
And/or get this book:
http://www.amazon.com/exec/obidos/ASIN/0201914654/qid%3D1033395248/sr%3D11-1/ref%3Dsr_11_1/104-7035682-9311161
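Putting the pieces together on the example from the question (a sketch of my own, not from the answer; here N becomes a parameter and only the low byte is reversed):

#include <cassert>

bool GetBit(int value, int bit_position)
{
    return value & (1 << bit_position);
}

void SetBit(int& value, int bit_position, bool new_bit_value)
{
    if (new_bit_value)
        value |= (1 << bit_position);
    else
        value &= ~(1 << bit_position);
}

// Reverse the lowest N bits of input, exactly as described above.
int ReverseBits(int input, int N)
{
    int output = 0;
    for (int i = 0; i < N; i++)
        SetBit(output, N - i - 1, GetBit(input, i));
    return output;
}

int main()
{
    int x = 0xF703;
    // Reverse only the low byte: 0x03 -> 0xC0; the high byte stays as it is.
    int result = (x & ~0xFF) | ReverseBits(x & 0xFF, 8);
    assert(result == 0xF7C0);
    return 0;
}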
This does one quarter of the job, but I'm not going to give you any more help than that; if you can work out why I said that, then you should be able to fill in the rest of the code.
if ((i ^ (i >> (5 - 2))) & (1 << 2))
    i ^= (1 << 2) | (1 << 5);
Essentially you need to reverse the bit ordering.
We're not going to solve this for you.. but here's a hint:
What if you had a 2-bit value. How would you reverse these bits?
A simple swap would work, right? Think about how to code this swap with operators that are available to you.
Now let's say you had a 4-bit value. How would you reverse these bits?
Could you split it into two 2-bit values, reverse each one, and then swap them? Would that give you the right result? Now code this.
Generalizing that solution to the 8-bit value should be trivial now.
Good luck!

binary comparison

Is there any function in C++ to convert a decimal number to a binary number without using a division algorithm?
I want to count the differing bits in the binary representations of 2 numbers, e.g. diff(0,2) is 1 bit, and diff(3,15) is 2 bits.
I want to write this diff function.
Thanks
You can find the number of different bits by counting the bits in the xor of the two numbers.
Something like this.
int count_bits(unsigned int n) {
    int result = 0;
    while (n) {
        result += 1;
        // Remove the lowest bit.
        n &= n - 1;
    }
    return result;
}

int diff(unsigned int a, unsigned int b) {
    return count_bits(a ^ b);
}
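For the examples in the question: diff(0, 2) = count_bits(0b00 ^ 0b10) = count_bits(0b10) = 1, and diff(3, 15) = count_bits(0b0011 ^ 0b1111) = count_bits(0b1100) = 2, matching the expected answers.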
You can use XOR on the numbers ( if Z = X XOR Y then each bit which is set differently in X and Y will be set to 1 in Z, each bit that is set the same in X and Y will be set to 0), and count the bits of the result using a simple loop and shift.
Everything is already in binary technically. You just need to start looking at bitwise operators to access the individual bits composing the decimal numbers you're looking at.
For example,
if (15 & 1) checks whether bit 0 (the lowest bit) of 15 is set.
if (15 & 3) checks whether either of its lowest 2 bits is set.
if (15 & 4) checks whether its 3rd bit (bit 2) is set.
You can do this with and/or/xor/etc. Google bitwise operators and read up.