Set all bits up to given highest bit - bit-manipulation

If I have a binary number, say x = 00010000, I can set all bits up to the highest and only set bit by doing y = x | (x - 1) resulting in y = 00011111.
How can I achieve the same result if multiple bits are set, e.g. x = 00010101?

This is a find-first-set problem: you need to locate the highest set bit.
There is no single bitwise operation that does this.
If you want to code it the usual way, think "binary search" for more efficient algorithms.
Here's something to get you started (it assumes a 32-bit unsigned int), but I am sure there are more efficient algorithms out there:
int findfirstset(unsigned int x) {
x |= (x >> 1);
x |= (x >> 2);
x |= (x >> 4);
x |= (x >> 8);
x |= (x >> 16);
return x - (x >> 1);
}
From there you could apply your operation to the isolated highest bit:
h = findfirstset(x);
y = h | (h - 1);
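As a side note, the shift-and-OR cascade inside findfirstset already produces the filled value before the final subtraction, so the isolation step can be skipped when only y is wanted. A minimal sketch in C, assuming a 32-bit input; the name fill_below_msb is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: sets every bit below the highest set bit of x.
   Assumes a 32-bit value; a 64-bit version would need one more step (>> 32). */
static uint32_t fill_below_msb(uint32_t x)
{
    x |= x >> 1;   /* copy the highest set bit into the bit below it */
    x |= x >> 2;   /* the run of ones is now up to 4 bits long       */
    x |= x >> 4;   /* ... up to 8 bits                               */
    x |= x >> 8;   /* ... up to 16 bits                              */
    x |= x >> 16;  /* every bit below the highest set bit is now 1   */
    return x;
}
```

With this, fill_below_msb(0x15) gives 0x1F (00010101 becomes 00011111), and an input of 0 stays 0.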

Related

Is it possible to write a function adding two integers without control flow and strictly bitwise operations?

I misunderstood a question that said to add two integers using bitwise operations. I did not use any control flow and could not do it. After giving up, all the solutions I found use control flow to accomplish this, whether it be an if, while, for, recursion, etc. Is there a proof that it can or cannot be accomplished?
For a fixed length integer, you can just unroll a ripple carry adder. In the worst case, a carry signal has to propagate all the way from the least significant bit to the most significant bit.
Like this (only slightly tested) (to avoid the C-purists' wrath, I will call this C# code)
int add_3bits(int x, int y)
{
int c = x & y;
x = x ^ y;
y = c << 1;
//
c = x & y; // \
x = x ^ y; // | for more bits, insert more of these blocks
y = c << 1; // /
//
// optimized last iteration
return (x ^ y) & 7; // for more bits, change that mask
}
If you do it for as many bits as your integer will hold, you won't need the mask in the end.
That's not very efficient, clearly. For 3 bits it's fine, but for 32 bits it becomes quite long. A Kogge-Stone adder (one of the O(log n) delay adder circuits) is also surprisingly easy to implement in software (in hardware you have to deal with a lot of wires, software doesn't have that problem).
For example: (verified using my website)
static uint add_32bits(uint x, uint y)
{
uint p = x ^ y;
uint g = x & y;
g |= p & (g << 1);
p &= p << 1;
g |= p & (g << 2);
p &= p << 2;
g |= p & (g << 4);
p &= p << 4;
g |= p & (g << 8);
p &= p << 8;
g |= p & (g << 16);
return x ^ y ^ (g << 1);
}
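For readers working in C, here is a sketch of the same Kogge-Stone steps with the role of each variable spelled out. It assumes uint32_t from <stdint.h>; the function name add_kogge_stone is illustrative and not from the original answer.

```c
#include <stdint.h>

/* p ("propagate"): this bit position passes an incoming carry along.
   g ("generate"):  this bit position produces a carry on its own.
   Each step doubles the width of the bit span that p and g describe. */
static uint32_t add_kogge_stone(uint32_t x, uint32_t y)
{
    uint32_t p = x ^ y;
    uint32_t g = x & y;

    g |= p & (g << 1);   p &= p << 1;   /* spans of 2 bits  */
    g |= p & (g << 2);   p &= p << 2;   /* spans of 4 bits  */
    g |= p & (g << 4);   p &= p << 4;   /* spans of 8 bits  */
    g |= p & (g << 8);   p &= p << 8;   /* spans of 16 bits */
    g |= p & (g << 16);                 /* spans of 32 bits */

    /* g << 1 is the carry into each position; sum bit = x ^ y ^ carry-in */
    return x ^ y ^ (g << 1);
}
```

Because the arithmetic is unsigned, overflow simply wraps: add_kogge_stone(0xFFFFFFFF, 1) gives 0.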

How to get the most significant non-zero byte in a 32 bit integer without a while loop?

I use the following method to extract the most significant non-zero byte of an integer:
private static int getFirstByte(int n)
{
while (n > 0xFF)
n >>= 8;
return n;
}
There's a logic problem with this method. The integer parameter could be negative, which means it would return the number being passed in, which is incorrect.
There is also a possible issue with the method itself. It is using a while loop.
Is there a way to perform this logic without a while loop and also possibly avoiding the incorrectly returned result for negative numbers?
Not clever, not elegant - but I believe it does "extract the most significant, non-zero byte in an integer ... without using a loop":
private static int getFirstByte(int n) {
int i;
if ((i = n & 0xff000000) != 0)
return (i >> 24) & 0xff;
if ((i = n & 0xff0000) != 0)
return (i >> 16) & 0xff;
if ((i = n & 0xff00) != 0)
return (i >> 8) & 0xff;
// all of the higher bytes are zeroes
return n;
}
You could use log n / log 256… But then you’d have a bigger problem.
I assume that by "get the first non-zero byte in an int" you mean natural 8-bit breaks of the int and not a dynamic 8-bit break.
Natural 8 bit breaks:
00000000|00010110|10110010|11110001 ==> 00010110
Dynamic 8 bit break:
00000000000|10110101|1001011110001 ==> 10110101
This will return the first non-zero byte on a natural 8-bit break of an int without looping or branching. This code may or may not be more efficient than paulsm4's answer. Be sure to benchmark and/or profile the code to determine which is best for you.
Java code:
class Main {
public static void main(String[] args) {
int i,j;
for (i=0,j=1; i<32; ++i,j<<=1) {
System.out.printf("0x%08x : 0x%02x\n",j,getByte(j));
}
}
public static byte getByte(int n) {
int x = n;
x |= (x >>> 1);
x |= (x >>> 2);
x |= (x >>> 4);
x |= (x >>> 8);
x |= (x >>> 16);
x -= ((x >>> 1) & 0x55555555);
x = (((x >>> 2) & 0x33333333) + (x & 0x33333333));
x = (((x >>> 4) + x) & 0x0f0f0f0f);
x += (x >>> 8);
x += (x >>> 16);
x &= 0x0000003f;
x = 32 - x; // x now equals the number of leading zeros
x &= 0x00000038; // mask out last 3 bits (cause natural byte break)
return (byte)((n&(0xFF000000>>>x))>>>(24-x));
}
}
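A C port of the same idea needs a little care: Java reduces shift counts modulo 32, while shifting a 32-bit value by 32 is undefined behaviour in C. The sketch below assumes a uint32_t input and uses a made-up name, top_nonzero_byte. It masks the leading-zero count with 24 instead of 0x38 so that the shift stays in range even when n is 0.

```c
#include <stdint.h>

/* Hypothetical C port: returns the most significant non-zero byte of n
   (on natural byte boundaries), or 0 when n is 0. No loops or branches. */
static uint8_t top_nonzero_byte(uint32_t n)
{
    uint32_t x = n;

    /* Smear the highest set bit into every lower position. */
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;

    /* Population count of the smeared value. */
    x -= (x >> 1) & 0x55555555u;
    x = ((x >> 2) & 0x33333333u) + (x & 0x33333333u);
    x = ((x >> 4) + x) & 0x0F0F0F0Fu;
    x += x >> 8;
    x += x >> 16;
    x &= 0x3Fu;

    uint32_t lz = 32u - x;              /* number of leading zeros, 0..32       */
    uint32_t shift = 24u - (lz & 24u);  /* 24, 16, 8 or 0; lz == 32 maps to 24  */
    return (uint8_t)(n >> shift);       /* bytes above the chosen one are zero  */
}
```

For the natural-break example above, top_nonzero_byte(0x0016B2F1) gives 0x16.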

High Order Bits - Take them and make a uint64_t into a uint8_t

Let's say you have a uint64_t and care only about the high order bit for each byte in your uint64_t. Like so:
uint32_t:
0000 ... 1000 0000 1000 0000 1000 0000 1000 0000 ---> 0000 1111
Is there a faster way than:
return
(
((x >> 56) & 128)+
((x >> 49) & 64)+
((x >> 42) & 32)+
((x >> 35) & 16)+
((x >> 28) & 8)+
((x >> 21) & 4)+
((x >> 14) & 2)+
((x >> 7) & 1)
)
Aka shifting x, masking, and adding the correct bit for each byte? This will compile to a lot of assembly and I'm looking for a quicker way... The machine I'm using only has up to SSE2 instructions and I failed to find helpful SIMD ops.
Thanks for the help.
As I mentioned in a comment, pmovmskb does what you want. Here's how you could use it:
MMX + SSE1:
movq mm0, input ; input can be r/m
pmovmskb output, mm0 ; output must be r
SSE2:
movq xmm0, input
pmovmskb output, xmm0
And I looked up the new way
BMI2:
mov rax, 0x8080808080808080
pext output, input, rax ; input must be r
return ((x & 0x8080808080808080) * 0x2040810204081) >> 56;
works. The & selects the bits you want to keep. The multiplication moves all the bits into the most significant byte, and the shift moves them to the least significant byte. Since multiplication is fast on most modern CPUs this shouldn't be much slower than using assembly.
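To make the multiplication trick easier to check, here is a sketch that wraps it in a function. The name gather_high_bits_mul and the constant names are assumptions for illustration; only the two constants come from the answer.

```c
#include <stdint.h>

/* Collects bit 7 of each byte of x into one byte (byte j of x -> bit j). */
static uint8_t gather_high_bits_mul(uint64_t x)
{
    const uint64_t mask  = 0x8080808080808080ULL; /* bit 7 of every byte       */
    const uint64_t magic = 0x0002040810204081ULL; /* bits 0, 7, 14, ..., 49    */

    /* The flag of byte j sits at bit 8*j + 7. Multiplying by 2^(7*k) moves it
       to bit 8*j + 7 + 7*k, which equals 56 + j exactly when k = 7 - j.
       All other partial products land below bit 56 or overflow past bit 63,
       and no two of them collide, so no carries disturb the top byte. */
    return (uint8_t)(((x & mask) * magic) >> 56);
}
```

For instance, an input of 0x0000000080808080 (flags set in the four low bytes) gives 0x0F, matching the example in the question.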
And here's how to do it using SSE intrinsics:
#include <xmmintrin.h>
#include <inttypes.h>
#include <stdio.h>
int main (void)
{
uint64_t x
= 0b0000000010000000000000001000000000000000100000000000000010000000;
printf ("%x\n", _mm_movemask_pi8 ((__m64) x));
return 0;
}
Works fine with:
gcc -msse
You don't need all the separate bitwise ANDs, you can simplify it to:
x &= 0x8080808080808080;
return (x >> 7) | (x >> 14) | (x >> 21) | (x >> 28) |
(x >> 35) | (x >> 42) | (x >> 49) | (x >> 56);
(assuming that the function return type is uint8_t).
You can also convert that to an unrolled loop:
uint8_t r = 0;
x &= 0x8080808080808080;
x >>= 7; r |= x;
x >>= 7; r |= x;
x >>= 7; r |= x;
x >>= 7; r |= x;
x >>= 7; r |= x;
x >>= 7; r |= x;
x >>= 7; r |= x;
x >>= 7; r |= x;
return r;
I'm not sure which will perform better in practice, though I'd tend to bet on the first - the second might produce shorter code but with a long dependency chain.
First you don't really need so many operations. You can act on more than one bit at a time:
x = (x >> 7) & 0x0101010101010101; // 0x0101010101010101
x |= x >> 28; // 0x????????11111111
x |= x >> 14; // 0x????????????5555
x |= x >> 7; // 0x??????????????FF
return x & 0xFF;
An alternative is to use modulo to do sideways additions. The first thing to note is that x % n is the sum of the digits of x in base n+1 (reduced modulo n), so if n+1 is 2^k, you are adding groups of k bits. If you start with
t = (x >> 7) & 0x0101010101010101 like above, you want to sum groups of 7 bits, thus t % 127 would be the solution. But t % 127 works only for results up to 126. 0x8080808080808080 and anything above will give an incorrect result. I've tried some corrections, none were easy.
Using modulo to get to the point where only the last step of the previous algorithm remains is possible, though. What we want is to keep the two least significant bits, and then have the sum of the others, grouped by 14. So
ull t = (x & 0x8080808080808080) >> 7;
ull u = (t & 3) | (((t>>2) % 0x3FFF) << 2);
return (u | (u>>7)) & 0xFF;
But t>>2 is t/4 and << 2 is multiplying by 4. And we have (a % b)*c == (a*c) % (b*c), thus (((t>>2) % 0x3FFF) << 2) is (t & ~3) % 0xFFFC. But we also have the fact that a + b%c == (a+b)%c if the left-hand side is less than c. So we have simply u = t % 0xFFFC. Giving:
ull t = ((x & 0x8080808080808080) >> 7) % 0xFFFC;
return (t | (t>>7)) & 0xFF;
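The two lines above can be wrapped in a function to try them out. This is only a sketch: it assumes ull stands for unsigned long long (uint64_t here), and the name gather_high_bits_mod is invented for the example.

```c
#include <stdint.h>

/* Collects bit 7 of each byte of x into one byte using a single modulo. */
static uint8_t gather_high_bits_mod(uint64_t x)
{
    /* t has the flag of byte j at bit 8*j. Reducing modulo 0xFFFC keeps
       bits 0 and 1 and folds the rest in groups of 14 bits, leaving the
       even-numbered flags at bits 0, 2, 4, 6 and the odd-numbered ones
       at bits 8, 10, 12, 14. */
    uint64_t t = ((x & 0x8080808080808080ULL) >> 7) % 0xFFFCULL;

    /* Shifting by 7 drops the odd-numbered flags into bits 1, 3, 5, 7. */
    return (uint8_t)((t | (t >> 7)) & 0xFF);
}
```

Unlike a plain % 127, this form also handles inputs whose top byte has its high bit set, for example 0x8000000000000000 gives 0x80.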
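This works only when the expected result is at most 126 (that is, bit 63 is clear and the other seven flag bits are not all set), since a value modulo 127 can never exceed 126: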
return (x & 0x8080808080808080) % 127;

how to store the bit string in array?

How do I count the number of occurrences of 1 in an 8-bit string, such as 10110001?
The bit string is taken from the user, like 10110001.
What type of array should be used to store this bit string in C?
Short and simple: use std::bitset (C++).
#include <iostream>
#include <bitset>
int main()
{
std::bitset<8> mybitstring;
std::cin >> mybitstring;
std::cout << mybitstring.count(); // returns the number of set bits
}
Don't use an array at all, use a std::string. This gives you the possibility of better error handling. With a bitset you can write code like:
bitset <8> b;
if ( cin >> b ) {
cout << b << endl;
}
else {
cout << "error" << endl;
}
but there is no way of finding out which character caused the error.
You'd probably use an unsigned int to store those bits in C.
If you're using GCC then you can use __builtin_popcount to count the one bits:
Built-in Function: int __builtin_popcount (unsigned int x)
Returns the number of 1-bits in x.
This should resolve to a single instruction on CPUs that support it too.
From Hacker's Delight:
For machines that don't have this instruction, a good way to count the number
of 1-bits is to first set each 2-bit field equal to the sum of the two single
bits that were originally in the field, and then sum adjacent 2-bit fields,
putting the results in each 4-bit field, and so on.
so, if x is an integer:
x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
x = (x & 0x0000FFFF) + ((x >>16) & 0x0000FFFF);
x will now contain the number of 1 bits. Just adapt the algorithm for 8-bit values.
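Putting the pieces together for the original question in plain C might look like the sketch below. Both function names are hypothetical, and the parser accepts 1 to 8 characters of '0' and '1' as an assumption about what "taken from the user" means. The counting function is the Hacker's Delight scheme cut down to the three steps an 8-bit value needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser: turns a string of 1 to 8 '0'/'1' characters into a
   value in 0..255. Returns -1 for an empty string, a string longer than
   8 characters, or any other character. */
static int parse_bits8(const char *s)
{
    unsigned value = 0;
    size_t i;
    for (i = 0; s[i] != '\0'; ++i) {
        if (i >= 8 || (s[i] != '0' && s[i] != '1'))
            return -1;
        value = (value << 1) | (unsigned)(s[i] - '0');
    }
    return i == 0 ? -1 : (int)value;
}

/* Counts the 1 bits in an 8-bit value by summing adjacent fields. */
static unsigned popcount8(uint8_t x)
{
    x = (uint8_t)((x & 0x55) + ((x >> 1) & 0x55)); /* sums of bit pairs   */
    x = (uint8_t)((x & 0x33) + ((x >> 2) & 0x33)); /* sums of 4-bit halves */
    x = (uint8_t)((x & 0x0F) + ((x >> 4) & 0x0F)); /* sum of the whole byte */
    return x;
}
```

For the string from the question, parse_bits8("10110001") yields 0xB1 and popcount8(0xB1) yields 4. A single unsigned char is enough storage here; no array is needed once the string has been parsed.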

How to decipher 4 short vars from a long var using bit manipulations in C++?

long wxyz; //(w = bits 0-8, x = bits 9-17 , y = bits 18-23, z = bits 24-29)
short w;
short x;
short y;
short z;
w= wxyz & 0xFF800000;
x= wxyz & 0x007FC000;
y= wxyz & 0x00003F00;
z= wxyz & 0x000000FC;
Is this code correct?
Thanks
You need to shift the bits down.
w= (wxyz & 0xFF800000) >> 23;
x= (wxyz & 0x007FC000) >> 14;
y= (wxyz & 0x00003F00) >> 8;
z= (wxyz & 0x000000FC) >> 2;
To get the highest byte of a 4-byte value, do the following: w = (wxyz & 0xFF000000) >> 24. First apply the bit mask, then shift the bits down to the lowest byte.
Or you can do it the other way around: shift, then apply the bit mask:
w = (wxyz >> 24) & 0xFF;
x = (wxyz >> 16) & 0xFF;
y = (wxyz >> 8) & 0xFF;
z = wxyz & 0xFF;
But isn't it easier to use unions?
w = wxyz & 0x000001ff;
x = (wxyz & 0x0003fe00) >> 9;
y = (wxyz & 0x00fc0000) >> 18;
z = (wxyz & 0x3f000000) >> 24;
Edit: need to cast long to short to avoid compiler warning:
w = (short) (wxyz & 0x000001ff);
x = (short) ((wxyz & 0x0003fe00) >> 9);
y = (short) ((wxyz & 0x00fc0000) >> 18);
z = (short) ((wxyz & 0x3f000000) >> 24);
Hold on -- what do you mean by bits 0-8? This usually means the nine least significant bits, in which case you've grasped the wrong end of the int.
This is the way I prefer to handle this, by "inching". It just makes more sense in my head. Also, unlike a mask and then shift, there is no problem of a >> being sign-extending (C/C++ isn't Java or C# in well-definedness there). I am going with the assumption that 0 is the MSB (and there are 32bits total, although a long can be more), as stated in the question.
long wxyz = ...; //(w = bits 0-8, x = bits 9-17 , y = bits 18-23, z = bits 24-29)
wxyz >>= 2; // discard 30-31 (or, really, "least two insignificant")
z = wxyz & 0x3f; // easy to see this is "6 bits", no?
wxyz >>= 6; // throw them out
y = wxyz & 0x3f;
wxyz >>= 6;
x = wxyz & 0x1ff;
wxyz >>= 9;
w = wxyz & 0x1ff;
wxyz >>= 9; // for fun, but nothing consumes after
P.S. Adjusting for types is left as an exercise to the reader.
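Here's a different solution you can use if the fields fall on byte boundaries (it reads whole bytes, so it cannot pull out the 9-bit and 6-bit fields from the question).
long wxyz;
short w, x, y, z;
unsigned char* buf = (unsigned char*)&wxyz; // view the long as a byte array; no allocation needed
w = (short)buf[0]; // Which byte lands in which variable depends on endianness
x = (short)buf[1];
y = (short)buf[2];
z = (short)buf[3];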
Partially correct. You'll have to shift them to the right if you want the values of each segment.
short w = (short)((wxyz & 0xFF800000) >> 23);
short x = (short)((wxyz & 0x007FC000) >> 14);
short y = (short)((wxyz & 0x00003F00) >> 8);
short z = (short)((wxyz & 0x000000FC) >> 2);
These are correct values.
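The shift-and-mask pattern used throughout these answers can be captured in one small helper. The sketch below assumes the packed value fits in 32 bits and that bit 0 in the question means the most significant bit, as the masks in the question imply. The names extract_bits, struct fields and split_wxyz are invented for the example, and unsigned types are used so that >> never sign-extends.

```c
#include <stdint.h>

/* Returns `width` bits of `value` starting at bit `shift`, counted from the
   least significant bit. Assumes width is between 1 and 31. */
static uint32_t extract_bits(uint32_t value, unsigned shift, unsigned width)
{
    return (value >> shift) & ((1u << width) - 1u);
}

struct fields { uint16_t w, x, y, z; };

/* Layout, most significant bit first: w (9 bits), x (9 bits), y (6 bits),
   z (6 bits), then 2 unused bits. */
static struct fields split_wxyz(uint32_t wxyz)
{
    struct fields f;
    f.w = (uint16_t)extract_bits(wxyz, 23, 9);
    f.x = (uint16_t)extract_bits(wxyz, 14, 9);
    f.y = (uint16_t)extract_bits(wxyz, 8, 6);
    f.z = (uint16_t)extract_bits(wxyz, 2, 6);
    return f;
}
```

The shift amounts 23, 14, 8 and 2 are the same ones used in the answer above; only the mask is derived from the field width instead of being written out by hand.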