How to implement arithmetic right shift from logical shift? - bit-manipulation

I need to implement a 32-bit arithmetic right shift using only logical shifts, AND, OR, XOR, and normal integer arithmetic operations.
I read somewhere the following is supposed to work:
(x>>N)|(((1<<N)-1)<<(32-N))
x is the integer that will be shifted and N is the amount of bits to shift.
This works for negative (msb is 1) numbers but not for positive numbers (msb is 0).
Does anyone know an efficient algorithm that always produces the right result?

You can use this
(x >> N) | (-(x < 0) << (32 - N))
If x is negative then -(x < 0) evaluates to -1, which has a bit pattern of all 1s, assuming two's complement. -1 << (32 - N) then yields a value with all 1s in the top N bits and 0s in the remaining part. If x is non-negative then that term is always zero, and the result is the same as a logical shift. Alternatively it can be modified to
(x >> N) | ~(((x < 0) << (32 - N)) - 1)
Note that it won't work for N <= 0 or N >= 32 (shifting by at least the width of the type invokes undefined behavior), so you should treat those cases specially if needed.
If you're not allowed to use comparison then you can change x < 0 to (unsigned)x >> 31 and use any of the following equivalent forms:
(x >> N) | (-((unsigned)x >> 31) << (32 - N))
(x >> N) | ((~0*(unsigned)x >> 31) << (32 - N))
(x >> N) | ~((((unsigned)x >> 31) << (32 - N)) - 1)

LSR 1:
+---+---+---+---
|   |   |   | ...
+---+---+---+---
  \   \   \   \
   \   \   \
    \   \   \
+---+---+---+---
| 0 |   |   | ...
+---+---+---+---
ASR 1:
+---+---+---+---
|   |   |   | ...
+---+---+---+---
|\   \   \   \
| \   \   \
|  \   \   \
+---+---+---+---
|   |   |   | ...
+---+---+---+---
An ASR consists of an LSR plus an OR.
So you want to replicate bit31 N times. The efficient solution is probably an efficient implementation of
( bit31 ? 0xFFFFFFFF : 0x00000000 ) << ( 32 - N )
I've come up with
LSL(SUB(LSR(NOT(X), 31), 1), SUB(32, N))
The whole thing
OR(LSL(SUB(LSR(NOT(X), 31), 1), SUB(32, N)), LSR(X, N))
That doesn't seem very efficient.


What does 'if((mask | u)==u)' mean?

This is the recurrence relation of the maximum sum subset problem.
The complete code is:
if ((mask | u) == u)
    dp[u] = max(max(0, dp[u ^ mask] + array[i]), dp[u]);
What does exactly mean the following if-statement?
if((mask | u) == u)
Thank you in advance!
It means: “Are all bits of mask in u”.
So if there is a bit in mask not in u this test returns false.
For instance with mask=0b001 and u=0b011 it returns true. But with mask=0b101 and u=0b011 it returns false because the third bit of mask is not set in u.
Binary OR for binary values A, B evaluates as (1) if either A = 1 or B = 1. These bitwise operations extend to strings of binary digits. In C/C++, that's most commonly expressed as integral types.
OR    | A = 0 | A = 1 |
------+-------+-------+
B = 0 |  (0)  |  (1)  |
------+-------+-------+
B = 1 |  (1)  |  (1)  |
------+-------+-------+
(forgive the ASCII art - more concise illustrations and links are here)
mask = {m(n - 1), m(n - 2), .., m(1), m(0)} : (n) binary digits (bits) m(i)
u = {u(n - 1), u(n - 2), .., u(1), u(0)} : (n) binary digits (bits) u(i)
Let's consider (m(i) | u(i)) == u(i) for: i = {0, .., n - 1} ; should any of these bit-wise comparisons be false, then the expression ((mask | u) == u) evaluates as false.
From the OR table we can conclude that a bit of the expression is false if and only if m(i) = 1 and u(i) = 0: in that case m(i) | u(i) == (1 | 0) == (1), which does not equal u(i) == (0).
A more concise way of expressing the issue is that if mask has a bit at a position (i) set to (1), and u has a bit at the same position cleared to (0), then (mask | u) cannot equal u.

How does this code calculate the parity of a number? [duplicate]

This question already has answers here:
Computing the Parity
(2 answers)
Closed 3 years ago.
So I am supposed to find the parity of a number, where the parity is odd if there is an odd number of set bits in the binary representation of the given number and even otherwise. I have a solution, but the code is not very understandable, so I am looking for an explanation of it.
int y = x ^ (x >> 1); //given number is x.
y = y ^ (y >> 2);
y = y ^ (y >> 4);
y = y ^ (y >> 8);
y = y ^ (y >> 16);
// Rightmost bit of y holds the parity value
// if (y&1) is 1 then parity is odd else even
if (y & 1)
return odd;
return even;
This is quite simple. Here is how the bits are merged to get the parity bit:

Bit numbers:
 0 1   2 3   4 5   6 7
  \|    \|    \|    \|
  01    23    45    67      x ^ (x >> 1)
    \    /      \    /
     \  /        \  /
     0123        4567       y ^ (y >> 2)
        \          /
         \        /
          \      /
           \    /
          01234567          y ^ (y >> 4)

This tree shows how the least significant bit is effectively evaluated.
Note that extra bit pairs are produced at each step, but they simply have no effect on the bit that ends up as the final result.

What does i+=(i&-i) do? Is it portable?

Let i be a signed integer type. Consider
i += (i&-i);
i -= (i&-i);
where initially i>0.
What do these do? Is there an equivalent code using arithmetic only?
Is this dependent on a specific bit representation of negative integers?
Source: setter's code of an online coding puzzle (w/o any explanation/comments).
The expression i & -i relies on two's complement being used to represent negative integers. Simply put, it returns a value k in which every bit except the least significant set bit of i is cleared to 0, while that particular bit keeps its value (i.e. 1).
As long as the expression you provided executes in a system where Two's Complement is being used to represent negative integers, it would be portable. So, the answer to your second question would be that the expression is dependent on the representation of negative integers.
To answer your first question: since arithmetic expressions depend on the data types and their representations, I do not think there is a purely arithmetic expression equivalent to i & -i. In essence, the code below is functionally equivalent (assuming that i is of type int); notice, though, that I had to use a loop, not just arithmetic, to reproduce it.
int tmp = 0, k = 0;
while (tmp < 32)
{
    if (i & (1 << tmp))
    {
        k = i & (1 << tmp);
        break;
    }
    tmp++;
}
i += k;
On a two's complement architecture, with 4-bit signed integers:
|  i | bin  | comp | -i | i&-i | dec |
+----+------+------+----+------+-----+
|  0 | 0000 | 0000 | -0 | 0000 |  0  |
|  1 | 0001 | 1111 | -1 | 0001 |  1  |
|  2 | 0010 | 1110 | -2 | 0010 |  2  |
|  3 | 0011 | 1101 | -3 | 0001 |  1  |
|  4 | 0100 | 1100 | -4 | 0100 |  4  |
|  5 | 0101 | 1011 | -5 | 0001 |  1  |
|  6 | 0110 | 1010 | -6 | 0010 |  2  |
|  7 | 0111 | 1001 | -7 | 0001 |  1  |
| -8 | 1000 | 1000 | -8 | 1000 |  8  |
| -7 | 1001 | 0111 |  7 | 0001 |  1  |
| -6 | 1010 | 0110 |  6 | 0010 |  2  |
| -5 | 1011 | 0101 |  5 | 0001 |  1  |
| -4 | 1100 | 0100 |  4 | 0100 |  4  |
| -3 | 1101 | 0011 |  3 | 0001 |  1  |
| -2 | 1110 | 0010 |  2 | 0010 |  2  |
| -1 | 1111 | 0001 |  1 | 0001 |  1  |
Remarks:
You can conjecture that i&-i only has one bit set (it's a power of 2) and that it matches the least significant set bit of i.
i + (i&-i) has the interesting property of being one bit closer to the next power of two.
i += (i&-i) clears the least significant set bit of i and carries a 1 into the bits above it.
So, doing i += (i&-i); will eventually make you jump to the next power of two:

| i | i&-i | sum |      | i | i&-i | sum |
+---+------+-----+      +---+------+-----+
| 1 |   1  |  2  |      | 5 |   1  |  6  |
| 2 |   2  |  4  |      | 6 |   2  | -8  |
| 4 |   4  | -8  |      |-8 |  -8  | UB  |
|-8 |  -8  | UB  |

| i | i&-i | sum |      | i | i&-i | sum |
+---+------+-----+      +---+------+-----+
| 3 |   1  |  4  |      | 7 |   1  | -8  |
| 4 |   4  | -8  |      |-8 |  -8  | UB  |
|-8 |  -8  | UB  |
UB: overflow of signed integer exhibits undefined behavior.
If i has unsigned type, the expressions are completely portable and well-defined.
If i has signed type, it's not portable, since & is defined in terms of representations but unary -, +=, and -= are defined in terms of values. If the C++ standard mandates two's complement (as C++20 now does), it becomes portable and does the same thing as in the unsigned case.
In the unsigned case (and the two's complement case), it's easy to confirm that i&-i is a power of two (has only one bit nonzero), and has the same value as the lowest-place bit of i (which is also the lowest-place bit of -i). Therefore:
i -= i&-i; clears the lowest-set bit of i.
i += i&-i; increments (clearing, but with carry to higher bits) the lowest-set bit of i.
For unsigned types there is never overflow for either expression. For signed types, i -= i&-i overflows taking -i when i initially has the minimum value of the type, and i += i&-i overflows in the += when i initially has the max value of the type.
Here is what I researched prompted by other answers. The bit manipulations
i -= (i&-i); // strips off the LSB (least-significant bit)
i += (i&-i); // adds the LSB
are used, predominantly, in traversing a Fenwick tree. In particular, i&-i gives the LSB if signed integers are represented via two's complement. As already pointed out by Peter Fenwick in his original proposal, this is not portable to other signed integer representations. However,
i &= i-1; // strips off the LSB
is portable (it also works with one's complement and sign-magnitude representations) and uses one operation fewer.
However there appears to be no simple portable alternative for adding the LSB.
i & -i is the easiest way to get the least significant bit (LSB) for an integer i.
You can read more here.
A1: You can read more about 'Mathematical Equivalents' here.
A2: If the negative integer representation is not the usual two's complement form, then i & -i might not be the LSB.
The easiest way to think of it is in terms of the mathematical equivalence:
-i == (~i + 1)
So -i inverts the bits of the value and then adds 1. The significance of this is that all the lower 0 bits of i are turned into 1s by the ~i operation, so adding 1 to the value causes all those low 1 bits to flip to 0 whilst carrying the 1 upwards until it lands in a 0 bit, which will just happen to be the same position as the lowest 1 bit in i.
Here's an example for the number 6 (0110 in binary):
i = 0110
~i == 1001
(~i + 1) == 1010
i & (~i + 1) == 0010
You may need to do each operation manually a few times before you realise the patterns in the bits.
Here's two more examples:
i = 1000
~i == 0111
(~i + 1) == 1000
i & (~i + 1) == 1000
i = 1100
~i == 0011
(~i + 1) == 0100
i & (~i + 1) == 0100
See how the + 1 causes a sort of 'bit cascade' carrying the one up to the first open 0 bit?
So if (i & -i) is a means of extracting the lowest 1 bit, then it follows that the use cases of i += (i & -i) and i -= (i & -i) are attempts to add and subtract the lowest 1 bit of a value.
Subtracting the lowest 1 bit of a value from itself serves as a means to zero out that bit.
Adding the lowest 1 bit of a value to itself doesn't appear to have any special purpose, it just does what it says on the tin.
It should be portable on any system using two's complement.

Odd bit operator in the increment statement of a for loop [duplicate]

This question already has answers here:
meaning of (number) & (-number)
(4 answers)
Closed 8 years ago.
Given this for loop:
for(++i; i < MAX_N; i += i & -i)
what is it supposed to mean? What does the statement i += i & -i accomplish?
This loop is often seen in binary indexed tree (BIT) implementations, which update and query ranges or points in logarithmic time. The loop helps choose the appropriate bucket based on the set bits of the index. For more details, please consider reading about BITs from some other source; below I show how this loop finds the appropriate buckets based on the least significant set bit.
Two's complement signed system (when i is signed)
i & -i is a bit hack to quickly find the number that, when added to i, clears its least significant set bit (this is why a BIT's performance is logarithmic). When you negate a number in a two's complement system, you get a number with the inverse bit pattern, plus 1. When that 1 is added, every less significant bit keeps flipping as long as it is 1 (i.e. was 0 in the original number); the first 0 bit encountered (1 in the original i) becomes 1.
When you AND i and -i, only that bit (the least significant set bit) remains set: all less significant (right) bits are zero, and all more significant bits of -i are the inverse of the original number, so they AND to zero.
The AND therefore yields a power-of-2 number that, when added to i, clears the least significant set bit (as the BIT requires).
For example:
i = 28
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
                      *
-i
+---+---+---+---+---+---+---+---+
| 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
  I   I   I   I   I   S   Z   Z

Z = Zero
I = Inverted
S = Same
* = least significant set bit

i & -i
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
Adding
+---+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
          x
x = cleared now
Unsigned (when i is unsigned)
It will work even for 1s complementary system or any other representation system as long as i is unsigned for the following reason:
-i will take the value 2^(sizeof(unsigned int) * CHAR_BIT) - i. So all the bits to the right of the least significant set bit remain zero, that bit itself stays the same, but all the bits above it are inverted because of the borrows.
For example:
i = 28
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
                      *
-i
        0   1   1   1   1   1            <--- Borrow bits
      +---+---+---+---+---+---+---+---+
    1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
      +---+---+---+---+---+---+---+---+
      +---+---+---+---+---+---+---+---+
    - | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
      +---+---+---+---+---+---+---+---+
    -----------------------------------
      +---+---+---+---+---+---+---+---+
      | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
      +---+---+---+---+---+---+---+---+
        I   I   I   I   I   S   Z   Z

Z = Zero
I = Inverted
S = Same
* = least significant set bit

i & -i
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
Adding
+---+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
          x
x = cleared now
Implementation without bithack
You can also use i += (int)(1U << __builtin_ctz((unsigned)i)) on gcc.
Live example here
A non-obfuscated version of the same would be:
/* Finds the smallest power of 2 that, when added, clears the least significant set bit */
int convert(int i) /* i purposely kept signed, as this works for both */
{
    int t = 1, d;
    for (; /* ever */ ;) {
        d = t + i;            /* try this value of t */
        if (d & t) t *= 2;    /* bit at mask t was 0, try next */
        else break;           /* found */
    }
    return t;
}
EDIT: Adding from this answer:
Assuming 2's complement (or that i is unsigned), -i is equal to ~i+1.
i & (~i + 1) is a trick to extract the lowest set bit of i.
It works because what +1 actually does is to set the lowest clear bit,
and clear all bits lower than that. So the only bit that is set in
both i and ~i+1 is the lowest set bit from i (that is, the lowest
clear bit in ~i). The bits lower than that are clear in ~i+1, and the
bits higher than that are non-equal between i and ~i.

what this "if(k.c[3] & c)" part of code doing?

#include <iostream>
using namespace std;

int main()
{
    unsigned char c, i;
    union temp
    {
        float f;
        char c[4];
    } k;
    cin >> k.f;
    c = 128;
    for (i = 0; i < 8; i++)
    {
        if (k.c[3] & c) cout << '1';
        else cout << '0';
        c = c >> 1;
    }
    c = 128;
    cout << '\n';
    for (i = 0; i < 8; i++)
    {
        if (k.c[2] & c) cout << '1';
        else cout << '0';
        c = c >> 1;
    }
    return 0;
}
if(k.c[2] & c)
That is called bitwise AND.
Illustration of bitwise AND
//illustration : mathematics of bitwise AND
a = 10110101 (binary representation)
b = 10011010 (binary representation)
c = a & b
= 10110101 & 10011010
= 10010000 (binary representation)
= 128 + 16 (decimal)
= 144 (decimal)
Bitwise AND uses this truth table:
X | Y | R = X & Y
--+---+----------
0 | 0 |     0
0 | 1 |     0
1 | 0 |     0
1 | 1 |     1
See these tutorials on bitwise AND:
Bitwise Operators in C and C++: A Tutorial
Bitwise AND operator &
A bitwise operation (AND in this case) performs a bit-by-bit operation between the two operands.
For example the & :
11010010 &
11000110 =
11000010
Bitwise Operation in your code
c = 128 therefore the binary representation is
c = 10000000
a & c ANDs every i-th bit of c with the corresponding i-th bit of a. Because c has a 1 only in the MSB position (bit 7), a & c is non-zero only if a has a 1 in its bit 7; if a has a 0 there, a & c is zero. This logic is used in the if block above: it is entered depending on whether the MSB (bit 7) of the byte is 1 or not.
Suppose a = ? ? ? ? ? ? ? ? where a ? is either 0 or 1
Then
a = ? ? ? ? ? ? ? ?
AND & & & & & & & &
c = 1 0 0 0 0 0 0 0
---------------
? 0 0 0 0 0 0 0
As 0 & ? = 0, if bit position 7 is 0 then the answer is 0; if bit position 7 is 1 then the answer is 1.
In each iteration c is shifted right one position, so the single 1 in c propagates toward the least significant bit. So in each iteration, masking the other variable with c tells you whether that variable has a 1 or a 0 at that bit position.
Use in your code
You have
union temp
{
float f;
char c[4];
} k;
Inside the union the float and the char c[4] share the same memory location (as the property of union).
Now, sizeof(f) is 4 bytes. You assign k.f = 5345341 or whatever. When you access k.c[0] it accesses the 0th byte of the float f; k.c[1] accesses the 1st byte, and so on. The array is not separate storage: both the float and the array refer to the same memory location, just accessed differently. This is effectively a mechanism to access the 4 bytes of the float byte-wise.
NOTE THAT k.c[0] may address the last byte instead of the first (contrary to the above); this depends on the byte ordering of storage in memory (see little-endian and big-endian byte ordering).
Union k
+--------+--------+--------+--------+ --+
|  c[0]  |  c[1]  |  c[2]  |  c[3]  |   |
+--------+--------+--------+--------+   |---> share the same location (little endian)
|              float f              |   |
+-----------------------------------+ --+
Or the byte ordering could be reversed:
Union k
+--------+--------+--------+--------+ --+
|  c[3]  |  c[2]  |  c[1]  |  c[0]  |   |
+--------+--------+--------+--------+   |---> share the same location (big endian)
|              float f              |   |
+-----------------------------------+ --+
Your code loops over this and shifts c, which propagates the single 1 in c from bit 7 down to bit 0 one step at a time; the bitwise AND thus checks every bit position of the examined bytes of the float variable f, printing 1 if the bit is set and 0 otherwise.
If you print all the 4 bytes of the float, then you can see the IEEE 754 representation.
c has single bit in it set. 128 is 10000000 in binary. if(k.c[2] & c) checks if that bit is set in k.c[2] as well. Then the bit in c is shifted around to check for other bits.
As a result, the program appears to display the binary representation of the float (or at least of the two bytes it examines).