Odd bit operator in the increment statement of a for loop [duplicate] - c++

This question already has answers here:
meaning of (number) & (-number)
(4 answers)
Closed 8 years ago.
Given this for loop:
for(++i; i < MAX_N; i += i & -i)
what is it supposed to mean? What does the statement i += i & -i accomplish?

This loop is often observed in binary indexed tree (or BIT) implementation which is useful to update range or point and query range or point in logarithmic time. This loop helps to choose the appropriate bucket based on the set bit in the index. For more details, please consider to read about BIT from some other source. In below post I will show how does this loop help to find the appropriate buckets based on the least significant set bits.
2s complementary signed system (when i is signed)
i & -i is a bit hack to quickly find the number that should be added to given number to make its trailing bit 0(that's why performance of BIT is logarithmic). When you negate a number in 2s complementary system, you will get a number with bits in inverse pattern added 1 to it. When you add 1, all the less significant bits would start inverting as long as they are 1 (were 0 in original number). First 0 bit encountered (1 in original i) would become 1.
When you and both i and -i, only that bit (least significant 1 bit) would remain set and all lesser significant (right) bits would be zero and more significant bits would be inverse of original number.
Anding would generate a power of 2 number that when added to number i would clear the least significant set bit. (as per the requirement of BIT)
For example:
i = 28
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
*
-i
+---+---+---+---+---+---+---+---+
| 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
I I I I I S Z Z
Z = Zero
I = Inverted
S = Same
* = least significant set bit
i & -i
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
Adding
+---+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
x
x = cleared now
Unsigned (when i is unsigned)
It will work even for 1s complementary system or any other representation system as long as i is unsigned for the following reason:
-i will take value of 2sizeof(unsigned int) * CHAR_BITS - i. So all the bits right to the least significant set bit would remain zero, least significant bit would also remain zero, but all the bits after that would be inverted because of the carry bits.
For example:
i = 28
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
*
-i
0 1 1 1 1 1 <--- Carry bits
+---+---+---+---+---+---+---+---+
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
+---+---+---+---+---+---+---+---+
- | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
----------------------------------------
+---+---+---+---+---+---+---+---+
| 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
I I I I I S Z Z
Z = Zero
I = Inverted
S = Same
* = least significant set bit
i & -i
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+---+---+---+
Adding
+---+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
x
x = cleared now
Implementation without bithack
You can also use i += (int)(1U << __builtin_ctz((unsigned)i)) on gcc.
Live example here
A non obfuscated version for the same would be:
/* Finds smallest power of 2 that can reset the least significant set bit on adding */
int convert(int i) /* I purposely kept i signed as this works for both */
{
int t = 1, d;
for(; /* ever */ ;) {
d = t + i; /* Try this value of t */
if(d & t) t *= 2; /* bit at mask t was 0, try next */
else break; /* Found */
}
return t;
}
EDITAdding from this answer:
Assuming 2's complement (or that i is unsigned), -i is equal to ~i+1.
i & (~i + 1) is a trick to extract the lowest set bit of i.
It works because what +1 actually does is to set the lowest clear bit,
and clear all bits lower than that. So the only bit that is set in
both i and ~i+1 is the lowest set bit from i (that is, the lowest
clear bit in ~i). The bits lower than that are clear in ~i+1, and the
bits higher than that are non-equal between i and ~i.

Related

What does i+=(i&-i) do? Is it portable?

Let i be a signed integer type. Consider
i += (i&-i);
i -= (i&-i);
where initially i>0.
What do these do? Is there an equivalent code using arithmetic only?
Is this dependent on a specific bit representation of negative integers?
Source: setter's code of an online coding puzzle (w/o any explanation/comments).
The expression i & -i is based on Two's Complement being used to represent negative integers. Simply put, it returns a value k where each bit except the least significant non-zero bit of i is set to 0, but that particular bit keeps its own value. (i.e. 1)
As long as the expression you provided executes in a system where Two's Complement is being used to represent negative integers, it would be portable. So, the answer to your second question would be that the expression is dependent on the representation of negative integers.
To answer your first question, since arithmetic expressions are dependent on the data types and their representations, I do not think that there is a solely arithmetic expression that would be equivalent to the expression i & -i. In essence, the code below would be equivalent in functionality to that expression. (assuming that i is of type int) Notice, though, that I had to make use of a loop to produce the same functionality, and not just arithmetics.
int tmp = 0, k = 0;
while(tmp < 32)
{
if(i & (1 << tmp))
{
k = i & (1 << tmp);
break;
}
tmp++;
}
i += k;
On a Two's Complement architecture, with 4 bits signed integers:
| i | bin | comp | -i | i&-i | dec |
+----+------+------+----+------+-----+
| 0 | 0000 | 0000 | -0 | 0000 | 0 |
| 1 | 0001 | 1111 | -1 | 0001 | 1 |
| 2 | 0010 | 1110 | -2 | 0010 | 2 |
| 3 | 0011 | 1101 | -3 | 0001 | 1 |
| 4 | 0100 | 1100 | -4 | 0100 | 4 |
| 5 | 0101 | 1011 | -5 | 0001 | 1 |
| 6 | 0110 | 1010 | -6 | 0010 | 2 |
| 7 | 0111 | 1001 | -7 | 0001 | 1 |
| -8 | 1000 | 1000 | -8 | 1000 | 8 |
| -7 | 1001 | 0111 | 7 | 0001 | 1 |
| -6 | 1010 | 0110 | 6 | 0010 | 2 |
| -5 | 1011 | 0101 | 5 | 0001 | 1 |
| -4 | 1100 | 0100 | 4 | 0100 | 4 |
| -3 | 1101 | 0011 | 3 | 0001 | 1 |
| -2 | 1110 | 0010 | 2 | 0010 | 2 |
| -1 | 1111 | 0001 | 1 | 0001 | 1 |
Remarks:
You can conjecture that i&-i only has one bit set (it's a power of 2) and it matches the least significant bit set of i.
i + (i&-i) has the interesting property to be one bit closer to the next power of two.
i += (i&-i) sets the least significant unset bit of i.
So, doing i += (i&-i); will eventually make you jump to the next power of two:
| i | i&-i | sum | | i | i&-i | sum |
+---+------+-----+ +---+------+-----+
| 1 | 1 | 2 | | 5 | 1 | 6 |
| 2 | 2 | 4 | | 6 | 2 | -8 |
| 4 | 4 | -8 | |-8 | -8 | UB |
|-8 | -8 | UB |
| i | i&-i | sum | | i | i&-i | sum |
+---+------+-----+ +---+------+-----+
| 3 | 1 | 4 | | 7 | 1 | -8 |
| 4 | 4 | -8 | |-8 | -8 | UB |
|-8 | -8 | UB |
UB: overflow of signed integer exhibits undefined behavior.
If i has unsigned type, the expressions are completely portable and well-defined.
If i has signed type, it's not portable, since & is defined in terms of representations but unary -, +=, and -= are defined in terms of values. If the next version of the C++ standard mandates twos complement, though, it will become portable, and will do the same thing as in the unsigned case.
In the unsigned case (and the twos complement case), it's easy to confirm that i&-i is a power of two (has only one bit nonzero), and has the same value as the lowest-place bit of i (which is also the lowest-place bit of -i). Therefore:
i -= i&-i; clears the lowest-set bit of i.
i += i&-i; increments (clearing, but with carry to higher bits) the lowest-set bit of i.
For unsigned types there is never overflow for either expression. For signed types, i -= i&-i overflows taking -i when i initially has the minimum value of the type, and i += i&-i overflows in the += when i initially has the max value of the type.
Here is what I researched prompted by other answers. The bit manipulations
i -= (i&-i); // strips off the LSB (least-significant bit)
i += (i&-i); // adds the LSB
are used, predominantly, in traversing a Fenwick tree. In particular, i&-i gives the LSB if signed integers are represented via two's complement. As already pointed out by Peter Fenwick in his original proposal, this is not portable to other signed integer representations. However,
i &= i-1; // strips off the LSB
is (it also works with one's complement and signed magnitude representations) and has one fewer operations.
However there appears to be no simple portable alternative for adding the LSB.
i & -i is the easiest way to get the least significant bit (LSB) for an integer i.
You can read more here.
A1: You can read more about 'Mathematical Equivalents' here.
A2: If the negative integer representation is not the usual standard form (i.e. weird big integers), then i & -i might not be LSB.
The easiest way to think of it is in terms of the mathematical equivalence:
-i == (~i + 1)
So -i inverts the bits of the value and then adds 1. The significance of this is that all the lower 0 bits of i are turned into 1s by the ~i operation, so adding 1 to the value causes all those low 1 bits to flip to 0 whilst carrying the 1 upwards until it lands in a 0 bit, which will just happen to be the same position as the lowest 1 bit in i.
Here's an example for the number 6 (0110 in binary):
i = 0110
~i == 1001
(~i + 1) == 1010
i & (~i + 1) == 0010
You may need to do each operation manually a few times before you realise the patterns in the bits.
Here's two more examples:
i = 1000
~i == 0111
(~i + 1) == 1000
i & (~i + 1) == 1000
i = 1100
~i == 0011
(~i + 1) == 0100
i & (~i + 1) == 0100
See how the + 1 causes a sort of 'bit cascade' carrying the one up to the first open 0 bit?
So if (i & -i) is a means of extracting the lowest 1 bit, then it follows that the use cases of i += (i & -i) and i -= (i & -i) are attempts to add and subtract the lowest 1 bit of a value.
Subtracting the lowest 1 bit of a value from itself serves as a means to zero out that bit.
Adding the lowest 1 bit of a value to itself doesn't appear to have any special purpose, it just does what it says on the tin.
It should be portable on any system using two's complement.

Define a pointer with an arbitrary span

If I have a large array where the data streams are interleaved in some complex fashion, can I define a pointer p such that p + 1 is some arbitrary offset b bytes.
For example lets say I have 1,000,000 ints, 4 bytes each.
int* p_int = my_int_array;
This gives me *(p_int+1) == my_int_array[1] (moves 4 bytes)
I am looking for something like
something_here(b)* big_span_p_int = my_int_array;
which would make *(big_span_p_int + 1) == my_int_array[b] (moves 4*b or b bytes, whichever is possible or easier)
Is this possible? easy?
Thanks for the help.
EDIT:
b is compile time variable.
Using some of your code. There is no need to declare an additional pointer/array. Applying pointer arithmetic on p_int is enough to traverse and reach the number value you are seeking.
Let's look at this example:
int main() {
int my_int_array[5] {1,2,3,4,5};
int* p_int = my_int_array;
int b = 2;
std::cout << *(p_int + b) << std::endl; // Output is 3, because *p_int == my_int_array[0], so my_int_array[2] will give you the third index of the array.
}
Graphically represented:
Memory Address | Stored Value (values or memory addresses)
----------------------------------------------
0 | .....
1 | .....
2 | .....
3 | .....
4 | .....
5 | .....
6 | .....
7 | .....
8 | .....
. | .....
. | .....
. | .....
n-1 | .....
Imagine the memory as being a very big array in which you can access positions by its memory address (in this case we've simplified the addresses to natural numbers. In reality they're hexadecimal values). "n" is the total amount (or size) of the memory. Since Memory counts and starts in 0, size is equivalent to n-1.
Using the example above:
1. When you invoke:
int my_int_array[5] {1,2,3,4,5};
The Operating System and the C++ compiler allocates the integer array memory statically for you, but we can think that our memory has been changed. E.g. Memory address 2 (decided by the compiler) now has our first value of my_int_array.
Memory Address | Name - Stored Value (values or memory addresses)
-----------------------------------------------------
0 | .....
1 | .....
2 | my_int_array[0] = 1
3 | my_int_array[1] = 2
4 | my_int_array[2] = 3
5 | my_int_array[3] = 4
6 | my_int_array[4] = 5
7 | .....
8 | .....
. | .....
. | .....
. | .....
n-1 | .....
2. Now if we say:
int* p_int = my_int_array;
The memory changes again. E.g. Memory address 8 (decided by the compiler) now has a int pointer called *p_int.
Memory Address | Name - Stored Value (values or memory addresses)
-----------------------------------------------------
0 | .....
1 | .....
2 | my_int_array[0] = 1
3 | my_int_array[1] = 2
4 | my_int_array[2] = 3
5 | my_int_array[3] = 4
6 | my_int_array[4] = 5
7 | .....
8 | p_int = 2 (which means it points to memory address 2, which has the value of my_int_array[0] = 1)
. | .....
. | .....
. | .....
n-1 | .....
3. If in your program you now say:
p_int += 2; // You increase the value by 2 (or 8 bytes), it now points elsewhere, 2 index values ahead in the array.
Memory Address | Name - Stored Value (values or memory addresses)
-----------------------------------------------------
0 | .....
1 | .....
2 | my_int_array[0] = 1
3 | my_int_array[1] = 2
4 | my_int_array[2] = 3
5 | my_int_array[3] = 4
6 | my_int_array[4] = 5
7 | .....
8 | p_int = 4 (which means it points to memory address 4, which has the value of my_int_array[2] = 3)
. | .....
. | .....
. | .....
n-1 | .....
When doing memory allocation and pointer arithmetic in a simple case like this, you don't have to worry about the size in bytes of an int (4 bytes). The pointers here are already bound to a type (in this case int) when you declared them, so just by increasing their value by integer values, p_int + 1, this will make point p_int point to the next 4 bytes or int value. Just by adding the values to the pointers you will get the next integer.
If b is a constant expression (a compile-time constant), then pointer declared as
int (*big_span_p_int)[b]
will move by b * sizeof(int) bytes every time you increment it.
In C you can use a run-time value in place of b, but since your question is tagged [C++], this is not applicable.

Queue with mod operation

I'm studying fundamental of data structure (Queue) , so far I understand the flow of Queue but I don't understand whenever queue is applying with Mod operator. There a several question which confusing my brain. How to answer this question (refer to picture)?
The best method for handling circular queues is to draw them out. Since circles don't post very well with ASCII art, I'll use a linear array.
+---+---+---+---+---+
| | | | | |
+---+---+---+---+---+
0 1 2 3 4
^
Rear
The REAR is at index 4.
Let's perform the operation step by step.
First: Add 1 to REAR. This makes REAR point beyond the array:
+---+---+---+---+---+
| | | | | |
+---+---+---+---+---+
0 1 2 3 4 5
^
Rear
Applying the modulo operation, %, this will give us the remainder of 5 / 5 which is zero:
+---+---+---+---+---+
| | | | | |
+---+---+---+---+---+
0 1 2 3 4
^
Rear
Thus the modulo operation wraps around the array, like a circle.
The next question is for you to solve. Remember draw the array or queue. You can use circles (think of a pie sliced or a pizza sliced).
Edit 1: Modulo details
The modulo operation will give a value in the range 0..N, when N is the divisor.
Given N == 4, here are some results for modulo:
Index result
0 0
1 1
2 2
3 3
4 0 --> The remainder of 4 / 4 == 0.
5 1
6 2
7 3
8 0 --> The remainder of 8 / 4 == 0.
Modulus returns the remainder of the two operands. For example, 4%2=0 since 4/2=2 with no remainder, while 4%3=1 since 4/3=1 with remainder 1. Since you can never have a remainder higher than the right operand, you have an effective "range" of answers for any modulus of 0 to (n-1). With that in mind, just plug in the numbers for the variables ((4+1)%5=? and (1+1)%4=?). Usually to find the remainder you would use long division, but one useful thing to remember is that any number divided by itself has a remainder of 0, and any number divided by a larger number will have a remainder equal to itself.

Int into Array C++ help please

I am trying to separate an integer into an array. I have been using modulo 10 and then dividing by 10 but I believe that will only work for numbers 6 digits or less I may be wrong but it is not working for me. This is what I have:
for(int i=0; i<=8; i++){
intAr[i] = intVal%10;
intVal /= 10;
}
It is not working for me and help would be lovely
The problem i guess you have is that the number in the array is reversed. So try this:
for(i=8;i>=0;i--)
{
intAr[i] = intVal%10;
intVal /= 10;
}
This will work and have the number stored correctly in the array
If you are expecting the numbers to be stored in your array right-to-left, you'll need to reverse the way your store the values:
for(int i=0; i < 9; i++)
{
intAr[9 - i - 1] = intVal % 10;
intVal /= 10;
}
This will store your number (103000648) like this
|-0-|-1-|-2-|-3-|-4-|-5-|-6-|-7-|-8-|
| 1 | 0 | 3 | 0 | 0 | 0 | 6 | 4 | 8 |
instead of
|-0-|-1-|-2-|-3-|-4-|-5-|-6-|-7-|-8-|
| 8 | 4 | 6 | 0 | 0 | 0 | 3 | 0 | 1 |

what this "if(k.c[3] & c)" part of code doing?

#include<stdio.h>
#include<iostream.h>
main()
{
unsigned char c,i;
union temp
{
float f;
char c[4];
} k;
cin>>k.f;
c=128;
for(i=0;i<8;i++)
{
if(k.c[3] & c) cout<<'1';
else cout<<'0';
c=c>>1;
}
c=128;
cout<<'\n';
for(i=0;i<8;i++)
{
if(k.c[2] & c) cout<<'1';
else cout<<'0';
c=c>>1;
}
return 0;
}
if(k.c[2] & c)
That is called bitwise AND.
Illustration of bitwise AND
//illustration : mathematics of bitwise AND
a = 10110101 (binary representation)
b = 10011010 (binary representation)
c = a & b
= 10110101 & 10011010
= 10010000 (binary representation)
= 128 + 16 (decimal)
= 144 (decimal)
Bitwise AND uses this truth table:
X | Y | R = X & Y
---------
0 | 0 | 0
0 | 1 | 0
1 | 0 | 0
1 | 1 | 1
See these tutorials on bitwise AND:
Bitwise Operators in C and C++: A Tutorial
Bitwise AND operator &
A bitwise operation (AND in this case) perform a bit by bit operation between the 2 operands.
For example the & :
11010010 &
11000110 =
11000010
Bitwise Operation in your code
c = 128 therefore the binary representation is
c = 10000000
a & c will and every ith but if c with evert ith bit of a. Because c only has 1 in the MSB position (pos 7), so a & c will be non-zero if a has a 1 in its position 7 bit, if a has a 0 in pos bit, then a & c will be zero. This logic is used in the if block above. The if block is entered depending upon if the MSB (position 7 bit) of the byte is 1 or not.
Suppose a = ? ? ? ? ? ? ? ? where a ? is either 0 or 1
Then
a = ? ? ? ? ? ? ? ?
AND & & & & & & & &
c = 1 0 0 0 0 0 0 0
---------------
? 0 0 0 0 0 0 0
As 0 & ? = 0. So if the bit position 7 is 0 then answer is 0 is bit position 7 is 1 then answer is 1.
In each iteration c is shifted left one position, so the 1 in the c propagates left wise. So in each iteration masking with the other variable you are able to know if there is a 1 or a 0 at that position of the variable.
Use in your code
You have
union temp
{
float f;
char c[4];
} k;
Inside the union the float and the char c[4] share the same memory location (as the property of union).
Now, sizeof (f) = 4bytes) You assign k.f = 5345341 or whatever . When you access the array k.arr[0] it will access the 0th byte of the float f, when you do k.arr[1] it access the 1st byte of the float f . The array is not empty as both the float and the array points the same memory location but access differently. This is actually a mechanism to access the 4 bytes of float bytewise.
NOTE THAT k.arr[0] may address the last byte instead of 1st byte (as told above), this depends on the byte ordering of storage in memory (See little endian and big endian byte ordering for this)
Union k
+--------+--------+--------+--------+ --+
| arr[0] | arr[1] | arr[2] | arr[3] | |
+--------+--------+--------+--------+ |---> Shares same location (in little endian)
| float f | |
+-----------------------------------+ --+
Or the byte ordering could be reversed
Union k
+--------+--------+--------+--------+ --+
| arr[3] | arr[2] | arr[1] | arr[0] | |
+--------+--------+--------+--------+ |---> Shares same location (in big endian)
| float f | |
+-----------------------------------+ --+
Your code loops on this and shifts the c which propagates the only 1 in the c from bit 7 to bit 0 in one step at a time in each location, and the bitwise anding checks actually every bit position of the bytes of the float variable f, and prints a 1 if it is 1 else 0.
If you print all the 4 bytes of the float, then you can see the IEEE 754 representation.
c has single bit in it set. 128 is 10000000 in binary. if(k.c[2] & c) checks if that bit is set in k.c[2] as well. Then the bit in c is shifted around to check for other bits.
As result the program is made to display the binary representation of float it seems.