Bitwise operation for add - C++

Could you please help me figure out why the following expression is true:
x + y = x ^ y + (x & y) << 1
I am looking for some rules from the bitwise logic to explain this mathematical equivalent.

It's like solving an ordinary base 10 addition problem 955 + 445, by first adding all the columns individually and throwing away carried 1s:
955
445
-----
390
Then finding all the columns where there should be a carried 1:
955
445
-----
101
Shifting this and adding it to the original result:
390
+ 1010
------
1400
So basically you're doing addition but ignoring all the carried 1s, and then adding in the carried ones after, as a separate step.
In base 2, XOR (^) performs addition without carry: it produces the correct sum bit whenever at most one of the two bits is 1, and drops the carry when both bits are 1, just like we did in the first step above.
x ^ y correctly adds all the bits where x and y are not both 1:
1110111011
^ 0110111101
-------------
1000000110 (x ^ y)
x & y gives us a 1 in all the columns where both bits are a 1. These are exactly the columns where we missed a carry:
1110111011
& 0110111101
-------------
0110111001 (x & y)
Of course when you carry a 1 when doing addition you shift it left one place, just like when you add in base 10.
1000000110 (x ^ y)
+ 01101110010 ((x & y) << 1)
-------------
10101111000

x + y is not equivalent to x ^ y + (x & y) << 1 as written: in C++, + binds tighter than <<, and << binds tighter than ^, so the right-hand side parses as x ^ ((y + (x & y)) << 1).
However, your expression above will evaluate to true for most values, since = means assignment and non-zero values mean true. == would test for equality.
EDIT
x ^ y + ((x & y) << 1) is correct with parentheses. The AND finds where a carry would happen and the shift carries it. The XOR finds where an addition would happen with no carry. Adding the two together unifies the result.

Related

Why does b = (b - x) & x result in getting the next subset?

The Competitive Programmer's Handbook on page 99 suggests the following way of going through all subsets of a set x (the set bits represent the numbers in the set):
int b = 0;
do {
    // Process subset b
} while (b = (b - x) & x);
I understand all the background about bit representation and bitwise operators.
What I am not understanding is why b = (b - x) & x results in getting the next subset.
This post gives an example, but does not provide any insight.
So, why does this work?
Things become clearer when we remember two's complement. The negative of a number is just 1 plus the bitwise NOT of that number. Thus,
(b - x) = (b + ~x + 1)
Let's work through an example of one iteration of the algorithm. Then I'll explain the logic.
Suppose
x = . 1 1 . . 1 .
b = . [.][.] . . [1] .
^
where . denotes zero.
Let's define "important" bits to be the bits that are in the same position as a 1 in x. I've surrounded the important bits with [], and I've marked the right-most important zero in b with ^.
~x = 1 [.][.] 1 1 [.] 1
~x + b = 1 [.][.] 1 1 [1] 1
~x + b + 1 = 1 [.][1] . . [.] .
(~x + b + 1) & x = . [.][1] . . [.] .
Notice that ~x + b always has a string of ones to the right of the right-most important zero of b. When we add 1, all those ones become zeros, and the right-most important zero becomes a 1.
If we look only at the important bits, we see that b transformed from [.][.][1] into [.][1][.]. Here are what the important bits will be if we continue:
[.][1][.]
[.][1][1]
[1][.][.]
[1][.][1]
[1][1][.]
[1][1][1]
If we write the important bits side-by-side like this, as if they were a binary number, then the operation effectively increments that number by 1. The operation is counting.
Once all the important bits are ones, (b - x) & x simply becomes (x - x) & x, which is 0, causing the loop to terminate.
By that point, we've encountered all 2^n possible values of the n important bits. Those values are the subsets of x.

Absolute value abs(x) using bitwise operators and Boolean logic [duplicate]

This question already has answers here:
How to compute the integer absolute value
(11 answers)
How does this work?
The idea is to make abs(x) use bitwise operators for integers (assuming 32 bit words):
y = x >> 31
(x + y) ^ y // This gives abs(x) (is ^ XOR)?
Assuming 32-bit words, as stated in the question:
For negative x, x >> 31 is implementation-defined in the C and C++ standards. The author of the code expects two’s complement integers and an arithmetic right-shift, in which x >> 31 produces all zero bits if the sign bit of x is zero and all one bits if the sign bit is one.
Thus, if x is positive or zero, y is zero, and x + y is x, so (x + y) ^ y is x, which is the absolute value of x.
If x is negative, y is all ones, which represents −1 in two’s complement. Then x + y is x - 1. Then XORing with all ones inverts all the bits. Inverting all the bits is equivalent to taking the two’s complement and subtracting one, and two’s complement is the method used to negate integers in two’s complement format. In other words, XORing q with all ones gives -q - 1. So x - 1 XORed with all ones produces -(x - 1) - 1 = -x + 1 - 1 = -x, which is the absolute value of x except when x is the minimum possible value for the format (−2,147,483,648 for 32-bit two’s complement), in which case the absolute value (2,147,483,648) is too large to represent, and the resulting bit pattern is just the original x.
This approach relies on several pieces of implementation-specific behavior:
It assumes that x is 32 bits wide. You could fix this with x >> (sizeof(x) * CHAR_BIT - 1).
It assumes that the machine uses two's complement representation.
It assumes that the right-shift operator copies the sign bit from left to right (an arithmetic shift).
Example with 3 bits:
101 -> x = -3
111 -> x >> 2
101 + 111 = 100 -> x + y
100 XOR 111 -> 011 -> 3
This is not portable.
This isn't portable, but I'll explain why it works anyway.
The first operation exploits a trait of 2's complement negative numbers: the first bit is 1 if the number is negative, and 0 if it is positive. This follows from the range the numbers cover.
The example below is for 8 bits, but can be extrapolated to any number of bits. In your case it's 32 bits (8 bits just displays the ranges more easily):
10000000 (smallest negative number)
10000001 (next to smallest)
...
11111111 (negative one)
00000000 (zero)
00000001 (one)
...
01111110 (next to largest)
01111111 (largest)
The reason for using 2's complement encoding is the property that adding any negative number to its positive counterpart yields zero.
Now, to create the negative of a 2's complement number, you would need to
Take the inverse (bitwise NOT) of the input number.
Add one to it.
The reason the 1 is added is to force the addition to zero the register. If it were just x + ~(x), you would get a register of all 1's. Adding one produces a cascading carry that yields a register of zeros (with a 1 in the carry out of the register).
This understanding is important to know "why" the algorithm you provided (mostly) works.
y = x >> 31 // this line acts like an "if" statement.
// Depending on whether x is signed or unsigned, when x is negative,
// it will fill y with 0xFFFFFFFF (arithmetic shift) or 1 (logical
// shift). The rest of the algorithm doesn't care, because it
// accommodates both inputs.
// when x is positive, the result is zero.
We will explore (x is positive first)
(x + y) ^ y // for positive x, first we substitute the y = 0
(x + 0) ^ 0 // reduce the addition
(x) ^ 0 // remove the parenthesis
x ^ 0 // which, by definition of xor, can only yield x
x
Now let's explore (x is negative, y is 0xFFFFFFFF, i.e. x was signed)
(x + y) ^ y // first substitute the Y
(x + 0xFFFFFFFF) ^ 0xFFFFFFFF // note that 0xFFFFFFFF is the same as 2's complement -1
(x - 1) ^ 0xFFFFFFFF // add in a new variable Z to hold the result
(x - 1) ^ 0xFFFFFFFF = Z // take the ^ 0xFFFFFFFF of both sides
(x - 1) ^ 0xFFFFFFFF ^ 0xFFFFFFFF = Z ^ 0xFFFFFFFF // reduce the left side
(x - 1) = Z ^ 0xFFFFFFFF // note that NOT is equivalent to ^ 0xFFFFFFFF
(x - 1) = ~(Z) // add one to both sides
x - 1 + 1 = ~(Z) + 1 // reduce
x = ~(Z) + 1 // by definition, Z is negative x (for 2's complement numbers)
Now let's explore (x is negative, y is 0x01, i.e. x was unsigned)
(x + y) ^ y // first substitute the Y
(x + 1) ^ 0x00000001 // note that x is a 2's complement negative, but is
// being treated as unsigned, so to make the unsigned
// context of x traceable, I'll add a -(x) around the x
(-(x) + 1) ^ 0x00000001 // which simplifies to
(-(x - 1)) ^ 0x00000001 // negative of a negative is positive
(-(x - 1)) ^ -(-(0x00000001)) // substituting 1 for bits of -1
(-(x - 1)) ^ -(0xFFFFFFFF) // pulling out the negative sign
-((x-1) ^ 0xFFFFFFFF) // recalling that while we added signs and negations to
// make the math sensible, there's actually no place to
// store them in an unsigned storage system, so dropping
// them is acceptable
(x - 1) ^ 0xFFFFFFFF = Z // introducing a new variable Z; take the ^ 0xFFFFFFFF of both sides
(x - 1) ^ 0xFFFFFFFF ^ 0xFFFFFFFF = Z ^ 0xFFFFFFFF // reduce the left side
(x - 1) = Z ^ 0xFFFFFFFF // note that NOT is equivalent to ^ 0xFFFFFFFF
(x - 1) = ~(Z) // add one to both sides
x - 1 + 1 = ~(Z) + 1 // reduce
x = ~(Z) + 1 // by definition, Z is negative x (for 2's complement numbers, even though we used only non-2's-complement types)
Note that while the above proofs are passable for a general explanation, the reality is that these proofs don't cover important edge cases, like x = 0x80000000 , which represents a negative number greater in absolute value than any positive X which could be stored in the same number of bits.
I use this code. First, the calculation of the two's complement (the guard just ensures, with a compile-time check, that the template parameter is an integer):
/**
 * Zweierkomplement - Two's Complement
 */
template<typename T> constexpr auto ZQ(T const& _x) noexcept -> T {
  Compile::Guards::IsInteger<T>();
  return ((~(_x)) + 1);
}
and in a second step this is used to calculate the integer abs()
/**
 * if the number is negative, get the same number with positive sign
 */
template<typename T> auto INTABS(T const _x) -> typename std::make_unsigned<T>::type {
  Compile::Guards::IsInteger<T>();
  return static_cast<typename std::make_unsigned<T>::type>((_x < 0) ? (ZQ<T>(_x)) : (_x));
}
Why I use this kind of code:
* compile-time checks
* works with all integer sizes
* portable from small µCs to modern cores
* it's clear that we need to consider the two's complement, so you need an unsigned return value, e.g. for 8 bits abs(-128) = 128 cannot be expressed in a signed integer

Can XorShift return zero?

I've been reading about the XorShift PRNG especially the paper here
A guy here states that
The number lies in the range [1, 2**64). Note that it will NEVER be 0.
Looking at the code that makes sense:
uint64_t x;
uint64_t next(void) {
    x ^= x >> 12; // a
    x ^= x << 25; // b
    x ^= x >> 27; // c
    return x * UINT64_C(2685821657736338717);
}
If x were zero, then every subsequent number would be zero too. But wouldn't that make it less useful? The usual use pattern would be something like min + rand() % (max - min), or converting the 64 bits to 32 bits if you only need an int. But if 0 is never returned, that might be a serious problem. Also, the bits are not 0 or 1 with equal probability: since 0 is missing, zeros are slightly less likely. I can't even find any mention of that on Wikipedia, so am I missing something?
So what is a good/appropriate way to generate random, equally distributed numbers from XorShift64* in a given range?
Short answer: No it cannot return zero.
According to Numerical Recipes, "it produces a full period of 2^64 - 1 [...] the missing value is zero".
The essence is that the shift values have been chosen carefully to produce very long sequences (the longest possible one, without zero), and hence one can be sure that every nonzero number is produced. Zero is indeed the fixed point of this generator, so it produces 2 sequences: zero, and the one containing every other number.
So IMO, for a sufficiently small range max - min, it is enough to use (next() - 1) % (max - min) + min, or even to omit the subtraction altogether, as zero will be produced by the modulo anyway.
If one wants better quality equal distribution one should use the 'usual' method by using next() as a base generator with a range of [1, 2^64)
I am nearly sure that there is an x for which the xorshift operation returns 0.
Proof:
First, we have these equations:
a = x ^ (x >> 12);
b = a ^ (a << 25);
c = b ^ (b >> 27);
Substituting them:
b = (x ^ x >> 12) ^ ((x ^ x >> 12) << 25);
c = b ^ (b >> 27) = ((x ^ x >> 12) ^ ((x ^ x >> 12) << 25)) ^ (((x ^ x >> 12) ^ ((x ^ x >> 12) << 25)) >> 27);
As you can see, although c is a complex expression, it is perfectly linear: every bit of c is an XOR of bits of x.
That means you can express the bits of c as pure Boolean expressions of the bits of x.
Thus, you can simply construct an equation system for the bits c0, c1, c2, ... like so
(note: the coefficients are only examples, I didn't calculate them, but this is how it would look):
c0 = x1 ^ !x32 ^ x47 ...
c1 = x23 ^ x45 ^ !x61 ...
...
c63 = !x13 ^ ...
From that point, you have 64 equations and 64 unknowns. You can simply solve it with Gaussian elimination, and you will always have a single unique solution.
Except in some rare cases, i.e. if the determinant of the coefficient matrix is zero, but that is very unlikely for such a big matrix.
Even if it happens, it would mean that you have an information loss in every iteration, i.e. you can't get all of the 2^64 possible values of x, only some of them.
Now consider the much more probable possibility that the coefficient matrix is non-singular. In this case, the 2^64 possible values of x map to all 2^64 possible values of c, and these are all different.
Thus, you can get zero.
Extension: actually you get zero for zero... sorry, the proof is more useful to show that it is not so simple as it seems for the first spot. The important part is that you can express the bits of c as a boolean function of the bits of x.
There is another problem with this random number generator. And this is that even if you somehow modify the function to not have such problem (for example, by adding 1 in every iteration):
You still can't guarantee that it won't fall into a short loop for some possible value of x. What if there is a loop of length 5 for the value 345234523452345? Can you prove there isn't, for all possible initial values? I can't.
Actually, with a truly pseudorandom iteration function, your system will likely loop after about 2^32 iterations. It has a nearly trivial combinatoric reason, but "unfortunately this margin is too small to contain it" ;-)
So:
If a 2^32 loop length is okay for your PRNG, then use a proven iteration function collected from somewhere on the net.
If it isn't, upgrade the state to at least 128 bits. That will result in a roughly 2^64 loop length, which is not so bad.
If you still want a 64-bit output, then use 128-bit numbers internally, but return (x >> 64) ^ (x & (2^64 - 1)) (i.e. XOR the upper and lower halves of the internal state x).

Why does this work for determining if a number is a power of 2?

int isPower2(int x) {
    int neg_one = ~0;
    return !(neg_one ^ (~x + 1));
}
This code works, I have implemented it and it performs perfectly. However, I cannot wrap my head around why. When I do it by hand, it doesn't make any sense to me.
Say we are starting with a 4 bit number, 4:
0100
This is obviously a power of 2. When I follow the algorithm, though, ~x+1 =
1011 + 1 = 1100
XORing this with negative one (1111) gives 0011. !(0011) = 0. Where am I going wrong here? I know this has to be a flaw in the way I am doing this by hand.
To paraphrase Inigo Montoya, "I do not think this does what you think it does".
Let's break it down.
~x + 1
This flips the bits of 'x' and then adds one. This is the same as taking the 2's complement of 'x'. Or, to put it another way, this is the same as '-x'.
neg_one ^ (~x + 1)
Using what we noted in step 1, this simplifies to ...
neg_one ^ (-x)
or more simply ...
-1 ^ (-x)
But wait! XOR'ing something with -1 is the same as flipping the bits. That is ...
~(-x)
~(-x)
This can be simplified even more if we make use of the 2's complement.
~(-x) + 0
= ~(-x) + 1 - 1
= x - 1
So the whole expression reduces to !(x - 1), which is non-zero only when x == 1: the function really tests whether x equals 1, not whether x is a power of 2.
If you are looking for an easy way to determine if a number is a power of 2, you can use the following instead for numbers greater than zero. It will return true if it is a power of two.
(x & (x - 1)) == 0

What does the statement (1<<y) mean in bitwise operations

I am a complete beginner at bitwise operations (and not very experienced at C either) and I bumped into the expression:
x |= (1<<y)
At first I thought it meant "x equals x or y shifted left by one bit", but then I realized that would be:
x |= (y<<1)
Lastly I thought it meant "x equals x or 1 shifted left by y bits", but I don't understand where that 1 is in an 8-bit register. Does it mean 00000001? So that:
a = 2
b = 1<<a // so b=00000100
Could someone tell me the correct meaning of this statement. Also, if anyone has a good link explaining bitwise syntax I'd be grateful.
Thanks.
x |= ...
is shorthand for
x = x | ...
It assigns the value of x | ... to x.
1 << y
is 1 left-shifted by y. E.g.
00000001 << 1 -> 00000010
So,
x |= (1 << y)
is OR x with 1 left shifted by y (and assign the result to x).
In other words, it sets the y'th bit of x to 1.
x = 01010101
x |= (1 << 1) -> 01010111 (it set the 2nd bit to 1)
The first statement means: left-shift the binary representation of 1 (0b00000001) by y bits, then OR the value with x.
The assumption is correct for the second statement.
The third statement will yield 4 (0b00000100).
In terms of bit-operation semantics, the C standard defines all bit operations on values read right to left with ascending powers of 2. You do not need to worry about endianness or two's complement, etc.; the compiler will handle that for you. So 0b00100 = 4, 0b00010 = 2, 0b00001 = 1, and so on.