Understanding Left and Right Bitwise shift on a 128-bit number - c++

Artichoke101 asked this:
"Let's say that I have an array of four 32-bit integers which I use to store a 128-bit number.
How can I perform left and right shifts on this 128-bit number?"
My question is related to the answer Remus Rusanu gave:
void shiftl128 (
unsigned int& a,
unsigned int& b,
unsigned int& c,
unsigned int& d,
size_t k)
{
assert (k <= 128);
if (k > 32)
{
a=b;
b=c;
c=d;
d=0;
shiftl128(a,b,c,d,k-32);
}
else
{
a = (a << k) | (b >> (32-k));
b = (b << k) | (c >> (32-k));
c = (c << k) | (d >> (32-k));
d = (d << k);
}
}
void shiftr128 (
unsigned int& a,
unsigned int& b,
unsigned int& c,
unsigned int& d,
size_t k)
{
assert (k <= 128);
if (k > 32)
{
d=c;
c=b;
b=a;
a=0;
shiftr128(a,b,c,d,k-32);
}
else
{
d = (c << (32-k)) | (d >> k);
c = (b << (32-k)) | (c >> k);
b = (a << (32-k)) | (b >> k);
a = (a >> k);
}
}
Let's just focus on one shift, say the left shift. Specifically,
a = (a << k) | (b >> (32-k));
b = (b << k) | (c >> (32-k));
c = (c << k) | (d >> (32-k));
d = (d << k);
How is this left shifting the 128-bit number? I understand what bit shifting is, << shifts bits left, (8-bit number) like 00011000 left shifted 2 is 01100000. Same goes for the right shift, but to the right. Then the single "pipe" | is OR meaning any 1 in either 32-bit number will be in the result.
How is a = (a << k) | (b >> (32-k)) shifting the first part (32) of the 128-bit number correctly?

This technique is somewhat idiomatic. Let's simplify to just a and b. We start with:
+----------+----------+
|    a     |    b     |
+----------+----------+
and we want to shift left some amount to obtain:
+----------+----------+
|  a    :  |  b    :  |  c ...
+----------+----------+
|<--x-->|  |
      ->|y |<-
So x is simply a << k. y is the k msbs of b, right-aligned in the word. You obtain that result with b >> (32-k).
So overall, you get:
a = x | y
= (a << k) | (b >> (32-k))
[Note: This approach is only valid for 1 <= k <= 31, so your code is actually incorrect.]
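One way to sidestep the restriction in the note above is to move whole 32-bit words first and only then shift by the remaining 0..31 bits, skipping the bit-level step when nothing remains. The following is a minimal sketch under a few assumptions: the words are std::uint32_t rather than unsigned int, a is the most significant word, and the name shiftl128_fixed is made up for illustration (it does not appear in the original answer).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: left-shifts the 128-bit value a:b:c:d
// (a = most significant word) by k bits, 0 <= k <= 128.
void shiftl128_fixed(std::uint32_t& a, std::uint32_t& b,
                     std::uint32_t& c, std::uint32_t& d, std::size_t k)
{
    assert(k <= 128);
    // Move whole words while at least 32 bits remain to be shifted.
    while (k >= 32) {
        a = b; b = c; c = d; d = 0;
        k -= 32;
    }
    // Nothing left: returning here avoids the undefined shift by 32
    // that (b >> (32 - k)) would perform for k == 0.
    if (k == 0) return;
    // Now 1 <= k <= 31, so every shift count below is in range.
    a = (a << k) | (b >> (32 - k));
    b = (b << k) | (c >> (32 - k));
    c = (c << k) | (d >> (32 - k));
    d = d << k;
}
```

The right shift can be handled the same way, with the word moves and the carry direction mirrored.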

When the bits of a get shifted to the left, something has to fill in the space left over on the right end. Since a and b are conceptually adjacent to each other, the void left by shifting the bits of a gets filled by the bits that are shifted off the end of b. The expression b >> (32-k) computes the bits that get shifted off of b.

You need to remember that it is acceptable, in shifting, to "lose" data.
The simplest way to understand shifting is to imagine a window. For example, let us work on bytes. You can view a byte as:
0 0 0 0 0 0 0 0 a b c d e f g h 0 0 0 0 0 0 0 0
                [      B      ]
Now, shifting is just about moving this window:
0 0 0 0 0 0 0 0 a b c d e f g h 0 0 0 0 0 0 0 0
[   B >> 8    ]
  [   B >> 7    ]
    [   B >> 6    ]
      [   B >> 5    ]
0 0 0 0 0 0 0 0 a b c d e f g h 0 0 0 0 0 0 0 0
        [   B >> 4    ]
          [   B >> 3    ]
            [   B >> 2    ]
              [   B >> 1    ]
0 0 0 0 0 0 0 0 a b c d e f g h 0 0 0 0 0 0 0 0
                  [   B << 1    ]
                    [   B << 2    ]
                      [   B << 3    ]
                        [   B << 4    ]
0 0 0 0 0 0 0 0 a b c d e f g h 0 0 0 0 0 0 0 0
                          [   B << 5    ]
                            [   B << 6    ]
                              [   B << 7    ]
                                [   B << 8    ]
0 0 0 0 0 0 0 0 a b c d e f g h 0 0 0 0 0 0 0 0
If you look at the direction of the arrows, you can think of it as having a fixed window and a moving content... just like your fancy mobile phone touch screen!
So, what is happening in the expression a = (a << k) | (b >> (32-k)) ?
a << k selects the 32 - k rightmost bits of a and moves them toward the left, creating a space of k zeros on the right side
b >> (32-k) selects the k leftmost bits of b and moves them toward the right, creating a space of 32 - k zeros on the left side
the two are merged together
Getting back to byte-sized values:
Suppose that a is [a7, a6, a5, a4, a3, a2, a1, a0]
Suppose that b is [b7, b6, b5, b4, b3, b2, b1, b0]
Suppose that k is 3
The number represented is:
// before
a7 a6 a5 a4 a3 a2 a1 a0 b7 b6 b5 b4 b3 b2 b1 b0
[          a          ]
                        [          b          ]
// after (or so we would like)
a7 a6 a5 a4 a3 a2 a1 a0 b7 b6 b5 b4 b3 b2 b1 b0
         [          a          ]
                                 [          b          ]
So:
a << 3 does actually become a4 a3 a2 a1 a0 0 0 0
b >> (8 - 3) becomes 0 0 0 0 0 b7 b6 b5
combining with | we get a4 a3 a2 a1 a0 b7 b6 b5
rinse and repeat :)

Note that in the else case k is guaranteed to be 32 or less. So each part of your larger number can actually be shifted by k bits. However, shifting it either left or right makes the k higher/lower bits 0. To shift the whole 128-bit number you need to fill these k bits with the bits "shifted out" of the neighboring number.
In the case of a left shift by k, the k lower bits of the higher number need to be filled with the k upper bits of the lower number. To get these upper k bits, we shift that (32-bit) number right by 32-k bits; the bits are then in the right position to fill in the k zero bits of the higher number.
BTW: the code above assumes that an unsigned int is exactly 32 bits. That is not portable.

To simplify, consider a 16-bit unsigned short, where we store the high and low bytes as unsigned char h, l respectively.
To simplify further, let's just shift it left by one bit, to see how that goes.
I'm writing it out as 16 consecutive bits, since that's what we're modelling:
[h7 h6 h5 h4 h3 h2 h1 h0 l7 l6 l5 l4 l3 l2 l1 l0]
so, [h, l] << 1 will be
[h6 h5 h4 h3 h2 h1 h0 l7 l6 l5 l4 l3 l2 l1 l0 0]
(the top bit, h7, has been shifted off the top and discarded, and the low bit is filled with zero).
Now let's break that back up into h and l ...
[h, l] = [h6 h5 h4 h3 h2 h1 h0 l7 l6 l5 l4 l3 l2 l1 l0 0]
=> h = [h6 h5 h4 h3 h2 h1 h0 l7]
= (h << 1) | (l >> 7)
etc.
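The byte-level breakdown above can be checked against a plain 16-bit shift. This is only a sketch: shl1_bytes is a hypothetical name, and std::uint8_t stands in for the unsigned char halves h and l. The casts matter because the operands of << and >> are promoted to int, so the bit that leaves h has to be dropped explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: shifts the 16-bit value [h, l] left by one bit
// while only ever touching the two bytes.
void shl1_bytes(std::uint8_t& h, std::uint8_t& l)
{
    // h keeps its low 7 bits (moved up by one) and receives l7 in bit 0.
    h = static_cast<std::uint8_t>((h << 1) | (l >> 7));
    // l simply shifts; its new low bit is zero.
    l = static_cast<std::uint8_t>(l << 1);
}
```

For every 16-bit input, reassembling h and l afterwards should give the same result as shifting the 16-bit value directly and truncating to 16 bits.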

My variant for a logical left shift of a 128-bit number in a little-endian environment (component[0] is the least significant word):
typedef struct { unsigned int component[4]; } vector4;
vector4 shift_left_logical_128bit_le(vector4 input,unsigned int numbits) {
    vector4 result;
    if(numbits>=128) {
        result.component[0]=0;
        result.component[1]=0;
        result.component[2]=0;
        result.component[3]=0;
        return result;
    }
    result=input;
    // move whole words first, starting at the top so that no word
    // is overwritten before it has been read
    while(numbits>=32) {
        numbits-=32;
        result.component[3]=result.component[2];
        result.component[2]=result.component[1];
        result.component[1]=result.component[0];
        result.component[0]=0;
    }
    // numbits is now 0..31; the 64-bit temporary yields the bits carried
    // into the next word and is well defined even when numbits is 0
    unsigned long long temp;
    result.component[3]<<=numbits;
    temp=(unsigned long long)result.component[2];
    temp=(temp<<numbits)>>32;
    result.component[3]|=(unsigned int)temp;
    result.component[2]<<=numbits;
    temp=(unsigned long long)result.component[1];
    temp=(temp<<numbits)>>32;
    result.component[2]|=(unsigned int)temp;
    result.component[1]<<=numbits;
    temp=(unsigned long long)result.component[0];
    temp=(temp<<numbits)>>32;
    result.component[1]|=(unsigned int)temp;
    result.component[0]<<=numbits;
    return result;
}

Related

A many-to-one mapping in the natural domain using discrete input variables?

I would like to find a mapping f:X --> N, with multiple discrete natural variables X of varying dimension, where f produces a unique number between 0 and the product of all dimensions minus one. For example, assume X = {a,b,c}, with dimensions |a| = 2, |b| = 3, |c| = 2. f should produce 0 to 11 (2*3*2 = 12 distinct values).
a b c | f(X)
0 0 0 | 0
0 0 1 | 1
0 1 0 | 2
0 1 1 | 3
0 2 0 | 4
0 2 1 | 5
1 0 0 | 6
1 0 1 | 7
1 1 0 | 8
1 1 1 | 9
1 2 0 | 10
1 2 1 | 11
This is easy when all dimensions are equal. Assume binary for example:
f(a=1,b=0,c=1) = 1*2^2 + 0*2^1 + 1*2^0 = 5
Using this naively with varying dimensions we would get overlapping values:
f(a=0,b=1,c=1) = 0*2^2 + 1*3^1 + 1*2^0 = 4
f(a=1,b=0,c=0) = 1*2^2 + 0*3^1 + 0*2^0 = 4
A computationally fast function is preferred as I intend to use/implement it in C++. Any help is appreciated!
OK, the most important part here is the math and the algorithm. You have variable dimensions of size (from least significant to most significant) d0, d1, ... , dn. A tuple (x0, x1, ... , xn) with xi < di will represent the following number: x0 + d0 * x1 + ... + d0 * d1 * ... * dn-1 * xn
In pseudo-code, I would write:
result = 0
loop for i=n to 0 step -1
result = result * d[i] + x[i]
To implement it in C++, my advice would be to create a class whose constructor takes the dimensions (for example as a vector<int>), and a method that accepts an array or a vector of the same size containing the values. Optionally, you could check that no input value is greater than or equal to its dimension.
A possible C++ implementation could be:
#include <stdexcept>
#include <vector>
using std::vector;
class F {
    vector<int> dims;
public:
    F(vector<int> d) : dims(d) {}
    int to_int(const vector<int>& x) const {
        if (x.size() != dims.size()) {
            throw std::invalid_argument("Wrong size");
        }
        int result = 0;
        // Horner scheme, from the most significant dimension down to the least
        for (int i = static_cast<int>(dims.size()) - 1; i >= 0; i--) {
            if (x[i] >= dims[i]) {
                throw std::invalid_argument("Value >= dimension");
            }
            result = result * dims[i] + x[i];
        }
        return result;
    }
};
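To see the scheme reproduce the table from the question, here is a small free-function sketch. It is an illustration rather than part of the answer: mixed_radix_index is a made-up name, and unlike the class above it treats x[0] as the most significant variable, so that the tuple order (a, b, c) matches the table columns.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: maps a tuple x (x[0] most significant) with
// 0 <= x[i] < dims[i] to a unique index in 0 .. product(dims) - 1.
int mixed_radix_index(const std::vector<int>& dims, const std::vector<int>& x)
{
    if (x.size() != dims.size()) {
        throw std::invalid_argument("Wrong size");
    }
    int result = 0;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        if (x[i] < 0 || x[i] >= dims[i]) {
            throw std::invalid_argument("Value out of range");
        }
        // Horner step: make room for dims[i] values, then add this digit.
        result = result * dims[i] + x[i];
    }
    return result;
}
```

With dims = {2, 3, 2}, the tuple (0, 1, 1) maps to 3 and (1, 0, 0) maps to 6, so the two inputs that collided under the naive formula now get distinct values.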

Lexicographical comparison of integers

I want to compare two small (<=20) sets of integers (1..20) lexicographically.
The sets are represented by single integers, e.g.
1, 2, 4, 6
will be represented as
... 0 1 0 1 0 1 1
(... 7 6 5 4 3 2 1)
So where there's a 1 the number is present in the set.
Could someone verify if this code is correct?
bool less_than(unsigned a, unsigned b) {
unsigned tmp = a ^ b;
tmp = tmp & (~tmp + 1); //first difference isolated
return (tmp & a) && (__builtin_clz(b) < __builtin_clz(tmp));
}
The __builtin_clz part is for the case when b is a prefix of a.
The case of an empty set is handled elsewhere (__builtin_clz is undefined for 0).
EDIT:
bool less_than(unsigned a, unsigned b) {
unsigned tmp = a ^ b;
tmp &= -tmp; //first difference isolated
return ((tmp & a) && (__builtin_clz(b) < __builtin_clz(tmp)))
|| (__builtin_clz(a) > __builtin_clz(tmp));
}
and
bool less_than_better(unsigned a, unsigned b) {
unsigned tmp = a ^ b;
tmp &= -tmp; //first difference isolated
return ((tmp & a) && tmp < b) || tmp > a;
}
appear to be both correct.
(Tested versus a naive implementation using std::lexicographical_compare on tens of millions of randomized tests)
The second one is more portable though since it doesn't use __builtin_clz.
The difference in speed on my machine is negligible (the second one being ~2% faster), however on machines without __builtin_clz as one processor instruction (e.g. BSR on x86) the difference will probably be huge.
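For reference, a check of the kind described above could look roughly like the sketch below. It is exhaustive over small sets instead of randomized, and the helper names (elements_of, less_than_naive, agree_on_all_sets) are made up for this illustration. The line tmp &= 0u - tmp is the same operation as tmp &= -tmp, written so that compilers which warn about negating an unsigned value stay quiet.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// The second variant from the question, repeated so the sketch is self-contained.
bool less_than_better(unsigned a, unsigned b)
{
    unsigned tmp = a ^ b;
    tmp &= 0u - tmp;  // isolate the lowest differing bit
    return ((tmp & a) && tmp < b) || tmp > a;
}

// Hypothetical reference: expand a mask into its sorted elements (bit i -> i + 1).
std::vector<int> elements_of(unsigned mask)
{
    std::vector<int> out;
    for (int i = 0; i < 20; ++i)
        if (mask & (1u << i)) out.push_back(i + 1);
    return out;
}

// Naive comparison via the standard library.
bool less_than_naive(unsigned a, unsigned b)
{
    const std::vector<int> x = elements_of(a), y = elements_of(b);
    return std::lexicographical_compare(x.begin(), x.end(), y.begin(), y.end());
}

// True if both implementations agree on every pair of subsets of 1..bits.
bool agree_on_all_sets(int bits)
{
    for (unsigned a = 0; a < (1u << bits); ++a)
        for (unsigned b = 0; b < (1u << bits); ++b)
            if (less_than_better(a, b) != less_than_naive(a, b)) return false;
    return true;
}
```

Calling agree_on_all_sets(8) covers all 65536 pairs of subsets of 1..8, including the empty set and the prefix cases.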
It's not correct in the case that a == 0. This should return true unless b == 0, but since tmp & a will be false regardless of the value of tmp (which will be the lowest-order 1-bit in b), the function will return false.
a should be "less than" b if:
1. `a` is a proper prefix of `b`, or
2. The lowest-order bit of `a^b` is in `a`.
The first condition also handles the case where a is the empty set and b is not. (This is slightly different from your formulation, which is "(the lowest-order bit of a^b is in a) and not (b is a proper prefix of a)".)
A simple test of the case "a is a proper prefix of b", given the fact that we have the lowest-order bit of a^b in tmp, is tmp > a. That avoids the use of __builtin_clz [Note 1].
Also, you could write
tmp = tmp & (~tmp + 1);
as
tmp &= -tmp;
but I think that most C compilers will find that optimization on their own. [Note 2].
Applying those optimizations, the result would be (untested):
bool less_than(unsigned a, unsigned b) {
unsigned tmp = a ^ b;
tmp &= -tmp; //first difference isolated
return tmp > a || tmp & a;
}
Notes
1. This is worth doing because (1) even though __builtin_clz is builtin, it is not necessarily super-fast; and (2) it may not be present if you're compiling with a compiler other than gcc or clang.
2. -tmp is guaranteed to be the 2s-complement negative of tmp if tmp is an unsigned type, even if the underlying implementation is not 2s-complement. See §6.2.6.2/1 (the range of an unsigned type is 0..2^N-1 for some integer N) and §6.3.1.3/2 (a negative value is converted to an unsigned integer type by repeatedly adding 2^N until the value is in range).
Here's a listing calculating all combinations for 2-bit inputs:
#include <stdio.h>
bool less_than(unsigned a, unsigned b) {
unsigned tmp = a ^ b;
tmp = tmp & (~tmp + 1); //first difference isolated
return (tmp & a) && (__builtin_clz(b) < __builtin_clz(tmp));
}
#define BITPATTERN "%d%d%d"
#define BYTETOBITS(byte) \
(byte & 0x04 ? 1 : 0), \
(byte & 0x02 ? 1 : 0), \
(byte & 0x01 ? 1 : 0)
int main(int argc, char** argv) {
for ( int a = 0; a < 4; a ++ )
for ( int b = 0; b < 4; b ++)
printf("a: " BITPATTERN " b: " BITPATTERN ": %d\n",
BYTETOBITS(a), BYTETOBITS(b), less_than(a,b)
);
}
And here's the output:
a: 000 b: 000: 0
a: 000 b: 001: 0
a: 000 b: 010: 0
a: 000 b: 011: 0
a: 001 b: 000: 0
a: 001 b: 001: 0
a: 001 b: 010: 1
a: 001 b: 011: 0
a: 010 b: 000: 0
a: 010 b: 001: 0
a: 010 b: 010: 0
a: 010 b: 011: 0
a: 011 b: 000: 0
a: 011 b: 001: 0
a: 011 b: 010: 1
a: 011 b: 011: 0
That doesn't look correct.

What's the reverse function of x XOR (x/2)?

What's the reverse function of x XOR (x/2)?
Is there a system of rules for equation solving, similar to algebra, but with logic operators?
Suppose we have a number x of N bits. You could write this as:
b(N-1) b(N-2) b(N-3) ... b(0)
where b(i) is bit number i in the number (where 0 is the least significant bit).
x / 2 is the same as x shifted right by 1 bit. Let's assume unsigned numbers. So:
x / 2 = 0 b(N-1) b(N-2) ... b(1)
Now we XOR x with x / 2:
x ^ (x / 2) = b(N-1)^0 b(N-2)^b(N-1) b(N-3)^b(N-2) ... b(0)^b(1)
Note that the leftmost bit (the most significant bit) of this is b(N-1)^0 which is b(N-1). In other words, you can get bit b(N-1) from the result immediately. When you have this bit, you can calculate b(N-2) because the second bit of the result is b(N-2)^b(N-1) and you already know b(N-1). And so on, you can compute all bits b(N-1) to b(0) of the original number x.
I can give you an algorithm in bits:
Assuming you have an array of n bits, with index 0 being the most significant bit:
b = [b0 .. b(n-1)] // each b[i] is 0 or 1
The original array is:
x0 = b0
x1 = b1 ^ x0
x2 = b2 ^ x1
or in general, for i >= 1,
x[i] = b[i] ^ x[i-1]
Assume Y = X ^ (X / 2)
If you want to find X, do this
X = 0
do
X ^= Y
Y /= 2
while Y != 0
I hope it helps!
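As a rough C++ rendering of the loop above, the sketch below pairs the forward transform with its inverse so that the two can be checked against each other. The names encode and decode are placeholders, and 32-bit unsigned values are an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Forward transform from the question: y = x XOR (x / 2).
std::uint32_t encode(std::uint32_t x)
{
    return x ^ (x >> 1);
}

// Inverse: XOR together y, y/2, y/4, ... until nothing is left.
// Each shifted copy cancels the extra term left by the previous one.
std::uint32_t decode(std::uint32_t y)
{
    std::uint32_t x = 0;
    while (y != 0) {
        x ^= y;
        y >>= 1;
    }
    return x;
}
```

The loop runs once per significant bit of y, so at most 32 times for a 32-bit input.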
I know it's an old topic, but I stumbled upon the same question, and I found out a little trick. If you have n bits, instead of requiring n bits operations (like the answer by Jesper), you can do it with log2(n) number operations :
Suppose that y is equal to x XOR (x/2) at the beginning of the program, you can do the following C program :
INPUT : y
int i, x;
x = y;
for (i = 1; i < n; i <<= 1)
x ^= x >> i;
OUTPUT : x
and here you have the solution.
">>" is the right bit shift operation. For example the number 13, 1101 in binary, if shifted by 1 on the right, will become 110 in binary, thus 13 >> 1 = 6. x >> i is equivalent to x / 2^i (division in the integers, of course)
"<<" is the left bit shift operation (i <<= 1 is equivalent to i *= 2)
Why does it work ? Let's take as example n = 5 bits, and start with y = b4 b3 b2 b1 b0 (in binary : in the following x is written in binary also, but i is written in decimal)
Initialisation :
x = b4 b3 b2 b1 b0
First step : i = 1
x >> 1 = b4 b3 b2 b1 so we have
x = b4 b3 b2 b1 b0 XOR b4 b3 b2 b1 = b4 (b3^b4) (b2^b3) (b1^b2) (b0^b1)
Second step : i = 2
x >> 2 = b4 (b3^b4) (b2^b3) so we have
x = b4 (b3^b4) (b2^b3) (b1^b2) (b0^b1) XOR b4 (b3^b4) (b2^b3) = b4 (b3^b4) (b2^b3^b4) (b1^b2^b3^b4) (b0^b1^b2^b3)
Third step : i = 4
x >> 4 = b4 so we have
x = b4 (b3^b4) (b2^b3^b4) (b1^b2^b3^b4) (b0^b1^b2^b3) XOR b4 = b4 (b3^b4) (b2^b3^b4) (b1^b2^b3^b4) (b0^b1^b2^b3^b4)
Then i = 8, which is more than 5, we exit the loop.
And we have the desired output.
The loop has log2(n) iterations because i starts at 1 and is multiplied by 2 at each step, so for i to reach n, we have to do it log2(n) times.
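Written as a function for a fixed width, the doubling trick might look like the sketch below. It assumes n = 32 bits and unsigned values; decode_log and encode are placeholder names, and encode is included only so that the round trip can be checked.

```cpp
#include <cassert>
#include <cstdint>

// Forward transform: y = x XOR (x / 2).
std::uint32_t encode(std::uint32_t x)
{
    return x ^ (x >> 1);
}

// Inverse in log2(32) = 5 steps: the shift distance doubles each time
// (1, 2, 4, 8, 16), so after the last step every bit holds the XOR of
// itself and all higher bits of y.
std::uint32_t decode_log(std::uint32_t y)
{
    std::uint32_t x = y;
    for (unsigned i = 1; i < 32; i <<= 1) {
        x ^= x >> i;
    }
    return x;
}
```

Using an unsigned type keeps the right shift logical; with a signed int and a set top bit, what gets shifted in from the left would depend on the implementation.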

What is this "if(k.c[3] & c)" part of the code doing?

#include <iostream>
using namespace std;
int main()
{
unsigned char c,i;
union temp
{
float f;
char c[4];
} k;
cin>>k.f;
c=128;
for(i=0;i<8;i++)
{
if(k.c[3] & c) cout<<'1';
else cout<<'0';
c=c>>1;
}
c=128;
cout<<'\n';
for(i=0;i<8;i++)
{
if(k.c[2] & c) cout<<'1';
else cout<<'0';
c=c>>1;
}
return 0;
}
if(k.c[2] & c)
That is called bitwise AND.
Illustration of bitwise AND
//illustration : mathematics of bitwise AND
a = 10110101 (binary representation)
b = 10011010 (binary representation)
c = a & b
= 10110101 & 10011010
= 10010000 (binary representation)
= 128 + 16 (decimal)
= 144 (decimal)
Bitwise AND uses this truth table:
X | Y | R = X & Y
---------
0 | 0 | 0
0 | 1 | 0
1 | 0 | 0
1 | 1 | 1
See these tutorials on bitwise AND:
Bitwise Operators in C and C++: A Tutorial
Bitwise AND operator &
A bitwise operation (AND in this case) performs a bit-by-bit operation between the 2 operands.
For example the & :
11010010 &
11000110 =
11000010
Bitwise Operation in your code
c = 128 therefore the binary representation is
c = 10000000
a & c ANDs every i-th bit of c with the i-th bit of a. Because c has a 1 only in the MSB position (position 7), a & c is non-zero if a has a 1 in its position 7 bit; if a has a 0 in that position, a & c is zero. This logic is used in the if block above: the if block is entered depending on whether the MSB (position 7 bit) of the byte is 1 or not.
Suppose a = ? ? ? ? ? ? ? ? where a ? is either 0 or 1
Then
a = ? ? ? ? ? ? ? ?
AND & & & & & & & &
c = 1 0 0 0 0 0 0 0
---------------
? 0 0 0 0 0 0 0
As 0 & ? = 0, only position 7 can survive: if bit 7 of a is 0 the result is 0, and if bit 7 of a is 1 the result is non-zero (128).
In each iteration c is shifted right one position, so the 1 in c moves toward the low-order end. Masking the other variable with c in each iteration tells you whether there is a 1 or a 0 at that position of the variable.
Use in your code
You have
union temp
{
float f;
char c[4];
} k;
Inside the union the float and the char c[4] share the same memory location (as the property of union).
Now, sizeof(f) is 4 bytes. You assign k.f = 5345341 or whatever. When you access k.c[0] you access the 0th byte of the float f, and k.c[1] accesses the 1st byte of the float f. The array is not empty, because the float and the array occupy the same memory location but are accessed differently. This is a mechanism to access the 4 bytes of the float bytewise.
NOTE THAT k.c[0] may address the last byte instead of the 1st byte (as told above); this depends on the byte ordering of storage in memory (see little endian and big endian byte ordering for this).
Union k
+--------+--------+--------+--------+ --+
|  c[0]  |  c[1]  |  c[2]  |  c[3]  |   |
+--------+--------+--------+--------+   |---> Shares same location (in little endian)
|              float f              |   |
+-----------------------------------+ --+
Or the byte ordering could be reversed
Union k
+--------+--------+--------+--------+ --+
|  c[3]  |  c[2]  |  c[1]  |  c[0]  |   |
+--------+--------+--------+--------+   |---> Shares same location (in big endian)
|              float f              |   |
+-----------------------------------+ --+
Your code loops over a byte and shifts c, which moves its single 1 from bit 7 down to bit 0 one step at a time. The bitwise AND therefore tests every bit position of that byte of the float variable f, and the code prints 1 if the bit is set and 0 otherwise.
If you print all the 4 bytes of the float, then you can see the IEEE 754 representation.
c has a single bit set: 128 is 10000000 in binary. if(k.c[2] & c) checks whether that bit is set in k.c[2] as well. Then the bit in c is shifted along to check the other bits.
As a result, the program appears to display the binary representation of a float.
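For comparison, the same idea can be written without a union by copying the bytes of the float into an integer and walking a one-bit mask from the top bit down. This is a sketch under stated assumptions: float is 32 bits wide and stored in IEEE 754 format with the same byte order as integers, and float_bits is a made-up name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: returns the 32 bits of f, most significant first.
std::string float_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // copy the raw bytes of the float
    std::string out;
    // Same masking idea as c = 128; c = c >> 1, but over all 32 bits.
    for (std::uint32_t mask = 0x80000000u; mask != 0; mask >>= 1) {
        out += (u & mask) ? '1' : '0';
    }
    return out;
}
```

On such a platform, float_bits(1.0f) yields a 0 sign bit, the exponent bits 01111111, and 23 zero mantissa bits.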

Direct formula for summing XOR

I have to XOR numbers from 1 to N, does there exist a direct formula for it ?
For example if N = 6 then 1^2^3^4^5^6 = 7 I want to do it without using any loop so I need an O(1) formula (if any)
Your formula is N & (N % 2 ? 0 : ~0) | ( ((N & 2)>>1) ^ (N & 1) ):
#include <iostream>
int main()
{
    int S = 0;
    for (int N = 0; N <= 50; ++N) {
        S = (S^N); // running XOR of 0..N, the reference value
        int check = N & (N % 2 ? 0 : ~0) | ( ((N & 2)>>1) ^ (N & 1) );
        std::cout << "N = " << N << ": " << S << ", " << check << std::endl;
        if (check != S) throw;
    }
    return 0;
}
Output:
N = 0: 0, 0 N = 1: 1, 1 N = 2: 3, 3
N = 3: 0, 0 N = 4: 4, 4 N = 5: 1, 1
N = 6: 7, 7 N = 7: 0, 0 N = 8: 8, 8
N = 9: 1, 1 N = 10: 11, 11 N = 11: 0, 0
N = 12: 12, 12 N = 13: 1, 1 N = 14: 15, 15
N = 15: 0, 0 N = 16: 16, 16 N = 17: 1, 1
N = 18: 19, 19 N = 19: 0, 0 N = 20: 20, 20
N = 21: 1, 1 N = 22: 23, 23 N = 23: 0, 0
N = 24: 24, 24 N = 25: 1, 1 N = 26: 27, 27
N = 27: 0, 0 N = 28: 28, 28 N = 29: 1, 1
N = 30: 31, 31 N = 31: 0, 0 N = 32: 32, 32
N = 33: 1, 1 N = 34: 35, 35 N = 35: 0, 0
N = 36: 36, 36 N = 37: 1, 1 N = 38: 39, 39
N = 39: 0, 0 N = 40: 40, 40 N = 41: 1, 1
N = 42: 43, 43 N = 43: 0, 0 N = 44: 44, 44
N = 45: 1, 1 N = 46: 47, 47 N = 47: 0, 0
N = 48: 48, 48 N = 49: 1, 1 N = 50: 51, 51
Explanation:
Low bit is XOR between low bit and next bit.
For each bit except low bit the following holds:
if N is odd then that bit is 0.
if N is even then that bit is equal to the corresponding bit of N.
Thus for the case of odd N the result is always 0 or 1.
edit
GSerg has posted a formula without loops, but deleted it for some reason (undeleted now). The formula is perfectly valid (apart from a little mistake). Here's the C++-like version.
if (n % 2 == 1) {
    result = (n % 4 == 1) ? 1 : 0;
} else {
    result = (n % 4 == 0) ? n : n + 1;
}
One can prove it by induction, checking all remainders of division by 4. Although I have no idea how you would come up with it without generating the output and spotting the regularity.
Please explain your approach a bit more.
Since each bit is independent in xor operation, you can calculate them separately.
Also, if you look at k-th bit of number 0..n, it'll form a pattern. E.g., numbers from 0 to 7 in binary form.
000
001
010
011
100
101
110
111
You see that for k-th bit (k starts from 0), there're 2^k zeroes, 2^k ones, then 2^k zeroes again, etc.
Therefore, you can for each bit calculate how many ones there are without actually going through all numbers from 1 to n.
E.g., for k = 2, there're repeated blocks of 2^2 == 4 zeroes and ones. Then,
int ones = (n / 8) * 4; // full blocks
if (n % 8 >= 4) { // consider incomplete blocks in the end
ones += n % 8 - 3;
}
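Generalizing the k = 2 snippet to any bit position gives something like the sketch below. The helper names are made up, and counting over 0..n (that is, n + 1 numbers) instead of using n % 8 - 3 is a rewrite of the same arithmetic, not the answer's exact code. Bit k of the final XOR is set exactly when the count of ones in that bit position is odd.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many values in 0..n have bit k set (k < 32).
std::uint64_t ones_in_bit(std::uint32_t n, unsigned k)
{
    const std::uint64_t run = std::uint64_t{1} << k;   // length of each run of zeros or ones
    const std::uint64_t total = std::uint64_t{n} + 1;  // how many numbers 0..n there are
    std::uint64_t ones = (total / (2 * run)) * run;    // full zero/one periods
    const std::uint64_t rest = total % (2 * run);      // incomplete period at the end
    if (rest > run) {
        ones += rest - run;                            // the part that reaches into the ones
    }
    return ones;
}

// XOR of 1..n assembled bit by bit from the parity of each count.
std::uint32_t xor_upto_by_bits(std::uint32_t n)
{
    std::uint32_t result = 0;
    for (unsigned k = 0; k < 32; ++k) {
        if (ones_in_bit(n, k) & 1) {
            result |= std::uint32_t{1} << k;
        }
    }
    return result;
}
```

This takes 32 steps regardless of n, so it is constant time for a fixed word size, though slower than the closed forms in the other answers.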
For odd N, the result is either 1 or 0 (cyclic, 0 for N=3, 1 for N=5, 0 for N=7 etc.)
For even N, the result is either N or N+1 (cyclic, N+1 for N=2, N for N=4, N+1 for N=6, N for N=8 etc).
Pseudocode:
if (N mod 2) = 0
if (N mod 4) = 0 then r = N else r = N+1
else
if (N mod 4) = 1 then r = 1 else r = 0
Let's say the function that XORs all the values from 1 to N is XOR(N), then
XOR(1) = 000 1 = 0 1 ( The 0 is the dec of bin 000)
XOR(2) = 001 1 = 1 1
XOR(3) = 000 0 = 0 0
XOR(4) = 010 0 = 2 0
XOR(5) = 000 1 = 0 1
XOR(6) = 011 1 = 3 1
XOR(7) = 000 0 = 0 0
XOR(8) = 100 0 = 4 0
XOR(9) = 000 1 = 0 1
XOR(10)= 101 1 = 5 1
XOR(11)= 000 0 = 0 0
XOR(12)= 110 0 = 6 0
I hope you can see the pattern. It should be similar for other numbers too.
Try this:
the LSB gets toggled each time the N is odd, so we can say that
rez & 1 == (N & 1) ^ ((N >> 1) & 1)
The same pattern can be observed for the rest of the bits.
Each time the bits B and B+1 (starting from LSB) in N will be different, bit B in the result should be set.
So, the final result will be (including N): rez = N ^ (N >> 1)
EDIT: sorry, it was wrong. the correct answer:
for odd N: rez = (N ^ (N >> 1)) & 1
for even N: rez = (N & ~1) | ((N ^ (N >> 1)) & 1)
Great answer by Alexey Malistov! A variation of his formula: n & 1 ? (n & 2) >> 1 ^ 1 : n | (n & 2) >> 1 or equivalently n & 1 ? !(n & 2) : n | (n & 2) >> 1.
This method avoids using conditionals: F(N)=(N&((N&1)-1))|((N&1)^((N&3)>>1))
F(N)= (N&(b0-1)) | (b0^b1)
If you look at the XOR of the first few numbers you get:
N | F(N)
------+------
0001 | 0001
0010 | 0011
0011 | 0000
0100 | 0100
0101 | 0001
0110 | 0111
0111 | 0000
1000 | 1000
1001 | 0001
Hopefully you notice the pattern:
if N mod 4 = 1 then F(N)=1
if N mod 4 = 3 then F(N)=0
if N mod 4 = 0 then F(N)=N
if N mod 4 = 2 then F(N)=N but with the first bit as 1, so N|1
The tricky part is getting this in one statement without conditionals. I'll explain the logic I used to do this.
Take the two least significant bits of N and call them b0 and b1. They are obtained with:
b0 = N&1
b1 = (N&3)>>1
Notice that if b0 == 1 we have to 0 all of the bits, but if it isn't all of the bits except for the first bit stay the same. We can do this behavior by:
N & (b0-1) : this works because of 2's complement, -1 is equal to a number with all bits set to 1 and 1-1=0 so when b0=1 this results in F(N)=0.. so that is the first part of the function:
F(N)= (N&(b0-1))...
Now this will work for N mod 4 == 3 and 0. For the other 2 cases, let's look solely at b1, b0 and F(N)0 (bit 0 of F(N)):
b0|b1|F(N)0
--+--+-----
1| 1| 0
0| 0| 0
1| 0| 1
0| 1| 1
OK, hopefully this truth table looks familiar! It is b0 XOR b1 (b1^b0). So now that we know how to get the last bit, let's put that into our function:
F(N)=(N&(b0-1))|b0^b1
And there you go, a function without conditionals. This is also useful if you want to compute the XOR of the positive numbers from a to b inclusive. You can do:
F(a-1) XOR F(b).
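Put together as code, the branch-free formula and the range trick might look like this sketch. The names xor_upto and xor_range are illustrative, and unsigned arithmetic is assumed so that b0 - 1 wraps to all ones when b0 is 0. For an inclusive range a..b, the prefix to cancel is 1..a-1, because x ^ x = 0 removes every term the two prefixes share.

```cpp
#include <cassert>

// F(N): XOR of 1..N without conditionals.
unsigned xor_upto(unsigned n)
{
    const unsigned b0 = n & 1u;          // lowest bit of n
    const unsigned b1 = (n & 3u) >> 1;   // second-lowest bit of n
    // (b0 - 1u) is all ones for even n and zero for odd n.
    return (n & (b0 - 1u)) | (b0 ^ b1);
}

// XOR of a..b inclusive, for 1 <= a <= b: the terms 1..a-1 cancel out.
unsigned xor_range(unsigned a, unsigned b)
{
    return xor_upto(a - 1u) ^ xor_upto(b);
}
```

For example, xor_range(3, 6) is 3^4^5^6 = 4, which equals F(2) ^ F(6) = 3 ^ 7.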
With minimum change to the original logic:
int result = 0;
for (int i = 1; i <= N; i++) {
    result ^= i;
}
We can have:
int result = 0;
for (int i = N - (N % 4); i <= N; i++) {
    result ^= i;
}
It does have a loop but it would take a constant time to execute. The number of times we iterate through the for-loop would vary between 1 and 4.
How about this?
!(n&1)*n+(n%4&&n%4<3)
This works fine without any issues for any n;
unsigned int xorn(unsigned int n)
{
if (n % 4 == 0)
return n;
else if(n % 4 == 1)
return 1;
else if(n % 4 == 2)
return n+1;
else
return 0;
}
Take a look at this. This will solve your problem.
https://stackoverflow.com/a/10670524/4973570
To calculate the XOR sum from 1 to N:
int ans,mod=N%4;
if(mod==0) ans=N;
else if(mod==1) ans=1;
else if(mod==2) ans=N+1;
else if(mod==3) ans=0;
If someone still needs it, here is a simple Python solution (note that it returns the XOR of 1 to L-1):
def XorSum(L):
    res = 0
    if (L-1)%4 == 0:
        res = L-1
    elif (L-1)%4 == 1:
        res = 1
    elif (L-1)%4 == 2:
        res = (L-1)^1
    else: #3
        res = 0
    return res