I want to implement unsigned left rotation in my integer class. However, because it is a template class, its width can be anything from 128 bits upward, so I cannot use algorithms that require a temporary of the same size: if the type becomes big enough, a stack overflow can occur (especially if the function sits deep in a call chain).
To fix that problem I reduced it to a question: what steps do I have to take to rotate a 32-bit number using only 4 bits at a time? If you think about it, a 32-bit number contains 8 groups of 4 bits each, so if the number of bits to rotate is 4, a swap between groups 0 and 4, 1 and 5, 2 and 6, and 3 and 7 completes the rotation.
If the number of bits to rotate is less than 4 and greater than 0, it is simple: just preserve the last N bits and run a shift-or loop. For example, suppose we want to left-rotate the number 0x9CE2 by 3 bits. We do the following:
The number in binary is 1001 1100 1110 0010, with each nibble indexed from 0 to 3 from right to left. Call this number N and the number of bits in one group B.
[1] x <- N[3] >> (B - 3)
x <- N[3] >> (4 - 3)
x <- 1001 >> 1
x <- 0100
y <- N[0] >> (B - 3)
y <- N[0] >> (4 - 3)
y <- 0010 >> 1
y <- 0001
N[0] <- (N[0] << 3) | x
N[0] <- (0010 << 3) | 0100
N[0] <- 0000 | 0100
N[0] <- 0100
[2] x <- y
x <- 0001
y <- N[1] >> (B - 3)
y <- N[1] >> (4 - 3)
y <- 1110 >> 1
y <- 0111
N[1] <- (N[1] << 3) | x
N[1] <- (1110 << 3) | 0001
N[1] <- 0000 | 0001
N[1] <- 0001
[3] x <- y
x <- 0111
y <- N[2] >> (B - 3)
y <- N[2] >> (4 - 3)
y <- 1100 >> 1
y <- 0110
N[2] <- (N[2] << 3) | x
N[2] <- (1100 << 3) | 0111
N[2] <- 0000 | 0111
N[2] <- 0111
[4] x <- y
x <- 0110
y <- N[3] >> (B - 3)
y <- N[3] >> (4 - 3)
y <- 1001 >> 1
y <- 0100
N[3] <- (N[3] << 3) | x
N[3] <- (1001 << 3) | 0110
N[3] <- 1000 | 0110
N[3] <- 1110
The result is 1110 0111 0001 0100, or 0xE714 in hexadecimal, which is the right answer. If you apply the same scheme to a number of any precision, all you need is a single temporary variable whose type is the element type of the array forming that bignum.
Now the real problem is when the number of bits to rotate is bigger than one group, or bigger than half the size of the type (i.e. bigger than 4 bits or 8 bits in this example).
Usually, we shift bits from the last element to the first element and so on; but now, after shifting the last element into the first element, the result has to be relocated to a new place, because the number of bits to rotate is bigger than one element (i.e. > 4 bits). The start index where the shift begins is the last index (3 in this example), and for the destination index I use the equation dest_index = int(bits_count/half_bits) + 1, where half_bits is the number of bits in half the number (half_bits = 8 in this example). So if bits_count = 7 then dest_index = int(7/8) + 1 = 0 + 1 = 1, which means the result of the first shift must be relocated to destination index 1 -- and that is my problem, for I cannot think of a way to write an algorithm for this situation.
Thanks.
This will just be some hints for one way to accomplish this. You can think about making two passes.
first pass, rotate on 4 bit boundaries only
second pass, rotate on 1 bit boundaries
So, the top level pseudo code might look like:
rotate (unsigned bits) {
    bits %= 32; /* only have 32 bits */
    if (bits == 0) return;
    rotate_nibs(bits / 4);
    rotate_bits(bits % 4);
}
So, to rotate by 13 bits, you first rotate by 3 nibbles, then rotate by 1 bit to get your total of 13 bits of rotation.
You could avoid nibble rotation altogether if you treat your array of nibbles as a circular buffer. Then, a nibble rotation is just a matter of changing the start position in the array for the 0 index.
If you must do rotation, it can be tricky. If you are rotating an 8 item array and only want to use 1 item of storage overhead to do the rotation, then to rotate by 3 items, you might approach it like this:
orig: A B C D E F G H
step 1: A B C A E F G H rem: D
2: A B C A E F D H rem: G
3: A G C A E F D H rem: B
4: A G C A B F D H rem: E
5: A G C A B F D E rem: H
6: A G H A B F D E rem: C
7: A G H A B C D E rem: F
8: F G H A B C D E done
But, if you tried the same technique with 2, 4, or 6 item rotations, the cycle does not run through the whole array. So, you have to be aware of whether the rotation count and the array size have a common divisor, and make the algorithm account for that. If you step through the 6-step rotation, some more clues fall out.
orig: A B C D E F G H
A
G
E
C cycled back to A's position
B
H
F
D done
Notice that GCD(6, 8) is 2, which means we should expect 2 cycles of 4 iterations each. Then, the rotation algorithm for an N item array could look like:
rotate (n) {
    G = GCD(n, N)
    for (i = 0; i < G; ++i) {
        p = arr[i];
        for (j = 1; j < N/G; ++j) {
            swap(p, arr[(i + j*n) % N]);
        }
        arr[i] = p;
    }
}
There is an optimization you can do to avoid the swap per iteration, that I'll leave as an exercise.
I suggest calling an assembly language function for the bit rotation.
Many assembly languages have better facilities for rotating bits through carry and rotating carry through bits.
Often the assembly version is less complex than the equivalent C or C++ function.
The drawback is that you will need one instance of each assembly function for each different platform.
Is there a way to derive the nth 4-bit Gray code from the (n-1)th Gray code using bit operations on the (n-1)th Gray code?
For example the 4th Gray code is 0010. Now I want to get the 5th Gray Code, 0110, by doing bit operations on 0010.
Perhaps it's "cheating" but you can just pack a lookup table into a 64-bit constant value, like this:
0000 0 -> 1
0001 1 -> 3
0011 3 -> 2
0010 2 -> 6
0110 6 -> 7
0111 7 -> 5
0101 5 -> 4
0100 4 -> C
1100 C -> D
1101 D -> F
1111 F -> E
1110 E -> A
1010 A -> B
1011 B -> 9
1001 9 -> 8
1000 8 -> 0
FEDCBA9876543210 nybble order (current Gray code)
| |
V V
EAFD9B80574C2631 next Gray code
Then you can use shifts and masks to perform a lookup (depending on your language):
int next_gray_code(int code)
{
    return (0xEAFD9B80574C2631ULL >> (code << 2)) & 15;
}
Alternatively, you can use the formula for converting from Gray to binary, increment the value, and then convert from binary to Gray, which is just n xor (n / 2):
int next_gray_code(int code)
{
    code = code ^ (code >> 2);
    code = code ^ (code >> 1);
    code = (code + 1) & 15;
    return code ^ (code >> 1);
}
What about the following?
t1 := XOR(g0, g1)
b0 := !XOR(g0, g1, g2, g3)
b1 := t1 & g2 & g3 + !t1 & !g2 & !g3
b2 := t1 & g2 & !g3
b3 := t1 & !g2 & !g3
n0 := XOR(b0, g0)
n1 := XOR(b1, g1)
n2 := XOR(b2, g2)
n3 := XOR(b3, g3)
The current Gray code word is g3 g2 g1 g0 and the next code word is n3 n2 n1 n0. The bits b3 b2 b1 b0 decide whether each bit of the code word flips when progressing to the subsequent code word; only one bit changes between adjacent code words.
Suppose X and Y are two positive integers and Y is a power of two. Then what does this expression calculate?
(X+Y-1) & ~(Y-1)
I found this expression in a C/C++ implementation of a memory pool (X represents the object size in bytes and Y the alignment in bytes; the expression returns the block size in bytes to use in the memory pool).
& ~(Y-1), where Y is a power of 2 (Y = 2^n), zeroes the last n bits: Y-1 produces n 1-bits, inverting that via ~ gives you a mask with n zeroes at the end, and ANDing via bit-level & clears the bits where the mask is zero.
Effectively that produces a number that is a multiple of Y.
The masking can reduce the number by at most Y-1, so add Y-1 first, giving (X+Y-1) & ~(Y-1). This is a number that's not less than X, and is a multiple of Y.
It gives you the next Y-aligned address of current address X.
Say, your current address X is 0x10000, and your alignment is 0x100, it will give you 0x10000. But if your current address X is 0x10001, you will get "next" aligned address of 0x10100.
This is useful in the scenario that you want your new object always to be aligned to blocks in memory, but not leaving any block unused. So you want to know what is the next available block-aligned address.
Why don't you just try some input and observe what happens?
#include <iostream>

unsigned compute(unsigned x, unsigned y)
{
    return (x + y - 1) & ~(y - 1);
}

int main()
{
    std::cout << "(x + y - 1) & ~(y - 1)" << std::endl;
    for (unsigned x = 0; x < 9; ++x)
    {
        std::cout << "x=" << x << ", y=2 -> " << compute(x, 2) << std::endl;
    }
    std::cout << "----" << std::endl;
    std::cout << "(x + y - 1) & ~(y - 1)" << std::endl;
    for (unsigned x = 0; x < 9; ++x)
    {
        std::cout << "x=" << x << ", y=4 -> " << compute(x, 4) << std::endl;
    }
    return 0;
}
Output:
First set uses x in [0, 8] and y is constant 2. Second set uses x in [0, 8] and y is constant 4.
(x + y - 1) & ~(y - 1)
x=0, y=2 -> 0
x=1, y=2 -> 2
x=2, y=2 -> 2
x=3, y=2 -> 4
x=4, y=2 -> 4
x=5, y=2 -> 6
x=6, y=2 -> 6
x=7, y=2 -> 8
x=8, y=2 -> 8
----
(x + y - 1) & ~(y - 1)
x=0, y=4 -> 0
x=1, y=4 -> 4
x=2, y=4 -> 4
x=3, y=4 -> 4
x=4, y=4 -> 4
x=5, y=4 -> 8
x=6, y=4 -> 8
x=7, y=4 -> 8
x=8, y=4 -> 8
It's easy to see the output (i.e., result right of ->) is always a multiple of y such that the output is greater than or equal to x.
First I assume that X and Y are unsigned integers.
Let's have a look at the right part:
If Y is a power of 2, its binary representation has a single 1 bit and all others 0. Example: 8 is binary 00..01000.
If you subtract 1, the highest bit becomes 0 and all bits to its right become 1. Example: 8-1 = 7, binary 00..00111.
If you negate this number with ~, all the high bits (including the original one) turn to 1 and the lowest bits to 0. Example: ~7 is 11..11000.
Now if you do a binary AND (&) with any number, you clear its lower bits, in our example the 3 lowest. The resulting number is hence a multiple of Y.
Let's look at the left side:
We've already analysed Y-1. In our example we had 7, that is 00..00111
If you add this to any number, you push the result past the next multiple of Y unless the number is already a multiple. Example with 5: 5+7 = 12, binary 00..01100; example with 10: 10+7 = 17, binary 00..10001.
If you then perform the AND, you erase the lower bits: with 5 we get 00..01000 = 8, and with 10 we get 00..10000 = 16.
Conclusion: it's the smallest multiple of Y which is greater than or equal to X.
Let's break it down, piece by piece.
(X+Y-1) & ~(Y-1)
Let's suppose that X = 11 and Y = 16 in accordance with your rules and that the integers are 8 bits.
(11+16-1) & ~(16-1)
Do the Addition and Subtraction
(26) & ~(15)
Translate this into binary
(0001 1010) & ~(0000 1111)
~ means not or to invert the zeros and ones
(0001 1010) & (1111 0000)
& means only to take the bits that are both ones
0001 0000
convert back to decimal
16
other examples
X = 78, Y = 32 results in 96
X = 25, Y = 64 results in 64
X = 47, Y = 16 results in 48
So it would seem that the purpose of this is to find the lowest multiple of Y that is equal to or greater than X. This could be used for finding the start/end address of a block of memory, for positioning items on the screen, or for any number of other purposes. But without context, and possibly a full code example, there's no guarantee.
(X+Y-1) & ~(Y-1)
x = 7 = 0b0111
y = 4 = 0b0100
x+y-1 = 0b1010
y-1 = 3 = 0b0011
~(y-1) = 0b1100
(x+y-1) & ~(y-1) = 0b1000 = 8
--
x = 12 = 0b1100
y = 2 = 0b0010
x+y-1 = 13 = 0b1101
y-1 = 1 = 0b0001
~(y-1) = 0b1110
(x+y-1) & ~(y-1) = 0b1100 = 12
(x+y-1) & ~(y-1) is the smallest multiple of y greater than or equal to x
It provides a specified alignment of a value, for example of a memory address (say, when you want to get the next aligned address).
For example, if you want a memory address aligned on a paragraph (16-byte) boundary, you can write
( address + 16 - 1 ) & ~( 16 - 1 )
or
( address + 15 ) & ~15
or
( address + 15 ) & ~0xf
In this case the low four bits (the bits below 16) will be zeroed.
This part of the expression
( address + alignment - 1 )
is used for rounding,
and this part of the expression
~( alignment - 1 )
is used to build a mask that zeroes the low bits.
What do "Non-Power-Of-Two Textures" mean? I read this tutorial and came across some bitwise operations ("<<", ">>", "^", "~"), but I don't understand what they are doing.
For example, the following code:
GLuint LTexture::powerOfTwo(GLuint num)
{
    if (num != 0)
    {
        num--;
        num |= (num >> 1); //Or first 2 bits
        num |= (num >> 2); //Or next 2 bits
        num |= (num >> 4); //Or next 4 bits
        num |= (num >> 8); //Or next 8 bits
        num |= (num >> 16); //Or next 16 bits
        num++;
    }
    return num;
}
I really want to understand these operations. I also read this very short article; I wanted to see examples of usage, but found none. I did a test:
int a = 5;
a <<= 1; //a = 10
a = 5;
a <<= 2; //a = 20
a = 5;
a <<= 3; //a = 40
Okay, this looks like multiplying by two, but:
int a = 5;
a >>= 1; // a = 2 Whaat??
In C++, the <<= is the "left binary shift" assignment operator; the operand on the left is treated as a binary number, the bits are moved to the left, and zero bits are inserted on the right.
The >>= is the right binary shift; bits are moved to the right and "fall off" the right end, so it's like a division by 2 (for each bit) but with truncation. For negative signed integers, by the way, additional 1 bits are shifted in at the left end ("arithmetic right shift"), which may be surprising; for positive signed integers, or unsigned integers, 0 bits are shifted in at the left ("logical right shift").
"Powers of two" are the numbers created by successive doublings of 1: 2, 4, 8, 16, 32… Most graphics hardware prefers to work with texture maps which are powers of two in size.
As said in http://lazyfoo.net/tutorials/OpenGL/08_non_power_of_2_textures/index.php
powerOfTwo takes the argument and finds the nearest power of two that is not less than it.
GLuint powerOfTwo( GLuint num );
/*
Pre Condition:
-None
Post Condition:
-Returns nearest power of two integer that is greater
Side Effects:
-None
*/
Let's test:
num=60 (decimal) and its binary is 111100
num--; .. 59 111011
num |= (num >> 1); //Or first 2 bits 011101 | 111011 = 111111
num |= (num >> 2); //Or next 2 bits 001111 | 111111 = 111111
num |= (num >> 4); //Or next 4 bits 000011 | 111111 = 111111
num |= (num >> 8); //Or next 8 bits 000000 | 111111 = 111111
num |= (num >> 16); //Or next 16 bits 000000 | 111111 = 111111
num++; ..63+1 = 64
output 64.
For num=5: num-1 = 4 (binary 0100); after all the num |= (num >> N) steps it becomes 0111, or 7 decimal. Then num+1 equals 8.
As you probably know, the data in our computers is represented in the binary system, where digits are either 1 or 0.
So for example number 10 decimal = 1010 binary. (1*2^3 + 0*2^2 + 1*2^1 + 0*2^0).
Let's go to the operations now.
Binary OR (|) means that wherever at least one input has a 1, the output will be 1.
1010
| 0100
------
1110
~ NOT means negation i.e. all 0s become 1s and all 1s become 0s.
~ 1010
------
0101
XOR (^) outputs a 1 where the two input bits differ (one is 1 and the other 0); matching bits produce a 0.
1010
^ 0110
------
1100
Bit shift.
N >> x means we "slide" our number N, x bits to the right.
1010 >> 1 = 0101(0) // zero in the brackets is dropped,
since it goes out of the representation = 0101
1001 >> 1 = 0100(1) // (1) is dropped = 0100
<< behaves the same way, just in the opposite direction.
1000 << 1 = (1)0000 // the 1 in the brackets is dropped = 0000
Since binary numbers are sums of powers of 2, shifting one position left or right multiplies or divides by 2.
Let num = 36. First subtract 1, giving 35. In binary, this is 100011.
Right shift by 1 position gives 10001 (the rightmost digit disappears). Bitwise Or'ed with num gives:
100011
10001
-------
110011
Note that this ensures two 1's on the left.
Now right shift by 2 positions, giving 1100. Bitwise Or:
110011
1100
-------
111111
This ensures four 1's on the left.
And so on, until the value is completely filled with 1's from the leftmost.
Add 1 and you get 1000000, a power of 2.
This procedure always generates a power of two, and you can check that it is just above the initial value of num.
This question showed up on one of my teacher's old final exams. How does one even think logically about arriving at the answer?
I am familiar with the bit-manipulation operators and conversion between hex and binary.
int whatisthis(int x) {
    x = (0x55555555 & x) + (0x55555555 & (x >>> 1));
    x = (0x33333333 & x) + (0x33333333 & (x >>> 2));
    x = (0x0f0f0f0f & x) + (0x0f0f0f0f & (x >>> 4));
    x = (0x00ff00ff & x) + (0x00ff00ff & (x >>> 8));
    x = (0x0000ffff & x) + (0x0000ffff & (x >>> 16));
    return x;
}
Didn't you forget some left shifts?
x = ((0x55555555 & x) <<< 1) + (0x55555555 & (x >>> 1));
x = ((0x33333333 & x) <<< 2) + (0x33333333 & (x >>> 2));
snip...
This would then be the reversal of bits from left to right.
You can see that bits are moved together rather than one by one, and this leads to a cost of O(log2(nbits)): you invert 2^5 = 32 bits in 5 statements.
It might help you to rewrite the constants in binary to understand better how it works.
If there are no left shifts, then I can't help you because the additions will generate carry and I can't see any obvious meaning...
EDIT: OK, interesting, so this is for counting the number of bits set to 1 (also known as population count or popcount)... Here is a quick Squeak Smalltalk test on 16 bits:
| f |
f := [:x |
    | y |
    y := (x bitAnd: 16r5555) + (x >> 1 bitAnd: 16r5555).
    y := (y bitAnd: 16r3333) + (y >> 2 bitAnd: 16r3333).
    y := (y bitAnd: 16r0F0F) + (y >> 4 bitAnd: 16r0F0F).
    y := (y bitAnd: 16r00FF) + (y >> 8 bitAnd: 16r00FF).
    y].
^(0 to: 16rFFFF) detect: [:i | i bitCount ~= (f value: i)] ifNone: [nil]
The first statement handles each pair of bits. If no bit is set in the pair it produces 00; if a single bit is set, it produces 01; if both bits are set, it produces 10.
00 -> 0+0 -> 00 = 0, no bit set
01 -> 1+0 -> 01 = 1, 1 bit set
10 -> 0+1 -> 01 = 1, 1 bit set
11 -> 1+1 -> 10 = 2, 2 bits set
So it counts the number of bits set in each pair.
The second statement handles groups of 4 adjacent bits:
0000 -> 00+00 -> 0000 0+0=0 bits set
0001 -> 01+00 -> 0001 1+0=1 bits set
0010 -> 10+00 -> 0010 2+0=2 bits set
0100 -> 00+01 -> 0001 0+1=1 bits set
0101 -> 01+01 -> 0010 1+1=2 bits set
0110 -> 10+01 -> 0011 2+1=3 bits set
1000 -> 00+10 -> 0010 0+2=2 bits set
1001 -> 01+10 -> 0011 1+2=3 bits set
1010 -> 10+10 -> 0100 2+2=4 bits set
So, while the first step replaced each pair of bits with the number of bits set in that pair, the second adds these counts in each pair of pairs...
The next step handles each group of 8 adjacent bits, summing the number of bits set in two groups of 4...
Can someone explain to me why the following results in b = 13?
int a, b, c;
a = 1|2|4;
b = 8;
c = 2;
b |= a;
b&= ~c;
It is using bitwise manipulation. (Assuming ints are 1 byte, use two's complement storage, etc.)
a = 1|2|4 means a = 00000001 or 00000010 or 00000100, which is 00000111, or 7.
b = 8 means b = 00001000.
c = 2 means c = 00000010.
b |= a means b = b | a which means b = 00001000 or 00000111, which is 00001111, or 15.
~c means not c, which is 11111101.
b &= ~c means b = b & ~c, which means b = 00001111 and 11111101, which is 00001101, or 13.
http://www.cs.cf.ac.uk/Dave/C/node13.html
a = 1|2|4
= 0b001
| 0b010
| 0b100
= 0b111
= 7
b = 8 = 0b1000
c = 2 = 0b10
b|a = 0b1000
| 0b0111
= 0b1111 = 15
~c = 0b111...1101
(b|a) & ~c = 0b00..001111
& 0b11..111101
= 0b00..001101
= 13
Let's go into binary mode:
a = 0111 (7 in decimal)
b = 1000 (8)
c = 0010 (2)
then we OR b with a to get b = 1111 (15)
c = 0010 and ~c = 1101
finally b is ANDed with negated c, which gives us b = 1101 (13)
Hint: convert decimal to binary and give it a shot... maybe, just maybe, you'll figure it all out by yourself.
a = 1 | 2 | 4;
Assigns the value 7 to a. This is because you are performing a bitwise OR operation on the constants 1, 2 and 4. Since the binary representation of each of these is 1, 10 and 100 respectively, you get 111 which is 7.
b |= a;
This ORs b and a and assigns the result to b. Since b's binary representation is 1000 (8) and a's binary representation is now 111 (7), you end up with 1111, or 15.
b &= ~c;
The ~c in this expression means the bitwise negation of c. This essentially flips all 0's to 1's and vice versa in the binary representation of c. This means c switches from 10 to 111...11101.
After negating c, a bitwise AND is performed between b and ~c. Only bits that are 1 in both b and ~c remain 1; all others become 0. Since b is now 1111 and ~c is all 1's except the second-lowest-order bit, all of b's bits remain 1 except the 2 bit.
Clearing b's 2 bit has the same effect as subtracting 2 from its value. Since its current value is 15, and 15 - 2 = 13, the assignment results in b == 13.