Suppose X and Y are two positive integers and Y is a power of two. Then what does this expression calculate?
(X+Y-1) & ~(Y-1)
I found this expression in certain C/C++ implementations of a memory pool (X represents the object size in bytes, Y represents the alignment in bytes, and the expression returns the block size in bytes fit for use in the memory pool).
& ~(Y-1), where Y is a power of 2, zeroes the last n bits, where Y = 2^n: Y-1 produces n 1-bits, inverting that via ~ gives you a mask with n zeroes at the end, and ANDing via the bitwise & zeroes the bits where the mask is zero.
Effectively that produces a number that is a multiple of Y.
The masking can at most subtract Y-1 from the number, so add that first, giving (X+Y-1) & ~(Y-1). The result is the smallest number that is not less than X and is a multiple of Y.
It gives you the next Y-aligned address at or after the current address X.
Say your current address X is 0x10000 and your alignment is 0x100: it will give you 0x10000. But if your current address X is 0x10001, you will get the "next" aligned address, 0x10100.
This is useful when you want every new object to be aligned to blocks in memory without leaving any block unused, so you need to know the next available block-aligned address.
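As a minimal sketch of the address example above, the expression can be wrapped in a small helper. The names is_power_of_two and align_up are hypothetical (they do not come from any particular memory pool implementation), and the sketch assumes x + alignment - 1 does not overflow.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true when v has exactly one bit set.
inline bool is_power_of_two(std::size_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Round x up to the nearest multiple of alignment.
// Assumption: alignment is a power of two and x + alignment - 1 does not overflow.
inline std::size_t align_up(std::size_t x, std::size_t alignment) {
    assert(is_power_of_two(alignment));
    return (x + alignment - 1) & ~(alignment - 1);
}
```

With these definitions, align_up(0x10000, 0x100) yields 0x10000 and align_up(0x10001, 0x100) yields 0x10100, matching the two addresses discussed above.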
Why don't you just try some input and observe what happens?
#include <iostream>
unsigned compute(unsigned x, unsigned y)
{
    return (x + y - 1) & ~(y - 1);
}
int main()
{
    std::cout << "(x + y - 1) & ~(y - 1)" << std::endl;
    for (unsigned x = 0; x < 9; ++x)
    {
        std::cout << "x=" << x << ", y=2 -> " << compute(x, 2) << std::endl;
    }
    std::cout << "----" << std::endl;
    std::cout << "(x + y - 1) & ~(y - 1)" << std::endl;
    for (unsigned x = 0; x < 9; ++x)
    {
        std::cout << "(x=" << x << ", y=4) -> " << compute(x, 4) << std::endl;
    }
    return 0;
}
Live Example
Output:
First set uses x in [0, 8] and y is constant 2. Second set uses x in [0, 8] and y is constant 4.
(x + y - 1) & ~(y - 1)
x=0, y=2 -> 0
x=1, y=2 -> 2
x=2, y=2 -> 2
x=3, y=2 -> 4
x=4, y=2 -> 4
x=5, y=2 -> 6
x=6, y=2 -> 6
x=7, y=2 -> 8
x=8, y=2 -> 8
----
(x + y - 1) & ~(y - 1)
(x=0, y=4) -> 0
(x=1, y=4) -> 4
(x=2, y=4) -> 4
(x=3, y=4) -> 4
(x=4, y=4) -> 4
(x=5, y=4) -> 8
(x=6, y=4) -> 8
(x=7, y=4) -> 8
(x=8, y=4) -> 8
It's easy to see the output (i.e., result right of ->) is always a multiple of y such that the output is greater than or equal to x.
First I assume that X and Y are unsigned integers.
Let's have a look at the right part:
If Y is a power of 2, its binary representation has one bit set to 1 and all the others set to 0. Example: 8 is binary 00..01000.
If you subtract 1, that bit becomes 0 and all the bits to its right become 1. Example: 8-1 = 7, in binary 00..00111.
If you negate this number with ~, all the higher bits (including the original one) turn to 1 and the lower ones to 0. Example: ~7 is 11..11000.
Now if you do a bitwise AND (&) with any number, you set all the lower bits to 0 (in our example, the 3 lowest bits). The resulting number is hence a multiple of Y.
Let's look at the left side:
We've already analysed Y-1. In our example we had 7, that is 00..00111
If you add this to X, the sum lands at or above the smallest multiple of Y that is greater than or equal to X, but below the multiple after it. Example with 5: 5+7=12, so 00..01100; example with 10: 10+7=17, so 00..10001.
If you then perform the AND, you erase the lower bits: in our example with 5 we get 00..01000 = 8, and in our example with 10 we get 00..10000 = 16.
Conclusion: it's the smallest multiple of Y which is greater than or equal to X.
Let's break it down, piece by piece.
(X+Y-1) & ~(Y-1)
Let's suppose that X = 11 and Y = 16 in accordance with your rules and that the integers are 8 bits.
(11+16-1) & ~(16-1)
Do the Addition and Subtraction
(26) & ~(15)
Translate this into binary
(0001 1010) & ~(0000 1111)
~ means not or to invert the zeros and ones
(0001 1010) & (1111 0000)
& means only to take the bits that are both ones
0001 0000
convert back to decimal
16
other examples
X = 78, Y = 32 results in 96
X = 25, Y = 64 results in 64
X = 47, Y = 16 results in 48
So, it would seem to me that the purpose of this is to find the lowest multiple of Y that is equal to or greater than X. This could be used for finding the start/end address of a block of memory, for positioning items on the screen, or for any number of other purposes. But without context, and possibly even a full code example, there's no guarantee.
(X+Y-1) & ~(Y-1)
x = 7 = 0b0111
y = 4 = 0b0100
x+y-1 = 10 = 0b1010
y-1 = 3 = 0b0011
~(y-1) = 0b1100
(x+y-1) & ~(y-1) = 0b1000 = 8
--
x = 12 = 0b1100
y = 2 = 0b0010
x+y-1 = 13 = 0b1101
y-1 = 1 = 0b0001
~(y-1) = 0b1110
(x+y-1) & ~(y-1) = 0b1100 = 12
(x+y-1) & ~(y-1) is the smallest multiple of y greater than or equal to x
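One way to gain confidence in that statement is to compare the mask form with a plain division-based round-up over a range of inputs. The following is only a sketch; the function names are made up for illustration, and forms_agree assumes max_y is small enough that doubling y never overflows.

```cpp
#include <cassert>

// Round up using the bit trick; y must be a power of two.
inline unsigned round_up_mask(unsigned x, unsigned y) {
    assert(y != 0 && (y & (y - 1)) == 0);
    return (x + y - 1) & ~(y - 1);
}

// Round up using integer division; works for any y > 0.
inline unsigned round_up_div(unsigned x, unsigned y) {
    assert(y != 0);
    return ((x + y - 1) / y) * y;
}

// Compare both forms for every x below limit and every power-of-two y <= max_y.
inline bool forms_agree(unsigned limit, unsigned max_y) {
    for (unsigned y = 1; y <= max_y; y <<= 1)
        for (unsigned x = 0; x < limit; ++x)
            if (round_up_mask(x, y) != round_up_div(x, y))
                return false;
    return true;
}
```

The division form works for any positive y; the mask form is the cheaper special case that applies only when y is a power of two.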
It seems to provide a specified alignment for a value, for example a memory address (for instance when you want to get the next aligned address).
For example, if you want a memory address to be aligned on a paragraph (16-byte) boundary, you can write
( address + 16 - 1 ) & ~( 16 - 1 )
or
( address + 15 ) & ~15
or
( address + 15 ) & ~0xf
In this case the four low-order bits (those below the bit with value 16) will be zeroed.
This part of the expression
( address + alignment - 1 )
is used for rounding up,
and this part of the expression
~( alignment - 1 )
is used to build a mask that zeroes the low bits.
Given a long int x, count the number of values of a that satisfy the following conditions:
a XOR x > x
0 < a < x
where a and x are long integers and XOR is the bitwise XOR operator
How would you go about completing this problem?
I should also mention that the input x can be as large as 10^10.
I have managed to get a brute-force solution by iterating from 0 to x, checking the conditions and incrementing a count value; however, this is not an optimal solution.
This is the brute force that I tried. It works but is extremely slow for large values of x.
for(int i = 0; i < x; i++)
{
    if((0 < i && i < x) && (i ^ x) > x)
        count++;
}
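long long NumberOfA(long long x)
{
    long long t = x << 1;
    // Clear the lowest set bit until only the highest bit of t remains.
    while (t ^ (t & -t)) t ^= (t & -t);
    // t - 1 is the all-ones value B, and B - x equals t - (x + 1).
    return t - ++x;
}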
long long x = 10000000000;
printf("%lld ==> %lld\n", 10LL, NumberOfA(10LL) );
printf("%lld ==> %lld\n", x, NumberOfA(x) );
Output
10 ==> 5
10000000000 ==> 7179869183
Trying to explain the logic (using example 10, or 1010b)
Shift x to the left 1. (Value 20 or 10100b)
Turn off all low bits, leaving just the high bit (Value 16 or 10000b)
Subtract x+1 (16 - 11 == 5)
Attempting to explain
(although it's not easy)
Your rule is that a ^ x must be bigger than x, but that you cannot add extra bits to a or x.
(If you start with a 4-bit value, you can only use 4 bits)
The biggest possible value for a number in N bits is 2^N - 1.
(e.g. 4-bit number, 2^4 - 1 == 15)
Let's call this number B.
Above your value x and up to B (inclusive), there are B-x possible values. Each such value v corresponds to exactly one a = v ^ x, and that a is smaller than x because v and x share the same highest bit.
(back to my example, 10. Above 10 and up to 15, there are 5 possible values: 11, 12, 13, 14, 15)
In my code, t is x << 1, then with all the low bits turned off.
(10 << 1 is 20; turn off all the low bits to get 16)
Then 16 - 1 is B, and B - x is your answer:
(t - 1 - x, is the same as t - ++x, is the answer)
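To check the B - x reasoning against the brute-force definition, a small comparison can help. This is a sketch with hypothetical names (count_brute, count_closed); count_closed builds B directly as 1, 3, 7, 15, ... instead of using the shift-and-clear loop from the answer, and it assumes x > 0.

```cpp
#include <cassert>

// Brute force: count a in (0, x) with (a ^ x) > x. Only practical for small x.
inline long long count_brute(long long x) {
    long long count = 0;
    for (long long a = 1; a < x; ++a)
        if ((a ^ x) > x)
            ++count;
    return count;
}

// Closed form: B - x, where B is the all-ones value with as many bits as x.
inline long long count_closed(long long x) {
    assert(x > 0);
    long long b = 1;
    while (b < x)
        b = (b << 1) | 1;  // grow 1, 3, 7, 15, ... until b >= x
    return b - x;
}
```

For x = 10 both functions return 5, and count_closed(10000000000) gives 7179869183, the same values as in the output above.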
One way to look at this is to consider each bit in x.
If it's 1, then flipping it will yield a smaller number.
If it's 0, then flipping it will yield a larger number, and we should count it - and also all the combinations of bits to the right. That conveniently adds up to the mask value.
long f(long const x)
{
    // only positive x can have non-zero result
    if (x <= 0) return 0;
    long count = 0;
    // Iterate from LSB to MSB
    for (long mask = 1; mask < x; mask <<= 1)
        count += x & mask
            ? 0
            : mask;
    return count;
}
We might suspect a pattern here - it looks like we're just copying x and flipping its bits.
Let's confirm, using a minimal test program:
#include <cstdlib>
#include <iostream>
int main(int, char **argv)
{
    while (*++argv)
        std::cout << *argv << " -> " << f(std::atol(*argv)) << std::endl;
}
0 -> 0
1 -> 0
2 -> 1
3 -> 0
4 -> 3
5 -> 2
6 -> 1
7 -> 0
8 -> 7
9 -> 6
10 -> 5
11 -> 4
12 -> 3
13 -> 2
14 -> 1
15 -> 0
So all we have to do is 'smear' the value so that all the zero bits after the most-significant 1 are set, then xor with that:
long f(long const x)
{
    if (x <= 0) return 0;
    long mask = x;
    // Set the lowest zero bit until mask is all ones up to its top bit.
    while (mask & (mask+1))
        mask |= mask+1;
    return mask ^ x;
}
This is much faster, and still O(log n).
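As an alternative to the loop, the smear can also be done with a fixed sequence of shifts, a common bit trick that is not part of the answer above. The sketch below assumes a 64-bit unsigned input; smear_right and count_flippable are illustrative names.

```cpp
#include <cstdint>

// Set every bit below the most significant 1 of x, using a fixed number of shifts.
inline std::uint64_t smear_right(std::uint64_t x) {
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    x |= x >> 32;
    return x;
}

// Same result as f above for positive inputs: flip the bits of x below its top bit.
inline std::uint64_t count_flippable(std::uint64_t x) {
    return smear_right(x) ^ x;
}
```

Because the input is unsigned, no special case for non-positive values is needed: count_flippable(0) is simply 0.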
I am doing a project on digital filters. I needed to know how to add a 4 bit binary number to the most significant 4 bits of an 8 bit number. For example:
0 1 0 0 0 0 0 0 //x
+ 1 0 1 0 //y
= 1 1 1 0 0 0 0 0 //z
Can I add using code somewhat like this?
z=[7:4]x + y
or do I have to concatenate the 4 bit number with another four zeros and add?
Assuming y is the 4 bit number and x the 8 bit number:
If you do
assign z = x[7:4] + y;
Then you are adding y to the upper nibble of x treated as a plain 4-bit value, so the sum lands in the low bits of z and the most significant part of z is padded with 0's.
If you do
assign z = y[7:4] + x;
You will get an error message from the synthesizer, as the subscripts are out of range for the 4-bit y.
So do this:
assign z = {y,4'b0} + x;
This performs an 8-bit addition of x and the value of y shifted 4 bits to the left, which is what you wanted.
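For readers who want to check the arithmetic outside a simulator, here is a rough C++ model of that last assignment. It is only a software sketch of the 8-bit behaviour, not Verilog, and the function name is made up for illustration.

```cpp
#include <cstdint>

// Software model of z = {y, 4'b0} + x for an 8-bit z:
// y is placed in the upper nibble, then an 8-bit add is performed.
// Any carry out of bit 7 is discarded by the cast.
inline std::uint8_t add_to_high_nibble(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>(x + ((y & 0x0Fu) << 4));
}
```

With x = 0x40 (0100 0000) and y = 0xA (1010) this returns 0xE0 (1110 0000), the z from the question.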
I'm writing a program that exchanges the values of the bits on positions 3, 4 and 5 with bits on positions 24, 25 and 26 of a given 32-bit unsigned integer.
So let's say I use the number 15 and I want to turn the 4th bit into a 0. I'd use...
int number = 15;
int newnumber = number & (~(1 << 3));
// newnumber is 7
This makes sense because I'm changing the 4th bit from 1 to 0, so 15 (1111) becomes 7 (0111).
However, this won't work the other way round (changing a 0 to a 1). I know how to change a 0 to a 1 via a different method, but I really want to understand the code in this method.
So why won't it work?
The truth table for x AND y is:
x y Output
-----------
0 0 0
0 1 0
1 0 0
1 1 1
In other words, the output/result will only be 1 if both inputs are 1, which means that you cannot change a bit from 0 to 1 through a bitwise AND. Use a bitwise OR for that (e.g. int newnumber = number | (1 << 3);)
To summarize:
Use & ~(1 << n) to clear bit n.
Use | (1 << n) to set bit n.
To set the fourth bit to 0, you AND it with ~(1 << 3) which is the negation of 1000, or 0111.
By the same reasoning, you can set it to 1 by ORing with 1000.
To toggle it, XOR with 1000.
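The three operations can be sketched as small helpers, together with one possible way to do the exchange of bits 3-5 with bits 24-26 that the question is working towards. All names here are illustrative; the XOR-based swap is one common approach, not necessarily what the asker ended up using.

```cpp
#include <cassert>
#include <cstdint>

// Clear, set and toggle bit n (0 = least significant) of a 32-bit value.
inline std::uint32_t clear_bit(std::uint32_t v, unsigned n) {
    assert(n < 32);
    return v & ~(std::uint32_t{1} << n);
}
inline std::uint32_t set_bit(std::uint32_t v, unsigned n) {
    assert(n < 32);
    return v | (std::uint32_t{1} << n);
}
inline std::uint32_t toggle_bit(std::uint32_t v, unsigned n) {
    assert(n < 32);
    return v ^ (std::uint32_t{1} << n);
}

// Exchange the 3-bit groups at positions 3..5 and 24..26.
inline std::uint32_t swap_groups(std::uint32_t v) {
    // Bits that differ between the two groups.
    std::uint32_t diff = ((v >> 3) ^ (v >> 24)) & 0x7u;
    // Flipping exactly those bits in both groups exchanges them.
    return v ^ (diff << 3) ^ (diff << 24);
}
```

For example, clear_bit(15, 3) gives 7 and set_bit(7, 3) gives 15, which is the direction the AND-based code could not handle.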
This question showed up on one of my teacher's old final exams. How does one even think logically about arriving at the answer?
I am familiar with the bit-manipulation operators and conversion between hex and binary.
int whatisthis(int x) {
    x = (0x55555555 & x) + (0x55555555 & (x >>> 1));
    x = (0x33333333 & x) + (0x33333333 & (x >>> 2));
    x = (0x0f0f0f0f & x) + (0x0f0f0f0f & (x >>> 4));
    x = (0x00ff00ff & x) + (0x00ff00ff & (x >>> 8));
    x = (0x0000ffff & x) + (0x0000ffff & (x >>> 16));
    return x;
}
Didn't you forget some left shifts?
x = ((0x55555555 & x) << 1) + (0x55555555 & (x >>> 1));
x = ((0x33333333 & x) << 2) + (0x33333333 & (x >>> 2));
snip...
This would then be a reversal of the bits from left to right.
You can see that the bits are moved in groups rather than one by one, which leads to a cost of O(log2(nbit))
(you reverse 2^5=32 bits in 5 statements)
It might help you to rewrite the constants in binary to understand better how it works.
If there are no left shifts, then I can't help you, because the additions will generate carries and I can't see any obvious meaning...
EDIT: OK, interesting, so this is for counting the number of bits set to 1 (also known as population count or popcount)... Here is a quick test in Squeak Smalltalk on 16 bits
| f |
f := [:x |
| y |
y := (x bitAnd: 16r5555) + (x >> 1 bitAnd: 16r5555).
y := (y bitAnd: 16r3333) + (y >> 2 bitAnd: 16r3333).
y := (y bitAnd: 16r0F0F) + (y >> 4 bitAnd: 16r0F0F).
y := (y bitAnd: 16r00FF) + (y >> 8 bitAnd: 16r00FF).
y].
^(0 to: 16rFFFF) detect: [:i | i bitCount ~= (f value: i)] ifNone: [nil]
The first statement handles each pair of bits. If no bit is set in the pair it produces 00, if a single bit is set it produces 01, and if two bits are set it produces 10.
00 -> 0+0 -> 00 = 0, no bit set
01 -> 1+0 -> 01 = 1, 1 bit set
10 -> 0+1 -> 01 = 1, 1 bit set
11 -> 1+1 -> 10 = 2, 2 bits set
So it counts the number of bits set in each pair.
The second statement handles groups of 4 adjacent bits:
0000 -> 00+00 -> 0000 0+0=0 bits set
0001 -> 01+00 -> 0001 1+0=1 bits set
0010 -> 10+00 -> 0010 2+0=2 bits set
0100 -> 00+01 -> 0001 0+1=1 bits set
0101 -> 01+01 -> 0010 1+1=2 bits set
0110 -> 10+01 -> 0011 2+1=3 bits set
1000 -> 00+10 -> 0010 0+2=2 bits set
1001 -> 01+10 -> 0011 1+2=3 bits set
1010 -> 10+10 -> 0100 2+2=4 bits set
So, while the first step replaced each pair of bits by the number of bits set in that pair, the second adds up these counts for each pair of pairs...
The next one handles each group of 8 adjacent bits, and sums the number of bits set in its two groups of 4...
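The same five steps can be written in C++ with an unsigned type, where >> is already a logical shift and plays the role of Java's >>>. This is a sketch for 32-bit values; popcount32 is just an illustrative name.

```cpp
#include <cstdint>

// Count the bits set in x by summing neighbouring fields of doubling width.
inline std::uint32_t popcount32(std::uint32_t x) {
    x = (x & 0x55555555u) + ((x >> 1) & 0x55555555u);   // counts per 2-bit pair
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);   // counts per 4-bit group
    x = (x & 0x0F0F0F0Fu) + ((x >> 4) & 0x0F0F0F0Fu);   // counts per byte
    x = (x & 0x00FF00FFu) + ((x >> 8) & 0x00FF00FFu);   // counts per 16-bit half
    x = (x & 0x0000FFFFu) + ((x >> 16) & 0x0000FFFFu);  // count for all 32 bits
    return x;
}
```

Each field is always wide enough to hold its count (a 2-bit field holds at most 2, a 4-bit field at most 4, and so on), which is why the additions never carry into a neighbouring field.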
I want to implement unsigned left rotation in my integer class. However, because it is a template class, it can be any size from 128 bits up, so I cannot use algorithms that require a temporary of the same size because, if the type becomes big, a stack overflow will occur (especially if such a function is in a call chain).
So to fix this problem I reduced it to a question: what steps do I have to take to rotate a 32-bit number using only 4 bits? Well, if you think about it, a 32-bit number contains 8 groups of 4 bits each, so if the number of bits to rotate is 16 then a swap will occur between groups 0 and 4, 1 and 5, 2 and 6, 3 and 7, after which the rotation is done.
If the number of bits to rotate is less than 4 and greater than 0 then it is simple: just preserve the last N bits and start a shift-OR loop. E.g. suppose we have the number 0x9CE2; to rotate it left by 3 bits we do the following:
The number in binary is 1001 1100 1110 0010, with the nibbles indexed from 0 to 3 from right to left (little-endian order). We will call this number N and the number of bits in one group B.
[1] x <- N[3] >> (B - 3)
x <- 1001 >> 1
x <- 0100
y <- N[0] >> (B - 3)
y <- N[0] >> (4 - 3)
y <- 0010 >> 1
y <- 0001
N[0] <- (N[0] << 3) | x
N[0] <- (0010 << 3) | 0100
N[0] <- 0000 | 0100
N[0] <- 0100
[2] x <- y
x <- 0001
y <- N[1] >> (B - 3)
y <- N[1] >> (4 - 3)
y <- 1110 >> 1
y <- 0111
N[1] <- (N[1] << 3) | x
N[1] <- (1110 << 3) | 0001
N[1] <- 0000 | 0001
N[1] <- 0001
[3] x <- y
x <- 0111
y <- N[2] >> (B - 3)
y <- N[2] >> (4 - 3)
y <- 1100 >> 1
y <- 0110
N[2] <- (N[2] << 3) | x
N[2] <- (1100 << 3) | 0111
N[2] <- 0000 | 0111
N[2] <- 0111
[4] x <- y
x <- 0110
y <- N[3] >> (B - 3)
y <- N[3] >> (4 - 3)
y <- 1001 >> 1
y <- 0100
N[3] <- (N[3] << 3) | x
N[3] <- (1001 << 3) | 0110
N[3] <- 1000 | 0110
N[3] <- 1110
The result is 1110 0111 0001 0100, 0xE714 in hexadecimal, which is the right answer; and, if you try to apply it to any number with any precision, all you will need is one variable whose type is the element type of the array forming that bignum type.
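The walkthrough above can be sketched in C++ as follows. The sketch assumes the number is stored as four nibbles in the low 4 bits of each byte, with index 0 the least significant group; Nibbles, kGroupBits and rotl_small are names chosen for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr unsigned kGroupBits = 4;            // B in the walkthrough
using Nibbles = std::array<std::uint8_t, 4>;  // n[0] is the least significant group

// Rotate the whole number left by k bits, for 0 < k < kGroupBits,
// using only one group-sized carry variable.
inline void rotl_small(Nibbles& n, unsigned k) {
    assert(k > 0 && k < kGroupBits);
    // Bits leaving the top group wrap around into the bottom group.
    std::uint8_t carry = static_cast<std::uint8_t>(n.back() >> (kGroupBits - k));
    for (std::size_t i = 0; i < n.size(); ++i) {
        // Save the bits that will move up into the next group.
        std::uint8_t next = static_cast<std::uint8_t>(n[i] >> (kGroupBits - k));
        n[i] = static_cast<std::uint8_t>(((n[i] << k) | carry) & 0x0F);
        carry = next;
    }
}
```

Starting from the groups {0x2, 0xE, 0xC, 0x9} (that is, 0x9CE2) and rotating by 3 leaves {0x4, 0x1, 0x7, 0xE}, which reads as 0xE714 from the most significant group down.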
Now the real problem is when the number of bits to rotate is bigger than one group or bigger than half the size of the type (i.e. bigger than 4 bits or 8 bits in this example).
Usually, we shift bits from the last element to the first element and so on; but now, after shifting the last element to the first element, the result has to be relocated to a new place because the number of bits to rotate is bigger than one element (i.e. > 4 bits). The start index where the shift will start is the last index (3 in this example), and for the destination index we use the equation: dest_index = int(bits_count/half_bits) + 1, where half_bits is the number of bits in half the number and in this example half_bits = 8, so if bits_count = 7 then dest_index = int(7/8) + 1 = 1 + 1 = 2, and that means the result of the first shift must be relocated to destination index 2. That is my problem, for I cannot think of a way to write an algorithm for this situation.
Thanks.
This will just be some hints for one way to accomplish this. You can think about making two passes.
first pass, rotate on 4 bit boundaries only
second pass, rotate on 1 bit boundaries
So, the top level pseudo code might look like:
rotate (unsigned bits) {
    bits %= 32; /* only have 32 bits */
    if (bits == 0) return;
    rotate_nibs(bits/4);
    rotate_bits(bits%4);
}
So, to rotate by 13 bits, you first rotate by 3 nibbles, then rotate by 1 bit to get your total of 13 bits of rotation.
You could avoid nibble rotation altogether if you treat your array of nibbles as a circular buffer. Then, a nibble rotation is just a matter of changing the start position in the array for the 0 index.
If you must do rotation, it can be tricky. If you are rotating an 8 item array and only want to use 1 item of storage overhead to do the rotation, then to rotate by 3 items, you might approach it like this:
orig: A B C D E F G H
step 1: A B C A E F G H rem: D
2: A B C A E F D H rem: G
3: A G C A E F D H rem: B
4: A G C A B F D H rem: E
5: A G C A B F D E rem: H
6: A G H A B F D E rem: C
7: A G H A B C D E rem: F
8: F G H A B C D E done
But if you tried the same technique with 2, 4, or 6 item rotations, the cycle does not run through the whole array. So you have to be aware of whether the rotation count and the array size have a common divisor, and make the algorithm account for that. If you step through the 6 item rotation, some more clues fall out.
orig: A B C D E F G H
A
G
E
C cycled back to A's position
B
H
F
D done
Notice that the GCD(6,8) is 2, which means we should expect 4 iterations for each pass. Then, the rotation algorithm for an N item array could look like:
rotate (n) {
    G = GCD(n, N);
    for (i = 0; i < G; ++i) {
        p = arr[i];
        for (j = 1; j < N/G; ++j) {
            swap(p, arr[(i + j*n) % N]);
        }
        arr[i] = p;
    }
}
There is an optimization you can do to avoid the swap per iteration, that I'll leave as an exercise.
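For reference, the pseudocode above translates fairly directly into C++17. This is a sketch that keeps the swap per step (so the optimization is still left open); rotate_right_by is an illustrative name, and the element at index i ends up at index (i + n) % N.

```cpp
#include <cstddef>
#include <numeric>  // std::gcd (C++17)
#include <utility>  // std::swap, std::move
#include <vector>

// Rotate arr so that the element at index i moves to index (i + n) % N,
// using one element of extra storage.
template <typename T>
void rotate_right_by(std::vector<T>& arr, std::size_t n) {
    const std::size_t N = arr.size();
    if (N == 0) return;
    n %= N;
    if (n == 0) return;
    const std::size_t g = std::gcd(n, N);  // number of independent cycles
    for (std::size_t i = 0; i < g; ++i) {
        T p = std::move(arr[i]);           // element currently being carried
        for (std::size_t j = 1; j < N / g; ++j) {
            // Drop the carried element into its slot and pick up the old occupant.
            std::swap(p, arr[(i + j * n) % N]);
        }
        arr[i] = std::move(p);             // close the cycle
    }
}
```

Rotating A B C D E F G H by 3 with this function gives F G H A B C D E, as in the table above, and rotating by 6 runs two cycles of four elements each.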
I suggest calling an assembly language function for the bit rotation.
Many assembly languages have better facilities for rotating bits through carry and rotating carry through bits.
Often the assembly language version is less complex than the C or C++ function.
The drawback is that you will need a separate implementation of each assembly function for each different platform.