Shifting by k, for large values of k (CSAPP) - bit-manipulation

I am reading about shifting by k, for large values of k, in the book CSAPP. It discusses the effect of shifting a data type consisting of w bits by some value k >= w, and states the following:
"On many machines, the shift instructions consider only the lower log_2 w bits of the shift amount when shifting a w-bit value, and so the shift amount is effectively computed as k mod w."
While I do understand the k mod w part, I do not understand what CSAPP means by the lower log_2 w bits of the shift amount. I was thinking that if we have an integer on a 32-bit machine that we want to shift 36 units to the left, we would shift it 36 mod 32, or 4 bits to the left. I wasn't sure how that would be equivalent to the lower log_2 32 bits = 5 bits of the shift amount.
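One way to see the connection, sketched in Python under the assumption that w is a power of two (the helper name `effective_shift` is made up for illustration): log_2 32 = 5, and keeping only the low 5 bits of the shift amount is the same as masking it with 31, which for a power-of-two w is exactly k mod w. For k = 36 = 0b100100, the low 5 bits are 0b00100 = 4.

```python
def effective_shift(k, w=32):
    """Shift amount that a count-masking machine would use for a w-bit value.

    Assumes w is a power of two, so log2(w) is a whole number of bits.
    """
    if w <= 0 or w & (w - 1):
        raise ValueError("w must be a power of two")
    low_bits = w.bit_length() - 1        # log2(w): 5 when w == 32
    return k & ((1 << low_bits) - 1)     # keep only the low log2(w) bits of k


# 36 is 0b100100; its low 5 bits are 0b00100, i.e. 4, which is also 36 mod 32.
print(effective_shift(36))                                      # 4
print(all(effective_shift(k) == k % 32 for k in range(1024)))   # True
```

This only models the masking of the shift count that the quoted sentence describes; it says nothing about which machines actually behave this way.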

Related

Why is the result of a bitwise shift unrecoverable if there is a mathematical equivalent of the same operation?

Take for example the number 91. That number in binary is 1011011. If you shift that number to the right by 5 bits, you would get 2 (10 in binary). According to a Google search, bit shifting to the left or right by a certain amount of bits is the same as multiplying or dividing the number by 2 to the power of the number of bits to be shifted, respectively. So to get from 91 to 2 by bit shifting, the equation would look like this: 91 / 2^5, which is also 91 / 32. Now, of course if you did that in your calculator, there would be some decimal values, which aren't included when bit shifting. The resulting 2 is actually 2.84375. I'm sure you know that if you do a certain operation on a number and then you do the inverse, the result would be what you had in the first place. So does decimal precision have something to do with this?
There is a mathematical equivalent of shifting to the right... and the mathematical operation is UNRECOVERABLE.
You seem to think that shifting to the right is:
bit shifting to the left or right by a certain amount of bits is the same as multiplying or dividing the number by 2
This is what you will hear people casually say, but it is only half right: the two operations are similar, not the same.
The correct statement is:
shifting a base-2 number one digit to the right is THE SAME as dividing by two in the integer domain
If you have an integer calculator, if you did 91/32 you will get 2. You will not get ANY decimal point because we are operating in the integer domain.
For real numbers, the equivalent operation is:
FLOOR(91/32)
Which is also unrecoverable because it also results in 2.
The lesson here is to be careful when listening to what people CASUALLY say. Casual speech is often imprecise and assumes the listener is familiar with the subject. You need to dig deeper into what the statement is actually trying to say.
As for why it is unrecoverable? Division of integers gives two results: the quotient (which is the main result) and the remainder. When we divide 91 by 32 we are doing this:
      2
     __
32 ) 91
     64
     __
     27
So we get the result of 2 and a remainder of 27. The reason you can't get 91 by multiplying 2*32 is because we threw away the remainder.
You can get the result back if you saved the remainder. However, calculating the remainder is not a matter of simple shifts. Here's an example of how to make it reversible in C:
int test () {
    int a = 91;
    int b = 32;
    int result;
    int remainder;
    result = a / b;     // result will be 2
    remainder = a % b;  // remainder will be 27
    return (result * b) + remainder;  // returns 91
}
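For the specific case in the question, where the divisor is a power of two, the remainder that the shift throws away is just the low n bits of the number, so it can be kept with a mask. A small Python sketch of that idea, assuming non-negative integers (the function names are made up for illustration):

```python
def split_by_shift(x, n):
    """Return (x >> n, the n low bits that the shift discards)."""
    quotient = x >> n
    remainder = x & ((1 << n) - 1)   # the n low bits, e.g. 0b11011 for 91 and n = 5
    return quotient, remainder


def rejoin(quotient, remainder, n):
    """Undo split_by_shift: shift back and put the saved low bits in place."""
    return (quotient << n) | remainder


q, r = split_by_shift(91, 5)
print(q, r)              # 2 27
print(rejoin(q, r, 5))   # 91
```

Without the saved low bits, `2 << 5` only gets you back to 64.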
You can only recover the result of an operation if it has a 1-1 mapping between the inputs and outputs, i.e. it has an inverse function. But not all mathematical functions have an inverse function.
For example if f(x) = x >> n, where >> is the shift operator, then it'll be equivalent to
f(x) = ⌊x/2^n⌋
with ⌊ ⌋ being the floor function. Since there are many inputs that lead to the same output, the relationship isn't 1-1 and there can't be an inverse function for it. This function works the same for both signed and unsigned right shift:
91 >> 5 == floor(91.0/32.0) == 2
-91 >> 5 == floor(-91.0/32.0) == -3
Similarly for an unsigned left shift function g(x) = x << n, the equivalent is
g(x) = (x * 2^n) mod 2^N
with N being the size in bits of x, because integer math in hardware, C and many other languages always reduces modulo 2^N due to the limit of register size and the use of two's complement. And it's clear that the modulo function also isn't invertible/recoverable. The signed left shift is almost the same with some small modifications.
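Both formulas can be checked with a short Python sketch. Python integers are unbounded, so the N-bit wraparound is emulated here with an explicit modulo; the 8-bit width and the function name `shl_unsigned` are assumptions chosen only for illustration.

```python
import math


def shl_unsigned(x, n, width=8):
    """Left shift of an unsigned `width`-bit value: (x * 2**n) mod 2**width."""
    return (x << n) % (1 << width)


# Right shift is floor division by 2**n, and it is many-to-one:
# every x from 64 to 95 ends up as 2 after x >> 5.
print({x >> 5 for x in range(64, 96)})        # {2}
print(-91 >> 5, math.floor(-91 / 32))         # -3 -3

# Left shift with wraparound is many-to-one as well.
print(shl_unsigned(0x11, 4), shl_unsigned(0x91, 4))   # 16 16
```

Since several inputs map to the same output in both directions, neither shift can be undone from its result alone.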

How can I force a xor operation to stay within the visible ascii range?

Say you have a value k in [32,126] and another value p in [32,126]. If you compute p xor k you might get low values such as 0 not in [32,126]. One trick to stay within [32,127] is to compute p xor (k and 15). The and(k, 15) eliminates the 4 most significant bits of k while keeping k's least significant bits intact.
However, 127 is a control character in the ASCII table. Can you do something elegant that makes this range go from 32 to 126 and not 32 to 127?
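The claimed range for the masked version can be checked by brute force. This is only a verification sketch in Python (the function name is made up); it confirms the [32,127] range stated above and does not attempt to solve the 126 vs. 127 problem.

```python
def masked_xor_range(lo=32, hi=126, mask=15):
    """Return (min, max) of p ^ (k & mask) over all p and k in [lo, hi]."""
    values = {p ^ (k & mask)
              for p in range(lo, hi + 1)
              for k in range(lo, hi + 1)}
    return min(values), max(values)


print(masked_xor_range())   # (32, 127)
```

The upper end comes from cases like p = 126 (0b1111110) with a key whose low bit is set, which flips the last bit and gives 127.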

How to choose the correct left shift in bit wise operations?

I am learning bare metal programming in C++ and it often involves setting a portion of a 32-bit hardware register address to some combination.
For example for an IO pin, I can set the 15th to 17th bit in a 32 bit address to 001 to mark the pin as an output pin.
I have seen code that does this and I half understand it based on an explanation of another SO question.
# here ra is a physical address
# the 15th to 17th bits are being
# cleared by AND-ing it with a value that is one everywhere
# except in the 15th to 17th bits
ra&=~(7<<12);
Another example is:
# this clears the 21st to 23rd bits of another address
ra&=~(7<<21);
How do I choose the 7 and how do I choose the number of bits to shift left?
I tried this out in python to see if I can figure it out
bin((7<<21)).lstrip('-0b').zfill(32)
'00000000111000000000000000000000'
# this has 8, 9 and 10 as the bits which is wrong
The 7 (base 10) is chosen as its binary representation is 111 (7 in base 2).
As for why bits 8, 9 and 10 appear to be set: you're reading from the wrong direction. Binary, just like normal base 10, is counted from right to left.
(I'd left this as a comment but reputation isn't high enough.)
If you want to isolate and change some bits in a register but not all, you need to understand that the bitwise operations like and, or, xor and not operate on a single bit column: bit 3 of each operand is used to determine bit 3 of the result, no other bits are involved. So say I have some bits in binary, represented by letters since they can each be either a one or a zero
jklmnopq
The and operation truth table you can look up, anything anded with zero is a zero anything anded with one is itself
  jklmnopq
& 01110001
==========
  0klm000q
anything orred with one is a one anything orred with zero is itself.
  jklmnopq
| 01110001
==========
  j111nop1
so if you want to isolate and change two bits in this variable/register say bits 5 and 6 and change them to be a 0b10 (a 2 in decimal), the common method is to and them with zero then or them with the desired value
  76543210
  jklmnopq
& 10011111
==========
  j00mnopq

  jklmnopq
| 01000000
==========
  j10mnopq
you could have orred bit 6 with a 1 and anded bit 5 with a zero, but that is specific to the value you wanted to change them to. generically we think: I want to change those bits to a 2. so to use that value 2 you zero the bits first, then force the 2 onto those bits: and them to make them zero, then orr the 2 onto the bits. generic.
In c
x = read_register(blah);
x = (x&(~(3<<5)))|(2<<5);
write_register(blah,x);
lets dig into this (3 << 5)
00000011
00000110 1
00001100 2
00011000 3
00110000 4
01100000 5
76543210
that puts two ones on top of the bits we are interested in but anding with that value isolates the bits and messes up the others so to zero those and not mess with the other bits in the register we need to invert those bits
using x = ~x inverts those bits, a bitwise not operation.
01100000
10011111
Now we have the mask we want to and with our register as shown way above, zeroing the bits in question while leaving the others alone j00mnopq
Now we need to prep the bits to or (2<<5)
00000010
00000100 1
00001000 2
00010000 3
00100000 4
01000000 5
Giving the bit pattern we want to orr in giving j10mnopq which we write back to the register. Again the j, m, n, ... bits are bits they are either a one or a zero and we dont want to change them so we do this extra masking and shifting work. You may/will sometimes see examples that simply write_register(blah,2<<5); either because they know the state of the other bits, know they are not using those other bits and zero is okay/desired or dont know what they are doing.
x = read_register(blah); //bits are jklmnopq
x = (x&(~(3<<5)))|(2<<5);
z = 3
z = z << 5
z = ~z
x = x & z
z = 2
z = z << 5
x = x | z
z = 3
z = 00000011
z = z << 5
z = 01100000
z = ~z
z = 10011111
x = x & z
x = j00mnopq
z = 2
z = 00000010
z = z << 5
z = 01000000
x = x | z
x = j10mnopq
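The same trace can be run for real values. Below is a small Python sketch of the 8-bit case above, with the register replaced by a plain integer argument (the function name is made up, and the `& 0xFF` is there only because Python integers are not 8 bits wide):

```python
def set_bits_6_5(x, value=0b10):
    """Replace bits 6:5 of an 8-bit value with `value`, leaving the rest alone."""
    clear_mask = ~(3 << 5) & 0xFF            # 0b10011111
    return (x & clear_mask) | ((value & 3) << 5)


print(format(~(3 << 5) & 0xFF, "08b"))            # 10011111
print(format(set_bits_6_5(0b11111111), "08b"))    # 11011111
print(format(set_bits_6_5(0b00000000), "08b"))    # 01000000
```

Whatever the other six bits were, they come out unchanged; only bits 6 and 5 end up as 10.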
if you have a 3 bit field then the binary is 0b111 which in decimal is the number 7 or hex 0x7. a 4 bit field 0b1111 which is decimal 15 or hex 0xF, as you get past 7 it is easier to use hex IMO. 6 bit field 0x3F, 7 bit field 0x7F and so on.
You can take this further in a way to try to be more generic. Say there is a register that controls some function for gpio pins 0 through say 15, two bits per pin, starting with bit 0. If you wanted to change the properties for gpio pin 5 then that would be bits 10 and 11: 5*2 = 10, and there are two bits per pin, so 10 and the next one, 11. But generically you could:
x = (x&(~(0x3<<(pin*2)))) | (value<<(pin*2));
since 2 is a power of 2
x = (x&(~(0x3<<(pin<<1)))) | (value<<(pin<<1));
an optimization the compiler might do if pin cannot be reduced to a specific value at compile time.
but if it were 3 bits per field and the fields start aligned with bit zero
x = (x&(~(0x7<<(pin*3)))) | (value<<(pin*3));
which the compiler might do as a multiply by 3 but maybe instead just
pinshift = (pin<<1)+pin;
to get the multiply by three. depends on the compiler and instruction set.
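The per-pin expressions above generalize to any field width. A Python sketch of that idea, assuming equal-width fields packed from bit 0 upward (the name `write_field` and its parameters are made up for illustration, and the register is a plain integer here):

```python
def write_field(reg, index, value, width):
    """Return reg with field number `index` (each `width` bits wide) set to value."""
    mask = (1 << width) - 1          # 0x3 for 2-bit fields, 0x7 for 3-bit fields
    shift = index * width            # pin 5 with 2-bit fields starts at bit 10
    return (reg & ~(mask << shift)) | ((value & mask) << shift)


print(hex(write_field(0xFFFFFFFF, 5, 0b01, 2)))   # 0xfffff7ff
print(hex(write_field(0x00000000, 5, 0b001, 3)))  # 0x8000
```

Masking `value` with the field mask is a small extra safety step so that an oversized value cannot spill into the neighbouring field.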
overall this is called a read modify write as you read something, modify some of it, then write back (if you were modifying all of it you wouldnt need to bother with a read and a modify you would write the whole new value). And folks will say masking and shifting to generically cover isolating bits in a variable either for modification purposes or if you wanted to read/see what those two bits above were you would
x = read_register(blah);
x = x >> 5;
x = x & 0x3;
or mask first then shift
x = x & (0x3<<5);
x = x >> 5;
six of one half a dozen of another, both are equal in general, some instruction sets one might be more efficient than another (or might be equal and then shift, or shift then and). One might make more sense visually to some folks rather than the other.
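A quick Python check that the two orders give the same answer when reading a field (the function names are made up for illustration):

```python
def read_shift_then_mask(x, pos, width):
    """Shift the field down to bit 0, then mask off everything above it."""
    return (x >> pos) & ((1 << width) - 1)


def read_mask_then_shift(x, pos, width):
    """Mask the field in place, then shift it down to bit 0."""
    return (x & (((1 << width) - 1) << pos)) >> pos


x = 0b01011010                          # bits 6:5 are 10
print(read_shift_then_mask(x, 5, 2))    # 2
print(read_mask_then_shift(x, 5, 2))    # 2
```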
Although technically this is an endian thing as some processors bit 0 is the most significant bit. In C AFAIK bit 0 is the least significant bit. If/when a manual shows the bits laid out left to right you want your right and left shifts to match that, so as above I showed 76543210 to indicate the documented bits and associated that with jklmnopq and that was the left to right information that mattered to continue the conversation about modifying bits 5 and 6. some documents will use verilog or vhdl style notation 6:5 (meaning bits 6 to 5 inclusive, makes more sense with say 4:2 meaning bits 4,3,2) or [6 downto 5], more likely to just see a visual picture with boxes or lines to show you what bits are what field.
How do I choose the 7
You want to clear three adjacent bits. Three adjacent bits at the bottom of a word is 1+2+4=7.
and how do I choose the number of bits to shift left
You want to clear bits 21-23, not bits 1-3, so you shift left another 20.
Both your examples are wrong. To clear 15-17 you need to shift left 14, and to clear 21-23 you need to shift left 20.
this has 8, 9 and 10 ...
No it doesn't. You're counting from the wrong end.
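Which bit numbers a mask covers depends on whether the lowest bit is called bit 0 or bit 1. A Python sketch that lists the set bits under either convention, always counting from the least significant end (the function and its `first` parameter are made up for illustration):

```python
def set_bit_positions(value, first=0):
    """List the positions of the 1 bits, counting from the least significant bit.

    `first` is the number given to the lowest bit: 0 or 1 depending on convention.
    """
    return [i + first for i in range(value.bit_length()) if (value >> i) & 1]


print(set_bit_positions(7 << 21))            # [21, 22, 23]
print(set_bit_positions(7 << 12))            # [12, 13, 14]
print(set_bit_positions(7 << 20, first=1))   # [21, 22, 23]
print(set_bit_positions(7 << 14, first=1))   # [15, 16, 17]
```

So `7<<21` covers bits 21 to 23 when the lowest bit is bit 0, while the shifts of 20 and 14 given in the answer above correspond to numbering the lowest bit as bit 1. Check which convention the hardware manual uses.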

What is the upper bound of BigInteger with character array implementation?

If I implement BigInteger with a character array (in C++), in terms of power of 10, what is my upper bound in a 32-bit system?
In other words,
- 10^x < N <= 10^x
(first character is reserved for sign).
What is x in 32 bit system?
Please ignore for now that we have reserved memory for OS and consider all 4GB memory is addressable by us.
An 8-bit byte can hold 2^8, or 256 unique values.
4GB of memory is 2^32, or 4294967296 bytes.
Or 4294967295, if we subtract the one byte that you want to reserve for a sign.
That's 34359738360 bits.
This many bits can hold 2^34359738360 unique values.
- 10^x < N <= 10^x
(first character is reserved for sign).
What is x in 32 bit system?
Taking the base-10 logarithm, 34359738360 * log10(2) is about 10343311889.5, which gives - 10^10343311889 < N <= 10^10343311889 as the largest representable powers of 10.
So x is 10,343,311,889.
(−(2^(n−1))) to (2^(n−1) − 1) calculates the range of a signed integer where n is the number of bits.[1]
Assuming you're referring to the whole 4GB of memory being allocated, that is 2^32 (4,294,967,296) addressable bytes in a 32-bit memory space, which is 2^35 (34,359,738,368) bits.
Put that into the formula at the start and you get a range of −(2^(2^35 − 1)) to 2^(2^35 − 1) − 1.
This is assuming you use a bit for a sign, instead of a whole byte. If you're going to use a whole byte for the sign, you should calculate the unsigned range of 2^35 − 8 bits, which is from 0 to 2^(2^35 − 8) − 1.
According to this page, to convert from an exponent of base 2 to an exponent of base 10, you should use the formula x = m*ln(2)/ln(10), where you are converting from 2^m to 10^x.
Therefore, your answer is that the upper bound is 10^((2^35 − 8)*ln(2)/ln(10)). I'm not going to even attempt to change that exponent into a decimal value.
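The exponent can be estimated with ordinary floating point, since only the exponent is computed and never the huge number itself. A minimal Python sketch of the x = m*ln(2)/ln(10) formula, assuming m = 2^35 − 8 bits of binary magnitude as above (the function name is made up for illustration):

```python
import math


def decimal_exponent(bits):
    """x such that 2**bits == 10**x, using x = m * ln(2) / ln(10)."""
    return bits * math.log(2) / math.log(10)


bits = 2**35 - 8                           # all of 4GB minus one byte for the sign
x = decimal_exponent(bits)
print(math.floor(x))                       # 10343311889
```

This treats the array as raw binary, as the answers above do. A character array holding one decimal digit per byte would top out at far fewer digits, roughly one per byte.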

Calculating polynomial division result as well as remainder (CRC)

I'm trying to write a table-based CRC routine for receiving Mode S uplink interrogator messages. On the downlink side, the CRC is just the 24-bit CRC based on polynomial P=0x1FFF409. So far, so good -- I wrote a table-based implementation that follows the usual byte-at-a-time convention, and it's working fine.
On the uplink side, though, things get weird. The protocol specification says that calculating the target uplink address is by finding:
U' = x^24 * U / G(x)
...where U is the received message and G(x) is the encoding polynomial 0x1FFF409, resulting in:
U' = x^24 * m(x) + A(x) + r(x) / G(x)
...where m(x) is the original message, A(x) is the address, and r(x) is the remainder. I want the low-order quotient A(x); i.e., the result of the GF(2) polynomial division operation instead of the remainder. The remainder is effectively discarded. The target address is encoded with the transmitted checksum such that the receiving aircraft can validate the checksum by comparing it with its address.
This is great and all, and I have a bitwise implementation which follows from the above. Please ignore the weird shifting of the polynomial and checksum, this has been cribbed from this Pascal implementation (on page 15) which assumes 32-bit registers and makes optimizations based on that assumption. In reality the message and checksum come as a single, 56-bit transmission.
#This is the reference bit-shifting implementation. It is slow.
def uplink_bitshift_crc():
    p = 0xfffa0480 #polynomial (0x1FFF409 shifted left 7 bits)
    a = 0x00000000 #rx'ed uplink data (32 bits)
    adr = 0xcc5ee900 #rx'ed checksum (24 bits, shifted left 8 bits)
    ad = 0 #will hold division result low-order bits
    for j in range(56):
        #if MSBit is 1, xor w/poly
        if a & 0x80000000:
            a = a ^ p
        #shift off the top bit of A (we're done with it),
        #and shift in the top bit of adr
        a = ((a << 1) & 0xFFFFFFFF) + ((adr >> 31) & 1)
        #shift off the top bit of adr
        adr = (adr << 1) & 0xFFFFFFFF
        if j > 30:
            #shift ad left 1 bit and shift in the msbit of a
            #this extracts the LS 24bits of the division operation
            #and ignores the remainder at the end
            ad = ad + ((a >> 31) & 1)
            ad = ((ad << 1) & 0xFFFFFFFF)
    #correct the ad
    ad = ad >> 2
    return ad
The above is of course slower than molasses in software and I'd really like to be able to construct a lookup table that would allow similar byte-at-a-time calculation of the received address, or massage the remainder (which is quickly calculated) into a quotient.
TL;DR:
Given a message, the encoding polynomial, and the remainder (calculated by the normal CRC method), is there a faster way to obtain the quotient of the polynomial division operation than by using shift registers to do polynomial division "longhand"?
You might take a look at the PyCRC library, I guess this may answer your questions.
Too late for the OP, but I'm posting this for others that might see this question. You can generate two tables to operate a byte at a time. The first 256 by 8 bit table is indexed by the current leading 8 bits of the dividend (message), and the 8 bit values are the quotients. The second 256 by 32 bit table is indexed by the 8 bit quotient and the 32 bit values are the 32 bit product of the 8 bit quotient times the 25 bit polynomial (since this is a carryless multiply, the product is 32 bits, (x^7 * x^24 = x^31)), which you xor to the upper 32 bits of the dividend, which will zero out the upper 8 bits of the dividend. Then loop back for the next 8 bits of the dividend.
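A minimal Python sketch of the two-table scheme described above, checked against plain bit-at-a-time GF(2) long division. All names here are made up for illustration, and it only shows how the quotient comes out a byte at a time; it does not reproduce the Mode S framing or the register layout of the question's code.

```python
G = 0x1FFF409   # 25-bit generator polynomial, leading term x^24


def gf2_divmod(dividend, divisor):
    """Reference: bit-at-a-time GF(2) long division, returns (quotient, remainder)."""
    quotient = 0
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        shift = dividend.bit_length() - dlen
        quotient |= 1 << shift
        dividend ^= divisor << shift
    return quotient, dividend


def build_tables():
    """QUOT[leading byte] -> 8-bit quotient; PROD[quotient] -> 32-bit carryless q*G."""
    quot, prod = [0] * 256, [0] * 256
    for lead in range(256):
        work, q = lead << 24, 0
        for bit in range(7, -1, -1):
            if work & (1 << (24 + bit)):
                q |= 1 << bit
                work ^= G << bit
        quot[lead] = q
        prod[q] = (lead << 24) ^ work     # top byte of q*G equals `lead`
    return quot, prod


QUOT, PROD = build_tables()


def divide_bytewise(message):
    """Divide a big-endian byte string by G, one quotient byte per step."""
    n = len(message)
    work = int.from_bytes(message, "big")
    quotient = 0
    for i in range(n - 3):                # the last 3 bytes can only hold the remainder
        shift = 8 * (n - 4 - i)           # low end of the current 32-bit window
        q = QUOT[(work >> (shift + 24)) & 0xFF]
        work ^= PROD[q] << shift          # zeroes the leading byte of the window
        quotient = (quotient << 8) | q
    return quotient, work                 # work is now the 24-bit remainder


msg = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE])   # arbitrary 56-bit example
print(divide_bytewise(msg) == gf2_divmod(int.from_bytes(msg, "big"), G))   # True
```

Because each leading byte maps to exactly one quotient byte, the two tables could be merged into one table indexed by the leading byte; they are kept separate here to match the description.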
A modern X86 cpu has the carryless multiply instruction, PCLMULQDQ that operates on 128 bit xmm registers, performing a 64 bit by 64 bit multiply to produce a 128 bit product (since it's a carryless multiply bit 127 is always 0, so it's really a 127 bit product). A multiply of the 56 bit message by the 41 bit constant 2^64/G(x) will produce a 96 bit product, of which the upper 32 bits will be the quotient (lower 64 bits are not used).