I want to see how bitwise NOT works through a simple example:
int x = 4;
int y;
int z;
y = ~(x<<1);
z =~(0x01<<1);
cout<<"y = "<<y<<endl;
cout<<"z = "<<z<<endl;
This results in y = -9 and z = -3. I don't see how this happens. Can anyone explain it to me?
(x<<1) will shift the bits one, so
00000000 00000000 00000000 00000100
will become:
00000000 00000000 00000000 00001000
Which is the representation of 8. Then ~ will invert all the bits such that it becomes:
11111111 11111111 11111111 11110111
Which is the representation of -9.
0x01 is
00000000 00000000 00000000 00000001
in binary, so when shifted once becomes:
00000000 00000000 00000000 00000010
And then when ~ is applied we get:
11111111 11111111 11111111 11111101
Which is -3 in binary
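The bit patterns above can be reproduced with std::bitset. This is only a sketch: it assumes a 32-bit int and two's complement representation, and the helper names bits_of and not_shifted are made up for illustration.

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: the 32-bit pattern of a value as a string of 0s and 1s.
// Assumes int is 32 bits wide.
std::string bits_of(int value) {
    return std::bitset<32>(static_cast<unsigned int>(value)).to_string();
}

// The expression from the question: shift left by one, then invert every bit.
int not_shifted(int x) {
    return ~(x << 1);
}
```

Calling bits_of(not_shifted(4)) should give the pattern shown for -9, and not_shifted(0x01) should give -3. More generally, under two's complement ~n is the same as -n - 1.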
Well, there is a long story behind this.
To make it easier to understand, let's use binary numbers.
x = 4, or x = 0b 0000 0000 0000 0000 0000 0000 0000 0100, because sizeof(int) = 4 here.
After x<<1 the value is 0b 0000 0000 0000 0000 0000 0000 0000 1000, and after
~(x<<1) it is 0b 1111 1111 1111 1111 1111 1111 1111 0111.
This is where the complication begins. Since int is a signed type, the first bit acts as the sign and the whole value is stored in two's complement.
So 0b 1111 1111 1111 1111 1111 1111 1111 0111 is -9 and, for example,
0b 1111 1111 1111 1111 1111 1111 1111 1111 is -1
and 0b 0000 0000 0000 0000 0000 0000 0000 0010 is 2.
Learn more about two's complement.
Whether an integer is positive or negative (the sign of the integer) is stored in a dedicated bit, the sign bit. The bitwise NOT affects this bit, too, so any positive number becomes a negative number and vice versa.
Note that "dedicated bit" is a bit of an oversimplification, as most contemporary computers do not use "sign and magnitude" representation (where the sign bit would just switch the sign), but "two's complement" representation, where the sign bit also affects the magnitude.
For example, the 8-bit signed integer 00000000 would be 0, but 10000000 (sign bit flipped) would be -128.
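The 8-bit example can be checked with a small sketch. It assumes a two's complement machine (which is what mainstream platforms use), and the function name as_signed is hypothetical. memcpy is used so the byte is copied unchanged rather than converted.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret an 8-bit pattern as a signed 8-bit integer.
// memcpy copies the byte as-is, so no value conversion takes place.
std::int8_t as_signed(std::uint8_t pattern) {
    std::int8_t result;
    std::memcpy(&result, &pattern, sizeof result);
    return result;
}
```

With this, as_signed(0x80) should be -128 and as_signed(0xFF) should be -1, while 0x7F stays 127: setting the top bit subtracts 128 rather than just flipping the sign.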
char char_ = '3';
unsigned int * custom_mem_address = (unsigned int *) &char_;
cout<<char_<<endl;
cout << *custom_mem_address<<endl;
Since custom_mem_address points at the one-byte value of char '3', I expect it to contain the ASCII value of '3', which is 51.
But the output is the following.
3
1644042035
Depending on the byte alignment, at least one byte in 1644042035 should be 51, right? But it's not. Can you please explain?
Can someone explain where I am wrong?
1644042035 in binary is 0110 0001 1111 1110 0001 0111 0011 0011 and 51 is 0011 0011.
0110 0001 1111 1110 0001 0111 0011 0011
0000 0000 0000 0000 0000 0000 0011 0011
Isn't that what you are looking for?
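To pull that byte out in code, mask off everything except the lowest 8 bits. This sketch assumes the char ended up in the least significant byte, as the output above suggests (a little-endian machine). The other three bytes are whatever happened to sit next to char_ in memory, and reading them through an unsigned int pointer is not something the language guarantees to work. The name low_byte is made up for illustration.

```cpp
#include <cstdint>

// Keep only the least significant byte of a 32-bit value.
constexpr std::uint32_t low_byte(std::uint32_t value) {
    return value & 0xFFu;
}

static_assert(low_byte(1644042035u) == 51u, "low byte is the ASCII code of '3'");
```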
I am reading TCPPPL by Stroustrup. It gives an example of a function that extracts the middle 16 bits of a 32 bit long like this:
unsigned short middle(long a){ return (a>>8)&0xffff;}.
My question is: isn't it extracting the last 16 bits? Tell me how am I wrong.
It does indeed extract the middle 16 bits:
// a := 0b xxxx xxxx 1111 1111 1111 1111 xxxx xxxx
a>>8; // 0b 0000 0000 xxxx xxxx 1111 1111 1111 1111
&0xffff // 0b 0000 0000 0000 0000 1111 1111 1111 1111
a >> 8 will right-shift the value in a by 8 bits. The low 8 bits are forgotten, and bits previously numbered 31–8 now get moved (renumbered) to 23–0. Finally, masking out the higher 16 bits leaves you with bits 15–0, which were originally (before the shift) at positions 23–8. Voila.
a is right-shifted by 8 bits (a>>8) before the bitwise AND is applied.
Have you noticed the >>8 part? It shifts the argument right by eight bits, first.
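A concrete value makes the difference visible. The sketch below uses fixed-width types instead of long and unsigned short (an assumption made so the widths are exact), and low16 is a hypothetical helper added only for comparison.

```cpp
#include <cstdint>

// Bits 23..8 of a 32-bit value: shift first, then mask.
constexpr std::uint16_t middle(std::uint32_t a) {
    return static_cast<std::uint16_t>((a >> 8) & 0xFFFFu);
}

// Bits 15..0, for comparison: mask without shifting.
constexpr std::uint16_t low16(std::uint32_t a) {
    return static_cast<std::uint16_t>(a & 0xFFFFu);
}

static_assert(middle(0x12345678u) == 0x3456, "middle 16 bits");
static_assert(low16(0x12345678u) == 0x5678, "last 16 bits");
```

Without the shift you would indeed get the last 16 bits (0x5678); with it, the result is 0x3456.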
Here is the code that reports the bit parity of a given integer:
01: bool parity(unsigned int x)
02: {
03: x ^= x >> 16;
04: x ^= x >> 8;
05: x ^= x >> 4;
06: x &= 0x0F;
07: return ((0x6996 >> x) & 1) != 0;
08: }
I found this here. While there seems to be an explanation in the link, I do not understand it.
The first explanation, which starts with 'The code first "merges" bits 0 − 15 with bits 16 − 31 using a right shift and XOR (line 3)', makes it hard for me to understand what is going on. I tried to play around with it but that did not help. If someone could clarify how this works, it would be useful for beginners like me.
Thanks
EDIT: from post below:
value : 1101 1110 1010 1101 1011 1110 1110 1111
value >> 16: 0000 0000 0000 0000 1101 1110 1010 1101
----------------------------------------------------
xor : 1101 1110 1010 1101 0110 0000 0100 0010
now right shift this again by 8 bits:
value : 1101 1110 1010 1101 0110 0000 0100 0010
value >>8 : 0000 0000 1101 1110 1010 1101 0110 0000
----------------------------------------------------
xor : 1101 1110 0111 0011 1100 1101 0010 0010
so where is the merging of parity happening here?
Let's start first with a 2-bit example so you can see what's going on. The four possibilities are:
ab a^b
-- ---
00 0
01 1
10 1
11 0
You can see that a^b (xor) gives 0 for an even number of one-bits and 1 for an odd number. This works for 3-bit values as well:
abc a^b^c
--- -----
000 0
001 1
010 1
011 0
100 1
101 0
110 0
111 1
The same trick is being used in lines 3 through 6 to merge all 32 bits into a single 4-bit value. Line 3 merges b31-b16 with b15-b0 to give a 16-bit value, then line 4 merges the resultant b15-b8 with b7-b0, then line 5 merges the resultant b7-b4 with b3-b0. Since b31-b4 (the upper half of each xor operation) aren't cleared by those operations, line 6 takes care of that by clearing them out (anding with binary 0000...1111 to clear all but the lower 4 bits).
The merging here is achieved in a chunking mode. By "chunking", I mean that it treats the value in reducing chunks rather than as individual bits, which allows it to efficiently reduce the value to a 4-bit size (it can do this because the xor operation is both associative and commutative). The alternative would be to perform seven xor operations on the nybbles rather than three. Or, in complexity analysis terms, O(log n) instead of O(n).
Say you have the value 0xdeadbeef, which is binary 1101 1110 1010 1101 1011 1110 1110 1111. The merging happens thus:
value : 1101 1110 1010 1101 1011 1110 1110 1111
>> 16: 0000 0000 0000 0000 1101 1110 1010 1101
----------------------------------------------------
xor : .... .... .... .... 0110 0000 0100 0010
(with the irrelevant bits, those which will not be used in future, left as . characters).
For the complete operation:
value : 1101 1110 1010 1101 1011 1110 1110 1111
>> 16: 0000 0000 0000 0000 1101 1110 1010 1101
----------------------------------------------------
xor : .... .... .... .... 0110 0000 0100 0010
>> 8: .... .... .... .... 0000 0000 0110 0000
----------------------------------------------------
xor : .... .... .... .... .... .... 0010 0010
>> 4: .... .... .... .... .... .... 0000 0010
----------------------------------------------------
xor : .... .... .... .... .... .... .... 0000
And, looking up 0000 in the table below, we see that it gives even parity (there are 24 1-bits in the original value). Changing just one bit in that original value (any bit, I've chosen the rightmost bit) will result in the opposite case:
value : 1101 1110 1010 1101 1011 1110 1110 1110
>> 16: 0000 0000 0000 0000 1101 1110 1010 1101
----------------------------------------------------
xor : .... .... .... .... 0110 0000 0100 0011
>> 8: .... .... .... .... 0000 0000 0110 0000
----------------------------------------------------
xor : .... .... .... .... .... .... 0010 0011
>> 4: .... .... .... .... .... .... 0000 0010
----------------------------------------------------
xor : .... .... .... .... .... .... .... 0001
And 0001 in the table below is odd parity.
The only "magic" there is the 0x6996 value, which is shifted by the four-bit value so that the relevant bit lands in the lowest position; that bit is then used to decide the parity. The reason 0x6996 (binary 0110 1001 1001 0110) is used is the nature of parity for 4-bit values, as shown in the linked page (reading the parity column from 15 up to 0 spells out the constant):
Val  Bnry  #1bits  parity (1=odd)
---  ----  ------  --------------
                        +------> 0x6996
                        |
  0  0000    0     even (0)
  1  0001    1     odd  (1)
  2  0010    1     odd  (1)
  3  0011    2     even (0)
  4  0100    1     odd  (1)
  5  0101    2     even (0)
  6  0110    2     even (0)
  7  0111    3     odd  (1)
  8  1000    1     odd  (1)
  9  1001    2     even (0)
 10  1010    2     even (0)
 11  1011    3     odd  (1)
 12  1100    2     even (0)
 13  1101    3     odd  (1)
 14  1110    3     odd  (1)
 15  1111    4     even (0)
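The constant can also be rebuilt from the table programmatically, which may make it feel less magic. This is a sketch only: the names ones_in_nibble and parity_table are made up for illustration, and the constexpr loops need C++14 or later.

```cpp
#include <cstdint>

// Number of 1-bits in a 4-bit value (plain loop, not optimized).
constexpr unsigned ones_in_nibble(unsigned v) {
    unsigned count = 0;
    for (unsigned i = 0; i < 4; ++i) {
        count += (v >> i) & 1u;
    }
    return count;
}

// Build the lookup constant: bit n is 1 when n has an odd number of 1-bits.
constexpr std::uint16_t parity_table() {
    std::uint16_t table = 0;
    for (unsigned n = 0; n < 16; ++n) {
        if (ones_in_nibble(n) % 2 == 1) {
            table = static_cast<std::uint16_t>(table | (1u << n));
        }
    }
    return table;
}

static_assert(parity_table() == 0x6996, "matches the constant in line 7");
```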
Note that it's not necessary to do the final shift-of-a-constant. You could just as easily continue the merging operations until you get down to a single bit, then use that bit:
bool parity (unsigned int x) {
x ^= x >> 16;
x ^= x >> 8;
x ^= x >> 4;
x ^= x >> 2;
x ^= x >> 1;
return x & 1;
}
However, once you have the value 0...15, a shift of a constant by that value is likely to be faster than two extra shift-and-xor operations.
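To convince yourself that both variants agree, you can compare them against a direct bit count. In this sketch the function names and the use of std::uint32_t and std::bitset are choices made for illustration, not part of the original code.

```cpp
#include <bitset>
#include <cstdint>

// Lookup variant from the question, with a fixed-width parameter type.
bool parity_lookup(std::uint32_t x) {
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x &= 0x0Fu;
    return ((0x6996u >> x) & 1u) != 0;
}

// Variant that keeps folding until a single bit is left.
bool parity_fold(std::uint32_t x) {
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return (x & 1u) != 0;
}

// Reference: count the 1-bits directly and check whether the count is odd.
bool parity_reference(std::uint32_t x) {
    return std::bitset<32>(x).count() % 2 == 1;
}
```

For 0xdeadbeef (24 one-bits) all three should return false, and for 0xdeadbeee (23 one-bits) all three should return true.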
From the original page,
Bit parity tells whether a given input contains an odd number of 1's.
So you want to add up the number of 1's. The code uses the xor operator to add pairs of bits,
0^1 = 1 bits on
1^0 = 1 bits on
0^0 = 0 bits on
1^1 = 0 bits on (well, 2, but we cast off 2's)
So the first three lines count up the number of 1's (tossing pairs of 1's).
That should help...
And notice from the original page, the description of why 0x6996,
If we encode even by 0 and odd by 1 beginning with parity(15) then we
get 0110 1001 1001 0110 = 0x6996, which is the magic number found in
line 7. The shift moves the relevant bit to bit 0. Then everything
except for bit 0 is masked out. In the end, we get 0 for even and 1
for odd, exactly as desired.
I'm studying signed-unsigned integer conversions and I came to these conclusions, can someone tell me if this is correct please
unsigned short var = -65537u;
Steps:
65537u (implicitly converted to unsigned int)
Binary representation:
0000 0000 0000 0001 0000 0000 0000 0001
-65537u
Binary representation: 1111 1111 1111 1110 1111 1111 1111 1111
Truncated to short
Binary representation: 1111 1111 1111 1111
read as an unsigned short: 65535
The same should apply for the following cases:
unsigned short var = -65541u;
65541u (unsigned int)
0000 0000 0000 0001 0000 0000 0000 0101
-65541u
1111 1111 1111 1110 1111 1111 1111 1011
Truncated to short
1111 1111 1111 1011
read as an unsigned short: 65531
unsigned short var = -5u;
5u (unsigned int)
0000 0000 0000 0000 0000 0000 0000 0101
-5u
1111 1111 1111 1111 1111 1111 1111 1011
Truncated to short
1111 1111 1111 1011
read as an unsigned short: 65531
Your analysis is correct for the usual platforms where short is 16 bits and int is 32 bits.
For some platforms, the constant 65537 may not fit in an unsigned int, but if that is the case, 65537u will be typed as a larger unsigned type. The list of types that are tried can be found in section 6.4.4.1:5 of the C99 standard. In C99 it will at least fit in an unsigned long, which is guaranteed by the standard to allow values that large.
The reasoning remains much the same if that happens, up to the conversion back to unsigned short for the assignment.
Conversely, unsigned short is allowed by the C99 standard to hold more than 16 bits. In this case var receives USHRT_MAX-65536 for your first example and similarly for the other ones.
The size of short is implementation-dependent, not fixed at 16 bits; 16 bits is only the minimum size.
Similarly, an int may be as small as 16 bits.
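The three cases from the question can be checked with fixed-width types, which sidesteps the platform differences mentioned above. This sketch assumes std::uint32_t is the same type as unsigned int (true on the usual platforms); the name negate_and_truncate is hypothetical.

```cpp
#include <cstdint>

// Negate in 32-bit unsigned arithmetic (wraps modulo 2^32),
// then keep only the low 16 bits.
constexpr std::uint16_t negate_and_truncate(std::uint32_t value) {
    return static_cast<std::uint16_t>(std::uint32_t{0} - value);
}

static_assert(negate_and_truncate(65537u) == 65535, "-65537u -> 0xFFFF");
static_assert(negate_and_truncate(65541u) == 65531, "-65541u -> 0xFFFB");
static_assert(negate_and_truncate(5u) == 65531, "-5u -> 0xFFFB");
```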
How exactly do the following lines work if pData = "abc"?
pDes[1] = ( pData[0] & 0x1c ) >> 2;
pDes[0] = ( pData[0] << 6 ) | ( pData[1] & 0x3f );
Okay, assuming ASCII which is by no means guaranteed, pData[0] is 'a' (0x61) and pData[1] is 'b' (0x62):
pDes[1]:
pData[0] 0110 0001
&0x1c 0001 1100
---- ----
0000 0000
>>2 0000 0000 0x00
pDes[0]:
pData[0] 0110 0001
<< 6 01 1000 0100 0000 (interim value *a)
pData[1] 0110 0010
&0x3f 0011 1111
-- ---- ---- ----
0010 0010
|(*a) 01 1000 0100 0000
-- ---- ---- ----
01 1000 0110 0010 0x1862
How it works:
<< N simply means shift the bits N spaces to the left, >> N is the same but shifting to the right.
The & (and) operation will set each bit of the result to 1 if and only if the corresponding bit in both inputs is 1.
The | (or) operation sets each bit of the result to 1 if the corresponding bit in at least one of the inputs is 1.
Note that the 0x1862 will be truncated to fit into pDes[0] if its type is not wide enough.
The following C program shows this in action:
#include <stdio.h>
int main(void) {
char *pData = "abc";
int pDes[2];
pDes[1] = ( pData[0] & 0x1c ) >> 2;
pDes[0] = ( pData[0] << 6 ) | ( pData[1] & 0x3f );
printf ("%08x %08x\n", pDes[0], pDes[1]);
return 0;
}
It outputs:
00001862 00000000
and, when you change pDes to a char array, you get:
00000062 00000000
& is not logical AND - it is bit-wise AND.
a is 0x61, thus pData[0] & 0x1c gives
0x61 0110 0001
0x1c 0001 1100
--------------
0000 0000
>> 2 shifts this to right by two positions - value doesn't change as all bits are zero.
pData[0] << 6 left shifts 0x61 by 6 bits; keeping only the low 8 bits (as happens when the result is stored in a char) gives 01000000 or 0x40
pData[1] & 0x3f
0x62 0110 0010
0x3f 0011 1111
--------------
0x22 0010 0010
Thus it comes down to 0x40 | 0x22 - again | is not logical OR, it is bit-wise.
0x40 0100 0000
0x22 0010 0010
--------------
0x62 0110 0010
The results will be different if pDes is not a char array. Left shifting 0x61 would give you 0001 1000 0100 0000 or 0x1840 - (in case pDes is a char array, the left parts are not in the picture).
0x1840 0001 1000 0100 0000
0x0022 0000 0000 0010 0010
--------------------------
0x1862 0001 1000 0110 0010
pDes[0] would end up as 0x1862 or decimal 6242.
C++ will treat a character as a number according to its encoding. So, assuming ASCII, 'a' is 97 (which has a bit pattern of 0110_0001) and 'b' is 98 (bit pattern 0110_0010).
Once you think of them as numbers, bit operations on characters should be a bit clearer.
In C, all characters are also integers. That means "abc" is equivalent to (char[]){0x61, 0x62, 0x63, 0}.
The & is not the logical AND operator (&&). It is the bitwise AND, which computes the AND at bit-level, e.g.
'k' = 0x6b -> 0 1 1 0 1 0 1 1
0x1c -> 0 0 0 1 1 1 0 0 (&
———————————————————
8 <- 0 0 0 0 1 0 0 0
The main purpose of & 0x1c here is to extract bits #2 ~ #4 from pData[0]. The >> 2 afterwards removes the extra zeros at the end.
Similarly, the & 0x3f is to extract bits #0 ~ #5 from pData[1].
The << 6 pushes 6 zeros at the least significant end of the bits. Assuming pDes[0] is also a char, the most significant 6 bits will be discarded:
'k' = 0x6b -> 0 1 1 0 1 0 1 1
<< 6 = 0 1 1 0 1 0 1 1 0 0 0 0 0 0
xxxxxxxxxxx—————————————————
0xc0 <- 1 1 0 0 0 0 0 0
In terms of bits, if
pData[1] pData[0]
pData -> b7 b6 b5 b4 b3 b2 b1 b0 a7 a6 a5 a4 a3 a2 a1 a0
then
pDes -> 0 0 0 0 0 a4 a3 a2 a1 a0 b5 b4 b3 b2 b1 b0
pDes[1] pDes[0]
This looks like an operation to pack three values into a 6-5-5 bit structure.
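The bit layout above can be written as a small function. This is a sketch under the assumption that pDes is an array of 8-bit values; the struct Packed and the function pack are hypothetical names introduced for illustration.

```cpp
#include <cstdint>

// Two output bytes, named after the array slots they stand for.
struct Packed {
    std::uint8_t low;   // corresponds to pDes[0]
    std::uint8_t high;  // corresponds to pDes[1]
};

// a plays the role of pData[0], b the role of pData[1].
constexpr Packed pack(std::uint8_t a, std::uint8_t b) {
    return Packed{
        static_cast<std::uint8_t>((a << 6) | (b & 0x3F)),  // a1 a0 b5 b4 b3 b2 b1 b0
        static_cast<std::uint8_t>((a & 0x1C) >> 2)         // 0 0 0 0 0 a4 a3 a2
    };
}

// 'a' = 0x61 and 'b' = 0x62 in ASCII.
static_assert(pack(0x61, 0x62).low == 0x62, "pDes[0] for \"abc\"");
static_assert(pack(0x61, 0x62).high == 0x00, "pDes[1] for \"abc\"");
// All five a-bits and all six b-bits set: eleven 1-bits in total.
static_assert(pack(0x1F, 0x3F).low == 0xFF, "low byte fully set");
static_assert(pack(0x1F, 0x3F).high == 0x07, "three a-bits in the high byte");
```

Only the low five bits of a and the low six bits of b survive; everything above them is discarded.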