Understanding two's complement - bit-manipulation

I don't understand why an n-bit 2C system number can be extended to an (n+1)-bit 2C system number by making bit bn = bn−1, that is, extending to (n+1) bits by replicating the sign bit.

This works because of the way we calculate the value of a binary integer.
Working right to left, the value is the sum of each bit_i * 2 ^ i,
where
i ranges from 0 to n - 1
n is the number of bits
Because each subsequent 0 bit will not increase the magnitude of the sum, it is the appropriate value to pad a smaller value into a wider bit field.
For example, using the number 5:
4 bit: 0101
5 bit: 00101
6 bit: 000101
7 bit: 0000101
8 bit: 00000101
The opposite is true for negative numbers in a two's complement system: padding with leading 1 bits leaves the value unchanged.
Remember that you calculate the two's complement by first calculating the one's complement and then adding 1.
Negate the value from the previous example to get -5:
4 bit: 0101 (invert)-> 1010 + 1 -> 1011
5 bit: 00101 (invert)-> 11010 + 1 -> 11011
6 bit: 000101 (invert)-> 111010 + 1 -> 111011
7 bit: 0000101 (invert)-> 1111010 + 1 -> 1111011
8 bit: 00000101 (invert)-> 11111010 + 1 -> 11111011
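The widening shown above can be checked mechanically. The following is a minimal C sketch, not part of the original answer: `widen_by_one` and `twos_value` are hypothetical helper names, and bit patterns are assumed to be held in the low bits of a `uint32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Extend an n-bit pattern to n+1 bits by copying the sign bit b(n-1) into b(n).
   Hypothetical helper, for illustration only. */
uint32_t widen_by_one(uint32_t pattern, unsigned n)
{
    assert(n >= 1 && n <= 30);
    uint32_t sign = (pattern >> (n - 1)) & 1u;   /* current sign bit */
    pattern &= (1u << n) - 1u;                   /* keep only the n-bit field */
    return pattern | (sign << n);                /* replicate it one position up */
}

/* Read the low n bits of pattern as a two's complement value. */
int32_t twos_value(uint32_t pattern, unsigned n)
{
    assert(n >= 1 && n <= 31);
    uint32_t sign = 1u << (n - 1);
    pattern &= (1u << n) - 1u;
    /* XOR clears a set sign bit (or sets a clear one); subtracting 2^(n-1)
       then gives the signed value in both cases. */
    return (int32_t)(pattern ^ sign) - (int32_t)sign;
}
```

With these helpers, `widen_by_one(0xB, 4)` turns 1011 into 11011, and `twos_value` reads both patterns as -5.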

Related

Two's Complement on representing negative numbers

I am currently trying to gain a more intuitive understanding of two's complement and its uses; however, I cannot seem to perform subtraction using two's complement correctly. I understand that when a negative number is stored in a signed int variable the procedure is to perform two's complement on the number having the MSB be 1 to represent the negative sign.
So in a 4 bit system 1010 represents -6.
Now I am following this guide on how to subtract two binary #s
For example I have the number 0101 (5 in decimal) and 1010 (-6 in decimal). If I wanted to do the equation 5 - (-6) it would look like 0101 - 1010 in binary. Next I would take the 1010 and perform twos' complement on it to get 0110. Now I take 0101 + 0110 and get 1011. I don't have a carry so I perform two's complement on the result giving me 0101, but this says the answer is -5 when it should be 11.
4 bits can represent 16 different values. With two's complement, they are -8 to 7.
1000 -8
1001 -7
1010 -6
1011 -5
1100 -4
1101 -3
1110 -2
1111 -1
0000 0
0001 1
0010 2
0011 3
0100 4
0101 5
0110 6
0111 7
You can't possibly get an answer of 11. It's too large to fit in 4 bits as a two's complement number. As such, the calculation you are attempting results in an overflow. (See below for how to detect overflows.)
This means 4 bits is not enough to compute 5 - -6. You need at least 5 bits.
  0(...0)0101    5
- 1(...1)1010   -6
------------------
  0(...0)0101    5
+ 0(...0)0110    6
------------------
  0(...0)1011   11
Detecting Overflow in Two's Complement
With two's complement numbers, an addition overflows when the carry into the sign bit differs from the carry out of it.
  0101    5
+ 0110    6
-----------
  1011   -5    Carry in: 1  Carry out: 0  OVERFLOW!
  1011   -5
+ 1010   -6
-----------
  0101    5    Carry in: 0  Carry out: 1  OVERFLOW!
  0001    1
+ 0010    2
-----------
  0011    3    Carry in: 0  Carry out: 0  No overflow
  1111   -1
+ 1110   -2
-----------
  1101   -3    Carry in: 1  Carry out: 1  No overflow
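The carry rule can be sketched in C for the 4-bit case. This is only an illustration under stated assumptions: `add4_overflows` is a hypothetical name, and the operands are taken to be 4-bit patterns stored in the low nibble of a `uint8_t`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Add two 4-bit two's complement patterns and store the 4-bit sum.
   Returns true when the carry into the sign bit differs from the
   carry out of it, i.e. when the addition overflows. */
bool add4_overflows(uint8_t a, uint8_t b, uint8_t *sum)
{
    a &= 0xF;
    b &= 0xF;
    /* Carry into bit 3: produced by adding the three low bits. */
    unsigned carry_in = ((a & 0x7u) + (b & 0x7u)) >> 3;
    /* Carry out of bit 3: bit 4 of the full sum. */
    unsigned carry_out = ((unsigned)a + (unsigned)b) >> 4;
    *sum = (uint8_t)(((unsigned)a + (unsigned)b) & 0xFu);
    return carry_in != carry_out;
}
```

For the four additions above, this reports overflow for 0101 + 0110 and 1011 + 1010, and no overflow for the other two.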
> For example I have the number 0101 (5 in decimal) and 1010 (-6 in decimal). If I wanted to do the equation 5 - (-6) it would look like 0101 - 1010 in binary.
Ok.
> Next I would take the 1010 and perform twos' complement on it to get 0110.
Ok. And 0110 binary is 6 decimal, which is -(-6) decimal.
> Now I take 0101 + 0110 and get 1011.
Ok.
> I don't have a carry so I perform two's complement on the result giving me 0101,
I take you to mean that as part of the process of interpreting the result, not computing it. The result itself is 1011 (4-bit, two's complement binary).
> but this says the answer is -5 when it should be 11.
The answer in the operational system you have chosen, with 4-bit two's complement, is -5, exactly as you have computed. With three data bits and one sign bit, the maximum result your data type can represent is 7 (0111 binary). This underscores the importance of choosing data types appropriate for the computations you want to perform.
Note that 0b1011 is 11 in decimal if interpreted as unsigned or with more than 3 data bits, so both 11 and -5 are valid readings of the result (which one you use depends on how it's interpreted).
In terms of modular arithmetic, 4 bits gives a modulus of 2^4 = 16, and:
11 ≡ -5 (mod 16)
Which answer you use depends on whether the nibble is interpreted as signed or unsigned.
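To make the two readings concrete, here is a small C sketch. The function names are hypothetical and the nibble is assumed to sit in the low 4 bits of a `uint8_t`.

```c
#include <stdint.h>

/* Read the low 4 bits as an unsigned value in 0..15. */
int nibble_as_unsigned(uint8_t nibble)
{
    return nibble & 0xF;
}

/* Read the low 4 bits as a two's complement value in -8..7.
   Patterns with the sign bit set (8..15) lie 16 below their unsigned reading. */
int nibble_as_signed(uint8_t nibble)
{
    int value = nibble & 0xF;
    return value >= 8 ? value - 16 : value;
}
```

For 0b1011 the first function gives 11 and the second gives -5, and the two differ by exactly 16.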

Invert (flip) last n bits of a number with only bitwise operations

Given a binary integer, how can I invert (flip) last n bits using only bitwise operations in c/c++?
For example:
// flip last 2 bits
0110 -> 0101
0011 -> 0000
1000 -> 1011
You can flip the last b bits of a number n with
#define flipBits(n,b) ((n)^((1u<<(b))-1))
For example, flipBits(0x32, 4) will flip the last 4 bits, and the result will be 0x3d.
This works because of how XOR behaves:
0 ^ 0 => 0
1 ^ 0 => 1
bits aren't flipped
0 ^ 1 => 1
1 ^ 1 => 0
bits are flipped
(1<<b)-1
This part builds a mask with the last b bits set.
For example, if b is 4 then 1<<4 is 0b10000, and subtracting 1 gives the mask 0b1111. XORing the number with this mask produces the desired output.
works for C and C++
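The same mask can be wrapped in a function, which avoids evaluating macro arguments more than once and lets the edge case be handled explicitly. This is a sketch, not the answer's code: `flip_low_bits` is a hypothetical name, and the width calculation assumes `unsigned` has no padding bits.

```c
#include <limits.h>

/* Flip the lowest `count` bits of `value`. */
unsigned flip_low_bits(unsigned value, unsigned count)
{
    const unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);
    /* Shifting by the full width (or more) is undefined behavior in C,
       so that case is handled by flipping every bit instead. */
    if (count >= width)
        return ~value;
    return value ^ ((1u << count) - 1u);
}
```

`flip_low_bits(0x32, 4)` gives 0x3D, matching the macro, and the three examples from the question give 0101, 0000 and 1011.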

relation between size of types and their range of values? [closed]

In c++ or any other language, what is the relation between the size of types and the range of values they take?
E.g., char has a size of 1 byte, which means the number of values it can store is 2^8. So why can it take values ranging from -128 to 127 only, and not larger values?
Is it related to the bit pattern?
Or am I misunderstanding this? I am new to programming and I grasp concepts fast, but I am stuck on this one!
Please explain this in relation to floating point types too! Thanks in advance.
Start with the basic idea of the number of states. A bit has two states - 0 and 1. Two bits have four possible states: 00, 01, 10, and 11. For three bits the number of states is eight:
000 001 010 011 100 101 110 111
The pattern should emerge by now: adding an extra bit doubles the number of states that a group of bits can take. This is easy to see: if the number of states of k bits is N, then for k+1 bits there are N states for when the added bit is 0 and N more states for when it is 1, or N+N altogether. Hence, k bits can have 2^k states.
Bytes are groups of 8 bits, so the number of states a byte could have is 2^8, which is 256. If you use a byte to represent an unsigned value, its range would be 0..255, inclusive. For signed values one bit is taken to represent the sign. In two's complement representation the value range becomes -128..127. The negative side has one extra value because the non-negative part of the range includes zero, while the negative part does not.
It's easy: a variable of a datatype has 2^(sizeof(datatype) * CHAR_BIT) values. The range then depends on whether the datatype is signed or unsigned.
unsigned has 0 .. ((2^(sizeof(datatype) * CHAR_BIT))-1) values.
signed has -((2^(sizeof(datatype) * CHAR_BIT))/2) .. +((2^(sizeof(datatype) * CHAR_BIT)/2)-1) values.
For the char datatype, 2^8 is 256:
signed char has the range -128..127, which is 256 values,
and unsigned char has the range 0..255, still 256 values.
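The relation between bit count and range can be written as a few C helpers. This is a sketch under stated assumptions: the function names are hypothetical, the signed range assumes two's complement, and ^ in the text above means exponentiation (in C it is XOR, so shifts are used instead).

```c
#include <assert.h>
#include <limits.h>

/* Number of distinct states of a k-bit group: 2^k. */
unsigned long long state_count(unsigned k)
{
    assert(k >= 1 && k <= 62);
    return 1ULL << k;
}

/* Largest value of a k-bit unsigned integer: 2^k - 1. */
unsigned long long unsigned_max(unsigned k)
{
    return state_count(k) - 1ULL;
}

/* Smallest value of a k-bit two's complement integer: -2^(k-1). */
long long signed_min(unsigned k)
{
    return -(long long)(state_count(k) / 2ULL);
}

/* Largest value of a k-bit two's complement integer: 2^(k-1) - 1. */
long long signed_max(unsigned k)
{
    return (long long)(state_count(k) / 2ULL) - 1LL;
}
```

For k = 8 these give 256 states, 0..255 unsigned and -128..127 signed. On a typical platform the results for k = CHAR_BIT should agree with UCHAR_MAX, SCHAR_MIN and SCHAR_MAX from <limits.h>.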
A byte is a sequence of 8 bits.
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
The highest bit (the leftmost one in the diagram) indicates the sign: 0 means the value is non-negative and 1 means it is negative. The remaining bits hold the value.
Then you have
+---+---+---+---+---+---+---+---+
| 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | < Max positive number
+---+---+---+---+---+---+---+---+
2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
and
+---+---+---+---+---+---+---+---+
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | < Most negative number
+---+---+---+---+---+---+---+---+
2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
The remaining bits are zeros because numbers are usually represented in two's complement.
Conversion from two's complement is as follows:
1. Invert all bits -> |0|1|1|1|1|1|1|1| -> 127
2. Add 1 -> |1|0|0|0|0|0|0|0| -> 128
3. Change sign -> -> -128
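The three steps can be sketched in C for a single byte. `decode_byte` is a hypothetical name, and a byte is assumed to be 8 bits held in a `uint8_t`.

```c
#include <stdint.h>

/* Interpret an 8-bit pattern as a two's complement value by
   inverting the bits, adding 1, and changing the sign. */
int decode_byte(uint8_t bits)
{
    if ((bits & 0x80u) == 0)
        return bits;                            /* sign bit clear: value as is */
    /* Steps 1 and 2: invert all bits and add 1, keeping only 8 bits. */
    uint8_t magnitude = (uint8_t)(~bits + 1u);
    /* Step 3: change the sign. */
    return -(int)magnitude;
}
```

For the pattern 10000000 the magnitude comes out as 128, so the result is -128, as in the steps above.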

How does this implementation of bitset::count() work?

Here's the implementation of std::bitset::count with MSVC 2010:
size_t count() const
    {   // count number of set bits
    static char _Bitsperhex[] = "\0\1\1\2\1\2\2\3\1\2\2\3\2\3\3\4";
    size_t _Val = 0;
    for (int _Wpos = _Words; 0 <= _Wpos; --_Wpos)
        for (_Ty _Wordval = _Array[_Wpos]; _Wordval != 0; _Wordval >>= 4)
            _Val += _Bitsperhex[_Wordval & 0xF];
    return (_Val);
    }
Can someone explain to me how this is working? what's the trick with _Bitsperhex?
_Bitsperhex contains the number of set bits in a hexadecimal digit, indexed by the digit.
digit: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
value:    0    1    1    2    1    2    2    3    1    2    2    3    2    3    3    4
index:    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
The function retrieves one digit at a time from the value it's working with by ANDing with 0xF (binary 1111), looks up the number of set bits in that digit, and sums them.
_Bitsperhex is a char array used as a 16-entry lookup table that maps a number in the [0..15] range to the number of 1 bits in the binary representation of that number. For example, _Bitsperhex[3] is equal to 2, which is the number of 1 bits in the binary representation of 3 (0011).
The rest is easy: each multi-bit word in internal array _Array is interpreted as a sequence of 4-bit values. Each 4-bit value is fed through the above _Bitsperhex table to count the bits.
It is a slightly different implementation of the lookup table-based method described here: http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable. At the link they use a table of 256 elements and split 32-bit words into four 8-bit values.
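The nibble-table idea can be tried outside of std::bitset. The following is a standalone C sketch for a single 32-bit word, not MSVC's code; `count_bits` and `bits_per_hex` are hypothetical names.

```c
#include <stdint.h>

/* Count the set bits in a 32-bit word, four bits at a time. */
unsigned count_bits(uint32_t word)
{
    /* bits_per_hex[d] is the number of 1 bits in the hex digit d. */
    static const unsigned char bits_per_hex[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };
    unsigned total = 0;
    /* Look up the lowest digit, then shift the next digit into place.
       The loop stops early once no set bits remain. */
    for (; word != 0; word >>= 4)
        total += bits_per_hex[word & 0xFu];
    return total;
}
```

For example, 0x32 is 0011 0010, so the lookups give 1 for the low digit and 2 for the high digit, a total of 3.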

Represent a negative number with the 2's complement technique?

I am using 2's complement to represent a negative number in binary form.
Case 1: number -5
According to the 2's complement technique:
Convert 5 to the binary form:
00000101, then flip the bits
11111010, then add 1
00000001
=> result: 11111011
To make sure this is correct, I re-calculate to decimal:
-128 + 64 + 32 + 16 + 8 + 2 + 1 = -5
Case 2: number -240
The same steps are taken:
11110000
00001111
00000001
00010000 => recalculating this, I got 16, not -240
Am I misunderstanding something?
The problem is that you are trying to represent 240 with only 8 bits. The range of an 8 bit signed number is -128 to 127.
If you instead represent it with 9 bits, you'll see you get the correct answer:
011110000 (240)
100001111 (flip the bits)
+
000000001 (1)
=
100010000
=
-256 + 16 = -240
Did you forget that -240 cannot be represented with 8 bits when it is signed?
The lowest negative number you can express with 8 bits is -128, which is 10000000.
Using 2's complement:
128 = 10000000
(flip) = 01111111
(add 1) = 10000000
The lowest negative number you can express with N bits (with signed integers of course) is always - 2 ^ (N - 1).
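The range limit can be checked before encoding. Below is a minimal C sketch under stated assumptions: `encode_twos` is a hypothetical name, and widths are limited to 1..31 bits so every intermediate fits in 32-bit types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Encode `value` as a `bits`-wide two's complement pattern.
   Returns false when the value does not fit in that many bits. */
bool encode_twos(int32_t value, unsigned bits, uint32_t *pattern)
{
    assert(bits >= 1 && bits <= 31);
    int32_t min = -(int32_t)(1u << (bits - 1));      /* -2^(N-1)    */
    int32_t max = (int32_t)(1u << (bits - 1)) - 1;   /* 2^(N-1) - 1 */
    if (value < min || value > max)
        return false;
    /* Conversion to uint32_t wraps modulo 2^32, which already yields the
       two's complement bits; the mask keeps the low `bits` of them. */
    *pattern = (uint32_t)value & ((1u << bits) - 1u);
    return true;
}
```

With 8 bits the call for -240 is rejected, while with 9 bits it produces 100010000, the pattern from the worked example.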