I'm trying to understand why INT_MIN is equal to -2^31 and not -(2^31 - 1).
My understanding is that an int is 4 bytes = 32 bits. Of these 32 bits, I assume 1 bit is used for the +/- sign, leaving 31 bits for the actual value. As such, INT_MAX is equal to 2^31 - 1 = 2147483647. On the other hand, why is INT_MIN equal to -2^31 = -2147483648? Wouldn't this exceed the 4 bytes allotted for int? Based on my logic, I would have expected INT_MIN to equal -(2^31 - 1) = -2147483647.
Most modern systems use two's complement to represent signed integer data types. In this representation, one state on the non-negative side is used up to represent zero, so there is one fewer positive value than there are negative values. In fact, this is one of the prime advantages this system has over the sign-magnitude system, where zero has two representations, +0 and -0. Since zero has only one representation in two's complement, the other state, now free, is used to represent one more number.
Let's take a small data type, say 4 bits wide, to understand this better. The number of possible states with this toy integer type would be 2^4 = 16. When using two's complement to represent signed numbers, we would have 8 negative numbers, 7 positive numbers, and zero; in a sign-magnitude system, we'd get two zeros, 7 positive numbers, and 7 negative numbers.
Bin Dec
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
0110 = 6
0111 = 7
1000 = -8
1001 = -7
1010 = -6
1011 = -5
1100 = -4
1101 = -3
1110 = -2
1111 = -1
I think you are confused because you are imagining that a sign-magnitude representation is used for signed numbers. Although the language standards before C++20 also allowed this, it is very unlikely to be implemented in practice, as two's complement is a significantly better representation.
As of C++20, only two's complement is allowed for signed integers.
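The ranges described above can be checked at compile time. This is only a minimal sketch, assuming C++14 or later; `twos_min` and `twos_max` are hypothetical helper names introduced for illustration, and they are only meant for widths from 1 to 63 bits.

```cpp
#include <cstdint>
#include <limits>

// Smallest and largest values of a `bits`-wide two's complement integer.
// Hypothetical helpers for illustration; assumes 1 <= bits <= 63.
constexpr std::int64_t twos_min(int bits) { return -(std::int64_t{1} << (bits - 1)); }
constexpr std::int64_t twos_max(int bits) { return (std::int64_t{1} << (bits - 1)) - 1; }

// The 4-bit toy type: 8 negative values, zero, 7 positive values.
static_assert(twos_min(4) == -8 && twos_max(4) == 7, "4-bit range is -8..7");

// A 32-bit int: one more negative value than positive values.
static_assert(twos_min(32) == std::numeric_limits<std::int32_t>::min(), "-2^31");
static_assert(twos_max(32) == std::numeric_limits<std::int32_t>::max(), "2^31 - 1");
```

The asymmetry comes straight from the formulas: the minimum is -2^(N-1), while the maximum is 2^(N-1) - 1 because zero takes one of the non-negative states.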
I am currently trying to gain a more intuitive understanding of two's complement and its uses; however, I cannot seem to perform subtraction using two's complement correctly. I understand that when a negative number is stored in a signed int variable the procedure is to perform two's complement on the number having the MSB be 1 to represent the negative sign.
So in a 4 bit system 1010 represents -6.
Now I am following this guide on how to subtract two binary numbers.
For example I have the number 0101 (5 in decimal) and 1010 (-6 in decimal). If I wanted to do the equation 5 - (-6) it would look like 0101 - 1010 in binary. Next I would take the 1010 and perform two's complement on it to get 0110. Now I take 0101 + 0110 and get 1011. I don't have a carry so I perform two's complement on the result giving me 0101, but this says the answer is -5 when it should be 11.
4 bits can represent 16 different values. With two's complement, they are -8 to 7.
1000 -8
1001 -7
1010 -6
1011 -5
1100 -4
1101 -3
1110 -2
1111 -1
0000 0
0001 1
0010 2
0011 3
0100 4
0101 5
0110 6
0111 7
You can't possibly get an answer of 11. It's too large to fit in 4 bits as a two's complement number. As such, the calculation you are attempting results in an overflow. (See below for how to detect overflows.)
This means 4 bits is not enough to compute 5 - -6. You need at least 5 bits.
0(...0)0101 5
- 1(...1)1010 -6
------------------
0(...0)0101 5
+ 0(...0)0110 6
------------------
0(...0)1011 11
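The widening step above can be sketched in code. This is an illustration under stated assumptions, not a definitive implementation: `sign_extend` and `sub_nbit` are hypothetical helper names, bit patterns are carried in plain `unsigned` values, widths are assumed to be between 2 and 16, and C++14 or later is assumed.

```cpp
// Sign-extend the low `from` bits of `pattern` to `to` bits by copying
// the sign bit into the new upper bits. Hypothetical helper.
constexpr unsigned sign_extend(unsigned pattern, int from, int to) {
    const unsigned from_mask = (1u << from) - 1;
    const unsigned to_mask = (1u << to) - 1;
    const unsigned sign = 1u << (from - 1);
    pattern &= from_mask;
    return (pattern & sign) ? (pattern | (to_mask & ~from_mask)) : pattern;
}

// Compute a - b within `width` bits as a + (two's complement of b).
constexpr unsigned sub_nbit(unsigned a, unsigned b, int width) {
    const unsigned mask = (1u << width) - 1;
    return (a + ((~b + 1) & mask)) & mask;
}

// -6 widened from 4 to 5 bits keeps its value: 1010 becomes 11010.
static_assert(sign_extend(0b1010, 4, 5) == 0b11010, "-6 in 5 bits");

// With 5 bits, 5 - (-6) yields 01011, which reads as 11.
static_assert(sub_nbit(sign_extend(0b0101, 4, 5), sign_extend(0b1010, 4, 5), 5) == 0b01011,
              "5 - (-6) fits in 5 bits");

// With only 4 bits the same subtraction yields 1011, which reads as -5.
static_assert(sub_nbit(0b0101, 0b1010, 4) == 0b1011, "overflowed 4-bit result");
```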
Detecting Overflow in Two's Complement
With two's complement numbers, an addition overflows when the carry into the sign bit is different than the carry out of it.
0101 5
+ 0110 6
-----------
1011 -5 Carry in: 1 Carry out: 0 OVERFLOW!
1011 -5
+ 1010 -6
-----------
0101 5 Carry in: 0 Carry out: 1 OVERFLOW!
0001 1
+ 0010 2
-----------
0011 3 Carry in: 0 Carry out: 0 No overflow
1111 -1
+ 1110 -2
-----------
1101 -3 Carry in: 1 Carry out: 1 No overflow
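The carry rule can be checked mechanically. The following is a minimal sketch, assuming C++14 or later and widths between 2 and 16 bits; `AddResult` and `add_nbit` are hypothetical names introduced here, and operands are passed as raw bit patterns.

```cpp
// Outcome of adding two two's complement bit patterns of the same width.
struct AddResult {
    unsigned bits;    // low `width` bits of the sum
    bool carry_in;    // carry into the sign bit
    bool carry_out;   // carry out of the sign bit
    bool overflow;    // true when carry_in != carry_out
};

// Hypothetical helper: add two `width`-bit patterns and report both carries.
constexpr AddResult add_nbit(unsigned a, unsigned b, int width = 4) {
    const unsigned mask = (1u << width) - 1;   // e.g. 1111 for 4 bits
    const unsigned low_mask = mask >> 1;       // all bits below the sign bit
    a &= mask;
    b &= mask;
    // Adding only the bits below the sign bit exposes the carry into it.
    const bool into_sign = (((a & low_mask) + (b & low_mask)) >> (width - 1)) & 1u;
    // Adding the full patterns exposes the carry out of the sign bit.
    const bool out_of_sign = ((a + b) >> width) & 1u;
    return {(a + b) & mask, into_sign, out_of_sign, into_sign != out_of_sign};
}

static_assert(add_nbit(0b0101, 0b0110).overflow, "5 + 6 overflows in 4 bits");
static_assert(add_nbit(0b1011, 0b1010).overflow, "-5 + -6 overflows in 4 bits");
static_assert(!add_nbit(0b0001, 0b0010).overflow, "1 + 2 is fine");
static_assert(!add_nbit(0b1111, 0b1110).overflow, "-1 + -2 is fine");
```

For instance, `add_nbit(0b0101, 0b0110)` produces the pattern 1011 with a carry in of 1 and a carry out of 0, matching the first worked example.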
For example I have the number 0101 (5 in decimal) and 1010 (-6 in decimal). If I wanted to do the equation 5 - (-6) it would look like 0101 - 1010 in binary.
Ok.
Next I would take the 1010 and perform two's complement on it to get 0110.
Ok. And 0110 binary is 6 decimal is -(-6) decimal.
Now I take 0101 + 0110 and get 1011.
Ok.
I don't have a carry so I perform two's complement on the result giving me 0101,
I take you to mean that as part of the process of interpreting the result, not computing it. The result itself is 1011 (4-bit, two's complement binary).
but this says the answer is -5 when it should be 11.
The answer in the operational system you have chosen, with 4-bit two's complement, is -5, exactly as you have computed. With three data bits and one sign bit, the maximum result your data type can represent is 7 (0111 binary). This underscores the importance of choosing data types appropriate for the computations you want to perform.
Note that 0b1011 is 11 in decimal if interpreted as unsigned or with more than 3 data bits, so both 11 and -5 are valid readings of the result (which one you use depends on how it's interpreted).
In terms of modular arithmetic, 4 bits give a modulus of 2^4 = 16, and:
11 ≡ -5 (mod 16)
Which answer you use depends on whether the nibble is interpreted as signed or unsigned.
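To make the two readings of one nibble concrete, here is a small sketch, assuming C++14 or later and widths between 1 and 31 bits. `as_signed` is a hypothetical helper name; the unsigned reading is just the pattern itself.

```cpp
// Read the low `width` bits of `pattern` as a two's complement value.
// The sign bit carries weight -2^(width-1); the other bits keep their
// usual positive weights. Hypothetical helper for illustration.
constexpr int as_signed(unsigned pattern, int width = 4) {
    const unsigned mask = (1u << width) - 1;
    const unsigned sign = 1u << (width - 1);
    pattern &= mask;
    return static_cast<int>(pattern & ~sign) - static_cast<int>(pattern & sign);
}

// The same nibble 1011 is 11 when read as unsigned and -5 when read as signed.
static_assert(0b1011 == 11, "unsigned reading");
static_assert(as_signed(0b1011) == -5, "signed reading");

// The two readings differ by exactly the modulus 16.
static_assert(11 - as_signed(0b1011) == 16, "11 and -5 are congruent mod 16");
```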
I'm new to C++ and I find something I can't understand. Could anyone provide some help?
For the following codes:
int i = -3;
printf("i=%d\n",i);
i = i >> 1;
printf("i >> 1 evaluates to: %d\n", i);
then I got the result:
i=-3
i >> 1 evaluates to: -2
I don't quite understand.
As 3 is coded as (to keep it simple):
3 : 0000 0011
-3 : 1111 1100
then after right shift operation, we should have:
-1 : 1111 1110
Right? Why did I get -2? (My PC is 64-bit.)
Thanks for any help!
Your mistake is in assuming that because 3 is 00000011, -3 is represented simply by inverting bits (the so-called "one's complement" representation of negative numbers) to get 11111100. And that likewise 00000001 becomes 11111110 when negated. In fact that's not the case—instead your computer seems to be using the almost-universal "two's complement" system in which -3 is represented as 11111101, -2 is 11111110 and -1 is 11111111.
One nice intuition pump for the two's-complement system is to consider a series of increments, and to note that the behavior is somewhat consistent and intuitive regardless of whether you imagine them happening in the bit pattern itself, in the signed representation, or in the unsigned. Let's stick to 8 bits for simplicity (imagine the "9th bit" just getting discarded):
bit pattern   interpreted as signed byte   interpreted as unsigned byte
11111101      -3                           253
11111110      -2                           254
11111111      -1                           255
00000000       0                             0   (wrap-around)
00000001       1                             1
When it goes from -1 to 0 I can almost "hear" all those bits flipping over one after the other.
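Actually, showing 16 bits for brevity, -1 = 0xFFFF = 1111 1111 1111 1111b and -3 = 0xFFFD = 1111 1111 1111 1101b (a 4-byte int has the same pattern with 16 more leading 1 bits).
So when you shift right by one and the sign bit is copied in from the left, you get 1111 1111 1111 1110b, which is -2.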
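The bit patterns involved can be verified without relying on `>>` itself. Note the assumption here: before C++20 the result of right-shifting a negative value was implementation-defined (mainstream compilers perform an arithmetic shift), and since C++20 it is defined as an arithmetic shift. The sketch below, assuming C++14 or later, uses two hypothetical helpers, `pattern8` and `floor_half`, to show the same arithmetic portably.

```cpp
#include <cstdint>

// Bit pattern of an 8-bit signed value, read back as an unsigned number.
// Signed-to-unsigned conversion is modular, so this is well defined.
constexpr unsigned pattern8(std::int8_t v) { return static_cast<std::uint8_t>(v); }

// Halve and round toward negative infinity, which is what an arithmetic
// right shift by one computes. Hypothetical helper for illustration.
constexpr int floor_half(int v) { return v / 2 - (v % 2 < 0 ? 1 : 0); }

// Two's complement patterns, not one's complement ones.
static_assert(pattern8(-3) == 0b11111101, "-3");
static_assert(pattern8(-2) == 0b11111110, "-2");
static_assert(pattern8(-1) == 0b11111111, "-1");

// Shifting 11111101 right with the sign bit copied in gives 11111110.
static_assert(floor_half(-3) == -2, "-3 halves to -2, not -1");
static_assert(pattern8(floor_half(-3)) == 0b11111110, "pattern of the shifted value");
```

Rounding toward negative infinity is why -3 becomes -2 here, whereas the integer division -3 / 2 rounds toward zero and gives -1.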
So I want to represent the number -12.5. In binary, 12.5 is:
001100.100
If I don't calculate the fraction then it's simple, -12 is:
110100
But what is -12.5? Is it 110100.100? How can I calculate this negative fraction?
With decimal number systems, each number position (or column) represents, reading from right to left: units (10^0), tens (10^1), hundreds (10^2), etc.
With unsigned binary numbers, the base is 2, so each position becomes, again reading from right to left: 1 (2^0), 2 (2^1), 4 (2^2), etc.
For example
2^2 (4), 2^1 (2), 2^0 (1).
In signed two's complement the weight of the most significant bit (MSB) becomes negative. It therefore indicates the sign: '1' for a negative number and '0' for a non-negative number.
For a three-bit number the columns would hold these values:
-4, 2, 1
0 0 1 => 1
1 0 0 => -4
1 0 1 => -4 + 1 = -3
The value of the bits held by a fixed-point (fractional) system is unchanged. Column values follow the same pattern as before, base (2) to a power, but with power going negative:
2^2 (4), 2^1 (2), 2^0 (1) . 2^-1 (0.5), 2^-2 (0.25), 2^-3 (0.125)
-1 will always be 111.000.
To get -0.5, add 0.5 to it: 111.100.
In your case 110100.10 is equal to -32+16+4+0.5 = -11.5. What you did was create -12 then add 0.5 rather than subtract 0.5.
What you actually want is -32+16+2+1+0.5 = -12.5 = 110011.1
Alternatively, you can double the number repeatedly until it is a negative integer (or reaches a defined limit), convert that integer, and then place the binary point accordingly.
-25 is 11100111, so -12.5 is 1110011.1
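The doubling approach amounts to scaling by a power of two and converting the resulting integer. Here is a minimal sketch, assuming C++14 or later, total widths between 2 and 31 bits, and inputs whose scaled value is an integer that fits; `to_fixed` and `from_fixed` are hypothetical helper names.

```cpp
#include <cstdint>

// Encode `value` as a two's complement fixed-point pattern with `frac_bits`
// fractional bits inside `total_bits` bits. Hypothetical helper; assumes
// value * 2^frac_bits is an integer that fits in total_bits.
constexpr std::uint32_t to_fixed(double value, int frac_bits, int total_bits) {
    const auto scaled = static_cast<std::int32_t>(value * (1 << frac_bits)); // -12.5 * 2 = -25
    const std::uint32_t mask = (std::uint32_t{1} << total_bits) - 1;
    return static_cast<std::uint32_t>(scaled) & mask;  // modular conversion keeps the low bits
}

// Decode a pattern: the top bit has negative weight, then divide by 2^frac_bits.
constexpr double from_fixed(std::uint32_t pattern, int frac_bits, int total_bits) {
    const std::uint32_t sign = std::uint32_t{1} << (total_bits - 1);
    const double positive_part = static_cast<double>(pattern & (sign - 1));
    const double sign_part = (pattern & sign) ? -static_cast<double>(sign) : 0.0;
    return (positive_part + sign_part) / (1 << frac_bits);
}

// -12.5 with one fractional bit in 7 bits is 110011.1
static_assert(to_fixed(-12.5, 1, 7) == 0b1100111, "110011.1");
static_assert(from_fixed(0b1100111, 1, 7) == -12.5, "round trip");

// The pattern from the question, 110100.10, is -11.5 rather than -12.5.
static_assert(from_fixed(0b11010010, 2, 8) == -11.5, "110100.10");
```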
So you want to represent -12.5 in two's complement.
12.5 -> 01100.1
Two's complement of 01100.1 -> 10011.1
Verify the answer using the weighted-code property of two's complement (the MSB has negative weight): -16 + 3 + 0.5 = -12.5.
I am using two's complement to represent a negative number in binary form.
Case 1: number -5
According to the two's complement technique:
Convert 5 to the binary form:
00000101, then flip the bits
11111010, then add 1
00000001
=> result: 11111011
To make sure this is correct, I re-calculate to decimal:
-128 + 64 + 32 + 16 + 8 + 2 + 1 = -5
Case 2: number -240
The same steps are taken:
11110000
00001111
00000001
00010000 => recalculate this I got 16, not -240
Am I misunderstanding something?
The problem is that you are trying to represent 240 with only 8 bits. The range of an 8 bit signed number is -128 to 127.
If you instead represent it with 9 bits, you'll see you get the correct answer:
011110000 (240)
100001111 (flip the bits)
+
000000001 (1)
=
100010000
=
-256 + 16 = -240
Did you forget that -240 cannot be represented with 8 bits when it is signed?
The lowest negative number you can express with 8 bits is -128, which is 10000000.
Using 2's complement:
128 = 10000000
(flip) = 01111111
(add 1) = 10000000
The lowest negative number you can express with N bits (as a signed integer, of course) is always -2^(N-1).
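The flip-and-add-one procedure and the range limit can both be written down directly. This is a sketch under stated assumptions (C++14 or later, widths between 1 and 31 bits); `negate_bits` and `fits` are hypothetical helper names.

```cpp
#include <cstdint>

// Two's complement negation within `width` bits:
// flip every bit, add 1, and keep only the low `width` bits.
constexpr std::uint32_t negate_bits(std::uint32_t pattern, int width) {
    const std::uint32_t mask = (std::uint32_t{1} << width) - 1;
    return (~pattern + 1) & mask;
}

// True when `value` lies in the range -2^(width-1) .. 2^(width-1) - 1.
constexpr bool fits(long long value, int width) {
    return -(1LL << (width - 1)) <= value && value <= (1LL << (width - 1)) - 1;
}

static_assert(negate_bits(0b00000101, 8) == 0b11111011, "-5 in 8 bits");
static_assert(!fits(-240, 8) && fits(-240, 9), "-240 needs at least 9 bits");
static_assert(negate_bits(0b11110000, 8) == 0b00010000, "in 8 bits the result reads as 16");
static_assert(negate_bits(0b011110000, 9) == 0b100010000, "-240 in 9 bits");
static_assert(negate_bits(0b10000000, 8) == 0b10000000, "10000000 negates to itself");
```

Checking `fits` before converting is one way to catch the case 2 mistake, where the input already needed more bits than the chosen width.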
I have the following code for self learning:
#include <iostream>
using namespace std;

struct bitfields {
    unsigned field1 : 3;   // 3 bits wide: holds 0..7
    unsigned field2 : 4;   // 4 bits wide: holds 0..15
    unsigned int k : 4;    // 4 bits wide: holds 0..15
};

int main() {
    bitfields field;
    field.field1 = 8;
    field.field2 = 17;
    field.k = 18;
    cout << field.k << endl;
    cout << field.field1 << endl;
    cout << field.field2 << endl;
    return 0;
}
I know that unsigned int k:4 means that k is 4 bits wide, or a maximum value of 15, and the result is the following.
2
0
1
For example, field1 can be from 0 to 7 (inclusive), and field2 and k from 0 to 15. Why such a result? Shouldn't they all be zero?
You're overflowing your fields. Let's take k as an example, it's 4 bits wide. It can hold values, as you say, from 0 to 15, in binary representation this is
0 -> 0000
1 -> 0001
2 -> 0010
3 -> 0011
...
14 -> 1110
15 -> 1111
So when you assign 18, having binary representation
18 -> 1 0010 (space added between 4th and 5th bit for clarity)
k can only hold the lower four bits, so
k = 0010 = 2.
The equivalent holds true for the rest of your fields as well.
You have these results because the assignments overflowed each bitfield.
The variable field1 is 3 bits wide, but 8 takes 4 bits to represent (1000). The lower three bits are all zero, so field1 is zero.
For field2, 17 is represented by 10001, but field2 is only four bits wide. The lower four bits represent the value 1.
Finally, for k, 18 is represented by 10010, but k is only four bits wide. The lower four bits represent the value 2.
I hope that helps clear things up.
In C++ any unsigned type wraps around when you hit its ceiling[1]. When you define a bitfield of 4 bits, then every value you store is wrapped around too. The possible values for a bitfield of size 4 are 0-15. If you store '17', then you wrap to '1', for '18' you go one more to '2'.
Mathematically, the wrapped value is the original value modulo the number of possible values for the destination type:
For the bitfield of size 4 (2**4 possible values):
18 % 16 == 2
17 % 16 == 1
For the bitfield of size 3 (2**3 possible values):
8 % 8 == 0.
[1] This is not true for signed types: overflow in signed arithmetic is undefined behavior.
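The modulo rule can be compared against an actual bit-field. This is a minimal sketch, assuming C++14 or later; the struct mirrors the one in the question, while `wrapped` and `store_in_k` are hypothetical helper names added for illustration.

```cpp
// Same layout as the struct in the question.
struct bitfields {
    unsigned field1 : 3;
    unsigned field2 : 4;
    unsigned int k : 4;
};

// Value left in an unsigned bit-field of `width` bits after storing `value`.
constexpr unsigned wrapped(unsigned value, int width) {
    return value % (1u << width);   // same as value & ((1u << width) - 1)
}

// Store `value` into the 4-bit field k and read it back.
inline unsigned store_in_k(unsigned value) {
    bitfields f{};
    f.k = value;   // only the low 4 bits are kept
    return f.k;
}

static_assert(wrapped(18, 4) == 2, "18 % 16");
static_assert(wrapped(17, 4) == 1, "17 % 16");
static_assert(wrapped(8, 3) == 0, "8 % 8");
```

Calling `store_in_k(18)` should agree with `wrapped(18, 4)`, since storing into an unsigned bit-field reduces the value modulo 2^width.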