Represent a negative number with the 2's complement technique? - twos-complement

I am using 2's complement to represent a negative number in binary form.
Case 1: number -5
According to the 2's complement technique:
Convert 5 to the binary form:
00000101, then flip the bits
11111010, then add 1
00000001
=> result: 11111011
To make sure this is correct, I convert it back to decimal:
-128 + 64 + 32 + 16 + 8 + 2 + 1 = -5
Case 2: number -240
The same steps are taken:
11110000
00001111
00000001
00010000 => converting this back I get 16, not -240
Am I misunderstanding something?

The problem is that you are trying to represent 240 with only 8 bits. The range of an 8-bit signed number is -128 to 127.
If you instead represent it with 9 bits, you'll see you get the correct answer:
011110000 (240)
100001111 (flip the bits)
+
000000001 (1)
=
100010000
=
-256 + 16 = -240
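If you want to experiment with this, here is a minimal C++ sketch of the same flip-the-bits-and-add-one procedure at an arbitrary width (the helper name twos_complement is purely illustrative):

#include <bitset>
#include <cstdint>
#include <iostream>

// Negate `value` in two's complement using `bits` bits: flip the bits, add 1,
// and keep only the low `bits` bits of the result.
uint32_t twos_complement(uint32_t value, unsigned bits) {
    uint32_t mask = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return (~value + 1u) & mask;
}

int main() {
    // 8 bits: 240 does not fit as a signed value, so the pattern reads as +16.
    std::cout << std::bitset<8>(twos_complement(240, 8)) << '\n';  // 00010000
    // 9 bits: the top bit is worth -256, so 100010000 is -256 + 16 = -240.
    std::cout << std::bitset<9>(twos_complement(240, 9)) << '\n';  // 100010000
}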

Did you forget that -240 cannot be represented with 8 bits when it is signed?

The lowest negative number you can express with 8 bits is -128, which is 10000000.
Using 2's complement:
128 = 10000000
(flip) = 01111111
(add 1) = 10000000
The lowest negative number you can express with N bits (with signed integers, of course) is always -2^(N-1).
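A quick sketch that prints this range for a few widths (the loop and names are only illustrative):

#include <cstdint>
#include <iostream>

int main() {
    // For N-bit two's complement the range is [-2^(N-1), 2^(N-1) - 1].
    const unsigned widths[] = {4, 8, 16};
    for (unsigned n : widths) {
        int64_t lo = -(int64_t{1} << (n - 1));
        int64_t hi = (int64_t{1} << (n - 1)) - 1;
        std::cout << n << " bits: " << lo << " .. " << hi << '\n';
    }
}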


How does this alignment work? ((n + ZBI_ALIGNMENT - 1) & -ZBI_ALIGNMENT)

I'm trying to understand how this alignment works. It should align a uint32_t address to its nearest 8-byte-aligned address:
static inline uint32_t ZBI_ALIGN(uint32_t n) {
    return ((n + ZBI_ALIGNMENT - 1) & -ZBI_ALIGNMENT);
}
Let's take n=10 and ZBI_ALIGNMENT=8. The nearest aligned address should be 16.
It returns ((10 + 8 - 1) & -8) = 17 & -8.
Why is this aligned?
The key to this formula is that it is only valid if ZBI_ALIGNMENT happens to be a power of two, which is not a big deal because alignment requirements tend to fulfil that criterion.
A number being aligned to (aka being a multiple of) a power of two means that all bits smaller than that power of two are set to 0. You can convince yourself of that easily by looking at a few 8-bit numbers:
15: 00001111
16: 00010000 <--- aligned to 16
17: 00010001
31: 00011111
32: 00100000 <--- aligned to 16
48: 00110000 <--- aligned to 16
Assuming we have a mask that has only the bits worth 16 or more set, N & mask would be a no-op for all multiples of 16, and would give us the previous multiple of 16 for all other values.
16: 00010000
mask for 16: 11110000
15 & mask -> 00000000 : 0
16 & mask -> 00010000 : 16
17 & mask -> 00010000 : 16
32 & mask -> 00100000 : 32
In order to round up instead of down, we can use (N + 15) & mask. If N is a multiple of 16 already, N + 15 will land just shy of the next multiple. Otherwise, it will always "bump" the value into the next range, e.g. 1 + 15 = 16, 16 + 15 = 31, etc. This generalises as (N + (DESIRED_ALIGNMENT - 1)).
So all that's left to figure out is how to calculate the mask for a given desired alignment.
Conveniently, in two's complement representation (which all signed integers have to use), negative values of powers of two happen to be exactly the mask we need.
For 8 bit numbers it looks like this:
-1 -> 11111111
-2 -> 11111110
-4 -> 11111100
-8 -> 11111000
etc...
So mask can simply be computed as -ZBI_ALIGNMENT.
Putting all this together, we get:
((n + ZBI_ALIGNMENT - 1) & -ZBI_ALIGNMENT)
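As a rough illustration (align_up is a made-up name, not part of any ZBI header), the same formula can be wrapped in a generic helper that asserts the power-of-two requirement:

#include <cassert>
#include <cstdint>
#include <iostream>

// Round n up to the next multiple of alignment.
// Only valid when alignment is a power of two; note -alignment == ~(alignment - 1).
static inline uint32_t align_up(uint32_t n, uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & -alignment;
}

int main() {
    std::cout << align_up(10, 8) << '\n';  // 16: (10 + 7) & 0xFFFFFFF8
    std::cout << align_up(16, 8) << '\n';  // 16: already aligned, unchanged
    std::cout << align_up(17, 8) << '\n';  // 24
}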

Why is the absolute value of INT_MIN different from INT_MAX? [duplicate]

I'm trying to understand why INT_MIN is equal to -2^31 and not -(2^31 - 1).
My understanding is that an int is 4 bytes = 32 bits. Of these 32 bits, I assume 1 bit is used for the +/- sign, leaving 31 bits for the actual value. As such, INT_MAX is equal to 2^31 - 1 = 2147483647. On the other hand, why is INT_MIN equal to -2^31 = -2147483648? Wouldn't this exceed the '4 bytes' allotted for int? Based on my logic, I would have expected INT_MIN to equal -(2^31 - 1) = -2147483647.
Most modern systems use two's complement to represent signed integer data types. In this representation, one state on the non-negative side is used up to represent zero, so there is one fewer positive value than there are negative values. In fact, this is one of the prime advantages this system has over the sign-magnitude system, where zero has two representations, +0 and -0. Since zero has only one representation in two's complement, the state that would otherwise be wasted is used to represent one more number.
Let's take a small data type, say 4 bits wide, to understand this better. The number of possible states with this toy integer type is 2⁴ = 16. When using two's complement to represent signed numbers, we get 8 negative numbers, 7 positive numbers, and zero; in a sign-magnitude system, we'd get two zeros, 7 positive numbers, and 7 negative numbers.
Bin Dec
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
0110 = 6
0111 = 7
1000 = -8
1001 = -7
1010 = -6
1011 = -5
1100 = -4
1101 = -3
1110 = -2
1111 = -1
I think you are confused because you are imagining that a sign-magnitude representation is used for signed numbers; although that is also allowed by the language standards, it is far less likely to be implemented, since two's complement is a significantly better representation.
As of C++20, only two's complement is allowed for signed integers.
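On a typical platform where int is 32 bits, a quick check with std::numeric_limits shows the asymmetry directly:

#include <iostream>
#include <limits>

int main() {
    // With two's complement, the negative range is one larger than the positive.
    std::cout << std::numeric_limits<int>::min() << '\n';  // -2147483648 == -2^31
    std::cout << std::numeric_limits<int>::max() << '\n';  //  2147483647 ==  2^31 - 1
}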

Understanding two's complement

I don't understand why an n-bit 2C system number can be extended to an (n+1)-bit 2C system number by making bit b_n = b_(n-1), that is, extending to (n+1) bits by replicating the sign bit.
This works because of the way we calculate the value of a binary integer.
Working right to left, the value is the sum of each bit_i * 2^i,
where
i ranges from 0 to n - 1
n is the number of bits
Because each additional 0 bit adds nothing to the sum, 0 is the appropriate value for padding a smaller non-negative value into a wider bit field.
For example, using the number 5:
4 bit: 0101
5 bit: 00101
6 bit: 000101
7 bit 0000101
8 bit: 00000101
The opposite is true for negative numbers in a two's complement system.
Remember that you calculate the two's complement by first calculating the one's complement and then adding 1; as the examples below (and the sketch after them) show, the widened patterns still decode to the same value.
Invert the value from the previous example to get -5:
4 bit: 0101 (invert)-> 1010 + 1 -> 1011
5 bit: 00101 (invert)-> 11010 + 1 -> 11011
6 bit: 000101 (invert)-> 111010 + 1 -> 111011
7 bit: 0000101 (invert)-> 1111010 + 1 -> 1111011
8 bit: 00000101 (invert)-> 11111010 + 1 -> 11111011
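A small sketch showing a compiler performing the same sign extension when a value is widened (assuming the usual fixed-width integer types from <cstdint>):

#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    // Widening a signed value replicates the sign bit; the numeric value is kept.
    int8_t narrow = -5;      // 8-bit pattern 11111011
    int16_t wide = narrow;   // sign-extended to 1111111111111011
    std::cout << std::bitset<8>(static_cast<uint8_t>(narrow)) << '\n';
    std::cout << std::bitset<16>(static_cast<uint16_t>(wide)) << '\n';
    std::cout << static_cast<int>(wide) << '\n';  // still -5
}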

How to represent a negative number with a fraction in 2's complement?

So I want to represent the number -12.5. 12.5 equals:
001100.100
If I ignore the fraction then it's simple; -12 is:
110100
But what is -12.5? Is it 110100.100? How can I calculate this negative fraction?
With decimal number systems, each number position (or column) represents (reading a number from right to left): units (i.e. 10^0), tens (i.e. 10^1), hundreds (i.e. 10^2), etc.
With unsigned binary numbers, the base is 2, thus each position becomes (again, reading from right to left): 1 (i.e. 2^0) ,2 (i.e. 2^1), 4 (i.e. 2^2), etc.
For example
2^2 (4), 2^1 (2), 2^0 (1).
In signed two's complement the most significant bit (MSB) has a negative weight, so it indicates the sign: '1' for a negative number and '0' for a positive number.
For a three-bit number the columns would hold these values:
-4, 2, 1
0 0 1 => 1
1 0 0 => -4
1 0 1 => -4 + 1 = -3
In a fixed-point (fractional) system the meaning of the bits is unchanged: column values follow the same pattern as before, base (2) raised to a power, but with the powers continuing into negative exponents to the right of the point:
2^2 (4), 2^1 (2), 2^0 (1) . 2^-1 (0.5), 2^-2 (0.25), 2^-3 (0.125)
-1 will always be 111.000
-0.5 is that plus 0.5: 111.100
In your case 110100.10 is equal to -32+16+4+0.5 = -11.5. What you did was create -12 then add 0.5 rather than subtract 0.5.
What you actually want is -32+16+2+1+0.5 = -12.5 = 110011.1
You can also double the number repeatedly until it is an integer (or you reach a defined limit), take the two's complement of that integer, and then place the binary point correspondingly.
-25 is 11100111, so -12.5 is 1110011.1
So you want to represent -12.5 in 2's complement representation.
12.5 -> 01100.1
2's complement of 01100.1 -> 10011.1
Verify the answer by checking the weighted-code property of 2's complement representation (the MSB weight is negative): -16 + 2 + 1 + 0.5 = -12.5.
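For completeness, a minimal sketch that stores -12.5 the way the doubling answer above suggests: assuming an 8-bit container with one fractional bit, the stored integer is simply the value times 2.

#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    // Fixed point with 1 fractional bit: store value * 2 as a signed integer,
    // so the binary point sits just before the last bit of the stored pattern.
    double value = -12.5;
    int8_t stored = static_cast<int8_t>(value * 2);  // -25 == 11100111
    std::cout << std::bitset<8>(static_cast<uint8_t>(stored)) << '\n';  // 11100111
    std::cout << stored / 2.0 << '\n';  // back to -12.5
}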

Bitfields in C++

I have the following code for self learning:
#include <iostream>
using namespace std;

struct bitfields {
    unsigned field1 : 3;
    unsigned field2 : 4;
    unsigned int k : 4;
};

int main() {
    bitfields field;
    field.field1 = 8;
    field.field2 = 17;
    field.k = 18;
    cout << field.k << endl;
    cout << field.field1 << endl;
    cout << field.field2 << endl;
    return 0;
}
I know that unsigned int k:4 means that k is 4 bits wide, i.e. has a maximum value of 15, and the output is the following:
2
0
1
For example, field1 can be from 0 to 7 (inclusive), field2 and k from 0 to 15. Why such a result? Shouldn't they all be zero?
You're overflowing your fields. Let's take k as an example: it's 4 bits wide, so it can hold values, as you say, from 0 to 15, which in binary representation is
0 -> 0000
1 -> 0001
2 -> 0010
3 -> 0011
...
14 -> 1110
15 -> 1111
So when you assign 18, having binary representation
18 -> 1 0010 (space added between 4th and 5th bit for clarity)
k can only hold the lower four bits, so
k = 0010 = 2.
The equivalent holds true for the rest of your fields as well.
You got these results because each assignment overflowed its bitfield.
The variable field1 is 3 bits wide, but 8 takes 4 bits to represent (1000). The lower three bits are all zero, so field1 is zero.
For field2, 17 is represented by 10001, but field2 is only four bits wide. The lower four bits represent the value 1.
Finally, for k, 18 is represented by 10010, but k is only four bits wide. The lower four bits represent the value 2.
I hope that helps clear things up.
In C++ any unsigned type wraps around when you exceed its maximum value[1]. When you define a bitfield of 4 bits, every value you store is wrapped around too. The possible values for a bitfield of size 4 are 0-15. If you store 17, it wraps to 1; for 18 you go one further, to 2.
Mathematically, the wrapped value is the original value modulo the number of possible values for the destination type:
For the bitfield of size 4 (2**4 possible values):
18 % 16 == 2
17 % 16 == 1
For the bitfield of size 3 (2**3 possible values):
8 % 8 == 0.
[1] This is not true for signed types: overflowing a signed type in arithmetic is undefined behaviour, and converting an out-of-range value to a signed type gives an implementation-defined result.
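A small sketch of the wrap-around described above, with the modulo equivalents in the comments (compilers typically warn about these out-of-range constant assignments):

#include <iostream>

struct bitfields {
    unsigned field1 : 3;  // holds 0..7
    unsigned field2 : 4;  // holds 0..15
    unsigned k : 4;       // holds 0..15
};

int main() {
    bitfields f{};
    f.field1 = 8;   //  8 % 8  == 0 (only the low 3 bits, 000, are kept)
    f.field2 = 17;  // 17 % 16 == 1 (low 4 bits of 10001 are 0001)
    f.k = 18;       // 18 % 16 == 2 (low 4 bits of 10010 are 0010)
    std::cout << f.field1 << ' ' << f.field2 << ' ' << f.k << '\n';  // 0 1 2
}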