Assign and compare against uint(-1) [duplicate] - c++

I was curious to know what would happen if I assign a negative value to an unsigned variable.
The code will look somewhat like this.
unsigned int nVal = 0;
nVal = -5;
It didn't give me any compiler error. When I ran the program, nVal was assigned a strange value! Could it be that some 2's complement value gets assigned to nVal?

For the official answer, see section 4.7 [conv.integral]:
"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]"
This essentially means that if the underlying architecture uses a representation other than two's complement (such as signed magnitude or one's complement), the conversion to unsigned must still behave as if the representation were two's complement.
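A minimal check of that rule (a sketch, assuming 32-bit unsigned int):
#include <iostream>

int main() {
    unsigned int nVal = 0;
    nVal = -5;  // converted modulo 2^32: 4294967296 - 5 = 4294967291

    std::cout << nVal << '\n';              // prints 4294967291
    std::cout << std::hex << nVal << '\n';  // prints fffffffb
    return 0;
}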

It will assign the bit pattern representing -5 (in 2's complement) to the unsigned int, which will be a large unsigned value. For 32-bit ints this is 2^32 - 5, or 4294967291.

You're right: the signed integer is stored in 2's complement form, and the unsigned integer is stored in the plain binary representation. At the bit level, C (and C++) don't distinguish between the two, so the value you end up with is simply the unsigned reading of the 2's complement bit pattern.

It will show up as a positive integer equal to the maximum unsigned integer minus 4 (the exact value depends on the computer architecture and compiler).
BTW
You can check this by writing a simple C++ "hello world"-type program and seeing for yourself.

Yes, you're correct. The value assigned has all bits set except the third-lowest: -1 is all bits set (hex: 0xFFFFFFFF), -2 is all bits set except the lowest, and so on. What you would see is the hex value 0xFFFFFFFB, which in decimal corresponds to 4294967291.

When you assign a negative value to an unsigned variable, the conversion behaves like two's complement: flip all 0s to 1s and all 1s to 0s in the magnitude, then add 1. In your case you are dealing with int, which is 4 bytes (32 bits), so the two's complement method is applied across the full 32-bit number, which sets the high bits. For example:
┌─[student#pc]─[~]
└──╼ $pcalc 0y00000000000000000000000000000101 # 5 in binary
5 0x5 0y101
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 # flip all bits
4294967290 0xfffffffa 0y11111111111111111111111111111010
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 + 1 # add 1 to that flipped binary
4294967291 0xfffffffb 0y11111111111111111111111111111011
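The same flip-and-add-one identity can be checked directly in C++ (a small sketch, assuming 32-bit unsigned int):
#include <iostream>

int main() {
    unsigned int flipped = ~5u;          // 0xfffffffa: every bit of 5 inverted
    unsigned int negated = flipped + 1;  // 0xfffffffb
    std::cout << (negated == static_cast<unsigned int>(-5)) << '\n';  // prints 1
    return 0;
}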

On the Windows and Ubuntu Linux systems that I have checked, assigning a negative number -n to an unsigned integer in C and C++ results in the value UINT_MAX + 1 - n: assigning -1 gives UINT_MAX, and assigning -5 gives UINT_MAX - 4.

Related

How come 129, an 8-bit number, is stored as -127 in a signed char in C++?

I declared a signed char and stored 129, an 8-bit number, in it. When I typecast it to an integer and printed the result, it was -127. I understand that this is overflow, but the confusion arises when you look at the binary of 129, which is 10000001. In a signed char the most significant bit is reserved as the sign bit and the remaining 7 bits store the number's binary value. According to this concept, 129 should be stored as -1: the MSB represents the negative sign and the remaining 7 bits are 0000001, which makes 1.
How come 129 becomes -127 when the binary of 129 makes -1?
#include <iostream>
using namespace std;
int main() {
    char a = 129;
    cout << (int)a; // OUTPUT IS -127
    return 0;
}
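For reference, here is a sketch that reproduces and cross-checks the observed value; note that an out-of-range conversion to signed char is implementation-defined before C++20 and defined as modular from C++20 on:
#include <iostream>

int main() {
    // 129 doesn't fit in a signed 8-bit char (range -128..127).
    // On two's complement machines the value wraps modulo 256:
    // 129 - 256 = -127. (Implementation-defined before C++20,
    // guaranteed modular since C++20.)
    signed char a = static_cast<signed char>(129);
    std::cout << static_cast<int>(a) << '\n';  // -127
    std::cout << 129 - 256 << '\n';            // -127
    return 0;
}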
Your current idea of negative numbers is like this:
00000001 == 1
10000001 == -1
01111111 == 127
11111111 == -127
This means that you have only the range of integers -127...127 available, and
you also have two zeros (00000000 == 0 and 10000000 == -0).
The best method is the so-called two's complement: for any number x, you invert the binary representation and add 1 to get -x.
It means:
00000001 == 1
11111111 == -1
01111111 == 127
10000001 == -127
10000000 == -128
In this way only 00000000 is zero and you have the widest range -128...127.
Also, the CPU doesn't need additional instructions for adding signed numbers, because signed addition and subtraction are identical to the unsigned operations.
You may wonder why we add 1. Without that step, the scheme is called one's complement:
https://en.wikipedia.org/wiki/Ones%27_complement
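A small sketch that prints the two's complement bit patterns from the table above, using unsigned arithmetic so every conversion is well defined:
#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    // The low 8 bits of each value, shown as a two's complement pattern.
    for (int v : {1, -1, 127, -127, -128}) {
        std::uint8_t bits = static_cast<std::uint8_t>(v);  // wraps modulo 256
        std::cout << std::bitset<8>(bits) << " == " << v << '\n';
    }
    return 0;
}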
Every current computer stores low level signed integers in a format known as two's complement.
In two's complement, MAX_POSITIVE+1 is MIN_NEGATIVE.
0x00000000 is 0.
0x00....0V is V.
0x7F....FF is MAX_POSITIVE.
0x80....00 is MIN_NEGATIVE.
0xFF....FF is -1.
This looks weird at first glance.
But, it means that the logic of addition for positive and negative numbers is the same. In fact, signed and unsigned math works out to be the same, except on overflow a positive signed number becomes a negative number, while the overflow of a positive unsigned number wraps around to 0.
Note that in C++, overflowing a signed number is undefined behavior, not because the hardware traps it, but because making it UB leaves the compiler free to assume that adding positive numbers together yields a positive result. At the machine-code level the implementation is two's complement, and C++ defines the conversion from signed to unsigned to follow the two's complement assumption.
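A short sketch of the distinction (assuming 32-bit int):
#include <climits>
#include <iostream>

int main() {
    // Unsigned arithmetic is defined to wrap modulo 2^n:
    unsigned int u = UINT_MAX;
    std::cout << u + 1u << '\n';  // prints 0, well defined

    // Signed-to-unsigned conversion is likewise well defined:
    std::cout << static_cast<unsigned int>(-1) << '\n';  // prints UINT_MAX

    // But signed overflow (e.g. INT_MAX + 1) is undefined behavior,
    // so we deliberately don't evaluate it here.
    return 0;
}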

Is a negative integer summed with a greater unsigned integer promoted to unsigned int?

After getting advised to read "C++ Primer, 5th ed." by Stanley B. Lippman, I don't understand this:
Page 66. "Expressions Involving Unsigned Types"
unsigned u = 10;
int i = -42;
std::cout << i + i << std::endl; // prints -84
std::cout << u + i << std::endl; // if 32-bit ints, prints 4294967264
He said:
In the second expression, the int value -42 is converted to unsigned before the addition is done. Converting a negative number to unsigned behaves exactly as if we had attempted to assign that negative value to an unsigned object. The value “wraps around” as described above.
But if I do something like this:
unsigned u = 42;
int i = -10;
std::cout << u + i << std::endl; // Why the result is 32?
As you can see -10 is not converted to unsigned int. Does this mean a comparison occurs before promoting a signed integer to an unsigned integer?
-10 is converted to an unsigned integer with a very large value; the reason you get a small number is that the addition wraps around again. With 32-bit unsigned integers, -10 becomes 4294967286. When you add 42 to that you get 4294967328, which doesn't fit: unsigned values wrap modulo 4294967296, and 4294967328 modulo 4294967296 is 32.
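A quick sketch that makes both steps visible (assuming 32-bit unsigned int):
#include <iostream>

int main() {
    unsigned u = 42;
    int i = -10;

    unsigned converted = static_cast<unsigned>(i);  // 4294967286
    std::cout << converted << '\n';
    std::cout << u + i << '\n';  // (42 + 4294967286) mod 2^32 == 32
    return 0;
}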
Well, I guess this is an exception to "two wrongs don't make a right" :)
What's happening is that there are actually two wrap arounds (unsigned overflows) under the hood and the final result ends up being mathematically correct.
First, i is converted to unsigned and as per the wrap around behavior the value is std::numeric_limits<unsigned>::max() - 9.
When this value is summed with u the mathematical result would be std::numeric_limits<unsigned>::max() - 9 + 42 == std::numeric_limits<unsigned>::max() + 33 which is an overflow and we get another wrap around. So the final result is 32.
As a general rule, in an arithmetic expression, if you only have unsigned overflows (no matter how many) and the final mathematical result is representable in the expression's data type, then the value of the expression will be the mathematically correct one. This is a consequence of the fact that unsigned integers in C++ obey the laws of arithmetic modulo 2^n (see below).
Important notice: according to C++, unsigned arithmetic does not overflow.
§6.9.1 Fundamental types [basic.fundamental]
"Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer." [footnote 49]
49) This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.
I will however leave "overflow" in my answer to express values that cannot be represented in regular arithmetic.
Also what we colloquially call "wrap around" is in fact just the arithmetic modulo nature of the unsigned integers. I will however use "wrap around" also because it is easier to understand.
i is in fact converted to unsigned int.
Unsigned integers in C and C++ implement arithmetic in ℤ/2^nℤ, where n is the number of bits in the unsigned integer type. Thus we get
[42] + [-10] ≡ [42] + [2^n - 10] ≡ [2^n + 32] ≡ [32],
with [x] denoting the equivalence class of x in ℤ/2^nℤ.
Of course, the intermediate step of picking only non-negative representatives of each equivalence class, while it formally occurs, is not necessary to explain the result; the immediate
[42] + [-10] ≡ [32]
would also be correct.
"In the second expression, the int value -42 is converted to unsigned before the addition is done"
Yes, this is true.
unsigned u = 42;
int i = -10;
std::cout << u + i << std::endl; // Why the result is 32?
Supposing we are on 32 bits (nothing changes on 64-bit systems, where int is typically still 32 bits; this is just to explain), this is computed as 42u + ((unsigned) -10), that is 42u + 4294967286u. The result, 4294967328, is truncated to 32 bits, giving 32. Everything was done in unsigned arithmetic.
This is part of what is wonderful about 2's complement representation. The processor doesn't know or care whether a number is signed or unsigned; the operations are the same. In both cases the calculation is correct. It's only how the binary number is interpreted after the fact, when printing, that actually matters (there may be other cases, as with comparison operators).
-10 as a 32-bit value is 0xFFFFFFF6
42 as a 32-bit value is 0x0000002A
Adding them together, it doesn't matter to the processor whether they are signed or unsigned; the result is 0x100000020. In 32 bits, the leading 1 goes into the carry flag, and in C++ it just disappears. You get 0x20 as the result, which is 32.
In the first case, it is basically the same:
-42 as a 32-bit value is 0xFFFFFFD6
10 as a 32-bit value is 0x0000000A
Add those together and you get 0xFFFFFFE0.
0xFFFFFFE0 as a signed int is -32 (decimal). The calculation is correct! But because it is being PRINTED as unsigned, it shows up as 4294967264. It's all about interpreting the result.
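A small sketch of this same-bits-different-interpretation point (the signed reinterpretation is implementation-defined before C++20 and modular since C++20; in practice, two's complement everywhere):
#include <cstdint>
#include <iostream>

int main() {
    std::uint32_t bits = 0xFFFFFFE0;  // the result of -42 + 10, as raw bits

    // Same 32 bits, two interpretations:
    std::cout << bits << '\n';                             // 4294967264 (unsigned view)
    std::cout << static_cast<std::int32_t>(bits) << '\n';  // -32 (signed view)
    return 0;
}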

Negation of -2147483648 not possible in C/C++?

#include <iostream>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    int num = -2147483648;
    int positivenum = -num;
    int absval = abs(num);
    std::cout << positivenum << "\n";
    std::cout << absval << "\n";
    return 0;
}
Hi, I am quite curious why the output of the above code is
-2147483648
-2147483648
Now, I know that -2147483648 is the smallest representable number among signed ints (assuming an int is 32 bits). I would have assumed that one would get garbage answers only after we went below this number. But in this case, +2147483648 IS covered by the 32-bit system of integers. So why the negative answer in both cases?
But in this case, +2147483648 IS covered by the 32 bit system of integers.
Not quite correct. It only goes up to +2147483647. So your assumption isn't right.
Negating -2147483648 will indeed produce 2147483648, but it will overflow back to -2147483648.
Furthermore, signed integer overflow is technically undefined behavior.
The value -(-2147483648) is not representable in a 32-bit signed int. The range of a signed 32-bit int is -2147483648 to 2147483647.
Ahhh, but it's not... remember 0; the largest signed value is actually 2147483647.
Because the 2's complement representation of signed integers isn't symmetric: the minimum 32-bit signed integer is -2147483648, while the maximum is +2147483647. -2147483648 is its own negation, just as 0 is (in the 2's complement representation there is only one 0; there are no distinct +0 and -0).
Here's some explanation.
A negative number -X, when represented as N-bit 2's complement, is effectively represented as the unsigned number 2^N - X. So, for 32-bit integers:
if X = 1, then -X = 2^32 - 1 = 4294967295
if X = 2147483647, then -X = 2^32 - 2147483647 = 2147483649
if X = 2147483648, then -X = 2^32 - 2147483648 = 2147483648
if X = -2147483648, then -X = 2^32 + 2147483648 = 2147483648 (because we only keep the low 32 bits)
So, -2147483648 = +2147483648. Welcome to the world of 2's complement values.
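To see this in code without invoking signed-overflow UB, the negation can be performed in unsigned arithmetic, where wraparound is well defined (a sketch assuming 32-bit int):
#include <cstdint>
#include <iostream>

int main() {
    std::int32_t num = INT32_MIN;  // -2147483648

    // Negate in unsigned arithmetic: 2^32 - (-2147483648) mod 2^32.
    std::uint32_t negated = 0u - static_cast<std::uint32_t>(num);

    std::cout << negated << '\n';  // 2147483648
    // Reinterpreted as a signed 32-bit value, that bit pattern
    // (0x80000000) is INT32_MIN again: -(-2147483648) wraps to itself.
    return 0;
}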
The previous answers have all pointed out that the result is UB (undefined behaviour) because 2147483648 is not a valid int32_t value. And we all know UB means anything can happen, including demons flying out of your nose. The question is: why does cout print a negative value, which seems like the worst value it could have chosen randomly?
I'll try to justify it on a two's complement system. Negation on a CPU is actually somewhat of a tricky operation; you can't do it in one step. One way of implementing negation, i.e. int32_t positivenum = -num, is to do a bit inversion followed by adding 1, i.e. int32_t positivenum = ~num + 1, where ~ is the bitwise negation operator and the +1 fixes the off-by-one error. For example, the negation of 0x00000000 is 0xFFFFFFFF + 1, which is 0x00000000 (after the rollover, which is what most CPUs do). You can verify that this works for most integers... except for -2147483648, which is stored as 0x80000000 in two's complement. When you invert and add one, you get
- (min) = -(0x80000000)
= ~(0x80000000) + 1
= 0x7FFFFFFF + 1
= 0x80000000
= min
So magically, the unary operator - operating on min gives you back min!
One thing that is not obvious is that a two's complement CPU's arithmetic has no concept of positive or negative numbers! It treats all numbers as unsigned under the hood. There is just one adder circuit and one multiplier circuit; the adder works for positive and negative numbers, and the multiplier works for positive and negative numbers.
Example: -1 * -1
= (cast both to uint32_t)
= 0xFFFFFFFF * 0xFFFFFFFF
= 0xFFFFFFFE00000001 // if you compute the result to 64-bit precision
= 0x00000001 // after you truncate to 32-bit precision
= 1
The only time you care about signed vs unsigned is for comparisons, like < or >.
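A sketch of that multiply in code, using fixed-width types so the truncation is explicit:
#include <cstdint>
#include <iostream>

int main() {
    std::uint32_t a = 0xFFFFFFFFu;  // the bit pattern of -1

    // Full 64-bit product, then just its low 32 bits:
    std::uint64_t wide = static_cast<std::uint64_t>(a) * a;  // 0xFFFFFFFE00000001
    std::uint32_t low  = static_cast<std::uint32_t>(wide);   // 0x00000001

    std::cout << std::hex << wide << '\n';  // fffffffe00000001
    std::cout << low << '\n';               // 1 (same low bits as (-1)*(-1))
    return 0;
}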
