bit shift for unsigned int, why negative? - c++

Code:
unsigned int i = 1<<31;
printf("%d\n", i);
Why is the output -2147483648, a negative value?
Updated question:
#include <stdio.h>
int main(int argc, char * argv[]) {
int i = 1<<31;
unsigned int j = 1<<31;
printf("%u, %u\n", i, j);
printf("%d, %d\n", i, j);
return 0;
}
The above prints:
2147483648, 2147483648
-2147483648, -2147483648
So does this mean that signed int and unsigned int have the same bit values, and the difference is how you treat the 31st bit when converting it to a numeric value?

%d prints the int version of the unsigned int i. Try %u for unsigned int.
printf("%u\n", i);
#include <stdio.h>
int main(){
printf("%d, %u",-1,-1);
return 0;
}
Output: -1, 4294967295
That is, knowing how a signed integer is stored, and how it is converted between signed and unsigned, will help you.
To answer your updated question: it is how the system represents them, i.e. two's complement. In the case above,
the bit pattern of -1 (the two's complement of 1), read as unsigned, is 4294967295.

Use '%u' for unsigned int
printf("%u\n", i);
Response to updated question: any sequence of bits can be interpreted as a signed or unsigned value.

printf("%d\n", i); invokes UB. i is unsigned int and you try to print it as signed int. Writing 1 << 31 instead of 1U << 31 is undefined too.
Print it as:
printf("%u\n", i);
or
printf("%X\n", i);
About your updated question: it also invokes UB, for the very same reasons. (Even if you use 1U instead of 1, then int i is initialized with 1U << 31, which is an out-of-range value for int. If an unsigned type is initialized with an out-of-range value, modular arithmetic comes into the picture and the remainder is assigned; for a signed type, the result is implementation-defined.)
Understanding the behavior on your platform
On your platform, int appears to be 4 bytes. When you write something like 1 << 31, it produces the bit pattern 0x80000000 on your machine.
Now when you try to print this pattern as signed, it prints the signed interpretation, which is -2^31 (a.k.a. INT_MIN) in a two's complement system. When you print it as unsigned, you get the expected 2^31 as output.
Learnings
1. Use 1U << 31 instead of 1 << 31.
2. Always use the correct format specifiers in printf.
3. Pass correct argument types to variadic functions.
4. Be careful when implicit conversions (unsigned -> signed, wider type -> narrower type) take place. If possible, avoid such conversions completely. A corrected sketch applying these points follows.
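A minimal corrected sketch, assuming a 32-bit int as on the asker's platform:
#include <stdio.h>
int main(void) {
    /* 1U makes the shift operate on unsigned int, so 1U << 31 is well defined */
    unsigned int i = 1U << 31;
    printf("%u\n", i);  /* %u matches the unsigned int argument: 2147483648 */
    printf("%X\n", i);  /* hex view of the same bit pattern: 80000000 */
    return 0;
}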

Try
printf("%u\n", i);
With the %d specifier, printf expects its argument to be an int and interprets the bits as one.
So, use %u for unsigned int.

You are not doing the shift operation on an unsigned int but on a signed int.
1 is signed.
The shift operation then shifts into the sign bit of that signed int (assuming that int is 32 bits wide); that is already undefined behavior.
Then you assign whatever the compiler produces for that value to an unsigned int.
Then you print an unsigned int as a signed int; again, no defined behavior.

%u is for unsigned int.
%d is for signed int.
In your program's output:
2147483648, 2147483648 (printed with %u, as unsigned int)
-2147483648, -2147483648 (printed with %d, as signed int)

Related

Assign a negative number to an unsigned int

This code gives a meaningful output:
#include <iostream>
int main() {
unsigned int ui = 100;
unsigned int negative_ui = -22u;
std::cout << ui + negative_ui << std::endl;
}
Output:
78
The variable negative_ui stores -22, but is an unsigned int.
My question is why does unsigned int negative_ui = -22u; work.
How can an unsigned int store a negative number? Is it safe to use, or does this yield undefined behaviour?
I use the intel compiler 18.0.3. With the option -Wall no warnings occurred.
P.S. I have read What happens if I assign a negative value to an unsigned variable? and Why unsigned int contained negative number.
How can an unsigned int store a negative number?
It doesn't. Instead, it stores a representable number that is congruent with that negative number modulo the number of all representable values. The same is also true with results that are larger than the largest representable value.
Is it safe to use, or does this yield undefined behaviour?
There is no UB. Unsigned arithmetic overflow is well defined.
It is safe to rely on the result. However, it can be brittle. For example, if you add -22u and 100ull, you get UINT_MAX + 79 (a large value, assuming unsigned long long is a wider type than unsigned). That result is congruent with 78 modulo UINT_MAX + 1, and it is representable in unsigned long long but not in unsigned.
Note that signed arithmetic overflow is undefined.
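A minimal sketch of both cases, assuming a 32-bit unsigned int and a 64-bit unsigned long long:
#include <iostream>
int main() {
    unsigned int ui = 100;
    unsigned int negative_ui = -22u;        // wraps to 4294967274 (UINT_MAX - 21)
    std::cout << ui + negative_ui << '\n';  // unsigned wrap-around: prints 78
    // Brittle case: the unsigned int operand is widened to unsigned long long
    // first, so no 32-bit wrap-around occurs and the large value survives.
    std::cout << -22u + 100ull << '\n';     // prints 4294967374 (UINT_MAX + 79)
}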
Signed/unsigned is a convention. It uses the top bit of the variable (bit 31, in the case of a 32-bit x86 int). What you store in the variable takes the full bit length.
It's the calculations that follow that take the top bit as a sign indicator or ignore it. Therefore, any "unsigned" variable can contain a signed value, which will be converted to the unsigned form when the unsigned variable participates in a calculation.
unsigned int x = -1; // x is now 0xFFFFFFFF.
x -= 1; // x is now 0xFFFFFFFE.
if (x < 0) // false. x is compared as 0xFFFFFFFE.
int x = -1; // x stored as 0xFFFFFFFF
x -= 1; // x stored as 0xFFFFFFFE
if (x < 0) // true, x is compared as -2.
Technically valid, bad programming.

How do I get a number outside [0,255] when doing a conversion from int to char?

I do not see why a conversion from int to char gives me a number outside the range of char. Here is my C++ program:
#include <cmath>
#include <stdio.h>
int main(){
printf("%c\n",(char) 246 );
printf("%d\n", (char) (246) );
}
I get
\366
-10
Any explanation? What I need here is a conversion from int to char
char int2char(int i);
that returns i truncated to a char. Of course, one can do this with something like (i mod 256), but I am looking for a way that uses type conversion to do this. Any idea?
When you execute:
printf("%d\n", (char) (246) );
It is equivalent to executing:
char c = 246;
int i = c;
printf("%d\n", i);
The tricky part is what happens in the line
char c = 246;
246 is stored as 11110110 in binary.
It looks like on your platform, char is signed. When char is signed, the integer value of that binary number is equal to -10. Hence, the value of c is set to -10, and the value of i is also set to -10.
The native char type could be either signed or unsigned. It seems that it's signed on your machine.
If you want to make sure the result is within the [0, 255] range, use unsigned char instead.
static_cast<unsigned char>(246)
Your C++ compiler treats char as a signed 8-bit integer, with range -128 to 127, rather than as an unsigned 8-bit integer with a range of 0 to 255.
The problem isn't the int-to-char conversion, but the char-to-int conversion, which can be done with either casting (i = (unsigned char) c or i = static_cast<unsigned char>(c)) or bitmasking (i = c & 0xFF).
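For the int2char helper the question asks for, a minimal sketch along the lines the answers suggest (the round trip through unsigned char is an assumption about intent; on common two's complement platforms it preserves the low 8 bits):
#include <cstdio>

// Sketch: reduce i modulo 256 via unsigned char (well defined), then
// convert to char. Converting an out-of-range value to a signed char is
// implementation-defined before C++20, but on typical two's complement
// systems it keeps the same bit pattern.
char int2char(int i) {
    return static_cast<char>(static_cast<unsigned char>(i));
}

int main() {
    // To read the value back in [0, 255], convert through unsigned char
    // before widening to int, as suggested above.
    printf("%d\n", static_cast<unsigned char>(int2char(246)));  // prints 246
}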

Why does the unsigned int give a negative value in c++?

I have two functions add and main as follows.
#include <iostream>
using namespace std;

int add(unsigned int a, unsigned int b)
{
return a+b;
}
int main()
{
unsigned int a,b;
cout << "Enter a value for a: ";
cin >> a;
cout << "Enter a value for b: ";
cin >> b;
cout << "a: " << a << " b: "<<b <<endl;
cout << "Result is: " << add(a,b) <<endl;
return 0;
}
When I run this program I get the following results:
Enter a value for a: -1
Enter a value for b: -2
a: 4294967295 b: 4294967294
Result is: -3
Why is the result -3?
Because add returns an int (not an unsigned int), which cannot represent 4294967295 + 4294967294 = 4294967293 (unsigned integer arithmetic is defined mod 2^n, with n = 32 in this case); the result is too big.
Thus, you have signed integer overflow (or, more precisely, an implicit conversion from a source integer that cannot be represented as int), which has an implementation-defined result, i.e. any output (that is representable as int) would be "correct".
The reason for getting exactly -3 is that the result is 2^32 - 3, and that gets converted to -3 on your system. But still, note that any result would be equally legal.
int add(unsigned int a, unsigned int b)
{
return a+b;
}
The expression a+b adds two operands of type unsigned int, yielding an unsigned int result. Unsigned addition, strictly speaking, does not "overflow"; rather, the result is reduced modulo MAX + 1, where MAX is the maximum value of the unsigned type. In this case, assuming 32-bit unsigned int, the result of adding 4294967295 + 4294967294 is well defined: it's 4294967293, or 2^32 - 3.
Since add is defined to return an int result, the unsigned value is implicitly converted from unsigned int to int. Unlike an arithmetic overflow, an unsigned-to-signed conversion that can't be represented in the target type yields an implementation-defined result. On a typical implementation, such a conversion (where the source and target have the same size) will reinterpret the representation, yielding -3. Other results are possible, depending on the implementation, but not particularly likely.
As for why a and b were set to those values in the first place, apparently that's how cin >> a behaves when a is an unsigned value and the input is negative. I'm not sure whether that behavior is defined by the language, implementation-defined, or undefined. In any case, once you have those values, the result returned by add follows as described above.
If you intend to return an unsigned int, then you need to add unsigned to your function declaration. If you change your return type to unsigned int
and use the values -1 and -2, then this will be your output:
a: 4294967295 b: 4294967294
Result: 4294967293
unsigned int ranges over [0, 4294967295], provided an unsigned int is 4 bytes in size on your local machine. There is no sign bit in an unsigned int. When you input -1 for an unsigned int, the value wraps around: -1 becomes the largest possible value, 4294967295, and -2 becomes the second largest, 4294967294. (Correspondingly, adding 1 to the largest possible value wraps back to 0.) These values are passed into the function, which creates two local copies of a and b on the stack.
So now a holds the maximum unsigned int value and b holds that maximum minus 1. Adding them exceeds the maximum value of an unsigned int, so the sum wraps around in the same way: 4294967295 + 4294967294 is reduced modulo 2^32, giving 4294967293. That is the value the function returns if its return type is unsigned int.
The same addition happens when the return type is int; the difference is what happens afterwards. The unsigned result 4294967293 is implicitly converted to int, and it does not fit: a 4-byte signed int ranges over [-2147483648, 2147483647], with the top bit carrying the sign (in two's complement). On such a system the conversion reinterprets the bit pattern, and the bit pattern of 4294967293 read as a signed int is -3.
The correspondence between the large unsigned values and the small negative ones runs:
- ...95 = -1
- ...94 = -2
- ...93 = -3
So the wrapped unsigned sum ...93 comes back out of the int-returning function as -3.
When the result is converted from unsigned int to int, the number is larger than a signed integer can hold, so it wraps around to a negative value.
If you change the return value to an unsigned integer, your problem should be solved.
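A minimal sketch of that fix, changing only the return type (assuming a 32-bit unsigned int):
#include <iostream>
using namespace std;

// Returning unsigned int preserves the well-defined modular result.
unsigned int add(unsigned int a, unsigned int b)
{
    return a+b;  // reduced mod 2^32: 4294967295 + 4294967294 = 4294967293
}

int main()
{
    cout << "Result is: " << add(-1, -2) << endl;  // prints 4294967293
    return 0;
}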

c/c++ left shift unsigned vs signed

I have this code.
#include <iostream>
int main()
{
unsigned long int i = 1U << 31;
std::cout << i << std::endl;
unsigned long int uwantsum = 1 << 31;
std::cout << uwantsum << std::endl;
return 0;
}
It prints out:
2147483648
18446744071562067968
on Arch Linux 64 bit, gcc, ivy bridge architecture.
The first result makes sense, but I don't understand where the second number came from. 1 represented as a 4-byte int, signed or unsigned, is
00000000000000000000000000000001
When you shift it 31 times to the left, you end up with
10000000000000000000000000000000
no? I know shifting left for positive numbers is essentially 2^k, where k is how many times you shift it, assuming it still fits within bounds. Why do I get such a bizarre number?
Presumably you're interested in why this: unsigned long int uwantsum = 1 << 31; produces a "strange" value.
The problem is pretty simple: 1 is a plain int, so the shift is done on a plain int, and only after it's complete is the result converted to unsigned long.
In this case, however, 1<<31 overflows the range of a 32-bit signed int, so the result is undefined [1]. After conversion to unsigned, the result remains undefined.
That said, in most typical cases, what's likely to happen is that 1<<31 will give a bit pattern of 10000000000000000000000000000000. When viewed as a signed 2's complement [2] number, this is -2147483648. Since that's negative, when it's converted to a 64-bit type, it'll be sign extended, so the top 32 bits will be filled with copies of what's in bit 31. That gives: 1111111111111111111111111111111110000000000000000000000000000000 (33 1-bits followed by 31 0-bits).
If we then treat that as an unsigned 64-bit number, we get 18446744071562067968.
[1] §5.8/2: The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise, the behavior is undefined.
[2] In theory, the computer could use 1's complement or signed magnitude for signed numbers, but 2's complement is currently much more common than either of those. If it did use one of those, we'd expect a different final result.
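A small sketch making the sign-extension step explicit (this assumes a 32-bit int and a 64-bit unsigned long, as on LP64 Linux, and that the shift happened to produce INT_MIN, as described above):
#include <cstdio>

int main() {
    int n = -2147483647 - 1;     // INT_MIN, the typical bit pattern of 1<<31
    unsigned long wide = n;      // sign-extended to 64 bits, then made unsigned
    std::printf("%lx\n", wide);  // ffffffff80000000
    std::printf("%lu\n", wide);  // 18446744071562067968
    return 0;
}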
The literal 1 with no U is a signed int, so when you shift << 31, you get integer overflow, generating a negative number (under the umbrella of undefined behavior).
Assigning this negative number to an unsigned long causes sign extension, because long has more bits than int, and it translates the negative number into a large positive number by taking its modulus with 2^64, which is the rule for signed-to-unsigned conversion.
It's not "bizarre".
Try printing the number in hex and see if it's any more recognizable:
std::cout << std::hex << i << std::endl;
And always remember to qualify your literals with "U", "L" and/or "LL" as appropriate:
http://en.cppreference.com/w/cpp/language/integer_literal
unsigned long long l1 = 18446744073709550592ull;
unsigned long long l2 = 18'446'744'073'709'550'592llu;
unsigned long long l3 = 1844'6744'0737'0955'0592uLL;
unsigned long long l4 = 184467'440737'0'95505'92LLU;
I think it is compiler dependent.
It gives the same value
2147483648
2147483648
on my machine (g++).
Proof: http://ideone.com/cvYzxN
And if there is overflow, then because uwantsum is an unsigned long int and unsigned values are always non-negative, the conversion from signed to unsigned is done by taking the value modulo 2^64.
Hope this helps!
It's in the way you printed it out.
Using the format specifier %lu should print it as a proper unsigned long int.

Weird result after assigning 2^31 to a signed and unsigned 32-bit integer variable

As the question title reads, assigning 2^31 to a signed and unsigned 32-bit integer variable gives an unexpected result.
Here is the short program (in C++), which I made to see what's going on:
#include <cstdio>
using namespace std;
int main()
{
unsigned long long n = 1<<31;
long long n2 = 1<<31; // this works as expected
printf("%llu\n",n);
printf("%lld\n",n2);
printf("size of ULL: %d, size of LL: %d\n", sizeof(unsigned long long), sizeof(long long) );
return 0;
}
Here's the output:
MyPC / # c++ test.cpp -o test
MyPC / # ./test
18446744071562067968 <- Should be 2^31 right?
-2147483648 <- This is correct ( -2^31 because of the sign bit)
size of ULL: 8, size of LL: 8
I then added another function p(), to it:
void p()
{
unsigned long long n = 1<<32; // since n is 8 bytes, this should be legal for any integer from 32 to 63
printf("%llu\n",n);
}
On compiling and running, this is what confused me even more:
MyPC / # c++ test.cpp -o test
test.cpp: In function ‘void p()’:
test.cpp:6:28: warning: left shift count >= width of type [enabled by default]
MyPC / # ./test
0
MyPC /
Why should the compiler complain about left shift count being too large? sizeof(unsigned long long) returns 8, so doesn't that mean 2^63-1 is the max value for that data type?
It struck me that maybe n*2 and n<<1 don't always behave in the same manner, so I tried this:
void s()
{
unsigned long long n = 1;
for(int a=0;a<63;a++) n = n*2;
printf("%llu\n",n);
}
This gives the correct value of 2^63 as the output, which is 9223372036854775808 (I verified it using Python). But what is wrong with doing a left shift?
A left arithmetic shift by n is equivalent to multiplying by 2^n
(provided the value does not overflow)
-- Wikipedia
The value is not overflowing; at most a minus sign would appear, since the value 2^63 sets the top (sign) bit.
I'm still unable to figure out what's going on with left shift; can anyone please explain this?
PS: This program was run on a 32-bit system running linux mint (if that helps)
On this line:
unsigned long long n = 1<<32;
The problem is that the literal 1 is of type int - which is probably only 32 bits. Therefore the shift will push it out of bounds.
Just because you're storing into a larger datatype doesn't mean that everything in the expression is done at that larger size.
So to correct it, you need to either cast it up or make it an unsigned long long literal:
unsigned long long n = (unsigned long long)1 << 32;
unsigned long long n = 1ULL << 32;
The reason 1 << 32 fails is that 1 doesn't have the right type (it is int). The compiler doesn't do any conversion magic before the assignment itself actually happens, so 1 << 32 gets evaluated using int arithmetic, giving a warning about an overflow.
Try using 1LL or 1ULL instead, which have the long long and unsigned long long types respectively.
The line
unsigned long long n = 1<<32;
results in an overflow, because the literal 1 is of type int, so 1 << 32 is also an int, which is 32 bits in most cases.
The line
unsigned long long n = 1<<31;
also overflows, for the same reason. Note that 1 is of type signed int, so it really only has 31 bits for the value and 1 bit for the sign. So when you shift 1 << 31, it overflows the value bits, resulting in -2147483648, which is then converted to an unsigned long long, which is 18446744071562067968. You can verify this in the debugger, if you inspect the variables and convert them.
So use
unsigned long long n = 1ULL << 31;
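A small sketch pulling the fixes together (assuming a 64-bit long long, as reported by sizeof above):
#include <cstdio>

int main() {
    unsigned long long a = 1ULL << 31;  // shift performed in unsigned long long
    unsigned long long b = 1ULL << 32;  // fine: well within 64 bits
    unsigned long long c = 1ULL << 63;  // top bit of a 64-bit type
    std::printf("%llu\n%llu\n%llu\n", a, b, c);
    // prints 2147483648, 4294967296, 9223372036854775808
    return 0;
}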