Why does bit-shifting an int upwards produce a negative number? - c++

I am new to bit-manipulation tricks, and I wrote a simple program to see the output of single-bit shifts on a single number, namely 2:
#include <iostream>
int main(int argc, char *argv[])
{
    int num = 2;
    do
    {
        std::cout << num << std::endl;
        num = num << 1; // Left shift by 1 bit.
    } while (num != 0);
    return 0;
}
The output of this is the following.
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
-2147483648
Obviously, repeatedly shifting left by 1 bit will eventually result in zero, as it has done above. But why does the computer output a negative number at the very end, just before the loop terminates (since num became zero)?
However, when I replace int num=2 with unsigned int num=2, I get the same output except that the last number is displayed as positive this time, i.e. 2147483648 instead of -2147483648.
I am using the gcc compiler on Ubuntu Linux.

That's because int is a signed integer. In the two's-complement representation, the sign of the integer is determined by the upper-most bit.
Once you have shifted the 1 into the highest (sign) bit, it flips negative.
When you use unsigned, there's no sign bit.
0x80000000 = -2147483648 for a signed 32-bit integer.
0x80000000 = 2147483648 for an unsigned 32-bit integer.
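As a minimal sketch of the two readings above, the same 32 bits can be copied into a signed variable and read back. The helper name as_signed is made up for illustration, and the fixed-width types from <cstdint> are assumed to be available on the platform:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as a signed value.
// std::memcpy copies the bytes unchanged, so no arithmetic conversion happens.
std::int32_t as_signed(std::uint32_t bits) {
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

With this helper, as_signed(0x80000000u) is the most negative 32-bit value, -2147483648, while the unsigned argument itself is 2147483648.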
EDIT:
Note that, strictly speaking, signed integer overflow is undefined behavior in C/C++. GCC's behavior in this respect is not completely consistent:
num = num << 1; or num <<= 1; usually behaves as described above.
num += num; or num *= 2; may actually go into an infinite loop on GCC.
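If the goal is only to watch the bit travel, one way to stay clear of the undefined behavior mentioned above is to do the shifting in an unsigned type, where a bit shifted off the top is simply discarded. A minimal sketch (shift_until_zero is a name invented here, not part of the question):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: collect the values produced by repeatedly shifting
// a 32-bit unsigned value left by one bit until it becomes zero.
std::vector<std::uint32_t> shift_until_zero(std::uint32_t start) {
    std::vector<std::uint32_t> values;
    for (std::uint32_t v = start; v != 0; v <<= 1) {
        values.push_back(v);
    }
    return values;
}
```

Starting from 2, this yields 31 values, the last of which is 2147483648; the next shift produces 0 and ends the loop.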

Good question! The answer is rather simple though.
The maximum value of a 32-bit signed int is 2^31-1. The 31 (not 32) is there for a reason: the most significant bit of the integer is used to determine whether the number is positive or negative.
If you keep shifting the bit to the left, you'll eventually hit this bit and the value turns negative.
More information about this: http://en.wikipedia.org/wiki/Signed_number_representations

As soon as the 1 bit reaches the sign bit (the most significant bit) of a signed integer, the value turns negative.

Related

How does the binary left shift behave after 30 shifts in C++?

In C++ an int is 4 bytes, which means it can store 32 bits. Then how come
int i = 1;
i = i<<32;
cout<<i<<endl;
gives me the following warning:
main.cpp:7:5: warning: shift count >= width of type
[-Wshift-count-overflow]
i <<= 32;
whereas
int i = 1;
i = i<<31;
cout<<i<<endl;
gives me
./main
-2147483648
and
int i = 1;
i = i<<30;
cout<<i<<endl;
gives me
./main
1073741824
what is happening?
Let's represent a number in a binary format and see what a left shift does.
for example
if i = 5 // binary 101
i<<1 becomes 10 // binary 1010
i<<2 becomes 20 // binary 10100
and so on
Similarly
if i = 1 // binary 1
i<<1 becomes 2 // binary 10
i<<2 becomes 4 // binary 100
i<<n becomes 2^n // binary 1000...n times
i<<30 becomes 2^30 // binary 1000000000000000000000000000000
Notice that 2^n requires n+1 bits to store, which explains your first warning. 2^32 would need 33 bits, but int is 32 bits wide here, so the 1 would be shifted out of the type entirely. Shifting by a count greater than or equal to the width of the type is undefined behavior in C++, which is why the compiler warns about it.
Now note that 2^30 occupies 31 bits, which is the number of bits available for the value of an int, since the 32nd bit is the sign bit (it distinguishes negative from positive numbers).
So, when you do i<<31, the 1 lands in the sign bit and we get a negative value.
Negative numbers in C++ are represented using two's complement on essentially all current platforms. In a 32-bit two's complement value, the bit pattern with only the top bit set (0x80000000) represents -2147483648, which is what you see.
Finally, i<<30 when i==1 is just 2^30, or 1073741824.
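The three cases in the question can be sketched with a small guard around the shift. This is only an illustration: shl_or_zero is a hypothetical helper, and it works on an unsigned 32-bit value so that moving a 1 into the top bit is not an overflow:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: left shift that treats an oversized count as
// "every bit shifted out" instead of invoking undefined behavior.
std::uint32_t shl_or_zero(std::uint32_t value, unsigned count) {
    constexpr unsigned width = std::numeric_limits<std::uint32_t>::digits; // 32
    if (count >= width) {
        return 0; // the shift itself is never executed for such counts
    }
    return static_cast<std::uint32_t>(value << count);
}
```

Here shl_or_zero(1, 30) is 1073741824, shl_or_zero(1, 31) is 2147483648 (the pattern that a signed int shows as -2147483648), and shl_or_zero(1, 32) is 0 by the helper's own convention.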

Why does masking a negative number produce a positive number?

In C++, I have the following code:
int x = -3;
x &= 0xffff;
cout << x;
This produces
65533
But if I remove the negative, so I have this:
int x = 3;
x &= 0xffff;
cout << x;
I simply get 3 as a result
Why does the first result not produce a negative number? I would expect that -3 would be sign extended to 16 bits, which should still give a twos complement negative number, considering all those extended bits would be 1. Consequently the most significant bit would be 1 too.
It looks like your system uses 32-bit ints with two's complement representation of negatives.
The constant 0xFFFF covers the least significant two bytes; the upper two bytes are zero.
The value of -3 is 0xFFFFFFFD, so masking it with 0x0000FFFF you get 0x0000FFFD, or 65533 in decimal.
Positive 3 is 0x00000003, so masking with 0x0000FFFF gives you 3 back.
You would get the result that you expect if you specify 16-bit data type, e.g.
int16_t x = -3;
x &= 0xffff;
cout << x;
In your case int is more than 2 bytes. You are probably running on a modern CPU, where an int is usually 4 bytes (32 bits) these days.
If you look at how the system stores negative numbers, you will see that it uses the two's complement form. If you keep only the last 2 bytes, as your mask 0xFFFF does, you get only part of that representation.
Your 2 options:
use short instead of int. It is usually half the size of an int, i.e. only 2 bytes
use a bigger mask like 0xFFFFFFFF so that it covers all the bits of your integer
NOTE: I say "usually" because the number of bits in int and short depends on your CPU and compiler.
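If the 16-bit result has to stay in a wider int, the sign extension the question expected can be done by hand. A minimal sketch, assuming 32-bit two's complement storage via std::int32_t; mask_low16 and sign_extend16 are illustrative names, not standard functions:

```cpp
#include <cstdint>

// What the question's code does: keep the low 16 bits, zero the rest.
std::int32_t mask_low16(std::int32_t x) {
    return x & 0xFFFF;
}

// Hypothetical helper: keep the low 16 bits and treat bit 15 as a sign bit.
// XOR with 0x8000 flips bit 15; subtracting 0x8000 then restores the value
// with the correct sign.
std::int32_t sign_extend16(std::int32_t x) {
    std::int32_t low = x & 0xFFFF;
    return (low ^ 0x8000) - 0x8000;
}
```

With these, mask_low16(-3) is 65533 as in the question, while sign_extend16(-3) and sign_extend16(65533) are both -3.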

Shifting syntax error

I have a byte array:
byte data[2]
I want to keep the 7 least significant bits from the first and the 3 most significant bits from the second.
I do this:
unsigned int the=((data[0]<<8 | data[1])<<1)>>6;
Can you give me a hint why this does not work?
If I do it in different lines it works fine.
Can you give me a hint why this does not work?
Hint:
You have two bytes and want to preserve 7 less significant bits from the first and the 3 most significant bits from the second:
data[0]: -xxxxxxx data[1]: xxx-----
-'s represent bits to remove, x's represent bits to preserve.
After this
(data[0]<<8 | data[1])<<1
you have:
the: 00000000 0000000- xxxxxxxx xx-----0
Then you make >>6 and result is:
the: 00000000 00000000 00000-xx xxxxxxxx
See, you did not remove high bit from data[0].
Keep the 7 less significant bits from the first and the 3 most significant bits from the second.
Assuming the 10 bits to be preserved should be the LSB of the unsigned int value, and should be contiguous, and that the 3 bits should be the LSB of the result, this should do the job:
unsigned int value = ((data[0] & 0x7F) << 3) | ((data[1] & 0xE0) >> 5);
You might not need all the masking operands; it depends in part on the definition of byte (probably unsigned char, or perhaps plain char on a machine where char is unsigned), but what's written should work anywhere (16-bit, 32-bit or 64-bit int; signed or unsigned 8-bit (or 16-bit, or 32-bit, or 64-bit) values for byte).
Your code does not remove the high bit from data[0] at any point — unless, perhaps, you're on a platform where unsigned int is a 16-bit value, but if that's the case, it is unusual enough these days to warrant a comment.
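To see the difference concretely, both expressions can be wrapped in small functions and compared. This is a sketch under assumptions: byte is taken to be std::uint8_t, int is assumed to be at least 32 bits, and the function names are invented for the comparison:

```cpp
#include <cstdint>

// The expression from the question: the top bit of `first` is never cleared.
unsigned int combine_original(std::uint8_t first, std::uint8_t second) {
    return static_cast<unsigned int>(((first << 8 | second) << 1) >> 6);
}

// The masked expression from the answer: 7 low bits of `first`, followed by
// the 3 high bits of `second`.
unsigned int combine_masked(std::uint8_t first, std::uint8_t second) {
    return static_cast<unsigned int>(((first & 0x7F) << 3) | ((second & 0xE0) >> 5));
}
```

The two agree whenever the top bit of the first byte is clear, e.g. both give 1023 for (0x7F, 0xE0). With (0x80, 0x00) the original yields 1024 because the unwanted bit survives, while the masked version yields 0.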

Negation of -2147483648 not possible in C/C++?

#include <iostream>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int num=-2147483648;
int positivenum=-num;
int absval=abs(num);
std::cout<<positivenum<<"\n";
std::cout<<absval<<"\n";
return 0;
}
Hi I am quite curious why the output of the above code is
-2147483648
-2147483648
Now I know that -2147483648 is the smallest representable number among signed ints (assuming an int is 32 bits). I would have assumed that one would get garbage answers only after going below this number. But in this case, +2147483648 IS covered by the 32 bit system of integers. So why the negative answer in both cases?
But in this case, +2147483648 IS covered by the 32 bit system of integers.
Not quite correct. It only goes up to +2147483647. So your assumption isn't right.
Negating -2147483648 will indeed produce 2147483648, but it will overflow back to -2147483648.
Furthermore, signed integer overflow is technically undefined behavior.
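One hedged way to sidestep the overflow is to negate in a wider type, where +2147483648 is representable. A minimal sketch, assuming the fixed-width types from <cstdint>; negate_wide is a name made up for this example:

```cpp
#include <cstdint>

// Hypothetical helper: negate a 32-bit value in a 64-bit type, where the
// negation of the 32-bit minimum fits.
std::int64_t negate_wide(std::int32_t value) {
    return -static_cast<std::int64_t>(value);
}
```

Called with the 32-bit minimum (INT32_MIN), this returns 2147483648, which cannot be stored back into a 32-bit int.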
The value -(-2147483648) is not possible in 32-bit signed int. The range of signed 32-bit int is –2147483648 to 2147483647
Ahhh, but it's not... remember 0; the largest signed value is actually 2147483647.
Because the 2's complement representation of signed integers isn't symmetric and the minimum 32-bit signed integer is -2147483648 while the maximum is +2147483647. That -2147483648 is its own counterpart just as 0 is (in the 2's complement representation there's only one 0, there're no distinct +0 and -0).
Here's some explanation.
A negative number -X, when represented as an N-bit 2's complement value, is effectively represented as the unsigned number equal to 2^N - X. So, for 32-bit integers:
if X = 1, then -X = 2^32 - 1 = 4294967295
if X = 2147483647, then -X = 2^32 - 2147483647 = 2147483649
if X = 2147483648, then -X = 2^32 - 2147483648 = 2147483648
if X = -2147483648, then -X = 2^32 + 2147483648 = 2147483648 (because we only keep the low 32 bits)
So, -2147483648 = +2147483648. Welcome to the world of 2's complement values.
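The table above can be reproduced with a conversion to an unsigned 32-bit type, which C++ defines as reduction modulo 2^32. A minimal sketch; stored_bits is a hypothetical name:

```cpp
#include <cstdint>

// Hypothetical helper: the 32-bit pattern that stores `x`, read as an
// unsigned number. Conversion to an unsigned type reduces modulo 2^32.
std::uint32_t stored_bits(std::int64_t x) {
    return static_cast<std::uint32_t>(x);
}
```

For example, stored_bits(-1) is 4294967295, and both stored_bits(-2147483648LL) and stored_bits(2147483648LL) are 2147483648, which is the coincidence described above.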
The previous answers have all pointed out that the result is UB (undefined behavior) because 2147483648 is not a valid int32_t value. And we all know UB means anything can happen, including having demons fly out of your nose. The question is: why does the program print a negative value, which seems like the worst value it could have chosen at random?
I'll try to justify it on a two's complement system. Negation is conceptually a two-step operation. One way of implementing negation, i.e. int32_t positivenum = -num, is to do a bit inversion followed by adding 1, i.e. int32_t positivenum = ~num + 1, where ~ is the bitwise NOT operator and the +1 fixes the off-by-one error. For example, the negation of 0x00000000 is 0xFFFFFFFF + 1, which is 0x00000000 (after rollover, which is what most CPUs do). You can verify that this works for most integers... except for -2147483648, which is stored as 0x80000000 in two's complement. When you invert and add one, you get
- (min) = -(0x80000000)
= ~(0x80000000) + 1
= 0x7FFFFFFF + 1
= 0x80000000
= min
So magically, the unary operator - operating on min gives you back min!
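The invert-and-add-one step can be tried on raw bit patterns without touching signed overflow, since unsigned arithmetic is defined to wrap. A minimal sketch; negate_bits is an illustrative name:

```cpp
#include <cstdint>

// Hypothetical helper: two's complement negation on the raw bit pattern.
// Unsigned arithmetic wraps around, so the "+ 1" rollover is well defined.
std::uint32_t negate_bits(std::uint32_t bits) {
    return static_cast<std::uint32_t>(~bits + 1u);
}
```

negate_bits(1) gives 0xFFFFFFFF (the pattern for -1), negate_bits(0) gives 0, and negate_bits(0x80000000u) gives back 0x80000000.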
One thing that is not obvious is that arithmetic on a two's complement CPU has no concept of positive or negative numbers! It treats all numbers as unsigned under the hood. There is just one adder circuit and one multiplier circuit. The adder circuit works for positive and negative numbers, and so does the multiplier circuit.
Example: -1 * -1
= -1 * -1
= (cast both to uint32_t)
= 0xFFFFFFFF * 0xFFFFFFFF
= 0xFFFFFFFE00000001 // if you compute the result to 64-bit precision
= 0x00000001 // after you truncate to 32 bit precision
= 1
Signed versus unsigned mostly matters for comparisons, like < or >, and for a few other operations such as division and right shift.
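The -1 * -1 example can be checked with unsigned types, which is a sketch of the idea rather than a description of any particular CPU. The function names are invented here:

```cpp
#include <cstdint>

// Full 64-bit product of two 32-bit patterns.
std::uint64_t wide_product(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint64_t>(a) * b;
}

// The low 32 bits of that product, which is what a 32-bit multiply keeps.
std::uint32_t low32_product(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(wide_product(a, b));
}
```

With a = b = 0xFFFFFFFF (the pattern for -1), wide_product gives 0xFFFFFFFE00000001 and low32_product gives 1, matching the derivation above.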

Question on Infinite Loop in C++

This is kind of a curiosity.
I'm studying C++. I was asked to reproduce an infinite loop, for example one that prints a series of powers:
#include <iostream>
int main()
{
    int powerOfTwo = 1;
    while (true)
    {
        powerOfTwo *= 2;
        std::cout << powerOfTwo << std::endl;
    }
}
The result kind of troubled me. With the Python interpreter, for example, I used to get an effectively infinite loop printing a power of two on each iteration (until the IDE stopped it for exceeding the iteration limit, of course). With this C++ program I instead get a series of 0s. But if I change this to a finite loop, that is, if I only change the condition to:
(powerOfTwo <= 100)
the code works well, printing 2, 4, 8, ..., 128.
So my question is: why does an infinite loop in C++ work this way? Why does it seem not to evaluate the while body at all?
Edit: I'm using Code::Blocks and compiling with g++.
In the infinite loop case you see 0 because the int overflows after 32 iterations to 0 and 0*2 == 0.
Look at the first few lines of output. http://ideone.com/zESrn
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
-2147483648
0
0
0
In Python, integers can hold an arbitrary number of digits. C++ does not work this way: its integers have a fixed width (normally 32 bits for int, but this depends on the platform). Multiplication by 2 is typically implemented by shifting the integer one bit to the left. What is happening is that you initially have only the first bit in the integer set:
powerOfTwo = 1; // 0x00000001 = 0b00000000000000000000000000000001
After your loop iterates 31 times, the bit will have shifted to the very last position in the integer.
powerOfTwo = -2147483648; // 0x80000000 = 0b10000000000000000000000000000000
On the next multiplication by two, the bit is shifted all the way out of the integer (since it has a fixed width), and you end up with zero.
powerOfTwo = 0; // 0x00000000 = 0b00000000000000000000000000000000
From then on, you are stuck, since 0 * 2 is always 0. If you watch your program in "slow motion", you would see an initial burst of powers of 2, followed by an infinite loop of zeroes.
In Python, on the other hand, your code would work as expected - Python integers can expand to hold any arbitrary number of digits, so your single set bit will never "shift off the end" of the integer. The number will simply keep expanding so that the bit is never lost, and you will never wrap back around and get trapped at zero.
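The "bit shifts off the end" effect can be counted for different widths. The sketch below uses unsigned types on purpose, because their wraparound is defined by the language, whereas the signed overflow in the question is not; doublings_until_zero is a hypothetical name:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper: start at 1 and keep doubling until the single set
// bit falls off the top of the type; return how many doublings that took.
template <typename UInt>
int doublings_until_zero() {
    static_assert(std::is_unsigned<UInt>::value, "use an unsigned type");
    UInt value = 1;
    int count = 0;
    while (value != 0) {
        value = static_cast<UInt>(value * 2u); // wraps modulo 2^width
        ++count;
    }
    return count;
}
```

For std::uint32_t the answer is 32, matching the 32 iterations mentioned earlier; for std::uint8_t it is 8 and for std::uint64_t it is 64.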
Actually it prints powers of two until powerOfTwo gets overflowed and becomes 0. Then 0*2 = 0 and so on. http://ideone.com/XUuHS
In C++ an int has a limited size, so the program keeps computing even after the value overflows,
and the while (true) condition is what keeps the loop going.
In C++ you will cause an overflow pretty soon, your int variable won't be able to handle big numbers.
int: 4 bytes signed can handle the range –2,147,483,648 to 2,147,483,647
So as @freerider said, your compiler may be optimizing the code for you.
I guess you know the data-type concepts in C and C++: you are declaring powerOfTwo as an integer,
so it follows the range of an integer. If you want a continuous loop, you could use char as the data type and, with a data conversion, get an infinite loop for your function.
Carefully examine the output of the program. You don't really get an infinite series of zeroes. You get 32 numbers, followed by an infinite series of zeroes.
The thirty-two numbers are the first thirty-two powers of two:
1
2
4
8
...
(2 raised to the 30th)
(2 raised to the 31st)
0
0
0
The problem is how C represents numbers, as finite quantities. Since your mathematical quantity is no longer representable in the C int, some other number ends up in its place. Strictly speaking the overflow is undefined behavior, but in practice here you get the true value modulo 2^32. And 2^32 mod 2^32 is zero, so there you are.