Right shift operator for signed numbers [duplicate] - c++

This question already has answers here:
Right shifting negative numbers in C
How are negative numbers represented in 32-bit signed integer?
I am reading about shift operators in C.
Right shifting by n bits divides by 2 raised to the power n. Shifting signed values may fail, because for negative values the result never gets past -1: -5 >> 3 is -1 and not 0, as -5/8 would be.
My question is: why may shifting signed values fail?
Why is the value of -5 >> 3 equal to -1 and not zero?
Kindly explain.

It is simply implementation-defined:
From 5.8 Shift operators
The operands shall be of integral or unscoped enumeration type and
integral promotions are performed. The type of the result is that of
the promoted left operand. The behavior is undefined if the right
operand is negative, or greater than or equal to the length in bits of
the promoted left operand
[...]
If E1 has a signed type and a negative value, the resulting value is implementation-defined.

Shifting signed integers is implementation-defined, but if the architecture you're using has an arithmetic shift instruction, you can pretty reliably guess that it will be used.
It's because of how negative numbers are stored in the computer, a scheme called two's complement. To negate a two's-complement number, you NOT its bits and add 1. For example, with an 8-bit integer 00011010 (26), first you NOT it to get 11100101, then you add 1 and get 11100110 (-26). The problem comes from that most significant bit being set. If the shift inserted 0s at the left, the number would become positive, but if it inserts 1s, the lowest possible result is 11111111, which is -1. This is how arithmetic shifts work: when you shift, the computer inserts copies of the leftmost bit.
So to be explicit, this is what is happening (using 8-bit integers because it's easier, and the size is arbitrary in this case): 11111011 gets shifted 3 to the right (so 011 falls off), and since the most significant bit is set, three 1s are inserted at the top, giving 11111111, which is -1.
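As a minimal sketch of that behavior, assuming a two's-complement machine with an arithmetic right shift (the common case; the result for negative operands is implementation-defined, so this output is not guaranteed by the standard):

#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    std::int8_t x = -5;  // stored as 11111011 in two's complement
    std::cout << std::bitset<8>(static_cast<std::uint8_t>(x)) << "\n";       // 11111011
    // An arithmetic shift copies the sign bit into the vacated positions,
    // so repeated shifting of a negative value bottoms out at -1, never 0.
    std::cout << std::bitset<8>(static_cast<std::uint8_t>(x >> 3)) << "\n";  // 11111111
    std::cout << (x >> 3) << "\n";                                           // -1
}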

Related

How typecast works on initialization "unsigned int i = -100"? [duplicate]

I was curious to know what would happen if I assign a negative value to an unsigned variable.
The code will look somewhat like this.
unsigned int nVal = 0;
nVal = -5;
It didn't give me any compiler error. When I ran the program, nVal was assigned a strange value! Could it be that some 2's complement value gets assigned to nVal?
For the official answer, see Section 4.7 [conv.integral]:
"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]"
This essentially means that even if the underlying architecture stores signed values in a representation other than two's complement (such as sign-magnitude or one's complement), the conversion to unsigned must behave as if the representation were two's complement.
It will assign the bit pattern representing -5 (in 2's complement) to the unsigned int, which will be a large unsigned value. For 32-bit ints this will be 2^32 - 5, or 4294967291.
You're right, the signed integer is stored in 2's complement form, and the unsigned integer is stored in the unsigned binary representation. C (and C++) doesn't distinguish between the two, so the value you end up with is simply the unsigned binary value of the 2's complement binary representation.
It will show as a positive integer with the value of the maximum unsigned integer minus 4 (the exact value depends on computer architecture and compiler).
BTW, you can check this by writing a simple C++ "hello world"-type program and seeing for yourself.
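For instance, a minimal check along the lines of the snippet from the question (the printed value assumes a 32-bit unsigned int):

#include <iostream>

int main() {
    unsigned int nVal = 0;
    nVal = -5;                  // converted modulo 2^32: 4294967296 - 5
    std::cout << nVal << "\n";  // prints 4294967291
}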
Yes, you're correct. The actual value assigned is all bits set except the third bit. -1 is all bits set (hex: 0xFFFFFFFF), -2 is all bits set except the first, and so on. What you would see is probably the hex value 0xFFFFFFFB, which in decimal corresponds to 4294967291.
When you assign a negative value to an unsigned variable, it is processed with the 2's complement method: flip all 0s to 1s and all 1s to 0s, then add 1. In your case you are dealing with an int, which is 4 bytes (32 bits), so the 2's complement method is applied to a 32-bit number, which sets the high-order bits. For example:
┌─[student#pc]─[~]
└──╼ $pcalc 0y00000000000000000000000000000101 # 5 in binary
5 0x5 0y101
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 # flip all bits
4294967290 0xfffffffa 0y11111111111111111111111111111010
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 + 1 # add 1 to that flipped binary
4294967291 0xfffffffb 0y11111111111111111111111111111011
On the Windows and Ubuntu Linux systems I have checked, assigning a negative number to an unsigned integer in C and C++ yields that number modulo UINT_MAX + 1, so -1 gives UINT_MAX, -5 gives UINT_MAX - 4, and so on.

C++ and unsigned types

I'm reading the C++ Primer 5th Edition, and I don't understand the following part:
In an unsigned type, all the bits represent the value. For example, an 8-bit
unsigned char can hold the values from 0 through 255 inclusive.
What does it mean with "all the bits represent the value"?
You should compare this to a signed type. In a signed value, one bit (the top bit) is used to indicate whether the value is positive or negative, while the rest of the bits are used to hold the value.
The value of an object of trivially copyable type is determined by some bits in it, while other bits do not affect its value. In the C++ standard, the bits that do not affect the value are called padding bits.
For example, consider a type with 8 bits where the last 4 bits are padding bits; then the objects represented by 00000000 and 00001111 have the same value and compare equal.
In reality, padding bits are often used for alignment and/or error detection.
With the above in mind, you can understand what the book is saying: there are no padding bits in an unsigned type. However, the statement is not strictly correct. In fact, the standard only guarantees that unsigned char (and signed char, char) has no padding bits. The following is a quote of the related part of the standard, [basic.fundamental]/1:
For narrow character types, all bits of the object representation participate in the value representation.
Also, the C11 standard 6.2.6.2/1 says
For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter).
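Padding bits inside a scalar type are hard to demonstrate on mainstream hardware, but padding bytes between struct members are the everyday analogue, and they show the same principle: bits that do not participate in the value. A hedged sketch (the layout assumption, 3 padding bytes after c, holds on typical ABIs but is not guaranteed):

#include <cstring>
#include <iostream>

struct S { char c; int i; };  // assumed: 3 padding bytes follow c

int main() {
    S a {};                            // zero-initialized, padding included
    S b {};
    std::memset(&a, 0xFF, sizeof a);   // scribble over everything, padding too
    a.c = 0;                           // restore the members' values
    a.i = 0;
    std::cout << (a.c == b.c && a.i == b.i) << "\n";            // 1: values are equal
    std::cout << (std::memcmp(&a, &b, sizeof a) == 0) << "\n";  // likely 0: padding differs
}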
It means that all 8 bits represent an actual value, while in signed char only 7 bits represent the actual value and the 8th bit (the most significant) represents the sign of that value, positive or negative (+/-).
For example, one byte contains 8 bits, and all 8 bits are used to count up from 0.
For unsigned, all bits zero = 00000000 means 0, 00000001 = 1, 00000010 = 2, 00000011 = 3, ... up to 11111111 = 255.
For a signed byte (or signed char), the leftmost bit means the sign and therefore cannot be used to count. (I am optically separating the leftmost bit!) 0 0000001 = 1 but 1 0000001 = -1, 0 0000010 = 2 and 1 0000010 = -2, and so on, up to 0 1111111 = 127 and 1 1111111 = -127. In this scheme (known as sign-magnitude), 1 0000000 would mean -0, which is useless/wasted, so it can be given another meaning such as -128; two's complement, which real hardware actually uses, assigns the negative patterns differently but keeps the same sign-bit idea.
There are other ways to encode the bits into numbers, and some computers start from the left instead of from the right. These details are hardware-specific and not relevant to understanding 'unsigned'; you only need to care about them when you want to mess with individual bits in your code (not recommended).
This is mostly a theoretical thing. On real hardware, the same holds for signed integers as well. Obviously, with signed integers, some of those values are negative.
Back to unsigned: what the text says is basically that the value of an unsigned number is the sum of 1<<i for every bit position i that is set, over the total number of bits. Importantly, not only do all bits contribute to the value, but every combination of bits forms a valid number. This is NOT the case for signed integers. Therefore, if you need a bitmask, it has to be an unsigned type of sufficient width, or you could run into invalid bit patterns.
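A minimal sketch of that sum (a C++14 binary literal is used for clarity):

#include <iostream>

int main() {
    // Every set bit i contributes 1<<i to an unsigned value, and every
    // bit pattern is a valid number -- which is why bitmasks should be unsigned.
    unsigned v = 0b1011;                                       // bits 0, 1 and 3 set
    std::cout << v << "\n";                                    // 11
    std::cout << ((1u << 0) + (1u << 1) + (1u << 3)) << "\n";  // 11 again, as a sum
}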

Bitwise NOT with boolean variable [duplicate]

This question already has answers here:
tilde operator returning -1, -2 instead of 0, 1 respectively
I'm a beginner with bitwise operators and the bool type. I might be wrong, but I thought the bool type is represented by 1 bit and can take values from {0, 1}. So I tried the NOT (~) operator with such a variable, and the results are weird to me.
E.g. for
bool x = 0;
cout << (~x);
I get -1. (I expected 1.) Can you please tell me where I'm wrong, and why only the ! operator reverses the value (from 0 to 1 and from 1 to 0)?
Most processors do not have a 1-bit-wide general purpose register, so when you use a bool in an arithmetic expression it is promoted to int, whatever size that is on your platform (32 bits on most Intel and ARM systems these days, perhaps 16 on some embedded systems). When you negate something that is all zeros, you get all 1s; in two's complement that evaluates to -1 in signed decimal. Long story short: in this expression your bool is really an int, and ~0 is -1.
The ! operator is a logical operator - hence 0 (false) is negated to 1 (true).
The ~ operator is a bitwise operator, hence every bit is negated. A bool, while notionally a single bit, is promoted to int in such expressions. Hence you are really negating 0...000, which gives 1...111, which is -1 (see two's complement).
The bool value x is implicitly converted to an int when used in the expression ~x. And the vast majority of computers use two's complement representation of signed integers, in which ~0 is equal to -1.
The ! operator is defined so that !x has type bool rather than int, so this issue doesn't occur.
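A minimal sketch contrasting the two operators (the -1 assumes the usual two's-complement representation):

#include <iostream>

int main() {
    bool x = false;
    std::cout << ~x << "\n";  // x promoted to int 0; bitwise NOT gives -1
    std::cout << !x << "\n";  // logical NOT keeps type bool: prints 1
}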

Bit shifting and assignment [duplicate]

This question already has answers here:
Why doesn't left bit-shift, "<<", for 32-bit integers work as expected when used more than 32 times?
This is sort of driving me crazy.
int a = 0xffffffff;
int b = 32;
cout << (a << b) << "\n";
cout << (0xffffffff << 32) << "\n";
My output is
-1
0
Why am I not getting
0
0
Undefined behavior occurs when you shift a value by a number of bits which is not less than its size (e.g., 32 or more bits for a 32-bit integer). You've just encountered an example of that undefined behavior.
The short answer is that, since you're using an implementation with 32-bit int, the language standard says that a 32-bit shift is undefined behavior. Both the C standard (section 6.5.7) and the C++ standard (section 5.8) say
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
But if you want to know why:
Many computers have an instruction that can shift the value in a register, but the hardware only looks at the bits of the shift amount that it actually needs. For instance, when shifting a 32-bit word, only 5 bits are needed to represent a shift of 0 ... 31, so the hardware may ignore the higher-order bits, and does on x86 machines (except the 8086). So that compiler implementations could just use the instruction without generating extra code to check whether the shift value is too big, the authors of the C Standard (many of whom represented compiler vendors) ruled that the result of shifting by larger amounts is undefined.
Your first shift is performed at run time, and it encounters this situation: only the low-order 5 bits of b are considered by your machine, and they are 0, so no shift happens. Your second shift is done at compile time, and the compiler calculates the value differently, actually performing the 32-bit shift.
If you want to shift by an amount that may be larger than the number of bits in the thing you're shifting, you need to check the range of the value yourself. One possible way to do that is
#define LEFT_SHIFT(a, b) ((b) >= CHAR_BIT * sizeof(a) ? 0 : (a) << (b))
(CHAR_BIT comes from <limits.h> / <climits>.)
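For example, a self-contained usage sketch repeating the macro from above (the printed values assume a 32-bit unsigned int):

#include <climits>
#include <iostream>

#define LEFT_SHIFT(a, b) ((b) >= CHAR_BIT * sizeof(a) ? 0 : (a) << (b))

int main() {
    unsigned int a = 0xffffffff;
    std::cout << LEFT_SHIFT(a, 32) << "\n";  // 0: the guard avoids undefined behavior
    std::cout << LEFT_SHIFT(a, 1) << "\n";   // 4294967294
}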
The C++ standard says:
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
GCC has no option to handle shifts by negative amounts or by amounts outside the width of the type predictably, or to trap on them; they are always treated as undefined.
So the behavior is not defined.

What does ~0 mean in this code?

What's the meaning of ~0 in this code?
Can somebody analyze this code for me?
unsigned int Order(unsigned int maxPeriod = ~0) const
{
    Point r = *this;
    unsigned int n = 0;
    while ( r.x_ != 0 && r.y_ != 0 )
    {
        ++n;
        r += *this;
        if ( n > maxPeriod ) break;
    }
    return n;
}
~0 is the bitwise complement of 0, which is a number with all bits filled. For an unsigned 32-bit int, that's 0xffffffff. The exact number of fs will depend on the size of the value that you assign ~0 to.
It's the one's complement, which inverts all bits.
~ 0101 => 1010
~ 0000 => 1111
~ 1111 => 0000
As others have mentioned, the ~ operator performs bitwise complement. However, the value that results from performing the operation on a signed quantity depends on the representation and is not portable.
In particular, the value of ~0 need not be -1, which is probably the value intended. Setting the default argument to
unsigned int maxPeriod = -1
would make maxPeriod contain the highest possible value (signed-to-unsigned conversion is defined as assignment modulo 2^n, where n is the number of bits used to represent the given unsigned type).
Also note that default arguments are not valid in C.
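A minimal sketch of the two portable spellings (the variable names are illustrative):

#include <climits>
#include <iostream>

int main() {
    unsigned int fromMinusOne = -1;  // well defined: -1 modulo 2^n
    unsigned int fromNotZero = ~0u;  // complement of unsigned 0: all bits set
    std::cout << (fromMinusOne == UINT_MAX) << "\n";  // 1
    std::cout << (fromNotZero == UINT_MAX) << "\n";   // 1
}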
It's a binary complement function.
Basically it means flip each bit.
It is the bitwise complement of 0 which would be, in this example, an int with all the bits set to 1. If sizeof(int) is 4, then the number is 0xffffffff.
Basically, it's saying that maxPeriod has a default value of UINT_MAX. Rather than writing it as UINT_MAX, the author used his knowledge of complements to calculate the value.
If you want to make the code a bit more readable in the future, include
#include <climits>
and change the declaration to read
unsigned int Order(unsigned int maxPeriod = UINT_MAX) const
(UINT_MAX is declared in <climits>; the <limits> header instead provides std::numeric_limits<unsigned int>::max().)
Now to explain why ~0 is UINT_MAX. We are dealing with an int, in which 0 is represented with all zero bits (00000000). Adding one gives (00000001), adding one more gives (00000010), and one more gives (00000011). One more addition gives (00000100), because the 1s carry.
For unsigned ints, if you repeat the process ad infinitum, eventually you have all one bits (11111111), and adding one more wraps around, setting all the bits back to zero. This means that all one bits in an unsigned number is the maximum value that data type (unsigned int in your case) can hold.
The "~" operation flips all bits from 0 to 1 or 1 to 0, flipping a zero integer (which has all zero bits) effectively gives you UINT_MAX. So he basically the previous coded opted to computer UINT_MAX instead of using the system defined copy located in #include <limits.h>
In the example it is probably an attempt to generate the UINT_MAX value. The technique is possibly flawed for the reasons already stated.
The expression does, however, have a legitimate use: generating a bit mask with all bits set using a literal constant that is type-width independent; but that is not how it is being used in your example.
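For instance, a sketch of that width-independent mask idiom (it relies on the modulo conversion described above; some compilers warn about the sign conversion):

#include <cstdint>
#include <iostream>

int main() {
    // ~0 is the int value -1; converting it to any unsigned type yields
    // all bits set in that type, so one literal serves every width.
    std::uint16_t mask16 = ~0;  // 0xffff
    std::uint64_t mask64 = ~0;  // 0xffffffffffffffff
    std::cout << std::hex << mask16 << "\n" << mask64 << "\n";
}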
As others have said, ~ is the bitwise complement operator (sometimes also referred to as bitwise not). It's a unary operator which means that it takes a single input.
Bitwise operators treat the input as a bit pattern and perform their respective operations on each individual bit, then return the resulting pattern. Applying the ~ operator to a bit pattern inverts each bit (each zero becomes a one, each one becomes a zero).
In the example you gave, the bit representation of the integer 0 is all zeros. Thus, ~0 produces a bit pattern of all ones. Converted to unsigned int, that all-ones pattern is assigned to maxPeriod, and it represents the highest value an unsigned int can store before wrapping around back to 0.