Weird behavior of right shift operator (1 >> 32)

Weird behavior of right shift operator (1 >> 32) - c++

I recently faced a strange behavior using the right-shift operator.
The following program:
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <stdint.h>
int foo(int a, int b)
{
return a >> b;
}
int bar(uint64_t a, int b)
{
return a >> b;
}
int main(int argc, char** argv)
{
std::cout << "foo(1, 32): " << foo(1, 32) << std::endl;
std::cout << "bar(1, 32): " << bar(1, 32) << std::endl;
std::cout << "1 >> 32: " << (1 >> 32) << std::endl; //warning here
std::cout << "(int)1 >> (int)32: " << ((int)1 >> (int)32) << std::endl; //warning here
return EXIT_SUCCESS;
}
Outputs:
foo(1, 32): 1 // Should be 0 (but I guess I'm missing something)
bar(1, 32): 0
1 >> 32: 0
(int)1 >> (int)32: 0
What happens with the foo() function ? I understand that the only difference between what it does and the last 2 lines, is that the last two lines are evaluated at compile time. And why does it "work" if I use a 64 bits integer ?
Any lights regarding this will be greatly appreciated !
Surely related, here is what g++ gives:
> g++ -o test test.cpp
test.cpp: In function 'int main(int, char**)':
test.cpp:20:36: warning: right shift count >= width of type
test.cpp:21:56: warning: right shift count >= width of type

It's likely the CPU is actually computing
a >> (b % 32)
in foo; meanwhile, the 1 >> 32 is a constant expression, so the compiler will fold the constant at compile-time, which somehow gives 0.
Since the standard (C++98 §5.8/1) states that
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
there is no contradiction having foo(1,32) and 1>>32 giving different results.
On the other hand, in bar you provided a 64-bit unsigned value, as 64 > 32 it is guaranteed the result must be 1 / 232 = 0. Nevertheless, if you write
bar(1, 64);
you may still get 1.
Edit: The logical right shift (SHR) behaves like a >> (b % 32/64) on x86/x86-64 (Intel #253667, Page 4-404):
The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.
However, on ARM (armv6&7, at least), the logical right-shift (LSR) is implemented as (ARMISA Page A2-6)
(bits(N), bit) LSR_C(bits(N) x, integer shift)
assert shift > 0;
extended_x = ZeroExtend(x, shift+N);
result = extended_x<shift+N-1:shift>;
carry_out = extended_x<shift-1>;
return (result, carry_out);
where (ARMISA Page AppxB-13)
ZeroExtend(x,i) = Replicate('0', i-Len(x)) : x
This guarantees a right shift of ≥32 will produce zero. For example, when this code is run on the iPhone, foo(1,32) will give 0.
These shows shifting a 32-bit integer by ≥32 is non-portable.

OK. So it's in 5.8.1:
The operands shall be of integral or enumeration type and integral promotions are performed. The type of the result is
that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to
the length in bits of the promoted left operand.
So you have an Undefined Behaviour(tm).

What happens in foo is that the shift width is greater than or equal to the size of the data being shifted. In the C99 standard that results in undefined behaviour. It's probably the same in whatever C++ standard MS VC++ is built to.
The reason for this is to allow compiler designers to take advantage of any CPU hardware support for shifts. For example, the i386 architecture has an instruction to shift a 32 bit word by a number of bits, but the number of bits is defined in a field in the instruction that is 5 bits wide. Most likely, your compiler is generating the instruction by taking your bit shift amount and masking it with 0x1F to get the bit shift in the instruction. This means that shifting by 32 is the same as shifting by 0.

I compiled it on 32 bit windows using VC9 compiler. It gave me the following warning. Since sizeof(int) is 4 bytes on my system compiler is indicating that right shifting by 32 bits results in undefined behavior. Since it is undefined, you can not predict the result. Just for checking I right shifted with 31 bits and all the warnings disappeared and the result was also as expected (i.e. 0).

I suppose the reason is that int type holds 32-bits (for most systems), but one bit is used for sign as it is signed type. So only 31 bits are used for actual value.

The warning says it all!
But in fairness I got bitten by the same error once.
int a = 1;
cout << ( a >> 32);
is completely undefined. In fact the compiler generally gives a different results than the runtime in my experience. What I mean by this is if the compiler can see to evaluate the shift expression at run time it may give you a different result to the expression evaluated at runtime.

foo(1,32) performs a rotate-shit, so bits that should disappear on the right reappear on the left. If you do it 32 times, the single bit set to 1 is back to its original position.
bar(1,32) is the same, but the bit is in the 64-32+1=33th bit, which is above the representable numbers for a 32-bit int. Only the 32 lowest bit are taken, and they are all 0's.
1 >> 32 is performed by the compiler. No idea why gcc uses a non-rotating shift here and not in the generated code.
Same thing for ((int)1 >> (int)32)

Related

Unsigned integer wrapping, different behaviour

Here is a code in C:
#include <stdio.h>
int main()
{
printf("Int width = %lu\n", sizeof(unsigned int)); // Gives 32 bits on my computer
unsigned int n = 32;
unsigned int y = ((unsigned int)1 << n) - 1; // This is line 8
printf("%u\n", y);
unsigned int x = ((unsigned int)1 << 32) - 1; // This is line 11
printf("%u", x);
return 0;
}
It outputs:
main.c:11:39: warning: left shift count >= width of type [-Wshift-count-overflow]
Int width = 4
0
4 294 967 295 (= 2^32-1)
The warning for the line 11 is expected as explained in these links: wiki.sei.cmu.edu and https://stackoverflow.com/a/11270515
left-shift operations [...] if the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
There is no warning for the line 8, but I was expected the same warning as for line 11. Futhermore, the results are entirely different ! What do I miss ?
This behaviour is similar for C++:
#include <iostream>
using namespace std;
int main()
{
cout << "Int width = " << sizeof(uint64_t) << "\n"; // Gives 64 bits on my computer
int n = 64;
uint64_t y = ((uint64_t)1 << n) - 1; // This is line 8
cout << "y = " << y;
uint64_t x = ((uint64_t)1 << 64) - 1; // This is line 11
cout << "\nx = " << x;
return 0;
}
Which outputs:
main.cpp:11:34: warning: left shift count >= width of type [-Wshift-count-overflow]
Int width = 8
y = 0
x = 18 446 744 073 709 551 615 (= 2^64-1)
I used the onlineGBD for C compiler for the C code and onlineGBD for C++ compiler.
Here are the link to the code: C code and C++ code.

For line 8, the compiler has to prove that in ((unsigned int)1 << n), n is 32 or more. That can be difficult since n is not const so it's value could be changed. The compiler would have to do more static analysis to give you the warning.
On the other hand, with (unsigned int)1 << 32) the compiler knows that the value is 32 or more and can easily warn. This requies almost no time to detect, since the type and the value to shift by are both compile time "literals".
If you switch to using const int n = 64; in your C++ code, then you will get an error at OnlineGBD. You can see that here. I tried that with the C version but it still doesn't warn.

There is no warning for the line 8, but I was expected the same warning as for line 11.
The C standard does not require a compiler to diagnose an excessive shift amount. (Generally, it does not require C implementations to diagnose errors other than those explicitly listed in “Constraints” clauses.)
The compiler you using diagnoses the error with the integer constant expression (32), as this is easy. It does not diagnose the error with the variable n, as that involves more work and the compiler authors have not implemented it.
Futhermore, the results are entirely different !
With the integer constant expression, the compiler evaluates the shift during compilation, using whatever software is built into it. That apparently produces zero for (unsigned int) 1 << 32. With the variable, the compiler generates an instruction to perform the shift during program execution. That instruction likely uses only the low five bits of the right operand, so an operand of 32 (1000002) yields of shift of zero bits, so shifting (unsigned int) 1 produces one.
Both behaviors are allowed by the C standard.

It's likely because n is a variable, the compiler doesn't seem to be verifying it, as it doesn't know its value it doesn't issue a warning, if you turn it into a constant i.e const int n = 64;, the warning is issued.
https://godbolt.org/z/4s5jz6
As for the results, undefined behavior is what it is, for the sake of curiosity you can analyze a particular case and try to figure out what the compiler did, but the results can't be reasoned with because there is no correct result.
Even the warnings are optional, gcc is nice enough to to warn you when a constant or constant literal is used but it didn't have to.

Undefined behaviour (UB) means undefined behaviour. Literally anything can happen. Compilers are not required to tell you of UB, but are permitted to.
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
So max unsigned -1, or zero, or format your SSD while installing viruses on your cloud stored files and emailing your browser history to your contact list are all permitted things for your compiler to convert a shift of 32 bits of a 32 bit integer.
As a Quality of Implementation issue, the last is rare.
The compiler is optimizing (1<<x)-1 as x 1 bits, not even doing a shift operation, in one case. Within the bounds of defined shift operations, this is equivalent, so this is a valid optimization. So when you pass 32, it writes 0xffffffff.
In the other case, it is maybe setting the nth bit, reading the low 5 bits of the shift operation to see which to set. Also valid within the range of defined behaviour, utterly different.
Welcome to UB.
I would expect further changes based on optimization level.

Weird result in left shift (1ull << s == 1 if s == 64) [duplicate]

This question already has answers here:
Why doesn't left bit-shift, "<<", for 32-bit integers work as expected when used more than 32 times?
(10 answers)
Closed 3 years ago.
Why is the result of
uint32_t s = 64;
uint64_t val = 1ull << s;
and
uint64_t s = 64;
uint64_t val = 1ull << s;
1?
But
uint64_t val = 1ull << 0x40;
gets optimized to 0?
I really don't understand why it equals 1. It does no matter whether I use my VC++ or g++ compiler.
And how can I ensure that 1ull << s equals 0 when s equals 64, what's in my opinion is the correct result? I also need the imo. correct result in my program.

This is because on x64, the instruction SHL (when operating on a 64-bit source/destination operand) only uses the bottom 6 bits of the shift amount. In effect, you are shifting by 0 bits.
From the "Intel 64 and IA-32 Architecture Software Developer's Manual" (can be downloaded from Intel in PDF form, which is hard to link into,) under the entry for "SAL/SAR/SHL/SHR - Shift" instructions:
The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used).
As commented below, it is also an "Undefined Behavior" in the C++ language to shift an integer by more bits than its size. (Thanks to #sgarizvi for the reference.) The C++ standard, under Section 8.5.7 (Shift Operators) states that:
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand...
That's why the compiler is producing code that gives different results under different conditions (constant or variable shift count, optimized or not, etc.)
About how to "fix" it, I have no clever tricks. You can do something like this:
template <typename IntT>
IntT ShiftLeft (IntT x, unsigned count) {
// Optional check; depends on how much you want to embrace C++!
static_assert(std::is_integral_v<IntT>, "This shift only works for integral types.");
if (count < sizeof(IntT) * 8)
return x << count;
else
return 0;
}
This code would work for signed and unsigned integer types (left-shift is the same for signed and unsigned values.)

Unexpected output with left shift operator C++

Here is the code which is giving me the unexpected answer
#include<bits/stdc++.h>
using namespace std;
int main()
{
cout<<(1<<50);
}
The answer I get is 0.
But if I change the line to
cout<<pow(2, 50);
I get the right answer.
Could someone explain me the reason.

Assuming your compiler treats the constant 1 as a 32bit integer, you shifted it so far to the left, that only zeroes remain in the 32bit you have. 50 is larger than 32.

Try this (run it):
#include <iostream>
int main()
{
std::int64_t i { 1 }; // wide enough to allow 50 bits shift
std::cout << std::hex << ( i << 50 ); // should display 4000000000000
return 0;
}

From the C++ Standard (5.8 Shift operators)
1 The shift operators << and >> group left-to-right.
shift-expression:
additive-expression
shift-expression << additive-expression
shift-expression >> additive-expression
The operands shall be of integral or unscoped enumeration type and
integral promotions are performed. The type of the result is that of
the promoted left operand. The behavior is undefined if the right
operand is negative, or greater than or equal to the length in
bits of the promoted left operand.
Take into account that the behavior also will be undefined if the right operand is not greater or equal to the length in bits of the left operand but can touch the sign bit because the integer literal 1 has the type signed int.
As for this function call pow(2, 50) then there is used some algorithm that calculates the power.

You shift the "1" out of the 32 bit field, so zero is the result. Pow uses float representation where 2^50 can be handled.
EDIT
Without any modifications like "1LL" or "1ULL" (which generate long 64bit numbers), an integer number is usually handled as 32 bit on a x64 or x86 architectures. You can use
cout << (1ULL << 50);
or
cout << ((long long)1 << 50);
which should to it.

It's exactly what you're doing. You're shifting a single bit by 50 position in a portion of memory that's 32 bit... What's happening according to you? The bit goes somewhere else but it's not inside the memory portion of the integer anymore. pow(2, 50) performs a double casting, so you're not shifting bits anymore.
Also, never use #include<bits/stdc++.h>. It's not standard, and it's slow. You should use it only in precompiled headers but I'd avoid this also in that cases.

cout<<(1<<50);
Your code treats 1 as an int, so overflows. Instead, try:
cout << (1ULL << 50);

Why is the binary equivalent calculation getting incorrect?

I wrote the following program to output the binary equivalent of a integer taking(I checked that int on my system is of 4 bytes) it is of 4 bytes. But the output doesn't come the right. The code is:
#include<iostream>
#include<iomanip>
using namespace std;
void printBinary(int k){
for(int i = 0; i <= 31; i++){
if(k & ((1 << 31) >> i))
cout << "1";
else
cout << "0";
}
}
int main(){
printBinary(12);
}
Where am I getting it wrong?

The problem is in 1<<31. Because 231 cannot be represented with a 32-bit signed integer (range −231 to 231 − 1), the result is undefined [1].
The fix is easy: 1U<<31.
[1]: The behavior is implementation-defined since C++14.

This expression is incorrect:
if(k & ((1<<31)>>i))
int is a signed type, so when you shift 1 31 times, it becomes the sign bit on your system. After that, shifting the result right i times sign-extends the number, meaning that the top bits remain 1s. You end up with a sequence that looks like this:
80000000 // 10000...00
C0000000 // 11000...00
E0000000 // 11100...00
F0000000 // 11110...00
F8000000
FC000000
...
FFFFFFF8
FFFFFFFC
FFFFFFFE // 11111..10
FFFFFFFF // 11111..11
To fix this, replace the expression with 1 & (k>>(31-i)). This way you would avoid undefined behavior* resulting from shifting 1 to the sign bit position.
* C++14 changed the definition so that shifting 1 31 times to the left in a 32-bit int is no longer undefined (Thanks, Matt McNabb, for pointing this out).

A typical internal memory representation of a signed integer value looks like:
The most significant bit (first from the right) is the sign bit and in signed numbers(like int) it represents whether the number is negative or not.
When you shift additional bits sign extension is performed to preserve the number's sign. This is done by appending digits to the most significant side of the number.(following a procedure dependent on the particular signed number representation used).
In unsigned numbers the first bit from the right is just the MSB of the represented number, thus when you shift additional bits no sign extension is performed.
Note: the enumeration of the bits starts from 0, so 1 << 31 replaces your sign bit and after that every bit shift operation to the left >> results in sign extension. (as pointed out by #dasblinkenlight)
So, the simple solution to your problem is to make the number unsigned (this is what U does in 1U << 31) before you start the bit manipulation. (as pointed out by #Yu Hao)
For further reading see signed number representations and two's complement.(as it's the most common)

Right shift with zeros at the beginning

I'm trying to do a kind of left shift that would add zeros at the beginning instead of ones. For example, if I left shift 0xff, I get this:
0xff << 3 = 11111000
However, if I right shift it, I get this:
0xff >> 3 = 11111111
Is there any operation I could use to get the equivalent of a left shift? i.e. I would like to get this:
00011111
Any suggestion?
Edit
To answer the comments, here is the code I'm using:
int number = ~0;
number = number << 4;
std::cout << std::hex << number << std::endl;
number = ~0;
number = number >> 4;
std::cout << std::hex << number << std::endl;
output:
fffffff0
ffffffff
Since it seems that in general it should work, I'm interested as to why this specific code doesn't. Any idea?

This is how C and binary arithmetic both work:
If you left shift 0xff << 3, you get binary: 00000000 11111111 << 3 = 00000111 11111000
If you right shift 0xff >> 3, you get binary: 00000000 11111111 >> 3 = 00000000 00011111
0xff is a (signed) int with the positive value 255. Since it is positive, the outcome of shifting it is well-defined behavior in both C and C++. It will not do any arithmetic shifts nor any kind or poorly-defined behavior.
#include <stdio.h>
int main()
{
printf("%.4X %d\n", 0xff << 3, 0xff << 3);
printf("%.4X %d\n", 0xff >> 3, 0xff >> 3);
}
Output:
07F8 2040
001F 31
So you are doing something strange in your program because it doesn't work as expected. Perhaps you are using char variables or C++ character literals.
Source: ISO 9899:2011 6.5.7.
EDIT after question update
int number = ~0; gives you a negative number equivalent to -1, assuming two's complement.
number = number << 4; invokes undefined behavior, since you left shift a negative number. The program implements undefined behavior correctly, since it either does something or nothing at all. It may print fffffff0 or it may print a pink elephant, or it may format the hard drive.
number = number >> 4; invokes implementation-defined behavior. In your case, your compiler preserves the sign bit. This is known as arithmetic shift, and arithmetic right shift works in such a way that the MSB is filled with whatever bit value it had before the shift. So if you have a negative number, you will experience that the program is "shifting in ones".
In 99% of all real world cases, it doesn't make sense to use bitwise operators on signed numbers. Therefore, always ensure that you are using unsigned numbers, and that none of the dangerous implicit conversion rules in C/C++ transforms them into signed numbers (for more info about dangerous conversions, see "the integer promotion rules" and "the usual arithmetic conversions", plenty of good info about those on SO).
EDIT 2, some info from the C99 standard's rationale document V5.10:
6.5.7 Bitwise shift operators
The description of shift operators in K&R suggests that shifting by a
long count should force the left operand to be widened to long before
being shifted. A more intuitive practice, endorsed by the C89
Committee, is that the type of the shift count has no bearing on the
type of the result.
QUIET CHANGE IN C89
Shifting by a long count no longer coerces the shifted operand to
long. The C89 Committee affirmed the freedom in implementation granted
by K&R in not requiring the signed right shift operation to sign
extend, since such a requirement might slow down fast code and since
the usefulness of sign extended shifts is marginal. (Shifting a
negative two’s complement integer arithmetically right one place is
not the same as dividing by two!)

If you explicitly shift 0xff it works as you expected
cout << (0xff >> 3) << endl; // 31
It should be possible only if 0xff is in type of signed width 8 (char and signed char on popular platforms).
So, in common case:
You need to use unsigned ints
(unsigned type)0xff
right shift works as division by 2(with rounding down, if I understand correctly).
So when you have 1 as first bit, you have negative value and after division it's negative again.

The two kinds of right shift you're talking about are called Logical Shift and Arithmetic Shift. C and C++ use logical shift for unsigned integers and most compilers will use arithmetic shift for a signed integer but this is not guaranteed by the standard meaning that the value of right shifting a negative signed int is implementation defined.
Since you want a logical shift you need to switch to using an unsigned integer. You can do this by replacing your constant with 0xffU.

To explain your real code you just need the C++ versions of the quotes from the C standard that Lundin gave in comments:
int number = ~0;
number = number << 4;
Undefined behavior. [expr.shift] says
The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated
bits are zero-ﬁlled. If E1 has an unsigned type, the value of the
result is E1 × 2E2, reduced modulo one more than the maximum value
representable in the result type. Otherwise, if E1 has a signed type
and non-negative value, and E1×2E2 is representable in the result
type, then that is the resulting value; otherwise, the behavior is
undeﬁned.
number = ~0;
number = number >> 4;
Implementation-defined result, in this case your implementation gave you an arithmetic shift:
The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has
an unsigned type or if E1 has a signed type and a non-negative value,
the value of the result is the integral part of the quotient of
E1/2E2. If E1 has a signed type and a negative value, the resulting
value is implementation-deﬁned
You should use an unsigned type:
unsigned int number = -1;
number = number >> 4;
std::cout << std::hex << number << std::endl;
Output:
0x0fffffff

To add my 5 cents worth here...
I'm facing exactly the same problem as this.lau! I've done some perfunctory research on this and these are my results:
typedef unsigned int Uint;
#define U31 0x7FFFFFFF
#define U32 0xFFFFFFFF
printf ("U31 right shifted: 0x%08x\n", (U31 >> 30));
printf ("U32 right shifted: 0x%08x\n", (U32 >> 30));
Output:
U31 right shifted: 0x00000001 (expected)
U32 right shifted: 0xffffffff (not expected)
It would appear (in the absence of anyone with detailed knowledge) that the C compiler in XCode for Mac OS X v5.0.1 reserves the MSB as a carry bit that gets pulled along with each shift.
Somewhat annoyingly, the converse is NOT true:-
#define ST00 0x00000001
#define ST01 0x00000002
printf ("ST00 left shifted: 0x%08x\n", (ST00 << 30));
printf ("ST01 left shifted: 0x%08x\n", (ST01 << 30));
Output:
ST00 left shifted: 0x40000000
ST01 left shifted: 0x80000000
I concur completely with the people above that assert that the sign of the operand has no bearing on the behaviour of the shift operator.
Can anyone shed any light on the specification for the Posix4 implementation of C? I feel a definitive answer may rest there.
In the meantime, it appears that the only workaround is a construct along the following lines;-
#define CARD2UNIVERSE(c) (((c) == 32) ? 0xFFFFFFFF : (U31 >> (31 - (c))))
This works - exasperating but necessary.

Just in case if you want the first bit of negative number to be 0 after right shift what we can do is to take the XOR of that negative number with INT_MIN that will make its msb zero, I understand that its not appropriate arithmetic shift but will get work done

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Weird behavior of right shift operator (1 >> 32) - c++

I suppose the reason is that int type holds 32-bits (for most systems), but one bit is used for sign as it is signed type. So only 31 bits are used for actual value.

Related

Unsigned integer wrapping, different behaviour

Weird result in left shift (1ull << s == 1 if s == 64) [duplicate]

Unexpected output with left shift operator C++

Why is the binary equivalent calculation getting incorrect?

Right shift with zeros at the beginning

Categories

Resources