Unsigned integer wrapping, different behaviour - c++

Here is some code in C:
#include <stdio.h>
int main()
{
printf("Int width = %zu\n", sizeof(unsigned int)); // Gives 4 bytes (32 bits) on my computer
unsigned int n = 32;
unsigned int y = ((unsigned int)1 << n) - 1; // This is line 8
printf("%u\n", y);
unsigned int x = ((unsigned int)1 << 32) - 1; // This is line 11
printf("%u", x);
return 0;
}
It outputs:
main.c:11:39: warning: left shift count >= width of type [-Wshift-count-overflow]
Int width = 4
0
4 294 967 295 (= 2^32-1)
The warning for the line 11 is expected as explained in these links: wiki.sei.cmu.edu and https://stackoverflow.com/a/11270515
left-shift operations [...] if the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
There is no warning for line 8, but I expected the same warning as for line 11. Furthermore, the results are entirely different! What am I missing?
This behaviour is similar for C++:
#include <iostream>
using namespace std;
int main()
{
cout << "Int width = " << sizeof(uint64_t) << "\n"; // Gives 8 bytes (64 bits) on my computer
int n = 64;
uint64_t y = ((uint64_t)1 << n) - 1; // This is line 8
cout << "y = " << y;
uint64_t x = ((uint64_t)1 << 64) - 1; // This is line 11
cout << "\nx = " << x;
return 0;
}
Which outputs:
main.cpp:11:34: warning: left shift count >= width of type [-Wshift-count-overflow]
Int width = 8
y = 0
x = 18 446 744 073 709 551 615 (= 2^64-1)
I used the OnlineGDB C compiler for the C code and the OnlineGDB C++ compiler for the C++ code.
Here are the links to the code: C code and C++ code.

For line 8, the compiler has to prove that in ((unsigned int)1 << n), n is 32 or more. That can be difficult since n is not const, so its value could be changed. The compiler would have to do more static analysis to give you the warning.
On the other hand, with ((unsigned int)1 << 32) the compiler knows that the value is 32 or more and can easily warn. This requires almost no time to detect, since the type and the value to shift by are both compile-time "literals".
If you switch to using const int n = 64; in your C++ code, then you will get a warning at OnlineGDB. You can see that here. I tried that with the C version, but it still doesn't warn.

There is no warning for line 8, but I expected the same warning as for line 11.
The C standard does not require a compiler to diagnose an excessive shift amount. (Generally, it does not require C implementations to diagnose errors other than those explicitly listed in “Constraints” clauses.)
The compiler you are using diagnoses the error with the integer constant expression (32), as this is easy. It does not diagnose the error with the variable n, as that involves more work and the compiler authors have not implemented it.
Furthermore, the results are entirely different!
With the integer constant expression, the compiler evaluates the shift during compilation, using whatever software is built into it. That apparently produces zero for (unsigned int) 1 << 32. With the variable, the compiler generates an instruction to perform the shift during program execution. That instruction likely uses only the low five bits of the right operand, so an operand of 32 (100000 in binary) yields a shift of zero bits, so shifting (unsigned int) 1 produces one.
Both behaviors are allowed by the C standard.
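To make the two outcomes concrete, here is a minimal sketch that models each strategy using only well-defined operations. The helper names (shl_masked_count, shl_wide_then_truncate) are hypothetical, and the models are assumptions about what a compiler and CPU might do, not a description of any particular one.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit shift instruction that looks only at the
// low five bits of the count, as the answer suggests the hardware does.
constexpr std::uint32_t shl_masked_count(std::uint32_t value, unsigned count) {
    return value << (count & 31u);  // count & 31 is in [0, 31], so the shift is defined
}

// Hypothetical model of compile-time folding: shift in a 64-bit type, then
// keep only the low 32 bits. Counts of 64 or more are handled separately so
// that this helper never performs an undefined shift itself.
constexpr std::uint32_t shl_wide_then_truncate(std::uint32_t value, unsigned count) {
    return count < 64u
        ? static_cast<std::uint32_t>(static_cast<std::uint64_t>(value) << count)
        : 0u;
}

// (1 << 32) - 1 under each model; the "- 1" relies on unsigned wraparound.
constexpr std::uint32_t mask_via_masked_count = shl_masked_count(1u, 32u) - 1u;        // 1 - 1
constexpr std::uint32_t mask_via_wide_fold    = shl_wide_then_truncate(1u, 32u) - 1u;  // 0 - 1 wraps
```

Under the first model the mask comes out as 0, and under the second as 0xFFFFFFFF, which matches the two values printed in the question.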

It's likely because n is a variable: the compiler doesn't seem to verify it, and since it doesn't know the value, it doesn't issue a warning. If you turn it into a constant, i.e. const int n = 64;, the warning is issued.
https://godbolt.org/z/4s5jz6
As for the results, undefined behavior is what it is. For the sake of curiosity you can analyze a particular case and try to figure out what the compiler did, but the results can't be reasoned about because there is no correct result.
Even the warnings are optional; gcc is nice enough to warn you when a constant or a literal is used, but it didn't have to.

Undefined behaviour (UB) means undefined behaviour. Literally anything can happen. Compilers are not required to tell you of UB, but are permitted to.
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
So the maximum unsigned value, or zero, or formatting your SSD while installing viruses on your cloud-stored files and emailing your browser history to your contact list are all permitted results when your compiler translates a shift by 32 bits of a 32-bit integer.
As a Quality of Implementation issue, the last is rare.
In one case, the compiler is optimizing (1<<x)-1 into "x one-bits", without even doing a shift operation. Within the bounds of defined shift operations this is equivalent, so it is a valid optimization. So when you pass 32, it writes 0xffffffff.
In the other case, it is perhaps setting the nth bit, reading the low 5 bits of the shift count to see which bit to set. This is also valid within the range of defined behaviour, and utterly different.
Welcome to UB.
I would expect further changes based on optimization level.
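If the goal is a mask of the n lowest bits that is also well defined for n equal to the full width, one possible sketch avoids the problematic shift altogether. The name low_bits_mask and the constant kWidth are hypothetical and not taken from the question.

```cpp
#include <limits>

// Number of value bits in unsigned int (32 on the asker's machine).
constexpr unsigned kWidth = std::numeric_limits<unsigned>::digits;

// Returns an unsigned value whose lowest n bits are set (hypothetical helper).
// Unlike (1u << n) - 1u, it never shifts by the full width of the type.
constexpr unsigned low_bits_mask(unsigned n) {
    return n == 0u     ? 0u
         : n >= kWidth ? ~0u
                       : ~0u >> (kWidth - n);  // shift count is in [1, kWidth - 1]
}
```

With this helper, low_bits_mask(kWidth) yields all ones without relying on what a particular compiler happens to do with an over-wide shift.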

Related

Weird result in left shift (1ull << s == 1 if s == 64) [duplicate]

Why is the result of
uint32_t s = 64;
uint64_t val = 1ull << s;
and
uint64_t s = 64;
uint64_t val = 1ull << s;
1?
But
uint64_t val = 1ull << 0x40;
gets optimized to 0?
I really don't understand why it equals 1. It does not matter whether I use my VC++ or g++ compiler.
And how can I ensure that 1ull << s equals 0 when s equals 64, which in my opinion is the correct result? I need that result in my program.
This is because on x64, the instruction SHL (when operating on a 64-bit source/destination operand) only uses the bottom 6 bits of the shift amount. In effect, you are shifting by 0 bits.
From the "Intel 64 and IA-32 Architecture Software Developer's Manual" (can be downloaded from Intel in PDF form, which is hard to link into,) under the entry for "SAL/SAR/SHL/SHR - Shift" instructions:
The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used).
As commented below, shifting an integer by at least as many bits as its width is also undefined behavior in the C++ language. (Thanks to @sgarizvi for the reference.) The C++ standard, under Section 8.5.7 (Shift Operators), states that:
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand...
That's why the compiler is producing code that gives different results under different conditions (constant or variable shift count, optimized or not, etc.)
About how to "fix" it, I have no clever tricks. You can do something like this:
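#include <climits>      // CHAR_BIT
#include <type_traits>  // std::is_integral_v (C++17)
template <typename IntT>
IntT ShiftLeft (IntT x, unsigned count) {
    // Optional check; depends on how much you want to embrace C++!
    static_assert(std::is_integral_v<IntT>, "This shift only works for integral types.");
    // Only shift when the count is smaller than the width of the type in bits.
    if (count < sizeof(IntT) * CHAR_BIT)
        return x << count;
    else
        return 0;
}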
This code would work for signed and unsigned integer types (left-shift is the same for signed and unsigned values.)

Somewhat unexpected behaviour from left shift <<

This is a 32-bit MFC application currently running on Windows 10. Compiled with Visual C++ 2013.
std::cout << "sizeof(long long) = " << sizeof(long long) << std::endl;
int rot{ 32 };
long long bits{ (1 << rot) };
std::cout << "bits with variable = " << bits << std::endl;
long long bits2 = (1 << 32);
std::cout << "bits2 with constant = " << bits2 << std::endl;
system("pause");
The size of long long is 8 bytes, sufficient to manage my 32 bits, I was thinking. Here is the output of the debug build:
sizeof(long long) = 8
bits with variable = 1
bits2 with constant = 0
Press any key to continue . . .
And here is the output of the release build:
sizeof(long long) = 8
bits with variable = 0
bits2 with constant = 0
Press any key to continue . . .
So, apparently my single bit is left-shifted into oblivion even with a 64-bit data type. But I'm really puzzled as to why the debug build produces different outputs if I shift with a variable as a parameter compared to a constant?
You need a long long type for 64 bits.
The expression 1 << 32 will be evaluated with int types for the operands, irrespective of the type of the variable to which this result is assigned.
You will have more luck with 1LL << 32, and 1LL << rot. That causes the expression to be evaluated using long long types.
Currently the behaviour of your program is undefined as you are overshifting a type when you write 1 << 32. Note also that 1 << 32 is a compile time evaluable constant expression whereas 1 << rot isn't. That probably accounts for the observed difference between using a variable and a constant.
The expression 1 << rot, when rot is an int, will give you an int result. It doesn't matter if you then place it into a long long since the damage has already been done(a).
Use 1LL << rot instead.
(a) And, by damage, I mean undefined behaviour, as per C11 6.5.7 Bitwise shift operators:
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
As to "why the debug build produces different outputs if I shift with a variable as a parameter compared to a constant", that's one of the vagaries of undefined behaviour - literally anything that's possible is allowed to happen. It's perfectly within its rights to play derisive_laughter.ogg and format your hard disk :-)
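The point about the result type can be checked at compile time. The following is a small sketch; single_bit is a hypothetical helper name, and the static_asserts only restate the rule quoted above from the standard.

```cpp
#include <type_traits>

// The result type of a shift is the promoted type of the LEFT operand only;
// the type of the variable that later receives the result plays no part.
static_assert(std::is_same<decltype(1 << 0), int>::value,
              "int << int has type int");
static_assert(std::is_same<decltype(1LL << 0), long long>::value,
              "long long << int has type long long");

// Hypothetical helper: a single set bit at position rot, computed in long long.
// long long is at least 64 bits wide, so 0 <= rot <= 62 stays clear of the sign bit.
inline long long single_bit(int rot) {
    return 1LL << rot;
}
```

Calling single_bit(32) gives 4294967296, the value the question was trying to compute with 1 << rot.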

Unexpected output with left shift operator C++

Here is the code which is giving me the unexpected answer
#include<bits/stdc++.h>
using namespace std;
int main()
{
cout<<(1<<50);
}
The answer I get is 0.
But if I change the line to
cout<<pow(2, 50);
I get the right answer.
Could someone explain the reason?
Assuming your compiler treats the constant 1 as a 32bit integer, you shifted it so far to the left, that only zeroes remain in the 32bit you have. 50 is larger than 32.
Try this (run it):
#include <cstdint>   // std::int64_t
#include <iostream>
int main()
{
    std::int64_t i { 1 }; // wide enough to allow 50 bits shift
    std::cout << std::hex << ( i << 50 ); // should display 4000000000000
    return 0;
}
From the C++ Standard (5.8 Shift operators)
1 The shift operators << and >> group left-to-right.
shift-expression:
additive-expression
shift-expression << additive-expression
shift-expression >> additive-expression
The operands shall be of integral or unscoped enumeration type and
integral promotions are performed. The type of the result is that of
the promoted left operand. The behavior is undefined if the right
operand is negative, or greater than or equal to the length in
bits of the promoted left operand.
Take into account that the behavior is also undefined if the right operand is less than the length in bits of the left operand but the shift reaches the sign bit, because the integer literal 1 has the type signed int.
As for the function call pow(2, 50), it uses some algorithm that calculates the power.
You shift the "1" out of the 32-bit field, so zero is the result. pow uses a floating-point representation in which 2^50 can be handled.
EDIT
Without any modifications like "1LL" or "1ULL" (which generate 64-bit long long numbers), an integer literal is usually handled as 32 bits on x64 and x86 architectures. You can use
cout << (1ULL << 50);
or
cout << ((long long)1 << 50);
which should do it.
It's exactly what you're doing. You're shifting a single bit by 50 positions in a piece of memory that's 32 bits wide... What did you expect to happen? The bit goes somewhere else, but it's no longer inside the memory of the integer. pow(2, 50) works on double values, so you're not shifting bits anymore.
Also, never use #include<bits/stdc++.h>. It's not standard, and it's slow. You should use it only in precompiled headers, but I'd avoid it even in that case.
cout<<(1<<50);
Your code treats 1 as an int, so overflows. Instead, try:
cout << (1ULL << 50);
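As a rough check of the two routes to 2^50, the sketch below computes it once with a 64-bit shift and once in floating point. It uses std::ldexp in place of the pow(2, 50) call from the question, since ldexp scales by an exact power of two; the names two_pow_50_int and two_pow_50_double are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// 2^50 as an integer: the left operand is 64 bits wide, so a shift by 50 is defined.
constexpr std::uint64_t two_pow_50_int = std::uint64_t{1} << 50;

// 2^50 as a double. std::ldexp(x, e) computes x * 2^e; it stands in here for
// the pow(2, 50) call from the question.
inline double two_pow_50_double() {
    return std::ldexp(1.0, 50);
}
```

Both routes give 1125899906842624, whereas 1 << 50 with a 32-bit int never gets that far because the shift itself is undefined.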

C++: Shift by variable value instead of number -> whats the difference?

int value = 0xffffffff;
int len = 32;
int result = value << len; // result will be 0xffffffff
result = value << 32; // result will be 0x0
Why does it make a difference?
Edit:
Sorry I made a mistake. In the example above, both results are 0xffffffff.
So look at this:
unsigned int value = 0xffffffff;
unsigned int len = 32;
printf("0x%x\n", value << len); //will print 0xffffffff
printf("0x%x\n", 0xffffffff << 32); //will print 0x0
If the size of an int is 32 bits or less, your code contains undefined behavior. The number of bits shifted must be greater than or equal to 0, and strictly less than the number of bits in what is being shifted.
What is probably happening in practice is that for the variable, the compiler is just passing it to a machine instruction which only considers the 5 low-order bits (which are 0 in the case of 32); when the shift count is a constant, the compiler evaluates the expression internally, likely in long long, and then truncates it. But this is just one possible behavior; anything might happen as far as the language is concerned.
If len >= sizeof(int) * CHAR_BIT (the width of int in bits) or len < 0, the code contains undefined behaviour.
See this answer for more details.
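The range condition can be written down as a small predicate. This is only a sketch: is_valid_shift_count is a hypothetical name, and it is meant for unsigned types no narrower than unsigned int, where the promoted type is the type itself.

```cpp
#include <limits>
#include <type_traits>

// True when len is a valid shift count for a value of unsigned type T.
// numeric_limits<T>::digits counts bits (32 for a 32-bit unsigned int),
// whereas sizeof(T) counts bytes (4 for the same type).
template <typename T>
constexpr bool is_valid_shift_count(long long len) {
    static_assert(std::is_unsigned<T>::value, "sketch is meant for unsigned types");
    return len >= 0 && len < std::numeric_limits<T>::digits;
}
```

With a 32-bit unsigned int, is_valid_shift_count<unsigned>(32) is false, which is exactly the case both printf calls in the question fall into.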

Weird behavior of right shift operator (1 >> 32)

I recently faced a strange behavior using the right-shift operator.
The following program:
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <stdint.h>
int foo(int a, int b)
{
return a >> b;
}
int bar(uint64_t a, int b)
{
return a >> b;
}
int main(int argc, char** argv)
{
std::cout << "foo(1, 32): " << foo(1, 32) << std::endl;
std::cout << "bar(1, 32): " << bar(1, 32) << std::endl;
std::cout << "1 >> 32: " << (1 >> 32) << std::endl; //warning here
std::cout << "(int)1 >> (int)32: " << ((int)1 >> (int)32) << std::endl; //warning here
return EXIT_SUCCESS;
}
Outputs:
foo(1, 32): 1 // Should be 0 (but I guess I'm missing something)
bar(1, 32): 0
1 >> 32: 0
(int)1 >> (int)32: 0
What happens with the foo() function? I understand that the only difference between what it does and the last two lines is that the last two lines are evaluated at compile time. And why does it "work" if I use a 64-bit integer?
Any light on this would be greatly appreciated!
Surely related, here is what g++ gives:
> g++ -o test test.cpp
test.cpp: In function 'int main(int, char**)':
test.cpp:20:36: warning: right shift count >= width of type
test.cpp:21:56: warning: right shift count >= width of type
It's likely the CPU is actually computing
a >> (b % 32)
in foo; meanwhile, the 1 >> 32 is a constant expression, so the compiler will fold the constant at compile-time, which somehow gives 0.
Since the standard (C++98 §5.8/1) states that
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
there is no contradiction having foo(1,32) and 1>>32 giving different results.
On the other hand, in bar you provided a 64-bit unsigned value; as 64 > 32, it is guaranteed that the result must be 1 / 2^32 = 0. Nevertheless, if you write
bar(1, 64);
you may still get 1.
Edit: The logical right shift (SHR) behaves like a >> (b % 32) for 32-bit operands and like a >> (b % 64) for 64-bit operands on x86/x86-64 (Intel #253667, Page 4-404):
The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.
However, on ARM (armv6&7, at least), the logical right-shift (LSR) is implemented as (ARMISA Page A2-6)
(bits(N), bit) LSR_C(bits(N) x, integer shift)
assert shift > 0;
extended_x = ZeroExtend(x, shift+N);
result = extended_x<shift+N-1:shift>;
carry_out = extended_x<shift-1>;
return (result, carry_out);
where (ARMISA Page AppxB-13)
ZeroExtend(x,i) = Replicate('0', i-Len(x)) : x
This guarantees a right shift of ≥32 will produce zero. For example, when this code is run on the iPhone, foo(1,32) will give 0.
This shows that shifting a 32-bit integer by ≥32 is non-portable.
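The contrast between the two instruction sets can be imitated in portable code. The sketch below is a simplified model with hypothetical helper names (shr_x86_like, shr_arm_like); it follows the descriptions quoted above rather than any exact hardware specification.

```cpp
#include <cstdint>

// Simplified model of x86 SHR on a 32-bit operand: the count is masked to 5 bits.
constexpr std::uint32_t shr_x86_like(std::uint32_t a, unsigned b) {
    return a >> (b & 31u);  // b & 31 is in [0, 31], so the shift is defined
}

// Simplified model of a zero-extending logical shift right:
// once the count reaches 32, every bit has been shifted out.
constexpr std::uint32_t shr_arm_like(std::uint32_t a, unsigned b) {
    return b >= 32u ? 0u : a >> b;
}
```

For a count of 32 and a value of 1, the first model returns 1 and the second returns 0, mirroring the foo(1, 32) results reported for x86 and for the iPhone. For counts below 32 the two models agree.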
OK. So it's in 5.8.1:
The operands shall be of integral or enumeration type and integral promotions are performed. The type of the result is
that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to
the length in bits of the promoted left operand.
So you have an Undefined Behaviour(tm).
What happens in foo is that the shift width is greater than or equal to the size of the data being shifted. In the C99 standard that results in undefined behaviour. It's probably the same in whatever C++ standard MS VC++ is built to.
The reason for this is to allow compiler designers to take advantage of any CPU hardware support for shifts. For example, the i386 architecture has an instruction to shift a 32 bit word by a number of bits, but the number of bits is defined in a field in the instruction that is 5 bits wide. Most likely, your compiler is generating the instruction by taking your bit shift amount and masking it with 0x1F to get the bit shift in the instruction. This means that shifting by 32 is the same as shifting by 0.
I compiled it on 32-bit Windows using the VC9 compiler. It gave me the following warning. Since sizeof(int) is 4 bytes on my system, the compiler is indicating that right-shifting by 32 bits results in undefined behavior. Since it is undefined, you cannot predict the result. Just as a check, I right-shifted by 31 bits; all the warnings disappeared and the result was as expected (i.e. 0).
I suppose the reason is that int type holds 32-bits (for most systems), but one bit is used for sign as it is signed type. So only 31 bits are used for actual value.
The warning says it all!
But in fairness I got bitten by the same error once.
int a = 1;
cout << ( a >> 32);
is completely undefined. In fact, the compiler generally gives a different result than the runtime, in my experience. What I mean is that if the compiler can evaluate the shift expression at compile time, it may give you a different result from the same expression evaluated at run time.
foo(1,32) performs a rotate-shift, so bits that should disappear on the right reappear on the left. If you do it 32 times, the single bit set to 1 is back to its original position.
bar(1,32) is the same, but the bit is in the 64-32+1=33rd bit, which is above the representable numbers for a 32-bit int. Only the 32 lowest bits are taken, and they are all 0's.
1 >> 32 is performed by the compiler. No idea why gcc uses a non-rotating shift here and not in the generated code.
Same thing for ((int)1 >> (int)32)