This is a 32-bit MFC application currently running on Windows 10. Compiled with Visual C++ 2013.
std::cout << "sizeof(long long) = " << sizeof(long long) << std::endl;
int rot{ 32 };
long long bits{ (1 << rot) };
std::cout << "bits with variable = " << bits << std::endl;
long long bits2 = (1 << 32);
std::cout << "bits2 with constant = " << bits2 << std::endl;
system("pause");
The size of long long is 8 bytes, which I thought was sufficient to hold my 32 bits. Here is the output of the debug build:
sizeof(long long) = 8
bits with variable = 1
bits2 with constant = 0
Press any key to continue . . .
And here is the output of the release build:
sizeof(long long) = 8
bits with variable = 0
bits2 with constant = 0
Press any key to continue . . .
So apparently my single bit is left-shifted into oblivion even with a 64-bit data type. But I'm really puzzled as to why the debug build produces different output when I shift by a variable compared to a constant?
You need a long long type for 64 bits.
The expression 1 << 32 will be evaluated with int types for the operands, irrespective of the type of the variable to which this result is assigned.
You will have more luck with 1LL << 32, and 1LL << rot. That causes the expression to be evaluated using long long types.
Currently the behaviour of your program is undefined as you are overshifting a type when you write 1 << 32. Note also that 1 << 32 is a compile time evaluable constant expression whereas 1 << rot isn't. That probably accounts for the observed difference between using a variable and a constant.
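For illustration, here is a minimal sketch of that fix applied to the snippet from the question (assuming a plain console build); with 1LL the shift is carried out in 64 bits and both results come out as 4294967296:
#include <iostream>

int main()
{
    int rot{ 32 };
    long long bits{ 1LL << rot };   // shift performed on a long long operand
    long long bits2 = 1LL << 32;    // likewise for the constant expression
    std::cout << "bits with variable = " << bits << std::endl;   // 4294967296
    std::cout << "bits2 with constant = " << bits2 << std::endl; // 4294967296
    return 0;
}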
The expression 1 << rot, when rot is an int, will give you an int result. It doesn't matter if you then place it into a long long since the damage has already been done(a).
Use 1LL << rot instead.
(a) And, by damage, I mean undefined behaviour, as per C11 6.5.7 Bitwise shift operators:
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
As to "why the debug build produces different outputs if I shift with a variable as a parameter compared to a constant", that's one of the vagaries of undefined behaviour - literally anything that's possible is allowed to happen. It's perfectly within its rights to play derisive_laughter.ogg and format your hard disk :-)
Here is some code in C:
#include <stdio.h>
int main()
{
printf("Int width = %lu\n", sizeof(unsigned int)); // Gives 32 bits on my computer
unsigned int n = 32;
unsigned int y = ((unsigned int)1 << n) - 1; // This is line 8
printf("%u\n", y);
unsigned int x = ((unsigned int)1 << 32) - 1; // This is line 11
printf("%u", x);
return 0;
}
It outputs:
main.c:11:39: warning: left shift count >= width of type [-Wshift-count-overflow]
Int width = 4
0
4 294 967 295 (= 2^32-1)
The warning for line 11 is expected, as explained in these links: wiki.sei.cmu.edu and https://stackoverflow.com/a/11270515
left-shift operations [...] if the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
There is no warning for line 8, but I expected the same warning as for line 11. Furthermore, the results are entirely different! What am I missing?
This behaviour is similar for C++:
#include <iostream>
using namespace std;
int main()
{
cout << "Int width = " << sizeof(uint64_t) << "\n"; // Gives 64 bits on my computer
int n = 64;
uint64_t y = ((uint64_t)1 << n) - 1; // This is line 8
cout << "y = " << y;
uint64_t x = ((uint64_t)1 << 64) - 1; // This is line 11
cout << "\nx = " << x;
return 0;
}
Which outputs:
main.cpp:11:34: warning: left shift count >= width of type [-Wshift-count-overflow]
Int width = 8
y = 0
x = 18 446 744 073 709 551 615 (= 2^64-1)
I used onlineGBD's C compiler for the C code and its C++ compiler for the C++ code.
Here are the links to the code: C code and C++ code.
For line 8, the compiler has to prove that in ((unsigned int)1 << n), n is 32 or more. That can be difficult since n is not const, so its value could be changed. The compiler would have to do more static analysis to give you the warning.
On the other hand, with ((unsigned int)1 << 32) the compiler knows that the value is 32 or more and can easily warn. This requires almost no time to detect, since the type and the value to shift by are both compile-time "literals".
If you switch to using const int n = 64; in your C++ code, then you will get an error at OnlineGBD. You can see that here. I tried that with the C version but it still doesn't warn.
There is no warning for line 8, but I expected the same warning as for line 11.
The C standard does not require a compiler to diagnose an excessive shift amount. (Generally, it does not require C implementations to diagnose errors other than those explicitly listed in “Constraints” clauses.)
The compiler you are using diagnoses the error with the integer constant expression (32), as this is easy. It does not diagnose the error with the variable n, as that involves more work, and the compiler authors have not implemented it.
Furthermore, the results are entirely different!
With the integer constant expression, the compiler evaluates the shift during compilation, using whatever software is built into it. That apparently produces zero for (unsigned int) 1 << 32. With the variable, the compiler generates an instruction to perform the shift during program execution. That instruction likely uses only the low five bits of the right operand, so an operand of 32 (binary 100000) yields a shift of zero bits, and shifting (unsigned int) 1 by zero bits produces one.
Both behaviors are allowed by the C standard.
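As an aside, if the goal is a mask of the n lowest bits where n may equal the type's width, one way to stay out of undefined territory is to do the shift in a wider type. A small sketch (my own illustration, not from the original posts):
#include <cstdint>
#include <iostream>

// Returns a mask with the n lowest bits set, for 0 <= n <= 32.
uint32_t low_bits(unsigned n)
{
    // Shift in a 64-bit type so that n == 32 is still well-defined.
    return static_cast<uint32_t>((uint64_t{1} << n) - 1u);
}

int main()
{
    std::cout << low_bits(31) << "\n";  // 2147483647
    std::cout << low_bits(32) << "\n";  // 4294967295
}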
It's likely because n is a variable: the compiler doesn't verify it, and since it doesn't know its value it doesn't issue a warning. If you turn it into a constant, i.e. const int n = 64;, the warning is issued.
https://godbolt.org/z/4s5jz6
As for the results, undefined behavior is what it is; for the sake of curiosity you can analyze a particular case and try to figure out what the compiler did, but the results can't be reasoned about because there is no correct result.
Even the warnings are optional; gcc is nice enough to warn you when a constant or constant literal is used, but it didn't have to.
Undefined behaviour (UB) means undefined behaviour. Literally anything can happen. Compilers are not required to tell you of UB, but are permitted to.
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
So max unsigned (-1), or zero, or formatting your SSD while installing viruses on your cloud-stored files and emailing your browser history to your contact list are all things your compiler is permitted to do with a shift of a 32-bit integer by 32 bits.
As a Quality of Implementation issue, the last is rare.
The compiler is optimizing (1<<x)-1 as x 1 bits, not even doing a shift operation, in one case. Within the bounds of defined shift operations, this is equivalent, so this is a valid optimization. So when you pass 32, it writes 0xffffffff.
In the other case, it is maybe setting the nth bit, reading the low 5 bits of the shift amount to see which bit to set. Also valid within the range of defined behaviour, and utterly different.
Welcome to UB.
I would expect further changes based on optimization level.
I am trying to do the following. However, I am not sure where I might be going wrong:
uint64_t x = (1 << 46);
std::cout << x;
I get the warning:
warning: left shift count >= width of type [-Wshift-count-overflow]
I get the output 0. I was expecting the decimal value of a binary number like this:
1 0000........00 (46 0s)
My question is: why am I getting this warning? Isn't uint64_t 64 bits? Also, why am I getting the output 0?
The problem is that you are not shifting a 64-bit constant: 1 is a constant of type int, which is less than 64 bits on your platform (probably 32 bits; it is implementation-defined).
You can fix this by using the UINT64_C macro (from <cstdint>) around the constant:
uint64_t x = (UINT64_C(1) << 46);
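A minimal, self-contained version of that fix might look like this:
#include <cstdint>
#include <iostream>

int main()
{
    uint64_t x = (UINT64_C(1) << 46);  // the shift is done on a 64-bit constant
    std::cout << x << "\n";            // prints 70368744177664, i.e. 2^46
}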
1 is a 32-bit constant. The compiler (correctly) computes the constant expression as 0 --- the 1 is shifted beyond the size of an int32. If the arguments to << were variables, an x86 CPU would return (1 << 14), i.e. 1 << (46 % 32).
Try "1ULL << 46".
I am not an advanced C++ programmer. But I have been using C++ for a long time now. So, I love playing with it. Lately I was thinking about ways to maximize a variable programmatically. So I tried bitwise operators to fill a variable with 1's. Then there's the signed and unsigned issue. My knowledge of memory representation is not very good. However, I ended up writing the following code, which works for both signed and unsigned short, int and long (although int and long are basically the same). Unfortunately, for long long, the program fails.
So, what is going on behind the scenes for long long? How is it represented in memory? Besides, Is there any better way to do achieve the same thing in C++?
#include <bits/stdc++.h>
using namespace std;
template<typename T>
void Maximize(T &val, bool isSigned)
{
int length = sizeof(T) * 8;
cout << "\nlength = " << length << "\n";
// clearing
for(int i=0; i<length; i++)
{
val &= 0 << i;
}
if(isSigned)
{
length--;
}
val = 1 << 0;
for(int i=1; i<length; i++)
{
val |= 1 << i;
cout << "\ni = " << i << "\nval = " << val << "\n";
}
}
int main()
{
long long i;
Maximize(i, true);
cout << "\n\nsizeof(i) = " << sizeof(i) << " bytes" << "\n";
cout << "i = " << i << "\n";
return 0;
}
The basic issue with your code is in the statements
val &= 0 << i;
and
val |= 1 << i;
in the case that val is longer than an int.
In the first expression, 0 << i is (most likely) always 0, regardless of i (technically, it suffers from the same undefined behaviour described below, but you will not likely encounter the problem.) So there was no need for the loop at all; all of the statements do the same thing, which is to zero out val. Of course, val = 0; would have been a simpler way of writing that.
The issue with 1 << i is that the constant literal 1 is an int (because it is small enough to be represented as an int, and int is the narrowest representation used for integral constants). Since 1 is an int, so is 1 << i. If i is greater than or equal to the number of value bits in an int, that expression has undefined behaviour, so in theory the result could be anything. In practice, however, the result is likely to be the same width as an int, so only the low-order bits will be affected.
It is certainly possible to convert the 1 to type T (although in general, you might need to be cautious about corner cases when T is signed), but it is easier to convert the 1 to an unsigned type at least as wide as T by using the maximum-width unsigned integer type defined in cstdint, uintmax_t:
val |= std::uintmax_t(1) << i;
In real-world code, it is common to see the assumption that the widest integer type is long long:
val |= 1ULL << i;
which will work fine if the program never attempts to instantiate the template with an extended integer type.
Of course, this is not the way to find the largest value for an integer type. The correct solution is to #include <limits> and then use the appropriate specialization of std::numeric_limits<T>::max().
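Putting those two points together, here is a hedged sketch of what the template might look like with the shift done in std::uintmax_t (keeping in mind that std::numeric_limits is still the right tool for the actual goal):
#include <cstdint>
#include <iostream>

template <typename T>
void Maximize(T &val, bool isSigned)
{
    int length = sizeof(T) * 8;   // assumes no padding bits, as discussed below
    if (isSigned)
        --length;                 // leave the sign bit clear
    val = 0;                      // no loop needed to clear the value
    for (int i = 0; i < length; ++i)
        val |= static_cast<T>(std::uintmax_t(1) << i);  // shift in a wide unsigned type
}

int main()
{
    long long i;
    Maximize(i, true);
    std::cout << i << "\n";       // 9223372036854775807 with a 64-bit long long
}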
C++ allows only one representation for positive (and unsigned) integers, and three possible representations for negative signed integers. Positive and unsigned integers are simply represented as a sequence of bits in binary notation. There may be padding bits as well, and signed integers have a single sign bit which must be 0 in the case of positive integers, so there is no guarantee that there are 8*sizeof(T) useful bits in the representation, even if the number of bits in a byte is known to be 8 (and, in theory, it could be larger). [Note 1]
The sign bit for negative signed integers is always 1, but there are three different formats for the value bits. The most common is "two's complement", where the value bits interpreted as a positive number would be exactly 2^k more than the actual value of the number, where k is the number of value bits. (This is equivalent to giving the sign bit a weight of -2^k, which is why it is called 2's complement.)
Another alternative is "one's complement", in which the value bits are all inverted individually. This differs by exactly one from two's-complement representation.
The third allowable alternative is "sign-magnitude", in which the value bits are precisely the absolute value of the negative number. This representation is frequently used for floating point values, but only rarely used in integer values.
Both sign-magnitude and one's complement suffer from the disadvantage that there is a bit pattern which represents "negative 0". On the other hand, two's complement representation has the feature that the magnitude of the most negative representable value is one larger than the magnitude of the most positive representable value, with the result that both -x and x/-1 can overflow, leading to undefined behaviour.
Notes
I believe that it is theoretically possible for padding to be inserted between the value bits and the sign bits, but I certainly do not know of any real-world implementation with that feature. However, the fact that attempting to shift a 1 into the sign bit position is undefined behaviour makes it incorrect to assume that the sign bit is contiguous with the value bits.
I was thinking about ways to maximize a variable programmatically.
You are trying to reinvent the wheel. The C++ standard library already has this functionality: std::numeric_limits<T>::max()
// x any kind of numeric type: any integer or any floating point value
x = std::numeric_limits<decltype(x)>::max();
This is also better since you will not rely on undefined behavior.
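For completeness, a tiny self-contained usage of that approach:
#include <iostream>
#include <limits>

int main()
{
    long long x = 0;
    x = std::numeric_limits<decltype(x)>::max();
    std::cout << x << "\n";   // 9223372036854775807 for a 64-bit long long
}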
As harold commented, the solution is to use T(1) << i instead of 1 << i. Also, as Some programmer dude mentioned, long long is represented as consecutive bytes (typically 8 bytes), with the sign bit at the MSB if it is signed.
Here is the code which is giving me the unexpected answer:
#include<bits/stdc++.h>
using namespace std;
int main()
{
cout<<(1<<50);
}
The answer I get is 0.
But if I change the line to
cout<<pow(2, 50);
I get the right answer.
Could someone explain the reason?
Assuming your compiler treats the constant 1 as a 32-bit integer, you shifted it so far to the left that only zeroes remain in the 32 bits you have. 50 is larger than 32.
Try this (run it):
#include <cstdint>
#include <iostream>
int main()
{
std::int64_t i { 1 }; // wide enough to allow 50 bits shift
std::cout << std::hex << ( i << 50 ); // should display 4000000000000
return 0;
}
From the C++ Standard (5.8 Shift operators)
1 The shift operators << and >> group left-to-right.
shift-expression:
additive-expression
shift-expression << additive-expression
shift-expression >> additive-expression
The operands shall be of integral or unscoped enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
Take into account that the behavior will also be undefined if the right operand is not greater than or equal to the length in bits of the left operand but the shift would reach the sign bit, because the integer literal 1 has the type signed int.
As for the function call pow(2, 50), it uses some algorithm that calculates the power.
You shift the "1" out of the 32 bit field, so zero is the result. Pow uses float representation where 2^50 can be handled.
EDIT
Without a suffix like "1LL" or "1ULL" (which produce 64-bit values), an integer literal is usually handled as 32 bits on x86 or x64 architectures. You can use
cout << (1ULL << 50);
or
cout << ((long long)1 << 50);
which should do it.
It's exactly what you're doing. You're shifting a single bit by 50 positions in a portion of memory that's 32 bits wide... What do you expect to happen? The bit goes somewhere else, but it's not inside the memory portion of the integer anymore. pow(2, 50) does its computation in double, so you're not shifting bits anymore.
Also, never use #include <bits/stdc++.h>. It's not standard, and it's slow. You should use it only in precompiled headers, but I'd avoid it even in that case.
cout<<(1<<50);
Your code treats 1 as an int, so it overflows. Instead, try:
cout << (1ULL << 50);
I recently faced a strange behavior using the right-shift operator.
The following program:
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <stdint.h>
int foo(int a, int b)
{
return a >> b;
}
int bar(uint64_t a, int b)
{
return a >> b;
}
int main(int argc, char** argv)
{
std::cout << "foo(1, 32): " << foo(1, 32) << std::endl;
std::cout << "bar(1, 32): " << bar(1, 32) << std::endl;
std::cout << "1 >> 32: " << (1 >> 32) << std::endl; //warning here
std::cout << "(int)1 >> (int)32: " << ((int)1 >> (int)32) << std::endl; //warning here
return EXIT_SUCCESS;
}
Outputs:
foo(1, 32): 1 // Should be 0 (but I guess I'm missing something)
bar(1, 32): 0
1 >> 32: 0
(int)1 >> (int)32: 0
What happens with the foo() function? I understand that the only difference between what it does and the last 2 lines is that the last two lines are evaluated at compile time. And why does it "work" if I use a 64-bit integer?
Any lights regarding this will be greatly appreciated !
Surely related, here is what g++ gives:
> g++ -o test test.cpp
test.cpp: In function 'int main(int, char**)':
test.cpp:20:36: warning: right shift count >= width of type
test.cpp:21:56: warning: right shift count >= width of type
It's likely the CPU is actually computing
a >> (b % 32)
in foo; meanwhile, the 1 >> 32 is a constant expression, so the compiler will fold the constant at compile-time, which somehow gives 0.
Since the standard (C++98 §5.8/1) states that
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
there is no contradiction having foo(1,32) and 1>>32 giving different results.
On the other hand, in bar you provided a 64-bit unsigned value; as 64 > 32, it is guaranteed the result must be 1 / 2^32 = 0. Nevertheless, if you write
bar(1, 64);
you may still get 1.
Edit: The logical right shift (SHR) behaves like a >> (b % 32/64) on x86/x86-64 (Intel #253667, Page 4-404):
The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.
However, on ARM (armv6&7, at least), the logical right-shift (LSR) is implemented as (ARMISA Page A2-6)
(bits(N), bit) LSR_C(bits(N) x, integer shift)
assert shift > 0;
extended_x = ZeroExtend(x, shift+N);
result = extended_x<shift+N-1:shift>;
carry_out = extended_x<shift-1>;
return (result, carry_out);
where (ARMISA Page AppxB-13)
ZeroExtend(x,i) = Replicate('0', i-Len(x)) : x
This guarantees a right shift of ≥32 will produce zero. For example, when this code is run on the iPhone, foo(1,32) will give 0.
This shows that shifting a 32-bit integer by ≥32 is non-portable.
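If a shift count that may reach the operand's width is unavoidable, the portable fix is to handle that case explicitly. A small sketch (names are my own, not from the answer above):
#include <cstdint>
#include <iostream>

// Right shift for 32-bit values where the count may be >= 32.
uint32_t shr32(uint32_t x, unsigned count)
{
    return count >= 32 ? 0u : (x >> count);  // avoid the undefined case explicitly
}

int main()
{
    std::cout << shr32(1u, 32) << "\n";  // 0 on every platform
    std::cout << shr32(8u, 2)  << "\n";  // 2
}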
OK. So it's in 5.8.1:
The operands shall be of integral or enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
So you have an Undefined Behaviour(tm).
What happens in foo is that the shift width is greater than or equal to the size of the data being shifted. In the C99 standard that results in undefined behaviour. It's probably the same in whatever C++ standard MS VC++ is built to.
The reason for this is to allow compiler designers to take advantage of any CPU hardware support for shifts. For example, the i386 architecture has an instruction to shift a 32-bit word by a number of bits, but the number of bits is defined in a field in the instruction that is 5 bits wide. Most likely, your compiler is generating the instruction by taking your bit-shift amount and masking it with 0x1F to get the bit shift in the instruction. This means that shifting by 32 is the same as shifting by 0.
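In other words, the generated i386 code effectively behaves like the following sketch (illustrative only; this masking is an implementation detail and nothing to rely on):
#include <iostream>

int shift_as_i386_would(int a, int b)
{
    return a >> (b & 0x1F);   // the instruction's shift-count field is 5 bits wide
}

int main()
{
    std::cout << shift_as_i386_would(1, 32) << "\n";  // 1, same as shifting by 0
}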
I compiled it on 32-bit Windows using the VC9 compiler. It gave me the following warning. Since sizeof(int) is 4 bytes on my system, the compiler is indicating that right-shifting by 32 bits results in undefined behavior. Since it is undefined, you cannot predict the result. Just for checking, I right-shifted by 31 bits and all the warnings disappeared and the result was also as expected (i.e. 0).
I suppose the reason is that the int type holds 32 bits (on most systems), but one bit is used for the sign as it is a signed type. So only 31 bits are used for the actual value.
The warning says it all!
But in fairness I got bitten by the same error once.
int a = 1;
cout << ( a >> 32);
is completely undefined. In fact, in my experience the compiler generally gives different results than the runtime. What I mean by this is that if the compiler can evaluate the shift expression at compile time, it may give you a different result from the same expression evaluated at runtime.
foo(1,32) performs a rotate-shift, so bits that should disappear on the right reappear on the left. If you do it 32 times, the single bit set to 1 is back to its original position.
bar(1,32) is the same, but the bit is in the 64-32+1=33rd bit, which is above the representable range of a 32-bit int. Only the 32 lowest bits are taken, and they are all 0's.
1 >> 32 is performed by the compiler. No idea why gcc uses a non-rotating shift here and not in the generated code.
Same thing for ((int)1 >> (int)32)