What is happening when I left shift beyond INT_MAX ? - c++

I have this piece of code
int a = 1;
while(1) {
a<<=1;
cout<<a<<endl;
}
In the output, I get
.
.
536870912
1073741824
-2147483648
0
0
Why am I not reaching INT_MAX? And what is really happening beyond that point?

You have a signed int, so numbers are in two's complement. This is what happens
00..01 = 1
00..10 = 2
[...]
01..00 = 1073741824
10..00 = -2147483648 // highest bit set: in two's complement this pattern is -(2^31)
00..00 = 0
You cannot reach INT_MAX; at most you will get 2^30.
As pointed out in the comments, the C++ standard does not require two's complement (until C++20), so this code could behave differently on other machines; in fact, shifting a 1 into the sign bit of a signed type is undefined behaviour.

From ISO/IEC 14882:2011 Clause 5.8/2
The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

Take a look at this [reference](http://www.cplusplus.com/reference/climits/): assuming INT_MAX == 2^15-1, a loop like yours would pass through 2^14, 2^15 and 2^16, but never 2^15-1. INT_MAX may differ on your platform (see the reference; it may be greater), so try this on your machine:
#include<climits>
#include<iostream>
int main(){
int a = 1;
int iter = 0;
std::cout << "INT_MAX == " << INT_MAX << " in my env" << std::endl;
while(1) {
a <<=1;
std::cout << "2^" << ++iter << "==" << a << std::endl;
if((a-1) == INT_MAX){
std::cout << "Reach INT_MAX!" << std::endl;
break;
}
}
return 0;
}
Note how INT_MAX is formed: 2^exp - 1.

According to the C++ Standard:
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand
In your example, when you shift the number left, the vacated bits are zero-filled. As you can see, all your numbers are even, because the lowest bit is always filled with zero. You should write:
a = ( a << 1 ) | 1;
if you want to reach INT_MAX. The loop should also check that the number is still positive before shifting.

Related

Question about Bitwise Shift in Microsoft C++ [duplicate]

I am doing the following bitwise shift in Microsoft C++:
uint8_t arr[3] = {255, 255, 255};
uint8_t value = (arr[1] << 4) >> 4;
The result of these operations confused me quite a bit:
value = 255
However, if I do the bitwise shift separately:
value = (arr[1] << 4);
value = value >> 4;
the answer is different and makes much sense:
value = 15
Can someone explain to me why this happens? I am familiar with the concepts of bitwise shift, or so I believed...
Thanks in advance!
(P.S.: It seems g++ will have the same behavior. I am probably missing some important concepts with bitwise shift. Any help is greatly appreciated!)
In this expression with shift operators
(arr[1] << 4) >> 4;
the integral promotions are applied. That is, the operand arr[1] is promoted to type int, and an int can store the result of the expression arr[1] << 4.
From the C++ 14 Standard (5.8 Shift operators, p.#1)
...The operands shall be of integral or unscoped enumeration type and
integral promotions are performed. The type of the result is that of
the promoted left operand. The behavior is undefined if the right
operand is negative, or greater than or equal to the length in bits of
the promoted left operand.
Here is a demonstration program
#include <iostream>
#include <iomanip>
#include <type_traits>
#include <cstdint>
int main()
{
uint8_t x = 255;
std::cout << "std::is_same_v<decltype( x << 4 ), int> is "
<< std::boolalpha
<< std::is_same_v<decltype( x << 4 ), int> << '\n';
std::cout << "x << 4 = " << ( x << 4 ) << '\n';
}
The program output is
std::is_same_v<decltype( x << 4 ), int> is true
x << 4 = 4080
As for this code snippet
value = (arr[1] << 4);
value = value >> 4;
in the first assignment statement, the result of the shift operation is truncated back to uint8_t.
The expression (arr[1] << 4) implicitly promotes the value of arr[1] to type int before applying the shift, so the "intermediate" result does not "lose" any bits (cf., for example, the explanation in implicit conversions).
However, when you write value = (arr[1] << 4);, this "intermediate" result is converted back to uint8_t, and in that step the upper bits are cut off.
See the difference when you write uint8_t value = ((uint8_t)(arr[1] << 4)) >> 4;

Why (int)pow(2, 32) == -2147483648

On the Internet I found the following problem:
int a = (int)pow(2, 32);
cout << a;
What does it print on the screen?
At first I thought it would print 0,
but after I wrote the code and executed it, I got -2147483648. Why?
Also I noticed that even (int)(pow(2, 32) - pow(2, 31)) equals -2147483648.
Can anyone explain why (int)pow(2, 32) equals -2147483648?
Assuming int is 32 bits (or less) on your machine, this is undefined behavior.
From the standard, conv.fpint:
A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.
Most commonly int is 32 bits, and it can represent values in the interval [-2^31, 2^31-1] which is [-2147483648, 2147483647]. The result of std::pow(2, 32) is a double that represents the exact value 2^32. Since 2^32 exceeds the range that can be represented by int, the conversion attempt is undefined behavior. This means that in the best case, the result can be anything.
The same goes for your second example: pow(2, 32) - pow(2, 31) is simply the double representation of 2^31, which (just barely) exceeds the range that can be represented by a 32-bit int.
The correct way to do this would be to convert to a large enough integral type, e.g. int64_t:
std::cout << static_cast<int64_t>(std::pow(2, 32)) << "\n"; // prints 4294967296
The behavior you are seeing relates to the use of two's complement to represent
signed integers. For 3-bit numbers the range of values is [-4, 3]. For 32-bit numbers it is -(2^31) to (2^31)-1 (i.e. -2147483648 to 2147483647).
This is because the result of the operation overflows the int data type, exceeding its maximum value. Don't cast to int; cast to long instead (on platforms where long is 64 bits):
#include <iostream>
#include <cmath>
#include <climits>
using namespace std;
int main() {
cout << (int)pow(2, 32) << endl;
// undefined behavior: 2^32 does not fit in int (printed -2147483648 in the question)
cout << INT_MIN << endl;
//-2147483648
cout << INT_MAX << endl;
//2147483647
cout << (long)pow(2, 32) << endl;
//4294967296
cout << LONG_MIN << endl;
// -9223372036854775808
cout << LONG_MAX << endl;
// 9223372036854775807
return 0;
}
If you are not aware of integer overflow, you can check this link.

How to obtain hexadecimal value of -nan

I am writing some code in intel intrinsics and did this:
#include <iostream>
#include <xmmintrin.h>
float data[4];
__m128 val1 = _mm_set_ps1(2);
__m128 val2 = _mm_set_ps1(1);
val1 = _mm_cmpgt_ps(val1, val2);
_mm_store_ps(data, val1);
std::cout << std::hex << data[0];
I am trying to get the hexadecimal value of "true" in SSE intrinsics (which prints as -nan), but I keep getting -nan itself printed whenever I try to print its hexadecimal value.
I also tried using std::oct and std::dec and neither of those worked.
I also tried comparing 0xFFFFFFFF and data[0] in different combinations and got this:
float data[4];
__m128 val1 = _mm_set_ps1(2);
__m128 val2 = _mm_set_ps1(1);
val1 = _mm_cmpgt_ps(val1, val2);
_mm_store_ps(data, val1);
float f = 0xFFFFFFFF;
float g = 0xFFFFFFFF;
std::cout << std::dec << (data[0] == f) << "\n"; // Prints "0"
std::cout << std::dec << (data[0] == data[0]) << "\n"; // Prints "0"
std::cout << std::dec << (f == g); // Prints "1"
Is there any way for me to print the hexadecimal value of -nan and if not, can somebody please tell me the binary, hexadecimal, etc. value of -nan?
Per the IEEE 754 specification, NaN is a floating-point value that has all of its exponent bits set to "1" and a non-zero significand.
So a value with all of its bits set to "1" is also a NaN.
If you want to see the raw bytes, just print the raw bytes:
#include <cmath>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
template<typename T>
std::string get_hex_bytes(T x) {
std::stringstream res;
auto p = reinterpret_cast<const unsigned char*>(&x);
for (size_t i = 0; i < sizeof(x); ++i) {
if (i)
res << ' ';
res << std::setfill('0') << std::setw(2) << std::hex << (int)p[i];
}
return res.str();
}
int main() {
float data = NAN;
std::cout << get_hex_bytes(data) << std::endl;
}
On a little-endian machine this will print something like:
00 00 c0 7f
P.S. float f = 0xFFFFFFFF; will not set all of the bits to "1"; it simply converts the integer 0xFFFFFFFF to the nearest representable floating-point value (with some loss of precision).
As the manual says, _mm_cmpgt_ps (which is really cmpps with a specific comparison predicate),
Performs a SIMD compare of the packed single-precision floating-point values in the source operand (second
operand) and the destination operand (first operand) and returns the results of the comparison to the destination
operand. The comparison predicate operand (third operand) specifies the type of comparison performed on each of
the pairs of packed values. The result of each comparison is a doubleword mask of all 1s (comparison true) or all
0s (comparison false). The sign of zero is ignored for comparisons, so that –0.0 is equal to +0.0.
(emphasis added)
"All 1s", or 0xFFFFFFFF in hexadecimal (since it's 32 bits per element), has the sign bit set (so there is a legitimate reason to print a - sign in front of whatever else this number might be) and since the exponent is all ones and the significand is not zero, it is also a NaN. The NaN-ness usually isn't very relevant, the main intended use for this result is as a mask in bitwise operations (eg _mm_and_ps, _mm_blendv_ps, etc), which do not care about the special semantics of NaN.
First of all, there's no such thing as a "negative NaN": NaN is, by definition, Not a Number, so negating it is meaningless. -nan is the same sort of thing as nan.
There's no exactly standards-compliant way to get at the underlying bits that make up a floating-point value, but the closest thing is memcpy. Simply copy from a float or double into an equally-sized unsigned integer type, then print that with std::hex active.

Is right-shifting an unsigned integer by its total number of bits UB ? [duplicate]

This question already has answers here:
Arithmetic right shift gives bogus result?
(2 answers)
Closed 7 years ago.
I wanted to check that some big calculated memory needs (stored in an unsigned long long) would be roughly compatible with the memory model used to compile my code.
I assumed that right-shifting the needs by the number of bits in a pointer would result in 0 if and only if memory needs would fit in the virtual address space (independently of practical OS limitations).
Unfortunately, I found out some unexpected results when shifting a 64 bit number by 64 bits on some compilers.
Small demo:
const int ubits = sizeof (unsigned)*8; // number of bits, assuming 8 per byte
const int ullbits = sizeof (unsigned long long)*8;
cout << ubits << " bits for an unsigned\n";
cout << ullbits << " bits for a unsigned long long \n";
unsigned utest=numeric_limits<unsigned>::max(); // some big numbers
unsigned long long ulltest=numeric_limits<unsigned long long>::max();
cout << "unsigned "<<utest << " rshift by " << ubits << " = "
<< (utest>>ubits)<<endl;
cout << "unsigned long long "<<ulltest << " rshift by " << ullbits << " = "
<< (ulltest>>ullbits)<<endl;
I expected both displayed rshift results to be 0.
This works as expected with gcc.
But with MSVC 13 :
in 32-bit debug: the 32-bit rshift on unsigned has NO EFFECT (displays the original number), but the 64-bit shift of the unsigned long long is 0 as expected.
in 64-bit debug: the rshift has NO EFFECT in both cases.
in 32- and 64-bit release: the rshift is 0 as expected in both cases.
I'd like to know if this is a compiler bug, or if this is undefined behaviour.
According to the C++ Standard (5.8 Shift operators)
...The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted
left operand
The same is written in the C Standard (6.5.7 Bitwise shift operators)
3 The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If the value
of the right operand is negative or is greater than or equal to the
width of the promoted left operand, the behavior is undefined.

Why is (int)'\xff' != 0xff but (int)'\x7f' == 0x7f?

Consider this code :
typedef union
{
int integer_;
char mem_[4];
} MemoryView;
int main()
{
MemoryView mv;
mv.integer_ = (int)'\xff';
for(int i=0;i<4;i++)
std::cout << mv.mem_[i]; // output is \xff\xff\xff\xff
mv.integer_ = 0xff;
for(int i=0;i<4;i++)
std::cout << mv.mem_[i]; // output is \xff\x00\x00\x00
// now i try with a value less than 0x80
mv.integer_ = (int)'\x7f';
for(int i=0;i<4;i++)
std::cout << mv.mem_[i]; // output is \x7f\x00\x00\x00
mv.integer_ = 0x7f;
for(int i=0;i<4;i++)
std::cout << mv.mem_[i]; // output is \x7f\x00\x00\x00
// now i try with 0x80
mv.integer_ = (int)'\x80';
for(int i=0;i<4;i++)
std::cout << mv.mem_[i]; // output is \x80\xff\xff\xff
mv.integer_ = 0x80;
for(int i=0;i<4;i++)
std::cout << mv.mem_[i]; // output is \x80\x00\x00\x00
}
I tested it with both GCC4.6 and MSVC2010 and results was same.
When I try values less than 0x80 the output is correct, but with values of 0x80 and above,
the upper three bytes are '\xff'.
CPU : Intel 'core 2 Duo'
Endianness : little
OS : Ubuntu 12.04LTS (64bit), Windows 7(64 bit)
It's implementation-specific whether type char is signed or unsigned.
Assigning a variable of type char the value of 0xFF might either yield 255 (if type is really unsigned) or -1 (if type is really signed) in most implementations (where the number of bits in char is 8).
Values less, or equal to, 0x7F (127) will fit in both an unsigned char and a signed char which explains why you are getting the result you are describing.
#include <iostream>
#include <limits>
int
main (int argc, char *argv[])
{
std::cerr << "unsigned char: "
<< +std::numeric_limits<unsigned char>::min ()
<< " to "
<< +std::numeric_limits<unsigned char>::max ()
<< ", 0xFF = "
<< +static_cast<unsigned char> ('\xFF')
<< std::endl;
std::cerr << " signed char: "
<< +std::numeric_limits<signed char>::min ()
<< " to "
<< +std::numeric_limits<signed char>::max ()
<< ", 0xFF = "
<< +static_cast<signed char> ('\xFF')
<< std::endl;
}
typical output
unsigned char: 0 to 255, 0xFF = 255
signed char: -128 to 127, 0xFF = -1
To circumvent the problem you are experiencing, explicitly declare your variable as either signed or unsigned; in this case, casting your value to unsigned char is sufficient:
mv.integer_ = static_cast<unsigned char> ('\xFF'); /* 255, NOT -1 */
Side note:
You are invoking undefined behaviour when reading a member of a union other than the last member you wrote to; the standard doesn't specify what happens in this case. Under most implementations it will work as expected, and accessing union.mem_[0] will most probably yield the first byte of union.integer_, but this is not guaranteed.
The type of '\xff' is char. char is a signed integral type on a lot of platforms, so the value of '\xff' is negative (-1 rather than 255). When you convert (cast) that to an int (also signed), you get an int with the same, negative, value.
Anything strictly less than 0x80 will be positive, and you'll get a positive out of the conversion.
Because '\xff' is a signed char (char defaults to signed on many architectures, but not always), it is sign-extended when converted to an integer, to make it a 32-bit (in this case) int.
In binary arithmetic, nearly all negative representations use the highest bit to indicate "this is negative" and some sort of "inverse" logic to represent the value. The most common is to use "two's complement", where there is no "negative zero". In this form, all ones is -1, and the "most negative number" is a 1 followed by a lot of zeros, so 0x80 in 8 bits is -128, 0x8000 in 16 bits is -32768, and 0x80000000 is -2147 million (and some more digits).
A solution, in this case, would be to use static_cast<unsigned char>('\xff').
Basically, 0xff stored in a signed 8-bit char is -1. Whether a char without a signed or unsigned specifier is signed or unsigned depends on the compiler and/or platform, and in this case it seems to be signed.
Cast to an int, it keeps the value -1, which stored in a 32 bit signed int is 0xffffffff.
0x7f on the other hand stored in an 8 bit signed char is 127, which cast to a 32 bit int is 0x0000007f.