Double to uint64_t conversion - c++

The following demo program demonstrates some behaviour I don't understand.
#include <cstdint>
#include <limits>
#include <iostream>

constexpr double bits64 = 18446744073709551616.0; // 2^64

void diff_hash(double diff)
{
    double hash = bits64 / diff;
    uint64_t hash_64_1 = hash;
    uint64_t hash_64_2 = hash < std::numeric_limits<uint64_t>::max() ? hash : std::numeric_limits<uint64_t>::max();
    uint64_t hash_64_3 = std::numeric_limits<uint64_t>::max();
    if (hash < hash_64_3) {
        hash_64_3 = hash;
    }
    std::cout << "hash_64_1: " << hash_64_1 << ", " << "hash_64_2: " << hash_64_2 << ", " << "hash_64_3: " << hash_64_3 << std::endl;
}

int main()
{
    diff_hash(1);
    return 0;
}
Output:
hash_64_1: 0, hash_64_2: 0, hash_64_3: 18446744073709551615
Questions:
1.) Why is hash_64_1 == 0, even though the value that is assigned is clearly the max 64-bit value?
2.) Why is hash_64_2 == 0? I confirmed that if I change the line to
uint64_t hash_64_2 = hash < std::numeric_limits<uint64_t>::max() ? hash : std::numeric_limits<uint32_t>::max();
the value of hash_64_2 becomes the max uint32_t value.
Link to Wandbox example https://wandbox.org/permlink/HyXRX2CiNgIIpYkQ

18446744073709551616.0 / 1.0 is evaluated as a double. Its value is 18446744073709551616.0, assuming IEEE754. The behaviour on converting this to an out-of-range uint64_t is undefined. A common manifestation of that undefined behaviour is wrap-around to 0. (That's what most folk assume happens, but the behaviour is undefined when converting from an out-of-range floating point value.)
With the expression hash < std::numeric_limits<uint64_t>::max(), the right-hand side is converted implicitly to a double. But that number cannot be represented exactly as a double, so it is rounded to the nearest double, which is 18446744073709551616.0, and the comparison is false. Note also that the type of the whole conditional expression is double (the usual arithmetic conversions are applied to its second and third operands), so even the max() picked by the false branch is converted back to 18446744073709551616.0, and storing that into hash_64_2 is the same out-of-range conversion as before. Hence hash_64_2 is 0 too.
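You can see the rounding for yourself with a short check (a minimal sketch, assuming IEEE754 doubles):

#include <cstdint>
#include <cstdio>
#include <limits>

int main()
{
    // uint64_t max (2^64 - 1) is not representable as a double,
    // so it rounds to the nearest double, which is exactly 2^64.
    double m = std::numeric_limits<uint64_t>::max();
    std::printf("%.1f\n", m); // prints 18446744073709551616.0
}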

1.) Why is hash_64_1 == 0, even though the value that is assigned is clearly the max 64-bit value?
That is actually hardly clear. hash is clearly greater than the max uint64_t value. The behaviour of converting a floating point value that is unrepresentable in the target type to an integer is undefined.

"is assigned is clearly max 64 value" --> Off-by-one.
The max uint64_t value is 18446744073709551615, not 18446744073709551616.
Effects seen are due to UB of converting an out of range double to uint64_t.
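If you need the clamping the code is aiming for, the comparison has to be done against a double that is exactly 2^64, before the conversion. A minimal sketch of one way to do it (the function name is mine, and the handling of NaN and negatives is an arbitrary policy choice):

#include <cmath>
#include <cstdint>

uint64_t to_u64_saturating(double d)
{
    if (std::isnan(d) || d < 0.0) return 0;  // policy choice for NaN and negatives
    if (d >= 0x1p64) return UINT64_MAX;      // 0x1p64 is exactly 2^64; saturate above it
    return static_cast<uint64_t>(d);         // now in range, so the conversion is well defined
}

With this, to_u64_saturating(bits64 / 1.0) yields 18446744073709551615 instead of tripping undefined behaviour.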

Related

How do I flip the bits of a double?

Consider this code:
#include <iostream>

int main(){
    double k = ~0.0;
    std::cout << k << "\n";
}
It doesn't compile. I want to get a double value with all the bits set, which would be a NaN. Why doesn't this code work, and how do I flip all the bits of a double?
Regarding the code in the original question:
The 0 here is the int literal 0. ~0 is an int with value -1. You are initializing k with the int -1. The conversion from int to double doesn't change the numerical value (but does change the bit pattern), and then you print out the resulting double (which is still representing -1).
Now, for the current question: You can't apply bitwise NOT to a double. It's just not an allowed operation, precisely because it tends not to do anything useful to floating point values. It exists only for built-in integral types (plus any type with an overloaded operator~).
If you would like to flip all the bits in an object, the standard conformant way is to do something like this:
#include <cstddef>
#include <memory>

void flip_bits(auto &x) {
    // iterate through the bytes of x and flip all of them
    std::byte *p = reinterpret_cast<std::byte*>(std::addressof(x));
    for (std::size_t i = 0; i < sizeof(x); i++) p[i] = ~p[i];
}
Then
#include <iostream>

int main() {
    double x = 0;
    flip_bits(x);
    std::cout << x << "\n";
}
may (will usually) print some variation of nan (dependent on how your implementation actually represents double, of course).
Example on Godbolt
// the numeric constant ~0 is an integer
int foo = ~0;
std::cout << foo << '\n'; // prints -1

// now the int value of -1 is converted to a double
double k = foo;
If you want to invert all of the bits, you can use a union with a uint64_t. (Be aware that reading a union member other than the one last written is technically undefined behaviour in C++, although most compilers support it as an extension.)
#include <iostream>
#include <cstdint>

int main(){
    union {
        double k;
        uint64_t u;
    } double_to_uint64;

    double_to_uint64.u = ~0ULL;
    std::cout << double_to_uint64.k;
}
This will print -nan on a typical IEEE754 implementation.
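If you have C++20 available, std::bit_cast gives the same result without the union read. A minimal sketch (again assuming IEEE754 doubles):

#include <bit>
#include <cstdint>
#include <iostream>

int main() {
    double d = 0.0;
    uint64_t bits = std::bit_cast<uint64_t>(d); // well-defined view of the object representation
    d = std::bit_cast<double>(~bits);           // flip every bit and convert back
    std::cout << d << "\n";                     // typically prints -nan
}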

Unexpected boolean result with vector size and -1 [duplicate]

Please take a look at this simple program:
#include <iostream>
#include <vector>

using namespace std;

int main() {
    vector<int> a;
    std::cout << "vector size " << a.size() << std::endl;
    int b = -1;
    if (b < a.size())
        std::cout << "Less";
    else
        std::cout << "Greater";
    return 0;
}
I'm confused by the fact that it outputs "Greater", even though it's obvious that -1 is less than 0. I understand that the size method returns an unsigned value, but the comparison is still applied to -1 and 0. So what's going on? Can anyone explain this?
Because the size of a vector is an unsigned integral type. You are comparing an unsigned type with a signed one, and the negative signed integer is converted to unsigned. On a two's complement system that conversion produces a very large unsigned value.
This code sample shows the same behaviour that you are seeing:
#include <iostream>

int main()
{
    std::cout << std::boolalpha;
    unsigned int a = 0;
    int b = -1;
    std::cout << (b < a) << "\n";
}
Output:
false
The signature for vector::size() is:
size_type size() const noexcept;
size_type is an unsigned integral type. When an unsigned and a signed integer are compared, the signed one is converted to unsigned. Here, -1 is negative, so it wraps around, effectively yielding the maximal representable value of size_type. Hence it will compare as greater than zero.
Viewed as unsigned, -1 is a higher value than zero: the high bit is set to indicate that the number is negative, but an unsigned comparison uses this bit to expand the range of representable numbers, so it is no longer treated as a sign bit. The comparison is done as (unsigned int)-1 < 0, which is false.
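If C++20 is available, the comparison helpers in <utility> compare by mathematical value and sidestep the conversion entirely. A minimal sketch of the question's test rewritten with std::cmp_less:

#include <iostream>
#include <utility>
#include <vector>

int main() {
    std::vector<int> a;
    int b = -1;
    // std::cmp_less compares the mathematical values, so the signed
    // operand is never converted to unsigned.
    if (std::cmp_less(b, a.size()))
        std::cout << "Less";    // this branch is taken: -1 really is less than 0
    else
        std::cout << "Greater";
}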

Why -INT_MIN is NOT 2147483648 for uint64_t type

I understand the result for p. Could someone please explain why up2 (uint64_t type) != 2147483648, but up (uint32_t type) == 2147483648?
Some mention that assigning -INT_MIN to the unsigned integer up2 will cause overflow, but isn't -INT_MIN already a positive number, so that it is fine to assign it to uint64_t up2?
Why does it seem to be OK to assign -INT_MIN to uint32_t up? It produces the correct result, 2147483648.
#include <iostream>
#include <climits>
#include <cstdint>

using namespace std;

int main() {
    int n = INT_MIN;
    int p = -n;
    uint32_t up = -n;
    uint64_t up2 = -n;
    cout << "n: " << n << endl;
    cout << "p: " << p << " up: " << up << " up2: " << up2 << endl;
    return 0;
}
Result:
n: -2147483648
p: -2147483648 //because -INT_MIN = INT_MIN for signed integer
up: 2147483648 //because up is unsigned int from 0 to 4,294,967,295 (2^32 − 1) and can cover 2147483648
up2: 18446744071562067968 //Question here. WHY up2 != up (2147483648)???
The behaviour of int p = -n; is undefined on a two's complement system, because negating INT_MIN overflows the int type. So your entire program has undefined behaviour.
This is why you see INT_MIN defined as -INT_MAX - 1 in many libraries.
Note that while you invoke undefined behavior due to signed integer overflow, here is the most likely explanation for the behavior you are observing:
If int is 32 bits on your system and the negation wraps around (as it typically does on a two's complement machine), -n is INT_MIN again: a negative value. When that negative value is converted to a 64-bit unsigned type, its sign bit is extended into the upper 32 bits.
It may make more sense if you print out your values in base-16 instead.
n = 0x80000000
p = 0x80000000
up = 0x80000000
up2 = 0xFFFFFFFF80000000
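You can reproduce these bit patterns without relying on the overflow by converting n itself, with no negation and hence no UB. A minimal sketch:

#include <climits>
#include <cstdint>
#include <iostream>

int main() {
    int n = INT_MIN;
    // Conversion to unsigned is well defined (modulo 2^N):
    // to 32 bits the pattern is kept as-is; to 64 bits it is sign-extended first.
    std::cout << std::hex
              << static_cast<uint32_t>(n) << "\n"                        // 80000000
              << static_cast<uint64_t>(static_cast<int64_t>(n)) << "\n"; // ffffffff80000000
}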
What you see is -n converted to a uint64_t, where the wrap-around happens modulo 2^64 rather than 2^32:
18446744073709551616 - 2147483648 = 18446744071562067968
The expression -n in your case causes undefined behavior, since the result cannot fit into the range of the int data type. (Whether or not you assign this undefined result to a variable of a "wider" type doesn't matter at all; the negation itself is performed on an int.)
Trying to explain undefined behavior makes little sense.
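If the goal is to get 2147483648 into up2 without undefined behaviour, widen before negating. A minimal sketch (assuming a 32-bit int, so the negated value fits comfortably in int64_t):

#include <climits>
#include <cstdint>
#include <iostream>

int main() {
    int n = INT_MIN;
    // Widen to 64 bits first; -(int64_t)INT_MIN is representable,
    // so the negation no longer overflows.
    uint64_t up2 = static_cast<uint64_t>(-static_cast<int64_t>(n));
    std::cout << up2 << "\n"; // 2147483648
}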