What is the benefit of CUDA's reinterpretation builtins over C/C++ reinterpretation casting?

CUDA has some built-in device-side functions for reinterpreting integral values as floating-point values and vice versa:
float __int_as_float(int);
int __float_as_int(float);
double __longlong_as_double(long long);
long long __double_as_longlong(double);
Why, if at all, is it preferable to use these over a generic:
y = reinterpret_cast<T&>(x);
or even the C-language
y = *((T*)(&x));
?

Related

Convert an integer's binary data to float

Let's say I have an integer:
unsigned long long int data = 4599331010119547059;
Now I want to convert this data to a double. I basically want to change the type, but keep the bits exactly as they were. For the given example, the resulting double value is 0.31415926536.
How can I do that in C++? I saw some methods using a union, but many advised against that approach.
Since C++20, you can use std::bit_cast:
std::bit_cast<double>(data)
Prior to C++20, you can use std::memcpy:
double d;
static_assert(sizeof d == sizeof data);
std::memcpy(&d, &data, sizeof d);
Note that the result will vary depending on the floating-point representation (IEEE-754 is ubiquitous, though) as well as on whether the floating-point and integer types have the same endianness.
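A self-contained sketch of both approaches with the value from the question (it assumes a C++20 compiler for std::bit_cast and a platform where double and unsigned long long are both 8 bytes):
#include <bit>
#include <cstring>
#include <iostream>

int main() {
    unsigned long long data = 4599331010119547059ULL;

    // C++20: std::bit_cast copies the object representation into the new type.
    double d1 = std::bit_cast<double>(data);

    // Pre-C++20: std::memcpy the bytes; compilers optimize this to a single move.
    double d2;
    std::memcpy(&d2, &data, sizeof d2);

    std::cout << d1 << ' ' << d2 << '\n';   // prints 0.314159 0.314159 with IEEE-754 doubles
}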
Taking the question at face value (assuming you have a valid reason to do this!), this is the only proper way of doing it in the current C++ standard:
int i = get_int();
float x;
static_assert(sizeof(float) == sizeof(int), "!!!");
memcpy(&x, &i, sizeof(x));
You can use reinterpret_cast:
float f = reinterpret_cast<float&>(data);
For your value I don't get 0.314..., because data is 64 bits wide and reinterpret_cast<float&> reads only the first sizeof(float) bytes of it; reinterpret_cast<double&>(data) would reinterpret all 64 bits. Note also that this kind of access violates the strict aliasing rules and is formally undefined behavior, which is why std::memcpy or std::bit_cast is preferred; but that's how you could do it.

Check if the sum of two unsigned ints is larger than uint_max

Suppose I have two integers x and y, and I want to check whether their sum is larger than UINT_MAX.
#define UINT64T_MAX std::numeric_limits<uint64_t>::max()
uint64_t x = foo();
uint64_t y = foo();
bool carry = UINT64T_MAX - x < y;
That code will work, but I want to know if there's a more efficient way to do it - possibly using some little-known feature that CPUs have.
In C++, unsigned integer overflow has well-defined behavior. If you add two unsigned integers and the result is smaller than either one, then the calculation overflowed. (The result will always be smaller than both, so it doesn't matter which one you check.)
#define UINT64T_MAX std::numeric_limits<uint64_t>::max()
uint64_t x = foo();
uint64_t y = foo();
uint64_t z = x + y;
bool carry = z < x;
I'm confident that this is the best way to do it in portable, well-defined C++. Clang and GCC both compile this trivial example to the optimal sequence of two amd64 instructions (add x, y; setc carry).
This does not generalize to signed integer overflow, as signed integer overflow is undefined behavior (although some C++ committee members are looking to change that).
Some compilers offer non-standard ways to check for overflow after various arithmetic operations, not just addition, and not only for unsigned numbers. Using them for that added functionality might be worth investigating if you can afford to lose portability. For the specific case of unsigned addition overflow, though, any performance gain is likely to be zero or negligible, so it is probably not worth losing portability:
auto t = x + y;
bool x_y_overflows_unsigned = t < x || t < y; // the second check is actually unnecessary
would be hard to beat, and is arguably clearer insofar as using subtraction with unsigned types often introduces bugs.
But if you are in any doubt, check the generated assembly.
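As one example of those non-standard facilities, GCC and Clang provide overflow-checking builtins; the following is a sketch (it assumes GCC or Clang and is not portable to other compilers):
#include <cstdint>

// __builtin_add_overflow stores the (wrapped) sum through the pointer and
// returns true if the mathematical result did not fit in the destination type.
bool add_overflows(std::uint64_t x, std::uint64_t y, std::uint64_t* sum) {
    return __builtin_add_overflow(x, y, sum);
}
On x86-64 this typically compiles to the same add/setc pair as the portable version above.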

Copying bytes directly from float to an unsigned int in C++ Visual Studio?

I am trying to convert a float directly into an unsigned integer WITHOUT ANY OF THE IMPLICIT CONVERSION MATH (so not the C-style or static casts), just copying the bytes directly to the other type. In Visual Studio 2015 on Windows, the sizes of a float and an unsigned integer are the same (4 bytes), so I don't think there is any problem on that end... I came up with a solution, but there has got to be a better way to do what I want.
unsigned int x = 3;
float y = 2.4565;
*reinterpret_cast<float*>(&x) = y;
This does what I want and sets x to 1075656524.
I would prefer a cross-platform solution, if there is one. I know the sizes of types can vary from platform to platform, so that might be impossible.
EDIT: To clarify, I want all of the bytes of the float copied into the unsigned int unchanged. Every single bit stored in the float should be stored in the unsigned integer. Also is there a solution that does not use memcpy? I want to avoid using deprecated functions.
I am trying to convert a float directly into an unsigned integer WITHOUT ANY OF THE IMPLICIT CONVERSION MATH, (so not the C style or static casts) just copying the bytes directly to the other
It seems like all you want to do is copy the bit pattern from one memory location to another. The standard library function memcpy can be used for that. Just realize that if sizeof(int) is different than sizeof(float), all of this is moot.
unsigned int x = 3;
float y = 2.4565;
static_assert(sizeof(int) == sizeof(float), "Can't memcpy a float to an int");
memcpy(&x, &y, sizeof(y));
A more portable solution would be to use an array of uint8_t or int8_t.
uint8_t x[sizeof(float)];
float y = 2.4565;
memcpy(x, &y, sizeof(y));
Now you can examine the bit pattern by examining the values of the elements of the array.
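Here is a minimal, self-contained sketch of the memcpy approach applied to the asker's numbers (the expected output 1075656524 assumes a 32-bit IEEE-754 float and a 32-bit unsigned int):
#include <cstring>
#include <iostream>

int main() {
    float y = 2.4565f;
    unsigned int x = 0;
    static_assert(sizeof x == sizeof y, "Can't memcpy a float to an unsigned int");
    std::memcpy(&x, &y, sizeof x);   // copy the float's bytes into x unchanged
    std::cout << x << '\n';          // prints 1075656524 on the platform described above
}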

Int or Unsigned Int to float without getting a warning

Sometimes I have to convert from an unsigned integer value to a float. For example, my graphics engine has a SetScale(float x, float y, float z) function that takes floats, and I have an object whose size is an unsigned int. I want to convert the unsigned int to a float to properly scale an entity (the example is very specific, but I hope you get the point).
Now, what I usually do is:
unsigned int size = 5;
float scale = float(size);
My3DObject->SetScale(scale , scale , scale);
Is this good practice at all, under certain assumptions (see Notes)? Is there a better way than to litter the code with float()?
Notes: I cannot touch the graphics API. I have to use the SetScale() function which takes in floats. Moreover, I also cannot touch the size, it has to be an unsigned int. I am sure there are plenty of other examples with the same 'problem'. The above can be applied to any conversion that needs to be done and you as a programmer have little choice in the matter.
My preference would be to use static_cast:
float scale = static_cast<float>(size);
but what you are doing is functionally equivalent and fine.
There is an implicit conversion from unsigned int to float, so the cast is strictly unnecessary.
If your compiler issues a warning, then there isn't really anything wrong with using a cast to silence the warning. Just be aware that if size is very large it may not be representable exactly by a float.
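A small sketch of that caveat: 2^24 + 1 is the first unsigned integer value that a 32-bit IEEE-754 float cannot represent exactly, so the conversion rounds.
#include <iostream>

int main() {
    unsigned int size = 16777217;              // 2^24 + 1
    float scale = static_cast<float>(size);    // rounds to 16777216.0f
    std::cout << size << " -> "
              << static_cast<unsigned int>(scale) << '\n';   // prints 16777217 -> 16777216
}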

Which variables should I typecast when doing math operations in C/C++?

For example, when I'm dividing two ints and want a float returned, I superstitiously write something like this:
int a = 2, b = 3;
float c = (float)a / (float)b;
If I do not cast a and b to floats, it'll do integer division and return an int.
Similarly, if I want to multiply a signed 8-bit number with an unsigned 8-bit number, I will cast them to signed 16-bit numbers before multiplying for fear of overflow:
u8 a = 255;
s8 b = -127;
s16 c = (s16)a * (s16)b;
How exactly does the compiler behave in these situations when not casting at all or when only casting one of the variables? Do I really need to explicitly cast all of the variables, or just the one on the left, or the one on the right?
Question 1: Float division
int a = 2, b = 3;
float c = static_cast<float>(a) / b; // need to convert 1 operand to a float
Question 2: How the compiler works
Five rules of thumb to remember:
Arithmetic operations are always performed on values of the same type.
The result type is the same as that of the (promoted) operands.
The smallest type that arithmetic operations are performed on is int.
ANSI C (and thus C++) uses value-preserving integer promotion.
Each operation is done in isolation.
The ANSI C conversion rules are as follows (most of them also apply to C++, though not all of these types are officially supported there yet):
If either operand is a long double, the other is converted to a long double.
If either operand is a double, the other is converted to a double.
If either operand is a float, the other is converted to a float.
If either operand is an unsigned long long, the other is converted to unsigned long long.
If either operand is a long long, the other is converted to long long.
If either operand is an unsigned long, the other is converted to unsigned long.
If either operand is a long, the other is converted to long.
If either operand is an unsigned int, the other is converted to unsigned int.
Otherwise both operands are converted to int.
Overflow
Overflow is always a potential problem. Note that the type of the result is the same as that of the input operands, so all of the operations can overflow; yes, you do need to worry about it (though the language does not provide any explicit way to catch this happening).
As a side note:
Unsigned division can not overflow but signed division can.
std::numeric_limits<int>::max() / -1 // No Overflow
std::numeric_limits<int>::min() / -1 // Will Overflow
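To see how the promotion rules above play out for the 8-bit multiplication in the question, here is a sketch (using uint8_t/int8_t in place of the asker's u8/s8): both operands are promoted to int before the multiplication, so the intermediate product cannot overflow.
#include <cstdint>
#include <iostream>

int main() {
    std::uint8_t a = 255;
    std::int8_t  b = -127;
    // Both a and b are promoted to int first, so the product -32385 is
    // computed in int and only then converted to int16_t for storage.
    std::int16_t c = a * b;
    std::cout << c << '\n';   // prints -32385
}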
In general, if operands are of different types, the compiler will promote all to the largest or most precise type:
If one number is...    And the other is...    The compiler will promote to...
--------------------   --------------------   --------------------------------
char                   int                    int
signed                 unsigned               unsigned
char or int            float                  float
float                  double                 double
Examples:
char + int ==> int
signed int + unsigned int ==> unsigned int
float + int ==> float
Beware, though, that promotion occurs only as required for each intermediate calculation, so:
4.0 + 5/3 = 4.0 + 1 = 5.0
This is because the integer division is performed first, then the result is promoted to double (the type of 4.0) for the addition.
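A quick sketch of that pitfall and its fix (making one operand of the division floating-point):
#include <iostream>

int main() {
    std::cout << 4.0 + 5 / 3   << '\n';   // integer division happens first: prints 5
    std::cout << 4.0 + 5.0 / 3 << '\n';   // floating-point division: prints 5.66667
}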
You can just cast one of them. It doesn't matter which one though.
Whenever the types don't match, the "smaller" type is automatically promoted to the "larger" type, with floating point being "larger" than integer types.
Division of integers: cast any one of the operands, no need to cast them both. If both operands are integers the division operation is an integer division, otherwise it is a floating-point division.
As for the overflow question, there is often no need to cast explicitly, as the compiler applies the usual arithmetic conversions for you (provided one operand of the operation already has the unsigned type):
#include <iostream>
#include <limits>
using namespace std;
int main()
{
signed int a = numeric_limits<signed int>::max();
unsigned int b = a + 1u; // a is converted to unsigned int here, so the addition wraps without signed overflow
cout << a << ' ' << b << endl;
return 0;
}
In the case of floating-point division, as long as one operand is of a floating-point type (float or double), the other operand will be converted to that floating-point type and floating-point division will occur; so there's no need to cast both to a float.
Having said that, I always cast both to a float, anyway.
I think that as long as you cast just one of the two variables, the compiler will behave properly (at least on the compilers that I know).
So all of:
float c = (float)a / b;
float c = a / (float)b;
float c = (float)a / (float)b;
will have the same result.
Then there are older brain-damaged types like me who, having to use old-fashioned languages, just unthinkingly write stuff like
int a;
int b;
float z;
z = a*1.0*b;
Of course this isn't universal; it's good for pretty much just this case.
Having worked on safety-critical systems, I tend to be paranoid and always cast both factors: float(a)/float(b) - just in case some subtle gotcha is planning to bite me later. No matter how good the compiler is said to be, no matter how well-defined the details are in the official language specs. Paranoia: a programmer's best friend!
Do you need to cast one or both sides? The answer isn't dictated by the compiler; it already knows the exact, precise rules. Instead, the answer should be dictated by the person who will read the code later. For that reason alone, cast both sides to the same type. Sometimes the implicit truncation is visible enough that the cast would be redundant.
E.g. here the float-to-int conversion on assignment is obvious:
int a = float(foo()) * float(c);