C++ floating point representation

I am trying to create a float from a hexadecimal representation I got from here. For the representation of 32.50002, the site shows the IEEE 754 hexadecimal representation as 0x42020005.
In my code, I have this: float f = 0x42020005;. However, when I print the value, I get 1.10E+9 instead of 32.50002. Why is this?
I am using Microsoft Visual C++ 2010.

When you assign a value to a float variable via =, you don’t assign its internal representation, you assign its value. 0x42020005 in decimal is 1107427333, and that’s the value you are assigning.
The underlying representation of a float cannot be set or retrieved in a platform-independent way. However, if we make some assumptions (namely, that float is 32 bits wide and uses the IEEE 754 format), we can cheat a bit:
float f;
uint32_t rep = 0x42020005;
std::memcpy(&f, &rep, sizeof f);
This will give the desired result.
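For reference, here is a minimal self-contained sketch of the same std::memcpy approach. The helper name float_from_bits is made up for illustration, and the code assumes a 32-bit IEEE 754 float, which the static_asserts check at compile time.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not from the original answer): build a float from
// its 32-bit pattern. Assumes float uses the IEEE 754 binary32 format.
float float_from_bits(std::uint32_t rep) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    static_assert(std::numeric_limits<float>::is_iec559,
                  "float must be IEEE 754");
    float f;
    std::memcpy(&f, &rep, sizeof f);  // copies the bits, not the numeric value
    return f;
}
```

Under those assumptions, float_from_bits(0x42020005) should yield roughly 32.50002, whereas float f = 0x42020005; converts the integer value instead.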

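0x42020005 is actually the int value 1107427333.
You can try this code, which uses a union. (Type punning through a union is well defined in C; in C++ it is formally undefined behavior, although compilers such as GCC support it as an extension.)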
union IntFloat {
uint32_t i;
float f;
};
and use it when you need to convert the value:
union IntFloat val;
val.i = 0x42020005;
printf("%f\n", val.f);

0x42020005 is an int with the value 1107427333.
float f = 0x42020005; is equivalent to
float f = 1107427333;

Related

Convert an integer's binary data to float

Let's say I have an integer:
unsigned long long int data = 4599331010119547059;
Now I want to convert this data to a double. I basically want to change the type, but keep the bits exactly as they were. For the given example, the resulting double value is 0.31415926536.
How can I do that in C++? I saw some methods using a union, but many advised against this approach.

Since C++20, you can use std::bit_cast:
std::bit_cast<double>(data)
Prior to C++20, you can use std::memcpy:
double d;
static_assert(sizeof d == sizeof data);
std::memcpy(&d, &data, sizeof d);
Note that the result will vary depending on the floating point representation (IEEE 754 is ubiquitous, though) as well as on whether floating point and integer types have the same endianness.
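The std::memcpy variant can be wrapped once and reused. The following is only a sketch: bit_cast_compat is a hypothetical name, not a standard function, and it assumes both types are trivially copyable and of equal size, which the static_asserts enforce.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pre-C++20 stand-in for std::bit_cast, built on std::memcpy.
template <typename To, typename From>
To bit_cast_compat(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof to);  // copy the object representation
    return to;
}
```

With it, bit_cast_compat<double>(data) plays the role of std::bit_cast<double>(data), and the same helper converts back with bit_cast_compat<unsigned long long>(d).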

Taking the question at face value (assuming you have a valid reason to do this!), this is the only proper way of doing it in the current C++ standard:
int i = get_int();
float x;
static_assert(sizeof(float) == sizeof(int), "!!!");
memcpy(&x, &i, sizeof(x));

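You can use reinterpret_cast, although this violates the strict aliasing rule and is formally undefined behavior:
double d = reinterpret_cast<double&>(data);
Note that the target type has to be double here. Casting to float& reads only 32 of the 64 bits, which does not give 0.314...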

Is there any way to convert a float into an int without losing the decimal values?

I'm trying to convert a float value to an integer, modify the int value, then convert it back to a float value. However, the fractional part gets lost, and I'm pretty sure I used static_cast<>() wrong in my code.
My code is a binary multiplier, which shifts the binary value f times to the left. For example, when I'm doing something like 1.2 x 2, I'm only getting 2 instead of 2.4.
int mantissa;
int f;
int exp;
float result = mantissa + 0x800000;
int resultInt = static_cast<int>(result);
int expF = log2(abs(f));
int expM = exp + expF;
int newExp = (127 + 23 - expM);
resultInt >>= newExp;
float result2 = resultInt;

Bit shifting will not work on floating point values because their bits are laid out differently: besides the digits, they also store the position of the radix point as an exponent (hence the name floating "point").
An integer, on the other hand, works well with bit shifting because each bit is simply a power of two, but it does not store a radix point anywhere. Thus, when casting to an integer, you lose the fractional part.
In short, you cannot multiply a fractional value by bit shifting its representation the same way you can with an integer.
However, you can multiply the floating point value by 10 until all the digits you care about are to the left of the decimal point, then cast to an integer, shift, and divide by the same power of 10 afterwards. It may cost some performance depending on how it's implemented, and because values such as 1.2 are not exactly representable in binary, the conversion should round rather than truncate. It's difficult to answer the question beyond that without understanding your intentions.
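The scaling idea can be sketched as follows. The function name shift_scaled and its digits parameter (how many decimal places to keep) are assumptions made for illustration, and the sketch only handles non-negative inputs that fit in a long long after scaling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: multiply a non-negative value by 2^shift by
// scaling it to an integer, shifting, and scaling back.
// `digits` is the number of decimal places to preserve.
float shift_scaled(float value, int digits, int shift) {
    assert(value >= 0.0f && digits >= 0 && shift >= 0);
    const double scale = std::pow(10.0, digits);
    long long fixed = std::llround(value * scale);  // e.g. 1.2 becomes 12 for digits = 1
    fixed <<= shift;                                // 12 becomes 24 for shift = 1
    return static_cast<float>(fixed / scale);       // 24 becomes 2.4
}
```

If the goal is only to multiply by a power of two, std::ldexp from <cmath> does that directly on the floating point value without any integer round trip.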

Treating a hexadecimal value as single precision or double precision value

Is there a way I could initialize a float variable with a hexadecimal number? Say I have the single precision representation of 4, which is 0x40800000 in hex. I want to do something like float a = 0x40800000, but that treats the hex value as an integer. What can I do to make it be treated as the bit pattern of a floating point number?

One option is to use type punning via a union. This is defined behaviour in C since C99 (previously this was implementation defined).
union {
float f;
uint32_t u;
} un;
un.u = 0x40800000;
float a = un.f;
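Since you tagged this C++, note that reading the inactive union member is formally undefined behaviour there. You could also use reinterpret_cast, although dereferencing the resulting pointer violates the strict aliasing rule: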
uint32_t u = 0x40800000;
float a = *reinterpret_cast<float*>(&u);
Before doing either of these, you should also confirm that they're the same size, preferably at compile time:
static_assert(sizeof(float) == sizeof(uint32_t), "size mismatch");

You can do this if you introduce a temporary integer variable, take its address, cast that to a pointer to a floating point type, and dereference it. You must be careful about the sizes of the types involved, and be aware that they may differ between platforms. With my compiler, for example, this works:
unsigned i = 0x40800000;
float a = *(float*)&i;
printf("%f\n", a);
// output 4.000000

I'm not sure how you're getting the value "0x40800000".
If it's coming in as an int you can just do:
const auto foo = 0x40800000;
auto a = *(const float*)&foo;
If it's coming in as a string you can read it into an integer first and then copy the bits:
unsigned u;
float a;
sscanf("0x40800000", "0x%x", &u);
memcpy(&a, &u, sizeof a);
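For the string case, a slightly more defensive sketch parses the text into an integer and then copies the bits, since %x expects a pointer to unsigned rather than to float. The helper name float_from_hex_string is hypothetical, error handling is left out, and a 32-bit IEEE 754 float is assumed.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: parse a hex string such as "0x40800000" and
// reinterpret the resulting 32 bits as a float (IEEE 754 assumed).
float float_from_hex_string(const char* text) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    // With base 16, strtoul accepts an optional "0x" prefix.
    // Malformed or out-of-range input is not detected here.
    const std::uint32_t bits =
        static_cast<std::uint32_t>(std::strtoul(text, nullptr, 16));
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Calling float_from_hex_string("0x40800000") should then produce 4.0f on such a platform.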

Float to int number conversion in c++

The following C++ code:
union float2bin{
float f;
int i;
};
float2bin obj;
obj.f=2.243;
cout<<obj.i;
outputs what looks like a garbage value.
But
union float2bin{
float f;
float i;
};
float2bin obj;
obj.f=2.243;
cout<<obj.i;
outputs the same value as f, i.e. 2.243.
With GCC, int and float have the same size (4 bytes), so what is the reason for this difference in output?

The reason is that it is undefined behavior. In practice, you'll get away with reading an int from something that was stored as a float on most machines, but you'll read garbage values unless you know what to expect. Doing it in the other direction may cause the program to crash for certain values of int.
Under the hood, of course, integral values and floating point values have different representations, at least on most machines. (On some Unisys mainframes, your code would do what you expect. But they're not the most common systems around, and you probably don't have one on your desktop.) Basically, regardless of the type, you have a sequence of bits, which will be interpreted by the hardware in some way. C++ requires integers to use a pure binary representation, which constrains the representation somewhat. It also requires a very large range for floating point values, which more or less requires some form of exponential notation, with some bits representing the exponent and others the mantissa, each with a different encoding.

The reason is that floating point values are stored in a more complicated way, partitioning the 32 bits into a sign, an exponent and a fraction. If these bits are read straight off as an integer, they will look like a very different value.
The important point here is that when you create a union, you are saying that it is one contiguous block of memory that can be interpreted in two different ways. Nowhere does this mechanism perform a safe conversion between float and int, which would involve some kind of rounding or truncation.
Update: What you might want is
float f = 10.25f;
int i = (int)f;
// Will give you i = 10
However, the union approach is closer to this:
float f = 10.25f;
int i = *((int *)&f);
// Will give you some seemingly arbitrary value
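To see where the "seemingly arbitrary" integer comes from, the three fields can be pulled apart explicitly. This is a sketch that assumes a 32-bit IEEE 754 float; the FloatFields struct and the decompose function are hypothetical names, and std::memcpy is used instead of a union or pointer cast to avoid undefined behavior.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical struct holding the three IEEE 754 binary32 fields.
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t fraction;  // 23 bits, without the implicit leading 1
};

FloatFields decompose(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // grab the raw bit pattern
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For 10.25f this gives sign 0, exponent 130 (a power of 3 once the bias of 127 is removed) and fraction 0x240000. Read together as one integer, those bits are 0x41240000, which is nowhere near 10.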

How to convert decimal number into 64 bit binary float number?

I need to convert decimal number into 64 bit binary float value.
If you know any algorithm or anything about it then please help.

Use boost::lexical_cast.
double num = boost::lexical_cast<double>(decimal);

Assuming you mean a decimal stored inside a string, atof would be a standard solution to convert that to a double value (which is a 64-bit floating point number on the x86 architecture).
std::string s = "0.4";
double convertedValue = atof(s.c_str());
Or similar for C strings:
const char *s = "0.4";
double convertedValue = atof(s);
But if you mean integer number by "decimal number", then just write
int yourNumber = 100;
double convertedValue = yourNumber;
and the value will automatically be converted.
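One caveat with atof is that it gives no way to tell a failed conversion from a genuine 0.0. A sketch using std::strtod, which reports where parsing stopped, could look like this; the helper name parse_double and its bool-plus-output-parameter interface are assumptions made for illustration.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: convert a decimal string to double and report
// failure, instead of silently returning 0.0 as atof does.
bool parse_double(const std::string& text, double& out) {
    const char* begin = text.c_str();
    char* end = nullptr;
    errno = 0;
    const double value = std::strtod(begin, &end);
    if (end == begin || *end != '\0' || errno == ERANGE) {
        return false;  // nothing parsed, trailing characters, or out of range
    }
    out = value;
    return true;
}
```

With this, parse_double("0.4", d) succeeds and stores the nearest double to 0.4, while input such as "abc" is rejected.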

Value casting from a string to double can be implemented by boost::lexical_cast.
Type casting from int to double is a part of C++:
double d = (double)i;
It was already mentioned in the previous replies.
If you are interested in how this conversion is implemented, you may refer to the sources of the C standard library your compiler is using, provided that the sources are available and no floating point co-processor is used for this purpose. Many compilers for embedded targets do this work "manually" if no floating point co-processor is available.
For the binary format description, please see Victor's reply

Decimal decimalNumber = 1234;
Float binaryFloatValue = decimalNumber;
