Issues while printing float values - c++

#include<stdio.h>
#include<math.h>
int main()
{
float i = 2.5;
printf("%d\n%d\n%d",i,i,i);
}
When I compile this using gcc and run it, I get this as the output:
0
1074003968
0
Why doesn't it print just
2
2
2

You're passing a float (which will be converted to a double) to printf, but telling printf to expect an int. The result is undefined behavior, so at least in theory, anything could happen.
What will typically happen is that printf will retrieve sizeof(int) bytes from the stack, and interpret whatever bit pattern they hold as an int, and print out whatever value that happens to represent.
What you almost certainly want is to cast the float to int before passing it to printf.
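One plausible reading of the 0 / 1074003968 / 0 output, assuming IEEE 754 doubles and a 32-bit little-endian ABI that passes variadic arguments on the stack: each promoted double occupies two 4-byte slots, and %d reads them one slot at a time. The sketch below splits a double into those two halves with memcpy. The helper names are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw 64-bit pattern of a double.
// Assumes double is a 64-bit IEEE 754 type, which is typical but not guaranteed.
std::uint64_t double_bits(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to inspect the bytes
    return bits;
}

// Low 32 bits of the pattern (stored first on a little-endian machine).
std::uint32_t low_word(double value) {
    return static_cast<std::uint32_t>(double_bits(value) & 0xFFFFFFFFu);
}

// High 32 bits of the pattern (sign, exponent and the top of the mantissa).
std::uint32_t high_word(double value) {
    return static_cast<std::uint32_t>(double_bits(value) >> 32);
}
```

For 2.5 the pattern is 0x4004000000000000, so low_word(2.5) is 0 and high_word(2.5) is 1074003968. Under the stated assumptions, three %d reads would then see the low word, the high word, and the low word of the next double, which matches the numbers in the question. None of this is guaranteed, since the behavior is still undefined.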

The "%d" format specifier is for decimal integers. Use "%f" instead.
And take a moment to read the printf() man page.

The "%d" specifier is for a decimal integer (typically a 32-bit int), while the "%f" specifier is for decimal floating point (a double; a float argument is promoted to double when passed to printf).
If you only want the whole-number part of the floating-point value, you could specify a precision of 0 (note that this rounds rather than truncates).
i.e.
float i = 2.5;
printf("%.0f\n%.0f\n%.0f",i,i,i);
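Note that you could also cast each value to an int, which truncates toward zero. That prints 2 for 2.5, but it can differ from "%.0f" for other values: 2.7 prints as 3 with "%.0f" and as 2 after the cast.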
printf("%d\n%d\n%d",int(i),int(i),int(i));
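To see how the two suggestions above differ, here is a small sketch that formats the same float both ways. The helper names format_rounded and format_truncated are invented for this example; only snprintf and the format specifiers come from the standard library.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats a float with "%.0f", which rounds to zero decimals.
std::string format_rounded(float value) {
    char buffer[64];
    // The float is promoted to double, which is what "%f" expects.
    std::snprintf(buffer, sizeof buffer, "%.0f", value);
    return buffer;
}

// Hypothetical helper: truncates toward zero with a cast, then prints with "%d".
std::string format_truncated(float value) {
    char buffer[64];
    // The cast makes the argument an int, so "%d" is the matching specifier.
    std::snprintf(buffer, sizeof buffer, "%d", static_cast<int>(value));
    return buffer;
}
```

With 2.7f, format_rounded gives "3" while format_truncated gives "2". The cast is only well-defined while the value fits in an int.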

%d prints decimal (int)s, not (float)s. printf() cannot tell that you passed a (float) to it (C does not have native objects; you cannot ask a value what type it is); you need to use the appropriate format character for the type you passed.

Related

Difference between directly assigning a float variable a hexadecimal integer and assigning through pointer conversion

I was investigating the structure of floating-point numbers, and I've found that most compilers use the IEEE 754 standard to store them.
And when I tried to do:
float a=0x3f520000; //have to be equal to 0.8203125 in IEEE 754
printf("value of 'a' is: %X [%x] %f\n",(int)a,(int)a, a);
it produces the result:
value of 'a' is: 3F520000 [3f520000] 1062338560.000000
but if I try:
int b=0x3f520000;
float* c = (float*)&b;
printf("value of 'c' is: %X [%x] %f\r\n", *(int*)c, *(int*)c, c[0]);
it gives:
value of 'c' is: 3F520000 [3f520000] 0.820313
The second try gave me the right answer. What is wrong with the first try? And why does the result differ from the one I get when I cast int to float via a pointer?
The difference is that the first converts the value (0x3f520000 is the integer 1062338560), and is equivalent to this:
float a = 1062338560;
printf("value of 'a' is: %X [%x] %f\n",(int)a,(int)a, a);
The second reinterprets the representation of the int - 111111010100100000000000000000 - as being the representation of a float instead.
(It's also undefined, so you shouldn't expect it to do anything in particular.)
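The two operations can be put side by side as functions. This is only a sketch: convert_value and reinterpret_bits are names I made up, and the reinterpretation uses memcpy instead of the pointer cast so that it stays well-defined. It assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Value conversion: the integer 1062338560 becomes the float 1062338560.0f.
float convert_value(std::uint32_t n) {
    return static_cast<float>(n);
}

// Bit reinterpretation: the same 32 bits are read back as a float.
// Assumes float is a 32-bit IEEE 754 type.
float reinterpret_bits(std::uint32_t n) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    float f;
    std::memcpy(&f, &n, sizeof f);  // copies the representation, not the value
    return f;
}
```

For 0x3f520000, convert_value returns 1062338560.0f and reinterpret_bits returns 0.8203125f under those assumptions.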
[Note: This answer assumes C, not C++, which have different rules]
With
float a=0x3f520000;
you take the integer value 1062338560 and the compiler will convert it to 1062338560.0f.
If you want a hexadecimal floating-point constant, you must use the exponent format with the letter p, as in 0x1.a4p-1, which is exactly 0.8203125 (the rounded decimal 0.820313 would be 0x1.a40010c6f7a0bp-1).
What happens with
int b=0x3f520000;
float* c = (float*)&b;
is that you break strict aliasing and tell the compiler that c is pointing to a floating-point value (the strict aliasing break is because b isn't a floating point value). The compiler will then reinterpret the bits in *c as a float value.
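As a quick check of the p-exponent notation mentioned above: hexadecimal floating literals in source code need C99 or C++17, but strtod has accepted the same notation in text form since C99/C++11. The wrapper name parse_hex_float is just for illustration.

```cpp
#include <cstdlib>

// Hypothetical wrapper: parses a hexadecimal floating-point string such as "0x1.a4p-1".
// 0x1.a4 is 1 + 0xa4/256 = 1.640625, and p-1 multiplies it by 2^-1.
double parse_hex_float(const char* text) {
    return std::strtod(text, nullptr);
}
```

parse_hex_float("0x1.a4p-1") yields 0.8203125, the value the question expected.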
0x3f520000 is an integer constant. When assigned to a float, the integer is converted.
A more proper example of how to do the conversion in the second case:
#include <stdio.h>
#include <string.h>
#include <stdint.h>
int main() {
uint32_t fi = 0x3f520000;
float f;
memcpy(&f, &fi, sizeof(f));
printf("%.7g\n", f);
}
it prints:
0.8203125
so that is what you expected.
The approach I used is memcpy, which is the safest for all compilers and the best choice for modern compilers (GCC since approx. 4.6, Clang since 3.x), which interpret memcpy as a "bit cast" in such a case and optimize it in an efficient and safe way (at least in "hosted" mode). It is still safe for older compilers, but not necessarily as efficient; some people may prefer a cast through a union or even through a different pointer type. For the dangers of those approaches, search for "type punning and strict aliasing".
(Also, there could be some unusual platforms where integer endianness differs from float endianness, where a byte is not 8 bits, and so on. I don't consider them here.)
UPDATE: I started answering the initial version of the question. Yes, bit casting and value conversion give fundamentally different results. That's how floating-point numbers work.

What is the correct type in C/C++ to store a COM's VT_DECIMAL?

I'm trying to write a wrapper to ADO.
A DECIMAL is one type a COM VARIANT can be, when the VARIANT type is VT_DECIMAL.
I'm trying to put it in a native C data type and keep the variable's value.
It seems that the correct type is long double, but I get a "no suitable conversion" error.
For example:
_variant_t v;
...
if(v.vt == VT_DECIMAL)
{
double d = (double)v; // this works, but I'm afraid data can be lost...
long double ld1 = (long double)v; // error: more than one conversion from variant to long double applies.
long double ld2 = (long double)v.decVal; // error: no suitable conversion function from decimal to long double exists.
}
So my questions are:
is it totally safe to use double to store all possible decimal values?
if not, how can I convert the decimal to a long double?
How to convert a decimal to string? (using the << operator, sprintf is also good for me)
The internal representation of DECIMAL is not a double-precision floating-point value; it is an integer with sign and scale fields. If you are going to initialize the parts of a DECIMAL yourself, you should set these fields (the 96-bit integer value, the scale, and the sign), and then you get a valid decimal VARIANT value.
DECIMAL on MSDN:
scale - The number of decimal places for the number. Valid values are from 0 to 28. So 12.345 is represented as 12345 with a scale of 3.
sign - Indicates the sign; 0 for positive numbers or DECIMAL_NEG for negative numbers. So -1 is represented as 1 with the DECIMAL_NEG bit set.
Hi32 - The high 32 bits of the number.
Lo64 - The low 64 bits of the number. This is an _int64.
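To make the field layout above concrete, here is a rough sketch that renders such a value as an exact decimal string. SimpleDecimal and decimal_to_string are made-up names that only mirror the documented fields; this is not the real DECIMAL struct or any COM API, and on Windows the variant conversion functions are the more natural route.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the documented fields (not the real DECIMAL struct).
struct SimpleDecimal {
    std::uint8_t scale;   // digits to the right of the decimal point, 0..28
    bool negative;        // plays the role of the DECIMAL_NEG sign flag
    std::uint32_t hi32;   // high 32 bits of the 96-bit integer
    std::uint64_t lo64;   // low 64 bits of the 96-bit integer
};

// Converts the 96-bit scaled integer to text without going through double.
std::string decimal_to_string(const SimpleDecimal& d) {
    // Three 32-bit limbs, most significant first.
    std::uint32_t limbs[3] = {
        d.hi32,
        static_cast<std::uint32_t>(d.lo64 >> 32),
        static_cast<std::uint32_t>(d.lo64 & 0xFFFFFFFFu)
    };
    std::string digits;  // collected least significant digit first
    while (limbs[0] != 0 || limbs[1] != 0 || limbs[2] != 0) {
        std::uint64_t remainder = 0;
        // Long division of the 96-bit number by 10, limb by limb.
        for (std::uint32_t& limb : limbs) {
            const std::uint64_t current = (remainder << 32) | limb;
            limb = static_cast<std::uint32_t>(current / 10);
            remainder = current % 10;
        }
        digits.push_back(static_cast<char>('0' + remainder));
    }
    // Pad with zeros so at least one digit stands before the decimal point.
    while (digits.size() <= d.scale) {
        digits.push_back('0');
    }
    std::reverse(digits.begin(), digits.end());
    if (d.scale > 0) {
        digits.insert(digits.size() - d.scale, ".");
    }
    if (d.negative) {
        digits.insert(0, "-");
    }
    return digits;
}
```

With the integer 12345 and a scale of 3 this produces "12.345", matching the example in the field description.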
Your questions:
is it totally safe to use double to store all possible decimal values?
You cannot initialize as double directly (e.g. VT_R8), but you can initialize as double variant and use variant conversion API to convert to VT_DECIMAL. A small rounding can be applied to value.
if not, how can I convert the decimal to a long double?
How to convert a decimal to string? (using the << operator, sprintf is also good for me)
VariantChangeType can convert decimal variant to variant of another type, including integer, double, string - you provide the type to convert to. Vice versa, you can also convert something different to decimal.
"Safe" isn't exactly the correct word, the point of DECIMAL is to not introduce rounding errors due to base conversions. Calculations are done in base 10 instead of base 2. That makes them slow but accurate, the kind of accuracy that an accountant likes. He won't have to chase a billionth-of-a-penny mismatches.
Use _variant_t::ChangeType() to make conversions. Pass VT_R8 to convert to double precision. Pass VT_BSTR to convert to a string, the kind that the accountant likes. No point in chasing long double, that 10-byte FPU type is history.
This snippet is taken from http://hackage.haskell.org/package/com-1.2.1/src/cbits/AutoPrimSrc.c
Hackage.org says:
Hackage is the Haskell community's central package archive of open
source software.
but please check the author's permissions.
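void writeVarWord64( unsigned int hi, unsigned int lo, VARIANT* v )
{
ULONGLONG r;
if (!v) return;
/* Combine the two 32-bit halves: hi goes into the upper 32 bits. */
r = (ULONGLONG)hi;
r <<= 32;
r += (ULONGLONG)lo;
VariantInit(v);
v->vt = VT_DECIMAL;
/* Store the value as an unscaled, non-negative 64-bit integer. */
v->decVal.Lo64 = r;
v->decVal.Hi32 = 0;
v->decVal.sign = 0;
v->decVal.scale = 0;
}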
If I understood Microsoft's documentation (https://msdn.microsoft.com/en-us/library/cc234586.aspx) correctly, VT_DECIMAL is an exact 96-bit integer value with a fixed scale and precision. In that case you can't store this without loss of information in a float, a double or a 64-bit integer variable.
Your best bet would be to store it in a 128-bit integer like __int128, but I don't know the level of compiler support for it. I'm also not sure you will be able to just cast one to the other without resorting to some bit manipulation.
Is it totally safe to use double to store all possible decimal values?
It actually depends what you mean by safe. If you mean "is there any risk of introducing some degree of conversion imprecision?", yes there is a risk. The internal representations are far too different to guarantee perfect conversion, and conversion noise is likely to be introduced.
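A small illustration of that risk, using plain integers rather than DECIMAL itself: a double has a 53-bit significand (assuming IEEE 754), so integers above 2^53 are not all representable, while a DECIMAL can carry up to 96 bits. The helper name is invented for the example.

```cpp
#include <cstdint>

// Hypothetical helper: converts an integer to double and back again.
// Any difference between input and output is precision lost in the double.
std::int64_t round_trip_through_double(std::int64_t value) {
    return static_cast<std::int64_t>(static_cast<double>(value));
}
```

2^53 (9007199254740992) survives the round trip, but 2^53 + 1 does not come back unchanged, and a DECIMAL can hold far larger values than that.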
How can I convert the decimal to a long double / a string?
It depends (again) on what you want to do with the object:
For floating-point computation, see #Gread.And.Powerful.Oz's link to the following answer: C++ converting Variant Decimal to Double Value
For display, see MSDN documentation on string conversion
For storage without any conversion imprecision, you should probably store the decimal as a scaled integer plus its scale, for example pair<long long,short>, where first holds the mantissa and second holds the number of digits to the right of the decimal point. Note that a long long is usually only 64 bits wide, so this covers only decimals whose Hi32 field is zero; the full 96-bit mantissa needs a wider integer or separate Hi32 and Lo64 fields. This representation is as close as possible to the decimal's internal representation, will not introduce any conversion imprecision and won't waste CPU resources on integer-to-string formatting.

C++ union to represent data memory vs C scalar variable type

Today I've a weird question.
The Code(C++)
#include <iostream>
union name
{
int num;
float num2;
}oblong;
int main(void)
{
oblong.num2 = 27.881;
std::cout << oblong.num << std::endl;
return 0;
}
The Code(C)
#include <stdio.h>
int main(void)
{
float num = 27.881;
printf("%d\n" , num);
return 0;
}
The Question
As we know, a C++ union can hold more than one type of data element, but only one at a time. So the name oblong reserves a single 32-bit portion of memory (because the largest types in the union, int and float, are 32 bits), and this portion can hold either an integer or a float.
So I assign the value 27.881 to oblong.num2 (as you can see in the code above). But out of curiosity, I access the memory through oblong.num, which refers to the same memory location.
As expected, it gave me a value which is not 27, because floats and integers are represented differently in memory; when I use oblong.num to access that memory, the bits are interpreted as an integer.
I know this phenomenon also happens in C, which is why I initialize a float variable with a value and later read it using %d. I tried it with the same value, 27.881, as you can see above. But when I run it, something weird happens: the value I get in C is different from the one I get in C++.
Why does this happen? As far as I know, the two values I get from the two programs are not garbage values, so why are they different? I also used sizeof to verify that int and float are both 32 bits in C and in C++. So memory size isn't the cause; what prompts this difference in values?
First of all, having the wrong printf() format string is undefined behavior. Now that said, here is what is actually happening in your case:
In vararg functions such as printf(), integers smaller than int are promoted to int and floats smaller than double are promoted to double.
The result is that your 27.881 is being converted to an 8-byte double as it is passed into printf(). Therefore, the binary representation is no longer the same as a float.
Format string %d expects a 4-byte integer. So in effect, you will be printing the lower 4-bytes of the double-precision representation of 27.881. (assuming little-endian)
*Actually (assuming strict-FP), you are seeing the bottom 4-bytes of 27.881 after it is cast to float, and then promoted to double.
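The point about promotion can be checked without calling printf incorrectly. In this sketch (helper names are mine, and it assumes IEEE 754 with a 32-bit float and a 64-bit double), float_bits is what the union read sees, and low_word_of_promoted is the low half of the double that printf actually receives.

```cpp
#include <cstdint>
#include <cstring>

// Raw 32-bit pattern of a float: what reading the union's int member observes
// in practice. Assumes a 32-bit IEEE 754 float.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Low 32 bits of the double produced by the float-to-double promotion that
// happens for variadic arguments. Assumes a 64-bit IEEE 754 double.
std::uint32_t low_word_of_promoted(float value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");
    const double promoted = value;  // same conversion printf's argument undergoes
    std::uint64_t bits;
    std::memcpy(&bits, &promoted, sizeof bits);
    return static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
}
```

For a normal float, the 23 mantissa bits land at the top of the double's 52-bit mantissa, so the low word only contains the float's three lowest mantissa bits shifted up by 29. That is why the two programs print different numbers for 27.881.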
In both cases you are encountering undefined behaviour. Your implementation just happens to do something strange.

why sizeof(13.33) is 8 bytes?

When I write sizeof(a), where a is a float variable holding 13.33, the size is 4 bytes.
But if I write sizeof(13.33) directly, the size is 8 bytes.
I do not understand what is happening. Can someone help?
Those are the rules of the language.
13.33 is a numeric literal. It is treated as a double because it is a double. If you want 13.33 to be treated as a float literal, then you state 13.33f.
13.33 is a double literal. If sizeof(float) == 4, sizeof(13.33f) == 4 should also hold because 13.33f is a float literal.
The literal 13.33 is treated as a double precision floating point value, 8 bytes wide.
The 13.33 literal is being treated as 'double', not 'float'.
Try 13.33f instead.
The type and size of your variable are fine. It's just that the compiler has some default types for literals, those constant values hard-coded in your program.
If you request sizeof(1), you'll get sizeof(int). If you request sizeof(2.5), you'll get sizeof(double). Those would clearly fit into a char and a float respectively, but the compiler has default types for your literals and will treat them as such until assignment.
You can override this default behaviour, though. For example:
2.5 // as you didn't specify anything, the compiler will take it for a double.
2.5f // ah ha! you're specifying this literal to be float
Cheers!
Because 13.33 is a double, which gets converted (rounded to the nearest float) if you assign it to a float. And a double is typically 8 bytes. To create a real float, use 13.33f (note the f).
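The literal types described in these answers can be checked at compile time. A minimal sketch follows; the function names are only for illustration, and the actual byte counts are implementation-defined (4 and 8 are merely typical).

```cpp
#include <cstddef>
#include <type_traits>

// An unsuffixed floating literal has type double; the f suffix makes it a float.
static_assert(std::is_same<decltype(13.33), double>::value, "13.33 is a double");
static_assert(std::is_same<decltype(13.33f), float>::value, "13.33f is a float");
static_assert(std::is_same<decltype(13.33L), long double>::value, "13.33L is a long double");

// sizeof applied to a literal reports the size of the literal's type.
std::size_t size_of_double_literal() { return sizeof(13.33); }
std::size_t size_of_float_literal() { return sizeof(13.33f); }
```

On a typical desktop platform size_of_double_literal() returns 8 and size_of_float_literal() returns 4, which is the difference the question observed.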

Does casting to an int after std::floor guarantee the right result?

I'd like a floor function with the syntax
int floor(double x);
but std::floor returns a double. Is
static_cast <int> (std::floor(x));
guaranteed to give me the correct integer, or could I have an off-by-one problem? It seems to work, but I'd like to know for sure.
For bonus points, why the heck does std::floor return a double in the first place?
The range of double is way greater than the range of 32 or 64 bit integers, which is why std::floor returns a double. Casting to int should be fine so long as it's within the appropriate range - but be aware that a double can't represent all 64 bit integers exactly, so you may also end up with errors when you go beyond the point at which the accuracy of double is such that the difference between two consecutive doubles is greater than 1.
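If the range is a concern, the check can be made explicit before the cast. This is a sketch, not a standard facility: floor_to_int is a hypothetical helper, and it assumes int is narrow enough (for example 32 bits) that both of its limits are exactly representable as double.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: stores floor(x) in result and returns true when the
// floored value fits in an int; returns false (leaving result alone) otherwise.
bool floor_to_int(double x, int& result) {
    const double floored = std::floor(x);
    // The comparisons are false for NaN, so NaN is rejected as well.
    if (!(floored >= std::numeric_limits<int>::min() &&
          floored <= std::numeric_limits<int>::max())) {
        return false;
    }
    result = static_cast<int>(floored);  // exact: floored is an in-range whole number
    return true;
}
```

A caller would write something like `int n; if (floor_to_int(x, n)) { ... }` and handle the out-of-range case separately, instead of risking the undefined behavior of an out-of-range cast.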
static_cast <int> (std::floor(x));
does pretty much what you want, yes. It gives you the largest integer not greater than x, that is, x rounded towards -infinity, at least as long as your input is in the range representable by ints.
I'm not sure what you mean by 'adding .5 and whatnot', but it won't have the same effect.
And std::floor returns a double because that's the most general. Sometimes you might want to round off a float or double, but preserve the type. That is, round 1.3f to 1.0f, rather than to 1.
That'd be hard to do if std::floor returned an int. (or at least you'd have an extra unnecessary cast in there slowing things down).
If floor only performs the rounding itself, without changing the type, you can cast that to int if/when you need to.
Another reason is that the range of doubles is far greater than that of ints. It may not be possible to round all doubles to ints.
The C++ standard says (4.9.1):
"An rvalue of a floating point type can be converted to an rvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type".
So if you are converting a double to an int, the number is within the range of int, and the required rounding is toward zero, then it is enough to simply cast the number to int:
(int)x;
If you want to deal with various numeric conditions and handle different types of conversions in a controlled way, then maybe you should look at Boost.NumericConversion. This library lets you handle unusual cases (out-of-range values, rounding, ranges, etc.).
Here is the example from the documentation:
#include <cassert>
#include <boost/numeric/conversion/converter.hpp>
int main() {
typedef boost::numeric::converter<int,double> Double2Int ;
int x = Double2Int::convert(2.0);
assert ( x == 2 );
int y = Double2Int()(3.14); // As a function object.
assert ( y == 3 ) ; // The default rounding is trunc.
try
{
double m = boost::numeric::bounds<double>::highest();
int z = Double2Int::convert(m); // By default throws positive_overflow()
}
catch ( boost::numeric::positive_overflow const& )
{
}
return 0;
}
Most of the standard math library uses doubles but provides float versions as well. std::floorf() is the single precision version of std::floor() if you'd prefer not to use doubles.
Edit: I've removed part of my previous answer. I had stated that the floor was redundant when casting to int, but I forgot that this is only true for positive floating point numbers.
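The difference for negative numbers is easy to demonstrate. In this sketch the two function names are invented, and both assume the input is within the range of int.

```cpp
#include <cmath>

// A plain cast truncates toward zero.
int truncate_to_int(double x) {
    return static_cast<int>(x);
}

// floor first, then cast: rounds toward negative infinity.
int floor_then_cast(double x) {
    return static_cast<int>(std::floor(x));
}
```

For 2.5 both return 2, but for -2.5 truncate_to_int returns -2 while floor_then_cast returns -3, so the floor call is only redundant for non-negative inputs.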