Read float wrong value from txt file c++ - c++

I have a text file of values:
133.25 129.40 41.69 2.915
when I read it:
fscanf(File, "%f", &floatNumber[i]);
I get these values:
1.3325000000000000e+002, 1.2939999389648437e+002, 4.1689998626708984e+001 2.9149999618530273e+000
The first value is okay, but why are the other three values different?

The values are the same, you need to change the format specifier in your printf.
Also, floating point numbers have discrete precision, so it is not possible to represent
an arbitrary number with infinite accuracy.
This is a well-known property of IEEE 754 floating point.

They're not different. Floating-point is only accurate to a point [sic]. These are the closest representations of those values. Floating-point is a special beast.

The reason the values are different is that all numbers except the first one cannot be represented exactly as a binary float value. If you need exact representation of decimals, you need to use a non-standard library.

Although most of your inputs cannot be represented exactly in either format, you would have got a lot more matching digits using double rather than float.
I regard float as a very specialized type. If you have a very large array of low precision floating point data, and are doing only very well behaved calculations on it, you may be able to gain some performance by using float. You get twice as many floats in e.g. a cache line. For anything else, prefer double to float.
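As a rough illustration of the answers above, the following C++ sketch parses the same text once as float and once as double and formats the stored value with 17 significant digits. The helper names (show17, parse_float, parse_double) are made up for this example and do not come from the question.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Format a value with 17 significant digits, enough to expose the
// binary value that is actually stored (hypothetical helper).
std::string show17(double v) {
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%.17g", v);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, n);
}

// Parse the same text as a 32-bit float and as a 64-bit double.
float parse_float(const char* text) { return std::strtof(text, nullptr); }
double parse_double(const char* text) { return std::strtod(text, nullptr); }
```

With these helpers, show17(parse_float("133.25")) gives 133.25, because 133.25 is a sum of powers of two and is stored exactly. For "129.40" the float and the double are both approximations, and the double is the closer of the two.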

Related

Fortran - want to round to one decimal point

In fortran I have to round latitude and longitude to one digit after decimal point.
I am using gfortran compiler and the nint function but the following does not work:
print *, nint( 1.40 * 10. ) / 10. ! prints 1.39999998
print *, nint( 1.49 * 10. ) / 10. ! prints 1.50000000
Looking for both general and specific solutions here. For example:
How can we display numbers rounded to one decimal place?
How can we store such rounded numbers in fortran. It's not possible in a float variable, but are there other ways?
How can we write such numbers to NetCDF?
How can we write such numbers to a CSV or text file?
As others have said, the issue is the use of floating point representation in the NetCDF file. Using nco utilities, you can change the latitude/longitude to short integers with scale_factor and add_offset. Like this:
ncap2 -s 'latitude=pack(latitude, 0.1, 0); longitude=pack(longitude, 0.1, 0);' old.nc new.nc
There is no way to do what you are asking. The underlying problem is that the rounded values you desire are not necessarily able to be represented using floating point.
For example, if you had a value 10.58, even this is only approximated in IEEE754 float32: the stored value is about 10.5799999.
When you round this value to one decimal place (however you choose to do so), the result would be 10.6, however 10.6 does not have an exact representation either. The nearest float32 representation is about 10.6000004. So no matter how you deal with the rounding, there is no way to store 10.6 exactly in a float32 value, and no way to write it as a floating point value into a netCDF file.
YES, IT CAN BE DONE! The "accepted" answer above is correct in its limited range, but is wrong about what you can actually accomplish in Fortran (or various other HGL's).
The only question is what price you are willing to pay if something like a Write with F6.1 fails.
From one perspective, your problem is a particularly trivial variation on the subject of "Arbitrary Precision" computing. How do you imagine cryptography is handled when you need to store, manipulate, and perform "math" with, say, 1024 bit numbers, with exact precision?
A simple strategy in this case would be to separate each number into its constituent "LHSofD" (Left Hand Side of Decimal), and "RHSofD" values. For example, you might have an RLon(i,j) = 105.591, and would like to print 105.6 (or any manner of rounding) to your netCDF (or any normal) file. Split this into RLonLHS(i,j) = 105, and RLonRHS(i,j) = 591.
... at this point you have choices that increase generality, but at some expense. To save "money" the RHS might be retained as 0.591 (but you lose generality if you need to do fancier things).
For simplicity, assume the "cheap and cheerful" second strategy.
The LHS is easy (Int()).
Now, for the RHS, multiply by 10 (if, you wish to round to 1 DEC), e.g. to arrive at RLonRHS(i,j) = 5.91, and then apply Fortran "round to nearest Int" NInt() intrinsic ... leaving you with RLonRHS(i,j) = 6.0.
... and Bob's your uncle:
Now you print the LHS and RHS to your netCDF using a suitable Write statement concatenating the "duals", and will create an EXACT representation as per the required objectives in the OP.
... of course later reading-in those values returns to the same issues as illustrated above, unless the read-in also is ArbPrec aware.
... we wrote our own ArbPrec lib, but there are several about, also in VBA and other HGL's ... but be warned a full ArbPrec bit of machinery is a non-trivial matter ... luckily your problem is so simple.
There are several aspects one can consider in relation to "rounding to one decimal place". These relate to: internal storage and manipulation; display and interchange.
Display and interchange
The simplest aspects cover how we report a stored value, regardless of the internal representation used. As covered in depth in other answers and elsewhere, we can use a numeric edit descriptor with a single fractional digit:
print '(F0.1,2X,F0.1)', 10.3, 10.17
end
How the output is rounded is a changeable mode:
print '(RU,F0.1,2X,RD,F0.1)', 10.17, 10.17
end
In this example we've chosen to round up and then down, but we could also round to zero or round to nearest (or let the compiler choose for us).
For any formatted output, whether to screen or file, such edit descriptors are available. A G edit descriptor, such as one may use to write CSV files, will also do this rounding.
For unformatted output this concept of rounding is not applicable as the internal representation is referenced. Equally for an interchange format such as NetCDF and HDF5 we do not have this rounding.
For NetCDF your attribute convention may specify something like FORTRAN_format which gives an appropriate format for ultimate display of the (default) real, non-rounded, variable.
Internal storage
Other answers and the question itself mention the impossibility of accurately representing (and working with) decimal digits. However, nothing in the Fortran language requires this to be impossible:
integer, parameter :: rk = SELECTED_REAL_KIND(radix=10)
real(rk) x
x = 0.1_rk
print *, x
end
is a Fortran program which has a radix-10 variable and literal constant. See also IEEE_SELECTED_REAL_KIND(radix=10).
Now, you are exceptionally likely to see that selected_real_kind(radix=10) gives you the value -5, but if you want something positive that can be used as a type parameter you just need to find someone offering you such a system.
If you aren't able to find such a thing then you will need to work accounting for errors. There are two parts to consider here.
The intrinsic real numerical types in Fortran are floating point ones. To use a fixed point numeric type, or a system like binary-coded decimal, you will need to resort to non-intrinsic types. Such a topic is beyond the scope of this answer, but pointers are made in that direction by DrOli.
These efforts will not be computationally/programmer-time cheap. You will also need to take care of managing these types in your output and interchange.
Depending on the requirements of your work, you may find simply scaling by (powers of) ten and working on integers suits. In such cases, you will also want to find the corresponding NetCDF attribute in your convention, such as scale_factor.
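The scaled-integer idea can be sketched as follows. The example is in C++ rather than Fortran, and the names to_tenths and tenths_to_string are hypothetical. It assumes one decimal place is wanted and that the values fit comfortably in a 32-bit integer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>

// Store a coordinate as an integer count of tenths (hypothetical helper).
// std::lround rounds to the nearest integer, halfway cases away from zero.
std::int32_t to_tenths(double degrees) {
    return static_cast<std::int32_t>(std::lround(degrees * 10.0));
}

// Render the count of tenths as decimal text using integer arithmetic only,
// so no binary rounding error can show up in the printed digits.
std::string tenths_to_string(std::int32_t tenths) {
    const char* sign = tenths < 0 ? "-" : "";
    long long mag = std::llabs(static_cast<long long>(tenths));
    char buf[32];
    int n = std::snprintf(buf, sizeof buf, "%s%lld.%lld", sign, mag / 10, mag % 10);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, n);
}
```

Here to_tenths(10.17) is 102 and tenths_to_string(102) is "10.2". The integer is exact; only converting it back to a binary real reintroduces an approximation.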
Relating to our internal representation concerns we have similar rounding issues to output. For example, if my input data has a longitude of 10.17... but I want to round it in my internal representation to (the nearest representable value to) a single decimal digit (say 10.2/10.1999998) and then work through with that, how do I manage that?
We've seen how nint(10.17*10)/10. gives us this, but we've also learned something about how numeric edit descriptors do this nicely for output, including controlling the rounding mode:
character(10) :: intermediate
real :: rounded
write(intermediate, '(RN,F0.1)') 10.17
read(intermediate, *) rounded
print *, rounded ! This may look not "exact"
end
We can track the accumulation of errors here if this is desired.
The statement `round_x = nint(x*10d0)/10d0` rounds x (for abs(x) < 2**31/10; for larger numbers use dnint()) and assigns the rounded value to the round_x variable for further calculations.
As mentioned in the answers above, not all numbers with one significant digit after the decimal point have an exact representation, for example, 0.3 does not.
print *, 0.3d0
Output:
0.29999999999999999
To output a rounded value to a file, to the screen, or to convert it to a string with a single significant digit after the decimal point, use edit descriptor 'Fw.1' (w - width w characters, 0 - variable width). For example:
print '(5(1x, f0.1))', 1.30, 1.31, 1.35, 1.39, 345.46
Output:
1.3 1.3 1.4 1.4 345.5
@JohnE, using 'G10.2' is incorrect, it rounds the result to two significant digits, not to one digit after the decimal point. E.g.:
print '(g10.2)', 345.46
Output:
0.35E+03
P.S.
For NetCDF, rounding should be handled by NetCDF viewer, however, you can output variables as NC_STRING type:
write(NetCDF_out_string, '(F0.1)') 1.49
Or, alternatively, get "beautiful" NC_FLOAT/NC_DOUBLE numbers:
beautiful_float_x = nint(x*10.)/10. + epsilon(1.)*nint(x*10.)/10./2.
beautiful_double_x = dnint(x*10d0)/10d0 + epsilon(1d0)*dnint(x*10d0)/10d0/2d0
P.P.S. @JohnE
The preferred solution is not to round intermediate results in memory or in files. Rounding is performed only when the final human-readable output is produced.
1. Use print with edit descriptor 'Fw.1', see above.
2. There are no simple and reliable ways to accurately store rounded numbers (numbers with a decimal fixed point):
2.1. Theoretically, some Fortran implementations can support decimal arithmetic, but I am not aware of implementations in which 'selected_real_kind(4, 4, 10)' returns a value other than -5;
2.2. It is possible to store rounded numbers as strings;
2.3. You can use the Fortran binding of the GMP library. Functions with the mpq_ prefix are designed to work with rational numbers.
3. There are no simple and reliable ways to write rounded numbers in a netCDF file while preserving their properties for the reader of this file:
3.1. netCDF supports 'Packed Data Values', i.e. you can set an integer type with the attributes 'scale_factor' and 'add_offset' and save arrays of integers. But in the file, 'scale_factor' will be stored as a single- or double-precision floating-point number, i.e. the value will differ from 0.1. Accordingly, when the netCDF library computes unpacked_data_value = packed_data_value*scale_factor + add_offset on reading, there will be a rounding error. (You can set scale_factor=0.1*(1.+epsilon(1.)) or scale_factor=0.1d0*(1d0+epsilon(1d0)) to exclude a large number of digits '9'.);
3.2. There are C_format and FORTRAN_format attributes. But it is quite difficult to predict which reader will use which attribute and whether they will use them at all;
3.3. You can store rounded numbers as strings or user-defined types.
4. Use write() with edit descriptor 'Fw.1', see above.

converting floating point values to ascii and back again without introducing errors

At first sight, this seems trivial, but the usual (radix 2 <-> radix 10) FP<->ASCII conversions cannot always be done without introducing errors. Granted, these are small, but what options exist to make the conversions to and from ASCII perfect, that is, what are the possibilities of making the conversions, without introducing any error at all? I was thinking about base64 encoding, or bit-encoding (e.g. something like 11110101010...), both of these would preserve the radix.
EDIT: Since I can't answer myself, here's what I had in mind:
double d{.1};
auto const s(::std::to_string(*reinterpret_cast<::std::uint64_t*>(&d)));
::std::uint64_t n(::std::stoull(s));
auto const e(*reinterpret_cast<double*>(&n));
assert(d == e);
What do you mean exactly by "without introducing errors"? If it
is for the machine to reread later, 17 digits precision
guarantees round trip: the actual value in the text will not be
the exact value of the double, but it will be closer to the
original double value than to any other double value, so
reconversion to double will result in the initial value. If you
have access to C++11, you can also set the format to output the
value in hex:
std::cout.setf( std::ios_base::fixed | std::ios_base::scientific,
std::ios_base::floatfield );
In this case, the output should be exact, regardless of the
precision.
If it is for humans to read, and know the exact value, there is
nothing in the standard library which will guarantee this. In
theory, outputting 53 digits should suffice, but neither the
C++ standard nor the IEEE standard require the implementation to
guard against rounding errors in the conversion routine at this
precision, and some implementations just append a sufficiently
large number of '0' after the 19th or 20th digit, rather than
waste runtime calculating incorrect values.
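A minimal sketch of both machine round-trip options mentioned in this answer, written with the C stdio functions instead of the iostream flags shown above (that substitution is mine). The function names are hypothetical. It assumes a C++11 library, where "%a" prints hexadecimal floating point and std::strtod accepts it back.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Decimal text with max_digits10 significant digits (17 for an IEEE 754 double).
std::string to_decimal_text(double v) {
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%.*g",
                          std::numeric_limits<double>::max_digits10, v);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, n);
}

// Hexadecimal floating-point text, which spells out the binary value directly.
std::string to_hex_text(double v) {
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%a", v);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, n);
}

// std::strtod parses both decimal and hexadecimal floating-point input.
double from_text(const std::string& s) {
    return std::strtod(s.c_str(), nullptr);
}
```

Under those assumptions, from_text(to_decimal_text(x)) and from_text(to_hex_text(x)) should both give back x for any finite double x.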
I think the question you are asking is how to round-trip a floating point double value via an ASCII (string) representation. I agree, for this purpose printing the number in fixed or floating point decimal notation is completely unsuitable.
If you don't care what the string looks like then the simple solution is to just treat the 8 byte double as two integers. Two hex integers will occupy 16 character positions. With practice you can even read one of these and estimate the value.
The same thing in Base-64 just reduces the number of character positions (to 11/12). The number formatted this way is quite unreadable.
There are other ways, but why bother? These should suffice.
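The bit-pattern approach could look roughly like this. It is a sketch with made-up names (encode_bits, decode_bits), and it assumes a 64-bit double and that writer and reader use the same floating-point format. std::memcpy is used to copy the bits, which avoids the aliasing problems of casting pointers between double and integer types.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes a 64-bit double");

// Encode the raw bits of a double as 16 lowercase hex characters.
std::string encode_bits(double v) {
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    char buf[17];
    int n = std::snprintf(buf, sizeof buf, "%016llx",
                          static_cast<unsigned long long>(bits));
    assert(n == 16);
    return std::string(buf, 16);
}

// Decode 16 hex characters back into the identical double.
double decode_bits(const std::string& hex) {
    std::uint64_t bits = std::strtoull(hex.c_str(), nullptr, 16);
    double v;
    std::memcpy(&v, &bits, sizeof v);
    return v;
}
```

On an IEEE 754 platform encode_bits(1.0) is "3ff0000000000000", and decode_bits(encode_bits(d)) returns d bit for bit.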

How to reestablish double in c++

When a number is stored as a double, some of its precision is lost. For example, the number 37.3 can be represented as 37.29999999999991.
I need to restore the original value from the corrupted double (my project requires that). One approach is converting the double into a CString.
double d = 37.3;
CString str;
str.Format("%.10f", d);
Output: str = 37.3;
This way I could restore the corrupted d. However, I found a counterexample. If I set
d = 37.3500;
then its double representation is sometimes equal to 37.349998474121094. When converting d to CString, the output is still 37.3499984741, which is not actually equal to 37.3500.
Why did converting 37.3500 not give the desired answer, while 37.3 did? Is there any way to restore the double?
Thanks.
Why did converting 37.3500 not give the desired answer, while 37.3 did?
By accident. The representation of 37.3 happened to be close enough that rounding to 10 decimal places gave the expected result, while 37.3499984741 wasn't.
Is there any way to restore the double?
No, once information has been lost, you can't recover it. If you need an exact representation of decimal numbers, then you'll need a different format than binary floating point. There's no suitable decimal type in the C++ language or standard library; depending on your needs, you might consider libraries such as Boost.Multiprecision or GMP. Alternatively, if you can limit the number of decimal places you need, you might be able to multiply all your numbers by that scale and work with exact integers.
It can be done to some extent, but not easily. Since the string representation is in base 10 but the internal representation is in base 2, there is rounding involved when converting one into the other. So when you convert the decimal "37.35" to double, the result is not identical to the original number. When converting that number back to a string, the computer cannot know for sure what number was there in the first place, because there are several decimal numbers that result in the same double. However, you can add the constraint that you want the shortest possible decimal string that results in the given double; then there is a very good chance that it recovers your original string precisely. An algorithm using that constraint has been developed by David Gay. Here's the source code, you need both g_fmt.c and dtoa.c, and here is a paper about it. This is the default algorithm used in Python since Version 3.1.
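To make the "shortest string that round-trips" constraint concrete, here is a brute-force sketch. It is not David Gay's algorithm; it simply tries increasing precisions until parsing the text gives back the same double. The function name shortest_roundtrip is hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Return the decimal text with the fewest significant digits that parses
// back to exactly the same double. Brute force: try 1..17 digits.
std::string shortest_roundtrip(double v) {
    char buf[64];
    for (int digits = 1; digits <= 17; ++digits) {
        int n = std::snprintf(buf, sizeof buf, "%.*g", digits, v);
        assert(n > 0 && n < static_cast<int>(sizeof buf));
        if (std::strtod(buf, nullptr) == v) {
            return std::string(buf, n);
        }
    }
    // Reached only for values that never compare equal (NaN);
    // buf still holds the 17-digit form.
    return std::string(buf);
}
```

For a double that was set from the literal 37.35 this yields "37.35". If the value was first stored in a float and then widened, as the 37.349998474121094 in the question suggests, the shortest string that round-trips is a long one, because that double really is a different number.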

Conversion from string to double - Possibility and errors

I am aware that the string 2.34 would never be equal to the double 2.34, no matter what library or algorithm you try (lexical_cast, atof). 2.3400 cannot be represented as a double either; instead it will be equal to 2.3399999999999999.
A little background: I am working on an application that passes values to an external application using its API. Think of it as some sort of trading application. The user can pass values either through the application's API or by using the application directly. When the user uses the application directly and types in 2.34, the value is processed as 2.34. However, when I use the API, which requires a double as a parameter, I pass 2.34 and it is passed on as 2.3399999999999999, which is not acceptable. My question is: how would the application be handling this, and is there a way to store 2.34000... in a double so that I could pass it to the API?
If you need to pass decimal values through an API which takes double but you need to get the exact values, there isn't much of a problem: As long as you don't use more than std::numeric_limits<double>::digits10 digits, you can recover the original decimal value although not necessarily the same representation (trailing fractional zeros will be lost). To do so, you need to convert the original decimal string into the closest representation as double and later use a suitable algorithm to restore the best decimal representation again. The parsing and formatting functions from the C and C++ standard libraries will do that correctly for you.
Note that you shouldn't try to do any arithmetic on the double values when you want to restore the original decimal values: the result of double arithmetic will use binary rounding and the values won't be the closest decimal values. However, as long as you only transfer the double values, there is no problem.
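A small sketch of that transfer, assuming the decimal input has at most std::numeric_limits<double>::digits10 (15) significant digits and that no arithmetic is done on the double in between. The function name decimal_via_double is made up for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Convert decimal text to the nearest double (the value handed to the API),
// then format it back with digits10 significant digits.
std::string decimal_via_double(const std::string& decimal_text) {
    double wire = std::strtod(decimal_text.c_str(), nullptr);
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%.*g",
                          std::numeric_limits<double>::digits10, wire);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, n);
}
```

decimal_via_double("2.34") gives back "2.34". Trailing fractional zeros are not preserved, so "2.3400" also comes back as "2.34".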
Since you mention "trading application" I will conclude that the numbers represent currencies. If that is the case you are probably dealing with a fixed number of fractional digits as well. In that case you can scale your floating point numbers by multiplying them by 10 ^ number_of_fractional_digits, essentially making them integer values. Floating point numbers can accurately store integer values (as long as they do not exceed the floating point type's range).
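The scaling idea might be sketched like this, storing the result in a 64-bit integer. The name to_minor_units and the limit on fractional_digits are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Scale an amount by 10^fractional_digits and round to the nearest integer,
// e.g. 2.34 with 2 fractional digits becomes 234 (hypothetical helper).
std::int64_t to_minor_units(double amount, int fractional_digits) {
    assert(fractional_digits >= 0 && fractional_digits <= 9);
    double scale = 1.0;
    for (int i = 0; i < fractional_digits; ++i) {
        scale *= 10.0;
    }
    // The product may land just below the intended integer (2.34 * 100 does),
    // so round to nearest instead of truncating.
    return static_cast<std::int64_t>(std::llround(amount * scale));
}
```

Once the values are integers, sums and comparisons are exact: to_minor_units(0.1, 2) + to_minor_units(0.2, 2) equals to_minor_units(0.3, 2), which is not true of the doubles themselves.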
Another possibility - if the assumptions above are correct - would be to use Binary-coded decimals.
One way to work around floating point precision issues is to use a well-made fraction class. You may code one yourself or use one provided by a common math library. Such a class will represent your 2.34 as 234/100 internally, which leads to higher memory consumption than a single float.
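A very small sketch of such a fraction type, just enough to show the idea. The names Fraction, make_fraction and add are hypothetical, overflow is not handled, and this version stores the fraction in lowest terms (117/50 rather than 234/100).

```cpp
#include <cassert>
#include <cstdint>

// Exact rational number num/den, with den kept positive.
struct Fraction {
    std::int64_t num;
    std::int64_t den;
};

// Greatest common divisor of the absolute values (Euclid's algorithm).
std::int64_t gcd64(std::int64_t a, std::int64_t b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        std::int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Build a fraction in lowest terms.
Fraction make_fraction(std::int64_t num, std::int64_t den) {
    assert(den != 0);
    if (den < 0) {
        num = -num;
        den = -den;
    }
    std::int64_t g = gcd64(num, den);  // at least 1 because den != 0
    return Fraction{num / g, den / g};
}

// Exact addition: a/b + c/d = (a*d + c*b) / (b*d).
Fraction add(const Fraction& a, const Fraction& b) {
    return make_fraction(a.num * b.den + b.num * a.den, a.den * b.den);
}
```

With this, 1/10 + 2/10 is exactly 3/10, whereas the corresponding double sum is not equal to 0.3.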

Weird bug with floats in if-statement

So in my C++ code I have the following line of code for debugging purposes:
if(float1 != float2)
{
std::cout<<float1<<" "<<float2<<std::endl;
}
What's happening is that the program is entering into the if-statement...but when I print out the two float values they are the same. But if they were the same, then it should bypass this if-statement completely. So I'm really confused as to why this is happening.
The floats may just have very similar values. By default, the I/O library rounds floating point output to 6 significant digits. You can ensure that you get the full precision by calling the precision member function of std::cout:
if(float1 != float2)
{
std::cout.precision(9);
std::cout<<float1<<" "<<float2<<std::endl;
}
Now you should see the difference. The value 9 is the number of significant base-10 digits needed to distinguish any two IEEE 754 32-bit float values (see @EricPostpischil's comment below).
Floating-point values are typically stored in computer memory in binary format. Meanwhile, values you print through cout are represented in decimal format. The conversion from binary floating-point representation to decimal representation can be lossy, depending on your conversion settings. This immediately means that what you print is not necessarily exactly the same as what is actually stored in memory. This explains why the direct comparison between float1 and float2 might say that they are different, while the decimal printout might look identical.
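The effect can be reproduced with two adjacent float values. This is a sketch with hypothetical helper names; it formats into a string stream instead of std::cout so the two texts can be compared.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Text as produced by the default stream precision (6 significant digits).
std::string default_text(float v) {
    std::ostringstream os;
    os << v;
    return os.str();
}

// Text with max_digits10 significant digits (9 for an IEEE 754 float),
// enough to tell any two distinct float values apart.
std::string full_text(float v) {
    std::ostringstream os;
    os.precision(std::numeric_limits<float>::max_digits10);
    os << v;
    return os.str();
}
```

For example, with float a = 0.1f and float b = std::nextafter(a, 1.0f), a != b is true, default_text(a) and default_text(b) are the same string, and full_text(a) and full_text(b) differ.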