Error reading float values - c++

I'm trying to make a very simple program that will first read an integer N, then read N values, which are floats, and print them back to the screen with 2 digits after the decimal point. It's very simple, right? But it went wrong! Here is the source code:
#include<cstdio>
int main()
{
float d;
int n;
scanf("%d",&n);
while(n>0)
{
n--;
scanf("%f",&d);
printf("%.2f\n",d);
}
return 0;
}
When I enter small values, such as 3.1 or 1.0, it works very well. But when I give big values such as -765057.71 or 978715.10, it prints a different value: for -765057.71 it prints -765057.69, and for 978715.10 it prints 978715.13. Why is this happening, and how can I fix it?

Primitive floating-point types do not have unlimited precision. There are some values that cannot be represented exactly, as you are seeing. You can mitigate this problem somewhat by using double instead of float, but for large enough values you will still see the same issue.
Other, more robust solutions include using a prebuilt arbitrary-precision library, rolling your own such library (if you've got some time to kill), representing your floating point values using two integers, one for the part before the decimal and one for the part after, or (if all you want to do is echo the number back out truncated to a given decimal position) using strings instead.
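As a rough sketch of the integer idea (here collapsed into one scaled integer rather than two), the number can be parsed from text into a count of hundredths and printed back without ever touching float. The names parse_hundredths and format_hundredths are made up for this illustration, and the code assumes well-formed input that fits in a long long; digits past the second decimal are simply cut off.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: turn text such as "-765057.71" into a count of hundredths.
// Assumes well-formed input; digits beyond the second decimal are dropped.
long long parse_hundredths(const std::string& text)
{
    std::size_t i = 0;
    bool negative = false;
    if (i < text.size() && (text[i] == '-' || text[i] == '+')) {
        negative = (text[i] == '-');
        ++i;
    }
    long long value = 0;
    for (; i < text.size() && text[i] != '.'; ++i)
        value = value * 10 + (text[i] - '0');   // integer part
    if (i < text.size())
        ++i;                                    // skip the decimal point
    for (int k = 0; k < 2; ++k) {               // exactly two fractional digits
        int digit = (i < text.size()) ? text[i++] - '0' : 0;
        value = value * 10 + digit;
    }
    return negative ? -value : value;
}

// Hypothetical helper: print a count of hundredths with two decimals.
std::string format_hundredths(long long value)
{
    char buffer[32];
    long long magnitude = value < 0 ? -value : value;
    std::snprintf(buffer, sizeof buffer, "%s%lld.%02lld",
                  value < 0 ? "-" : "", magnitude / 100, magnitude % 100);
    return buffer;
}
```

With this, format_hundredths(parse_hundredths("-765057.71")) gives back "-765057.71", because the value is held as the exact integer -76505771.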
You may find this an interesting read: http://en.wikipedia.org/wiki/IEEE754

Because a single-precision float cannot store -765057.71 to that many digits of precision. It can store the most significant digits, but it has to more or less "round" off at some point.
Single-precision floats (float) get about 7 decimal digits of precision. If you need more, you need to use double-precision floats (double).
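To see the difference, here is a small sketch that formats the same literal once as a float and once as a double. The helper name two_decimals is just for this example.

```cpp
#include <cstdio>
#include <string>

// Format a value with two digits after the decimal point.
// A float argument is converted to double without changing its value,
// so whatever was lost when the float was created stays lost.
std::string two_decimals(double value)
{
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.2f", value);
    return buffer;
}
```

two_decimals(-765057.71f) gives "-765057.69", because the nearest float is -765057.6875 (floats of this size are spaced 1/16 apart), while two_decimals(-765057.71) gives "-765057.71", because a double has enough digits to spare.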

Floating point numbers are only an approximate representation. I suggest you read the article here

Is it possible to take a decimal number as input from the user without using float/double data type?

Is it possible to take a decimal number as input from the user without using float/double data type?
And this number has to be used further for calculation.
I'm making a function to calculate the square root of a number without using cmath library, I'm doing this for a DSP that doesn't have FPU so I can't use float/double.
If I will be using string to take input as decimal number from the user then how can I convert it into a number, and if it is converted into a number then what should be the return type of the square root function (the root will be a fractional number)?
If possible please suggest some alternative way instead of strings.
I know it is related to fixed point arithmetic but I don't know how to implement it in C++.
Foreword: Compilers can implement software floating point operations for CPU's which do not have floating point hardware, so it is often not a problem to use float and double on such systems.
I recommend using the standard fundamental floating point types as long as they are supported by your compiler.
Is it possible to take a decimal number as input from the user without using float/double data type?
Yes. User input is done using character streams. You can read input into a string without involving any numeric type.
And this number has to be used further for calculation.
To do calculation, you must first decide how you would like to represent the number. There are several alternatives to hardware floating point:
Fixed point: Use the integer 100 to represent 0.0100 for example.
Software floating point: Use one integer to represent mantissa, another integer to represent the exponent and a boolean to represent sign.
Rational numbers: Use one integer to represent numerator and another to represent denominator.
Probably many others...
Each of these has different implementations of the arithmetic operations.
Fixed point is the simplest and probably most efficient, but has both small range and poor precision near zero (well, equal precision across the entire range but poor compared to floating point which has high precision near zero and very poor precision far from zero).
Software floating point allows potentially reproducing hardware behaviour by following the ubiquitous IEEE-754 standard.
Rational numbers have problems with overflowing as well as redundant representations. I don't think they are used much except with arbitrary precision integers.
(the root will be a fractional number)
Technically, most roots are irrational and thus not fractional. But since irrational numbers are not representable by computers (except in symbolic form), the best we can achieve is some fractional number close to the actual root.
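To make the fixed point option concrete, here is a minimal sketch using only integer arithmetic. It assumes a made-up Q16.16 format (the stored integer is the value times 65536), non-negative well-formed input, and a compiler that provides 64-bit integers. The names fixed_t, parse_fixed, isqrt64 and sqrt_fixed are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical Q16.16 format: real value = raw / 65536.
using fixed_t = std::uint32_t;
constexpr int kFracBits = 16;

// Parse a non-negative decimal string such as "2.25" without float/double.
// Assumes well-formed input and an integer part below 65536.
fixed_t parse_fixed(const std::string& text)
{
    std::uint64_t whole = 0, frac = 0, scale = 1;
    std::size_t i = 0;
    for (; i < text.size() && text[i] != '.'; ++i)
        whole = whole * 10 + (text[i] - '0');
    if (i < text.size())
        ++i;                                          // skip the decimal point
    for (; i < text.size() && scale < 1000000000ULL; ++i) {
        frac = frac * 10 + (text[i] - '0');           // at most 9 fractional digits
        scale *= 10;
    }
    // frac / scale is the fraction; rescale it to 1/65536 units, rounding to nearest.
    return static_cast<fixed_t>((whole << kFracBits)
                                + ((frac << kFracBits) + scale / 2) / scale);
}

// Integer square root (floor) by the classic bit-by-bit method.
std::uint32_t isqrt64(std::uint64_t n)
{
    std::uint64_t root = 0;
    std::uint64_t bit = 1ULL << 62;                   // highest power of four
    while (bit > n)
        bit >>= 2;
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return static_cast<std::uint32_t>(root);
}

// sqrt(raw / 2^16) * 2^16 equals sqrt(raw * 2^16), so shift first, then take the root.
fixed_t sqrt_fixed(fixed_t x)
{
    return isqrt64(static_cast<std::uint64_t>(x) << kFracBits);
}
```

For example, parse_fixed("2.25") is 147456 (2.25 * 65536) and sqrt_fixed of that is 98304, which is 1.5 * 65536. The return type of the square root is simply the same fixed point integer type.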
Here's my solution.
In summary, 12345 has
5 1's units,
4 10's units,
3 100's units,
2 1000's units and
1 10,000's units.
What's below is a verbose description of doing this in C++. It doesn't detect every possible error, such as the user typing MAXINT+1 and the output coming back as -MAXINT. fgets limits the read to the size of the buffer, so longer input is cut off; you probably take input from some other source within the greater program anyhow, so this is just to show the principle.
#include <cstdio>
#include <cstring>
int main()
{
int error=0; //error condition if you can't indicate err by -1 or such
int output=0;
char input[256];
if (!fgets(input,sizeof input,stdin)) return 1; //get input
input[strcspn(input,"\r\n")]='\0'; //cut off the trailing newline
//set up a loop that walks the input from right to left
//(same effect as reversing the string: 12345 is visited as 5,4,3,2,1)
int column=(int)strlen(input)-1; //index of the last character, the 1's column
long long units=1; //we start with the 1's column and then 10s, 100's, etc
while (column >= 0) {
int val=input[column];
//nitty gritty
if ((val>='0')&&(val<='9')){ //note the quotes amounting to int values 48 and 57
val-='0'; //convert the ascii/utf-8 digit into its intval
output+=(int)(val*units); //multiply by the units of the column and add to the output
units*=10; //end of this iteration, set up for next unit scale
}
else if ((val=='-')&&(column==0)){ //test for the unique circumstance of a leading minus sign
output*=-1;
}
else if ((val=='+')&&(column==0)){ //a leading plus sign changes nothing
}
else{
error=1; //the user typed a character not conforming to an intval
break;
}
column--; //move one column to the left
}
if (error) printf("invalid input\n"); else printf("%d\n",output);
return 0;
}
EDIT: I realise I didn't read the full question and there's also a need for a square root function. When FPU's were added extras in 8-bit, 8086, 286 and 386SX days, the standard technique was to store a look up table in memory. There are mathematical functions you can use involving natural logarithms but the expense involved in processor time was such that it was cheaper just to make a table with each value that you wanted to sqrt and lookup the table for the value.

C++ set precision of a double (not for output)

Alright so I am trying to truncate actual values from a double with a given number of digits precision (total digits before and after, or without, decimal), not just output them, not just round them. The only built in functions I found for this truncates all decimals, or rounds to given decimal precision.
Other solutions I have found online, can only do it when you know the number of digits before the decimal, or the entire number.
This solution should be dynamic enough to handle any number. I whipped up some code that does the trick below, however I can't shake the feeling there is a better way to do it. Does anyone know of something more elegant? Maybe a built in function that I don't know about?
I should mention the reason for this. There are 3 different sources of observed values. All 3 of these sources agree to some level in precision. Such as below, they all agree within 10 digits.
4659.96751751236
4659.96751721355
4659.96751764253
However, I only need to pull from 1 of the sources. So the best approach is to only use up to the precision all 3 sources agree on. So it's not like I am manipulating numbers and then need to truncate precision; they are observed values. The desired result is
4659.967517
#include <algorithm> // min
#include <cstdlib> // atof
#include <iomanip> // setprecision
#include <sstream> // ostringstream
#include <string>
using namespace std;
double truncate(double num, int digits)
{
// check valid digits
if (digits < 0)
return num;
// create string stream for full precision (string conversion rounds at 10)
ostringstream numO;
// read in number to stream, at 17+ precision things get wonky
numO << setprecision(16) << num;
// convert to string, for character manipulation
string numS = numO.str();
// check if we have a decimal
size_t decimalIndex = numS.find('.');
// if we have a decimal, erase it for now, logging its position
if (decimalIndex != string::npos)
numS.erase(decimalIndex, 1);
// a leading minus sign is not a digit, so skip over it
size_t first = (!numS.empty() && numS[0] == '-') ? 1 : 0;
// make sure our target precision is not higher than current precision
size_t keep = min(numS.size() - first, (size_t)digits);
// replace unwanted precision with zeroes
numS.replace(first + keep, numS.size() - first - keep, numS.size() - first - keep, '0');
// if we had a decimal, add it back
if (decimalIndex != string::npos)
numS.insert(numS.begin() + decimalIndex, '.');
return atof(numS.c_str());
}
This will never work since a double is not a decimal type. Truncating what you think are a certain number of decimal digits will merely introduce a new set of joke digits at the end. It could even be pernicious: e.g. 0.125 is an exact double, but neither 0.12 nor 0.13 are.
If you want to work in decimals, then use a decimal type, or a large integral type with a convention that part of it holds a decimal portion.
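If the observed values arrive as text in the first place (an assumption on my part), one way to stay in decimal is to truncate the text itself and never round-trip through double. This is only a sketch; truncate_significant is a made-up name, and it expects plain decimal notation with no exponent.

```cpp
#include <string>

// Keep the first `digits` significant decimal digits of a number written as text.
// Dropped digits before the decimal point become '0'; dropped digits after it vanish.
std::string truncate_significant(const std::string& text, int digits)
{
    std::string out;
    int kept = 0;               // significant digits copied so far
    bool seen_point = false;
    bool seen_nonzero = false;  // leading zeros are not significant
    for (char c : text) {
        if (c == '.') {
            if (kept >= digits)
                break;          // no fractional digits survive
            seen_point = true;
            out += c;
        } else if (c >= '0' && c <= '9') {
            if (c != '0')
                seen_nonzero = true;
            if (kept < digits) {
                out += c;
                if (seen_nonzero)
                    ++kept;
            } else if (!seen_point) {
                out += '0';     // preserve the magnitude of the integer part
            } else {
                break;          // drop the rest of the fraction
            }
        } else {
            out += c;           // sign character
        }
    }
    return out;
}
```

truncate_significant("4659.96751751236", 10) yields "4659.967517", which is the result asked for, and it is exact because no binary conversion is involved.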
I disagree with "So the best approach, is to only use up to the precision all 3 sources agree on."
If these are different measurements of a physical quantity, or represent rounding error due to different ways of calculating from measurements, you will get a better estimate of the true value by taking their mean than by forcing the digits they disagree about to any arbitrary value, including zero.
The ultimate justification for taking the mean is the Central Limit Theorem, which suggests treating your measurements as a sample from a normal distribution. If so, the sample mean is the best available estimate of the population mean. Your truncation process will tend to underestimate the actual value.
It is generally better to keep every scrap of information you have through the calculations, and then remember you have limited precision when outputting results.
As well as giving a better estimate, taking the mean of three numbers is an extremely simple calculation.
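For what it's worth, the mean really is a one-liner. A minimal sketch, assuming the readings are already in a non-empty vector (the function name mean is mine):

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of readings.
double mean(const std::vector<double>& values)
{
    return std::accumulate(values.begin(), values.end(), 0.0) / values.size();
}
```

For the three readings in the question, mean({4659.96751751236, 4659.96751721355, 4659.96751764253}) comes out at roughly 4659.9675174561, and you can then print it with however many digits you trust.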

Controlling the amount of decimal places

Is there a way within C++ of setting a definite number of decimal places for a float value? For example, if I were to record multiple times as float values, I would most likely get results with different numbers of decimal places, and I would like to generate numbers of the same length: if one number comes back as 1.33 and others come back as, say, 1.333, I would like the first result to read 1.330.
I understand there are methods of limiting the number of decimal places, such as setprecision(), but I do not want to lose accuracy in my times.
You seem to confuse two things: the actual precision of floating point calculations in C++, and formatting of float (or double, or long double) values when printing with C++ streams (like cout, for example).
The first depends on the hardware/platform, and you cannot control it, apart from choosing between float, double and long double. If you need better precision than what long double can give you, you need a library for arbitrary precision maths, for example GMP.
Controlling number of digits after dot when printing/formatting is easier, see for example this question: Set the digits after decimal point
If you need to limit the digits after the decimal point, whether of a float, double or long double, the way to do it is setprecision. Used on its own, the precision counts the digits before the decimal point as well, and if there are fewer digits after the decimal point than the precision being set, it will not add zeros after them. The solution is to use fixed and showpoint. So if you want 3 digits after the decimal point, write this line before displaying the values; it changes only how they are printed, not the values themselves.
cout<<fixed<<showpoint<<setprecision(3);
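The same manipulators work on a string stream, which makes the effect easy to check. A small sketch (format_fixed is just a name I picked):

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Format a value with exactly `decimals` digits after the decimal point.
// Only the text is affected; the double itself keeps its full precision.
std::string format_fixed(double value, int decimals)
{
    std::ostringstream out;
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}
```

format_fixed(1.33, 3) gives "1.330" and format_fixed(1.3333, 3) gives "1.333", so both times line up with the same number of decimals.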

Write a float with full precision in C++

In C++, can I write and read back a float (or double) in text format without losing precision?
Consider the following:
float f = ...;
{
std::ofstream fout("file.txt");
// Set some flags on fout
fout << f;
}
float f_read;
{
std::ifstream fin("file.txt");
fin >> f_read;
}
if (f != f_read) {
std::cout << "precision lost" << std::endl;
}
I understand why precision is lost sometimes. However, if I print the value with enough digits, I should be able to read back the exact same value.
Is there a given set of flags that is guaranteed to never lose precision?
Would this behaviour be portable across platforms?
If you don't need to support platforms that lack C99 support (MSVC), your best bet is actually to use the %a format-specifier with printf, which always generates an exact (hexadecimal) representation of the number while using a bounded number of digits. If you use this method, then no rounding occurs during the conversion to a string or back, so the rounding mode has no effect on the result.
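A minimal sketch of the %a route, assuming a C99-conforming C library. The helper names to_hex_text and from_hex_text are made up; strtof is used for reading because it accepts the hexadecimal form.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Write a float as an exact hexadecimal floating-point literal.
std::string to_hex_text(float value)
{
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%a", value);  // float is passed as double, exactly
    return buffer;
}

// Read the literal back; no rounding is needed because the text is exact.
float from_hex_text(const std::string& text)
{
    return std::strtof(text.c_str(), nullptr);
}
```

The text can go into the file with an ordinary fout << to_hex_text(f), and from_hex_text applied to the stored token gives back a value that compares equal to the original.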
Have a look at this article: How to Print Floating-Point Numbers Accurately and also at that one: Printing Floating-Point Numbers Quickly and Accurately.
It is also mentioned on stackoverflow here, and there is some pointer to an implementation here.
if I print the value with enough digits, I should be able to read back the exact same value
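Not if you write it in decimal with too few digits: there's not an integer relationship between the number of binary digits and the number of decimal digits required to represent a number, so the default stream precision of 6 is not enough. Writing std::numeric_limits<float>::max_digits10 significant digits (9 for float, 17 for double) is enough to read the same value back, provided the library's conversions are correctly rounded. If you print your number out in binary or hexadecimal, you'll be able to read it back without losing any precision and without relying on rounding at all.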
In general, floating point numbers are not portable between platforms in the first place, so your text representation is not going to be able to bridge that gap. In practice, most machines use IEEE 754 floating point numbers, so it'll probably work reasonably well.
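For comparison, a decimal round trip is commonly done by writing max_digits10 significant digits. The sketch below assumes IEEE 754 floats and a standard library whose text conversions are correctly rounded; to_decimal_text and from_decimal_text are names made up for the example.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Write a float with enough significant decimal digits (9) to identify it uniquely.
std::string to_decimal_text(float value)
{
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<float>::max_digits10) << value;
    return out.str();
}

// Read the decimal text back into a float.
float from_decimal_text(const std::string& text)
{
    std::istringstream in(text);
    float value = 0.0f;
    in >> value;
    return value;
}
```

The decimal text is not the exact value of the float, only the shortest fixed budget of digits that is guaranteed to land back on the same float when parsed.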
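You can't necessarily print the exact value of a "power of two" float in a short decimal string. A binary float does have a finite decimal expansion, but it can run far longer than the 7 or so digits a float is good for.
The general problem appears when one base does not divide the other: think of using base three to store 1/3, now try and print 1/3 in decimal perfectly.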
For solutions see: How do you print the EXACT value of a floating point number?

C++: How to Convert From Float to String Without Rounding, Truncation or Padding? [duplicate]

I am facing a problem and unable to resolve it. Need help from gurus. Here is sample code:-
float f=0.01f;
printf("%f",f);
If we check the value of the variable during debugging, f contains '0.0099999998' and the output of printf is 0.010000.
a. Is there any way to force the compiler to assign exactly the value I wrote to a variable of float type?
b. I want to convert a float to a string/character array. How can I make sure that exactly the same value is converted: no padded zeros, no unwanted digits, and no changed digits as in the above example?
It is impossible to accurately represent a base 10 decimal number using base 2 values, except for a very small number of values (such as 0.25). To get what you need, you have to switch from the float/double built-in types to some kind of decimal number package.
You could use boost::lexical_cast in this way:
float blah = 0.01;
string w = boost::lexical_cast<string>( blah );
The variable w will contain the text value 0.00999999978. But I can't see when you really need it.
It is preferred to use boost::format to format a float as a string. The following code shows how to do it:
float blah = 0.01;
string w = str( boost::format("%d") % blah ); // w contains exactly "0.01" now
Have a look at this C++ reference. Specifically the section on precision:
float blah = 0.01;
printf ("%.2f\n", blah);
There are uncountably many real numbers.
There are only a finite number of values which the data types float, double, and long double can take.
That is, there will be uncountably many real numbers that cannot be represented exactly using those data types.
The reason that your debugger is giving you a different value is well explained in Mark Ransom's post.
Regarding printing a float without rounding or truncation and with fuller precision, you are missing the precision specifier: the default precision for printf is 6 fractional digits.
Try the following to get a precision of 10 digits:
float amount = 0.0099999998;
printf("%.10f", amount);
As a side note, a more C++ way (vs. C-style) to do things is with cout:
float amount = 0.0099999998;
cout.precision(10);
cout << amount << endl;
For (b), you could do
std::ostringstream os;
os << f;
std::string s = os.str();
In truth, the floating point unit, whether a co-processor or a section of the chip itself (most are now integrated into the CPU), will not give exact results for most decimal values, only close approximations. For more accurate results, you could consider defining a class "DecimalString", which uses nybbles as decimal characters and symbols, and attempt to mimic base 10 mathematics using strings. In that case, depending on how long you want to make the strings, you could even do away with the exponent part altogether: a string of 256 characters can represent values from 1x10^-254 up to 1x10^+255 in straight decimal using actual ASCII (shorter if you want a sign), but this may prove significantly slower. You could speed this up by reversing the digit order, so from left to right they read
units,tens,hundreds,thousands....
Simple example
eg. "0021" becomes 1200
This would need "shifting" left and right to make the decimal points line up before the routines run. The best bet is to start with the ADD and SUB functions, as you will then build on them in the MUL and DIV functions. If you are on a large machine, you could make the strings theoretically as long as your heart desired!
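As a starting point for the ADD routine, here is a rough sketch for two non-negative whole numbers stored units-first, as described above. It uses plain ASCII digits in a std::string rather than packed nybbles, and the name add_reversed is made up.

```cpp
#include <cstddef>
#include <string>

// Add two non-negative integers stored as ASCII digits, least significant digit first.
std::string add_reversed(const std::string& a, const std::string& b)
{
    std::string sum;
    int carry = 0;
    for (std::size_t i = 0; i < a.size() || i < b.size() || carry != 0; ++i) {
        int digit = carry;
        if (i < a.size())
            digit += a[i] - '0';
        if (i < b.size())
            digit += b[i] - '0';
        sum += static_cast<char>('0' + digit % 10);  // digit for this column
        carry = digit / 10;                          // carried into the next column
    }
    return sum;
}
```

add_reversed("0021", "9") returns "9021", i.e. 1200 + 9 = 1209. Because the units come first, the loop can run left to right and simply append a final carry, which is where the speed-up from reversing comes from.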
Equally, you could use the C library: sprintf is in stdio.h, and ecvt and fcvt are in stdlib.h (or at least, they should be!).
int sprintf(char* dst,const char* fmt,...);
char *ecvt(double value, int ndig, int *dec, int *sign);
char *fcvt(double value, int ndig, int *dec, int *sign);
sprintf returns the number of characters it wrote to the string, for example
float f=12.00;
char buffer[32];
sprintf(buffer,"%4.2f",f); // will return 5, if it is an error it will return a negative value
ecvt and fcvt return a pointer to a static buffer containing the null-terminated decimal representation of the number, with no decimal point, most significant digit first. The offset of the decimal point is stored in dec and the sign in sign (1 = negative, 0 = positive). For ecvt, ndig is the number of significant digits to store; for fcvt, it is the number of digits after the decimal point. If dec<0 then you have to pad with -dec zeros between the decimal point and the digits. If you are unsure, and you are not working on a Windows 7 system (which sometimes will not run old DOS 3 programs), look for Turbo C version 2 for DOS 3. There are still one or two downloads available. It's a relatively small DOS C/C++ editor/compiler from Borland that even comes with TASM, the 16-bit assembler for the 386/486, and these functions are covered in its help files, as are many other useful nuggets of information.
sprintf is declared in "stdio.h" and ecvt and fcvt in "stdlib.h", or should be, though I have found that on Visual Studio 2010 they are anything but standard, often overloaded with functions dealing with WORD-sized characters and asking you to use its own specific functions instead... "so much for standard library," I mutter to myself almost each and every time, "Maybe they ought to get a better dictionary!"
You would need to consult your platform's documentation to determine the correct format. You would display the value as a*b^C, where 'a' is the integral component that holds the sign, 'b' is the implementation-defined base (likely fixed by a standard), and 'C' is the exponent used for that number.
Alternatively, you could just display it in hex, it'd mean nothing to a human, though, and it would still be binary for all practical purposes. (And just as portable!)
To answer your second question:
It IS possible to exactly and unambiguously represent floats as strings. However, this requires a hexadecimal representation. For instance, in hexadecimal 1/16 is 0.1 and 10/16 is 0.A.
With hex floats, you can define a canonical representation. I'd personally use a fixed number of digits representing the underlying number of bits, but you could also decide to strip trailing zeroes. There's no confusion possible on which trailing digits are zero.
Since the representation is exact, the conversions are reversible: f==hexstring2float(float2hexstring(f))