How to make a good calculator with floating point arithmetic [duplicate] - c++

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 4 years ago.
Here is how my calculator should work:
There is a JSON value where I can write the first multiplier - something like this:
{
"value1": 1.4
}
On the calculator I can enter the second multiplier, which is always a power of ten (10, 100, ..., 10000000). My calc should return an integer, because I know that people who use it will always write fewer digits after the decimal point in the first multiplier than there are zeros in the second multiplier. Yes, my calc is a very, very strange one.
Here are valid inputs:
v1=1.4; v2=100;
v1=1.414; v2=100000;
v1=1.1; v2=100;
Here is what happens: for value1=1.4 and value2=10000, for example, I get 13900. Since a float cannot represent every number exactly, it sometimes stores a slightly different value. For 1.4 it internally stores 1.399999 on my machine. I know why, but the QA engineer who tests my app tells me that I need to get 14000: "Your calc does not work." How can I make my calc print the correct number?
P.S. Of course I have stripped my real problem of its context, but the point is that I have a float in a file and a 10^n number in my program as user input. How do I get the correct result?
EDIT1: I am not asking why float works that way. I know why. I am asking how to solve the problem given that float works that way.
EDIT2: I use RapidJson to read the JSON file, and it already returns the wrong number as a double precision value. I can't use libraries that provide higher-precision floating point.

Round the result when you format it for display. A double precision value is correct to about 15 significant digits, so if you round the result to 12 significant digits you're not going to surprise the user.
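Applied to the calculator in the question, where the product is known to be an integer, the rounding can happen at the final conversion instead of in a format string. A minimal sketch, assuming the result fits in a long long; the function name scale_to_integer is made up for illustration:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: multiply the parsed double by a power of ten and
// round to the nearest integer instead of truncating toward zero.
long long scale_to_integer(double value, long long power_of_ten) {
    assert(power_of_ten > 0);
    return std::llround(value * static_cast<double>(power_of_ten));
}
```

With this, scale_to_integer(1.4, 10000) yields 14000 even though the stored value of 1.4 is slightly below 1.4.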

Related

clean up value without rounding up or down [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 6 years ago.
There is an input to my software for processing: float totalPurchased. I am coding with C++11/GCC/GDB/Linux.
The totalPurchase price is given as 14.92, as read from a file.
However, when the program runs, it shows 14.920001 out of nowhere. I don't want to round the value 14.92 up to 15.00 or down to 14.00; the only thing I really need is to have the input right, without the compiler adding things that do not exist in the input.
The problem is that this 0.000001 breaks the whole software calculation in the long run.
How do I get rid of this 0.000001 and make sure that my float variable holds the actual value that was read from the file: 14.92?
All comments and suggestions are highly appreciated.
Unfortunately, floating point can't represent your number exactly: 14.92 is a repeating fraction in binary.
The question you want to ask yourself is: why does such a small offset break your calculation? If you really need to compare values so exactly, then perhaps floating point is not the appropriate datatype.
If you need something like, say, an exact percentage, or an exact number of cents, you can store 100 times the number as an integer. This is a persistent problem in accounting, which is why accountants don't use floating point to add their money.
Use a fixed point library such as this.
For myself, in this domain, I would store price as integer pennies and do something along the line of
printf("%i.%02i", price / 100, price % 100);
Caveats for negative numbers, and in some applications you might want sub-penny precision, but you get the idea. A full fixed-point library is overkill for things like this, I think.
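To illustrate the caveat about negative numbers, here is one possible formatter that works on the magnitude and adds the sign separately. It is only a sketch; the name format_pennies and the choice of long long for the amount are assumptions, not part of the answer above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: format an amount stored as integer pennies,
// e.g. 1492 -> "14.92" and -5 -> "-0.05".
std::string format_pennies(long long pennies) {
    const bool negative = pennies < 0;
    // Work on the magnitude so that / and % never see a negative operand.
    const unsigned long long magnitude =
        negative ? 0ULL - static_cast<unsigned long long>(pennies)
                 : static_cast<unsigned long long>(pennies);
    const unsigned long long cents = magnitude % 100;
    std::string text = negative ? "-" : "";
    text += std::to_string(magnitude / 100);
    text += '.';
    if (cents < 10) text += '0';  // keep two digits after the point
    text += std::to_string(cents);
    return text;
}
```

Formatting price / 100 and price % 100 directly would render -5 pennies without a sign on the "0" part, which is why the sign is handled once up front.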
Actually, because I like to Do The Right Thing I would do something like this
#include <string>
class CashValue
{
public:
    static CashValue parse (const std::string &);
    float to_float () const;
    // returns a new value; a binary + should not return a reference
    CashValue operator + (const CashValue &) const;
    // etc
private:
    int m_pennies;
};
Thus making the significance of the unit explicit. Only support operations for which exact solutions exist. If you want to do something like this
price *= pow (1.0 + (percent_interest/100.0), years)
then my interface would force you to verbosely convert it first, making you think about the issues when they become relevant, but still supporting safe operations such as addition transparently and accurately.
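A possible way to fill in such a class is sketched below. It is not the answerer's implementation: the parsing rules (non-negative input, at most two decimals), the pennies() accessor and the use of long long are all assumptions made for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Sketch of an exact money type; every name here is illustrative.
class CashValue {
public:
    // Parses non-negative text such as "14.92", "14.9" or "14"
    // without ever going through a floating point type.
    static CashValue parse(const std::string &text) {
        const std::size_t dot = text.find('.');
        const std::string whole = text.substr(0, dot);
        std::string frac = (dot == std::string::npos) ? "" : text.substr(dot + 1);
        if (whole.empty() || frac.size() > 2 || !all_digits(whole) || !all_digits(frac))
            throw std::invalid_argument("expected digits with at most two decimals");
        frac.resize(2, '0');  // "9" -> "90", "" -> "00"
        return CashValue(std::stoll(whole) * 100 + std::stoll(frac));
    }

    // Addition of pennies is exact, so it is supported directly.
    CashValue operator+(const CashValue &other) const {
        return CashValue(m_pennies + other.m_pennies);
    }

    long long pennies() const { return m_pennies; }

private:
    explicit CashValue(long long pennies) : m_pennies(pennies) {}

    static bool all_digits(const std::string &s) {
        for (char c : s)
            if (c < '0' || c > '9') return false;
        return true;
    }

    long long m_pennies;
};
```

Because 14.92 is read as the integer 1492, the stray 0.000001 from the question never enters the calculation.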

gfortran REAL not accurate to 8 decimal places [duplicate]

This question already exists:
gfortran represents REAL incorrectly [duplicate]
Closed 8 years ago.
This question has not been previously answered. I am trying to represent a real or any number for that matter in Fortran correctly. What gfortran is doing for me is way off. For example when I declare the variable REAL pi=3.14159 fortran prints pi = 3.14159012 rather than say 3.14159000. See below:
PROGRAM Test
IMPLICIT NONE
REAL:: pi = 3.14159
PRINT *, "PI = ",pi
END PROGRAM Test
This prints:
PI = 3.14159012
I might have expected something like PI = 3.14159000 as a REAL is supposed to be accurate to at least 8 decimal places.
I'm in a good mood, so I'll try to answer this question, which is basic knowledge which can be easily googled (as already pointed out in the comments to this and your former question).
Luckily, Fortran provides some really interesting intrinsics to get some understanding of floating point numbers.
The 8 digits you are talking about are a rule of thumb and can be related to the function EPSILON(x), which returns the smallest deviation from 1 that can be represented within the chosen model (e.g. REAL(4)). This value is actually 1.19e-7, which means that your 8th significant digit is most likely wrong. I write most likely, because some numbers can be represented exactly.
In the case of PI, the smallest representable deviation can be printed using the intrinsic SPACING(PI). This shows a value of 2.38e-7, which is twice the epsilon (because PI lies between 2 and 4) and still allows for 7 correct digits.
Now, why does your value of PI get stored as 3.14159012? When you store a floating point number, you always store the nearest representable number.
Using the value of spacing, we can get the possible values for your pi. Possible numbers and their differences to your value of 3.14159 are:
Candidate      3.14159 - candidate
3.14158988      1.20E-007
3.14159012     -1.18E-007
3.14159036     -3.56E-007
As you can see, 3.14159012 is the nearest possible value to 3.14159 and is thus stored and printed.
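The same experiment can be repeated outside Fortran. The sketch below uses C++ and std::nextafter as a stand-in for SPACING; it assumes that float is an IEEE 754 single precision type, and the names Neighbours, neighbours_of and print_neighbours are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <limits>

// A stored float together with its two representable neighbours.
struct Neighbours {
    float below;
    float nearest;
    float above;
};

// Hypothetical helper: x is already the nearest representable float to the
// literal it came from; nextafter steps to the adjacent floats on each side.
Neighbours neighbours_of(float x) {
    return {std::nextafter(x, -std::numeric_limits<float>::infinity()),
            x,
            std::nextafter(x, std::numeric_limits<float>::infinity())};
}

// Prints the three candidates with 9 significant digits.
void print_neighbours(float x) {
    const Neighbours n = neighbours_of(x);
    std::printf("%.9g\n%.9g\n%.9g\n", n.below, n.nearest, n.above);
}
```

Calling print_neighbours(3.14159f) should reproduce the three candidates listed above, each 2^-22 (about 2.38e-7) apart.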
It is common for the last one or two printed digits to be erroneous. This is called floating point error.
Check this:
Week 1 - Lecture 2: Binary storage and version control / Fixed and floating point real numbers (9-08).mp4
https://class.coursera.org/scicomp-002/lecture

Dividing two floats doesn't give exact result [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 9 years ago.
I had divided 9501/100.0f expecting to get result of 95.01f, but for some deviant reason the result was 95.01000000002f.
I am aware of rounding errors and also that dividing two bigger floats can give an imprecise result, but these two numbers are relatively small, and they should not give a bad answer.
I have changed floats to doubles, only to see the same result.
So my question is: why am I seeing this false output?
And is there a workaround that avoids copying the number to a string and back?
Floating point numbers are not precise, and dealing with them has lots of idiosyncrasies.
What Every Computer Scientist Should Know About Floating-Point Arithmetic
I also enjoy Bruce Dawson's blog entries on floating point values.
Floating point numbers are numbers represented in binary with limited precision.
The difference between the expected and the actual result is caused by the fact that the number 95.01 has an infinitely repeating binary representation.
A double stores only 52 binary digits of the significand (53 counting the implicit leading bit), so the number has to be rounded before it is stored. Single precision stores only 23 (24 with the implicit bit).
It is not possible to represent 95.01 in a finite-precision binary floating point number without any error.
However, you may trust the first 6 significant decimal digits of a float (about 15 of a double), so you should print the number with a sensible format.
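As an example of printing with a sensible format, the sketch below renders a value with a fixed number of decimals. The helper name format_fixed and the limit of 17 decimals are arbitrary choices for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: render a value with a fixed number of decimals.
std::string format_fixed(double value, int decimals) {
    assert(decimals >= 0 && decimals <= 17);
    char buffer[512];  // large enough for any finite double with 17 decimals
    std::snprintf(buffer, sizeof buffer, "%.*f", decimals, value);
    return buffer;
}
```

format_fixed(9501 / 100.0f, 2) gives "95.01": the stored quotient is still inexact, but the digits beyond the trusted ones are no longer shown.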
Ahh good, another one of us has become a man in the church of programming :)
Floating points are not exact, and the results can vary with the compiler and hardware. 0.1f != 0.1 and so on; the stored value is more like 0.100000001490116 and so on.

Dividing a float by 10 [duplicate]

Closed 11 years ago.
Possible Duplicate:
Why can't decimal numbers be represented exactly in binary?
I am developing a pretty simple algorithm for mathematics use under C++.
And I have a floating point variable named "step", each time I finish a while loop, I need step to be divided by 10.
So my code is kind of like this,
float step = 1;
while ( ... ){
//the codes
step /= 10;
}
In my stupid simple logic, that should work out well: step will be divided by 10, from 1 to 0.1, from 0.1 to 0.01.
But it didn't; instead something like 0.100000000001 appears. And I was like "What The Hell"
Can someone please help me with this. It's probably something about the data type itself that I don't fully understand. So if someone could explain further, it'll be appreciated.
It is a numerical issue. The problem is that 1/10 has an infinitely long representation in binary, and repeatedly dividing by 10 accumulates the rounding error of every step. To get a more stable version, multiply the divisor instead and divide only once. But take care: the result is still not exact! You may also want to replace the float with a double to reduce the error.
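unsigned int div = 1; // exact integer divisor: 1, 10, 100, ...
while(...)
{
    // one division per iteration, so rounding errors do not accumulate
    double step = 1.0 / static_cast<double>(div);
    ....
    div *= 10; // note: an unsigned int overflows beyond 10^9
}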
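The idea of keeping the divisor exact can also be wrapped in a small function. This is a sketch under the assumption that at most 19 steps are needed, so that the power of ten fits in a 64-bit integer; the name step_for is invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns 10^-k computed with a single division.
// The divisor is built as an exact integer, so only the final division rounds.
double step_for(unsigned k) {
    assert(k <= 19);  // 10^19 is the largest power of ten that fits in 64 bits
    std::uint64_t divisor = 1;
    for (unsigned i = 0; i < k; ++i) divisor *= 10;
    return 1.0 / static_cast<double>(divisor);
}
```

Each result is then the closest double to 10^-k, so step_for(2) compares equal to the literal 0.01, which is not guaranteed after dividing a float by 10 twice.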
Division by ten cannot be exact in binary floating point arithmetic, so you see results that look a little bit off from what you expect.
Binary floating point numbers are represented as an integer ratio where the denominator is a power of two. Since there is no binary fraction exactly equal to one-tenth, you'll see the nearest representable number instead of the one you expected.

Rounding problem with double type [duplicate]

Closed 12 years ago.
Possible Duplicate:
Why don't operations on double-precision values give expected results?
I am experiencing a peculiar problem in C++. I created a variable of type double. Then I did some calculations with values assigned to other variables and assigned the result to the double variable I declared. It gave me a result with a long decimal part. I want to round it to only 2 decimal places and store it in the variable. But even after several attempts at rounding, I couldn't round it to 2 decimal places.
Then I tried another way to check what the real problem is. I created a double variable and assigned it the value 1.11. But when I debugged it by setting a breakpoint and adding a watch for that variable, I found that the value stored in the variable is 1.109999999999.
My question is, why is it showing like that? Isn't there any way to round the variable to two decimal places? Why is it showing a long decimal part even if we assign a number with just two decimal places?
Please suggest a way to store numbers, whether calculated or directly assigned, as they are in a double variable rather than as a number with a long decimal part.
In the set of double values, there is no such thing as the number 1.11 because internally, double uses a binary representation (as opposed to humans who are used to a decimal representation). Most finite decimal numbers (such as 1.11) have an infinite representation in binary, but since memory is limited, you lose some precision because of rounding.
The closest you can get to 1.11 with the double data type is 1.1100000000000000976996261670137755572795867919921875, which is internally represented as 0x3ff1c28f5c28f5c3.
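The bit pattern can be checked directly. The following sketch copies the bytes of a double into a 64-bit integer; it assumes an IEEE 754 double, and the name bits_of is made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a double.
std::uint64_t bits_of(double value) {
    static_assert(sizeof(std::uint64_t) == sizeof(double), "expects a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to reinterpret the bytes
    return bits;
}
```

bits_of(1.11) returns 0x3ff1c28f5c28f5c3, the representation quoted above.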
Your requirement of two decimal places sounds like you are working with money. A simple solution is to store the cents in an integer (as opposed to the dollars in a double):
int cents = 111;
This way, you don't lose any precision. Another solution is to use a dedicated decimal data type.
Floating-point types like float and double are not 100% precise. They may store 14.3 as 14.299999... and there is nothing wrong with that. That is why you should NEVER compare two floats or doubles with the == operator; instead, you should check whether the absolute value of their difference is smaller than a certain epsilon, like 0.000000001.
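Such a comparison might look like the sketch below. The function name nearly_equal and the added relative tolerance (useful when the values are large) are assumptions for the example; only the absolute epsilon of 0.000000001 comes from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: true when a and b differ by at most an absolute
// tolerance, or by at most a tolerance relative to the larger magnitude.
bool nearly_equal(double a, double b, double abs_eps = 1e-9, double rel_eps = 1e-9) {
    const double diff = std::fabs(a - b);
    if (diff <= abs_eps) return true;
    return diff <= rel_eps * std::max(std::fabs(a), std::fabs(b));
}
```

For instance, nearly_equal(0.1 + 0.2, 0.3) is true, while nearly_equal(1.0, 1.1) is false. The right tolerances depend on the application, so the defaults here are only placeholders.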
Now, if you want to output the number in a pleasant way, you can use std::fixed and std::setprecision from <iomanip>.
E.g.
#include <iostream>
#include <iomanip>
int main()
{
    double d = 1.389040598345;
    // std::fixed makes the precision count digits after the decimal point
    std::cout << std::fixed << std::setprecision(2) << d; // outputs 1.39
}
If you want to obtain the value of d rounded to 2 decimal places, you can use this formula (floor comes from <cmath>):
d = floor((d*100)+0.5)/100.0; // d is now the closest double to 1.39
Not every decimal number has an exact, finite, binary floating-point representation. You've already found one example, but another one is 0.1 (decimal) = 0.0001100110011... (binary).
You either need to live with that, or use a decimal floating-point library (which will be less efficient).
My recommendation would be to store numbers to full precision, and only round when you need to display them to humans.