sum of double numbers in c++ [duplicate] - c++

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 6 years ago.
I want to calculate the sum of three double numbers and I expect to get 1.
double a=0.0132;
double b=0.9581;
double c=0.0287;
cout << "sum= "<< a+b+c <<endl;
if (a+b+c != 1)
cout << "error" << endl;
The sum is equal to 1 but I still get the error! I also tried:
cout<< a+b+c-1
and it gives me -1.11022e-16
I could fix the problem by changing the code to
if (a+b+c-1 > 0.00001)
cout << "error" << endl;
and it works (no error). How can a negative number be greater than a positive number, and why don't the numbers add up to 1?
Maybe it is something basic with summation and under/overflow, but I would really appreciate your help.
Thanks

Rational numbers are infinitely precise. Computers are finite.
Precision loss is a well known problem in computer programming.
The real question is, how can you remedy it?
Consider using an approximation function when comparing floats for equality.
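#include <algorithm> // std::max
#include <cmath>     // std::abs
#include <iostream>
#include <limits>    // std::numeric_limits
using namespace std;
// Returns true when dX and dY differ by at most one machine epsilon,
// scaled by the larger of the two magnitudes (a relative tolerance).
template <typename T>
bool ApproximatelyEqual(const T dX, const T dY)
{
    return std::abs(dX - dY) <= std::max(std::abs(dX), std::abs(dY))
                                * std::numeric_limits<T>::epsilon();
}
int main() {
    double a = 0.0132;
    double b = 0.9581;
    double c = 0.0287;
    // Evaluates to true, so "error" is not printed.
    if (!ApproximatelyEqual(a + b + c, 1.0)) cout << "error" << endl;
}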
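As for the check in the question, a+b+c-1 > 0.00001 is one-sided: a negative difference such as -1.11022e-16 is simply not greater than the tolerance, so nothing is printed, but a sum that is far too small would pass as well. A minimal sketch of the difference, assuming two hypothetical helper names (oneSidedError and twoSidedError) that are not part of the original code:

```cpp
#include <cmath>

// Hypothetical helper mirroring the check in the question:
// it only flags sums that are too large.
bool oneSidedError(double sum, double tolerance) {
    return sum - 1.0 > tolerance;
}

// Hypothetical helper using the absolute difference:
// it flags sums that are too large or too small.
bool twoSidedError(double sum, double tolerance) {
    return std::fabs(sum - 1.0) > tolerance;
}
```

With a tolerance of 0.00001, both helpers accept the sum from the question, but only twoSidedError rejects a sum such as 0.5.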

Floating point numbers in C++ have a binary representation. This means that most numbers that can be exactly represented by a decimal fraction with only a few digits cannot be exactly represented by floating point numbers. That's where your error comes from.
One example: 0.1 (decimal) is a periodic fraction in binary:
0.000110011001100110011001100...
Therefore, it cannot be represented exactly with any finite number of bits in binary encoding.
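The repeating pattern can be reproduced with exact integer arithmetic. The following is a minimal sketch, assuming a hypothetical helper binaryFractionDigits that is not part of any library; it doubles the numerator at each step and emits a 1 whenever the result reaches the denominator:

```cpp
#include <string>

// Hypothetical helper: returns the first `count` binary digits after the
// point of numerator/denominator, assuming 0 <= numerator < denominator.
std::string binaryFractionDigits(unsigned long long numerator,
                                 unsigned long long denominator,
                                 int count) {
    std::string digits;
    for (int i = 0; i < count; ++i) {
        numerator *= 2;                  // shift one binary place to the left
        if (numerator >= denominator) {  // the next binary digit is 1
            digits += '1';
            numerator -= denominator;
        } else {                         // the next binary digit is 0
            digits += '0';
        }
    }
    return digits;
}
```

For 1/10 the remainder cycles through the same values forever, which is why the digits never terminate.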
In order to avoid this type of error, you can use BCD (binary coded decimal) numbers which are supported by some special libraries. The drawbacks are slower calculation speed (not directly supported by the CPU) and slightly higher memory usage.
Another option is to represent the number as a general fraction and store the numerator and denominator as separate integers.
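A simple special case of the fraction idea is to fix the denominator and store only integer numerators. The sketch below assumes a denominator of 10000 (enough for the four decimal places used in the question); the names kScale and sumsToExactlyOne are hypothetical:

```cpp
#include <cstdint>

// Assumption: every value is stored as an integer count of 1/10000,
// so 0.0132 becomes 132, 0.9581 becomes 9581 and 0.0287 becomes 287.
constexpr std::int64_t kScale = 10000;

// Hypothetical helper: integer addition is exact, so the comparison is too.
bool sumsToExactlyOne(std::int64_t a, std::int64_t b, std::int64_t c) {
    return a + b + c == kScale;
}
```

Because no binary fractions are involved, 132 + 9581 + 287 compares equal to 10000 without any tolerance.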

Related

Why is the difference between 2 double values wrongly calculated? [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 2 years ago.
I need to calculate the difference between two numbers given as strings, keeping only the first decimal place. I have to convert them to double first and then calculate the difference, as below:
#include <cstdlib>  // atof
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string v1 = "1568678435.244555";
    string v2 = "1568678435.300111";
    double s1 = atof(v1.substr(0,12).c_str()); // keep digits up to the first decimal place and convert to double
    double s2 = atof(v2.substr(0,12).c_str()); // keep digits up to the first decimal place and convert to double
    std::cout<<s1<<" "<<s2<<" "<<s2-s1<<endl;
    if (s2-s1 >= 0.1)
        cout<<"bigger";
    else
        cout<<"smaller";
    return 0;
}
I expected the calculation to be 1568678435.3 - 1568678435.2 = 0.1, but the program prints:
1.56868e+09 1.56868e+09 0.0999999
smaller
Why is that, and how do I get the value that I want?
Floating point format has limited precision. Not all values are representable. For example, the number 1568678435.2 is not representable (in IEEE-754 binary64 format). The closest representable value is:
1568678435.2000000476837158203125
1568678435.3 is also not a representable value. The closest representable value is:
1568678435.2999999523162841796875
Given that the floating point values that you start with are not precise, it should be hardly surprising that the result of the calculation is also not precise. The floating point result of subtracting these numbers is:
0.099999904632568359375
This is very close to 0.1, but not quite. The error of the calculation was:
0.000000095367431640625
Also note that 0.1 is itself not a representable number, so there is no way to get that as the result of a floating point operation no matter what your inputs are.
how to get the value that I want properly?
To print the value 0.1, simply round the output to a sufficiently coarse precision (std::setprecision requires <iomanip>):
std::cout << std::fixed << std::setprecision(1) << s2-s1;
This works as long as the error of the calculation doesn't exceed half of the desired precision.
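The same rounding can be applied when building a string instead of printing directly. This is a minimal sketch, assuming a hypothetical helper named formatFixed that wraps the standard std::fixed and std::setprecision manipulators:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats `value` with exactly `decimals` digits
// after the decimal point, rounding the remaining digits away.
std::string formatFixed(double value, int decimals) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}
```

Called with the computed difference 0.099999904632568359375 and one decimal, it yields the text 0.1, because the error is far below half of the last printed digit.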
If you don't want to deal with any accuracy error in your calculation, then you mustn't use floating point numbers.
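One way to avoid floating point here is to work in integer tenths taken directly from the strings. The sketch below is only an illustration under stated assumptions: the input is a non-negative decimal string, the character after the point (if any) is a digit, and toTenths is a hypothetical helper name:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts a non-negative decimal string such as
// "1568678435.244555" to an integer number of tenths. Digits after the
// first decimal place are dropped (truncated), not rounded.
long long toTenths(const std::string& text) {
    const std::size_t dot = text.find('.');
    // substr(0, npos) returns the whole string when there is no point.
    long long tenths = std::stoll(text.substr(0, dot)) * 10;
    if (dot != std::string::npos && dot + 1 < text.size()) {
        tenths += text[dot + 1] - '0';  // assumes this character is a digit
    }
    return tenths;
}
```

With the two strings from the question, the difference of the results is exactly 1 tenth, so a comparison such as toTenths(v2) - toTenths(v1) >= 1 behaves as expected.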
You should round the difference between the values.
if (round((s2-s1) * 10) >= 1)
    cout<<"bigger";
else
    cout<<"smaller";

I don't understand why. problems with double [duplicate]

This question already has answers here:
How do I print a double value with full precision using cout?
(17 answers)
Closed 3 years ago.
Why do doubles round themselves? How can I prevent it?
If I enter 45000.98, I expect 45000.98, but the number is rounded.
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
    double a;
    cin >> a; // if I enter 45000.98
    cout << a; // output is 45001
    cout << endl << setprecision(2) << a; // output is 4.5e+04
}
The double type has 11 bits for the exponent and 52 bits for the fractional part, more than enough precision to represent 45000.98. However, in the default output format the setprecision argument sets the maximum number of significant digits, not the number of digits after the decimal point. Use setprecision(8) and you should see 45000.98, as you probably expect.
The double did not round itself; the streaming operation rounded the value — first by default, and later according to your instructions. You requested that your value be rounded to 2 digits of precision, and that's what you got (just the first two digits: 4.5e+04). You are getting scientific notation because you have not requested enough digits to reach the decimal point.
If you want to see all 7 digits of 45000.98 then request at least 7 digits of precision. (You may want to stay under 17 digits though, since that's where you start seeing the artifacts of the floating point representation.)
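The effect of the requested precision can be checked without touching cout. This is a minimal sketch, assuming a hypothetical helper formatSignificant that streams the value in the default floating-point format:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats `value` in the default (non-fixed) format,
// where the precision is the maximum number of significant digits.
std::string formatSignificant(double value, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}
```

For 45000.98, a precision of 6 (the stream default) gives 45001, a precision of 2 gives 4.5e+04, and a precision of 7 or 8 gives 45000.98.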

How to calculate number of digits before and after decimal point? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
Today a question was asked in my c++ class test. "Write a program that inputs a floating point number and calculates the number of digits before and after decimal point."
I calculated numbers before decimal points with this code:
float n;
cin >> n;
float temp = n;
int count = 0;
while (temp >= 1) { // one iteration per digit before the decimal point
    count++;
    temp = temp / 10;
}
cout << count;
but I am stuck on the part after the decimal point. Can anyone tell me how to do this, or provide the whole program?
Thanks in advance,
Write a program that inputs a floating point number and calculates the number of digits before and after decimal point.
Well, as stated, that task is asking for something that is not really solvable using a float and standard C++, because the binary representation of a float value's exponent and mantissa isn't defined in the C++ standard.
Hence you can't know how many digits will be used to represent the fraction part of the number, unless you know exactly how the C++ compiler implemented the float (or double) binary representation.
Most probably the implementation is optimized for the target CPU and for how it handles floating point values.
So the only chance you have is to read the number as a std::string in the first place, count the digits that appear before and after the '.' character, and finally convert the std::string to a float value.
Here's a simple illustration of what I meant in the first part of my answer:
#include <cmath>
#include <iomanip>
#include <iostream>
#include <limits>
#include <sstream>
int main() {
    std::istringstream iss("3.1415"); // same as reading from cin
    std::cout << "Input: " << iss.str() << std::endl;
    float temp;
    iss >> temp;
    std::cout << "Internal representation: "
              << std::fixed << std::setprecision(22) << temp << std::endl;
    // Keep only the fractional part; std::floor drops the integer part.
    float fraction = temp - std::floor(temp);
    int fractiondigits = 0;
    // epsilon is the gap between 1.0f and the next representable float
    while (fraction > std::numeric_limits<float>::epsilon()) {
        fraction *= 10.0f;
        fraction -= std::floor(fraction);
        ++fractiondigits;
    }
    std::cout << "Number of fraction digits used in the representation: "
              << fractiondigits << std::endl;
}
The output is
Input: 3.1415
Internal representation: 3.1414999961853027343750
Number of fraction digits used in the representation: 21
So you see that's not congruent with the user's input.
I don't know if your professor's intent was to make you aware of this incongruence between the user input and the internal representation of a float.
But as mentioned the actual count of digits is compiler implementation and platform dependent, so there's no definite answer for the number of fraction digits.
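The string-based approach suggested earlier in this answer can be sketched as follows. It is only an illustration: countDigits is a hypothetical helper, and it simply counts digit characters on each side of the first '.' without validating the input:

```cpp
#include <cctype>
#include <string>
#include <utility>

// Hypothetical helper: returns {digits before the point, digits after it}
// for text such as "3.1415". Non-digit characters (e.g. a sign) are skipped.
std::pair<int, int> countDigits(const std::string& text) {
    int before = 0;
    int after = 0;
    bool seenPoint = false;
    for (char ch : text) {
        if (ch == '.') {
            seenPoint = true;
        } else if (std::isdigit(static_cast<unsigned char>(ch))) {
            if (seenPoint) {
                ++after;
            } else {
                ++before;
            }
        }
    }
    return {before, after};
}
```

Because the count is taken from the text the user typed, "3.1415" gives 1 and 4 regardless of how the float stores the value.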
The question is fundamentally irrelevant. Most real numbers have infinitely many digits, but computer represented numbers must have a finite representation. For the common case of a binary representation, the represented number also has a finite decimal representation. However, rounding this decimal representation to fewer digits (as few as std::numeric_limits<float>::max_digits10, to be precise) still identifies the same representable number. Thus, the relevant number of digits for computer floating-point numbers best refers to their binary rather than their decimal representation. This is given by std::numeric_limits<float>::digits (total: in front of and after the point).
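The max_digits10 property can be checked with a round trip through text. This is a minimal sketch, assuming a finite, normal input value and a standard library with correctly rounded conversions; roundTripsThroughText is a hypothetical helper name:

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: prints `value` with max_digits10 significant digits,
// parses the text back, and reports whether the same float is recovered.
bool roundTripsThroughText(float value) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<float>::max_digits10) << value;
    const float parsed = std::stof(out.str());
    return parsed == value;
}
```

The longer exact decimal expansion of the stored value is therefore not needed to identify it.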

comparison function in g++

What is wrong with my code? It converts inches and feet to meters and compares the results. If I enter 12 for inches and 1 for feet, it says that the numbers are not equal. Is this a known issue with g++? Can somebody explain this to me?
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
    double in, ft, m1, m2;
    cin >> in >> ft;
    m1 = in * 0.0254;
    m2 = ft * 0.3048;
    cout << m1 << '\t' << m2 << '\n' << endl;
    // to show that both numbers are equal
    if (m1 == m2) cout << "yay";
    else cout << "boo";
}
Does anybody else have this issue?
@Josh, add this to your code and run it:
cout << m2-m1;
You will be surprised: the answer is not zero.
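For the problem in this code, changing the data type from double to float makes the comparison succeed for these inputs, because both products round to the same float; it does not remove the underlying rounding error.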
float in, ft, m1, m2;
The reason that the numbers don't match is that computers use a binary representation of numbers which leads to inaccuracies when trying to represent decimal numbers.
You think the number is 0.3048 (because that's what you coded) - but when compiled, the computer can only represent this as the nearest equivalent in binary format (see IEEE floating point for more info). So the number might be something extremely close to 0.3048, but not precisely that.
After you've done your calculations, you compare the numbers - but if the two are not absolutely identical in their binary representations, they won't match.
One simple way to solve it (but by no means the only solution) is to subtract the two operands and check how close to zero the result is. If:
fabs(a - b) < 0.00001
(an arbitrary amount), then you can presume the values are the same.
What you're seeing is a result of inexact floating point representation. Base 2^n floating point numbers cannot represent all base 10 decimal values exactly. Thus, when you do something simple like multiplying 12*0.0254 you get the very odd result of 0.3047999.......6, whereas if you compute 1*0.3048 you get the expected result of 0.3048. The problem is that 0.0254 isn't being stored exactly; instead, the closest approximate value (something like 0.0253999999....98) is used. The difference is small but can become noticeable when you use the inexact value in a calculation and then compare the result with another value, such as 0.3048, that has not gone through the same rounding. A basic rule to keep in mind is that you should never compare floating point values for equality; instead, compare them in a manner that allows for an acceptable error, e.g. instead of comparing values in the following manner:
if(val1 == val2)...
use something like
if(std::fabs(val1 - val2) < 0.0000001)...
so that the two variables will be considered equal if their values differ by less than 1/10,000,000 (which is pretty close :-).
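Applied to the inches and feet program, the tolerance check could look like the sketch below. The helper names inchesToMeters, feetToMeters and sameLength are hypothetical, and the default tolerance of 0.0000001 is an arbitrary choice taken from the answer above:

```cpp
#include <cmath>

// Hypothetical helpers using the conversion factors from the question.
double inchesToMeters(double inches) { return inches * 0.0254; }
double feetToMeters(double feet) { return feet * 0.3048; }

// Hypothetical helper: treats the two lengths as equal when they differ by
// less than `tolerance` meters (an absolute tolerance, assumed to be
// appropriate for lengths of roughly this magnitude).
bool sameLength(double inches, double feet, double tolerance = 0.0000001) {
    return std::fabs(inchesToMeters(inches) - feetToMeters(feet)) < tolerance;
}
```

An absolute tolerance like this is easy to read, but it has to be chosen with the expected magnitude of the values in mind.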

Why comparing double and float leads to unexpected result? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
strange output in comparision of float with float literal
float f = 1.1;
double d = 1.1;
if(f == d) // returns false!
Why is it so?
The important factors under consideration with float or double numbers are:
Precision & Rounding
Precision:
The precision of a floating point number is how many digits it can represent without losing any information it contains.
Consider the fraction 1/3. The decimal representation of this number is 0.33333333333333… with 3s going out to infinity. A number of infinite length would require infinite memory to be stored with exact precision, but float and double data types typically have only 4 or 8 bytes. Thus float and double numbers can only store a certain number of digits, and the rest are bound to get lost. There is no exact way to store a number that requires more precision than the variable can hold.
Rounding:
There are non-obvious differences between binary and decimal (base 10) numbers.
Consider the fraction 1/10. In decimal, this can be easily represented as 0.1, and 0.1 can be thought of as an easily representable number. However, in binary, 0.1 is represented by the infinite sequence: 0.00011001100110011…
An example:
#include <iomanip>
#include <iostream>
int main()
{
    using namespace std;
    cout << setprecision(17);
    double dValue = 0.1;
    cout << dValue << endl;
}
This output is:
0.10000000000000001
And not
0.1.
This is because the double had to round the value due to its limited memory, which results in a number that is not exactly 0.1. Such a scenario is called a rounding error.
Whenever you compare two close float and double numbers, such rounding errors kick in and the comparison can yield unexpected results. This is the reason you should never compare float or double values using ==.
The best you can do is to take their difference and check if its absolute value is less than an epsilon.
std::fabs(x - y) < epsilon
Try running this code, the results will make the reason obvious.
#include <iomanip>
#include <iostream>
int main()
{
    std::cout << std::setprecision(100) << (double)1.1 << std::endl;
    std::cout << std::setprecision(100) << (float)1.1 << std::endl;
    std::cout << std::setprecision(100) << (double)((float)1.1) << std::endl;
}
The output:
1.100000000000000088817841970012523233890533447265625
1.10000002384185791015625
1.10000002384185791015625
Neither float nor double can represent 1.1 exactly. When you do the comparison, the float is implicitly converted to a double. That conversion is exact, so the double holds the float's rounded value 1.10000002384185791015625, which differs from the double nearest to 1.1, and the comparison yields false.
Generally you shouldn't compare floats to floats, doubles to doubles, or floats to doubles using ==.
The best practice is to subtract them, and check if the absolute value of the difference is less than a small epsilon.
if(std::fabs(f - d) < std::numeric_limits<float>::epsilon())
{
// ...
}
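The effect of the implicit promotion can be made explicit in code. The sketch below uses two hypothetical helpers, equalAsDouble and equalAsFloat; the second narrows the double to float before comparing, which is shown only to illustrate where the mismatch comes from, since it discards the extra precision of the double:

```cpp
// Hypothetical helper: what f == d does implicitly. The float is promoted
// to double exactly, so its float rounding error is preserved.
bool equalAsDouble(float f, double d) {
    return static_cast<double>(f) == d;
}

// Hypothetical helper: narrows the double to float first, so both sides
// have gone through the same float rounding.
bool equalAsFloat(float f, double d) {
    return f == static_cast<float>(d);
}
```

For f and d both initialized from 1.1, equalAsDouble is false and equalAsFloat is true; for a value such as 0.5, which both types store exactly, both are true.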
One reason is that floating point numbers are (more or less) binary fractions, and can only approximate many decimal numbers. Many decimal numbers turn into repeating binary fractions, which must be cut off at a finite number of bits. This introduces a rounding error.
From wikipedia:
For instance, 1/5 cannot be represented exactly as a floating point number using a binary base but can be represented exactly using a decimal base.
In your particular case, a float and a double will round the repeating fraction that represents 1.1 in binary differently. You will be hard pressed to get them to be "equal" after their conversions have introduced different amounts of rounding error.
The code I gave above solves this by simply checking if the values are within a very short delta. Your comparison changes from "are these values equal?" to "are these values within a small margin of error from each other?"
Also, see this question: What is the most effective way for float and double comparison?
There are also a lot of other oddities about floating point numbers that break a simple equality comparison. Check this article for a description of some of them:
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
The IEEE 754 32-bit float can store: 1.1000000238...
The IEEE 754 64-bit double can store: 1.1000000000000000888...
See why they're not "equal"?
In IEEE 754, fractions are stored in powers of 2:
2^(-1), 2^(-2), 2^(-3), ...
1/2, 1/4, 1/8, ...
Now we need a way to represent the fractional part of 1.1. This is (a simplified version of) the 32-bit IEEE 754 representation (float):
2^(-4) + 2^(-5) + 2^(-8) + 2^(-9) + 2^(-12) + 2^(-13) + ... + 2^(-20) + 2^(-21) + 2^(-23)
00011001100110011001101
1.10000002384185791015625
With 64-bit double, it's even more accurate. It doesn't stop at 2^(-23); it keeps going for about twice as long, ending with 2^(-48) + 2^(-49) + 2^(-51).
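The stored fraction bits can also be read out directly. This is a minimal sketch, assuming that float is an IEEE 754 binary32 value (1 sign bit, 8 exponent bits, 23 fraction bits); fractionBits is a hypothetical helper name:

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: returns the 23 fraction bits of `value` as text,
// most significant bit first. Assumes float is IEEE 754 binary32.
std::string fractionBits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);  // copy the bit pattern safely
    std::string bits;
    for (int i = 22; i >= 0; --i) {         // bits 22..0 hold the fraction
        bits += ((raw >> i) & 1u) ? '1' : '0';
    }
    return bits;
}
```

For 1.1f this returns the 23-bit pattern shown above, where the bit at position k (counted from the left, starting at 1) contributes 2^(-k).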
Resources
IEEE 754 Converter (32-bit)
Floats and doubles are stored in a binary format that can not represent every number exactly (it's impossible to represent the infinitely many possible different numbers in a finite space).
As a result they do rounding. Float has to round more than double, because it is smaller, so 1.1 rounded to the nearest valid float is different from 1.1 rounded to the nearest valid double.
To see what numbers are valid floats and doubles see Floating Point