Small numerical error when calculating weighted average - C++

This is part of a physics engine.
The simplified function centerOfMass calculates the 1D center of mass of two rigid bodies (demo):
#include <iostream>
#include <iomanip>
float centerOfMass(float pos1,float m1, float pos2,float m2){
return (pos1*m1+pos2*m2)/(m1+m2);
}
int main(){
float a=5.55709743f;
float b= centerOfMass(a,50,0,0);
std::cout << std::setprecision(9) << a << '\n'; //5.55709743
std::cout << std::setprecision(9) << b << '\n'; //5.55709696
}
I need b to be exactly 5.55709743.
The tiny difference can sometimes (about 5% of the time in my real case) introduce a nasty physics divergence.
There are ways to work around it, e.g. with heavy conditional checking.
However, I find that very error-prone.
Question: How can I eliminate this calculation error while keeping the code clean, fast, and easy to maintain?
By the way, if it can't be done elegantly, I will probably need to make the caller more robust against such numerical error.
Edit
(clarify duplicate question)
Yes, the cause is the precision error from the storage/computing format (mentioned in Is floating point math broken?).
However, this question asks about how to neutralize its symptom in a very specific case.

You are trying to get 9 decimal digits of precision, but the datatype float has a precision of about 7 decimal digits.
Use double instead. (demo)

Use double, not float. IEEE 754 double has about 16 significant decimal digits of precision.
#include <iostream>
#include <iomanip>
double centerOfMass(double pos1, double m1, double pos2, double m2) {
return (pos1*m1 + pos2 * m2) / (m1 + m2);
}
int main() {
double a = 5.55709743;
double b = centerOfMass(a, 50, 0, 0);
std::cout << std::setprecision(16) << a << '\n'; //5.55709743
std::cout << std::setprecision(16) << b << '\n'; //5.55709743
std::cout << std::setprecision(16) << (b - a) << '\n'; // 0
}
For the example given, centerOfMass(a, 50, 0, 0), the following will give exact results for all values of a, because m1/divisor is exactly 1 and m2/divisor is exactly 0 when m2 is 0. Of course, the example does not look realistic.
double centerOfMass(double pos1, double m1, double pos2, double m2) {
double divisor = m1 + m2;
return pos1*(m1/divisor) + pos2*(m2/ divisor);
}

Related

How do you use setprecision() when declaring a double variable in C++?

So I'm trying to learn more about C++ and I'm practicing by making a calculator class for the quadratic equation. This is the code for it down below.
#include "QuadraticEq.h"
string QuadraticEq::CalculateQuadEq(double a, double b, double c)
{
double sqrtVar = sqrt(pow(b, 2) - (4 * a * c));
double eqPlus = (-b + sqrtVar)/(2 * a);
double eqMinus = (-b - sqrtVar) / (2 * a);
return "Your answers are " + to_string(eqPlus) + " and " + to_string(eqMinus);
}
I'm trying to make it so that the double variables eqPlus and eqMinus are shown with only two decimal places. I've seen people say to use setprecision(), but I've only seen it used in cout statements, and there are none in this class because I'm returning a string rather than printing one. So what would I do here? I remember learning about a setiosflags() function a while ago; is there anything I can do with that?
You can use stringstream instead of the usual std::cout with setprecision().
#include <iostream>
#include <string>
#include <sstream>
#include <iomanip>
std::string adjustDP(double value, int decimalPlaces) {
// change the number of decimal places in a number
std::stringstream result;
result << std::setprecision(decimalPlaces) << std::fixed << value;
return result.str();
}
int main() {
std::cout << adjustDP(2.25, 1) << std::endl; //2.2
std::cout << adjustDP(0.75, 1) << std::endl; //0.8
std::cout << adjustDP(2.25213, 2) << std::endl; //2.25
std::cout << adjustDP(2.25, 0) << std::endl; //2
}
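However, as seen from the output, the rounding can be surprising. 2.25 and 0.75 are both exactly representable in binary, so they are exact ties, and the output above shows them rounded to the even digit (2.2 and 0.8) rather than always upward. Values that cannot be represented exactly are rounded according to their stored binary value, which may lie slightly above or below the decimal value you wrote.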

What does double store?

I'm sending the value 4 *cos( fmod( acos(2.0/4.0), 2*3.14159265) ) as double to this function but I get output as
2
1k1
What is wrong here?
void convert_d_to_f(double n)
{
cout<<n<<" ";
double mantissa;
double fractional_part;
fractional_part = modf(n,&mantissa);
double x = fractional_part;
cout<<mantissa<<"k"<<fractional_part<<'\n';
}
The problem is that cout rounds a double to 6 significant digits by default when printing. You can print the desired number of decimal places using the <iomanip> library.
#include <iostream>
#include <cmath>
#include <iomanip>
void convert_d_to_f(double n)
{
    std::cout << std::fixed << std::setprecision(20); // number of decimal places you need to print
    std::cout << n << " ";
    double mantissa; // receives the integral part of n
    double fractional_part;
    fractional_part = std::modf(n, &mantissa);
    std::cout << mantissa << "k" << fractional_part << '\n';
}
int main() {
convert_d_to_f(4 *cos( fmod( acos(2.0/4.0), 2*3.14159265) ));
return 0;
}
For all practical purposes, your number n evaluates to 2. If you want it to display as 1.9999999... then follow Kapil's solution and set the floating-point precision for std::cout to many decimal places. Keep in mind the difference between precision and accuracy if you go that route.
That being said, your convert_d_to_f function uses modf, which splits n into its integral and fractional parts; despite the variable name, that is not a mantissa. If a mantissa and an exponent are what you are after, std::frexp(double arg, int* exp) provides them. Your function also discards its results once they have been printed. If you want to use the exponent and mantissa values after computing them, you can do it like this.
#include <iostream>
#include <cmath>
int main()
{
double n = 4 *cos( fmod( acos(2.0/4.0), 2*3.14159265) );
std::cout << "Given the number " << n << std::endl;
// convert the given floating point value `n` into a
// normalized fraction and an integral power of two
int exp;
double mantissa = std::frexp(n, &exp);
// display results as Mantissa x 2^Exponent
std::cout << "We have " << n << " = "
<< mantissa << " * 2^" << exp << std::endl;
return 0;
}

How to produce formatting similar to .NET's '0.###%' in iostreams?

I would like to output a floating-point number as a percentage, with up to three decimal places.
I know that iostreams have three different ways of presenting floats:
"default", which displays using either the rules of fixed or scientific, depending on the number of significant digits desired as defined by setprecision;
fixed, which displays a fixed number of decimal places defined by setprecision; and
scientific, which displays a fixed number of decimal places but using scientific notation, i.e. mantissa + exponent of the radix.
These three modes can be seen in effect with this code:
#include <iostream>
#include <iomanip>
int main() {
double d = 0.00000095;
double e = 0.95;
std::cout << std::setprecision(3);
std::cout.unsetf(std::ios::floatfield);
std::cout << "d = " << (100. * d) << "%\n";
std::cout << "e = " << (100. * e) << "%\n";
std::cout << std::fixed;
std::cout << "d = " << (100. * d) << "%\n";
std::cout << "e = " << (100. * e) << "%\n";
std::cout << std::scientific;
std::cout << "d = " << (100. * d) << "%\n";
std::cout << "e = " << (100. * e) << "%\n";
}
// output:
// d = 9.5e-05%
// e = 95%
// d = 0.000%
// e = 95.000%
// d = 9.500e-05%
// e = 9.500e+01%
None of these options satisfies me.
I would like to avoid any scientific notation here as it makes the percentages really hard to read. I want to keep at most three decimal places, and it's ok if very small values show up as zero. However, I would also like to avoid trailing zeros in fractional places for cases like 0.95 above: I want that to display as in the second line, as "95%".
In .NET, I can achieve this with a custom format string like "0.###%", which gives me a number formatted as a percentage with at least one digit left of the decimal separator, and up to three digits right of the decimal separator, trailing zeros skipped: http://ideone.com/uV3nDi
Can I achieve this with iostreams, without writing my own formatting logic (e.g. special casing small numbers)?
I'm reasonably certain nothing built into iostreams supports this directly.
I think the cleanest way to handle it is to round the number before passing it to an iostream to be printed out:
#include <iostream>
#include <vector>
#include <cmath>
double rounded(double in, int places) {
double factor = std::pow(10, places);
return std::round(in * factor) / factor;
}
int main() {
std::vector<double> values{ 0.000000095123, 0.0095123, 0.95, 0.95123 };
for (auto i : values)
std::cout << "value = " << 100. * rounded(i, 5) << "%\n";
}
Because of the way it does the rounding, this has a limit on the magnitude of numbers it can work with. If you were working with a number close to the largest that the type can represent (double in this case), the multiplication by pow(10, places) could overflow and produce bad results.
I can't be absolutely certain, but for percentages this seems unlikely to be an issue for the problem you are trying to solve.
This solution is terrible.
I am serious. I don't like it. It's probably slow and the function has a stupid name. Maybe you can use it for test verification, though, because it's so dumb I guess you can easily see it pretty much has to work.
It also assumes the decimal separator is '.', which doesn't have to be the case. The proper separator could be obtained by:
char point = std::use_facet< std::numpunct<char> >(std::cout.getloc()).decimal_point();
But that still doesn't solve the problem, because the characters used for digits could be different too, and in general this isn't something that should be written this way.
Here it is.
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>
template<typename Floating>
std::string formatFloatingUpToN(unsigned n, Floating f) {
    std::stringstream out;
    out << std::setprecision(n) << std::fixed;
    out << f;
    std::string ret = out.str();
    // if this clause holds, it's all zeroes
    if (std::abs(f) < std::pow(0.1, n))
        return ret;
    // with n == 0 there is no fractional part; stripping would turn "10" into "1"
    if (ret.find('.') == std::string::npos)
        return ret;
    // drop trailing zeroes, then a dangling decimal point
    while (ret.back() == '0')
        ret.pop_back();
    if (ret.back() == '.')
        ret.pop_back();
    return ret;
}
And here it is in action.

Same floating point operation, different results

I really can't wrap my head around the fact that this code gives 2 results for the same formula:
#include <iostream>
#include <cmath>
int main() {
// std::cout.setf(std::ios::fixed, std::ios::floatfield);
std::cout.precision(20);
float a = (exp(M_PI) - M_PI);
std::cout << (exp(M_PI) - M_PI) << "\n";
std::cout << a << "\n";
return (0);
}
I don't really think that the IEEE 754 floating point representation is playing a significant role here ...
The first expression (namely (exp(M_PI) - M_PI)) is a double, the second expression (namely a) is a float. Neither has 20 decimal digits of precision, but the float has a lot less precision than the double.
M_PI is of type double, so if you change a to double, you will get the same result:
#include <iostream>
#include <cmath>
int main() {
// std::cout.setf(std::ios::fixed, std::ios::floatfield);
std::cout.precision(20);
double a = (exp(M_PI) - M_PI);
std::cout << (exp(M_PI) - M_PI) << "\n";
std::cout << a << "\n";
return (0);
}

define double constant as hexadecimal?

I would like to have the closest number below 1.0 as a floating point. By reading wikipedia's article on IEEE-754 I have managed to find out that the binary representation for 1.0 is 3FF0000000000000, so the closest double value is actually 0x3FEFFFFFFFFFFFFF.
The only way I know of to initialize a double with this binary data is this:
double a;
*((unsigned*)(&a) + 1) = 0x3FEFFFFF;
*((unsigned*)(&a) + 0) = 0xFFFFFFFF;
Which is rather cumbersome to use.
Is there any better way to define this double number, if possible as a constant?
Hexadecimal float and double literals do exist.
The syntax is 0x1.(mantissa)p(exponent in decimal)
In your case that would be
double x = 0x1.fffffffffffffp-1;
It's not safe, but something like this would work on many systems:
double a;
*(reinterpret_cast<uint64_t *>(&a)) = 0x3FEFFFFFFFFFFFFFL;
However, this relies on a particular endianness of floating-point numbers on your system, and it also violates the strict aliasing rule, so don't do this!
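Instead, just put DBL_EPSILON in <cfloat> (or, as pointed out in another answer, std::numeric_limits<double>::epsilon()) to good use. Epsilon is the gap between 1.0 and the next value above it; the gap just below 1.0 is half as large, so subtract epsilon / 2 to get the closest value below 1.0.
#include <iostream>
#include <iomanip>
#include <limits>
using namespace std;
int main()
{
    // epsilon / 2 is the spacing of doubles just below 1.0
    double const x = 1.0 - numeric_limits< double >::epsilon() / 2;
    cout
        << setprecision( numeric_limits< double >::digits10 + 1 ) << fixed << x
        << endl;
}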
If you write a bit_cast helper and use fixed-width integer types, it can be done more safely:
template <typename R, typename T>
R bit_cast(const T& pValue)
{
// static assert R and T are POD types
// reinterpret_cast is implementation defined,
// but likely does what you expect
return reinterpret_cast<const R&>(pValue);
}
const uint64_t target = 0x3FEFFFFFFFFFFFFFL;
double result = bit_cast<double>(target);
Though you can probably just subtract epsilon from it.
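It's a little archaic (and reading the inactive member of a union is formally undefined behavior in C++, although common compilers support it), but you can use a union.
Assuming a long long and a double are both 8 bytes long on your system:
typedef union { long long a; double b; } my_union;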
int main()
{
my_union c;
c.b = 1.0;
c.a--;
std::cout << "Double value is " << c.b << std::endl;
std::cout << "Long long value is " << c.a << std::endl;
}
Here you don't need to know ahead of time what the bit representation of 1.0 is.
This 0x1.fffffffffffffp-1 syntax is great, but it is only available in C99 and in C++17 or later.
But there is a workaround: no (pointer) casting, no UB/IB, just simple math.
double x = (double)0x1fffffffffffff / (1LL << 53);
If I need a Pi, and Pi(double) is 0x1.921fb54442d18p1 in hex, just write
const double PI = (double)0x1921fb54442d18 / (1LL << 51);
If your constant has a large or small exponent, you could use the function exp2 instead of the shift, but exp2 is C99/C++11 ... pow to the rescue!
Rather than all the bit juggling, the most direct solution is to use nextafter() from math.h. Thus:
#include <math.h>
double a = nextafter(1.0, 0.0);
Read this as: the next floating-point value after 1.0 in the direction of 0.0; an almost direct encoding of "the closest number below 1.0" from the original question.
https://godbolt.org/z/MTY4v4exz