My program for calculating pi using Chudnovsky in C++ precision problem - c++

My code:
#include <iostream>
#include <iomanip>
#include <cmath>
long double fac(long double num) {
long double result = 1.0;
for (long double i=2.0; i<num; i++)
result *= i;
return result;
}
int main() {
using namespace std;
long double pi=0.0;
for (long double k = 0.0; k < 10.0; k++) {
pi += (pow(-1.0,k) * fac(6.0 * k) * (13591409.0 + (545140134.0 * k)))
/ (fac(3.0 * k) * pow(fac(k), 3.0) * pow(640320.0, 3.0 * k + 3.0/2.0));
}
pi *= 12.0;
cout << setprecision(100) << 1.0 / pi << endl;
return 0;
}
My output:
3.1415926535897637228433865175247774459421634674072265625
The problem with this output is that it printed 56 digits instead of 100. How do I fix that?

First of all, your factorial is wrong: the loop should be for (long double i=2.0; i<=num; i++) instead of i<num.
As mentioned in the comments, double can hold only about 16 digits, so 100 digits is not achievable with this method. There are 2 ways to remedy this:
Use a high-precision datatype
There are libraries for this, or you can implement it on your own; you need just a few basic operations. Note that to represent 100 digits you need at least
ceil(100 digits/log10(2)) = 333 bits
of mantissa or fixed-point integer, while double has only 53:
53*log10(2) = 15.954589770191003346328161420398 digits
Use a different method of computing PI
For arbitrary precision I recommend using BBP. However, if you want just 100 digits, you can use a simple Taylor-series-based approach like this one on strings (no need for any high-precision datatype or FPU):
//The following 160 character C program, written by Dik T. Winter at CWI, computes pi to 800 decimal digits.
int a=10000,b=0,c=2800,d=0,e=0,f[2801],g=0;main(){for(;b-c;)f[b++]=a/5;
for(;d=0,g=c*2;c-=14,printf("%.4d",e+d/a),e=d%a)for(b=c;d+=f[b]*a,f[b]=d%--g,d/=g--,--b;d*=b);}
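The bit and digit counts quoted above both follow from log10(2). Here is a minimal sketch to check them; the helper names bits_for_digits and digits_for_bits are made up for this illustration and are not part of any library:

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: mantissa bits needed to hold `digits` decimal digits.
int bits_for_digits(int digits) {
    return static_cast<int>(std::ceil(digits / std::log10(2.0)));
}

// Hypothetical helper: decimal digits carried by a mantissa of `bits` bits.
double digits_for_bits(int bits) {
    return bits * std::log10(2.0);
}

// Mantissa width of double on this platform (53 for IEEE 754 binary64).
constexpr int double_mantissa_bits = std::numeric_limits<double>::digits;
```

bits_for_digits(100) gives 333 and digits_for_bits(53) gives about 15.95, matching the figures above. std::numeric_limits<long double>::digits reports the mantissa width of long double, which depends on the platform.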
Aside from the obvious precision limits, your implementation is poor in both performance and precision, which is why you lose precision sooner than necessary: you hit the double precision limits at very low values of k. You can rewrite the iterations so that the subresults stay as small as possible (in terms of mantissa bits) and avoid unnecessary computation. Here are a few hints:
Why are you computing the same factorials again and again?
You have k! in a loop where k is incrementing, so why not just multiply k into a variable holding the current factorial instead? For example:
//for ( k=0;k<10;k++){ ... fac(k) ... }
for (f=1,k=0;k<10;k++){ if (k) f*=k; ... f ... }
Why are you dividing by factorials again and again?
If you think about it, for a>b you can compute this instead:
a! / b! = (1*2*3*4*...*b*...*a) / (1*2*3*4*...*b)
a! / b! = (b+1)*(b+2)*...*(a)
I would not use pow at all for this
pow is a "very complex" function that causes further precision and performance losses. For example, pow(-1.0,k) can be done like this:
//for ( k=0;k<10;k++){ ... pow(-1.0,k) ... }
for (s=+1,k=0;k<10;k++,s=-s){ ... s ... }
Also, pow(640320.0, 3.0 * k + 3.0/2.0) can be computed incrementally in the same way as the factorial, and for pow(fac(k), 3.0) you can multiply by the variable holding fac(k) three times instead.
The term pow(640320.0, 3.0 * k + 3.0/2.0) outgrows even (6k)!
so you can divide one by the other to keep the subresults smaller.
These few simple tweaks improve the precision a lot: you will overflow the double range much later, because the subresults are much smaller than the naive ones (factorials grow really fast).
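The factorial-ratio hint above can be sketched as a small helper. The name factorial_ratio is hypothetical and the function is only an illustration of the idea:

```cpp
// Hypothetical helper: a!/b! for a >= b, computed as (b+1)*(b+2)*...*a
// without ever forming a! or b! themselves.
double factorial_ratio(int a, int b) {
    double result = 1.0;
    for (int i = b + 1; i <= a; ++i)
        result *= i;
    return result;
}
```

For the Chudnovsky term, factorial_ratio(6 * k, 3 * k) would stand in for fac(6.0 * k) / fac(3.0 * k); for k = 1 it evaluates 4*5*6 = 120 without ever computing 720.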
Putting it all together leads to this:
#include <cmath> // sqrt
double pi_Chudnovsky() // no pow, no fac, smaller subresults
{ // https://en.wikipedia.org/wiki/Chudnovsky_algorithm
double pi,s,f,f3,k,k3,k6,p,dp,q,r;
for (pi=0.0,s=1.0,f=f3=1,k=k3=k6=0.0,p=640320.0,dp=p*p*p,p*=sqrt(p),r=13591409.0;k<27.0;k++,s=-s)
{
if (k) // f=k!, f3=(3k)!, p=pow(640320.0,3k+1.5)*(3k)!/(6k)!, r=13591409.0+(545140134.0*k)
{
p*=dp; r+=545140134.0;
f*=k; k3++; f3*=k3; k6++; p/=k6; p*=k3;
k3++; f3*=k3; k6++; p/=k6; p*=k3;
k3++; f3*=k3; k6++; p/=k6; p*=k3;
k6++; p/=k6;
k6++; p/=k6;
k6++; p/=k6;
}
q=s*r; q/=f; q/=f; q/=f; q/=p; pi+=q;
}
return 1.0/(pi*12.0);
}
As you can see, k goes up to 27, while your naive method can go only up to 18 on 64-bit doubles before overflow. However, the result is the same, as the double mantissa is saturated after 2 iterations.

I am feeling happy due to following code :)
/*
I have compiled using cygwin
change "iostream...using namespace std" OR iostream.h based on your compiler at related OS.
*/
#include <iostream>
#include <iomanip>
#include <cmath>
using namespace std;
long double fac(long double num)
{
long double result = 1.0;
for (long double i=2.0; num > i; ++i)
{
result *= i;
}
return result;
}
int main()
{
long double pi=0.0;
for (long double k = 0.0; 10.0 > k; ++k)
{
pi += (pow(-1.0,k) * fac(6.0 * k) * (13591409.0 + (545140134.0 * k)))
/ (fac(3.0 * k) * pow(fac(k), 3.0) * pow(640320.0, 3.0 * k + 3.0/2.0));
}
pi *= 12.0;
cout << "BEFORE USING setprecision VALUE OF DEFAULT PRECISION " << cout.precision() << "\n";
cout << setprecision(100) << 1.0 / pi << endl;
cout << "AFTER USING setprecision VALUE OF CURRENT PRECISION WITHOUT USING fixed " << cout.precision() << "\n";
cout << fixed;
cout << "AFTER USING setprecision VALUE OF CURRENT PRECISION USING fixed " << cout.precision() << "\n";
cout << "USING fixed PREVENT THE EARTH'S ROUNDING OFF INSIDE OUR UNIVERSE :)\n";
cout << setprecision(100) << 1.0 / pi << endl;
return 0;
}
/*
$ # Sample output:
$ g++ 73256565.cpp -o ./a.out;./a.out
$ ./a.out
BEFORE USING setprecision VALUE OF DEFAULT PRECISION 6
3.14159265358976372457810999350158454035408794879913330078125
AFTER USING setprecision VALUE OF CURRENT PRECISION WITHOUT USING fixed 100
AFTER USING setprecision VALUE OF CURRENT PRECISION USING fixed 100
USING fixed PREVENT THE EARTH'S ROUNDING OFF INSIDE OUR UNIVERSE :)
3.1415926535897637245781099935015845403540879487991333007812500000000000000000000000000000000000000000
*/
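A note on why the first print stops before 100 digits: a long double stores a binary fraction whose exact decimal expansion is finite, so setprecision(100) without fixed runs out of nonzero digits, and fixed only pads with zeros. Neither adds correct digits of pi. Here is a minimal sketch, assuming you only want as many digits as the type can round-trip; the helper name format_round_trip is made up for this example:

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format x with max_digits10 significant digits, the
// count needed so that parsing the text gives back the same long double.
std::string format_round_trip(long double x) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<long double>::max_digits10) << x;
    return out.str();
}
```

Digits printed beyond max_digits10 describe the stored binary value exactly, but they say nothing more about pi.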

Related

How can I get a more accurate result when dividing numbers in C++

I am trying to estimate PI using C++ as a fun math project. I've run into an issue where I can only get it as precise as 6 decimal places.
I have tried using a float instead of a double but found the same result.
My code works by summing all the results of 1/n^2 where n=1 through to a defined limit. It then multiplies this result by 6 and takes the square root.
In mathematical notation: \pi \approx \sqrt{6 \sum_{n=1}^{\mathrm{PREC}} \frac{1}{n^2}}
Here is my main function. PREC is the predefined limit. It will populate the array with the results of these fractions and get the sum. My guess is that the sqrt function is causing the issue where I cannot get more precise than 6 digits.
int main(int argc, char *argv[]) {
nthsums = new float[PREC];
for (int i = 1; i < PREC + 1; i += 1) {
nthsums[i] = nth_fraction(i);
}
float array_sum = sum_array(nthsums);
array_sum *= 6.000000D;
float result = sqrt(array_sum);
std::string resultString = std::to_string(result);
cout << resultString << "\n";
}
Just for the sake of it, I'll also include my sum function as I suspect that there could be something wrong with that, too.
float sum_array(float *array) {
float returnSum = 0;
for (int itter = 0; itter < PREC + 1; itter += 1) {
if (array[itter] >= 0) {
returnSum += array[itter];
}
}
return returnSum;
}
I would like to get at least as precise as 10 digits. Is there any way to do this in C++?
Even with long double as the floating-point type, some subtlety is required, because adding two long doubles of substantially different orders of magnitude can cause precision loss. See here for a discussion in Java, but I believe the behavior is basically the same in C++.
Code I used:
#include <iostream>
#include <cmath>
#include <numbers>
long double pSeriesApprox(unsigned long long t_terms)
{
long double pi_squared = 0.L;
for (unsigned long long i = t_terms; i >= 1; --i)
{
pi_squared += 6.L * (1.L / i) * (1.L / i);
}
return std::sqrtl(pi_squared);
}
int main() {
const long double pi = std::numbers::pi_v<long double>;
const unsigned long long num_terms = 10'000'000'000;
std::cout.precision(30);
std::cout << "Pi == " << pi << "\n\n";
std::cout << "Pi ~= " << pSeriesApprox(num_terms) << " after " << num_terms << " terms\n";
return 0;
}
Output:
Pi == 3.14159265358979311599796346854
Pi ~= 3.14159265349430016911469465413 after 10000000000 terms
9 decimal digits of accuracy, which is about what we'd expect from a series converging at this rate.
But if all I do is reverse the direction of the loop in pSeriesApprox, adding the exact same terms but from largest to smallest instead of smallest to largest:
long double pSeriesApprox(unsigned long long t_terms)
{
long double pi_squared = 0.L;
for (unsigned long long i = 1; i <= t_terms; ++i)
{
pi_squared += 6.L * (1.L / i) * (1.L / i);
}
return std::sqrtl(pi_squared);
}
Output:
Pi == 3.14159265358979311599796346854
Pi ~= 3.14159264365071688729358356795 after 10000000000 terms
Suddenly we're down to 7 digits of accuracy, even though we used 10 billion terms. In fact, after 100 million terms or so, the approximation to pi stabilizes at this specific value. So while using sufficiently large data types to store these computations is important, some additional care is still needed when trying to perform this kind of sum.

Counting iterations of the Leibniz summation for π in C++

My task is to ask the user how many decimal places of accuracy they want, and then iterate the summation until it matches the actual value of pi to that many places. So 2 decimal places would stop when the loop reaches 3.14. I have a complete program, but I am unsure if it actually works as intended. I have checked for 0 and 1 decimal places with a calculator and they seem to work, but I don't want to assume it works for all of them. Also, my code may be a little clumsy since we are still learning the basics. We only just learned loops and nested loops. If there are any obvious mistakes or parts that could be cleaned up, I would appreciate any input.
Edit: I only needed to have this work for up to five decimal places. That is why my value of pi was not precise. Sorry for the misunderstanding.
#include <iostream>
#include <cmath>
using namespace std;
int main() {
const double PI = 3.141592;
int n, sign = 1;
double sum = 0,test,m;
cout << "This program determines how many iterations of the infinite series for\n"
"pi is needed to get with 'n' decimal places of the true value of pi.\n"
"How many decimal places of accuracy should there be?" << endl;
cin >> n;
double p = PI * pow(10.0, n);
p = static_cast<double>(static_cast<int>(p) / pow(10, n));
int counter = 0;
bool stop = false;
for (double i = 1;!stop;i = i+2) {
sum = sum + (1.0/ i) * sign;
sign = -sign;
counter++;
test = (4 * sum) * pow(10.0,n);
test = static_cast<double>(static_cast<int>(test) / pow(10, n));
if (test == p)
stop = true;
}
cout << "The series was iterated " << counter<< " times and reached the value of pi\nwithin "<< n << " decimal places." << endl;
return 0;
}
One of the problems of the Leibniz summation is that it has an extremely low convergence rate, as it exhibits sublinear convergence. In your program you also compare a calculated estimate of π with a given value (a 6-digit approximation), while the point of the summation should be to find the correct digits.
You can slightly modify your code to make it terminate the calculation if the wanted digit doesn't change between iterations (I also added a check on the maximum number of iterations). Remember that you are using doubles, not unlimited-precision numbers, and sooner or later rounding errors will affect the calculation. As a matter of fact, the real limitation of this code is the number of iterations it takes (2,428,700,925 to obtain 3.141592653).
#include <iostream>
#include <cmath>
#include <iomanip>
using std::cout;
// this will take a long long time...
const unsigned long long int MAX_ITER = 100000000000;
int main() {
int n;
cout << "This program determines how many iterations of the infinite series for\n"
"pi is needed to get with 'n' decimal places of the true value of pi.\n"
"How many decimal places of accuracy should there be?\n";
std::cin >> n;
// precalculate some values
double factor = pow(10.0,n);
double inv_factor = 1.0 / factor;
double quad_factor = 4.0 * factor;
long long int test = 0, old_test = 0, sign = 1;
unsigned long long int count = 0;
double sum = 0;
for ( long long int i = 1; count < MAX_ITER; i += 2 ) {
sum += 1.0 / (i * sign);
sign = -sign;
old_test = test;
test = static_cast<long long int>(sum * quad_factor);
++count;
// perform the test on integer values
if ( test == old_test ) {
cout << "Reached the value of Pi within "<< n << " decimal places.\n";
break;
}
}
double pi_leibniz = static_cast<double>(inv_factor * test);
cout << "Pi = " << std::setprecision(n+1) << pi_leibniz << '\n';
cout << "The series was iterated " << count << " times\n";
return 0;
}
I have summarized the results of several runs in this table:
digits   Pi              iterations
---------------------------------------
 0       3                            8
 1       3.1                         26
 2       3.14                       628
 3       3.141                    2,455
 4       3.1415                 136,121
 5       3.14159                376,848
 6       3.141592             2,886,751
 7       3.1415926           21,547,007
 8       3.14159265         278,609,764
 9       3.141592653      2,428,700,925
10       3.1415926535    87,312,058,383
Your program will never terminate, because test==p will never be true. This is a comparison between two double-precision numbers that are calculated differently. Due to round-off errors, they will not be identical, even if you run an infinite number of iterations, and your math is correct (and right now it isn't, because the value of PI in your program is not accurate).
To help you figure out what's going on, print the value of test in each iteration, as well as the distance between test and pi, as follows:
#include <cmath>    // atan, fabs
#include <iostream>
using namespace std;
int main() {
double pi = atan(1.0) * 4; // Make sure you have a precise value of PI
double sign = 1.0, sum = 0.0;
for (int i = 1; i < 1000; i += 2) {
sum = sum + (1.0 / i) * sign;
sign = -sign;
double test = 4 * sum;
cout << test << " " << fabs(test - pi) << "\n";
}
}
After you make sure the program works well, change the stopping condition eventually to be based on the distance between test and pi.
for (int i=1; fabs(test-pi)>epsilon; i+=2)
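The distance-based stopping condition could be fleshed out roughly as follows. This is only a sketch: the function name and the max_terms safety cap are additions for illustration, and the reference value of pi comes from atan as in the snippet above.

```cpp
#include <cmath>

// Counts Leibniz terms until 4*sum is within epsilon of pi.
// max_terms is a safety cap added for this sketch.
unsigned long long leibniz_terms_needed(double epsilon,
                                        unsigned long long max_terms = 100000000ULL) {
    const double pi = std::atan(1.0) * 4.0;
    double sign = 1.0, sum = 0.0;
    unsigned long long count = 0;
    for (unsigned long long i = 1; count < max_terms; i += 2) {
        sum += sign / i;
        sign = -sign;
        ++count;
        if (std::fabs(4.0 * sum - pi) <= epsilon)
            break;
    }
    return count;
}
```

Comparing against a tolerance rather than testing for equality avoids depending on two differently computed doubles being bit-identical.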

Taylor Series Resulting in nan after sin(90) and cos(120)

I'm doing a school project. I do not understand why the sine comes out as -NaN after sin(90) and cos(120).
Can anyone help me understand this?
Also, when I put this in an online C++ editor it works, but when compiled on Linux it does not.
// Nick Garver
// taylorSeries
// taylorSeries.cpp
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;
const double PI = atan(1.0)*4.0;
double angle_in_degrees = 0;
double radians = 0;
double degreesToRadians(double d);
double factorial(double factorial);
double mySine(double x);
double myCosine(double x);
int main()
{
cout << "\033[2J\033[1;1H";
cout.width(4); cout << left << "Deg";
cout.width(9); cout << left << "Radians";
cout.width(11); cout << left << "RealSine";
cout.width(11); cout << left << "MySin";
cout.width(12); cout << left << "RealCos";
cout.width(11); cout << left << "MyCos"<<endl;
while (angle_in_degrees <= 360) //radian equivalent of 45 degrees
{
double sine = sin(degreesToRadians(angle_in_degrees));
double cosine = cos(degreesToRadians(angle_in_degrees));
//output
cout.width(4); cout << left << angle_in_degrees;
cout.width(9); cout << left << degreesToRadians(angle_in_degrees);
cout.width(11); cout << left << sine;
cout.width(11); cout << left << mySine(degreesToRadians(angle_in_degrees));
cout.width(12); cout << left << cosine;
cout.width(11); cout << left << myCosine(degreesToRadians(angle_in_degrees))<<endl;
angle_in_degrees = angle_in_degrees + 15;
}
cout << endl;
return 0;
}
double degreesToRadians(double d)
{
double answer;
answer = (d*PI)/180;
return answer;
}
double mySine(double x)
{
double result = 0;
for(int i = 1; i <= 1000; i++) {
if (i % 2 == 1)
result += pow(x, i * 2 - 1) / factorial(i * 2 - 1);
else
result -= pow(x, i * 2 - 1) / factorial(i * 2 - 1);
}
return result;
}
double myCosine(double x)
{
double positive = 0.0;
double negative= 0.0;
double result=0.0;
for (int i=4; i<=1000; i+=4)
{
positive = positive + (pow(x,i) / factorial (i));
}
for (int i=2; i<=1000; i+=4)
{
negative = negative + (pow(x,i) / factorial (i));
}
result = (1 - (negative) + (positive));
return result;
}
double factorial(double factorial)
{
float x = 1;
for (float counter = 1; counter <= factorial; counter++)
{
x = x * counter;
}
return x;
}
(Marcus has good points; I am going to ramble in other directions...)
Look at the terms in a Taylor series. They become too small to make any difference after fewer than 10 terms. Asking for 1000 is asking for trouble.
Instead of going for 1000, go until the next term does not add anything, something like:
term = pow(x, i * 2 - 1) / factorial(i * 2 - 1);
if (result + term == result) { break; }
result += term;
The series would run much faster if you iteratively calculated the pow and factorial rather than starting over each time. (But, probably speed is not an issue at this point.)
Float has 24 bits of binary precision. Beginning perhaps with 13!, you will get roundoff errors in float. Double, on the other hand, has 53 bits of precision and will last until about 22! without roundoff errors. My point is that you should have done factorial() in double.
Another problem is that the computation of the Taylor series gets somewhat 'unstable' for bigger arguments. Intermediate terms become bigger than the end result, thereby leading to other roundoff errors. To avoid this, a common way to compute sine and cosine is to first fold to between -45 and +45 degrees. No unfolding, except maybe for the sign, is needed later.
As for why you had trouble on one system but not the other -- Different implementations handle NaN differently.
Once you have gotten the NaN out of the way, try computing the series in reverse order. This will lead to a different set of roundoff errors. Will it make your sin() closer to the real sin?
The 'real' sin is probably computed in hardware with 64-bit fixed-point arithmetic, and will be "correctly rounded" to 53 or 24 bits well over 99% of the time. (This, of course, depends on the chip manufacturer, hence my 'hand-waving' statement.)
To judge how 'close' your value is, you need to compute ULPs (units in the last place). This involves looking at the bits in the float/double. (Beyond the scope of this question.)
Sorry about the TMI.
Before I answer this, a few remarks:
It's always helpful for your own debugging to keep your code tidy. Remove unnecessary empty lines, make sure your bracketing style is uniform, and properly indent. I did this for you, but believe me, you'll avoid a lot of bugs if you keep up a consistent style!
You have functions that take double as input and return double, but internally just use float; that should be a red flag!
Your whole degreesToRadians would be easier to read and only one third as long if you just used return (d*PI)/180;
Answers now:
In your factorial function, you calculate a factorial for values up to 1999. Hint: try to figure out the value of 1999! and look up the maximum number that float on your machine can hold. Then look up double's maximum. How many orders of magnitude larger is 1999!?
1999! is ca. 10^5732. That is a large number: its decimal exponent is about 150 times that of the largest value a 32-bit float can hold (about 3.4×10^38), and still about 18 times that of the largest value a 64-bit double can hold (about 1.8×10^308). In other words, 1999! overflows a double by more than 5,400 orders of magnitude. That is far beyond everyday comparisons such as the roughly 18 orders of magnitude between the distance from sun center to earth center and the typical 0.1 µm diameter of bacteria.
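The size of 1999! can be estimated without overflow by working with logarithms. Here is a short sketch using std::lgamma from <cmath>; the helper name log10_factorial is made up for this example:

```cpp
#include <cmath>
#include <limits>

// log10(n!) via the log-gamma function: lgamma(n + 1) == ln(n!).
double log10_factorial(unsigned n) {
    return std::lgamma(n + 1.0) / std::log(10.0);
}

// Largest power of ten representable in each type (38 and 308 for IEEE 754).
constexpr int float_max_exp10  = std::numeric_limits<float>::max_exponent10;
constexpr int double_max_exp10 = std::numeric_limits<double>::max_exponent10;
```

log10_factorial(1999) comes out a little above 5732, which can be compared directly with the two max_exponent10 values.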

How do I end this while loop with a precision of 0.00001 ([C++],[Taylor Series])?

I'm working on a program that approximates the sine function with a Taylor series. The series has to stop once it approximates sin with a precision of 0.00001. In other words, the absolute value of the last approximation minus the current approximation must be less than or equal to 0.00001. The program also approximates each angle from 0 to 360 degrees in 15-degree increments. My logic seems to be correct, but I cannot figure out why I am getting garbage values. Any help is appreciated!
#include <math.h>
#include <iomanip>
#include <iostream>
#include <string>
#include <stdlib.h>
#include <cmath>
double fact(int x){
int F = 1;
for(int i = 1; i <= x; i++){
F*=i;
}
return F;
}
double degreesToRadians(double angle_in_degrees){
double rad = (angle_in_degrees*M_PI)/180;
return rad;
}
using namespace std;
double mySine(double x){
int current =99999;
double comSin=x;
double prev=0;
int counter1 = 3;
int counter2 = 1;
while(current>0.00001){
prev = comSin;
if((counter2 % 2) == 0){
comSin += (pow(x,(counter1))/(fact(counter1)));
}else{
comSin -= (pow(x,(counter1))/(fact(counter1)));
}
current=abs(prev-comSin);
cout<<current<<endl;
counter1+=2;
counter2+=1;
}
return comSin;
}
using namespace std;
int main(){
cout<<"Angle\tSine"<<endl;
for (int i = 0; i<=360; i+=15){
cout<<i<<"\t"<<mySine(degreesToRadians(i));
}
}
Here is an example which illustrates how to go about doing this.
Using the pow function and calculating the factorial at each iteration is very inefficient -- these can often be maintained as running values which are updated alongside the sum during each iteration.
In this case, each iteration's addend is the product of two factors: a power of x and a (reciprocal) factorial. To get from one iteration's power factor to the next iteration's, just multiply by x*x. To get from one iteration's factorial factor to the next iteration's, just multiply by ((2*n+1) + 1) * ((2*n+1) + 2), before incrementing n (the iteration number).
And because these two factors are updated multiplicatively, they do not need to exist as separate running values; they can exist as a single running product. This also helps avoid precision problems: both the power factor and the factorial can become large very quickly, but the ratio of their values goes to zero relatively gradually and is well-behaved as a running value.
So this example maintains these running values, updated at each iteration:
"sum" (of course)
"prod", the ratio: pow(x, 2n+1) / factorial(2n+1)
"tnp1", the value of 2*n+1 (used in the factorial update)
The running update value, "prod", is negated every iteration in order to factor in the (-1)^n.
I also included the function "XlatedSine". When x is too far away from zero, the sum requires more iterations for an accurate result, which takes longer to run and also can require more precision than our floating-point values can provide. When the magnitude of x goes beyond PI, "XlatedSine" finds another x, close to zero, with an equivalent value for sin(x), then uses this shifted x in a call to MaclaurinSine.
#include <iostream>
#include <iomanip>
// Importing cmath seemed wrong LOL, so define Abs and PI
static double Abs(double x) { return x < 0 ? -x : x; }
const double PI = 3.14159265358979323846;
// Taylor series about x==0 for sin(x):
//
// Sum(n=[0...oo]) { ((-1)^n) * (x^(2*n+1)) / (2*n + 1)! }
//
double MaclaurinSine(double x) {
const double xsq = x*x; // cached constant x squared
int tnp1 = 3; // 2*n+1 | n==1
double prod = xsq*x / 6; // pow(x, 2*n+1) / (2*n+1)! | n==1
double sum = x; // sum after n==0
for(;;) {
prod = -prod;
sum += prod;
static const double MinUpdate = 0.00001; // try zero -- the factorial will always dominate the power of x, eventually
if(Abs(prod) <= MinUpdate) {
return sum;
}
// Update the two factors in prod
prod *= xsq; // add 2 to the power factor's exponent
prod /= (tnp1 + 1) * (tnp1 + 2); // update the factorial factor by two iterations
tnp1 += 2;
}
}
// XlatedSine translates x to an angle close to zero which will produce the equivalent result.
double XlatedSine(double x) {
if(Abs(x) >= PI) {
// Use int casting to do an fmod PI (but symmetric about zero).
// Keep in mind that a really big x could overflow the int,
// however such a large double value will have lost so much precision
// at a sub-PI-sized scale that doing this in a legit fashion
// would also disappoint.
const int p = static_cast<int>(x / PI);
x -= PI * p;
if(p % 2) {
x = -x;
}
}
return MaclaurinSine(x);
}
double DegreesToRadians(double angle_deg) {
return PI / 180 * angle_deg;
}
int main() {
std::cout<<"Angle\tSine\n" << std::setprecision(12);
for(int i = 0; i<=360; i+=15) {
std::cout << i << "\t" << MaclaurinSine(DegreesToRadians(i)) << "\n";
//std::cout << i << "\t" << XlatedSine(DegreesToRadians(i)) << "\n";
}
}

modf returns 1 as the fractional:

I have this static method. It receives a double and "cuts" its fractional tail, leaving two digits after the dot. It works almost all the time, but I have noticed that when
it receives 2.3 it turns it into 2.29. This does not happen for 0.3, 1.3, 3.3, 4.3 and 102.3.
The code basically multiplies the number by 100, uses modf, divides the integer value by 100, and returns it.
Here the code catches this one specific number and prints out:
static double dRound(double number) {
bool c = false;
if (number == 2.3)
c = true;
int factor = pow(10, 2);
number *= factor;
if (c) {
cout << " number *= factor : " << number << endl;
//number = 230;// When this is not marked as comment the code works well.
}
double returnVal;
if (c){
cout << " fractional : " << modf(number, &returnVal) << endl;
cout << " integer : " <<returnVal << endl;
}
modf(number, &returnVal);
return returnVal / factor;
}
it prints out:
number *= factor : 230
fractional : 1
integer : 229
Does anybody know why this is happening and how I can fix it?
Thank you, and have a great weekend.
Remember that floating-point numbers cannot represent most decimal fractions exactly. 2.3 * 100 actually gives 229.99999999999997. Thus modf returns 229 and 0.9999999999999716.
However, cout's default format displays floating-point numbers with only 6 significant digits. So the 0.9999999999999716 is shown as 1.
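The effect is easy to see by printing the same product with more digits. Here is a small sketch; the helper names show and integer_part_times_100 are hypothetical:

```cpp
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>

// Formats v with the given number of significant digits (cout's default is 6).
std::string show(double v, int significant_digits) {
    std::ostringstream out;
    out << std::setprecision(significant_digits) << v;
    return out.str();
}

// Integer part of v * 100 as modf sees it.
double integer_part_times_100(double v) {
    double whole = 0.0;
    std::modf(v * 100, &whole);
    return whole;
}
```

show(2.3 * 100, 6) yields "230", while asking for 17 significant digits exposes the value just below 230 that modf actually splits.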
You could use (roughly) the upper error limit that a value represents in floating point to avoid the 2.3 error:
#include <cmath>
#include <limits>
static double dRound(double d) {
double inf = copysign(std::numeric_limits<double>::infinity(), d);
double theNumberAfter = nextafter(d, inf);
double epsilon = theNumberAfter - d;
int factor = 100;
d *= factor;
epsilon *= factor/2;
d += epsilon;
double returnVal;
modf(d, &returnVal); // split the adjusted value d (the parameter is named d, not number)
return returnVal / factor;
}
Result: http://www.ideone.com/ywmua
Here is a way without rounding:
double double_cut(double d)
{
long long x = d * 100;
return x/100.0;
}
If you want rounding based on the 3rd digit after the decimal point, here is a solution:
double double_cut_round(double d)
{
long long x = d * 1000;
if (x > 0)
x += 5;
else
x -= 5;
return (x / 10) / 100.0; // drop the 3rd digit after applying the half-step, leaving two decimals
}
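If rounding to nearest is acceptable rather than cutting, the standard library's std::round (C++11, in <cmath>) gives a shorter alternative. This is a sketch under that assumption, and the function name is made up:

```cpp
#include <cmath>

// Rounds d to the nearest multiple of 0.01 using std::round.
double round_to_2_places(double d) {
    return std::round(d * 100.0) / 100.0;
}
```

For 2.3 this returns 2.3, because the scaled value just below 230 is rounded up to 230 before the division. Inputs that sit on a decimal tie, such as 1.005, may still round either way because the stored double is not exactly the decimal value.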