nan output due to maclaurin series expansion of sine, console crashes - c++

Here is my code:
#include <iostream>
#include <cmath>
using namespace std;
int factorial(int);
int main()
{
for(int k = 0; k < 100000; k++)
{
static double sum = 0.0;
double term;
term = (double)pow(-1.0, k) * (double)pow(4.0, 2*k+1) / factorial(2*k+1);
sum = sum + term;
cout << sum << '\n';
}
}
int factorial(int n)
{
if(n == 0)
{
return 1;
}
return n*factorial(n-1);
}
I'm just trying to calculate the value of sine(4) using the maclaurin expansion form of sine. For each console output, the value reads 'nan'. The console gives an error and shuts down after like 10 second. I don't get any errors in the IDE.

There're multiple problems with your approach.
Your factorial function can't return an int. The return value will be way too big, very quickly.
Using pow(-1, value) to get a alternating positive/negative one is very inefficient and will yield incorrect value pretty quick. You should pick 1.0 or -1.0 depending on k's parity.
When you sum a long series of terms, you want to sum the terms with the least magnitude first. Otherwise, you lose precision due to existing bit limiting the range you can reach. In your case, the power of four is dominated by the factorial, so you sum the highest magnitude values first. You'd probably get better precision starting by the other end.
Algorithmically, if you're going to raise 4 to the 2k+1 power and then divide by (2k+1)!, you should keep both the list of factors (4, 4, 4, 4...) and (2,3,4,5,6,7,8,9,....) and simplify both sides. There's plenty of fours to remove on the numerators and denominators at the same time.
Even with those four, I'm not sure you can get anywhere close to the 100000 target you set, without specialized code.

As already stated by others, the intermediate results you will get for large k are magnitudes too large to fit into a double. From a certain k on pow as well as factorial will return infinity. This is simply what happens for very large doubles. And as you then divide one infinity by another you get NaN.
One common trick to deal with too large numbers is using logarithms for intermediate results and only in the end apply the exponential function once.
Some mathematical knowledge of logarithms is required here. To understand what I am doing here you need to know exp(log(x)) == x, log(a^b) == b*log(a), and log(a/b) == log(a) - log(b).
In your case you can rewrite
pow(4, 2*k+1)
to
exp((2*k+1)*log(4))
Then there is still the factorial. The lgamma function can help with factorial(n) == gamma(n+1) and log(factorial(n)) == lgamma(n+1). In short, lgamma gives you the log of a factorial without huge intermediate results.
So summing up, replace
pow(4, 2*k+1) / factorial(2*k+1)
With
exp((2*k+1)*log(4) - lgamma(2*k+2))
This should help you with your NaNs. Also, this should increase performance as lgamma operates in O(1) whereas your factorial is in O(k).
Note, however, that I have still very little confidence that your result will be numerically accurate.
A double has still limited precision of roughly 16 decimal digits. Your 100000 iterations are very likely worthless, probably even harmfull.

Related

Loss of precision with pow function when surpassing 10^10 limit?

Doing one of my first homeworks of uni, and have ran into this problem:
Task: Find a sum of all n elements where n is the count of numerals in a number (n=1, means 1, 2, 3... 8, 9 for example, answer is 45)
Problem: The code I wrote has gotten all the test answers correctly up to 10 to the power of 9, but when it reaches 10 to the power of 10 territory, then the answers start being wrong, it's really close to what I should be getting, but not quite there (For example, my output = 49499999995499995136, expected result = 49499999995500000000)
Would really appreciate some help/insights, am guessing it's something to do with the variable types, but not quite sure of a possible solution..
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;
int main()
{
int n;
double ats = 0, maxi, mini;
cin >> n;
maxi = pow(10, n) - 1;
mini = pow(10, n-1) - 1;
ats = (maxi * (maxi + 1)) / 2 - (mini * (mini + 1)) / 2;
cout << setprecision(0) << fixed << ats;
}
The main reason of problems is pow() function. It works with double, not int. Loss of accuracy is price for representing huge numbers.
There are 3 way's to solve problem:
For small n you can make your own long long int pow(int x, int pow) function. But there is problem, that we can overflow even long long int
Use long arithmetic functions, as #rustyx sayed. You can write your own with vector, or find and include library.
There is Math solution specific for topic's task. It solves the big numbers problem.
You can write your formula like
((10^n) - 1) * (10^n) - (10^m - 1) * (10^m)) / 2 , (here m = n-1)
Then multiply numbers in numerator. Regroup them. Extract common multiples 10^(n-1). And then you can see, that answer have a structure:
X9...9Y0...0 for big enought n, where letter X and Y are constants.
So, you can just print the answer "string" without calculating.
I think you're stretching floating points beyond their precision. Let me explain:
The C pow() function takes doubles as arguments. You're passing ints, the compiler is adding the code to convert them to doubles before they reach pow(). (And anyway you're storing it as a double when you get the return value since you declared it that way).
Floating points are called that way precisely because the point "floats". Inside a double there's a sign bit, a few bits for the mantissa and a few bits for the exponent. In binary, elevating to a power of two is equivalent to moving the fractional point to the right (or to the left if you're elevating to a negative number). So basically the exponent is saying where the fractional point is, in binary. The great advantage of using this kind of in-memory representation for doubles is that you get a lot of precision for numbers close to 0, and gradually lose precision as numbers become bigger.
That last thing is exactly what's happening to you. Your number is too large to be stored exactly. So it's being rounded to the closest sum of powers of two (powers of two are the numbers that have all zeroes to the right in binary).
Quick experiment: press F12 in your browser, open the javascript console and type 49499999995499995136. In my case, in chrome, I reproduce the same problem.
If you really really really want precision with such big numbers then you can try some of these libraries, but that's too advanced for a student program, you don't need it. Just add an if block and print an error message if the number that the user typed is too big (professors love that, which is actually quite correct).

Why does it show nan?

Ok so i am doing an a program where I am trying to get the result of the right side to be equivalent to the left side with 0.0001% accuracy
sin x = x - (x^3)/3! + (x^5)/5! + (x^7)/7! +....
#include<iostream>
#include<iomanip>
#include<math.h>
using namespace std;
long int fact(long int n)
{
if(n == 1 || n == 0)
return 1;
else
return n*fact(n-1);
}
int main()
{
int n = 1, counts=0; //for sin
cout << "Enter value for sin" << endl;
long double x,value,next = 0,accuracy = 0.0001;
cin >> x;
value = sin(x);
do
{
if(counts%2 == 0)
next = next + (pow(x,n)/fact(n));
else
next = next - (pow(x,n)/fact(n));
counts++;
n = n+2;
} while((fabs(next - value))> 0);
cout << "The value of sin " << x << " is " << next << endl;
}
and lets say i enter 45 for x
I get the result
The value for sin 45 in nan.
can anyone help me out on where I did wrong ?
First your while condition should be
while((fabs(next - value))> accuracy) and fact should return long double.
When you change that it still won't work for value of 45. The reason is that this Taylor series converge too slowly for large values.
Here is the error term in the formula
Here k is the number of iterations a=0 and the function is sin.In order for the condition to become false 45^(k+1)/(k+1)! times some absolute value of sin or cos (depending what the k-th derivative is) (it's between 0 and 1) should be less than 0.0001.
Well in this formula for value of 50 the number is still very large (we should expect error of around 1.3*10^18 which means we will do more than 50 iterations for sure).
45^50 and 50! will overflow and then dividing them will give you infinity/infinity=NAN.
In your original version fact value doesn't fit in the integer (your value overflows to 0) and then the division over 0 gives you infinity which after subtract of another infinity gives you NAN.
I quote from here in regard to pow:
Return value
If no errors occur, base raised to the power of exp (or
iexp) (baseexp), is returned.
If a domain error occurs, an
implementation-defined value is returned (NaN where supported)
If a pole error or a range error due to overflow occurs, ±HUGE_VAL,
±HUGE_VALF, or ±HUGE_VALL is returned.
If a range error occurs due to
underflow, the correct result (after rounding) is returned.
Reading further:
Error handling
...
except where specified above, if any argument is NaN, NaN is returned
So basically, since n is increasing and and you have many loops pow returns NaN (the compiler you use obviously supports that). The rest is arithmetic. You calculate with overflowing values.
I believe you are trying to approximate sin(x) by using its Taylor series. I am not sure if that is the way to go.
Maybe you can try to stop the loop as soon as you hit NaN and not update the variable next and simply output that. That's the closest you can get I believe with your algorithm.
If the choice of 45 implies you think the input is in degrees, you should rethink that and likely should reduce mod 2 Pi.
First fix two bugs:
long double fact(long int n)
...
}while((fabs(next - value))> accuracy);
the return value of fact will overflow quickly if it is long int. The return value of fact will overflow eventually even for long double. When you compare to 0 instead of accuracy the answer is never correct enough, so only nan can stop the while
Because of rounding error, you still never converge (while pow is giving values bigger than fact you are computing differences between big numbers, which accumulates significant rounding error, which is then never removed). So you might instead stop by computing long double m=pow(x,n)/fact(n); before increasing n in each step of the loop and use:
}while(m > accuracy*.5);
At that point, either the answer has the specified accuracy or the remaining error is dominated by rounding error and iterating further won't help.
If you had compiled your system with any reasonable level of warnings enabled you would have immediately seen that you are not using the variable accuracy. This and the fact that your fact function returns a long int are but a small part of your problem. You will never get a good result for sin(45) using your algorithm even if you correct those issues.
The problem is that with x=45, the terms in the Taylor expansion of sin(x) won't start decreasing until n=45. This is a big problem because 4545/45! is a very large number, 2428380447472097974305091567498407675884664058685302734375 / 1171023117375434566685446533210657783808, or roughly 2*1018. Your algorithm initially adds and subtracts huge numbers that only start decreasing after 20+ additions/subtractions, with the eventual hope that the result will be somewhere between -1 and +1. That is an unrealizable hope given an input value of 45 and using a native floating point type.
You could use some BigNum type (the internet is chock-full of them) with your algorithm, but that's extreme overkill when you only want four place accuracy. Alternatively, you could take advantage of the cyclical nature of sin(x), sin(x+2*pi)=sin(x). An input value of 45 is equivalent to 1.017702849742894661522992634... (modulo 2*pi). Your algorithm works quite nicely for an input of 1.017702849742894661522992634.
You can do much better than that, but taking the input value modulo 2*pi is the first step toward a reasonable algorithm for computing sine and cosine. Even better, you can use the facts that sin(x+pi)=-sin(x). This lets you reduce the range from -infinity to +infinity to 0 to pi. Even better, you can use the fact that between 0 and pi, sin(x) is symmetric about pi/2. You can do even better than that. The implementations of the trigonometric functions take extreme advantage of these behaviors, but they typically do not use Taylor approximations.

C++ Modulus returning wrong answer

Here is my code :
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
int n, i, num, m, k = 0;
cout << "Enter a number :\n";
cin >> num;
n = log10(num);
while (n > 0) {
i = pow(10, n);
m = num / i;
k = k + pow(m, 3);
num = num % i;
--n;
cout << m << endl;
cout << num << endl;
}
k = k + pow(num, 3);
return 0;
}
When I input 111 it gives me this
1
12
1
2
I am using codeblocks. I don't know what is wrong.
Whenever I use pow expecting an integer result, I add .5 so I use (int)(pow(10,m)+.5) instead of letting the compiler automatically convert pow(10,m) to an int.
I have read many places telling me others have done exhaustive tests of some of the situations in which I add that .5 and found zero cases where it makes a difference. But accurately identifying the conditions in which it isn't needed can be quite hard. Using it when it isn't needed does no real harm.
If it makes a difference, it is a difference you want. If it doesn't make a difference, it had a tiny cost.
In the posted code, I would adjust every call to pow that way, not just the one I used as an example.
There is no equally easy fix for your use of log10, but it may be subject to the same problem. Since you expect a non integer answer and want that non integer answer truncated down to an integer, adding .5 would be very wrong. So you may need to find some more complicated work around for the fundamental problem of working with floating point. I'm not certain, but assuming 32-bit integers, I think adding 1e-10 to the result of log10 before converting to int is both never enough to change log10(10^n-1) into log10(10^n) but always enough to correct the error that might have done the reverse.
pow does floating-point exponentiation.
Floating point functions and operations are inexact, you cannot ever rely on them to give you the exact value that they would appear to compute, unless you are an expert on the fine details of IEEE floating point representations and the guarantees given by your library functions.
(and furthermore, floating-point numbers might even be incapable of representing the integers you want exactly)
This is particularly problematic when you convert the result to an integer, because the result is truncated to zero: int x = 0.999999; sets x == 0, not x == 1. Even the tiniest error in the wrong direction completely spoils the result.
You could round to the nearest integer, but that has problems too; e.g. with sufficiently large numbers, your floating point numbers might not have enough precision to be near the result you want. Or if you do enough operations (or unstable operations) with the floating point numbers, the errors can accumulate to the point you get the wrong nearest integer.
If you want to do exact, integer arithmetic, then you should use functions that do so. e.g. write your own ipow function that computes integer exponentiation without any floating-point operations at all.

Calculating Probability C++ Bernoulli Trials

The program asks the user for the number of times to flip a coin (n; the number of trials).
A success is considered a heads.
Flawlessly, the program creates a random number between 0 and 1. 0's are considered heads and success.
Then, the program is supposed to output the expected values of getting x amount of heads. For example if the coin was flipped 4 times, what are the following probabilities using the formula
nCk * p^k * (1-p)^(n-k)
Expected 0 heads with n flips: xxx
Expected 1 heads with n flips: xxx
...
Expected n heads with n flips: xxx
When doing this with "larger" numbers, the numbers come out to weird values. It happens if 15 or twenty are put into the input. I have been getting 0's and negative values for the value that should be xxx.
Debugging, I have noticed that the nCk has come out to be negative and not correct towards the upper values and beleive this is the issue. I use this formula for my combination:
double combo = fact(n)/fact(r)/fact(n-r);
here is the psuedocode for my fact function:
long fact(int x)
{
int e; // local counter
factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
Any thoughts? My guess is my factorial or combo functions are exceeding the max values or something.
You haven't mentioned how is factor declared. I think you are getting integer overflows. I suggest you use double. That is because since you are calculating expected values and probabilities, you shouldn't be concerned much about precision.
Try changing your fact function to.
double fact(double x)
{
int e; // local counter
double factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
EDIT:
Also to calculate nCk, you need not calculate factorials 3 times. You can simply calculate this value in the following way.
if k > n/2, k = n-k.
n(n-1)(n-2)...(n-k+1)
nCk = -----------------------
factorial(k)
You're exceeding the maximum value of a long. Factorial grows so quickly that you need the right type of number--what type that is will depend on what values you need.
Long is an signed integer, and as soon as you pass 2^31, the value will become negative (it's using 2's complement math).
Using an unsigned long will buy you a little time (one more bit), but for factorial, it's probably not worth it. If your compiler supports long long, then try an "unsigned long long". That will (usually, depends on compiler and CPU) double the number of bits you're using.
You can also try switching to use double. The problem you'll face there is that you'll lose accuracy as the numbers increase. A double is a floating point number, so you'll have a fixed number of significant digits. If your end result is an approximation, this may work okay, but if you need exact values, it won't work.
If none of these solutions will work for you, you may need to resort to using an "infinite precision" math package, which you should be able to search for. You didn't say if you were using C or C++; this is going to be a lot more pleasant with C++ as it will provide a class that acts like a number and that would use standard arithmetic operators.

about float number comparision in C++

I have a loop to loop a floating number between given min and max range as follow
#include <iostream>
using namespace std;
void main(void)
{
for (double x=0.012; x<=0.013; x+=0.001)
{
cout << x << endl;
}
}
It is pretty simple code but as I know in computer language, we need to compare two floating numbers with EPS considered. Hence, above code doesn't work (we expect it to loop two times from 0.012 to 0.013 but it only loop once). So I manually add an EPS to the upper limit.
#include <iostream>
using namespace std;
#define EPS 0.0000001
void main(void)
{
for (double x=0.012; x<=0.013+EPS; x+=0.001)
{
cout << x << endl;
}
}
and it works now. But it looks ugly to do that manually since EPS should really depends on machine. I am porting my code from matlab to C++ and I don't have problem in matlab since there is eps command. But is there anything like that in C/C++?
Fudging the comparison is the wrong technique to use. Even if you get the comparison “right”, a floating-point loop counter will accumulate error from iteration to iteration.
You can eliminate accumulation of error by using exact arithmetic for the loop counter. It may still have floating-point type, but you use exactly representable values, such as:
for (double i = 12; i <= 13; ++i)
Then, inside the loop, you scale the counter as desired:
for (double i = 12; i <= 13; ++i)
{
double x = i / 1000.;
…
}
Obviously, there is not much error accumulating in a loop with two iterations. However, I expect your values are just one example, and there may be longer loops in practice. With this technique, the only error in x is in the scaling operation, so just one error in each iteration instead of one per iteration.
Note that dividing by 1000 is more accurate than scaling by .001. Division by 1000 has only one error (in the division). However, since .001 is not exactly representable in binary floating point, multiplying by it has two errors (in the conversion of .001 to floating point and in the multiplication). On the other hand, division is typically a very slow operation, so you might choose the multiplication.
Finally, although this technique guarantees the desired number of iterations, the scaled value might be slight outside the ideal target interval in the first or the last iteration, due to the rounding error in the scaling. If this matters to your application, you must adjust the value in these iterations.