Numerically calculate combinations of factorials and polynomials - c++

I am trying to write a short C++ routine to calculate the following function F(i,j,z) for given integers j > i (typically they lie between 0 and 100) and complex number z (bounded by |z| < 100), where L are the associated Laguerre Polynomials:
The issue is that I want this function to be callable from within a CUDA kernel (i.e. with a __device__ attribute). Standard library/Boost/etc. functions are therefore out of the question, unless they are simple enough to re-implement on my own. This especially concerns the Laguerre polynomials, which exist in Boost and C++17. Even if I manage to wrap a standard function for the Laguerre polynomials, I still have a similar pre-factor to calculate, of the form (z^j/j!).
Question: How can I do a relatively simple implementation of such a function, without introducing significant numerical instability?
My idea so far is to calculate L and its pre-factor independently. The pre-factor I will calculate by looping k from 1 to j-i and building up the running product z^1/1!, z^2/2!, ..., z^(j-i)/(j-i)!. I will then calculate the remaining factor exp(-|z|^2/2) * (j-i)! * sqrt(i!/j!) (either in a similar way, or through the Gamma function, which is implemented in CUDA math). The idea is then to find a minimal algorithm to calculate the associated Laguerre polynomial, unless I manage to wrap an implementation from e.g. Boost or GNU C++.
Edit/side note: The expression for F actually blows up numerically for some values of i/j. It was derived wrong in the source where I got it, and the indices of the associated Laguerre polynomials should instead be L_i^(j-i). That does not invalidate the approaches suggested in the answers/comments.

I recommend finding a recurrence relation for the coefficients of the Laguerre Polynomial:
C(k+1) = g(k)C(k)
g(k) = C(k+1) / C(k)
g(k) = -z * (j - k) / ((j - i + k + 1) * (k + 1)) //Verify this yourself :)
This allows you to avoid most of the factorials when computing the polynomial.
After that I would follow Severin's idea of doing the calculations in logarithms, so as not to overflow the double floating-point range:
log(F) = log(sqrt(i!/j!)) - |z|^2 + (j-i) * log(-z) + log(L(|z|^2))
log(L) = log((2*j - i)!) + log(sum) // where the summation is computed using the recurrence relation above
and using the fact that:
log(a!) = sum(k=1..a, log(k))
and also:
log(z) = log(|z|) + I * arg(z) for complex z
log(-z) = log(|z|) + I * arg(-z)
log(-z) = log(|z|) + I * (arg(z) + pi) // up to a multiple of 2*pi*I, since arg(-z) = arg(z) + pi (mod 2*pi)
for the log(sqrt(i!/j!)) part I would do (assuming that j >= i):
log(sqrt(i!/j!))
= 0.5 * (log(i!) - log(j!))
= -0.5 * sum(k=i+1..j, log(k))
I haven't tried this out, so there could definitely be little mistakes here and there. This answer is more about the technique than a copy-paste-ready solution.
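A minimal sketch of the term-ratio idea, written as plain host C++ (for CUDA one would presumably add a __device__ qualifier). The name laguerre_series and the parameters n (degree), alpha (upper index) and x (real argument) are my own choices, not from the question. With the corrected indices from the question's edit one would call it with n = i and alpha = j - i. The leading term C(n+alpha, n) is built as a running product instead of from factorials. Because the terms alternate in sign, the sum can lose accuracy to cancellation for large x, so treat this as an illustration rather than a robust implementation.

```cpp
#include <cmath>

// Sums L_n^(alpha)(x) = sum_{k=0..n} t_k, where
//   t_0 = C(n + alpha, n)
//   t_{k+1} = t_k * ( -x * (n - k) / ((alpha + k + 1) * (k + 1)) )
// Assumes n >= 0 and alpha >= 0 (hypothetical helper, not from the thread).
double laguerre_series(int n, int alpha, double x) {
    // t_0 = C(n + alpha, n) = prod_{m=1..n} (alpha + m) / m, no factorials needed
    double term = 1.0;
    for (int m = 1; m <= n; ++m) {
        term *= static_cast<double>(alpha + m) / static_cast<double>(m);
    }
    double sum = term;
    for (int k = 0; k < n; ++k) {
        // ratio between consecutive terms of the series
        term *= -x * (n - k) / ((alpha + k + 1.0) * (k + 1.0));
        sum += term;
    }
    return sum;
}
```

For example, laguerre_series(1, 3, 0.5) should agree with the closed form L_1^(3)(x) = 1 + 3 - x.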

What you should do is take the logarithm of it.
Assuming the natural logarithm,
q = log(z^j/j!) = log(z^j) - log(j!) = j*log(z) - log(Gamma(j+1))
The first term is simple; the second term is the standard C++ function lgamma(x) (or you could use GSL).
Compute the value of q and return cexp(q).
You could fold the exponential factor into this method as well.
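A small sketch of this logarithm trick using std::complex on the host. The function name power_over_factorial is made up for illustration, and the special cases for j == 0 and z == 0 are my own additions to avoid taking log(0). On the device one would presumably replace std::complex with whatever complex type the CUDA code uses.

```cpp
#include <cmath>
#include <complex>

// Computes z^j / j! as exp(j*log(z) - lgamma(j+1)).
// Hypothetical helper; assumes j >= 0.
std::complex<double> power_over_factorial(std::complex<double> z, int j) {
    if (j == 0) {
        return std::complex<double>(1.0, 0.0);   // z^0 / 0! = 1
    }
    if (std::abs(z) == 0.0) {
        return std::complex<double>(0.0, 0.0);   // avoid log(0)
    }
    // q = j*log(z) - log(Gamma(j+1)); log(z) is the complex principal logarithm
    const std::complex<double> q =
        static_cast<double>(j) * std::log(z) - std::lgamma(j + 1.0);
    return std::exp(q);
}
```

Since j is an integer, the choice of branch for log(z) does not affect the result: any extra multiple of 2*pi*I is multiplied by j and vanishes in the exponential.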

Related

Calculation sine and cosine in one shot

I have a scientific code that uses both sine and cosine of the same argument (I basically need the complex exponential of that argument). I was wondering if it were possible to do this faster than calling sine and cosine functions separately.
Also I only need about 0.1% precision. So is there any way I can find the default trig functions and truncate the power series for speed?
One other thing I have in mind is, is there any way to perform the remainder operation such that the result is always positive? In my own algorithm I used x=fmod(x,2*pi); but then I would need to add 2pi if x is negative (smaller domain means I can use a shorter power series)
EDIT: LUT turned out to be the best approach for this, however I am glad I learned about other approximation techniques. I will also advise using an explicit midpoint approximation. This is what I ended up doing:
const int N = 10000;//about 3e-4 error for 1000//3e-5 for 10 000//3e-6 for 100 000
double *cs = new double[N];
double *sn = new double[N];
for(int i =0;i<N;i++){
double A= (i+0.5)*2*pi/N;
cs[i]=cos(A);
sn[i]=sin(A);
}
The following part approximates (midpoint) sincos(2*pi*(wc2+t[j]*(cotp*t[j]-wc)))
double A=(wc2+t[j]*(cotp*t[j]-wc));
int B = (int)(N*(A-floor(A))); // cast the whole product (not just N) to get the table index
re += cs[B]*f[j];
im += sn[B]*f[j];
Another approach could have been using the chebyshev decomposition. You can use the orthogonality property to find the coefficients. Optimized for exponential, it looks like this:
double fastsin(double x){
    // this line can be improved, both inside this function
    // and before you input x into the function
    x = x - floor(x/2/pi)*2*pi - pi;
    double x2 = x*x;
    // 7th order chebyshev approx
    return (((0.00015025063885163012*x2 - 0.008034350857376128)*x2
             + 0.1659789684145034)*x2 - 0.9995812174943602)*x;
}
If you seek fast evaluation with good (but not high) accuracy from a power series, you should use an expansion in Chebyshev polynomials: tabulate the coefficients (you'll need VERY few for 0.1% accuracy) and evaluate the expansion with the recursion relations for these polynomials (it's really very easy).
References:
Tabulated coefficients: http://www.ams.org/mcom/1980-34-149/S0025-5718-1980-0551302-5/S0025-5718-1980-0551302-5.pdf
Evaluation of chebyshev expansion: https://en.wikipedia.org/wiki/Chebyshev_polynomials
You'll need to (a) get the "reduced" argument in the range -pi/2..+pi/2 and consequently then (b) handle the sign in your results when the argument actually should have been in the "other" half of the full elementary interval -pi..+pi. These aspects should not pose a major problem:
1. Determine (and "remember" as an integer 1 or -1) the sign of the original angle and proceed with the absolute value.
2. Use a modulo function to reduce to the interval 0..2PI.
3. Determine (and "remember" as an integer 1 or -1) whether it is in the "second" half and, if so, subtract pi*3/2, otherwise subtract pi/2. Note: this effectively interchanges sine and cosine (apart from signs); take this into account in the final evaluation.
This completes the step to get an angle in -pi/2..+pi/2
After evaluating sine and cosine with the Cheb-expansions, apply the "flags" of steps 1 and 3 above to get the right signs in the values.
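The reduction steps above can be sketched roughly as follows. This only illustrates the bookkeeping: core_sin and core_cos are placeholders of my own that simply call std::sin and std::cos, standing in for the short Chebyshev expansions on -pi/2..+pi/2, and the SinCos struct and function name are likewise hypothetical.

```cpp
#include <cmath>

struct SinCos { double s; double c; };

// Stand-ins for the short Chebyshev expansions valid on [-pi/2, pi/2].
inline double core_sin(double u) { return std::sin(u); }
inline double core_cos(double u) { return std::cos(u); }

SinCos reduced_sincos(double x) {
    const double pi = 3.14159265358979323846;
    // Step 1: remember the sign, continue with the absolute value.
    const int sign = (x < 0.0) ? -1 : 1;
    // Step 2: reduce to [0, 2*pi).
    const double a = std::fmod(std::fabs(x), 2.0 * pi);
    // Step 3: remember the half, then shift into [-pi/2, pi/2].
    const bool second_half = (a >= pi);
    const double u = a - (second_half ? 1.5 * pi : 0.5 * pi);
    const double su = core_sin(u);
    const double cu = core_cos(u);
    // The shift interchanges sine and cosine:
    //   a = u + pi/2   : sin(a) =  cos(u), cos(a) = -sin(u)
    //   a = u + 3*pi/2 : sin(a) = -cos(u), cos(a) =  sin(u)
    SinCos r;
    r.s = second_half ? -cu : cu;
    r.c = second_half ? su : -su;
    // Apply the sign from step 1: sine is odd, cosine is even.
    r.s *= sign;
    return r;
}
```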
Just create a lookup table. The following will let you lookup the sin and cos of any radian value between -2PI and 2PI.
// LOOK UP TABLE
var LUT_SIN_COS = [];
var N = 14400;
var HALF_N = N >> 1;
var STEP = 4 * Math.PI / N;
var INV_STEP = 1 / STEP;
// BUILD LUT
for(var i=0, r = -2*Math.PI; i < N; i++, r += STEP) {
LUT_SIN_COS[2*i] = Math.sin(r);
LUT_SIN_COS[2*i + 1] = Math.cos(r);
}
You index into the lookup table by:
var index = ((r * INV_STEP) + HALF_N) << 1;
var sin = LUT_SIN_COS[index];
var cos = LUT_SIN_COS[index + 1];
Here's a fiddle that displays the % error you can expect from different sized LUTS http://jsfiddle.net/77h6tvhj/
EDIT Here's an ideone (c++) with a ~benchmark~ vs the float sin and cos. http://ideone.com/SGrFVG For whatever a benchmark on ideone.com is worth the LUT is 5 times faster.
One way to go would be to learn how to implement the CORDIC algorithm. It is not difficult and is pretty interesting intellectually. This gives you both the cosine and the sine. Wikipedia gives a MATLAB example that should be easy to adapt to C++.
Note that you can augment speed and reduce precision simply by lowering the parameter n.
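For reference, a floating-point sketch of CORDIC in rotation mode (my own adaptation, not the Wikipedia code). It assumes |theta| <= pi/2, so the argument has to be reduced first. A real implementation would tabulate the atan(2^-i) values and the gain once instead of recomputing them on every call, and would often use fixed-point shifts instead of multiplications.

```cpp
#include <cmath>
#include <utility>

// Returns (sin(theta), cos(theta)) after n CORDIC iterations.
// Assumes |theta| <= pi/2; the function name is a hypothetical choice.
std::pair<double, double> cordic_sincos(double theta, int n) {
    double x = 1.0, y = 0.0;   // start on the unit vector along the x axis
    double z = theta;          // remaining angle to rotate by
    double pow2 = 1.0;         // 2^-i
    double gain = 1.0;         // accumulated stretch of the pseudo-rotations
    for (int i = 0; i < n; ++i) {
        const double d = (z >= 0.0) ? 1.0 : -1.0;   // rotation direction
        const double x_new = x - d * y * pow2;
        const double y_new = y + d * x * pow2;
        z -= d * std::atan(pow2);                   // normally a table lookup
        x = x_new;
        y = y_new;
        gain *= std::sqrt(1.0 + pow2 * pow2);
        pow2 *= 0.5;
    }
    return std::make_pair(y / gain, x / gain);
}
```

Each extra iteration roughly halves the remaining angle error, which is why lowering n trades precision for speed.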
About your second question, it has already been asked here (in C). It seems that there is no simple way.
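For completeness, the usual workaround is the one the question already describes: call fmod and add the modulus when the result is negative. A minimal sketch, with a helper name of my own choosing:

```cpp
#include <cmath>

// Remainder of x modulo m that is never negative (assumes m > 0).
// Caveat: when fmod returns a tiny negative value, r + m can round to
// exactly m, so the result lies in [0, m] rather than strictly [0, m).
double positive_fmod(double x, double m) {
    double r = std::fmod(x, m);   // has the sign of x
    if (r < 0.0) {
        r += m;
    }
    return r;
}
```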
You can also calculate sine using a square root, given the angle and the cosine.
The example below assumes the angle ranges from 0 to 2π:
double c = cos(angle);
double s = sqrt(1.0-c*c);
if(angle>pi)s=-s;
For single-precision floats, Microsoft uses an 11-degree polynomial approximation for sine and a 10-degree one for cosine: XMScalarSinCos.
They also have a faster version, XMScalarSinCosEst, that uses lower-degree polynomials.
If you aren't on Windows, you'll find the same code and coefficients on geometrictools.com under the Boost license.

How to compute sin(2*m*Pi/n) exactly with CGAL and CORE?

Using Chebyshev polynomials, we can compute sin(2*Pi/n) exactly with the CGAL and CORE libraries, as in the following piece of code:
#include <CGAL/CORE_Expr.h>
#include <CGAL/Polynomial.h>
#include <CGAL/number_utils.h>
//return sin(theta) and cos(theta) for theta = 2pi/n
static std::pair<AA, AA> sin_cos(unsigned short n) {
// We actually use -x instead of x since root_of will give the k-th
// smallest root but we want the second largest one without counting.
Polynomial x(CGAL::shift(Polynomial(-1), 1));
Polynomial twox(2*x);
Polynomial a(1), b(x);
for (unsigned short i = 2; i <= n; ++i) {
Polynomial c = twox*b - a;
a = b;
b = c;
}
a = b - 1;
AA cos = -CGAL::root_of(2, a.begin(), a.end());
AA sin = CGAL::sqrt(AA(1) - cos*cos);
return std::make_pair(sin, cos);
}
But if I want to compute sin(2*m*Pi/n) exactly, where m and n are integers, what is the formula of the polynomial that I should use? Thanks.
(Partial solution.)
This is essentially computing the real and imaginary part of the roots of unity as algebraic numbers. Let's denote w(m) = exp(2*pi*I*m/n). Then, w(m) itself is a complex root of En(x) = x^n-1.
You need to find a defining polynomial of Re(w(m)). Resultants are a tool to find such a polynomial: 2*Re(w(m)) is a root of Res (En(x-y), En(y); y).
For an explanation why this is the case: Note that 2*Re(w(m)) = w(m) + conj(w(m)), and that the complex roots of En come in conjugate pairs; hence, also conj(w(m)) is a root of En. Now loosely speaking, the En(y) part "constrains" y to be any (complex) root of En, and combining this with the first argument allows x to take any complex value such that x-y is a root of En as well. Hence, a possible assignment is y = conj(w(m)) and x-y = w(m), hence x = w(m)+conj(w(m)) = 2*Re(w(m)).
CGAL can compute resultants of multivariate polynomials, so you can compute this resultant, and you simply have to pick the correct real root. (The largest one will obviously be w(0) = 1, the smallest one is 2*Re(w(floor(n/2))).)
Unfortunately, the resultant has a high complexity (degree n^2), and resultant computation will not be the fastest operation you've ever seen. Also, you'll pay for dense polynomials although your instances are very sparse and structured. YMMV; I have no clue about your use case, and if you need higher degrees.
However, I did a few tests in a computer algebra system, and I found that the resultant splits into factors of more reasonable size, and that all its real roots actually belong to a much simpler polynomial of degree floor(n/2)+1 only. (No proof, just an observation.)
I don't know of a direct formula to write down this factor, and I don't want to speculate about it. But maybe some people at mathoverflow or math.stackexchange can help?
EDIT: Here is a guess for at least a recursive formula.
I write s(n,x) for the significant factor of the resultant polynomial containing all real roots but 0. This means that s(n,x) has all values 2*Re(w(m)) for m != n/4, 3*n/4 as roots.
s(0,x) = 0
s(1,x) = x - 2
s(2,x) = x^2 - 4
s(3,x) = x^2 - x - 2
s(4,x) = x^2 - 4
s(5,x) = x^3 - x^2 - 3*x + 2
s(6,x) = x^4 - 5*x^2 + 4
s(7,x) = x^4 - x^3 - 4*x^2 + 3*x + 2
s(8,x) = x^4 - 6*x^2 + 8
s(n,x) = (x^2-2)*s(n-4,x) - s(n-8,x)
Waiting for a proof...
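Until there is a proof, the guessed recursion can at least be checked numerically in plain double arithmetic (no CGAL involved). The sketch below evaluates s(n,x) at a point using the base cases listed above and the recursion; the function name s_value is hypothetical. One can then test that x = 2*cos(2*pi*m/n) gives a value close to zero, which is evidence for the guess, not a proof.

```cpp
#include <cmath>
#include <vector>

// Evaluates s(n, x) numerically, assuming n >= 0.
// Base cases s(0..7, x) are copied from the list above;
// larger n use s(n,x) = (x^2 - 2)*s(n-4,x) - s(n-8,x).
double s_value(int n, double x) {
    const double x2 = x * x;
    std::vector<double> s;
    s.push_back(0.0);                                          // s(0,x)
    s.push_back(x - 2.0);                                      // s(1,x)
    s.push_back(x2 - 4.0);                                     // s(2,x)
    s.push_back(x2 - x - 2.0);                                 // s(3,x)
    s.push_back(x2 - 4.0);                                     // s(4,x)
    s.push_back(x2 * x - x2 - 3.0 * x + 2.0);                  // s(5,x)
    s.push_back(x2 * x2 - 5.0 * x2 + 4.0);                     // s(6,x)
    s.push_back(x2 * x2 - x2 * x - 4.0 * x2 + 3.0 * x + 2.0);  // s(7,x)
    for (int k = 8; k <= n; ++k) {
        s.push_back((x2 - 2.0) * s[k - 4] - s[k - 8]);
    }
    return s[n];
}
```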

Using the gaussian probability density function in C++

First, is this the correct C++ representation of the pdf gaussian function ?
float pdf_gaussian = ( 1 / ( s * sqrt(2*M_PI) ) ) * exp( -0.5 * pow( (x-m)/s, 2.0 ) );
Second, does it make sense if we do something like this?
if(pdf_gaussian < uniform_random())
do something
else
do other thing
EDIT: An example of what exactly are you trying to achieve:
Say I have a data item called Y1. Then a new data item called Xi arrives. I want to see if I should associate Xi with Y1 or if I should keep Xi as a new data item that will be called Y2. This is based on the distance between the new data Xi and the existing data Y1. If Xi is "far" from Y1 then Xi will not be associated with Y1; otherwise, if it is "not far", it will be associated with Y1. Now I want to model this "far" or "not far" using a Gaussian probability based on the mean and standard deviation of the distances between Y and the data that were already associated with Y in the past.
Technically,
float pdf_gaussian = ( 1 / ( s * sqrt(2*M_PI) ) ) * exp( -0.5 * pow( (x-m)/s, 2.0 ) );
is not incorrect, but can be improved.
First, 1 / sqrt(2 Pi) can be precomputed, and using pow with integers is not a good idea: it may use exp(2 * log x) or a routine specialized for floating point exponents instead of simply x * x.
Example better code:
float normal_pdf(float x, float m, float s)
{
static const float inv_sqrt_2pi = 0.3989422804014327;
float a = (x - m) / s;
return inv_sqrt_2pi / s * std::exp(-0.5f * a * a);
}
You may want to make this a template instead of using float:
template <typename T>
T normal_pdf(T x, T m, T s)
{
static const T inv_sqrt_2pi = 0.3989422804014327;
T a = (x - m) / s;
return inv_sqrt_2pi / s * std::exp(-T(0.5) * a * a);
}
This allows you to use normal_pdf on double arguments as well (it is not that much more generic, though). There is a caveat with the last version: take care not to call it with integer arguments (there are workarounds, but they make the routine more verbose).
yes. boost::random has gaussian distribution.
See, for example, this question: How to use boost normal distribution classes?
As an alternative, there's a standard way of converting two uniformly distributed random numbers into two normally distributed numbers.
See, e.g. this question: Generate random numbers following a normal distribution in C/C++
In response to your last edit (note that the question as edited is completely different, hence my answer to the original one is irrelevant): I think you'd be better off first formulating for yourself what exactly you mean by "modelling far or not far using a gaussian distribution". Then reformulate that understanding in math terms and only then start programming. As it stands, I think the problem is underspecified.
Use Box-Muller transform. This creates values with a normal/gaussian distribution.
http://en.wikipedia.org/wiki/Normal_distribution#Generating_values_from_normal_distribution
http://en.wikipedia.org/wiki/Box_Muller_transform
It's not very complex to code using math libraries.
eg.
Generate 2 uniform numbers, use them to get two normally distributed numbers. Then Return one and save the other so that you have it for your 'next' request of a random number.
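A bare-bones sketch of the transform itself, written as a pure function of the two uniform inputs so that it is easy to check. The function name is my own, and caching the second value for the next request is left to the caller.

```cpp
#include <cmath>
#include <utility>

// Basic Box-Muller transform: maps two independent uniform numbers
// u1 in (0, 1] and u2 in [0, 1) to two independent standard normal numbers.
std::pair<double, double> box_muller(double u1, double u2) {
    const double two_pi = 6.283185307179586;
    const double r = std::sqrt(-2.0 * std::log(u1));   // u1 must not be 0
    return std::make_pair(r * std::cos(two_pi * u2),
                          r * std::sin(two_pi * u2));
}
```

To get samples with mean m and standard deviation s, scale each output z as m + s*z.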
Since C++11, std::normal_distribution, defined in the standard header <random>, can be used to generate Gaussian random samples. More information can be found herein.
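A short usage sketch, assuming a Mersenne Twister engine with a fixed seed so that runs are repeatable. The wrapper draw_normal is a made-up helper, not part of the standard library.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draws `count` samples from a normal distribution with mean m and
// standard deviation s, using a seeded std::mt19937 engine.
std::vector<double> draw_normal(double m, double s, std::size_t count, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(m, s);
    std::vector<double> out(count);
    for (double& v : out) {
        v = dist(gen);
    }
    return out;
}
```

Note that the exact sequence produced for a given seed may differ between standard library implementations, so only statistical properties should be relied upon.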

lagrange approximation -c++

I updated the code.
What I am trying to do is hold every Lagrange coefficient's values in the pointer d (for example, for L1(x), d[0] would be "(x-x2)/(x1-x2)", d[1] would be ((x-x2)/(x1-x2))*((x-x3)/(x1-x3)), etc.).
My problem is:
1) how to initialize d (I did d[0]=(z-x[i])/(x[k]-x[i]), but I think the "d[0]" is not right);
2) how to initialize L_coeff (I am using L_coeff=new double[0], but I am not sure if it's right).
The exercise is:
Find Lagrange's polynomial approximation for y(x)=cos(π x), x ∈ [-1,1] using 5 points
(x = -1, -0.5, 0, 0.5, and 1).
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cmath>
using namespace std;
const double pi=3.14159265358979323846264338327950288;
// my function
double f(double x){
return (cos(pi*x));
}
//function to compute lagrange polynomial
double lagrange_polynomial(int N,double *x){
//N = degree of polynomial
double z,y;
double *L_coeff=new double [0];//L_coefficients of every Lagrange L_coefficient
double *d;//hold the polynomials values for every Lagrange coefficient
int k,i;
//computations for finding lagrange polynomial
//double sum=0;
for (k=0;k<N+1;k++){
for ( i=0;i<N+1;i++){
if (i==0) continue;
d[0]=(z-x[i])/(x[k]-x[i]);//initialization
if (i==k) L_coeff[k]=1.0;
else if (i!=k){
L_coeff[k]*=d[i];
}
}
cout <<"\nL("<<k<<") = "<<d[i]<<"\t\t\tf(x)= "<<f(x[k])<<endl;
}
}
int main()
{
double deg,result;
double *x;
cout <<"Give the degree of the polynomial :"<<endl;
cin >>deg;
for (int i=0;i<deg+1;i++){
cout <<"\nGive the points of interpolation : "<<endl;
cin >> x[i];
}
cout <<"\nThe Lagrange L_coefficients are: "<<endl;
result=lagrange_polynomial(deg,x);
return 0;
}
Here is an example of lagrange polynomial
As this seems to be homework, I am not going to give you an exhaustive answer, but rather try to send you on the right track.
How do you represent polynomials in computer software? The intuitive version you want to achieve, a symbolic expression like 3x^3+5x^2-4, is very impractical for further computations.
The polynomial is fully defined by saving (and outputting) its coefficients.
What you are doing above is hoping that C++ does some algebraic manipulations for you and simplifies your product with a symbolic variable. This is nothing C++ can do without quite a lot of effort.
You have two options:
Either use a proper computer algebra system that can do symbolic manipulations (Maple or Mathematica are some examples)
If you are bound to C++, you have to think a bit more about how the individual coefficients of the polynomial can be computed. Your program's output can only be a list of numbers (which you could, of course, format as a nice-looking string in the form of a symbolic expression).
Hope this gives you some ideas how to start.
Edit 1
You still have an undefined expression in your code, as you never set any value to y. This leaves prod*=(y-x[i])/(x[k]-x[i]) as an expression that will not return meaningful data. C++ can only work with numbers, and y is no number for you right now, but you think of it as symbol.
You could evaluate the lagrange approximation at, say the value 1, if you would set y=1 in your code. This would give you the (as far as I can see right now) correct function value, but no description of the function itself.
Maybe you should take a pen and a piece of paper first and try to write down the expression as precise Math. Try to get a real grip on what you want to compute. If you did that, maybe you come back here and tell us your thoughts. This should help you to understand what is going on in there.
And always remember: C++ needs numbers, not symbols. Whenever you have a symbol in an expression on your piece of paper that you do not know the value of you can either find a way how to compute the value out of the known values or you have to eliminate the need to compute using this symbol.
P.S.: It is not considered good style to post identical questions in multiple discussion boards at once...
Edit 2
Now you evaluate the function at point y=0.3. This is the way to go if you want to evaluate the polynomial. However, as you stated, you want all coefficients of the polynomial.
Again, I still feel you did not understand the math behind the problem. Maybe I will give you a small example. I am going to use the notation as it is used in the wikipedia article.
Suppose we had k=2 and x=-1, 1. Furthermore, let me just name your cos function f, for simplicity. (The notation will get rather ugly without LaTeX...) Then the Lagrange polynomial is defined as
f(x_0) * l_0(x) + f(x_1)*l_1(x)
where (by doing the simplifications again symbolically)
l_0(x)= (x - x_1)/(x_0 - x_1) = -1/2 * (x-1) = -1/2 *x + 1/2
l_1(x)= (x - x_0)/(x_1 - x_0) = 1/2 * (x+1) = 1/2 * x + 1/2
So, your Lagrange polynomial is
f(x_0) * (-1/2 * x + 1/2) + f(x_1) * (1/2 * x + 1/2)
= 1/2 * (f(x_1) - f(x_0)) * x + 1/2 * (f(x_0) + f(x_1))
So, the coefficients you want to compute would be 1/2 * (f(x_1) - f(x_0)) and 1/2 * (f(x_0) + f(x_1)).
Your task is now to find an algorithm that does the simplification I did, but without using symbols. If you know how to compute the coefficients of the l_j, you are basically done, as you then just can add up those multiplied with the corresponding value of f.
So, broken down even further, you have to find a way to multiply the quotients in the l_j with each other on a component-by-component basis. Figure out how this is done and you are nearly done.
Edit 3
Okay, let's get a little bit less vague.
We first want to compute the L_i(x). Those are just products of linear functions. As said before, we have to represent each polynomial as an array of coefficients. For good style, I will use std::vector instead of this array. Then, we could define the data structure holding the coefficients of L_1(x) like this:
std::vector<double> L1(5);
// Let's assume our polynomial then has the form
// L1[0] + L1[1]*x^1 + L1[2]*x^2 + L1[3]*x^3 + L1[4]*x^4
Now we want to fill this polynomial with values.
// First we have to start with the polynomial 1 (which is of degree 0)
// Therefore set L1 accordingly:
L1[0] = 1;
L1[1] = 0; L1[2] = 0; L1[3] = 0; L1[4] = 0;
// Of course you could do this more elegantly (using std::vector's constructor, for example)
for (int i = 0; i < N+1; ++i) {
    if (i==0) continue; // For i=0, there will be no polynomial multiplication
    // Otherwise, we have to multiply L1 by the polynomial
    // (x - x[i]) / (x[0] - x[i])
    // First, note that (x[0] - x[i]) is just a scalar; we will save it:
    double c = (x[0] - x[i]);
    // Now we multiply L1 by (x - x[i]). How does this multiplication change our
    // coefficients? Easy enough: the new coefficient of x^1, for example, is just
    // L1[0] - L1[1] * x[i]. Other coefficients are done similarly. Furthermore, we have
    // to divide by c, which leaves our coefficient as
    // (L1[0] - L1[1] * x[i])/c. Let's apply this to the vector, highest power first,
    // so that no old value is overwritten before it has been used:
    L1[4] = (L1[3] - L1[4] * x[i])/c;
    L1[3] = (L1[2] - L1[3] * x[i])/c;
    L1[2] = (L1[1] - L1[2] * x[i])/c;
    L1[1] = (L1[0] - L1[1] * x[i])/c;
    L1[0] = ( - L1[0] * x[i])/c;
    // There we are, polynomial updated.
}
This, of course, has to be done for all L_i. Afterwards, the L_i have to be multiplied by the corresponding function values and added up. That is for you to figure out. (Note that I did quite a lot of inefficient things up there, but I hope this helps you understand the details better.)
Hopefully this gives you some idea how you could proceed.
The variable y is actually not a variable in your code but represents the variable P(y) of your lagrange approximation.
Thus, you have to understand the calculations prod*=(y-x[i])/(x[k]-x[i]) and sum+=prod*f not directly but symbolically.
You may get around this by defining your approximation by a series
c[0] * y^0 + c[1] * y^1 + ...
represented by an array c[] within the code. Then you can e.g. implement multiplication
d = c * (y-x[i])/(x[k]-x[i])
coefficient-wise (writing m for the coefficient index, to keep it apart from the node index i) like
d[m] = -c[m]*x[i]/(x[k]-x[i]) + c[m-1]/(x[k]-x[i])
The same way you have to implement addition and assignments on a component basis.
The result will then always be the coefficients of your series representation in the variable y.
Just a few comments in addition to the existing responses.
The exercise is: Find Lagrange's polynomial approximation for y(x)=cos(π x), x ∈ [-1,1] using 5 points (x = -1, -0.5, 0, 0.5, and 1).
The first thing that your main() does is to ask for the degree of the polynomial. You should not be doing that. The degree of the polynomial is fully specified by the number of control points. In this case you should be constructing the unique fourth-order Lagrange polynomial that passes through the five points (xi, cos(π xi)), where the xi values are those five specified points.
const double pi=3.1415;
This value is not good for a float, let alone a double. You should be using something like const double pi=3.14159265358979323846264338327950288;
Or better yet, don't use pi at all. You should know exactly what the y values are that correspond to the given x values. What are cos(-π), cos(-π/2), cos(0), cos(π/2), and cos(π)?

Create sine lookup table in C++

How can I rewrite the following pseudocode in C++?
real array sine_table[-1000..1000]
for x from -1000 to 1000
sine_table[x] := sine(pi * x / 1000)
I need to create a sine_table lookup table.
You can reduce the size of your table to 25% of the original by only storing values for the first quadrant, i.e. for x in [0,pi/2].
To do that your lookup routine just needs to map all values of x to the first quadrant using simple trig identities:
sin(x) = - sin(-x), to map from quadrant IV to I
sin(x) = sin(pi - x), to map from quadrant II to I
To map from quadrant III to I, apply both identities, i.e. sin(x) = - sin (pi + x)
Whether this strategy helps depends on how much memory usage matters in your case. But it seems wasteful to store four times as many values as you need just to avoid a comparison and subtraction or two during lookup.
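As a rough illustration for the table in the question (integer x from -1000 to 1000, value sin(pi * x / 1000)), only the entries for x = 0..500 need to be stored. The function names below are hypothetical.

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Stores sin(pi * i / 1000) for i = 0..500 only (first quadrant).
std::vector<double> build_quarter_table() {
    const double pi = 3.14159265358979323846;
    std::vector<double> q(501);
    for (int i = 0; i <= 500; ++i) {
        q[i] = std::sin(pi * i / 1000.0);
    }
    return q;
}

// Looks up sin(pi * x / 1000) for integer x in [-1000, 1000].
double table_sine(const std::vector<double>& q, int x) {
    int a = std::abs(x);          // sin(-t) = -sin(t), sign handled below
    if (a > 500) {
        a = 1000 - a;             // sin(t) = sin(pi - t)
    }
    return (x < 0) ? -q[a] : q[a];
}
```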
I second Jeremy's recommendation to measure whether building a table is better than just using std::sin(). Even with the original large table, you'll have to spend cycles during each table lookup to convert the argument to the closest increment of pi/1000, and you'll lose some accuracy in the process.
If you're really trying to trade accuracy for speed, you might try approximating the sin() function using just the first few terms of the Taylor series expansion.
sin(x) = x - x^3/3! + x^5/5! ..., where ^ represents raising to a power and ! represents the factorial.
Of course, for efficiency, you should precompute the factorials and make use of the lower powers of x to compute higher ones, e.g. use x^3 when computing x^5.
One final point, the truncated Taylor series above is more accurate for values closer to zero, so its still worthwhile to map to the first or fourth quadrant before computing the approximate sine.
Addendum:
Yet one more potential improvement based on two observations:
1. You can compute any trig function if you can compute both the sine and cosine in the first octant [0,pi/4]
2. The Taylor series expansion centered at zero is more accurate near zero
So if you decide to use a truncated Taylor series, then you can improve accuracy (or use fewer terms for similar accuracy) by mapping to either the sine or cosine to get the angle in the range [0,pi/4] using identities like sin(x) = cos(pi/2-x) and cos(x) = sin(pi/2-x) in addition to the ones above (for example, if x > pi/4 once you've mapped to the first quadrant.)
Or if you decide to use a table lookup for both the sine and cosine, you could get by with two smaller tables that only covered the range [0,pi/4] at the expense of another possible comparison and subtraction on lookup to map to the smaller range. Then you could either use less memory for the tables, or use the same memory but provide finer granularity and accuracy.
long double sine_table[2001];
for (int index = 0; index < 2001; index++)
{
sine_table[index] = std::sin(PI * (index - 1000) / 1000.0);
}
One more point: calling trigonometric functions is pricey. if you want to prepare the lookup table for sine with constant step - you may save the calculation time, in expense of some potential precision loss.
Consider your minimal step is "a". That is, you need sin(a), sin(2a), sin(3a), ...
Then you may do the following trick: First calculate sin(a) and cos(a). Then for every consecutive step use the following trigonometric equalities:
sin([n+1] * a) = sin(n*a) * cos(a) + cos(n*a) * sin(a)
cos([n+1] * a) = cos(n*a) * cos(a) - sin(n*a) * sin(a)
The drawback of this method is that during this procedure the round-off error is accumulated.
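A minimal sketch of this trick, with std::sin and std::cos called only once for the step a. The function name and the choice of starting the table at sin(0) are my own.

```cpp
#include <cmath>
#include <vector>

// Fills table[n] = sin(n * a) for n = 0..count-1 using the angle-addition
// formulas, so the library trig functions are evaluated only for the step a.
std::vector<double> sine_table_by_recurrence(double a, int count) {
    std::vector<double> table(count);
    const double sa = std::sin(a);
    const double ca = std::cos(a);
    double s = 0.0;   // sin(0 * a)
    double c = 1.0;   // cos(0 * a)
    for (int n = 0; n < count; ++n) {
        table[n] = s;
        const double s_next = s * ca + c * sa;   // sin((n+1) * a)
        c = c * ca - s * sa;                     // cos((n+1) * a)
        s = s_next;
    }
    return table;
}
```

For a table of about a thousand entries in double precision the accumulated error stays far below typical lookup-table quantization error, but for much longer tables it may be worth re-seeding with exact values every so often.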
double sine_table[1000] = {0};
for (int i = 1; i <= 1000; i++)
{
    sine_table[i-1] = std::sin(PI * i / 1000.0);
}
double getSineValue(int multipleOfPi){
    if(multipleOfPi == 0) return 0.0;
    int sign = 1;
    if(multipleOfPi < 0){
        sign = -1;
    }
    // sin(-x) = -sin(x): look up the absolute value and restore the sign
    return sign * sine_table[sign * multipleOfPi - 1];
}
You can reduce the array length to 500 by using the identity sin(pi/2 +/- angle) = cos(angle).
So store sin and cos from 0 to pi/4.
I don't remember the details off the top of my head, but it increased the speed of my program.
You'll want the std::sin() function from <cmath>.
another approximation from a book or something
streamin ramp;
streamout sine;
float x,rect,k,i,j;
x = ramp -0.5;
rect = x * (1 - x < 0 & 2);
k = (rect + 0.42493299) *(rect -0.5) * (rect - 0.92493302) ;
i = 0.436501 + (rect * (rect + 1.05802));
j = 1.21551 + (rect * (rect - 2.0580201));
sine = i*j*k*60.252201*x;
full discussion here:
http://synthmaker.co.uk/forum/viewtopic.php?f=4&t=6457&st=0&sk=t&sd=a
I presume that you know, that using a division is a lot slower than multiplying by decimal number, /5 is always slower than *0.2
it's just an approximation.
also:
streamin ramp;
streamin x; // 1.5 = Saw 3.142 = Sin 4.5 = SawSin
streamout sine;
float saw,saw2;
saw = (ramp * 2 - 1) * x;
saw2 = saw * saw;
sine = -0.166667 + saw2 * (0.00833333 + saw2 * (-0.000198409 + saw2 * (2.7526e-006+saw2 * -2.39e-008)));
sine = saw * (1+ saw2 * sine);