sin and cos are slow, is there an alternatve? - c++

My game needs to move by a certain angle. To do this I get the vector of the angle via sin and cos. Unfortunately sin and cos are my bottleneck. I'm sure I do not need this much precision. Is there an alternative to a C sin & cos and look-up table that is decently precise but very fast?
I had found this:
float Skeleton::fastSin( float x )
{
const float B = 4.0f/pi;
const float C = -4.0f/(pi*pi);
float y = B * x + C * x * abs(x);
const float P = 0.225f;
return P * (y * abs(y) - y) + y;
}
Unfortunately, this does not seem to work. I get significantly different behavior when I use this sin rather than C sin.
Thanks

A lookup table is the standard solution. You could Also use two lookup tables on for degrees and one for tenths of degrees and utilize sin(A + B) = sin(a)cos(b) + cos(A)sin(b)

For your fastSin(), you should check its documentation to see what range it's valid on. The units you're using for your game could be too big or too small and scaling them to fit within that function's expected range could make it work better.
EDIT:
Someone else mentioned getting it into the desired range by subtracting PI, but apparently there's a function called fmod for doing modulus division on floats/doubles, so this should do it:
#include <iostream>
#include <cmath>
float fastSin( float x ){
x = fmod(x + M_PI, M_PI * 2) - M_PI; // restrict x so that -M_PI < x < M_PI
const float B = 4.0f/M_PI;
const float C = -4.0f/(M_PI*M_PI);
float y = B * x + C * x * std::abs(x);
const float P = 0.225f;
return P * (y * std::abs(y) - y) + y;
}
int main() {
std::cout << fastSin(100.0) << '\n' << std::sin(100.0) << std::endl;
}
I have no idea how expensive fmod is though, so I'm going to try a quick benchmark next.
Benchmark Results
I compiled this with -O2 and ran the result with the Unix time program:
int main() {
float a = 0;
for(int i = 0; i < REPETITIONS; i++) {
a += sin(i); // or fastSin(i);
}
std::cout << a << std::endl;
}
The result is that sin is about 1.8x slower (if fastSin takes 5 seconds, sin takes 9). The accuracy also seemed to be pretty good.
If you chose to go this route, make sure to compile with optimization on (-O2 in gcc).

I know this is already an old topic, but for people who have the same question, here is a tip.
A lot of times in 2D and 3D rotation, all vectors are rotated with a fixed angle. In stead of calling the cos() or sin() every cycle of the loop, create variable before the loop which contains the value of cos(angle) or sin(angle) already. You can use this variable in your loop. This way the function only has to be called once.

If you rephrase the return in fastSin as
return (1-P) * y + P * (y * abs(y))
And rewrite y as (for x>0 )
y = 4 * x * (pi-x) / (pi * pi)
you can see that y is a parabolic first-order approximation to sin(x) chosen so that it passes through (0,0), (pi/2,1) and (pi,0), and is symmetrical about x=pi/2.
Thus we can only expect our function to be a good approximation from 0 to pi. If we want values outside that range we can use the 2-pi periodicity of sin(x) and that sin(x+pi) = -sin(x).
The y*abs(y) is a "correction term" which also passes through those three points. (I'm not sure why y*abs(y) is used rather than just y*y since y is positive in the 0-pi range).
This form of overall approximation function guarantees that a linear blend of the two functions y and y*y, (1-P)*y + P * y*y will also pass through (0,0), (pi/2,1) and (pi,0).
We might expect y to be a decent approximation to sin(x), but the hope is that by picking a good value for P we get a better approximation.
One question is "How was P chosen?". Personally, I'd chose the P that produced the least RMS error over the 0,pi/2 interval. (I'm not sure that's how this P was chosen though)
Minimizing this wrt. P gives
This can be rearranged and solved for p
Wolfram alpha evaluates the initial integral to be the quadratic
E = (16 π^5 p^2 - (96 π^5 + 100800 π^2 - 967680)p + 651 π^5 - 20160 π^2)/(1260 π^4)
which has a minimum of
min(E) = -11612160/π^9 + 2419200/π^7 - 126000/π^5 - 2304/π^4 + 224/π^2 + (169 π)/420
≈ 5.582129689596371e-07
at
p = 3 + 30240/π^5 - 3150/π^3
≈ 0.2248391013559825
Which is pretty close to the specified P=0.225.
You can raise the accuracy of the approximation by adding an additional correction term. giving a form something like return (1-a-b)*y + a y * abs(y) + b y * y * abs(y). I would find a and b by in the same way as above, this time giving a system of two linear equations in a and b to solve, rather than a single equation in p. I'm not going to do the derivation as it is tedious and the conversion to latex images is painful... ;)
NOTE: When answering another question I thought of another valid choice for P.
The problem is that using reflection to extend the curve into (-pi,0) leaves a kink in the curve at x=0. However, I suspect we can choose P such that the kink becomes smooth.
To do this take the left and right derivatives at x=0 and ensure they are equal. This gives an equation for P.

You can compute a table S of 256 values, from sin(0) to sin(2 * pi). Then, to pick sin(x), bring back x in [0, 2 * pi], you can pick 2 values S[a], S[b] from the table, such as a < x < b. From this, linear interpolation, and you should have a fair approximation
memory saving trick : you actually need to store only from [0, pi / 2], and use symmetries of sin(x)
enhancement trick : linear interpolation can be a problem because of non-smooth derivatives, humans eyes is good at spotting such glitches in animation and graphics. Use cubic interpolation then.

What about
x*(0.0174532925199433-8.650935142277599*10^-7*x^2)
for deg and
x*(1-0.162716259904269*x^2)
for rad on -45, 45 and -pi/4 , pi/4 respectively?

This (i.e. the fastsin function) is approximating the sine function using a parabola. I suspect it's only good for values between -π and +π. Fortunately, you can keep adding or subtracting 2π until you get into this range. (Edited to specify what is approximating the sine function using a parabola.)

you can use this aproximation.
this solution use a quadratic curve :
http://www.starming.com/index.php?action=plugin&v=wave&ajax=iframe&iframe=fullviewonepost&mid=56&tid=4825

Related

Speeding up a do-while loop with loop unrolling

I am trying to speed up code in a function that may be called many times over (maybe more than a million). The code has to do with setting two variables to random numbers and finding squared distance. My first idea for this is loop unrolling but I am a bit confused on how I should execute this because of the while condition that dictates it.
To speed up my program I have replaced the c++ built in rand() function with a custom one but I am stumped on how to make my program even faster.
do {
x = customRand();
y = customRand();
distance = x * x + y * y; // euclidean square distance
} while (distance >= 1.0);
You shouldn't expect to make your program faster by loop unrolling, because with a properly selected range ([-1, 1]) for the random number generator output your loop's body will be executed just once in more than 3/4 of the cases.
What you might want to do to help the compiler is to mark your while condition as "unlikely". For example, in GCC it would be:
#define unlikely(x) __builtin_expect((x),0)
do {
x = customRand();
y = customRand();
distance = x * x + y * y; // euclidean square distance
} while (unlikely(distance >= 1.0));
Still, even this is unlikely to speed up your code in a measurable way.
If you were about guaranteed time and not speed, then for an uniform random distribution within a circle, with customRand() uniformly distributed in [-1, 1]
r = std::sqrt(std::abs(customRand()));
t = M_PI * customRand();
x = r * std::cos(t);
y = r * std::sin(t);
would do the trick.
It smells like a math-based XY problem, but I will first focus on your question about loop speedup. I will still do some math though.
So basic speedup may use early continue if any of x or y is bigger or equal 1.0. There is no chance that x*x + y*y < 1.0 when one of x or y is bigger than 1.
do {
float x = customRand();
if (x >= 1.0) {
continue;
}
float y = customRand();
if (y >= 1.0) {
continue;
}
distance = x * x + y * y; // euclidean square distance
} while (distance >= 1.0);
Unfortunately, it most likely screws up modern CPUs branch prediction mechanism and still may not give big speedup. Benchmarks, as always, would be required.
Also, as the fastest code is the one which does not run, let's discuss the potential XY problem. You seem to generate point which is bounded by circle with radius 1.0. In Cartesian system it is tough task, but it is really easy in polar system, where each point in disk is described with radius and angle and can be easily converted to Cartesian. I don't know what distribution do you expect over disk area, but see answers for this problem here: Generate a random point within a circle (uniformly)
PS. There are some valid comments about <random>, which as std library may lead to some speedup as well.

Is there any "standard" way to calculate the numerical gradient?

I am trying to calculate the numerical gradient of a smooth function in c++. And the parameter value could vary from zero to a very large number(maybe 1e10 to 1e20?)
I used the function f(x,y) = 10*x^3 + y^3 as a testbench, but I found that if x or y is too large, I can't get correct gradient.
Here is my code to calculate the graidient:
#include <iostream>
#include <cmath>
#include <cassert>
using namespace std;
double f(double x, double y)
{
// black box expensive function
return 10 * pow(x, 3) + pow(y, 3);
}
int main()
{
// double x = -5897182590.8347721;
// double y = 269857217.0017581;
double x = 1.13041e+19;
double y = -5.49756e+14;
const double epsi = 1e-4;
double f1 = f(x, y);
double f2 = f(x, y+epsi);
double f3 = f(x, y-epsi);
cout << f1 << endl;
cout << f2 << endl;
cout << f3 << endl;
cout << f1 - f2 << endl; // 0
cout << f2 - f3 << endl; // 0
return 0;
}
If I use the above code to calculate the gradient, the gradient would be zero!
The testbench function, 10*x^3 + y^3, is just a demo, the real problem I need to solve is actually a black box function.
So, is there any "standard" way to calculate the numerical gradient?
In the first place, you should use the central difference scheme, which is more accurate (by cancellation of one more term of the Taylor develoment).
(f(x + h) - f(x - h)) / 2h
rather than
(f(x + h) - f(x)) / h
Then the choice of h is critical and using a fixed constant is the worst thing you can do. Because for small x, h will be too large so that the approximation formula no more works, and for large x, h will be too small, resulting in severe truncation error.
A much better choice is to take a relative value, h = x√ε, where ε is the machine epsilon (1 ulp), which gives a good tradeoff.
(f(x(1 + √ε)) - f(x(1 - √ε))) / 2x√ε
Beware that when x = 0, a relative value cannot work and you need to fall back to a constant. But then, nothing tells you which to use !
You need to consider the precision needed.
At first glance, since |y| = 5.49756e14 and epsi = 1e-4, you need at least ⌈log2(5.49756e14)-log2(1e-4)⌉ = 63 bits of significand precision (that is the number of bits used to encode the digits of your number, also known as mantissa) for y and y+epsi to be considered different.
The double-precision floating-point format only has 53 bits of significand precision (assuming it is 8 bytes). So, currently, f1, f2 and f3 are exactly the same because y, y+epsi and y-epsi are equal.
Now, let's consider the limit : y = 1e20, and the result of your function, 10x^3 + y^3. Let's ignore x for now, so let's take f = y^3. Now we can calculate the precision needed for f(y) and f(y+epsi) to be different : f(y) = 1e60 and f(epsi) = 1e-12. This gives a minimum significand precision of ⌈log2(1e60)-log2(1e-12)⌉ = 240 bits.
Even if you were to use the long double type, assuming it is 16 bytes, your results would not differ : f1, f2 and f3 would still be equal, even though y and y+epsi would not.
If we take x into account, the maximum value of f would be 11e60 (with x = y = 1e20). So the upper limit on precision is ⌈log2(11e60)-log2(1e-12)⌉ = 243 bits, or at least 31 bytes.
One way to solve your problem is to use another type, maybe a bignum used as fixed-point.
Another way is to rethink your problem and deal with it differently. Ultimately, what you want is f1 - f2. You can try to decompose f(y+epsi). Again, if you ignore x, f(y+epsi) = (y+epsi)^3 = y^3 + 3*y^2*epsi + 3*y*epsi^2 + epsi^3. So f(y+epsi) - f(y) = 3*y^2*epsi + 3*y*epsi^2 + epsi^3.
The only way to calculate gradient is calculus.
Gradient is a vector:
g(x, y) = Df/Dx i + Df/Dy j
where (i, j) are unit vectors in x and y directions, respectively.
One way to approximate derivatives is first order differences:
Df/Dx ~ (f(x2, y)-f(x1, y))/(x2-x1)
and
Df/Dy ~ (f(x, y2)-f(x, y1))/(y2-y1)
That doesn't look like what you're doing.
You have a closed form expression:
g(x, y) = 30*x^2 i + 3*y^2 j
You can plug in values for (x, y) and calculate the gradient exactly at any point. Compare that to your differences and see how well your approximation is doing.
How you implement it numerically is your responsibility. (10^19)^3 = 10^57, right?
What is the size of double on your machine? Is it a 64 bit IEEE double precision floating point number?
Use
dx = (1+abs(x))*eps, dfdx = (f(x+dx,y) - f(x,y)) / dx
dy = (1+abs(y))*eps, dfdy = (f(x,y+dy) - f(x,y)) / dy
to get meaningful step sizes for large arguments.
Use eps = 1e-8 for one-sided difference formulas, eps = 1e-5 for central difference quotients.
Explore automatic differentiation (see autodiff.org) for derivatives without difference quotients and thus much smaller numerical errors.
We can examine the behaviour of the error in the derivative using the following program - it calculates the 1-sided derivative and the central difference based derivative using a varying step size. Here I'm using x and y ~ 10^10, which is smaller than what you were using, but should illustrate the same point.
#include <iostream>
#include <cmath>
#include <cassert>
using namespace std;
double f(double x, double y) {
return 10 * pow(x, 3) + pow(y, 3);
}
double f_x(double x, double y) {
return 3 * 10 * pow(x,2);
}
double f_y(double x, double y) {
return 3 * pow(y,2);
}
int main()
{
// double x = -5897182590.8347721;
// double y = 269857217.0017581;
double x = 1.13041e+10;
double y = -5.49756e+10;
//double x = 10.1;
//double y = -5.2;
double epsi = 1e8;
for(int i=0; i<60; ++i) {
double dfx_n = (f(x+epsi,y) - f(x,y))/epsi;
double dfx_cd = (f(x+epsi,y) - f(x-epsi,y))/(2*epsi);
double dfx = f_x(x,y);
cout<<epsi<<" "<<fabs(dfx-dfx_n)<<" "<<fabs(dfx - dfx_cd)<<std::endl;
epsi/=1.5;
}
return 0;
}
The output shows that a 1-sided difference gets us an optimal error of about 1.37034e+13 at a step length of about 100.0. Note that while this error looks large, as a relative error it is 3.5746632302764072e-09 (since the exact value is 3.833e+21)
In comparison the 2-sided difference gets an optimal error of about 1.89493e+10 with a step size of about 45109.3. This is three-orders of magnitude better, (with a much larger step-size).
How can we work out the step size? The link in the comments of Yves Daosts answer gives us a ballpark value:
h=x_c sqrt(eps) for 1-Sided, and h=x_c cbrt(eps) for 2-Sided.
But either way, if the required step size for decent accuracy at x ~ 10^10 is 100.0, the required step size with x ~ 10^20 is going to be 10^10 larger too. So the problem is simply that your step size is way too small.
This can be verified by increasing the starting step-size in the above code and resetting the x/y values to the original values.
Then expected derivative is O(1e39), best 1-sided error of about O(1e31) occurs near a step length of 5.9e10, best 2-sided error of about O(1e29) occurs near a step length of 6.1e13.
As numerical differentiation is ill conditioned (which means a small error could alter your result significantly) you should consider to use Cauchy's integral formula. This way you can calculate the n-th derivative with an integral. This will lead to less problems with considering accuracy and stability.

What is the optimum epsilon/dx value to use within the finite difference method?

double MyClass::dx = ?????;
double MyClass::f(double x)
{
return 3.0*x*x*x - 2.0*x*x + x - 5.0;
}
double MyClass::fp(double x) // derivative of f(x), that is f'(x)
{
return (f(x + dx) - f(x)) / dx;
}
When using finite difference method for derivation, it is critical to choose an optimum dx value. Mathematically, dx must be as small as possible. However, I'm not sure if it is a correct choice to choose it the smallest positive double precision number (i.e.; 2.2250738585072014 x 10−308).
Is there an optimal numeric interval or exact value to choose a dx in to make the calculation error as small as possible?
(I'm using 64-bit compiler. I will run my program on a Intel i5 processor.)
Choosing the smallest possible value is almost certainly wrong: if dx were that smallest number, then f(x + dx) would be exactly equal to f(x) due to rounding.
So you have a tradeoff: Choose dx too small, and you lose precision to rounding errors. Choose it too large, and your result will be imprecise due to changes in the derivative as x changes.
To judge the numeric errors, consider (f(x + dx) - f(x))/f(x)1 mathematically. The numerator denotes the difference you want to compute, but the denominator denotes the magnitude of numbers you're dealing with. If that fraction is about 2‒k, then you can expect approximately k bits of precision in your result.
If you know your function, you can compute what error you'd get from choosing dx too large. You can then balence things, so that the error incurred from this is about the same as the error incurred from rounding. But if you know the function, you might be better off by providing a function that directly computes the derivative, like in your example with the polygonal f.
The Wikipedia section that pogorskiy pointed out suggests a value of sqrt(ε)x, or approximately 1.5e-8 * x. Without any more detailed knowledge about the function, such a rule of thumb will provide a reasonable default. Also note that that same section suggests not dividing by dx, but instead by (x + dx) - x, as this takes rounding errors incurred by computing x + dx into account. But I guess that whole article is full of suggestions you might use.
1 This formula really should divide by f(x), not by dx, even though a past editor thought differently. I'm attempting to compare the amount of significant bits remaining after the division, not the slope of the tangent.
Why not just use the Power Rule to derive the derivative, you'll get an exact answer:
f(x) = 3x^3 - 2x^2 + x - 5
f'(x) = 9x^2 - 4x + 1
Therefore:
f(x) = 3.0 * x * x * x - 2.0 * x * x + x - 5.0
fp(x) = 9.0 * x * x - 4.0 * x + 1.0

lagrange approximation -c++

I updated the code.
What i am trying to do is to hold every lagrange's coefficient values in pointer d.(for example for L1(x) d[0] would be "x-x2/x1-x2" ,d1 would be (x-x2/x1-x2)*(x-x3/x1-x3) etc.
My problem is
1) how to initialize d ( i did d[0]=(z-x[i])/(x[k]-x[i]) but i think it's not right the "d[0]"
2) how to initialize L_coeff. ( i am using L_coeff=new double[0] but am not sure if it's right.
The exercise is:
Find Lagrange's polynomial approximation for y(x)=cos(π x), x ∈−1,1 using 5 points
(x = -1, -0.5, 0, 0.5, and 1).
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cmath>
using namespace std;
const double pi=3.14159265358979323846264338327950288;
// my function
double f(double x){
return (cos(pi*x));
}
//function to compute lagrange polynomial
double lagrange_polynomial(int N,double *x){
//N = degree of polynomial
double z,y;
double *L_coeff=new double [0];//L_coefficients of every Lagrange L_coefficient
double *d;//hold the polynomials values for every Lagrange coefficient
int k,i;
//computations for finding lagrange polynomial
//double sum=0;
for (k=0;k<N+1;k++){
for ( i=0;i<N+1;i++){
if (i==0) continue;
d[0]=(z-x[i])/(x[k]-x[i]);//initialization
if (i==k) L_coeff[k]=1.0;
else if (i!=k){
L_coeff[k]*=d[i];
}
}
cout <<"\nL("<<k<<") = "<<d[i]<<"\t\t\tf(x)= "<<f(x[k])<<endl;
}
}
int main()
{
double deg,result;
double *x;
cout <<"Give the degree of the polynomial :"<<endl;
cin >>deg;
for (int i=0;i<deg+1;i++){
cout <<"\nGive the points of interpolation : "<<endl;
cin >> x[i];
}
cout <<"\nThe Lagrange L_coefficients are: "<<endl;
result=lagrange_polynomial(deg,x);
return 0;
}
Here is an example of lagrange polynomial
As this seems to be homework, I am not going to give you an exhaustive answer, but rather try to send you on the right track.
How do you represent polynomials in a computer software? The intuitive version you want to archive as a symbolic expression like 3x^3+5x^2-4 is very unpractical for further computations.
The polynomial is defined fully by saving (and outputting) it's coefficients.
What you are doing above is hoping that C++ does some algebraic manipulations for you and simplify your product with a symbolic variable. This is nothing C++ can do without quite a lot of effort.
You have two options:
Either use a proper computer algebra system that can do symbolic manipulations (Maple or Mathematica are some examples)
If you are bound to C++ you have to think a bit more how the single coefficients of the polynomial can be computed. You programs output can only be a list of numbers (which you could, of course, format as a nice looking string according to a symbolic expression).
Hope this gives you some ideas how to start.
Edit 1
You still have an undefined expression in your code, as you never set any value to y. This leaves prod*=(y-x[i])/(x[k]-x[i]) as an expression that will not return meaningful data. C++ can only work with numbers, and y is no number for you right now, but you think of it as symbol.
You could evaluate the lagrange approximation at, say the value 1, if you would set y=1 in your code. This would give you the (as far as I can see right now) correct function value, but no description of the function itself.
Maybe you should take a pen and a piece of paper first and try to write down the expression as precise Math. Try to get a real grip on what you want to compute. If you did that, maybe you come back here and tell us your thoughts. This should help you to understand what is going on in there.
And always remember: C++ needs numbers, not symbols. Whenever you have a symbol in an expression on your piece of paper that you do not know the value of you can either find a way how to compute the value out of the known values or you have to eliminate the need to compute using this symbol.
P.S.: It is not considered good style to post identical questions in multiple discussion boards at once...
Edit 2
Now you evaluate the function at point y=0.3. This is the way to go if you want to evaluate the polynomial. However, as you stated, you want all coefficients of the polynomial.
Again, I still feel you did not understand the math behind the problem. Maybe I will give you a small example. I am going to use the notation as it is used in the wikipedia article.
Suppose we had k=2 and x=-1, 1. Furthermore, let my just name your cos-Function f, for simplicity. (The notation will get rather ugly without latex...) Then the lagrangian polynomial is defined as
f(x_0) * l_0(x) + f(x_1)*l_1(x)
where (by doing the simplifications again symbolically)
l_0(x)= (x - x_1)/(x_0 - x_1) = -1/2 * (x-1) = -1/2 *x + 1/2
l_1(x)= (x - x_0)/(x_1 - x_0) = 1/2 * (x+1) = 1/2 * x + 1/2
So, you lagrangian polynomial is
f(x_0) * (-1/2 *x + 1/2) + f(x_1) * 1/2 * x + 1/2
= 1/2 * (f(x_1) - f(x_0)) * x + 1/2 * (f(x_0) + f(x_1))
So, the coefficients you want to compute would be 1/2 * (f(x_1) - f(x_0)) and 1/2 * (f(x_0) + f(x_1)).
Your task is now to find an algorithm that does the simplification I did, but without using symbols. If you know how to compute the coefficients of the l_j, you are basically done, as you then just can add up those multiplied with the corresponding value of f.
So, even further broken down, you have to find a way to multiply the quotients in the l_j with each other on a component-by-component basis. Figure out how this is done and you are a nearly done.
Edit 3
Okay, lets get a little bit less vague.
We first want to compute the L_i(x). Those are just products of linear functions. As said before, we have to represent each polynomial as an array of coefficients. For good style, I will use std::vector instead of this array. Then, we could define the data structure holding the coefficients of L_1(x) like this:
std::vector L1 = std::vector(5);
// Lets assume our polynomial would then have the form
// L1[0] + L2[1]*x^1 + L2[2]*x^2 + L2[3]*x^3 + L2[4]*x^4
Now we want to fill this polynomial with values.
// First we have start with the polynomial 1 (which is of degree 0)
// Therefore set L1 accordingly:
L1[0] = 1;
L1[1] = 0; L1[2] = 0; L1[3] = 0; L1[4] = 0;
// Of course you could do this more elegant (using std::vectors constructor, for example)
for (int i = 0; i < N+1; ++i) {
if (i==0) continue; /// For i=0, there will be no polynomial multiplication
// Otherwise, we have to multiply L1 with the polynomial
// (x - x[i]) / (x[0] - x[i])
// First, note that (x[0] - x[i]) ist just a scalar; we will save it:
double c = (x[0] - x[i]);
// Now we multiply L_1 first with (x-x[1]). How does this multiplication change our
// coefficients? Easy enough: The coefficient of x^1 for example is just
// L1[0] - L1[1] * x[1]. Other coefficients are done similary. Futhermore, we have
// to divide by c, which leaves our coefficient as
// (L1[0] - L1[1] * x[1])/c. Let's apply this to the vector:
L1[4] = (L1[3] - L1[4] * x[1])/c;
L1[3] = (L1[2] - L1[3] * x[1])/c;
L1[2] = (L1[1] - L1[2] * x[1])/c;
L1[1] = (L1[0] - L1[1] * x[1])/c;
L1[0] = ( - L1[0] * x[1])/c;
// There we are, polynomial updated.
}
This, of course, has to be done for all L_i Afterwards, the L_i have to be added and multiplied with the function. That is for you to figure out. (Note that I made quite a lot of inefficient stuff up there, but I hope this helps you understanding the details better.)
Hopefully this gives you some idea how you could proceed.
The variable y is actually not a variable in your code but represents the variable P(y) of your lagrange approximation.
Thus, you have to understand the calculations prod*=(y-x[i])/(x[k]-x[i]) and sum+=prod*f not directly but symbolically.
You may get around this by defining your approximation by a series
c[0] * y^0 + c[1] * y^1 + ...
represented by an array c[] within the code. Then you can e.g. implement multiplication
d = c * (y-x[i])/(x[k]-x[i])
coefficient-wise like
d[i] = -c[i]*x[i]/(x[k]-x[i]) + c[i-1]/(x[k]-x[i])
The same way you have to implement addition and assignments on a component basis.
The result will then always be the coefficients of your series representation in the variable y.
Just a few comments in addition to the existing responses.
The exercise is: Find Lagrange's polynomial approximation for y(x)=cos(π x), x ∈ [-1,1] using 5 points (x = -1, -0.5, 0, 0.5, and 1).
The first thing that your main() does is to ask for the degree of the polynomial. You should not be doing that. The degree of the polynomial is fully specified by the number of control points. In this case you should be constructing the unique fourth-order Lagrange polynomial that passes through the five points (xi, cos(π xi)), where the xi values are those five specified points.
const double pi=3.1415;
This value is not good for a float, let alone a double. You should be using something like const double pi=3.14159265358979323846264338327950288;
Or better yet, don't use pi at all. You should know exactly what the y values are that correspond to the given x values. What are cos(-π), cos(-π/2), cos(0), cos(π/2), and cos(π)?

Create sine lookup table in C++

How can I rewrite the following pseudocode in C++?
real array sine_table[-1000..1000]
for x from -1000 to 1000
sine_table[x] := sine(pi * x / 1000)
I need to create a sine_table lookup table.
You can reduce the size of your table to 25% of the original by only storing values for the first quadrant, i.e. for x in [0,pi/2].
To do that your lookup routine just needs to map all values of x to the first quadrant using simple trig identities:
sin(x) = - sin(-x), to map from quadrant IV to I
sin(x) = sin(pi - x), to map from quadrant II to I
To map from quadrant III to I, apply both identities, i.e. sin(x) = - sin (pi + x)
Whether this strategy helps depends on how much memory usage matters in your case. But it seems wasteful to store four times as many values as you need just to avoid a comparison and subtraction or two during lookup.
I second Jeremy's recommendation to measure whether building a table is better than just using std::sin(). Even with the original large table, you'll have to spend cycles during each table lookup to convert the argument to the closest increment of pi/1000, and you'll lose some accuracy in the process.
If you're really trying to trade accuracy for speed, you might try approximating the sin() function using just the first few terms of the Taylor series expansion.
sin(x) = x - x^3/3! + x^5/5! ..., where ^ represents raising to a power and ! represents the factorial.
Of course, for efficiency, you should precompute the factorials and make use of the lower powers of x to compute higher ones, e.g. use x^3 when computing x^5.
One final point, the truncated Taylor series above is more accurate for values closer to zero, so its still worthwhile to map to the first or fourth quadrant before computing the approximate sine.
Addendum:
Yet one more potential improvement based on two observations:
1. You can compute any trig function if you can compute both the sine and cosine in the first octant [0,pi/4]
2. The Taylor series expansion centered at zero is more accurate near zero
So if you decide to use a truncated Taylor series, then you can improve accuracy (or use fewer terms for similar accuracy) by mapping to either the sine or cosine to get the angle in the range [0,pi/4] using identities like sin(x) = cos(pi/2-x) and cos(x) = sin(pi/2-x) in addition to the ones above (for example, if x > pi/4 once you've mapped to the first quadrant.)
Or if you decide to use a table lookup for both the sine and cosine, you could get by with two smaller tables that only covered the range [0,pi/4] at the expense of another possible comparison and subtraction on lookup to map to the smaller range. Then you could either use less memory for the tables, or use the same memory but provide finer granularity and accuracy.
long double sine_table[2001];
for (int index = 0; index < 2001; index++)
{
sine_table[index] = std::sin(PI * (index - 1000) / 1000.0);
}
One more point: calling trigonometric functions is pricey. if you want to prepare the lookup table for sine with constant step - you may save the calculation time, in expense of some potential precision loss.
Consider your minimal step is "a". That is, you need sin(a), sin(2a), sin(3a), ...
Then you may do the following trick: First calculate sin(a) and cos(a). Then for every consecutive step use the following trigonometric equalities:
sin([n+1] * a) = sin(n*a) * cos(a) + cos(n*a) * sin(a)
cos([n+1] * a) = cos(n*a) * cos(a) - sin(n*a) * sin(a)
The drawback of this method is that during this procedure the round-off error is accumulated.
double table[1000] = {0};
for (int i = 1; i <= 1000; i++)
{
sine_table[i-1] = std::sin(PI * i/ 1000.0);
}
double getSineValue(int multipleOfPi){
if(multipleOfPi == 0) return 0.0;
int sign = 1;
if(multipleOfPi < 0){
sign = -1;
}
return signsine_table[signmultipleOfPi - 1];
}
You can reduce the array length to 500, by a trick sin(pi/2 +/- angle) = +/- cos(angle).
So store sin and cos from 0 to pi/4.
I don't remember from top of my head but it increased the speed of my program.
You'll want the std::sin() function from <cmath>.
another approximation from a book or something
streamin ramp;
streamout sine;
float x,rect,k,i,j;
x = ramp -0.5;
rect = x * (1 - x < 0 & 2);
k = (rect + 0.42493299) *(rect -0.5) * (rect - 0.92493302) ;
i = 0.436501 + (rect * (rect + 1.05802));
j = 1.21551 + (rect * (rect - 2.0580201));
sine = i*j*k*60.252201*x;
full discussion here:
http://synthmaker.co.uk/forum/viewtopic.php?f=4&t=6457&st=0&sk=t&sd=a
I presume that you know, that using a division is a lot slower than multiplying by decimal number, /5 is always slower than *0.2
it's just an approximation.
also:
streamin ramp;
streamin x; // 1.5 = Saw 3.142 = Sin 4.5 = SawSin
streamout sine;
float saw,saw2;
saw = (ramp * 2 - 1) * x;
saw2 = saw * saw;
sine = -0.166667 + saw2 * (0.00833333 + saw2 * (-0.000198409 + saw2 * (2.7526e-006+saw2 * -2.39e-008)));
sine = saw * (1+ saw2 * sine);