I have a scientific code that uses both sine and cosine of the same argument (I basically need the complex exponential of that argument). I was wondering if it were possible to do this faster than calling sine and cosine functions separately.
Also I only need about 0.1% precision. So is there any way I can find the default trig functions and truncate the power series for speed?
One other thing I have in mind is, is there any way to perform the remainder operation such that the result is always positive? In my own algorithm I used x=fmod(x,2*pi); but then I would need to add 2pi if x is negative (smaller domain means I can use a shorter power series)
EDIT: LUT turned out to be the best approach for this, however I am glad I learned about other approximation techniques. I will also advise using an explicit midpoint approximation. This is what I ended up doing:
const int N = 10000;//about 3e-4 error for 1000//3e-5 for 10 000//3e-6 for 100 000
double *cs = new double[N];
double *sn = new double[N];
for(int i =0;i<N;i++){
double A= (i+0.5)*2*pi/N;
cs[i]=cos(A);
sn[i]=sin(A);
}
The following part approximates (midpoint) sincos(2*pi*(wc2+t[j]*(cotp*t[j]-wc)))
double A=(wc2+t[j]*(cotp*t[j]-wc));
int B =(int)N*(A-floor(A));
re += cs[B]*f[j];
im += sn[B]*f[j];
Another approach could have been using the chebyshev decomposition. You can use the orthogonality property to find the coefficients. Optimized for exponential, it looks like this:
double fastsin(double x){
x=x-floor(x/2/pi)*2*pi-pi;//this line can be improved, both inside this
//function and before you input it into the function
double x2 = x*x;
return (((0.00015025063885163012*x2-
0.008034350857376128)*x2+ 0.1659789684145034)*x2-0.9995812174943602)*x;} //7th order chebyshev approx
If you seek fast evaluation with good (but not high) accuracy with powerseries you should use an expansion in Chebyshev polynomials: tabulate the coefficients (you'll need VERY few for 0.1% accuracy) and evaluate the expansion with the recursion relations for these polynomials (it's really very easy).
References:
Tabulated coefficients: http://www.ams.org/mcom/1980-34-149/S0025-5718-1980-0551302-5/S0025-5718-1980-0551302-5.pdf
Evaluation of chebyshev expansion: https://en.wikipedia.org/wiki/Chebyshev_polynomials
You'll need to (a) get the "reduced" argument in the range -pi/2..+pi/2 and consequently then (b) handle the sign in your results when the argument actually should have been in the "other" half of the full elementary interval -pi..+pi. These aspects should not pose a major problem:
determine (and "remember" as an integer 1 or -1) the sign in the original angle and proceed with the absolute value.
use a modulo function to reduce to the interval 0..2PI
Determine (and "remember" as an integer 1 or -1) whether it is in the "second" half and, if so, subtract pi*3/2, otherwise subtract pi/2. Note: this effectively interchanges sine and cosine (apart from signs); take this into account in the final evaluation.
This completes the step to get an angle in -pi/2..+pi/2
After evaluating sine and cosine with the Cheb-expansions, apply the "flags" of steps 1 and 3 above to get the right signs in the values.
Just create a lookup table. The following will let you lookup the sin and cos of any radian value between -2PI and 2PI.
// LOOK UP TABLE
var LUT_SIN_COS = [];
var N = 14400;
var HALF_N = N >> 1;
var STEP = 4 * Math.PI / N;
var INV_STEP = 1 / STEP;
// BUILD LUT
for(var i=0, r = -2*Math.PI; i < N; i++, r += STEP) {
LUT_SIN_COS[2*i] = Math.sin(r);
LUT_SIN_COS[2*i + 1] = Math.cos(r);
}
You index into the lookup table by:
var index = ((r * INV_STEP) + HALF_N) << 1;
var sin = LUT_SIN_COS[index];
var cos = LUT_SIN_COS[index + 1];
Here's a fiddle that displays the % error you can expect from different sized LUTS http://jsfiddle.net/77h6tvhj/
EDIT Here's an ideone (c++) with a ~benchmark~ vs the float sin and cos. http://ideone.com/SGrFVG For whatever a benchmark on ideone.com is worth the LUT is 5 times faster.
One way to go would be to learn how to implement the CORDIC algorithm. It is not difficult and pretty interesting intelectually. This gives you both the cosine and the sine. Wikipedia gives a MATLAB example that should be easy to adapt in C++.
Note that you can augment speed and reduce precision simply by lowering the parameter n.
About your second question, it has already been asked here (in C). It seems that there is no simple way.
You can also calculate sine using a square root, given the angle and the cosine.
The example below assumes the angle ranges from 0 to 2π:
double c = cos(angle);
double s = sqrt(1.0-c*c);
if(angle>pi)s=-s;
For single-precision floats, Microsoft uses 11-degree polynomial approximation for sine, 10-degree for cosine: XMScalarSinCos.
They also have faster version, XMScalarSinCosEst, that uses lower-degree polynomials.
If you aren’t on Windows, you’ll find same code + coefficients on geometrictools.com under Boost license.
Related
as i said, i want implement my own double precision cos() function in a compute shader with GLSL, because there is just a built-in version for float.
This is my code:
double faculty[41];//values are calculated at the beginning of main()
double myCOS(double x)
{
double sum,tempExp,sign;
sum = 1.0;
tempExp = 1.0;
sign = -1.0;
for(int i = 1; i <= 30; i++)
{
tempExp *= x;
if(i % 2 == 0){
sum = sum + (sign * (tempExp / faculty[i]));
sign *= -1.0;
}
}
return sum;
}
The result of this code is, that the sum turns out to be NaN on the shader, but on the CPU the algorithm is working well.
I tried to debug this code too and I got the following information:
faculty[i] is positive and not zero for all entries
tempExp is positive in each step
none of the other variables are NaN during each step
the first time sum is NaN is at the step with i=4
and now my question: What exactly can go wrong if each variable is a number and nothing is divided by zero especially when the algorithm works on the CPU?
Let me guess:
First you determined the problem is in the loop, and you use only the following operations: +, *, /.
The rules for generating NaN from these operations are:
The divisions 0/0 and ±∞/±∞
The multiplications 0×±∞ and ±∞×0
The additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
You ruled out the possibility for 0/0 and ±∞/±∞ by stating that faculty[] is correctly initialized.
The variable sign is always 1.0 or -1.0 so it cannot generate the NaN through the * operation.
What remains is the + operation if tempExp ever become ±∞.
So probably tempExp is too high on entry of your function and becomes ±∞, this will make sum to be ±∞ too. At the next iteration you will trigger the NaN generating operation through: ∞ + (−∞). This is because you multiply one side of the addition by sign and sign switches between positive and negative at each iteration.
You're trying to approximate cos(x) around 0.0. So you should use the properties of the cos() function to reduce your input value to a value near 0.0. Ideally in the range [0, pi/4]. For instance, remove multiples of 2*pi, and get the values of cos() in [pi/4, pi/2] by computing sin(x) around 0.0 and so on.
What can go dramatically wrong is a loss of precision. cos(x) usually is implemented by range reduction followed by a dedicated implementation for the range [0, pi/2]. Range reduction uses cos(x+2*pi) = cos(x). But this range reduction isn't perfect. For starters, pi cannot be exactly represented in finite math.
Now what happens if you try something as absurd as cos(1<<30) ? It's quite possible that the range reduction algorithm introduces an error in x that's larger than 2*pi, in which case the outcome is meaningless. Returning NaN in such cases is reasonable.
I am rendering 500x500 points in real-time.
I have to compute the position of points using atan() and sin() functions. By using atan() and sin() I am getting 24 fps (frames per second).
float thetaC = atan(value);
float h = (value) / (sin(thetaC)));
If I don't use sin() I am getting 52 fps.
and if I dont use atan() I am 30 fps.
So, the big problem is with sin(). How can I use Fast Sin version. Can I create a Look Up Table for that ? I don't have any specific values to create LUT. what can I do in this situation ?
PS: I have also tried fast sin function of ASM but not getting any difference.
Thanks.
Hang on a second....
You have a triangle, you're computing the hypoteneuse. First, you're taking atan(value) to get the angle, and then using value again with sin to compute h. So we have the scenario where one side of the triangle is 1:
/|
h / | value
/ |
/C__|
1
All you really need to do is calculate h = sqrt(value*value + 1); ... But then, sqrt isn't the fastest function around either.
Perhaps I've missed the point or you've left something out. I've always used lookup tables for sin and cos, and found them to be fast. If you don't know the values ahead of time then you need to approximate, but this means a multiplication, truncation to integer (and possibly sign conversion) in order to get the array index. If you can convert your units to work in integers (effectively making your floats into fixed-point), it makes the lookup even quicker.
It depends on the accuracy that you need. The maximum derivative of sin is 1, so if if x1 and x2 are within epsilon of one another, then sin(x1) and sin(x2) are also within epsilon. If you just need accuracy to within, say 0.001, then you can create a lookup table of 1000 * PI = 3142 points, and just look up the value closest to the one you need. This can be faster than what the native code does, since the native code (probably) uses a lookup table for polynomial coefficients, and then interpolates, and since this table can be small enough to stay in cache easily.
If you need complete accuracy over the whole range, then there's probably nothing better that you can do.
If you wanted, you could also create a lookup table over (1/sin(x)), since that's your actual function of interest. Either way, you'll want to be careful around sin(x) = 0, since a small error in sin(x) can cause a big error in 1/sin(x). Defining your error tolerance is important for figuring out what shortcuts you can take.
You'd create the lookup table with something like:
float *table = malloc(1000 * sizeof(float));
for(int i = 0; i < 1000; i++){
table[i] = sin(i/1000.0);
}
and would access it something like
float fastSin(float x){
int index = x * 1000.0;
return table[index];
}
This code isn't complete (and will crash for anything outside of 0 < x < 1, because of array bounds), but should get you started.
For sin (but not atan) you can actually get simpler than a table--just create
float sin_arr[31416]; //Or as much precision as you need
for (int i=0; i<31416; ++i)
sin_arr[i] = sin( i / 10000.0 );
//...
float h = (value) / sin_arr[ (int)(thetaC*10000.0) % 31416 ];
My guess is that this will give you a speed improvement.
I want to integrate a probability density function from (-\infty, a] because the cdf is not available in closed form. But I'm not sure how to do this in C++.
This task is pretty simple in Mathematica; All I need to do is define the function,
f[x_, lambda_, alpha_, beta_, mu_] :=
Module[{gamma},
gamma = Sqrt[alpha^2 - beta^2];
(gamma^(2*lambda)/((2*alpha)^(lambda - 1/2)*Sqrt[Pi]*Gamma[lambda]))*
Abs[x - mu]^(lambda - 1/2)*
BesselK[lambda - 1/2, alpha Abs[x - mu]] E^(beta (x - mu))
];
and then call the NIntegrate Routine to numerically integrate it.
F[x_, lambda_, alpha_, beta_, mu_] :=
NIntegrate[f[t, lambda, alpha, beta, mu], {t, -\[Infinity], x}]
Now I want to achieve the same thing in C++. I using the routine gsl_integration_qagil from the gsl numerics library. It is designed to integrate functions on the semi infinite intervals (-\infty, a] which is just what I want. But unfortunately I can't get it to work.
This is the density function in C++,
density(double x)
{
using namespace boost::math;
if(x == _mu)
return std::numeric_limits<double>::infinity();
return pow(_gamma, 2*_lambda)/(pow(2*_alpha, _lambda-0.5)*sqrt(_pi)*tgamma(_lambda))* pow(abs(x-_mu), _lambda - 0.5) * cyl_bessel_k(_lambda-0.5, _alpha*abs(x - _mu)) * exp(_beta*(x - _mu));
}
Then I try and integrate to obtain the cdf by calling the gsl routine.
cdf(double x)
{
gsl_integration_workspace * w = gsl_integration_workspace_alloc (1000);
double result, error;
gsl_function F;
F.function = &density;
double epsabs = 0;
double epsrel = 1e-12;
gsl_integration_qagil (&F, x, epsabs, epsrel, 1000, w, &result, &error);
printf("result = % .18f\n", result);
printf ("estimated error = % .18f\n", error);
printf ("intervals = %d\n", w->size);
gsl_integration_workspace_free (w);
return result;
}
However gsl_integration_qagil returns an error, number of iterations was insufficient.
double mu = 0.0f;
double lambda = 3.0f;
double alpha = 265.0f;
double beta = -5.0f;
cout << cdf(0.01) << endl;
If I increase the size of the workspace then the bessel function will not evaluate.
I was wondering if there was anyone that could give me any insight to my problem. A call to the corresponding Mathematica function F above with x = 0.01 returns 0.904384.
Could it be that the density is concentrated around a very small interval (i.e. outside of [-0.05, 0.05] the density is almost 0, a plot is given below). If so what can be done about this. Thanks for reading.
Re: integrating to +/- infinity:
I would use Mathematica to find an empirical bound for |x - μ| >> K, where K represents the "width" around the mean, and K is a function of alpha, beta, and lambda -- for example F is less than and approximately equal to a(x-μ)-2 or ae-b(x-μ)2 or whatever. These functions have known integrals out to infinity, for which you can evaluate empirically. Then you can integrate numerically out to K, and use the bounded approximation to get from K to infinity.
Figuring out K may be a bit tricky; I'm not very familiar with Bessel functions so I can't help you much there.
In general, I've found that for numerical calculation that's not obvious, the best way is to do as much analytical math as you can before you do numerical evaluation. (Kind of like an autofocus camera -- get it close to where you want, then let the camera do the rest.)
I haven't tried the C++ code, but by checking out the function in Mathematica, it does seem extremely peaked around mu, with the spread of the peak determined by the parameters lambda,alpha,beta.
What I would do would be to do a preliminary search of the pdf: look to the right and to the left of x=mu until you find the first value below a given tolerance. Use these as the bounds for your cdf, instead of negative infinity.
Pseudo code follows:
x_mu
step = 0.000001
adaptive_step(y_value) -> returns a small step size if close to 0, and larger if far.
while (pdf_current > tolerance):
step = adaptive_step(pdf_current)
xtest = xtest - step
pdf_current = pdf(xtest)
left_bound = xtest
//repeat for left bound
Given how tightly peaked this function seems to be, tightening the bounds would probably save you a lot of computer time that's currently wasted calculating zeros. Also, you'd be able to use the bounded integration routine, rather than -\infty,b .
Just a thought...
PS: Mathematica gives me F[0.01, 3, 265, -5, 0] = 0.884505
I found a complete description on this glsl there http://linux.math.tifr.res.in/manuals/html/gsl-ref-html/gsl-ref_16.html, you may find usefull informations.
Since I'm not GSL expert I did not focus on your problem from the math point of view, but rather I've to remind you some key aspect about floating point programming.
You can't accurately represent numbers using IEEE 754 standard. MathLab do hide the fact by using an infinite number representation logic, in order to give you rouding error-free results , this is the reason why it's slow compared to native code.
I strongly recommand this link for anyone involved in scientific calculus using a FPU:
http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
Assuming you've enjoyed that article, I've noticed this on the GSL link above: "The routines will fail to converge if the error bounds are too stringent".
Your bounds may be too stringent if the difference between the upper and the lower is less than the minimum representable value of double, that is
std::numeric_limits::epsilon();.
In addition remember, from the 2nd link, for any C/C++ compiler implementation the default rounding mode is "truncate", this introduce subtle calculus errors leeding to the wrong results. I did have the problem with a simple Liang Barsky line clipper, 1st order ! So imagine the mess in this line:
return pow(_gamma, 2*_lambda)/(pow(2*_alpha, _lambda-0.5)*sqrt(_pi)*tgamma(_lambda))* pow(abs(x-_mu), _lambda - 0.5) * cyl_bessel_k(_lambda-0.5, _alpha*abs(x - _mu)) * exp(_beta*(x - _mu));
As a general rule, it is wise in C/C++, to add additional variable holding intermediate results, so you can debug step by step, then see any rounding error, you shouldn't try to input expression like this one in any native programing langage. One can't optimize variables better than a compiler.
Finally, as a general rule, you should multiply everything then divide, unless you are confident about the dynamic behavior of your calculus.
Good luck.
I'm using bits of Perry Cook's Synthesis Toolkit (STK) to generate saw and square waves. STK includes this BLIT-based sawtooth oscillator:
inline STKFloat BlitSaw::tick( void ) {
StkFloat tmp, denominator = sin( phase_ );
if ( fabs(denominator) <= std::numeric_limits<StkFloat>::epsilon() )
tmp = a_;
else {
tmp = sin( m_ * phase_ );
tmp /= p_ * denominator;
}
tmp += state_ - C2_;
state_ = tmp * 0.995;
phase_ += rate_;
if ( phase_ >= PI )
phase_ -= PI;
lastFrame_[0] = tmp;
return lastFrame_[0];
}
The square wave oscillator is broadly similar. At the top, there's this comment:
// A fully optimized version of this code would replace the two sin
// calls with a pair of fast sin oscillators, for which stable fast
// two-multiply algorithms are well known.
I don't know where to start looking for these "fast two-multiply algorithms" and I'd appreciate some pointers. I could use a lookup table instead, but I'm keen to learn what these 'fast sin oscillators' are. I could also use an abbreviated Taylor series, but thats way more than two multiplies. Searching hasn't turned up anything much, although I did find this approximation:
#define AD_SIN(n) (n*(2.f- fabs(n)))
Plotting it out shows that it's not really a close approximation outside the range of -1 to 1, so I don't think I can use it when phase_ is in the range -pi to pi:
Here, Sine is the blue line and the purple line is the approximation.
Profiling my code reveals that the calls to sin() are far and away the most time-consuming calls, so I really would like to optimise this piece.
Thanks
EDIT Thanks for the detailed and varied answers. I will explore these and accept one at the weekend.
EDIT 2 Would the anonymous close voter please kindly explain their vote in the comments? Thank you.
Essentially the sinusoidal oscilator is one (or more) variables that change with each DSP step, rather than getting recalculated from scratch.
The simplest are based on the following trig identities: (where d is constant, and thus so is cos(d) and sin(d) )
sin(x+d) = sin(x) cos(d) + cos(x) sin(d)
cos(x+d) = cos(x) cos(d) - sin(x) sin(d)
However this requires two variables (one for sin and one for cos) and 4 multiplications to update. However this will still be far faster than calculating a full sine at each step.
The solution by Oli Charlesworth is based on solutions to this general equation
A_{n+1} = a A_{n} + A_{n-1}
Where looking for a solution of the form A_n = k e^(i theta n) gives an equation for theta.
e^(i theta (n+1) ) = a e^(i theta n ) + b e^(i theta (n-1) )
Which simplifies to
e^(i theta) - e^(-i theta ) = a
2 cos(theta) = a
Giving
A_{n+1} = 2 cos(theta) A_{n} + A_{n-1}
Whichever approach you use you'll either need to use one or two of these oscillators for each frequency, or use another trig identity to derive the higher or lower frequencies.
How accurate do you need this?
This function, f(x)=0.398x*(3.1076-|x|), does a reasonably good job for x between -pi and pi.
Edit
An even better approximation is f(x)=0.38981969947653056*(pi-|x|), which keeps the absolute error to 0.038158444604 or less for x between -pi and pi.
A least squares minimization will yield a slightly different function.
It's not possible to generate one-off sin calls with just two multiplies (well, not a useful approximation, at any rate). But it is possible to generate an oscillator with low complexity, i.e. where each value is calculated in terms of the preceding ones.
For instance, consider that the following difference equation will give you a sinusoid:
y[n] = 2*cos(phi)*y[n-1] - y[n-2]
(where cos(phi) is a constant)
(From the original author of the VST BLT code).
As a matter of fact, I was porting the VST BLT oscillators to C#, so I was googling for good sin oscillators. Here's what I came up with. Translation to C++ is straightforward. See the notes at the end about accuumulated round-off errors.
public class FastOscillator
{
private double b1;
private double y1, y2;
private double fScale;
public void Initialize(int sampleRate)
{
fScale = AudioMath.TwoPi / sampleRate;
}
// frequency in Hz. phase in radians.
public void Start(float frequency, double phase)
{
double w = frequency * fScale;
b1 = 2.0 * Math.Cos(w);
y1 = Math.Sin(phase - w);
y2 = Math.Sin(phase - w * 2);
}
public double Tick()
{
double y0 = b1 * y1 - y2;
y2 = y1;
y1 = y0;
return y0;
}
}
Note that this particular oscillator implementation will drift over time, so it needs to be re-initialzed periodically. In this particular implementation, the magnitude of the sin wave decays over time. The original comments in the STK code suggested a two-multiply oscillator. There are, in fact, two-multiply oscillators that are reasonably stable over time. But in retrospect, the need to keep the sin(phase), and sin(m*phase) oscillators tightly in synch probably means that they have to be resynched anyway. Round-off errors between phase and m*phase mean that even if the oscillators were stable, they would drift eventually, running a significant risk of producing large spikes in values near the zeros of the BLT functions. May as well use a one-multiply oscillator.
These particular oscillators should probably be re-initialized every 30 to 100 cycles (or so). My C# implementation is frame based (i.e. it calculates an float[] array of results in a void Tick(int count, float[] result) method. The oscillators are re-synched at the end of each Tick call. Something like this:
void Tick(int count, float[] result)
{
for (int i = 0; i < count; ++i)
{
...
result[i] = bltResult;
}
// re-initialize the oscillators to avoid accumulated drift.
this.phase = (this.phase + this.dPhase*count) % AudioMath.TwoPi;
this.sinOsc.Initialize(frequency,this.phase);
this.mSinOsc.Initialize(frequency*m,this.phase*m);
}
Probably missing from the STK code. You might want to investigate this. The original code provided to the STK did this. Gary Scavone tweaked the code a bit, and I think the optimization was lost. I do know that the STK implementations suffer from DC drift, which can be almost entirely eliminated when implemented properly.
There's a peculiar hack that prevents DC drift of the oscillators, even when sweeping the frequency of the oscillators. The trick is that the oscillators should be started with an initial phase adjustment of dPhase/2. That just so happens to start the oscillators off with zero DC drift, without having to figure out wat the correct initial state for various integrators in each of the BLT oscillators.
Strangely, if the adjustment is re-adjusted whenever the frequency of the oscillator changes, then this also prevents wild DC drift of the output when sweeping the frequency of the oscillator. Whenever the frequency changes, subtract dPhase/2 from the previous phase value, recalculate dPhase for the new frequency, and then add dPhase/2.I rather suspect this could be formally proven; but I have not been able to so. All I know is that It Just Works.
For a block implementation, the oscillators should actually be initialized as follows, instead of carrying the phase adjustment in the current this.phase value.
this.sinOsc.Initialize(frequency,phase+dPhase*0.5);
this.mSinOsc.Initialize(frequency*m,(phase+dPhase*0.5)*m);
You might want to take a look here:
http://devmaster.net/forums/topic/4648-fast-and-accurate-sinecosine/
There's some sample code that calculates a very good appoximation of sin/cos using only multiplies, additions and the abs() function. Quite fast too. The comments are also a good read.
It essentiall boils down to this:
float sine(float x)
{
const float B = 4/pi;
const float C = -4/(pi*pi);
const float P = 0.225;
float y = B * x + C * x * abs(x);
return P * (y * abs(y) - y) + y;
}
and works for a range of -PI to PI
If you can, you should consider memorization based techniques. Essentially store sin(x) and cos(x) values for a bunch values. To calculate sin(y), find a and b for which precomputed values exist such that a<=y<=b. Now using sin(a), sin(b), cos(a), cos(b), y-a and y-b approximately calculate sin(y).
The general idea of getting periodically sampled results from the sine or cosine function is to use a trig recursion or an initialized (barely) stable IIR filter (which can end up being pretty much the same computations). There are bunches of these in the DSP literature, of varying accuracy and stability. Choose carefully.
I updated the code.
What i am trying to do is to hold every lagrange's coefficient values in pointer d.(for example for L1(x) d[0] would be "x-x2/x1-x2" ,d1 would be (x-x2/x1-x2)*(x-x3/x1-x3) etc.
My problem is
1) how to initialize d ( i did d[0]=(z-x[i])/(x[k]-x[i]) but i think it's not right the "d[0]"
2) how to initialize L_coeff. ( i am using L_coeff=new double[0] but am not sure if it's right.
The exercise is:
Find Lagrange's polynomial approximation for y(x)=cos(π x), x ∈−1,1 using 5 points
(x = -1, -0.5, 0, 0.5, and 1).
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cmath>
using namespace std;
const double pi=3.14159265358979323846264338327950288;
// my function
double f(double x){
return (cos(pi*x));
}
//function to compute lagrange polynomial
double lagrange_polynomial(int N,double *x){
//N = degree of polynomial
double z,y;
double *L_coeff=new double [0];//L_coefficients of every Lagrange L_coefficient
double *d;//hold the polynomials values for every Lagrange coefficient
int k,i;
//computations for finding lagrange polynomial
//double sum=0;
for (k=0;k<N+1;k++){
for ( i=0;i<N+1;i++){
if (i==0) continue;
d[0]=(z-x[i])/(x[k]-x[i]);//initialization
if (i==k) L_coeff[k]=1.0;
else if (i!=k){
L_coeff[k]*=d[i];
}
}
cout <<"\nL("<<k<<") = "<<d[i]<<"\t\t\tf(x)= "<<f(x[k])<<endl;
}
}
int main()
{
double deg,result;
double *x;
cout <<"Give the degree of the polynomial :"<<endl;
cin >>deg;
for (int i=0;i<deg+1;i++){
cout <<"\nGive the points of interpolation : "<<endl;
cin >> x[i];
}
cout <<"\nThe Lagrange L_coefficients are: "<<endl;
result=lagrange_polynomial(deg,x);
return 0;
}
Here is an example of lagrange polynomial
As this seems to be homework, I am not going to give you an exhaustive answer, but rather try to send you on the right track.
How do you represent polynomials in a computer software? The intuitive version you want to archive as a symbolic expression like 3x^3+5x^2-4 is very unpractical for further computations.
The polynomial is defined fully by saving (and outputting) it's coefficients.
What you are doing above is hoping that C++ does some algebraic manipulations for you and simplify your product with a symbolic variable. This is nothing C++ can do without quite a lot of effort.
You have two options:
Either use a proper computer algebra system that can do symbolic manipulations (Maple or Mathematica are some examples)
If you are bound to C++ you have to think a bit more how the single coefficients of the polynomial can be computed. You programs output can only be a list of numbers (which you could, of course, format as a nice looking string according to a symbolic expression).
Hope this gives you some ideas how to start.
Edit 1
You still have an undefined expression in your code, as you never set any value to y. This leaves prod*=(y-x[i])/(x[k]-x[i]) as an expression that will not return meaningful data. C++ can only work with numbers, and y is no number for you right now, but you think of it as symbol.
You could evaluate the lagrange approximation at, say the value 1, if you would set y=1 in your code. This would give you the (as far as I can see right now) correct function value, but no description of the function itself.
Maybe you should take a pen and a piece of paper first and try to write down the expression as precise Math. Try to get a real grip on what you want to compute. If you did that, maybe you come back here and tell us your thoughts. This should help you to understand what is going on in there.
And always remember: C++ needs numbers, not symbols. Whenever you have a symbol in an expression on your piece of paper that you do not know the value of you can either find a way how to compute the value out of the known values or you have to eliminate the need to compute using this symbol.
P.S.: It is not considered good style to post identical questions in multiple discussion boards at once...
Edit 2
Now you evaluate the function at point y=0.3. This is the way to go if you want to evaluate the polynomial. However, as you stated, you want all coefficients of the polynomial.
Again, I still feel you did not understand the math behind the problem. Maybe I will give you a small example. I am going to use the notation as it is used in the wikipedia article.
Suppose we had k=2 and x=-1, 1. Furthermore, let my just name your cos-Function f, for simplicity. (The notation will get rather ugly without latex...) Then the lagrangian polynomial is defined as
f(x_0) * l_0(x) + f(x_1)*l_1(x)
where (by doing the simplifications again symbolically)
l_0(x)= (x - x_1)/(x_0 - x_1) = -1/2 * (x-1) = -1/2 *x + 1/2
l_1(x)= (x - x_0)/(x_1 - x_0) = 1/2 * (x+1) = 1/2 * x + 1/2
So, you lagrangian polynomial is
f(x_0) * (-1/2 *x + 1/2) + f(x_1) * 1/2 * x + 1/2
= 1/2 * (f(x_1) - f(x_0)) * x + 1/2 * (f(x_0) + f(x_1))
So, the coefficients you want to compute would be 1/2 * (f(x_1) - f(x_0)) and 1/2 * (f(x_0) + f(x_1)).
Your task is now to find an algorithm that does the simplification I did, but without using symbols. If you know how to compute the coefficients of the l_j, you are basically done, as you then just can add up those multiplied with the corresponding value of f.
So, even further broken down, you have to find a way to multiply the quotients in the l_j with each other on a component-by-component basis. Figure out how this is done and you are a nearly done.
Edit 3
Okay, lets get a little bit less vague.
We first want to compute the L_i(x). Those are just products of linear functions. As said before, we have to represent each polynomial as an array of coefficients. For good style, I will use std::vector instead of this array. Then, we could define the data structure holding the coefficients of L_1(x) like this:
std::vector L1 = std::vector(5);
// Lets assume our polynomial would then have the form
// L1[0] + L2[1]*x^1 + L2[2]*x^2 + L2[3]*x^3 + L2[4]*x^4
Now we want to fill this polynomial with values.
// First we have start with the polynomial 1 (which is of degree 0)
// Therefore set L1 accordingly:
L1[0] = 1;
L1[1] = 0; L1[2] = 0; L1[3] = 0; L1[4] = 0;
// Of course you could do this more elegant (using std::vectors constructor, for example)
for (int i = 0; i < N+1; ++i) {
if (i==0) continue; /// For i=0, there will be no polynomial multiplication
// Otherwise, we have to multiply L1 with the polynomial
// (x - x[i]) / (x[0] - x[i])
// First, note that (x[0] - x[i]) ist just a scalar; we will save it:
double c = (x[0] - x[i]);
// Now we multiply L_1 first with (x-x[1]). How does this multiplication change our
// coefficients? Easy enough: The coefficient of x^1 for example is just
// L1[0] - L1[1] * x[1]. Other coefficients are done similary. Futhermore, we have
// to divide by c, which leaves our coefficient as
// (L1[0] - L1[1] * x[1])/c. Let's apply this to the vector:
L1[4] = (L1[3] - L1[4] * x[1])/c;
L1[3] = (L1[2] - L1[3] * x[1])/c;
L1[2] = (L1[1] - L1[2] * x[1])/c;
L1[1] = (L1[0] - L1[1] * x[1])/c;
L1[0] = ( - L1[0] * x[1])/c;
// There we are, polynomial updated.
}
This, of course, has to be done for all L_i Afterwards, the L_i have to be added and multiplied with the function. That is for you to figure out. (Note that I made quite a lot of inefficient stuff up there, but I hope this helps you understanding the details better.)
Hopefully this gives you some idea how you could proceed.
The variable y is actually not a variable in your code but represents the variable P(y) of your lagrange approximation.
Thus, you have to understand the calculations prod*=(y-x[i])/(x[k]-x[i]) and sum+=prod*f not directly but symbolically.
You may get around this by defining your approximation by a series
c[0] * y^0 + c[1] * y^1 + ...
represented by an array c[] within the code. Then you can e.g. implement multiplication
d = c * (y-x[i])/(x[k]-x[i])
coefficient-wise like
d[i] = -c[i]*x[i]/(x[k]-x[i]) + c[i-1]/(x[k]-x[i])
The same way you have to implement addition and assignments on a component basis.
The result will then always be the coefficients of your series representation in the variable y.
Just a few comments in addition to the existing responses.
The exercise is: Find Lagrange's polynomial approximation for y(x)=cos(π x), x ∈ [-1,1] using 5 points (x = -1, -0.5, 0, 0.5, and 1).
The first thing that your main() does is to ask for the degree of the polynomial. You should not be doing that. The degree of the polynomial is fully specified by the number of control points. In this case you should be constructing the unique fourth-order Lagrange polynomial that passes through the five points (xi, cos(π xi)), where the xi values are those five specified points.
const double pi=3.1415;
This value is not good for a float, let alone a double. You should be using something like const double pi=3.14159265358979323846264338327950288;
Or better yet, don't use pi at all. You should know exactly what the y values are that correspond to the given x values. What are cos(-π), cos(-π/2), cos(0), cos(π/2), and cos(π)?