I've been attempting to build a Runge Kutta fourth order integrator to model simple projectile motion. My code is as follows
double rc4(double initState, double (*eqn)(double,double),double now,double dt)
{
double k1 = eqn(initState,now);
double k2 = eqn(initState + k1*dt/2.0,now + dt/2.0);
double k3 = eqn(initState + k2*dt/2.0,now + dt/2.0);
double k4 = eqn(initState + k3*dt, now + dt);
return initState + (dt/6.0) * (k1 + 2*k2 + 2*k3 + k4);
}
This is called within a while loop
while (time <= duration && yPos >=0)
{
xPos = updatePosX(xPos,vx,timeStep);
yPos = updatePosY(yPos,vy,timeStep);
vx = rc4(vx,updateVelX,time,timeStep);
vy = rc4(vy,updateVelY,time,timeStep);
cout << "x Pos: " << xPos <<"\t y Pos: " << yPos << endl;
time+=timeStep;
myFile << xPos << " " << yPos << " " << vx << " " << vy << endl;
}
However, contrary to what should happen my results simply blow up. What's going on here?
Your rk4 code looks right. But only for scalar differential equations.
What you most certainly have is a system of coupled differential equations in a dimension greater than 1. Here you have to apply the integration method in its vector form. That is, x,y,vx,vy are combined into a 4 dimensional (phase) state vector and the system function is vector valued, k1,...k4 are vectors etc.
As an advanced note, time <= duration is sensitive to the rounding errors accumulated in the repetitions of time+=timeStep;. Better use time <= duration-timeStep/2 so that time at the end of the loop is close to duration.
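Another way to sidestep the accumulated rounding error is to drive the loop with an integer step counter and derive the time from it. This is only a sketch; stepCount and timeAt are hypothetical helper names, not part of the question's code.

```cpp
#include <cmath>

// Number of whole steps needed to cover `duration` with step `dt`,
// rounded to the nearest integer so a tiny rounding error in the
// division cannot add or drop a step.
long stepCount(double duration, double dt)
{
    return std::lround(duration / dt);
}

// Time at the start of step k, computed by one multiplication instead
// of k repeated additions, so rounding errors do not accumulate.
double timeAt(long k, double dt)
{
    return static_cast<double>(k) * dt;
}
```

The loop would then read for (long k = 0; k < stepCount(duration, timeStep); ++k) with time = timeAt(k, timeStep) inside the body.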
Reading the code on the closed previous question I see that you have problems with the idea of a differential equation. You should not use the result of the Euler step as acceleration in the RK4 implementation. The system for ballistic motion without air friction is
dotx = vx
doty = vy
dotvx = 0
dotvy = -g
which you would have to implement in vector form as something like
eqn(t, X) // where X = [x,y,vx,vy] is an array of double of dimension 4
{ return [vx,vy,0,-g]; }
I have an assignment which says to implement logistic regression in c++ using gradient descent. Part of the assignment is to make the gradient descent stop when the magnitude of the gradient is below 10e-07.
I have to minimize: $L(w) = \frac{1}{N}\sum_{i=1}^{N} \log(1 + \exp(-y_{i} w^{T} x_{i}))$
However my gradient descent keeps stopping due to max iterations surpassed. I have tried various max iteration thresholds, and they all max out. I think there is something wrong with my code: logistic regression is supposedly an easy task for gradient descent because its cost function is convex, so gradient descent should easily find the minimum.
I am using the armadillo library for matrices and vectors.
#include "armadillo.hpp"
using namespace arma;
double Log_Likelihood(Mat<double>& x, Mat<int>& y, Mat<double>& w)
{
Mat<double> L;
double L_sum = 0;
for (int i = 0; i < x.n_rows; i++)
{
L = log(1 + exp(-y[i] * w * x.row(i).t() ));
L_sum += as_scalar(L);
}
return L_sum / x.n_rows;
}
Mat<double> Gradient(Mat<double>& x, Mat<int>& y, Mat<double>& w)
{
Mat<double> grad(1, x.n_cols);
for (int i = 0; i < x.n_rows; i++)
{
grad = grad + (y[i] * (1 / (1 + exp(y[i] * w * x.row(i).t()))) * x.row(i));
}
return -grad / x.n_rows;
}
void fit(Mat<double>& x, Mat<int>& y, double alpha = 0.05, double threshold = pow(10, -7), int maxiter = 10000)
{
w.set_size(1, x.n_cols);
w = x.row(0);
int iter = 0;
double log_like = 0;
while (true)
{
log_like = Log_Likelihood(x, y, w);
if (iter % 1000 == 0)
{
std::cout << "Iter: " << iter << " -Log likelihood = " << log_like << " ||dL/dw|| = " << norm( Gradient(x, y, w), 2) << std::endl;
}
iter++;
if ( norm( Gradient(x, y, w), 2) < threshold)
{
std::cout << "Magnitude of gradient below threshold." << std::endl;
break;
}
if (iter == maxiter)
{
std::cout << "Max iterations surpassed." << std::endl;
break;
}
w = w - (alpha * Gradient(x, y, w));
}
}
I want the gradient descent to stop because the magnitude of the gradient falls below 10e-07.
My labels are {1, -1}.
Verify that your objective moves steadily towards convergence by recording or plotting its value at every iteration, and also check that the norm of the gradient is going towards 0. Mind the sign convention: if you maximize the log-likelihood you are doing gradient ascent and must add the gradient, whereas the L(w) in the question is the negative log-likelihood, which should decrease and for which subtracting the gradient is correct. If the norm of the gradient consistently increases, you are not moving towards the optimum. If, on the other hand, the norm of the gradient "jumps around" but doesn't go to 0, then you should reduce your step size/learning rate alpha and try again.
Plotting and analyzing these values will be helpful to debug and analyze your algorithm.
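To make that kind of check concrete, here is a small sketch that records the loss and the gradient norm at every iteration. It deliberately uses plain std::vector instead of Armadillo so it stands alone; Vec, Trace, dot, loss, gradient, norm2 and descend are hypothetical names for this illustration. It minimizes the mean logistic loss with labels in {+1, -1}, and the gradient accumulator is zeroed explicitly.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

double dot(const Vec& a, const Vec& b)
{
    double s = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) s += a[j] * b[j];
    return s;
}

// Mean logistic loss: (1/N) * sum log(1 + exp(-y_i * w.x_i)).
double loss(const std::vector<Vec>& x, const std::vector<int>& y, const Vec& w)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += std::log1p(std::exp(-y[i] * dot(w, x[i])));
    return sum / static_cast<double>(x.size());
}

// Gradient of the loss: (1/N) * sum of -y_i * x_i / (1 + exp(y_i * w.x_i)).
Vec gradient(const std::vector<Vec>& x, const std::vector<int>& y, const Vec& w)
{
    Vec grad(w.size(), 0.0);  // explicitly zero-initialised accumulator
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double coeff = -y[i] / (1.0 + std::exp(y[i] * dot(w, x[i])));
        for (std::size_t j = 0; j < w.size(); ++j) grad[j] += coeff * x[i][j];
    }
    for (double& g : grad) g /= static_cast<double>(x.size());
    return grad;
}

double norm2(const Vec& v) { return std::sqrt(dot(v, v)); }

// Final weights plus the per-iteration history used for debugging.
struct Trace {
    Vec w;
    std::vector<double> lossHistory;
    std::vector<double> gradNormHistory;
};

// Plain gradient descent that stops on a small gradient or on maxIter.
Trace descend(const std::vector<Vec>& x, const std::vector<int>& y,
              Vec w, double alpha, double threshold, int maxIter)
{
    Trace trace;
    for (int iter = 0; iter < maxIter; ++iter) {
        const Vec g = gradient(x, y, w);
        trace.lossHistory.push_back(loss(x, y, w));
        trace.gradNormHistory.push_back(norm2(g));
        if (trace.gradNormHistory.back() < threshold) break;
        for (std::size_t j = 0; j < w.size(); ++j) w[j] -= alpha * g[j];
    }
    trace.w = w;
    return trace;
}
```

Dumping lossHistory and gradNormHistory to a file and plotting them shows at a glance whether the loss falls and whether the gradient norm shrinks or oscillates.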
I would like to ask a very short question, and it is as follows: in finding the cube root of a number (both neg. and pos.) in C++, how does one restrict the output to real solutions only?
I am currently writing a program to solve a cubic with Cardano's formula, and one of the intermediate variables I am using randomly outputs the complex and real cube roots - and I only need the real roots.
(E.g. in evaluating the cube root of -0.0127378, the three roots would be 0.11677095+0.202253218i, −0.2335419, 0.11677095−0.202253218i - I wish to ignore the complex ones for substitution into a later formula)
Thank you!
EDIT: Solved it! :) I created a signum function and tweaked the sign after taking the power of the absolute value of SPrime and TPrime, so now it carries forward only the real cube root.
/* ... */
#include <iostream>
#include <cmath>
#include <complex>
#include <cstdio>
#include <cassert>
using namespace std;
int signum(std::complex<double> z)
{
if (z.real() < 0 || z.imag() < 0) return -1;
else if (z.real() >= 0 || z.imag() >= 0) return 1;
}
// POST: The function is intended to solve a cubic equation with coefficients a, b, c and d., such that
// ax^3 + bx^2 + cx + d = 0. If there exist infinitely many solutions, we output -1, i.e. if a=b=c=d=0
// (trivial solution).
void solve(std::complex<double> a, std::complex<double> b, std::complex<double> c, std::complex<double> d, std::complex<double>& x1, std::complex<double>& x2, std::complex<double>& x3)
{
complex<double> i = complex<double> (0, 1.0);
// Consider implementing Cardano's method for obtaining the solution of a degree 3 polynomial, as suggested
// We must hence define the discriminant D of such an equation through complex doubles Q and R
std::complex<double> Q;
Q = (3.0*a*c - pow(b, 2)) / (9.0*pow(a, 2));
cout << "Q=" << Q << endl;
std::complex<double> R;
R = (9.0*a*b*c - 27.0*d*pow(a, 2) - 2.0*pow(b, 3)) / (54.0*pow(a, 3));
cout << "R=" << R << endl;
std::complex<double> D;
D = pow(Q, 3) + pow(R, 2);
// Possible types of output for discriminant
if (abs(D) < 0.0)
{
cout << "The cubic has three distinct, real roots." << endl;
}
else if (abs(D) == 0.0)
{
cout << "The cubic has three real roots, at least two of which are equal." << endl;
}
else if (abs(D) > 0.0)
{
cout << "The cubic has one real root and two complex conjugate roots." << endl;
}
// Defining two further complex double variables S and T, which are required to obtain the final solution for x1, x2 and x3
std::complex<double> S;
std::complex<double> SPrime;
SPrime = R+sqrt(Q*Q*Q + R*R);
cout << "SPrime=" << SPrime << endl;
if (signum(SPrime) == -1)
{
S = (-1)*pow(abs(SPrime), 0.3333333333333);
}
else if (signum(SPrime) == 1)
{
S = pow(abs(SPrime), 0.3333333333333);
}
cout << "S=" << S << endl;
std::complex<double> T;
std::complex<double> TPrime;
TPrime = (R-sqrt(Q*Q*Q + R*R));
if (signum(TPrime) == -1)
{
T = (-1)*pow(abs(TPrime), 0.3333333333333);
}
else if (signum(TPrime) == 1)
{
T = pow(abs(TPrime), 0.3333333333333);
}
cout << "T=" << T << endl;
cout << "TPrime= " << TPrime << endl;
// Expressions for the solutions
x1 = S + T - (b/(3.0*a));
x2 = (-0.5)*(S + T) - (b/(3.0*a)) + (sqrt(3.0)*0.5)*(S - T)*i;
x3 = conj(x2);
if (abs(x1) < 0.000000000001)
{
x1 = 0;
}
}
// Driver code
int main ()
{
// Taking user input for a, b, c and d
std::complex<double> a, b, c, d, x1, x2, x3;
cout << "Please enter the coefficients of the polynomial in successive order." << endl;
cin >> a >> b >> c >> d;
solve (a, b, c, d, x1, x2, x3);
cout << x1 << ", " << x2 << ", " << x3 << "." << endl;
return 0;
}
The problem as you're stating it can be solved trivially (with real numbers the cube root of -x is the negative of the cube root of x):
double cuberoot(double x) {
if (x < 0) {
return -pow(-x, 1.0/3.0);
} else if (x > 0) {
return pow(x, 1.0/3.0);
} else {
return 0;
}
}
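If C++11 is available, the standard library already provides a real cube root, std::cbrt in <cmath>, which accepts negative arguments. A minimal sketch (realCubeRoot is just an illustrative wrapper name):

```cpp
#include <cmath>

// Real cube root for any sign of x, delegating to the standard library.
double realCubeRoot(double x)
{
    return std::cbrt(x);
}
```

For the value from the question, realCubeRoot(-0.0127378) gives roughly -0.2335419, the real root listed there.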
If the input is instead a general complex z and you're looking for the "most real" (principal) cube root, the same reasoning can be applied using the complex version of pow on either z or -z, depending on the sign of the real part:
std::complex<double> cuberoot(std::complex<double> z) {
if (z.real() < 0) {
return -pow(-z, 1.0/3.0);
} else {
return pow(z, 1.0/3.0);
}
}
Problems with your code:
As you allow complex coefficients, the discussion of the discriminant becomes slightly meaningless, it is only of value for real coefficients.
abs(D) is always non-negative. If D==0, then there is a double root, more can not be said in the case of complex coefficients.
You can avoid a lot of code by utilizing that S*T=-Q. One would have to care that the computation of u=T^3 returns the larger of the roots of 0==u^2 - 2*R*u - Q^3 or (u-R)^2 = D = R^2+Q^3
rtD = sqrt(D);
T = cuberoot( R + ((abs(R+rtD)>=abs(R-rtD)) ? rtD : -rtD) );
S = (abs(T)<epsilon) ? 0 : -Q/T;
Because of abs(R)<=abs(T)^3 and abs(D)<=abs(T)^6
one gets also abs(Q)<=2^(1/3)*abs(T)^2 resulting in
abs(S)=abs(Q/T) <= 2^(1/3)*abs(T)
For S=-Q/T to fail one would thus need a serious case
of extremely small floating point numbers in R, Q
and thus T. Quantitatively, for double even
the threshold epsilon=1e-150 should be safe.
On cube root variants:
For esthetic reasons one might want T as close to a coordinate axis as possible. A cube root function achieving this would be
std::complex<double> cuberoot(std::complex<double> z) {
double r=abs(z), phi=arg(z);
double k = round(2*phi/pi);
// closest multiple of pi/2
// an equivalent angle is (phi-k*pi/2) - k*3*pi/2
return std::polar( pow(r,1.0/3), (phi-k*pi/2)/3 - k*pi/2 );
}
so that abs(phi-k*pi/2)<=pi/4, and thus the angle to the next coordinate axis of the cube root is smaller than pi/12=15°. cuberoot(i) returns -i, cuberoot(-1) returns -1, a point at 60° returns a cube root at (60°-90°)/3-90°=-100°, etc.
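For reference, a self-contained version of this variant might look as follows. It is only a sketch: pi is obtained from acos(-1.0) because the snippet above leaves it undefined, and axisCubeRoot is a name invented here to avoid clashing with the other cuberoot functions.

```cpp
#include <cmath>
#include <complex>

// Cube root whose argument stays within 15 degrees of a coordinate axis.
std::complex<double> axisCubeRoot(std::complex<double> z)
{
    const double pi = std::acos(-1.0);
    const double r = std::abs(z);
    const double phi = std::arg(z);
    // Index of the multiple of pi/2 closest to phi.
    const double k = std::round(2.0 * phi / pi);
    // phi is equivalent to (phi - k*pi/2) - k*3*pi/2, so a third of it is:
    return std::polar(std::cbrt(r), (phi - k * pi / 2.0) / 3.0 - k * pi / 2.0);
}
```

Cubing the result should reproduce z up to rounding, which is a quick sanity check for any input.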
I am being asked to find the roots of f(x) = 5x(e^-mod(x))cos(x) + 1 . I have previously used the Durand-Kerner method to find the roots of the function x^4 -3x^3 + x^2 + x + 1 with the code shown below. I thought I could simply reuse the code to find the roots of f(x) but whenever I replace x^4 -3x^3 + x^2 + x + 1 with f(x) the program outputs nan for all the roots. What is wrong with my Durand-Kerner implementation and how do I go about modifying it to work for f(x)? I would be very grateful for any help.
#include <iostream>
#include <complex>
#include <math.h>
using namespace std;
typedef complex<double> dcmplx;
dcmplx f(dcmplx x)
{
// the function we are interested in
double a4 = 1;
double a3 = -3;
double a2 = 1;
double a1 = 1;
double a0 = 1;
return (a4 * pow(x,4) + a3 * pow(x,3) + a2 * pow(x,2) + a1 * x + a0);
}
int main()
{
dcmplx p(.9,2);
dcmplx q(.1, .5);
dcmplx r(.7,1);
dcmplx s(.3, .5);
dcmplx p0, q0, r0, s0;
int max_iterations = 100;
bool done = false;
int i=0;
while (i<max_iterations && done == false)
{
p0 = p;
q0 = q;
r0 = r;
s0 = s;
p = p0 - f(p0)/((p0-q)*(p0-r)*(p0-s));
q = q0 - f(q0)/((q0-p)*(q0-r)*(q0-s));
r = r0 - f(r0)/((r0-p)*(r0-q)*(r0-s0));
s = s0 - f(s0)/((s0-p)*(s0-q)*(s0-r));
// if convergence within small epsilon, declare done
if (abs(p-p0)<1e-5 && abs(q-q0)<1e-5 && abs(r-r0)<1e-5 && abs(s-s0)<1e-5)
done = true;
i++;
}
cout<<"roots are :\n";
cout << p << "\n";
cout << q << "\n";
cout << r << "\n";
cout << s << "\n";
cout << "number steps taken: "<< i << endl;
return 0;
}
The only thing I have been changing so far is the dcmplx f function. I have been changing it to
dcmplx f(dcmplx x)
{
// the function we are interested in
double a4 = 5;
double a0 = 1;
return (a4 * x * exp(-x) * cos(x) )+ a0;
}
The Durand-Kerner method that you're using requires the function to be continuous on the interval you are working on.
Here we have a discrepancy between the mathematical view and the limits of numerical computation. I suggest you plot your function (typing the formula into Google gives a quick overview, of the real part of course). You'll notice that:
there are infinitely many roots due to the periodicity of the cosine.
due to the x*exp(-x) factor, the absolute value quickly grows beyond the largest magnitude that a floating-point number can hold.
To understand the consequences for your code, I invite you to trace the iterations. You'll notice that p, r and s converge very quickly while q diverges (apparently on the track of one of the huge peaks):
At the 2nd iteration q is already at 1e74
At 3rd iteration already beyond what a double can store.
As q is used in the calculation of p,r and s, the error is propagated to the other terms
At 5th iteration, all terms are at NAN
It then continues bravely through the 100 iterations
Perhaps you could make it work by choosing different starting points. If not, you'll have to use some other method and carefully select the interval on which you're working.
You should have noted in your documentation of the Durand-Kerner method (invented by Karl Weierstrass around 1850) that it only applies to polynomials. Your second function is far from being a polynomial.
Indeed, because of the mod function it has to be declared a nasty function for numerical methods. Most of them rely on the continuity of the given function, i.e., if the value is close to zero, there is a good chance that there is a root nearby, and if the sign changes on an interval then there is a root in the interval. Even the most basic derivative-free methods, from the bisection method to Brent's method at the sophisticated end of that class, presuppose these properties.
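As an illustration of the bracketing idea (bisection swapped in here, not Durand-Kerner), the sketch below looks for one real root of f(x) = 5x e^{-|x|} cos(x) + 1. It assumes real arguments only and relies on the sign change between x = -0.5, where f is negative, and x = 0, where f is 1. The names target and bisect are made up for this example.

```cpp
#include <cmath>

// f(x) = 5 x e^{-|x|} cos(x) + 1 for real x.
double target(double x)
{
    return 5.0 * x * std::exp(-std::fabs(x)) * std::cos(x) + 1.0;
}

// Bisection on [lo, hi]; assumes func(lo) and func(hi) have opposite signs.
double bisect(double (*func)(double), double lo, double hi, double tol)
{
    double fLo = func(lo);
    while (hi - lo > tol) {
        const double mid = lo + (hi - lo) / 2.0;
        const double fMid = func(mid);
        if ((fLo < 0.0) == (fMid < 0.0)) {
            // Same sign as the left end: the root lies in [mid, hi].
            lo = mid;
            fLo = fMid;
        } else {
            // Sign change between lo and mid: the root lies in [lo, mid].
            hi = mid;
        }
    }
    return lo + (hi - lo) / 2.0;
}
```

Calling bisect(target, -0.5, 0.0, 1e-12) returns one root; other roots need their own brackets, which a plot of the function helps to find.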
I would like to simulate a point mass within a closed box. There is no friction and the point mass obeys the impact law. So there are only elastic collisions with the walls of the box. The output of the program is the time, position (rx,ry ,rz) and velocity (vx,vy,vz). I plot the trajectory by using GNUplot.
The problem I have now is that the point mass gains energy from somewhere, so its bounces get more intense each time.
Is someone able to check my code?
/* Start of the code */
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;
struct pointmass
{
double m; // mass
double r[3]; // coordinates
double v[3]; // velocity
};
// Grav.constant
const double G[3] = {0, -9.81, 0};
int main()
{
int Time = 0; // Duration
double Dt = 0; // Time steps
pointmass p0;
cerr << "Duration: ";
cin >> Time;
cerr << "Time steps: ";
cin >> Dt;
cerr << "Velocity of the point mass (vx,vy,vz)? ";
cin >> p0.v[0];
cin >> p0.v[1];
cin >> p0.v[2];
cerr << "Initial position of the point mass (x,y,z)? ";
cin >> p0.r[0];
cin >> p0.r[1];
cin >> p0.r[2];
for (double i = 0; i<Time; i+=Dt)
{
cout << i << setw(10);
for (int j = 0; j<=2; j++)
{
////////////position and velocity///////////
p0.r[j] = p0.r[j] + p0.v[j]*i + 0.5*G[j]*i*i;
p0.v[j] = p0.v[j] + G[j]*i;
///////////////////reflection/////////////////
if(p0.r[j] >= 250)
{
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -p0.v[j];
}
else if(p0.r[j] <= 0)
{
p0.r[j] = -p0.r[j];
p0.v[j] = -p0.v[j];
}
//////////////////////////////////////////////
}
/////////////////////Output//////////////////
for(int j = 0; j<=2; j++)
{
cout << p0.r[j] << setw(10);
}
for(int j = 0; j<=2; j++)
{
cout << p0.v[j] << setw(10);
}
///////////////////////////////////////////////
cout << endl;
}
}
F = ma
a = F / m
a dt = F / m dt
a dt is acceleration over a fixed time - the change in velocity for that frame.
You are setting it to F / m i
it is that i which is wrong, as comments have suggested. It needs to be the duration of a frame, not the duration of the entire simulation so far.
I am a little concerned about the time loop along with other commenters - make sure that it represents an increment of time, not a growing duration.
Still, I think the main problem is you are changing the sign of all three components of velocity
on reflection.
That's not consistent with the laws of physics -conservation of linear momentum and energy - at the boundaries.
To see this, consider the case if your particle is moving in just the x-y plane (velocity in z is zero) and about to hit the wall at x= L.
The collision looks like this:
The force exerted on the point mass by the wall acts perpendicular to the wall. So there is no change in the momentum component of the particle parallel to the wall.
Applying conservation of linear momentum and kinetic energy, and assuming a perfectly elastic collision, you will find that
The component of velocity perpendicular to the wall DOES change sign
The component of velocity parallel to the wall DOES NOT change sign
In three dimensions, to have an accurate simulation, you have to work out the momentum components parallel and perpendicular to the wall on collision and code the resulting velocity changes.
In other words, this code:
///////////////////reflection/////////////////
if(p0.r[j] >= 250)
{
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -p0.v[j];
}
else if(p0.r[j] <= 0)
{
p0.r[j] = -p0.r[j];
p0.v[j] = -p0.v[j];
}
//////////////////////////////////////////////
does not model the physics of reflection correctly. To fix it here is an outline of what to do:
Take the reflection checks out of the loop over x,y,z coordinates (but still within the time loop)
The collision condition for all six walls needs to be checked,
according to the direction of the normal vector to the wall.
For example for the right wall of the cube defined by X=250, 0<=Y<250, 0<=Z<250, the normal vector is in the negative X direction. For the left wall defined by X=0, 0<=Y<250, 0<=Z<250, the normal vector is in the positive X direction.
So on reflection from those two walls, the X component of velocity changes sign because it is normal (perpendicular) to the wall, but the Y and Z components do NOT change sign because they are parallel to the wall.
Apply similar considerations at the top and bottom wall (constant Y), and front and back wall (constant Z), of the cube -left as exercise to work out the normals to those surfaces.
Finally you shouldn't change sign of the position vector components on reflection, just the velocity vector. Instead recompute the next value of the position vector given the new velocity.
OK, so there are a few issues. The others have pointed out the need to use Dt rather than i for the integration step.
However, you are correct in stating that there is an issue with the reflection and energy conservation. I've added an explicit track of that below.
Note that the component wise computation of the reflection is actually fine other than the energy issue.
The problem was that during a reflection the acceleration due to gravity changes. In the case of the particle hitting the floor, it was acquiring kinetic energy equal to that it would have had if it had kept falling, but the new position had higher potential energy. So the energy would increase by exactly twice the potential energy difference between the floor and the new position. A bounce off the roof would have the opposite effect.
As noted below, one strategy would be to compute the actual time of reflection. However, working directly with energy is much simpler as well as more robust. Please note that although the simple energy version below ensures that the speed and position are consistent, it does not actually yield the correct position. For most purposes that may not matter. If you really need the correct position, I think we need to solve for the bounce time.
/* Start of the code */
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;
struct pointmass
{
double m; // mass
double r[3]; // coordinates
double v[3]; // velocity
};
// Grav.constant
const double G[3] = { 0, -9.81, 0 };
int main()
{
// I've just changed the initial values to speed up unit testing; your code worked fine here.
int Time = 50; // Duration
double Dt = 1; // Time steps
pointmass p0;
p0.v[0] = 23;
p0.v[1] = 40;
p0.v[2] = 15;
p0.r[0] = 100;
p0.r[1] = 200;
p0.r[2] = 67;
for (double i = 0; i<Time; i += Dt)
{
cout << setw(10) << i << setw(10);
double energy = 0;
for (int j = 0; j <= 2; j++)
{
double oldR = p0.r[j];
double oldV = p0.v[j];
////////////position and velocity///////////
p0.r[j] = p0.r[j] + p0.v[j] * Dt + 0.5*G[j] * Dt*Dt;
p0.v[j] = p0.v[j] + G[j] * Dt;
///////////////////reflection/////////////////
if (G[j] == 0)
{
if (p0.r[j] >= 250)
{
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -p0.v[j];
}
else if (p0.r[j] <= 0)
{
p0.r[j] = -p0.r[j];
p0.v[j] = -p0.v[j];
}
}
else
{
// Need to capture the fact that the acceleration switches direction relative to velocity half way through the timestep.
// Two approaches, either
// Try to compute the time of the bounce and work out the detail.
// OR
// Use conservation of energy to get the right speed - much easier!
if (p0.r[j] >= 250)
{
double energy = 0.5*p0.v[j] * p0.v[j] - G[j] * p0.r[j];
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -sqrt(2 * (energy + G[j] * p0.r[j]));
}
else if (p0.r[j] <= 0)
{
double energy = 0.5*p0.v[j] * p0.v[j] - G[j] * p0.r[j];
p0.r[j] = -p0.r[j];
p0.v[j] = sqrt(2*(energy + G[j] * p0.r[j]));
}
}
energy += 0.5*p0.v[j] * p0.v[j] - G[j] * p0.r[j];
}
/////////////////////Output//////////////////
cout << energy << setw(10);
for (int j = 0; j <= 2; j++)
{
cout << p0.r[j] << setw(10);
}
for (int j = 0; j <= 2; j++)
{
cout << p0.v[j] << setw(10);
}
///////////////////////////////////////////////
cout << endl;
}
}
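For completeness, the bounce-time alternative could be sketched like this for the floor only. It assumes g < 0, r >= 0 on entry and at most one bounce per step; BounceState and stepWithFloor are hypothetical names, and the ceiling would need the mirrored treatment.

```cpp
#include <cmath>

// Position and velocity along the axis on which gravity acts.
struct BounceState {
    double r;
    double v;
};

// One step of length dt with a floor at r = 0 and constant acceleration g < 0.
BounceState stepWithFloor(BounceState a, double g, double dt)
{
    const double rFree = a.r + a.v * dt + 0.5 * g * dt * dt;
    if (rFree >= 0.0) {
        // No contact during this step: ordinary free flight.
        return {rFree, a.v + g * dt};
    }
    // Speed at impact, from solving r + v*t + g*t*t/2 = 0 for t.
    const double impactSpeed = std::sqrt(a.v * a.v - 2.0 * g * a.r);
    // The root of that quadratic that lies forward in time.
    const double tHit = (-a.v - impactSpeed) / g;
    // Time left in the step after the bounce.
    const double rest = dt - tHit;
    // Elastic bounce: the velocity flips to +impactSpeed at r = 0,
    // then free flight continues for the remaining time.
    return {impactSpeed * rest + 0.5 * g * rest * rest, impactSpeed + g * rest};
}
```

Because the flight before and after the impact is integrated exactly, both the position and the energy 0.5*v*v - g*r come out consistent, at the cost of a square root per bounce.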
This code compiles and runs but does not output the correct distances.
for (int z = 0; z < spaces_x; z++)
{
double dist=( ( (spaces[z][0]-x)^2) + ( (spaces[z][1]-y)^2) );
dist = abs(dist);
dist = sqrt(dist);
cout << "for x " << spaces[z][0] <<
" for y " << spaces[z][1] <<
" dist is "<< dist << endl;
if (dist < min_dist)
{
min_dist = dist;
index = z;
}
}
Does anyone have an idea what the problem could be?
The syntax ^ 2 does not mean raise to the power of 2 - it means XOR. Use x * x.
double dx = spaces[z][0] - x;
double dy = spaces[z][1] - y;
double dist2 = dx * dx + dy * dy;
It may be a better idea to use hypot() instead of manually squaring, adding and taking the square root. hypot() takes care of a number of cases where the naive approach would lose precision. It is part of C99 and C++11 (formerly C++0x), and for compilers that don't have it, there's always boost.math.
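A small sketch of the hypot() route, assuming C++11 so that std::hypot is available in <cmath>; distance2d is a made-up helper name.

```cpp
#include <cmath>

// Euclidean distance between (x0, y0) and (x1, y1) without manual squaring.
double distance2d(double x0, double y0, double x1, double y1)
{
    return std::hypot(x1 - x0, y1 - y0);
}
```

In the loop from the question this would become dist = distance2d(x, y, spaces[z][0], spaces[z][1]), and it stays finite even when the squared differences would overflow a double.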
^ is the xor operator; it does not perform exponentiation.
In the general case, if you want to raise something to a power, you should use the std::pow function. However, in this specific case, since it is the square, you're probably better off just using multiplication (e.g., x * x instead of std::pow(x, 2)).
Note that in C++ the caret (^) is not an exponentiation operator. Rather, it's a bitwise exclusive or.