Using derivatives as functions in CppAD - c++

I am trying to modify the example here:
# include <cppad/cppad.hpp>
namespace { // ---------------------------------------------------------
// define the template function JacobianCases<Vector> in empty namespace
template <typename Vector>
bool JacobianCases()
{ bool ok = true;
using CppAD::AD;
using CppAD::NearEqual;
double eps99 = 99.0 * std::numeric_limits<double>::epsilon();
using CppAD::exp;
using CppAD::sin;
using CppAD::cos;
// domain space vector
size_t n = 2;
CPPAD_TESTVECTOR(AD<double>) X(n);
X[0] = 1.;
X[1] = 2.;
// declare independent variables and starting recording
CppAD::Independent(X);
// a calculation between the domain and range values
AD<double> Square = X[0] * X[0];
// range space vector
size_t m = 3;
CPPAD_TESTVECTOR(AD<double>) Y(m);
Y[0] = Square * exp( X[1] );
Y[1] = Square * sin( X[1] );
Y[2] = Square * cos( X[1] );
// create f: X -> Y and stop tape recording
CppAD::ADFun<double> f(X, Y);
// new value for the independent variable vector
Vector x(n);
x[0] = 2.;
x[1] = 1.;
// compute the derivative at this x
Vector jac( m * n );
jac = f.Jacobian(x);
/*
F'(x) = [ 2 * x[0] * exp(x[1]) , x[0] * x[0] * exp(x[1]) ]
[ 2 * x[0] * sin(x[1]) , x[0] * x[0] * cos(x[1]) ]
[ 2 * x[0] * cos(x[1]) , -x[0] * x[0] * sin(x[1]) ]
*/
ok &= NearEqual( 2.*x[0]*exp(x[1]), jac[0*n+0], eps99, eps99);
ok &= NearEqual( 2.*x[0]*sin(x[1]), jac[1*n+0], eps99, eps99);
ok &= NearEqual( 2.*x[0]*cos(x[1]), jac[2*n+0], eps99, eps99);
ok &= NearEqual( x[0] * x[0] *exp(x[1]), jac[0*n+1], eps99, eps99);
ok &= NearEqual( x[0] * x[0] *cos(x[1]), jac[1*n+1], eps99, eps99);
ok &= NearEqual(-x[0] * x[0] *sin(x[1]), jac[2*n+1], eps99, eps99);
return ok;
}
} // End empty namespace
# include <vector>
# include <valarray>
bool Jacobian(void)
{ bool ok = true;
// Run with Vector equal to three different cases
// all of which are Simple Vectors with elements of type double.
ok &= JacobianCases< CppAD::vector <double> >();
ok &= JacobianCases< std::vector <double> >();
ok &= JacobianCases< std::valarray <double> >();
return ok;
}
I am trying to modify it in the following way:
Let G be the Jacobian jac that is calculated in this example, in the line:
jac = f.Jacobian(x);
and, as in the example, let X be the independent variables. I would like to construct a new function, H, which is a function of jac, i.e. H(jacobian(X)) = something, such that H is autodifferentiable. An example may be H(X) = jacobian( jacobian(X)[0]), i.e. the jacobian of the first element of jacobian(X) w.r.t X (a second derivative of sorts).
The problem is that jac as written here is of type Vector, which is parameterized on raw double, not on AD<double>. To my knowledge, this means the output is not autodifferentiable.
I am looking for some advice on whether it is possible to use the Jacobian in a larger operation and take the Jacobian of that larger operation (not unlike any arithmetic operator), or whether this is not possible.
EDIT: This has been put up for a bounty once, but I'm putting it up again to see if there's a better solution, because I think this is important. To be a bit more clear, the elements that the "correct" answer needs are:
a) A means of calculating arbitrary order derivatives.
b) An intelligent way of not having to specify the order of derivatives a priori. If the maximum order derivative must be known at compile time, the order of derivative can't be determined algorithmically. Further, specifying an enormously large order as in the current answer given will lead to memory allocation issues and, I imagine, performance issues.
c) Abstracting the templating of derivative order away from the end-user. This is important, because it can be difficult to keep track of the order of derivatives needed. This is probably something that comes "for free" if b) is solved.
If anybody can crack this, it would be an awesome contribution and an extremely useful operation.

If you want to nest functions, you should nest the AD<> types as well. You can nest Jacobians like other functions; for instance, see the code snippet below, which computes the second derivative by nesting Jacobian calls:
#include <cstring>
#include <iostream> // standard input/output
#include <vector> // standard vector
#include <cppad/cppad.hpp> // the CppAD package http://www.coin-or.org/CppAD/
// main program
int main(void)
{ using CppAD::AD; // use AD as abbreviation for CppAD::AD
using std::vector; // use vector as abbreviation for std::vector
auto Square = [](auto t){return t*t;};
// domain space vector
vector< AD<AD<double>> > X(1); // vector of domain space variables
// declare independent variables and start recording operation sequence
CppAD::Independent(X);
// range space vector
vector< AD<AD<double>> > Y(1); // vector of ranges space variables
Y[0] = Square(X[0]); // value during recording of operations
// store operation sequence in f: X -> Y and stop recording
CppAD::ADFun<AD<double>> f(X, Y);
// compute derivative using operation sequence stored in f
vector<AD<double>> jac(1); // Jacobian of f (m by n matrix)
vector<AD<double>> x(1); // domain space vector
CppAD::Independent(x);
jac = f.Jacobian(x); // Jacobian for operation sequence
CppAD::ADFun<double> f2(x, jac);
vector<double> result(1);
vector<double> x_res(1);
x_res[0]=15.;
result=f2.Jacobian(x_res);
// print the results
std::cout << "f'' computed by CppAD = " << result[0] << std::endl;
}
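For f(x) = x², this prints f'' computed by CppAD = 2; the second derivative of x² is the constant 2, so the evaluation point 15 does not matter.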
As a side-note, since C++11/14, implementing expression templates and automatic differentiation has become much easier, as shown e.g. towards the end of this video https://www.youtube.com/watch?v=cC9MtflQ_nI (sorry for the poor quality). If I had to implement reasonably simple symbolic operations, I would start from scratch with modern C++: you can write simpler code, and you get errors that you can understand easily.
Edit:
Generalizing the example to build arbitrary-order derivatives can be a template metaprogramming exercise. The snippet below shows it is possible using template recursion:
#include <cstring>
#include <iostream>
#include <vector>
#include <cppad/cppad.hpp>
using CppAD::AD;
using std::vector;
template<typename T>
struct remove_ad{
using type=T;
};
template<typename T>
struct remove_ad<AD<T>>{
using type=T;
};
template<int N>
struct derivative{
using type = AD<typename derivative<N-1>::type >;
static constexpr int order = N;
};
template<>
struct derivative<0>{
using type = double;
static constexpr int order = 0;
};
template<typename T>
struct Jac{
using value_type = typename remove_ad<typename T::type>::type;
template<typename P, typename Q>
auto operator()(P & X, Q & Y){
CppAD::ADFun<value_type> f(X, Y);
vector<value_type> jac(1);
vector<value_type> x(1);
CppAD::Independent(x);
jac = f.Jacobian(x);
return Jac<derivative<T::order-1>>{}(x, jac);
}
};
template<>
struct Jac<derivative<1>>{
using value_type = derivative<0>::type;
template<typename P, typename Q>
auto operator()(P & x, Q & jac){
CppAD::ADFun<value_type> f2(x, jac);
vector<value_type> res(1);
vector<value_type> x_res(1);
x_res[0]=15.;
return f2.Jacobian(x_res);
}
};
int main(void)
{
constexpr int order=4;
auto Square = [](auto t){return t*t;};
vector< typename derivative<order>::type > X(1);
vector< typename derivative<order>::type > Y(1);
CppAD::Independent(X);
Y[0] = Square(X[0]);
auto result = Jac<derivative<order>>{}(X, Y);
std::cout << "f'' computed by CppAD = " << result[0] << std::endl;
}
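Note that with order = 4 this computes the fourth derivative of x², which is 0; setting order = 2 reproduces the f'' = 2 result from the previous snippet.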

There is a new feature in CppAD that eliminates the need for AD< AD<double> >, see
https://coin-or.github.io/CppAD/doc/base2ad.cpp.htm
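For reference, a minimal sketch of the same second-derivative computation using base2ad (assuming a CppAD release recent enough to provide it; treat this as an illustration rather than the documented example):
#include <iostream>
#include <vector>
#include <cppad/cppad.hpp>
int main()
{
    using CppAD::AD;
    using std::vector;
    // record f(x) = x * x with ordinary AD<double>
    vector< AD<double> > ax(1), ay(1);
    ax[0] = 1.0;
    CppAD::Independent(ax);
    ay[0] = ax[0] * ax[0];
    CppAD::ADFun<double> f(ax, ay);
    // convert f so that evaluating it can itself be taped
    CppAD::ADFun< AD<double>, double > af = f.base2ad();
    // record g(x) = f'(x) by evaluating af's Jacobian on a new tape
    CppAD::Independent(ax);
    vector< AD<double> > ajac = af.Jacobian(ax);
    CppAD::ADFun<double> g(ax, ajac);
    // evaluate g'(x) = f''(x) at x = 15
    vector<double> x(1), ddf(1);
    x[0] = 15.0;
    ddf = g.Jacobian(x);
    std::cout << "f'' computed via base2ad = " << ddf[0] << std::endl; // expect 2
    return 0;
}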

Related

Data structure for interpolating T along axis of numerical keys

Is there a data structure that, when initialized with key1 at 10 and key2 at 20 and then accessed at 15, will call an interpolation callback with (key1, key2, 0.5) and return a T?
The callback ensures that we can use our preferred S-curve for the interpolation. I'm expecting accesses at either end to pass the same key in both arguments, with 1.0 as the factor.
Pseudo-code:
Container<float, MyStruct> container;
container.insert(10, MyStruct{ .x = 1.0f });
container.insert(20, MyStruct{ .x = 2.0f });
auto middlepoint = container.get(15,
[] (const MyStruct& a, const MyStruct& b, float factor) {
return MyStruct{ a.x * factor + b.x * (1.0 - factor) };
});
assert(middlepoint.x ~ 1.5f);
Have you seen this? What is this data structure called?
Thanks
It's not just a data structure you are looking for, but also a way to retrieve information from it. You can use a std::map plus std::find_if:
#include <map>
#include <algorithm>
#include <iostream>
template <typename C,typename I>
double get(const C& container,double x,I interp) {
auto it = std::find_if(container.begin(),container.end(),[x](const auto& e){ return e.first > x;});
return interp(std::prev(it)->second,it->second);
}
int main() {
std::map<double,double> m{ {1,0.0},{2,3.0},{3,5.0}};
std::cout << get(m,1.5,[](auto x1,auto x2){ return 0.5*(x1 + x2);});
}
Prints: 1.5
Maybe 2-3 more lines are needed to handle edge cases, but otherwise the above is all you need. Sometimes a wheel is really just a wheel and can be crafted by building on existing containers and algorithms. I don't expect to find this anywhere as a container, but rather as an interpolation function along the lines of interpolate(some_array_with_x_values, some_array_with_y_values, x).
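If you also want the callback to receive the interpolation factor, as in the question, a sketch along the same lines might look like this (the function name get_interp and the clamping policy at the ends are my own choices):
#include <iostream>
#include <iterator>
#include <map>
// like get() above, but computes the factor from the neighbouring keys
// and clamps at both ends by passing the same value twice with factor 1.0
template <typename C, typename I>
auto get_interp(const C& container, double x, I interp) {
    auto hi = container.lower_bound(x); // first key >= x
    if (hi == container.end())
        return interp(std::prev(hi)->second, std::prev(hi)->second, 1.0);
    if (hi == container.begin())
        return interp(hi->second, hi->second, 1.0);
    auto lo = std::prev(hi);
    double factor = (x - lo->first) / (hi->first - lo->first);
    return interp(lo->second, hi->second, factor);
}
int main() {
    std::map<double, double> m{{10, 1.0}, {20, 2.0}};
    std::cout << get_interp(m, 15.0, [](double a, double b, double f) {
        return a * (1.0 - f) + b * f; // linear; swap in any S-curve here
    });
}
This prints 1.5, matching the middlepoint from the question's pseudo-code.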

Eigen LLT (Cholesky) fails, while SVD works

I'm trying to reproduce some numpy code on Gaussian Processes (from here) using Eigen. Basically, I need to sample from a multivariate normal distribution:
samples = np.random.multivariate_normal(mu.ravel(), cov, 1)
The mean vector is currently zero, while the covariance matrix is a square matrix generated via the isotropic squared exponential kernel:
sqdist = np.sum(X1**2, 1).reshape(-1, 1) + np.sum(X2**2, 1) - 2 * np.dot(X1, X2.T)
return sigma_f**2 * np.exp(-0.5 / l**2 * sqdist)
I can generate the covariance matrix just fine for now (it can probably be cleaned up, but for now it's a POC):
Matrix2D kernel(const Matrix2D & x1, const Matrix2D & x2, double l = 1.0, double sigma = 1.0) {
auto dists = ((-2.0 * (x1 * x2.transpose())).colwise()
              + x1.rowwise().squaredNorm()).rowwise()
             + x2.rowwise().squaredNorm().transpose();
return std::pow(sigma, 2) * ((-0.5 / std::pow(l, 2)) * dists).array().exp();
}
However, my problems start when I need to sample the multivariate normal.
I've tried using the solution proposed in this accepted answer; however, the decomposition only works with covariance matrices of size up to 30x30; more than that and LLT fails to decompose the matrix. The alternative version provided in the answer also does not work, and creates NaNs. I tried LDLT as well but it also breaks (D contains negative values, so sqrt gives NaN).
Then, I got curious, and I looked into how numpy does this. Turns out the numpy implementation uses SVD decomposition (with LAPACK), rather than Cholesky. So I tried copying their implementation:
// SVD on the covariance matrix generated via kernel function
Eigen::BDCSVD<Matrix2D> solver(covs, Eigen::ComputeFullV);
normTransform = (-solver.matrixV().transpose()).array().colwise() * solver.singularValues().array().sqrt();
// Generate gaussian samples, "randN" is from the multivariate StackOverflow answer
Matrix2D gaussianSamples = Eigen::MatrixXd::NullaryExpr(1, means.size(), randN);
Eigen::MatrixXd samples = (gaussianSamples * normTransform).rowwise() + means.transpose();
The various minuses are me trying to exactly reproduce numpy's results.
In any case, this works perfectly fine, even with large dimensions. I was wondering why Eigen is not able to do LLT, but SVD works. The covariance matrix I use is the same. Is there something I can do to simply use LLT?
EDIT: Here is my full example:
#include <iostream>
#include <random>
#include <Eigen/Cholesky>
#include <Eigen/SVD>
#include <Eigen/Eigenvalues>
using Matrix2D = Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor | Eigen::AutoAlign>;
using Vector = Eigen::Matrix<double, Eigen::Dynamic, 1>;
/*
We need a functor that can pretend it's const,
but to be a good random number generator
it needs mutable state.
*/
namespace Eigen {
namespace internal {
template<typename Scalar>
struct scalar_normal_dist_op
{
static std::mt19937 rng; // The uniform pseudo-random algorithm
mutable std::normal_distribution<Scalar> norm; // The gaussian combinator
EIGEN_EMPTY_STRUCT_CTOR(scalar_normal_dist_op)
template<typename Index>
inline const Scalar operator() (Index, Index = 0) const { return norm(rng); }
};
template<typename Scalar> std::mt19937 scalar_normal_dist_op<Scalar>::rng;
template<typename Scalar>
struct functor_traits<scalar_normal_dist_op<Scalar> >
{ enum { Cost = 50 * NumTraits<Scalar>::MulCost, PacketAccess = false, IsRepeatable = false }; };
} // end namespace internal
} // end namespace Eigen
Matrix2D kernel(const Matrix2D & x1, const Matrix2D & x2, double l = 1.0, double sigma = 1.0) {
auto dists = ((- 2.0 * (x1 * x2.transpose())).colwise() + x1.rowwise().squaredNorm()).rowwise() + x2.rowwise().squaredNorm().transpose();
return std::pow(sigma, 2) * ((-0.5 / std::pow(l, 2)) * dists).array().exp();
}
int main() {
unsigned size = 50;
unsigned seed = 1;
Matrix2D X = Vector::LinSpaced(size, -5.0, 4.8);
Eigen::internal::scalar_normal_dist_op<double> randN; // Gaussian functor
Eigen::internal::scalar_normal_dist_op<double>::rng.seed(seed); // Seed the rng
Vector means = Vector::Zero(X.rows());
auto covs = kernel(X, X);
Eigen::LLT<Matrix2D> cholSolver(covs);
// We can only use the cholesky decomposition if
// the covariance matrix is symmetric, pos-definite.
// But a covariance matrix might be pos-semi-definite.
// In that case, we'll go to an EigenSolver
Eigen::MatrixXd normTransform;
if (cholSolver.info()==Eigen::Success) {
std::cout << "Used LLT\n";
// Use cholesky solver
normTransform = cholSolver.matrixL();
} else {
std::cout << "Broken\n";
Eigen::BDCSVD<Matrix2D> solver(covs, Eigen::ComputeFullV);
normTransform = (-solver.matrixV().transpose()).array().colwise() * solver.singularValues().array().sqrt();
}
Matrix2D gaussianSamples = Eigen::MatrixXd::NullaryExpr(1, means.size(), randN);
Eigen::MatrixXd samples = (gaussianSamples * normTransform).rowwise() + means.transpose();
return 0;
}
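As a side check on the pos-semi-definite hypothesis in the code comments, the spectrum of covs can be inspected directly (a diagnostic fragment, not part of the original program; it reuses the <Eigen/Eigenvalues> header already included above):
// Diagnostic: LLT requires all eigenvalues to be strictly positive.
// Squared-exponential kernel matrices have rapidly decaying spectra, so
// for larger sizes the smallest eigenvalues reach, or round below, zero.
Eigen::SelfAdjointEigenSolver<Matrix2D> eig(covs);
std::cout << "smallest eigenvalue of covs: "
          << eig.eigenvalues().minCoeff() << std::endl;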

How to incorporate time-varying parameters from lookup table into boost::odeint, c++

I am trying to numerically integrate a nonlinear system using boost::odeint. The system has time-varying parameters that are generated externally, and I want to incorporate this into my program. Is this possible with odeint? In Matlab, if you were to do something similar, you would need to interpolate the values as they become available.
Thank you in advance for your help!
Edit:
You can solve a nonlinear time-varying system easily with odeint. The following example, the system x'' + 0.1 x' + x^5 = 6 sin(t), is taken from Applied Nonlinear Control by Slotine.
Notice that we can safely put 6 sin(t) inside ode() since it depends only on t. If your system has a controller that depends on the time step, like a PID controller that requires a delta time to compute derivatives, then don't put it inside ode(), because ode() is called several times per step by the solver. This is my code for solving the system.
#include <cmath>
#include <fstream>
#include <vector>
#include <boost/numeric/odeint.hpp>
std::ofstream data("data.txt");
using namespace boost::numeric::odeint;
typedef std::vector< double > state_type;
class System
{
public:
System(const double& deltaT);
void updateODE();
void updateSystem();
private:
double t, dt;
runge_kutta_dopri5 < state_type > stepper;
state_type y;
void ode(const state_type &y, state_type &dy, double t);
};
System::System(const double& deltaT) : t(0.0), dt(deltaT), y(2)
{
/*
x = y[0]
dx = y[1] = dy[0]
ddx = dy[1] = ode equation
*/
// initial values
y[0] = 2.0; // x1
y[1] = 0.0; // x2
}
void System::updateODE()
{
// store data for plotting
data << t << " " << y[0] << std::endl;
//=========================================================================
using namespace std::placeholders;
stepper.do_step(std::bind(&System::ode, this, _1, _2, _3), y, t, dt);
t += dt;
}
void System::updateSystem()
{
// you can utilize this function in case you have a controller and
// you need to update the controller at a fixed step size.
}
void System::ode(const state_type &y, state_type &dy, double t)
{
//#####################( ODE Equation )################################
dy[0] = y[1];
dy[1] = 6.0*sin(t) - 0.1*y[1] - pow(y[0],5);
}
int main(int argc, char **argv)
{
const double dt(0.001);
System sys(dt);
for (double t(0.0); t <= 50.0; t += dt){
// update time-varying parameters of the system
//sys.updateSystem();
// solve the ODE one step forward.
sys.updateODE();
}
return 0;
}
The resulting plot (omitted here) matches the one presented in the aforementioned book.
Yes, this is possible. But you might need to interpolate the values too. You interact with the solver via the system function. It can be a functor, and it can contain a reference to the data, for example:
struct ode
{
void operator()( const state_type& x , state_type& dxdt , double t ) const
{
// look up / interpolate the parameter for time t from time_varying_parameters
}
std::vector<double> time_varying_parameters;
};
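For instance, a sketch of such a functor that linearly interpolates the parameter from a lookup table (the member names and table layout are my own assumptions), used here to replace the fixed 6 sin(t) forcing from the example above:
#include <algorithm>
#include <cmath>
#include <vector>
#include <boost/numeric/odeint.hpp>
typedef std::vector<double> state_type;
struct ode
{
    std::vector<double> times;  // sample instants, ascending
    std::vector<double> values; // externally generated parameter values
    // linear interpolation of the parameter at time t, clamped at the ends
    double param(double t) const
    {
        if (t <= times.front()) return values.front();
        if (t >= times.back())  return values.back();
        auto it = std::upper_bound(times.begin(), times.end(), t);
        std::size_t i = (it - times.begin()) - 1;
        double f = (t - times[i]) / (times[i + 1] - times[i]);
        return values[i] * (1.0 - f) + values[i + 1] * f;
    }
    void operator()(const state_type& y, state_type& dy, double t) const
    {
        dy[0] = y[1];
        dy[1] = param(t) - 0.1 * y[1] - std::pow(y[0], 5);
    }
};
int main()
{
    ode sys;
    sys.times  = {0.0, 1.0, 2.0};
    sys.values = {0.0, 6.0, 0.0};
    state_type y = {2.0, 0.0};
    boost::numeric::odeint::integrate(sys, y, 0.0, 2.0, 0.001);
    return 0;
}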

R interacting with c++ by means of Rcpp to solve large ODE systems

I'm trying to speed up my R code by moving a computationally expensive task to C++ with Rcpp. My problem involves a system of approximately 100 equations, so any hint to speed up the computation is welcome.
What I need is to import a matrix MX created in R into a C++ script. The C++ code has to use the rows of MX as x0 (initial values) in a system of ODEs.
To simplify the explanation of my problem, the code below is based on the Lorenz system.
As is clear from the quality of my code, I'm new to C++ (and Rcpp).
For clarity, I haven't posted all my test code, which is terrible, but I really need your help to solve this problem.
Any help will be really, really appreciated!
Thanks in advance.
#include <boost/array.hpp>
#include <boost/numeric/odeint.hpp>
#include <Rcpp.h>
// [[Rcpp::depends(BH)]]
// [[Rcpp::plugins(cpp11)]]
using namespace std;
using namespace boost::numeric::odeint;
double theta [] = {10.000,28,2.5};
typedef boost::array< double , 3 > state_type;
void lorenz( const state_type &x , state_type &dxdt , double t ) {
dxdt[0] = theta[0] * ( x[1] - x[0] );
dxdt[1] = theta[1] * x[0] - x[1] - x[0] * x[2];
dxdt[2] = -theta[2] * x[2] + x[0] * x[1];
}
struct foo { std::vector<double> a, b, c; };
struct foo f;
// the observer should append a single output row for each input row of mx, corresponding to the last integration step
void append_lorenz(const state_type &x , const double t ) {
f.a.push_back(x[0]);
f.b.push_back(x[1]);
f.c.push_back(x[2]);
}
using namespace Rcpp;
// [[Rcpp::export]]
DataFrame callMain(NumericMatrix mx){
int n = mx.nrow();
NumericMatrix total(mx);
for(int i = 0; i < n; ++i) {
// state_type x should be mx rows
state_type x = total.row(i); // initial conditions
const double dt =0.1;
integrate(lorenz , x , 0.0 , 1.0 , dt , append_lorenz );
}
return DataFrame::create(Named("a") = f.a, Named("b") = f.b, Named("c") = f.c);
}
/*** R
mx=matrix(1:9,3,3)
res <- callMain(mx)
print((res))
*/
the error I get is:
error: conversion from ‘Rcpp::Matrix<14>::Row {aka Rcpp::MatrixRow<14>}’ to non-scalar type ‘state_type {aka boost::array}’ requested
state_type x = total.row(i); // initial conditions
I think the error message is clear enough:
state_type x = total.row(i);
There is no conversion between an Rcpp matrix row and a boost::array; you need to write the conversion yourself.
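For example, a minimal fix (a sketch, assuming the state always has exactly three components, as in the Lorenz example) is to replace the offending line with an element-wise copy inside the loop:
state_type x; // boost::array<double, 3>
for (std::size_t j = 0; j < x.size(); ++j)
    x[j] = total(i, j); // copy the i-th row of the matrix into the state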

Comparison of odeint's runge_kutta4 with Matlab's ode45

I would like to use the runge_kutta4 method in the odeint C++ library. I've solved the problem in Matlab. My Matlab code to solve x'' = -x - g*x', with initial values x1 = 1, x2 = 0, is as follows:
main.m
clear all
clc
t = 0:0.1:10;
x0 = [1; 0];
[t, x] = ode45('ODESolver', t, x0);
plot(t, x(:,1));
title('Position');
xlabel('time (sec)');
ylabel('x(t)');
ODESolver.m
function dx = ODESolver(t, x)
dx = zeros(2,1);
g = 0.15;
dx(1) = x(2);
dx(2) = -x(1) - g*x(2);
end
I've installed the odeint library. My code for using runge_kutta4 is as follows:
#include <iostream>
#include <boost/numeric/odeint.hpp>
using namespace std;
using namespace boost::numeric::odeint;
/* The type of container used to hold the state vector */
typedef std::vector< double > state_type;
const double gam = 0.15;
/* The rhs of x' = f(x) */
void lorenz( const state_type &x , state_type &dx , double t )
{
dx[0] = x[1];
dx[1] = -x[0] - gam*x[1];
}
int main(int argc, char **argv)
{
const double dt = 0.1;
runge_kutta_dopri5<state_type> stepper;
state_type x(2);
x[0] = 1.0;
x[1] = 0.0;
double t = 0.0;
cout << x[0] << endl;
for ( size_t i(0); i <= 100; ++i){
stepper.do_step(lorenz, x , t, dt );
t += dt;
cout << x[0] << endl;
}
return 0;
}
The result is shown in the following picture (plot omitted here).
My question is: why do the results differ? Is there something wrong with my C++ code?
These are the first values from both methods:
Matlab C++
-----------------
1.0000 0.9950
0.9950 0.9803
0.9803 0.9560
0.9560 0.9226
0.9226 0.8806
0.8806 0.8304
0.8304 0.7728
0.7728 0.7084
0.7083 0.6379
Update:
The problem is that I forgot to include the initial value in my C++ code. Thanks to @horchler for noticing it. After including the proper values and using runge_kutta_dopri5 as @horchler suggested, the result is:
Matlab C++
-----------------
1.0000 1.0000
0.9950 0.9950
0.9803 0.9803
0.9560 0.9560
0.9226 0.9226
0.8806 0.8806
0.8304 0.8304
0.7728 0.7728
0.7083 0.7084
I've updated the code to reflect these modifications.
The runge_kutta4 stepper in odeint is nothing like Matlab's ode45, which is an adaptive scheme based on the Dormand-Prince method. To replicate Matlab's results, you should probably try the runge_kutta_dopri5 stepper. Also, make sure that your C++ code uses the same absolute and relative tolerances as ode45 (defaults are 1e-6 and 1e-3, respectively). Lastly, it looks like you may not be saving/printing your initial condition in your C++ results.
If you're confused at why ode45 is not taking fixed steps even though you specified t = 0:0.1:10;, see my detailed answer here.
If you must use the fixed-step runge_kutta4 stepper, then you'll need to reduce the integration step size in your C++ code to match Matlab's output.
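For example, a sketch of that fixed-step approach (the step size 0.001 is my choice; the output is thinned to every 100th step so it lands on Matlab's 0.1 s grid):
#include <iostream>
#include <vector>
#include <boost/numeric/odeint.hpp>
using namespace boost::numeric::odeint;
typedef std::vector<double> state_type;
const double gam = 0.15;
void damped_osc(const state_type &x, state_type &dx, double t)
{
    dx[0] = x[1];
    dx[1] = -x[0] - gam * x[1];
}
int main()
{
    runge_kutta4<state_type> rk4;
    state_type x(2);
    x[0] = 1.0;
    x[1] = 0.0;
    const double dt = 0.001; // much smaller than the 0.1 s output spacing
    double t = 0.0;
    std::cout << x[0] << std::endl; // print the initial condition too
    for (std::size_t i = 1; i <= 10000; ++i) {
        rk4.do_step(damped_osc, x, t, dt);
        t += dt;
        if (i % 100 == 0) // thin output to the 0.1 s grid
            std::cout << x[0] << std::endl;
    }
    return 0;
}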
The Matlab ode45 function already includes error control and, I think, also interpolation (dense output). To compare with Boost.odeint you should use the same functionality there. Boost.odeint provides integrate functions that perform step-size control and dense output if the underlying stepper algorithm provides this functionality. The following code shows how this is used with the default error-control parameters from Matlab given by @horchler:
#include <iostream>
#include <vector>
#include <boost/numeric/odeint.hpp>
using namespace std;
using namespace boost::numeric::odeint;
/* The type of container used to hold the state vector */
typedef std::vector< double > state_type;
const double gam = 0.15;
/* The rhs of x' = f(x) */
void damped_osc( const state_type &x , state_type &dx , const double t )
{
dx[0] = x[1];
dx[1] = -x[0] - gam*x[1];
}
void print( const state_type &x, const double t )
{
cout << x[0] << endl;
}
int main(int argc, char **argv)
{
cout.precision(16); // full precision output
const double dt = 0.1;
typedef runge_kutta_dopri5<state_type> stepper_type;
state_type x(2);
x[0] = 1.0;
x[1] = 0.0;
integrate_const(make_dense_output<stepper_type>( 1E-6, 1E-3 ),
damped_osc, x, 0.0, 10.0, dt , print);
return 0;
}
Please note that the results might still not be exactly the same (as in all 16 digits), because the error control in Boost.odeint might not be implemented exactly as in Matlab's ode45.