Eigen LLT Module Giving incorrect result? - c++

First off, I assume the problem is with me and not with Eigen's LLT module. That said, here is the code (I will explain the problem briefly) but sourcing the code in Rstudio should recreate the bug.
#include <RcppEigen.h>
using namespace Rcpp;
using Eigen::MatrixXd;
using Eigen::VectorXd;
// [[Rcpp::depends(RcppEigen)]]
template <typename T>
void fillUnitNormal(Eigen::PlainObjectBase<T>& Z){
int m = Z.rows();
int n = Z.cols();
Rcpp::NumericVector r(m*n);
r = Rcpp::rnorm(m*n, 0, 1); // using vectorization from Rcpp sugar
std::copy(std::begin(r), std::end(r), Z.data());
}
template <typename T1, typename T2, typename T3>
// #param z is object derived from class MatrixBase to overwrite with sample
// #param m MAP estimate
// #param S the hessian of the NEGATIVE log-likelihood evaluated at m
// #param pars structure of type pars
// #return int 0 success, 1 failure
int cholesky_lap(Eigen::MatrixBase<T1>& z, Eigen::MatrixBase<T2>& m,
Eigen::MatrixBase<T3>& S){
int nc=z.cols();
int nr=z.rows();
Eigen::LLT<MatrixXd> hesssqrt;
hesssqrt.compute(-S);
if (hesssqrt.info() == Eigen::NumericalIssue){
Rcpp::warning("Cholesky of Hessian failed with status status Eigen::NumericalIssue");
return 1;
}
typename T1::PlainObject samp(nr, nc);
fillUnitNormal(samp);
z = hesssqrt.matrixL().solve(samp);
z.template colwise() += m;
return 0;
}
// #param z an object derived from class MatrixBase to overwrite with samples
// #param m MAP estimate (as a vector)
// #param S the hessian of the NEGATIVE log-likelihood evaluated at m
// block forms should be given as blocks row bound together, blocks
// must be square and of the same size!
// [[Rcpp::export]]
Eigen::MatrixXd LaplaceApproximation(int n_samples, Eigen::VectorXd m,
Eigen::MatrixXd S){
int p=m.rows();
MatrixXd z = MatrixXd::Zero(p, n_samples);
int status = cholesky_lap(z, m, S);
if (status==1) Rcpp::stop("decomposition failed");
return z;
}
/*** R
library(testthat)
n_samples <- 1000000
m <- 1:3
S <- diag(1:3)
S[1,2] <- S[2,1] <- -1
S <- -S # Pretending this is the negative precision matrix
# e.g., hessian of negative log likelihood
z <- LaplaceApproximation(n_samples, m, S)
expect_equal(var(t(z)), solve(-S), tolerance=0.005)
expect_equal(rowMeans(z), m, tolerance=.01)
*/
Here is the (key) output:
> expect_equal(var(t(z)), solve(-S), tolerance=0.005)
Error: var(t(z)) not equal to solve(-S).
2/9 mismatches (average diff: 1)
[1] 0.998 - 2 == -1
[5] 2.003 - 1 == 1
In Words:
I am trying to write a function to perform a Laplace approximation. This means essentially sampling from a multivariate normal with mean m and covariance inverse(-S) where S is the Hessian of the negative log-liklihood.
My code works perfectly for an eigen decomposition I coded but for some reason, it is failing with the Cholesky. (I have tried to just give a minimal reproducible example and for space am not showing the eigen decomposition).
The best thought I have now is that some aliasing issue is happening but I can't figure out where that would be...
Thank you in advance!

It turned out to be a simple math error. Not a code error. Issue was that cholesky of matrix inverse has a transpose compared to just the inverse of the cholesky of the original matrix. Changing
z = hesssqrt.matrixL().solve(samp);
to
z = hesssqrt.matrixU().solve(samp);
Solved the problem.

Related

Implementing the Bartels–Stewart algorithm in Eigen3 -- real matrices only?

Based off this question and solution -- Implementing the Bartels–Stewart algorithm in Eigen3? -- I am trying to solve Lyapunov equations (AX + XA^T = C) using the Eigen library, but am limited to real matrices.
The R (with c++) code below works, but involves complex numbers. It can definitely be simplified (since in this framing, there is no B matrix), but the main difficulty is the reliance on complex numbers. The real schur form seems to be the standard alternative in this case, but the Eigen function matrix_function_solve_triangular_sylvester then does not work because the input matrix is not upper triangular, but is upper block triangular. I would be happy to see suggestions to a) remove the need for complex numbers, and then if that is possible, b) any efficiency improvements.
library(expm)
library(Rcpp)
library(RcppEigen)
library(inline)
# R -----------------------------------------------------------------------
d<-6 #dimensions
A<-matrix(rnorm(d^2),d,d) #continuous time transition
G <- matrix(rnorm(d^2),d,d)
C<-G %*% t(G) #continuous time pos def error
AHATCH<-A %x% diag(d) + diag(d) %x% A
Xtrue<-matrix(-solve(AHATCH,c(C)), d) #asymptotic error from continuous time
# c++ in R ---------------------------------------------------------------------
sylcpp <- '
using Eigen::Map;
using Eigen::MatrixXd;
// Map the double matrix A from Ar
const Map<MatrixXd> A(as<Map<MatrixXd> >(Ar));
// Map the double matrix Q from Qr
const Map<MatrixXd> Q(as<Map<MatrixXd> >(Qr));
Eigen::MatrixXd B = A.transpose();
Eigen::ComplexSchur<Eigen::MatrixXd> SchurA(A);
Eigen::MatrixXcd R = SchurA.matrixT();
Eigen::MatrixXcd U = SchurA.matrixU();
Eigen::ComplexSchur<Eigen::MatrixXd> SchurB(B);
Eigen::MatrixXcd S = SchurB.matrixT();
Eigen::MatrixXcd V = SchurB.matrixU();
Eigen::MatrixXcd F = (U.adjoint() * Q) * V;
Eigen::MatrixXcd Y = Eigen::internal::matrix_function_solve_triangular_sylvester(R, S, F);
Eigen::MatrixXd X = ((U * Y) * V.adjoint()).real();
return wrap(X);
'
syl <- cxxfunction(signature(Ar = "matrix",Qr='matrix'), sylcpp, plugin = "RcppEigen")
X=syl(A,-C)
X-Xtrue #approx zero
In principle, you could use RealSchur insted.
That will produce a quasi-triangular real R.

How to find (Q, R ) from SuiteSparseQR_factorization object?

In C++ interface of SuiteSparse, I can use
SuiteSparseQR_factorization <double> *QR;
QR = SuiteSparseQR_factorize(A) ;
to calculate QR decomposition of matrix A so that I can reuse QR for further calculation. But I wonder can I get the real Q,R directly from
this QR object?
SuiteSparse is awesome, but the interface can be confusing. Unfortunately, the methods that involve the SuiteSparseQR_factorization struct, which appear to be the most convenient, haven't worked so well for me in practice. For instance, using SuiteSparseQR_factorize and then SuiteSparseQR_qmult with a sparse matrix input argument actually converts it to a dense matrix first, which seems completely unnecessary!
Instead, use
template <typename Entry> SuiteSparse_long SuiteSparseQR
(
// inputs, not modified
int ordering, // all, except 3:given treated as 0:fixed
double tol, // only accept singletons above tol
SuiteSparse_long econ, // number of rows of C and R to return; a value
// less than the rank r of A is treated as r, and
// a value greater than m is treated as m.
int getCTX, // if 0: return Z = C of size econ-by-bncols
// if 1: return Z = C' of size bncols-by-econ
// if 2: return Z = X of size econ-by-bncols
cholmod_sparse *A, // m-by-n sparse matrix
// B is either sparse or dense. If Bsparse is non-NULL, B is sparse and
// Bdense is ignored. If Bsparse is NULL and Bdense is non-NULL, then B is
// dense. B is not present if both are NULL.
cholmod_sparse *Bsparse,
cholmod_dense *Bdense,
// output arrays, neither allocated nor defined on input.
// Z is the matrix C, C', or X
cholmod_sparse **Zsparse,
cholmod_dense **Zdense,
cholmod_sparse **R, // the R factor
SuiteSparse_long **E, // size n; fill-reducing ordering of A.
cholmod_sparse **H, // the Householder vectors (m-by-nh)
SuiteSparse_long **HPinv,// size m; row permutation for H
cholmod_dense **HTau, // size nh, Householder coefficients
// workspace and parameters
cholmod_common *cc
) ;
This method will perform the factorization and then, optionally, output (among other things) R, the matrix product Z = Q^T * B (or its transpose -- B^T * Q), or the solution of a linear system. To get Q, define B as the identity matrix. Here's an example to get Q and R.
cholmod_common Common, * cc;
cc = &Common;
cholmod_l_start(cc);
cholmod_sparse *A;//assume you have already defined this
int ordering = SPQR_ORDERING_BEST;
double tol = 0;
Long econ = A->nrow;
int getCTX = 1;// Z = (Q^T * B)^T = B^T * Q
cholmod_sparse *B = cholmod_l_speye(A->nrow, A->nrow, CHOLMOD_REAL, cc);//the identity matrix
cholmod_sparse *Q, *R;//output pointers to the Q and R sparse matrices
SuiteSparseQR<double>(ordering, tol, econ, getCTX, A, B, NULL, &Q, NULL, &R, NULL, NULL, NULL, NULL, cc);
If you want any of the other outputs to perform subsequent operations without the use of an explicitly formed Q and/or R, then you need to substitute the NULL's for additional pointers and then make calls to SuiteSparseQR_qmult.

Eigenvector calculation in C++

How can I make a function in cpp in order to calculate the first "Q" eigenvectors of a matrix M?
I tried using this code, but failed.
#include <RcppArmadillo.h>
using namespace arma;
mat M;
int Q;
vec getEigen(M,Q) {
return eig_sym(M, Q);
}
The error message says:
"no matching function for call to "arma::col(arma::mat&, int&)"
Any idea? I am new at cpp and don't know what the message means.
Thanks
As noted in the comments, there is no function in Armadillo that returns a subset of the eigenvalues. However, one can combine .head() or .tail() with eigen_sym() to extract a subset. In addition, it makes sense to use reverse(), since Armadillo returns eigenvalues in ascending order. For convenience I am using RcppArmadillo with Rcpp attributes here:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
arma::vec getEigen(const arma::mat& M, int Q) {
return arma::reverse(arma::eig_sym(M).tail(Q));
}
/*** R
set.seed(42)
N <- 10
m <- matrix(rnorm(N * N), N, N)
m <- m + t(m)
getEigen(m, N/2)
*/
Output upon calling Rcpp::sourceCpp on the file:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
arma::vec getEigen(const arma::mat& M, int Q) {
return arma::reverse(arma::eig_sym(M).tail(Q));
}
/*** R
set.seed(42)
N <- 10
m <- matrix(rnorm(N * N), N, N)
m <- m + t(m)
getEigen(m, N/2)
*/
This is only for the eigenvalues, not the eigenvectors. Extracting the Eigenvectors shouldn't be difficult, though.

SparseQR for Least Squares

For an application I am building I need to run linear regression on large datasets in order to obtain residuals. For example, one dataset is more than 1 million x 20k in dimension. For the smaller datasets I was using fastLm from the RcppArmadillo package - which works great for those - currently. With time those datasets will also grow beyond 1 million rows.
My solution was to use sparse matrices and Eigen. I was unable to find a good example for using SparseQR in RcppEigen. Based on many hours of reading (e.g. rcpp-gallery, stackoverflow, rcpp-dev mailinglist, eigen docs, rcpp-gallery, stackoverflow and many more that I have forgotten but sure have read) I wrote the following piece of code;
(NB: my first c++ program - please be nice :) - any advice to improve is welcomed)
// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>
using namespace Rcpp;
using namespace Eigen;
using Eigen::Map;
using Eigen::SparseMatrix;
using Eigen::MappedSparseMatrix;
using Eigen::VectorXd;
using Eigen::SimplicialCholesky;
// [[Rcpp::export]]
List sparseLm_eigen(const SEXP Xr,
const NumericVector yr){
typedef SparseMatrix<double> sp_mat;
typedef MappedSparseMatrix<double> sp_matM;
typedef Map<VectorXd> vecM;
typedef SimplicialCholesky<sp_mat> solver;
const sp_mat Xt(Rcpp::as<sp_matM>(Xr).adjoint());
const VectorXd Xty(Xt * Rcpp::as<vecM>(yr));
const solver Ch(Xt * Xt.adjoint());
if(Ch.info() != Eigen::Success) return "failed";
return List::create(Named("betahat") = Ch.solve(Xty));
}
This works for example for;
library(Matrix)
library(speedglm)
Rcpp::sourceCpp("sparseLm_eigen.cpp")
data("data1")
data1$fat1 <- factor(data1$fat1)
mm <- model.matrix(formula("y ~ fat1 + x1 + x2"), dat = data1)
sp_mm <- as(mm, "dgCMatrix")
y <- data1$y
res1 <- sparseLm_eigen(sp_mm, y)$betahat
res2 <- unname(coefficients(lm.fit(mm, y)))
abs(res1 - res2)
It fails however for my large datasets (as I kind of expected). My initial intention was to use the SparseQR as a solver but I don't know how to implement that.
So my question - can someone help me to implement QR decomposition for sparse matrices with RcppEigen?
How to write a sparse solver with Eigen is a bit generic. This is mainly because the sparse solver classes are designed superbly well. They provide a guide explaining their sparse solver classes. Since the question focuses on SparseQR, the documentation indicates that there are two parameters required to initialize the solver: SparseMatrix class type and OrderingMethods class that dictates the supported fill-reducing ordering method.
With this in mind, we can whip up the following:
// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>
#include <Eigen/SparseQR>
// [[Rcpp::export]]
Rcpp::List sparseLm_eigen(const Eigen::MappedSparseMatrix<double> A,
const Eigen::Map<Eigen::VectorXd> b){
Eigen::SparseQR <Eigen::MappedSparseMatrix<double>, Eigen::COLAMDOrdering<int> > solver;
solver.compute(A);
if(solver.info() != Eigen::Success) {
// decomposition failed
return Rcpp::List::create(Rcpp::Named("status") = false);
}
Eigen::VectorXd x = solver.solve(b);
if(solver.info() != Eigen::Success) {
// solving failed
return Rcpp::List::create(Rcpp::Named("status") = false);
}
return Rcpp::List::create(Rcpp::Named("status") = true,
Rcpp::Named("betahat") = x);
}
Note: Here we create a list that always passes a named status variable that should be checked first. This indicates whether convergence happens in two areas: decomposition and solving. If all checks out, then we pass the betahat coefficient.
Test Script:
library(Matrix)
library(speedglm)
Rcpp::sourceCpp("sparseLm_eigen.cpp")
data("data1")
data1$fat1 <- factor(data1$fat1)
mm <- model.matrix(formula("y ~ fat1 + x1 + x2"), dat = data1)
sp_mm <- as(mm, "dgCMatrix")
y <- data1$y
res1 <- sparseLm_eigen(sp_mm, y)
if(res1$status != TRUE){
stop("convergence issue")
}
res1_coef = res1$betahat
res2_coef <- unname(coefficients(lm.fit(mm, y)))
cbind(res1_coef, res2_coef)
Output:
res1_coef res2_coef
[1,] 1.027742926 1.027742926
[2,] 0.142334262 0.142334262
[3,] 0.044327457 0.044327457
[4,] 0.338274783 0.338274783
[5,] -0.001740012 -0.001740012
[6,] 0.046558506 0.046558506

How to define a sequence of functions with RcppGSL?

I would like to do a 2-dimentional spline interpolation. Since GSL library does not have a multi-dimentional function ready to use, I think I could use two-step interpolation. That is, I interpolate along first dimension conditional on a grid of values of the second dimension, and then interpolate along the second dimension. In R, I can easily create a list of spline functions that can bridge the two steps. For example, I have a vector of x and y, and the corresponding matrix of z = f(x,y). Now I want to interpolate f(x0,y0) at values of x0 and y0.
x <- 1:10
y <- 3:8
z <- matrix(rnorm(length(x)*length(y)), length(x), length(y))
x0 <- 2.2; y0 <- 4.5
# Create a sequence of spline functions conditional on y
spl.list <- vector("list", length(y))
for(i in 1:length(y)){
spl.list[[i]] <- splinefun(x, z[,i], "natural")
}
# The function values at (x0, y).
intp1 <- sapply(1:length(y), function(i) spl.list[[i]](x0) )
# Create the spline function along y.
intp2.spl <- splinefun(y, intp1, "natural")
intp2.spl(y0)
I'm trying to achieve the similar goal with RcppGSL. I'm using gsl_spline from GSL. However, the issue is that I'm not aware of anything in RcppGSL that can stores a sequence of functions, i.e., something like spl.list in the R code above. I tried List, but it is apparently not the right thing. The code below is my univariate interpolation function in RcppGSL.
src <- '
#include <RcppGSL.h>
#include <gsl/gsl_spline.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppGSL)]]
// [[Rcpp::export]]
double my_fn(NumericVector x, NumericVector y, double x0){
int nx = x.length();
gsl_interp_accel *accP = gsl_interp_accel_alloc();
gsl_spline *spline = gsl_spline_alloc( gsl_interp_cspline , nx );
gsl_spline_init( spline, x.begin(), y.begin(), nx);
double out = gsl_spline_eval(spline, x0, accP);
gsl_interp_accel_free (accP);
gsl_spline_free (spline);
return(out);
}
'
sourceCpp(code = src)
Anyone has some thought of how to create a sequence of functions in RcppGSL? Or other alternatives to get the 2-dimensional spline interpolation with RcppGSL?