Reconciling exponential function results in C++ (Rcpp) and R - c++

I am working on speeding up software from my dissertation by utilizing Rcpp and RcppEigen. I have been very impressed with Rcpp and RcppEigen as the speed of my software has increased by upwards of 100 times. This is quite exciting to me because my R code had been parallelized using snow/doSNOW and the foreach package, so the actual speed gain is probably somewhere around 400x. However, the last time I attempted to run my program in its entirety to assess overall speed gains after translating some gradient/Hessian calculations into C++, I saw that the new Hessian matrix calculated using my C++ code differs from the old, much slower version, which was calculated strictly in R. I had been very careful to check my results line by line, slowly increasing the complexity of my calculations while ensuring the results were identical in R and C++. I realize now that I was only checking the first 11 or so digits.
The code for optimization has been very robust in R, but was dreadfully slow. All of the calculations in C++ have been checked and were virtually identical to previous versions in R (this was checked to 11 digits via specifying options(digits = 11) at the beginning of each session). However, deviations in long vectors or matrices representing particular quantities begin at 15 or so digits past the decimal point in some cells/elements. These differences become problematic when using matrix multiplication and summing over risk sets, as a small difference can lead to a large error (is it an error?) in the overall precision of the final estimate.
After looking back over my code and finding the first point of deviation in results between R and C++, I observed that this first occurs after taking the exponential of a matrix or vector in my Rcpp code. This led me to work out the examples below, which I hope illustrate the issue I am seeing. Has anyone observed this before, and is there a way to utilize the R exponential function within C++ or change the routine used within C++?
## A small example to illustrate issues with Rcppsugar exponentiate function
library(RcppEigen)
library(inline)
RcppsugarexpC <-
"
using Eigen::MatrixXd;
typedef Eigen::Map<Eigen::MatrixXd> MapMatd;
MapMatd A(as<MapMatd>(AA));
MatrixXd B = exp(A.array());
return wrap(B);
"
RcppexpC <-
"
using Eigen::MatrixXd;
using Eigen::VectorXd;
typedef Eigen::Map<Eigen::MatrixXd> MapMatd;
MapMatd A(as<MapMatd>(AA));
MatrixXd B = A.array().exp().matrix();
return wrap(B);
"
Rcppsugarexp <- cxxfunction(signature(AA = "NumericMatrix"), RcppsugarexpC, plugin = "RcppEigen")
Rcppexp <- cxxfunction(signature(AA = "NumericMatrix"), RcppexpC, plugin = "RcppEigen")
mat <- matrix(seq(-5.25, 10.25, by = 1), ncol = 4, nrow = 4)
RcppsugarC <- Rcppsugarexp(mat)
RcppexpC <- Rcppexp(mat)
exp <- exp(mat)
I then tested whether these exponentiated matrices were actually equal beyond the print standard (default is 7) that R uses via:
exp == RcppexpC ## inequalities in 3 cells
exp == RcppsugarC ## inequalities in 3 cells
RcppsugarC == RcppexpC ## these are equal!
sprintf("%.22f", exp)
Please forgive me if this is a dense question - my computer science skills are not as strong as they should be, but I am eager to learn how to do better. I appreciate any and all help or advice that can be given me. Special thanks to the creators of Rcpp, and all of the wonderful moderators/contributors at this site - your previous answers have saved me from posting questions on here well over a hundred times!
Edit:
It turns out that I didn't know what I was doing. I wanted to apply Rcpp sugar to the MatrixXd or VectorXd, which I was attempting by using the .array() method. I first thought that exp(A.array()) or A.array().exp() computed the matrix exponential rather than exp(A_ij) element by element, but both are element-wise in Eigen (the matrix exponential lives in Eigen's unsupported MatrixFunctions module); the more likely explanation for the last-digit differences is that Eigen uses its own vectorized exp routine instead of the C library's std::exp, which is what R calls. My friend pointed this out to me when he worked out a simple example using std::exp() on each element in a nested for loop and found that this result was identical to what was reported in R. I thus used the .unaryExpr functionality of Eigen with a lambda, which meant changing the compiler settings to -std=c++0x. I was able to do this by specifying the following in R:
settings$env$PKG_CXXFLAGS='-std=c++0x'
I then made a file called testingRcpp.cpp, which is below:
#include <RcppEigen.h>
// [[Rcpp::depends(RcppEigen)]]
using Eigen::Map; // 'maps' rather than copies
using Eigen::MatrixXd; // variable size matrix, double precision
using Eigen::VectorXd; // variable size vector, double precision
// [[Rcpp::export]]
MatrixXd expCorrect(Map<MatrixXd> M) {
  MatrixXd M2 = M.unaryExpr([](double e){return(std::exp(e));});
  return M2;
}
After this, I was able to load this function into R with sourceCpp() as follows (note that I used the options verbose = TRUE and rebuild = TRUE because this seems to show which settings are in use; I was trying to make sure that -std=c++0x was actually being applied):
sourceCpp("~/testingRcpp.cpp", verbose = TRUE, rebuild = TRUE)
Then the following R code worked like a charm:
mat <- matrix(seq(-5.25, 10.25, by = 1), ncol = 4, nrow = 4)
exp(mat) == expCorrect(mat)
Pretty cool!
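For reference, the nested-loop check described in the edit can be sketched without Eigen at all. This is only an illustration under assumptions: exp_elementwise is a hypothetical name, and the matrix is taken to be a flat column-major buffer (the layout R and Eigen use by default).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise exponential of a column-major nrow x ncol matrix stored in a
// flat buffer, calling the C library's std::exp on every entry.
std::vector<double> exp_elementwise(const std::vector<double>& m,
                                    std::size_t nrow, std::size_t ncol)
{
    std::vector<double> out(m.size());
    for (std::size_t j = 0; j < ncol; ++j)        // columns
        for (std::size_t i = 0; i < nrow; ++i)    // rows within a column
            out[i + j * nrow] = std::exp(m[i + j * nrow]);
    return out;
}
```

Because every entry goes through std::exp, the output should match R's exp() wherever R relies on the same C library routine.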

Related

RcppEigen slicing method to obtain subset of a matrix not working consistently and causes fatal error/crash

I'm writing an R package using Rcpp and RcppEigen and I'm having a problem with matrix slicing and subsetting. I need to get an off-diagonal square matrix from a larger square matrix.
The Eigen::MatrixXd slicing methods seem to be the problem. Using Eigen::seq() method doesn't work at all from R because "no member named 'seq' in namespace 'Eigen'" and the MatrixXd.block(i, j, n, n) method is causing a crash with certain large(ish) values of n. Sometimes it works perfectly fine, but if I increase the size, it causes a fatal crash.
Here's an example of the C++ code:
// [[Rcpp::depends(RcppEigen)]]
#include <iostream>
#include <RcppEigen.h>
using namespace Rcpp;
using Eigen::Map;
using Eigen::MatrixXd;
typedef Map<MatrixXd> MapMatd;
// [[Rcpp::export]]
List crosspart_worker_cpp(const MapMatd& Vij, ...){
  int n = Vij.cols()/2;
  /* ... some arbitrary code ... */
  // extract sub-block of varcovar matrix (only unique pairs)
  MatrixXd Vsub = Vij.block(1, n + 1, n, n);
  /* ... more code ... */
  List out_lst = List::create(Named("Vsub") = Vsub, ...);
  return out_lst;
}
and R code that uses it:
# test_crosspart-worker-cpp.R
# setup ----
## source relevant file
# normally build and load package instead of sourceCpp()
if(!exists("crosspart_worker_cpp")){
  Rcpp::sourceCpp("crosspart-worker.cpp")
}
## make reproducible
set.seed(75)
## set parameters
n = 100 # rows in original model matrix X
## ... additional setup code ...
# Generate dummy data with arbitrary contents, but correct structure ----
## varcovar matrix
Vij <- matrix(abs(rnorm(2*n * 2*n)), nrow = 2*n)
## ... other code ...
# use the function ----
result <- crosspart_worker_cpp(Vij = Vij, ...)
Why doesn't this work consistently? Are there other options for sub-setting a matrix in RcppEigen?
My original post contained a larger C++ file and I didn't know where the error was. I've been able to identify the problem line of code, and the function works perfectly if I remove it. Currently, I am obtaining the matrix subset in R and then passing that as an argument to the C++ function. However, it would be highly beneficial for my package to do all of this in a C++ function.
I used Rcout to find the problem line of code:
MatrixXd Vsub = VDiag.block(1, n + 1, n, n);
This is supposed to take the block matrix starting at row 1 and column np and containing np rows and columns, according to Eigen's "Slicing and Indexing" documentation.
It is meant to replicate the R code Vsub[1:np, (n + 1):(2*n)].
Why would this cause a crash? Does the Eigen slice method not work in R? Is there an RcppEigen-specific way? I haven't found one online.
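One likely cause, offered as a hedged sketch rather than a confirmed diagnosis: Eigen's block(i, j, p, q) takes 0-based start indices, whereas the R expression is 1-based. In a 2n x 2n matrix, block(1, n + 1, n, n) therefore asks for column index 2n, one past the end, and the 0-based equivalent of the R subset would be block(0, n, n, n). The helpers below (block_fits and copy_block are hypothetical names, written against a plain column-major buffer instead of Eigen) make the bounds arithmetic explicit.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// True if a p x q block whose top-left corner is the 0-based position (i, j)
// lies entirely inside a rows x cols matrix.
bool block_fits(std::size_t rows, std::size_t cols,
                std::size_t i, std::size_t j, std::size_t p, std::size_t q)
{
    return i + p <= rows && j + q <= cols;
}

// Copy that block out of a column-major buffer, refusing out-of-range requests
// instead of reading past the end of the data.
std::vector<double> copy_block(const std::vector<double>& m,
                               std::size_t rows, std::size_t cols,
                               std::size_t i, std::size_t j,
                               std::size_t p, std::size_t q)
{
    if (m.size() != rows * cols || !block_fits(rows, cols, i, j, p, q))
        throw std::out_of_range("block exceeds matrix bounds");
    std::vector<double> out(p * q);
    for (std::size_t c = 0; c < q; ++c)
        for (std::size_t r = 0; r < p; ++r)
            out[r + c * p] = m[(i + r) + (j + c) * rows];
    return out;
}
```

With n = 100, block_fits(200, 200, 1, 101, 100, 100) is false while block_fits(200, 200, 0, 100, 100, 100) is true. An out-of-range read is undefined behaviour, which would be consistent with the code appearing to work for some sizes and crashing for others. As for Eigen::seq, as far as I know it was introduced in Eigen 3.4, so an RcppEigen build that bundles an older Eigen would not provide it.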
System specs:
OS: Microsoft Windows 10 Education Edition
Processor: AMD Ryzen 7 3700X 8-core
Installed Physical Memory (RAM): 16.0 GB
R version: 4.0.2 (2020-06-22) -- "Taking Off Again"
Rstudio version: 1.3.1056

C++ Trapezoidal Integration Function Returning Negative Numbers when it shouldn't

I am using the following function written in C++, whose purpose is to take the integral of one array of data (y) with respect to another (x):
// Define function to perform numerical integration by the trapezoidal rule
double trapz (double xptr[], double yptr[], int Npoints)
{
    // The trapzDiagFile object and associated output file are how I monitor what data the for loop actually sees.
    std::ofstream trapzDiagFile;
    trapzDiagFile.open("trapzDiagFile.txt",std::ofstream::out | std::ofstream::trunc);
    double buffer = 0.0;
    for (int n = 0; n < (Npoints - 1); n++)
    {
        buffer += 0.5 * (yptr[n+1] + yptr[n]) * (xptr[n+1] - xptr[n]);
        trapzDiagFile << xptr[n] << "," << yptr[n] << std::endl;
    }
    trapzDiagFile.close();
    return buffer;
}
I validated this function for the simple case where x contains 100 uniformly spaced points from 0 to 1, and y = x^2, and it returned 0.33334, as it should.
But when I use it for a different data set, it returns -3.431, which makes absolutely no sense. If you look in the attached image file, the integral I am referring to is the area under the curve between the dashed vertical lines.
It's definitely a positive number.
Moreover, I used the native trapz command in MATLAB on the same set of numbers and that returned 1.4376.
In addition, I translated the above C++ trapz function into MATLAB, line for line as closely as possible, and again got 1.4376.
I feel like there's something C++ related I'm not seeing here. If it is relevant, I am using MinGW-w64.
Apologies for the vagueness of this post. If I knew more about what kind of issue I am seeing, it would be easier to be concise about it.
Plot of the dataset for which the trapz function (my homemade C++ version) returns -3.431:
Please check the value of xptr[Npoints - 1]. It may be less than xptr[Npoints - 2], and was not included in the values that you output.

Why is Eigen's mean() method so much faster than sum()?

This is a rather theoretical question, but I'm quite interested in it and would be glad if someone has some expert knowledge on this which he or she is willing to share.
I have a matrix of floats with 2000 rows and 600 cols and want to subtract the mean of the columns from each row. I have tested the following two lines and compared their runtime:
MatrixXf centered = data.rowwise() - (data.colwise().sum() / data.rows());
MatrixXf centered = data.rowwise() - data.colwise().mean();
I thought mean() would do nothing other than divide the sum of each column by the number of rows, but while the first line takes 12.3 seconds on my computer, the second finishes in 0.09 seconds.
I'm using Eigen version 3.2.6, which currently is the latest version, and my matrices are stored in row-major order.
Does someone know something about the internals of Eigen which could explain this huge performance difference?
Edit: I should add that data in the code above actually is of type Eigen::Map< Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> > and maps Eigen's functionality to a raw buffer.
Edit 2: As suggested by GuyGreer, I'll provide some sample code to reproduce my findings:
#include <iostream>
#include <chrono>
#include <Eigen/Core>
using namespace std;
using namespace std::chrono;
using namespace Eigen;
int main(int argc, char * argv[])
{
    MatrixXf data(10000, 1000), centered;
    data.setRandom();
    auto start = high_resolution_clock::now();
    if (argc > 1)
        centered = data.rowwise() - data.colwise().mean();
    else
        centered = data.rowwise() - (data.colwise().sum() / data.rows());
    auto stop = high_resolution_clock::now();
    cout << duration_cast<milliseconds>(stop - start).count() << " ms" << endl;
    return 0;
}
Compile with:
g++ -O3 -std=c++11 -o test test.cc
Running the resulting program without arguments, so that it uses sum(), takes 126 seconds on my machine, while running test 1 using mean() only takes 0.03 seconds!
Edit 3: As it turned out (see comments), it is not sum() which takes so long, but the division of the resulting vector by the number of rows. So the new question is: Why does Eigen take more than 2 minutes to divide a vector with 1000 columns by a single scalar?
Somehow, both the partial reduction (sum) and the division are recomputed every time, because some crucial information about the evaluation cost of the partial reduction is wrongly lost by operator/. Explicitly evaluating the mean fixes the issue:
centered = data.rowwise() - (data.colwise().sum() / data.rows()).eval();
Of course, this evaluation should be done by Eigen for you, as fixed by the changeset 42ab43a. This fix will be part of the next 3.2.7 and 3.3 releases.
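The effect can be illustrated with a plain-loop analogy. This is not Eigen's internal code, only a sketch of the difference between re-deriving a column mean for every coefficient and computing it once into a temporary (which is what .eval() forces). All names here (Matrix, column_mean, center_lazy, center_evaluated, g_column_sums) are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // data[row][col]

static std::size_t g_column_sums = 0;  // counts how often a column is summed

double column_mean(const Matrix& data, std::size_t col)
{
    ++g_column_sums;
    double sum = 0.0;
    for (const auto& row : data) sum += row[col];
    return sum / static_cast<double>(data.size());  // divide by the row count
}

// "Lazy": the mean is re-derived for every coefficient that needs it.
Matrix center_lazy(const Matrix& data)
{
    Matrix out = data;
    for (std::size_t i = 0; i < data.size(); ++i)
        for (std::size_t j = 0; j < data[i].size(); ++j)
            out[i][j] -= column_mean(data, j);
    return out;
}

// "Evaluated": the means are computed once into a temporary and then reused.
Matrix center_evaluated(const Matrix& data)
{
    if (data.empty()) return data;
    std::vector<double> means(data[0].size());
    for (std::size_t j = 0; j < means.size(); ++j) means[j] = column_mean(data, j);
    Matrix out = data;
    for (auto& row : out)
        for (std::size_t j = 0; j < row.size(); ++j) row[j] -= means[j];
    return out;
}
```

The lazy variant sums a column rows x cols times, the evaluated variant only cols times. For a 10000 x 1000 matrix that is on the order of 10^11 additions versus 10^7, which is the kind of gap that separates minutes from milliseconds.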

calculating w coefficients for iir filter

I am trying to implement an IIR filter I have designed in Matlab in a C++ program to filter out an unwanted signal from a wave file. The fdatool in Matlab generated this C header to use (it is a bandstop filter):
#include "tmwtypes.h"
/*
* Expected path to tmwtypes.h
* C:\Program Files (x86)\MATLAB\R2013a Student\extern\include\tmwtypes.h
*/
const int al = 7;
const real64_T a[7] = {
0.9915141178644, -5.910578456199, 14.71918523779, -19.60023964796,
14.71918523779, -5.910578456199, 0.9915141178644
};
const int bl = 7;
const real64_T b[7] = {
1, -5.944230431733, 14.76096188047, -19.60009655976,
14.67733658492, -5.877069568864, 0.9831002459245
};
After hours of exhausting research, I still can't figure out the proper way to use these values to determine the W values, and then how to use those W values to calculate my Y outputs. If anyone has any insight into the order in which these values should be used for these conversions, it would be a major help.
None of the methods I've developed and tried so far generates a valid wave file: the header values all translate correctly, but everything beyond them cannot be played by a media player.
Thanks.
IIR filters work this way:
Assuming an array of samples A and an array of coefficients named 'c', the result array B will be:
B[i] = (A[i] * c[0]) + (B[i-1] * c[1]) + ... + (B[i-n] * c[n])
Note that only the newest element is taken from A.
This is easier to do in-place, just update A as you move along.
These filter coefs are very violent, are you sure you got them right?
The first one is also symmetrical which probably indicates it's an FIR filter.
It appears to me that you have a 3 pole IIR filter with the coefficients given for an Nth order implementation (as opposed to a series of 2nd order sections). Since this is a band reject (or band pass) the polynomial order is twice the pole count.
I am not sure what you mean by W values, unless you are trying to evaluate the frequency response of this filter.
To calculate the Y values, as you put it, see this link for code on implementing IIR filters. See the Nth order implementation code in particular.
http://www.iowahills.com/A7ExampleCodePage.html
BTW: I assumed these were Nth order coefficients and simulated them. I got a 10 dB notch at 0.05 Pi. Sound about right?
where
b6 = 0.9915141178644, ..., b0 = 0.9915141178644
a6 = 0.9831002459245, ..., a0 = 1
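If the "W values" in the question refer to the intermediate state w[n] of a direct form II structure (an assumption on my part), an Nth order implementation can be sketched as below. The function name iir_df2 is hypothetical. Following the mapping in this answer, num would be the header's symmetric array a (b0..b6 here) and den would be the header's array b (a0..a6 here, with a0 = 1); both vectors are assumed to have the same non-zero length.

```cpp
#include <cstddef>
#include <vector>

// Direct form II IIR filter.
// num: feed-forward coefficients b0..bN
// den: feedback coefficients a0..aN, with a0 != 0
std::vector<double> iir_df2(const std::vector<double>& num,
                            const std::vector<double>& den,
                            const std::vector<double>& x)
{
    const std::size_t order = num.size() - 1;
    std::vector<double> w(order + 1, 0.0);   // w[k] holds w[n-k]
    std::vector<double> y;
    y.reserve(x.size());

    for (double sample : x) {
        // w[n] = (x[n] - a1*w[n-1] - ... - aN*w[n-N]) / a0
        double w0 = sample;
        for (std::size_t k = 1; k <= order; ++k) w0 -= den[k] * w[k];
        w0 /= den[0];
        w[0] = w0;

        // y[n] = b0*w[n] + b1*w[n-1] + ... + bN*w[n-N]
        double out = 0.0;
        for (std::size_t k = 0; k <= order; ++k) out += num[k] * w[k];
        y.push_back(out);

        // shift the delay line: w[n-k] becomes w[n-(k+1)] for the next sample
        for (std::size_t k = order; k >= 1; --k) w[k] = w[k - 1];
    }
    return y;
}
```

The input x should hold the decoded audio samples only; the wave file header has to be copied through unfiltered, and the output converted back to the file's sample format with clipping.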
Also, you may want to post a question like this on:
https://dsp.stackexchange.com/

Rewriting slow R function in C++ & Rcpp

I have this line of R code:
croppedDNA <- completeDNA[,apply(completeDNA,2,function(x) any(c(FALSE,x[-length(x)]!=x[-1])))]
What it does is identify the sites (cols) in a matrix of DNA sequences (1 row = one seq) that are not universal (informative) and subset them from the matrix to make a new 'cropped matrix', i.e. get rid of all the columns in which the values are all the same. For a big dataset this takes about 6 seconds. I don't know if I can do it faster in C++ (still a beginner in C++) but it will be good for me to try. My idea is to use Rcpp, loop through the columns of the CharacterMatrix, pull out each column (the site) as a CharacterVector, and check whether its values are all the same. If they are the same, record that column number/index, and continue for all columns. Then at the end make a new CharacterMatrix that only includes those columns. It is important that I keep the row names and column names as they are in the "R version" of the matrix, i.e. if a column goes, so should the colname.
I've been writing for about two minutes, so far what I have is (not finished):
#include <Rcpp.h>
#include <vector>
using namespace Rcpp;
// [[Rcpp::export]]
CharacterMatrix reduce_sequences(CharacterMatrix completeDNA)
{
  std::vector<bool> informativeSites;
  for(int i = 0; i < completeDNA.ncol(); i++)
  {
    CharacterVector bpsite = completeDNA(,i);
    if(all(bpsite == bpsite[1])
    {
      informativeSites.push_back(i);
    }
  }
  CharacterMatrix cutDNA = completeDNA(,informativeSites);
  return cutDNA;
}
Am I going the right way about this? Is there an easier way? My understanding is I need std::vector because it's easy to grow them (since I don't know in advance how many cols I am going to want to keep). With the indexing, will I need to add 1 to the informativeSites vector at the end (because R indexes from 1 and C++ from 0)?
Thanks,
Ben W.
Sample data:
set.seed(123)
z <- matrix(sample(c("a", "t", "c", "g", "N", "-"), 3*398508, TRUE), 3, 398508)
OP's solution:
system.time(y1 <- z[,apply(z,2,function(x) any(c(FALSE,x[-length(x)]!=x[-1])))])
# user system elapsed
# 4.929 0.043 4.976
A faster version using base R:
system.time(y2 <- (z[, colSums(z[-1,] != z[-nrow(z), ]) > 0]))
# user system elapsed
# 0.087 0.011 0.098
The results are identical:
identical(y1, y2)
# [1] TRUE
It's very possible c++ will beat it, but is it really necessary?
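If a C++ version is still wanted, the core of the idea from the question can be sketched in standard C++ as follows. This is not an Rcpp function: informative_columns is a hypothetical name, the matrix is assumed to be a flat column-major buffer as R stores it, and the Rcpp wrapper (including carrying over row and column names) is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the 0-based indices of columns that contain at least two different
// values. m is column-major with nrow rows.
std::vector<std::size_t> informative_columns(const std::vector<std::string>& m,
                                             std::size_t nrow)
{
    std::vector<std::size_t> keep;
    if (nrow == 0) return keep;
    const std::size_t ncol = m.size() / nrow;
    for (std::size_t j = 0; j < ncol; ++j) {
        const std::size_t first = j * nrow;       // offset of column j
        for (std::size_t i = 1; i < nrow; ++i) {
            if (m[first + i] != m[first]) {       // stop at the first mismatch
                keep.push_back(j);
                break;
            }
        }
    }
    return keep;
}
```

The indices stay 0-based as long as they are used inside C++; adding 1 is only needed if they are handed back to R for subsetting there. Note that a std::vector<std::size_t> of indices is used here, since std::vector<bool> stores flags, not positions.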