Memory management and S4 class in Rcpp - c++

Suppose I have an S4 class A which contains a slot @S that is a data.frame. The data.frame has a column X. I want to process such an object in C++ using Rcpp. Here is a toy example of how I did that:
SEXP f(S4 A)
{
    DataFrame S = A.slot("S");
    NumericVector X = S["X"];
    // do something with X
    return R_NilValue; // the function is declared to return SEXP, so return something
}
My questions are the following.
Is X still a reference to the original R data, or a deep copy? Considering how Rcpp works, it should not be a copy. But how can I be sure?
This code compiles and works well, but the IDE (RStudio, not the compiler) raises a warning: conversion from 'Rcpp::SlotProxyPolicy<Rcpp::S4_Impl<PreserveStorage> >::SlotProxy' to 'DataFrame' (aka 'DataFrame_Impl<PreserveStorage>') is ambiguous. What does it mean? Is it serious?
Thanks

Related

pass R complex object to armadillo C++

When interfacing with other languages, R uses the header R_ext/Complex.h, which defines the type Rcomplex; this seems to be an implementation of std::complex<double>. For a complex vector x_ in R, the standard way of using it would be:
Rcomplex *px = COMPLEX(x_);
However, since I need to pass it to armadillo I then do:
arma::cx_vec x(px, nrows(x_), false, false);
but armadillo does not accept Rcomplex types. I have tried doing instead:
std::complex<double> *px = COMPLEX(x_);
but get the following error: cannot convert ‘Rcomplex*’ to ‘std::complex<double>*’ in initialization
Do you have any idea how to pass a complex vector from R as a std::complex<double> type? I am aware of Rcpp but would like to have a direct solution relying on base R.
EDIT: Following one of the comments, I wanted to clarify that Rcomplex is a C type but that it is compatible with std::complex<double> according to the answer by @Stephen Canon.
EDIT2: Why does Dirk's answer have more votes than the accepted answer if it is not answering the "without dependencies" question? In addition, I have apparently been downvoted because somebody does not like it when one prefers to use base R with C or C++. Anyway, I have better things to do, but this is not the first time that I get no answer to my original question when asking something related to base R interfacing with C or C++ and instead get an Rcpp-related answer that I have not asked for.
Complex numbers are less common in statistics, so they have not been an initial focus. However, there are use cases: Baptiste has one or two packages which pushed us to add features to the interface, given the existing support in Armadillo and R.
So all the work is done for you in templates -- here is the simplest possible example of passing a complex-valued matrix and returning the sum of the matrix with itself:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::cx_mat foo(const arma::cx_mat & C) {
return C + C;
}
/*** R
C <- matrix( (1+1i) * 1:4, 2, 2)
C
foo(C)
*/
It does what you'd expect:
R> sourceCpp("/tmp/armaComplex.cpp")
R> C <- matrix( (1+1i) * 1:4, 2, 2)
R> C
[,1] [,2]
[1,] 1+1i 3+3i
[2,] 2+2i 4+4i
R> foo(C)
[,1] [,2]
[1,] 2+2i 6+6i
[2,] 4+4i 8+8i
R>
So we start with complex values in R, pass them via Rcpp and RcppArmadillo to Armadillo, and get them back to R, without writing an additional line of code and with no discernible overhead.
One type can always be forced to another type using reinterpret_cast. Generally this is a bad idea, but if you can guarantee that the two complex types are indeed compatible, you can do:
Rcomplex* px = COMPLEX(x_);
arma::cx_vec x( reinterpret_cast<arma::cx_double*>(px), nrows(x_), false, false );
The type arma::cx_double is shorthand for std::complex<double>.
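The reinterpret_cast approach above rests on the two types having the same memory layout. A minimal sketch of how that assumption might be checked at compile time, without R or Armadillo installed, is shown below. RcomplexLike is a hypothetical stand-in for Rcomplex (two doubles, real part first), and as_std_complex and sum_complex are illustrative helper names, not part of any library.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical stand-in for R's Rcomplex: real part, then imaginary part.
// This is an assumption made so the sketch compiles without the R headers.
struct RcomplexLike {
    double r;
    double i;
};

// Compile-time check of the layout assumption behind the cast.
static_assert(sizeof(RcomplexLike) == sizeof(std::complex<double>),
              "sizes differ: reinterpret_cast would be unsafe");

// View a buffer of RcomplexLike as std::complex<double> without copying.
inline std::complex<double>* as_std_complex(RcomplexLike* p) {
    return reinterpret_cast<std::complex<double>*>(p);
}

// Sum n elements through the reinterpreted pointer.
inline std::complex<double> sum_complex(RcomplexLike* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    std::complex<double>* z = as_std_complex(p);
    std::complex<double> total(0.0, 0.0);
    for (std::size_t k = 0; k < n; ++k) total += z[k];
    return total;
}
```

With the real headers, the same static_assert could be written against Rcomplex itself, so that a layout mismatch fails at compile time rather than at run time.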

Using Eigen::Map<Eigen::MatrixXd> as function argument of type Eigen::MatrixXd

In short, the question is how to pass a
Eigen::Map<Eigen::MatrixXd>
object to a function which expects a
Eigen::MatrixXd
object.
Longer story:
I have this C++ function declaration
void npMatrix(const Eigen::MatrixXd &data, Eigen::MatrixXd &result);
together with this implementation
void npMatrix(const Eigen::MatrixXd &data, Eigen::MatrixXd &result)
{
// Just do something with the arguments
std::cout << data << std::endl;
result(1,1) = -5;
std::cout << result << std::endl;
}
I want to call this function from Python using numpy.array as arguments. To this end, I use a wrapper function written in C++
void pyMatrix(const double* p_data, const int dimData[],
double* p_result, const int dimResult[]);
which takes a pointer to data, the size of the data array, a pointer to result, and the size of the result array. The data pointer points to a const patch of memory, since data is not to be altered while the patch of memory reserved for result is writeable. The implementation of the function
void pyMatrix(const double *p_data, const int dimData[], double *p_result, const int dimResult[])
{
Eigen::Map<const Eigen::MatrixXd> dataMap(p_data, dimData[0], dimData[1]);
Eigen::Map<Eigen::MatrixXd> resultMap(p_result, dimResult[0], dimResult[1]);
resultMap(0,0) = 100;
npMatrix(dataMap, resultMap);
}
defines an Eigen::Map for data and result, respectively. An Eigen::Map allows raw memory to be accessed as a kind of Eigen::Matrix. The dataMap is of type
Eigen::Map<const Eigen::MatrixXd>
since the associated memory is read only; resultMap, in contrast, is of type
Eigen::Map<Eigen::MatrixXd>
since it must be writeable. The line
resultMap(0,0) = 100;
shows that resultMap is indeed writeable. While passing dataMap to npMatrix(), where a const Eigen::MatrixXd is expected, works, I could not find a way to pass resultMap in the same way. I am sure the trouble comes from the fact that the first argument of npMatrix is const and the second is not. A possible solution I found is to define
Eigen::MatrixXd resultMatrix = resultMap;
and pass this resultMatrix to npMatrix(). However, I guess this creates a copy and hence kills the nice memory mapping of Eigen::Map. So my question is:
Is there a way to pass an Eigen::Map to a function which expects a non-const Eigen::MatrixXd instead?
As a side note: I could change npMatrix to expect an Eigen::Map, but since the functions in the real project are already there and tested, I would rather not tamper with them.
To complete the question, here is the Python file to call pyMatrix()
import ctypes as ct
import numpy as np
import matplotlib.pyplot as plt
# Load libfit and define input types
ct.cdll.LoadLibrary("/home/wmader/Methods/fdmb-refactor/build/pyinterface/libpyfit.so")
libfit = ct.CDLL("libpyfit.so")
libfit.pyMatrix.argtypes = [np.ctypeslib.ndpointer(dtype=np.float64, ndim=2),
np.ctypeslib.ndpointer(dtype=np.int32, ndim=1),
np.ctypeslib.ndpointer(dtype=np.float64, ndim=2, flags='WRITEABLE'),
np.ctypeslib.ndpointer(dtype=np.int32, ndim=1)
]
data = np.array(np.random.randn(10, 2), dtype=np.float64, order='F')
result = np.zeros_like(data, dtype=np.float64, order='F')
libfit.pyMatrix(data, np.array(data.shape, dtype=np.int32),
result, np.array(result.shape, dtype=np.int32))
Pass it as a plain pointer to the data, and Eigen::Map it there. Alternatively, use template <typename Derived> and the like, as described in http://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html
My personal choice is the first, though, as it is better to have code that doesn't expose all the stubbornness of every API you have used. Also, you won't lose compatibility with Eigen, or with any other kind of library that you (or anyone else) may use later.
There is also another trick I found out, which can be used on numerous occasions:
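Eigen::MatrixXd a;
// Let's assume a data pointer like double* DATA that we want to map,
// with DATA_rows and DATA_cols holding its dimensions.
// Now we placement-new an Eigen::Map over the matrix object:
new (&a) Eigen::Map<Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic>>(DATA, DATA_rows, DATA_cols);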
This will do what you ask, without wasting memory. I think it is a cool trick, and a will behave as a MatrixXd, but I haven't tested every case. It makes no memory copy. However, you might need to resize a to the right size before assigning. Even so, the compiler will not immediately allocate all memory at the time you request the resize operation, so there won't be big useless memory allocations either!
Be careful! Resizing operations might reallocate the memory used by an Eigen matrix! So, if you ::Map a block of memory but then perform an action that resizes the matrix, it might end up mapped to a different place in memory.
For anyone still struggling with the problem of passing an Eigen::Map to a function with an Eigen::Matrix signature or vice versa, and who found that the Eigen::Matrix to Eigen::Map implicit casting trick which @Aperture Laboratories suggested didn't work (in my case it gave runtime errors associated with trying to free already released memory [Mismatched delete / Invalid delete errors when run under valgrind]),
I suggest using the Eigen::Ref class for function signatures, as suggested in the answer given by @ggael here:
Passing Eigen::Map<ArrayXd> to a function expecting ArrayXd&
and written in the documentation:
http://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html#TopicUsingRefClass
under the title:
How to write generic, but non-templated function?
For example, for the function specified in the question, changing the signature to
void npMatrix(const Eigen::Ref<const Eigen::MatrixXd> & data, Eigen::Ref< Eigen::MatrixXd> result);
means that passing either Eigen::Map<Eigen::MatrixXd> or Eigen::MatrixXd objects to the function works seamlessly (see @ggael's answer to Correct usage of the Eigen::Ref<> class for different ways to use Eigen::Ref in a function signature).
I appreciate OP said he didn't want to change the function signatures, but in terms of using Eigen::Maps and Eigen::Matrix's interchangeably I found this the easiest and most robust method.
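The copy concern raised in the question (that assigning the map to a fresh Eigen::MatrixXd detaches the result from the mapped memory) can be illustrated without Eigen. The sketch below uses two hypothetical toy types: MatrixView plays the role of a non-owning Eigen::Map or Eigen::Ref, and OwnedMatrix plays the role of Eigen::MatrixXd. Neither is Eigen's implementation; they only show why a write through a view reaches the caller's buffer while a write into a copy does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view over column-major memory; a toy stand-in for
// what Eigen::Map or Eigen::Ref provide, not Eigen's implementation.
struct MatrixView {
    double* data;
    std::size_t rows;
    std::size_t cols;
    double& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        return data[j * rows + i];  // column-major indexing
    }
};

// Hypothetical owning matrix; a toy stand-in for Eigen::MatrixXd.
struct OwnedMatrix {
    std::vector<double> storage;
    std::size_t rows;
    std::size_t cols;
    // Constructing from a view copies every element into new storage.
    explicit OwnedMatrix(const MatrixView& v)
        : storage(v.data, v.data + v.rows * v.cols), rows(v.rows), cols(v.cols) {}
    double& operator()(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return storage[j * rows + i];
    }
};

// Writes through the view, so the caller's buffer is modified.
inline void fill_via_view(MatrixView result) { result(1, 1) = -5.0; }

// Writes into the owned copy, so the caller's buffer is left untouched.
inline void fill_via_copy(OwnedMatrix& result) { result(1, 1) = -5.0; }
```

Calling fill_via_copy on an OwnedMatrix built from a view leaves the underlying buffer unchanged, whereas fill_via_view on the view itself updates it, which is the behaviour the Eigen::Ref signature is meant to preserve.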

What is the R to C++ syntax for vectors?

I am an R and C programmer trying to use Rcpp to create an R (and C++) wrapper for a program written in C. I'm not really familiar with C++.
Can someone help me understand the documentation on how to import an R object as a C++ vector? (I'm specifically trying to import an R list that includes a mixture of int, double, and numeric lists.)
The Rcpp-introduction.pdf states:
The reverse conversion from an R object to a C++ object is implemented by variations of
the Rcpp::as template whose signature is:
template <typename T> T as(SEXP x);
It offers less flexibility and currently handles conversion of R objects into primitive types (e.g., bool, int, std::string, ...), STL vectors of primitive types (e.g.,
std::vector<bool>, std::vector<double>, ...) and arbitrary types that offer a constructor that takes a SEXP.
I am confused as to what this means, i.e. what gets filled in for "template", "typename T", and "T". I've seen what appear to be lots of examples for the primitives, e.g.
int dx = Rcpp::as<int>(x);
but I don't understand how this syntax maps to the template documentation above, and (more importantly) don't understand how to generalize this to STL vectors.
Maybe you are making it too complicated. An R vector just becomes a C++ vector "because that is what Rcpp does for you".
Here it is using Rcpp::NumericVector:
R> library(Rcpp)
R> cppFunction('NumericVector ex1(NumericVector x) { return x + 2;}')
R> ex1(1:4) # adds two
[1] 3 4 5 6
R>
This uses the fact that we defined + for our NumericVector types. You can also pass a std::vector<double> in and out, but you need to add some operations, which often takes more than one statement, so I didn't do it here...
So in short, keep reading the documentation and ask on rcpp-devel or here if you need help.
Edit: For completeness, the same with an STL vector
R> cppFunction('std::vector<double> ex2(std::vector<double> x) { \
for (size_t i=0; i<x.size(); i++) x[i] = x[i] + 2; return x;}')
R> ex2(1:4)
[1] 3 4 5 6
R>
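To connect the call as<int>(x) with the template signature quoted in the question, here is a toy sketch of how such a function template can be declared and used. Value is a hypothetical stand-in for an R object, and this as is a simplified illustration written for this answer, not Rcpp's implementation. The point is only that T is the target C++ type, which the caller writes between the angle brackets.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an R object (SEXP): just a list of numbers.
// This is NOT Rcpp's type; it only makes the sketch self-contained.
struct Value {
    std::vector<double> numbers;
};

// Primary template, with the same shape as the documented signature
//     template <typename T> T as(SEXP x);
// T is the C++ type the caller asks for. It cannot be deduced from the
// argument, so it is written explicitly at the call site: as<int>(x).
template <typename T>
T as(const Value& x);

// Specialization used when the caller writes as<int>(x).
template <>
int as<int>(const Value& x) {
    assert(!x.numbers.empty());
    return static_cast<int>(x.numbers.front());
}

// Specialization used when the caller writes as<std::vector<double> >(x).
template <>
std::vector<double> as<std::vector<double> >(const Value& x) {
    return x.numbers;
}
```

So the STL generalization is the same syntax with a different type between the brackets, for example std::vector<double> v = as<std::vector<double> >(x);.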

write an Rdata file from C++

Suppose I have a C++ program that has a vector of objects that I want to write out to an Rdata data.frame file, one observation per element of the vector. How can I do that? Here is an example. Suppose I have
vector<Student> myStudents;
And Student is a class which has two data members, name which is of type std::string and grade which is of type int.
Is my only option to write a csv file?
Note that Rdata is a binary format so I guess I would need to use a library.
A search for Rdata [r] [C++] came up empty.
I think nobody has bothered to extract a binary file writer from the R sources to be used independently from R.
Almost twenty years ago I did the same for Octave files as their format is simply: two integers for 'n' and 'k', followed by 'n * k' of data -- so you could read / write with two function calls each.
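As a rough illustration of the layout just described (two integers followed by n * k values), a reader and writer could be sketched as below. The function names write_matrix and read_matrix are hypothetical, the element type is assumed to be double, and the bytes are written in the machine's native order, so this is not a portable format and is not claimed to match the actual Octave or R file formats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Write the header (n, k) and then the n * k doubles, one write call each.
inline bool write_matrix(const std::string& path, std::int32_t n, std::int32_t k,
                         const std::vector<double>& data) {
    assert(n >= 0 && k >= 0);
    if (data.size() != static_cast<std::size_t>(n) * static_cast<std::size_t>(k))
        return false;  // payload does not match the declared dimensions
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    const std::int32_t header[2] = {n, k};
    out.write(reinterpret_cast<const char*>(header), sizeof header);
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(double)));
    return static_cast<bool>(out);
}

// Read the header, size the buffer, then read the payload, one read call each.
inline bool read_matrix(const std::string& path, std::int32_t& n, std::int32_t& k,
                        std::vector<double>& data) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::int32_t header[2] = {0, 0};
    in.read(reinterpret_cast<char*>(header), sizeof header);
    if (!in || header[0] < 0 || header[1] < 0) return false;
    n = header[0];
    k = header[1];
    data.resize(static_cast<std::size_t>(n) * static_cast<std::size_t>(k));
    in.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(data.size() * sizeof(double)));
    return static_cast<bool>(in);
}
```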
I fear that for R you would have to cover too many of R's headers -- so the easiest (?) route may be to give the data to R, maybe via Rserve ('loose' connection over TCP/IP) or RInside (tighter connection via embedding), and have R write it.
Edit: In the years since the original answer was written, one such library has been created: librdata.
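If a CSV file turns out to be acceptable after all, as the question considers, the vector of students can be serialized with the standard library alone. The sketch below assumes Student is a plain struct with the two members named in the question; csv_quote and to_csv are hypothetical helper names.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// The class described in the question: a name and an integer grade.
struct Student {
    std::string name;
    int grade;
};

// Quote a field in the usual CSV way: wrap it in double quotes and
// double any embedded quote characters.
inline std::string csv_quote(const std::string& field) {
    std::string out = "\"";
    for (char c : field) {
        if (c == '"') out += '"';
        out += c;
    }
    out += '"';
    return out;
}

// One header row, then one row per student (one observation per element).
inline std::string to_csv(const std::vector<Student>& students) {
    std::ostringstream os;
    os << "name,grade\n";
    for (const Student& s : students) {
        os << csv_quote(s.name) << ',' << s.grade << '\n';
    }
    return os.str();
}
```

Writing the returned string to a file such as students.csv (a made-up name) should give something R can load as a data.frame with read.csv, and R can then save it in its own binary format if needed.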
Here is an example of a function that saves a list in an RData file. This example is based on the previous answer:
void save_List_RData(const List &list_Data, const CharacterVector &file_Name)
{
    Environment base("package:base");
    Environment env = new_env();
    // Bind the list to a name in a fresh environment so save() can find it
    env["list_Data"] = list_Data;
    Function save = base["save"];
    CharacterVector all(1);
    all[0] = "list_Data";
    save(Named("list", all), Named("envir", env), Named("file", file_Name));
    Rcout << "File " << file_Name << " has been saved! \n";
}
I don't know if this will fit everyone's needs (of those who are googling this question), but this way you can save individual or multiple variables:
using namespace std;
using namespace Rcpp;
using Eigen::Map;
using Eigen::MatrixXi;
using Eigen::MatrixXd;
Environment base("package:base");
Function save = base["save"];
Function saveRDS = base["saveRDS"];
MatrixXd M = MatrixXd::Identity(3,3);
NumericMatrix xx(wrap(M));
NumericMatrix xx1(wrap(M));
NumericMatrix xx2(wrap(M));
base["xx"] = xx;
base["xx1"] = xx1;
base["xx2"] = xx2;
vector<string> lst;
lst.push_back("xx");
lst.push_back("xx1");
lst.push_back("xx2");
CharacterVector all = wrap(lst);
save(Named("list", all), Named("envir", base) , Named("file","Identities.RData"));
saveRDS(xx,Named("file","Identity.RDs"));
return wrap(M);
library(inline)
library(Rcpp)
library(RcppEigen)
src <- '
// put here the C++ code shown above
'
saveworkspace <- cxxfunction(signature(), src, plugin = "RcppEigen")
saveworkspace()
list.files(pattern="*.RD*")
[1] "Identity.RDs"
[2] "Identities.RData"
I'm not 100% sure if this C++ code will work in standalone library/executable.
NB: Initially I missed the comment that the solution should be independent of R, but for those who are searching for exactly the same question and are OK with a dependency on R, this could be helpful.

R List of numeric vectors -> C++ 2d array with Rcpp

I mainly use R, but eventually would like to use Rcpp to interface with some C++ functions that take in and return 2d numeric arrays. So to start out playing around with C++ and Rcpp, I thought I'd just make a little function that converts my R list of variable-length numeric vectors to the C++ equivalent and back again.
require(inline)
require(Rcpp)
test1 = cxxfunction(signature(x='List'), body =
'
using namespace std;
List xlist(x);
int xlen = xlist.size();
vector< vector<int> > xx;
for(int i=0; i<xlen; i++) {
vector<int> test = as<vector<int> > (xlist[i]);
xx.push_back(test);
}
return(wrap(xx));
'
, plugin='Rcpp')
This works like I expect:
> test1(list(1:2, 4:6))
[[1]]
[1] 1 2
[[2]]
[1] 4 5 6
Admittedly I am only part way through the very thorough documentation, but is there a nicer (i.e. more Rcpp-like) way to do the R -> C++ conversion than with the for loop? I am thinking possibly not, since the documentation mentions that (at least with the built-in methods) as "offers less flexibility and currently handles conversion of R objects into primitive types", but I wanted to check because I'm very much a novice in this area.
I will give you bonus points for a reproducible example, and of course for using Rcpp :) And then I will take those away for not asking on the rcpp-devel list...
As for converting STL types: you don't have to, but when you decide to do it, the as<>() idiom is correct. The only 'better way' I can think of is to do name lookup as you would in R itself:
require(inline)
require(Rcpp)
set.seed(42)
xl <- list(U=runif(4), N=rnorm(4), T2df=rt(4,2))
fun <- cxxfunction(signature(x="list"), plugin="Rcpp", body = '
Rcpp::List xl(x);
std::vector<double> u = Rcpp::as<std::vector<double> >(xl["U"]);
std::vector<double> n = Rcpp::as<std::vector<double> >(xl["N"]);
std::vector<double> t2 = Rcpp::as<std::vector<double> >(xl["T2df"]);
// do something clever here
return(R_NilValue);
')
Hope that helps. Otherwise, the list is always open...
PS As for the two-dim array, that is trickier as there is no native C++ two-dim array. If you actually want to do linear algebra, look at RcppArmadillo and RcppEigen.
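For the two-dimensional case, one common workaround (short of using a linear algebra library) is to keep a single contiguous buffer and do the index arithmetic by hand. The sketch below is a minimal, hypothetical Matrix2D class plus a from_ragged helper that pads the variable-length rows from the question into a rectangle; both names and the padding choice are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal two-dimensional array: one contiguous buffer plus index arithmetic.
class Matrix2D {
public:
    Matrix2D(std::size_t rows, std::size_t cols, double fill = 0.0)
        : rows_(rows), cols_(cols), data_(rows * cols, fill) {}
    double& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];  // row-major layout
    }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};

// Copy a ragged vector of vectors into a rectangular Matrix2D,
// padding short rows with `pad`.
inline Matrix2D from_ragged(const std::vector<std::vector<double> >& xx,
                            double pad = 0.0) {
    std::size_t width = 0;
    for (const std::vector<double>& row : xx) {
        if (row.size() > width) width = row.size();
    }
    Matrix2D m(xx.size(), width, pad);
    for (std::size_t i = 0; i < xx.size(); ++i)
        for (std::size_t j = 0; j < xx[i].size(); ++j)
            m(i, j) = xx[i][j];
    return m;
}
```

The nested vectors produced by the as<>() loop in the question could be passed to from_ragged when a C++ function expects a rectangular block of memory.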