Rcpp: calling a C++ function in R without exporting the C++ function

I am trying to make a package with Rcpp. I have all of my C++ functions in a single .cpp file as follows:
#include <Rcpp.h>
using namespace Rcpp;
// Arithmetic mean of x.
double meanvec(NumericVector x) {
    int n = x.size();
    double tot = 0;
    for (int i = 0; i < n; i++) {
        tot += x[i];
    }
    tot /= n;
    return tot;
}
// Inner product of u and v (assumes equal lengths).
double inprod(NumericVector u, NumericVector v) {
    int m = u.size();
    double val = 0;
    for (int i = 0; i < m; i++) {
        val += u[i] * v[i];
    }
    return val;
}
// Intercept (b0) and slope (b1) of the least-squares line of y on x.
NumericVector lincoef(NumericVector x, NumericVector y) {
    int n = x.size();
    double xm = meanvec(x);
    double ym = meanvec(y);
    NumericVector xa(n);
    for (int i = 0; i < n; i++) {
        xa[i] = x[i] - xm;
    }
    NumericVector ya(n);
    for (int i = 0; i < n; i++) {
        ya[i] = y[i] - ym;
    }
    double b1 = inprod(xa, ya) / inprod(xa, xa);
    double b0 = ym - (b1 * xm);
    NumericVector beta = NumericVector::create(b0, b1);
    return beta;
}
Basically, the last function takes two vectors as input and outputs a single vector. I would like to call this function from a separate .R file where I am trying to write another function. Something like this:
#' Title
#'
#' @param x Numeric vector.
#' @param y Numeric vector.
#'
#' @return
#' @export
linfit338 = function(x, y){
  beta = .Call(`_pkg338_lincoef`, x, y)
  fmod = function(x){
    beta[1] + beta[2]*x
  }
  flist = list(beta, fmod)
  return(flist)
}
Here the output is a list, where the first element is a vector from the C++ function being called and the second element is the created function. When I try to install and restart, I get this error message:
RcppExports.o:RcppExports.cpp:(.rdata+0x790): undefined reference to `_pkg338_lincoef'
My guess is that it has something to do with exporting the function. When I add // [[Rcpp::export]] above the lincoef function in the C++ file, I don't get any error message, and my final R function works. However, my whole goal is that I do not want the lincoef function exported at all.
Any way to fix this? I would also be open to suggestions as to how I can improve organizing these files, as this is my first experience building a package with Rcpp.

I think you're probably mixing up the concept of exporting C++ code to be used in R (via // [[Rcpp::export]]), which is entirely different to exporting R functions from your package, i.e. making those functions available to end-users of your package.
To make your Rcpp functions callable from within R at all, you need to // [[Rcpp::export]] them. If you don't do this, none of your C++ code will be available from within your R package.
It sounds like what you would like to do is to use the Rcpp-exported functions within your package but to hide them from end-users. This is a common use case for Rcpp, as it allows you to have an R function that acts as an end-user interface to your C++ code, while leaving you free to alter the C++ implementation in future developments without the risk of breaking existing users' code.
Any function you have created within your package, be it an R function or an Rcpp-exported function, has to actively be exported from your package to make it available to end-users. This is a different concept from // [[Rcpp::export]], which is needed to access C++ functions from within your package's R code.
R functions will only be exported from your R package if you specify them in the NAMESPACE file in your project's root directory. Thus, to export myfunction() you need to have a line that says export(myfunction) in the NAMESPACE file. You are using roxygen2, which will generate this line automatically as long as you write @export in the roxygen skeleton. An alternative to using roxygen's exporting system is to specify an exportPattern in the NAMESPACE file that uses a regex to export only functions whose names match a certain pattern.
My usual workflow is to prefix any Rcpp-exported functions with a period by writing my C++ functions like this:
// [[Rcpp::export(.MyCppFunction)]]
int BoringFunction() { return 0; }
I can now call the C++ function from R like this:
MyRFunction <- function()
{
  result <- .MyCppFunction()
  return(result)
}
The first line in my NAMESPACE file looks like this:
exportPattern("^[[:alpha:]]+")
This means that any R function in my package whose name starts with a letter will be exported. Since all the functions I Rcpp::export start with a period, I can use them internally within the R package but they won't be exported to end-users.
In other words, end-users of the package can call MyRFunction() but would get an error if they tried to call .MyCppFunction().

Adding two objective functions with weights in CPLEX

In my math model I have two objective functions. I want to weight each one according to its importance and add them together into a single objective function.
Here are my two objectives added together with weights:
IloExpr objExpression(env);
for (cc = 0; cc < NumberOfCourses; cc++)
    for (ww = 0; ww < AvailableWeeks; ww++) {
        objExpression += Weight * Diff[cc][ww]; // objective a
    }
for (cc = 0; cc < NumberOfCourses; cc++) {
    objExpression += (1 - Weight) * (M[cc] * Students[cc]); // objective b
}
IloObjective theObjective(env, objExpression, IloObjective::Minimize);
mod.add(theObjective);
objExpression.end();
I have set the parameters and variables as follows:
const int NumberOfCourses = 15;
const int AvailableWeeks = 8;
const float Weight = 0.5;
double Students[NumberOfCourses];
IloNumVarArray2 Diff(env, NumberOfCourses);
for (cc = 0; cc < NumberOfCourses; cc++)
    Diff[cc] = IloNumVarArray(env, AvailableWeeks, 0.0, 8);
IloNumVarArray M(env, NumberOfCourses);
When I run the code it freezes, and the debugger breaks at the second objective line (objective b).
Also, what should I do if I want two separate objectives and output for each of them individually?

Check the reference documentation of the IloNumVarArray(env, IloInt) constructor. It says:
"This constructor creates an extensible array of n numeric variables in env. Initially, the n elements are empty handles."
In other words, all elements in the newly created array will be NULL (note that this is different from the IloBoolVarArray constructor which creates an array in which all elements are new variables). Thus M is an array of empty handles in your case.
In order to fix your code, you either have to initialize the elements of the array one by one or use a constructor that initializes the variables, for example:
IloNumVarArray M(env, NumberOfCourses, 0, IloInfinity);
Additionally, I suggest that you compile your code in debug mode without NDEBUG being defined. If you compile without NDEBUG then many of the functions in Concert will raise an exception if you use empty handles.
If you want to have separate objectives then your problem becomes a multi-objective problem. You may want to read the respective chapter in the user manual and also refer to the ilodiet.cpp example that ships with CPLEX.

Use a C++ class with Rcpp to modify it from C++ or R

I'm starting to play with Rcpp and I'd like to make an object whose variables I can modify from R or from Rcpp.
It seems that a class is a good solution, as I can modify the fields using '$' from R, or directly using Rcpp functions. My problem is that when I try to modify a variable of my class from Rcpp with a class method, R simply crashes...
Below is a small example. The class contains some variables of different types, plus a constructor and two methods: one to print values (to test whether changes are truly made when I do something), and one to change the variables of the class.
library(Rcpp)
library(RcppArmadillo)
# ODEs can also be described using Rcpp
Rcpp::sourceCpp(code = '
#include <Rcpp.h>
//#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::export]]
class parameters{
public:
    NumericMatrix mat;
    double val;
    int n;
    NumericVector dB;
    parameters(double x):
        val(x) {}
    NumericVector changes(){
        dB[0] = val;
        dB[1] = 12;
        //dB[1] = mean(mat(1,_));
    }
    void print(){
        Rcout << "val:" << std::endl << val << std::endl;
        Rcout << "mat:" << std::endl << mat << std::endl;
    }
};
RCPP_MODULE(ParamModule){
    using namespace Rcpp;
    class_<parameters>("parameters")
        .constructor<double>("constructor")
        .method("changes", &parameters::changes)
        .method("print", &parameters::print)
        .field("val", &parameters::val)
        .field("mat", &parameters::mat)
        .field("n", &parameters::n)
        .field("dB", &parameters::dB)
    ;
}
// [[Rcpp::export]]
NumericMatrix addVal(NumericMatrix mat, double val, int n){
    int i = 0;
    for (i=0; i<n; i++){
        mat(i, 1) = mat(i,1) + val;
    }
    return mat;
}
')
and here is the code that I use to test it:
p = new(parameters,5)
str(p) # constructor initialise the field val, ok
p$mat = matrix(5, nrow = 5, ncol = 5)
p$print() # field mat initialised, ok
p$mat = matrix(0.1, nrow = 5, ncol = 5)
p$print() # field mat changed, ok
addVal(p$mat, 2, 5)
p$print() # sounds like p is a pointer, ok
p$changes() #...
This last line is where the problem occurs (no error message, as R simply crashes).
When I compile the class I have this warning:
Warning message:
No function found for Rcpp::export attribute at file215bf0ef501.cpp:8
Also, I saw here that I might need to use these two lines:
ParamModule = Module("ParamModule")
parameters = ParamModule$parameters
but I obtain an error message when I run the last one:
Error in Module(module, mustStart = TRUE) :
Failed to initialize module pointer: Error in FUN(X[[i]], ...): no such symbol _rcpp_module_boot_ParamModule in package .GlobalEnv
(ParamModule is present as an environment in the global environment).
So, my questions are:
1) Why is the function p$changes() not working?
2) Do I need to load ParamModule? It doesn't seem to change anything...
Thanks!

There is support for exposing C++ classes via Rcpp Modules, which you found.
Also in the package, less well known but added by John Chambers himself, is Rcpp Classes, an extension to Rcpp Modules which seems to be exactly what you are asking for.
Have a look at the example in the complete example package that ships with Rcpp as a full directory and test. There is also documentation in the package.

A small edit to the problematic method resolved the issue. I basically made two (stupid) mistakes:
1) I did not return anything, even though the function was declared to return a numeric vector (I'm surprised that this compiled)
2) I used dB[0] instead of dB(0)
So the new edited function is:
NumericVector changes(){
    dB(0) = val;
    dB(1) = 12;
    //dB[1] = mean(mat(1,_));
    return dB;
}
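A likely contributor to the crash, beyond the missing return, is that dB is never given a length before changes() writes to it. That is an inference from the posted class, not something confirmed in this thread. The sketch below uses std::vector as a stand-in for NumericVector so that it compiles without R (the struct name Parameters is hypothetical), and shows the same method with the storage sized first, bounds-checked access, and an explicit return.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the question's class: std::vector replaces Rcpp::NumericVector
// so the sketch compiles without R. All names here are illustrative only.
struct Parameters {
    double val;
    std::vector<double> dB;  // starts empty, like a default-constructed vector field

    explicit Parameters(double x) : val(x) {}

    std::vector<double> changes() {
        // Give the vector room for two elements before writing to it.
        if (dB.size() < 2) {
            dB.resize(2);
        }
        assert(dB.size() >= 2);
        // at() is bounds-checked and throws std::out_of_range on a bad index,
        // whereas operator[] on a too-short vector is undefined behaviour.
        dB.at(0) = val;
        dB.at(1) = 12.0;
        // A non-void function must return a value on every path.
        return dB;
    }
};
```

In the Rcpp version the analogous steps would be to give dB a length before the first write (for example by assigning it from R, as is done for mat) and to end changes() with return dB;.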
coatless and Dirk, Thanks for your answers, but it seems that there is no need to use them in a package (as what I did is working correctly). Did I miss something?

How to access lists in Rcpp?

I want to use Rcpp to make certain parts of my code more efficient. I have a main R function in which my objects are defined; within this R function I call several Rcpp functions that take the R data objects as inputs. This is one example of an Rcpp function that is called from the R function:
void calculateClusters ( List order,
                         NumericVector maxorder,
                         List rank,
                         double lambda,
                         int nbrClass,
                         NumericVector nbrExamples) {
    int current, i;
    for ( current = 0; current < nbrClass; current ++ ) {
        maxorder [current] = 0;
        for ( i = 0; i < nbrExamples [ current ]; i++ ) {
            order[ current ][i] = ( int ) ( rank[ current ][i] / lambda ) - 1;
        }
        if ( order[ current ][i] > maxorder[ current ] ) {
            maxorder[ current ] = order[ current ][i];
        }
    }
}
This function calculates the maximum number of clusters for each class. In plain C++ I would define my List as an int** and my NumericVector as int*. However, in Rcpp this gives an error. I know the fault lies in the subsetting of these Lists (I handled them the same way as int**).
My question is how I can successfully transform these int** into List without losing flexibility. For example, the Lists order and distance have the structure order[[1]][1:500], order[[2]][1:500], so this would be exactly the same as int** in C++, where it would be order[1][1:500], order[2][1:500]. If there are 3 classes, the order and distance Lists change to order[[1]][1:500], order[[2]][1:500], order[[3]][1:500]. How can I do this in Rcpp?
Briefly:
Example from the Rcpp Gallery
Example from the Rcpp Examples package on CRAN
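The examples referenced above carry the actual Rcpp code, so what follows is only a sketch of the access pattern in plain C++, with nested std::vector standing in for the R lists so that it compiles without R. The usual Rcpp idiom is the same in spirit: pull one list element into a typed vector first (for instance NumericVector r = rank[current];) and index that, rather than writing rank[current][i] directly. The sketch also assumes the maximum was meant to be updated inside the inner loop; in the question the if sits after it and reads one element past the end. The signature is an illustrative choice, not the asker's.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// rank[c][i] is the rank of example i in class c. Fills order with the
// cluster index of each example and returns the per-class maximum.
std::vector<int> calculateClusters(const std::vector<std::vector<double>>& rank,
                                   double lambda,
                                   std::vector<std::vector<int>>& order) {
    assert(lambda > 0.0);
    order.assign(rank.size(), std::vector<int>());
    std::vector<int> maxorder(rank.size(), 0);
    for (std::size_t current = 0; current < rank.size(); ++current) {
        // Extract the inner vector once, then index it like a plain array.
        // (Rcpp analogue: NumericVector r = rank[current];)
        const std::vector<double>& r = rank[current];
        order[current].resize(r.size());
        for (std::size_t i = 0; i < r.size(); ++i) {
            order[current][i] = static_cast<int>(r[i] / lambda) - 1;
            // Track the maximum while i is still a valid index.
            maxorder[current] = std::max(maxorder[current], order[current][i]);
        }
    }
    return maxorder;
}
```

Because each class keeps its own inner vector, the number of classes and the number of examples per class can vary freely, which is the flexibility the int** layout provided.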

Values of the function parameters are changing randomly (but they're not modified from the code)

I have to implement an NBC algorithm (for finding clusters in a provided set of data) for my class project with a friend. We came across a very strange issue. There are a few helper functions, and the one with the problem is kNN (possibly kEN too) in the kNB.h file. After arguments are passed to it from the main function of the program (for example k=3 and p=5), execution enters the kNN function and the values of k and p start changing randomly, even though the function code is not supposed to do that anywhere, as you can see below.
Also, while stepping through this function in the debugger, I noticed that execution sometimes jumps back to before the first while, which I think shouldn't happen. I guess it may be some trivial mistake, but I'm not very good at C++ (unfortunately we were ordered to use it). You can download the entire Visual Studio 2013 solution from here: https://dl.dropboxusercontent.com/u/1561186/EDAMI.zip. So, does anyone have any idea why this is happening?
static vector<int> kNN(int k, int p, Dataset<V>* records)
{
    int b = p, f = p;
    bool backwardSearch, forwardSearch;
    vector<int> tmp;
    LengthMetric<V>* lengthMetric = records->getLengthMetric();
    backwardSearch = PrecedingPoint(records, b);
    forwardSearch = FollowingPoint(records, f);
    int i = 0;
    while (backwardSearch && forwardSearch && i < k)
    {
        if (records->getRecord(p)->getLength() - records->getRecord(b)->getLength() < records->getRecord(f)->getLength() - records->getRecord(p)->getLength())
        {
            i++;
            tmp.push_back(b);
            backwardSearch = PrecedingPoint(records, b);
        }
        else
        {
            i++;
            tmp.push_back(f);
            forwardSearch = FollowingPoint(records, f);
        }
    }
    while (backwardSearch && i < k)
    {
        i++;
        tmp.push_back(b);
        backwardSearch = PrecedingPoint(records, b);
    }
    while (forwardSearch && i < k)
    {
        i++;
        tmp.push_back(f);
        forwardSearch = FollowingPoint(records, f);
    }
    return tmp;
}

Look at the second constructor of your Dataset class:
Dataset(vector<Record<V>*> rrecords,
        LengthMetric<V>* metric = new DumbLengthMetric<V>())
    : records(rrecords),
      lengthMetric(lengthMetric) { // <-------------------
lengthMetric(lengthMetric) initializes the member from itself, so the metric argument is never stored and the member keeps an indeterminate value. After changing it to lengthMetric(metric), I got results from your project and none of the variables changed unexpectedly.
BTW, do not include build output in the zip, such as the Debug and Release folders and *.sdf or *.ncb files.
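To illustrate the fix in isolation, here is a minimal sketch with hypothetical stand-in types (Metric and FixedDataset are not from the project). The member is initialized from the constructor parameter, whose name differs from the member's, so the self-initialization cannot slip in unnoticed.

```cpp
#include <cassert>

// Hypothetical stand-in for the project's LengthMetric<V> hierarchy.
struct Metric {
    int id;
};

class FixedDataset {
public:
    // Buggy form, for comparison:  : lengthMetric(lengthMetric) {}
    // There both names refer to the member, so it is initialized from its
    // own indeterminate value and the constructor argument is ignored.
    explicit FixedDataset(const Metric* metric) : lengthMetric(metric) {
        assert(lengthMetric != nullptr);
    }

    const Metric* getLengthMetric() const { return lengthMetric; }

private:
    const Metric* lengthMetric;
};
```

A garbage pointer in this member would explain the symptoms in the question: calls made through it can overwrite unrelated stack memory, which then looks like k and p changing on their own. Building with compiler warnings enabled can help catch this, since many compilers are able to flag a member that is initialized with itself.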

Rcpp function is SLOWER than the same R function

I have been coding an R function to compute an integral with respect to certain distributions; see the code below.
EVofPsi = function(psi, probabilityMeasure, eps=0.01, ...){
  distFun = function(u){
    probabilityMeasure(u, ...)
  }
  xx = yy = seq(0,1,length=1/eps+1)
  summand=0
  for(i in 1:(length(xx)-1)){
    for(j in 1:(length(yy)-1)){
      signPlus = distFun(c(xx[i+1],yy[j+1]))+distFun(c(xx[i],yy[j]))
      signMinus = distFun(c(xx[i+1],yy[j]))+distFun(c(xx[i],yy[j+1]))
      summand = c(summand, psi(c(xx[i],yy[j]))*(signPlus-signMinus))
    }
  }
  sum(summand)
}
It works fine, but it is pretty slow. It is common to hear that re-programming the function in a compiled language such as C++ would speed it up, especially because the R code above involves a double loop. So did I, using Rcpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double EVofPsiCPP(Function distFun, Function psi, int n, double eps) {
    NumericVector xx(n+1);
    NumericVector yy(n+1);
    xx[0] = 0;
    yy[0] = 0;
    // discretize [0,1]^2
    for(int i = 1; i < n+1; i++) {
        xx[i] = xx[i-1] + eps;
        yy[i] = yy[i-1] + eps;
    }
    Function psiCPP(psi);
    Function distFunCPP(distFun);
    double signPlus;
    double signMinus;
    double summand = 0;
    NumericVector topRight(2);
    NumericVector bottomLeft(2);
    NumericVector bottomRight(2);
    NumericVector topLeft(2);
    // compute the integral
    for(int i=0; i<n; i++){
        //printf("i:%d \n",i);
        for(int j=0; j<n; j++){
            //printf("j:%d \n",j);
            topRight[0] = xx[i+1];
            topRight[1] = yy[j+1];
            bottomLeft[0] = xx[i];
            bottomLeft[1] = yy[j];
            bottomRight[0] = xx[i+1];
            bottomRight[1] = yy[j];
            topLeft[0] = xx[i];
            topLeft[1] = yy[j+1];
            signPlus = NumericVector(distFunCPP(topRight))[0] + NumericVector(distFunCPP(bottomLeft))[0];
            signMinus = NumericVector(distFunCPP(bottomRight))[0] + NumericVector(distFunCPP(topLeft))[0];
            summand = summand + NumericVector(psiCPP(bottomLeft))[0]*(signPlus-signMinus);
            //printf("summand:%f \n",summand);
        }
    }
    return summand;
}
I'm pretty happy since this C++ function works fine. However, when I tested both functions, the C++ one ran SLOWER:
sourceCpp("EVofPsiCPP.cpp")
pFGM = function(u,theta){
  u[1]*u[2] + theta*u[1]*u[2]*(1-u[1])*(1-u[2])
}
psi = function(u){
  u[1]*u[2]
}
print(system.time(
  for(i in 1:10){
    test = EVofPsi(psi, pFGM, 1/100, 0.2)
  }
))
test
print(system.time(
  for(i in 1:10){
    test = EVofPsiCPP(psi, function(u){pFGM(u,0.2)}, 100, 1/100)
  }
))
So, is there some kind expert around willing to explain this to me? Did I code like a monkey, and is there a way to speed up that function? I also have a second question: I could have replaced the output type double by SEXP, and the argument types Function by SEXP as well, and it doesn't seem to change anything. So what is the difference?
Thank you very much in advance,
Gildas

Others have answered in comments already, so I'll just emphasize the point: calling back to R functions is expensive, as we need to be extra cautious about error handling. Just having the loop in C++ and calling R functions from it is not rewriting your code in C++. Try rewriting psi and pFGM as C++ functions and report back here what happens.
You might argue that you lose some flexibility and are no longer able to use any R function. For situations like this, I'd advise using some sort of hybrid solution where you implement the most common cases in C++ and fall back to an R solution otherwise.
As for the other question, a SEXP is an R object. This is part of the R API. It can be anything. When you create a Function from it (as is done implicitly for you when you create a function that takes a Function argument), you are guaranteed that this is indeed an R function. The overhead is very small, but the gain in terms of expressiveness of your code is huge.
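As a sketch of what "rewriting psi and pFGM as C++ functions" could look like, the version below hard-codes both functions from the question in plain C++ and keeps the same double loop, so nothing calls back into R. The name EVofPsiNative and the two-argument signatures are assumptions made for illustration; the wrapper needed to call it from R (for example a // [[Rcpp::export]] function) is left out so the block compiles without R.

```cpp
#include <cassert>

// FGM copula C(u, v) = uv + theta * uv(1-u)(1-v), mirroring pFGM in the question.
double pFGM(double u, double v, double theta) {
    return u * v + theta * u * v * (1.0 - u) * (1.0 - v);
}

// Integrand from the question: psi(u, v) = uv.
double psi(double u, double v) {
    return u * v;
}

// Approximates the integral of psi with respect to the FGM copula on an
// n-by-n grid over [0,1]^2, evaluating psi at the bottom-left cell corner.
double EVofPsiNative(int n, double theta) {
    assert(n > 0);
    const double eps = 1.0 / n;
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x0 = i * eps;
        const double x1 = (i + 1) * eps;
        for (int j = 0; j < n; ++j) {
            const double y0 = j * eps;
            const double y1 = (j + 1) * eps;
            // Probability mass the copula assigns to the cell [x0,x1] x [y0,y1].
            const double mass = pFGM(x1, y1, theta) + pFGM(x0, y0, theta)
                              - pFGM(x1, y0, theta) - pFGM(x0, y1, theta);
            total += psi(x0, y0) * mass;
        }
    }
    return total;
}
```

With theta = 0 the copula reduces to independence, every cell has mass eps^2, and the sum collapses to (eps times the sum of the grid points) squared, about 0.245 for n = 100. That gives a quick sanity check on the loop before timing it against the R version.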