Using gsl_function with a supplied R function - c++

I am currently attempting to use the gsl_function from GSL library through RcppGSL using a .cpp file and calling it using sourceCpp(). The idea is to perform numerical integration with gsl_integration_qags, also from GSL. My C code invokes a user defined R function (SomeRFunction in my code below) saved into the global environment. The code is:
#include <RcppGSL.h>
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_integration.h>
#include <gsl/gsl_vector.h>
// [[Rcpp::depends(RcppGSL)]]
// [[Rcpp::export]]
double integfn1(double ub, double lb, double abserr, double Rsq, int pA){
double result, abserror;
Rcpp::Environment global = Rcpp::Environment::global_env();
Rcpp::Function f1 = global["SomeRFunction"];
struct my_f_params { double Rsq; int pA; };
struct my_f_params params = { Rsq, pA };
gsl_function F;
F.function = & f1;
F.params = &params;
double lb1 = lb;
double ub1 = ub;
gsl_integration_workspace * w = gsl_integration_workspace_alloc (1000);
gsl_integration_qags (&F, lb1, ub1, 1e-8, 1e-8, 1000, w, &result, &abserror);
printf ("result = % .18f\n", result);
printf ("estimated error = % .18f\n", abserror);
gsl_integration_workspace_free (w);
return result;
}
And the following error comes up:
"cannot convert 'Rcpp::Function* {aka Rcpp::Function_Impl<Rcpp::PreserveStorage>*}' to 'double (*)(double, void*)' "
The problem is in the line where I declare what the function to integrate is (i.e. "F.function = & f1;").
I looked up similar problems but couldn't find anything listed... Any hint will be greatly appreciated!
Many thanks

I created a working example (and timed it against C) in which you can pass an arbitrary user-defined R function to the GSL function QAWF. You should be able to generalize it to other gsl functions as well.
See the example here: https://sites.google.com/site/andrassali/computing/user-supplied-functions-in-rcppgsl
As noted above, the competing C implementation is much-much faster.

Quick comments:
You write: "My C code invokes a user defined R function (SomeRFunction in my code below) saved into the global environment." So your C code will still slow down a lot at every function evaluation, because of the cost of calling back into R and the slower speed of R itself.
My RcppDE package (which is also for optimisation) has an example of using external pointers (Rcpp::XPtr) to pass a user-defined C function down to the optimiser. Same flexibility, better speed.
Your compile error is exactly at that intersection: you can't "just pass" an Rcpp object to a void pointer. Rcpp::XPtr helps you there, but you also need to know what you are doing, so a working example may be helpful.
This is really an Rcpp question but you didn't add the tag, so I just did.

F.function expects a function with the signature double (*) (double x, void * params), i.e. a function taking a double and then a void*. Rcpp::Function does not have that signature, so you need to help it along if you want this to fly.
This is typical of C APIs, which strip away types. What you need to pass as the function is something that understands what you have in params, can call the R function with the right arguments, and converts the result to a double. The simple way to do this is to treat the R function as data, so augment the struct to make it something like this:
struct my_f_params { double Rsq; int pA; Function& rfun; };
This way, the function you pass as function can be something like this:
double proxy_fun( double x, void* params ){
    // params is the my_f_params whose address was stored in F.params
    my_f_params* data = static_cast<my_f_params*>(params) ;
    return as<double>( data->rfun(x, data->Rsq, data->pA ) ) ;
}
So you can build your gsl_function something like this:
my_f_params params = { Rsq, pA, f1 };
gsl_function F ;
F.function = &proxy_fun ;
F.params = &params;
With a bit of C++11 variadic templates and tuples, you could generalize this to anything instead of the (double, int) pair but it is more work.
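To give an idea of what that generalization could look like, here is a minimal sketch. It uses std::function as a stand-in for Rcpp::Function and a hand-written c_function struct with the same shape as gsl_function, so that it compiles without R or GSL. The names call_params, invoke_with and make_c_function are made up for this illustration, and it relies on std::index_sequence, so it needs C++14 rather than C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <tuple>
#include <utility>

// Stand-in with the same shape as GSL's gsl_function (assumption: defined
// here only so the sketch compiles without GSL installed).
struct c_function {
    double (*function)(double x, void* params);
    void* params;
};

// Hypothetical parameter record: a callable plus the extra arguments it
// needs. std::function stands in for Rcpp::Function.
template <typename... Args>
struct call_params {
    std::function<double(double, Args...)> fun;
    std::tuple<Args...> extra;
};

// Unpack the tuple into the call.
template <typename... Args, std::size_t... I>
double invoke_with(call_params<Args...>& p, double x, std::index_sequence<I...>) {
    return p.fun(x, std::get<I>(p.extra)...);
}

// One proxy per argument list; its address has the plain C signature.
template <typename... Args>
double proxy_fun(double x, void* params) {
    auto* p = static_cast<call_params<Args...>*>(params);
    return invoke_with(*p, x, std::index_sequence_for<Args...>{});
}

// Build the C-style callback record. `p` must outlive the returned struct.
template <typename... Args>
c_function make_c_function(call_params<Args...>& p) {
    assert(p.fun != nullptr);  // the callable must be set
    return c_function{&proxy_fun<Args...>, &p};
}

// Example integrand taking the (double, int) extras from the question.
inline double example_integrand(double x, double Rsq, int pA) {
    return Rsq * x + pA;
}
```

With call_params<double, int> p{example_integrand, std::make_tuple(0.5, 3)}, the record returned by make_c_function(p) can be handed to any routine expecting the (double, void*) callback shape. With Rcpp, fun would wrap the R function and the as<double> conversion.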

Related

Is there a way to check the arity of Rcpp::Function?

I need to check the arity of a function in an Rcpp block at run time. What I would like to do is something akin to the following:
double loglikelihood(Rcpp::List data, Rcpp::List params, SEXP i, Rcpp::RObject custom_function) {
Rcpp::Function f = Rcpp::as<Rcpp::Function>(custom_function);
double res = 0.0;
if (arity(f) == 3) {
res = Rcpp::as<double>(f(data, params, i));
} else if (arity(f) == 2) {
res = Rcpp::as<double>(f(data, params));
}
return res;
}
However, the limited documentation I've seen for Rcpp does not seem to contain a function for checking the arity of an Rcpp::Function. Is there any way to do this?
The "limited documentation" (currently ten pdf vignettes alone) tells you, among other things, that all we have from R itself is .Call() returning SEXP and taking (an arbitrary number of) SEXP objects which can be a function. So all this ... goes back to the R API which may, or may not, have such an accessor which may or may not be public and supposed to be used by anybody but R itself.
These days we register compiled functions with R (typically in a file src/init.c or similar), where the number of arguments is passed as a second argument (beyond the function call name) when making the registration. This suggests to me that it is not discoverable.
So I solved this using a workaround that is a little bit clunky but after giving it some serious thought, this is the least clunky of three methods I tried implementing.
The method I ended up going with is checking the arity of the function on the R side using methods::formalArgs, wrapping the (function, arity) pair in a list and passing that to the Rcpp function like so:
double loglikelihood(Rcpp::List data, Rcpp::List params,
SEXP i, Rcpp::RObject custom_function) {
Rcpp::List l = Rcpp::as<Rcpp::List>(custom_function);
Rcpp::Function f = Rcpp::as<Rcpp::Function>(l[0]);
int arity = l[1];
double res = 0.0;
if (arity == 3) {
res = Rcpp::as<double>(f(data, params, i));
} else if (arity == 2) {
res = Rcpp::as<double>(f(data, params));
}
return res;
}
As I mentioned, this is a bit clunky and it changes the signature of the function, which is not ideal. Another way of doing this would be to use the forgiveness-rather-than-permission approach and do the control flow in a try-catch block, like so:
double loglikelihood(Rcpp::List data, Rcpp::List params,
SEXP i, Rcpp::RObject custom_function) {
Rcpp::Function f = Rcpp::as<Rcpp::Function>(custom_function);
double res = 0.0;
try {
res = Rcpp::as<double>(f(data, params, i));
} catch (const std::exception &e) {
res = Rcpp::as<double>(f(data, params));
}
return res;
}
This approach is less clunky, but the problem with it is that it also catches other exceptions that might arise within f and silences them so they aren't passed to the user. It is possible that there are more fine-grained exceptions defined in Rcpp that would be able to catch the specific error of passing too many parameters, but if so I haven't found it.
Lastly, we could pass methods::formalArgs to loglikelihood and query the arity just before we need to use it, but I think this approach is the clunkiest of the three, because it requires us to pass formalArgs around a lot.

How do I include more objective functions in my TMB .cpp file?

The TMB objective functions are seemingly defined in one function block that is saved to a <name>.cpp file. Then, after compiling the file, each objective function is accessed by loading with the command dyn.load(dynlib(<name>)).
Is it possible to store more than one objective function in each .cpp file? For example, the following two objective functions are very similar to each other, but at the moment need to be saved to different files:
// TMB Tutorial but with fixed variance
#include <TMB.hpp> // Links in the TMB libraries
template<class Type>
Type objective_function<Type>::operator() ()
{
DATA_VECTOR(x); // Data vector transmitted from R
PARAMETER(mu); // Parameter value transmitted from R
Type sigma = 1.0;
Type f; // Declare the "objective function" (neg. log. likelihood)
f = -sum(dnorm(x,mu,sigma,true)); // Use R-style call to normal density
return f;
}
and
// TMB Tutorial
#include <TMB.hpp> // Links in the TMB libraries
template<class Type>
Type objective_function<Type>::operator() ()
{
DATA_VECTOR(x); // Data vector transmitted from R
PARAMETER(mu); // Parameter value transmitted from R
PARAMETER(sigma); //
Type f; // Declare the "objective function" (neg. log. likelihood)
f = -sum(dnorm(x,mu,sigma,true)); // Use R-style call to normal density
return f;
}
The "map" argument to MakeADFun() allows you to fix parameters at specific values.
In this example, we only need to compile/load the latter template. First, we'll write the template to a file, compile, and load the resulting DLL.
library(TMB)
file_conn <- file('test.cpp')
writeLines("
#include <TMB.hpp>
template<class Type>
Type objective_function<Type>::operator() ()
{
DATA_VECTOR(x);
PARAMETER(mu);
PARAMETER(sigma);
Type f;
f = -sum(dnorm(x,mu,sigma,true));
return f;
}
", file_conn)
close(file_conn)
compile('test.cpp')
dyn.load(dynlib('test'))
We can use the same DLL to fit models with and without a varying sigma.
n <- 100
x <- rnorm(n = n, mean = 0, sd = 1)
f1 <- MakeADFun(data = list(x = x),
parameters = list(mu = 0, sigma = 1),
DLL = 'test')
f2 <- MakeADFun(data = list(x = x),
parameters = list(mu = 0, sigma = 1),
DLL = 'test',
map = list(sigma = factor(NA)))
opt1 <- do.call('optim', f1)
opt2 <- do.call('optim', f2)
When using "map", the specified parameter(s) (sigma in this case) is fixed at the value given in "parameters".
Post-optimization, we do a sanity check -- the mus ought to be nearly identical.
> opt1$par
mu sigma
0.08300554 1.07926521
> opt2$par
mu
0.08300712
Turning random effects on and off is a little more difficult. An example of that is given here, where you can use CppAD::Variable() to check whether to decrement the negative log-likelihood or not.
For unalike objective functions (not subsets of one another), you could pass a DATA_INTEGER or DATA_STRING into the template, e.g. like they did in glmmTMB here, and pick the objective function depending on the value of that DATA_*.
I just wanted to clarify what @alexforrence meant by
For unalike objective functions (not subsets of one another), you could pass a DATA_INTEGER or DATA_STRING into the template, e.g. like they did in glmmTMB here, and pick the objective function depending on the value of that DATA_*
It turns out there is a code snippet on the TMB github that covers this scenario, which I duplicate here:
#include <TMB.hpp>
template<class Type>
Type objective_function<Type>::operator() ()
{
DATA_STRING(model_type);
if (model_type == "model1") {
#include "model1.h"
} else
if (model_type == "model2") {
#include "model2.h"
} else {
error ("Unknown model type");
}
return 0;
}
That is, pass in a string telling the objective function which model to choose, and then include the text for that function stored in a separate .h file.

Define a function object f1(x) from another function f0(x,y), but setting y to a fixed value c++

Suppose I have a function
void f0(double x, double parameters[]) { ... }
and I want to define a function object
std::function <void (double x) >f1
such that, for example, f1(x) = f0(x,a) where a is a specified set of parameters (e.g. double parameters[4] = {1.0, 2.9, 6.2, 2.1})
How would I do this? My thought is to write a function that takes a as input and returns f1, but I'm not sure how to do this.
The motivation is that I have yet another function FUNC in a library that takes a function with a single double argument, but I want the flexibility to pass extra parameters to that function.
This is very simple with a lambda:
std::function<void(double)> f1 =
[&parameters](double x) { f0(x, parameters); };
Alternatively, you could use std::bind.
using namespace std::placeholders;
std::function<void(double)> f2 = std::bind(f0, _1, parameters);
But I much prefer lambdas for almost any situation.
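One caveat with [&parameters]: the lambda only holds a reference, so f1 must not outlive the array. If you want the "function that takes a and returns f1" from the question, capturing by value avoids that. A minimal sketch, assuming the parameters fit in a std::array of four doubles; f0 here is a made-up stand-in that returns a value (the original returns void) so that the effect is visible:

```cpp
#include <array>
#include <cassert>
#include <functional>

// Hypothetical stand-in for the question's f0. It returns a value here
// (the original returns void) so the result can be observed.
inline double f0(double x, const double parameters[]) {
    assert(parameters != nullptr);
    return parameters[0] + parameters[1] * x;
}

// Takes the parameter set `a` and returns f1 with f1(x) == f0(x, a).
// The array is captured by value, so f1 stays valid after `a` is gone.
inline std::function<double(double)> make_f1(const std::array<double, 4>& a) {
    return [a](double x) { return f0(x, a.data()); };
}
```

The returned object owns its own copy of the parameters, so it can be stored or passed to FUNC without worrying about the lifetime of the original array.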

Is this C callback safe with C++ objects?

My purpose is to call some C function from my C++ code and pass some C++ objects.
In fact I am using an integration routine from the GSL library (written in C), see this link.
My code snippet:
// main.cpp
#include <stdio.h>
#include <gsl/gsl_integration.h>
#include <myclass.h>
/* my test function. */
double testfunction ( double x , void *param ) {
myclass *bar=static_cast<myclass*>(param);
/*** do something with x and bar***/
return val;
}
int main ( int argc , char *argv[] ) {
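gsl_function F; // defined in GSL: double (* function) (double x, void * params)
double res, abserr; // integral estimate and its error estimate
size_t neval; // number of function evaluations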
/* initialize.*/
gsl_integration_cquad_workspace *ws =
gsl_integration_cquad_workspace_alloc( 200 ) ;
/* Prepare test function. */
myclass foo{}; // call myclass constructor
F.function = &testfunction;
F.params = &foo;
/* Call the routine. */
gsl_integration_cquad( &F, 0.0,1.0,1.0e-10,1.0e-10,ws, &res,&abserr,&neval);
/* Free the workspace. */
gsl_integration_cquad_workspace_free( ws );
return 0;
}
In my case, calling gsl_integration_cquad directly seems OK, provided the header includes something like "ifdef __cplusplus". My concern is about the callback F, originally defined in C: am I allowed to pass testfunction and also the C++ object foo in this way?
Or is there a better way to do this kind of thing, maybe overloading or using a functor?
P.S. Am I allowed to do exception handling within the callback function (use try/catch inside "testfunction")? It works in my case, but I'm not sure if it's legal.
I'm not familiar with the library in question, but in general,
when passing a pointer to a callback and a void* to
a C routine, which will call the callback back with the void*,
there are two things you need to do to make it safe:
The function whose address you pass must be declared extern "C".
You'll get away with not doing this with a lot of compilers, but
it isn't legal, and a good compiler will complain.
The type you convert to the void* must be exactly the same
type as the type you cast it back to in the callback. The
classic error is to pass something like new Derived to the
C function, and cast it back to Base* in the callback. The
round trip Derived*→void*→Base* is undefined
behavior. It will work some of the time, but at other times, it
may crash, or cause any number of other problems.
And as cdhowie pointed out in a comment, you don't want to
allow exceptions to propagate across the C code. Again, it
might work. But it might not.
For the exact example you posted, the only thing you need to do
is to declare testfunction as extern "C", and you're all
right. If you later start working with polymorphic objects,
however, beware of the second point.
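To illustrate the second point, here is a small sketch of one way to keep both ends of the round trip on the same type: convert to Base* before the pointer becomes a void*, and cast back to exactly Base* in the callback. The classes Tag, Base and Derived are invented for this example. Tag is listed first among the bases so that the Base subobject typically does not sit at the start of the object, which is the situation where a direct Derived* to void* to Base* trip goes wrong.

```cpp
#include <cassert>

struct Tag { int id = 7; };            // hypothetical first base class

struct Base {                          // interface the callback relies on
    virtual ~Base() = default;
    virtual double scale() const = 0;
};

// With Tag listed first, the Base subobject usually has a different
// address than the Derived object itself.
struct Derived : Tag, Base {
    double scale() const override { return 2.0; }
};

// Callback with the C signature; it casts back to exactly Base*.
extern "C" double scaled(double x, void* param) {
    assert(param != nullptr);
    const Base* b = static_cast<Base*>(param);
    return b->scale() * x;
}

// Convert to Base* first, then to void*, so both ends agree on the type.
inline void* as_callback_param(Derived& d) {
    Base* b = &d;   // implicit, well-defined derived-to-base conversion
    return b;
}
```

Passing as_callback_param(d) rather than &d as the void* is the whole trick: the derived-to-base adjustment happens while the compiler still knows the types.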
You can use
myclass *bar=static_cast<myclass*>(param);
with void*.
If you meant something like transporting a c++ class pointer through a c callback's void* pointer, yes it's safe to do a static_cast<>.
No C++-specific attributes of the class pointer are lost when it is passed through C code. However, passing a derived class pointer and static-casting it back to the base class won't work properly, as @James Kanze pointed out.
The void* will likely just be passed through by the C library without looking at the pointed-to data, so it's not a problem if it points to a C++ class. As long as you cast the void* back to the correct type, there shouldn't be any problems.
To make sure the callback function itself is compatible, you can declare it as extern "C". Additionally you should make sure that no exceptions are thrown from the callback function, since the C code calling the callback won't expect those.
All together I would split up the code into one function that does the real work and another function that is used as the callback and handles the interface with the C library, for example like this:
#include <math.h>
#include <iostream>
double testfunction ( double x ,myclass *param ) {
/*** do something with x and bar***/
return val;
}
extern "C" double testfunction_callback ( double x , void *param ) {
try {
myclass *bar=static_cast<myclass*>(param);
return testfunction(x, bar);
}
catch(...) {
std::cerr << "Noooo..." << std::endl;
return NAN;
}
}
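If swallowing the exception and returning NAN loses too much information, a variation is to store the exception in the parameter struct and rethrow it on the C++ side once the C routine has returned. A minimal sketch of that idea follows. The routine c_sum is a made-up stand-in for the C library call, and guarded_params, work and checked_sum are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <exception>
#include <stdexcept>

// Stand-in for a C routine that calls back (assumption: not a GSL function).
extern "C" double c_sum(double (*f)(double, void*), void* params, int n) {
    assert(f != nullptr);
    double total = 0.0;
    for (int i = 0; i < n; ++i) total += f(static_cast<double>(i), params);
    return total;
}

struct guarded_params {
    double limit;              // hypothetical user parameter
    std::exception_ptr error;  // set by the callback if the work throws
};

// The real work; allowed to throw.
inline double work(double x, const guarded_params& p) {
    if (x > p.limit) throw std::domain_error("x exceeds limit");
    return x;
}

// The callback: no exception escapes into the C caller.
extern "C" double work_callback(double x, void* params) {
    auto* p = static_cast<guarded_params*>(params);
    try {
        return work(x, *p);
    } catch (...) {
        if (!p->error) p->error = std::current_exception();  // keep the first
        return std::nan("");
    }
}

// Call the C routine, then rethrow once control is back in C++ code.
inline double checked_sum(double limit, int n) {
    guarded_params p{limit, std::exception_ptr()};
    const double r = c_sum(&work_callback, &p, n);
    if (p.error) std::rethrow_exception(p.error);
    return r;
}
```

The C routine still runs to completion on NaN values after the first failure, so this only helps with error reporting and does not abort the computation early.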

Calling GSL function inside a class in a shared library

I'm trying to make a shared library in C++ implementing tools for Fermi gases. I'm using the GSL library to solve a function numerically, and my code runs without a problem as a standalone script, but when I try to convert it to a shared library with classes I run into problems.
I've seen similar questions:
Q1
Q2
Q3
I'm fairly new to c++-programming and cannot seem to adapt the different answers to my problem. Probably since I do not quite understand the answers.
My code is:
/* Define structure for the GSL-function: chempot_integrand */
struct chempot_integrand_params { double mu; double T; };
double
ChemicalPotential::chempot_integrand (double x, void * params){
/* Computes the integrand for the integral used to obtain the chemical potential.
*
* This is a GSL-function, which are integrated using gsl_integration_qag.
*/
// Get input parameters.
struct chempot_integrand_params * p = (struct chempot_integrand_params *) params;
double mu = p->mu;
double T = p->T;
// Initiate output parameters for GSL-function.
gsl_sf_result_e10 result;
int status = gsl_sf_exp_e10_e( ( gsl_pow_2(x) - mu ) / T , &result );
if (status != GSL_SUCCESS){
printf ("Fault in calculating exponential function.");
}
// Return (double) integrand.
return (gsl_pow_2(x) / ( 1 + result.val * gsl_sf_pow_int(10,result.e10) ));
}
/* Define structure for the GSL-function: chempot_integration */
struct chempot_integral_params { double T; };
double
ChemicalPotential::chempot_integration (double mu, double T){
/* Computes the integral used to obtain the chemical potential using the integrand: chempot_integrand.
*/
// Set input parameters for the integrand: chempot_integrand.
struct chempot_integrand_params params_integrand = { mu, T };
// Initiate the numerical integration.
gsl_integration_workspace * w = gsl_integration_workspace_alloc (1000); // Allocate memory for the numerical integration. Can be made larger if necessary; REMEMBER to change it in the call to gsl_integration_qag as well.
double result, error;
gsl_function F;
F.function = &ChemicalPotential::chempot_integrand;
F.params = &params_integrand;
// Upper limit for integration
double TOL = 1e-9;
double upp_lim = - T * gsl_sf_log(TOL) + 10;
gsl_integration_qag (&F, 0, upp_lim, 1e-12, 1e-12, 1000, 6, w, &result, &error);
// Free memory used for the integration.
gsl_integration_workspace_free (w);
return result;
}
and when compiling I get the error
error: cannot convert ‘double (Fermi_Gas::ChemicalPotential::*)(double, void*)’ to ‘double (*)(double, void*)’
in line
F.function = &ChemicalPotential::chempot_integrand;
It is indeed interesting that people ask this over and over again. One reason may be that the proposed solutions are not easy to understand. I for one had problems understanding and implementing them. (the solutions did not work out of the box for me, as you might expect.)
With the help of tlamadon I just figured out a solution that may be helpful here as well. Let's see what you guys think.
So just to recap, the problem is that you have a class that contains a member function on which you want to operate with something from the GSL library. Our example is useful if the GSL interface requires a
gsl_function F;
see here for a definition.
So here is the example class:
class MyClass {
private:
gsl_f_pars *p; // not necessary to have as member
public:
double obj(double x, void * pars); // objective fun
double GetSolution( void );
void setPars( gsl_f_pars * xp ) { p = xp; };
double getC( void ) ; // helper fun
};
The objective of this exercise is to be able to
instantiate MyClass test,
supply it with a parameter struct (or write a corresponding constructor), and
call test.GetSolution() on it, which should return whatever the GSL function was used for (the minimum of obj, a root, the integral or whatever)
The trick is now to have an element in the parameter struct gsl_f_pars which is a pointer to MyClass. Here's the struct:
struct gsl_f_pars {
double a;
double b;
double c;
MyClass * pt_MyClass;
};
The final piece is to provide a wrapper that will be called inside MyClass::GetSolution() (the wrapper is a stand in for the member function MyClass::obj, which we cannot just point to with &obj inside the class). This wrapper will take the parameter struct, dereference pt_MyClass and evaluate pt_MyClass's member obj:
// Wrapper that points to member function
// Trick: MyClass is an element of the gsl_f_pars struct
// so we can tease the value of the objective function out
// of there.
double gslClassWrapper(double x, void * pp) {
gsl_f_pars *p = (gsl_f_pars *)pp;
return p->pt_MyClass->obj(x,p);
}
The full example is a bit too long to post here, so I put up a gist. It's a header file and a cpp file, it should be working wherever you have GSL. Compile and run with
g++ MyClass.cpp -lgsl -o test
./test
This is a duplicate question. See Q1 or Q2 for example. Your problem is the following: you cannot convert a pointer to a member function into a free function pointer. There are two options. You can declare your member function static (which is bad in 90% of cases, because the function is then not attached to any instance of your class, which is exactly why it can be converted to a free function pointer), or you can use the wrapper you linked, which uses a static member function under the hood to make your code compatible with GSL without declaring your particular member function static.
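For illustration, the "static member function under the hood" idea can be sketched as follows: a private static trampoline has the plain (double, void*) signature, and the object itself travels through the void* so the trampoline can forward to an ordinary member. This is only a sketch under assumptions. c_function and integrate are stand-ins for gsl_function and the GSL integration routine so that the example compiles without GSL, and Parabola is an invented class.

```cpp
#include <cassert>
#include <cmath>

// Stand-in with the same shape as GSL's gsl_function.
struct c_function {
    double (*function)(double x, void* params);
    void* params;
};

// Stand-in midpoint-rule integrator (assumption: takes the place of
// gsl_integration_qag for the purpose of this sketch).
inline double integrate(const c_function& F, double a, double b, int n) {
    assert(n > 0);
    const double h = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += F.function(a + (i + 0.5) * h, F.params);
    return sum * h;
}

class Parabola {  // hypothetical class
public:
    explicit Parabola(double c) : c_(c) {}

    // Ordinary non-static member that uses object state.
    double integrand(double x) const { return c_ * x * x; }

    double integral(double a, double b) const {
        c_function F;
        F.function = &Parabola::trampoline;      // static: a plain function pointer
        F.params = const_cast<Parabola*>(this);  // the object travels as void*
        return integrate(F, a, b, 1000);
    }

private:
    // Static trampoline: recovers the object and forwards the call.
    static double trampoline(double x, void* params) {
        return static_cast<const Parabola*>(params)->integrand(x);
    }
    double c_;
};
```

Only the trampoline is static. The integrand keeps full access to the instance, which addresses the objection to making the integrand itself static.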
EDIT @Florian Oswald. Basically your entire solution can be implemented in 2 lines using std::bind and the wrapper I cited before
gsl_function_pp Fp( std::bind(&Class::member_function, &(*this), std::placeholders::_1) );
gsl_function *F = static_cast<gsl_function*>(&Fp);
In practice this is just 1 extra line compared to pure C code!
As I stated in a comment, wrapping every member function that you want to integrate using an extra global struct and an extra global function is cumbersome and pollutes your code with a lot of extra functions/structs that are completely unnecessary. Why use C++ if we refuse to use the features that make C++ powerful and useful (in comparison to C)?
Another classic example: if you want to pass a LOT of parameters, use lambda functions (no extra struct or global functions)!
To be more precise: Imagine you have 2 parameters (doubles) .
//Declare them (locally) here
double a1 = ...;
double a2 = ...;
// Declare a lambda function that capture all of them by value or reference
// no need to write another struct with these 2 parameters + class pointer
auto ptr = [&](double x)->double { /* ... */ };
// Cast to GSL in 3 lines using the wrapper
std::function<double(double)> F1(ptr);
gsl_function_pp F2(F1);
gsl_function *F = static_cast<gsl_function*>(&F2);
No extra global struct or global functions and no extra wrapper (the same wrapper that solved the problem of integrating a member function also solves the problem of integrating a lambda expression). Of course this is a matter of style in the end, but without these nice features that allow the use of C libraries without code bloat, I would never leave C.
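Since the wrapper itself is only linked and not shown here, the following is a sketch of one plausible shape for it, not necessarily the cited implementation. It derives from a struct with the same layout as gsl_function (called c_function below so the example compiles without GSL), stores a std::function, and points the C callback at a static member. function_pp and integrate are assumed names for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>

// Stand-in with the same shape as GSL's gsl_function. With GSL available,
// the wrapper would derive from gsl_function instead.
struct c_function {
    double (*function)(double x, void* params);
    void* params;
};

// Sketch of a gsl_function_pp-style wrapper (assumption: the cited wrapper
// is not shown in the text, so this is one plausible shape for it).
class function_pp : public c_function {
public:
    explicit function_pp(std::function<double(double)> f) : f_(std::move(f)) {
        function = &function_pp::invoke;  // static member, plain C signature
        params = this;                    // the wrapper is its own params
    }
    // params points at *this, so copying would leave a dangling pointer.
    function_pp(const function_pp&) = delete;
    function_pp& operator=(const function_pp&) = delete;

private:
    static double invoke(double x, void* params) {
        return static_cast<function_pp*>(params)->f_(x);
    }
    std::function<double(double)> f_;
};

// Stand-in midpoint-rule integrator taking the C-style struct.
inline double integrate(const c_function* F, double a, double b, int n) {
    assert(F != nullptr && n > 0);
    const double h = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += F->function(a + (i + 0.5) * h, F->params);
    return sum * h;
}
```

A lambda that captures local parameters, or a std::bind of a member function with this, can then be wrapped and passed on as a pointer to the base struct, for example function_pp Fp([&](double x) { return a1 * x + a2; }); followed by integrate(&Fp, 0.0, 1.0, 100).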