I need to check the arity of a function in an Rcpp block at run time. What I would like to do is something akin to the following:
double loglikelihood(Rcpp::List data, Rcpp::List params, SEXP i, Rcpp::RObject custom_function) {
Rcpp::Function f = Rcpp::as<Rcpp::Function>(custom_function);
double res = 0.0;
if (arity(f) == 3) {
res = Rcpp::as<double>(f(data, params, i));
} else if (arity(f) == 2) {
res = Rcpp::as<double>(f(data, params));
}
return res;
}
However, the limited documentation I've seen for Rcpp does not seem to contain a function for checking the arity of an Rcpp::Function. Is there any way to do this?
The "limited documentation" (currently ten pdf vignettes alone) tells you, among other things, that all we have from R itself is .Call() returning SEXP and taking (an arbitrary number of) SEXP objects which can be a function. So all this ... goes back to the R API which may, or may not, have such an accessor which may or may not be public and supposed to be used by anybody but R itself.
These days we register compiled functions with R (typically in a file src/init.c or alike), where the number of arguments is passed as a second argument (after the function call name) when making the registration. Which suggests to me that it is not discoverable.
So I solved this using a workaround that is a little bit clunky but after giving it some serious thought, this is the least clunky of three methods I tried implementing.
The method I ended up going with is checking the arity of the function on the R side using methods::formalArgs, wrapping the (function, arity) pair in a list and passing that to the Rcpp function like so:
double loglikelihood(Rcpp::List data, Rcpp::List params,
SEXP i, Rcpp::RObject custom_function) {
Rcpp::List l = Rcpp::as<Rcpp::List>(custom_function);
Rcpp::Function f = Rcpp::as<Rcpp::Function>(l[0]);
int arity = l[1];
double res = 0.0;
if (arity == 3) {
res = Rcpp::as<double>(f(data, params, i));
} else if (arity == 2) {
res = Rcpp::as<double>(f(data, params));
}
return res;
}
As I mentioned, this is a bit clunky and it changes the signature of the function, which is not ideal. Another way of doing this would be to use the forgiveness-rather-than-permission approach and do the control flow in a try-catch block, like so:
double loglikelihood(Rcpp::List data, Rcpp::List params,
SEXP i, Rcpp::RObject custom_function) {
Rcpp::Function f = Rcpp::as<Rcpp::Function>(custom_function);
double res = 0.0;
try {
res = Rcpp::as<double>(f(data, params, i));
} catch (const std::exception &e) {
res = Rcpp::as<double>(f(data, params));
}
return res;
}
This approach is less clunky, but the problem with it is that it also catches other exceptions that might arise within f and silences them so they aren't passed to the user. It is possible that there are more fine-grained exceptions defined in Rcpp that would be able to catch the specific error of passing too many parameters, but if so I haven't found it.
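To illustrate the concern in plain C++ (deliberately without Rcpp): the fallback is only safe if the catch clause names an error type specific to the arity mismatch. The sketch below assumes a hypothetical arity_error type and uses std::function stand-ins for the R callback; whether Rcpp offers a comparably specific exception is not established here.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical error type standing in for "called with too many arguments".
struct arity_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Try the three-argument form first and fall back to the two-argument form
// only when the failure is an arity_error. Any other exception propagates
// to the caller instead of being silenced.
double call_with_fallback(const std::function<double(double, double, int)>& f3,
                          const std::function<double(double, double)>& f2,
                          double data, double params, int i) {
    try {
        return f3(data, params, i);
    } catch (const arity_error&) {
        return f2(data, params);
    }
}
```

With a catch clause this narrow, an unrelated failure inside the callback (for example a std::logic_error) still reaches the user.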
Lastly, we could pass methods::formalArgs to loglikelihood and query the arity just before we need to use it, but I think this approach is the clunkiest of the three, because it requires us to pass formalArgs around a lot.
Assume that I have a functionality which I want to call whenever a timer finishes. I have put that piece of functionality in a lambda function. Furthermore, in that function, I may wish to set another timer to call that same lambda on another, later occasion.
void doSetupThingsInSomeDecoupledCodeOrWhatever() {
std::function<void(float)> semiRecursiveFunc;
semiRecursiveFunc = [&semiRecursiveFunc](float deltaT){
if (whatever()) {
// Do something...
}
else {
// Do something else, then:
float durationMS = getRNGSystem().getUniformFloat(1000.0f, 5000.0f);
// Gives the timer a duration to wait, and a function to run at the end of it.
getTimerSystem().setNewTimer(durationMS, semiRecursiveFunc);
}
};
float durationMS = getRNGSystem().getUniformFloat(1000.0f, 5000.0f);
// Gives the timer a duration to wait, and a function to run at the end of it.
getTimerSystem().setNewTimer(durationMS, semiRecursiveFunc);
}
Now, clearly this won't work, because semiRecursiveFunc is tied to the scope of doSetupThingsInSomeDecoupledCodeOrWhatever, and when the timer system tries to run it the function will no longer exist and everything will disintegrate into a spectacular ball of flame.
What's the best way to manage this? I can't store semiRecursiveFunc in a pointer because one can't declare lambdas like that, as far as I can tell. Is there some common tool for this sort of persistent-lambda use-case? What's the least ugly approach, with minimum surrounding infrastructure? Is there a best-practice to follow, some relevant tool I've missed? Any suggestions or recommendations would be much appreciated.
What you're looking for is a y-combinator, sometimes called a fixed-point combinator.
Either way, instead of using std::function at all (which adds needless overhead), you would write your callback like this:
auto semiRecursiveCallback = combinator([](auto self, float deltaT){
if (whatever()) {
// Do something...
}
else {
// Do something else, then:
float durationMS = getRNGSystem().getUniformFloat(1000.0f, 5000.0f);
// Gives the timer a duration to wait, and a function to run at the end of it.
// NB: we pass 'self' as the argument
getTimerSystem().setNewTimer(durationMS, self);
}
});
Where combinator is either the y_combinator implementation of my linked answer or boost::hof::fix from the excellent Boost.HOF library.
The combinator ensures that the object itself has access to itself, so you can do recursive things. In the above code, you're actually getting passed a copy of yourself, but that's fine: value semantics are cool like that.
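Since the linked implementation is not reproduced here, the following is only a minimal sketch of what such a combinator helper might look like. The names y_combinator_t and this particular combinator function are assumptions for illustration, not the code from the linked answer or from Boost.HOF.

```cpp
#include <cassert>
#include <utility>

// Minimal fixed-point combinator sketch: stores a callable and, on each call,
// passes itself as the first argument so the callable can recurse or hand
// itself on to someone else.
template <class F>
class y_combinator_t {
    F f_;
public:
    explicit y_combinator_t(F f) : f_(std::move(f)) {}

    template <class... Args>
    decltype(auto) operator()(Args&&... args) const {
        return f_(*this, std::forward<Args>(args)...);
    }
};

// Helper so the closure type is deduced at the call site.
template <class F>
y_combinator_t<F> combinator(F f) {
    return y_combinator_t<F>(std::move(f));
}
```

A lambda that actually calls self inside its body needs an explicit return type, for example combinator([](auto self, int n) -> int { return n <= 1 ? 1 : n * self(n - 1); }), because its return type cannot be deduced while it is still being instantiated.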
Here is a tiny Y-combinator:
template<class R>
auto Y = [] (auto f) {
auto action = [=] (auto action) {
return [=] (auto&&... args)->R {
return f( action(action),decltype(args)(args)... );
};
};
return action(action);
};
Just do this:
auto semiRecursiveFunc = Y<void>([](auto&& semiRecursiveFunc, float deltaT){
if (whatever()) {
// Do something...
}
else {
// Do something else, then:
float durationMS = getRNGSystem().getUniformFloat(1000.0f, 5000.0f);
// Gives the timer a duration to wait, and a function to run at the end of it.
getTimerSystem().setNewTimer(durationMS, semiRecursiveFunc);
}
});
and it works.
Y<R> takes a callable that is passed what to recurse on as its first argument. When you recurse, just pass the rest of the arguments.
You can write a fancier Y combinator. This one copies the lambda's state a lot and isn't picky about moving it, to keep its implementation simple. It also requires you to provide its return type (that is harder to avoid, due to C++ type deduction rules).
Here's a way that is in the style of Objective-C reference counting. The advantage is that you can use a lambda signature that is the same as the original function you want (no extra arguments). The disadvantages are that it looks ugly and verbose, and you have to always use the lambda through a shared_ptr; you can't take it out and pass it separately.
void doSetupThingsInSomeDecoupledCodeOrWhatever() {
std::shared_ptr<std::weak_ptr<std::function<void(float)>>> weakFuncHolder =
std::make_shared<std::weak_ptr<std::function<void(float)>>>();
std::shared_ptr<std::function<void(float)>> semiRecursiveFunc =
std::make_shared<std::function<void(float)>>([=](float deltaT) {
std::shared_ptr<std::function<void(float)>> strongFunc(*weakFuncHolder);
if (whatever()) {
// Do something...
}
else {
// Do something else, then:
float durationMS = getRNGSystem().getUniformFloat(1000.0f, 5000.0f);
// Gives the timer a duration to wait, and a function to run at the end of it.
getTimerSystem().setNewTimer(durationMS,
[=](float deltaT){ (*strongFunc)(deltaT); });
}
});
*weakFuncHolder = semiRecursiveFunc;
float durationMS = getRNGSystem().getUniformFloat(1000.0f, 5000.0f);
// Gives the timer a duration to wait, and a function to run at the end of it.
getTimerSystem().setNewTimer(durationMS,
[=](float deltaT){ (*semiRecursiveFunc)(deltaT); });
}
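To see the ownership pattern run end to end without the real timer system, here is a reduced sketch. ToyTimerQueue and scheduleRepeating are hypothetical stand-ins, and this setNewTimer takes only a callback (no duration). The stored function holds a weak_ptr to itself, while each queued wrapper holds the shared_ptr that keeps it alive.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <utility>

using Callback = std::function<void(float)>;

// Hypothetical stand-in for the timer system: a FIFO of pending callbacks.
struct ToyTimerQueue {
    std::queue<Callback> pending;

    void setNewTimer(Callback cb) { pending.push(std::move(cb)); }

    // Runs callbacks until the queue is empty; returns how many ran.
    int runAll() {
        int ran = 0;
        while (!pending.empty()) {
            Callback cb = std::move(pending.front());
            pending.pop();
            cb(0.0f);
            ++ran;
        }
        return ran;
    }
};

// Registers a callback that re-schedules itself until it has run `times` times.
void scheduleRepeating(ToyTimerQueue& timers, int times) {
    assert(times > 0);
    auto remaining = std::make_shared<int>(times);
    auto func = std::make_shared<Callback>();
    std::weak_ptr<Callback> weak = func;  // weak: avoids a self-owning cycle
    *func = [&timers, remaining, weak](float) {
        if (--*remaining <= 0) return;    // done: do not re-schedule
        // The wrapper currently executing keeps the function alive,
        // so lock() succeeds here.
        if (auto strong = weak.lock()) {
            timers.setNewTimer([strong](float dt) { (*strong)(dt); });
        }
    };
    timers.setNewTimer([func](float dt) { (*func)(dt); });
}
```

Once the last wrapper has run and been destroyed, no shared_ptr remains and the function object is released.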
I’m trying to return a result (in fact, NULL) invisibly from a C++ function via Rcpp. Unfortunately I am unable to find out how to do this. My first attempt was to set R_Visible but this global variable is no longer exported; next, I tried calling do_invisible (the primitive that invisible calls) directly but, likewise, it’s not exported (and to be honest I’m unsure how to call it correctly anyway).
I then went the roundabout way, calling R’s base::invisible from Rcpp via an Rcpp::Function. My code is now something like this:
Rcpp::Function invisible = Rcpp::Environment("package:base")["invisible"];
// [[Rcpp::export]]
SEXP read_value(Rcpp::XPtr<std::vector<int>> x, int index) {
try {
return Rcpp::wrap(x->at(index));
} catch (std::out_of_range const&) {
return invisible(R_NilValue);
}
}
This compiles and executes. Unfortunately, the invisible call is simply ignored; when calling the function from R with an invalid index, it prints NULL. I would like it to print nothing.
For testing:
// [[Rcpp::export]]
Rcpp::XPtr<std::vector<int>> make_xvec() {
return Rcpp::XPtr<std::vector<int>>{new std::vector<int>{1, 2, 3}};
}
/*** R
xv = make_xvec()
read_value(xv, 1)
invisible(read_value(xv, 4)) # Works
read_value(xv, 4) # Doesn’t work
*/
Hm. "Ultimately" we always get SEXP .Call(id, SEXP a, SEXP b, ...) and that ends up (via Rcpp Attributes) with something like
R> rqdb::qdbConnect
function ()
{
.Call(`_rqdb_qdbConnect`)
}
<environment: namespace:rqdb>
R>
which when we call it gives us
R> qdbConnect()
[1] TRUE
R> invisible(qdbConnect())
R>
Can't you just wrap another layer at the R side and call it a day?
I think the key really is that a void function is possible, but the default is to return something as a SEXP. And C++ only has return, so you need R for the invisible part.
I seek to compare two generic R values in C++ using Rcpp. How can I compare two values without casting them to specific types in C++?
The code that explains my issue is as follows,
require("Rcpp")
require("inline")
src <- "return wrap(x1 == x2);"
fun <- cxxfunction(signature(x1 = "SEXP", x2 = "SEXP"), src, plugin = "Rcpp")
fun("a", "a")
to_cmp <- "a"
fun(to_cmp, to_cmp)
It now gives FALSE and TRUE where I want it to yield TRUE and TRUE.
Since my goal is to implement a data structure in C++, I would prefer to respect potential user-defined == methods.
Possible approach
One approach that I tried is,
require("Rcpp")
src <- '
Language call("\`==\`", x1, x2);
return call.eval();
'
fun <- cxxfunction(signature(x1 = "SEXP", x2 = "SEXP"), src, plugin = "Rcpp")
fun("a", "a")
to_cmp <- "a"
fun(to_cmp, to_cmp)
However, when I run this I get Error: could not find function "`==`"
You are on the right track with using the generic SEXP input object tag. To get this to work one needs to use C++ templates in addition to TYPEOF(). The former enables the correct vector creation in the comparison function to be hooked in with Rcpp sugar, while the latter enables the correct check and dispatch to occur.
#include <Rcpp.h>
using namespace Rcpp;
template <int RTYPE>
Rcpp::LogicalVector compare_me(Rcpp::Vector<RTYPE> x, Rcpp::Vector<RTYPE> y) {
return x == y;
}
// [[Rcpp::export]]
Rcpp::LogicalVector compare_objects(SEXP x, SEXP y) {
if (TYPEOF(x) == TYPEOF(y)) {
switch (TYPEOF(x)) {
case INTSXP:
return compare_me<INTSXP>(x, y);
case REALSXP:
return compare_me<REALSXP>(x, y);
case STRSXP:
return compare_me<STRSXP>(x, y);
default:
Rcpp::stop("Type not supported");
}
} else {
Rcpp::stop("Objects are of different type");
}
// Never used, but necessary to avoid the compiler complaining
// about a missing return statement
return Rcpp::LogicalVector();
}
Example:
to_cmp <- "a"
compare_objects(to_cmp, to_cmp)
Output:
[1] TRUE
Also, the above is for use with Rcpp::sourceCpp(). I would encourage you to switch from using inline to using Rcpp::cppFunction() for function definitions as it allows you to focus on the computation and not the setup.
I am currently attempting to use the gsl_function from GSL library through RcppGSL using a .cpp file and calling it using sourceCpp(). The idea is to perform numerical integration with gsl_integration_qags, also from GSL. My C code invokes a user defined R function (SomeRFunction in my code below) saved into the global environment. The code is:
#include <RcppGSL.h>
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_integration.h>
#include <gsl/gsl_vector.h>
// [[Rcpp::depends(RcppGSL)]]
// [[Rcpp::export]]
double integfn1(double ub, double lb, double abserr, double Rsq, int pA){
double result, abserror;
Rcpp::Environment global = Rcpp::Environment::global_env();
Rcpp::Function f1 = global["SomeRFunction"];
struct my_f_params { double Rsq; int pA; };
struct my_f_params params = { Rsq, pA };
gsl_function F;
F.function = & f1;
F.params = &params;
double lb1 = lb;
double ub1 = ub;
gsl_integration_workspace * w = gsl_integration_workspace_alloc (1000);
gsl_integration_qags (&F, lb1, ub1, 1e-8, 1e-8, 1000, w, &result, &abserror);
printf ("result = % .18f\n", result);
printf ("estimated error = % .18f\n", abserror);
gsl_integration_workspace_free (w);
return result;
}
And the following error comes up:
"cannot convert 'Rcpp::Function* {aka Rcpp::Function_Impl<Rcpp::PreserveStorage>*}' to 'double (*)(double, void*)' "
The problem is in the line where I declare what the function to integrate is (i.e. "F.function = & f1;").
I looked up similar problems but couldn't find anything listed... Any hint will be greatly appreciated!
Many thanks
I created a working example (and timed it against C) in which you can pass an arbitrary user-defined R function to the GSL function QAWF. You should be able to generalize it to other gsl functions as well.
See the example here: https://sites.google.com/site/andrassali/computing/user-supplied-functions-in-rcppgsl
As noted above, the competing C implementation is much-much faster.
Quick comments:
You write: My C code invokes a user defined R function (SomeRFunction in my code below) saved into the global environment. So your C code will still slow down a lot at each function evaluation due to the cost of calling R, and the slower speed of R.
My RcppDE package (which is also for optimisation) has an example of using external pointers (Rcpp::XPtr) to pass a user-defined C function down to the optimiser. Same flexibility, better speed.
Your compile error is exactly at that intersection---you can't "just pass" an Rcpp object to a void pointer. Rcpp::XPtr helps you there, but you too need to know what you are doing, so a working example may be helpful.
This is really an Rcpp question but you didn't add the tag, so I just did.
F.function expects a function of this signature double (*) (double x, void * params) so a function taking a double then a void*. You need to help Rcpp::Function if you want this to fly.
This is typical of C APIs, which strip away types. What you need to supply as the function is something that understands what you have in params, can call the R function with the right parameters, and converts the result to a double. The simple way to do this is to treat the R function as data, so augment the struct to make it something like this:
struct my_f_params { double Rsq; int pA; Function& rfun; };
This way, the function you pass as function can be something like this:
double proxy_fun( double x, void* params ){
my_f_params* data = reinterpret_cast<my_f_params*>(params) ;
return as<double>( data->rfun(x, data->Rsq, data->pA ) ) ;
} ;
So you can build your gsl_function something like this:
my_f_params params = { Rsq, pA, f1 };
gsl_function F ;
F.function = &proxy_fun ;
F.params = &params;
With a bit of C++11 variadic templates and tuples, you could generalize this to anything instead of the (double, int) pair but it is more work.
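One possible way to generalize, sketched here without GSL or Rcpp so that it stands alone: let a lambda capture whatever extra parameters are needed and route it through a templated trampoline. The c_callback struct is a hypothetical stand-in with the same function/params shape, and integrate_midpoint is a toy consumer, not a GSL routine.

```cpp
#include <cassert>

// Hypothetical stand-in for a C API callback slot: a plain function pointer
// plus an untyped pointer to user data.
struct c_callback {
    double (*function)(double x, void* params);
    void* params;
};

// Generic trampoline: `params` points at any callable taking a double.
template <class Callable>
double trampoline(double x, void* params) {
    return (*static_cast<Callable*>(params))(x);
}

// Builds the callback. The callable must outlive every use of the result,
// because only its address is stored.
template <class Callable>
c_callback make_callback(Callable& callable) {
    return c_callback{&trampoline<Callable>, &callable};
}

// Toy consumer of the C-style callback: midpoint-rule integration.
double integrate_midpoint(const c_callback& f, double lb, double ub, int n) {
    assert(n > 0);
    const double h = (ub - lb) / n;
    double sum = 0.0;
    for (int k = 0; k < n; ++k) {
        sum += f.function(lb + (k + 0.5) * h, f.params);
    }
    return sum * h;
}
```

The captures of the lambda take the place of the hand-written struct fields, so values such as Rsq and pA (or a reference to the R function) travel inside the callable itself.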
What is a good way to return success or one or more error codes from a C++ function?
I have a member function called save(), which saves to each of the member variables; at least ten member variables are saved to. After a call to save(), I want to find out whether the call failed and, if so, on which member variable (some are hard failures, some are soft).
You can either return an object that has multiple error fields or you can use 'out' parameters.
How you do this depends on your design and what exactly you are trying to return back. A common scenario is when you need to report back a status code along with a message of sorts. This is sometimes done where the function returns the status code as the return value and then returns the message status via an 'out' parameter.
If you are simply returning a set of 'codes', it might make more sense to construct a struct type and return that. In that case, I would be prone to pass it in as an out parameter and have the method internally update it instead of allocating a new one each time.
Are you planning on doing this once or many times?
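As a rough sketch of the "object with multiple error fields" option, assuming hypothetical names throughout (Severity, SaveFailure, SaveResult, and a save() whose two flags stand in for the outcome of real writes):

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Severity { Soft, Hard };

// One entry per member that could not be saved.
struct SaveFailure {
    std::string member;
    Severity severity;
};

// Aggregated outcome of a save() call.
struct SaveResult {
    std::vector<SaveFailure> failures;

    bool ok() const { return failures.empty(); }

    bool has_hard_failure() const {
        for (const auto& f : failures) {
            if (f.severity == Severity::Hard) return true;
        }
        return false;
    }
};

// Hypothetical save(): records which members failed instead of stopping
// at the first problem.
SaveResult save(bool name_written, bool cache_written) {
    SaveResult result;
    if (!name_written)  result.failures.push_back({"name", Severity::Hard});
    if (!cache_written) result.failures.push_back({"cache", Severity::Soft});
    return result;
}
```

The caller can then check ok() for the common case and walk failures only when something went wrong.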
I know this doesn't really answer your question, but...
In C++ you should use exceptions instead of returning error codes. Error codes are most commonly used by libraries which don't want to force the library user to use a particular error handling convention, but in C++, we already have stdexcept. Of course, there might be reasons you don't use exceptions, such as if you're writing embedded code or kernel extensions.
I usually use a boost::tuple:
typedef boost::tuple<int,int> return_value;
return_value r = my_function();
int first_value = boost::get<0>( r );
int second_value = boost::get<1>( r );
EDIT
You can also use boost::tie to extract the values from a tuple:
boost::tie( first_value, second_value ) = r;
The simplest way to return two values is with the std::pair<> template:
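A minimal sketch of the std::pair approach might look like the following; save_member and its status values are hypothetical, chosen only to show a status code travelling together with a message.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical save step: returns (status code, message).
// A status of 0 means success in this sketch.
std::pair<int, std::string> save_member(bool ok) {
    if (ok) {
        return std::make_pair(0, std::string("saved"));
    }
    return std::make_pair(1, std::string("member could not be written"));
}
```

The caller reads the two values through .first and .second.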
I would use a bitset if your intention is purely to return error states, e.g.
const bitset<10> a_not_set(1);
const bitset<10> b_not_set(2);
const bitset<10> c_not_set(4);
...
bitset<10> foo(T& a, T& b, T& c, ...)
{
bitset<10> error_code = 0;
...
if ( /* a can't be set */ )
{
error_code |= a_not_set;
}
...
if ( /* b can't be set */ )
{
error_code |= b_not_set;
}
...
// etc etc
return error_code;
}
bitset<10> err = foo(a, b, c, ... );
if ((err & a_not_set).any())
{
// Blah.
}
You need to return them as output parameters:
bool function(int& error1, int& error2, stringx& errorText, int& error3);
You can use an integer with bit manipulation (aka flags).
I would probably try to throw an exception first, but it depends on your coding paradigm. Please check some books or articles about the reasons why C++ exception handling might be better.
If I really needed to stick to the return-error-code style, I would define an enum type for specifying errors with bit operations.
enum error
{
NO_ERROR = 0,
MEMBER_0_NOT_SAVED = 1,
MEMBER_1_NOT_SAVED = 1 << 1,
MEMBER_2_NOT_SAVED = 1 << 2,
// etc..
};
int save()
{
int ret = NO_ERROR;
// fail to save member_0
ret |= MEMBER_0_NOT_SAVED;
// fail to save member_1
ret |= MEMBER_1_NOT_SAVED;
// ....
return ret;
}
int main(void)
{
int ret = save();
if( ret == NO_ERROR)
{
// good.
}
else
{
if(ret & MEMBER_0_NOT_SAVED)
{
// do something
}
if(ret & MEMBER_1_NOT_SAVED)
{
// do something
}
// check the other errors...
}
}
This is just a rough example. It's better to put this into a class or use a namespace.
I am not familiar with the internals and constrains of your project, but if possible, try to use exceptions instead of error codes.
The reasons are listed here, at C++ FAQ lite, and they conclude with:
So compared to error reporting via return-codes and if, using try / catch / throw is likely to result in code that has fewer bugs, is less expensive to develop, and has faster time-to-market.