Is there any reason why I should prefer Rcpp::NumericVector over std::vector<double>?
For example, the two functions below
// [[Rcpp::export]]
Rcpp::NumericVector foo(const Rcpp::NumericVector& x) {
    Rcpp::NumericVector tmp(x.length());
    for (int i = 0; i < x.length(); i++)
        tmp[i] = x[i] + 1.0;
    return tmp;
}

// [[Rcpp::export]]
std::vector<double> bar(const std::vector<double>& x) {
    std::vector<double> tmp(x.size());
    for (int i = 0; i < x.size(); i++)
        tmp[i] = x[i] + 1.0;
    return tmp;
}
are equivalent in how they work and in benchmarked performance. I understand that Rcpp offers sugar and vectorized operations, but if it is only about taking an R vector as input and returning a vector as output, would there be any difference in which one I use? Can using std::vector<double> lead to any possible problems when interacting with R?
are equivalent in how they work and in benchmarked performance.
I doubt that the benchmarks are accurate, because going from a SEXP to std::vector<double> requires a deep copy from one data structure to another. (And as I was typing this, @DirkEddelbuettel ran a microbenchmark.)
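Under the hood, the glue code that Rcpp attributes generate for bar() has to do roughly the following. This is a sketch, not the verbatim generated code; it uses the real Rcpp::as and Rcpp::wrap helpers, and the function name is made up:
#include <Rcpp.h>

// [[Rcpp::export]]
SEXP bar_glue_sketch(SEXP xs) {
    // copy in: Rcpp::as copies the SEXP's payload into a std::vector<double>
    std::vector<double> x = Rcpp::as<std::vector<double>>(xs);
    for (double& v : x) v += 1.0;  // the body of bar()
    // copy out: Rcpp::wrap allocates a fresh R vector for the result
    return Rcpp::wrap(x);
}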
The const markup on the Rcpp object (e.g. const Rcpp::NumericVector& x) is only visual sugar. The object handed in is effectively a pointer into R's memory, so a modification can easily ripple through to other objects (see below). There is thus no true equivalent of const std::vector<double>& x that both "locks" the data and passes a reference.
Can using std::vector<double> lead to any possible problems when interacting with R?
In short, no. The only penalty paid is the conversion between the two container types.
What that conversion buys you is value semantics: each std::vector<T> is an independent copy, so modifying a value in one vector will not trigger a domino update of another. With Rcpp vectors assigned to one another, by contrast, exactly that happens:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void test_copy() {
    NumericVector A = NumericVector::create(1, 2, 3);
    NumericVector B = A;  // B points at the same R memory as A
    Rcout << "Before: " << std::endl << "A: " << A << std::endl << "B: " << B << std::endl;
    A[1] = 5; // 2 -> 5
    Rcout << "After: " << std::endl << "A: " << A << std::endl << "B: " << B << std::endl;
}
Gives:
test_copy()
# Before:
# A: 1 2 3
# B: 1 2 3
# After:
# A: 1 5 3
# B: 1 5 3
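If you do want value semantics while staying with Rcpp types, Rcpp provides clone() for an explicit deep copy. A minimal sketch, with a made-up function name:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void test_clone() {
    NumericVector A = NumericVector::create(1, 2, 3);
    NumericVector B = clone(A);  // deep copy: B gets its own memory
    A[1] = 5;                    // modifies A only; B stays 1 2 3
    Rcout << "A: " << A << std::endl << "B: " << B << std::endl;
}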
Is there any reason why I should prefer Rcpp::NumericVector over std::vector<double>?
There are a few reasons:
As hinted previously, using Rcpp::NumericVector avoids a deep copy to and from a C++ std::vector<T>.
You gain access to the sugar functions.
You can 'mark up' the Rcpp object in C++, e.g. by adding attributes via .attr() (see the sketch below).
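To illustrate the last two points, here is a minimal sketch (the function name foo_sugar is made up): sugar replaces the explicit loop with a vectorized expression, and .attr() attaches an R attribute directly from C++.
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo_sugar(const Rcpp::NumericVector& x) {
    Rcpp::NumericVector tmp = x + 1.0;        // sugar: vectorized addition, no loop
    tmp.attr("comment") = "computed in C++";  // attach an R attribute from C++
    return tmp;
}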
"If unsure, just time it."
All it takes is adding these few lines to the file you already had:
/*** R
library(microbenchmark)
x <- 1.0* 1:1e7 # make sure it is numeric
microbenchmark(foo(x), bar(x), times=100L)
*/
Then just calling sourceCpp("...yourfile...") generates the following result (plus warnings on signed/unsigned comparisons):
R> library(microbenchmark)
R> x <- 1.0* 1:1e7 # make sure it is numeric
R> microbenchmark(foo(x), bar(x), times=100L)
Unit: milliseconds
   expr     min      lq    mean  median      uq      max neval cld
 foo(x) 31.6496 31.7396 32.3967 31.7806 31.9186  54.3499   100  a
 bar(x) 50.9229 51.0602 53.5471 51.1811 51.5200 147.4450   100   b
R>
Your bar() solution needs to make a copy to create an R object in the R memory pool; foo() does not. That matters for large vectors that you run over many times. Here we see a ratio of about 1.6.
In practice, it may not matter, and you may simply prefer one coding style over the other.
Related
I have written a function to raise 2 to a given power, and I want to use 64-bit integers. In R, the bit64 package has the following maximum and minimum limits:
From R:
> bit64::lim.integer64()
integer64
[1] -9223372036854775807 9223372036854775807
These are -(2^63 - 1) and 2^63 - 1.
However, for some reason, my Rcpp code can only pass 2^62 back to R. Here is the code for my function that raises 2 to a given power (NOTE: I use bit-shifting to achieve this):
C++ code:
#include <Rcpp.h>
#include <cstring>   // for std::memcpy

// [[Rcpp::export]]
Rcpp::NumericVector i2_to_the_power_j(int64_t j)
{
    int64_t base = 1;
    int64_t value = base << j;
    // cout << "C++ value: " << value << "\n";
    // Create a vector of length 1 with `value` as the sole contents
    const std::vector<int64_t> v(1, value);
    const size_t len = v.size();
    Rcpp::NumericVector nn(len); // storage vehicle we return them in
    // transfers values 'keeping bits' but changing type;
    // using reinterpret_cast would get us a warning
    std::memcpy(&(nn[0]), &(v[0]), len * sizeof(double));
    nn.attr("class") = "integer64";
    return nn;
}
However, when I run this in R, I cannot obtain the largest possible/limiting value!
From R:
> library(Rcpp)
> library(bit64)
> sourceCpp("./hilbert_curve_copy.cpp")
> # I can get 2^62
> i2_to_the_power_j(62)
integer64
[1] 4611686018427387904
> # ...but I cannot get 2^63
> i2_to_the_power_j(63)
integer64
[1] <NA>
> # I cannot get 2^63, despite bit64 package claiming it can
> # handle integers of this size
> bit64::lim.integer64()
integer64
[1] -9223372036854775807 9223372036854775807
Have I missed something here? Please advise, and thank you for your time.
Quick guess of mine (that was proven right): the max value itself may be the one flagged for NA. So compute the 'one minus' that value and try it.
// [[Rcpp::export]]
Rcpp::NumericVector largeval() {
    int64_t val = 9223372036854775807LL - 1;   // one below the int64 maximum
    Rcpp::Rcout << "C++ value: " << val << "\n";
    Rcpp::NumericVector dbl(1);
    std::memcpy(&(dbl[0]), &val, sizeof(double));  // transfer the bit pattern
    dbl.attr("class") = "integer64";
    return dbl;
}
I added that to your code and running it yields:
R> largeval()
C++ value: 9223372036854775806
integer64
[1] 9223372036854775806
R>
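The reason behind the guess, to the best of my knowledge (so treat this as an assumption rather than documented fact): bit64 reserves the smallest int64 bit pattern, INT64_MIN, as its NA sentinel, and shifting 1 into the sign bit wraps around to exactly that pattern (which is formally undefined behaviour for signed types anyway). A sketch, with a made-up function name:
#include <Rcpp.h>
#include <cstring>
#include <cstdint>
#include <limits>

// [[Rcpp::export]]
Rcpp::NumericVector show_na_sentinel() {
    // INT64_MIN is the bit pattern 1LL << 63 wraps to and, apparently,
    // the value bit64 treats as NA
    int64_t val = std::numeric_limits<int64_t>::min();
    Rcpp::NumericVector dbl(1);
    std::memcpy(&(dbl[0]), &val, sizeof(double));
    dbl.attr("class") = "integer64";
    return dbl;  // prints as <NA> in R once bit64 is loaded
}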
Full code below just in case.
Code
#include <Rcpp.h>
#include <cstring>   // for std::memcpy

// [[Rcpp::export]]
Rcpp::NumericVector i2_to_the_power_j(int64_t j)
{
    int64_t base = 1;
    int64_t value = base << j;
    // cout << "C++ value: " << value << "\n";
    // Create a vector of length 1 with `value` as the sole contents
    const std::vector<int64_t> v(1, value);
    const size_t len = v.size();
    Rcpp::NumericVector nn(len); // storage vehicle we return them in
    // transfers values 'keeping bits' but changing type;
    // using reinterpret_cast would get us a warning
    std::memcpy(&(nn[0]), &(v[0]), len * sizeof(double));
    nn.attr("class") = "integer64";
    return nn;
}

// [[Rcpp::export]]
Rcpp::NumericVector largeval() {
    int64_t val = 9223372036854775807LL - 1;   // one below the int64 maximum
    Rcpp::Rcout << "C++ value: " << val << "\n";
    Rcpp::NumericVector dbl(1);
    std::memcpy(&(dbl[0]), &val, sizeof(double));
    dbl.attr("class") = "integer64";
    return dbl;
}
/*** R
library(bit64)
# I can get 2^62
i2_to_the_power_j(62)
# ...but I cannot get 2^63
i2_to_the_power_j(63)
# I cannot get 2^63, despite bit64 package claiming it can
# handle integers of this size
bit64::lim.integer64()
largeval()
*/
I am playing around with Eigen, doing some calculations with matrices and logs/exps, but I find the expressions I end up with a bit clumsy (and possibly slower?). Is there a better way to write calculations like this?
MatrixXd m = MatrixXd::Random(3,3);
m = m * (m.array().log()).matrix();
That is, without having to convert to arrays and then back to a matrix?
If you are mixing array and matrix operations you cannot really avoid the conversions, except for the few functions that have a cwise variant working directly on matrices (e.g., cwiseSqrt(), cwiseAbs()).
However, neither .array() nor .matrix() has any runtime cost when compiled with optimization (on any reasonable compiler).
If you consider it more readable, you can also work with unaryExpr().
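For instance, a minimal sketch: cwiseAbs() and cwiseSqrt() work on the matrix directly, while log has no cwise member, so unaryExpr() (or .array().log()) is needed there.
#include <Eigen/Core>
#include <cmath>
#include <iostream>

int main() {
    Eigen::MatrixXd m = Eigen::MatrixXd::Random(3, 3);
    Eigen::MatrixXd a = m.cwiseAbs();   // coefficient-wise |m|, no .array() round-trip
    Eigen::MatrixXd s = a.cwiseSqrt();  // coefficient-wise square root
    // no cwiseLog() exists, so fall back to unaryExpr (or .array().log()):
    Eigen::MatrixXd l = a.unaryExpr([](double v) { return std::log(v); });
    std::cout << l << "\n";
    return 0;
}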
I agree fully with chtz's answer, and reiterate that there is no runtime cost to the "casts." You can confirm this using the following toy program:
#include "Eigen/Core"
#include <iostream>
#include <chrono>
#include <cmath>

using namespace Eigen;

int main()
{
    typedef MatrixXd matType;
    //typedef MatrixXf matType;
    volatile int vN = 1024 * 4;
    int N = vN;

    auto startAlloc = std::chrono::system_clock::now();
    matType m = matType::Random(N, N).array().abs();
    matType r1 = matType::Zero(N, N);
    matType r2 = matType::Zero(N, N);
    auto finishAlloc = std::chrono::system_clock::now();

    r1 = m * (m.array().log()).matrix();
    auto finishLog = std::chrono::system_clock::now();

    // the function-pointer type must match the scalar type (double here)
    r2 = m * m.unaryExpr<double(*)(double)>(&std::log);
    auto finishUnary = std::chrono::system_clock::now();

    std::cout << (r1 - r2).array().abs().maxCoeff() << '\n';
    std::cout << "Allocation\t" << std::chrono::duration<double>(finishAlloc - startAlloc).count() << '\n';
    std::cout << "Log\t\t" << std::chrono::duration<double>(finishLog - finishAlloc).count() << '\n';
    std::cout << "unaryExpr\t" << std::chrono::duration<double>(finishUnary - finishLog).count() << '\n';
    return 0;
}
On my computer, there is a slight advantage (~4%) to the first form, which probably has to do with the way the memory is loaded (unchecked). Beyond that, the reason for "casting" the type is to remove ambiguities. For a clear example, consider operator *: in the matrix form it means matrix multiplication, whereas in the array form it means coefficient-wise multiplication. The ambiguity in the case of exp and log is between the matrix exponential/logarithm and their element-wise counterparts. Presumably you want the element-wise exp and log, and therefore the cast is necessary.
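To make the operator * ambiguity concrete, a small sketch contrasting the two meanings:
#include <Eigen/Core>
#include <iostream>

int main() {
    Eigen::Matrix2d m;
    m << 1, 2,
         3, 4;
    std::cout << m * m << "\n\n";               // matrix product
    std::cout << m.array() * m.array() << "\n"; // coefficient-wise product
    return 0;
}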
In the latest Boost there is a function to compute Bernoulli numbers, but I do not quite understand what exactly it computes.
For example, Mathematica, Python mpmath and www.bernoulli.org say that:
BernoulliB[1] == -1/2
but the Boost version
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <boost/math/special_functions/bernoulli.hpp>

boost::multiprecision::cpp_dec_float_100 x =
    boost::math::bernoulli_b2n<boost::multiprecision::cpp_dec_float_100>(1);
returns 0.166667
Why this difference? Am I missing something?
All odd-indexed Bernoulli numbers are zero, apart from B1, which as you note is -1/2. So boost::math::bernoulli_b2n returns only the even-indexed Bernoulli numbers B(2n).
For example, to get B4 you need to actually pass 2:
std::cout
<< std::setprecision(std::numeric_limits<double>::digits10)
<< boost::math::bernoulli_b2n<double>(2) << std::endl;
and if you pass 1, you get B2.
See docs: http://www.boost.org/doc/libs/1_56_0/libs/math/doc/html/math_toolkit/number_series/bernoulli_numbers.html
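In other words, the argument is the n in B(2n). A small sketch of the mapping:
#include <boost/math/special_functions/bernoulli.hpp>
#include <iostream>

int main() {
    // bernoulli_b2n<double>(n) returns B(2n)
    std::cout << boost::math::bernoulli_b2n<double>(0) << "\n"; // B0 = 1
    std::cout << boost::math::bernoulli_b2n<double>(1) << "\n"; // B2 = 1/6
    std::cout << boost::math::bernoulli_b2n<double>(2) << "\n"; // B4 = -1/30
    return 0;
}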
Of course, you can write a simple wrapper to imitate the preferred syntax [1]:
#include <boost/math/special_functions/bernoulli.hpp>
#include <iostream>
#include <iomanip>
#include <limits>

double bernoulli(int n)
{
    if (n == 1) return -1.0 / 2.0;  // the single odd nonzero value
    if (n % 2) return 0;            // every other odd index is zero
    return boost::math::bernoulli_b2n<double>(n / 2);
}

int main()
{
    std::cout << std::setprecision(std::numeric_limits<double>::digits10);
    for (int i = 0; i < 10; ++i)
    {
        std::cout << "B" << i << "\t" << bernoulli(i) << "\n";
    }
}
or even a class with an overloaded operator[] (for the demanding ;) ):
class Bernoulli
{
public:
    double operator[](int n)
    {
        return bernoulli(n);
    }
};
or even use template magic to do all these checks at compile time (I will leave that as an exercise for the reader ;) ).
[1] Please note that this exact function body is not thoroughly verified and may contain mistakes. But I hope you get the idea of how to write such a wrapper.
I was reading the Box2D source code. In b2Vec2, operator () is overloaded, but I did not understand what it is supposed to do. I read the manual and the reference for this method but still did not get what "read from an indexed element" and "write to an indexed element" mean, or why both overloads have the same body, return (&x)[i]. What does this mean and do?
Thanks to a previous comment (removed for some reason), I got an idea and tested it out: it turns out this allows me to access and write to x and y using indices 0 and 1, respectively.
For example:
#include <iostream>
using namespace std;

class clazz {
public:
    float x, y;
    clazz(float x_, float y_) : x(x_), y(y_) {}
    // read access: x and y are adjacent members, so (&x)[0] is x and (&x)[1] is y
    float operator () (int i) const {
        return (&x)[i];
    }
    // write access: returning a reference makes f(0) = 6 assign to x
    float& operator () (int i) {
        return (&x)[i];
    }
};

int main() {
    clazz f(3, 4);
    cout << "f: x = " << f(0) << " y = " << f(1) << endl; // printed => f: x = 3 y = 4
    f(0) = 6;
    f(1) = 6;
    cout << "f: x = " << f(0) << " y = " << f(1) << endl; // printed => f: x = 6 y = 6
    return 0;
}
As you found out, it is an accessor function for the individual elements of the vector class. There are two overloads because const objects need read access to an element's value without being allowed to modify it. Note that the const overload could also return a const reference, but that is not necessary here since it operates on a plain float.
Hopefully there are asserts in place to make sure the code isn't indexing out of range, since that is quite easy to do, especially when you are using a signed index variable as in your example.
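A sketch of what such a guard could look like (this is not Box2D's actual code):
#include <cassert>

class clazz {
public:
    float x, y;
    clazz(float x_, float y_) : x(x_), y(y_) {}
    float& operator () (int i) {
        assert(i >= 0 && i < 2 && "index out of range");  // guard the raw access
        return (&x)[i];
    }
};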
I would like to know the most efficient C++ implementation of the following Matlab idiom.
Suppose I have three vectors in Matlab: x, y and idx.
x = [13,2,5.5,22,107]
y = [-3,100,200]
idx = [1,2,5]
I want to replace positions 1, 2 and 5 of x with the contents of y. In Matlab I write
x(idx) = y
What is the best way to do this in C++?
The Armadillo library probably comes closest as one of its goals is to make things easy for folks used to Matlab.
Here is a short example (uvec is a typedef for a vector of unsigned ints):
// set four specific elements of X to 1
uvec indices;
indices << 2 << 3 << 6 << 8;
X.elem(indices) = ones<vec>(4);
Obviously, the right-hand side could be any other vector of the same dimension as the index.
But there are a few language-imposed constraints you cannot overcome:
zero-based indexing at the C++ level (which you could alter, but few C/C++ programmers would consider that a good idea)
the semantics of certain operators, including []
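Applied to the question's data, a minimal sketch (note the shift to zero-based indices; recent Armadillo versions also accept initializer lists instead of the << syntax above):
#include <armadillo>

int main() {
    arma::vec  x   = {13, 2, 5.5, 22, 107};
    arma::vec  y   = {-3, 100, 200};
    arma::uvec idx = {0, 1, 4};   // Matlab's [1,2,5], shifted to zero-based
    x.elem(idx) = y;              // the equivalent of x(idx) = y
    x.print("x:");
    return 0;
}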
A loop seems simple enough, although it can be made simpler with a helper function.
#include <cstddef>

// helper function to get the number of elements in a C array
template<typename T, size_t S>
constexpr size_t size(T (&)[S])
{
    return S;
}

// the loop
for (size_t i = 0; i < size(idx); ++i) x[idx[i]] = y[i];
Of course, you should first check that y is large enough; a sketch of such a check follows.
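For example, a sketch reusing the size() helper from above (the data is illustrative):
#include <cassert>
#include <cstddef>

template<typename T, size_t S>
constexpr size_t size(T (&)[S]) { return S; }

int main() {
    double x[]   = { 13, 2, 5.5, 22, 107 };
    double y[]   = { -3, 100, 200 };
    size_t idx[] = { 0, 1, 4 };     // zero-based indices
    assert(size(y) >= size(idx));   // y must provide a value for every index
    for (size_t i = 0; i < size(idx); ++i) x[idx[i]] = y[i];
    return 0;
}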
If you want it in pretty basic C++ (with no prizes for elegance), try this:
#include <algorithm>
#include <iostream>
using namespace std;

int main()
{
    double x[] = { 13.0, 2.0, 5.5, 22.0, 107.0 };
    double y[] = { -3.0, 100.0, 200.0 };
    size_t idx[] = { 1, 2, 5 };

    for ( size_t i = 0; i < sizeof(x)/sizeof(x[0]); ++i )
        cout << x[i] << " ";
    cout << endl;

    // Make a mutable copy of x
    double x1[ sizeof(x)/sizeof(x[0]) ];
    std::copy( x, x + sizeof(x)/sizeof(x[0]), x1 );

    // The transformation (idx is 1-based here, hence the - 1)
    for ( size_t i = 0; i < sizeof(idx)/sizeof(idx[0]); ++i )
        x1[idx[i] - 1] = y[i];

    for ( size_t i = 0; i < sizeof(x1)/sizeof(x1[0]); ++i )
        cout << x1[i] << " ";
    cout << endl;
}
Not exactly pretty, but it does give the following:
13 2 5.5 22 107
-3 100 5.5 22 200
(Note that I've assumed that the indexing starts from 1 rather than 0 in idx)