I'm porting an R function to C++ for use in RcppArmadillo, and I cannot find an elegant (efficient) way to repeat a column vector N times, element by element. Here's a minimal example, where I had to first create a matrix with the column repeated 3 times, then reshape it to a row vector, then transpose.
library(RcppArmadillo)
sourceCpp(code = '
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::colvec foo(const arma::colvec& u, const arma::colvec& v)
{
arma::colvec u_rep(12), result(12);
u_rep = trans(vectorise(repmat(u, 1, 3), 1)); // this seems inefficient
result = u_rep % v;
return(result);
}'
)
foo(1:4, 1:12)
The R equivalent would be,
fooR = function(u, v){
u_rep = rep(u, each=3)
u_rep * v
}
There is no known C++ operator or function that does this, so you may well have to do it by hand.
Worst case you just loop and copy (possibly in chunks). Armadillo does have indexing, so maybe that will help. R does a lot of checking when recycling so you probably have to account for that too.
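As a rough illustration of the loop-and-copy idea, here is a minimal sketch in plain standard C++ (no Rcpp or Armadillo types, so it is not a drop-in replacement for foo). The helper names rep_each and rep_each_times are hypothetical; the second one skips building the repeated vector and indexes u directly, assuming v has exactly u.size() * each elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: repeat every element of u `each` times,
// mirroring R's rep(u, each = each).
std::vector<double> rep_each(const std::vector<double>& u, std::size_t each) {
    std::vector<double> out;
    out.reserve(u.size() * each);
    for (double x : u) {
        out.insert(out.end(), each, x);  // append `each` copies of x
    }
    return out;
}

// Hypothetical helper: compute rep(u, each = each) * v element-wise
// without materialising the repeated vector.
// Assumption: v.size() == u.size() * each (no R-style recycling).
std::vector<double> rep_each_times(const std::vector<double>& u,
                                   const std::vector<double>& v,
                                   std::size_t each) {
    assert(each > 0 && v.size() == u.size() * each);
    std::vector<double> out(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = u[i / each] * v[i];  // element i of the repeated vector is u[i / each]
    }
    return out;
}
```

The same index arithmetic (i / each) should carry over to a loop over an arma::colvec, though that variant is not shown here.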
By the way, your example mixes Attributes and inline. I'd just put the code
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
arma::colvec foo(const arma::colvec& u, const arma::colvec& v) {
arma::colvec u_rep(12), result(12);
u_rep = trans(vectorise(repmat(u, 1, 3), 1)); // this seems inefficient
result = u_rep % v;
return(result);
}
in a file bafoo.cpp and source it as follows:
R> sourceCpp("/tmp/bafoo.cpp")
R> foo(1:4, 1:12)
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 8
[5,] 10
[6,] 12
[7,] 21
[8,] 24
[9,] 27
[10,] 40
[11,] 44
[12,] 48
R>
I can select all the rows of a matrix and a range of columns of a matrix as follows:
library(Rcpp)
cppFunction('
NumericMatrix subset(NumericMatrix x){
return x(_, Range(0, 1));
}
')
However, I would like to select columns based on a NumericVector y which, for instance, could be something like c(0, 1, 0, 0, 1). I tried this:
library(Rcpp)
cppFunction('
NumericMatrix subset(NumericMatrix x, NumericVector y){
return x(_, y);
}
')
but it doesn't compile. How do I do it?
Alas, Rcpp doesn't have great support for non-contiguous views or selecting in a single statement only columns 1 and 4. As you saw, selecting contiguous views or selecting all columns is accessible with Rcpp::Range(). You'll likely want to upgrade to RcppArmadillo for better control over matrix subsets.
RcppArmadillo subset examples
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat matrix_subset_idx(const arma::mat& x,
const arma::uvec& y) {
// y must contain integer indices between 0 and (number of columns - 1).
// Allows for repeated draws from same columns.
return x.cols( y );
}
// [[Rcpp::export]]
arma::mat matrix_subset_logical(const arma::mat& x,
const arma::vec& y) {
// Assumes that y is 0/1 coded.
// find() returns the indices at which y equals 1.
return x.cols( arma::find(y == 1) );
}
Test
# Sample data
x = matrix(1:15, ncol = 5)
x
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 4 7 10 13
# [2,] 2 5 8 11 14
# [3,] 3 6 9 12 15
# Subset only when 1 (TRUE) is found:
matrix_subset_logical(x, c(0, 1, 0, 0, 1))
# [,1] [,2]
# [1,] 4 13
# [2,] 5 14
# [3,] 6 15
# Subset with an index representing the location
# Note: C++ indices start at 0 not 1!
matrix_subset_idx(x, c(1, 4))
# [,1] [,2]
# [1,] 4 13
# [2,] 5 14
# [3,] 6 15
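To make the two steps above concrete (turn a 0/1 mask into zero-based indices, then copy the chosen columns), here is a minimal sketch in plain standard C++. It assumes the matrix is stored column-major in a flat vector, as R and Armadillo do; the names which_one and select_cols are hypothetical and not part of either library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: zero-based positions where a 0/1 mask equals 1,
// i.e. the role arma::find(y == 1) plays in the example above.
std::vector<std::size_t> which_one(const std::vector<double>& mask) {
    std::vector<std::size_t> idx;
    for (std::size_t j = 0; j < mask.size(); ++j) {
        if (mask[j] == 1.0) idx.push_back(j);
    }
    return idx;
}

// Hypothetical helper: copy the selected columns of a column-major matrix.
// Assumption: x holds n_rows * n_cols values, one column after another.
std::vector<double> select_cols(const std::vector<double>& x, std::size_t n_rows,
                                const std::vector<std::size_t>& cols) {
    assert(n_rows > 0 && x.size() % n_rows == 0);
    const std::size_t n_cols = x.size() / n_rows;
    std::vector<double> out;
    out.reserve(n_rows * cols.size());
    for (std::size_t c : cols) {
        assert(c < n_cols);  // zero-based column index must be in range
        auto first = x.begin() + static_cast<std::ptrdiff_t>(c * n_rows);
        out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(n_rows));
    }
    return out;
}
```

Because a column is contiguous in column-major storage, each selected column is a single block copy.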
Pure Rcpp logic
If you do not want to take on the dependency of armadillo, then the equivalent for the matrix subset in Rcpp is:
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericMatrix matrix_subset_idx_rcpp(
Rcpp::NumericMatrix x, Rcpp::IntegerVector y) {
// Determine the number of columns to select
int n_cols_out = y.size();
// Create an output matrix
Rcpp::NumericMatrix out = Rcpp::no_init(x.nrow(), n_cols_out);
// Loop through each column and copy the data.
for(int z = 0; z < n_cols_out; ++z) {
out(Rcpp::_, z) = x(Rcpp::_, y[z]);
}
return out;
}
I have run into something I cannot wrap my head around. It's part of a larger coding effort, but a minimal example is here:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
Rcpp::List foo(arma::vec & tau2, const arma::vec & nu) {
arma::vec bet = Rcpp::rnorm(3);
tau2 = R::rgamma(1, arma::as_scalar(sum(pow(bet, 2)/nu)));
return Rcpp::List::create(Rcpp::Named("nu") = nu,
Rcpp::Named("tau2") = tau2);
}
(tau2, although a scalar, is a vector here because I want to pass by reference: function pass by reference in RcppArmadillo)
What is puzzling me is that if I now run the following R code:
n <- 3
m <- matrix(0, n, 1)
for (r in 1:1000) {
tau2 <- 1.0
nu <- matrix(1, n, 1)
upd <- foo(tau2, nu)
}
I get:
error: element-wise division: incompatible matrix dimensions: 3x1 and 18x1
Error in foo(tau2, nu) :
element-wise division: incompatible matrix dimensions: 3x1 and 18x1
where the 18x1 varies; mostly it's 0x1 but it's always a multiple of 3.
Looking at the output:
> nu
[,1] [,2] [,3] [,4]
[1,] 4.165242 4.165242 4.165242 4.165242
[2,] 4.165242 4.165242 4.165242 4.165242
[3,] 4.165242 4.165242 4.165242 4.165242
> upd
$nu
[,1]
[1,] 1
[2,] 1
[3,] 1
$tau2
[,1]
[1,] 4.165242
That is, despite declaring nu as a constant reference (which I do because I do not want it changed), it is altered. The value it is filled with is upd$tau2 (but why?).
Strangely, I can make the behavior go away by seemingly meaningless changes by:
putting tau2 <- 1.0 or nu <- matrix(1, n, 1) (or both) outside of the loop
removing the reference in the first argument (i.e. using arma::vec tau2)
not dividing pow(bet, 2) by nu
changing to nu <- rep(1, n)
Perhaps the most confusing part is that if I select the code chunk inside of the loop and repeatedly run it, it works(!). However, if I run the R code using the loop it crashes on the second iteration.
Because I seem to be able to fix the problem, I'm mostly interested in learning what is going on here. I suspect it's just a consequence of my lack of expertise in C++ and recklessness with various variable types, so knowing what is causing all of this would be very valuable.
Two fixes:
Make tau2 a double (mimicking @dirkeddelbuettel here)
Use a temporary variable for the NumericVector of length n that is generated before it is saved into bet
Code:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
Rcpp::List foo(double tau2, const arma::vec & nu) {
int n = nu.n_elem;
Rcpp::NumericVector x = Rcpp::rnorm(n);
arma::vec bet = arma::vec(x.begin(), n, true, false);
tau2 = R::rgamma(1, arma::as_scalar(sum(pow(bet, 2) / nu)));
return Rcpp::List::create(Rcpp::Named("nu") = nu,
Rcpp::Named("tau2") = tau2);
}
Test case:
n <- 3
m <- matrix(0, n, 1)
for (r in 1:1000) {
tau2 <- 1.0
nu <- matrix(1, n, 1)
upd <- foo(tau2, nu)
}
upd
#> $nu
#> [,1]
#> [1,] 1
#> [2,] 1
#> [3,] 1
#>
#> $tau2
#> [1] 3.292889
If I change the interface to using a double it all works:
Code
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
Rcpp::List foo(double & tau2, const arma::vec & nu) {
arma::vec bet = Rcpp::rnorm(3);
tau2 = R::rgamma(1, arma::as_scalar(sum(pow(bet, 2)/nu)));
return Rcpp::List::create(Rcpp::Named("nu") = nu,
Rcpp::Named("tau2") = tau2);
}
/*** R
n <- 3
m <- matrix(0, n, 1)
for (r in 1:1000) {
tau2 <- 1.0
nu <- matrix(1, n, 1)
upd <- foo(tau2, nu)
}
*/
Demo
R> sourceCpp("/tmp/hejseb.cpp")
R> n <- 3
R> m <- matrix(0, n, 1)
R> for (r in 1:1000) {
+ tau2 <- 1.0
+ nu <- matrix(1, n, 1)
+ upd <- foo(tau2, nu)
+ }
R> upd
$nu
[,1]
[1,] 1
[2,] 1
[3,] 1
$tau2
[1] 1.77314
R>
I am not sure if those are the numbers you expected. I don't really have time to work through what you are trying to do.
I am new to RcppArmadillo. I am wondering how I can reorder the rows of a matrix according to the indices in a given vector. I know how to do it in R, but in RcppArmadillo it is not working. For example, in R,
aa = c(2,4,1,3)
# [1] 2 4 1 3
bb = cbind(c(1,5,4,2),c(3,1,0,8))
# [,1] [,2]
# [1,] 1 3
# [2,] 5 1
# [3,] 4 0
# [4,] 2 8
Trying to subset with R gives:
cc = bb[aa,]
# [,1] [,2]
# [1,] 5 1
# [2,] 2 8
# [3,] 1 3
# [4,] 4 0
I've tried the following using RcppArmadillo:
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
List example(arma::vec aa,arma::mat bb){
int p = bb.n_rows;
int n = aa.size();
arma::uvec index_aa=sort_index(aa);;
List cc(n);
for(int it=0; it<p; it++){
cc(it) = bb.each_col();
}
return List::create(cc);
}
and,
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
List example(arma::vec aa,arma::mat bb){
arma::uvec index_aa=sort_index(aa);
return List::create(bb.elem(index_aa));
}
Not sure why you are sorting the index here as that causes a new order to be introduced compared to bb[aa,].
Anyway, the idea here is to subset using the .rows() index, which requires a uvec or unsigned integer vector. As aa contains R indexes, we can translate them from R to C++ by subtracting 1 to take it from a 1-based index system to a 0-based index system.
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat example_subset(arma::uvec aa, arma::mat bb){
// Convert to a C++ index from R (1 to 0-based indices)
aa = aa - 1;
return bb.rows(aa);
}
Test code:
aa = c(2, 4, 1, 3)
bb = cbind(c(1, 5, 4, 2), c(3, 1, 0, 8))
cpp_cc = example_subset(aa, bb)
r_cc = cbind(c(5,2,1,4),c(1,8,3,0))
all.equal(cpp_cc, r_cc)
# [1] TRUE
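For readers who want to see the index translation spelled out, here is a minimal sketch in plain standard C++ of what the row subset does on column-major storage. The function name rows_by_r_index is hypothetical, and the explicit range check is an assumption added for illustration rather than something taken from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: pick rows of a column-major matrix using 1-based
// (R-style) row indices, like bb[aa, ] in R.
// Assumption: m holds n_rows * n_cols values, one column after another.
std::vector<double> rows_by_r_index(const std::vector<double>& m, std::size_t n_rows,
                                    const std::vector<std::size_t>& r_idx) {
    assert(n_rows > 0 && m.size() % n_rows == 0);
    const std::size_t n_cols = m.size() / n_rows;
    const std::size_t n_out = r_idx.size();
    std::vector<double> out(n_out * n_cols);
    for (std::size_t j = 0; j < n_cols; ++j) {
        for (std::size_t i = 0; i < n_out; ++i) {
            assert(r_idx[i] >= 1 && r_idx[i] <= n_rows);
            const std::size_t src_row = r_idx[i] - 1;  // 1-based to 0-based
            out[i + j * n_out] = m[src_row + j * n_rows];
        }
    }
    return out;
}
```

Indices may repeat or appear in any order, so the output can have more or fewer rows than the input.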
I am getting the following error when trying to compile using sourceCpp from the Rcpp package:
`my path to R/.../Rcpp/internal/Exporter.h`
no matching function for call to 'arma::Cube<double>::Cube(SEXPREC*&)'
The object cube is the armadillo equivalent of an array in R.
EDIT: Note that the problem seems to be that the function can't accept an arma::cube object as an argument. If we change arma::cube B to arma::mat B, it does work:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace arma;
// [[Rcpp::export]]
arma::cube ssmooth(arma::mat A,
arma::cube B) {
int ns = A.n_rows;
int nk = A.n_cols;
int np = B.n_rows;
arma::mat C = zeros<mat>(nk, ns);
arma::cube D = zeros<cube>(nk, nk, ns);
return D;
}
I would appreciate any hint.
A basic example works:
R> cppFunction("arma::cube getCube(int n) { arma::cube a(n,n,n);\
a.zeros(); return a; }", depends="RcppArmadillo")
R> getCube(2)
, , 1
[,1] [,2]
[1,] 0 0
[2,] 0 0
, , 2
[,1] [,2]
[1,] 0 0
[2,] 0 0
R>
so either you are doing something wrong or your installation is off.
I had the same issue. The problem seems to be related to the combination "Rcpp::export" and cube as an argument of the exported function. My guess is that the converter from sexp to cube may not be implemented yet (no pun intended ;-)). (Or we are both missing something...).
Workaround when you want to have an arma::cube argument in a Rcpp::export function: get it first as a NumericVector and simply create the cube afterward...
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace arma;
// [[Rcpp::export]]
arma::cube ssmooth(arma::mat A,
Rcpp::NumericVector B_) {
Rcpp::IntegerVector dimB = B_.attr("dim");
arma::cube B(B_.begin(), dimB[0], dimB[1], dimB[2]);
//(rest of your code unchanged...)
int ns = A.n_rows;
int nk = A.n_cols;
int np = B.n_rows;
arma::mat C = zeros<mat>(nk, ns);
arma::cube D = zeros<cube>(nk, nk, ns);
return D;
}
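The workaround relies on an R array being one flat, column-major block of doubles plus a dim attribute, which is also how a cube is laid out. A minimal sketch in plain standard C++ of that layout follows; the struct name CubeView is hypothetical and is not an Armadillo type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view over a flat, column-major 3-D array,
// as produced in R by dim(thing) <- c(n_rows, n_cols, n_slices).
struct CubeView {
    const double* data;
    std::size_t n_rows, n_cols, n_slices;

    // Zero-based access: rows vary fastest, then columns, then slices.
    double at(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < n_rows && j < n_cols && k < n_slices);
        return data[i + n_rows * (j + n_cols * k)];
    }
};
```

With thing <- 1:8 and dim(thing) <- c(2, 2, 2), the R element thing[1, 2, 2] corresponds to at(0, 1, 1) in this zero-based view.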
I think your code fails because implicitly it tries to do casting like this:
#include<RcppArmadillo.h>
using namespace Rcpp;
using namespace arma;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
cube return_thing(SEXP thing1){
cube thing2 = as<cube>(thing1);
return thing2;
}
/***R
thing <- 1:8
dim(thing) <- c(2, 2, 2)
return_thing(thing)
*/
which doesn't work, whereas it works for matrices:
#include<RcppArmadillo.h>
using namespace Rcpp;
using namespace arma;
// [[Rcpp::depends(RcppArmadillo)]]
//[[Rcpp::export]]
mat return_thing(SEXP thing1){
mat thing2 = as<mat>(thing1);
return thing2;
}
/***R
thing <- 1:4
dim(thing) <- c(2, 2)
return_thing(thing)
*/
I am able to read and return an arma cube with the following function :
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
arma::cube return_cube(arma::cube X)
{
return(X);
}
For example, I obtain the following result when I run the following in R :
my_cube <- array(data = rnorm(5 * 3 * 2), dim = c(5,3, 2))
return_cube(my_cube)
, , 1
[,1] [,2] [,3]
[1,] 0.4815994 1.0863765 0.3278728
[2,] 1.4138699 -0.7809922 0.8341867
[3,] 0.6555752 -0.2708001 0.7701501
[4,] 1.1447104 -1.4064894 -0.2653888
[5,] 1.5972670 1.8368235 -2.2814959
, , 2
[,1] [,2] [,3]
[1,] -0.485091067 1.1826162 -0.3524851
[2,] 0.227652584 0.3005968 -0.6079604
[3,] -0.147653664 1.3463318 -1.2238623
[4,] 0.067090580 -0.8982740 -0.8903684
[5,] 0.006421618 -1.7156955 -1.2813880
I believe Boost is limited to contiguous, or at least step-wise consistent, slicing of matrices. In R, I could take an arbitrary index vector such as c(5,2,8) and use it to index into a matrix, for example M[c(5,2,8),]...
Armadillo supports this as of version 3.0 which was released not even two weeks ago.
Here is a worked example via RcppArmadillo:
R> library(inline)
R>
R> code <- '
+ arma::mat M = Rcpp::as<arma::mat>(m); // normal matrix
+ arma::uvec V = Rcpp::as<arma::uvec>(v); // unsigned int vec
+ arma::mat N = M.cols(V); // index matrix by vec
+ return Rcpp::wrap(N);
+ '
R>
R> fun <- cxxfunction(signature(m="numeric", v="integer"),
+ code,
+ plugin="RcppArmadillo")
R> M <- matrix(1:25,5,5)
R> V <- c(1L, 3L, 5L) - 1 # offset by one for zero indexing
R> fun(M, V)
[,1] [,2] [,3]
[1,] 1 11 21
[2,] 2 12 22
[3,] 3 13 23
[4,] 4 14 24
[5,] 5 15 25
R>
There is a matching function to pick rows rather than columns.