I am a beginner in RStudio, so hopefully someone can help me with this problem. I want to write an if/else inside a loop. I wrote the following code for an l-by-m matrix:
for (i in 1:l){
for (j in 1:m){
if (is.na(quantilereturns[i,j]) < quantile(quantilereturns[,j], c(.1), na.rm=TRUE)) {
quantilereturns[i,j]
} else { (0) }
}
}
Summary: I want a matrix that keeps only the values that are smaller than the 10% quantile of their column in the matrix quantilereturns. Values below the 10% quantile keep their original value; all others become zero.
The code doesn't give any errors, but it doesn't change the values in the matrix either.
Can someone help me?
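You need to assign the result to a cell of the matrix. Note also that is.na() returns TRUE or FALSE, so comparing its result with a quantile does not compare the value itself; test the value directly and use is.na() only to skip missing entries. I will take the matrix from another recent thread as an example:
a <- c(4, -9, 2)
b <- c(-1, 3, -8)
c <- c(5, 2, 6)
d <- c(7, 9, -2)
matrix <- cbind(a,b,c,d)
rows <- nrow(matrix)
columns <- ncol(matrix)
# 10% quantile of every column, computed once before any cell is changed
thresholds <- apply(matrix, 2, quantile, probs = 0.1, na.rm = TRUE)
print("Before")
print(matrix)
for (i in 1:rows) {
for (j in 1:columns) {
# keep values below the column threshold, set all others to zero
if (!is.na(matrix[i,j]) && matrix[i,j] >= thresholds[j]) {
matrix[i,j] <- 0
}
}
}
print("After")
print(matrix)
this gives
[1] "Before"
a b c d
[1,] 4 -1 5 7
[2,] -9 3 2 9
[3,] 2 -8 6 -2
[1] "After"
a b c d
[1,] 0 0 0 0
[2,] -9 0 2 0
[3,] 0 -8 0 -2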
So the essential line you are looking for is matrix[i,j] <- 0
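If the double loop ever becomes a bottleneck on a large matrix, the thresholding asked for in the question (keep values below the column's 10% quantile, zero everything else) can also be written in C++. The following is only a rough standalone sketch in plain C++, not Rcpp code. It assumes the matrix is stored column by column as in R, that there are no NA values, and that R's default quantile definition (type 7) is wanted. The names quantile7 and zero_at_or_above_quantile are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Type-7 quantile (linear interpolation between order statistics),
// R's default definition. Takes the values by copy so it can sort them.
double quantile7(std::vector<double> v, double p) {
    assert(!v.empty() && p >= 0.0 && p <= 1.0);
    std::sort(v.begin(), v.end());
    const double h = (v.size() - 1) * p;  // fractional 0-based position
    const std::size_t lo = static_cast<std::size_t>(std::floor(h));
    const std::size_t hi = std::min(lo + 1, v.size() - 1);
    return v[lo] + (h - lo) * (v[hi] - v[lo]);
}

// Keep entries below their column's p-quantile, set all others to zero.
// m holds nrow * ncol values, column after column.
void zero_at_or_above_quantile(std::vector<double>& m, std::size_t nrow,
                               std::size_t ncol, double p) {
    assert(m.size() == nrow * ncol);
    for (std::size_t j = 0; j < ncol; ++j) {
        // Compute the threshold once, before any value in the column changes.
        std::vector<double> col(m.begin() + j * nrow, m.begin() + (j + 1) * nrow);
        const double threshold = quantile7(col, p);
        for (std::size_t i = 0; i < nrow; ++i) {
            double& cell = m[j * nrow + i];
            if (cell >= threshold) cell = 0.0;
        }
    }
}
```

Calling zero_at_or_above_quantile(m, 3, 4, 0.1) on the 3-by-4 example matrix (columns a, b, c, d laid out one after another) leaves only -9, -8, 2 and -2 non-zero. The point of copying the column first is that the threshold must not drift while cells are being overwritten.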
I made a first stab at an Rcpp function via inline and it solved my speed problem (thanks Dirk!); see "Replace negative values by zero".
The initial version looked like this:
library(inline)
cpp_if_src <- '
Rcpp::NumericVector xa(a);
int n_xa = xa.size();
for(int i=0; i < n_xa; i++) {
if(xa[i]<0) xa[i] = 0;
}
return xa;
'
cpp_if <- cxxfunction(signature(a="numeric"), cpp_if_src, plugin="Rcpp")
But when I called cpp_if(p), it overwrote p with the output, which was not intended. So I assumed it was passing by reference.
So I fixed it with the following version:
library(inline)
cpp_if_src <- '
Rcpp::NumericVector xa(a);
int n_xa = xa.size();
Rcpp::NumericVector xr(a);
for(int i=0; i < n_xa; i++) {
if(xr[i]<0) xr[i] = 0;
}
return xr;
'
cpp_if <- cxxfunction(signature(a="numeric"), cpp_if_src, plugin="Rcpp")
Which seemed to work. But now the original version doesn't overwrite its input anymore when I re-load it into R (i.e. the same exact code now doesn't overwrite its input):
> cpp_if_src <- '
+ Rcpp::NumericVector xa(a);
+ int n_xa = xa.size();
+ for(int i=0; i < n_xa; i++) {
+ if(xa[i]<0) xa[i] = 0;
+ }
+ return xa;
+ '
> cpp_if <- cxxfunction(signature(a="numeric"), cpp_if_src, plugin="Rcpp")
>
> p
[1] -5 -4 -3 -2 -1 0 1 2 3 4 5
> cpp_if(p)
[1] 0 0 0 0 0 0 1 2 3 4 5
> p
[1] -5 -4 -3 -2 -1 0 1 2 3 4 5
I'm not the only one who has tried to replicate this behavior and found inconsistent results:
https://chat.stackoverflow.com/transcript/message/4357344#4357344
What's going on here?
The key is the 'proxy model': your xa really is the same memory location as your original object, so you end up changing your original.
If you don't want that, you should do one thing: make a (deep) copy using the clone() method, or explicitly create a new object into which the altered values get written. Your second version does neither; it simply uses two differently named variables which are both "pointers" (in the proxy model sense) to the original variable.
An additional complication, though, is the implicit cast and copy when you pass an int vector (from R) to a NumericVector type: that creates a copy, and then the original no longer gets altered.
Here is a more explicit example, similar to one I use in the tutorials or workshops:
library(inline)
f1 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
Rcpp::NumericVector xa(a);
int n = xa.size();
for(int i=0; i < n; i++) {
if(xa[i]<0) xa[i] = 0;
}
return xa;
')
f2 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
Rcpp::NumericVector xa(a);
int n = xa.size();
Rcpp::NumericVector xr(a); // still points to a
for(int i=0; i < n; i++) {
if(xr[i]<0) xr[i] = 0;
}
return xr;
')
p <- seq(-2,2)
print(class(p))
print(cbind(f1(p), p))
print(cbind(f2(p), p))
p <- as.numeric(seq(-2,2))
print(class(p))
print(cbind(f1(p), p))
print(cbind(f2(p), p))
and this is what I see:
edd@max:~/svn/rcpp/pkg$ r /tmp/ari.r
Loading required package: methods
[1] "integer"
p
[1,] 0 -2
[2,] 0 -1
[3,] 0 0
[4,] 1 1
[5,] 2 2
p
[1,] 0 -2
[2,] 0 -1
[3,] 0 0
[4,] 1 1
[5,] 2 2
[1] "numeric"
p
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 1 1
[5,] 2 2
p
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 1 1
[5,] 2 2
edd@max:~/svn/rcpp/pkg$
So it really matters whether you pass int-to-float or float-to-float.
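To make the three cases easier to picture (shared storage, deep copy, and the copy forced by a type conversion), here is a toy analogy in plain C++. It is not how Rcpp is implemented. ToyVector and clamp_negatives are hypothetical names, and a std::shared_ptr merely stands in for "a handle to memory that R owns".

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a proxy object: it has no private buffer,
// it only holds a handle to storage that other proxies may share.
struct ToyVector {
    std::shared_ptr<std::vector<double>> data;

    // Wrapping existing double storage: no copy, both sides see every write.
    explicit ToyVector(std::shared_ptr<std::vector<double>> d)
        : data(std::move(d)) {}

    // Wrapping int storage: the values must be converted,
    // so new storage is allocated and the ints stay untouched.
    explicit ToyVector(const std::vector<int>& ints)
        : data(std::make_shared<std::vector<double>>(ints.begin(), ints.end())) {}

    // Deep copy: new storage with the same contents.
    ToyVector clone() const {
        return ToyVector(std::make_shared<std::vector<double>>(*data));
    }
};

// Replace negative entries by zero, writing through the handle.
// v is passed by value, yet the copy still shares the same storage.
void clamp_negatives(ToyVector v) {
    for (double& x : *v.data) {
        if (x < 0) x = 0;
    }
}
```

In this analogy, giving the same storage a second name (like xr in the second version) changes nothing, clone() protects the original, and starting from integers protects the original only as a side effect of the conversion. In Rcpp itself the deliberate way to get the protected behaviour is the clone() call mentioned above.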
This is what I'm doing now
library(Rcpp)
A <- diag(c(1.0, 2.0, 3.0))
rownames(A) <- c('X', 'Y', 'Z')
colnames(A) <- c('A', 'B', 'C')
cppFunction('
void scaleMatrix(NumericMatrix& A, double x) {
A = A * x;
}')
Unfortunately, it doesn't work:
> A
A B C
X 1 0 0
Y 0 2 0
Z 0 0 3
> scaleMatrix(A, 2)
> A
A B C
X 1 0 0
Y 0 2 0
Z 0 0 3
I learned from Rcpp FAQ, Question 5.1 that Rcpp should be able to change the object I passed by value. Stealing an example from Dirk's answer to my previous question:
> library(Rcpp)
> cppFunction("void inplaceMod(NumericVector x) { x = x * 2; }")
> x <- as.numeric(1:5)
> inplaceMod(x)
> x
[1] 2 4 6 8 10
I'm confused: is it possible to modify a NumericVector in place, but not a NumericMatrix?
You can preserve the row and column names by using NumericVector instead of NumericMatrix, keeping in mind that a matrix in R is just a vector with attached dimensions. You can do this switch either when going from R to C++ (scaleVector below) or within C++ (scaleMatrix below, taken from a now deleted answer by @Roland):
library(Rcpp)
cppFunction('
NumericVector scaleVector(NumericVector& A, double x) {
A = A * x;
return A;
}')
cppFunction('
NumericMatrix scaleMatrix(NumericMatrix& A, double x) {
NumericVector B = A;
B = B * x;
return A;
}')
If one applies these two functions to your matrix, the row and column names are preserved. However, the matrix is not changed in place:
A <- diag(1:3)
rownames(A) <- c('X', 'Y', 'Z')
colnames(A) <- c('A', 'B', 'C')
scaleMatrix(A, 2)
#> A B C
#> X 2 0 0
#> Y 0 4 0
#> Z 0 0 6
scaleVector(A, 2)
#> A B C
#> X 2 0 0
#> Y 0 4 0
#> Z 0 0 6
A
#> A B C
#> X 1 0 0
#> Y 0 2 0
#> Z 0 0 3
The reason for that is that diag(1:3) is actually an integer matrix, so a copy is made when you transfer it to a numeric matrix (or vector):
is.integer(A)
#> [1] TRUE
If one uses a numeric matrix to begin with, modification is done in place:
A <- diag(c(1.0, 2.0, 3.0))
rownames(A) <- c('X', 'Y', 'Z')
colnames(A) <- c('A', 'B', 'C')
scaleMatrix(A, 2)
#> A B C
#> X 2 0 0
#> Y 0 4 0
#> Z 0 0 6
scaleVector(A, 2)
#> A B C
#> X 4 0 0
#> Y 0 8 0
#> Z 0 0 12
A
#> A B C
#> X 4 0 0
#> Y 0 8 0
#> Z 0 0 12
I would like to collapse the rows of a transposed NumericMatrix using Rcpp. For instance:
library("data.table")
library("Rcpp")
dt1 <- data.table(V1=c(1, 0, 2),
V2=c(1, 1, 0),
V3=c(1, 0, 1),
V4=c(0, 1, 2),
V5=c(1, 1, 1))
cppFunction('NumericMatrix transpose(DataFrame data) {
NumericMatrix genotypes = internal::convert_using_rfunction(data, "as.matrix");
NumericMatrix tgeno(data.ncol(), data.nrow());
int number_samples = data.ncol();
int number_snps = data.nrow();
for (int i = 0; i < number_snps; i++) {
for (int j = 0; j < number_samples; j++) {
tgeno(j,i) = genotypes(i,j);
}
}
return tgeno;
}')
dt1
transpose(dt1)
Original Matrix
V1 V2 V3 V4 V5
1: 1 1 1 0 1
2: 0 1 0 1 1
3: 2 0 1 2 1
Transposed Matrix
[,1] [,2] [,3]
[1,] 1 0 2
[2,] 1 1 0
[3,] 1 0 1
[4,] 0 1 2
[5,] 1 1 1
I would like to have the following matrix:
[,1]
[1,] 102
[2,] 110
[3,] 101
[4,] 012
[5,] 111
Could anyone suggest a way to do this?
Maybe as a starting point, assuming that the numbers you concatenate consist only of a single digit:
//' @export
// [[Rcpp::export]]
std::vector<std::string> string_collapse(const Rcpp::DataFrame& data)
{
R_xlen_t nrow = data.nrow();
R_xlen_t ncol = data.ncol();
std::vector<std::string> ret(ncol);
for (R_xlen_t j = 0; j < ncol; ++j) {
const auto& col = Rcpp::as<Rcpp::NumericVector>(data[j]);
std::string ccstr;
ccstr.reserve(nrow);
for (const auto& chr: col) {
ccstr += std::to_string(chr)[0];
}
ret[j] = ccstr;
}
return ret;
}
It gives
dat <- data.frame(V1=c(1, 0, 2),
V2=c(1, 1, 0),
V3=c(1, 0, 1),
V4=c(0, 1, 2),
V5=c(1, 1, 1))
string_collapse(dat)
[1] "102" "110" "101" "012" "111"
But a quick benchmark comparing it to a pure R-solution suggests that you should not expect miracles. Probably there is still room for optimization.
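For values that may have more than one digit, one possible variation is to convert each value through an integer type instead of taking the first character. The following is a standalone sketch in plain C++, without Rcpp. It assumes the table is given as a vector of columns, and that all values are non-negative whole numbers. collapse_columns is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collapse each column of a column-wise table into one string by
// concatenating the integer part of every value, in row order.
std::vector<std::string> collapse_columns(
        const std::vector<std::vector<double>>& columns) {
    std::vector<std::string> out;
    out.reserve(columns.size());
    for (const auto& col : columns) {
        std::string s;
        s.reserve(col.size());
        for (double v : col) {
            assert(v >= 0 && "sketch assumes non-negative values");
            // Going through long long avoids the ".000000" suffix
            // that std::to_string(double) would append.
            s += std::to_string(static_cast<long long>(v));
        }
        out.push_back(s);
    }
    return out;
}
```

With the five columns V1 to V5 from the question this yields "102", "110", "101", "012" and "111". Note that once multi-digit values are allowed the result is ambiguous to read back, since "103" could come from (10, 3) or (1, 0, 3).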
Once you have the transposed matrix (for example, the result of transpose(dt1) from the question), you can collapse its rows as follows:
matrix(apply(transpose(dt1), 1, paste0, collapse = ""), ncol = 1)
I have a small C++ function using Rcpp that replaces elements of one matrix with values from another matrix. It works fine for single cells, or a column as below:
cppFunction('NumericMatrix changeC(NumericMatrix one, NumericMatrix two) {
NumericMatrix a = one;
NumericMatrix b = two;
b(_,1) = a(_,1);
return b;
}')
changeC(g,f)
If originally f is the following matrix:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 6 6 6 6 6 6
[2,] 6 6 6 6 6 6
[3,] 6 6 6 6 6 6
[4,] 6 6 6 6 6 6
[5,] 6 6 6 6 6 6
[6,] 6 6 6 6 6 6
and g looks like the following matrix:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 5 5 5 5 5 5
[2,] 5 5 5 5 5 5
[3,] 5 5 5 5 5 5
[4,] 5 5 5 5 5 5
[5,] 5 5 5 5 5 5
[6,] 5 5 5 5 5 5
When I run changeC(g,f) I end up with (as expected):
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 6 5 6 6 6 6
[2,] 6 5 6 6 6 6
[3,] 6 5 6 6 6 6
[4,] 6 5 6 6 6 6
[5,] 6 5 6 6 6 6
[6,] 6 5 6 6 6 6
But what I really want to do is replace a subset of one matrix with a subset of another matrix from a different place (e.g. rows 4 to 6, columns 4 to 6 of one matrix (3*3) into rows 1 to 3, columns 1 to 3 (also 3*3) of the other matrix). I have tried:
cppFunction('NumericMatrix changeC(NumericMatrix one, NumericMatrix two) {
NumericMatrix a = one;
NumericMatrix b = two;
b( Range(0,2), Range(0,2)) = a( Range(3,5), Range(3,5));
return b;
}')
but this doesn't compile. Although:
cppFunction('NumericMatrix changeC(NumericMatrix one, NumericMatrix two) {
NumericMatrix a = one;
NumericMatrix b = two;
b = a( Range(3,5), Range(3,5));
return b;
}')
does compile. What am I doing wrong? In R I would do the following:
f[1:3,1:3] <- g[4:6,4:6], but this is relatively slow with a very large matrix (hence Rcpp).
Thanks for any help in advance.
EDIT 1
After a bit of playing around I've managed to get my matrix to step east and west (and I assume it would be similar to north and south - possibly a two step approach for North East, North West??):
func <- 'NumericMatrix eastC(NumericMatrix a) {
int acoln=a.ncol();
NumericMatrix out(a.nrow(),a.ncol()) ;
for (int j = 0;j < acoln;j++) {
if (j > 0) {
out(_,j) = a(_,j-1);
} else {
out(_,j) = a(_,0);
}
}
return out ;
}'
cppFunction(func)
Any refinements would be welcome. Ideally, I would like the first column of the result to be zeros rather than a copy of column 0. Any ideas?
I don't think the Rcpp subMatrix allows for assignments that way.
Take a look at using RcppArmadillo and Armadillo submatrix views:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace arma;
// [[Rcpp::export]]
mat example( mat m1, mat m2) {
m1.submat( 0,0, 2,2) = m2.submat( 3,3, 5,5 );
return m1;
}
/*** R
m1 <- matrix(1,6,6)
m2 <- matrix(-1,6,6)
example(m1, m2)
*/
> m1 <- matrix(1,6,6)
> m2 <- matrix(-1,6,6)
> example(m1, m2)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] -1 -1 -1 1 1 1
[2,] -1 -1 -1 1 1 1
[3,] -1 -1 -1 1 1 1
[4,] 1 1 1 1 1 1
[5,] 1 1 1 1 1 1
[6,] 1 1 1 1 1 1
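If adding Armadillo is not an option, the block assignment can also be spelled out as a plain double loop. Below is a hedged sketch in standalone C++, not Rcpp. Mat is a made-up minimal matrix type using R's column-by-column layout, and copy_block is a hypothetical helper. The same two loops should carry over to a NumericMatrix with its (i, j) element access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense matrix stored column by column, the same layout R uses.
struct Mat {
    std::size_t nrow, ncol;
    std::vector<double> v;
    Mat(std::size_t r, std::size_t c, double fill)
        : nrow(r), ncol(c), v(r * c, fill) {}
    double& at(std::size_t i, std::size_t j) { return v[j * nrow + i]; }
    double at(std::size_t i, std::size_t j) const { return v[j * nrow + i]; }
};

// Copy an h-by-w block whose top-left corner is (si, sj) in src
// into dst with its top-left corner at (di, dj). Indices are 0-based.
void copy_block(const Mat& src, std::size_t si, std::size_t sj,
                Mat& dst, std::size_t di, std::size_t dj,
                std::size_t h, std::size_t w) {
    assert(si + h <= src.nrow && sj + w <= src.ncol);
    assert(di + h <= dst.nrow && dj + w <= dst.ncol);
    // Columns in the outer loop so memory is walked contiguously.
    for (std::size_t j = 0; j < w; ++j)
        for (std::size_t i = 0; i < h; ++i)
            dst.at(di + i, dj + j) = src.at(si + i, sj + j);
}
```

For the 6-by-6 example, copy_block(m2, 3, 3, m1, 0, 0, 3, 3) plays the role of f[1:3,1:3] <- g[4:6,4:6]. The sketch assumes src and dst are different objects; overlapping blocks within one matrix would need a temporary copy.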