Casting a BigMatrix/array to an Armadillo matrix - c++

I have a big.matrix that I want to cast to an arma::Mat so that I can use the linear algebra functionality of Armadillo.
However, I can't seem to get the cast to work.
As far as I can gather from reading, both are internally stored in column major format, and the actual matrix component of a big.matrix is simply a pointer of type <T> (char/short/int/double)
The following code compiles, but the cast to the arma::Mat doesn't work, segfaulting when iterating over the cast matrix.
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(BH, bigmemory, RcppArmadillo)]]
#include <bigmemory/BigMatrix.h>
template <typename T>
void armacast(const arma::Mat<T>& M) {
// This segfaults
for (int j = 0; j < 2; j++) {
for (int i = 0; i < 2; i++) {
std::cout << M.at(j, i) << std::endl;
}
}
std::cout << "Success!" << std::endl;
}
// [[Rcpp::export]]
void armacast(SEXP pDat) {
XPtr<BigMatrix> xpDat(pDat);
if (xpDat->matrix_type() == 8) {
// I can iterate over this *mat and get sensible output.
double *mat = (double *)xpDat->matrix();
for (int j = 0; j < 2; j++) {
for (int i = 0; i < 2; i++) {
std::cout << *mat + 2 * (j + 0) + i << std::endl;
}
}
armacast((const arma::Mat<double> &)mat);
} else {
std::cout << "Not implemented yet!" << std::endl;
}
}
In R:
library(Rcpp)
library(RcppArmadillo)
library(bigmemory)
sourceCpp("armacast.cpp")
m <- as.big.matrix(matrix(1:4, 2), type="double")
armacast(m@address)

Great question! We may spin this into another Rcpp Gallery post.
There is one important detail you may have glossed over. bigmemory objects are external, so that R's memory management does not interfere with them. Armadillo does have constructors for externally managed memory (and please read the docs and warnings there), so in a first instance we can just do
arma::mat M( (double*) xpDat->matrix(), xpDat->nrow(), xpDat->ncol(), false);
where we use a pointer to the matrix data, as well as the row and column counts. A complete version:
// [[Rcpp::export]]
void armacast(SEXP pDat) {
XPtr<BigMatrix> xpDat(pDat);
if (xpDat->matrix_type() == 8) {
arma::mat M((double*) xpDat->matrix(), xpDat->nrow(), xpDat->ncol(), false);
M.print("Arma matrix M");
} else {
std::cout << "Not implemented yet!" << std::endl;
}
}
It correctly invokes the print method from Armadillo:
R> armacast(m@address)
Arma matrix M
1.0000 3.0000
2.0000 4.0000
R>
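As background to why the raw pointer can be handed to Armadillo together with the row and column counts: in column-major storage, element (i, j) sits at offset i + j * nrow. The sketch below is plain C++ with no bigmemory or Armadillo calls, and at_colmajor and row_sum are hypothetical helper names. Note that the expression *mat + 2 * (j + 0) + i in the question dereferences first and then adds the offset to the value, so it only looks right because the test matrix holds 1 to 4.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of bigmemory or Armadillo): read element (i, j)
// from a column-major buffer that has nrow rows and ncol columns.
double at_colmajor(const double* data, std::size_t nrow, std::size_t ncol,
                   std::size_t i, std::size_t j) {
    assert(i < nrow && j < ncol);  // guard against out-of-range indices
    return data[i + j * nrow];     // each column is contiguous in memory
}

// Sum of row i: walks across the columns with a stride of nrow elements.
double row_sum(const double* data, std::size_t nrow, std::size_t ncol,
               std::size_t i) {
    double total = 0.0;
    for (std::size_t j = 0; j < ncol; ++j) {
        total += at_colmajor(data, nrow, ncol, i, j);
    }
    return total;
}
```

For the 2x2 matrix built from 1:4 in the R session above, the buffer is {1, 2, 3, 4}, so element (0, 1) is 3, which matches the printed Armadillo matrix.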

Related

Efficient matrix implementation

I have the following problem:
I have a precomputed 2D matrix of values that I compute only once but need to look up very often.
The size of the matrix is about 4000x4000 at most.
The matrix won't be sparse; I typically need almost all values.
The values in the matrix can be boolean, integer or double. In any case, they are always small objects.
Currently I am storing the precomputed values in a std::vector<std::vector<T>>, and I've noticed that lookups into this data structure take quite some time in heavy computations. I've googled around, and so far the suggested implementation seems to be to store all the memory contiguously in a 1D array, where the location in this array is computed from i and j.
Does anybody have a good example implementation of this, or an even better suggestion? I couldn't find a modern C++ example, although it seems to be a very common problem to me. I'd prefer to use someone else's code instead of reinventing the wheel here. Of course I will measure the differences to see whether it actually improves performance.
Examples I've found:
https://medium.com/@patdhlk/c-2d-array-a-different-better-solution-6d371363ebf8
https://secure.eld.leidenuniv.nl/~moene/Home/tips/matrix2d/
Here is a very simple and efficient 2D matrix. The 'main' function creates a 10000x10000 double array 'mat' and fills it with random numbers. The array 'mat' is then copied into another array 'mat2'. You may input two integers 'n' and 'm' between 0 and 9999 to fetch the double value at mat2(n,m).
Feel free to use or test it. Let me know if you encounter problems or need more functions implemented. Good luck!
#ifndef ytlu_simple_matrix_class_
#define ytlu_simple_matrix_class_
#include <iostream>
#include <iomanip>
#include <complex>
#include <algorithm> // std::copy
#include <utility> // std::move
template <typename T> class tMatrix
{
public:
T *ptr;
int col, row, size;
inline T* begin() const {return ptr;}
inline T* end() const {return this->ptr + this->size;}
inline T operator()(const int i, const int j) const { return ptr[i*col+j];
} // r-value
inline T&operator()(const int i, const int j) { return ptr[i*col+j]; } //l-value
inline tMatrix(): col{0}, row{0}, size{0}, ptr{0} {;}
tMatrix(const int i, const int j): col(j), row(i), size(i*j)
{
ptr = new T [this->size] ;
}
tMatrix(const tMatrix<T>&a) : tMatrix<T>(a.row, a.col)
{
std::copy(a.begin(), a.end(), this->ptr);
}
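tMatrix<T>& operator=(tMatrix<T>&&a)
{
if (this == &a) return *this; // guard against self-move
this->col = a.col;
this->row = a.row;
this->size = a.size; // keep size in sync with the new buffer
delete [] this->ptr;
this->ptr = a.ptr;
a.ptr = nullptr;
a.col = a.row = a.size = 0; // leave the source empty but valid
return *this;
}
tMatrix<T>& operator=(const tMatrix<T>&a)
{
if (col==a.col && row==a.row) std::copy(a.begin(), a.end(), this->ptr);
else { tMatrix<T> v(a); *this = std::move(v);} // copy first, then move the copy in
return *this;
}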
~tMatrix() {delete [] this->ptr;}
}; //end of class tMatrix
template <typename X> std::ostream& operator<<(std::ostream&p, const tMatrix<X>&a)
{
p << std::fixed;
for (int i=0; i<a.row; i++) {
for (int j=0; j <a.col; j++) p << std::setw(12) << a(i, j);
p << std::endl;
}
return p;
}
using iMatrix = tMatrix<int>;
using rMatrix = tMatrix<double>;
using cMatrix = tMatrix<std::complex<double> >;
#endif
//
//
#include <ctime>
#include <cstdlib>
#define N1 10000
int main()
{
int n, m;
std::srand(time(NULL)); // randomize
rMatrix mat(N1, N1); // declare a 10000 x 10000 double matrix
//
// fill the whole matrix with double random number 0.0 - 1.0
//
for (int i = 0; i<mat.row; i++)
{ for (int j=0; j<mat.col; j++) mat(i, j) = (double)std::rand() / (double)RAND_MAX; }
//
// copy mat to mat 2 just for test
//
rMatrix mat2 = mat;
//
// fetch data test input 0 <= n m < 10000 to print mat2(n, m)
//
while(1)
{
std::cout << "Fetch 2d array at (n m) = ";
std::cin >> n >> m;
if ((n < 0) || (m < 0) || (n >= mat2.row) || (m >= mat2.col) )break;
std::cout << "mat(" << n << ", " << m << ") = " << mat2(n, m) << std::endl << std::endl;
}
return 0;
}
Below are the compile command I used and a test run. Filling the matrix with random numbers takes a couple of seconds, and I noticed no delay at all when fetching values on my aged PC.
ytlu@ytlu-PC MINGW32 /d/ytlu/working/cpptest
$ g++ -O3 -s mtx_class.cpp -o a.exe
ytlu@ytlu-PC MINGW32 /d/ytlu/working/cpptest
$ ./a.exe
Fetch 2d array at (n m) = 7000 9950
mat(7000, 9950) = 0.638447
Fetch 2d array at (n m) = 2904 5678
mat(2904, 5678) = 0.655934
Fetch 2d array at (n m) = -3 4
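As an alternative to managing the buffer with new and delete, the same flat layout can sit on top of a std::vector, which then provides copying, moving and cleanup automatically. This is only a sketch under that assumption; FlatMatrix is a hypothetical name and is not taken from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch: a row-major 2D matrix stored in one contiguous vector.
template <typename T>
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols, const T& init = T())
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    // Element (i, j) lives at offset i * cols + j.
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);  // checked in debug builds only
        return data_[i * cols_ + j];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;  // single allocation; no hand-written destructor needed
};
```

Because std::vector<bool> is a packed specialization that hands out proxy objects, boolean values would probably be better stored as FlatMatrix<char> or a similar small integer type with this sketch.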

why my function doesn't change my object's attribute

I'm trying to write a Matrix class using C++ vectors, but I don't know why the contents of the "Matrix result" inside my function aren't passed to my object; they stay enclosed inside the function.
For simplicity, so far I've only tried to write an "addition function" for two matrices.
I have tried to work with pointers, but that way (as far as I know) I can't call my functions on an object like this:
foo.function1(bar1).function2(bar2);
Instead, working with pointers, I have to call the functions in this manner:
foo.function1(bar1);
foo.function2(bar2);
//and so on..
this is my header file:
#include <iostream>
#include <vector>
using namespace std;
class Matrix
{
public:
Matrix (int height, int width);
Matrix add(Matrix m);
Matrix applyFunction(double (*function)(double));
void print();
private:
vector<vector<double> > matrix;
int height;
int width;
};
this is the .cpp file:
Matrix::Matrix(int height, int width)
{
this->height = height;
this->width = width;
this->matrix = vector<vector<double> >(this->height, vector<double>(this->width));
}
Matrix Matrix::add(Matrix m)
{
Matrix result(this->height, this->width);
if (m.height== this->height&& m.width== this->width)
{
for (int i = 0; i < this->height; i++)
{
for (int j = 0; j < this->width; j++)
{
result.matrix[i][j] = this->matrix[i][j] + m.matrix[i][j];
}
return result;
}
}
else
{
cout << "Impossible to do addition, matrices doesn't have the same dimension" << endl;
return result;
}
}
Matrix Matrix::applyFunction(double(*function)(double))
{
Matrix result(this->height, this->width);
for (int i = 0; i < this->height; i++)
{
for (int j = 0; j < this->width; j++)
{
result.matrix[i][j] = (*function)(this->matrix[i][j]);
}
}
return result;
}
void Matrix::print()
{
for (int i = 0; i < this->height; i++)
{
for (int j = 0; j < this->width; j++)
{
cout << this->matrix[i][j] << " ";
}
cout << endl;
}
cout << endl;
}
The output should be the sum of A and B (2x2):
x1 x2
x3 x4
but the computer shows only zeros.
Your member functions all return a new object (they return "by value").
From your usage of chaining, it seems like you actually want to modify the object and return *this by reference.
Otherwise you'll need something like:
auto bar2 = foo.function1(bar1);
auto bar3 = foo.function2(bar2);
// etc
There are no pointers here at present.
There are two ways you can implement your add:
Matrix add(Matrix m)
{
// optimisation: you don't need separate result, m already IS a copy!
// so you can just calculate:
...
{
m.matrix[i][j] += this->matrix[i][j];
}
return m;
}
or:
Matrix& add(Matrix const& m)
// ^ accept const reference to avoid unnecessary copy
// ^ returning reference(!)
{
...
{
// modifies itself!
this->matrix[i][j] += m.matrix[i][j];
}
return *this; // <- (!)
}
This now allows you to write:
Matrix m0, m1, m2;
m0.add(m1).add(m2);
// m0 now contains the result, original value is lost (!)
So you don't need the final assignment that the first variant requires:
m0 = m0.add(m1).add(m2);
// or assign to a new variable, if you want to retain m0's original values
which is what you lacked in your question (thus you did not get the desired result).
Maybe you want to have both variants, and you might rename one of them. But there's a nice feature in C++ that you might like even better: operator overloading. Consider ordinary int:
int n0, n1;
n0 += n1;
int n2 = n0 + n1;
Well, I suppose you know what's going on. What if you could do exactly the same with your matrices? Actually, you can! All you need to do is overload the operators:
Matrix& operator+=(Matrix const& m)
{
// identical to second variant of add above!
}
Matrix operator+(Matrix m) // again: the copy!
{
// now implement in terms of operator+=:
return m += *this;
}
Yes, now you can do:
Matrix m0, m1, m2;
m0 += m1 += m2;
m2 = m1 + m0;
Alternatively (and I'd prefer it) you can implement the second operator (operator+) as free standing function as well:
// defined OUTSIDE Matrix class!
Matrix operator+(Matrix first, Matrix const& second)
{
return first += second;
}
Finally: if the dimensions don't match, throwing an exception would be better than returning some dummy matrix; std::domain_error might be a candidate, or you could define your own exception, something like SizeMismatch. And please don't output anything to the console or elsewhere in such operators. This is not what anybody would expect from them, and you impose console output on others who might consider it inappropriate (perhaps they want output in another language?).
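Putting these pieces together, a minimal sketch might look like the following. The member names (height_, width_, cells_) and the at() accessor are assumptions made for this illustration; only operator+=, the free operator+ and the use of std::domain_error come from the discussion above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class Matrix {
public:
    Matrix(std::size_t height, std::size_t width)
        : height_(height), width_(width),
          cells_(height, std::vector<double>(width, 0.0)) {}

    // Hypothetical accessors, added so the example can set and read values.
    double& at(std::size_t i, std::size_t j) {
        assert(i < height_ && j < width_);
        return cells_[i][j];
    }
    double at(std::size_t i, std::size_t j) const {
        assert(i < height_ && j < width_);
        return cells_[i][j];
    }

    // Modifies *this and returns a reference, so calls can be chained.
    Matrix& operator+=(const Matrix& m) {
        if (m.height_ != height_ || m.width_ != width_) {
            throw std::domain_error("Matrix dimensions do not match");
        }
        for (std::size_t i = 0; i < height_; ++i) {
            for (std::size_t j = 0; j < width_; ++j) {
                cells_[i][j] += m.cells_[i][j];
            }
        }
        return *this;
    }

private:
    std::size_t height_;
    std::size_t width_;
    std::vector<std::vector<double>> cells_;
};

// Free function: 'first' is already a copy, so it can be modified and returned.
inline Matrix operator+(Matrix first, const Matrix& second) {
    first += second;
    return first;
}
```

With this in place, a + b leaves both operands untouched, while (a += b) += c accumulates into a. A size mismatch surfaces as an exception that the caller can catch and report in whatever way suits the application.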

Getting sequential pointers for the coordinates of a vtkPoint data

I wrote the following function to store the (x, y, z) of a vtkPoint in an array of type double and size of 3*N, where N is the number of vertices (or points).
double* myClass::getMyPoints(void)
{
double* vertices = new double[this->m_numberOfVertices * 3];
for (vtkIdType ivert = 0; ivert < this->m_numberOfVertices; ivert++)
for (auto i = 0; i < 3; ++i)
this->m_points->GetPoint(ivert, &vertices[3 * ivert]);
return vertices;
}
where m_points is a member of myClass and is of type vtkSmartPointer<vtkPoints>.
This function does what I want and works just fine. I was wondering if there is an elegant way of getting the sequential pointers. I tried GetVoidPointer(), which looks like an elegant one-line code, to avoid the for loop but it does not get the coordinates correctly after the function returns vertices.
(double*)(m_points->GetData()->GetVoidPointer(0));
Could someone help me with this?
vtkPoints internally stores its data as a float array instead of a double array, so you may need to modify your function to work with float* instead of double*. If you want vtkPoints to use a double array, call SetDataTypeToDouble() on the vtkPoints object.
#include <iostream>
#include <stdlib.h>
#include <vtkNew.h>
#include <vtkPoints.h>
int main(){
// Create data
auto N = 5;
vtkNew<vtkPoints> pts;
pts->SetDataTypeToDouble();
for(auto i=0; i < N; ++i)
pts->InsertNextPoint(rand()%100,rand()%100,rand()%100);
// Read using for loop
std::cout<< "Using for loop ... " << std::endl;
for( auto j=0; j < N; ++j ){
double p[3];
pts->GetPoint( j, p );
std::cout<< p[0] << "," << p[1] << "," << p[2] << std::endl;
}
// Read using GetVoidPointer()
std::cout<< "Using GetVoidPointer() ... " << std::endl;
auto data_ptr = (double*) pts->GetData()->GetVoidPointer(0);
for( auto k = 0; k < N; ++k )
std::cout<< *(data_ptr + 3*k) << ","
<< *(data_ptr + 3*k + 1) << ","
<< *(data_ptr + 3*k + 2) << std::endl;
return 0;
}
This gives the following result:
Test that there are N = 5 tuples.
Using for loop ...
83,86,77
15,93,35
86,92,49
21,62,27
90,59,63
Using GetVoidPointer() ...
83,86,77
15,93,35
86,92,49
21,62,27
90,59,63
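If changing the data type of the vtkPoints object is not an option, the float values have to be converted one by one, because a plain cast to double* reinterprets the bytes rather than converting them. The sketch below uses an ordinary float array as a stand-in for the interleaved x, y, z buffer; it makes no VTK calls, and to_double_xyz is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 'raw' stands in for a float buffer holding x0, y0, z0, x1, y1, z1, ...
// Returns a new array of 3 * n_points doubles with the same values.
std::vector<double> to_double_xyz(const float* raw, std::size_t n_points) {
    assert(raw != nullptr || n_points == 0);
    // The range constructor widens each float to double individually.
    return std::vector<double>(raw, raw + 3 * n_points);
}
```

Returning a std::vector instead of a pointer obtained from new also means the caller does not have to remember to delete the array.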

Efficient Eigen Matrix From Function

I'm trying to build a matrix from a kernel, such that A(i,j) = f(i,j), where i and j are both vectors (hence I build A from two matrices x and y, in which each row corresponds to a point/vector). My current function looks similar to this:
Eigen::MatrixXd get_kernel_matrix(const Eigen::MatrixXd& x, const Eigen::MatrixXd& y, double(&kernel)(const Eigen::VectorXd&, const Eigen::VectorXd&)) {
Eigen::MatrixXd res (x.rows(), y.rows());
for(int i = 0; i < res.rows() ; i++) {
for(int j = 0; j < res.cols(); j++) {
res(i, j) = kernel(x.row(i), y.row(j));
}
}
return res;
}
Along with some logic for the diagonal (which would in my case likely cause division by zero).
Is there a more efficient/idiomatic way to do this? In some of my tests it appears that Matlab code beats the speed of my C++/Eigen implementation (I'm guessing due to vectorization).
I've looked through a considerable amount of documentation (such as the unaryExpr function), but can't seem to find what I'm looking for.
Thanks for any help.
You can use NullaryExpr with an appropriate lambda to remove your for loops:
MatrixXd res = MatrixXd::NullaryExpr(x.rows(), y.rows(),
[&x,&y,&kernel](int i,int j) { return kernel(x.row(i), y.row(j)); });
Here is a working self-contained example reproducing a matrix product:
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;
using namespace std;
double my_kernel(const MatrixXd::ConstRowXpr &x, const MatrixXd::ConstRowXpr &y) {
return x.dot(y);
}
template<typename Kernel>
MatrixXd apply_kernel(const MatrixXd& x, const MatrixXd& y, Kernel kernel) {
return MatrixXd::NullaryExpr(x.rows(), y.rows(),
[&x,&y,&kernel](int i,int j) { return kernel(x.row(i), y.row(j)); });
}
int main()
{
int n = 10;
MatrixXd X = MatrixXd::Random(n,n);
MatrixXd Y = MatrixXd::Random(n,n);
MatrixXd R = apply_kernel(X,Y,my_kernel); // a plain function pointer works; std::ptr_fun was removed in C++17
std::cout << R << "\n\n";
std::cout << X*Y.transpose() << "\n\n";
}
If you don't want to make apply_kernel a template function, you can use std::function to pass the kernel.
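To illustrate the std::function variant without depending on Eigen, the sketch below uses std::vector<double> as a stand-in for a matrix row. Point, Kernel and this version of apply_kernel are assumptions made for the example; in real code the signature would presumably use the Eigen row types from the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Point = std::vector<double>;  // stand-in for one Eigen row
using Kernel = std::function<double(const Point&, const Point&)>;

// Non-template function: the kernel type is fixed by std::function, so any
// callable with a matching signature (function, lambda, functor) is accepted.
std::vector<std::vector<double>> apply_kernel(const std::vector<Point>& x,
                                              const std::vector<Point>& y,
                                              const Kernel& kernel) {
    assert(static_cast<bool>(kernel));  // an empty std::function cannot be called
    std::vector<std::vector<double>> res(x.size(),
                                         std::vector<double>(y.size()));
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t j = 0; j < y.size(); ++j) {
            res[i][j] = kernel(x[i], y[j]);
        }
    }
    return res;
}
```

The trade-off is that std::function calls go through type erasure, which typically prevents inlining of the kernel, so the template version may be faster when the kernel is cheap.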

Sorting multiple vectors according to one vector [duplicate]

I have four vectors containing x, y, radius and weight information on centres of circles. I would like to sort them in order of weight (highest to lowest), but I really have no idea how or where to start with this. I could put all the vectors in an Eigen::Tensor to keep the data gathered in one 4d matrix if that would help. But other than that I don't know.
Each of the vectors contain 134 elements, but since it's only one of them having to be sorted that means the sorting algorithm doesn't matter all that much.
Does anyone have a hint on where to start?
You can create a fifth vector of indices, sort it according to one of the four vectors, then reorder all four vectors in O(n) time (which also restores the vector of indices to sorted order). The example below sorts 3 vectors according to one of them (the ages vector). The vector of indices I is created, then sorted according to A (using a lambda compare); then all 3 vectors and I are reordered according to I by undoing the "cycles" in I.
#include <algorithm>
#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
int main()
{
std::vector <int> A; // ages
std::vector <std::string> N; // names
std::vector <int> Z; // zip codes
std::vector <size_t> I; // indices
int tA;
std::string tN;
int tZ;
A.push_back(37);
N.push_back("Ted");
Z.push_back(54211);
A.push_back(21);
N.push_back("John");
Z.push_back(53421);
A.push_back(31);
N.push_back("Fred");
Z.push_back(52422);
A.push_back(21);
N.push_back("Sam");
Z.push_back(51422);
// display the vectors
for(size_t i = 0; i < A.size(); i++)
std::cout << std::setw(6) << N[i]
<< std::setw(8) << Z[i]
<< std::setw(4) << A[i] << std::endl;
std::cout << std::endl;
// initialize the vector of indices
for(size_t i = 0; i < A.size(); i++)
I.push_back(i);
// sort I according to A
std::stable_sort(I.begin(), I.end(),
[&A](size_t i, size_t j) {return
A[i] < A[j];});
// reorder A, N, Z in place also restore I
// time complexity is O(n)
for(size_t i = 0; i < A.size(); i++){
size_t j, k;
if(i != I[i]){
tA = A[i];
tN = N[i];
tZ = Z[i];
k = i;
while(i != (j = I[k])){
A[k] = A[j];
N[k] = N[j];
Z[k] = Z[j];
I[k] = k;
k = j;
}
A[k] = tA;
N[k] = tN;
Z[k] = tZ;
I[k] = k;
}
}
// display the sorted vectors
for(size_t i = 0; i < A.size(); i++)
std::cout << std::setw(6) << N[i]
<< std::setw(8) << Z[i]
<< std::setw(4) << A[i] << std::endl;
return 0;
}
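If the in-place cycle walk feels too intricate, a simpler variant of the same index idea is to sort the indices and then build reordered copies, at the cost of O(n) extra memory per vector. The sketch below sorts by weight from highest to lowest, as asked in the question; order_by_weight_desc and permuted are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns indices 0..n-1 ordered so that weights[idx[0]] >= weights[idx[1]] >= ...
std::vector<std::size_t> order_by_weight_desc(const std::vector<double>& weights) {
    std::vector<std::size_t> idx(weights.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    // stable_sort keeps the original relative order of equal weights
    std::stable_sort(idx.begin(), idx.end(),
                     [&weights](std::size_t a, std::size_t b) {
                         return weights[a] > weights[b];
                     });
    return idx;
}

// Builds a reordered copy of v: out[k] = v[idx[k]].
template <typename T>
std::vector<T> permuted(const std::vector<T>& v,
                        const std::vector<std::size_t>& idx) {
    assert(v.size() == idx.size());
    std::vector<T> out;
    out.reserve(v.size());
    for (std::size_t i : idx) {
        out.push_back(v[i]);
    }
    return out;
}
```

The index vector is computed once from the weights and then applied to each of the four vectors in turn, for example xs = permuted(xs, idx), and likewise for the other three.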
With range-v3, you may do something like
ranges::sort(
ranges::view::zip(xs, ys, radiuses, weights),
std::greater<>{}, // decreasing order
[](const auto& t){ return std::get<3>(t); }); // Projection: use weight
But having a Circle class would make sense: it would avoid having to zip the arrays and allow a shorter projection.
Perhaps it makes more sense to first restructure your code and convert the four vectors into one vector of structures.
Something like this:
struct CircleInfo
{
int x, y, radius, weight;
};
std::vector<CircleInfo> circles;
Then, if you want to sort by radius:
#include <vector>
#include <algorithm>
#include <iostream>
struct CircleInfo
{
int x, y, radius, weight;
};
int main()
{
std::vector<CircleInfo> circles;
CircleInfo ci1 = { 1,1,1,1 };
CircleInfo ci2 = { 3,3,3,3 };
circles.push_back(ci2);
circles.push_back(ci1);
std::cout << "before sort circles[0].radius: " << circles[0].radius << std::endl;
std::sort(circles.begin(), circles.end(), [](const CircleInfo& c1, const CircleInfo& c2) {
return c1.radius < c2.radius;
});
std::cout << "after sort circles[0].radius: " << circles[0].radius << std::endl;
}
Output:
before sort circles[0].radius: 3
after sort circles[0].radius: 1
This code uses std::sort with a custom comparison function that compares two circles by radius. To sort by weight instead, update it to compare c1.weight with c2.weight (using > rather than < for highest-to-lowest order).