Any C++/C equivalent function to sparse() in MATLAB - c++

I am trying to port MATLAB code to C or C++.
In my code there is a line
A = sparse(I,J,IA,nR,nC);
which converts the row indices I, column indices J, and data IA into a sparse matrix A of size nR x nC.
Is there any equivalent in C++ or C?
A naïve algorithm that reproduces the result in a full (dense) matrix is:
double *A;
A = malloc(sizeof(double)*nR*nC);
memset(A, 0, sizeof(double)*nR*nC);  /* zero the whole buffer, not just one double */
for(k=0; k<size_of_IA; k++)
    A[I[k]*nC + J[k]] += IA[k];      /* row-major layout */
Note that if there are repeated indices, the value is not overwritten but accumulated.

Eigen is an example of a C++ math matrix library that contains sparse matrices. It overloads operators to make them feel like a built-in feature.
There are many C and C++ matrix libraries. None ship as part of the standard library, nor is there anything built in.
Writing a good sparse matrix library would be quite hard; your best bet is finding a pre-written one. (Recommendation questions are off topic.)
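For example, the MATLAB call above maps fairly directly onto Eigen's triplet interface. Here is a minimal sketch (the function name and the 0-based index convention are assumptions, not part of the original question):

#include <Eigen/Sparse>
#include <vector>

// Rough equivalent of A = sparse(I, J, IA, nR, nC).
// setFromTriplets sums duplicate (i, j) entries, matching MATLAB's accumulation.
Eigen::SparseMatrix<double> buildSparse(const std::vector<int>& I,
                                        const std::vector<int>& J,
                                        const std::vector<double>& IA,
                                        int nR, int nC)
{
    std::vector<Eigen::Triplet<double>> triplets;
    triplets.reserve(IA.size());
    for (std::size_t k = 0; k < IA.size(); ++k)
        triplets.emplace_back(I[k], J[k], IA[k]);  // Eigen indices are 0-based; MATLAB's are 1-based

    Eigen::SparseMatrix<double> A(nR, nC);
    A.setFromTriplets(triplets.begin(), triplets.end());
    return A;
}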

Related

Using R and Rcpp, how to multiply two matrices that are in sparse Matrix::csr/csc format?

The following code works as expected:
matrix.cpp
// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>

// [[Rcpp::export]]
SEXP eigenMatTrans(Eigen::MatrixXd A){
    Eigen::MatrixXd C = A.transpose();
    return Rcpp::wrap(C);
}

// [[Rcpp::export]]
SEXP eigenMatMult(Eigen::MatrixXd A, Eigen::MatrixXd B){
    Eigen::MatrixXd C = A * B;
    return Rcpp::wrap(C);
}

// [[Rcpp::export]]
SEXP eigenMapMatMult(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
    Eigen::MatrixXd C = A * B;
    return Rcpp::wrap(C);
}
This uses the C++ Eigen classes for matrices; see https://eigen.tuxfamily.org/dox
In R, I can access those functions.
library(Rcpp);
Rcpp::sourceCpp('matrix.cpp');
A <- matrix(rnorm(10000), 100, 100);
B <- matrix(rnorm(10000), 100, 100);
library(microbenchmark);
microbenchmark(eigenMatTrans(A), t(A), A%*%B, eigenMatMult(A, B), eigenMapMatMult(A, B))
This shows that R performs pretty well on transposition; multiplication has some advantages with Eigen.
Using the Matrix library, I can convert a normal matrix to a sparse matrix.
Example from https://cmdlinetips.com/2019/05/introduction-to-sparse-matrices-in-r/
library(Matrix);
data<- rnorm(1e6)
zero_index <- sample(1e6)[1:9e5]
data[zero_index] <- 0
A = matrix(data, ncol=1000)
A.csr = as(A, "dgRMatrix");
B.csr = t(A.csr);
A.csc = as(A, "dgCMatrix");
B.csc = t(A.csc);
So if I wanted to multiply A.csr by B.csr using Eigen, how would I do that in C++? I do not want to convert types if I don't have to; it is a memory-size concern.
A.csr %*% B.csr is not yet implemented.
A.csc %*% B.csc works.
I would like to microbenchmark the different options and see which is most efficient as matrix size grows. In the end, I will have a matrix that is about 1% non-zero with 5 million rows and columns ...
There's a reason the dgRMatrix cross-product functions are not yet implemented; in fact, they should not be implemented, because they would enable bad practice.
There are a few performance considerations when working with sparse matrices:
Accessing marginal views (rows or columns) against the matrix's major storage orientation is highly inefficient. For instance, a column iterator in a dgRMatrix and a row iterator in a dgCMatrix need to loop through almost all elements of the matrix to find the ones in just that column or row (see the sketch after this answer). See this Rcpp gallery post for additional enlightenment.
A matrix cross-product is simply a dot product between all combinations of columns. This means the penalty of using a column iterator in a dgRMatrix (vs. a column iterator in a dgCMatrix) is multiplied by the number of column combinations.
Cross-product functions in R are highly optimized, and are not (in my experience) significantly faster than Eigen, Armadillo, or equivalent STL variants. They are parallelized, and the Matrix package takes wonderful advantage of these optimized algorithms. I have written parallelized C++ STL cross-product variants using Rcpp structures and I don't see any increase in performance.
If you're really going this route, check out my Rcpp gallery post on sparse matrix structures in Rcpp. These are preferable to Eigen and Armadillo sparse matrices if memory is a concern, as Eigen and Armadillo perform a deep copy rather than referencing an R object already in memory.
At 1% density, the inefficiencies of row iterators will be greater than at say 5 or 10% density. I do most of my tests at 5% density and generally binary operations take 5-10x longer for row iterators than for column iterators.
There may be applications where row-major ordering shines (e.g. see the work by Dmitry Selivanov on CSR matrices and irlba svd), but this is absolutely not one of them; so much so that you are better off doing an in-place conversion to get a CSC matrix.
tl;dr: a column-wise cross-product in a row-major matrix is the ultimate in inefficiency.
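To illustrate the first consideration above, here is a small C++ sketch that uses Eigen's SparseMatrix as a stand-in for dgCMatrix (column-major) and dgRMatrix (row-major); the function names are illustrative assumptions, not anything from the answer:

#include <Eigen/Sparse>

// Column j of a column-major matrix: one InnerIterator pass over that column only.
double colSumColMajor(const Eigen::SparseMatrix<double>& m, int j)
{
    double s = 0.0;
    for (Eigen::SparseMatrix<double>::InnerIterator it(m, j); it; ++it)
        s += it.value();
    return s;
}

// Column j of a row-major matrix: every row must be scanned to find entries in column j.
double colSumRowMajor(const Eigen::SparseMatrix<double, Eigen::RowMajor>& m, int j)
{
    double s = 0.0;
    for (int i = 0; i < m.outerSize(); ++i)  // outer dimension = rows
        for (Eigen::SparseMatrix<double, Eigen::RowMajor>::InnerIterator it(m, i); it; ++it)
            if (it.col() == j)
                s += it.value();
    return s;
}

A column-wise cross-product repeats the second pattern for every pair of columns, which is why the penalty compounds with the number of column combinations.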

Using Eigen and C++ to do a colsum of massive matrix product

I am trying to compute colsum(N * P), where N is a sparse, 1M by 2500 matrix, and P is a dense 2500 by 1.5M matrix. I am using the Eigen C++ library with Intel's MKL library. The issue is that the matrix N*P can't actually exist in memory, it's way too big (~10 TB). My question is whether Eigen will be able to handle this computation through some combination of lazy evaluation and parallelism? It says here that Eigen won't make temporary matrices unnecessarily: http://eigen.tuxfamily.org/dox-devel/TopicLazyEvaluation.html
But does Eigen know to compute N * P in piecewise chunks that will actually fit in memory? I.e., it would have to do something like colsum(N * P_1) ++ colsum(N * P_2) ++ ... ++ colsum(N * P_n), where P is split column-wise into n submatrices and "++" is concatenation.
I am working with 128 GB RAM.
I gave it a try but ended up with a bad malloc (I'm only running with 8 GB on Win8). I set up my main() and used a non-inlined colsum function I wrote.
#include <iostream>
#include <vector>
#include <Eigen/Dense>
#include <Eigen/Sparse>
using namespace Eigen;

VectorXd colsum(const Eigen::SparseMatrix<double> &sparse, const Eigen::MatrixXd &dense); // defined below

int main(int argc, char *argv[])
{
    Eigen::MatrixXd dense = Eigen::MatrixXd::Random(1000, 100000);
    Eigen::SparseMatrix<double> sparse(100000, 1000);
    typedef Triplet<int> Trip;
    std::vector<Trip> trps(dense.rows());
    for(int i = 0; i < dense.rows(); i++)
    {
        trps[i] = Trip(20*i, i, 2);
    }
    sparse.setFromTriplets(trps.begin(), trps.end());

    VectorXd res = colsum(sparse, dense);
    std::cout << res;
    std::cin >> argc;
    return 0;
}
The attempt was simply:
__declspec(noinline) VectorXd
colsum(const Eigen::SparseMatrix<double> &sparse, const Eigen::MatrixXd &dense)
{
    return (sparse * dense).colwise().sum();
}
That had a bad malloc. So it looks like you have to split it up manually on your own (unless someone else has a better solution).
EDIT
I improved the function a bit, but I get the same bad malloc:
__declspec(noinline) VectorXd
colsum(const Eigen::SparseMatrix<double> &sparse, const Eigen::MatrixXd &dense)
{
    return (sparse * dense).topRows(4).colwise().sum();
}
EDIT 2
Another option would be to make the sparse matrix dense and force lazy evaluation. I don't think that would work with a sparse matrix (oh well).
__declspec(noinline) VectorXd
colsum(const Eigen::SparseMatrix<double> &sparse, const Eigen::MatrixXd &dense)
{
    Eigen::MatrixXd denseSparse(sparse);
    return denseSparse.lazyProduct(dense).colwise().sum();
}
This doesn't give me the bad malloc, but computes a lot of pointless 0*x_i expressions.
To answer your question: especially when products are involved, Eigen often evaluates parts of expressions into temporaries. In some situations this could be optimized but is not implemented yet; in other cases this is essentially the most efficient way to do it.
However, in your case you could simply calculate the colsum of N (a 1 x 2500 vector) and multiply that by P.
Maybe future versions of Eigen will be able to make this kind of optimization themselves, but most of the time it is a good idea to make problem-specific optimizations oneself before letting the computer do the rest of the work.
Btw: I'm afraid sparse.colwise() is not implemented yet, so you must compute that manually. If you are lazy, you can instead compute Eigen::RowVectorXd Nsum = Eigen::RowVectorXd::Ones(N.rows())*N; (I have not checked it, but this might actually get optimized to near-optimal code with the most recent versions of Eigen).
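A minimal sketch of that suggestion, relying on the identity colsum(N * P) = (1^T N) * P so the huge product is never formed (the function name is a placeholder; N and P follow the question's dimensions):

#include <Eigen/Dense>
#include <Eigen/Sparse>

// colsum(N * P) without materializing N * P:
// reduce N to its column sums first (1 x cols(N)), then multiply by P.
Eigen::RowVectorXd colsumOfProduct(const Eigen::SparseMatrix<double>& N,
                                   const Eigen::MatrixXd& P)
{
    Eigen::RowVectorXd Nsum = Eigen::RowVectorXd::Ones(N.rows()) * N;  // column sums of N
    return Nsum * P;  // 1 x cols(P) result, tiny compared to the ~10 TB product
}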

Use Eigen library to perform sparseLU and display L & U?

I'm new to Eigen and I'm working on a sparse LU problem.
I found that if I create a vector b(n), Eigen can compute x(n) for the equation Ax = b.
Questions:
How do I display L and U, the factorization result of the original matrix A?
How do I insert non-zeros in Eigen? Right now I am just testing with small sparse matrices, so I insert non-zeros one by one; but if I have a large-scale matrix, how can I load the matrix into my program?
I realize that this question was asked a long time ago. Apparently, referring to the Eigen documentation (SparseLU::matrixL()), what you get back is:
an expression of the matrix L, internally stored as supernodes. The only operation available with this expression is the triangular solve.
So there is no way to actually convert this to an actual sparse matrix to display it. Eigen::FullPivLU performs a dense decomposition and is of no use to us here. Using it on a large sparse matrix, we would quickly run out of memory while trying to convert it to dense, and the time required to compute the factorization would increase by several orders of magnitude.
An alternative solution is to use the CSparse library from SuiteSparse:
extern "C" { // we are in C++ now, since you are using Eigen
#include <csparse/cs.h>
}
const cs *p_matrix = ...; // perhaps possible to use Eigen::internal::viewAsCholmod()
css *p_symbolic_decomposition;
csn *p_factor;
p_symbolic_decomposition = cs_sqr(2, p_matrix, 0); // 1 = ordering A + AT, 2 = ATA
p_factor = cs_lu(p_matrix, m_p_symbolic_decomposition, 1.0); // tol = 1.0 for ATA ordering, or use A + AT with a small tol if the matrix has amostly symmetric nonzero pattern and large enough entries on its diagonal
// calculate ordering, symbolic decomposition and numerical decomposition
cs *L = p_factor->L, *U = p_factor->U;
// there they are (perhaps can use Eigen::internal::viewAsEigen())
cs_sfree(p_symbolic_decomposition); cs_nfree(p_factor);
// clean up (deletes the L and U matrices)
Note that although this does not use explicit vectorization as some Eigen functions do, it is still fairly fast. CSparse is also very compact: it is just a single header and about thirty .c files with no external dependencies, so it is easy to incorporate into any C++ project. There is no need to include all of SuiteSparse.
If you apply Eigen::FullPivLU::matrixLU() to the original matrix, you'll receive the LU decomposition matrix. To display L and U separately, you can use the method triangularView<mode>. In the Eigen wiki you can find a good example of it (see also the sketch after this answer). Inserting nonzeros into matrices depends on the numbers you want to insert. Eigen has convenient syntax, so you can easily insert values in a loop:
for(int i = 0; i < size; i++)
{
    for(int j = size; j > someNumber; j--)
    {
        matrix(i,j) = yourClass.getNextNumber();
    }
}
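Here is a minimal sketch of that triangularView approach for a small dense matrix, following the pattern shown in Eigen's FullPivLU documentation (as the other answer points out, this dense route does not scale to large sparse problems):

#include <Eigen/Dense>
#include <iostream>

int main()
{
    const int n = 4;
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(n, n);
    Eigen::FullPivLU<Eigen::MatrixXd> lu(A);

    // matrixLU() stores L (unit diagonal, strictly lower part) and U packed together.
    Eigen::MatrixXd L = Eigen::MatrixXd::Identity(n, n);
    L.triangularView<Eigen::StrictlyLower>() = lu.matrixLU();
    Eigen::MatrixXd U = lu.matrixLU().triangularView<Eigen::Upper>();

    std::cout << "L:\n" << L << "\nU:\n" << U << "\n";
    // Sanity check: A should equal P^-1 * L * U * Q^-1.
    std::cout << "reconstruction error: "
              << (lu.permutationP().inverse() * L * U * lu.permutationQ().inverse() - A).norm()
              << "\n";
    return 0;
}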

Can I solve a system of linear equations, in the form Ax = b with A being sparse, using Eigen?

I need to convert a MATLAB code into C++, and I'm stuck with this instruction:
a = K\F
where K is a sparse matrix of size n x n and F is a column vector of size n.
I know it's easy to solve that using the Eigen library - I have tried the fullPivLu() method, and I've been able to build a working snippet using a Matrix and a Vector.
However, my K is a SparseMatrix<double> (while F is a VectorXd). My declarations:
SparseMatrix<double> K(nec, nec);
VectorXd F(nec);
and it seems that SparseMatrix doesn't have the fullPivLu() method, nor the lu() one.
I've tried, in fact, these two different approaches, taken from the documentation:
//1.
MatrixXd x = K.fullPivLu().solve(F);
//2.
VectorXf x;
K.lu().solve(F, &x);
They don't work, because fullPivLu() and lu() are not members of 'Eigen::SparseMatrix<_Scalar>'
So, I am asking: is there a way to solve a system of linear equations (MATLAB's mldivide, or '\') using Eigen in C++, with K being a sparse matrix?
Thank you for any help.
Would Eigen::SparseLU work for you?
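A minimal sketch of how that could look, following Eigen's SparseLU documentation (the function name is a placeholder; filling K and F is left out):

#include <stdexcept>
#include <Eigen/Sparse>
#include <Eigen/SparseLU>

// Sparse equivalent of MATLAB's a = K \ F.
Eigen::VectorXd solveSparse(Eigen::SparseMatrix<double>& K, const Eigen::VectorXd& F)
{
    K.makeCompressed();  // SparseLU expects compressed storage

    Eigen::SparseLU<Eigen::SparseMatrix<double>> solver;  // default COLAMD ordering
    solver.analyzePattern(K);  // symbolic factorization (reusable for the same sparsity pattern)
    solver.factorize(K);       // numerical factorization
    if (solver.info() != Eigen::Success)
        throw std::runtime_error("SparseLU factorization failed");

    return solver.solve(F);
}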

C++ invert matrix

The following dynamic array contains a non-symmetric n*n matrix (with n <=100):
int **matrix;
matrix = new int*[n];
for (int i = 0; i < n; i++)
    matrix[i] = new int[n];
Is there an extremely easy way to invert it? Ideally I'd only use something from the STL or download a single header file.
Using Eigen.
http://eigen.tuxfamily.org/index.php?title=Main_Page
You can map your array to an Eigen matrix and then perform efficient matrix inversion.
You only need to include it, since Eigen is header-only.
I would add that if you are performing the inversion in order to solve a linear system, it is usually better to use a matrix decomposition that exploits the properties of the matrix:
http://eigen.tuxfamily.org/dox/TutorialLinearAlgebra.html
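A minimal sketch of that approach; since the question's int** rows are allocated separately (not one contiguous block), this version copies into an Eigen matrix rather than relying on Eigen::Map (the function name is a placeholder):

#include <Eigen/Dense>

// Copy the row-pointer matrix into an Eigen matrix and invert it.
// The inverse is returned as doubles; an integer matrix rarely has an integer inverse.
Eigen::MatrixXd invertMatrix(int** matrix, int n)
{
    Eigen::MatrixXd m(n, n);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            m(i, j) = static_cast<double>(matrix[i][j]);

    return m.inverse();  // for solving a system, prefer e.g. m.partialPivLu().solve(b)
}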
Not extremely easy, but it works: Numerical Recipes in C, page 48, using LU decomposition.