Armadillo C++ LU decomposition

I am using the Armadillo C++ library to solve linear systems of medium to large size (1000-5000 equations).
I have to solve several linear systems
Ax = b
in which A is always the same and b changes, so I would like to LU-factorize A only once and reuse the factorization with different b vectors. Unfortunately, I do not know how to perform this kind of operation in Armadillo.
What I did was just the LU factorization of the A matrix:
arma::mat A;
// ... fill the A matrix ...
arma::mat P,L,U;
arma::lu(L, U, P, A);
But now I would like to use the matrices P, L and U to solve several linear systems with different b vectors.
Could you help me please?

Since A = P.t()*L*U (where the equality is only approximate due to rounding errors), solving for x in P.t()*L*U*x = b requires permuting the rows of b and then performing forward and back substitution:
x = solve(trimatu(U), solve(trimatl(L), P*b) );
Because Armadillo lacks a true triangular solver and a fast way to permute rows, this procedure will not be very efficient compared with calling the relevant LAPACK routines directly.
The general advice is to avoid explicit LU decomposition in higher-level libraries such as Armadillo:
If all the different b's are known at the same time, store them as columns of a rectangular matrix B and call X = solve(A,B);
If the different b's become known one at a time, then precomputing AINV = A.i(); and evaluating x = AINV*b; for each one will be more efficient, provided the number of right-hand-side vectors is large enough. See this answer to a similar question.
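To make the "factorize once, then permute, forward-substitute and back-substitute" idea concrete, here is a minimal sketch in plain standard C++. It is not Armadillo code and not what Armadillo or LAPACK do internally; the names LuFactors, lu_factorize and lu_solve are made up for this illustration, and singular matrices are simply rejected by an assert.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Packed LU factors of a square matrix, with P*A = L*U (hypothetical helper type).
struct LuFactors {
    Matrix lu;                      // U on and above the diagonal, L (unit diagonal) below it
    std::vector<std::size_t> perm;  // row i of P*A is row perm[i] of A
};

// Factorize once: Gaussian elimination with partial pivoting, O(n^3).
LuFactors lu_factorize(Matrix a) {
    const std::size_t n = a.size();
    std::vector<std::size_t> perm(n);
    for (std::size_t i = 0; i < n; ++i) {
        assert(a[i].size() == n);  // the matrix must be square
        perm[i] = i;
    }
    for (std::size_t k = 0; k < n; ++k) {
        // Choose the largest entry in column k (at or below the diagonal) as pivot.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(a[i][k]) > std::fabs(a[p][k])) p = i;
        assert(a[p][k] != 0.0);  // singular matrices are not handled in this sketch
        std::swap(a[k], a[p]);
        std::swap(perm[k], perm[p]);
        for (std::size_t i = k + 1; i < n; ++i) {
            a[i][k] /= a[k][k];  // multiplier, stored in the L part
            for (std::size_t j = k + 1; j < n; ++j)
                a[i][j] -= a[i][k] * a[k][j];
        }
    }
    return {std::move(a), std::move(perm)};
}

// Reuse the factors for one right-hand side, O(n^2).
Vector lu_solve(const LuFactors& f, const Vector& b) {
    const std::size_t n = f.lu.size();
    assert(b.size() == n);
    Vector x(n);
    // Forward substitution: solve L*y = P*b (y is stored in x).
    for (std::size_t i = 0; i < n; ++i) {
        double s = b[f.perm[i]];
        for (std::size_t j = 0; j < i; ++j) s -= f.lu[i][j] * x[j];
        x[i] = s;
    }
    // Back substitution: solve U*x = y.
    for (std::size_t i = n; i-- > 0;) {
        double s = x[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= f.lu[i][j] * x[j];
        x[i] = s / f.lu[i][i];
    }
    return x;
}
```

You would call lu_factorize(A) a single time and then lu_solve(factors, b) for every new b, so only the O(n^2) substitution step is repeated. For real workloads a tuned LAPACK-backed routine will almost certainly be faster than hand-written loops like these.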

Related

Preallocating memory for sparse matrix operations in Eigen

I'm writing a program using Eigen (an optimizer to be precise) that has a "hot loop" within which I'd like to avoid memory allocation whenever possible.
When using Eigen/Dense I've been able to organize things so that all my memory allocation is performed in the constructor (with judicious usage of .noalias() and inplace LLT factorizations) by computing the sizes of all the workspaces I need in advance and preallocating.
I'd like to do something similar with the Eigen/Sparse variant of the code (to the extent that it is possible). The constructor will require the user to supply sparsity patterns for all the data which I can use to determine the sizes and sparsity patterns for all the subsequent matrices I need, workspaces included. I need to perform 4 kinds of operations:
Sparse matrix-vector products
Cholesky factorizations
Schur complements of the form E = H + A' * B * A where B is a diagonal matrix and H,A are general sparse matrices.
My current understanding is as follows:
I can use x.noalias() = A*y (A is sparse, x,y dense vectors) for matrix vector products with no problems
I can perform matrix addition using coefficient-wise tricks by padding with explicit zeros, as in e.g. "How to avoid memory allocations in sparse expressions with Eigen". This is mostly for operations like B = A + s*I, where A is sparse and s*I is a scalar multiple of the identity. I can just make sure the main diagonal is included in the sparsity pattern of B and perform the addition in a loop.
As of 2017, there is no way to avoid temporary allocation in sparse matrix products, e.g. C = A*B (https://forum.kde.org/viewtopic.php?f=74&t=140031&p=376933&hilit=sparse+memory#p376933), and I don't see any in-place functions for sparse matrices yet, so I'll have to bite the bullet on this one and accept the temporary creation.
For licensing reasons, I'm using an LDL' factorization package external to Eigen, which allows me to preallocate based on a symbolic analysis stage.
Can anyone suggest a fast way to organize the Schur computation E = H + A' * B * A? (exploiting that I know its sparsity structure in advance)

Eigen - Sparse matrix solvers surprisingly slow for simple case

I'm trying to solve a linear system for a finite element case:
d = F/K
where K is a 25,000x25,000 sparse matrix with 360,000 non-zero values (only about 0.05% of all entries) and F is a 25,000x1 vector filled mostly with zeros.
Solving this system is taking a surprising amount of time:
Sparse Solver        Compute + Solve Time
ConjugateGradient    70.2s
BiCGSTAB             40.2s
SimplicialLDLT       40.1s
SimplicialCholesky   32.9s
SimplicialLLT        29.0s
Where the solvers are used in a standard way e.g.
VectorXd F_vector(25000);
// fill F_vector
VectorXd d_vector(25000);
BiCGSTAB<SparseMatrix<double> > solver;
solver.compute(K_sparse);
d_vector = solver.solve(F_vector);
(Also, a 10% increase in the size of the dimensions resulted in a 10%^2 increase in the time. I'm not sure if that's noteworthy, but I'll state it in case it is.)
Is there something wrong with my implementation or understanding of these sparse solvers?

Armadillo complex sparse matrix inverse

I'm writing a program with Armadillo C++ (4.400.1)
I have a matrix that has to be sparse and complex, and I want to calculate the inverse of this matrix. Since it is sparse it could be the pseudoinverse, but I can guarantee that the matrix has a full diagonal.
The Armadillo API documentation mentions the method .i() for calculating the inverse of any matrix, but sp_cx_mat has no such member, and the inv() and pinv() functions apparently cannot handle the sp_cx_mat type.
sp_cx_mat Y;
/*Fill Y ensuring that the diagonal is full*/
sp_cx_mat Z = Y.i();
or
sp_cx_mat Z = inv(Y);
None of them work.
I would like to know how to compute the inverse of matrices of sp_cx_mat type.
Sparse matrix support in Armadillo is not complete, and many of the factorizations and other complex operations that are available for dense matrices are not available for sparse matrices. There are a number of reasons for this, the largest being that efficient complex operations such as factorizations of sparse matrices are still very much an open research field. So there is no .i() function available for sp_cx_mat or the other sp_mat types. Another reason is lack of time on the part of the sparse matrix developers (...which includes me).
Given that the inverse of a sparse matrix is generally going to be dense, you may simply be better off converting your sp_cx_mat into a cx_mat and then using the same inversion techniques that you normally would for dense matrices. Since you are planning to represent the result as a dense matrix anyway, it's a fair assumption that you have enough RAM to do that.

Largest eigenvalues (and corresponding eigenvectors) in C++

What is the easiest and fastest way (with some library, of course) to compute k largest eigenvalues and eigenvectors for a large dense matrix in C++? I'm looking for an equivalent of MATLAB's eigs function; I've looked through Armadillo and Eigen but couldn't find one, and computing all eigenvalues takes forever in my case (I need top 10 eigenvectors for an approx. 30000x30000 dense non-symmetric real matrix).
Desperate, I've even tried to implement power iterations by myself with Armadillo's QR decomposition but ran into complex pairs of eigenvalues and gave up. :)
Did you try https://github.com/yixuan/spectra ?
It is similar to ARPACK but with a nice Eigen-like interface (it is compatible with Eigen!)
I used it for 30kx30k matrices (PCA) and it was quite OK.
AFAIK the problem of finding the first k eigenvalues of a generic matrix has no easy solution. The MATLAB function eigs you mentioned is supposed to work with sparse matrices.
MATLAB probably uses Arnoldi/Lanczos; you might check whether it works decently in your case even if your matrix is not sparse. The reference package for Arnoldi is ARPACK, which has a C++ interface.
Here is how I get the k largest eigenvectors of a NxN real-valued (float), dense, symmetric matrix A in C++ Eigen:
#include <Eigen/Dense>
...
Eigen::MatrixXf A(N,N);
...
Eigen::SelfAdjointEigenSolver<Eigen::MatrixXf> solver(N);
solver.compute(A);
Eigen::VectorXf lambda = solver.eigenvalues().reverse();
Eigen::MatrixXf X = solver.eigenvectors().block(0,N-k,N,k).rowwise().reverse();
Note that the eigenvalues and associated eigenvectors are returned in ascending order so I reverse them to get the largest values first.
If you want eigenvalues and eigenvectors for other (non-symmetric) matrices they will, in general, be complex and you will need to use the Eigen::EigenSolver class instead.
Eigen has an Eigenvalues module that works pretty well. But I've never used it on anything quite that large.

Efficient solution of linear system Ax = b when only one of the constant terms changes

How does one solve a large system of linear equations efficiently when only a few of the constant terms change? For example:
I currently have the system Ax = b. I compute the inverse of A once, store it in a matrix, and each time any entry of b is updated I perform a matrix-vector multiplication A^-1 b to recompute x.
This is inefficient, as only a couple of entries of b will have been updated. Are there more efficient ways of solving this system when A^-1 remains constant but specific known values change in b?
I use uBlas and Eigen, but I am not aware of solutions that would address this problem of selective recalculation. Thanks for any guidance.
Compute A^-1. If b_i is the ith component of b, then examine d/db_i (A^-1 b), the derivative of A^-1 b with respect to the ith component of b: it equals a column of A^-1 (in particular, the ith column). And derivatives of linear functions are constant over their domain. So if you have b and b', and they differ only in the ith component, then A^-1 b - A^-1 b' = [d/db_i (A^-1 b)] * (b-b')_i. For multiple components, just add them up (as A^-1 is linear).
Or, in short, you can calculate A^-1 (b'-b) with some optimizations for input components that are zero (which, if only some components change, will be most of the components): A^-1 b' = A^-1 (b'-b) + A^-1 b. And if you know that only some components will change, you can take a copy of the appropriate column of A^-1, then multiply it by the change in that component of b.
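As a rough illustration of the column trick, the sketch below updates an existing solution x in place when a few entries of b change. It is plain standard C++ with dense row-major storage; the function name apply_rhs_changes and the (index, delta) pair representation are assumptions made for this example, with delta meaning b'_i - b_i.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Update x = A^-1 * b in place after a few entries of b have changed.
// a_inv   : precomputed inverse of A (row-major)
// changes : (index i, delta_i) pairs, where delta_i = b'_i - b_i
// x       : on entry A^-1 * b, on exit A^-1 * b'
void apply_rhs_changes(const Matrix& a_inv,
                       const std::vector<std::pair<std::size_t, double>>& changes,
                       Vector& x) {
    const std::size_t n = a_inv.size();
    assert(x.size() == n);
    for (const auto& change : changes) {
        const std::size_t i = change.first;
        const double delta = change.second;
        assert(i < n);
        // Add delta times column i of A^-1 to x.
        for (std::size_t r = 0; r < n; ++r)
            x[r] += a_inv[r][i] * delta;
    }
}
```

With k changed entries this does about k*n multiply-adds instead of the n^2 of a full matrix-vector product. Note that the columns of a row-major matrix are strided in memory, so storing A^-1 column-major (or transposed) may be kinder to the cache.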
You can take advantage of the problem's linearity:
x0 = A^-1*b0
x = A^-1*b = x0 + A^-1*db
where db is the difference between b and b0. It is mostly zeros, so you can compress it into a sparse vector.
The Eigen lib has a lot of cool functions for sparse matrices (multiplication, inverse, ...).
Firstly, don't perform a matrix inversion, use a solver library instead. Secondly, pass your initial x to the library as a first guess.
The library will perform some kind of decomposition like LU, and use that to calculate x. If you choose an iterative solver, then it is already doing pretty much what you describe to home in on the solution; it will begin with a worse guess and generate a better one, and any good routine will take an initial guess to speed up the process. In many circumstances you have a good idea of the result anyway, so it makes sense to exploit that.
If the new b is near the old b, then the new x should be near the old x, and it will serve as a good initial guess.
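To show what "taking an initial guess" looks like in an iterative solver, here is a small conjugate gradient sketch in plain standard C++. Conjugate gradient is just one example of such a method and only applies when A is symmetric positive definite, which the question does not state; the function name, tolerance and dense storage are assumptions for the illustration, not any library's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

static double dot(const Vector& u, const Vector& v) {
    double s = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) s += u[i] * v[i];
    return s;
}

static Vector multiply(const Matrix& a, const Vector& v) {
    Vector out(a.size(), 0.0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j) out[i] += a[i][j] * v[j];
    return out;
}

// Conjugate gradient for a symmetric positive definite matrix A.
// x holds the initial guess on entry and the approximate solution on exit.
// Returns the number of iterations that were needed.
std::size_t conjugate_gradient(const Matrix& a, const Vector& b, Vector& x,
                               double tol = 1e-10, std::size_t max_iter = 1000) {
    const std::size_t n = a.size();
    assert(b.size() == n && x.size() == n);
    // Initial residual r = b - A*x; it is small when the guess is good.
    Vector r = b;
    const Vector ax = multiply(a, x);
    for (std::size_t i = 0; i < n; ++i) r[i] -= ax[i];
    Vector p = r;
    double rr = dot(r, r);
    std::size_t iterations = 0;
    while (iterations < max_iter && std::sqrt(rr) > tol) {
        const Vector ap = multiply(a, p);
        const double alpha = rr / dot(p, ap);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
        }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
        ++iterations;
    }
    return iterations;
}
```

Starting from a zero vector the solver has to do all the work, while starting from a guess that already satisfies the system to within the tolerance returns immediately with zero iterations. How much a merely "close" guess helps depends on the matrix and the method, so it is worth measuring on your own problem.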
First, don't compute the matrix inverse; use the LU decomposition instead, or the QR decomposition (slower than LU but more stable). Such decompositions scale better than inversion performance-wise as the matrix size grows, and are usually more stable (especially QR).
There are ways to update the QR decomposition if A changes slightly (e.g. by a rank-one matrix), but if b changes, you have to solve again with the new b. You cannot escape this, and it costs O(n^2).
However, if the right-hand side b only changes by a fixed increment, i.e. b' = b + db with db known in advance, you can solve A dx = db once and for all, and the solution x' of Ax' = b' is then x + dx.
If db is not known in advance but is always a linear combination of a few db_i vectors, you may solve A dx_i = db_i for each of them, but if you have many such db_i, you end up with an O(n^2) process (this in fact amounts to computing the inverse)...