C++ eigenvalue/vector decomposition, only need first n vectors fast

I have a ~3000x3000 covariance-like matrix on which I compute the eigenvalue-eigenvector decomposition (it's an OpenCV matrix, and I use cv::eigen() to get the job done).
However, I actually only need, say, the first 30 eigenvalues/vectors; I don't care about the rest. Theoretically, this should allow the computation to be sped up significantly, right? After all, it means there are 2970 fewer eigenvectors to compute.
Which C++ library will allow me to do that? Please note that OpenCV's eigen() method does have parameters for that, but the documentation says they are ignored, and I tested it myself: they are indeed ignored :D
UPDATE:
I managed to do it with ARPACK. I compiled it for Windows and got it working. The results look promising; an illustration can be seen in this toy example:
#include <iostream>
#include "ardsmat.h"
#include "ardssym.h"

int main()
{
    int n = 3;              // Dimension of the problem.
    double* EigVal = NULL;  // Eigenvalues.
    double* EigVec = NULL;  // Eigenvectors stored sequentially.
    int lowerHalfElementCount = (n*n + n) / 2;

    // Whole matrix:
    /*
        2  3  8
        3  9 -7
        8 -7 19
    */
    double* lower = new double[lowerHalfElementCount]; // lower half of the matrix,
    // to be filled column major (i.e. one column after the other, always starting
    // from the diagonal element)
    lower[0] = 2; lower[1] = 3; lower[2] = 8; lower[3] = 9; lower[4] = -7; lower[5] = 19;

    // Params: dimension (i.e. width/height), array with the values of the lower or
    // upper half (sequentially, column major as described above), 'L' or 'U' for
    // lower or upper.
    ARdsSymMatrix<double> mat(n, lower, 'L');

    // Defining the eigenvalue problem.
    int noOfEigVecValues = 2;
    //int maxIterations = 50000000;
    //ARluSymStdEig<double> dprob(noOfEigVecValues, mat, "LM", 0, 0.5, maxIterations);
    ARluSymStdEig<double> dprob(noOfEigVecValues, mat);

    // Finding eigenvalues and eigenvectors.
    int converged = dprob.EigenValVectors(EigVec, EigVal);

    for (int eigValIdx = 0; eigValIdx < noOfEigVecValues; eigValIdx++) {
        std::cout << "Eigenvalue: " << EigVal[eigValIdx] << "\nEigenvector: ";
        for (int i = 0; i < n; i++) {
            int idx = n*eigValIdx + i;
            std::cout << EigVec[idx] << " ";
        }
        std::cout << std::endl;
    }
    delete[] lower;
    return 0;
}
The results are:
9.4298, 24.24059
for the eigenvalues, and
-0.523207, -0.83446237, -0.17299346
0.273269, -0.356554, 0.893416
for the two eigenvectors, respectively (one eigenvector per row).
The code fails if asked to find all 3 eigenvectors (it can only find 1 or 2 in this case; an assert() makes sure of that), but that's not a problem for me.

In this article, Simon Funk shows a simple, effective way to estimate a singular value decomposition (SVD) of a very large matrix. In his case, the matrix is sparse, with dimensions: 17,000 x 500,000.
Now, looking here, which describes how eigenvalue decomposition is closely related to SVD, you might benefit from considering a modified version of Simon Funk's approach, especially if your matrix is sparse. Furthermore, your matrix is not only square but also symmetric (if that is what you mean by covariance-like), which likely allows additional simplification.
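Building on that idea: for a symmetric matrix you can already see why a few leading eigenpairs are much cheaper than a full decomposition. Below is a minimal sketch (my illustration, not Funk's actual SGD-based method; all names are made up) of power iteration with deflation, which finds the k eigenpairs of largest magnitude using nothing but matrix-vector products:
#include <iostream>
#include <Eigen/Dense>

// Sketch: top-k eigenpairs (largest magnitude) of a symmetric matrix S
// via power iteration plus deflation. Robust libraries (ARPACK, Spectra)
// use Lanczos/Arnoldi instead, but the cost structure is similar.
void topKEigenPairs(const Eigen::MatrixXd& S, int k,
                    Eigen::VectorXd& evals, Eigen::MatrixXd& evecs)
{
    const int n = static_cast<int>(S.rows());
    Eigen::MatrixXd A = S;  // working copy, deflated in place
    evals.resize(k);
    evecs.resize(n, k);
    for (int j = 0; j < k; ++j) {
        Eigen::VectorXd v = Eigen::VectorXd::Random(n).normalized();
        for (int it = 0; it < 1000; ++it) {
            Eigen::VectorXd w = A * v;      // the only O(n^2) step
            if (w.norm() == 0.0) break;     // degenerate direction
            w.normalize();
            // Stop when the direction is stable (up to sign).
            if ((w - v).norm() < 1e-12 || (w + v).norm() < 1e-12) { v = w; break; }
            v = w;
        }
        const double lambda = v.dot(S * v); // Rayleigh quotient
        evals(j) = lambda;
        evecs.col(j) = v;
        A -= lambda * v * v.transpose();    // deflate this component
    }
}
Funk's method instead fits rank-1 factors by stochastic gradient descent over the known entries, which is what makes it attractive for huge sparse matrices; the sketch above only illustrates why the first few eigenpairs can be obtained without ever computing the remaining ones.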
... Just an idea :)

It seems that Spectra will do the job with good performance.
Here is an example from their documentation that computes the 3 largest eigenvalues of a dense symmetric matrix M (like your covariance matrix):
#include <Eigen/Core>
#include <Spectra/SymEigsSolver.h>
// <Spectra/MatOp/DenseSymMatProd.h> is implicitly included
#include <iostream>

using namespace Spectra;

int main()
{
    // We are going to calculate the eigenvalues of M
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(10, 10);
    Eigen::MatrixXd M = A + A.transpose();

    // Construct matrix operation object using the wrapper class DenseSymMatProd
    DenseSymMatProd<double> op(M);

    // Construct eigen solver object, requesting the largest three eigenvalues
    SymEigsSolver< double, LARGEST_ALGE, DenseSymMatProd<double> > eigs(&op, 3, 6);

    // Initialize and compute
    eigs.init();
    int nconv = eigs.compute();

    // Retrieve results
    Eigen::VectorXd evalues;
    if (eigs.info() == SUCCESSFUL)
        evalues = eigs.eigenvalues();

    std::cout << "Eigenvalues found:\n" << evalues << std::endl;

    return 0;
}

Related

Eigen: Obtain the kernel of a sparse matrix

Given a sparse matrix A and a vector b, I would like to obtain a solution x to the equation A * x = b as well as the kernel of A.
One possibility is to convert A to a dense representation.
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/SparseQR>

int main()
{
    // This is a toy problem. My actual matrix
    // is of course bigger and sparser.
    Eigen::SparseMatrix<double> A(2, 2);
    A.insert(0, 0) = 1;
    A.insert(0, 1) = 2;
    A.insert(1, 0) = 4;
    A.insert(1, 1) = 8;
    A.makeCompressed();

    Eigen::Vector2d b;
    b << 3, 12;

    Eigen::SparseQR<Eigen::SparseMatrix<double>,
                    Eigen::COLAMDOrdering<int> > solver;
    solver.compute(A);
    std::cout << "Solution:\n" << solver.solve(b) << std::endl;

    Eigen::Matrix2d A_dense(A);
    std::cout << "Kernel:\n" << A_dense.fullPivLu().kernel() << std::endl;

    return 0;
}
Is it possible to do the same directly in the sparse representation? I could not find a function kernel() anywhere except in FullPivLU.
I think @chtz's answer is almost correct, except we need to take the last A.cols() - qr.rank() columns. Here is a mathematical derivation.
Say we do a QR decomposition of your matrix Aᵀ as
Aᵀ * P = [Q₁ Q₂] * [R; 0] = Q₁ * R
where P is the permutation matrix, thus
Aᵀ = Q₁ * R * P⁻¹.
We can see that Range(Aᵀ) = Range(Q₁ * R * P⁻¹) = Range(Q₁) (because both P and R are invertible).
Since Aᵀ and Q₁ have the same range space, this implies that A and Q₁ᵀ will also have the same null space, namely Null(A) = Null(Q₁ᵀ). (Here we use the property that Range(M) and Null(Mᵀ) are complements to each other for any matrix M, hence Null(A) = complement(Range(Aᵀ)) = complement(Range(Q₁)) = Null(Q₁ᵀ)).
On the other hand, since the matrix [Q₁ Q₂] is orthonormal, Null(Q₁ᵀ) = Range(Q₂), thus Null(A) = Range(Q₂), i.e., the columns of Q₂ span the kernel of A.
Since Q₂ consists of the rightmost A.cols() - qr.rank() columns of Q, you can call rightCols(A.cols() - qr.rank()) on Q to retrieve the kernel of A.
For more information on the kernel (null space), see https://en.wikipedia.org/wiki/Kernel_(linear_algebra)
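Putting the derivation into code, here is a minimal sketch (my illustration; it assumes Eigen's SparseQR, and that assigning qr.matrixQ() to a sparse matrix materializes Q, which current Eigen supports):
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/SparseQR>

int main()
{
    Eigen::SparseMatrix<double> A(2, 2);
    A.insert(0, 0) = 1;
    A.insert(0, 1) = 2;
    A.insert(1, 0) = 4;
    A.insert(1, 1) = 8;
    A.makeCompressed();

    // QR-decompose the transpose: A^T * P = Q * R
    Eigen::SparseMatrix<double> At = A.transpose();
    Eigen::SparseQR<Eigen::SparseMatrix<double>,
                    Eigen::COLAMDOrdering<int> > qr(At);

    // Materialize Q, then keep its rightmost cols(A) - rank columns (= Q2)
    Eigen::SparseMatrix<double> Q;
    Q = qr.matrixQ();
    Eigen::MatrixXd Qd(Q);
    Eigen::MatrixXd kernel = Qd.rightCols(A.cols() - qr.rank());

    std::cout << "Kernel of A:\n" << kernel << std::endl;
    return 0;
}
For this toy matrix (rank 1), the result is a single column proportional to (-2, 1), as expected.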

Increase precision in SelfAdjointEigenSolver in Eigen

I am trying to determine the eigenvalues and eigenvectors of a sparse array in Eigen. Since I need to compute all the eigenvectors and eigenvalues, and I could not get the unsupported ArpackSupport module working, I chose to convert the system to a dense matrix and compute the eigensystem using SelfAdjointEigenSolver (I know my matrix is real and has real eigenvalues). This works well for matrices up to size 1024x1024, but beyond that I start getting deviations from the expected results.
From what I understood of the documentation of this module (https://eigen.tuxfamily.org/dox/classEigen_1_1SelfAdjointEigenSolver.html), it is possible to change the maximum number of iterations:
static const int m_maxIterations
Maximum number of iterations.
The algorithm terminates if it does not converge within m_maxIterations * n iterations, where n denotes the size of the matrix. This value is currently set to 30 (copied from LAPACK).
However, I do not understand how to change this value in practice, given their example:
SelfAdjointEigenSolver<Matrix4f> es;
Matrix4f X = Matrix4f::Random(4,4);
Matrix4f A = X + X.transpose();
es.compute(A);
cout << "The eigenvalues of A are: " << es.eigenvalues().transpose() << endl;
es.compute(A + Matrix4f::Identity(4,4)); // re-use es to compute eigenvalues of A+I
cout << "The eigenvalues of A+I are: " << es.eigenvalues().transpose() << endl;
How would you modify it in order to change the maximum number of iterations?
Additionally, will this solve my problem or should I try to find an alternative function or algorithm to solve the eigensystem?
My thanks in advance.
Increasing the number of iterations is unlikely to help. On the other hand, moving from float to double will help a lot!
If that does not help, please, be more specific on "deviations from the expected results".
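For concreteness, a minimal sketch of that switch: the only change needed is to use the double-valued types (MatrixXd instead of MatrixXf, or Matrix4d instead of Matrix4f), and the solver then runs entirely in double precision:
#include <iostream>
#include <Eigen/Dense>

int main()
{
    // Random symmetric test matrix, built directly in double precision.
    Eigen::MatrixXd X = Eigen::MatrixXd::Random(1024, 1024);
    Eigen::MatrixXd A = X + X.transpose();

    // MatrixXd instead of MatrixXf: everything below is done in double.
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(A);
    std::cout << "Smallest eigenvalue: " << es.eigenvalues()(0) << std::endl;
    return 0;
}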
m_maxIterations is a static const int variable, and as such it can be considered an intrinsic property of the type. Changing such a type property usually would be done via a specific template parameter. In this case, however, it is set to the constant number 30, so it's not possible.
Therefore, your only choice is to change the value in the header file and recompile your program.
However, before doing that, I would try the singular value decomposition. According to the homepage, its accuracy is "Excellent-Proven". Moreover, it can overcome problems caused by matrices that are not perfectly symmetric numerically.
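A hedged sketch of that suggestion, using Eigen's JacobiSVD (the decomposition whose accuracy Eigen's tables rate "Excellent-Proven"). Keep in mind that for a symmetric matrix the singular values are the absolute values of the eigenvalues, so a sign still has to be recovered, e.g. by comparing the corresponding U and V columns:
#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::MatrixXd X = Eigen::MatrixXd::Random(100, 100);
    Eigen::MatrixXd A = X + X.transpose();  // symmetric test matrix

    Eigen::JacobiSVD<Eigen::MatrixXd> svd(A, Eigen::ComputeThinU | Eigen::ComputeThinV);

    // For symmetric A, u_i = +/- v_i; the sign of u_i . v_i is the sign of
    // the eigenvalue whose magnitude is the i-th singular value.
    for (int i = 0; i < 3; ++i) {
        double sign = svd.matrixU().col(i).dot(svd.matrixV().col(i)) > 0 ? 1.0 : -1.0;
        std::cout << "eigenvalue " << i << " = "
                  << sign * svd.singularValues()(i) << "\n";
    }
    return 0;
}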
I solved the problem by implementing the Jacobi algorithm, adapted from the book Numerical Recipes:
#include <cmath>
#include <Eigen/Dense>
using namespace Eigen;

void ROTATy(MatrixXd &a, int i, int j, int k, int l, double s, double tau)
{
    double g, h;
    g = a(i,j);
    h = a(k,l);
    a(i,j) = g - s*(h + g*tau);
    a(k,l) = h + s*(g - h*tau);
}

void jacoby(int n, MatrixXd &a, MatrixXd &v, VectorXd &d)
{
    int j, iq, ip, i;
    double tresh, theta, tau, t, sm, s, h, g, c;

    VectorXd b(n);
    VectorXd z(n);
    v.setIdentity();
    z.setZero();

    for (ip = 0; ip < n; ip++)
    {
        d(ip) = a(ip,ip);
        b(ip) = d(ip);
    }

    for (i = 0; i < 50; i++)  // at most 50 sweeps
    {
        // Sum of the off-diagonal magnitudes; zero means convergence.
        sm = 0.0;
        for (ip = 0; ip < n-1; ip++)
            for (iq = ip+1; iq < n; iq++)
                sm += fabs(a(ip,iq));
        if (sm == 0.0)
            break;

        if (i < 3)
            tresh = 0.2*sm/(n*n);  // on the first three sweeps
        else
            tresh = 0.0;           // afterwards

        for (ip = 0; ip < n-1; ip++)
        {
            for (iq = ip+1; iq < n; iq++)
            {
                g = 100.0*fabs(a(ip,iq));
                // After four sweeps, skip the rotation if the off-diagonal
                // element is negligible.
                if (i > 3 && (fabs(d(ip)) + g) == fabs(d(ip))
                          && (fabs(d(iq)) + g) == fabs(d(iq)))
                    a(ip,iq) = 0.0;
                else if (fabs(a(ip,iq)) > tresh)
                {
                    h = d(iq) - d(ip);
                    if ((fabs(h) + g) == fabs(h))
                        t = a(ip,iq)/h;
                    else
                    {
                        theta = 0.5*h/a(ip,iq);
                        t = 1.0/(fabs(theta) + sqrt(1.0 + theta*theta));
                        if (theta < 0.0)
                            t = -t;
                    }
                    // Apply the rotation (in both branches above).
                    c = 1.0/sqrt(1 + t*t);
                    s = t*c;
                    tau = s/(1.0 + c);
                    h = t*a(ip,iq);
                    z(ip) -= h;
                    z(iq) += h;
                    d(ip) -= h;
                    d(iq) += h;
                    a(ip,iq) = 0.0;
                    for (j = 0; j < ip; j++)
                        ROTATy(a, j, ip, j, iq, s, tau);
                    for (j = ip+1; j < iq; j++)
                        ROTATy(a, ip, j, j, iq, s, tau);
                    for (j = iq+1; j < n; j++)
                        ROTATy(a, ip, j, iq, j, s, tau);
                    for (j = 0; j < n; j++)
                        ROTATy(v, j, ip, j, iq, s, tau);
                }
            }
        }

        // End of sweep: fold the accumulated corrections into d and reset z,
        // as in Numerical Recipes, to limit round-off error.
        for (ip = 0; ip < n; ip++)
        {
            b(ip) += z(ip);
            d(ip) = b(ip);
            z(ip) = 0.0;
        }
    }
}
The function jacoby receives the size n of the square matrix, the matrix a that we want to solve, a matrix v that will receive the eigenvectors in each column, and a vector d that will receive the eigenvalues. It is a bit slower, so I tried to parallelize it with OpenMP (see: Parallelization of Jacobi algorithm using eigen c++ using openmp), but for 4096x4096 matrices this did not improve the computation time, unfortunately. A usage sketch follows.
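For completeness, a quick usage sketch of jacoby (my illustration; it assumes the two functions above are in scope):
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;

int main()
{
    const int n = 4;
    MatrixXd X = MatrixXd::Random(n, n);
    MatrixXd a = X + X.transpose();  // symmetric input (modified in place)
    MatrixXd v(n, n);                // receives one eigenvector per column
    VectorXd d(n);                   // receives the eigenvalues

    jacoby(n, a, v, d);

    std::cout << "Eigenvalues:\n" << d.transpose() << "\n";
    std::cout << "Eigenvectors (columns):\n" << v << std::endl;
    return 0;
}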

Eigen - Check if matrix is Positive (Semi-)Definite

I'm implementing a spectral clustering algorithm and I have to ensure that a matrix (laplacian) is positive semi-definite.
A check if the matrix is positive definite (PD) is enough, since the "semi-" part can be seen in the eigenvalues. The matrix is pretty big (nxn where n is in the order of some thousands) so eigenanalysis is expensive.
Is there any check in Eigen that gives a bool result in runtime?
Matlab can give a result with the chol() method, throwing an exception if a matrix is not PD. Following this idea, Eigen returns a result for LLL.llt().matrixL() without complaining, although I was expecting some warning/error.
Eigen also has the method isPositive, but due to a bug it is unusable for systems with an old Eigen version.
You can use a Cholesky decomposition (LLT), whose info() method reports Eigen::NumericalIssue if the matrix is not positive definite; see the documentation.
Example below:
#include <Eigen/Dense>
#include <iostream>
#include <stdexcept>

int main()
{
    Eigen::MatrixXd A(2, 2);
    A << 1, 0, 0, -1; // non positive semi-definite matrix
    std::cout << "The matrix A is" << std::endl << A << std::endl;

    Eigen::LLT<Eigen::MatrixXd> lltOfA(A); // compute the Cholesky decomposition of A
    if (lltOfA.info() == Eigen::NumericalIssue)
    {
        throw std::runtime_error("Possibly non positive semi-definite matrix!");
    }
}
In addition to @vsoftco's answer, we should also check for matrix symmetry, since the definition of PD/PSD requires a symmetric matrix.
Eigen::LLT<Eigen::MatrixXd> A_llt(A);
if (!A.isApprox(A.transpose()) || A_llt.info() == Eigen::NumericalIssue) {
    throw std::runtime_error("Possibly non positive semi-definite matrix!");
}
This check is important, e.g. some Eigen solvers (like LDLT) require a PSD (or NSD) matrix as input. In fact, there exist non-symmetric (and hence non-PSD) matrices A that pass the A_llt.info() != Eigen::NumericalIssue test. Consider the following example (numbers taken from Jiuzhang Suanshu, Chapter 8, Problem 1):
Eigen::Matrix3d A;
Eigen::Vector3d b;
Eigen::Vector3d x;

// A is full rank and all its eigenvalues >= 0.
// However, A is not symmetric, thus not PSD.
A << 3, 2, 1,
     2, 3, 1,
     1, 2, 3;
b << 39, 34, 26;

// This alone doesn't check matrix symmetry, so it can't guarantee PSD.
Eigen::LLT<Eigen::Matrix3d> A_llt(A);
std::cout << (A_llt.info() == Eigen::NumericalIssue)
          << std::endl; // false, no issue detected

// The ldlt solver requires PSD: wrong answer.
x = A.ldlt().solve(b);
std::cout << x << std::endl; // Wrong solution [10.625, 1.5, 4.125]
std::cout << b.isApprox(A * x) << std::endl; // false

// ColPivHouseholderQR doesn't assume PSD: right answer.
x = A.colPivHouseholderQr().solve(b);
std::cout << x << std::endl; // Correct solution [9.25, 4.25, 2.75]
std::cout << b.isApprox(A * x) << std::endl; // true
Notes: to be more exact, one could apply the definition of PSD by checking that A is symmetric and that all of A's eigenvalues are >= 0. But as mentioned in the question, this can be computationally expensive; a sketch of such a check is given below.
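Here is a minimal sketch of that definition-based check (my illustration; it uses Eigen's SelfAdjointEigenSolver in eigenvalues-only mode and a small tolerance for round-off):
#include <Eigen/Dense>

// Definition-based PSD check: symmetric, and all eigenvalues >= 0
// (up to a tolerance). O(n^3), hence expensive for large matrices.
bool isPsdByDefinition(const Eigen::MatrixXd& A, double tol = 1e-12)
{
    if (!A.isApprox(A.transpose()))
        return false;
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(A, Eigen::EigenvaluesOnly);
    return es.eigenvalues().minCoeff() >= -tol;
}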
You have to test that the matrix is symmetric (A.isApprox(A.transpose())), then create the LDLT (and not LLT, because LDLT takes care of the case where one of the eigenvalues is 0, i.e. not strictly positive), then test for numerical issues and positiveness:
template <class MatrixT>
bool isPsd(const MatrixT& A) {
    if (!A.isApprox(A.transpose())) {
        return false;
    }
    const auto ldlt = A.template selfadjointView<Eigen::Upper>().ldlt();
    if (ldlt.info() == Eigen::NumericalIssue || !ldlt.isPositive()) {
        return false;
    }
    return true;
}
I tested this on

1 2
2 3

which has a negative eigenvalue (hence it is not PSD); without the isPositive() test, isPsd() incorrectly returns true here. I also tested it on

1 2
2 4

which has a null eigenvalue (hence it is PSD but not PD). A quick usage sketch follows.
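And here is that test as a minimal usage sketch (my illustration; it assumes the isPsd template above is in scope):
#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::Matrix2d indefinite, psd;
    indefinite << 1, 2,
                  2, 3;  // eigenvalues 2-sqrt(5) < 0 and 2+sqrt(5)
    psd        << 1, 2,
                  2, 4;  // eigenvalues 0 and 5: PSD but not PD

    std::cout << std::boolalpha
              << isPsd(indefinite) << "\n"  // prints false
              << isPsd(psd)        << "\n"; // prints true
    return 0;
}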

Best way to indexing a matrix in opencv

Let's say A and B are matrices of the same size.
In Matlab, I could use simple indexing as below.
idx = A>0;
B(idx) = 0
How can I do this in OpenCV? Should I just use
for (i=0; ... rows)
for(j=0; ... cols)
if (A.at<double>(i,j)>0) B.at<double>(i,j) = 0;
something like this? Is there a better (faster and more efficient) way?
Moreover, in OpenCV, when I try
Mat idx = A>0;
the variable idx seems to be a CV_8U matrix (not boolean but integer).
You can easily convert this MATLAB code:
idx = A > 0;
B(idx) = 0;
// same as
B(A>0) = 0;
to OpenCV as:
Mat1d A(...);
Mat1d B(...);

Mat1b idx = A > 0;
B.setTo(0, idx);
// or
B.setTo(0, A > 0);
Regarding performance, in C++ it's usually faster (it depends on the enabled optimizations) to work on raw pointers (but is less readable):
for (int r = 0; r < B.rows; ++r)
{
    double* pA = A.ptr<double>(r);
    double* pB = B.ptr<double>(r);
    for (int c = 0; c < B.cols; ++c)
    {
        if (pA[c] > 0.0) pB[c] = 0.0;
    }
}
Also note that in OpenCV there isn't any boolean matrix: the result is a CV_8UC1 matrix (i.e. a single-channel matrix of unsigned char), where 0 means false and any value > 0 (typically 255) means true.
Evaluation
Note that this may vary according to optimization enabled with OpenCV. You can test the code below on your PC to get accurate results.
Time in ms:

                my results            my results       @AdrienDescamps
                (OpenCV 3.0, no IPP)  (OpenCV 2.4.9)
    Matlab  :   13.473
    C++ Mask:   640.824               5.81815          ~5
    C++ Loop:   5.24414               4.95127          ~4
Note: I'm not entirely sure what causes the performance drop with OpenCV 3.0, so I can only recommend testing the code below on your PC to get accurate results.
As @AdrienDescamps stated in the comments:
It seems that the performance drop with OpenCV 3.0 is related to the OpenCL option, which is now enabled in the comparison operator.
C++ Code
#include <opencv2/opencv.hpp>
#include <iostream>

using namespace std;
using namespace cv;

int main()
{
    // Randomly initialize A with values in [-100, 100]
    Mat1d A(1000, 1000);
    randu(A, Scalar(-100), Scalar(100));

    // B initialized with some constant (5) value
    Mat1d B(A.rows, A.cols, 5.0);

    // Operation: B(A>0) = 0;
    {
        // Using mask
        double tic = double(getTickCount());
        B.setTo(0, A > 0);
        double toc = (double(getTickCount()) - tic) * 1000 / getTickFrequency();
        cout << "Mask: " << toc << endl;
    }
    {
        // Using for loop
        double tic = double(getTickCount());
        for (int r = 0; r < B.rows; ++r)
        {
            double* pA = A.ptr<double>(r);
            double* pB = B.ptr<double>(r);
            for (int c = 0; c < B.cols; ++c)
            {
                if (pA[c] > 0.0) pB[c] = 0.0;
            }
        }
        double toc = (double(getTickCount()) - tic) * 1000 / getTickFrequency();
        cout << "Loop: " << toc << endl;
    }

    getchar();
    return 0;
}
Matlab Code
% Random initialize A with values in [-100, 100]
A = (rand(1000) * 200) - 100;
% B initialized with some constant (5) value
B = ones(1000) * 5;
tic
B(A>0) = 0;
toc
UPDATE
OpenCV 3.0 uses IPP optimization in the function setTo. If you have that enabled (you can check with cv::getBuildInformation()), you'll have a faster computation.
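For reference, checking is a one-liner; cv::getBuildInformation() returns the compile-time configuration (including the IPP and OpenCL settings) as a string:
#include <iostream>
#include <opencv2/opencv.hpp>

int main()
{
    // Dumps compiler flags, enabled modules, IPP/OpenCL status, etc.
    std::cout << cv::getBuildInformation() << std::endl;
    return 0;
}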
Miki's answer is very good, but I just want to add some clarification about the performance problem to avoid any confusion.
It is true that the best way to implement an image filter (or any algorithm) with OpenCV is to use the raw pointers, as shown in the second C++ example of Miki (C++ Loop).
Using the at function is also correct, but significantly slower.
However, most of the time you don't need to worry about that, and you can simply use the high level functions of OpenCV (first example of Miki, C++ Mask). They are well optimized, and will usually be almost as fast as a low level loop on pointers, or even faster.
Of course, there are exceptions (we just found one), and you should always test for your specific problem.
Now, regarding this specific problem:
The example here, where the high level function was much slower (100x slower) than the low level loop, is NOT a normal case, as demonstrated by the timings with other versions/configurations of OpenCV, which are much lower.
The problem seems to be that when OpenCV 3.0 is compiled with OpenCL, there is a huge overhead the first time a function that uses OpenCL is called. The simplest solution is to disable OpenCL at compile time, if you use OpenCV 3.0 (see also here for other possible solutions if you are interested). A runtime alternative is sketched below.
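If recompiling OpenCV is not convenient, here is that runtime alternative (assuming OpenCV 3.x, where cv::ocl::setUseOpenCL() is available via opencv2/core/ocl.hpp):
#include <opencv2/opencv.hpp>
#include <opencv2/core/ocl.hpp>

int main()
{
    // Disable the OpenCL code paths (e.g. in the comparison operators and
    // setTo) before any of them is called for the first time.
    cv::ocl::setUseOpenCL(false);

    // ... the rest of the program, as in the benchmark above ...
    return 0;
}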

VLFeat kmeans C API explanation

I'm trying to use VLFeat's kmeans implementation in C but I'm having a really hard time understanding how it works.
Note: I am using the C API in a C++ program, so any code posted by me here is C++. Additionally, I am using the Eigen header library, so that's where those Matrix data types are coming from.
Things that are unclear to me from the example and the API are:
What format does the data have to be in? The kmeans library functions appear to require a one-dimensional array, which could be taken from the backing of a matrix. However, does this matrix need to be column major or row major? That is, how does the function know how to differentiate between the dimensions of the data and the different data vectors?
How do I actually access the cluster center info? I ran a test where I declared that I wanted 5 clusters, but using their example code from the link above, I only get 1 back.
Code:
int numData = 1000;
int dims = 10;

// Use float data and the L1 distance for clustering
VlKMeans* kmeans = vl_kmeans_new(VL_TYPE_FLOAT, VlDistanceL1);

// Use the Lloyd algorithm
vl_kmeans_set_algorithm(kmeans, VlKMeansLloyd);

// Initialize the cluster centers by randomly sampling the data
Matrix<float, 1000, 10, RowMajor> data = buildData(numData, dims);
vl_kmeans_init_centers_with_rand_data(kmeans, data.data(), dims, numData, 5);

// Run at most 100 iterations of cluster refinement using the Lloyd algorithm
vl_kmeans_set_max_num_iterations(kmeans, 100);
vl_kmeans_refine_centers(kmeans, &data, numData);

// Obtain the energy of the solution
energy = vl_kmeans_get_energy(kmeans);

// Obtain the cluster centers
centers = (double*)vl_kmeans_get_centers(kmeans);
cout << *centers << endl;
Example Output: centers = 0.0376879 (a scalar)
How do I get all centers? I tried using an array to store centers, but it won't accept the type.
I also tried the following, assuming that perhaps I was just accessing the center info wrong:
cout << centers[0]<< endl;
cout << centers[1]<< endl;
cout << centers[2]<< endl;
cout << centers[3]<< endl;
cout << centers[4]<< endl;
cout << centers[5]<< endl;
cout << centers[6]<< endl;
cout << centers[7]<< endl;
cout << centers[8]<< endl;
But I should only have non-zero values for indices 0-4 (given 5 cluster centers). I actually expected exceptions to be thrown for higher indices. If this is the right approach, could someone please explain to me where these other values (indices 5-8) come from?
I'm sure there are other confusing pieces as well, but I haven't even addressed them yet, as I've been stuck on these two pretty important pieces (I mean, what is k-means if you can't cluster properly to start?).
Thank you in advance for your help!
What format does the data have to be in?
The manual says:
All algorithms support float or double data and can use the l1 or the l2 distance for clustering.
You specify that when you create your kmeans handle, e.g:
VlKMeans *kmeans = vl_kmeans_new(VL_TYPE_FLOAT, VlDistanceL2);
does this matrix need to be column major or row major?
It must be row major, i.e. data + dimension * i points to the i-th data vector; equivalently, data[dimension * i + j] is the j-th coordinate of the i-th data vector.
How do I actually access the cluster center info?
With vl_kmeans_get_centers. For example, if you work with floats:
/* no need to cast here since get centers returns a `void *` */
const float *centers = vl_kmeans_get_centers(kmeans);
(see this answer regarding the cast)
The total size (in bytes) of this array is sizeof(float) * dimension * numCenters. If you want to print out the centers you can do:
int i, j;
for (i = 0; i < numCenters; i++) {
    printf("center # %d:\n", i);
    for (j = 0; j < dimension; j++) {
        printf("    coord[%d] = %f\n", j, centers[dimension * i + j]);
    }
}
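Putting the pieces together, here is a minimal end-to-end sketch (my illustration, not official VLFeat example code; it assumes float data in a plain row-major array, with random values standing in for real input):
#include <cstdio>
#include <cstdlib>
extern "C" {
#include <vl/kmeans.h>
}

int main()
{
    vl_size numData = 1000, dimension = 10, numCenters = 5;

    // Row-major data: data[dimension * i + j] is coordinate j of point i.
    float* data = (float*)std::malloc(sizeof(float) * dimension * numData);
    for (vl_size i = 0; i < dimension * numData; ++i)
        data[i] = (float)std::rand() / RAND_MAX;

    VlKMeans* kmeans = vl_kmeans_new(VL_TYPE_FLOAT, VlDistanceL2);
    vl_kmeans_set_algorithm(kmeans, VlKMeansLloyd);
    vl_kmeans_init_centers_with_rand_data(kmeans, data, dimension, numData, numCenters);
    vl_kmeans_set_max_num_iterations(kmeans, 100);
    vl_kmeans_refine_centers(kmeans, data, numData);

    // The centers have the same type as the data (float here), not double.
    const float* centers = (const float*)vl_kmeans_get_centers(kmeans);
    for (vl_size i = 0; i < numCenters; ++i)
        std::printf("center %d, first coordinate = %f\n",
                    (int)i, centers[dimension * i]);

    vl_kmeans_delete(kmeans);
    std::free(data);
    return 0;
}
Note that both vl_kmeans_init_centers_with_rand_data and vl_kmeans_refine_centers receive the raw data pointer (not a pointer to a matrix object), and the centers array must be read with the same element type that was passed to vl_kmeans_new.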