Why doesn't Eigen converge for a tiny constant matrix? - c++

I am running repeated matrix diagonalizations on small complex-valued square matrices (dimension < 10), but I hit a failure on a matrix whose entries are all the same tiny value. The ComplexEigenSolver does not converge and returns empty objects for the eigenvalues and eigenvectors.
I have checked this by solving a matrix whose entries are all 1, which works fine, so the problem must be related to the small values in my matrix.
MatrixXcd matrix(2,2);
matrix(0,0) = std::complex<double>(1.4822e-322, 0);
matrix(0,1) = std::complex<double>(1.4822e-322, 0);
matrix(1,0) = std::complex<double>(1.4822e-322, 0);
matrix(1,1) = std::complex<double>(1.4822e-322, 0);

ComplexEigenSolver<MatrixXcd> ces;
ces.compute(matrix);
ces.eigenvalues();   // empty
ces.eigenvectors();  // empty
ces.info();          // returns 2 (NoConvergence)
This gives empty eigenvalues and eigenvectors, and ces.info() returns 2 (NoConvergence).
I expect it to simply give the eigenvalues 0 and 2.96e-322, since this is a scaled version of the matrix of ones (https://en.wikipedia.org/wiki/Matrix_of_ones).
Are the values too small?
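For reference, my expectation comes from the fact that a constant 2x2 matrix c*J (J being the matrix of ones) has eigenvalues 0 and 2c. A scaled-up sanity check of the same computation (using c = 1.4822 instead of 1.4822e-322, which I assume behaves just like the all-ones case) looks like this:

#include <Eigen/Dense>
#include <complex>
#include <iostream>

int main() {
    // Same test as above, but with the tiny value scaled up by 1e322.
    Eigen::MatrixXcd m = Eigen::MatrixXcd::Constant(2, 2, std::complex<double>(1.4822, 0.0));
    Eigen::ComplexEigenSolver<Eigen::MatrixXcd> ces(m);
    std::cout << ces.eigenvalues() << "\n";        // expect roughly 0 and 2.9644
    std::cout << "info = " << ces.info() << "\n";  // expect 0 (Success)
    return 0;
}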

Related

Matrix inverse calculation of upper triangular matrix gives error for large matrix dimensions

I have a recursive function to calculate the inverse of an upper triangular matrix. I have divided the matrix into Top, Bottom and Corner sections and then followed the methodology laid out in https://math.stackexchange.com/a/2333418. Here it is in pseudocode form:
//A diagram of the matrix structure
Matrix = [Top  Corner]
         [0    Bottom]

Matrix multiply_matrix(Matrix A, Matrix B){
    // Simple code to multiply two matrices and return a Matrix
}

Matrix simple_inverse(Matrix A){
    // Simple code to get the inverse of a 2x2 Matrix
}

Matrix inverse_matrix(Matrix A){
    // Creating an empty A_inv matrix of dimension equal to A
    Matrix A_inv;
    if(A.dimension == 2){
        A_inv = simple_inverse(A);
    }
    else{
        Top_inv = inverse_matrix(Top);
        // (Code to check Top*Top_inv == Identity Matrix)
        Bottom_inv = inverse_matrix(Bottom);
        // (Code to check Bottom*Bottom_inv == Identity Matrix)
        Corner_inv = multiply_matrix(Top_inv, Corner);
        Corner_inv = multiply_matrix(Corner_inv, Bottom_inv);
        Corner_inv = negate(Corner_inv); // Just a function for negation of the matrix elements
        // Code to copy Top_inv, Bottom_inv and Corner_inv into A_inv
        ...
    }
    return A_inv;
}

int main(){
    Matrix A = {An upper triangular matrix with random integers between 1 and 9};
    A_inv = inverse_matrix(A);
    test_matrix = multiply_matrix(A, A_inv);
    // (Code to raise an error if test_matrix != Identity matrix)
}
For simplicity I have implemented the code such that only power-of-2 matrix dimensions are supported.
I have tested this code for matrix dimensions of 2, 4, 8, 16, 32 and 64, and all of these pass all of the assertion checks shown in the code. But for a matrix dimension of 128 the assertion in main() fails: when I check, I observe that test_matrix is not the identity matrix; some off-diagonal elements are not equal to 0.
I am wondering what could be the reason for this:
I am using C++ std::vector<std::vector<double>> for the matrix representation.
Since the data type is double, the off-diagonal elements of test_matrix for the 2, 4, 8, ..., 64 cases do have some value, but it is very small, for example -9.58122e-14.
All my matrices at any recursion stage are square matrices.
I am performing checks that Top*Top_inv == Identity and Bottom*Bottom_inv == Identity.
Finally, for dimensions 2, 4, ..., 64 I generated random numbers (between 1 and 10) to create my upper triangular matrix. Since these cases passed, I guess my mathematical implementation is correct.
I feel like there is some aspect of the C++ double datatype that I am unaware of that could be causing the error. Otherwise the sudden failure going from 64 to 128 doesn't make sense.
Could you please elaborate on how the matrix == identity operation is implemented?
My guess is that the problem comes down to floating-point comparison.
Matrix inversion can be O(n^3) in the worst case, so as the matrix size increases, the amount of computation involved also increases. Real numbers cannot be represented exactly even with 64-bit floating point; they are always an approximation.
For operations such as matrix inversion this can cause numerical error to propagate, due to the loss of precision in the accumulated multiply-add operations.
This has already been discussed on Stack Overflow: How should I do floating point comparison?
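As a rough sketch of what a tolerance-based identity check could look like with the std::vector<std::vector<double>> representation from the question (the tolerance value is an assumption and may need tuning to the matrix size):

#include <cmath>
#include <vector>

// Returns true if M is the identity up to a tolerance, instead of comparing
// doubles with ==.
bool is_identity(const std::vector<std::vector<double>>& M, double tol = 1e-9)
{
    for (std::size_t i = 0; i < M.size(); ++i) {
        for (std::size_t j = 0; j < M[i].size(); ++j) {
            const double expected = (i == j) ? 1.0 : 0.0;
            if (std::abs(M[i][j] - expected) > tol) {
                return false;
            }
        }
    }
    return true;
}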
EDIT: Another thing to consider is whether the full matrix is actually invertible.
Perhaps the Top and/or Bottom matrices are invertible, but the full matrix (when composed with the Corner block) is not.
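Since the question's matrices are upper triangular, that check is cheap: such a matrix is singular exactly when one of its diagonal entries is zero. A small sketch, again with the std::vector representation (the tolerance is my assumption):

#include <cmath>
#include <vector>

// An upper triangular matrix is invertible iff no diagonal entry is
// (numerically) zero.
bool upper_triangular_invertible(const std::vector<std::vector<double>>& U, double tol = 1e-12)
{
    for (std::size_t i = 0; i < U.size(); ++i) {
        if (std::abs(U[i][i]) < tol) {
            return false;
        }
    }
    return true;
}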

c++ eigenvalue and eigenvector corresponding to the smallest eigenvalue

I am trying to find the eigenvalues and the eigenvector corresponding to the smallest eigenvalue. I have a matrix A (n x 2) and I have computed B = transpose(A) * A. When I use Eigen's compute() function and print the eigenvalues of matrix B, it shows something like this:
(4.4, 0)
(72.1, 0)
Printing the eigenvectors it gives output:
(-0.97, 0) (0.209, 0)
(-0.209, 0) (-0.97, 0)
I am confused. Eigenvectors can't be zero, I guess. So, for the smallest eigenvalue 4.4, is the corresponding eigenvector (-0.97, -0.209)?
P.S. - when I print
mysolution.eigenvalues()[0]
it prints (4.4, 0). And when I print
mysolution.eigenvectors().col(0)
it prints (-0.97, 0) (0.209, 0). That's why I guess I can assume that for eigenvalue 4.4, the corresponding eigenvector is (-0.97, -0.209).
I guess you are correct.
None of your eigenvalues is zero, though. It seems that you are working with complex numbers.
Could it be that you selected a complex floating-point matrix type for your computations? Something along the lines of MatrixX2cf or MatrixX2cd.
Every square matrix has a set of eigenvalues. But even if the matrix itself consists only of real numbers, the eigenvalues and eigenvectors might contain complex numbers (take (0 1; -1 0) for example).
If Eigen knows nothing about your matrix structure (i.e. is it symmetric/self-adjoint? Is it orthonormal/unitary?) but still wants to provide you with exact eigenvalues, the only general type that can hold all possible eigenvalues is a complex number.
Thus, Eigen always returns complex numbers, which are represented as pairs (a, b) for a + bi. Eigen will only return real numbers if the matrix is known to be self-adjoint, i.e. a SelfAdjointView is used to access the matrix.
If you know for a fact that your matrix only has real eigenvalues, you can just extract the real part with eigenvalue.real(), since Eigen returns std::complex values.
EDIT: I just realized that if your matrix A has no complex entries, B = transpose(A)*A is self-adjoint, so you could just use a SelfAdjointView of the matrix to compute the real eigenvalues and eigenvectors.
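A minimal sketch of that suggestion (my own placeholder data; with SelfAdjointEigenSolver the eigenvalues come out real and sorted in increasing order, so column 0 of eigenvectors() belongs to the smallest eigenvalue):

#include <Eigen/Dense>
#include <iostream>

int main() {
    Eigen::MatrixXd A(4, 2);      // placeholder for the n x 2 data matrix
    A << 1, 2,
         3, 4,
         5, 6,
         7, 8;

    Eigen::MatrixXd B = A.transpose() * A;                // 2 x 2 and self-adjoint
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(B);

    std::cout << "smallest eigenvalue: " << es.eigenvalues()[0] << "\n";
    std::cout << "its eigenvector:\n" << es.eigenvectors().col(0) << "\n";
    return 0;
}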

Diagonalization of a 2x2 self-adjoint (hermitian) matrix

Diagonalizing a 2x2 hermitian matrix is simple and can be done analytically. However, when the eigenvalues and eigenvectors have to be calculated more than 10^6 times, it is important to do this as efficiently as possible. In particular, if the off-diagonal elements can vanish, it is not possible to use a single formula for the eigenvectors: an if-statement is necessary, which of course slows down the code. Thus, I thought that using Eigen, where it is stated that the diagonalization of 2x2 and 3x3 matrices is optimized, would still be a good choice:
using
const std::complex<double> I ( 0., 1. );

inline double block_distr ( double W )
{
    return (-W/2. + rand() * W/RAND_MAX);
}
a test-loop would be
...
SelfAdjointEigenSolver<Matrix<complex<double>, 2, 2> > ces;
Matrix<complex<double>, 2, 2> X;
double a00, a11, re_a01, im_a01;

for (int i = 0; i < iter_MAX; ++i) {
    a00    = block_distr(100.);
    a11    = block_distr(100.);
    re_a01 = block_distr(100.);
    im_a01 = block_distr(100.);

    X(0,0) = a00;
    X(1,0) = re_a01 - I*im_a01;
    // only the lower triangular part is referenced! X(0,1)=0.; <--- not necessary
    X(1,1) = a11;

    ces.compute(X, ComputeEigenvectors);
}
Writing the loop without Eigen, using the formulas for the eigenvalues and eigenvectors of a hermitian matrix directly and an if-statement to check whether the off-diagonal element is zero, is a factor of 5 faster. Am I not using Eigen properly, or is such an overhead normal? Are there other libraries that are optimized for small self-adjoint matrices?
By default, the iterative method is used. To use the analytical version for 2x2 and 3x3 matrices, you have to call the computeDirect function:
ces.computeDirect(X);
but it is unlikely to be faster than your implementation of the analytic formulas.
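A self-contained sketch of that call with placeholder values (whether the direct 2x2 path actually applies to complex scalars may depend on the Eigen version; otherwise computeDirect simply falls back to compute):

#include <Eigen/Dense>
#include <complex>
#include <iostream>

int main() {
    using Mat2c = Eigen::Matrix<std::complex<double>, 2, 2>;
    const std::complex<double> I(0., 1.);

    Mat2c X;
    X(0, 0) = 1.0;
    X(1, 0) = 2.0 - I * 3.0;    // only the lower triangle is referenced
    X(1, 1) = 4.0;

    Eigen::SelfAdjointEigenSolver<Mat2c> ces;
    ces.computeDirect(X, Eigen::ComputeEigenvectors);

    std::cout << ces.eigenvalues() << "\n";     // real, sorted in increasing order
    std::cout << ces.eigenvectors() << "\n";
    return 0;
}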

OpenCV Assertion failed on Matrix multiplication

I'm multiplying two matrices with OpenCV; A is NxM and B is MxP.
According to the documentation:
All the arrays must have the same type and the same size (or ROI
size). For types that have limited range this operation is saturating.
However, by the theory of matrix multiplication:
Assume two matrices are to be multiplied (the generalization to any
number is discussed below). If A is an n×m matrix and B is an m×p
matrix, the result AB of their multiplication is an n×p matrix, defined
only if the number of columns m in A is equal to the number of rows m
in B.
shouldn't this code be working?
- (CvMat *)multMatrix:(CvMat *)AMatrix BMatrix:(CvMat *)BMatrix
{
    CvMat *result = cvCreateMat(AMatrix->rows, BMatrix->cols, kMatrixType);
    cvMul(AMatrix, BMatrix, result, 1.0);
    return result;
}
I get the following exception:
OpenCV Error: Assertion failed (src1.size == dst.size &&
src1.channels() == dst.channels()) in cvMul, file
/Users/Aziz/Documents/Projects/opencv_sources/trunk/modules/core/src/arithm.cpp,
line 2728
kMatrixType is CV_32F, A is 6x234, B is 234x5 and result is 6x5...
Am I doing something wrong? Or is this an OpenCV restriction on matrix multiplication?
You are doing element-wise multiplication with cvMul.
You should look at cvMatMul for doing proper matrix multiplication.
http://opencv.willowgarage.com/wiki/Matrix_operations
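A minimal sketch of that change with the old C API used in the question (the header name and type constant are assumptions that may differ between OpenCV versions; cvMatMul computes a true matrix product, so the NxM by MxP size combination is allowed):

#include <opencv2/core/core_c.h>

// result = A * B, where A is NxM and B is MxP, so result is NxP.
CvMat *multMatrix(CvMat *A, CvMat *B)
{
    CvMat *result = cvCreateMat(A->rows, B->cols, CV_32FC1);
    cvMatMul(A, B, result);    // matrix multiplication, not element-wise cvMul
    return result;
}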

Finding eigenvectors of covariance matrix to create 3D bounding sphere

I'm currently in the process of writing a function to find an "exact" bounding-sphere for a set of points in 3D space. I think I have a decent understanding of the process so far, but I've gotten stuck.
Here's what I'm working with:
A) Points in 3D space
B) 3x3 covariance matrix stored in a 4x4 matrix class (referenced by cells m0, m1, m2, m3, m4, etc., instead of rows and cols)
I've found the 3 eigenvalues for the covariance matrix of the points, and I've set up a function to convert a matrix to reduced row echelon form (rref) via Gaussian elimination.
I've tested both of those functions against figures in examples I've found online, and they appear to be working correctly.
The next step is to find the eigenvectors using the equation:
(M - λ*I)*V
... where M is the covariance matrix, λ is one of the eigenvalues, I is the identity matrix, and V is the eigenvector.
However, I don't seem to be constructing the 4x3 matrix correctly before rref'ing it, as the far-right column, where the eigenvector components should be calculated, is 0 both before and after running rref. I understand why it is zero afterwards (without any constants, the simplest solution to a linear system of equations is all coefficients zero), but I'm at a loss as to what to put there.
Here's the function so far:
Vect eigenVector(const Matrix &M, const float eval) {
    Matrix A = Matrix(M);
    A -= Matrix(IDENTITY)*eval;
    A.rref();
    return Vect(A[m3], A[m7], A[m11]);
}
The 3x3 covariance matrix is passed as M, and the eigenvalue as eval. Matrix(IDENTITY) returns an identity matrix. m3,m7, and m11 correspond to the far-right column of a 4x3 matrix.
Here's the example 3x3 matrix (stored in a 4x4 matrix class) I'm using to test the functions:
Matrix(1.5f,  0.5f,  0.75f, 0,
       0.5f,  0.5f,  0.25f, 0,
       0.75f, 0.25f, 0.5f,  0,
       0,     0,     0,     0);
I'm correctly (?) getting the eigenvalues of 2.097, 0.3055, 0.09756 from my other function.
eigenVector() above correctly subtracts the passed eigenvalue from the diagonal elements (0,0), (1,1) and (2,2).
Matrix A after rref():
[(1, 0, 0, -0),
(-0, 1, 0, -0),
(-0, -0, 1, -0),
(0, 0, 0, -2.09694)]
For the rref() function, I'm using a translated python function found here:
http://elonen.iki.fi/code/misc-notes/python-gaussj/index.html
What should the matrix I pass to rref() look like to get an eigenvector out?
Thanks
(M - λI)V is not an equation, it's just an expression. However, (M - λI)V = 0 is. And it's the equation that relates eigenvectors to eigenvalues.
So assuming your rref function works, I would imagine that you create an augmented matrix as [(M - λI) | 0], where 0 denotes a zero-vector. This sounds like what you're doing already, so I would have to assume that your rref function is broken. Or alternatively, it doesn't know how to handle 4x4 matrices (as opposed to 4x3 matrices, which is what it would expect for an augmented matrix).
Ah, with a few more hours of grueling research, I've managed to solve my problem.
The issue is that there is no single "correct" eigenvector for a given eigenvalue, but rather infinitely many of varying magnitudes.
The method I chose was to use REF (row echelon form) instead of RREF, leaving enough information in the matrix to let me substitute an arbitrary value for z and work backwards to solve for y and x. I then normalized the vector to get a unit eigenvector, which should work for my purposes.
My final code:
Vect eigenVector(const Matrix &M, const float eVal) {
    Matrix A = Matrix(M);
    A -= Matrix(IDENTITY)*eVal;
    A.ref();

    float K = 16;                   // Arbitrary value
    float J = -K*A[m6];             // Substitute in K to find J
    float I = -K*A[m2] - J*A[m1];   // Substitute in K and J to find I

    Vect eVec = Vect(I, J, K);
    eVec.norm();                    // Normalize eigenvector
    return eVec;
}
The only oddity is that the eigenvectors come out pointing in the opposite direction from what I expected (they were negated!), but that's a moot point.
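For anyone who wants to cross-check results like these without the custom Matrix/Vect classes, here is a small sketch using Eigen's SelfAdjointEigenSolver on the example covariance matrix from the question (a flipped sign is fine, since any nonzero scalar multiple of an eigenvector is still an eigenvector):

#include <Eigen/Dense>
#include <iostream>

int main() {
    // The 3x3 example covariance matrix from the question.
    Eigen::Matrix3f C;
    C << 1.5f,  0.5f,  0.75f,
         0.5f,  0.5f,  0.25f,
         0.75f, 0.25f, 0.5f;

    Eigen::SelfAdjointEigenSolver<Eigen::Matrix3f> es(C);
    std::cout << "eigenvalues:\n" << es.eigenvalues() << "\n";              // ~0.0976, 0.3055, 2.097
    std::cout << "eigenvectors (columns):\n" << es.eigenvectors() << "\n";
    return 0;
}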