I need to invert an Eigen matrix (9x9 in my particular case) as part of code that I want to automatically differentiate using CppAD. For this to succeed, the code executing the inversion cannot contain any branching, such as if or switch statements. Unfortunately, Eigen's inverse function contains branching, which makes the algorithmic differentiation with CppAD fail.
Mathematically it should be possible to come up with a formulation that does not need branching for a fixed matrix size that is guaranteed to be invertible. Is that correct?
Do you know of any library that implements such an inverse without branching?
There is a mechanical conversion from branch to no-branch for arithmetic functions.
Duplicate all the variables you use in each branch, and calculate both halves. At the end of the block, multiply the if branch by condition, and the else branch by !condition, then sum them.
Similarly for a switch, calculate all the cases, and multiply by value == case.
E.g.
Mat frob_branch(Mat a, Mat b) {
    if (a.baz()) {
        return a * b;
    } else {
        return b * a;
    }
}
becomes
Mat frob_no_branch(Mat a, Mat b) {
    auto if_true = a * b;
    auto if_false = b * a;
    bool condition = a.baz();
    return (if_true * condition) + (if_false * !condition);
}
In general, it is possible to invert arbitrarily large (invertible) matrices without branching, but this gets inefficient for bigger matrices. Eigen only does this for matrices up to size 4x4.
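For a fixed small size, the adjugate formula inv(A) = adj(A) / det(A) gives such a branch-free inverse directly (this is essentially what Eigen's fixed-size paths do up to 4x4). A minimal sketch for the 2x2 case, assuming A is known to be invertible; the function name is mine:

#include <Eigen/Core>

// Branch-free 2x2 inverse via the adjugate: no comparisons, no pivoting,
// so an AD tool like CppAD can tape straight through it.
template <typename Scalar>
Eigen::Matrix<Scalar, 2, 2> inverse2x2(const Eigen::Matrix<Scalar, 2, 2>& A)
{
    Eigen::Matrix<Scalar, 2, 2> inv;
    const Scalar det = A(0,0) * A(1,1) - A(0,1) * A(1,0);
    inv(0,0) =  A(1,1) / det;
    inv(0,1) = -A(0,1) / det;
    inv(1,0) = -A(1,0) / det;
    inv(1,1) =  A(0,0) / det;
    return inv;
}

For a 9x9 matrix, cofactor expansion is impractical; an unpivoted factorization (e.g. Cholesky, when the matrix is guaranteed positive definite) avoids branching as well.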
If you want the derivative of the inverse, just use the identity
deriv(inv(A)) = -inv(A) * deriv(A) * inv(A)
i.e., compute the inverse of the plain matrix, then compute the expression above.
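A minimal sketch of that identity in Eigen, assuming dA is the element-wise derivative of A with respect to some scalar parameter; the function name is mine:

#include <Eigen/Dense>

// d(inv(A)) = -inv(A) * dA * inv(A): one plain inverse, two products.
Eigen::MatrixXd inverseDerivative(const Eigen::MatrixXd& A, const Eigen::MatrixXd& dA)
{
    const Eigen::MatrixXd Ainv = A.inverse();
    return -Ainv * dA * Ainv;
}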
Consider the following code:
#include <Eigen/Core>
using Matrix = Eigen::Matrix<float, 2, 2>;
Matrix func1(const Matrix& mat) { return mat + 0.5; }
Matrix func2(const Matrix& mat) { return mat / 0.5; }
func1() does not compile; you need to replace mat with mat.array() in the function body to fix it ([1]). However, func2() does compile as-is.
My question has to do with why the API is designed this way. Why are addition-with-scalar and division-by-scalar treated differently? What problems would arise if the following method were added to the Matrix class, and why haven't those problems already arisen for the operator/ method?
auto operator+(Scalar s) const { return this->array() + s; }
From a mathematics perspective, a scalar added to a matrix "should" be the same as adding the scalar only to the diagonal. That is, a math text would usually use M + 0.5 to mean M + 0.5I, for I the identity matrix. There are many ways to justify this. For example, you can appeal to the analogy I = 1, or you can appeal to the desire to say Mx + 0.5x = (M + 0.5)x whenever x is a vector, etc.
Alternatively, you could take M + 0.5 to add 0.5 to every element. This is what you think is right if you don't think of matrices from a "linear algebra mindset" and treat them as just collections (arrays) of numbers, where it is natural to just "broadcast" scalar operations.
Since there are multiple "obvious" ways to handle + between a scalar and a matrix, where someone expecting one may be blindsided by the other, it is a good idea to do as Eigen does and ban such expressions. You are then forced to signify what you want in a less ambiguous way.
The natural definition of / from an algebra perspective coincides with the array perspective, so there is no reason to ban it.
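To make the contrast concrete, here is a small sketch of the two unambiguous spellings Eigen pushes you toward, using the Matrix typedef from the question; the function names are mine:

#include <Eigen/Dense>

using Matrix = Eigen::Matrix<float, 2, 2>;

// "Array" reading: broadcast the scalar to every element.
Matrix addToEveryElement(const Matrix& mat) {
    return (mat.array() + 0.5f).matrix();
}

// "Algebra" reading: M + 0.5 means M + 0.5*I.
Matrix addToDiagonal(const Matrix& mat) {
    return mat + 0.5f * Matrix::Identity();
}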
I have a dense matrix A of size 2N×N that has to be multiplied by a matrix B of size N×2N.
Matrix B is actually a horizontal concatenation of two sparse matrices, X and Y. Only read-only access to B is required.
Unfortunately for me, there doesn't seem to be a concatenate operation for sparse matrices. Of course, I could simply create a matrix of size N*2N and populate it with the data, but this seems rather wasteful. It seems like there could be a way to group X and Y into some sort of matrix view.
An additional simplification in my case is that either X or Y is a zero matrix.
For your specific case, it is sufficient to multiply A by whichever of X or Y is nonzero. The nonzero block of the result is exactly the corresponding block of A*B, and the other block is zero (simple matrix algebra).
If your result matrix is column major (the default), you can assign partial results to vertical sub-blocks like so (if X or Y is structurally zero, the corresponding sub-product is calculated in O(1)):
#include <Eigen/SparseCore>
#include <cassert>

typedef Eigen::SparseMatrix<float> SM;

void foo(SM& out, SM const& A, SM const& X, SM const& Y)
{
    assert(X.rows()==Y.rows() && X.rows()==A.cols());
    out.resize(A.rows(), X.cols()+Y.cols());
    out.leftCols(X.cols()) = A*X;    // cheap if X is structurally zero
    out.rightCols(Y.cols()) = A*Y;   // cheap if Y is structurally zero
}
If you really want to, you could write a wrapper class which holds references to two sparse matrices (X and Y) and implement operator*(SparseMatrix, YourWrapper) -- but depending on how you use it, it is probably better to make an explicit function call.
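For illustration, a minimal sketch of such a wrapper (names are mine); it simply dispatches to the same block-wise products as above, without ever materializing B:

#include <Eigen/SparseCore>

typedef Eigen::SparseMatrix<float> SM;

// Read-only view of the horizontal concatenation B = [X Y].
struct HConcat {
    const SM& X;
    const SM& Y;
};

SM operator*(const SM& A, const HConcat& B)
{
    SM out(A.rows(), B.X.cols() + B.Y.cols());
    out.leftCols(B.X.cols())  = A * B.X;
    out.rightCols(B.Y.cols()) = A * B.Y;
    return out;
}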
In Matlab, if I write
A = B*inv(C)
(with A, B and C being square matrices), I get a warning that the matrix inversion should be replaced with a matrix "right-division" (which is numerically more stable and accurate), like
A = B/C
In my Eigen C++ project I have the following code:
Eigen::MatrixXd A = B*(C.inverse());
and I was wondering whether there is an equivalent replacement for taking the matrix inverse in Eigen, analogous to the Matlab one mentioned above.
I know that matrix "left-division" can be expressed by solving a system of equations for expressions like
A = inv(C)*B
but what about
A = B*inv(C)
in Eigen?
At the moment the most efficient way to do this is to rewrite your equation to
A^T = inv(C^T) * B^T
A = (inv(C^T) * B^T)^T
which can be implemented in Eigen as
SomeDecomposition decompC(C); // decompose C with a suiting decomposition
Eigen::MatrixXd A = decompC.transpose().solve(B.transpose()).transpose();
There were/are plans that, eventually, one will be able to write
A = B * decompC.inverse();
and Eigen will evaluate this in the most efficient way.
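One possible concrete instantiation of the recipe above, assuming C is square and invertible: PartialPivLU is a suitable decomposition, and since Eigen 3.3 the decompositions expose transpose() for solving the transposed problem:

#include <Eigen/Dense>

// A = B * inv(C), computed as A^T = inv(C^T) * B^T without forming inv(C).
Eigen::MatrixXd rightDivide(const Eigen::MatrixXd& B, const Eigen::MatrixXd& C)
{
    Eigen::PartialPivLU<Eigen::MatrixXd> decompC(C);
    return decompC.transpose().solve(B.transpose()).transpose();
}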
I am trying to get Eigen3 to solve a linear system A * X = B with an in-place Cholesky decomposition. I cannot afford to have any temporaries of the size of A pushed on the stack, but I am free to destroy A in the process.
Unfortunately,
A.llt().solveInPlace(B);
is out of the question, since A.llt() implicitly pushes a temporary matrix of the size of A on the stack. For the LLT case, I could get access to the necessary functionality like so:
#include <Eigen/Cholesky>

// solve A * X = B in-place for positive-definite A
template <typename AType, typename BType>
void AllInPlaceSolve(AType& A, BType& B)
{
    typedef Eigen::internal::LLT_Traits<AType, Eigen::Upper> TraitsType;
    TraitsType::inplace_decomposition(A);
    TraitsType::getL(A).solveInPlace(B);
    TraitsType::getU(A).solveInPlace(B);
}
This works fine, but I am worried that:
My matrices A might be positive semidefinite only, in which case an LDLT decomposition is required
The LLT decomposition calculates sqrt() unnecessarily for the solution of the system
I could not find a way to hook in Eigen's LDLT functionality similarly to the code above, since the code is structured very differently.
So my question is: Is there a way to use Eigen3 to solve a linear system via an LDLT decomposition, using no more scratch space than that needed for the diagonal matrix D?
One option is to allocate an LDLT solver only once and call its compute method:
LDLT<MatType> ldlt(size);
// ...
ldlt.compute(A);
x = ldlt.solve(b);
If that's also not an option, you can const_cast the matrix stored by the ldlt object:
LDLT<MatType> ldlt(MatType::Identity(size,size));
MatType& A = const_cast<MatType&>(ldlt.matrixLDLT());
populate A with your data, and then:
ldlt.compute(A);
x = ldlt.solve(b);
This is ugly, but it should work as long as MatType is column-major.
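An alternative sketch: since Eigen 3.3, the dense decompositions can also compute in place when constructed on an Eigen::Ref, reusing the matrix's own storage for the factors (so A is destroyed, which the question permits):

#include <Eigen/Dense>

// LDLT factorization stored inside A itself; no second A-sized temporary.
void solveInPlaceLDLT(Eigen::MatrixXd& A, Eigen::VectorXd& x, const Eigen::VectorXd& b)
{
    Eigen::LDLT<Eigen::Ref<Eigen::MatrixXd>> ldlt(A);
    x = ldlt.solve(b);
}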
I am attempting to implement a complex-valued matrix equation in OpenCV. I've prototyped in MATLAB which works fine. Starting off with the equation (exact MATLAB implementation):
kernel = exp(1i .* k .* Circ3D) .* z ./ (1i .* lambda .* Circ3D .* Circ3D)
In which
1i = the imaginary unit
k = constant (float)
Circ3D = real-valued matrix of known size
lambda = constant (float)
.* = element-wise multiplication
./ = element-wise division
The result is a complex-valued matrix. I succeeded in generating the necessary Circ3D matrix as a CV_32F; however, the multiplication by the complex number i is giving me trouble. From the OpenCV documentation I understand that a complex matrix is simply a two-channel matrix (CV_32FC2).
The real trouble comes from how to define i. I've tried several options, among which defining i as
cv::Vec2d complex = cv::Vec2d(0,1);
and then multiplying by the matrix
kernel = complex * Circ3D
But this doesn't work (although I didn't expect it to). I suspect I need to do something with std::complex but I have no idea what (http://docs.opencv.org/modules/core/doc/basic_structures.html).
Thanks in advance for any help.
Edit: Just after writing this post I did make some progress, by defining i as follows:
std::complex<float> complex(0, 1);
I am then able to assign complex values as follows:
kernel.at<std::complex<float>>(i,j) = std::exp(complex * k * Circ3D.at<float>(i,j)) *
    z / (complex * lambda * std::pow(Circ3D.at<float>(i,j), 2.0f));
However, this works in a loop, which makes the procedure incredibly slow. Any way to do it in one go?
OpenCV treats std::complex just like a simple pair of numbers (see the example in the documentation); no special rules for arithmetic operations are applied. You work around this by doing the arithmetic on std::complex values directly. So basically the choice is simple: you either get automatic complex arithmetic (as you are doing now) or automatic vectorization (when using OpenCV functions on whole matrices), but not both at once.
I think in your case you should carry out all the complex arithmetic yourself: store a matrix of complex values c = a + bi as two real matrices A{a} and B{b}, and implement the exponentiation yourself. Multiplication by scalars and addition shouldn't be a problem.
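If you only need this particular kernel, you can also vectorize it by hand with per-channel matrices. A sketch, assuming Circ3D, k, lambda and z as in the question: by Euler's formula, exp(1i*theta) = cos(theta) + 1i*sin(theta), and dividing by 1i is the same as multiplying by -1i, so kernel = z*(sin(theta) - 1i*cos(theta)) / (lambda*Circ3D.^2) with theta = k*Circ3D. The function name is mine:

#include <opencv2/core.hpp>

cv::Mat makeKernel(const cv::Mat& Circ3D, float k, float lambda, float z)
{
    cv::Mat theta = k * Circ3D;
    cv::Mat cosT, sinT;
    // Empty magnitude matrix means "all magnitudes are 1":
    // cosT = cos(theta), sinT = sin(theta), computed element-wise.
    cv::polarToCart(cv::Mat(), theta, cosT, sinT);

    cv::Mat denom = lambda * Circ3D.mul(Circ3D);   // lambda * Circ3D.^2
    cv::Mat planes[2];
    cv::divide(z * sinT, denom, planes[0]);        // real part
    cv::divide(-z * cosT, denom, planes[1]);       // imaginary part

    cv::Mat kernel;                                // CV_32FC2 complex result
    cv::merge(planes, 2, kernel);
    return kernel;
}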
There is also the function mulSpectrums, which lets you do element-wise multiplication of complex matrices. So if K is your kernel matrix and I is some complex matrix, that is, CV_32FC2 (float, two channels), you can do the following to compute the element-wise multiplication:
// Make K a complex matrix
cv::Mat Ktmp[] = {cv::Mat_<float>(K), cv::Mat::zeros(K.size(), CV_32FC1)};
cv::Mat Kc;
cv::merge(Ktmp,2,Kc);
// Element-wise complex multiplication (not a matrix product); result goes into I
cv::mulSpectrums(Kc,I,I,0);