Hilbert Transform (Analytic Signal) using Apple's Accelerate Framework - C++

I am having issues getting a Matlab-equivalent Hilbert transform in C++ using Apple's Accelerate Framework. I have been able to get vDSP's FFT algorithm working and, with the help of Paul R's post, have managed to get the same outcome as Matlab.
I have read both this Stack Overflow question by Jordan and the Matlab algorithm description (under the 'Algorithms' sub-heading). To sum the algorithm up in three stages (a minimal reference sketch follows the list):
Take forward FFT of input.
Zero reflection frequencies and double frequencies between DC and Nyquist.
Take inverse FFT of the modified forward FFT output.
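For clarity, here is a minimal reference sketch of those three stages in plain C++, using a naive O(n^2) DFT with std::complex; it is only meant to pin down the numbers I expect, not to show the Accelerate code:
#include <cmath>
#include <complex>
#include <cstdio>
#include <vector>

// Naive O(n^2) DFT; sign = -1 for the forward transform, +1 for the inverse
// (the 1/n scaling of the inverse is applied by the caller).
static std::vector<std::complex<double>> dft(const std::vector<std::complex<double>> &x, int sign)
{
    const size_t n = x.size();
    std::vector<std::complex<double>> X(n);
    for (size_t k = 0; k < n; ++k)
        for (size_t j = 0; j < n; ++j)
            X[k] += x[j] * std::polar(1.0, sign * 2.0 * M_PI * double(j * k) / double(n));
    return X;
}

int main()
{
    std::vector<std::complex<double>> m = { 1, 2, 3, 4, 5, 6, 7, 8 };
    const size_t n = m.size();

    auto X = dft(m, -1);                                  // Stage 1: forward FFT
    for (size_t k = 1; k < n / 2; ++k) X[k] *= 2.0;       // double bins between DC and Nyquist
    for (size_t k = n / 2 + 1; k < n; ++k) X[k] = 0.0;    // Stage 2: zero the reflection bins
    auto h = dft(X, +1);                                  // Stage 3: inverse FFT

    for (auto &v : h) {
        v /= double(n);                                   // 1/n scaling of the inverse
        std::printf("%.4f %+.4fi\n", v.real(), v.imag());
    }
    return 0;
}
This matches the Matlab 'Stage 3' output listed below.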
Below are the outputs of both Matlab and C++ for each stage. The examples use the following arrays:
Matlab: m = [1 2 3 4 5 6 7 8];
C++: float m[] = {1,2,3,4,5,6,7,8};
Matlab Example
Stage 1:
36.0000 + 0.0000i
-4.0000 + 9.6569i
-4.0000 + 4.0000i
-4.0000 + 1.6569i
-4.0000 + 0.0000i
-4.0000 - 1.6569i
-4.0000 - 4.0000i
-4.0000 - 9.6569i
Stage 2:
36.0000 + 0.0000i
-8.0000 + 19.3137i
-8.0000 + 8.0000i
-8.0000 + 3.3137i
-4.0000 + 0.0000i
0.0000 + 0.0000i
0.0000 + 0.0000i
0.0000 + 0.0000i
Stage 3:
1.0000 + 3.8284i
2.0000 - 1.0000i
3.0000 - 1.0000i
4.0000 - 1.8284i
5.0000 - 1.8284i
6.0000 - 1.0000i
7.0000 - 1.0000i
8.0000 + 3.8284i
C++ Example (with Apple's Accelerate Framework)
Stage 1:
Real: 36.0000, Imag: 0.0000
Real: -4.0000, Imag: 9.6569
Real: -4.0000, Imag: 4.0000
Real: -4.0000, Imag: 1.6569
Real: -4.0000, Imag: 0.0000
Stage 2:
Real: 36.0000, Imag: 0.0000
Real: -8.0000, Imag: 19.3137
Real: -8.0000, Imag: 8.0000
Real: -8.0000, Imag: 3.3137
Real: -4.0000, Imag: 0.0000
Stage 3:
Real: -2.0000, Imag: -1.0000
Real: 2.0000, Imag: 3.0000
Real: 6.0000, Imag: 7.0000
Real: 10.0000, Imag: 11.0000
It's quite clear that the end results are not the same. I am looking to compute the Matlab 'Stage 3' results (or at least the imaginary parts), but I am unsure how to go about it; I've tried everything I can think of with no success. I have a feeling it's because I'm not zeroing out the reflection frequencies in the Accelerate version: they are not provided (since they can be reconstructed from the frequencies between DC and Nyquist), so I think the inverse FFT is just taking the conjugates of the doubled frequencies as the reflection values, which is wrong. If anyone could help me overcome this issue I would greatly appreciate it.
Code
void hilbert(std::vector<float> &data, std::vector<float> &hilbertResult){
    // (dataSize_2, workspace and i are member variables declared elsewhere.)
    // Set variable.
    dataSize_2 = data.size() * 0.5;
    // Clear and resize vectors.
    workspace.clear();
    hilbertResult.clear();
    workspace.resize(data.size()/2+1, 0.0f);
    hilbertResult.resize(data.size(), 0.0f);
    // Copy data into the hilbertResult vector.
    std::copy(data.begin(), data.end(), hilbertResult.begin());
    // Perform forward FFT.
    fft(hilbertResult, hilbertResult.size(), FFT_FORWARD);
    // Fill workspace with 1s and 2s (to double frequencies between DC and Nyquist).
    workspace.at(0) = workspace.at(dataSize_2) = 1.0f;
    for (i = 1; i < dataSize_2; i++)
        workspace.at(i) = 2.0f;
    // Multiply forward FFT output by workspace vector.
    for (i = 0; i < workspace.size(); i++) {
        hilbertResult.at(i*2) *= workspace.at(i);
        hilbertResult.at(i*2+1) *= workspace.at(i);
    }
    // Perform inverse FFT.
    fft(hilbertResult, hilbertResult.size(), FFT_INVERSE);
    for (i = 0; i < hilbertResult.size()/2; i++)
        printf("Real: %.4f, Imag: %.4f\n", hilbertResult.at(i*2), hilbertResult.at(i*2+1));
}
The code above is what I used to get the 'Stage 3' C++ (Accelerate) result. For the forward FFT, the fft(..) method performs the vDSP FFT routine, scales the result by 0.5, and then unpacks it (as per Paul R's post). For the inverse FFT, the data is packed, scaled by 2.0, transformed with vDSP, and finally scaled by 1/(2*N).
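For reference, the forward half of that wrapper looks roughly like the sketch below (the function name, the separate real/imaginary output arrays, and the use of std::vector are my simplifications here, not the actual code; n is assumed to be a power of two):
#include <Accelerate/Accelerate.h>
#include <cmath>
#include <vector>

void forwardRealFFT(const std::vector<float> &input,
                    std::vector<float> &re, std::vector<float> &im)
{
    const vDSP_Length n = input.size();                    // must be a power of two
    const vDSP_Length log2n = (vDSP_Length)std::log2((double)n);
    FFTSetup setup = vDSP_create_fftsetup(log2n, kFFTRadix2);

    std::vector<float> splitRe(n / 2), splitIm(n / 2);
    DSPSplitComplex split = { splitRe.data(), splitIm.data() };

    // Pack even samples into realp and odd samples into imagp, then transform.
    vDSP_ctoz(reinterpret_cast<const DSPComplex *>(input.data()), 2, &split, 1, n / 2);
    vDSP_fft_zrip(setup, &split, 1, log2n, FFT_FORWARD);

    // vDSP_fft_zrip output is scaled by 2 relative to the mathematical DFT.
    float half = 0.5f;
    vDSP_vsmul(split.realp, 1, &half, split.realp, 1, n / 2);
    vDSP_vsmul(split.imagp, 1, &half, split.imagp, 1, n / 2);

    // Unpack: DC is in realp[0], Nyquist is packed into imagp[0].
    re.assign(n / 2 + 1, 0.0f);
    im.assign(n / 2 + 1, 0.0f);
    re[0] = split.realp[0];
    re[n / 2] = split.imagp[0];
    for (vDSP_Length k = 1; k < n / 2; ++k) {
        re[k] = split.realp[k];
        im[k] = split.imagp[k];
    }

    vDSP_destroy_fftsetup(setup);
}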

So the main problem (as far as I can tell, as your code sample doesn't include the actual calls into vDSP) is that you’re attempting to use the real-to-complex FFT routines for step three, which is fundamentally a complex-to-complex inverse FFT (as evidenced by the fact that your Matlab results have non-zero imaginary parts).
Here’s a simple C implementation using vDSP that matches your expected Matlab output (I used the more modern vDSP_DFT routines, which should generally be preferred to the older fft routines, but otherwise this is very similar to what you’re doing, and illustrates the need for a real-to-complex forward transform, but complex-to-complex inverse transform):
#include <Accelerate/Accelerate.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    const vDSP_Length n = 8;
    vDSP_DFT_SetupD forward = vDSP_DFT_zrop_CreateSetupD(NULL, n, vDSP_DFT_FORWARD);
    vDSP_DFT_SetupD inverse = vDSP_DFT_zop_CreateSetupD(forward, n, vDSP_DFT_INVERSE);
    // Looks like a typo? The real-to-complex DFT takes its input separated into
    // the even- and odd-indexed elements. Since the real signal is [ 1, 2, 3, ... ],
    // signal[0] is 1, signal[2] is 3, and so on for the even indices.
    double even[n/2] = { 1, 3, 5, 7 };
    double odd[n/2] = { 2, 4, 6, 8 };
    double real[n] = { 0 };
    double imag[n] = { 0 };
    vDSP_DFT_ExecuteD(forward, even, odd, real, imag);
    // At this point, we have the forward real-to-complex DFT, which agrees with
    // MATLAB up to a factor of two. Since we want to double all but DC and NY
    // as part of the Hilbert transform anyway, I'm not going to bother to
    // unscale the rest of the frequencies -- they're already the values that
    // we really want. So we just need to move NY into the "right place",
    // and scale DC and NY by 0.5. The reflection frequencies are already
    // zeroed out because the real-to-complex DFT only writes to the first n/2
    // elements of real and imag.
    real[0] *= 0.5; real[n/2] = 0.5*imag[0]; imag[0] = 0.0;
    printf("Stage 2:\n");
    for (int i=0; i<n; ++i) printf("%f%+fi\n", real[i], imag[i]);
    double hilbert[2*n];
    double *hilbertreal = &hilbert[0];
    double *hilbertimag = &hilbert[n];
    vDSP_DFT_ExecuteD(inverse, real, imag, hilbertreal, hilbertimag);
    // Now we have the completed hilbert transform up to a scale factor of n.
    // We can unscale using vDSP_vsmulD.
    double scale = 1.0/n; vDSP_vsmulD(hilbert, 1, &scale, hilbert, 1, 2*n);
    printf("Stage 3:\n");
    for (int i=0; i<n; ++i) printf("%f%+fi\n", hilbertreal[i], hilbertimag[i]);
    vDSP_DFT_DestroySetupD(inverse);
    vDSP_DFT_DestroySetupD(forward);
    return 0;
}
Note that if you already have your DFT setups built and your storage allocated, the computation is extremely straightforward; the “real work” is just:
vDSP_DFT_ExecuteD(forward, even, odd, real, imag);
real[0] *= 0.5; real[n/2] = 0.5*imag[0]; imag[0] = 0.0;
vDSP_DFT_ExecuteD(inverse, real, imag, hilbertreal, hilbertimag);
double scale = 1.0/n; vDSP_vsmulD(hilbert, 1, &scale, hilbert, 1, 2*n);
Sample output:
Stage 2:
36.000000+0.000000i
-8.000000+19.313708i
-8.000000+8.000000i
-8.000000+3.313708i
-4.000000+0.000000i
0.000000+0.000000i
0.000000+0.000000i
0.000000+0.000000i
Stage 3:
1.000000+3.828427i
2.000000-1.000000i
3.000000-1.000000i
4.000000-1.828427i
5.000000-1.828427i
6.000000-1.000000i
7.000000-1.000000i
8.000000+3.828427i

Related

Arrayfire Vectorization

I'm trying to speed up the following calculations but have not been able to reach the desired speed. I'm sure the issue is with my code and not the physical limitations of the GPU.
I have a matrix V that is 10,000 x 6 x 6, and another matrix P that is 6 x 1,000. Both are complex.
I need to compute V * P (which should result in 10,000 x 6 x 1,000), take the magnitude (or magnitude squared) of it, and then sum over the 6 dimension, resulting in a 10,000 x 1,000 array of real values.
I have tried the following:
af::array V{ 10000, 6, 6, c32 };
af::array P{ 6, 1000, c32 };
af::array VP = af::matmul(V, P);   // results in 10,000 x 1,000 x 6 - OK, as long as I still sum over the 6 dim
af::array res = af::sum(af::abs(VP), 2);
This was not nearly fast enough. Then I tried converting V into an array of matrices, so I had:
af::array V[6] = { af::array{ 10000, 6, c32 }, af::array{ 10000, 6, c32 },
                   af::array{ 10000, 6, c32 }, af::array{ 10000, 6, c32 },
                   af::array{ 10000, 6, c32 }, af::array{ 10000, 6, c32 } };
af::array VP[6];
af::array res;
for (int i = 0; i < 6; i++)
{
    VP[i] = af::matmul(V[i], P);
}
res = af::abs(VP[0]);
for (int i = 1; i < 6; i++)
{
    res += af::abs(VP[i]);
}
This gave about a 2x speedup. I came up with another approach, but the af::matmul overload that takes in 3 arrays doesn't support options (like Hermitian) and doesn't support gfor, so I couldn't try that route.
Currently, the matrix multiply (in both approaches) takes about 2.2 ms, and it looks like ArrayFire can combine the abs and sum into one JIT kernel that takes about 2 ms.
My knowledge of ArrayFire is limited, so I'm guessing there is something I'm not thinking of. Does anyone have an idea of how I can increase the speed of this algorithm?
Thank you!
I can confirm your finding that the looped version is about twice as fast as the batched matmul. The matmul on its own is not what takes most of the runtime in your code snippet; it is the other operation, the sum along the third dimension after abs, that is costly. This is due to the following reasons.
1) sum(abs(result)) - abs is not the issue here either. Sum is a reduction algorithm, and reductions are usually quite fast along the fastest-moving dimension. However, for a reduction along a higher dimension, the stride between successive elements is the size of an entire matrix. That is expensive compared to a reduction over contiguous locations.
2) looped abs additions - This version, however, accesses elements that are contiguous in memory, because we are basically adding the corresponding elements of 6 matrices. On top of this, the entire loop (along with the abs op) is converted into a single JIT kernel that does roughly the following, which is very efficient:
res = res + ptr1[i] + ptr2[i] + ptr3[i] + ptr4[i] + ptr5[i]
The above line is just for illustration; it is not the exact JIT kernel.
Hence, the looped version is faster than the batched version in this specific case, because of the reduction operation that has to be done on the result of the batched matmul (see the access-pattern sketch below).
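To illustrate just the access-pattern difference (plain host C++ for clarity, not the actual GPU kernels), compare the stride between successive addends in the two formulations:
#include <cstddef>

// `plane` is rows*cols of one matrix; `slices` is the extent of the third
// dimension; `vol` stores the slices contiguously, one plane after another.

// Reduction along the third dimension: successive addends for a given output
// element are a whole plane apart in memory.
void reduceAlongDim2(const float *vol, float *out, std::size_t plane, std::size_t slices)
{
    for (std::size_t i = 0; i < plane; ++i) {
        float acc = 0.0f;
        for (std::size_t s = 0; s < slices; ++s)
            acc += vol[s * plane + i];      // stride of `plane` elements between reads
        out[i] = acc;
    }
}

// Looped additions: each pass streams through one contiguous plane, element by
// element, which is the pattern of the fused abs-addition loop.
void reduceByPlanes(const float *vol, float *out, std::size_t plane, std::size_t slices)
{
    for (std::size_t i = 0; i < plane; ++i)
        out[i] = 0.0f;
    for (std::size_t s = 0; s < slices; ++s)
        for (std::size_t i = 0; i < plane; ++i)
            out[i] += vol[s * plane + i];   // unit stride between reads
    // (abs() omitted; only the memory-access order matters for this illustration)
}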
My test GPU: GTX 1060
The matmul itself for a single [10k x 6] * [6 x 1k] takes about half a millisecond on the GTX 1060. Six such matmuls can't be done in under a millisecond, on my GTX 1060 at least, I would think. What is your target runtime?
EDIT (Jan 10, 2020): Actually, this won't work because of the abs operation on the result of each matmul.
You can try looking into our latest addition to the gemm category in the master branch of ArrayFire. However, you would have to build ArrayFire from source until our next feature release, 3.7. You can look at the documentation on the following page:
https://github.com/arrayfire/arrayfire/blob/master/include/af/blas.h#L230
It follows the principle of the C array in the cuBLAS gemm API.

FFT : FFTW Matlab FFT2 mystery

I inherited an old Fortran code with an FFT subroutine, and I am unable to trace the source of that program. The only thing I know is that there is a call to ff2prp() and a call to fft2() to perform the 2D forward and backward DFT. In order to learn what the code is doing, I take the DFT of a 4x4 2D array (matrix), and the results are very different from the Matlab and FFTW results.
Question: Can someone tell what the code is doing by looking at the output? The input and output are both real arrays.
Input array
0.20000 0.30000 1.00000 1.20000
0.00000 12.00000 5.00000 1.30000
0.30000 0.30000 1.00000 1.40000
0.00000 0.00000 0.00000 1.50000
After forward FFT with the fft2() Fortran routine
0.16875 -0.01875 -0.05000 0.05625
0.00000 12.00000 5.00000 1.30000
0.30000 0.30000 1.00000 1.40000
0.00000 0.00000 0.00000 1.50000
Matlab output performing DCT: dct2(input)
6.3750 -0.8429 -3.4250 -2.4922
2.4620 0.6181 -2.6356 -0.9887
-4.2750 -0.9798 4.2250 2.2730
-4.8352 -1.2387 5.0695 3.4819
Output from C++ code using FFTW library.
DCT from FFTW
(6.3750, 0.00) (-0.8429, 0.00) (-3.4250, 0.00) (-2.4922, 0.00)
(2.4620, 0.00) (0.6181, 0.00) (-2.6356, 0.00) (-0.9887, 0.00)
(-4.2750, 0.00) (-0.9798, 0.00) (4.2250, 0.00) (2.2730, 0.00)
(-4.8352, 0.00) (-1.2387, 0.00) (5.0695, 0.00) (3.4819, 0.00)
Forward FFT with Matlab - fft2(input)
25.5000 + 0.0000i -6.5000 - 7.2000i -10.5000 + 0.0000i -6.5000 + 7.2000i
-0.3000 -16.8000i -12.3000 + 4.8000i 0.1000 + 6.8000i 12.1000 + 5.2000i
-14.1000 + 0.0000i 3.5000 +11.2000i 9.1000 + 0.0000i 3.5000 -11.2000i
-0.3000 +16.8000i 12.1000 - 5.2000i 0.1000 - 6.8000i -12.3000 - 4.8000i
Forward FFT with FFTW
(25.50, 0.00) (-6.50, -7.20) (-10.50, 0.00) (-6.50, 7.20)
(-0.30, -16.80) (-12.30, 4.80) (0.10, 6.80) (12.10, 5.20)
(-14.10, 0.00) (3.50, 11.20) (9.10, 0.00) (3.50, -11.20)
(-0.30, 16.80) (12.10, -5.20) (0.10, -6.80) (-12.30, -4.80)
As you can see, the Matlab and FFTW outputs agree with each other, but not with the output of the Fortran code.
I would like to use FFTW, but the results are different because I can't figure out what FFT the Fortran program is doing. Can anyone tell by looking at the output?
As far as I can tell, fft2 seems to have computed the 1D FFT of the first row (leaving all 3 others unchanged), with the result being scaled by 1/16 and packed in r0, r2, r1, i1 format.
In other words the output can be constructed in Matlab using:
input = [0.2 0.3 1 1.2;0 12 5 1.3;0.3 0.3 1 1.4;0 0 0 1.5];
N = size(input,2);
A = fft(input(1,:))/16;
B = reshape([real(A);imag(A)],1,2*N);
B(2) = B(N+1);
output = [B(1:N);input(2:size(input,1),:)];
If you have some reason to believe that fft2 should be computing 2D FFTs, then there might be some problem in the way you pass your data to this routine which causes the incorrect results. Also, additional test cases with different-sized inputs (or details of how you call ff2prp) may provide more insight into the choice of scaling factor (e.g. is it 1/N^2, 1/4N, or something else).
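If you want to double-check this against FFTW directly, the following small sketch (assuming double precision and the r0, r2, r1, i1 packing described above) should reproduce the first output row of the Fortran routine:
#include <fftw3.h>
#include <cstdio>

int main()
{
    double row[4] = { 0.2, 0.3, 1.0, 1.2 };   // first row of the input
    fftw_complex out[3];                      // n/2 + 1 complex outputs for a real input of length 4

    fftw_plan p = fftw_plan_dft_r2c_1d(4, row, out, FFTW_ESTIMATE);
    fftw_execute(p);
    fftw_destroy_plan(p);

    // Expected: 0.16875  -0.01875  -0.05000  0.05625
    std::printf("%.5f %.5f %.5f %.5f\n",
                out[0][0] / 16.0,             // r0
                out[2][0] / 16.0,             // r2
                out[1][0] / 16.0,             // r1
                out[1][1] / 16.0);            // i1
    return 0;
}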

Rewriting Matlab eig(A,B) (Generalized eigenvalues/eigenvectors) to C/C++

Does anyone have any idea how I can rewrite Matlab's eig(A,B), used to calculate generalized eigenvalues/eigenvectors, in C++? I've been struggling with this problem lately. So far:
Matlab definition of eig function I need:
[V,D] = eig(A,B) produces a diagonal matrix D of generalized
eigenvalues and a full matrix V whose columns are the corresponding
eigenvectors so that A*V = B*V*D.
I have tried the Eigen library (http://eigen.tuxfamily.org/dox/classEigen_1_1GeneralizedSelfAdjointEigenSolver.html).
My implementation looks like this:
std::pair<Matrix4cd, Vector4d> eig(const Matrix4cd& A, const Matrix4cd& B)
{
    Eigen::GeneralizedSelfAdjointEigenSolver<Matrix4cd> solver(A, B);
    Matrix4cd V = solver.eigenvectors();
    Vector4d D = solver.eigenvalues();
    return std::make_pair(V, D);
}
But the first thing that comes to mind is that I can't use Vector4cd, as .eigenvalues() doesn't return complex values where Matlab does. Furthermore, the results of .eigenvectors() and .eigenvalues() for the same matrices are not the same as Matlab's at all:
C++:
Matrix4cd x;
Matrix4cd y;
pair<Matrix4cd, Vector4d> result;
for (int i = 0; i < 4; i++)
{
    for (int j = 0; j < 4; j++)
    {
        x.real()(i,j) = (double)(i+j+1+i*3);
        y.real()(i,j) = (double)(17 - (i+j+1+i*3));
        x.imag()(i,j) = (double)(i+j+1+i*3);
        y.imag()(i,j) = (double)(17 - (i+j+1+i*3));
    }
}
result = eig(x,y);
cout << result.first << endl << endl;
cout << result.second << endl << endl;
Matlab:
for i=1:1:4
    for j=1:1:4
        x(i,j) = complex((i-1)+(j-1)+1+((i-1)*3), (i-1)+(j-1)+1+((i-1)*3));
        y(i,j) = complex(17 - ((i-1)+(j-1)+1+((i-1)*3)), 17 - ((i-1)+(j-1)+1+((i-1)*3)));
    end
end
[A,B] = eig(x,y)
So I give eig the same 4x4 matrices, holding the values 1-16 ascending (x) and descending (y), but I receive different results; furthermore, the Eigen method returns double eigenvalues while Matlab returns complex doubles. I also found that there is another Eigen solver named GeneralizedEigenSolver. Its documentation (http://eigen.tuxfamily.org/dox/classEigen_1_1GeneralizedEigenSolver.html) says that it solves A*V = B*V*D, but to be honest I tried it and the results (matrix sizes) are not the same size as Matlab's, so I got quite lost about how it works (example results are on the page I've linked). It also only has an .eigenvectors() method.
C++ results:
(-0.222268,-0.0108754) (0.0803437,-0.0254809) (0.0383264,-0.0233819) (0.0995482,0.00682079)
(-0.009275,-0.0182668) (-0.0395551,-0.0582127) (0.0550395,0.03434) (-0.034419,-0.0287563)
(-0.112716,-0.0621061) (-0.010788,0.10297) (-0.0820552,0.0294896) (-0.114596,-0.146384)
(0.28873,0.257988) (0.0166259,-0.0529934) (0.0351645,-0.0322988) (0.405394,0.424698)
-1.66983
-0.0733194
0.0386832
3.97933
Matlab results:
[A,B] = eig(x,y)
A =
Columns 1 through 3
-0.9100 + 0.0900i -0.5506 + 0.4494i 0.3614 + 0.3531i
0.7123 + 0.0734i 0.4928 - 0.2586i -0.5663 - 0.4337i
0.0899 - 0.4170i -0.1210 - 0.3087i 0.0484 - 0.1918i
0.1077 + 0.2535i 0.1787 + 0.1179i 0.1565 + 0.2724i
Column 4
-0.3237 - 0.3868i
0.2338 + 0.7662i
0.5036 - 0.3720i
-0.4136 - 0.0074i
B =
Columns 1 through 3
-1.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i
0.0000 + 0.0000i -1.0000 - 0.0000i 0.0000 + 0.0000i
0.0000 + 0.0000i 0.0000 + 0.0000i -4.5745 - 1.8929i
0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i
Column 4
0.0000 + 0.0000i
0.0000 + 0.0000i
0.0000 + 0.0000i
-0.3317 + 1.1948i
My second try was with Intel IPP, but it seems that it solves only A*V = V*D, and support told me that it's not supported anymore.
https://software.intel.com/en-us/node/505270 (list of constructors for Intel IPP)
I got a suggestion to move from Intel IPP to MKL. I did it and hit a wall again. I tried to check all the eigenvalue algorithms, but it seems that only A*V = V*D problems are solved. I was checking lapack95.lib. The list of algorithms used by this library is available here:
https://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_lapack_examples/index.htm#dsyev.htm
Somewhere on the web I found a thread on MathWorks where someone said they managed to partially solve my problem using MKL:
http://jp.mathworks.com/matlabcentral/answers/40050-generalized-eigenvalue-and-eigenvectors-differences-between-matlab-eig-a-b-and-mkl-lapack-dsygv
That person said they used the dsygv algorithm, but I can't locate anything like that on the web. Maybe it's a typo.
Does anyone have any other proposition/idea for how I can implement this? Or maybe point out my mistake? I'd appreciate it.
EDIT:
In the comments I received a hint that I was using the Eigen solver incorrectly: my A matrix wasn't self-adjoint and my B matrix wasn't positive definite. I took matrices from the program I want to rewrite in C++ (from a random iteration) and checked whether they meet the requirements. They do:
Rj =
1.0e+02 *
Columns 1 through 3
0.1302 + 0.0000i -0.0153 + 0.0724i 0.0011 - 0.0042i
-0.0153 - 0.0724i 1.2041 + 0.0000i -0.0524 + 0.0377i
0.0011 + 0.0042i -0.0524 - 0.0377i 0.0477 + 0.0000i
-0.0080 - 0.0108i 0.0929 - 0.0115i -0.0055 + 0.0021i
Column 4
-0.0080 + 0.0108i
0.0929 + 0.0115i
-0.0055 - 0.0021i
0.0317 + 0.0000i
Rt =
Columns 1 through 3
4.8156 + 0.0000i -0.3397 + 1.3502i -0.2143 - 0.3593i
-0.3397 - 1.3502i 7.3635 + 0.0000i -0.5539 - 0.5176i
-0.2143 + 0.3593i -0.5539 + 0.5176i 1.7801 + 0.0000i
0.5241 + 0.9105i 0.9514 + 0.6572i -0.7302 + 0.3161i
Column 4
0.5241 - 0.9105i
0.9514 - 0.6572i
-0.7302 - 0.3161i
9.6022 + 0.0000i
As for Rj, which is now my A - it is self-adjoint, because Rj = Rj' and Rj = ctranspose(Rj) (http://mathworld.wolfram.com/Self-AdjointMatrix.html).
As for Rt, which is now my B - it is positive definite, which I checked with the method linked to me (http://www.mathworks.com/matlabcentral/answers/101132-how-do-i-determine-if-a-matrix-is-positive-definite-using-matlab). So:
>> [~,p] = chol(Rt)
p =
0
I've rewritten the matrices manually in C++ and performed eig(A,B) again with matrices meeting the requirements:
Matrix4cd x;
Matrix4cd y;
pair<Matrix4cd, Vector4d> result;
x.real()(0,0) = 13.0163601949795;
x.real()(0,1) = -1.53172561296005;
x.real()(0,2) = 0.109594869350436;
x.real()(0,3) = -0.804231869422614;
x.real()(1,0) = -1.53172561296005;
x.real()(1,1) = 120.406645675346;
x.real()(1,2) = -5.23758765476463;
x.real()(1,3) = 9.28686785230169;
x.real()(2,0) = 0.109594869350436;
x.real()(2,1) = -5.23758765476463;
x.real()(2,2) = 4.76648319080400;
x.real()(2,3) = -0.552823839520508;
x.real()(3,0) = -0.804231869422614;
x.real()(3,1) = 9.28686785230169;
x.real()(3,2) = -0.552823839520508;
x.real()(3,3) = 3.16510496622613;
x.imag()(0,0) = -0.00000000000000;
x.imag()(0,1) = 7.23946944213164;
x.imag()(0,2) = 0.419181335323979;
x.imag()(0,3) = 1.08441894337449;
x.imag()(1,0) = -7.23946944213164;
x.imag()(1,1) = -0.00000000000000;
x.imag()(1,2) = 3.76849276970080;
x.imag()(1,3) = 1.14635625342266;
x.imag()(2,0) = 0.419181335323979;
x.imag()(2,1) = -3.76849276970080;
x.imag()(2,2) = -0.00000000000000;
x.imag()(2,3) = 0.205129702522089;
x.imag()(3,0) = -1.08441894337449;
x.imag()(3,1) = -1.14635625342266;
x.imag()(3,2) = 0.205129702522089;
x.imag()(3,3) = -0.00000000000000;
y.real()(0,0) = 4.81562784930907;
y.real()(0,1) = -0.339731222392148;
y.real()(0,2) = -0.214319720979258;
y.real()(0,3) = 0.524107127885349;
y.real()(1,0) = -0.339731222392148;
y.real()(1,1) = 7.36354235698375;
y.real()(1,2) = -0.553927983436786;
y.real()(1,3) = 0.951404408649307;
y.real()(2,0) = -0.214319720979258;
y.real()(2,1) = -0.553927983436786;
y.real()(2,2) = 1.78008768533745;
y.real()(2,3) = -0.730246631850385;
y.real()(3,0) = 0.524107127885349;
y.real()(3,1) = 0.951404408649307;
y.real()(3,2) = -0.730246631850385;
y.real()(3,3) = 9.60215057284395;
y.imag()(0,0) = -0.00000000000000;
y.imag()(0,1) = 1.35016928394966;
y.imag()(0,2) = -0.359262708214312;
y.imag()(0,3) = -0.910512495060186;
y.imag()(1,0) = -1.35016928394966;
y.imag()(1,1) = -0.00000000000000;
y.imag()(1,2) = -0.517616473138836;
y.imag()(1,3) = -0.657235460367660;
y.imag()(2,0) = 0.359262708214312;
y.imag()(2,1) = 0.517616473138836;
y.imag()(2,2) = -0.00000000000000;
y.imag()(2,3) = -0.316090662865005;
y.imag()(3,0) = 0.910512495060186;
y.imag()(3,1) = 0.657235460367660;
y.imag()(3,2) = 0.316090662865005;
y.imag()(3,3) = -0.00000000000000;
result = eig(x,y);
cout << result.first << endl << endl;
cout << result.second << endl << endl;
And the results of C++:
(0.0295948,0.00562174) (-0.253532,0.0138373) (-0.395087,-0.0139696) (-0.0918132,-0.0788735)
(-0.00994614,-0.0213973) (-0.0118322,-0.0445976) (0.00993512,0.0127006) (0.0590018,-0.387949)
(0.0139485,-0.00832193) (0.363694,-0.446652) (-0.319168,0.376483) (-0.234447,-0.0859585)
(0.173697,0.268015) (0.0279387,-0.0103741) (0.0273701,0.0937148) (-0.055169,0.0295393)
0.244233
2.24309
3.24152
18.664
Results of MATLAB:
>> [A,B] = eig(Rj,Rt)
A =
Columns 1 through 3
0.0208 - 0.0218i 0.2425 + 0.0753i -0.1242 + 0.3753i
-0.0234 - 0.0033i -0.0044 + 0.0459i 0.0150 - 0.0060i
0.0006 - 0.0162i -0.4964 + 0.2921i 0.2719 + 0.4119i
0.3194 + 0.0000i -0.0298 + 0.0000i 0.0976 + 0.0000i
Column 4
-0.0437 - 0.1129i
0.2351 - 0.3142i
-0.1661 - 0.1864i
-0.0626 + 0.0000i
B =
0.2442 0 0 0
0 2.2431 0 0
0 0 3.2415 0
0 0 0 18.6640
The eigenvalues are the same! Nice, but why are the eigenvectors not similar at all?
There is no problem here with Eigen.
In fact, for the second example run, Matlab and Eigen produced the very same result. Please remember from basic linear algebra that eigenvectors are only determined up to an arbitrary scaling factor. (I.e., if v is an eigenvector, the same holds for alpha*v, where alpha is a non-zero complex scalar.)
It is quite common for different linear algebra libraries to compute different eigenvectors, but this does not mean that one of the two codes is wrong: it simply means that they choose different scalings of the eigenvectors.
EDIT
The main problem with exactly replicating the scaling chosen by Matlab is that eig(A,B) is a driver routine which, depending on the properties of A and B, may call different libraries/routines and apply extra steps like balancing the matrices and so on. By quickly inspecting your example, I would say that in this case Matlab is enforcing the following condition:
all(imag(V(end,:))==0) (the last component of each eigenvector is real)
but not imposing other constraints. This unfortunately means that the scaling is not unique, and it probably depends on intermediate results of the generalised eigenvector algorithm used. In this case I'm not able to give you advice on how to exactly replicate Matlab: knowledge of Matlab's internal workings would be required.
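If an approximate comparison is enough, you can at least remove the phase ambiguity on the Eigen side by rescaling each column so that its last component is real and non-negative, mimicking the condition above (Matlab's columns may then still differ by a sign and a real scale factor). A minimal sketch, with helper names of my own choosing:
#include <Eigen/Dense>
#include <complex>
#include <iostream>

using Eigen::Matrix4cd;

// Rotate the phase of each eigenvector so its last component is real and
// non-negative. Only the complex phase is fixed; any remaining difference
// with another library is a real scale factor (and possibly a sign).
Matrix4cd normalizePhase(const Matrix4cd &V)
{
    Matrix4cd W = V;
    for (int c = 0; c < W.cols(); ++c) {
        const std::complex<double> last = W(W.rows() - 1, c);
        if (std::abs(last) > 0.0)
            W.col(c) *= std::conj(last) / std::abs(last);
    }
    return W;
}

int main()
{
    // Any self-adjoint A and positive-definite B will do for the demonstration.
    Matrix4cd A = Matrix4cd::Random();
    A = (A + A.adjoint()).eval();
    Matrix4cd B = Matrix4cd::Random();
    B = (B * B.adjoint() + 4.0 * Matrix4cd::Identity()).eval();

    Eigen::GeneralizedSelfAdjointEigenSolver<Matrix4cd> solver(A, B);
    std::cout << normalizePhase(solver.eigenvectors()) << std::endl;
    return 0;
}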
As a general remark, in linear algebra one usually does not care too much about eigenvector scaling, since it is usually completely irrelevant for the problem being solved, when the eigenvectors are just used as intermediate results.
The only case in which the scaling has to be defined exactly is when you are going to give a graphical representation of the eigenvectors.
The eigenvector scaling in Matlab seems to be based on normalizing them to 1.0 (i.e. the absolute value of the biggest term in each vector is 1.0). In the application I was using, it also returns the left eigenvectors rather than the more commonly used right eigenvectors. This could explain the differences between Matlab and the eigensolvers in LAPACK/MKL.

Incorrect results with invert in opencv

I am using the OpenCV library in C++ for matrix inversion. I use the function invert with the DECOMP_SVD flag. For matrices which are not singular, it computes the inverse using the SVD method.
However, it gives me an incorrect answer for singular matrices (determinant = 0) when I compare it to Matlab's output for the same inversion.
The answer is off by a margin of 1e+4!
The methods I have used in Matlab are pinv() and svd(); pinv() uses the Moore-Penrose method.
Any help would be appreciated. Thanks in advance!
Example :
original =
0.2667 0.0667 -1.3333 2.2222
0.0667 0.0667 -0.0000 0.8889
-1.3333 -0.0000 8.8889 -8.8889
2.2222 0.8889 -8.8889 20.7408
Inverse from matlab =
1.0e+04 *
9.8888 -0.0000 0.7417 -0.7417
-0.0000 9.8888 -0.7417 -0.7417
0.7417 -0.7417 0.1113 0.0000
-0.7417 -0.7417 0.0000 0.1113
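For reference, the call I am making boils down to the following minimal sketch (the matrix values are the ones above; the variable names here are mine):
#include <opencv2/core.hpp>
#include <iostream>

int main()
{
    // The singular matrix from the example above.
    cv::Mat original = (cv::Mat_<double>(4, 4) <<
         0.2667,  0.0667, -1.3333,  2.2222,
         0.0667,  0.0667, -0.0000,  0.8889,
        -1.3333, -0.0000,  8.8889, -8.8889,
         2.2222,  0.8889, -8.8889, 20.7408);

    cv::Mat inv;
    // With DECOMP_SVD, invert() computes a pseudo-inverse for singular input;
    // the return value is the inverse condition number (0 for a singular matrix).
    double rcond = cv::invert(original, inv, cv::DECOMP_SVD);

    std::cout << "inverse condition number: " << rcond << "\n" << inv << std::endl;
    return 0;
}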
Your matrix is ill-conditioned (weak main diagonal).
Try to increase the main diagonal elements, and, I think, the error should decrease.

What does the resultant matrix of Homography denote?

I have 2 frames of a shaky video. I applied a homography on all the inlier points. The resultant matrices that I get for the different frames look like this:
0.2711 -0.0036 0.853
-0.0002 0.2719 -0.2247
0.0000 -0.0000 0.2704
0.4787 -0.0061 0.5514
0.0007 0.4798 -0.0799
0.0000 -0.0000 0.4797
What are those similar values on the diagonal, and how can I retrieve the translation component from these matrices?
Start with the following observation: a homography matrix is only defined up to scale. This means that if you divide or multiply all the matrix coefficients by the same number, you obtain a matrix that represents the same geometrical transformation. This is because, in order to apply the homography to a point at coordinates (x, y), you multiply its matrix H on the right by the column vector [x, y, 1]' (here I use the apostrophe symbol to denote transposition), and then divide the result H * x = [u, v, w]' by the third component w. Therefore, if instead of H you use a scaled matrix (s * H), you end up with [s*u, s*v, s*w]', which represents the same 2D point.
So, to understand what is going on with your matrices, start by dividing both of them by their bottom-right component:
octave:1> a = [
> 0.2711 -0.0036 0.853
> -0.0002 0.2719 -0.2247
> 0.0000 -0.0000 0.2704
> ];
octave:2> b=[
> 0.4787 -0.0061 0.5514
> 0.0007 0.4798 -0.0799
> 0.0000 -0.0000 0.4797];
octave:3> a/a(3,3)
ans =
1.00259 -0.01331 3.15459
-0.00074 1.00555 -0.83099
0.00000 -0.00000 1.00000
octave:4> b/b(3,3)
ans =
0.99792 -0.01272 1.14947
0.00146 1.00021 -0.16656
0.00000 -0.00000 1.00000
Now suppose, for the moment, that the third column elements in both matrices were [0, 0, 1]'. Then the effect of applying it to any point (x, y) would be to move it by approx 1/100 units (say, pixels). Basically, not changing it by much.
Plugging back the actual values for the third column shows that both matrices are, essentially, translating the whole images by constant amounts.
So, in conclusion, having equal values on the diagonals, and very small values at indices (1,2) and (2,1), means that these homographies are both (essentially) pure translations.
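If you want to do the same normalization in code rather than in Octave, a minimal plain-C++ sketch (using the second matrix from the question; the variable names are mine) would be:
#include <cstdio>

int main()
{
    // The second matrix from the question.
    double H[3][3] = {
        { 0.4787, -0.0061,  0.5514 },
        { 0.0007,  0.4798, -0.0799 },
        { 0.0000, -0.0000,  0.4797 }
    };

    // Normalize so that H[2][2] == 1; the scaled matrix represents the same homography.
    const double s = H[2][2];
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            H[r][c] /= s;

    // For an (essentially) pure translation, the third column now holds Tx and Ty.
    std::printf("Tx = %f, Ty = %f\n", H[0][2], H[1][2]);
    return 0;
}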
Various transformations involve elementary operations such as addition and multiplication of variables, division, and addition of a constant. Only the first two can be modeled by regular matrix multiplication. Note that addition of a constant and, in the case of a homography, division are impossible to represent with matrix multiplication in 2D. Adding a third coordinate (that is, converting points to a homogeneous representation) solves this problem. For example, if you want to add the constant 5 to x, you can do it like this:
1 0 5     x     x+5
0 1 0  *  y  =  y
          1
Note that the matrix is 2x3, not 2x2, and the coordinates have three numbers even though they represent 2D points. Also, the last step is the conversion back from the homogeneous to the Euclidean representation. Thus two things are achieved: first, all operations (multiplication, division, addition of variables and addition of constants) can be represented by matrix multiplication; second, we can chain multiple operations (by multiplying their matrices) and still have only a single matrix as the result.
OK, now let's explain homography. A homography is best considered in the context of the whole family of transformations, moving from simple ones to complex ones. In other words, it is easier to understand the meaning of the homography coefficients by comparing them to the coefficients of the simpler Euclidean, similarity and affine transforms. The Euclidean transformation is the simplest and represents a rigid rotation and translation in space (note that the matrix is 2x3). For the 2D case:
cos(a) -sin(a) Tx
sin(a) cos(a) Ty
Similarity adds scaling to the rotation coefficients. So now the matrix looks like this:
scl*cos(a) -scl*sin(a) Tx
scl*sin(a)  scl*cos(a) Ty
An affine transformation adds shearing, so the rotation coefficients become unrestricted:
a11 a12 Tx
a21 a22 Ty
A homography adds another row that divides the output x and y (recall how we explained the division during the transition from homogeneous to Euclidean coordinates above), and thus introduces projectivity: a non-uniform scaling that is a function of the point coordinates. This is best understood by looking at the transition to Euclidean coordinates.
a11 a12 Tx     x     a11*x+a12*y+Tx        (a11*x+a12*y+Tx)/(a31*x+a32*y+a33)
a21 a22 Ty  *  y  =  a21*x+a22*y+Ty    ->  (a21*x+a22*y+Ty)/(a31*x+a32*y+a33)
a31 a32 a33    1     a31*x+a32*y+a33
Thus a homography has an extra row compared to other transformations such as affine or similarity transforms. This extra row allows objects to be scaled depending on their coordinates, which is how projectivity arises.
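To make that transition concrete, here is a small sketch that applies a 3x3 matrix to a point in homogeneous coordinates and performs the projective division (plain C++; the helper names are mine):
#include <cstdio>

struct Point2 { double x, y; };

// Multiply [x, y, 1]' by H and divide by the third component of the result.
Point2 applyHomography(const double H[3][3], double x, double y)
{
    const double u = H[0][0] * x + H[0][1] * y + H[0][2];
    const double v = H[1][0] * x + H[1][1] * y + H[1][2];
    const double w = H[2][0] * x + H[2][1] * y + H[2][2];
    return { u / w, v / w };   // back to Euclidean coordinates
}

int main()
{
    // A pure translation by (5, 0), matching the small example above.
    const double H[3][3] = { { 1, 0, 5 }, { 0, 1, 0 }, { 0, 0, 1 } };
    const Point2 p = applyHomography(H, 2.0, 3.0);
    std::printf("(%.1f, %.1f)\n", p.x, p.y);   // prints (7.0, 3.0)
    return 0;
}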
Finally, speaking of your numbers:
0.4787 -0.0061 0.5514
0.0007 0.4798 -0.0799
0.0000 -0.0000 0.4797
This is not a homography! Just look at the last row: the first two coefficients are 0, so there is no projectivity. Since a11 = a22 and a12 ≈ -a21 ≈ 0, this is not even a general affine transformation; it is rather a similarity transform. The translation is
Tx = 0.5514/0.4797 and Ty = -0.0799/0.4797