Matrix multiplication issues using C++ Eigen and a MATLAB mexFunction

// computing the matrix operation here
// resultEigen = Input matrix
// result1Eigen = hidden bias
// result2Eigen = visible bias
// result3Eigen = weight matrix
MatrixXd H;
MatrixXd V;
double well[36];
Map<MatrixXd>( well, H.rows(), H.cols() ) = H;
H = resultEigen * result3Eigen + result1Eigen;
mexPrintf("H is here\n");
for (int i=0; i<36; i++)
{
mexPrintf("%d\n",H);
}
mexPrintf("\n");
I need to build a reconstruction function for my RBM, and since direct matrix multiplication could get me a better result, I have turned to the Eigen library, but I am facing some difficulties.
When running the above code I end up getting a single value for the H matrix, and I wonder why.
Moreover, the parameters used in the computation of H have been initialized as follows:
double *data1 = hbias;
Map<VectorXd>hidden_bias(data1,6,1);
VectorXd result1Eigen;
double result1[6];
result1Eigen = hidden_bias.transpose();
Map<VectorXd>(result1, result1Eigen.cols()) = result1Eigen;
// next param
double *data2 = vbias;
Map<VectorXd>visible_bias(data2,6,1);
VectorXd result2Eigen;
double result2[6];
result2Eigen = visible_bias.transpose();
Map<VectorXd>(result2, result2Eigen.cols()) = result2Eigen;
// next param
double *data3 = w;
Map<MatrixXd>weight_matrix(data3,n_visible,n_hidden);
MatrixXd result3Eigen;
// double result3[36];
mxArray * result3Matrix = mxCreateDoubleMatrix(n_visible, n_hidden, mxREAL );
double *result3=(double*)mxGetData(result3Matrix);
result3Eigen = weight_matrix.transpose();
Map<MatrixXd>(result3, result3Eigen.rows(), result3Eigen.cols()) = result3Eigen
Finally, I also have trouble printing data with std::cout from inside the mexFunction.
Thanks for any hints.

The problem is in the printing code: it passes the whole matrix H to mexPrintf instead of a single coefficient. Since the coefficients are doubles, they also need the %f format specifier rather than %d:
mexPrintf("%f\n",H(i));
Then, there is no need to duplicate vectors and matrices. For instance, result1 is useless, as you can get a raw pointer to the data stored in result1Eigen using result1Eigen.data(). Likewise, you can directly assign weight_matrix.transpose() to Map<MatrixXd>(result3,...), and I don't see the purpose of well.
Finally, if the sizes are really known at compile time, it is better to use Matrix<double,6,1> instead of VectorXd and Matrix<double,6,6> instead of MatrixXd. You can expect a significant speedup.
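To make the printing point concrete without depending on Eigen or the MEX headers, here is a minimal sketch in plain C++. It assumes column-major storage (the layout used by both MATLAB and Eigen's default matrices) and assumes the bias is meant to be added to every row of the product. The names affine and format_coefficients are hypothetical helpers, not part of Eigen or the MEX API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: computes H = X * W + b for column-major buffers,
// adding the bias entry b[c] to every row of column c.
// X is rows x inner, W is inner x cols, b has cols entries.
std::vector<double> affine(const std::vector<double>& X,
                           const std::vector<double>& W,
                           const std::vector<double>& b,
                           int rows, int inner, int cols)
{
    assert(static_cast<int>(X.size()) == rows * inner);
    assert(static_cast<int>(W.size()) == inner * cols);
    assert(static_cast<int>(b.size()) == cols);
    std::vector<double> H(static_cast<std::size_t>(rows) * cols);
    for (int c = 0; c < cols; ++c)
        for (int r = 0; r < rows; ++r) {
            double acc = b[c];
            for (int k = 0; k < inner; ++k)
                acc += X[r + k * rows] * W[k + c * inner];
            H[r + c * rows] = acc;
        }
    return H;
}

// Formats every coefficient with %f (the specifier for double), one per
// line, in storage order. Inside a MEX file the same format string could be
// passed to mexPrintf.
std::string format_coefficients(const std::vector<double>& H)
{
    std::string out;
    char line[64];
    for (double v : H) {
        std::snprintf(line, sizeof line, "%f\n", v);
        out += line;
    }
    return out;
}
```

The important detail is that each call formats exactly one double; passing a whole matrix object, or using %d for a double, is undefined behaviour in printf-style functions.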

Related

How to write Multiplicative Update Rules for Matrix Factorization when one doesn't have access to the whole matrix?

So we want to approximate the matrix A with m rows and n columns with the product of two matrices P and Q that have dimension mxk and kxn respectively. Here is an implementation of the multiplicative update rule due to Lee in C++ using the Eigen library.
void multiplicative_update()
{
Q = Q.cwiseProduct((P.transpose()*matrix).cwiseQuotient(P.transpose()*P*Q));
P = P.cwiseProduct((matrix*Q.transpose()).cwiseQuotient(P*Q*Q.transpose()));
}
where P, Q, and matrix (matrix = A) are member variables of the class mat_fac. I train them using the following method:
void train_2(){
double error_trial = 0;
for (int count = 0;count < num_iterations; count ++)
{
multiplicative_update();
error_trial = (matrix-P*Q).squaredNorm();
if (error_trial < 0.001)
{
break;
}
}
}
where num_iterations is also a member variable of the class mat_fac.
The problem is that I am working with very large matrices, and in particular I do not have access to the entire matrix. Given a triple (i,j,matrix[i][j]), I have access to the row vector P[i][:] and the column vector Q[:][j]. So my goal is to rewrite the multiplicative update rule in such a way that I update these two vectors every time I see a non-zero matrix value.
In code, I want to have something like this:
void multiplicative_update(int i, int j, double mat_value)
{
    Eigen::MatrixXd q_vect = get_vector(1, j); // get_vector returns Q[:][j] as a column vector
    Eigen::MatrixXd p_vect = get_vector(0, i); // get_vector returns P[i][:] as a column vector
    // Somehow compute coeff_AQ_t, coeff_PQQ_t, coeff_P_tA and coeff_P_tPQ.
    for (int idx = 0; idx < k; idx++) // idx avoids shadowing the parameter i
    {
        p_vect(idx) = p_vect(idx) * coeff_AQ_t / coeff_PQQ_t;
        q_vect(idx) = q_vect(idx) * coeff_P_tA / coeff_P_tPQ;
    }
}
Thus the problem boils down to computing the required coefficients given the two vectors. Is this a possible thing to do? If not, what more data do I need for the multiplicative update to work in this manner?

How to access matrix data in opencv by another mat with locations (indexing)

Suppose I have a Mat of indices (locations) called B with dimensions 1 x 100, and another Mat called A, full of data, with the same dimensions as B.
Now I want to access the data of A using B. Usually I would write a for loop and, for each element of B, take the corresponding element of A. For the fussiest readers of the site, this is the code that I would write:
for(int i=0; i < B.cols; i++){
int index = B.at<int>(0, i);
std::cout<<A.at<int>(0, index)<<std::endl;
}
OK, now that I have shown what I could do, is there a smarter and faster way to access the matrix A using the indices in B, like the numpy.take() function allows in Python?
This operation is called remapping. In OpenCV, you can use function cv::remap for this purpose.
Below I present a very basic example of how the remap algorithm works; please note that I don't handle border conditions in this example, but cv::remap does: it allows you to use mirroring, clamping, etc. to specify what happens if the indices exceed the dimensions of the image. I also don't show how interpolation is done; check the cv::remap documentation for that.
If you are going to use remapping you will probably have to convert the indices to floating point; you will also have to introduce another array of indices that should be trivial (all equal to 0) if your image is one-dimensional. If this becomes a performance problem, I'd suggest you implement the 1-D remap equivalent yourself. But benchmark first before optimizing, of course.
For all the details, check the documentation, which covers everything you need to know to use the algorithm.
#include <opencv2/core.hpp>

cv::Mat_<float> remap_example(const cv::Mat_<float>& image,
                              const cv::Mat_<float>& positions_x,
                              const cv::Mat_<float>& positions_y)
{
    // sizes of positions arrays must be the same
    int size_x = positions_x.cols;
    int size_y = positions_x.rows;
    cv::Mat_<float> out(size_y, size_x);
    for (int y = 0; y < size_y; ++y)
        for (int x = 0; x < size_x; ++x)
        {
            // cv::Mat_ is indexed as (row, column), i.e. (y, x)
            float ps_x = positions_x(y, x);
            float ps_y = positions_y(y, x);
            // A real implementation interpolates the intensity at
            // image(ps_y, ps_x) and handles border conditions here, e.g.
            // float interpolated = bilinear_interpolation(image, ps_x, ps_y);
            // As a stand-in, take the nearest pixel without any border handling.
            float interpolated = image(cvRound(ps_y), cvRound(ps_x));
            out(y, x) = interpolated;
        }
    return out;
}
One fast way is to use pointers for both A (data) and B (indices).
const int* pA = A.ptr<int>(0);
const int* pIndexB = B.ptr<int>(0);
int sum = 0;
for(int i = 0; i < B.cols; ++i)
{
sum += pA[*pIndexB++];
}
Note: Be careful with the pixel type; in this case (as in your code) it is int!
Note2: Using cout for each element access makes the optimization useless!
Note3: In this article Satya compares four methods for pixel access, and the fastest seems to be "forEach": https://www.learnopencv.com/parallel-pixel-access-in-opencv-using-foreach/
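For comparison with numpy.take(), the same gather can be sketched on plain std::vector buffers. This is only an illustration under the assumption that A and B are single rows of int; take is a hypothetical helper name, not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring numpy.take for a single row:
// out[i] = data[indices[i]]. Each index is bounds-checked with assert.
std::vector<int> take(const std::vector<int>& data,
                      const std::vector<int>& indices)
{
    std::vector<int> out;
    out.reserve(indices.size());
    for (int idx : indices) {
        assert(idx >= 0 && static_cast<std::size_t>(idx) < data.size());
        out.push_back(data[static_cast<std::size_t>(idx)]);
    }
    return out;
}
```

With a continuous cv::Mat row, the pointers returned by ptr<int>(0) can be walked in exactly the same way, so the loop body stays a single indexed read.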

Getting values for specific frequencies in a short time fourier transform

I'm trying to use C++ to recreate the spectrogram function used by Matlab. The function uses a Short Time Fourier Transform (STFT). I found some C++ code here that performs a STFT. The code seems to work perfectly for all frequencies but I only want a few. I found this post for a similar question with the following answer:
"Just take the inner product of your data with a complex exponential at
the frequency of interest. If g is your data, then just substitute for
f the value of the frequency you want (e.g., 1, 3, 10, ...)"
Having no background in mathematics, I can't figure out how to do this. The inner product part seems simple enough from the Wikipedia page but I have absolutely no idea what he means by (with regard to the formula for a DFT)
"a complex exponential at frequency of interest"
Could someone explain how I might be able to do this? My data structure after the STFT is a matrix filled with complex numbers. I just don't know how to extract my desired frequencies.
Relevant function, where window is Hamming, and vector of desired frequencies isn't yet an input because I don't know what to do with them:
Matrix<complex<double>> ShortTimeFourierTransform::Calculate(const vector<double> &signal,
const vector<double> &window, int windowSize, int hopSize)
{
int signalLength = signal.size();
int nOverlap = hopSize;
int cols = (signal.size() - nOverlap) / (windowSize - nOverlap);
Matrix<complex<double>> results(window.size(), cols);
int chunkPosition = 0;
int readIndex;
// Should we stop reading in chunks?
bool shouldStop = false;
int numChunksCompleted = 0;
int i;
// Process each chunk of the signal
while (chunkPosition < signalLength && !shouldStop)
{
// Copy the chunk into our buffer
for (i = 0; i < windowSize; i++)
{
readIndex = chunkPosition + i;
if (readIndex < signalLength)
{
// Note the windowing!
data[i][0] = signal[readIndex] * window[i];
data[i][1] = 0.0;
}
else
{
// we have read beyond the signal, so zero-pad it!
data[i][0] = 0.0;
data[i][1] = 0.0;
shouldStop = true;
}
}
// Perform the FFT on our chunk
fftw_execute(plan_forward);
// Copy the first (windowSize/2 + 1) data points into your spectrogram.
// We do this because the FFT output is mirrored about the nyquist
// frequency, so the second half of the data is redundant. This is how
// Matlab's spectrogram routine works.
for (i = 0; i < windowSize / 2 + 1; i++)
{
double real = fft_result[i][0];
double imaginary = fft_result[i][1];
results(i, numChunksCompleted) = complex<double>(real, imaginary);
}
chunkPosition += hopSize;
numChunksCompleted++;
} // Excuse the formatting, the while ends here.
return results;
}
Look up the Goertzel algorithm or filter for example code that uses the computational equivalent of an inner product against a complex exponential to measure the presence or magnitude of a specific stationary sinusoidal frequency in a signal. Performance or resolution will depend on the length of the filter and your signal.
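The inner product from the quoted advice can also be written out directly. The sketch below is not the Goertzel algorithm itself (Goertzel obtains the same kind of coefficient with a cheaper recurrence); it simply sums frame[n] * exp(-2*pi*i*f*n/fs) over one windowed frame. The function name single_frequency_dft and its parameters are assumptions made for illustration, and the frame is assumed to be already multiplied by the window.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: inner product of one (already windowed) frame with
// exp(-2*pi*i*f*n/fs), i.e. a single DFT coefficient evaluated at
// frequency_hz instead of at every FFT bin.
std::complex<double> single_frequency_dft(const std::vector<double>& frame,
                                          double frequency_hz,
                                          double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    const double pi = std::acos(-1.0);
    const double step = -2.0 * pi * frequency_hz / sample_rate_hz;
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t n = 0; n < frame.size(); ++n)
        sum += frame[n] * std::polar(1.0, step * static_cast<double>(n));
    return sum;
}
```

Calling this once per chunk and per desired frequency would fill one row of the spectrogram for each frequency of interest, at a cost proportional to the frame length per call.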

C++ eigenvalue/vector decomposition, only need first n vectors fast

I have a ~3000x3000 covariance-like matrix on which I compute the eigenvalue-eigenvector decomposition (it's an OpenCV matrix, and I use cv::eigen() to get the job done).
However, I actually only need, say, the first 30 eigenvalues/vectors; I don't care about the rest. Theoretically, this should allow the computation to be sped up significantly, right? After all, there are 2970 fewer eigenvectors to compute.
Which C++ library will allow me to do that? Please note that OpenCV's eigen() method does have the parameters for that, but the documentation says they are ignored, and I tested it myself, they are indeed ignored :D
UPDATE:
I managed to do it with ARPACK. I managed to compile it for Windows, and even to use it. The results look promising; an illustration can be seen in this toy example:
#include "ardsmat.h"
#include "ardssym.h"
int n = 3; // Dimension of the problem.
double* EigVal = NULL; // Eigenvalues.
double* EigVec = NULL; // Eigenvectors stored sequentially.
int lowerHalfElementCount = (n*n+n) / 2;
//whole matrix:
/*
2 3 8
3 9 -7
8 -7 19
*/
double* lower = new double[lowerHalfElementCount]; //lower half of the matrix
//to be filled with COLUMN major (i.e. one column after the other, always starting from the diagonal element)
lower[0] = 2; lower[1] = 3; lower[2] = 8; lower[3] = 9; lower[4] = -7; lower[5] = 19;
//params: dimension (i.e. width/height), array with the values of the lower or upper half (stored sequentially as described above), 'L' for lower or 'U' for upper
ARdsSymMatrix<double> mat(n, lower, 'L');
// Defining the eigenvalue problem.
int noOfEigVecValues = 2;
//int maxIterations = 50000000;
//ARluSymStdEig<double> dprob(noOfEigVecValues, mat, "LM", 0, 0.5, maxIterations);
ARluSymStdEig<double> dprob(noOfEigVecValues, mat);
// Finding eigenvalues and eigenvectors.
int converged = dprob.EigenValVectors(EigVec, EigVal);
for (int eigValIdx = 0; eigValIdx < noOfEigVecValues; eigValIdx++) {
std::cout << "Eigenvalue: " << EigVal[eigValIdx] << "\nEigenvector: ";
for (int i = 0; i < n; i++) {
int idx = n*eigValIdx+i;
std::cout << EigVec[idx] << " ";
}
std::cout << std::endl;
}
The results are:
9.4298, 24.24059
for the eigenvalues, and
-0.523207, -0.83446237, -0.17299346
0.273269, -0.356554, 0.893416
for the 2 eigenvectors respectively (one eigenvector per row)
The code fails to find 3 eigenvectors (it can only find 1-2 in this case, an assert() makes sure of that, but well, that's not a problem).
In this article, Simon Funk shows a simple, effective way to estimate a singular value decomposition (SVD) of a very large matrix. In his case, the matrix is sparse, with dimensions: 17,000 x 500,000.
Eigenvalue decomposition is closely related to SVD. Thus, you might benefit from considering a modified version of Simon Funk's approach, especially if your matrix is sparse. Furthermore, your matrix is not only square but also symmetric (if that is what you mean by covariance-like), which likely leads to additional simplification.
... Just an idea :)
It seems that Spectra will do the job with good performance.
Here is an example from their documentation that computes the three largest eigenvalues of a dense symmetric matrix M (like your covariance matrix):
#include <Eigen/Core>
#include <Spectra/SymEigsSolver.h>
// <Spectra/MatOp/DenseSymMatProd.h> is implicitly included
#include <iostream>
using namespace Spectra;
int main()
{
// We are going to calculate the eigenvalues of M
Eigen::MatrixXd A = Eigen::MatrixXd::Random(10, 10);
Eigen::MatrixXd M = A + A.transpose();
// Construct matrix operation object using the wrapper class DenseSymMatProd
DenseSymMatProd<double> op(M);
// Construct eigen solver object, requesting the largest three eigenvalues
SymEigsSolver< double, LARGEST_ALGE, DenseSymMatProd<double> > eigs(&op, 3, 6);
// Initialize and compute
eigs.init();
int nconv = eigs.compute();
// Retrieve results
Eigen::VectorXd evalues;
if(eigs.info() == SUCCESSFUL)
evalues = eigs.eigenvalues();
std::cout << "Eigenvalues found:\n" << evalues << std::endl;
return 0;
}
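To give a feel for why asking for only a few eigenpairs can be cheap, here is a sketch of plain power iteration. It is a much simpler technique than the implicitly restarted Lanczos-type methods behind ARPACK and Spectra, and it only finds the eigenvalue of largest magnitude, so treat it as an illustration rather than a replacement. The function name and the row-major layout are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Power iteration on a dense symmetric n x n matrix stored row-major.
// Returns an estimate of the eigenvalue of largest magnitude and writes the
// corresponding unit eigenvector into `vec`.
double dominant_eigenpair(const std::vector<double>& m, std::size_t n,
                          std::vector<double>& vec, int iterations = 500)
{
    assert(n > 0 && m.size() == n * n);
    vec.assign(n, 1.0);                       // arbitrary nonzero start vector
    std::vector<double> next(n);
    double eigenvalue = 0.0;
    for (int it = 0; it < iterations; ++it) {
        double norm = 0.0;
        for (std::size_t r = 0; r < n; ++r) { // next = m * vec
            next[r] = 0.0;
            for (std::size_t c = 0; c < n; ++c)
                next[r] += m[r * n + c] * vec[c];
            norm += next[r] * next[r];
        }
        norm = std::sqrt(norm);
        if (norm == 0.0) break;               // vec was mapped to zero
        eigenvalue = 0.0;                     // Rayleigh quotient vec^T * m * vec
        for (std::size_t r = 0; r < n; ++r)   // (vec has unit length after pass 1)
            eigenvalue += vec[r] * next[r];
        for (std::size_t r = 0; r < n; ++r)
            vec[r] = next[r] / norm;
    }
    return eigenvalue;
}
```

Each pass costs one matrix-vector product, which is the same building block the library solvers rely on; they just combine many such products far more cleverly to extract several eigenpairs at once.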

Access violation at run time (0xC0000005)

The process I want to carry out is to apply the FFT to an image (stored in "imagen"), then multiply the result by a filter 'H', and finally apply the inverse FFT.
The code is shown below:
int ancho;
int alto;
ancho=ui.imageframe->imagereader->GetBufferedRegion().GetSize()[0]; //ancho=width of the image
alto=ui.imageframe->imagereader->GetBufferedRegion().GetSize()[1]; //alto=height of the image
double *H ;
H =matrix2D_H(ancho,alto,eta,sigma); // H is calculated
// We want to get: F= fft(f) ; H*F ; f'=ifft(H*F)
// Initialization of the necessary elements for the calculation of the fft
fftw_complex *out;
fftw_plan p;
int N= (ancho/2+1)*alto; //number of points of the image
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*N);
double *in = (double*) imagen.GetPointer(); // conversion of itk.smartpointer --> double*
p = fftw_plan_dft_r2c_2d(ancho, alto, in, out, FFTW_ESTIMATE); // FFT planning
fftw_execute(p); // FFT calculation
/* Multiplication of the Output of the FFT with the Filter H*/
int a = alto;
int b = ancho/2 +1; // The second dimension has this value because the FFT of a real image only computes the non-redundant outputs; the FFT output and the filter 'H' therefore have the same size.
// Matrix point-by-point multiplication: [axb]*[axb]
fftw_complex* res ; // result will be stored here
res = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*a*b);
res = multiply_matrix_2D(out,H, a, b);
The problem is located here, in the loop inside the function ‘multiply_matrix_2D’:
fftw_complex* prueba_r01::multiply_matrix_2D(fftw_complex* out, double* H, int M ,int N){
/* The matrix out[MxN] or [n0x(n1/2)+1] is the image after the FFT, and out_H[MxN] is the filter in the frequency domain;
both are multiplied POINT BY POINT, once for the real part and once for the imaginary part
*/
fftw_complex *H_cast;
H_cast = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*M*N);
H_cast= reinterpret_cast<fftw_complex*> (H); // casting from double* to fftw_complex*
fftw_complex *res; // the result of the multiplication will be stored here
res = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*M*N);
//Loop for calculating the matrix point-to-point multiplication
for (int x = 0; x<M ; x++){
for (int y = 0; y<N ; y++){
res[x*N+y][0] = out[x*N+y][0]*(H_cast[x*N+y][0]+H_cast[x*N+y][1]);
res[x*N+y][1] = out[x*N+y][1]*(H_cast[x*N+y][0]+H_cast[x*N+y][1]);
}
}
fftw_free(H_cast);
return res;
}
The crash happens at x = 95 and y = 93, with M = 191 and N = 96:
Unhandled exception at 0x004273ab in prueba_r01.exe: 0xC0000005: Access violation reading location 0x01274000.
Screenshot: http://img846.imageshack.us/img846/4585/accessviolationproblem.png
In the debugger many of the variable values are shown in red, and (translated) the value box of H_cast[][1] reads: "Error30CXX0000 : impossible to evaluate the expression".
I would really appreciate any kind of help with this, please!
Antonio
This part of the code
H_cast = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*M*N);
H_cast= reinterpret_cast<fftw_complex*> (H); // casting from double* to fftw_complex*
first allocates a new buffer for H_cast and then immediately sets it to point to the original H instead. It doesn't copy the data, just the pointer.
At the end of the function a buffer is freed
fftw_free(H_cast);
which seems to free the data pointed to by H and not the buffer allocated in the function (that one is leaked).
Back in the caller, H now points to freed memory!
There is an FFT class in ITK that can use fftw when USE_FFTW is enabled in the CMake configuration. This class shows how to reference the ITK raw buffer memory from fftw.
PS: The upcoming ITKv4 has greatly improved the fftw compatibility.
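One way to sidestep both the cast and the free is sketched here, assuming H holds one real gain per frequency sample (M*N doubles). If that assumption is right, reinterpreting H as fftw_complex makes the loop read 2*M*N doubles, so it would run past the end of H roughly halfway through, which is consistent with the crash at x = 95 out of M = 191. The type complex_pair stands in for fftw_complex (same double[2] layout) so the sketch compiles without FFTW, and apply_real_filter is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Same memory layout as fftw_complex: [0] is the real part, [1] the imaginary.
typedef double complex_pair[2];

// Hypothetical replacement for multiply_matrix_2D: scales each complex
// sample of `spectrum` in place by the matching real gain in `filter`.
// The filter is read as plain doubles, with no cast and no allocation, so
// the caller keeps ownership of both buffers.
void apply_real_filter(complex_pair* spectrum, const double* filter,
                       std::size_t count)
{
    assert(spectrum != nullptr && filter != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        spectrum[i][0] *= filter[i];   // real part
        spectrum[i][1] *= filter[i];   // imaginary part
    }
}
```

Working in place also avoids the extra res buffer; the caller would pass out, H and a*b, then hand out to the inverse transform and free each buffer exactly once.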