Matrix multiplication issues using C++ Eigen, and matlab mexFunction

// computing the matrix operation here
// resultEigen = Input matrix
// result1Eigen = hidden bias
// result2Eigen = visible bias
// result3Eigen = weight matrix
MatrixXd H;
MatrixXd V;
double well[36];
Map<MatrixXd>( well, H.rows(), H.cols() ) = H;
H = resultEigen * result3Eigen + result1Eigen;
mexPrintf("H is here\n");
for (int i=0; i<36; i++)
{
mexPrintf("%d\n",H);
}
mexPrintf("\n");
I need to build a reconstruction function for my RBM. Since direct matrix multiplication could give me a better result, I have turned to the Eigen library, but I am running into some difficulties.
When I run the code above, I get only a single value printed for the H matrix, and I don't understand why.
The parameters used in the computation of H are initialized as follows:
double *data1 = hbias;
Map<VectorXd>hidden_bias(data1,6,1);
VectorXd result1Eigen;
double result1[6];
result1Eigen = hidden_bias.transpose();
Map<VectorXd>(result1, result1Eigen.cols()) = result1Eigen;
// next param
double *data2 = vbias;
Map<VectorXd>visible_bias(data2,6,1);
VectorXd result2Eigen;
double result2[6];
result2Eigen = visible_bias.transpose();
Map<VectorXd>(result2, result2Eigen.cols()) = result2Eigen;
// next param
double *data3 = w;
Map<MatrixXd>weight_matrix(data3,n_visible,n_hidden);
MatrixXd result3Eigen;
// double result3[36];
mxArray * result3Matrix = mxCreateDoubleMatrix(n_visible, n_hidden, mxREAL );
double *result3=(double*)mxGetData(result3Matrix);
result3Eigen = weight_matrix.transpose();
Map<MatrixXd>(result3, result3Eigen.rows(), result3Eigen.cols()) = result3Eigen
Finally, I also have trouble printing data with std::cout from inside the mexFunction.
Thanks for any hints.

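The problem is in the printing code: it passes the whole matrix H to mexPrintf instead of one coefficient per iteration. Since the coefficients are doubles, the format should also be %f rather than %d:
mexPrintf("%f\n", H(i));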
Also, there is no need to duplicate vectors and matrices. For instance, result1 is useless, as you can get a raw pointer to the data stored in result1Eigen using result1Eigen.data(). Likewise, you can directly assign weight_matrix.transpose() to Map<MatrixXd>(result3,...), and I don't see the purpose of well.
Finally, if the sizes are really known at compile time, it is better to use Matrix<double,6,1> instead of VectorXd and Matrix<double,6,6> instead of MatrixXd. You can expect a significant speedup.
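As a rough illustration of the printing fix (access one coefficient at a time and format doubles with %f), here is a stand-alone sketch. It deliberately avoids Eigen and the MEX API so that it compiles anywhere; the function names hidden_preactivation and format_values are made up for this example, and it assumes a single input row and a weight matrix stored column-major, which is how MATLAB lays out mxArray data.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Computes h = v * W + b for one input row v (length n_visible).
// W is assumed to be stored column-major with n_visible rows and
// n_hidden columns; b has length n_hidden.
std::vector<double> hidden_preactivation(const std::vector<double>& v,
                                         const std::vector<double>& W,
                                         const std::vector<double>& b)
{
    const std::size_t n_visible = v.size();
    const std::size_t n_hidden = b.size();
    std::vector<double> h(b);  // start from the bias
    for (std::size_t j = 0; j < n_hidden; ++j)
        for (std::size_t i = 0; i < n_visible; ++i)
            h[j] += v[i] * W[i + j * n_visible];  // column-major index
    return h;
}

// Formats one value per line with %f, the conversion that matches double.
std::string format_values(const std::vector<double>& values)
{
    std::string out;
    char buf[64];
    for (double x : values) {
        std::snprintf(buf, sizeof buf, "%f\n", x);
        out += buf;
    }
    return out;
}
```

Inside a mexFunction the same format string would be handed to mexPrintf, one coefficient per call.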

Related

How to write Multiplicative Update Rules for Matrix Factorization when one doesn't have access to the whole matrix?

So we want to approximate the matrix A with m rows and n columns with the product of two matrices P and Q that have dimension mxk and kxn respectively. Here is an implementation of the multiplicative update rule due to Lee in C++ using the Eigen library.
void multiplicative_update()
{
Q = Q.cwiseProduct((P.transpose()*matrix).cwiseQuotient(P.transpose()*P*Q));
P = P.cwiseProduct((matrix*Q.transpose()).cwiseQuotient(P*Q*Q.transpose()));
}
where P, Q, and the matrix (matrix = A) are global variables in the class mat_fac. Thus I train them using the following method,
void train_2(){
double error_trial = 0;
for (int count = 0;count < num_iterations; count ++)
{
multiplicative_update();
error_trial = (matrix-P*Q).squaredNorm();
if (error_trial < 0.001)
{
break;
}
}
}
where num_iterations is also a global variable in the class mat_fac.
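Written out entry by entry, the two lines of multiplicative_update() read as follows (a restatement of the code above, with A standing for matrix and indices a, b running over the k latent dimensions; in the code Q is updated first and the P update then uses the new Q):

```latex
\begin{aligned}
Q_{aj} &\leftarrow Q_{aj}\,\frac{(P^{\top}A)_{aj}}{(P^{\top}PQ)_{aj}},
&\quad (P^{\top}A)_{aj} &= \sum_{i=1}^{m} P_{ia}A_{ij},
&\quad (P^{\top}PQ)_{aj} &= \sum_{b=1}^{k} (P^{\top}P)_{ab}\,Q_{bj},\\
P_{ia} &\leftarrow P_{ia}\,\frac{(AQ^{\top})_{ia}}{(PQQ^{\top})_{ia}},
&\quad (AQ^{\top})_{ia} &= \sum_{j=1}^{n} A_{ij}Q_{aj},
&\quad (PQQ^{\top})_{ia} &= \sum_{b=1}^{k} P_{ib}\,(QQ^{\top})_{ba}.
\end{aligned}
```

Read this way, the numerator for column j of Q sums over every row i of column j of A, and its denominator needs the k x k matrix P^T P; the update for row i of P likewise needs all of row i of A and the k x k matrix Q Q^T.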
The problem is that I am working with very large matrices and, in particular, I do not have access to the entire matrix. Given a triple (i,j,matrix[i][j]), I have access to the row vector P[i][:] and the column vector Q[:][j]. So my goal is to rewrite the multiplicative update rule in such a way that I update these two vectors every time I see a non-zero matrix value.
In code, I want to have something like this:
void multiplicative_update(int i, int j, double mat_value)
{
    Eigen::MatrixXd q_vect = get_vector(1, j); // get_vector returns Q[:][j] as a column vector
    Eigen::MatrixXd p_vect = get_vector(0, i); // get_vector returns P[i][:] as a column vector
    // Somehow compute coeff_AQ_t, coeff_PQQ_t, coeff_P_tA and coeff_P_tPQ.
    for (int a = 0; a < k; a++)
    {
        p_vect(a) = p_vect(a) * coeff_AQ_t / coeff_PQQ_t;
        q_vect(a) = q_vect(a) * coeff_P_tA / coeff_P_tPQ;
    }
}
Thus the problem boils down to computing the required coefficients given the two vectors. Is this possible? If not, what additional data do I need for the multiplicative update to work in this manner?

How to access matrix data in opencv by another mat with locations (indexing)

Suppose I have a Mat of indices (locations) called B, with dimensions 1 x 100, and another Mat called A, of the same dimensions, full of data.
Now I want to access the data of A using B. Usually I would write a for loop and, for each element of B, take the corresponding element of A. For the sticklers on this site, this is the code I would write:
for(int i=0; i < B.cols; i++){
    int index = B.at<int>(0, i);
    std::cout<<A.at<int>(0, index)<<std::endl;
}
OK, now that I have shown what I can do, is there a smarter and faster way to access the matrix A using the indices in B, as one can do in Python with the numpy.take() function?

This operation is called remapping. In OpenCV, you can use the function cv::remap for this purpose.
Below is a very basic example of how the remap algorithm works. Note that I don't handle border conditions in this example, but cv::remap does: it lets you use mirroring, clamping, etc. to specify what happens if the indices exceed the dimensions of the image. I also don't show how interpolation is done; check the cv::remap documentation for that.
If you are going to use remapping, you will probably have to convert the indices to floating point; you will also have to introduce a second array of indices, which is trivial (all equal to 0) if your image is one-dimensional. If this becomes a performance problem, I'd suggest implementing the 1-D remap equivalent yourself. But benchmark before optimizing, of course.
For all the details, check the documentation, which covers everything you need to know to use the algorithm.
#include <opencv2/core.hpp>

cv::Mat_<float> remap_example(const cv::Mat_<float>& image,
                              const cv::Mat_<float>& positions_x,
                              const cv::Mat_<float>& positions_y)
{
    // positions_x and positions_y must have the same size
    int size_x = positions_x.cols;
    int size_y = positions_x.rows;
    cv::Mat_<float> out(size_y, size_x);
    for (int y = 0; y < size_y; ++y)
        for (int x = 0; x < size_x; ++x)
        {
            // cv::Mat_ is indexed as (row, column)
            float ps_x = positions_x(y, x);
            float ps_y = positions_y(y, x);
            // Interpolation and border handling belong here. As a stand-in,
            // take the nearest pixel; this assumes (ps_x, ps_y) lies inside image.
            float interpolated = image(cvRound(ps_y), cvRound(ps_x));
            out(y, x) = interpolated;
        }
    return out;
}

One fast way is to use pointers for both A (data) and B (indices).
const int* pA = A.ptr<int>(0);
const int* pIndexB = B.ptr<int>(0);
int sum = 0;
for (int i = 0; i < B.cols; ++i)
{
    sum += pA[*pIndexB++];
}
Note: Be careful with the pixel type; in this case (as in your code) it is int.
Note 2: Calling cout for each element access makes the optimization pointless.
Note 3: In this article Satya compares four methods for pixel access, and the fastest seems to be "forEach": https://www.learnopencv.com/parallel-pixel-access-in-opencv-using-foreach/
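For comparison with numpy.take(), the pointer-based gather above can be sketched without OpenCV. The helper below is hypothetical (OpenCV does not provide a function called take); it works on std::vector and adds a bounds check that the raw-pointer loop above does not have.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns {data[indices[0]], data[indices[1]], ...}.
// Throws std::out_of_range if any index falls outside data.
template <typename T>
std::vector<T> take(const std::vector<T>& data, const std::vector<int>& indices)
{
    std::vector<T> out;
    out.reserve(indices.size());
    const T* p = data.data();  // raw pointer, as in the answer above
    for (int idx : indices) {
        if (idx < 0 || static_cast<std::size_t>(idx) >= data.size())
            throw std::out_of_range("take: index out of range");
        out.push_back(p[idx]);
    }
    return out;
}
```

With a continuous single-row cv::Mat, the same loop could read from A.ptr<int>(0) and B.ptr<int>(0) instead of the vectors.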

Getting values for specific frequencies in a short time fourier transform

I'm trying to use C++ to recreate the spectrogram function used by Matlab. The function uses a Short Time Fourier Transform (STFT). I found some C++ code here that performs a STFT. The code seems to work perfectly for all frequencies but I only want a few. I found this post for a similar question with the following answer:
Just take the inner product of your data with a complex exponential at
the frequency of interest. If g is your data, then just substitute for
f the value of the frequency you want (e.g., 1, 3, 10, ...)
Having no background in mathematics, I can't figure out how to do this. The inner product part seems simple enough from the Wikipedia page but I have absolutely no idea what he means by (with regard to the formula for a DFT)
a complex exponential at frequency of interest
Could someone explain how I might be able to do this? My data structure after the STFT is a matrix filled with complex numbers. I just don't know how to extract my desired frequencies.
Relevant function, where window is Hamming, and vector of desired frequencies isn't yet an input because I don't know what to do with them:
Matrix<complex<double>> ShortTimeFourierTransform::Calculate(const vector<double> &signal,
const vector<double> &window, int windowSize, int hopSize)
{
int signalLength = signal.size();
int nOverlap = hopSize;
int cols = (signal.size() - nOverlap) / (windowSize - nOverlap);
Matrix<complex<double>> results(window.size(), cols);
int chunkPosition = 0;
int readIndex;
// Should we stop reading in chunks?
bool shouldStop = false;
int numChunksCompleted = 0;
int i;
// Process each chunk of the signal
while (chunkPosition < signalLength && !shouldStop)
{
// Copy the chunk into our buffer
for (i = 0; i < windowSize; i++)
{
readIndex = chunkPosition + i;
if (readIndex < signalLength)
{
// Note the windowing!
data[i][0] = signal[readIndex] * window[i];
data[i][1] = 0.0;
}
else
{
// we have read beyond the signal, so zero-pad it!
data[i][0] = 0.0;
data[i][1] = 0.0;
shouldStop = true;
}
}
// Perform the FFT on our chunk
fftw_execute(plan_forward);
// Copy the first (windowSize/2 + 1) data points into your spectrogram.
// We do this because the FFT output is mirrored about the nyquist
// frequency, so the second half of the data is redundant. This is how
// Matlab's spectrogram routine works.
for (i = 0; i < windowSize / 2 + 1; i++)
{
double real = fft_result[i][0];
double imaginary = fft_result[i][1];
results(i, numChunksCompleted) = complex<double>(real, imaginary);
}
chunkPosition += hopSize;
numChunksCompleted++;
} // Excuse the formatting, the while ends here.
return results;
}

Look up the Goertzel algorithm or filter for example code that uses the computational equivalent of an inner product against a complex exponential to measure the presence or magnitude of a specific stationary sinusoidal frequency in a signal. Performance or resolution will depend on the length of the filter and your signal.
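To make "inner product with a complex exponential" concrete, here is a minimal sketch with two hypothetical helpers: dft_at computes the inner product directly, and goertzel_power obtains the squared magnitude of the same quantity with the Goertzel recurrence. It assumes g is one already-windowed chunk of the signal, freq is in Hz, and sample_rate is the sampling rate; none of these names come from the STFT code in the question.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Inner product of g with a complex exponential at `freq`:
// sum over n of g[n] * exp(-2*pi*i*freq*n / sample_rate).
std::complex<double> dft_at(const std::vector<double>& g,
                            double freq, double sample_rate)
{
    const double pi = std::acos(-1.0);
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t n = 0; n < g.size(); ++n) {
        const double phase = -2.0 * pi * freq * static_cast<double>(n) / sample_rate;
        acc += g[n] * std::polar(1.0, phase);
    }
    return acc;
}

// Goertzel recurrence: returns the squared magnitude of dft_at(g, freq,
// sample_rate) using a second-order real filter instead of complex math.
double goertzel_power(const std::vector<double>& g,
                      double freq, double sample_rate)
{
    const double pi = std::acos(-1.0);
    const double coeff = 2.0 * std::cos(2.0 * pi * freq / sample_rate);
    double s_prev = 0.0;   // s[n-1]
    double s_prev2 = 0.0;  // s[n-2]
    for (double sample : g) {
        const double s = sample + coeff * s_prev - s_prev2;
        s_prev2 = s_prev;
        s_prev = s;
    }
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2;
}
```

In the STFT loop from the question, one would call one of these per chunk and per frequency of interest instead of running the full FFT.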

C++ eigenvalue/vector decomposition, only need first n vectors fast

I have a ~3000x3000 covariance-like matrix on which I compute the eigenvalue-eigenvector decomposition (it's an OpenCV matrix, and I use cv::eigen() to get the job done).
However, I actually only need, say, the first 30 eigenvalues/vectors; I don't care about the rest. In theory, this should speed up the computation significantly, right? After all, there are 2970 fewer eigenvectors to compute.
Which C++ library will let me do that? Note that OpenCV's eigen() method does have parameters for this, but the documentation says they are ignored, and I tested it myself: they are indeed ignored :D
UPDATE:
I managed to do it with ARPACK. I managed to compile it for Windows and even to use it. The results look promising; an illustration can be seen in this toy example:
#include "ardsmat.h"
#include "ardssym.h"
int n = 3; // Dimension of the problem.
double* EigVal = NULL; // Eigenvalues.
double* EigVec = NULL; // Eigenvectors stored sequentially.
int lowerHalfElementCount = (n*n+n) / 2;
//whole matrix:
/*
2 3 8
3 9 -7
8 -7 19
*/
double* lower = new double[lowerHalfElementCount]; //lower half of the matrix
//to be filled with COLUMN major (i.e. one column after the other, always starting from the diagonal element)
lower[0] = 2; lower[1] = 3; lower[2] = 8; lower[3] = 9; lower[4] = -7; lower[5] = 19;
//params: dimension (i.e. width/height), array with the values of the lower or upper half (stored sequentially, column major as above), 'L' or 'U' for lower or upper
ARdsSymMatrix<double> mat(n, lower, 'L');
// Defining the eigenvalue problem.
int noOfEigVecValues = 2;
//int maxIterations = 50000000;
//ARluSymStdEig<double> dprob(noOfEigVecValues, mat, "LM", 0, 0.5, maxIterations);
ARluSymStdEig<double> dprob(noOfEigVecValues, mat);
// Finding eigenvalues and eigenvectors.
int converged = dprob.EigenValVectors(EigVec, EigVal);
for (int eigValIdx = 0; eigValIdx < noOfEigVecValues; eigValIdx++) {
std::cout << "Eigenvalue: " << EigVal[eigValIdx] << "\nEigenvector: ";
for (int i = 0; i < n; i++) {
int idx = n*eigValIdx+i;
std::cout << EigVec[idx] << " ";
}
std::cout << std::endl;
}
The results are:
9.4298, 24.24059
for the eigenvalues, and
-0.523207, -0.83446237, -0.17299346
0.273269, -0.356554, 0.893416
for the 2 eigenvectors respectively (one eigenvector per row)
The code fails to find 3 eigenvectors (it can only find 1-2 in this case, an assert() makes sure of that, but well, that's not a problem).

In this article, Simon Funk shows a simple, effective way to estimate a singular value decomposition (SVD) of a very large matrix. In his case the matrix is sparse, with dimensions 17,000 x 500,000.
Eigenvalue decomposition is closely related to SVD, so you might benefit from considering a modified version of Simon Funk's approach, especially if your matrix is sparse. Furthermore, your matrix is not only square but also symmetric (if that is what you mean by covariance-like), which likely leads to additional simplification.
... Just an idea :)

It seems that Spectra will do the job with good performance.
Here is an example from their documentation that computes the three largest eigenvalues of a dense symmetric matrix M (like your covariance matrix):
#include <Eigen/Core>
#include <Spectra/SymEigsSolver.h>
// <Spectra/MatOp/DenseSymMatProd.h> is implicitly included
#include <iostream>
using namespace Spectra;
int main()
{
// We are going to calculate the eigenvalues of M
Eigen::MatrixXd A = Eigen::MatrixXd::Random(10, 10);
Eigen::MatrixXd M = A + A.transpose();
// Construct matrix operation object using the wrapper class DenseSymMatProd
DenseSymMatProd<double> op(M);
// Construct eigen solver object, requesting the largest three eigenvalues
SymEigsSolver< double, LARGEST_ALGE, DenseSymMatProd<double> > eigs(&op, 3, 6);
// Initialize and compute
eigs.init();
int nconv = eigs.compute();
// Retrieve results
Eigen::VectorXd evalues;
if(eigs.info() == SUCCESSFUL)
evalues = eigs.eigenvalues();
std::cout << "Eigenvalues found:\n" << evalues << std::endl;
return 0;
}
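To see why asking for only a few eigenpairs can be much cheaper, here is a deliberately simple sketch: power iteration with deflation on a small symmetric matrix. This is not what ARPACK or Spectra do internally (they use more sophisticated Krylov-subspace methods), and the names Matrix, EigenPair, and top_eigenpairs are invented for this example. It finds the eigenvalues of largest magnitude, which for a covariance matrix are also the largest ones.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

struct EigenPair {
    double value;
    std::vector<double> vector;
};

// Returns k eigenpairs of the symmetric matrix A, in order of decreasing
// eigenvalue magnitude. Each pair costs a number of matrix-vector
// products, so the work grows with k rather than with the full dimension.
std::vector<EigenPair> top_eigenpairs(Matrix A, int k, int iterations = 1000)
{
    const std::size_t n = A.size();
    std::vector<EigenPair> pairs;
    for (int found = 0; found < k; ++found) {
        // Arbitrary nonzero starting vector.
        std::vector<double> v(n), w(n);
        for (std::size_t i = 0; i < n; ++i) v[i] = 1.0 + static_cast<double>(i);

        for (int it = 0; it < iterations; ++it) {
            // w = A * v, then normalize.
            double norm2 = 0.0;
            for (std::size_t i = 0; i < n; ++i) {
                w[i] = 0.0;
                for (std::size_t j = 0; j < n; ++j) w[i] += A[i][j] * v[j];
                norm2 += w[i] * w[i];
            }
            if (norm2 == 0.0) break;  // v is in the null space of A
            const double inv = 1.0 / std::sqrt(norm2);
            for (std::size_t i = 0; i < n; ++i) v[i] = w[i] * inv;
        }

        // Rayleigh quotient (v^T A v) / (v^T v) gives the eigenvalue estimate.
        double num = 0.0, den = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            den += v[i] * v[i];
            for (std::size_t j = 0; j < n; ++j) num += v[i] * A[i][j] * v[j];
        }
        const double lambda = num / den;
        pairs.push_back(EigenPair{lambda, v});

        // Deflation: remove the found component so the next pass
        // converges to the next eigenpair.
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                A[i][j] -= lambda * v[i] * v[j] / den;
    }
    return pairs;
}
```

On the 3x3 toy matrix from the question, top_eigenpairs(A, 2) should give roughly 24.24 and 9.43, in line with the ARPACK output quoted there. For a 3000x3000 matrix a dedicated solver such as Spectra or ARPACK is still the better choice.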

Violation access in time compilation (0xC0000005)

I want to apply the FFT to an image (stored in "imagen"), multiply the result by a filter 'H', and then apply the inverse FFT.
The code is shown below:
int ancho;
int alto;
ancho=ui.imageframe->imagereader->GetBufferedRegion().GetSize()[0]; // ancho = width of the image
alto=ui.imageframe->imagereader->GetBufferedRegion().GetSize()[1]; // alto = height of the image
double *H ;
H =matrix2D_H(ancho,alto,eta,sigma); // H is calculated
// We want to get: F= fft(f) ; H*F ; f'=ifft(H*F)
// Initialization of the elements needed to calculate the FFT
fftw_complex *out;
fftw_plan p;
int N= (ancho/2+1)*alto; //number of points of the image
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*N);
double *in = (double*) imagen.GetPointer(); // conversion of itk.smartpointer --> double*
p = fftw_plan_dft_r2c_2d(ancho, alto, in, out, FFTW_ESTIMATE); // FFT planning
fftw_execute(p); // FFT calculation
/* Multiplication of the Output of the FFT with the Filter H*/
int a = alto;
int b = ancho/2 +1; // For a real input image the FFT computes only the non-redundant outputs, so the second dimension is ancho/2 + 1; the FFT output and the filter 'H' therefore have the same size.
// Point-by-point matrix multiplication: [axb]*[axb]
fftw_complex* res ; // result will be stored here
res = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*a*b);
res = multiply_matrix_2D(out,H, a, b);
The problem is located here, in the loop inside the function ‘multiply_matrix_2D’:
fftw_complex* prueba_r01::multiply_matrix_2D(fftw_complex* out, double* H, int M ,int N){
/* The matrix out[MxN], i.e. [n0 x (n1/2+1)], is the image after the FFT, and H[MxN] is the filter in the frequency domain.
Both are multiplied POINT BY POINT; this has to be done twice, once for the imaginary part and once for the real part.
*/
fftw_complex *H_cast;
H_cast = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*M*N);
H_cast= reinterpret_cast<fftw_complex*> (H); // casting from double* to fftw_complex*
fftw_complex *res; // the result of the multiplication will be stored here
res = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*M*N);
//Loop for calculating the matrix point-to-point multiplication
for (int x = 0; x<M ; x++){
for (int y = 0; y<N ; y++){
res[x*N+y][0] = out[x*N+y][0]*(H_cast[x*N+y][0]+H_cast[x*N+y][1]);
res[x*N+y][1] = out[x*N+y][1]*(H_cast[x*N+y][0]+H_cast[x*N+y][1]);
}
}
fftw_free(H_cast);
return res;
}
The exception is raised at x = 95 and y = 93, with M = 191 and N = 96:
Unhandled exception at 0x004273ab in prueba_r01.exe: 0xC0000005: Access violation reading location 0x01274000.
Screenshot: http://img846.imageshack.us/img846/4585/accessviolationproblem.png
In the debugger many of the variable values are shown in red and, translating the message, the value box for H_cast[][1] says: "CXX0030: Error: expression cannot be evaluated".
I would really appreciate any kind of help!
Antonio

This part of the code
H_cast = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*M*N);
H_cast= reinterpret_cast<fftw_complex*> (H); // casting from double* to fftw_complex*
first allocates a new buffer for H_cast and then immediately makes H_cast point to the original H instead. It copies only the pointer, not the data, so the newly allocated buffer is leaked.
At the end of the function a buffer is freed
fftw_free(H_cast);
which frees the data pointed to by H, not the buffer allocated in the function.
Back in the caller, H now points to freed memory!
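One possible way to write the multiplication without the cast is sketched below. It assumes that H holds one real coefficient per frequency bin (M*N doubles). Under that assumption, reinterpreting H as fftw_complex makes the original loop read 2*M*N doubles, so it runs off the end of H just past the halfway point, which would be consistent with the crash at x = 95 of 191. The type cplx is a stand-in for fftw_complex (double[2] by default) so that the snippet compiles without FFTW, and the function name is hypothetical.

```cpp
#include <cstddef>

// Stand-in for fftw_complex: [0] is the real part, [1] the imaginary part.
typedef double cplx[2];

// Multiplies each complex sample in `spectrum` by the matching real filter
// coefficient in `H`. All three arrays hold M*N elements and are owned by
// the caller; nothing is allocated or freed here.
void multiply_by_real_filter(const cplx* spectrum, const double* H,
                             cplx* result, int M, int N)
{
    const std::size_t count = static_cast<std::size_t>(M) * static_cast<std::size_t>(N);
    for (std::size_t k = 0; k < count; ++k) {
        result[k][0] = spectrum[k][0] * H[k];  // real part
        result[k][1] = spectrum[k][1] * H[k];  // imaginary part
    }
}
```

Having the caller allocate result once and pass it in also avoids allocating res twice, as the calling code in the question does.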

ITK includes an FFT class that can use FFTW (enable USE_FFTW when configuring with CMake). This class shows how to reference the ITK raw buffer memory from FFTW.
PS: The upcoming ITKv4 has greatly improved FFTW compatibility.
