uchar/float matrices - element-by-element division and multiplication - C++

I want to divide a matrix cv::Mat src of type CV_8UC1 by a scalar of type float and store the result in a new matrix srcN of type CV_32FC1. I'm doing this at the moment:
for (int j = 0; j < src.rows; j++)
    for (int i = 0; i < src.cols; i++)
        srcN.at<float>(j, i) = static_cast<float>(src.at<uchar>(j, i)) / A;
Which is not very fast. I want to do it like this: srcN=src/A; but I don't get the right values in srcN. Is there any way to do that?
Another question: MATLAB (which literally means MATrix LABoratory) is very fast at matrix operations. How can I make my code as fast as MATLAB with C++/OpenCV?
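For the division part, a minimal sketch of one approach that usually works, using OpenCV's cv::Mat::convertTo (this snippet is an addition, not part of the original post):

#include <opencv2/core.hpp>

// Divide a CV_8UC1 matrix by a float scalar and get a CV_32FC1 result.
// convertTo changes the element type and applies the scale in one vectorized pass,
// so srcN(j,i) = src(j,i) / A.
cv::Mat divideByScalar(const cv::Mat& src, float A)
{
    cv::Mat srcN;
    src.convertTo(srcN, CV_32FC1, 1.0 / A);
    return srcN;
}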

2D FFT: what to do after converting both matrices into FFT-ed form?

Assume that I have two matrices, image and filter, with sizes MxM and NxN.
My regular convolution looks like this and produces an output matrix of size (M-N+1)x(M-N+1). Basically it places the top-left corner of the filter on a pixel, convolves, then assigns the sum to that pixel:
for (int i = 0; i <= M - N; i++)       // all (M-N+1) valid row positions
    for (int j = 0; j <= M - N; j++)   // all (M-N+1) valid column positions
    {
        float sum = 0;
        for (int u = 0; u < N; u++)
            for (int v = 0; v < N; v++)
                sum += image[i + u][j + v] * filter[u][v];
        output[i][j] = sum;
    }
Next, to perform FFT:
Apply zero-padding to both image and filter to the right and bottom (that is, add zero columns to the right and zero rows to the bottom). Now both have size (M+N)x(M+N); the original image is at image[0..M-1][0..M-1].
(Do the same for both matrices.) Calculate the FFT of each row into a new matrix, then calculate the FFT of each column of that new matrix.
Now I have two matrices, imageFreq and filterFreq, both of size (M+N)x(M+N), which are the FFT-ed forms of the image and the filter.
But how can I get the convolution values that I need (as described in the sample code) from them?
Convolution between A and B using FFT is done by element-wise multiplication in the frequency domain, so in 1D it looks something like this:
convert A,B by FFT
Assuming A[N] and B[M] have sizes N and M, first zero-pad both to a common size Q which is a power of 2 and at least M+N, then apply the FFT:
Q = exp2(ceil(log2(M+N)));
zeropad(A,Q);
zeropad(B,Q);
a = FFT(A);
b = FFT(B);
convolve
In the frequency domain just use element-wise multiplication:
for (i=0;i<Q;i++) a[i]*=b[i];
reconstruct result
Simply apply the IFFT (inverse FFT)...
AB = IFFT(a); // crop to first N (real) elements
and use only the first N elements (unless the algorithm you are using needs more; it depends on what you are doing...)
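To make the 1D recipe concrete, here is a minimal self-contained C++ sketch (an addition, not part of the original answer). It uses a naive recursive radix-2 FFT for readability; a real application would normally use an optimized library such as FFTW or cv::dft. It also keeps the full N+M-1 convolution result instead of cropping:

#include <complex>
#include <vector>
#include <cmath>
#include <iostream>

using cd = std::complex<double>;

// Recursive radix-2 FFT; invert=true computes the inverse transform
// (without the final 1/Q scaling, which is applied by the caller).
void fft(std::vector<cd>& a, bool invert)
{
    const std::size_t n = a.size();
    if (n == 1) return;
    std::vector<cd> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) { even[i] = a[2 * i]; odd[i] = a[2 * i + 1]; }
    fft(even, invert);
    fft(odd, invert);
    const double ang = 2.0 * std::acos(-1.0) / n * (invert ? 1.0 : -1.0);
    for (std::size_t k = 0; k < n / 2; ++k)
    {
        cd w = std::polar(1.0, ang * k);
        a[k]         = even[k] + w * odd[k];
        a[k + n / 2] = even[k] - w * odd[k];
    }
}

// Linear convolution of A and B via FFT: zero-pad to a power of two Q >= N+M,
// transform, multiply element-wise, inverse transform, scale by 1/Q.
std::vector<double> conv_fft(const std::vector<double>& A, const std::vector<double>& B)
{
    std::size_t Q = 1;
    while (Q < A.size() + B.size()) Q <<= 1;
    std::vector<cd> a(A.begin(), A.end()), b(B.begin(), B.end());
    a.resize(Q);
    b.resize(Q);
    fft(a, false);
    fft(b, false);
    for (std::size_t i = 0; i < Q; ++i) a[i] *= b[i];   // element-wise multiply
    fft(a, true);
    std::vector<double> AB(A.size() + B.size() - 1);
    for (std::size_t i = 0; i < AB.size(); ++i) AB[i] = a[i].real() / Q;
    return AB;
}

int main()
{
    // {1,2,3} convolved with {4,5} should give {4, 13, 22, 15}
    for (double v : conv_fft({1, 2, 3}, {4, 5})) std::cout << v << ' ';
    std::cout << '\n';
    return 0;
}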
For 2D you can either convolve directly in 2D (using two nested loops) or convolve each axis separately. Beware that separating the axes also requires normalizing the result by some constant (which depends on the dimensionality, resolution and kernel used).
So, put together (now assuming square resolutions NxN and MxM), first zero-pad both to QxQ and then:
Q = exp2(ceil(log2(M+N)));
zeropad(A,Q,Q);
zeropad(B,Q,Q);
a = FFT(A);
b = FFT(B);
for (i=0;i<Q;i++)
for (j=0;j<Q;j++) a[i][j]*=b[i][j];
AB = IFFT(a); // crop to first NxN (real) elements
And again crop AB to NxN size (unless ...). For more info see:
How to compute Discrete Fourier Transform?
and all the sublinks there... Also, at the end here is a 1D convolution example using NTT (a special form of FFT) to compute bignum multiplication:
Modular arithmetics and NTT (finite field DFT) optimizations
Also, if you want a real result, just use the real parts of the result (ignore the imaginary part).

Is there something like a sparse cube in Armadillo, or some way of using sparse matrices as slices in a cube?

I am using Armadillo's sparse matrices, but now I would like to use something like a "sparse cube", which does not exist in Armadillo. Writing sparse matrices into a cube with cube.slice(some_sparse_matrix) converts everything back to a dense cube.
I am using sparse matrices to multiply vectors with; for larger vectors/matrices the sparse variant is much faster. Now I have to sum up the products of several sparse matrices with several vectors.
Would a std::vector be a way?
In my experience it is faster to use Armadillo's functions (for example a subvector, arma::span() or arma::sum()) than to write loops myself, so I was wondering what would be the fastest way of doing this.
It's possible to approximate a sparse cube using the field class, like so:
arma::uword number_of_matrices = 10;
arma::uword number_of_rows = 5000;
arma::uword number_of_cols = 5000;
arma::field<arma::sp_mat> F(number_of_matrices);
F.for_each( [&](arma::sp_mat& X) { X.set_size(number_of_rows, number_of_cols); } );
F(0)(1,2) = 456.7; // write to element (1,2) in matrix 0
F(1)(2,3) = 567.8; // write to element (2,3) in matrix 1
F.print("F:"); // show all matrices
Your compiler must support at least C++11 for this to work.
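To cover the summation mentioned in the question, here is a hypothetical usage sketch (an addition, not part of the original answer) that pairs each sparse matrix with a dense vector and accumulates the products; the vector field V and the sizes are made up for illustration:

#include <armadillo>
#include <iostream>

int main()
{
    arma::uword number_of_matrices = 10;
    arma::uword n = 5000;

    arma::field<arma::sp_mat> F(number_of_matrices);
    F.for_each( [&](arma::sp_mat& X) { X.set_size(n, n); } );
    F(0)(1,2) = 456.7;   // fill the sparse matrices as needed

    // Hypothetical: one dense vector per sparse matrix.
    arma::field<arma::vec> V(number_of_matrices);
    V.for_each( [&](arma::vec& v) { v = arma::randu<arma::vec>(n); } );

    // Sum of sparse-matrix * vector products; sp_mat * vec uses the sparse
    // multiply and returns a dense vector.
    arma::vec result(n, arma::fill::zeros);
    for (arma::uword k = 0; k < number_of_matrices; ++k)
        result += F(k) * V(k);

    std::cout << "sum of result elements: " << arma::accu(result) << std::endl;
    return 0;
}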

How to do element-wise comparison with Eigen?

I'm trying to implement the following pseudo-code in C++ using Eigen:
img_binary = +1*(img>img_mean) + -1*(img<img_mean)
i.e. I'm trying to convert a grayscale image into a binary image such that values greater than the image mean are +1 and values less than the image mean are -1. So far, I have the following:
cv::Mat cv_image;
cv_image = cv::imread(img_path, CV_LOAD_IMAGE_GRAYSCALE);
MatrixXf eig_image;
cv::cv2eigen(cv_image, eig_image);
float image_mean = eig_image.mean();
ArrayXXf bin_image;
bin_image = eig_image.array() > image_mean;
I'm getting an error in the last line saying that I mixed different numeric types. Any suggestion on how I can do element-wise comparisons with Eigen?
The easiest solution in Eigen would be
ArrayXXf bin_image = (eig_image.array() > image_mean).cast<float>()*2.f-1.f;
An alternative would be:
ArrayXXf bin_image = (eig_image.array() > image_mean)
    .select(ArrayXXf::Constant(eig_image.rows(), eig_image.cols(), 1.0f), -1.0f);
Having to use ArrayXXf::Constant for one argument is unfortunately necessary, because there is no .select overload that accepts two scalar arguments.
However, unless you plan to do further processing in Eigen, you should consider using the corresponding OpenCV function cv::threshold.
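For reference, a rough sketch of that OpenCV route (an addition, not part of the original answer): threshold at the mean to obtain 0/1, then map linearly to -1/+1.

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Map pixels above the mean to +1 and the rest to -1, entirely in OpenCV.
cv::Mat binarizeAroundMean(const cv::Mat& img_gray)   // expects CV_8UC1
{
    cv::Mat img_f, bin;
    img_gray.convertTo(img_f, CV_32F);
    double mean_val = cv::mean(img_f)[0];
    // THRESH_BINARY: values above mean_val become 1.0, the rest become 0.0.
    cv::threshold(img_f, bin, mean_val, 1.0, cv::THRESH_BINARY);
    return bin * 2.0 - 1.0;   // 0/1 -> -1/+1
}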

Sparse-sparse product A^T*A optimization in the Eigen lib

In the case of a product with the same matrix matA, like
matA.transpose()*matA,
you don't have to compute the whole product, because the result matrix is symmetric (in my specific case it is always symmetric and square).
So it is enough to compute, for example, only the lower triangular part and just copy the rest, because entry (2,3) of the result is the same as entry (3,2), and so on.
So my question is: is there a way to tell Eigen to compute only the lower part, and optionally to store only the lower triangular part of the product?
DATA = SparseMatrix<double>((SparseMatrix<double>(matA.transpose()) * matA).pruned()).toDense();
According to the documentation, you can evaluate the lower triangle of a matrix with:
m1.triangularView<Eigen::Lower>() = m2 + m3;
or in your case:
m1.triangularView<Eigen::Lower>() = matA.transpose()*matA;
(where it says "Writing to a specific triangular part: (only the referenced triangular part is evaluated)"). Otherwise, in the line you've written
Eigen will calculate the entire sparse matrix matA.transpose()*matA.
Regarding saving the resulting m1 matrix, it is the same as saving whatever type of matrix it is (Eigen::MatrixXt or Eigen::SparseMatrix<t>). If m1 is sparse, then it will be only half the size of a straightforward matA.transpose()*matA. If m1 is dense, then it will be the full square matrix.
https://eigen.tuxfamily.org/dox/classEigen_1_1SparseSelfAdjointView.html
The symmetric rank update is defined as:
B = B + alpha * A * A^T
where alpha is a scalar. In your case, you are doing A^T * A, so you should pass the transposed matrix instead. The resulting matrix will only store the upper or lower portion of the matrix, whichever you prefer. For example:
SparseMatrix<double> B;
B.selfadjointView<Lower>().rankUpdate(A.transpose());
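A more complete sketch of that approach (an addition; the matrix sizes and fill values here are hypothetical). Note that B has to be allocated with the right dimensions (n x n for an m x n matA) before the rank update, and that only the referenced triangle is stored until you expand it:

#include <Eigen/Sparse>
#include <iostream>

int main()
{
    using SpMat = Eigen::SparseMatrix<double>;

    SpMat A(100, 10);            // hypothetical m x n matrix
    A.insert(0, 0) = 1.0;        // ... fill A as needed ...
    A.insert(5, 3) = 2.0;

    // B will hold A^T * A; it must be n x n before the rank update.
    SpMat B(A.cols(), A.cols());
    B.selfadjointView<Eigen::Lower>().rankUpdate(A.transpose());

    // Only the lower triangle of B is referenced; expand it when the full
    // symmetric matrix is needed.
    SpMat full = B.selfadjointView<Eigen::Lower>();
    std::cout << Eigen::MatrixXd(full) << std::endl;
    return 0;
}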

OpenCV multiplication of large matrices

I have 2 matrices of dimension 1 x 280000.
I want to multiply one matrix by the transpose of the second matrix using OpenCV.
I tried to multiply them using the multiplication operator (*),
but it gives me the error: 'The total size matrix does not fit to size_t type',
since after the multiplication the matrix size will be 280000 x 280000.
So I am thinking the multiplication should be 32-bit.
Is there any method to do the 32-bit multiplication?
Why do you want to multiply them like that? Since this is an answer, I would like to help you think it through rather than just do it:
Suppose you have the two matrices A and B (A.size() == B.size() == [1 x 280000]),
and let AB = A.t() * B be the 280000 x 280000 result.
Then AB = [ B[0][0]*A.t()  B[0][1]*A.t()  ...  B[0][279999]*A.t() ]
(each column of the result is the transposed matrix A.t() scaled by the corresponding element of the other matrix).
AB may also be written row by row as:
[ A[0][0]*B
  A[0][1]*B
  ...
  A[0][279999]*B ]
(each row of the result is the row matrix B scaled by the corresponding element of A).
Hope that this will help you with what you are doing... Using a for loop over those rows (or columns) you can print, store, or process whatever you need from the result without ever building the full matrix.
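A hypothetical OpenCV sketch of that row-by-row idea (an addition, not part of the original answer); the data here is random and the per-row work is just a placeholder:

#include <opencv2/core.hpp>
#include <iostream>

int main()
{
    // In the question both matrices are 1 x 280000; random data for illustration.
    cv::Mat A(1, 280000, CV_32F), B(1, 280000, CV_32F);
    cv::randu(A, cv::Scalar(0), cv::Scalar(1));
    cv::randu(B, cv::Scalar(0), cv::Scalar(1));

    // AB = A.t() * B is 280000 x 280000 and far too large to materialize,
    // but its i-th row is just A(0,i) * B, so process one row at a time.
    double grand_total = 0.0;
    for (int i = 0; i < A.cols; ++i)
    {
        cv::Mat row_i = A.at<float>(0, i) * B;   // i-th row of A.t() * B
        grand_total += cv::sum(row_i)[0];        // placeholder per-row work
    }
    std::cout << "sum of all elements of A.t()*B: " << grand_total << std::endl;
    return 0;
}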