2D Fourier transform with Eigen and FFTW


I'm trying to do a real-valued 2D Fourier transform with FFTW. My data is stored in a dynamically sized Eigen matrix. Here's the wrapper class I wrote:
FFT2D.h:
#include <Eigen>
class FFT2D {
public:
enum FFT_TYPE {FORWARD=0, REVERSE=1};
FFT2D(EMatrix &input, EMatrix &output, FFT_TYPE type_ = FORWARD);
~FFT2D();
void execute();
private:
EMatrix& input;
EMatrix& output;
fftw_plan plan;
FFT_TYPE type;
};
FFT2D.cpp:
#include "FFT2D.h"
#include <fftw3.h>
#include "Defs.h"
FFT2D::FFT2D(EMatrix &input_, EMatrix &output_, FFT_TYPE type_)
: type(type_), input(input_), output(output_) {
if (type == FORWARD)
plan = fftw_plan_dft_2d((int) input.rows(), (int) input.cols(),
(fftw_complex *) &input(0), (fftw_complex *) &output(0),
FFTW_FORWARD, FFTW_ESTIMATE);
else
// placeholder for ifft-2d code, unwritten
}
FFT2D::~FFT2D() {
fftw_destroy_plan(plan);
}
void FFT2D::execute() {
fftw_execute(plan); // seg-fault here
}
And a definition for EMatrix:
typedef Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> EMatrix;
The problem is that I'm getting a segfault in FFT2D::execute(). I know I'm setting something up wrong in the constructor, and I've tried a number of different ways, but I can't seem to make any headway on this.
Things I've tried include: changing EMatrix typedef to Eigen::ColMajor, passing (fftw_complex *) input.data() to fftw_plan_dft_2d, using different fftw plans (fftw_plan_dft_r2c_2d).
My C++ is (clearly) rusty, but at the end of the day what I need is to do a 2D FT on a real-valued 2D Eigen Matrix of doubles. Thanks in advance.

The major problem here is that fftw_plan_dft_2d is not a "real-valued" Fourier transform. It is a complex-to-complex transform, so real input is just data with zero imaginary part, but the zeroes still have to be there, as you can see from the fftw_complex definition:
typedef double fftw_complex[2];
This makes sense, as the output can (and probably will) have a non-zero imaginary part.
The output will have some symmetry, though: for real input it is Hermitian, i.e. in the 1D case X[N-k] is the complex conjugate of X[k], so the real part is even and the imaginary part is odd.
As a result, the (fftw_complex *) &input(0) cast doesn't really work: FFTW expects twice as many double values as you pass to it, and the same holds for the output matrix, so it reads and writes past the end of both buffers.
The solution is to interleave your matrix raw data with zeroes, and there are a number of ways to do that. A few examples:
You could copy the whole matrix into a new array before passing it to FFTW, adding the zeroes in process.
You could reserve the space for zeroes in the matrix itself - this way you'll be able to avoid copying, but it will probably require a lot of refactoring:)
The best way I can think of is to use std::complex<double> as the scalar type. This will somewhat hurt your notion of a "real-valued FFT", but you'll be able to keep all your real-valued operations as they are, and the layout of std::complex<double> will fit fftw_complex perfectly.
There could be some other things to consider here, like storage order (FFTW operates on arrays in row-major order, so Eigen matrices should comply) and the validity of linear access to Eigen matrix data (seems OK to me).
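The first option (copying while adding the zeroes) could be sketched roughly as follows. This uses only the standard library; the helper names to_complex and as_interleaved are hypothetical, and a flat row-major std::vector stands in for the Eigen matrix data.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy real samples (row-major, rows * cols values)
// into a complex buffer whose imaginary parts are all zero.
std::vector<std::complex<double>> to_complex(const std::vector<double>& real,
                                             std::size_t rows, std::size_t cols) {
    assert(real.size() == rows * cols);
    std::vector<std::complex<double>> out(real.size());
    for (std::size_t i = 0; i < real.size(); ++i) {
        out[i] = std::complex<double>(real[i], 0.0);
    }
    return out;
}

// std::complex<double> is stored as two consecutive doubles (re, im), so the
// buffer can be viewed as interleaved doubles, the same layout as double[2].
const double* as_interleaved(const std::vector<std::complex<double>>& buf) {
    return reinterpret_cast<const double*>(buf.data());
}
```

A buffer built this way has the re/im layout that fftw_complex uses, so its data pointer could be cast to fftw_complex * when creating the plan. The output buffer would need to be a complex array of the same size.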

Related

Is there a way to create std::vector<arma::mat> from arma::mat matrices without creating a copy of the matrices?

I am new to C++. For a statistical method, I compute large matrices, e.g. A and B. They are n x n, so for large sample sizes n they become very large. If they are double and n = 70k, each one would take about 39 GB (70,000 x 70,000 x 8 bytes).
Because the number of matrices needed can vary, I implemented the algorithm to use a vector of matrices and iterate over it for some operations. E.g.
arma::mat A;
arma::mat B;
std::vector<arma::mat> matrices;
matrices = {A, B};
Is there a way to create this std::vector without copying the matrices?
I tried to check whether the memory is the same by doing this:
logger->info("Memory address for A: {}.", (void *)&A);
logger->info("Memory address for matrices.at(0): {}.", (void *)&matrices.at(0));
And it showed different addresses so I assume it is creating a copy but I am not sure.
I tried to use
std::vector<arma::mat> matrices;
matrices.push_back(A);
The memory addresses differed still. With
std::vector<arma::mat> matrices;
matrices.push_back(std::move(A));
the algorithm no longer worked because the matrices were empty.

You need to reverse the logic: Let the std::vector allocate the memory and create the matrices. Then you work directly with the elements in the vector. For example:
std::vector<arma::mat> matrices;
matrices.resize(2);
arma::mat & A = matrices[0];
arma::mat & B = matrices[1];
// Initializing the values of A and B, and do stuff with them.
Note: the references A and B will become invalid if you add more elements to the vector afterwards, i.e. if the vector grows and has to reallocate. But since these matrices are so large, you want to avoid this anyway at all costs.
If the container needs to grow, you might want to take a look at e.g. std::list.
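Both approaches (creating the matrices inside the vector, or moving an existing one in) can be illustrated with a small standard-library sketch. Here Mat is a hypothetical stand-in for arma::mat, and the function names are made up for the example. Note that comparing the addresses of the objects themselves, as in the question, always shows a difference; what matters is whether the underlying data buffer is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for arma::mat: a flat buffer of doubles.
using Mat = std::vector<double>;

// Build the container first, then size and fill its elements in place.
std::vector<Mat> make_matrices(std::size_t count, std::size_t n) {
    std::vector<Mat> matrices(count);
    for (Mat& m : matrices) {
        m.assign(n * n, 0.0);
    }
    return matrices;
}

// Alternative: move an existing matrix into the vector. Returns true if the
// element in the vector reuses the original data buffer (no element copy).
bool move_keeps_buffer(std::size_t n) {
    Mat A(n * n, 1.0);
    const double* before = A.data();
    std::vector<Mat> matrices;
    matrices.push_back(std::move(A));  // A is left in a moved-from state
    return matrices[0].data() == before;
}
```

After a move, the data lives in matrices[0], which is why the original variable looked empty in the question. The algorithm would have to work on the vector element from that point on.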

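First, allocate the memory in advance.
Second, use the advanced constructor with copy_aux_mem set to false, so that the matrix uses the existing memory instead of copying it:
mat(ptr_aux_mem, n_rows, n_cols, copy_aux_mem = false, strict = false)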

Store result of sparse mat-vec-mult into pre-allocated vector

I'm working on a routine for sparse matrix-vector multiplication and want to create a reference performance benchmark using the Eigen3 library. I only want to benchmark the actual arithmetic without the memory allocation involved in the construction of the result vector. How can this be achieved?
I tried to assign the result to a pre-allocated vector but Eigen::internal::set_is_malloc_allowed reveals that some memory allocation is performed despite all my attempts.
// Setup multiplicands
const Eigen::SparseMatrix<double, Eigen::RowMajor> A = createMat();
const Eigen::VectorXd x = Eigen::VectorXd::Random(num_of_cols);
// Pre-allocate result vector
Eigen::VectorXd y = Eigen::VectorXd::Zero(num_of_rows);
Eigen::internal::set_is_malloc_allowed(false);
y = A * x; // <-- Runtime-error in debug mode
Eigen::internal::set_is_malloc_allowed(true);
What I'm looking for is basically a flavor of the sparse matrix-vector multiplication which takes a reference to an output buffer where the result is written to. Instead of y = A * x in the above example I would then write something like matVecMult(A, x, std::begin(y)). Is there a way to make this happen?
Kind regards.

Try this:
y.noalias() = A * x;
noalias() indicates to Eigen that there is no potential aliasing issue involved (i.e., y does not overlap with x), and that Eigen shouldn't create a temporary.
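For the hand-written routine that the benchmark is meant to be compared against, a product that writes into a caller-provided buffer might look like the sketch below. The Csr struct and mat_vec_mult are hypothetical names for illustration and are not part of Eigen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) container, not an Eigen type.
struct Csr {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // column of each stored value
    std::vector<double> values;        // stored non-zero values
};

// Computes y = A * x and writes into the caller's pre-allocated buffer;
// the routine itself performs no allocation.
void mat_vec_mult(const Csr& A, const std::vector<double>& x,
                  std::vector<double>& y) {
    assert(x.size() == A.cols && y.size() == A.rows);
    for (std::size_t r = 0; r < A.rows; ++r) {
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[r]; k < A.row_ptr[r + 1]; ++k) {
            sum += A.values[k] * x[A.col_idx[k]];
        }
        y[r] = sum;
    }
}
```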

Eigen LDLT Cholesky decomposition in-place

I am trying to get Eigen3 to solve a linear system A * X = B with an in-place Cholesky decomposition. I cannot afford to have any temporaries of the size of A pushed on the stack, but I am free to destroy A in the process.
Unfortunately,
A.llt().solveInPlace(B);
is out of the question, since A.llt() implicitly pushes a temporary matrix of the size of A on the stack. For the LLT case, I could get access to the necessary functionality like so:
// solve A * X = B in-place for positive-definite A
template <typename AType, typename BType>
void AllInPlaceSolve(AType& A, BType& B)
{
typedef Eigen::internal::LLT_Traits<AType, Eigen::Upper> TraitsType;
TraitsType::inplace_decomposition(A);
TraitsType::getL(A).solveInPlace(B);
TraitsType::getU(A).solveInPlace(B);
}
This works fine, but I am worried that:
My matrices A might be positive semidefinite only, in which case a LDLT decomposition is required
The LLT decomposition calculates sqrt() unnecessarily for the solution of the system
I could not find a way to hook in Eigen's LDLT functionality similarly to the code above, since the code is structured very differently.
So my question is: Is there a way to use Eigen3 for solving a linear system using LDLT decompositions using no more scratch space than for the diagonal matrix D?

One option is to allocate a LDLT solver only once, and call the compute method:
LDLT<MatType> ldlt(size);
// ...
ldlt.compute(A);
x = ldlt.solve(b);
If that's also not an option, you can const cast the matrix stored by the ldlt object:
LDLT<MatType> ldlt(MatType::Identity(size,size));
MatType& A = const_cast<MatType&>(ldlt.matrixLDLT());
Fill in A through this reference, and then:
ldlt.compute(A);
x = ldlt.solve(b);
This is ugly, but this should work as long as MatType is column major.
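To show what an LDLT solve without extra scratch space involves, here is a minimal standard-library sketch of an unpivoted in-place factorization. It is not Eigen's implementation (Eigen's LDLT uses pivoting, which matters for semidefinite matrices), and the function names are hypothetical. D is stored on the diagonal of A itself, and no square roots are taken.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In-place, unpivoted LDL^T of a symmetric n x n matrix stored row-major in a.
// Only the lower triangle is read. On success the strict lower triangle holds
// L (unit diagonal implied) and the diagonal holds D.
// Returns false if a zero pivot is met.
bool ldlt_in_place(std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    for (std::size_t j = 0; j < n; ++j) {
        double d = a[j * n + j];
        for (std::size_t k = 0; k < j; ++k) {
            d -= a[j * n + k] * a[j * n + k] * a[k * n + k];
        }
        if (d == 0.0) return false;
        a[j * n + j] = d;
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = a[i * n + j];
            for (std::size_t k = 0; k < j; ++k) {
                s -= a[i * n + k] * a[j * n + k] * a[k * n + k];
            }
            a[i * n + j] = s / d;
        }
    }
    return true;
}

// Overwrites b with the solution of A x = b, using the factors from above.
void ldlt_solve_in_place(const std::vector<double>& a, std::size_t n,
                         std::vector<double>& b) {
    assert(a.size() == n * n && b.size() == n);
    for (std::size_t i = 0; i < n; ++i)      // forward: L z = b
        for (std::size_t k = 0; k < i; ++k) b[i] -= a[i * n + k] * b[k];
    for (std::size_t i = 0; i < n; ++i)      // diagonal: D y = z
        b[i] /= a[i * n + i];
    for (std::size_t i = n; i-- > 0;)        // backward: L^T x = y
        for (std::size_t k = i + 1; k < n; ++k) b[i] -= a[k * n + i] * b[k];
}
```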

Matrix of matrices in Eigen C++

I'm creating a circuit analysis library in C++ (also to learn C++, so I'm very new to it).
After getting familiar with Eigen, I'd like to have a matrix where each cell hosts a 3x3 complex matrix.
So far I've tried this very simple proof of principle:
typedef Eigen::MatrixXcd cx_mat;
typedef Eigen::SparseMatrix<cx_mat> sp_mat_mat;
void test(cx_mat Z1){
sp_mat_mat Y(2, 2);
Y(0, 0) = Z1;
Y(2, 2) = Z1;
cout << "\n\nY:\n" << Y << endl;
}
Testing this simple example fails, probably because Eigen expects a number as the scalar type instead of a structure.
As a matter of fact, the matrix of matrices is likely to be sparse, hence the sparse matrix structure.
Is there any way to make this work?
Any help is appreciated.

I don't believe Eigen will give you a way to make this work. If you think about the other functions that are connected to Matrix or SparseMatrix, like:
inverse()
norm()
m.row()*m.col()
what should Eigen do when a matrix element number is replaced by a matrix?
What I understand is that you want a data structure that stores your Eigen::MatrixXcd blocks in a memory-efficient way.
You could also realize this using the map container:
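#include <map>
#include <Eigen/Dense>
typedef Eigen::MatrixXcd cx_mat;
cx_mat Z1;
std::map<int,Eigen::MatrixXcd> sp_mat_mat; // key = row*cols + col
int cols = 2;
sp_mat_mat[0*cols+0]=Z1; // block (0,0)
sp_mat_mat[1*cols+1]=Z1; // block (1,1); indices are zero-based, so (2,2) is out of range for a 2x2 layout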
Less memory efficient, but perhaps easier to access would be using the vector container:
#include <vector>
std::vector<std::vector<Eigen::MatrixXcd>> mat_mat;
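A variant of the map idea that keys on a (row, col) pair instead of a flattened integer could look like the sketch below. It uses only the standard library: Block is a hypothetical stand-in for a fixed-size 3x3 complex Eigen matrix, and the BlockSparse class is made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical 3x3 complex block, stored row-major; stands in for a
// fixed-size Eigen matrix.
using Block = std::array<std::complex<double>, 9>;

// Block-sparse matrix: only the blocks that were set take up memory.
class BlockSparse {
public:
    BlockSparse(std::size_t rows, std::size_t cols) : rows_(rows), cols_(cols) {}

    void set(std::size_t r, std::size_t c, const Block& b) {
        assert(r < rows_ && c < cols_);
        blocks_[{r, c}] = b;
    }

    // Returns a pointer to the stored block, or nullptr for an empty cell.
    const Block* find(std::size_t r, std::size_t c) const {
        auto it = blocks_.find({r, c});
        return it == blocks_.end() ? nullptr : &it->second;
    }

    std::size_t stored() const { return blocks_.size(); }

private:
    std::size_t rows_, cols_;
    std::map<std::pair<std::size_t, std::size_t>, Block> blocks_;
};
```

Keying on the pair avoids having to encode and decode the flattened index, and the bounds check catches an out-of-range cell such as (2, 2) in a 2x2 layout when assertions are enabled.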

Have you found a way to create a matrix of matrices?
I see that we can use a 2-D array to create a matrix of matrices.
For example,
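#include <Eigen/Dense>
using Eigen::MatrixXd;
MatrixXd A = MatrixXd::Random(3, 3); // top-left block
MatrixXd B = MatrixXd::Random(3, 4); // top-right block
MatrixXd C = MatrixXd::Random(4, 4); // bottom-right block
MatrixXd D[2][2]; // 2x2 array of dynamically sized matrices
D[0][0] = A;
D[0][1] = B;
D[1][0] = B.transpose();
D[1][1] = C;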
I don't know if this way is memory-efficient or not. Let's check it out.

You asked "sparse matrix structure. Is there any way to make this work?" I would say no, because it is not easy to translate a circuit design into a "matrix of matrices" in the first place. If you want to simulate something, you choose a representation close to it. In the case of an electronic circuit diagram, the schema in memory should IMHO be a directed graph with linked-list items. At each node/junction there is a matrix representing the input-to-output transfer behaviour of a particular component (e.g. resistor, capacitor, transistor), and you propagate the signal through the matrices assigned to each component. The transformed signal eventually arrives at an output through the connections in your connected graph. In software, it should work similarly. Suggested further reading: https://core.ac.uk/download/pdf/53745212.pdf

What is the better Matrix4x4 class design c++ newbie

What would be better to use as a way to store matrix values?
float m1,m2,m3 ... ,m16
or
float[4][4].
I first tried float[16], but when I'm debugging and testing, VS won't show what is inside the array :( I could implement a cout and try to read the values from a console test application.
Then I tried using float m1,m2,m3,etc. While testing and debugging, the values could be read in VS, so it seemed easier to work with.
My question is, because I'm fairly new to C++: what is the better design?
I find the float m1,m2 ... ,m16 easier to work with when debugging.
I would also love it if someone could say, from experience or from benchmark data, which has better performance. My gut says it shouldn't really matter, because the matrix data should be laid out the same in memory, right?
Edit:
Some more info its a column major matrix.
As far as i know i only need a 4x4 Matrix for the view transformation pipeline.
So nothing bigger and so i have some constant values.
Busy writing a simple software renderer as a way to learn more c++ and get some more experiences and learn/improve my Linear algebra skills. Will probably only go to per fragment shading and some simple lighting models and so far that i have seen 4x4 matrix is the biggest i will need for rendering.
Edit2:
Found out why I couldn't read the array data: I was using a float pointer, and the debugger only showed the pointer value. I did discover a way to see the array values in the watch window, where you have to enter pointer, n where n = the number of elements you want to see.
Thanks to everybody who answered; I will use the Vector4 m[4] answer for now.

You should consider a Vector4 with float [4] members, and a Matrix4 with Vector4 [4] members. With operator [], you have two useful classes, and maintain the ability to access elements with: [i][j] - in most cases, the element data will be contiguous, provided you don't use virtual methods.
You can also benefit from vector (SIMD) instructions this way, e.g., in Vector4
union alignas(16) { __m128 _v; float _s[4]; }; // members
inline float & operator [] (int i) { return _s[i]; }
inline const float & operator [] (int i) const { return _s[i]; }
and in Matrix4
Vector4 _m[4]; // members
inline Vector4 & operator [] (int i) { return _m[i]; }
inline const Vector4 & operator [] (int i) const { return _m[i]; }
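Without the SIMD union, the same two-class design can be sketched portably as below. This is a simplified illustration rather than the exact classes from the answer; the member names s and m and the identity() helper are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Portable variant without SIMD types: a Vector4 of four floats.
struct Vector4 {
    float s[4];
    float& operator[](std::size_t i) { assert(i < 4); return s[i]; }
    const float& operator[](std::size_t i) const { assert(i < 4); return s[i]; }
};

// A Matrix4 made of four Vector4 columns (column-major, as in the question).
struct Matrix4 {
    Vector4 m[4];
    Vector4& operator[](std::size_t i) { assert(i < 4); return m[i]; }
    const Vector4& operator[](std::size_t i) const { assert(i < 4); return m[i]; }

    static Matrix4 identity() {
        Matrix4 r{};  // value-initialization zeroes all 16 floats
        for (std::size_t i = 0; i < 4; ++i) r[i][i] = 1.0f;
        return r;
    }
};

// No padding is expected, so the 16 floats are contiguous.
static_assert(sizeof(Matrix4) == 16 * sizeof(float),
              "Matrix4 should be 16 packed floats");
```

Because both types are plain aggregates, a debugger can show the individual elements directly, which addresses the inspection problem mentioned in the question.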

The float m1, m2 .. m16; becomes very awkward to deal with when it comes to using loops to iterate through things. Using arrays of some sort is much easier. And, most likely, the compiler will generate AT LEAST as efficient code when you use loops as if you "hand-code", unless you actually write inline assembler or use SSE intrinsics.

The 16 float solution is fine as long as the code doesn't evolve (it is a hassle to maintain and it is not really readable)
The float[4][4] is a way better design (in terms of size parametrization) but you have to understand the notion of pointers.

I would use an array of 16 floats like float m[16]; with the sole reason being that it is very easy to pass it to a library like OpenGL, using the functions with the Matrix4fv suffix (e.g. glUniformMatrix4fv).
A 2D array like float m[4][4]; should also be laid out in memory identically to float m[16] (see May I treat a 2D array as a contiguous 1D array?), and using that would be more convenient as far as having [row][col] indexing goes (compare m[1][1] vs m[5]). Since OpenGL expects column-major data by default, the indexing would be [col][row] if the array is passed to OpenGL directly.
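The mapping between the two indexing styles can be written down in a few lines. This is a small sketch assuming column-major storage; flat_index and at are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// float[4][4] and float[16] occupy the same number of bytes.
static_assert(sizeof(float[4][4]) == sizeof(float[16]), "same storage size");

// Hypothetical helper: flat index of (row, col) in a column-major 4x4 matrix.
constexpr std::size_t flat_index(std::size_t row, std::size_t col) {
    return col * 4 + row;
}

// Read an element from a flat column-major array of 16 floats.
float at(const float (&m)[16], std::size_t row, std::size_t col) {
    assert(row < 4 && col < 4);
    return m[flat_index(row, col)];
}
```

With this layout, row 0 of column 3 (the x translation in a typical transform matrix) lives at flat index 12.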

Using separate variables for matrix elements may prove to be problematic. What are you planning to do when dealing with big matrices like 100x100?
Of course you need to use some array-like structure, and I strongly recommend that you at least use arrays.
