Matrix representation using Eigen vs double pointer - c++

I have inherited some code which makes extensive use of double pointers to represent 2D arrays. I have little experience using Eigen but it seems easier to use and more robust than double pointers.
Does anyone have insight as to which would be preferable?

Both Eigen and Boost.uBLAS define expression hierarchies and abstract matrix data structures that can use any storage class that satisfies certain constraints. These libraries are written so that linear algebra operations can be clearly expressed and efficiently evaluated at a very high level. Both libraries use expression templates heavily, and are capable of doing pretty complicated compile-time expression transformations. In particular, Eigen can also use SIMD instructions, and is very competitive on several benchmarks.
For dense matrices, a common approach is to use a single pointer and keep track of additional row, column, and stride variables (you may need the third because, for alignment reasons, you may have allocated more memory than the x * y * sizeof(value_type) bytes you strictly need). However, this gives you no mechanism to check for out-of-range accesses, and nothing in the code to help you debug. You would only want to use this sort of approach if, for example, you need to implement some linear algebra operations for educational purposes. (Even then, I advise that you first consider which algorithms you would like to implement, and then take a look at std::unique_ptr, std::move, std::allocator, and operator overloading.)

Remember that Eigen has a Map capability that allows you to map an Eigen matrix onto a contiguous array of data. If it's difficult to completely change the code you have inherited, at least mapping things to an Eigen matrix might make interoperating with raw pointers easier.

Yes definitely, for modern C++ you should be using a container rather than raw pointers.
Eigen
When using Eigen, take note that its fixed-size vectorizable classes (like Vector4d) use SIMD optimizations that require them to be properly aligned. This requires special care if you include such fixed-size Eigen values as members in structures or classes. You also can't pass them to functions by value, only by reference.
If you don't care about such optimizations, it's trivial enough to disable it: simply add
#define EIGEN_DONT_ALIGN
as the first line of every source file (.h, .cpp, ...) that uses Eigen, before any Eigen header is included.
The other two options are:
Boost Matrix
#include <boost/numeric/ublas/matrix.hpp>
boost::numeric::ublas::matrix<double> m (3, 3);
std::vector
#include <vector>
std::vector<std::vector<double> > m(3, std::vector<double>(3));

Related

Fortran array access via vector subscripts, cpp equivalent

I am wondering whether there is a C++ equivalent to accessing array locations in Fortran via indexes stored in other arrays.
I am a novice in C++ but experienced in OOP Fortran. I am thinking about leaving Fortran behind for the much better support of OOP in recent C++ (OOP in Fortran is probably at the stage of year-2000 C++).
However, my applications are heavily geared towards linear algebra. Contrary to C++, Fortran has a lot of built-in compiler support for this. But I would happily load libraries in C++ to gain elaborate OOP support.
But if the construct below is missing in C++, that would be really annoying.
As I haven't found anything related yet, I would appreciate it if some experienced C++ programmer could comment.
An assignment to a 1D array location in Fortran using a cascade of vector subscripts can be as complex as this:
iv1(ivcr(val(i,j)))=1
where iv1 and ivcr are 1D integer vectors, val is a 2D integer array, and i and j are scalars. I am wondering whether I could write this in a similarly compact form in C++.
An only slightly more complex example would be:
iv1(ivcr(val(i:j,j)))=1
which will fill a section of iv1 with "1".
How would C++ deal with that problem in the shortest possible way?
Given (suitably initialized):
std::vector<int> iv1, ivcr;
std::vector<std::vector<int>> val;
Then your iv1(ivcr(val(i,j)))=1 is simply
iv1[ivcr[val[i][j]]] = 1;
As for iv1(ivcr(val(i:j,j)))=1, or even just val(i:j, j), there is no built-in way to slice into arrays like this. To be able to assign 1 through these kinds of nested data-structure accesses, you would need data structures that provide expression templates. The Eigen library has just that and is one of the major linear algebra libraries for C++. Check out its documentation for indexing and slicing here:
https://eigen.tuxfamily.org/dox-devel/group__TutorialSlicingIndexing.html

Intel MKL OOP Wrapper Design and Operator Overloading

I started writing an OOP wrapper for Intel's MKL library and came across some design issues. I hope you can help me find the "best" way to handle them. The issues mainly concern operator overloading and are not critical to the wrapper, but they affect readability and/or performance.
The first issue is overloading operators considering how the BLAS functions are defined. As an example, matrix multiplication (gemm) is defined as
C ← αAB + βC (A, B, C being matrices, α and β scalars).
Now I can overload *, + and = alone, but for the BLAS implementation I would need four function calls using overloaded operators instead of one. Or I could use a normal function call (which will be implemented anyway), but lose the "natural" way of writing the equation using overloaded operators, making it less readable (but still more readable than with those horrible BLAS names).
The second issue is read and write access to the matrices. As an example, consider an upper triangular matrix, where all entries below the diagonal are zero.
Such a matrix would be stored efficiently in a 1D array holding only the entries on and above the diagonal (the order may vary depending on row/column-major storage):
Since a matrix has two indices, the easiest way to overload reading would be using
<TYPE> & operator() (size_t row, size_t column);
instead of some workaround with subscript operators. The problem is handling the zeros. They may not be stored in the array, but mathematically they exist. If I want to read these values in another function (not MKL), I may need to be able to return the zero (aside from storing the matrix type, which is done for BLAS anyway).
Since operator() returns a reference, I can't return 0. I could make a dummy variable, but if someone were to write to that value, I wouldn't have an upper triangular matrix anymore. So I would have to either change the matrix type, forbid writing to these elements, or ignore the problem (a bad idea).
To change the matrix type I would need to detect writing, which would require explicitly using some kind of proxy object.
To prevent writing, I would probably have to do the same, since I can't return a const value because the overload doesn't fit that definition. Alternatively, I could forbid writing this way in general, but then I couldn't modify the existing matrix itself, which I don't want.
I hope you can give me some pointers on how to handle these issues and what design principles I may be forgetting/should take into account. As I said, they are not critical (I can write appropriate functions for everything instead of operators).
I wrote a library for medical image reconstruction: https://github.com/kvahed/codeare. The matrix object there has a lot of overloaded operators and convenience functions to allow one to efficiently write MATLAB-like code in C++.
What you want to do for passing the data between MKL and other libraries/algorithms is, in my view, impossible. How do you want to distinguish 0 from 1e-18? What about when you move to some numeric optimisation, etc.? This is premature optimisation that you are looking at. Even if you wanted to exploit sparsity, you could only do it, say, column-wise or row-wise, or, as noted above, record that you have an upper triangular form. But skipping individual 0s? Crazy. Of course copying 0s around doesn't feel right, but getting your algorithms working first and worrying about the above later would be the way I'd go.
Also don't forget that a lot of libraries out there cannot handle sparse matrices, at which point you would have to put in place a recopying of the non-zero part or some badly expensive iterator to deliver the results.
Btw you would not only need the operator you noted down in your question but also the const variant; in other words:
template <typename T> class Matrix {
...
T& operator()(size_t n, size_t m);
const T& operator()(size_t n, size_t m) const;
...
};
There is much more expensive stuff to optimise than std::copy-ing data around, for example SIMD intrinsics:
https://github.com/kvahed/codeare/blob/master/src/matrix/SIMDTraits.hpp

Eigen Map<> performance

I'm using Eigen to provide some convenient array operations on some data that also interfaces with some C libraries (particularly FFTW). Essentially I need to FFT the x, y, and z components of a (large) collection of vectors. Because of this, I'm trying to decide if it makes more sense to use
array<Eigen::Vector3d, n> vectors;
(which will make the FFTW calls a little more cumbersome as I'd need a pointer to the very first double) or
array<double, 3 * n> data;
(which will make the linear algebra/vector products more cumbersome, as I'd have to wrap chunks of data with Eigen::Map<Vector3d>). A little bit of empirical testing shows that constructing the Map is about 30% slower than constructing an Eigen::Vector3d, but I could easily have oversimplified my use case in exploring this. Is there a good reason to prefer one design over the other? I'm inclined to use the second, because data has less abstraction in the way of treating its contents as "a collection of doubles that I can interpret however I want."
Edit: There's probably something to be said for implicit Eigen::Vector3d alignment, but I'm assuming vectors and data align the underlying doubles the same way.

performance of thrust vs. cublas

I have a std::vector of matrices of different sizes, and I am going to calculate the square of every matrix. I have two solutions:
1/ Flatten all my matrices and store them on the device as one huge flat array (float *), with indices of the beginning and end of each matrix in that array, and use cuBLAS, for example, to do the squaring.
2/ Store the matrices in a thrust::device_vector<float *> and use thrust::for_each to square them.
Clearly the second solution gives more readable code, but does it impact performance?
I think this is (now) just a repeat of a question you already asked.
Assuming the elementwise operation you want to do is something simple like squaring of each element, there should be little difference in performance or efficiency between the two cases.
This is because such an operation will be memory-bound, meaning its performance will be limited by (GPU) memory bandwidth. Therefore both realizations will have approximately the same limiter, and approximately the same performance.
Note that in both of your proposals, the data will ultimately need to be effectively "flattened" in the same way (thrust operations cannot be constructed in a typical or simple fashion to operate on a thrust::device_vector<float *>).
If you already have a mix of thrust and CUBLAS, for example, then you could probably use whichever approach suited you. If, on the other hand, your module used only CUBLAS, and you could realize your operation using either CUBLAS or thrust, I'm not sure I would inject thrust just for this one operation. But that's just a matter of opinion.

C++ multidimensional arrays possibilities

I would like to translate some existing MATLAB code that quite naturally uses a lot of multidimensional arrays, and I wonder what the possible options are. I want the containers to have copy constructors, default constructors, if possible clear error messages at compilation, access via A[i][j], and in general not to be troublesome. Preferably, they should use the std::move operation for speed.
As far as I can see, the options boil down to:
std::vector iterated. It sure works, but it seems clumsy to write std::vector<std::vector<std::vector<double> > > for a 3D array. I am also concerned about the overhead in speed and memory.
boost::multi_array and blitz::Array offer most of the functionality but fail at the copy constructor at runtime (see Stack Overflow). It is unclear to me whether there are valid reasons for that.
The Eigen library seems to be very fast, but it does not allow copying at all and has no default constructor, which means that another container has to be used.
std::array has the disadvantage that the size has to be known when the object is created, so there is no default constructor.
Is there a simpler multidimensional container satisfying all these requirements but more frugal than iterated std::vector?
There is a good linear algebra package called Armadillo:
http://arma.sourceforge.net/
I used it with R; happy user.
I am not sure this can answer all your needs, but I myself had to handle multi-dimensional arrays for creating meshes/grids and wanted to create my own class for that.
My class, let's call it MultiArray, uses a one-dimensional vector as container.
For instance, writing MultiArray<4, float, 10, 15, 10, 18> A; would create a multi-array A[10][15][10][18] backed by a vector of size 10*15*10*18.
I can access elements by a single index with A(i) or by coordinates A[i][j][k][l] by calling A({i,j,k,l}). For performance, I have precomputed the products of the dimensions in the constructor in order to quickly convert coordinates->index and index->coordinates.
The code is generic for N dimensions. I can detail some parts if you want.
You have missed another option:
std::valarray
Depending on what your requirements are, it could be useful.
http://www.cplusplus.com/reference/valarray/