I'd like to ask about mathematical operations of arrays. I am mainly interested in carrying out operations such as:
vector products:
C=A+B
C=A*B
where A and B are arrays (or vectors), and
matrix products:
D=E*F;
where D[m][n], E[m][p], F[p][n];
Could anyone tell me what is the most efficient way to manipulate large quantities of numbers? Is it only possible by looping through the elements of an array or is there another way? Can vectors be used and how?
The C++ spec does not have mathematical constructs as you describe. The language surely provides all the features necessary for people to implement them. There are a lot of libraries out there, so its just up to you to choose one that fits your requirements.
Searching through stack overflow questions might give you an idea of where to start identifying those requirements if you don't know them already.
What's a good C++ library for matrix operations
Looking for an elegant and efficient C++ matrix library
High Performance Math Library for Vector And Matrix Calculations
Matrix Data Type in C++
Best C++ Matrix Library for sparse unitary matrices
Check out Armadillo, it provides lots of matrix functionality in a C++ interface. And it supports LAPACK, which is what MATLAB uses for linear algebra calculations.
C++ does not come with any "number aggregate" handling functionality out of the box, which the possible exception of std::valarray. (Compiler vendors could make valarray use vectorized operations, but generally speaking they don't)
Related
I want to do all kinds of matrix operation like finding a sub-matrix and finding its index and multiplication using vectors but can't find proper syntax for that on the internet. Please let me know the correct syntax to transition from primitive data types to vectors.
I suggest that you have a look at linear algrebra libraries, such as eigen, armadillo or lapack.
For armadillo you can find here an example on how to transform data form your structure to theirs.
Transforming or copying will take some time. Nevertheless, you will have a hard time creating a linear algebra library on your own, that will out-perform one of these libraries. Moreover, if you design your datastructure accordingly, you don't have to pay for this overhead.
I am used to Eigen for almost all my mathematical linear algebra work.
Recently, I have discovered that Boost also provides a C++ template class library that provides Basic Linear Algebra Library (Boost::uBLAS). This got me wondering if I can get all my work based only on boost as it is already a major library for my code.
A closer look at both didn't really got me a clearer distinction between them:
Boost::uBLAS :
uBLAS provides templated C++ classes for dense, unit and sparse vectors, dense, identity, triangular, banded, symmetric, hermitian and sparse matrices. Views into vectors and matrices can be constructed via ranges, slices, adaptor classes and indirect arrays. The library covers the usual basic linear algebra operations on vectors and matrices: reductions like different norms, addition and subtraction of vectors and matrices and multiplication with a scalar, inner and outer products of vectors, matrix vector and matrix matrix products and triangular solver.
...
Eigen :
It supports all matrix sizes, from small fixed-size matrices to arbitrarily large dense matrices, and even sparse matrices.
It supports all standard numeric types, including std::complex, integers, and is easily extensible to custom numeric types.
It supports various matrix decompositions and geometry features.
Its ecosystem of unsupported modules provides many specialized features such as non-linear optimization, matrix functions, a polynomial solver, FFT, and much more.
...
Does anyone have a better idea about their key differences and on which basis can we choose between them?
I'm rewriting a substantial project from boost::uBLAS to Eigen. This is production code in a commercial environment. I was the one who chose uBLAS back in 2006 and now recommended the change to Eigen.
uBLAS results in very little actual vectorization performed by the compiler. I can look at the assembly output of big source files, compiled to amd64 architecture, with SSE, using the float type, and not find a single ***ps instruction (addps, mulps, subps, 4 way packed single-precision floating point instructions) and only ***ss instructions (addss, ..., scalar single-precision).
With Eigen, the library is written to make sure that vector instructions result.
Eigen is very feature complete. Has lots of matrix factorizations and solvers. In boost::uBLAS the LU factorization is an undocumented add-on, a piece of contributed code. Eigen has additions for 3D geometry, such as rotations and quaternions, not uBLAS.
uBLAS is slightly more complete on the most basic operations. Eigen lacks some things, such as projection (indexing a matrix using another matrix), while uBLAS has it. For features that both have, Eigen is more terse, resulting in expressions that are easier to read.
Then, uBLAS is completely stale. I can't understand how anyone considers it in 2016/2017. Read the FAQ:
Q: Should I use uBLAS for new projects?
A: At the time of writing (09/2012) there are a lot of good matrix libraries available, e.g., MTL4, armadillo, eigen. uBLAS offers a stable, well tested set of vector and matrix classes, the typical operations for linear algebra and solvers for triangular systems of equations. uBLAS offers dense, structured and sparse matrices - all using similar interfaces. And finally uBLAS offers good (but not outstanding) performance. On the other side, the last major improvement of uBLAS was in 2008 and no significant change was committed since 2009. So one should ask himself some questions to aid the decision: Availability? uBLAS is part of boost and thus available in many environments. Easy to use? uBLAS is easy to use for simple things, but needs decent C++ knowledge when you leave the path. Performance? There are faster alternatives. Cutting edge? uBLAS is more than 10 years old and missed all new stuff from C++11.
I just did a time complexity comparison between boost and eigen for fairly trivial matrix computations. These results, limited as they are, seem to denote that boost is a much better alternative.
I had an FEM code which does the pre-processing parts (setting up the element matrices and stitching them together). So naturally, this would involve a lot of memory allocations.
I wrote identical pieces of codes with Boost and Eigen on C++ (gcc 5.4.0, ubuntu 16.04, Intel i3 Quad Core, 2.40GHz, RAM : 4Gb) and ran them separately for varying node sizes (N) and calculated time using the linux cl-utility.
As far as I'm concerned, I have decided to proceed with my code in Boost.
Choose Eigen if you care the performance and performance gain introduced by expression templates, and choose uBlas if you only want to learn expression templates.
http://eigen.tuxfamily.org/index.php?title=Benchmark
What library computes the rank of a matrix the fastest? Or, is there any code out in the open that does this fairly rapidly?
I am using Eigen3 and it seems to be slower than Python's numpy rank function. I just need this one function to be fast, absolutely nothing else matters. If you suggest a package everything but this is irrelevant, including ease of use.
The matrices I am looking at tend to be n by ( n choose 3) in size, the entries are 1 or 0....mostly 0's.
Thanks.
Edit 1: the rank is over R.
In general, BLAS/LAPACK functions are frighteningly fast. This link suggests using the GESVD or GESDD functions to compute singular values. The number of non-zero singular values will be the matrix's rank.
LAPACK is what numpy uses.
In short, you can use the same LAPACK library calls. It will be difficult to outperform BLAS/LAPACK functions, unless sparsity and special structure allow more efficient approaches. If that's true, you may want to check around for alternative libraries implementing sparse SVD solvers.
Note also there are multiple BLAS/LAPACK implementations.
Update
This post seems to argue that LU decomposition is unreliable for calculating rank. Better to do SVD. You may want to see how fast that eigen call is before going through all the hassle of using BLAS/LAPACK (I've just never used eigen).
I m going to do an algorithm like needleman-wunsch algorithm or smith-watterman algorithm for large sequences. So I'm going to need a way to create matrices of different sizes so my question is which library gives me the best performance on that and would be easy to use.
P.S: I know that OpenCV, and Boost can handle the matrices but I don't know if there are good to do operations on it.
If “the best performance” is the requirement, then you have to look at NVIDIA CUDA or Intel MKL. Libraries like C++ boost uBLAS concentrate not on performance, but on usability.
Hi I've been doing some research about matrix inversion (linear algebra) and I wanted to use C++ template programming for the algorithm , what i found out is that there are number of methods like: Gauss-Jordan Elimination or LU Decomposition and I found the function LU_factorize (c++ boost library)
I want to know if there are other methods , which one is better (advantages/disadvantages) , from a perspective of programmers or mathematicians ?
If there are no other faster methods is there already a (matrix) inversion function in the boost library ? , because i've searched alot and didn't find any.
As you mention, the standard approach is to perform a LU factorization and then solve for the identity. This can be implemented using the LAPACK library, for example, with dgetrf (factor) and dgetri (compute inverse). Most other linear algebra libraries have roughly equivalent functions.
There are some slower methods that degrade more gracefully when the matrix is singular or nearly singular, and are used for that reason. For example, the Moore-Penrose pseudoinverse is equal to the inverse if the matrix is invertible, and often useful even if the matrix is not invertible; it can be calculated using a Singular Value Decomposition.
I'd suggest you to take a look at Eigen source code.
Please Google or Wikipedia for the buzzwords below.
First, make sure you really want the inverse. Solving a system does not require inverting a matrix. Matrix inversion can be performed by solving n systems, with unit basis vectors as right hand sides. So I'll focus on solving systems, because it is usually what you want.
It depends on what "large" means. Methods based on decomposition must generally store the entire matrix. Once you have decomposed the matrix, you can solve for multiple right hand sides at once (and thus invert the matrix easily). I won't discuss here factorization methods, as you're likely to know them already.
Please note that when a matrix is large, its condition number is very likely to be close to zero, which means that the matrix is "numerically non-invertible". Remedy: Preconditionning. Check wikipedia for this. The article is well written.
If the matrix is large, you don't want to store it. If it has a lot of zeros, it is a sparse matrix. Either it has structure (eg. band diagonal, block matrix, ...), and you have specialized methods for solving systems involving such matrices, or it has not.
When you're faced with a sparse matrix with no obvious structure, or with a matrix you don't want to store, you must use iterative methods. They only involve matrix-vector multiplications, which don't require a particular form of storage: you can compute the coefficients when you need them, or store non-zero coefficients the way you want, etc.
The methods are:
For symmetric definite positive matrices: conjugate gradient method. In short, solving Ax = b amounts to minimize 1/2 x^T A x - x^T b.
Biconjugate gradient method for general matrices. Unstable though.
Minimum residual methods, or best, GMRES. Please check the wikipedia articles for details. You may want to experiment with the number of iterations before restarting the algorithm.
And finally, you can perform some sort of factorization with sparse matrices, with specially designed algorithms to minimize the number of non-zero elements to store.
depending on the how large the matrix actually is, you probably need to keep only a small subset of the columns in memory at any given time. This might require overriding the low-level write and read operations to the matrix elements, which i'm not sure if Eigen, an otherwise pretty decent library, will allow you to.
For These very narrow cases where the matrix is really big, There is StlXXL library designed for memory access to arrays that are mostly stored in disk
EDIT To be more precise, if you have a matrix that does not fix in the available RAM, the preferred approach is to do blockwise inversion. The matrix is split recursively until each matrix does fit in RAM (this is a tuning parameter of the algorithm of course). The tricky part here is to avoid starving the CPU of matrices to invert while they are pulled in and out of disk. This might require to investigate in appropiate parallel filesystems, since even with StlXXL, this is likely to be the main bottleneck. Although, let me repeat the mantra; Premature optimization is the root of all programming evil. This evil can only be banished with the cleansing ritual of Coding, Execute and Profile
You might want to use a C++ wrapper around LAPACK. The LAPACK is very mature code: well-tested, optimized, etc.
One such wrapper is the Intel Math Kernel Library.