I was wondering if there is a more efficient way to remove rows or columns whose elements are all zero. I am sure there is one using the functions in the Eigen library, but I do not know how.
Right now I am doing it as shown below. The while loop is there because several rows/columns may sum to zero: I do not want to exceed the range limits or skip any zero rows.
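void removeZeroRows() {
    int16_t index = 0;
    int16_t num_rows = rows();
    while (index < num_rows) {
        double sum = row(index).sum();
        // I use a better test if zero but use this for demonstration purposes
        if (sum == 0.0) {
            removeRow(index); // assumed helper (elided in the post) that erases row `index`
        }
        else {
            index++; // only advance when nothing was removed, so no row is skipped
        }
        num_rows = rows();
    }
}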
Currently (Eigen 3.3), there is no direct functionality for this (though it is planned for Eigen 3.4).
In the meantime, you can use something like this (rows and columns can of course be interchanged, and the output is just for illustration):
Eigen::MatrixXd A;
// find non-zero columns:
Eigen::Matrix<bool, 1, Eigen::Dynamic> non_zeros = A.cast<bool>().colwise().any();
std::cout << "A:\n" << A << "\nnon_zeros:\n" << non_zeros << "\n\n";
// allocate result matrix:
Eigen::MatrixXd res(A.rows(), non_zeros.count());
// fill result matrix:
Eigen::Index j=0;
for(Eigen::Index i=0; i<A.cols(); ++i)
{
    if(non_zeros(i)) // copy only the columns that contain a non-zero entry
        res.col(j++) = A.col(i);
}
std::cout << "res:\n" << res << "\n\n";
Generally, you should avoid resizing a matrix at every iteration, but resize it to the final size as soon as possible.
With Eigen 3.4 something similar to this will be possible (syntax is not final yet):
Eigen::MatrixXd res = A("", A.cast<bool>().colwise().any());
Which would be equivalent to Matlab/Octave:
res = A(:, any(A));
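The advice about avoiding repeated resizing is not specific to Eigen. As a minimal sketch, assuming the matrix is held as a plain std::vector of rows (the container and the helper name removeZeroRows are illustrative choices, not Eigen API), the erase-remove idiom compacts the surviving rows in one pass and shrinks the container only once:

```cpp
#include <algorithm>
#include <vector>

using Row = std::vector<double>;

// Hypothetical helper: drop every row whose entries are all exactly zero.
// std::remove_if moves the kept rows to the front in a single pass;
// erase() then shrinks the container once, instead of once per removed row.
void removeZeroRows(std::vector<Row>& rows) {
    auto isZeroRow = [](const Row& r) {
        return std::all_of(r.begin(), r.end(), [](double x) { return x == 0.0; });
    };
    rows.erase(std::remove_if(rows.begin(), rows.end(), isZeroRow), rows.end());
}
```

Testing each element rather than the row sum also keeps rows such as {1, -1}, which sum to zero without being zero rows.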
I am trying to use Eigen to efficiently assemble a Stiffness matrix for non-linear finite element computations.
From my finite element discretization I can exactly extract my sparsity pattern. Therefore I can just use:
mat.setFromTriplets(TripletList.begin(), TripletList.end());
as proposed in http://eigen.tuxfamily.org/dox/group__SparseQuickRefPage.html.
My questions that arise here are:
1. Due to the non-linear nature of the problem I have to refill my matrix very often. Should I therefore store all contributions in a triplet list again and call mat.setFromTriplets(...) again and again?
2. If I reuse mat.setFromTriplets(...), can I somehow exploit the fact that I always evaluate my element matrices in the same order during assembly, so that the indices in the triplets never change and only the values do? The "search in memory" could then be circumvented, since I could store the target position of each entry in a separate array.
3. If mat.coeffRef(i,j) is faster, can I exploit the aforementioned fact there?
4. One extra question (lower priority): is it possible to efficiently store and assemble three matrices with the same sparsity pattern, e.g. if I have to do it in a loop? For example, a matrix wrapper around one SparseMatrix from which I get the matrices as M1=mat[0], M2=mat[1], M3=mat[2], where mat[i] returns the i-th matrix and M1, M2 and M3 are e.g. SparseMatrix<double> M1(1000,1000).
The general setup is the following (for questions 1 to 3 only M1 is relevant):
std::vector< Eigen::Triplet<double> > tripletListA; // triplets differ only in the values and not in the indices
std::vector< Eigen::Triplet<double> > tripletListB;
std::vector< Eigen::Triplet<double> > tripletListC;
SparseMatrix<double> M1(1000,1000);
SparseMatrix<double> M2(1000,1000);
SparseMatrix<double> M3(1000,1000);
//Reserve space in triplets
//Reserve space in matrices
//fill triplet list with zeros
M1.setFromTriplets(tripletListA.begin(), tripletListA.end());
M2.setFromTriplets(tripletListB.begin(), tripletListB.end());
M3.setFromTriplets(tripletListC.begin(), tripletListC.end());
for (int i=0; i<1000; i++) {
    //Fill triplets
    M1.setFromTriplets(tripletListA.begin(), tripletListA.end()); //or use coeffRef?
    M2.setFromTriplets(tripletListB.begin(), tripletListB.end());
    M3.setFromTriplets(tripletListC.begin(), tripletListC.end());
}
Thank you and regards,
Thank you for your answers. Initially, the order in which I access the nonzeros is quite arbitrary. But since I am interested in an iterative scheme, I am thinking about recording this arbitrary ordering once and constructing an operator that takes care of it. This operator can be constructed (at least in my mind) from the initially constructed triplet list.
SparseMatrix<double> mat(rows,cols);
std::vector<double> valuevector(nnz);
//Initially construction
std::vector< Eigen::Triplet<double> > tripletList;
//naive fill of tripletList
//Sorting of entries and identifying double entries in tripletList from col and row values
//generating from this information operator P
for (int i=0; i<1000; i++)
{
    //naive refill of tripletList
    valuevector = P*tripletList.value(); //pseudocode: build the value vector in the efficient ordering from the triplet values (std::vector has no value() member, but I hope the intent is clear)
    for (int k=0; k<mat.outerSize(); ++k)
        for (SparseMatrix<double>::InnerIterator it(mat,k); it; ++it)
            it.valueRef() = valuevector(it); //pseudocode as well
}
I think of the operator P simply as a matrix with ones and zeros at the appropriate places.
The question remains whether this procedure is actually more efficient.
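The idea behind P can be sketched without Eigen. The snippet below is only an illustration under stated assumptions: all names (Pattern, buildPattern, refill) are hypothetical, and it assumes the target matrix is stored compressed and column-major with sorted inner indices, so that slot k would correspond to the k-th stored value. Instead of a 0/1 matrix, P is kept as an index array that maps each triplet to its slot:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical names throughout; this is plain C++, not Eigen API.
struct Pattern {
    std::vector<int> slot; // slot[t] = position of triplet t in the compressed value array
    int nnz = 0;           // number of distinct (row, col) positions
};

// One-time preprocessing: order the triplets column-major and give
// duplicates of the same (row, col) the same slot.
Pattern buildPattern(const std::vector<int>& rows, const std::vector<int>& cols) {
    const std::size_t n = rows.size();
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return cols[a] != cols[b] ? cols[a] < cols[b] : rows[a] < rows[b];
    });
    Pattern p;
    p.slot.resize(n);
    for (std::size_t k = 0; k < n; ++k) {
        const std::size_t t = order[k];
        const bool isNew = (k == 0) || rows[t] != rows[order[k - 1]] ||
                           cols[t] != cols[order[k - 1]];
        if (isNew) ++p.nnz;
        p.slot[t] = p.nnz - 1;
    }
    return p;
}

// Per-iteration refill: no sorting and no searching, just a scatter-add.
void refill(const Pattern& p, const std::vector<double>& tripletValues,
            std::vector<double>& values) {
    values.assign(p.nnz, 0.0);
    for (std::size_t t = 0; t < tripletValues.size(); ++t)
        values[p.slot[t]] += tripletValues[t];
}
```

With this layout the sort is paid once, and every later assembly costs one pass over the triplet values.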
UPDATE-2: Benchmark:
I tried to put my ideas into a code snippet. I first generate a random triplet list. The list is constructed to give a sparsity of about 95%, and some entries are additionally duplicated to mimic triplets that write to the same position in the sparse matrix. The values are then inserted using three different approaches. The first is the setFromTriplets approach; the second and third try to exploit the known structure.
The second and third approaches record the ordering of the triplet list. This information is then used to write the values directly into the raw mat1.coeffs() vector.
#include <iostream>
#include <Eigen/Sparse>
#include <algorithm>
#include <cmath>
#include <random>
#include <fstream>
#include <chrono>
#include <utility>
#include <vector>
using namespace std::chrono;
using namespace Eigen;
using namespace std;
typedef Eigen::Triplet<double> T;
// Count, for each distinct (col,row) pair of the sorted list, how often it occurs.
// multiplicity must be initialised to ones by the caller.
void findDuplicates(vector<pair<int, int> > &dummypair, Ref<VectorXi> multiplicity) {
    int pairCount = 0;
    pair<int, int> currentPair;
    for (int i = 0; i < multiplicity.size(); ++i) {
        currentPair = dummypair[pairCount];
        // the bounds check is an added guard against reading past the end of the list
        while (pairCount + multiplicity[i] < static_cast<int>(dummypair.size()) &&
               currentPair == dummypair[pairCount + multiplicity[i]]) {
            multiplicity[i]++; // reconstructed: the loop body is missing from the post
        }
        pairCount += multiplicity[i];
    }
}
typedef Matrix<duration<double, std::milli>, Dynamic, Dynamic> MatrixXtime;
int main() {
    //init random generators
    std::default_random_engine gen;
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    int sizesForTest = 5;
    int measures = 6;
    MatrixXtime timeArray(sizesForTest, measures);
    cout << "TripletTime NestedTime LNestedTime " << endl;
    for (int m = 0; m < sizesForTest; ++m) {
        int rows = pow(10, m + 1);
        int cols = rows;
        std::uniform_int_distribution<int> distentryrow(0, rows - 1);
        std::uniform_int_distribution<int> distentrycol(0, cols - 1);
        std::vector<T> tripletList;
        SparseMatrix<double> mat1(rows, cols);
        // SparseMatrix<double> mat2(rows,cols);
        // SparseMatrix<double> mat3(rows,cols);
        //generate sparsity pattern of matrix with roughly 5% fill-in
        tripletList.emplace_back(3, 0, 15);
        for (int i = 0; i < rows; ++i)
            for (int j = 0; j < cols; ++j) {
                auto value = dist(gen); //generate random number
                auto value2 = dist(gen); //generate random number
                auto value3 = dist(gen); //generate random number
                if (value < 0.05) {
                    auto rowindex = distentryrow(gen);
                    auto colindex = distentrycol(gen);
                    tripletList.emplace_back(rowindex, colindex, value); //if below the threshold, insert it
                    //duplicate every third entry to mimic entries which appear more than once
                    if (value2 < 0.3333333333333333333333)
                        tripletList.emplace_back(rowindex, colindex, value);
                    //triple every fourth entry to mimic entries which appear more than once
                    if (value3 < 0.25)
                        tripletList.emplace_back(rowindex, colindex, value);
                }
            }
        tripletList.emplace_back(3, 0, 9);
        int numberOfValues = tripletList.size();
        //initially set all matrices from triplet to allocate space and sparsity pattern
        mat1.setFromTriplets(tripletList.begin(), tripletList.end());
        // mat2.setFromTriplets(tripletList.begin(), tripletList.end());
        // mat3.setFromTriplets(tripletList.begin(), tripletList.end());
        int nnz = mat1.nonZeros();
        //reset all entries back to zero to fill in later
        // mat2.coeffs().setZero();
        // mat3.coeffs().setZero();
        //document sorting of entries for repetitive insertion
        VectorXi internalIndex(numberOfValues);
        vector<pair<int, int> > dummypair(numberOfValues);
        VectorXd valuelist(numberOfValues);
        for (int l = 0; l < numberOfValues; ++l) {
            valuelist(l) = tripletList[l].value();
        }
        //init internalindex and dummy pair
        internalIndex = Eigen::VectorXi::LinSpaced(numberOfValues, 0.0, numberOfValues - 1);
        for (int i = 0; i < numberOfValues; ++i) {
            dummypair[i].first = tripletList[i].col();
            dummypair[i].second = tripletList[i].row();
        }
        auto start = high_resolution_clock::now();
        // sort the vector internalIndex based on the dummypair
        sort(internalIndex.begin(), internalIndex.end(), [&](int i, int j) {
            return dummypair[i].first < dummypair[j].first ||
                   (dummypair[i].first == dummypair[j].first && dummypair[i].second < dummypair[j].second);
        });
        auto stop = high_resolution_clock::now();
        timeArray(m, 3) = (stop - start) / 1000;
        start = high_resolution_clock::now();
        sort(dummypair.begin(), dummypair.end());
        stop = high_resolution_clock::now();
        timeArray(m, 4) = (stop - start) / 1000;
        start = high_resolution_clock::now();
        VectorXi dublicatecount(nnz);
        dublicatecount.setOnes(); // assumed initialisation: every stored entry occurs at least once
        findDuplicates(dummypair, dublicatecount);
        stop = high_resolution_clock::now();
        timeArray(m, 5) = (stop - start) / 1000;
        //calculate vector containing all indices of triplet
        //therefore vector[k] is the VectorXi containing the entries of triplets which should be written at dof k
        int indextriplet = 0;
        int multiplicity = 0;
        vector<VectorXi> listofentires(mat1.nonZeros());
        for (int k = 0; k < mat1.nonZeros(); ++k) {
            multiplicity = dublicatecount[k];
            listofentires[k] = internalIndex.segment(indextriplet, multiplicity);
            indextriplet += multiplicity;
        }
        //Here the nonlinear analysis should start and everything beforehand is preprocessing
        //Test1 from triplets
        start = high_resolution_clock::now();
        mat1.setFromTriplets(tripletList.begin(), tripletList.end());
        stop = high_resolution_clock::now();
        timeArray(m, 0) = (stop - start) / 1000;
        //Test2 use internalIndex but calculate listofentires on the fly
        indextriplet = 0;
        start = high_resolution_clock::now();
        for (int k = 0; k < mat1.nonZeros(); ++k) {
            multiplicity = dublicatecount[k];
            mat1.coeffs()[k] += valuelist(internalIndex.segment(indextriplet, multiplicity)).sum();
            indextriplet += multiplicity;
        }
        stop = high_resolution_clock::now();
        timeArray(m, 1) = (stop - start) / 1000;
        //Test3 directly use listofentires
        start = high_resolution_clock::now();
        for (int k = 0; k < mat1.nonZeros(); ++k)
            mat1.coeffs()[k] += valuelist(listofentires[k]).sum();
        stop = high_resolution_clock::now();
        timeArray(m, 2) = (stop - start) / 1000;
        std::ofstream file("test.txt");
        if (file.is_open()) {
            file << mat1 << '\n';
        }
        cout << "Size: " << rows << ": ";
        for (int n = 0; n < measures; ++n)
            cout << timeArray(m, n).count() << " ";
        cout << endl;
    }
    return 0;
}
If I run this example on my i5-6600K (3.5 GHz, 16 GB RAM), I get the following results; all times are in seconds.
Size     Triplet   Nested     LessNested  Sort_intIndex  Sort_dum_pair  findDuplicates
10       1e-06     1e-06      2e-06       1e-06          1e-06          1e-06
100      2.8e-05   4e-06      1.4e-05     5e-05          4.2e-05        1e-05
1000     0.003     0.000416   0.001489    0.01012        0.00627        0.000635
10000    0.426     0.093911   0.48912     1.5389         0.780676       0.061881
100000   337.799   99.0801    37.3656     292.397        87.4488        0.79996
The first three columns give the computation time of the three approaches, and columns 4 to 6 give the times of the individual preprocessing steps.
For a size of 100000 rows and columns my RAM fills up relatively fast, so the last table row should be taken with care. Here the fastest method changes from the second to the third.
My questions are: is this approach going in the right direction to improve efficiency? Or is it a completely wrong direction, given that, for example, an assembly time of 0.48 s for a size of 10000 seems a bit high?
Additionally, the preprocessing steps get expensive very quickly. Is there a better way to construct the ordering of the matrix? Finally, as a last question: is the benchmarking done correctly?
Thanks for your time,
I'm working in C++ with a sparse matrix in Eigen. I would like to read the data stored at a specific row and column index, just like I would with a regular Eigen matrix.
std::vector<Eigen::Triplet<double>> tripletList;
// TODO: populate triplet list with non-zero entries of matrix
Eigen::SparseMatrix<double> matrix(nRows, nCols);
matrix.setFromTriplets(tripletList.begin(), tripletList.end());
// TODO: set iRow and iCol to be valid indices.
// How to read the value at a specific row and column index?
// double value = matrix(iRow, iCol); // Compiler error
How do I go about performing this type of indexing operation?
Try coeff:
double value = matrix.coeff(iRow, iCol);
If you want a non-const version use coeffRef instead. Note that when using coeffRef if the element doesn't exist, it will be inserted.
This code works for me:
for (int i=0; i<matrix.rows(); ++i){
    for(int j=0; j<matrix.cols(); ++j)
        cout << " i,j=" << i << ","
             << j << " value="
             << matrix.coeff(i,j)
             << std::endl;
}
Here is how to do it in the raw:
The methods you want are innerIndexPtr, outerIndexPtr, innerNonZeroPtr (which exposes the InnerNNZs array) and valuePtr.
struct sparseIndex {
  std::size_t row, col;
  // assumes column-major storage (Eigen's default for SparseMatrix)
  template<class SparseMatrix, class Scalar=typename SparseMatrix::Scalar>
  Scalar get_from( SparseMatrix const& m, Scalar def={} ) const {
    if ((std::size_t)m.cols() <= col) return def; // column out of range
    auto* inner_index_start = m.innerIndexPtr()+m.outerIndexPtr()[col];
    auto* inner_index_end = inner_index_start;
    if (auto* nzp = m.innerNonZeroPtr()) { // returns null if compressed
      inner_index_end += nzp[col];
    } else {
      inner_index_end = m.innerIndexPtr()+m.outerIndexPtr()[col+1];
    }
    // binary search for the row index among the stored entries of this column
    auto search_result = std::equal_range(
      inner_index_start,
      inner_index_end,
      (typename SparseMatrix::StorageIndex)row
    );
    if (search_result.first == search_result.second) return def;
    if ((std::size_t)*search_result.first != row) return def;
    return m.valuePtr()[search_result.first-m.innerIndexPtr()];
  }
};
auto r = sparseIndex{2,2}.get_from( sparseMatrix );
Code not tested. Based off these docs and these docs which disagree in some details.
I suspect I just reimplemented .coeff, so take this with a grain of salt. :)
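To make the storage layout behind that lookup concrete, here is a minimal sketch on plain arrays rather than on Eigen's classes. The Csc struct and coeffAt are hypothetical names, and the layout is assumed to be compressed column-major with row indices sorted within each column:

```cpp
#include <algorithm>
#include <vector>

// Minimal compressed-column (CSC) container; hypothetical, for illustration only.
struct Csc {
    std::vector<int> outer;     // size cols + 1; column c occupies [outer[c], outer[c+1])
    std::vector<int> inner;     // row index of each stored entry, sorted within a column
    std::vector<double> values; // stored entries, in the same order as inner
};

// Return the entry at (row, col), or def when it is not stored.
double coeffAt(const Csc& m, int row, int col, double def = 0.0) {
    if (col < 0 || col + 1 >= static_cast<int>(m.outer.size())) return def;
    auto first = m.inner.begin() + m.outer[col];
    auto last = m.inner.begin() + m.outer[col + 1];
    auto it = std::lower_bound(first, last, row); // binary search inside one column
    if (it == last || *it != row) return def;
    return m.values[it - m.inner.begin()];
}
```

A lookup therefore costs a binary search over the stored entries of a single column, which is why random access into a sparse matrix is more expensive than into a dense one.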
I have a matrix A of this form:
Eigen::Matrix<bool, n, m> A(n, m)
and I want to obtain a random element among the ones that are 'true'. The silly way to do that would be to obtain the number of 'true' elements t, generate a random number between 1 and t and iterate:
//r = random number in [0, t-1], since k counts from 0
int k = 0;
for (int i = 0; i < A.rows(); ++i)
    for (int j = 0; j < A.cols(); ++j)
        if (A(i, j))
        {
            if (k == r)
                std::cout << "(" << i << ", " << j << ")" << std::endl;
            ++k; // count this 'true' element
        }
This solution is incredibly slow when multiple samples are needed and the matrix is big. Any suggestion as to how I should go about this?
In short: I'd like to find an efficient way to obtain the i-th 'true' element of the above matrix.
You could use Eigen::SparseMatrix instead.
Eigen::SparseMatrix<bool> A(n, m);
With its compressed (or not) column/row storage scheme, you could find the r-th non-zero element in O(m)/O(n) time, or O(log(m)) with binary search.
You could use the COO format utility Eigen::Triplet to find the r-th non-zero element in O(1) time.
std::vector<Eigen::Triplet<bool> > a(num_nonzeros);
And yes, since it's a bool matrix, storing the values is unnecessary too.
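The coordinate-list idea can be sketched in plain C++. This is only an illustration: collectTrue and sampleTrue are hypothetical helpers, and the boolean matrix is assumed to be given as a vector of rows. The scan is done once; every sample afterwards is a single array lookup:

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Coord = std::pair<int, int>; // (row, col)

// One pass over a dense boolean grid, collecting the coordinates of all true entries.
std::vector<Coord> collectTrue(const std::vector<std::vector<bool>>& grid) {
    std::vector<Coord> coords;
    for (std::size_t i = 0; i < grid.size(); ++i)
        for (std::size_t j = 0; j < grid[i].size(); ++j)
            if (grid[i][j])
                coords.emplace_back(static_cast<int>(i), static_cast<int>(j));
    return coords;
}

// Draw one true entry uniformly at random in O(1); coords must not be empty.
template <class Rng>
Coord sampleTrue(const std::vector<Coord>& coords, Rng& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, coords.size() - 1);
    return coords[pick(rng)];
}
```

If the matrix changes between samples, the coordinate list has to be rebuilt or updated, so this pays off mainly when many samples are drawn from the same matrix.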
For example I have a 10x10 SparseMatrix A, and I want to add a 3x3 identity matrix to the upper left corner of A.
A is known to be already non-zero in those 3 entries.
If I have to add the values one by one that is OK too, but I could not find a method to manipulate individual elements of a sparse matrix in Eigen.
Did I miss something?
If all you want is to apply an operation to a specific element at a time, you can use coeffRef like so:
typedef Eigen::Triplet<double> T;
std::vector<T> coefficients;
for (int i = 0; i < 9; i++) coefficients.push_back(T(i, i, (i % 3) + 1));
Eigen::SparseMatrix<double> A(10, 10);
A.setFromTriplets(coefficients.begin(), coefficients.end());
std::cout << A << "\n\n";
for (int i = 0; i < 3; i++) A.coeffRef(i,i) += 1;
std::cout << A << "\n\n";
////////////////////MAKE INPUT VALUES////////////////////
double *NumOfInputsPointer = NULL;
std::cout << "How many inputs?" << std::endl;
int NumOfInputs;
std::cin >> NumOfInputs;
NumOfInputsPointer = new double[NumOfInputs];
std::cout << std::endl;
double InputVal;
for(int a = 0; a < NumOfInputs; a++)
{
    std::cout << "What is the value for input " << a << std::endl;
    std::cin >> InputVal;
    *(NumOfInputsPointer + a) = InputVal;
}
std::cout << std::endl;
////////////////////MAKE WEIGHTS////////////////////
double *NumOfWeightsPointer = NULL;
int NumOfWeights;
NumOfWeightsPointer = new double[NumOfWeights];
double WightVal;
for(int a = 0; a < NumOfInputs; a++)
{
    *(NumOfWeightsPointer + a) = 0.5;
}
////////////////////Multiplication BRAIN BROKE!!!!!////////////////////
double *MultiplyPointer = NULL;
MultiplyPointer = NumOfInputsPointer;
for(int a = 0; a < NumOfInputs; a++)
{
    //Stuff to do things
}
The code above is meant to build a single artificial neuron. It already creates an array with the user's desired number of inputs and then automatically gives every input a weight of 0.5.
The wall I have hit is multiplying the array of input values by the array of weights and saving the products in another array, so that they can be added together later and then passed through a modifier.
My struggle is with the multiplication and with saving the result into an array. I hope I have explained my problem well enough.
There are many problems with this code. I would highly recommend using std::vector instead of arrays. If every input has a constant weight of 0.5, then what's the point of creating an array where all elements are 0.5? Just create a constant variable representing the 0.5 weight and apply it to each input. The second array is unnecessary from what I can tell. Creating the last array (again, this would be easier with a vector) would be similar to the first one because the size is going to be the same. It is based on the number of inputs. So just create an array of the same size, loop through each element in the first array, do the multiplication using the constant I described above, and then store the result into the new array.
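A minimal sketch of that suggestion, assuming a single constant weight shared by all inputs; the function names weightInputs and weightedSum are illustrative and not taken from the question:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Multiply every input by the same constant weight and keep the products.
std::vector<double> weightInputs(const std::vector<double>& inputs, double weight) {
    std::vector<double> weighted(inputs.size());
    for (std::size_t i = 0; i < inputs.size(); ++i)
        weighted[i] = inputs[i] * weight;
    return weighted;
}

// Sum of the weighted inputs, i.e. the value a later activation step would receive.
double weightedSum(const std::vector<double>& weighted) {
    return std::accumulate(weighted.begin(), weighted.end(), 0.0);
}
```

Because std::vector owns its memory, there is no new/delete bookkeeping, and the result always has the same size as the input.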
Just new it like you did with the others, and store the result of the multiplication there.
MultiplyPointer = new double[NumOfInputs];
for (int a = 0; a < NumOfInputs; a++) {
    MultiplyPointer[a] = NumOfWeightsPointer[a] * NumOfInputsPointer[a];
}
That being said, there are better ways to go about solving your problem. std::vector has been mentioned, which makes the memory management and looping bits easier. I would go a step further and incorporate a library with the notions of a matrix and matrix expressions, such as OpenCV or dlib.
Example using Mat from OpenCV:
cv::Mat input(NumOfInputs, 1, CV_64F, NumOfInputsPointer);
cv::Mat weights(NumOfInputs, 1, CV_64F, cv::Scalar(0.5));
cv::Mat result = input.mul(weights);
If the weights vector is not to be modified and reused, just skip the whole thing:
cv::Mat result = input.mul(cv::Scalar(0.5));