I'd like to obtain the set difference between two Eigen matrices. The code:
void diffMatrix(
    MatrixXi &M1, // First Matrix
    MatrixXi &M2, // Second Matrix
    MatrixXi &M3, // Matrix set difference
    VectorXi &I3  // Matrix set difference indices
)
{
    // find rows in first matrix that aren't in second matrix
    // cols of M1 = M2
    assert(M1.cols() == M2.cols());
    M3.resize(M1.rows(), M1.cols());
    I3.resize(M1.rows());
    bool m2r_nonex;
    size_t k = 0;
    // get M1 rows
    for (size_t i = 0; i < M1.rows(); i++)
    {
        m2r_nonex = true;
        auto m1r = M1.row(i);
        // NOTE: this is slow
        // check M1 row is in M2
        for (size_t j = 0; j < M2.rows(); j++)
        {
            auto m2r = M2.row(j);
            if (m1r == m2r)
                m2r_nonex = false;
        }
        // if it's not in M2, add it to M3
        if (m2r_nonex)
        {
            M3.row(k) = m1r;
            I3(k) = i;
            k++;
        }
    }
    M3.conservativeResize(k, NoChange);
    I3.conservativeResize(k, NoChange);
}
MatrixXi M1, M2, M3;
VectorXi I3;
M1.resize(3, 3);
M2.resize(2, 3);
M1 << 0, 0, 0, 1, 1, 1, 2, 2, 2;
M2 << 1, 1, 1, 2, 2, 2;
diffMatrix(M1, M2, M3, I3);
===========================================
M3 (Rows: 1 Cols: 3)
===========================================
[[0, 0, 0]]
The provided code works, of course, but it is slow. Ideally one would replace the inner for loop with some more compact expression, maybe something that counts all occurrences of an M1 row among the rows of M2 in a single statement...
Is that possible?
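(For what it's worth, the inner check can indeed be written as a single broadcast expression. The helper below is only an illustrative sketch, not code from the accepted answer, and the name rowInMatrix is made up; it also does not change the quadratic complexity, which is what the hash-set approach further down actually fixes.)
#include <Eigen/Dense>

// Sketch: does matrix M contain row r? Still O(rows(M)) per call.
bool rowInMatrix(const Eigen::RowVectorXi& r, const Eigen::MatrixXi& M)
{
    return ((M.rowwise() - r).cwiseAbs().rowwise().sum().array() == 0).any();
}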
===== EDIT =====
(following Homer512's answer)
Yes, it works. To give an idea of the performance increase, after loading 10000 records:
MatrixXi M1, M2, M3;
VectorXi I3;
size_t rows = 10000;
M1.resize(rows, 3);
M2.resize(rows, 3);
for (size_t i = 0; i < rows; i++)
{
    M1(i,0) = i;
    M1(i,1) = i;
    M1(i,2) = i;
    M2(i,0) = i + 1;
    M2(i,1) = i + 1;
    M2(i,2) = i + 1;
}
With the first method proposed, the elapsed time is 43.826125 secs; with the second method proposed, the elapsed time is 0.017632 secs.
That is several orders of magnitude faster... Thank you for this.
Here is a version that ought to be relatively efficient.
We start by transposing the input matrices. You want to compare rows, but Eigen stores its matrices column-major by default, which means that consecutive elements within a row are not stored consecutively in memory. This makes everything you do row-wise slow and everything you do column-wise fast. Ideally, you would skip this step and simply start with the proper orientation, or switch to row-major matrices as described in the Eigen documentation on storage orders.
const Eigen::MatrixXi left_transp = M1.transpose();
const Eigen::MatrixXi right_transp = M2.transpose();
Now for the actual star of the show: we want to use a hash set to check whether an element is present. The standard library provides one, std::unordered_set, but we need to define a suitable key to represent a vector, ideally without copying data.
Thankfully, C++17 introduced std::string_view and related type definitions, including std::u32string_view, which is perfect for binary int32 data. These types come with std::hash specializations and comparison operators.
At this point we can build a hash table for all entries in M2.
std::unordered_set<std::u32string_view> right_set;
right_set.reserve(static_cast<std::size_t>(right_transp.cols()));
const std::size_t rows = static_cast<std::size_t>(right_transp.rows());
for(auto col: right_transp.colwise()) {
    static_assert(sizeof(char32_t) == sizeof(int));
    assert(col.innerStride() == 1);
    const char32_t* ptr = reinterpret_cast<const char32_t*>(col.data());
    right_set.emplace(ptr, rows);
}
In the next step we can do the same with M1 and search the hash table for duplicates.
const Eigen::Index left_n = left_transp.cols();
I3.resize(left_n);
Eigen::Index count = 0;
for(Eigen::Index i = 0; i < left_n; ++i) {
    const char32_t* ptr = reinterpret_cast<const char32_t*>(
        left_transp.col(i).data());
    const std::u32string_view view{ ptr, rows };
    if(! right_set.count(view))
        I3[count++] = i;
}
I3.conservativeResize(count);
All that's left is using the indices to build M3. Again, preferably without transposing.
M3.resize(static_cast<Eigen::Index>(rows), count);
for(Eigen::Index i = 0; i < count; ++i)
M3.col(i) = left_transp.col(I3[i]);
M3.transposeInPlace();
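For convenience, the snippets above can be assembled into a single drop-in replacement for diffMatrix. This is just my consolidation of the answer's pieces; the name diffMatrixFast is made up.
#include <Eigen/Dense>
#include <cassert>
#include <string_view>
#include <unordered_set>

void diffMatrixFast(const Eigen::MatrixXi& M1, const Eigen::MatrixXi& M2,
                    Eigen::MatrixXi& M3, Eigen::VectorXi& I3)
{
    assert(M1.cols() == M2.cols());
    const Eigen::MatrixXi left_transp = M1.transpose();
    const Eigen::MatrixXi right_transp = M2.transpose();
    const std::size_t rows = static_cast<std::size_t>(right_transp.rows());
    static_assert(sizeof(char32_t) == sizeof(int));

    // hash every column of the transpose (= every row of M2)
    std::unordered_set<std::u32string_view> right_set;
    right_set.reserve(static_cast<std::size_t>(right_transp.cols()));
    for (Eigen::Index j = 0; j < right_transp.cols(); ++j)
        right_set.emplace(
            reinterpret_cast<const char32_t*>(right_transp.col(j).data()), rows);

    // collect the indices of M1 rows that are not in the set
    I3.resize(left_transp.cols());
    Eigen::Index count = 0;
    for (Eigen::Index i = 0; i < left_transp.cols(); ++i) {
        const std::u32string_view view{
            reinterpret_cast<const char32_t*>(left_transp.col(i).data()), rows};
        if (!right_set.count(view))
            I3[count++] = i;
    }
    I3.conservativeResize(count);

    // gather the surviving rows and restore the original orientation
    M3.resize(static_cast<Eigen::Index>(rows), count);
    for (Eigen::Index i = 0; i < count; ++i)
        M3.col(i) = left_transp.col(I3[i]);
    M3.transposeInPlace();
}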
Related
I would like to know if there is a function or an optimized way to reshape sparse matrices in Eigen.
In the documentation there is no reshape method for such matrices, so I implemented a function myself, but I don't know if it is optimized (I need it to be as fast as possible). Here is my approach:
Eigen::SparseMatrix<double> reshape_sp(const Eigen::SparseMatrix<double>& x,
                                       lint a, lint b) {
    Eigen::SparseMatrix<double> y(a, b);
    for (int k=0; k<x.outerSize(); ++k) {
        for (Eigen::SparseMatrix<double>::InnerIterator it(x,k); it; ++it) {
            int pos = it.col()*x.rows()+it.row();
            int col = int(pos/a);
            int row = pos%a;
            y.insert(row, col) = it.value();
        }
    }
    y.makeCompressed();
    return y;
}
For performance, it is absolutely crucial that you call reserve on your matrix. I've tested with a 100,000 x 100,000 matrix with 1% population. Your version (after fixing the 32-bit overflow in the pos computation) took 3 minutes; this fixed version takes a few seconds:
Eigen::SparseMatrix<double>
reshape(const Eigen::SparseMatrix<double>& orig,
        int rows, int cols)
{
    Eigen::SparseMatrix<double> rtrn(rows, cols);
    rtrn.reserve(orig.nonZeros());
    using InnerIterator = Eigen::SparseMatrix<double>::InnerIterator;
    for(int k = 0; k < orig.outerSize(); ++k) {
        for(InnerIterator it(orig, k); it; ++it) {
            std::int64_t pos = std::int64_t(it.col()) * orig.rows() + it.row();
            int col = int(pos / rows);
            int row = int(pos % rows);
            rtrn.insert(row, col) = it.value();
        }
    }
    rtrn.makeCompressed();
    return rtrn;
}
An alternative is to work with triplets again. This is a bit slower but less likely to explode in your face the same way insert does. This is particularly helpful for more complex operations like transposing where you cannot guarantee that the insert appends at the end.
Eigen::SparseMatrix<double>
reshape(const Eigen::SparseMatrix<double>& orig,
        int rows, int cols)
{
    using InnerIterator = Eigen::SparseMatrix<double>::InnerIterator;
    using Triplet = Eigen::Triplet<double>;
    std::vector<Triplet> triplets;
    triplets.reserve(std::size_t(orig.nonZeros()));
    for(int k = 0; k < orig.outerSize(); ++k) {
        for(InnerIterator it(orig, k); it; ++it) {
            std::int64_t pos = std::int64_t(it.col()) * orig.rows() + it.row();
            int col = int(pos / rows);
            int row = int(pos % rows);
            triplets.emplace_back(row, col, it.value());
        }
    }
    Eigen::SparseMatrix<double> rtrn(rows, cols);
    rtrn.setFromTriplets(triplets.begin(), triplets.end());
    return rtrn;
}
Things I tested that did not work:
Using FXDiv to replace the division with a cheaper operation
Computing maximum distance from one index to the next within a single column to skip dividing if both values are in the same output column (may still be worth it for sparse matrices with suitable inner structure)
Parallelizing the loop with OpenMP, using a final std::sort(std::execution::par, ...) for the triplets.
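For completeness, here is a minimal usage sketch of the reshape function defined above; the values and dimensions are my own, purely illustrative.
#include <Eigen/Sparse>
#include <cassert>
#include <vector>

int main()
{
    // Build a small 4x6 sparse matrix from triplets.
    std::vector<Eigen::Triplet<double>> trips = {{0, 0, 1.0}, {3, 1, 2.0}, {2, 5, 3.0}};
    Eigen::SparseMatrix<double> A(4, 6);
    A.setFromTriplets(trips.begin(), trips.end());

    // Reshape to 3x8; the total element count (and the nonzeros) stays the same.
    Eigen::SparseMatrix<double> B = reshape(A, 3, 8);
    assert(B.nonZeros() == A.nonZeros());

    // A(3,1) has column-major linear index 1*4 + 3 = 7, which lands at B(7 % 3, 7 / 3) = B(1, 2).
    assert(B.coeff(1, 2) == 2.0);
}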
I am trying to use Eigen to efficiently assemble a Stiffness matrix for non-linear finite element computations.
From my finite element discretization I can exactly extract my sparsity pattern. Therefore I can just use:
mat.reserve(nnz);
mat.setFromTriplets(TripletList.begin(), TripletList.end());
as proposed in http://eigen.tuxfamily.org/dox/group__SparseQuickRefPage.html.
My questions that arise here are:
1. Due to the non-linear nature of my problem I have to refill my matrix very often. Should I therefore again store all contributions in a triplet list and call mat.setFromTriplets(...) again and again?
2. If I reuse mat.setFromTriplets(...), can I somehow exploit the fact that I always evaluate my element matrices for the assembly in the same order, so that the indices in the triplets never change, only the values? Then the "search in memory" could be circumvented, since I could store the target positions in a new array.
3. If mat.coeffRef(i,j) is faster, can I maybe exploit the aforementioned fact there as well?
4. One extra question (lower priority): is it possible to store and assemble three matrices with the same sparsity pattern efficiently, e.g. if I have to do it in a loop? For example, a matrix wrapper with one underlying SparseMatrix so that I can get the matrices as M1 = mat[0], M2 = mat[1], M3 = mat[2], where mat[i] returns the i-th matrix and M1, M2 and M3 are e.g. SparseMatrix<double> M1(1000,1000).
The general setup is the following (for questions 1-3 only M1 appears):
std::vector< Eigen::Triplet<double> > tripletListA; // triplets differ only in the values and not in the indices
std::vector< Eigen::Triplet<double> > tripletListB;
std::vector< Eigen::Triplet<double> > tripletListC;
SparseMatrix<double> M1(1000,1000);
SparseMatrix<double> M2(1000,1000);
SparseMatrix<double> M3(1000,1000);
//Reserve space in triplets
tripletListA.reserve(nnz);
tripletListB.reserve(nnz);
tripletListC.reserve(nnz);
//Reserve space in matrices
M1.reserve(nnz);
M2.reserve(nnz);
M3.reserve(nnz);
//fill triplet list with zeros
M1.setFromTriplets(tripletListA.begin(), tripletListA.end());
M2.setFromTriplets(tripletListB.begin(), tripletListB.end());
M3.setFromTriplets(tripletListC.begin(), tripletListC.end());
for (int i=0; i<1000; i++) {
//Fill triplets
M1.setFromTriplets(tripletListA.begin(), tripletListA.end()); //or use coeffRef?
M2.setFromTriplets(tripletListB.begin(), tripletListB.end());
M3.setFromTriplets(tripletListC.begin(), tripletListC.end());
//solve
//update
}
Thank you and regards,
Alex
UPDATE:
Thank you for your answers. Initially, the order in which I access the nonzeros is quite arbitrary. But since I'm interested in an iterative scheme, I am thinking about documenting this random ordering and constructing an operator which takes care of it. This operator can be constructed (at least in my mind) from the initially built triplet list.
SparseMatrix<double> mat(rows,cols);
std::vector<double> valuevector(nnz);
//Initial construction
std::vector< Eigen::Triplet<double> > tripletList;
//naive fill of tripletList
//sort the entries and identify duplicate entries in tripletList from their col and row values
//generate the operator P from this information
for (int i=0; i<1000; i++)
{
    //naive refill of tripletList
    valuevector = P*tripletList.value(); //construct the value vector in the efficient ordering from the triplet values
                                         //(the tripletList.value() call does not make sense for a std::vector, but I hope it is clear what I have in mind)
    for (int k=0; k<mat.outerSize(); ++k)
        for (SparseMatrix<double>::InnerIterator it(mat,k); it; ++it)
            it.valueRef() = valuevector(it);
}
I think of the operator P simply as a matrix with ones and zeros at the appropriate places.
The question remains whether this is actually a more efficient procedure.
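To make the mapping idea above concrete, here is a minimal sketch of mine (not from the original post; the names buildValueMap and refill are invented): the sparsity pattern is built once, the position of each triplet inside mat.coeffs() is recorded, and every later iteration only overwrites values. The benchmark below implements a more elaborate variant of the same idea.
#include <Eigen/Sparse>
#include <map>
#include <utility>
#include <vector>

using SpMat   = Eigen::SparseMatrix<double>;
using Triplet = Eigen::Triplet<double>;

// Build the pattern once and return, for each triplet, the slot of its
// nonzero inside mat.coeffs() (duplicate (row,col) pairs map to the same slot).
std::vector<Eigen::Index> buildValueMap(SpMat& mat, const std::vector<Triplet>& trips)
{
    mat.setFromTriplets(trips.begin(), trips.end()); // fixes the sparsity pattern
    std::map<std::pair<int, int>, Eigen::Index> pos; // (col,row) -> slot in coeffs()
    Eigen::Index slot = 0;
    for (int outer = 0; outer < mat.outerSize(); ++outer)
        for (SpMat::InnerIterator it(mat, outer); it; ++it)
            pos[{int(it.col()), int(it.row())}] = slot++;
    std::vector<Eigen::Index> slot_of(trips.size());
    for (std::size_t t = 0; t < trips.size(); ++t)
        slot_of[t] = pos[{trips[t].col(), trips[t].row()}];
    return slot_of;
}

// In every nonlinear iteration: same triplet order as before, new values only.
void refill(SpMat& mat, const std::vector<Triplet>& trips,
            const std::vector<Eigen::Index>& slot_of)
{
    mat.coeffs().setZero();
    for (std::size_t t = 0; t < trips.size(); ++t)
        mat.coeffs()[slot_of[t]] += trips[t].value(); // duplicates accumulate, like setFromTriplets
}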
UPDATE-2: Benchmark:
I tried to turn my ideas into a code snippet. I first generate a random triplet list. This list is constructed to get a sparsity of 95%; additionally, some values in the list are duplicated to mimic duplicates in the triplet list which write to the same position in the sparse matrix. These values are then inserted based on different concepts. The first one is the setFromTriplets approach, and the second and third try to exploit the known structure.
The second and third approaches document the ordering of the triplet list. This information is then exploited to write the values directly into the raw mat1.coeffs() vector.
#include <iostream>
#include <Eigen/Sparse>
#include <random>
#include <fstream>
#include <chrono>
using namespace std::chrono;
using namespace Eigen;
using namespace std;
typedef Eigen::Triplet<double> T;
void findDuplicates(vector<pair<int, int> > &dummypair, Ref<VectorXi> multiplicity) {
// Iterate over the sorted vector and count how often each (col,row) pair repeats (its multiplicity)
int pairCount = 0;
pair<int, int> currentPair;
for (int i = 0; i < multiplicity.size(); ++i) {
currentPair = dummypair[pairCount];
while (currentPair == dummypair[pairCount + multiplicity[i]]) {
multiplicity[i]++;
}
pairCount += multiplicity[i];
}
}
typedef Matrix<duration<double, std::milli>, Dynamic, Dynamic> MatrixXtime;
int main() {
//init random generators
std::default_random_engine gen;
std::uniform_real_distribution<double> dist(0.0, 1.0);
int sizesForTest = 5;
int measures = 6;
MatrixXtime timeArray(sizesForTest, measures);
cout << "TripletTime NestetTime LNestedTime " << endl;
for (int m = 0; m < sizesForTest; ++m) {
int rows = pow(10, m + 1);
int cols = rows;
std::uniform_int_distribution<int> distentryrow(0, rows - 1);
std::uniform_int_distribution<int> distentrycol(0, cols - 1);
std::vector<T> tripletList;
SparseMatrix<double> mat1(rows, cols);
// SparseMatrix<double> mat2(rows,cols);
// SparseMatrix<double> mat3(rows,cols);
//generate the sparsity pattern of a matrix with roughly 5% fill-in
tripletList.emplace_back(3, 0, 15);
for (int i = 0; i < rows; ++i)
for (int j = 0; j < cols; ++j) {
auto value = dist(gen); //generate random number
auto value2 = dist(gen); //generate random number
auto value3 = dist(gen); //generate random number
if (value < 0.05) {
auto rowindex = distentryrow(gen);
auto colindex = distentrycol(gen);
tripletList.emplace_back(rowindex, colindex, value); //if below the threshold, insert it
//duplicate every third entry to mimic entries that appear more than once
if (value2 < 0.3333333333333333333333)
tripletList.emplace_back(rowindex, colindex, value);
//triple every fourth entry to mimic entries that appear more than once
if (value3 < 0.25)
tripletList.emplace_back(rowindex, colindex, value);
}
}
tripletList.emplace_back(3, 0, 9);
int numberOfValues = tripletList.size();
//initially set all matrices from triplet to allocate space and sparsity pattern
mat1.setFromTriplets(tripletList.begin(), tripletList.end());
// mat2.setFromTriplets(tripletList.begin(), tripletList.end());
// mat3.setFromTriplets(tripletList.begin(), tripletList.end());
int nnz = mat1.nonZeros();
//reset all entries back to zero to fill in later
mat1.coeffs().setZero();
// mat2.coeffs().setZero();
// mat3.coeffs().setZero();
//document the sorting of entries for repeated insertion
VectorXi internalIndex(numberOfValues);
vector<pair<int, int> > dummypair(numberOfValues);
VectorXd valuelist(numberOfValues);
for (int l = 0; l < numberOfValues; ++l) {
valuelist(l) = tripletList[l].value();
}
//init internalindex and dummy pair
internalIndex = Eigen::VectorXi::LinSpaced(numberOfValues, 0.0, numberOfValues - 1);
for (int i = 0; i < numberOfValues; ++i) {
dummypair[i].first = tripletList[i].col();
dummypair[i].second = tripletList[i].row();
}
auto start = high_resolution_clock::now();
// sort the vector internalIndex based on the dummypair
sort(internalIndex.begin(), internalIndex.end(), [&](int i, int j) {
return dummypair[i].first < dummypair[j].first ||
(dummypair[i].first == dummypair[j].first && dummypair[i].second < dummypair[j].second);
});
auto stop = high_resolution_clock::now();
timeArray(m, 3) = (stop - start) / 1000;
start = high_resolution_clock::now();
sort(dummypair.begin(), dummypair.end());
stop = high_resolution_clock::now();
timeArray(m, 4) = (stop - start) / 1000;
start = high_resolution_clock::now();
VectorXi dublicatecount(nnz);
dublicatecount.setOnes();
findDuplicates(dummypair, dublicatecount);
stop = high_resolution_clock::now();
timeArray(m, 5) = (stop - start) / 1000;
dummypair.clear();
//calculate the vector containing all triplet indices per nonzero
//i.e. listofentires[k] is the VectorXi holding the indices of the triplets whose values should be written to nonzero (dof) k
int indextriplet = 0;
int multiplicity = 0;
vector<VectorXi> listofentires(mat1.nonZeros());
for (int k = 0; k < mat1.nonZeros(); ++k) {
multiplicity = dublicatecount[k];
listofentires[k] = internalIndex.segment(indextriplet, multiplicity);
indextriplet += multiplicity;
}
//========================================
//Here the nonlinear analysis should start; everything beforehand is preprocessing
//Test1 from triplets
start = high_resolution_clock::now();
mat1.setFromTriplets(tripletList.begin(), tripletList.end());
stop = high_resolution_clock::now();
timeArray(m, 0) = (stop - start) / 1000;
mat1.coeffs().setZero();
//Test2 use internalIndex but calculate listofentires on the fly
indextriplet = 0;
start = high_resolution_clock::now();
for (int k = 0; k < mat1.nonZeros(); ++k) {
multiplicity = dublicatecount[k];
mat1.coeffs()[k] += valuelist(internalIndex.segment(indextriplet, multiplicity)).sum();
indextriplet += multiplicity;
}
stop = high_resolution_clock::now();
timeArray(m, 1) = (stop - start) / 1000;
mat1.coeffs().setZero();
//Test3 directly use listofentires
start = high_resolution_clock::now();
for (int k = 0; k < mat1.nonZeros(); ++k)
mat1.coeffs()[k] += valuelist(listofentires[k]).sum();
stop = high_resolution_clock::now();
timeArray(m, 2) = (stop - start) / 1000;
std::ofstream file("test.txt");
if (file.is_open()) {
file << mat1 << '\n';
}
cout << "Size: " << rows << ": ";
for (int n = 0; n < measures; ++n)
cout << timeArray(m, n).count() << " ";
cout << endl;
}
return 0;
}
If I run this example on my i5-6600K 3.5 GHz with 16 GB RAM, I end up with the following results, which are the times in seconds.
Size Triplet Nested LessNested Sort_intIndex Sort_dum_pair findDuplica
10 1e-06 1e-06 2e-06 1e-06 1e-06 1e-06
100 2.8e-05 4e-06 1.4e-05 5e-05 4.2e-05 1e-05
1000 0.003 0.000416 0.001489 0.01012 0.00627 0.000635
10000 0.426 0.093911 0.48912 1.5389 0.780676 0.061881
100000 337.799 99.0801 37.3656 292.397 87.4488 0.79996
The first three columns denote the calculation time of the different approaches, and columns 4 to 6 denote the times for the different preprocessing steps.
For a size of 100000 rows and columns my RAM fills up relatively fast, and therefore the last table row should be taken with care. Here the fastest method changes from the second to the third.
My questions here are: is this approach going in the correct direction to improve the efficiency? Or is it a completely wrong direction, because, for example, an assembly time of 0.48 s for a size of 10000 seems a bit high?
Additionally, the preprocessing steps are getting expensive very fast; is there a better way to construct the ordering of the matrix? Finally, is the benchmarking done in the correct way?
Thanks for your time,
Alex
Given a rank-4 tensor (each rank with dimension K), for example T(p,q,r,s), we can 1-to-1 map all the tensor elements into a matrix of dimension K^2 x K^2, for example M(i,j) in which the two first tensor indices p,q and the last two indices r,s are combined in a column major way:
i = p + K * q
j = r + K * s
Exploiting some (anti-)symmetries of the given tensor, for example T(p,q,r,s) = -T(q,p,r,s) = -T(p,q,s,r) = T(q,p,s,r) and T(p,q,r,s) = T(r,s,p,q), we would like to construct a matrix H(m,n) that only contains the unique elements (i.e. those not related by the previously defined symmetries), i.e. those with p>q and r>s; H would then be of dimension K(K-1)/2 x K(K-1)/2.
How could we find an algorithm (or even better: how can we use the C++ library Eigen) to accomplish these index transformations? Furthermore, can we write down m and n algebraically in terms of p,q and r,s, like we can do in the case where we would want to extract the strict lower triangular matrix (no diagonal) into a vector?
For reference, given a square matrix Eigen::MatrixXd M (K,K), here is an algorithm that extracts the strict lower triangle of the given square matrix M into a vector m of size K(K-1)/2, using the C++ Eigen library:
Eigen::VectorXd strictLowerTriangle(const Eigen::MatrixXd& M) {
    auto K = static_cast<size_t>(M.cols()); // the dimension of the matrix
    Eigen::VectorXd m = Eigen::VectorXd::Zero(K*(K-1)/2); // the strict lower triangle has K(K-1)/2 parameters
    size_t vector_index = 0;
    for (size_t q = 0; q < K; q++) { // "column-major" ordering for vector_index <- (p,q): q is the outer loop, p the inner (faster) one
        for (size_t p = q+1; p < K; p++) { // require p > q
            m(vector_index) = M(p,q);
            vector_index++;
        }
    }
    return m;
}
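As an aside (my own note, not from the original question): the strict-lower-triangle packing used above does have a closed form, and the requested m and n are just this map applied to (p,q) and (r,s) respectively. A sketch:
#include <cassert>
#include <cstddef>

// Closed-form position of the pair (p, q), p > q, in the column-major
// strict-lower-triangle packing used by strictLowerTriangle above:
// column q starts after q*(K-1) - q*(q-1)/2 packed entries, and p is the
// (p - q - 1)-th entry within that column.
inline std::size_t packedIndex(std::size_t p, std::size_t q, std::size_t K)
{
    assert(q < p && p < K);
    return q*(K - 1) - q*(q - 1)/2 + (p - q - 1);
}

// For the rank-4 tensor case: m = packedIndex(p, q, K), n = packedIndex(r, s, K).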
We are able to extend this algorithm to the requested general case:
Eigen::MatrixXd strictLowerTriangle(const Eigen::Tensor<double, 4>& T) { // requires #include <unsupported/Eigen/CXX11/Tensor>
    auto K = static_cast<size_t>(T.dimension(0)); // the dimension of each tensor index
    Eigen::MatrixXd M (K*(K-1)/2, K*(K-1)/2);
    size_t row_index = 0;
    for (size_t j = 0; j < K; j++) { // "column-major" ordering for row_index <- (i,j): j is the outer loop, i the inner (faster) one
        for (size_t i = j+1; i < K; i++) { // require i > j
            size_t column_index = 0;
            for (size_t l = 0; l < K; l++) { // "column-major" ordering for column_index <- (k,l): l is the outer loop, k the inner (faster) one
                for (size_t k = l+1; k < K; k++) { // require k > l
                    M(row_index,column_index) = T(i,j,k,l);
                    column_index++;
                }
            }
            row_index++;
        }
    }
    return M;
}
I was wondering if there is a more efficient way to remove columns or rows whose elements are all zero. I am sure there is a way using the functions in the Eigen library, but I do not know how.
Right now I am doing it as shown below; the idea of the while loop is that, in case there are multiple rows/columns that sum to zero, I don't want to exceed the range limits or skip over any zero rows.
void removeZeroRows() {
    int16_t index = 0;
    int16_t num_rows = rows();
    while (index < num_rows) {
        double sum = row(index).sum();
        // I use a better test for zero, but this will do for demonstration purposes
        if (sum == 0.0) {
            removeRow(index);
        }
        else {
            index++;
        }
        num_rows = rows();
    }
}
Currently (Eigen 3.3), there is no direct functionality for this (though it is planned for Eigen 3.4).
Meanwhile, you can use something like this (of course, row and col can be interchanged, and the output is just for illustration):
Eigen::MatrixXd A;
A.setRandom(4,4);
A.col(2).setZero();
// find non-zero columns:
Eigen::Matrix<bool, 1, Eigen::Dynamic> non_zeros = A.cast<bool>().colwise().any();
std::cout << "A:\n" << A << "\nnon_zeros:\n" << non_zeros << "\n\n";
// allocate result matrix:
Eigen::MatrixXd res(A.rows(), non_zeros.count());
// fill result matrix:
Eigen::Index j=0;
for(Eigen::Index i=0; i<A.cols(); ++i)
{
if(non_zeros(i))
res.col(j++) = A.col(i);
}
std::cout << "res:\n" << res << "\n\n";
Generally, you should avoid resizing a matrix at every iteration, but resize it to the final size as soon as possible.
With Eigen 3.4 something similar to this will be possible (syntax is not final yet):
Eigen::MatrixXd res = A("", A.cast<bool>().colwise().any());
Which would be equivalent to Matlab/Octave:
res = A(:, any(A));
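For reference, here is a sketch of mine using the index-list slicing that shipped with Eigen 3.4 (A(Eigen::all, indices)); it keeps the same two-pass structure as the loop above, and the name dropZeroColumns is made up:
#include <Eigen/Dense>
#include <vector>

// Keep only the columns of A that contain at least one nonzero entry.
Eigen::MatrixXd dropZeroColumns(const Eigen::MatrixXd& A)
{
    Eigen::Matrix<bool, 1, Eigen::Dynamic> non_zeros = A.cast<bool>().colwise().any();
    std::vector<Eigen::Index> keep;
    for (Eigen::Index i = 0; i < A.cols(); ++i)
        if (non_zeros(i))
            keep.push_back(i);
    return A(Eigen::all, keep); // Eigen 3.4 slicing with an index list
}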
Hello,
I'm wondering if there's any working method for sorting the rows of a matrix by their average? I'm trying to make this work, but no luck.
int mat[3][3];
mat[0][0] = 4;mat[0][1] = 5;mat[0][2] = 3;
mat[1][0] = 3;mat[1][1] = 2;mat[1][2] = 1;
mat[2][0] = 1;mat[2][1] = 8;mat[2][2] = 9;
Any idea? :)
A more idiomatic C++ way of doing this (vs. your original approach of an array of arrays) would be to have a vector of vectors, i.e. std::vector<std::vector<int> >, and then invoke std::sort on the top-level vector. You can pass sort a custom predicate that compares two rows based on their average.
You could create a temporary data structure that is an array of tuples, where each tuple holds a row index and the average of that row. Then sort this tuple array based on the average using the standard sort() function, and finally run through the sorted tuple array to build the sorted matrix.
This gives you the performance benefit of not copying matrix rows during the swaps done by the sort routine. If you only have 3 elements per row, you may be okay with swapping whole rows, but as the number of columns increases, the swapping becomes a bottleneck.
In 'pseudo code' you may do something like this:
function sort(input, numrows, numcols)
{
    pair<double, int> index[numrows];

    for (int i=0 to numrows) {
        index[i].second = i;
        // compute average of row[i] in the input matrix
        index[i].first = average_of_row(&input[i]);
    }

    // STL sort will sort the pairs based on the average (.first member)
    sort(index.begin(), index.end());

    for (int i=0 to index.size())
    {
        // copy rows from input matrix to output matrix
        copy_row(&input[index[i].second], &output_matrix[i]);
    }

    return output_matrix;
}
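A runnable version of that sketch might look like this; it is my own illustration, and the names Matrix and sortRowsByAverage are made up.
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Sort rows by their average without swapping whole rows inside the sort.
Matrix sortRowsByAverage(const Matrix& input)
{
    std::vector<std::pair<double, std::size_t>> index(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        const auto& row = input[i];
        const double avg = row.empty()
            ? 0.0
            : std::accumulate(row.begin(), row.end(), 0.0) / row.size();
        index[i] = {avg, i}; // average first, so sort() orders by it
    }
    std::sort(index.begin(), index.end());

    Matrix output(input.size());
    for (std::size_t i = 0; i < index.size(); ++i)
        output[i] = input[index[i].second]; // copy each row exactly once
    return output;
}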
Following #Peter's suggestion,
#include <algorithm>
#include <numeric>
#include <vector>
using namespace std;
bool comp(const vector<int>& a, const vector<int>& b) {
    if (a.size() == 0 || b.size() == 0) return false;
    int sum_a = accumulate(a.begin(), a.end(), 0);
    int sum_b = accumulate(b.begin(), b.end(), 0);
    return sum_a / (double)a.size() < sum_b / (double)b.size();
}
int main() {
vector<vector<int> > mat(3, vector<int>(3));
mat[0][0] = 4; mat[0][1] = 5; mat[0][2] = 3;
mat[1][0] = 3; mat[1][1] = 2; mat[1][2] = 1;
mat[2][0] = 1; mat[2][1] = 8; mat[2][2] = 9;
sort(mat.begin(), mat.end(), comp);
return 0;
}
I wasn't sure of the best way to handle empty vectors, so I just had it return false. Of course you could give comp() a more meaningful name.
EDIT: I think a better way to handle zero-sized vectors is to multiply:
bool comp(const vector<int>& a, const vector<int>& b) {
    // 64-bit signed sums, so the cross-multiplication can neither overflow nor go unsigned
    long long sum_a = accumulate(a.begin(), a.end(), 0LL);
    long long sum_b = accumulate(b.begin(), b.end(), 0LL);
    return sum_a * (long long)b.size() < sum_b * (long long)a.size();
}
Put the rows in a multiset and overload the < operator. Here's a 1D example:
http://bytes.com/topic/c/answers/171028-multiset-example
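A minimal sketch of that multiset idea (mine, purely illustrative): instead of overloading operator< globally, pass a comparator on the row average and let the multiset keep the rows ordered as they are inserted.
#include <iostream>
#include <numeric>
#include <set>
#include <vector>

using Row = std::vector<int>;

struct ByAverage {
    bool operator()(const Row& a, const Row& b) const {
        // compare sum_a/size_a < sum_b/size_b via cross-multiplication (no division)
        const long long sum_a = std::accumulate(a.begin(), a.end(), 0LL);
        const long long sum_b = std::accumulate(b.begin(), b.end(), 0LL);
        return sum_a * static_cast<long long>(b.size()) <
               sum_b * static_cast<long long>(a.size());
    }
};

int main() {
    std::multiset<Row, ByAverage> rows{{4, 5, 3}, {3, 2, 1}, {1, 8, 9}};
    for (const Row& r : rows) // iterates rows in order of increasing average
        std::cout << r[0] << ' ' << r[1] << ' ' << r[2] << '\n';
}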