Converting Eigen::SparseMatrix<double> to deal.ii ::SparseMatrix<double>? - c++

This is kind of an obscure question and I don't really expect anyone to answer, but I have a method that takes (and returns) an Eigen::SparseMatrix. I want to use it with the deal.II library; is there a way to copy/convert a SparseMatrix between deal.II and Eigen? I know you can copy a deal.II SparseMatrix to a Trilinos SparseMatrix with something like:
`SparseMatrix<double> matrix(sparsity);
...//fill matrix
Epetra_Map map(TrilinosWrappers::types::int_type(5),
TrilinosWrappers::types::int_type(5),
0,
Utilities::Trilinos::comm_world());
TrilinosWrappers::SparseMatrix tmatrix;
tmatrix.reinit (map, map, matrix, 0, false);`
Is there a similar way for Eigen::SparseMatrix? I guess Eigen doesn't really have that kind of support in deal.II. So perhaps there is some 'brute force' method, like this attempt, which obviously doesn't work:
`
Eigen::SparseMatrix<double> ConvertToEigenMatrix(SparseMatrix<double> data)
{
Eigen::SparseMatrix<double> eMatrix(data.m(), data.n());
for (int i = 0; i < data.m(); ++i)
eMatrix.row(i) = Eigen::SparseMatrix<double> ::Map(&data[i][0], data.n());
return eMatrix;
}
`
Ok, so I figured out how to convert from dealii::SparseMatrix -> Eigen::SparseMatrix.
SparseMatrix<double>::iterator smi = matrix.begin();
SparseMatrix<double>::iterator smi_end = matrix.end();
unsigned int row,col;
double val;
for (; smi!=smi_end; ++smi)
{
row = smi->row();
col = smi->column();
val = smi->value();
spMat.insert(row, col) = val;
std::cout << val << std::endl;
}
Now, I just need to figure out the reverse.

This question is old but maybe I can still help. I am one of the deal.II developers and I don't remember seeing this on the mailing list (which is much more active for these types of questions than SO).
A SparseMatrix in deal.II does not store its own sparsity pattern: instead, it stores a pointer to a SparsityPattern object. You'll need to loop over the Eigen matrix twice: once to set up the SparsityPattern and a second time to copy the matrix values. Something like the following seems to work:
#include <deal.II/lac/dynamic_sparsity_pattern.h>
#include <deal.II/lac/sparsity_pattern.h>
#include <deal.II/lac/sparse_matrix.h>
#include <eigen3/Eigen/Sparse>
#include <iostream>
int main()
{
const std::size_t shape = 3;
Eigen::SparseMatrix<double> matrix(shape, shape);
matrix.insert(0, 0) = 1.0;
matrix.insert(0, 1) = 2.0;
matrix.insert(0, 2) = 1.0;
matrix.insert(2, 2) = 2.0;
matrix.makeCompressed();
{
dealii::SparsityPattern sparsity_pattern(matrix.rows(), matrix.cols());
dealii::DynamicSparsityPattern dynamic_sparsity_pattern(matrix.rows(), matrix.cols());
// Eigen::SparseMatrix is column-major by default, so the outer index runs over columns.
// First pass: record which entries exist.
for (decltype(matrix.outerSize()) outer_n = 0; outer_n < matrix.outerSize(); ++outer_n)
for (Eigen::SparseMatrix<double>::InnerIterator it(matrix, outer_n); it; ++it)
dynamic_sparsity_pattern.add(it.row(), it.col());
sparsity_pattern.copy_from(dynamic_sparsity_pattern);
dealii::SparseMatrix<double> matrix2(sparsity_pattern);
// Second pass: copy the values into the now-fixed pattern.
for (decltype(matrix.outerSize()) outer_n = 0; outer_n < matrix.outerSize(); ++outer_n)
for (Eigen::SparseMatrix<double>::InnerIterator it(matrix, outer_n); it; ++it)
matrix2.set(it.row(), it.col(), it.value());
matrix2.print(std::cout); // prints the right matrix
}
}
You will have to manage the lifetime of the SparsityPattern object too, since the matrix only stores a pointer to it.
deal.II does not use CSR or CSC: it uses its own CSR-like format where the entry on the main diagonal is stored first in the array containing the matrix entries for that row, so we really do need to copy with the iterator interfaces.
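To make the storage remark concrete, here is a minimal standard-library sketch of a CSR-like layout in which each row's diagonal entry is placed first. It only illustrates the idea and is not deal.II's actual data structure: `ToyCsr` and `build_diagonal_first` are hypothetical names, the matrix is assumed to be square, and this toy version always reserves a slot for the diagonal.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy CSR-like storage (hypothetical, not deal.II's real classes).
struct ToyCsr {
    std::vector<std::size_t> row_start; // row r occupies [row_start[r], row_start[r + 1])
    std::vector<std::size_t> col;       // column index of each stored entry
    std::vector<double> val;            // value of each stored entry
};

// Build the layout from (row, col) -> value entries of a square n x n matrix.
// Within each row the diagonal entry comes first (a slot is reserved even if
// no diagonal value was given), followed by the rest in increasing column order.
ToyCsr build_diagonal_first(
    std::size_t n,
    const std::map<std::pair<std::size_t, std::size_t>, double>& entries) {
    ToyCsr m;
    m.row_start.push_back(0);
    auto it = entries.begin();
    for (std::size_t r = 0; r < n; ++r) {
        // Reserve the first slot of the row for the diagonal.
        const std::size_t diag_pos = m.col.size();
        m.col.push_back(r);
        m.val.push_back(0.0);
        // std::map iterates in (row, col) order, so row r's entries are contiguous.
        for (; it != entries.end() && it->first.first == r; ++it) {
            if (it->first.second == r) {
                m.val[diag_pos] = it->second;
            } else {
                m.col.push_back(it->first.second);
                m.val.push_back(it->second);
            }
        }
        m.row_start.push_back(m.col.size());
    }
    return m;
}
```

For a row containing entries in columns 0 and 1 of row 1, plain CSR would store column 0 first, while this layout stores column 1 (the diagonal) first. That reordering is why a raw array copy from a standard CSR or CSC buffer would not line up.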

Related

Is there a way to dynamically change a comparison operator?

I'm creating an AI Director and need a way to change when the AI needs to pressure the player and when they need to move away. I've got a TArray of Objects and am checking the distance from the player. I'd like to either get the largest distance or the smallest distance.
I know this doesn't work:
operator comparer = PlayerTensionLevel > BackstageThreshold ? > : <;
Both of the variables used in the one line Bool are floats. I'm hoping that the comparer can be used in a situation like:
if(DistanceSquared(objectA, objectB) comparer _currentThresholdDistance){
_currentObject = objectA;
_currentThresholdDistance = DistanceSquared(objectA, objectB);
}
You can compute with bool! If you aren’t concerned with differing behavior for ties, you can just write
if((DistanceSquared(objectA, objectB) > _currentThresholdDistance) ==
(PlayerTensionLevel > BackstageThreshold)) …
(Technically, the extra parentheses here are unnecessary, but it’s probably not reasonable to expect the reader to know that relational operators have higher precedence than equality operators.)
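As a rough sketch of how the bool comparison could drive the whole selection loop, assuming the distances have already been computed into a vector: `pick_index` and `seek_farthest` are hypothetical names, and `seek_farthest` stands in for `PlayerTensionLevel > BackstageThreshold`.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the largest distance when seek_farthest is true,
// otherwise the index of the smallest one. Assumes dist is non-empty.
std::size_t pick_index(const std::vector<float>& dist, bool seek_farthest) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < dist.size(); ++i) {
        // (a > b) == flag behaves like "a > b" when flag is true and,
        // ignoring ties, like "a < b" when flag is false.
        if ((dist[i] > dist[best]) == seek_farthest)
            best = i;
    }
    return best;
}
```

With ties, the "smallest" mode keeps the last of the equal candidates rather than the first, which is the differing tie behavior mentioned above.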
As already mentioned in @SamVarshavchik's comment above, you can use std::function and assign it either std::less or std::greater based on PlayerTensionLevel and BackstageThreshold.
After you determine the comparator, you can apply it to the current value of _currentThresholdDistance and the squared distance between the objects (here I just used dist2 to represent it).
#include <iostream>
#include <functional>
std::function<bool(float a, float b)>
GetComp(float PlayerTensionLevel, float BackstageThreshold)
{
if (PlayerTensionLevel > BackstageThreshold) {
return std::less<float>{};
}
return std::greater<float>{};
}
int main() {
float _currentThresholdDistance = 1;
float dist2 = 2;
float PlayerTensionLevel = 100;
float BackstageThreshold;
BackstageThreshold = 101;
std::cout << GetComp(PlayerTensionLevel, BackstageThreshold)
(dist2, _currentThresholdDistance) << std::endl;
BackstageThreshold = 99;
std::cout << GetComp(PlayerTensionLevel, BackstageThreshold)
(dist2, _currentThresholdDistance) << std::endl;
}
Output:
1
0

Rotate elements in a vector and how to return a vector

c++ newbie here. So for an assignment I have to rotate all the elements in a vector to the left one. So, for instance, the elements {1,2,3} should rotate to {2,3,1}.
I'm researching how to do it, and I saw the rotate() function, but I don't think that will work given my code. And then I saw a for loop that could do it, but I'm not sure how to translate that into a return statement. (i tried to adjust it and failed)
This is what I have so far, but it is very wrong (i haven't gotten a single result that hasn't ended in an error yet)
Edit: The vector size I have to deal with is just three, so it doesn't need to account for any sized vector
#include <vector>
using namespace std;
vector<int> rotate(const vector<int>& v)
{
// PUT CODE BELOW THIS LINE. DON'T CHANGE ANYTHING ABOVE.
vector<int> result;
int size = 3;
for (auto i = 0; i < size - 1; ++i)
{
v.at(i) = v.at(i + 1);
result.at(i) = v.at(i);
}
return result;
// PUT CODE ABOVE THIS LINE. DON'T CHANGE ANYTHING BELOW.
}
All my teacher does is upload textbook pages that explain what certain parts of code are supposed to do, but the textbook pages offer NO help in trying to figure out how to actually apply this stuff.
So could someone please give me a few pointers?
Since you know exactly how many elements you have, and it's the smallest number that makes sense to rotate, you don't need to do anything fancy - just place the items in the order that you need, and return the result:
vector<int> rotate3(const vector<int>& x) {
return vector<int> { x[1], x[2], x[0] };
}
Note that if your collection always has three elements, you could use std::array instead:
std::array<int,3>
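A minimal sketch of what the std::array variant might look like; the function name `rotate3_array` is made up for this example.

```cpp
#include <array>

// Left-rotate a fixed-size collection of three ints by one position.
std::array<int, 3> rotate3_array(const std::array<int, 3>& x) {
    return {{x[1], x[2], x[0]}};
}
```

Because the size is part of the type, passing a collection with any other number of elements is a compile-time error rather than a run-time surprise.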
First, note that you have passed v as a const reference (const vector<int>&), so you are forbidden to modify the state of v in v.at(i) = v.at(i + 1);
Although Sergey has already given a straightforward solution, you could correct your code like this:
#include <vector>
using namespace std;
vector<int> left_rotate(const vector<int>& v)
{
vector<int> result;
int size = v.size(); // this way you are able to rotate vectors of any size
for (auto i = 1; i < size; ++i)
result.push_back(v.at(i));
// adding first element of v at end of result
result.push_back(v.front());
return result;
}
Use Sergey's answer. This answer deals with why what the asker attempted did not work. They're damn close, so it's worth going through it, explaining the problems, and showing how to fix it.
In
v.at(i) = v.at(i + 1);
v is constant. You can't write to it. The naïve solution (which won't work) is to cut out the middle-man and write directly to the result vector because it is NOT const
result.at(i) = v.at(i + 1);
This doesn't work because
vector<int> result;
defines an empty vector. There is no at(i) to write to, so at throws an exception that terminates the program.
As an aside, the [] operator does not check bounds like at does and will not throw an exception. This can lead you to thinking the program worked when instead it was writing to memory the vector did not own. This would probably crash the program, but it doesn't have to [1].
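To see the checked access in isolation, here is a small sketch (the helper name `at_throws_on_empty` is hypothetical) that catches the exception instead of letting it terminate the program:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if writing through at(i) on an empty vector throws
// std::out_of_range, which is what the checked accessor is specified to do.
bool at_throws_on_empty(std::size_t i) {
    std::vector<int> result; // empty: size() == 0
    try {
        result.at(i) = 42;   // bounds-checked access
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

There is deliberately no matching demonstration for [] here: indexing past the end with [] has no defined outcome to test for.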
The quick fix here is to ensure usable storage with
vector<int> result(v.size());
The resulting code
vector<int> rotate(const vector<int>& v)
{
// PUT CODE BELOW THIS LINE. DON'T CHANGE ANYTHING ABOVE.
vector<int> result(v.size()); // change here to size the vector
int size = 3;
for (auto i = 0; i < size - 1; ++i)
{
result.at(i) = v.at(i + 1); // change here to directly assign to result
}
return result;
// PUT CODE ABOVE THIS LINE. DON'T CHANGE ANYTHING BELOW.
}
almost works. But when we run it on {1, 2, 3} result holds {2, 3, 0} at the end. We lost the 1. That's because v.at(i + 1) never touches the first element of v. We could increase the number of for loop iterations and use the modulo operator
vector<int> rotate(const vector<int>& v)
{
// PUT CODE BELOW THIS LINE. DON'T CHANGE ANYTHING ABOVE.
vector<int> result(v.size());
int size = 3;
for (auto i = 0; i < size; ++i) // change here to iterate size times
{
result.at(i) = v.at((i + 1) % size); // change here to make i + 1 wrap
}
return result;
// PUT CODE ABOVE THIS LINE. DON'T CHANGE ANYTHING BELOW.
}
and now the output is {2, 3, 1}. But it's just as easy, and probably a bit faster, to just do what we were doing and tack on the missing element after the loop.
vector<int> rotate(const vector<int>& v)
{
// PUT CODE BELOW THIS LINE. DON'T CHANGE ANYTHING ABOVE.
vector<int> result(v.size());
int size = 3;
for (auto i = 0; i < size - 1; ++i)
{
result.at(i) = v.at(i + 1);
}
result.at(size - 1) = v.at(0); // change here to store first element
return result;
// PUT CODE ABOVE THIS LINE. DON'T CHANGE ANYTHING BELOW.
}
Taking this a step further, the size of three is an unnecessary limitation for this function that I would get rid of and since we're guaranteeing that we never go out of bounds in our for loop, we don't need the extra testing in at
vector<int> rotate(const vector<int>& v)
{
// PUT CODE BELOW THIS LINE. DON'T CHANGE ANYTHING ABOVE.
if (v.empty()) // nothing to rotate.
{
return vector<int>{}; // return empty result
}
vector<int> result(v.size());
for (size_t i = 0; i < v.size() - 1; ++i) // Explicitly using size_t because
// 0 is an int, and v.size() is an
// unsigned integer of implementation-
// defined size but cannot be larger
// than size_t
// note v.size() - 1 is only safe because
// we made sure v is not empty above
// otherwise 0 - 1 in unsigned math
// Becomes a very, very large positive
// number
{
result[i] = v[i + 1];
}
result.back() = v.front(); // using direct calls to front and back because it's
// a little easier on my mind than math and []
return result;
// PUT CODE ABOVE THIS LINE. DON'T CHANGE ANYTHING BELOW.
}
We can go further still and use iterators and range-based for loops, but I think this is enough for now. Besides, at the end of the day you can throw the function out completely and use std::rotate from the <algorithm> header.
[1] This is called Undefined Behaviour (UB), and one of the most fearsome things about UB is that anything could happen, including giving you the expected result. We put up with UB because it makes for very fast, versatile programs. Validity checks are not made where you don't need them (nor where you do), unless the compiler and library writers decide to make those checks and give guaranteed behaviour such as an error message and a crash. Microsoft, for example, does exactly this in the vector implementation used when you make a debug build. The release version of Microsoft's vector makes no checks and assumes you wrote the code correctly and would prefer the executable to be as fast as possible.
"I saw the rotate() function, but I don't think that will work given my code."
Yes it will work.
When learning there is gain in "reinventing the wheel" (e.g. implementing rotate yourself) and there is also gain in learning how to use the existing pieces (e.g. use standard library algorithm functions).
Here is how you would use std::rotate from the standard library:
#include <algorithm>
#include <vector>
std::vector<int> rotate_1(const std::vector<int>& v)
{
std::vector<int> result = v;
std::rotate(result.begin(), result.begin() + 1, result.end());
return result;
}
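A possible variation, sketched here with the hypothetical name `rotate_copy_1`, uses std::rotate_copy to write the rotated sequence straight into the result instead of copying first and rotating afterwards. The emptiness check is an addition of this sketch, since `begin() + 1` is only valid when there is at least one element.

```cpp
#include <algorithm>
#include <vector>

// Left-rotate by one into a new vector without first copying and then rotating.
std::vector<int> rotate_copy_1(const std::vector<int>& v) {
    std::vector<int> result(v.size());
    if (!v.empty()) // begin() + 1 is only valid for a non-empty vector
        std::rotate_copy(v.begin(), v.begin() + 1, v.end(), result.begin());
    return result;
}
```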

Unrestricted conversion from Array to TypedArray<std::complex<double>>?

Tried many things, just cannot get it to work when writing a mex-function.
I have an input from MATLAB which I pass to a method as const matlab::data::Array. This array may contain complex data, sometimes it's only real. So the most straightforward approach should be, in my naive thoughts, that I can simply convert the Array to a TypedArray<std::complex<double>> and I get full complex values if the array contains complex values, and I get complex values with imag=0 if the array contains only real values. It seems to be impossible... This last conversion is not accepted in any case, and MATLAB even simply crashes on trying to cast single elements from a real-valued Array to std::complex<double>.
Does anybody have a solution for getting a TypedArray<std::complex<double>> in all cases, so I can use that in C++ code?
Story of my life: trying for hours, and after posting here I find something that works within half an hour... The following code seems to do the job:
void prepareObject(const matlab::data::Array& corners, const matlab::data::Array& facets)
{
size_t N_facet_rows = facets.getDimensions()[0];
size_t N_facet_columns = facets.getDimensions()[1];
matlab::data::TypedArray<std::complex<double>> complex_facets = arrayFactory.createArray<std::complex<double>>(facets.getDimensions());
// Convert the facets to a complex-valued array.
if (facets.getType() == ArrayType::DOUBLE) {
std::complex<double> v;
// Input is DOUBLE, so for each value init a complex<double> and store that in the complex array.
v.imag(0);
for (int i_r = 0; i_r < N_facet_rows; i_r++) {
for (int i_c = 0; i_c < N_facet_columns; i_c++) {
v.real(facets[i_r][i_c]);
complex_facets[i_r][i_c] = v;
}
}
}
else {
// Input is COMPLEX_DOUBLE, so simply copy all values.
for (int i_r = 0; i_r < N_facet_rows; i_r++) {
for (int i_c = 0; i_c < N_facet_columns; i_c++) {
complex_facets[i_r][i_c] = (std::complex<double>) facets[i_r][i_c];
}
}
}
}

efficient method to select index of vector in c++

In C++, suppose you have a vector with boolean values, and you want to select randomly one index among those corresponding to True values.
What is the most efficient method to use?
Example:
vector<bool> v(4);
v.at(0)=true;
v.at(1)=false;
v.at(2)=true;
v.at(3)=true;
You want to select a number among the subset {0,2,3}.
I have so far tried 2 methods:
Stacking indexes in a vector and then selecting among these elements. Extremely slow.
Naive method: randomly select an index until v.at(rnd_sel_index) is True. Considerably faster.
Any suggestions faster than method 2?
Perhaps there's a more efficient approach.
Rather than storing what is there and what is not, perhaps it's better to store only what is not - i.e. a vector containing indices that are free.
The order of this vector can easily be randomised once, and you can then pull items from the back() until it's empty().
When you want to return items to the 'free index pool', simply insert them in a random position in the vector.
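The free-index pool described above could be sketched roughly as follows. `IndexPool`, `take` and `give_back` are hypothetical names. Returning an index is done here by appending and swapping with a random slot, a swap-based stand-in for inserting at a random position that avoids shifting elements.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: keeps the currently free indices in random order.
class IndexPool {
public:
    // indices: the positions that are currently free.
    explicit IndexPool(std::vector<std::size_t> indices,
                       unsigned seed = std::random_device{}())
        : pool_(std::move(indices)), rng_(seed) {
        std::shuffle(pool_.begin(), pool_.end(), rng_); // randomise once up front
    }

    bool empty() const { return pool_.empty(); }

    // Remove and return a random free index in O(1). Precondition: !empty().
    std::size_t take() {
        const std::size_t i = pool_.back();
        pool_.pop_back();
        return i;
    }

    // Put an index back at a random position.
    // Swapping with the back avoids the O(n) shift of vector::insert.
    void give_back(std::size_t i) {
        pool_.push_back(i);
        std::uniform_int_distribution<std::size_t> pick(0, pool_.size() - 1);
        std::swap(pool_[pick(rng_)], pool_.back());
    }

private:
    std::vector<std::size_t> pool_;
    std::mt19937 rng_;
};
```

Both operations are constant time, so the cost of the one-off shuffle is paid only at construction.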
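You can use the well-known method for selecting an element from a sequence of unknown length (reservoir sampling with a reservoir of one element): the k-th candidate replaces the current choice with probability 1/k.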
Example Code:
#include <random>
#include <iostream>
#include <vector>
#include <algorithm>
std::size_t choose_element(const std::vector<bool>& v) {
auto last = v.end();
auto chosen_i = std::find(v.begin(), last, true);
if (chosen_i == last)
return v.size(); // no true element; avoids std::next(last) below
auto i = std::find(std::next(chosen_i), last, true);
double n = 2.0;
static auto random_generator = std::mt19937{std::random_device{}()};
while (i != last) {
if (std::bernoulli_distribution(1.0 / n)(random_generator))
chosen_i = i;
i = std::find(std::next(i), last, true);
++n;
}
return std::distance(v.begin(), chosen_i);
}
int main() {
std::vector<bool> v = {true, true, false, true};
std::vector<int> indexes(v.size());
const double N = 100;
for (int i=0; i<N; ++i)
++indexes[choose_element(v)];
for (auto& index : indexes)
std::cout << std::distance(indexes.data(), &index) << ": " << (index / N) << "\n";
return 0;
}
This has predictable performance and only takes one pass through the data. Of course if you are taking multiple samples from the same vector it may be more efficient to restructure the data to a different format and then draw from that. Also, if nearly all of the elements are true, your method (2) might perform better in the average case.
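For the repeated-sampling case, one way to restructure the data is to collect the true positions once and then draw from that list. This is only a sketch under that assumption; `true_indices` and `sample_index` are hypothetical names.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Collect the positions of all true elements once: O(n).
std::vector<std::size_t> true_indices(const std::vector<bool>& v) {
    std::vector<std::size_t> idx;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i])
            idx.push_back(i);
    return idx;
}

// Draw one of the precomputed positions uniformly: O(1) per sample.
// Precondition: idx is non-empty.
template <class Rng>
std::size_t sample_index(const std::vector<std::size_t>& idx, Rng& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, idx.size() - 1);
    return idx[pick(rng)];
}
```

The index list has to be rebuilt (or patched) whenever the bool vector changes, so this only pays off when many samples are taken between changes.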

What is the best way to create a sparse array in C++?

I am working on a project that requires the manipulation of enormous matrices, specifically pyramidal summation for a copula calculation.
In short, I need to keep track of a relatively small number of values (usually a value of 1, and in rare cases more than 1) in a sea of zeros in the matrix (multidimensional array).
A sparse array allows the user to store a small number of values, and assume all undefined records to be a preset value. Since it is not physically possible to store all values in memory, I need to store only the few non-zero elements. This could be several million entries.
Speed is a huge priority, and I would also like to dynamically choose the number of variables in the class at runtime.
I currently work on a system that uses a binary search tree (b-tree) to store entries. Does anyone know of a better system?
For C++, a map works well. Several million objects won't be a problem. 10 million items took about 4.4 seconds and about 57 meg on my computer.
My test application is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <map>
class triple {
public:
int x;
int y;
int z;
bool operator<(const triple &other) const {
if (x < other.x) return true;
if (other.x < x) return false;
if (y < other.y) return true;
if (other.y < y) return false;
return z < other.z;
}
};
int main(int, char**)
{
std::map<triple,int> data;
triple point;
int i;
for (i = 0; i < 10000000; ++i) {
point.x = rand();
point.y = rand();
point.z = rand();
//printf("%d %d %d %d\n", i, point.x, point.y, point.z);
data[point] = i;
}
return 0;
}
Now to dynamically choose the number of variables, the easiest solution is to represent index as a string, and then use string as a key for the map. For instance, an item located at [23][55] can be represented via "23,55" string. We can also extend this solution for higher dimensions; such as for three dimensions an arbitrary index will look like "34,45,56". A simple implementation of this technique is as follows:
#include <cstdio>
#include <map>
#include <string>
std::map<std::string, int> data;
char ix[100];
std::snprintf(ix, sizeof ix, "%d,%d", x, y); // 2 vars
data[ix] = i;
std::snprintf(ix, sizeof ix, "%d,%d,%d", x, y, z); // 3 vars
data[ix] = i;
The accepted answer recommends using strings to represent multi-dimensional indices.
However, constructing strings is needlessly wasteful for this. If the size isn’t known at compile time (and thus std::tuple doesn’t work), std::vector works well as an index, both with hash maps and ordered trees. For std::map, this is almost trivial:
#include <vector>
#include <map>
using index_type = std::vector<int>;
template <typename T>
using sparse_array = std::map<index_type, T>;
For std::unordered_map (or similar hash table-based dictionaries) it’s slightly more work, since std::vector does not specialise std::hash:
#include <vector>
#include <unordered_map>
#include <numeric>
#include <functional>
#include <cstddef>
using index_type = std::vector<int>;
struct index_hash {
std::size_t operator()(index_type const& i) const noexcept {
// Like boost::hash_combine; there might be some caveats, see
// <https://stackoverflow.com/a/50978188/1968>
// The seed is a std::size_t starting at 0, so an empty index is also valid.
auto const hash_combine = [](std::size_t seed, int x) {
return seed ^ (std::hash<int>()(x) + 0x9e3779b9 + (seed << 6) + (seed >> 2));
};
return std::accumulate(i.begin(), i.end(), std::size_t{0}, hash_combine);
}
};
template <typename T>
using sparse_array = std::unordered_map<index_type, T, index_hash>;
Either way, the usage is the same:
#include <iostream>
int main() {
using i = index_type;
auto x = sparse_array<int>();
x[i{1, 2, 3}] = 42;
x[i{4, 3, 2}] = 23;
std::cout << x[i{1, 2, 3}] + x[i{4, 3, 2}] << '\n'; // 65
}
Boost has a templated implementation of BLAS called uBLAS that contains a sparse matrix.
https://www.boost.org/doc/libs/release/libs/numeric/ublas/doc/index.htm
Eigen is a C++ linear algebra library that has an implementation of a sparse matrix. It even supports matrix operations and solvers (LU factorization etc) that are optimized for sparse matrices.
A complete list of solutions can be found on Wikipedia. For convenience, I have quoted the relevant sections below.
https://en.wikipedia.org/wiki/Sparse_matrix#Dictionary_of_keys_.28DOK.29
Dictionary of keys (DOK)
DOK consists of a dictionary that maps (row, column)-pairs to the
value of the elements. Elements that are missing from the dictionary
are taken to be zero. The format is good for incrementally
constructing a sparse matrix in random order, but poor for iterating
over non-zero values in lexicographical order. One typically
constructs a matrix in this format and then converts to another more
efficient format for processing.[1]
List of lists (LIL)
LIL stores one list per row, with each entry containing the column
index and the value. Typically, these entries are kept sorted by
column index for faster lookup. This is another format good for
incremental matrix construction.[2]
Coordinate list (COO)
COO stores a list of (row, column, value) tuples. Ideally, the entries
are sorted (by row index, then column index) to improve random access
times. This is another format which is good for incremental matrix
construction.[3]
Compressed sparse row (CSR, CRS or Yale format)
The compressed sparse row (CSR) or compressed row storage (CRS) format
represents a matrix M by three (one-dimensional) arrays, that
respectively contain nonzero values, the extents of rows, and column
indices. It is similar to COO, but compresses the row indices, hence
the name. This format allows fast row access and matrix-vector
multiplications (Mx).
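The CSR description quoted above can be sketched with three plain vectors and a matrix-vector product. The names `Csr`, `row_ptr`, `col_index` and `multiply` are chosen for this example only and do not come from any particular library.

```cpp
#include <cstddef>
#include <vector>

// CSR storage: values and column indices of the non-zeros, row by row,
// plus row_ptr, where row r occupies [row_ptr[r], row_ptr[r + 1]).
struct Csr {
    std::vector<double> values;
    std::vector<std::size_t> col_index;
    std::vector<std::size_t> row_ptr; // size = number of rows + 1
};

// y = M * x, touching only the stored non-zeros.
std::vector<double> multiply(const Csr& m, const std::vector<double>& x) {
    const std::size_t rows = m.row_ptr.empty() ? 0 : m.row_ptr.size() - 1;
    std::vector<double> y(rows, 0.0);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t k = m.row_ptr[r]; k < m.row_ptr[r + 1]; ++k)
            y[r] += m.values[k] * x[m.col_index[k]];
    return y;
}
```

For the 3x3 matrix with rows (1, 2, 1), (0, 0, 0) and (0, 0, 2), the arrays would be values = {1, 2, 1, 2}, col_index = {0, 1, 2, 2} and row_ptr = {0, 3, 3, 4}; the empty middle row shows up as two equal consecutive entries in row_ptr.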
Small detail in the index comparison. You need to do a lexicographical compare, otherwise:
a= (1, 2, 1); b= (2, 1, 2);
(a<b) == (b<a) is true, but b!=a
Edit: So the comparison should probably be:
return lhs.x<rhs.x
? true
: lhs.x==rhs.x
? lhs.y<rhs.y
? true
: lhs.y==rhs.y
? lhs.z<rhs.z
: false
: false;
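The same lexicographic ordering can also be written with std::tie, which some readers may find easier to check than nested conditionals. A minimal sketch, using a stand-in struct named `Triple`:

```cpp
#include <tuple>

struct Triple {
    int x, y, z;
};

// Lexicographic "less than": compare x first, then y, then z.
// std::tie builds tuples of references, and tuple's operator< is lexicographic.
bool operator<(const Triple& lhs, const Triple& rhs) {
    return std::tie(lhs.x, lhs.y, lhs.z) < std::tie(rhs.x, rhs.y, rhs.z);
}
```

With a = (1, 2, 1) and b = (2, 1, 2) this gives a < b and not b < a, so the two keys are no longer treated as equivalent.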
Hash tables have a fast insertion and look up. You could write a simple hash function since you know you'd be dealing with only integer pairs as the keys.
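As an illustration of that suggestion, here is one possible hash for integer pairs together with a lookup that treats missing cells as zero. `PairHash`, `SparseGrid` and `value_at` are hypothetical names, and 32-bit coordinates are an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical hash for (row, column) keys: packs the two 32-bit
// coordinates into one 64-bit integer and hashes that.
struct PairHash {
    std::size_t operator()(const std::pair<std::int32_t, std::int32_t>& p) const noexcept {
        const std::uint64_t packed =
            (static_cast<std::uint64_t>(static_cast<std::uint32_t>(p.first)) << 32) |
            static_cast<std::uint32_t>(p.second);
        return std::hash<std::uint64_t>{}(packed);
    }
};

// Sparse 2-D array: only non-zero cells are stored; missing cells read as 0.
using SparseGrid =
    std::unordered_map<std::pair<std::int32_t, std::int32_t>, int, PairHash>;

int value_at(const SparseGrid& g, std::int32_t row, std::int32_t col) {
    const auto it = g.find({row, col});
    return it == g.end() ? 0 : it->second;
}
```

Reading through `value_at` rather than `operator[]` matters: `operator[]` would insert a zero entry for every cell that is merely looked at.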
The best way to implement sparse matrices is not to implement them, at least not on your own. I would suggest BLAS (which LAPACK builds on), which can handle really huge matrices.
Since only the values at [a][b][c]...[w][x][y][z] are of consequence, we only store the indices themselves, not the value 1, which is just about everywhere: it is always the same and there is no way to hash it. Noting that the curse of dimensionality is present, I suggest going with an established tool from NIST or Boost, or at least reading their sources, to avoid needless blunders.
If the work needs to capture the temporal dependence distributions and parametric tendencies of unknown data sets, then a Map or B-Tree with a uni-valued root is probably not practical. We can store only the indices themselves, hashed if ordering (sensibility for presentation) can be subordinated to reducing run time, for all 1 values. Since non-zero values other than one are few, an obvious candidate for those is whatever data structure you can readily find and understand. If the data set is truly vast, I suggest some sort of sliding window that manages file / disk / persistent I/O yourself, moving portions of the data into scope as needed (writing code that you can understand). If you are under commitment to provide an actual solution to a working group, failure to do so leaves you at the mercy of consumer-grade operating systems that have the sole goal of taking your lunch away from you.
Here is a relatively simple implementation that should provide reasonably fast lookup (using a hash table) as well as fast iteration over non-zero elements in a row/column.
// Copyright 2014 Leo Osvald
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#ifndef UTIL_IMMUTABLE_SPARSE_MATRIX_HPP_
#define UTIL_IMMUTABLE_SPARSE_MATRIX_HPP_
#include <algorithm>
#include <limits>
#include <map>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <vector>
// A simple time-efficient implementation of an immutable sparse matrix
// Provides efficient iteration of non-zero elements by rows/cols,
// e.g. to iterate over a range [row_from, row_to) x [col_from, col_to):
// for (int row = row_from; row < row_to; ++row) {
// for (auto col_range = sm.nonzero_col_range(row, col_from, col_to);
// col_range.first != col_range.second; ++col_range.first) {
// int col = *col_range.first;
// // use sm(row, col)
// ...
// }
template<typename T = double, class Coord = int>
class SparseMatrix {
struct PointHasher;
typedef std::map< Coord, std::vector<Coord> > NonZeroList;
typedef std::pair<Coord, Coord> Point;
public:
typedef T ValueType;
typedef Coord CoordType;
typedef typename NonZeroList::mapped_type::const_iterator CoordIter;
typedef std::pair<CoordIter, CoordIter> CoordIterRange;
SparseMatrix() = default;
// Reads a matrix stored in MatrixMarket-like format, i.e.:
// <num_rows> <num_cols> <num_entries>
// <row_1> <col_1> <val_1>
// ...
// Note: the header (lines starting with '%') is ignored.
template<class InputStream, size_t max_line_length = 1024>
void Init(InputStream& is) {
rows_.clear(), cols_.clear();
values_.clear();
// skip the header (lines beginning with '%', if any)
decltype(is.tellg()) offset = 0;
for (char buf[max_line_length + 1];
is.getline(buf, sizeof(buf)) && buf[0] == '%'; )
offset = is.tellg();
is.seekg(offset);
size_t n;
is >> row_count_ >> col_count_ >> n;
values_.reserve(n);
while (n--) {
Coord row, col;
typename std::remove_cv<T>::type val;
is >> row >> col >> val;
values_[Point(--row, --col)] = val;
rows_[col].push_back(row);
cols_[row].push_back(col);
}
SortAndShrink(rows_);
SortAndShrink(cols_);
}
const T& operator()(const Coord& row, const Coord& col) const {
static const T kZero = T();
auto it = values_.find(Point(row, col));
if (it != values_.end())
return it->second;
return kZero;
}
CoordIterRange
nonzero_col_range(Coord row, Coord col_from, Coord col_to) const {
CoordIterRange r;
GetRange(cols_, row, col_from, col_to, &r);
return r;
}
CoordIterRange
nonzero_row_range(Coord col, Coord row_from, Coord row_to) const {
CoordIterRange r;
GetRange(rows_, col, row_from, row_to, &r);
return r;
}
Coord row_count() const { return row_count_; }
Coord col_count() const { return col_count_; }
size_t nonzero_count() const { return values_.size(); }
size_t element_count() const { return size_t(row_count_) * col_count_; }
private:
typedef std::unordered_map<Point,
typename std::remove_cv<T>::type,
PointHasher> ValueMap;
struct PointHasher {
size_t operator()(const Point& p) const {
return p.first << (std::numeric_limits<Coord>::digits >> 1) ^ p.second;
}
};
static void SortAndShrink(NonZeroList& list) {
for (auto& it : list) {
auto& indices = it.second;
indices.shrink_to_fit();
std::sort(indices.begin(), indices.end());
}
// insert a sentinel vector to handle the case of all zeroes
if (list.empty())
list.emplace(Coord(), std::vector<Coord>(Coord()));
}
static void GetRange(const NonZeroList& list, Coord i, Coord from, Coord to,
CoordIterRange* r) {
auto lr = list.equal_range(i);
if (lr.first == lr.second) {
r->first = r->second = list.begin()->second.end();
return;
}
auto begin = lr.first->second.begin(), end = lr.first->second.end();
r->first = std::lower_bound(begin, end, from);
r->second = std::lower_bound(r->first, end, to);
}
ValueMap values_;
NonZeroList rows_, cols_;
Coord row_count_, col_count_;
};
#endif /* UTIL_IMMUTABLE_SPARSE_MATRIX_HPP_ */
For simplicity, it's immutable, but you can make it mutable; be sure to change std::vector to std::set if you want reasonably efficient "insertions" (changing a zero to a non-zero).
I would suggest doing something like:
#include <tuple>
#include <unordered_map>
#include <boost/functional/hash.hpp>
typedef std::tuple<int, int, int> coord_t;
typedef boost::hash<coord_t> coord_hash_t;
typedef std::unordered_map<coord_t, int, coord_hash_t> sparse_array_t;
sparse_array_t the_data;
the_data[ { x, y, z } ] = 1; /* list-initialization is cool */
for( const auto& element : the_data ) {
int xx, yy, zz;
std::tie( xx, yy, zz ) = element.first; /* unpack the key */
int val = element.second;
/* ... */
}
To help keep your data sparse, you might want to write a subclass of unordered_map whose iterators automatically skip over (and erase) any items with a value of 0.
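A related way to get a similar effect, sketched here with a wrapper class instead of a subclass (a deliberately different technique, since standard containers are not designed as base classes), is to erase an entry whenever zero is stored. `ZeroSkippingArray` is a hypothetical name, and std::map is used only so that the tuple key needs no custom hash.

```cpp
#include <cstddef>
#include <map>
#include <tuple>

// Hypothetical wrapper around an associative container that never stores zeros.
// std::map is used so that std::tuple keys work without a custom hash.
class ZeroSkippingArray {
public:
    using Key = std::tuple<int, int, int>;

    // Storing 0 erases the entry instead of keeping an explicit zero.
    void set(const Key& k, int value) {
        if (value == 0)
            data_.erase(k);
        else
            data_[k] = value;
    }

    // Missing entries read as 0 and are not inserted by the lookup.
    int get(const Key& k) const {
        const auto it = data_.find(k);
        return it == data_.end() ? 0 : it->second;
    }

    std::size_t stored() const { return data_.size(); }

private:
    std::map<Key, int> data_;
};
```

Because zeros are removed at write time, iteration over the underlying container never encounters them and no special iterator is needed.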