Zero-copy construction of an Eigen SparseMatrix - c++

I have the following problem:
I have an Eigen::SparseMatrix I need to send over the network, and my network library only supports sending arrays of primitive types.
I can retrieve the pointers to the backing arrays of my SparseMatrix by doing something like (here's the backing object's code):
// Get pointers to the indices and values, send data over the network
int num_items = sparse_matrix.nonZeros()
auto values_ptr = sparse_matrix.data().valuePtr()
auto index_ptr = sparse_matrix.data().indexPtr()
network_lib::send(values_ptr, num_items)
network_lib::send(index_ptr, 2 * num_items) // Times two b/c we have 2 indices per value
Now on the other side I have access to these two arrays. But AFAIK there is no way to create a SparseArray without copying all the data into a new SparseMatrix (see docs for construction).
I'd like to do something like:
Eigen::SparseMatrix<float> zero_copy_matrix(num_rows, num_cols);
zero_copy_matrix.data().valuePtr() = received_values_ptr;
zero_copy_matrix.data().indexPtr() = received_index_ptr;
But this throws a compiler error:
error: lvalue required as left operand of assignment zero_copy_matrix.data().valuePtr() = received_values_ptr;
Any idea on how we could zero-copy construct a sparse Eigen matrix from existing arrays of indexes and data?
Another approach I tried that didn't work (this is local, no communication):
zero_copy_matrix.reserve(num_non_zeros);
zero_copy_matrix.data().swap(original_matrix.data());
When I try to print out the zero_copy_matrix it has no values in it.

After digging around I think a good option for me would be to use an Eigen::Map<Eigen::SparseMatrix<float>> as such:
Eigen::Map<Eigen::SparseMatrix<float>> sparse_map(num_rows, num_cols, num_non_zeros,
original_outer_index_ptr, original_inner_index_ptr,
original_values_ptr);
AFAIK, this should be zero-copy. Answer from here.

Related

Conversion from std::vector<Eigen::Vector3d> to std::vector<Eigen::Vector3f>

I came across the problem that I want to convert from a std::vector<Eigen::Vector3d> to a std::vector<Eigen::Vector3f>. I was wondering if there is a solution where I dont have to iterate over the points.
// mapping using iteration
std::vector< Eigen::Vector3d> tf{ {1,1,1},{1,1,1},{1,1,1} };
std::vector< Eigen::Vector3f> tf2;
tf2.reserve(tf.size());
std::transform(tf.begin(), tf.end(), std::back_inserter(tf2), [](const Eigen::Vector3d& p) {
return p.cast<float>();
});
I tried some things like tf.data() and tried to cast that, but I didnt found a solution. I also looked into Eigen::Map<> class, but didnt really find a solution.
I don't think what you are asking is possible. Eigen::Map allows you to construct an Eigen data structure without needing to copy or move, it merely takes a view on existing contiguous data (typically from a std::array or std::vector). The operation you are looking to do, casting from doubles to float, two distinct types with different memory layouts, is an explicit operation. You would be shrinking the size of the vector in half. It is not possible to achieve this by taking a different view on the same data.
Assuming Vector3d and Vector3f don't introduce any padding (which is true for all compilers which Eigen supports), you could use an Eigen::Map<const Matrix3Xd> and .cast<float>() that into an Eigen::Map<Matrix3Xf> over the destination vector:
std::vector< Eigen::Vector3d> tf{ {1,1,1},{1,1,1},{1,1,1} };
std::vector< Eigen::Vector3f> tf2(tf.size()); // target needs to be actually allocated
Eigen::Matrix3Xf::Map(tf2[0].data(), 3, tf2.size())
= Eigen::Matrix3Xd::Map(tf[0].data(), 3, tf.size()).cast<float>();
With the upcoming 3.4 branch of Eigen you can also use iterators over the casted-map, like so:
Eigen::Map<Eigen::Matrix3Xd> input(tf[0].data(), 3, tf.size());
std::vector<Eigen::Vector3f> tf2(input.cast<float>().colwise().begin(),
input.cast<float>().colwise().end());

How to convert arrow::Array to std::vector?

I have an Apache arrow array that is created by reading a file.
std::shared_ptr<arrow::Array> array;
PARQUET_THROW_NOT_OK(reader->ReadColumn(0, &array));
Is there a way to convert it to std::vector or any other native array type in C++?
You can use std::static_pointer_cast to cast the arrow::Array to, for example, an arrow::DoubleArray if the array contains doubles, and then use the Value function to get the value at a particular index. For example:
auto arrow_double_array = std::static_pointer_cast<arrow::DoubleArray>(array);
std::vector<double> double_vector;
for (int64_t i = 0; i < array->length(); ++i)
{
double_vector.push_back(arrow_double_array->Value(i));
}
See the latter part of the ColumnarTableToVector function in this example:
https://arrow.apache.org/docs/cpp/examples/row_columnar_conversion.html. In that example, table->column(0)->chunk(0) is a std::shared_ptr<arrow::Array>.
To learn more, I found it useful to click on various parts of the inheritance diagram tree here: https://arrow.apache.org/docs/cpp/classarrow_1_1_flat_array.html. For example, strings in an arrow::StringArray are accessed using a GetString function instead of a Value function.
This is just what I've pieced together from these links, johnathan's comment above, and playing around with a small example myself, so I'm not sure if this is the best way, as I'm quite new to this.

Retuning multiple vectors from a function in c++?

I want to return multiple vectors from a function.
I am not sure either tuple can work or not. I tried but is not working.
xxx myfunction (vector<vector<float>> matrix1 , vector<vector<float>> matrix2) {
// some functional code: e.g.
// vector<vector<float>> matrix3 = matrix1 + matrix2;
// vector<vector<float>> matrix4 = matrix1 - matrix2;
return matrix3, matrix4;
If these matrices are very small then this approach might be OK, but generally I would not do it this way. First, regardless of their size, you should pass them in by const reference.
Also, std::vector<std::vector<T>> is not a very good "matrix" implementation - much better to store the data in a contiguous block and implement element-wise operations over the entire block. Also, if you are going to return the matrices (via a pair or other class) then you'll want to look into move semantics as you don't want extra copies.
If you are not using C++11 then I'd pass in matrices by reference and fill them in the function; e.g.
using Matrix = std::vector<std::vector<float>>; // or preferably something better
void myfunction(const Matrix &m1, const Matrix &m2, Matrix &diff, Matrix &sum)
{
// sum/diff clear / resize / whatever is appropriate for your use case
// sum = m1 + m2
// diff = m1 - m2
}
The main issue with functional style code, e.g. returning std::tuple<Matrix,Matrix> is avoiding copies. There are clever things one can here to avoid extra copies but sometimes it is just simpler, IMO, to go with a less "pure" style of coding.
For Matrices, I normally create a Struct or Class for it that has these vectors, and send objects of that class in to the function. It would also help to encapsulate Matrix related operations inside that Class.
If you still want to use vector of vector, here is my opinion. You could use InOut parameters using references/pointers : Meaning, if the parameters can be updated to hold results of calculation, you would be sending the arguments in, and you would not have to return anything in that case.
If the parameters need to be const and cannot be changed, then I normally send In parameters as const references, and separate Out parameters in the function argument list itself.
Hope this helps a bit.

How to get dimensions of a multidimensional vector in C++

all
I am using multidimensional STL vector to store my data in C++. What I have is a 3D vector
vector<vector<vector<double>>> vec;
What I want to retrieve from it is :
&vec[][1][]; // I need a pointer that points to a 2D matrix located at column 1 in vec
Anyone has any idea to do so? I would be extremly appreciate any help!
Regards
Long
It is best to consider vec just as a vector whose elements happen to be
vectors-of-vectors-of-double, rather than as a multi-dimensional structure.
You probably know, but just in case you don't I'll mention it,
that this datatype does not necessarily represent a rectangular cuboid.
vec will only have that "shape" if you ensure that all the vectors are
the same size at each level. The datatype is quite happy for the vector vec[j]
to be a different size from the one at vec[k] and likewise for vec[j][n]
to be a vector of different size from vec[j][m], so that your structure is "jagged".
So you want to get a pointer to the vector<vector<double>> that is at
index 1 in vec. You can do that by:
vector<vector<double>> * pmatrix = &vec[1];
However this pointer will be an unnecessarily awkward means of accessing that
vector<vector<double>>. You certainly won't be able to write the
like of:
double d = pmatrix[j][k];
and expect to get a double at coordinates (j,k) in the "matrix addressed
by a pmatrix". Because pmatrix is a pointer-to-a-vector-of-vector-of-double;
so what pmatrix[j] refers to is the vector-of-vector-of-double (not vector-of-double)
at index j from pmatrix, where the index goes in steps of
sizeof(vector<vector<double>>). The statement will reference who-knows-what
memory and very likely crash your program.
Instead, you must write the like of:
double d = (*pmatrix)[j][k];
where (*pmatrix) gives you the vector-of-vector-of-double addressed by pmatrix,
or equivalently but more confusingly:
double d = pmatrix[0][j][k];
Much simpler - and therefore, the natural C++ way - is to take a reference,
rather than pointer, to the vector<vector<double>> at index 1 in vec. You
do that simply by:
vector<vector<double>> & matrix = vec[1];
Now matrix is simply another name for the vector<vector<double>> at index 1 in vec,
and you can handle it matrix-wise just as you'd expect (always assuming
you have made sure it is a matrix, and not a jagged array).
Another thing to consider was raised in a comment by manu343726. Do you
want the code that receives this reference to vec[1] to be able to
use it to modify the contents of vec[1] - which would include changing its
size or the size of any of the vector<double>s within it?
If you allow modification, that's fine. If you don't then you want to get
a const reference. You can do that by:
vector<vector<double> > const & matrix = vec[1];
Possibly, you want the receiving code to be able to modify the doubles
but not the sizes of the vectors that contain them? In that case, std::vector
is the wrong container type for your application. If that's your position I
can update this answer to offer alternative containers.
Consider using matrix from some linear algebra library. There are some directions here

std::list of boost::arrays

I ran into some trouble while using a list of arrays.
So to clear things up I do know that I can't have a std::list containing arrays.
I am using boost::array to be able to do that.
Currently I prototype all datatypes needed for my pathfinding algorithm and test them for speed and cache coherence.
typedef boost::array<int,2> VertexB2i;
typedef std::list<VertexB2i> VertexList;
These types are not used for pathfinding, they are simply easier to use then all the real c++ arrays and flags for the pathfinder, so they are just used to generate a navigation mesh.
(I also know I could use a stl::pair instead boost::array in this case, but I want to keep the generated data as similar to the pathfinders data as possible, so I don't have to deal with two totally different interfaces the whole time)
VertexList* vl = new VertexList();
vl->push_back({{8,28}}); // this does not work, why?
So while setting up some testing data for these two data-types, I noticed that the commented line does not work, although this does work:
VertexList* vl = new VertexList();
VertexB2i ver1 = {{8,28}};
vl->push_back({ver1}); // works
So to sum it up:
Is there anyway to pushback a "VertexB2i" without declaring it separte first?
General advices?
std::array (or boost::array) is an aggregate type. In C++98/03 it can only be initialized in the declarator:
std::array<int, 2> arr = { 1, 2 };
thelist.push_back(arr); // makes a copy
In C++11, you can use uniform initialization to create temporaries as well:
std::list<std::array<int,2>> l;
l.push_back(std::array<int,2>{{2,3}}); // maybe copy
l.emplace_back(std::array<int,2>{{2,3}}); // maybe copy, probably optimized out