How can I parallelize the integration process by thrust?

How can I parallelize the integration process by thrust? - c++

I have writ a Simpson integrator:
template <typename val_type,
std::size_t num,
template<typename...> typename Container
>
class Integrator : public thrust::unary_function
<typename Container<val_type>::iterator,val_type> {
Container<val_type> coeff_array;
public:
using Iterator = Container<val_type>::iterator;
Integrator(val_type a, val_type b) {
// a,b indicate the interval of the integral
coeff_array.resize(num);
thrust::counting_iterator<unsigned int> cit(0);
val_type C = (b-a)/3./static_cast<val_type>(num);
thrust::transform(cit,cit+num,coeff_array.begin(),
[C](int idx) {
return idx%2==0 ? 2.*C : 4.*C;
});
coeff_array[0] = 1.*C;
}
val_type operator()(Iterator valBegin) {
return thrust::inner_product(coeff_array.begin(),
coeff_array.end(), valBegin, 0.);
}
};
which works well.
Now I have M vectors with same length N, and I need to integrate them all. The data is stored in a N*M device_vector with correct order. I tried some thing like this:
constexpr std::size_t N = 100, M = 60;
thrust::device_vector<float> vec(N*M);
Integrator<float,N,
thrust::device_vector<float>> integral(a,b);
//
some code initialize vec
//
thrust::device_vector<float> result(M);
for (std::size_t i=0; i<M; i++)
result[i] = integral(vec.begin()+N*i); // results are correct
The for loop is definitely ugly here. How can I parallelize the process here?

Related

How to map char values in one vector (std::vector<char *>) to specific values in a separate vector (std::vector<float>) [duplicate]

How can I sort two vectors in the same way, with criteria that uses only one of the vectors?
For example, suppose I have two vectors of the same size:
vector<MyObject> vectorA;
vector<int> vectorB;
I then sort vectorA using some comparison function. That sorting reordered vectorA. How can I have the same reordering applied to vectorB?
One option is to create a struct:
struct ExampleStruct {
MyObject mo;
int i;
};
and then sort a vector that contains the contents of vectorA and vectorB zipped up into a single vector:
// vectorC[i] is vectorA[i] and vectorB[i] combined
vector<ExampleStruct> vectorC;
This doesn't seem like an ideal solution. Are there other options, especially in C++11?

Finding a sort permutation
Given a std::vector<T> and a comparison for T's, we want to be able to find the permutation you would use if you were to sort the vector using this comparison.
template <typename T, typename Compare>
std::vector<std::size_t> sort_permutation(
const std::vector<T>& vec,
Compare& compare)
{
std::vector<std::size_t> p(vec.size());
std::iota(p.begin(), p.end(), 0);
std::sort(p.begin(), p.end(),
[&](std::size_t i, std::size_t j){ return compare(vec[i], vec[j]); });
return p;
}
Applying a sort permutation
Given a std::vector<T> and a permutation, we want to be able to build a new std::vector<T> that is reordered according to the permutation.
template <typename T>
std::vector<T> apply_permutation(
const std::vector<T>& vec,
const std::vector<std::size_t>& p)
{
std::vector<T> sorted_vec(vec.size());
std::transform(p.begin(), p.end(), sorted_vec.begin(),
[&](std::size_t i){ return vec[i]; });
return sorted_vec;
}
You could of course modify apply_permutation to mutate the vector you give it rather than returning a new sorted copy. This approach is still linear time complexity and uses one bit per item in your vector. Theoretically, it's still linear space complexity; but, in practice, when sizeof(T) is large the reduction in memory usage can be dramatic. (See details)
template <typename T>
void apply_permutation_in_place(
std::vector<T>& vec,
const std::vector<std::size_t>& p)
{
std::vector<bool> done(vec.size());
for (std::size_t i = 0; i < vec.size(); ++i)
{
if (done[i])
{
continue;
}
done[i] = true;
std::size_t prev_j = i;
std::size_t j = p[i];
while (i != j)
{
std::swap(vec[prev_j], vec[j]);
done[j] = true;
prev_j = j;
j = p[j];
}
}
}
Example
vector<MyObject> vectorA;
vector<int> vectorB;
auto p = sort_permutation(vectorA,
[](T const& a, T const& b){ /*some comparison*/ });
vectorA = apply_permutation(vectorA, p);
vectorB = apply_permutation(vectorB, p);
Resources
std::vector
std::iota
std::sort
std::swap
std::transform

With range-v3, it is simple, sort a zip view:
std::vector<MyObject> vectorA = /*..*/;
std::vector<int> vectorB = /*..*/;
ranges::v3::sort(ranges::view::zip(vectorA, vectorB));
or explicitly use projection:
ranges::v3::sort(ranges::view::zip(vectorA, vectorB),
std::less<>{},
[](const auto& t) -> decltype(auto) { return std::get<0>(t); });
Demo

I would like to contribute with a extension I came up with.
The goal is to be able to sort multiple vectors at the same time using a simple syntax.
sortVectorsAscending(criteriaVec, vec1, vec2, ...)
The algorithm is the same as the one Timothy proposed but using variadic templates, so we can sort multiple vectors of arbitrary types at the same time.
Here's the code snippet:
template <typename T, typename Compare>
void getSortPermutation(
std::vector<unsigned>& out,
const std::vector<T>& v,
Compare compare = std::less<T>())
{
out.resize(v.size());
std::iota(out.begin(), out.end(), 0);
std::sort(out.begin(), out.end(),
[&](unsigned i, unsigned j){ return compare(v[i], v[j]); });
}
template <typename T>
void applyPermutation(
const std::vector<unsigned>& order,
std::vector<T>& t)
{
assert(order.size() == t.size());
std::vector<T> st(t.size());
for(unsigned i=0; i<t.size(); i++)
{
st[i] = t[order[i]];
}
t = st;
}
template <typename T, typename... S>
void applyPermutation(
const std::vector<unsigned>& order,
std::vector<T>& t,
std::vector<S>&... s)
{
applyPermutation(order, t);
applyPermutation(order, s...);
}
template<typename T, typename Compare, typename... SS>
void sortVectors(
const std::vector<T>& t,
Compare comp,
std::vector<SS>&... ss)
{
std::vector<unsigned> order;
getSortPermutation(order, t, comp);
applyPermutation(order, ss...);
}
// make less verbose for the usual ascending order
template<typename T, typename... SS>
void sortVectorsAscending(
const std::vector<T>& t,
std::vector<SS>&... ss)
{
sortVectors(t, std::less<T>(), ss...);
}
Test it in Ideone.
I explain this a little bit better in this blog post.

In-place sorting using permutation
I would use a permutation like Timothy, although if your data is too large and you don't want to allocate more memory for the sorted vector you should do it in-place. Here is a example of a O(n) (linear complexity) in-place sorting using permutation:
The trick is to get the permutation and the reverse permutation to know where to put the data overwritten by the last sorting step.
template <class K, class T>
void sortByKey(K * keys, T * data, size_t size){
std::vector<size_t> p(size,0);
std::vector<size_t> rp(size);
std::vector<bool> sorted(size, false);
size_t i = 0;
// Sort
std::iota(p.begin(), p.end(), 0);
std::sort(p.begin(), p.end(),
[&](size_t i, size_t j){ return keys[i] < keys[j]; });
// ----------- Apply permutation in-place ---------- //
// Get reverse permutation item>position
for (i = 0; i < size; ++i){
rp[p[i]] = i;
}
i = 0;
K savedKey;
T savedData;
while ( i < size){
size_t pos = i;
// Save This element;
if ( ! sorted[pos] ){
savedKey = keys[p[pos]];
savedData = data[p[pos]];
}
while ( ! sorted[pos] ){
// Hold item to be replaced
K heldKey = keys[pos];
T heldData = data[pos];
// Save where it should go
size_t heldPos = rp[pos];
// Replace
keys[pos] = savedKey;
data[pos] = savedData;
// Get last item to be the pivot
savedKey = heldKey;
savedData = heldData;
// Mark this item as sorted
sorted[pos] = true;
// Go to the held item proper location
pos = heldPos;
}
++i;
}
}

I have recently wrote a proper zip iterator which works with the stl algorithms.
It allows you to produce code like this:
std::vector<int> a{3,1,4,2};
std::vector<std::string> b{"Alice","Bob","Charles","David"};
auto zip = Zip(a,b);
std::sort(zip.begin(), zip.end());
for (const auto & z: zip) std::cout << z << std::endl;
It is contained in a single header and the only requirement is C++17.
Check it out on GitHub.
There is also a post on codereview which contains all the source code.

Make a vector of pairs out of your individual vectors.
initialize vector of pairs
Adding to a vector of pair
Make a custom sort comparator:
Sorting a vector of custom objects
http://rosettacode.org/wiki/Sort_using_a_custom_comparator#C.2B.2B
Sort your vector of pairs.
Separate your vector of pairs into individual vectors.
Put all of these into a function.
Code:
std::vector<MyObject> vectorA;
std::vector<int> vectorB;
struct less_than_int
{
inline bool operator() (const std::pair<MyObject,int>& a, const std::pair<MyObject,int>& b)
{
return (a.second < b.second);
}
};
sortVecPair(vectorA, vectorB, less_than_int());
// make sure vectorA and vectorB are of the same size, before calling function
template <typename T, typename R, typename Compare>
sortVecPair(std::vector<T>& vecA, std::vector<R>& vecB, Compare cmp)
{
std::vector<pair<T,R>> vecC;
vecC.reserve(vecA.size());
for(int i=0; i<vecA.size(); i++)
{
vecC.push_back(std::make_pair(vecA[i],vecB[i]);
}
std::sort(vecC.begin(), vecC.end(), cmp);
vecA.clear();
vecB.clear();
vecA.reserve(vecC.size());
vecB.reserve(vecC.size());
for(int i=0; i<vecC.size(); i++)
{
vecA.push_back(vecC[i].first);
vecB.push_back(vecC[i].second);
}
}

I'm assuming that vectorA and vectorB have equal lengths. You could create another vector, let's call it pos, where:
pos[i] = the position of vectorA[i] after sorting phase
and then, you can sort vectorB using pos, i.e create vectorBsorted where:
vectorBsorted[pos[i]] = vectorB[i]
and then vectorBsorted is sorted by the same permutation of indexes as vectorA is.

I am not sure if this works but i would use something like this. For example to sort two vectors i would use descending bubble sort method and vector pairs.
For descending bubble sort, i would create a function that requires a vector pair.
void bubbleSort(vector< pair<MyObject,int> >& a)
{
bool swapp = true;
while (swapp) {
int key;
MyObject temp_obj;
swapp = false;
for (size_t i = 0; i < a.size() - 1; i++) {
if (a[i].first < a[i + 1].first) {
temp_obj = a[i].first;
key = a[i].second;
a[i].first = a[i + 1].first;
a[i + 1].first = temp_obj;
a[i].second = a[i + 1].second;
a[i + 1].second = key;
swapp = true;
}
}
}
}
After that i would put your 2 vector values into one vector pair. If you are able to add values at the same time use this one and than call the bubble sort function.
vector< pair<MyObject,int> > my_vector;
my_vector.push_back( pair<MyObject,int> (object_value,int_value));
bubbleSort(my_vector);
If you want to use values after adding to your 2 vectors, you can use this one and than call the bubble sort function.
vector< pair<MyObject,int> > temp_vector;
for (size_t i = 0; i < vectorA.size(); i++) {
temp_vector.push_back(pair<MyObject,int> (vectorA[i],vectorB[i]));
}
bubbleSort(temp_vector);
I hope this helps.
Regards,
Caner

Based on Timothy Shields answer.
With a small tweak to apply_permutaion you can apply the permutation to multiple vectors of different types at once with use of a fold expression.
template <typename T, typename... Ts>
void apply_permutation(const std::vector<size_t>& perm, std::vector<T>& v, std::vector<Ts>&... vs) {
std::vector<bool> done(v.size());
for(size_t i = 0; i < v.size(); ++i) {
if(done[i]) continue;
done[i] = true;
size_t prev = i;
size_t curr = perm[i];
while(i != curr) {
std::swap(v[prev], v[curr]);
(std::swap(vs[prev], vs[curr]), ...);
done[curr] = true;
prev = curr;
curr = perm[curr];
}
}
}

Get linear index for multidimensional access

I'm trying to implement a multidimensional std::array, which hold a contigous array of memory of size Dim-n-1 * Dim-n-2 * ... * Dim-1. For that, i use private inheritance from std::array :
constexpr std::size_t factorise(std::size_t value)
{
return value;
}
template<typename... Ts>
constexpr std::size_t factorise(std::size_t value, Ts... values)
{
return value * factorise(values...);
}
template<typename T, std::size_t... Dims>
class multi_array : std::array<T, factorise(Dims...)>
{
// using directive and some stuff here ...
template<typename... Indexes>
reference operator() (Indexes... indexes)
{
return base_type::operator[] (linearise(std::make_integer_sequence<Dims...>(), indexes...)); // Not legal, just to explain the need.
}
}
For instance, multi_array<5, 2, 8, 12> arr; arr(2, 1, 4, 3) = 12; will access to the linear index idx = 2*(5*2*8) + 1*(2*8) + 4*(8) + 3.
I suppose that i've to use std::integer_sequence, passing an integer sequence to the linearise function and the list of the indexes, but i don't know how to do it. What i want is something like :
template<template... Dims, std::size_t... Indexes>
auto linearise(std::integer_sequence<int, Dims...> dims, Indexes... indexes)
{
return (index * multiply_but_last(dims)) + ...;
}
With multiply_but_last multiplying all remaining dimension but the last (i see how to implement with a constexpr variadic template function such as for factorise, but i don't understand if it is possible with std::integer_sequence).
I'm a novice in variadic template manipulation and std::integer_sequence and I think that I'm missing something. Is it possible to get the linear index computation without overhead (i.e. like if the operation has been hand-writtent) ?
Thanks you very much for your help.

Following might help:
#include <array>
#include <cassert>
#include <iostream>
template <std::size_t, typename T> using alwaysT_t = T;
template<typename T, std::size_t ... Dims>
class MultiArray
{
public:
const T& operator() (alwaysT_t<Dims, std::size_t>... indexes) const
{
return values[computeIndex(indexes...)];
}
T& operator() (alwaysT_t<Dims, std::size_t>... indexes)
{
return values[computeIndex(indexes...)];
}
private:
size_t computeIndex(alwaysT_t<Dims, std::size_t>... indexes_args) const
{
constexpr std::size_t dimensions[] = {Dims...};
std::size_t indexes[] = {indexes_args...};
size_t index = 0;
size_t mul = 1;
for (size_t i = 0; i != sizeof...(Dims); ++i) {
assert(indexes[i] < dimensions[i]);
index += indexes[i] * mul;
mul *= dimensions[i];
}
assert(index < (Dims * ...));
return index;
}
private:
std::array<T, (Dims * ...)> values;
};
Demo
I replaced your factorize by fold expression (C++17).

I have a very simple function that converts multi-dimensional index to 1D index.
#include <initializer_list>
template<typename ...Args>
inline constexpr size_t IDX(const Args... params) {
constexpr size_t NDIMS = sizeof...(params) / 2 + 1;
std::initializer_list<int> args{params...};
auto ibegin = args.begin();
auto sbegin = ibegin + NDIMS;
size_t res = 0;
for (int dim = 0; dim < NDIMS; ++dim) {
size_t factor = dim > 0 ? sbegin[dim - 1] : 0;
res = res * factor + ibegin[dim];
}
return res;
}
You may need to add "-Wno-c++11-narrowing" flag to your compiler if you see a warning like non-constant-expression cannot be narrowed from type 'int'.
Example usage:
2D array
int array2D[rows*cols];
// Usually, you need to access the element at (i,j) like this:
int elem = array2D[i * cols + j]; // = array2D[i,j]
// Now, you can do it like this:
int elem = array2D[IDX(i,j,cols)]; // = array2D[i,j]
3D array
int array3D[rows*cols*depth];
// Usually, you need to access the element at (i,j,k) like this:
int elem = array3D[(i * cols + j) * depth + k]; // = array3D[i,j,k]
// Now, you can do it like this:
int elem = array3D[IDX(i,j,k,cols,depth)]; // = array3D[i,j,k]
ND array
// shapes = {s1,s2,...,sn}
T arrayND[s1*s2*...*sn]
// indices = {e1,e2,...,en}
T elem = arrayND[IDX(e1,e2,...,en,s2,...,sn)] // = arrayND[e1,e2,...,en]
Note that the shape parameters passed to IDX(...) begins at the second shape, which is s2 in this case.
BTW: This implementation requires C++ 14.

Generic C++ multidimensional iterators

In my current project I am dealing with a multidimensional datastructure.
The underlying file is stored sequentially (i.e. one huge array, no vector of vectors).
The algorithms that use these datastructures need to know the size of the individual dimensions.
I am wondering if a multidimensional iterator class has been definied somewhere in a generic way and if there are any standards or preferred ways on how to tackle this.
At the moment I am just using a linear iterator with some additional methods that return the size of each dimension and how many dimensions are there in the first part. The reason I don't like it is because I can't use std:: distance in a reasonable way for example (i.e. only returns distance of the whole structure, but not for each dimension separately).
For the most part I will access the datastructure in a linear fashion (first dimension start to finish -> next dimension+...and so on), but it would be good to know when one dimension "ends". I don't know how to do this with just operator*(), operator+() and operator==() in such an approach.
A vector of vectors approach is disfavored, because I don't want to split up the file. Also the algorithms must operate on structure with different dimensionality and are therefore hard to generalize (or maybe there is a way?).
Boost multi_array has the same problems (multiple "levels" of iterators).
I hope this is not too vague or abstract. Any hint in the right direction would be appreciated.
I was looking for a solution myself again and revisited boost:: multi_array. As it turns out it is possible to generate sub views on the data with them, but at the same time also take a direct iterator at the top level and implicitely "flatten" the data structure. The implemented versions of multi_array however do not suit my needs, therefore I probably will implement one myself (that handles the caching of the files in the background) that is compatible with the other multi_arrays.
I will update it again once the implementation is done.

I have just decided to open a public repository on Github : MultiDim Grid which might help for your needs. This is an ongoing project so
I would be glad if you can try it and tell me what you miss / need.
I have started working on this with this topic on codereview.
Put it simply :
MultiDim Grid proposes a flat uni-dimensional array which offer a
generic fast access between multi-dimension coordinates and flatten
index.
You get a container behaviour so you have access to iterators.

That's not that difficult to implement. Just state precisely what functionality your project requires. Here's a dumb sample.
#include <iostream>
#include <array>
#include <vector>
#include <cassert>
template<typename T, int dim>
class DimVector : public std::vector<T> {
public:
DimVector() {
clear();
}
void clear() {
for (auto& i : _sizes)
i = 0;
std::vector<T>::clear();
}
template<class ... Types>
void resize(Types ... args) {
std::array<int, dim> new_sizes = { args ... };
resize(new_sizes);
}
void resize(std::array<int, dim> new_sizes) {
clear();
for (int i = 0; i < dim; ++i)
if (new_sizes[i] == 0)
return;
_sizes = new_sizes;
int realsize = _sizes[0];
for (int i = 1; i < dim; ++i)
realsize *= _sizes[i];
std::vector<T>::resize(static_cast<size_t>(realsize));
}
decltype(auto) operator()(std::array<int, dim> pos) {
// check indexes and compute original index
size_t index;
for (int i = 0; i < dim; ++i) {
assert(0 <= pos[i] && pos[i] < _sizes[i]);
index = (i == 0) ? pos[i] : (index * _sizes[i] + pos[i]);
}
return std::vector<T>::at(index);
}
template<class ... Types>
decltype(auto) at(Types ... args) {
std::array<int, dim> pos = { args ... };
return (*this)(pos);
}
int size(int d) const {
return _sizes[d];
}
class Iterator {
public:
T& operator*() const;
T* operator->() const;
bool operator!=(const Iterator& other) const {
if (&_vec != &other._vec)
return true;
for (int i = 0; i < dim; ++i)
if (_pos[i] != other._pos[i])
return true;
return false;
}
int get_dim(int d) const {
assert(0 <= d && d < dim);
return _pos[d];
}
void add_dim(int d, int value = 1) {
assert(0 <= d && d < dim);
_pos[d] += value;
assert(0 <= _pos[i] && _pos[i] < _vec._sizes[i]);
}
private:
DimVector &_vec;
std::array<int, dim> _pos;
Iterator(DimVector& vec, std::array<int, dim> pos) : _vec(vec), _pos(pos) { }
};
Iterator getIterator(int pos[dim]) {
return Iterator(*this, pos);
}
private:
std::array<int, dim> _sizes;
};
template<typename T, int dim>
inline T& DimVector<T, dim>::Iterator::operator*() const {
return _vec(_pos);
}
template<typename T, int dim>
inline T* DimVector<T, dim>::Iterator::operator->() const {
return &_vec(_pos);
}
using namespace std;
int main() {
DimVector<int, 4> v;
v.resize(1, 2, 3, 4);
v.at(0, 0, 0, 1) = 1;
v.at(0, 1, 0, 0) = 1;
for (int w = 0; w < v.size(0); ++w) {
for (int z = 0; z < v.size(1); ++z) {
for (int y = 0; y < v.size(2); ++y) {
for (int x = 0; x < v.size(3); ++x) {
cout << v.at(w, z, y, x) << ' ';
}
cout << endl;
}
cout << "----------------------------------" << endl;
}
cout << "==================================" << endl;
}
return 0;
}
TODO list:
optimize: use T const& when possible
optimizate iterator: precompute realindex and then just change that realindex
implement const accessors
implement ConstIterator
implement operator>> and operator<< to serialize DimVector to/from file

C++: Performance of Loop in Expression Template

I'm using Expression Templates in a vector-like class for transformations such as moving averages. Here, different to standard arithmetic operations, the operator[](size_t i) does not make a single single acces to element i, but rather there is a whole loop which needs to be evaluated, e.g. for the moving average of a vector v
double operator[](size_t i) const
{
double ad=0.0;
for(int j=i-period+1; j<=i; ++j)
ad+=v[j];
return ad/period;
}
(thats not the real function because one must care for non-negative indices, but that doesn't matter now).
In using such a moving average construct, I have the fear that the code becomes rather in-performant, especially if one takes a double- or triple-moving average. Then one obtains nested loops and therefore quadratic or cubic scaling with the period-size.
My question is, whether compilers are so smart to optimize such redundant loops somehow away? Or is that not the case and one must manually take care for intermediate storage (--which is what I guess)? How could one do this reasonably in the example code below?
Example code, adapted from Wikipedia, compiles with Visual Studio 2013:
CRTP-base class and actual vector:
#include <vector>
template <typename E>
struct VecExpression
{
double operator[](int i) const { return static_cast<E const&>(*this)[i]; }
};
struct Vec : public VecExpression<Vec>
{
Vec(size_t N) : data(N) {}
double operator[](int i) const { return data[i]; }
double& operator[](int i) { return data[i]; }
std::vector<double> data;
};
Moving Average class:
template <typename VectorType>
struct VecMovingAverage : public VecExpression<VecMovingAverage<VectorType> >
{
VecMovingAverage(VectorType const& _vector, int _period) : vector(_vector), period(_period) {}
double operator[](int i) const
{
int s = std::max(i - period + 1, 0);
double ad = 0.0;
for (int j = s; j <= i; ++j)
ad += vector[j];
return ad / (i - s + 1);
}
VectorType const& vector;
int period;
};
template<typename VectorType>
auto MovingAverage(VectorType const& vector, int period = 10) -> VecMovingAverage<VectorType>
{
return VecMovingAverage<VectorType>(vector, period);
}
Now my above mentioned fears arise with expressions like this,
Vec vec(100);
auto tripleMA= MovingAverage(MovingAverage(MovingAverage(vec,20),20),20);
std::cout << tripleMA[40] << std::endl;
which I suppose require 20^3 evaluations for the single operator[] ... ?
EDIT: One obvious solution is to store the result. Move std::vector<double> data into the base class, then change the Moving Average class to something like (untested)
template <typename VectorType, bool Store>
struct VecMovingAverage : public VecExpression<VecMovingAverage<VectorType, Store> >
{
VecMovingAverage(VectorType const& _vector, int _period) : vector(_vector), period(_period) {}
double operator[](int i) const
{
if(Store && i<data.size())
{
return data[i];
}
else
{
int s = std::max(i - period + 1, 0);
double ad = 0.0;
for (int j = s; j <= i; ++j)
ad += vector[j];
ad /= (i - s + 1)
if(Store)
{
data.resize(i+1);
data[i]=ad;
}
return ad;
}
}
VectorType const& vector;
int period;
};
In the function one can then choose to store the result:
template<typename VectorType>
auto MovingAverage(VectorType const& vector, int period = 10) -> VecMovingAverage<VectorType>
{
static const bool Store=true;
return VecMovingAverage<VectorType, Store>(vector, period);
}
This could be extended such that storage is applied only for multiple applications, etc.

is there a way to pass nested initializer lists in C++11 to construct a 2D matrix?

Imagine you have a simple matrix class
template <typename T = double>
class Matrix {
T* data;
size_t row, col;
public:
Matrix(size_t m, size_t n) : row(m), col(n), data(new T[m*n]) {}
//...
friend std::ostream& operator<<(std::ostream& os, const Matrix& m) {
for (int i=0; i<m.row; ++i) {
for (int j=0; j<m.col; ++j)
os<<" "<<m.data[i + j*m.row];
os<<endl;
}
return os;
}
};
Is there a way that I can initialize this matrix with an initializer list? I mean to obtain the sizes of the matrix and the elements from an initializer list. Something like the following code:
Matrix m = { {1., 3., 4.}, {2., 6, 2.}};
would print
1 3 4
2 6 2
Looking forward to your answers. Thank you all.
aa
EDIT
So I worked on your suggestions to craft a somewhat generic array that initializes elements using initializer lists. But this is the most generic I could obtain.
I would appreciate if any of you have any suggestions as to make it a more generic class.
Also, a couple of questions:
Is it fine that a derived class initializes the state of the base class? I'm not calling the base constructor because of this, but should I call it anyways?
I defined the destructor a the Generic_base class as protected, is this the right way to do it?
Is there any foreseeable way to carry out the code that belongs to the constructor of the initializer in a more generic way? I mean to have one general constructor that takes care of all cases?
I included just the necessary code to illustrate the use of initializer lists in construction. When going to higher dimensions it gets messy, but I did one just to check the code.
#include <iostream>
#include <cassert>
using std::cout;
using std::endl;
template <int d, typename T>
class Generic_base {
protected:
typedef T value_type;
Generic_base() : n_(), data_(nullptr){}
size_t n_[d] = {0};
value_type* data_;
};
template <int d, typename T>
class Generic_traits;
template <typename T>
class Generic_traits<1,T> : public Generic_base<1,T> {
protected:
typedef T value_type;
typedef Generic_base<1,T> base_type;
typedef std::initializer_list<T> initializer_type;
using base_type::n_;
using base_type::data_;
public:
Generic_traits(initializer_type l) {
assert(l.size() > 0);
n_[0] = l.size();
data_ = new T[n_[0]];
int i = 0;
for (const auto& v : l)
data_[i++] = v;
}
};
template <typename T>
class Generic_traits<2,T> : public Generic_base<2,T> {
protected:
typedef T value_type;
typedef Generic_base<2,T> base_type;
typedef std::initializer_list<T> list_type;
typedef std::initializer_list<list_type> initializer_type;
using base_type::n_;
using base_type::data_;
public:
Generic_traits(initializer_type l) {
assert(l.size() > 0);
n_[0] = l.size();
n_[1] = l.begin()->size();
data_ = new T[n_[0]*n_[1]];
int i = 0, j = 0;
for (const auto& r : l) {
assert(r.size() == n_[1]);
for (const auto& v : r) {
data_[i + j*n_[0]] = v;
++j;
}
j = 0;
++i;
}
}
};
template <typename T>
class Generic_traits<4,T> : public Generic_base<4,T> {
protected:
typedef T value_type;
typedef Generic_base<4,T> base_type;
typedef std::initializer_list<T> list_type;
typedef std::initializer_list<list_type> llist_type;
typedef std::initializer_list<llist_type> lllist_type;
typedef std::initializer_list<lllist_type> initializer_type;
using base_type::n_;
using base_type::data_;
public:
Generic_traits(initializer_type l) {
assert(l.size() > 0);
assert(l.begin()->size() > 0);
assert(l.begin()->begin()->size() > 0);
assert(l.begin()->begin()->begin()->size() > 0);
size_t m = n_[0] = l.size();
size_t n = n_[1] = l.begin()->size();
size_t o = n_[2] = l.begin()->begin()->size();
n_[3] = l.begin()->begin()->begin()->size();
data_ = new T[m*n*o*n_[3]];
int i=0, j=0, k=0, p=0;
for (const auto& u : l) {
assert(u.size() == n_[1]);
for (const auto& v : u) {
assert(v.size() == n_[2]);
for (const auto& x : v) {
assert(x.size() == n_[3]);
for (const auto& y : x) {
data_[i + m*j + m*n*k + m*n*o*p] = y;
++p;
}
p = 0;
++k;
}
k = 0;
++j;
}
j = 0;
++i;
}
}
};
template <int d, typename T>
class Generic : public Generic_traits<d,T> {
public:
typedef Generic_traits<d,T> traits_type;
typedef typename traits_type::base_type base_type;
using base_type::n_;
using base_type::data_;
typedef typename traits_type::initializer_type initializer_type;
// initializer list constructor
Generic(initializer_type l) : traits_type(l) {}
size_t size() const {
size_t n = 1;
for (size_t i=0; i<d; ++i)
n *= n_[i];
return n;
}
friend std::ostream& operator<<(std::ostream& os, const Generic& a) {
for (int i=0; i<a.size(); ++i)
os<<" "<<a.data_[i];
return os<<endl;
}
};
int main()
{
// constructors for initializer lists
Generic<1, double> y = { 1., 2., 3., 4.};
cout<<"y -> "<<y<<endl;
Generic<2, double> C = { {1., 2., 3.}, {4., 5., 6.} };
cout<<"C -> "<<C<<endl;
Generic<4, double> TT = { {{{1.}, {7.}, {13.}, {19}}, {{2}, {8}, {14}, {20}}, {{3}, {9}, {15}, {21}}}, {{{4.}, {10}, {16}, {22}}, {{5}, {11}, {17}, {23}}, {{6}, {12}, {18}, {24}}} };
cout<<"TT -> "<<TT<<endl;
return 0;
}
Which prints as expected:
y -> 1 2 3 4
C -> 1 4 2 5 3 6
TT -> 1 4 2 5 3 6 7 10 8 11 9 12 13 16 14 17 15 18 19 22 20 23 21 24

Why not?
Matrix(std::initializer_list<std::initializer_list<T>> lst) :
Matrix(lst.size(), lst.size() ? lst.begin()->size() : 0)
{
int i = 0, j = 0;
for (const auto& l : lst)
{
for (const auto& v : l)
{
data[i + j * row] = v;
++j;
}
j = 0;
++i;
}
}
And as stardust_ suggests - you should use vectors, not arrays here.

The main issue with using initializer lists to tackle this problem, is that their size is not easily accessible at compile time. It looks like this particular class is for dynamic matrices, but if you wanted to do this on the stack (usually for speed/locality reasons), here is a hint at what you need (C++17):
template<typename elem_t, std::size_t ... dim>
struct matrix
{
template<std::size_t ... n>
constexpr matrix(const elem_t (&...list)[n]) : data{}
{
auto pos = &data[0];
((pos = std::copy(list, list + n, pos)), ...);
}
elem_t data[(dim * ... * 1)];
};
template<typename ... elem_t, std::size_t ... n>
matrix(const elem_t (&...list)[n]) -> matrix<std::common_type_t<elem_t...>, sizeof...(n), (n * ... * 1) / sizeof...(n)>;
I had to tackle this same problem in my linear algebra library, so I understand how unintuitive this is at first. But if you instead pass a C-array into your constructor, you will have both type and size information of the values you've passed in. Also take note of the constuctor template argument deduction (CTAD) to abstract away the template arguments.
You can then create constexpr matrix objects like this (or, leave out constexpr to simply do this at runtime on the stack):
constexpr matrix mat{ {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };
Which will initialize an object at compile time of type:
const matrix<int, 4, 3>
If C++20 is supported by your compiler, I would recommend adding a "requires" clause to the CTAD to ensure that all sub-arrays are the same size (mathematically-speaking, n1 == n2 == n3 == n4, etc).

Using std::vector::emplace_back() (longer)
Using std::vector, instead of plain old array, you can use std::vector::emplace_back() to fill the vector:
template <typename T = double>
class Matrix {
std::vector<T> data;
size_t row{}, col{}; // Non-static member initialization
public:
Matrix(size_t m, size_t n) : data(std::vector<T>(m * n)), row(m), col(n)
{ // ^ Keep the order in which the members are declared
}
Matrix(std::initializer_list<std::initializer_list<T>> lst)
: row(lst.size())
, col(lst.size() ? lst.begin()->size() : 0) // Minimal validation
{
// Eliminate reallocations as we already know the size of matrix
data.reserve(row * col);
for (auto const& r : lst) {
for (auto const &c : r) {
data.emplace_back(c);
}
}
}
};
int main() {
Matrix<double> d = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
}
Using std::vector::insert() (better and shorter)
As #Bob mentioned in a comment, you can use std::vector::insert() member function, instead of the inner emplace_back loop:
template <typename T = double>
class Matrix {
std::vector<T> data;
size_t row{}, col{}; // Non-static member initialization
public:
Matrix(size_t m, size_t n) : data(std::vector<T>(m * n)), row(m), col(n)
{ // ^ Keep the order in which the members are declared
}
Matrix(std::initializer_list<std::initializer_list<T>> lst)
: row{lst.size()}
, col{lst.size() ? lst.begin()->size() : 0} // Minimal validation
{
// Eliminate reallocations as we already know the size of the matrix
data.reserve(row * col);
for (auto const& r : lst) {
data.insert(data.end(), r.begin(), r.end());
}
}
};
int main() {
Matrix<double> d = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
}
So, we're saying: For each row (r) in the lst, insert the content of the row from the beginning (r.begin()) to the end (r.end()) into the end of the empty vector, data, (in an empty vector semantically we have: empty_vec.begin() == empty_vec.end()).

i might be a bit late but here is code for generally initializing tensors, regardless if they are matricies or vectors or whatever tensor.You could restrict it by throwing runtime errors when its not a matrix. Below is the source code to extract the data from the initilizer_list its a bit hacky. The whole trick is that the constructor are implicitly called with the correct type.
#include <initializer_list>
#include <iostream>
using namespace std;
class ShapeElem{
public:
ShapeElem* next;
int len;
ShapeElem(int _len,ShapeElem* _next): next(_next),len(_len){}
void print_shape(){
if (next != nullptr){
cout <<" "<< len;
next->print_shape();
}else{
cout << " " << len << "\n";
}
}
int array_len(){
if (next != nullptr){
return len*next->array_len();
}else{
return len;
}
}
};
template<class value_type>
class ArrayInit{
public:
void* data = nullptr;
size_t len;
bool is_final;
ArrayInit(std::initializer_list<value_type> init) : data((void*)init.begin()), len(init.size()),is_final(true){}
ArrayInit(std::initializer_list<ArrayInit<value_type>> init): data((void*)init.begin()), len(init.size()),is_final(false){}
ShapeElem* shape(){
if(is_final){
ShapeElem* out = new ShapeElem(len,nullptr);
}else{
ArrayInit<value_type>* first = (ArrayInit<value_type>*)data;
ShapeElem* out = new ShapeElem(len,first->shape());
}
}
void assign(value_type** pointer){
if(is_final){
for(size_t k = 0; k < len;k ++ ){
(*pointer)[k] = ( ((value_type*)data)[k]);
}
(*pointer) = (*pointer) + len;
}else{
ArrayInit<value_type>* data_array = (ArrayInit<value_type>*)data;
for(int k = 0;k < len;k++){
data_array[k].assign(pointer);
}
}
}
};
int main(){
auto x = ArrayInit<int>({{1,2,3},{92,1,3}});
auto shape = x.shape();
shape->print_shape();
int* data = new int[shape->array_len()];
int* running_pointer = data;
x.assign(&running_pointer);
for(int i = 0;i < shape->array_len();i++){
cout << " " << data[i];
}
cout << "\n";
}
outputs
2 3
1 2 3 92 1 3
The shape() function will return you the shape of the tensor at each dimension. The array is exactly saved as it is written down. It's really import to create something like shape since this will give you the ordering in which the elements are.
If you want a specific index out of the tensor lets say a[1][2][3]
the correct position is in 1*a.shape[1]a.shape[2] + 2a.shape[2] + 3
Some minor details and tricks can be found in: https://github.com/martinpflaum/multidimensional_array_cpp

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js