Efficient implementation of operator[] for custom N-d arrays in C++ (using variadic templates) - templates

First I would like to say that I do this to learn about variadic templates and that I am aware of libraries such as boost or xTensor that already implement N-d arrays.
As the title says, I would like to implement operator[](size_t i) such that it returns a (N-1)-d array which is a slice of the original array at index i.
Here is the relevant part of the code (C++17):
template <size_t firstDim, size_t... RestDims>
class Array
{
private:
static constexpr size_t N = sizeof...(RestDims) + 1;
static constexpr size_t length = firstDim*(RestDims * ...);
static constexpr size_t Dims[] = {firstDim, RestDims...};
double* data_;
public:
// Base constructor
Array() {
data_ = new double[length];
std::fill_n(data_,length, 0.0);
}
// copy constructor
Array(Array<firstDim, RestDims...> const& other) {
data_ = new double[length];
std::copy(other.data_, other.data_ + length, data_);
}
~Array() {
delete[] data_; // memory obtained with new[] must be released with delete[]
}
};
// specialization for N=1
// ...
My first idea was to add a sub-array of type Array<RestDims...> which would be defined recursively down to a 1-d array (hence the need for a specialization of 1-d arrays). The operator[] simply moves the sub-array's pointer and returns a reference to it. This is quite efficient and works fine unless I use operator[] more than once in the same expression (both sub-arrays point to the same data since operator[] returns a reference).
My second idea was to have an array of pointers to sub-arrays and dynamically allocate sub-arrays in operator[] the first time it is used and then reuse it if the sub-array is accessed again. This solves the previous issue but it is inefficient since it is essentially the same as using nested vectors (and I do get similar performance).
Is there a way to be (almost) as efficient as the first implementation while solving the mentioned issue?
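One direction that might address this, offered as a minimal sketch rather than a definitive answer: have operator[] return a small non-owning view by value instead of a reference to a shared sub-array. Each call then produces its own object, so two slices used in the same expression no longer alias, and nothing is allocated. ArrayView and its stride member are hypothetical names that do not appear in the code above; the sketch assumes row-major storage, and the view must not outlive the buffer it points into.

```cpp
#include <cstddef>

// Hypothetical non-owning view over a contiguous row-major buffer.
template <std::size_t FirstDim, std::size_t... RestDims>
class ArrayView {
    double* data_;  // borrowed pointer; the owning Array must outlive the view

public:
    // Number of doubles in one slice along the first dimension
    // (the binary fold yields 1 when RestDims is empty).
    static constexpr std::size_t stride = (RestDims * ... * 1);

    explicit ArrayView(double* data) : data_(data) {}

    // Returns double& for a 1-d view, otherwise an (N-1)-d view by value.
    decltype(auto) operator[](std::size_t i) const {
        if constexpr (sizeof...(RestDims) == 0) {
            return (data_[i]);
        } else {
            return ArrayView<RestDims...>(data_ + i * stride);
        }
    }
};
```

An owning Array could expose the same operator[] by constructing ArrayView<RestDims...>(data_ + i * stride); since a view is just one pointer, the compiler can usually keep it in a register.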

Related

Create C++ array of unknown type

Is there some way to create an array in C++ where we don't know the type, but we do know its size and alignment requirements?
Let's say we have a template:
template<typename T>
T* create_array(size_t numElements) { return new T[numElements]; }
This works because each element T has known size and alignment, which is known at compile-time. But I'm looking for something where we can delegate the creation for later by simply extracting size and align and passing them on. This is the interface that I seek:
// my_header.hpp
// "internal" helper function, implementation in source file!
void* _create_array(size_t s, size_t a, size_t n);
template<typename T>
T* create_array(size_t numElements) {
return (T*)_create_array(sizeof(T), alignof(T), numElements);
}
Can we implement this in a source file?:
#include "my_header.hpp"
void* _create_array(size_t s, size_t a, size_t n) {
// ... ?
}
Requirements:
Each array element must have the correct alignment.
The total array size must be equal to s*n, and be aligned to a.
Type safety is assumed to be managed by the templated interface.
Indexing into the array should use correct size and align offsets.
I'm using C++20, so newer features may also be considered.
In advance, thank you!
While you can also implement this yourself, you can simply use std::allocator:
template<typename T>
constexpr T* create_array(size_t numElements) {
std::allocator<T> a;
return std::allocator_traits<decltype(a)>::allocate(a, numElements);
}
and then
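template<typename T>
constexpr void destroy_array(T* ptr, size_t numElements) noexcept {
std::allocator<T> a;
// deallocate must receive the same element count that was passed to allocate
std::allocator_traits<decltype(a)>::deallocate(a, ptr, numElements);
}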
The benefit over doing it yourself via a call to operator new is that this will also be usable in constant expression evaluation.
You then need to create objects in the returned storage via placement-new, std::allocator_traits<std::allocator<T>>::construct or std::construct_at.
Anyway, first make sure that you really need to do all of this memory management manually. Standard library containers already offer similar functionality, e.g. std::vector has a .reserve member function to reserve memory in which objects can be placed later via push_back, emplace_back, resize, etc.
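To make the allocate / construct / destroy / deallocate sequence concrete, here is a minimal sketch. It drops constexpr so that it also compiles as C++17, and sum_of_lengths_demo is a hypothetical helper added only for illustration. Exception safety is ignored: if a construct call throws, this sketch leaks.

```cpp
#include <cstddef>
#include <memory>
#include <string>

template <typename T>
T* create_array(std::size_t numElements) {
    std::allocator<T> a;
    return std::allocator_traits<std::allocator<T>>::allocate(a, numElements);
}

template <typename T>
void destroy_array(T* ptr, std::size_t numElements) noexcept {
    std::allocator<T> a;
    std::allocator_traits<std::allocator<T>>::deallocate(a, ptr, numElements);
}

// Hypothetical demo: builds "x", "xx", "xxx" in raw storage and sums their lengths.
inline std::size_t sum_of_lengths_demo() {
    using Traits = std::allocator_traits<std::allocator<std::string>>;
    std::allocator<std::string> a;
    const std::size_t n = 3;

    std::string* p = create_array<std::string>(n);  // raw storage, no objects yet
    for (std::size_t i = 0; i < n; ++i) {
        Traits::construct(a, p + i, i + 1, 'x');    // std::string(count, char)
    }

    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i].size();
    }

    for (std::size_t i = 0; i < n; ++i) {
        Traits::destroy(a, p + i);                  // run destructors first
    }
    destroy_array(p, n);                            // then release the storage
    return total;                                   // 1 + 2 + 3
}
```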
If you want to implement the above yourself, you basically need
#include<new>
//...
void* create_array(size_t s, size_t a, size_t n) {
// CAREFUL: check here that `s*n` does not overflow! Potential for vulnerabilities!
return ::operator new(s*n, std::align_val_t{a});
}
void destroy_array(void* ptr, size_t a) noexcept {
::operator delete(ptr, std::align_val_t{a});
}
(Note that identifiers starting with an underscore are reserved in the global namespace scope and may not be used there as function names, so I changed the name.)
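The overflow check mentioned in the comment could look roughly like the sketch below. The _checked suffix and the is_aligned helper are hypothetical names for illustration, and throwing std::bad_array_new_length is one possible reaction to an overflowing request, not the only one. The alignment is assumed to be a valid power of two, as alignof always yields.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>

void* create_array_checked(std::size_t s, std::size_t a, std::size_t n) {
    // Reject requests where s * n would wrap around.
    if (s != 0 && n > std::numeric_limits<std::size_t>::max() / s) {
        throw std::bad_array_new_length();
    }
    return ::operator new(s * n, std::align_val_t{a});
}

void destroy_array_checked(void* ptr, std::size_t a) noexcept {
    // The alignment must match the one used for the allocation.
    ::operator delete(ptr, std::align_val_t{a});
}

// Illustration-only helper: true if ptr is a multiple of a.
bool is_aligned(const void* ptr, std::size_t a) {
    return reinterpret_cast<std::uintptr_t>(ptr) % a == 0;
}
```

Because sizeof(T) is always a multiple of alignof(T), aligning the start of the block to a also aligns every element at offset i * s.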

Does std::vector range constructor [start, end) copy or just reference the data?

I wonder if the range constructor of std::vector does copy the data, or does it just reference it?
Have a look at this example:
vector<int> getVector() {
int arr[10];
for(int i=0; i<10; ++i) arr[i] = i;
return vector<int>(arr, arr+10);
}
Would this cause a bug (due to handing out a reference to the stack which is destroyed later) or is it fine, since it copies the data in the constructor?
Edit #1
For clarification: I'm looking for a more or less official resource that points out, which of the following pseudo code implementations of the constructor are valid. I know the signature of the constructor is different... but, you should get the idea.
Version A (just uses the given data internally)
template<typename T>
class vector {
private:
T* data;
int size;
public:
vector<T>(T* start, T* end) {
data = start;
size = (end - start);
}
};
Version B (explicitly copies the data)
template<typename T>
class vector {
private:
T* data;
int size;
public:
vector<T>(T* start, T* end) {
for(T* it = start; it < end; ++it) push_back(*it);
}
};
When in doubt, check the reference. The answer can be derived from Complexity section, although I'd agree there is no explicit confirmation:
Complexity: Makes only N calls to the copy constructor of T (where N
is the distance between first and last) and no reallocations if
iterators first and last are of forward, bidirectional, or random
access categories. It makes order N calls to the copy constructor of T
and order logN reallocations if they are just input iterators.
Like all constructors of std::vector<int>, this copies the integers. The same holds for methods like push_back and insert.
This is why std::vector actually has two template arguments. The second one is defaulted to std::allocator; it's the allocator used to allocate memory for the 10 integers (and perhaps a few more so that the vector can grow - see capacity).
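A quick way to convince yourself of the copy is to change the source array after the vector has been built. The function name below is made up for this sketch; it is a variation of the getVector example from the question.

```cpp
#include <vector>

// Variation of getVector() that overwrites the source after construction.
std::vector<int> make_vector_from_local() {
    int arr[10];
    for (int i = 0; i < 10; ++i) arr[i] = i;

    std::vector<int> v(arr, arr + 10);  // range constructor copies the ints

    arr[0] = 42;  // does not affect v, which owns its own storage
    return v;     // arr dies here; v's elements live on in the returned vector
}
```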
[Edit]
The actual code is most like Version B, but probably similar to
template<typename T>
class vector {
private:
T* _Data = nullptr;
size_t _Capacity = 0;
size_t _Used = 0;
public:
vector<T>(T* start, T* end) {
_Used = (end - start);
reserve(_Used); // Sets _Data, _Capacity
std::uninitialized_copy(start, end, _Data);
}
};
The C++ standard library is specified in a somewhat strange way.
It is specified saying what each method requires and what each method guarantees. It is not specified as in "vector is a container of values that it owns", even though that is the real underlying abstraction here.
Formally, what you are doing is safe not because "the vector copies", but because none of the preconditions of any of the methods of std vector are violated in the copy of the std vector your function returns.
Similarly, the values are set to be certain ones because of the postconditions of the constructor, and then the pre and post conditions of the copy constructor and/or C++17 prvalue "elision" rules.
But trying to reason about C++ code in this way is madness.
A std::vector semantically is a regular type with value semantics that owns its own elements. Regular types can be copied, and the copies behave sanely even if the original object is destroyed.
Unless you make a std::vector<std::reference_wrapper<int>> you are safe, and you are unsafe for the reference wrapper because you stored elements which are not regular value types.
The vector can not be defined as a vector of references as for example std::vector<int &>. So the code is valid. The vector does not contain references to elements of the array. It creates new elements of the type int (as the template argument of the vector) not a vector of references.

Initialize an array of reference_wrapper

Coming from C to C++, I'm trying to understand the world of smart pointers & references. I've got the following:
class Game {
public:
...
private:
...
static GamePiece EmptyPiece;
reference_wrapper<GamePiece> _board[N][M] = { ref(Game::EmptyPiece) };
vector<GamePlayer> _players = vector<GamePlayer>(N_PLAYERS, GamePlayer());
...
};
In the following situation, I would like each Player to hold a vector<GamePiece> and return references to these pieces, and put them in the _board. However, the following initialization of my _board yields
no default constructor exists for class "std::reference_wrapper
What am I missing here? In terms of ownership, each GamePlayer is owned by the Game (as can be seen), and the GamePieces are definitely owned by the GamePlayers, and that's why I want to use references.
It's this here
reference_wrapper<GamePiece> _board[N][M] = { ref(Game::EmptyPiece) };
You initialize the first element (with some brace elision thrown in) but leave the rest default initialized. Which can't happen, since std::reference_wrapper cannot be default initialized (just like the reference it models).
You can substitute the raw array for a std::vector of N*M size, and use the appropriate constructor which will copy initialize all the elements (like you do for _players). Of course, you'll need to do the calculations for indexing by yourself, but the memory will be laid out sequentially.
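As a rough sketch of that suggestion: the Board class, its at and place members, and the stand-in GamePiece struct below are all hypothetical and only illustrate the flat vector plus manual row-major indexing; they are not the asker's actual classes.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

struct GamePiece { int id; };  // stand-in for the real class

template <std::size_t N, std::size_t M>
class Board {
    // One flat, contiguous vector replaces reference_wrapper<GamePiece>[N][M].
    std::vector<std::reference_wrapper<GamePiece>> cells_;

public:
    // Every cell is copy-initialized from the same wrapper, so no default
    // constructor of reference_wrapper is ever needed.
    explicit Board(GamePiece& empty) : cells_(N * M, std::ref(empty)) {}

    GamePiece& at(std::size_t row, std::size_t col) {
        return cells_[row * M + col].get();
    }

    // Rebinds the cell to another piece; the piece itself is not copied.
    void place(std::size_t row, std::size_t col, GamePiece& piece) {
        cells_[row * M + col] = std::ref(piece);
    }
};
```

Keep in mind that if the pieces live in a player's vector<GamePiece>, growing that vector can invalidate the references stored on the board.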
Initializing an array of references is a pain in my opinion. The problem is -- as said in the answer by @StoryTeller -- that a reference_wrapper is not default constructible.
So you have to write your own workaround functions. I'll post code for the general problem of initializing an array of references and won't dive deeply into your question.
So consider the following case: you have an array arr holding elements of some type (e.g. a Game as in your question) that supports operator[]. You want an array of const- or non-const references to elements in this array specified by the indices ind. Here you go:
template<typename arr_t, size_t ... I>
auto get_const_reference_array(arr_t const& arr, std::array<size_t, sizeof ...(I)> const& ind, std::index_sequence<I...>)
{
using T = std::decay_t<decltype(std::declval<arr_t>().operator[](size_t{}))>;
return std::array<std::reference_wrapper<const T>, sizeof ...(I)> { std::cref(arr[std::get<I>(ind)]) ... };
}
template<typename arr_t, size_t dim>
auto get_const_reference_array(arr_t const& arr, std::array<size_t, dim> const& ind)
{
return get_const_reference_array(arr, ind, std::make_index_sequence<dim>{});
}
For the non-const version, remove all const's in this code and replace std::cref by std::ref.
Use it as
std::array<int,5> arr{{1,3,5,7,9}};
std::array<size_t,2> ind{{1,3}};
auto ref_arr = get_const_reference_array(arr, ind);
std::vector<int> vec{{1,3,5,7,9}};
auto ref_vec = get_const_reference_array(vec, ind);
ref_arr then is an array of size 2 which holds const references to arr[1] and arr[3], and the same for the vector (note however that references to a vector are in general not stable, i.e. by resizing or similar actions they might get invalidated).

How to initialize std::vector from C-style array?

What is the cheapest way to initialize a std::vector from a C-style array?
Example: In the following class, I have a vector, but due to outside restrictions, the data will be passed in as C-style array:
class Foo {
std::vector<double> w_;
public:
void set_data(double* w, int len){
// how to cheaply initialize the std::vector?
}
};
Obviously, I can call w_.resize() and then loop over the elements, or call std::copy(). Are there any better methods?
Don't forget that you can treat pointers as iterators:
w_.assign(w, w + len);
You use the word initialize so it's unclear if this is one-time assignment or can happen multiple times.
If you just need a one time initialization, you can put it in the constructor and use the two iterator vector constructor:
Foo::Foo(double* w, int len) : w_(w, w + len) { }
Otherwise use assign as previously suggested:
void set_data(double* w, int len)
{
w_.assign(w, w + len);
}
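Put together, both variants might look like the sketch below. The data() accessor is added here only so the result can be inspected, and the const qualifiers and std::size_t length are small liberties taken with the original signature.

```cpp
#include <cstddef>
#include <vector>

class Foo {
    std::vector<double> w_;

public:
    Foo() = default;

    // One-time initialization: two-iterator constructor in the init list.
    Foo(const double* w, std::size_t len) : w_(w, w + len) {}

    // Repeated (re)initialization: assign replaces the current contents.
    void set_data(const double* w, std::size_t len) { w_.assign(w, w + len); }

    // Hypothetical accessor, for inspection only.
    const std::vector<double>& data() const { return w_; }
};
```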
Well, Pavel was close, but there's an even simpler and more elegant solution to initialize a sequential container from a C-style array.
In your case:
w_ (array, std::end(array))
array will get us a pointer to the beginning of the array (didn't catch its name),
std::end(array) will get us an iterator to the end of the array. Note that std::end only works on a true array whose size is known at that point, not on a pointer such as the double* w parameter in the question.
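A small sketch of where this form applies, assuming the array is still a real array at the point of use (the function name and the sample values are made up):

```cpp
#include <iterator>
#include <vector>

inline std::vector<double> vector_from_true_array() {
    double samples[] = {1.0, 2.0, 3.0};  // type is double[3], size is known

    // std::begin/std::end deduce the size from the array type.
    return std::vector<double>(std::begin(samples), std::end(samples));

    // With a plain pointer (e.g. double* w) std::end(w) does not compile;
    // there you need the explicit length: std::vector<double>(w, w + len).
}
```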
The quick generic answer:
std::vector<double> vec(carray,carray+carray_size);
or question specific:
std::vector<double> w_(w,w+len);
based on above: Don't forget that you can treat pointers as iterators
You can 'learn' the size of the array automatically:
template<typename T, size_t N>
void set_data(const T (&w)[N]){
w_.assign(w, w+N);
}
Hopefully, you can change the interface to set_data as above. It still accepts a C-style array as its first argument. It just happens to take it by reference.
How it works
[ Update: See here for a more comprehensive discussion on learning the size ]
Here is a more general solution:
template<typename T, size_t N>
void copy_from_array(vector<T> &target_vector, const T (&source_array)[N]) {
target_vector.assign(source_array, source_array+N);
}
This works because the array is being passed as a reference-to-an-array. In C and C++, you cannot pass an array to a function by value; instead it decays to a pointer and you lose the size. But in C++, you can pass a reference to the array.
Passing an array by reference requires the types to match up exactly. The size of an array is part of its type. This means we can use the template parameter N to learn the size for us.
It might be even simpler to have this function which returns a vector. With appropriate compiler optimizations in effect, this should be faster than it looks.
template<typename T, size_t N>
vector<T> convert_array_to_vector(const T (&source_array)[N]) {
return vector<T>(source_array, source_array+N);
}
std::vector<double>::assign is the way to go, because it's little code. But how does it work, actually? Doesn't it resize and then copy? In the MS implementation of the STL that I am using, it does exactly that.
I'm afraid there's no faster way to implement (re)initializing your std::vector.

static array allocation issue!

I want to statically allocate the array. Look at the following code; it is not correct, but it will give you an idea of what I want to do:
class array
{
const int arraysize;
int array[arraysize];//i want to statically allocate the array as we can do it by non type parameters of templates
public:
array();
};
array::array():arraysize(10)
{
for(int i=0;i<10;i++)
array[i]=i;
}
main()
{
array object;
}
If your array size is always the same, make it a static member. Static members that are integral types can be initialized directly in the class definition, like so:
class array
{
static const int arraysize = 10;
int array[arraysize];
public:
array();
};
This should work the way you want. If arraysize is not always the same for every object of type array, then you cannot do this, and you will need to use template parameters, dynamically allocate the array, or use an STL container class (e.g. std::vector) instead.
It has to be done using template parameters, otherwise sizeof(array) would be different for every object.
This is how you would do it using template parameters.
template <int N>
class array
{
int data[N];
// ...
};
Or, you could use an std::vector if you don't mind dynamic allocation.
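Filled out a little, the template version might look like this sketch. FixedArray and its members are illustrative names, and the constructor simply mirrors the 0..N-1 fill from the question.

```cpp
#include <cstddef>

template <std::size_t N>
class FixedArray {
    int data_[N];  // size is part of the type, so sizeof is fixed per N

public:
    FixedArray() {
        for (std::size_t i = 0; i < N; ++i) data_[i] = static_cast<int>(i);
    }

    static constexpr std::size_t size() { return N; }

    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }
};

// The storage lives inside the object itself; no heap allocation is involved.
static_assert(sizeof(FixedArray<10>) >= 10 * sizeof(int),
              "object holds all N ints inline");
```

FixedArray<10> and FixedArray<20> are distinct types, which is exactly what lets the compiler know sizeof for each of them.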
C++ doesn't allow variable-length arrays (i.e. ones whose sizes are not compile-time constants). Allowing one within a struct would make it impossible to calculate sizeof(array), as the size could differ from one instance to another.
Consider using std::vector instead, if the size is known only at runtime. This also avoids storing the array size in a separate variable. Notice that allocating from heap (e.g. by std::vector) also allows bigger arrays, as the available stack space is very limited.
If you want it a compile-time constant, take a template parameter. Then you should be looking for Boost.Array, which already implements it.
The array size must be a compile time constant. You are almost there, you just need to initialize the const and make it a static as well. Or, use a dynamic array or a vector.
EDIT: note about this answer: This is most likely the wrong way to do this for your situation. But if you really need it to be an array (not a vector or whatever) and you really need it to be dynamically allocated, do the following:
class array
{
int *p_array;
public:
array(int size);
};
array::array(int size)
{
p_array = static_cast<int*>(std::malloc(size * sizeof(int))); // needs <cstdlib>; C++ does not convert void* implicitly
}
Just make sure you clean up (i.e. free p_array in your destructor).
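A minimal sketch of what that cleanup could look like, staying with malloc/free as in the snippet above. DynArray, size() and operator[] are names added for illustration; copying is disabled so that two objects can never free the same pointer.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

class DynArray {
    int* p_array_;
    std::size_t size_;

public:
    explicit DynArray(std::size_t size)
        : p_array_(static_cast<int*>(std::malloc(size * sizeof(int)))),
          size_(size) {
        // malloc reports failure with a null pointer instead of throwing.
        if (size != 0 && p_array_ == nullptr) throw std::bad_alloc();
    }

    ~DynArray() { std::free(p_array_); }  // free(nullptr) is a no-op

    // Copying would lead to a double free, so forbid it in this sketch.
    DynArray(const DynArray&) = delete;
    DynArray& operator=(const DynArray&) = delete;

    std::size_t size() const { return size_; }
    int& operator[](std::size_t i) { return p_array_[i]; }  // no bounds check
};
```

The elements start out uninitialized, just like with the raw malloc call, so write to them before reading.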