A minimalistic smart array (container) class template


I've written an array container class template (let's call it a smart array) for use on the BREW platform, which has very minimal C++ runtime support and doesn't allow many C++ constructs such as the standard library or exceptions. While I was writing it, a friend told me that something similar already exists in Boost, called MultiArray. I tried it, but the ARM compiler (RVCT) cries with hundreds of errors. I haven't looked at Boost.MultiArray's source. I started learning templates only recently; template metaprogramming interests me a lot, although I'm not sure whether this strictly qualifies as such.
So I want all my fellow C++ aficionados to review it: point out flaws, potential bugs, suggestions, optimizations, etc.; something like "you've not written your own Big Three, which might lead to...". Any criticism that will help me improve this class, and thereby my C++ skills, is welcome.
Edit: I've used std::vector since it's easily understood; later it will be replaced by a custom vector class template made to work on the BREW platform. C++0x syntax such as static_assert will also be removed from the final code.
smart_array.h
#include <vector>
#include <cassert>
#include <cstdarg>
using std::vector;
template <typename T, size_t N>
class smart_array
{
vector < smart_array<T, N - 1> > vec;
public:
explicit smart_array(vector <size_t> &dimensions)
{
assert(N == dimensions.size());
vector <size_t>::iterator it = ++dimensions.begin();
vector <size_t> dimensions_remaining(it, dimensions.end());
smart_array <T, N - 1> temp_smart_array(dimensions_remaining);
vec.assign(dimensions[0], temp_smart_array);
}
explicit smart_array(size_t dimension_1 = 1, ...)
{
static_assert(N > 0, "Error: smart_array expects 1 or more dimension(s)");
assert(dimension_1 > 0); // was "> 1"; every other dimension check requires only a non-zero size
va_list dim_list;
vector <size_t> dimensions_remaining(N - 1);
va_start(dim_list, dimension_1);
for(size_t i = 0; i < N - 1; ++i)
{
size_t dimension_n = va_arg(dim_list, size_t);
assert(dimension_n > 0);
dimensions_remaining[i] = dimension_n;
}
va_end(dim_list);
smart_array <T, N - 1> temp_smart_array(dimensions_remaining);
vec.assign(dimension_1, temp_smart_array);
}
smart_array<T, N - 1>& operator[](size_t index)
{
assert(index < vec.size() && index >= 0);
return vec[index];
}
size_t length() const
{
return vec.size();
}
};
template<typename T>
class smart_array<T, 1>
{
vector <T> vec;
public:
explicit smart_array(vector <size_t> &dimension) : vec(dimension[0])
{
assert(dimension[0] > 0);
}
explicit smart_array(size_t dimension_1 = 1) : vec(dimension_1)
{
assert(dimension_1 > 0);
}
T& operator[](size_t index)
{
assert(index < vec.size() && index >= 0);
return vec[index];
}
size_t length() const // const, matching the primary template
{
return vec.size();
}
};
Sample Usage:
#include "smart_array.h"
#include <iostream>
using std::cout;
using std::endl;
int main()
{
// testing 1 dimension
smart_array <int, 1> x(3);
x[0] = 0, x[1] = 1, x[2] = 2;
cout << "x.length(): " << x.length() << endl;
// testing 2 dimensions
smart_array <float, 2> y(2, 3);
y[0][0] = y[0][1] = y[0][2] = 0;
y[1][0] = y[1][1] = y[1][2] = 1;
cout << "y.length(): " << y.length() << endl;
cout << "y[0].length(): " << y[0].length() << endl;
// testing 3 dimensions
smart_array <char, 3> z(2, 4, 5);
cout << "z.length(): " << z.length() << endl;
cout << "z[0].length(): " << z[0].length() << endl;
cout << "z[0][0].length(): " << z[0][0].length() << endl;
z[0][0][4] = 'c'; cout << z[0][0][4] << endl;
// testing 4 dimensions
smart_array <bool, 4> r(2, 3, 4, 5);
cout << "r.length(): " << r.length() << endl;
cout << "r[0].length(): " << r[0].length() << endl;
cout << "r[0][0].length(): " << r[0][0].length() << endl;
cout << "r[0][0][0].length(): " << r[0][0][0].length() << endl;
// testing copy constructor
smart_array <float, 2> copy_y(y);
cout << "copy_y.length(): " << copy_y.length() << endl;
cout << "copy_y[0].length(): " << copy_y[0].length() << endl;
cout << copy_y[0][0] << "\t" << copy_y[1][0] << "\t" << copy_y[0][1] << "\t" <<
copy_y[1][1] << "\t" << copy_y[0][2] << "\t" << copy_y[1][2] << endl;
return 0;
}

If I'm understanding what you want from this type:
In short, it would be optimal to use the form:
template < typename T_, unsigned N_ >
struct t_array {
/* ... */
static const size_t Size = N_;
typedef T_ T;
T objects_[Size];
};
for many reasons, if you want only a fixed-size, fixed-type array. The compiler can make a lot of safe assumptions; in some cases this has reduced object size to 20% (compared to using std::vector) for me. It is also faster and safer. The trade-off is that if you use such templates everywhere, you may end up with much larger binaries (compared to using std::vector).
There is a class template, boost::array (in <boost/array.hpp>), which you should read.
Sorry if you don't find that helpful - I think reading at least one common production quality implementation (before venturing into new technologies) would help.
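A minimal sketch of how the fixed-size form above could be fleshed out without the standard containers. The names fixed_array and grid_2x3 are hypothetical, and it assumes every extent is known at compile time, which may not match what the question needs:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size array: storage lives inside the object, no heap use.
template <typename T, std::size_t N>
struct fixed_array {
    T objects_[N];

    // Bounds are checked only in debug builds (assert compiles away with NDEBUG).
    T& operator[](std::size_t i) { assert(i < N); return objects_[i]; }
    const T& operator[](std::size_t i) const { assert(i < N); return objects_[i]; }

    std::size_t length() const { return N; }
};

// Nesting gives a multi-dimensional array with compile-time extents.
typedef fixed_array<fixed_array<float, 3>, 2> grid_2x3;
```

Because the type is an aggregate, `grid_2x3 g = {};` zero-initialises all six floats, and `g[1][2]` indexes it the same way as the smart_array in the question.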

Related

Surprising behaviour with an unordered_set of pairs

How can an unordered_set hold both (0, 1) and (1, 0) if they have the same hash value?
#include <iostream>
#include <unordered_set>
#include <utility>
using namespace std;
struct PairHash
{
template <class T1, class T2>
size_t operator()(pair<T1, T2> const &p) const
{
size_t hash_first = hash<T1>{}(p.first);
size_t hash_second = hash<T2>{}(p.second);
size_t hash_combined = hash_first ^ hash_second;
cout << hash_first << ", " << hash_second << ", " << hash_combined << endl;
return hash_combined;
}
};
int main()
{
unordered_set<pair<int, int>, PairHash> map;
map.insert({0, 1});
map.insert({1, 0});
cout << map.size() << endl;
for (auto& entry : map) {
cout << entry.first << ", " << entry.second << endl;
}
return 0;
}
Output:
0, 1, 1
1, 0, 1
2
1, 0
0, 1

An unordered_set can hold one instance of each unique data value; it is not limited to holding data values with unique hash values. In particular, when two data values are different (according to their == operator) but both hash to the same value, the unordered_set will make arrangements to hold both of them regardless, usually at slightly reduced efficiency: any hash-based lookup for either of them leads to an internal data structure (a bucket) that holds both, which the lookup code has to iterate over until it finds the one it is looking for.
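As an illustration, here is a small sketch with a deliberately terrible hash that maps every key to the same value. ConstantHash and count_after_collisions are hypothetical names made up for this example:

```cpp
#include <cstddef>
#include <unordered_set>

// Deliberately bad hash (for illustration only): every key collides.
struct ConstantHash {
    std::size_t operator()(int) const { return 42; }
};

// Returns how many keys the set keeps when all hashes are equal.
inline std::size_t count_after_collisions() {
    std::unordered_set<int, ConstantHash> s;
    s.insert(1);
    s.insert(2);  // same hash as 1, but 1 == 2 is false, so it is stored
    s.insert(1);  // same hash and equal to an existing key, so it is rejected
    return s.size();
}
```

The hash only decides where to look; equality decides whether two keys are the same element.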

Why isn't that C++ const templated vector initialised before being used?

Update: the question is why the code below (MWE) works as it is and not as I would expect it to.
For personal convenience, I created the following templated vector const:
// shorthand for loops, etc.
template <size_t N>
const vector<size_t> range = []() {
vector<size_t> res(N);
for (size_t i = 0; i < N; i++) res[i] = i;
cout << "Created range<" << N << ">: [";
for (auto x: res) cout << x << ' ';
cout << ']' << endl;
return res;
}();
So that further, I can write more laconic loops like the following:
for (auto i : range<42>) do_something(i);
However, I realised (after some debugging) that it does not seem to be guaranteed that all required instantiations of range<N> are initialised before use! This is rather counter-intuitive, so I wonder whether I am doing something wrong.
More precisely, I have the following MWE:
#include <bits/stdc++.h>
using namespace std;
template <size_t N>
const vector<size_t> range = []() {
cout << "Initialising range<" << N << ">" << endl;
vector<size_t> result(N);
for (size_t i = 0; i < N; i++) result[i] = i;
return result;
}();
template <size_t K>
class Data {
private:
size_t m_code;
public:
size_t get_code() const { return m_code; }
constexpr static size_t cardinality = K + 1;
explicit Data(size_t code);
const static vector<Data> elems;
};
template <size_t K>
const vector<Data<K>> Data<K>::elems = []() {
cout << "Creating Data elements for K=" << K << endl;
vector<Data<K>> xs;
for (size_t i : range<Data<K>::cardinality>) xs.push_back(Data<K>(i));
return xs;
}();
template <size_t K>
Data<K>::Data(size_t code) {
m_code = code;
cout << "At the moment, range<" << K << "> is [";
for (auto k : range<K>)
cout << k << ' '; // <<< Shouldn't range<K> be already initialised here?..
cout << "] (len=" << range<K>.size() << ")" << endl;
}
int main() {
cout << ">>> Inside main()" << endl;
constexpr size_t K = 2;
cout << "Data elements:" << endl;
for (const auto &X : Data<K>::elems) {
cout << "Element Data(" << X.get_code() << ")" << endl;
}
cout << "Now, range<" << K << "> is [";
for (auto k : range<K>) cout << k << ' ';
cout << "] (len=" << range<K>.size() << ")" << endl;
}
This produces the following output:
Initialising range<3>
Creating Data elements for K=2
At the moment, range<2> is [] (len=0)
At the moment, range<2> is [] (len=0)
At the moment, range<2> is [] (len=0)
Initialising range<2>
>>> Inside main()
Data elements:
Element Data(0)
Element Data(1)
Element Data(2)
Now, range<2> is [0 1 ] (len=2)
I don't really understand why it is working as it is. I mean, I would expect a const vector (or any vector!) to be initialised before it is used and thus range<2> to be of length two any time I use it in the code.

The dynamic initialization of non-local static storage duration variables resulting from (non-explicit) template specializations is unordered, i.e. sequenced indeterminately, meaning that the order in which the initializations happen is unspecified. It does not take into account either dependencies between the variables, or order of definition, or order of instantiation.
Therefore your program has undefined behavior, since Data<2>::elems, instantiated from the use in main, has unordered dynamic initialization and uses range<2> and range<3>, both of which also have unordered dynamic initialization. Because it is unspecified whether the former or the latter are initialized first, it is possible that you access range<2> or range<3> before their initializations have begun, causing undefined behavior.
This can be resolved by using std::array instead of std::vector for range and in its initializer (and removing the cout statements in the initializer), so that the initializer becomes a constant expression. Then range<K> will not have dynamic initialization, but constant initialization, which is always performed before any dynamic initialization, i.e. before Data<K>::elems will use it.
In addition you should then declare range as constexpr to make sure that the initializer is indeed a constant expression. Otherwise you might still get dynamic initialization and the undefined behavior without warning, for example when you make a change that accidentally causes the initializer not to be a constant expression anymore.
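A minimal sketch of that change, assuming C++14 or later (variable templates and std::index_sequence). make_range is a hypothetical helper name, not something from the question:

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical helper: expands the indices 0..N-1 into an array at compile time.
template <std::size_t... Is>
constexpr std::array<std::size_t, sizeof...(Is)>
make_range(std::index_sequence<Is...>) {
    return {{Is...}};
}

// constexpr forces constant initialization; a non-constant initializer
// would now be a compile error instead of silent dynamic initialization.
template <std::size_t N>
constexpr std::array<std::size_t, N> range =
    make_range(std::make_index_sequence<N>{});

static_assert(range<3>.size() == 3, "size is known at compile time");
static_assert(range<3>[2] == 2, "contents are known at compile time");
```

Loops such as `for (auto i : range<42>)` keep working unchanged, since std::array is iterable in the same way as std::vector.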

Alternatives to std::array for objects of different types in C++11

I am looking for better solutions on how to organize and access my data.
My data is a set of structures (_array_10 and _array_20 in the example below) that contain std::array of different sizes (see my_data below).
Ideally, I would like to access it as if it were an array of structs with different lengths, but this is not allowed, since different lengths are different types.
The solution I have below works, but I find it extremely ugly (especially the array of void *).
Q1. Any ideas on how to have a safer, more efficient/portable, or at least less ugly solution?
Q2. Is the proposed solution without templates portable? It relies on the fact that the length is stored before the rest of the data, since casting the pointer to an object with wrong length would mess the access to all fields that come after the first field of variable length.
My limitations include:
C++11
standard libraries
no std::vector
memory usage prevents me from being able to simply allocate an array of my_data with the maximum possible length
the bulk of the data (_array_10, _array_20, etc) will be placed in a memory area reserved specially for it
Using data_view and templates requires knowing the lengths of the arrays at build time. It would be great if we could avoid that.
Question edited to include the solution proposed by Guillaume Racicot
#include <iostream>
#include <array>
std::array<void *, 2> _ptrs;
template <int length>
struct my_data
{
int array_length;
std::array<int, length> something;
std::array<int, length> data;
my_data()
{
array_length = length;
}
};
struct my_data_view
{
int array_length;
const int * something;
const int * data;
template <int length>
my_data_view(my_data<length> const & data_in) :
array_length(length),
something(data_in.something.data()),
data(data_in.data.data())
{}
};
template <int length>
void
print_element(int array_idx, int element)
{
my_data<length> * ptr = reinterpret_cast<my_data<length> *>(_ptrs[array_idx]);
std::cout << "array " << length << ", data[" << element << "] = " << ptr->data[element] << ".\n";
}
void
print_element(int array_idx, int element)
{
my_data<1> * ptr = reinterpret_cast<my_data<1> *>(_ptrs[array_idx]);
int length = ptr->array_length;
int data_to_print = 0;
switch (length)
{
case 10:
{
data_to_print = reinterpret_cast<my_data<10> *>(_ptrs[array_idx])->data[element];
break;
}
case 20:
{
data_to_print = reinterpret_cast<my_data<20> *>(_ptrs[array_idx])->data[element];
break;
}
}
std::cout << "array " << length << ", data[" << element << "] = " << data_to_print << ".\n";
}
void
print_element(my_data_view view, int element)
{
int length = view.array_length;
int data_to_print = view.data[element];
std::cout << "array " << length << ", data[" << element << "] = " << data_to_print << ".\n";
}
int
main()
{
my_data<10> _array_10;
my_data<20> _array_20;
_ptrs[0] = static_cast<void *>(&_array_10);
_ptrs[1] = static_cast<void *>(&_array_20);
_array_10.data[5] = 11;
_array_20.data[5] = 22;
std::cout << "using template\n";
print_element<10>(0, 5);
print_element<20>(1, 5);
std::cout << "\nwithout template\n";
print_element(0, 5);
print_element(1, 5);
std::cout << "\nusing data_view\n";
print_element(my_data_view(_array_10), 5);
print_element(my_data_view(_array_20), 5);
}

You could create a dynamic view class that doesn't allocate:
struct my_data_view
{
int array_length;
std::span<const int> something; // const int: the view is built from a const my_data
std::span<const int> data;
template<int length>
my_data_view(my_data<length> const& data) :
array_length{length}, something{data.something}, data{data.data}
{}
};
Spans are simply a pointer and a size. If you don't have access to std::span (which is from C++20), you can replace those members with const int* and use array_length for the size.
This my_data_view type is used like that:
void
print_element(my_data_view view, int element)
{
int length = view.array_length;
int data_to_print = view.data[element];
std::cout << "array " << length << ", data[" << element << "] = " << data_to_print << ".\n";
}
This is the code that will work both with std::span and simple int*.

Container initialization in C++98

I have to construct an ordered container (which must be iterable) with the following rule:
If the condition is true, the container is {1,0}, else it's {0,1}
I have the following code, but I don't find it "elegant":
vector<int> orderedSides;
if (condition)
{
orderedSides.push_back(1);
orderedSides.push_back(0);
}
else
{
orderedSides.push_back(0);
orderedSides.push_back(1);
}
Is there a better way to do this (from concision and performance point of view)?

You might implement something like this:
vector<int> orderedSides(2, 0);
(condition ? orderedSides.front() : orderedSides.back()) = 1;
which is a little bit shorter than explicit if clauses.
As @Deduplicator mentioned below, we might rewrite the second line in a more concise way:
orderedSides[!condition] = 1;
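The concise form works because !condition is a bool, which converts to 0 or 1 when used as an index. A minimal C++98-compatible sketch, where make_ordered_sides is a hypothetical wrapper name:

```cpp
#include <vector>

// Hypothetical helper wrapping the two-line idiom (C++98-compatible).
std::vector<int> make_ordered_sides(bool condition) {
    std::vector<int> orderedSides(2, 0);  // start as {0, 0}
    orderedSides[!condition] = 1;         // true -> index 0, false -> index 1
    return orderedSides;
}
```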

vector<int> orderedSides;
orderedSides.push_back(condition ? 1 : 0);
orderedSides.push_back(condition ? 0 : 1);
I don't think it's more performant but I find it more elegant.

You could compromise between efficiency and avoiding repetition: initialise the first element from the condition and the second from the first.
vector<int> orderedSides(1, bool(condition)) ;
orderedSides.push_back(!orderedSides.back());

orderedSides.push_back(0);
orderedSides.push_back(1);
if (condition)
std::iter_swap(orderedSides.begin(), orderedSides.begin()+1);
I know this costs a few extra operations; I offer it as one candidate.

If building the elements (the ints in your question, whatever it is in real life) is free and side-effect-less:
static const int data[] = { 0, 1, 0 };
std::vector<int> orderedSides (data+condition, data+condition+2);
Full program example:
#include <iostream>
#include <vector>
std::vector<int> make(bool cond)
{
static const int data[] = { 0, 1, 0 };
return std::vector<int> (data+cond, data+cond+2);
}
std::ostream& operator<<(std::ostream& os, const std::vector<int>& v)
{
return os << "{ " << v[0] << ", " << v[1] << " }";
}
int main()
{
std::cout << "true: " << make(true) << "\n"
<< "false: " << make(false) << "\n";
}
Prints:
true: { 1, 0 }
false: { 0, 1 }

You can populate a std::vector from an array, even in C++98.
Here's an example:
#include <iostream>
#include <vector>
int main() {
bool condition = false;
std::cout << "condition is: " << std::boolalpha << condition << '\n';
int arr[][2] = {{0,1}, {1,0}};
int index = condition;
std::vector<int> v(arr[index], arr[index]+2);
for (int i = 0; i < v.size(); i++)
std::cout << v[i] << ' ';
std::cout << '\n';
}
The output is:
$ g++ tt.cc && ./a.out
condition is: false
0 1
For reference:
http://en.cppreference.com/w/cpp/container/vector/vector

Reference to a partial segment of a vector?

I have a black box C++ function whose source code I don't have access to:
void blackbox(vector<int> &input);
This function modifies the elements of the input vector in an unknown manner.
The problem I have now is that I want to apply the black box function only for a partial segment of a vector, for example,
the last 500 elements of a vector. So, this is the routine that I wrote to attain this goal:
vector<int> foo (1000, 5); // (count, value): 1000 elements, so foo.end()-500 is valid
vector<int> bar (foo.end()-500,foo.end());
blackbox(bar);
swap_ranges(foo.end()-500,foo.end(),bar.begin());
This code may work, but is there a better way to do this?
It would be good if I can define a vector reference only for a segment of
an existing vector, instead of creating a copy.
I am not so comfortable with the copying and swapping parts in the above code; since this routine is
invoked so frequently, I think the repeated copying and swapping slows down the code.
If I knew the exact operations done by the black box, I would rewrite the function so that it takes vector iterators as the input
arguments. Unfortunately, this is not possible at the moment.

There's no well-defined way to achieve this functionality. With huge caveats and warnings, it can (for one GCC version at least) be hacked as below, or you could perhaps write something with better defined behaviour but based on your compiler's current std::vector implementation....
So... hacked. This will not work if insert/erase/resize/reserve/clear/push_back or any other operation affecting the overall vector is performed. It may not be portable / continue working / work with all optimisation levels / work on Tuesdays / use at own risk etc. It depends on the empty base class optimisation.
You need a custom allocator but there's a catch: the allocator can't have any state or it'll change the binary layout of the vector object, so we end up with this:
#include <iostream>
#include <vector>
template <typename Container> // easy to get this working...
void f(Container& v)
{
std::cout << "f() v.data() " << v.data() << ", v.size() " << v.size() << '\n';
for (int& n : v) n += 10;
}
void g(std::vector<int>& v) // hard to get this working...
{
std::cout << "g() v.data() " << v.data() << ", v.size() " << v.size() << '\n';
for (int& n : v) n += 100;
}
int* p_; // ouch: can't be a member without changing vector<> memory layout
struct My_alloc : std::allocator<int>
{
// all no-ops except allocate() which returns the constructor argument...
My_alloc(int* p) { p_ = p; }
template <class U, class... Args>
void construct(U* p, Args&&... args) { std::cout << "My_alloc::construct(U* " << p << ")\n"; }
template <class U> void destroy(U* p) { std::cout << "My_alloc::destroy(U* " << p << ")\n"; }
pointer allocate(size_type n, std::allocator<void>::const_pointer hint = 0)
{
std::cout << "My_alloc::allocate() return " << p_ << "\n";
return p_;
}
void deallocate(pointer p, size_type n) { std::cout << "deallocate\n"; }
template <typename U>
struct rebind { typedef My_alloc other; };
};
int main()
{
std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
std::cout << "main() v.data() " << v.data() << '\n';
My_alloc my_alloc(&v[3]); // first element to "take over"
std::vector<int, My_alloc> w(3, my_alloc); // num elements to "take over"
f(w);
g(reinterpret_cast<std::vector<int>&>(w));
for (int n : v) std::cout << n << ' ';
std::cout << '\n';
std::cout << "sizeof v " << sizeof v << ", sizeof w " << sizeof w << '\n';
}
Output:
main() v.data() 0x9d76008
My_alloc::allocate() return 0x9d76014
My_alloc::construct(U* 0x9d76014)
My_alloc::construct(U* 0x9d76018)
My_alloc::construct(U* 0x9d7601c)
f() v.data() 0x9d76014, v.size() 3
g() v.data() 0x9d76014, v.size() 3
0 1 2 113 114 115 6 7 8 9
sizeof v 12, sizeof w 12
My_alloc::destroy(U* 0x9d76014)
My_alloc::destroy(U* 0x9d76018)
My_alloc::destroy(U* 0x9d7601c)
deallocate
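For comparison, the well-defined route is still the copy-and-copy-back approach from the question. A sketch of how it might be wrapped so the scratch buffer is reused between calls, which should avoid reallocating on every invocation: apply_to_tail is a hypothetical name, and the blackbox body here is only a stand-in for the real opaque function.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in for the opaque function from the question (hypothetical body).
void blackbox(std::vector<int>& input) {
    for (std::size_t i = 0; i < input.size(); ++i) input[i] += 1;
}

// Applies blackbox to the last `count` elements of v, using a caller-owned
// scratch vector so its capacity can be reused across calls.
void apply_to_tail(std::vector<int>& v, std::size_t count,
                   std::vector<int>& scratch) {
    if (count > v.size()) count = v.size();
    const std::ptrdiff_t off = static_cast<std::ptrdiff_t>(count);
    scratch.assign(v.end() - off, v.end());  // copy the segment out
    blackbox(scratch);
    // blackbox might resize scratch, so copy back at most `count` elements.
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(
        std::min(count, scratch.size()));
    std::copy(scratch.begin(), scratch.begin() + n, v.end() - off);
}
```

This still copies the segment twice per call, but it relies on nothing beyond documented std::vector behaviour.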
