std::vector and contiguous memory of multidimensional arrays - c++

I know that the standard does not force std::vector to allocate contiguous memory blocks, but all implementations obey this nevertheless.
Suppose I wish to create a vector of a multidimensional, static array. Consider 2 dimensions for simplicity, and a vector of length N. That is I wish to create a vector with N elements of, say, int[5].
Can I be certain that all N*5 integers are now contiguous in memory? So that I in principle could access all of the integers simply by knowing the address of the first element? Is this implementation dependent?
For reference the way I currently create a 2D array in a contiguous memory block is by first making a (dynamic) array of float* of length N, allocating all N*5 floats in one array and then copying the address of every 5th element into the first array of float*.

The standard does require the memory of an std::vector to be
contiguous. On the other hand, if you write something like:
std::vector<std::vector<double> > v;
the global memory (all of the v[i][j]) will not be contiguous. The
usual way of creating 2D arrays is to use a single
std::vector<double> v;
and calculate the indexes, exactly as you suggest doing with float.
(You can also create a second std::vector<float*> with the addresses
if you want. I've always just recalculated the indexes, however.)

Elements of a vector are guaranteed to be contiguous as per the C++ standard.
Quotes from the standard are as follows:
From n2798 (draft of C++0x):
23.2.6 Class template vector [vector]
1 A vector is a sequence container that supports random access iterators. In addition, it supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time. Storage management is handled automatically, though hints can be given to improve efficiency. The elements of a vector are stored contiguously, meaning that if v is a vector where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().
C++03 standard (23.2.4.1):
The elements of a vector are stored contiguously, meaning that if v is a vector where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().
Also see Herb Sutter's views on the same.
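To connect the quoted identity back to the question, here is a minimal sketch that uses std::vector<std::array<int, 5>> as the "vector of int[5]" (a raw int[5] cannot be a vector element because arrays are not assignable). The names Row, rows_are_contiguous, make_rows and flat_at are made up for this illustration. The rows themselves are guaranteed to sit back to back; that sizeof(Row) equals 5 * sizeof(int) holds on common implementations but is an assumption here, not something the standard spells out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::array<int, 5>;  // hypothetical alias for one "int[5]" row

// Checks the identity quoted from the standard: &v[n] == &v[0] + n.
bool rows_are_contiguous(const std::vector<Row>& v) {
    for (std::size_t n = 0; n < v.size(); ++n)
        if (&v[n] != &v[0] + n) return false;
    return true;
}

// Builds n_rows rows and fills them with 0, 1, 2, ... in row-major order.
std::vector<Row> make_rows(std::size_t n_rows) {
    std::vector<Row> v(n_rows);
    int next = 0;
    for (auto& row : v)
        for (int& x : row) x = next++;
    return v;
}

// Accesses the k-th integer overall without doing pointer arithmetic
// across row boundaries.
int& flat_at(std::vector<Row>& v, std::size_t k) {
    assert(k < v.size() * 5);
    return v[k / 5][k % 5];
}
```

Going through flat_at instead of walking a raw int* from the first element keeps the code well defined even if an implementation were to pad Row.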

As @Als already pointed out, yes, std::vector (now) guarantees contiguous allocation. I would not, however, simulate a 2D matrix with an array of pointers. Instead, I'd recommend one of two approaches. The simpler (by far) is to just use operator() for subscripting, and do a multiplication to convert the 2D input to a linear address in your vector:
template <class T>
class matrix2D {
    std::vector<T> data;
    int columns;
public:
    T &operator()(int x, int y) {
        return data[y * columns + x];
    }
    matrix2D(int x, int y) : data(x*y), columns(x) {}
};
If, for whatever reason, you want to use matrix[a][b] style addressing, you can use a proxy class to handle the conversion. Though it was for a 3D matrix instead of 2D, I posted a demonstration of this technique in a previous answer.
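For the 2D case, the proxy idea could look roughly like the sketch below. This is not the code from the linked answer; matrix2D_proxy, row_proxy and raw are names invented here. The outer operator[] returns a small row object, and that object's own operator[] picks the column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <class T>
class matrix2D_proxy {  // hypothetical name
    std::vector<T> data;
    std::size_t columns;

    // Lightweight view of one row, returned by the outer operator[].
    class row_proxy {
        T* row;
        std::size_t columns;
    public:
        row_proxy(T* r, std::size_t c) : row(r), columns(c) {}
        T& operator[](std::size_t x) {
            assert(x < columns);
            return row[x];
        }
    };

public:
    // x is the number of columns, y the number of rows.
    matrix2D_proxy(std::size_t x, std::size_t y) : data(x * y), columns(x) {}

    // m[y] yields a row view, so m[y][x] reaches data[y * columns + x].
    row_proxy operator[](std::size_t y) {
        assert(y * columns < data.size());
        return row_proxy(data.data() + y * columns, columns);
    }

    // Exposes the flat buffer, e.g. for APIs that want a plain pointer.
    const T* raw() const { return data.data(); }
};
```

With this, m[1][2] = 7 writes to the same element that data[1 * columns + 2] names in the operator() version.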

For reference the way I currently create a 2D array in a contiguous memory block is by first making a (dynamic) array of float* of length N, allocating all N*5 floats in one array and then copying the address of every 5th element into the first array of float*.
That's not a 2D array, that's an array of pointers. If you want a real 2D array, this is how it's done:
float (*p)[5] = new float[N][5];
p [0] [0] = 42; // access first element
p[N-1][4] = 42; // access last element
delete[] p;
Note there is only a single allocation. May I suggest reading more about using arrays in C++?

Under the hood, a vector may look approximately like (p-code):
class vector<T> {
T *data;
size_t s;
};
Now if you make a vector<vector<T> >, there will be a layout like this
vector<vector<T>> --> data {
vector<T>,
vector<T>,
vector<T>
};
or in "inlined" form
vector<vector<T>> --> data {
{data0, s0},
{data1, s1},
{data2, s2}
};
Yes, the vector-vector therefore uses contiguous memory, but no, not as you'd like it. It most probably stores an array of pointers (and some other variables) to external places.
The standard only requires that the data of a vector is contiguous, but not the vector as a whole.
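A small sketch of what that means in practice, with helper names (outer_objects_contiguous, flatten) that are made up for this post: the inner vector objects do sit back to back inside the outer buffer, but each one owns a separate heap block, so getting one contiguous run of elements requires copying them into a single vector.

```cpp
#include <cstddef>
#include <vector>

// True if the inner vector *objects* sit back to back in the outer buffer.
// This says nothing about where their elements live.
bool outer_objects_contiguous(const std::vector<std::vector<int>>& nested) {
    for (std::size_t i = 0; i < nested.size(); ++i)
        if (&nested[i] != &nested[0] + i) return false;
    return true;
}

// Copies every row into one contiguous buffer, in row order.
std::vector<int> flatten(const std::vector<std::vector<int>>& nested) {
    std::vector<int> flat;
    for (const auto& row : nested)
        flat.insert(flat.end(), row.begin(), row.end());
    return flat;
}
```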

A simple class to create, as you call it, a 2D array, would be something like:
template <class T>
class Array2D {
private:
    T *m_data;
    int m_stride;
public:
    Array2D(int dimY, int dimX) : m_data(new T[dimX * dimY]), m_stride(dimX) {}
    ~Array2D() { delete[] m_data; }
    // Copying is disabled: a copy would delete[] the same buffer twice.
    Array2D(const Array2D&) = delete;
    Array2D& operator=(const Array2D&) = delete;
    T* operator[](int row) { return m_data + m_stride * row; }
};
It's possible to use this like:
Array2D<int> myArray(30,20);
for (int i = 0; i < 30; i++)
    for (int j = 0; j < 20; j++)
        myArray[i][j] = i + j;
Or even pass &myArray[0][0] as address to low-level functions that take some sort of "flat buffers".
But as you see, it turns naive expectations around in that it's myarray[y][x].
Generally, if you interface with code that requires some sort of classical C-style flat array, then why not just use that?
Edit: As said, the above is simple. No bounds check attempts whatsoever. Just like, "an array".

Related

Sorting a C 2D array via std::sort

I have a 2D array a[][40]. I'm trying to sort it by calling std::sort, and I have written the Compare function. However, C++ wants me to have a std::vector to be sorted, not a simple array and I want the sorted array to be a itself, I don't want to create another array and save the sorting result there. It seems there are a lot of ways to achieve that. I could think of five ways, but none of them seems to be efficient and working.
1)
Directly use std::sort(std::begin(a), std::begin(a) + something, cmp);
It doesn't work, because std::begin doesn't know how to point to the beginning of a 2D array. Furthermore, it'd sort incorrectly even if it compiled, since a 2D array is not an array of references to arrays, but consecutive arrays (unlike Java)
Playground: https://godbolt.org/g/1tu3TF
2)
std::vector<unsigned char[40]> k(a, a + x);
std::sort(k.begin(), k.end(), cmp);
Then copy everything back to a
It doesn't work, because it's a 2D array, and it can't be sorted this way using std::sort. In contrast to the first trial, this one uses twice as much memory, and copies everything twice (if it worked)!
Playground: https://godbolt.org/g/TgCT6Z
3)
std::vector<int> k(x);
for (int i = 0; i < x; k[i] = i, i++);
std::sort(k.begin(), k.end(), cmp2);
Then change the order of a to be the same of k;
The idea is simple, create a vector of representative "pointers", sort them (as the cmp2 function secretly accesses a and compares the values), then make a have the same order with k.
In the end, the re-ordering loop will be very complex, will require a large, temporary variable. Besides, for cmp2 to access the values of a, a global variable-pointer that points to a must be created, which is "bad" code.
Playground: https://godbolt.org/g/EjdMo7
4)
For all unsigned char[40], a struct can be created and their values can be copied to structs. Comparison and = operators will need to be declared. After sorted, they can be copied back to a.
It'd be a great solution if the arrays didn't have to be copied to structs to use struct's operators, but they need to be copied, so all values will be copied twice, and twice-as-needed memory will be used.
5)
For all unsigned char[40], a struct that has a pointer to them can be created. They can be sorted by the pointed values, and the result can be saved to a pointer array.
It's probably the best option, although the result is a pointer array instead of a. Another reason why it's good is that it doesn't move the arrays, only the pointers.
To sum up, I need to sort the 2D array a[][40] via std::sort, but I haven't decided on the best way. It seems there's a "best way to do that" which I can't think of. Could you please help me?
EDIT: To clarify, I want {{3,2}{1,4}} to become {{1,4}{3,2}}
The problem is not in iterating a 2D array. Provided the column size is a constexpr value, pointers to arrays are nice iterators.
But all C++ sort (or mutating) algorithms require the underlying type to be move constructible and move assignable and an array is not assignable. But wrapping the underlying arrays can be enough:
template <class T, int sz>
class wrapper {
T* base;
bool own; // a trick to allow temporaries: only they have own == true
public:
// constructor using an existing array
wrapper(T* arr): base(arr), own(false) {}
~wrapper() { // destructor
if (own) {
delete[] base; // destruct array for a temporary wrapper
}
}
// move constructor - in fact copy to a temporary array
wrapper(wrapper<T, sz>&& src): base(new T[sz]), own(true) {
for(int i=0; i<sz; i++) {
base[i] = src.base[i];
}
}
// move assignment operator - in fact also copy
wrapper<T, sz>& operator = (wrapper<T, sz>&& src) {
for(int i=0; i<sz; i++) {
base[i] = src.base[i];
}
return *this;
}
// native ordering based on lexicographic string order
bool operator < (const wrapper<T, sz>& other) const {
return std::char_traits<char>::compare(base, other.base, sz) < 0;
}
const T* value() const { // access to the underlying string for tests
return base;
}
};
Then, you can sort a C compatible 2D array with any C++ sort algo:
std::vector<wrapper<char, 40> > v { &arr[0], &arr[sz] }; // pointers are iterators...
std::sort(v.begin(), v.end()); // and that's all!
for (int i=0; i<sz; i++) { // control
std::cout << arr[i] << std::endl;
}
The overhead is a vector of structures containing a pointer and a bool, but what is sorted is actually the original 2D array.
Of course, as the C library is accessible from C++, qsort would certainly be easier for sorting a C compatible 2D array. But this way allows the use of stable_sort or partial_sort if they are relevant.
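For comparison, the qsort route could be sketched as follows. It assumes the rows should be ordered by plain byte-wise comparison, which may differ from the asker's own Compare function; ROW_BYTES, compare_rows and sort_rows are names invented for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

const std::size_t ROW_BYTES = 40;  // row width from the question

// Compares two rows lexicographically, byte by byte.
int compare_rows(const void* lhs, const void* rhs) {
    return std::memcmp(lhs, rhs, ROW_BYTES);
}

// Sorts the rows of a in place; qsort moves each 40-byte row as one element.
void sort_rows(unsigned char (*a)[ROW_BYTES], std::size_t n_rows) {
    std::qsort(a, n_rows, sizeof a[0], compare_rows);
}
```

This works because qsort only needs the element size and copies raw bytes, so it does not care that arrays are not assignable.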

C++ N nested vectors at runtime

In C++ (with or without boost), how can I create an N-dimensional vector where N is determined at runtime?
Something along the lines of:
PROCEDURE buildNVectors(int n)
std::vector < n dimensional std::vector > *structure = new std::vector< n dimensional std::vector >()
END
If passed 1, a vector would be allocated. If passed 2, a 2d nested matrix would be allocated. If passed 3, a 3d cube is allocated. etc.
Unfortunately you will not be able to do this. A std::vector is a template type, and as such its type must be known at compile time. Since its type is used to determine what dimensions it has, you can only set that at compile time.
The good news is that you can make your own class that uses a single-dimension vector as the data storage and fakes the extra dimensions using math. This does make it tricky to access the vector, though. Since you will not know how many dimensions the vector has, you need a way to index into the container with an arbitrary number of indices. What you could do is overload the function call operator with a std::initializer_list, which would allow you to index into it with something like
my_fancy_dynamic_dimension_vector({x,y,z,a,b,c});
A real rough sketch of what you could have would be
class dynamic_vector
{
    std::vector<int> data;
    // total element count is the product of all extents
    static int multiply(std::initializer_list<int> dims)
    {
        int product = 1;
        for (auto e : dims)
            product *= e;
        return product;
    }
public:
    dynamic_vector(std::initializer_list<int> dims) : data(multiply(dims)) {}
    int & operator()(std::initializer_list<int> indexes)
    {
        // code here to translate the values in indexes into a 1d position
    }
};
Or better yet, just use a boost::multi_array
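If Boost is not an option, the missing index translation in the sketch above could be filled in along these lines. This is one possible layout (row-major, last index varies fastest), not the only one, and dynamic_vector_sketch is a name made up here. The constructor takes a std::vector of extents so that the number of dimensions really can be decided at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <utility>
#include <vector>

class dynamic_vector_sketch {  // hypothetical name
    std::vector<std::size_t> dims;  // extent of each dimension
    std::vector<int> data;          // all elements, row-major

    static std::size_t product(const std::vector<std::size_t>& d) {
        std::size_t p = 1;
        for (std::size_t e : d) p *= e;
        return p;
    }

public:
    // dims is declared before data, so it is ready when data is sized.
    explicit dynamic_vector_sketch(std::vector<std::size_t> d)
        : dims(std::move(d)), data(product(dims)) {}

    // Translates one index per dimension into a 1D position.
    int& operator()(std::initializer_list<std::size_t> idx) {
        assert(idx.size() == dims.size());
        std::size_t pos = 0, k = 0;
        for (std::size_t i : idx) {
            assert(i < dims[k]);
            pos = pos * dims[k] + i;  // Horner-style row-major offset
            ++k;
        }
        return data[pos];
    }

    std::size_t size() const { return data.size(); }
};
```

For extents {2, 3, 4}, the index {1, 2, 3} maps to (1 * 3 + 2) * 4 + 3 = 23, the last of the 24 elements. If the indices themselves are only known at runtime, an overload taking a std::vector of indices would be needed as well.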

use std::vector for dynamically allocated 2d array?

So I am writing a class, which has 1d-arrays and 2d-arrays, that I dynamically allocate in the constructor
class Foo {
public:
    Foo(int num1, int num2);
private:
    int** array2d;
    int* array1d;
};
Foo::Foo(int num1, int num2) {
    array2d = new int*[num1];
    for (int i = 0; i < num1; i++)
    {
        array2d[i] = new int[num2];
    }
    array1d = new int[num1];
}
Then I will have to delete every 1d-array and every array in the 2d array in the destructor, right?
I want to use std::vector for not having to do this. Is there any downside of doing this? (makes compilation slower etc?)
TL;DR: when to use std::vector for dynamically allocated arrays, which do NOT need to be resized during runtime?
vector is fine for the vast majority of uses. Hand-tuned scenarios should first attempt to tune the allocator [1], and only then modify the container. Correctness of memory management (and your program in general) is worth much, much more than any compilation time gains.
In other words, vector should be your starting point, and until you find it unsatisfactory, you shouldn't care about anything else.
As an additional improvement, consider using a 1-dimensional vector as backend storage and only providing a 2-dimensional indexed view. This can improve cache locality and overall performance, while also making some operations, like copying the whole structure, much easier.
[1] The second of the two template parameters that vector accepts, which defaults to a standard allocator for the given type.
There should not be any drawbacks, since vector guarantees contiguous memory. But if the size is fixed and C++11 is available, std::array may be an option, among others:
it doesn't allow resizing
it avoids the reallocations that a vector may perform, depending on how the vector is initialized
its size is hardcoded in the instructions (template argument). See Ped7g's comment for a more detailed description
A 2D array is not an array of pointers.
If you define it this way, each row/column can have a different size.
Furthermore, the elements won't be in sequence in memory.
This might lead to poor performance, as the prefetcher won't be able to predict your access patterns very well.
Therefore it is not advised to nest std::vectors inside each other to model multi-dimensional arrays.
A better approach is to map a contiguous chunk of memory onto a multi-dimensional space by providing custom access methods.
You can test it in the browser: http://fiddle.jyt.io/github/3389bf64cc6bd7c2218c1c96f62fa203
#include<vector>
template<class T>
struct Matrix {
Matrix(std::size_t n=1, std::size_t m=1)
: n{n}, m{m}, data(n*m)
{}
Matrix(std::size_t n, std::size_t m, std::vector<T> const& data)
: n{n}, m{m}, data{data}
{}
//Matrix M(2,2, {1,1,1,1});
T const& operator()(size_t i, size_t j) const {
return data[i*m + j];
}
T& operator()(size_t i, size_t j) {
return data[i*m + j];
}
size_t n;
size_t m;
std::vector<T> data;
using ScalarType = T;
};
You can implement operator[] by returning a VectorView which has access to data, an index, and the dimensions.

Making only the outer vector in vector<vector<int>> fixed

I want to create a vector<vector<int>> where the outer vector is fixed (always containing the same vectors), but the inner vectors can be changed. For example:
int n = 2; //decided at runtime
assert(n>0);
vector<vector<int>> outer(n); //outer vector contains n empty vectors
outer.push_back(vector<int>()); //modifying outer vector - this should be error
auto outer_it = outer.begin();
(*outer_it).push_back(3); //modifying inner vector. should work (which it does).
I tried doing simply const vector<vector<int>>, but that makes even the inner vectors const.
Is my only option to create my own custom FixedVectors class, or are there better ways out there to do this?
by definition,
Vectors are sequence containers representing arrays that can change in
size. Just like arrays, vectors use contiguous storage locations for
their elements, which means that their elements can also be accessed
using offsets on regular pointers to its elements, and just as
efficiently as in arrays. But unlike arrays, their size can change
dynamically, with their storage being handled automatically by the
container.
If you aren't looking for a data structure that changes in size, a vector probably isn't the best choice for the outer layer. How about using an array of vectors? This way the array has a fixed size and cannot be resized, while you still have the freedom to choose its size at runtime.
vector<int> *outer;
int VectSize;
cout << "size of vector array? ";
cin >> VectSize;
outer = new vector<int>[VectSize]; //array created with fixed size
//outer.push_back(...) does not compile: outer is a pointer, not a vector
Wrap the outer vector in a class which just provides at, begin, end and operator[]. Give the class a single constructor that takes its capacity.
This is most probably the best way.
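A minimal sketch of such a wrapper, assuming int elements as in the question; FixedOuter is a name made up for this example. Nothing in its interface can add or remove inner vectors, while every inner vector stays fully mutable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class FixedOuter {  // hypothetical name
    std::vector<std::vector<int>> rows;
public:
    // The number of inner vectors is fixed here and never changes afterwards.
    explicit FixedOuter(std::size_t n) : rows(n) {}

    std::vector<int>& operator[](std::size_t i) {
        assert(i < rows.size());
        return rows[i];
    }
    std::vector<int>& at(std::size_t i) { return rows.at(i); }  // throws if out of range

    std::vector<std::vector<int>>::iterator begin() { return rows.begin(); }
    std::vector<std::vector<int>>::iterator end() { return rows.end(); }
    std::size_t size() const { return rows.size(); }
};
```

Since push_back, erase and resize are simply not exposed, outer.push_back(...) becomes a compile error, which is what the question asks for. Note that one FixedOuter can still be assigned from another of a different size unless assignment is deleted too.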
const vector<unique_ptr<vector<int>>> outer = something(n);
For the something, you might write a function, like this:
vector<unique_ptr<vector<int>>> something(int n)
{
vector<unique_ptr<vector<int>>> v(n);
for (auto & p : v)
p.reset(new vector<int>);
return v;
}

C++ class for arrays with arbitrary indices

Do any of the popular C++ libraries have a class (or classes) that allow the developer to use arrays with arbitrary indices without sacrificing speed?
To give this question more concrete form, I would like the possibility to write code similar to the below:
//An array with indices in [-5,6)
ArbitraryIndicesArray<int> a = ArbitraryIndicesArray<int>(-5,6);
for(int index = -5;index < 6;++index)
{
a[index] = index;
}
Really you should be using a vector with an offset. Or even an array with an offset. The extra addition or subtraction isn't going to make any difference to the speed of execution of the program.
If you want something with the exact same speed as a default C array, you can apply the offset to the array pointer:
int* a = new int[10];
a = a + 5;
a[-1] = 1;
However, it is not recommended. If you really want to do that you should create a wrapper class with inline functions that hides the horrible code. You maintain the speed of the C code but end up with the ability to add more error checking.
As mentioned in the comments, after altering the array pointer, you cannot then delete using that pointer. You must reset it to the actual start of the array. The alternative is you always keep the pointer to the start but work with another modified pointer.
//resetting the array by adding the offset (of -5)
delete [] (a - 5);
A std::vector<int> would do the trick here.
Random access to a single element in a vector is only O(1).
If you really need the custom indices, you can make your own small class based on a vector to apply an offset.
Use the map class from the STL:
std::map<int, int> a;
for( int index = -5; index < 6; ++index )
{
a[index] = index;
}
map is a sorted container, typically implemented as a balanced binary search tree, so locating an item takes O(log n) rather than the O(1) of an array index.
[This is an old thread but for reference sake...]
Boost.MultiArray has an extents system for setting any index range.
The arrays in the ObjexxFCL library have full support for arbitrary index ranges.
These are both multi-dimensional array libraries. For the OP's 1D array needs, the std::vector wrapper above should suffice.
Answer edited because I'm not very smart.
Wrap an std::vector and an offset into a class and provide an operator[]:
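template <class T>
class ArbVector
{
private:
    int _offset;
    std::vector<T> container;
public:
    // offset is added to every index, so pass 5 for indices starting at -5
    ArbVector(int offset, std::size_t size) : _offset(offset), container(size) {}
    T& operator[](int n) { return container[n + _offset]; }
};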
Not sure if this compiles, but you get the idea.
Do NOT derive from std::vector though, see comments.