Fixed allocation std::vector - c++

I'm an embedded software developer and as such I can't always use all the nice C++ features. One of the most difficult things is avoiding dynamic memory allocation as it is somewhat universal with all STL containers.
The std::vector is however very useful when working with variable datasets. The problem though is that the allocation (e.g. via reserve()) isn't done at initialization and isn't fixed. This means that memory fragmentation can occur when the vector reallocates and copies its elements.
It would be great to have every vector have an allocated memory space which is the max size the vector can grow to. This would create deterministic behaviour and make it possible to map the memory usage of the microcontroller at compilation time. A call to push_back when the vector is at its max size would throw a std::bad_alloc.
I have read that an alternative version of std::allocator can be written to create new allocation behaviour. Would it be possible to create this kind of behaviour with std::allocator or would an alternative solution be a better fit?
I would really like to keep using the STL libraries and amend to them instead of recreating my own vector as I'm more likely to make mistakes than their implementation.
sidenote #1:
I can't use std::array as 1: it isn't provided by my compiler and 2: it does have a static allocation but I then still have to manage the boundary between my data and buffer inside the std::array. This means rewriting a std::vector with my allocation properties which is what I'm trying to get away from.

You can implement or reuse Boost's static_vector: a variable-size array container with fixed capacity.
And also: LLVM's SmallVector without LLVM dependencies here. This creates objects on the stack until a compile-time constant is reached, then it moves to the heap.

You can always use a C-style array (the same storage that underlies std::array), as vectors aren't meant to be static:
int arr[5]; // static array of 5 integers
To make it more useful you can wrap it in a class template that hides the C-style array.
Example:
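#include <cstddef>

template<class type, std::size_t capacity>
class StaticVector {
private:
    type arr[capacity];   // fixed storage, no heap allocation
    std::size_t m_size;   // number of elements currently in use
public:
    StaticVector() : m_size(0) {}
    // Returns a copy of the element, or a default-constructed value if index is out of range.
    type at(std::size_t index) const {
        if (index < m_size) {   // std::size_t is unsigned, so a ">= 0" check is redundant
            return arr[index];
        }
        return type();
    }
    // Removes the element at index by shifting the following elements down.
    void remove(std::size_t index) {
        if (index < m_size) {
            for (std::size_t i = index; i + 1 < m_size; i++) {
                arr[i] = arr[i + 1];
            }
            m_size--;
        }
    }
    // Appends val; the call is silently ignored when the vector is full.
    void push_back(const type& val) {
        if (m_size < capacity) {
            arr[m_size] = val;
            m_size++;
        }
    }
    std::size_t size() const {
        return m_size;
    }
};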
Example with it in use: https://onlinegdb.com/BkBgSTlZH

Related

Pointer wrapper for insertion

Can someone tell me if there is a datatype in C++/STL that allows me to solve the following problem comfortably:
I have a preallocated contiguous area of memory representing an array of objects of type T.
I have a raw pointer ptrEnd to this area which points right after the last object of the area.
I have a pointer ptrCurrent that points to some position inside this area.
Now what I want is some kind of wrapper class that helps me insert new elements into this area. It should have some kind of "append" function which basically does the following things
Assign *ptrCurrent the value of object to insert
Increment ptrCurrent by one.
Omit the aforementioned steps if ptrCurrent >= ptrEnd. Return an error instead (or a false to indicate failure).
I could write something like this myself, but I wanted to ask first if there is a class in C++ STL that allows me to solve this problem more elegantly.
Thanks for your help.
There is a convenient feature for exactly this in C++17, polymorphic allocators. More specifically, this is what you want:
std::pmr::monotonic_buffer_resource buffer(sizeof(T) * 256);
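// Resource with an initial size of 256 objects of type `T`; constructed this way it
// obtains its memory from the upstream resource (the heap by default).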
std::pmr::vector<T> vec(&buffer);
// The vector will use `buffer` as the backing storage.
live godbolt.org example
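For the preallocated area described in the question, the resource can also be pointed at an existing buffer. The following is a minimal sketch under a few assumptions: the helper name `fill_fixed` and the 256-element stack buffer are made up for illustration, and `std::pmr::null_memory_resource()` is passed as the upstream so that running out of space throws `std::bad_alloc` instead of falling back to the heap.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <vector>

// Hypothetical helper: tries to store `count` ints in a vector that may only
// use a fixed 256-int buffer. Returns false if the buffer is too small.
inline bool fill_fixed(std::size_t count) {
    alignas(int) std::array<std::byte, 256 * sizeof(int)> storage;

    // The resource hands out memory from `storage` only; the null upstream
    // makes any request beyond the buffer throw std::bad_alloc.
    std::pmr::monotonic_buffer_resource pool(
        storage.data(), storage.size(), std::pmr::null_memory_resource());

    std::pmr::vector<int> vec(&pool);
    try {
        vec.reserve(count);  // one allocation up front, no later reallocation
        for (std::size_t i = 0; i < count; ++i) {
            vec.push_back(static_cast<int>(i));
        }
    } catch (const std::bad_alloc&) {
        return false;
    }
    assert(vec.size() == count);
    return true;
}
```

Reserving once matters here because a monotonic resource never reuses released memory: each reallocation of the vector would consume a fresh part of the buffer.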
You'd need to write an Allocator that hands out Ts from your array, and then std::vector can use it.
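#include <cstddef>
#include <new>        // std::bad_alloc
#include <stdexcept>  // std::runtime_error

template <typename T>
class ArrayAllocator
{
    T* current;
    T* end;
public:
    using value_type = T;
    ArrayAllocator(T* start, T* end) : current(start), end(end) {}
    T* allocate(std::size_t n)
    {
        // Fail only if the request does not fit; filling the array exactly is allowed.
        if (n > static_cast<std::size_t>(end - current)) throw std::bad_alloc();
        T* result = current;
        current += n;
        return result;
    }
    void deallocate(T* what, std::size_t n)
    {
        // Only the most recent allocation can be handed back.
        if (what + n != current) throw std::runtime_error("bad deallocate");
        current = what;
    }
    std::size_t max_size() const { return end - current; }
    // Allocators must be comparable.
    friend bool operator==(const ArrayAllocator& a, const ArrayAllocator& b)
    { return a.current == b.current && a.end == b.end; }
    friend bool operator!=(const ArrayAllocator& a, const ArrayAllocator& b)
    { return !(a == b); }
};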
You'd have to immediately reserve the whole amount, because when vector reallocates it needs to copy the old values into the new space, which will result in a "bad deallocate".
I ended up writing an AppendHelper class that takes the start and end pointer and otherwise reproduces the std::vector interface. I realized that using std::vector with a custom allocator meant not having full control over when allocation and deallocation is performed, so the result could behave differently from my original intention.

use std::vector for dynamically allocated 2d array?

So I am writing a class, which has 1d-arrays and 2d-arrays, that I dynamically allocate in the constructor
class Foo{
public:
    Foo(int num1, int num2);
private:
    int** array2d; // identifiers cannot start with a digit
    int* array1d;
};
Foo::Foo(int num1, int num2){
    array2d = new int*[num1];
    for(int i = 0; i < num1; i++)
    {
        array2d[i] = new int[num2];
    }
    array1d = new int[num1];
}
Then I will have to delete every 1d-array and every array in the 2d array in the destructor, right?
I want to use std::vector for not having to do this. Is there any downside of doing this? (makes compilation slower etc?)
TL;DR: when to use std::vector for dynamically allocated arrays, which do NOT need to be resized during runtime?
vector is fine for the vast majority of uses. Hand-tuned scenarios should first attempt to tune the allocator [1], and only then modify the container. Correctness of memory management (and your program in general) is worth much, much more than any compilation time gains.
In other words, vector should be your starting point, and until you find it unsatisfactory, you shouldn't care about anything else.
As an additional improvement, consider using a 1-dimensional vector as a backend storage and only provide 2-dimensional indexed view. This scenario can improve the cache locality and overall performance, while also making some operations like copying of the whole structure much easier.
[1] The second of the two template parameters that vector accepts, which defaults to the standard allocator for the given type.
There should not be any drawbacks, since vector guarantees contiguous memory. But if the size is fixed and C++11 is available, a std::array may be an option, among others:
it doesn't allow resizing
depending on how the vector is initialized prevents reallocations
size is hardcoded in the instructions (template argument). See Ped7g comment for a more detailed description
A 2D array is not an array of pointers.
If you define it that way, each row/column can have a different size.
Furthermore, the elements won't be contiguous in memory.
This might lead to poor performance, as the prefetcher won't be able to predict your access patterns well.
Therefore it is not advised to nest std::vectors inside each other to model multi-dimensional arrays.
A better approach is to map a contiguous chunk of memory onto a multi-dimensional space by providing custom access methods.
You can test it in the browser: http://fiddle.jyt.io/github/3389bf64cc6bd7c2218c1c96f62fa203
#include <cstddef>
#include <vector>

template<class T>
struct Matrix {
    // n rows and m columns, value-initialized
    Matrix(std::size_t n=1, std::size_t m=1)
        : n{n}, m{m}, data(n*m)
    {}
    // n rows and m columns, filled from a flat row-major vector
    Matrix(std::size_t n, std::size_t m, std::vector<T> const& data)
        : n{n}, m{m}, data(data)
    {}
    //Matrix M(2,2, {1,1,1,1});
    // Element (i, j) lives at offset i*m + j in the flat storage.
    T const& operator()(std::size_t i, std::size_t j) const {
        return data[i*m + j];
    }
    T& operator()(std::size_t i, std::size_t j) {
        return data[i*m + j];
    }
    std::size_t n;
    std::size_t m;
    std::vector<T> data;
    using ScalarType = T;
};
You can implement operator[] by returning a VectorView which has access to the data, an index, and the dimensions.
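One way such a view could look is sketched below. The names `RowView` and `Grid` are hypothetical and not taken from the answer; the view holds only a pointer to the start of one row plus the row length, so `g[i][j]` resolves to the same flat offset `i*cols + j`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lightweight view onto one row of a flat, row-major buffer.
template <class T>
class RowView {
    T* row_;
    std::size_t cols_;
public:
    RowView(T* row, std::size_t cols) : row_(row), cols_(cols) {}
    T& operator[](std::size_t j) {
        assert(j < cols_);
        return row_[j];
    }
};

// Hypothetical 2-D container backed by a single 1-D vector.
template <class T>
class Grid {
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Returns a view of row i; chaining [j] on it yields element (i, j).
    RowView<T> operator[](std::size_t i) {
        assert(i < rows_);
        return RowView<T>(data_.data() + i * cols_, cols_);
    }

    const std::vector<T>& flat() const { return data_; }
};
```

With this, `Grid<int> g(2, 3); g[1][2] = 7;` writes to the last slot of the flat storage.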

Implementing incremental array in C++

I want to implement an array that can increment as new values are added. Just like in Java. I don't have any idea of how to do this. Can anyone give me a way ?
This is done for learning purposes, thus I cannot use std::vector.
Here's a starting point: you only need three variables, nelems, capacity and a pointer to the actual array. So, your class would start off as
class dyn_array
{
T *data;
size_t nelems, capacity;
};
where T is the type of data you want to store; for extra credit, make this a template class. Now implement the algorithms discussed in your textbook or on the Wikipedia page on dynamic arrays.
Note that the new/delete allocation mechanism does not support growing an array like C's realloc does, so you'll actually be moving data's contents around when growing the capacity.
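As a rough sketch of that growth step for an array of int (the helper name `grow` is made up for illustration): allocate the larger block, copy the live elements, then release the old block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a new block of `new_capacity` ints holding the
// first `nelems` values of `old_data`, and frees the old block.
inline int* grow(int* old_data, std::size_t nelems, std::size_t new_capacity) {
    assert(new_capacity >= nelems);
    int* new_data = new int[new_capacity];             // may throw; old block still intact
    std::copy(old_data, old_data + nelems, new_data);  // move the contents over
    delete[] old_data;                                 // only now release the old block
    return new_data;
}
```

Allocating before deleting means a failed allocation leaves the original array untouched.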
I would like to take the opportunity to interest you in an interesting but somewhat difficult topic: exceptions.
If you start allocating memory yourself and subsequently playing with raw pointers, you will find yourself in the difficult position of avoiding memory leaks.
Even if you are entrusting the book-keeping of the memory to the right class (say std::unique_ptr<char[]>), you still have to ensure that operations that change the object leave it in a consistent state should they fail.
For example, here is a simple class with an incorrect resize method (which is at the heart of most code):
#include <stdlib.h>  // malloc, free, size_t
#include <new>       // placement new

template <typename T>
class DynamicArray {
public:
    // Constructor
    DynamicArray(): size(0), capacity(0), buffer(0) {}
    // Destructor
    ~DynamicArray() {
        if (buffer == 0) { return; }
        for(size_t i = 0; i != size; ++i) {
            T* t = buffer + i;
            t->~T();
        }
        free(buffer); // using delete[] would require all objects to be built
    }
    void resize(size_t n); // defined out of class
private:
    size_t size;
    size_t capacity;
    T* buffer;
};
Okay, so that's the easy part (although already a bit tricky).
Now, how do you push a new element at the end ?
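template <typename T>
void DynamicArray<T>::resize(size_t n) {
    // The *easy* case
    if (n <= size) {
        for (size_t i = n; i < size; ++i) {
            (buffer + i)->~T();
        }
        size = n;
        return;
    }
    // The *hard* case
    // new size
    size_t const oldsize = size;
    size = n;
    // new capacity
    if (capacity == 0) { capacity = 1; }
    while (capacity < n) { capacity *= 2; }
    // new buffer (copied); declared before the try block so the handler can free it
    T* newbuffer = (T*)malloc(capacity*sizeof(T));
    try {
        // copy
        for (size_t i = 0; i != oldsize; ++i) {
            new (newbuffer + i) T(*(buffer + i));
        }
        // default-construct the added elements
        for (size_t i = oldsize; i != n; ++i) {
            new (newbuffer + i) T();
        }
        // destroy the old elements before releasing their storage
        for (size_t i = 0; i != oldsize; ++i) {
            (buffer + i)->~T();
        }
        free(buffer);
        buffer = newbuffer;
    } catch(...) {
        free(newbuffer);
        throw;
    }
}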
Feels right no ?
I mean, we even take care of a possible exception raised by T's copy constructor! yeah!
Do note the subtle issue we have though: if an exception is thrown, we have changed the size and capacity members but still have the old buffer.
The fix is obvious, of course: we should first change the buffer, and then the size and capacity. Of course...
But it is "difficult" to get it right.
I would recommend using an alternative approach: create an immutable array class (the capacity should be immutable, not the rest), and implement an exception-less swap method.
Then, you'll be able to implement the "transaction-like" semantics much more easily.
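A minimal sketch of that idea follows. The class names `FixedBuffer` and `GrowableArray` are hypothetical, and for brevity the buffer uses new T[], which assumes T is default-constructible (a simplification compared with the malloc and placement-new code above). All the work that can throw happens on a separate object, and the final swap only exchanges pointers and integers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical buffer whose capacity never changes after construction.
template <typename T>
class FixedBuffer {
public:
    explicit FixedBuffer(std::size_t capacity)
        : data_(capacity ? new T[capacity] : nullptr), capacity_(capacity), size_(0) {}
    ~FixedBuffer() { delete[] data_; }
    FixedBuffer(const FixedBuffer&) = delete;
    FixedBuffer& operator=(const FixedBuffer&) = delete;

    void push_back(const T& value) {
        assert(size_ < capacity_);
        data_[size_] = value;  // if the assignment throws, size_ is unchanged
        ++size_;
    }
    const T& operator[](std::size_t i) const { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

    // Exchanges two buffers; only pointers and integers are swapped, so it cannot throw.
    void swap(FixedBuffer& other) noexcept {
        std::swap(data_, other.data_);
        std::swap(capacity_, other.capacity_);
        std::swap(size_, other.size_);
    }
private:
    T* data_;
    std::size_t capacity_;
    std::size_t size_;
};

// Hypothetical growable array built on top of FixedBuffer.
template <typename T>
class GrowableArray {
public:
    GrowableArray() : buffer_(0) {}
    void push_back(const T& value) {
        if (buffer_.size() < buffer_.capacity()) {
            buffer_.push_back(value);
            return;
        }
        // Build the replacement on the side; buffer_ is untouched if anything throws.
        FixedBuffer<T> bigger(buffer_.capacity() == 0 ? 1 : buffer_.capacity() * 2);
        for (std::size_t i = 0; i != buffer_.size(); ++i) {
            bigger.push_back(buffer_[i]);
        }
        bigger.push_back(value);
        buffer_.swap(bigger);  // commit; the old storage is freed when `bigger` is destroyed
    }
    const T& operator[](std::size_t i) const { return buffer_[i]; }
    std::size_t size() const { return buffer_.size(); }
private:
    FixedBuffer<T> buffer_;
};
```

If a copy throws while `bigger` is being filled, its destructor releases the partial work and the array still holds exactly what it held before the call.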
An array which grows dynamically as we add elements is called a dynamic array or growable array, and here is a complete implementation of a dynamic array.
In C and C++, array notation is basically just shorthand for pointer maths.
So in this example.
int fib [] = { 1, 1, 2, 3, 5, 8, 13};
This:
int position5 = fib[5];
Is the same thing as saying this:
int position5 = *(int*)((char*)fib + (5 * sizeof(int))); // i.e. *(fib + 5)
So basically, indexing an array is pointer arithmetic on the address of its first element.
So if you want to auto allocate you will need to write some wrapper functions to call malloc() or new, ( C and C++ respectively).
Although you might find vectors are what you are looking for...

How is vector implemented in C++

I am thinking of how I can implement std::vector from the ground up.
How does it resize the vector?
realloc only seems to work for plain old structs, or am I wrong?
It is a simple templated class which wraps a native array. It does not use malloc/realloc. Instead, it uses the passed allocator (which by default is std::allocator).
Resizing is done by allocating a new array and copy constructing each element in the new array from the old one (this way it is safe for non-POD objects). To avoid frequent allocations, implementations often follow a non-linear (geometric) growth pattern.
UPDATE: in C++11, the elements will be moved instead of copy constructed if it is possible for the stored type.
In addition to this, it will need to store the current "size" and "capacity". Size is how many elements are actually in the vector. Capacity is how many could be in the vector.
So as a starting point a vector will need to look somewhat like this:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_;
typename A::size_type capacity_;
typename A::size_type size_;
A allocator_;
};
The other common implementation is to store pointers to the different parts of the array. This cheapens the cost of end() (which no longer needs an addition) ever so slightly at the expense of a marginally more expensive size() call (which now needs a subtraction). In which case it could look like this:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_; // points to first element
T* end_capacity_; // points to one past internal storage
T* end_; // points to one past last element
A allocator_;
};
I believe gcc's libstdc++ uses the latter approach, but both approaches are equally valid and conforming.
NOTE: This is ignoring a common optimization where the empty base class optimization is used for the allocator. I think that is a quality of implementation detail, and not a matter of correctness.
Resizing the vector requires allocating a new chunk of space, and copying the existing data to the new space (thus, the requirement that items placed into a vector can be copied).
Note that it does not use new [] either -- it uses the allocator that's passed, but that's required to allocate raw memory, not an array of objects like new [] does. You then need to use placement new to construct objects in place. [Edit: well, you could technically use new char[size], and use that as raw memory, but I can't quite imagine anybody writing an allocator like that.]
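The separation between raw memory and object lifetime can be sketched with the standard allocator interface. This is only an illustration, not how any particular library writes it; the function name `raw_storage_demo` is made up, and error handling (for example a constructor that throws) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical demo: allocate raw storage for 4 strings, construct only 2 of
// them in place, then destroy the objects and release the storage.
inline std::size_t raw_storage_demo() {
    using Alloc = std::allocator<std::string>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc alloc;
    const std::size_t capacity = 4;
    std::string* data = Traits::allocate(alloc, capacity);  // raw memory, no objects yet
    std::size_t size = 0;

    Traits::construct(alloc, data + size, "hello");  // constructs a string in place
    ++size;
    Traits::construct(alloc, data + size, "abc");
    ++size;
    assert(size <= capacity);

    const std::size_t total_length = data[0].size() + data[1].size();

    // Destroy only the objects that were constructed, then free the block.
    while (size > 0) {
        --size;
        Traits::destroy(alloc, data + size);
    }
    Traits::deallocate(alloc, data, capacity);
    return total_length;
}
```

The two unused slots are never constructed or destroyed, which is what capacity beyond size means in a vector.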
When the current allocation is exhausted and a new block of memory needs to be allocated, the size must be increased by a constant factor compared to the old size to meet the requirement for amortized constant complexity for push_back. Though many web sites (and such) call this doubling the size, a factor around 1.5 to 1.6 usually works better. In particular, this generally improves chances of re-using freed blocks for future allocations.
From Wikipedia, as good an answer as any.
A typical vector implementation consists, internally, of a pointer to
a dynamically allocated array,[2] and possibly data members holding
the capacity and size of the vector. The size of the vector refers to
the actual number of elements, while the capacity refers to the size
of the internal array. When new elements are inserted, if the new size
of the vector becomes larger than its capacity, reallocation
occurs.[2][4] This typically causes the vector to allocate a new
region of storage, move the previously held elements to the new region
of storage, and free the old region. Because the addresses of the
elements change during this process, any references or iterators to
elements in the vector become invalidated.[5] Using an invalidated
reference causes undefined behaviour
Like this:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_vector.h
(official gcc mirror on github)
///Implement Vector class
#include <iostream>
using std::cout;
using std::endl;

class MyVector {
    int *int_arr;   // heap-allocated storage
    int capacity;   // number of slots allocated
    int current;    // number of slots in use
public:
    MyVector() {
        int_arr = new int[1];
        capacity = 1;
        current = 0;
    }
    ~MyVector() { delete[] int_arr; }   // release the storage
    void Push(int nData);
    void PushData(int nData, int index);
    void PopData();
    int GetData(int index);
    int GetSize();
    void Print();
};
void MyVector::Push(int data)
{
    if (current == capacity){
        // Full: allocate twice the space and copy the elements over.
        int *temp = new int[2 * capacity];
        for (int i = 0; i < capacity; i++)
        {
            temp[i] = int_arr[i];
        }
        delete[] int_arr;
        capacity *= 2;
        int_arr = temp;
    }
    int_arr[current] = data;
    current++;
}
void MyVector::PushData(int data, int index)
{
    if (index == current){
        Push(data);               // one past the last element: append
    }
    else if (index >= 0 && index < current){
        int_arr[index] = data;    // existing element: overwrite
    }
}
void MyVector::PopData(){
    if (current > 0){
        current--;
    }
}
int MyVector::GetData(int index)
{
    if (index >= 0 && index < current){
        return int_arr[index];
    }
    return -1; // out of range; the sentinel is an assumption, the original returned nothing here
}
int MyVector::GetSize()
{
    return current;
}
void MyVector::Print()
{
    for (int i = 0; i < current; i++) {
        cout << int_arr[i] << " ";
    }
    cout << endl;
}
int main()
{
    MyVector vect;
    vect.Push(10);
    vect.Push(20);
    vect.Push(30);
    vect.Push(40);
    vect.Print();
    cout << "\nTop item is "
         << vect.GetData(3) << endl;
    vect.PopData();
    vect.Print();
    cout << "\nTop item is "
         << vect.GetData(1) << endl;
    return 0;
}
It allocates a new array and copies everything over. So, expanding it is quite inefficient if you have to do it often. Use reserve() if you have to use push_back().
You'd need to define what you mean by "plain old structs."
realloc by itself only creates a block of uninitialized memory. It does no object allocation. For C structs, this suffices, but for C++ it does not.
That's not to say you couldn't use realloc. But if you were to use it (note you wouldn't be reimplementing std::vector exactly in this case!), you'd need to:
Make sure you're consistently using malloc/realloc/free throughout your class.
Use "placement new" to initialize objects in your memory chunk.
Explicitly call destructors to clean up objects before freeing your memory chunk.
This is actually pretty close to what vector does in my implementation (GCC/libstdc++), except it uses the C++ low-level routines ::operator new and ::operator delete to do the raw memory management instead of malloc and free, rewrites the realloc routine using these primitives, and delegates all of this behavior to an allocator object that can be replaced with a custom implementation.
Since vector is a template, you actually should have its source to look at if you want a reference – if you can get past the preponderance of underscores, it shouldn't be too hard to read. If you're on a Unix box using GCC, try looking for /usr/include/c++/version/vector or thereabouts.
You can implement it with a resizing-array implementation.
When the array becomes full, create an array with twice the size and copy all the content to the new array. Do not forget to delete the old array.
As for deleting elements from the vector, shrink the array when it becomes a quarter full. This strategy prevents performance glitches when one might try repeated insertion and deletion at half the array size.
It can be mathematically proved that the total time for n insertions is still linear (that is, amortized constant time per insertion), which is asymptotically the same as you will get with a normal static array.
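The linear bound for insertions can be checked with a small simulation. The function below is a hypothetical helper that does not store anything; it only counts how many element copies the doubling strategy would perform over n appends.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts the element copies caused by reallocation when
// n elements are appended to an array that doubles its capacity when full.
inline std::size_t copies_with_doubling(std::size_t n) {
    std::size_t size = 0;
    std::size_t capacity = 0;
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            copies += size;  // every live element moves to the new array
            capacity = (capacity == 0) ? 1 : capacity * 2;
        }
        ++size;
        assert(size <= capacity);
    }
    return copies;
}
```

The copies form the series 1 + 2 + 4 + ... over the capacities that filled up, which stays below 2n, so the copying adds only a constant amount of work per insertion on average.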
realloc only works on heap memory. In C++ you usually want to use the free store.

C++ class for arrays with arbitrary indices

Do any of the popular C++ libraries have a class (or classes) that allow the developer to use arrays with arbitrary indices without sacrificing speed ?
To give this question more concrete form, I would like the possibility to write code similar to the below:
//An array with indices in [-5,6)
ArbitraryIndicesArray<int> a = ArbitraryIndicesArray<int>(-5,6);
for(int index = -5;index < 6;++index)
{
a[index] = index;
}
Really you should be using a vector with an offset. Or even an array with an offset. The extra addition or subtraction isn't going to make any difference to the speed of execution of the program.
If you want something with the exact same speed as a default C array, you can apply the offset to the array pointer:
int* a = new int[10];
a = a + 5;
a[-1] = 1;
However, it is not recommended. If you really want to do that you should create a wrapper class with inline functions that hides the horrible code. You maintain the speed of the C code but end up with the ability to add more error checking.
As mentioned in the comments, after altering the array pointer, you cannot then delete using that pointer. You must reset it to the actual start of the array. The alternative is you always keep the pointer to the start but work with another modified pointer.
//resetting the array by adding the offset (of -5)
delete [] (a - 5);
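The "keep the start, work with a second pointer" alternative might look like the sketch below. The function name is made up, and a std::vector is assumed as the owner of the memory so that nothing has to be deleted by hand.

```cpp
#include <cassert>
#include <vector>

// Hypothetical demo: indices -5..5 mapped onto an 11-element vector.
inline int sum_with_offset_pointer() {
    std::vector<int> storage(11);  // owns the memory; valid indices 0..10
    int* a = storage.data() + 5;   // a[-5] .. a[5] refer to storage[0] .. storage[10]
    assert(a - 5 == storage.data());

    for (int index = -5; index < 6; ++index) {
        a[index] = index;
    }

    int sum = 0;
    for (int value : storage) {
        sum += value;
    }
    return sum;  // the values -5..5 cancel out
}
```

The offset pointer is only a view; it must not outlive the vector or be used after the vector reallocates.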
A std::vector<int> would do the trick here.
Random access to a single element in a vector is O(1).
If you really need the custom indices you can make your own small class based on a vector to apply an offset.
Use the map class from the STL:
std::map<int, int> a;
for( int index = -5; index < 6; ++index )
{
a[index] = index;
}
map is implemented internally as a sorted container (typically a balanced binary tree), so locating an item takes O(log n) time rather than the O(1) of a vector.
[This is an old thread but for reference sake...]
Boost.MultiArray has an extents system for setting any index range.
The arrays in the ObjexxFCL library have full support for arbitrary index ranges.
These are both multi-dimensional array libraries. For the OP's 1D array needs, the std::vector wrapper above should suffice.
Answer edited because I'm not very smart.
Wrap an std::vector and an offset into a class and provide an operator[]:
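#include <cstddef>
#include <vector>

template <class T>
class ArbVector
{
private:
    int _offset;                // added to every index, e.g. 5 when the first index is -5
    std::vector<T> container;
public:
    // size is the number of elements; without it the container would be empty
    ArbVector(int offset, std::size_t size) : _offset(offset), container(size) {}
    T& operator[](int n) { return container[n + _offset]; }
};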
Not sure if this compiles, but you get the idea.
Do NOT derive from std::vector though, see comments.