I am working on custom allocators. So far, I have tried to work on simple containers: std::list, std::vector, std::basic_string, etc...
My custom allocator is a static buffer allocator; its implementation is straightforward:
#include <memory>
template <typename T>
class StaticBufferAlloc : std::allocator<T>
{
private:
T *memory_ptr;
std::size_t memory_size;
public:
typedef std::size_t size_type;
typedef T *pointer;
typedef T value_type;
StaticBufferAlloc(T *memory_ptr, size_type memory_size) : memory_ptr(memory_ptr), memory_size(memory_size) {}
StaticBufferAlloc(const StaticBufferAlloc &other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size){};
pointer allocate(size_type n, const void *hint = 0) { return memory_ptr; } // always returns the start of the buffer
void deallocate(T *ptr, size_type n) {} // empty because releasing the buffer is its creator's responsibility
size_type max_size() const { return memory_size; }
};
I am using it in this fashion:
using inner = std::vector<int, StaticBufferAlloc<int>>;
int buffer[201];
auto alloc1 = StaticBufferAlloc<int>(&buffer[100], 50);
inner v1(0, alloc1);
assert(v1.size() == 0);
const int N = 10;
// insert 10 integers
for (size_t i = 0; i < N; i++) {
v1.push_back(i);
}
assert(v1.size() == N);
All good so far, when I grow N past the max buffer size it throws and that's expected.
Now, I am trying to work with nested containers. In short, I am trying to build a vector of vectors (a matrix), where the parent vector and all of its elements (which are themselves vectors, i.e. containers) share the same static buffer for allocation. It looks like scoped_allocator can be a solution to my problem.
using inner = std::vector<int, StaticBufferAlloc<int>>;
using outer = std::vector<inner, std::scoped_allocator_adaptor<StaticBufferAlloc<inner>>>;
int buffer[201];
auto alloc1 = StaticBufferAlloc<int>(&buffer[100], 50);
auto alloc2 = StaticBufferAlloc<int>(&buffer[150], 50);
inner v1(0, alloc1);
inner v2(0, alloc2);
assert(v1.size() == 0);
assert(v2.size() == 0);
const int N = 10;
// insert 10 integers
for (size_t i = 0; i < N; i++)
{
v1.push_back(i);
v2.push_back(i);
}
assert(v1.size() == N);
assert(v2.size() == N);
outer v // <- how to construct this vector with the outer buffer?
v.push_back(v1);
v.push_back(v2);
...
My question is how to initialize the outer vector on its constructor call with its static buffer?
Creating a scoped allocator in C++11/C++14 was a little bit challenging, so I opted for a more modern solution introduced in C++17. Instead of implementing an allocator, I used polymorphic_allocator. Polymorphic allocators behave like scoped allocators: standard containers automatically pass the allocator on to their sub-objects.
Basically, the idea was to use a polymorphic allocator and inject it with monotonic_buffer_resource. The monotonic_buffer_resource can be initialized with a memory resource.
Writing a custom memory resource was very simple:
class custom_resource : public std::pmr::memory_resource
{
public:
explicit custom_resource(std::pmr::memory_resource *up = std::pmr::get_default_resource())
: _upstream{up}
{
}
void *do_allocate(size_t bytes, size_t alignment) override
{
return _upstream; //do nothing, don't grow just return ptr
}
void do_deallocate(void *ptr, size_t bytes, size_t alignment) override
{
//do nothing, don't deallocate
}
bool do_is_equal(const std::pmr::memory_resource &other) const noexcept override
{
return this == &other;
}
private:
std::pmr::memory_resource *_upstream;
};
Using it is even simpler:
std::byte buffer[512];
custom_resource resource;
std::pmr::monotonic_buffer_resource pool{std::data(buffer), std::size(buffer), &resource};
std::pmr::vector<std::pmr::vector<int>> outer(&pool);
It is important to note that std::pmr::vector<T> is just std::vector<T, std::pmr::polymorphic_allocator<T>>.
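To illustrate how the allocator propagates to nested containers, here is a minimal sketch (C++17). It assumes std::pmr::null_memory_resource() as the upstream, so the pool throws std::bad_alloc instead of growing once the buffer is exhausted. The helper names points_into and nested_storage_in_buffer are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <vector>

// Hypothetical helper: true when p points into [buf, buf + len).
inline bool points_into(const void* p, const std::byte* buf, std::size_t len) {
    auto a = reinterpret_cast<std::uintptr_t>(p);
    auto b = reinterpret_cast<std::uintptr_t>(buf);
    return a >= b && a < b + len;
}

// Builds a 2x3 matrix and reports whether the outer vector and every
// inner vector took their storage from the same stack buffer.
inline bool nested_storage_in_buffer() {
    alignas(std::max_align_t) std::byte buf[1024];
    // null_memory_resource() as upstream: no fallback to the heap.
    std::pmr::monotonic_buffer_resource pool{buf, sizeof buf,
                                             std::pmr::null_memory_resource()};

    std::pmr::vector<std::pmr::vector<int>> outer(&pool);
    outer.reserve(2);
    for (int r = 0; r < 2; ++r) {
        auto& row = outer.emplace_back();  // the row receives &pool automatically
        row.reserve(3);
        for (int c = 0; c < 3; ++c) row.push_back(r * 3 + c);
    }
    assert(outer.size() == 2);

    bool ok = points_into(outer.data(), buf, sizeof buf);
    for (const auto& row : outer) {
        ok = ok && points_into(row.data(), buf, sizeof buf)
                && row.get_allocator().resource() == &pool;
    }
    return ok;
}
```

Calling nested_storage_in_buffer() should return true: the rows never name the pool explicitly, yet they allocate from it.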
Useful resources:
CppCon 2017: Pablo Halpern “Allocators: The Good Parts”
C++ Weekly - Ep 222 - 3.5x Faster Standard Containers With PMR
Purpose of scoped allocator
std::pmr is cool but it requires modern versions of gcc to run (9+). Fortunately, Reddit is full of kind strangers. A C++14 solution can be found here.
My goal is to have a std::list that allocates enough memory for the objects I will put in it, so I do not have to deal with potential exceptions when it expands, or the extra time needed for it to expand.
My first try involves splicing from a wave table:
std::list<T> list; // note: `std::list<T> list();` would declare a function
auto listI = list.begin();
typename std::list<T>::iterator waveStart = waveTable.begin();
for(int i = 0; i < waveIndex; i++) {
waveStart++;
}
typename std::list<T>::iterator waveEnd;
int tCounter = nSamples;
while(tCounter > 0) {
if(tCounter > (waveTable.size() - waveIndex)) {
waveEnd = waveStart;
for(int i = 0; i < (waveTable.size() - waveIndex); i++) {
waveEnd++;
}
tCounter = (tCounter - (waveTable.size() - waveIndex));
} else {
waveEnd = waveStart;
for(int i = 0; i < tCounter; i++) {
waveEnd++;
}
tCounter = 0;
}
list.splice(listI, waveTable, waveStart, waveEnd);
waveStart = waveTable.begin();
}
phase += (tp * frequency * (nSamples/sampleRate));
while(phase > tp) {
phase -= tp;
}
waveIndex = (phase / tp) * waveTable.size();
I was planning to copy the values, but splice removed the values from the waveTable, so I am going to use insert.
The problem is, insert increases the size of the list and I can not find a way to tell the list how much memory it will need to hold all the values I want to store in it.
My goal is to have a std::list that allocates enough memory for the objects I will put in it
Okay, seems reasonable. You can do this by calling the appropriate constructor as @Gyross mentioned.
so I do not have to deal with potential exceptions when it expands
Now this is where your question stops making sense. If you are out of memory, preallocating won't fix that. Also, depending on your type, exceptions can still be thrown when you assign to an element (node = newvalue) and its copy or, in some cases, move assignment operator runs.
or the extra time needed for it to expand
It is ironic that you are using std::list and care about performance this much, since it is one of if not the slowest container in the standard library. If you want a high performance fixed container I recommend writing/using a ringbuffer.
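A minimal sketch of such a ring buffer follows. The class name RingBuffer and its interface are assumptions made for illustration, and T is assumed to be default-constructible and copy-assignable. All storage lives inside the object, so pushing never allocates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-capacity ring buffer: push() overwrites the oldest element when full.
template <typename T, std::size_t N>
class RingBuffer {
    static_assert(N > 0, "capacity must be positive");
public:
    void push(const T& value) {
        data_[(head_ + size_) % N] = value;   // slot just past the newest element
        if (size_ < N) {
            ++size_;
        } else {
            head_ = (head_ + 1) % N;          // full: drop the oldest element
        }
    }
    // Index 0 is the oldest element still stored.
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[(head_ + i) % N];
    }
    std::size_t size() const { return size_; }
private:
    std::array<T, N> data_{};
    std::size_t head_ = 0;
    std::size_t size_ = 0;
};
```

Pushing 1 through 5 into a RingBuffer<int, 3> leaves 3, 4, 5 in it, oldest first.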
You can specify initial size in the constructor, with std::list<T> list_obj(n).
https://en.cppreference.com/w/cpp/container/list/list
There's also std::list::resize.
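As a rough sketch of that idea: construct the list with its final size, then overwrite the existing nodes instead of inserting new ones. The function name fill_presized and the float element type are assumptions for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Allocates every node up front, then assigns into the existing nodes.
inline std::list<float> fill_presized(const std::vector<float>& samples) {
    std::list<float> out(samples.size());   // all node allocations happen here
    // std::copy only assigns to existing elements; the list does not grow.
    std::copy(samples.begin(), samples.end(), out.begin());
    assert(out.size() == samples.size());
    return out;
}
```

Any std::bad_alloc would be thrown by the constructor, not in the middle of the copy loop.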
You may want to look at implementing a custom allocator.
The full requirements for a custom allocator class are listed here:
https://en.cppreference.com/w/cpp/memory/allocator
The below example allocator always allocates num_items elements on the stack, and throws std::bad_alloc() if the number of elements exceeds the statically defined size.
You may want to move away from the stack using new delete or similar to avoid overflows if you have a lot of elements.
template<typename T, size_t num_items>
class MyAllocator
{
public:
using value_type = T;
using pointer = T*;
using const_pointer = const T*;
using void_pointer = void*;
using const_void_pointer = const void*;
using reference = T&;
using const_reference = const T&;
using size_type = std::size_t;
using difference_type = std::ptrdiff_t;
/* should copy assign the allocator when copy assigning container */
using propagate_on_container_copy_assignment = std::true_type;
/* should move assign the allocator when move assigning container */
using propagate_on_container_move_assignment = std::true_type;
/* should swap the allocator when swapping the container */
using propagate_on_container_swap = std::true_type;
/* two allocators does not always compare equal */
using is_always_equal = std::false_type;
MyAllocator() : m_index(0) {};
~MyAllocator() noexcept = default;
template<typename U>
struct rebind {
using other = MyAllocator<U, num_items>;
};
[[nodiscard]] pointer allocate(size_type n) {
if (m_index + n > num_items) /* a request that exactly fills the buffer is fine */
throw std::bad_alloc();
pointer ret = &m_buffer[m_index];
m_index += n;
return ret;
}
void deallocate(pointer p, size_type n) {
(void) p;
(void) n;
/* do nothing */
}
size_type max_size() const noexcept {
return num_items;
}
bool operator==(const MyAllocator& other) const noexcept {
/* each allocator owns its buffer, so it only compares equal to itself */
return this == &other;
}
bool operator!=(const MyAllocator& other) const noexcept {
return !(*this == other);
}
private:
value_type m_buffer[num_items];
size_t m_index;
};
And then you can use
constexpr size_t always_allocate_this_amount_of_elements = 10;
std::list<int, MyAllocator<int, always_allocate_this_amount_of_elements>> list;
For an embedded system we need a custom vector class, where the capacity is set at compile time through a template parameter.
Until now we had an array of objects as a member variable.
template<class T, size_t SIZE>
class Vector {
...
T data[SIZE];
}
The problem here of course is that if T isn't a POD, the default constructors of T are called. Is there any way to let data be uninitialized until a corresponding push() call (with placement new inside)? Just using
uint8_t data[SIZE * sizeof(T)];
possibly breaks the alignment of T. We absolutely cannot use dynamic memory, the total container size always needs to be known at compile-time. We also cannot use C++'s alignas specifier since the compiler does not support C++11 yet :(
First I would check whether the compiler has support for alignment, i.e. gcc has __attribute__((aligned(x))); there is likely something similar.
Then, if you absolutely have to have aligned uninitialized data without such support, you will have to waste some space:
// Align must be power of 2
template<size_t Len, size_t Align>
class aligned_memory
{
public:
aligned_memory()
: data((void*)(((std::uintptr_t)mem + Align - 1) & -Align)) {}
void* get() const {return data;}
private:
char mem[Len + Align - 1];
void* data;
};
And you'd use placement new with it
template<typename T, size_t N>
class Array
{
public:
Array() : sz(0) {}
void push_back(const T& t)
{
new ((T*)data.get() + sz++) T(t);
}
private:
aligned_memory<N * sizeof(T), /* alignment */> data;
size_t sz;
};
The alignment of T can be found with C++11 alignof, check your compiler to see if it supports anything that can be used to find out its alignment. You can also just take a guess from printed pointer values and hope that's enough.
Another way is to use std::vector<> with a custom allocator that allocates on the stack.
This way you would create an empty vector, reserve the required space, which should be equal to the space your allocator allocates for you on the stack, and then populate the vector using vector<>::emplace_back. Your element type can be non-copyable but must be movable in this case.
E.g.:
#include <vector>
struct X {
X(int, int);
// Non-copyable.
X(X const&) = delete;
X& operator=(X const&) = delete;
// But movable.
X(X&&);
X& operator=(X&&);
};
template<class T, std::size_t N>
struct MyStackAllocator; // Implement me.
int main() {
std::vector<X, MyStackAllocator<X, 10>> v;
v.reserve(10);
v.emplace_back(1, 2);
v.emplace_back(3, 4);
}
Information about how to implement an allocator is widely available, for example, search YouTube for "c++ allocator".
You are going to have to use placement new along with a union trick to get the alignment properly set.
// use `std::max_align_t` and `std::aligned_storage` when you have them
// since we don't have access to alignof(), use a type that presumably
// has the maximum alignment
typedef long MaxAlign; // a `using` alias would require C++11
template <typename T, int size>
class UninitializedArray {
union Node {
char data[sizeof(T)];
MaxAlign alignment;
};
Node aligned_data[size];
bool initialized;
public:
UninitializedArray() : initialized(false) {}
void initialize() {
for (int i = 0; i < static_cast<int>(size); ++i) {
new (&this->aligned_data[i].data) T();
}
this->initialized = true;
}
~UninitializedArray() {
if (this->initialized) {
for (int i = 0; i < static_cast<int>(size); ++i) {
T* ptr = reinterpret_cast<T*>(&this->aligned_data[i].data);
ptr->~T();
}
}
}
T& operator[](int index) {
if (!this->initialized) {
this->initialize();
}
T* ptr = reinterpret_cast<T*>(&this->aligned_data[index].data);
return *ptr;
}
};
And then use it like this
UninitializedArray<Something, 5> arr;
arr[0].do_something();
If you ever get C++17 working, then you can use std::array and std::optional to make this easy
std::optional<std::array<T, N>> optional_array;
// construct the optional, this will construct all your elements
optional_array.emplace();
// then use the value in the optional by "treating" the optional like
// a pointer
optional_array->at(0); // returns the 0th object
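Note that the optional approach still constructs all elements at once. If a C++11 compiler becomes available, per-element construction on push can be sketched with alignas as below. The names StaticVector, Tracked and live_count are hypothetical, and the reinterpret_cast on the raw storage follows common practice rather than the strictest reading of the object model.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Fixed-capacity vector: storage is raw bytes, elements are built on push_back.
template <typename T, std::size_t N>
class StaticVector {
public:
    StaticVector() : size_(0) {}
    StaticVector(const StaticVector&) = delete;
    StaticVector& operator=(const StaticVector&) = delete;
    ~StaticVector() { while (size_ > 0) pop_back(); }

    // Returns false instead of allocating when the capacity is exhausted.
    bool push_back(const T& value) {
        if (size_ == N) return false;
        ::new (static_cast<void*>(slot(size_))) T(value);  // placement new
        ++size_;
        return true;
    }
    void pop_back() {
        assert(size_ > 0);
        --size_;
        slot(size_)->~T();  // explicit destructor call, storage stays
    }
    T& operator[](std::size_t i) { assert(i < size_); return *slot(i); }
    std::size_t size() const { return size_; }

private:
    T* slot(std::size_t i) { return reinterpret_cast<T*>(storage_) + i; }
    alignas(T) unsigned char storage_[N * sizeof(T)];
    std::size_t size_;
};

// Hypothetical test type that counts how many instances are alive.
inline int& live_count() { static int n = 0; return n; }
struct Tracked {
    Tracked() { ++live_count(); }
    Tracked(const Tracked&) { ++live_count(); }
    ~Tracked() { --live_count(); }
};
```

With this layout an empty StaticVector<Tracked, 4> runs no Tracked constructors, and the destructor only destroys the elements that were actually pushed.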
My program uses 569MB of memory and it needs to use only 500MB.
I have a lot of std::vector objects of different sizes.
Is there a way to set the capacity to the number of elements to avoid the memory overhead?
(I don't care about performance; memory is key.)
How to limit the capacity of std::vector to the number of element
The best that you can do, is to reserve the required space before you add the elements. This should also have the best performance, because there are no reallocations and copying caused by it.
If that is not practical, then you can use std::vector::shrink_to_fit() after the elements were added. Of course, that doesn't help if the allocation may never peak above the set limit.
Technically, neither of these methods is guaranteed by the standard to match the capacity with the size. You are relying on the behaviour of the standard library implementation.
Write some wrapper and control size of your vector before pushing anything to it, or use fixed size std::array instead
You are perhaps looking for the shrink_to_fit method, see http://en.cppreference.com/w/cpp/container/vector/shrink_to_fit.
Or, if you are not able/allowed to use C++11, you may want to use the swap-to-fit idiom: https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Shrink-to-fit
In C++11 (note that shrink_to_fit is a non-binding request, so the implementation may ignore it):
vector<int> v;
// ...
v.shrink_to_fit();
The swap-to-fit idiom:
vector<int> v;
// ...
vector<int>( v ).swap(v);
// v is swapped with its temporary copy, which is capacity optimal
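The two approaches can be combined in a small helper. This is only a sketch (the name trim is made up), and neither step is guaranteed by the standard to produce capacity() == size(); the assertion inside only checks what the standard does guarantee.

```cpp
#include <cassert>
#include <vector>

// Asks the vector to drop unused capacity, falling back to the
// copy-and-swap idiom if the request appears to have been ignored.
template <typename T>
void trim(std::vector<T>& v) {
    v.shrink_to_fit();                               // non-binding request
    if (v.capacity() > v.size()) {
        std::vector<T>(v.begin(), v.end()).swap(v);  // rebuild from a tight copy
    }
    assert(v.capacity() >= v.size());
}
```

The element values and size are unchanged by trim; only the allocation may shrink. Keep in mind that the fallback briefly holds two copies of the data.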
You can use a custom allocator and pass the capacity you require as a template argument. Modifying the example from this thread:
Compelling examples of custom C++ allocators?
#include <memory>
#include <iostream>
#include <vector>
namespace my_allocator_namespace
{
template <typename T, size_t capacity_limit>
class my_allocator: public std::allocator<T>
{
public:
typedef size_t size_type;
typedef T* pointer;
typedef const T* const_pointer;
template<typename _Tp1 >
struct rebind
{
typedef my_allocator<_Tp1 , capacity_limit> other;
};
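pointer allocate(size_type n, const void *hint=0)
{
(void) hint; // the two-argument std::allocator::allocate was removed in C++20
if( n > capacity_limit ) {
// refuse requests above the limit; returning a smaller block than requested would overflow
throw std::bad_alloc();
}
return std::allocator<T>::allocate(n);
}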
void deallocate(pointer p, size_type n)
{
return std::allocator<T>::deallocate(p, n);
}
my_allocator() throw(): std::allocator<T>() { }
my_allocator(const my_allocator &a) throw(): std::allocator<T>(a) { }
template <class U,size_t N>
my_allocator(const my_allocator<U,N> &a) throw(): std::allocator<T>(a) { }
~my_allocator() throw() { }
};
}
using namespace std;
using namespace my_allocator_namespace;
int main(){
vector<int, my_allocator<int,20> > int_vec(10);
for(int i = 0 ;i < 20; i++)
{
std::cerr << i << "," << int_vec.size() << std::endl;
int_vec.push_back(i);
}
}
However, be careful not to access out-of-range indices.
OK, so I recently learned that (a) std::vector uses contiguous memory by definition/standard, and thus (b) &(v[0]) is the address of that contiguous block of memory, which you can read/write to as an old-skool C-array. Like...
void printem(size_t n, int* iary)
{ for (size_t i=0; i<n; ++i) std::cout << iary[i] << std::endl; }
void doublem(size_t n, int* iary)
{ for (size_t i=0; i<n; ++i) iary[i] *= 2; }
std::vector<int> v;
for (size_t i=0; i<100; ++i) v.push_back(i);
int* iptr = &(v[0]);
doublem(v.size(), iptr);
printem(v.size(), iptr);
OK, so that's cool, but I want to go in the other direction. I have lots and lots of existing code like
double computeSomething(const std::vector<SomeClass>& v) { ... }
If I have a C-array of objects, I can use such code like this:
SomeClass cary[100]; // 100*sizeof(SomeClass)
// populate this however
std::vector<SomeClass> v;
for (size_t i=0; i<100; ++i) v.push_back(cary[i]);
// now v is also using 100*sizeof(SomeClass)
double x = computeSomething(v);
I would like to do that (a) without the extra space and (b) without the extra time of inserting a redundant copy of all that data into the vector. Note that "just change your stupid computeSomething, idiot" is not sufficient, because there are thousands of such functions/methods that exhibit this pattern that are not under my control and, even if they were, there are too many to go and change all of them.
Note also that because I am only interested in const std::vector& usage, there is no worry that my original memory will ever need to be resized, or even modified. I would want something like a const std::vector constructor, but I don't know if the language even allows special constructors for const instances of a class, like:
namespace std { template <typename T> class vector {
vector() { ... }
vector(size_t n) { ... }
vector(size_t n, const T& t) { ... }
const vector(size_t n, T*) { ... } // can this be done?
...
If that is not possible, how about a container derived off of std::vector called std::const_vector, which (a) could construct from a pointer to a c-array and a size, and (b) purposefully did not implement non-const methods (push_back, resize, etc.), so then even if the object with a typename of const_vector is not actually a const object, the interface which only offers const methods makes it practically const (and any erroneous attempts to modify would be caught at compile time)?
UPDATE: A little messing around shows that this "solves" my problem wrt Windows-implementation of std::vector:
template <typename T>
class vector_tweaker : public std::vector<T> {
public:
vector_tweaker(size_t n, T* t) {
_saveMyfirst = _Myfirst;
_saveMylast = _Mylast;
_saveMyend = _Myend;
_Myfirst = t;
_Mylast = t + n;
_Myend = t + n;
}
~vector_tweaker() {
_Myfirst = _saveMyfirst;
_Mylast = _saveMylast;
_Myend = _saveMyend; // and proceed to std::vector destructor
}
private:
T* _saveMyfirst;
T* _saveMylast;
T* _saveMyend;
};
But of course that "solution" is hideous because (a) it offers no protection against the base class deleting the original memory by doing a resize() or push_back() (except for a careful user that only constructs const vector_tweaker()) -- and (b) it is specific to a particular implementation of std::vector, and would have to be reimplemented for others -- if indeed other platforms only declare their std::vector member data as protected: as microsoft did (seems a Bad Idea).
You can try reference-logic storing introduced in C++11 with std::reference_wrapper<>:
SomeClass cary[100];
// ...
std::vector<std::reference_wrapper<SomeClass>> cv;
cv.push_back(cary[i]); // no object copying is done, reference wrapper is stored
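A slightly fuller sketch of the reference_wrapper idea is below. SomeClass here is a hypothetical stand-in with a single int member, and make_refs is a made-up helper. Keep in mind that the result is a std::vector<std::reference_wrapper<SomeClass>>, not a std::vector<SomeClass>, so it cannot be passed to functions taking const std::vector<SomeClass>&.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct SomeClass { int value; };  // hypothetical stand-in for the real type

// Builds a vector of references to the n objects starting at first.
// Only pointer-sized wrappers are stored; the objects are not copied.
inline std::vector<std::reference_wrapper<SomeClass>>
make_refs(SomeClass* first, std::size_t n) {
    assert(first != nullptr || n == 0);
    return std::vector<std::reference_wrapper<SomeClass>>(first, first + n);
}
```

Writing through refs[i].get() modifies the original array element, which shows that no copy of the objects was made.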
Or, without C++11, you can create a specialization of such a template class for bytes (char). Then, for the constructor from a char* C-array, you can use ::memcpy, which unfortunately will then use twice as much memory.
::memcpy(&v[0], c_arr, n);
Something like this:
template <typename T> class MyVector : public std::vector<T> {
};
template <> class MyVector<char> : public std::vector<char> {
public:
MyVector<char>(char* carr, size_t n) : std::vector<char>(n) {
::memcpy(&operator[](0), carr, n);
}
};
What I would recommend - replace all C-arrays to vectors where possible, then no extra copying will be needed.
C++11 standard has following lines in General Container Requirements.
(23.2.1 - 3)
For the components affected by this subclause that declare an allocator_type, objects stored in these components shall be constructed using the allocator_traits<allocator_type>::construct function and destroyed using the allocator_traits<allocator_type>::destroy function (20.6.8.2). These functions are called only for the container’s element type, not for internal types used by the container.
(23.2.1 - 7)
Unless otherwise specified, all containers defined in this clause obtain memory using an allocator
Is it true or not that all memory used by a container is allocated by the specified allocator? The standard says that internal types are not constructed with allocator_traits::construct, so there should be some kind of call to operator new. But the standard also says that all containers defined in this clause obtain memory using an allocator, which in my opinion means that it can't be the ordinary new operator; it has to be the placement new operator. Am I correct?
Let me show you example, why this is important.
Let's say we have a class, which holds some allocated memory:
#include <unordered_map>
#include <iostream>
#include <cstdint>
#include <limits>
#include <memory>
#include <new>
class Arena
{
public:
Arena(std::size_t size)
{
size_ = size;
location_ = 0;
data_ = nullptr;
if(size_ > 0)
data_ = new(std::nothrow) uint8_t[size_];
}
Arena(const Arena& other) = delete;
~Arena()
{
if(data_ != nullptr)
delete[] data_;
}
Arena& operator =(const Arena& arena) = delete;
uint8_t* allocate(std::size_t size)
{
if(data_ == nullptr)
throw std::bad_alloc();
if((location_ + size) > size_) // a request that exactly fills the arena is still valid
throw std::bad_alloc();
uint8_t* result = &data_[location_];
location_ += size;
return result;
}
void clear()
{
location_ = 0;
}
std::size_t getNumBytesUsed() const
{
return location_;
}
private:
uint8_t* data_;
std::size_t location_, size_;
};
we also have custom allocator:
template <class T> class FastAllocator
{
public:
typedef T value_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
template <class U> class rebind
{
public:
typedef FastAllocator<U> other;
};
Arena* arena;
FastAllocator(Arena& arena_): arena(&arena_) {}
FastAllocator(const FastAllocator& other): arena(other.arena) {}
template <class U> FastAllocator(const FastAllocator<U>& other): arena(other.arena) {}
//------------------------------------------------------------------------------------
pointer allocate(size_type n, std::allocator<void>::const_pointer)
{
return allocate(n);
}
pointer allocate(size_type n)
{
return reinterpret_cast<pointer>(arena->allocate(n * sizeof(T)));
}
//------------------------------------------------------------------------------------
void deallocate(pointer, size_type) {}
//------------------------------------------------------------------------------------
size_type max_size() const
{
return std::numeric_limits<size_type>::max();
}
//------------------------------------------------------------------------------------
void construct(pointer p, const_reference val)
{
::new(static_cast<void*>(p)) T(val);
}
template <class U> void destroy(U* p)
{
p->~U();
}
};
This is how we use it:
typedef std::unordered_map<uint32_t, uint32_t, std::hash<uint32_t>, std::equal_to<uint32_t>,
FastAllocator<std::pair<uint32_t, uint32_t>>> FastUnorderedMap;
int main()
{
// Allocate memory in arena
Arena arena(1024 * 1024 * 50);
FastAllocator<uint32_t> allocator(arena);
FastAllocator<std::pair<uint32_t, uint32_t>> pairAllocator(arena);
FastAllocator<FastUnorderedMap> unorderedMapAllocator(arena);
FastUnorderedMap* fastUnorderedMap = nullptr;
try
{
// allocate memory for unordered map
fastUnorderedMap = unorderedMapAllocator.allocate(1);
// construct unordered map
fastUnorderedMap =
new(reinterpret_cast<void*>(fastUnorderedMap)) FastUnorderedMap
(
0,
std::hash<uint32_t>(),
std::equal_to<uint32_t>(),
pairAllocator
);
// insert something
for(uint32_t i = 0; i < 1000000; ++i)
fastUnorderedMap->insert(std::make_pair(i, i));
}
catch(const std::bad_alloc& badAlloc) // catch by const reference to avoid copying
{
std::cout << "--- BAD ALLOC HAPPENED DURING FAST UNORDERED MAP INSERTION ---" << std::endl;
}
// no destructor of unordered map is called!!!!
return 0;
}
As you can see, the destructor of the unordered_map is never called, but the memory is freed when the arena object is destroyed. Will there be a memory leak, and why?
I would really appreciate any help on this topic.
An allocator is supposed to provide 4 functions (of interest here):
2 are used for memory management: allocate/deallocate
2 are used for objects lifetime management: construct/destroy
The phrase "these functions" in your quote refers only to construct and destroy (which were mentioned in the previous sentence), and not to allocate/deallocate; thus there is no contradiction.
Now, regarding memory leaks, for an arena allocator to work not only should the objects in the container be built using the arena allocator (which the container guarantees) but all the memory those objects allocate should also be obtained from this allocator; this can get slightly more complicated unfortunately.
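To make the allocate/construct distinction concrete, here is a minimal sketch of a C++11 allocator that only counts what it hands out. The names Stats, CountingAllocator and list_allocation_stats are assumptions for this example. A std::list<int> rebinds the allocator to its internal node type, so the node memory still goes through allocate even though construct is only used for the int elements.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

struct Stats { std::size_t allocs = 0; std::size_t bytes = 0; };

// Minimal allocator: forwards to operator new and records each request.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    Stats* stats;

    explicit CountingAllocator(Stats& s) noexcept : stats(&s) {}
    // Converting constructor, used when the container rebinds to its node type.
    template <typename U>
    CountingAllocator(const CountingAllocator<U>& o) noexcept : stats(o.stats) {}

    T* allocate(std::size_t n) {
        ++stats->allocs;
        stats->bytes += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) noexcept {
    return a.stats == b.stats;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) noexcept {
    return !(a == b);
}

// Pushes n ints into a list and returns what the allocator observed.
inline Stats list_allocation_stats(int n) {
    assert(n >= 0);
    Stats s;
    {
        CountingAllocator<int> alloc(s);
        std::list<int, CountingAllocator<int>> l(alloc);
        for (int i = 0; i < n; ++i) l.push_back(i);
    }
    return s;
}
```

The recorded byte count exceeds n * sizeof(int) because each request is for a whole list node (links plus the element), which is exactly the internal memory the question asks about.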