I maintain a old C++ application which has a lot of classes like below:
class ClassWithALotOfVectors
{
std::vector<int> _vector1;
std::vector<int> _vector2;
// a lot of other vector datamembers go here
std::vector<double> _vectorN;
};
that is - a data type where members a vectors of double or int.
The thing is - these vectors are never populated at the same time - and therefore
when we create 100000 instances of ClassWithALotOfVectors - memory usage adds up
to impressive number even though only 10% of these vector are in use.
So I decided to write a small "allocate on demand" vector class.
It is a wrapper around std::vector - where internal vector only create when
accessed for the first time (using two getters - ref() & const_ref() methods)
When I replaced std::vector in a ClassWithALotOfVectors class with a compact_vector
as below:
class ClassWithALotOfVectors2
{
compact_vector<int> _vector1;
compact_vector<int> _vector2;
compact_vector<double> _vectorN;
};
did a few tests and the results were promising - memory usage went down
dramatically, but suddenly found that application does not release memory
at the end - the memory consumptions grows in much slower rate than it
is used to be - but application does not seem to be deallocating the memory
at the end.
Can you look at my implementation of the compact_vector
and see if you can spot something wrong with a memory management.
template <class T> class compact_vector
{
public:
compact_vector<T>()
:_data(NULL)
{
}
~compact_vector<T>()
{
clear();
}
compact_vector<T>(const compact_vector<T> & rhs)
:_data(NULL)
{
if (NULL != rhs._data)
{
_data = new std::vector<T>(*(rhs._data));
}
else
{
clear();
}
}
// assignment
//
compact_vector<T> & operator=(const compact_vector<T> & rhs)
{
if (this != &rhs)
{
clear();
if (NULL != rhs._data)
{
_data = new std::vector<T>(*rhs._data);
}
}
return *this;
}
compact_vector<T> & operator=(const std::vector<T> & rhs)
{
clear();
if (!rhs.empty())
{
_data = new std::vector<T>(rhs);
}
return *this;
}
const std::vector<T> & const_ref()
{
createInternal();
return *_data;
}
std::vector<T> & ref()
{
createInternal();
return *_data;
}
void clear()
{
if (NULL != _data)
{
_data->clear();
delete _data;
_data = NULL;
}
}
private:
void createInternal()
{
if (NULL == _data)
{
_data = new std::vector<T>();
}
}
private:
compact_vector<T>(const std::vector<T> & rhs)
{
}
compact_vector<T>(std::vector<T> & rhs)
{
}
compact_vector<T>(std::vector<T> rhs)
{
}
std::vector<T> * _data;
};
Most implementations of std::vector don't acquire memory until you need it, and the size of a vector is usually just a few (3 + possibly extra debug information) pointers. That is, std::vector<int>() will not allocate space for any object. I believe that you are barking at the wrong tree here.
Why is your original code having a much higher memory usage? Are you expecting clear() to release memory?
If so you are mistaken. The clear() function destroys the contained elements but does not release the allocated memory. Consider adding a wrapper to clear the memory that uses the following idiom:
std::vector<int>().swap( _data );
What the previous line does is creating a new empty vector (usually no memory attached, not mandated, but common implementation) and swapping the contents of the two vectors. At the end of the expression, the temporary contains the data that was originally held by the _data vector and _data is empty and with no memory (implementation defined). The temporary is destructed at the end of the full expression and memory is released.
Alternatively, if your compiler supports C++11, you can use shrink_to_fit() after calling clear(). The standard does not require shrink_to_fit() to actually shrink to fit, but for an empty vector I would expect the implementation to do it. Test it by calling capacity() after the call and seeing whether it has gone down to 0 or not.
Use smart pointers. In this case, std::unique_ptr with hand-made copy ctor / assignment operator. Suddenly, all your problems are solved. It's like magic!
Sorry to sound so snarky, but memory leaks always have the same answer: smart pointer.
Since you are saying it's an old application and can't do much with it. How about having all the vectors having zero size reserved during construction.
It's tedious, but since old app and not much is expected to be modified. May be worth the effort for once.
class ClassWithALotOfVectors
{
std::vector<int> _vector1;
std::vector<int> _vector2;
// a lot of other vector datamembers go here
std::vector<double> _vectorN;
public:
ClassWithALotOfVectors() : Init() { }
void Init()
{
vector1.reserve(0);
/// Like wise other vectors.
}
};
The idea that only one vector at a time is ever in use sounds like the union concept. Sure enough, C++ has an element called an anonymous union that does precisely this.
class ClassWithALotOfVectors {
public:
union {
std::vector<double> _vector1;
std::vector<double> _vector2;
};
ClassWithALotOfVectors() { new(&_vector1) std::vector<double>; };
~ClassWithALotOfVectors() { _vector1.~vector<double>(); };
};
Because the union can't know which element's constructor to call, default constructors are disabled in the union and you have to manually construct and deconstruct the union's element, but I think this will accomplish what you are looking for, aliasing _vector1 and _vector2 to the same element but placing both in the ClassWithALotOfVectors namespace, then deallocating the vector when the destructor for ClassWithALotOfVectors is called.
For more on anonymous unions see:
CPP reference: Anonymous Unions
Related
I have a std::vector<std::unique_ptr<Kind>> which I want to clean up while it is being iterated upon, without explicitly calling the destructor of its members (.reset()).
The Kind is a heavy struct and its size increases during the iteration. The next object doesn't need to know about previous objects so I'd like to clean up an iterand when its not needed.
I know vector will clean up in the end, but by then, lots of Kind and their dynamically allocated memory adds up. I'm trying to reduce peak memory to just one element.
I want to avoid reset since other developers may not know about the dynamic allocation, forget calling reset in the end of the loop and cost memory penalty.
I cannot create a copy,
for(std::unique_ptr<Kind> t : store)
I cannot move it like
for(std::unique_ptr<Kind> &&t : store)
Then how do I do it ?
#include <iostream>
#include <vector>
struct Kind{
char a;
char *array;
Kind(const char c): a(c)
{
}
~Kind(){
free(array); // internal custom deallocator.
}
};
int main() {
std::vector<std::unique_ptr<Kind>> store;
store.push_back(std::make_unique<Kind>('y'));
store.push_back(std::make_unique<Kind>('z'));
for(std::unique_ptr<Kind> &t : store){
// increase size of Kind.array.
std::cout << t->a;
// Use the Kind.array
// clean up t automatically.
}
return 0;
}
Example of moving the element out of the vector.
int main() {
std::vector<std::unique_ptr<Kind>> store;
store.push_back(std::make_unique<Kind>('y'));
for(std::unique_ptr<Kind> &t : store){
auto tmp = std::move(t); // leaving a valid but empty entry in store
std::cout << tmp->a;
// clean up t automatically.
// tmp runs out of scope and cleans up
}
return 0;
}
In effect not much different from the reset, but might be relevant for what you actually do in your real program.
How to take ownership of an object while looping over std::vector of std::unique_ptr using a range based for loop?
Loop with a reference to the element, and std::move the unique pointer into another. Example:
for(std::unique_ptr<Kind> &t : store){
std::unique_ptr<Kind> owner = std::move(t);
// do something with newly owned pointer
I want to clean up
there's no need to keep older structs around
You could deallocate the object by resetting the pointer:
for(std::unique_ptr<Kind> &t : store) {
// do something
t.reset();
That said, this is typically unnecessary. They will be automatically be destroyed when the vector goes out of scope.
I'm trying to save some memory here
If you allocate dynamic objects while iterating this may be useful. Otherwise it won't affect peak memory use.
If you want to make sure the instances are deleted immediately after each iteration and you cannot wait until the entire loop is done, you can write a wrapper that takes care of that and expresses your intent at the same time:
template <typename T>
struct Stealing {
std::unique_ptr<T> ptr;
Stealing(std::unique_ptr<T>& ptr) : ptr(std::move(ptr)) {
}
auto operator*() {
return ptr.operator*();
}
auto operator->() {
return ptr.operator->();
}
}
You can use that in the loop as a drop-in replacement for a unique_ptr as such:
for (Stealing<Kind> t: store) {
// do what you like with t as if it was a std::unique_ptr
// when t goes out of scope, so does its member -> Kind gets destroyed
}
Having a struct such as:
struct PAIR {
vector<double> a;
vector<double> b;
};
Is using a function like the following a proper way to release the memory after defining and populating such a struct? If not, how do you deal with this situation?
void release(PAIR& p){
vector<double>().swap(p.a);
vector<double>().swap(p.b);
}
Isn't there a way to call some predefined/std function on PAIR itself to release memory?
Note that I'm not using new, etc. so definitions are simply like PAIR p;. Also, the struct is much more complex than just a pair of vectors that could have been defined using a std::pair.
All the related questions in SO on releasing memory for vectors are either about vectors themselves or vectors of a struct, not a struct containing multiple vectors. I'm looking for an elegant way to release memory used by such a struct.
Context
The vectors get really big, and I want to release the memory as soon as I can. But that lifetime/usability reaches in the middle of function! I don't want to spread the functionality in this function to multiple functions. These are pretty complicated computations and don't want to mess things up.
Given function does not release memory on the stack actually. It is approximately equivalent to
p.a.clear();
p.a.shrink_to_fit();
The vector itself remains in the memory (just with 0 elements).
Remember, any memory that was allocated on the stack (~ without the use of new) gets released only when the variable occupying this memory goes out of scope, not earlier.
So if you have a variable on the stack and want to delete it, you can just limit its scope:
struct PAIR {
vector<double> a;
vector<double> b;
};
int main()
{
// some stuff before...
{
PAIR p;
// some stuff with p object...
} // here p gets deleted (all memory gets released)
// some stuff after...
}
You mentioned new PAIR. With pointers it would look like this:
int main()
{
// some stuff before...
PAIR* p = new PAIR;
// some stuff with p object...
delete p; // here p gets deleted (all memory gets released)
// some stuff after...
}
Or as commentators requested:
int main()
{
// some stuff before...
{
auto p = std::make_unique<PAIR>();
// some stuff with p...
} // here p gets deleted (all memory gets released)
// some stuff after...
}
Is that what you wanted to achieve?
Does PAIR have to be a POD? Maybe something like this might work for you?
struct PAIR
{
private:
std::unique_ptr<std::vector<double>> aptr;
std::unique_ptr<std::vector<double>> bptr;
PAIR(const PAIR&) = delete;
public:
PAIR() : aptr(std::make_unique<std::vector<double>()),
bptr(std::make_unique<std::vector<double>()) {}
~PAIR() { release(); }
std::vector<double> &a = *aptr;
std::vector<double> &b = *bptr;
void release()
{
aptr.reset();
bptr.reset();
}
...
};
simply .resize(0) the vectors.
I have an OOP entity-component system that currently works like this:
// In the component system
struct Component { virtual void update() = 0; }
struct Entity
{
bool alive{true};
vector<unique_ptr<Component>> components;
void update() { for(const auto& c : components) c->update(); }
}
// In the user application
struct MyComp : Component
{
void update() override { ... }
}
To create new entities and components, I use C++'s usual new and delete:
// In the component system
struct Manager
{
vector<unique_ptr<Entity>> entities;
Entity& createEntity()
{
auto result(new Entity);
entities.emplace_back(result);
return *result;
}
template<typename TComp, typename... TArgs>
TComp& createComponent(Entity& mEntity, TArgs... mArgs)
{
auto result(new TComp(forward<TArgs>(mArgs)...));
mEntity.components.emplace_back(result);
return result;
}
void removeDead() { /* remove all entities with 'alive == false' - 'delete' is called here by the 'unique_ptr' */ }
}
// In the user application
{
Manager m;
auto& myEntity(m.createEntity());
auto& myComp(m.createComponent<MyComp>(myEntity));
// Do stuff with myEntity and myComp
m.removeDead();
}
The system works fine, and I like the syntax and flexibility. However, when continuously adding and removing entities and components to the manager, memory allocation/deallocation slows down the application. (I've profiled and determined that the slow down is caused by new and delete).
I've recently read that it's possible to pre-allocate heap memory in C++ - how can that be applied to my situation?
Desired result:
// In the user application
{
Manager m{1000};
// This manager can hold about 1000 entities with components
// (may not be 1000 because of dynamic component size,
// since the user can define it's on components, but it's ok for me)
auto& myEntity(m.createEntity());
auto& myComp(m.createComponent<MyComp>(myEntity));
// Do stuff with myEntity and myComp
m.removeDead();
// No 'delete' is called here! Memory of the 'dead' entities can
// be reused for new entity creation
}
// Manager goes out of scope: 'delete' is called here
There are a few things you can do to make the implementation of your design scale better.
In your current implementation there are two memory allocations per Entity and Component. The first one allocates an object and the second one when the object is put into the vector. The second one happens when the vector runs out of space and allocates a bigger array and moves the old elements into the new array.
In this case the best you can do is to use intrusive lists. That is, each of Entity and Component become also list nodes. Then, after these have been allocated no extra memory allocations are necessary to put the object into a list. Use a single or double-linked list from Boost.Intrusive, or write your own. This is how Linux kernel keeps track of many different objects.
The next step is to preallocate Entity and Component elements. Preallocating could be something as simple as a global array of these, or something more sophisticated, such as Boost.Pool. There are quite a few ways to build a memory pool of objects.
Once Entity and Component are preallocated and intrusive lists are used you are done.
An example which uses boost components:
#include <boost/intrusive/list.hpp>
#include <boost/pool/pool_alloc.hpp>
#include <new>
namespace bi = boost::intrusive;
// api.h
//
// Object pooling support begin.
//
template<class T>
struct Pool
{
static boost::pool_allocator<T> pool;
};
// Singleton. Although it is defined in the header, the linkers
// make sure there is only one instance of it in the application.
// It is instantiated on demand when Pool<T> is used.
template<class T>
boost::pool_allocator<T> Pool<T>::pool;
template<class Derived>
struct Pooled // use it on the most derived class only, not on intermediate base classes
{
// Automatically use the object pool for plain new/delete.
static void* operator new(size_t) { return Pool<Derived>::pool.allocate(1); }
static void operator delete(void* p) { return Pool<Derived>::pool.deallocate(static_cast<Derived*>(p), 1); }
};
//
// Object pooling support end.
//
// Using bi::list_base_hook<bi::link_mode<bi::auto_unlink> > because it automatically
// unlinks from the list when the object is destroyed. No need to manually
// remove the object from the list when an object is about to be destroyed.
struct Component
: bi::list_base_hook<bi::link_mode<bi::auto_unlink> > // make it an intrusive list node
{
virtual void update() = 0;
virtual ~Component() {}
};
struct Entity
: bi::list_base_hook<bi::link_mode<bi::auto_unlink> > // make it an intrusive list node
, Pooled<Entity> // optional, make it allocated from the pool
{
bool active = false;
bi::list<Component, bi::constant_time_size<false> > components;
~Entity() {
for(auto i = components.begin(), j = components.end(); i != j;)
delete &*i++; // i++ to make sure i stays valid after the object is destroyed
}
void update() {
for(auto& c : components)
c.update();
}
};
struct Manager
{
bi::list<Entity, bi::constant_time_size<false> > entities;
~Manager() {
for(auto i = entities.begin(), j = entities.end(); i != j;)
delete &*i++; // i++ to make sure i stays valid after the object is destroyed
}
Entity& createEntity() {
auto result = new Entity;
entities.push_back(*result);
return *result;
}
template<typename TComp, typename... TArgs>
TComp& createComponent(Entity& mEntity, TArgs... mArgs)
{
auto result = new TComp(std::forward<TArgs>(mArgs)...);
mEntity.components.push_back(*result);
return *result;
}
void removeDead() {
for(auto i = entities.begin(), j = entities.end(); i != j;) {
auto& entity = *i++;
if(!entity.active)
delete &entity;
}
}
};
// user.cc
struct MyComp
: Component
, Pooled<MyComp> // optional, make it allocated from the pool
{
void update() override {}
};
int main() {
Manager m;
auto& myEntity(m.createEntity());
auto& myComp(m.createComponent<MyComp>(myEntity));
m.removeDead();
}
In the above example boost::pool_allocator<T> actually uses new to allocate objects and then it keeps reusing destroyed objects rather than invoking delete on them. You can do better by preallocating all objects, but there are many ways to do so depending on your requirements, so that I use boost::pool_allocator<T> for simplicity to avoid hair splitting here. You can change the implementation of Pooled<T> to something like Pooled<T, N> where N stands for the maximum number of objects, the rest of the code stays the same because it uses plain new/delete which happen to be overridden for objects allocated from a pool.
C++ supports class-specific memory pools for this kind of thing. The general-purpose new/delete pair inevitably trades off among
Time spent searching for a free block of the right size to meet each request
Time spent coalescing free blocks
Time spent maintaining and perhaps reorganizing internal data structures to make the above two operations faster.
The primary way to gain speed is to avoid these tradeoffs entirely with custom allocators that - as you say - pre-allocate a big chunk of memory viewed as a simple array of free objects all of the same size. Initially these are all linked on a free list, where the link pointers occupy the first bytes of each block "overlaid" where the data will eventually go. Allocation is just unchaining a block from the head of the free list - a "pop" operation needing about 2 instructions. Deallocation is a "push:" two more instructions. In many cases, memory hardware can be set to generate a trap when the the pool is empty so there is no per-allocation overhead for detecting
this error condition. (In GC systems the same trick is used to initiate collection with no overhead.)
In your case you'd need two pools: one for Entities and one for Components.
Defining your own pool allocator is not so hard, especially if your application is single threaded. See this document for a tutorial treatment.
Using most of answers and Google as references, I implemented some pre-allocation utilities in my SSVUtils library.
Prealloc.h
Example:
using MemUnit = char;
using MemUnitPtr = MemUnit*;
using MemSize = decltype(sizeof(MemUnit)); // Should always be 1 byte
class MemBuffer
{
Uptr<MemUnit[]> buffer;
MemRange range;
MemBuffer(MemSize mSize) : ...
{
// initialize buffer from mSize
}
};
class PreAllocatorChunk
{
protected:
MemSize chunkSize;
MemBuffer buffer;
std::stack<MemRange> available;
public:
PreAllocatorChunk(MemSize mChunkSize, unsigned int mChunks) : ...
{
// Add "chunks" to to available...
}
template<typename T, typename... TArgs> T* create(TArgs&&... mArgs)
{
// create on first "chunk" using placement new
auto toUse(available.top().begin); available.pop();
return new (toUse) T{std::forward<TArgs>(mArgs)...};
}
};
More pre-allocation utilities are available:
PreAllocatorDynamic: pre-allocates a big buffer, then, when creating an object, splits the buffer in two parts:
[buffer start, buffer start + obj size)
[buffer start + obj size, buffer end)
When an object is destroyed, its occupied memory range is set as "available". If during creation of a new object no big enough "chunk" is found, the pre-allocator tries to unify contiguous memory chunks before throwing a runtime exception. This pre-allocator is sometimes faster than new/delete, but it greatly depends on the size of pre-allocated buffer.
PreAllocatorStatic<T>: inherited from PreAllocatorChunk. Size of a chunk is equal to sizeof(T). Fastest pre-allocator, less flexible. Almost always faster than new/delete.
One of your issues can be solved by allocating enough space in the vectors on their creation
For
vector<unique_ptr<Entity>> entities;
provide enough space in the constructor
Manager::Manager() : entities(10000)
{
//...
}
Thus, you avoid reallocations and copying during later stages.
The second issue is the creation of your unique_ptr<Entity> pointers. Here, as you will always use default constructed objects, you could also use a pre-allocated pool of objects from which you create the pointers. Instead of calling new you would call an own class
class EntityPool
{
public:
EntityPool(unsigned int size = 10000) : pool(size), nextEntity(0)
{
}
Entity* getNext(void)
{
if (nextEntity != pool.size()) // if pool is exhausted create new
{
pool.emplace_back(Entity());
}
return pool[nextEntity++];
}
private:
vector<Entity> pool;
unsigned int nextEntity; // index into the vector to the next Entity
};
struct Manager
{
vector<unique_ptr<Entity>> entities;
Entity& createEntity()
{
entities.emplace_back(entityPoolInstance.getNext());
return *result;
}
//...
Or you could wire in the standard 'placement new'. This allows you to allocate a big block of memory to construct (place) objects into as you choose. This will keep the block on the heap for as long as you need it, and allow you to allocate multiple short lived objects into this block instead of doing the costly allocations and de-allocations that just end up fragmenting the heap. There's a couple of gotcha's involved, but all in all its a REALLY simple solution without having to go down the custom memory manager route.
Here's an excellent treatment of removing some of the pitfalls and describing placement new in detail.
I've used data structures as simple as a stack to keep track of the next free block to allocate into: push the address of a block that's about to be deleted onto the stack. When allocating just pop the next free block off of the stack and use that as the arg to placement new. Super easy and super fast!
I have a program that contains a processing phase that needs to use a bunch of different object instances (all allocated on the heap) from a tree of polymorphic types, all eventually derived from a common base class.
As the instances may cyclically reference each other, and do not have a clear owner, I want allocated them with new, handle them with raw pointers, and leave them in memory for the phase (even if they become unreferenced), and then after the phase of the program that uses these instances, I want to delete them all at once.
How I thought to structure it is as follows:
struct B; // common base class
vector<unique_ptr<B>> memory_pool;
struct B
{
B() { memory_pool.emplace_back(this); }
virtual ~B() {}
};
struct D : B { ... }
int main()
{
...
// phase begins
D* p = new D(...);
...
// phase ends
memory_pool.clear();
// all B instances are deleted, and pointers invalidated
...
}
Apart from being careful that all B instances are allocated with new, and that noone uses any pointers to them after the memory pool is cleared, are there problems with this implementation?
Specifically I am concerned about the fact that the this pointer is used to construct a std::unique_ptr in the base class constructor, before the derived class constructor has completed. Does this result in undefined behaviour? If so is there a workaround?
In case you haven't already, familiarize yourself with Boost.Pool. From the Boost documentation:
What is Pool?
Pool allocation is a memory allocation scheme that is very fast, but
limited in its usage. For more information on pool allocation (also
called simple segregated storage, see concepts concepts and Simple Segregated Storage.
Why should I use Pool?
Using Pools gives you more control over how memory is used in your
program. For example, you could have a situation where you want to
allocate a bunch of small objects at one point, and then reach a point
in your program where none of them are needed any more. Using pool
interfaces, you can choose to run their destructors or just drop them
off into oblivion; the pool interface will guarantee that there are no
system memory leaks.
When should I use Pool?
Pools are generally used when there is a lot of allocation and
deallocation of small objects. Another common usage is the situation
above, where many objects may be dropped out of memory.
In general, use Pools when you need a more efficient way to do unusual
memory control.
Which pool allocator should I use?
pool_allocator is a more general-purpose solution, geared towards
efficiently servicing requests for any number of contiguous chunks.
fast_pool_allocator is also a general-purpose solution but is geared
towards efficiently servicing requests for one chunk at a time; it
will work for contiguous chunks, but not as well as pool_allocator.
If you are seriously concerned about performance, use
fast_pool_allocator when dealing with containers such as std::list,
and use pool_allocator when dealing with containers such as
std::vector.
Memory management is tricky business (threading, caching, alignment, fragmentation, etc. etc.) For serious production code, well-designed and carefully optimized libraries are the way to go, unless your profiler demonstrates a bottleneck.
Your idea is great and millions of applications are already using it. This pattern is most famously known as «autorelease pool». It forms a base for ”smart” memory management in Cocoa and Cocoa Touch Objective-C frameworks. Despite the fact that C++ provides hell of a lot of other alternatives, I still think this idea got a lot of upside. But there are few things where I think your implementation as it stands may fall short.
The first problem that I can think of is thread safety. For example, what happens when objects of the same base are created from different threads? A solution might be to protect the pool access with mutually exclusive locks. Though I think a better way to do this is to make that pool a thread-specific object.
The second problem is invoking an undefined behavior in case where derived class's constructor throws an exception. You see, if that happens, the derived object won't be constructed, but your B's constructor would have already pushed a pointer to this to the vector. Later on, when the vector is cleared, it would try to call a destructor through a virtual table of the object that either doesn't exist or is in fact a different object (because new could reuse that address).
The third thing I don't like is that you have only one global pool, even if it is thread-specific, that just doesn't allow for a more fine grained control over the scope of allocated objects.
Taking the above into account, I would do a couple of improvements:
Have a stack of pools for more fine-grained scope control.
Make that pool stack a thread-specific object.
In case of failures (like exception in derived class constructor), make sure the pool doesn't hold a dangling pointer.
Here is my literally 5 minutes solution, don't judge for quick and dirty:
#include <new>
#include <set>
#include <stack>
#include <cassert>
#include <memory>
#include <stdexcept>
#include <iostream>
#define thread_local __thread // Sorry, my compiler doesn't C++11 thread locals
struct AutoReleaseObject {
AutoReleaseObject();
virtual ~AutoReleaseObject();
};
class AutoReleasePool final {
public:
AutoReleasePool() {
stack_.emplace(this);
}
~AutoReleasePool() noexcept {
std::set<AutoReleaseObject *> obj;
obj.swap(objects_);
for (auto *p : obj) {
delete p;
}
stack_.pop();
}
static AutoReleasePool &instance() {
assert(!stack_.empty());
return *stack_.top();
}
void add(AutoReleaseObject *obj) {
objects_.insert(obj);
}
void del(AutoReleaseObject *obj) {
objects_.erase(obj);
}
AutoReleasePool(const AutoReleasePool &) = delete;
AutoReleasePool &operator = (const AutoReleasePool &) = delete;
private:
// Hopefully, making this private won't allow users to create pool
// not on stack that easily... But it won't make it impossible of course.
void *operator new(size_t size) {
return ::operator new(size);
}
std::set<AutoReleaseObject *> objects_;
struct PrivateTraits {};
AutoReleasePool(const PrivateTraits &) {
}
struct Stack final : std::stack<AutoReleasePool *> {
Stack() {
std::unique_ptr<AutoReleasePool> pool
(new AutoReleasePool(PrivateTraits()));
push(pool.get());
pool.release();
}
~Stack() {
assert(!stack_.empty());
delete stack_.top();
}
};
static thread_local Stack stack_;
};
thread_local AutoReleasePool::Stack AutoReleasePool::stack_;
AutoReleaseObject::AutoReleaseObject()
{
AutoReleasePool::instance().add(this);
}
AutoReleaseObject::~AutoReleaseObject()
{
AutoReleasePool::instance().del(this);
}
// Some usage example...
struct MyObj : AutoReleaseObject {
MyObj() {
std::cout << "MyObj::MyObj(" << this << ")" << std::endl;
}
~MyObj() override {
std::cout << "MyObj::~MyObj(" << this << ")" << std::endl;
}
void bar() {
std::cout << "MyObj::bar(" << this << ")" << std::endl;
}
};
struct MyObjBad final : AutoReleaseObject {
MyObjBad() {
throw std::runtime_error("oops!");
}
~MyObjBad() override {
}
};
void bar()
{
AutoReleasePool local_scope;
for (int i = 0; i < 3; ++i) {
auto o = new MyObj();
o->bar();
}
}
void foo()
{
for (int i = 0; i < 2; ++i) {
auto o = new MyObj();
bar();
o->bar();
}
}
int main()
{
std::cout << "main start..." << std::endl;
foo();
std::cout << "main end..." << std::endl;
}
Hmm, I needed almost exactly the same thing recently (memory pool for one phase of a program that gets cleared all at once), except that I had the additional design constraint that all my objects would be fairly small.
I came up with the following "small-object memory pool" -- perhaps it will be of use to you:
#pragma once
#include "defs.h"
#include <cstdint> // uintptr_t
#include <cstdlib> // std::malloc, std::size_t
#include <type_traits> // std::alignment_of
#include <utility> // std::forward
#include <algorithm> // std::max
#include <cassert> // assert
// Small-object allocator that uses a memory pool.
// Objects constructed in this arena *must not* have delete called on them.
// Allows all memory in the arena to be freed at once (destructors will
// be called).
// Usage:
// SmallObjectArena arena;
// Foo* foo = arena::create<Foo>();
// arena.free(); // Calls ~Foo
class SmallObjectArena
{
private:
typedef void (*Dtor)(void*);
struct Record
{
Dtor dtor;
short endOfPrevRecordOffset; // Bytes between end of previous record and beginning of this one
short objectOffset; // From the end of the previous record
};
struct Block
{
size_t size;
char* rawBlock;
Block* prevBlock;
char* startOfNextRecord;
};
template<typename T> static void DtorWrapper(void* obj) { static_cast<T*>(obj)->~T(); }
public:
explicit SmallObjectArena(std::size_t initialPoolSize = 8192)
: currentBlock(nullptr)
{
assert(initialPoolSize >= sizeof(Block) + std::alignment_of<Block>::value);
assert(initialPoolSize >= 128);
createNewBlock(initialPoolSize);
}
~SmallObjectArena()
{
this->free();
std::free(currentBlock->rawBlock);
}
template<typename T>
inline T* create()
{
return new (alloc<T>()) T();
}
template<typename T, typename A1>
inline T* create(A1&& a1)
{
return new (alloc<T>()) T(std::forward<A1>(a1));
}
template<typename T, typename A1, typename A2>
inline T* create(A1&& a1, A2&& a2)
{
return new (alloc<T>()) T(std::forward<A1>(a1), std::forward<A2>(a2));
}
template<typename T, typename A1, typename A2, typename A3>
inline T* create(A1&& a1, A2&& a2, A3&& a3)
{
return new (alloc<T>()) T(std::forward<A1>(a1), std::forward<A2>(a2), std::forward<A3>(a3));
}
// Calls the destructors of all currently allocated objects
// then frees all allocated memory. Destructors are called in
// the reverse order that the objects were constructed in.
void free()
{
// Destroy all objects in arena, and free all blocks except
// for the initial block.
do {
char* endOfRecord = currentBlock->startOfNextRecord;
while (endOfRecord != reinterpret_cast<char*>(currentBlock) + sizeof(Block)) {
auto startOfRecord = endOfRecord - sizeof(Record);
auto record = reinterpret_cast<Record*>(startOfRecord);
endOfRecord = startOfRecord - record->endOfPrevRecordOffset;
record->dtor(endOfRecord + record->objectOffset);
}
if (currentBlock->prevBlock != nullptr) {
auto memToFree = currentBlock->rawBlock;
currentBlock = currentBlock->prevBlock;
std::free(memToFree);
}
} while (currentBlock->prevBlock != nullptr);
currentBlock->startOfNextRecord = reinterpret_cast<char*>(currentBlock) + sizeof(Block);
}
private:
template<typename T>
static inline char* alignFor(char* ptr)
{
const size_t alignment = std::alignment_of<T>::value;
return ptr + (alignment - (reinterpret_cast<uintptr_t>(ptr) % alignment)) % alignment;
}
template<typename T>
T* alloc()
{
char* objectLocation = alignFor<T>(currentBlock->startOfNextRecord);
char* nextRecordStart = alignFor<Record>(objectLocation + sizeof(T));
if (nextRecordStart + sizeof(Record) > currentBlock->rawBlock + currentBlock->size) {
createNewBlock(2 * std::max(currentBlock->size, sizeof(T) + sizeof(Record) + sizeof(Block) + 128));
objectLocation = alignFor<T>(currentBlock->startOfNextRecord);
nextRecordStart = alignFor<Record>(objectLocation + sizeof(T));
}
auto record = reinterpret_cast<Record*>(nextRecordStart);
record->dtor = &DtorWrapper<T>;
assert(objectLocation - currentBlock->startOfNextRecord < 32768);
record->objectOffset = static_cast<short>(objectLocation - currentBlock->startOfNextRecord);
assert(nextRecordStart - currentBlock->startOfNextRecord < 32768);
record->endOfPrevRecordOffset = static_cast<short>(nextRecordStart - currentBlock->startOfNextRecord);
currentBlock->startOfNextRecord = nextRecordStart + sizeof(Record);
return reinterpret_cast<T*>(objectLocation);
}
void createNewBlock(size_t newBlockSize)
{
auto raw = static_cast<char*>(std::malloc(newBlockSize));
auto blockStart = alignFor<Block>(raw);
auto newBlock = reinterpret_cast<Block*>(blockStart);
newBlock->rawBlock = raw;
newBlock->prevBlock = currentBlock;
newBlock->startOfNextRecord = blockStart + sizeof(Block);
newBlock->size = newBlockSize;
currentBlock = newBlock;
}
private:
Block* currentBlock;
};
To answer your question, you're not invoking undefined behaviour since nobody is using the pointer until the object is fully constructed (the pointer value itself is safe to copy around until then). However, it's a rather intrusive method, as the object(s) themselves need to know about the memory pool. Additionally, if you're constructing a large number of small objects, it would likely be faster to use an actual pool of memory (like my pool does) instead of calling out to new for every object.
Whatever pool-like approach you use, be careful that the objects are never manually deleteed, because that would lead to a double free!
I still think this is an interesting question without a definitive reply, but please let me break it down into the different questions you are actually asking:
1.) Does inserting a pointer to a base class into a vector before initialisation of a subclass prevent or cause issues with retrieving inherited classes from that pointer. [slicing for example.]
Answer: No, so long as you are 100% sure of the relevant type that is being pointed to, this mechanism does not cause these issues however note the following points:
If the derived constructor fails, you are left with an issue later when you are likely to have a dangling pointer at least sitting in the vector, as that address space it [the derived class] thought it was getting would be freed to the operating environment on failure, but the vector still has the address as being of the base class type.
Note that a vector, although kind of useful, is not the best structure for this, and even if it was, there should be some inversion of control involved here to allow the vector object to control initialisation of your objects, so that you have awareness of success/failure.
These points lead to the implied 2nd question:
2.) Is this a good pattern for pooling?
Answer: Not really, for the reasons mentioned above, plus others (Pushing a vector past it's end point basically ends up with a malloc which is unnecessary and will impact performance.) Ideally you want to use a pooling library, or a template class, and even better, separate the allocation/de-allocation policy implementation away from the pool implementation, with a low level solution already being hinted at, which is to allocate adequate pool memory from pool initialisation, and then use this using pointers to void from within the pool address space (See Alex Zywicki's solution above.) Using this pattern, the pool destruction is safe as the pool which will be contiguous memory can be destroyed en masse without any dangling issues, or memory leaks through losing all references to an object (losing all reference to an object whose address is allocated through the pool by the storage manager leaves you with dirty chunk/s, but will not cause a memory leak as it is managed by the pool implementation.
In the early days of C/C++ (before mass proliferation of the STL), this was a well discussed pattern and many implementations and designs can be found out there in good literature: As an example:
Knuth (1973 The art of computer programming: Multiple volumes), and for a more complete list, with more on pooling, see:
http://www.ibm.com/developerworks/library/l-memory/
The 3rd implied question seems to be:
3) Is this a valid scenario to use pooling?
Answer: This is a localised design decision based on what you are comfortable with, but to be honest, your implementation (no controlling structure/aggregate, possibly cyclic sharing of sub sets of objects) suggests to me that you would be better off with a basic linked list of wrapper objects, each of which contains a pointer to your superclass, used only for addressing purposes. Your cyclical structures are built on top of this, and you simply amend/grow shrink the list as required to accommodate all of your first class objects as required, and when finished, you can then easily destroy them in effectively an O(1) operation from within the linked list.
Having said that, I would personally recommend that at this time (when you have a scenario where pooling does have a use and so you are in the right mind-set) to carry out the building of a storage management/pooling set of classes that are paramaterised/typeless now as it will hold you in good stead for the future.
This sounds what I have heard called a Linear Allocator.
I will explain the basics of how I understand how it works.
Allocate a block of memory using ::operator new(size);
Have a void* that is your Pointer to the next free space in memory.
You will have an alloc(size_t size) function that will give you a pointer to the location in the block from step one for you to construct on to using Placement New
Placement new looks like... int* i = new(location)int(); where location is a void* to a block of memory you alloced from the allocator.
when you are done with all of your memory you will call a Flush() function that will dealloc the memory from the pool or at least wipe the data clean.
I programmed one of these recently and i will post my code here for you as well as do my best to explain.
#include <iostream>
class LinearAllocator:public ObjectBase
{
public:
LinearAllocator();
LinearAllocator(Pool* pool,size_t size);
~LinearAllocator();
void* Alloc(Size_t size);
void Flush();
private:
void** m_pBlock;
void* m_pHeadFree;
void* m_pEnd;
};
don't worry about what i'm inheriting from. i have been using this allocator in conjunction with a memory pool. but basically instead of getting the memory from operator new i am getting memory from a memory pool. the internal workings are the same essentially.
Here is the implementation:
LinearAllocator::LinearAllocator():ObjectBase::ObjectBase()
{
m_pBlock = nullptr;
m_pHeadFree = nullptr;
m_pEnd=nullptr;
}
LinearAllocator::LinearAllocator(Pool* pool,size_t size):ObjectBase::ObjectBase(pool)
{
if (pool!=nullptr) {
m_pBlock = ObjectBase::AllocFromPool(size);
m_pHeadFree = * m_pBlock;
m_pEnd = (void*)((unsigned char*)*m_pBlock+size);
}
else{
m_pBlock = nullptr;
m_pHeadFree = nullptr;
m_pEnd=nullptr;
}
}
LinearAllocator::~LinearAllocator()
{
if (m_pBlock!=nullptr) {
ObjectBase::FreeFromPool(m_pBlock);
}
m_pBlock = nullptr;
m_pHeadFree = nullptr;
m_pEnd=nullptr;
}
MemoryBlock* LinearAllocator::Alloc(size_t size)
{
if (m_pBlock!=nullptr) {
void* test = (void*)((unsigned char*)m_pEnd-size);
if (m_pHeadFree<=test) {
void* temp = m_pHeadFree;
m_pHeadFree=(void*)((unsigned char*)m_pHeadFree+size);
return temp;
}else{
return nullptr;
}
}else return nullptr;
}
void LinearAllocator::Flush()
{
if (m_pBlock!=nullptr) {
m_pHeadFree=m_pBlock;
size_t size = (unsigned char*)m_pEnd-(unsigned char*)*m_pBlock;
memset(*m_pBlock,0,size);
}
}
This code is fully functional except for a few lines which will need to be changed because of my inheritance and use of the memory pool. but I bet you can figure out what needs to change and just let me know if you need a hand changing the code. This code has not been tested in any sort of professional manor and is not guaranteed to be thread safe or anything fancy like that. i just whipped it up and thought i could share it with you since you seemed to need help.
I also have a working implementation of a fully generic memory pool if you think it may help you. I can explain how it works if you need.
Once again if you need any help let me know. Good luck.
So having
struct ResultStructure
{
ResultStructure(const ResultStructure& other)
{
// copy code in here ? using memcpy ? how???
}
ResultStructure& operator=(const ResultStructure& other)
{
if (this != &other) {
// copy code in here ?
}
return *this
}
int length;
char* ptr;
};
How to implement "copy constructor" and "assignment operator"? (sorry - I am C++ nube)
Update: sbi and others ask - why do I want to manually deal with raw memory? My answer is simple - In a students project I am part of now we use lots of C library's such as for example OpenCV OpenAL and FFmpeg and there are more to come. Currently using C++ we try to create a graph based direct show like cross platform library that would be helpful in live video broadcasting and processing. Our graph elements currently use char* and int pairs for data exchange. To cast data to subscribed elements we use raw memcpy now. I want to go further and make it possible for us to make our graph elements base C++ template. So that one graph element would be capable of of sharing current graph element data with other Graph elements and that data it shares would be a structure that would contain not one char* one int but any number of data fields and nearly any elements inside. So that is why I need to understand how to create a basic C++ structure that implements "copy constructor" and "assignment operator" for me to be capable to use new for us data casting algorithms like
void CastData(T item){
for(size_t i = 0 ; i < FuncVec.size(); i++){
T dataCopy = item;
FuncVec[i](dataCopy);
}
}
instead of currently used
void CastData(char * data, int length){
for(size_t i = 0 ; i < FuncVec.size(); i++){
char* dataCopy = new char[length];
memcpy(dataCopy, data, length);
FuncVec[i](dataCopy, length);
delete[] dataCopy;
}
}
You might want to explain why you want to manually deal with raw memory. I haven't done this in a long time, it's what std::string and std::vector where designed for:
struct ResultStructure
{
// nothing else needed
std::string data; // or std::vector<char>
};
But if you really need to do this the hard way (is this homework?), then be advised that it is, at first, surprisingly hard to get this right. For example, a naive implementation of the assignment operator might be like this:
// DON'T TRY THIS AT HOME!!
ResultStructure& ResultStructure::operator=(const ResultStructure& rhs)
{
delete[] ptr; // free old ressource
ptr = new char[rhs.length]; // allocate new resourse
std::copy(rhs.ptr, rhs.ptr+rhs.length, ptr; // copy data
length = rhs.length;
}
If someone accidentally assigns an object to itself (which might happen if all you have is two references and you don't suspect them to refer to the same object), then this will fail fatally.
Also, what if new throws an exception? (It might throw std::bad_alloc if memory is exhausted.) Then we have already deleted the old data and have not allocated new data. The pointer, however, still points at where the old data used to be (actually, I think this is implementation-defined, but I have yet to see an implementation that changes a ptr upon deletion), and the class' destructor (you knew that class would need a destructor, right?) would then attempt to delete a piece of data at an address where no data is allocated. That's Undefined Behavior. The best you can hope for is that it crashes immediately.
The easiest way to do this is to employ the Copy-And-Swap idiom:
struct ResultStructure
{
ResultStructure(const ResultStructure& other)
: ptr(new char[rhs.length]), length(rhs.length)
{
std::copy(rhs.ptr, rhs.ptr+rhs.length, ptr);
}
~ResultStructure() // your class needs this
{
delete[] ptr;
}
ResultStructure& operator=(ResultStructure rhs) // note: passed by copy
{
this->swap(rhs);
return *this
}
void swap(const ResultStruct& rhs)
{
using std::swap;
swap(length, rhs.length);
swap(ptr, rhs.ptr);
}
std::size_t length;
char* ptr;
};
Note that I have added a destructor, changed the assignment operator to pass the argument per copy (we need the copy constructor invoked to allocate memory), and added a swap member function. Per convention a swap() function never throws and is fast, preferably O(1).
I know that GMan's discussion of the Copy-And-Swap idiom is exhaustive enough to be and exhausting while mine is probably too terse for you, and you will likely not understand a lot of it at first, but try to persevere and to understand as much as possible.
If you use std::string, instead of char*, you would not even need to write operator= or copy-constructor. The compiler generated code would do your job very well.
But as a general solution (for some other scenario), use copy-and-swap idiom:
Copy-and-Swap Idiom
What is the copy-and-swap idiom?
Exceptional C++ by Herb Sutter has described these in great detail. I would recommend you to read items from this book. For the time being, you can read this article online:
Exception-Safe Generic Containers
The easy solution is to use a std::string instead of char* member.
Then the compiler-generated copy constructor and copy assignment operator just work.
As a rule, and especially as a novice, don't have raw pointer members.
Cheers & hth.,
As has been said, and as was recommending in the question this emanated from, you should probably reuse an existing container. At least the code would be right :)
For educational purposes though, let's examine this structure:
class Result
{
public:
private:
size_t length; // can't really be negative, right ?
char* data;
};
Here, we need explicit memory management. This implies, notably, following the Rule Of Three (say thanks to Fred)
Let's begin with actually building our object:
Result::Result(): length(0), data(0) {}
Result::Result(size_t l, char* d): length(0), data(0)
{
if (!d) { return; }
data = new char[l]; // this may throw, so we do it first
length = l;
memcpy(data, d, l);
}
Now we can implement the traditional operators:
// Swap
void Result::swap(Result& other)
{
using std::swap;
swap(length, other.length);
swap(data, other.data);
}
// Copy Constructor
Result::Result(Result const& other): length(0), data(0)
{
if (!other.length) { return; }
data = new char[other.length];
length = other.length;
mempcy(data, other.data, length);
}
// Assignemt Operator
Result& Result::operator=(Result other)
{
this->swap(other);
return *this;
}
// !IMPORTANT!
// Destructor
Result::~Result()
{
delete[] data; // use the array form of delete
// no need to test for nullity, it's handled
}
this is std::vector<char>'s job - or is this homework?
the vector would replace both length and the allocation behind ptr. the vector is the c++ idiom, and you'll not make friends with other c++ devs if you implement your classes like you've described. of course, there are corner cases, but standard containers such as vector are the default.
the vector knows how to copy chars as well as itself, and the implementations are optimized and tested.
here's how to explicitly implement copy ctor/assign using a vector:
struct ResultStructure {
ResultStructure(const ResultStructure& other) : d_chars(other.d_chars) {
}
ResultStructure& operator=(const ResultStructure& other) {
if (this != &other) {
this->d_chars = other.d_chars;
}
return *this;
}
std::vector<char> d_chars;
};
I think this should do the work:
struct ResultStructure
{
ResultStructure(const ResultStructure& other);
ResultStructure& operator=(const ResultStructure& other);
int length;
char* ptr;
};
ResultStructure::ResultStructure(const ResultStructure& other):length(other.length)
{
ptr = (char*)malloc(length);
memcpy(ptr, other.ptr, length);
}
ResultStructure& ResultStructure::operator=(const ResultStructure& other)
{
length = other.length;
ptr = (char*)malloc(length);
memcpy(ptr, other.ptr, length);
return *this;
}
Please remember about freeing ptr in destructor.
What is stored under ptr? If text, why not to use std::string? Anyway you can use std::vector. The constructors will be much easier then...
How is the memory to which ptr points allocated?
if using new, allocate with new, set length and then copy
other.length = length;
other.ptr = new char[length];
memcpy( other.ptr, ptr, length );
If you're allocating the memory with malloc, substitute a call to malloc in place of the call to new. If you're using some other memory scheme, figure it out. :)