I'm creating a pooled_allocator to allocate memory for components of a specific type in one memory block. How can I update already-allocated pointers stored in a container?
I implemented an ECS for my game with a common approach. I use
std::unordered_map<TypeIndex, std::vector<BaseComponent*>>
to store my components by type.
This is how I allocate memory for components now:
template<typename T, typename... Args>
T* Entity::assign(Args&&... args)
{
//....
// if there are no components of this type yet
std::allocator<T> alloc;
T* comp = std::allocator_traits<std::allocator<T>>::allocate(alloc, 1);
std::allocator_traits<std::allocator<T>>::construct(alloc, comp, std::forward<Args>(args)...);
// get the container for this type and add the pointer to the newly allocated component,
// which has to be stored in the pool
container.push_back(comp);
components.insert({ getTypeIndex<T>(), container });
return comp;
}
So, now I want to implement a pooled_allocator that meets all the std::allocator_traits requirements, so I can use it as the allocator for my components. But I also want to make it dynamic, so it should be able to extend its internal memory block.
Here is what I have for now:
template<typename T, unsigned int Size = 10>
class pooled_allocator
{
// some typedefs
typedef T value_type;
typedef value_type* pointer;
static void* m_memoryBlock; // malloc(SIZE*Size) or nullptr to reproduce problem
static std::list<MyBlockNode> m_allocated;
static const size_t SIZE; // sizeof(T)
// some methods
static pointer allocate(const size_t amount)
{
if (!m_memoryBlock)
m_memoryBlock = malloc(amount * SIZE);
else
{
// here realloc can return another pointer
m_memoryBlock = realloc(m_memoryBlock, m_allocated.size() * SIZE + amount * SIZE);
}
int index = m_allocated.size();
m_allocated.push_back(MyBlockNode { });
pointer elPointer = (pointer)((char*)m_memoryBlock + index * SIZE); // byte-wise offset, then cast
return elPointer;
}
template <class Up, class... Args>
void construct(Up* p, Args&&... args)
{
new((void*)p) Up(std::forward<Args>(args)...);
}
};
The problem is that when realloc returns a pointer to a different block, the already-allocated objects are copied to a new location, and all the pointers stored inside components[typeIndex()] become invalid.
How can I fix it? One option is to return some ComponentHandle instead of T* and, if the internal memory block has moved, update all already-returned handles with new pointers, but that would decrease allocation speed and impose some restrictions.
And I know about boost::pool_allocator, but my game is too small to justify integrating such libraries.
You created your problem by defining your own memory allocation like that.
A solution is to not realloc, but to allocate a new block when needed, and to manage all these blocks.
But the real question is: why do you do that? You probably chose that approach to solve another problem. Which one?
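To make the block-list suggestion concrete, here is a minimal sketch (all names are hypothetical, not from your code): blocks are allocated on demand and never moved, so every pointer handed out stays valid. Object construction/destruction is left to the caller via placement new, as in your allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical sketch: instead of realloc, keep a list of fixed-capacity
// blocks. Old blocks are never moved, so pointers handed out stay valid.
template <typename T, std::size_t BlockCapacity = 10>
class chunked_pool
{
    std::vector<T*> m_blocks;   // one raw block per BlockCapacity objects
    std::size_t m_used = 0;     // total objects handed out so far

public:
    T* allocate()
    {
        // grow by one whole block when the current blocks are full
        if (m_used == m_blocks.size() * BlockCapacity)
            m_blocks.push_back(
                static_cast<T*>(::operator new(BlockCapacity * sizeof(T))));
        T* p = m_blocks[m_used / BlockCapacity] + (m_used % BlockCapacity);
        ++m_used;
        return p;
    }

    ~chunked_pool()
    {
        // note: destructors of constructed objects are the caller's job
        for (T* block : m_blocks) ::operator delete(block);
    }
};
```

Growing the pool allocates a new block instead of moving the old one, which is exactly what keeps the pointers in components[typeIndex()] valid.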
Related
I want to create my own game engine, so I bought a few books, one being Game Engine Architecture, Second Edition by Jason Gregory, and in it he suggests implementing a few custom allocators. One type of allocator the book talks about is a stack-based allocator, but I got confused when reading about it. How do you store data in it? What data type do you use? For example, do you use a void*, a void**, or an array of char[]? The book says you're meant to allocate one big block of memory using malloc at the beginning and free it at the end, and "allocate" memory by incrementing a pointer. If you could help explain this more, that would be great, because I can't seem to find a tutorial that doesn't use std::allocator. I also thought this might help others interested in custom allocators, so I posted the question here.
This is the header file example they give in the book:
class StackAllocator
{
public:
// Represents the current top of the stack.
// You can only roll back to the marker not to arbitrary locations within the stack
typedef U32 Marker;
explicit StackAllocator(U32 stackSize_bytes);
void* alloc(U32 size_bytes); // Allocates a new block of the given size from stack top
Marker getMarker(); // Returns a Marker to the current stack top
void freeToMarker(Marker marker); // Rolls the stack back to a previous marker
void clear(); // Clears the entire stack(rolls the stack back to zero)
private:
// ...
};
EDIT:
After a while I got this working but I don't know if I'm doing it right
Header File
typedef std::uint32_t U32;
struct Marker {
size_t currentSize;
};
class StackAllocator
{
private:
void* m_buffer; // Buffer of memory
size_t m_currSize = 0;
size_t m_maxSize;
public:
void init(size_t stackSize_bytes); // allocates size of memory
void shutDown();
void* allocUnaligned(U32 size_bytes);
Marker getMarker();
void freeToMarker(Marker marker);
void clear();
};
.cpp File
void StackAllocator::init(size_t stackSize_bytes) {
this->m_buffer = malloc(stackSize_bytes);
this->m_maxSize = stackSize_bytes;
}
void StackAllocator::shutDown() {
this->clear();
free(m_buffer);
m_buffer = nullptr;
}
void* StackAllocator::allocUnaligned(U32 size_bytes) {
assert(m_maxSize - m_currSize >= size_bytes);
// Return the current top, then bump it. Note that m_buffer itself must
// never move, otherwise shutDown() could not free the original allocation.
void* ptr = static_cast<char*>(m_buffer) + m_currSize;
m_currSize += size_bytes;
return ptr;
}
Marker StackAllocator::getMarker() {
Marker marker;
marker.currentSize = m_currSize;
return marker;
}
void StackAllocator::freeToMarker(Marker marker) {
m_currSize = marker.currentSize; // roll the top back; the memory is simply reused
}
void StackAllocator::clear() {
m_currSize = 0; // roll all the way back to zero
}
Okay, for simplicity let's say you're tracking a collection of MyFunClass for your engine. It could be anything, and your linear allocator doesn't necessarily have to track objects of a homogeneous type, but often that's how it's done. In general, when using custom allocators, you're trying to "shape" your memory allocations to separate static data from dynamic, infrequently accessed vs. frequently accessed, with a view towards optimizing your working set and achieving locality of reference.
Given the code you provided, first, you'd allocate your memory pool. For simplicity, assume you want enough space to pool 1000 objects of type MyFunClass.
StackAllocator sa;
sa.init( 1000 * sizeof(MyFunClass) );
Then each time you need to "allocate" a new block of memory for a MyFunClass, you might do it like this:
void* mem = sa.allocUnaligned( sizeof(MyFunClass) );
Of course, this doesn't actually allocate anything. All the allocation already happened in Step 1. It just marks some of your already-allocated memory as in-use.
It also doesn't construct a MyFunClass. Your allocator isn't strongly typed, so the memory it returns can be interpreted however you want: as a stream of bytes; as a backing representation of a C++ class object; etc.
Now, how would you use a buffer allocated in this fashion? One common way is with placement new:
auto myObj = new (mem) MyFunClass();
So now you're constructing your C++ object in the memory space you reserved with the call to allocUnaligned.
(Note that the allocUnaligned bit gives you some insight into why we don't usually write our own custom allocators: because they're hard as heck to get right! We haven't even mentioned alignment issues yet.)
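To give a flavour of what alignment handling involves, here is a tiny helper sketch (the name alignUp is mine, not from the book): an aligned allocation over-allocates by align - 1 bytes and rounds the returned address up to the next multiple of the alignment.

```cpp
#include <cassert>
#include <cstdint>

// Round an address up to the next multiple of a power-of-two alignment.
inline std::uintptr_t alignUp(std::uintptr_t addr, std::uintptr_t align)
{
    assert((align & (align - 1)) == 0); // alignment must be a power of two
    return (addr + (align - 1)) & ~(align - 1);
}
```

A hypothetical allocAligned(size_bytes, align) could then call allocUnaligned(size_bytes + align - 1) and return the raw address passed through alignUp.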
For extra credit, take a look at scope stacks which take the linear allocator approach to the next level.
So I have a struct as shown below, I would like to create an array of that structure and allocate memory for it (using malloc).
typedef struct {
float *Dxx;
float *Dxy;
float *Dyy;
} Hessian;
My first instinct was to allocate memory for the whole structure, but then, I believe, the internal arrays (Dxx, Dxy, Dyy) won't be assigned. If I assign the internal arrays one by one, then the array of structures would still be undefined. Now I think I should allocate memory for the internal arrays first and then for the structure array, but that just seems wrong to me. How should I solve this issue?
I require a logic for using malloc in this situation instead of new / delete, because I have to do this in CUDA, and memory allocation in CUDA is done using cudaMalloc, which is somewhat similar to malloc.
In C++ you should not use malloc at all, and instead use new and delete if dynamic allocation is actually necessary. From the information you've provided, it is not, because in C++ you would also rather use std::vector (or std::array) over C-style arrays. Also, the typedef is not needed.
So I'd suggest rewriting your struct to use vectors and then generate a vector of this struct, i.e.:
struct Hessian {
std::vector<float> Dxx;
std::vector<float> Dxy;
std::vector<float> Dyy;
};
std::vector<Hessian> hessianArray(2); // vector containing two instances of your struct
hessianArray[0].Dxx.push_back(1.0); // example accessing the members
Using vectors you do not have to worry about allocation most of the time, since the class handles that for you. Every Hessian contained in hessianArray is automatically allocated for you, stored on the heap and destroyed when hessianArray goes out of scope.
It seems like a problem which could be solved using an STL container. Given that you won't know the sizes of the arrays, you may use std::vector.
It's less error-prone, easier to maintain and work with, and standard containers free their resources themselves (RAII). @muXXmit2X has already shown how to use them.
But if you have to (or want to) use dynamic allocation, you first have to allocate space for an array of X structures:
Hessian *h = new Hessian[X];
Then allocate space for all arrays in all structures
for (int i = 0; i < X; i++)
{
h[i].Dxx = new float[Y];
// Same for Dxy & Dyy
}
Now you can access and modify them. Also, don't forget to free the resources:
for (int i = 0; i < X; i++)
{
delete[] h[i].Dxx;
// Same for Dxy & Dyy
}
delete[] h;
You should never use malloc in C++.
Why?
new ensures that your type has its constructor called, while malloc does not call the constructor. The new keyword is also more type-safe, whereas malloc is not type-safe at all.
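A small illustration of that difference (Counter is a made-up type): new runs the constructor for you, while malloc only hands back raw, uninitialized bytes, so you would have to run the constructor yourself via placement new.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct Counter {
    int value;
    Counter() : value(7) {} // constructor initializes value
};

// malloc returns raw memory: Counter's constructor has NOT run, and
// reading the object before constructing it would be undefined behaviour.
// Placement new runs the constructor explicitly on that raw memory.
inline Counter* makeWithMalloc()
{
    void* raw = std::malloc(sizeof(Counter)); // no constructor call here
    return new (raw) Counter;                 // constructor runs here
}
```

With plain `new Counter` both the allocation and the constructor call happen in one step; a malloc'ed object must also be destroyed manually (`p->~Counter(); std::free(p);`).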
As other answers point out, the use of malloc (or even new) should be avoided in c++. Anyway, as you requested:
I require a logic for using malloc in this situation instead of new / delete because I have to do this in cuda...
In this case you have to allocate memory for the Hessian instances first, then iterate through them and allocate memory for each Dxx, Dxy and Dyy. I would create a function for this like the following:
Hessian* create(size_t length) {
Hessian* obj = (Hessian*)malloc(length * sizeof(Hessian));
for(size_t i = 0; i < length; ++i) {
// space for a single float each, as a placeholder; use your real array sizes here
obj[i].Dxx = (float*)malloc(sizeof(float));
obj[i].Dxy = (float*)malloc(sizeof(float));
obj[i].Dyy = (float*)malloc(sizeof(float));
}
return obj;
}
To deallocate the memory you allocated with create function above, you have to iterate through Hessian instances and deallocate each Dxx, Dxy and Dyy first, then deallocate the block which stores the Hessian instances:
void destroy(Hessian* obj, size_t length) {
for(size_t i = 0; i < length; ++i) {
free(obj[i].Dxx);
free(obj[i].Dxy);
free(obj[i].Dyy);
}
free(obj);
}
Note: using the presented method will pass the responsibility of preventing memory leaks to you.
If you wish to use the std::vector instead of manual allocation and deallocation (which is highly recommended), you can write a custom allocator for it to use cudaMalloc and cudaFree like follows:
template<typename T> struct cuda_allocator {
using value_type = T;
cuda_allocator() = default;
template<typename U> cuda_allocator(const cuda_allocator<U>&) {
}
T* allocate(std::size_t count) {
if(count <= max_size()) {
void* raw_ptr = nullptr;
if(cudaMalloc(&raw_ptr, count * sizeof(T)) == cudaSuccess)
return static_cast<T*>(raw_ptr);
}
throw std::bad_alloc();
}
void deallocate(T* raw_ptr, std::size_t) {
cudaFree(raw_ptr);
}
static std::size_t max_size() {
return std::numeric_limits<std::size_t>::max() / sizeof(T);
}
};
template<typename T, typename U>
inline bool operator==(const cuda_allocator<T>&, const cuda_allocator<U>&) {
return true;
}
template<typename T, typename U>
inline bool operator!=(const cuda_allocator<T>& a, const cuda_allocator<U>& b) {
return !(a == b);
}
The usage of a custom allocator is very simple: you just have to specify it as the second template parameter of std::vector:
struct Hessian {
std::vector<float, cuda_allocator<float>> Dxx;
std::vector<float, cuda_allocator<float>> Dxy;
std::vector<float, cuda_allocator<float>> Dyy;
};
/* ... */
std::vector<Hessian, cuda_allocator<Hessian>> hessian;
Context:
I am trying to create a custom allocator which mimics std::allocator (but is not derived from it) in some ways, yet allows instanced allocators. My generic containers have constructors which allow the user to specify a pointer to a custom Allocator object. When no allocator is specified, I want it to default to a singleton NewDeleteAllocator, which derives from an abstract Allocator class and simply wraps the global new and delete operators. This idea is taken from Towards a Better Allocator Model by Pablo Halpern.
Client code that uses custom allocator:
// 'foo_container.hpp'
// enclosed in package namespace
template <class T>
class FooContainer
{
private:
// -- Private member properties --
Allocator * allocator;
public:
// -- Constructors --
FooContainer( Allocator * allocator = 0 )
{
this->allocator = !allocator ? (Allocator *)defaultAllocator : allocator;
}
FooContainer( const FooContainer &rhs, Allocator * allocator = 0 )
{
// don't implicitly copy allocator
this->allocator = !allocator ? (Allocator *)defaultAllocator : allocator;
// copying logic goes here
}
};
Custom allocator implementation:
// 'allocator.hpp'
// enclosed in package namespace
class Allocator
{
public:
virtual ~Allocator(){ };
virtual void * allocate( size_t bytes ) = 0;
virtual void deallocate( void * ptr ) = 0;
};
class NewDeleteAllocator : public Allocator
{
public:
virtual ~NewDeleteAllocator()
{
}
virtual void * allocate( size_t bytes )
{
return ::operator new( bytes );
}
virtual void deallocate( void * ptr )
{
::operator delete( ptr ); // memory leak?
}
private:
};
//! #todo Only for testing purposes
const Allocator * defaultAllocator = new NewDeleteAllocator();
Main Question:
I know that allocating via new might also store information about the allocation along with the pointer. I realize calling delete with the scope resolution operator :: is not quite the same as just calling delete, but how does ::delete( ptr ) know the size of the data that ptr is pointing to? Is this a safe operation? From my understanding, deleting via a void pointer can result in undefined behaviour according to the C++ standard. If this is bad, how else could I implement this?
Further details:
I did some very rough preliminary testing with the following code:
// inside member function of 'FooContainer'
for( size_t i = 0; i < 1000000; i++ )
{
for( size_t j = 1; j < 20; j++ )
{
void * ptr = allocator->allocate( j );
allocator->deallocate( ptr );
}
}
I observed the program's total memory usage with Xcode's profiling tools. Memory usage stays constant at a low value. I'm aware that this is not the proper way to check for memory leaks, and I don't know whether the compiler could optimize this out; I am just experimenting with the idea. I would really appreciate some input on the main question before I make any commitments to the architecture of my library. The whole approach might be flawed in the first place.
Thanks for the input. I don't want to make any bad assumptions.
Calling ::operator delete on a pointer returned from a call to ::operator new is safe.
Calling ::operator delete[] on a pointer returned from a call to ::operator new[] is safe.
Calling delete x on a pointer returned from a call to auto x = new T{...} is safe, as long as you don't throw away x's type.
Calling delete[] x on a pointer returned from a call to auto x = new T[n]{...} is safe, as long as you don't throw away x's type.
Mixing them is undefined behaviour.
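The safe pairings above, sketched in code (the return value is only there so the sketch is checkable):

```cpp
#include <cassert>
#include <new>

// Each allocation is released by its matching deallocation function;
// nothing is mixed.
inline int matchedPairs()
{
    void* raw = ::operator new(64);   // raw allocation...
    ::operator delete(raw);           // ...freed with the matching operator

    int* one = new int(5);            // new expression, type is known
    int value = *one;
    delete one;                       // delete expression

    int* many = new int[8];           // array form...
    delete[] many;                    // ...must be released with delete[]

    return value;
}
```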
"but how does ::delete( ptr ) know the size of the data that ptr is pointing to"
Dynamically allocated memory in C++ is usually managed through a heap. The heap initially acquires a very large region of memory and then manages it, handing out chunks of it to the program. The heap stores the size of every chunk it allocates. For example, if you ask for 8 bytes, the heap may reserve at least 12 bytes for you, holding the size in the first 4 bytes and the data in the latter 8, and then return a pointer to the 8-byte portion. So, when deleting, the program knows how much "to delete" by reading the size stored at pointer - 4.
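As a toy illustration of that size-prefix scheme (real allocators vary in header layout, and this is not how you should interoperate with the real ::operator delete):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Toy model: store the block size in a header just before the pointer
// handed to the caller, the way the answer above describes.
inline void* sizedAlloc(std::size_t n)
{
    std::size_t* raw =
        static_cast<std::size_t*>(std::malloc(sizeof(std::size_t) + n));
    raw[0] = n;      // header holds the size
    return raw + 1;  // caller only sees the payload
}

inline std::size_t storedSize(void* p)
{
    return static_cast<std::size_t*>(p)[-1]; // read the header back
}

inline void sizedFree(void* p)
{
    std::free(static_cast<std::size_t*>(p) - 1); // free from the header
}
```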
I have an OOP entity-component system that currently works like this:
// In the component system
struct Component { virtual void update() = 0; }
struct Entity
{
bool alive{true};
vector<unique_ptr<Component>> components;
void update() { for(const auto& c : components) c->update(); }
}
// In the user application
struct MyComp : Component
{
void update() override { ... }
}
To create new entities and components, I use C++'s usual new and delete:
// In the component system
struct Manager
{
vector<unique_ptr<Entity>> entities;
Entity& createEntity()
{
auto result(new Entity);
entities.emplace_back(result);
return *result;
}
template<typename TComp, typename... TArgs>
TComp& createComponent(Entity& mEntity, TArgs... mArgs)
{
auto result(new TComp(forward<TArgs>(mArgs)...));
mEntity.components.emplace_back(result);
return *result;
}
void removeDead() { /* remove all entities with 'alive == false' - 'delete' is called here by the 'unique_ptr' */ }
}
// In the user application
{
Manager m;
auto& myEntity(m.createEntity());
auto& myComp(m.createComponent<MyComp>(myEntity));
// Do stuff with myEntity and myComp
m.removeDead();
}
The system works fine, and I like the syntax and flexibility. However, when continuously adding and removing entities and components to the manager, memory allocation/deallocation slows down the application. (I've profiled and determined that the slowdown is caused by new and delete.)
I've recently read that it's possible to pre-allocate heap memory in C++ - how can that be applied to my situation?
Desired result:
// In the user application
{
Manager m{1000};
// This manager can hold about 1000 entities with components
// (may not be 1000 because of dynamic component size,
// since the user can define it's on components, but it's ok for me)
auto& myEntity(m.createEntity());
auto& myComp(m.createComponent<MyComp>(myEntity));
// Do stuff with myEntity and myComp
m.removeDead();
// No 'delete' is called here! Memory of the 'dead' entities can
// be reused for new entity creation
}
// Manager goes out of scope: 'delete' is called here
There are a few things you can do to make the implementation of your design scale better.
In your current implementation there are two memory allocations per Entity and Component. The first allocates the object itself; the second happens when the pointer is stored in the vector, which occasionally runs out of space and has to allocate a bigger array and move the old elements into it.
In this case the best you can do is to use intrusive lists. That is, each Entity and Component also becomes a list node. Then, once these have been allocated, no extra memory allocations are necessary to put the object into a list. Use a singly- or doubly-linked list from Boost.Intrusive, or write your own. This is how the Linux kernel keeps track of many different objects.
The next step is to preallocate Entity and Component elements. Preallocating could be something as simple as a global array of these, or something more sophisticated, such as Boost.Pool. There are quite a few ways to build a memory pool of objects.
Once Entity and Component are preallocated and intrusive lists are used you are done.
An example which uses boost components:
#include <boost/intrusive/list.hpp>
#include <boost/pool/pool_alloc.hpp>
#include <new>
namespace bi = boost::intrusive;
// api.h
//
// Object pooling support begin.
//
template<class T>
struct Pool
{
static boost::pool_allocator<T> pool;
};
// Singleton. Although it is defined in the header, the linkers
// make sure there is only one instance of it in the application.
// It is instantiated on demand when Pool<T> is used.
template<class T>
boost::pool_allocator<T> Pool<T>::pool;
template<class Derived>
struct Pooled // use it on the most derived class only, not on intermediate base classes
{
// Automatically use the object pool for plain new/delete.
static void* operator new(size_t) { return Pool<Derived>::pool.allocate(1); }
static void operator delete(void* p) { return Pool<Derived>::pool.deallocate(static_cast<Derived*>(p), 1); }
};
//
// Object pooling support end.
//
// Using bi::list_base_hook<bi::link_mode<bi::auto_unlink> > because it automatically
// unlinks from the list when the object is destroyed. No need to manually
// remove the object from the list when an object is about to be destroyed.
struct Component
: bi::list_base_hook<bi::link_mode<bi::auto_unlink> > // make it an intrusive list node
{
virtual void update() = 0;
virtual ~Component() {}
};
struct Entity
: bi::list_base_hook<bi::link_mode<bi::auto_unlink> > // make it an intrusive list node
, Pooled<Entity> // optional, make it allocated from the pool
{
bool active = true; // matches the question's 'alive{true}' default
bi::list<Component, bi::constant_time_size<false> > components;
~Entity() {
for(auto i = components.begin(), j = components.end(); i != j;)
delete &*i++; // i++ to make sure i stays valid after the object is destroyed
}
void update() {
for(auto& c : components)
c.update();
}
};
struct Manager
{
bi::list<Entity, bi::constant_time_size<false> > entities;
~Manager() {
for(auto i = entities.begin(), j = entities.end(); i != j;)
delete &*i++; // i++ to make sure i stays valid after the object is destroyed
}
Entity& createEntity() {
auto result = new Entity;
entities.push_back(*result);
return *result;
}
template<typename TComp, typename... TArgs>
TComp& createComponent(Entity& mEntity, TArgs... mArgs)
{
auto result = new TComp(std::forward<TArgs>(mArgs)...);
mEntity.components.push_back(*result);
return *result;
}
void removeDead() {
for(auto i = entities.begin(), j = entities.end(); i != j;) {
auto& entity = *i++;
if(!entity.active)
delete &entity;
}
}
};
// user.cc
struct MyComp
: Component
, Pooled<MyComp> // optional, make it allocated from the pool
{
void update() override {}
};
int main() {
Manager m;
auto& myEntity(m.createEntity());
auto& myComp(m.createComponent<MyComp>(myEntity));
m.removeDead();
}
In the above example boost::pool_allocator<T> actually uses new to allocate objects, and then keeps reusing destroyed objects rather than invoking delete on them. You can do better by preallocating all objects, but there are many ways to do so depending on your requirements, so I use boost::pool_allocator<T> here for simplicity and to avoid hair-splitting. You can change the implementation of Pooled<T> to something like Pooled<T, N>, where N stands for the maximum number of objects; the rest of the code stays the same, because it uses plain new/delete, which happen to be overridden for objects allocated from a pool.
C++ supports class-specific memory pools for this kind of thing. The general-purpose new/delete pair inevitably trades off among
Time spent searching for a free block of the right size to meet each request
Time spent coalescing free blocks
Time spent maintaining and perhaps reorganizing internal data structures to make the above two operations faster.
The primary way to gain speed is to avoid these tradeoffs entirely with custom allocators that, as you say, pre-allocate a big chunk of memory viewed as a simple array of free objects, all of the same size. Initially these are all linked on a free list, where the link pointers occupy the first bytes of each block, "overlaid" where the data will eventually go. Allocation is just unchaining a block from the head of the free list: a "pop" operation needing about 2 instructions. Deallocation is a "push": two more instructions. In many cases, memory hardware can be set to generate a trap when the pool is empty, so there is no per-allocation overhead for detecting this error condition. (In GC systems the same trick is used to initiate collection with no overhead.)
In your case you'd need two pools: one for Entities and one for Components.
Defining your own pool allocator is not so hard, especially if your application is single threaded. See this document for a tutorial treatment.
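A minimal sketch of such a fixed-size free-list pool, along the lines described above (FixedPool is a hypothetical name): the link pointer is overlaid on each free block, allocation pops the head of the free list, and deallocation pushes onto it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-size pool: while a block is free, its first bytes hold the
// "next" link. allocate() pops the list head; deallocate() pushes.
class FixedPool
{
    union Node { Node* next; };
    std::vector<unsigned char> m_storage; // one big pre-allocated chunk
    Node* m_head = nullptr;               // free-list head

public:
    FixedPool(std::size_t blockSize, std::size_t count)
        : m_storage(blockSize * count)
    {
        assert(blockSize >= sizeof(Node)); // block must fit the link
        // chain every block onto the free list
        for (std::size_t i = count; i > 0; --i) {
            Node* n = reinterpret_cast<Node*>(&m_storage[(i - 1) * blockSize]);
            n->next = m_head;
            m_head = n;
        }
    }

    void* allocate()          // "pop": about two instructions
    {
        Node* n = m_head;
        if (n) m_head = n->next;
        return n;
    }

    void deallocate(void* p)  // "push": about two instructions
    {
        Node* n = static_cast<Node*>(p);
        n->next = m_head;
        m_head = n;
    }
};
```

In your case you would instantiate one FixedPool for Entity-sized blocks and one for each Component type.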
Using most of answers and Google as references, I implemented some pre-allocation utilities in my SSVUtils library.
Prealloc.h
Example:
using MemUnit = char;
using MemUnitPtr = MemUnit*;
using MemSize = decltype(sizeof(MemUnit)); // Should always be 1 byte
class MemBuffer
{
Uptr<MemUnit[]> buffer;
MemRange range;
MemBuffer(MemSize mSize) : ...
{
// initialize buffer from mSize
}
};
class PreAllocatorChunk
{
protected:
MemSize chunkSize;
MemBuffer buffer;
std::stack<MemRange> available;
public:
PreAllocatorChunk(MemSize mChunkSize, unsigned int mChunks) : ...
{
// Add "chunks" to available...
}
template<typename T, typename... TArgs> T* create(TArgs&&... mArgs)
{
// create on first "chunk" using placement new
auto toUse(available.top().begin); available.pop();
return new (toUse) T{std::forward<TArgs>(mArgs)...};
}
};
More pre-allocation utilities are available:
PreAllocatorDynamic: pre-allocates a big buffer, then, when creating an object, splits the buffer in two parts:
[buffer start, buffer start + obj size)
[buffer start + obj size, buffer end)
When an object is destroyed, its occupied memory range is set as "available". If during creation of a new object no big enough "chunk" is found, the pre-allocator tries to unify contiguous memory chunks before throwing a runtime exception. This pre-allocator is sometimes faster than new/delete, but it greatly depends on the size of pre-allocated buffer.
PreAllocatorStatic<T>: inherited from PreAllocatorChunk. Size of a chunk is equal to sizeof(T). Fastest pre-allocator, less flexible. Almost always faster than new/delete.
One of your issues can be solved by allocating enough space in the vectors on their creation
For
vector<unique_ptr<Entity>> entities;
reserve enough space in the constructor
Manager::Manager()
{
entities.reserve(10000);
//...
}
Thus, you avoid reallocations and copying during later stages.
The second issue is the creation of your unique_ptr<Entity> pointers. Here, as you will always use default-constructed objects, you could also use a pre-allocated pool of objects from which you create the pointers. Instead of calling new, you would call your own class:
class EntityPool
{
public:
EntityPool(unsigned int size = 10000) : pool(size), nextEntity(0)
{
}
Entity* getNext(void)
{
if (nextEntity == pool.size()) // if the pool is exhausted, grow it
{
// note: growing the vector may invalidate previously returned
// pointers, so size the pool generously up front
pool.emplace_back();
}
return &pool[nextEntity++];
}
private:
vector<Entity> pool;
unsigned int nextEntity; // index into the vector to the next Entity
};
struct Manager
{
vector<Entity*> entities; // non-owning: the pool owns the Entity objects
Entity& createEntity()
{
Entity* result = entityPoolInstance.getNext();
entities.push_back(result);
return *result;
}
//...
Or you could wire in the standard placement new. This allows you to allocate one big block of memory and construct (place) objects into it as you choose. The block stays on the heap for as long as you need it, and you can allocate multiple short-lived objects into it instead of doing the costly allocations and de-allocations that just end up fragmenting the heap. There are a couple of gotchas involved, but all in all it's a REALLY simple solution without having to go down the custom memory-manager route.
Here's an excellent treatment of removing some of the pitfalls and describing placement new in detail.
I've used data structures as simple as a stack to keep track of the next free block to allocate into: push the address of a block that's about to be deleted onto the stack; when allocating, just pop the next free block off the stack and use that as the argument to placement new. Super easy and super fast!
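A sketch of that stack-plus-placement-new approach (SlotAllocator and Particle are made-up names for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stack>
#include <vector>

struct Particle {
    float x, y;
    Particle(float x_, float y_) : x(x_), y(y_) {}
};

// One big block, plus a stack of free slot addresses.
class SlotAllocator
{
    std::vector<unsigned char> m_block; // the pre-allocated memory
    std::stack<void*> m_free;           // addresses of free slots

public:
    explicit SlotAllocator(std::size_t slots)
        : m_block(slots * sizeof(Particle))
    {
        for (std::size_t i = 0; i < slots; ++i)
            m_free.push(&m_block[i * sizeof(Particle)]);
    }

    Particle* create(float x, float y)
    {
        void* slot = m_free.top();        // pop the next free block...
        m_free.pop();
        return new (slot) Particle(x, y); // ...and placement-new into it
    }

    void destroy(Particle* p)
    {
        p->~Particle();  // run the destructor manually
        m_free.push(p);  // the slot becomes free again
    }
};
```

Creation and destruction never touch the global heap after the constructor runs; they are just a stack pop/push plus a constructor or destructor call.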
EDITED: reworded question.
When new and malloc are called, the size of the block of memory to be allocated is passed:
void* malloc(size_t);
void* operator new(size_t);
Is it possible to get type information, i.e. so you could do a sizeof(T) where T is the type the memory is being allocated for:
T* t = new T;
I'd like to overload new and malloc, but I require the type information, not just the size of the memory being allocated.
FURTHER EDIT:
The reason I am doing this is that I will overload malloc. This function will inspect the size of the memory being allocated and allocate from a particular memory pool:
template <int v> struct int2type { static const int value = v; };
inline void* malloc(const std::size_t sz)
{
if(sz <= 64)
{
size_obj* ptr = malloc(nggt::core::int2type<sizeof(size_obj) + (sz % 8)>);
ptr->sz = sz;
return ptr+1;
}
else if(sz <= 128)
{
size_obj* ptr = malloc(nggt::core::int2type<sizeof(size_obj) + 128>);
ptr->sz = sz;
return ptr+1;
}
else
{
return TSuper::malloc(sz);
}
}
inline void* malloc(const nggt::core::int2type<sizeof(size_obj) + 8> sz)
{
return m_heap8.malloc(sz.value);
}
inline void* malloc(const nggt::core::int2type<sizeof(size_obj) + 16> sz)
{
return m_heap16.malloc(sz.value);
}
There are also supporting free overloads using a freelist to return memory to a pool. The problem is I can't use size_t sz in a template, as it's not known at compile time. If I could get the type information, I could do sizeof(T) and be done!
Cheers,
Graeme
The new with the std::size_t argument is commonly known as non-placement new. The std::size_t argument is passed in by default to let the allocator know how many bytes to allocate for the object you are creating. You don't need to provide it to the new declaration yourself.
Edit:
In response to your edit: it's common practice, if you are overloading the new operator, to
1. place it within the namespace you would like it to be called from;
2. avoid a namespace-level new if you can make specific news for your objects (that is, if there is behavior specific to each class, place it with the class itself);
3. let the first argument be the std::size_t, which you don't need to worry about; it'll be handled for you.
Everything else after that is all yours:
struct MyObject {
static void* operator new(std::size_t s, const char* message){
std::cout << "Creating an object of size " << s << ": " << message << std::endl;
return ::operator new(s); // get the raw memory from the global ("old") new
}
};
You can even use the "old" new in your call of your "new" new.
Further Edit:
In response to your next edit: don't call your own malloc "malloc". Instead, call it some other name. If you're writing C++, malloc has no place in your code whatsoever, unless you're dealing with some legacy application. In C++, you should only be touching the new operator.