std::unique_ptr is NOT zero cost - c++

I have setup, similar to this:
There is class similar to vector (it is implemented using std::vector).
It contains pointers to int's.
I am using my own custom allocator.
The vector does not create elements, but it can destroy elements.
In order to destroy it needs to call non static method Allocator::deallocate(int *p).
If I do it with manual livetime management, I can call Allocator::deallocate(int *p) manually. This works, but is not RAII.
Alternatively, I can use std::unique_ptr with custom deleter. However if I do so, the size of array became double, because each std::unique_ptr must contain pointer to the allocator.
Is there any way I can do it without doubling the size of the vector?
Note i do not want to templatize the class.
Here is best RAII code I come up.
#include <functional>
#include <cstdlib>
#include <memory>
struct MallocAllocator{
template<class T>
static T *allocate(size_t size = sizeof(T) ) noexcept{
return reinterpret_cast<T *>( malloc(size) );
}
// this is deliberately not static method
void deallocate(void *p) noexcept{
return ::free(p);
}
// this is deliberately not static method
auto getDeallocate() noexcept{
return [this](void *p){
deallocate(p);
};
}
};
struct S{
std::function<void(void *)> fn;
S(std::function<void(void *)> fn) : fn(fn){}
auto operator()() const{
auto f = [this](void *p){
fn(p);
};
return std::unique_ptr<int, decltype(f)>{ (int *) malloc(sizeof(int)), f };
}
};
int main(){
MallocAllocator m;
S s{ m.getDeallocate() };
auto x = s();
printf("%zu\n", sizeof(x));
}

You can't do it. If you want your unique_ptr to store a reference to non-static deleter, there is nothing you can do about that, it will have to store it somewhere.
Some ways to work around this:
If you are using allocator-aware data structure, pass you allocator to it and don't use unique_ptr's, use the actual data type as a stored type.
Wrap you allocated objects around some sort of manager that would deallocate those objects when needed. You lose RAII inside it, but to outside code it will still be RAII. You can even transfer the ownership of some objects from this manager to the outside code and you'll only have to use custom deleter there.
(not recommended) Use some global state that you can access from deleter to make it size 0.

Related

How to custom deallocate an object from a base class pointer?

I have a class hierarchy that I'm storing in a std::vector<std::unique_ptr<Base>>. There is frequent adding and removing from this vector, so I wanted to experiment with custom memory allocation to avoid all the calls to new and delete. I'd like to use STL tools only, so I'm trying std::pmr::unsynchronized_pool_resource for the allocation, and then adding a custom deleter to the unique_ptr.
Here's what I've come up with so far:
#include <memory_resource>
#include <vector>
#include <memory>
// dummy classes
struct Base
{
virtual ~Base() {}
};
struct D1 : public Base
{
D1(int i_) : i(i_) {}
int i;
};
struct D2 : public Base
{
D2(double d_) : d(d_) {}
double d;
};
// custom deleter: this is what I'm concerned about
struct Deleter
{
Deleter(std::pmr::memory_resource& m, std::size_t s, std::size_t a) :
mr(m), size(s), align(a) {}
void operator()(Base* a)
{
a->~Base();
mr.get().deallocate(a, size, align);
}
std::reference_wrapper<std::pmr::memory_resource> mr;
std::size_t size, align;
};
template <typename T>
using Ptr = std::unique_ptr<T, Deleter>;
// replacement function for make_unique
template <typename T, typename... Args>
Ptr<T> newT(std::pmr::memory_resource& m, Args... args)
{
auto aPtr = m.allocate(sizeof(T), alignof(T));
return Ptr<T>(new (aPtr) T(args...), Deleter(m, sizeof(T), alignof(T)));
}
// simple construction of vector
int main()
{
auto pool = std::pmr::unsynchronized_pool_resource();
auto vec = std::vector<Ptr<Base>>();
vec.push_back(newT<Base>(pool));
vec.push_back(newT<D1>(pool, 2));
vec.push_back(newT<D2>(pool, 4.0));
return 0;
}
This compiles, and I'm pretty sure that it doesn't leak (please tell me if I'm wrong!) But I'm not too happy with the Deleter class, which has to take extra arguments for the size and alignment.
I first tried making it a template, so that I could work out the size and alignment automatically:
template <typename T>
struct Deleter
{
Deleter(std::pmr::memory_resource& m) :
mr(m) {}
void operator()(Base* a)
{
a->~Base();
mr.get().deallocate(a, sizeof(T), alignof(T));
}
std::reference_wrapper<std::pmr::memory_resource> mr;
};
But then the unique_ptrs for each type are incompatible, and the vector won't hold them.
Then I tried deallocating through the base class:
mr.get().deallocate(a, sizeof(Base), alignof(Base));
But this is clearly a bad idea, as the memory that's deallocated has a different size and alignment from what was allocated.
So, how do I deallocate through the base pointer without storing the size and alignment at runtime? delete seems to manage, so it seems like it should be possible here as well.
After writing my answer, I would recommend you stick with your code.
Letting unique_ptr handle the storage is not bad at all, it is allocated on stack if unique_ptr itself is, it is safe, and there is no additional overhead at deallocation time. The latter is not true for std::shared_ptr which uses type-erause for its deleters.
I think it is the cleanest and simplest way how to achieve the goal. And there's nothing wrong with your code as far as I can tell.
Most allocators to my knowledge allocate extra space for storing any data they need for deallocation directly next to the pointer they return to you. We can do the same to the aPtr blob:
// Extra information needed for deallocation
struct Header {
std::size_t s;
std::size_t a;
std::pmr::memory_resource* res;
};
// Deleter is now just a free function
void deleter(Base* a) {
// First delete the object itself.
a->~Base();
// Obtain the header
auto* ptr = reinterpret_cast<unsigned char*>(a);
Header* header = reinterpret_cast<Header*>(ptr - sizeof(Header));
// Deallocate the allocated blob.
header->res->deallocate(ptr, header->s, header->a);
};
// Use the new custom function.
template <typename T>
using Ptr = std::unique_ptr<T, decltype(&deleter)>;
template <typename T, typename... Args>
Ptr<T> newT(std::pmr::memory_resource& m, Args... args) {
// Let the compiler calculate the correct way how to store `T` and `H`
// together.
struct Storage {
Header header;
T type;
};
Header h = {sizeof(Storage), alignof(Storage)};
auto aPtr = m.allocate(h.s, h.a);
// Use dummy header.
Storage* storage = new (aPtr) Storage{h, T(args...)};
static_assert(sizeof(Storage) == (sizeof(Header) + sizeof(T)),
"No padding bytes allowed in Storage.");
return Ptr<T>(&storage->type, deleter);
}
We store all information necessary for deallocation in Header structure.
Allocating both T and the header in a single blob is not straight forward as it might seem - see below. We need at least sizeof(T)+sizeof(Header) bytes but must also respect the alignof(T). So we let the compiler figure it out via Storage.
This way we can allocate T properly and return a pointer to &storage->type to the user. The issue now is that there might be some to-deleter-unknown amount of padding in Storage between header and type, thus the deleter function would not be able to recover &storage->header only from &storage->type pointer.
I have two proposals for this:
Just assert the padding amount to 0.
Manually write the header at the known place, albeit I cannot guarantee 100% safe.
Restricting to known padding
Although the extra padding in Storage is unlikely because Header is aligned to 8 bytes on normal 64-bit systems which should be generally enough for all Ts, there is no such alignment guarantee in C++. vtable pointer makes this even less guaranteed IMHO and the fact that alignas(N) offers some user-control over the alignment, increasing it in particular for e.g. vector instructions, doesn't help either. So to be safe, we can just use static_assert and if any "weird" type comes along, the code will not compile and remain safe.
If that happens, one can manually add extra padding to Storage and modify the subtraction amount. The cost would be extra memory for that padding for all allocations.
Writing the header manually
Another option is that we just ignore storage->header member and write the header ourselves directly before type, potentially into the padding area. This requires the use of memcopy because we cannot just placement-new it there because of possible alignof(Header) mismatch. Same in deleter itself because there is no Header object at ptr-sizeof(Header), simple reinterpret_cast<Header*>(ptr-sizeof(header)) would break the strict aliasing rule.
// Extra information needed for deallocation
struct Header {
std::size_t s;
std::size_t a;
std::pmr::memory_resource* res;
};
// Deleter is now just a free function
void deleter(Base* a) {
// First delete the object itself.
a->~Base();
// Obtain the header
auto* ptr = reinterpret_cast<unsigned char*>(a);
Header header;
std::memcpy(&header, ptr - sizeof(Header), sizeof(Header));
// Deallocate the allocated blob.
header.res->deallocate(ptr, header.s, header.a);
};
// Use the new custom function.
template <typename T>
using Ptr = std::unique_ptr<T, decltype(&deleter)>;
template <typename T, typename... Args>
Ptr<T> newT(std::pmr::memory_resource& m, Args... args) {
// Let the compiler calculate the correct way how to store `T` and `H`
// together.
struct Storage {
Header header;
// Padding???
T type;
};
Header h = {sizeof(Storage), alignof(Storage)};
auto aPtr = m.allocate(h.s, h.a);
// Use dummy header.
Storage* storage = new (aPtr) Storage{{0, 0}, T(args...)};
// Write our own header at the known -sizeof(Header) offset.
auto* ptr = reinterpret_cast<unsigned char*>(storage);
std::memcpy(ptr - sizeof(Header), &h, sizeof(Header));
return Ptr<T>(&storage->type, deleter);
}
I know this solution is safe w.r.t strict aliasing, object lifetime and allocating T. What I am not 100% certain about is whether the compiler is allowed to store anything relevant to T inside the potential padding bytes, which would thus be overwritten by the manually-written header.

How to reduce compiled binary size of type erasure

I am developing firmware in C++. As part of the firmware, I wrote my own, lightweight version of std::shared_ptr (without weak pointers and some other features) and std::make_shared. Without weak pointers, the control block is destroyed at the same time as the object itself, so I was able to do the following:
template <typename T, typename ... ARGS>
shared_ptr<T> make_shared(ARGS&&... args) {
struct FullBlock {
_shared_ptr_base::control_block block;
T value;
};
//Allocate block on heap
auto block = new FullBlock{
{0, {nullptr, nullptr}},
T{std::forward<ARGS>(args)...} //value
};
block->block.destructor.p = block;
block->block.destructor.destroy = [](const void* x){
delete static_cast<const FullBlock*>(x);
};
return shared_ptr<T>(&block->block, &block->value);
}
Where block.destructor.p is a void* (type erased) pointer to the control block + object (i.e., FullBlock) and block.destructor.destroy is a function pointer to, in this case, a lambda that does the deleting.
I use these shared pointers primarily as part of a message-passing framework. Since there are a lot of different types of messages, there are a lot of different instances of make_shared and thus the deleter lambda. Each one of these lambdas requires space in .text.
My question is, since the destructors are just functions which take a (hidden) pointer to an object, is there a way to store the destructor without having to create a special lambda function for every single type that I pass to make_shared?
In my head, it should be as simple as the following (but I know it's not):
struct Foo;
using Destructor = void(*)(const void*);
Foo* f;
Destructor d = &Foo::~Foo;
void* ptr = static_cast<void*>(f);
d(ptr); //Call the destructor

Using a smart pointer to manage memory allocated within function

Background
If I had a function foo as follows
void foo(std::vector<int>& values)
{
values = std::vector<int>(10, 1);
}
Then I could call it as
std::vector<int> values;
foo(values);
Notice that the initial vector is empty, then it is populated within the function foo.
I often come across interfaces that I cannot change (i.e. 3rd party), that have the same intention as above, but using raw arrays, for example
void foo(int*& values)
{
values = new int[10];
std::fill_n(values, 10, 1);
}
My problem with these is that now I am responsible for managing that memory, e.g.
int* values;
foo(values);
delete[] values;
Question
Is there any way I can use smart pointers to manage this memory for me? I would like to do something like
std::unique_ptr<int[]> values;
foo(values.get());
but get returns a pointer that is an r-value, so I cannot pass it by non-const reference.
Create empty unique_ptr.
Create pointer.
Pass pointer to function.
Reset unique_ptr to that pointer.
Like that:
std::unique_ptr<int[]> values;
int* values_ptr;
foo(values_ptr);
values.reset(values_ptr);
Or:
int* values_ptr;
foo(values_ptr);
std::unique_ptr<int[]> values(values_ptr);
Simple answer: you cannot.
But i believe there is a way to simplify you life. Since you always do the same thing, you simply have to write a function that does it for you and call that function when you need. Since i suppose you want to use the returned value written in memory before releasing it, i make my function return it.
template <typename T, typename Function>
std::unique_ptr<T[]> call(Function f) {
T * value;
f(value);
return { value };
}
// then to use it
auto value = call<int>(foo);
Would that suit you?
The function could be improved to detect the difference for pointers to one object or to an array if you need it, and it could probably also be improved to detect automatically the needed parameter type so that you do not have to type it.

Typedef a shared_ptr type with a static custom deleter, similar to unique_ptr

I have read through many questions on SO on custom deleter for shared_ptr and unique_ptr, and the difference between the two. But, I still haven't found any clear answer to this question:
How can one best go about creating a type that acts as a shared_ptr with a custom deleter, similar to how unique_ptr has the deleter as part of the type definition?
For unique_ptr usage, I use a deleter class, that handles deletion of individual types (limiting it to just two types, for brevity):
struct SDL_Deleter {
void operator()( SDL_Surface* ptr ) { if (ptr) SDL_FreeSurface( ptr );}
void operator()( SDL_RWops* ptr ) { if (ptr) SDL_RWclose( ptr );}
};
using SurfacePtr = std::unique_ptr<SDL_Surface, SDL_Deleter>;
using RWopsPtr = std::unique_ptr<SDL_RWops, SDL_Deleter>;
Which can be used with something like
SurfacePtr surface(IMG_Load("image.png"));
And will call SDL_FreeSurface upon destruction.
This is all fine and well. However, how does one go about achieving the same for shared_ptr? Its type is defined as
template< class T > class shared_ptr;
and the way to provide a custom deleter is through the constructor. It doesn't feel right that the user of the shared_ptr wrapper needs to know which pointer type is wrapped, and how that pointer is supposed to be deleted. What would be the best way to achieve the same kind of usage as with the unique_ptr example of above.
In other words, that I could end up with:
SurfaceShPtr surface(IMG_Load("image.png"));
Instead of of something like
SurfaceShPtr surface(IMG_Load("image.png"),
[=](SDL_Surface* ptr){SDL_FreeSurface(ptr);});
Or, just slightly better
SurfaceShPtr surface(IMG_Load("image.png"),
SDL_Deleter());
Is there a way to do this, without having to create a RAII wrapper class (instead of a typedef), adding even more overhead?
If the answer is "this isn't possible". Why not?
The other answer provided here was that something close to what I asked could be done through function returns of unique_ptr with custom deleter, which can be implicitly converted to a shared_ptr.
The answer given was that a deleter defined as a type trait was not possible for std::shared_ptr. The answer suggested as an alternative, to use a function which returns a unique_ptr, implicitly converted to a shared_ptr.
Since this isn't part of the type, it is possible to make a simple mistake, leading to memory leaks. Which is what I wanted to avoid.
For example:
// Correct usage:
shared_ptr<SDL_Surface> s(createSurface(IMG_Load("image.png")));
// Memory Leak:
shared_ptr<SDL_Surface> s(IMG_Load("image.png"));
The concept I want to express is having the deleter as part of the type (which unique_ptr allows), but with the functionality of a shared_ptr. My suggested solution is deriving from shared_ptr, and providing the deleter type as a template argument. This takes up no additional memory, and works in the same way as for unique_ptr.
template<class T, class D = std::default_delete<T>>
struct shared_ptr_with_deleter : public std::shared_ptr<T>
{
explicit shared_ptr_with_deleter(T* t = nullptr)
: std::shared_ptr<T>(t, D()) {}
// Reset function, as it also needs to properly set the deleter.
void reset(T* t = nullptr) { std::shared_ptr<T>::reset(t, D()); }
};
Together with a deleter class (Thanks Jonathan Wakely. Way cleaner than my macro (now removed)):
struct SDL_Deleter
{
void operator()(SDL_Surface* p) const { if (p) SDL_FreeSurface(p); }
void operator()(SDL_RWops* p) const { if (p) SDL_RWclose(p); }
};
using SurfacePtr = std::unique_ptr<SDL_Surface, SDL_Deleter>;
using SurfaceShPtr = shared_ptr_with_deleter<SDL_Surface, SDL_Deleter>;
using RWopsPtr = std::unique_ptr<SDL_RWops, SDL_Deleter>;
using RWopsShPtr = shared_ptr_with_deleter<SDL_RWops, SDL_Deleter>;
Instances with SurfaceShPtr members are type guaranteed to clean up properly, the same as for SurfacePtr, which is what I wanted.
// Correct Usage (much harder to use incorrectly now):
SurfaceShPtr s(IMG_Load("image.png"));
// Still correct usage
s.reset(IMG_Load("other.png"));
I'll leave this up for a while, for comments, etc, without accepting the answer. Maybe there are even more dangerous caveats I've missed (having a non-virtual destructor not being one, as the parent shared_ptr is given charge of the deletion).
A typedef is a static, compile-time feature.
A deleter passed to a shared_ptr is a dynamic, run-time property. The deleter is "type-erased" and is not part of the shared_ptr interface.
Therefore you can't declare a typedef to represent an alternative deleter, you just pass one to the constructor.
What would be the best way to achieve the same kind of usage as with the unique_ptr example of above.
You could use functions to create the resources and return them in a shared_ptr
shared_ptr<SDL_Surface> create_sdl_surface(const char* s)
{
return shared_ptr<SDL_Surface>(IMG_load(s), SDL_FreeSurface);
}
But I would have those functions return a unique_ptr instead, which can be converted to shared_ptr, as below.
I would get rid of the macro and do something like this:
// type with overloaded functions for freeing each resource type
struct SDL_deleter
{
void operator()(SDL_Surface* p) const { if (p) SDL_FreeSurface(p); }
void operator()(SDL_RWops* p) const { if (p) SDL_RWclose(p); }
// etc.
};
// a unique_ptr using SDL_deleter:
template<typename P>
using SDL_Ptr = std::unique_ptr<P, SDL_deleter>;
// typedefs for the common ptr types:
using SurfacePtr = SDL_ptr<SDL_Surface>;
using RWopsPtr = SDL_ptr<SDL_RWops>;
// etc.
To answer the shared_ptr part of your question, define functions that create resources and return them in a SDL_ptr:
SurfacePtr createSurface(const char* s) { return SurfacePtr(IMG_load(s)); }
RWopsPtr createRWops([...]) { return RWopsPtr([...]); }
// etc.
Then you can easily create a shared_ptr from the result of those functions:
shared_ptr<SDL_Surface> s = createSurface("image.png");
The shared_ptr automatically acquires the right deleter from the unique_ptr.

c++ Garbage collection and calling destructors

Per-frame I need to allocate some data that needs to stick around until the end of the frame.
Currently, I'm allocating the data off a different memory pool that allows me to mark it with the frame count. At the end of the frame, I walk the memory pool and delete the memory that was allocated in a particular frame.
The problem I'm running into is that in order to keep a hold on the data, I have to place it in a structure thusly:
struct FrameMemory
{
uint32 frameIndex;
bool allocatedType; //0 = new(), 1 = new[]
void* pMemPtr;
}
So later, when i get around to freeing the memory, it looks something like this:
{
for(all blocks)
if(block[i].frameIndex == targetIndex)
if(block[i].allocatedType == 0)
delete block[i].pMemPtr;
else if (block[i].allocatedType ==1)
delete[] block[i].pMemPtr;
}
The issue is that, because I have to overload the pointer to the memory as a void*, the DELETE operator doesn't properly DELETE the memory as its' native base type. IE the destructor NEVER gets called for the object.
I've attempted to find ways to use smart-pointer templated objects for the solution, but in order to do that, I have to overload the templated class to a non-templated base-type, which makes deletion even more difficult.
Does anyone have a solution to a problem like this?
If you don't want to force all the objects to inherit from Destructible, you can store a pointer to a deleter function (or functor) along with the pointer to the data itself. The client code is responsible for providing a function that knows how to delete the data correctly, typically something like:
void xxx_deleter(void *data) {
xxx *ptr = static_cast<xxx *>(data);
delete ptr;
}
Though the deleter will usually be a lot like the above, this also gives the client the option of storing complex data structures and still getting them deleted correctly.
class Destructable
{
public:
virtual ~Destructable() {}
};
Instead of void *, store Destructable * in the pool. Make objects allocated from the pool inherit from Destructable.
Alternatively, override the operators new and delete for the relevant classes. Make them use the pool. Delete the object as soon as the CPU is done with it, in the regular way, in the code that owns it and hence knows its correct type; since the pool will not reuse the memory until it sees the appropriate frame end, whatever asynchronous hardware required you to lag garbage collection in this way can still do its thing.
The only way I could think of to do that would be to add a type entry to the FrameMemory struct, then use that to properly cast the memory for the delete. For example, if you have memory of type foo, you could have something like:
if (block[i].BlockType == BLOCKTYPE_FOO)
{
foo *theMemory = (foo *)block[i].pMemPtr;
delete theMemory;
}
Note that this can be an ****extremely**** dangerous operation if you do it wrong.
If you are mean stack frame (i.e. inside function)
you can try to use alloca()
First thing that I can think of is using boost::shared_ptr<void> (for the non-array version, some work may be required to adapt it for the array version) as the pointer type. And I think that should take care of mostly every detail. Whenever the frame is destroyed the memory will be appropriately deleted:
struct FrameMemory
{
uint32 frameIndex;
// bool allocatedType; //0 = new(), 1 = new[] only works with instances, not arrays
boost::shared_ptr<void> pMemPtr;
};
If you want to implement something similar manually, you can use a 'deleter' function pointer to handle the deletion of the objects, instead of calling delete directly. Here is a rough approach to how you could modify your code:
// helper deleter functions
template <typename T>
void object_deleter( void *p ) {
delete static_cast<T*>(p);
}
template <typename T>
void array_deleter( void *p ) {
delete [] static_cast<T*>(p);
}
class FrameMemory
{
public:
const uint32 frameIndex;
void* pMemPtr;
private:
void (*deleter)(void*);
public:
template <typename T>
FrameMemory( uint32 frame, T* memory, bool isarray = false )
: frameIndex(frame), pMemPtr(memory),
deleter( isarray? array_deleter<T> : object_deleter<T> )
{}
void delete() {
deleter(pMemPtr)
}
};
struct X;
void usage()
{
{
FrameMemory f( 1, new X );
f.delete();
}
{
FrameMemory f( 1, new x[10], true );
f.delete();
}
}
I would modify it further so that instead of having to call FrameMemory::delete() that was performed in the destructor, but that would take more time than I have right now to do correctly (that is, deciding how copies are to be handled and so on...
I would do something like this:
struct FrameMemoryBase
{
uint32 frameIndex;
bool allocatedType; //0 = new(), 1 = new[]
virtual void Free() = 0;
};
template <typename T>
struct FrameMemory : public FrameMemoryBase
{
void Free()
{
if(allocatedType == 0)
delete pMemPtr;
else if (allocatedType ==1)
delete[] pMemPtr;
}
T *pMemPtr;
};
which you would use via:
{
for(all blocks)
if(block[i].frameIndex == targetIndex)
block[i].Free();
}
If you also free the FrameMemory struct you could just change Free to a virtual destructor. I'm not sure this is what you're looking for since I don't understand what "I've attempted to find ways to use smart-pointer templated objects for the solution, but in order to do that, I have to overload the templated class to a non-templated base-type, which makes deletion even more difficult." means, but I hope this is helpful.
This requires that the memory management code somehow have access to the declarations of what you wish to free, but I don't think there's any way around that, assuming you want destructors called, which you explicitly do.