I have recently been reading about custom memory allocators for C++ and came across an interesting concept: rather than using pointers, "handles" are used, which are effectively pointers to pointers. This allows the allocator to rearrange its memory to avoid fragmentation without invalidating all the pointers to the allocated memory.
However, different allocators may wish to use handles differently. For example, a pool allocator has no need to rearrange its memory, whereas other allocators do. Those that rearrange their memory may need to treat handles as pointers to pointers, indices into an array of pointers, etc., whereas allocators that do not rearrange their memory can treat handles as simple pointers. Ideally, each allocator would be able to use a different type of handle so that it could achieve optimum performance. Having a base handle class with virtual methods would incur a lot of overhead, as handles would be used every time you needed to access any function or member of a dynamically allocated object.
My solution was to use partial template specialization so that the handle type is worked out at compile time, removing the run-time overhead of virtual calls and allowing the compiler to perform other optimizations (e.g. inlining):
/////////////////////////////////////////////////
/// \brief The basic handle class, acts as simple pointer
/// Single layer of indirection
/////////////////////////////////////////////////
template <typename T>
class Handle {
public:
    T* operator->() { return obj; }
    //other methods...
private:
    T* obj;
};
/////////////////////////////////////////////////
/// \brief Pointer specialization of the handle class, acts as a pointer to pointer
/// allowing allocators to rearrange their data
/////////////////////////////////////////////////
template <typename T>
class Handle<T *> {
public:
    T* operator->() { return *obj; }
    //other methods...
private:
    T** obj;
};
This works perfectly and allows allocators to return whichever handle type they need. However, it means that any function that takes a handle as a parameter needs to be overloaded to accept every specialization, and a class holding a handle as a member must itself be templated on whether it holds a normal handle, a pointer-to-pointer handle, or some other type.
This problem only gets worse as more handle types are added, or when a function takes more than one handle and every combination of handle types must be given an overload.
Either I need to be able to make all handles that point to an instance of TypeA have the type Handle<TypeA>, using some mechanism other than template specialization to provide the different functionality, or I need to somehow hide the template parameter from any code using the handles. How could this be achieved?
(This method of hiding template parameters would also be useful in other situations, for example in a policy-based logging system where a class may wish to hold a reference to any type of logger without itself being templated. Admittedly, in the logging case virtual functions could be used, since the dominating factor in speed would be the I/O rather than the function-call overhead.)
I have implemented a memory system that allowed exactly what you describe, but could not think of a way to have unique handle types without virtual functions. The template parameters are part of the type.
In the end I made a single handle type and used the least significant bit of the pointer to store whether it was a direct or indirect pointer. Before dereferencing, I would check the bit; if it was not set I would simply return the pointer, otherwise I would clear the bit and ask the memory system for the actual pointer.
The scheme did work but I eventually removed the indirect memory handle support from my memory system as I found that the overheads could be quite high and because it was so intrusive on all aspects of my code. Basically almost everywhere a pointer would normally be used I had to use a handle instead. It also required memory to be locked before use on other threads so that it wasn't defragmented while in use. Finally it required me to write entirely custom containers in order to get acceptable performance. I didn't want a double indirection on every access of a vector in a loop for example.
I'm writing some code that handles cryptographic secrets, and I've created a custom ZeroedMemory implementation of std::pmr::memory_resource which sanitizes memory on deallocation and encapsulates the magic you have to use to prevent optimizing compilers from eliding the operation. The idea was to avoid specializing std::array, because the lack of a virtual destructor means that destruction after type erasure would cause memory to be freed without being sanitized.
Unfortunately, I came to realize afterwards that std::array isn't an AllocatorAwareContainer. My std::pmr::polymorphic_allocator approach was a bit misguided, since obviously there's no room in an std::array to store a pointer to a specific allocator instance. Still, I can't fathom why allocators for which std::allocator_traits<A>::is_always_equal::value == true wouldn't be allowed, and I could easily re-implement my solution as a generic Allocator instead of the easier-to-use std::pmr::memory_resource...
Now, I could normally just use an std::pmr::vector instead, but one of the nice features of std::array is that the length of the array is part of the type. If I'm dealing with a 32-byte key, for example, I don't have to do runtime checks to be sure that the std::array<uint8_t, 32> parameter someone passed to my function is, in fact, the right length. In fact, those cast down nicely to a const std::span<uint8_t, 32>, which vastly simplifies writing functions that need to interoperate with C code because they enable me to handle arbitrary memory blocks from any source basically for free.
Ironically, std::tuple takes allocators... but I shudder to imagine the typedef needed to handle a 32-byte std::tuple<uint8_t, uint8_t, uint8_t, uint8_t, ...>.
So: is there any standard-ish type that holds a fixed number of homogeneously-typed items, a la std::array, but is allocator-aware (and preferably stores the items in a contiguous region, so it can be cast down to a std::span)?
You need cooperation from both the compiler and the OS in order for such a scheme to work. P1315 is a proposal to address the compiler/language side of things. As for the OS, you have to make sure that the memory was never paged out to disk, etc. in order for this to truly zero memory.
This sounds like an XY problem. You seem to be misusing allocators: allocators are used to handle runtime memory allocation and deallocation, not to hook stack memory. What you are trying to do (zeroing the memory after use) should really be done with a destructor. You may want to write a class Key for this:
class Key {
public:
    // ...
    ~Key()
    {
        secure_clear(*this); // for illustration
    }
    // ...
private:
    std::array<std::uint8_t, 32> key;
};
You can easily implement iterator and span support. And you don't need to play with allocators.
If you want to reduce boilerplate code and make the new class automatically iterator / span friendly, use inheritance:
class Key : public std::array<std::uint8_t, 32> {
public:
    // ...
    ~Key()
    {
        secure_clear(*this); // for illustration
    }
    // ...
};
Recently, I've come across a couple of type-erasure implementations that use a "hand-rolled" vtable - Adobe ASL's any_regular_t is one example, although I've seen it used in Boost ASIO, too (for the completion routine queue).
Basically, the parent type is passed a pointer to a static struct full of function pointers defined in the child type, similar to the below:
struct parent_t;

struct vtbl {
    void (*invoke)(parent_t *, std::ostream &);
};

struct parent_t {
    vtbl *vt;
    parent_t(vtbl *v) : vt(v) { }
    void invoke(std::ostream &os) {
        vt->invoke(this, os);
    }
};

template<typename T>
struct child_t : parent_t {
    child_t(T val) : parent_t(&vt_), value_(val) { }
    void invoke(std::ostream &os) {
        // Actual implementation here
        // ...
    }
private:
    static void invoke_impl(parent_t *p, std::ostream &os) {
        static_cast<child_t *>(p)->invoke(os);
    }
    T value_;
    static vtbl vt_;
};

template<typename T>
vtbl child_t<T>::vt_ = { &child_t::invoke_impl };
My question is: what is the advantage of this idiom? From what I can tell, it's just a re-implementation of what a compiler would provide for free. Won't there still be the overhead of an extra indirection when parent_t::invoke calls vtbl::invoke?
I'm guessing that it's probably got something to do with the compiler being able to inline or optimize out the call to vtbl::invoke, but I'm not comfortable enough with assembly to be able to work this out myself.
A class having a useful vtable basically requires that it be dynamically allocated. While you can use a fixed storage buffer and allocate there, it is a hassle; you don't have reasonable control over the size of instances once you go virtual. With a manual vtable, you do.
Glancing at the source in question, there are a lot of asserts about the size of various structures (because they need to fit in an array of two doubles in one case).
Also a "class" with a hand-rolled vtable can be standard layout; certain kinds of casting becomes legal if you do this. I don't see this being used in the Adobe code.
In some cases, it can be allocated separately from the vtable entirely (as I do when I do view-based type erasure: I create a custom vtable for the incoming type, and store a void* for it, then dispatch my interface to said custom vtable). I don't see this being used in the Adobe code; but an any_regular_view that acts as a pseudo-reference to an any_regular might use this technique. I use it for things like can_construct<T>, or sink<T>, or function_view<Sig> or even move_only_function<Sig> (ownership is handled by a unique_ptr, operations via a local vtable with 1 entry).
You can create dynamic classes if you have a hand-rolled vtable, where you allocate a vtable entry and set its pointers to whatever you choose (possibly programmatically). If you have 10 methods, each of which can be in one of 10 states, that would require 10^10 different classes with normal vtables. With a hand-rolled vtable, you just need to manage each class's lifetime in a table somewhere (so instances don't outlive the class).
As an example, I could take a method, and add a "run before" or "run after" method to it, on a particular instance of a class (with careful lifetime management), or on every instance of that class.
It is also possible that the resulting vtables might be simpler than compiler-generated ones in various ways, as they aren't as powerful. Compiler-generated vtables handle virtual inheritance and dynamic casting, for example. The virtual inheritance case might have no overhead unless used, but dynamic casting may require overhead.
You can also gain control over initialization. With a compiler-generated vtable, the status of the table is defined (or left undefined) as the standard dictates: with a hand-rolled one, you can ensure any invariants you choose hold.
The OO pattern existed in C before C++ came around. C++ simply chose a few reasonable options; when you go back to pseudo-C style manual OO, you get access to those alternative options. You can dress things up (with glue) so that they look like normal C++ types to the casual user, while inside they are anything but.
I have had a template for some time which wraps a C library FILE*. It's a fairly classic implementation of a shared pointer to a wrapper class for the FILE*. The reasoning behind using my own custom shared pointer is to provide free-function replacements for some of the C library FILE* free functions, in order to allow me to do a drop-in replacement of legacy code that works with FILE*.
The implementation that I have uses an inner wrapper that guarantees that when it is deleted, the owned FILE* is closed. RAII.
However, I've come into the need to create a similar system to handle the case where I want the underlying FILE* to be flushed and truncated, rather than closed, when the last FILE* holder is destroyed. That is to say, I have an open FILE* of the original guaranteed-to-close type, but wish to hand out an unowned copy of the FILE* to another object that guarantees that when its last instance is destroyed, it will flush and truncate the FILE* rather than close it, leaving me with the underlying FILE* in an open state, but with the contents of the stream flushed to disk (and the file size reflecting only valid contents).
I have solved this trivially for compile time polymorphism. But I need some way to supply runtime polymorphism, and I really don't want to put yet-another-layer-of-indirection into this scenario (i.e. if I used a polymorphic pointer to either an auto-close or auto-flush FILE* wrapper, I'd be golden - but I really want to keep the same depth I have now and hide the polymorphism inside the custom shared pointer implementation).
Basically, if I have a:
template <class WrapperT>
class FilePointerT
{
public:
    // omitted: create, destroy, manipulate the underlying wrapped FILE*
private:
    WrapperT * m_pFileWrapper;
    ReferenceCount m_rc;
};
Obviously, tons of details omitted. Suffice it to say that when the last one of these objects is deleted, it deletes the last m_pFileWrapper (in fact, if I were rewriting this code, I'd probably use a boost::shared_ptr).
Regardless, the real issue here is that I am stumped on how to have a FilePointerT<WrapperT> whose WrapperT can vary, but which can then be used in code as if they were all the same type (which, after all, they are, since the implementation of WrapperT has zero effect on the structure and interface of FilePointerT; it is essentially a pimpl).
What can I declare that can possibly hold any FilePointerT<WrapperT> for any WrapperT?
Or, how can I change the definition of FilePointerT in order to allow me to supply specific WrapperT?
Can't you simply use std::shared_ptr<FILE> with a custom deleter function? Provide ordinary overloads for the free functions; no funny template malarkey.
You can use type erasure to treat all versions of FilePointerT transparently. As the above poster mentions, I'd also go for a shared_ptr approach; in fact, the deleter isn't even part of the shared_ptr's type, so you'll be able to vary the deleter while keeping the type constant.
For what it's worth, what I ended up doing is to embed the wrapper into the FilePointer class, instead of making it part of its type.
class FilePointer
{
public:
    // create using a file wrapper (which will handle policy issues)
    FilePointer(FileWrapper * pfw) : m_pFileWrapper(pfw) { }
protected:
    FileWrapper * m_pFileWrapper; // wrapper has close/flush policy
    ReferenceCount m_references; // reference count
};
Then the file pointer just delegates the real work to the wrapper, the wrapper implements the needed policy, and code can be written to consume FilePointer(s).
There are obviously other ways to do this, but that's what I went with.
I have my own multi-threaded service which handles some commands. The service consists of a command parser, worker threads with queues, and some caches. I don't want to keep an eye on each object's life cycle, so I use shared_ptrs very extensively. Every component uses shared_ptrs in its own way:
the command parser creates shared_ptrs and stores them in the cache;
the worker binds shared_ptrs to functors and puts them in its queue;
the cache temporarily or permanently holds some shared_ptrs;
the data referenced by a shared_ptr can itself hold other shared_ptrs.
And there is another underlying service (for example, a command receiver and sender) that has the same structure, but uses its own cache, workers, and shared_ptrs. It's independent of my service and is maintained by another developer.
It's a complete nightmare when I try to track all shared_ptr dependencies to prevent circular references.
Is there a way to specify some shared_ptr "interface" or "policy", so that I know which shared_ptrs I can safely pass to the underlying service without inspecting the code or interacting with the developer? The policy should describe the shared_ptr's owning cycle; for example, the worker holds the functor with the bound shared_ptr from the dispatch() call until some other function call, while the cache holds the shared_ptr from the cache's constructor call until the cache's destructor call.
I'm especially curious about the shutdown situation, where the application may freeze while waiting for the threads to join.
There is no silver bullet... and shared_ptr certainly is not one.
My first question would be: do you need all those shared pointers?
The best way to avoid cyclic references is to define the lifetime policy of each object and make sure they are compatible. This can be easily documented:
you pass me a reference, I expect the object to live throughout the function call, but no more
you pass me a unique_ptr, I am now responsible for the object
you pass me a shared_ptr, I expect to be able to keep a handle to the object myself without adversely affecting you
Now, there are rare situations where the use of shared_ptr is indeed necessary. The mention of caches leads me to think that this might be your case, at least for some uses.
In this case, you can (at least informally) enforce a layering approach.
Define a number of layers, from 0 (the base) to infinite
Each type of object is ascribed to a layer, several types may share the same layer
An object of type A might only hold a shared_ptr to an object of type B if, and only if, Layer(A) > Layer(B)
Note that we expressly forbid sibling relationships. With this scheme, no cycle of references can ever be formed; indeed, we obtain a DAG (Directed Acyclic Graph).
Now, when a type is created, it must be ascribed a layer number, and this must be documented (preferably in the code).
An object may change layer, however:
if its layer number decreases, then you must reexamine the references it holds (easy)
if its layer number increases, then you must reexamine all the references to it (hard)
Note: by convention, types of objects which cannot hold any reference are usually in the layer 0.
Note 2: I first stumbled upon this convention in an article by Herb Sutter, where he applied it to mutexes and tried to prevent deadlock. This is an adaptation to the current issue.
This can be enforced a bit more automatically (by the compiler) as long as you are ready to rework your existing code base.
We create a new SharedPtr class aware of our layering scheme:
template <typename T>
constexpr unsigned getLayer() { return T::Layer; }

template <typename T, unsigned L>
class SharedPtrImpl {
public:
    explicit SharedPtrImpl(T* t): _p(t)
    {
        static_assert(L > getLayer<T>(), "Layering Violation");
    }

    T* get() const { return _p.get(); }
    T& operator*() const { return *this->get(); }
    T* operator->() const { return this->get(); }

private:
    std::shared_ptr<T> _p;
};
Each type that may be held in such a SharedPtr is given its layer statically, and we use a base class to help us out:
template <unsigned L>
struct LayerMember {
    static constexpr unsigned Layer = L;

    template <typename T>
    using SharedPtr = SharedPtrImpl<T, L>;
};
And now, we can easily use it:
class Foo: public LayerMember<3> {
public:
    // ...
private:
    SharedPtr<Bar> _bar; // statically checked!
};
However, this coding approach is a little more involved; I think that convention may well be sufficient ;)
You should look at weak_ptr. It complements shared_ptr but does not keep objects alive, so is very useful when you might have circular references.
I often come across the problem that I have a class with a pair of Register/Unregister-style methods, e.g.:
class Log {
public:
    void AddSink( ostream & Sink );
    void RemoveSink( ostream & Sink );
};
This applies to several different cases, like the Observer pattern or related stuff. My concern is: how safe is this? From a previous question I know that I cannot safely derive object identity from that reference. One approach returns an iterator to the caller, which they have to pass to the unregister method, but this exposes implementation details (the iterator type), so I don't like it. I could return an integer handle, but that would require a lot of extra internal management (what is the smallest free handle?). How do you go about this?
You are safe unless the client object has two derivations of ostream without using virtual inheritance.
In short, that is the fault of the user: they should not be multiply inheriting an interface class twice in two different ways.
Use the address and be done with it. In these cases, I take a pointer argument rather than a reference to make it explicit that I will store the address. It also prevents implicit conversions that might kick in if you decided to take a const reference.
class Log {
public:
    void AddSink( ostream* Sink );
    void RemoveSink( ostream* Sink );
};
You can create an RAII object that calls AddSink in the constructor, and RemoveSink in the destructor to make this pattern exception-safe.
You could manage your objects using smart pointers and compare the pointers for equality inside your register / deregister functions.
If you only have stack-allocated objects that are never copied between a register and a deregister call, you could also pass a pointer instead of the reference.
You could also do:
typedef iterator handle_t;
and hide the fact that you're giving out internal iterators, if exposing internal data structures worries you.
In your previous question, Konrad Rudolph posted an answer (that you did not accept but has the highest score), saying that everything should be fine if you use base class pointers, which you appear to do.