How do standard containers allocate memory for their nodes (internal structures)? - c++

I cannot figure out how std::set and std::map, for example, allocate the memory for their nodes if they have allocators of type
std::allocator<Key>
and
std::allocator<std::pair<const Key, T> >
respectively. As far as I can guess, there should be a code like this, in std::set, for example:
std::pair<iterator, bool> insert(const value_type& value)
{
...
Node * node = new Node();
node->value = value;
...
return InsertNode(node);
}
or
std::pair<iterator, bool> insert(const value_type& value)
{
...
Node * node = new Node();
node->p_value = a1.allocate(1);
*(node->p_value) = value;
...
return InsertNode(node);
}
where Node is some internal structure, such as a red-black tree node.
So the question is: how is the memory for this Node allocated?

Allocators in C++ (for some reason) are expected to be typed. That is, a specific allocator class instance allocates objects of a specific type.
However, because the actual type a container may need to allocate can be different from the container's value type (the type the container logically contains), allocators are (mostly) required to be convertible into an alternate allocator instance that allocates objects of any other type. This process is called "rebinding". It is requested through the allocator's nested template, typename Alloc::template rebind<U>::other, or, since C++11, through std::allocator_traits<Alloc>::rebind_alloc<U>, where U is the type the container actually wants to allocate.
A new allocator instance which is a rebound form of the original allocator must allocate from the same memory pool as the original allocator. Rebinding is therefore treated as a type-based change, not a truly distinct object.
Implementations of standard library containers are not permitted to use new or delete; all of their dynamic de/allocation and object creation/destruction within allocated memory must be performed through the allocator.
When std::set<T> goes to insert an item, it will rebind your allocator to some node-type which internally contains either a T itself or enough properly aligned storage to create a T within. It will then use the allocator's interface to create that node type and initialize the T it contains. std::map is a bit more complicated, but essentially its node type must store a Key and a Value.
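The rebinding step described above can be sketched roughly as follows. This is a minimal illustration, not how any particular standard library is written: Node, make_node, free_node and demo_node_roundtrip are hypothetical names, the sketch assumes the allocator hands out raw pointers, and it constructs the whole node through the allocator for brevity, whereas real implementations typically construct only the contained element that way.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type; real libraries use their own internal layouts.
template <class T>
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
    T value;
    explicit Node(const T& v) : value(v) {}
};

// Allocates and constructs one node using only the user's allocator.
template <class T, class Alloc>
Node<T>* make_node(const Alloc& user_alloc, const T& value) {
    // Rebind the user's allocator (for T) to an allocator for Node<T>.
    using NodeAlloc =
        typename std::allocator_traits<Alloc>::template rebind_alloc<Node<T>>;
    using NodeTraits = std::allocator_traits<NodeAlloc>;

    NodeAlloc node_alloc(user_alloc);                  // converting constructor
    Node<T>* p = NodeTraits::allocate(node_alloc, 1);  // raw storage for one node
    try {
        NodeTraits::construct(node_alloc, p, value);   // build the node in place
    } catch (...) {
        NodeTraits::deallocate(node_alloc, p, 1);      // do not leak on failure
        throw;
    }
    return p;
}

// Destroys and releases a node obtained from make_node.
template <class T, class Alloc>
void free_node(const Alloc& user_alloc, Node<T>* p) {
    using NodeAlloc =
        typename std::allocator_traits<Alloc>::template rebind_alloc<Node<T>>;
    using NodeTraits = std::allocator_traits<NodeAlloc>;

    NodeAlloc node_alloc(user_alloc);
    NodeTraits::destroy(node_alloc, p);
    NodeTraits::deallocate(node_alloc, p, 1);
}

// Round trip with the default allocator: no direct new/delete in sight.
int demo_node_roundtrip() {
    std::allocator<int> user_alloc;  // what a std::set<int> would be given
    Node<int>* n = make_node(user_alloc, 42);
    int v = n->value;
    assert(v == 42);
    free_node(user_alloc, n);
    return v;
}
```

Calling demo_node_roundtrip() from main exercises the whole path. Note that the container never names std::allocator<Node<int>> itself; it only asks allocator_traits for the rebound type.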

The allocator that you provide is rebound to the type of the node structure, and then used to allocate the nodes.
std::allocator (the default) will call operator new, which then will do something implementation-specific (usually calling malloc).

Related

What is the purpose of pointer rebind?

I am trying to implement std::list (MSVC), and there is one thing I cannot understand:
template <class _Value_type, class _Voidptr> // voidptr? For what?
struct _List_node { // list node
using value_type = _Value_type;
using _Nodeptr = _Rebind_pointer_t<_Voidptr, _List_node>; // what is the purpose of such rebind?
...
}
I understand the reason for allocator rebind, but pointer rebind? Why should I use it, and where?
UPD: I understand what rebind is. I mean, why not just _List_node*? Why do I need rebind? (Thanks to Evg)
The answer to this question comes from allocators, too. Let's take a look at how _Rebind_pointer_t is defined:
template <class _Ptr, class _Ty>
using _Rebind_pointer_t = typename pointer_traits<_Ptr>::template rebind<_Ty>;
That is, we have
template <class _Value_type, class _Voidptr>
struct _List_node {
using _Nodeptr = typename pointer_traits<_Voidptr>::template rebind<_List_node>;
// ...
}
Now let's take a look at how _List_node is used:
using _Node = _List_node<_Ty, typename _Alty_traits::void_pointer>;
Effectively, we rebind allocator's void_pointer to _List_node pointer. This trick is needed to support allocators that use fancy pointers internally.
One such example can be found in Boost.Interprocess library. It has boost::interprocess::allocator:
An STL compatible allocator that uses a segment manager as memory source. The internal pointer type will of the same type (raw, smart) as typename SegmentManager::void_pointer type. This allows placing the allocator in shared memory, memory mapped-files, etc...
For example, we can write
namespace bi = boost::interprocess;
using Allocator = bi::allocator<int, bi::managed_shared_memory::segment_manager>;
std::list<int, Allocator> list(/* allocator object */);
Now std::allocator_traits<decltype(list)::allocator_type>::void_pointer will not be void*, as with the default allocator, but boost::interprocess::offset_ptr<void, ...>. As a result, _Nodeptr will not be _List_node*, but boost::interprocess::offset_ptr<_List_node, ...>.
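The effect of the pointer rebind can be checked at compile time with a small sketch. ListNode and ToyPtr are hypothetical stand-ins (for _List_node and for a fancy pointer such as offset_ptr); ToyPtr is deliberately not a complete fancy pointer, only enough of a template for std::pointer_traits to rebind it.

```cpp
#include <memory>
#include <type_traits>

struct ListNode;  // hypothetical stand-in for _List_node

// With the default allocator, void_pointer is void*, so the rebound
// node pointer is an ordinary raw pointer.
using VoidPtr = std::allocator_traits<std::allocator<int>>::void_pointer;
using NodePtr = std::pointer_traits<VoidPtr>::rebind<ListNode>;
static_assert(std::is_same<VoidPtr, void*>::value, "default void_pointer is void*");
static_assert(std::is_same<NodePtr, ListNode*>::value, "rebinds to a raw node pointer");

// Toy fancy pointer (hypothetical). pointer_traits rebinds a class template
// of the form Ptr<T> by swapping its first template argument.
template <class T>
struct ToyPtr {
    T* raw = nullptr;
};

using FancyNodePtr = std::pointer_traits<ToyPtr<void>>::rebind<ListNode>;
static_assert(std::is_same<FancyNodePtr, ToyPtr<ListNode>>::value,
              "rebinds to the same fancy pointer family");
```

Writing ListNode* directly would have hard-coded the raw-pointer case and lost the second one.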
The user instantiates the list with the value_type.
For example for list<int> the value_type would be int.
The list's allocator (which can also be provided by the user) is declared as allocating memory for objects of value_type.
But value_type is not what the list holds internally.
Internally, the list holds **Nodes**, of which the value_type object is a member.
So rebind is used to convert both the allocator and its pointer type from value_type to Node (which holds a value_type and, at least, a pointer to the next node).
By contrast, this would not be needed for a vector<int>, for example.
That's because the internal representation of a vector normally holds a pointer to an array of value_type objects, which is int in this case. So no rebind is needed here.

Is it possible to have an array of smart pointers that automatically updates its values with their index?

In C++, is there a way to write an array of smart pointers that automatically updates the pointed-to values with their index in the array? The pointed-to values have a member to store the index, similar to an intrusive refcount.
I am interested in writing a heap with updatable priorities. If the values in the heap were always updated to point to their index inside the heap storage, without special knowledge inside the heap algorithm, it would be easy to follow that link back into the heap when changing the value's priority. Knowing the position of the changed item, the heap invariant could then be quickly restored.
This is my attempt at a basic implementation. I would prefer to parameterize Container's reference to the global array without making instances larger than one pointer, and it would be good to improve safety. It would be more useful if it were also a random access iterator.
class Contained {
public:
uintptr_t index;
};
class Container {
public:
Contained *value;
Container& operator=(Container& other);
};
Container foobars[4];
Container& Container::operator=(Container& other) {
this->value = other.value;
this->value->index = static_cast<uintptr_t>(this - foobars); // element index; assumes *this lives inside foobars
return *this;
}

Can I use allocator specified for some type to allocate objects of another type in C++?

Some container A has a template parameter Alloc (that is a template too) representing an allocator type. A specifies Alloc for the type A::Node.
template <template <class> class Alloc>
class A {
struct Node {
};
Alloc<Node> allocator_; // the allocator object
};
Please excuse me for possibly wrong C++ code above.
So, allocator_.allocate(1) will allocate sizeof(A::Node) bytes. But during operation, container A needs memory for some object of a type other than A::Node, say a temporary string (of chars).
From a technical point of view, I could use the existing allocator in a dirty way like this:
size_t string_len = 500;
// how many Node-sized objects are enough to fit our string?
size_t equal_size = (string_len / sizeof(Node)) + 1;
auto mem = allocator_.allocate(equal_size);
char *p = (char*)mem; // reinterpret cast
// ... use p to store the string ... memcpy(p, str_src, str_len); //
// Now p is not needed, so return memory to allocator:
allocator_.deallocate(mem, equal_size);
Is there a less dirty approach, considering that I need no more than one allocator and I wish to hand all the memory management to it?
All this comes from these needs:
to have a single allocator that could be killed to free all (possibly leaked) memory that A has allocated for any of its purposes during operation
to have no more than one allocator (including the default ::new, ::delete)
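std::allocator has a member type rebind for exactly that purpose (the member itself was deprecated in C++17 and removed in C++20; std::allocator_traits<Alloc>::rebind_alloc<U> is the portable spelling):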
std::allocator<Node> alloc;
std::allocator<Node>::rebind<char>::other char_alloc;
char * mem = char_alloc.allocate(string_len);
You can use an allocator's rebind for this. From this documentation:
A structure that enables an allocator for objects of one type to allocate storage for objects of another type.
it is built exactly for your case - taking an allocator type oriented to one type, and building the corresponding one oriented to some other type.
In your case, it might look like
typename Alloc<T>::template rebind<Node>::other allocator_;
You should probably use the Alloc::rebind member template to get an allocator for that other object type.
However, that does mean that you do have 2 allocators. The advantage of rebind is to allow the user of your template to specify the allocator type only for a single allocated type.
Also note that rebind is optional, so if you must support such allocators, you'll need to pass the other allocator as an argument, but you can still use the rebound allocator as a default value.
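As a sketch of the rebind approach from these answers, the following uses std::allocator_traits to derive a char allocator from a node allocator instance. TreeNode, copy_via_rebound_allocator and demo_scratch_copy are hypothetical names, and the sketch assumes an allocator whose pointer type is a raw pointer (as with std::allocator).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

struct TreeNode {  // hypothetical node type standing in for A::Node
    TreeNode* next;
    int payload;
};

// Copies `src` into scratch memory obtained from a rebound copy of
// `node_alloc`, checks the copy, and returns the memory to the allocator.
template <class NodeAlloc>
bool copy_via_rebound_allocator(const NodeAlloc& node_alloc, const char* src) {
    using CharAlloc =
        typename std::allocator_traits<NodeAlloc>::template rebind_alloc<char>;
    using CharTraits = std::allocator_traits<CharAlloc>;

    CharAlloc char_alloc(node_alloc);           // shares node_alloc's state, if any
    const std::size_t n = std::strlen(src) + 1; // include the terminator
    char* p = CharTraits::allocate(char_alloc, n);
    std::memcpy(p, src, n);
    const bool same = std::strcmp(p, src) == 0;
    CharTraits::deallocate(char_alloc, p, n);   // same count as allocated
    return same;
}

void demo_scratch_copy() {
    std::allocator<TreeNode> node_alloc;
    assert(copy_via_rebound_allocator(node_alloc, "temporary string"));
}
```

For a stateful allocator, the rebound copy is expected to draw from the same underlying resource as the original, which is what makes the "single allocator" requirement workable in practice even though two allocator objects exist.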

Compile time method to determine whether object has automatic storage duration

I'd like to be able to enforce at compile time that a particular type can be used only to create objects with automatic storage duration.
template<typename T, typename Alloc>
struct Array
{
T* data; // owned resource
Array(std::size_t size); // allocates via Alloc
~Array(); // deallocates via Alloc
};
typedef Array<int, AutoAllocator<int>> AutoArray;
void foo(AutoArray a) // ok
{
AutoArray l = AutoArray(); // ok
static AutoArray s; // error
new AutoArray(); // error
std::vector<AutoArray> v(1); // error
}
The application for this would be to enable choosing an optimal allocation strategy for resources owned by an instance of AutoArray. The idea being that the resource allocation pattern required for objects with automatic storage duration is compatible with a LIFO resource allocator.
What method could I use to achieve this in C++?
EDIT: The secondary goal is to allow the allocation strategy for Array to be transparently switched by dropping in either AutoAllocator or the default std::allocator.
typedef Array<int, std::allocator<int>> DynamicArray;
Assume that there is a large base of code that already uses DynamicArray.
This cannot be done. Consider a type that holds this as a member. When the compiler generates the code for the constructor of that type, it does not know where the object is being created: is the complete object on the stack, or is it on the heap?
You need to approach the problem with a different mindset. For example, you can pass the allocator to the constructor of the object (the way BSL does) and possibly default to a safe allocator (based on new-delete); then, for those use cases where a LIFO allocator is a better option, the user can explicitly request it.
This won't be the same as a compiler error, but it will be obvious enough to detect in a code review.
If you are really interested in interesting uses of allocators, you might want to take a look at the BSL replacement for the standard library, as it allows for polymorphic allocators that are propagated to the members of containers. In the BSL world, your examples would become:
// Assume a bslma::Allocator implementing LIFO, Type uses that protocol
LifoAllocator alloc; // implements the bslma::Allocator protocol
Type l(&alloc); // by convention bslma::Allocator by pointer
static Type s; // defaults to new-delete if not passed
new (&alloc) Type(&alloc); // both 'Type' and its contents share the allocator
// if the lifetime makes sense, if not:
new Type; // not all objects need to use the same allocator
bsl::vector<Type> v(&alloc);
v.resize(1); // nested object uses the allocator in the container
Using allocators in general is not simple, and you will have to be careful of the relative lifetimes of the objects with respect to each other and to the allocators.
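For readers without BSL, a roughly comparable pattern exists in standard C++17 as std::pmr. The sketch below is not the BSL API and not a strict LIFO allocator: it swaps in std::pmr::monotonic_buffer_resource, an arena that releases everything at once, to illustrate the same idea of passing the allocator by pointer at construction. The function name and buffer size are arbitrary choices.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Sums 1..n using a vector whose storage comes from a stack buffer.
// Assumes n is small enough for the 1024-byte buffer; with the null
// upstream resource, exhausting the buffer throws std::bad_alloc
// instead of silently falling back to the heap.
int sum_with_stack_buffer(int n) {
    std::byte buffer[1024];
    std::pmr::monotonic_buffer_resource arena(
        buffer, sizeof(buffer), std::pmr::null_memory_resource());

    std::pmr::vector<int> v(&arena);  // allocator passed by pointer
    v.reserve(static_cast<std::size_t>(n));
    for (int i = 1; i <= n; ++i) {
        v.push_back(i);
    }

    int sum = 0;
    for (int x : v) {
        sum += x;
    }
    return sum;  // v is destroyed before arena and buffer, as required
}

void demo_stack_buffer() {
    assert(sum_with_stack_buffer(10) == 55);
}
```

The declaration order matters here: the buffer outlives the resource, and the resource outlives the vector, which is the relative-lifetime discipline the answer warns about.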

how to create an allocator for std::map using a pool of objects

This is a followup from stl allocator, copy constructor of other type, rebind
I am using std::map and want a custom allocator that can reuse the storage for the internal nodes. The items being stored are pointers, so I'm not talking about reusing them, just the internal allocations for the map.
The main requirement is that different instances of the map cannot share an object pool; it must be unique per instance.
I don't need a complete solution, I'd just like an idea of how to cope with the required copy constructor that takes allocators of a different type. I don't know how to manage the internal memory in that case.
As you point out in the other question, the allocators shouldn't have any state. Use thread-local storage or a pointer in each allocator object to the memory pool: the allocators merely become a type-specific interface to that pool.
struct MemoryPool {
// none of this depends on the type of objects being allocated
};
template<class T>
struct MyAllocator {
template<class U> struct rebind { typedef MyAllocator<U> other; };
MemoryPool *_pool; // copied to any allocator constructed
template<class U>
MyAllocator(MyAllocator<U> const &other) : _pool(other._pool) {} // converting constructor, used after rebind
// allocate, deallocate use _pool
// construct, destroy deal with T
};
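A fuller sketch of this idea, under stated assumptions: CountingPool is a hypothetical placeholder that simply forwards to operator new and counts live blocks, standing in for a real pool that would recycle node storage; PoolAllocator, nodes_after_inserts and demo_per_map_pool are likewise illustrative names. The point is only the shape of the converting constructor and the per-map pool pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical pool: counts live blocks; a real pool would reuse storage.
struct CountingPool {
    std::size_t live = 0;
    void* take(std::size_t bytes) { ++live; return ::operator new(bytes); }
    void give(void* p) { --live; ::operator delete(p); }
};

template <class T>
struct PoolAllocator {
    using value_type = T;
    CountingPool* pool;  // the only state: a pointer to the shared pool

    explicit PoolAllocator(CountingPool* p) : pool(p) {}

    // Converting constructor: lets the map build its node allocator
    // from the allocator it was given.
    template <class U>
    PoolAllocator(const PoolAllocator<U>& other) : pool(other.pool) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(pool->take(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { pool->give(p); }
};

// Allocators compare equal when they draw from the same pool.
template <class T, class U>
bool operator==(const PoolAllocator<T>& a, const PoolAllocator<U>& b) {
    return a.pool == b.pool;
}
template <class T, class U>
bool operator!=(const PoolAllocator<T>& a, const PoolAllocator<U>& b) {
    return !(a == b);
}

using Map = std::map<int, int*, std::less<int>,
                     PoolAllocator<std::pair<const int, int*>>>;

// Builds a map on `pool`, inserts `count` entries, and reports how many
// blocks are live while the map still exists.
std::size_t nodes_after_inserts(CountingPool& pool, int count) {
    PoolAllocator<Map::value_type> alloc(&pool);
    Map m(alloc);
    for (int i = 0; i < count; ++i) {
        m.emplace(i, nullptr);
    }
    return pool.live;
}

void demo_per_map_pool() {
    CountingPool a, b;                       // one pool per map instance
    assert(nodes_after_inserts(a, 3) >= 3);  // some libraries add a sentinel node
    assert(nodes_after_inserts(b, 5) >= 5);
    assert(a.live == 0 && b.live == 0);      // everything returned on destruction
}
```

Each map only ever touches the pool its allocator points to, so giving every map its own CountingPool keeps the pools unique per instance.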