std::unique_ptr<T[]> and custom allocator deleter - c++

I am trying to use std::unique_ptr<T[]> with custom memory allocators. Basically, I have custom allocators that are subclasses of IAllocator, which provides the following methods:
void* Alloc( size_t size )
template<typename T> T* AllocArray( size_t count )
void Free( void* mem )
template<typename T> void FreeArray( T* arr, size_t count )
Since the underlying memory might come from a pre-allocated block, I need the special ...Array() methods to allocate and free arrays: they allocate/free the memory and call T() / ~T() on every element in the range.
Now, as far as I know, custom deleters for std::unique_ptr use the signature:
void operator()(T* ptr) const
In the case of unique_ptr<T[]>, normally you would call delete[] and be done with it, but I have to call FreeArray<T>, for which I need the number of elements in the range. Given only the raw pointer, I think there is no way of obtaining the size of the range, hence the only thing I could come up with is this:
std::unique_ptr<T[], MyArrDeleter> somePtr( allocator.AllocArray<T>( 20 ), MyArrDeleter( allocator, 20 ) );
Where essentially the size of the array has to be passed into the deleter object manually. Is there a better way to do this? This seems quite error-prone to me...

Yes, there most certainly is a better way:
Use a maker-function.
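template<class T, class A>
std::unique_ptr<T[], MyArrDeleter> my_maker(size_t count, A&& allocator) {
    // The count is passed to the allocation and to the deleter in one place, so the two cannot drift apart.
    // `template` is required because AllocArray is a member template of a dependent type.
    return std::unique_ptr<T[], MyArrDeleter>(allocator.template AllocArray<T>(count), MyArrDeleter(allocator, count));
}
auto p = my_maker<T>(42, allocator);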
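To make the maker-function idea concrete, here is a minimal sketch. MallocAllocator, ArrayDeleter, make_array and Tracked are hypothetical names invented for illustration; MallocAllocator merely stands in for a concrete IAllocator subclass and assumes count > 0. The point is only that the element count is written once and travels with the deleter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical stand-in for a concrete IAllocator; only the array methods are sketched.
struct MallocAllocator {
    template <typename T>
    T* AllocArray(std::size_t count) {
        T* arr = static_cast<T*>(std::malloc(count * sizeof(T)));
        if (arr == nullptr) throw std::bad_alloc{};
        for (std::size_t i = 0; i < count; ++i) ::new (static_cast<void*>(arr + i)) T();
        return arr;
    }
    template <typename T>
    void FreeArray(T* arr, std::size_t count) {
        for (std::size_t i = count; i > 0; --i) arr[i - 1].~T();  // destroy in reverse order
        std::free(arr);
    }
};

// The deleter remembers which allocator produced the array and how many elements it holds.
template <typename T, typename A>
struct ArrayDeleter {
    A* allocator;
    std::size_t count;
    void operator()(T* ptr) const { allocator->FreeArray(ptr, count); }
};

// The count appears exactly once at the call site.
template <typename T, typename A>
std::unique_ptr<T[], ArrayDeleter<T, A>> make_array(A& allocator, std::size_t count) {
    return std::unique_ptr<T[], ArrayDeleter<T, A>>(
        allocator.template AllocArray<T>(count), ArrayDeleter<T, A>{&allocator, count});
}

// Counts live instances so the demo can observe constructor and destructor calls.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

bool demo_make_array() {
    MallocAllocator allocator;
    {
        auto p = make_array<Tracked>(allocator, 20);
        if (Tracked::live != 20) return false;  // all 20 constructors ran
    }  // the deleter calls FreeArray(ptr, 20) here
    return Tracked::live == 0;
}
```

Note that the deleter holds a raw pointer to the allocator, so the allocator has to outlive every array made from it.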

T* doesn't carry that information, and unique_ptr doesn't know the size of the array either (by default it simply calls delete[], as you stated). You could make each element a unique_ptr<T> so that destruction is managed automatically, but this may not be possible if the whole contiguous T* range is managed by a memory allocator (rather than each T object individually). E.g.:
unique_ptr<unique_ptr<Foo>[]> data;
data.reset(new unique_ptr<Foo>[50]);
data[0].reset(new Foo());

Related

Create C++ array of unknown type

Is there some way to create an array in C++ where we don't know the type, but we do know its size and alignment requirements?
Let's say we have a template:
template<typename T>
T* create_array(size_t numElements) { return new T[numElements]; }
This works because each element T has known size and alignment, which is known at compile-time. But I'm looking for something where we can delegate the creation for later by simply extracting size and align and passing them on. This is the interface that I seek:
// my_header.hpp
// "internal" helper function, implementation in source file!
void* _create_array(size_t s, size_t a, size_t n);
template<typename T>
T* create_array(size_t numElements) {
return (T*)_create_array(sizeof(T), alignof(T), numElements);
}
Can we implement this in a source file?:
#include "my_header.hpp"
void* _create_array(size_t s, size_t a, size_t n) {
// ... ?
}
Requirements:
Each array element must have the correct alignment.
The total array size must be equal to s*n, and be aligned to a.
Type safety is assumed to be managed by the templated interface.
Indexing into the array should use correct size and align offsets.
I'm using C++20, so newer features may also be considered.
In advance, thank you!
While you can also implement this yourself, you can simply use std::allocator:
template<typename T>
constexpr T* create_array(size_t numElements) {
std::allocator<T> a;
return std::allocator_traits<decltype(a)>::allocate(a, numElements);
}
and then
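template<typename T>
constexpr void destroy_array(T* ptr, size_t numElements) noexcept {
    std::allocator<T> a;
    // deallocate must receive the same element count that was passed to allocate
    std::allocator_traits<decltype(a)>::deallocate(a, ptr, numElements);
}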
The benefit over doing it yourself via a call to operator new is that this will also be usable in constant expression evaluation.
You then need to create objects in the returned storage via placement-new, std::allocator_traits<std::allocator<T>>::construct or std::construct_at.
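As a rough illustration of the full life cycle (allocate, construct, use, destroy, deallocate), here is a small sketch using std::allocator_traits. The function name demo_allocate_then_construct is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

std::string demo_allocate_then_construct() {
    using Alloc = std::allocator<std::string>;
    using Traits = std::allocator_traits<Alloc>;
    Alloc alloc;
    const std::size_t n = 3;

    // Raw storage only: no std::string objects exist yet.
    std::string* storage = Traits::allocate(alloc, n);

    // Create the objects in place; this forwards to std::string(count, ch).
    for (std::size_t i = 0; i < n; ++i)
        Traits::construct(alloc, storage + i, i + 1, 'x');

    std::string joined = storage[0] + "," + storage[1] + "," + storage[2];

    // Destroy the objects before giving the storage back.
    for (std::size_t i = n; i > 0; --i)
        Traits::destroy(alloc, storage + i - 1);
    Traits::deallocate(alloc, storage, n);  // same n as passed to allocate
    return joined;
}
```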
Anyway, first make sure that you really need to do all of this memory management manually. Standard library containers already offer similar functionality, e.g. std::vector has a .reserve member function to reserve memory in which objects can be placed later via push_back, emplace_back, resize, etc.
If you want to implement the above yourself, you basically need
#include<new>
//...
void* create_array(size_t s, size_t a, size_t n) {
// CAREFUL: check here that `s*n` does not overflow! Potential for vulnerabilities!
return ::operator new(s*n, std::align_val_t{a});
}
void destroy_array(void* ptr, size_t a) noexcept {
::operator delete(ptr, std::align_val_t{a});
}
(Note that identifiers starting with an underscore are reserved in the global namespace scope and may not be used there as function names, so I changed the name.)
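For what it's worth, the overflow check mentioned in the comment could look roughly like this. It is a sketch that assumes C++17 (for std::align_val_t); create_array_checked, destroy_array_checked and Wide are names invented here, and throwing std::bad_array_new_length is just one reasonable way to report the error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>

// Type-erased array allocation with an explicit overflow check on s * n.
void* create_array_checked(std::size_t s, std::size_t a, std::size_t n) {
    if (s != 0 && n > std::numeric_limits<std::size_t>::max() / s)
        throw std::bad_array_new_length{};
    return ::operator new(s * n, std::align_val_t{a});
}

void destroy_array_checked(void* ptr, std::size_t a) noexcept {
    ::operator delete(ptr, std::align_val_t{a});
}

// Over-aligned example type (hypothetical).
struct alignas(64) Wide { int value; };

bool demo_aligned_roundtrip() {
    void* raw = create_array_checked(sizeof(Wide), alignof(Wide), 4);
    const bool aligned = reinterpret_cast<std::uintptr_t>(raw) % alignof(Wide) == 0;
    Wide* arr = static_cast<Wide*>(raw);
    for (int i = 0; i < 4; ++i) ::new (static_cast<void*>(arr + i)) Wide{i};
    const bool ok = aligned && arr[3].value == 3;
    // Wide is trivially destructible, so no destructor calls are needed here.
    destroy_array_checked(raw, alignof(Wide));
    return ok;
}

bool demo_overflow_rejected() {
    try {
        create_array_checked(sizeof(Wide), alignof(Wide),
                             std::numeric_limits<std::size_t>::max());
    } catch (const std::bad_array_new_length&) {
        return true;  // s * n would have wrapped around
    }
    return false;
}
```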

Custom allocator with preallocated memory for STL containers

I'd like to use a std::vector which allocates the memory for its element from a preallocated buffer. So, I would like to provide a buffer pointer T* buffer of size n to std::vector.
I thought I could simply write a std::span-like class which also provides a push_back method; that would be exactly what I need. However, I've stumbled across the code from this post (see below) which seems to solve this problem with a custom allocator.
Nobody commented on that, but doesn't the example with std::vector<int, PreAllocator<int>> my_vec(PreAllocator<int>(&my_arr[0], 100)); provided in the post end in undefined behavior? I've run the code with the Visual Studio 2019 implementation, and at least this implementation rebinds the provided allocator to allocate an element of type struct std::_Container_proxy. Now this should be a huge problem, since you've only provided memory to store your 100 ints. Am I missing something?
template <typename T>
class PreAllocator
{
private:
T* memory_ptr;
std::size_t memory_size;
public:
typedef std::size_t size_type;
typedef T* pointer;
typedef T value_type;
PreAllocator(T* memory_ptr, std::size_t memory_size) : memory_ptr(memory_ptr), memory_size(memory_size) {}
PreAllocator(const PreAllocator& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};
template<typename U>
PreAllocator(const PreAllocator<U>& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};
template<typename U>
PreAllocator& operator = (const PreAllocator<U>& other) { return *this; }
PreAllocator<T>& operator = (const PreAllocator& other) { return *this; }
~PreAllocator() {}
pointer allocate(size_type n, const void* hint = 0) { return memory_ptr; }
void deallocate(T* ptr, size_type n) {}
size_type max_size() const { return memory_size; }
};
int main()
{
int my_arr[100] = { 0 };
std::vector<int, PreAllocator<int>> my_vec(0, PreAllocator<int>(&my_arr[0], 100));
}
The standard makes no requirement on the types of the objects that are allocated by vector using the provided allocator. The requirements are placed on the storage of the elements (the storage must be contiguous), but the implementation is free to make additional allocations of objects of other types, including the case when the allocator is used to allocate raw storage to place both internal data of the container and the elements. This point is especially relevant to node-based containers, such as list or map, but it is also valid for vector.
Furthermore, the implementation is free to perform multiple allocation requests as a result of user requests. For example, two calls to push_back may result in two allocation requests. This means that the allocator must keep track of the previously allocated storage and perform new allocations from the unallocated storage. Otherwise, container's internal structures or previously inserted elements may get corrupted.
In this sense, the PreAllocator template, as specified in the question, indeed has multiple issues. Most importantly, it doesn't track allocated memory and always returns the pointer to the beginning of the storage from allocate. This will almost certainly cause problems, unless the user is lucky enough to use a specific implementation of vector that doesn't allocate anything other than the storage for its elements, and the user is very careful about the operations they invoke on the vector.
Next, the allocator does not detect storage exhaustion. This could lead to out-of-bound error conditions.
Lastly, the allocator does not ensure proper alignment of the allocated storage. The underlying buffer is only aligned to alignof(int), which may not be enough if the container allocates its internal structures that have higher alignment requirements (e.g. if the structures contain pointers, and pointers are larger than int).
The general recommendation when designing allocators is to implement them in terms of raw storage of bytes. That storage may be used to create objects of different types, sizes and alignment requirements, which may be allocated through different copies of the allocator (after rebinding to those other types). In this sense, the allocator type you pass to the container is only a handle, which may be rebound and copied by the container as it sees fit.
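To illustrate that recommendation, here is a minimal sketch of a byte-based arena plus a rebindable allocator handle. Arena and ArenaAllocator are hypothetical names, not taken from the post. The sketch addresses the three issues above (tracking, exhaustion, alignment) but never reuses freed memory, so it is only suitable for short-lived containers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hands out aligned chunks of a caller-provided byte buffer, front to back.
class Arena {
public:
    Arena(void* buffer, std::size_t size)
        : base_(static_cast<unsigned char*>(buffer)), size_(size), used_(0) {}

    void* allocate(std::size_t bytes, std::size_t align) {
        const std::uintptr_t current = reinterpret_cast<std::uintptr_t>(base_ + used_);
        const std::size_t padding = (align - current % align) % align;
        const std::size_t remaining = size_ - used_;
        if (padding > remaining || bytes > remaining - padding)
            throw std::bad_alloc{};  // exhaustion is detected, not ignored
        void* result = base_ + used_ + padding;
        used_ += padding + bytes;    // remember what has been handed out
        return result;
    }
    bool owns(const void* p) const {
        const unsigned char* q = static_cast<const unsigned char*>(p);
        return q >= base_ && q < base_ + size_;
    }

private:
    unsigned char* base_;
    std::size_t size_;
    std::size_t used_;
};

// Only a handle: every copy and every rebound copy refers to the same Arena.
template <typename T>
class ArenaAllocator {
public:
    using value_type = T;
    explicit ArenaAllocator(Arena& arena) noexcept : arena_(&arena) {}
    template <typename U>
    ArenaAllocator(const ArenaAllocator<U>& other) noexcept : arena_(other.arena()) {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc{};
        return static_cast<T*>(arena_->allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) noexcept {}  // memory is reclaimed with the arena
    Arena* arena() const noexcept { return arena_; }

private:
    Arena* arena_;
};

template <typename T, typename U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return a.arena() == b.arena();
}
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return !(a == b);
}

bool demo_vector_lives_in_buffer() {
    alignas(std::max_align_t) unsigned char storage[4096];
    Arena arena(storage, sizeof storage);
    ArenaAllocator<int> alloc(arena);
    std::vector<int, ArenaAllocator<int>> v(alloc);
    for (int i = 0; i < 50; ++i) v.push_back(i);  // several reallocations
    return arena.owns(v.data()) && v.front() == 0 && v.back() == 49;
}

bool demo_exhaustion_throws() {
    alignas(std::max_align_t) unsigned char storage[64];
    Arena arena(storage, sizeof storage);
    ArenaAllocator<int> alloc(arena);
    std::vector<int, ArenaAllocator<int>> v(alloc);
    try { v.reserve(1000); } catch (const std::bad_alloc&) { return true; }
    return false;
}
```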

std::end for unique_ptr<T[]>

I want to implement std::end for a unique pointer.
The problem is that I have to get N (the count of elements in the array).
1. Deduce N from the template argument
template <typename T, size_t N>
T* end(const unique_ptr<T[N]> &arr)
{
return arr.get() + N;
}
But I got the error: C2893: Failed to specialize function template 'T *test::end(const std::unique_ptr> &)' with [ _Ty=T [N] ] With the following template arguments: 'T=int' 'N=0x00'
It looks like it is not possible to deduce N.
2. Get N from the allocator
The allocator has to know N to execute delete[] correctly.
You could read about this in this article. There are two approaches:
Over-allocate the array and put n just to the left.
Use an associative array with p as the key and n as the value.
The problem is how to get this size in a cross-platform, cross-compiler way.
Maybe someone knows a better approach, or knows how to make this work?
If you have a run time sized array and you need to know the size of it without having to manually do the book keeping then you should use a std::vector. It will manage the memory and size for you.
std::unique_ptr<T[]> is just a wrapper for a raw pointer. You cannot get the size of the block the pointer points to from just the pointer. The reason you use a std::unique_ptr<T[]> over T* foo = new T[size] is the unique_ptr makes sure delete[] is called when the pointer goes out of scope.
Something like this?
template<class X>
struct sized_unique_buffer;
template<class T, std::size_t N>
struct sized_unique_buffer<T[N]>:
std::unique_ptr<T[]>
{
using std::unique_ptr<T[]>::unique_ptr;
T* begin() const { return this->get(); }
T* end() const { return *this ? begin() + N : nullptr; }
bool empty() const { return N==0 || !*this; }
};
where we have a compile-time unenforced promise of a fixed compile-time length.
A similar design could work for a dynamic runtime length.
In some compilers, the number of T when T can be trivially destroyed is not stored when you call new T[N]. The system is free to over-allocate and give you a larger buffer (ie, round to a page boundary for a large allocation, or implicitly store the size of the buffer via the location from which it is allocated to reduce overhead and round allocations up), so the allocation size need not exactly match the number of elements.
For non-trivially destroyed T it is true that the compiler must know how many to destroy from just the pointer. This information is not exposed to C++.
You can do manual allocation of buffers and the count and pass that on to a unique_ptr with a custom deleter, even a stateless one. This would permit a type
unique_buffer<T[]> ptr;
where you can get the number of elements out at only a modest runtime cost.
If you instead store the length in the deleter, you can get a bit more locality on the loop limits (saving a cache miss) at the cost of a larger unique_buffer<T[]>.
Doing this with an unadulterated unique_ptr<T[]> is not possible in a portable way.
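As a rough sketch of the "store the length in the deleter" variant: sized_delete, sized_array and make_sized_array are names I made up for illustration, and the memory still comes from plain new[]. The free functions begin and end are found by argument-dependent lookup because sized_delete lives in the same namespace.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>

// Deleter that carries the element count alongside the pointer.
template <typename T>
struct sized_delete {
    std::size_t size;
    void operator()(T* p) const noexcept { delete[] p; }
};

template <typename T>
using sized_array = std::unique_ptr<T[], sized_delete<T>>;

// The count is stated once and stored in the deleter.
template <typename T>
sized_array<T> make_sized_array(std::size_t n) {
    return sized_array<T>(new T[n](), sized_delete<T>{n});
}

template <typename T>
T* begin(const sized_array<T>& a) noexcept { return a.get(); }

template <typename T>
T* end(const sized_array<T>& a) noexcept { return a.get() + a.get_deleter().size; }

int demo_sum() {
    auto a = make_sized_array<int>(5);
    int next = 1;
    for (int* it = begin(a); it != end(a); ++it) *it = next++;  // fills 1..5
    return std::accumulate(begin(a), end(a), 0);
}
```

The price is that the smart pointer is now two words wide instead of one.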

A templated 'strdup()'?

template<typename T>
static T *anydup(const T *src, size_t len) {
T *ptr = static_cast<T *>(malloc(len * sizeof(T))); // C++ needs an explicit cast from void*
memcpy(ptr, src, (len * sizeof(T)));
return ptr;
}
Is this proper? Can I expect any errors from this when using an int, long, etc.? I'm very new to generic programming and am trying to learn more.
No, this is not proper! When you see malloc() in C++ code, you should become very suspicious:
malloc() allocates memory, but doesn't properly create objects. The only way to work with such memory is to use placement new.
memcpy() doesn't respect the copy semantics of C++ objects. It only works with trivially copyable types; for anything else it causes hard-to-find bugs elsewhere (shallow copies and other awful things that lead to UB).
For basic types like char, int, double, it would work. But not for more complex types.
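If you want to keep the malloc()/memcpy() version for basic types only, you can at least make the compiler reject everything else. A minimal sketch; anydup_trivial is a name invented here, and the caller must release the result with free():

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

template <typename T>
T* anydup_trivial(const T* src, std::size_t len) {
    // Refuse types for which a byte-wise copy would be wrong (e.g. std::string).
    static_assert(std::is_trivially_copyable<T>::value,
                  "anydup_trivial only supports trivially copyable types");
    T* ptr = static_cast<T*>(std::malloc(len * sizeof(T)));
    if (ptr == nullptr) return nullptr;  // allocation failed
    std::memcpy(ptr, src, len * sizeof(T));
    return ptr;
}

int demo_dup_ints() {
    const int src[] = {1, 2, 3};
    int* copy = anydup_trivial(src, 3);
    if (copy == nullptr) return -1;
    const int last = copy[2];
    std::free(copy);  // memory from malloc must be released with free
    return last;
}
```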
Alternative 1: adapt your code to properly create and copy objects
template<typename T>
T *anydup (const T *src, size_t len) {
T *ptr = new T[len]; // requires that T has a default constructor
copy (src, src+len, ptr); // requires that T is copyable
return ptr;
}
Attention: there is a risk of a memory leak if the user forgets to delete the array, or UB if the user doesn't use delete[]! To avoid this, you could return a unique_ptr<T[]> instead.
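A minimal sketch of that unique_ptr<T[]> variant (the name anydup_owned is made up for this example):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

template <typename T>
std::unique_ptr<T[]> anydup_owned(const T* src, std::size_t len) {
    std::unique_ptr<T[]> copy(new T[len]);  // requires a default constructor
    std::copy(src, src + len, copy.get());  // element-wise copy assignment
    return copy;                            // delete[] happens automatically
}

std::string demo_dup_strings() {
    const std::string words[] = {"deep", "copy"};
    auto dup = anydup_owned(words, 2);
    return dup[0] + " " + dup[1];
}
```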
Alternative 2: Get rid of arrays and pointers and memory nightmares: use vectors !
template<typename T>
vector<T> anydup (const vector<T>& src) {
    vector<T> v(src.size()); // requires that T has a default constructor
    copy (src.cbegin(), src.cend(), v.begin()); // requires that T is copyable
    return v;
}
You could consider creating the vector using a copy constructor as suggested by Remy Lebeau and FDinoff in the comments, either in the function or directly in the using code.
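For illustration, the copy-constructor route could look like the sketch below. The second overload, which takes a raw pointer and a length, is my own addition for callers that still have a plain array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The copy constructor does all the work; no default constructor is needed.
template <typename T>
std::vector<T> anydup(const std::vector<T>& src) {
    return std::vector<T>(src);
}

// Range constructor for callers that only have a pointer and a length.
template <typename T>
std::vector<T> anydup(const T* src, std::size_t len) {
    return std::vector<T>(src, src + len);
}

bool demo_vector_dup() {
    const std::vector<std::string> original = {"a", "b"};
    std::vector<std::string> copy = anydup(original);
    copy[0] = "changed";  // the original is unaffected: this is a deep copy
    const int raw[] = {4, 5, 6};
    std::vector<int> from_raw = anydup(raw, 3);
    return original[0] == "a" && copy.size() == 2 && from_raw.back() == 6;
}
```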
If you use copy() directly in the using code, you'll soon discover that there are also copy_if(), copy_backward() and some other nice functions in <algorithm> that could be used depending on circumstances.

Stack-buffer based STL allocator?

I was wondering if it is practicable to have a C++ standard library compliant allocator that uses a (fixed sized) buffer that lives on the stack.
Somehow, it seems this question has not been asked this way yet on SO, although it may have been implicitly answered elsewhere.
So basically, it seems, as far as my searches go, that it should be possible to create an allocator that uses a fixed size buffer. Now, on first glance, this should mean that it should also be possible to have an allocator that uses a fixed size buffer that "lives" on the stack, but it does appear, that there is no widespread such implementation around.
Let me give an example of what I mean:
{ ...
char buf[512];
typedef ...hmm?... local_allocator; // should use buf
typedef std::basic_string<char, std::char_traits<char>, local_allocator> lstring;
lstring str; // string object of max. 512 char
}
How would this be implementable?
The answer to this other question (thanks to R. Martinho Fernandes) links to a stack based allocator from the chromium sources: http://src.chromium.org/viewvc/chrome/trunk/src/base/stack_container.h
However, this class seems extremely peculiar, especially since this StackAllocator does not have a default ctor -- and there I was thinking that every allocator class needs a default ctor.
It's definitely possible to create a fully C++11/C++14 conforming stack allocator*. But you need to consider some of the ramifications about the implementation and the semantics of stack allocation and how they interact with standard containers.
Here's a fully C++11/C++14 conforming stack allocator (also hosted on my github):
#include <cstddef>
#include <functional>
#include <iterator>
#include <memory>
#include <utility>
template <class T, std::size_t N, class Allocator = std::allocator<T>>
class stack_allocator
{
public:
typedef typename std::allocator_traits<Allocator>::value_type value_type;
typedef typename std::allocator_traits<Allocator>::pointer pointer;
typedef typename std::allocator_traits<Allocator>::const_pointer const_pointer;
typedef typename Allocator::reference reference;
typedef typename Allocator::const_reference const_reference;
typedef typename std::allocator_traits<Allocator>::size_type size_type;
typedef typename std::allocator_traits<Allocator>::difference_type difference_type;
typedef typename std::allocator_traits<Allocator>::const_void_pointer const_void_pointer;
typedef Allocator allocator_type;
public:
explicit stack_allocator(const allocator_type& alloc = allocator_type())
: m_allocator(alloc), m_begin(nullptr), m_end(nullptr), m_stack_pointer(nullptr)
{ }
explicit stack_allocator(pointer buffer, const allocator_type& alloc = allocator_type())
: m_allocator(alloc), m_begin(buffer), m_end(buffer + N),
m_stack_pointer(buffer)
{ }
template <class U>
stack_allocator(const stack_allocator<U, N, Allocator>& other)
: m_allocator(other.m_allocator), m_begin(other.m_begin), m_end(other.m_end),
m_stack_pointer(other.m_stack_pointer)
{ }
constexpr static size_type capacity()
{
return N;
}
pointer allocate(size_type n, const_void_pointer hint = const_void_pointer())
{
if (n <= size_type(std::distance(m_stack_pointer, m_end)))
{
pointer result = m_stack_pointer;
m_stack_pointer += n;
return result;
}
return m_allocator.allocate(n, hint);
}
void deallocate(pointer p, size_type n)
{
if (pointer_to_internal_buffer(p))
{
m_stack_pointer -= n;
}
else m_allocator.deallocate(p, n);
}
size_type max_size() const noexcept
{
return m_allocator.max_size();
}
template <class U, class... Args>
void construct(U* p, Args&&... args)
{
m_allocator.construct(p, std::forward<Args>(args)...);
}
template <class U>
void destroy(U* p)
{
m_allocator.destroy(p);
}
pointer address(reference x) const noexcept
{
if (pointer_to_internal_buffer(std::addressof(x)))
{
return std::addressof(x);
}
return m_allocator.address(x);
}
const_pointer address(const_reference x) const noexcept
{
if (pointer_to_internal_buffer(std::addressof(x)))
{
return std::addressof(x);
}
return m_allocator.address(x);
}
template <class U>
struct rebind { typedef stack_allocator<U, N, allocator_type> other; };
pointer buffer() const noexcept
{
return m_begin;
}
private:
bool pointer_to_internal_buffer(const_pointer p) const
{
return (!(std::less<const_pointer>()(p, m_begin)) && (std::less<const_pointer>()(p, m_end)));
}
allocator_type m_allocator;
pointer m_begin;
pointer m_end;
pointer m_stack_pointer;
};
template <class T1, std::size_t N, class Allocator, class T2>
bool operator == (const stack_allocator<T1, N, Allocator>& lhs,
const stack_allocator<T2, N, Allocator>& rhs) noexcept
{
return lhs.buffer() == rhs.buffer();
}
template <class T1, std::size_t N, class Allocator, class T2>
bool operator != (const stack_allocator<T1, N, Allocator>& lhs,
const stack_allocator<T2, N, Allocator>& rhs) noexcept
{
return !(lhs == rhs);
}
This allocator uses a user-provided fixed-size buffer as an initial source of memory, and then falls back on a secondary allocator (std::allocator<T> by default) when it runs out of space.
Things to consider:
Before you just go ahead and use a stack allocator, you need to consider your allocation patterns. Firstly, when using a memory buffer on the stack, you need to consider what exactly it means to allocate and deallocate memory.
The simplest method (and the method employed above) is to simply increment a stack pointer for allocations, and decrement it for deallocations. Note that this severely limits how you can use the allocator in practice. It will work fine for, say, an std::vector (which will allocate a single contiguous memory block) if used correctly, but will not work for say, an std::map, which will allocate and deallocate node objects in varying order.
If your stack allocator simply increments and decrements a stack pointer, then you'll get undefined behavior if your allocations and deallocations are not in LIFO order. Even an std::vector will cause undefined behavior if it first allocates a single contiguous block from the stack, then allocates a second stack block, then deallocates the first block, which will happen every time the vector increases its capacity to a value that is still smaller than stack_size. This is why you'll need to reserve the stack size in advance. (But see the note below regarding Howard Hinnant's implementation.)
Which brings us to the question ...
What do you really want from a stack allocator?
Do you actually want a general purpose allocator that will allow you to allocate and deallocate memory chunks of various sizes in varying order (like malloc), except it draws from a pre-allocated stack buffer instead of calling sbrk? If so, you're basically talking about implementing a general purpose allocator that maintains a free list of memory blocks somehow, only the user can provide it with a pre-existing stack buffer. This is a much more complex project. (And what should it do if it runs out of space? Throw std::bad_alloc? Fall back on the heap?)
The above implementation assumes you want an allocator that will simply use LIFO allocation patterns and fall back on another allocator if it runs out of space. This works fine for std::vector, which will always use a single contiguous buffer that can be reserved in advance. When std::vector needs a larger buffer, it will allocate a larger buffer, copy (or move) the elements in the smaller buffer, and then deallocate the smaller buffer. When the vector requests a larger buffer, the above stack_allocator implementation will simply fall back to a secondary allocator (which is std::allocator by default.)
So, for example:
const static std::size_t stack_size = 4;
int buffer[stack_size];
typedef stack_allocator<int, stack_size> allocator_type;
std::vector<int, allocator_type> vec((allocator_type(buffer))); // double parenthesis here for "most vexing parse" nonsense
vec.reserve(stack_size); // attempt to reserve space for 4 elements
std::cout << vec.capacity() << std::endl;
vec.push_back(10);
vec.push_back(20);
vec.push_back(30);
vec.push_back(40);
// Assert that the vector is actually using our stack
//
assert(
std::equal(
vec.begin(),
vec.end(),
buffer,
[](const int& v1, const int& v2) {
return &v1 == &v2;
}
)
);
// Output some values in the stack, we see it is the same values we
// inserted in our vector.
//
std::cout << buffer[0] << std::endl;
std::cout << buffer[1] << std::endl;
std::cout << buffer[2] << std::endl;
std::cout << buffer[3] << std::endl;
// Attempt to push back some more values. Since our stack allocator only has
// room for 4 elements, we cannot satisfy the request for an 8 element buffer.
// So, the allocator quietly falls back on using std::allocator.
//
// Alternatively, you could modify the stack_allocator implementation
// to throw std::bad_alloc
//
vec.push_back(50);
vec.push_back(60);
vec.push_back(70);
vec.push_back(80);
// Assert that we are no longer using the stack buffer
//
assert(
!std::equal(
vec.begin(),
vec.end(),
buffer,
[](const int& v1, const int& v2) {
return &v1 == &v2;
}
)
);
// Print out all the values in our vector just to make sure
// everything is sane.
//
for (auto v : vec) std::cout << v << ", ";
std::cout << std::endl;
See: http://ideone.com/YhMZxt
Again, this works fine for vector - but you need to ask yourself what exactly you intend to do with the stack allocator. If you want a general purpose memory allocator that just happens to draw from a stack buffer, you're talking about a much more complex project. A simple stack allocator, however, which merely increments and decrements a stack pointer will work for a limited set of use cases. Note that for non-POD types, you'll need to use std::aligned_storage<sizeof(T), alignof(T)> to create the actual stack buffer.
I'd also note that unlike Howard Hinnant's implementation, the above implementation doesn't explicitly make a check that when you call deallocate(), the pointer passed in is the last block allocated. Hinnant's implementation will simply do nothing if the pointer passed in isn't a LIFO-ordered deallocation. This will enable you to use an std::vector without reserving in advance because the allocator will basically ignore the vector's attempt to deallocate the initial buffer. But this also blurs the semantics of the allocator a bit, and relies on behavior that is pretty specifically bound to the way std::vector is known to work. My feeling is that we may as well simply say that passing any pointer to deallocate() which wasn't returned via the last call to allocate() will result in undefined behavior and leave it at that.
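To show what such a check looks like, here is a rough sketch of a byte arena that only rewinds when the freed block is the most recent one. This is not Hinnant's code, just an illustration of the idea; lifo_arena is a made-up name, it works on raw bytes and ignores per-allocation alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

template <std::size_t N>
class lifo_arena {
public:
    lifo_arena() : top_(buf_) {}
    lifo_arena(const lifo_arena&) = delete;
    lifo_arena& operator=(const lifo_arena&) = delete;

    unsigned char* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(buf_ + N - top_)) return nullptr;  // exhausted
        unsigned char* p = top_;
        top_ += n;
        return p;
    }

    void deallocate(unsigned char* p, std::size_t n) noexcept {
        if (p + n == top_) top_ = p;  // only the most recent block is reclaimed
        // Any other block is left in place, so out-of-order frees are harmless.
    }

    std::size_t used() const noexcept { return static_cast<std::size_t>(top_ - buf_); }

private:
    alignas(std::max_align_t) unsigned char buf_[N];
    unsigned char* top_;
};

// Returns the bytes in use after freeing the older block, then after freeing the newer one.
std::pair<std::size_t, std::size_t> demo_out_of_order_free() {
    lifo_arena<64> arena;
    unsigned char* a = arena.allocate(8);
    unsigned char* b = arena.allocate(16);
    arena.deallocate(a, 8);   // not the top block: ignored
    const std::size_t after_a = arena.used();
    arena.deallocate(b, 16);  // top block: the pointer rewinds to where b started
    return {after_a, arena.used()};
}
```

Note that the 8 bytes of the first block stay lost until the arena itself goes away, which is the price of ignoring non-LIFO deallocations.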
*Finally - the following caveat: it seems to be debatable whether or not the function that checks whether a pointer is within the boundaries of the stack buffer is even defined behavior by the standard. Order-comparing two pointers from different new/malloc'd buffers is arguably implementation-defined behavior (even with std::less), which perhaps makes it impossible to write a standards-conforming stack allocator implementation that falls back on heap allocation. (But in practice this won't matter unless you're running an 80286 on MS-DOS.)
** Finally (really now), it's also worth noting that the word "stack" in stack allocator is sort of overloaded to refer both to the source of memory (a fixed-size stack array) and the method of allocation (a LIFO increment/decrement stack pointer). When most programmers say they want a stack allocator, they're thinking about the former meaning without necessarily considering the semantics of the latter, and how these semantics restrict the use of such an allocator with standard containers.
Apparently, there is a conforming Stack Allocator from one Howard Hinnant.
It works by using a fixed size buffer (via a referenced arena object) and falling back to the heap if too much space is requested.
This allocator doesn't have a default ctor, and since Howard says:
I've updated this article with a new allocator that is fully C++11 conforming.
I'd say that it is not a requirement for an allocator to have a default ctor.
Starting in C++17 it's actually quite simple to do.
Full credit goes to the author of the dumbest allocator, as that's what this is based on.
The dumbest allocator is a monotonic bump allocator which takes a char[] resource as its underlying storage. In the original version, that char[] is placed on the heap via mmap, but it's trivial to change it to point at a char[] on the stack.
template<std::size_t Size=256>
class bumping_memory_resource {
public:
char buffer[Size];
char* _ptr;
explicit bumping_memory_resource()
: _ptr(&buffer[0]) {}
void* allocate(std::size_t size) noexcept {
auto ret = _ptr;
_ptr += size;
return ret;
}
void deallocate(void*) noexcept {}
};
This allocates Size bytes on the stack on creation, default 256.
template <typename T, typename Resource=bumping_memory_resource<256>>
class bumping_allocator {
Resource* _res;
public:
using value_type = T;
explicit bumping_allocator(Resource& res)
: _res(&res) {}
bumping_allocator(const bumping_allocator&) = default;
template <typename U>
bumping_allocator(const bumping_allocator<U,Resource>& other)
: bumping_allocator(other.resource()) {}
Resource& resource() const { return *_res; }
T* allocate(std::size_t n) { return static_cast<T*>(_res->allocate(sizeof(T) * n)); }
void deallocate(T* ptr, std::size_t) { _res->deallocate(ptr); }
friend bool operator==(const bumping_allocator& lhs, const bumping_allocator& rhs) {
return lhs._res == rhs._res;
}
friend bool operator!=(const bumping_allocator& lhs, const bumping_allocator& rhs) {
return lhs._res != rhs._res;
}
};
And this is the actual allocator. Note that it would be trivial to add a reset to the resource manager, letting you create a new allocator starting at the beginning of the region again. You could also implement a ring buffer, with all the usual risks thereof.
As for when you might want something like this: I use it in embedded systems. Embedded systems usually don't react well to heap fragmentation, so having the ability to use dynamic allocation that doesn't go on the heap is sometimes handy.
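If you would rather not write the resource yourself, the C++17 standard library ships a comparable building block in <memory_resource>. This is a different technique from the hand-written classes above: std::pmr::monotonic_buffer_resource bumps through a caller-supplied buffer, and passing std::pmr::null_memory_resource() as the upstream makes it throw std::bad_alloc instead of touching the heap. A minimal sketch (the demo function names are made up):

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <vector>

bool demo_pmr_uses_stack_buffer() {
    alignas(std::max_align_t) std::byte buffer[512];
    std::pmr::monotonic_buffer_resource pool(buffer, sizeof buffer,
                                             std::pmr::null_memory_resource());
    std::pmr::vector<int> v(&pool);
    v.reserve(16);  // one 64-byte allocation out of the stack buffer
    for (int i = 0; i < 16; ++i) v.push_back(i);
    const std::byte* first = reinterpret_cast<const std::byte*>(v.data());
    return first >= buffer && first < buffer + sizeof buffer && v.back() == 15;
}

bool demo_pmr_exhaustion_throws() {
    alignas(std::max_align_t) std::byte buffer[64];
    std::pmr::monotonic_buffer_resource pool(buffer, sizeof buffer,
                                             std::pmr::null_memory_resource());
    std::pmr::vector<int> v(&pool);
    try {
        v.reserve(1000);  // far more than the 64-byte buffer can hold
    } catch (const std::bad_alloc&) {
        return true;      // the null upstream refuses to allocate
    }
    return false;
}
```

Like the bump allocator above, a monotonic resource never reuses memory until it is released or destroyed, so repeated vector growth eats through the buffer quickly.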
It really depends on your requirements. Sure, if you like, you can create an allocator that operates only on the stack, but it would be very limited, since a stack object is not accessible from everywhere in the program the way a heap object would be.
I think this article explains allocators very well:
http://www.codeguru.com/cpp/cpp/cpp_mfc/stl/article.php/c4079
A stack-based STL allocator is of such limited utility that I doubt you will find much prior art. Even the simple example you cite quickly blows up if you later decide you want to copy or lengthen the initial lstring.
For other STL containers such as the associative ones (tree-based internally) or even vector and deque which use either a single or multiple contiguous blocks of RAM, the memory usage semantics quickly become unmanageable on the stack in almost any real-world usage.
This is actually an extremely useful practice and used in performant development, such as games, quite a bit. To embed memory inline on the stack or within the allocation of a class structure can be critical for speed and or management of the container.
To answer your question, it comes down to the implementation of the STL container. If the container not only instantiates your allocator but also keeps a reference to it as a member, then you are good to go to create a fixed heap. I've found this is not always the case, as it is not part of the spec. Otherwise it becomes problematic. One solution can be to wrap the container (vector, list, etc.) in another class that contains the storage. Then you can use an allocator to draw from that. This could require a lot of template magickery (tm).
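One way to sketch the wrapper idea without much template magic is to lean on the C++17 <memory_resource> facilities. Everything here is an assumption for illustration: inline_vector is a made-up name, and it relies on reserve(N) requesting exactly N elements, which holds for the mainstream standard libraries as far as I know.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory_resource>
#include <vector>

// Owns its storage inline and lets a pmr vector draw from it.
template <typename T, std::size_t N>
class inline_vector {
public:
    inline_vector() : pool_(storage_, sizeof storage_), items_(&pool_) {
        items_.reserve(N);  // take the whole inline buffer up front
    }
    // The vector points into storage_, so copying the wrapper is disabled.
    inline_vector(const inline_vector&) = delete;
    inline_vector& operator=(const inline_vector&) = delete;

    // Growing past N falls back to the default (heap) resource.
    void push_back(const T& value) { items_.push_back(value); }
    std::size_t size() const { return items_.size(); }
    const T& operator[](std::size_t i) const { return items_[i]; }

    bool uses_inline_storage() const {
        const std::byte* p = reinterpret_cast<const std::byte*>(items_.data());
        std::less<const std::byte*> before;  // total order even for unrelated pointers
        return !before(p, storage_) && before(p, storage_ + sizeof storage_);
    }

private:
    // Declaration order matters: storage, then resource, then container.
    alignas(T) std::byte storage_[N * sizeof(T)];
    std::pmr::monotonic_buffer_resource pool_;
    std::pmr::vector<T> items_;
};

bool demo_inline_vector() {
    inline_vector<int, 8> v;
    for (int i = 0; i < 8; ++i) v.push_back(i);
    return v.size() == 8 && v[7] == 7 && v.uses_inline_storage();
}
```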