Why std::string allocating twice? - c++

I wrote a custom allocator for std::string and std::vector as follows:
#include <cstdint>
#include <iterator>
#include <iostream>
template <typename T>
struct PSAllocator
{
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef T value_type;
template<typename U>
struct rebind {typedef PSAllocator<U> other;};
PSAllocator() throw() {};
PSAllocator(const PSAllocator& other) throw() {};
template<typename U>
PSAllocator(const PSAllocator<U>& other) throw() {};
template<typename U>
PSAllocator& operator = (const PSAllocator<U>& other) { return *this; }
PSAllocator<T>& operator = (const PSAllocator& other) { return *this; }
~PSAllocator() {}
pointer allocate(size_type n, const void* hint = 0)
{
std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(::operator new(n * sizeof(value_type)));
std::cout<<"Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
return reinterpret_cast<pointer>(&data_ptr[0]);
}
void deallocate(T* ptr, size_type n)
{
std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(ptr);
std::cout<<"De-Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
::operator delete(reinterpret_cast<T*>(&data_ptr[0]));
}
};
Then I ran the following test case:
int main()
{
typedef std::basic_string<char, std::char_traits<char>, PSAllocator<char>> cstring;
cstring* str = new cstring();
str->resize(1);
delete str;
std::cout<<"\n\n\n\n";
typedef std::vector<char, PSAllocator<char>> cvector;
cvector* cv = new cvector();
cv->resize(1);
delete cv;
}
For whatever odd reason, it goes on to print:
Allocated: 0x3560a0 of size: 25
Allocated: 0x3560d0 of size: 26
De-Allocated: 0x3560a0 of size: 25
De-Allocated: 0x3560d0 of size: 26
Allocated: 0x351890 of size: 1
De-Allocated: 0x351890 of size: 1
So why does it allocate twice for std::string and a lot more bytes?
I'm using g++ 4.8.1 x64 sjlj on Windows 8 from: http://sourceforge.net/projects/mingwbuilds/.

I can't reproduce the double allocation, since apparently my libstdc++ does not allocate anything at all for the empty string. The resize however does allocate 26 bytes, and gdb helps me identifying how they are composed:
size_type __size = (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep);
( 1 + 1) * 1 + 24
So the memory is mostly for this _Rep representation, which in turn consists of the following data members:
size_type _M_length; // 8 bytes
size_type _M_capacity; // 8 bytes
_Atomic_word _M_refcount; // 4 bytes
I guess the last four bytes is just for the sake of alignment, but I might have missed some data element.
I guess the main reason why this _Rep structure is allocated on the heap is that it can be shared among string instances, and perhaps also that it can be avoided for empty strings as the lack of a first allocation on my system suggests.
To find out why your implementation doesn't make use of this empty string optimization, have a look at the default constructor. Its implementation seems to depend on the value of _GLIBCXX_FULLY_DYNAMIC_STRING, which apparently is non-zero in your setup. I'd not advise changing that setting directly, since it starts with an underscore and is therefore considered private. But you might find some public setting to affect this value.

Related

Boost Pool experience requested. Is it useful as allocator with preallocation?

Recently i have been looking for a pool/allocator mechanism.
Boost Pool seems to provide the solution, but there is still things, which it have not been able to deduce from the documentation.
What need to be allocated
Several small classes (~30 chars)
std::map (i want to ensure it do not perform dynamic allocator by itself)
allocation within pugi::xml
std::strings
How to control of address space for allocation (or just amount)
The object_pool seem the to provide a good way for allocating need 1)
However it would like to set a fixed size for the allocator to use. By default it grabs memory be ifself.
If possible i would like to give it the address space it can play within.
char * mem_for_class[1024*1024];
boost::object_pool<my_class,mem_for_class> q;
or:
const int max_no_objs=1024;
boost::object_pool<my_class,max_no_objs> q;
Although the UserAllocator is available in Boost::Pool; it seem to defeat the point. I am afraid the control needed would make it too inefficient... and it would be better to start from scratch.
It it possible to set a fixed area for pool_allocator ?
The question is a bit similar to the first.
Do boost pool provide any way of limiting how much / where there is allocated memory when giving boost::pool_allocator to a std-type-class (e.g. map)
My scenario
Embedded linux programming. The system must keep running for..ever. So we can not risk any memory segmentation. Currently i mostly either static allocation (stack), but also a few raw "new"s.
I would like an allocation scheme that ensure i use the same memory area each time the program loops.
Speed /space is important, but safety is still top priority.
I hope StackOverflow is the place to ask. I tried contacting the author of Boost::Pool "Stephen" without luck. I have not found any Boost-specific forum.
You can always create an allocator that works with STL. If it works with STL, it should work with boost as you are able to pass boost allocators to STL containers..
Considering the above, an allocator that can allocate at a specified memory address AND has a size limitation specified by you can be written as follows:
#include <iostream>
#include <vector>
template<typename T>
class CAllocator
{
private:
std::size_t size;
T* data = nullptr;
public:
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T value_type;
CAllocator() {}
CAllocator(pointer data_ptr, size_type max_size) noexcept : size(max_size), data(data_ptr) {};
template<typename U>
CAllocator(const CAllocator<U>& other) noexcept {};
CAllocator(const CAllocator &other) : size(other.size), data(other.data) {}
template<typename U>
struct rebind {typedef CAllocator<U> other;};
pointer allocate(size_type n, const void* hint = 0) {return &data[0];}
void deallocate(void* ptr, size_type n) {}
size_type max_size() const {return size;}
};
template <typename T, typename U>
inline bool operator == (const CAllocator<T>&, const CAllocator<U>&) {return true;}
template <typename T, typename U>
inline bool operator != (const CAllocator<T>& a, const CAllocator<U>& b) {return !(a == b);}
int main()
{
const int size = 1024 / 4;
int ptr[size];
std::vector<int, CAllocator<int>> vec(CAllocator<int>(&ptr[0], size));
int ptr2[size];
std::vector<int, CAllocator<int>> vec2(CAllocator<int>(&ptr2[0], size));
vec.push_back(10);
vec.push_back(20);
vec2.push_back(30);
vec2.push_back(40);
for (std::size_t i = 0; i < vec2.size(); ++i)
{
int* val = &ptr2[i];
std::cout<<*val<<"\n";
}
std::cout<<"\n\n";
vec2 = vec;
for (std::size_t i = 0; i < vec2.size(); ++i)
{
int* val = &ptr2[i];
std::cout<<*val<<"\n";
}
std::cout<<"\n\n";
vec2.clear();
vec2.push_back(100);
vec2.push_back(200);
for (std::size_t i = 0; i < vec2.size(); ++i)
{
int* val = &ptr2[i];
std::cout<<*val<<"\n";
}
}
This allocator makes sure that all memory is allocated at a specified address. No more than the amount you specify can be allocated with the freedom to allocate were you want whether it is on the stack or the heap.
You may create your own pool or use a std::unique_ptr as the pool for a single container.
EDIT: For strings, you need an offset of sizeof(_Rep_base). See: Why std::string allocating twice?
and http://ideone.com/QWtxWg
It is defined as:
struct _Rep_base
{
std::size_t _M_length;
std::size_t _M_capacity;
_Atomic_word _M_refcount;
};
So the example becomes:
struct Repbase
{
std::size_t length;
std::size_t capacity;
std::int16_t refcount;
};
int main()
{
typedef std::basic_string<char, std::char_traits<char>, CAllocator<char>> CAString;
const int size = 1024;
char ptr[size] = {0};
CAString str(CAllocator<char>(&ptr[0], size));
str = "Hello";
std::cout<<&ptr[sizeof(Repbase)];
}

Is it possible to initialize std::vector over already allocated memory?

My question is fairly simple and I am quite surprised I can't find anything related. Probably it is easy or totally stupid (or I can't search).
As the title says, is it possible to use std::vector on already allocated memory, so it does not allocate new elements from start but uses what is given. I would imagine it as something like:
T1 *buffer = new T1[some_size];
std::vector<T2> v(buffer, some_size); // <- ofc doesn't work
The opposite is quite simple and (maybe not pretty but) works:
std::vector<T2> v(some_size);
T1 *buffer = &v[0];
The storage is guaranteed to be continuous, so it is as safe as an iterator.
My motivation is quite simple. I pass around some raw memory data, i.e. bytes, and since I know their interpretation at some other places I would like to convert them back to something meaningful. I could do a reinterpret_cast and use normal c-style array, but I prefer c++ facilities.
I get the feeling this should be safe given that we give up ownership of buffer to vector, because it needs to be able to reallocate.
Like this.. Containers in the standard usually take an allocator. Using c++11's allocator traits, it is very easy to create an allocator as you don't have to have all the members in the allocator. However if using an older version of C++, you will need to implement each member and do the rebinding as well!
For Pre-C++11, you can use the following:
#include <iterator>
#include <vector>
#include <iostream>
template<typename T>
class PreAllocator
{
private:
T* memory_ptr;
std::size_t memory_size;
public:
typedef std::size_t size_type;
typedef ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef T value_type;
PreAllocator(T* memory_ptr, std::size_t memory_size) throw() : memory_ptr(memory_ptr), memory_size(memory_size) {};
PreAllocator (const PreAllocator& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};
template<typename U>
PreAllocator (const PreAllocator<U>& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};
template<typename U>
PreAllocator& operator = (const PreAllocator<U>& other) {return *this;}
PreAllocator<T>& operator = (const PreAllocator& other) {return *this;}
~PreAllocator() {}
pointer address (reference value) const {return &value;}
const_pointer address (const_reference value) const {return &value;}
pointer allocate (size_type n, const void* hint = 0) {return memory_ptr;}
void deallocate (T* ptr, size_type n) {}
void construct (pointer ptr, const T& val) {new (ptr) T (val);}
template<typename U>
void destroy (U* ptr) {ptr->~U();}
void destroy (pointer ptr) {ptr->~T();}
size_type max_size() const {return memory_size;}
template<typename U>
struct rebind
{
typedef PreAllocator<U> other;
};
};
int main()
{
int my_arr[100] = {0};
std::vector<int, PreAllocator<int> > my_vec(PreAllocator<int>(&my_arr[0], 100));
my_vec.push_back(1024);
std::cout<<"My_Vec[0]: "<<my_vec[0]<<"\n";
std::cout<<"My_Arr[0]: "<<my_arr[0]<<"\n";
int* my_heap_ptr = new int[100]();
std::vector<int, PreAllocator<int> > my_heap_vec(PreAllocator<int>(&my_heap_ptr[0], 100));
my_heap_vec.push_back(1024);
std::cout<<"My_Heap_Vec[0]: "<<my_heap_vec[0]<<"\n";
std::cout<<"My_Heap_Ptr[0]: "<<my_heap_ptr[0]<<"\n";
delete[] my_heap_ptr;
my_heap_ptr = NULL;
}
For C++11, you can use the following:
#include <cstdint>
#include <iterator>
#include <vector>
#include <iostream>
template <typename T>
class PreAllocator
{
private:
T* memory_ptr;
std::size_t memory_size;
public:
typedef std::size_t size_type;
typedef T* pointer;
typedef T value_type;
PreAllocator(T* memory_ptr, std::size_t memory_size) : memory_ptr(memory_ptr), memory_size(memory_size) {}
PreAllocator(const PreAllocator& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};
template<typename U>
PreAllocator(const PreAllocator<U>& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};
template<typename U>
PreAllocator& operator = (const PreAllocator<U>& other) { return *this; }
PreAllocator<T>& operator = (const PreAllocator& other) { return *this; }
~PreAllocator() {}
pointer allocate(size_type n, const void* hint = 0) {return memory_ptr;}
void deallocate(T* ptr, size_type n) {}
size_type max_size() const {return memory_size;}
};
int main()
{
int my_arr[100] = {0};
std::vector<int, PreAllocator<int>> my_vec(0, PreAllocator<int>(&my_arr[0], 100));
my_vec.push_back(1024);
std::cout<<"My_Vec[0]: "<<my_vec[0]<<"\n";
std::cout<<"My_Arr[0]: "<<my_arr[0]<<"\n";
int* my_heap_ptr = new int[100]();
std::vector<int, PreAllocator<int>> my_heap_vec(0, PreAllocator<int>(&my_heap_ptr[0], 100));
my_heap_vec.push_back(1024);
std::cout<<"My_Heap_Vec[0]: "<<my_heap_vec[0]<<"\n";
std::cout<<"My_Heap_Ptr[0]: "<<my_heap_ptr[0]<<"\n";
delete[] my_heap_ptr;
my_heap_ptr = nullptr;
}
Notice the difference between the two allocators! This will work with both heap buffers/arrays and stack buffer/arrays. It will also work with most containers. It is safer to use the Pre-C++11 version because it will be backwards compatible and work with more containers (ie: std::List).
You can just place the allocator in a header and use it as much as you want in any projects. It is good if you want to use SharedMemory or any buffer that is already allocated.
WARNING:
DO NOT use the same buffer for multiple containers at the same time! A buffer can be reused but just make sure no two containers use it at the same time.
Example:
int my_arr[100] = {0};
std::vector<int, PreAllocator<int> > my_vec(PreAllocator<int>(&my_arr[0], 100));
std::vector<int, PreAllocator<int> > my_vec2(PreAllocator<int>(&my_arr[0], 100));
my_vec.push_back(1024);
my_vec2.push_back(2048);
std::cout<<"My_Vec[0]: "<<my_vec[0]<<"\n";
std::cout<<"My_Arr[0]: "<<my_arr[0]<<"\n";
The output of the above is 2048! Why? Because the last vector overwrote the values of the first vector since they share the same buffer.
Yes, std::vector takes a custom allocator as a template parameter which can achieve what you want.
If you look at the docs for std::vector you will see the second template parameter is a custom allocator:
template < class T, class Alloc = allocator<T> > class vector;
You can define your own std::allocator that returns whatever memory you want. This may be a lot more work than is worth it for your purposes though.
You cannot just pass in a pointer to random memory, however. You would have problems like, what if the vector needs to grow beyond the size of your initial buffer?
If you just want to operate on raw bytes sometimes, and vectors at other times, I would write helper functions to convert between the two, and just convert between the two when needed. If this causes performance issues (which you should measure), then custom allocator is your next course of action.

Bad allocator implementation

I've got a pretty simple problem. I've been whipping up a non-thread-safe allocator. The allocator is a fairly simple memory arena strategy- allocate a big chunk, put all allocations in it, do nothing for deallocation, remove the whole lot on arena destruction. However, actually attempting to use this scheme throws an access violation.
static const int MaxMemorySize = 80000;
template <typename T>
class LocalAllocator
{
public:
std::vector<char>* memory;
int* CurrentUsed;
typedef T value_type;
typedef value_type * pointer;
typedef const value_type * const_pointer;
typedef value_type & reference;
typedef const value_type & const_reference;
typedef std::size_t size_type;
typedef std::size_t difference_type;
template <typename U> struct rebind { typedef LocalAllocator<U> other; };
template <typename U>
LocalAllocator(const LocalAllocator<U>& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
LocalAllocator(std::vector<char>* ptr, int* used) {
CurrentUsed = used;
memory = ptr;
}
template<typename U> LocalAllocator(LocalAllocator<U>&& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
pointer address(reference r) { return &r; }
const_pointer address(const_reference s) { return &r; }
size_type max_size() const { return MaxMemorySize; }
void construct(pointer ptr, value_type&& t) { new (ptr) T(std::move(t)); }
void construct(pointer ptr, const value_type & t) { new (ptr) T(t); }
void destroy(pointer ptr) { static_cast<T*>(ptr)->~T(); }
bool operator==(const LocalAllocator& other) const { return Memory == other.Memory; }
bool operator!=(const LocalAllocator&) const { return false; }
pointer allocate(size_type n) {
if (*CurrentUsed + (n * sizeof(T)) > MaxMemorySize)
throw std::bad_alloc();
auto val = &(*memory)[*CurrentUsed];
*CurrentUsed += (n * sizeof(T));
return reinterpret_cast<pointer>(val);
}
pointer allocate(size_type n, pointer) {
return allocate(n);
}
void deallocate(pointer ptr, size_type n) {}
pointer allocate() {
return allocate(sizeof(T));
}
void deallocate(pointer ptr) {}
};
I've initialized memory to point to a vector which is resized to the MaxMemorySize, and I've also initialized CurrentUsed to point to an int which is zero. I fed in an allocator with these values to the constructor of a std::unordered_map, but it keeps throwing an access violation in the STL internals. Any suggestions?
Edit: Here's my usage:
std::vector<char> memory;
int CurrentUsed = 0;
memory.resize(80000);
std::unordered_map<int, int, std::hash<int>, std::equal_to<int>, LocalAllocator<std::pair<const int, int>>> dict(
std::unordered_map<int, int>().bucket_count(),
std::hash<int>(),
std::equal_to<int>(),
LocalAllocator<std::pair<const int, int>>(&memory, &CurrentUsed)
);
// start timer
QueryPerformanceCounter(&t1);
for (int i=0;i<10000;i++)
dict[i]=i; // crash
Edit: Bloody hell. It worked when I increased the size to 1MB. I had to increase it to over 800,000 bytes to get it to work without throwing.
When I test this code, the rebind is being used to request multiple allocators against the same memory block. I put
cout << n << " " << sizeof(T) << " " << typeid(T).name() << endl;
at the top of allocate(size_type) and when I added three elements to a unordered_map got:
1 64 struct std::_List_nod<...>
16 4 struct std::_List_iterator<...>
1 64 struct std::_List_nod<...>
1 64 struct std::_List_nod<...>
1 64 struct std::_List_nod<...>
If your implementation isn't coincidentally using nice round 64-byte requests this class will return mis-aligned allocations.
MSVC10's hashtable types just has a ton of space overhead for small value types. It's overrunning the amount of space you've reserved and throwing bad_alloc.
It's implemented as a list<value_t> holding all the elements and a hash bucket vector<list<value_t>::iterator> with between 2 and 16 slots per element.
That's a total of 4 to 18 pointers of overhead per element.
Something like this implementation is probably required by the standard. Unlike vector, unordered_map has a requirement that elements not be moved once added to the container.

Extra allocations and magical space reduction in the STL - using rvalue references

I replaced the Standard allocator with an allocator that will "phone home" about how much memory it consumes. Now I'm going through some of my code, wondering why the hell it allocates and then deallocates so many entries.
Just for reference, I'm not trying to pre-optimize my code or anything, I'm mostly curious, except I definitely need to know if my total size is off, because I need to know exactly how much my object is using for the C# GC.
Take this sample function:
void add_file(string filename, string source) {
file_source_map.insert(std::pair<const string, string>(std::move(filename), std::move(source)));
}
It allocates six times (48bytes), and then deallocates four times(32bytes). Since the pair is an rvalue, and I moved the strings into it, surely the map will allocate a new node and move the rvalue pair into it, without triggering any more allocations and certainly not having to de-allocate any. The filename and source arguments also come from rvalues and should be moved in, instead of copied in. Just a note: the string is also being tracked by the allocator, it's not std::string but std::basic_string<char, std::char_traits<char>, Allocator<char>>.
Just for reference, I'm on MSVC.
Here's my allocator code:
template<typename T>
class Allocator {
public :
// typedefs
typedef T value_type;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
public :
// convert an allocator<T> to allocator<U>
template<typename U>
struct rebind {
typedef Allocator<U> other;
};
public :
Parser* parser;
inline ~Allocator() {}
inline Allocator(Allocator const& other) {
parser = other.parser;
}
inline Allocator(Parser* ptr)
: parser(ptr) {}
template<typename U>
inline Allocator(Allocator<U> const& other) {
parser = other.parser;
}
// address
inline pointer address(reference r) { return &r; }
inline const_pointer address(const_reference r) { return &r; }
// memory allocation
inline pointer allocate(size_type cnt,
typename std::allocator<void>::const_pointer = 0) {
int newsize = cnt * sizeof (T);
parser->size += newsize;
std::cout << "Allocated " << newsize << "\n";
return reinterpret_cast<pointer>(::operator new(newsize));
}
inline void deallocate(pointer p, size_type count) {
size_type size = count * sizeof(T);
::operator delete(p);
parser->size -= size;
std::cout << "Deallocated " << size << "\n";
}
// size
inline size_type max_size() const {
return std::numeric_limits<size_type>::max() / sizeof(T);
}
// construction/destruction
inline void construct(pointer p, const T& t) { new(p) T(t); }
inline void destroy(pointer p) { p->~T(); }
inline bool operator==(Allocator const& other) { return other.parser == parser; }
inline bool operator!=(Allocator const& a) { return !operator==(a); }
};
When I call add_file (posted above) from C# via the wrapper functions, I can clearly see each allocation and deallocation and their appropriate sizes on the console, and that is four allocations of 8, one of 80 which I know comes from the map, two more allocations of 8, and then four deallocations of 8, which says to me that there are four redundant strings in the function, because they're all rvalues and there's no reason for any deallocations to occur.
I ran your code in VS 2010, and I believe that the allocations you see are only Visual Studio STL debugging facilities as all the 8 bytes allocations are issued from _String_val constructor :
In release (_ITERATOR_DEBUG_LEVEL == 0), the constructor is trivial
In debug (_ITERATOR_DEBUG_LEVEL != 0), it allocates a _Container_proxy (which happens to have a size of 8) through the allocator.
If I run your code in release mode, the node allocation for the map falls to 72 bytes and the 8 bytes allocation and deallocation disappear : the strings seem to be correctly moved.

Can one leverage std::basic_string to implement a string having a length limitation?

I'm working with a low-level API that accepts a char* and numeric value to represent a string and its length, respectively. My code uses std::basic_string and calls into these methods with the appropriate translation. Unfortunately, many of these methods accept string lengths of varying size (i.e. max(unsigned char), max(short), etc...) and I'm stuck writing code to make sure that my string instances do not exceed the maximum length prescribed by the low-level API.
By default, the maximum length of an std::basic_string instance is bound by the maximum value of size_t (either max(unsigned int) or max(__int64)). Is there a way to manipulate the traits and allocator implementations of a std::basic_string implementation so that I may specify my own type to use in place of size_t? By doing so, I am hoping to leverage any existing bounds checks within the std::basic_string implementation so I don't have to do so when performing the translation.
My initial investigation suggests that this is not possible without writing my own string class, but I'm hoping that I overlooked something :)
you can pass a custom allocator to std::basic_string which has a max size of whatever you want. This should be sufficient. Perhaps something like this:
template <class T>
class my_allocator {
public:
typedef T value_type;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
pointer address(reference r) const { return &r; }
const_pointer address(const_reference r) const { return &r; }
my_allocator() throw() {}
template <class U>
my_allocator(const my_allocator<U>&) throw() {}
~my_allocator() throw() {}
pointer allocate(size_type n, void * = 0) {
// fail if we try to allocate too much
if((n * sizeof(T))> max_size()) { throw std::bad_alloc(); }
return static_cast<T *>(::operator new(n * sizeof(T)));
}
void deallocate(pointer p, size_type) {
return ::operator delete(p);
}
void construct(pointer p, const T& val) { new(p) T(val); }
void destroy(pointer p) { p->~T(); }
// max out at about 64k
size_type max_size() const throw() { return 0xffff; }
template <class U>
struct rebind { typedef my_allocator<U> other; };
template <class U>
my_allocator& operator=(const my_allocator<U> &rhs) {
(void)rhs;
return *this;
}
};
Then you can probably do this:
typedef std::basic_string<char, std::char_traits<char>, my_allocator<char> > limited_string;
EDIT: I've just done a test to make sure this works as expected. The following code tests it.
int main() {
limited_string s;
s = "AAAA";
s += s;
s += s;
s += s;
s += s;
s += s;
s += s;
s += s; // 512 chars...
s += s;
s += s;
s += s;
s += s;
s += s;
s += s; // 32768 chars...
s += s; // this will throw std::bad_alloc
std::cout << s.max_size() << std::endl;
std::cout << s.size() << std::endl;
}
That last s += s will put it over the top and cause a std::bad_alloc exception, (since my limit is just short of 64k). Unfortunately gcc's std::basic_string::max_size() implementation does not base its result on the allocator you use, so it will still claim to be able to allocate more. (I'm not sure if this is a bug or not...).
But this will definitely allow you impose hard limits on the sizes of strings in a simple way. You could even make the max size a template parameter so you only have to write the code for the allocator once.
I agree with Evan Teran about his solution. This is just a modification of his solution no more:
template <typename Type, typename std::allocator<Type>::size_type maxSize>
struct myalloc : std::allocator<Type>
{
// hide std::allocator[ max_size() & allocate(...) ]
std::allocator<Type>::size_type max_size() const throw()
{
return maxSize;
}
std::allocator<Type>::pointer allocate
(std::allocator<Type>::size_type n, void * = 0)
{
// fail if we try to allocate too much
if((n * sizeof(Type))> max_size()) { throw std::bad_alloc(); }
return static_cast<Type *>(::operator new(n * sizeof(Type)));
}
};
Be aware you should not use polymorphism at all with myalloc. So this is disastrous:
// std::allocator doesn't have a virtual destructor
std::allocator<char>* alloc = new myalloc<char>;
You just use it as if it is a separate type, it is safe in following case:
myalloc<char, 1024> alloc; // max size == 1024
Can't you create a class with std::string as parent and override c_str()?
Or define your own c_str16(), c_str32(), etc and implement translation there?