I've got a pretty simple problem. I've been whipping up a non-thread-safe allocator. The allocator is a fairly simple memory arena strategy: allocate a big chunk, put all allocations in it, do nothing on deallocation, and free the whole lot when the arena is destroyed. However, actually attempting to use this scheme produces an access violation.
static const int MaxMemorySize = 80000;
template <typename T>
class LocalAllocator
{
public:
std::vector<char>* memory;
int* CurrentUsed;
typedef T value_type;
typedef value_type * pointer;
typedef const value_type * const_pointer;
typedef value_type & reference;
typedef const value_type & const_reference;
typedef std::size_t size_type;
typedef std::size_t difference_type;
template <typename U> struct rebind { typedef LocalAllocator<U> other; };
template <typename U>
LocalAllocator(const LocalAllocator<U>& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
LocalAllocator(std::vector<char>* ptr, int* used) {
CurrentUsed = used;
memory = ptr;
}
template<typename U> LocalAllocator(LocalAllocator<U>&& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
pointer address(reference r) { return &r; }
const_pointer address(const_reference s) { return &s; }
size_type max_size() const { return MaxMemorySize; }
void construct(pointer ptr, value_type&& t) { new (ptr) T(std::move(t)); }
void construct(pointer ptr, const value_type & t) { new (ptr) T(t); }
void destroy(pointer ptr) { static_cast<T*>(ptr)->~T(); }
bool operator==(const LocalAllocator& other) const { return memory == other.memory; }
bool operator!=(const LocalAllocator& other) const { return !(*this == other); }
pointer allocate(size_type n) {
if (*CurrentUsed + (n * sizeof(T)) > MaxMemorySize)
throw std::bad_alloc();
auto val = &(*memory)[*CurrentUsed];
*CurrentUsed += (n * sizeof(T));
return reinterpret_cast<pointer>(val);
}
pointer allocate(size_type n, pointer) {
return allocate(n);
}
void deallocate(pointer ptr, size_type n) {}
pointer allocate() {
return allocate(1);
}
void deallocate(pointer ptr) {}
};
I've initialized memory to point to a vector which is resized to MaxMemorySize, and I've also initialized CurrentUsed to point to an int which is zero. I passed an allocator built from these values to the constructor of a std::unordered_map, but it keeps throwing an access violation in the STL internals. Any suggestions?
Edit: Here's my usage:
std::vector<char> memory;
int CurrentUsed = 0;
memory.resize(80000);
std::unordered_map<int, int, std::hash<int>, std::equal_to<int>, LocalAllocator<std::pair<const int, int>>> dict(
std::unordered_map<int, int>().bucket_count(),
std::hash<int>(),
std::equal_to<int>(),
LocalAllocator<std::pair<const int, int>>(&memory, &CurrentUsed)
);
// start timer
QueryPerformanceCounter(&t1);
for (int i=0;i<10000;i++)
dict[i]=i; // crash
Edit: Bloody hell. It worked when I increased the size to 1MB. I had to increase it to over 800,000 bytes to get it to work without throwing.
When I test this code, the rebind is being used to request multiple allocators against the same memory block. I put
cout << n << " " << sizeof(T) << " " << typeid(T).name() << endl;
at the top of allocate(size_type), and when I added three elements to an unordered_map I got:
1 64 struct std::_List_nod<...>
16 4 struct std::_List_iterator<...>
1 64 struct std::_List_nod<...>
1 64 struct std::_List_nod<...>
1 64 struct std::_List_nod<...>
If your implementation isn't coincidentally using nice round 64-byte requests, this class will return misaligned allocations.
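A minimal sketch of one way to guard against that, assuming the same memory/CurrentUsed layout as the allocator above (the rounding is my addition, not part of the original code):
    pointer allocate(size_type n) {
        // Round the running offset up to T's alignment before carving out the block.
        // alignof is C++11; MSVC10 predates it, so use __alignof there.
        std::size_t align = alignof(T);
        std::size_t offset = (static_cast<std::size_t>(*CurrentUsed) + align - 1) & ~(align - 1);
        if (offset + n * sizeof(T) > MaxMemorySize)
            throw std::bad_alloc();
        pointer val = reinterpret_cast<pointer>(&(*memory)[offset]);
        *CurrentUsed = static_cast<int>(offset + n * sizeof(T));
        return val;
    }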
MSVC10's hashtable types just have a ton of space overhead for small value types. The map is overrunning the amount of space you've reserved and throwing bad_alloc.
It's implemented as a list<value_t> holding all the elements and a hash bucket vector<list<value_t>::iterator> with between 2 and 16 slots per element.
That's a total of 4 to 18 pointers of overhead per element.
Something like this implementation is probably required by the standard. Unlike vector, unordered_map has a requirement that elements not be moved once added to the container.
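A rough back-of-the-envelope estimate (my numbers, using the 64-byte nodes observed above and 4-byte x86 iterators) shows why an 80,000-byte arena was never going to survive 10,000 insertions:
    // 10,000 list nodes * 64 bytes each                 = 640,000 bytes
    // bucket vector, >= 2 iterators/element * 4 bytes  >=  80,000 bytes
    // plus every intermediate reallocation of that vector, which a
    // no-op deallocate() never returns to the arena
    // total                                            >> 720,000 bytes vs. an 80,000-byte arena
That lines up with the observation that the code only started working somewhere past 800,000 bytes.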
Related
I'm playing around with std::function and custom allocators, but it's not behaving as I expected when I don't provide the function with an initial functor.
When I provide a custom allocator to the constructor but no initial functor, the allocator is never used or so it seems.
This is my code.
//Simple functor class that is big to force allocations
struct Functor128
{
Functor128()
{}
char someBytes[128];
void operator()(int something)
{
cout << "Functor128 Called with value " << something << endl;
}
};
int main(int argc, char* argv[])
{
Allocator<char, 1> myAllocator1;
Allocator<char, 2> myAllocator2;
Allocator<char, 3> myAllocator3;
Functor128 myFunctor;
cout << "setting up function1" << endl;
function<void(int)> myFunction1(allocator_arg, myAllocator1, myFunctor);
myFunction1(7);
cout << "setting up function2" << endl;
function<void(int)> myFunction2(allocator_arg, myAllocator2);
myFunction2 = myFunctor;
myFunction2(9);
cout << "setting up function3" << endl;
function<void(int)> myFunction3(allocator_arg, myAllocator3);
myFunction3 = myFunction1;
myFunction3(19);
}
Output:
setting up function1
Allocator 1 allocating 136 bytes.
Functor128 Called with value 7
setting up function2
Functor128 Called with value 9
setting up function3
Allocator 1 allocating 136 bytes.
Functor128 Called with value 19
So, case 1: myFunction1 allocates using allocator1, as expected.
Case 2: myFunction2 is given allocator2 in its constructor, but when a functor is assigned to it, it appears to fall back to the default std::allocator for the allocation (hence no printout about an allocation).
Case 3: myFunction3 is given allocator3 in its constructor, but when myFunction1 is assigned to it, the allocation takes place using function1's allocator.
Is this correct behaviour?
In particular, in case 2, why revert to the default std::allocator?
And if so, what is the point of the constructor that takes an allocator but no functor, since the allocator never gets used?
I am using VS2013 for this code.
My Allocator class is just a minimal implementation that uses new and logs out when it allocates
template<typename T, int id = 1>
class Allocator {
public:
// typedefs
typedef T value_type;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
public:
// convert an allocator<T> to allocator<U>
template<typename U>
struct rebind {
typedef Allocator<U> other;
};
public:
inline Allocator() {}
inline ~Allocator() {}
inline Allocator(Allocator const&) {}
template<typename U>
inline Allocator(Allocator<U> const&) {}
// address
inline pointer address(reference r) { return &r; }
inline const_pointer address(const_reference r) { return &r; }
// memory allocation
inline pointer allocate(size_type cnt,
typename std::allocator<void>::const_pointer = 0)
{
size_t numBytes = cnt * sizeof (T);
std::cout << "Allocator " << id << " allocating " << numBytes << " bytes." << std::endl;
return reinterpret_cast<pointer>(::operator new(numBytes));
}
inline void deallocate(pointer p, size_type) {
::operator delete(p);
}
// size
inline size_type max_size() const {
return std::numeric_limits<size_type>::max() / sizeof(T);
}
// construction/destruction
inline void construct(pointer p, const T& t) { new(p)T(t); }
inline void destroy(pointer p) { p->~T(); }
inline bool operator==(Allocator const&) { return true; }
inline bool operator!=(Allocator const& a) { return !operator==(a); }
}; // end of class Allocator
std::function's allocator support is...weird.
The current spec for operator=(F&& f) is that it does std::function(std::forward<F>(f)).swap(*this);. As you can see, this means that memory for f is allocated using whatever std::function uses by default, rather than the allocator used to construct *this. So the behavior you observe is correct, though surprising.
Moreover, since the (allocator_arg_t, Allocator) and (allocator_arg_t, Allocator, nullptr_t) constructors are noexcept, they can't really store the allocator even if they wanted to (type-erasing an allocator may require a dynamic allocation). As is, they are basically no-ops that exist to support the uses-allocator construction protocol.
LWG very recently rejected an issue that would change this behavior.
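If you do need the target stored through your own allocator, one workaround (a sketch; it assumes your standard library implements the C++11 assign(F&&, const Alloc&) member, which was later removed along with std::function's allocator support in C++17) is to re-supply the allocator at assignment time instead of relying on operator=:
    std::function<void(int)> myFunction2(std::allocator_arg, myAllocator2);
    // operator= type-erases through the default allocator, so hand the
    // allocator over again when installing the target:
    myFunction2.assign(myFunctor, myAllocator2);
    myFunction2(9);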
Recently I have been looking for a pool/allocator mechanism.
Boost Pool seems to provide a solution, but there are still things I have not been able to deduce from the documentation.
What needs to be allocated
Several small classes (~30 chars)
std::map (I want to ensure it does not perform dynamic allocation by itself)
allocation within pugi::xml
std::strings
How to control the address space used for allocation (or just the amount)
object_pool seems to provide a good way of covering need 1).
However, I would like to set a fixed amount of memory for the allocator to use. By default it grabs memory by itself.
If possible I would like to give it the address space it can play within.
char mem_for_class[1024*1024];
boost::object_pool<my_class,mem_for_class> q;
or:
const int max_no_objs=1024;
boost::object_pool<my_class,max_no_objs> q;
Although a UserAllocator is available in Boost.Pool, it seems to defeat the point. I am afraid the control needed would make it too inefficient... and it would be better to start from scratch.
Is it possible to set a fixed area for pool_allocator?
The question is a bit similar to the first.
Does Boost Pool provide any way of limiting how much memory is allocated, and where, when giving boost::pool_allocator to a standard container (e.g. std::map)?
My scenario
Embedded Linux programming. The system must keep running forever, so we cannot risk any memory fragmentation. Currently I mostly use static allocation (on the stack), but there are also a few raw news.
I would like an allocation scheme that ensures I use the same memory area each time the program loops.
Speed /space is important, but safety is still top priority.
I hope Stack Overflow is the place to ask. I tried contacting the author of Boost.Pool, Stephen, without luck. I have not found any Boost-specific forum.
You can always create an allocator that works with the STL. If it works with the STL, it should work with Boost, as you are able to pass Boost allocators to STL containers.
Considering the above, an allocator that can allocate at a specified memory address AND has a size limitation specified by you can be written as follows:
#include <iostream>
#include <vector>
template<typename T>
class CAllocator
{
private:
std::size_t size = 0;
T* data = nullptr;
public:
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T value_type;
CAllocator() {}
CAllocator(pointer data_ptr, size_type max_size) noexcept : size(max_size), data(data_ptr) {};
template<typename U>
CAllocator(const CAllocator<U>& other) noexcept {};
CAllocator(const CAllocator &other) : size(other.size), data(other.data) {}
template<typename U>
struct rebind {typedef CAllocator<U> other;};
pointer allocate(size_type n, const void* hint = 0) {return &data[0];}
void deallocate(void* ptr, size_type n) {}
size_type max_size() const {return size;}
};
template <typename T, typename U>
inline bool operator == (const CAllocator<T>&, const CAllocator<U>&) {return true;}
template <typename T, typename U>
inline bool operator != (const CAllocator<T>& a, const CAllocator<U>& b) {return !(a == b);}
int main()
{
const int size = 1024 / 4;
int ptr[size];
std::vector<int, CAllocator<int>> vec(CAllocator<int>(&ptr[0], size));
int ptr2[size];
std::vector<int, CAllocator<int>> vec2(CAllocator<int>(&ptr2[0], size));
vec.push_back(10);
vec.push_back(20);
vec2.push_back(30);
vec2.push_back(40);
for (std::size_t i = 0; i < vec2.size(); ++i)
{
int* val = &ptr2[i];
std::cout<<*val<<"\n";
}
std::cout<<"\n\n";
vec2 = vec;
for (std::size_t i = 0; i < vec2.size(); ++i)
{
int* val = &ptr2[i];
std::cout<<*val<<"\n";
}
std::cout<<"\n\n";
vec2.clear();
vec2.push_back(100);
vec2.push_back(200);
for (std::size_t i = 0; i < vec2.size(); ++i)
{
int* val = &ptr2[i];
std::cout<<*val<<"\n";
}
}
This allocator makes sure that all memory is allocated at the specified address. No more than the amount you specify can be allocated, and you are free to place the storage wherever you want, whether on the stack or the heap.
You may create your own pool or use a std::unique_ptr as the pool for a single container.
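For example, a minimal sketch of the unique_ptr-as-pool idea for a single container (the names are mine; it reuses the CAllocator above):
    #include <memory>
    #include <vector>

    int main()
    {
        const std::size_t poolSize = 1024;
        // The unique_ptr owns the backing storage; the container only ever
        // hands out addresses inside it through CAllocator.
        std::unique_ptr<char[]> pool(new char[poolSize]);
        std::vector<char, CAllocator<char>> buffer(
            CAllocator<char>(pool.get(), poolSize));
        buffer.push_back('x');
    }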
EDIT: For strings, you need an offset of sizeof(_Rep_base). See: Why std::string allocating twice?
and http://ideone.com/QWtxWg
It is defined as:
struct _Rep_base
{
std::size_t _M_length;
std::size_t _M_capacity;
_Atomic_word _M_refcount;
};
So the example becomes:
struct Repbase
{
std::size_t length;
std::size_t capacity;
std::int16_t refcount;
};
int main()
{
typedef std::basic_string<char, std::char_traits<char>, CAllocator<char>> CAString;
const int size = 1024;
char ptr[size] = {0};
CAString str(CAllocator<char>(&ptr[0], size));
str = "Hello";
std::cout<<&ptr[sizeof(Repbase)];
}
I wrote a custom allocator for std::string and std::vector as follows:
#include <cstdint>
#include <iterator>
#include <iostream>
template <typename T>
struct PSAllocator
{
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef T value_type;
template<typename U>
struct rebind {typedef PSAllocator<U> other;};
PSAllocator() throw() {};
PSAllocator(const PSAllocator& other) throw() {};
template<typename U>
PSAllocator(const PSAllocator<U>& other) throw() {};
template<typename U>
PSAllocator& operator = (const PSAllocator<U>& other) { return *this; }
PSAllocator<T>& operator = (const PSAllocator& other) { return *this; }
~PSAllocator() {}
pointer allocate(size_type n, const void* hint = 0)
{
std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(::operator new(n * sizeof(value_type)));
std::cout<<"Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
return reinterpret_cast<pointer>(&data_ptr[0]);
}
void deallocate(T* ptr, size_type n)
{
std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(ptr);
std::cout<<"De-Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
::operator delete(reinterpret_cast<T*>(&data_ptr[0]));
}
};
Then I ran the following test case:
int main()
{
typedef std::basic_string<char, std::char_traits<char>, PSAllocator<char>> cstring;
cstring* str = new cstring();
str->resize(1);
delete str;
std::cout<<"\n\n\n\n";
typedef std::vector<char, PSAllocator<char>> cvector;
cvector* cv = new cvector();
cv->resize(1);
delete cv;
}
For whatever odd reason, it goes on to print:
Allocated: 0x3560a0 of size: 25
Allocated: 0x3560d0 of size: 26
De-Allocated: 0x3560a0 of size: 25
De-Allocated: 0x3560d0 of size: 26
Allocated: 0x351890 of size: 1
De-Allocated: 0x351890 of size: 1
So why does it allocate twice for std::string and a lot more bytes?
I'm using g++ 4.8.1 x64 sjlj on Windows 8 from: http://sourceforge.net/projects/mingwbuilds/.
I can't reproduce the double allocation, since apparently my libstdc++ does not allocate anything at all for the empty string. The resize however does allocate 26 bytes, and gdb helps me identify how they are composed:
size_type __size = (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep);
( 1 + 1) * 1 + 24
So the memory is mostly for this _Rep representation, which in turn consists of the following data members:
size_type _M_length; // 8 bytes
size_type _M_capacity; // 8 bytes
_Atomic_word _M_refcount; // 4 bytes
I guess the last four bytes are just for the sake of alignment, but I might have missed some data element.
I guess the main reason why this _Rep structure is allocated on the heap is that it can be shared among string instances, and perhaps also that it can be avoided for empty strings as the lack of a first allocation on my system suggests.
To find out why your implementation doesn't make use of this empty string optimization, have a look at the default constructor. Its implementation seems to depend on the value of _GLIBCXX_FULLY_DYNAMIC_STRING, which apparently is non-zero in your setup. I'd not advise changing that setting directly, since it starts with an underscore and is therefore considered private. But you might find some public setting to affect this value.
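Incidentally, the same formula seems to account for the 25- and 26-byte allocations in your trace, assuming your build allocates a _Rep even for the empty string (which is what a non-zero _GLIBCXX_FULLY_DYNAMIC_STRING implies):
    // default-constructed string: (__capacity + 1) * sizeof(char) + sizeof(_Rep)
    //                           = (0 + 1) * 1 + 24 = 25 bytes
    // after resize(1):          = (1 + 1) * 1 + 24 = 26 bytes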
I replaced the Standard allocator with an allocator that will "phone home" about how much memory it consumes. Now I'm going through some of my code, wondering why the hell it allocates and then deallocates so many entries.
Just for reference, I'm not trying to prematurely optimize my code or anything; I'm mostly curious, except I definitely need to know if my total size is off, because I need to know exactly how much memory my object is using for the C# GC.
Take this sample function:
void add_file(string filename, string source) {
file_source_map.insert(std::pair<const string, string>(std::move(filename), std::move(source)));
}
It allocates six times (48 bytes), and then deallocates four times (32 bytes). Since the pair is an rvalue, and I moved the strings into it, surely the map will allocate a new node and move the rvalue pair into it, without triggering any more allocations and certainly without having to deallocate any. The filename and source arguments also come from rvalues and should be moved in, not copied. Just a note: the string is also being tracked by the allocator; it's not std::string but std::basic_string<char, std::char_traits<char>, Allocator<char>>.
Just for reference, I'm on MSVC.
Here's my allocator code:
template<typename T>
class Allocator {
public :
// typedefs
typedef T value_type;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
public :
// convert an allocator<T> to allocator<U>
template<typename U>
struct rebind {
typedef Allocator<U> other;
};
public :
Parser* parser;
inline ~Allocator() {}
inline Allocator(Allocator const& other) {
parser = other.parser;
}
inline Allocator(Parser* ptr)
: parser(ptr) {}
template<typename U>
inline Allocator(Allocator<U> const& other) {
parser = other.parser;
}
// address
inline pointer address(reference r) { return &r; }
inline const_pointer address(const_reference r) { return &r; }
// memory allocation
inline pointer allocate(size_type cnt,
typename std::allocator<void>::const_pointer = 0) {
int newsize = cnt * sizeof (T);
parser->size += newsize;
std::cout << "Allocated " << newsize << "\n";
return reinterpret_cast<pointer>(::operator new(newsize));
}
inline void deallocate(pointer p, size_type count) {
size_type size = count * sizeof(T);
::operator delete(p);
parser->size -= size;
std::cout << "Deallocated " << size << "\n";
}
// size
inline size_type max_size() const {
return std::numeric_limits<size_type>::max() / sizeof(T);
}
// construction/destruction
inline void construct(pointer p, const T& t) { new(p) T(t); }
inline void destroy(pointer p) { p->~T(); }
inline bool operator==(Allocator const& other) { return other.parser == parser; }
inline bool operator!=(Allocator const& a) { return !operator==(a); }
};
When I call add_file (posted above) from C# via the wrapper functions, I can clearly see each allocation and deallocation and their sizes on the console: four allocations of 8, one of 80 (which I know comes from the map), two more allocations of 8, and then four deallocations of 8. That says to me that there are four redundant strings in the function, because they're all rvalues and there should be no reason for any deallocations to occur.
I ran your code in VS 2010, and I believe that the allocations you see are only Visual Studio STL debugging facilities, as all the 8-byte allocations are issued from the _String_val constructor:
In release (_ITERATOR_DEBUG_LEVEL == 0), the constructor is trivial
In debug (_ITERATOR_DEBUG_LEVEL != 0), it allocates a _Container_proxy (which happens to have a size of 8) through the allocator.
If I run your code in release mode, the node allocation for the map falls to 72 bytes and the 8-byte allocations and deallocations disappear: the strings seem to be correctly moved.
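If you want to silence those proxy allocations while staying in a debug build, one option (a sketch; _ITERATOR_DEBUG_LEVEL is an MSVC-specific macro) is to lower the iterator debug level:
    // Define this project-wide (e.g. in the project's preprocessor settings) so
    // every translation unit and every library you link against agrees on it;
    // mixed values produce linker errors. It must be seen before any standard header.
    #define _ITERATOR_DEBUG_LEVEL 0
    #include <map>
    #include <string>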
I'm working with a low-level API that accepts a char* and numeric value to represent a string and its length, respectively. My code uses std::basic_string and calls into these methods with the appropriate translation. Unfortunately, many of these methods accept string lengths of varying size (i.e. max(unsigned char), max(short), etc...) and I'm stuck writing code to make sure that my string instances do not exceed the maximum length prescribed by the low-level API.
By default, the maximum length of an std::basic_string instance is bound by the maximum value of size_t (either max(unsigned int) or max(__int64)). Is there a way to manipulate the traits and allocator implementations of a std::basic_string implementation so that I may specify my own type to use in place of size_t? By doing so, I am hoping to leverage any existing bounds checks within the std::basic_string implementation so I don't have to do so when performing the translation.
My initial investigation suggests that this is not possible without writing my own string class, but I'm hoping that I overlooked something :)
You can pass a custom allocator to std::basic_string which has a max size of whatever you want. This should be sufficient. Perhaps something like this:
template <class T>
class my_allocator {
public:
typedef T value_type;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
pointer address(reference r) const { return &r; }
const_pointer address(const_reference r) const { return &r; }
my_allocator() throw() {}
template <class U>
my_allocator(const my_allocator<U>&) throw() {}
~my_allocator() throw() {}
pointer allocate(size_type n, void * = 0) {
// fail if we try to allocate too much
if((n * sizeof(T))> max_size()) { throw std::bad_alloc(); }
return static_cast<T *>(::operator new(n * sizeof(T)));
}
void deallocate(pointer p, size_type) {
return ::operator delete(p);
}
void construct(pointer p, const T& val) { new(p) T(val); }
void destroy(pointer p) { p->~T(); }
// max out at about 64k
size_type max_size() const throw() { return 0xffff; }
template <class U>
struct rebind { typedef my_allocator<U> other; };
template <class U>
my_allocator& operator=(const my_allocator<U> &rhs) {
(void)rhs;
return *this;
}
};
Then you can probably do this:
typedef std::basic_string<char, std::char_traits<char>, my_allocator<char> > limited_string;
EDIT: I've just done a test to make sure this works as expected. The following code tests it.
int main() {
limited_string s;
s = "AAAA";
s += s;
s += s;
s += s;
s += s;
s += s;
s += s;
s += s; // 512 chars...
s += s;
s += s;
s += s;
s += s;
s += s;
s += s; // 32768 chars...
s += s; // this will throw std::bad_alloc
std::cout << s.max_size() << std::endl;
std::cout << s.size() << std::endl;
}
That last s += s will put it over the top and cause a std::bad_alloc exception (since my limit is just short of 64k). Unfortunately gcc's std::basic_string::max_size() implementation does not base its result on the allocator you use, so it will still claim to be able to allocate more. (I'm not sure if this is a bug or not...)
But this will definitely allow you impose hard limits on the sizes of strings in a simple way. You could even make the max size a template parameter so you only have to write the code for the allocator once.
I agree with Evan Teran about his solution. This is just a modification of his solution, nothing more:
template <typename Type, typename std::allocator<Type>::size_type maxSize>
struct myalloc : std::allocator<Type>
{
// hide std::allocator[ max_size() & allocate(...) ]
typename std::allocator<Type>::size_type max_size() const throw()
{
return maxSize;
}
typename std::allocator<Type>::pointer allocate
(typename std::allocator<Type>::size_type n, void * = 0)
{
// fail if we try to allocate too much
if((n * sizeof(Type))> max_size()) { throw std::bad_alloc(); }
return static_cast<Type *>(::operator new(n * sizeof(Type)));
}
};
Be aware you should not use polymorphism at all with myalloc. So this is disastrous:
// std::allocator doesn't have a virtual destructor
std::allocator<char>* alloc = new myalloc<char, 1024>;
You just use it as if it were a separate type; it is safe in the following case:
myalloc<char, 1024> alloc; // max size == 1024
Can't you create a class with std::string as its parent and override c_str()?
Or define your own c_str16(), c_str32(), etc., and implement the translation there?
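Alternatively, you can keep plain std::string and do the bounds check at the call boundary instead. A minimal sketch (the low-level function name and its 8-bit length parameter are hypothetical, standing in for whatever the real API expects):
    #include <limits>
    #include <stdexcept>
    #include <string>

    // Hypothetical low-level call taking a pointer plus a narrow length.
    void api_call(const char* data, unsigned char length);

    void call_with(const std::string& s)
    {
        // Reject anything the narrow length parameter cannot represent.
        if (s.size() > std::numeric_limits<unsigned char>::max())
            throw std::length_error("string too long for api_call");
        api_call(s.c_str(), static_cast<unsigned char>(s.size()));
    }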