Extra allocations and magical space reduction in the STL - using rvalue references

Extra allocations and magical space reduction in the STL - using rvalue references - c++

I replaced the Standard allocator with an allocator that will "phone home" about how much memory it consumes. Now I'm going through some of my code, wondering why the hell it allocates and then deallocates so many entries.
Just for reference, I'm not trying to pre-optimize my code or anything, I'm mostly curious, except I definitely need to know if my total size is off, because I need to know exactly how much my object is using for the C# GC.
Take this sample function:
void add_file(string filename, string source) {
file_source_map.insert(std::pair<const string, string>(std::move(filename), std::move(source)));
}
It allocates six times (48bytes), and then deallocates four times(32bytes). Since the pair is an rvalue, and I moved the strings into it, surely the map will allocate a new node and move the rvalue pair into it, without triggering any more allocations and certainly not having to de-allocate any. The filename and source arguments also come from rvalues and should be moved in, instead of copied in. Just a note: the string is also being tracked by the allocator, it's not std::string but std::basic_string<char, std::char_traits<char>, Allocator<char>>.
Just for reference, I'm on MSVC.
Here's my allocator code:
template<typename T>
class Allocator {
public :
// typedefs
typedef T value_type;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
public :
// convert an allocator<T> to allocator<U>
template<typename U>
struct rebind {
typedef Allocator<U> other;
};
public :
Parser* parser;
inline ~Allocator() {}
inline Allocator(Allocator const& other) {
parser = other.parser;
}
inline Allocator(Parser* ptr)
: parser(ptr) {}
template<typename U>
inline Allocator(Allocator<U> const& other) {
parser = other.parser;
}
// address
inline pointer address(reference r) { return &r; }
inline const_pointer address(const_reference r) { return &r; }
// memory allocation
inline pointer allocate(size_type cnt,
typename std::allocator<void>::const_pointer = 0) {
int newsize = cnt * sizeof (T);
parser->size += newsize;
std::cout << "Allocated " << newsize << "\n";
return reinterpret_cast<pointer>(::operator new(newsize));
}
inline void deallocate(pointer p, size_type count) {
size_type size = count * sizeof(T);
::operator delete(p);
parser->size -= size;
std::cout << "Deallocated " << size << "\n";
}
// size
inline size_type max_size() const {
return std::numeric_limits<size_type>::max() / sizeof(T);
}
// construction/destruction
inline void construct(pointer p, const T& t) { new(p) T(t); }
inline void destroy(pointer p) { p->~T(); }
inline bool operator==(Allocator const& other) { return other.parser == parser; }
inline bool operator!=(Allocator const& a) { return !operator==(a); }
};
When I call add_file (posted above) from C# via the wrapper functions, I can clearly see each allocation and deallocation and their appropriate sizes on the console, and that is four allocations of 8, one of 80 which I know comes from the map, two more allocations of 8, and then four deallocations of 8, which says to me that there are four redundant strings in the function, because they're all rvalues and there's no reason for any deallocations to occur.

I ran your code in VS 2010, and I believe that the allocations you see are only Visual Studio STL debugging facilities as all the 8 bytes allocations are issued from _String_val constructor :
In release (_ITERATOR_DEBUG_LEVEL == 0), the constructor is trivial
In debug (_ITERATOR_DEBUG_LEVEL != 0), it allocates a _Container_proxy (which happens to have a size of 8) through the allocator.
If I run your code in release mode, the node allocation for the map falls to 72 bytes and the 8 bytes allocation and deallocation disappear : the strings seem to be correctly moved.

Related

Can I write a custom allocator to decide the reallocation amount of std::vector?

To my understanding, a custom allocator must fit the requierements of the Allocator Concept. Based on that interface though, I can't see how I would choose a new allocation amount when the vector has run out of reserve.
For example, the current implementation on my machine will double the allocation size every time the reserve is exceeded during a push_back(). I'd like to provide a custom allocator that is slow and memory conscious. It will only allocate the previous capacity+1 to accomodate the new element.
These are the interfaces of the concept I'm looking at:
a.allocate(n)
a.allocate(n, cvptr) (optional)
I've made a working boilerplate allocator like so:
#include <limits>
#include <iostream>
template <class T> class MyAlloc {
public:
// type definitions
typedef T value_type;
typedef T *pointer;
typedef const T *const_pointer;
typedef T &reference;
typedef const T &const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
pointer address(reference value) const {
return &value;
}
const_pointer address(const_reference value) const {
return &value;
}
size_type max_size() const throw() {
return std::numeric_limits<std::size_t>::max() / sizeof(T);
}
pointer allocate(size_type num, const void * = 0) {
return (pointer)(::operator new(num * sizeof(T)));
}
void construct(pointer p, const T &value) {
new ((void *)p) T(value);
}
void destroy(pointer p) {
p->~T();
}
void deallocate(pointer p, size_type num) {
::operator delete((void *)p);
}
};
Looking at the allocate function:
pointer allocate(size_type num, const void * = 0) {
return (pointer)(::operator new(num * sizeof(T)));
}
I could allocate more or less memory here, but I don't see a way of reporting that back to the vector so that it knows what its current capacity is.
Maybe this falls outside the responsibility of an allocator?

The STL model that C++ inherited is based on a specific division between container and allocator. The purpose of an allocator is to provide the memory that someone requests. The decision about how much memory to allocate is entirely up to the container, without regard for which allocator it uses to provide that memory.
That's the model C++ uses. You could write your own vector-like container that allows its allocator to specify how much it should allocate. But other than that, no.

What's the point of std::function constructor with custom allocator but no other args?

I'm playing around with std::function and custom allocators but its not behaving as I expected when I don't provide the function with an initial functor.
When I provide a custom allocator to the constructor but no initial functor, the allocator is never used or so it seems.
This is my code.
//Simple functor class that is big to force allocations
struct Functor128
{
Functor128()
{}
char someBytes[128];
void operator()(int something)
{
cout << "Functor128 Called with value " << something << endl;
}
};
int main(int argc, char* argv[])
{
Allocator<char, 1> myAllocator1;
Allocator<char, 2> myAllocator2;
Allocator<char, 3> myAllocator3;
Functor128 myFunctor;
cout << "setting up function1" << endl;
function<void(int)> myFunction1(allocator_arg, myAllocator1, myFunctor);
myFunction1(7);
cout << "setting up function2" << endl;
function<void(int)> myFunction2(allocator_arg, myAllocator2);
myFunction2 = myFunctor;
myFunction2(9);
cout << "setting up function3" << endl;
function<void(int)> myFunction3(allocator_arg, myAllocator3);
myFunction3 = myFunction1;
myFunction3(19);
}
Output:
setting up function1
Allocator 1 allocating 136 bytes.
Functor128 Called with value 7
setting up function2
Functor128 Called with value 9
setting up function3
Allocator 1 allocating 136 bytes.
Functor128 Called with value 19
So case1: myFunction1 allocates using allocator1 as expected.
case2: myFunction2 is given allocator2 in constructor but when assigned a functor it appears to reset to using the default std::allocator to make the allocation.(hence no print out about allocation).
case3: myFunction3 is given allocator3 in constructor but when assigned to from myFunction1 the allocation takes place using function1's allocator to make the allocation.
Is this correct behaviour?
In particular, in case 2 why revert to using default std::allocator?
If so what is the point of the empty constructor that takes an allocator as the allocator never gets used.
I am using VS2013 for this code.
My Allocator class is just a minimal implementation that uses new and logs out when it allocates
template<typename T, int id = 1>
class Allocator {
public:
// typedefs
typedef T value_type;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
public:
// convert an allocator<T> to allocator<U>
template<typename U>
struct rebind {
typedef Allocator<U> other;
};
public:
inline Allocator() {}
inline ~Allocator() {}
inline Allocator(Allocator const&) {}
template<typename U>
inline Allocator(Allocator<U> const&) {}
// address
inline pointer address(reference r) { return &r; }
inline const_pointer address(const_reference r) { return &r; }
// memory allocation
inline pointer allocate(size_type cnt,
typename std::allocator<void>::const_pointer = 0)
{
size_t numBytes = cnt * sizeof (T);
std::cout << "Allocator " << id << " allocating " << numBytes << " bytes." << std::endl;
return reinterpret_cast<pointer>(::operator new(numBytes));
}
inline void deallocate(pointer p, size_type) {
::operator delete(p);
}
// size
inline size_type max_size() const {
return std::numeric_limits<size_type>::max() / sizeof(T);
}
// construction/destruction
inline void construct(pointer p, const T& t) { new(p)T(t); }
inline void destroy(pointer p) { p->~T(); }
inline bool operator==(Allocator const&) { return true; }
inline bool operator!=(Allocator const& a) { return !operator==(a); }
}; // end of class Allocator

std::function's allocator support is...weird.
The current spec for operator=(F&& f) is that it does std::function(std::forward<F>(f)).swap(*this);. As you can see, this means that memory for f is allocated using whatever std::function uses by default, rather than the allocator used to construct *this. So the behavior you observe is correct, though surprising.
Moreover, since the (allocator_arg_t, Allocator) and (allocator_arg_t, Allocator, nullptr_t) constructors are noexcept, they can't really store the allocator even if they wanted to (type-erasing an allocator may require a dynamic allocation). As is, they are basically no-ops that exist to support the uses-allocator construction protocol.
LWG very recently rejected an issue that would change this behavior.

Why std::string allocating twice?

I wrote a custom allocator for std::string and std::vector as follows:
#include <cstdint>
#include <iterator>
#include <iostream>
template <typename T>
struct PSAllocator
{
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef T value_type;
template<typename U>
struct rebind {typedef PSAllocator<U> other;};
PSAllocator() throw() {};
PSAllocator(const PSAllocator& other) throw() {};
template<typename U>
PSAllocator(const PSAllocator<U>& other) throw() {};
template<typename U>
PSAllocator& operator = (const PSAllocator<U>& other) { return *this; }
PSAllocator<T>& operator = (const PSAllocator& other) { return *this; }
~PSAllocator() {}
pointer allocate(size_type n, const void* hint = 0)
{
std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(::operator new(n * sizeof(value_type)));
std::cout<<"Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
return reinterpret_cast<pointer>(&data_ptr[0]);
}
void deallocate(T* ptr, size_type n)
{
std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(ptr);
std::cout<<"De-Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
::operator delete(reinterpret_cast<T*>(&data_ptr[0]));
}
};
Then I ran the following test case:
int main()
{
typedef std::basic_string<char, std::char_traits<char>, PSAllocator<char>> cstring;
cstring* str = new cstring();
str->resize(1);
delete str;
std::cout<<"\n\n\n\n";
typedef std::vector<char, PSAllocator<char>> cvector;
cvector* cv = new cvector();
cv->resize(1);
delete cv;
}
For whatever odd reason, it goes on to print:
Allocated: 0x3560a0 of size: 25
Allocated: 0x3560d0 of size: 26
De-Allocated: 0x3560a0 of size: 25
De-Allocated: 0x3560d0 of size: 26
Allocated: 0x351890 of size: 1
De-Allocated: 0x351890 of size: 1
So why does it allocate twice for std::string and a lot more bytes?
I'm using g++ 4.8.1 x64 sjlj on Windows 8 from: http://sourceforge.net/projects/mingwbuilds/.

I can't reproduce the double allocation, since apparently my libstdc++ does not allocate anything at all for the empty string. The resize however does allocate 26 bytes, and gdb helps me identifying how they are composed:
size_type __size = (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep);
( 1 + 1) * 1 + 24
So the memory is mostly for this _Rep representation, which in turn consists of the following data members:
size_type _M_length; // 8 bytes
size_type _M_capacity; // 8 bytes
_Atomic_word _M_refcount; // 4 bytes
I guess the last four bytes is just for the sake of alignment, but I might have missed some data element.
I guess the main reason why this _Rep structure is allocated on the heap is that it can be shared among string instances, and perhaps also that it can be avoided for empty strings as the lack of a first allocation on my system suggests.
To find out why your implementation doesn't make use of this empty string optimization, have a look at the default constructor. Its implementation seems to depend on the value of _GLIBCXX_FULLY_DYNAMIC_STRING, which apparently is non-zero in your setup. I'd not advise changing that setting directly, since it starts with an underscore and is therefore considered private. But you might find some public setting to affect this value.

Bad allocator implementation

I've got a pretty simple problem. I've been whipping up a non-thread-safe allocator. The allocator is a fairly simple memory arena strategy- allocate a big chunk, put all allocations in it, do nothing for deallocation, remove the whole lot on arena destruction. However, actually attempting to use this scheme throws an access violation.
static const int MaxMemorySize = 80000;
template <typename T>
class LocalAllocator
{
public:
std::vector<char>* memory;
int* CurrentUsed;
typedef T value_type;
typedef value_type * pointer;
typedef const value_type * const_pointer;
typedef value_type & reference;
typedef const value_type & const_reference;
typedef std::size_t size_type;
typedef std::size_t difference_type;
template <typename U> struct rebind { typedef LocalAllocator<U> other; };
template <typename U>
LocalAllocator(const LocalAllocator<U>& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
LocalAllocator(std::vector<char>* ptr, int* used) {
CurrentUsed = used;
memory = ptr;
}
template<typename U> LocalAllocator(LocalAllocator<U>&& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
pointer address(reference r) { return &r; }
const_pointer address(const_reference s) { return &r; }
size_type max_size() const { return MaxMemorySize; }
void construct(pointer ptr, value_type&& t) { new (ptr) T(std::move(t)); }
void construct(pointer ptr, const value_type & t) { new (ptr) T(t); }
void destroy(pointer ptr) { static_cast<T*>(ptr)->~T(); }
bool operator==(const LocalAllocator& other) const { return Memory == other.Memory; }
bool operator!=(const LocalAllocator&) const { return false; }
pointer allocate(size_type n) {
if (*CurrentUsed + (n * sizeof(T)) > MaxMemorySize)
throw std::bad_alloc();
auto val = &(*memory)[*CurrentUsed];
*CurrentUsed += (n * sizeof(T));
return reinterpret_cast<pointer>(val);
}
pointer allocate(size_type n, pointer) {
return allocate(n);
}
void deallocate(pointer ptr, size_type n) {}
pointer allocate() {
return allocate(sizeof(T));
}
void deallocate(pointer ptr) {}
};
I've initialized memory to point to a vector which is resized to the MaxMemorySize, and I've also initialized CurrentUsed to point to an int which is zero. I fed in an allocator with these values to the constructor of a std::unordered_map, but it keeps throwing an access violation in the STL internals. Any suggestions?
Edit: Here's my usage:
std::vector<char> memory;
int CurrentUsed = 0;
memory.resize(80000);
std::unordered_map<int, int, std::hash<int>, std::equal_to<int>, LocalAllocator<std::pair<const int, int>>> dict(
std::unordered_map<int, int>().bucket_count(),
std::hash<int>(),
std::equal_to<int>(),
LocalAllocator<std::pair<const int, int>>(&memory, &CurrentUsed)
);
// start timer
QueryPerformanceCounter(&t1);
for (int i=0;i<10000;i++)
dict[i]=i; // crash
Edit: Bloody hell. It worked when I increased the size to 1MB. I had to increase it to over 800,000 bytes to get it to work without throwing.

When I test this code, the rebind is being used to request multiple allocators against the same memory block. I put
cout << n << " " << sizeof(T) << " " << typeid(T).name() << endl;
at the top of allocate(size_type) and when I added three elements to a unordered_map got:
1 64 struct std::_List_nod<...>
16 4 struct std::_List_iterator<...>
1 64 struct std::_List_nod<...>
1 64 struct std::_List_nod<...>
1 64 struct std::_List_nod<...>
If your implementation isn't coincidentally using nice round 64-byte requests this class will return mis-aligned allocations.

MSVC10's hashtable types just has a ton of space overhead for small value types. It's overrunning the amount of space you've reserved and throwing bad_alloc.
It's implemented as a list<value_t> holding all the elements and a hash bucket vector<list<value_t>::iterator> with between 2 and 16 slots per element.
That's a total of 4 to 18 pointers of overhead per element.
Something like this implementation is probably required by the standard. Unlike vector, unordered_map has a requirement that elements not be moved once added to the container.

Can one leverage std::basic_string to implement a string having a length limitation?

I'm working with a low-level API that accepts a char* and numeric value to represent a string and its length, respectively. My code uses std::basic_string and calls into these methods with the appropriate translation. Unfortunately, many of these methods accept string lengths of varying size (i.e. max(unsigned char), max(short), etc...) and I'm stuck writing code to make sure that my string instances do not exceed the maximum length prescribed by the low-level API.
By default, the maximum length of an std::basic_string instance is bound by the maximum value of size_t (either max(unsigned int) or max(__int64)). Is there a way to manipulate the traits and allocator implementations of a std::basic_string implementation so that I may specify my own type to use in place of size_t? By doing so, I am hoping to leverage any existing bounds checks within the std::basic_string implementation so I don't have to do so when performing the translation.
My initial investigation suggests that this is not possible without writing my own string class, but I'm hoping that I overlooked something :)

you can pass a custom allocator to std::basic_string which has a max size of whatever you want. This should be sufficient. Perhaps something like this:
template <class T>
class my_allocator {
public:
typedef T value_type;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
pointer address(reference r) const { return &r; }
const_pointer address(const_reference r) const { return &r; }
my_allocator() throw() {}
template <class U>
my_allocator(const my_allocator<U>&) throw() {}
~my_allocator() throw() {}
pointer allocate(size_type n, void * = 0) {
// fail if we try to allocate too much
if((n * sizeof(T))> max_size()) { throw std::bad_alloc(); }
return static_cast<T *>(::operator new(n * sizeof(T)));
}
void deallocate(pointer p, size_type) {
return ::operator delete(p);
}
void construct(pointer p, const T& val) { new(p) T(val); }
void destroy(pointer p) { p->~T(); }
// max out at about 64k
size_type max_size() const throw() { return 0xffff; }
template <class U>
struct rebind { typedef my_allocator<U> other; };
template <class U>
my_allocator& operator=(const my_allocator<U> &rhs) {
(void)rhs;
return *this;
}
};
Then you can probably do this:
typedef std::basic_string<char, std::char_traits<char>, my_allocator<char> > limited_string;
EDIT: I've just done a test to make sure this works as expected. The following code tests it.
int main() {
limited_string s;
s = "AAAA";
s += s;
s += s;
s += s;
s += s;
s += s;
s += s;
s += s; // 512 chars...
s += s;
s += s;
s += s;
s += s;
s += s;
s += s; // 32768 chars...
s += s; // this will throw std::bad_alloc
std::cout << s.max_size() << std::endl;
std::cout << s.size() << std::endl;
}
That last s += s will put it over the top and cause a std::bad_alloc exception, (since my limit is just short of 64k). Unfortunately gcc's std::basic_string::max_size() implementation does not base its result on the allocator you use, so it will still claim to be able to allocate more. (I'm not sure if this is a bug or not...).
But this will definitely allow you impose hard limits on the sizes of strings in a simple way. You could even make the max size a template parameter so you only have to write the code for the allocator once.

I agree with Evan Teran about his solution. This is just a modification of his solution no more:
template <typename Type, typename std::allocator<Type>::size_type maxSize>
struct myalloc : std::allocator<Type>
{
// hide std::allocator[ max_size() & allocate(...) ]
std::allocator<Type>::size_type max_size() const throw()
{
return maxSize;
}
std::allocator<Type>::pointer allocate
(std::allocator<Type>::size_type n, void * = 0)
{
// fail if we try to allocate too much
if((n * sizeof(Type))> max_size()) { throw std::bad_alloc(); }
return static_cast<Type *>(::operator new(n * sizeof(Type)));
}
};
Be aware you should not use polymorphism at all with myalloc. So this is disastrous:
// std::allocator doesn't have a virtual destructor
std::allocator<char>* alloc = new myalloc<char>;
You just use it as if it is a separate type, it is safe in following case:
myalloc<char, 1024> alloc; // max size == 1024

Can't you create a class with std::string as parent and override c_str()?
Or define your own c_str16(), c_str32(), etc and implement translation there?

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Extra allocations and magical space reduction in the STL - using rvalue references - c++

Related

Can I write a custom allocator to decide the reallocation amount of std::vector?

What's the point of std::function constructor with custom allocator but no other args?

Why std::string allocating twice?

Bad allocator implementation

Can one leverage std::basic_string to implement a string having a length limitation?

Categories

Resources