I want to replace the standard allocator with a more robust allocator (the C++ standard only requires an overflow check on vector::resize). The various C++ allocators supplied with many libraries fall flat on their face when fed negative self tests.
I have access to a more robust allocator. ESAPI's allocator not only checks for overflow, it also has debug instrumentation to help find mistakes. http://code.google.com/p/owasp-esapi-cplusplus/source/browse/trunk/esapi/util/zAllocator.h.
Is there a standard way to replace the C++ allocator used in a program without too much effort? I also want to ensure it's replaced in library code, for which I may not have the source.
Unlike malloc, which is a library function that can be replaced by another function with the same signature, std::allocator is a class template, and template code is instantiated as needed and inlined into the code that uses it. Some standard library code will have already been compiled into the library's object files and will contain instantiated std::allocator code which can't be replaced. So the only way is if the standard library provides some non-standard way to replace its std::allocator. Luckily, GCC's libstdc++ allows you to do just that: you can select the implementation used for std::allocator when GCC is configured and built, with a few different choices.
It wouldn't be too much work to add the ESAPI allocator to the GCC sources as one of the options, then rebuild GCC to use that allocator as the base class of std::allocator, providing its implementation. You might need to tweak the ESAPI allocator code a bit, and maybe alter the libstdc++ configure script so that it accepts an option such as --enable-libstdcxx-allocator=esapi.
If you want to modify allocation on a global basis instead of per-container, you probably want to replace ::operator new and ::operator delete. Conceivably, you'd also want to replace ::operator new[] and ::operator delete[] as well -- but these are only used for allocating arrays, which you should almost never use anyway (aside, in case it wasn't obvious: no, these are not used to allocate memory for a std::vector, despite its being rather similar to an array in some ways).
Although trying to replace most parts of the library is prohibited, the standard specifically allows replacing these.
Of course, if somebody is already specifying a different allocator for a particular container, and that allocator doesn't (eventually) get its memory via ::operator new (or ::operator new[]) this will not affect that container/those containers.
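To make the replacement concrete, here is a minimal sketch of what replacing the global allocation functions can look like. The counter g_new_calls and the helper calls_made_by_vector are made-up names used only to show that std::vector's default allocator ends up in the replacement; a hardened allocator would do its overflow checks and instrumentation where the counter is bumped. It also skips the new_handler loop that a fully conforming replacement would have, so treat it as an illustration rather than a finished implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical counter, only here to show that calls arrive.
static std::size_t g_new_calls = 0;

// Replacement for the global allocation function. Defining it once, anywhere
// in the program, replaces the version supplied by the library.
// Simplification: a conforming replacement would also consult the new_handler.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (size == 0) size = 1;                    // malloc(0) may return a null pointer
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();
}

// Matching deallocation functions (unsized and sized forms).
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// A std::vector using the default std::allocator goes through the replacement.
std::size_t calls_made_by_vector() {
    const std::size_t before = g_new_calls;
    std::vector<int> v(100, 7);
    assert(v.size() == 100);
    return g_new_calls - before;
}
```

The default ::operator new[] and ::operator delete[] forward to the functions above, so they only need their own replacements if arrays should be treated differently.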
In C++0x, define a new template alias in namespace mystd that is a std::vector but with your custom allocator. Replace all std::vectors with mystd::vector. Get rid of all using namespace std and using std::vector in your code.
Rebuild. Replace the places where you used a raw vector<T> with mystd::vector<T>.
Oh, and use a better name than mystd.
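A rough sketch of the alias approach follows. The checked_allocator here is a made-up stand-in for whatever hardened allocator you actually want (it is not the ESAPI one), and it only shows the multiplication overflow check; over-aligned types are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>
#include <type_traits>
#include <vector>

namespace mystd {

// Hypothetical stand-in for a hardened allocator.
template <class T>
struct checked_allocator {
    using value_type = T;

    checked_allocator() noexcept = default;
    template <class U>
    checked_allocator(const checked_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Refuse requests whose byte count would overflow size_t.
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T))
            throw std::bad_alloc();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// Stateless, so any two instances are interchangeable.
template <class T, class U>
bool operator==(const checked_allocator<T>&, const checked_allocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const checked_allocator<T>&, const checked_allocator<U>&) noexcept { return false; }

// The alias: spelled like std::vector, but with the custom allocator baked in.
template <class T>
using vector = std::vector<T, checked_allocator<T>>;

}  // namespace mystd

static_assert(std::is_same<mystd::vector<int>::allocator_type,
                           mystd::checked_allocator<int>>::value,
              "the alias selects the custom allocator");

void demo_alias() {
    mystd::vector<int> v;
    for (int i = 0; i < 3; ++i) v.push_back(i);
    assert(v.size() == 3 && v[2] == 2);
}
```

Note that mystd::vector<int> and std::vector<int> are different types, so this only helps in code you can recompile.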
My team is working on an application, where we need to track memory usage, and provide statistics on how much memory areas of the program utilize (e.g. N bytes used by uncontrolled STL containers). I need to find a way to identify memory allocated in 3rd party libs from STL containers.
The application makes use of 3rd party libraries for which we either don't have access to the source code or have been directed not to change it. Some of these libraries use standard STL containers, like std::vector<int>, but they have used (or appear to use, in the case of the closed libs) the default std::allocator. We are targeting Windows, with future work planned for Mac and Linux platforms, using C++17 as much as possible.
I've overridden the malloc and free functions; overridden the new, new[], delete and delete[] operators; and created an STLAllocator class derived from std::allocator that is used as the _Alloc template parameter for our own use of STL containers. For the libraries that provide hooks to replace the memory allocators, I have done so. When the STL containers in the remaining 3rd party libs use the default std::allocator, I can see their new and delete calls come through the new and delete overrides, but for tracking purposes these look no different from a call to new or delete made from main.
I've read many great descriptions of how to declare and use your own std::allocator class, been reminded of the template parameter equality issue when providing different allocators, and made aware of an upcoming solution using std::experimental::pmr::polymorphic_allocator, but I haven't found a definitive answer to my question. Is there a way to supplant the default std::allocator for 3rd party libs that don't provide a hook to override the default std::allocator used by STL containers?
For anyone interested, here is the link that describes the template parameter equality issue; it's also a good overview of std::allocator in general:
https://blog.feabhas.com/2019/03/thanks-for-the-memory-allocator/
Is there a way to supplant the default std::allocator for 3rd party libs that don't provide a hook to override the default std::allocator used by STL containers?
Not in general; especially for things that you don't have source code for.
Consider (for example) a call to std::allocator<int>::allocate. Chances are, it's marked as inline, which means that the body of the function has been embedded in the object code that you're linking. Providing your own copy of that function at link time (or in a separate dylib) will have no effect.
Providing your own global operator new is probably the best you can do.
While reading into the libssh library, I saw that they specifically say
libssh follows the allocate-it-deallocate-it pattern. Each object that you allocate using xxxxx_new() must be deallocated using xxxxx_free()
Is this something that comes from it being a C library rather than a C++ library, where new and delete didn't exist, or is it a common practice to forget about new and delete and manually create and delete objects using the xxxx_new and xxxx_free pattern? If it is a common practice, what are its benefits over new and delete and the constructors and destructors that are called?
[EDIT] Added the link to where I read this as an <a> tag on "libssh library" for those asking.
A first glance at the link you provided reveals that libssh uses the xxxx_new() functions as combined allocator/constructor calls. It's really just a standard naming of factory functions. Likewise, xxxx_free() acts as a destructor/deallocator combination.
Combining allocation and construction into a single function call is a good idea whenever a library wants to provide typesafe opaque pointers to its user code: To compile the user code, the compiler only needs to know that the type exists and that it's distinct from any other type. There is no need to have the full class/struct declaration in a public header.
This approach is not very popular with C++ libraries, because they generally want their objects to behave like any normal C++ object (which means that the pointers/references must not be opaque to the compiler). But if a library provides a C interface, such factory functions make it unlikely that you get weird errors due to users passing in pointers to uninitialized objects (forgotten constructor call), or screwing up the allocation of your objects.
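As an illustration of the opaque-pointer idea, here is a small sketch written so it compiles as C++ in one file. The counter type and its counter_new/counter_free/counter_next functions are invented for this example and have nothing to do with libssh's real API; in a real library the first part would sit in the public header and the second part in a private source file.

```cpp
#include <cassert>
#include <cstdlib>

// ---- What the public header would contain ----
struct counter;                      // opaque: callers never see the members
counter* counter_new(int start);     // allocate + initialize in one call
void     counter_free(counter* c);   // clean up + deallocate in one call
int      counter_next(counter* c);   // returns the current value, then increments

// ---- What the library's private source file would contain ----
struct counter {
    int value;
};

counter* counter_new(int start) {
    counter* c = static_cast<counter*>(std::malloc(sizeof(counter)));
    if (c) c->value = start;         // the object is never handed out uninitialized
    return c;
}

void counter_free(counter* c) {
    std::free(c);                    // free(NULL) is a no-op, so no null check is needed
}

int counter_next(counter* c) {
    return c->value++;
}

// User code only ever holds a counter* obtained from counter_new().
void demo_counter() {
    counter* c = counter_new(5);
    assert(c != nullptr);
    assert(counter_next(c) == 5);
    assert(counter_next(c) == 6);
    counter_free(c);
}
```

Because user code can only get a counter* from counter_new(), it cannot skip initialization or allocate the object with the wrong size.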
Is this something that comes from it being a C library rather than a C++ library where new and delete didn't exist
Most probably yes. C code often needs more initialization than the plain memory allocation that malloc() provides. It's similar to how new calls the constructors of the class/struct instances it creates.
or is it a common practice to forget about new and delete and manually create and delete objects using the xxxx_new and xxxx_free pattern?
No, that's not common practice in C++.
The way to deal with dynamically allocated instances and ownership is to use the functions and smart pointer classes from the standard c++ Dynamic memory management utilities.
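If you do have to use a C-style xxxx_new()/xxxx_free() pair from C++, one common way to combine the two worlds is std::unique_ptr with a custom deleter. The sketch below uses a made-up session type with session_new/session_free standing in for the C library (these are not libssh functions), plus a counter so the effect is visible.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API standing in for a library's xxxx_new()/xxxx_free().
struct session { bool connected; };
static int g_live_sessions = 0;      // illustration only: counts objects not yet freed

session* session_new() {
    ++g_live_sessions;
    return new session{false};
}

void session_free(session* s) {
    if (s) {
        --g_live_sessions;
        delete s;
    }
}

// Deleter that routes destruction back to the library's own free function.
struct session_deleter {
    void operator()(session* s) const noexcept { session_free(s); }
};
using session_ptr = std::unique_ptr<session, session_deleter>;

void demo_session() {
    {
        session_ptr s(session_new());
        assert(g_live_sessions == 1);
        s->connected = true;
    }   // session_free() runs here, even if the scope is left by an exception
    assert(g_live_sessions == 0);
}
```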
I have a question very similar to
How do I allocate a std::string on the stack using glibc's string implementation?
but I think it's worth asking again.
I want an std::string with local storage that overflows into the free store. std::basic_string provides an allocator as a template parameter, so it seems like the thing to do is to write an allocator with local storage and use it to parameterize the basic_string, like so:
std::basic_string<
    char,
    std::char_traits<char>,
    inline_allocator<char, 10>
>
x("test");
I tried to write the inline_allocator class that would work the way you'd expect: it reserves 10 bytes for storage, and if the basic_string needs more than 10 bytes, then it calls ::operator new(). I couldn't get it to work. In the course of executing the above line of code, my GCC 4.5 standard string library calls the copy constructor for inline_allocator 4 times. It's not clear to me that there's a sensible way to write the copy constructor for inline_allocator.
In the other StackOverflow thread, Eric Melski provided this link to a class in Chromium:
http://src.chromium.org/svn/trunk/src/base/stack_container.h
which is interesting, but it's not a drop-in replacement for std::string, because it wraps the std::basic_string in a container so that you have to call an overloaded operator->() to get at the std::basic_string.
I can't find any other solutions to this problem. Could it be that there is no good solution? And if that's true, then are the std::basic_string and std::allocator concepts badly flawed? I mean, it seems like this should be a very basic and simple use case for std::basic_string and std::allocator. I suppose the std::allocator concept is designed primarily for pools, but I think it ought to cover this as well.
It seems like the rvalue-reference move semantics in C++0x might make it possible to write inline_allocator, if the string library is re-written so that basic_string uses the move constructor of its allocator instead of the copy constructor. Does anyone know what the prospect is for that outcome?
My application needs to construct a million tiny ASCII strings per second, so I ended up writing my own fixed-length string class based on Boost.Array, which works fine, but this is still bothering me.
Andrei Alexandrescu, C++ programmer extraordinaire who wrote "Modern C++ Design" once wrote a great article about building different string implementations with customizable storage systems. His article (linked here) describes how you can do what you've described above as a special case of a much more general system that can handle all sorts of clever memory allocation requirements. This doesn't talk so much about std::string and focuses more on a completely customized string class, but you might want to look into it as there are some real gems in the implementation.
C++2011 is really going to help you here :)
The fact is that the allocator concept in C++03 was crippled. One of the requirements was that an allocator of type A should be able to deallocate memory from any other allocator of type A... Unfortunately this requirement is also at odds with stateful allocators each hooked to its own pool.
Howard Hinnant (who manages the STL subgroup of the C++ committee and is implementing a new STL from scratch for C++0x) has explored stack-based allocators on his website, which you could get inspiration from.
This is generally unnecessary. It's called the "short string optimization", and most implementations of std::string already include it. It may be hard to find, but it's usually there anyway.
Just for example, here's the relevant piece of sso_string_base.h that's part of MinGW:
enum { _S_local_capacity = 15 };
union
{
    _CharT _M_local_data[_S_local_capacity + 1];
    size_type _M_allocated_capacity;
};
The _M_local_data member is the relevant one -- space for it to store (up to) 15 characters (plus a NUL terminator) without allocating any space on the heap.
If memory serves, the Dinkumware library included with VC++ allocates space for 20 characters, though it's been a while since I looked, so I can't swear to that (and tracking down much of anything in their headers tends to be a pain, so I prefer to avoid looking if I can).
In any case, I'd give good odds that you've been engaged in that all-too-popular pastime known as premature optimization.
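If you want to check what your own implementation does rather than dig through its headers, a quick (and admittedly hacky) test is to see whether the character buffer lives inside the string object itself. The stored_inline helper below is something I made up for this purpose; the result for short strings depends on the library, so it is only printed, not assumed.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <string>

// True when the character buffer lies inside the std::string object itself,
// i.e. no separate heap block is being used for the characters.
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    std::less<const char*> before;   // gives a total order even for unrelated pointers
    return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}

void demo_sso() {
    const std::string small("test");
    const std::string big(1000, 'x');

    // Implementation-dependent, so just report it.
    std::printf("4-character string stored inline: %s\n",
                stored_inline(small) ? "yes" : "no");

    // 1000 characters cannot fit inside the string object on any common implementation.
    assert(!stored_inline(big));
}
```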
I believe the code from Chromium just wraps things into a nice shell. But you can get the same effect without using the Chromium wrapper container.
Because the allocator object gets copied so often, it needs to hold a reference or pointer to the memory. So what you'd need to do is create the storage buffer, create the allocator object, then call the std::string constructor with the allocator.
It will be a lot wordier than using the wrapper class but should get the same effect.
You can see an example of the verbose method (still using the chromium stuff) in my question about stack vectors.
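For what it's worth, here is a rough sketch of that idea without the Chromium classes. The names stack_arena, arena_allocator and arena_string are all invented for this example, and it is written against the C++11 allocator model (stateful allocators, allocator passed to the string constructor), so it won't help with a C++03 library. The allocator only holds a pointer, which is why copying it is harmless.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <string>

// Fixed buffer that lives on the stack next to the string that uses it.
struct stack_arena {
    alignas(std::max_align_t) char buf[256];
    std::size_t used = 0;

    bool owns(const void* p) const {
        std::less<const char*> lt;   // total order even for unrelated pointers
        const char* c = static_cast<const char*>(p);
        return !lt(c, buf) && lt(c, buf + sizeof(buf));
    }
};

template <class T>
struct arena_allocator {
    using value_type = T;
    stack_arena* arena;              // every copy points at the same arena

    explicit arena_allocator(stack_arena* a) noexcept : arena(a) {}
    template <class U>
    arena_allocator(const arena_allocator<U>& o) noexcept : arena(o.arena) {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        const std::size_t bytes = n * sizeof(T);
        // Round the next free offset up to T's alignment.
        const std::size_t start = (arena->used + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start <= sizeof(arena->buf) && bytes <= sizeof(arena->buf) - start) {
            arena->used = start + bytes;
            return reinterpret_cast<T*>(arena->buf + start);
        }
        return static_cast<T*>(::operator new(bytes));   // overflow to the free store
    }
    void deallocate(T* p, std::size_t) noexcept {
        // Arena space is never reused; it goes away with the arena itself.
        if (!arena->owns(p)) ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const arena_allocator<T>& a, const arena_allocator<U>& b) noexcept { return a.arena == b.arena; }
template <class T, class U>
bool operator!=(const arena_allocator<T>& a, const arena_allocator<U>& b) noexcept { return a.arena != b.arena; }

using arena_string = std::basic_string<char, std::char_traits<char>, arena_allocator<char>>;

void demo_arena() {
    stack_arena arena;                                   // 1. create the storage buffer
    arena_allocator<char> alloc(&arena);                 // 2. create the allocator
    arena_string s("a string long enough to need a real allocation", alloc);  // 3. pass it in
    assert(arena.owns(s.data()));                        // characters live in the arena
    s.append(1000, '!');                                 // too big for the arena
    assert(!arena.owns(s.data()));                       // now on the free store
}
```

The arena has to outlive every string that uses it, which is the price of not wrapping the two together the way the Chromium container does.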
I have a test app that is linked with some DLLs (or .so's). In my main app I defined a global new/delete like this:
void* operator new(size_t n)
{
....
}
void operator delete(void* p)
{
...
}
But I noticed the operators are only called for things I allocate in my main app, but not if one of the DLLs does.
How do I make allocations in the DLLs go through my operator new/delete? (This should also include memory allocated by STL, so if one of the DLLs has an std::string, I'd like my operator new to be called when STL allocates its std::string internal buffer).
I'm more interested in a Windows solution, but a Linux one would also be appreciated.
edit: perhaps I wasn't clear originally. This test app was meant to track memory usage for a few auto-generated classes defined in a DLL. Creating my own allocator and using that in the generated code's STL structures isn't an option; moreover, there are other non-STL allocations. But seeing the answers, I think the best option is either to use a profiler or simply monitor the memory usage using perfmon.
I'd like my operator new to be called when STL allocates its std::string internal buffer
typedef std::basic_string<char, std::char_traits<char>, ALLOCATOR> mystring;
The code in the DLLs already uses its own new implementation, and there's no good reason why defining your own implementation should magically change the implementation that the DLLs use (what if they use their own custom implementation?).
So if you want the strings to use your allocator, you need to explicitly create them as such.
Anything that is intended to use your global definitions must be compiled with those definitions available. The technique you use will not override anything already compiled in DLLs or even other source files that don't include these definitions. In many cases the allocators and standard functions will also not use these functions even when visible.
If you really need to do this you'll have to intercept calls to malloc (and other allocation routines). This isn't easy. You can't simply do it from the code. You'll have to tell the linker how to do this. On Linux I think this is LD_PRELOAD, though I can't remember, and on Windows I'm not sure at all.
If you can indicate why you'd like to do this perhaps I can offer an alternate solution.
There is no way to completely do what you want to do. There are too many corner cases where memory is leaked.
The closest I think you can get is by doing the following:
Each class in your dll/.so will have to have a static factory/destroy method. Pass a pointer to an allocation function to the factory and the deallocation function to the destroy method in each class in each dll/.so.
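A bare-bones sketch of that factory/destroy shape is below. Widget, alloc_fn, free_fn and the tracked_* functions are made-up names for illustration; in a real setup the class would live in the dll/.so and the application would pass in its own allocation functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Function-pointer types the application uses to supply its own memory routines.
using alloc_fn = void* (*)(std::size_t);
using free_fn  = void (*)(void*);

// Hypothetical class exported from a dll/.so.
class Widget {
public:
    // Factory: memory comes from the caller's function, construction happens in place.
    static Widget* create(alloc_fn alloc, int value) {
        void* mem = alloc(sizeof(Widget));
        return mem ? new (mem) Widget(value) : nullptr;
    }
    // Destroy: run the destructor, then hand the memory back to the caller's function.
    static void destroy(Widget* w, free_fn release) {
        if (!w) return;
        w->~Widget();
        release(w);
    }
    int value() const { return value_; }

private:
    explicit Widget(int v) : value_(v) {}
    ~Widget() = default;             // private, so plain new/delete cannot be used
    int value_;
};

// Application-side tracking functions (illustration only).
static int g_outstanding = 0;
void* tracked_alloc(std::size_t n) { ++g_outstanding; return std::malloc(n); }
void  tracked_free(void* p)        { --g_outstanding; std::free(p); }

void demo_widget() {
    Widget* w = Widget::create(&tracked_alloc, 42);
    assert(w && w->value() == 42);
    assert(g_outstanding == 1);
    Widget::destroy(w, &tracked_free);
    assert(g_outstanding == 0);
}
```

This only covers the object itself. Anything the class allocates internally (for example a std::string member) still goes wherever the dll's own allocator sends it, which is one of the corner cases mentioned above.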
For an example of how to get close, google for the HORDE memory allocation library, which does get close.
Another thing to look at is the various C++ class plugin libraries that allow you to load any dll/.so as a plugin. Last time I checked there were at least 10 such libraries with source in the googlesphere. :)
I am trying to write a container class which uses STL allocators. What I currently do is to have a private member
std::allocator<T> alloc_;
(this will later be templated so that the user can pick a different allocator) and then call
T* ptr = alloc_.allocate(1,0);
to get a pointer to a newly allocated 'T' object (and use alloc_.construct to call the constructor; see the answer below). This works with the GNU C++ library.
However, with STLPort on Solaris, this fails to do the right thing and leads to all sorts of bizarre memory corruption errors. If I instead do
std::allocator_interface<std::allocator<T> > alloc_;
then it is all working as it should.
What is the correct way to use std::allocator? The STLPort/Solaris version fails to compile with g++, but is g++ right?
You need to both allocate and construct with the allocator. Something like this:
T* ptr = alloc_.allocate(1,0);
alloc_.construct(ptr, value);
Lots of things are downright broken if you don't start with a properly constructed object. Imagine a std::string being allocated but not constructed. When you try to assign to it, it will first try to cleanup its old contents by freeing some data, which will of course be garbage values from the heap and crash.
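On a C++11 or later compiler, the same allocate-then-construct sequence (and its mirror image, destroy-then-deallocate) is usually written through std::allocator_traits, which also keeps working with allocators that don't define construct() themselves. The boxed class below is just an invented one-element container to show the order of the calls; it assumes the allocator hands out plain T* pointers.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical one-element container showing the four steps in order.
// Assumes the allocator's pointer type is a plain T*.
template <class T, class Alloc = std::allocator<T>>
class boxed {
    using traits = std::allocator_traits<Alloc>;

public:
    explicit boxed(const T& value, const Alloc& alloc = Alloc())
        : alloc_(alloc), ptr_(traits::allocate(alloc_, 1)) {   // 1. raw memory only
        try {
            traits::construct(alloc_, ptr_, value);            // 2. run the constructor
        } catch (...) {
            traits::deallocate(alloc_, ptr_, 1);               // don't leak if T's constructor throws
            throw;
        }
    }

    ~boxed() {
        traits::destroy(alloc_, ptr_);                         // 3. run the destructor
        traits::deallocate(alloc_, ptr_, 1);                   // 4. release the memory
    }

    boxed(const boxed&) = delete;
    boxed& operator=(const boxed&) = delete;

    T& get() { return *ptr_; }

private:
    Alloc alloc_;   // declared before ptr_ so it is initialized first
    T* ptr_;
};

void demo_boxed() {
    boxed<std::string> b("hello");
    b.get() += " world";             // safe: the string was properly constructed
    assert(b.get() == "hello world");
}
```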
Something you might want to do is have your own custom allocator that you can use to see how the standard containers interact with allocators. Stephan T. Lavavej posted a nice, simple one called the mallocator. Drop it into a test program that uses various STL containers and you can easily see how the allocator is used by the standard containers:
http://blogs.msdn.com/vcblog/archive/2008/08/28/the-mallocator.aspx
Not all of the interface functions in the mallocator (such as construct() and destroy()) are instrumented with trace output, so you might want to drop trace statements in there to more easily see how the standard containers might use those functions without resorting to a debugger.
That should give you a good idea of how your containers might be expected to use a custom allocator.
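If you just want something small to experiment with, here is a stripped-down tracing allocator in the same spirit. To be clear, this is not the mallocator from the link; tracing_allocator and the two counters are names I made up, and it only traces allocate() and deallocate().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <new>
#include <vector>

// Illustration-only counters so the calls can be checked as well as printed.
static std::size_t g_allocate_calls = 0;
static std::size_t g_deallocate_calls = 0;

template <class T>
struct tracing_allocator {
    using value_type = T;

    tracing_allocator() noexcept = default;
    template <class U>
    tracing_allocator(const tracing_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocate_calls;
        std::printf("allocate: %zu object(s) of %zu byte(s)\n", n, sizeof(T));
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        ++g_deallocate_calls;
        std::printf("deallocate: %zu object(s)\n", n);
        ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const tracing_allocator<T>&, const tracing_allocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const tracing_allocator<T>&, const tracing_allocator<U>&) noexcept { return false; }

void demo_trace() {
    {
        std::vector<int, tracing_allocator<int>> v;
        for (int i = 0; i < 5; ++i) v.push_back(i);   // growth triggers reallocations
    }   // the vector releases its last buffer here
    assert(g_allocate_calls > 0);
    assert(g_allocate_calls == g_deallocate_calls);
}
```

How many allocate calls you see for the five push_backs depends on the library's growth policy, which is exactly the kind of thing the trace output makes visible.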