What happens if two libraries (dynamically linked) each ship their own globally overridden version of the new and delete operators and use their own memory management?
Is it generally wrong to ship memory management facilities inside a library, or can it be good in some cases to provide memory management only for specific classes, by overriding only the class-specific new and delete operators?
Are there any differences in the case of statically linked libraries?
In general, this is labelled "here be dragons". It depends on all sorts of things. Often the two libraries will fight, and new and delete will end up overridden by one of them - that's the best you can hope for.
The alternatives are worse:
Library A starts up, overrides new/delete, and allocates some memory. Library B starts up and overrides them again. At system shutdown, library A's memory is freed with library B's delete. That is not good.
Memory allocated in library A uses library A's override, and similarly for library B. If you ever end up with memory allocated in library A being freed by library B, you will lose. (And it can be even more confusing, because if the object being deleted by B has a virtual destructor, the deletion can end up being done by A ... so it works.)
I think Martin pretty well answers your question, by outlining what happens and also noting that it's a bit dangerous and not highly advisable (there be dragons indeed). Let me extend it a bit by providing an alternative: avoid overriding new/delete, and instead use the allocator concept.
For instance, if you look at std::vector, you'll notice that it is templated not only on the type it stores but also on an allocator. By writing a conforming allocator, you can control exactly how std::vector allocates and deallocates memory. Notice how nice and loosely coupled this is: you have full control over memory allocation even though you can't change any of the original std::vector source code.
If you want library B to do allocation in a specific way, what I would do is:
Write an allocator that conforms to the allocator concept in the way that std::allocator does.
Make sure that all classes that use dynamic memory allocation directly (i.e. not through one of their members) are allocator aware.
Typedef all of those classes so that they use your allocator by default.
To clarify step 3, what I mean is write something like this in a fundamental header file of your library:
namespace my_lib {
    template <class T>
    using vector = std::vector<T, MyAllocator<T>>;
}
Now you can use vector anywhere in your library; it will behave like a normal vector but use your allocation scheme.
Step 1 can range from very easy (for stateless allocators) to quite tricky (there are a number of gotchas with stateful allocators). As for step 2, it's quite straightforward as far as dynamic memory for containers goes, since all of the containers in the standard library already support this. But if you are using dynamic memory for e.g. polymorphism, you'll need to do a bit of extra work (probably write a suitable wrapper) to do this in an allocator-aware way.
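To make steps 1 and 3 concrete, here is a minimal sketch of a stateless C++11-conforming allocator plus the library-wide alias. The names (my_lib, MyAllocator) are the placeholders used above; a real implementation would hook the library's allocation scheme into allocate/deallocate:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

namespace my_lib {

// Step 1: a minimal stateless allocator. For illustration it forwards
// to ::operator new/delete; your library would substitute its own scheme.
template <class T>
struct MyAllocator {
    using value_type = T;

    MyAllocator() noexcept = default;
    template <class U>
    MyAllocator(const MyAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p);
    }
};

// Stateless allocators always compare equal.
template <class T, class U>
bool operator==(const MyAllocator<T>&, const MyAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const MyAllocator<T>&, const MyAllocator<U>&) noexcept { return false; }

// Step 3: library-wide alias so "vector" means "vector with our allocator".
template <class T>
using vector = std::vector<T, MyAllocator<T>>;

} // namespace my_lib
```

Usage is then simply my_lib::vector<int> instead of std::vector<int> throughout the library.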
If someone has good examples of reasons why you'd want to override new/delete as opposed to using allocators (e.g. because there is something you can't do with allocators) I'd be interested to hear them.
Edit: and just to bring this full circle, note that there will not be any problem using libraries A and B simultaneously if you do this, as no global operators have been overridden.
Related
I have a background in Java and I'm still not fully used to the concept of pointers and scope, so sorry if the question seems silly.
I read somewhere that I should delete pointers that I have allocated on the heap. I understand that, but should I also delete a pointer that is given to me like this:
#include <dirent.h>

DIR* dir;
struct dirent* entries;

dir = opendir("D:/DIR");
entries = readdir(dir);

// Should I delete the pointers after I'm done with them?
delete entries;
delete dir;
Should I delete the pointers which are assigned from somewhere else or just going out of scope deletes them automatically?
Or is it even right to delete them since I haven't assigned them using new? But then if it's wrong how can I make sure that the memory assigned from the other methods would be deleted after I'm finished using them?
Not necessarily.
The unavoidable rule in C++ is that every new needs to be paired with a delete, and every new[] with a delete[]; malloc, free, etc. are also still available.
So it's tempting to suppose that the memory you get back from readdir is required to be released with an explicit call to delete, but that might not be the case:
It might have been allocated with new[], or even malloc!
The library might provide a function that you need to call that releases the memory.
If the library is designed to work well with the C++ standard library, it might provide you with a deleter, which you can pass on construction to a smart pointer such as std::unique_ptr.
I think that (2) is most likely, and it is the most sensible approach since different C++ runtime environments might implement new and delete differently. (3) is an extension of this concept; if the library supports it, use that.
The golden rule is to check the documentation and do what it tells you.
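Even when a C library provides no ready-made deleter (case 3), you can usually adapt its cleanup function yourself. A sketch using the opendir/closedir pair from the question (path swapped for "." so it runs on POSIX systems; DirCloser and count_entries are illustrative names):

```cpp
#include <cassert>
#include <dirent.h>
#include <memory>

// Adapt the C cleanup function closedir as a smart-pointer deleter,
// so the handle is released automatically and no delete is ever needed.
struct DirCloser {
    void operator()(DIR* d) const noexcept {
        if (d) closedir(d);
    }
};

using unique_dir = std::unique_ptr<DIR, DirCloser>;

// Count directory entries; returns -1 if the directory cannot be opened.
int count_entries(const char* path) {
    unique_dir dir(opendir(path));      // ownership taken immediately
    if (!dir) return -1;
    int n = 0;
    while (readdir(dir.get()) != nullptr)
        ++n;
    return n;                           // closedir runs here automatically
}
```

This follows the documentation's rule (opendir pairs with closedir) while still getting scope-based cleanup.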
There's no definitive answer to the question, because it always depends on the semantics by which the memory was allocated. For example, in the very code example you gave, you must not use delete for deallocation, because opendir is not a C++ function (but a POSIX one), and to properly release the handle you call closedir. The pointer itself can then be discarded (no deletion required; closedir internally does the cleanup). Just make sure you don't use it afterwards (see: use-after-free bug).
In general you always have to consult the manual of the function that gives you a pointer, where it's also specified exactly how to deallocate it.
Just to give you the idea:
malloc/calloc/realloc → free
fopen → fclose
X… → XFree
C is not the same as C++, notably for this aspect.
When using some external C (or C++) function provided by some external library, you should read its documentation and follow the "ownership" rules and conventions. For example, if you use getline you understand that you need to free the buffer, like here. If you use opendir you should use closedir. If you use sqlite3_prepare_v2 you'll need to sqlite3_finalize it, etc. Sometimes you'll think in terms of some abstract data type (like here) with a destructor-like function.
When you develop your own (public or "private") function in C, you need to document whether it returns heap-allocated memory and who is responsible for freeing (or releasing) it, and how.
With C++, you also have smart pointers and RAII; so you generally can avoid manual new and delete. Smart pointers are helpful, but not a silver bullet.
So you should make explicit and document the conventions about ownership (and follow the conventions of external libraries and APIs). Understanding and defining such conventions wisely is an important task.
Circular references are difficult to handle (consider weak pointers when applicable). I recommend reading about garbage collection concepts and techniques (e.g. the GC handbook), to at least be able to name appropriately your approaches and understand the limitations and power of reference counting. In some cases, you might even explicitly use a garbage collector library or code your own allocator.
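To illustrate the weak-pointer point: a minimal sketch of a parent/child cycle where the back edge is a std::weak_ptr, so reference counting can still reclaim both objects (the Parent/Child names are illustrative):

```cpp
#include <cassert>
#include <memory>

struct Child;

struct Parent {
    std::shared_ptr<Child> child;   // strong edge: Parent keeps Child alive
};

struct Child {
    std::weak_ptr<Parent> parent;   // weak edge: does NOT keep Parent alive
};

// With a shared_ptr back-pointer instead, the two objects would keep each
// other alive forever and leak; the weak_ptr breaks the cycle.
void demo() {
    auto p = std::make_shared<Parent>();
    p->child = std::make_shared<Child>();
    p->child->parent = p;           // cycle, but the back edge is weak
}   // p expires here: Parent destroyed, then Child; nothing leaks
```

The weak side must call lock() and check the result before use, since the parent may already be gone.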
Manual memory management is a whole-program property, and that is why it is hard. There are even cases (long-lived processes doing a lot of allocation) where you need to worry about fragmentation.
Tools like valgrind and the address sanitizer (with GCC's instrumentation options or Clang's ones) are practically very helpful to hunt memory leaks and some other memory bugs. Be also aware of ASLR. Take care to understand the virtual address space of your process. On Linux, read proc(5) then try cat /proc/$$/maps and cat /proc/self/maps in some terminal for a useful insight.
Ultimately you should consult the vendor's manual and see whether their functions do the cleanup themselves or you need to call another function to do it. In general, when talking about raw pointers in C++, you should explicitly release the allocated memory where appropriate. Whether it comes from a function or from a new / new[] expression does not make a difference. To avoid the manual new / delete combination you can utilize smart pointers and the RAII technique.
Is it possible to override the way STL allocates, manages, and frees memory? How would one do so, if it's possible? Is there a way to do this in a way that keeps the code that handles the raw memory in one class or file?
I would like to do this for my entire program so I can keep track of memory usage, timing, and lifetime info. Purely out of curiosity of course!
You can do that by redefining the operators new and delete in one of your files.
The linker will replace the standard ones with yours when resolving symbols.
You'll find lots and lots of answers on SO, like this one: overloading new/delete, or that one: How to track memory allocations in C++ (especially new/delete).
There exist some libraries on the internet that do that for you as well, like Memtrack or this one. SO also has some resources on that: C++ memory leak auto detection library.
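As a minimal sketch of the redefinition approach: replacement global operator new/delete that keep a live-allocation count. The counter name is illustrative, and this version is deliberately simple (not thread-safe, and it only tracks counts, not bytes freed):

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Illustration only: a counting replacement for the global allocation
// operators. The linker resolves all new/delete in the program to these.
static std::size_t g_live_allocs = 0;   // not thread-safe; sketch only

void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);  // new(0) must return unique ptr
    if (!p) throw std::bad_alloc{};
    ++g_live_allocs;
    return p;
}

void operator delete(void* p) noexcept {
    if (p) {
        --g_live_allocs;
        std::free(p);
    }
}

// Keep the other common forms consistent by forwarding to the above.
void operator delete(void* p, std::size_t) noexcept { operator delete(p); }
void* operator new[](std::size_t size) { return operator new(size); }
void operator delete[](void* p) noexcept { operator delete(p); }
```

A real tracker would also record sizes, call sites, and use atomics or per-thread counters.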
Standard Library classes that manage data with dynamic storage duration take an allocator as one of their template arguments. The class will then make calls to an instance of the allocator for memory management. For instance you can write std::vector<int, MyAllocator<int>> somevec; or std::list<Node*, MyAllocator<Node*>> someList; to provide custom allocators to containers.
Here is an SO Q&A about allocators. The answer the link goes to includes skeleton code for an allocator that should be a good starting point for you.
I have been researching switching my allocation method from simply overloading new to using multiple allocators throughout the code base. However, how can I use multiple allocators efficiently? The only way I could devise through my research was making the allocators globals, but this seemed problematic, since it is typically a "bad idea" to use many globals.
I am looking to find out how to use multiple allocators efficiently. For example, I may have one allocator use only for a particular subsystem, and a different allocator for a different subsystem. I am not sure if the only way to do this is through using multiple global allocators, so I am hoping for a better insight and design.
In C++2003 the allocator model is broken and there isn't really a proper solution. For C++2011 the allocator model was fixed and you can have per-instance allocators which are propagated down to contained objects (unless, of course, you choose to replace them). Generally, for this to be useful you probably want to use a dynamically polymorphic allocator type, which the default std::allocator<T> is not required to be (and generally I would expect it not to be, although that may be the better implementation choice). However, [nearly] all classes in the standard C++ library which do memory allocation are templates which take the allocator type as a template argument (the IOStreams are an exception, but they generally don't allocate enough memory to warrant adding allocator support).
In several of your comments you are insisting that allocators effectively need to be global: that is definitely not correct. Each allocator-aware type stores a copy of the allocator it is given (at least, if the allocator has any instance-level data; if it doesn't, there isn't anything to store, as is e.g. the case with the default allocator using operator new() and operator delete()). This effectively means that the allocation mechanism given to an object needs to stick around as long as there is any active allocator using it.

This can be done using a global object, but it can also be done using e.g. reference counting, or by associating the allocator with an object containing all the objects to which it is given. For example, if each "document" (think XML, Excel, Pages, whatever structured file) passes an allocator to its members, the allocator can live as a member of the document and get destroyed when the document is destroyed, after all its content is destroyed. This part of the allocator model should work with pre-C++2011 classes as well, as long as they take an allocator argument.

However, pre-C++2011 classes won't pass the allocator on to contained objects. For example, if you give an allocator to a std::vector<std::string>, the C++2011 version will create the std::strings using the allocator given to the std::vector<std::string>, appropriately converted to deal with std::strings. This won't happen with pre-C++2011 allocators.
To actually use allocators in a subsystem you will effectively need to pass them around, either explicitly as an argument to your functions and/or classes or implicitly by way of allocator-aware objects which serve as a context. For example, if you use any of the standard containers as [part of] the context passed around, you can obtain the used allocator using its get_allocator() method.
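This answer predates it, but C++17 later standardized exactly this dynamically polymorphic, per-instance allocator idea as std::pmr. A sketch of the "document owns its allocator" pattern described above, using a pmr memory resource (the Document structure is illustrative; requires C++17):

```cpp
#include <cassert>
#include <memory_resource>
#include <string>
#include <vector>

// The arena lives as a member of the document and is destroyed after all
// of the document's content, matching the lifetime rule described above.
// pmr containers also propagate the resource to contained objects: the
// strings below allocate from the same arena as the vector.
struct Document {
    std::pmr::monotonic_buffer_resource arena;           // per-document heap
    std::pmr::vector<std::pmr::string> lines{&arena};    // uses the arena
};

void demo() {
    Document doc;
    doc.lines.emplace_back("hello");  // string is built with doc's resource
}   // lines destroyed first, then the arena releases everything at once
```

Members are destroyed in reverse declaration order, so declaring the arena before the container guarantees the container dies before its memory source.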
You can use placement new. This can be used either to specify a memory region, or to overload the type's static void* operator new(ARGS). Globals are not required, and are really a bad idea here if efficiency is important and your problems are demanding. You would need to hold on to one or more allocators, of course.
The best thing you can do is understand your problems and create strategies for your allocators based on the patterns in your program and on actual usage. The general purpose malloc is very good at what it does, so always use that as one baseline to measure against. If you don't know your usage patterns, your allocator will likely be slower than malloc.
Also keep in mind that the types you use will lose compatibility with the standard containers, unless you use a global or thread-local custom allocator for the standard containers -- which quickly defeats the purpose in many contexts. The alternative is to also write your own allocators and containers.
Some uses for multiple allocators include reduced CPU usage, reduced fragmentation, and fewer cache misses. So the solution really depends on what type and where your allocation bottleneck is.
CPU usage will be improved by having lockless heaps for active threads, eliminating synchronization. This can be done in your memory allocator with thread local storage.
Fragmentation will be improved by having allocations with different lifespans come from different heaps -- allocating background IO in a separate heap from the user's active task ensures the two do not confound one another. This is typically done by keeping a stack of heaps, pushing and popping as you enter and leave different functional scopes.
Cache misses will be improved by keeping allocations within a system together. Having Quadtree/Octree allocations come from their own heap will guarantee there is locality in view frustum queries. This is best done by overloading operator new and operator delete for the specific classes (e.g. OctreeNode).
According to C++ Primer 4th edition, page 755, there is a note saying:
Modern C++ programs ordinarily ought to use the allocator class
to allocate memory. It is safer and more flexible.
I don't quite understand this statement.
So far all the materials I read teach using new to allocate memory in C++.
An example of how vector class utilize allocator is shown in the book.
However, I cannot think of other scenarios.
Can anyone help to clarify this statement? and give me more examples?
When should I use allocator and when to use new? Thanks!
For general programming, yes you should use new and delete.
However, if you are writing a library, you should not!
I don't have your textbook, but I imagine it is discussing allocators in the context of writing library code.
Users of a library may want control over exactly what gets allocated from where. If all of the library's allocations went through new and delete, the user would have no way to have that fine-grained level of control.
All STL containers take an optional allocator template argument. The container will then use that allocator for its internal memory needs. By default, if you omit the allocator, it will use std::allocator which uses new and delete (specifically, ::operator new(size_t) and ::operator delete(void*)).
This way, the user of that container can control where memory gets allocated from if they desire.
Example of implementing a custom allocator for use with STL, and explanation: Improving Performance with Custom Pool Allocators for STL
Side Note: The STL approach to allocators is non-optimal in several ways. I recommend reading Towards a Better Allocator Model for a discussion of some of those issues.
Edit in 2019: The situation in C++ has improved since this answer was written. Stateful allocators are supported in C++11, and that support was improved in C++17. Some of the people involved in the "Towards a Better Allocator Model" were involved in those changes (eg: N2387), so that's nice (:
The two are not contradictory. Allocators are a PolicyPattern or StrategyPattern used by the STL libraries' container adapters to allocate chunks of memory for use with objects.
These allocators frequently optimize memory allocation by allowing
* ranges of elements to be allocated at once, and then initialized using a placement new
* items to be selected from secondary, specialized heaps depending on blocksize
One way or another, the end result will (almost always) be that the objects are allocated with new (placement or default)
Another vivid example would be how e.g. boost library implements smartpointers. Because smartpointers are very small (with little overhead) the allocation overhead might become a burden. It would make sense for the implementation to define a specialized allocator to do the allocations, so one may have efficient std::set<> of smartpointers, std::map<..., smartpointer> etc.
(Now I'm almost sure that boost actually optimizes storage for most smartpointers by avoiding any virtuals, therefore the vft, making the class a POD structure, with only the raw pointer as storage; some of the example will not apply. But then again, extrapolate to other kinds of smartpointer (refcounting smartpointers, pointers to member functions, pointers to member functions with instance reference etc. etc.))
I am working on a plugin for an application, where the memory should be allocated by the Application and keep track of it. Hence, memory handles should be obtained from the host application in the form of buffers and later on give them back to the application. Now, I am planning on using STL Vectors and I am wondering what sort of memory allocation does it use internally.
Does it use 'new' and 'delete' functions internally? If so, can I just overload 'new' and 'delete' with my own functions? Or should I create my own template allocator which looks like a difficult job for me since I am not that experienced in creating custom templates.
Any suggestions/sample code are welcome. Memory handles can be obtained from the application like this
void* bufferH = NULL;
bufferH = MemReg()->New_Mem_Handle(size_of_buffer);
MemReg()->Dispose_Mem_Handle(bufferH); //Dispose it
vector uses std::allocator by default, and std::allocator is required to use global operator new (that is, ::operator new(size_t)) to obtain the memory (20.4.1.1). However, it isn't required to call it exactly once per call to allocator::allocate.
So yes, if you replace global operator new then vector will use it, although not necessarily in a way that really allows your implementation to manage memory "efficiently". Any special tricks you want to use could, in principle, be made completely irrelevant by std::allocator grabbing memory in 10MB chunks and sub-allocating.
If you have a particular implementation in mind, you can look at how its vector behaves, which is probably good enough if your planned allocation strategy is inherently platform-specific.
STL containers use an allocator they are given at construction time, with a default allocator that uses operator new and operator delete.
If you find the default is not working for you, you can provide a custom allocator that conforms to the container's requirements. There are some real-world examples cited here.
I would measure performance using the default first, and optimize only if you really need to. The allocator abstraction offers you a relatively clean way to fine-tune here without major redesign. How you use the vector could have far more performance impact than the underlying allocator (reserve() in advance, avoid insert and removal in the middle of the range of elements, handle copy construction of elements efficiently - the standard caveats).
std::vector uses the unitialized_* functions to construct its elements from raw memory (using placement new). It allocates storage using whatever allocator it was created with, and by default, that allocator uses ::operator new(size_t) and ::operator delete(void *p) directly (i.e., not a type specific operator new).
From this article, "The concept of allocators was originally introduced to provide an abstraction for different memory models to handle the problem of having different pointer types on certain 16-bit operating systems (such as near, far, and so forth)" ...
"The standard provides an allocator that internally uses the global operators 'new' and 'delete'"
The author also points out the alocator interface isn't that scary. As Neil Buchanan would say, "try it yourself!"
The actual std::allocator has been optimized for a rather large extent of size objects. It isn't the best when it comes to allocating many small objects nor is it the best for many large objects. That being said, it also wasn't written for multi-threaded applications.
May I suggest, before attempting to write your own you check out the Hoard allocator if you're going the multi-threaded route. (Or you can check out the equally appealing Intel TBB page too.)