Boost shared_ptr does not destroy the object immediately - c++

I am developing a Bayesian inference sampler in C++ that relies heavily on a tree, which is implemented with smart pointers (Boost's shared_ptr and weak_ptr).
During inference (i.e. running a long C++ function for 1-2 minutes), the tree changes a lot, and many of its nodes are created and destroyed.
The inference process fully occupies the processor (100% load on one thread, to be precise). For some reason, new memory keeps being allocated for new nodes, but old memory is not being freed completely, which results in the program running out of memory after 1-2 minutes of inference.
However, if I add pauses to the inference process, the program seems to destroy the old objects completely, and everything works fine.
It seems to me that the destructor (or, more precisely, what happens after it, i.e. the memory release) is being delayed for some reason.
Could you please tell me:
1) Does this seem to be a real problem?
2) If yes, what is the best way to wait until "enough memory" has been released? What are the standard strategies?
(The program is being run on Unix.)

The memory problems you observe seem not to be situated in C++ itself. If a shared_ptr releases its memory, it does so immediately, not in some delayed manner. However, your operating system might delay the "real" release for some time as it sees fit. In programs like Windows' "Task Manager" that might look as if your program consumes more and more memory, while that is only the memory the OS reserves for you but you don't actually occupy. If your calculations produce such a heavy processor load, the scheduler might delay rather "unimportant" tasks (like freeing memory) until there is time, to not get in the way of more important things like your calculations.
However, freeing and allocating memory is expensive, and it seems that you alternate between freeing and allocating lots of memory. You should consider recycling that memory, either by doing your own memory management (memory pools etc.), or by recycling the objects (i.e. the nodes) themselves: instead of really destroying them, give them back to some "node pool" and reset them with new values. Both can be done in conjunction with shared_ptr.
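A minimal sketch of the "node pool" idea, assuming std::shared_ptr in place of Boost's (the custom-deleter mechanism is the same). Node, NodePool, acquire and idle are hypothetical names, not taken from the question's code. The custom deleter hands a node back to the pool instead of deleting it; the sketch assumes the pool outlives every pointer it hands out.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical tree node; the real sampler's node type will differ.
struct Node {
    double value = 0.0;
    std::weak_ptr<Node> parent;
    std::vector<std::shared_ptr<Node>> children;

    // Put the node back into a clean state before it is reused.
    void reset(double v) {
        value = v;
        parent.reset();
        children.clear();
    }
};

// Keeps released nodes on a free list and hands them out again.
// Assumption: the pool outlives every shared_ptr it has handed out.
class NodePool {
public:
    std::shared_ptr<Node> acquire(double value) {
        std::unique_ptr<Node> node;
        if (free_.empty()) {
            node.reset(new Node);            // nothing to recycle yet
        } else {
            node = std::move(free_.back());  // reuse a released node
            free_.pop_back();
        }
        node->reset(value);
        // The custom deleter returns the node to the pool instead of deleting it.
        return std::shared_ptr<Node>(node.release(),
                                     [this](Node* n) { recycle(n); });
    }

    // Number of nodes currently waiting to be reused.
    std::size_t idle() const { return free_.size(); }

private:
    void recycle(Node* n) {
        n->reset(0.0);  // drops the children, which are recycled in turn
        free_.push_back(std::unique_ptr<Node>(n));
    }

    std::vector<std::unique_ptr<Node>> free_;
};
```

With this, `auto n = pool.acquire(1.0);` behaves like any other shared_ptr, and when the last reference goes away the node lands on the free list, so the allocator is not hit again for the next node.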

Sounds like you have cycles in your tree, i.e. it uses shared_ptr for both child and parent pointers, which prevents automatic tree node destruction. You may be better off using plain pointers.
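As an alternative to plain pointers, the parent link can be a weak_ptr, which does not keep the parent alive. The following is a small sketch using the std equivalents of the Boost classes; TreeNode, add_child and the alive counter are made up for the demonstration.

```cpp
#include <memory>
#include <vector>

struct TreeNode {
    static int alive;  // counts live nodes, for demonstration only

    std::weak_ptr<TreeNode> parent;                    // non-owning back-pointer
    std::vector<std::shared_ptr<TreeNode>> children;   // owning links

    TreeNode() { ++alive; }
    ~TreeNode() { --alive; }
    TreeNode(const TreeNode&) = delete;
    TreeNode& operator=(const TreeNode&) = delete;
};
int TreeNode::alive = 0;

// Creates a child, links it to the parent and returns it.
std::shared_ptr<TreeNode> add_child(const std::shared_ptr<TreeNode>& parent) {
    auto child = std::make_shared<TreeNode>();
    child->parent = parent;            // weak: does not increase the use count
    parent->children.push_back(child);
    return child;
}
```

Because only the parent-to-child direction owns, dropping the last shared_ptr to the root destroys the whole subtree. If `parent` were a shared_ptr too, root and child would keep each other alive and neither destructor would ever run.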

When the last shared_ptr pointing to an object is destroyed, the object is deleted immediately, so what you describe sounds quite strange. The only thing I can imagine is that you have attached a garbage collector to your implementation... Otherwise, double-check that all old objects really get destroyed.

Related

Is ending a program without releasing all dynamically allocated resources risky?

I know that stack allocated resources are released in the reverse order of their allocation at the end of a function as part of RAII. I've been working on a project in which I allocate a lot of memory with "new" from a library I'm using, and am testing stuff out. I haven't added a shutdown function as a counterpart to the initialise function that does all the dynamic allocation. When the program shuts down I'm pretty sure there is no memory leak, as the allocated memory should be reclaimed by the operating system, at least on any modern OS, as explained in this question similar to mine: dynamically allocated memory after program termination.
I'm wondering two things:
1: Is there a particular order in which resources are released in this case? Does it have anything to do with your written code (i.e. the order in which you allocated them), or is it completely up to the OS to do its thing?
2: The reason I haven't made a shutdown function to reverse the initialisation is that I tell myself I'm just testing stuff now and I'll do it later. Is there any risk of damaging anything by doing what I'm doing? The worst I can imagine is what was mentioned in the answer to the question I linked: the OS fails to reclaim the memory and you end up with a memory leak even after the program exits.
I've followed the Bullet physics library tutorial and initialise a bunch of code like this:
pSolver = new btSequentialImpulseConstraintSolver;
pOverlappingPairCache = new btDbvtBroadphase();
pCollisionConfig = new btDefaultCollisionConfiguration();
pDispatcher = new btCollisionDispatcher(pCollisionConfig);
pDynamicsWorld = new btDiscreteDynamicsWorld(pDispatcher, pOverlappingPairCache, pSolver, pCollisionConfig);
And never call delete on any of it at the moment because, as I said, I'm just testing.
It depends on the resources. Open files will be closed. Memory will be freed. Destructors will not be called. Temporary files created will not be deleted.
There is no risk of having a memory leak after the program exits.
Because programs can crash, there are many mechanisms in place that prevent a process from leaking after it has stopped, and leaking usually isn't that bad.
As a matter of fact, if you have a lot of allocations that you don't delete until the end of the program, it can be faster to just have the kernel clean up after you.
However destructors are not run. This mostly causes temporary files to not be deleted.
Also it makes debugging actual memory leaks more difficult.
I suggest using std::unique_ptr<T> and not leaking in the first place.
It depends on how the memory is actually allocated and on your host system.
If you are only working with classes that don't override operator new() AND you are using a modern operating system that guarantees memory resources are released when the process exits, then all dynamically allocated memory should be released when your program terminates. There is no guarantee on the order of memory release (e.g. objects will not necessarily be released in the same order, or in reverse order, of their construction). The only real risk in this case is associated with bugs in the host operating system that cause resources of programs/processes to be improperly managed (which is a low risk, but not zero risk, for user programs on modern Windows or Unix systems).
If you are using any classes that override operator new() (i.e. that change how raw memory is allocated in the process of dynamically constructing an object) then the risk depends on how memory is actually being allocated - and what the requirements are for deallocation. For example, if the operator new() uses global or system-wide resources (e.g. mutexes, semaphores, memory that is shared between processes) then there is a risk that your program will not properly release those resources, and then indirectly cause problems for other programs which use the same resources. In practice, depending on the design of such a class, the needed cleanup might be in a destructor, an operator delete() or some combination of the two - but, however it is done, your program will need to explicitly release such objects (e.g. a delete expression that corresponds to the new expression) to ensure the global resources are properly released.
One risk is that destructors of your dynamically allocated objects will not be invoked. If your program relies on the destructor doing anything other than release dynamically allocated memory (presumably allocated by the class constructor and managed by other member functions) then the additional clean-up actions will not be performed.
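To illustrate the point about skipped destructors, here is a small sketch. TempResource, cleanup_log and run_without_cleanup are hypothetical; the log entry stands in for a side effect such as deleting a temporary file.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for an externally visible side effect (e.g. removing a temp file).
std::vector<std::string> cleanup_log;

class TempResource {
public:
    explicit TempResource(std::string name) : name_(std::move(name)) {}
    ~TempResource() { cleanup_log.push_back(name_); }  // the "extra" cleanup
    TempResource(const TempResource&) = delete;
    TempResource& operator=(const TempResource&) = delete;

private:
    std::string name_;
};

// Creates one object with a raw `new` and one owned by a unique_ptr.
// The raw pointer is returned only so that the caller can inspect it or
// delete it later; nothing in this function deletes it.
TempResource* run_without_cleanup() {
    TempResource* raw = new TempResource("raw");
    {
        std::unique_ptr<TempResource> owned(new TempResource("owned"));
    }  // ~TempResource runs here for "owned"
    return raw;  // no delete: the destructor for "raw" has not run
}
```

After `run_without_cleanup()` returns, only "owned" appears in the log. If the process exited at that point, the OS would reclaim the memory of "raw", but its destructor, and with it the cleanup action, would never run.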
If your program will ever be built and run on a host system that doesn't have a modern OS then there is no guarantee that dynamically allocated memory will be reclaimed.
If code in your program will ever be reused in a larger long-running program (e.g. your main() function is renamed, and then called from another program in a loop) then your code may cause that larger program to have a memory leak.
It's fine, since the operating system (unless it's some exotic or ancient OS) will not leak the memory after the process has ended. Same goes for sockets and file handles; they will be closed at process exit. It's not in good style to not clean up after yourself, but if you don't, there's no harm done to the overall OS environment.
However, in your example, it looks to me like the only memory that you would actually need to release yourself is that of pDynamicsWorld, since the others should be cleaned up by the btDiscreteDynamicsWorld instance. You're passing them as constructor arguments, and I suspect they get destroyed automatically when pDynamicsWorld gets destroyed. You should read the documentation to make sure.
However, it's just not in good style (because it's unsafe) to use delete anymore. So instead of using delete to destroy pDynamicsWorld, you can use a unique_ptr instead, which you can safely create using the std::make_unique function template:
#include <memory>
// ...
// Allocate everything else with 'new' here, as usual.
// ...
// Except for this one, which doesn't seem to be passed to another
// constructor.
auto pDynamicsWorld = std::make_unique<btDiscreteDynamicsWorld>(
pDispatcher, pOverlappingPairCache, pSolver, pCollisionConfig);
Now, pDispatcher, pOverlappingPairCache, pSolver and pCollisionConfig should be destroyed by pDynamicsWorld automatically, and pDynamicsWorld itself will be destroyed automatically when it goes out of scope because it's a unique_ptr.
But, again: Read the documentation of Bullet Physics to check whether the objects you pass as arguments to the constructors of the Bullet Physics classes actually do get cleaned up automatically or not.

Should I free long-lived memory that would normally be freed at the very end of the program?

I am currently writing a library that parses some structured binary data into a set of objects. These objects are expected to outlive any user code, and would normally be freed at the end of or after the main function.
I am using shared (and weak) pointers to manage the memory of each object, but it is causing a lot of added complexity to the program, and raises structural issues that I will not get into in this particular question.
Considering that:
traversing the entirety of the binary data is expensive and I cannot afford to do it more than one time,
each visited entry is used to build an object, that then gets registered (i.e. added into the set),
entries in the binary data may rely on other entries that appear later; such an entry gets parsed immediately and is registered when it is visited again,
duplicate entries may appear at any moment, but I need to merge those duplicates into one instance (and update any pointer referencing those duplicates to the new merged entry) before registration,
every single one of those objects is guaranteed to be of one of many POD types deriving a common class, so nothing except memory needs to be cleaned up,
the resulting program will run on a modern OS (or in this case, that collects memory from dead processes),
I am very tempted to just use raw pointers, never free the memory taken by those objects and let the OS do its cleanup after the process exits.
What would be the best course of action?
If you're writing reusable code, you need to at least provide the option of cleaning up. What if some program uses your library for one operation, and then continues running? It's not safe to assume that the process exits immediately after your library's task is complete.
The other answers cover the general and standard approach: in an ideal world, yes, you'd clean up your memory, because it makes the code more generic and more reusable and helps with tooling. As others have said, std::unique_ptr for owning pointers and raw pointers for non-owning pointers should work well.
There are a couple of more specialized approaches that may or may not be useful:
Use a pool allocator (such as Boost.Pool, or roll your own) to allocate a bunch of memory up front then dole out pieces of it for your objects. You can then free every object at once by deleting the pool.
Intentionally not freeing memory is occasionally a valid technique. See, e.g., "Increasing Compiler Performance by Over 75%", by Walter Bright. Of course, a compiler is a specialized problem domain, and Walter Bright is probably one of the top compiler developers alive, so techniques that work for his problem domain shouldn't be blindly applied elsewhere.
the resulting program will run on a modern OS (or in this case, that collects memory from dead processes)
I am very tempted to just use raw pointers, never free the memory taken by those objects and let the OS do its cleanup after the process exits.
If you take this approach, then anyone who uses your library and then uses valgrind to try to detect memory leaks in their program will report massive leaks coming from your library and complain to you about it, so if I were you I definitely would not do this.
If you are writing a library then you should provide a cleanup function that frees all memory that you allocated.
A practical example of why this is useful is if a Windows DLL uses your library. When the library is loaded, static data is initialized. When the library is unloaded, static data is cleared. If your library has some global pointers to memory that is never freed, then load-unload cycles of the DLL will leak memory.
If the objects are all of the same type, then rather than allocating each one independently, you could just put them all into a vector and have them refer to each other by index number instead of using pointers. The vector's built-in memory management takes care of allocating space as needed, and when you're done with the objects, you can just destroy the vector to deallocate them all at once. (Note that vector::clear() doesn't actually free the memory, though it does make it available to store a new set of objects in the vector.)
If your objects aren't all the same type, you'll want to look into the more general concept of region-based memory management. As above, the idea is that you can allocate all your objects in a relatively small number of memory chunks (possibly just one), which can be freed later without having to track all the individual objects allocated within.
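A rough sketch of the single-vector approach for the same-type case, with entries referring to each other by index. Entry, EntryStore and kNone are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <vector>

// Marks "no link"; an index that can never be valid.
constexpr std::size_t kNone = static_cast<std::size_t>(-1);

// Hypothetical entry type; links are indices into the owning vector.
struct Entry {
    int payload;
    std::size_t next;  // index of a related entry, or kNone
};

class EntryStore {
public:
    // Appends an entry and returns its index.
    std::size_t add(int payload, std::size_t next = kNone) {
        entries_.push_back(Entry{payload, next});
        return entries_.size() - 1;
    }

    const Entry& at(std::size_t index) const { return entries_.at(index); }
    std::size_t size() const { return entries_.size(); }
    std::size_t capacity() const { return entries_.capacity(); }

    // Destroys the elements but keeps the allocation for reuse.
    void clear() { entries_.clear(); }

    // Swapping with an empty temporary gives the memory back.
    void release() { std::vector<Entry>().swap(entries_); }

private:
    std::vector<Entry> entries_;
};
```

Indices stay valid when the vector reallocates, which pointers into the vector would not. The difference between clear() and release() mirrors the note above: after clear() the capacity is unchanged, while the swap hands the storage back to the allocator.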
If your ownership and lifetimes are clear I suggest you use unique_ptr for the owning pointers and raw pointers for the non-owning pointers. It should be less complex than shared_ptr and weak_ptr whilst still managing memory automatically.
I don't think not managing memory at all is an option. But I think using smart pointers to express ownership is not just about good memory management it also makes code easier to reason about.
Try to think of future maintenance work. Suppose your code needs to be broken up or other stuff done after it. In this case you're opening yourself up to leaks or being a resource hog later down the line.
Cleaning up (or being able to do so) is good. It may seem obvious now that an application should work with a single structured binary dataset throughout its entire lifetime, but you'll start kicking yourself once you realize you need to write an application that needs to reset half-way through and start over with another dataset.
(a related thing that's easy to overlook is that an application may need to work with two completely independent datasets at the same time, so try not to design your library to exclude that use case!)
That said, I think you may be focusing too much on the extremes. Code that shouldn't participate in memory management can use raw pointers, and this is reasonable when there is no risk of these pointers outliving your structured dataset in memory.
However, that doesn't mean that code that does participate in memory management needs to use raw pointers too. You can use smart pointers to manage your data structures even if you are passing raw pointers out to the user.
That aside, keep in mind that, in my experience, pointers are usually the wrong semantics: most use cases are most natural with reference or value semantics, which means you should be passing around raw references, or passing around lightweight wrapper classes that have reference or value semantics but are implemented as containing a pointer to the actual data, or even a copy of the actual data if appropriate.

Prefer heap over stack?

I recently dove into graphics programming and I noticed that many graphics engines (e.g. Ogre), and many coders overall, prefer to initialize class instances dynamically. Here's an example from Ogre Basic Tutorial 1:
//...
Ogre::Entity* ogreHead = mSceneMgr->createEntity("Head", "ogrehead.mesh");
Ogre::SceneNode* headNode = mSceneMgr->getRootSceneNode()->createChildSceneNode("HeadNode");
//...
ogreHead and headNode data members and methods are then referred to as ogreHead->blabla.
Why mess around with object pointers instead of plain objects?
BTW, I've also read somewhere that heap memory allocation is much slower than stack memory allocation.
Heap allocation is, inevitably, much slower than stack allocation. More on "How much slower?" later. However, in many cases, the choice is "made for you", for several reasons:
Stack is limited. And if you run out, the application almost always gets terminated - there is no real good recovery, even printing an error message to say "I ran out of stack" may be hard...
Stack allocation "goes away" when you leave the function where the allocation was made.
Variable sizes are much better defined and easier to deal with on the heap. C++ does not cope with "variable length arrays" very well, and they are certainly not guaranteed to work in all compilers.
How much slower is heap over stack?
We'll get to "and does it matter" in a bit.
For a given allocation, stack allocation is simply a subtract operation [1], whereas at the very minimum new or malloc will be a function call, and probably even the most simple allocator will be several dozen instructions, in complex cases thousands [because memory has to be obtained from the OS, and cleared of its previous content]. So anything from 10x to "infinitely" slower, give or take. Exact numbers will depend on the exact system the code is running on, the size of the allocation, and often "previous calls to the allocator" (e.g. a long list of "freed" allocations can make allocating a new object slower, because a good fit has to be searched for). And of course, unless you use the "ostrich" method of heap management, you also need to free the object and cope with "out of memory", which adds more code/time to the execution and complexity to the code.
With some reasonably clever programming, however, this can be mostly hidden - for example, allocating something that stays allocated for a long time, over the lifetime of the object, will be "nothing to worry about". Allocating objects from the heap for every pixel or every triangle in a 3D game would CLEARLY be a bad idea. But if the lifetime of the object is many frames or even the entire game, the time to allocate and free it will be nearly nothing.
Similarly, instead of doing 10000 individual object allocations, make one for 10000 objects. Object pool is one such concept.
Further, often the allocation time isn't where the time is spent. For example, reading a triangle list from a file from a disk will take much longer than allocating the space for the same triangle list - even if you allocate each single one!
To me, the rule is:
Does it fit nicely on the stack? Typically a few kilobytes is fine, many kilobytes not so good, and megabytes definitely not ok.
Is the number (e.g. array of objects) known, and the maximum such that you can fit it on the stack?
Do you know what the object will be? In other words abstract/polymorphic classes will probably need to be allocated on the heap.
Is its lifetime the same as the scope it is in? If not, use the heap (or stack further down, and pass it up the stack).
[1] Or add, if the stack "grows towards high addresses" - I don't know of a machine which has such an architecture, but it is conceivable and I think some have been made. C certainly makes no promises as to which way the stack grows, or anything else about how the runtime stack works.
The scope of the stack is limited: it only exists within a function. Now, modern user-interfacing programs are usually event driven, which means that a function of yours is invoked to handle an event, and then that function must return in order for the program to continue running. So, if your event handler function wishes to create an object which will remain in existence after the function has returned, clearly, that object cannot be allocated on the stack of that function, because it will cease to exist as soon as the function returns. That's the main reason why we allocate things on the heap.
There are other reasons, too.
Sometimes, the exact size of a class is not known during compilation time. If the exact size of a class is not known, it cannot be created on the stack, because the compiler needs to have precise knowledge of how much space it needs to allocate for each item on the stack.
Furthermore, factory methods like whatever::createEntity() are often used. If you have to invoke a separate method to create an object for you, then that object cannot be created on the stack, for the reason explained in the first paragraph of this answer.
Why pointers instead of objects?
Because pointers help make things fast. If you pass an object by value to another function, for example
void shoot(Ogre::Entity ogre);
instead of
void shoot(Ogre::Entity* ogrePtr);
then the whole object is copied into the function instead of just a reference to it. If the compiler doesn't optimize the copy away, you are left with an inefficient program. There are other reasons too: with a pointer, you can modify the passed-in object (some argue references are better, but that's a different discussion). Otherwise you would spend too much time copying modified objects back and forth.
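The difference can be made visible with a copy counter. This is only a sketch: Entity here is a made-up stand-in, not Ogre's class, and the shoot_* functions are hypothetical.

```cpp
// Hypothetical stand-in for a game object; not Ogre::Entity.
struct Entity {
    static int copies;  // how many times the copy constructor has run
    int health = 100;

    Entity() = default;
    Entity(const Entity& other) : health(other.health) { ++copies; }
};
int Entity::copies = 0;

// Receives its own copy: the caller's object is left untouched.
void shoot_by_value(Entity target) { target.health -= 10; }

// Receives the address: no copy, and the caller's object is modified.
void shoot_by_pointer(Entity* target) { target->health -= 10; }

// Same effect as the pointer version, with reference syntax.
void shoot_by_reference(Entity& target) { target.health -= 10; }
```

Calling shoot_by_value(ogre) runs the copy constructor once and leaves ogre.health at 100, while the pointer and reference versions make no copy and reduce the health of the original object. For a large object such as a mesh-backed entity, that copy is where the cost would come from.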
Why heap?
In some sense the heap is a safer type of memory to access and allows you to safely reset/recover. If you call new and there is no memory left, you can flag that as an error. If you are using the stack, there is no good way to know that you have caused a stack overflow without some other supervising program, at which point you are already in the danger zone.
Depends on your application. Stack has local scope so if the object goes out of scope, it will deallocate memory for the object. If you need the object in some other function, then no real way to do that.
Applies more to OS, heap is comparatively much larger than stack, especially in multi-threaded application where each thread can have a limited stack size.

How to redefine / override the behavior and the life cycle of a smart pointer?

I keep reading about the possibility of redefining the behavior of smart pointers, but to this day I can't find a real example.
Now I want to propose this problem and see if I can get a solution:
Smart pointers use reference counting or reference linking to manage their life cycle. My basic problem consists in adding a new state that can cause the release and deletion of my pointers: I would like to free my resources when an event is triggered.
It's more or less like a game, where resources are usually loaded and freed when the user passes from level 1 to level 2, so when this happens the resources from level 1 are freed. I would like to stick with this example because you can't wait for automatic reference counting and think "OK, if a resource from level 1 is not used in level 2, it will be freed automatically because it's no longer requested". That may be true, but operating on memory while a user is using the machine under stress is a really bad move.
I would like to stick with the smart pointers because I am also interested in all the other features that they offer, but they have this big downside for me and I need to manage their life cycle in a direct way.
What options do I have ?
It sounds like you just threw in a bunch of shared pointers and didn't bother thinking properly about ownership of the resources. From the description it seems clear to me that the level object should own all these resources. That means they should not be shared, and thus no reference counting is needed at all.
Using smart pointers frees you from the burden of releasing the resources manually at the appropriate time, but it doesn't free you from thinking about what the appropriate time is.
If I am correct and level should own the resources, use a smart pointer that gives unique ownership, or just use automatic objects and don't even bother with smart pointers. Every other object except the level that needs access to those objects needs a non-owning pointer or reference to it, i.e., a traditional pointer or reference.
If I am wrong and the resources should indeed be shared, then they should not be released when the level is destroyed: the other objects that were sharing ownership will not like it.
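A sketch of the unique-ownership case, assuming a hypothetical Level class that owns hypothetical Texture resources. Everything outside the level only receives plain, non-owning pointers.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource type; `alive` exists only to observe destruction.
struct Texture {
    static int alive;
    std::string name;

    explicit Texture(std::string n) : name(std::move(n)) { ++alive; }
    ~Texture() { --alive; }
    Texture(const Texture&) = delete;
    Texture& operator=(const Texture&) = delete;
};
int Texture::alive = 0;

// The level is the single owner of its resources.
class Level {
public:
    // Loads a resource and returns a non-owning pointer to it.
    // The pointer is valid only as long as the level exists.
    Texture* load(const std::string& name) {
        textures_.push_back(std::unique_ptr<Texture>(new Texture(name)));
        return textures_.back().get();
    }

private:
    std::vector<std::unique_ptr<Texture>> textures_;
};
```

Destroying the Level (for example when switching from level 1 to level 2) releases every texture at that exact moment, with no reference counts involved. The price is that nobody may use the raw pointers after the level is gone, which is exactly the ownership decision described above.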
If you have shared-pointers remaining after a level is finished, that should be due to the fact that the shared-pointers were not allocated as automatic variables on the stack of functions called during the life-time of the level, but rather are stored in some type of container or series of globally accessible containers. Thus the main issue you need to be concerned with is managing the life-cycle of the containers containing the shared-pointers that manage the resources for each level.
For instance, a smart-pointer allocated on the stack inside some function foo will only have a life-time that corresponds to the duration of the function call. Once the function call is complete, the shared-pointer is destroyed. If there are additional shared-pointers still pointing to a resource, then the resource itself won't be destroyed, but those additional shared-pointers need to reside in some location other than the stack of the callee. So your job is to manage those "other locations", which I'm guessing are most likely some series of globally accessible containers.
Therefore flushing the containers that manage the data for each level should in turn completely destroy the allocated resources for each level. If you want, you could use an event-driven interface or a simple observer pattern for triggering the flush of the containers, or it could simply be done explicitly by the destructor of an object that manages the life-time of level resources.
In the end though it comes down to resource management ... just because a shared-pointer is designed to prevent memory leaks does not mean you shouldn't keep track of how or where they are allocated. If you centralize the storage of your shared pointers, then destroying the resources they manage will not be that big a deal.
As mentioned before, shared pointers might not be the right solution for your particular problem. However, you can still manage their destruction manually if you'd like to time-slice the freeing up of your level, for example (or maybe even wait until later, when you have some spare CPU cycles, if memory pressure isn't too high or a concern). Once you are done with a resource, simply queue it up for destruction in a global queue (or several queues, perhaps one per type or by priority, etc.). Later on, simply process that queue and remove the references. This will trigger the object's destruction if it is indeed the last reference to it. You could, for example, check a timer every few iterations and, once you've spent 1ms freeing stuff, quit and continue on the next frame.
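One possible shape for such a queue, as a sketch only. DestructionQueue, defer, drain and Asset are hypothetical names, and this version checks the clock after every release rather than every few iterations.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>

// Hypothetical resource; `destroyed` exists only to observe destruction.
struct Asset {
    static int destroyed;
    ~Asset() { ++destroyed; }
};
int Asset::destroyed = 0;

// Holds references to finished resources and releases them in time slices.
class DestructionQueue {
public:
    // Takes over the caller's reference; the object is destroyed later,
    // in drain(), provided nobody else still holds a reference.
    void defer(std::shared_ptr<void> resource) {
        pending_.push_back(std::move(resource));
    }

    // Releases queued references until the queue is empty or the time
    // budget is used up. Always releases at least one entry, so repeated
    // calls make progress. Returns the number of references released.
    std::size_t drain(std::chrono::microseconds budget) {
        const auto deadline = std::chrono::steady_clock::now() + budget;
        std::size_t released = 0;
        while (!pending_.empty()) {
            pending_.pop_front();  // may run the resource's destructor
            ++released;
            if (std::chrono::steady_clock::now() >= deadline) break;
        }
        return released;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::deque<std::shared_ptr<void>> pending_;
};
```

A game loop could call something like `queue.drain(std::chrono::milliseconds(1))` once per frame. Storing the references as shared_ptr<void> lets one queue hold resources of any type, because the shared_ptr remembers how to destroy the original object.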
Performance wise, using memory pools per level (or per asset package) makes more sense. It makes for easier memory management and you can sometimes get away with freeing the whole pool at once and skip calling the destructors on all the objects within (if you know they do nothing!).
It sounds like you really shouldn't be using shared pointers at all. There are ways to override the behaviors by specifying custom deallocators but I would consider a different type of "smart pointer".
The Apache web server uses memory pools to deallocate all of the resources associated with a request by having the request own a pool of memory. When you allocate memory inside of the server, you are required to identify the pool that you want to allocate from. The server maintains a handful of memory pools each with a different lifetime - one for the server instance, another for the module, another for each request, etc. This sounds like a better match to your situation.
The Apache code uses their Apache Portable Runtime for memory management. It is written in C and might not be the best match for what you are doing. It does look like Boost has a memory pool library as well though I have never used it.
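To make the pool idea concrete, here is a hand-rolled sketch of a region ("arena") that frees everything at once. It is not the API of the Apache Portable Runtime or of Boost's pool library; Arena, create and blocks are invented for the illustration, and the sketch only accepts types that need no destructor call.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Bump allocator: objects are carved out of large blocks and are all
// released together when the arena is destroyed.
class Arena {
public:
    explicit Arena(std::size_t block_size = 4096)
        : block_size_(block_size), used_(0) {}

    // Constructs a T inside the arena. Restricted to trivially destructible
    // types because the arena never calls destructors.
    template <class T, class... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "the arena does not run destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        void* p = allocate(sizeof(T), alignof(T));
        return new (p) T(std::forward<Args>(args)...);
    }

    std::size_t blocks() const { return blocks_.size(); }

private:
    void* allocate(std::size_t size, std::size_t align) {
        // Round the current offset up to the required alignment.
        std::size_t offset = (used_ + align - 1) / align * align;
        if (blocks_.empty() || offset + size > block_size_) {
            // Start a new block; oversized requests get a block of their own.
            const std::size_t n = size > block_size_ ? size : block_size_;
            blocks_.push_back(
                std::unique_ptr<unsigned char[]>(new unsigned char[n]));
            offset = 0;
        }
        used_ = offset + size;
        return blocks_.back().get() + offset;
    }

    std::size_t block_size_;
    std::size_t used_;  // bytes used in the most recent block
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```

One arena per level (or per request, in the Apache case) ties the lifetime of every object in it to a single owner: when the arena goes away, all blocks are freed in a handful of deallocations, without visiting the individual objects.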

program terminates with dynamic memory on the heap

I will preface this question to c/c++ as it mostly pertains to that, and I have seen it have the most impact with c/c++.
this has concerned me for some time, and I understand some of this problem can be avoided (and I would like to avoid the lectures on ways to avoid, but rather focus on the aftermath just in case it does happen), but I would still have the underlying question.
initial thoughts:
A pointer simply serves as an address of an object somewhere else in memory (this can be because the number of things of that type needs to change, as with int[], or because the nature of the thing can change throughout its lifespan, as with polymorphism)
anytime the keyword new is used it should have a corresponding keyword delete (if not multiple depending on exception handling, and multiple exit points)
when a dynamically allocated memory chunk is acted upon by keyword delete the destructor is called (and its actions are performed if any), the memory chunk is returned to the system store to be made available for other things, and (depending on compiler, macros, or programmer) the pointer is set to NULL to avoid illegal memory accessing.
situation:
When I am writing a program that uses dynamic memory (a combination of pointers, new, and delete), and something happens so that the program terminates unexpectedly (unhandled exception, memory access error, illegal operation, etc.), the system should attempt to reclaim all memory that the program is using, but pointers are not always cleared. This may vary between operating systems and compilers (depending on how program termination is performed), but the things that were pointed to may still exist in memory, because all that was deleted was the pointer, and not the thing that was pointed to. Granted, this can be quite a small loss (less than a MB for a small program), but for, say, stress testing a data store or processing large files it can be quite large, possibly even in the GB range.
the direct question is what steps can be taken to get that memory back? the only thing that I have found that works is to just restart the system (this is when using g++, and VS2008/2010 on a windows system)
If the program terminates, then all memory it was using is returned to the system. At least under Windows which you say you are using. If you think this is not happening, then perhaps your program is not actually terminating at all.
The heap is bound to the allocator, and the allocator is bound to the process. When the process exits, the heap comes undone. Only system-shared resources aren't deallocated.