Preferring memory leaks in order not to slow down shutdown - C++

Chromium's documentation says:
NOTE: Both Singleton and base::LazyInstance provide "leaky" traits to leak the global on shutdown. This is often advisable (except potentially in library code where the code may be dynamically loaded into another process's address space, or when data needs to be flushed on process shutdown) in order not to slow down shutdown.
In which cases is this acceptable?

"Acceptable" is a bit tough to define, as I personally prefer to avoid this practice. For example, we use leak detection tools which would end up showing false positives with this practice unless otherwise suppressed (though I noticed in Chromium's documentation that their leaky traits suppress it: I don't do that and treat false positive warnings as something to fix rather than to suppress).
Naturally this is only acceptable if the operating systems being targeted reclaim the resources acquired by the singleton when the process shuts down. Most modern operating systems do, for the most common types of resources. But if your singleton creates, say, a massive temporary file, it would be wise to destroy it with an explicit shutdown step.
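For illustration, here is a minimal sketch of what such an intentionally leaked singleton can look like (my own hypothetical example, not Chromium's actual leaky traits):

#include <string>

// Hypothetical "leaky" singleton: heap-allocated on first use and never
// deleted, so no destructor runs at shutdown and no destruction-order
// issues can arise. The OS reclaims the memory when the process exits.
class Logger {
public:
    static Logger& instance() {
        static Logger* leaked = new Logger; // thread-safe init since C++11
        return *leaked;                     // deliberately never deleted
    }
private:
    Logger() = default;
    ~Logger() = default; // private: nothing outside can destroy the instance
    std::string buffer_;
};

int main() {
    Logger::instance(); // first use constructs; nothing ever destructs
}

A leak checker will typically still report that allocation (e.g. as "still reachable" in valgrind's terms) unless told otherwise, which is exactly the trade-off described above.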
Shutdown Speed
The cited reason to skip manual reclamation for the sake of a speedy shutdown is legitimate, but it often accompanies a mindset that allocates a boatload of memory in the teeniest chunks.
This often comes about when using linked structures or lots of teeny abstract objects allocated individually against a general-purpose allocator. If there are millions of teeny memory chunks to free individually, that can translate to painful shutdown times spanning seconds.
Yet the speeds of startup, of loading input data, and of shutdown tend to be closely related. If we want to speed all of them up, it can be helpful to preallocate and pool memory in bulk.
With this kind of bulk resource-allocation and pooling strategy, applications can start up, shut down, and generally operate quickly. This can be hard to apply, but if you have it going for you, the application will often shut down nearly instantly even with gigabytes of data loaded, and simply leaking resources will start to seem more like laziness than an optimization technique, pushing it toward the aesthetic of being "unacceptable". YMMV.
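As a sketch of what "preallocate and pool in bulk" can look like using only the standard library (C++17's polymorphic memory resources; nothing application-specific is assumed):

#include <memory_resource>
#include <vector>

int main() {
    // One big upfront block: every allocation inside it is a cheap pointer
    // bump, and individual deallocations are no-ops.
    std::pmr::monotonic_buffer_resource pool(64 * 1024 * 1024);

    std::pmr::vector<int> data(&pool);
    for (int i = 0; i < 1000000; ++i)
        data.push_back(i);
} // the pool releases its block(s) in one step here; no per-chunk frees

Shutdown cost then no longer scales with the number of objects, only with the number of pooled blocks.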
Yet in cases where there is an actual need for performance, you do what you need to do. I imagine the Chromium developers had their codebase structured in a way where this was beneficial before designing leaky traits and suppressions for leak detectors. Everyone is looking at a different kind of scenario.
Shutdown Correctness
In some really complex codebases, e.g. ones with a lot of plugins being loaded and unloaded at shutdown, just getting the damned application to shut down correctly can actually be really hard.
I worked on a codebase once where we had painted ourselves into a corner this way, and the bulk of our bugs actually related to the application crashing on shutdown (the least user-critical time to crash, but terribly annoying nevertheless).
The issue for us was singletons and other globals being created when plugins were loaded which, on creation, triggered side effects on external state (other globals or singletons living in a different module). In those cases there was a symmetrical desire to reverse those side effects upon destruction of such a global (ex: removing it from a global collection if constructing it added it to one). That introduced the issue of destruction order, which was extremely difficult here due to the decentralized code written by numerous authors and global state dispersed across numerous plugins.
When globals start depending on other globals (lazy-constructed or not) during destruction, things get pretty awkward: you have to ensure things are destroyed in the proper order so that no global ends up relying on another global that has already been invalidated (possibly even with its dylib already unloaded).
We were almost certainly doing all sorts of things wrong there, starting with using too many singletons and globals and scattering them across plugins in an ad hoc fashion with no unifying design mindset. The experience made me dislike singletons and globals of all sorts in these kinds of complex scenarios (not for the reduced flexibility, but for the difficulty of just writing correct shutdown code). Still, it probably would have been a lot easier if we hadn't tried to make these globals reverse side effects on other globals at destruction, and had simply left them there and let the operating system clean up the resources. There I could understand the temptation to just leak and not bother destroying things, or, even if we do destroy things, to not reverse side effects on other globals. It's an ugly workaround for a very faulty design, but I can understand that kind of motivation a little more easily than working around performance issues by leaking.
In any case, what's "acceptable" is going to vary a lot. It's going to range from aesthetics all the way to portability concerns. This is the first time I've encountered an efficiency argument for simply leaking singletons, but I can understand what sort of scenarios might lead to that.

When all the memory (and other resources) is reclaimed and released back to the OS at shutdown anyway. This is what happens on Linux systems, for instance.

Related

RAII vs. Garbage Collector

I recently watched Herb Sutter's great CppCon 2016 talk "Leak-Free C++...", in which he discussed using smart pointers to implement RAII (Resource Acquisition Is Initialization) concepts and how they solve most memory-leak issues.
Now I was wondering: if I strictly follow RAII rules, which seems to be a good thing, why would that be any different from having a garbage collector in C++? I know that with RAII the programmer is in full control of when resources are freed, but is that actually more beneficial than just having a garbage collector? Would it really be less efficient? I've even heard that a garbage collector can be more efficient, as it can free larger chunks of memory at a time instead of freeing small pieces all over the code.
If I strictly follow RAII rules, which seems to be a good thing, why would that be any different from having a garbage collector in C++?
While both deal with allocations, they do so in completely different manners. If you are referring to a GC like the one in Java, it adds its own overhead, removes some of the determinism from the resource-release process, and handles circular references.
You can implement a GC for particular cases, though, with very different performance characteristics. I implemented one once for closing socket connections in a high-performance/high-throughput server (just calling the socket close API took too long and borked the throughput). It involved no memory, only network connections, and no cyclic-dependency handling.
I know that with RAII the programmer is in full control of when the resources are freed again, but is that in any case beneficial to just having a garbage collector?
This determinism is a feature that GC simply doesn't offer. Sometimes you want to know that, after a certain point, a cleanup operation has been performed (deleting a temporary file, closing a network connection, etc.).
In such cases GC doesn't cut it, which is why C# (for example) has the IDisposable interface.
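In C++, a minimal sketch of that determinism (TempFile is a hypothetical helper, not a standard class):

#include <cstdio>
#include <string>

// RAII guard: the file is closed and removed at the exact point the guard
// goes out of scope, even if an exception unwinds the stack. A GC
// finalizer cannot make that timing promise.
class TempFile {
public:
    explicit TempFile(std::string path)
        : path_(std::move(path)), file_(std::fopen(path_.c_str(), "w")) {}
    ~TempFile() {
        if (file_) std::fclose(file_);
        std::remove(path_.c_str());
    }
    TempFile(const TempFile&) = delete;
    TempFile& operator=(const TempFile&) = delete;
    std::FILE* handle() const { return file_; }
private:
    std::string path_;
    std::FILE* file_;
};

void process() {
    TempFile tmp("/tmp/scratch.dat");
    // ... write intermediate results via tmp.handle() ...
} // <- by this line the cleanup has provably happened

int main() { process(); }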
I even heard that having a garbage collector can be more efficient, as it can free larger chunks of memory at a time instead of freeing small memory pieces all over the code.
Can be ... depends on the implementation.
Garbage collection solves certain classes of resource problems that RAII cannot. Basically, it boils down to circular dependencies where you do not identify the cycle beforehand.
This gives it two advantages. First, there are going to be certain types of problem that RAII cannot solve. These are, in my experience, rare.
The bigger one is that it lets the programmer be lazy and not care about memory resource lifetimes and certain other resources you don't mind delayed cleanup on. When you don't have to care about certain kinds of problems, you can care more about other problems. This lets you focus on the parts of your problem you want to focus on.
The downside is that without RAII, managing resources whose lifetime you want constrained is hard. GC languages basically reduce you to either having extremely simple scope-bound lifetimes or require you to do resource management manually, like in C, with manually stating you are done with a resource. Their object lifetime system is strongly tied to GC, and doesn't work well for tight lifetime management of large complex (yet cycle-free) systems.
To be fair, resource management in C++ takes a lot of work to do properly in such large complex (yet cycle-free) systems. C# and similar languages just make it a touch harder, in exchange they make the easy case easy.
Most GC implementations also force non-locality on full-fledged classes: creating contiguous buffers of general objects, or composing general objects into one larger object, is not something most GC implementations make easy. C#, on the other hand, lets you create value-type structs with somewhat limited capabilities. In the current era of CPU architecture, cache friendliness is key, and the lack of locality GC forces is a heavy burden. As these languages mostly run on a bytecode runtime, the JIT environment could in theory move commonly used data together, but more often than not you just get a uniform performance loss from frequent cache misses compared to C++.
The last problem with GC is that deallocation is indeterminate, and can sometimes cause performance problems. Modern GCs make this less of a problem than it has been in the past.
RAII and GC solve problems in completely different directions. They are completely different, despite what some would say.
Both address the issue that managing resources is hard. Garbage Collection solves it by making it so that the developer doesn't need to pay as much attention to managing those resources. RAII solves it by making it easier for developers to pay attention to their resource management. Anyone who says they do the same thing has something to sell you.
If you look at recent trends in languages, you're seeing both approaches used in the same language because, frankly, you really need both sides of the puzzle. Lots of languages use garbage collection of sorts so that you don't have to pay attention to most objects, and those same languages offer RAII solutions (such as Python's with statement) for the times you really do want to pay attention.
C++ offers RAII through constructors/destructors, and GC through shared_ptr (if I may argue that refcounting and GC are in the same class of solutions, since both are designed to help you not track lifespans).
Python offers RAII through with, and GC through a refcounting system plus a garbage collector.
C# offers RAII through IDisposable and using, and GC through a generational garbage collector.
The patterns are cropping up in every language.
Notice that RAII is a programming idiom, while GC is a memory management technique. So we are comparing apples with oranges.
But we can restrict RAII to its memory management aspects only and compare that to GC techniques.
The main difference between so-called RAII-based memory management techniques (which really means reference counting, at least when you consider memory resources and ignore others such as files) and genuine garbage collection techniques is the handling of circular references (in cyclic graphs).
With reference counting, you need to code specially for them (using weak references or other stuff).
In many useful cases (think of std::vector<std::map<std::string,int>>) the reference counting is implicit (since it can only be 0 or 1) and practically omitted, but the constructor and destructor (essential to RAII) behave as if there were a reference-count bit. In std::shared_ptr there is a genuine reference counter. Memory is still, in effect, manually managed (with new and delete triggered inside constructors and destructors), but the "implicit" delete in destructors gives the illusion of automatic memory management. The calls to new and delete still happen, though (and they cost time).
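A minimal sketch of the circular-reference problem, and the weak-pointer workaround, with the standard smart pointers:

#include <memory>

struct Node {
    std::shared_ptr<Node> next; // strong: keeps the target alive
    std::weak_ptr<Node> prev;   // weak: observes without owning
};

int main() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->next = b;
    b->prev = a; // if prev were a shared_ptr, a and b would keep each
                 // other alive forever (counts would never reach zero)

    if (auto p = b->prev.lock()) { // promote to shared_ptr before use
        // p is a valid shared_ptr while the owner still exists
    }
} // with weak_ptr the cycle is broken: both nodes are destroyed here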
BTW, a GC implementation may (and often does) handle circularity in some special way, but you leave that burden to the GC (e.g. read about Cheney's algorithm).
Some GC algorithms (notably generational copying garbage collectors) don't bother releasing memory for individual objects; it is released en masse after the copy. In practice the OCaml GC (or SBCL's) can be faster than a genuine C++ RAII programming style (for some, not all, kinds of algorithms).
Some GCs provide finalization (mostly used to manage non-memory external resources like files), but you'll rarely use it (since most values consume only memory). The disadvantage is that finalization offers no timing guarantee. Practically speaking, a program using finalization treats it as a last resort (e.g. files should still be closed more or less explicitly, with finalization only as a backstop).
You still can have memory leaks with GC (and also with RAII, at least when used improperly), e.g. when a value is kept in some variable or some field but will never be used in the future. They just happen less often.
I recommend reading the Garbage Collection Handbook.
In your C++ code, you might use Boehm's GC or Ravenbrook's MPS, or code your own tracing garbage collector. Of course, using a GC is a tradeoff (there are some inconveniences, e.g. non-determinism, lack of timing guarantees, etc.).
I don't think that RAII is the ultimate way of dealing with memory in all cases. In several situations, coding your program in a language with a genuinely efficient GC implementation (think of OCaml or SBCL) can be simpler (to develop) and faster (to execute) than coding it in a fancy RAII style in C++17. In other cases it is not. YMMV.
As an example, if you coded a Scheme interpreter in C++17 in the fanciest RAII style, you would still need to code (or use) an explicit GC inside it (because a Scheme heap has circularities). And most proof assistants are coded in GC-ed languages, often functional ones (the only one I know of coded in C++ is Lean), for good reasons.
BTW, I'm interested in finding such a C++17 implementation of Scheme (though less interested in coding it myself), preferably with some multi-threading ability.
One of the problems with garbage collectors is that it's hard to predict program performance.
With RAII you know the exact time a resource goes out of scope: you free some memory, and it takes some time. But unless you are a master of your garbage collector's settings, you cannot predict when cleanup will happen.
For example, cleaning up a bunch of small objects can be done more efficiently with a GC because it can free one large chunk, but it won't be a fast operation, it's hard to predict when it will occur, and because of the "large chunk cleanup" it will take some processor time and can affect your program's performance.
Roughly speaking, the RAII idiom may be better for latency and jitter, while a garbage collector may be better for a system's throughput.
"Efficient" is a very broad term, in sense of development efforts RAII is typically less efficient than GC, but in terms of performance GC is typically less efficient than RAII. However it is possible to provide contr-examples for both cases. Dealing with generic GC when you have very clear resource (de)allocation patters in managed languages can be rather troublesome, just like the code using RAII can be surprisingly inefficient when shared_ptr is used for everything for no reason.
Garbage collection and RAII each support one common construct for which the other is not really suitable.
In a garbage-collected system, code may efficiently treat references to immutable objects (such as strings) as proxies for the data contained therein; passing around such references is almost as cheap as passing around "dumb" pointers, and is faster than making a separate copy of the data for each owner, or trying to track ownership of a shared copy of the data. In addition, garbage-collected systems make it easy to create immutable object types by writing a class which creates a mutable object, populating it as desired, and providing accessor methods, all while refraining from leaking references to anything that might mutate it once the constructor finishes. In cases where references to immutable objects need to be widely copied but the objects themselves don't, GC beats RAII hands down.
On the other hand, RAII is excellent at handling situations where an object needs to acquire exclusive services from outside entities. While many GC systems allow objects to define "Finalize" methods and request notification when they are found to be abandoned, and such methods may sometimes manage to release outside services that are no longer needed, they are seldom reliable enough to provide a satisfactory way of ensuring timely release of outside services. For management of non-fungible outside resources, RAII beats GC hands down.
The key difference between the cases where GC wins versus those where RAII wins is that GC is good at managing fungible memory that can be freed on an as-needed basis, but poor at handling non-fungible resources. RAII is good at handling objects with clear ownership, but bad at handling ownerless immutable data holders which have no real identity apart from the data they contain.
Because neither GC nor RAII handles all scenarios well, it would be helpful for languages to provide good support for both of them. Unfortunately, languages which focus on one tend to treat the other as an afterthought.
The main part of the question about whether one or the other is "beneficial" or more "efficient" cannot be answered without giving lots of context and arguing about the definitions of these terms.
Beyond that, you can basically feel the tension of the ancient "Is Java or C++ the better language?" flamewar crackling in the comments. I wonder what an "acceptable" answer to this question could look like, and am curious to see it eventually.
But one potentially important conceptual difference has not yet been pointed out: with RAII, you are tied to the thread that calls the destructor. If your application is single-threaded (and even though it was Herb Sutter who stated that "The Free Lunch Is Over", most software today effectively still is single-threaded), then a single core may be kept busy handling the cleanup of objects that are no longer relevant to the actual program...
In contrast to that, the garbage collector usually runs in its own thread, or even multiple threads, and is thus (to some extent) decoupled from the execution of the other parts.
(Note: Some answers already tried to point out application patterns with different characteristics, mentioned efficiency, performance, latency and throughput - but this specific point was not mentioned yet)
RAII uniformly deals with anything that is describable as a resource. Dynamic allocations are one such resource, but they are by no means the only one, and arguably not the most important one. Files, sockets, database connections, gui feedback and more are all things that can be managed deterministically with RAII.
GCs only deal with dynamic allocations, relieving the programmer of worrying about the total volume of allocated objects over the lifetime of the program (they only have to care that the peak concurrent allocation volume fits).
RAII and garbage collection are intended to solve different problems.
When you use RAII you leave an object on the stack whose sole purpose is to clean up whatever it is you want managed (sockets, memory, files, etc.) on leaving the method's scope. This is for exception safety, not just garbage collection, which is why you get responses about closing sockets and freeing mutexes and the like. (Okay, so no one mentioned mutexes besides me.) If an exception is thrown, stack unwinding naturally cleans up the resources used by a method.
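The mutex case as a minimal sketch; the guard object left on the stack is what makes the code exception-safe:

#include <mutex>
#include <stdexcept>

std::mutex m;

void may_throw() { throw std::runtime_error("boom"); } // stand-in for real work

void update() {
    std::lock_guard<std::mutex> lock(m); // mutex acquired here
    may_throw();
} // stack unwinding runs lock's destructor: the mutex is released
  // whether update() returns normally or throws

int main() {
    try {
        update();
    } catch (const std::exception&) {
        // even after the exception, m is unlocked and usable again
    }
}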
Garbage collection is the programmatic management of memory, though you could "garbage-collect" other scarce resources if you'd like. Explicitly freeing them makes more sense 99% of the time. The only reason to use RAII for something like a file or socket is you expect the use of the resource to be complete when the method returns.
Garbage collection also deals with objects that are heap-allocated, when for instance a factory constructs an instance of an object and returns it. Having persistent objects in situations where control must leave a scope is what makes garbage collection attractive. But you could use RAII in the factory so if an exception is thrown before you return, you don't leak resources.
I even heard that having a garbage collector can be more efficient, as it can free larger chunks of memory at a time instead of freeing small memory pieces all over the code.
That's perfectly doable (and, in fact, actually done) with RAII, or with plain malloc/free. You see, you don't necessarily always use the default allocator, which only deallocates piecemeal. In certain contexts you use custom allocators with different kinds of functionality. Some allocators have the built-in ability to free everything in an allocator region at once, without having to iterate over individual allocated elements.
Of course, you then get into the question of when to deallocate everything: whether the use of those allocators (or the slab of memory with which they're associated) has to be RAIIed or not, and how.
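A deliberately tiny sketch of such an allocator (illustrative only: fixed capacity, max_align_t alignment, no per-object destructors):

#include <cstddef>
#include <cstdlib>
#include <new>

// Bump allocator: allocation is a pointer increment, individual frees are
// no-ops, and the entire region is reclaimed by one free() in the dtor.
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : base_(static_cast<char*>(std::malloc(bytes))), size_(bytes) {
        if (!base_) throw std::bad_alloc();
    }
    ~Arena() { std::free(base_); } // "deallocate everything" happens here
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        n = (n + a - 1) / a * a; // round up to keep alignment
        if (used_ + n > size_) throw std::bad_alloc();
        void* p = base_ + used_;
        used_ += n;
        return p;
    }

private:
    char* base_;
    std::size_t size_;
    std::size_t used_ = 0;
};

int main() {
    Arena arena(1024);
    int* xs = static_cast<int*>(arena.allocate(100 * sizeof(int)));
    xs[0] = 7;
} // no individual frees needed: ~Arena releases the whole region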

Implementing a memory manager in multithreaded C/C++ with dynamically sized memory pool?

Background: I'm developing a multiplatform framework of sorts that will be used as a base for both game and util/tool creation. The basic idea is to have a pool of workers, each executing in its own thread. (Furthermore, workers will also be able to be spawned at runtime.) Each thread will have its own memory manager.
I have long thought about creating my own memory management system, and I think this project will be the perfect opportunity to finally give it a try. I find such a system fitting because typical uses of this framework will often require real-time memory allocation (games and texture-editing tools).
Problems:
No generally applicable solution(?) - The framework will be used for both games/visualization (not AAA, but indie/play) and tool/application creation. My understanding is that for game development it is usual (at least for console games) to allocate a big chunk of memory only once at initialization, and then use this memory internally in the memory manager. But is this technique applicable to a more general application?
In a game you could theoretically know how much memory your scenes and resources will need, but for example, a photo editing application will load resources of all different sizes... So in the latter case a more dynamic memory "chunk size" would be needed? Which leads me to the next problem:
Moving already-allocated data and keeping pointers valid - Normally when allocating on the heap, you acquire a simple pointer to the memory chunk. In a custom memory manager, as far as I understand it, the similar approach is to return a pointer to somewhere free within the pre-allocated chunk. But what happens if the pre-allocated chunk is too small and needs to be resized or even defragmented? The data would need to be moved around in memory, and the old pointers would become invalid. Is there a way to transparently wrap these pointers in some way, but still use them "outside" the memory manager as if they were ordinary C++ pointers?
Third-party libraries - If there is no way to transparently use a custom memory management system for all memory allocation in the application, every third-party library I link against will still use the "old" OS memory allocation internally. I have learned that it is common for libraries to expose functions for setting custom allocation functions that the library will use, but it is not guaranteed that every library I use will have this ability.
Questions: Is it possible and feasible to implement a memory manager that can use a dynamically sized memory chunk pool? If so, how would defragmentation and memory resize work, without breaking currently in-use pointers? And finally, how is such a system best implemented to work with third party libraries?
I'm also thankful for any related reading material, papers, articles and whatnot! :-)
As someone who has written many memory managers and heap implementations for AAA games over the last few generations of consoles, let me tell you it's simply not worth it anymore.
Your information is old. Back in the GameCube era (circa 2003) we used to do what you describe: allocate a large chunk and carve it out manually using custom algorithms tweaked for each game.
Once virtual memory came along (the Xbox era), games got more complicated (and so made more allocations and became multithreaded), and address fragmentation made this untenable. So we switched to custom allocators that handle only certain types of requests: for instance physical memory, lock-free small-block low-fragmentation heaps, or thread-local caches of recently used blocks.
As built-in memory managers get better, it gets harder to beat them: certainly in the general case, and it's a close thing even for specific use cases. The Doug Lea allocator (or whatever the mainstream C++ Linux toolchains ship with now) and the latest Windows low-fragmentation heap are really very good, and you'd do far better investing your time elsewhere.
I've got spreadsheets at work measuring all kinds of metrics for a whole load of allocators: all the big-name ones and a fair few I've collected over the years. And basically, whilst the specialist allocators can win on a few metrics (lowest overhead per alloc, spatial proximity, lowest fragmentation, etc.), on overall metrics the mainstream ones are simply the best.
As a user of your library, my preferred option is that you just allocate memory when you need it. Use operator new/the new operator, and I can use the standard C++ mechanisms to replace those and plug in my custom heap (if I indeed have one); alternatively I can use platform-specific ways of replacing your allocations (e.g. XMemAlloc on Xbox). I don't need tagging (capturing callstacks is far superior, which I can do if I want). Lower down the list comes you giving me an interface that you'll call when you need to allocate memory; this is just a pain for you to implement, and I'll probably just forward it to operator new anyway. The worst thing you can do is "know best" and create your own custom heaps. If memory allocation performance is a problem, I'd much rather you share the solution the whole game uses than roll your own.
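For reference, the standard mechanism alluded to above looks like this; a sketch that merely forwards to malloc (a real replacement would call into the custom heap, and should also provide the array and nothrow overloads):

#include <cstdlib>
#include <new>

// Replacing the global allocation functions reroutes every new/delete
// in the program through them, including those inside linked libraries.
void* operator new(std::size_t size) {
    if (void* p = std::malloc(size)) // a custom heap would go here
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept {
    std::free(p);
}

void operator delete(void* p, std::size_t) noexcept { // sized form (C++14)
    std::free(p);
}

int main() {
    int* p = new int(1); // goes through the replacement above
    delete p;
}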
If you're looking to write your own malloc()/free(), etc., you should probably start by checking out the source code of existing systems such as dlmalloc. It is a hard problem, though, for what it's worth: writing your own malloc library is Hard, and beating existing general-purpose malloc libraries is Even Harder.
And now, here is the correct answer: DON'T IMPLEMENT YET ANOTHER MEMORY MANAGER.
It is incredibly hard to implement a memory manager that does not fail under varied usage patterns and events. You may be able to build a specific manager that works well under YOUR usage patterns, but writing one that works well for MANY users is a full-time job that almost no one has done really well. Worse, it is fantastically easy to implement a memory manager that works great 99% of the time and then, 1% of the time, crashes or suddenly consumes most or all of the available memory on your system due to unexpected heap fragmentation.
I say this as someone who has written multiple memory managers, watched multiple people write their own, and watched even more people attempt to write them and fail. This problem is deceptively difficult: not because it's hard to write templated allocators and generic types with inheritance and such, but because the other solutions given in this thread tend to fail under corner cases of load behavior. Once you start supporting byte alignments (as all real-world allocators must), heap fragmentation rears its ugly head. Cute heuristics that work great for small test programs fail miserably when subjected to large, real-world programs.
And once you get it working, someone else will need: cookies to verify against memory stomps; heap usage reporting; memory pools; pools of pools; memory leak tracking and reporting; heap auditing; chunk splitting and coalescing; thread-local storage; lookasides; CPU and process-level page faulting and protection; setting and checking and clearing "free-memory" patterns aka 0xdeadbeef; and whatever else I can't think of off the top of my head.
Writing yet another memory manager falls squarely under the heading of Premature Optimization. Since there are multiple free, good, memory managers with thousands of hours of development and testing behind them, you have to justify spending the cost of your own time in such a way that the result would provide some sort of measurable improvement over what other people have done, and you can use, for free.
If you are SURE you want to implement your own memory manager (and hopefully you are NOT sure after reading this message), read through the dlmalloc sources in detail, then read through the tcmalloc sources in detail as well, THEN make sure you understand the performance trade-offs in implementing a thread-safe versus a thread-unsafe memory manager, and why the naive implementations tend to give poor performance results.
Prepare more than one solution and let the user of the framework adopt any particular one. Policy classes to the generic allocator you develop would do this nicely.
A nice way to get around this is to wrap pointers in a class with an overloaded * operator. Make the internal data of that class only an index into the memory pool. Now you can just change the index quickly after a background thread copies the data over.
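A sketch of that idea (all names hypothetical): the handle stores a pool index instead of an address, so the pool can move or compact its storage without invalidating outstanding handles.

#include <cstddef>
#include <vector>

template <typename T>
class Pool {
public:
    std::size_t allocate(T value) {
        slots_.push_back(std::move(value)); // the buffer may relocate...
        return slots_.size() - 1;           // ...but the index stays valid
    }
    T& get(std::size_t i) { return slots_[i]; }
private:
    std::vector<T> slots_;
};

// Dereferencing looks up the current location on every access.
template <typename T>
class Handle {
public:
    Handle(Pool<T>& pool, std::size_t index) : pool_(&pool), index_(index) {}
    T& operator*() const { return pool_->get(index_); }
    T* operator->() const { return &pool_->get(index_); }
private:
    Pool<T>* pool_;
    std::size_t index_;
};

int main() {
    Pool<int> pool;
    Handle<int> h(pool, pool.allocate(42));
    return *h == 42 ? 0 : 1; // reads through the current location
}

The price is an extra indirection on every access, and a concurrent compactor would additionally need to synchronize with readers.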
Most good C++ libraries support allocators, and you should implement one. You can also overload the global operator new so your version gets used. And keep in mind that you generally won't need to think about a library allocating or deallocating large amounts of data; that is generally the client code's responsibility.

How to make Qt GUI apps in C++ without memory leaks

I haven't been able to create a Qt GUI app that didn't have over 1K 'definitely lost' bytes in valgrind. I have experimented with this, making minimal apps that just show one QWidget or extend QMainWindow, or that just create a QApplication object without showing it or executing it or both, but they always leak.
Trying to figure this out, I have read that it's because X11 or glibc has bugs, or because valgrind gives false positives. And in one forum thread it seemed to be implied that creating a QApplication object in the main function and returning the result of its exec() function, as is done in tutorials, is a "simplified" way to make GUIs (and not necessarily good, perhaps?).
The valgrind output does indeed mention libX11, libglibc, and also libfontconfig. The rest of the memory losses, 5 loss records, occur at ??? in libQtCore.so during QLibrary::setFileNameAndVersion.
If there is a more appropriate way to create GUI apps that prevents even just some of this from happening, what is it?
And if any of the valgrind output is just noise, how do I create a suppression file that suppresses the right things?
EDIT: Thank you for comments and answers!
I'm not worrying about the few lost kB themselves, but it'll be easier to find my own memory leaks if I don't have to filter several screens of errors but can normally get an "OK" from valgrind. And if I'm going to suppress warnings, I'd better know what they are, right?
Interesting to see how accepted leaks can be!
It is not uncommon for large-scale multi-thread-capable libraries such as Qt, wxWidgets, X11, etc. to set up singleton-type objects that are initialized once when a process starts, and then to make no attempt to clean up those allocations when the process shuts down.
I can assure you that anything "leaked" from a function such as QLibrary::setFileNameAndVersion() was left that way intentionally. The bits of memory left behind by X11/glibc/fontconfig are probably not bugs either.
It could be seen as bad coding practice or etiquette, but it can also greatly simplify certain types of tasks. Operating systems these days offer a very strong guarantee of cleaning up any memory or resources left open by a process when it is killed (either gracefully or by force). If an allocation is very likely to be needed for the duration of the application, including shutdown procedures (and various core components of Qt would qualify), then it can be a boon to performance to set up the allocation as soon as the library is loaded/initialized and let it persist indefinitely. Among other things, this keeps the memory available for use by any other C++ destructors that might reference it.
Since those allocations are set up only once, and from a single point in the code, there is no risk of a meaningful memory leak. It's just memory that belongs to the process and is thus cleaned up when the process is closed by the operating system.
Conclusion: if the memory leak isn't in your code, doesn't appear to grow significantly over time (and by significant these days, think megabytes), and/or clearly originates from first-time initialization code that is only ever invoked once within your app, then don't worry about it. It is probably intentional.
One way to test this is to run your code inside a loop and vary the number of iterations. If the difference between allocs and frees is independent of the number of iterations, you are likely safe.
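As for the suppression-file part of the question: valgrind can generate the stanzas for you. Run the app once with --gen-suppressions=all, copy the blocks corresponding to the one-time initialization leaks into a file, and pass that file back with --suppressions=qt.supp. A hypothetical stanza for the libQtCore loss records might look like the following (the exact frames should come from the generated output):

{
   qtcore_one_time_init
   Memcheck:Leak
   fun:malloc
   ...
   obj:*libQtCore*
}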

performance penalty of message passing as opposed to shared data

There is a lot of buzz these days about avoiding locks in favor of message-passing approaches like Erlang's, or about using immutable data structures, as in functional programming, vs. C++/Java.
But what I am concerned with is the following:
AFAIK, Erlang does not guarantee message delivery. Messages might be lost. Won't the algorithm and code bloat and become complicated again if you have to worry about lost messages? Whatever distributed algorithm you use must not depend on guaranteed delivery of messages.
What if the Message is a complicated object? Isn't there a huge performance penalty in copying and sending the messages vs. say keeping it in a shared location (like a DB that both processes can access)?
Can you really do away with shared state entirely? I don't think so. For example, in a DB you have to access and modify the same record. You cannot use message passing there. You need locking, or you assume optimistic concurrency control mechanisms and then roll back on errors. How does Mnesia work?
Also, it is not the case that you always need to worry about concurrency. Any project will also have a large body of code that doesn't have anything to do with concurrency or transactions at all (but does have performance and speed as a concern). A lot of these algorithms depend on shared state (that's why pass-by-reference and pointers are so useful).
Given this fact, writing programs in Erlang etc. is a pain because you are prevented from doing any of these things. Maybe it makes programs robust, but for things like solving a linear programming problem or computing the convex hull, performance is more important, and forcing immutability etc. on an algorithm that has nothing to do with concurrency/transactions is a poor decision. Isn't it?
That's real life: you need to account for this possibility regardless of the language/platform. In a distributed world (the real world), things fail: live with it.
Of course there is a cost: nothing is free in our universe. But shouldn't you use another medium (e.g. a file, a DB) instead of shuttling "big objects" through communication pipes? You can always use a "message" to refer to a "big object" stored somewhere.
Of course not: the idea behind functional programming / Erlang OTP is to "isolate" as much as possible the areas where "shared state" is manipulated. Furthermore, having clearly marked places where shared state is mutated helps testability and traceability.
I believe you are missing the point: there is no such thing as a silver bullet. If your application cannot be successfully built using Erlang, then don't use it. You can always build some other part of the overall system in another fashion, i.e. use a different language/platform. Erlang is no different from any other language in this respect: use the right tool for the right job.
Remember: Erlang was designed to help solve concurrent, asynchronous, and distributed problems. It isn't optimized for working efficiently on a shared block of memory, for example... unless you count interfacing with NIF functions working on shared blocks as part of the game :-)
Real-world systems are always hybrids anyway: I don't believe the modern paradigms try, in practice, to get rid of mutable data and shared state.
The objective, however, is not to need concurrent access to this shared state. Programs can be divided into the concurrent and the sequential, and use message-passing and the new paradigms for the concurrent parts.
Not all code will get the same investment: There is concern that threads are fundamentally "considered harmful". Something like Apache may need traditional concurrent threads, and a key piece of technology like that may be carefully refined over a period of years so it can blast away with fully concurrent shared state. Operating system kernels are another example where "solve the problem no matter how expensive it is" may make sense.
There is no benefit to fast-but-broken: But for new code, or code that doesn't get so much attention, it may be the case that it simply isn't thread-safe, and it will not handle true concurrency, and so the relative "efficiency" is irrelevant. One way works, and one way doesn't.
Don't forget testability: Also, what value can you place on testing? Thread-based shared-memory concurrency is simply not testable; message-passing concurrency is. So you have a situation where you can test one paradigm but not the other. What is the value of knowing the code has been tested, and what is the danger of not even knowing whether the other code will work in every situation?
A few comments on the misunderstanding you have of Erlang:
Erlang guarantees that messages will not be lost, and that they will arrive in the order sent. A basic error situation is that machine A cannot speak to machine B. When that happens, process monitors and links will trigger, and system node-down messages will be sent to the processes that registered for them. Nothing is silently dropped. Processes "crash", and supervisors (if any) try to restart them.
Objects cannot be mutated, so they are always copied. One way to secure immutability is by copying values to other Erlang processes' heaps. Another way is to allocate objects in a shared heap, send references to them in messages, and simply have no operations that mutate them. Erlang does the first, for performance! Real-time behavior suffers if you need to stop all processes to garbage-collect a shared heap. Ask Java.
There is shared state in Erlang. Erlang is not proud of it, but it is pragmatic about it. One example is the local process registry which is a global map that maps a name to a process so that system processes can be restarted and claim their old name. Erlang just tries to avoid shared state if it possibly can. ETS tables that are public are another example.
Yes, sometimes Erlang is too slow. That happens in all languages: sometimes Java is too slow, sometimes C++ is too slow. Just because a tight loop in a game had to drop down to assembly to kick off some serious SIMD-based vector mathematics, you can't deduce that everything should be written in assembly because it is the only language that is fast when it matters. What matters is being able to write systems that have good performance, and Erlang manages quite well. See the benchmarks for yaws or RabbitMQ.
Your facts are not facts about Erlang. Even if you think Erlang programming is a pain, you will find other people creating some awesome software thanks to it. You should attempt writing an IRC server in Erlang, or something else very concurrent. Even if you never use Erlang again, you will have learned to think about concurrency in another way. But of course you will use it again, because Erlang is awesomely easy.
Those that do not understand Erlang are doomed to re-implement it badly.
Okay, the original was about Lisp, but... it's true!
There are some implicit assumptions in your question: you assume that all the data can fit on one machine and that the application is intrinsically localised to one place.
What happens if the application is so large it cannot fit on one machine? What happens if the application outgrows one machine?
You don't want to have one way to program an application if it fits on one machine and a completely different way of programming it as soon as it outgrows one machine.
What happens if you want to make a fault-tolerant application? To make something fault-tolerant you need at least two physically separated machines and no sharing.
When you talk about sharing and databases, you omit to mention that things like MySQL Cluster achieve fault tolerance precisely by maintaining synchronised copies of the data on physically separated machines. There is a lot of message passing and copying under the surface that you don't see; Erlang just exposes it.
The way you program should not suddenly change to accommodate fault-tolerance and scalability.
Erlang was designed primarily for building fault-tolerant applications.
Shared data on a multi-core has its own set of problems: when you access shared data you need to acquire a lock, and if you use a global lock (the easiest approach) you can end up stopping all the cores while you access the shared data. Shared-data access on a multicore can also be problematic due to caching: if the cores have local data caches, then accessing "far away" data (in some other processor's cache) can be very expensive.
Many problems are intrinsically distributed, and the data is never available in one place at the same time; these kinds of problems fit well with the Erlang way of thinking.
In a distributed setting, "guaranteeing message delivery" is impossible: the destination machine might have crashed. Erlang thus cannot guarantee message delivery. It takes a different approach: the system will tell you if it failed to deliver a message (but only if you have used the link mechanism), and then you can write your own custom error recovery.
For pure number crunching, Erlang is not appropriate. But in a hybrid system Erlang is good at managing how computations get distributed to the available processors, so we see a lot of systems where Erlang manages the distribution and fault-tolerant aspects of the problem while the problem itself is solved in another language.
For e.g. in a DB, you have to access and modify the same record
But that is handled by the DB. As a user of the database, you simply execute your query, and the database ensures it is executed in isolation.
As for performance, one of the most important things about eliminating shared state is that it enables new optimizations. Shared state is not particularly efficient: you get cores fighting over the same cache lines, and data has to be written through to memory when it could otherwise stay in a register or in CPU cache.
Many compiler optimizations rely on absence of side effects and shared state as well.
You could say that a stricter language guaranteeing these things requires more optimizations to be performant than something like C, but it also makes these optimizations much much easier for the compiler to implement.
Many concerns similar to concurrency issues arise even in single-threaded code. Modern CPUs are pipelined, execute instructions out of order, and can run 3-4 of them per cycle. So even in a single-threaded program, it is vital that the compiler and CPU are able to determine which instructions can be interleaved and executed in parallel.
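A small C++ illustration of the cache-line point (not Erlang-specific; 64 bytes is assumed as a typical cache-line size):

#include <atomic>
#include <thread>

// Two counters on the same cache line make cores "fight": each write by
// one core invalidates the line in the other core's cache. alignas(64)
// pads them onto separate lines, removing the false sharing.
struct Counters {
    alignas(64) std::atomic<long> a{0};
    alignas(64) std::atomic<long> b{0};
};

int main() {
    Counters c;
    std::thread t1([&c] {
        for (int i = 0; i < 10000000; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&c] {
        for (int i = 0; i < 10000000; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
}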
For correctness, shared is the way to go, and keep the data as normalized as possible. For immediacy, send messages to inform of changes, but always back them up with polling. Messages get dropped, duplicated, re-ordered, delayed - don't rely on them.
If speed is what you're worried about, first do it single-thread and tune the daylights out of it. Then if you've got multiple cores and know how to split up the work, use parallelism.
Erlang provides supervisors and gen_server callbacks for synchronous calls, so you will know if a message isn't delivered: either the gen_server call returns a timeout, or your whole node will be brought down and back up if the supervisor is triggered.
Usually, if the processes are on the same node, message-passing languages optimise away the data copying, so it's almost like shared memory, except in the case where the object is changed and then used by both afterward, which cannot safely be done with shared memory either anyway.
Some state is kept by processes passing it to themselves in recursive tail calls, and some state can of course be passed through messages. I don't use Mnesia much, but it is a transactional database, so once you have passed an operation to Mnesia (and it has returned), you are pretty much guaranteed it will go through.
Which is why it is easy to tie such applications into Erlang with the use of ports or drivers. The easiest are ports; a port is much like a Unix pipe, though I think the performance isn't that great... and as said, message passing usually ends up just being pointer passing anyway, as the VM/compiler optimises the memory copy away.

Garbage Collection in C++ -- why?

I keep hearing people complaining that C++ doesn't have garbage collection. I also hear that the C++ Standards Committee is looking at adding it to the language. I'm afraid I just don't see the point to it... using RAII with smart pointers eliminates the need for it, right?
My only experience with garbage collection was on a couple of cheap eighties home computers, where it meant that the system would freeze up for a few seconds every so often. I'm sure it has improved since then, but as you can guess, that didn't leave me with a high opinion of it.
What advantages could garbage collection offer an experienced C++ developer?
I keep hearing people complaining that C++ doesn't have garbage collection.
I am so sorry for them. Seriously.
C++ has RAII, and I always complain when I find no RAII (or a castrated RAII) in garbage-collected languages.
What advantages could garbage collection offer an experienced C++ developer?
Another tool.
Matt J put it quite rightly in his post (Garbage Collection in C++ -- why?): we don't need C++ features, as most of them could be coded in C, and we don't need C features, as most of them could be coded in assembly, etc. C++ must evolve.
As a developer: I don't care about GC. I have tried both RAII and GC, and I find RAII vastly superior. As Greg Rogers said in his post (Garbage Collection in C++ -- why?), memory leaks are not so terrible (at least in C++, where they are rare if C++ is used properly) as to justify GC over RAII. GC has non-deterministic deallocation/finalization and is just a way to write code that simply doesn't care about specific memory choices.
That last sentence is important: it is important to be able to write code that "just doesn't care". In the same way that with C++ RAII we don't care about freeing resources, because RAII does it for us, or about object initialization, because the constructor does it for us, it is sometimes important to just code without caring about who owns which memory, or what kind of pointer (shared, weak, etc.) we need for this or that piece of code. There seems to be a need for GC in C++ (even if I personally fail to see it).
An example of good GC use in C++
Sometimes, in an app, you have "floating data". Imagine a tree-like structure of data where no one really "owns" the data (and no one really cares when exactly it gets destroyed). Multiple objects can use it and then discard it. You want it to be freed when no one is using it anymore.
The C++ approach is to use a smart pointer. boost::shared_ptr comes to mind, so each piece of data is owned by its own shared pointer. Cool. The problem arises when pieces of data can refer to one another: you cannot rely on shared pointers alone, because their reference counters won't cope with circular references (A points to B, and B points to A). So you must then think hard about where to use weak pointers (boost::weak_ptr) and where to use shared pointers.
With a GC, you just use the tree-structured data.
The downside is that you must not care when the "floating data" will really be destroyed: only that it will be destroyed.
Conclusion
So in the end, if done properly, and compatible with the current idioms of C++, GC would be a Yet Another Good Tool for C++.
C++ is a multiparadigm language: adding a GC will perhaps make some C++ fanboys cry treason, but in the end it could be a good idea, and I guess the C++ Standards Committee won't let this kind of major feature break the language. We can trust them to do the work necessary to enable a correct C++ GC that doesn't interfere with the rest of C++: as always in C++, if you don't need a feature, don't use it, and it will cost you nothing.
The short answer is that garbage collection is very similar in principle to RAII with smart pointers. If every piece of memory you ever allocate lies within an object, and that object is only referred to by smart pointers, you have something close to garbage collection (potentially better). The advantage comes from not having to be so judicious about scoping and smart-pointering every object, and letting the runtime do the work for you.
This question seems analogous to "what does C++ have to offer the experienced assembly developer? instructions and subroutines eliminate the need for it, right?"
With the advent of good memory checkers like valgrind, I don't see much use for garbage collection as a safety net "in case" we forgot to deallocate something, especially since it doesn't help much in managing the more generic case of resources other than memory (even though those are much less common). Besides, explicitly allocating and deallocating memory (even with smart pointers) is fairly rare in the code I've seen, since containers are usually a much simpler and better way.
But garbage collection can potentially offer performance benefits, especially if a lot of short-lived objects are being heap-allocated. GC also potentially offers better locality of reference for newly created objects (comparable to objects on the stack).
The motivating factor for GC support in C++ appears to be lambda programming, anonymous functions etc. It turns out that lambda libraries benefit from the ability to allocate memory without caring about cleanup. The benefit for ordinary developers would be simpler, more reliable and faster compiling lambda libraries.
GC also helps simulate infinite memory; the only reason you need to delete PODs is that you need to recycle memory. If you have either GC or infinite memory, there is no need to delete PODs anymore.
I don't understand how one can argue that RAII replaces GC, or is vastly superior. There are many cases handled by a GC that RAII simply cannot deal with at all. They are different beasts.
First, RAII is not bulletproof: it works against some common failure modes that are pervasive in C++, but there are many cases where RAII does not help at all; it is fragile in the face of asynchronous events (like signals under UNIX). Fundamentally, RAII relies on scoping: when a variable goes out of scope, it is automatically freed (assuming the destructor is correctly implemented, of course).
Here is a simple example where neither auto_ptr nor RAII can help you:
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <memory>
using namespace std;

volatile sig_atomic_t got_sigint = 0;

class A {
public:
    A() { printf("ctor\n"); }
    ~A() { printf("dtor\n"); }
};

void catch_sigint(int sig)
{
    got_sigint = 1;
}

/* Emulate an expensive computation */
void do_something()
{
    sleep(3);
}

void handle_sigint()
{
    printf("Caught SIGINT\n");
    exit(EXIT_FAILURE); /* exit() does not unwind the stack: no destructors run */
}

int main(void)
{
    A a;                   /* stack object, relies on RAII */
    auto_ptr<A> aa(new A); /* deprecated smart pointer, but beside the point here */
    signal(SIGINT, catch_sigint);
    while (1) {
        if (got_sigint == 0) {
            do_something();
        } else {
            handle_sigint();
            return -1;
        }
    }
}
The destructor of A will never be called. Of course, this is an artificial and somewhat contrived example, but a similar situation can actually happen, for example when your code is called by other code which handles SIGINT and over which you have no control at all (concrete example: MEX extensions in MATLAB). It is the same reason why finally in Python does not guarantee that something executes. A GC can help you in this case.
Other idioms do not play well with this either: in any non-trivial program, you will need stateful objects (I am using the word object in a very broad sense here; it can be any construction the language allows), and if you need to control state outside one function, you can't easily do that with RAII (which is why RAII is not that helpful for asynchronous programming). OTOH, a GC has a view of the whole memory of the process, that is, it knows about all the objects it allocated, and can clean up asynchronously.
It can also be much faster to use a GC, for the same reasons: if you need to allocate and deallocate many objects (particularly small objects), a GC will vastly outperform RAII unless you write a custom allocator, since the GC can allocate and clean up many objects in one pass. Some well-known C++ projects use a GC even where performance matters (see for example Tim Sweeney on the use of GC in Unreal Tournament: http://lambda-the-ultimate.org/node/1277). GC basically increases throughput at the cost of latency.
Of course, there are cases where RAII is better than GC; in particular, the GC concept is mostly concerned with memory, and that's not the only resource. Things like files etc. can be handled well with RAII. Languages without manual memory handling, like Python or Ruby, do have something like RAII for those cases, BTW (the with statement in Python). RAII is very useful when you need precise control over when a resource is freed, and that's quite often the case for files or locks, for example.
The committee isn't adding garbage collection; they are adding a couple of features that allow garbage collection to be implemented more safely. Only time will tell whether they actually have any effect on future compilers. Specific implementations could vary widely, but they will most likely involve reachability-based collection, which could involve a slight pause, depending on how it's done.
One thing is certain, though: no standards-conformant garbage collector will be able to call destructors; it can only silently reuse lost memory.
What advantages could garbage collection offer an experienced C++ developer?
Not having to chase down resource leaks in your less-experienced colleagues' code.
It's an all-too-common error to assume that because C++ does not have garbage collection baked into the language, you can't use garbage collection in C++, period. This is nonsense. I know of elite C++ programmers who use the Boehm collector as a matter of course in their work.
Garbage collection allows you to postpone the decision about who owns an object.
C++ uses value semantics, so with RAII objects are indeed reclaimed when going out of scope. This is sometimes referred to as "immediate GC".
Once your program starts using reference semantics (through smart pointers etc.), the language no longer supports you; you're left to the wit of your smart-pointer library.
The tricky thing about GC is deciding upon when an object is no longer needed.
Garbage collection makes RCU lockless synchronization much easier to implement correctly and efficiently.
Easier thread safety and scalability
There is one property of GC which may be very important in some scenarios. Pointer assignment is naturally atomic on most platforms, while creating thread-safe reference-counted ("smart") pointers is quite hard and introduces significant synchronization overhead. As a result, smart pointers are often said "not to scale well" on multi-core architectures.
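A small sketch of why that is (the Config type and publish function are hypothetical): publishing a new snapshot is a single atomic pointer store, but without a GC, safely reclaiming the old snapshot is the hard part.

#include <atomic>
#include <string>

struct Config { std::string host; int port; };

std::atomic<const Config*> g_config{new Config{"localhost", 80}};

void publish(const Config* next) {
    // the publication itself is one atomic operation -- no ref-count traffic
    const Config* old = g_config.exchange(next, std::memory_order_acq_rel);
    // reclaiming `old` is the hard part: another thread may still be reading
    // it. A tracing GC frees it once unreachable; without one you need hazard
    // pointers, RCU, or a thread-safe ref count (with its sync overhead).
    // This sketch deliberately leaks `old` to keep readers safe.
    (void)old;
}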
Garbage collection is really the basis for automatic resource management. And having GC changes the way you tackle problems in a way that is hard to quantify. For example when you are doing manual resource management you need to:
Consider when an item can be freed (are all modules/classes finished with it?)
Consider whose responsibility it is to free a resource when it is ready to be freed (which class/module should free this item?)
In the trivial case there is no complexity. E.g. you open a file at the start of a method and close it at the end. Or the caller must free this returned block of memory.
Things start to get complicated quickly when you have multiple modules that interact with a resource and it is not as clear who needs to clean up. The end result is that the whole approach to tackling a problem includes certain programming and design patterns which are a compromise.
In languages that have garbage collection you can use a disposable pattern: you free the resources you know you've finished with, and if you fail to free them, the GC is there to save the day.
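A rough C++ analogue of that disposable pattern, as an illustrative sketch (LogFile is a made-up name, not a library API): release eagerly when you know you are done, with the destructor as the backstop that plays the role the GC plays in managed languages.

#include <cstdio>

class LogFile {
    std::FILE* f_;
public:
    explicit LogFile(const char* path) : f_(std::fopen(path, "w")) {}
    LogFile(const LogFile&) = delete;
    LogFile& operator=(const LogFile&) = delete;
    void close() {                      // eager release, like Dispose()
        if (f_) { std::fclose(f_); f_ = nullptr; }
    }
    ~LogFile() { close(); }             // backstop if close() was forgotten
};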
Smart pointers are actually a perfect example of the compromises I mentioned. Smart pointers can't save you from leaking cyclic data structures unless you have a backup mechanism. To avoid this problem you often compromise and avoid using a cyclic structure even though it may otherwise be the best fit.
I, too, doubt that the C++ committee is adding full-fledged garbage collection to the standard.
But I would say that the main reason for adding/having garbage collection in a modern language is that there are too few good reasons against it. Since the eighties there have been several huge advances in the field of memory management and garbage collection, and I believe there are even garbage-collection strategies that can give you soft-real-time-like guarantees (like "GC won't take more than .... in the worst case").
using RAII with smart pointers eliminates the need for it, right?
Smart pointers can be used to implement reference counting in C++, which is a form of garbage collection (automatic memory management), but production GCs no longer use reference counting because it has some important deficiencies:
Reference counting leaks cycles. Consider A↔B: objects A and B refer to each other, so each has a reference count of 1 and neither is collected, although both should be reclaimed. Advanced algorithms like trial deletion solve this problem but add a lot of complexity. Using weak_ptr as a workaround is falling back to manual memory management (see the sketch after this list).
Naive reference counting is slow for several reasons. Firstly, it requires out-of-cache reference counts to be bumped often (Boost's shared_ptr has been measured at up to 10× slower than OCaml's garbage collection). Secondly, destructors injected at the end of scope can incur unnecessary and expensive virtual function calls and inhibit optimizations such as tail-call elimination.
Scope-based reference counting keeps floating garbage around, as objects are not recycled until the end of scope, whereas tracing GCs can reclaim them as soon as they become unreachable: can a local allocated before a loop be reclaimed during the loop? (Also sketched below.)
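Here is the A↔B cycle from the first point as a minimal, self-contained sketch; the weak_ptr workaround is shown in a comment.

#include <memory>

struct Node {
    std::shared_ptr<Node> next;   // strong link keeps the peer alive
    // std::weak_ptr<Node> next; // the manual workaround: break the cycle
};

int main() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->next = b;
    b->next = a;  // A<->B: each count is 2, dropping only to 1 at scope exit
}   // neither Node is ever destroyed -- a tracing collector would reclaim both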
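And the floating-garbage question from the last point, sketched with hypothetical placeholder functions consume and work:

#include <memory>
#include <vector>

void consume(const std::vector<int>&) {}  // placeholder for the last real use
void work() {}                            // placeholder loop body

void f() {
    auto big = std::make_shared<std::vector<int>>(1000000);
    consume(*big);      // last use of the buffer
    // big.reset();     // without this manual hint, scope-based reclamation
    for (int i = 0; i < 100; ++i)  // keeps the buffer alive through the loop;
        work();                    // a tracing GC could reclaim it as soon as
}                                  // it became unreachable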
What advantages could garbage collection offer an experienced C++ developer?
Productivity and reliability are the main benefits. For many applications, manual memory management requires significant programmer effort. By simulating an infinite-memory machine, garbage collection liberates the programmer from this burden, allowing them to focus on problem solving, and it avoids some important classes of bugs (dangling pointers, missing free, double free). Furthermore, garbage collection facilitates other forms of programming, e.g. by solving the upwards funarg problem (1970).
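The upwards funarg problem is the returning-a-closure case; here is a C++ sketch of the choice a GC spares you from making (make_counter is my own illustrative name):

#include <functional>

std::function<int()> make_counter() {
    int n = 0;
    return [n]() mutable { return ++n; };     // by value: safe, state is copied
    // return [&n]() mutable { return ++n; }; // by reference: dangles after
}                                             // return -- undefined behavior
// In a GC'd language the captured environment simply stays alive for as long
// as the returned function does; there is no capture mode to get wrong.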
In a framework that supports GC, a reference to an immutable object such as a string may be passed around in the same way as a primitive. Consider the class (C# or Java):
public class MaximumItemFinder
{
    String maxItemName = "";
    int maxItemValue = -2147483647 - 1;  // Integer.MIN_VALUE, written this way to avoid a literal-overflow warning
    public void AddAnother(int itemValue, String itemName)
    {
        if (itemValue >= maxItemValue)
        {
            maxItemValue = itemValue;
            maxItemName = itemName;
        }
    }
    public String getMaxItemName() { return maxItemName; }
    public int getMaxItemValue() { return maxItemValue; }
}
Note that this code never has to do anything with the contents of any of the strings, and can simply treat them as primitives. A statement like maxItemName = itemName; will likely generate two instructions: a register load followed by a register store. The MaximumItemFinder will have no way of knowing whether callers of AddAnother are going to retain any reference to the passed-in strings, and callers will have no way of knowing how long MaximumItemFinder will retain references to them. Callers of getMaxItemName will have no way of knowing if and when MaximumItemFinder and the original supplier of the returned string have abandoned all references to it. Because code can simply pass string references around like primitive values, however, none of those things matter.
Note also that while the class above would not be thread-safe in the presence of simultaneous calls to AddAnother, any call to getMaxItemName would be guaranteed to return a valid reference to either an empty string or one of the strings that had been passed to AddAnother. Thread synchronization would be required if one wanted to ensure any relationship between the maximum-item name and its value, but memory safety is assured even in its absence.
I don't think there's any way to write a method like the above in C++ which would uphold memory safety in the presence of arbitrary multi-threaded usage without either using thread synchronization or else requiring that every string variable have its own copy of its contents, held in its own storage space, which may not be released or relocated during the lifetime of the variable in question. It would certainly not be possible to define a string-reference type which could be defined, assigned, and passed around as cheaply as an int.
Garbage Collection Can Make Leaks Your Worst Nightmare
A full-fledged GC that handles things like cyclic references would be something of an upgrade over a ref-counted shared_ptr. I would cautiously welcome it in C++, but not at the language level.
One of the beauties about C++ is that it doesn't force garbage collection on you.
I want to correct a common misconception: the myth that garbage collection somehow eliminates leaks. In my experience, the worst nightmares of debugging code written by others and trying to spot the most expensive logical leaks have involved garbage collection, with languages like Python embedded in a resource-intensive host application.
When talking about subjects like GC, there's theory and then there's practice. In theory it's wonderful and prevents leaks. Yet in theory every language is wonderful and leak-free, since in theory everyone writes perfectly correct code and tests every single case where a piece of code could go wrong.
Garbage collection combined with less-than-ideal team collaboration caused the worst, hardest-to-debug leaks in our case.
The problem still has to do with ownership of resources. You have to make clear design decisions here when persistent objects are involved, and garbage collection makes it all too easy to think that you don't.
Given some resource, R, in a team environment where the developers aren't constantly communicating and reviewing each other's code carefully at all times (something a little too common in my experience), it becomes quite easy for developer A to store a handle to that resource. Developer B does as well, perhaps in an obscure way that indirectly adds R to some data structure. So does C. In a garbage-collected system, this has created three owners of R.
Because developer A was the one who created the resource originally and thinks of himself as its owner, he remembers to release his reference to R when the user indicates that it is no longer wanted. After all, if he failed to do so, nothing would happen, and it would be obvious from testing that the user-end removal logic did nothing. So he remembers to release it, as any reasonably competent developer would. This triggers an event which B handles, and B likewise remembers to release his reference to R.
However, C forgets. He's not one of the stronger developers on the team: a somewhat fresh recruit who has only worked in the system for a year. Or maybe he's not even on the team, just a popular third-party developer writing plugins for our product that many users add to the software. With garbage collection, this is when we get those silent logical resource leaks. They're the worst kind: they do not necessarily manifest in the user-visible side of the software as an obvious bug, other than the fact that the longer the program runs, the higher its memory usage climbs for some mysterious reason. Trying to narrow these issues down with a debugger can be about as fun as debugging a time-sensitive race condition.
Without garbage collection, developer C would have created a dangling pointer. He might try to access it at some point and cause the software to crash. Now that's a bug that shows up in testing and to users. C gets a bit embarrassed and corrects it. In the GC scenario, just trying to figure out where the system is leaking may be so difficult that some of the leaks are never corrected. These are not valgrind-type physical leaks that can be detected easily and pinpointed to a specific line of code.
With garbage collection, developer C has created a very mysterious leak. His code may continue to access R, which is now just some invisible entity in the software, irrelevant to the user at this point but still in a valid state. And as C's code creates more leaks, he's creating more hidden processing on irrelevant resources, and the software is not only leaking memory but also getting slower and slower.
So garbage collection does not necessarily mitigate logical resource leaks. In less-than-ideal scenarios, it can make it far easier for leaks to go silently unnoticed and remain in the software. The developers might get so frustrated trying to trace down their GC logical leaks that they simply tell their users to restart the software periodically as a workaround. GC does eliminate dangling pointers, and in safety-critical software where crashing is completely unacceptable under any scenario, I would prefer it. But I often work on less safety-critical but resource-intensive, performance-critical products, where a crash that can be fixed promptly is preferable to a really obscure and mysterious silent bug, and where resource leaks are not trivial bugs.
In both of these cases, we're talking about persistent objects not residing on the stack, like a scene graph in a 3D application, the video clips available in a compositor, or the enemies in a game world. When resources tie their lifetimes to the stack, both C++ and GC languages tend to make it trivial to manage resources properly. The real difficulty lies in persistent resources referencing other resources.
In C or C++, you can get dangling pointers and segfault crashes if you fail to clearly designate who owns a resource and when handles to it should be released (e.g., set to null in response to an event). Yet with GC, that loud and obnoxious but often easy-to-spot crash is exchanged for a silent resource leak that may never be detected.