Is there an acceptable limit for memory leaks? - c++

I've just started experimenting with SDL in C++, and I thought checking for memory leaks regularly may be a good habit to form early on.
With this in mind, I've been running my 'Hello world' programs through Valgrind to catch any leaks, and although I've removed everything except the most basic SDL_Init() and SDL_Quit() statements, Valgrind still reports 120 bytes lost and 77k still reachable.
My question is: Is there an acceptable limit for memory leaks, or should I strive to make all my code completely leak-free?

Be careful that Valgrind isn't picking up false positives in its measurements.
Many naive implementations of memory analyzers flag lost memory as a leak when it isn't really.
Maybe have a read of some of the papers in the external links section of the Wikipedia article on Purify. I know that the documentation that comes with Purify describes several scenarios where you get false positives when trying to detect memory leaks and then goes on to describe the techniques Purify uses to get around the issues.
BTW I'm not affiliated with IBM in any way. I've just used Purify extensively and will vouch for its effectiveness.
Edit: Here's an excellent introductory article covering memory monitoring. It's Purify specific but the discussion on types of memory errors is very interesting.
HTH.
cheers,
Rob

You have to be careful with the definition of "memory leak". Something which is allocated once on first use, and freed on program exit, will sometimes be shown up by a leak-detector, because it started counting before that first use. But it's not a leak (although it may be bad design, since it may be some kind of global).
To see whether a given chunk of code leaks, you might reasonably run it once, then clear the leak-detector, then run it again (this of course requires programmatic control of the leak detector). Things which "leak" once per run of the program usually don't matter. Things which "leak" every time they're executed usually do matter eventually.
I've rarely found it too difficult to hit zero on this metric, which is equivalent to observing creeping memory usage as opposed to lost blocks. I had one library where it got so fiddly, with caches and UI furniture and whatnot, that I just ran my test suite three times over, and ignored any "leaks" which didn't occur in multiples of three blocks. I still caught all or almost all the real leaks, and analysed the tricky reports once I'd got the low-hanging fruit out of the way. Of course the weaknesses of using the test suite for this purpose are (1) you can only use the parts of it that don't require a new process, and (2) most of the leaks you find are the fault of the test code, not the library code...

Living with memory leaks (and other careless issues) is, at its best, (in my opinion) very bad programming. At its worst it makes software unusable.
You should avoid introducing them in the first place and run the tools you and others have mentioned to try to detect them.
Avoid sloppy programming - there are enough bad programmers out there already - the world doesn't need another one.
EDIT
I agree - many tools can provide false positives.

If you are really worried about memory leaking, you will need to do some calculations.
You need to test your application for like, an hour and then calculate the leaked memory. This way, you get a leaked memory bytes/minute value.
Now, you will need to estimate the average length of the session of your program. For example, for notepad.exe, 15 minutes sounds like a good estimation for me.
If (average session length)*(leaked bytes / minute) > 0.3 * (memory space normally occupied by your process), then you should probably do some more efforts to reduce memory leaks. I just made up 0.3, use common sense to determine your acceptable threshold.
Remember that an important aspect of being a programmer is being a Software Engineer, and very often Engineering is about choosing the least worst option from two or more bad options. Maths always comes handy when you need to measure how bad an option is actually.

Most OSes (including Windows) will give back all of a program's allocated memory when the program is unloaded. This includes any memory which the program itself may have lost track of.
Given that, my usual theory is that it's perfectly fine to leak memory during startup, but not OK to do it during runtime.
So really the question isn't if you are leaking any memory, it is if you are continually leaking it during your program's runtime. If you use your program for a while, and no matter what you do it stays at 120 bytes lost rather than increasing, I'd say you have done great. Move on.

For a desktop application, small memory leaks are not a real problem. For services (servers) no memory leaks are acceptable.

It depends on your application. Some leaking may be unavoidable (due to the time needed to find the leak v.s. deadlines). As long as your application can run as long as you want, and not take an crazy amount of memory in that time it's probably fine.

It does look like SDL developers don't use Valgrind, but I basically only care about those 120 bytes lost.
With this in mind, I've been running my 'Hello world' programs through Valgrind to catch any leaks, and although I've removed everything except the most basic SDL_Init() and SDL_Quit() statements, Valgrind still reports 120 bytes lost and 77k still reachable.
Well, with Valgrind, "still reachable memory" is often not really leaked memory, especially in such a simple program. I can bet safely that there is basically no allocation in SDL_Quit(), so the "leaks" are just structures allocated once by SDL_Init().
Try adding useful work and seeing if those amounts increase; try making a loop of useful work (like creating and destroying some SDL structure) and see if the amount of leaks grows with the amount of iterations. In the latter case, you should check the stack traces of the leaks and fix them.
Otherwise, those 77k leaks count as "memory which should be freed at program end, but for which they rely on the OS to free it.
So, actually, I'm more worried right now by those 120 bytes, if they are not false positives, and they are usually few. False positives with Valgrind are mostly cases where usage of uninitialized memory is intended (for instance because it is actually padding).

As per Rob Wells' comments on Purify, download and try out some of the other tools out there. I use BoundsChecker and AQTime, and have seen different false positives in both over the years. Note that the memory leak might also be in a third party component, which you may want to exclude from your analysis. From example, MFC had a number of memory leaks in the first view versions.
IMO, memory leaks should be tracked down for any code that is going into a code base that may have a long life. If you can't track them down, at least make a note that they exist for the next user of the same code.

Firstable memory leaks are only a serious problem when they grow with time, otherwise the app just looks a little bigger from the outside (obviously there's a limit here too, hence the 'serious').
When you have a leak that grows with time you might be in trouble. How much trouble depends on the circumstances though. If you know where the memory is going and can make sure that you'll always have enough memory to run the program and everything else on that machine you are still somewhat fine.
If you don't know where the memory is going however, i wouldn't ship the program and keep digging.

With SDL on Linux in particular, there seem to be some leaking in the underlying X windows library. There's nothing much you can do about those (unless you want to try to fix the library itself, which is probably not for the faint hearted).
You can use valgrind's suppression mechanism (see --suppressions and --gen-suppressions in the valgrind man page) to tell it not to bother you with these errors.
In general we do have to be a little more lenient with third party libraries; while we should absolutely not accept memory leaks in our own code, and the presence of memory leaks should be a factor when choosing between alternative third party libraries, sometimes there's no choice but to ignore them (though it may be a good idea to report them to the library maintainer).

Related

What are the issues with undeleted dynamic instance? [duplicate]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Is it ever acceptable to have a memory leak in your C or C++ application?
What if you allocate some memory and use it until the very last line of code in your application (for example, a global object's destructor)? As long as the memory consumption doesn't grow over time, is it OK to trust the OS to free your memory for you when your application terminates (on Windows, Mac, and Linux)? Would you even consider this a real memory leak if the memory was being used continuously until it was freed by the OS.
What if a third party library forced this situation on you? Would refuse to use that third party library no matter how great it otherwise might be?
I only see one practical disadvantage, and that is that these benign leaks will show up with memory leak detection tools as false positives.
No.
As professionals, the question we should not be asking ourselves is, "Is it ever OK to do this?" but rather "Is there ever a good reason to do this?" And "hunting down that memory leak is a pain" isn't a good reason.
I like to keep things simple. And the simple rule is that my program should have no memory leaks.
That makes my life simple, too. If I detect a memory leak, I eliminate it, rather than run through some elaborate decision tree structure to determine whether it's an "acceptable" memory leak.
It's similar to compiler warnings – will the warning be fatal to my particular application? Maybe not.
But it's ultimately a matter of professional discipline. Tolerating compiler warnings and tolerating memory leaks is a bad habit that will ultimately bite me in the rear.
To take things to an extreme, would it ever be acceptable for a surgeon to leave some piece of operating equipment inside a patient?
Although it is possible that a circumstance could arise where the cost/risk of removing that piece of equipment exceeds the cost/risk of leaving it in, and there could be circumstances where it was harmless, if I saw this question posted on SurgeonOverflow.com and saw any answer other than "no," it would seriously undermine my confidence in the medical profession.
–
If a third party library forced this situation on me, it would lead me to seriously suspect the overall quality of the library in question. It would be as if I test drove a car and found a couple loose washers and nuts in one of the cupholders – it may not be a big deal in itself, but it portrays a lack of commitment to quality, so I would consider alternatives.
I don't consider it to be a memory leak unless the amount of memory being "used" keeps growing. Having some unreleased memory, while not ideal, is not a big problem unless the amount of memory required keeps growing.
Let's get our definitions correct, first. A memory leak is when memory is dynamically allocated, eg with malloc(), and all references to the memory are lost without the corresponding free. An easy way to make one is like this:
#define BLK ((size_t)1024)
while(1){
void * vp = malloc(BLK);
}
Note that every time around the while(1) loop, 1024 (+overhead) bytes are allocated, and the new address assigned to vp; there's no remaining pointer to the previous malloc'ed blocks. This program is guaranteed to run until the heap runs out, and there's no way to recover any of the malloc'ed memory. Memory is "leaking" out of the heap, never to be seen again.
What you're describing, though, sound like
int main(){
void * vp = malloc(LOTS);
// Go do something useful
return 0;
}
You allocate the memory, work with it until the program terminates. This is not a memory leak; it doesn't impair the program, and all the memory will be scavenged up automagically when the program terminates.
Generally, you should avoid memory leaks. First, because like altitude above you and fuel back at the hangar, memory that has leaked and can't be recovered is useless; second, it's a lot easier to code correctly, not leaking memory, at the start than it is to find a memory leak later.
In theory no, in practise it depends.
It really depends on how much data the program is working on, how often the program is run and whether or not it is running constantly.
If I have a quick program that reads a small amount of data makes a calculation and exits, a small memory leak will never be noticed. Because the program is not running for very long and only uses a small amount of memory, the leak will be small and freed when the program exists.
On the other hand if I have a program that processes millions of records and runs for a long time, a small memory leak might bring down the machine given enough time.
As for third party libraries that have leaks, if they cause a problem either fix the library or find a better alternative. If it doesn't cause a problem, does it really matter?
Many people seem to be under the impression that once you free memory, it's instantly returned to the operating system and can be used by other programs.
This isn't true. Operating systems commonly manage memory in 4KiB pages. malloc and other sorts of memory management get pages from the OS and sub-manage them as they see fit. It's quite likely that free() will not return pages to the operating system, under the assumption that your program will malloc more memory later.
I'm not saying that free() never returns memory to the operating system. It can happen, particularly if you are freeing large stretches of memory. But there's no guarantee.
The important fact: If you don't free memory that you no longer need, further mallocs are guaranteed to consume even more memory. But if you free first, malloc might re-use the freed memory instead.
What does this mean in practice? It means that if you know your program isn't going to require any more memory from now on (for instance it's in the cleanup phase), freeing memory is not so important. However if the program might allocate more memory later, you should avoid memory leaks - particularly ones that can occur repeatedly.
Also see this comment for more details about why freeing memory just before termination is bad.
A commenter didn't seem to understand that calling free() does not automatically allow other programs to use the freed memory. But that's the entire point of this answer!
So, to convince people, I will demonstrate an example where free() does very little good. To make the math easy to follow, I will pretend that the OS manages memory in 4000 byte pages.
Suppose you allocate ten thousand 100-byte blocks (for simplicity I'll ignore the extra memory that would be required to manage these allocations). This consumes 1MB, or 250 pages. If you then free 9000 of these blocks at random, you're left with just 1000 blocks - but they're scattered all over the place. Statistically, about 5 of the pages will be empty. The other 245 will each have at least one allocated block in them. That amounts to 980KB of memory, that cannot possibly be reclaimed by the operating system - even though you now only have 100KB allocated!
On the other hand, you can now malloc() 9000 more blocks without increasing the amount of memory your program is tying up.
Even when free() could technically return memory to the OS, it may not do so. free() needs to achieve a balance between operating quickly and saving memory. And besides, a program that has already allocated a lot of memory and then freed it is likely to do so again. A web server needs to handle request after request after request - it makes sense to keep some "slack" memory available so you don't need to ask the OS for memory all the time.
There is nothing conceptually wrong with having the os clean up after the application is run.
It really depends on the application and how it will be run. Continually occurring leaks in an application that needs to run for weeks has to be taken care of, but a small tool that calculates a result without too high of a memory need should not be a problem.
There is a reason why many scripting language do not garbage collect cyclical references… for their usage patterns, it's not an actual problem and would thus be as much of a waste of resources as the wasted memory.
I believe the answer is no, never allow a memory leak, and I have a few reasons which I haven't seen explicitly stated. There are great technical answers here but I think the real answer hinges on more social/human reasons.
(First, note that as others mentioned, a true leak is when your program, at any point, loses track of memory resources that it has allocated. In C, this happens when you malloc() to a pointer and let that pointer leave scope without doing a free() first.)
The important crux of your decision here is habit. When you code in a language that uses pointers, you're going to use pointers a lot. And pointers are dangerous; they're the easiest way to add all manner of severe problems to your code.
When you're coding, sometimes you're going to be on the ball and sometimes you're going to be tired or mad or worried. During those somewhat distracted times, you're coding more on autopilot. The autopilot effect doesn't differentiate between one-off code and a module in a larger project. During those times, the habits you establish are what will end up in your code base.
So no, never allow memory leaks for the same reason that you should still check your blind spots when changing lanes even if you're the only car on the road at the moment. During times when your active brain is distracted, good habits are all that can save you from disastrous missteps.
Beyond the "habit" issue, pointers are complex and often require a lot of brain power to track mentally. It's best to not "muddy the water" when it comes to your usage of pointers, especially when you're new to programming.
There's a more social aspect too. By proper use of malloc() and free(), anyone who looks at your code will be at ease; you're managing your resources. If you don't, however, they'll immediately suspect a problem.
Maybe you've worked out that the memory leak doesn't hurt anything in this context, but every maintainer of your code will have to work that out in his head too when he reads that piece of code. By using free() you remove the need to even consider the issue.
Finally, programming is writing a mental model of a process to an unambiguous language so that a person and a computer can perfectly understand said process. A vital part of good programming practice is never introducing unnecessary ambiguity.
Smart programming is flexible and generic. Bad programming is ambiguous.
I'm going to give the unpopular but practical answer that it's always wrong to free memory unless doing so will reduce the memory usage of your program. For instance a program that makes a single allocation or series of allocations to load the dataset it will use for its entire lifetime has no need to free anything. In the more common case of a large program with very dynamic memory requirements (think of a web browser), you should obviously free memory you're no longer using as soon as you can (for instance closing a tab/document/etc.), but there's no reason to free anything when the user selects clicks "exit", and doing so is actually harmful to the user experience.
Why? Freeing memory requires touching memory. Even if your system's malloc implementation happens not to store metadata adjacent to the allocated memory blocks, you're likely going to be walking recursive structures just to find all the pointers you need to free.
Now, suppose your program has worked with a large volume of data, but hasn't touched most of it for a while (again, web browser is a great example). If the user is running a lot of apps, a good portion of that data has likely been swapped to disk. If you just exit(0) or return from main, it exits instantly. Great user experience. If you go to the trouble of trying to free everything, you may spend 5 seconds or more swapping all the data back in, only to throw it away immediately after that. Waste of user's time. Waste of laptop's battery life. Waste of wear on the hard disk.
This is not just theoretical. Whenever I find myself with too many apps loaded and the disk starts thrashing, I don't even consider clicking "exit". I get to a terminal as fast as I can and type killall -9 ... because I know "exit" will just make it worse.
I think in your situation the answer may be that it's okay. But you definitely need to document that the memory leak is a conscious decision. You don't want a maintenance programmer to come along, slap your code inside a function, and call it a million times. So if you make the decision that a leak is okay you need to document it (IN BIG LETTERS) for whoever may have to work on the program in the future.
If this is a third party library you may be trapped. But definitely document that this leak occurs.
But basically if the memory leak is a known quantity like a 512 KB buffer or something then it is a non issue. If the memory leak keeps growing like every time you call a library call your memory increases by 512KB and is not freed, then you may have a problem. If you document it and control the number of times the call is executed it may be manageable. But then you really need documentation because while 512 isn't much, 512 over a million calls is a lot.
Also you need to check your operating system documentation. If this was an embedded device there may be operating systems that don't free all the memory from a program that exits. I'm not sure, maybe this isn't true. But it is worth looking into.
I'm sure that someone can come up with a reason to say Yes, but it won't be me.
Instead of saying no, I'm going to say that this shouldn't be a yes/no question.
There are ways to manage or contain memory leaks, and many systems have them.
There are NASA systems on devices that leave the earth that plan for this. The systems will automatically reboot every so often so that memory leaks will not become fatal to the overall operation. Just an example of containment.
If you allocate memory and use it until the last line of your program, that's not a leak. If you allocate memory and forget about it, even if the amount of memory isn't growing, that's a problem. That allocated but unused memory can cause other programs to run slower or not at all.
I can count on one hand the number of "benign" leaks that I've seen over time.
So the answer is a very qualified yes.
An example. If you have a singleton resource that needs a buffer to store a circular queue or deque but doesn't know how big the buffer will need to be and can't afford the overhead of locking or every reader, then allocating an exponentially doubling buffer but not freeing the old ones will leak a bounded amount of memory per queue/deque. The benefit for these is they speed up every access dramatically and can change the asymptotics of multiprocessor solutions by never risking contention for a lock.
I've seen this approach used to great benefit for things with very clearly fixed counts such as per-CPU work-stealing deques, and to a much lesser degree in the buffer used to hold the singleton /proc/self/maps state in Hans Boehm's conservative garbage collector for C/C++, which is used to detect the root sets, etc.
While technically a leak, both of these cases are bounded in size and in the growable circular work stealing deque case there is a huge performance win in exchange for a bounded factor of 2 increase in the memory usage for the queues.
If you allocate a bunch of heap at the beginning of your program, and you don't free it when you exit, that is not a memory leak per se. A memory leak is when your program loops over a section of code, and that code allocates heap and then "loses track" of it without freeing it.
In fact, there is no need to make calls to free() or delete right before you exit. When the process exits, all of its memory is reclaimed by the OS (this is certainly the case with POSIX. On other OSes – particularly embedded ones – YMMV).
The only caution I'd have with not freeing the memory at exit time is that if you ever refactor your program so that it, for example, becomes a service that waits for input, does whatever your program does, then loops around to wait for another service call, then what you've coded can turn into a memory leak.
In this sort of question context is everything. Personally I can't stand leaks, and in my code I go to great lengths to fix them if they crop up, but it is not always worth it to fix a leak, and when people are paying me by the hour I have on occasion told them it was not worth my fee for me to fix a leak in their code. Let me give you an example:
I was triaging a project, doing some perf work and fixing a lot of bugs. There was a leak during the applications initialization that I tracked down, and fully understood. Fixing it properly would have required a day or so refactoring a piece of otherwise functional code. I could have done something hacky (like stuffing the value into a global and grabbing it some point I know it was no longer in use to free), but that would have just caused more confusion to the next guy who had to touch the code.
Personally I would not have written the code that way in the first place, but most of us don't get to always work on pristine well designed codebases, and sometimes you have to look at these things pragmatically. The amount of time it would have taken me to fix that 150 byte leak could instead be spent making algorithmic improvements that shaved off megabytes of ram.
Ultimately, I decided that leaking 150 bytes for an app that used around a gig of ram and ran on a dedicated machine was not worth fixing it, so I wrote a comment saying that it was leaked, what needed to be changed in order to fix it, and why it was not worth it at the time.
this is so domain-specific that its hardly worth answering. use your freaking head.
space shuttle operating system: nope, no memory leaks allowed
rapid development proof-of-concept code: fixing all those memory leaks is a waste of time.
and there is a spectrum of intermediate situations.
the opportunity cost ($$$) of delaying a product release to fix all but the worst memory leaks is usually dwarfs any feelings of being "sloppy or unprofessional". Your boss pays you to make him money, not to get a warm, fuzzy feelings.
You have to first realize that there's a big difference between a perceived memory leak and an actual memory leak. Very frequently analysis tools will report many red herrings, and label something as having been leaked (memory or resources such as handles etc) where it actually isn't. Often times this is due to the analysis tool's architecture. For example, certain analysis tools will report run time objects as memory leaks because it never sees those object freed. But the deallocation occurs in the runtime's shutdown code, which the analysis tool might not be able to see.
With that said, there will still be times when you will have actual memory leaks that are either very difficult to find or very difficult to fix. So now the question becomes is it ever OK to leave them in the code?
The ideal answer is, "no, never." A more pragmatic answer may be "no, almost never." Very often in real life you have limited number of resources and time to resolve and endless list of tasks. When one of the tasks is eliminating memory leaks, the law of diminishing returns very often comes in to play. You could eliminate say 98% of all memory leaks in an application in a week, but the remaining 2% might take months. In some cases it might even be impossible to eliminate certain leaks because of the application's architecture without a major refactoring of code. You have to weigh the costs and benefits of eliminating the remaining 2%.
While most answers concentrate on real memory leaks (which are not OK ever, because they are a sign of sloppy coding), this part of the question appears more interesting to me:
What if you allocate some memory and use it until the very last line of code in your application (for example, a global object's deconstructor)? As long as the memory consumption doesn't grow over time, is it OK to trust the OS to free your memory for you when your application terminates (on Windows, Mac, and Linux)? Would you even consider this a real memory leak if the memory was being used continuously until it was freed by the OS.
If the associated memory is used, you cannot free it before the program ends. Whether the free is done by the program exit or by the OS does not matter. As long as this is documented, so that change don't introduce real memory leaks, and as long as there is no C++ destructor or C cleanup function involved in the picture. A not-closed file might be revealed through a leaked FILE object, but a missing fclose() might also cause the buffer not to be flushed.
So, back to the original case, it is IMHO perfectly OK in itself, so much that Valgrind, one of the most powerful leak detectors, will treat such leaks only if requested. On Valgrind, when you overwrite a pointer without freeing it beforehand, it gets considered as a memory leak, because it is more likely to happen again and to cause the heap to grow endlessly.
Then, there are not nfreed memory blocks which are still reachable. One could make sure to free all of them at the exit, but that is just a waste of time in itself. The point is if they could be freed before. Lowering memory consumption is useful in any case.
Even if you are sure that your 'known' memory leak will not cause havoc, don't do it. At best, it will pave a way for you to make a similar and probably more critical mistake at a different time and place.
For me, asking this is like questioning "Can I break the red light at 3 AM in the morning when no one is around?". Well sure, it may not cause any trouble at that time but it will provide a lever for you to do the same in rush hour!
No, you should not have leaks that the OS will clean for you. The reason (not mentioned in the answers above as far as I could check) is that you never know when your main() will be re-used as a function/module in another program. If your main() gets to be a frequently-called function in another persons' software - this software will have a memory leak that eats memory over time.
KIV
I agree with vfilby – it depends. In Windows, we treat memory leaks as relatively serous bugs. But, it very much depends on the component.
For example, memory leaks are not very serious for components that run rarely, and for limited periods of time. These components run, do theire work, then exit. When they exit all their memory is freed implicitly.
However, memory leaks in services or other long run components (like the shell) are very serious. The reason is that these bugs 'steal' memory over time. The only way to recover this is to restart the components. Most people don't know how to restart a service or the shell – so if their system performance suffers, they just reboot.
So, if you have a leak – evaluate its impact two ways
To your software and your user's experience.
To the system (and the user) in terms of being frugal with system resources.
Impact of the fix on maintenance and reliability.
Likelihood of causing a regression somewhere else.
Foredecker
I'm surprised to see so many incorrect definitions of what a memory leak actually is. Without a concrete definition, a discussion on whether it's a bad thing or not will go nowhere.
As some commentors have rightly pointed out, a memory leak only happens when memory allocated by a process goes out of scope to the extent that the process is no longer able to reference or delete it.
A process which is grabbing more and more memory is not necessarily leaking. So long as it is able to reference and deallocate that memory, then it remains under the explicit control of the process and has not leaked. The process may well be badly designed, especially in the context of a system where memory is limited, but this is not the same as a leak. Conversely, losing scope of, say, a 32 byte buffer is still a leak, even though the amount of memory leaked is small. If you think this is insignificant, wait until someone wraps an algorithm around your library call and calls it 10,000 times.
I see no reason whatsoever to allow leaks in your own code, however small. Modern programming languages such as C and C++ go to great lengths to help programmers prevent such leaks and there is rarely a good argument not to adopt good programming techniques - especially when coupled with specific language facilities - to prevent leaks.
As regards existing or third party code, where your control over quality or ability to make a change may be highly limited, depending on the severity of the leak, you may be forced to accept or take mitigating action such as restarting your process regularly to reduce the effect of the leak.
It may not be possible to change or replace the existing (leaking) code, and therefore you may be bound to accept it. However, this is not the same as declaring that it's OK.
I guess it's fine if you're writing a program meant to leak memory (i.e. to test the impact of memory leaks on system performance).
Its really not a leak if its intentional and its not a problem unless its a significant amount of memory, or could grow to be a significant amount of memory. Its fairly common to not cleanup global allocations during the lifetime of a program. If the leak is in a server or long running app, grows over time, then its a problem.
I think you've answered your own question. The biggest drawback is how they interfere with the memory leak detection tools, but I think that drawback is a HUGE drawback for certain types of applications.
I work with legacy server applications that are supposed to be rock solid but they have leaks and the globals DO get in the way of the memory detection tools. It's a big deal.
In the book "Collapse" by Jared Diamond, the author wonders about what the guy was thinking who cut down the last tree on Easter Island, the tree he would have needed in order to build a canoe to get off the island. I wonder about the day many years ago when that first global was added to our codebase. THAT was the day it should have been caught.
I see the same problem as all scenario questions like this: What happens when the program changes, and suddenly that little memory leak is being called ten million times and the end of your program is in a different place so it does matter? If it's in a library then log a bug with the library maintainers, don't put a leak into your own code.
I'll answer no.
In theory, the operating system will clean up after you if you leave a mess (now that's just rude, but since computers don't have feelings it might be acceptable). But you can't anticipate every possible situation that might occur when your program is run. Therefore (unless you are able to conduct a formal proof of some behaviour), creating memory leaks is just irresponsible and sloppy from a professional point of view.
If a third-party component leaks memory, this is a very strong argument against using it, not only because of the imminent effect but also because it shows that the programmers work sloppily and that this might also impact other metrics. Now, when considering legacy systems this is difficult (consider web browsing components: to my knowledge, they all leak memory) but it should be the norm.
Historically, it did matter on some operating systems under some edge cases. These edge cases could exist in the future.
Here's an example, on SunOS in the Sun 3 era, there was an issue if a process used exec (or more traditionally fork and then exec), the subsequent new process would inherit the same memory footprint as the parent and it could not be shrunk. If a parent process allocated 1/2 gig of memory and didn't free it before calling exec, the child process would start using that same 1/2 gig (even though it wasn't allocated). This behavior was best exhibited by SunTools (their default windowing system), which was a memory hog. Every app that it spawned was created via fork/exec and inherited SunTools footprint, quickly filling up swap space.
This was already discussed ad nauseam. Bottom line is that a memory leak is a bug and must be fixed. If a third party library leaks memory, it makes one wonder what else is wrong with it, no? If you were building a car, would you use an engine that is occasionally leaking oil? After all, somebody else made the engine, so it's not your fault and you can't fix it, right?
Generally a memory leak in a stand alone application is not fatal, as it gets cleaned up when the program exits.
What do you do for Server programs that are designed so they don't exit?
If you are the kind of programmer that does not design and implement code where the resources are allocated and released correctly, then I don't want anything to do with you or your code. If you don't care to clean up your leaked memory, what about your locks? Do you leave them hanging out there too? Do you leave little turds of temporary files laying around in various directories?
Leak that memory and let the program clean it up? No. Absolutely not. It's a bad habit, that leads to bugs, bugs, and more bugs.
Clean up after yourself. Yo momma don't work here no more.
As a general rule, if you've got memory leaks that you feel you can't avoid, then you need to think harder about object ownership.
But to your question, my answer in a nutshell is In production code, yes. During development, no. This might seem backwards, but here's my reasoning:
In the situation you describe, where the memory is held until the end of the program, it's perfectly okay to not release it. Once your process exits, the OS will clean up anyway. In fact, it might make the user's experience better: In a game I've worked on, the programmers thought it would be cleaner to free all the memory before exiting, causing the shutdown of the program to take up to half a minute! A quick change that just called exit() instead made the process disappear immediately, and put the user back to the desktop where he wanted to be.
However, you're right about the debugging tools: They'll throw a fit, and all the false positives might make finding your real memory leaks a pain. And because of that, always write debugging code that frees the memory, and disable it when you ship.

How to predict memory leak in linux c/c++ programme?

I'm tring to predict memory leak in an linux c++ programme.
The programme is an network server which fork a lot of sub-process to handle request.
I can get the RSS of the process after it handled one request, and I want to make the process kill itself if it find out that there could be memory leak.
I can make the process kill itself if it's RSS is bigger than an configured value ( like 1GB ), but this seems too simple and this value is also not easy to configure.
I can also make the process kill itself if the RSS is bigger and bigger for N times (like 100), but this seems not proper too, if the process cache some datas.
Is there any memory leak prediction algorithms?
I have googled it, but can't find one.
Thanks.
Sorry for not makeing the question clear.
Actually I'm not trying to find some way to find how the memory leak, I know there is tools like valgrind.
The situation is like this:
The programme I worted is an RPC framework, there is a father process, which will fork lots of sub-processes.
These sub process will run some business logic code which is not written by me and I have no control of these codes.
When these logic codes leak memory, the sub-process will be killed by the os OOM Killer.
But at that point, due to lack of memory, the os will halt for a while (minutes or even more ) or even breakdown and we will have to reboot the system.
I want to avoid this situation by predict the memory leak of the logic codes and make the sub-process kill itself before the OOM Kill do it's job.
And if the sub-process kill itself, the father process will fork another sub-process to run the logic code.
I can get the memory occupied by the sub-process after it handle one request.
So the simplest way is to check if the sub-process occupy too much memory than an pre-configured value (like 1GB), if it is, then make it kill itself.
But this algorithm seems too simple and the value is not easy to configure.
So what I'm finding is an algorithm to predict that if there is memory leak in an process.
Thanks
You can't predict memory leaks (it probably can be proved equivalent to the halting problem).
You could measure or instrument the binary, notably with valgrind (which will only tell you if some particular execution has leaked).
You could also consider using Boehm's conservative garbage collector. It would usually free memory automatically (but there is no guarantee since it is a conservative garbage collector!).
I'm tring to predict memory leak in an linux c++ programme. [...] Is there any memory leak prediction algorithms?
No, there is no algorithm. This depends on the developer's good or bad habits. That means, if you wrote ten applications and they all had memory leaks, you can predict that your 11th one will have memory leaks (i.e. there is 1 chance in 11 that your app will not have memory leaks).
This looks to me like the x-y problem. If you have child processes with memory leaks, the proper solution would be to remove the memory leaks, from the child processes. Otherwise, you can impose a limit in your system (i.e. no child process will consume more than 350Mb of memory"). This limit though is in no way connected to memory leaks.
To detect memory leaks in your development process, consider using a specialized tool (like valgrind or BoundsChecker).
Without further constraints, no there isn't. The program could do anything and as a result have what ever kind of pattern in its memory use. Tools like valgrind can detect memory leaks, for example by tracking each memory allocation and noticing when there is a piece of allocated memory which is not reachable. But they are essentially running the program in a VM or under debugger, which a huge performance penalty, so not a solution for production use memory leak detection.
Solution is really to not leak memory, to find memory leaks while still developing the software. There you can use tools like valgrind, combined with enough test cases, to catch all leaks. But best is to do high quality software. Memory leaks are result of either bad methodology, or sloppy implementation.
As a practical solution to problem itself: Do not try to detect memory leaks. Just restart the process for example once per 24h, at the time where it causes least disruption. Of course this is a bit "ugly", it encourages just ignoring some bugs when they should be fixed for example. This is a case of "ugly reality vs. sound engineering principles".
One way to deal with memory leaks is to implement reference counting. The idea is to have for each object or struct, a count of how much pointers are pointing to it, when this count drop to 0 free the allocated memory. This forces you to rethink the code, but it is effective when several objects are pointing to another object and cannot predict when it will become useless.
If you are programming in C++, you could try boost libraries, they have implementation of smart pointers for reference counting.
http://www.boost.org/doc/libs/1_55_0/libs/smart_ptr/smart_ptr.htm

Is it acceptable not to deallocate memory

I'm working on a project that is supposed to be used from the command line with the following syntax:
program-name input-file
The program is supposed to process the input, compute some stuff and spit out results on stdout.
My language of choice is C++ for several reasons I'm not willing to debate. The computation phase will be highly symbolic (think compiler) and will use pretty complex dynamically allocated data structures. In particular, it's not amenable to RAII style programming.
I'm wondering if it is acceptable to forget about freeing memory, given that I expect the entire computation to consume less than the available memory and that the OS is free to reclaim all the memory in one step after the program finishes (assume program terminates in seconds). What are your feeling about this?
As a backup plan, if ever my project will require to run as a server or interactively, I figured that I can always refit a garbage collector into the source code. Does anyone have experience using garbage collectors for C++? Do they work well?
It shouldn't cause any problems in the specific situation described the question.
However, it's not exactly normal. Static analysis tools will complain about it. Most importantly, it builds bad habits.
Sometimes not deallocating memory is the right thing to do.
I used to write compilers. After building the parse tree and traversing it to write the intermediate code, we would simply just exit. Deallocating the tree would have
added a bit of slowness to the compiler, which we wanted of course to be as fast as possible.
taken up code space
taken time to code and test the deallocators
violated the "no code executes better than 'no code'" dictum.
HTH! FWIW, this was "back in the day" when memory was non-virtual and minimal, the boxes were much slower, and the first two were non-trivial considerations.
My feeling would be something like "WTF!!!"
Look at it this way:
You choose a programming language that does not include a garbage collector, we are not allowed to ask why.
You are basically stating that you are too lazy to care about freeing the memory.
Well, WTF again. Laziness isn't a good reason for anything, the least of what is playing around with memory without freeing it.
Just free the memory, it's a bad practice, the scenario may change and then can be a million reasons you can need that memory freed and the only reason for not doing it is laziness, don't get bad habits, and get used to do things right, that way you'll tend to do them right in the future!!
Not deallocating memory should not be problem but it is a bad practice.
Joel Coehoorn is right:
It shouldn't cause any problems.
However, it's not exactly normal.
Static analysis tools will complain
about it. Most importantly, it builds
bad habits.
I'd also like to add that thinking about deallocation as you write the code is probably a lot easier than trying to retrofit it afterwards. So I would probably make it deallocate memory; you don't know how your program might be used in future.
If you want a really simple way to free memory, look at the "pools" concept that Apache uses.
Well, I think that it's not acceptable. You've already alluded to potential future problems yourself. Don't think they're necessarily easy to solve.
Things like “… given that I expect the entire computation to consume less …” are famous last phrases. Similarly, refitting code with some feature is one of these things they all talk of and never do.
Not deallocating memory might sound good in the short run but can potentially create a huge load of problems in the long run. Personally, I just don't think that's worth it.
There are two strategies. Either you build in the GC design from the very beginning. It's more work but it will pay off. For a lot of small objects it might pay to use a pool allocator and just keep track of the memory pool. That way, you can keep track of the memory consumption and simply avoid a lot of problems that similar code, but without allocation pool, would create.
Or you use smart pointers throughout the program from the beginning. I actually prefer this method even though it clutters the code. One solution is to rely heavily on templates, which takes out a lot of redundancy when referring to types.
Take a look at projects such as WebKit. Their computation phase resembles yours since they build parse trees for HTML. They use smart pointers throughout their program.
Finally: “It’s a question of style … Sloppy work tends to be habit-forming.”
– Silk in Castle of Wizardry by David Eddings.
will use pretty complex dynamically
allocated data structures. In
particular, it's not amenable to RAII
style programming.
I'm almost sure that's an excuse for lazy programming. Why can't you use RAII? Is it because you don't want to keep track of your allocations, there's no pointer to them that you keep? If so, how do you expect to use the allocated memory - there's always a pointer to it that contains some data.
Is it because you don't know when it should be released? Leave the memory in RAII objects, each one referenced by something, and they'll all trickle-down free each other when the containing object gets freed - this is particularly important if you want to run it as a server one day, each iteration of the server effective runs a 'master' object that holds all others so you can just delete it and all the memory disappears. It also helps prevent you retro-fitting a GC.
Is it because all your memory is allocated and kept in-use all the time, and only freed at the end? If so see above.
If you really, really cannot think of a design where you cannot leak memory, at least have the decency to use a private heap. Destroy that heap before you quit and you'll have a better design already, if a little 'hacky'.
There are instances where memory leaks are ok - static variables, globally initialised data, things like that. These aren't generally large though.
Reference counting smart pointers like shared_ptr in boost and TR1 could also help you manage your memory in a simple manner.
The drawback is that you have to wrap every pointers that use these objects.
I've done this before, only to find that, much later, I needed the program to be able to process several inputs without separate commands, or that the guts of the program were so useful that they needed to be turned into a library routine that could be called many times from within another program that was not expected to terminate. It was much harder to go back later and re-engineer the program than it would have been to make it leak-less from the start.
So, while it's technically safe as you've described the requirements, I advise against the practice since it's likely that your requirements may someday change.
If the run time of your program is very short, it should not be a problem. However, being too lazy to free what you allocate and losing track of what you allocate are two entirely different things. If you have simply lost track, its time to ask yourself if you actually know what your code is doing to a computer.
If you are just in a hurry or lazy and the life of your program is small in relation to what it actually allocates (i.e. allocating 10 MB per second is not small if running for 30 seconds) .. then you should be OK.
The only 'noble' argument regarding freeing allocated memory sets in when a program exits .. should one free everything to keep valgrind from complaining about leaks, or just let the OS do it? That entirely depends on the OS and if your code might become a library and not a short running executable.
Leaks during run time are generally bad, unless you know your program will run in a short amount of time and not cause other programs far more important than your's as far as the OS is concerned to skid to dirty paging.
What are your feeling about this?
Some O/Ses might not reclaim the memory, but I guess you're not intenting to run on those O/Ses.
As a backup plan, if ever my project will require to run as a server or interactively, I figured that I can always refit a garbage collector into the source code.
Instead, I figure you can spawn a child process to do the dirty work, grab the output from the child process, let the child process die as soon as possible after that and then expect the O/S to do the garbage collection.
I have not personally used this, but since you are starting from scratch you may wish to consider the Boehm-Demers-Weiser conservative garbage collector
The answer really depends on how large your program will be and what performance characteristics it needs to exhibit. If you never deallocate memory, your process's memory footprint will be much larger than it would otherwise be. Depeding on the system, this could cause a lot of paging and slow down the performance for you or other applications on the system.
Beyond that, what everyone above says is correct. It probably won't cause harm in the short term, but it's a bad practice that you should avoid. You'll never be able to use the code again. Trying to retrofit a GC on afterwards will be a nightmare. Just think about going to each place you allocate memory and trying to retrofit it but not break anything.
One more reason to avoid doing this: reputation. If you fail to deallocate, everyone who maintains the code will curse your name and your rep in the company will take a hit. "Can you believe how dumb he was? Look at this code."
If it is non-trivial for you to determine where to deallocate the memory, I would be concerned that other aspects of the data structure manipulation may not be fully understood either.
Apart from the fact that the OS (kernel and/or C/C++ library) can choose not to free the memory when the execution ends, your application should always provide proper freeing of allocated memory as a good practice. Why? Suppose you decide to extend that application or reuse the code; you'll quickly get in trouble if the code you had previously written hogs up the memory unnecessarily, after finishing its job. It's a recipe for memory leaks.
In general, I agree it's a bad practice.
For a one shot program, it can be OK, but it kinda shows like you don't what you are doing.
There is one solution to your problem though - use a custom allocator, which preallocates larger blocks from malloc, and then, after the computation phase, instead of freeing all the little blocks from you custom allocator, just release the larger preallocated blocks of memory. Then you don't need to keep track of all objects you need to deallocate and when. One guy who wrote a compiler too explained this approach many years ago to me, so if it worked for him, it will probably work for you as well.
Try to use automatic variables in methods so that they will be freed automatically from the stack.
The only useful reason to not free heap memory is to save a tiny amount of computational power used in the free() method. You might loose any advantage if page faults become an issue due to large virtual memory needs with small physical memory resources. Some factors to consider are:
If you are allocating a few huge chunks of memory or many small chunks.
Is the memory going to need to be locked into physical memory.
Are you absolutely positive the code and memory needed will fit into 2GB, for a Win32 system, including memory holes and padding.
That's generally a bad idea. You might encounter some cases where the program will try to consume more memory than it's available. Plus you risk being unable to start several copies of the program.
You can still do this if you don't care of the mentioned issues.
When you exit from a program, the memory allocated is automatically returned to the system. So you may not deallocate the memory you had allocated.
But deallocations becomes necessary when you go for bigger programs such as an OS or Embedded systems where the program is meant to run forever & hence a small memory leak can be malicious.
Hence it is always recommended to deallocate the memory you have allocated.

Are memory leaks ever ok? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Is it ever acceptable to have a memory leak in your C or C++ application?
What if you allocate some memory and use it until the very last line of code in your application (for example, a global object's destructor)? As long as the memory consumption doesn't grow over time, is it OK to trust the OS to free your memory for you when your application terminates (on Windows, Mac, and Linux)? Would you even consider this a real memory leak if the memory was being used continuously until it was freed by the OS.
What if a third party library forced this situation on you? Would refuse to use that third party library no matter how great it otherwise might be?
I only see one practical disadvantage, and that is that these benign leaks will show up with memory leak detection tools as false positives.
No.
As professionals, the question we should not be asking ourselves is, "Is it ever OK to do this?" but rather "Is there ever a good reason to do this?" And "hunting down that memory leak is a pain" isn't a good reason.
I like to keep things simple. And the simple rule is that my program should have no memory leaks.
That makes my life simple, too. If I detect a memory leak, I eliminate it, rather than run through some elaborate decision tree structure to determine whether it's an "acceptable" memory leak.
It's similar to compiler warnings – will the warning be fatal to my particular application? Maybe not.
But it's ultimately a matter of professional discipline. Tolerating compiler warnings and tolerating memory leaks is a bad habit that will ultimately bite me in the rear.
To take things to an extreme, would it ever be acceptable for a surgeon to leave some piece of operating equipment inside a patient?
Although it is possible that a circumstance could arise where the cost/risk of removing that piece of equipment exceeds the cost/risk of leaving it in, and there could be circumstances where it was harmless, if I saw this question posted on SurgeonOverflow.com and saw any answer other than "no," it would seriously undermine my confidence in the medical profession.
–
If a third party library forced this situation on me, it would lead me to seriously suspect the overall quality of the library in question. It would be as if I test drove a car and found a couple loose washers and nuts in one of the cupholders – it may not be a big deal in itself, but it portrays a lack of commitment to quality, so I would consider alternatives.
I don't consider it to be a memory leak unless the amount of memory being "used" keeps growing. Having some unreleased memory, while not ideal, is not a big problem unless the amount of memory required keeps growing.
Let's get our definitions correct, first. A memory leak is when memory is dynamically allocated, eg with malloc(), and all references to the memory are lost without the corresponding free. An easy way to make one is like this:
#define BLK ((size_t)1024)
while(1){
void * vp = malloc(BLK);
}
Note that every time around the while(1) loop, 1024 (+overhead) bytes are allocated, and the new address assigned to vp; there's no remaining pointer to the previous malloc'ed blocks. This program is guaranteed to run until the heap runs out, and there's no way to recover any of the malloc'ed memory. Memory is "leaking" out of the heap, never to be seen again.
What you're describing, though, sound like
int main(){
void * vp = malloc(LOTS);
// Go do something useful
return 0;
}
You allocate the memory, work with it until the program terminates. This is not a memory leak; it doesn't impair the program, and all the memory will be scavenged up automagically when the program terminates.
Generally, you should avoid memory leaks. First, because like altitude above you and fuel back at the hangar, memory that has leaked and can't be recovered is useless; second, it's a lot easier to code correctly, not leaking memory, at the start than it is to find a memory leak later.
In theory no, in practise it depends.
It really depends on how much data the program is working on, how often the program is run and whether or not it is running constantly.
If I have a quick program that reads a small amount of data makes a calculation and exits, a small memory leak will never be noticed. Because the program is not running for very long and only uses a small amount of memory, the leak will be small and freed when the program exists.
On the other hand if I have a program that processes millions of records and runs for a long time, a small memory leak might bring down the machine given enough time.
As for third party libraries that have leaks, if they cause a problem either fix the library or find a better alternative. If it doesn't cause a problem, does it really matter?
Many people seem to be under the impression that once you free memory, it's instantly returned to the operating system and can be used by other programs.
This isn't true. Operating systems commonly manage memory in 4KiB pages. malloc and other sorts of memory management get pages from the OS and sub-manage them as they see fit. It's quite likely that free() will not return pages to the operating system, under the assumption that your program will malloc more memory later.
I'm not saying that free() never returns memory to the operating system. It can happen, particularly if you are freeing large stretches of memory. But there's no guarantee.
The important fact: If you don't free memory that you no longer need, further mallocs are guaranteed to consume even more memory. But if you free first, malloc might re-use the freed memory instead.
What does this mean in practice? It means that if you know your program isn't going to require any more memory from now on (for instance it's in the cleanup phase), freeing memory is not so important. However if the program might allocate more memory later, you should avoid memory leaks - particularly ones that can occur repeatedly.
Also see this comment for more details about why freeing memory just before termination is bad.
A commenter didn't seem to understand that calling free() does not automatically allow other programs to use the freed memory. But that's the entire point of this answer!
So, to convince people, I will demonstrate an example where free() does very little good. To make the math easy to follow, I will pretend that the OS manages memory in 4000 byte pages.
Suppose you allocate ten thousand 100-byte blocks (for simplicity I'll ignore the extra memory that would be required to manage these allocations). This consumes 1MB, or 250 pages. If you then free 9000 of these blocks at random, you're left with just 1000 blocks - but they're scattered all over the place. Statistically, about 5 of the pages will be empty. The other 245 will each have at least one allocated block in them. That amounts to 980KB of memory, that cannot possibly be reclaimed by the operating system - even though you now only have 100KB allocated!
On the other hand, you can now malloc() 9000 more blocks without increasing the amount of memory your program is tying up.
Even when free() could technically return memory to the OS, it may not do so. free() needs to achieve a balance between operating quickly and saving memory. And besides, a program that has already allocated a lot of memory and then freed it is likely to do so again. A web server needs to handle request after request after request - it makes sense to keep some "slack" memory available so you don't need to ask the OS for memory all the time.
There is nothing conceptually wrong with having the os clean up after the application is run.
It really depends on the application and how it will be run. Continually occurring leaks in an application that needs to run for weeks has to be taken care of, but a small tool that calculates a result without too high of a memory need should not be a problem.
There is a reason why many scripting language do not garbage collect cyclical references… for their usage patterns, it's not an actual problem and would thus be as much of a waste of resources as the wasted memory.
I believe the answer is no, never allow a memory leak, and I have a few reasons which I haven't seen explicitly stated. There are great technical answers here but I think the real answer hinges on more social/human reasons.
(First, note that as others mentioned, a true leak is when your program, at any point, loses track of memory resources that it has allocated. In C, this happens when you malloc() to a pointer and let that pointer leave scope without doing a free() first.)
The important crux of your decision here is habit. When you code in a language that uses pointers, you're going to use pointers a lot. And pointers are dangerous; they're the easiest way to add all manner of severe problems to your code.
When you're coding, sometimes you're going to be on the ball and sometimes you're going to be tired or mad or worried. During those somewhat distracted times, you're coding more on autopilot. The autopilot effect doesn't differentiate between one-off code and a module in a larger project. During those times, the habits you establish are what will end up in your code base.
So no, never allow memory leaks for the same reason that you should still check your blind spots when changing lanes even if you're the only car on the road at the moment. During times when your active brain is distracted, good habits are all that can save you from disastrous missteps.
Beyond the "habit" issue, pointers are complex and often require a lot of brain power to track mentally. It's best to not "muddy the water" when it comes to your usage of pointers, especially when you're new to programming.
There's a more social aspect too. By proper use of malloc() and free(), anyone who looks at your code will be at ease; you're managing your resources. If you don't, however, they'll immediately suspect a problem.
Maybe you've worked out that the memory leak doesn't hurt anything in this context, but every maintainer of your code will have to work that out in his head too when he reads that piece of code. By using free() you remove the need to even consider the issue.
Finally, programming is writing a mental model of a process to an unambiguous language so that a person and a computer can perfectly understand said process. A vital part of good programming practice is never introducing unnecessary ambiguity.
Smart programming is flexible and generic. Bad programming is ambiguous.
I'm going to give the unpopular but practical answer that it's always wrong to free memory unless doing so will reduce the memory usage of your program. For instance a program that makes a single allocation or series of allocations to load the dataset it will use for its entire lifetime has no need to free anything. In the more common case of a large program with very dynamic memory requirements (think of a web browser), you should obviously free memory you're no longer using as soon as you can (for instance closing a tab/document/etc.), but there's no reason to free anything when the user selects clicks "exit", and doing so is actually harmful to the user experience.
Why? Freeing memory requires touching memory. Even if your system's malloc implementation happens not to store metadata adjacent to the allocated memory blocks, you're likely going to be walking recursive structures just to find all the pointers you need to free.
Now, suppose your program has worked with a large volume of data, but hasn't touched most of it for a while (again, web browser is a great example). If the user is running a lot of apps, a good portion of that data has likely been swapped to disk. If you just exit(0) or return from main, it exits instantly. Great user experience. If you go to the trouble of trying to free everything, you may spend 5 seconds or more swapping all the data back in, only to throw it away immediately after that. Waste of user's time. Waste of laptop's battery life. Waste of wear on the hard disk.
This is not just theoretical. Whenever I find myself with too many apps loaded and the disk starts thrashing, I don't even consider clicking "exit". I get to a terminal as fast as I can and type killall -9 ... because I know "exit" will just make it worse.
I think in your situation the answer may be that it's okay. But you definitely need to document that the memory leak is a conscious decision. You don't want a maintenance programmer to come along, slap your code inside a function, and call it a million times. So if you make the decision that a leak is okay you need to document it (IN BIG LETTERS) for whoever may have to work on the program in the future.
If this is a third party library you may be trapped. But definitely document that this leak occurs.
But basically if the memory leak is a known quantity like a 512 KB buffer or something then it is a non issue. If the memory leak keeps growing like every time you call a library call your memory increases by 512KB and is not freed, then you may have a problem. If you document it and control the number of times the call is executed it may be manageable. But then you really need documentation because while 512 isn't much, 512 over a million calls is a lot.
Also you need to check your operating system documentation. If this was an embedded device there may be operating systems that don't free all the memory from a program that exits. I'm not sure, maybe this isn't true. But it is worth looking into.
I'm sure that someone can come up with a reason to say Yes, but it won't be me.
Instead of saying no, I'm going to say that this shouldn't be a yes/no question.
There are ways to manage or contain memory leaks, and many systems have them.
There are NASA systems on devices that leave the earth that plan for this. The systems will automatically reboot every so often so that memory leaks will not become fatal to the overall operation. Just an example of containment.
If you allocate memory and use it until the last line of your program, that's not a leak. If you allocate memory and forget about it, even if the amount of memory isn't growing, that's a problem. That allocated but unused memory can cause other programs to run slower or not at all.
I can count on one hand the number of "benign" leaks that I've seen over time.
So the answer is a very qualified yes.
An example. If you have a singleton resource that needs a buffer to store a circular queue or deque but doesn't know how big the buffer will need to be and can't afford the overhead of locking or every reader, then allocating an exponentially doubling buffer but not freeing the old ones will leak a bounded amount of memory per queue/deque. The benefit for these is they speed up every access dramatically and can change the asymptotics of multiprocessor solutions by never risking contention for a lock.
I've seen this approach used to great benefit for things with very clearly fixed counts such as per-CPU work-stealing deques, and to a much lesser degree in the buffer used to hold the singleton /proc/self/maps state in Hans Boehm's conservative garbage collector for C/C++, which is used to detect the root sets, etc.
While technically a leak, both of these cases are bounded in size and in the growable circular work stealing deque case there is a huge performance win in exchange for a bounded factor of 2 increase in the memory usage for the queues.
If you allocate a bunch of heap at the beginning of your program, and you don't free it when you exit, that is not a memory leak per se. A memory leak is when your program loops over a section of code, and that code allocates heap and then "loses track" of it without freeing it.
In fact, there is no need to make calls to free() or delete right before you exit. When the process exits, all of its memory is reclaimed by the OS (this is certainly the case with POSIX. On other OSes – particularly embedded ones – YMMV).
The only caution I'd have with not freeing the memory at exit time is that if you ever refactor your program so that it, for example, becomes a service that waits for input, does whatever your program does, then loops around to wait for another service call, then what you've coded can turn into a memory leak.
In this sort of question context is everything. Personally I can't stand leaks, and in my code I go to great lengths to fix them if they crop up, but it is not always worth it to fix a leak, and when people are paying me by the hour I have on occasion told them it was not worth my fee for me to fix a leak in their code. Let me give you an example:
I was triaging a project, doing some perf work and fixing a lot of bugs. There was a leak during the applications initialization that I tracked down, and fully understood. Fixing it properly would have required a day or so refactoring a piece of otherwise functional code. I could have done something hacky (like stuffing the value into a global and grabbing it some point I know it was no longer in use to free), but that would have just caused more confusion to the next guy who had to touch the code.
Personally I would not have written the code that way in the first place, but most of us don't get to always work on pristine well designed codebases, and sometimes you have to look at these things pragmatically. The amount of time it would have taken me to fix that 150 byte leak could instead be spent making algorithmic improvements that shaved off megabytes of ram.
Ultimately, I decided that leaking 150 bytes for an app that used around a gig of ram and ran on a dedicated machine was not worth fixing it, so I wrote a comment saying that it was leaked, what needed to be changed in order to fix it, and why it was not worth it at the time.
this is so domain-specific that its hardly worth answering. use your freaking head.
space shuttle operating system: nope, no memory leaks allowed
rapid development proof-of-concept code: fixing all those memory leaks is a waste of time.
and there is a spectrum of intermediate situations.
the opportunity cost ($$$) of delaying a product release to fix all but the worst memory leaks is usually dwarfs any feelings of being "sloppy or unprofessional". Your boss pays you to make him money, not to get a warm, fuzzy feelings.
You have to first realize that there's a big difference between a perceived memory leak and an actual memory leak. Very frequently analysis tools will report many red herrings, and label something as having been leaked (memory or resources such as handles etc) where it actually isn't. Often times this is due to the analysis tool's architecture. For example, certain analysis tools will report run time objects as memory leaks because it never sees those object freed. But the deallocation occurs in the runtime's shutdown code, which the analysis tool might not be able to see.
With that said, there will still be times when you will have actual memory leaks that are either very difficult to find or very difficult to fix. So now the question becomes is it ever OK to leave them in the code?
The ideal answer is, "no, never." A more pragmatic answer may be "no, almost never." Very often in real life you have limited number of resources and time to resolve and endless list of tasks. When one of the tasks is eliminating memory leaks, the law of diminishing returns very often comes in to play. You could eliminate say 98% of all memory leaks in an application in a week, but the remaining 2% might take months. In some cases it might even be impossible to eliminate certain leaks because of the application's architecture without a major refactoring of code. You have to weigh the costs and benefits of eliminating the remaining 2%.
While most answers concentrate on real memory leaks (which are not OK ever, because they are a sign of sloppy coding), this part of the question appears more interesting to me:
What if you allocate some memory and use it until the very last line of code in your application (for example, a global object's deconstructor)? As long as the memory consumption doesn't grow over time, is it OK to trust the OS to free your memory for you when your application terminates (on Windows, Mac, and Linux)? Would you even consider this a real memory leak if the memory was being used continuously until it was freed by the OS.
If the associated memory is used, you cannot free it before the program ends. Whether the free is done by the program exit or by the OS does not matter. As long as this is documented, so that change don't introduce real memory leaks, and as long as there is no C++ destructor or C cleanup function involved in the picture. A not-closed file might be revealed through a leaked FILE object, but a missing fclose() might also cause the buffer not to be flushed.
So, back to the original case, it is IMHO perfectly OK in itself, so much that Valgrind, one of the most powerful leak detectors, will treat such leaks only if requested. On Valgrind, when you overwrite a pointer without freeing it beforehand, it gets considered as a memory leak, because it is more likely to happen again and to cause the heap to grow endlessly.
Then, there are not nfreed memory blocks which are still reachable. One could make sure to free all of them at the exit, but that is just a waste of time in itself. The point is if they could be freed before. Lowering memory consumption is useful in any case.
Even if you are sure that your 'known' memory leak will not cause havoc, don't do it. At best, it will pave a way for you to make a similar and probably more critical mistake at a different time and place.
For me, asking this is like questioning "Can I break the red light at 3 AM in the morning when no one is around?". Well sure, it may not cause any trouble at that time but it will provide a lever for you to do the same in rush hour!
No, you should not have leaks that the OS will clean for you. The reason (not mentioned in the answers above as far as I could check) is that you never know when your main() will be re-used as a function/module in another program. If your main() gets to be a frequently-called function in another persons' software - this software will have a memory leak that eats memory over time.
KIV
I agree with vfilby – it depends. In Windows, we treat memory leaks as relatively serous bugs. But, it very much depends on the component.
For example, memory leaks are not very serious for components that run rarely, and for limited periods of time. These components run, do theire work, then exit. When they exit all their memory is freed implicitly.
However, memory leaks in services or other long run components (like the shell) are very serious. The reason is that these bugs 'steal' memory over time. The only way to recover this is to restart the components. Most people don't know how to restart a service or the shell – so if their system performance suffers, they just reboot.
So, if you have a leak – evaluate its impact two ways
To your software and your user's experience.
To the system (and the user) in terms of being frugal with system resources.
Impact of the fix on maintenance and reliability.
Likelihood of causing a regression somewhere else.
Foredecker
I'm surprised to see so many incorrect definitions of what a memory leak actually is. Without a concrete definition, a discussion on whether it's a bad thing or not will go nowhere.
As some commentors have rightly pointed out, a memory leak only happens when memory allocated by a process goes out of scope to the extent that the process is no longer able to reference or delete it.
A process which is grabbing more and more memory is not necessarily leaking. So long as it is able to reference and deallocate that memory, then it remains under the explicit control of the process and has not leaked. The process may well be badly designed, especially in the context of a system where memory is limited, but this is not the same as a leak. Conversely, losing scope of, say, a 32 byte buffer is still a leak, even though the amount of memory leaked is small. If you think this is insignificant, wait until someone wraps an algorithm around your library call and calls it 10,000 times.
I see no reason whatsoever to allow leaks in your own code, however small. Modern programming languages such as C and C++ go to great lengths to help programmers prevent such leaks and there is rarely a good argument not to adopt good programming techniques - especially when coupled with specific language facilities - to prevent leaks.
As regards existing or third party code, where your control over quality or ability to make a change may be highly limited, depending on the severity of the leak, you may be forced to accept or take mitigating action such as restarting your process regularly to reduce the effect of the leak.
It may not be possible to change or replace the existing (leaking) code, and therefore you may be bound to accept it. However, this is not the same as declaring that it's OK.
I guess it's fine if you're writing a program meant to leak memory (i.e. to test the impact of memory leaks on system performance).
Its really not a leak if its intentional and its not a problem unless its a significant amount of memory, or could grow to be a significant amount of memory. Its fairly common to not cleanup global allocations during the lifetime of a program. If the leak is in a server or long running app, grows over time, then its a problem.
I think you've answered your own question. The biggest drawback is how they interfere with the memory leak detection tools, but I think that drawback is a HUGE drawback for certain types of applications.
I work with legacy server applications that are supposed to be rock solid but they have leaks and the globals DO get in the way of the memory detection tools. It's a big deal.
In the book "Collapse" by Jared Diamond, the author wonders about what the guy was thinking who cut down the last tree on Easter Island, the tree he would have needed in order to build a canoe to get off the island. I wonder about the day many years ago when that first global was added to our codebase. THAT was the day it should have been caught.
I see the same problem as all scenario questions like this: What happens when the program changes, and suddenly that little memory leak is being called ten million times and the end of your program is in a different place so it does matter? If it's in a library then log a bug with the library maintainers, don't put a leak into your own code.
I'll answer no.
In theory, the operating system will clean up after you if you leave a mess (now that's just rude, but since computers don't have feelings it might be acceptable). But you can't anticipate every possible situation that might occur when your program is run. Therefore (unless you are able to conduct a formal proof of some behaviour), creating memory leaks is just irresponsible and sloppy from a professional point of view.
If a third-party component leaks memory, this is a very strong argument against using it, not only because of the imminent effect but also because it shows that the programmers work sloppily and that this might also impact other metrics. Now, when considering legacy systems this is difficult (consider web browsing components: to my knowledge, they all leak memory) but it should be the norm.
Historically, it did matter on some operating systems under some edge cases. These edge cases could exist in the future.
Here's an example, on SunOS in the Sun 3 era, there was an issue if a process used exec (or more traditionally fork and then exec), the subsequent new process would inherit the same memory footprint as the parent and it could not be shrunk. If a parent process allocated 1/2 gig of memory and didn't free it before calling exec, the child process would start using that same 1/2 gig (even though it wasn't allocated). This behavior was best exhibited by SunTools (their default windowing system), which was a memory hog. Every app that it spawned was created via fork/exec and inherited SunTools footprint, quickly filling up swap space.
This was already discussed ad nauseam. Bottom line is that a memory leak is a bug and must be fixed. If a third party library leaks memory, it makes one wonder what else is wrong with it, no? If you were building a car, would you use an engine that is occasionally leaking oil? After all, somebody else made the engine, so it's not your fault and you can't fix it, right?
Generally a memory leak in a stand alone application is not fatal, as it gets cleaned up when the program exits.
What do you do for Server programs that are designed so they don't exit?
If you are the kind of programmer that does not design and implement code where the resources are allocated and released correctly, then I don't want anything to do with you or your code. If you don't care to clean up your leaked memory, what about your locks? Do you leave them hanging out there too? Do you leave little turds of temporary files laying around in various directories?
Leak that memory and let the program clean it up? No. Absolutely not. It's a bad habit, that leads to bugs, bugs, and more bugs.
Clean up after yourself. Yo momma don't work here no more.
As a general rule, if you've got memory leaks that you feel you can't avoid, then you need to think harder about object ownership.
But to your question, my answer in a nutshell is In production code, yes. During development, no. This might seem backwards, but here's my reasoning:
In the situation you describe, where the memory is held until the end of the program, it's perfectly okay to not release it. Once your process exits, the OS will clean up anyway. In fact, it might make the user's experience better: In a game I've worked on, the programmers thought it would be cleaner to free all the memory before exiting, causing the shutdown of the program to take up to half a minute! A quick change that just called exit() instead made the process disappear immediately, and put the user back to the desktop where he wanted to be.
However, you're right about the debugging tools: They'll throw a fit, and all the false positives might make finding your real memory leaks a pain. And because of that, always write debugging code that frees the memory, and disable it when you ship.

Heap corruption under Win32; how to locate?

I'm working on a multithreaded C++ application that is corrupting the heap. The usual tools to locate this corruption seem to be inapplicable. Old builds (18 months old) of the source code exhibit the same behaviour as the most recent release, so this has been around for a long time and just wasn't noticed; on the downside, source deltas can't be used to identify when the bug was introduced - there are a lot of code changes in the repository.
The prompt for crashing behaviuor is to generate throughput in this system - socket transfer of data which is munged into an internal representation. I have a set of test data that will periodically cause the app to exception (various places, various causes - including heap alloc failing, thus: heap corruption).
The behaviour seems related to CPU power or memory bandwidth; the more of each the machine has, the easier it is to crash. Disabling a hyper-threading core or a dual-core core reduces the rate of (but does not eliminate) corruption. This suggests a timing related issue.
Now here's the rub:
When it's run under a lightweight debug environment (say Visual Studio 98 / AKA MSVC6) the heap corruption is reasonably easy to reproduce - ten or fifteen minutes pass before something fails horrendously and exceptions, like an alloc; when running under a sophisticated debug environment (Rational Purify, VS2008/MSVC9 or even Microsoft Application Verifier) the system becomes memory-speed bound and doesn't crash (Memory-bound: CPU is not getting above 50%, disk light is not on, the program's going as fast it can, box consuming 1.3G of 2G of RAM). So, I've got a choice between being able to reproduce the problem (but not identify the cause) or being able to idenify the cause or a problem I can't reproduce.
My current best guesses as to where to next is:
Get an insanely grunty box (to replace the current dev box: 2Gb RAM in an E6550 Core2 Duo); this will make it possible to repro the crash causing mis-behaviour when running under a powerful debug environment; or
Rewrite operators new and delete to use VirtualAlloc and VirtualProtect to mark memory as read-only as soon as it's done with. Run under MSVC6 and have the OS catch the bad-guy who's writing to freed memory. Yes, this is a sign of desperation: who the hell rewrites new and delete?! I wonder if this is going to make it as slow as under Purify et al.
And, no: Shipping with Purify instrumentation built in is not an option.
A colleague just walked past and asked "Stack Overflow? Are we getting stack overflows now?!?"
And now, the question: How do I locate the heap corruptor?
Update: balancing new[] and delete[] seems to have gotten a long way towards solving the problem. Instead of 15mins, the app now goes about two hours before crashing. Not there yet. Any further suggestions? The heap corruption persists.
Update: a release build under Visual Studio 2008 seems dramatically better; current suspicion rests on the STL implementation that ships with VS98.
Reproduce the problem. Dr Watson will produce a dump that might be helpful in further analysis.
I'll take a note of that, but I'm concerned that Dr Watson will only be tripped up after the fact, not when the heap is getting stomped on.
Another try might be using WinDebug as a debugging tool which is quite powerful being at the same time also lightweight.
Got that going at the moment, again: not much help until something goes wrong. I want to catch the vandal in the act.
Maybe these tools will allow you at least to narrow the problem to certain component.
I don't hold much hope, but desperate times call for...
And are you sure that all the components of the project have correct runtime library settings (C/C++ tab, Code Generation category in VS 6.0 project settings)?
No I'm not, and I'll spend a couple of hours tomorrow going through the workspace (58 projects in it) and checking they're all compiling and linking with the appropriate flags.
Update: This took 30 seconds. Select all projects in the Settings dialog, unselect until you find the project(s) that don't have the right settings (they all had the right settings).
My first choice would be a dedicated heap tool such as pageheap.exe.
Rewriting new and delete might be useful, but that doesn't catch the allocs committed by lower-level code. If this is what you want, better to Detour the low-level alloc APIs using Microsoft Detours.
Also sanity checks such as: verify your run-time libraries match (release vs. debug, multi-threaded vs. single-threaded, dll vs. static lib), look for bad deletes (eg, delete where delete [] should have been used), make sure you're not mixing and matching your allocs.
Also try selectively turning off threads and see when/if the problem goes away.
What does the call stack etc look like at the time of the first exception?
I have same problems in my work (we also use VC6 sometimes). And there is no easy solution for it. I have only some hints:
Try with automatic crash dumps on production machine (see Process Dumper). My experience says Dr. Watson is not perfect for dumping.
Remove all catch(...) from your code. They often hide serious memory exceptions.
Check Advanced Windows Debugging - there are lots of great tips for problems like yours. I recomend this with all my heart.
If you use STL try STLPort and checked builds. Invalid iterator are hell.
Good luck. Problems like yours take us months to solve. Be ready for this...
We've had pretty good luck by writing our own malloc and free functions. In production, they just call the standard malloc and free, but in debug, they can do whatever you want. We also have a simple base class that does nothing but override the new and delete operators to use these functions, then any class you write can simply inherit from that class. If you have a ton of code, it may be a big job to replace calls to malloc and free to the new malloc and free (don't forget realloc!), but in the long run it's very helpful.
In Steve Maguire's book Writing Solid Code (highly recommended), there are examples of debug stuff that you can do in these routines, like:
Keep track of allocations to find leaks
Allocate more memory than necessary and put markers at the beginning and end of memory -- during the free routine, you can ensure these markers are still there
memset the memory with a marker on allocation (to find usage of uninitialized memory) and on free (to find usage of free'd memory)
Another good idea is to never use things like strcpy, strcat, or sprintf -- always use strncpy, strncat, and snprintf. We've written our own versions of these as well, to make sure we don't write off the end of a buffer, and these have caught lots of problems too.
Run the original application with ADplus -crash -pn appnename.exe
When the memory issue pops-up you will get a nice big dump.
You can analyze the dump to figure what memory location was corrupted.
If you are lucky the overwrite memory is a unique string you can figure out where it came from. If you are not lucky, you will need to dig into win32 heap and figure what was the orignal memory characteristics. (heap -x might help)
After you know what was messed-up, you can narrow appverifier usage with special heap settings. i.e. you can specify what DLL you monitor, or what allocation size to monitor.
Hopefully this will speedup the monitoring enough to catch the culprit.
In my experience, I never needed full heap verifier mode, but I spent a lot of time analyzing the crash dump(s) and browsing sources.
P.S:
You can use DebugDiag to analyze the dumps.
It can point out the DLL owning the corrupted heap, and give you other usefull details.
You should attack this problem with both runtime and static analysis.
For static analysis consider compiling with PREfast (cl.exe /analyze). It detects mismatched delete and delete[], buffer overruns and a host of other problems. Be prepared, though, to wade through many kilobytes of L6 warning, especially if your project still has L4 not fixed.
PREfast is available with Visual Studio Team System and, apparently, as part of Windows SDK.
Is this in low memory conditions? If so it might be that new is returning NULL rather than throwing std::bad_alloc. Older VC++ compilers didn't properly implement this. There is an article about Legacy memory allocation failures crashing STL apps built with VC6.
The apparent randomness of the memory corruption sounds very much like a thread synchronization issue - a bug is reproduced depending on machine speed. If objects (chuncks of memory) are shared among threads and synchronization (critical section, mutex, semaphore, other) primitives are not on per-class (per-object, per-class) basis, then it is possible to come to a situation where class (chunk of memory) is deleted / freed while in use, or used after deleted / freed.
As a test for that, you could add synchronization primitives to each class and method. This will make your code slower because many objects will have to wait for each other, but if this eliminates the heap corruption, your heap-corruption problem will become a code optimization one.
You tried old builds, but is there a reason you can't keep going further back in the repository history and seeing exactly when the bug was introduced?
Otherwise, I would suggest adding simple logging of some kind to help track down the problem, though I am at a loss of what specifically you might want to log.
If you can find out what exactly CAN cause this problem, via google and documentation of the exceptions you are getting, maybe that will give further insight on what to look for in the code.
My first action would be as follows:
Build the binaries in "Release" version but creating debug info file (you will find this possibility in project settings).
Use Dr Watson as a defualt debugger (DrWtsn32 -I) on a machine on which you want to reproduce the problem.
Repdroduce the problem. Dr Watson will produce a dump that might be helpful in further analysis.
Another try might be using WinDebug as a debugging tool which is quite powerful being at the same time also lightweight.
Maybe these tools will allow you at least to narrow the problem to certain component.
And are you sure that all the components of the project have correct runtime library settings (C/C++ tab, Code Generation category in VS 6.0 project settings)?
So from the limited information you have, this can be a combination of one or more things:
Bad heap usage, i.e., double frees, read after free, write after free, setting the HEAP_NO_SERIALIZE flag with allocs and frees from multiple threads on the same heap
Out of memory
Bad code (i.e., buffer overflows, buffer underflows, etc.)
"Timing" issues
If it's at all the first two but not the last, you should have caught it by now with either pageheap.exe.
Which most likely means it is due to how the code is accessing shared memory. Unfortunately, tracking that down is going to be rather painful. Unsynchronized access to shared memory often manifests as weird "timing" issues. Things like not using acquire/release semantics for synchronizing access to shared memory with a flag, not using locks appropriately, etc.
At the very least, it would help to be able to track allocations somehow, as was suggested earlier. At least then you can view what actually happened up until the heap corruption and attempt to diagnose from that.
Also, if you can easily redirect allocations to multiple heaps, you might want to try that to see if that either fixes the problem or results in more reproduceable buggy behavior.
When you were testing with VS2008, did you run with HeapVerifier with Conserve Memory set to Yes? That might reduce the performance impact of the heap allocator. (Plus, you have to run with it Debug->Start with Application Verifier, but you may already know that.)
You can also try debugging with Windbg and various uses of the !heap command.
MSN
Graeme's suggestion of custom malloc/free is a good idea. See if you can characterize some pattern about the corruption to give you a handle to leverage.
For example, if it is always in a block of the same size (say 64 bytes) then change your malloc/free pair to always allocate 64 byte chunks in their own page. When you free a 64 byte chunk then set the memory protection bits on that page to prevent reads and wites (using VirtualQuery). Then anyone attempting to access this memory will generate an exception rather than corrupting the heap.
This does assume that the number of outstanding 64 byte chunks is only moderate or you have a lot of memory to burn in the box!
If you choose to rewrite new/delete, I have done this and have simple source code at:
http://gandolf.homelinux.org/~smhanov/blog/?id=10
This catches memory leaks and also inserts guard data before and after the memory block to capture heap corruption. You can just integrate with it by putting #include "debug.h" at the top of every CPP file, and defining DEBUG and DEBUG_MEM.
The little time I had to solve a similar problem.
If the problem still exists I suggest you do this :
Monitor all calls to new/delete and malloc/calloc/realloc/free.
I make single DLL exporting a function for register all calls. This function receive parameter for identifying your code source, pointer to allocated area and type of call saving this information in a table.
All allocated/freed pair is eliminated. At the end or after you need you make a call to an other function for create report for left data.
With this you can identify wrong calls (new/free or malloc/delete) or missing.
If have any case of buffer overwritten in your code the information saved can be wrong but each test may detect/discover/include a solution of failure identified. Many runs to help identify the errors.
Good luck.
Do you think this is a race condition? Are multiple threads sharing one heap? Can you give each thread a private heap with HeapCreate, then they can run fast with HEAP_NO_SERIALIZE. Otherwise, a heap should be thread safe, if you're using the multi-threaded version of the system libraries.
A couple of suggestions. You mention the copious warnings at W4 - I would suggest taking the time to fix your code to compile cleanly at warning level 4 - this will go a long way to preventing subtle hard to find bugs.
Second - for the /analyze switch - it does indeed generate copious warnings. To use this switch in my own project, what I did was to create a new header file that used #pragma warning to turn off all the additional warnings generated by /analyze. Then further down in the file, I turn on only those warnings I care about. Then use the /FI compiler switch to force this header file to be included first in all your compilation units. This should allow you to use the /analyze switch while controling the output