Memory leak in multi-threaded C++ application on Linux

Memory leak in multi-threaded C++ application on Linux - c++

We have a big multi-threaded C++ application running on Linux. We see that occupied by the application memory grows fast and believe there are some leaks. We have tried every tool we have (valgrind, DynLeak, Purify) but did not find anything. Since this application can run on Windows, we have also tried Bounds Checker. Did not help, too.
We need a new tool that can help. I've looked at Google Perfomrance Tools, MMGR by Paul Nettle, MemCheck Deluxe. None of them impressed me.
Is there anywhere a good tool for this task?

The definition of a memory leak in C/C++ is very specific: it is memory that has been allocated and then the pointer was overwritten or otherwise lost. Valgrind generally detects such cases out of the box, but things are not always that simple.
Your application could very well be still using that memory. In that case you might have what a Java programmer would consider a leak, e.g. entering data in a structure and rarely (or never) removing entries.
You might be measuring the memory usage of your memory incorrectly. On Linux memory usage measurements are not as straight-forward as they seem. How have you measured your memory usage?
You should consider using the application hooks (Valgrind calls them client requests) of whatever memory analysis tool your are using, to avoid the issue with reports only being issued at program termination. Using those hooks might help you pin-point the location of your leak.
You should try using a heap profiler, such as massif from Valgrind, to look for memory allocation locations with inordinate amounts of allocated memory.
Make sure you are not using a custom allocator or garbage collector in your application. As far as I know, no memory analysis tool will work with a custom allocator without user interference.
If your memory leak is massive enough to be detectable within an acceptable amount of application run-time, you could try a binary search of old revisions through your version control system to identify the commit that introduced the problem. At least Mercurial
and Git offer built-in support for this task.

If by "did not help" you mean it did not report memory leaks, it is quite possible you don't have one and just use more and more memory that is still referenced by pointers and can be deleted.

To help you debug the problem, perhaps in your logging, you should also write memory size, number of objects (their type) and a few other stats which are useful to you. At least until you become more familiar with the tools you mentioned.

Related

Find out where memory is consumed

I have this relatively large numerical application code that may run for a few days and eventually spit out some numbers. The whole thing is written in C++, making use of a bunch of 3rd-party libraries, and compiled using GCC 4.6. The code uses shared pointers throughout.
Unfortunately, over time, the memory consumption of the code increases until all of the (shared) memory is used up, then crashes. Algorithmically, the code shouldn't build up memory over time, so there'll be a bug somewhere.
I did run a small example through valgrind's leak checker which reports that all should be fine. My thought was that shared pointers might unintentionally be created someplace, preventing from unneeded data from being freed along the process (but this is just a guess).
At the end of the day, I'm running out of ideas how to debug such a thing.
Any ideas?

Since you already have the valgrind toolsuite available, I would advise you to run the massif tool.
Massif will track the memory allocation origins and the report will indicate you how many bytes each allocation site/function created. This will help you understand where that memory blow-up comes from.

GNU libstdc++ defaults to caching STL-related memory allocations, apparently for microbenchmark speed reasons. However, the actual effect tends to be quite negative for both speed and memory footprint when using allocators such as tcmalloc and jemalloc. tcmalloc disables this behavior by setting GLIBCPP_FORCE_NEW=1 and GLIBCXX_FORCE_NEW=1 in the environment (for libstdc++ versions 3.3 and 3.4, respectively), but I know of no other allocator that does so. Therefore it generally a good idea to set the appropriate environment variable when launching your application.

Even if you have no leaks, you could face memory fragmentation.
If you are on Linux, I suggest to try jemalloc allocator. It runs great on Linux. It runs on many architectures, I used it successfully even on zLinux (on IBM zSeries mainframe). It's really easy to use - you don't even need to rebuild your application or any libraries, just build jemalloc and start your application with LD_PRELOAD set like this: LD_PRELOAD=/usr/lib/libjemalloc.so <app>

How to find memory leaks in source code

If it is known that an application leaks memory (when executed), what are the various ways to locate such memory leak bugs in the source code of the application.
I know of certain parsers/tools (which probably do static analysis of the code) which can be used here but are there any other ways/techniques to do that, specific to the language (C/C++)/platform?

compile Your code with -g flag
Download valgrind (if You work on Linux) an run it with --leak-check=yes option
I thinkt that valgrind is the best tool for this task.
For Windows: See this topic: Is there a good Valgrind substitute for Windows?

There's valgrind and probably other great tools out there.
But I'll tell you what I do, that works very well for me, given that many times I code in environments where you can't run valgrind:
Be sure to pair each allocation with a deallocation. I always count news or mallocs and search for the delete or free.
If in C++ and using exceptions, try to put them paired on constructors/destructors. If you like risk, or can't put them in Ctor/dtor, be sure no exception can make the program flow not to execute the deallocation.
Use of smart pointers and ptr containers.
One can monitor alloc/dealloc rewriting new or installing a malloc handler. At some point, if the code runs continuously it can be obvious if the memory usage becomes stationary and doesn't grow without bounds which would be the worst case of leak.
Be careful with containers that never shrink such as vectors. There are tricks to shrink them swapping them with an empty container.

There are two general techniques for memory leak detection, dynamic and static analysis.
In dynamic analysis, you run the code and a tool analyzes the run to see what memory has leaked at the end. Dynamic analysis tends to be highly accurate but will only correctly analyze that specific executions you do within your tool. So, if some of your leaks that only happens for certain input and you don't have a test that uses that input, dynamic analysis will not detect those leaks.
Static analysis analyzes the source code to create all possible code paths and see if a leak can occur in any of them. While static analysis is pretty good right now, it's not perfect - you can not only get false negatives (the analysis misses leaks), you can also get false positives (the tool claims you have a leak when there actually isn't one).
There are many dynamic analysis tools including such well known tools as Valgrind (open source but limited to x86 Linux and Mac) and Purify (commercial but also available for Windows, Solaris and AIX). Wikipedia has a decent list of some other dynamic analysis tools as well.
On the static analysis side, the only tool I've thought worthwhile is Coverity (commerical). Once again, Wikipedia has a list of many other static analysis tools.

Purify will do a seemingly miraculous job of doing this
Not only memory leaks, but many other kinds of memory errors.
It works by instrumenting your machine code in real time, so you don't need source or need to compile with any particular options.
Just instrument your code with Purify (simplest way to do this: CC="purify cc" make), run your program, and get a nice gui that will show your leaks and other errors.
Available for Windows, Linux, and various flavors of Unix. There's a free trial download available.
http://www.ibm.com/software/awdtools/purify

If you utilize smart pointers and keep a table of them, then you can analyze it to tell what memory you are still using. Either provide a window to view it, or more commonly, stream to a log before the program terminates.

As far as doing it manually is concerned I don't think there are any established practices. To go over the code with a fine-toothed comb, looking for news (allocs) without corresponding deletes (frees), is all that is there to it.

You can also use purify for detection of memory leak.

There aren't very many general purpose guidelines for finding memory leaks. Fortunately, there's one simple guideline for preventing most leaks, both of memory and of other resources: use RAII (Resource Acquisition Is Initialization), and they just won't happen to start with. The name is a lousy description, but if you Google on it, you should get quite a few useful hits.

Personally, I would recommend that you wrap all variables which you need to allocate/deallocate memory with the clone_ptrclass which performs all the deallocation of memory for you when it is no longer needed. Thus, you do not have to use delete. It is quite similar to auto_ptr. The major difference is that you do not have to deal with the tricky ownership transfer part. More information and code on clone_ptr can be found here.

how to find allocated memory in linux

Good afternoon all,
What I'm trying to accomplish: I'd like to implement an extension to a C++ unit test fixture to detect if the test allocates memory and doesn't free it. My idea was to record allocation levels or free memory levels before and after the test. If they don't match then you're leaking memory.
What I've tried so far: I've written a routine to read /proc/self/stat to get the vm size and resident set size. Resident set size seems like what I need but it's obviously not right. It changes between successive calls to the function with no memory allocation. I believe it's returning the cached memory used not what's allocated. It also changes in 4k increments so it's too coarse to be of any real use.
I can get the stack size by allocating a local and saving it's address. Are there any problems with doing this?
Is there a way to get real free or allocated memory on linux?
Thanks

Your best bet may actually be to use a tool specifically designed for the job of finding memory leaks. I have personal experience with Electric Fence, which is easy to use and seems to do the job nicely (not sure how well it will handle C++). Also recommended by others is Dmalloc.
For sure though, everyone seems to like Valgrind, which can do just about anything and even has front-ends (though anything that has a front-end built for it means that it probably isn't the simplest thing in the world). If the KDE folks can recommend it, it must be able to handle just about anything. (I'm not saying anything bad about KDE, just that it is a very large C++ codebase, so if Valgrind can handle KDE software, it must have something going for it. Though I don't have personal experience with it as Electric Fence was always enough for me)

I'd have to agree with those suggesting Valgrind and similar, but if the run-time overhead is too great, one option may be to use mallinfo() call to retrieve statistics on currently allocated memory, and check whether uordblks is nonzero.
Note that this will have to be run before global destructors are called - so if you have any allocations that are cleaned up there, this will register a false positive. It also won't tell you where the allocation is made - but it's a good first pass to figure out which test cases need work.

don't look a the OS to get allocation info. the C library manages memory internally, and only asks the OS for more RAM in chunks (4KB in your case). In most cases, it's never released to back to the OS, so you can't really check anything there.
You'll have to patch malloc() and free() to get the info you need.
Or, use Valgrind.

Not a direct answer but you could re-define the ::new and ::delete operators, and internally either via a singleton or global objects, keep track of the allocated, and de-allocated memory.
Edit: If this is a personal, DIY project then cool. But if its for something critical you can always jump onto one of the many leak detection libraries/programs available, a quick google search should suffice.

google-perftools can be used in your test code.

Is there a way to monitor heap usage in C++/MacOS?

I fear that some of my code is causing memory leaks, and I'm not sure about how to check it. Is there a tool or something for MacOS X?
Thank you

Yes - there's an application called MallocDebug which is installed as part of the Xcode package.
You can find it in the /Developer/Applications/Performance Tools folder.

Apple has a good description of how to use MallocDebug on OS X on their developer pages.
document on finding leaks in general
enabling debug features of malloc in particular.

Of course UNIX provides a quick and dirty way of detecting memory leaks... top.
Launch your app and watch the system memory allocated to your process over time. If it continually grows when it shouldn't then there is likely a memory leak. At which point you break out Valgrind or use MallocDebug, etc.
Of course if you use smart pointers and/or RAII, then you shouldn't have memory leaks in your code, right? ;)))

THE BEST tool PERIOD for memory errors, leaks, etc. is Valgrind. Get started here. You dont need to do anything special in your code and this will report where the memory was allocated (with a full stack trace, even in C). Also, it'll detect writes to freed memory, uninitialized memory usage, and much more.

Reducing memory footprint of large unfamiliar codebase

Suppose you have a fairly large (~2.2 MLOC), fairly old (started more than 10 years ago) Windows desktop application in C/C++. About 10% of modules are external and don't have sources, only debug symbols.
How would you go about reducing application's memory footprint in half? At least, what would you do to find out where memory is consumed?

Override malloc()/free() and new()/delete() with wrappers that keep track of how big the allocations are and (by recording the callstack and later resolving it against the symbol table) where they are made from. On shutdown, have your wrapper display any memory still allocated.
This should enable you both to work out where the largest allocations are and to catch any leaks.

this is description/skeleton of memory tracing application I used to reduce memory consumption of our game by 20%. It helped me to track many allocations done by external modules.

It's not an easy task. Begin by chasing down any memory leaks you cand find (a good tool would be Rational Purify). Skim the source code and try to optimize data structures and/or algorithms.
Sorry if this may sound pessimistic, but cutting down memory usage by 50% doesn't sound realistic.

There is a chance is you can find some significant inefficiencies very fast. First you should check what is the memory used for. A tool which I have found very handy for this is Memory Validator
Once you have this "memory usage map", you can check for Low Hanging Fruit. Are there any data structures consuming a lot of memory which could be represented in a more compact form? This is often possible, esp. when the data access is well encapsulated and when you have a spare CPU power you can dedicate to compressing / decompressing them on each access.

I don't think your question is well posed.
The size of source code is not directly related to the memory footprint. Sure, the compiled code will occupy some memory but the application might will have memory requirements on it's own. Both static (the variables declared in the code) and dynamic (the object the application creates).
I would suggest you to profile program execution and study the code carefully.

First places to start for me would be:
Does the application do a lot of preallocation memory to be used later? Does this memory often sit around unused, never handed out? Consider switching to newing/deleting (or better use a smart_ptr) as needed.
Does the code use a static array such as
Object arrayOfObjs[MAX_THAT_WILL_EVER_BE_USED];
and hand out objs in this array? If so, consider manually managing this memory.

One of the tools for memory usage analysis is LeakDiag, available for free download from Microsoft. It apparently allows to hook all user-mode allocators down to VirtualAlloc and to dump process allocation snapshots to XML at any time. These snapshots then can be used to determine which call stacks allocate most memory and which call stacks are leaking. It lacks pretty frontend for snapshot analysis (unless you can get LDParser/LDGrapher via Microsoft Premier Support), but all the data is there.
One more thing to note is that you may have false leak positives from BSTR allocator due to caching, see "Hey, why am I leaking all my BSTR's?"

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js