Increase in memory footprint. False alarm or a memory leak? - c++

I have a graphics program where I am creating and destroying the same objects over and over again. All in all, there are 140 objects. They get deleted and newed such that the number never exceeds 140. This is a requirement as it is a stress test; that is, I cannot have a memory pool or dummy objects. Now I am fairly certain there aren't any memory leaks. I am also using a memory leak detector which is not reporting any leaks.
The problem is that the memory footprint of the program keeps increasing (albeit quite slowly, slower than the rate at which the objects are being destroyed/created). So my question then is whether an increasing memory footprint is a solid sign of a memory leak or can it sometimes be deceiving?
EDIT: I am using new/delete to create/destroy the objects

It does seem possible that this behavior could come from a situation in which there is no leak.
Is there any chance that your heap is getting fragmented?
Say you make lots of allocations of size n. You free them all, which makes your C library insert those buffers into a free list. Some other code path then makes allocations smaller than n, so those blocks in the free list get chunked up into smaller units. Then the next iteration of the loop does another batch of allocations of size n, and the free list no longer contains contiguous memory at that size, and malloc has to ask the kernel for more memory. Eventually those "smaller-than-n" allocations get freed as would your "n-sized" ones, but if you run enough iterations where the fragmentation exists, I could see the process gradually increasing its memory footprint.
One way to avoid this might be to allocate all your objects once, and not keep allocating/freeing them. Since you're using C++ this might necessitate placement new or something similar. Since you are using Windows, I might also mention that Win32 supports having multiple heaps in a process, so if your objects come from a different heap than other allocations you may avoid this.
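To make the "allocate once, then placement new" idea concrete, here is a minimal sketch. Sprite and FixedSlots are made-up names for illustration, not anything from the question, and the sketch assumes every object has the same type. The storage is reserved once; creating and destroying an object only runs its constructor and destructor, so the general-purpose heap is never touched inside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical stand-in for one of the 140 graphics objects.
struct Sprite {
    int id;
    explicit Sprite(int id_) : id(id_) {}
};

// Storage for N objects of type T, reserved once for the lifetime of the
// FixedSlots instance. create() and destroy() never call malloc or free.
template <typename T, std::size_t N>
class FixedSlots {
public:
    FixedSlots() : objects_() {}  // every slot starts empty (nullptr)
    ~FixedSlots() {
        for (std::size_t i = 0; i < N; ++i)
            if (objects_[i]) destroy(i);
    }
    FixedSlots(const FixedSlots&) = delete;
    FixedSlots& operator=(const FixedSlots&) = delete;

    // Constructs a T in slot i using placement new.
    template <typename... Args>
    T* create(std::size_t i, Args&&... args) {
        assert(i < N && objects_[i] == nullptr);
        objects_[i] = new (storage_ + i * sizeof(T)) T(std::forward<Args>(args)...);
        return objects_[i];
    }

    // Runs the destructor explicitly; the bytes stay reserved for reuse.
    void destroy(std::size_t i) {
        assert(i < N && objects_[i] != nullptr);
        objects_[i]->~T();
        objects_[i] = nullptr;
    }

private:
    alignas(T) unsigned char storage_[N * sizeof(T)];
    T* objects_[N];
};
```

With a FixedSlots<Sprite, 140>, destroying slot 0 and creating a new Sprite there yields an object at the same address as before. Whether this counts as a "memory pool" for the purposes of the stress test is for the asker to decide.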

It depends on whether you're running under the CLR (or another virtual machine with a garbage collector) or you're still in the old native world (C++, MFC, etc.).
When you have a GC around - you can't really tell, only if you test it long enough. GC can decide not to clean your objects for now... (there is a way to force it)
In the native applications, yes, a footprint increase might mean a leak.
There are some very good tools for C++ that find these leaks (search for DevPartner or BoundsChecker).
I guess there are some tools for C# and Java as well.

If your application's process footprint increases beyond a reasonable limit, which depends on your application and what it does, and continues to increase until eventually you (will) run out of virtual memory, you definitely have a memory leak.

Try memory allocation tests included in CRT: http://msdn.microsoft.com/en-us/library/e5ewb1h3%28VS.80%29.aspx
They help A LOT.
But I've noticed that apps do tend to vary their memory consumption a little if you look at some factors. Windows 7 might also create extra padding in memory allocation to fix bugs: http://msdn.microsoft.com/en-us/library/dd744764%28VS.85%29.aspx

I strongly suggest trying Visual Studio 2015 (the Community Edition is free). It comes with Diagnostic Tools that help you analyze memory usage; they let you take snapshots and view the heap.

Related

How to find "weak" memory leaks

In C++ programs, I have sometimes had problems with "weak" memory leaks. By that, I mean that some objects accumulate resources, but, eventually, these objects are destroyed properly and their memory is released, so that these leaks do not show up using the traditional memory debugging tools like valgrind or address sanitizers.
A typical example would be a poorly-written cache that keeps all the cached results from the beginning of the program. It grows forever, but its memory is reclaimed at the end of the program, when the cache is destroyed.
How can one debug this? Are there tools available to see where the largest objects allocated by the program are? To dump the current state of allocated memory (including call stack)? To see which objects are growing? I'm using Linux, but I am interested in other platforms as well.
If other platforms are an option, I would recommend Visual Studio on Windows.
It has powerful profiling options, including one for memory usage.
https://learn.microsoft.com/en-us/visualstudio/profiling/memory-usage
While debugging you can take a snapshot to see where memory is being used.
You can also take memory usage snapshots at different times and compare them.
You can use a profiler like e.g. Intel VTune (this is available for Linux as well) to trace memory consumption of your application. In VTune, you can see the memory consumption over time, and select a time window to see where memory was allocated during that window.
It still will be difficult to detect such problems if your application allocates and deallocates a lot of memory correctly, and only a small fraction is deallocated too late. In that case you need to check a lot of allocations/deallocations before you can find the bad one(s).

Memory stability of a C++ application in Linux

I want to verify the memory stability of a C++ application I wrote and compiled for Linux.
It is a network application that responds to remote clients connecting at a rate of 10-20 connections per second.
In the long run, memory rose to 50MB, even though the app was making calls to delete...
Investigation shows that Linux does not immediately free memory. So here are my questions:
How can I force Linux to free memory I actually freed? At least I want to do this once to verify memory stability.
Otherwise, is there any reliable memory indicator that can report memory my app is actually holding?
What you are seeing is most likely not a memory leak at all. Operating systems and malloc/new heaps both do very complex accounting of memory these days. This is, in general, a very good thing. Chances are any attempt on your part to force the OS to free the memory will only hurt both your application performance and overall system performance.
To illustrate:
1) The Heap reserves several areas of virtual memory for use. None of it is actually committed (backed by physical memory) until malloc'd.
2) You allocate memory. The Heap grows accordingly. You see this in task manager.
3) You allocate more memory on the Heap. It grows more.
4) You free memory allocated in Step 2. The Heap cannot shrink, however, because the memory in #3 is still allocated, and Heaps are unable to compact memory (it would invalidate your pointers).
5) You malloc/new more stuff. This may get tacked on after memory allocated in step #3, because it cannot fit in the area left open by free'ing #2, or because it would be inefficient for the Heap manager to scour the heap for the block left open by #2. (depends on the Heap implementation and the chunk size of memory being allocated/free'd)
So is that memory at step #2 now dead to the world? Not necessarily. For one thing, it will probably get reused eventually, once it becomes efficient to do so. In cases where it isn't reused, the Operating System itself may be able to use the CPU's Virtual Memory features (the MMU and its page tables) to "remap" the unused memory right out from under your application, and assign it to another application -- on the fly. The Heap is aware of this and usually manages things in a way to help improve the OS's ability to remap pages.
These are valuable memory management techniques that have the unmitigated side effect of rendering fine-grained memory-leak detection via Process Explorer mostly useless. If you want to detect small memory leaks in the heap, then you'll need to use runtime heap leak-detection tools. Since you mentioned that you're able to build on Windows as well, I will note that Microsoft's CRT has adequate leak-checking tools built-in. Instructions for use found here:
http://msdn.microsoft.com/en-us/library/974tc9t1(v=vs.100).aspx
There are also open-source replacements for malloc available for use with GCC/Clang toolchains, though I have no direct experience with them. I think on Linux Valgrind is the preferred and more reliable method for leak-detection anyway. (and in my experience easier to use than MSVCRT Debug).
I would suggest using valgrind with memcheck tool or any other profiling tool for memory leaks
from Valgrind's page:
Memcheck
detects memory-management problems, and is aimed primarily at
C and C++ programs. When a program is run under Memcheck's
supervision, all reads and writes of memory are checked, and calls to
malloc/new/free/delete are intercepted. As a result, Memcheck can
detect if your program:
Accesses memory it shouldn't (areas not yet allocated, areas that have been freed, areas past the end of heap blocks, inaccessible areas
of the stack).
Uses uninitialised values in dangerous ways.
Leaks memory.
Does bad frees of heap blocks (double frees, mismatched frees).
Passes overlapping source and destination memory blocks to memcpy() and related functions.
Memcheck reports these errors as soon as they occur, giving the source
line number at which it occurred, and also a stack trace of the
functions called to reach that line. Memcheck tracks addressability at
the byte-level, and initialisation of values at the bit-level. As a
result, it can detect the use of single uninitialised bits, and does
not report spurious errors on bitfield operations. Memcheck runs
programs about 10--30x slower than normal.
Massif
Massif is a heap profiler. It performs detailed heap profiling by
taking regular snapshots of a program's heap. It produces a graph
showing heap usage over time, including information about which parts
of the program are responsible for the most memory allocations. The
graph is supplemented by a text or HTML file that includes more
information for determining where the most memory is being allocated.
Massif runs programs about 20x slower than normal.
Using valgrind is as simple as passing your application, together with its own switches, as arguments to valgrind:
valgrind --tool=memcheck ./myapplication -f foo -b bar
I very much doubt that anything beyond wrapping malloc and free [or new and delete ] with another function can actually get you anything other than very rough estimates.
One of the problems is that the memory that is freed can only be released if there is a long contiguous chunk of memory. What typically happens is that there are "little bits" of memory that are used all over the heap, and you can't find a large chunk that can be freed.
It's highly unlikely that you will be able to fix this in any simple way.
And by the way, your application is probably going to need those 50MB later on when you have more load again, so it's just wasted effort to free it.
(If the memory that you are not using is needed for something else, it will get swapped out, and pages that aren't touched for a long time are prime candidates, so if the system runs low on memory for some other tasks, it will still reuse the RAM in your machine for that space, so it's not sitting there wasted - it's just you can't use 'ps' or some such to figure out how much ram your program uses!)
As suggested in a comment: you can also write your own memory allocator, using mmap() to create a "chunk" to dole out portions from. If you have a section of code that does a lot of memory allocations, and ALL of those will definitely be freed later, you can allocate all of them from a separate lump of memory. When it has all been freed, you can put the mmap'd region back into a "free mmap list", and when the list is sufficiently large, free up some of the mmap allocations [this is an attempt to avoid calling mmap LOTS of times and then munmap again a few milliseconds later]. However, if you EVER let one of those memory allocations "escape" out of your fenced-in area, your application will probably crash (or worse, not crash, but use memory belonging to some other part of the application, and you get a very strange result somewhere, such as one user getting to see the network content meant for another user!)
Use valgrind to find memory leaks : valgrind ./your_application
It will list where you allocated memory and did not free it.
I don't think it's a linux problem, but in your application. If you monitor the memory usage with « top » you won't get very precise usages. Try using massif (a tool of valgrind) : valgrind --tool=massif ./your_application to know the real memory usage.
As a more general rule to avoid leaks in C++ : use smart pointers instead of normal pointers.
Also in many situations, you can use RAII (http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization) instead of allocating memory with "new".
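As a small illustration of the smart pointer and RAII advice, here is a sketch. Connection and handle_client are hypothetical names invented for the example; the live counter exists only so the effect can be observed. No delete appears anywhere, yet the object is destroyed on every exit path.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-client state for a network application.
struct Connection {
    static int live;           // number of Connection objects currently alive
    std::string peer;
    std::vector<char> buffer;  // RAII member: frees its storage in its destructor

    explicit Connection(std::string p) : peer(std::move(p)), buffer(4096) { ++live; }
    ~Connection() { --live; }
};
int Connection::live = 0;

// Handles one client. The unique_ptr destroys the Connection when the
// function returns, including on the early return and on exceptions.
bool handle_client(const std::string& peer) {
    std::unique_ptr<Connection> conn(new Connection(peer));
    if (conn->peer.empty())
        return false;          // early return: the Connection is still released
    conn->buffer[0] = 'x';
    return true;
}
```

After any number of calls to handle_client, Connection::live is back at zero, which is exactly the property that is easy to break with a raw new and a forgotten delete on one branch.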
It is not typical for an OS to release memory when you call free or delete. This memory goes back to the heap manager in the runtime library.
If you want to actually release memory, you can use brk. But that opens up a very large can of memory-management worms. If you directly call brk, you had better not call malloc. For C++, you can override new to use brk directly.
Not an easy task.
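If the goal is only to check memory stability once, a less invasive route than calling brk yourself may be to ask the allocator to give back what it can and then look at the resident size. The sketch below assumes Linux with glibc: malloc_trim is a glibc extension and /proc/self/statm is Linux-specific. The names resident_bytes and release_free_heap are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <malloc.h>  // malloc_trim (glibc extension, an assumption about the platform)
#include <unistd.h>  // sysconf

// Resident set size of the current process in bytes, read from
// /proc/self/statm (Linux only). Returns 0 if the file cannot be read.
std::size_t resident_bytes() {
    std::FILE* f = std::fopen("/proc/self/statm", "r");
    if (!f) return 0;
    unsigned long total_pages = 0, resident_pages = 0;
    int fields = std::fscanf(f, "%lu %lu", &total_pages, &resident_pages);
    std::fclose(f);
    if (fields != 2) return 0;
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    return static_cast<std::size_t>(resident_pages) * static_cast<std::size_t>(page_size);
}

// Asks glibc to hand free heap memory back to the kernel.
// Returns true if some memory was actually released.
bool release_free_heap() {
    return malloc_trim(0) == 1;
}
```

Calling release_free_heap() and then comparing resident_bytes() before and after a long run gives a rough idea of how much of the growth was just memory cached by the allocator. It cannot return memory that is pinned by fragmentation, so a residual increase is not proof of a leak.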
The latest dlmalloc() has a concept called an mspace (others call it a region). You can call malloc() and free() against an mspace. Or you can delete the mspace to free all memory allocated from the mspace at once. Deleting an mspace will free memory from the process.
If you create an mspace with a connection, allocate all memory for the connection from that mspace, and delete the mspace when the connection closes, you would have no process growth.
If you have a pointer in one mspace pointing to memory in another mspace, and you delete the second mspace, then as the language lawyers say "the results are undefined".
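To show the shape of the per-connection idea without depending on dlmalloc, here is a hand-rolled bump arena. It is not the mspace API, just a hypothetical stand-in (ConnectionArena is a made-up name) that assumes the data placed in it needs no destructor calls. Everything allocated from the arena is released in one step when the arena is destroyed. Whether that memory then leaves the process depends on the underlying allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// One malloc'd chunk per connection; individual allocations are never freed.
class ConnectionArena {
public:
    explicit ConnectionArena(std::size_t capacity)
        : base_(static_cast<unsigned char*>(std::malloc(capacity))),
          capacity_(capacity), used_(0) {
        if (!base_) throw std::bad_alloc();
    }
    // Destroying the arena releases every allocation made from it at once.
    ~ConnectionArena() { std::free(base_); }
    ConnectionArena(const ConnectionArena&) = delete;
    ConnectionArena& operator=(const ConnectionArena&) = delete;

    // Bump allocation. align must be a power of two no larger than
    // alignof(std::max_align_t). Returns nullptr when the chunk is exhausted.
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        assert(align <= alignof(std::max_align_t));
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return base_ + start;
    }

    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_;
};
```

The warning about pointers between mspaces applies here too: anything that still points into the arena after it is destroyed is dangling.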

Memory leak in c++

I am running my c++ application on an intel Xscale device. The problem is, when I run my application offtarget (Ubuntu) with Valgrind, it does not show any memory leaks.
But when I run it on the target system, it starts with 50K free memory, and reduces to 2K overnight. How to catch this kind of leakage, which is not being shown by Valgrind?
A common culprit with these small embedded devices is memory fragmentation. You might have free memory in your application between 2 objects. A common solution to this is the use of a dedicated allocator (operator new in C++) for the most common classes. Memory pools used purely for objects of size N don't fragment - the space between two objects will always be a multiple of N.
It might not be an actual memory leak, but maybe a situation of increasing memory usage. For example it could be allocating a continually increasing string:
std::string s;
for (int i = 0; i < n; i++)
    s += "a"; // s keeps growing; nothing is leaked, but usage only rises
50k isn't that much, maybe you should go over your source by hand and see what might be causing the issue.
This may be not a leak, but just the runtime heap not releasing memory to the operating system. This can also be fragmentation.
Possible ways to overcome this:
Split into two applications. The master application will have the simple logic with little or no dynamic memory usage. It will start the worker application to actually do work in such chunks that the worker application will not run out of memory and will restart that application periodically. This way memory is periodically returned to the operating system.
Write your own memory allocator. For example you can allocate a dedicated heap and only allocate memory from there, then free the dedicated heap entirely. This requires the operating system to support multiple heaps.
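The master/worker split can be sketched with fork() on a POSIX system. This is only an illustration under that assumption: run_in_worker is a made-up helper, and a real master would also need to pass results back (pipe, file, shared memory), which is left out here.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs work() in a freshly forked child and returns its exit code, or -1 if
// the fork failed or the child did not exit normally. Any heap growth or
// fragmentation caused by the work disappears together with the child.
template <typename Work>
int run_in_worker(Work work) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int code = work();
        _exit(code & 0xff);  // leave without running the parent's atexit handlers
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1;  // retry only when interrupted by a signal
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The master stays small and long-lived, while each chunk of work runs in a process whose memory is returned to the operating system in full when it exits.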
Also note that it's possible that your program runs differently on Ubuntu and on the target system and therefore different execution paths are taken and the code resulting in memory leaks is executed on the target system, but not on Ubuntu.
This does sound like fragmentation. Fragmentation is caused by allocating objects on the heap, say:
object1
object2
object3
object4
And then deleting some objects
object1
object3
object4
You now have a hole in the memory that is unused. If you allocate another object that's too big for the hole, the hole will remain wasted. Eventually with enough memory churn, you can end up with so many holes that they waste your memory.
The way around this is to try and decide your memory requirements up front. If you've got particular objects that you know you are creating many of, try and ensure they're the same size.
You can use a pool to make the allocations more efficient for a particular class... or at least let you track it better so you can understand what's going on and come up with a good solution.
One way of doing this is to create a single static:
struct Slot
{
    Slot() : free(true) {}
    bool free;              // true while the slot is unused
    unsigned char data[20]; // you'll need to tune the value 20 to what your program needs
};
Slot pool[500]; // you'll need to pick a good pool size too.
Create the pool up front when your program starts and pre-allocate it so that it is as big as the maximum requirements for your program. You may want to HeapAlloc it (or the equivalent in your OS) so that you can control when it appears from somewhere in your application startup.
Then override the new and delete operators for a suspect class so that they return slots from this vector. So, your objects will be stored in this vector.
You can override new and delete for classes of the same size to be put in this vector.
Create pools of different sizes for different objects.
Just go for the worst offenders at first.
I've done something like this before and it solved my problem on an embedded device. I also was using a lot of STL, so I created a custom allocator (google for stl custom allocator - there are loads of links). This was useful for records stored in a mini-database my program used.
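For what it's worth, overriding new and delete for one suspect class could look roughly like this. Particle, kSlotBytes and kSlotCount are invented for the sketch; the slot layout follows the Slot idea above, but with a "used" flag so that the zero-initialised static pool starts out entirely free. The linear scan is the simplest thing that works, not the fastest.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace {
const std::size_t kSlotBytes = 16;   // assumed upper bound on sizeof(Particle)
const std::size_t kSlotCount = 500;  // assumed peak number of live objects

struct Slot {
    bool used;  // static storage is zero-initialised, so every slot starts free
    alignas(std::max_align_t) unsigned char data[kSlotBytes];
};
Slot pool[kSlotCount];
}  // namespace

// Hypothetical "suspect class" that is created and destroyed very often.
class Particle {
public:
    int x = 0;
    int y = 0;

    // Class-specific allocation: hand out the first free slot of the pool.
    static void* operator new(std::size_t size) {
        assert(size <= kSlotBytes);
        for (std::size_t i = 0; i < kSlotCount; ++i) {
            if (!pool[i].used) {
                pool[i].used = true;
                return pool[i].data;
            }
        }
        throw std::bad_alloc();  // pool exhausted
    }

    // Class-specific deallocation: mark the matching slot as free again.
    static void operator delete(void* p) noexcept {
        if (!p) return;
        for (std::size_t i = 0; i < kSlotCount; ++i) {
            if (pool[i].data == p) {
                pool[i].used = false;
                return;
            }
        }
    }
};
static_assert(sizeof(Particle) <= kSlotBytes, "Particle must fit into one slot");

// Number of slots currently handed out; handy for tracking what is going on.
std::size_t slots_in_use() {
    std::size_t n = 0;
    for (std::size_t i = 0; i < kSlotCount; ++i)
        if (pool[i].used) ++n;
    return n;
}
```

Code that uses the class does not change: new Particle and delete p keep working, they just never reach the general heap. A counter like slots_in_use() also gives you the tracking mentioned above.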
If your memory usage goes down, I don't think it can be defined as a memory leak.
Where are you getting reports of memory usage? The system might just have put most of your program's memory use in virtual memory.
All I can add is that Valgrind is known to be pretty efficient at finding memory leaks!
Also, are you sure that when you profiled your code, the code coverage was enough to cover all the code paths which might be executed on the target platform?
Valgrind for sure does not lie. As has been pointed out, this might indeed be the runtime heap not releasing the memory, but I would think otherwise.
Are you using any sophisticated technique to track the scope of objects?
If yes, then Valgrind may not be smart enough, though you can try setting an XScale-related option with Valgrind.
Most applications show a pattern of memory use like this:
they use very little when they start
as they create data structures they use more and more
as they start deleting old data structures or reusing existing ones, they reach a steady state where memory use stays roughly constant
If your app is continuously increasing in size, you may have a leak. If it increases in size over a period and then reaches a relatively steady state, you probably don't.
You can use the massif tool from Valgrind, which will show you where the most memory is allocated and how it evolves over time.

Reducing memory footprint of large unfamiliar codebase

Suppose you have a fairly large (~2.2 MLOC), fairly old (started more than 10 years ago) Windows desktop application in C/C++. About 10% of modules are external and don't have sources, only debug symbols.
How would you go about reducing application's memory footprint in half? At least, what would you do to find out where memory is consumed?
Override malloc()/free() and new()/delete() with wrappers that keep track of how big the allocations are and (by recording the callstack and later resolving it against the symbol table) where they are made from. On shutdown, have your wrapper display any memory still allocated.
This should enable you both to work out where the largest allocations are and to catch any leaks.
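A bare-bones version of such a wrapper, for the C++ side only, might look like the sketch below. It replaces the global operator new and operator delete and keeps running totals; the memtrack namespace and its members are made-up names. Recording the call stack, which is the part that tells you where allocations come from, is platform-specific and deliberately left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

namespace memtrack {
std::atomic<std::size_t> live_bytes(0);   // bytes currently allocated via new
std::atomic<std::size_t> live_blocks(0);  // allocations not yet deleted

// Header placed in front of each block to remember its size. Using the
// maximum alignment keeps the pointer returned to the caller aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header must hold the size");

// Call at shutdown to display whatever is still allocated.
inline void report() {
    std::fprintf(stderr, "still allocated: %zu bytes in %zu blocks\n",
                 live_bytes.load(), live_blocks.load());
}
}  // namespace memtrack

// Replacement for the global operator new: allocate, record size, count.
void* operator new(std::size_t size) {
    void* raw = std::malloc(size + memtrack::kHeader);
    if (!raw) throw std::bad_alloc();
    *static_cast<std::size_t*>(raw) = size;
    memtrack::live_bytes += size;
    ++memtrack::live_blocks;
    return static_cast<unsigned char*>(raw) + memtrack::kHeader;
}

// Replacement for the global operator delete: look up the size and uncount.
void operator delete(void* p) noexcept {
    if (!p) return;
    unsigned char* raw = static_cast<unsigned char*>(p) - memtrack::kHeader;
    memtrack::live_bytes -= *reinterpret_cast<std::size_t*>(raw);
    --memtrack::live_blocks;
    std::free(raw);
}
```

Sampling memtrack::live_bytes at interesting points already shows which phases of the program are the expensive ones. Allocations made through malloc directly, for example by the external modules, are not seen by this sketch and would need the malloc()/free() side wrapped as well.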
This is a description/skeleton of a memory tracing application I used to reduce the memory consumption of our game by 20%. It helped me track many allocations done by external modules.
It's not an easy task. Begin by chasing down any memory leaks you can find (a good tool would be Rational Purify). Skim the source code and try to optimize data structures and/or algorithms.
Sorry if this may sound pessimistic, but cutting down memory usage by 50% doesn't sound realistic.
There is a chance you can find some significant inefficiencies very fast. First you should check what the memory is used for. A tool which I have found very handy for this is Memory Validator.
Once you have this "memory usage map", you can check for Low Hanging Fruit. Are there any data structures consuming a lot of memory which could be represented in a more compact form? This is often possible, esp. when the data access is well encapsulated and when you have spare CPU power you can dedicate to compressing / decompressing them on each access.
I don't think your question is well posed.
The size of the source code is not directly related to the memory footprint. Sure, the compiled code will occupy some memory, but the application will have memory requirements of its own, both static (the variables declared in the code) and dynamic (the objects the application creates).
I would suggest you to profile program execution and study the code carefully.
First places to start for me would be:
Does the application preallocate a lot of memory to be used later? Does this memory often sit around unused, never handed out? Consider switching to newing/deleting as needed (or, better, use a smart pointer).
Does the code use a static array such as
Object arrayOfObjs[MAX_THAT_WILL_EVER_BE_USED];
and hand out objs in this array? If so, consider manually managing this memory.
One of the tools for memory usage analysis is LeakDiag, available for free download from Microsoft. It apparently allows hooking all user-mode allocators down to VirtualAlloc and dumping process allocation snapshots to XML at any time. These snapshots can then be used to determine which call stacks allocate the most memory and which call stacks are leaking. It lacks a pretty frontend for snapshot analysis (unless you can get LDParser/LDGrapher via Microsoft Premier Support), but all the data is there.
One more thing to note is that you may have false leak positives from BSTR allocator due to caching, see "Hey, why am I leaking all my BSTR's?"

How to solve Memory Fragmentation

We've occasionally been getting problems whereby our long-running server processes (running on Windows Server 2003) have thrown an exception due to a memory allocation failure. Our suspicion is these allocations are failing due to memory fragmentation.
Therefore, we've been looking at some alternative memory allocation mechanisms that may help us and I'm hoping someone can tell me the best one:
1) Use Windows Low-fragmentation Heap
2) jemalloc - as used in Firefox 3
3) Doug Lea's malloc
Our server process is developed using cross-platform C++ code, so any solution would be ideally cross-platform also (do *nix operating systems suffer from this type of memory fragmentation?).
Also, am I right in thinking that LFH is now the default memory allocation mechanism for Windows Server 2008 / Vista?... Will my current problems "go away" if our customers simply upgrade their server os?
First, I agree with the other posters who suggested a resource leak. You really want to rule that out first.
Hopefully, the heap manager you are currently using has a way to dump out the actual total free space available in the heap (across all free blocks) and also the total number of blocks that it is divided over. If the average free block size is relatively small compared to the total free space in the heap, then you do have a fragmentation problem. Alternatively, if you can dump the size of the largest free block and compare that to the total free space, that will accomplish the same thing. The largest free block would be small relative to the total free space available across all blocks if you are running into fragmentation.
To be very clear about the above, in all cases we are talking about free blocks in the heap, not the allocated blocks in the heap. In any case, if the above conditions are not met, then you do have a leak situation of some sort.
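As a rough way to put a number on that comparison, the sketch below summarises a list of free block sizes. How you obtain that list depends entirely on your heap manager and is assumed here; the "1 - largest / total" score is just one convenient convention, not something the heap itself reports.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct FreeSpaceSummary {
    std::size_t total;     // sum of all free block sizes
    std::size_t largest;   // size of the largest single free block
    double fragmentation;  // 0.0 = one contiguous block, close to 1.0 = badly fragmented
};

// Summarises the free blocks of a heap. Only free blocks are passed in,
// never allocated ones.
FreeSpaceSummary summarize_free_blocks(const std::vector<std::size_t>& free_blocks) {
    FreeSpaceSummary s = {0, 0, 0.0};
    for (std::size_t block : free_blocks) {
        s.total += block;
        s.largest = std::max(s.largest, block);
    }
    if (s.total > 0) {
        s.fragmentation = 1.0 - static_cast<double>(s.largest) / static_cast<double>(s.total);
    }
    return s;
}
```

For example, 1024 free bytes in a single block score 0.0, while the same 1024 bytes split into four 256-byte blocks score 0.75: three quarters of the free space cannot be used for one large request.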
So, once you have ruled out a leak, you could consider using a better allocator. Doug Lea's malloc suggested in the question is a very good allocator for general use applications and very robust most of the time. Put another way, it has been time tested to work very well for most any application. However, no algorithm is ideal for all applications and any management algorithm approach can be broken by the right pathological conditions against its design.
Why are you having a fragmentation problem? - Fragmentation problems are caused by the behavior of an application and have to do with greatly different allocation lifetimes in the same memory arena. That is, some objects are allocated and freed regularly while other types of objects persist for extended periods of time, all in the same heap. Think of the longer-lifetime ones as poking holes into larger areas of the arena and thereby preventing the coalescing of adjacent blocks that have been freed.
To address this type of problem, the best thing you can do is logically divide the heap into sub arenas where the lifetimes are more similar. In effect, you want a transient heap and a persistent heap or heaps that group things of similar lifetimes.
Some others have suggested another approach to solve the problem which is to attempt to make the allocation sizes more similar or identical, but this is less ideal because it creates a different type of fragmentation called internal fragmentation - which is in effect the wasted space you have by allocating more memory in the block than you need.
Additionally, with a good heap allocator, like Doug Lea's, making the block sizes more similar is unnecessary because the allocator will already be doing a power of two size bucketing scheme that will make it completely unnecessary to artificially adjust the allocation sizes passed to malloc() - in effect, his heap manager does that for you automatically much more robustly than the application will be able to make adjustments.
I think you’ve mistakenly ruled out a memory leak too early.
Even a tiny memory leak can cause a severe memory fragmentation.
Assuming your application behaves like the following:
Allocate 10MB
Allocate 1 byte
Free 10MB
(oops, we didn’t free the 1 byte, but who cares about 1 tiny byte)
This seems like a very small leak, you will hardly notice it when monitoring just the total allocated memory size.
But this leak eventually will cause your application memory to look like this:
.
.
Free – 10MB
.
.
[Allocated -1 byte]
.
.
Free – 10MB
.
.
[Allocated -1 byte]
.
.
Free – 10MB
.
.
This leak will not be noticed... until you want to allocate 11MB
Assuming your minidumps had full memory info included, I recommend using DebugDiag to spot possible leaks.
In the generated memory report, examine carefully the allocation count (not size).
As you suggest, Doug Lea's malloc might work well. It's cross platform and it has been used in shipping code. At the very least, it should be easy to integrate into your code for testing.
Having worked in fixed memory environments for a number of years, this situation is certainly a problem, even in non-fixed environments. We have found that the CRT allocators tend to stink pretty bad in terms of performance (speed, efficiency of wasted space, etc). I firmly believe that if you have extensive need of a good memory allocator over a long period of time, you should write your own (or see if something like dlmalloc will work). The trick is getting something written that works with your allocation patterns, and that has more to do with memory management efficiency than almost anything else.
Give dlmalloc a try. I definitely give it a thumbs up. It's fairly tunable as well, so you might be able to get more efficiency by changing some of the compile time options.
Honestly, you shouldn't depend on things "going away" with new OS implementations. A service pack, patch, or another new OS N years later might make the problem worse. Again, for applications that demand a robust memory manager, don't use the stock versions that are available with your compiler. Find one that works for your situation. Start with dlmalloc and tune it to see if you can get the behavior that works best for your situation.
You can help reduce fragmentation by reducing the amount you allocate and deallocate.
E.g. say for a web server running a server-side script, it may create a string to output the page to. Instead of allocating and deallocating these strings for every page request, just maintain a pool of them, so you're only allocating when you need more, but you're not deallocating (meaning after a while you get to the situation where you're not allocating anymore either, because you have enough).
You can use _CrtDumpMemoryLeaks(); to dump memory leaks to the debug window when running a debug build, however I believe this is specific to the Visual C compiler. (it's in crtdbg.h)
I'd suspect a leak before suspecting fragmentation.
For the memory-intensive data structures, you could switch over to a re-usable storage pool mechanism. You might also be able to allocate more stuff on the stack as opposed to the heap, but in practical terms that won't make a huge difference I think.
I'd fire up a tool like valgrind or do some intensive logging to look for resources not being released.
@nsaners - I'm pretty sure the problem is down to memory fragmentation. We've analyzed minidumps that point to a problem when a large (5-10mb) chunk of memory is being allocated. We've also monitored the process (on-site and in development) to check for memory leaks - none were detected (the memory footprint is generally quite low).
The problem does happen on Unix, although it's usually not as bad.
The Low-fragmentation heap helped us, but my co-workers swear by Smart Heap
(it's been used cross platform in a couple of our products for years). Unfortunately due to other circumstances we couldn't use Smart Heap this time.
We also look at block/chunking allocating and trying to have scope-savvy pools/strategies, i.e.,
long term things here, whole request thing there, short term things over there, etc.
As usual, you can usually waste memory to gain some speed.
This technique isn't useful for a general purpose allocator, but it does have its place.
Basically, the idea is to write an allocator that returns memory from a pool where all the allocations are the same size. This pool can never become fragmented because any block is as good as another. You can reduce memory wastage by creating multiple pools with different size chunks and pick the smallest chunk size pool that's still greater than the requested amount. I've used this idea to create allocators that run in O(1).
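A minimal sketch of that kind of allocator, using an intrusive free list so that both allocate and deallocate are O(1). FixedPool and pick_pool are made-up names, and the sketch assumes single-threaded use and that callers only return blocks to the pool they came from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Pool of equally sized blocks. A free block stores the pointer to the next
// free block in its own bytes, so no extra bookkeeping memory is needed.
class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          memory_(static_cast<unsigned char*>(std::malloc(block_size_ * block_count))),
          free_list_(nullptr) {
        if (!memory_) throw std::bad_alloc();
        // Thread the blocks onto the free list, last block first.
        for (std::size_t i = block_count; i > 0; --i)
            deallocate(memory_ + (i - 1) * block_size_);
    }
    ~FixedPool() { std::free(memory_); }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // O(1): pop the head of the free list. Returns nullptr when empty.
    void* allocate() {
        if (!free_list_) return nullptr;
        Node* n = free_list_;
        free_list_ = n->next;
        return n;
    }

    // O(1): push the block back onto the free list.
    void deallocate(void* p) {
        assert(p != nullptr);
        free_list_ = new (p) Node{free_list_};
    }

    std::size_t block_size() const { return block_size_; }

private:
    struct Node { Node* next; };

    // Blocks must be able to hold a Node and stay suitably aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t align = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + align - 1) / align * align;
    }

    std::size_t block_size_;
    unsigned char* memory_;
    Node* free_list_;
};

// Picks the first pool whose blocks can hold `size` bytes. `pools` must be
// sorted by ascending block size. Returns nullptr if the request is too big.
FixedPool* pick_pool(FixedPool* const* pools, std::size_t count, std::size_t size) {
    for (std::size_t i = 0; i < count; ++i)
        if (pools[i]->block_size() >= size) return pools[i];
    return nullptr;
}
```

With one pool of 16-byte blocks and one of 64-byte blocks, a 10-byte request goes to the first and a 40-byte request to the second. The unused remainder of each block is the memory you trade for speed and for the absence of fragmentation.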
If you are talking about Win32, you can try to squeeze out something extra by using LARGEADDRESSAWARE. You'll have about 1 GB of extra unfragmented address space, so your application will take longer to fragment it.
The simple, quick and dirty solution is to split the application into several processes; you should get a fresh heap each time you create a process.
Your memory and speed might suffer a bit (swapping) but fast hardware and big RAM should be able to help.
This was an old UNIX trick with daemons, from when threads did not exist yet.