C++ Multithread program on linux memory issue

I'm developing software that creates and destroys a large number of threads.
When I create a thread, memory usage increases, but when I delete the threads (confirmed with the command ps -mo THREAD -p <pid>), the memory used by the program, as shown by top, does not decrease. As a result I run out of memory.
I have used Valgrind to check for memory errors and leaks and can't find any. This is on a Debian box. Please let me know what the issue could be.

How are you deleting the threads?
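The notes at http://www.kernel.org/doc/man-pages/online/pages/man3/pthread_join.3.html explain that a joinable thread keeps some resources until it is joined (or detached); a finished thread that is never joined remains as a "zombie thread" and its resources are not freed.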
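As a rough illustration of the point above, here is a minimal sketch of the two usual ways to make sure a finished pthread gives its resources back: joining it, or creating it detached. The names worker, run_and_join and run_detached are made up for this example and are not from the question; depending on the toolchain, the program may need to be built with -pthread.

```cpp
#include <pthread.h>

// Hypothetical worker: real code would do its work here.
static void* worker(void*) {
    return nullptr;
}

// Creates `count` joinable threads one after another and joins each of them.
// pthread_join is what lets the library reclaim the thread's stack and
// bookkeeping. Returns the number of threads that were joined successfully.
int run_and_join(int count) {
    int joined = 0;
    for (int i = 0; i < count; ++i) {
        pthread_t tid;
        if (pthread_create(&tid, nullptr, worker, nullptr) != 0) break;
        if (pthread_join(tid, nullptr) == 0) ++joined;
    }
    return joined;
}

// Alternative: create the threads detached, so their resources are released
// automatically when they finish and no join is needed (or allowed).
// Returns the number of threads that were created successfully.
int run_detached(int count) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return 0;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    int created = 0;
    for (int i = 0; i < count; ++i) {
        pthread_t tid;
        if (pthread_create(&tid, nullptr, worker, &attr) == 0) ++created;
    }
    pthread_attr_destroy(&attr);
    return created;
}
```

If the program in the question creates joinable threads and neither joins nor detaches them, one of these two patterns would be the first thing to try.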

You do not run out of memory.
The "free" memory you see in top is not the amount of memory that is available when it is needed. The Linux kernel uses as much of the otherwise free memory as it can for its page cache. When a process requires memory, the kernel can throw away page-cache pages and give that memory to the process.
In other words: Linux puts the free memory to use instead of leaving it idle.
Use free -m: the row labeled "-/+ buffers/cache:" shows the real amount of memory available to processes. (Newer versions of free no longer print that row and report an "available" column instead.)
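The same figure can be read from inside a program. The sketch below parses a "Name: value kB" line of the kind found in /proc/meminfo; it assumes a Linux kernel recent enough (3.14 or later) to export the MemAvailable field. The function names meminfo_kb and available_kb are invented for this example.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Returns the value (in kB) of the line that starts with "<field>:" in a
// /proc/meminfo-style stream, or -1 if there is no such line.
long meminfo_kb(std::istream& in, const std::string& field) {
    const std::string prefix = field + ":";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = -1;
            rest >> kb;  // skips the padding spaces before the number
            return kb;
        }
    }
    return -1;
}

// Convenience wrapper: the kernel's estimate of memory available to new
// workloads without swapping. Returns -1 if the file or field is missing.
long available_kb() {
    std::ifstream meminfo("/proc/meminfo");
    return meminfo_kb(meminfo, "MemAvailable");
}
```

Comparing MemFree with MemAvailable on a machine that has been up for a while usually makes the page-cache effect described above quite visible.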

Related

How to find "weak" memory leaks

In C++ programs, I have sometimes had problems with "weak" memory leaks. By that, I mean that some objects accumulate resources, but, eventually, these objects are destroyed properly and their memory is released, so that these leaks do not show up using the traditional memory debugging tools like valgrind or address sanitizers.
A typical example would be a poorly-written cache that keeps all the cached results from the beginning of the program. It grows forever, but its memory is reclaimed at the end of the program, when the cache is destroyed.
How can one debug this? Are there tools available to see where the largest objects allocated by the program are? To dump the current state of allocated memory (including call stacks)? To see which objects are growing? I'm using Linux, but I am interested in other platforms as well.

If other platforms are an option, I would recommend Visual Studio on Windows.
It has powerful profiling options, including one for memory usage.
https://learn.microsoft.com/en-us/visualstudio/profiling/memory-usage
While debugging you can take a snapshot to see where memory is being used.
You can also take memory usage snapshots at different times and compare them.

You can use a profiler such as Intel VTune (which is available for Linux as well) to trace the memory consumption of your application. In VTune, you can see the memory consumption over time, and select a time window to see where memory was allocated during that window.
It still will be difficult to detect such problems if your application allocates and deallocates a lot of memory correctly, and only a small fraction is deallocated too late. In that case you need to check a lot of allocations/deallocations before you can find the bad one(s).

Dynamical Memory Allocation / Making use of unused memory

I'm going to write an application that needs to allocate a lot of memory dynamically.
Most of the memory is used for caching purposes and is only there to speed things up.
Those parts could actually be freed on demand.
Unfortunately my kernel will kill the process if it runs out of memory, even though the process could simply free some memory. So what I want is very similar to the Linux page cache as it is explained here. Is it possible to implement such behaviour in userspace in a convenient way?
I'm thinking about implementing such a cache with "cache files" stored on a ramfs/tmpfs and accessed with memory-mapped file I/O, but I'm sure that there is a more comfortable way.
Thanks in advance!

Yes, this should be possible. Most kernels have a memory allocation method where the process sleeps until it gets the requested memory (all the kernels I've worked with have one). If yours doesn't, this may be a good time to implement one. You could check out the kmem functions in Linux.
However, this is a passive way of doing what you've asked. The process will be waiting until someone else frees up memory.
If you want to free up memory from your own process's address space when there's no memory left, this can be done easily from user space. You need to keep a journal of allocated memory and free the blocks you don't need on demand when an allocation fails.
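The journal idea could look roughly like the sketch below. EvictableCache and allocate_or_evict are hypothetical names, eviction is simply oldest-first, and the allocator is passed in as a callable so that a failing allocation can be simulated. One caveat: on Linux with overcommit enabled, malloc often does not return null when memory is short, so a real program may need to trigger eviction on a self-imposed budget rather than on allocation failure alone.

```cpp
#include <cstddef>
#include <cstdlib>
#include <list>

// Journal of cache blocks that may be thrown away at any time.
class EvictableCache {
public:
    EvictableCache() = default;
    EvictableCache(const EvictableCache&) = delete;
    EvictableCache& operator=(const EvictableCache&) = delete;
    ~EvictableCache() { while (evict_one()) {} }

    // Allocates a cache block and records it in the journal (oldest first).
    void* add(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p != nullptr) {
            journal_.push_back(Entry{p, bytes});
            held_ += bytes;
        }
        return p;
    }

    // Frees the oldest journal entry. Returns false if nothing is left.
    bool evict_one() {
        if (journal_.empty()) return false;
        Entry oldest = journal_.front();
        journal_.pop_front();
        std::free(oldest.ptr);
        held_ -= oldest.bytes;
        return true;
    }

    std::size_t held() const { return held_; }

private:
    struct Entry { void* ptr; std::size_t bytes; };
    std::list<Entry> journal_;
    std::size_t held_ = 0;
};

// Tries the allocation; each time it fails, evicts one cache block and
// retries. Gives up (returns nullptr) once the cache is empty.
template <class Alloc>
void* allocate_or_evict(EvictableCache& cache, std::size_t bytes, Alloc try_alloc) {
    for (;;) {
        if (void* p = try_alloc(bytes)) return p;
        if (!cache.evict_one()) return nullptr;
    }
}
```

In C++ a similar effect can be had for operator new by installing a handler with std::set_new_handler that evicts cache entries, since new retries the allocation after the handler returns.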

Memory stability of a C++ application in Linux

I want to verify the memory stability of a C++ application I wrote and compiled for Linux.
It is a network application that responds to remote clients connecting at a rate of 10-20 connections per second.
In the long run, memory usage rose to 50MB, even though the app was making calls to delete...
Investigation shows that Linux does not immediately free memory. So here are my questions:
How can I force Linux to free memory that I have actually freed? I want to do this at least once to verify memory stability.
Otherwise, is there any reliable memory indicator that can report the memory my app is actually holding?

What you are seeing is most likely not a memory leak at all. Operating systems and malloc/new heaps both do very complex accounting of memory these days. This is, in general, a very good thing. Chances are any attempt on your part to force the OS to free the memory will only hurt both your application performance and overall system performance.
To illustrate:
1. The Heap reserves several areas of virtual memory for use. None of it is actually committed (backed by physical memory) until malloc'd.
2. You allocate memory. The Heap grows accordingly. You see this in task manager.
3. You allocate more memory on the Heap. It grows more.
4. You free memory allocated in Step 2. The Heap cannot shrink, however, because the memory in #3 is still allocated, and Heaps are unable to compact memory (it would invalidate your pointers).
5. You malloc/new more stuff. This may get tacked on after memory allocated in step #3, because it cannot fit in the area left open by free'ing #2, or because it would be inefficient for the Heap manager to scour the heap for the block left open by #2. (This depends on the Heap implementation and the chunk size of memory being allocated/free'd.)
So is that memory at step #2 now dead to the world? Not necessarily. For one thing, it will probably get reused eventually, once it becomes efficient to do so. In cases where it isn't reused, the Operating System itself may be able to use the CPU's virtual memory features (the MMU and its page tables) to "remap" the unused memory right out from under your application, and assign it to another application, on the fly. The Heap is aware of this and usually manages things in a way that helps improve the OS's ability to remap pages.
These are valuable memory management techniques that have the unfortunate side effect of rendering fine-grained memory-leak detection via Process Explorer mostly useless. If you want to detect small memory leaks in the heap, then you'll need to use runtime heap leak-detection tools. Since you mentioned that you're able to build on Windows as well, I will note that Microsoft's CRT has adequate leak-checking tools built in. Instructions for use can be found here:
http://msdn.microsoft.com/en-us/library/974tc9t1(v=vs.100).aspx
There are also open-source replacements for malloc available for use with GCC/Clang toolchains, though I have no direct experience with them. I think on Linux Valgrind is the preferred and more reliable method for leak-detection anyway. (and in my experience easier to use than MSVCRT Debug).

I would suggest using Valgrind with the Memcheck tool, or any other profiling tool for memory leaks.
From Valgrind's page:
Memcheck
Memcheck detects memory-management problems, and is aimed primarily at C and C++ programs. When a program is run under Memcheck's supervision, all reads and writes of memory are checked, and calls to malloc/new/free/delete are intercepted. As a result, Memcheck can detect if your program:
- Accesses memory it shouldn't (areas not yet allocated, areas that have been freed, areas past the end of heap blocks, inaccessible areas of the stack).
- Uses uninitialised values in dangerous ways.
- Leaks memory.
- Does bad frees of heap blocks (double frees, mismatched frees).
- Passes overlapping source and destination memory blocks to memcpy() and related functions.
Memcheck reports these errors as soon as they occur, giving the source line number at which it occurred, and also a stack trace of the functions called to reach that line. Memcheck tracks addressability at the byte-level, and initialisation of values at the bit-level. As a result, it can detect the use of single uninitialised bits, and does not report spurious errors on bitfield operations. Memcheck runs programs about 10--30x slower than normal.
Massif
Massif is a heap profiler. It performs detailed heap profiling by taking regular snapshots of a program's heap. It produces a graph showing heap usage over time, including information about which parts of the program are responsible for the most memory allocations. The graph is supplemented by a text or HTML file that includes more information for determining where the most memory is being allocated. Massif runs programs about 20x slower than normal.
Using Valgrind is as simple as passing your application, together with its usual switches, as the argument to valgrind:
valgrind --tool=memcheck ./myapplication -f foo -b bar

I very much doubt that anything beyond wrapping malloc and free [or new and delete ] with another function can actually get you anything other than very rough estimates.
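For what it's worth, the "wrap new and delete" approach can be sketched in a few lines. This version replaces the global operator new and operator delete, stores each block's size in a small header, and keeps a running total of live bytes. The names g_live_bytes, kHeader and live_heap_bytes are invented for this example; it does not cover over-aligned allocations or direct calls to malloc, and it only gives the program's own view of what it holds, not what the OS sees.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

namespace {
// Bytes currently allocated through operator new and not yet deleted.
std::atomic<std::size_t> g_live_bytes{0};
// Header placed in front of each block; sized to keep the returned
// pointer suitably aligned for any ordinary type.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header must hold the size");
}  // namespace

// Replacement for the global operator new: records the block size in the
// header and adds it to the live total.
void* operator new(std::size_t size) {
    void* raw = std::malloc(size + kHeader);
    if (raw == nullptr) throw std::bad_alloc();
    *static_cast<std::size_t*>(raw) = size;
    g_live_bytes += size;
    return static_cast<char*>(raw) + kHeader;
}

// Replacement for the global operator delete: reads the size back from the
// header and subtracts it from the live total.
void operator delete(void* p) noexcept {
    if (p == nullptr) return;
    void* raw = static_cast<char*>(p) - kHeader;
    g_live_bytes -= *static_cast<std::size_t*>(raw);
    std::free(raw);
}

// Sized delete simply forwards to the unsized version.
void operator delete(void* p, std::size_t) noexcept {
    operator delete(p);
}

// What the application believes it is holding right now.
std::size_t live_heap_bytes() {
    return g_live_bytes.load();
}
```

Logging live_heap_bytes() periodically and comparing it with the resident size reported by the OS gives a rough idea of how much of the 50MB is really in use and how much is merely retained by the heap manager.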
One of the problems is that the memory that is freed can only be released if there is a long contiguous chunk of memory. What typically happens is that there are "little bits" of memory that are used all over the heap, and you can't find a large chunk that can be freed.
It's highly unlikely that you will be able to fix this in any simple way.
And by the way, your application is probably going to need those 50MB later on when you have more load again, so it's just wasted effort to free it.
(If the memory that you are not using is needed for something else, it will get swapped out, and pages that aren't touched for a long time are prime candidates, so if the system runs low on memory for some other tasks, it will still reuse the RAM in your machine for that space, so it's not sitting there wasted - it's just you can't use 'ps' or some such to figure out how much ram your program uses!)
As suggested in a comment: You can also write your own memory allocator, using mmap() to create a "chunk" to dole out portions from. If you have a section of code that does a lot of memory allocations, and ALL of those will definitely be freed later, you can allocate all of them from a separate lump of memory. When it has all been freed, you can put the mmap'd region back into a "free mmap list", and when the list is sufficiently large, free up some of the mmap allocations [this is an attempt to avoid calling mmap LOTS of times, and then munmap again a few milliseconds later]. However, if you EVER let one of those memory allocations "escape" out of your fenced-in area, your application will probably crash (or worse, not crash, but use memory belonging to some other part of the application, and you get a very strange result somewhere, such as one user getting to see the network content meant for another user!)
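A very small version of that idea, without the "free mmap list" refinement, might look like this. MmapArena is a hypothetical name; it is a simple bump allocator over one anonymous mapping, individual blocks are never freed, and the whole region goes back to the kernel in the destructor. It assumes a POSIX system that supports MAP_ANONYMOUS, such as Linux.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Bump allocator over a single anonymous mmap'd region. Nothing handed out
// by allocate() may be used after the arena is destroyed.
class MmapArena {
public:
    explicit MmapArena(std::size_t capacity) : capacity_(capacity) {
        void* p = mmap(nullptr, capacity_, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        base_ = (p == MAP_FAILED) ? nullptr : static_cast<char*>(p);
    }
    MmapArena(const MmapArena&) = delete;
    MmapArena& operator=(const MmapArena&) = delete;

    // munmap returns the whole region to the OS in one call.
    ~MmapArena() {
        if (base_ != nullptr) munmap(base_, capacity_);
    }

    // Hands out the next suitably aligned block, or nullptr if the mapping
    // failed or the arena is full.
    void* allocate(std::size_t bytes) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (used_ + align - 1) / align * align;
        if (base_ == nullptr || bytes > capacity_ || start > capacity_ - bytes) {
            return nullptr;
        }
        used_ = start + bytes;
        return base_ + start;
    }

    std::size_t used() const { return used_; }

private:
    char* base_ = nullptr;
    std::size_t capacity_ = 0;
    std::size_t used_ = 0;
};
```

Creating one arena per connection and destroying it when the connection closes would release that memory to the OS immediately, subject to the "no escaping pointers" warning above.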

Use valgrind to find memory leaks: valgrind ./your_application
It will list where you allocated memory and did not free it.
I don't think the problem is in Linux; it is more likely in your application. If you monitor the memory usage with top you won't get very precise figures. Try using massif (one of valgrind's tools) to find out the real memory usage: valgrind --tool=massif ./your_application
As a more general rule to avoid leaks in C++: use smart pointers instead of raw pointers.
Also, in many situations you can use RAII (http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization) instead of allocating memory with "new".
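To make the smart-pointer advice concrete, here is a small sketch in which each connection object is owned by a std::unique_ptr, so it is destroyed on every exit path, including when an exception is thrown. Connection, handle and serve are made-up names for this example, and the exception merely simulates a client dropping mid-request.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical per-client object; the static counter only exists so the
// example can show that nothing is left alive afterwards.
struct Connection {
    static int live;
    explicit Connection(int id) : id(id) { ++live; }
    ~Connection() { --live; }
    int id;
};
int Connection::live = 0;

// Handles one client. The unique_ptr destroys the Connection whether the
// function returns normally or exits through the exception.
void handle(int id) {
    auto conn = std::make_unique<Connection>(id);
    if (conn->id % 2 == 0) {
        throw std::runtime_error("client dropped");
    }
}

// Serves `clients` connections and returns how many Connection objects are
// still alive afterwards.
int serve(int clients) {
    for (int i = 0; i < clients; ++i) {
        try {
            handle(i);
        } catch (const std::runtime_error&) {
            // Nothing to clean up by hand: the Connection is already gone.
        }
    }
    return Connection::live;
}
```

With a raw new and a delete at the end of handle, every dropped client would leak one Connection; with the smart pointer, serve ends with none left alive.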

It is not typical for an OS to release memory when you call free or delete. This memory goes back to the heap manager in the runtime library.
If you want to actually release memory, you can use brk. But that opens up a very large can of memory-management worms. If you directly call brk, you had better not call malloc. For C++, you can override new to use brk directly.
Not an easy task.
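A less drastic option than calling brk yourself, assuming the program is linked against glibc, is malloc_trim, which asks the allocator to hand free memory at the top of the heap back to the kernel. The sketch below is glibc-specific and will not build against other C libraries; churn_and_trim is a made-up name, and how much memory is actually returned depends on how fragmented the heap is.

```cpp
#include <malloc.h>  // glibc-specific: declares malloc_trim
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Allocates and touches `blocks` blocks of `block_size` bytes, frees them
// all, then asks glibc to return unused heap memory to the kernel.
// Returns malloc_trim's result: 1 if some memory was released, 0 if not.
int churn_and_trim(std::size_t blocks, std::size_t block_size) {
    std::vector<void*> ptrs;
    ptrs.reserve(blocks);
    for (std::size_t i = 0; i < blocks; ++i) {
        void* p = std::malloc(block_size);
        if (p == nullptr) break;
        std::memset(p, 0xAB, block_size);  // touch the pages so they become resident
        ptrs.push_back(p);
    }
    for (void* p : ptrs) std::free(p);
    return malloc_trim(0);  // 0 = keep no extra padding at the top of the heap
}
```

Calling this once after a load test and then checking the resident size is one way to do the "at least once" verification the question asks about.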

The latest dlmalloc() has a concept called an mspace (others call it a region). You can call malloc() and free() against an mspace. Or you can delete the mspace to free all memory allocated from the mspace at once. Deleting an mspace will free memory from the process.
If you create an mspace with a connection, allocate all memory for the connection from that mspace, and delete the mspace when the connection closes, you would have no process growth.
If you have a pointer in one mspace pointing to memory in another mspace, and you delete the second mspace, then as the language lawyers say "the results are undefined".

Increased memory usage for a process

I have a C++ process running in Solaris which creates 3 threads to do some tasks.
These threads execute in loops and it runs as long as the process is running.
But, I see that the memory usage of the process grows continuously and the process core dumps once the memory usage exceeds 4GB.
Can someone give me some pointers on what could be the issue behind memory usage growth?
What can I do to prevent process from core dumping because of memory exhaustion?
Will thread restart help?
Any pointers welcome.

No, restarting a thread would not help.
It seems like you have a memory leak in your application.
In my experience there are two types of memory leaks:
real memory leaks that you can see when the application exits
'false' memory leaks, like a big list that increases during the lifetime of your application but which is correctly cleaned up at the end
For the first type, there are tools which can report the memory that has not been freed by your application when it exits. I don't know about Solaris but there are numerous tools under Windows which can do that. For Unix, I think that Valgrind does this.
For the second type, there are also tools under Windows that can take snapshots of the memory of your application. Simply take two snapshots with an interval of a few minutes or hours (depending on your application) and let the tool compare them. There are probably similar tools on Solaris.
Using these tools will probably require your application to take much more memory, since the tool needs to store the call stack of every memory allocation. Because of this it will also run much slower. However, you will only see this effect when you are actively using the tool, so there is no effect in real-life production code.
So, just look for these kinds of tools under Solaris. I quickly Googled for it and found this link: http://prefetch.net/blog/index.php/2006/02/19/finding-memory-leaks-on-solaris-systems/. This could be a starting point.
EDIT: Some additional information: are you looking at the right kind of memory? Even if you only allocated 3GB in total, the total virtual address space may still reach 4GB because of memory fragmentation. Unfortunately, there is nothing you can do about this (except using another memory allocation strategy).

Ubuntu System Monitor and valgrind to discover memory leaks in C++ applications

I'm writing an application in C++ which uses some external open source libraries. I tried to look at the Ubuntu System Monitor to have information about how my process uses resources, and I noticed that resident memory continues to increase to very large values (over 100MiB). This application should run in an embedded device, so I have to be careful.
I started to think there should be a (some) memory leak(s), so I'm using valgrind. Unfortunately it seems valgrind is not reporting significant memory leaks, only some minor issues in the libraries I'm using, nothing more.
So, do I have to conclude that my algorithm really uses that much memory? It seems very strange to me... Or maybe I'm misunderstanding the meaning of the columns of the System Monitor? Can someone clarify the meaning of "Virtual Memory", "Resident Memory", "Writable Memory" and "Memory" in the System Monitor when related to software profiling? Should I expect those values to immediately represent how much memory my process is taking in RAM?
In the past I've used tools that were able to tell me where I was using memory, like Apple Profiling Tools. Is there anything similar I can use in Linux as well?
Thanks!

Another tool you can try is the /lib/libmemusage.so library:
$ LD_PRELOAD=/lib/libmemusage.so vim
Memory usage summary: heap total: 4643025, heap peak: 997580, stack peak: 26160
         total calls   total memory   failed calls
 malloc|       42346        4528378              0
realloc|          52           7988              0  (nomove:26, dec:0, free:0)
 calloc|          34         106659              0
   free|       28622        3720100
Histogram for block sizes:
    0-15   14226  33% ==================================================
   16-31    8618  20% ==============================
   32-47    1433   3% =====
   48-63    4174   9% ==============
   64-79    4736  11% ================
   80-95     313  <1% =
...
(I quit vim immediately after startup.)
Maybe the histogram of block sizes will give you enough information to tell where leaks may be happening.
valgrind is very configurable; --leak-check=full --show-reachable=yes might be a good starting point, if you haven't tried it yet.
"Virtual Memory", "Resident Memory", "Writable Memory" and "Memory"
Virtual memory is the address space that your application has allocated. If you run malloc(1024*1024*100);, the malloc(3) library function will request 100 megabytes of storage from the operating system (or handle it out of the free lists). The 100 megabytes will be allocated with mmap(..., MAP_ANONYMOUS), which won't actually allocate any memory. (See the rant at the end of the malloc(3) page for details.) The OS will provide memory the first time each page is written.
Virtual memory accounts for all the libraries and executable objects that are mapped into your process, as well as your stack space.
Resident memory is the amount of memory that is actually in RAM. You might link against the entire 1.5 megabyte C library, but only use the 100k (wild guess) of the library required to support the Standard IO interface. The rest of the library will be demand paged in from disk when it is needed. Or, if your system is under memory pressure and some less-recently-used data is paged out to swap, it will no longer count against Resident memory.
Writable memory is the amount of address space that your process has allocated with write privileges. (Check the output of pmap(1) command: pmap $$ for the shell, for example, to see which pages are mapped to which files, anonymous space, the stack, and the privileges on those pages.) This is a reasonable indication of how much swap space the program might require in a worst-case swapping scenario, when everything must be paged to disk, or how much memory the process is using for itself.
Because there are probably 50--100 processes on your system at a time, and almost all of them are linked against the standard C library, all the processes get to share the read-only memory mappings for the library. (They also get to share the copy-on-write pages of any private writable file mappings, created with mmap(..., PROT_WRITE, MAP_PRIVATE, ...), until the process writes to the memory.) The top(1) tool will report the amount of memory that can be shared among processes in the SHR column. (Note that not all of this memory is necessarily shared, but some of it, such as libc, definitely is.)
Memory is very vague. I don't know what it means.
