I would like some advice on the following problem.
I have a program whose memory usage keeps increasing in the long run until all resources are exhausted, and then of course it crashes (it takes several days to reach the critical size).
What I've done so far is use Valgrind to find all the memory leaks and fix them, but I still have a small memory leak that shows up as the heap growing in size. For this I used Valgrind's massif tool.
The problem is that massif cannot run for very long; it causes the application to crash after several hours.
I've tried to find the memory leak in a one-hour run. The problem is that the minimum threshold cannot be lowered below 1% of memory, and after one hour I can see the memory increase, but it is still small compared to the rest of the application.
So I can see that some part takes more memory, but I cannot see which part.
Example from the valgrind output file:
->03.11% (4,377,152B) in 28 places, all below massif's threshold (01.00%)
Any thoughts?
Use Google perftools.
You can link your program against the library, or even LD_PRELOAD it, and it will profile your heap use and generate snapshots. It won't cost you much performance. When you see that the heap is already too big, you can stop it and get a graph of where the memory is spent.
EDIT:
tutorial here
Example:
Have you used valgrind with --leak-check=full? What are you using that could consume memory? Have you deleted every new?
Maybe it crashed because you are allocating a huge block of memory at once (this happened to me before), and valgrind sometimes can't see that.
It is "strange" anyway; tell us the answer if you find it!
Related
I'm developing a C++ program. This program seems to leak sometimes: it suddenly grows to 2-3 times its size and keeps growing until it runs out of memory. "Run it in valgrind", you would say. I do, but the strange thing is that at exit it shows a leak of "only" 0.5MB, not the gigabytes I see using "top" (RSS).
The program monitors 10 webcams. They produce a constant fps at a constant resolution, as an rtsp or mjpeg stream. None of that changes. The pattern is: days(!) of constant memory usage, and then suddenly memory usage starts to grow:
So the question now is: any tips on how to figure out these kinds of problems?
valgrind provides various ways to track memory allocations that are later on deallocated.
To see the peak memory usage, you can e.g. use the valgrind massif tool.
To see which code allocates/deallocates a lot of memory, you can e.g. use memcheck option --xtree-memory=full and visualise the resulting file with kcachegrind.
We have a complex algorithm which processes OpenCV images, thereby allocating and deallocating several GB of memory, mostly cv::Mat with about 10MB size each. If we run this iteratively under valgrind (either with --tool=massif or --tool=memcheck), the memory footprint returns to the same value (+-1MB) after each iteration, and no significant memory leak is found. Watching from outside via ps or pmap, or from inside via /proc/self/status, also shows a maximum footprint of 2.3GB that does not increase.
If we run the same software without valgrind, however, the memory footprint (checked from outside via ps or pmap, or from inside via /proc/self/status) increases by several hundred MB with every iteration, soon reaching 5GB after a few iterations.
Thus we have something looking like a memory leak, but valgrind is of no help for finding the cause.
What could this be?
(This is C++ under Ubuntu).
Thanks to the comment of #phd I found a solution to my problem: using tcmalloc reduces the memory footprint dramatically (2.5GB instead of 6GB). See the attached graph:
RSS memory using different malloc libraries
There still seems to be a slight increase in memory usage with tcmalloc or jemalloc, but at least it's OK for the iteration counts we usually have.
I still wonder, though, how malloc can waste so many resources. I tried to find out with malloc_info(), but with no success. I suspect memory fragmentation and/or multithreading plays a role here.
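For anyone who wants to log the footprint per iteration from inside the process, as described in the question, here is a minimal sketch that reads the VmRSS line from /proc/self/status. It assumes Linux; the function names parse_vm_rss_kb and read_vm_rss_kb are made up for this example.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Scans text in the format of /proc/self/status and returns the VmRSS
// value in kB, or -1 if no VmRSS line is present.
long parse_vm_rss_kb(std::istream& in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;  // skips the whitespace before the number
            return kb;
        }
    }
    return -1;
}

// Convenience wrapper for the current process (Linux only; returns -1
// when the file does not exist or has no VmRSS line).
long read_vm_rss_kb() {
    std::ifstream status("/proc/self/status");
    return parse_vm_rss_kb(status);
}
```

Calling read_vm_rss_kb() once after each iteration and printing the result gives the same curve as watching ps or pmap from outside, and makes it easy to compare runs with different malloc libraries.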
I have an internal C++ application that will indefinitely grow--so much so that we've had to implement logic that actually kills it once the RSS reaches a certain peak size (2.0G) just to maintain some semblance of order. However, this has shown some strange behaviors.
First, I ran the application through Valgrind w/ memcheck, and fixed some random memory leaks here and there. However, the extent of these memory leaks was measured in the tens of megabytes. This makes sense, as it could be that there's no actual memory leaking--it could just be poor memory management on the application side.
Next, I used Valgrind w/ massif to check to see where the memory is going, and this is where it gets strange. The peak snapshot is 161M--nowhere near the 1.9G+ peaks we see using the RSS field. The largest consumption is where I'd expect--in std::string--but this is not abnormal.
Finally, and this is the most puzzling--before we were aware of this memory leak, I was actually testing this service on AWS, and just for fun, set the number of workers to a high number on a CC2.8XL machine, 44 workers. That's 60.5G of RAM, and no swap. Fast forward a month: I go to look at the host--and lo and behold, it's maxed out on RAM--BUT! The processes are still running fine, and are stuck at varying stages of memory usage--almost evenly distributed from 800M to 1.9G. Every once in a while dmesg prints out a Xen error about being unable to allocate memory, but other than that, the processes never die and continue to actively process (i.e., they're not "stuck").
Is there something I'm missing here? It's basically working, but for the life of me, I can't figure out why. What would be a good recommendation on what to look for next? Are there any tools that might help me figure it out?
Note that valgrind memcheck only discovers when you "abandon" memory. while(1) vec.push_back(n++); will fill all available memory but not report any leaks. By the sounds of things, you are collecting strings somewhere that take up a lot of space. I have also worked on code that uses a lot of memory but not really leaking it [it's all in various places that valgrind is happy is not a leak!]. Sometimes you can track it down by simply adding some markers to the memory allocations, or some such, to indicate WHERE you are allocating memory.
Standard library containers typically take an Allocator template argument. If you implement several different pools of memory, you may find where you are allocating memory.
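As a rough sketch of that idea: a small allocator that only counts bytes per "pool", where each pool is identified by a tag type. Everything here (PoolStats, pool_stats, CountingAllocator, ParserTag, CacheTag) is a made-up name for illustration, and the counters are not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical tags, one per subsystem whose memory should be tracked.
struct ParserTag {};
struct CacheTag {};

// Per-tag counters (illustrative, not from any library).
struct PoolStats {
    std::size_t live_bytes = 0;    // bytes currently allocated
    std::size_t total_allocs = 0;  // number of allocate() calls so far
};

template <typename Tag>
PoolStats& pool_stats() {
    static PoolStats stats;  // one instance per Tag type
    return stats;
}

// Minimal allocator: forwards to operator new/delete and records usage
// under the given Tag. A real one would need atomics for threads.
template <typename T, typename Tag>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U, Tag>&) noexcept {}

    T* allocate(std::size_t n) {
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        pool_stats<Tag>().live_bytes += n * sizeof(T);
        pool_stats<Tag>().total_allocs += 1;
        return p;
    }

    void deallocate(T* p, std::size_t n) noexcept {
        assert(pool_stats<Tag>().live_bytes >= n * sizeof(T));
        pool_stats<Tag>().live_bytes -= n * sizeof(T);
        ::operator delete(p);
    }
};

// All allocators with the same Tag are interchangeable.
template <typename T, typename U, typename Tag>
bool operator==(const CountingAllocator<T, Tag>&,
                const CountingAllocator<U, Tag>&) noexcept { return true; }
template <typename T, typename U, typename Tag>
bool operator!=(const CountingAllocator<T, Tag>&,
                const CountingAllocator<U, Tag>&) noexcept { return false; }
```

Declaring a container as std::vector<int, CountingAllocator<int, CacheTag>> and printing pool_stats<CacheTag>().live_bytes every so often shows which subsystem keeps growing, even when valgrind is happy that nothing is leaked.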
I have also seen cases where I think the process is having its memory fragmented, so there are lots of little free spaces in the heap - this can happen if, for example, you create a lot of strings by adding to the size of the string.
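To illustrate the string case, here is a small sketch that counts how often a std::string changes capacity while it is grown one character at a time, with and without reserving up front. Each capacity change means the old buffer was freed and a bigger one allocated, which is the pattern that can leave holes in the heap. The function name count_reallocations is made up, and fewer reallocations is only a plausible way to reduce fragmentation, not a proof that fragmentation is the cause.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends n characters one at a time and returns how many times the
// string's capacity changed, i.e. how many times it had to reallocate.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::string s;
    if (reserve_first) {
        s.reserve(n);  // one allocation up front
    }
    std::size_t changes = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {
            ++changes;
            last_capacity = s.capacity();
        }
    }
    assert(s.size() == n);
    return changes;
}
```

If the final size is roughly known, calling reserve() once avoids the chain of ever-larger temporary buffers.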
If it's an issue of fragmentation, running valgrind massif with the --pages-as-heap=yes option may confirm whether it is.
I have a program that accepts data from a socket, does some quality control and assorted other conditioning to it, then writes it out to a named pipe. I ran valgrind on it and fixed all the memory leaks that originally existed. I then created a 'demo' environment on a system where I had 32 instances of this program running, each being fed unique data and each outputting to its own pipe. We tested it and everything looked to be fine. Then I tried stress testing it by boosting the rate at which data is sent in to an absurd rate and things looked to be good at first...but my programs kept consuming more and more memory until I had no resources left.
I turned to valgrind and ran the exact same setup except with each program running inside valgrind using --leak-check=full. A few odd things happened. First, the memory did leak, but only to the point where each program had consumed 0.9% of my memory (previously the largest memory hog had a full 6% of my memory). With valgrind running, the CPU cost of the programs shot up and I was now at 100% CPU with a huge load average, so it's possible the lack of available CPU caused the programs to all run slowly enough that the leak took too long to manifest. When I tried stopping these programs, valgrind showed no direct memory leaks. It showed some potential memory leaks, but I checked them and I don't think any of them represent real memory leaks; besides, the possible memory leaks only added up to a few kilobytes while the program was consuming over 100 MB. The reachable (non-leaked) memory reported by valgrind was also in the KB range, so valgrind seems to believe that my programs are consuming a fraction of the memory that top says they are using.
I've run a few other tests and got odd results. A single program, even running at triple the rate my original memory-leak was detected at, never seems to consume more than .9% memory, two programs leak up to 1.9 and 1.3% memory respectively but no more etc, it's as if the amount of memory leaked, and the rate at which it leaks, is somehow dependent on how many instances of my program are running at one time; which makes no sense, each instance should be 100% independent of the others.
I also found that if I run 32 instances with only one instance running in valgrind, the valgrinded instance (that's a word if I say it is!) leaks memory, but at a slower rate than the ones running outside of valgrind. The valgrind instance will still say I have no direct leaks and reports far less memory consumption than top shows.
I'm rather stumped as to what could be causing this result, and why valgrind refuses to be aware of the memory leak. I thought it might be an outside library, but I don't really use any external libraries; just basic C++ functions/objects. I also considered that the data could be written to the output pipe too fast, causing the buffer to grow indefinitely, but 1) there should be an upper limit on how far such a buffer can grow and 2) once memory has been leaked, if I drop the data input rate to nothing the memory stays consumed rather than slowly dropping back to a reasonable amount.
Can anyone give me a hint as to where I should look from here? I'm totally stumped as to why the memory is behaving this way.
Thanks.
This sounds like a problem I had recently.
If your program accepts data and buffers it internally without any limits, then it may be reading and buffering faster than it can output the data. In that case, memory use will continue to increase without limit.
The more instances of the program that you run, the slower each instance will go, and the faster the buffers will increase.
This may or may not be your problem, but without more information it is the best I can do.
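If it does turn out to be unbounded buffering, one way to rule it out is to put a hard cap on the internal buffer. Below is a minimal single-threaded sketch; BoundedBuffer and its methods are hypothetical names, not something from your code, and what to do when the buffer is full (drop, block, or stop reading from the socket) is a design decision left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical bounded FIFO: refuses new items once `capacity` is
// reached, so memory use cannot grow without limit.
class BoundedBuffer {
public:
    explicit BoundedBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Returns false (and counts a drop) when the buffer is full.
    bool try_push(std::string item) {
        if (items_.size() >= capacity_) {
            ++dropped_;
            return false;
        }
        items_.push_back(std::move(item));
        return true;
    }

    // Moves the oldest item into `out`; returns false when empty.
    bool try_pop(std::string& out) {
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    std::size_t size() const { return items_.size(); }
    std::size_t dropped() const { return dropped_; }

private:
    std::size_t capacity_;
    std::size_t dropped_ = 0;
    std::deque<std::string> items_;
};
```

Logging dropped() (or size()) during the stress test would show directly whether the reader is outrunning the writer.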
You should first look for a soft leak. It happens when some static or singleton gradually grows a buffer or container and collects trash in it. Technically it is not a leak, but its effects are just as bad.
May I suggest you give a try with MemoryScape? This tool does a pretty good job in memory leak detection. It's not free but given the time and energy spent, it is worth trying.
I have a C++ process running on Solaris which creates 3 threads to do some tasks.
These threads execute in loops and run as long as the process is running.
But, I see that the memory usage of the process grows continuously and the process core dumps once the memory usage exceeds 4GB.
Can someone give me some pointers on what could be the issue behind memory usage growth?
What can I do to prevent the process from core dumping because of memory exhaustion?
Will thread restart help?
Any pointers welcome.
No, restarting a thread would not help.
It seems like you have a memory leak in your application.
In my experience there are two types of memory leaks:
real memory leaks that you can see when the application exits
'false' memory leaks, like a big list that increases during the lifetime of your application but which is correctly cleaned up at the end
For the first type, there are tools which can report the memory that has not been freed by your application when it exits. I don't know about Solaris but there are numerous tools under Windows which can do that. For Unix, I think that Valgrind does this.
For the second type, there are also tools under Windows that can take snapshots of the memory of your application. Simply take two snapshots with an interval of a few minutes or hours (depending on your application) and let the tool compare them. There are probably similar tools on Solaris.
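The comparison step itself is simple. As a sketch, assume each snapshot has already been reduced to a map from allocation site to live bytes (how you obtain that depends entirely on the tool); Snapshot and growth_between are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical snapshot format: allocation site -> live bytes.
using Snapshot = std::map<std::string, std::size_t>;

// Returns only the sites whose live bytes grew between the two
// snapshots, together with the amount of growth.
Snapshot growth_between(const Snapshot& before, const Snapshot& after) {
    Snapshot grown;
    for (const auto& entry : after) {
        const auto it = before.find(entry.first);
        const std::size_t old_bytes = (it == before.end()) ? 0 : it->second;
        if (entry.second > old_bytes) {
            grown[entry.first] = entry.second - old_bytes;
        }
    }
    return grown;
}
```

Sites that show up in the result across several consecutive comparisons are the candidates for a 'false' leak such as an ever-growing list.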
Using these tools will probably require your application to take much more memory, since the tool needs to store the call stack of every memory allocation. Because of this it will also run much slower. However, you will only see this effect when you are actively using this tool, so there is no effect in real-life production code.
So, just look for this kind of tool under Solaris. I quickly Googled for it and found this link: http://prefetch.net/blog/index.php/2006/02/19/finding-memory-leaks-on-solaris-systems/. This could be a starting point.
EDIT: Some additional information: are you looking at the right kind of memory? Even if you only allocated 3GB in total, the total virtual address space may still reach 4GB because of memory fragmentation. Unfortunately, there is nothing you can do about this (except using another memory allocation strategy).