Getting std::bad_alloc error; how to cross-verify that the OS is really running out of memory - c++

I have a C++ program on Linux which, within 2-3 seconds of running, starts spitting std::bad_alloc errors on a 32 GB RAM machine (and gets restarted by a wrapper caller). What I really care about is solving this problem, but I would like to go step by step and build up my confidence in my understanding of the problem.
It looks like the system is not able to allocate memory for a new request (this would happen when the OS has run out of memory). While the program is running, on another terminal I run the sar command with the smallest interval possible (1 second), but I see that kbcached is ~24 GB. Why is the OS not able to release the cache and make that memory available to the new request? Either 1 second is too coarse (compared to how fast the program runs) or I am doing something wrong here.
Basically I would like to cross-verify and pinpoint that the OS is indeed running out of memory and thus is not able to allocate it, and then take things from that point on. How do I do that?
Ideally, I would like to have the system statistics right at the point when memory allocation fails, such as how much is cached, total used memory, etc.

If you actually want to see how your process's memory is allocated, you could set a breakpoint in gdb for when the exception is thrown (catch throw will do this). When it hits, inspect the process with a tool like pmap, which can show you additional information about how the process uses memory.
If that's too primitive (and it quickly will be; pmap is pretty primitive), Valgrind includes Massif and many other utilities for diagnosing memory usage, CPU utilization, and other runtime problems.
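If you would rather capture those statistics from inside the process at the exact moment an allocation fails (as the question asks), one option is to install a new-handler. Below is a minimal sketch; it assumes Linux and the /proc filesystem, and the file names and handler logic are illustrative rather than a drop-in fix.

    #include <fcntl.h>
    #include <new>
    #include <unistd.h>

    // Copy a /proc file to stderr using a fixed buffer, so the handler itself
    // does not need to allocate while the heap is exhausted.
    static void dump_file(const char* path) {
        char buf[4096];
        int fd = open(path, O_RDONLY);
        if (fd < 0) return;
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0)
            if (write(STDERR_FILENO, buf, static_cast<size_t>(n)) < 0) break;
        close(fd);
    }

    // Called by operator new when an allocation cannot be satisfied.
    static void on_alloc_failure() {
        dump_file("/proc/self/status");  // per-process VmSize, VmRSS, ...
        dump_file("/proc/meminfo");      // system-wide MemFree, Cached, CommitLimit, ...
        std::set_new_handler(nullptr);   // give up: the retried allocation then throws bad_alloc
    }

    int main() {
        std::set_new_handler(on_alloc_failure);
        // ... rest of the program; when new fails, the snapshot lands on stderr ...
    }

The usual caveat applies: the handler only fires for failures in C++ operator new, not for failing malloc calls in C code.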

Related

Memory leak that doesn't crash when OOM, or show up in massif/valgrind

I have an internal C++ application that grows indefinitely--so much so that we've had to implement logic that kills it once the RSS reaches a certain peak size (2.0 GB) just to maintain some semblance of order. However, this has shown some strange behaviors.
First, I ran the application through Valgrind with memcheck, and fixed some random memory leaks here and there. However, the extent of these leaks was measured in the tens of megabytes. This makes sense, as it could be that there's no actual memory leaking--it could just be poor memory management on the application side.
Next, I used Valgrind with massif to check where the memory is going, and this is where it gets strange. The peak snapshot is 161 MB--nowhere near the 1.9 GB+ peaks we see in the RSS field. The largest consumption is where I'd expect--in std::string--but that is not abnormal.
Finally, and this is the most puzzling--before we were aware of this memory leak, I was actually testing this service on AWS, and just for fun set the number of workers to a high number on a CC2.8XL machine: 44 workers. That's 60.5 GB of RAM, and no swap. Fast forward a month: I go to look at the host--and lo and behold, it's maxed out on RAM--BUT! The processes are still running fine, and are stuck at varying stages of memory usage--almost evenly distributed from 800 MB to 1.9 GB. Every once in a while dmesg prints a Xen error about being unable to allocate memory, but other than that, the processes never die and continue to actively process (i.e., they're not "stuck").
Is there something I'm missing here? It's basically working, but for the life of me, I can't figure out why. What would be a good recommendation on what to look for next? Are there any tools that might help me figure it out?
Note that valgrind memcheck only discovers when you "abandon" memory. while(1) vec.push_back(n++); will fill all available memory but not report any leaks. By the sound of things, you are collecting strings somewhere that take up a lot of space. I have also worked on code that uses a lot of memory without really leaking it [it's all in various places that valgrind is happy is not a leak!]. Sometimes you can track it down by simply adding some markers to the memory allocations, or some such, to indicate WHERE you are allocating memory.
In std:: containers, there is typically an Allocator template argument. If you implement several different pools of memory, you may find where you are allocating memory.
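For concreteness, here is a minimal sketch of that idea: an allocator parameterised by a pool tag that counts live bytes, so you can see which family of containers owns the growth. The names, pool numbers, and payload are all illustrative.

    #include <atomic>
    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <vector>

    std::atomic<long long> g_live_bytes[2];  // pool 0 = strings, pool 1 = records

    template <typename T, int Pool>
    struct CountingAllocator {
        using value_type = T;
        template <typename U> struct rebind { using other = CountingAllocator<U, Pool>; };

        CountingAllocator() = default;
        template <typename U> CountingAllocator(const CountingAllocator<U, Pool>&) {}

        T* allocate(std::size_t n) {
            g_live_bytes[Pool] += static_cast<long long>(n * sizeof(T));
            return static_cast<T*>(::operator new(n * sizeof(T)));
        }
        void deallocate(T* p, std::size_t n) {
            g_live_bytes[Pool] -= static_cast<long long>(n * sizeof(T));
            ::operator delete(p);
        }
        template <typename U> bool operator==(const CountingAllocator<U, Pool>&) const { return true; }
        template <typename U> bool operator!=(const CountingAllocator<U, Pool>&) const { return false; }
    };

    // Strings whose character buffers are charged to pool 0.
    using TrackedString =
        std::basic_string<char, std::char_traits<char>, CountingAllocator<char, 0>>;

    int main() {
        std::vector<TrackedString, CountingAllocator<TrackedString, 1>> v;
        for (int i = 0; i < 100000; ++i)
            v.emplace_back("a payload long enough to defeat the small-string optimisation");
        std::cout << "string pool: " << g_live_bytes[0] << " bytes live, "
                  << "record pool: " << g_live_bytes[1] << " bytes live\n";
    }

memcheck will report none of this as a leak because everything stays reachable, but the per-pool counters tell you where the bytes live.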
I have also seen cases where I think the process is having its memory fragmented, so there are lots of little free spaces in the heap - this can happen if, for example, you create a lot of strings by repeatedly growing them.
If it is an issue of fragmentation, running valgrind massif with the --pages-as-heap=yes option may confirm it.

How to hunt down a memory leak valgrind says doesn't exist?

I have a program that accepts data from a socket, does some quality control and assorted other conditioning to it, then writes it out to a named pipe. I ran valgrind on it and fixed all the memory leaks that originally existed. I then created a 'demo' environment on a system where I had 32 instances of this program running, each being fed unique data and each outputting to its own pipe. We tested it and everything looked to be fine. Then I tried stress testing it by boosting the rate at which data is sent in to an absurd level, and things looked good at first... but my programs kept consuming more and more memory until I had no resources left.
I turned to valgrind and ran the exact same setup, except with each program running inside valgrind with --leak-check=full. A few odd things happened. First, the memory did leak, but only to the point where each program had consumed 0.9% of my memory (previously the largest memory hog had a full 6%). With valgrind running, the CPU cost of the programs shot up and I was now at 100% CPU with a huge load average, so it's possible the lack of available CPU made all the programs run slowly enough that the leak took too long to manifest. When I tried stopping these programs, valgrind showed no direct memory leaks; it showed some potential memory leaks, but I checked them and I don't think any of them represent real leaks. Besides, the possible leaks only amounted to a few kilobytes while the program was consuming over 100 MB. The reachable (non-leaked) memory reported by valgrind was also in the KB range, so valgrind seems to believe that my programs are consuming a fraction of the memory that top says they are using.
I've run a few other tests and got odd results. A single program, even running at triple the rate at which my original memory leak was detected, never seems to consume more than 0.9% of memory; two programs leak up to 1.9% and 1.3% of memory respectively, but no more, etc. It's as if the amount of memory leaked, and the rate at which it leaks, is somehow dependent on how many instances of my program are running at one time, which makes no sense; each instance should be 100% independent of the others.
I also found that if I run 32 instances with only one of them inside valgrind, the valgrinded instance (that's a word if I say it is!) leaks memory, but at a slower rate than the ones running outside of valgrind. The valgrind instance will still say I have no direct leaks and reports far less memory consumption than top shows.
I'm rather stumped as to what could be causing this result, and why valgrind refuses to be aware of the memory leak. I thought it might be an outside library, but I don't really use any external libraries; just basic C++ functions/objects. I also considered that it could be the data being written to the output pipe too fast, causing the buffer to grow indefinitely, but 1) there should be an upper limit to which such a buffer can grow and 2) once memory has been leaked, if I drop the data input rate to nothing, the memory stays consumed rather than slowly dropping back to a reasonable amount.
Can anyone give me a hint as to where I should look from here? I'm totally stumped as to why the memory is behaving this way.
Thanks.
This sounds like a problem I had recently.
If your program accepts data and buffers it internally without any limits, then it may be reading and buffering faster than it can output the data. In that case, memory use will continue to increase without limit.
The more instances of the program that you run, the slower each instance will go, and the faster the buffers will increase.
This may or may not be your problem, but without more information it is the best I can do.
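If unbounded internal buffering is indeed the cause, the usual fix is to cap the buffer so the reader blocks (or drops data) once the writer falls behind. A minimal sketch of such a bounded queue follows; the class and member names are illustrative.

    #include <condition_variable>
    #include <cstddef>
    #include <mutex>
    #include <queue>
    #include <string>

    class BoundedQueue {
    public:
        explicit BoundedQueue(std::size_t max_items) : max_items_(max_items) {}

        // Called by the socket-reading thread; blocks while the consumer is behind,
        // so memory use is capped at roughly max_items * message size.
        void push(std::string msg) {
            std::unique_lock<std::mutex> lock(m_);
            not_full_.wait(lock, [&] { return q_.size() < max_items_; });
            q_.push(std::move(msg));
            not_empty_.notify_one();
        }

        // Called by the pipe-writing thread.
        std::string pop() {
            std::unique_lock<std::mutex> lock(m_);
            not_empty_.wait(lock, [&] { return !q_.empty(); });
            std::string msg = std::move(q_.front());
            q_.pop();
            not_full_.notify_one();
            return msg;
        }

    private:
        std::mutex m_;
        std::condition_variable not_full_, not_empty_;
        std::queue<std::string> q_;
        std::size_t max_items_;
    };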
You should first look for a soft leak. It happens when some static or singleton object gradually grows some buffer or container and collects garbage in it. Technically it is not a leak, but its effects are just as bad.
May I suggest you give MemoryScape a try? This tool does a pretty good job at memory leak detection. It's not free, but given the time and energy spent, it is worth trying.
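To illustrate what a soft leak can look like (all names here are made up): a singleton cache that only ever grows. memcheck sees nothing, because every byte stays reachable, yet the process keeps getting bigger.

    #include <map>
    #include <string>

    std::map<std::string, std::string>& result_cache() {
        static std::map<std::string, std::string> cache;  // lives until process exit
        return cache;
    }

    std::string handle_request(const std::string& key) {
        auto& cache = result_cache();
        auto it = cache.find(key);
        if (it != cache.end()) return it->second;
        std::string result = "computed:" + key;  // stand-in for real work
        cache.emplace(key, result);              // inserted, never evicted
        return result;
    }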

How to perform cache operations in C++?

I want to run my C++ program after flushing the cache. Before running my program, I do not know what is in the cache. Is there some manner in C++ on Ubuntu by which I may flush the cache before running my program?
EDIT: The motive for flushing the cache is that each time I run my program, I do not want my existing data structures to already be in the cache... I mean I want a cold cache, whereby all the accesses are made from the disk.
One way of achieving this is to restart the computer... but considering the number of experiments that I have to run, this is not feasible for me. So, can anyone be kind enough to guide me as to how I can achieve this?
You have no need to flush the cache from your user-mode (non-kernel-mode) program. The OS (Linux, in the case of Ubuntu) provides your application with a fresh virtual address space, with no "leftover stuff" from other programs. Without executing special OS system calls, your program can't even get to memory that's used by other applications. So from a cache perspective, your application starts from a clean slate, as far as it is concerned. There are cacheflush() system calls (syntax differs by OS), but unless you're doing something out of the ordinary for typical user-mode applications, you can just forget that the cache even exists; it's just there to speed up your program, and the OS and CPU manage it, so your app does not need to.
You may have also heard about "memory leaks" (memory allocated to your application that your application forgets to free/delete, which is "lost forever" once your application forgets about it). If you're writing a (potentially) long-running program, leaked memory is definitely a concern. But leaked memory is only an issue for the application that leaks it; in modern virtual-memory environments, if application A leaks memory, it doesn't affect application B. And when application A exits, the OS clears out its virtual address space, and any leaked memory is at that point reclaimed by the system and no longer consumes any system resources whatsoever. In many cases, programmers specifically choose to NOT free/delete a memory allocation, knowing that the OS will automatically reclaim the entire amount of the memory when the application exits. There's nothing wrong with that strategy, as long as the program doesn't keep doing that on a repeating basis, exhausting its virtual address space.
This is a common question.
Firstly you have to understand that the caches are never really empty; just like a register, a cache is always there and always holds some value. The phrase "flushing the cache" actually refers to writing the cache contents back to memory, which is the territory of memory barriers.
see https://en.wikipedia.org/wiki/Memory_barrier
This is not your problem, and so you are using the wrong terminology.
What you really want is to fill the cache with the wrong values. This is harder than it sounds, because you are fighting all the optimisations that normally are your friend. Memcpy'ing a large block of memory (several MB, given the size of today's caches) should normally work, though.
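As a concrete sketch of "filling the cache with the wrong values": walk a buffer several times larger than the last-level cache so your own data gets evicted before the measured run. The 64 MB size and the 64-byte line stride are assumptions; adjust them to your CPU.

    #include <cstddef>
    #include <vector>

    void thrash_cpu_caches() {
        const std::size_t bytes = 64 * 1024 * 1024;       // assumed to exceed the LLC
        std::vector<unsigned char> junk(bytes, 1);
        volatile unsigned char sink = 0;                  // stops the reads being optimised away
        for (int pass = 0; pass < 4; ++pass)
            for (std::size_t i = 0; i < bytes; i += 64)   // touch one byte per cache line
                sink = sink + junk[i];
    }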
However...
You also have file caches and other things that will give your application an unfair advantage. This can be a very complex subject, and is a small project in its own right.
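For the file-cache part specifically, on Linux you can drop the page cache between experiments by writing to /proc/sys/vm/drop_caches. A sketch follows; note this is Linux-specific, requires root, and is something to run between experiments rather than inside the program being measured.

    #include <cstdio>
    #include <unistd.h>

    bool drop_page_cache() {
        sync();  // write dirty pages out first, so clean pages can actually be dropped
        std::FILE* f = std::fopen("/proc/sys/vm/drop_caches", "w");
        if (!f) return false;                 // typically fails unless running as root
        bool ok = std::fputs("3\n", f) >= 0;  // 3 = drop page cache plus dentries and inodes
        std::fclose(f);
        return ok;
    }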

Increased memory usage for a process

I have a C++ process running in Solaris which creates 3 threads to do some tasks.
These threads execute in loops and run as long as the process is running.
But, I see that the memory usage of the process grows continuously and the process core dumps once the memory usage exceeds 4GB.
Can someone give me some pointers on what could be the issue behind memory usage growth?
What can I do to prevent process from core dumping because of memory exhaustion?
Will thread restart help?
Any pointers welcome.
No, restarting a thread would not help.
It seems like you have a memory leak in your application.
In my experience there are two types of memory leaks:
real memory leaks that you can see when the application exits
'false' memory leaks, like a big list that increases during the lifetime of your application but which is correctly cleaned up at the end
For the first type, there are tools which can report the memory that has not been freed by your application when it exits. I don't know about Solaris but there are numerous tools under Windows which can do that. For Unix, I think that Valgrind does this.
For the second type, there are also tools under Windows that can take snapshots of the memory of your application. Simply take two snapshots with an interval of a few minutes or hours (depending on your application) and let the tool compare them. There are probably similar tools on Solaris.
Using these tools will probably cause your application to use much more memory, since the tool needs to store the call stack of every memory allocation. Because of this it will also run much slower. However, you will only see this effect while you are actively using the tool, so there is no effect on real-life production code.
So, just look for this kind of tools under Solaris. I quickly Googled for it and found this link: http://prefetch.net/blog/index.php/2006/02/19/finding-memory-leaks-on-solaris-systems/. This could be a starting point.
EDIT: Some additional information: are you looking at the right kind of memory? Even if you only allocated 3GB in total, the total virtual address space may still reach 4GB because of memory fragmentation. Unfortunately, there is nothing you can do about this (except using another memory allocation strategy).

Any useful suggestions to figure out where memory is being free'd in a Win32 process?

An application I am working with is exhibiting the following behaviour:
During a particular high-memory operation, the memory usage of the process under Task Manager (Mem Usage stat) reaches a peak of approximately 2.5GB (Note: A registry key has been set to allow this, as usually there is a maximum of 2GB for a process under 32-bit Windows)
After the operation is complete, the process size slowly starts decreasing at a rate of 1MB per second.
I am trying to figure out the easiest way to quickly determine who is freeing this memory, and where it is being free'd.
I am having trouble attaching a memory profiler to my code, and I don't particularly want to override the new/delete operators to track the allocations/deallocations (IOW, I want to do this without re-compiling my code).
Can anyone offer any useful suggestions of how I could do this via the Visual Studio debugger?
Update
I should also mention that it's a multi-threaded application, so pausing the application and analysing the call stack through the debugger is not the most desirable option. I considered freezing different threads one at a time to see if the memory stops reducing, but I'm fairly certain this will cause the application to crash.
Ahh! You're looking at the wrong counter!
Mem Usage doesn't tell you that memory is being freed. Only that the working set is being purged! This could mean some other application needs memory, or the VMM decided to mark some of your process's pages as Stand By for some other process to quickly use. It does not mean that VirtualFree, HeapFree or any other free function is being called.
Look at the commit size (VM Size, Private Bytes, etc).
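If it helps to watch both numbers from inside the process, here is a sketch using GetProcessMemoryInfo (Windows-specific; link against Psapi.lib). WorkingSetSize is roughly what Task Manager's Mem Usage column shows, while PrivateUsage tracks committed memory that only shrinks when something is actually freed or decommitted.

    #include <windows.h>
    #include <psapi.h>
    #include <cstdio>

    void print_memory_counters() {
        PROCESS_MEMORY_COUNTERS_EX pmc = {};
        if (GetProcessMemoryInfo(GetCurrentProcess(),
                                 reinterpret_cast<PROCESS_MEMORY_COUNTERS*>(&pmc),
                                 sizeof(pmc))) {
            std::printf("working set:   %zu KB\n", pmc.WorkingSetSize / 1024);  // Mem Usage
            std::printf("private bytes: %zu KB\n", pmc.PrivateUsage / 1024);    // commit
        }
    }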
But if you still want to know when memory is being decommitted or freed or what-have-you, then break on some free calls. E.g. (for Visual C++)
{,,kernel32.dll}HeapFree
or
{,,msvcr80.dll}free
etc.
Or just a regular function breakpoint on the above. Just make sure it resolves the address.
cdb/WinDbg let you do it via
bp kernel32!HeapFree
bp msvcrt!free
etc.
Names may vary depending on which CRT version you use and how you link against it (via /MT or /MD and their variants).
You might find this article useful:
http://www.gamasutra.com/view/feature/1430/monitoring_your_pcs_memory_usage_.php?print=1
Basically, what I had in mind was hooking the low-level allocation functions.
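For what it's worth, the compile-time C++ flavour of that idea is to override the global operator new/delete with counting versions. That means recompiling and relinking, which the question wanted to avoid (the article hooks the allocation functions at run time instead), so treat the sketch below purely as an illustration; the array forms and sized deletes are omitted for brevity.

    #include <atomic>
    #include <cstddef>
    #include <cstdlib>
    #include <new>

    std::atomic<long long> g_net_bytes{0};  // bytes allocated minus bytes freed

    void* operator new(std::size_t size) {
        // Over-allocate by a maximally aligned header so the size is known at delete time.
        void* p = std::malloc(size + alignof(std::max_align_t));
        if (!p) throw std::bad_alloc();
        *static_cast<std::size_t*>(p) = size;
        g_net_bytes += static_cast<long long>(size);
        return static_cast<char*>(p) + alignof(std::max_align_t);
    }

    void operator delete(void* p) noexcept {
        if (!p) return;
        char* real = static_cast<char*>(p) - alignof(std::max_align_t);
        g_net_bytes -= static_cast<long long>(*reinterpret_cast<std::size_t*>(real));
        std::free(real);
    }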
A couple different ideas:
The C runtime has a set of memory debugging functions; you'd need to recompile, though. You could take a snapshot at computation completion and another one later, and use _CrtMemDifference to see what changed (a short sketch of this follows after the next idea).
Or, you can attach to the process in your debugger, and cause it to dump a core before and after the memory freeing. Using NTSD, you can see what heaps are around, and the sizes of things. (You'll need a lot of disk space, and a fair amount of patience.) There's a setting (I think you get it through gflags, but I don't remember) that causes it to save a piece of the call stack as part of the dump; using that you can figure out what kind of object is being deallocated. Unfortunately, it only stores 4 or 5 stack frames, so you'll likely have to do something more clever as the next step to figure out where it's being freed. Either look at the code ("oh yeah, there's only one place where that can happen") or put in breakpoints on those destructors, or add tracing to the allocations and deallocations.
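Here is the _CrtMemDifference idea from the first suggestion as a short sketch. It assumes an MSVC _DEBUG build (the debug CRT), and high_memory_operation() is a placeholder for whatever the real work is.

    #include <crtdbg.h>

    void high_memory_operation();  // placeholder for the real computation

    void diff_heap_around_operation() {
        _CrtMemState before, after, diff;
        _CrtMemCheckpoint(&before);           // snapshot of the CRT heap
        high_memory_operation();
        _CrtMemCheckpoint(&after);            // snapshot after the work completes
        if (_CrtMemDifference(&diff, &before, &after))
            _CrtMemDumpStatistics(&diff);     // block counts and byte totals that changed
    }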
If your memory manager wipes free'd data to a known value (usually something like 0xfeeefeee), you can set a data breakpoint on a particular instance of something you're interested in. When it gets free'd, the breakpoint will trigger when the memory gets wiped.
I recommend you check out the UMDH tool that comes with Debugging Tools for Windows (you can find usage notes and samples in the debugging tools help). You can snapshot a running process's heap allocations with stack traces and compare them.
You could try Memory Validator to monitor the allocations and deallocations. Memory Validator has a couple of features that will help you identify where data is being deallocated:
Hotspots view. This can show you a tree of all allocations and deallocations or just all allocations or just all deallocations. It presents the data as a percentage of memory activity (based on amount of memory (de)allocated at a given location).
Analysis view. You can perform queries asking for data in a given address range. You can restrict these queries to any of alloc, realloc, dealloc behaviours.
Objects view. You can view allocations by type and see the maximum number of objects of each type (plus lots of other stats). Right-click on a type to get a context menu and choose "show all deallocations" - this will show deallocation locations for that type on the Analysis tab.
I think the Hotspots view may give you the insight you need.