High memory usage for dummies - c++

I've just restarted my Firefox web browser again because it started stuttering and slowing down. This happens every other day due to (as I understand it) excessive memory usage.
I've noticed it takes 40M when it starts and then, by the time I notice the slowdown, it has grown to 1G and my machine has nothing more to offer unless I close other applications.
I'm trying to understand the technical reasons behind why it's such a difficult problem to solve.
Mozilla have a page about high memory usage:
http://support.mozilla.com/en-US/kb/High+memory+usage
But I'm looking for a slightly more in depth and satisfying explanation. Not super technical but enough to give the issue more respect and please the crowd here.
Some questions I'm already pondering (they could be silly so take it easy):
When I close all tabs, why doesn't the memory usage go all the way down?
Why are there no limits on extension/theme/plugin memory usage?
Why does the memory usage increase if it's left open for long periods of time?
Why are memory leaks so difficult to find and fix?
App and language agnostic answers also much appreciated.

Browsers are like people - they get old, they get bloated, and they get ditched for younger and leaner models.
Firefox is not just a browser, it's an ecosystem.
While I feel that recent versions are quite bloated, the core product is generally stable.
However, Firefox is an ecosystem/platform for:
1) Badly written plug-ins.
2) Badly written JavaScript code that executes within it.
3) Adobe Flash as a platform for heavyweight video and for poorly written ad scripts such as 'hit Osama bin Laden with a duck to reduce your mortgage rate and receive a free iPod* (participation required)'.
4) QuickTime and other media players.
5) Some embedded Java code.
The description of a memory leak suggests a script running amok or a third-party tool requesting more memory. If you ever run Flash on a Mac, that's almost a given along with 90% CPU utilization.
The goal of most programming languages is not to save you but to give you tools to save yourself. You can write bad and bloated code with memory leaks in any language, including ones with garbage collection. Third party tools are usually not as well tested as the platform itself. Web pages that try to do too much are also not uncommon.
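As a minimal illustration (a generic C++ sketch, not anything from Firefox itself): a cache that only ever grows will eat memory for the life of the process even though nothing is "lost" in the classic leaked-pointer sense.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Sketch: an unbounded cache. Entries are inserted but never evicted, so
// memory use only grows for the life of the process, even though no
// pointer is ever "lost" in the classic leak sense.
std::map<std::string, std::vector<char> > g_cache;

void remember(const std::string &key, const std::vector<char> &bytes)
{
    g_cache[key] = bytes;   // inserted, never removed, never bounded
}

int main()
{
    for (int i = 0; i < 10000; ++i) {
        std::ostringstream key;
        key << "http://example.com/page" << i;
        remember(key.str(), std::vector<char>(64 * 1024));   // ~64 KB per entry
    }
    return 0;   // several hundred MB are still held here
}
```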
If you want to do an experiment to demonstrate this, get a mac with Firefox and go to a well-written site like Stack Overflow and spend an hour. Your memory usage shouldn't grow much. Then spend 5 minutes visiting random pages on Myspace.
Now let me try to answer your questions, based on my guesses, since I'm not familiar with the source code.
When I close all tabs, why doesn't the memory usage go all the way down?
Whereas each browser instance is an independent process with its own memory, the tabs in a single window all live within the same process. Firefox used to have some sort of in-memory caching, and merely closing a tab doesn't immediately clear the relevant information from that cache. If you reopened a tab to the same site, you might get better performance. There was some advanced option that let you disable it, something like browser.cache.memory.enable; or just search for how to disable the memory cache.
Why are there no limits on extension/theme/plugin memory usage?
For the same reason that Windows or Linux doesn't have a vetting process on applications you can run on them. It's an open environment and you assume the risk. If you want an environment where applications and extensions are 'validated', Apple might be the way to go :)
Why does the memory usage increase if it's left open for long periods of time?
Not all calculations and actions in a script have visual manifestations. A script could be doing work in the background (requesting extra material, pre-fetching things, or just hitting bugs) even if you don't see it.
Why are memory leaks so difficult to find and fix?
It's about bookkeeping. Think about every item you ever borrowed (even a pen) or that someone borrowed from you in your entire life. Are they all accounted for? Memory leaks are the same way (you borrow memory from the system), except that you also pass items around. Then look at the stuff on your desk: did you leave anything lying around because 'you might need it soon' even though you probably won't? Same story.

Why are memory leaks so difficult to find and fix?
Because some developers refuse to use tools like Electric Fence.

Memory leaks are present in the first place because you want to keep things in memory and not on disk. For example, suppose you have a web page with images, CSS, JavaScript, and text. If, to display the page, the browser had to go to the hard disk every time it wanted to use the JavaScript interpreter, the CSS parser, or the font rendering engine, it would be very slow and sometimes wouldn't work at all (because one piece of JavaScript might need variables left behind by another piece of JavaScript, for example). Therefore, a browser tries to keep everything necessary for its work in memory, and those things get cross-referenced easily (JavaScript calling into Adobe Flash, Adobe Flash calling back into JavaScript, and so on). And you have to be very careful with such resource references, because cleaning them up prematurely or out of order will break the code (better to keep a resource around than to die suddenly because it isn't there).
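As a rough sketch of that cross-referencing problem (generic modern C++, purely illustrative; the class names are made up): two objects that hold strong references to each other can never be freed by reference counting alone.

```cpp
#include <memory>

// Illustration of the cross-reference problem described above: a script
// engine and a plugin instance that each hold a strong reference to the
// other. Once they point at each other, neither use count can reach zero,
// so both stay in memory until the process exits.
struct Plugin;

struct ScriptEngine {
    std::shared_ptr<Plugin> plugin;        // "JavaScript calling into Flash"
};

struct Plugin {
    std::shared_ptr<ScriptEngine> engine;  // "Flash calling back into JavaScript"
};

int main()
{
    auto engine = std::make_shared<ScriptEngine>();
    auto plugin = std::make_shared<Plugin>();
    engine->plugin = plugin;
    plugin->engine = engine;               // cycle created

    // When the locals go out of scope, each use count drops to 1, not 0:
    // the pair keeps itself alive. Breaking the cycle (e.g. making one side
    // a std::weak_ptr) is the kind of careful bookkeeping the text means.
    return 0;
}
```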
P.S. See also this article for some gory details.

Related

Does my text editor application have a memory leak? Why does it consume 3x more memory than Notepad?

I am writing a text editor application. As an experiment I ran the application and monitored its memory usage in Task Manager as I performed different actions.
When I first launched the application, it used 3000 kB.
It stayed roughly the same when I typed.
When I clicked on save, it shot up to 9000 kB
and then it just stayed at 8500 kB (it didn't go back down to 3000 kB).
Is this caused by a memory leak? I'm a bit confused because I observed similar behaviour with Notepad:
Launching: 1500 kB
Saving: 6000 kB
After saving, memory stays at around 5000 kB
Also, why does my application take up 3x more memory than Notepad.exe? What kind of things could cause that? Should I be worried?
To start with you want to know where that memory is actually being used. There are a lot of complex programs to do memory analysis/profiling, but if you want something more detailed than Task Manager but still fairly simple and free, Sysinternals vmmap is great.
http://technet.microsoft.com/en-us/sysinternals/dd535533
As others have mentioned, the save is probably causing other libraries to be pulled in. The text itself is also going to contribute to your memory usage. VMMap will help you determine how much is yours and how much is other stuff. Then you can see whether your part is really growing substantially over time or not. If you are not going to use a memory profiler, you will probably need a long stress test to really see whether it is leaking memory; otherwise the leak is probably not going to be big enough to notice easily.
The File-Save dialog starting up for the first time probably burns a lot of memory. Opening the file dialog embeds a copy of Explorer in the window, for instance, and loading Explorer into your process carries a lot of baggage along with it.
The fact that you are using Qt means there's a lot of extra code added to your software. QtCore, for instance, is over 2 MB, and QtGui is about 8 MB. Microsoft, on the other hand, has probably coded Notepad using pure C/C++ and the Windows API, which means a smaller and faster executable.
Finally, it also depends on your compiler. MinGW will create larger and slower executables than the Visual C++ compiler, so if you can, try to use Microsoft's compiler.
I tried exactly the same thing in Notepad; the save needs more memory. If you open an existing file and save it, there is no difference in memory. It's creating the file that takes tons of memory, in the end.

Memory counter - Collision Detection Project

I thought I would ask the experts - see if you can help me :o)
My son has written C++ code for Collision Detection using Brute Force and Octree algorithms.
He has used Debug etc. - and to collect stats on memory usage he has used Windows Task Manager - which has given him all the end results he has needed so far. The results are not yet as they were expected to be (that the Octree would use more memory overall).
His tutor has suggested he checks memory once each is "initialised" and then plot at points through the test.
He was pointed in the direction of Valgrind ... but it looked quite complicated, and because he has autism he is worried that it might affect his programmes :o)
Can anyone suggest a simple way to grab the information on memory, if not also frame rate and CPU usage?
Any help gratefully received, as I know nothing so can't help him at all, except for typing this on here - as it's the "social" environment he can't deal with.
Thanks
Rosalyn
For the memory leaks:
If you're on Windows, Visual C++ by Microsoft (the Express version is free) has a nice tool for debugging memory leaks and is easy to set up; instructions can be found here. Otherwise, if you're on Linux, Valgrind is one of the standards. I have used the Visual C++ tool often and it's a nice verification that you have no memory leaks. You can also use it to make your program break on the allocation numbers that you get from the memory-leak log, so it quickly points you to when and where the leaking memory is being allocated. Again, it's easy to implement (just a few header files and then a single function call where you want to dump the leaks).
I have found the best way to implement the VC++ tool is to make the call to dump the memory leaks to the output window right before main returns a value. That way, you can catch the leaks of absolutely everything in your program. This works very well and I have used it for some advanced software.
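For reference, a minimal sketch of that setup using MSVC's CRT debug heap (debug builds only; the allocation number passed to _CrtSetBreakAlloc is an illustrative value you'd take from a previous leak report):

```cpp
// Minimal sketch, assuming MSVC and a debug build.
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

int main()
{
    // Automatically dump leaked blocks when the process exits...
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);

    // ...or break into the debugger when a specific allocation is made.
    // The number 147 is illustrative; take it from an earlier leak dump.
    // _CrtSetBreakAlloc(147);

    int *leaked = new int[32];   // deliberately never deleted
    (void)leaked;

    _CrtDumpMemoryLeaks();       // explicit dump right before main returns
    return 0;
}
```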
For the framerate and CPU usage:
I usually use my own tools for benchmarking, since they're not difficult to code once you learn which functions to call; this usually requires OS API calls, but I think Boost has something available that is cross-platform. There might be other tools out there that can track the process in the OS to get benchmarking data as well, but I'm not certain whether they would be free.
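As an example of the "roll your own" approach (a Win32-specific sketch only), a simple frame-rate counter can be built on QueryPerformanceCounter; call tick() once per rendered frame:

```cpp
#include <windows.h>
#include <cstdio>

// Minimal home-made frame-rate counter using the Win32 high-resolution
// timer. tick() counts frames and prints the average FPS once per second.
class FrameCounter {
public:
    FrameCounter() : frames_(0)
    {
        QueryPerformanceFrequency(&freq_);
        QueryPerformanceCounter(&last_);
    }
    void tick()
    {
        ++frames_;
        LARGE_INTEGER now;
        QueryPerformanceCounter(&now);
        double seconds = double(now.QuadPart - last_.QuadPart) / double(freq_.QuadPart);
        if (seconds >= 1.0) {
            std::printf("%.1f fps\n", frames_ / seconds);
            frames_ = 0;
            last_ = now;
        }
    }
private:
    LARGE_INTEGER freq_, last_;
    int frames_;
};
```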
It looks like you're running under a Windows system. This isn't a programming solution, and you may have already tried it (so feel free to ignore), but if not, you should take a look at Performance Monitor (it's one of the tools that ships with Windows). It'll let you track all sorts of useful stats about individual processes and the system as a whole (CPU, commit size, etc.). It plots the results for you as a graph while the program is running, and you can save the results for future viewing.
On Windows 7, you get to it from here:
Control Panel\All Control Panel Items\Performance Information and Tools\Advanced Tools
Then Open Performance Monitor.
For older versions of windows, it used to be one of the administrative tools options.

Beyond Stack Sampling: C++ Profilers

A Hacker's Tale
The date is 12/02/10. The days before Christmas are dripping away and I've pretty much hit a major road block as a windows programmer. I've been using AQTime, I've tried sleepy, shiny, and very sleepy, and as we speak, VTune is installing. I've tried to use the VS2008 profiler, and it's been positively punishing as well as often insensible. I've used the random pause technique. I've examined call-trees. I've fired off function traces. But the sad painful fact of the matter is that the app I'm working with is over a million lines of code, with probably another million lines worth of third-party apps.
I need better tools. I've read the other topics. I've tried out each profiler listed in each topic. There simply has to be something better than these junky and expensive options, or ludicrous amounts of work for almost no gain. To further complicate matters, our code is heavily threaded, and runs a number of Qt Event loops, some of which are so fragile that they crash under heavy instrumentation due to timing delays. Don't ask me why we're running multiple event loops. No one can tell me.
Are there any options more along the lines of Valgrind in a windows environment?
Is there anything better than the long swath of broken tools I've already tried?
Is there anything designed to integrate with Qt, perhaps with a useful display of events in queue?
A full list of the tools I tried, with the ones that were really useful in italics:
AQTime: Rather good! Has some trouble with deep recursion, but the call graph is correct in these cases, and can be used to clear up any confusion you might have. Not a perfect tool, but worth trying out. It might suit your needs, and it certainly was good enough for me most of the time.
Random Pause attack in debug mode: Not enough information enough of the time. A good tool but not a complete solution.
Parallel Studios: The nuclear option. Obtrusive, weird, and crazily powerful. I think you should hit up the 30 day evaluation, and figure out if it's a good fit. It's just darn cool, too.
AMD Codeanalyst: Wonderful, easy to use, very crash-prone, but I think that's an environment thing. I'd recommend trying it, as it is free.
Luke Stackwalker: Works fine on small projects, it's a bit trying to get it working on ours. Some good results though, and it definitely replaces Sleepy for my personal tasks.
PurifyPlus: No support for Win-x64 environments, most prominently Windows 7. Otherwise excellent. A number of my colleagues in other departments swear by it.
VS2008 Profiler: Produces output in the 100+gigs range in function trace mode at the required resolution. On the plus side, produces solid results.
GProf: Requires GCC to be even moderately effective.
VTune: VTune's W7 support borders on criminal. Otherwise excellent.
PIN: I'd need to hack up my own tool, so this is sort of a last resort.
Sleepy\VerySleepy: Useful for smaller apps, but failing me here.
EasyProfiler: Not bad if you don't mind a bit of manually injected code to indicate where to instrument.
Valgrind: *nix only, but very good when you're in that environment.
OProfile: Linux only.
Proffy: They shoot wild horses.
Suggested tools that I haven't tried:
XPerf:
Glowcode:
Devpartner:
Notes:
Intel environment at the moment. VS2008, boost libraries. Qt 4+. And the wretched humdinger of them all: Qt/MFC integration via trolltech.
Now: Almost two weeks later, it looks like my issue is resolved. Thanks to a variety of tools, including almost everything on the list and a couple of my personal tricks, we found the primary bottlenecks. However, I'm going to keep testing, exploring, and trying out new profilers as well as new tech. Why? Because I owe it to you guys, because you guys rock. It does slow the timeline down a little, but I'm still very excited to keep trying out new tools.
Synopsis
Among many other problems, a number of components had recently been switched to the incorrect threading model, causing serious hang-ups due to the fact that the code underneath us was suddenly no longer multithreaded. I can't say more because it violates my NDA, but I can tell you that this would never have been found by casual inspection or even by normal code review. Without profilers, callgraphs, and random pausing in conjunction, we'd still be screaming our fury at the beautiful blue arc of the sky. Thankfully, I work with some of the best hackers I've ever met, and I have access to an amazing 'verse full of great tools and great people.
Gentlefolk, I appreciate this tremendously, and only regret that I don't have enough rep to reward each of you with a bounty. I still think this is an important question to get a better answer to than the ones we've got so far on SO.
As a result, each week for the next three weeks, I'll be putting up the biggest bounty I can afford, and awarding it to the answer with the nicest tool that I think isn't common knowledge. After three weeks, we'll hopefully have accumulated a definitive profile of the profilers, if you'll pardon my punning.
Take-away
Use a profiler. They're good enough for Ritchie, Kernighan, Bentley, and Knuth. I don't care who you think you are. Use a profiler. If the one you've got doesn't work, find another. If you can't find one, code one. If you can't code one, or it's a small hang up, or you're just stuck, use random pausing. If all else fails, hire some grad students to bang out a profiler.
A Longer View
So, I thought it might be nice to write up a bit of a retrospective. I opted to work extensively with Parallel Studios, in part because it is actually built on top of the PIN Tool. Having had academic dealings with some of the researchers involved, I felt that this was probably a mark of some quality. Thankfully, I was right. While the GUI is a bit dreadful, I found IPS to be incredibly useful, though I can't comfortably recommend it for everyone. Critically, there's no obvious way to get line-level hit counts, something that AQT and a number of other profilers provide, and I've found very useful for examining rate of branch-selection among other things. In net, I've enjoyed using AQTime as well, and I've found their support to be really responsive. Again, I have to qualify my recommendation: A lot of their features don't work that well, and some of them are downright crash-prone on Win7x64. XPerf also performed admirably, but is agonizingly slow for the sampling detail required to get good reads on certain kinds of applications.
Right now, I'd have to say that I don't think there's a definitive option for profiling C++ code in a W7x64 environment, but there are certainly options that simply fail to perform any useful service.
First:
Time sampling profilers are more robust than CPU sampling profilers. I'm not extremely familiar with Windows development tools so I can't say which ones are which. Most profilers are CPU sampling.
A CPU sampling profiler grabs a stack trace every N instructions.
This technique will reveal portions of your code that are CPU bound. Which is awesome if that is the bottleneck in your application. Not so great if your application threads spend most of their time fighting over a mutex.
A time sampling profiler grabs a stack trace every N microseconds.
This technique will zero in on "slow" code, whether the cause is CPU-bound, blocking-IO-bound, mutex-bound, or cache-thrashing sections of code. In short, whatever piece of code is slowing your application will stand out.
So use a time sampling profiler if at all possible especially when profiling threaded code.
Second:
Sampling profilers generate gobs of data. The data is extremely useful, but there is often too much to be easily useful. A profile data visualizer helps tremendously here. The best tool I've found for profile data visualization is gprof2dot. Don't let the name fool you, it handles all kinds of sampling profiler output (AQtime, Sleepy, XPerf, etc). Once the visualization has pointed out the offending function(s), jump back to the raw profile data to get better hints on what the real cause is.
The gprof2dot tool generates a dot graph description that you then feed into a graphviz tool. The output is basically a callgraph with functions color coded by their impact on the application.
A few hints to get gprof2dot to generate nice output.
I use a --skew of 0.001 on my graphs so I can easily see the hot code paths. Otherwise the int main() dominates the graph.
If you're doing anything crazy with C++ templates you'll probably want to add --strip. This is especially true with Boost.
I use OProfile to generate my sampling data. To get good output I need to configure it to load the debug symbols from my 3rd party and system libraries. Be sure to do the same, otherwise you'll see that the CRT is taking 20% of your application's time when what's really going on is malloc trashing the heap and eating up 15%.
What happened when you tried random pausing? I use it all the time on a monster app. You said it did not give enough information, and you've suggested you need high resolution. Sometimes people need a little help in understanding how to use it.
What I do, under VS, is configure the stack display so it doesn't show me the function arguments, because that makes the stack display totally unreadable, IMO.
Then I take about 10 samples by hitting "pause" during the time it's making me wait. I use ^A, ^C, and ^V to copy them into notepad, for reference. Then I study each one, to try to figure out what it was in the process of trying to accomplish at that time.
If it was trying to accomplish something on 2 or more samples, and that thing is not strictly necessary, then I've found a live problem, and I know roughly how much fixing it will save.
There are things you don't really need to know: precise percentages are not important, and what goes on inside 3rd-party code is not important, because you can't do anything about those. What you can do something about is the rich set of call-points in code you can modify, displayed on each stack sample. That's your happy hunting ground.
Examples of the kinds of things I find:
During startup, it can be about 30 layers deep, in the process of trying to extract internationalized character strings from DLL resources. If the actual strings are examined, it can easily turn out that the strings don't really need to be internationalized, like they are strings the user never actually sees.
During normal usage, some code innocently sets a Modified property in some object. That object comes from a super-class that captures the change and triggers notifications that ripple throughout the entire data structure, manipulating the UI, creating and destroying objects in ways that are hard to foresee. This can happen a lot - the unexpected consequences of notifications.
Filling in a worksheet row-by-row, cell-by-cell. It turns out if you build the row all at once, from an array of values, it's a lot faster.
P.S. If you're multi-threaded, when you pause it, all threads pause. Take a look at the call stack of each thread. Chances are, only one of them is the real culprit, and the others are idling.
I've had some success with AMD CodeAnalyst.
Do you have an MFC OnIdle function? In the past I had a near real-time app I had to fix that was dropping serial packets when set at 19.2K speed which a PentiumD should have been able to keep up with. The OnIdle function was what was killing things. I'm not sure if QT has that concept, but I'd check for that too.
Re the VS Profiler -- if it's generating such large files, perhaps your sampling interval is too frequent? Try lowering it, as you probably have enough samples anyway.
And ideally, make sure you're not collecting samples until you're actually exercising the problem area. So start with collection paused, get your program to do its "slow activity", then start collection. You only need at most 20 seconds of collection. Stop collection after this.
This should help reduce your sample file sizes, and only capture what is necessary for your analysis.
I have successfully used PurifyPlus for Windows. Although it is not cheap, IBM provides a trial version that is slightly crippled. All you need for profiling with quantify are pdb files and linking with /FIXED:NO. Only drawback: No support for Win7/64.
Easyprofiler - I haven't seen it mentioned here yet, so I'm not sure if you've looked at it already. It takes a slightly different approach in how it gathers metric data. A drawback to its compile-time profiling approach is that you have to make changes to the code-base. Thus you'll need to have some idea of where the slow parts might be and insert profiling code there.
Going by your latest comments though, it sounds like you're at least making some headway. Perhaps this tool might provide some useful metrics for you. If nothing else it has some really purdy charts and pictures :P
Two more tool suggestions.
Luke Stackwalker has a cute name (even if it's trying a bit hard for my taste), it won't cost you anything, and you get the source code. It claims to support multi threaded programs, too. So it is surely worth a spin.
http://lukestackwalker.sourceforge.net/
Also Glowcode, which I've had pointed out to me as worth using:
http://www.glowcode.com/
Unfortunately I haven't done any PC work for a while, so I haven't tried either of these. I hope the suggestions are of help anyway.
Check out XPerf.
This is a free, non-invasive and extensible profiler offered by MS. It was developed by Microsoft to profile Windows.
If you're suspicious of the event loop, could you override QCoreApplication::notify() and do some manual profiling (one or two maps of senders/events to counts/times)?
I'm thinking that you first log the frequency of event types, then examine those events more carefully (which object sends them, what they contain, etc.). Signals across threads are queued implicitly, so they end up in the event loop (as do explicit queued connections, obviously).
We've done it to trap and report exceptions in our event handlers, so really, every event goes through there.
Just an idea.
Edit: I see now you mentioned this in your first post. Dammit, I never thought I'd be that guy.
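For what it's worth, a minimal sketch of that idea (assuming Qt 4.7+ for QElapsedTimer; a GUI app would subclass QApplication instead, and the 20 ms threshold is arbitrary):

```cpp
#include <QCoreApplication>
#include <QElapsedTimer>
#include <QEvent>
#include <QDebug>

// Sketch: time every event delivered through the event loop and log the
// ones that block the loop for too long.
class ProfilingApp : public QCoreApplication {
public:
    ProfilingApp(int &argc, char **argv) : QCoreApplication(argc, argv) {}

    virtual bool notify(QObject *receiver, QEvent *event)
    {
        QElapsedTimer timer;
        timer.start();
        const bool result = QCoreApplication::notify(receiver, event);  // normal delivery
        const qint64 ms = timer.elapsed();
        if (ms > 20)   // arbitrary threshold for a "slow" handler
            qDebug() << "slow event, type" << (int)event->type()
                     << "receiver" << receiver << ms << "ms";
        return result;
    }
};

int main(int argc, char **argv)
{
    ProfilingApp app(argc, argv);
    // ... set up objects, timers, queued connections ...
    return app.exec();
}
```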
You can use Pin to instrument your code with finer granularity. I think Pin would let you create a tool to count how many times you enter a function or how many clockticks you spend there, roughly emulating something like VTune or CodeAnalyst. Then you could strip down which functions get instrumented until your timing issues go away.
I can tell you what I use every day.
a) AMD Code Analyst
It is easy, and it will give you a quick overview of what is happening. It will be OK most of the time.
With AMD CPUs, it will tell you info about the CPU pipeline, but you only need this if you have heavy loops, like in graphics engines, video codecs, etc.
b) VTune.
It is very well integrated with VS2008.
After you know the hotspots, you need to sample not only time but other things, like cache misses and memory usage. This is very important. Set up a sampling session and edit the properties. I always sample for time, memory read/write, and cache misses (three different runs).
But more than the tool, you need to gain experience with profiling. And that means understanding how the CPU/memory/PCI bus works... so this is my third option.
c) Unit testing
This is very important if you are developing a big application that needs huge performance. If you cannot split the app into pieces, it will be difficult to track CPU usage. I don't test all the cases and classes, but I have hard-coded executions and input files with important features.
My advice is to use random sampling in several small tests, and to try to standardise a profiling strategy.
I use xperf/ETW for all of my profiling needs. It has a steep learning curve but is incredibly powerful. If you are profiling on Windows then you must know xperf. I frequently use this profiler to find performance problems in my code and in other people's code.
In the configuration that I use it:
xperf grabs CPU samples from every core that is executing code every ms. The sampling rate can be increased to 8 KHz and the samples include user-mode and kernel code. This allows finding out what a thread is doing while it is running.
xperf records every context switch (allowing for perfect reconstruction of how much time each thread uses), plus call stacks for when threads are switched in, plus call stacks for what thread readied another thread, allowing tracing of wait chains and finding out why a thread is not running.
xperf records all file I/O from all processes.
xperf records all disk I/O from all processes.
xperf records what window is active, the CPU frequency, CPU power state, UI delays, etc.
xperf can also record all heap allocations from one process, all virtual allocations from all processes, and much more.
That's a lot of data, all on one timeline, for all processes. No other profiler on Windows can do that.
I have blogged extensively about how to use xperf/ETW. These blog posts, and some professional-quality training videos, can be found here:
http://randomascii.wordpress.com/2014/08/19/etw-training-videos-available-now/
If you want to find out what might happen if you don't use xperf read these blog posts:
http://randomascii.wordpress.com/category/investigative-reporting/
These are tales of performance problems I have found in other people's code, that should have been found by the developers. This includes mshtml.dll being loaded into the VC++ compiler, a denial of service in VC++'s find-in-files, thermal throttling in a surprising number of customer machines, slow single-stepping in Visual Studio, a 4 GB allocation in a hard-disk driver, a powerpoint performance bug, and more.
I just finished the first usable version of CxxProf, a portable manual instrumented profiling library for C++.
It fulfills the following goals:
Easy integration
Easily remove the lib during compile time
Easily remove the lib during runtime
Support for multithreaded applications
Support for distributed systems
Keep impact to a minimum
These points were ripped from the project wiki, have a look there for more details.
Disclaimer: I'm the main developer of CxxProf.
Just to throw it out there, even though it's not a full-blown profiler: if all you're after is hung event loops that take a long time processing an event, an ad-hoc tool is a simple matter in Qt. That approach could easily be expanded to keep track of how long each event took to process, what those events were, and so on. It's not a universal profiler, but an event-loop-centric one.
In Qt, all cross-thread signal-slot calls are delivered via the event loop, as are timers, network and serial port notifications, and all user interaction. Thus, observing the event loops is a big step towards understanding where the application is spending its time.
DevPartner, originally developed by NuMega and now distributed by Micro Focus, was once the solution of choice for profiling and code analysis (memory and resource leaks, for example).
I haven't tried it recently, so I cannot assure you it will help you; but I once had excellent results with it, so this is an alternative I am considering re-installing in our code quality process (they provide a 14-day trial).
Though your OS is Win7, can't the program run under XP? How about profiling it under XP? The results should give a hint for Win7.
There are lots of profilers listed here and I've tried a few of them myself - however I ended up writing my own based on this:
http://code.google.com/p/high-performance-cplusplus-profiler/
It does of course require that you modify the code base, but it's perfect for narrowing down bottlenecks. It should work on all x86s (multi-core boxes could be a problem, since it uses rdtsc; however, this is purely for indicative timing anyway, so I find it sufficient for my needs).
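For the curious, indicative rdtsc timing of the kind that profiler relies on looks roughly like this (an MSVC-specific sketch using the __rdtsc intrinsic; cycle counts are only meaningful relative to each other):

```cpp
#include <intrin.h>   // __rdtsc on MSVC; GCC/Clang users can use <x86intrin.h>
#include <cstdio>

// Rough, indicative timing with the CPU timestamp counter. Frequency
// scaling and core migration skew absolute numbers, so treat the result
// as a relative hint, not a measurement.
int main()
{
    volatile double sum = 0.0;

    unsigned __int64 start = __rdtsc();
    for (int i = 0; i < 1000000; ++i)
        sum = sum + i * 0.5;                 // the code under test
    unsigned __int64 cycles = __rdtsc() - start;

    std::printf("sum=%f, ~%.0f cycles\n", (double)sum, (double)cycles);
    return 0;
}
```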
I use Orbit Profiler: easy, open source and powerful! https://orbitprofiler.com/

What to do when an out-of-memory error occurs? [duplicate]

Possible Duplicate:
What's the graceful way of handling out of memory situations in C/C++?
Hi,
This seems to be a simple question at first glance. And I don't want to start a huge discussion on what-is-the-best-way-to-do-this...
Context: Windows >= 5, 32 bit, C++, Windows SDK / Win32 API
But after asking a similar question, I read some MSDN material about Win32 memory management, so now I'm even more confused about what to do if an allocation fails, let's say via the C++ new operator.
So I'm very interested now in how you implement (and implicitly, if you do implement) an error handling for OOM in your applications.
If, where (the main function?), for which operations (allocations), and how you handle an OOM error.
(I don't really mean that subjectively, turning this into a question of preference, I just like to see different approaches that account for different conditions or fit different situations. So feel free to offer answers for GUI apps, services - user-mode stuff ....)
Some exemplary reactions to OOM to show what I mean:
GUI app: Message box, exit process
non-GUI app: Log error, exit process
service: try to recover, e.g. kill the thread that raised an exception, but continue execution
critical app: try again until an allocation succeeds (reducing the requested amount of memory)
hands off OOM; let the STL / Boost / the OS handle it
Thank you for your answers!
The best-explained way will receive the great honour of being the accepted answer :D - even if it only consists of a MessageBox line, but explains why everything else was useless, wrong or unnecessary.
Edit: I appreciate your answers so far, but I'm missing a bit of an actual answer; what I mean is most of you say don't mind OOM since you can't do anything when there's no memory left (system hangs / poor performance). But does that mean to avoid any error handling for OOM? Or only do a simple try-catch in the main showing a MessageBox?
On most modern OSes, OOM will occur long after the system has become completely unusable, since before actually running out, the virtual memory system will start paging physical RAM out to make room for allocating additional virtual memory and in all likelihood the hard disk will begin to thrash like crazy as pages have to be swapped in and out at higher and higher frequencies.
In short, you have much more serious concerns to deal with before you go anywhere near OOM conditions.
Side note: At the moment, the above statement isn't as true as it used to be, since 32-bit machines with loads of physical RAM can exhaust their address space before they start to page. But this is still not common and is only temporary, as 64-bit ramps up and approaches mainstream adoption.
Edit: It seems that 64-bit is already mainstream. While perusing the Dell web site, I couldn't find a single 32-bit system on offer.
You do the exact same thing you do when:
you created 10,000 windows
you allocated 10,000 handles
you created 2,000 threads
you exceeded your quota of kernel pool memory
you filled up the hard disk to capacity.
You send your customer a very humble message where you apologize for writing such crappy code and promise a delivery date for the bug fix. Anything else is not nearly good enough. How you want to be notified about it is up to you.
Basically, you should do whatever you can to avoid having the user lose important data. If disk space is available, you might write out recovery files. If you want to be super helpful, you might allocate recovery files while your program is open, to ensure that they will be available in case of emergency.
Simply display a message or dialog box (depending on whether you're in a terminal or a window system), saying "Error: Out of memory", possibly with debugging info, and include an option for your user to file a bug report, or a web link to where they can do that.
If you're really out of memory then, in all honesty, there's no point doing anything other than gracefully exiting; trying to handle the error is useless, as there is nothing you can do.
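A minimal sketch of that "report and exit gracefully" approach (Win32, with run_application standing in for whatever your program actually does):

```cpp
#include <new>        // std::bad_alloc
#include <cstdlib>    // EXIT_FAILURE / EXIT_SUCCESS
#include <windows.h>  // MessageBoxA

// Hypothetical placeholder for the application's real work.
static void run_application()
{
    // ... allocate, process, display ...
}

int main()
{
    try {
        run_application();
    }
    catch (const std::bad_alloc &) {
        // Keep the handler minimal: memory is already scarce at this point,
        // and even the dialog box may fail to appear.
        MessageBoxA(NULL, "Error: out of memory", "MyApp", MB_OK | MB_ICONERROR);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```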
In my case, what happens when you have an app that fragments memory so much it cannot allocate the contiguous block needed to process a huge number of nodes?
Well, I split the processing up as much as I could.
For OOM, you can do the same thing, chop your processes up into as many pieces as possible and do them sequentially.
Of course, for handling the error until you get to fix it (if you can!), you typically let it crash. Then you determine that those memory allocations are failing (like you never expected) and put an error message directly to the user, along the lines of "oh dear, it's all gone wrong; log a call with the support dept". In all cases, you inform the user however you like. Though, it's established practice to use whatever mechanism the app currently uses - if it writes to a log file, do that; if it displays an error dialog, do the same; if it uses the Windows 'send info to Microsoft' dialog, go right ahead and let that be the bearer of bad tidings - users are expecting it, so don't try to be clever and do something else.
It depends on your app, your skill level, and your time. If it needs to be running 24/7 then obviously you must handle it. It depends on the situation. Perhaps it may be possible to try a slower algorithm but one that requires less heap. Maybe you can add functionality so that if OOM does occur your app is capable of cleaning itself up, and so you can try again.
So I think the answer is 'ALL OF THE ABOVE!', apart from LET IT CRASH. You take pride in your work, right?
Don't fall into the 'there's loads of memory so it probably won't happen' trap. If every app writer took that attitude you'd see OOM far more often, and not all apps run on desktop machines; take a mobile phone for example - it's highly likely you'll run into OOM on a RAM-starved platform like that, trust me!
If all else fails display a useful message (assuming there's enough memory for a MessageBox!)

How to optimize paging for large in memory database

I have an application where the entire database is implemented in memory using a stl-map for each table in the database.
Each item in the stl-map is a complex object with references to other items in the other stl-maps.
The application works with a large amount of data, so it uses more than 500 MByte RAM. Clients are able to contact the application and get a filtered version of the entire database. This is done by running through the entire database, and finding items relevant for the client.
When the application has been running for an hour or so, Windows 2003 SP2 starts to page out parts of the application's RAM (even though there is 16 GByte of RAM on the machine).
After the application has been partly paged out, a client logon takes a long time (10 mins) because it now generates a page fault for each pointer lookup in the stl-map. If the client logon is run a second time right afterwards, it is fast (a few secs) because all the memory is back in RAM.
I can see it is possible to tell Windows to lock memory in RAM, but this is generally only recommended for device drivers, and only for "small" amounts of memory.
I guess a poor man's solution could be to loop through the entire memory database, and thus tell Windows we are still interested in keeping the data model in RAM.
I guess another poor man's solution could be to disable the pagefile completely on Windows.
I guess the expensive solution would be an SQL database, and then rewriting the entire application to use a database layer. Then hopefully the database system will have implemented means for fast access.
Are there other more elegant solutions ?
This sounds like either a memory leak, or a serious fragmentation problem. It seems to me that the first step would be to figure out what's causing 500 Mb of data to use up 16 Gb of RAM and still want more.
Edit: Windows has a working set trimmer that actively attempts to page out idle data. The basic idea is that it goes through and marks pages as being available, but leaves the data in them (and the virtual memory manager knows what data is in them). If, however, you attempt to access that memory before it's allocated to other purposes, it'll be marked as being in use again, which will normally prevent it from being paged out.
If you really think this is the source of your problem, you can indirectly control the working set trimmer by calling SetProcessWorkingSetSize. At least in my experience, this is only rarely of much use, but you may be in one of those unusual situations where it's really helpful.
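A minimal sketch of that call (the sizes are illustrative, and raising the maximum may require extra privileges such as SeIncreaseWorkingSetPrivilege):

```cpp
#include <windows.h>

// Sketch: ask the working-set trimmer to keep roughly 500 MB of this
// process resident. Values are illustrative only.
int main()
{
    SIZE_T minWs = 500u * 1024 * 1024;   // desired minimum working set
    SIZE_T maxWs = 600u * 1024 * 1024;   // allowed maximum working set

    if (!SetProcessWorkingSetSize(GetCurrentProcess(), minWs, maxWs)) {
        // GetLastError() explains the failure (insufficient privilege, etc.)
    }
    return 0;
}
```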
As @Jerry Coffin said, it really sounds like your actual problem is a memory leak. Fix that.
But for the record, none of your "poor mans solutions" would work. At all.
Windows pages out some of your data because there's not room for it in RAM.
Looping through the entire memory database would load in every byte of the data model, yes... which would cause other parts of it to be paged out. In the end, you'd generate a lot of page faults, and the only difference in the end would be which parts of the data structure are paged out.
Disabling the page file? Yes, if you think a hard crash is better than low performance. Windows doesn't page data out because it's fun. It does that to handle situations where it would otherwise run out of memory. If you disable the pagefile, the app will just crash when it would otherwise page out data.
If your dataset really is so big it doesn't fit in memory, then I don't see why an SQL database would be especially "expensive". Unlike your current solution, databases are optimized for this purpose. They're meant to handle datasets too large to fit in memory, and to do this efficiently.
It sounds like you have a memory leak. Fixing that would be the elegant, efficient and correct solution.
If you can't do that, then either
throw more RAM at the problem (the app ends up using 16GB? Throw 32 or 64GB at it then), or
switch to a format that's optimized for efficient disk access (A SQL database probably)
We had a similar problem, and the solution we chose was to allocate everything in a shared memory block. AFAIK, Windows doesn't page this out. However, using an stl-map here is not for the faint of heart either, and was beyond what we required.
We are using Boost Shared Memory to implement this for us and it works well. Follow the examples closely and you will be up and running quickly. Boost also has Boost.MultiIndex, which will do a lot of what you want.
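For illustration, a sketch along the lines of the Boost.Interprocess documentation examples (the segment name and size are made up; treat it as a starting point rather than a drop-in solution):

```cpp
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/containers/map.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <functional>
#include <utility>

namespace bip = boost::interprocess;

// A map whose nodes live entirely inside a named shared-memory segment.
typedef std::pair<const int, double> ValueType;
typedef bip::allocator<ValueType,
                       bip::managed_shared_memory::segment_manager> ShmAllocator;
typedef bip::map<int, double, std::less<int>, ShmAllocator> ShmMap;

int main()
{
    // "MyDbSegment" and the 64 MB size are illustrative values.
    bip::managed_shared_memory segment(bip::open_or_create, "MyDbSegment",
                                       64 * 1024 * 1024);
    ShmAllocator alloc(segment.get_segment_manager());

    // find_or_construct lets several processes attach to the same table.
    ShmMap *table = segment.find_or_construct<ShmMap>("PriceTable")
                                             (std::less<int>(), alloc);
    table->insert(ValueType(42, 3.14));   // used much like an ordinary std::map
    return 0;
}
```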
For a no-cost SQL solution, have you looked at SQLite? It has an option to run as an in-memory database.
Good luck, sounds like an interesting application.
I have an application where the entire database is implemented in memory using a stl-map for each table in the database.
That's the beginning of the end: STL's std::map is extremely memory inefficient. The same applies to std::list. Every element is allocated separately, causing rather serious memory waste. I often use std::vector + sort() + find() instead of std::map in applications where it is possible (more searches than modifications) and where I know in advance that memory usage might become an issue.
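A sketch of that vector-instead-of-map idea (illustrative types only): load everything, sort once, then answer lookups with binary search, paying for one contiguous allocation instead of one heap node per element.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// One contiguous, sorted vector instead of a node-per-element std::map.
typedef std::pair<int, std::string> Record;        // key, payload

struct ByKey {
    bool operator()(const Record &a, const Record &b) const { return a.first < b.first; }
};

const std::string *find(const std::vector<Record> &table, int key)
{
    std::vector<Record>::const_iterator it =
        std::lower_bound(table.begin(), table.end(),
                         Record(key, std::string()), ByKey());
    if (it != table.end() && it->first == key)
        return &it->second;
    return 0;   // not found
}

int main()
{
    std::vector<Record> table;
    table.push_back(Record(3, "three"));
    table.push_back(Record(1, "one"));
    table.push_back(Record(2, "two"));
    std::sort(table.begin(), table.end(), ByKey());   // sort once after loading

    const std::string *hit = find(table, 2);          // binary search per lookup
    return hit ? 0 : 1;
}
```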
When the application has been running for an hour or so, Windows 2003 SP2 starts to page out parts of the application's RAM (even though there is 16 GByte of RAM on the machine).
Hard to tell without knowing how your application is written. Windows has the ability to unload from RAM whatever memory of idle applications can be unloaded, but that normally affects memory-mapped files and the like.
Otherwise, I would strongly suggest reading the Windows memory management documentation. It is not very easy to understand, yet Windows has all sorts and types of memory available to applications. I never had much luck with it, but a custom std::allocator would probably work in your application.
I can believe it is the fault of flawed pagefile behaviour - I've run my laptops mostly with the pagefile turned off since NT 4.0. In my experience, at least up to XP Pro, Windows intrusively swaps pages out just to provide the dubious benefit of a really-really-slow extension to the maximum working set space.
Ask what benefit swapping to hard disk is achieving with 16 Gigabytes of real RAM available. If your working set is so big as to need more than 10+ Gigs of virtual memory, then once swapping is actually required, processes will take anything from a bit longer to thousands of times longer to complete. On Windows the untameable file system cache seems to antagonise the relationship.
Now when I (very) occasionally run out of working set on my XP laptops, there is no traffic jam; the guilty app just crashes. A utility to suspend memory-glugging processes before that point and raise an alert would be nice, but there is no such thing - just an access violation, a crash, and sometimes explorer.exe goes down too.
Pagefiles - who needs 'em?
---- Edit
Given snakefoot's explanation, the problem is that memory which is not used for a longer period of time gets swapped out, so the data is not in memory when it is needed. This is the same as this:
Can I tell Windows not to swap out a particular processes’ memory?
and the VirtualLock function should do the job:
http://msdn.microsoft.com/en-us/library/aa366895(VS.85).aspx
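A rough sketch of the VirtualLock approach (sizes are illustrative; locking a large range only succeeds after the working-set limits have been raised accordingly):

```cpp
#include <windows.h>

// Sketch: commit a region and ask the OS to keep it resident.
int main()
{
    const SIZE_T size = 256u * 1024 * 1024;          // region to keep resident

    void *block = VirtualAlloc(NULL, size,
                               MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (!block)
        return 1;

    // Default working-set limits are tiny, so grow them before locking.
    SetProcessWorkingSetSize(GetCurrentProcess(),
                             size + (32u << 20), size + (64u << 20));

    if (!VirtualLock(block, size)) {
        // Commonly fails with ERROR_WORKING_SET_QUOTA if the limits are too low.
    }

    // ... place the hot parts of the data model inside 'block' ...

    VirtualUnlock(block, size);
    VirtualFree(block, 0, MEM_RELEASE);
    return 0;
}
```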
---- Previous answer
First of all you need to distinguish between memory leak and memory need problems.
If you have a memory leak, then it would be a bigger effort to convert the entire application to SQL than to debug the application.
SQL cannot be faster than a well-designed, domain-specific in-memory database, and if you have bugs, chances are you will have different ones in an SQL version as well.
If this is a memory-need problem, then you will need to switch to SQL anyway, and this sounds like a good moment.