How to profile memory usage in C++?

I am aware of Valgrind, but it just detects memory management issues. What I am searching for is a tool that gives me an overview of which parts of my program consume how much memory. A graphical representation, e.g. a treemap (as KCachegrind provides for Callgrind output), would be cool.
I am working on a Linux machine, so Windows tools will not help me very much.

Use Massif, which is part of the Valgrind tool suite. massif-visualizer can help you graph the data, or you can just use the ms_print command.
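A typical Massif session looks like this (myprogram is a placeholder; the output file is suffixed with the pid of the run):

    valgrind --tool=massif ./myprogram     # writes massif.out.<pid>
    ms_print massif.out.<pid>              # text graph plus per-snapshot allocation tree
    massif-visualizer massif.out.<pid>     # interactive graph, if installed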

Try out the heap profiler delivered with gperftools, by Google. I've always built it from source, but it's available as a precompiled package under several Linux distros.
It's as simple to use as linking a dynamic library to your executable and running the program. It collects information about every dynamic memory allocation (as far as I've seen) and saves a memory dump to disk every time one of the following happens:
HEAP_PROFILE_ALLOCATION_INTERVAL bytes have been allocated by the program (default: 1 GB)
the high-water memory usage mark increases by HEAP_PROFILE_INUSE_INTERVAL bytes (default: 100 MB)
HEAP_PROFILE_TIME_INTERVAL seconds have elapsed (default: inactive)
You explicitly call HeapProfilerDump() from your code
The last one, in my experience, is the most useful because you can control exactly when to take a snapshot of the heap usage, and you can then compare two different snapshots and see what's wrong.
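A minimal sketch of that explicit-dump workflow, using the HeapProfilerStart/HeapProfilerDump/HeapProfilerStop calls from gperftools' heap-profiler.h (the profile name "myprog" and the build line are just examples):

    // Build sketch (the heap profiler lives in libtcmalloc):
    //   g++ -g demo.cpp -o demo -ltcmalloc
    #include <gperftools/heap-profiler.h>
    #include <vector>

    int main() {
        HeapProfilerStart("myprog");        // dumps land in myprog.0001.heap, ...
        std::vector<int>* big = new std::vector<int>(10000000);
        HeapProfilerDump("after-alloc");    // first snapshot: the vector is live
        delete big;
        HeapProfilerDump("after-free");     // second snapshot, for comparison
        HeapProfilerStop();
        return 0;
    }

Comparing the two dumps with pprof --text --base=myprog.0001.heap ./demo myprog.0002.heap then shows only what changed between the snapshots.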
Finally, there are several possible output formats, textual or graphical (in the form of a directed graph), which you render from the dumps with the bundled pprof tool.
Using this tool I've been able to spot incorrect memory usage that I couldn't find using Massif.

A "newer" option is HeapTrack. Contrary to massif, it is an instrumented version of malloc/free that stores all the calls and dumps a log.
The GUI is nice (but requires Qt5 IIRC) and the results timings (because you may want to track time as well) are less biased than valgrind (as they are not emulated).
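For reference, a typical Heaptrack session is just this (the record file name and extension vary by version and pid):

    heaptrack ./myprogram                              # records to heaptrack.myprogram.<pid>.gz
    heaptrack --analyze heaptrack.myprogram.<pid>.gz   # opens the GUI, or heaptrack_print without Qt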

You can also use Valgrind's Callgrind tool (valgrind --tool=callgrind), but note that Callgrind profiles call costs, not memory consumption; for heap profiling, the Valgrind tool you want is Massif, as described above.

Related

Troubleshoot C++ program memory usage issue

I am writing a C++ program and find it consumes too much memory. I would like to know which parts of the program consume the most memory; ideally, I would like to know what percentage of memory is consumed by which kind of C++ object at a particular moment.
In Java, I know of tools like Eclipse Memory Analyzer (https://www.eclipse.org/mat/) that can take a heap dump and show/visualize such memory usage, and I wonder if this can be done for a C++ program. For example, I expect a tool/approach that tells me a particular vector<shared_ptr<MyObject>> is holding 30% of the memory.
Note:
I develop the program mainly on macOS (compiling with Apple Clang), so it would be better if the approach works on macOS. But I deploy to Linux as well (compiling with GCC), so approaches/tools on Linux are okay too.
I tried using Apple's Instruments for this purpose, but so far I can only use it to find memory allocation issues. I have no idea how to figure out the memory consumption of the program at a particular moment (the memory consumption should be tied back to the C++ objects in the program so that I can act to reduce it accordingly).
I haven't found an easy way to visualize/summarize each part of my program's memory yet. So far, the best tool/approach I have found is Apple's Instruments (if you are on macOS).
In Instruments, use the Allocations profiling template. When using this template, choose File ==> Recording Options and check the Discard events for freed memory option.
You will then be able to see the un-freed memory (i.e. the data that is still in memory) during the allocation recording. If your program's debug symbols are loaded, you can see which functions produced those allocations.
Although this doesn't address all the issues, it does help to identify part of the problem.
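Instruments aside, if you want a rough in-code answer for one suspect container (the vector<shared_ptr<MyObject>> case from the question), one portable trick is a counting allocator. This is a hypothetical sketch, not an existing library API; CountingAllocator and g_tracked_bytes are made-up names:

    #include <atomic>
    #include <cstddef>
    #include <iostream>
    #include <memory>
    #include <new>
    #include <vector>

    std::atomic<std::size_t> g_tracked_bytes(0);   // live bytes held via this allocator

    template <class T>
    struct CountingAllocator {
        typedef T value_type;
        CountingAllocator() {}
        template <class U> CountingAllocator(const CountingAllocator<U>&) {}
        T* allocate(std::size_t n) {
            g_tracked_bytes += n * sizeof(T);
            return static_cast<T*>(::operator new(n * sizeof(T)));
        }
        void deallocate(T* p, std::size_t n) {
            g_tracked_bytes -= n * sizeof(T);
            ::operator delete(p);
        }
    };
    template <class T, class U>
    bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
    template <class T, class U>
    bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

    int main() {
        std::vector<std::shared_ptr<int>, CountingAllocator<std::shared_ptr<int> > > v(100000);
        std::cout << "suspect container holds " << g_tracked_bytes << " bytes\n";
        return 0;
    }

It only counts what flows through that allocator (the vector's element buffer, not the pointed-to objects), so it gives a per-container figure you can compare against the process total.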

Find out where memory is consumed

I have this relatively large numerical application code that may run for a few days and eventually spit out some numbers. The whole thing is written in C++, making use of a bunch of 3rd-party libraries, and compiled using GCC 4.6. The code uses shared pointers throughout.
Unfortunately, over time the memory consumption of the code increases until all of the machine's (shared) memory is used up, at which point it crashes. Algorithmically, the code shouldn't accumulate memory over time, so there must be a bug somewhere.
I did run a small example through Valgrind's leak checker, which reports that all is fine. My guess is that shared pointers are unintentionally being created and kept somewhere, preventing unneeded data from being freed along the way (but this is just a guess).
At the end of the day, I'm running out of ideas for how to debug such a thing.
Any ideas?
Since you already have the Valgrind tool suite available, I would advise you to run the Massif tool.
Massif tracks the origin of memory allocations, and its report indicates how many bytes each allocation site/function created. This will help you understand where that memory blow-up comes from.
GNU libstdc++ defaults to caching STL-related memory allocations, apparently for micro-benchmark speed reasons. However, the actual effect tends to be quite negative for both speed and memory footprint when using allocators such as tcmalloc and jemalloc. tcmalloc disables this behavior by setting GLIBCPP_FORCE_NEW=1 and GLIBCXX_FORCE_NEW=1 in the environment (for libstdc++ versions 3.3 and 3.4, respectively), but I know of no other allocator that does so. It is therefore generally a good idea to set the appropriate environment variable when launching your application.
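Concretely, the launch would look like this (a sketch; pick the variable matching your libstdc++ version, and substitute your binary for myprogram):

    GLIBCXX_FORCE_NEW=1 ./myprogram    # libstdc++ 3.4 and later
    GLIBCPP_FORCE_NEW=1 ./myprogram    # the older 3.3 spelling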
Even if you have no leaks, you could face memory fragmentation.
If you are on Linux, I suggest trying the jemalloc allocator. It runs great on Linux and on many architectures; I have used it successfully even on zLinux (on an IBM zSeries mainframe). It's really easy to use: you don't even need to rebuild your application or any libraries, just build jemalloc and start your application with LD_PRELOAD set like this: LD_PRELOAD=/usr/lib/libjemalloc.so <app>
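As a hedged illustration of the shared-pointer guess from the question: memory that is still reachable at program exit is not reported as "definitely lost" by the leak checker, even though it grows without bound. The cache name here is made up:

    #include <memory>
    #include <vector>

    std::vector<std::shared_ptr<int> > cache;   // global container that is never pruned

    void process() {
        std::shared_ptr<int> data = std::make_shared<int>(42);
        cache.push_back(data);   // lingering reference keeps every 'data' alive
    }

    int main() {
        for (int i = 0; i < 1000000; ++i)
            process();
        // At exit everything is still reachable through 'cache', so Valgrind's
        // leak checker reports no definite leaks; a heap profiler such as Massif
        // instead points at the push_back allocation site as the growth source.
        return 0;
    }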

How can I see how much memory my program is eating?

My program is running and creating variables; I need to know the total number of bytes these variables take.
I don't want to know how much physical memory the system gives my program for its execution; I know I can open the process manager and find that out.
Nor do I want to write sizeof calls and aggregation code into my program so I can total up the size of the variable pool (let's say the code is too complex to be modified like that).
Finally, I'm using Microsoft VC++ 2010 Express; I just want to know if there is a workspace that monitors that kind of information.
Thanks in advance.
Check this out: Memory Performance Information. There are a few metrics of a running process you might be interested in; you will primarily want private bytes, and this data is available both programmatically and through tools like Performance Monitor. You can also enumerate the heaps of the process with GetProcessHeaps (and even HeapWalk if you need details) and check heap allocation sizes directly.
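For the programmatic route, here is a minimal sketch using the documented GetProcessMemoryInfo API (link against psapi.lib; PrivateUsage is the private-bytes figure mentioned above):

    #include <windows.h>
    #include <psapi.h>
    #include <iostream>

    int main() {
        PROCESS_MEMORY_COUNTERS_EX pmc = {};
        if (GetProcessMemoryInfo(GetCurrentProcess(),
                                 reinterpret_cast<PROCESS_MEMORY_COUNTERS*>(&pmc),
                                 sizeof(pmc))) {
            std::cout << "private bytes: " << pmc.PrivateUsage   << "\n";
            std::cout << "working set:   " << pmc.WorkingSetSize << "\n";
        }
        return 0;
    }

You can call this (or the GetProcessHeaps/HeapWalk pair) at interesting moments of the run instead of watching the process manager.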
The Valgrind Massif profiler is a great tool, but it is available only for Unix/Linux, I think. In your case, on Windows, I think Insure++ or the SoftwareVerify tools are good choices (they are commercial tools).
A free alternative is Google's tcmalloc, which comes with a heap profiler.

Newbie: Performance Analysis through the command line

I am looking for a performance analysis tool with the following properties:
1. Free.
2. Runs on Windows.
3. Does not require using a GUI (i.e. can be run from the command line or by using some library in any programming language).
4. Runs on some x86-based architecture (preferably Intel).
5. Can measure the running time of my C++, MinGW-compiled program, except for the time spent in a few certain functions I specify (and all calls emanating from them).
6. Can measure the amount of memory used by my program, except for the memory allocated by those functions specified in (5) and all calls emanating from them.
A tool that has properties (1) to (5) (without 6) can still be very valuable to me.
My goal is to be able to compare the running time and memory usage of different programs in a consistent way (i.e. the main requirement is that timing the same program twice returns roughly the same results).
MinGW should already include a gprof tool. To use it, you just need to compile with the correct flags set; I think they were -g -pg.
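For example (file names are placeholders):

    g++ -g -pg main.cpp -o myprog       # compile and link with gprof instrumentation
    ./myprog                            # run normally; writes gmon.out on exit
    gprof myprog gmon.out > report.txt  # flat profile and call graph, purely command line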
For (free) heap analysis you can use umdh.exe, which is a full heap dumper; you can also compare consecutive memory snapshots to check for leakage over time. You would have to filter the output yourself to remove the functions that are not of interest, however.
I know that's not exactly what you were asking for in (6), but it might be useful. I think filtering like that is not going to be common in freeware.
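A UMDH session looks roughly like this (the PID, program, and file names are placeholders; UMDH needs user-mode stack traces enabled first via gflags, and both tools ship with the Debugging Tools for Windows):

    gflags /i myprog.exe +ust
    umdh -p:1234 -f:snap1.txt
    rem ... let the program run for a while ...
    umdh -p:1234 -f:snap2.txt
    umdh snap1.txt snap2.txt > diff.txt

The final diff lists the allocation stacks that grew between the two snapshots.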

Memory snapshot of a huge C/C++ project (Windows/Unix)

I'm trying to take a snapshot of the memory used by a large application running on both Unix and Windows. My ultimate aim is to have a kind of chart breaking down the memory used by each area of the code.
The program is split into about 30 projects, most of which are either static libraries or DLLs. Some are written in C, some in C++, and others in a mixture of the two. In total, the code across all projects is about 600,000 lines.
For the heap, I could try to overload every malloc/free and new/delete across all the projects and track it that way (a sketch of that idea follows below), but that is quite daunting with an application this size.
Also, that wouldn't pick up all the static global data littered around the projects.
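For reference, the overloading idea above would look roughly like this. It is a hedged sketch, not production code; g_live_bytes and kHeader are made-up names. Because a replacement of the global operators is picked up program-wide at link time, it does not require touching each of the 30 projects:

    #include <atomic>
    #include <cstddef>
    #include <cstdlib>
    #include <new>

    static std::atomic<std::size_t> g_live_bytes(0);
    static const std::size_t kHeader = alignof(std::max_align_t); // keeps returned pointers aligned

    void* operator new(std::size_t n) {
        void* raw = std::malloc(n + kHeader);          // prepend a size header
        if (!raw) throw std::bad_alloc();
        *static_cast<std::size_t*>(raw) = n;
        g_live_bytes += n;
        return static_cast<char*>(raw) + kHeader;
    }

    void operator delete(void* p) noexcept {
        if (!p) return;
        char* raw = static_cast<char*>(p) - kHeader;
        g_live_bytes -= *reinterpret_cast<std::size_t*>(raw);
        std::free(raw);
    }
    // The default operator new[]/delete[] forward to the two functions above,
    // so array allocations are tracked as well.

In practice you would also record the call site per allocation (e.g. a captured backtrace) to get the "which area of code" chart; that is essentially what the tools suggested below automate.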
Thanks for any help.
You could give valgrind a try.
Here is a quote about one of the tools:
Massif
Massif is a heap profiler. It performs detailed heap profiling by taking regular snapshots of a program's heap. It produces a graph showing heap usage over time, including information about which parts of the program are responsible for the most memory allocations. The graph is supplemented by a text or HTML file that includes more information for determining where the most memory is being allocated. Massif runs programs about 20x slower than normal.
It is supported on Linux only at the moment, but if doing the analysis on Linux and applying the results to the Windows version works for you, this might help.
If you are working with ELF binaries, you could check the object files (*.o) before linking with an ELF analyzer and see how big the static data sections are and how much space the .bss (uninitialized static data) section will occupy once loaded.
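For a quick look, binutils' size tool prints exactly that per-section breakdown (myprogram is a placeholder for your binary):

    size myprogram      # text / data / bss totals for the linked binary
    size -A *.o         # per-section sizes of each object file before linking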