I would like to get some information about the memory usage of my C++ program. The way I do this is by accessing /proc/self/stat and printing the virtual and resident set size.
You can find an example here.
Is this a good way to go? How accurate is the information I am accessing*?
Could someone recommend a better way to measure memory usage programmatically?
*Asking, because I get unexpected, sudden jumps of mem usage. My expectation was that the information is perfectly accurate.
OS: I am running inside a docker container, which is based on RHEL.
Additional info: If I limit the memory usage of the container with docker run -m, the printed memory is greater than the limit I set.
How to programatically get the memory usage of the current program?
There is no standard way to get memory usage of a program in C++.
The concept of "memory usage" itself is somewhat vague and can mean different things. Depending on what you mean, there may or might not be a system specific way to get the information.
The way I do this is by accessing /proc/self/stat
Is this a good way to go?
I don't think so. As far as I know, /proc filesystem is not portable. Use the getrusage function on POSIX systems.
Related
This question already has answers here:
How to determine CPU and memory consumption from inside a process
(10 answers)
Closed 6 years ago.
See, I wanted to measure memory usage of my C++ program. From inside the program, without profilers or process viewers, etc.
Why from inside the program?
Measurements will be done thousands of times—must be automated; therefore, having an eye on Task Manager, top, whatever, will not do
Measurements are to be done during production runs—performance degradation, which may be caused by profilers, is not acceptable since the run times are non-negligible already (several hours for large problem instances)
note. Why measure at all? The only reason to measure used memory (as reported by the OS) as opposed to calculating “expected” usage in advance is the fact that I can not directly, analytically “sizeof” how much does my principal data structure use. The structure itself is
unordered_map<bitset, map<uint16_t, int64_t> >
these are packed into a vector for all I care (a list would actually suffice as well, I only ever need to access the “neighbouring” structures; without details on memory usage, I can hardly decide which to choose)
vector< unordered_map<bitset, map<uint16_t, int64_t> > >
so if anybody knows how to “sizeof” the memory occupied by such a structure, that would also solve the issue (though I'd probably have to fork the question or something).
Environment: It may be assumed that the program runs all alone on the given machine (along with the OS, etc. of course; either a PC or a supercomputer's node); it is certain to be the only one program requiring large (say > 512 MiB) amounts of memory—computational experiment environment. The program is either run on my home PC (16GiB RAM; Windows 7 or Linux Mint 18.1) or the institution supercomputer's node (circa 100GiB RAM, CentOS 7), and the program may want to consume all that RAM. Note that the supercomputer effectively prohibits disk swapping of user processes, and my home PC has a smallish page file.
Memory usage pattern. The program can be said to sequentially fill a sort of table, each row wherein is the vector<...> as specified above. Say the prime data structure is called supp. Then, for each integer k, to fill supp[k], the data from supp[k-1] is required. As supp[k] is filled it is used to initialize supp[k+1]. Thus, at each time, this, prev, and next “table rows” must be readily accessible. After the table is filled, the program does a relatively quick (compared with “initializing” and filling the table), non-exhaustive search in the table, through which a solution is obtained. Note that the memory is only allocated through the STL containers, I never explicitly new() or malloc() myself.
Questions. Wishful thinking.
What is the appropriate way to measure total memory usage (including swapped to disk) of a process from inside its source code (one for Windows, one for Linux)?
Should probably be another question, or rather a good googling session, but still---what is the proper (or just easy) way to explicitly control (say encourage or discourage) swapping to disk? A pointer to an authoritative book on the subject would be very welcome. Again, forgive my ignorance, I'd like a means to say something on the lines of “NEVER swap supp” or
“swap supp[10]”; then, when I need it, “unswap supp[10]”—all from the program's code. I thought I'd have to resolve to serialize the data structures and explicitly store them as a binary file, then reverse the transformation.
On Linux, it appeared the easiest to just catch the heap pointers through sbrk(0), cast them as 64-bit unsigned integers, and compute the difference after the memory gets allocated, and this approach produced plausible results (did not do more rigorous tests yet).
edit 5. Removed reference to HeapAlloc wrangling—irrelevant.
edit 4. Windows solution
This bit of code reports the working set that matches the one in Task Manager; that's about all I wanted—tested on Windows 10 x64 (tested by allocations like new uint8_t[1024*1024], or rather, new uint8_t[1ULL << howMuch], not in my “production” yet ).
On Linux, I'd try getrusage or something to get the equivalent.
The principal element is GetProcessMemoryInfo, as suggested by #IInspectable and #conio
#include<Windows.h>
#include<Psapi.h>
//get the handle to this process
auto myHandle = GetCurrentProcess();
//to fill in the process' memory usage details
PROCESS_MEMORY_COUNTERS pmc;
//return the usage (bytes), if I may
if (GetProcessMemoryInfo(myHandle, &pmc, sizeof(pmc)))
return(pmc.WorkingSetSize);
else
return 0;
edit 5. Removed reference to GetProcessWorkingSetSize as irrelevant. Thanks #conio.
On Windows, the GlobalMemoryStatusEx function gives you useful information both about your process and the whole system.
Based on this table you might want to look at MEMORYSTATUSEX.ullAvailPhys to answer "Am I getting close to hitting swapping overhead?" and changes in (MEMORYSTATUSEX.ullTotalVirtual – MEMORYSTATUSEX.ullAvailVirtual) to answer "How much RAM is my process allocating?"
To know how much physical memory your process takes you need to query the process working set or, more likely, the private working set. The working set is (more or less) the amount of physical pages in RAM your process uses. Private working set excludes shared memory.
See
What is private bytes, virtual bytes, working set?
How to interpret Windows Task Manager?
https://blogs.msdn.microsoft.com/tims/2010/10/29/pdc10-mysteries-of-windows-memory-management-revealed-part-two/
for terminology and a little bit more details.
There are performance counters for both metrics.
(You can also use QueryWorkingSet(Ex) and calculate that on your own, but that's just nasty in my opinion. You can get the (non-private) working set with GetProcessMemoryInfo.)
But the more interesting question is whether or not this helps your program to make useful decisions. If nobody's asking for memory or using it, the mere fact that you're using most of the physical memory is of no interest. Or are you worried about your program alone using too much memory?
You haven't said anything about the algorithms it employs or its memory usage patterns. If it uses lots of memory, but does this mostly sequentially, and comes back to old memory relatively rarely it might not be a problem. Windows writes "old" pages to disk eagerly, before paging out resident pages is completely necessary to supply demand for physical memory. If everything goes well, reusing these already written to disk pages for something else is really cheap.
If your real concern is memory thrashing ("virtual memory will be of no use due to swapping overhead"), then this is what you should be looking for, rather than trying to infer (or guess...) that from the amount of physical memory used. A more useful metric would be page faults per unit of time. It just so happens that there are performance counters for this too. See, for example Evaluating Memory and Cache Usage.
I suspect this to be a better metric to base your decision on.
my program is running and creating variables, I need to know what's the total of Bytes these variables take.
I don't want to know how much is the physical memory space that the system gives my program to be executed, I know I can open the processes manager and find out.
I neither want to write into my code some sizeof and agregations so I can know the total size of the variable pool (let say the code is too complex to be modify like that).
Finally I'm using Microsoft VC++ 2010 Express, I just want to know if there is a workspace which monitor that kind of information.
Thanks in advance.
Check this out: Memory Performance Information . There are few metrics of a running process you might be interested in, you will primarily want private bytes, and this data is available both programmatically or through tools like Performance Monitor. You can also enumerate heaps of the process with GetProcessHeaps (and even HeapWalk if you need details) and check heap allocation sizes directly.
Valgrind Massif profiler is a great tool (see here ) but only for Unix/Linux I think. In your case, on Windows I think Insure++ or softwareverify are good choices (they are commercial tools).
A free alternative is Google's tcmalloc which provides a heap profiler here
BEA recommends to keep both min and max heap sizes same. They didn't elaborate the reason for the suggestion. Can someone provide details?
I also got another recommendation from an architect of not setting anything for minimum and just set the maximum. Any comments on this? If i dont use it, what would be the default?
What is the best tool to monitor and tune JVM settings. I am using JDK1.6, on BEA weblogic 10g. It is on Linux 32bit JVM.
Is Max heap size set to 2GB any good? Server has lots of RAM. Currently it is set at 1.5GB, and it is at 80% usage when there is 40 concurrent users.
Thanks,
When the JVM needs to increase the size of the heap it will invoke a full garbage collection which may reduce throughput or cause a pause, so I would think that this is recommended by them for performance reasons.
Default value is documented as 2MB so if you don't override it you are likely to get a lot of (probably very quick) full collections after starting up as the heap is frequently resized.
Unless you are trying to keep the memory footprint as small as possible, I would follow BEA's advice.
Impossible to say from the information given if 2GB is appropriate, or if the objects that are using up space in it are still in scope - the old generation will just gradually fill up until it runs out of space when a full collection will be invoked. Where does that 80% figure come from?
Use the following JVM arguments to log GC details to a file called gc.log:
-verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log
Then you can analyse this using something like http://www.tagtraum.com/gcviewer.html
I would like to know if there is an efficient way to measure the actual memory consumption of a particular C data structure.
The goal being to make benchmarks based on how the memory usage changes after specific operations on those data structures.
I do not seek a way to count the number of objects in use; I do want to know exactly how big the memory usage of an object put under stress can get.
Is there a standard way to do that, either in C code, or from outside? (Some equivalent to the time (1) utility would be a start).
Obviously, I could track down every single pointer and do a sum of all sizeofs. If this is the only way, please do tell me. I wonder whether there is a simpler way. Or maybe a library to do it for me.
If you want to monitor the memory usage of the program on a global level you can replace new/delete in C++ or malloc/free in C with your own functions and log the memory usage.
On Unix for memory consumption you can use valgrind with the tool Massif (+ any visualization tool), but I don't know if it is suited for your problem since it will give you a detailled view of all the memory consumption of your program.
Yep, cnicutar, on Linux you have pmap or maybe even pstat.
On MS there are myriad profiling tools for VStudio depending on your contribution to the MS machine (even free ones for cmd line use). Call me a green horn, I don't have issues with memory leaks.
I am aware of Valgrind, but it just detects memory management issues. What I am searching is a tool that gives me an overview, which parts of my program do consume how much memory. A graphical representation with e.g. a tree map (as KCachegrind does for Callgrind) would be cool.
I am working on a Linux machine, so windows tools will not help me very much.
Use massif, which is part of the Valgrind tools. massif-visualizer can help you graph the data or you can just use the ms_print command.
Try out the heap profiler delivered with gperftools, by Google. I've always built it from sources, but it's available as a precompiled package under several Linux distros.
It's as simple to use as linking a dynamic library to your executables and running the program. It collects information about every dynamic memory allocation (as far as I've seen) and save to disk a memory dump every time one of the following happens:
HEAP_PROFILE_ALLOCATION_INTERVAL bytes have been allocated by the program (default: 1Gb)
the high-water memory usage mark increases by HEAP_PROFILE_INUSE_INTERVAL bytes (default: 100Mb)
HEAP_PROFILE_TIME_INTERVAL seconds have elapsed (default: inactive)
You explicitly call HeapProfilerDump() from your code
The last one, in my experience, is the most useful because you can control exactly when to have a snapshot of the heap usage and then compare two different snapshots and see what's wrong.
Eventually, there are several possible output formats, like textual or graphical (in the form of a directed graph):
Using this tool I've been able to spot incorrect memory usages that I couldn't find using Massif.
A "newer" option is HeapTrack. Contrary to massif, it is an instrumented version of malloc/free that stores all the calls and dumps a log.
The GUI is nice (but requires Qt5 IIRC) and the results timings (because you may want to track time as well) are less biased than valgrind (as they are not emulated).
Use callgrind option with valgrind