I have a program which parses an XML file of ~50MB and extracts the data to an internal object structure with no links to the original XML file. When I try to roughly estimate how much memory I need, I reckon 40MB.
But my program needs something like 350MB, and I'm trying to find out why. I use boost::shared_ptr, so I'm not dealing with raw pointers, and hopefully I haven't produced any memory leaks.
I'll describe what I did, and I hope someone can point out problems in my process, wrong assumptions, and so on.
First, how did I measure? I used htop to find out that my memory is nearly full and that processes running my piece of code are using most of it. To sum up the memory of the different threads and get a tidier output, I used http://www.pixelbeat.org/scripts/ps_mem.py, which confirmed my observation.
I roughly estimated the theoretical consumption to get an idea of the factor between actual consumption and what it should be at minimum: it's about 10. So I used valgrind --tool=massif to analyze memory consumption. It shows that, at the peak of 350MB, 250MB is used by something called xml_allocator, which stems from the pugixml library. I went to the section of my code where I instantiate the pugi::xml_document and put an std::cout into the destructor of the object to confirm that it is released, which happens fairly early in my program (at the end I sleep for 20s to have enough time to measure memory consumption, which stays at 350MB even after the console output from the destructor appears).
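For reference, a typical massif run looks like this (ms_print is the viewer that ships with valgrind; massif writes its output to massif.out.<pid>, and my_program stands in for the real binary):

valgrind --tool=massif ./my_program
ms_print massif.out.<pid>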
Now I have no idea how to interpret that, and I hope someone can point out where my assumptions go wrong.
The outermost code snippet using pugixml is similar to:
void parse( std::string filename, my_data_structure& struc )
{
    pugi::xml_document doc;
    pugi::xml_parse_result result = doc.load_file(filename.c_str());

    for (pugi::xml_node node = doc.child("foo").child("bar"); node; node = node.next_sibling("bar"))
    {
        struc.hams.push_back( node.attribute("ham").value() );
    }
}
And since in my code I don't store pugixml elements anywhere (only actual values pulled out of them), I would expect doc to release all its resources when the function parse is left, but looking at the graph, I cannot tell where (on the time axis) this happens.
Your assumptions are incorrect.
Here's how to estimate pugixml memory consumption:
When you load the document, the entire text of the document gets loaded into memory, so that's 50 MB for your file. This comes as one allocation from xml_document::load_file -> load_file_impl.
In addition to that, there's the DOM structure that contains links to other nodes, etc. The size of a node is 32 bytes and the size of an attribute is 20 bytes; that's for 32-bit processes, multiply by 2 for 64-bit processes. This comes as many allocations (each roughly 32 KB) from xml_allocator.
Depending on the density of nodes/attributes in your document, memory consumption can range from, say, 110% of the document size (i.e. 50 MB -> 55 MB) to, say, 600% (i.e. 50 MB -> 300 MB).
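To turn those figures into a concrete number for your own document, you can count nodes and attributes after loading and apply the per-object sizes above. A minimal sketch using pugixml's xml_tree_walker (data.xml is a stand-in file name; the constants are the 32-bit figures quoted above):

#include <pugixml.hpp>
#include <cstddef>
#include <iostream>

// Count nodes and attributes so the sizes above can be applied.
struct dom_counter : pugi::xml_tree_walker
{
    size_t nodes, attrs;
    dom_counter() : nodes(0), attrs(0) {}

    virtual bool for_each(pugi::xml_node& node)
    {
        ++nodes;
        for (pugi::xml_attribute a = node.first_attribute(); a; a = a.next_attribute())
            ++attrs;
        return true; // keep walking
    }
};

int main()
{
    pugi::xml_document doc;
    if (!doc.load_file("data.xml")) return 1; // stand-in file name

    dom_counter c;
    doc.traverse(c);

    // 32 bytes per node, 20 per attribute (32-bit); roughly double for 64-bit.
    size_t dom_bytes = c.nodes * 32 + c.attrs * 20;
    std::cout << c.nodes << " nodes, " << c.attrs << " attributes, ~"
              << dom_bytes / (1024 * 1024) << " MB of DOM overhead\n";
    return 0;
}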
When you destroy the pugixml document (i.e., the xml_document dtor gets called), the data is freed. However, depending on how the OS heap behaves, you may not see it returned to the system immediately; it may stay in the process heap. To verify that, you can try doing the parsing again and checking that peak memory is the same after the second parse.
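A minimal way to run that check, reusing the parse() function from the question (a sketch; the 20-second sleeps mirror the measurement window described above):

#include <chrono>
#include <iostream>
#include <thread>

int main()
{
    for (int run = 1; run <= 2; ++run)
    {
        my_data_structure struc;  // the structure from the question
        parse("data.xml", struc); // stand-in file name
        std::cout << "parse " << run << " finished - check memory now\n";
        std::this_thread::sleep_for(std::chrono::seconds(20));
    }
    return 0;
}

If the peak after the second run matches the first, the memory is being retained by the heap allocator rather than leaked.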
There are plenty of detailed instructions for measuring C++ memory usage (example links at the bottom).
If I've got a program processing and displaying pixel data, I can use Windows Task Manager to spot memory leakage: if processing/displaying/closing multiple files in succession makes the executable's Memory (Private working set) grow with each iteration, something is leaking. Yes, not perfect, but for processing thousands of frames of data, this works as a quick'n'dirty solution.
To chase down a memory bug in that (large) project, I wrote a program to accurately measure memory usage, using Lanzelot's useful answer, namely the part titled "Total Physical Memory (RAM)". But if I calloc the size of one double, I get 577536. Even if that were quoted in bits, that's a lot.
I tried writing a bog-standard program that pauses, assigns some memory (say, calloc a megabyte's worth of data), and pauses again before freeing said memory. The pauses are long enough to let me comfortably look at WTM. Except the executable only grows by 4K(!) per memory assignment.
What am I missing here? Is QtCreator or the compiler optimising away the assigned memory? Why does the big, complex project seemingly allow memory usage resolution down to ~1MB, while whatever memory size I fruitlessly fiddle with in my simple program barely moves the number shown in Windows Task Manager at all?
Related:
C++: Measuring memory usage from within the program, Windows and Linux
https://stackoverflow.com/a/64166/2903608
--- Edit ---
Example code, as simple as:
#include <cstdlib>   // calloc/free
#include <QDebug>    // qDebug

double *pDouble = (double*) calloc(1, sizeof(double));
*pDouble = 5.0;
qDebug() << "*pDouble: " << *pDouble;
If I look through WTM, this takes 4K (whether for 1 or 1,000,000 doubles). With Lanzelot's solution, it's north of 500k.
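One experiment worth trying (a hedged sketch, based on the assumption that Task Manager's working-set number only grows once pages are actually written to, not merely allocated): allocate a large block, check WTM, then touch every page and check again.

#include <cstdio>
#include <cstdlib>
#include <cstring>

int main()
{
    const size_t bytes = 100u * 1024 * 1024; // 100 MB
    char *p = (char*) calloc(bytes, 1);
    if (!p) return 1;

    std::puts("allocated - check Task Manager, then press Enter");
    std::getchar();

    std::memset(p, 1, bytes); // touch every page so it is committed to the working set

    std::puts("touched - check Task Manager again, then press Enter");
    std::getchar();

    std::free(p);
    return 0;
}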
I am having run-time memory allocation errors with a C++ application. I have eliminated memory leaks, invalid pointer references and out-of-bounds vector assignments as the source of the issue - I am pretty sure it's something to do with memory fragmentation and I am hoping to get help with how to further diagnose and correct the problem.
My code is too large to post (about 15,000 lines - I know that's not huge, but clearly too large to put online), so I am going to describe things with a few relevant snippets of code.
Basically, my program takes a bunch of string and numerical data sets as inputs (class objects with vector variables of type double, string, int and bool), performs a series of calculations, and then spits out the resulting numbers. I have tested and re-tested the calculations and outputs - everything is calculating as it should, and on smaller datasets things run perfectly.
However, when I scale things up, I start getting memory allocation errors, even though I don't think I am close to the memory limits of my system - please see the two graphs below. My program cycles through a series of scenarios (performing identical calculations under a different set of parameters for each scenario). In the first graph, I run 7 scenarios on a dataset of about 200 entries. As the graph shows, each "cycle" results in memory swinging up and back down to its baseline, and the overall memory usage is tiny (see the seven small blips on the right half of the bottom graph). In the second graph, I am running a dataset of about 10,000 entries (see notes on the dataset below). In this case, I only get through 2 full cycles before getting my error (as it is trying to resize a class object for the third scenario). You can see the first two scenarios in the bottom right-half graph; a lot more memory usage than before, but still only a small fraction of available memory. And as with the smaller dataset, usage increases while each scenario runs, and then decreases back to its initial level before the next scenario.
This pattern, along with other tests I have done, leads me to believe it's some sort of fragmentation problem. The error always occurs when I am attempting to resize a vector, although the particular resize operation that causes the error varies with the dataset size. Can anyone help me understand what's going on here and how I might fix it? I can describe things in much greater detail, but I already felt my post was getting long - please ask questions if you need to and I will respond/edit promptly.
Clarification on the data set
The numbers 200 and 10,000 represent the number of unique records I am analyzing. Each record contains somewhere between 75 and 200 elements/variables, many of which are then being manipulated. Further, each variable is manipulated over time and across multiple iterations (both dimensions variable). As a result, an average "record" (of the 200 to 10,000 referenced above) could easily have 200,000 or more values associated with it - a sample calculation:
1 Record * 75 Variables * 150 periods * 20 iterations = 225,000 unique values per record.
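(For scale: at 8 bytes per double, that is roughly 225,000 * 8 bytes ≈ 1.8 MB per record, so holding all 10,000 records in memory at once would be on the order of 18 GB - assuming, of course, that every value were resident simultaneously.)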
Offending Code (in this specific instance):
vector<LoanOverrides> LO;
LO.resize(NumOverrides + 1); // Error is occurring here. I am certain that NumOverrides is a valid numerical entry = 2985
// Sample class definition
class LoanOverrides {
public:
    string IntexDealName;
    string LoanID;
    string UniqueID;
    string PrepayRate;
    string PrepayUnits;
    double DefaultRate;
    string DefaultUnits;
    double SeverityRate;
    string SeverityUnits;
    double DefaultAdvP;
    double DefaultAdvI;
    double RecoveryLag;
    string RateModRate;
    string RateModUnits;
    string BalanceForgivenessRate;
    string BalanceForgivenessRateUnits;
    string ForbearanceRate;
    string ForbearanceRateUnits;
    double ForbearanceRecoveryRate;
    string ForbearanceRecoveryUnits;
    double BalloonExtension;
    double ExtendPctOfPrincipal;
    double CouponStepUp;
};
You have a 64-bit operating system capable of allocating large quantities of memory, but have built your application as a 32-bit application, which can only allocate a maximum of about 3GB of memory. You are trying to allocate more than that.
Try compiling as a 64-bit application. This may enable you to reach your goals. You may have to increase your pagefile size.
See if you can dispose of intermediate results earlier than you are currently doing so.
Try calculating how much memory is being used/would be used by your algorithm, and try reworking your algorithm to use less.
Try avoiding duplicating data by reworking your algorithm. I see you have a lot of reference data which, by the looks of it, isn't going to change during the application run. You could put all of that into a single vector, allocated once, and then refer to entries via integer indexes everywhere else rather than copying them. (Just guessing that you are copying them - see the sketch after this answer.)
Try avoiding loading all the data at once by reworking your algorithm to work on batches.
Without knowing more about your application, it is impossible to offer better advice. But basically you are running out of memory because you are allocating a huge, huge amount of it, and based on your application and the snippets you have posted I think you can probably avoid doing so with a little thought. Good luck.
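To illustrate the "single vector plus integer indexes" suggestion above, here is a minimal sketch (StringPool and the field names are made up for illustration): each distinct string is stored once, and the records keep small indexes instead of full copies.

#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical pool: each distinct string is stored exactly once.
struct StringPool
{
    std::vector<std::string> values;
    std::unordered_map<std::string, int> lookup;

    int intern(const std::string& s)
    {
        std::unordered_map<std::string, int>::const_iterator it = lookup.find(s);
        if (it != lookup.end())
            return it->second;
        values.push_back(s);
        lookup[s] = (int)values.size() - 1;
        return (int)values.size() - 1;
    }

    const std::string& get(int i) const { return values[i]; }
};

// LoanOverrides could then hold e.g. `int PrepayUnitsIdx;` instead of
// `string PrepayUnits;`, shrinking each record and cutting heap churn.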
SUMMARY:
I have an application which consumes way more memory than it should (roughly 250% of the expected amount), but I can't seem to find any memory leaks. Calling the same function (which does a lot of allocations) keeps increasing memory usage up to a point, after which it does not change and stays there.
PROGRAM DETAILS:
The application uses a quadtree data structure to store 'Points'. It is possible to specify the maximum number of points to be stored in memory (cache size). The 'Points' are stored in 'PointBuckets' (arrays of points linked to the leaf nodes of the quadtree) which, if the maximum total number of points in the quadtree is reached, are serialized and saved to temporary files, to be retrieved when needed. This all seems to work fine.
Now, when a file is loaded, a new quadtree is created and the old one is deleted if it exists; then points are read from the file and inserted into the quadtree one by one. A lot of memory allocations take place as buckets are created and deleted during node splitting, etc.
SYMPTOMS:
If I load a file that is expected to use 300MB of memory once, I get the expected amount of memory consumed. All good. If I keep loading the same file over and over again, the memory usage keeps growing (I'm looking at the RES column in top, Linux) until about 700MB. That could indicate a memory leak. However, if I keep loading the file beyond that, memory consumption just stays at 700MB.
Another thing: when I use valgrind massif and look at the memory usage, it always stays within the expected limit. For example, if I specify the cache size to be 1.5GB and run my program alone, it will eventually consume 4GB of memory. If I run it under massif, it will stay below 2GB the whole time, and in the produced graphs I can see that it in fact never allocated more than the expected 1.5GB. My naive assumption is that this happens because massif uses a custom memory pool which somehow prevents fragmentation.
So what do you think is going on here? What kind of solution should I look for to solve this issue, if it is memory fragmentation?
I'd put it down more to simple allocator and OS caching behaviours: they retain memory you allocated instead of freeing it, so that it can be returned to you more promptly the next time you request it. However, 250% does sound like a lot for this kind of effect - you could be looking at fragmentation problems.
Try swapping your allocator for a fragmentation-free allocator such as an object pool or a memory arena.
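For illustration, a minimal bump-pointer arena sketch (a deliberate simplification; real pool libraries such as Boost.Pool handle alignment, reuse and thread safety properly):

#include <cstddef>
#include <vector>

// Objects are carved out of large blocks and all freed together when the
// arena is destroyed, so the heap cannot fragment within it.
class Arena
{
public:
    Arena() : cur_(0), left_(0) {}
    ~Arena()
    {
        for (std::size_t i = 0; i < blocks_.size(); ++i)
            delete[] blocks_[i];
    }

    void* allocate(std::size_t n)
    {
        n = (n + 7) & ~(std::size_t)7; // keep 8-byte alignment
        if (n > left_)
        {
            std::size_t sz = n > kBlock ? n : kBlock;
            blocks_.push_back(new char[sz]); // one big block serves many objects
            cur_ = blocks_.back();
            left_ = sz;
        }
        char* p = cur_;
        cur_ += n;
        left_ -= n;
        return p;
    }

private:
    static const std::size_t kBlock = 1 << 20; // 1 MB chunks
    std::vector<char*> blocks_;
    char* cur_;
    std::size_t left_;
};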
I am new to Linux and C++ and have a question about the memory usage of my application.
My application processes a lot of real-time data, about 500 messages per second.
I use std::map to manage (i.e., insert and erase) all the messages. For example,
std::map<int, data_struct> m_map;

// when we receive a new message, convert the message into a data structure
m_map.insert(std::pair<int, data_struct>(message.id, data));

// when we need to erase a message
iter = m_map.find(id);
if (iter != m_map.end()) {
    m_map.erase(iter);
}
The size of m_map is roughly 2500; i.e., the application receives a lot of new messages at the beginning, then gradually needs to erase messages. After about 10 seconds, it reaches a point where the number of new messages received is about the same as the number of messages that need to be erased.
My question is this: after about 20 minutes, the Linux System Monitor shows my application using about 1GB of memory, and the size seems to double every 20 minutes. Is this normal? Does the application really use that much memory? Am I missing something here?
Thanks.
If your program allocates and deallocates chunks of memory often, you get fragmentation - there is only so much the OS can do to make sure there are no gaps between the chunks of memory you have allocated. But generally, the memory usage resulting from this will plateau.
If your program's memory usage is continually increasing, you have a memory leak: either you are forgetting to delete objects (or to call free() in the case of C-style allocations), or you are accumulating objects in a container and forgetting to remove them.
For finding missing delete calls, use valgrind!
Using valgrind to detect memory leaks is as simple as installing it with your favorite package manager and then running

valgrind --leak-check=full ./my_program
Your program will run and when it's finished, valgrind will dump a terribly detailed report of memory leaks and where they came from, including complete stack traces.
valgrind is awesome.
map::erase() calls your object's destructor, so you should look there for memory leaks.
Maybe, if you can measure how much the used memory increased and know the size of your data structures, you'll get a good hint about where the problem is.

You receive about 600,000 messages every 20 minutes. If memory usage doubles from 1GB to 2GB in that time, you're losing something like 1.8kB per message (1GB / 600,000 messages in 20 min).
Well, memory usage increasing can mean more than just leaking.

Is RSS (resident set size) increasing, or is VSZ (virtual size) increasing?

You can find out by running "ps aux".
What I have experienced is that if RSS stays the same while VSZ keeps increasing, it is quite likely a memory leak.
If RSS keeps growing and your VSZ stabilises at some point, then you are touching a lot of memory every time, and the amount of memory you touch keeps increasing; tracking that down in your code is tough.
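If you'd rather watch the two numbers from inside the program than via ps, here is a small Linux-only sketch (VmRSS and VmSize are standard fields of /proc/self/status):

#include <fstream>
#include <iostream>
#include <string>

// Print this process's current VmSize (VSZ) and VmRSS (RSS).
void print_mem_usage()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line))
    {
        if (line.compare(0, 6, "VmSize") == 0 || line.compare(0, 5, "VmRSS") == 0)
            std::cout << line << '\n';
    }
}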
I created an intentional memory leak to demonstrate a point to people who will shortly be learning pointers.
int main()
{
    while (1)
    {
        int *a = new int [2];
        //delete [] a;
    }
}
If this is run with the delete uncommented, the memory stays low and doesn't rise, as expected. However, if it is run as is, then on a machine with 2GB of RAM the memory usage rapidly rises to about 1.5GB, or whatever is not in use by the system. Once it hits this point, though, the CPU usage (which was previously maxed) falls greatly, and the memory usage does as well, down to about 100MB.
What exactly causes this intervening action (if there's something more specific than "Windows", that'd be great), and why does the program neither take up the CPU it would by looping, nor terminate? It seems to be stuck between the end of the loop and the end of main.
Windows XP, GCC, MinGW.
What's probably happening is that your code allocates all available physical RAM. When it reaches that limit, the system starts to allocate space on the swap file for it. That means it's (nearly) constantly waiting on the disk, so its CPU usage drops to (almost) zero.
The system may easily keep track of the fact that it never actually writes to the memory it allocates, so when it needs to be stored on the swap file, it'll just make a small record basically saying "process X has N bytes of uninitialized storage" instead of actually copying all the data to the hard drive (but I'm not sure of that, and it may well depend on the exact system you're using).
To paraphrase Inigo Montoya, "I don't think that means what you think that means." The Windows task manager doesn't display the memory usage data that you are looking for.
The "Mem Usge" column displays something related to the working set size (or the resident set size) of the process. That is, "Mem Usage" displays a number related to the amount of physical memory currently allocated to your proccess.
The "VM Size" column displays a number wholly unrelated to the virtual memory subsystem (it is actually the size of the private heaps allocated by the process.
Try using a different tool to visualize virtual memory usage. I suggest Process Explorer.
I guess that when the program exhausts the available physical memory, it starts to use on-disk (virtual) memory, and it becomes so slow that it seems inactive. Try adding some progress visualization:
int counter = 0;
while (1)
{
    int *a = new int [2];
    ++counter;
    if (counter % 1000000 == 0)
        std::cout << counter << '\n';
}
The default Memory column in the task manager of XP is the size of the working set of the process (the amount of physical memory allocated to that process), not the actual memory usage.
http://msdn.microsoft.com/en-us/library/windows/desktop/ms684891%28v=vs.85%29.aspx
http://blogs.msdn.com/b/salvapatuel/archive/2007/10/13/memory-working-set-explored.aspx
The "Mem Usage" column of the task manager is probably the "working set" as explained by a few answers in this question, although to be honest I still get confused how the task manager refers to memory as it changes from version to version. This value goes up/down as you are obviously not actually using much memory at any given time. If you look at the "VM Size" you should see it constantly increase until something bad happens.
You can also give Process Explorer a try, which I find easier to understand in how it displays things.
Several things: first, if you're only allocating 2 ints at a time, it could take hours before you notice that the total memory usage is going up because of it. And second, on a lot of systems, allocation doesn't commit until you actually access the memory; the address space may be reserved, but you don't really have the memory (and the program will crash if you try to access the memory and there isn't any available).

If you want to simulate a leak, I'd recommend allocating at least a page at a time, if not considerably more, and writing at least one byte in each allocated page.
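A minimal sketch along those lines (4096 is a typical x86 page size, assumed here rather than queried from the OS):

#include <cstddef>

int main()
{
    const std::size_t page = 4096; // typical page size (assumption)
    while (true)
    {
        char* p = new char[page]; // deliberately leaked, one page per iteration
        p[0] = 1;                 // touch the page so it is actually committed
    }
}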