I have a native C++ application from which I call a .NET DLL (an external managed function). I see that when I make a call into managed code, the full stack specified for the thread with the /STACK linker option gets allocated (committed), whereas if I make only native function calls, only the stack required for the calculations is allocated.
Below are my observations
With /stack option set to 80MB, and with calls to managed external function.
With /stack option set to 1MB, and with calls to managed external function.
With /stack option set to 80MB, and with calls to native internal function.
When we make calls to the .NET external function, there are also some extra threads which are related to the GC. The threads in our application also use considerably more stack space compared to the case where we do not call the .NET external function. I am not sure whether the managed stack is placed on top of the native stack. Can someone help me understand why the full stack for a thread is allocated when we make calls to a .NET external function, and how memory management works in a mixed-mode application?
OK I finally found the answer.
The CLR always commits the whole stack for managed threads, either as soon as a managed thread is created or lazily when a native thread becomes a managed thread. This is done to ensure that stack overflow can be dealt with predictably by the execution engine.
In managed code, the System.Threading.Thread class's constructor provides two overloads that accept a maxStackSize parameter. Because the full stack is committed at creation time for all managed threads, the maxStackSize parameter represents both the reserve and the commit size: they are effectively the same.
Just to clarify, there are three steps to using memory on the stack:
Reserve virtual address space for the stack in the process
Commit the pages
Use the pages
The default behaviour in normal Win32 programs is to do only step 1 when a thread starts up. The problem with that is that dynamic stack growth can fail under high load: you might find that, when you want to add a stack frame, the system cannot commit the memory for you.
By doing step 2, the CLR ensures that memory will be kept in reserve so that step 3 will never fail.
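For illustration, the same three steps can be reproduced directly with VirtualAlloc; a minimal sketch, with arbitrary example sizes:

#include <windows.h>
#include <cstring>

int main() {
    // Step 1: reserve 80 MB of address space; nothing is charged against
    // physical memory or the pagefile yet.
    void* base = VirtualAlloc(nullptr, 80 * 1024 * 1024, MEM_RESERVE, PAGE_READWRITE);

    // Step 2: commit the first 1 MB; these pages now count towards the commit charge.
    VirtualAlloc(base, 1 * 1024 * 1024, MEM_COMMIT, PAGE_READWRITE);

    // Step 3: use the committed pages.
    std::memset(base, 0, 1 * 1024 * 1024);

    VirtualFree(base, 0, MEM_RELEASE);
    return 0;
}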
Any useful information regarding this is still appreciated.
Thank You.
Related
So let's say I have created a thread on Windows for which I know that the default stack size is much more than it needs. Can I then run the application and ask the thread how much stack it has actually used, so that I know what stack size to set instead of the default?
Windows usually doesn't commit the entire stack, it only reserves it. (Well, unless you ask it to, e.g. by specifying a non-zero stack size argument for CreateThread without also passing the STACK_SIZE_PARAM_IS_A_RESERVATION flag).
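To illustrate the difference, a minimal sketch of the two CreateThread variants (Worker is a placeholder thread routine):

#include <windows.h>

DWORD WINAPI Worker(LPVOID) { return 0; }  // placeholder thread routine

int main() {
    // Reserve 8 MB for the stack but let Windows commit it lazily (the usual behaviour):
    HANDLE h1 = CreateThread(nullptr, 8 * 1024 * 1024, Worker, nullptr,
                             STACK_SIZE_PARAM_IS_A_RESERVATION, nullptr);

    // Without the flag, a non-zero stack size is treated as the commit size,
    // so this much stack is committed up front:
    HANDLE h2 = CreateThread(nullptr, 8 * 1024 * 1024, Worker, nullptr, 0, nullptr);

    WaitForSingleObject(h1, INFINITE);
    WaitForSingleObject(h2, INFINITE);
    CloseHandle(h1);
    CloseHandle(h2);
    return 0;
}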
You can use this to figure out how much stack your thread has ever needed while running, including any CRT, WinAPI or third-party library calls.
To do that, simply read the StackBase and StackLimit values from the TEB (see the answers to this question for how to do that). The difference between the two values is the amount of stack memory that has been committed, i.e. the amount of stack the thread has actually used.
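A minimal sketch of reading those two fields for the current thread (this relies on the TEB beginning with an NT_TIB structure, which is what NtCurrentTeb() effectively points at):

#include <windows.h>
#include <cstdio>

int main() {
    // StackBase/StackLimit in the TIB describe the committed stack region
    // of the current thread.
    NT_TIB* tib = reinterpret_cast<NT_TIB*>(NtCurrentTeb());
    char* base  = static_cast<char*>(tib->StackBase);   // highest stack address
    char* limit = static_cast<char*>(tib->StackLimit);  // lowest committed address

    std::printf("Committed (i.e. used) stack so far: %zu bytes\n",
                static_cast<size_t>(base - limit));
    return 0;
}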
Alternatively, if a manual process is sufficient: Simply start the application in WinDBG, set a breakpoint before the thread exits and then dump the TEB with the !teb command. You can also dump the entire memory map with !address and look for committed areas with usage Stack.
I'm using CentOS 7 and I'm running a C++ application. Recently I switched to a newer version of a library which the application uses for various MySQL C API functions. After integrating the new library, I saw a tremendous increase in the program's memory usage: the application crashes if left running for more than a day or two. Precisely, what happens is that the application's memory usage keeps increasing until the application alone is using 74.9% of the system's total memory, at which point it is forcefully shut down by the system.
Is there any way to track the memory usage of the whole application, including the static variables as well? I've already tried valgrind's Massif tool.
Can anyone tell me what the possible reasons for the increased memory usage could be, or suggest any tools that can give me a deep insight into how the memory is being allocated (both static and dynamic)? Is there any tool which can tell us about memory allocation for a C++ application running in a Linux environment?
Thanks in advance!
Static memory is allocated when the program starts. Are you seeing gradual memory growth, or a one-time increase at startup?
Since it takes 'a day or two to crash', the trouble is likely a memory leak or unbounded growth of a data structure. Valgrind should be able to help with both. If valgrind shows a big leak with the --leak-check=full option, then you have likely found the issue.
To check for unbounded growth, put a preemptive _exit() in the program at a point where you suspect the heap has grown. For example, put a timer on the main loop and have the program _exit after 10 minutes. If valgrind then shows a large 'in use at exit' figure, you likely have unbounded growth of a data structure rather than a leak. Massif can help track this down, and ms_print gives details of the allocations along with their call stacks.
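A minimal sketch of the timed _exit idea, assuming the program has a single main loop that can be instrumented:

#include <chrono>
#include <unistd.h>   // _exit

int main() {
    const auto start = std::chrono::steady_clock::now();

    while (true) {
        // ... one iteration of the real work loop goes here ...

        // After ~10 minutes, exit without running destructors or atexit handlers,
        // so everything still allocated shows up in valgrind's "in use at exit"
        // summary (and in the final massif snapshot).
        if (std::chrono::steady_clock::now() - start > std::chrono::minutes(10))
            _exit(0);
    }
}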
If you find an issue, try switching back to the older version of your library. If the problem goes away, check and make sure you are using the API properly in the new version. If you don't have the source code then you are a bit stuck in terms of a fix.
If you want to go the extra mile, you can write a shared library interposer for malloc/free to see what is happening. Here is a good start. Linux has backtrace functionality that can help with determining the exact call stack.
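A minimal sketch of such an interposer (the file name and build line are only examples; build with something like g++ -shared -fPIC -o mallocspy.so mallocspy.cpp -ldl and run with LD_PRELOAD=./mallocspy.so ./your_app). A production version would also cover calloc/realloc and guard against re-entrancy:

// mallocspy.cpp - log every malloc/free made by the process.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            // for RTLD_NEXT
#endif
#include <dlfcn.h>
#include <unistd.h>
#include <cstddef>
#include <cstdio>

static void* (*real_malloc)(std::size_t) = nullptr;
static void  (*real_free)(void*)         = nullptr;

static void resolve_real_functions() {
    real_malloc = reinterpret_cast<void* (*)(std::size_t)>(dlsym(RTLD_NEXT, "malloc"));
    real_free   = reinterpret_cast<void  (*)(void*)>(dlsym(RTLD_NEXT, "free"));
}

extern "C" void* malloc(std::size_t size) {
    if (!real_malloc) resolve_real_functions();
    void* p = real_malloc(size);
    // Log with snprintf + write so the logging path never calls back into malloc;
    // backtrace() could be added here to record the call stack.
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "malloc(%zu) = %p\n", size, p);
    if (n > 0) (void)write(STDERR_FILENO, buf, static_cast<std::size_t>(n));
    return p;
}

extern "C" void free(void* p) {
    if (!real_free) resolve_real_functions();
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "free(%p)\n", p);
    if (n > 0) (void)write(STDERR_FILENO, buf, static_cast<std::size_t>(n));
    real_free(p);
}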
Finally, if you must use the third-party library and find the heap growing without bound or leaking, you can use the shared library interposer to call free/delete directly. This is a risky, last-ditch strategy that I don't recommend, but I have used it in production to limp a process along.
I develop a C++ framework that is used to run user code in a well-defined environment (Linux boxes under our supervision).
I would like to prevent badly written modules from eating up all the memory of a machine. Since I develop the framework, could I simply force the program to stop itself if its memory consumption is too high? What API or tool should I use for this?
A simple mechanism for controlling a process's resource limits is provided by setrlimit. But that doesn't isolate the process (as you should for untrusted third-party code); it only puts some restrictions on it. To properly isolate a process from the rest of the system, you should either use a VM, or make use of cgroups and kernel namespaces, preferably not by hand but via some existing library or framework (such as Docker).
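A minimal setrlimit sketch, capping the process's virtual address space at an illustrative 512 MB; once the limit is hit, further allocations fail, so new throws std::bad_alloc and malloc returns NULL:

#include <sys/resource.h>
#include <cstdio>

int main() {
    rlimit rl;
    rl.rlim_cur = 512UL * 1024 * 1024;   // soft limit: 512 MB of address space
    rl.rlim_max = 512UL * 1024 * 1024;   // hard limit

    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        std::perror("setrlimit");
        return 1;
    }

    // ... load and run the modules; allocations beyond the limit will now fail ...
    return 0;
}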
How do I make my program stop if its memory consumption exceeds a limit?
When you define the interface between the application and its modules, ensure that one of the first steps (probably the first) is to pass an allocator-like class instance from the application to the modules.
This instance should be used in the module to allocate and deallocate all necessary memory.
This allows implementations of this allocator to report memory allocations to the main application, which can then throw an exception if a limit (per module or per application) is reached.
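A minimal sketch of such an allocator; the class name and interface are made up, and a real implementation would likely expose it to the modules through an abstract interface:

#include <atomic>
#include <cstddef>
#include <new>

// Hypothetical per-module allocator handed out by the framework.
class ModuleAllocator {
public:
    explicit ModuleAllocator(std::size_t limit) : limit_(limit), used_(0) {}

    void* allocate(std::size_t bytes) {
        // Account first; roll back and fail if the module would exceed its quota.
        if (used_.fetch_add(bytes) + bytes > limit_) {
            used_.fetch_sub(bytes);
            throw std::bad_alloc();          // or a custom "module over quota" exception
        }
        return ::operator new(bytes);
    }

    void deallocate(void* p, std::size_t bytes) noexcept {
        ::operator delete(p);
        used_.fetch_sub(bytes);
    }

    std::size_t used() const { return used_.load(); }

private:
    const std::size_t limit_;
    std::atomic<std::size_t> used_;
};

The framework can hand each module its own instance (per-module limit) or share one instance across modules (application-wide limit).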
You can directly provide your own operator new. However, that won't protect you from calls to malloc or from direct OS calls; covering those would require patching or wrapping glibc (since you're on Linux). Doable, but not nice.
What's your desired security level? Are you protecting against Murphy or Machiavelli? Might a plugin use a third-party library which allocates memory on behalf of the plugin? Do you need to keep track of which plugin allocated the memory?
I'm writing a 32-bit .NET program with a two-stage input process:
It uses native C++ via C++/CLI to parse an indefinite number of files into corresponding SQLite databases (all with the same schema). The allocations made by C++ 'new' will typically consume up to 1 GB of the virtual address space (out of the 2 GB available; I'm aware of the 3 GB extension, but that would just delay the issue).
It uses complex SQL queries (run from C#) to merge the databases into a single database. I set the cache_size to 1GB for the merged database so that the merging part has minimal page faults.
My problem is that the SQLite cache in stage 2 does not re-use the 1 GB of memory that was allocated by 'new' and properly released by 'delete' in stage 1. I know there's no leak, because immediately after leaving stage 1 'private bytes' drops to a low value, as I'd expect. 'Virtual size', however, remains at about the peak of what the C++ code used.
This lack of sharing between the C++ heap and the SQLite cache causes me to run out of virtual address space. How can I resolve this, preferably in a fairly standards-compliant way? I really would like to release the memory allocated by C++ back to the OS.
This is not something you can control effectively from the C++ level of abstraction; in other words, you cannot know for sure whether memory that your program released to the C++ runtime is going to be released to the OS or not. Using special allocation policies and non-standard extensions to try to handle the issue probably won't work anyway, because you cannot control how the external libraries you use deal with memory (e.g. whether they have cached data).
A possible solution would be to move the C++ part to an external process that terminates once the SQLite databases have been created. Having an external process introduces some annoyance (e.g. it's a bit harder to keep 'live' control of what happens), but it also opens up more possibilities, such as parallel processing even if the libraries don't support multithreading, or using multiple machines over a network.
Since you're interoperating with C++/CLI, you're presumably using Microsoft's compiler.
If that's the case, then you probably want to look up _heapmin. After you exit from your "stage 1", call it, and it will release blocks of memory held by the C++ heap manager back to the OS, provided the complete block that was allocated from the OS is now free.
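A sketch of where such a call would go; _heapmin is declared in the Microsoft CRT's <malloc.h> and returns 0 on success, -1 on failure:

#include <malloc.h>   // Microsoft CRT declaration of _heapmin

void after_stage_one()
{
    // Stage 1 objects have already been released with 'delete' at this point;
    // ask the CRT heap to return its fully-free blocks to the OS.
    if (_heapmin() != 0) {
        // handle the (unlikely) failure, e.g. log it
    }
}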
On Linux, we used google malloc (http://code.google.com/p/google-perftools/). It has a function to release the free memory to the OS: MallocExtension::instance()->ReleaseFreeMemory().
In theory, tcmalloc (the allocator in google-perftools) works on Windows too, but I never personally used it there.
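For reference, a sketch of that call; the header path shown is the one used by recent gperftools releases and may differ in older versions:

#include <gperftools/malloc_extension.h>

// Ask tcmalloc to hand as much of its free memory as possible back to the OS.
MallocExtension::instance()->ReleaseFreeMemory();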
You could allocate the memory from the GC in C#, pin it, use it, and then let it go, thus freeing it and letting the GC compact and re-use the memory.
An application I am working with is exhibiting the following behaviour:
During a particular high-memory operation, the memory usage of the process under Task Manager (the Mem Usage statistic) reaches a peak of approximately 2.5 GB. (Note: a registry key has been set to allow this, as usually there is a maximum of 2 GB for a process under 32-bit Windows.)
After the operation is complete, the process size slowly starts decreasing at a rate of 1MB per second.
I am trying to figure out the easiest way to quickly determine who is freeing this memory and where it is being freed.
I am having trouble attaching a memory profiler to my code, and I don't particularly want to override the new/delete operators to track the allocations/deallocations (in other words, I want to do this without recompiling my code).
Can anyone offer any useful suggestions of how I could do this via the Visual Studio debugger?
Update
I should also mention that it's a multi-threaded application, so pausing the application and analysing the call stack through the debugger is not the most desirable option. I considered freezing different threads one at a time to see if the memory stops reducing, but I'm fairly certain this will cause the application to crash.
Ahh! You're looking at the wrong counter!
Mem Usage doesn't tell you that memory is being freed. Only that the working set is being purged! This could mean some other application needs memory, or the VMM decided to mark some of your process's pages as Stand By for some other process to quickly use. It does not mean that VirtualFree, HeapFree or any other free function is being called.
Look at the commit size (VM Size, Private Bytes, etc).
But if you still want to know when memory is being decommitted or freed or what-have-you, then break on some free calls. E.g. (for Visual C++)
{,,kernel32.dll}HeapFree
or
{,,msvcr80.dll}free
etc.
Or just a regular function breakpoint on the above. Just make sure it resolves the address.
cdb/WinDbg let you do it via
bp kernel32!HeapFree
bp msvcrt!free
etc.
Names may vary depending on which CRT version you use and how you link against it (via /MT or /MD and its variants)
You might find this article useful:
http://www.gamasutra.com/view/feature/1430/monitoring_your_pcs_memory_usage_.php?print=1
Basically, what I had in mind was hooking the low-level allocation functions.
A couple of different ideas:
The C runtime has a set of memory debugging functions; you'd need to recompile though. You could get a snapshot at computation completion and later, and use _CrtMemDifference to see what changed (see the sketch after these two ideas).
Or, you can attach to the process in your debugger, and cause it to dump a core before and after the memory freeing. Using NTSD, you can see what heaps are around, and the sizes of things. (You'll need a lot of disk space, and a fair amount of patience.) There's a setting (I think you get it through gflags, but I don't remember) that causes it to save a piece of the call stack as part of the dump; using that you can figure out what kind of object is being deallocated. Unfortunately, it only stores 4 or 5 stack frames, so you'll likely have to do something more clever as the next step to figure out where it's being freed. Either look at the code ("oh yeah, there's only one place where that can happen") or put in breakpoints on those destructors, or add tracing to the allocations and deallocations.
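For the first idea, a minimal sketch of the CRT memory-state calls; they are only active in a _DEBUG build, hence the recompile:

#include <crtdbg.h>

void around_the_suspect_phase()
{
    _CrtMemState before, after, diff;

    _CrtMemCheckpoint(&before);        // snapshot taken when the computation completes

    // ... the phase during which the memory is (supposedly) being freed ...

    _CrtMemCheckpoint(&after);         // snapshot taken once the process size has dropped
    if (_CrtMemDifference(&diff, &before, &after))
        _CrtMemDumpStatistics(&diff);  // dump what changed between the two snapshots
}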
If your memory manager wipes freed data to a known value (usually something like 0xfeeefeee), you can set a data breakpoint on a particular instance of something you're interested in. When it gets freed, the breakpoint will trigger as the memory is wiped.
I recommend checking out the UMDH tool that comes with Debugging Tools for Windows (you can find usage notes and samples in the debugging tools help). You can snapshot a running process's heap allocations with stack traces and compare the snapshots.
You could try Memory Validator to monitor the allocations and deallocations. Memory Validator has a couple of features that will help you identify where data is being deallocated:
Hotspots view. This can show you a tree of all allocations and deallocations or just all allocations or just all deallocations. It presents the data as a percentage of memory activity (based on amount of memory (de)allocated at a given location).
Analysis view. You can perform queries asking for data in a given address range. You can restrict these queries to any of alloc, realloc, dealloc behaviours.
Objects view. You can view allocations by type and see the maximum number of objects of each type (plus lots of other stats). Right-click on a type to get a context menu and choose 'Show all deallocations'; this will show the deallocation locations for that type on the Analysis tab.
I think the Hotspots view may give you the insight you need.