C++: test if two DLLs share the same heap

It is well known that the freeing of heap memory must be done with the same allocator as the one used to allocate it. This is something to take into account when exchanging heap allocated objects across DLL boundaries.
One solution is to provide a destructor for each object, as in a C API: if a DLL allows creating an object A, it must also provide a function A_free or something similar [1] (a sketch follows at the end of this question).
Another, related solution is to wrap all allocations in shared_ptr, because a shared_ptr stores a link to its deallocator [2].
Another solution is to "inject" a top-level allocator into all loaded DLLs (recursively) [3].
Another solution is simply not to exchange heap-allocated objects at all, and to use some kind of protocol instead [4].
Yet another solution is to be absolutely sure that the DLLs will share the same heap, which should (will?) happen if they share compatible compilation options (compiler, flags, runtime, etc.) [5][6].
This seems quite difficult to guarantee, especially if one would like to use a package manager and not build everything at once.
Is there a way to check at runtime that the heaps are actually the same between multiple DLLs, preferably in a cross-platform way?
For reliability and ease of debugging this seems better than hoping that the application will immediately crash and not corrupt stuff silently.
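To make solutions 1 and 2 concrete, here is a minimal sketch (the names Widget_create, Widget_free, WIDGET_API and BUILDING_WIDGET_DLL are illustrative, not from any real library): the DLL exports matched create/free functions, and the caller can bind the free function into a shared_ptr so the right deallocator always runs.

    #include <memory>

    // widget.h -- hypothetical interface shared by the DLL and its users.
    #ifdef BUILDING_WIDGET_DLL
    #  define WIDGET_API __declspec(dllexport)
    #else
    #  define WIDGET_API __declspec(dllimport)
    #endif

    struct Widget;  // opaque: users never see the layout, so they never delete it
    extern "C" WIDGET_API Widget* Widget_create(void);
    extern "C" WIDGET_API void    Widget_free(Widget* w);

    // Caller side (solution 2): shared_ptr stores the deleter next to the
    // pointer, so the object is always freed by the module that allocated it.
    std::shared_ptr<Widget> make_widget() {
        return std::shared_ptr<Widget>(Widget_create(), &Widget_free);
    }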

I think the biggest problem here is your definition of "heap", which assumes there is a unique definition.
The problem is that Windows has HeapAlloc, while C++ typically uses "heap" to mean the memory allocated by ::operator new. The two can be the same, distinct, a subset of one another, or partially overlapping.
With two DLLs, both might be written in C++ and use ::operator new, but each could have linked in its own unique version. So there might be multiple answers to the observation in the previous paragraph.
Now assume, as an example, that one DLL forwards ::operator new directly to HeapAlloc, while the other counts allocations before calling HeapAlloc. Formally the two can't be mixed, because the count kept by the second allocator would be wrong. But the code is so simple that both operator news are probably inlined, so at the assembly level you just have calls to HeapAlloc.
There's no way you can detect this at runtime, even if you disassembled the code on the fly (!): the inlined counter-increment instruction is indistinguishable from the surrounding code.
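A sketch of the scenario just described (both DLLs hypothetical): two allocators that are formally incompatible, yet nearly identical at the call site once the bookkeeping is inlined away.

    #include <windows.h>
    #include <new>

    // dll_a.cpp -- operator new forwards straight to HeapAlloc.
    void* operator new(std::size_t n) {
        void* p = HeapAlloc(GetProcessHeap(), 0, n);
        if (!p) throw std::bad_alloc();
        return p;
    }

    // dll_b.cpp -- same call, plus a live-allocation counter. After inlining,
    // the increment is one ordinary instruction; nothing at runtime marks it
    // as allocator bookkeeping, so no check can tell the two heaps apart.
    static long g_live_allocations = 0;
    void* operator new(std::size_t n) {
        void* p = HeapAlloc(GetProcessHeap(), 0, n);
        if (!p) throw std::bad_alloc();
        ++g_live_allocations;  // a delete routed through dll_a never undoes this
        return p;
    }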

Related

Allocation numbers in C++ (Windows) and their predictability

I am using _CrtDumpMemoryLeaks to identify memory leaks in our software. We are using a third-party library in a multi-threaded application. This library has memory leaks, so in our tests we want to identify the leaks that are ours and discard those we have no control over.
We use continuous integration, so new functions/algorithms/bug fixes get added all the time.
So the question is: is there a safe way of telling which leaks are ours and which are the third-party library's? We thought about using allocation numbers, but is that safe?
In a big application I worked on, the global new and delete operators were overridden (e.g. see How to properly replace global new & delete operators) and used private heaps (e.g. HeapCreate). Third-party libraries would use the process heap, so the allocations were clearly separated.
Frankly, I don't think you can get far with allocation numbers. Using explicitly separate heaps for the app and the libraries (and maybe even separate per-component heaps within your own app) would be much more manageable. Consider that you can add your own app-specific header to each allocated block and thus enable very fancy memory tracking: capturing the entire allocation call stack for debugging, per-component accounting, and so on. A sketch follows.
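A minimal sketch of that setup, assuming Windows and ignoring the array and aligned overloads: the application's new/delete use a private heap, so anything the third-party library mallocs lands in the CRT/process heap instead.

    #include <windows.h>
    #include <new>

    // Private heap for this application's allocations only.
    static HANDLE g_app_heap = HeapCreate(0, 0, 0);  // growable, serialized

    void* operator new(std::size_t n) {
        // A real tracker could prepend its own header here (size, call
        // stack, component id) before returning the payload to the caller.
        void* p = HeapAlloc(g_app_heap, 0, n);
        if (!p) throw std::bad_alloc();
        return p;
    }

    void operator delete(void* p) noexcept {
        if (p) HeapFree(g_app_heap, 0, p);
    }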
You might be able to do this using Microsoft's heap debugging library without any third-party solutions. Based on what I learned from a previous question here, you should just make sure that all memory allocated in your code goes through a call to _malloc_dbg with the second argument set to _CLIENT_BLOCK. Then you can set a callback function with _CrtSetDumpClient, and that callback will only receive information about the client blocks that were allocated, not the other ones.
You can easily use the preprocessor to convert all calls to malloc and free into their debugging versions (e.g. _malloc_dbg); just look at how it's done in crtdbg.h, which ships with Visual Studio.
The tricky part for me would be figuring out how to override the new and delete operators to call debugging functions like _malloc_dbg. It might be hard to find a solution where only the news and deletes in your own code are affected, and not those in the third-party library.
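A debug-build sketch of that approach (the macro name and the deliberate leak are illustrative): our allocations become _CLIENT_BLOCKs, and the dump callback sees only those.

    // Compile in a debug build (_DEBUG) so the crtdbg machinery is active.
    #include <stdlib.h>
    #include <crtdbg.h>

    // Applied only to our own sources, this routes our allocations through
    // client blocks; the third-party library keeps making normal blocks.
    #define MY_MALLOC(n) _malloc_dbg((n), _CLIENT_BLOCK, __FILE__, __LINE__)

    // Invoked once per dumped _CLIENT_BLOCK; library blocks never reach it.
    static void __cdecl DumpClient(void* userData, size_t size) {
        _RPT2(_CRT_WARN, "our block at %p, %u bytes\n", userData,
              static_cast<unsigned>(size));
    }

    int main() {
        _CrtSetDumpClient(DumpClient);
        void* leak = MY_MALLOC(64);  // deliberately leaked for the demo
        (void)leak;
        _CrtDumpMemoryLeaks();       // reports all leaks; callback sees only ours
    }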
You may want to use the DebugDiag tool provided by Microsoft. For complete information about the tool, see http://www.microsoft.com/en-sg/download/details.aspx?id=40336
DebugDiag can be used to identify various issues. To track down the leaks (ours and the third-party module's):
1. Configure DebugDiag with the rule type "Native (non-.NET) Memory and Handle Leak".
2. Re-run the application for some time and capture dump files. DebugDiag can also be configured to capture a dump file after a specified interval.
3. Open and analyze the captured dump file with DebugDiag under "Performance Analyzers".
Once the analysis is complete, DebugDiag automatically generates a report and gives you the module/DLL information where a leak is likely (with a probability). With that module information, you can concentrate on the particular module via static code analysis; if the module belongs to a third party, you can share the DebugDiag report with them. In addition, if you run or attach your application with the appropriate PDB files, DebugDiag also provides the call stacks from which the leaks most likely originate.
This information has been very useful to me in the past while debugging memory leaks in Windows applications. Hopefully it is useful to you as well.
The answer would REALLY depend on the actual implementation of the third-party library. Does it leak only a consistent number of items, or does that depend on, for example, the number of threads or which of the library's functions are used? When are the allocations made?
Even if the number of leaks is consistent regardless of library usage, I'd be hesitant to rely on the allocation number. By all means, give it a try: if all the allocations are made very early on and don't depend on any of "your" code, it could work, and it is a REALLY simple thing. But try adding, for example, a static std::vector<int>(100) to see whether allocations in static variables shift the allocation numbers... If they do, this method is probably doomed (unless you have very strict rules on static objects).
Using a separate heap (with the new/delete operators replaced) would be the correct solution, and it can probably be expanded to gather other statistics too [like the number of allocations made, to detect parts of the code that allocate excessively; of course, this has to be analysed based on what the code actually does].
The newer Doug Lea mallocs include the mspace abstraction. An mspace is a separate heap. In our application of a couple hundred thousand NCSL, we use a dozen different mspaces for different parts of the code, and custom allocators to make STL containers allocate memory from the right mspace (see the sketch below).
Some of the benefits:
Third-party code does not use mspaces, so its allocations (and leaks) do not mix with ours.
We can look at the memory usage of each mspace to see which piece of code might have memory leaks.
Any memory corruption is contained within one mspace, which limits the amount of code we need to look at while debugging.
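A sketch of that arrangement, assuming dlmalloc built with MSPACES=1 (the extern declarations stand in for its header, and g_parser_space is an illustrative name):

    #include <cstddef>
    #include <new>
    #include <vector>

    // Declarations normally provided by dlmalloc's header.
    extern "C" {
    typedef void* mspace;
    mspace create_mspace(std::size_t capacity, int locked);
    void*  mspace_malloc(mspace msp, std::size_t bytes);
    void   mspace_free(mspace msp, void* mem);
    }

    // One mspace per component of the application.
    static mspace g_parser_space = create_mspace(0 /* default */, 1 /* locked */);

    template <class T>
    struct MspaceAllocator {
        using value_type = T;
        MspaceAllocator() = default;
        template <class U> MspaceAllocator(const MspaceAllocator<U>&) {}
        T* allocate(std::size_t n) {
            void* p = mspace_malloc(g_parser_space, n * sizeof(T));
            if (!p) throw std::bad_alloc();
            return static_cast<T*>(p);
        }
        void deallocate(T* p, std::size_t) { mspace_free(g_parser_space, p); }
    };
    template <class T, class U>
    bool operator==(const MspaceAllocator<T>&, const MspaceAllocator<U>&) { return true; }
    template <class T, class U>
    bool operator!=(const MspaceAllocator<T>&, const MspaceAllocator<U>&) { return false; }

    // Everything this container allocates now comes from g_parser_space.
    using ParserInts = std::vector<int, MspaceAllocator<int>>;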

How can you track memory across DLL boundaries

I want performant run-time memory metrics, so I wrote a memory tracker based on overloading new & delete. It basically lets you walk your allocations in the heap and analyze everything about them: fragmentation, size, time, number, call stack, etc. But it has two fatal flaws: it can't track memory allocated in other DLLs, and crashes ensue when ownership of objects is passed to DLLs or vice versa. And some smaller flaws: if a user calls malloc instead of new, the allocation is untracked, and likewise if a user defines a class-specific new/delete.
How can I eliminate these flaws? I think I must be going about this fundamentally incorrectly by overloading new/delete; is there a better way?
The right way to implement this is to use detours and a separate tool that runs in its own process. The procedure is roughly the following:
1. Allocate memory in the remote process.
2. Place there the code of a small loader that will load your DLL.
3. Call the CreateRemoteThread API to run your loader.
4. From inside the loaded DLL, establish detours (hooks, interceptors) on the alloc/dealloc functions (a sketch follows below).
5. Process the calls and track the activity.
If you implement your tool this way, it does not matter whether the memory allocation routines are called from a DLL or directly from the exe. Plus, you can track activity in any process, not only ones you compiled yourself.
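Inside the injected DLL, the detour itself could look roughly like this, assuming the Microsoft Detours library (the tracking call is a placeholder):

    #include <windows.h>
    #include <detours.h>

    // Pointer to the real HeapAlloc, captured before the detour is installed.
    static decltype(&HeapAlloc) TrueHeapAlloc = HeapAlloc;

    static LPVOID WINAPI HookedHeapAlloc(HANDLE heap, DWORD flags, SIZE_T bytes) {
        LPVOID p = TrueHeapAlloc(heap, flags, bytes);
        // RecordAllocation(heap, p, bytes);  // hypothetical tracker call
        return p;
    }

    BOOL WINAPI DllMain(HINSTANCE, DWORD reason, LPVOID) {
        if (reason == DLL_PROCESS_ATTACH) {
            DetourTransactionBegin();
            DetourUpdateThread(GetCurrentThread());
            DetourAttach(&(PVOID&)TrueHeapAlloc, HookedHeapAlloc);
            DetourTransactionCommit();
        }
        return TRUE;
    }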
MS Windows allows checking the contents of the virtual address space of a remote process. You can summarize the use of the virtual address space collected this way in a histogram like the following:
From this picture you can see how many virtual allocations of each size exist in your target process.
The picture above shows an overview of the virtual address space usage in 32-bit MSVC DevEnv. A blue stripe means a committed piece of memory, a magenta stripe a reserved one; green is the unoccupied part of the address space.
You can see that the lower addresses are quite fragmented, while the middle area is not. The blue lines at high addresses are the various DLLs loaded into the process.
You should find out the common memory management routines that are called by new/delete and malloc/free, and intercept those. It is usually malloc/free in the end, but check to make sure.
On UNIX, I would use LD_PRELOAD with some library that re-implemented those routines. On Windows, you have to hack a little bit, but this link seems to give a good description of the process. It basically suggests that you use Detours from Microsoft Research.
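On the UNIX side, a minimal LD_PRELOAD interposer could look like this sketch (it counts rather than prints, since printing from inside malloc can itself allocate and recurse):

    // trace.cpp -- build: g++ -shared -fPIC -o libtrace.so trace.cpp -ldl
    // run:   LD_PRELOAD=./libtrace.so ./your_program
    #define _GNU_SOURCE 1  // for RTLD_NEXT on glibc
    #include <dlfcn.h>
    #include <atomic>
    #include <cstddef>

    static std::atomic<std::size_t> g_malloc_calls{0};

    extern "C" void* malloc(std::size_t n) {
        // Resolve the real malloc from the next library in the search order.
        static auto real_malloc =
            reinterpret_cast<void* (*)(std::size_t)>(dlsym(RTLD_NEXT, "malloc"));
        ++g_malloc_calls;  // tracking only; no I/O here to avoid re-entry
        return real_malloc(n);
    }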
Passing ownership of objects between modules is fundamentally flawed. It showed up with your custom allocator, but there are plenty of other cases that will fail also:
compiler upgrades, and recompiling only some DLLs
mixing compilers from different vendors
statically linking the runtime library
Just to name a few. Free every object from the same module that allocated it (often by exporting a deletion function, such as IUnknown::Release()).
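A sketch of that rule in C++ terms (illustrative names; the pattern is the same one COM's IUnknown::Release() follows): the module that allocated the object is the only one whose code ever frees it.

    struct IThing {
        virtual void DoWork() = 0;
        virtual void Release() = 0;   // runs 'delete' inside the owning DLL
    protected:
        virtual ~IThing() = default;  // callers can't 'delete' through IThing
    };

    // Inside the DLL:
    class Thing : public IThing {
    public:
        void DoWork() override { /* ... */ }
        void Release() override { delete this; }  // this DLL's heap, always
    };

    extern "C" __declspec(dllexport) IThing* CreateThing() { return new Thing; }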

malloc() vs. HeapAlloc()

What is the difference between malloc() and HeapAlloc()? As far as I understand, malloc allocates memory from the heap, just as HeapAlloc does, right?
So what is the difference?
Actually, malloc() (and the other C runtime heap functions) are module dependent, which means that if you call malloc() in code from one module (i.e. a DLL), then you should call free() in code of the same module, or you could suffer some pretty bad heap corruption (and this has been well documented). Using HeapAlloc() with GetProcessHeap() instead of malloc(), including overloading the new and delete operators to do so, lets you pass dynamically allocated objects between modules without having to worry about memory corruption when memory is allocated in code of one module and freed in code of another after the pointer has been passed across the boundary.
You are right that they both allocate memory from a heap. But there are differences:
malloc() is portable, part of the standard.
HeapAlloc() is not portable, it's a Windows API function.
It's quite possible that, on Windows, malloc would be implemented on top of HeapAlloc. I would expect malloc to be faster than HeapAlloc.
HeapAlloc has more flexibility than malloc. In particular it allows you to specify which heap you wish to allocate from. This caters for multiple heaps per process.
For almost all coding scenarios you would use malloc rather than HeapAlloc. Although since you tagged your question C++, I would expect you to be using new!
With Visual C++, the function malloc() or the operator new eventually calls HeapAlloc(). If you debug the code, you will find that the function _heap_alloc_base() (in the file malloc.c) calls return HeapAlloc(_crtheap, 0, size), where _crtheap is a global heap created with HeapCreate().
The function HeapAlloc() does a good job of minimizing the memory overhead, with a minimum of 8 bytes of overhead per allocation. The largest I have seen is 15 bytes per allocation, for allocations ranging from 1 byte to 100,000 bytes. Larger blocks have larger overhead, but as a share of the total allocated it remains less than 2.5% of the payload.
I cannot comment on performance because I have not benchmarked HeapAlloc() against a custom-made routine, but as far as memory overhead goes, HeapAlloc() is amazingly cheap.
malloc is a function in the C standard library (and also in the C++ standard library).
HeapAlloc is a Windows API function.
The latter lets you specify the heap to allocate from, which I imagine can be useful for avoiding serialization of allocation requests in different threads (note the HEAP_NO_SERIALIZE flag).
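For example (a sketch; safe only as long as a single thread ever touches the heap):

    #include <windows.h>

    int main() {
        // Per-thread heap: with HEAP_NO_SERIALIZE the heap skips its internal
        // lock, so allocations don't contend with other threads' heaps.
        HANDLE heap = HeapCreate(HEAP_NO_SERIALIZE, 0, 0);  // 0,0 = growable
        void* p = HeapAlloc(heap, 0, 256);
        // ... use p from this thread only ...
        HeapFree(heap, 0, p);
        HeapDestroy(heap);
    }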
In systems where multiple DLLs may come and go (via LoadLibrary/FreeLibrary), and where memory may be allocated in one DLL but freed in another (see the previous answer), HeapAlloc and related functions seem to be the least common denominator for successful memory sharing.
Thread safe, and presumably highly optimized by PhDs galore, HeapAlloc appears to work in all kinds of situations where our not-so-shareable code using malloc/free would fail.
We are a C++ embedded shop, so we have overloaded operator new/delete across our system to use HeapAlloc(GetProcessHeap()), which can be stubbed (on target) or native (on Windows) for code portability.
So far, no problems now that we have bypassed malloc/free, which seem indisputably DLL-specific: a new "heap" for each DLL load.
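A sketch of the kind of replacement described above (array forms and the target-side stub elided):

    #include <windows.h>
    #include <new>

    // Every module linking this in allocates from the one process heap,
    // so a pointer new'd in one DLL can safely be deleted in another.
    void* operator new(std::size_t n) {
        void* p = HeapAlloc(GetProcessHeap(), 0, n);
        if (!p) throw std::bad_alloc();
        return p;
    }

    void operator delete(void* p) noexcept {
        if (p) HeapFree(GetProcessHeap(), 0, p);
    }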
Additionally, you can refer to:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366705(v=vs.85).aspx
It states that you can enable certain features of the heap managed by the WinAPI memory allocator, e.g. "HeapEnableTerminationOnCorruption".
As I understand it, this adds some basic heap-overflow protection, which may be considered added value for your application in terms of security
(e.g., as an app owner I would prefer my app to crash rather than execute arbitrary code).
It can also be useful in an early phase of development, to catch memory issues before going to production.
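Enabling it is a one-liner early in startup; this sketch uses the documented HeapSetInformation call:

    #include <windows.h>

    int main() {
        // Applies process-wide (the heap handle must be NULL for this class):
        // any detected corruption terminates the process instead of letting
        // it continue running with a damaged heap.
        HeapSetInformation(nullptr, HeapEnableTerminationOnCorruption,
                           nullptr, 0);
        // ... rest of the application ...
    }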
malloc is a function exported by the C run-time library (CRT), which is compiler-specific. The name of the CRT DLL changes from one Visual Studio version to another.
HeapAlloc is a function exported by kernel32.dll, in the Windows folder.
This is what MS has to say about it: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366533(v=vs.85).aspx
One thing no one has mentioned thus far is: "The malloc function has the disadvantage of being run-time dependent. The new operator has the disadvantage of being compiler dependent and language dependent."
Also, "HeapAlloc can be instructed to raise an exception if memory could not be allocated"
So if you want your program to run with any CRT, or perhaps with no CRT at all, you'd use HeapAlloc. Perhaps the only people who would do such a thing are malware writers. Another use might be a very memory-intensive application with specific allocation/usage patterns, where you'd rather write your own heap allocator than use the CRT's.

How are heaps created in mixed language applications?

We have a front end written in Visual Basic 6.0 that calls several back end DLLs written in mixed C/C++. The problem is that each DLL appears to have its own heap and one of them isn’t big enough. The heap collides with the program stack when we’ve allocated enough memory.
Each DLL is written entirely in C, except for the basic DLL wrapper, which is written in C++. Each DLL has a handful of entry points. Each entry point immediately calls a C routine. We would like to increase the size of the heap in the DLL, but haven’t been able to figure out how to do that. I searched for guidance and found these MSDN articles:
http://msdn.microsoft.com/en-us/library/hh405351(v=VS.85).aspx
These articles are interesting but provide conflicting information. In our problem it appears that each DLL has its own heap. This matches the "Heaps: Pleasures and Pains" article, which says that the C Run-Time (CRT) library creates its own heap on startup. The "Managing Heap Memory" article says that the CRT library allocates out of the default process heap, and the "Memory management options in Win32" article says the behavior depends on the version of the CRT library being used.
We've temporarily solved the problem by allocating memory from a private heap. However, in order to improve the structure of this very large, complex program, we want to switch from C with a thin C++ wrapper to real C++ with classes. We're pretty certain that the new and delete operators won't allocate memory from our private heap, and we're wondering how to control the size of the heap C++ uses to allocate objects in each DLL. The application needs to run on all versions of desktop Windows NT, from 2000 through 7.
The Question
Can anyone point us to definitive and correct documentation that explains how to control the size of the heap C++ uses to allocate objects?
Several people have asserted that stack corruption due to heap allocations overwriting the stack is impossible. Here is what we observed. The VB front end uses four DLLs that it dynamically loads. Each DLL is independent of the others and provides a handful of methods called by the front end. All the DLLs communicate via data structures written to files on disk. These data structures are all statically structured: they contain no pointers, just value types and fixed-size arrays of value types. The problem DLL is invoked by a single call where a file name is passed. It is designed to allocate about 20MB of data structures required to complete its processing. It does a lot of calculation, writes the results to disk, releases the 20MB of data structures, and returns an error code. The front end then unloads the DLL. While debugging the problem under discussion, we set a breakpoint at the beginning of the data-structure allocation code and watched the memory values returned from the calloc calls, comparing them with the current stack pointer. We watched as the allocated blocks approached the stack. After the allocation was complete, the stack began to grow until it overlapped the heap. Eventually the calculations wrote into the heap and corrupted the stack. As the stack unwound, it tried to return to an invalid address and crashed with a segmentation fault.
Each of our DLLs is statically linked to the CRT, so that each DLL has its own CRT heap and heap manager. Microsoft says in http://msdn.microsoft.com/en-us/library/ms235460(v=vs.80).aspx:
Each copy of the CRT library has a separate and distinct state. As such, CRT objects such as file handles, environment variables, and locales are only valid for the copy of the CRT where these objects are allocated or set. When a DLL and its users use different copies of the CRT library, you cannot pass these CRT objects across the DLL boundary and expect them to be picked up correctly on the other side. Also, because each copy of the CRT library has its own heap manager, allocating memory in one CRT library and passing the pointer across a DLL boundary to be freed by a different copy of the CRT library is a potential cause for heap corruption.
We don't pass pointers between DLLs. We aren't experiencing heap corruption, we are experiencing stack corruption.
OK, the question is:
Can anyone point us to definitive and correct documentation that explains how to control the size of the heap C++ uses to allocate objects?
I am going to answer my own question. I got the answer from reading Raymond Chen's blog The Old New Thing, specifically the post "There's also a large object heap for unmanaged code, but it's inside the regular heap". In that post Raymond recommends Advanced Windows Debugging by Mario Hewardt and Daniel Pravat. This book has very specific information on both stack and heap corruption, which is what I wanted to know, and as a plus it provides all sorts of information about how to debug these problems.
Could you please elaborate on this your statement:
The heap collides with the program stack when we’ve allocated enough memory.
If we're talking about Windows (or any other mature platform), this should not be happening: the OS makes sure that stacks, heaps, mapped files and other objects never intersect.
Also:
Can anyone point us to definitive and correct documentation that explains how to control the size of the heap C++ uses to allocate objects?
The heap size is not fixed on Windows: it grows as the application uses more and more memory, until all available virtual memory space for the process is used. It is pretty easy to confirm this: just write a simple test app that keeps allocating memory and counts how much has been allocated. On a default 32-bit Windows you'll reach almost 2 GB. Surely the heap does not initially occupy all available space, therefore it must grow in the process.
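The experiment is a few lines (a sketch; the blocks are deliberately leaked, since hitting the limit is the point):

    #include <cstdio>
    #include <cstdlib>

    int main() {
        std::size_t total_mib = 0;
        // Keep allocating 1 MiB blocks until the heap refuses to grow further.
        while (std::malloc(1024 * 1024) != nullptr) {
            ++total_mib;  // blocks are never freed: we want to exhaust the space
        }
        std::printf("allocated ~%zu MiB before failure\n", total_mib);
        return 0;
    }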
Without many details about the "collision" it's hard to tell what's happening in your case. However, looking at the tags to this question prompts me to one possibility. It is possible (and happens quite often, unfortunately) that ownership of allocated memory areas is being passed between modules (DLLs in your case). Here's the scenario:
there are two DLLs: A and B. Both of them created their own heaps
the DLL A allocates an object in its heap and passes the pointer and ownership to B
the DLL B receives the pointer, uses the memory and deallocates the object
If the heaps are different, most heap managers would not check whether the memory region being deallocated actually belongs to them (mostly for performance reasons). So a manager would deallocate something that doesn't belong to it, and by doing so corrupt the other module's heap. This may (and often does) lead to a crash, but not always: depending on your luck (and the particular heap manager implementation), the operation may change one of the heaps in such a way that the next allocation happens outside of the area where the heap is located.
This often happens when one module is managed code, while the other is native one. Since you have the VB6 tag in the question, I'd check if this is the case.
If the stack grows large enough to hit the heap, a prematurely aborted stack overflow may be the problem: invalid data is passed that does not satisfy the exit condition of some recursion (loop detection missing or broken) in the problem DLL, so that an effectively infinite recursion consumes a ridiculously large amount of stack space. One would expect such a DLL to terminate with a stack-overflow exception, but perhaps because of compiler/linker optimizations or large foreign heap sizes it crashes elsewhere.
Heaps are created by the CRT. That is to say, the malloc heap is created by the CRT, and is unrelated to HeapCreate(). It's not used for large allocations, though, which are handed off to the OS directly.
With multiple DLLs, you might have multiple heaps (newer VC versions are better at sharing, but even VC6 had no problem if you used MSVCRT.DLL - that's shared)
The stack, on the other hand, is managed by the OS. Here you see why multiple heaps don't matter: The OS allocation for the different heaps will never collide with the OS allocation for the stack.
Mind you, the OS may allocate heap space close to the stack. The rule is just no overlap, after all, there's no guaranteed "unused separation zone". If you then have a buffer overflow, it could very well overflow into the stack space.
So, any solutions? Yes: move to VC2010. It has buffer security checks, implemented in quite an efficient way. They're the default even in release mode.

How do I decide whether having more that one VC++ CRT state is a problem for my application?

This MSDN article says that if my application loads VC++ runtime multiple times because either it or some DLLs it depends on are statically linked against VC++ runtime then the application will have multiple CRT states and this can lead to undefined behaviour.
How exactly do I decide whether this is a problem for me? For example, in this MSDN article several examples are provided that basically say that objects maintained by the C++ runtime, such as file handles, should not be passed across DLL boundaries. What exactly is the list of things to check if I want my project to statically link against the VC++ runtime?
It's OK to have multiple copies of the CRT as long as you aren't doing certain things...:
Each copy of the CRT will manage its own heap.
This can cause unexpected problems if you allocate a heap-based object in module A using 'new', then pass it to module B where you attempt to release it using 'delete'. The CRT for module B will attempt to identify the pointer in terms of its own heap, and that's where the undefined behavior comes in: if you're lucky, the CRT in module B will detect the problem and you'll get a heap corruption error; if you're unlucky, something weird will happen, and you won't notice it until much later...
You'll also have serious problems if you pass other CRT-managed things like file handles between modules.
If all of your modules dynamically link to the CRT, then they'll all share the same heap, and you can pass things around and not have to worry about where they get deleted.
If you statically link the CRT into each module, then you have to take care that your interfaces don't include these types, and that you don't allocate using 'new' or malloc in one module, and then attempt to clean up in another.
You can get around this limitation by using an OS allocation function like MS Windows's GlobalAlloc()/GlobalFree(), but this can be a pain since you have to keep track of which allocation scheme you used for each pointer.
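For example (a sketch; with both helpers in a shared header, every module ends up calling the same OS allocator):

    #include <windows.h>

    // Cross-module allocation helpers: both sides call the same OS functions,
    // so it no longer matters which module frees the block.
    inline void* CrossModuleAlloc(SIZE_T n) {
        return GlobalAlloc(GMEM_FIXED, n);  // GMEM_FIXED: handle is the pointer
    }

    inline void CrossModuleFree(void* p) {
        GlobalFree(static_cast<HGLOBAL>(p));
    }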
Having to worry about whether the target machine has the CRT DLL your modules require, or packaging one to go along with your application can be a pain, but it's a one-time pain - and once done, you won't need to worry about all that other stuff above.
It is not that you can't pass CRT handles around. It is that you should be careful when you have two modules reading/writing the handle.
For example, in dll A, you write
char* p = new char[10];
and in the main program, write
delete[] p;
When dll A and your main program have two CRTs, the new and delete operations happen in different heaps. The heap in the DLL can't manage the heap in the main program, resulting in memory leaks.
The same goes for file handles. Internally, a file handle may have other state in addition to its memory resource; calling fopen in one module and fwrite or fclose in another can cause problems, as the data the FILE* points to may differ between CRT runtimes.
But you can certainly pass FILE* or memory pointer back and forth between two modules.
One issue I've seen with having different versions of the MSVC runtime loaded is that each runtime has its own heap, so you can end up with allocations failing with "out of memory" even though there is plenty of memory available, but in the wrong heap.
The major problem with this is that the problems can occur at runtime. At the binary level, you've got DLLs making function calls, eventually to the OS. The calls themselves are fine, but at runtime the arguments are not. Of course, this can depend on the exact code paths exercised.
Worse, it could even be time- or machine-dependent. An example would be a shared resource, used by two DLLs and protected by a proper reference count: the last user deletes it, which could be the DLL that created it (lucky) or the other one (trouble).
As far as I know, there isn't a (safe) way to check which version of the CRT a library is statically linked with.
You can, however, check the binaries with a program such as DependencyWalker. If you don't see any DLLs beginning with MSVC in the dependency list, it's most likely statically linked.
If the library is from a third part, I'd just assume it's using a different version of the CRT.
This isn't 100% accurate, of course, because the library you're looking at could be linked with another library that links with the CRT.
if CRT handles are a problem, then don't expose CRT handles in your dll exports :-)
don't pass memory pointers around (if an exported class allocates some memory, it should free it... which means don't mess with inline constructor/destructor?)
This MSDN article (here) says that you should be alerted to the problem by linker error LNK4098. It also suggests that passing any CRT handles across a CRT boundary is likely to cause trouble, and mentions locale, low-level file I/O, and memory allocation as examples.