Allocation numbers in C++ (Windows) and their predictability - c++

I am using _CrtDumpMemoryLeaks to identify memory leaks in our software. We are using a third-party library in a multi-threaded application. This library has memory leaks, so in our tests we want to identify the leaks that are ours and discard those we have no control over.
We use continuous integration so new functions/algorithms/bug fixes get added all the time.
So the question is: is there a safe way of distinguishing our leaks from the third-party library's? We thought about using allocation numbers, but is that safe?

In a big application I worked on, the global new and delete operators were replaced (e.g. see How to properly replace global new & delete operators) and used private heaps (e.g. HeapCreate). Third-party libraries would use the process heap, so the allocations were clearly separated.
Frankly, I don't think you can get far with allocation numbers. Using explicit separate heaps for the application and the libraries (and maybe even separate per-component heaps within your own app) would be much more manageable. Consider that you can add your own app-specific header to each allocated block and thus enable very fancy memory tracking: capturing the entire call stack of each allocation for debugging, per-component accounting, and so on.

You might be able to do this using Microsoft's heap-debugging library without using any third-party solutions. Based on what I learned from a previous question here, you should just make sure that all memory allocated in your code is allocated through a call to _malloc_dbg with the second argument set to _CLIENT_BLOCK. Then you can set a callback function with _CrtSetDumpClient, and that callback will only receive information about the client blocks that were allocated, not the other ones.
You can easily use the preprocessor to convert all the calls to malloc and free to actually call their debugging versions (e.g. _malloc_dbg); just look at how it's done in crtdbg.h which comes with Visual Studio.
The tricky part for me would be figuring out how to override the new and delete operators to call debugging functions like _malloc_dbg. It might be hard to find a solution where only the news and deletes in your own code are affected, and not in the third-party library.

You may want to use the DebugDiag tool provided by Microsoft. For complete information about the tool, refer to http://www.microsoft.com/en-sg/download/details.aspx?id=40336
DebugDiag can be used to identify various issues. To track down the leaks (ours and those in third-party modules):
1. Configure DebugDiag with the rule type "Native (non-.NET) Memory and Handle Leak".
2. Re-run the application for some time and capture dump files. DebugDiag can also be configured to capture a dump file after a specified interval.
3. Open and analyze the captured dump files in DebugDiag under "Performance Analyzers".
Once analysis is complete, DebugDiag automatically generates a report naming the modules/DLLs where a leak is likely (with a probability). With that module information, you can concentrate on the suspect module using static code analysis; if the module belongs to a third-party DLL, you can share the DebugDiag report with its authors. In addition, if you run or attach to your application with the appropriate PDB files, DebugDiag also provides the call stacks from which the leaks most likely originate.
This information has been very useful to me in the past while debugging memory leaks in Windows applications. Hopefully it is useful to you too.

The answer REALLY depends on the actual implementation of the third-party library. Does it only leak a consistent number of items, or does that depend on, for example, the number of threads, which functions within the library are used, or some such? When are the allocations made?
Even if it's a consistent number of leaks regardless of library usage, I'd be hesitant to rely on the allocation number. By all means, give it a try. If all the allocations are made very early on, and they don't depend on any of "your" code, then it could work - and it is a REALLY simple thing. But try adding, for example, a static std::vector<int>(100) to see if memory allocations in static variables affect the allocation numbers... If they do, this method is probably doomed (unless you have very strict rules on static objects).
Using a separate heap (with the new/delete operators replaced) would be the correct solution, as this can probably be expanded to gather other statistics too [such as the number of allocations made, to detect parts of the code that make excessive allocations - of course, this has to be analysed based on what the code actually does].

The newer Doug Lea malloc includes the mspace abstraction. An mspace is a separate heap. In our couple-hundred-thousand-NCSL application, we use a dozen different mspaces for different parts of the code. We use custom allocators to have STL containers allocate memory from the right mspace.
Some of the benefits
3rd party code does not use mspaces, so their allocations (and leaks) do not mix with ours
We can look at the memory usage of each mspace to see which piece of code might have memory leaks
Any memory corruption is contained within one mspace thus limiting the amount of code we need to look at for debugging.

Related

C++ test if two DLLs share the same heap

It is well known that the freeing of heap memory must be done with the same allocator as the one used to allocate it. This is something to take into account when exchanging heap allocated objects across DLL boundaries.
One solution is to provide a destructor for each object, like in a C API: if a DLL allows creating object A it will have to provide a function A_free or something similar 1.
Another related solution is to wrap all allocations into shared_ptr because they store a link to the deallocator 2.
Another solution is to "inject" a top-level allocator into all loaded DLLs (recursively) 3.
Another solution is to just to not exchange heap allocated objects but instead use some kind of protocol 4.
Yet another solution is to be absolutely sure that the DLLs will share the same heap, which should (will?) happen if they share compatible compilation options (compiler, flags, runtime, etc.) 5 6.
This seems quite difficult to guarantee, especially if one would like to use a package manager and not build everything at once.
Is there a way to check at runtime that the heaps are actually the same between multiple DLLs, preferably in a cross-platform way?
For reliability and ease of debugging this seems better than hoping that the application will immediately crash and not corrupt stuff silently.
I think the biggest problem here is your definition of "heap". That assumes there is a unique definition.
The problem is that Windows has HeapAlloc, while C++ typically uses "heap" as the memory allocated by ::operator new. These two can be the same, distinct, a subset, or partially overlapping.
With two DLLs, both might be written in C++ and use ::operator new, but each could have linked its own unique version. So there might be multiple answers to the observation in the previous paragraph.
Now let's assume for an example that one DLL has ::operator new forward directly to HeapAlloc, but the other counts allocations before calling HeapAlloc. Clearly the two can't be mixed formally, because the count kept by the second allocator would be wrong. But the code is so simple that both news are probably inlined. So at assembly level you just have calls to HeapAlloc.
There's no way you can detect this at runtime, even if you disassembled the code on the fly (!) - the inlined counter-increment instruction is indistinguishable from the surrounding code.

Testing C++ code and IsBadWritePtr

I am currently writing some basic tests for some functions of my C++ code (it is a game engine that I am writing mostly for educational purposes). One of the features I want to test is the memory allocation code. The tests currently consist of a function that runs every startup if the code is in debug mode. This forces me to always test the code when I am debugging.
To test my memory allocation code my instinct is to do something like this:
int* test = MemoryManager::AllocateMemory<int>();
assert(!IsBadWritePtr(test, sizeof(int)), "Memory allocation test failed: allocate");
MemoryManager::FreeMemory(test);
assert(IsBadWritePtr(test, sizeof(int)), "Memory free test failed: free");
This code is working fine, but all the resources I can find say not to use the IsBadWritePtr function (this is a WinAPI function for those unfamiliar). Is the use of this function OK in this case? The three main warnings against using it I found were:
This might cause issues with guard pages
This isn't an issue as the memory allocation code is right there and I know I am not allocating a guard page.
It's better to fail earlier
This isn't a consideration as I am literally using it to fail as early as possible.
It isn't thread safe
The test code is executed right at the beginning of execution, long before any other threads exist. It also acts on memory to which no other pointers are created, and therefore could not exist in other threads.
So basically I want to know if the use of this function is a good idea in this case, and if there is anything I am missing about it. I am also aware that a pointer to the wrong place will still pass this test, but it at least detects most memory allocation errors, right? (What are the chances of getting a pointer to valid memory if the allocation fails?)
I was going to write this as a comment but it's too long.
I'll bring up the elephant in the room: why don't you just test for failure in the traditional way (by returning null or throwing an exception)?
Never mind the fact that IsBadWritePtr is so frowned upon (even its documentation says that it's obsolete and you shouldn't use it) - your use case doesn't even seem appropriate. From the MSDN documentation:
This function is typically used when working with pointers returned from third-party libraries, where you cannot determine the memory management behavior in the third-party DLL.
But you are not using it to test anything passed/returned from a DLL, you just seem to be using it to test for allocation success, which is not only unnecessary (because you already know that from the return value of HeapAlloc, GlobalAlloc, etc.), but it's not what IsBadWritePtr is intended for.
Also, testing for allocation success is not something you should only do in debug mode, or with asserts, as it's obviously out of your control and you can't try to "fix" it by debugging.
Building on @user1610015's answer, there is one reason why IsBadReadPtr should NOT work in your scenario.
Basically, IsBadReadPtr works at whole-page granularity. This means that for the above code to be correct, each and every allocation you make would have to consume a whole page (minimum 4 KB).
Modern allocators use a variety of tricks to pack lots of allocations into pages (low-fragmentation bucket heaps, linked lists of allocations, etc.). If you don't pack small allocations like this, then things like STL maps and other libraries that make lots of small allocations will absolutely kill your game (both in memory use and because cache locality is destroyed by so much unused padding).
As a side note, your last comment about thread safety is dangerous. Lots of apps and libraries you link against can spawn threads from global object constructors (which run before main is called) and use other tricks to insert code into your process. So I would definitely verify that this assumption holds for your code right now, and more importantly, check it again later as you add third-party libraries.

How can you track memory across DLL boundaries

I want performant run-time memory metrics, so I wrote a memory tracker based on overloading new and delete. It basically lets you walk your allocations in the heap and analyze everything about them - fragmentation, size, time, number, call stack, etc. But it has two fatal flaws: it can't track memory allocated in other DLLs, and when ownership of objects is passed to DLLs or vice versa, crashes ensue. And some smaller flaws: if a user uses malloc instead of new, the allocation is untracked, as is anything using a class-defined new/delete.
How can I eliminate these flaws? I think I must be going about this fundamentally incorrectly by overloading new/delete, is there a better way?
The right way to implement this is to use detours and a separate tool that runs in its own process. The procedure is roughly the following:
Allocate memory in the target process.
Place the code of a small loader there that will load your DLL.
Call the CreateRemoteThread API to run your loader.
From inside the loaded DLL, establish detours (hooks, interceptors) on the alloc/dealloc functions.
Process the calls and track the activity.
If you implement your tool this way, it does not matter whether the memory allocation routines are called from a DLL or directly from the exe. Moreover, you can track activity in any process, not only ones you compiled yourself.
MS Windows allows checking the contents of the virtual address space of a remote process. You can summarize the use of the virtual address space collected this way in a histogram, like the following:
From this picture you can see how many virtual allocations of what size exist in your target process.
The picture above shows an overview of the virtual address space usage in the 32-bit MSVC DevEnv. A blue stripe means a committed piece of memory, a magenta stripe reserved memory. Green is the unoccupied part of the address space.
You can see that the lower addresses are pretty fragmented, while the middle area is not. The blue lines at high addresses are the various DLLs loaded into the process.
You should find out the common memory management routines that are called by new/delete and malloc/free, and intercept those. It is usually malloc/free in the end, but check to make sure.
On UNIX, I would use LD_PRELOAD with some library that re-implemented those routines. On Windows, you have to hack a little bit, but this link seems to give a good description of the process. It basically suggests that you use Detours from Microsoft Research.
Passing ownership of objects between modules is fundamentally flawed. It showed up with your custom allocator, but there are plenty of other cases that will fail also:
compiler upgrades, and recompiling only some DLLs
mixing compilers from different vendors
statically linking the runtime library
Just to name a few. Free every object from the same module that allocated it (often by exporting a deletion function, such as IUnknown::Release()).

How do I replace global operator new and delete in an MFC application (debug only)

I've avoided trying to do anything with operator new for many years, due to my sense that it is a quagmire on Windows (especially using MFC). Not to mention, unless one has a very compelling reason to mess with global (or even class) new and delete, one should not.
However, I have a nasty little memory corruption bug, and I'd very much like to track it down. I get messages from the CRT debug allocator indicating that previously freed memory was overwritten. This message is displayed only during a later allocation call, when it tries to reuse a block (I believe this is how it works, anyway).
Due to the portion of code in question, the error message and point of corruption are very unrelated. All I know is "something somewhere overwrote some memory that was previously freed with a single null byte." (I ascertained this by using the debugger and watching the memory referred to by the debug heap over several different runs).
Having exhausted the obvious ideas as to where the culprit might be, I'm left with trying something more rigorous. It occurred to me that it would be ideal if I could turn each freed block into a no-access page of memory, so that the writer would be caught immediately by the CPU's MMU! A little searching later, I found someone who had implemented something along those lines:
http://www.codeproject.com/Articles/38340/Immediate-memory-corruption-detection
His code is buried under a ton of reinvent-the-wheel code, but extracting the core concept was pretty easy, and I have done so.
The problem I have now is that MFC redefines new as DEBUG_NEW, and then further defines a slew of debug interfaces down to the CRT. In addition, it defines the global operator new and delete itself. Hence, as far as C++ is concerned, the "user" is trying to replace global operator new and delete twice, and I get a linker error to the effect of 'symbol already defined.'
Looking around the internet, and SO, I see a few promising articles, but none which ultimately have anything positive to say about replacing global operator new/delete for MFC.
How to properly replace global new & delete operators
Is it possible to replace the memory allocator in a debug build of an MFC application?
I am already aware:
MFC/CRT already provides rich debugging tools for memory allocation.
Well, it provides what it provides - such as the message that got me rolling down this path in the first place. I now know that a corruption is occurring, but that's awfully weak-sauce!
What I would like to supply is guarded-allocation (or even just guarded deallocation). This is clearly possible by using a lot of virtual address space and placing every allocation in isolation, which is horribly wasteful of memory. Okay, yeah, can't see the down side when this is debug-only code useful for special-purpose moments like now.
So, I'm desperately seeking solutions to the following
Force the compiler to be copacetic with my global operator new/delete despite the CRT/MFC supplied one.
Find another way to hook the MFC/CRT _heap_alloc_dbg chain so that it bottoms out in my own code for the penultimate allocation (i.e. I'll allocate via the OS's VirtualAlloc/VirtualFree to supply memory for new and/or malloc).
Does anyone know of answers, or good articles to read that might shed some light on how that these may be accomplished?
Other ideas:
Replace the CRT's new/delete at runtime using a thunk technique.
Some other approach entirely?!
Further investigation:
This article is pretty cool... it gives a way for me to patch the global new/delete operators at runtime. However, as the article points out, it's a bit hackish (however, since I only need this for debug builds, that's not a big deal) http://zeuxcg.blogspot.com/2009/03/fighting-against-crt-heap-and-winning.html
So although this is getting at what I want (a mechanism to replace the CRT memory allocation functions), the implementation is pretty far out of date, and so far my attempts to make it work have run into myriad issues. I think it's just too tied to the CRT version it was originally created for, and only suited to relatively simple console use (i.e. C, not even C++, and it jettisons most of the debugging features provided by the Microsoft CRT). Hence, although a super-cool idea, it's one that would ultimately cost many hours of effort to make work with the current VS2010 dev studio, and hence isn't worth it (to me).
Apparently there is a well-known version of this idea: http://en.wikipedia.org/wiki/Electric_Fence. Unfortunately, even the Windows port I found (http://code.google.com/p/electric-fence-win32/) fails to override the CRT properly and instead asks that you modify all of your source code to call the Electric Fence heap allocation functions. :(
Update 5/3/2012:
And now I discover that Windows already provides an implementation of Electric Fence, accessible via the GFLAGS debugging tool: http://support.microsoft.com/kb/286470. It can be turned on and off externally to the application being tested. It's essentially the same technology I was interested in, and it has the features of the DUMA project (a fork of Electric Fence - http://duma.sourceforge.net/).
The MSVCRT debug heap is actually pretty good and has some useful features you can use, such as breakpoint on the nth allocation etc.
http://msdn.microsoft.com/en-us/library/974tc9t1(v=VS.80).aspx
Among other things you can insert an allocation hook which outputs debugging information etc which you can use to debug this sort of issue.
http://msdn.microsoft.com/en-us/library/z2zscsc2(v=vs.80).aspx
In your case all you really need to do is output the address, and file and line of each allocation. Then when you experience a corrupt block, find the block whose address immediately precedes it, which will almost certainly be the one which overran. You can use the memory view in the Visual Studio debugger to look at the memory address which was corrupted and see the preceding blocks. That should tell you all you need to know to find out when it was allocated.
The debug heap also has a numerical allocation ID on each allocated block and can break on the nth allocation, so if you can get a reasonably consistent repro, such that the same numerical block is corrupted each time, then you should be able to use the "break on nth" functionality to get a full call stack for the time it was allocated.
You may also find _CrtCheckMemory useful to find out if corruption has occurred much earlier. Just call it periodically, and once you have the bug bracketed (error didn't occur in one, did occur in the other) move them closer and closer together.

On Sandboxing a memory-leaky 3rd-Party DLL

I am looking for a way to cure at least the symptoms of a leaky DLL I have to use. While the library (OpenCascade) claims to provide a memory manager, I have as of yet been unable to make it release any memory it allocated.
I would at least like to put the calls to this module in a 'sandbox', in order to keep my application from losing memory even while the OCC module isn't running any more.
My question is: while I realise it would be an UGLY HACK (TM) to do so, is it possible to preallocate a stretch of memory to be used specifically by the libraries, or to build some kind of sandbox around them so I can track which areas of memory they used, in order to release them myself when I am finished?
Or would that be too ugly a hack, and should I try to resolve the issues otherwise?
The only reliable way is to separate use of the library into a dedicated process. You will start that process, pass data and parameters to it, run the library code, retrieve results. Once you decide the memory consumption is no longer tolerable you restart the process.
Using a library that isn't broken would probably be much easier, but if a replacement isn't available you could try intercepting the allocation calls. If the library isn't too badly 'optimized' (specifically, function inlining) you could disassemble it and locate the malloc and free functions; on loading, you could replace every 4-byte (or 8-byte, on a 64-bit system) sequence that encodes that address with one that points to your own memory allocator. This is almost guaranteed to be a buggy, unreadable timesink, though, so don't do this if you can find a working replacement.
Edit:
Saw @sharptooth's answer, which has a much better chance of working. I'd still advise trying to find a replacement, though.
You should ask Roman Lygin's opinion - he used to work at OCC. He has at least one post that mentions memory management: http://opencascade.blogspot.com/2009/06/developing-parallel-applications-with_23.html.
If you ask nicely, he might even write a post that explains mmgt's internals.