What is the purpose of malloc hooks? - c++

What exactly is the purpose of using malloc hooks? And I've read it's used in memory-profiling, etc. but never really understood how.

Well, if you can hook into the behaviour of allocation functions, then you can track memory allocations for profiling and debugging.
The GNU C Library (glibc) documentation on malloc hooks has a nice little example demonstrating adding debug output every time the allocation functions are invoked.
I'm not really sure what else to tell you... is that not reason enough?

One very simple example: suppose you know that memory allocated by allocation number N (N is the same in each run) is always leaked in your code. You can set a hook and inside put a breakpoint on condition "allocation number equals N". Once that breakpoint is hit you examine the call stack and find why exactly that memory is leaked later.
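A minimal sketch of the allocation-number idea, assuming allocations are routed through a wrapper function rather than through the glibc hook variables. The names counted_malloc, g_alloc_count, g_break_on, g_hits and on_target_allocation are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping, not part of any library.
static unsigned long g_alloc_count = 0;  // allocations seen so far
static unsigned long g_break_on = 0;     // allocation number to stop at (0 = never)
static unsigned long g_hits = 0;         // how often the target was reached

// Put an unconditional debugger breakpoint on this function.
void on_target_allocation() { ++g_hits; }

// Wrapper that numbers every allocation and calls the hook on number N.
void* counted_malloc(std::size_t size) {
    ++g_alloc_count;
    if (g_alloc_count == g_break_on) {
        on_target_allocation();
    }
    return std::malloc(size);
}
```

Because the comparison is compiled into the program, the debugger only stops on the one allocation of interest, and the call stack at that point shows who requested the leaked block.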

It's a simple way to make sure that your application is not leaking memory. This can be very important if it has to run for a long time in an environment with limited memory. You can use it while testing, and turn it off in the release version.

They can also be used to replace the allocator altogether e.g. with umem or boehm-gc either for testing or because it is more efficient for a particular application.

Related

Testing C++ code and IsBadWritePtr

I am currently writing some basic tests for some functions of my C++ code (it is a game engine that I am writing mostly for educational purposes). One of the features I want to test is the memory allocation code. The tests currently consist of a function that runs every startup if the code is in debug mode. This forces me to always test the code when I am debugging.
To test my memory allocation code my instinct is to do something like this:
int* test = MemoryManager::AllocateMemory<int>();
assert(!IsBadWritePtr(test, sizeof(int)), "Memory allocation test failed: allocate");
MemoryManager::FreeMemory(test);
assert(IsBadWritePtr(test, sizeof(int)), "Memory free test failed: free");
This code is working fine, but all the resources I can find say not to use the IsBadWritePtr function (this is a WinAPI function for those unfamiliar). Is the use of this function OK in this case? The three main warnings against using it I found were:
This might cause issues with guard pages
This isn't an issue as the memory allocation code is right there and I know I am not allocating a guard page.
It's better to fail earlier
This isn't a consideration as I am literally using it to fail as early as possible.
It isn't thread safe
The test code is executed right at the beginning of execution, long before any other threads exist. It also acts on memory to which no other pointers are created, and therefore could not exist in other threads.
So basically I want to know if the use of this function is a good idea in this case, and if there is anything I am missing about the function. I am also aware that something that points to the wrong place will still pass this test, but it at least detects most memory allocation errors, right (what are the chances I get a pointer to valid memory if the allocation fails?)
I was going to write this as a comment but it's too long.
I'll bring up the elephant in the room: why don't you just test for failure in the traditional way (by returning null or throwing an exception)?
Never mind that IsBadWritePtr is so frowned upon (even its documentation says that it's obsolete and you shouldn't use it); your use case doesn't even seem appropriate. From the MSDN documentation:
This function is typically used when working with pointers returned from third-party libraries, where you cannot determine the memory management behavior in the third-party DLL.
But you are not using it to test anything passed/returned from a DLL, you just seem to be using it to test for allocation success, which is not only unnecessary (because you already know that from the return value of HeapAlloc, GlobalAlloc, etc.), but it's not what IsBadWritePtr is intended for.
Also, testing for allocation success is not something you should only do in debug mode, or with asserts, as it's obviously out of your control and you can't try to "fix" it by debugging.
Building on @user1610015's answer, there is one reason why IsBadReadPtr should NOT work in your scenario.
Basically IsBadReadPtr works at whole-page granularity. This means that for the above code to be correct, each and every allocation you make would have to consume a whole page (min 4KB).
Modern allocators use a variety of tricks to pack lots of allocations into pages (low-fragmentation bucket heaps, linked lists of allocations, etc). If you don't pack small allocations like this, then things like STL maps and other libraries which use lots of small allocations will absolutely kill your game (both in memory use and because cache locality will be destroyed by so much unused padding).
As a side note, your last comment about thread safety is dangerous. Lots of apps and libraries you link to can spawn threads from global object constructors (and so run before main is called) and use other tricks to insert code into your process. So I would definitely check that this is the case with your code right now but, more importantly, check it again later as you add 3rd-party libraries to your code.

How to make a simple tool to detect double free or memory overflow in Linux for a large project?

I have a large embedded project running Linux, with various processes and threads. I can't log all the malloc and new calls, as that would make the box (an embedded set-top box) sluggish. The sluggishness might also cause a crash because of mutex timeouts or other things. So I want to make a tool that can help debug memory-related issues such as memory overflow.
For example, suppose you malloc 4 bytes but write 8 bytes. This may corrupt the adjacent chunk of allocated data: that chunk's header can be overwritten, so free() will fail or crash. How can I make a tool to detect such issues? I would also like a tool to track down memory leaks. Is there a way to do so? I can't use valgrind as it slows down my STB. So I want to develop my own tool that can check for memory header corruption or memory leaks. Depending on my choice, it should do either the memory corruption check or the memory leak detection. It should also be lightweight.
Firstly there is probably no way to call this "simple".
Secondly if you are using C++ I highly suggest not using malloc/free but rather new/delete. The options for overriding those operators are much more flexible.
C++ provides a number of tools to improve memory safety really:
smart pointers (the performance cost really is worth the safety improvement)
Encapsulating things in classes. For example, if you use std::array::at(i) it will throw an exception if your access is out of bounds.
Lastly, proper use of asserts in your code can go a long way to catch errors.
My point is merely that you should not depend on your debugging tools to negate the necessity of using good C++ programming methods.
Ok so now next you need to override new and delete.
A google search will provide many ways to do this.
For your problem it probably makes more sense to overload delete/new globally.
Buffer overflow detection
This is the first part of your problem.
What you need to do is allocate additional memory in your overloaded operator new so that there are guard regions before and after the requested block, and then return only the centre part.
How big a buffer is your choice.
pseudo code:
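#include <cstdlib>   // malloc
#include <cstring>   // memset
#include <new>       // std::bad_alloc
void* operator new(size_t s)
{
    // Allocate the request plus a guard region of BUFFER bytes on each side.
    char* mem = static_cast<char*>(malloc(s + 2*BUFFER));
    if (!mem) throw std::bad_alloc();
    memset(mem, 0x5A, s + 2*BUFFER);   // fill guards (and payload) with the pattern
    return mem + BUFFER;               // hand out only the centre part
}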
At some stage in the future you need to check that the BUFFER regions kept the values of 0x5A. You should probably do this in the call to free() but you can also have your own function to do this which you call periodically. In order to speed up this process use a function like memcmp perhaps.
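The check described above needs to know where the rear guard region starts, so the block size has to be stored somewhere. A minimal sketch, assuming a small header in front of the front guard; guarded_alloc, guards_intact, guarded_free, HEADER, GUARD and PATTERN are hypothetical names, and the 16-byte sizes are an assumption chosen to keep the returned pointer aligned on common platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

constexpr std::size_t HEADER = 16;          // room for the stored size
constexpr std::size_t GUARD = 16;           // bytes of guard on each side
constexpr unsigned char PATTERN = 0x5A;     // fill value from the text above
static_assert(sizeof(std::size_t) <= HEADER, "header too small");

// Layout: [header: size][front guard][user data][rear guard]
void* guarded_alloc(std::size_t size) {
    unsigned char* base = static_cast<unsigned char*>(
        std::malloc(HEADER + GUARD + size + GUARD));
    if (!base) return nullptr;
    std::memcpy(base, &size, sizeof size);
    std::memset(base + HEADER, PATTERN, GUARD);
    std::memset(base + HEADER + GUARD + size, PATTERN, GUARD);
    return base + HEADER + GUARD;
}

// Returns true if both guard regions still hold the fill pattern.
bool guards_intact(const void* user) {
    const unsigned char* p = static_cast<const unsigned char*>(user);
    const unsigned char* base = p - GUARD - HEADER;
    std::size_t size = 0;
    std::memcpy(&size, base, sizeof size);
    for (std::size_t i = 0; i < GUARD; ++i) {
        if (base[HEADER + i] != PATTERN) return false;  // underrun
        if (p[size + i] != PATTERN) return false;       // overrun
    }
    return true;
}

void guarded_free(void* user) {
    if (!user) return;
    std::free(static_cast<unsigned char*>(user) - GUARD - HEADER);
}
```

Calling guards_intact from the free path, or periodically over all live blocks, reports an overflow once it has happened; it cannot say which write caused it.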
Memory leak detection
Detecting memory leaks is not trivial.
Firstly I suggest using stack-based objects whenever possible, to avoid allocating memory on the heap altogether when it is not needed.
The main question regarding memory leaks is to know whether a certain memory block should have been deleted or not.
99% of your memory leak problems can probably be solved just by using smart pointers.
However one of the most difficult memory leaks to catch is that of a growing data structure. (say for example a linked list that grows slowly over time)
Firstly, in your overloaded new/malloc functions keep a list of all memory currently allocated. Also keep a counter of the total amount of memory allocated.
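A minimal sketch of such bookkeeping, assuming allocations go through an explicit wrapper object. AllocationRegistry and its members are hypothetical names. One caveat is flagged in the comments: the map itself allocates, so this cannot be dropped unchanged into a replaced global operator new without guarding against recursion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Hypothetical tracker: records every live block and its size.
// Not thread-safe, and the map allocates internally, so calling this
// from inside a replaced global operator new would recurse.
class AllocationRegistry {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p) {
            live_[p] = size;
            bytes_in_use_ += size;
            ++total_allocations_;
        }
        return p;
    }

    void release(void* p) {
        auto it = live_.find(p);
        if (it == live_.end()) return;  // unknown or already released pointer
        bytes_in_use_ -= it->second;
        live_.erase(it);
        std::free(p);
    }

    std::size_t bytes_in_use() const { return bytes_in_use_; }
    std::size_t live_blocks() const { return live_.size(); }
    std::size_t total_allocations() const { return total_allocations_; }

private:
    std::unordered_map<void*, std::size_t> live_;
    std::size_t bytes_in_use_ = 0;
    std::size_t total_allocations_ = 0;
};
```

Ignoring pointers the registry does not know about also turns a double release into a no-op here; a debugging build might prefer to report it instead.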
Method 1: threshold detection:
Essentially, every time your program's memory usage exceeds a threshold amount you report this and increase the threshold. If your program continues to exceed thresholds as it keeps running, something is wrong.
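A minimal sketch of threshold detection, assuming the allocation wrappers call on_alloc and on_free with the block sizes. ThresholdMonitor and the fixed step size are hypothetical choices; a real tool would log or break where this sketch only counts reports.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical monitor: counts how often usage crosses a rising threshold.
class ThresholdMonitor {
public:
    // step must be greater than zero, otherwise on_alloc would never finish.
    ThresholdMonitor(std::size_t initial_threshold, std::size_t step)
        : threshold_(initial_threshold), step_(step) {}

    void on_alloc(std::size_t bytes) {
        in_use_ += bytes;
        // Report once per threshold crossed, then raise the threshold.
        while (in_use_ > threshold_) {
            ++reports_;
            threshold_ += step_;
        }
    }

    void on_free(std::size_t bytes) { in_use_ -= bytes; }

    unsigned reports() const { return reports_; }
    std::size_t threshold() const { return threshold_; }

private:
    std::size_t in_use_ = 0;
    std::size_t threshold_;
    std::size_t step_;
    unsigned reports_ = 0;
};
```

A program with stable memory use produces a few reports early on and then goes quiet; a steady trickle of reports over a long run points at a growing data structure.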
Method 2: Comparative analysis:
In pseudo code:
Value1 = currentAmountOfMemoryUsed;
runSomeCode()
if (currentAmountOfMemoryUsed != Value1) reportProblem()
Whether this is possible depends a lot on what happens in runSomeCode(), as some code can legitimately "save" up some memory for when it runs again later.
Method 3: Leak detection on program exit:
The premise is that if your code is 100% correctly written, every bit of memory allocated should have been freed by the time your program exits.
This method once again is not always possible because perhaps your program needs to run indefinitely and also your program might segfault because of your errors long before this can be detected.
Compiler support
On a lower level most compilers have some support to get into the whole memory management system but the way to handle this is 100% compiler/platform specific. e.g. Visual Studio C++
This is why I highly suggest not using malloc/free directly as this is problematic for debugging in this way as well as breaks the constructor/destructor design patterns of C++.
overriding malloc/free
There is however a more hands-on approach to overriding malloc/free.
That is by defining your own malloc/free functions.
Typically under debugging this will then use macros to include __FILE__ and __LINE__ in the call:
#ifndef NDEBUG
#define myMalloc(s) myMallocImplementation(s,__FILE__,__LINE__)
#else
#define myMalloc(s) malloc(s)
#endif
What this allows is that your malloc implementation can then save the source location where the memory was allocated. This approach will however not catch malloc/free usage within libraries you are using.
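A minimal sketch of what myMallocImplementation might look like. The AllocSite struct and g_last_site are hypothetical; a real tool would store the location per block (for example in a header in front of the block) rather than only remembering the most recent call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical record of where an allocation was requested.
struct AllocSite {
    const char* file;
    int line;
    std::size_t size;
};

// For illustration only: remembers just the most recent allocation site.
static AllocSite g_last_site = {nullptr, 0, 0};

void* myMallocImplementation(std::size_t s, const char* file, int line) {
    g_last_site.file = file;
    g_last_site.line = line;
    g_last_site.size = s;
    return std::malloc(s);
}

// Debug-style macro: captures the caller's source location.
#define myMalloc(s) myMallocImplementation((s), __FILE__, __LINE__)
```

The macro has to expand at the call site for __FILE__ and __LINE__ to name the caller, which is why this is done with the preprocessor rather than with a plain function.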
This is a bit harder to do with new/delete calls as it would normally require some amount of digging into the call-stack at run-time to find out who called your new() function and that again is fairly compiler specific.
Also see: MSDN blog article
Memory freezing
Given everything above, I would also like to mention something that is very common in safety-critical code (as used in motor vehicles and/or airplanes, etc.).
Outside of initialization a safety-critical program is usually not allowed to use malloc/free/new/delete. So all memory allocations must happen during initialization, and once initialization is complete malloc/free is usually frozen in some way. Any call to malloc/free after that will cause an assert.
This can be quite a heavy limitation to work with in a C++ environment but it does make for very robust code.
Note this does nothing for buffer overflow access or invalid pointer access problems.
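A minimal sketch of freezing, assuming all allocations go through a wrapper. freeze_allocations, checked_malloc, g_frozen and g_violations are hypothetical names. To keep the sketch easy to exercise, a violation is counted and the call fails, where the scheme described above would assert.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

static bool g_frozen = false;        // set once initialization is done
static unsigned g_violations = 0;    // allocations attempted after the freeze

// Call at the end of initialization.
void freeze_allocations() { g_frozen = true; }

void* checked_malloc(std::size_t size) {
    if (g_frozen) {
        // A production build would typically assert(!g_frozen) here instead.
        ++g_violations;
        return nullptr;
    }
    return std::malloc(size);
}
```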

Shielding app from library leaks

I have to use a function from a shared library which leaks some small amount of memory (let's assume I can't modify the library). Unfortunately, I have to call this function a huge number of times, which obviously makes this leak catastrophic.
Is there any method to fix this problem? If yes, is there a fast method of doing this? (The function must be called a few hundred thousand times; the leak becomes problematic after about 10k times.)
I can think of a couple of approaches, but I don't know what will work for you.
Switch to a garbage-collecting memory allocator like Boehm's gc. This can sweep up those leaks, and may even be a performance gain because free() becomes a no-op.
exit(): The Ultimate Deallocator. Fork off a subprocess, run it 10k times, pass the results back to the parent process. Apache's web server does this to contain damage from third-party library leaks.
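A minimal POSIX sketch of the subprocess approach, assuming the results of a batch can be reduced to a value small enough to send through a pipe in one write. leaky_function is a hypothetical stand-in for the library call, and run_batch_in_child is a made-up helper name.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the leaking library function.
int leaky_function(int x) {
    char* leaked = new char[64];  // deliberately never freed
    leaked[0] = 0;
    return x * 2;
}

// Runs leaky_function for inputs [first, last) in a child process and
// returns the sum of the results. The leaks die with the child.
bool run_batch_in_child(int first, int last, long* sum_out) {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {  // child: do the work, send the result, exit
        close(fds[0]);
        long sum = 0;
        for (int i = first; i < last; ++i) sum += leaky_function(i);
        ssize_t written = write(fds[1], &sum, sizeof sum);
        _exit(written == static_cast<ssize_t>(sizeof sum) ? 0 : 1);
    }

    // parent: read the result and reap the child
    close(fds[1]);
    long sum = 0;
    ssize_t got = read(fds[0], &sum, sizeof sum);
    close(fds[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    if (got != static_cast<ssize_t>(sizeof sum)) return false;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return false;
    *sum_out = sum;
    return true;
}
```

The parent can call run_batch_in_child repeatedly with batches small enough that the leak stays harmless within each child.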
I'm not sure this is easier than rewriting the function yourself, but you could write your own small memory allocator specific to your task, which would look something like the following:
(it should replace default memory allocation calls and this is done for the functions in your library too).
1) You should be able to enter a leak-reverting mode which, for example, disposes of everything allocated while in this mode.
2) Before your function processes something, enter that leak-reverting mode, and exit it when the function finishes.
Basically, if the dependencies in your code aren't too tight, this would help.
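A minimal sketch of the leak-reverting mode, assuming the library's allocation calls can be redirected to allocate and release on this object (how to redirect them is platform specific and not shown). LeakRevertingAllocator and its member names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical allocator: remembers blocks allocated while the mode is
// active and frees whatever is still outstanding when the mode is left.
class LeakRevertingAllocator {
public:
    void enter() { active_ = true; }

    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p && active_) pending_.push_back(p);
        return p;
    }

    void release(void* p) {
        auto it = std::find(pending_.begin(), pending_.end(), p);
        if (it != pending_.end()) pending_.erase(it);
        std::free(p);
    }

    // Leaves the mode, frees every block the library forgot about,
    // and returns how many blocks were reclaimed.
    std::size_t leave() {
        std::size_t reclaimed = pending_.size();
        for (void* p : pending_) std::free(p);
        pending_.clear();
        active_ = false;
        return reclaimed;
    }

private:
    bool active_ = false;
    std::vector<void*> pending_;
};
```

This only works if nothing allocated inside the mode is meant to outlive the call, which is the "dependencies aren't too tight" condition: anything the library caches between calls would be freed underneath it.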
Another way would be to make a second application and pair it with the main one. When the second one exits, its memory is automatically released. You may want to see how the googletest framework runs its child tests and how the pipes are constructed there.
In short, no. If you have time, you can rewrite the function yourself. Catastrophic usually means this is the way to go. One other possibility, can you load and unload the library (like a .so)? It's possible that this will release the leaked memory.

Make process crash on large memory allocation

I'm trying to find a significant memory leak (15MB at a time, but doing allocations like this on multiple places). I checked the most obvious places, and then used AQTime, but I still can't pinpoint it. Now I see 2 options left:
1) Use SetProcessWorkingSetSize: I've tried this but my process happily keeps on running when using up more than 150MB:
DWORD MemorySize = 150*1024*1024;
SetProcessWorkingSetSize( GetCurrentProcess(), MemorySize/2, MemorySize*2 );
2) Put a breakpoint when allocating more than 1MB at a time. How should I do this, overload operator new with an 'if>1MB' inside?
SetProcessWorkingSetSize doesn't mean what you think it means - it's a clue to the OS on how much memory to keep "in memory" versus paged to disk. Modern OSes are very aggressive when it comes to paging unused memory to disk - Windows particularly so.
IBM Rational Purify is your only solution other than a very thorough code analysis. On Windows, for C/C++, there is no better tool for finding memory leaks. On Mac or Linux you could use valgrind, but AFAIK, it's not yet working on Windows.
From your tags you are using c++ and visual studio.
In that case you can simply use the CRT debug hooks that Microsoft provides for you.
Search MSDN for _CrtSetAllocHook.
In a debug build this will allow you to intercept every allocation - you can ignore small ones and just set a break point or call ::DebugBreak on the large ones.
1) Use SetProcessWorkingSetSize: I've tried this but my process happily keeps on running when using up more than 150MB:
What is SetProcessWorkingSetSize returning? Is the call succeeding?
2) Put a breakpoint when allocating more than 1MB at a time. How should I do this, overload operator new with an 'if>1MB' inside?
Yes, that should work.
It might be helpful to examine the tools provided by the C Runtime Debug Heap in MSVC.
On an embedded type system, we would do exactly as you suggest: put a break on any call to new/memAlloc above a certain threshold and do the same on free/delete. Tedious, but it will get the job done. A conditional breakpoint on the size should do what you want, but on the delete it's a bit worse.
Try UMDH. It is a free Microsoft utility that helps you find memory leaks.
Sorry all, none of the proposed solutions worked. It finally got fixed using AQTime and a lot of debug output. The leak got cleaned up on shutdown, so it was like looking for a needle in a haystack.
Still I'm interested in how to efficiently find this though. I tried to put a conditional breakpoint on the new operator, but the debugger took ages to evaluate "bytes > 1024 * 1024" for every single allocation.
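The slow conditional breakpoint can be avoided by compiling the size check into a replaced global operator new and putting an ordinary breakpoint on the branch. A minimal sketch, assuming the plain (non-aligned, throwing) operator new is the one in use; kLargeThreshold, g_large_allocations and on_large_allocation are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

constexpr std::size_t kLargeThreshold = 1024 * 1024;  // 1MB
static unsigned long g_large_allocations = 0;

// Put an unconditional breakpoint here (or call a debug-break facility).
void on_large_allocation(std::size_t /*size*/) { ++g_large_allocations; }

// Replacement for the global allocation function.
void* operator new(std::size_t size) {
    if (size > kLargeThreshold) on_large_allocation(size);
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    return p;
}

// Matching deallocation functions, since the memory came from malloc.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
```

The check costs one comparison per allocation, and the debugger is only involved when a large request actually happens, at which point the call stack identifies the caller.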

how to find allocated memory in linux

Good afternoon all,
What I'm trying to accomplish: I'd like to implement an extension to a C++ unit test fixture to detect if the test allocates memory and doesn't free it. My idea was to record allocation levels or free memory levels before and after the test. If they don't match then you're leaking memory.
What I've tried so far: I've written a routine to read /proc/self/stat to get the vm size and resident set size. Resident set size seems like what I need but it's obviously not right. It changes between successive calls to the function with no memory allocation. I believe it's returning the cached memory used not what's allocated. It also changes in 4k increments so it's too coarse to be of any real use.
I can get the stack size by allocating a local and saving its address. Are there any problems with doing this?
Is there a way to get real free or allocated memory on linux?
Thanks
Your best bet may actually be to use a tool specifically designed for the job of finding memory leaks. I have personal experience with Electric Fence, which is easy to use and seems to do the job nicely (not sure how well it will handle C++). Also recommended by others is Dmalloc.
For sure though, everyone seems to like Valgrind, which can do just about anything and even has front-ends (though anything that has a front-end built for it means that it probably isn't the simplest thing in the world). If the KDE folks can recommend it, it must be able to handle just about anything. (I'm not saying anything bad about KDE, just that it is a very large C++ codebase, so if Valgrind can handle KDE software, it must have something going for it. Though I don't have personal experience with it as Electric Fence was always enough for me)
I'd have to agree with those suggesting Valgrind and similar, but if the run-time overhead is too great, one option may be to use the mallinfo() call to retrieve statistics on currently allocated memory, and check whether uordblks is nonzero.
Note that this will have to be run before global destructors are called - so if you have any allocations that are cleaned up there, this will register a false positive. It also won't tell you where the allocation is made - but it's a good first pass to figure out which test cases need work.
Don't look at the OS to get allocation info. The C library manages memory internally, and only asks the OS for more RAM in chunks (4KB in your case). In most cases, it's never released back to the OS, so you can't really check anything there.
You'll have to patch malloc() and free() to get the info you need.
Or, use Valgrind.
Not a direct answer, but you could re-define the ::new and ::delete operators and internally, either via a singleton or global objects, keep track of the allocated and de-allocated memory.
Edit: If this is a personal, DIY project then cool. But if it's for something critical you can always jump onto one of the many leak detection libraries/programs available; a quick google search should suffice.
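A minimal sketch of that idea for the unit-test fixture described in the question, assuming single-threaded tests and only the plain (non-aligned) forms of new and delete. g_live_allocations and LeakGuard are hypothetical names; a plain counter is used instead of a container so that the tracking code itself never allocates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Number of blocks handed out by operator new and not yet deleted.
// Not thread-safe; a multithreaded program would need an atomic.
static long g_live_allocations = 0;

void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    ++g_live_allocations;
    return p;
}

void operator delete(void* p) noexcept {
    if (p) {
        --g_live_allocations;
        std::free(p);
    }
}

void operator delete(void* p, std::size_t) noexcept { ::operator delete(p); }

// Hypothetical fixture helper: snapshot on construction, compare later.
class LeakGuard {
public:
    LeakGuard() : start_(g_live_allocations) {}
    long leaked() const { return g_live_allocations - start_; }

private:
    long start_;
};
```

A fixture would create a LeakGuard in its setup and check leaked() == 0 in its teardown. Memory the code under test legitimately caches between tests would show up as a false positive.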
google-perftools can be used in your test code.