Node Add-on Nan::NewBuffer Causes Memory Leak - c++

I have a c++ node add on that uses the Nan library. I have a function that needs to return a buffer. The simplest version of it is as follows (code edited as per comments):
NAN_METHOD(Test) {
char * retVal = (char*)malloc(100 * sizeof(char));
info.GetReturnValue().Set(Nan::NewBuffer(retVal, 100 *sizeof(char)).ToLocalChecked());
}
where the union is just used as an easy way to reinterpret the bytes. According to the documentation, Nan::NewBuffer assumes ownership of the memory so there is no need to free the memory manually. However when I run my node code that uses this function, my memory skyrockets, even when I force the garbage collector to run via global.gc(); The node code to produce the error is extremely simple:
const addon = require("addon");
for (let i = 0; i < 100000000; i++) {
if(i % (1000000) === 0){
console.log(i);
try {
global.gc();
} catch (e) {
console.log("error garbage collecting");
process.exit();
}
}
const buf = addon.Test();
}
Any help would be appreciated.

After much experimentation and research, I discovered this postenter link description here which basically states that the promise to free the memory that is passed into Nan::NewBuffer is just a lie. Using Nan::CopyBuffer instead of Nan::NewBuffer solves the problem at the cost of a memcpy. So essentially, the answer is that Nan::NewBuffer is broken and you shouldn't use it. Use Nan::CopyBuffer instead.

In IT, we tend to call a lie in the documentation a bug.
This question has been the subject of a Node.js issue - https://github.com/nodejs/node/issues/40936
The answer is that the deallocation of the underlying buffer happens in the C++ equivalent of a JS SetImmediate handler and if your code is completely synchronous - which is your case - the buffers won't be freed until the program end.
You correctly found that Nan::CopyBuffer does not suffer from this problem.
Alas, this cannot be easily fixed, as Node.js does not know how this buffer was allocated and cannot call its callback from the garbage collector context.
If you are using Nan::NewBuffer I also suggest to look at this issue: https://github.com/nodejs/node/issues/40926 in which another closely related pitfall is discussed.

Related

Why this code is not causing memory leak?

I wanted to simulate memory leak in my application. I write following code, and tried to see in perfmon.
int main()
{
int *i;
while(1)
{
i = (int *) malloc(1000);
//just to avoid lazy allocation
*i = 100;
if(i == NULL)
{
printf("Memory Not Allocated\n");
}
Sleep(1000);
}
}
When I see used memory in Task Manager, it is fluctuate between 52K and 136K, but not going beyond that. Means, somethings it shows 52K and sometimes 136K, I do not understand how this code once goes to 136K, coming back to 52K, and not going beyond that.
I tried to use perfmon, but not able to exactly what to see in perfmon, snapshot of counters,
Please suggest how to simulate memory leak and how to detect it.
While an OS may defer actual allocation of dynamically allocated memory until it is used, the compiler optimizer may eliminate allocations that are only written to, and never read from. Because your writes have no well defined observable behaviour (you never read from it), the compiler may well optimize it away. I would suggest examing the generated assembly code to see what the compiler is actually generating. Really, this ought be one of the first steps in answering questions like "why doesn't this code behave like I think it should?".
Strictly, a memory leak is a bit context dependent: something in your program keeps allocating memory over time and not freeing it, when it should have been freed.
You code produces a "leak" on each subsequent pass through the while loop, because your program loses knowledge of a previously allocated pointer at that point. This is only visible by inspection however in this case; from the code posted it looks more like you are actually doing, albeit very slowly, is attempting to create a memory stress situation.
To 'find' a leak without inspection you need to run a tool like valgrind (Unix/Linux/OSX) or in Visual Studio enable allocation tracing with the DEBUG_NEW macro and view the output using a debugger.
If you really want to stress memory in a hurry, allocate 1024 x 1024 x 1024 bytes at a time...

Unexpected Behaviour from tcmalloc

I have been using tcmalloc for a few months in a large project, and so far I must say that I am pretty happy about it, most of all for its HeapProfiling features which allowed to track memory leaks and remove them.
In the past couple of weeks though we experienced random crashes in our application, and we could not find the source of the random crash. In a very particular situation, When the application crashed, we found ourselves with a completely corrupted stack for one of the application threads. Several times instead I found that threads were stuck in tcmalloc::PageHeap::AllocLarge(), but since I dont have debug symbols of tcmalloc linked, I could not understand what the issue was.
After nearly one week of investigation, today I tried the most simple of things: removed tcmalloc from linkage to avoid using it, just to see what happened. Well... I finally found out what the problem was, and the offending code looks very much like this:
void AllocatingFunction()
{
Object object_on_stack;
ProcessObject(&object_on_stack);
}
void ProcessObject(Object* object)
{
...
// Do Whatever
...
delete object;
}
Using libc the application still crashed but I finally saw that I was calling delete on an object which was allocated on the stack.
What I still cant figure out is why tcmalloc instead kept the application running regardless of this very risky (if not utterly wrong) object deallocation, and the double deallocation when object_on_stack goes out of scope when AllocatingFunction ends. The fact is that the offending code could be called repeatedly without any hint of the underlying abomination.
I know that memory deallocation is one of those "undefined behaviour" when not used properly, but my surprise is such a different behaviour between "standard" libc and tcmalloc.
Does anyone have some sort of explanation of insight, on why tcmalloc keeps the application running?
Thanks in advance :)
Have a Nice Day
very risky (if not utterly wrong) object deallocation
well, I disagree here, it is utterly wrong, and since you invoke UB, anything can happen.
It very much depends on what the tcmalloc code acutally does on deallocation, and how it uses the (possibly garbage) data around the stack at that location.
I have seen tcmalloc crash on such occasions too, as well as glibc going into an infinite loop. What you see is just coincidence.
Firstly, there was no double free in your case. When object_on_stack goes out of scope there is no free call, just the stack pointer is decreased (or rather increased, as stack grows down...).
Secondly, during delete TcMalloc should be able to recognize that the address from a stack does not belong to the program heap. Here is a part of free(ptr) implementation:
const PageID p = reinterpret_cast<uintptr_t>(ptr) >> kPageShift;
Span* span = NULL;
size_t cl = Static::pageheap()->GetSizeClassIfCached(p);
if (cl == 0) {
span = Static::pageheap()->GetDescriptor(p);
if (!span) {
// span can be NULL because the pointer passed in is invalid
// (not something returned by malloc or friends), or because the
// pointer was allocated with some other allocator besides
// tcmalloc. The latter can happen if tcmalloc is linked in via
// a dynamic library, but is not listed last on the link line.
// In that case, libraries after it on the link line will
// allocate with libc malloc, but free with tcmalloc's free.
(*invalid_free_fn)(ptr); // Decide how to handle the bad free request
return;
}
Call to invalid_free_fn crashes.

Howto debug double deletes in C++?

I'm maintaining a legacy application written in C++. It crashes every now and then and Valgrind tells me its a double delete of some object.
What are the best ways to find the bug that is causing a double delete in an application you don't fully understand and which is too large to be rewritten ?
Please share your best tips and tricks!
Here's some general suggestion's that have helped me in that situation:
Turn your logging level up to full debug, if you are using a logger. Look for suspicious stuff in the output. If your app doesn't log pointer allocations and deletes of the object/class under suspicion, it's time to insert some cout << "class Foo constructed, ptr= " << this << endl; statements in your code (and corresponding delete/destructor prints).
Run valgrind with --db-attach=yes. I've found this very handy, if a bit tedious. Valgrind will show you a stack trace every time it detects a significant memory error or event and then ask you if you want to debug it. You may find yourself repeatedly pressing 'n' many many times if your app is large, but keep looking for the line of code where the object in question is first (and secondly) deleted.
Just scour the code. Look for construction/deletion of the object in question. Sadly, sometimes it winds up being in a 3rd party library :-(.
Update: Just found this out recently: Apparently gcc 4.8 and later (if you can use GCC on your system) has some new built-in features for detecting memory errors, the "address sanitizer". Also available in the LLVM compiler system.
Yep. What #OliCharlesworth said. There's no surefire way of testing a pointer to see if it points to allocated memory, since it really is just the memory location itself.
The biggest problem your question implies is the lack of reproducability. Continuing with that in mind, you're stuck with changing simple 'delete' constructs to delete foo;foo = NULL;.
Even then the best case scenario is "it seems to occur less" until you've really stamped it down.
I'd also ask by what evidence Valgrind suggests it's a double-delete problem. Might be a better clue lingering around in there.
It's one of the simpler truly nasty problems.
This may or may not work for you.
Long time ago I was working on 1M+ lines program that was 15 years old at the time. Faced with the exact same problem - double delete with huge data set. With such data any out of the box "memory profiler" would be a no go.
Things that were on my side:
It was very reproducible - we had macro language and running same script exactly the same way reproduced it every time
Sometime during the history of the project someone decided that "#define malloc my_malloc" and "#define free my_free" had some use. These didn't do much more than call built-in malloc() and free() but project already compiled and worked this way.
Now the trick/idea:
my_malloc(int size)
{
static int allocation_num = 0; // it was single threaded
void* p = builtin_malloc(size+16);
*(int*)p = ++allocation_num;
*((char*)p+sizeof(int)) = 0; // not freed
return (char*)p+16; // check for NULL in order here
}
my_free(void* p)
{
if (*((char*)p+sizeof(int)))
{
// this is double free, check allocation_number
// then rerun app with this in my_alloc
// if (alloc_num == XXX) debug_break();
}
*((char*)p+sizeof(int)) = 1; // freed
//built_in_free((char*)p-16); // do not do this until problem is figured out
}
With new/delete it might be trickier, but still with LD_PRELOAD you might be able to replace malloc/free without even recompiling your app.
you are probably upgrading from a version that treated delete differently then the new version.
probably what the previous version did was when delete was called it did a static check for if (X != NULL){ delete X; X = NULL;} and then in the new version it just does the delete action.
you might need to go through and check for pointer assignments, and tracking references of object names from construction to deletion.
I've found this useful: backtrace() on linux. (You have to compile with -rdynamic.) This lets you find out where that double free is coming from by putting a try/catch block around all memory operations (new/delete) then in the catch block, print out your stack trace.
This way you can narrow down the suspects much faster than running valgrind.
I wrapped backtrace in a handy little class so that I can just say:
try {
...
} catch (...) {
StackTrace trace;
std::cerr << "Double free!!!\n" << trace << std::endl;
throw;
}
On Windows, assuming the app is built with MSVC++, you can take advantage of the extensive heap debugging tools built into the debug version of the standard library.
Also on Windows, you can use Application Verifier. If I recall correctly, it has a mode the forces each allocation onto a separate page with protected guard pages in between. It's very effective at finding buffer overruns, but I suspect it would also be useful for a double-free situation.
Another thing you could do (on any platform) would be to make a copy of the sources that are transformed (perhaps with macros) so that every instance of:
delete foo;
is replaced with:
{ delete foo; foo = nullptr; }
(The braces help in many cases, though it's not perfect.) That will turn many instances of double-free into a null pointer reference, making it much easier to detect. It doesn't catch everything; you might have a copy of a stale pointer, but it can help squash a lot of the common use-after-delete scenarios.

How can I get the size of a memory block allocated using malloc()? [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
How can I get the size of an array from a pointer in C?
Is there any way to determine the size of a C++ array programmatically? And if not, why?
I get a pointer to a chunk of allocated memory out of a C style function.
Now, it would be really interesting for debugging purposes to know how
big the allocated memory block that this pointer points is.
Is there anything more elegant than provoking an exception by blindly running over its boundaries?
Thanks in advance,
Andreas
EDIT:
I use VC++2005 on Windows, and GCC 4.3 on Linux
EDIT2:
I have _msize under VC++2005
Unfortunately it results in an exception in debug mode....
EDIT3:
Well. I have tried the way I described above with the exception, and it works.
At least while I am debugging and ensuring that immediately after the call
to the library exits I run over the buffer boundaries. Works like a charm.
It just isn't elegant and in no way usable in production code.
It's not standard but if your library has a msize() function that will give you the size.
A common solution is to wrap malloc with your own function that logs each request along with the size and resulting memory range, in the release build you can switch back to the 'real' malloc.
If you don't mind sleazy violence for the sake of debugging, you can #define macros to hook calls to malloc and free and pad the first 4 bytes with the size.
To the tune of
void *malloc_hook(size_t size) {
size += sizeof (size_t);
void *ptr = malloc(size);
*(size_t *) ptr = size;
return ((size_t *) ptr) + 1;
}
void free_hook (void *ptr) {
ptr = (void *) (((size_t *) ptr) - 1);
free(ptr);
}
size_t report_size(ptr) {
return * (((size_t *) ptr) - 1);
}
then
#define malloc(x) malloc_hook(x)
and so on
The C runtime library does not provide such a function. Furthermore, deliberately provoking an exception will not tell you how big the block is either.
Usually the way this problem is solved in C is to maintain a separate variable which keeps track of the size of the allocated block. Of course, this is sometimes inconvenient but there's generally no other way to know.
Your C runtime library may provide some heap debug functions that can query allocated blocks (after all, free() needs to know how big the block is), but any of this sort of thing will be nonportable.
With gcc and the GNU linker, you can easily wrap malloc
#include <stdlib.h>
#include <stdio.h>
void* __real_malloc(size_t sz);
void* __wrap_malloc(size_t sz)
{
void *ptr;
ptr = __real_malloc(sz);
fprintf(stderr, "malloc of size %d yields pointer %p\n", sz, ptr);
/* if you wish to save the pointer and the size to a data structure,
then remember to add wrap code for calloc, realloc and free */
return ptr;
}
int main()
{
char *x;
x = malloc(103);
return 0;
}
and compile with
gcc a.c -o a -Wall -Werror -Wl,--wrap=malloc
(Of course, this will also work with c++ code compiled with g++, and with the new operator (through it's mangled name) if you wish.)
In effect, the statically/dynamically loaded library will also use your __wrap_malloc.
No, and you can't rely on an exception when overrunning its boundaries, unless it's in your implementation's documentation. It's part of the stuff you really don't need to know about to write programs. Dig into your compiler's documentation or source code if you really want to know.
There is no standard C function to do this. Depending on your platform, there may be a non-portable method - what OS and C library are you using?
Note that provoking an exception is unreliable - there may be other allocations immediately after the chunk you have, and so you might not get an exception until long after you exceed the limits of your current chunk.
Memory checkers like Valgrind's memcheck and Google's TCMalloc (the heap checker part) keep track of this sort of thing.
You can use TCMalloc to dump a heap profile that shows where things got allocated, or you can just have it check to make sure your heap is the same at two points in program execution using SameHeap().
Partial solution: on Windows you can use the PageHeap to catch a memory access outside the allocated block.
PageHeap is an alternate memory manager present in the Windows kernel (in the NT varieties but nobody should be using any other version nowadays). It takes every allocation in a process and returns a memory block that has its end aligned with the end of a memory page, then it makes the following page unaccessible (no read, no write access). If the program tries to read or write past the end of the block, you'll get an access violation you can catch with your favorite debugger.
How to get it: Download and install the package Debugging Tools for Windows from Microsoft: http://www.microsoft.com/whdc/devtools/debugging/default.mspx
then launch the GFlags utility, go to the 3rd tab and enter the name of your executable, then Hit the key. Check the PageHeap checkbox, click OK and you're good to go.
The last thing: when you're done with debugging, don't ever forget to launch GFlags again, and disable PageHeap for the application. GFlags enters this setting into the Registry (under HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\), so it is persistent, even across reboots.
Also, be aware that using PageHeap can increase the memory needs of your application tremendously.
The way to do what you want is to BE the allocator. If you filter all requests, and then record them for debugging purposes, then you can find out what you want when the memory is free'd.
Additionally, you can check at the end of the program to see if all allocated blocks were freed, and if not, list them. An ambitious library of this sort could even take FUNCTION and LINE parameters via a macro to let you know exactly where you are leaking memory.
Finally, Microsoft's MSVCRT provides a a debuggable heap that has many useful tools that you can use in your debug version to find memory problems: http://msdn.microsoft.com/en-us/library/bebs9zyz.aspx
On Linux, you can use valgrind to find many errors. http://valgrind.org/

Most efficient replacement for IsBadReadPtr?

I have some Visual C++ code that receives a pointer to a buffer with data that needs to be processed by my code and the length of that buffer. Due to a bug outside my control, sometimes this pointer comes into my code uninitialized or otherwise unsuitable for reading (i.e. it causes a crash when I try to access the data in the buffer.)
So, I need to verify this pointer before I use it. I don't want to use IsBadReadPtr or IsBadWritePtr because everyone agrees that they're buggy. (Google them for examples.) They're also not thread-safe -- that's probably not a concern in this case, though a thread-safe solution would be nice.
I've seen suggestions on the net of accomplishing this by using VirtualQuery, or by just doing a memcpy inside an exception handler. However, the code where this check needs to be done is time sensitive so I need the most efficient check possible that is also 100% effective. Any ideas would be appreciated.
Just to be clear: I know that the best practice would be to just read the bad pointer, let it cause an exception, then trace that back to the source and fix the actual problem. However, in this case the bad pointers are coming from Microsoft code that I don't have control over so I have to verify them.
Note also that I don't care if the data pointed at is valid. My code is looking for specific data patterns and will ignore the data if it doesn't find them. I'm just trying to prevent the crash that occurs when running memcpy on this data, and handling the exception at the point memcpy is attempted would require changing a dozen places in legacy code (but if I had something like IsBadReadPtr to call I would only have to change code in one place).
bool IsBadReadPtr(void* p)
{
MEMORY_BASIC_INFORMATION mbi = {0};
if (::VirtualQuery(p, &mbi, sizeof(mbi)))
{
DWORD mask = (PAGE_READONLY|PAGE_READWRITE|PAGE_WRITECOPY|PAGE_EXECUTE_READ|PAGE_EXECUTE_READWRITE|PAGE_EXECUTE_WRITECOPY);
bool b = !(mbi.Protect & mask);
// check the page is not a guard page
if (mbi.Protect & (PAGE_GUARD|PAGE_NOACCESS)) b = true;
return b;
}
return true;
}
a thread-safe solution would be nice
I'm guessing it's only IsBadWritePtr that isn't thread-safe.
just doing a memcpy inside an exception handler
This is effectively what IsBadReadPtr is doing ... and if you did it in your code, then your code would have the same bug as the IsBadReadPtr implementation: http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx
--Edit:--
The only problem with IsBadReadPtr that I've read about is that the bad pointer might be pointing to (and so you might accidentally touch) a stack's guard page. Perhaps you could avoid this problem (and therefore use IsBadReadPtr safely), by:
Know what threads are running in your process
Know where the threads' stacks are, and how big they are
Walk down each stack, delberately touching each page of the stack at least once, before you begin to call isBadReadPtr
Also, the some of the comments associated with the URL above also suggest using VirtualQuery.
The reason these functions are bad to use is that the problem can't be solved reliably.
What if the function you're calling returns a pointer to memory that is allocated, so it looks valid, but it's pointing to other, unrelated data, and will corrupt your application if you use it.
Most likely, the function you're calling actually behaves correctly, and you are misusing it. (Not guaranteed, but that is often the case.)
Which function is it?
The fastest solution I can think of is to consult the virtual memory manager using VirtualQuery to see if there is a readable page at the given address, and cache the results (however any caching will reduce the accuracy of the check).
Example (without caching):
BOOL CanRead(LPVOID p)
{
MEMORY_BASIC_INFORMATION mbi;
mbi.Protect = 0;
::VirtualQuery(((LPCSTR)p) + len - 1, &mbi, sizeof(mbi));
return ((mbi.Protect & 0xE6) != 0 && (mbi.Protect & PAGE_GUARD) == 0);
}
Why can't you call the api
AfxIsValidAddress((p), sizeof(type), FALSE));
If the variable is uninitialized you are hosed. Sooner or later it's going to be an address for something you don't want to play with (like your own stack).
If you think you need this, and (uintptr_t)var < 65536 does not suffice (Windows does not allow allocating the bottom 64k), there is no real solution. VirtualQuery, etc. appear to "work" but sooner or later will burn you.
I am afraid you are out of luck - there is no way to reliably check the validity of a pointer. What Microsoft code is giving you bad pointers?
Any implementation of checking the validity of memory is subject to the same constriants that make IsBadReadPtr fail. Can you post an example callstack of where you want to check the validity of memory of a pointer passed to you from Windows? That might help other people (including me) diagnose why you need to do this in the first place.
If you have to resort to checking patterns in data, here are a few tips:
If you mention using IsBadReadPtr, you are probably developing for Windows x86 or x64.
You may be able to range check the pointer. Pointers to objects will be word aligned. In 32-bit windows, user-space pointers are in the range of 0x00401000-0x7FFFFFFF, or for large-address-aware applications, 0x00401000-0xBFFFFFFF instead (edit: 0x00401000-0xFFFF0000 for a 32-bit program on 64-bit windows). The upper 2GB/1GB is reserved for kernel-space pointers.
The object itself will live in Read/Write memory which is not executable. It may live in the heap, or it may be a global variable. If it is a global variable, you can validate that it lives in the correct module.
If your object has a VTable, and you are not using other classes, compare its VTable pointer with another VTable pointer from a known good object.
Range check the variables to see if they are possibly valid. For example, bools can only be 1 or 0, so if you see one with a value of 242, that's obviously wrong. Pointers can also be range checked and checked for alignment as well.
If there are objects contained within, check their VTables and data as well.
If there are pointers to other objects, you can check that the object lives in memory that is Read/Write and not executable, check the VTable if applicable, and range check the data as well.
If you do not have a good object with a known VTable address, you can use these rules to check if a VTable is valid:
While the object lives in Read/Write memory, and the VTable pointer is part of the object, the VTable itself will live in memory that is Read Only and not executable, and will be aligned to a word boundary. It will also belong to the module.
The entries of the VTable are pointers to code, which will be Read Only and Executable, and not writable. There is no alignment restrictions for code addresses. Code will belong to the module.
Here is what I use this just replaces the official microsoft ones by using #define's this way you can use the microsoft ones and not worry about them failing you.
// Check memory address access
const DWORD dwForbiddenArea = PAGE_GUARD | PAGE_NOACCESS;
const DWORD dwReadRights = PAGE_READONLY | PAGE_READWRITE | PAGE_WRITECOPY | PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY;
const DWORD dwWriteRights = PAGE_READWRITE | PAGE_WRITECOPY | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY;
template<DWORD dwAccessRights>
bool CheckAccess(void* pAddress, size_t nSize)
{
if (!pAddress || !nSize)
{
return false;
}
MEMORY_BASIC_INFORMATION sMBI;
bool bRet = false;
UINT_PTR pCurrentAddress = UINT_PTR(pAddress);
UINT_PTR pEndAdress = pCurrentAddress + (nSize - 1);
do
{
ZeroMemory(&sMBI, sizeof(sMBI));
VirtualQuery(LPCVOID(pCurrentAddress), &sMBI, sizeof(sMBI));
bRet = (sMBI.State & MEM_COMMIT) // memory allocated and
&& !(sMBI.Protect & dwForbiddenArea) // access to page allowed and
&& (sMBI.Protect & dwAccessRights); // the required rights
pCurrentAddress = (UINT_PTR(sMBI.BaseAddress) + sMBI.RegionSize);
} while (bRet && pCurrentAddress <= pEndAdress);
return bRet;
}
#define IsBadWritePtr(p,n) (!CheckAccess<dwWriteRights>(p,n))
#define IsBadReadPtr(p,n) (!CheckAccess<dwReadRights>(p,n))
#define IsBadStringPtrW(p,n) (!CheckAccess<dwReadRights>(p,n*2))
This approach is based on my understanding of Raymond Chen's blog post, If I'm not supposed to call IsBadXxxPtr, how can I check if a pointer is bad?
This is an old question but this part:
the code where this check needs to be done is time sensitive so I need
the most efficient check possible that is also 100% effective
VirtualQuery() takes a kernel call, so the simple memcpy() in an exception handler will be faster for the case where the memory is okay to read most of the time.
__try
{
memcpy(dest, src, size);
}__except(1){}
All stays in user mode when there is no exception. Maybe a bit slower for the use case where the memory is bad to read more than it is good (since it fires off a exception which is a round trip through the kernel and back).
You could also extend it with a custom memcpy loop and *size so you could return exactly how many bytes were actually read.
if you are using VC++ then I suggest to use microsoft specific keywords __try __except
to and catch HW exceptions