Issues with memory allocation/deallocation when linking with the static C runtime - c++

I came across below note in the book Windows via C-C++ by Jeffrey Richter and Christophe Nasarre.
Examine the following code:
VOID EXEFunc() {
PVOID pv = DLLFunc();
// Access the storage pointed to by pv...
// Assumes that pv is in EXE's C/C++ run-time heap
free(pv);
}
PVOID DLLFunc() {
// Allocate block from DLL's C/C++ run-time heap
return(malloc(100));
}
So, what do you think? Does the preceding code work correctly? Is the block allocated by the
DLL's function freed by the EXE's function? The answer is: maybe. The code shown does not
give you enough information. If both the EXE and the DLL link to the DLL C/C++ run-time
library, the code works just fine. However, if one or both of the modules link to the static C/C++
run-time library, the call to free fails.
I'm not able to understand why the call to free would fail here when the modules link with the static C runtime.
Can someone explain why free fails?
I found a similar question here:
Memory Allocation in Static vs Dynamic Linking of C Runtime
But I have the same doubt as MrPhilTx:
Wouldn't all of the heaps be in the same address space?
Thanks!

When both your DLL and EXE are statically linked to the C runtime, the two runtimes simply don't know about each other. The EXE and the DLL each get their own copy of the runtime, with its own heap and heap metadata. Neither side knows about the other's metadata, so there is no safe way to update that data when you free memory. You end up with inconsistent metadata and things will eventually fail (and if you're very lucky, they will fail right away).
What this means is that you end up with at least two heaps in your process, and each heap has its own rules and metadata. There is no way for the EXE to know exactly how the DLL allocates memory, so there is no way for it to free that memory.
As for why you can get away with sharing a heap when everything is dynamically linked, that's easy: there is only one copy of the C runtime DLL in the process, so if every module links against it they will all be calling the same code with the same metadata.
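To make the "two runtimes, two sets of metadata" point concrete, here is a minimal toy model. ToyRuntime and cross_runtime_free_is_rejected are hypothetical names invented for this sketch; they only imitate the bookkeeping idea and are not how any real CRT is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for one copy of the C runtime: each instance
// keeps its own bookkeeping about the blocks it handed out.
class ToyRuntime {
public:
    void* allocate(std::size_t size) {
        void* p = ::operator new(size);
        live_.insert(p);              // metadata private to this "runtime"
        return p;
    }

    // Returns false when the block is unknown to this runtime. A real CRT
    // usually cannot detect this and may corrupt its heap instead.
    bool release(void* p) {
        if (live_.erase(p) == 0) return false;
        ::operator delete(p);
        return true;
    }

    std::size_t live_blocks() const { return live_.size(); }

private:
    std::set<void*> live_;
};

// Allocates in the "DLL" runtime, then tries to free in the "EXE" runtime.
// Returns true when the foreign free was rejected and the owner's free worked.
inline bool cross_runtime_free_is_rejected() {
    ToyRuntime exe_crt, dll_crt;
    void* pv = dll_crt.allocate(100);
    assert(dll_crt.live_blocks() == 1);
    bool foreign_ok = exe_crt.release(pv);   // the EXE's runtime has no record of pv
    bool owner_ok = dll_crt.release(pv);     // the DLL's runtime does
    return !foreign_ok && owner_ok;
}
```

The toy version can refuse the foreign pointer because it tracks every block in a set. Real heap managers generally skip that check for speed, which is why the failure shows up as corruption rather than a clean error.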

You can't allocate memory from one allocator and free it with another. Different allocators use different internal implementations and the result of giving a memory block to an allocator that didn't allocate it is unpredictable.
So unless you know for a fact that two sections of code are using the same allocator, you can't allocate memory in one section of code and free it in the other. The usual solution is to ensure the same unit both allocates and frees the memory. In your example, the DLL could offer a "free" function that the main code could call into instead of calling its own free function which frees to its own allocator.
So do this instead:
VOID EXEFunc() {
PVOID pv = DLLFunc();
// Access the storage pointed to by pv...
// pv lives in the DLL's C/C++ run-time heap, so hand it back to the DLL
DLLFreeFunc(pv);
}
...
PVOID DLLFunc() {
// Allocate block from DLL's C/C++ run-time heap
return(malloc(100));
}
VOID DLLFreeFunc(PVOID x) {
// Free the block with the same run-time that allocated it
free(x);
}
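If the caller is C++, one way to avoid forgetting the DLLFreeFunc call is to wrap the pointer in a std::unique_ptr with a custom deleter. This is only a sketch: DLLFunc and DLLFreeFunc are defined locally as stand-ins for the DLL's exports, and g_outstanding, DllDeleter, DllBlock and outstanding_after_use are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Stand-ins for the DLL's exports from the example above. In a real build
// these would be implemented inside the DLL and imported by the EXE.
static int g_outstanding = 0;   // illustrative counter, not part of the original code

void* DLLFunc() {
    void* p = std::malloc(100);
    if (p) ++g_outstanding;
    return p;
}

void DLLFreeFunc(void* p) {
    if (p) --g_outstanding;
    std::free(p);
}

// Deleter that routes the release back through the DLL's own free function.
struct DllDeleter {
    void operator()(void* p) const { DLLFreeFunc(p); }
};

using DllBlock = std::unique_ptr<void, DllDeleter>;

// Takes a block from the "DLL", uses it, and reports how many blocks are
// still outstanding once the smart pointer has gone out of scope.
int outstanding_after_use() {
    {
        DllBlock pv(DLLFunc());
        assert(pv != nullptr);
        // Access the storage through pv.get()...
    }   // DLLFreeFunc runs here, never the EXE's own free()
    return g_outstanding;
}
```

The deleter type is part of DllBlock, so every path that destroys the pointer goes through the DLL's function without the caller having to remember it.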

On Linux, a program uses the brk and sbrk system calls to request extra data pages from the kernel. sbrk returns an address pointing to the data segment that can be used by your program.
malloc and free use the data segment returned by brk and sbrk by turning it into a heap. The heap is a large block of memory in the current process's address space from which small blocks of memory can be requested and returned as required. It is important to note that many calls to malloc and free will make no system calls.
Now when malloc and free want to make use of the heap they need to get a pointer to the heap. This pointer is stored in a separate data segment called static data, which is allocated when the application loads. To ensure that different DLLs (or shared libraries on Linux) do not clash with each other, each DLL has its own static data section.
Now let us assume that both the DLL and the executable are statically linked to their own libraries. In such a case the DLL and the executable will have pointers to different heaps, and in that event both the DLL and the executable must free their own memory.
However, on Linux both the DLL and the executable will access malloc and free through a common shared library (libc.so). In such a case, since both the DLL and the executable are effectively accessing libc's heap, the executable can safely free memory allocated by the DLL.
In any event it is good practice for the dll to provide its own free function. This if nothing else documents that the pointer returned by DLLFunc needs to be freed.
I imagine this is true on Windows as well.

The code critically depends on the implementation of malloc and free. A good implementation has no issues; a poor one will indeed fail. It's definitely easier to create a working DLL implementation of malloc and free, but it's far from impossible to do so in a static library.
A trivial example would be a static library which forwards the calls directly to GlobalAlloc and GlobalFree.

Related

LocalAlloc Vs GlobalAlloc Vs malloc Vs new

I have searched for this on various links, but the doubt still persists.
I do not understand the difference between LocalAlloc vs GlobalAlloc vs malloc vs new for memory allocation.
I have gone through this link of MSDN:
Comparing Memory Allocation Methods
Please explain the following statement:
The malloc function has the disadvantage of being run-time dependent. The new operator has the disadvantage of being compiler dependent and language dependent
Excerpts from Raymond Chen's OldNewThing
Back in the days of 16-bit Windows, the difference was significant.
In 16-bit Windows, memory was accessed through values called
“selectors”, each of which could address up to 64K. There was a
default selector called the “data selector”; operations on so-called
“near pointers” were performed relative to the data selector. For
example, if you had a near pointer p whose value was 0x1234 and your
data selector was 0x012F, then when you wrote *p, you were accessing
the memory at 012F:1234. (When you declared a pointer, it was near by
default. You had to say FAR explicitly if you wanted a far pointer.)
Important: Near pointers are always relative to a selector, usually
the data selector.
The GlobalAlloc function allocated a selector that could be used to
access the amount of memory you requested. You could access the memory
in that selector with a “far pointer”. A “far pointer” is a selector
combined with a near pointer. (Remember that a near pointer is
relative to a selector; when you combine the near pointer with an
appropriate selector, you get a far pointer.)
Every instance of a program and DLL got its own data selector, known
as the HINSTANCE. Therefore, if you had a near pointer p and accessed
it via *p from a program executable, it accessed memory relative to
the program instance’s HINSTANCE. If you accessed it from a DLL, you
got memory relative to your DLL’s HINSTANCE.
Therefore, in 16-bit Windows, the LocalAlloc and GlobalAlloc
functions were completely different! LocalAlloc returned a near
pointer, whereas GlobalAlloc returned a selector.
Pointers that you intended to pass between modules had to be in the
form of “far pointers” because each module has a different default
selector. If you wanted to transfer ownership of memory to another
module, you had to use GlobalAlloc since that permitted the recipient
to call GlobalFree to free it.
Even in Win32, you have to be careful not to confuse the local heap
from the global heap. Memory allocated from one cannot be freed on the
other. All the weirdness about near and far pointers disappeared with
the transition to Win32. But the local heap functions and the global
heap functions are nevertheless two distinct heap interfaces.
Also, the link specified by you clearly says that,
Starting with 32-bit Windows, GlobalAlloc and LocalAlloc are
implemented as wrapper functions that call HeapAlloc using a handle to
the process's default heap, and HeapAlloc can be instructed to raise
an exception if memory could not be allocated, a capability not
available with LocalAlloc.
For your confusion on malloc vs new, Billy ONeal's answer summarizes that pretty clearly.
For the difference between malloc and HeapAlloc,
David Heffernan's and Luis Miguel Huapaya's answers combined give the perfect solution:
malloc is portable, part of the standard. malloc (and other C runtime heap functions) are module dependent, which means that if you call malloc in code from one module (i.e. a DLL), then you should call free within code of the same module or you could suffer some pretty bad heap corruption.
HeapAlloc is not portable, it's a Windows API function. Using HeapAlloc with GetProcessHeap instead of malloc, including overloading new and delete operators to make use of such, allow you to pass dynamically allocated objects between modules and not have to worry about memory corruption if memory is allocated in code of one module and freed in code of another module once the pointer to a block of memory has been passed across to an external module.
GlobalAlloc and LocalAlloc are old functions from the 16 bit era. The difference was that you sometimes had to be able to allocate memory only used in your segment (that used near pointers), and sometimes needed to allocate memory to be shared with other processes and segments on the system. Today, these guys forward in some form or another to the HeapXxx functions, such as HeapAlloc. If you're writing new code and need to avoid linking with the C runtime, you should use the HeapXxx functions instead. Of course, if you call any of these, your program will only compile and run on Windows.
malloc is "run-time dependent" in that using it requires that you link against the C run-time (CRT). The CRT is the library that contains all the other standard C library functions, like printf or qsort. You can write a plain Win32 API program without linking with this (but I honestly can't see why you'd want to do that in real software).
new is compiler dependent and language dependent in that it requires a compiler that can compile C++. (And usually new is implemented in terms of malloc, so it'll probably require using the CRT as well.)
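The suggestion above about overloading new and delete to use HeapAlloc with GetProcessHeap could look roughly like the following. This is a sketch under assumptions: shared_alloc, shared_free and Message are hypothetical names, and the non-Windows branch falls back to malloc/free only so the example builds on other platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#ifdef _WIN32
#include <windows.h>
#else
#include <cstdlib>
#endif

// Allocate from a heap that every module in the process can reach.
inline void* shared_alloc(std::size_t size) {
#ifdef _WIN32
    return HeapAlloc(GetProcessHeap(), 0, size);
#else
    return std::malloc(size);   // fallback so the sketch compiles elsewhere
#endif
}

inline void shared_free(void* p) {
    if (!p) return;
#ifdef _WIN32
    HeapFree(GetProcessHeap(), 0, p);
#else
    std::free(p);
#endif
}

// Hypothetical type whose instances are passed between modules.
struct Message {
    int id = 0;

    // Class-level overloads: every Message is carved out of the process heap,
    // regardless of which module executes the new or delete expression.
    static void* operator new(std::size_t size) {
        if (void* p = shared_alloc(size)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) noexcept { shared_free(p); }
};
```

This only covers the object's own storage. Members that allocate internally (a std::string, for instance) would still use whatever allocator their module's runtime provides.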

Assertion: The pointer MUST come from the 'local' heap

I am testing a little sound library called clunk (http://sourceforge.net/projects/clunk/).
I built that library for visual studio 11 and linked it in my visual studio project. When I try the test.cpp I am getting an assertion thrown by msvcr110d.dll.
Does it have to do with my runtime library setting? It is "Multithreaded-Debug-DLL (/MDd)".
In cmakelist.txt in clunk I added following line of code:
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /MDd")
I am still getting the message that there are problems with pointer allocation. Why is that?
You're probably allocating memory on one side of an application/library boundary and freeing it on the other. That's difficult to get right and likely best avoided.
You must ensure that memory is returned to the very same allocator that allocated it. Here are a few patterns to avoid this problem:
Instead of the library allocating memory for a returned structure, have the application do it. Then the application can free the structure.
Let the library allocate memory for a structure, but instead of the application freeing it, have the application call a special free function. So if there's a 'getFoo' function in the library that returns an allocated structure, have a 'freeFoo' function that releases that structure. This ensures the library returns the structure to its own allocator.
Have the library use statically allocated structures that are valid until some particular next call into the library.
Give the library a 'setAllocator' function and pass it a pointer to malloc and free from the application. This way, the library will always use the application's allocator.
Give the library a getAllocator function that returns pointers to the malloc and free functions the library is using. This way, the application can get memory from the library's allocator (for the library to possibly free) or return memory to the library's allocator (that the library allocated).
Take a look at the code that's generating the assertion and see if it can be modified to use one of these patterns. It's possible, for example, that you're just calling delete on a pointer to an object you got from the library when you should be using a special destructor function provided by the library.
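As a rough illustration of the 'setAllocator' and 'freeFoo' patterns from the list above, here is a sketch with everything in one file. The names Foo, getFoo, freeFoo, setAllocator, appAlloc and appFree are hypothetical and are not part of clunk's API; in a real project the library half would live in the library and the application half in the executable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// --- library side (hypothetical names) ---
using AllocFn = void* (*)(std::size_t);
using FreeFn  = void (*)(void*);

static AllocFn g_alloc = std::malloc;   // defaults to the library's own runtime
static FreeFn  g_free  = std::free;

// Lets the application install its allocator before any allocation happens.
void setAllocator(AllocFn a, FreeFn f) {
    g_alloc = a;
    g_free = f;
}

struct Foo {
    char name[16];
};

Foo* getFoo() {
    Foo* foo = static_cast<Foo*>(g_alloc(sizeof(Foo)));
    if (foo) std::strcpy(foo->name, "example");
    return foo;
}

// Releases through the same allocator that getFoo used.
void freeFoo(Foo* foo) { g_free(foo); }

// --- application side ---
static int g_app_live = 0;   // counts blocks handed out by the application's allocator

void* appAlloc(std::size_t n) {
    void* p = std::malloc(n);
    if (p) ++g_app_live;
    return p;
}

void appFree(void* p) {
    if (p) --g_app_live;
    std::free(p);
}
```

The application would call setAllocator(appAlloc, appFree) once at startup. Switching allocators while blocks are still outstanding would reintroduce the mismatch, so this sketch assumes the call happens before the first getFoo.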

When exactly would a DLL use a different heap than the executable?

I know that if your DLL static links against a different version of the runtime then it creates its own heap. It will also if it is instructed to make a heap. Under these circumstances, it is unsafe for the DLL to delete what the exe allocated. In what cases does this NOT apply (as in, it is safe for the DLL to delete what the exe allocated)? Is it safe if both the exe and the DLL static link against the same runtime library?
Thanks
basically is there a way where whoever allocates it could just do addEvent(new DerivedEvent(), FunctorDestroyClass());
I may be reading more into your question than is there, but if you are wanting to know how you can allocate and free memory across DLL boundaries, then you might use something like the following:
#define DLLMemAlloc( size ) HeapAlloc( GetProcessHeap(), 0, size )
#define DLLMemFree( mem ) HeapFree( GetProcessHeap(), 0, mem )
That might be safer (a partial attempt at future-proofing). Relying on various build options to guarantee the safety of allocating and freeing across boundaries might lead to problems.
And (also not part of the question), you might re-think whether it is really necessary to be able to do this. It seems like there may be a design flaw if one DLL has to allocate something that another DLL (or executable) has to free.
A DLL will get its own memory manager if you link the run-time library statically. You have three options: link the run-time dynamically, always allocate and deallocate in the same place (either DLL or executable, providing forwarding if necessary), or use a third-party memory allocator that takes care of this problem.
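For the addEvent(new DerivedEvent(), FunctorDestroyClass()) idea, one possible shape is a std::shared_ptr constructed with a deleter at the allocation site, so the delete is compiled into the allocating module. This is a sketch, not a guarantee: Event, DerivedEvent and EventQueue are hypothetical, and passing a std::shared_ptr across a DLL boundary assumes both modules are built with compatible compilers and standard library versions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Event {
    virtual ~Event() = default;
};

static int g_destroyed = 0;   // illustrative counter

struct DerivedEvent : Event {
    ~DerivedEvent() override { ++g_destroyed; }
};

// Deleter supplied by whoever allocates the event. Its operator() is
// compiled into the allocating module, so delete pairs with that module's new.
struct FunctorDestroyClass {
    void operator()(Event* e) const { delete e; }
};

// Receiving side: it stores events but never calls delete itself.
class EventQueue {
public:
    void addEvent(std::shared_ptr<Event> e) { events_.push_back(std::move(e)); }
    void clear() { events_.clear(); }   // dropping the last reference runs the stored deleter
    std::size_t pending() const { return events_.size(); }

private:
    std::vector<std::shared_ptr<Event>> events_;
};
```

A call would then read queue.addEvent(std::shared_ptr<Event>(new DerivedEvent(), FunctorDestroyClass()));, which stays close to the syntax asked about.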

Is it safe to allocate memory for buffers on external dll and use it on main application?

As in the topic: I've found something like this in one application. In the main C application we have the declaration:
void* buff = NULL;
and later there is a call:
ReadData(&buff);
SaveToFile(buff);
SaveToFile() is a C function from the main application.
ReadData(void* * ) is a C++ function from an external DLL. In this function, memory for the buffer is allocated by malloc and filled with data.
So here is my question: is it correct?
All modules in a running process share the same address space (it doesn't matter whether you're on Windows or Linux or whatever; it's a common principle). However, be careful: reading or writing from module A to a buffer owned by module B is fine, but freeing the buffer is probably bad.
On Windows, it depends on the runtime library the application is linked against. If it is not the DLL runtime ('multithreaded dll'), every module maintains its own copy of the heap manager. Thus, the module that allocated a memory area must also be responsible for destroying it because only its own heap manager knows about it. If you follow this guideline, you won't run into problems (linking against the DLL runtime avoids the problem because all modules deal with the same heap manager residing somewhere in msvXXXnnn.dll, but gives rise to other issues).
Edit:
ReadData(void* * ) is a C++ function from an external DLL. In this function, memory for the buffer is allocated by malloc and filled with data.
That might run into the aforementioned allocator issue. Either add another function to that DLL (FreeData), as proposed by Neil Butterworth, which is explicitly responsible for freeing the buffer and just calls the DLL's own free(), or add a DLL function to query the size of the buffer, allocate it up front and pass it to ReadData (that's the cleanest choice imo).
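The "query the size, allocate up front, pass the buffer in" option might look like this. The signatures are assumptions made for the sketch: the original ReadData takes a void**, whereas GetDataSize, ReadDataInto, readAll and the kPayload contents are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// --- DLL side (hypothetical signatures) ---
static const char kPayload[] = "sample data";   // placeholder for the real data source

// Tells the caller how many bytes it needs to provide.
std::size_t GetDataSize() { return sizeof(kPayload); }

// Copies the data into the caller's buffer and returns the number of bytes
// written, or 0 if the buffer is missing or too small.
std::size_t ReadDataInto(void* buffer, std::size_t capacity) {
    if (buffer == nullptr || capacity < sizeof(kPayload)) return 0;
    std::memcpy(buffer, kPayload, sizeof(kPayload));
    return sizeof(kPayload);
}

// --- application side ---
// The application owns the buffer from start to finish, so no allocation
// ever crosses the module boundary.
std::vector<char> readAll() {
    std::vector<char> buff(GetDataSize());
    std::size_t written = ReadDataInto(buff.data(), buff.size());
    buff.resize(written);
    return buff;
}
```

With this layout the DLL never calls malloc on the caller's behalf, so the question of which runtime frees the buffer does not arise.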
If both the DLL and the main executable were linked with the same C runtime, this is OK and you can call free() on the pointer to release it. However, a better idea is in the DLL to provide a function FreeData( void * ) which releases the data. In this way all memory management is done in the context of the DLL.
It's safe. However, you should always check:
if the same allocator is used for both allocation and deallocation
who is responsible for freeing (so there are no surprises)
whether any kind of automatic memory management is involved (if it's plain C/C++ then it's no problem).
It depends on the intention of the design and users the library is directed to. A better way is to take a buffer of some fixed size and fill it and return. But, you should be careful while freeing the buffer. It is better to call the free function (if any) provided by the third party DLL itself rather than calling free from your main.
In the case of Windows, if your third-party DLL is using a different heap than your application, it might lead to undefined behaviour. For example, if your third-party DLL is built using VC8 and your application is built using VC6, then freeing memory allocated by the external DLL will lead to problems.
Yes, this is correct. Memory in a process is equally accessible by all modules (EXE and DLLs) in that process.
Yes, there is no problem with this.

Does a memory leak at unload of a DLL cause a leak in the host process?

Consider this case:
dll = LoadDLL()
dll->do()
...
void do() {
char *a = malloc(1024);
}
...
UnloadDLL(dll);
At this point, will the 1k allocated in the call to malloc() be available to the host process again?
The DLL is statically linking to the CRT.
Memory used by a process as tracked by the OS is applicable to the full process and not specific to a DLL.
Memory is given to the program by the OS in chunks called heaps.
The heap managers (malloc / new etc) further divide up the chunks and hand them out to requesting code.
Only when a new heap is allocated does the OS detect an increase in memory.
When a DLL is statically linked to the C run-time library (CRT), a private copy of the CRT functions that the DLL's code invokes is compiled into the DLL's binary. malloc is also included in this.
This private copy of malloc will be invoked whenever the code present inside the statically linked DLL tries to allocate memory.
Consequently, this malloc acquires from the OS a private heap visible only to this copy of malloc, and it allocates the memory requested by the code within this private heap.
When the DLL unloads, it unloads its private heap, and this leak goes unnoticed as the entire heap is returned to the OS.
However, if the DLL is dynamically linked, the memory is allocated by a single shared version of malloc, global to all code that is linked in the shared mode.
Memory allocated by this global malloc comes out of a heap that is also used by all other code linked in the dynamic (shared) mode, and hence is common. Any leak from this heap therefore becomes a leak that affects the whole process.
Edit - Added descriptions of the linking scenario.
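The difference this answer describes can be modelled with a toy heap whose destructor releases every block it still holds. This is only a model of the claim, with hypothetical names (ToyHeap, leak_with_private_heap, leak_with_shared_heap); whether a given CRT really owns a private heap that is destroyed on unload depends on the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Toy heap: remembers every block it handed out and releases all of them
// when the heap itself is destroyed.
class ToyHeap {
public:
    ToyHeap() = default;
    ToyHeap(const ToyHeap&) = delete;
    ToyHeap& operator=(const ToyHeap&) = delete;
    ~ToyHeap() {
        for (void* p : blocks_) std::free(p);
    }

    void* allocate(std::size_t n) {
        void* p = std::malloc(n);
        if (p) blocks_.push_back(p);
        return p;
    }

    std::size_t outstanding() const { return blocks_.size(); }

private:
    std::vector<void*> blocks_;
};

// Load / do() / unload cycle where the DLL's runtime owns a private heap.
// Returns how many blocks are still outstanding in the process heap.
std::size_t leak_with_private_heap(ToyHeap& process_heap) {
    {
        ToyHeap private_heap;          // created when the DLL loads
        private_heap.allocate(1024);   // do(): allocated and never freed
    }                                  // unload: the whole private heap is released
    return process_heap.outstanding();
}

// Same cycle when the allocation comes from a heap that outlives the DLL.
std::size_t leak_with_shared_heap(ToyHeap& process_heap) {
    process_heap.allocate(1024);       // do(): the block stays behind after unload
    return process_heap.outstanding();
}
```

In the shared case each load, do, unload cycle leaves one more block behind, which matches the per-cycle leak described for runtimes that sit on top of a heap shared by the whole process.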
You can't tell. This depends on the implementation of your static and dynamic CRT. It may even depend on the size of the allocation, as there are CRTs that forward large allocations to the OS, but implement their own heap for small allocations.
The problem with a CRT that leaks is of course that it leaks. The problem with a CRT that does not leak is that the executable might reasonably expect to use the memory, as malloc'ed memory should remain usable until free is called.
From MSDN Potential Errors Passing CRT Objects Across DLL Boundaries
Each copy of the CRT library has a
separate and distinct state. As such,
CRT objects such as file handles,
environment variables, and locales are
only valid for the copy of the CRT
where these objects are allocated or
set. When a DLL and its users use
different copies of the CRT library,
you cannot pass these CRT objects
across the DLL boundary and expect
them to be picked up correctly on the
other side.
Also, because each copy of the CRT
library has its own heap manager,
allocating memory in one CRT library
and passing the pointer across a DLL
boundary to be freed by a different
copy of the CRT library is a potential
cause for heap corruption.
Hope this helps.
Actually, the marked answer is incorrect. That right there is a leak. While it is technically feasible for each dll to implement its own heap, and free it on shutdown, most "runtime" heaps - static or dynamic - are wrappers around the Win32 process heap API.
Unless one has taken specific care to guarantee that this is not the case, the DLL will leak the allocation on every load, do, unload cycle.
One could do a test and see if there are memory leaks. You run a simple test 30 times allocating 1 MB each time. You should figure that out quite quickly.
One thing is for sure. If you allocated memory in the DLL you should also free that memory there (in the DLL).
For example you should have something like this (simple but intuitive pseudocode):
dll = DllLoad();
ptr = dll->alloc();
dll->free(ptr);
DllUnload(dll);
This must be done because the DLL has a different heap than the original process (that loads the dll).
No, you do not leak.
If you mix DLL models (static, dynamic) then you can end up with a memory error if you allocate memory in one DLL and free it in a different one (or in the exe).
This means that the heap created by the statically-linked CRT is not the same heap as a different dll's CRT.
If you'd linked with the dynamic version of the CRT, then you'd have a leak as the heap is shared amongst all dynamically-linked CRTs. It means you should always design your apps to use the dynamic CRTs, or ensure you never manage memory across a dll boundary (ie if you allocate memory in a dll, always provide a routine to free it in the same dll)