Shared libraries memory space - c++

Does a C++ shared library have its own memory space? Or does it share the caller process' one?
I have a shared library which contains some classes and wrapper functions.
One of these wrapper functions is something like:
libXXX_construct(), which initializes an object and returns a pointer to the said object.
Once I use libXXX_construct() in a caller program, where is the object placed? Is it in the "caller" memory space or is it in the library's memory space?

A linked instance of the shared library shares the memory space of the instance of the executable that linked to it, directly or indirectly. This is true for both Windows and the UN*X-like operating systems. Note that this means that static variables in shared libraries are not a way of inter-process communication (something a lot of people think).

All the shared libraries share the virtual memory space of their process. (Including the main executable itself)

Unless specified otherwise, a shared library will share memory with the process hosting it. Each process instance will then have its own copy.
However, on Windows it is possible to create shared variables which allow inter-process communication. You do so by putting them in the right kind of segment. By default Windows uses two kinds of segments: data segments are read/write unshared, whereas code segments are read-only executable and shared. However, the read-write and shared attributes are orthogonal. A shared read-write segment in a library can be used to store shared variables, and it will survive until the last process exits.
Be careful with C++, as that will happily run constructors and destructors on process start & exit, even if you put variables in shared segments.
For details, see Peering Inside the PE: A Tour of the Win32 Portable Executable File Format part 2 by Matt Pietrek.
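As a minimal, non-portable sketch of the shared-segment trick described above (MSVC-specific pragmas; the variable name g_shared_count is invented for illustration, and the section name .shared is arbitrary):

```cpp
// MSVC-only sketch: place a variable in a named section and mark that
// section Read/Write/Shared via a linker directive. Every process that
// loads this DLL then sees the same g_shared_count.
#pragma data_seg(".shared")
volatile long g_shared_count = 0;  // must be explicitly initialized,
                                   // or it lands in the ordinary .bss
#pragma data_seg()
#pragma comment(linker, "/SECTION:.shared,RWS")
```

Note the initializer: an uninitialized variable would not be placed in the named data segment, silently defeating the sharing.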

The shared library has the same address space as its host process. It has to be that way, or else you wouldn't be able to pass pointers from one module to another since they wouldn't be able to dereference them.
But although they are in the same address space, that doesn't mean they all use the same memory manager. The consequence is that if you provide a function that allocates memory on behalf of the caller, then you should provide a corresponding function to free that memory, say, libXXX_destroy().

Your object exists in the caller's memory space (in fact the one memory space shared between the library and the main executable)

They share an address space, so you can share pointers; however, they don't necessarily share an allocator (at least not on Windows).
This means that if you call new to allocate an object inside a shared library, you must call delete inside the same library, or strange things may happen.

It's true that a library will use up memory in each process that loads it. However, at least under Windows, when multiple processes load the same DLL, unmodified pages (including all code pages) are quietly shared under the covers. Also, they take up no space in the swap file, since they're backed by the original file.
I believe this is more complicated for .NET, due to JIT compilation, but would still be true for NGENed assemblies.
Edit: This is a detail of the VM. However, you can also flag a segment in a DLL to be shared across processes.

Related

Memory usage of loaded shared objects

I'm working on a Linux-based program that loads many plug-ins in the form of shared objects. What I want to find out is how much memory each shared object, and all its data structures, take at a certain point in time. Is it possible to do that? I can modify both the main program and the plug-in shared objects if needed.
It is not possible dynamically, since it may happen that a shared object A.so dynamically creates some object data B at runtime which is then used, and later destroyed, by shared object C.so.
So you cannot say that data like B "belongs to" a particular shared object; you may (and should) have conventions about that. See RAII, the rule of three, smart pointers, ....
The point is that the question "how much memory is used by a given library or shared object" makes no sense. Memory and address space are global to the process, so they are shared by the main program and all shared objects, libraries, plugins...!
You could however use proc(5) to get information about the entire process. From inside the program, read sequentially /proc/self/maps to get the map of its address space. From outside the program, read /proc/1234/maps for process of pid 1234.
You might want to use valgrind, and read more about memory management, garbage collection, and reference counting. You could view your issues as related to resource management or garbage collection. You might want to use Boehm's conservative garbage collector (if using standard C++ containers, you'll want to use Boehm's gc_allocator; see this). The point is that the liveness of some given data is a global property of the program, not of any particular plugin or function. Think about circular references.
What I want to find out is how much memory each shared object, and all its data structures, take at a certain point in time. Is it possible to do that?
If the program is running and you have its pid, you can examine its memory mappings. For example:
% pmap 1234
[...]
00007f8702f6a000 148K r-x-- libtinfo.so.5.9
00007f8702f8f000 2044K ----- libtinfo.so.5.9
00007f870318e000 16K r---- libtinfo.so.5.9
[...]
This doesn't tell you much about the data structures et al though.

Issues when memory allocation/deallocation when linking with static c runtime

I came across below note in the book Windows via C-C++ by Jeffrey Richter and Christophe Nasarre.
Examine the following code:
VOID EXEFunc() {
PVOID pv = DLLFunc();
// Access the storage pointed to by pv...
// Assumes that pv is in EXE's C/C++ run-time heap
free(pv);
}
PVOID DLLFunc() {
// Allocate block from DLL's C/C++ run-time heap
return(malloc(100));
}
So, what do you think? Does the preceding code work correctly? Is the block allocated by the
DLL's function freed by the EXE's function? The answer is: maybe. The code shown does not
give you enough information. If both the EXE and the DLL link to the DLL C/C++ run-time
library, the code works just fine. However, if one or both of the modules link to the static C/C++
run-time library, the call to free fails.
I'm not able to understand why call to free would fail here when linking modules with static C runtime.
Can someone explain why free fails?
Found similar question here:
Memory Allocation in Static vs Dynamic Linking of C Runtime
But I've same doubt here as MrPhilTx:
Wouldn't all of the heaps be in the same address space?
Thanks!
When both your DLL and EXE are statically linked to the C runtime, the two runtimes simply don't know about each other. So both the EXE and the DLL get their own copy of the runtime, their own heap, and their own heap metadata. Neither side knows about the other's metadata, and there is no safe way to update that metadata when you free memory. You end up with inconsistent metadata and things will eventually fail (and, if you're very lucky, they will fail right away).
What this means is that you end up with at least two heaps in your process, and each heap has its own rules and metadata. There is no way for the EXE to know the exact way the DLL allocates memory, so there is no way for it to free it.
As for why you can get away with sharing a heap when everything is dynamically linked, that's easy, there is only one copy of the C Runtime DLL in the process, so if every DLL links against it they will all be calling the same code with the same metadata.
You can't allocate memory from one allocator and free it with another. Different allocators use different internal implementations and the result of giving a memory block to an allocator that didn't allocate it is unpredictable.
So unless you know for a fact that two sections of code are using the same allocator, you can't allocate memory in one section of code and free it in the other. The usual solution is to ensure the same unit both allocates and frees the memory. In your example, the DLL could offer a "free" function that the main code could call into instead of calling its own free function which frees to its own allocator.
So do this instead:
VOID EXEFunc() {
PVOID pv = DLLFunc();
// Access the storage pointed to by pv...
// Assumes that pv is in EXE's C/C++ run-time heap
DLLFreeFunc(pv);
}
...
PVOID DLLFunc() {
// Allocate block from DLL's C/C++ run-time heap
return(malloc(100));
}
VOID DLLFreeFunc(PVOID x) {
free(x);
}
On Linux, a program uses the brk and sbrk system calls to request extra data pages from the kernel. sbrk returns an address pointing to the data segment that can be used by your program.
malloc and free use the data segment returned by brk and sbrk by turning it into a heap. The heap is a large block of memory in the current process's address space from which small blocks of memory can be requested and returned as required. It is important to note that many calls to malloc and free will make no system calls at all.
Now, when malloc and free want to make use of the heap, they need a pointer to it. This pointer is stored in a separate data segment called static data and is allocated when the application loads. To ensure that different DLLs (or shared libraries on Linux) do not clash with each other, each DLL has its own static data section.
Now let us assume that both the DLL and the executable are statically linked to their own C libraries. In that case the DLL and the executable will hold pointers to different heaps, and each must therefore free its own memory.
However, on Linux both the DLL and the executable will access malloc and free through a common shared library (libc.so). In that case, since both the DLL and the executable are effectively using libc's heap, the executable can safely free memory allocated by the DLL.
In any event it is good practice for the DLL to provide its own free function. If nothing else, this documents that the pointer returned by DLLFunc needs to be freed.
I imagine this is true on Windows as well.
The code critically depends on the implementation of malloc and free. A good implementation has no issues, a poor one will indeed fail. It's definitely easier to create a working DLL implementation of malloc and free, but it's far from impossible to do so in a static library.
A trivial example would be a static library which forwards the calls directly to GlobalAlloc and GlobalFree.

Does a C/C++ pointer hold an absolute memory address, or one relative to the application, or relative to the module?

For example, if I declare a function in the main application and pass a pointer to it from a dynamically loaded library (via dlopen under Linux or LoadLibrary under Windows), using a symbol obtained via dlsym or GetProcAddress respectively, and try to call that function, would it work properly?
Same if I pass a pointer from one dynamically loaded library to another? I think it should work if the pointer is at least relative to the application, but not if it is relative to the module/library.
Another example: I declare a function in one application and pass a pointer to it to another, fully independent application (both C and C++) somehow (a parameter string or file I/O; I don't know how, it's just an idea) and try to call this function. Would that work too? I could expect it to work if pointers are absolute. Or maybe it just won't work because the system won't allow such a cross-call for safety reasons?
First you must understand that in C and C++ the value of a pointer is not required to be related to the addresses the machine actually uses, as long as pointer arithmetic works with it, the null pointer constant compares equal to 0, and the implementation manages to map bijectively between machine pointers and the language's abstract pointers.
On modern systems processes see a virtual address space, and each process has its own. Libraries may be loaded at any address, so passing a pointer between processes is utter nonsense, at least on paged memory architectures. However, within a process, machine-level pointers are passed between loaded libraries with no problem; after all, they share the same address space. At the language level they may not be the same (though usually they are) if multiple languages come into contact. But compilation units created with the same compiler will use the same pointer semantics. Also, most language implementations that target the native machine agree on pointer semantics, for the simple reason that having to convert between pointer formats would create a huge performance hit.
It's absolute.
The fact that it's a virtual address has nothing to do with this -- it's an absolute virtual address. It makes no difference to your program whether it's using virtual memory or physical memory... you shouldn't concern yourself with this unless you're passing pointers between processes (which I seriously doubt you are), or unless you're writing low-level kernel code or mapping/unmapping pages manually (which I also doubt you are).
Things work differently on different operating systems. On older / embedded operating systems all processes share the same space - one process can easily mess things for another.
On most general-purpose (i.e. not embedded) modern operating systems each process has a separate address space. All addresses are relative to this space. It doesn't matter how things are compiled / linked together, if they're in the same process, they share the address space.
It follows that two distinct processes have no implicit ways of accessing each-others space.
This is due to the MMU, a hardware unit in most processors that protects applications from each other. Its main purpose is to protect the kernel/OS from applications. Each application or process has its own memory and cannot access the memory of another process; it has to go through the kernel/OS in some way.

Is it safe to allocate memory for buffers on external dll and use it on main application?

As in topic.. I've found something like this in one application. In main C application we have declaration:
void* buff = NULL;
and later there is a call:
ReadData(&buff);
SaveToFile(buff);
SaveToFile() is a C function in the main application.
ReadData(void**) is a C++ function from an external dll. In this function, memory for the buffer is allocated by the malloc function and filled with data.
So here is my question: is it correct?
All modules in a running process share the same address space (doesn't care whether you're Windows or Linux or whatever actually, it's a common principle). However, be careful: reading or writing from module A to a buffer owned by module B is fine - but freeing the buffer is probably bad.
On Windows, it depends on the runtime library the application is linked against. If it is not the DLL runtime ('multithreaded dll'), every module maintains its own copy of the heap manager. Thus, the module that allocated a memory area must also be responsible for destroying it because only its own heap manager knows about it. If you follow this guideline, you won't run into problems (linking against the DLL runtime avoids the problem because all modules deal with the same heap manager residing somewhere in msvXXXnnn.dll, but gives rise to other issues).
Edit:
ReadData(void**) is a C++ function from an external dll. In this function, memory for the buffer is allocated by the malloc function and filled with data.
That might run into the aforementioned allocator issue. Either add another function to that DLL (FreeData) which is explicitly responsible for freeing up the buffer (as proposed by Neil Butterworth) and just calls its own free(). Or you add a DLL function to query the size of the buffer, allocate it upfront and pass it to ReadData (that's the cleanest choice imo).
If both the DLL and the main executable were linked with the same C runtime, this is OK and you can call free() on the pointer to release it. However, a better idea is in the DLL to provide a function FreeData( void * ) which releases the data. In this way all memory management is done in the context of the DLL.
It's safe. However, you should always check:
whether the same allocator is used for both allocation and deallocation
who is responsible for freeing (so there are no surprises)
any kind of automatic memory management (if it's plain C/C++, then it's no problem).
It depends on the intention of the design and the users the library is directed at. A better way is to pass in a buffer of some fixed size, have the library fill it, and return. But you should be careful when freeing the buffer: it is better to call the free function (if any) provided by the third-party DLL itself rather than calling free from your main application.
On Windows, if your third-party DLL uses one heap and your application uses a different heap, freeing across them leads to undefined behaviour. For example, if the third-party DLL was built with VC8 and your application was built with VC6, then freeing memory allocated by the external DLL will lead to problems.
Yes, this is correct. Memory in a process is equally accessible by all modules (EXE and DLLs) in that process.
Yes, there is no problem with this.

Instantiating objects in shared memory C++

We have a need for multiple programs to call functions in a common library. The library functions access and update a common global memory. Each program’s function calls need to see this common global memory. That is one function call needs to see the updates of any prior function call even if called from another program.
For compatibility reasons we have several design constraints on how the functions exposed by the shared library must operate:
Any data items (both standard data types and objects) that are declared globally must be visible to all callers regardless of the thread in which the code is running.
Any data items that are declared locally in a function are only visible inside that function.
Any standard data type or an instance of any class may appear either locally or globally or both.
One solution is to put the library’s common global memory in named shared memory. The first library call would create the named shared memory and initialize it. Subsequent program calls would get the address of the shared memory and use it as a pointer to the global data structure. Object instances declared globally would need to be dynamically allocated in shared memory while object instances declared locally could be placed on the stack or in the local heap of the caller thread. Problems arise because initialized objects in the global memory can create and point to sub-objects which allocate (new) additional memory. These new allocations also need to be in the shared memory and seen by all library callers. Another complication is these objects, which contain strings, files, etc., can also be used in the calling program. When declared in the calling program, the object’s memory is local to the calling program, not shared. So the object’s code needs to handle either case.
It appears to us that the solution will require that we override the global placement new, regular new and delete operators. We found a design for a memory management system that looks like it will work but we haven’t found any actual implementations. If anyone knows of an implementation of Nathan Myers’ memory management design (http://www.cantrip.org/wave12.html?seenIEPage=1) I would appreciate a link to it. Alternatively if anyone knows of another shared memory manager that accommodates dynamically allocating objects I would love to know about it as well. I've checked the Boost libraries and all the other sources I can find but nothing seems to do what we need. We prefer not to have to write one ourselves. Since performance and robustness are important it would be nice to use proven code. Thanks in advance for any ideas/help.
Thanks for the suggestions about the ATL and OSSP libraries. I am checking them out now, although I'm afraid ATL is too Wincentric if our target turns out to be Unix.
One other thing now seems clear to us. Since objects can be dynamically created during execution, the memory management scheme must be able to allocate additional pages of shared memory. This is now starting to look like a full-blown heap replacement memory manager.
Take a look at boost.interprocess.
OSSP mm - Shared Memory Allocation:
man 3 mm
As I'm sure you have found, this is a very complex problem, and very difficult to correctly implement. A few tips from my experiences. First of all, you'll definitely want to synchronize access to the shared memory allocations using semaphores. Secondly, any modifications to the shared objects by multiple processes need to be protected by semaphores as well. Finally, you need to think in terms of offsets from the start of the shared memory region, rather than absolute pointer values, when defining your objects and data structures (it's generally possible for the memory to be mapped at a different address in each attached process, although you can choose a fixed mapping address if you need to). Putting it all together in a robust manner is the hard part. It's easy for shared memory based data structures to become corrupted if a process should unexpectedly die, so some cleanup / recovery mechanism is usually required.
Also study mutexes and semaphores. When two or more entities need to share memory or data, there needs to be a "traffic signal" mechanism to limit write access to only one user.