I started using custom memory allocators such as rpmalloc and ltmalloc in my project, but I have some concerns about the integration. My project has various internal modules built as shared libraries or static libraries (depending on how I configure them in my build system), and it should build and run on Windows, Linux, FreeBSD and Mac OS X, on architectures such as x86 and ARM. I don't know whether the calls into my memory allocator integration should go in the header files or stay inside the .cpp files.
If the allocator calls stay in the header files, every module has to link the allocator's static library. If they are kept in the .cpp files, the calls are contained in the library that holds them, and only that module has to link the custom allocator; but that module must then expose an interface so every other module can allocate through it (avoiding memory allocation inconsistencies).
I've read here that if memory is allocated normally (as malloc/free/syscalls do), every shared library has its own heap, but that memory allocated via mmap doesn't belong to the program's heap.
My question is: does it introduce any hazard in my shared/static libraries if the allocator calls are kept in one library (which every other library must link in order to access its allocation interface)? Or should everything be inlined in the header files, with every library linking the memory allocator library?
How memory allocation is done depends heavily on the operating system. You need to understand how shared libraries work in those operating systems, and how the C language relates both to those operating systems and to the concept of shared libraries.
C, C++ and modular programming
First of all, I want to mention that C is not a modular language, i.e. it has no support for modules or modular programming. For languages like C and C++, the implementation of modular programming is left to the underlying operating system. Shared libraries are an example of a mechanism that is used to implement modular programming with C and C++, therefore I will refer to them as modules.
Module = shared library or executable
Linux and Unix-like systems
Initially, everything on Unix systems was statically linked; shared libraries came later. And as Unix was the starting point for the C language, those systems try to provide a shared library programming interface that is close to what programming in C feels like.
The idea is that C code originally written without shared libraries in mind should build and work without changes to the source code. As a result, the environment usually has a single process-wide symbol namespace shared by all loaded modules, i.e. there can only be a single function named foo in the whole process, except for static functions (and functions hidden in modules using OS-specific mechanisms). Basically it is the same as with static linking, where you are not allowed to have duplicate symbols.
What this means for your case is that there is always a single function named malloc in use in the whole process and every module uses it, i.e. all modules share the same memory allocator.
Now if a process happens to contain multiple malloc functions, only a single one is picked and will be used by all modules. The mechanism is very simple: since shared libraries do not know the location of every referenced function, they usually call them through a table (GOT, PLT) that is filled with the required addresses lazily on first call or at load time. The same rule applies to the module that provides the original function: even internally, that function is called through the same table, making it possible to override it even in the module that provides it (which is the source of many inefficiencies related to the use of shared libraries on Linux; search for -fno-semantic-interposition and -fno-plt to overcome this).
The general rule is that the first module to introduce a symbol is the one that provides it. The original process executable therefore has the highest priority: if it defines a malloc function, that malloc will be used everywhere in the process. The same applies to calloc, realloc, free and the others. This trick, and tricks like LD_PRELOAD, allow you to override the "default memory allocator" of your application. This is not guaranteed to work, though, as there are corner cases. You should consult the documentation for your library before doing this.
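To make the interposition concrete, here is a minimal sketch of what an LD_PRELOAD-style malloc override can look like on Linux with GCC/Clang. A real replacement would also override free, calloc and realloc, and would have to deal with dlsym() itself allocating on the first call:

```cpp
// preload_alloc.cpp
// build: g++ -shared -fPIC preload_alloc.cpp -o libpreload.so -ldl
// run:   LD_PRELOAD=./libpreload.so ./your_program
#include <dlfcn.h>    // RTLD_NEXT is a GNU extension (g++ defines _GNU_SOURCE)
#include <stddef.h>

extern "C" void* malloc(size_t size) {
    // Resolve the "next" malloc in the lookup order (normally libc's).
    static auto real_malloc =
        reinterpret_cast<void* (*)(size_t)>(dlsym(RTLD_NEXT, "malloc"));
    // Custom bookkeeping would go here. Beware: calling anything that
    // allocates (e.g. printf) from inside malloc can recurse.
    return real_malloc(size);
}
```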
I want to specifically note that this means there is a single heap in the process, shared by all modules, and there is a good reason for that. Unix-like systems usually provide two ways of allocating memory in a process:
brk, sbrk syscalls
mmap syscall
The first one gives you access to a single per-process memory region, usually placed directly after the executable image. Because there is only one such region, this way of allocating memory can only be used by a single allocator in a process (and it is usually already used by your C library).
This is important to understand before you throw any custom memory allocator into your process: it should either not use brk/sbrk at all, or it should override the existing allocator of your C library.
The second one can be used to request chunks of memory directly from the underlying kernel. As the kernel knows the structure of your process's virtual memory, it can hand out pages without interfering with any user-space allocator. This is also the only way to have multiple fully independent memory allocators (heaps) in the process.
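As a sketch, this is roughly how an allocator grabs its own pages on POSIX systems (MAP_ANONYMOUS is spelled MAP_ANON on some BSDs):

```cpp
#include <sys/mman.h>
#include <stddef.h>

// Request pages straight from the kernel -- no user-space heap involved.
void* os_alloc(size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Unlike free(), the kernel needs to be told the mapping's size again.
void os_free(void* p, size_t size) {
    munmap(p, size);
}
```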
Windows
Windows does not rely on the C runtime the way Unix-like systems do. Instead it provides its own runtime - the Windows API.
There are two ways of allocating memory with the Windows API:
Using functions like VirtualAlloc, MapViewOfFile.
And heap allocation functions - HeapCreate, HeapAlloc.
The first group is the equivalent of mmap, while the second is a more advanced version of malloc which, as I believe, is internally based on VirtualAlloc.
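A minimal sketch of the heap functions - a private heap created this way is invisible to every other allocator in the process:

```cpp
#include <windows.h>

int main() {
    HANDLE heap = HeapCreate(0, 0, 0);       // growable private heap
    void*  p    = HeapAlloc(heap, 0, 1024);  // 1 KiB from that heap only
    HeapFree(heap, 0, p);
    HeapDestroy(heap);

    // The single per-process heap mentioned below is always there:
    HANDLE processHeap = GetProcessHeap();
    (void)processHeap;
    return 0;
}
```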
Now, because Windows does not have the same relationship to the C language as Unix-likes have, it does not provide you with malloc and free functions. Instead, those are provided by the C runtime library, which is implemented on top of the Windows API.
Another thing about Windows: it does not have the concept of a single per-process symbol namespace, i.e. you cannot override a function here the way you do on Unix-like systems. This allows multiple C runtimes to co-exist in the same process, and each of those runtimes can provide its own independent implementation of malloc, free, etc., each operating on a separate heap.
Therefore on Windows, all libraries share a single process-wide Windows API heap (it can be obtained through GetProcessHeap), while at the same time each of them uses the heap of whichever C runtime it links against.
So how do you integrate memory allocator into your program?
It depends. You need to understand what you are trying to achieve.
Do you need to replace the memory allocator used by everyone in your process, i.e. the default allocator? This is only possible on Unix-like systems.
The only portable solution is to use your specific allocator interface explicitly. It doesn't really matter how you do this; you just need to make sure the same heap is shared by all libraries on Windows.
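For example, such an interface might look like the following sketch. The my_alloc/my_free names are made up, and rpmalloc is used as an illustrative backend (its initialization calls are omitted here):

```cpp
// ---- my_alloc.h -- the single allocation interface used by every module,
// built as its own shared library so the whole process shares one heap.
#include <stddef.h>

#if defined(_WIN32)
  #define MYALLOC_API __declspec(dllexport)  // import/export switching omitted
#else
  #define MYALLOC_API __attribute__((visibility("default")))
#endif

extern "C" MYALLOC_API void* my_alloc(size_t size);
extern "C" MYALLOC_API void  my_free(void* ptr);

// ---- my_alloc.cpp -- only this library links the actual allocator backend.
#include <rpmalloc.h>  // assumed header name from the rpmalloc project

extern "C" void* my_alloc(size_t size) { return rpmalloc(size); }
extern "C" void  my_free(void* ptr)    { rpfree(ptr); }
```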
The general rule here is that either everything should be statically linked or everything should be dynamically linked. Mixing the two can get really complicated and requires you to keep the whole architecture in your head to avoid mixing heaps or other data structures in your program (not a big problem if you don't have many modules). If you do need to mix static and dynamic linking, build your allocator library as a shared library, to make it easier to have a single implementation of it in the process.
Another difference between Unix-likes and Windows is that Windows does not have the concept of a "statically linked executable". On Windows every executable depends on Windows-specific dynamic libraries like ntdll.dll, while ELF has distinct types for "statically linked" and "dynamically linked" executables.
This is mostly due to the single per-process symbol namespace, which makes it dangerous to mix shared and static linking on Unix-likes, but allows Windows to mix static and dynamic linking just fine (almost, not really).
If one of your libraries is shared, make sure you link your allocator dynamically into dynamically linked executables. Imagine linking your allocator statically into your shared library while another library in the process uses the same allocator library too - you might end up using another allocator by accident, not the one you were expecting.
Related
What is the story regarding having a process on Linux which dlopen()s multiple shared libraries, where the executable and/or the shared libraries were compiled with different C++ compilers (e.g. provided by customers or 3rd parties)?
Am I right in the following assumptions:
there is only a single namespace for symbols in a Linux process. Symbols are found and resolved only by symbol name. The source of a symbol is unpredictable in the presence of an unknown executable (customer-supplied) or customer-supplied shared libraries.
there is no way to make certain that STL/boost symbols are resolved from the correct source, as they are always weak and thus might be overridden.
What are the implications of using multiple copies of (different) libc++ inside the same process (some of them static)?
I don't expect separate libraries to be able to talk to each other via a C++ interface, but only via a C interface. What I would like is that one can load shared libraries from different vendors into a single process without them screwing each other up.
I know that this has worked on Windows for decades.
Your comment completely changes your question:
I don't expect it to be able to talk to each other via a C++ interface but only via a C interface. What I expect is that one can load shared libraries from different vendors into a single process and they do not screw each other up. (This, btw, has been working on Windows for decades.)
This element of behaviour is largely system-independent. The Windows PE format and Linux ELF are similar enough in design that they add no additional constraints or capabilities on this topic. So if your technique was going to work on Windows, it should also work on Linux, just replacing .dll files with .so files.
Linux has more standardisation around calling conventions than Windows, so if anything you should find that Linux makes this simpler.
Original Answer
Question:
there is only a single namespace for symbols in a Linux process?
That's correct; there's no such thing as namespaces in Linux's loader.
As you may know, C and C++ are very different languages: C++ has namespaces, C does not. When libraries are loaded (on Linux, Unix and Windows alike) there is no concept of a namespace.
C++ compilers use name mangling to ensure that names isolated by namespaces in your code do not collide when placed as symbols in the shared object. C compilers don't do this, and don't need to, because there are no namespaces.
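For example, with GCC or Clang (which follow the Itanium C++ ABI):

```cpp
void foo();                   // mangled symbol: _Z3foov
namespace ns { void foo(); }  // mangled symbol: _ZN2ns3fooEv -- no collision

extern "C" void bar();        // plain symbol: bar -- C linkage, no mangling,
                              // which is why C symbols are easy to dlsym()
```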
Question:
Symbols are found and resolved only by symbol name. The source of the symbol is random in the presence of an unknown executable (customer supplied) or customer supplied shared libraries.
Let's replace the word "random" with "unpredictable". That's also correct. From Wikipedia:
The C++ language does not define a standard decoration scheme, so each compiler uses its own. C++ also has complex language features, such as classes, templates, namespaces, and operator overloading, that alter the meaning of specific symbols based on context or usage. Meta-data about these features can be disambiguated by mangling (decorating) the name of a symbol. Because the name-mangling systems for such features are not standardized across compilers, few linkers can link object code that was produced by different compilers.
Question:
What is the story regarding having a process on Linux which dlopen()s multiple shared libraries, where the executable and/or the shared libraries were compiled with different C++ compilers (e.g. provided by customers or 3rd parties)?
You can of course dlopen() a shared object, but dlsym() would be tricky to use because of name mangling. You'd have to inspect the shared object manually to determine the precise symbol name.
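A sketch of the usual workaround: give the plugin an extern "C" entry point (plugin_init here is a made-up name), so its symbol name is predictable:

```cpp
#include <dlfcn.h>
#include <cstdio>

int main() {
    void* handle = dlopen("./myplugin.so", RTLD_NOW);
    if (!handle) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }

    // The plugin exports:  extern "C" void plugin_init();
    auto init = reinterpret_cast<void (*)()>(dlsym(handle, "plugin_init"));
    if (init) init();

    dlclose(handle);
    return 0;
}
```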
What are the implications of using multiple copies of (different) libc++ inside the same process (some of them static)?
If you get that far, then I'd be concerned about memory management first of all. libc++ is responsible for implementing new and delete, converting them into memory requests to the OS. If they behave anything like GNU's malloc() and free(), they might manage their own pool of memory. It'd be hard to predict what would happen if you called delete on an object that was created by a different libc++.
It seems that parts of this randomness can be avoided by loading the shared libraries with the following flags passed to dlopen(), as sketched after the list:
RTLD_LOCAL
RTLD_DEEPBIND
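For instance (glibc-specific, since RTLD_DEEPBIND is a GNU extension):

```cpp
#include <dlfcn.h>  // g++ defines _GNU_SOURCE, exposing RTLD_DEEPBIND

// RTLD_LOCAL keeps the plugin's symbols out of the global namespace;
// RTLD_DEEPBIND makes the plugin prefer its own symbol definitions
// over ones already loaded into the process.
void* load_vendor_plugin(const char* path) {
    return dlopen(path, RTLD_NOW | RTLD_LOCAL | RTLD_DEEPBIND);
}
```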
This question already has answers here:
Do shared libraries use the same heap as the application?
In my project I have some plugins that get loaded at runtime via LoadLibrary(). From the book "Windows via C/C++" I know that objects created inside a DLL should be freed inside that DLL, e.g. Object* CreateObj() paired with void FreeObj(Object*). The reason is that there could be multiple C/C++ runtimes linked into the running process.
As I port my project to Linux, I used the same approach. But is that needed on Linux too? Is it possible that there are multiple heaps in a Linux process as well?
If your .so files are statically linked against the C++ runtime, you should free objects in the same module where they were allocated, since new/delete are more than malloc()/free() and need some extra information to work properly. Moreover, you shouldn't even pass runtime-specific objects or pointers to them (e.g. std::string) across .so module boundaries, since modules may in general be linked against different, binary-incompatible runtime implementations (e.g. when you have some third-party prebuilt modules). And even if you use the same runtime implementation across the whole process, static linkage creates multiple instances of the runtime's internal globals, which can surely cause a mess.
So, IMHO, the best scenario is to link all your modules against the dynamic version of the runtime. Or, if you really want to use statically linked runtimes, you must expose a pure C interface from each module to avoid the interferences mentioned above.
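A sketch of what such a pure C interface could look like (the Widget names are illustrative): each module hands out opaque handles and keeps allocation and deallocation on its own side of the boundary.

```cpp
// widget_api.h -- no C++ types cross the module boundary, and the module
// that allocates a Widget is the only one that ever frees it.
#ifdef __cplusplus
extern "C" {
#endif

typedef struct Widget Widget;              // opaque handle for clients

Widget* widget_create(void);               // new'd inside the module
void    widget_set_name(Widget* w, const char* utf8_name);
void    widget_destroy(Widget* w);         // delete'd inside the same module

#ifdef __cplusplus
}
#endif
```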
P.S. This behavior doesn't depend on the actual system; it applies to anything that supports dynamically loadable modules.
We have an executable compiled with Visual Studio 2008. Due to a 3rd-party dependency, we must compile this executable with Visual Studio 2008.
We also have another component which is compiled with Visual Studio 2010. Now we need one COM component DLL from this component (compiled with the 2010 compiler) to be accessed by the executable compiled with the 2008 compiler.
My question is: would this work fine? Would there be conflicts between the runtime used by the executable (the 2008 runtime library) and the runtime used by the COM component (the 2010 runtime)?
We actually tried loading this COM DLL in the executable, and it worked fine. But I'm concerned that it may later crash or fail due to the multiple runtimes.
Please let me know how the multiple runtimes are handled here. Is it safe to load a different runtime into a single executable? Would there be any conflicts later in the execution due to the different runtimes present?
One solution we are considering is to make the COM component an out-of-proc server, which will work in any case, but that involves a lot of work.
Please let me know.
Many Thanks
You should have no problem mixing COM objects that are linked with different runtime libraries, since the memory allocation and deallocation of each object will be done behind the DLL boundary.
You need to be careful that all your methods have proper COM signatures, i.e. all pointers should be COM pointers.
COM is designed for binary interop. By design, the framework is implementation-agnostic. The intent is that COM servers can be implemented in one language/runtime and consumed by a COM client implemented in a different language/runtime.
There are absolutely no constraints over the languages and runtimes that are used by different parties.
This has been answered a few times in several contexts.
As long as you don't handle and/or pass around C runtime (CRT) data structures between modules, you're fine. If you do any of the following between modules that depend on different CRTs, you'll have trouble - and in this specific case, you're not implementing COM objects properly:
malloc memory in one module and realloc or free in another
fopen a FILE* in one module and fread, fwrite, fclose, etc. in another
setjmp in one module and longjmp in another
Note that there are things you can do:
Use memory malloced by another module, keeping the responsibility of reallocating and freeing on the originating module
Use some interface that interacts with files fopened by another module, keeping the responsibility of its use on the originating module
Don't use setjmp/longjmp across unrelated or loosely coupled modules; define callbacks, aborting error codes, whatever, but don't rely on unwinding techniques, even if provided by the OS
You can see a pattern here: you can use resources from another module as long as you delegate managing those resources to that module.
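As a sketch of the pattern (names are illustrative): a module that owns a FILE* exposes operations instead of the handle itself, so all CRT calls on it happen in one module.

```cpp
// log_api.h -- module A fopen()s, fwrite()s and fclose()s with its own CRT;
// other modules never touch the FILE* directly.
#ifdef __cplusplus
extern "C" {
#endif

typedef struct Log Log;                     // wraps a FILE* inside module A

Log* log_open(const char* path);            // fopen() happens in module A
void log_write(Log* log, const char* msg);  // fwrite() happens in module A
void log_close(Log* log);                   // fclose() happens in module A

#ifdef __cplusplus
}
#endif
```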
With COM, you shouldn't ever have this kind of trouble; everything should be encapsulated in objects through their implemented interfaces. Although you can pass malloced memory as top-level pointer arguments, you're only supposed to access that memory in the callee, never reallocate or free it. For inner pointers, you must use CoTaskMemAlloc and its cousins, as this is the common memory manager in COM. The same applies to handling files (e.g. encapsulate them in an IStream, an IPipeByte, an IEnumByte or something similar), and don't unwind across COM calls.
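A sketch of that convention (GetName is a hypothetical method): the callee allocates from the COM task allocator and the caller frees with CoTaskMemFree, so the CRTs on either side never have to match.

```cpp
#include <windows.h>
#include <objbase.h>   // CoTaskMemAlloc / CoTaskMemFree
#include <wchar.h>

HRESULT GetName(wchar_t** name) {
    // Allocate from the COM task allocator, not this module's CRT heap.
    *name = static_cast<wchar_t*>(CoTaskMemAlloc(4 * sizeof(wchar_t)));
    if (!*name) return E_OUTOFMEMORY;
    wcscpy_s(*name, 4, L"abc");
    return S_OK;
}

// Caller side, possibly built with a different compiler/CRT:
//   wchar_t* n = nullptr;
//   if (SUCCEEDED(GetName(&n))) { /* use n */ CoTaskMemFree(n); }
```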
I've a C++ program that links at load time with, let's say, mylib.so. Then the same program uses dlopen()/dlsym() to load a function from myplugin.so, a dynamic library that in turn depends on mylib.so.
My question is: will the program AND the function in the plugin access the same globals defined in mylib.so, in the same memory reserved for the program, or will each be assigned a different, unrelated copy in its own memory space? If the latter is the default behaviour, is it possible to change that?
Thanks in advance =)!
Globals in the main program that does the dlopen should be visible to the code that is dynamically loaded. However, the best advice I've seen to date (especially if you ever want even vaguely portable code) is to only pass function calls across the linker divide, and not to export any variables in either direction. It's also best if there is an API for the loaded code to register the interesting parts of its API with the loader (e.g., "Here is how I provide this SPI for drawing foobars on a baz"), as that's a much saner way of doing callbacks than just mashing everything together.
[EDIT]: The other reason for doing this is if you're simulating weak linking on a platform that doesn't support it. That's a lot like the case I list above, except that it is the main program that builds the SPI out of the API exported by the dynamic library, rather than the .so exporting it explicitly on startup. It's second best really, but you make do with what you've got rather than wishing (well, unless you're prepared to do the work by writing some sort of connection library).
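A sketch of the registration approach, with made-up names: the plugin hands the host a table of function pointers from its init function, so only function calls cross the boundary.

```cpp
// spi.h -- shared between host and plugin; names are illustrative.
extern "C" {
    struct FoobarSpi {
        void (*draw_foobar)(int baz);  // the plugin's implementation
    };

    // Exported by the host. The plugin calls this from its extern "C"
    // init function right after it is dlopen()ed.
    void host_register_foobar_spi(const FoobarSpi* spi);
}
```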
I have a C++ class I'm writing now that will be used all over a project I'm working on. I have the option to put it in a static library or export the class from a DLL. What are the benefits/penalties of each approach? The only one I can think of is compiled code size, which I don't really care about. Thanks!
Advantages of a DLL:
You can have multiple different exes that access this functionality, so you will have a smaller project size overall.
You can dynamically update your component without replacing the whole exe. If you do this, though, be careful that the interface remains the same.
Sometimes, as in the case of the LGPL, you are forced into using a DLL.
You could have some components written in C#, Python or other languages that tie into your DLL.
You can build programs that consume your DLL and work with different versions of it. For example, you could check whether a function exists in a certain operating system DLL and only call it if it exists, otherwise doing some other processing (see the sketch below).
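A sketch of that last point (GetTickCount64 is a real kernel32 export that only appeared in Windows Vista):

```cpp
#include <windows.h>

ULONGLONG ticks() {
    typedef ULONGLONG (WINAPI *GetTickCount64Fn)();
    // kernel32.dll is always loaded, so GetModuleHandleW is enough.
    HMODULE k32 = GetModuleHandleW(L"kernel32.dll");
    auto fn = reinterpret_cast<GetTickCount64Fn>(
        GetProcAddress(k32, "GetTickCount64"));
    return fn ? fn() : GetTickCount();  // fall back to the 32-bit API
}
```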
Advantages of Static library:
You cannot have DLL versioning problems that way
Less to distribute; you aren't forced into a full installer if you only have a small application.
You don't have to worry about anyone else tying into your code that would have been accessible if it was a DLL.
Easier to develop a static library as you don't need to worry about exports and imports.
Memory management is easier.
One of the most significant and often unnoted features of dynamic libraries on Windows is that DLLs can have their own heap. This can be an advantage or a disadvantage depending on your point of view, but you need to be aware of it. For example, a global variable placed in a shared data section of a DLL will be shared among all the processes attaching to that library, which can be a useful form of de facto interprocess communication, or the source of an obscure run-time error.
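For completeness, actually sharing a DLL global across processes requires an explicitly shared data section; a minimal MSVC-specific sketch:

```cpp
// Everything in the ".shared" section is one copy, visible to every
// process that loads this DLL. Shared variables must be initialized.
#pragma data_seg(".shared")
volatile int g_counter = 0;
#pragma data_seg()
#pragma comment(linker, "/SECTION:.shared,RWS")  // Read/Write/Shared
```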