dlopen and implicit library loading: two copies of the same library - c++

I have three things: an open source application (let's call it APP),
a closed source shared library (let's call it OPENGL),
and an open source plugin for OPENGL (let's call it PLUGIN), which is also a shared library.
OS: Linux.
APP and PLUGIN need to share data,
so APP links against PLUGIN, and when I run APP
the system loads PLUGIN automatically.
After that, APP calls eglInitialize, which belongs to OPENGL,
and this function loads PLUGIN again.
As a result I have two copies of PLUGIN in APP's memory.
I know this because PLUGIN has global data, and while debugging
I saw two copies of that global data.
So the question is: how can I fix this behaviour?
I want a single instance of PLUGIN that is used by both APP and OPENGL.
I cannot change the OPENGL library.

It obviously depends a lot on exactly what the libraries are doing, but in general some solution should be possible.
First note that normally, if a shared library with the same name is loaded multiple times, the process will continue to use the same copy. This of course applies primarily to loading via the standard loading/linking mechanism. If the library calls dlopen on its own it can still get the same library, but that depends on the flags passed to dlopen. Try reading the docs on dlopen to get an understanding of how it works and how you can manipulate it.
You can also try positioning the PLUGIN earlier in the linker command so that it gets loaded first and thus might avoid a double load later on. If you must load the PLUGIN dynamically this obviously won't help. You can also check if LD_PRELOAD might resolve the linking order.
As a last resort you may have to use LD_LIBRARY_PATH and put an interface library in front of the real one. This library will simply pass calls to the real one, but will intercept duplicate loads and shunt them to the previous load.
This is just a general direction to consider. Your actual answer will depend highly on your code and what the other shared libraries do. Always investigate linker load ordering first, as it is the easiest to check, and then dlopen flags, before going into the other options.
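The dlopen flags mentioned above can double as a quick diagnostic. The following is a minimal sketch, assuming glibc on Linux: RTLD_NOLOAD (a GNU extension) asks whether a library is already resident without loading it, and a repeated dlopen of the same name is expected to hand back the same handle. The helper names are made up for illustration, and on older glibc versions you may need to link with -ldl.

```cpp
#include <dlfcn.h>

// Returns true if `soname` is already mapped into this process.
// With RTLD_NOLOAD, dlopen returns NULL instead of loading the library.
bool is_resident(const char* soname) {
    void* handle = dlopen(soname, RTLD_LAZY | RTLD_NOLOAD);
    if (handle == nullptr) return false;
    dlclose(handle);  // drop the reference the successful lookup added
    return true;
}

// Opens `soname` twice and reports whether both calls returned the same handle.
bool reopens_same_copy(const char* soname) {
    void* first = dlopen(soname, RTLD_LAZY);
    if (first == nullptr) return false;
    void* second = dlopen(soname, RTLD_LAZY);
    bool same = (second == first);
    if (second != nullptr) dlclose(second);
    dlclose(first);
    return same;
}
```

Calling is_resident() with the name OPENGL is believed to use for PLUGIN, right before eglInitialize, may show whether the loader already knows PLUGIN under that name. If it reports false even though APP linked against PLUGIN, the second load is probably going through a different file or path.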

I suspect that OPENGL is loading PLUGIN with the RTLD_LOCAL flag. This
is normally what you want when loading a plugin, so that multiple
plugins don't conflict.
We've had similar problems with loading code under Java: we'd load a
dozen or so different modules, and they couldn't communicate with one
another. It's possible that our solution would work for you: we wrote a
wrapper for the plugin, and told Java that the wrapper was the plugin.
That plugin then loaded each of the other shared objects, using dlopen
with RTLD_GLOBAL. This worked between plugins. I'm not sure that it
will allow the plugins to get back to the main, however (but I think it
should). And IIRC, you'll need special options when linking main for
its symbols to be available. I think Linux treats the symbols in main
as if main had been loaded with RTLD_LOCAL otherwise. (Maybe
--export-dynamic? It's been a while since I've had to do this, and I
can't remember exactly.)
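A rough sketch of the wrapper idea, assuming glibc on Linux: the wrapper opens a shared object with RTLD_GLOBAL so that libraries loaded afterwards can resolve against its symbols. The function names are hypothetical, and whether this helps depends on how OPENGL itself calls dlopen, which cannot be seen from outside.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // for RTLD_DEFAULT on glibc
#endif
#include <dlfcn.h>

// Loads `path` into the global symbol scope. If the object is already loaded,
// glibc returns the existing handle and promotes it to global scope.
void* load_global(const char* path) {
    return dlopen(path, RTLD_NOW | RTLD_GLOBAL);
}

// Looks `symbol` up in the global scope, the same scope that later
// dlopen'ed libraries resolve their undefined symbols against.
bool visible_globally(const char* symbol) {
    return dlsym(RTLD_DEFAULT, symbol) != nullptr;
}
```

For the executable's own symbols, the option being recalled is most likely GCC's -rdynamic, which passes --export-dynamic to the linker.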

Related

Qt6 crc32 name collision

Qt + C++ + C gurus, I could use some advice:
I started writing a Qt6 GUI C++ app that includes some extern C code. Much to my surprise, upon launch the ui->setupUi(this) call immediately segfaulted and I found myself inside of one of the extern C functions that was imported! This is literally at the first line of my MainWindow constructor - with no calls to any of the C code even added yet.
The C function where the segfault dumped me was named "crc32". I temporarily renamed it (and all calls to it) to test my theory and sure enough the program started correctly afterwards. So... Qt appears to invoke "crc32" when doing things like adding QMenu items... due to some name collision.
I'm puzzled how to fix it. I don't want to change the C code because it's actually a git submodule for a project I do not want to maintain a fork of forever. Since it's Qt calling it from the C++ side, I can't exactly specify the namespace on the call either.
Weird sidenote: When I was doing testing this morning with Qt5 (instead of 6), I was not running into this issue. The crc32 name collision may be specific to Qt6.
Looks like you are on Linux (or similar) and encountering a violation of the One Definition Rule. When the shared lib loader tries to load symbols from a shared lib, it will first check to see if it already has one with that name loaded. If so, it skips it. Thus, if you try to load two symbols with an identical name, you'll only get one of them, and then problems can ensue.
Possible options:
Hide symbols you don't need in the library. For this, consider using GCC's visibility controls.
Separate the symbols into different executables and share data with some interprocess communication like shared mem or a socket.
Use dlopen() on one of the libs along with the RTLD_LOCAL flag so it does not override the symbol.
Add a namespace to the project (or use its build options to control that, if available -- like Boost does). Yes, this violates your wish not to modify it, but it could be done with a patch or similar that does not modify the original checkout if you switched to using a package manager like Conan instead of a git submod.
crc32 is a symbol from zlib. Qt uses either the system zlib or a built-in version of it. If it uses the built-in version, symbols are prefixed with z_ (so z_crc32) to avoid exactly such a clash.
So, if you want to use crc32, you have the options to configure Qt to either not use zlib at all (-no-zlib), or use the built-in version (-qt-zlib).
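The symbol-hiding option listed earlier can be sketched as follows, assuming GCC or Clang. The crc32 below is a hypothetical stand-in for the submodule's function, with a made-up signature; the point is only the visibility attribute, which keeps the definition out of the executable's dynamic symbol table so that shared libraries cannot bind to it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the C function from the submodule.
// Hidden visibility means the dynamic linker never offers this definition
// to other modules, so a library looking for zlib's crc32 will not find it.
extern "C" __attribute__((visibility("hidden")))
std::uint32_t crc32(std::uint32_t crc, const unsigned char* buf, std::size_t len) {
    crc = ~crc;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Reflected CRC-32 polynomial, applied when the low bit is set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

Without touching the submodule's sources, a similar effect can usually be had by compiling its C files with -fvisibility=hidden, provided nothing outside the executable needs those symbols.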

Load-time dynamic link library dispatching

I'd like my Windows application to be able to reference an extensive set of classes and functions wrapped inside a DLL, but I need to be able to guide the application into choosing the correct version of this DLL before it's loaded. I'm familiar with using dllexport / dllimport and generating import libraries to accomplish load-time dynamic linking, but I cannot seem to find any information on the interwebs with regard to possibly finding some kind of entry point function into the import library itself, so I can, specifically, use CPUID to detect the host CPU configuration, and make a decision to load a particular DLL based on that information. Even more specifically, I'd like to build 2 versions of a DLL, one that is built with /ARCH:AVX and takes full advantage of SSE - AVX instructions, and another that assumes nothing is available newer than SSE2.
One requirement: Either the DLL must be linked at load-time, or there needs to be a super easy way of manually binding the functions referenced from outside the DLL, and there are many, mostly wrapped inside classes.
Bonus question: Since my libraries will be cross-platform, is there an equivalent for Linux based shared objects?
I recommend that you avoid dynamic resolution of your DLL from your executable if at all possible, since it is just going to make your life hard, especially since you have a lot of exposed interfaces and they are not pure C.
Possible Workaround
Create a "chooser" process that presents the necessary UI for deciding which DLL you need, or maybe it can even determine it automatically. Let that process move whatever DLL has been decided on into the standard location (and name) that your main executable is expecting. Then have the chooser process launch your main executable; it will pick up its DLL from your standard location without having to know which version of the DLL is there. No delay loading, no wonkiness, no extra coding; very easy.
If this just isn't an option for you, then here are your starting points for delay loading DLLs. It's a much rockier road.
Windows
LoadLibrary() to get the DLL in memory: https://msdn.microsoft.com/en-us/library/windows/desktop/ms684175(v=vs.85).aspx
GetProcAddress() to get pointer to a function: https://msdn.microsoft.com/en-us/library/windows/desktop/ms683212(v=vs.85).aspx
OR possibly special delay-loaded DLL functionality using a custom helper function, although there are limitations and potential behavior changes.. never tried this myself: https://msdn.microsoft.com/en-us/library/151kt790.aspx (suggested by Igor Tandetnik and seems reasonable).
Linux
dlopen() to get the SO in memory: http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html
dlsym() to get pointer to a function: http://man7.org/linux/man-pages/man3/dlsym.3.html
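On Linux, choosing a library by CPU and then resolving a function by hand could look like the sketch below. It uses dlopen() and dlsym(), plus the GCC/Clang builtin __builtin_cpu_supports for the CPU check. The two library names are hypothetical, and since dlsym() looks names up as plain strings, C++ classes would normally be reached through an extern "C" factory function rather than directly.

```cpp
#include <dlfcn.h>

// Picks a library file name based on the host CPU. Both names are hypothetical.
const char* pick_library() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();  // GCC/Clang builtin; harmless to call repeatedly
    if (__builtin_cpu_supports("avx")) return "libmylib_avx.so";
#endif
    return "libmylib_sse2.so";
}

// Loads `library` and returns the address of `symbol`, or nullptr on failure.
// The library is left open on purpose so the returned pointer stays valid.
void* resolve(const char* library, const char* symbol) {
    void* handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) return nullptr;
    return dlsym(handle, symbol);
}
```

A caller would pass pick_library() as the first argument of resolve() and cast the result to the expected function pointer type.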
To add to qexyn's answer, one can mimic delay loading on Linux by generating a small static stub library which calls dlopen on the first call to any of its functions and then forwards execution to the shared library. Such a stub library can be generated automatically by a custom project-specific script or by Implib.so:
# Generate stub
$ implib-gen.py libxyz.so
# Link it instead of -lxyz
$ gcc myapp.c libxyz.tramp.S libxyz.init.c
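To make the stub idea concrete, here is a hand-written sketch of what a single lazy-loading stub does. This is not the code Implib.so generates; "libm.so.6" and "cos" are placeholders standing in for the real library and one of its functions.

```cpp
#include <dlfcn.h>
#include <cstdio>
#include <cstdlib>

// Hand-written stand-in for one stub function. The library is opened on the
// first call only; every later call reuses the cached function pointer.
extern "C" double lazy_cos(double x) {
    using cos_fn = double (*)(double);
    static cos_fn real = [] {
        void* handle = dlopen("libm.so.6", RTLD_LAZY);
        void* sym = handle ? dlsym(handle, "cos") : nullptr;
        if (sym == nullptr) {
            // A missing library is only detected here, at first use.
            std::fprintf(stderr, "lazy load failed: %s\n", dlerror());
            std::abort();
        }
        return reinterpret_cast<cos_fn>(sym);
    }();
    return real(x);
}
```

The trade-off is the same as with delay loading on Windows: a missing library is no longer a startup error but a failure at the first call.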

Qt: different scenario if loading dll failed

I believe the task I am trying to accomplish is fairly easy, but I have still managed to run into a problem, so some help is appreciated.
I have a base class (in a header and a source file), let's call it WorkerBase, which I subclass twice in my program. One of the subclasses, WorkerA, is fairly trivial, whilst the other, WorkerB, depends on third-party libraries that in turn depend on the hardware the program is running on. If the hardware is unsuitable or those libraries are missing, using that subclass fails. In this case, I would like to use WorkerA.
So, basically, how do I detect a library loading failure? Right now the main program simply won't start if the library is missing.
I am using Qt and the program is going to be Windows-only. Thank you.
I think the simplest solution would be to move the Workers into plugins.
You can load a plugin dynamically at run time with QPluginLoader.
Then, while the WorkerB plugin would still be statically linked to the 3rd party lib, if WorkerB's dependencies are not satisfied the plugin will simply fail to load; you would catch that with QPluginLoader and load the WorkerA plugin instead.
The other way is to reinvent the wheel and write your own plugin system using QLibrary (with QPluginLoader as a reference implementation).
Possibly that helps
http://developer.qt.nokia.com/doc/qt-4.8/qlibrary.html
bool QLibrary::load ()
Loads the library and returns true if the library was loaded successfully; otherwise returns false. Since resolve() always calls this function before resolving any symbols it is not necessary to call it explicitly. In some situations you might want the library loaded in advance, in which case you would use this function.

Catching calls from a program in runtime and mapping them to other calls

A program usually depends on several libraries and might sometimes depend on other programs as well. I look at projects like Wine and think how do they figure out what calls a program is making?
In a Linux environment, what are the approaches used to know what calls an executable is making in runtime in order to catch and map them to other calls?
Any code snippets or references to resources for extra reading is greatly appreciated :)
On Linux you're looking for the LD_PRELOAD environment variable. This will load your libraries before any requested by the program. If you provide a function definition that matches one loaded by the target program then your version will be called instead.
You can't really detect which functions a program is calling, however. What you can do is get all the functions in a shared library and implement all of those. You aren't really catching the functions; you are simply reimplementing them.
Projects like Wine do this in some cases, but not in all. They also rewrite some of the dynamic libraries. So when a Win32 program loads some DLL it is actually loading the Wine version and not the native version. This is essentially the same concept of replacing the functions with your own.
Look up LD_PRELOAD for more information.
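A minimal sketch of such a replacement, assuming glibc on Linux: the file defines its own puts with the same signature as libc's, counts the call, and forwards to the original via dlsym(RTLD_NEXT, ...). The counter name is made up for illustration.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // for RTLD_NEXT on glibc
#endif
#include <dlfcn.h>
#include <cstdio>

static int g_puts_calls = 0;  // how many calls were intercepted

// Same name and signature as libc's puts, so the dynamic linker binds
// callers to this definition when it comes first in the search order.
extern "C" int puts(const char* s) {
    using puts_fn = int (*)(const char*);
    // RTLD_NEXT finds the next definition after this one, i.e. the real puts.
    static puts_fn real_puts =
        reinterpret_cast<puts_fn>(dlsym(RTLD_NEXT, "puts"));
    ++g_puts_calls;
    return real_puts ? real_puts(s) : EOF;
}
```

Built as a shared object (for example g++ -shared -fPIC hook.cpp -o libhook.so -ldl) and run with LD_PRELOAD=./libhook.so ./target, the same definition would sit in front of libc for an unmodified target program.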

are runtime linking library globals shared among plugins loaded with dlopen?

I have a C++ program that links at runtime with, let's say, mylib.so. Then the same program uses dlopen()/dlsym() to load a function from myplugin.so, a dynamic library that in turn depends on mylib.so.
My question is: will the program AND the function in the plugin access the same globals defined in mylib.so, in the same memory area reserved for the program, or will each be assigned different, unrelated copies in its own memory space? If the latter is the default behaviour, is it possible to change that?
Thanks in advance =)!
Globals in the main program that does the dlopen should be visible to the code that is dynamically loaded. However, the best advice I've seen to date (especially if you ever want to have even vaguely portable code) is to only have function calls be passed across the linker divide, and to not export any variables in either direction. It's also best if there is an API for the loaded code to register the interesting parts of its API with the loader (e.g., "Here is how I provide this SPI for drawing foobars on a baz") as that's a much saner way of doing callbacks rather than just mashing everything together.
[EDIT]: The other reason for doing this is if you're simulating weak linking on a platform that doesn't support it. That's a lot like the other one I list, except that it is the main program that is building the SPI out of the API exported by the dynamic library rather than the .so exporting it explicitly on startup. It's second best really, but you make do with what you've got rather than wishing (well, unless you're prepared to do the work by writing some sort of connection library).
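The registration approach described above might look like the following sketch. All names are hypothetical, and the plugin half is kept in the same file only so the example is self-contained; in a real setup it would live in the .so, and the host would find plugin_init with dlsym after dlopen.

```cpp
#include <map>
#include <string>

// Only function pointers cross the boundary; no variables are exported.
using draw_fn = int (*)(int);

// Host side: the table of registered services lives in the host only.
std::map<std::string, draw_fn>& registry() {
    static std::map<std::string, draw_fn> table;
    return table;
}

// Host side: exported with C linkage so a plugin can call it.
extern "C" void register_drawer(const char* name, draw_fn fn) {
    registry()[name] = fn;
}

// Plugin side: the implementation being offered to the host.
static int draw_foobar(int baz) { return baz * 2; }

// Plugin side: the entry point the host would look up after loading.
extern "C" void plugin_init() {
    register_drawer("foobar", &draw_foobar);
}

// Host side: call a registered service by name; returns -1 if it is missing.
int draw(const std::string& name, int baz) {
    auto it = registry().find(name);
    return it == registry().end() ? -1 : it->second(baz);
}
```

Because the plugin hands over its functions explicitly, neither side depends on how the loader happens to resolve global variables.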