Check compatibility of dynamic library at runtime - c++

I am developing a C++ application which is required to load a dynamic library at runtime using dlopen. This library generally won't be written by me.
What method do people recommend to ensure future binary compatibility between this library and my application?
The options as I see them are:
Put the version number in the library file name, and attempt to load it (through a symbolic link) no matter what. If dlopen fails, report an error.
Maintain a second interface which returns a version number. However, if this interface changes for some reason, we run into the same problems as before.
Are there any other options?

You should define a convention about the dynamically loaded (i.e. dlopen-ed) library.
You might have the convention that the library is required to provide a const char mylib_version_str[]; symbol which gives the version of the API, and so on. Of course you could have your own preprocessor tricks to help with this.
For your inspiration, you might look at what GCC requires from its plugins (e.g. the plugin_is_GPL_compatible symbol).
If the dynamically loaded library is in C++, you might use demangling to check the signature of functions....
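A minimal sketch of the version-symbol convention described above. It assumes the library exports mylib_version_str with C linkage and formats it as "major.minor"; that format and the helper names parse_major and plugin_api_major are assumptions made for illustration, not an established API.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose (POSIX)
#include <cassert>
#include <cstdlib>   // std::strtol

// Parse the leading major number from a "major.minor" string.
// Returns -1 if the string does not start with a non-negative number.
int parse_major(const char* version) {
    assert(version != nullptr);
    char* end = nullptr;
    long major = std::strtol(version, &end, 10);
    if (end == version || major < 0) return -1;
    return static_cast<int>(major);
}

// Load the library, read its version symbol and return the API major
// version, or -1 if the library or the symbol is missing.
int plugin_api_major(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) return -1;
    // The symbol is a char array, so dlsym yields the address of its first char.
    const char* version =
        static_cast<const char*>(dlsym(handle, "mylib_version_str"));
    // Parse before dlclose: the string lives inside the loaded library.
    int major = (version != nullptr) ? parse_major(version) : -1;
    dlclose(handle);
    return major;
}
```

The application would refuse to use the library when the returned major number differs from the one it was built against. On some systems the program has to be linked with -ldl.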

Why not use both options? For instance, a few libraries already do that (Lua, for example: the old DLL is Lua51.dll, then there is Lua52 and so on, and you can also query its version).
A good interface can change, but not often. Why should two simple static methods
const char* getLibraryName();
uint32 getLibraryVersion();
change over time?
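As a sketch of how the caller might use such a version number, assume (this is not stated above) that getLibraryVersion() packs major, minor and patch into one 32-bit value and that only a change of the major number signals a breaking change. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packing: 8 bits major, 8 bits minor, 16 bits patch.
std::uint32_t make_version(std::uint32_t major, std::uint32_t minor,
                           std::uint32_t patch) {
    assert(major < 256 && minor < 256 && patch < 65536);
    return (major << 24) | (minor << 16) | patch;
}

std::uint32_t version_major(std::uint32_t v) { return v >> 24; }
std::uint32_t version_minor(std::uint32_t v) { return (v >> 16) & 0xFFu; }

// Compatible when the major numbers match and the library is at least
// as new (by minor number) as what the application requires.
bool is_compatible(std::uint32_t library, std::uint32_t required) {
    return version_major(library) == version_major(required) &&
           version_minor(library) >= version_minor(required);
}
```

The application would call is_compatible(getLibraryVersion(), make_version(...)) right after loading and report an error if it returns false.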

If you (or they) are using libtool to build the library or application, you may recommend its versioning scheme: http://www.gnu.org/software/libtool/manual/libtool.html#Versioning
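libtool describes a library with three numbers, current:revision:age, meaning the library implements every interface number from current - age through current. A small sketch of that rule; the struct and function names are made up for illustration.

```cpp
#include <cassert>
#include <string>

struct LibtoolVersion {
    unsigned current;   // newest interface number implemented
    unsigned revision;  // implementation revision of the current interface
    unsigned age;       // how many older interfaces are still supported
};

// True if the library still implements the interface number the
// application was linked against.
bool supports_interface(const LibtoolVersion& v, unsigned iface) {
    assert(v.age <= v.current);
    return iface >= v.current - v.age && iface <= v.current;
}

// File name suffix libtool typically produces on Linux:
// (current - age).age.revision
std::string linux_suffix(const LibtoolVersion& v) {
    assert(v.age <= v.current);
    return std::to_string(v.current - v.age) + "." +
           std::to_string(v.age) + "." + std::to_string(v.revision);
}
```

With -version-info 3:12:1, for example, the library serves applications linked against interface 2 or 3, and the Linux file name ends in .so.2.1.12.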

Related

Qt6 crc32 name collision

Qt + C++ + C gurus, I could use some advice:
I started writing a Qt6 GUI C++ app that includes some extern C code. Much to my surprise, upon launch the ui->setupUi(this) call immediately segfaulted and I found myself inside of one of the extern C functions that was imported! This is literally at the first line of my MainWindow constructor - with no calls to any of the C code even added yet.
The C function where the segfault dumped me was named "crc32". I temporarily renamed it (and all calls to it) to test my theory and sure enough the program started correctly afterwards. So... Qt appears to invoke "crc32" when doing things like adding QMenu items... due to some name collision.
I'm puzzled how to fix it. I don't want to change the C code because it's actually a git submodule for a project I do not want to maintain a fork of forever. Since it's Qt calling it from the C++ side, I can't exactly specify the namespace on the call either.
Weird sidenote: When I was doing testing this morning with Qt5 (instead of 6), I was not running into this issue. The crc32 name collision may be specific to Qt6.
Looks like you are on Linux (or similar) and encountering a violation of the One Definition Rule. When the shared lib loader tries to load symbols from a shared lib, it will first check to see if it already has one with that name loaded. If so, it skips it. Thus, if you try to load two symbols with an identical name, you'll only get one of them, and then problems can ensue.
Possible options:
Hide symbols you don't need in the library. For this, consider using GCC's visibility controls.
Separate the symbols into different executables and share data with some interprocess communication like shared mem or a socket.
Use dlopen() on one of the libs along with the RTLD_LOCAL flag so it does not override the symbol.
Add a namespace to the project (or use its build options to control that, if available -- like Boost does). Yes, this violates your wish not to modify it, but it could be done with a patch or similar that does not modify the original checkout if you switched to using a package manager like Conan instead of a git submod.
crc32 is a symbol from zlib. Qt uses either the system zlib or a built-in version of it. If it uses the built-in version, symbols are prefixed with z_ (so z_crc32) to avoid exactly such a clash.
So, if you want to use crc32, you have the options to configure Qt to either not use zlib at all (-no-zlib), or use the built-in version (-qt-zlib).
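To illustrate what the symbol-hiding and namespace options amount to, here is a sketch assuming GCC or Clang. The namespace name vendored is hypothetical, and the checksum body is a plain bitwise CRC-32 written for this example, not the submodule's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace vendored {

// Hidden visibility (GCC/Clang) keeps the function out of the dynamic
// symbol table. The namespace also changes the mangled name, so the
// loader cannot mistake it for zlib's crc32.
__attribute__((visibility("hidden")))
std::uint32_t crc32(const unsigned char* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < length; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Reflected CRC-32 polynomial 0xEDB88320.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

}  // namespace vendored
```

For C sources that should stay untouched, a similar effect can usually be had from the build system alone, for example by compiling the submodule with -fvisibility=hidden. Whether that is enough depends on how the code is linked into the application.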

Get available library functions at runtime

I am working with dynamic linked libraries (.dll) on windows or shared objects (.so) on Linux.
My goal is to write some code that can, given the absolute path of the library on disk, return a list of all exported functions (the export table) of that library and ultimately call those functions. This should work on Windows (with DLLs) as well as on Linux (with shared objects).
I am writing a kind of wrapper that delegates function calls to the respective library. Therefore I receive a path, a function name, and a list of parameters which I then want to forward. The thing is: I want to know whether the given function exists before trying to call it.
From here I found a platform-independent way of opening and closing the library as well as getting a pointer to the function with the given name.
So the thing that remains is getting the names of available functions in the first place.
On this topic, I found this question dealing with the same kind of problem only that it asks for a Linux specific solution. In the given answer it is said
There is no libc function to do that. However, you can write one yourself (or copy/paste the code from a tool like readelf).
This clearly indicates that there are tools to do what I am looking for. The only question is, is there one that can work on windows as well as on Linux? If not how would I go about this on my own?
Here is a C# implementation (actually this is the code I want to port to C++) doing what I want (though windows only). To me, this appears as if the library structure is handled manually. If this is the way to go then where can I find the necessary information on the library structure?
So, on unixoids (and both Linux and WinNT have a POSIX subsystem), the dlopen function can be used to load a dynamic library, and dlsym to get function pointers to known symbols by name.
Getting a list of symbols was, as far as I know, never something POSIX bothered to specify. Long story short, the functions that can do that for you on Linux are specific to the libc used there (mostly GNU libc), and on Windows to the libc used there. Portable code means having two different codebases for two different libcs!
If you don't want to depend on your libc, you'd have to have a binary object parser (for ELF shared libraries on Linux, PE on Windows) to read out symbol names from the files. There's actually plenty of those – obviously, WINE has one for PE that is portable (especially works on Linux as well), and every linker (including glibc's runtime linker) under Linux can parse ELF files.
Personally, radare2 is a fine reverse-engineering framework with plenty of language bindings that is actually meant to analyze binary files, and give you exported symbols (as well as being capable of extracting non-exported functions, constructing call-graphs etc). It does have debugger, i.e. jumping-into-functions capabilities, too.
So, knowing now that
I am writing a kind of wrapper that delegates function calls to the respective library. Therefore I receive a path, a function name, and a list of parameters which I then want to forward. The thing is: I want to know whether the given function exists before trying to call it.
things become way easier: you don't actually need to get the list of exports. It's easier and just as fast to simply try.
So, on any POSIX system (and that includes both Windows' POSIX subsystem and Linux), dlopen will open the library and load the symbol table, and dlsym will look up a symbol from that table. If that symbol is not in the table, it simply returns NULL. So, you already have all the tables you think you need; just not explicitly, but queryable.
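A sketch of the "simply try" approach. The function name has_symbol is made up for this example. The non-Windows branch uses dlopen/dlsym as described above; the Windows branch uses LoadLibraryA/GetProcAddress from the native API instead, which is an addition not taken from the answer.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

// Returns true if the library at `path` can be loaded and exports `name`.
bool has_symbol(const char* path, const char* name) {
    assert(path != nullptr && name != nullptr);
#ifdef _WIN32
    HMODULE lib = LoadLibraryA(path);
    if (lib == nullptr) return false;
    bool found = GetProcAddress(lib, name) != nullptr;
    FreeLibrary(lib);
    return found;
#else
    void* lib = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    if (lib == nullptr) return false;
    dlerror();                // clear any stale error state
    (void)dlsym(lib, name);   // a symbol's value may legitimately be NULL,
    bool found = (dlerror() == nullptr);  // so ask dlerror() instead
    dlclose(lib);
    return found;
#endif
}
```

Note that loading a library runs its initialisation code, so a real wrapper would keep the handle open and reuse it for the actual call rather than load the library twice.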

Should I use a Static Library or DLL if I'm creating an API?

I'm creating an API to make it easier to find specifications about your system (CPU, GPU, etc.) using C++. This API will be distributed using GitHub.
In this scenario, which type of library should I use: Static or DLL (Dynamic)? Also, what are the pros and cons of each?
It depends on a few things, but like in the previous comments, you can leave the choice to the user.
A DLL is easier to integrate since the user just has to do a LoadLibrary() to start using it.
The problem with the DLL is that if you compiled it without using the default libraries as static and if you don't provide the redistributables, the user will get frustrated because of SxS problems. It may just so happen that there may not be any issues with SxS but you never know.
If you give out the lib file, it is possible that some compilation options may conflict with yours, unless you only use vanilla options.
All in all, at the end of the day, both options are viable and similar. Just depends on how the user wants to use your API.

Load-time dynamic link library dispatching

I'd like my Windows application to be able to reference an extensive set of classes and functions wrapped inside a DLL, but I need to be able to guide the application into choosing the correct version of this DLL before it's loaded. I'm familiar with using dllexport / dllimport and generating import libraries to accomplish load-time dynamic linking, but I cannot seem to find any information on the interwebs with regard to possibly finding some kind of entry point function into the import library itself, so I can, specifically, use CPUID to detect the host CPU configuration, and make a decision to load a particular DLL based on that information. Even more specifically, I'd like to build 2 versions of a DLL, one that is built with /ARCH:AVX and takes full advantage of SSE - AVX instructions, and another that assumes nothing is available newer than SSE2.
One requirement: Either the DLL must be linked at load-time, or there needs to be a super easy way of manually binding the functions referenced from outside the DLL, and there are many, mostly wrapped inside classes.
Bonus question: Since my libraries will be cross-platform, is there an equivalent for Linux based shared objects?
I recommend that you avoid dynamic resolution of your DLL from your executable if at all possible, since it is just going to make your life hard, especially since you have a lot of exposed interfaces and they are not pure C.
Possible Workaround
Create a "chooser" process that presents the necessary UI for deciding which DLL you need, or maybe it can even determine it automatically. Let that process move whatever DLL has been decided on into the standard location (and name) that your main executable is expecting. Then have the chooser process launch your main executable; it will pick up its DLL from your standard location without having to know which version of the DLL is there. No delay loading, no wonkiness, no extra coding; very easy.
If this just isn't an option for you, then here are your starting points for delay loading DLLs. It's a much rockier road.
Windows
LoadLibrary() to get the DLL in memory: https://msdn.microsoft.com/en-us/library/windows/desktop/ms684175(v=vs.85).aspx
GetProcAddress() to get pointer to a function: https://msdn.microsoft.com/en-us/library/windows/desktop/ms683212(v=vs.85).aspx
OR possibly special delay-loaded DLL functionality using a custom helper function, although there are limitations and potential behavior changes.. never tried this myself: https://msdn.microsoft.com/en-us/library/151kt790.aspx (suggested by Igor Tandetnik and seems reasonable).
Linux
dlopen() to get the SO in memory: http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html
dlsym() to get pointer to a function: http://man7.org/linux/man-pages/man3/dlsym.3.html
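For the CPU-based choice itself, a sketch assuming GCC or Clang on x86. The file-name pattern (lib<base>_avx.so, lib<base>_sse2.so) and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Build the file name of the library variant to load.
std::string pick_library(const std::string& base, bool has_avx) {
    assert(!base.empty());
    return "lib" + base + (has_avx ? "_avx.so" : "_sse2.so");
}

// CPU feature test. __builtin_cpu_supports is a GCC/Clang builtin for x86;
// on other compilers or architectures fall back to the baseline build.
bool cpu_has_avx() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;
#endif
}
```

The result of pick_library("mylib", cpu_has_avx()) is what would then be passed to dlopen(), or to LoadLibrary() with a .dll naming scheme on Windows.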
To add to qexyn's answer, one can mimic delay loading on Linux by generating a small static stub library which calls dlopen on the first call to any of its functions and then forwards execution to the shared library. Such a stub library can be generated automatically by a custom project-specific script or by Implib.so:
# Generate stub
$ implib-gen.py libxyz.so
# Link it instead of -lxyz
$ gcc myapp.c libxyz.tramp.S libxyz.init.c

Any tutorial for embedding Clang as script interpreter into C++ Code?

I have no experience with LLVM or Clang yet. From what I read (the Wikipedia article on Clang), Clang is said to be easily embeddable; however, I did not find any tutorials about how to achieve this. So is it possible to provide the user of a C++ application with scripting powers by JIT compiling and executing user-defined code at runtime? Would it be possible to call the application's own classes and methods and share objects?
edit: I'd prefer a C-like syntax for the script language (or even C++ itself)
I don't know of any tutorial, but there is an example C interpreter in the Clang source that might be helpful. You can find it here: http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/clang-interpreter/
You probably won't have much of a choice of syntax for your scripting language if you go this route. Clang only parses C, C++, and Objective C. If you want any variations, you may have your work cut out for you.
I think this is exactly what you described:
http://root.cern.ch/drupal/content/cling
You can use clang as a library to implement JIT compilation as stated by other answers.
Then, you have to load up the compiled module (say, an .so library).
In order to accomplish this, you can use standard dlopen (unix) or LoadLibrary (windows) to load it, then use dlsym (unix) or GetProcAddress (windows) to dynamically reference compiled functions, say a "script" main()-like function whose name is known. Note that for C++ you would have to use mangled symbols.
A portable alternative is e.g. GNU's libltdl.
As an alternative, the "script" may run automatically at load time by implementing module init functions or putting some static code (the constructor of a C++ globally defined object would be called immediately).
The loaded module can directly call anything in the main application. Of course symbols are known at compilation time by using the proper main app's header files.
If you want to easily add C++ "plugins" to your program, and know the component interface a priori (say your main application knows the name and interface of a loaded class from its .h before the module is loaded in memory), after you dynamically load the library the class is available to be used as if it was statically linked. Just be sure you do not try to instantiate a class' object before you dlopen() its module.
Using static code also allows nice automatic plugin registration mechanisms to be implemented.
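A sketch of such a registration mechanism. All names here (PluginEntry, plugin_registry, Registrar, the "hello" plugin) are invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

using PluginEntry = std::function<std::string()>;

// A function-local static avoids depending on the order in which
// global objects in different translation units are constructed.
std::map<std::string, PluginEntry>& plugin_registry() {
    static std::map<std::string, PluginEntry> registry;
    return registry;
}

// Constructing a Registrar at namespace scope registers the plugin as
// soon as its module is initialised (at program start or on dlopen).
struct Registrar {
    Registrar(const std::string& name, PluginEntry entry) {
        assert(entry);  // an empty std::function would be useless
        plugin_registry()[name] = std::move(entry);
    }
};

// This line would normally live in the plugin's own source file.
static Registrar hello_registrar("hello", [] {
    return std::string("hello from plugin");
});
```

After the module is loaded, the main application can look up plugin_registry().at("hello") and call it. For a plugin loaded with dlopen, the registry function has to be resolved from the main application (for example by linking the executable with -rdynamic) so that both sides share one registry.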
I don't know about Clang but you might want to look at Ch:
http://www.softintegration.com/
This is described as an embeddable or stand-alone c/c++ interpreter. There is a Dr. Dobbs article with examples of embedding it here:
http://www.drdobbs.com/architecture-and-design/212201774
I haven't done more than play with it but it seems to be a stable and mature product. It's commercial, closed-source, but the "standard" version is described as free for both personal and commercial use. However, looking at the license it seems that "commercial" may only include internal company use, not embedding in a product that is then sold or distributed. (I'm not a lawyer, so clearly one should check with SoftIntegration to be certain of the license terms.)
I am not sure that embedding a C or C++ compiler like Clang is a good idea in your case, because the "script", that is the (C or C++) code fed in at runtime, can be arbitrary and so can crash the entire application. You usually don't want faulty user input to be able to crash your application.
Be sure to read What every C programmer should know about undefined behavior because it is relevant and applies to C++ also (including any "C++ script" used by your application). Notice that, unfortunately, a lot of UB doesn't crash the process (for example, a buffer overflow could corrupt some completely unrelated data).
If you want to embed an interpreter, choose something designed for that purpose, like Guile or Lua, and be careful that errors in the script don't crash the entire application. See this answer for a more detailed discussion of interpreter embedding.