C++ force unloading shared library - c++

I'm trying to create an application which reloads a shared library multiple times. But at some point in time, dlmopen fails with error
/usr/lib/libc.so.6: cannot allocate memory in static TLS block
Here is the minimal code reproducing this issue:
#include <dlfcn.h>
#include <cstdio>
#include <vector>
int main() {
    for (int i = 0; i < 100; ++i) {
        void *lib_so = dlmopen(LM_ID_NEWLM, "lib.so", RTLD_LAZY | RTLD_LOCAL);
        if (lib_so == NULL) {
            printf("Iteration %i loading failed: %s\n", i, dlerror());
            return 1;
        }
        dlclose(lib_so);
    }
    return 0;
}
And empty lib.cpp, compiled with
g++ -rdynamic -o test main.cpp -Wl,-rpath,. -ldl
g++ -fPIC -shared lib.cpp -o lib.so
Update
It seems that it crashes even with one thread. The question is: how can I force a library unload or a destruction of unused namespaces created with LM_ID_NEWLM?

There is a built-in limit on the number of link-map namespaces available to a process. It is documented only briefly, in this sentence of the dlmopen man page:
The glibc implementation supports a maximum of 16 namespaces.
Once you create a link map namespace, there is no support for 'erasing' it via any APIs. This is just the way it's designed, and there's no real way to get around that without editing the glibc source and adding some hooks.
Using namespaces to reload a library does not actually reload it: you are simply loading a new copy of the library. This is one of the use cases for namespaces. If you dlopen the same library multiple times, you get the same handle to the same library; if you load the second instance in a different namespace, you get a different handle. To actually reload a library, you need to unload it with dlclose, which unloads the library once the last remaining reference to it has been released.
If you want to attempt to 'force unload' a library, then you could try issuing multiple dlclose calls until it unloads; however if you don't know what the library has done (e.g. spawned threads) there may be no way of preventing a crash in that case.
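The reload-by-dlclose approach described above can be sketched roughly as follows. This is only a minimal illustration, not a drop-in fix: the helper name reload_cycles is made up for this example, and it assumes the library holds no extra references and starts no threads, so that dlclose can really release it.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: load and unload `path` `count` times in the default
// namespace (plain dlopen, no LM_ID_NEWLM). Returns the number of cycles
// that completed successfully.
int reload_cycles(const char *path, int count) {
    int done = 0;
    for (int i = 0; i < count; ++i) {
        void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        if (handle == nullptr) {
            std::fprintf(stderr, "iteration %d: dlopen failed: %s\n", i, dlerror());
            break;
        }
        // dlclose drops one reference; the object is unloaded only when the
        // last reference goes away.
        if (dlclose(handle) != 0) {
            std::fprintf(stderr, "iteration %d: dlclose failed: %s\n", i, dlerror());
            break;
        }
        ++done;
    }
    return done;
}
```

Because every iteration uses the default namespace, no new link-map namespace is consumed, so the 16-namespace limit should not come into play. On older glibc versions you may need to link with -ldl.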

Older glibc versions might have some bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=89692
https://sourceware.org/bugzilla/show_bug.cgi?id=14898
What version are you using? Try a newer glibc version; your code works fine on my computer (glibc 2.23).

Related

Multiple load (dlopen) and unload (dlclose) of the same shared object calls resulting in segmentation fault

In my code, I have a for loop where first I am calling dlopen to load a shared object, then calling a function of the loaded library, and then dlclose to unload it. The first iteration of the loop works as expected but during the second iteration (when i=1) dlopen call is causing segmentation fault (core dumped).
void *handle;
char* (*goImg)();
char *error;
int i;
for (i=0; i<5; i++) {
    handle = dlopen("mylib.so", RTLD_LAZY);
    if (!handle) {
        fputs (dlerror(), stderr);
        exit(1);
    }
    goImg = dlsym(handle, "writeToFileInWrapper");
    if ((error = dlerror()) != NULL) {
        fputs(error, stderr);
        exit(1);
    }
    goImg();
    if (!handle) {
        fputs (dlerror(), stderr);
        exit(1);
    }
    dlclose(handle);
}
Script to generate mylib.so:
echo "Building golang code to create an archive file."
go build -buildmode=c-archive -o bin/goHelper.a goHelper.go
echo "Building wrapperCodeInC to be consumed as a shared library."
gcc -c -fPIC -o bin/shared/wrapperCodeInC.o -I./bin -I./wrapper wrapper/wrapperCodeInC.c
gcc -s -shared -lpthread -Wl,-Bsymbolic -o bin/mylib.so -Wl,--whole-archive bin/goHelper.a -Wl,--no-whole-archive bin/shared/wrapperCodeInC.o
Here, goHelper.go has few functions written in go language and wrapperCodeInC.c has the wrapper functions to invoke those go functions.
In the first run of the loop, dlopen(), goImg(), and dlclose() work as expected, but during the second run (i=1), dlopen dumps core. Any idea what could be causing this?
Note: If I remove -Wl,-Bsymbolic from the build file, then I get an error similar to this issue: https://github.com/golang/go/issues/30822
If I add flag RTLD_NODELETE in dlopen call (dlopen("mylib.so", RTLD_LAZY | RTLD_NODELETE )), then all the iterations run fine but I am not sure if that is the right thing to do.
dlclose doesn't work on Go shared libraries, and it's been an open issue since 2015. The root of the issue seems to be that Go offers no way to gracefully terminate any background threads that the runtime might have started, and which might still be running at the time you call dlclose.
Besides that limitation, dlclose is apparently not required to do much of anything anyway.
See also Using Go in C limitations.
Thus, it seems that you may as well call attention to the library's lack of "unloadability" by using RTLD_NODELETE.
If this is part of a more general "plug-in" system, where you could have libraries written in several different languages, then you might not want to apply RTLD_NODELETE to all of them. In that case, you could try using the -z nodelete linker option when creating the shared library. Then the code that calls dlopen doesn't need to be aware of the quirks of any particular .so file it loads. (There's code in Go to add that parameter automatically; I think it's used if you use go tool link to do your linking instead of running gcc directly.)
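For what it's worth, here is a small sketch of the RTLD_NODELETE approach from the caller's side. The helper names open_pinned and is_resident are made up for this example; RTLD_NODELETE and RTLD_NOLOAD are flags that glibc provides, and the sketch assumes a glibc system.

```cpp
#include <dlfcn.h>

// Hypothetical helper: open a library so that a later dlclose() releases the
// handle but does not unmap the library's code or data.
void *open_pinned(const char *path) {
    return dlopen(path, RTLD_LAZY | RTLD_NODELETE);
}

// Hypothetical helper: report whether `path` is currently loaded, without
// loading it. RTLD_NOLOAD makes dlopen return NULL if it is not resident.
bool is_resident(const char *path) {
    void *probe = dlopen(path, RTLD_LAZY | RTLD_NOLOAD);
    if (probe == nullptr) {
        return false;
    }
    dlclose(probe);  // drop the extra reference taken by the probe
    return true;
}
```

With this, calling dlclose on a handle from open_pinned is harmless: the Go runtime threads (if any) keep running against code that stays mapped, and is_resident can be used to confirm that the library is still there.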

Writing a plugin system?

After many hours of research I have turned up nothing, so I turn to you good folks in hopes of a solution. I am going to be writing a bot in C++, and at some point would like to make a plugin system for it. I know I could just write a scripting language for it; however, I know it's possible to write an API and have the program link to plugins dynamically at run time. My question is, how do I get that dynamic linkage (like what hexchat has for its plugins)? Are there any elegant solutions, or at least theories on the typical design?
On Linux and Posix systems, you want to use dlopen(3) & dlsym (or some libraries wrapping these functions, e.g. Glib from GTK, Qt, POCO, etc...). More precisely,
Build a position independent code shared library as your plugin:
gcc -fPIC -Wall -c plugin1.c -o plugin1.pic.o
gcc -fPIC -Wall -c plugin2.c -o plugin2.pic.o
Notice that if the plugin is coded in C++ you'll compile it with g++ and you should declare the plugin functions as extern "C" to avoid name mangling.
Then link your plugin as
gcc -shared -Wall plugin1.pic.o plugin2.pic.o -o plugin.so
You may add dynamic libraries (e.g. a -lreadline at end of command above if your plugin wants GNU readline).
At last, call dlopen with a full path in your main program, e.g.
void* dlh = dlopen("./plugin.so", RTLD_NOW);
if (!dlh) {
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
    exit(EXIT_FAILURE);
}
(dlh is often a global variable)
Then use dlsym to get the function pointers. So declare their signature in some header included both by program and plugin code like
typedef int readerfun_t (FILE*);
declare some (often) global function pointers
readerfun_t* readplugfun;
and use dlsym on the plugin handle dlh:
readplugfun = (readerfun_t*) dlsym(dlh, "plugin_reader");
if (!readplugfun) {
    fprintf(stderr, "dlsym failed: %s\n", dlerror());
    exit(EXIT_FAILURE);
}
Of course in your plugin source code (e.g. in plugin1.cc) you'll define
extern "C" int plugin_reader (FILE*inf) { // etc...
You might define some constructor (or destructor) functions in your plugin (see GCC function attributes); they would be called at dlopen (or dlclose) time. In C++ you should simply use static objects (their constructors are called at dlopen time and their destructors at dlclose time; hence the names of the function attributes).
At the end of your program call
dlclose(dlh), dlh = NULL;
In practice, you can make a very large number of dlopen calls (perhaps a million).
You generally want to link your main program with -rdynamic to let its symbols be visible from plugins.
gcc -rdynamic prog1.o prog2.o -o yourprog -ldl
Read Program Library HowTo & C++ dlopen mini HowTo & Drepper's paper: How to Write a Shared Library
The most important part is to define and document a plugin convention (i.e. "protocol"), that is a set (and API) of functions (to be dlsym-ed) required in your plugin and how to use them, in which order they are called, what is the memory ownership policy, etc. If you allow several similar plugins, you might have some well documented hooks in your main program which calls all the dlsym-ed functions of relevant dlopen-ed plugins. Examples: GCC plugins conventions, GNU make modules, Gedit plugins, ...
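The dlopen/dlsym/dlclose steps above can also be wrapped in a small C++ class so the handle is closed automatically. This is just one possible sketch: the class name Plugin and its symbol() member are invented for this example and are not part of any library.

```cpp
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper: owns one dlopen handle and closes it when the
// object goes out of scope.
class Plugin {
public:
    explicit Plugin(const std::string &path)
        : handle_(dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL)) {
        if (handle_ == nullptr) {
            const char *msg = dlerror();
            throw std::runtime_error(msg ? msg : "dlopen failed");
        }
    }
    ~Plugin() { dlclose(handle_); }

    // One owner per handle: copying is disabled.
    Plugin(const Plugin &) = delete;
    Plugin &operator=(const Plugin &) = delete;

    // Look up `name` and cast it to the function-pointer type Fn.
    // The caller is responsible for Fn matching the plugin convention.
    template <typename Fn>
    Fn symbol(const char *name) const {
        dlerror();  // clear any stale error message
        void *sym = dlsym(handle_, name);
        if (const char *err = dlerror()) {
            throw std::runtime_error(err);
        }
        return reinterpret_cast<Fn>(sym);
    }

private:
    void *handle_;
};
```

Usage would look something like Plugin p("./plugin.so"); auto reader = p.symbol<readerfun_t*>("plugin_reader"); with readerfun_t coming from the shared header. Function pointers obtained this way must not be called after the Plugin object has been destroyed.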

Internal exceptions in shared library terminate end user application

I am building a shared library which uses Boost.thread internally. As a result, Boost.system is also used since Boost.thread depends on that. My shared library exports a C interface, so I want to hide all my internal exception handling and thread usage etc from the end user. It is supposed to be a black box so to speak. However, when I link with a client application, while the program runs fine - as soon as it is time to stop the processing by invoking a library function I get:
terminate called after throwing an instance of 'boost::thread_interrupted'
I catch this exception internally in the library, so I have no idea why it is not actually being caught. The end user's program is not meant to know about or handle Boost exceptions in any way. When building the shared library, I use static linking for both Boost.thread and Boost.system so the outside world is never meant to see them. I am on GCC 4.7 on Ubuntu 12. On Windows, I have no such problems (with either MSVC or MinGW).
(EDIT)
I am editing the question to show a minimalistic example that reproduces the problem, as per the requests in the comments.
Here first is the code for testlib.cpp and testlib.h.
testlib.cpp:
#include <boost/thread/thread.hpp>
void thread_func()
{
    while(1)
    {
        boost::this_thread::interruption_point();
    }
}
void do_processing()
{
    // Start a thread that will execute the function above.
    boost::thread worker(thread_func);
    // We assume the thread started properly for the purposes of this example.
    // Now let's interrupt the thread.
    worker.interrupt();
    // And now let's wait for it to finish.
    worker.join();
}
And now testlib.h:
#ifndef TESTLIB_H
#define TESTLIB_H
void do_processing();
#endif
I build this into a shared library with the following command:
g++ -static-libgcc -static -s -DNDEBUG -I /usr/boost_1_54_0 -L /usr/boost_1_54_0/stage/lib -Wall -shared -fPIC -o libtestlib.so testlib.cpp -lboost_thread -lboost_system -lpthread -O3
Then, I have the code for a trivial client program which looks as follows:
#include "testlib.h"
#include <cstdio>
int main()
{
do_processing();
printf("Execution completed properly.\n");
return 0;
}
I build the client as follows:
g++ -DNDEBUG -I /usr/boost_1_54_0 -L ./ -Wall -o client client.cpp -ltestlib -O3
When I run the client, I get:
terminate called after throwing an instance of 'boost::thread_interrupted'
Aborted (core dumped)
I am not explicitly catching the thread interruption exception, but according to the Boost documentation Boost.thread does that and terminates the given thread only. I tried explicitly catching the exception from within the thread_func function, but that made no difference.
(End OF EDIT)
(EDIT 2)
It is worth noting that even with -fexceptions turned on, the problem still persists. Also, I tried to throw and catch an exception that is defined in the same translation unit as the code that catches and throws it, with no improvement. In short, all exceptions appear to remain uncaught in the shared library even though I definitely have catch handlers for them. When I compile the client file and the testlib file as part of a single program, that is to say without making testlib into a shared library, everything works as expected.
(End OF EDIT 2)
Any tips?
I finally figured it out. The -static flag should never be specified when -shared is specified. My belief was that it merely told the linker to prefer static versions of libraries that it links, but instead it makes the generated dynamic library unsuitable for dynamic linking which is a bit ironic. But there it is. Removing -static solved all my problems, and I am able to link Boost statically just fine inside my dynamic library which handles exceptions perfectly.
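Separately from the -static fix, a library that exports a C interface usually wants to make sure no exception can escape through an exported function. Below is a minimal sketch of that pattern. It does not address the linking problem itself; the names testlib_process and internal_work, and the error codes, are made up for this example.

```cpp
#include <exception>
#include <stdexcept>

namespace {
// Stand-in for internal library work that may throw (hypothetical).
void internal_work(int input) {
    if (input < 0) {
        throw std::invalid_argument("negative input");
    }
}
}  // namespace

// Exported C entry point: returns 0 on success and a non-zero code on
// failure, so callers never see a C++ exception.
extern "C" int testlib_process(int input) noexcept {
    try {
        internal_work(input);
        return 0;
    } catch (const std::exception &) {
        return 1;  // failure derived from std::exception
    } catch (...) {
        return 2;  // any other exception type
    }
}
```

The catch (...) branch is there for exception types that do not derive from std::exception. Of course this only helps once exception handling works inside the .so at all, which is what removing -static fixed.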
Maybe this?
If you have a library L which throws E, then both L and the
application A MUST be linked against X, the library containing the
definition of E.
Try to link executable against boost, too.
A shared library that itself includes statically linked libraries is not such a good idea, and I don't think that this scenario is well supported in the GNU toolchain.
I think that your particular problem arises from the option -static-libgcc, but I've been unable to compile it on my machine with your options. Not that linking libpthread statically into a shared library sounds like such a good idea either... What will happen if the main executable wants to create its own threads? Will it be compiled with -pthread? If it is, then you will link the thread functions twice; if it isn't, it will have the functions but not the preprocessor macros nor the thread-safe library functions.
My advice is simply not to compile your library statically, that's just not the Linux way.
Actually that should not be a real problem, even if you don't want to rely on the distribution version of boost: compile your program against the shared boost libraries and deploy all these files (libboost_thread.so.1.54.0, libboost_system.so.1.54.0 and libtestlib.so) to the same directory. Then run your program with LD_LIBRARY_PATH=<path-to-so-files>. Since the client is not intended to use boost directly, it needs neither the boost headers nor the boost libraries on its compiler command line. You still have your black box, but now it is formed by 3 .so files instead of just 1.

C++ error: undefined reference to 'clock_gettime' and 'clock_settime'

I am pretty new to Ubuntu, and I can't seem to get this to work. It works fine on my school computers and I don't know what I am missing. I have checked /usr/include and time.h is there just fine. Here is the code:
#include <iostream>
#include <time.h>
using namespace std;
int main()
{
    timespec time1, time2;
    int temp;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time1);
    //do stuff here
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time2);
    return 0;
}
I am using CodeBlocks as my IDE to build and run as well. Any help would be great, thank you.
Add -lrt to the end of g++ command line. This links in the librt.so "Real Time" shared library.
example:
c++ -Wall filefork.cpp -lrt -O2
For gcc version 4.6.1, -lrt must be after filefork.cpp otherwise you get a link error.
Some older gcc versions don't care about the position.
Since glibc version 2.17, linking with -lrt is no longer required.
The clock_* functions are now part of the main C library. The glibc 2.17 change history explains the reason for this change:
+* The `clock_*' suite of functions (declared in <time.h>) is now available
+ directly in the main C library. Previously it was necessary to link with
+ -lrt to use these functions. This change has the effect that a
+ single-threaded program that uses a function such as `clock_gettime' (and
+ is not linked with -lrt) will no longer implicitly load the pthreads
+ library at runtime and so will not suffer the overheads associated with
+ multi-thread support in other code such as the C++ runtime library.
If you decide to upgrade glibc, then you can check the compatibility tracker of glibc if you are concerned whether there would be any issues using the newer glibc.
To check the glibc version installed on the system, run the command:
ldd --version
(Of course, if you are using old glibc (<2.17) then you will still need -lrt.)
I encountered the same error. My linker command did have the rt library included -lrt which is correct and it was working for a while. After re-installing Kubuntu it stopped working.
A separate forum thread suggested the -lrt needed to come after the project object files.
Moving the -lrt to the end of the command fixed this problem for me although I don't know the details of why.
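For completeness, here is a small sketch of how the two clock_gettime readings from the question might be turned into an elapsed time. The helper names elapsed_ns and cpu_time_ns are made up for this example.

```cpp
#include <time.h>
#include <cstdint>

// Nanoseconds between two timespec readings (end - start).
std::int64_t elapsed_ns(const timespec &start, const timespec &end) {
    return (static_cast<std::int64_t>(end.tv_sec) - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}

// CPU time consumed by this process while running `work`, in nanoseconds.
// Returns -1 if either clock_gettime call fails.
template <typename Work>
std::int64_t cpu_time_ns(Work work) {
    timespec t0{}, t1{};
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0) != 0) {
        return -1;
    }
    work();
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1) != 0) {
        return -1;
    }
    return elapsed_ns(t0, t1);
}
```

If this lived in a file called, say, timing.cpp, then on glibc older than 2.17 it would be built with something like g++ -Wall timing.cpp -lrt, with -lrt placed after the source file as discussed above; on newer glibc the -lrt can be dropped.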

Using libtool to load a duplicate function name from a shared library

I'm trying to create a 'debug' shared library (i.e., .so or .dll file) that calls another 'real' shared library that has the same C API as the debug library (in this case, to emulate the PKCS#11 API). However, I'm running into trouble where the link map of the debug library collides with that of the real library, causing the debug library to call its own functions instead of the corresponding functions in the real library. I found a solution to this problem by using the dlmopen function (a Solaris extension that glibc also provides; it is not part of POSIX), but would like to understand if the same is possible using GNU's libtool.
On my Solaris 10 system, the following code fails the assertion when a test application statically links to the debug library:
#include <dlfcn.h>
int MyFunctionName() {
    int (*function_ptr)();
    void *handle = dlopen("realsharedlibrary.so", RTLD_LAZY);
    *(void **)(&function_ptr) = dlsym(handle, "MyFunctionName");
    ASSERT(function_ptr != MyFunctionName); // Fails
    return (*function_ptr)();
}
In this case, I get a function pointer to the local 'MyFunctionName' (in the debug library) instead of MyFunctionName within the real shared library.
I've discovered that it's possible to get around this problem by using the command 'dlmopen' instead of 'dlopen', and telling dlmopen to create a new link map (with the LM_ID_NEWLM parameter) when loading the real library:
int MyFunctionName() {
    int (*function_ptr)();
    void *handle = dlmopen(LM_ID_NEWLM, "realsharedlibrary.so", RTLD_LAZY);
    *(void **)(&function_ptr) = dlsym(handle, "MyFunctionName");
    ASSERT(function_ptr != MyFunctionName); // succeeds
    return function_ptr(); // call real function
}
Unfortunately, dlmopen does not seem to be included within libtool (i.e., I don't see an lt_dlmopen function in libtool).
Is it possible to do the same thing using libtool commands -- that is, to create a new link map when loading the new library so that it doesn't collide with the link map of the debug library?
I haven't found a good way to use libtool to solve this problem yet, but there's a way to avoid the Solaris-specific 'dlmopen' function by using dlopen with these flags:
void *handle = dlopen("realsharedlibrary.so", RTLD_NOW | RTLD_GROUP | RTLD_LOCAL);
Apparently, the symbol-collision problem is solved by using RTLD_NOW instead of RTLD_LAZY and by adding RTLD_GROUP. The RTLD_LOCAL is there because POSIX leaves the default unspecified when neither RTLD_LOCAL nor RTLD_GLOBAL is given. On Solaris, the default is RTLD_LOCAL.
The open question, though, is whether it's possible to pass these types of flags to lt_dlopen.