Delay initialisation of static variables in a 3rd party library - c++

I'm linking against a 3rd party library that uses static variables. These end up getting initialised before main and grab resources prematurely, causing some havoc in my application. Is there any idiom, technique or wrapping method to regain control and define the point in execution where the library is allowed to initialise all of its static variables, without hacking at the library itself?
Specifically, I have a thirdpartylib::system object that, once defined in main, grabs all sorts of resources before main is entered. The compiler sees that the code can be hit, and then goes about initialising all of its static vars outside the control of the library consumer. Ideally, I'd like some kind of guard to stop this until I say so, like:
// my code that may exit before I want the lib stuff to be invoked
{
LET_SYSTEM_RUN_RIOT();
thirdpartylib::system sys;
// do some stuff with it
KILL_IT_ALL_WITH_FIRE();
}

The only thing you can do is build it as a dynamic library and load it at runtime via dlopen/LoadLibrary. Then you are in complete control of when the library initializes itself. By linking statically, you are conceptually making the library part of your application, which means it will initialize as part of your application, i.e. before your main function.
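A minimal sketch of that approach on POSIX, assuming the vendor ships the library as a shared object. The class name LazyLibrary and any path passed to it are hypothetical, not part of any real library. The point is that the library's static initialisers run inside dlopen, so they run when load() is called and not before main:

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose (POSIX)
#include <string>
#include <utility>

// Hypothetical RAII wrapper. The library's static initialisers run inside
// dlopen(), i.e. at the moment load() is called, not before main().
class LazyLibrary {
public:
    explicit LazyLibrary(std::string path) : path_(std::move(path)) {}
    ~LazyLibrary() { unload(); }
    LazyLibrary(const LazyLibrary&) = delete;
    LazyLibrary& operator=(const LazyLibrary&) = delete;

    // Returns false if the library could not be loaded.
    bool load() {
        if (!handle_) handle_ = dlopen(path_.c_str(), RTLD_NOW | RTLD_LOCAL);
        return handle_ != nullptr;
    }

    // Looks up an exported symbol; nullptr if not loaded or not found.
    void* symbol(const char* name) const {
        return handle_ ? dlsym(handle_, name) : nullptr;
    }

    // Unloading normally runs the library's static destructors.
    void unload() {
        if (handle_) {
            dlclose(handle_);
            handle_ = nullptr;
        }
    }

    bool loaded() const { return handle_ != nullptr; }

private:
    std::string path_;
    void* handle_ = nullptr;
};
```

Note that dlsym only hands back raw addresses, so a C++ type such as thirdpartylib::system would probably have to be reached through an extern "C" factory function, if the library offers one.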

Related

Can destruction order be controlled across dynamic libraries?

I am encountering an issue where my library crashes because the executable that loads it calls my function after main exits. I'm wondering - can I control the lifecycle of my globals so that they are not destroyed until (and if) my library gets unloaded? I understand what I'm asking.
Basically, the executable looks something like this (roughly):
struct MyLibController
{
void *libhandle;
void (*myLibFunction)(); // pointer to my library's exported function
~MyLibController()
{
myLibFunction(); // Call to my library's exported function
dlclose(libhandle);
}
};
std::shared_ptr<MyLibController> globalPtr;
int main(int argc, const char **argv)
{
globalPtr = std::make_shared<MyLibController>();
// initializing globalPtr to dlopen my library, map function ptrs, etc.
// do some work with my library
return 0;
}
I have absolutely no control over the code in this executable.
My library code looks something like this:
SomeType globalObject;
// Exported function via c interface in my library
void myLibFunction()
{
// crash occurs when globalObject is used after the executable's main function exits
globalObject.someFunction();
// do some work
}
I have a lot of control over the library code - but this is a simple example. The SomeType globalObject is very necessary (suppose that it's a mutex, used to sync myLibFunction and a bunch of others).
I'd like to ensure that my globalObject is valid even after executable's main function exits. Is this possible? If so, how?
P.S. I am aware that I can dynamically allocate the globalObject and leak it, which resolves the crash. It feels wrong though, and I don't want to sign off on it.
You can register a callback that runs when main() returns using std::atexit(): http://en.cppreference.com/w/cpp/utility/program/atexit
For example, when your library is loaded, use atexit() to register, then when that callback fires, set a flag for yourself that you check before trying to do anything else. If the flag is set, ignore any other actions from the caller, because the program is shutting down.
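A rough sketch of that flag, assuming the library can register the callback from one of its own global constructors. The names g_shutting_down, on_process_exit and ExitHookRegistrar are made up for illustration, and myLibFunction is changed to return a bool here only so the effect is visible:

```cpp
#include <atomic>
#include <cstdlib>

namespace {
std::atomic<bool> g_shutting_down{false};

// Runs when main() returns or std::exit() is called.
void on_process_exit() { g_shutting_down.store(true); }

// Registers the callback while the library's globals are being initialised.
struct ExitHookRegistrar {
    ExitHookRegistrar() { std::atexit(on_process_exit); }
} g_registrar;
}  // namespace

// Returns true if the work was done, false if it was skipped during shutdown.
extern "C" bool myLibFunction() {
    if (g_shutting_down.load()) return false;
    // ... use globalObject here ...
    return true;
}
```

Handlers registered with atexit run in reverse order of registration, interleaved with static destructors, so whether the flag is set early enough depends on when the library gets loaded relative to the caller's globals.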
Your best bet is to make your global object a reference counted singleton of some sort. Then update your interface to instantiate an object from your loaded library. This object can then grab a reference to the global object during construction, and thus only releases it after it has been destroyed. The global variable itself would also merely have a reference, thus during the dlclose() process the global reference would be released, but your object would still have a reference until it is destroyed.
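Something along these lines, as a sketch only. SomeType is a stand-in for the question's type, and LibHandle and doWork are hypothetical names for the object the caller would instantiate:

```cpp
#include <memory>
#include <mutex>

// Stand-in for the question's SomeType (assumed here to wrap a mutex).
struct SomeType {
    std::mutex lock;
    int calls = 0;
};

// The global only holds one reference; releasing it does not destroy the
// state while any handle is still alive.
std::shared_ptr<SomeType> g_state = std::make_shared<SomeType>();

// Hypothetical handle type that callers obtain from the library.
class LibHandle {
public:
    LibHandle() : state_(g_state) {}

    // Does some synchronised work and returns how often it has been called.
    int doWork() {
        std::lock_guard<std::mutex> guard(state_->lock);
        return ++state_->calls;
    }

private:
    std::shared_ptr<SomeType> state_;  // keeps SomeType alive
};
```

This only helps if the handle is created while the global reference is still alive; a handle constructed after g_state has been released would hold a null pointer.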
@o11c's comment provided a working solution:
Using __attribute__((constructor)) and __attribute__((destructor)) allows you to specify a priority.
In other words, I can have the control over lifecycle of my variables despite binary's main function exiting (I just have to dynamically allocate them and free them).
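For illustration, a sketch of what that might look like with GCC or Clang (the attributes are compiler extensions). SomeType is a stand-in, libInit and libFini are made-up names, and myLibFunction returns a bool here only to show the guard:

```cpp
#include <mutex>

// Stand-in for the question's SomeType.
struct SomeType {
    std::mutex lock;
    void someFunction() { std::lock_guard<std::mutex> guard(lock); }
};

// A plain pointer is constant-initialised, so it is valid before any
// constructor function runs.
static SomeType* globalObject = nullptr;

// Runs when the shared object is loaded (GCC/Clang extension).
__attribute__((constructor)) static void libInit() {
    globalObject = new SomeType;
}

// Runs when the shared object is unloaded or the process exits.
__attribute__((destructor)) static void libFini() {
    delete globalObject;
    globalObject = nullptr;
}

extern "C" bool myLibFunction() {
    if (!globalObject) return false;  // already torn down
    globalObject->someFunction();
    return true;
}
```

How the destructor function is ordered against destructors in other modules depends on the platform's loader, so this is worth verifying on the actual target.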

How can I access a static variable in program P from dlopen'd library L?

I have a library (L) that is dynamically loaded by a program (P) using dlopen. L implements a plugin interface and so calls back on its parent to obtain some functionality.
Inside P is a singleton object that dynamically creates a thread pool object A.
I need access to A from L.
However, because the singleton works by using a static variable, when L is loaded it ends up creating its own instance, which in some cases would be fine, but I want the instance that was created in P. Is there a way around this?
You should not have a static A in L. Let P pass the address of A to L, i.e., L.init(&A).
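As a sketch, with a made-up ThreadPool type and hypothetical entry points plugin_init and plugin_do_work (the real plugin interface will look different):

```cpp
#include <cstddef>

// Shared interface, declared in a header that both P and L include.
struct ThreadPool {
    std::size_t submitted = 0;
    void submit() { ++submitted; }
};

// --- code that would live in L (the plugin) ---

// Borrowed from P; L never creates or owns a pool of its own.
static ThreadPool* g_pool = nullptr;

// P calls this once, right after dlopen/dlsym, passing its own instance.
extern "C" void plugin_init(ThreadPool* pool) { g_pool = pool; }

// Returns false if P has not handed over its pool yet.
extern "C" bool plugin_do_work() {
    if (!g_pool) return false;
    g_pool->submit();
    return true;
}
```

On P's side this would amount to looking up plugin_init with dlsym and calling it with the address of A.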
File scope names declared static have internal linkage. Internal linkage means that they are invisible to other translation units, even in a "classic" linking model without any dynamic libraries. Given that statics are not visible even to other translation units in the same executable, it is not reasonable to expect them to be visible from attached dynamic libraries.
You have to think of a way to achieve the necessary linkage using external, dynamic symbols. Perhaps the singleton simply cannot have an internal name, but must have an external name.
L is creating its own instance of the object not just because the object is static, but because you have linked into L the thread pool module which defines that singleton and the thread pool functions. This can happen even with objects that have external names, depending on how the library is linked.
You must pick a single object in which the thread pool service will reside, and then make sure it only resides there. Doesn't your project have a utility library where you can stick in this sort of thing?
You can adhere to the model that it is the program executable P which provides the thread pool API. This is really the same thing. The program P is another dynamic object and effectively serves as the library for the thread pool module, which it provides to itself and to other shared objects.
Regardless of where that thread pool module lives, make sure that you are not statically linking copies of that module into other objects: it lives just in one place.
If the thread pool singleton's external name is part of that API (everyone knows its documented name and uses it directly, passing that global pool to the API functions) then that name should be made external, and declared in the header file.
If the singleton is to be private, then you have to think of some way of hiding it, like making it implicit in the function calls (there is only one thread pool, and that is that) or else abstracting the access to it somewhat (provide an ensure_thread_pool function which creates a thread pool if one does not exist, or else returns the previously created one, in a thread-safe way).
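One possible shape for such an ensure_thread_pool function, assuming C++11 or later. The ThreadPool class and the worker count are placeholders:

```cpp
#include <cstddef>

class ThreadPool {
public:
    std::size_t size() const { return workers_; }

private:
    friend ThreadPool& ensure_thread_pool();
    explicit ThreadPool(std::size_t workers) : workers_(workers) {}
    std::size_t workers_;
};

// Meant to be defined in exactly one module (a .cpp file, not inline in a
// header) and exported from it. Since C++11 the local static is initialised
// exactly once, even if several threads make the first call concurrently.
ThreadPool& ensure_thread_pool() {
    static ThreadPool pool(4);  // 4 workers is an arbitrary placeholder
    return pool;
}
```

Every other shared object would then call the exported function and never link in its own copy of the definition.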
Think: why do not, for instance, stdin and stdout have this problem? Why doesn't every library instantiate its own stdout stream and call its own fprintf function on that stream? Why, obviously, because these things live in one place: the C library. Copies of them do not live in other places; other places just use them by reference via the dynamic symbols.

Can I create a second instance of a singleton in a DLL?

I have a static library which contains singletons. I need to load a separate instance of those singletons in the same process for testing purposes.
So I have created a DLL which links the same static library, and then the main process loads that DLL.
As soon as the DLL tries to load, I get access violations when trying to access the static instance pointers in the singletons.
Some posts that I have read say that it's impossible and that I need a second process, while others say that each DLL gets its own copies of all the static variables in the static library it links, which suggests that this should work.
Is what I am trying to do possible?
Most of the time a singleton is really meant to be only one - your request is unusual.
I know that linking a static library into a DLL can result in multiple instances of static variables, because I've seen it myself. Each DLL or EXE gets its own copy of the static library via the linker, and thus its own copy of the static variables.
The access violations may come from problems with initialization order. The best way to control that is to make sure the static variables are within a function that initializes them just-in-time, rather than global variables.
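A small sketch of the just-in-time idea, using two made-up singletons (Logger and Config) where one depends on the other:

```cpp
#include <string>
#include <vector>

// Each singleton lives in a function-local static, so it is constructed the
// first time it is requested rather than in unspecified cross-file order.
class Logger {
public:
    static Logger& instance() {
        static Logger logger;
        return logger;
    }
    void log(const std::string& line) { lines_.push_back(line); }
    const std::vector<std::string>& lines() const { return lines_; }

private:
    Logger() = default;
    std::vector<std::string> lines_;
};

class Config {
public:
    static Config& instance() {
        static Config config;
        return config;
    }

private:
    // Safe even if Config is requested first: Logger is built on demand.
    Config() { Logger::instance().log("config created"); }
};
```

With plain global instances in separate source files, Config's constructor could run before Logger exists; here the call to Logger::instance() forces Logger to be constructed first.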

Should LD_PRELOAD load module or just use module to replace symbols

We have a multi-threaded C++ app compiled with g++ running on an embedded PowerPC. To test it for memory leaks in a continuous integration run, we've created a heap analyzer that gets loaded with LD_PRELOAD.
We'd like to guarantee that a function in the ld_preloaded module gets called before anything else happens (including creation of static objects etc...). Even more crucially we'd like to have another function that gets called right before the process exits so the heap analyzer can output its results. The problem we see is that a vector in our application is being created at global file scope before anything happens in our ld_preloaded module. The vector grows in size within main. At shutdown the destructor function in our preloaded module is called before the vector is destroyed.
Is there any way we can code a preloaded module to run a function before anything else and after everything else? We've tried using __attribute__((constructor)) and destructor without success.
Returning to the question title, I'm beginning to suspect that ld only looks in the preloaded module when resolving symbols for subsequent module loads. It doesn't actually load the preloaded module first. Can anyone shed any light on this for us?
Originally, you would have no control over the order of constructors from different translation units. So, this extends to shared libraries as well.
However, newer versions of GCC support applying a priority parameter to the constructor attribute, which should allow you some control over when your specified function will run in relation to other global constructors. When no priority is specified, the constructor is treated as having the largest priority number, so it runs last. So any priority number you set below that should make your constructor run before them, and your destructor after them.
#include <stdio.h>
static void initialize (void) __attribute__((constructor(101)));
static void deinitialize (void) __attribute__((destructor(101)));
static void initialize (void) {
puts("initialized");
}
static void deinitialize (void) {
puts("deinitialized");
}
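101 appears to be the lowest priority number you are allowed to specify (GCC reserves 0 to 100 for the implementation). 65535 is the highest. Constructors with lower numbers are executed first; destructors run in the reverse order.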
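To see the ordering within a single translation unit, here is a small sketch for GCC or Clang. All names are made up, and the behaviour across separately loaded shared objects is not covered by it:

```cpp
// Plain ints are zero-initialised before any constructor runs, so they are
// safe to write to from early constructor functions.
static int g_order[3];
static int g_count = 0;

// Deliberately defined out of order: priority, not source order, decides.
__attribute__((constructor(102))) static void second() { g_order[g_count++] = 102; }
__attribute__((constructor(101))) static void first()  { g_order[g_count++] = 101; }

// An ordinary global object has no explicit priority, so it is constructed
// after the prioritised functions above. It records 0 as its marker.
struct Recorder {
    Recorder() { g_order[g_count++] = 0; }
};
static Recorder g_recorder;
```

By the time main starts, g_order should read 101, 102, 0.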

How to link non thread-safe library so each thread will have its own global variables from it?

I have a program that I link with many libraries. I ran my application under a profiler and found out that most of the time is spent in a "waiting" state after some network requests.
Those requests are the result of my code calling sleeping_function() from an external library.
I call this function in a loop which executes many, many times so all waiting times sum up to huge amounts.
As I cannot modify the sleeping_function() I want to start a few threads to run a few iterations of my loop in parallel. The problem is that this function internally uses some global variables.
Is there a way to tell linker on SunOS that I want to link specific libraries in a way that will place all variables from them in Thread Local Storage?
I don’t think you’ll be able to achieve this with just the linker, but you might be able to get something working with some code in C.
The problem is that a call to load a library that is already loaded will return a reference to the already loaded instance instead of loading a new copy. A quick look at the documentation for dlopen and LoadLibrary seems to confirm that there’s no way to load the same library more than once, at least not if you want the image to be prepared for execution. One way to circumvent this would be to prevent the OS from knowing that it is the same library. To do this you could make a copy of the file.
Some pseudo code, just replace calls to sleeping_function with calls to call_sleeping_function_thread_safe:
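#include <dlfcn.h>
#include <pthread.h>

char *shared_lib_name;

/* Hypothetical helpers, not shown: copy the library to a unique path and
   return that path; remove the copy again. */
char *make_copy_of_file(const char *path);
void delete_file(const char *path);

void *sleeping_function_thread_init(void *arg);

void call_sleeping_function_thread_safe(void)
{
    pthread_t pthread;
    char *new_file_name = make_copy_of_file(shared_lib_name);
    pthread_create(&pthread, NULL, sleeping_function_thread_init, new_file_name);
}

void *sleeping_function_thread_init(void *arg)
{
    char *lib_name = arg;
    /* Each thread loads its own copy, so it gets its own set of globals. */
    void *lib_handle = dlopen(lib_name, RTLD_NOW | RTLD_LOCAL);
    void (*sleeping_function)(void) =
        (void (*)(void))dlsym(lib_handle, "sleeping_function");
    while (...)
        sleeping_function();
    dlclose(lib_handle);
    delete_file(lib_name);
    return NULL;
}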
On Windows, dlopen becomes LoadLibrary and dlsym becomes GetProcAddress etc., but the basic idea would still work.
In general, this is a bad idea. Global data isn't the only issue that may prevent a non thread-safe library from running in a multithreaded environment.
As one example, what if the library had a global variable that points to a memory-mapped file that it always maps at a single, hardcoded address? In this case, with your technique, you would have one global variable per thread, but they would all point to the same memory location, which would be trashed by multi-threaded access.