Is lazy loading in linux possible? [duplicate] - c++

I've been searching a way to make a shared library (let's name the library libbar.so) delay loaded on Linux and it should hopefully be realized with a help from only a linker, not modifying anything on the source code written in C++; I mean I don't want to invoke dlopen() nor dlsym() in the source code of the parent library (let's name it libfoo.so) to invoke a function of libbar.so because they make the source code messy and the maintenance process difficult. (In short, I'm expecting to go on the similar way to Visual Studio's /DELAYLOAD option even on Linux)
Anyway, I've found some uncertain information pieces related to my question on the internet so far, so it would be very nice to have the answers from you all for the following questions to make the information clear.
Does GNU ld support any delay loading mechanism on Linux?
If it doesn't, how about Clang?
Is the dlopen() family the only way to make a shared library delay loaded on Linux?
I tested to pass -zlazy flag to GCC (g++) with a path to the library, it seemed to accept the flag but the behavior did not look making libbar.so delay loaded (Not having libbar.so, I was expecting to have an exception at the first call of libbar.so, but the exception actually raised before entering to libfoo.so). On the other hand, Clang (clang++) left a warning that it ignored the option flag.
Best regards,

Delay loading is NOT a runtime feature. MSVC++ implemented it without help from Windows. And like dlopen is the only way on Linux, GetProcAddress is the only runtime method on Windows.
So, what is delay loading then? It's very simple: Any call to a DLL has to go through a pointer (since you don't know where it will load). This has always been handled by the compiler and linker for you. But with delay loading, MSVC++ sets this pointer initially to a stub that calls LoadLibrary and GetProcAddress for you.
Clang can do the same without help from ld. At runtime, it's just an ordinary dlopen call, and Linux cannot determine that Clang inserted it.

This functionality can be achieved in a portable way using Proxy design pattern.
In code it may look something like this:
#include <memory>
// SharedLibraryProxy.h
struct SharedLibraryProxy
{
virtual ~SharedLibraryProxy() = 0;
// Shared library interface begin.
virtual void foo() = 0;
virtual void bar() = 0;
// Shared library interface end.
static std::unique_ptr<SharedLibraryProxy> create();
};
// SharedLibraryProxy.cc
struct SharedLibraryProxyImp : SharedLibraryProxy
{
void* shared_lib_ = nullptr;
void (*foo_)() = nullptr;
void (*bar_)() = nullptr;
SharedLibraryProxyImp& load() {
// Platform-specific bit to load the shared library at run-time.
if(!shared_lib_) {
// shared_lib_ = dlopen(...);
// foo_ = dlsym(...)
// bar_ = dlsym(...)
}
return *this;
}
void foo() override {
return this->load().foo_();
}
void bar() override {
return this->load().bar_();
}
};
SharedLibraryProxy::~SharedLibraryProxy() {}
std::unique_ptr<SharedLibraryProxy> SharedLibraryProxy::create() {
return std::unique_ptr<SharedLibraryProxy>{new SharedLibraryProxyImp};
}
// main.cc
int main() {
auto shared_lib = SharedLibraryProxy::create();
shared_lib->foo();
shared_lib->bar();
}

To add to MSalters answer, one can easily mimic Windows approach to lazy loading on Linux by creating a small static stub library that would try to dlopen needed library on first call to any of it's functions (emitting diagnostic message and terminating if dlopen failed) and then forwarding all calls to it.
Such stub libraries can be written by hand, generated by project/library-specific script or generated by universal tool Implib.so:
$ implib-gen.py libxyz.so
$ gcc myapp.c libxyz.tramp.S libxyz.init.c ...

Related

How to make a shared library delay loaded on Linux

I've been searching a way to make a shared library (let's name the library libbar.so) delay loaded on Linux and it should hopefully be realized with a help from only a linker, not modifying anything on the source code written in C++; I mean I don't want to invoke dlopen() nor dlsym() in the source code of the parent library (let's name it libfoo.so) to invoke a function of libbar.so because they make the source code messy and the maintenance process difficult. (In short, I'm expecting to go on the similar way to Visual Studio's /DELAYLOAD option even on Linux)
Anyway, I've found some uncertain information pieces related to my question on the internet so far, so it would be very nice to have the answers from you all for the following questions to make the information clear.
Does GNU ld support any delay loading mechanism on Linux?
If it doesn't, how about Clang?
Is the dlopen() family the only way to make a shared library delay loaded on Linux?
I tested to pass -zlazy flag to GCC (g++) with a path to the library, it seemed to accept the flag but the behavior did not look making libbar.so delay loaded (Not having libbar.so, I was expecting to have an exception at the first call of libbar.so, but the exception actually raised before entering to libfoo.so). On the other hand, Clang (clang++) left a warning that it ignored the option flag.
Best regards,
Delay loading is NOT a runtime feature. MSVC++ implemented it without help from Windows. And like dlopen is the only way on Linux, GetProcAddress is the only runtime method on Windows.
So, what is delay loading then? It's very simple: Any call to a DLL has to go through a pointer (since you don't know where it will load). This has always been handled by the compiler and linker for you. But with delay loading, MSVC++ sets this pointer initially to a stub that calls LoadLibrary and GetProcAddress for you.
Clang can do the same without help from ld. At runtime, it's just an ordinary dlopen call, and Linux cannot determine that Clang inserted it.
This functionality can be achieved in a portable way using Proxy design pattern.
In code it may look something like this:
#include <memory>
// SharedLibraryProxy.h
struct SharedLibraryProxy
{
virtual ~SharedLibraryProxy() = 0;
// Shared library interface begin.
virtual void foo() = 0;
virtual void bar() = 0;
// Shared library interface end.
static std::unique_ptr<SharedLibraryProxy> create();
};
// SharedLibraryProxy.cc
struct SharedLibraryProxyImp : SharedLibraryProxy
{
void* shared_lib_ = nullptr;
void (*foo_)() = nullptr;
void (*bar_)() = nullptr;
SharedLibraryProxyImp& load() {
// Platform-specific bit to load the shared library at run-time.
if(!shared_lib_) {
// shared_lib_ = dlopen(...);
// foo_ = dlsym(...)
// bar_ = dlsym(...)
}
return *this;
}
void foo() override {
return this->load().foo_();
}
void bar() override {
return this->load().bar_();
}
};
SharedLibraryProxy::~SharedLibraryProxy() {}
std::unique_ptr<SharedLibraryProxy> SharedLibraryProxy::create() {
return std::unique_ptr<SharedLibraryProxy>{new SharedLibraryProxyImp};
}
// main.cc
int main() {
auto shared_lib = SharedLibraryProxy::create();
shared_lib->foo();
shared_lib->bar();
}
To add to MSalters answer, one can easily mimic Windows approach to lazy loading on Linux by creating a small static stub library that would try to dlopen needed library on first call to any of it's functions (emitting diagnostic message and terminating if dlopen failed) and then forwarding all calls to it.
Such stub libraries can be written by hand, generated by project/library-specific script or generated by universal tool Implib.so:
$ implib-gen.py libxyz.so
$ gcc myapp.c libxyz.tramp.S libxyz.init.c ...

Is it safe to use exceptions in an injected class across dll/so?

I have a dll or shared object, MyDll which produces an object, ForeignObject:
class ForeignObject : public LocalInterface {
virtual void some_func(Injector& i) {
i.inject();
}
};
In the interfaces looks like this.
class Injector {
virtual void inject() = 0;
};
class LocalInterface {
virtual void some_func(Injector& i) = 0;
};
Then I do the following:
class MyInjector : public Injector {
virtual void inject() {throw std::runtime_error("arrgghh");}
};
int main() {
LocalInterface* foreign_obj = create_object_from_my_dll();
MyInjector injector;
foreign_obj->some_func(injector);
destroy_object_from_my_dll(foreign_obj);
return 0;
}
My question is, why is it safe/not safe to throw an exception like this. Is it safe when they are built with the same compiler? Between compilers?
I would say the answer will depend on the OS. But for all the UNICES I've worked with (including Linux and Mac OS X), it will be safe. Of course, you need to compile with compilers using the same ABI, but the ABI of g++ hasn't changed in years (can't remember exactly, but I think 3.2 was the last big change). Also, in the model used in Linux, linkage don't impact code behaviour. So you can safely put any code in a dynamically linked file (i.e. shared object file), it will work as if it is linked statically.
On Windows, if you use VC++, then you won't be able to compile your library with another compiler. The program and the library must use the same version of VC++, there is no compatibility in either direction. If you are using g++, you can compile with another version of g++ (like on UNIX), but not with VC++. As for the question of it this is dangerous, I will let a windows-specialist answer as DLL have weird behaviour, and if I remember an object should always be destroyed within the scope of the DLL that loaded it, but I am not 100% sure.
It is safe if and only if you build it with the same compiler ( same runtime lib ) but in case of different runtime lib it will generate a memory corruption in your program and crash. In general there is not any special problem with exception but with a class exporting from dll. You exception will be exported and this is a problem.

Can I access to symbols of the host process from a shared object loaded in runtime? Any alternative?

In my scenario I want a plugin, which is a shared object loaded in runtime, to access symbols from the “host application” so I can add any functionality to my application.
I have tried but have not found any way to do this and I have no clue on whether this is possible or not. So, can I do this somehow or is there any alternative that applications that use plugins use?
I am on Fedora 15, Linux 2.6.4. However, I hope the solution will be cross-platform.
There are three main approaches:
Pass a structure of function pointers to the DLL from the application, giving access to whatever symbols you want to share. This is the most portable method, but is kind of a pain to create all the function pointers. Something like:
// In shared header
struct app_vtable {
void (*appFoo)();
};
// In plugin:
const app_vtable *vt;
void set_vtable(const app_vtable *vt_) {
vt = vt_;
}
void bar() {
vt->appFoo();
}
// In application:
void foo();
const app_vtable vt = { foo };
void loadplugin() {
void *plugin = dlopen("plugin.so", RTLD_LAZY);
void (*pset_vtable)(const app_vtable *) = dlsym(plugin, "set_vtable");
pset_vtable(&vt);
void (*pbar)() = dlsym(plugin, "bar");
pbar();
}
Move your application into a library, and have the executable simply link in this library and call an entry point in it. Then your plugins can link the same library and access its symbols easily. This is also quite portable, but can result in some performance loss due to the need to use position-independent code in your main app library (although you may be able to get away with a fixed mapping in this case, depending on your architecture).
On Linux only (and possible other ELF platforms) you can use -rdynamic to export symbols from the application executable directly. However this isn't very portable to other platforms - in particular, these is no equivalent to this on Windows.

dynamic_cast fails when used with dlopen/dlsym

Intro
Let me apologise upfront for the long question. It is as short as I could make it, which is, unfortunately, not very short.
Setup
I have defined two interfaces, A and B:
class A // An interface
{
public:
virtual ~A() {}
virtual void whatever_A()=0;
};
class B // Another interface
{
public:
virtual ~B() {}
virtual void whatever_B()=0;
};
Then, I have a shared library "testc" constructing objects of class C, implementing both A and B, and then passing out pointers to their A-interface:
class C: public A, public B
{
public:
C();
~C();
virtual void whatever_A();
virtual void whatever_B();
};
A* create()
{
return new C();
}
Finally, I have a second shared library "testd", which takes a A* as input, and tries to cast it to a B*, using dynamic_cast
void process(A* a)
{
B* b = dynamic_cast<B*>(a);
if(b)
b->whatever_B();
else
printf("Failed!\n");
}
Finally, I have main application, passing A*'s between the libraries:
A* a = create();
process(a);
Question
If I build my main application, linking against the 'testc' and 'testd' libraries, everything works as expected. If, however, I modify the main application to not link against 'testc' and 'testd', but instead load them at runtime using dlopen/dlsym, then the dynamic_cast fails.
I do not understand why. Any clues?
Additional information
Tested with gcc 4.4.1, libc6 2.10.1 (Ubuntu 9.10)
Example code available
I found the answer to my question here. As I understand it, I need to make the typeinfo available in 'testc' available to the library 'testd'. To do this when using dlopen(), two extra things need to be done:
When linking the library, pass the linker the -E option, to make sure it exports all symbols to the executable, not just the ones that are unresolved in it (because there are none)
When loading the library with dlopen(), add the RTLD_GLOBAL option, to make sure symbols exported by testc are also available to testd
In general, gcc does not support RTTI across dlopen boundaries. I have personal experience with this messing up try/catch, but your problem looks like more of the same. Sadly, I'm afraid that you need to stick to simple stuff across dlopen.
I have to add to this question since I encountered this problem as well.
Even when providing -Wl,-E and using RTLD_GLOBAL, the dynamic_casts still failed. However, passing -Wl,-E in the actual application's linkage as well and not only in the library seem to have fixed it.
If one have no control over the source of the main application, -Wl,-E is not applicable. Passing -Wl,-E to the linker while building own binaries (the host so and the plugins) do not help either.
In my case the only working solution was to load and unload my host so from the _init function of the host so itself using RTLD_GLOBAL flag (See code below). This solution works in both cases:
the main application links against the host so.
the main application loads host so using dlopen (without RTLD_GLOBAL).
In both cases one has to follow the instructions stated by gcc visibility wiki.
If one makes the symbols of the plugin and the host so visible to each other (by using #pragma GCC visibility push/pop or corresponding attribute) and loads the plugins (from the host so) by using RTLD_GLOBAL 1. will work also without loading and unloading the own so (as mentioned by link given above).
This solution makes 2. also work which has not been the case before.
// get the path to the module itself
static std::string get_module_path() {
Dl_info info;
int res = dladdr( (void*)&get_module_path, &info);
assert(res != 0); //failure...
std::string module_path(info.dli_fname);
assert(!module_path.empty()); // no name? should not happen!
return module_path;
}
void __attribute__ ((constructor)) init_module() {
std::string module = get_module_path();
// here the magic happens :)
// without this 2. fails
dlclose(dlopen(module.c_str(), RTLD_LAZY | RTLD_GLOBAL));
}

Alternatives to dlsym() and dlopen() in C++

I have an application a part of which uses shared libraries. These libraries are linked at compile time.
At Runtime the loader expects the shared object to be in the LD_LIBRARY_PATH , if not found the entire application crashes with error "unable to load shared libraries".Note that there is no guarantee that client would be having the library, in that case I want the application to leave a suitable error message also the independent part should work correctly.
For this purpose I am using dlsym() and dlopen() to use the API in the shared library. The problem with this is if I have a lot of functions in the API, i have to access them Individually using dlsym() and ptrs which in my case are leading to memory corruption and code crashes.
Are there any alternatives for this?
The common solution to your problem is to declare a table of function pointers, to do a single dlsym() to find it, and then call all the other functions through a pointer to that table. Example (untested):
// libfoo.h
struct APIs {
void (*api1)(void);
void *(*api2)(int);
long (*api3)(int, void *);
};
// libfoo.cc
void fn1(void) { ... }
void *fn2(int) { ... }
long fn3(int, void *) { ... }
APIs api_table = { fn1, fn2, fn3 };
// client.cc
#include "libfoo.h"
...
void *foo_handle = dlopen("libfoo.so", RTLD_LAZY);
if (!foo_handle) {
return false; // library not present
}
APIs *table = dlsym(foo_handle, "api_table");
table->api1(); // calls fn1
void *p = table->api2(42); // calls fn2
long x = table->api3(1, p); // calls fn3
P.S. Accessing your API functions individually using dlsym and pointers does not in itself lead to memory corruption and crashes. Most likely you just have bugs.
EDIT:
You can use this exact same technique with a 3rd-party library. Create a libdrmaa_wrapper.so and put the api_table into it. Link the wrapper directly against libdrmaa.so.
In the main executable, dlopen("libdrmaa_wrapper.so", RTLD_NOW). This dlopen will succeed if (and only if) libdrmaa.so is present at runtime and provides all API functions you used in the api_table. If it does succeed, a single dlsym call will give you access to the entire API.
You can wrap your application with another one which first checks for all the required libraries, and if something is missing it errors out nicely, but if everything is allright it execs the real application.
Use below type of code
Class DynLib
{
/* All your functions */
void fun1() {};
void fun2() {};
.
.
.
}
DynLib* getDynLibPointer()
{
DynLib* x = new Dynlib;
return x;
}
use dlopen() for loading this library at runtime.
and use dlsym() and call getDynLibPointer() which returns DynLib object.
from this object you can access all your functions jst as obj.fun1().....
This is ofcource a C++ style of struct method proposed earlier.
You are probably looking for some form of delay library load on Linux. It's not available out-of-the-box but you can easily mimic it by creating a small static stub library that would try to dlopen needed library on first call to any of it's functions (emitting diagnostic message and terminating if dlopen failed) and then forwarding all calls to it.
Such stub libraries can be written by hand, generated by project/library-specific script or generated by universal tool Implib.so:
$ implib-gen.py libxyz.so
$ gcc myapp.c libxyz.tramp.S libxyz.init.c ...
Your problem is that the resolution of unresolved symbols is done very early on - on Linux I believe the data symbols are resolved at process startup, and the function symbols are done lazily. Therefore depending on what symbols you have unresolved, and on what sort of static initialization you have going on - you may not get a chance to get in with your code.
My suggestion would be to have a wrapper application that traps the return code/error string "unable to load shared libraries", and then converts this into something more meaningful. If this is generic, it will not need to be updated every time you add a new shared library.
Alternatively you could have your wrapper script run ldd and then parse the output, ldd will report all libraries that are not found for your particular application.