Alternatives to dlsym() and dlopen() in C++

Alternatives to dlsym() and dlopen() in C++ - c++

I have an application a part of which uses shared libraries. These libraries are linked at compile time.
At Runtime the loader expects the shared object to be in the LD_LIBRARY_PATH , if not found the entire application crashes with error "unable to load shared libraries".Note that there is no guarantee that client would be having the library, in that case I want the application to leave a suitable error message also the independent part should work correctly.
For this purpose I am using dlsym() and dlopen() to use the API in the shared library. The problem with this is if I have a lot of functions in the API, i have to access them Individually using dlsym() and ptrs which in my case are leading to memory corruption and code crashes.
Are there any alternatives for this?

The common solution to your problem is to declare a table of function pointers, to do a single dlsym() to find it, and then call all the other functions through a pointer to that table. Example (untested):
// libfoo.h
struct APIs {
void (*api1)(void);
void *(*api2)(int);
long (*api3)(int, void *);
};
// libfoo.cc
void fn1(void) { ... }
void *fn2(int) { ... }
long fn3(int, void *) { ... }
APIs api_table = { fn1, fn2, fn3 };
// client.cc
#include "libfoo.h"
...
void *foo_handle = dlopen("libfoo.so", RTLD_LAZY);
if (!foo_handle) {
return false; // library not present
}
APIs *table = dlsym(foo_handle, "api_table");
table->api1(); // calls fn1
void *p = table->api2(42); // calls fn2
long x = table->api3(1, p); // calls fn3
P.S. Accessing your API functions individually using dlsym and pointers does not in itself lead to memory corruption and crashes. Most likely you just have bugs.
EDIT:
You can use this exact same technique with a 3rd-party library. Create a libdrmaa_wrapper.so and put the api_table into it. Link the wrapper directly against libdrmaa.so.
In the main executable, dlopen("libdrmaa_wrapper.so", RTLD_NOW). This dlopen will succeed if (and only if) libdrmaa.so is present at runtime and provides all API functions you used in the api_table. If it does succeed, a single dlsym call will give you access to the entire API.

You can wrap your application with another one which first checks for all the required libraries, and if something is missing it errors out nicely, but if everything is allright it execs the real application.

Use below type of code
Class DynLib
{
/* All your functions */
void fun1() {};
void fun2() {};
.
.
.
}
DynLib* getDynLibPointer()
{
DynLib* x = new Dynlib;
return x;
}
use dlopen() for loading this library at runtime.
and use dlsym() and call getDynLibPointer() which returns DynLib object.
from this object you can access all your functions jst as obj.fun1().....
This is ofcource a C++ style of struct method proposed earlier.

You are probably looking for some form of delay library load on Linux. It's not available out-of-the-box but you can easily mimic it by creating a small static stub library that would try to dlopen needed library on first call to any of it's functions (emitting diagnostic message and terminating if dlopen failed) and then forwarding all calls to it.
Such stub libraries can be written by hand, generated by project/library-specific script or generated by universal tool Implib.so:
$ implib-gen.py libxyz.so
$ gcc myapp.c libxyz.tramp.S libxyz.init.c ...

Your problem is that the resolution of unresolved symbols is done very early on - on Linux I believe the data symbols are resolved at process startup, and the function symbols are done lazily. Therefore depending on what symbols you have unresolved, and on what sort of static initialization you have going on - you may not get a chance to get in with your code.
My suggestion would be to have a wrapper application that traps the return code/error string "unable to load shared libraries", and then converts this into something more meaningful. If this is generic, it will not need to be updated every time you add a new shared library.
Alternatively you could have your wrapper script run ldd and then parse the output, ldd will report all libraries that are not found for your particular application.

Related

Is lazy loading in linux possible? [duplicate]

I've been searching a way to make a shared library (let's name the library libbar.so) delay loaded on Linux and it should hopefully be realized with a help from only a linker, not modifying anything on the source code written in C++; I mean I don't want to invoke dlopen() nor dlsym() in the source code of the parent library (let's name it libfoo.so) to invoke a function of libbar.so because they make the source code messy and the maintenance process difficult. (In short, I'm expecting to go on the similar way to Visual Studio's /DELAYLOAD option even on Linux)
Anyway, I've found some uncertain information pieces related to my question on the internet so far, so it would be very nice to have the answers from you all for the following questions to make the information clear.
Does GNU ld support any delay loading mechanism on Linux?
If it doesn't, how about Clang?
Is the dlopen() family the only way to make a shared library delay loaded on Linux?
I tested to pass -zlazy flag to GCC (g++) with a path to the library, it seemed to accept the flag but the behavior did not look making libbar.so delay loaded (Not having libbar.so, I was expecting to have an exception at the first call of libbar.so, but the exception actually raised before entering to libfoo.so). On the other hand, Clang (clang++) left a warning that it ignored the option flag.
Best regards,

Delay loading is NOT a runtime feature. MSVC++ implemented it without help from Windows. And like dlopen is the only way on Linux, GetProcAddress is the only runtime method on Windows.
So, what is delay loading then? It's very simple: Any call to a DLL has to go through a pointer (since you don't know where it will load). This has always been handled by the compiler and linker for you. But with delay loading, MSVC++ sets this pointer initially to a stub that calls LoadLibrary and GetProcAddress for you.
Clang can do the same without help from ld. At runtime, it's just an ordinary dlopen call, and Linux cannot determine that Clang inserted it.

This functionality can be achieved in a portable way using Proxy design pattern.
In code it may look something like this:
#include <memory>
// SharedLibraryProxy.h
struct SharedLibraryProxy
{
virtual ~SharedLibraryProxy() = 0;
// Shared library interface begin.
virtual void foo() = 0;
virtual void bar() = 0;
// Shared library interface end.
static std::unique_ptr<SharedLibraryProxy> create();
};
// SharedLibraryProxy.cc
struct SharedLibraryProxyImp : SharedLibraryProxy
{
void* shared_lib_ = nullptr;
void (*foo_)() = nullptr;
void (*bar_)() = nullptr;
SharedLibraryProxyImp& load() {
// Platform-specific bit to load the shared library at run-time.
if(!shared_lib_) {
// shared_lib_ = dlopen(...);
// foo_ = dlsym(...)
// bar_ = dlsym(...)
}
return *this;
}
void foo() override {
return this->load().foo_();
}
void bar() override {
return this->load().bar_();
}
};
SharedLibraryProxy::~SharedLibraryProxy() {}
std::unique_ptr<SharedLibraryProxy> SharedLibraryProxy::create() {
return std::unique_ptr<SharedLibraryProxy>{new SharedLibraryProxyImp};
}
// main.cc
int main() {
auto shared_lib = SharedLibraryProxy::create();
shared_lib->foo();
shared_lib->bar();
}

To add to MSalters answer, one can easily mimic Windows approach to lazy loading on Linux by creating a small static stub library that would try to dlopen needed library on first call to any of it's functions (emitting diagnostic message and terminating if dlopen failed) and then forwarding all calls to it.
Such stub libraries can be written by hand, generated by project/library-specific script or generated by universal tool Implib.so:
$ implib-gen.py libxyz.so
$ gcc myapp.c libxyz.tramp.S libxyz.init.c ...

How to make a shared library delay loaded on Linux

I've been searching a way to make a shared library (let's name the library libbar.so) delay loaded on Linux and it should hopefully be realized with a help from only a linker, not modifying anything on the source code written in C++; I mean I don't want to invoke dlopen() nor dlsym() in the source code of the parent library (let's name it libfoo.so) to invoke a function of libbar.so because they make the source code messy and the maintenance process difficult. (In short, I'm expecting to go on the similar way to Visual Studio's /DELAYLOAD option even on Linux)
Anyway, I've found some uncertain information pieces related to my question on the internet so far, so it would be very nice to have the answers from you all for the following questions to make the information clear.
Does GNU ld support any delay loading mechanism on Linux?
If it doesn't, how about Clang?
Is the dlopen() family the only way to make a shared library delay loaded on Linux?
I tested to pass -zlazy flag to GCC (g++) with a path to the library, it seemed to accept the flag but the behavior did not look making libbar.so delay loaded (Not having libbar.so, I was expecting to have an exception at the first call of libbar.so, but the exception actually raised before entering to libfoo.so). On the other hand, Clang (clang++) left a warning that it ignored the option flag.
Best regards,

Delay loading is NOT a runtime feature. MSVC++ implemented it without help from Windows. And like dlopen is the only way on Linux, GetProcAddress is the only runtime method on Windows.
So, what is delay loading then? It's very simple: Any call to a DLL has to go through a pointer (since you don't know where it will load). This has always been handled by the compiler and linker for you. But with delay loading, MSVC++ sets this pointer initially to a stub that calls LoadLibrary and GetProcAddress for you.
Clang can do the same without help from ld. At runtime, it's just an ordinary dlopen call, and Linux cannot determine that Clang inserted it.

This functionality can be achieved in a portable way using Proxy design pattern.
In code it may look something like this:
#include <memory>
// SharedLibraryProxy.h
struct SharedLibraryProxy
{
virtual ~SharedLibraryProxy() = 0;
// Shared library interface begin.
virtual void foo() = 0;
virtual void bar() = 0;
// Shared library interface end.
static std::unique_ptr<SharedLibraryProxy> create();
};
// SharedLibraryProxy.cc
struct SharedLibraryProxyImp : SharedLibraryProxy
{
void* shared_lib_ = nullptr;
void (*foo_)() = nullptr;
void (*bar_)() = nullptr;
SharedLibraryProxyImp& load() {
// Platform-specific bit to load the shared library at run-time.
if(!shared_lib_) {
// shared_lib_ = dlopen(...);
// foo_ = dlsym(...)
// bar_ = dlsym(...)
}
return *this;
}
void foo() override {
return this->load().foo_();
}
void bar() override {
return this->load().bar_();
}
};
SharedLibraryProxy::~SharedLibraryProxy() {}
std::unique_ptr<SharedLibraryProxy> SharedLibraryProxy::create() {
return std::unique_ptr<SharedLibraryProxy>{new SharedLibraryProxyImp};
}
// main.cc
int main() {
auto shared_lib = SharedLibraryProxy::create();
shared_lib->foo();
shared_lib->bar();
}

To add to MSalters answer, one can easily mimic Windows approach to lazy loading on Linux by creating a small static stub library that would try to dlopen needed library on first call to any of it's functions (emitting diagnostic message and terminating if dlopen failed) and then forwarding all calls to it.
Such stub libraries can be written by hand, generated by project/library-specific script or generated by universal tool Implib.so:
$ implib-gen.py libxyz.so
$ gcc myapp.c libxyz.tramp.S libxyz.init.c ...

Linking functions with LD_PRELOAD at runtime

I am writing a lib that intercepts the calls to malloc and free at runtime by running the program with LD_PRELOAD=mylib myexe.
The calls to malloc and free are intercept fine. My problem is that there's another function in mylib that I also want intercepted when LD_PRELOAD is used and I cannot figure out why it doesn't 'just work' like the calls to malloc and free.
In mylib.c:
void* malloc(size_t s)
{
return doMyMalloc();
}
void free(void* p)
{
doMyFree(p);
}
void otherThing(size_t)
{
doThing();
}
In myexe.cpp:
#include <malloc.h>
extern "C" void otherThing(size_t); // Compile with -Wl,--unresolved-symbols=ignore-all
int main(int argc, char* argv[])
{
void* x = malloc(1000); // Successfully intercepted.
free(x); // Successfully intercepted.
otherThing(1); // Segfault.
}
One way I have managed to get it to work is by doing:
typedef void (*FUNC)(size_t);
FUNC otherThing = NULL;
int main(int argc, char* argv[])
{
otherThing = (FUNC)dlsym(RTLD_NEXT, "otherThing");
otherThing(1); // Successfully calls mylib's otherThing().
}
but I don't want to write all this code; I don't have to do it for malloc and free.
It's ok if the program crashes if the LD_PRELOAD prefix is missing.

I feel like you're applying a single solution (LD_PRELOAD) to solve two different problems. First, you want to patch malloc() and free(). You've got that working--great. Next, you want to have a runtime "plug-in" system where you don't link against any library at build time, but only do so at runtime. This is typically done using dlopen() and dlsym(), and I recommend you use those.
The idea is that you don't want to specify a particular implementation of otherThing() at build time, but you do need to have some implementation at runtime (or you rightly expect to crash). So let's be explicit about it and use dlsym() to resolve the function name at runtime, of course with error detection in case it's not found.
As for where to define otherThing(), it can be in a totally separate file given to dlopen(), or in mylib (in which case pass NULL as the filename to dlopen()).

It's a bit of a gotcha. There are several posts relating to this on the internet, but I'll try to break it down to 'getting it to work'.
If this is under Linux, then what has happened is that the app has been compiled to not be able to use the external symbol. The quickest solution is to add the same compilation flags to the main application as are used in the library i.e. add the -fPIC flag to the compilation of the main application, just like it is done for the library.
Rather than using the -Wl,--unresolved-symbols=ignore-all flag, you should be using the __attribute__ ((weak)) for the function, like:
extern "C" void otherThing(size_t) __attribute__ ((weak);
And checking it for NULL at run-time, which will allow you to determine if it's been set or not.
By compiling the main application in the same way as a .so, you are implicitly allowing it to be used itself as a target for LD_PRELOAD as, by the manual page:
LD_PRELOAD
A list of additional, user-specified, ELF shared libraries to be
loaded before all others. The items of the list can be separated by
spaces or colons. This can be used to selectively override
functions in other shared libraries.

How to compile app in C with module?

I want to do application, which can be compiled with external modules, for example like in php. In php you can load modules in runtime, or compile php with modules together, so modules are available without loading in runtime. But i don't understand how this can be done. If i have module in module.c and there is one function, called say_hello, how can i register it to main application, if you understand what i mean?
/* module.c */
#include <stdio.h>
// here register say_hello function, but how, if i can't in global scope
// call another function?
void say_hello()
{
printf("hello!");
}
If i compile all that files(main app + modules) together, there isn't some reference to say_hello function from main app, because it is called only if user call it in its code. So how can i say to my app, hey, there is say_hello function, if someone want to call it, you know it exists.
EDIT1: I need to have something like table at runtime, where i can see if user called function exists (have C equivavent). Header files doesn't help to me.
EDIT2: My app is interpret for my script langugage.
EDIT3: If someone call function in php, php interpret must know that function exists. I know about dynamic linking and if .so or .dll is loaded, then some start routine is called and you can simple register function in that dll, so php interpret can see, if some module registred for example function called "say_hello". But if i want compile php with for example gd support, then how gd functions are registred to some php function list, hashtable or whatever?

I guess what you are looking for is dynamic libraries (we call runtime loadable modules as dynamic/shared libraries in C and in the OS world, in general). Take, for example, Pidgin which supports plugins to extend it's functionalities. It gives a particular interface to it's plugin-makers to abide by, say functions to register, load, unload and use, which the plugins will have to follow.
When the program loads, it looks for such dynamic libraries in it's plugins directory, if present, it'll load and use it, else it'll skip exposing the functionality. The reason why an interface is needed is that since different modules can have different functionalities which are unknown uptil runtime, an app. has to have a common, agreed-upon way of "talking" to it's plugins/modules.
Every C program can be linked to a static or a dynamic library; static will copy the code from the library to the said program, there by leaving no dependencies for the program to run, while linking to a dynamic library expects the dynamic library to be present when the program is launched. A third way of doing it, is not to link to a DLL, but just asking the OS to perform a load operation of the library. If this succeeds, then the dynamic module is used, else ignored. The functionality which the dynamic library should perform is exposed to the user, only if the load call succeeds.
It is to be noted that this is a operating system provided feature and it has nothing to do with the language used (C or C++ or Python doesn't matter here); as far as C is concered, the compiler still links to known code i.e. code which is available # compile time. This is the reason for different operating system, one needs to write different code to load a dynamic module. Even more, the file type/format of syuch libraries vary from system to system. In Linux it's called shared objects (.so), in Mac it's called dynamic libraries (.dylib) and in Windows as Dynamic link libraries (.dll).

C is not interpreted language. So you need linking, you may want static linking or dynamic linking.
Program building consists of 2 major phases: compiling and linking. During compiling all c-files are translated into machine code, leaving called functions unresolved (obj or o files). Then linker merges all these files into one executable, resolving what was unresolved.
This is static linking. Linked module becomes integral part of executable.
Dynamic linking is platform specific. Under windows these are DLLs. You should issue a system call to load DLL after which you will be able to call functions from it.

What you need is dynamic library. Let's first take a look at the example provided in the Linux manpage of dlopen(3):
/* Load the math library, and print the cosine of 2.0: */
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
int main(int argc, char **argv) {
void *handle;
double (*cosine)(double);
char *error;
handle = dlopen("libm.so", RTLD_LAZY);
if (!handle) {
fprintf(stderr, "%s\n", dlerror());
exit(EXIT_FAILURE);
}
dlerror(); /* Clear any existing error */
/* Writing: cosine = (double (*)(double)) dlsym(handle, "cos");
would seem more natural, but the C99 standard leaves
casting from "void *" to a function pointer undefined.
The assignment used below is a workaround. */
*(void **) (&cosine) = dlsym(handle, "cos");
if ((error = dlerror()) != NULL) {
fprintf(stderr, "%s\n", error);
exit(EXIT_FAILURE);
}
printf("%f\n", (*cosine)(2.0));
dlclose(handle);
exit(EXIT_SUCCESS);
}
There's also a C++ dlopen mini HOWTO.
For more general information about dynamic loading, start from the wikipedia page first.

I think it is impossible, if i understand what you mean. Because it is compiled language.

Can I access to symbols of the host process from a shared object loaded in runtime? Any alternative?

In my scenario I want a plugin, which is a shared object loaded in runtime, to access symbols from the “host application” so I can add any functionality to my application.
I have tried but have not found any way to do this and I have no clue on whether this is possible or not. So, can I do this somehow or is there any alternative that applications that use plugins use?
I am on Fedora 15, Linux 2.6.4. However, I hope the solution will be cross-platform.

There are three main approaches:
Pass a structure of function pointers to the DLL from the application, giving access to whatever symbols you want to share. This is the most portable method, but is kind of a pain to create all the function pointers. Something like:
// In shared header
struct app_vtable {
void (*appFoo)();
};
// In plugin:
const app_vtable *vt;
void set_vtable(const app_vtable *vt_) {
vt = vt_;
}
void bar() {
vt->appFoo();
}
// In application:
void foo();
const app_vtable vt = { foo };
void loadplugin() {
void *plugin = dlopen("plugin.so", RTLD_LAZY);
void (*pset_vtable)(const app_vtable *) = dlsym(plugin, "set_vtable");
pset_vtable(&vt);
void (*pbar)() = dlsym(plugin, "bar");
pbar();
}
Move your application into a library, and have the executable simply link in this library and call an entry point in it. Then your plugins can link the same library and access its symbols easily. This is also quite portable, but can result in some performance loss due to the need to use position-independent code in your main app library (although you may be able to get away with a fixed mapping in this case, depending on your architecture).
On Linux only (and possible other ELF platforms) you can use -rdynamic to export symbols from the application executable directly. However this isn't very portable to other platforms - in particular, these is no equivalent to this on Windows.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Alternatives to dlsym() and dlopen() in C++ - c++

You can wrap your application with another one which first checks for all the required libraries, and if something is missing it errors out nicely, but if everything is allright it execs the real application.

Related

Is lazy loading in linux possible? [duplicate]

How to make a shared library delay loaded on Linux

Linking functions with LD_PRELOAD at runtime

How to compile app in C with module?

Can I access to symbols of the host process from a shared object loaded in runtime? Any alternative?

Categories

Resources