How to replace the usage of LD_PRELOAD with dlopen()? - c++

I'm working in C++ with shared libraries.
Currently I set the LD_PRELOAD environment variable with a setenv() call.
Instead, I want to load the shared library with the dlopen() API, and it should behave the same as setting LD_PRELOAD via setenv().
Can I use dlopen() to meet this requirement, or is there a difference between loading a library with LD_PRELOAD and with dlopen()?

I'm not 100% sure about this, but as I understand it, LD_PRELOAD makes the dynamic loader load the specified library before any other shared library your program depends on. Because symbols are looked up in load order, definitions in the preloaded library take precedence over those in libraries loaded later. This makes it possible to override functions from system libraries with your own.
dlopen loads the shared object after your program and its dependencies are already loaded, so it cannot be used to override system functions in the same way.
If the environment variable has to be set for the program to work correctly, then it has to be set before the program is loaded, either in the shell or by your LD_PRELOAD library. If the program doesn't need the environment variable immediately, you can set it either in the program or in the "on-load" function of the shared object loaded by dlopen.
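One point worth adding: calling setenv("LD_PRELOAD", ...) inside a running process does not change that process, because the dynamic loader reads the variable only at startup. A minimal sketch of one workaround, assuming Linux (it relies on /proc/self/exe), is to set the variable and re-execute the program. The function names and the library name used with them are made up for illustration.

```cpp
#include <cassert>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// True if LD_PRELOAD is set and mentions `lib` (plain substring match, for illustration).
bool preload_active(const char* lib) {
    assert(lib != nullptr);
    const char* current = getenv("LD_PRELOAD");
    return current != nullptr && strstr(current, lib) != nullptr;
}

// Ensures the process runs with `lib` preloaded.
// Returns 0 if it already is. Otherwise sets LD_PRELOAD and re-executes the
// program; in that case the call returns (with -1) only if something failed.
int ensure_preloaded(const char* lib, char* const argv[]) {
    if (preload_active(lib)) {
        return 0;
    }
    if (setenv("LD_PRELOAD", lib, 1) != 0) {
        return -1;
    }
    execv("/proc/self/exe", argv);  // Linux-specific path to the running binary
    return -1;
}
```

Called at the top of main() with the original argv, this restarts the program once; the second run sees the library in LD_PRELOAD and continues normally.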

Related

Run code before shared library is loaded in C++

I have a proprietary shared object library that runs some code before my application starts. It complains about an environment variable that is not defined. Although I could define it in the system environment, I would like to define it at run time from within my application executable, before this library is loaded.
I've read that GCC can define some _init functions that run before the library loads. However, I'm unable to find out how to call these functions.
Is there some way to run code in the executable before the loader calls the init sections of the libraries?

C++ - Does LoadLibrary() actually link to the library?

I'm using Code::Blocks and hate manually linking DLLs. I found the LoadLibrary() function, and am wondering if it works like a .a or .lib file would. Does this function work like that? If not, what can I do programming-wise (if anything) to link a DLL without having to link a DLL by doing the Project < Build options < Linker settings < add < ... method?
LoadLibrary loads the requested library (and all the libraries it needs) into your process' address space. In order to access any of the code/data in that library, you need to find out the code or data address in the newly loaded region of memory. You need to use GetProcAddress.
The difference between this process and adding a library during build time is that for build-time library the compiler prepares a list of locations that refer to a given function, linker puts that list into .exe, and run-time linker loads the library, does the equivalent of GetProcAddress for the function name and places the address into all locations that the compiler marked.
When you don't have this automated support, you have to declare a pointer to function, call GetProcAddress yourself, and assign the returned value to your pointer to function. You can then call the function like any other C function (note "C" part - the above process is complicated by name mangling when you use C++, so make use of extern "C")
LoadLibrary() loads a DLL at runtime. Normally you link against DLLs when you build the EXE, much as you would link a static library. If you need to load libraries dynamically at runtime, you use LoadLibrary().
This is useful, for example, when you implement a plugin system, as you don't know the libraries beforehand.
Not how it works at all. LoadLibrary is used to load a DLL that is unknown at compile time, such as program extensions/plugins, or to pick "this DLL for SSE, that DLL for non-SSE" based on what the hardware can do. One could also consider having a DLL per connection type to email servers or something like that, so that the email program doesn't have to "carry" all the different variants when only one is used for any particular email address.
Further, to use a DLL that has been loaded this way, you need to use GetProcAddress to get the address of the functions in the DLL. This is very different from linking the DLL into the project at build time, where functions simply appear "automagically" with the help of the system loader, which loads the DLLs that were added to the project at build time.
Contrary to static or dynamic library linking, LoadLibrary doesn't make the library's symbols directly available to your program. You need to call GetProcAddress at runtime to get a pointer to the functions you want to call.
As @Devolus mentioned, this is a good way to implement plugin systems and/or to access optional components. However, since the symbols are not available to your program in a transparent way, this is not really practical for ordinary usage.

Link dll to static library and load it into an application linked against the same static library

I am creating an application that supports modules in the form of dlls that are loaded dynamically at runtime. The code is laid out in the following way:
core - static library
This has a mechanism to load shared libraries and call a "create" function that returns a new Module object (uses a shared header).
module shared library (linked against core static library)
This module uses the shared Module header and also uses other classes from the core library (hence why it is linked against the core library). It is built to include all symbols from static libraries.
test application executable (linked against core static library)
I am getting funky and seemingly sporadic behavior. It always ends in access violations, and member variables that I very explicitly set (integers) print out as garbage in later functions (I have verified that they are not being deleted earlier). This only ever seems to happen if the dynamic library is loaded (even if I never call the create function).
My main question is: is there a danger here that the symbols in the shared library will conflict with the symbols in the executable and cause problems, even though they come from the exact same static library?
I can't speak for Linux and OS X behavior, but on Windows, the following is exactly what is happening. Since you say you also want to compile on Windows, this is relevant.
The problem you are experiencing is that you actually have multiple versions of everything in the core. Each module and the application itself has its own copy of the core, and their variables are not shared. This includes the C runtime, so things like new/delete across module boundaries are fraught with peril.
To verify that this is what is happening, create a simple test: set a global in the core to a value in your test application, then try to access that global from your dynamically loaded code and see what you get. I will wager that your store to the global will not be reflected!
Solutions:
1) Make core a shared dynamic library. This may or may not be an option for you.
2) Operate extremely carefully with the knowledge of the above; All CRT and/or your own core state will not be shared, so you must make sure things will be allocated/destroyed on their own side of the module boundaries, among other things.
My own application is designed almost identically to yours; ie a static library with shared code needed by both the application and the modules, and then dynamically loaded plugins loaded by the application core.
What I do for all shared core state that must be accessed across modules is that the first thing each module does after loading is have its "core pointer" set to an instantiation of the core libraries in the application. This ensures that all modules are working with the same data.
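A minimal sketch of that "core pointer" handoff, written from the module's side. The Core struct, its log_level field, and both function names are hypothetical stand-ins, not taken from any real code base.

```cpp
#include <cassert>

// Shared state that lives in the static "core" library (hypothetical).
struct Core {
    int log_level = 0;
};

// Every binary that links the static library gets its own copy of this
// pointer, so it starts out null inside a freshly loaded module.
static Core* g_core = nullptr;

// Exported by each module; the application calls it right after loading.
extern "C" void module_set_core(Core* core) {
    assert(core != nullptr);
    g_core = core;
}

// Module code reaches shared state only through the injected pointer,
// never through its private copy of the core's globals.
extern "C" int module_log_level() {
    return g_core->log_level;
}
```

The application would look up module_set_core right after loading the module and pass the address of its own Core instance, so both sides read and write the same object.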

Load shared library by path at runtime

I am building a Java application that uses a shared library written in C++ and compiled for different operating systems. The problem is, that this shared library itself depends on an additional library it normally finds under the appropriate environment variable (PATH, LIBRARY_PATH or LD_LIBRARY_PATH).
I can - but don't want to - set these environment variables. I'd rather load the needed shared libraries from a given path at runtime - just like a plugin. And no - I don't want any starter application that starts a new process with a new environment. Does anybody know how to achieve this?
I know that this must be possible, as one of the libraries I use is capable of loading its plugins from a given path. Of course I'd prefer platform-independent code, but if that isn't possible, separate solutions for Windows, Linux and MacOS would also do.
EDIT
I should have mentioned that the shared library I'd wish to use is object oriented, which means that a binding of single functions won't do it.
On UNIX/Linux systems you can use dlopen. The catch is that you then have to fetch every symbol you need via dlsym.
Simple example:
#include <dlfcn.h>
typedef int (*some_func)(const char *param);
void *myso = dlopen("/path/to/my.so", RTLD_NOW);  /* NULL on failure; see dlerror() */
some_func func = (some_func)dlsym(myso, "function_name_to_fetch");  /* some_func is already a pointer type */
func("foo");
dlclose(myso);
This loads the .so and calls function_name_to_fetch() from it. See the man page dlopen(3) for more.
On Windows, you can use LoadLibrary, and on Linux, dlopen. The APIs are extremely similar and can load a so/dll directly by providing the full path. That works if it is a run-time dependency (after loading, you "link" by calling GetProcAddress/dlsym.)
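To show how closely the two APIs line up, here is a rough sketch of a thin wrapper over both. The three function names are invented for this example; libltdl or a similar library would be the more complete option. On older glibc versions the POSIX branch may need linking with -ldl.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

// Opens the library at `path`; returns nullptr on failure.
void* open_library(const char* path) {
    assert(path != nullptr);
#ifdef _WIN32
    return reinterpret_cast<void*>(LoadLibraryA(path));
#else
    return dlopen(path, RTLD_NOW);
#endif
}

// Looks up `name` in an opened library; returns nullptr if it is missing.
void* find_symbol(void* lib, const char* name) {
#ifdef _WIN32
    return reinterpret_cast<void*>(
        GetProcAddress(static_cast<HMODULE>(lib), name));
#else
    return dlsym(lib, name);
#endif
}

// Releases a handle returned by open_library.
void close_library(void* lib) {
#ifdef _WIN32
    FreeLibrary(static_cast<HMODULE>(lib));
#else
    dlclose(lib);
#endif
}
```

The caller still has to cast the result of find_symbol to the right function-pointer type before calling it, and the looked-up functions should be declared extern "C" so their names are not mangled.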
I concur with the other posters about the use of dlopen and LoadLibrary. The libltdl gives you a platform-independent interface to these functions.
I do not think you can do that on the library's behalf.
Most DLLs have some sort of init() function that must be called after the DLL has been loaded, and sometimes that init() function needs some parameters and returns a handle to be used when calling the DLL's functions. Do you know the definition of the additional library?
Also, the first library cannot simply check whether DLL X is already in RAM using only its name. The one it needs could be in a different directory or be a different build/version. The OS will recognize the library if its full path is the same as that of one already loaded, and it will share it instead of loading it a second time.
The other library can load its plugins from another path because it is written not to depend on PATH and they are its own plugins.
Have you tried updating the process's environment variables from the code before loading the DLL? That would not depend on a starter process.

RPATH equivalent for executables

I have a C++ shared library which, as part of its normal behaviour, fork()s and exec()s another executable containing some unstable legacy code. This executable is not useful other than with this library, so I'd like to avoid placing it in a PATH directory. I'd also like to be able to install multiple copies in various locations, so hard-coded paths are not desirable. Is there anything equivalent to an RPATH that will allow exec() to find this executable? Alternatively, is it possible to query the rpath of a shared library from the library itself?
Edit: This post suggests the latter is possible. I'll leave this open in case anybody knows the answer to the asked question. Is there a way to inspect the current rpath on Linux?
You can always use getenv to read the environment from within the shared object, but is RPATH really what you want to use for that? Wouldn't it be better for the shared object to have some sort of configuration file in the user's home directory (or a custom environment variable) that tells it which location to use to run the external binary?
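For the custom environment variable route, a small sketch could look like the following. The variable name MYLIB_HELPER and the function name are placeholders chosen for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the helper executable's location: the value of the (hypothetical)
// MYLIB_HELPER variable if it is set and non-empty, otherwise `fallback`.
std::string helper_path(const std::string& fallback) {
    assert(!fallback.empty());
    const char* configured = std::getenv("MYLIB_HELPER");
    if (configured != nullptr && configured[0] != '\0') {
        return configured;
    }
    return fallback;
}
```

Each installed copy can then be pointed at its own helper binary without anything being added to PATH.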
I think the best way to do this is to set an environment variable and use execve() to run the binary. Presumably you could just set PATH and then execve() a shell that would use PATH to find a copy of the executable. The library equivalent would be to set LD_LIBRARY_PATH and execve() a binary that has this library as a dependency.
In either case, you are not changing the external environment, only manufacturing a modified copy that is used with execve().
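A sketch of building that modified copy, assuming POSIX. The helper names with_variable and exec_with_env are invented here; the process's own environment can be passed in via the POSIX environ variable.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>
#include <unistd.h>

// Copies `env`, replacing or appending the entry for `name` with `value`.
// The caller's environment itself is left untouched.
std::vector<std::string> with_variable(const char* const* env,
                                       const std::string& name,
                                       const std::string& value) {
    assert(!name.empty());
    const std::string prefix = name + "=";
    std::vector<std::string> result;
    for (; env != nullptr && *env != nullptr; ++env) {
        if (std::strncmp(*env, prefix.c_str(), prefix.size()) != 0) {
            result.emplace_back(*env);  // keep unrelated variables
        }
    }
    result.push_back(prefix + value);
    return result;
}

// Runs `path` with the modified copy; returns only if execve fails.
int exec_with_env(const char* path, char* const argv[],
                  const std::vector<std::string>& env) {
    std::vector<char*> envp;
    for (const std::string& entry : env) {
        envp.push_back(const_cast<char*>(entry.c_str()));
    }
    envp.push_back(nullptr);  // execve expects a null-terminated array
    return execve(path, argv, envp.data());
}
```

In the fork()/exec() case from the question, the child would call exec_with_env after fork(), so the parent's environment never changes.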