Using libtool to load a duplicate function name from a shared library - c++

I'm trying to create a 'debug' shared library (i.e., .so or .dll file) that calls another 'real' shared library that has the same C API as the debug library (in this case, to emulate the PKCS#11 API). However, I'm running into trouble where the link map of the debug library is colliding with that of the real library and causing the debug library to call its own functions instead of the corresponding functions in the real library. I found a solution to this problem by using the POSIX dlmopen command, but would like to understand if the same is possible using GNU's libtool.
On my Solaris 10 system, the following code fails the assertion when a test application statically links to the debug library:
#include <dlfcn.h>
int MyFunctionName() {
int (*function_ptr)();
void *handle = dlopen("realsharedlibrary.so", RTLD_LAZY);
*(void **)(&function_ptr) = dlsym(handle, "MyFunctionName");
ASSERT(function_ptr != MyFunctionName); // Fails
return (*function_ptr)();
}
In this case, I get a function pointer to the local 'MyFunctionName' (in the debug library) instead of MyFunctionName within the real shared library.
I've discovered that it's possible to get around this problem by using the command 'dlmopen' instead of 'dlopen', and telling dlmopen to create a new link map (with the LM_ID_NEWLM parameter) when loading the real library:
int MyFunctionName() {
int (*function_ptr)();
void *handle = dlmopen(LM_ID_NEWLM, "realsharedlibrary.so", RTLD_LAZY);
*(void **)(&function_ptr) = dlsym(handle, "MyFunctionName");
ASSERT(function_ptr != MyFunctionName); // succeeds
return function_ptr(); // call real function
}
Unfortunately, dlmopen does not seem to be included within libtool (i.e., I don't see an lt_dlmopen function in libtool).
Is it possible to do the same thing using libtool commands -- that is, to create a new link map when loading the new library so that it doesn't collide with the link map of the debug library?

I haven't found a good way to use libtool to solve this problem yet, but there's a way to avoid the Solaris-specific 'dlmopen' function by using dlopen with these flags:
void *handle = dlopen("realsharedlibrary.so", RTLD_NOW | RTLD_GROUP | RTLD_LOCAL);
Apparently, the problem of symbol-collisions is solved by using RTLD_NOW instead of RTLD_LAZY and by adding RTLD_GROUP. The RTLD_LOCAL is there because POSIX requires using either RTLD_LOCAL or RTLD_GLOBAL, or the behavior is undefined. For Solaris, the behavior is to default to RTLD_LOCAL.
The open question, though, is whether it's possible to pass these types of flags to lt_dlopen.
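For reference, here is a minimal sketch of the flag combination above wrapped in a helper with error reporting. The helper name open_real_library is made up for illustration. RTLD_GROUP is a Solaris extension, so the sketch assumes it may be missing elsewhere and falls back to plain RTLD_NOW | RTLD_LOCAL there, which on its own does not promise the same isolation.

```cpp
#include <dlfcn.h>
#include <string>

// RTLD_GROUP is Solaris-specific; use 0 where the platform does not define it.
#ifdef RTLD_GROUP
constexpr int kGroupFlag = RTLD_GROUP;
#else
constexpr int kGroupFlag = 0;
#endif

// Hypothetical helper: opens `path` with the flags discussed above.
// On failure it returns nullptr and copies the dlerror() text into `error`.
void* open_real_library(const char* path, std::string& error) {
    dlerror();  // clear any stale error state
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL | kGroupFlag);
    if (handle == nullptr) {
        const char* msg = dlerror();
        error = (msg != nullptr) ? msg : "unknown dlopen error";
    }
    return handle;
}
```

Checking the handle before calling dlsym matters here: the original snippets pass the result of dlopen straight to dlsym, so a failed load would go unnoticed.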

Related

Linking shared lib on Linux with duplicate yet modified class/struct causes segfault

I have a problem understanding what exactly happens when a dynamic library is loaded at runtime, and how the dynamic linker recognizes and treats "same symbols".
I've read other questions related to symbolic linking and observed all the typical recommendations (using extern "C", using -fPIC when linking the library, etc.). To my knowledge, my specific problem was not discussed, so far. The paper "How to write shared libraries" https://www.akkadia.org/drepper/dsohowto.pdf does discuss the process of resolving library symbol dependencies, that may explain what's happening in my example below, but alas, it does not offer a workaround.
I found a post where the last (unfortunately) un-answered comment is very much the same as my problem:
Is there symbol conflict when loading two shared libraries with a same symbol
The only difference is that in my case the symbol is an auto-generated constructor.
Here's the setup (Linux):
The program "master" uses a library class "Dummy" with 4 member variables, dynamically loads a shared library via dlopen(), and resolves two simple functions with dlsym().
The shared library "slave" also uses the library with the class "Dummy", but in a newer version with 5 member variables (an extra string).
When the shared library's function is called from master, accessing the newly added string member in class Dummy segfaults; apparently the string wasn't initialized correctly.
My assumption is: the constructor of class Dummy exists already in memory since master uses this function itself, and when loading the shared library it does not load its own version of the constructor, but simply re-uses the existing version from master. By doing that the extra string variable is not initialized correctly in the constructor, and accessing it segfaults.
When debugging into the assembler code when initializing the Dummy variable d in the slave, indeed Dummy's constructor inside the master's memory space is being called.
Questions:
How does the dynamic linker (dlopen()?) recognize, that the class Dummy used to compile the master should be the same as Dummy compiled into Slave, despite it being provided in the library itself? Why does the symbol lookup take the master's variant of the constructor, even though the symbol table must also contain the constructor symbol imported from the library?
Is there a way, for example by passing some suitable options to dlopen() or dlsym() to enforce usage of the Slave's own Dummy constructor instead of the one from Master (i.e. tweak the symbol lookup/reallocation behavior)?
Code: full minimalistic source code example can be found here:
https://bauklimatik-dresden.de/privat/nicolai/tmp/master-slave-test.tar.bz2
Relevant shared lib loading code in Master:
#include <iostream>
#include <dlfcn.h> // shared library loading on Unix systems
#include "Dummy.h"
int create(void * &data);
typedef int F_create(void * &data);
int destroy(void * data);
typedef int F_destroy(void * data);
int main() {
// use dummy class at least once in program to create constructor
Dummy d;
d.m_c = "Test";
// now load dynamic library
void *soHandle = dlopen( "libSlave.so", RTLD_LAZY );
std::cout << "Library handle 'libSlave.so': " << soHandle << std::endl;
if (soHandle == nullptr)
return 1;
// now load constructor and destructor functions
F_create * createFn = reinterpret_cast<F_create*>(dlsym( soHandle, "create" ) );
F_destroy * destroyFn = reinterpret_cast<F_destroy*>(dlsym( soHandle, "destroy" ) );
void * data;
createFn(data);
destroyFn(data);
return 0;
}
Class Dummy: the variant without "EXTRA_STRING" defined is used in Master; the variant with the extra string is used in Slave
#ifndef DUMMY_H
#define DUMMY_H
#include <string>
#define EXTRA_STRING
class Dummy {
public:
double m_a;
int m_b;
std::string m_c;
#ifdef EXTRA_STRING
std::string m_c2;
#endif // EXTRA_STRING
double m_d;
};
#endif // DUMMY_H
Note: if I use exactly the same class Dummy in both Master and Slave, the code works (as expected).
When debugging into the assembler code when initializing the Dummy variable d in the slave, indeed Dummy's constructor inside the master's memory space is being called.
This is expected behavior on UNIX. Unlike Windows DLLs, UNIX shared libraries are designed to imitate archive libraries, and are not designed to be self-contained isolated units of code.
How does the dynamic linker (dlopen()?) recognize, that the class Dummy used to compile the master should be the same as Dummy compiled into Slave, despite it being provided in the library itself? Why does the symbol lookup take the master's variant of the constructor, even though the symbol table must also contain the constructor symbol imported from the library?
The dynamic loader doesn't care (or know anything) about any classes. It operates on symbols.
By default symbols are resolved to the first definition of any given symbol which is visible to the dynamic loader (the exported symbol).
You can examine the set of symbols which are exported from any given binary with nm -CD Master and nm -CD libSlave.so.
Is there a way, for example by passing some suitable options to dlopen() or dlsym() to enforce usage of the Slave's own Dummy constructor instead of the one from Master (i.e. tweak the symbol lookup/reallocation behavior)?
There are several ways to modify the default behavior.
The best approach is to have libSlave.so use its own namespace. That will change all the (mangled) symbol names, and will completely eliminate any collisions.
The next best approach is to limit the set of symbols which are exported from libSlave.so, by compiling with -fvisibility=hidden and adding explicit __attribute__((visibility("default"))) to the (few) functions which must be visible from that library (create and destroy in your example).
Another possible approach is to link libSlave.so with the -Wl,-Bsymbolic flag, though the symbol resolution rules get pretty complicated really fast, and unless you understand them all, it's best to avoid doing this.
P.S. One might wonder why the Master binary exports any symbols -- normally only symbols referenced by other .sos used during the link are exported.
This happens because cmake uses -rdynamic when linking the main executable. Why it does that, I have no idea.
So another workaround is: don't use cmake (or at least not with the default flags it uses).
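To make the first two suggestions concrete, here is a rough sketch of what the slave side could look like. It is an illustration under assumptions, not the code from the question: the namespace slave_v2, the macro SLAVE_API, and the inlined Dummy definition are made up, and it assumes create and destroy are exported with C linkage so that dlsym can find them by their plain names.

```cpp
#include <string>

// Hypothetical export macro: C linkage plus default visibility.
// Meant for a library built with -fvisibility=hidden, so that only
// functions marked this way appear in the dynamic symbol table.
#define SLAVE_API extern "C" __attribute__((visibility("default")))

// Putting the class in its own namespace changes every mangled name
// (constructor, destructor, ...), so it cannot collide with Master's Dummy.
namespace slave_v2 {
class Dummy {
public:
    double m_a = 0.0;
    int m_b = 0;
    std::string m_c;
    std::string m_c2;  // the extra member only the slave knows about
    double m_d = 0.0;
};
}  // namespace slave_v2

// Exported entry points; everything else stays hidden.
SLAVE_API int create(void*& data) {
    data = new slave_v2::Dummy();
    return 0;
}

SLAVE_API int destroy(void* data) {
    delete static_cast<slave_v2::Dummy*>(data);
    return 0;
}
```

Built with something like g++ -fPIC -shared -fvisibility=hidden, nm -CD libSlave.so should then list create and destroy but not the Dummy constructors.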
I followed the recommendations found in the last answer and Is there symbol conflict when loading two shared libraries with a same symbol :
running 'nm Master' and 'nm libSlave.so' showed the same automatically generated constructor symbols:
...
000000000000612a W _ZN5DummyC1EOS_
00000000000056ae W _ZN5DummyC1ERKS_
0000000000004fe8 W _ZN5DummyC1Ev
...
So, the mangled function signatures match in both the master's binary and the slave.
When loading the library, the master's function is used instead of the library's version. To study this further, I created an even more minimalistic example like in the post referenced above:
master.cpp
#include <iostream>
#include <dlfcn.h> // shared library loading on Unix systems
// prototype for imported slave function
void hello();
typedef void F_hello();
void printHello() {
std::cout << "Hello world from master" << std::endl;
}
int main() {
printHello();
// now load dynamic library
void *soHandle = nullptr;
const char * const sharedLibPath = "libSlave.so";
// I tested different RTLD_xxx options, see text for explanations
soHandle = dlopen( sharedLibPath, RTLD_NOW | RTLD_DEEPBIND);
if (soHandle == nullptr)
return 1;
// now load shared lib function and execute it
F_hello * helloFn = reinterpret_cast<F_hello*>(dlsym( soHandle, "hello" ) );
helloFn();
return 0;
}
slave.h
#pragma once
#ifdef __cplusplus
extern "C" {
#endif
void hello();
#ifdef __cplusplus
}
#endif
slave.cpp
#include "slave.h"
#include <iostream>
void printHello() {
std::cout << "Hello world from slave" << std::endl;
}
void hello() {
printHello(); // should call our own printHello() function
}
You notice the same function printHello() exists both in the library and the master.
I compiled both manually this time (without CMake) and the following flags:
# build master
/usr/bin/c++ -fPIC -o tmp/master.o -c master.cpp
/usr/bin/c++ -rdynamic tmp/master.o -o Master -ldl
# build slave
/usr/bin/c++ -fPIC -o tmp/slave.o -c slave.cpp
/usr/bin/c++ -fPIC -shared -Wl,-soname,libSlave.so -o libSlave.so tmp/slave.o
Mind the use of -fPIC in both master and slave-library.
I now tried several combinations of RTLD_xx flags and compile flags:
1.
dlopen() flags: RTLD_NOW | RTLD_DEEPBIND
-fPIC for both libs
Hello world from master
Hello world from slave
-> result as expected (this is what I wanted to achieve)
2.
dlopen() flags: RTLD_NOW | RTLD_DEEPBIND
-fPIC for only the library
Hello world from master
Segmentation fault (core dumped) ./Master
-> Here, a segfault happens on the line where the iostream library's cout call is made; still, the library's printHello() function is called
3.
dlopen() flags: RTLD_NOW
-fPIC for only the library
Hello world from master
Hello world from master
-> This is my original behavior; so RTLD_DEEPBIND is definitely what I need, in conjunction with -fPIC in the master's binary;
Note: while CMake automatically adds -fPIC when building shared libraries, it does not generally do this for executables; here you need to manually add this flag when building with CMake
Note2: Using RTLD_NOW or RTLD_LAZY does not make a difference.
Using the combination of -fPIC on both executable and shared lib, with RTLD_DEEPBIND lets the original example with the different Dummy classes work without problems.
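A checked version of the loading code might look like the sketch below. RTLD_DEEPBIND is a glibc extension, so the sketch assumes it may be absent on other C libraries and degrades to a plain RTLD_NOW load there (in which case the original lookup order applies again). The function name call_hello is made up for illustration.

```cpp
#include <dlfcn.h>
#include <cstdio>

// RTLD_DEEPBIND is glibc-specific; use 0 where it is not defined.
#ifdef RTLD_DEEPBIND
constexpr int kDeepBind = RTLD_DEEPBIND;
#else
constexpr int kDeepBind = 0;
#endif

using HelloFn = void();

// Hypothetical helper: loads `path`, looks up the C symbol "hello",
// calls it, and unloads the library. Returns false on any failure.
bool call_hello(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | kDeepBind);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }
    dlerror();  // clear stale errors before dlsym
    HelloFn* fn = reinterpret_cast<HelloFn*>(dlsym(handle, "hello"));
    const char* err = dlerror();
    if (err != nullptr || fn == nullptr) {
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "null symbol");
        dlclose(handle);
        return false;
    }
    fn();
    dlclose(handle);
    return true;
}
```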

C++ force unloading shared library

I'm trying to create an application which reloads a shared library multiple times. But at some point in time, dlmopen fails with error
/usr/lib/libc.so.6: cannot allocate memory in static TLS block
Here is the minimal code reproducing this issue:
#include <dlfcn.h>
#include <cstdio>
#include <vector>
int main() {
for (int i = 0; i < 100; ++i) {
void *lib_so = dlmopen(LM_ID_NEWLM, "lib.so", RTLD_LAZY | RTLD_LOCAL);
if (lib_so == NULL) {
printf("Iteration %i loading failed: %s\n", i, dlerror());
return 1;
}
dlclose(lib_so);
}
return 0;
}
And empty lib.cpp, compiled with
g++ -rdynamic -ldl -Wl,-R . -o test main.cpp
g++ -fPIC -shared lib.cpp -o lib.so
Update
It seems that it crashes even with one thread. The question is: how can I force a library unload or a destruction of unused namespaces created with LM_ID_NEWLM?
There is a built-in limit to the number of link map namespaces available to a process. This is rather poorly documented in the comment:
The glibc implementation supports a maximum of 16 namespaces
in the man page.
Once you create a link map namespace, there is no support for 'erasing' it via any APIs. This is just the way it's designed, and there's no real way to get around that without editing the glibc source and adding some hooks.
Using namespaces for reloading of a library is not actually reloading the library - you're simply loading a new copy of the library. This is one of the use cases of the namespaces - if you tried to dlopen the same library multiple times, you would get the same handle to the same library; however if you load the second instance in a different namespace, you won't get the same handle. If you want to accomplish reloading, you need to unload the library using dlclose, which will unload the library once the last remaining reference to the library has been released.
If you want to attempt to 'force unload' a library, then you could try issuing multiple dlclose calls until it unloads; however if you don't know what the library has done (e.g. spawned threads) there may be no way of preventing a crash in that case.
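As a sketch of the dlclose-based approach, the helpers below reload a library in the default namespace and probe whether it is still mapped. The names is_loaded and reload_cycles are made up, and RTLD_NOLOAD is assumed to be available (it is an extension offered by glibc and several other platforms, not part of POSIX).

```cpp
#include <dlfcn.h>

// Returns true if `path` is currently loaded in the default namespace.
// RTLD_NOLOAD only hands back a handle when the library is already mapped.
bool is_loaded(const char* path) {
    void* probe = dlopen(path, RTLD_LAZY | RTLD_NOLOAD);
    if (probe == nullptr) {
        return false;
    }
    dlclose(probe);  // release the reference taken by the probe itself
    return true;
}

// Loads and unloads `path` up to `count` times without creating new
// link-map namespaces. Returns the number of completed cycles.
int reload_cycles(const char* path, int count) {
    int completed = 0;
    for (int i = 0; i < count; ++i) {
        void* handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
        if (handle == nullptr) {
            break;
        }
        if (dlclose(handle) != 0) {
            break;
        }
        ++completed;
    }
    return completed;
}
```

If is_loaded still reports true after the last dlclose, something else holds a reference (or the library was marked as not unloadable), and repeated dlclose calls on a stale handle are not a safe fix.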
Older glibc versions might have some bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=89692
https://sourceware.org/bugzilla/show_bug.cgi?id=14898
What version are you using? Try using a newer glibc version, your code works pretty fine on my computer (glibc 2.23).

Is it possible to determine (at runtime) if a function has been implemented?

One of Objective C's primary features is simple introspection. A typical use of this functionality is the ability to check some method (function), to make sure it indeed exists, before calling it.
Whereas the following code will throw an error at runtime (although it compiles just fine (Apple LLVM version 7.0.2 (clang-700.1.81)))...
@import Foundation;
@interface Maybe : NSObject + (void) maybeNot; @end
@implementation Maybe @end
int main (){ [Maybe maybeNot]; }
By adding one simple condition before the call...
if ([Maybe respondsToSelector:@selector(maybeNot)])
We can wait till runtime to decide whether or not to call the method.
Is there any way to do this with "standard" C (c11) or C++ (std=c14)?
i.e....
extern void callMeIfYouDare();
int main() { /* if (...) */ callMeIfYouDare(); }
I guess I should also mention that I am testing/using this in a Darwin runtime environment.
On GNU gcc / Mingw32 / Cygwin you can use Weak symbol:
#include <stdio.h>
extern void __attribute__((weak)) callMeIfYouDare();
void (*callMePtr)() = &callMeIfYouDare;
int main() {
if (callMePtr) {
printf("Calling...\n");
callMePtr();
} else {
printf("callMeIfYouDare() unresolved\n");
}
}
Compile and run:
$ g++ test_undef.cpp -o test_undef.exe
$ ./test_undef.exe
callMeIfYouDare() unresolved
If you link it with library that defines callMeIfYouDare though it will call it. Note that going via the pointer is necessary in Mingw32/Cygwin at least. Placing a direct call callMeIfYouDare() will result in a truncated relocation by default which unless you want to play with linker scripts is unavoidable.
Using Visual Studio, you might be able to get __declspec(selectany) to do the same trick: GCC style weak linking in Visual Studio?
Update #1: For XCode you can use __attribute__((weak_import)) instead according to: Frameworks and Weak Linking
Update #2: For XCode based on "Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)" I managed to resolve the issue by compiling with the following command:
g++ test_undef.cpp -undefined dynamic_lookup -o test_undef
and leaving __attribute__((weak)) as it is for the other platforms.
If you can see a function of an object (not pointer) is called in a source code and the code is compiled successfully - then the function does exist and no checking needed.
If a function being called via a pointer then you assume your pointer is of type of the class that has that function. To check whether it's so or not you use casting:
auto* p = dynamic_cast<YourClass*>(somepointer);
if (p != nullptr)
p->execute();
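A complete version of that check could look like the sketch below. The class names Base, Runnable and Inert are invented for the example; the only requirement is that the base class is polymorphic, since dynamic_cast needs at least one virtual function.

```cpp
// Polymorphic base: dynamic_cast requires a virtual member.
struct Base {
    virtual ~Base() = default;
};

// Only this derived class offers execute().
struct Runnable : Base {
    int runs = 0;
    void execute() { ++runs; }
};

struct Inert : Base {};

// Calls execute() only if the object really is a Runnable.
bool execute_if_supported(Base* object) {
    if (Runnable* r = dynamic_cast<Runnable*>(object)) {
        r->execute();
        return true;
    }
    return false;  // wrong dynamic type, or a null pointer
}
```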
C++ or C don't have introspection. You could add some with your additional layer (look at Qt metaobject, or GTK GObject introspection for examples); you might consider customizing GCC with MELT to get some introspection... (but that would take weeks). You could have some additional script or tool which emits C or C++ code related to your introspection needs (SWIG could be inspirational).
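The simplest form of such an additional layer is a name-to-function table that code fills in at startup and queries at runtime. The sketch below is only an illustration of that idea; FunctionRegistry and its member names are made up and do not come from Qt, GObject or any other library mentioned here.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hand-rolled "introspection": functions are looked up by name at runtime.
class FunctionRegistry {
public:
    // Registers (or replaces) the function stored under `name`.
    void add(const std::string& name, std::function<void()> fn) {
        table_[name] = std::move(fn);
    }

    // True if something was registered under `name`.
    bool has(const std::string& name) const {
        return table_.count(name) != 0;
    }

    // Calls the function if it exists; returns whether a call happened.
    bool call_if_present(const std::string& name) const {
        auto it = table_.find(name);
        if (it == table_.end()) {
            return false;
        }
        it->second();
        return true;
    }

private:
    std::unordered_map<std::string, std::function<void()>> table_;
};
```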
In your particular case, you might want to use weak symbols (at least on Linux). Perhaps use the relevant function attribute and code:
extern void perhapshere(void) __attribute__((weak));
if (perhapshere)
perhapshere();
and you might even make that shorter with some macro.
Maybe you just want to load some plugin with dlopen(3) and use dlsym(3) to find symbols in it (or even in the whole program which you would link with -rdynamic, by giving the NULL path to dlopen and using dlsym on the obtained handle); be aware that C++ uses name mangling.
So you might try
void*mainhdl = dlopen(NULL, RTLD_NOW);
if (!mainhdl) { fprintf(stderr, "dlopen failed %s\n", dlerror());
exit(EXIT_FAILURE); };
then later:
typedef void voidvoidsig_t (void); // the signature of perhapshere
void* ad = dlsym(mainhdl, "perhapshere");
if (ad != NULL) {
voidvoidsig_t* funptr = (voidvoidsig_t*)ad;
(*funptr)();
}

extern function call under windows make undefined reference

There is my problem:
All the code is in C++11.
Part 1
I've built a library (named socket).
Somewhere in the code, I declare a function:
namespace ntw {
extern int dispatch(int id,Socket& request);
}
This function is user-defined (in most cases simply a big switch), so its body is not defined in the socket lib.
I use this function in server.cpp (which is part of the socket lib).
All is fine under Linux; it builds the .so perfectly.
But under Windows it produces an
undefined reference to ntw::dispatch(int,Socket&)
So the .dll is not built.
Part2
I create the main program that uses the socket lib.
And in this code, I define:
namespace ntw {
int dispatch(int id,Socket& request){
/// some code
return ...;
}
}
Finally
So now, what I want is:
the user-defined int dispatch(int id,Socket& request) has to be called by the socket library.
Under Ubuntu, all is fine (compilation and run),
but under Windows... it's hell.
What can I do to make it work under Windows XP and later?
linux
Ubuntu 12.04, gcc 4.8.1
windows
windows xp, mingw 4.8.1
code
github: https://github.com/Krozark/cpp-Socket
It uses CMake (if someone wants to try).
What you are attempting won't work on Windows which has a quite different linking model from Linux. You need run time binding. I think you want the function to be provided by the host executable. In which case you need the host to export it with either a .def file or __declspec(dllexport). And then run time binding like this:
HMODULE hMod = GetModuleHandle(NULL); // gets host executable module handle
FARPROC fn = GetProcAddress(hMod, FunctionName); // GetProcAddress returns FARPROC, not void*
You can then cast fn to an appropriately declared function pointer before calling the function.
This is probably a reasonable approximation to how your Linux code operates. But it's not a very natural way to operate on Windows. More normal would be for the host to register callback functions or interfaces with the library. Once the host has informed the library of its callbacks, the library can use them.
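A callback-registration design could be sketched roughly as follows. This is not the API of the cpp-Socket project: set_dispatch and run_dispatch are hypothetical names, the Socket class is a stand-in, and returning -1 when nothing is registered is an arbitrary choice for the example.

```cpp
#include <functional>
#include <utility>

namespace ntw {

class Socket {};  // stand-in for the library's real Socket class

using DispatchFn = std::function<int(int, Socket&)>;

namespace {
DispatchFn g_dispatch;  // lives inside the library, empty until registered
}

// Called once by the host program to hand its handler to the library.
void set_dispatch(DispatchFn fn) {
    g_dispatch = std::move(fn);
}

// What server.cpp would call instead of an undefined extern function.
int run_dispatch(int id, Socket& request) {
    if (!g_dispatch) {
        return -1;  // no handler registered yet
    }
    return g_dispatch(id, request);
}

}  // namespace ntw
```

With this layout the library has no unresolved symbols at link time, so the .dll builds on its own, and the same code works unchanged on Linux.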

Is there a way to "statically" interpose a shared .so (or .o) library into an executable?

First of all, consider the following case.
Below is a program:
// test.cpp
extern "C" void printf(const char*, ...);
int main() {
printf("Hello");
}
Below is a library:
// ext.cpp (the external library)
#include <iostream>
extern "C" void printf(const char* p, ...);
void printf(const char* p, ...) {
std::cout << p << " World!\n";
}
Now I can compile the above program and library in two different ways.
The first way is to compile the program without linking the external library:
$ g++ test.cpp -o test
$ ldd test
linux-gate.so.1 => (0xb76e8000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7518000)
/lib/ld-linux.so.2 (0xb76e9000)
If I run the above program, it will print:
$ ./test
Hello
The second way is to compile the program with a link to the external library:
$ g++ -shared -fPIC ext.cpp -o libext.so
$ g++ test.cpp -L./ -lext -o test
$ export LD_LIBRARY_PATH=./
$ ldd test
linux-gate.so.1 => (0xb773e000)
libext.so => ./libext.so (0xb7738000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb756b000)
libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb7481000)
/lib/ld-linux.so.2 (0xb773f000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb743e000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb7421000)
$ ./test
Hello World!
As you can see, in the first case the program uses printf from libc.so, while in the second case it uses printf from libext.so.
My question is: from the executable obtained as in the first case and the object code of libext (either as .so or .o), is it possible to obtain an executable like in the second case? In other words, is it possible to replace the link to libc.so with a link to libext.so for all symbols defined in the latter?
Note that interposition via LD_PRELOAD is not what I want. I want to obtain an executable which is directly linked to the libraries I need. I underline again the fact that I only have access to the first binary and to the external object I want to "statically" interpose.
It is possible. Learn about shared library interposition:
When a program that uses dynamic libraries is compiled, a list of undefined symbols is included in the binary, along with a list of libraries the program is linked with. There is no correspondence between the symbols and the libraries; the two lists just tell the loader which libraries to load and which symbols need to be resolved. At runtime, each symbol is resolved using the first library that provides it. This means that if we can get a library containing our wrapper functions to load before other libraries, the undefined symbols in the program will be resolved to our wrappers instead of the real functions.
What you ask for is traditionally NOT possible. This has already been discussed here and here.
The crux of your question being -
How to statically link a dynamic shared object?
This cannot be done. The reason being the fact that statically linking a library is effectively the same as taking the compilation results of that library, unpacking them in your current project, and using them as if they were your own objects. *.a files are just archives of a bunch of *.o files with all the info intact within them. On the other hand, dynamic libraries are already linked; the symbol re-location info already being discarded and hence cannot be statically linked into an executable.
However you DO have other alternatives to work around this technical limitation.
So what are your options?
1. Use LD_PRELOAD on target system
Shared library interposition is well described in Maxim's answer.
2. Prepare a pre-linked stand-alone executable
elf-statifier is a tool for creating portable, self-contained Linux executables.
It attempts to package together a dynamically-linked executable and all of its dynamically-linked libraries into a single stand-alone executable file. This file can be copied and run on another machine independently.
So now on your development machine, you can set LD_PRELOAD and run the original executable and verify that it works properly. At this point elf-statifier creates a snapshot of the process memory image. This snapshot is saved as an ELF executable, with all the required shared libraries (including your custom libext.so) inside. Hence there is no need to make any modifications (e.g. to LD_PRELOAD) on the target system running the newly generated standalone executable.
However, this approach is not guaranteed to work in all scenarios. This is due to the fact that recent Linux kernels introduced VDSO and ASLR.
A commercial alternative to this is ermine. It can work around VDSO and ASLR limitations.
You are going to have to modify the binary. Take a look at patchelf http://nixos.org/patchelf.html
It will let you set or modify either the RPATH or even the "interpreter" i.e. ld-linux-x86-64.so to something else.
From the description of the utility:
Dynamically linked ELF executables always specify a dynamic linker or
interpreter, which is a program that actually loads the executable
along with all its dynamically linked libraries. (The kernel just
loads the interpreter, not the executable.) For example, on a
Linux/x86 system the ELF interpreter is typically the file
/lib/ld-linux.so.2.
So what you could do is run patchelf on the binary in question (i.e. test) with your own interpreter that then loads your library... This may be difficult, but the source code to ld-linux-so is available...
Option 2 would be to modify the list of libraries yourself. At least patchelf gives you a starting point in that the code iterates over the list of libraries (see DT_NEEDED in the code).
The elf specification documentation does indicate that the order is indeed important:
DT_NEEDED: This element holds the string table offset of a null-terminated
string, giving the name of a needed library. The offset is an index
into the table recorded in the DT_STRTAB entry. See ‘‘Shared Object
Dependencies’’ for more information about these names. The dynamic
array may contain multiple entries with this type. These entries’
relative order is significant, though their relation to entries of
other types is not.
The nature of your question indicates you are familiar with programming :-) Might be a good time to contribute an addition to patchelf... Modifying library dependencies in a binary.
Or maybe your intention is to do exactly what patchelf was created to do... Anyway, hope this helps!
Statifier probably does what you want. It takes an executable and all shared libraries and outputs a static executable.
It's possible. You just need to edit the ELF file and add your library to the dynamic section.
You can check the contents of the dynamic section using readelf -d <executable>. Also, readelf -S <executable> will tell you the offsets of .dynamic and .dynstr. In .dynamic you can find an array of Elf32_Dyn or Elf64_Dyn structures; the entry for your library should have d_tag set to DT_NEEDED and d_un.d_val set to the offset of the string "libext.so" within the .dynstr section.
ELF headers are described in /usr/include/elf.h.
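Before editing anything, it can help to read the existing DT_NEEDED entries programmatically. The sketch below does that for a file on disk using the structures from elf.h. It is deliberately limited: it assumes a 64-bit ELF file in the host's byte order with section headers present, and the names read_at and needed_libraries are made up for the example. It only reads; it does not add an entry.

```cpp
#include <elf.h>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Reads `count` objects of type T starting at file offset `offset`.
template <typename T>
bool read_at(std::ifstream& in, std::streamoff offset, T* out,
             std::size_t count = 1) {
    in.clear();
    in.seekg(offset);
    in.read(reinterpret_cast<char*>(out),
            static_cast<std::streamsize>(sizeof(T) * count));
    return static_cast<bool>(in);
}

// Returns the DT_NEEDED library names of a 64-bit ELF file,
// or an empty vector if the file cannot be parsed.
std::vector<std::string> needed_libraries(const char* path) {
    std::vector<std::string> result;
    std::ifstream in(path, std::ios::binary);
    Elf64_Ehdr eh{};
    if (!in || !read_at(in, 0, &eh)) return result;
    if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return result;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return result;
    if (eh.e_shnum == 0 || eh.e_shentsize != sizeof(Elf64_Shdr)) return result;

    std::vector<Elf64_Shdr> sections(eh.e_shnum);
    if (!read_at(in, eh.e_shoff, sections.data(), sections.size()))
        return result;

    for (const Elf64_Shdr& sh : sections) {
        if (sh.sh_type != SHT_DYNAMIC || sh.sh_link >= sections.size())
            continue;
        // sh_link of .dynamic names the string table it uses (.dynstr).
        const Elf64_Shdr& strtab = sections[sh.sh_link];
        std::vector<char> strings(strtab.sh_size + 1, '\0');
        std::vector<Elf64_Dyn> dyn(sh.sh_size / sizeof(Elf64_Dyn));
        if (!read_at(in, strtab.sh_offset, strings.data(), strtab.sh_size))
            return result;
        if (!read_at(in, sh.sh_offset, dyn.data(), dyn.size()))
            return result;
        for (const Elf64_Dyn& d : dyn) {
            if (d.d_tag == DT_NULL) break;  // end of the dynamic array
            if (d.d_tag == DT_NEEDED && d.d_un.d_val < strtab.sh_size)
                result.emplace_back(&strings[d.d_un.d_val]);
        }
    }
    return result;
}
```

Adding an entry is harder than reading one, because both .dynstr and .dynamic usually have no spare room; tools such as patchelf handle that by relocating the sections.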
It might be possible to do what you're asking by dynamically loading the library using dlopen(), accessing the symbol for the function as a function pointer using dlsym(), and then invoking it via the function pointer. There's a good example of what to do on this website.
I tailored that example to your example above:
// test.cpp
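#include <stdio.h>
#include <dlfcn.h>   // dlopen, dlsym, dlerror, dlclose
#include <iostream>  // std::cerr
typedef int (*printf_t)(const char *p, ...); // matches the <stdio.h> declaration
int main() {
// Call the standard library printf
printf_t my_printf = &printf;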
my_printf("Hello"); // should print "Hello"
// Now dynamically load the "overloaded" printf and call it instead
void* handle = dlopen("./libext.so", RTLD_LAZY);
if (!handle) {
std::cerr << "Cannot open library: " << dlerror() << std::endl;
return 1;
}
// reset errors
dlerror();
my_printf = (printf_t) dlsym(handle, "printf");
const char *dlsym_error = dlerror();
if (dlsym_error) {
std::cerr << "Cannot load symbol 'printf': " << dlsym_error << std::endl;
dlclose(handle);
return 1;
}
my_printf("Hello"); // should print "Hello World!" if the libext.so version was found
// close the library
dlclose(handle);
}
The man page for dlopen and dlsym should provide some more insight. You'll need to try this out, as it is unclear how dlsym will handle the conflicting symbol (in your example, printf) - if it replaces the existing symbol, you may need to "undo" your action later. It really depends on the context of your program, and what you're trying to do overall.
It is possible to change the binary.
For example with a tool like ghex you can change the hexadecimal code of the binary, you search in the code for each instance of libc.so and you replace it by libext.so
Not statically, but you can redirect dynamically loaded symbols in a shared library to your own functions using the elf-hook utility created by Anthony Shoumikhin.
The typical usage is to redirect certain function calls from within a 3rd-party shared library which you can't edit.
Let's say your 3rd party library is located at /tmp/libtest.so, and you want to redirect printf calls made from within the library, but leave calls to printf from other locations unaffected.
Exemplar app:
lib.h
#pragma once
void test();
lib.cpp
#include "lib.h"
#include <cstdio>
void test()
{
printf("hello from libtest");
}
In this example, the above 2 files are compiled into a shared library libtest.so and stored in /tmp
main.cpp
#include <iostream>
#include <dlfcn.h>
#include <elf_hook.h>
#include "lib.h"
int hooked_printf(const char* p, ...)
{
std::cout << p << " [[ captured! ]]\n";
return 0;
}
int main()
{
// load the 3rd party shared library
const char* fn = "/tmp/libtest.so";
void* h = dlopen(fn, RTLD_LAZY);
// redirect printf calls made from within libtest.so
elf_hook(fn, LIBRARY_ADDRESS_BY_HANDLE(h), "printf", (void*)hooked_printf);
printf("hello from my app\n"); // printf in my app is unaffected
test(); // test is the entry point to the 3rd party library
dlclose(h);
return 0;
}
Output
hello from my app
hello from libtest [[ captured! ]]
So as you can see it is possible to interpose your own functions without setting LD_PRELOAD, with the added benefit that you have finer-grained control of which functions are intercepted.
However, the functions are not statically interposed, but rather dynamically redirected
GitHub source for the elf-hook library is here, and a full codeproject article written by Anthony Shoumikhin is here