How to dynamically link GCC objects? - c++

I'm unsure if the question is phrased correctly, or if what I want to do is possible.
I have an existing GCC application (compiled for a Cortex-M3, if that matters). What I want to do is create a little piece of functionality (just a single method, or a few methods) that can call into the existing application.
I want to place these few methods at a specific memory location (I know how to do that). What I don't know how to do is get the new application to compile/link with the objects of the existing application.
For instance, my existing application has the function:
int Add(int a, int b);
And new application wants to use it:
int Calculate(int a, int b, int opType)
{
    return Add(a, b);
}
I have access to all linker, obj, h files, etc.

You can't usually link to executables, only libraries (static or shared) and object files. So, the best advice I can give would be to build the "core" of the first program as a shared library, and link the "front-end" (main) as an executable built against the core shared lib. Then, your second program can also just be a program linked against the shared library.
You can also use dlopen on dynamic executables to link the executable at runtime, and use dlsym to get function pointers for the desired functionality, though this is usually only used if you have no control over the first executable.
Example of the latter (note again that this should be a last resort):
a.c:
#include <stdio.h>
int main() { printf("hello world!\n"); return 42; }
b.c:
#include <stdio.h>
#include <dlfcn.h>
int main(void) {
    /* a name containing a slash is opened as a path instead of being
       searched for in the library path */
    void *handle = dlopen("./a", RTLD_LAZY);
    if(!handle) {
        printf("failed: %s\n", dlerror());
        return -1;
    }
    int (*amain)() = dlsym(handle, "main");
    if(!amain) {
        printf("dlsym failed: %s\n", dlerror());
        return -1;
    }
    return amain();
}

Thanks for your input. However, I was able to do exactly what I wanted by compiling the new application with the ELF file from the existing application as an input to the linker, by specifying
--just-symbols elffile.elf

If you're using a Linux variant, then the answer given by @nneonneo to use dlopen() and dlsym() is the best approach.
However, assuming that you're using another OS (or none at all) and/or you really really need this code to live at a fixed location (for example if you need to shift execution to an address on a specific memory device, eg. if doing flash manipulation), you can use a hard coded function pointer.
Declare a function pointer as follows:
typedef int (*AddFnPtr)(int a, int b);
AddFnPtr MyAddFunction = (AddFnPtr)ADDRESS_OF_YOUR_FUNCTION;
Then call as:
int Calculate(int a, int b, int opType)
{
    return MyAddFunction(a, b);
}
Note that the linker has no way of knowing if the code that you've put at that location has the right prototype, or even exists - so there is no error checking either at link time or at run time.
You will probably (depending on OS) also need to take steps to map the absolute memory location at which you've put your function into the local process's address space.

Related

C++ Creating Objects as Static - advice needed on good practice

I am mainly a C programmer building a prototype on a Raspberry Pi. I'm making extensive use of some open source C code, but also a Raspberry Pi hardware add-on which comes with C++ drivers. So I need them to work together. I did some research and got them to work together by writing a C++ function with an extern "C" declaration, compiling this as a shared library and linking it with my C program.
I need the C++ function to instantiate an object the first time it is called and then to be able to interact with this object on subsequent calls to the function. I was slightly overwhelmed by the instructions for how to create and access C++ objects directly in C, so I tried simply adding "static" before the creation of the object, and interacting with the object through the mediation of the C wrapper. This seems to be working perfectly, but I'm slightly worried that this is not routinely given as the answer to the "using C++ objects in C" question, and so I wonder if I am going to end up with unforeseen problems. I don't need my code at this stage to be high quality, but I don't want to end up with segmentation faults because I have done something foolish. Any advice would be really appreciated.
Here is a cut down version to show what I am doing. In this example I create a simple c++ function that takes an int argument from the calling C program. If the argument is 0 it creates the objects and sets all the leds in array to 0. If I call this a second time with the argument = 1, it instructs the same object to light all the red leds. This code works.
#include <string.h>
#include <matrix_hal/everloop.h>
#include <matrix_hal/everloop_image.h>
#include <matrix_hal/matrixio_bus.h>
extern "C" int led_change(int input_from_c)
{
namespace hal = matrix_hal;
static hal::MatrixIOBus bus;
static hal::EverloopImage image1d(18);
static hal::Everloop everloop;
if (input_from_c == 0)
{
if (!bus.Init()) return false;
// this line just resizes the EverloopImage object to the number of LEDs on the board
everloop.Setup(&bus);
// switch off the leds
for (int i=0;i<18;i++)
{
image1d.leds[i].red = 0;
image1d.leds[i].green = 0;
image1d.leds[i].blue= 0;
image1d.leds[i].white = 0;
}
everloop.Write(&image1d);
}
else if (input_from_c == 1)
{
for (int i=0;i<18;i++)
{
image1d.leds[i].red = 100;
image1d.leds[i].green = 0;
image1d.leds[i].blue= 0;
image1d.leds[i].white = 0;
}
everloop.Write(&image1d);
}
return 1;
}
The calling code in C is just
#include <unistd.h>
#include <stdio.h>
int led_change (int);
int i;
int main (void) {
i = led_change(0);
printf("returned %d\n",i);
sleep(1);
i = led_change(1);
printf("returned second time %d\n",i);
}
Hope this is clear. Thanks for any help.
You need to compile and link all your C code as C++ modules. The reverse doesn't work. C doesn't know about the constructors that must be called for static objects in C++ modules, so the linker has to understand the C++ calling sequences. C++ was designed with compatibility with existing C code in mind, but C was not designed with C++ in mind.
Despite this, C++ compilers can normally also compile pure C code (in C language mode), so one of these compilers will do. It will generate code that can coexist in the same program without any problem.
You can have a main() function written in C in a C++ program, but always use a C++ linker to link that code into the program.
When C code is linked to C++ code using extern "C", the C++ compiler stops name mangling for those variables or functions when it creates symbol names in the object file. For ordinary C++ functions, names are mangled to support function overloading.
When static objects are returned from shared objects, they can cause problems in multi-threaded programs, for example when two or more threads modify the values in the static objects at the same time.

How to set the address of a function in C in a specific memory location

Since I am using an embedded system, I need to store a specific function in external memory at address 0x840140.
Here is the function:
//The function that I want to set its address to 0x840140
float myfunction(float x,float y) {
float z;
z=x+y;
return z;
}
void main() {
float w;
//Calling the function
w=myfunction(5.5,10.5);
}
Xilinx "MicroBlaze" seems to be using a GNU CC based compiler, which means it is (probably) using the GNU ld linker. It has a fairly extensive scripting language, so different sections of code, for example, can be located at different locations.
If you don't want ALL of your code to be located as one lump, you will need to "set" a section for the function in question, e.g:
float myfunction (float x, float y) __attribute__ ((section ("at840000.text")));
then use at840000.text in the linker script to tell the linker where you want the code to be placed.
Something like this:
SECTIONS {
    at840000.text 0x840000 : { *(at840000.text) }
}
(I'm not 100% sure about the exact syntax here, but something along those lines)
Disclaimer: I have never tried this.
It might be possible using a linker script. This article places code at a specific address for building a kernel. Check the section about "The linking part".
Are you sure you want the function stored in a specific memory location? You probably want just the function result.
void main() {
float *w = (float*)0x840140;
//Calling the function
*w=myfunction(5.5,10.5);
}
This will put the float returned by myfunction() at the requested memory location.

C++ load shared library and extract class implementations at runtime on linux platform

In C++, is it possible to load a shared library at execution time?
I want the user to choose which shared library to be loaded at runtime, without recompiling the whole program.
dlopen() is a solution for C, but my program is written in C++/Qt, and the symbols to extract are Qt-style classes. Is there a more "C++" way to do that?
You can do it in Qt using QLibrary in two ways. The following example calls a function from a shared library at runtime in two different ways:
#include <QCoreApplication>
#include <QLibrary>
#include <QDebug>
class Dynamic_library
{
public:
Dynamic_library();
virtual int sum( int len, int * data );
};
typedef Dynamic_library * (*get_object_func)();
typedef int (*call_sum_func)(int len , int * data);
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
QLibrary library( "./dynamic_library" );
library.load();
if( !library.isLoaded() )
{
qDebug() << "Cannot load library.";
return 0;
}
call_sum_func call_sum = (call_sum_func)library.resolve( "call_sum" );
if( call_sum )
{
//Dynamic_library * obj=get_object();
int * a=new int[3];
a[0]=2;
a[1]=3;
a[2]=4;
qDebug() << "sum of 2+3+4' = " << call_sum( 3, a ) <<"\n";
delete [] a;
}
get_object_func get_object = (get_object_func)library.resolve( "get_object" );
if( get_object )
{
Dynamic_library * obj=get_object();
int * a=new int[3];
a[0]=7;
a[1]=8;
a[2]=9;
qDebug() << "sum of 7+8+9' = " << obj->sum(3, a );
delete [] a;
}
return a.exec();
}
The code for the shared library is as follows:
class DYNAMIC_LIBRARYSHARED_EXPORT Dynamic_library
{
public:
Dynamic_library();
virtual int sum( int len, int * data );
};
extern "C" Q_DECL_EXPORT Dynamic_library * get_object()
{
return new Dynamic_library();
}
extern "C" Q_DECL_EXPORT int call_sum(int len, int * data)
{
return Dynamic_library().sum(len,data);
}
Dynamic_library::Dynamic_library()
{
}
int Dynamic_library::sum( int len, int *data )
{
int sum = 0;
for(int i=0; i<len; ++i )
sum += data[i];
return sum;
}
If the target library itself, or at least its specification, is under your control, then you shouldn't be using QLibrary - use the Qt plugin system instead. It doesn't require the call-via-pointer gymnastics otherwise needed.
If you insist on using a dlopen-like mechanism, there is nothing C-specific about QLibrary. The obvious limitation is that the library that you're trying to open must have been compiled with a C++ compiler that's ABI-compatible with the one you use to compile your own code. On Windows that really means using the same MSVC version.
Apart from that, you'll have to look up the mangled version of the symbol. Once you've done that, you can call the symbol using a function/method pointer that matches it. This won't work on constructors/destructors, by design. If you wish to create new instances of objects, you'll need a static factory method provided by the library.
If the library doesn't provide factory methods, you can implement a shim library that links to the target library by a generic name and does provide factory methods. You'll still need to call individual methods by function/method pointers.
1. Create a temporary folder.
2. Copy the shim library to the temporary folder.
3. Copy the target library, renamed to the generic name, into the temporary folder.
4. Save the value of the LD_LIBRARY_PATH environment variable.
5. Prepend the temporary folder to LD_LIBRARY_PATH.
6. Open/load the library.
7. Restore the saved value of LD_LIBRARY_PATH.
Of course, you must have the header file for whatever interface the library exposes. It generally can't be reconstructed given just a dynamic library file, primarily because the mangled symbols don't carry full structural information for the types used. For example, even if you can find a constructor for a given class, you won't know how big the class instance is (its sizeof).
Yes it's possible to do what you're describing on most operating systems, but how you do it is dependent on the system and regardless of the system it's definitely a bit more work on your end to make it happen.
The general steps are:
load the library
for each symbol you're interested in within the library, locate it and store to a variable for later use. (This can be done as-needed, rather than right away.)
For example, in pseudo-code (read: this won't compile!) on a *nix type system, let's assume your shared library has this in it:
// I'm declaring this _extern "C"_ to avoid name mangling, making it easier to
// specify a symbol name for dlsym() later
extern "C" int myFunction() {
return 10;
}
Assume this is in a library called libmyFunction.so. Your main application could, for example:
{
void *handle = dlopen("libmyFunction.so", <flags>);
if (!handle) return; // error: cannot locate the library!
int (*func)() = (int (*)())dlsym(handle, "myFunction");
if (!func) return; // error: cannot locate the symbol!
printf("The function returns: %d\n", func());
}
If you need to do this on Windows, the concept is the same but the function calls are different.

Multiplatform way to determine if a dynamic library is present

I need to check if a dynamic library is present, so that later I can safely call functions that use this library.
Is there a multiplatform way to check this? I am targeting MS Windows 7 (VC++11) and Linux (g++).
To dynamically "use" a function from a shared library requires that the library isn't part of the executable file, so you will need to write code to load the library and then use the function. There may well be ways to do that in a portable fashion, but I'm not aware of any code available to do that.
It isn't very hard code to write. As "steps", it involves the following:
1. Load the library given a name of a file (e.g. "xx", which is then translated to "xx.so" or "xx.dll" in the architecture specific code).
2. Find a function based on either index ("function number 1") or name ("function blah"), and return the address.
3. Repeat step 2 for all relevant functions.
4. When no longer needing the library, close it with the handle provided.
If step 1 fails, then your library isn't present (or otherwise "not going to work"), so you can't call functions in it...
Clearly, there are many ways to design an interface to provide this type of functionality, and exactly how you go about that would depend on what your actual problem setting is.
Edit:
To clarify the difference between using a DLL directly, and using one using dynamic loading from the code:
Imagine that this is our "shared.h", which defines the functions for the shared library
(There is probably some declspec(...) or exportsymbol or other such stuff in a real header, but I'll completely ignore that for now).
int func1();
char *func2(int x);
In a piece of code that directly uses the DLL, you'd just do:
#include <shared.h>
int main()
{
int x = func1();
char *str = func2(42);
cout << "x=" << x << " str=" << str << endl;
return 0;
}
Pretty straightforward, right?
When we use a shared library that is dynamically loaded by the code, it gets a fair bit more complex:
#include <shared.h>
typedef int (*ptrfunc1)();
typedef char * (*ptrfunc2)(int x);
int main()
{
SOMETYPE handle = loadlibrary("shared");
if (handle == ERROR_INDICATOR)
{
cerr << "Error: Couldn't load shared library 'shared'";
return 1;
}
ptrfunc1 pf1 = reinterpret_cast<ptrfunc1>(findfunc(handle, "func1"));
ptrfunc2 pf2 = reinterpret_cast<ptrfunc2>(findfunc(handle, "func2"));
int x = pf1();
char *str = pf2(42);
cout << "x=" << x << " str=" << str << endl;
return 0;
}
As you can see, the code suddenly got a lot more "messy". Never mind what hoops you have to jump through to find the constructor for a QObject, or worse, inherit from a QObject. In other words, if you are using Qt in your code, you are probably stuck with linking directly to "qt.lib" and your application WILL crash if a Qt environment isn't installed on the machine.
If the library is missing, the LoadLibrary call fails, which tells you whether the dynamic library is present. With dynamic loading you also get each function pointer from the library yourself, and if a pointer is null, that function is not supported on that platform.
On Windows you have the LoadLibrary API to load a dynamic library, and the GetProcAddress API to look up the desired function in it. If GetProcAddress returns NULL for the function you are looking for, that functionality is not present on that platform. You can then log this and decide on a fallback.

How can I intercept dlsym calls using LD_PRELOAD?

I want to intercept an application's calls to dlsym. I have tried declaring dlsym inside the .so that I am preloading, and using dlsym itself to get its real address, but that, for quite obvious reasons, didn't work.
Is there a way easier than taking process' memory maps, and using libelf to find the real location of dlsym inside loaded libdl.so?
WARNING:
I have to explicitly warn everyone who tries to do this. The general premise of having a shared library hook dlsym has several significant drawbacks. The biggest issue is that the original dlsym implementation in glibc internally uses stack unwinding techniques to find out from which loaded module the function was called. If the intercepting shared library then calls the original dlsym on behalf of the original application, this will break lookups using things like RTLD_NEXT, because the current module is no longer the originally calling one but your hook library.
It might be possible to implement this the correct way, but it requires a lot more work. Without having tried it, I think that by using dlinfo to get to the linked list of link maps, you could walk through all modules individually and do a separate dlsym for each one, to get the RTLD_NEXT behavior right. You still need the address of your caller for that, which you might get via the old backtrace(3) family of functions.
MY OLD ANSWER FROM 2013
I stumbled across the same problem with hdante's answer as the commenter: calling __libc_dlsym() directly crashes with a segfault. After reading some glibc sources, I came up with the following hack as a workaround:
extern void *_dl_sym(void *, const char *, void *);
extern void *dlsym(void *handle, const char *name)
{
/* my target binary is even asking for dlsym() via dlsym()... */
if (!strcmp(name,"dlsym"))
return (void*)dlsym;
return _dl_sym(handle, name, dlsym);
}
NOTE two things with this "solution":
This code bypasses the locking which is done internally by (__libc_)dlsym(), so to make this thread-safe, you should add some locking.
The third argument of _dl_sym() is the address of the caller. glibc seems to reconstruct this value by stack unwinding, but I just use the address of the function itself. The caller address is used internally to find the link map the caller is in, to get things like RTLD_NEXT right (and using NULL as the third argument will make the call fail with an error when using RTLD_NEXT). However, I have not looked at glibc's unwinding functionality, so I'm not 100% sure that the above code will do the right thing, and it may happen to work just by chance alone...
The solution presented so far has some significant drawbacks: _dl_sym() acts quite differently from the intended dlsym() in some situations. For example, trying to resolve a symbol which does not exist exits the program instead of just returning NULL. To work around that, one can use _dl_sym() just to get the pointer to the original dlsym() and use that for everything else (like in the "standard" LD_PRELOAD hook approach without hooking dlsym at all):
extern void *_dl_sym(void *, const char *, void *);
extern void *dlsym(void *handle, const char *name)
{
static void * (*real_dlsym)(void *, const char *)=NULL;
if (real_dlsym == NULL)
real_dlsym=_dl_sym(RTLD_NEXT, "dlsym", dlsym);
/* my target binary is even asking for dlsym() via dlsym()... */
if (!strcmp(name,"dlsym"))
return (void*)dlsym;
return real_dlsym(handle,name);
}
UPDATE FOR 2021 / glibc-2.34
Beginning with glibc 2.34, the function _dl_sym() is no longer publicly exported. Another approach I can suggest is to use dlvsym() instead, which is officially part of the glibc API and ABI. The only downside is that you now need the exact version to ask for the dlsym symbol. Fortunately, that is also part of the glibc ABI; unfortunately, it varies per architecture. However, a grep 'GLIBC_.*\bdlsym\b' -r sysdeps in the root folder of the glibc sources will tell you what you need:
[...]
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.0 dlsym F
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.34 dlsym F
[...]
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.2.5 dlsym F
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.34 dlsym F
Glibc-2.34 actually introduced new versions of this function, but the old versions are still kept around for backwards compatibility.
For x86_64, you could use:
real_dlsym=dlvsym(RTLD_NEXT, "dlsym", "GLIBC_2.2.5");
And if you would like to get both the newest version and, potentially, one provided by another interceptor in the same process, you can use that version to do an unversioned query again:
real_dlsym=real_dlsym(RTLD_NEXT, "dlsym");
If you actually need to hook both dlsym and dlvsym in your shared object, this approach of course won't work either.
UPDATE: hooking both dlsym() and dlvsym() at the same time
Out of curiosity, I thought about some approach to hook both of the glibc symbol query methods, and I came up with a solution using an additional wrapper library which links to libdl. The idea is that the interceptor library can dynamically load this library at runtime using dlopen() with the RTLD_LOCAL | RTLD_DEEPBIND flags, which will create a separate linker scope for this object, also containing the libdl, so that the dlsym and dlvsym will be resolved to the original methods, and not the one in the interceptor library. The problem now is that our interceptor library can not directly call any function inside the wrapper library, because we can not use dlsym, which is our original problem.
However, the shared library can have an initialization function, which the linker will call before the dlopen() returns. We just need to pass some information from the initialization function of the wrapper library to the interceptor library. Since both are in the same process, we can use the environment block for that.
This is the code I came up with:
dlsym_wrapper.h:
#ifndef DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_ENVNAME "DLSYM_WRAPPER_ORIG_FPTR"
#define DLSYM_WRAPPER_NAME "dlsym_wrapper.so"
typedef void* (*DLSYM_PROC_T)(void*, const char*);
#endif
dlsym_wrapper.c, compiled to dlsym_wrapper.so:
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include "dlsym_wrapper.h"
__attribute__((constructor))
static void dlsym_wrapper_init()
{
if (getenv(DLSYM_WRAPPER_ENVNAME) == NULL) {
/* big enough to hold our pointer as hex string, plus a NUL-terminator */
char buf[sizeof(DLSYM_PROC_T)*2 + 3];
DLSYM_PROC_T dlsym_ptr=dlsym;
if (snprintf(buf, sizeof(buf), "%p", dlsym_ptr) < (int)sizeof(buf)) {
buf[sizeof(buf)-1] = 0;
if (setenv(DLSYM_WRAPPER_ENVNAME, buf, 1)) {
// error, setenv failed ...
}
} else {
// error, writing pointer hex string failed ...
}
} else {
// error: environment variable already set ...
}
}
And one function in the interceptor library to get the pointer to the original dlsym() (should be called only once, guarded by a mutex):
static void *dlsym_wrapper_get_dlsym(void)
{
const char *dlsym_wrapper_name = DLSYM_WRAPPER_NAME;
void *wrapper;
const char * ptr_str;
void *res = NULL;
void *ptr = NULL;
if (getenv(DLSYM_WRAPPER_ENVNAME)) {
// error: already defined, shouldn't be...
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND | RTLD_NOLOAD);
if (wrapper) {
// error: dlsym_wrapper.so already loaded ...
// it is important that we load it by ourselves into a separate linker scope
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND);
if (!wrapper) {
// error: dlsym_wrapper.so can't be loaded
}
ptr_str = getenv(DLSYM_WRAPPER_ENVNAME);
if (!ptr_str) {
// error: dlsym_wrapper.so failed...
}
if (sscanf(ptr_str, "%p", &ptr) == 1) {
if (ptr) {
// success!
res = ptr;
} else {
// error: got invalid pointer ...
}
} else {
// error: failed to parse pointer...
}
// this is a bit evil: close the wrapper. we can be sure
// that libdl is still in use, as this module uses it (dlopen)
dlclose(wrapper);
return res;
}
This of course assumes that dlsym_wrapper.so is in the library search path. However, you may prefer to just inject the interceptor library via LD_PRELOAD using a full path, and not modifying LD_LIBRARY_PATH at all. To do so, you can add dladdr(dlsym_wrapper_get_dlsym,...) to find the path of the injector library itself, and use that for searching the wrapper library, too.
http://www.linuxforu.com/2011/08/lets-hook-a-library-function/
From the text:
Do beware of functions that themselves call dlsym(), when you need to call __libc_dlsym (handle, symbol) in the hook.
extern void *__libc_dlsym (void *, const char *);
void *dlsym(void *handle, const char *symbol)
{
printf("Ha Ha...dlsym() Hooked\n");
void* result = __libc_dlsym(handle, symbol); /* now, this will call dlsym() library function */
return result;
}