I want to intercept an application's calls to dlsym. I have tried declaring dlsym inside the .so that I am preloading, and using dlsym itself to get its real address, but for quite obvious reasons that didn't work.
Is there an easier way than reading the process's memory maps and using libelf to find the real location of dlsym inside the loaded libdl.so?
WARNING:
I have to explicitly warn everyone who tries to do this. The general premise of having a shared library hook dlsym has several significant drawbacks. The biggest issue is that the original dlsym implementation of glibc internally uses stack unwinding techniques to find out which loaded module the function was called from. If the intercepting shared library then calls the original dlsym on behalf of the original application, this will break lookups that use things like RTLD_NEXT, because the current module is now your hook library and not the original caller.
It might be possible to implement this the correct way, but it requires a lot more work. Without having tried it, I think that by using dlinfo to get to the linked list of link maps, you could walk through all modules individually and do a separate dlsym for each one to get the RTLD_NEXT behavior right. You still need the address of your caller for that, which you might get via the old backtrace(3) family of functions.
MY OLD ANSWER FROM 2013
I stumbled across the same problem with hdante's answer as the commenter: calling __libc_dlsym() directly crashes with a segfault. After reading some glibc sources, I came up with the following hack as a workaround:
#include <string.h>

extern void *_dl_sym(void *, const char *, void *);

extern void *dlsym(void *handle, const char *name)
{
    /* my target binary is even asking for dlsym() via dlsym()... */
    if (!strcmp(name,"dlsym"))
        return (void*)dlsym;
    return _dl_sym(handle, name, dlsym);
}
NOTE two things with this "solution":
This code bypasses the locking which is done internally by (__libc_)dlsym(), so to make this thread-safe, you should add some locking.
The third argument of _dl_sym() is the address of the caller. glibc seems to reconstruct this value by stack unwinding, but I just use the address of the function itself. The caller address is used internally to find the link map the caller is in, to get things like RTLD_NEXT right (and using NULL as the third argument will make the call fail with an error when using RTLD_NEXT). However, I have not looked at glibc's unwinding functionality, so I'm not 100% sure that the above code will do the right thing, and it may happen to work just by chance alone...
The solution presented so far has some significant drawbacks: _dl_sym() acts quite differently from the intended dlsym() in some situations. For example, trying to resolve a symbol which does not exist exits the program instead of just returning NULL. To work around that, one can use _dl_sym() just to get the pointer to the original dlsym() and use that for everything else (as in the "standard" LD_PRELOAD hook approach without hooking dlsym at all):
#define _GNU_SOURCE /* for RTLD_NEXT */
#include <dlfcn.h>
#include <string.h>

extern void *_dl_sym(void *, const char *, void *);

extern void *dlsym(void *handle, const char *name)
{
    static void * (*real_dlsym)(void *, const char *)=NULL;
    if (real_dlsym == NULL)
        real_dlsym=_dl_sym(RTLD_NEXT, "dlsym", dlsym);
    /* my target binary is even asking for dlsym() via dlsym()... */
    if (!strcmp(name,"dlsym"))
        return (void*)dlsym;
    return real_dlsym(handle,name);
}
UPDATE FOR 2021 / glibc-2.34
Beginning with glibc 2.34, the function _dl_sym() is no longer publicly exported. Another approach I can suggest is to use dlvsym() instead, which is officially part of the glibc API and ABI. The only downside is that you now need the exact version to ask for the dlsym symbol. Fortunately, that is also part of the glibc ABI; unfortunately, it varies per architecture. However, a grep 'GLIBC_.*\bdlsym\b' -r sysdeps in the root folder of the glibc sources will tell you what you need:
[...]
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.0 dlsym F
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.34 dlsym F
[...]
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.2.5 dlsym F
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.34 dlsym F
glibc 2.34 actually introduced new versions of this function, but the old versions are still kept around for backwards compatibility.
For x86_64, you could use:
real_dlsym=dlvsym(RTLD_NEXT, "dlsym", "GLIBC_2.2.5");
And if you want to get the newest version, or possibly the dlsym of another interceptor in the same process, you can use that pointer to do an unversioned query again:
real_dlsym=real_dlsym(RTLD_NEXT, "dlsym");
If you actually need to hook both dlsym and dlvsym in your shared object, this approach of course won't work either.
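Since the version string differs per architecture, one option is to pick it at compile time. The following is only a sketch: the helper name dlsym_symbol_version is made up for illustration, and it knows just the two architectures shown in the grep output above. Any other target would need its own entry looked up in the glibc sources.

```c
#include <stddef.h>

/* Hypothetical helper: returns the oldest symbol version of dlsym for the
 * target architecture, or NULL when this sketch has no entry for it.
 * Only the two values shown in the abilist output above are filled in. */
static const char *dlsym_symbol_version(void)
{
#if defined(__x86_64__) && !defined(__ILP32__)
    return "GLIBC_2.2.5";
#elif defined(__i386__)
    return "GLIBC_2.0";
#else
    return NULL;   /* check sysdeps/.../libc.abilist for this target */
#endif
}
```

The interceptor would then call dlvsym(RTLD_NEXT, "dlsym", version) only when the helper returns a non-NULL string, and report an error otherwise.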
UPDATE: hooking both dlsym() and dlvsym() at the same time
Out of curiosity, I thought about some approach to hook both of the glibc symbol query methods, and I came up with a solution using an additional wrapper library which links to libdl. The idea is that the interceptor library can dynamically load this library at runtime using dlopen() with the RTLD_LOCAL | RTLD_DEEPBIND flags, which will create a separate linker scope for this object, also containing the libdl, so that the dlsym and dlvsym will be resolved to the original methods, and not the one in the interceptor library. The problem now is that our interceptor library can not directly call any function inside the wrapper library, because we can not use dlsym, which is our original problem.
However, the shared library can have an initialization function, which the linker will call before the dlopen() returns. We just need to pass some information from the initialization function of the wrapper library to the interceptor library. Since both are in the same process, we can use the environment block for that.
This is the code I came up with:
dlsym_wrapper.h:
#ifndef DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_ENVNAME "DLSYM_WRAPPER_ORIG_FPTR"
#define DLSYM_WRAPPER_NAME "dlsym_wrapper.so"
typedef void* (*DLSYM_PROC_T)(void*, const char*);
#endif
dlsym_wrapper.c, compiled to dlsym_wrapper.so:
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include "dlsym_wrapper.h"
__attribute__((constructor))
static void dlsym_wrapper_init()
{
if (getenv(DLSYM_WRAPPER_ENVNAME) == NULL) {
/* big enough to hold our pointer as hex string, plus a NUL-terminator */
char buf[sizeof(DLSYM_PROC_T)*2 + 3];
DLSYM_PROC_T dlsym_ptr=dlsym;
if (snprintf(buf, sizeof(buf), "%p", dlsym_ptr) < (int)sizeof(buf)) {
buf[sizeof(buf)-1] = 0;
if (setenv(DLSYM_WRAPPER_ENVNAME, buf, 1)) {
// error, setenv failed ...
}
} else {
// error, writing pointer hex string failed ...
}
} else {
// error: environment variable already set ...
}
}
And one function in the interceptor library to get the pointer to the original dlsym() (should be called only once, guarded by a mutex):
static void *dlsym_wrapper_get_dlsym(void)
{
const char *dlsym_wrapper_name = DLSYM_WRAPPER_NAME;
void *wrapper;
const char * ptr_str;
void *res = NULL;
void *ptr = NULL;
if (getenv(DLSYM_WRAPPER_ENVNAME)) {
// error: already defined, shouldn't be...
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND | RTLD_NOLOAD);
if (wrapper) {
// error: dlsym_wrapper.so already loaded ...
// it is important that we load it by ourselves into a separate linker scope
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND);
if (!wrapper) {
// error: dlsym_wrapper.so can't be loaded
}
ptr_str = getenv(DLSYM_WRAPPER_ENVNAME);
if (!ptr_str) {
// error: dlsym_wrapper.so failed...
}
if (sscanf(ptr_str, "%p", &ptr) == 1) {
if (ptr) {
// success!
res = ptr;
} else {
// error: got invalid pointer ...
}
} else {
// error: failed to parse pointer...
}
// this is a bit evil: close the wrapper. we can be sure
// that libdl is still in use, as this module uses it (dlopen)
dlclose(wrapper);
return res;
}
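The "called only once, guarded by a mutex" part could look roughly like the sketch below. The names get_real_dlsym, resolve_real_dlsym_stub and resolve_calls are hypothetical; the stub only stands in for dlsym_wrapper_get_dlsym() so that the fragment compiles on its own, and the counter exists purely to make the behaviour observable.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t real_dlsym_lock = PTHREAD_MUTEX_INITIALIZER;
static void *real_dlsym_ptr = NULL;
static int resolve_calls = 0;   /* counts how often the resolver actually ran */

/* Stand-in for dlsym_wrapper_get_dlsym(); a real build would call that. */
static void *resolve_real_dlsym_stub(void)
{
    static int dummy;
    return &dummy;
}

/* Returns the cached pointer, resolving it on the first call only.
 * The mutex keeps two threads from running the resolver concurrently. */
static void *get_real_dlsym(void)
{
    void *result;

    pthread_mutex_lock(&real_dlsym_lock);
    if (real_dlsym_ptr == NULL) {
        resolve_calls++;
        real_dlsym_ptr = resolve_real_dlsym_stub();
    }
    result = real_dlsym_ptr;
    pthread_mutex_unlock(&real_dlsym_lock);
    return result;
}
```

If the resolver fails and returns NULL, this version simply retries on the next call; whether that is desirable depends on the error handling you choose.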
This of course assumes that dlsym_wrapper.so is in the library search path. However, you may prefer to just inject the interceptor library via LD_PRELOAD using a full path, and not modifying LD_LIBRARY_PATH at all. To do so, you can add dladdr(dlsym_wrapper_get_dlsym,...) to find the path of the injector library itself, and use that for searching the wrapper library, too.
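Once dladdr() has reported the interceptor's own file name (the dli_fname field), building the wrapper's path is plain string handling. A minimal sketch, assuming the wrapper is installed in the same directory as the interceptor; sibling_path is a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given the full path of the interceptor library (for
 * example the dli_fname string that dladdr() reports), build the path of a
 * file with another name in the same directory.
 * Returns 0 on success, -1 if the result does not fit into out. */
static int sibling_path(const char *self_path, const char *name,
                        char *out, size_t out_size)
{
    const char *slash = strrchr(self_path, '/');
    /* no directory part: fall back to the bare name (normal search path) */
    int dir_len = slash ? (int)(slash - self_path) + 1 : 0;
    int n = snprintf(out, out_size, "%.*s%s", dir_len, self_path, name);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The result can be passed to dlopen() in place of the bare DLSYM_WRAPPER_NAME.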
http://www.linuxforu.com/2011/08/lets-hook-a-library-function/
From the text:
Do beware of functions that themselves call dlsym(), when you need to call __libc_dlsym (handle, symbol) in the hook.
extern void *__libc_dlsym (void *, const char *);
void *dlsym(void *handle, const char *symbol)
{
printf("Ha Ha...dlsym() Hooked\n");
void* result = __libc_dlsym(handle, symbol); /* now, this will call dlsym() library function */
return result;
}
The problem I have is that I want to create a generic command line application that can be used to load a library DLL and then call a function in the library DLL. The function name is specified on the command line with the arguments also provided on the utility command line.
I can access the external function from a DLL dynamically loaded using the LoadLibrary() function. Once the library is loaded, I can obtain a pointer to the function using GetProcAddress(). I want to call the function with the arguments specified on the command line.
Can I pass a void-pointer list to the function pointer returned by GetProcAddress(), similar to the example below?
To keep the example code simple, I deleted the error-checking. Is there a way to get something like this working:
//Somewhere in another dll
int DoStuff(int a, int b)
{
return a + b;
}
int main(int argc, char **argv)
{
void *retval;
void *list = argv[3];
HMODULE dll;
void* (*generic_function)(void*);
dll = LoadLibraryA(argv[1]);
//argv[2] = "DoStuff"
generic_function = GetProcAddress(dll, argv[2]);
//argv[3] = 4, argv[4] = 7, argv[5] = NULL
retval = generic_function(list);
}
If I forgot to mention necessary information, please let me know.
Thanks in advance
You need to cast the function pointer returned by GetProcAddress to one with the right argument types before calling it. One way to manage it is to have a number of call-adaptor functions that do the right thing for every possible function type you might want to call:
void Call_II(void (*fn_)(), char **args) {
void (*fn)(int, int) = (void (*)(int, int))fn_;
fn(atoi(args[0]), atoi(args[1]));
}
void Call_IS(void (*fn_)(), char **args) {
void (*fn)(int, char *) = (void (*)(int, char *))fn_;
fn(atoi(args[0]), args[1]);
}
...various more functions
Then you take the pointer you got from GetProcAddress and the additional arguments and pass them to the correct Call_X function:
void (*generic_function)();
dll = LoadLibraryA(argv[1]);
//argv[2] = "DoStuff"
generic_function = (void (*)())GetProcAddress(dll, argv[2]);
//argv[3] = 4, argv[4] = 7, argv[5] = NULL
Call_II(generic_function, &argv[3]);
The problem is that you need to know what the type of the function you're getting the pointer for is and call the appropriate adaptor function. Which generally means making a table of function name/adaptors and doing a lookup in it.
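A table of function names and adaptors could be sketched like this. To keep it runnable without a DLL, DoStuff is defined locally as a stand-in for the export; on Windows the pointer would come from GetProcAddress and be cast to generic_fn. The names generic_fn, adaptor_entry and find_adaptor are made up for the example, and unlike the adaptors above this Call_II returns the int result so it can be inspected.

```c
#include <stdlib.h>
#include <string.h>

typedef void (*generic_fn)(void);                      /* type-erased function pointer */
typedef int (*adaptor_fn)(generic_fn fn, char **args);

/* Adaptor for functions of type int f(int, int). */
static int Call_II(generic_fn fn, char **args)
{
    int (*typed)(int, int) = (int (*)(int, int))fn;
    return typed(atoi(args[0]), atoi(args[1]));
}

/* One row per callable function: its name and the adaptor matching its type. */
struct adaptor_entry { const char *name; adaptor_fn adaptor; };

static const struct adaptor_entry adaptor_table[] = {
    { "DoStuff", Call_II },
};

/* Returns the adaptor registered for name, or NULL if the name is unknown. */
static adaptor_fn find_adaptor(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof adaptor_table / sizeof adaptor_table[0]; i++)
        if (strcmp(adaptor_table[i].name, name) == 0)
            return adaptor_table[i].adaptor;
    return NULL;
}

/* Local stand-in for the DLL export, so the sketch runs without a DLL. */
static int DoStuff(int a, int b) { return a + b; }
```

The utility would look up argv[2] with find_adaptor, refuse to continue on NULL, and otherwise call the adaptor with the looked-up function pointer and &argv[3].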
The related problem is that there's no function analogous to GetProcAddress that will tell you the argument types for a function in the library -- that information simply isn't stored anywhere accessible in the DLL.
A library DLL contains the object code for the functions that are part of the library along with some additional information to allow the DLL to be usable.
However a library DLL does not contain the actual type information needed to determine the specific argument list and types for the functions contained in the library DLL. The main information in a library DLL is: (1) a list of the functions that the DLL exports along with the address information that will connect a call of a function to the actual function binary code and (2) a list of any required DLLs that the functions in the library DLL use.
You can actually open a library DLL in a text editor, I suggest a small one, and scan through the arcane symbols of the binary code until you reach the section that contains the list of functions in the library DLL as well as other required DLLs.
So a library DLL contains the bare minimum information needed to (1) find a particular function in the library DLL so that it can be invoked and (2) a list of other needed DLLs that the functions in the library DLL depend on.
This is different from a COM object which normally does have type information in order to support the ability to do what is basically reflection and explore the COM object's services and how those services are accessed. You can do this with Visual Studio and other IDEs which generate a list of COM objects installed and allow you to load a COM object and explore it. Visual Studio also has a tool that will generate the source code files that provide the stubs and include file for accessing the services and methods of a COM object.
However a library DLL is different from a COM object and all the additional information provided with a COM object is not available from a library DLL. Instead a library DLL package is normally made up of (1) the library DLL itself, (2) a .lib file that contains the linkage information for the library DLL along with the stubs and functionality to satisfy the linker when building your application which uses the library DLL, and (3) an include file with the function prototypes of the functions in the library DLL.
So you create your application by calling the functions which reside in the library DLL but using the type information from the include file and linking with the stubs of the associated .lib file. This procedure allows Visual Studio to automate much of the work required to use a library DLL.
Or you can hand code the LoadLibrary() and the building of a table of the functions in the library DLL using GetProcAddress(). By doing hand coding all you really need are the function prototypes of the functions in the library DLL which you then can type in yourself and the library DLL itself. You are in effect doing the work by hand that the Visual Studio compiler does for you if you are using the .lib library stubs and include file.
If you know the actual function name and the function prototype of a function in a library DLL then what you could do is to have your command line utility require the following information:
the name of the function to be called as a text string on the command line
the list of the arguments to be used as a series of text strings on the command line
an additional parameter that describes the function prototype
This is similar to how functions in the C and C++ runtime which accept variable argument lists with unknown parameter types work. For instance the printf() function which prints a list of argument values has a format string followed by the arguments to be printed. The printf() function uses the format string to determine the types of the various arguments, how many arguments to expect, and what kinds of value transformations to do.
So if your utility had a command line something like the following:
dofunc "%s,%d,%s" func1 "name of " 3 " things"
And the library DLL had a function whose prototype looked like:
void func1 (char *s1, int i, char *s2);
then the utility would dynamically generate the function call by transforming the character strings of the command line into the actual types needed for the function to be called.
This would work for simple functions that take Plain Old Data types. However, more complicated types, such as a struct argument, would require more work, as you would need some kind of description of the struct along with some kind of argument description, perhaps similar to JSON.
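The first half of that job, turning the text arguments into typed values according to the descriptor, could be sketched as follows. This is an illustration only: it understands just %d and %s, the names arg_value and parse_args are invented here, and actually placing the values into a call is a separate, calling-convention-specific step.

```c
#include <stdlib.h>

typedef enum { ARG_INT, ARG_STR } arg_kind;

typedef struct {
    arg_kind kind;
    union { int i; const char *s; } v;
} arg_value;

/* Converts text arguments according to a descriptor such as "%s,%d,%s".
 * Returns the number of converted arguments, or -1 on a malformed
 * descriptor, too few text arguments, or too many for out. */
static int parse_args(const char *desc, char **text, int ntext,
                      arg_value *out, int max_out)
{
    int n = 0;

    while (*desc) {
        if (desc[0] != '%' || (desc[1] != 'd' && desc[1] != 's'))
            return -1;
        if (n >= ntext || n >= max_out)
            return -1;
        if (desc[1] == 'd') {
            out[n].kind = ARG_INT;
            out[n].v.i = atoi(text[n]);
        } else {
            out[n].kind = ARG_STR;
            out[n].v.s = text[n];
        }
        n++;
        desc += 2;
        if (*desc == ',')
            desc++;                 /* skip the separator */
        else if (*desc != '\0')
            return -1;              /* anything else is malformed */
    }
    return n;
}
```

For the command line shown above, the descriptor would be argv[1] and the text arguments would start at argv[3].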
Appendix I: A simple example
The following is the source code for a Visual Studio Windows console application that I ran in the debugger. The command arguments in the Properties was pif.dll PifLogAbort which caused a library DLL from another project, pif.dll, to be loaded and then the function PifLogAbort() in that library to be invoked.
NOTE: The following example depends on a stack-based argument passing convention, as is used by most 32-bit x86 compilers. Most compilers also allow a calling convention other than stack-based argument passing to be specified, such as the __fastcall modifier of Visual Studio. Also, as pointed out in the comments, the default for x64 with 64-bit Visual Studio is the __fastcall convention, so that function arguments are passed in registers and not on the stack. See Overview of x64 Calling Conventions in the Microsoft MSDN. See also the comments and discussion in How are variable arguments implemented in gcc?
Notice how the argument list to the function PifLogAbort() is built as a structure that contains an array. The argument values are put into the array of a variable of the struct and then the function is called passing the entire struct by value. What this does is to push a copy of the array of parameters onto the stack and then calls the function. The PifLogAbort() function sees the stack based on its argument list and processes the array elements as individual arguments or parameters.
// dllfunctest.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
typedef struct {
UCHAR *myList[4];
} sarglist;
typedef void ((*libfunc) (sarglist q));
/*
* do a load library to a DLL and then execute a function in it.
*
* dll name.dll "funcname"
*/
int _tmain(int argc, _TCHAR* argv[])
{
HMODULE dll = LoadLibrary(argv[1]);
if (dll == NULL) return 1;
// convert the command line argument for the function name, argv[2] from
// a TCHAR to a standard CHAR string which is what GetProcAddress() requires.
char funcname[256] = {0};
for (int i = 0; i < 255 && argv[2][i]; i++) {
funcname[i] = argv[2][i];
}
libfunc generic_function = (libfunc) GetProcAddress(dll, funcname);
if (generic_function == NULL) return 2;
// build the argument list for the function and then call the function.
// function prototype for PifLogAbort() function exported from the library DLL
// is as follows:
// VOID PIFENTRY PifLogAbort(UCHAR *lpCondition, UCHAR *lpFilename, UCHAR *lpFunctionname, ULONG ulLineNo);
sarglist xx = {{(UCHAR *)"xx1", (UCHAR *)"xx2", (UCHAR *)"xx3", (UCHAR *)1245}};
generic_function(xx);
return 0;
}
This simple example illustrates some of the technical hurdles that must be overcome. You will need to know how to translate the various parameter types into the proper alignment in a memory area which is then pushed onto the stack.
The interface to this example function is remarkably homogeneous in that most of the arguments are unsigned char pointers, with the exception of the last, which is a ULONG. In a 32-bit executable all four of these arguments have the same length in bytes. With a more varied list of types in the argument list, you will need to understand how your compiler aligns parameters when it pushes the arguments onto the stack before doing the call.
Appendix II: Extending the simple example
Another possibility is to have a set of helper functions along with a different version of the struct. The struct provides a memory area to create a copy of the necessary stack, and the helper functions are used to build the copy.
So the struct and its helper functions may look like the following.
typedef struct {
UCHAR myList[128];
} sarglist2;
typedef struct {
int i;
sarglist2 arglist;
} sarglistlist;
typedef void ((*libfunc2) (sarglist2 q));
void pushInt (sarglistlist *p, int iVal)
{
*(int *)(p->arglist.myList + p->i) = iVal;
p->i += sizeof(int);
}
void pushChar (sarglistlist *p, unsigned char cVal)
{
*(unsigned char *)(p->arglist.myList + p->i) = cVal;
p->i += sizeof(unsigned char);
}
void pushVoidPtr (sarglistlist *p, void * pVal)
{
*(void * *)(p->arglist.myList + p->i) = pVal;
p->i += sizeof(void *);
}
And then the struct and helper functions would be used to build the argument list like the following after which the function from the library DLL is invoked with the copy of the stack provided:
sarglistlist xx2 = {0};
pushVoidPtr (&xx2, "xx1");
pushVoidPtr (&xx2, "xx2");
pushVoidPtr (&xx2, "xx3");
pushInt (&xx2, 12345);
libfunc2 generic_function2 = (libfunc2) GetProcAddress(dll, funcname);
generic_function2(xx2.arglist);
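The helpers above write each value at the next free byte, which only matches the real stack layout when all values have the same size. A variant that rounds the offset up to a requested alignment first could look like the sketch below. The names arg_area and push_aligned are hypothetical, and the alignment passed in is the caller's assumption about the target calling convention, not something the code can discover.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t used;              /* bytes filled so far */
    unsigned char bytes[128]; /* raw argument area */
} arg_area;

/* Copies size bytes from src into the area at the next offset that is a
 * multiple of align (align must be a power of two).
 * Returns the offset used, or (size_t)-1 if the value does not fit. */
static size_t push_aligned(arg_area *a, const void *src, size_t size, size_t align)
{
    size_t off = (a->used + align - 1) & ~(align - 1);

    if (off > sizeof a->bytes || size > sizeof a->bytes - off)
        return (size_t)-1;
    memset(a->bytes + a->used, 0, off - a->used);   /* zero the padding */
    memcpy(a->bytes + off, src, size);              /* memcpy avoids unaligned stores */
    a->used = off + size;
    return off;
}
```

Pushing an unsigned char followed by an int with align set to sizeof(int) leaves the int at offset sizeof(int) instead of offset 1.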
Somewhat related to my previous question here
Is there a way to get the calling object from within a function or method in D?
example:
class Foo
{
public void bar()
{
auto ci = whoCalledMe();
// ci should be something that points me to baz.qux, _if_ baz.qux made the call
}
}
class Baz
{
void qux()
{
auto foo = new Foo();
foo.bar();
}
}
Questions:
Does something like whoCalledMe exist? If so, what is it called?
If something does exist, can it be used at compile time (in a template)? If so, how?
Alternatively:
Is it possible to get access to the call stack at runtime, like with PHP's debug_backtrace?
To expand on what CyberShadow said, since you can get the fully qualified name of the function by using __FUNCTION__, you can also get the function as a symbol using a mixin:
import std.stdio;
import std.typetuple;
void callee(string file=__FILE__, int line=__LINE__, string func=__FUNCTION__)()
{
alias callerFunc = TypeTuple!(mixin(func))[0];
static assert(&caller == &callerFunc);
callerFunc(); // will eventually overflow the stack
}
void caller()
{
callee();
}
void main()
{
caller();
}
The stack will overflow here since these two functions end up calling each other recursively indefinitely.
It's not directly possible to get information about your "caller". You might have some luck getting the address from the call stack, but this is a low-level operation and depends on things such as whether your program was compiled with stack frames. After you have the address, you could in theory convert it to a function name and line number, provided debugging symbols are available for your program's binary, but (again) this is highly platform-specific and depends on the toolchain used to compile your program.
As an alternative, you might find this helpful:
void callee(string file=__FILE__, int line=__LINE__, string func=__FUNCTION__)()
{
writefln("I was called by %s, which is in %s at line %d!", func, file, line);
}
void caller()
{
// Thanks to IFTI, we can call the function as usual.
callee();
}
But note that you can't use this trick for non-final class methods, because every call to the function will generate a new template instance (and the compiler needs to know the address of all virtual methods of a class beforehand).
Finding the caller is something debuggers do and generally requires having built the program with symbolic debug information switches turned on. Reading the debug info to figure this out is highly system dependent and is pretty advanced.
The exception unwinding mechanism also finds the caller, but those tables are not generated for functions that don't need them, and the tables do not include the name of the function.
I'm unsure if the question is phrased correctly, or if what I want to do is possible.
I have an existing GCC application (compiled for a Cortex-M3, if that matters). What I want to do is create a little piece of functionality (just a single method, or a few methods) that can call into the existing application.
I want to place these few methods at a specific memory location (I know how to do that). What I don't know how to do is get the new application to compile/link with the objects of the existing application.
For instance, my existing application has the function:
int Add(int a, int b);
And new application wants to use it:
int Calculate(int a, int b, int opType)
{
return Add(a, b);
}
I have access to all linker, obj, h files, etc.
You can't usually link to executables, only libraries (static or shared) and object files. So, the best advice I can give would be to build the "core" of the first program as a shared library, and link the "front-end" (main) as an executable built against the core shared lib. Then, your second program can also just be a program linked against the shared library.
You can also use dlopen on dynamic executables to link the executable at runtime, and use dlsym to get function pointers for the desired functionality, though this is usually only used if you have no control over the first executable.
Example of the latter (note again that this should be a last resort):
a.c:
#include <stdio.h>
int main() { printf("hello world!\n"); return 42; }
b.c:
#include <stdio.h>
#include <dlfcn.h>
int main() {
void *handle = dlopen("a", RTLD_LAZY);
if(!handle) {
printf("failed: %s\n", dlerror());
return -1;
}
int (*amain)() = dlsym(handle, "main");
if(!amain) {
printf("dlsym failed: %s\n", dlerror());
return -1;
}
return amain();
}
Thanks for your input. However, I was able to do exactly what I wanted by compiling the new application with the ELF file from the existing application as an input to the linker, by specifying
--just-symbols elffile.elf
If you're using a Linux variant, then the answer given by @nneonneo to use dlopen() and dlsym() is the best approach.
However, assuming that you're using another OS (or none at all) and/or you really really need this code to live at a fixed location (for example if you need to shift execution to an address on a specific memory device, eg. if doing flash manipulation), you can use a hard coded function pointer.
Declare a function pointer as follows:
typedef int (*AddFnPtr)(int a, int b);
AddFnPtr MyAddFunction = (AddFnPtr)ADDRESS_OF_YOUR_FUNCTION;
Then call as:
int Calculate(int a, int b, int opType)
{
return MyAddFunction(a, b);
}
Note that the linker has no way of knowing if the code that you've put at that location has the right prototype, or even exists - so there is no error checking either at link time or at run time.
You will probably (depending on the OS) also need to take steps to map the absolute memory location at which you've put your function into the local process's address space.
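One way to get at least a little run-time checking back is to place a small table at the fixed address instead of a bare function, and to start it with a marker value that the caller verifies. This is only a sketch under that assumption: API_MAGIC, api_table and checked_add are names invented here, and the demo tables stand in for whatever actually lives at the fixed address.

```c
#include <stddef.h>
#include <stdint.h>

#define API_MAGIC 0x41444431u    /* arbitrary marker value, chosen for this sketch */

/* Layout both images must agree on; the existing application would place
 * one instance of this at the agreed fixed address. */
typedef struct {
    uint32_t magic;
    int (*add)(int a, int b);
} api_table;

/* Calls add through the table only if the table looks valid.
 * Returns 0 and stores the sum in *result, or -1 if the check fails. */
static int checked_add(const api_table *table, int a, int b, int *result)
{
    if (table == NULL || table->magic != API_MAGIC || table->add == NULL)
        return -1;
    *result = table->add(a, b);
    return 0;
}

/* Stand-ins so the sketch runs on a host; on the target the table pointer
 * would be a cast of the agreed fixed address instead. */
static int Add(int a, int b) { return a + b; }
static const api_table demo_table = { API_MAGIC, Add };
static const api_table erased_table = { 0xFFFFFFFFu, NULL };
```

A marker check like this catches an erased or missing table, but it still cannot prove that the function behind the pointer has the expected prototype.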
I am creating a small C++ wrapper shared library around a Fortran 95 library. Since the Fortran symbols contain . in the symbol name, I have to use dlsym to load the Fortran function into a C++ function pointer.
Currently, I have a bunch of global function pointers in header files:
// test.h
extern void (*f)(int* arg);
and I populate them in the corresponding C++ file:
// test.cc
void (*f)(int*) = reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_"));
Questions:
If I do it this way, when are these pointers populated?
Can I assume them to be loaded in my executable that loads this library?
In particular, can I use these functions in statically created objects in my executable or other libraries? Or does this suffer from the static initialization order fiasco?
If the above way is not correct, what is the most elegant way of populating these pointers such that they can be used in static objects in executables and other libraries?
I am using the Sun Studio compiler on Solaris, if that makes a difference, but I would also be interested in a solution for GCC on Linux.
Where does the line
f = reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_"));
occur in test.cc? The pointer will be initialized when the line is
executed (which of course depends on when the function which contains it
is called). Or did you mean to write
void (*f)(int* ) = reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_"));
? In this case, the pointer will be initialized during static
initialization. Which means that you still have order of initialization
issues if you try to use the pointers in the constructor of a static
object.
The classical solution for this would be to use some sort of singleton:
struct LibraryPointers
{
void (*f)(int* );
// ...
static LibraryPointers const& instance();
private:
LibraryPointers();
};
LibraryPointers const&
LibraryPointers::instance()
{
static LibraryPointers theOneAndOnly;
return theOneAndOnly;
}
LibraryPointers::LibraryPointers()
: f( reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_")) )
// , initialization of other pointers...
{
}
Then wrap the library in a C++ class which uses this structure to get
the addresses of the pointers.
And one last remark: the reinterpret_cast you are trying to do isn't
legal, at least not formally. (I think that both Sun CC and g++ will
accept it, however.) According to POSIX, the correct way to get a
pointer to function from dlsym would be:
void (*f)(int* );
*reinterpret_cast<void**>(&f) = dlsym(...);
This doesn't lend itself to initializations, however.
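One way around that limitation is to hide the assignment inside a small function and use its return value instead. The sketch below is written in C and resolves atoi from the running program purely as a harmless demonstration target; load_atoi_like and atoi_fn are names made up for the example, and the same pattern would apply to the Fortran symbol.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef int (*atoi_fn)(const char *);

/* Resolves name from the global symbol scope of the running program and
 * returns it as an atoi-like function pointer, or NULL if it is not found. */
static atoi_fn load_atoi_like(const char *name)
{
    atoi_fn fn = NULL;
    /* handle for the main program; it is not closed here, which is harmless */
    void *self = dlopen(NULL, RTLD_LAZY);

    if (self == NULL)
        return NULL;
    /* the form described above: store the object pointer through a void** */
    *(void **)(&fn) = dlsym(self, name);
    return fn;
}
```

In C++ the returned pointer could then initialize a member in the constructor's initializer list, as in the singleton above. On older glibc versions the program may need to be linked with -ldl.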
I have code in C++ that calls functions from an external library. The function I call is CreateProcess, as shown below.
CreateProcess(NULL,pProcessName,NULL,NULL,false,CREATE_SUSPENDED,
NULL,NULL,&suStartUpInformation,&piProcessInformation)
Now when I compile the code and disassemble it, the assembly shows the plain text as CreateProcess(args1, args2, ...). Is there any way to obfuscate or encrypt the API function call so that someone who disassembles it won't know which functions are called?
Thanks!
Any function that is imported by name will always have the name embedded in the binary (in the import descriptor thunk, to be exact). The detailed parameter info is obtained from the PDBs, as Steve mentioned (however, analysing debuggers like OllyDbg can deduce the arguments because the symbol name is available). The only ways to avoid this are to either encrypt the IAT (using third-party packers, virtualizers, or binary protection systems such as Enigma) or to use a custom version of GetModuleHandle (basically just a PEB spelunking tool) and GetProcAddress (a PE spelunking tool this time). By storing all the API calls you need as runtime-encrypted strings, you can then call whatever you need without plain text giving you away (SecuROM does this, though it uses GetProcAddress directly, along with some binary obfuscation).
Update:
For compile-time 'obfuscated' strings, you can use something like this (really simple, but it should be portable; if you use C++0x, this is a lot easier):
#include <windows.h>

#define c(x) char((x) - 1) //really simple, complexity is up to the coder
#define un(x) char((x) + 1)
typedef int (WINAPI* MSGBOX)(HWND, LPCSTR, LPCSTR, UINT);
const ULONG_PTR ORD_MASK = 0x10101010;
//encoded name; the terminator is stored unencoded so the decode loop can stop on it
const char szMessageBoxA[] = {c('M'),c('e'),c('s'),c('s'),c('a'),c('g'),c('e'),c('B'),c('o'),c('x'),c('A'),'\0'};
FARPROC GetProcAddressEncrypted(HMODULE hModule, const char* szName, BOOL bOrd = FALSE)
{
    if(bOrd)
        return GetProcAddress(hModule,reinterpret_cast<const char*>(reinterpret_cast<ULONG_PTR>(szName) ^ ORD_MASK)); //this requires that ordinals be stored as ordinal ^ ORD_MASK
    char szFunc[128] = {'\0'};
    for(int i = 0; *szName && i < 127; i++)
        szFunc[i] = un(*szName++); //decode into the local buffer
    return GetProcAddress(hModule,szFunc); //look up the decoded name, not the encoded one
}
//GetHandleEncrypted and szUser32 are not shown; they are the GetModuleHandle counterpart
MSGBOX pfMsgBox = reinterpret_cast<MSGBOX>(GetProcAddressEncrypted(GetHandleEncrypted(szUser32),szMessageBoxA));
Optionally you may want to use MSVC's EncodePointer to hide the values in the global function pointers (just remember to use DecodePointer when you call them).
Note: the code is untested, as it's just off the top of my head.
You might use dynamic linking. On Windows, use LoadLibrary, LoadLibraryEx, and GetProcAddress. In your code, include an obfuscated form of the library and symbol names instead of the real ones, and deobfuscate them at runtime.
You might want to use dynamic dispatch (function pointers) so that the function called cannot be deduced easily from the code.
You might delegate the work of calling this function to another thread (using some IPC mechanism).
But it's quite useless: using a debugger, it will be very simple to find that this function has been called. And it will be very simple to detect that a process has been created.
OK, here is the solution. Let's say I want to call "MessageBoxA" from "user32.dll".
So here is how I will do it using LoadLibraryA and GetProcAddress.
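For completeness, the "store an obfuscated name, decode it at runtime" idea from this answer could be sketched in portable C like this. The XOR key and the names KEY, X, enc_name and decode_name are arbitrary choices for the example, and an optimizing compiler or a debugger can undo such a simple scheme easily.

```c
#include <stddef.h>

#define KEY 0x5A                          /* arbitrary single-byte key */
#define X(ch) ((char)((ch) ^ KEY))        /* encodes one character at compile time */

/* "MessageBoxA" stored XOR-encoded instead of as a plain string literal. */
static const char enc_name[] = {
    X('M'), X('e'), X('s'), X('s'), X('a'), X('g'),
    X('e'), X('B'), X('o'), X('x'), X('A')
};

/* Decodes len bytes into out and NUL-terminates; out must hold len + 1 bytes. */
static void decode_name(const char *enc, size_t len, char *out)
{
    size_t i;
    for (i = 0; i < len; i++)
        out[i] = (char)(enc[i] ^ KEY);
    out[len] = '\0';
}
```

The decoded buffer would be handed to GetProcAddress and ideally wiped again right after the lookup.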
//Ok here you can see.
//I am passing DLL name(user32.dll) and DLL function(MessageBoxA) as String
//So I can also perform Encrypt & Decrypt operation on Strings and obfuscate it.
//Like i can encrypt the string "user32.dll" and at runtime decrypt it and pass it as
//an argument to "LoadLibraryA" and same for the Function name "MessageBoxA".
//The code is compiled in DevC++ 4.9.9.2.
#include <windows.h>
#include <iostream>
using namespace std;
void HelloWorld()
{
const char* szMessage = "Hello World!";
const char* szCaption = "Hello!";
HMODULE hModule = LoadLibraryA( "user32.dll" );
FARPROC fFuncProc = GetProcAddress( hModule, "MessageBoxA" );
( ( int ( WINAPI *)( HWND, LPCSTR, LPCSTR, UINT ) ) fFuncProc )( 0, szMessage, szCaption, 0 );
}
int main()
{
HelloWorld();
}