Obfuscate External API Calls C++

Obfuscate External API Calls C++ - c++

I have a code in C++ that calls functions from external library. The function I called is CreateProcess like below.
CreateProcess(NULL,pProcessName,NULL,NULL,false,CREATE_SUSPENDED,
NULL,NULL,&suStartUpInformation,&piProcessInformation)
Now when I compile the code and dissemble it, the assembly shows the plain text as CreateProcess(args1, args2, ...). Is there any way to obfuscate or encrypt the function call to API so that if someone dissembles it then he won't ever know which functions are called.
Thanks!

Any function that is imported by name will always have the name embedded into the binary (in the import descriptor thunk to be exact), the detailed parameter info is gotten from the pdbs as Steve mentioned (however analysing debuggers like ollydbg can deduce args, due to the symbol name being available). The only ways to avoid this is to either encrypt to IAT (using 3rd party packers/virtualizers/binary protection systems etc, like enigma) or use a custom version of GetModuleHandle (basically just a PEB spelunking tool) and GetProcAddress (a PE spelunking tool this time), then by storing all the api calls you need as runtime encrypted strings, you can then call whatever you need without plain text giving you away (securerom does this, though it uses GetProcAddress directly, along with some binary obfuscation).
Update:
for compile-time 'obfuscated' strings, you can use something like this (really simple, but it should be portable, if you use C++0x, this is a lot easier):
#define c(x) char((x) - 1) //really simple, complexity is up to the coder
#define un(x) char((x) + 1)
typedef int (WINAPI* MSGBOX)(HWND, LPCSTR, LPCSTR, UINT);
const int ORD_MASK = 0x10101010;
const char szMessageBoxA[] = {c('M'),c('e'),c('s'),c('s'),c('a'),c('g'),c('e'),c('B'),c('o'),c('x'),c('A')};
FARPROC GetProcAddressEncrypted(HMODULE hModule, const char* szName, BOOL bOrd = FALSE)
{
if(bOrd)
return GetProcAddress(hModule,reinterpret_cast<const char*>(reinterpret_cast<int>(szName) ^ ORD_MASK)); //this requires that ordinals be stored as ordinal ^ ORD_MASK
char szFunc[128] = {'\0'};
for(int i = 0; *szName; i++)
szFunc[i] = uc(*szName++);
return GetProcAddress(hModule,szName);
}
MSGBOX pfMsgBox = static_cast<MSGBOX>(GetProcAddressEncrypted(GetHandleEncrypted(szUser32),szMessageBox));
Optionally you may want to use MSVC's EncodePointer to hide the values in the global function pointers (just remember to use DecodePointer when you call them).
note: code is untested, as its just off the top of my head

You might use dynamic linking. In Windows, use LoadLibrary, LoadLibraryEx, GetProcAddress. Now in you code, include some form in obfuscated form of name instead of the real lib/symbol names and unofuscate it at runtime.
You might want to use dynamic dispatch (function pointers) so that the function called cannot be deduced easily from the code.
You might delegate the work of calling this function to another thread (using some IPC mechanism).
But it's quite useless, using a debugger it will very simple to find that this function has been called. And it will be very simple to detect that a process has been created.

Ok! here is the solution. Lets say I want to call "MessageBoxA" from "user32.dll".
So here is how I will do it using LoadLibraryA & GetProcAddress .
//Ok here you can see.
//I am passing DLL name(user32.dll) and DLL function(MessageBoxA) as String
//So I can also perform Encrypt & Decrypt operation on Strings and obfuscate it.
//Like i can encrypt the string "user32.dll" and at runtime decrypt it and pass it as
//an argument to "LoadLibraryA" and same for the Function name "MessageBoxA".
//The code is compiled in DevC++ 4.9.9.2.
#include <windows.h>
#include <iostream>
using namespace std;
void HelloWorld()
{
char* szMessage = "Hello World!";
char* szCaption = "Hello!";
HMODULE hModule = LoadLibraryA( "user32.dll" );
FARPROC fFuncProc = GetProcAddress( hModule, "MessageBoxA" );
( ( int ( WINAPI *)( HWND, LPCSTR, LPCSTR, UINT ) ) fFuncProc )( 0, szMessage, szCaption, 0 );
}
int main()
{
HelloWorld();
}

Related

How to resolve access violation writing location when calling dll method

I'm using GetProcAddress to gain access to a standard Isapi Filter DLL method - the GetFilterVersion method which takes a pointer to a HTTP_FILTER_VERSION structure.
https://msdn.microsoft.com/en-us/library/ms525822(v=vs.90).aspx
https://msdn.microsoft.com/en-us/library/ms525465(v=vs.90).aspx
I've tested the code against a working Isapi filter that I've written and it works fine. I debug the code against an Isapi filter from a vendor (I don't have access to the source code or anything beyond the dll itself) and I get the exception, "access violation writing location". What could be the issue? (Both Isapi filters work in IIS.)
//Attempted to define function ptr several ways
typedef BOOL(__cdecl * TRIRIGAISAPIV)(PHTTP_FILTER_VERSION);
//typedef BOOL( * TRIRIGAISAPIV)(PHTTP_FILTER_VERSION);
//typedef BOOL(WINAPI * TRIRIGAISAPIV)(PHTTP_FILTER_VERSION);
void arbitraryMethod()
{
HINSTANCE hDLL; // Handle to DLL
TRIRIGAISAPIV lpfnDllFunc2; // Function pointer
DWORD lastError;
BOOL uReturnVal2;
hDLL = LoadLibrary(L"iisWASPlugin_http.dll"); //vendor's dll
//hDLL = LoadLibrary(L"KLTWebIsapi.dll //my dll
if (hDLL != NULL)
{
lpfnDllFunc2 = (TRIRIGAISAPIV)GetProcAddress(hDLL, "GetFilterVersion");
if (!lpfnDllFunc2)
{
lastError = GetLastError();
// handle the error
FreeLibrary(hDLL);
//return 1;
}
else
{
HTTP_FILTER_VERSION pVer = { 6 };
//Call the function via pointer; Works with my dll, fails with vendor's
uReturnVal2 = lpfnDllFunc2(&pVer);
//................ HELP!!!!!!!!!!!!!
}
}
}

One issue that I see is that your function pointer declaration is incorrect.
According to the Microsoft documentation, GetFilterVersion is prototyped as:
BOOL WINAPI GetFilterVersion(PHTTP_FILTER_VERSION pVer);
The WINAPI is a Windows macro that is actually defined as __stdcall, thus you are declaring the function pointer incorrectly when you used __cdecl.
What does WINAPI mean?
Thus, your declaration should be:
typedef BOOL(__stdcall * TRIRIGAISAPIV)(PHTTP_FILTER_VERSION);

It could be that there are actually some additional structure fields filled by the custom filter.
You can try to increase the size of the structure to see if that will work, like for example:
struct HTTP_FILTER_VERSION_EXTRA {
HTTP_FILTER_VERSION v;
char[1024] extra;
};
HTTP_FILTER_VERSION_EXTRA ver;
ver.v.dwServerFilterVersion = 6;
uReturnVal2 = lpfnDllFunc2(&ver.v);
It is sometimes the case with the WinAPI structures that they allow versioning, so adding fields is possible. If the function doesn't then check (or doesn't know) the actual structure version, it might try to use an extended one which might be different than the one supplied - if the size of the supplied struct is then lesser than the structure version the func tries to use, bad things can happen.
Also check if the DLL is 64-bit or 32-bit. You cannot use 64-bit DLL by 32-bit app and vice versa (but I expect that would already fail during the LoadLibrary call).

How do I return a unicode string / wstring / CStringW from a C++ dll to InstallScript?

I am using InstallShield 2013 Premium. I created a C++ dll in Visual Studio 2010 to provide some functionality I could not achieve with InstallScript alone. My C++ function needs to return a small string (a username) to the InstallScript after doing considerable work to get this value.
Throughout the C++ am I using CStringW to represent my strings. Ideally, I would like to return it as Unicode, but I'm content with ANSI if that's my only option. I have tried numerous approaches with CStringW, std::wstring, std::string, LPCTSTR, LPSTR, char *... I tried direct returns, and attempts to return by reference. Nothing works!
Sometimes the dll function hangs, sometimes it throws an exception, at best it returns garbage values with non-printing characters. The official documentation on this does not seem accurate (it doesn't work for me!). Extensive Googling, and searching the Flexera boards produce "solutions" from others struggling with the same ridiculous problem, and yet non of those work for me either...
I didn't try this until the end, as I took for granted that you could pass strings between dlls and InstallScript easily enough. In retrospect, I should have started with the interface between the two and then developed the dll functionality after that.

Thanks for the help guys! I finally figured this out for myself though. There are multiple facets to the solution, however, which I have not found documented or suggested elsewhere.
Major points
1) return a WCHAR * from C++
2) use WSTRING as the corresponding return type in the InstallScript prototype
3) return it into a regular STRING variable in InstallScript, and treat it like any other
4) retain the value the WCHAR * points to in the C++ dll in a static variable, otherwise it apparently gets deleted and the pointer becomes invalid
If you've gotten far enough to find yourself in the same boat, I probably don't need to serve up every detail, but here's a chunk of example code to help you along:
Visual Studio Def File
LIBRARY MyIsDllHelper
EXPORTS
getSomeStringW #1
C++ Header
#ifdef MYISDLLHELPER_EXPORTS
#define MYISDLLHELPER_API __declspec(dllexport)
#else
#define MYISDLLHELPER_API __declspec(dllimport)
#endif
#include <stdexcept>
#include <atlstr.h>
namespace MyIsDllHelper
{
class MyIsDllHelper
{
public:
static MYISDLLHELPER_API WCHAR * getSomeStringW();
};
}
C++ Source
#include "stdafx.h"
#include "MyIsDllHelper.h"
static CStringW someStringRetained;
CStringW getTheString()
{
CStringW s;
// do whatever...
return s;
}
WCHAR * MyIsDllHelper::MyIsDllHelper::getSomeStringW()
{
someStringRetained = getTheString();
return someStringRetained.GetBuffer( someStringRetained.GetLength() ) + L'\0';
}
InstallScript
#define HELPER_DLL_FILE_NAME "MyIsDllHelper.dll"
prototype WSTRING MyIsDllHelper.getSomeStringW();
function DoSomething( hMSI )
STRING svSomeString;
STRING svDllPath;
begin
// Find the .dll file path. (A custom function)
GetSupportFilePath( HELPER_DLL_FILE_NAME, TRUE, svDllPath );
// Load the .dll file into memory.
if( UseDLL( svDllPath ) != 0 ) then
MessageBox ("Could not load dll: " + svDllPath, SEVERE );
abort;
endif;
// Get the string from the dll
try
svSomeString = MyIsDllHelper.getSomeStringW();
catch
MessageBox( "Could not execute dll function: MyIsDllHelper.getSomeStringW", SEVERE );
abort;
endcatch;
// Remove the .dll file from memory.
if( UnUseDLL( svDllPath ) < 0 ) then
MessageBox ("Could not unload dll: " + svDllPath, SEVERE );
abort;
endif;
// Use the string
MessageBox( "svSomeString: [" + svSomeString + "]", INFORMATION );
end;

You're best off when you can make your interface use C approaches rather than C++ ones. Match the interface of functions like GetEnvironmentVariable in which your function accepts a pointer to a buffer (and for correctness a size of that buffer), and then writes into that buffer. The majority of your implementation shouldn't have to change, as long as you can finish with something like a StringCchCopy from your CString into the buffer.
Since you specifically mention CStringW and other Unicode string types, I'd suggest choosing LPWSTR (rather than LPTSTR) for the interface type.
Then all that's left is declaring this for consumption by InstallScript. This means the prototype should use WSTRING and BYREF. If the function interface is the same as GetEnvironmentVariableW, the prototype should look something like this:
prototype MyFunc(WSTRING, BYREF WSTRING, NUMBER);

You can use strings, but I guess the problem is with the encoding.
Have a look here: https://adventuresinscm.wordpress.com/2014/01/12/unicode-files-and-installshield/

C Pass arguments as void-pointer-list to imported function from LoadLibrary()

The problem I have is that I want to create a generic command line application that can be used to load a library DLL and then call a function in the library DLL. The function name is specified on the command line with the arguments also provided on the utility command line.
I can access the external function from a DLL dynamically loaded using the LoadLibrary() function. Once the library is loaded I can obtain a pointer to the function using GetProcAddress() I want to call the function with the arguments specified on the command line.
Can I pass a void-pointer-list to the function-pointer which I got returned by the LoadLibrary() function similar to the example below?
To keep the example code simple, I deleted the error-checking. Is there a way to get something like this working:
//Somewhere in another dll
int DoStuff(int a, int b)
{
return a + b;
}
int main(int argc, char **argv)
{
void *retval;
void *list = argv[3];
HMODULE dll;
void* (*generic_function)(void*);
dll = LoadLibraryA(argv[1]);
//argv[2] = "DoStuff"
generic_function = GetProcAddress(dll, argv[2]);
//argv[3] = 4, argv[4] = 7, argv[5] = NULL
retval = generic_function(list);
}
If I forgot to mention necessary information, please let me know.
Thanks in advance

You need to cast the function pointer returned by LoadLibrary to one with the right argument types before calling it. One way to manage it is to have a number call-adaptor functions that do the right thing for every possible function type you might want to call:
void Call_II(void (*fn_)(), char **args) {
void (*fn)(int, int) = (void (*)(int, int))fn_;
fn(atoi(args[0]), atoi(args[1]));
}
void Call_IS(void (*fn_)(), char **args) {
void (*fn)(int, char *) = (void (*)(int, char *))fn_;
fn(atoi(args[0]), args[1]);
}
...various more functions
Then you take the pointer you got from GetProcAddress and the additional arguments and pass them to the correct Call_X function:
void* (*generic_function)();
dll = LoadLibraryA(argv[1]);
//argv[2] = "DoStuff"
generic_function = GetProcAddress(dll, argv[2]);
//argv[3] = 4, argv[4] = 7, argv[5] = NULL
Call_II(generic_function, &argv[3]);
The problem is that you need to know what the type of the function you're getting the pointer for is and call the appropriate adaptor function. Which generally means making a table of function name/adaptors and doing a lookup in it.
The related problem is that there's no function analogous to GetProcAddress that will tell you the argument types for a function in the library -- that information simply isn't stored anywhere accessable in the dll.

A library DLL contains the object code for the functions that are part of the library along with some additional information to allow the DLL to be usable.
However a library DLL does not contain the actual type information needed to determine the specific argument list and types for the functions contained in the library DLL. The main information in a library DLL is: (1) a list of the functions that the DLL exports along with the address information that will connect a call of a function to the actual function binary code and (2) a list of any required DLLs that the functions in the library DLL use.
You can actually open a library DLL in a text editor, I suggest a small one, and scan through the arcane symbols of the binary code until you reach the section that contains the list of functions in the library DLL as well as other required DLLs.
So a library DLL contains the bare minimum information needed to (1) find a particular function in the library DLL so that it can be invoked and (2) a list of other needed DLLs that the functions in the library DLL depend on.
This is different from a COM object which normally does have type information in order to support the ability to do what is basically reflection and explore the COM object's services and how those services are accessed. You can do this with Visual Studio and other IDEs which generate a list of COM objects installed and allow you to load a COM object and explore it. Visual Studio also has a tool that will generate the source code files that provide the stubs and include file for accessing the services and methods of a COM object.
However a library DLL is different from a COM object and all the additional information provided with a COM object is not available from a library DLL. Instead a library DLL package is normally made up of (1) the library DLL itself, (2) a .lib file that contains the linkage information for the library DLL along with the stubs and functionality to satisfy the linker when building your application which uses the library DLL, and (3) an include file with the function prototypes of the functions in the library DLL.
So you create your application by calling the functions which reside in the library DLL but using the type information from the include file and linking with the stubs of the associated .lib file. This procedure allows Visual Studio to automate much of the work required to use a library DLL.
Or you can hand code the LoadLibrary() and the building of a table of the functions in the library DLL using GetProcAddress(). By doing hand coding all you really need are the function prototypes of the functions in the library DLL which you then can type in yourself and the library DLL itself. You are in effect doing the work by hand that the Visual Studio compiler does for you if you are using the .lib library stubs and include file.
If you know the actual function name and the function prototype of a function in a library DLL then what you could do is to have your command line utility require the following information:
the name of the function to be called as a text string on the command
line
the list of the arguments to be used as a series of text strings on the command line
an additional parameter that describes the function prototype
This is similar to how functions in the C and C++ runtime which accept variable argument lists with unknown parameter types work. For instance the printf() function which prints a list of argument values has a format string followed by the arguments to be printed. The printf() function uses the format string to determine the types of the various arguments, how many arguments to expect, and what kinds of value transformations to do.
So if your utility had a command line something like the following:
dofunc "%s,%d,%s" func1 "name of " 3 " things"
And the library DLL had a function whose prototype looked like:
void func1 (char *s1, int i, int j);
then the utility would dynamically generate the function call by transforming the character strings of the command line into the actual types needed for the function to be called.
This would work for simple functions that take Plain Old Data types however more complicated types such as struct type argument would require more work as you would need some kind of a description of the struct along with some kind of argument description perhaps similar to JSON.
Appendix I: A simple example
The following is the source code for a Visual Studio Windows console application that I ran in the debugger. The command arguments in the Properties was pif.dll PifLogAbort which caused a library DLL from another project, pif.dll, to be loaded and then the function PifLogAbort() in that library to be invoked.
NOTE: The following example depends on a stack based argument passing convention as is used with most x86 32 bit compilers. Most compilers also allow for a calling convention to be specified other than stack based argument passing such as the __fastcall modifier of Visual Studio. Also as pointed out in the comments, the default for x64 and 64 bit Visual Studio is to use the __fastcall convention by default so that function arguments are passed in registers and not on the stack. See Overview of x64 Calling Conventions in the Microsoft MSDN. See as well the comments and discussion in How are variable arguments implemented in gcc?
.
Notice how the argument list to the function PifLogAbort() is built as a structure that contains an array. The argument values are put into the array of a variable of the struct and then the function is called passing the entire struct by value. What this does is to push a copy of the array of parameters onto the stack and then calls the function. The PifLogAbort() function sees the stack based on its argument list and processes the array elements as individual arguments or parameters.
// dllfunctest.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
typedef struct {
UCHAR *myList[4];
} sarglist;
typedef void ((*libfunc) (sarglist q));
/*
* do a load library to a DLL and then execute a function in it.
*
* dll name.dll "funcname"
*/
int _tmain(int argc, _TCHAR* argv[])
{
HMODULE dll = LoadLibrary(argv[1]);
if (dll == NULL) return 1;
// convert the command line argument for the function name, argv[2] from
// a TCHAR to a standard CHAR string which is what GetProcAddress() requires.
char funcname[256] = {0};
for (int i = 0; i < 255 && argv[2][i]; i++) {
funcname[i] = argv[2][i];
}
libfunc generic_function = (libfunc) GetProcAddress(dll, funcname);
if (generic_function == NULL) return 2;
// build the argument list for the function and then call the function.
// function prototype for PifLogAbort() function exported from the library DLL
// is as follows:
// VOID PIFENTRY PifLogAbort(UCHAR *lpCondition, UCHAR *lpFilename, UCHAR *lpFunctionname, ULONG ulLineNo);
sarglist xx = {{(UCHAR *)"xx1", (UCHAR *)"xx2", (UCHAR *)"xx3", (UCHAR *)1245}};
generic_function(xx);
return 0;
}
This simple example illustrates some of the technical hurdles that must be overcome. You will need to know how to translate the various parameter types into the proper alignment in a memory area which is then pushed onto the stack.
The interface to this example function is remarkably homogeneous in that most of the arguments are unsigned char pointers with the exception of the last which is an int. With a 32 bit executable all four of these variable types have the same length in bytes. With a more varied list of types in the argument list you will need to have an understanding as to how your compiler aligns parameters when it is pushing the arguments onto the stack before doing the call.
Appendix II: Extending the simple example
Another possibility is to have a set of helper functions along with a different version of the struct. The struct provides a memory area to create a copy of the necessary stack and the help functions are used to build the copy.
So the struct and its helper functions may look like the following.
typedef struct {
UCHAR myList[128];
} sarglist2;
typedef struct {
int i;
sarglist2 arglist;
} sarglistlist;
typedef void ((*libfunc2) (sarglist2 q));
void pushInt (sarglistlist *p, int iVal)
{
*(int *)(p->arglist.myList + p->i) = iVal;
p->i += sizeof(int);
}
void pushChar (sarglistlist *p, unsigned char cVal)
{
*(unsigned char *)(p->arglist.myList + p->i) = cVal;
p->i += sizeof(unsigned char);
}
void pushVoidPtr (sarglistlist *p, void * pVal)
{
*(void * *)(p->arglist.myList + p->i) = pVal;
p->i += sizeof(void *);
}
And then the struct and helper functions would be used to build the argument list like the following after which the function from the library DLL is invoked with the copy of the stack provided:
sarglistlist xx2 = {0};
pushVoidPtr (&xx2, "xx1");
pushVoidPtr (&xx2, "xx2");
pushVoidPtr (&xx2, "xx3");
pushInt (&xx2, 12345);
libfunc2 generic_function2 = (libfunc2) GetProcAddress(dll, funcname);
generic_function2(xx2.arglist);

passing pointers to extern C function in a DLL from VB

I am a C++ (MSVC) writer, VB newbie trying to assist an expert VB.net writer who has just not done this task before.
We wish to develop both C/C++ and VB applications to use a DLL written in C++ with C extern-ed API functions. The C++ program is working just fine. It's VB where we are having difficulties.
The DLL provides an extern C function:
RegisterCallback( void* cbFuncPtr, void* dataPtr );
NOTE 1: See my note below for a design change and the reasons we made it.
NOTE 2: Additional update added as an answer below.
where the callback function havs this C typedef:
typedef (void)(* CALL_NACK)(void*);
The cbFuncPtr is expected to be a function pointer to some VB function that will get called as the CALL_BACK. The dataPtr is a pointer to a data structure that has this C definition:
typedef struct
{
int retCode;
void* a_C_ptr;
char message[500];
} cbResponse_t;
where a_C_ptr is an internal pointer in the DLL that the VB can cast tolong`. It uniquely identifies where in the DLL the callback was made and allows the VB function to recognize calls from same/different locations.
We are able to access and run the RegisterCallback() function from VB just fine. Logging shows we get there and that data is passed in. It is the actual data that seems to be the problem.
In reading about a million forum entries we have learned that VB doesn't know what pointers are and that a VB structure is more than just organized memory. We're pretty sure the "address" of a VB structure is not what C thinks an address is. We've seen repeated references to "marshaling" and "managed data", but lack enough understanding to know what that is telling us.
How should we code VB to give the DLL the execution address of its callback function and how do we code up a VB construct that the DLL can fill in just as it does for C++?
Might we need a DLL function where the calling app can say "C" or "VB" andhave the DLL handle the sturcture pointers differently? If so, how would one code up C to fill in the VB structure?

This is a bit too big and deep to just be an edit to the original posting...
From the link posted by #DaveNewman, I extracted this gem:
Here's a bit about the compact framework, but it's the same in the
grown-ups framework:
The .NET Compact Framework supports automatic marshaling of structures
and classes that contain simple types. All fields are laid out
sequentially in memory in the same order as they appear in the
structure or class definition. Both classes and structures appear in
native code as pointers to C/C++ structs.
Objects in the managed heap can be moved around in memory at any time
by the garbage collector, so their physical addresses may change
without notice. P/Invoke automatically pins down managed objects
passed by reference for the duration of each method call. This means
pointers passed to unmanaged code will be valid for that one call.
Bear in mind that there is no guarantee that the object will not be
moved to a different memory address on subsequent calls.
http://msdn.microsoft.com/en-us/library/aa446538.aspx#netcfmarshallingtypes_topic6
This is major hurdle for a RegisterCallback( fcnPtr, dataPtr) function. The pointer passed in at registration time could change at any time the RegisterCallback() is not the current statement. The posting author summed it up this way
You don't need to do anything as the structures are pinned automatically for duration of the call.
implying, of course, not pinned down outside the call.
For this reason we decided on a design change to have the response structure built in, so to speak, the C/C++ world of the DLL, not in VB's space. That way it'll stay put. The actual callback function's signature will remain unchanged so the VB program can know where the DLL put the response. This also allows the responders in the DLL to allocate separate response structures for separate needs.

Once again my update is too large for a mere comment!
Update 18 Apr 2013:
Well, the attempt to use the code from Calling Managed Code from Unmanaged Code cited above was a bust. We ended up having to add /clr to the DLL make turning the DLL into managed code, which made it unusable from a C application.
We are now testing the example at Callback Sample which I was able to show made a DLL that worked with both VB and C++. You'd need to have the PinvokeLib.dll Source to make this work.
Here is the code for the C++ (C really) tester. Compiled as a MSVC project.
NOTE: Notice the __cdecl in this line:
typedef bool (__cdecl *FPtr)(BOOL_FP_INT fp, int i );
It was the secret I had to find. The DLL and this app are compiled with __cdecl linkage, not __stdcall. They are the default in VC++ and I just used the defaults. I tried changing everything to __stdcall but that didn't work. Has to be __cdecl.
// PinvokeTester.cpp : Defines the entry point for the console application.
//
#include <stdio.h>
#include <cstdio>
#include <stdlib.h>
#include <cstdlib>
#include <string.h>
#include <cstring>
#include <sstream>
#include <iostream>
#include <algorithm>
#include <Windows.h>
#define PINVOKELIB_API __declspec(dllimport)
HINSTANCE hLib; // Windows DLL handle
bool CALLBACK VBCallBack( int value );
bool AttachLibrary( void );
void * GetFuncAddress( HINSTANCE hLib, const char* procname );
int main(int argc, char* argv[])
{
if ( !AttachLibrary() )
{
printf( "Lib did not attach.\n" );
exit(1);
}
typedef bool (CALLBACK *BOOL_FP_INT)(int i );
typedef bool (__cdecl *FPtr)(BOOL_FP_INT fp, int i );
FPtr TestCallBack = (FPtr)GetFuncAddress( hLib, "TestCallBack" );
TestCallBack( (BOOL_FP_INT)VBCallBack, 255 );
return 0;
}
bool CALLBACK VBCallBack( int value )
{
printf( "\nCallback called with param: %d", value);
return true;
}
bool AttachLibrary( void )
{
// Get a var for the IPC-dll library.
std::string dllName;
/*--- First, link to the IPC-dll library or report failure to do so. ---*/
dllName = ".\\PinvokeLib";
if ( NULL == (hLib = LoadLibraryA( dllName.c_str() )) )
{
printf( "\nERROR: Library \"%s\" Not Found or Failed to Load. \n\n", dllName.c_str() );
printf( "\"%s\"\n", GetLastError() );
return false;
}
return true;
}
//=====================================================================
void * GetFuncAddress( HINSTANCE hLib, const char* procname )
{
void * procAddr = NULL;
procAddr = (void *)GetProcAddress( hLib, procname );
// If the symbol wasn't found, handle error ---------------------
if ( NULL == procAddr )
{
std::cout << "ERROR: Could not get an address for the \""
<< procname << "\" function. : "
<< GetLastError() << std::endl;
exit( 7 );
procAddr = (void*)NULL;
}
return procAddr;
}

How can I intercept dlsym calls using LD_PRELOAD?

I want to intercept application's calls to dlsym. I have tried declaring inside the .so that I am preloading dlsym , and using dlsym itself to get it's real address, but that for quite obvious reasons didn't work.
Is there a way easier than taking process' memory maps, and using libelf to find the real location of dlsym inside loaded libdl.so?

WARNING:
I have to explicitely warn everyone who tries to do this. The general premise of having a shared library hooking dlsym has several significant drawbacks. The biggest issue issue is that the original dlsym implementation if glibc will internally use stack unwinding techniques to find out from which loaded module the function was called. If the intercepting shared library then calls the original dlsym on behalf of the original application, this will break lookups using stuff like RTLD_NEXT, as now the current module isn't the originally calling one, but your hook library.
It might be possible to implement this the correct way, but it requires a lot more work. Without having tried it, I think that using dlinfo to get to the chained list of linket maps, you could individually walk through all modules, and do a separate dlsym for each one, to get the RTLD_NEXT behavior right. You still need to get the address of your caller for that, which you might get via the old backtrace(3) family of functions.
MY OLD ANSWER FROM 2013
I stumbled across the same problem with hdante's answer as the commenter: calling __libc_dlsym() directly crashes with a segfault. After reading some glibc sources, I came up with the following hack as a workaround:
extern void *_dl_sym(void *, const char *, void *);
extern void *dlsym(void *handle, const char *name)
{
/* my target binary is even asking for dlsym() via dlsym()... */
if (!strcmp(name,"dlsym"))
return (void*)dlsym;
return _dl_sym(handle, name, dlsym);
}
NOTE two things with this "solution":
This code bypasses the locking which is done internally by (__libc_)dlsym(), so to make this threadsafe, you should add some locking.
The thrid argument of _dl_sym() is the address of the caller, glibc seems to reconstruct this value by stack unwinding, but I just use the address of the function itself. The caller address is used internally to find the link map the caller is in to get things like RTLD_NEXT right (and, using NULL as thrid argument will make the call fail with an error when using RTLD_NEXT). However, I have not looked at glibc's unwindind functionality, so I'm not 100% sure that the above code will do the right thing, and it may happen to work just by chance alone...
The solution presented so far has some significant drawbacks: _dl_sym() acts quite differently than the intended dlsym() in some situations. For example, trying to resolve a symbol which does not exist does exit the program instead of just returning NULL. To work around that, one can use _dl_sym() to just get the pointer to the original dlsym() and use that for everything else (like in the "standard" LD_PRELOAD hook approch without hooking dlsym at all):
extern void *_dl_sym(void *, const char *, void *);
extern void *dlsym(void *handle, const char *name)
{
static void * (*real_dlsym)(void *, const char *)=NULL;
if (real_dlsym == NULL)
real_dlsym=_dl_sym(RTLD_NEXT, "dlsym", dlsym);
/* my target binary is even asking for dlsym() via dlsym()... */
if (!strcmp(name,"dlsym"))
return (void*)dlsym;
return real_dlsym(handle,name);
}
UPDATE FOR 2021 / glibc-2.34
Beginning with glibc 2.34, the function _dl_sym() is no longer publicly exported. Another approach I can suggest is to use dlvsym() instead, which is offically part of the glibc API and ABI. The only downside is that you now need the exact version to ask for the dlsym symbol. Fortunately, that is also part of the glibc ABI, unfortunately, it varies per architecture. However, a grep 'GLIBC_.*\bdlsym\b' -r sysdeps in the root folder of the glibc sources will tell you what you need:
[...]
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.0 dlsym F
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.34 dlsym F
[...]
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.2.5 dlsym F
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.34 dlsym F
Glibc-2.34 actually introduced new versions of this function, but the old versions are still be kept around for backwards compatibilty.
For x86_64, you could use:
real_dlsym=dlvsym(RTLD_NEXT, "dlsym", "GLIBC_2.2.5");
And, if you both like to get the newest version, as well as a potentially one of another interceptor in the same process, you can use that version to do an unversioned query again:
real_dlsym=real_dlsym(RTLD_NEXT, "dlsym");
If you actually need to hook both dlsym and dlvsym in your shared object, this approach of course won't work either.
UPDATE: hooking both dlsym() and dlvsym() at the same time
Out of curiosity, I thought about some approach to hook both of the glibc symbol query methods, and I came up with a solution using an additional wrapper library which links to libdl. The idea is that the interceptor library can dynamically load this library at runtime using dlopen() with the RTLD_LOCAL | RTLD_DEEPBIND flags, which will create a separate linker scope for this object, also containing the libdl, so that the dlsym and dlvsym will be resolved to the original methods, and not the one in the interceptor library. The problem now is that our interceptor library can not directly call any function inside the wrapper library, because we can not use dlsym, which is our original problem.
However, the shared library can have an initialization function, which the linker will call before the dlopen() returns. We just need to pass some information from the initialization function of the wrapper library to the interceptor library. Since both are in the same process, we can use the environment block for that.
This is the code I came up with:
dlsym_wrapper.h:
#ifndef DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_ENVNAME "DLSYM_WRAPPER_ORIG_FPTR"
#define DLSYM_WRAPPER_NAME "dlsym_wrapper.so"
typedef void* (*DLSYM_PROC_T)(void*, const char*);
#endif
dlsym_wrapper.c, compiled to dlsym_wrapper.so:
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include "dlsym_wrapper.h"
__attribute__((constructor))
static void dlsym_wrapper_init()
{
if (getenv(DLSYM_WRAPPER_ENVNAME) == NULL) {
/* big enough to hold our pointer as hex string, plus a NUL-terminator */
char buf[sizeof(DLSYM_PROC_T)*2 + 3];
DLSYM_PROC_T dlsym_ptr=dlsym;
if (snprintf(buf, sizeof(buf), "%p", dlsym_ptr) < (int)sizeof(buf)) {
buf[sizeof(buf)-1] = 0;
if (setenv(DLSYM_WRAPPER_ENVNAME, buf, 1)) {
// error, setenv failed ...
}
} else {
// error, writing pointer hex string failed ...
}
} else {
// error: environment variable already set ...
}
}
And one function in the interceptor library to get the pointer to the
original dlsym() (should be called only once, guared by a mutex):
static void *dlsym_wrapper_get_dlsym
{
char dlsym_wrapper_name = DLSYM_WRAPPER_NAME;
void *wrapper;
const char * ptr_str;
void *res = NULL;
void *ptr = NULL;
if (getenv(DLSYM_WRAPPER_ENVNAME)) {
// error: already defined, shoudn't be...
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND | RTLD_NOLOAD);
if (wrapper) {
// error: dlsym_wrapper.so already loaded ...
// it is important that we load it by ourselves to a sepearte linker scope
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND);
if (!wrapper) {
// error: dlsym_wrapper.so can't be loaded
}
ptr_str = getenv(DLSYM_WRAPPER_ENVNAME);
if (!ptr_str) {
// error: dlsym_wrapper.so failed...
}
if (sscanf(ptr_str, "%p", &ptr) == 1) {
if (ptr) {
// success!
res = ptr;
} else {
// error: got invalid pointer ...
}
} else {
// error: failed to parse pointer...
}
// this is a bit evil: close the wrapper. we can be sure
// that libdl still is used, as this mosule uses it (dlopen)
dlclose(wrapper);
return res;
}
This of course assumes that dlsym_wrapper.so is in the library search path. However, you may prefer to just inject the interceptor library via LD_PRELOAD using a full path, and not modifying LD_LIBRARY_PATH at all. To do so, you can add dladdr(dlsym_wrapper_get_dlsym,...) to find the path of the injector library itself, and use that for searching the wrapper library, too.

http://www.linuxforu.com/2011/08/lets-hook-a-library-function/
From the text:
Do beware of functions that themselves call dlsym(), when you need to call __libc_dlsym (handle, symbol) in the hook.
extern void *__libc_dlsym (void *, const char *);
void *dlsym(void *handle, const char *symbol)
{
printf("Ha Ha...dlsym() Hooked\n");
void* result = __libc_dlsym(handle, symbol); /* now, this will call dlsym() library function */
return result;
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js