EndScene hook questions - c++

So recently I wanted to add an ImGui interface to an example window using DirectX. I learned from a video that I have to hook the EndScene function, using the DirectX 9 SDK, to be able to draw my custom ImGui interface.
However I have some questions:
Where can I find documentation for the DirectX 9 functions and types, if there is any? I do not understand why we specifically have to hook the EndScene function. Alternatively, where could I find an article explaining in more depth how DirectX works?
I've seen two versions of EndScene hooks so far: one uses a pattern-scanning function that scans for a signature in the shaderapi DLL, and another creates a DirectX device and then accesses the vtable from there. Are there any sources online for these, or is it something we have to work out ourselves?
Here is the version I have:
while (!DirectXDevice) // loop until the device pointer has been found
    DirectXDevice = **(DWORD**)(FindPattern("shaderapidx9.dll", "A1 ?? ?? ?? ?? 50 8B 08 FF 51 0C") + 0x1);
void** pVTable = *reinterpret_cast<void***>(DirectXDevice); // the device's vtable array
oEndScene = (f_EndScene)DetourFunction((PBYTE)pVTable[42], (PBYTE)Hooked_EndScene); // take the virtual function at index 42 and detour it to our own
I don't really understand what __stdcall does here. I know it is used for calling WINAPI functions, but what is it for in this case?
HRESULT __stdcall Hooked_EndScene(IDirect3DDevice9* pDevice) { /* some code */ }
Note: that's the function I hook in place of the original EndScene.
Thank you very much. I'm sorry if there are a lot of questions, but I really can't wrap my head around this.

How do you know which functions you need to hook?
To put it bluntly, you have to be an experienced DirectX graphics programmer to find that out. Don't expect to be able to hook into a framework that you don't understand. It just so happens that EndScene is always called after all the other draw calls on the render target.
There are tons of D3D9 programming resources available, online and in paper form. Most of them are not free. I'm afraid this is not the answer you were hoping for.
What is the deal with pattern scanning, or creating a temporary D3D9 device?
Microsoft did not put any explicit effort into making EndScene hookable. It just happens to be hookable because every normal function is hookable. You need a way to find the function in memory, because the function will not always be at the same address.
One approach is to scan for known instructions that appear inside the function. Someone needs to be the first person to work out a pattern that you can scan for. You are far from the first person to hook EndScene, so many have reverse-engineered the function before and shared searchable patterns.
NOTE: The pattern does not necessarily need to be directly inside the target function. It might also lead you to something else first, in your case, the IDirect3DDevice9 instance. The important thing is that you can find your way to the EndScene function somehow.
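To make the scanning idea concrete, here is a minimal sketch of a wildcard byte-pattern search over a plain buffer. The names parse_pattern and find_pattern are made up for illustration and are not the FindPattern from the question; a real scanner would also have to locate the module's memory range first, which is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// One pattern element: a concrete byte, or std::nullopt for a "??" wildcard.
using PatternByte = std::optional<std::uint8_t>;

// Parse a signature such as "A1 ?? ?? ?? ?? 50 8B 08 FF 51 0C".
std::vector<PatternByte> parse_pattern(const std::string& text) {
    std::vector<PatternByte> pattern;
    std::istringstream in(text);
    std::string token;
    while (in >> token) {
        if (token == "??" || token == "?") {
            pattern.push_back(std::nullopt);
        } else {
            pattern.push_back(
                static_cast<std::uint8_t>(std::stoul(token, nullptr, 16)));
        }
    }
    return pattern;
}

// Return the offset of the first match in [data, data + size), if any.
std::optional<std::size_t> find_pattern(const std::uint8_t* data,
                                        std::size_t size,
                                        const std::vector<PatternByte>& pattern) {
    assert(data != nullptr || size == 0);
    if (pattern.empty() || pattern.size() > size) return std::nullopt;
    for (std::size_t i = 0; i + pattern.size() <= size; ++i) {
        bool match = true;
        for (std::size_t j = 0; j < pattern.size(); ++j) {
            // A wildcard matches anything; a concrete byte must be equal.
            if (pattern[j] && *pattern[j] != data[i + j]) {
                match = false;
                break;
            }
        }
        if (match) return i;
    }
    return std::nullopt;
}
```

In the signature from the question, A1 is the opcode of a mov eax, [address] instruction and the four wildcards cover its operand, which is presumably why the question's code adds 0x1 to the match before dereferencing.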
Another approach is to get a pointer to the function. If it was a regular C function, that would be easy. It's hard here because OOP tends to make these things hard - you have to fight your way through various interfaces to get the correct vtable.
Both methods have advantages and disadvantages -- creating a D3D9 device is safer, but also more intrusive, because the target process might not expect someone to just randomly create new devices.
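The vtable route can be sketched without any DirectX at all. The following uses a hand-built table of function pointers as a stand-in for a COM-style vtable; FakeDevice, hook_vtable_slot and the slot index are all hypothetical. Note that it swaps a table entry, which is a different technique from the inline detour (DetourFunction) in the question, and that in a real process the table's memory would probably have to be made writable first.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a COM-style object: its first member points to a
// table of function pointers, much like the hidden vtable pointer of a class.
struct FakeDevice;
using MethodFn = long (*)(FakeDevice* self);

struct FakeDevice {
    MethodFn* vtable;      // first member, like the hidden vptr
    int frames_presented;  // some state the "real" method touches
};

// Replace one slot of the table and return the previous entry, so that the
// hook can still forward to the original function.
MethodFn hook_vtable_slot(FakeDevice* device, std::size_t index,
                          MethodFn replacement) {
    assert(device != nullptr && device->vtable != nullptr);
    MethodFn original = device->vtable[index];
    device->vtable[index] = replacement;
    return original;
}

// The "real" method that the application expects to run.
long real_end_scene(FakeDevice* self) {
    ++self->frames_presented;
    return 0;
}

MethodFn g_original_end_scene = nullptr;
int g_overlay_draws = 0;

// The hook: do the extra work, then call the original.
long hooked_end_scene(FakeDevice* self) {
    ++g_overlay_draws;                  // an overlay would be drawn here
    return g_original_end_scene(self);  // forward to the original
}
```

A caller would store the result of hook_vtable_slot(&device, index, hooked_end_scene) in g_original_end_scene; every later call through that slot then runs the hook first.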
Why does the hook function need __stdcall?
Since you replace the original function with your hooked version, the calling convention of the hooked function must be the same as the calling convention of the original function. The caller of EndScene expects (and was compiled with) a __stdcall convention, so the new function must also behave the same way, otherwise the stack will be corrupted. Your act of replacing the function does not change the way the caller calls it.
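One way to keep the two in sync is to write the original's type down once and let the compiler check the hook against it. This is only a sketch: Device and HResult are stand-ins for IDirect3DDevice9 and HRESULT, and the HOOK_CALL macro is an assumption that expands to __stdcall only on 32-bit MSVC builds, where the convention actually differs from the default.

```cpp
#include <cassert>
#include <type_traits>

#if defined(_MSC_VER) && defined(_M_IX86)
#define HOOK_CALL __stdcall  // only meaningful on 32-bit x86
#else
#define HOOK_CALL            // assumption: a single default convention elsewhere
#endif

struct Device;         // hypothetical stand-in for IDirect3DDevice9
using HResult = long;  // stand-in for HRESULT

// The type of the function being replaced, calling convention included.
using EndSceneFn = HResult (HOOK_CALL *)(Device*);

// Filled in by whatever installs the hook.
EndSceneFn g_original = nullptr;

HResult HOOK_CALL Hooked_EndScene(Device* device) {
    assert(g_original && "the hook must be installed before it is called");
    // ... extra drawing would go here ...
    return g_original(device);
}

// Fails to compile if the hook's signature or convention drifts away from
// the original's.
static_assert(std::is_same<decltype(&Hooked_EndScene), EndSceneFn>::value,
              "hook must have exactly the original function's type");
```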

Related

Can I access Windows Kernel system calls directly?

I have been doing research into Windows internals, and have just recently learned about system calls and I am wondering if it is possible to use these system calls like functions? I understand they aren't really meant to be accessed externally.
For instance: NtUserEmptyClipboard is a system call in Win32k.sys, and its address is 0x117f
If I wanted to use this call like a function, how could I do so?
What you want to do depends heavily on the architecture you're interested in, but the thing to know is that ntdll.dll is the user-mode trampoline for every syscall, i.e. the only one who actually makes syscalls at the end of the day is ntdll.
So, let's disassemble one of these methods in WinDbg, by opening up any old exe (I picked notepad). First, use x ntdll!* to find the symbols exported by ntdll:
0:000> x ntdll!*
00007ff9`ed1aec20 ntdll!RtlpMuiRegCreateLanguageList (void)
00007ff9`ed1cf194 ntdll!EtwDeliverDataBlock (void)
00007ff9`ed20fed0 ntdll!shortsort_s (void)
00007ff9`ed22abbf ntdll!RtlUnicodeStringToOemString$fin$0 (void)
00007ff9`ed1e9af0 ntdll!LdrpAllocateDataTableEntry (void)
...
So, let's pick one at random, NtReadFile looks neato. Let's disassemble it:
0:000> uf ntdll!NtReadFile
ntdll!NtReadFile:
00007ff9`ed21abe0 4c8bd1 mov r10,rcx
00007ff9`ed21abe3 b805000000 mov eax,5
00007ff9`ed21abe8 0f05 syscall
00007ff9`ed21abea c3 ret
Here, we see that we stuff away rcx in r10 (the syscall instruction itself overwrites rcx), put the syscall number into eax, then execute the syscall instruction. Every syscall has a number that is assigned arbitrarily by Windows (i.e. this number is a secret handshake between ntdll and the kernel, and changes whenever Microsoft wants).
None of these instructions are "magic"; you could execute them in your app directly too (but there's no practical reason to do so, of course, just for funsies).
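As a small illustration of the stub layout in the disassembly above, the sketch below reads the syscall number out of the stub bytes. The function name is made up, and the check assumes exactly the layout shown in that listing (mov r10,rcx followed by mov eax,imm32); other Windows builds may lay the stub out differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Expects the x64 stub layout from the listing above:
//   4c 8b d1         mov r10, rcx
//   b8 <imm32>       mov eax, <syscall number>
// Returns the number, or std::nullopt if the bytes do not look like that.
std::optional<std::uint32_t> syscall_number_from_stub(const std::uint8_t* stub,
                                                      std::size_t size) {
    assert(stub != nullptr);
    if (size < 8) return std::nullopt;
    const bool looks_like_stub = stub[0] == 0x4c && stub[1] == 0x8b &&
                                 stub[2] == 0xd1 && stub[3] == 0xb8;
    if (!looks_like_stub) return std::nullopt;
    // The immediate operand is stored little-endian.
    return static_cast<std::uint32_t>(stub[4]) |
           static_cast<std::uint32_t>(stub[5]) << 8 |
           static_cast<std::uint32_t>(stub[6]) << 16 |
           static_cast<std::uint32_t>(stub[7]) << 24;
}
```

Fed the eleven bytes of ntdll!NtReadFile from the listing, this yields 5, the value that the stub loads into eax.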
EmptyClipboard is one of the so-called "Win32 API" functions, and NtUserEmptyClipboard is the corresponding "native API" function.
http://en.wikipedia.org/wiki/Native_API
Unlike with Linux syscall(2), we are rarely supposed to call the "native API" directly. I heard these functions are in ntdll.dll rather than win32k.sys. But we should be able to invoke them just like normal functions defined in a normal DLL.
Is there any way to call the Windows Native API functions from the user mode?
I strongly doubt that 0x117f is the address you're looking for. I suspect it might be the value which you need to pass to GetProcAddress. But I don't know for sure, since those things vary across Windows versions (that's why ordinary people use documented functions instead)
The main part of the native API is exported via normal functions from ntdll.dll. You can load this dll into your process and call these functions just like any other API functions. As long as you have the right function prototypes and parameters, the calls will work just fine. What they do internally is transition from usermode to kernelmode and then they use an offset into the system service descriptor table (SSDT) to find the address of the function in kernel mode memory, and then the function is called. There is an open source project http://nativetest.codeplex.com/ that makes calls to the native api that you might refer to.
The functions in win32k.sys are not exposed in ntdll.dll. As far as I can tell they are not exposed anywhere. The address you have listed - I believe - is actually an offset into the SSDT. If you really needed to call this function, you would have to make the transition from usermode to kernelmode yourself, putting all the parameters for the function and the SSDT offset into the right places.
As others have recommended, I would suggest to find the usermode API to help accomplish what you want to do. FWIW, in user32.dll the function EmptyClipboard appears to forward directly to NtUserEmptyClipboard, according to the link /dump output.
1731 DF 0002018A EmptyClipboard = _NtUserEmptyClipboard#0

Detouring and using a _thiscall as a hook (GCC calling convention)

I've recently been working on detouring functions (only in Linux) and so far I've had great success. I was developing my own detouring class until I found this. I modernized the code a bit and converted it to C++ (as a class of course). That code is just like any other detour implementation, it replaces the original function address with a JMP to my own specified 'hook' function. It also creates a 'trampoline' for the original function.
Everything works flawlessly but I'd like to make one simple adjustment. I program in pure C++, I use no global functions and everything is enclosed in classes (just like Java/C#). The problem is that this detouring method breaks my pattern. The 'hook' function needs to be a static/non-class function.
What I want to do is to implement support for _thiscall hooks (which should be pretty simple with the GCC _thiscall convention). I've had no success modifying this code to work with _thiscall hooks. What I want as an end result is something just as simple as this; PatchAddress(void * target, void * hook, void * class);. I'm not asking anyone to do this for me, but I would like to know how to solve/approach my problem?
From what I know, I should only need to increase the 'patch' size (i.e. it's now 5 bytes, and I should require an additional 5 bytes?), and then before I use the JMP call (to my hook function), I push my 'this' pointer onto the stack (which should be as if I called it as a member function). To illustrate:
push 'my class pointer'
jmp <my hook function>
Instead of just having the 'jmp' call directly/only. Is that the correct approach or is there something else beneath that needs to be taken into account (note: I do not care about support for VC++ _thiscall)?
NOTE: here is my implementation of the above-mentioned code: header : source, uses libudis86
I tried several different methods and among these were JIT compile (using libjit) which proved successful but the method did not provide enough performance for it to be usable. Instead I turned to libffi, which is used for calling functions dynamically at run-time. The libffi library had a closure API (ffi_prep_closure_loc) which enabled me to supply my 'this' pointer to each closure generated. So I used a static callback function and converted the void pointer to my object type and from there I could call any non-static function I wished!
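The last step described above, a static callback that turns a void pointer back into an object, can be sketched as follows. This is not libffi: the closure API generates an entry point that carries the user pointer for you, whereas this sketch passes it explicitly. Tracker, HookCallback and invoke_hook are hypothetical names.

```cpp
#include <cassert>

// Shape of a callback that a C-style hooking or closure layer might invoke:
// the hooked function's argument plus an opaque user pointer.
using HookCallback = int (*)(int value, void* user_data);

class Tracker {
public:
    explicit Tracker(int bias) : bias_(bias) {}

    // The non-static member function we actually want to run.
    int on_call(int value) {
        ++calls_;
        return value + bias_;
    }

    // Static thunk: recovers the object from user_data and forwards to it.
    static int thunk(int value, void* user_data) {
        assert(user_data != nullptr);
        return static_cast<Tracker*>(user_data)->on_call(value);
    }

    int calls() const { return calls_; }

private:
    int bias_;
    int calls_ = 0;
};

// Hypothetical stand-in for the code that ends up calling the hook.
int invoke_hook(HookCallback callback, void* user_data, int value) {
    return callback(value, user_data);
}
```

Registering &Tracker::thunk together with a pointer to a Tracker instance is enough for the member function to run with the right object, while the hooking layer only ever sees a plain function pointer.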

calling kernel32.dll function without including windows.h

If kernel32.dll is guaranteed to be loaded into a process's virtual memory, why can't I call a function such as Sleep without including windows.h?
Below is an excerpt from vividmachine.com:
5. So, what about windows? How do I find the addresses of my needed DLL functions? Don't these addresses change with every service pack upgrade?
There are multitudes of ways to find the addresses of the functions that you need to use in your shellcode. There are two methods for addressing functions; you can find the desired function at runtime or use hard coded addresses. This tutorial will mostly discuss the hard coded method. The only DLL that is guaranteed to be mapped into the shellcode's address space is kernel32.dll. This DLL will hold LoadLibrary and GetProcAddress, the two functions needed to obtain any functions address that can be mapped into the exploits process space. There is a problem with this method though, the address offsets will change with every new release of Windows (service packs, patches etc.). So, if you use this method your shellcode will ONLY work for a specific version of Windows. Further dynamic addressing will be referenced at the end of the paper in the Further Reading section.
The article you quoted focuses on getting the address of the function. You still need the function prototype of the function (which doesn't change across versions), in order to generate the code for calling the function - with appropriate handling of input and output arguments, register values, and stack.
The windows.h header provides the function prototype that you wish to call to the C/C++ compiler, so that the code for calling the function (the passing of arguments via register or stack, and getting the function's return value) can be generated.
After knowing the function prototype by reading windows.h, a skillful assembly programmer may also be able to write the assembly code to call the Sleep function. Together with the function's address, these are all you need to make the function call.
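The point that an address plus a prototype is all a call needs can be shown in portable C++. In this sketch the "library" function is just a local function (add_milliseconds, a made-up name) whose address is passed around as a plain integer; the hand-written function pointer type plays the role that the declaration in windows.h would normally play.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a function that lives in some already-loaded library.
int add_milliseconds(int base, int extra) { return base + extra; }

// All the caller gets is a raw address, however it was obtained ...
std::uintptr_t lookup_address() {
    return reinterpret_cast<std::uintptr_t>(&add_milliseconds);
}

// ... plus a prototype written by hand instead of taken from a header.
using AddMillisecondsFn = int (*)(int, int);

int call_by_address(std::uintptr_t address, int base, int extra) {
    assert(address != 0);
    // The cast tells the compiler how to pass the arguments and where to
    // find the return value; if the prototype were wrong, the call would
    // misbehave even though the address is right.
    auto fn = reinterpret_cast<AddMillisecondsFn>(address);
    return fn(base, extra);
}
```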
With some black magic you can ;). There have been many custom implementations of GetProcAddress, which would allow you to get away with not needing windows.h. This, however, isn't the be-all and end-all, and it could run into problems due to internal Windows changes. Another method is to use the ToolHelp API to enumerate the modules in the process to get kernel32.dll's base address, then spelunk through its PE headers for the EAT (export address table) and grab the address of GetProcAddress. From there you just need function pointer prototypes to call the addresses correctly (and any structure definitions needed), which isn't too hard, merely labour-intensive if you have many functions. In fact, under Windows XP this is required to disable DEP due to differences between service packs. Of course you need windows.h as a reference to get this; you just don't need to include it.
You'd still need to declare the function in order to call it, and you'd need to link with kernel32.lib. The header file isn't anything magic, it's basically just a lot of function declarations.
I can do it with one line of assembly plus some helper functions that walk the PEB (Process Environment Block) by hard-coding the offsets of its members.
You'll have to start here:
static void*
JMIM_ASM_GetBaseAddr_PEB_x64()
{
    void* base_address = 0;
    unsigned long long var_out = 0;
    // On x64 Windows, gs:[0x60] holds the address of the PEB.
    __asm__(
        " movq %%gs:0x60, %[sym_out] ; \n\t"
        : [sym_out] "=r" (var_out) // outputs
    );
    //: printf("[var_out]:%d\n", (int)var_out);
    base_address = (void*)var_out;
    return( base_address );
}
Then use windbg on an executable file to inspect the data structures on your machine.
A lot of the values you'll be needing are hard to find and only really documented by random hackers. You'll find yourself on a lot of malware writing sites digging for answers.
dt nt!_PEB -r @$peb
was pretty useful in WinDbg for getting information on the PEB.
There is a full working implementation of this in my game engine.
Just look in: /DEP/PEB2020 for the code.
https://github.com/KanjiCoder/AAC2020
I don't include <windows.h> in my game engine, yet I use "GetProcAddress" and "LoadLibraryA". It might be inadvisable to do this, but my thought was: the more moving parts, the more that can go wrong. So I figured I'd take "#define WIN32_LEAN_AND_MEAN" to its absurd conclusion and not include <windows.h> at all.

Current function address - x64

I am working on this small project where I'd like to generate the call graph of an application - I am not planning to do anything complex, it is mainly for fun/experience. I am working on x64 platform.
The first goal I set myself is to be able to measure the time spent in each function of my test application. So far my strategy has been to use _penter() and _pexit(): _penter() is a function that gets called at the start of every method or function, and conversely _pexit() gets called at the end of every method or function.
With these two functions I can record each function call as well as the time spent in each of them. What I'd like to do next is get the address of each function being called.
For example if we consider the following callstack (very simplified):
main()
....myFunction()
........_penter()
I am in _penter and I want to get the address of the calling function, myFunction(). I already found a way to do it in the case of non-leaf functions: I simply use RtlLookupFunctionEntry. However, this solution doesn't seem to work for leaf functions because they don't provide any unwind data.
One thing I was thinking about is to go up one more level in the callstack, into main(), and decode the CALL instruction manually. That would involve getting a pointer to the instruction calling myFunction().
I was wondering if any of you would know how to get the address of the current function in the case of leaf functions. I have this gut feeling that my current approach is a bit overcomplicated.
Thanks,
Clem
I believe SymGetSymFromAddr64, probably along with StackWalk64 should get you (most of?) what you want.
Hmm, x64 code: no assembly hacks at your disposal unless you use ml64.exe. There's one intrinsic that ought to help here: _ReturnAddress() gives you the code location of the call to your _penter() function (the instruction after it, to be precise). That should be enough to help you identify the caller.
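A rough sketch of that idea follows. It uses _ReturnAddress() under MSVC and, as an assumption for other compilers, the GCC/Clang builtin __builtin_return_address(0). The function enter_hook is only a stand-in for a real _penter, which the compiler calls implicitly and which has its own requirements that this sketch ignores.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#include <intrin.h>
#pragma intrinsic(_ReturnAddress)
#define CALLER_RETURN_ADDRESS() _ReturnAddress()
#define NOINLINE __declspec(noinline)
#else
// Assumption: GCC or Clang, which provide an equivalent builtin.
#define CALLER_RETURN_ADDRESS() __builtin_return_address(0)
#define NOINLINE __attribute__((noinline))
#endif

// Last call site seen by the hook.
void* g_last_return_address = nullptr;

// Stand-in for _penter: remembers where it will return to, which is an
// address inside the function that called it.
NOINLINE void enter_hook() {
    g_last_return_address = CALLER_RETURN_ADDRESS();
}

// An instrumented function; with a real _penter the compiler would insert
// the call at the top of the function for us.
NOINLINE int my_function(int x) {
    enter_hook();
    return x * 2;
}
```

The recorded address lies inside my_function rather than at its start, so it still has to be mapped back to a function, for example with the symbol lookup suggested in the other answer.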

C++ How to replace a function but still use the original function in it?

I want to modify the glBindTexture() function to keep track of the previously bound texture IDs. At the moment I have just created a new function for it, but then I realised that if I use other code that calls glBindTexture, my whole system might go down the toilet.
So how do I do it?
Edit: Now that I think about it, checking whether I should bind the texture or not is quite useless, since OpenGL probably does this already. But I still need to keep track of the previously used texture.
As Andreas is saying in the comment, you should check this is necessary. Still, if you want to do such a thing, and you use gnu linker (you don't specify the operating system) you could use the linker option:
--wrap glBindTexture
(if given directly to gcc you should write):
-Wl,--wrap,glBindTexture
As this is done at linker stage, you can use your new function with an already existing library (edit: by 'library' I mean some existing code which you can recompile but which you wouldn't want to modify).
The code for the 'replacement' function will look like:
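#include <stdio.h>
#include <GL/gl.h>
/* When compiling as C++, declare both functions inside extern "C". */
void __real_glBindTexture (GLenum target, GLuint texture); /* the linker resolves this to the original */
void __wrap_glBindTexture (GLenum target, GLuint texture) {
    printf ("glBindTexture wrapper\n");
    __real_glBindTexture (target, texture); /* glBindTexture returns void */
}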
You actually can do this. Take a look at LD_PRELOAD. Create a shared library that defines glBindTexture. To call the original implementation from within the wrapper, dlopen the real OpenGL library and use dlsym to call the right function from there.
Now have all client code LD_PRELOAD your shared lib so that their OpenGL calls go to your wrapper.
This is the most common method of intercepting and modifying calls to shared libraries.
You can intercept and replace all calls to glBindTexture. To do this you need to create your own OpenGL DLL which intercepts all OpenGL function calls, does the bookkeeping you want and then forwards the function calls to the real OpenGL DLL. This is a lot of work, so I would definitely think twice before going down this route...
Programs like GLIntercept work like this.
One possibility is to use a macro to replace existing calls to glBindTexture:
#define glBindTexture(target, texture) myGlBindTexture(target, texture)
Then in your code, where you want to avoid the macro, you surround the name with parentheses:
(glBindTexture)(someTarget, someTexture);
A function-like macro is only expanded where the name is followed by an open parenthesis, so this prevents macro expansion.
Since this is a macro, it will only affect source code compiled with the macro visible, not an existing DLL, static library, etc.
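Put together, the macro approach might look like the sketch below. To keep it compilable without OpenGL headers, glBindTexture is replaced by a local stand-in and the GL types are declared by hand; both are assumptions, and in real code they would come from the GL headers.

```cpp
#include <cassert>

// Stand-ins so the sketch compiles without OpenGL headers.
using GLenum = unsigned int;
using GLuint = unsigned int;
const GLenum kTexture2D = 0x0DE1;  // the value of GL_TEXTURE_2D

GLuint g_driver_bound = 0;  // what the stand-in "driver" saw last
void glBindTexture(GLenum /*target*/, GLuint texture) {
    g_driver_bound = texture;
}

// From here on, every function-style use of the name is redirected.
#define glBindTexture(target, texture) myGlBindTexture(target, texture)

GLuint g_previous_texture = 0;
GLuint g_current_texture = 0;

// Bookkeeping wrapper around the real function.
void myGlBindTexture(GLenum target, GLuint texture) {
    g_previous_texture = g_current_texture;
    g_current_texture = texture;
    // The parentheses stop the macro from expanding, so this reaches the
    // real glBindTexture instead of recursing into the wrapper.
    (glBindTexture)(target, texture);
}

// Ordinary client code: looks like a plain OpenGL call, goes to the wrapper.
void draw_two_sprites() {
    glBindTexture(kTexture2D, 7);
    glBindTexture(kTexture2D, 9);
}
```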
I haven't ever worked with OpenGL, so not knowing anything about that function, here's my best guess. You would want to replace the glBindTexture function call with your new function's call anywhere it occurs in your code. If you use library functions that will call glBindTexture internally, then you should probably figure out a way to reverse what glBindTexture does. Then, anytime you call something that binds a texture, you can immediately call your reversal function to undo its changes.
The driver WON'T do it, it's in the spec. YOU have to ensure that you don't bind the same texture twice, so it's a good idea.
However, it's even a better idea to separate the concerns : let the low-level openGL deal with its low-level stuff, and your (thin, thick, as you want) abstraction layer do the higher-level stuff.
So, create a oglWrapper::BindTexture function that does the if(), but you should not play around with LD, even if this is technically possible.
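A thin wrapper of that kind could look roughly like this. It is a sketch under simplifying assumptions: the class name TextureBinder is made up, it tracks a single binding only (no per-target or per-texture-unit state), and the real bind call is injected as a callable so that the example does not depend on OpenGL headers.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Stand-ins for the OpenGL types, so no GL header is needed here.
using GLenum = unsigned int;
using GLuint = unsigned int;

class TextureBinder {
public:
    using BindFn = std::function<void(GLenum, GLuint)>;

    // bind_fn is whatever performs the real bind, e.g. glBindTexture.
    explicit TextureBinder(BindFn bind_fn) : bind_(std::move(bind_fn)) {
        assert(bind_);
    }

    // Returns true if a real bind was issued, false if it was redundant.
    bool bind(GLenum target, GLuint texture) {
        if (has_bound_ && target == target_ && texture == current_) {
            return false;  // the if() check: same texture, nothing to do
        }
        previous_ = current_;
        current_ = texture;
        target_ = target;
        has_bound_ = true;
        bind_(target, texture);
        return true;
    }

    GLuint current() const { return current_; }
    GLuint previous() const { return previous_; }

private:
    BindFn bind_;
    GLenum target_ = 0;
    GLuint current_ = 0;
    GLuint previous_ = 0;
    bool has_bound_ = false;
};
```

With the real headers available, constructing it as TextureBinder binder(glBindTexture); should be all that is needed; code that bypasses the wrapper and calls glBindTexture directly will, of course, leave the cached state stale.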
[EDIT] In fact, it's not in the ogl spec, but still.
In general, the approaches have been catalogued under the heading of "seams", as popularized in Michael Feathers' 2004 book Working Effectively with Legacy Code. The book focuses on finding seams in a monolithic application to isolate parts of it and put them under automated testing.
Feathers' seams can be found in the following places:
compiler
    __attribute__((ifunc(...))) in GCC, https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html
preprocessor
    change what gets used with a #define
linker
    -Wl,--wrap,func_xyz
    linking order, first found symbol gets used, program can delegate using dlsym(RTLD_NEXT, ...)
    the binary format has a Procedure Linkage Table which can be modified by the program itself when it runs
in Java, much can be achieved in the JVM, see for example Mockito
language features
    function pointers, this can actually be done so as to add no syntactic overhead at point of call!
    object inheritance: inherit, override, call super()
sources:
https://www.informit.com/articles/article.aspx?p=359417&seqNum=3
https://www.cute-test.com/guides/mocking-with-cute/