I don't know assembly so I'm not sure how to go about this.
I have a program which is hooking into another. I have obtained the offset to where the function is located within the hooked program's .exe
#define FuncToCall 0x00447E5D
So now how can I use __asm{} to call that function?
Well short answer is if you do not know assembly you should not be doing this, haha.
But, if you are so intent on wall hacking, I mean, modifying the operation of a legitimate program, you can't just take an address and call it good.
You need to look up the symbol (if in happy linux land) or use sig scanning ( or both D= ) to find the actual function.
Once you do that then it's relatively simple: you just need to write a mov and a jmp. Assuming you already have access to the running process, and your sig scanner found the right address, this bit of code will get you what you want:
mov eax, 0xdeadbeef
jmp eax
Now, if this function you want is a class method.. you need to do some more studying. But that bit of assembly will run whatever static function you want.
There is some mess to deal with around different calling conventions too, so no commenters try and call me out on that; it is far too advanced for this question.
EDIT:
By the way, I do not use call because when using call you have to worry about stack frames and other very messy things. This code will jump to any address and start executing.
If you want to return to your code that's another story, but that WILL get your target function going as long as it's the right calling convention, not a class method, etc. etc.
I think you could also cast that address to a function pointer and call it that way. That might be better.
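For what it's worth, here is a minimal sketch of that function-pointer idea. The names TargetFn, sampleTarget and callAtAddress are made up for illustration, and the int(int) signature is an assumption; the real target's signature and calling convention have to match whatever the hooked program actually uses. A local function stands in for the address so the snippet can run anywhere.

```cpp
#include <cstdint>

// Assumed signature of the target; the real one must be taken from the hooked program.
using TargetFn = int (*)(int);

// Hypothetical stand-in for the function that lives at the known address.
int sampleTarget(int x) { return x * 2; }

// Treat a raw address as a function of type TargetFn and call it.
// The compiler then emits the call sequence for us, no inline asm needed.
int callAtAddress(std::uintptr_t address, int arg) {
    TargetFn fn = reinterpret_cast<TargetFn>(address);
    return fn(arg);
}
```

In the real case you would pass the address you found (for example FuncToCall) instead of the address of sampleTarget.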
Thanks for answers, but I figured it out. This is what I'm doing:
#define FuncToCall 0x00447E5D
DWORD myfunc = FuncToCall;
__asm call dword ptr [myfunc];
If it works don't fix it, and by golly it works.
Here is a tricky one:
You can use it with parameters and return value too. It simply forwards everything to the function you intend to call that is given by a pointer (FuncToCall) to the function.
void call_FuncToCall(.......)
{
__asm__
("call label1\n label1:\n"
"pop %eax\n"
"movl FuncToCall, %eax\n"
"leave\n"
"jmp *%eax");
}
I have a proxy dxgi.dll and I'm trying to detour the Present function in the original dxgi.dll in order to render things on screen. The .dll is successfully loaded and the detour is placed. However the detour crashes the program as soon as my new Present is called. Keep in mind the .dlls and programs are 64-bit.
Below is an image of how the function looks in memory before modification (Start highlighted):
Okay so I just found out I'm not allowed to post images directly on here unless I have 10 reputation, so use this link (replace DOT):
https://imgur DOT com/a/Jf53dYc
I am not sure exactly where it crashes. I believe the program keeps running for a little while, but it definitely crashes in the middle of, or soon after, the call to the detour Present. I know this because I can write the pointer to the SwapChain parameter to a file from inside the Present detour before it crashes.
I found the original Present function address using IDA. You can see what IDA says about the function on the picture in the imgur gallery.
I've been looking at the memory and trying to figure out what is wrong. When I follow the jumps using Cheat Engine they lead to the correct places; nevertheless, something in the detour is making the program crash. The overwritten opcodes also seem to be replaced properly.
I've tried to change the calling convention and return type on my Present function, I read in a dxgi hooking guide that the return type was a HRESULT, I tried changing to this to no avail. As for the calling convention I've tried WINAPI.
I've also looked a little bit into if the stack or registers are being corrupted by my function detour. However I'm not very good with assembly and I can't say for sure if this is the case.
I have a class named Core that takes care of the hooking, here is the header file with some relevant definitions:
#pragma once
#include <iostream>
#include <Windows.h>
#include <intrin.h>
#include <dxgi.h>
#include <fstream>
// Seems my C++ doesn't have QWORD predefined, defining it myself
typedef unsigned __int64 QWORD;
// Definition of the structure of the DXGI present function
typedef __int64 (__fastcall* PresentFunction)(IDXGISwapChain *pSwapChain, UINT SyncInterval, UINT Flags);
class Core
{
private:
QWORD originalDllBaseAddress;
QWORD originalPresentFunctionOffset;
public:
void Init();
bool Hook(PresentFunction originalFunction, void* newFunction, int bytes);
~Core();
};
Init starts the process by getting the relevant addresses:
void Core::Init()
{
originalDllBaseAddress = (QWORD)GetModuleHandleA("dxgi_.dll");
originalPresentFunctionOffset = 0x5070;
originalPresentFunction = (PresentFunction)(originalDllBaseAddress + (QWORD)originalPresentFunctionOffset);
Hook(originalPresentFunction, FixAndReturn, 14);
}
Hook tries to place a jump in the target address, I strongly believe the issue is somewhere in here, (comments have now changed my mind, it probably has something to do with assembly, registers or the stack) more specifically the assignments to originalFunction:
bool Core::Hook(PresentFunction originalFunction, void* newFunction, int length)
{
DWORD oldProtection;
VirtualProtect(originalFunction, length, PAGE_EXECUTE_READWRITE, &oldProtection);
memset(originalFunction, 0x90, length);
// Bytes are flipped (because of endianness), could alternatively use _byteswap_uint64()
*(QWORD*)originalFunction = 0x0000000025FF;
// The kind of jump I'm doing here seems to only use 6 bytes,
// and then grabs the subsequent memory address,
// I'm not quite sure if I'm doing this right
*(QWORD*)((QWORD)originalFunction + 6) = (QWORD)newFunction;
DWORD temp;
VirtualProtect(originalFunction, length, oldProtection, &temp);
originalPresentFunction = (PresentFunction)((QWORD)originalFunction + length);
presentAddr = (QWORD)Present;
jmpBackAddr = (QWORD)originalPresentFunction;
return true;
}
I've tried many things when it comes to writing the bytes into memory, but none of them have fixed my problem.
The assignment to "originalPresentFunction" at the end of the function is the address that the detour will attempt to jump back to.
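As a side note, the 14-byte patch that Hook writes can be sketched on its own, without touching any process memory. This is only an illustration; makeAbsoluteJmp64 is a made-up helper name. It builds the bytes FF 25 00 00 00 00 (jmp qword ptr [rip+0]) followed by the 8-byte absolute target, which is the same layout the two assignments above produce.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Build the 14-byte x86-64 sequence "jmp qword ptr [rip+0]" + 8-byte target.
// Hypothetical helper: it only fills a buffer, it does not patch anything.
std::array<std::uint8_t, 14> makeAbsoluteJmp64(std::uint64_t target) {
    std::array<std::uint8_t, 14> patch{};  // all bytes start as zero
    patch[0] = 0xFF;                       // opcode
    patch[1] = 0x25;                       // ModRM: RIP-relative memory operand
    // patch[2..5] stay zero: disp32 = 0, so the operand is the 8 bytes that follow
    for (std::size_t i = 0; i < 8; ++i) {
        // store the target least significant byte first (little-endian)
        patch[6 + i] = static_cast<std::uint8_t>((target >> (8 * i)) & 0xFF);
    }
    return patch;
}
```

Building the bytes in a buffer first makes it easy to compare them against what a debugger shows at the patched address.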
Here is the definition of the detour function, located in Core.cpp:
__int64 __fastcall Present(IDXGISwapChain *pSwapChain, UINT SyncInterval, UINT Flags)
{
//The program crashes with and without these file writes.
std::ofstream file;
file.open("HELLO FROM PRESENT.txt");
file << pSwapChain;
file.close();
return originalPresentFunction(pSwapChain, SyncInterval, Flags);
}
This is the function that causes a crash when called. As you can see, I am writing the pSwapChain parameter to a file here. I did this to test whether the parameters are being passed from the original function. This write is successful, and the contents of the file look like a valid pointer. Thus the crash happens after this write. FixAndReturn() is an assembly function:
includelib legacy_stdio_definitions.lib
.data
extern presentAddr : qword
extern jmpBackAddr : qword
; This performs instructions originally performed by dxgi.dll in the
; memory that we've replaced, and then returns
.code
FixAndReturn PROC
call [presentAddr]
mov [rsp+10h],rbx
mov [rsp+20h],rsi
push rbp
push rdi
push r14
jmp qword ptr [jmpBackAddr]
FixAndReturn ENDP
end
I have uploaded the entire code on Github if more code is needed:
https://github.com/techiew/KenshiDXHook
It's been a while, I've been busy with other things but I've now made the detour function work successfully.
After looking at resources on the web and doing a lot of thinking, the answer turned out to be quite simple. In my FixAndReturn assembly code, all I need to do is jmp to the detour function; no call is needed. A call might unnecessarily change things we don't want changed (it pushes a return address onto the stack), and our detour function is already identical to the original function in terms of parameters and whatnot, so it will read the parameters from the same place that the original function call placed them. This means a jmp will work just fine for running our detour function. No extra pushes or pops are needed in assembly for this to work.
Here is a basic overview of the process:
Our hook is placed by placing a jmp to the beginning of our assembly code.
Our assembly code immediately jumps to our detour/hooked function.
When the detour function is finished, it returns the result of a function call.
This function call uses a typedef which is identical to the original function we hooked. It looks like this:
typedef HRESULT (__fastcall* PresentFunction)(IDXGISwapChain *pSwapChain, UINT SyncInterval, UINT Flags);
Returning the function using the typedef is done like this, with the original argument values:
return ((PresentFunction)coreRef->newPresentReturn)(swapChain, syncInterval, flags);
Basically what's happening here is that the address following right after our second assembly code jmp instruction pointing to our detour function is being returned to and called as a function, thus we are jumping to the detour, jumping back, and executing the original code. (coreRef->newPresentReturn contains the address right after the jmp instruction).
We are now adhering to the calling convention of the original Present function, and the parameters we pass in are put in the right places, the registers and stack and whatever are not corrupted in any way.
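To make the flow concrete, here is a small host-side sketch of the same pattern. Everything in it is a stand-in: PresentLike, originalPresentStub, presentDetour and the global newPresentReturn are hypothetical names, and the stub replaces the real code that follows the patched bytes. It only shows the shape of "detour does its work, then calls the saved address through the typedef with the original arguments".

```cpp
#include <cstdint>

// Assumed shape of the hooked function; the real typedef is PresentFunction above.
using PresentLike = long (*)(void* swapChain, unsigned syncInterval, unsigned flags);

static int originalCalls = 0;
static int detourCalls = 0;

// Stand-in for the original code that execution continues into.
long originalPresentStub(void*, unsigned syncInterval, unsigned flags) {
    ++originalCalls;
    return static_cast<long>(syncInterval + flags);
}

// Address to continue at, kept as an integer like coreRef->newPresentReturn.
static std::uintptr_t newPresentReturn = 0;

// The detour: do the extra work, then forward the untouched arguments.
long presentDetour(void* swapChain, unsigned syncInterval, unsigned flags) {
    ++detourCalls;  // rendering or logging would go here
    return reinterpret_cast<PresentLike>(newPresentReturn)(swapChain, syncInterval, flags);
}
```

Because the detour and the typedef share one signature, the compiler places the forwarded arguments exactly where the continued code expects them.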
Resource used: Guidedhacking.com - D3D11 barebones hook
Full code is on my Github: https://github.com/techiew/KenshiHook
This is the beginning of a function that already exists and works; the commented line is my addition and its purpose is to toggle a pin.
inline __attribute__((naked))
void CScheduler::SwapToThread(void* pNew, void* pPrev)
{
//*(volatile DWORD*)0x400FF08C = (1 << 14);
if (pPrev != NULL)
{
if (pPrev == this) // Special case to save scheduler stack on startup
{
asm("mov lr,%0"::"p"(&CScheduler_Run_Exit)); // load r1 with schedulers End thread
asm("orr lr, 1");
When I uncomment my addition, my hard fault handler executes. I get it has something to do with this being a naked function but I don't understand why a simple assignment causes a problem.
Two questions:
Why does this line trigger the hard fault?
How can I perform this assignment inside this function?
It was only luck that your previous version of the function happened to work without crashing.
The only thing that can safely be put inside a naked function is a pure Basic Asm statement. https://gcc.gnu.org/onlinedocs/gcc/ARM-Function-Attributes.html. You can split it up into multiple Basic Asm statements, instead of asm("insn \n\t" / "insn2 \n\t" / ...);, but you have to write the entire function in asm yourself.
While using extended asm or a mixture of basic asm and C code may appear to work, they cannot be depended upon to work reliably and are not supported.
If you want to run C++ code from a naked function, you could call a regular function (or bl on ARM, jal on MIPS, etc.), following the standard calling convention.
As for the specific reason in this case? Maybe creating that address in a register stepped on the function args, leading to the branches going wrong? Inspect the generated asm if you want, but it's 100% unsupported.
Or maybe it ended up using more registers, and since it's naked didn't properly save/restore call-preserved registers? I haven't looked at the code-gen myself for naked functions.
Are you sure this function needs to be naked? I guess that's because you manipulate lr to return to the new context.
If you don't want to just write more logic in asm, maybe have this function's caller do more work (and maybe pass it pointer and/or boolean args telling it more simply what it needs to do, so your inputs are already in registers, and you don't need to access globals).
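A rough sketch of the "let the caller do it" option, written so it runs on a host machine. All names here are hypothetical: fakePinToggleRegister stands in for the memory-mapped register at 0x400FF08C, and swapToThreadStub stands in for the naked CScheduler::SwapToThread. The point is only that the pin write lives in an ordinary function, where the compiler generates a proper prologue and epilogue around it.

```cpp
#include <cstdint>

// Stand-in for the hardware register; on the target this would be
// *(volatile DWORD*)0x400FF08C as in the question.
static volatile std::uint32_t fakePinToggleRegister = 0;

// Stand-in for the naked context-switch routine (body irrelevant here).
static int swapCalls = 0;
void swapToThreadStub(void* /*pNew*/, void* /*pPrev*/) { ++swapCalls; }

// Ordinary (non-naked) wrapper: toggle the pin in C++, then hand over
// to the asm-only routine with the arguments untouched.
void togglePinThenSwap(void* pNew, void* pPrev) {
    fakePinToggleRegister = (1u << 14);
    swapToThreadStub(pNew, pPrev);
}
```

Callers would then use the wrapper instead of calling the naked function directly.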
I just had a really quick question that I saw someone mention something about in another question, but I didn't want to necro-post on it.
I'm coding in inline assembly with c++, and need to display a register value in decimal. I was searching ways to do this, and saw someone mention "If you're using inline c, just call printf." But they didn't go much further into explanation on it than that.
Is it possible the call printf can be used to get a register value in decimal format without needing to write a conversion section of the code? And if so, how would that work? Say after some computations to a user entered integer, the value now lies in the AX register. Would I simply put call printf in the code after it? Or does it print values from the stack? Or is it maybe even possible to do something like:
AX printf
I apologize for my ignorance on this, our book does not cover inline assembly, and I'd like to avoid having to write a massive segment of code to convert if I can. Plus I can't really seem to find answers on how exactly printf works. Thank you for any help, I really appreciate it!
The easiest way to accomplish this is to use inline assembler to copy your register to some variable, and then print that variable.
short registerValue;
__asm mov registerValue, ax;
printf("ax: %hd", registerValue);
The exact assembler invocation will depend on your compiler and syntax; the above likely won't work with a compiler other than cl.
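For comparison, a sketch of the same idea with GCC/Clang extended asm, which uses AT&T syntax and an output constraint instead of naming the variable inside the asm text. The helper names readAxAfterLoad and formatAx are made up, the value 1234 is just a placeholder for "whatever your computation left in AX", and the non-x86 branch is only a fallback so the snippet builds elsewhere.

```cpp
#include <cstdio>
#include <string>

// Put a known value in AX, then let the "=a" constraint hand AX to a C++ variable.
short readAxAfterLoad() {
    short value = 0;
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
    asm volatile("movw $1234, %%ax" : "=a"(value));
#else
    value = 1234;  // fallback for other compilers/architectures (no inline asm used)
#endif
    return value;
}

// Format the 16-bit value in decimal, the same way printf("ax: %hd", ...) would.
std::string formatAx(short value) {
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "ax: %hd", value);
    return buffer;
}
```

Printing is then just std::puts(formatAx(readAxAfterLoad()).c_str()); no conversion routine has to be written in assembly.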
If you want to actually call printf from assembler, you'll need to figure out its calling convention and how that calling convention passes variadic function arguments.
Depending on the compiler, there may be predefined pseudo-symbols which directly access the registers. This was especially convenient with Turbo C and its descendants:
_some_magic_function ();
printf ("es:bx = %0x:%0x\n", _ES, _BX);
I'm working on a hook in C++ and ASM, and currently I have just made a simple inline hook that places a jump at the first instruction of the target function, which in this case is OutputDebugString, just for testing purposes.
The thing is that my hook finally works after about 3 days of research and figuring out the bits and pieces of how things work, but there is one problem: I have no idea how to change the parameters that come in to my "dummy" function before jumping on to the rest of the original function.
As you can see in my code, I have tried to change the parameter simply in C++, but of course this does not work as I'm popping all the registers afterwards :/
Anyway, here is my dummy function, which is what the hooked function jumps to:
static void __declspec(naked) MyDebugString(LPCTSTR lpOutputString) {
__asm {
PUSHAD
}
//Where i suppose i could run my code, but not be able to interfere with parameters :/
lpOutputString = L"new message!";
__asm {
POPAD
MOV EDI, EDI
PUSH EBP
MOV EBP, ESP
JMP Addr
}
original_DebugString(lpOutputString);
}
i understand why the code is not working as i said, i just can't see a proper solution to this, any help is greatly appreciated.
Every compiler has a protocol for calling functions using assembly language. The protocol may be stated deep in their manuals.
A faster method to find the function protocols is to have the compiler generate an assembly language listing for your function.
The best method for writing inline assembly is to:
First write the function in C++ source code
Next print out the assembly listing of the function.
Review and understand how the compiler generated assembly works.
Lastly, modify the internal assembly to suit your needs.
My preference is to write the C++ code as efficiently as I can (or to help the compiler use optimal assembly language). I then review the assembly listing. I only change the inline assembly to invoke special processor features (such as block move instructions).
I have a dll and i wish to create a detour to one of its exported functions,
The dll is not part of windows.
I need to be able to call the real function after my detour (call the real function from a detoured one)
I know the exact signature of the function.
I already have been able to detour the function, but right now i can't call the real one.
I realize i need to use a trampoline function, I've seen examples online.
the problem is: all those examples show how to detour a windows API function, i need to do the same for a function i get through a dll import.
any help would be welcomed
--edit
just to clarify, I have attempted to call the original function by its pointer, but that does not work.
also tried using the method from this stack overflow article
that doesn't even crash but it looks like it goes into to an infinite loop (i assume because in the original function there is a jump to the detoured one)
edit -- solved!
not sure what solved it,
used this as reference.
stopped using GetProcAddress and started using DetourFindFunction instead
cleaned up the code (pretty sure i cleaned out whatever caused the issue)
works,
thanks anyway
I don't use detours(I actually detest it!), but detouring any non hot-patchable function can be done in a generic manner, like so:
Step 1:
Insert a JMP <your code> at the start of the function. It takes 5 bytes, probably a little more once you pad up to the next instruction boundary. As an example,
the start of the function to hook:
SUB ESP,3C
PUSH EDI
PUSH ESI
//more code
would become:
JMP MyFunction
//more code
One would do this by writing 0xE9 at the first byte, then writing the value (function_addr - patch_addr - 5) in the following DWORD, since the displacement is measured from the end of the 5-byte jump. Writing should be done using WriteProcessMemory after setting read/write/execute permissions with VirtualProtectEx.
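A minimal sketch of building that 5-byte patch in a buffer. makeRelativeJmp32 is a hypothetical helper name and the addresses are plain 32-bit integers; note that the rel32 operand of an E9 jump is counted from the address of the instruction after the jump, i.e. target - (patch address + 5). Nothing is written to another process here.

```cpp
#include <array>
#include <cstdint>

// Encode "JMP rel32" (opcode E9) placed at patchAddr and landing on targetAddr.
std::array<std::uint8_t, 5> makeRelativeJmp32(std::uint32_t patchAddr,
                                              std::uint32_t targetAddr) {
    // Unsigned wrap-around gives the right two's-complement value for backward jumps.
    const std::uint32_t disp = targetAddr - (patchAddr + 5u);
    return {{
        0xE9,
        static_cast<std::uint8_t>(disp & 0xFF),          // little-endian rel32
        static_cast<std::uint8_t>((disp >> 8) & 0xFF),
        static_cast<std::uint8_t>((disp >> 16) & 0xFF),
        static_cast<std::uint8_t>((disp >> 24) & 0xFF),
    }};
}
```

The resulting five bytes are what would be handed to WriteProcessMemory, followed by NOP padding if the overwritten instructions are longer than 5 bytes.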
Step 2:
next, we create an assembly interface:
void __declspec(naked) MyFunc()
{
__asm
{
call Check ;call our filter func
test eax,eax ; test if we let the call through
je _EXIT
sub esp,3c ; its gone through, so we replicate what we overwrote
push edi
push esi
jmp NextExecutionAddress ; now we jump back to the location just after our jump
_EXIT:
retn ; note, this must have the correct stack cleanup
}
}
NextExecutionAddress will need to be filled at run time using ModuleBase + RVA.
To be honest, it's way easier, and better(!), to just EAT (Export Address Table) hook the export table of the dll, or IAT (Import Address Table) hook the import tables of whatever is calling the funcs you want to filter. Detours should have functions for these types of hooks; if not, there are other freely available libs to do it.
The other way would be to use detours to hook every call in the apps using the dll and reroute them to a proxy function in your own code. This has the advantage of allowing one to filter only certain calls, and not everything across a binary (it is possible to do the same using _ReturnAddress, but that's more work). The disadvantage, though, is capturing the locations to patch (I use ollydbg + a custom patching engine), and it won't work on functions with non-regular calling conventions (like those made with #pragma aux in Watcom or the optimized calls generated by VC7+).
One important thing to note: if you're hooking a multithreaded app, your patches need to be done with the app suspended, or be done atomically using InterlockedExchange, InterlockedExchange64 and InterlockedExchangePointer (I use the latter for all IAT/EAT hooks, especially when hooking from a 'third party process').
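Here is a portable sketch of the atomic pointer swap idea, with std::atomic standing in for InterlockedExchangePointer and a plain global standing in for an import table slot. All names (FilterFn, importSlot, realExport, hookedExport, installHook) are invented for the example; a real IAT/EAT hook also has to locate the table entry and change page protection first.

```cpp
#include <atomic>

using FilterFn = int (*)(int);

// Stand-in for the function exported by the dll.
int realExport(int x) { return x + 1; }

// Simulated table slot that callers read the function pointer from.
std::atomic<FilterFn> importSlot{&realExport};

// Where the hook remembers the original so it can forward to it.
std::atomic<FilterFn> savedOriginal{nullptr};

// The proxy: filter or modify, then call the real function.
int hookedExport(int x) { return savedOriginal.load()(x) * 10; }

void installHook() {
    // Save the original first, so the proxy never sees a null pointer,
    // then replace the slot in one atomic step: other threads observe
    // either the old pointer or the new one, never a half-written value.
    savedOriginal.store(importSlot.load());
    importSlot.exchange(&hookedExport);
}
```

Callers that go through importSlot pick up the proxy on their next call, without the app having to be suspended.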
Looking at the post you link to, the method there is horrible in my opinion, mainly due to the assembly :P But how are you calling this pointer you obtain, and how is it obtained?