I'm hooking an undocumented Windows API function, RtlGetFullPathName_U (residing in ntdll.dll), to detect process injections in my game. However, the function signature I see in IDA differs from the one in the only documentation I could find for it (ReactOS's docs).
When looking in IDA:
The file analyzed above is ntdll.dll found through x32dbg:
When looking in ReactOS' docs, I see RtlGetFullPathName_U looks like this:
ULONG
NTAPI
RtlGetFullPathName_U(
IN PCWSTR FileName,
IN ULONG Size,
IN PWSTR Buffer,
OUT PWSTR *ShortName
);
Using ReactOS' version of RtlGetFullPathName_U works when I hook, but I notice a difference in the number of parameters. Why is that? My normal approach would be to take the signatures of exported functions from IDA, not from ReactOS' documentation.
A last question; are there other relevant functions I could hook to detect process injections? Besides LoadLibraryA/W/Ex?
As you can see in the disassembly, the function uses push ecx early on, followed by saving the address of the just-pushed value in eax. The address in eax is then pushed onto the stack as an argument for the next function.
So what you read in the decompiler output is not technically wrong: it stores the value of ecx in a local variable and then passes the address of that local variable to RtlGetFullPathName_UEx.
To capture this, IDA assumes that the value passed to the function in ecx might matter and marks it as a parameter.
However, most likely, the real purpose of the push ecx instruction here is not to save the value of ecx, but simply to reserve four bytes on the stack for a local variable (a more common idiom for which would be sub esp, 4). Using push is an optimization.
To confirm this definitively, you would have to analyze the called function, RtlGetFullPathName_UEx, and see whether it ever reads the contents of the memory pointed to by its last parameter. If, as I strongly suspect, it does not, and this parameter is only used for output, then the value in the caller can simply be considered uninitialized.
After you've confirmed this (or if for some other reason, e.g. trusting ReactOS's declaration, you believe this is the case), you can modify the function prototype to use __stdcall and remove the void *this parameter in IDA, and it will show it as what it (probably) is: passing a pointer to an uninitialized local variable.
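To make the pattern concrete, here is a minimal sketch of a wrapper that passes the address of an uninitialized local to an "Ex" routine. The names FullPath and FullPathEx, their parameters, and the meaning of pathType are hypothetical stand-ins for illustration, not the real ntdll signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical stand-in for the "Ex" routine. The last parameter is
// output-only: it is written before returning and never read.
unsigned long FullPathEx(const wchar_t* name, unsigned long sizeBytes,
                         wchar_t* buffer, int* pathType)
{
    assert(name && buffer && pathType);
    *pathType = (name[0] == L'\\') ? 1 : 0;   // write-only use of the out-parameter
    const std::size_t len = std::wcslen(name);
    const std::size_t needed = (len + 1) * sizeof(wchar_t);
    if (needed > sizeBytes)
        return static_cast<unsigned long>(needed);  // buffer too small: report required size
    std::wmemcpy(buffer, name, len + 1);            // copy including the terminator
    return static_cast<unsigned long>(len * sizeof(wchar_t));
}

// Hypothetical wrapper with the shorter signature. `pathType` is deliberately
// left uninitialized. A 32-bit compiler may reserve its four bytes with
// `push ecx` instead of `sub esp, 4`, which is what can lead a decompiler
// to treat ecx as an incoming argument.
unsigned long FullPath(const wchar_t* name, unsigned long sizeBytes, wchar_t* buffer)
{
    int pathType;  // uninitialized on purpose; only its address is passed on
    return FullPathEx(name, sizeBytes, buffer, &pathType);
}
```

Because FullPathEx only ever writes through pathType, whatever value happened to be in that stack slot (or in ecx) before the call cannot influence the result.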
If I have some non-inline function and C++ compiler knows that this function modifies some registers then compiler will save all necessary registers before doing function CALL.
At least I expect that compiler does this (saving) as far as it knows what registers will be modified inside called function.
Now imagine that my function modifies ALL possible registers of the CPU (general purpose, SIMD, FPU, etc.). How can I force the compiler to save everything it needs before any CALL to this function? As a reminder, my function is non-inline, i.e. it is called through a CALL instruction.
Of course through asm I can push all possible registers on stack at my function start and pop all registers back before function return.
Although I can save ALL possible registers I would better prefer if compiler saves only necessary registers, that were used by function's caller, for performance (speed) and memory usage reasons.
Because inside my function I don't know in advance who will use it hence I have to save every possible register. But at the place where my function was used compiler knows exactly what registers are used in caller's function hence it may save much fewer registers needed, because for sure not all registers will be used.
Hence I want to mark my function as "modifying all registers" so that C++ compiler will push to stack just registers that it needs before calling my function.
Is there any way to do this? Is there a GCC/Clang/MSVC function attribute for it? Or should I list all registers in the clobber section of an asm statement?
Main thing is that I don't want to save registers myself inside this function (for some specific reason), instead I want all callers to save all needed registers before calling my function, but I want all callers to be aware that my function modifies everything what is possible.
I'm looking for some imaginary modifies-all attribute like:
__attribute__((modifies_all_registers)) void f();
I did following experiment:
__attribute__((noinline)) int modify(int i) {
asm volatile(
""
: "+m" (i) ::
"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
"r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
"xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
"xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
"ymm0", "ymm1", "ymm2", "ymm3", "ymm4", "ymm5", "ymm6","ymm7",
"ymm8", "ymm9", "ymm10", "ymm11", "ymm12", "ymm13", "ymm14", "ymm15",
"zmm0", "zmm1", "zmm2", "zmm3", "zmm4", "zmm5", "zmm6", "zmm7",
"zmm8", "zmm9", "zmm10", "zmm11", "zmm12", "zmm13", "zmm14", "zmm15"
);
return i + 1;
}
int main(int argc, char ** argv) {
auto volatile x = modify(argc);
}
In other words, I asm-clobbered almost all possible registers, and the compiler generated the following push sequence inside modify() (with the matching pop sequence at the end):
push rbp
mov rbp, rsp
push r15
push r14
push r13
push r12
push rbx
Nothing else was pushed, so I can see that the compiler (Clang) didn't care about any registers except rbx, rbp, and r12-r15. Does it mean that there is some C++ calling convention that says that I can modify any other registers besides these few, without restoring them on function return?
Does it mean that there is some C++ calling convention that says that I can modify any other registers besides these few, without restoring them on function return?
Yes. Among other things, ABI specification that is used on a given platform defines function calling conventions. Calling conventions define a set of registers that are allowed to be clobbered by the function and a set of registers that are required to be preserved by the function. If registers of the former set contain useful data for the caller, the caller is expected to save that data before the call. If registers from the latter set have to be used in the called function, the function must save and restore these registers before returning.
There are also conventions regarding which registers, in what order, are used to pass arguments to the function and to receive the returned value. You can consider those registers as clobbered, since the caller must initialize them with parameter values (and thus save any useful data that was in those registers before the call) and the callee is allowed to modify them.
In your case, the asm statement marks all registers as clobbered, and the compiler only saves and restores the registers that it is required to preserve across the function call. Note that by default the caller saves any live values it holds in registers from the clobber set before a function call, whether those registers are actually modified by the function or not. In some cases, the optimizer may be able to skip saving registers that are not actually modified, for example if the function call is inlined or the compiler is able to analyze the function body (e.g. with LTO). However, if the function body is not known at compile time, the compiler must assume the worst and adhere to the ABI specification.
So, in general, you do not need to mark the function in any special way: the ABI rules already work in such a way that registers are saved and restored as needed. And, as you witnessed yourself, even with asm statements the compilers are able to tell which registers are used in a function. If you still want to save specific, or all, registers for some reason, your only option is to write that part in assembler. Or, if you're implementing some sort of context switching, use specialized instructions like XSAVE/XRSTOR or APIs like ucontext.
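A smaller version of the experiment makes the split between the two register sets easier to see. This is only a sketch, assuming x86-64 with GCC or Clang; the function name bump is made up for illustration. It declares one callee-saved register (rbx) and one caller-saved register (r11) as clobbered.

```cpp
#include <cassert>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
// rbx must be preserved by the callee in both the System V and the Windows
// x64 conventions; r11 may be destroyed by the callee in both.
__attribute__((noinline)) int bump(int i)
{
    // Empty asm that claims to overwrite one register of each kind.
    // The compiler is expected to emit push/pop for rbx only, since the
    // calling convention already allows this function to trash r11.
    asm volatile("" : "+r"(i) : : "rbx", "r11");
    return i + 1;
}
#else
// Fallback so the example still builds on other targets; it declares no clobbers.
int bump(int i) { return i + 1; }
#endif
```

Compiling this with -O2 -S and inspecting the output for bump should show a save and restore of rbx and nothing for r11; a caller that keeps a live value in r11 across the call is the one responsible for saving it.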
I want to break in a function, but only if it was NOT called from a specific other function. That's because there's one or two functions that amount for most of the calls, but I'm not interested in debugging them.
I noticed that breakpoints have a Filter option:
Is that something that could be used to filter on the stack trace and break based on its contents?
I don't think you can use the filters for that, based on this: Use breakpoints in the Visual Studio debugger Specifically, the breakpoint filters are meant for concurrent programs, and you can filter on:
MachineName, ProcessId, ProcessName, ThreadId, or ThreadName.
One suggestion I would make to get something like what you want, is to add an extra parameter with a default value to the function you want to break in. Then set the value to something different in the places you don't want to monitor, and use a "Conditional Expression" in the breakpoint to make it only break on the default value.
Of course, this requires you to make debugging-only changes to your code (and then revert them when done), so it is a pretty ugly approach.
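A minimal sketch of that default-parameter approach follows. All names here (process, noisyCaller, interestingCaller, g_breakCandidates) are hypothetical; the counter only exists so the effect can be observed without a debugger.

```cpp
#include <cassert>

// Counts the calls on which a conditional breakpoint would stop.
int g_breakCandidates = 0;

// The function to break in. The extra parameter defaults to false, so
// existing callers do not need to change.
int process(int value, bool fromNoisyCaller = false)
{
    // Put the breakpoint on the next line with the conditional expression:
    //     !fromNoisyCaller
    if (!fromNoisyCaller)
        ++g_breakCandidates;
    return value * 2;
}

// The caller that accounts for most calls opts out explicitly.
int noisyCaller(int value) { return process(value, /*fromNoisyCaller=*/true); }

// Every other caller is left untouched and uses the default.
int interestingCaller(int value) { return process(value); }
```

Only the call sites to be ignored are edited, which keeps the temporary change small and easy to revert.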
If you know the address of the code location where the function is called from, you could make the breakpoint condition depend on the return address stored on the call stack.
Therefore, you should be able to set the breakpoint as a condition of the value *(DWORD*)ESP (32-bit code) or *(QWORD*)RSP (64-bit code). I haven't tested it though.
However, my above example will only work if the breakpoint is set at the very start of the function, before the called function pushes any values on the stack or modifies the stack pointer. I'm not sure where Visual Studio sets the breakpoint if you place it on the first instruction of a function. Therefore, you may have to either set the breakpoint in the disassembly window to the first assembler instruction of the function or you might have to compensate for the function having modified the stack pointer in the function prolog.
Alternatively, if a proper stack frame has been set up using the EBP register (or RBP for 64-bit), then you could use that instead.
Please note that it is not the address of the CALL instruction that is placed on the stack, but rather the return address, which is the address of the next assembler-level instruction of the calling function.
I suggest you first set an unconditional breakpoint where you want it and then inspect the stack using the memory viewer in the debugger, specifically to see where the values of ESP/RSP and EBP/RBP are pointing and where the return address is stored on the stack.
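The same return address can also be read from code, which may help when checking what the debugger shows on the stack. The sketch below uses the MSVC intrinsic _ReturnAddress() or the GCC/Clang builtin __builtin_return_address(0); the function names target and returnsInto, and the idea of comparing against an address range, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#define NOINLINE __declspec(noinline)
#define RETURN_ADDRESS() _ReturnAddress()
#else
#define NOINLINE __attribute__((noinline))
#define RETURN_ADDRESS() __builtin_return_address(0)
#endif

// Hypothetical function under investigation. It reports the address the CPU
// will return to, i.e. the instruction following the CALL in the caller.
NOINLINE std::uintptr_t target()
{
    return reinterpret_cast<std::uintptr_t>(RETURN_ADDRESS());
}

// True when `ret` lies in [begin, end), the assumed address range of the
// calling function that should be ignored.
bool returnsInto(std::uintptr_t ret, std::uintptr_t begin, std::uintptr_t end)
{
    return ret >= begin && ret < end;
}
```

The range of the caller to ignore would have to come from the disassembly window or the map file; this example does not try to discover it.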
I have a memory address; it's the address of a function in another program (in one of its DLLs). I am already loaded into the program via DLL injection. I already have the base address and the actual location of the function each time the program loads, so this is not an issue.
I want to just simply hook that location, and grab the variables. I know the function's pseudocode. So this is not an issue. OR another approach that would be great is doing a break point at that memory location and grab the debug registers.
I can not find any clear-cut examples of this. I also do not have the "name" of the function, I just have the memory address. Is there any way to work with just a memory address? Most, if not all the examples have you use the name of the function, which I do not have.
If anyone could point me into the right direction so I can accomplish this task, I would greatly appreciate it. It also might help a lot of other people who may have the same question.
Edit: I should also mention that I'd rather not overload my program with someone else's code. I really just want the barebones, much like a basic car with roll-up windows. No luxury packages for me please.
You missed the most important part, is this for 32 or 64 bit code? In any case, the code project has a good run-down and lib here that covers both.
If you want to do this "old-school", then it can be done quite simply:
First, you need to find the virtual address of the function you want to hook (because of ASLR, you should never rely on it being in the same place). For functions that are not exported, this is generally the RVA plus the module's load base address; for exported functions, you can use GetProcAddress.
From there, the type of hook depends on what you want to accomplish. In your case, there are two methods:
patch a jump/call out to your function in the target function's prologue
patch all call sites to the function you want to hook, redirecting to your function
The first is simpler but messy, as it generally involves some inline assembly (unless you are hooking a /HOTPATCH binary or you just want to stub it). The second is much cleaner but requires a bit of work with a debugger.
The function you jump out to should have the same parameters and calling convention (ABI) as the function you are hooking. This function is where you can capture the passed parameters, manipulate them, filter calls, or whatever you are after.
For both, you need a way to write some assembly to do the patching. Under Windows, WriteProcessMemory is your first port of call (note: you need RWX permissions on the target pages, hence the calls to VirtualProtect). This is a little utility function that creates a 32-bit relative call or jump (depending on the opcode passed as eType):
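#include <windows.h>

#pragma pack(push, 1)
struct patch_t
{
    BYTE  nPatchType;  // opcode: 0xE8 = CALL rel32, 0xE9 = JMP rel32
    DWORD dwAddress;   // displacement relative to the end of the 5-byte instruction
};
#pragma pack(pop)

// 32-bit only: addresses are handled as DWORDs.
BOOL ApplyPatch(BYTE eType, DWORD dwAddress, const void* pTarget)
{
    DWORD dwOldValue, dwTemp;
    patch_t pWrite =
    {
        eType,
        (DWORD)pTarget - (dwAddress + sizeof(DWORD) + sizeof(BYTE))
    };
    // Unprotect all five bytes that are written, not just sizeof(DWORD).
    if (!VirtualProtect((LPVOID)dwAddress, sizeof(pWrite), PAGE_EXECUTE_READWRITE, &dwOldValue))
        return FALSE;
    BOOL bSuccess = WriteProcessMemory(GetCurrentProcess(), (LPVOID)dwAddress, &pWrite, sizeof(pWrite), NULL);
    // Restore the original protection.
    VirtualProtect((LPVOID)dwAddress, sizeof(pWrite), dwOldValue, &dwTemp);
    return bSuccess;
}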
This function works great for method 2, but for method 1, you'll need to jump to an intermediary assembly trampoline to restore any code that the patch overwrote before returning to the original function. This gets very tedious, which is why it's better to just use an existing and tested library.
From the sounds of it, using method 1 and patching a jump over the prologue of your target function will do what you need, as it seems you don't care about executing the function you patched.
(there is a third method using HW breakpoints, but this is very brittle, and can become problematic, as you are limited to 4 HW breakpoints).
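The displacement arithmetic used by the patch can be checked in isolation, without touching executable memory. The following is a sketch under that assumption; makeRel32Patch is a hypothetical helper, not part of any library.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Builds the 5 bytes of an x86 near CALL (0xE8) or JMP (0xE9) with a 32-bit
// relative displacement. Nothing is written to the target process here.
std::array<std::uint8_t, 5> makeRel32Patch(std::uint8_t opcode,
                                           std::uint32_t patchAddress,
                                           std::uint32_t targetAddress)
{
    assert(opcode == 0xE8 || opcode == 0xE9);
    // The displacement is measured from the end of the 5-byte instruction.
    // Unsigned arithmetic wraps, which yields the right encoding for
    // backward jumps as well.
    const std::uint32_t rel = targetAddress - (patchAddress + 5u);
    std::array<std::uint8_t, 5> bytes{};
    bytes[0] = opcode;
    // x86 stores the displacement little-endian: low byte first.
    for (int i = 0; i < 4; ++i)
        bytes[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    return bytes;
}
```

For example, a JMP placed at 0x1000 that should land at 0x2000 encodes as E9 FB 0F 00 00, because 0x2000 - 0x1005 = 0xFFB.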
Your "sample" is here:
http://www.codeproject.com/Articles/4610/Three-Ways-to-Inject-Your-Code-into-Another-Proces#section_1
Normally when you "hook" into the DLL, you actually put your function in front of the one in the DLL that gets called, so your function gets called instead. You then capture whatever you want, call the other function, capture its return values and whatever else, then return to the original caller.
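The "put your function in front" idea can be sketched portably with a function pointer slot standing in for an import table entry or a patched call site. Everything named here (originalAdd, g_addSlot, hookedAdd, installHook) is hypothetical and only illustrates the flow: capture the arguments, call the original, capture the result, return to the caller.

```cpp
#include <cassert>

// Hypothetical original function.
int originalAdd(int a, int b) { return a + b; }

using AddFn = int (*)(int, int);

// Callers always go through this slot (a stand-in for an import table entry).
AddFn g_addSlot = originalAdd;
// Filled in by installHook() so the hook can forward to the original.
AddFn g_savedOriginal = nullptr;

// What the hook captured on the most recent call.
int g_lastA = 0, g_lastB = 0, g_lastResult = 0;

// Same parameters and calling convention as the function being hooked.
int hookedAdd(int a, int b)
{
    g_lastA = a;                               // capture the arguments
    g_lastB = b;
    const int result = g_savedOriginal(a, b);  // call the real function
    g_lastResult = result;                     // capture its return value
    return result;                             // hand it back to the original caller
}

void installHook()
{
    assert(g_savedOriginal == nullptr);  // install only once
    g_savedOriginal = g_addSlot;
    g_addSlot = hookedAdd;
}
```

After installHook(), a call through g_addSlot behaves exactly as before from the caller's point of view, while the hook has seen both the arguments and the result.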
If I have a program like this in C or C++ (the language doesn't matter much; it's just to illustrate the concept):
#include <cstdio>
void foo() {
    printf("in foo");
}
int main() {
    foo();
    return 0;
}
What happens in the assembly? I'm not actually looking for assembly code as I haven't gotten that far in it yet, but what's the basic principle?
In general, this is what happens:
Arguments to the function are stored on the stack. In platform specific order.
Location for return value is "allocated" on the stack
The return address for the function is also stored in the stack or in a special purpose CPU register.
The function (or actually, the address of the function) is called, either through a CPU specific call instruction or through a normal jmp or br instruction (jump/branch)
The function reads the arguments (if any) from the stack and then runs the function code
Return value from function is stored in the specified location (stack or special purpose CPU register)
Execution jumps back to the caller and the stack is cleared (by restoring the stack pointer to its initial value).
The details of the above vary from platform to platform and even from compiler to compiler (see e.g. STDCALL vs CDECL calling conventions). For instance, in some cases, CPU registers are used instead of storing stuff on the stack. The general idea is the same though.
You can see it for yourself:
Under Linux 'compile' your program with:
gcc -S myprogram.c
And you'll get a listing of the program in assembler (myprogram.s).
Of course you should know a little bit about assembler to understand it (but it's worth learning because it helps to understand how your computer works). Calling a function (on x86 architecture) is basically:
put variable a on stack
put variable b on stack
put variable n on stack
jump to address of the function
load variables from stack
do stuff in function
clean stack
jump back to main
What happens in the assembly?
A brief explanation: The current stack state is saved, a new stack frame is created, and the code of the called function is run. This involves inconveniencing a few registers of your microprocessor and some frantic to-and-fro reads and writes to memory; once done, the calling function's stack state is restored.
What happens? In x86, the first line of your main function might look something like:
call foo
The call instruction will push the return address on the stack and then jmp to the location of foo.
Arguments are pushed onto the stack and a "call" instruction is executed
Call is a simple "jmp" that first pushes the address of the next instruction onto the stack ("ret" at the end of a function pops that address and jumps to it)
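The push/call/ret mechanics can be played through on a toy machine. This is a simulation sketch only: the instruction set (Push, Call, AddArgs, Ret, PopArgs, Halt) is invented for illustration and does not correspond to real x86 encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Push, Call, AddArgs, Ret, PopArgs, Halt };
struct Instr { Op op; int operand; };

// Toy machine with one stack, a program counter and a result register.
int run(const std::vector<Instr>& program)
{
    std::vector<int> stack;
    int result = 0;       // plays the role of a return-value register
    std::size_t pc = 0;   // index of the instruction being executed
    for (;;) {
        assert(pc < program.size());
        const Instr in = program[pc];
        switch (in.op) {
        case Op::Push:     // caller: put an argument on the stack
            stack.push_back(in.operand);
            ++pc;
            break;
        case Op::Call:     // push the address of the NEXT instruction, then jump
            stack.push_back(static_cast<int>(pc + 1));
            pc = static_cast<std::size_t>(in.operand);
            break;
        case Op::AddArgs:  // callee: the arguments sit just below the return address
            assert(stack.size() >= 3);
            result = stack[stack.size() - 2] + stack[stack.size() - 3];
            ++pc;
            break;
        case Op::Ret:      // pop the return address and jump to it
            assert(!stack.empty());
            pc = static_cast<std::size_t>(stack.back());
            stack.pop_back();
            break;
        case Op::PopArgs:  // caller removes its arguments (cdecl style)
            stack.resize(stack.size() - static_cast<std::size_t>(in.operand));
            ++pc;
            break;
        case Op::Halt:     // the stack must be balanced again at this point
            assert(stack.empty());
            return result;
        }
    }
}

// Builds and runs a tiny program that calls an "add" routine at index 5.
int callAdd(int a, int b)
{
    const std::vector<Instr> program = {
        {Op::Push, b},     // 0: last argument is pushed first
        {Op::Push, a},     // 1
        {Op::Call, 5},     // 2: pushes 3 as the return address, jumps to 5
        {Op::PopArgs, 2},  // 3: execution resumes here after Ret
        {Op::Halt, 0},     // 4
        {Op::AddArgs, 0},  // 5: body of the called routine
        {Op::Ret, 0},      // 6
    };
    return run(program);
}
```

Calling callAdd(40, 2) walks through every step in order: two pushes, the call that records index 3 as the return address, the callee reading its arguments from below that address, the return, and the caller's cleanup.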
I think you want to take a look at call stack to get a better idea what happens during a function call: http://en.wikipedia.org/wiki/Call_stack
A very good illustration:
http://www.cs.uleth.ca/~holzmann/C/system/memorylayout.pdf
What happens?
C closely mirrors what will occur in assembly.
It is close enough to the machine that you can work out what will happen:
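void foo() {
    printf("in foo");
    /*
    mystring db 'in foo', 0
    mov eax, offset mystring    ; address of the string
    mov edx, dword ptr _printf  ; address of printf
    push eax                    ; the single argument
    call edx
    add esp, 4                  ; caller removes the one 4-byte argument (cdecl)
    ret
    // that's it
    */
}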
int main() {
foo();
return 0;
}
1- a calling context is established on the stack
2- parameters are pushed on the stack
3- a "call" is performed to the method
The general idea is that you need to
Save the current local state
Pass the arguments to a function
Call the actual function. This involves putting the return address somewhere so the RET instruction knows where to continue.
The specifics vary from architecture to architecture. And the even more specific specifics might vary between various languages. Although there usually are ways of controlling this to some extent to allow for interoperability between different languages.
A pretty useful starting point is the Wikipedia article on calling conventions. On x86 for example the stack is almost always used for passing arguments to functions. On many RISC architectures, however, registers are mainly used while the stack is only needed in exceptional cases.
The common idea is that the registers that are used in the calling method are pushed on the stack (the stack pointer is in the ESP register); this process is called "pushing the registers". Sometimes they're also zeroed, but that depends. Assembly programmers tend to free more registers than the common 4 (EAX, EBX, ECX and EDX on x86) to have more possibilities within the function.
When the function ends, the same happens in the reverse: the stack is restored to the state from before calling. This is called "popping the registers".
Update: this process does not necessarily have to happen. Compilers can optimize it away and inline your functions.
Update: normally parameters of the function are pushed on the stack in reverse order, when they are retrieved from the stack, they appear as if in normal order. This order is not guaranteed by C. (ref: Inner Loops by Rick Booth)
I'm debugging a C++ Win32 application and I'd like to call an arbitrary Win32 API from the context of that process, as though the program had run this line of code:
DestroyWindow(0x00021c0e);
But entering that into the Immediate Window gives:
CXX0017: Error: symbol "DestroyWindow" not found
Edit: Using the full name of the function, {,,user32.dll}_NtUserDestroyWindow@4, I can get the Immediate Window to understand which function I mean and display the function's address:
{,,user32.dll}_NtUserDestroyWindow@4
0x76600454 _NtUserDestroyWindow@4
but when I try to call it, this happens:
{,,user32.dll}_NtUserDestroyWindow@4(0x00021c0e);
CXX0004: Error: syntax error
Is it even possible to call a C function from the Immediate Window like this, or am I barking up the wrong tree?
Once you have the function address (as you've done in the updated question), you can try casting it to a function pointer and calling it:
(*(BOOL (*)(HWND))0x76600454)((HWND)0x00021c0e)
The first part of that casts the address to BOOL (*)(HWND), which is a pointer to a function taking an HWND parameter and returning BOOL. Then, the function pointer is dereferenced and called. Make sure to get the parameters correct, otherwise bad things will happen. On 64-bit systems, an HWND is 64 bits wide, so you might not be able to get away with passing the parameter as an int.
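The same cast can be tried in ordinary code first, where a mistake is easier to diagnose than in the Immediate Window. This is a portable sketch with hypothetical names (destroyThing, DestroyFn, callByAddress); it assumes, as mainstream platforms do, that a function pointer round-trips through std::uintptr_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical target; stands in for an API known only by its numeric address.
int destroyThing(std::uintptr_t handle) { return handle != 0 ? 1 : 0; }

// The signature the address is assumed to have. If the real function uses a
// different parameter list or calling convention, the call is undefined.
using DestroyFn = int (*)(std::uintptr_t);

// Casts a raw address to a function pointer and calls through it.
int callByAddress(std::uintptr_t address, std::uintptr_t handle)
{
    assert(address != 0);
    const DestroyFn fn = reinterpret_cast<DestroyFn>(address);
    return fn(handle);
}
```

Passing the handle as std::uintptr_t rather than int keeps the argument pointer-sized on both 32-bit and 64-bit builds.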
Edit: See the comments for the full story.
I believe the problem is that the C++ EE is having problems resolving the context of DestroyWindow. Try the following
{,,user32}DestroyWindow(0x00021c0e);
I'm not sure if the method invocation syntax supports this style of qualification (only used it for casting in the past). But it's worth a shot.
EDIT: You may or may not need to add a ! after the closing }. It's been a while since I've used this syntax and I often confuse it with the equivalent windbg one.
I figured out a workaround, but I'd still prefer to get the Immediate Window to work.
The workaround is:
get the address of the function, as shown in the question
use the Disassembly window to go to that address, and put a breakpoint there
do something to the application to make it call DestroyWindow
step back up the call stack to the caller of DestroyWindow, which looks like this:
6D096A9D push ecx
6D096A9E call dword ptr ds:[6D0BB4B8h]
put a breakpoint on the push ecx instruction, and clear the one on DestroyWindow
hit Continue, and again do something to the application to make it call that code
note down the value of ecx
change the value of ecx in the debugger to the desired value and step over the push/call
restore the value of ecx and use Set Next Statement to go back to the push, then Continue
It's longwinded, but it works. It assumes you can make the application call the appropriate API at will.