Can I access Windows Kernel system calls directly? - c++

I have been doing research into Windows internals and have just recently learned about system calls. I am wondering if it is possible to use these system calls like functions? I understand they aren't really meant to be accessed externally.
For instance: NtUserEmptyClipboard is a system call in Win32k.sys, and its address is 0x117f
If I wanted to use this call like a function, how could I do so?

What you want to do depends heavily on the architecture you're interested in, but the thing to know is that ntdll.dll is the user-mode trampoline for every syscall, i.e. the only one who actually makes syscalls at the end of the day is ntdll.
So, let's disassemble one of these methods in WinDbg, by opening up any old exe (I picked notepad). First, use x ntdll!* to find the symbols exported by ntdll:
0:000> x ntdll!*
00007ff9`ed1aec20 ntdll!RtlpMuiRegCreateLanguageList (void)
00007ff9`ed1cf194 ntdll!EtwDeliverDataBlock (void)
00007ff9`ed20fed0 ntdll!shortsort_s (void)
00007ff9`ed22abbf ntdll!RtlUnicodeStringToOemString$fin$0 (void)
00007ff9`ed1e9af0 ntdll!LdrpAllocateDataTableEntry (void)
...
So, let's pick one at random, NtReadFile looks neato. Let's disassemble it:
0:000> uf ntdll!NtReadFile
ntdll!NtReadFile:
00007ff9`ed21abe0 4c8bd1 mov r10,rcx
00007ff9`ed21abe3 b805000000 mov eax,5
00007ff9`ed21abe8 0f05 syscall
00007ff9`ed21abea c3 ret
Here, we see that we copy rcx into r10 (the syscall instruction overwrites rcx with the return address), put the syscall number into eax, then execute the syscall instruction. Every syscall has a number that is assigned arbitrarily by Windows (i.e. this number is a secret handshake between ntdll and the kernel, and changes whenever Microsoft wants).
None of these instructions are "magic", you could execute them in your app directly too (but there's no practical reason to do so, of course - just for funsies)
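To make the layout of that stub concrete, here is a small sketch that pulls the syscall number out of the bytes shown in the disassembly. The helper name read_syscall_number is made up for illustration, and the byte layout it checks (mov r10,rcx followed by mov eax,imm32) is only the one from this listing; stubs on other Windows builds may look different.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: checks that the bytes start with
//   4c 8b d1        mov r10, rcx
//   b8 <imm32>      mov eax, <syscall number>
// and, if so, stores the little-endian imm32 in `number`.
bool read_syscall_number(const std::uint8_t* stub, std::size_t size,
                         std::uint32_t& number) {
    if (size < 8) return false;
    if (stub[0] != 0x4c || stub[1] != 0x8b || stub[2] != 0xd1) return false;
    if (stub[3] != 0xb8) return false;
    number = static_cast<std::uint32_t>(stub[4]) |
             (static_cast<std::uint32_t>(stub[5]) << 8) |
             (static_cast<std::uint32_t>(stub[6]) << 16) |
             (static_cast<std::uint32_t>(stub[7]) << 24);
    return true;
}

// The exact bytes of ntdll!NtReadFile from the listing above.
constexpr std::uint8_t kNtReadFileStub[] = {
    0x4c, 0x8b, 0xd1,              // mov r10, rcx
    0xb8, 0x05, 0x00, 0x00, 0x00,  // mov eax, 5
    0x0f, 0x05,                    // syscall
    0xc3                           // ret
};
```

Feeding kNtReadFileStub to read_syscall_number yields 5, matching the mov eax,5 in the listing. Since the number can change between builds, anything read this way is only valid for the ntdll it was read from.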

EmptyClipboard is one of so-called "Win32 API" and NtUserEmptyClipboard is a corresponding "native API".
http://en.wikipedia.org/wiki/Native_API
Unlike Linux syscall(2), we are rarely supposed to directly call "native API". I heard they are in ntdll.dll rather than win32k.sys. But we should be able to invoke them just like normal functions defined in a normal DLL.
Is there any way to call the Windows Native API functions from the user mode?

I strongly doubt that 0x117f is the address you're looking for. I suspect it might be the value which you need to pass to GetProcAddress. But I don't know for sure, since those things vary across Windows versions (that's why ordinary people use documented functions instead)

The main part of the native API is exported via normal functions from ntdll.dll. You can load this dll into your process and call these functions just like any other API functions. As long as you have the right function prototypes and parameters, the calls will work just fine. What they do internally is transition from usermode to kernelmode and then they use an offset into the system service descriptor table (SSDT) to find the address of the function in kernel mode memory, and then the function is called. There is an open source project http://nativetest.codeplex.com/ that makes calls to the native api that you might refer to.
The functions in win32k.sys are not exposed in ntdll.dll. As far as I can tell they are not exposed anywhere. The address you have listed - I believe - is actually an offset into the SSDT. If you really needed to call this function, you would have to make the transition from usermode to kernelmode yourself, putting all the parameters for the function and the SSDT offset into the right places.
As others have recommended, I would suggest to find the usermode API to help accomplish what you want to do. FWIW, in user32.dll the function EmptyClipboard appears to forward directly to NtUserEmptyClipboard, according to the link /dump output.
1731 DF 0002018A EmptyClipboard = _NtUserEmptyClipboard#0

EndScene hook questions

So recently I wanted to add an imgui interface to an example window using DirectX, and I learned from a video that I had to hook the EndScene function using the DirectX 9 SDK to be able to add my custom imgui interface.
However I have some questions:
Where can I find any documentation for the DirectX9 functions and types (if there is any, because I do not understand why we specifically have to hook the EndScene function), or where could I find any article explaining more in depth how DirectX works?
I've seen two versions of EndScene hooks so far: one with a pattern-scanning function which scans for a signature in the shaderapi DLL, and another which creates a DirectX device and then accesses the vtable from there. Are there any sources online for these, or is it something we have to work out ourselves?
Here is the version I have:
while (!DirectXDevice) // loops until it finds the device
    DirectXDevice = **(DWORD**)(FindPattern("shaderapidx9.dll", "A1 ?? ?? ?? ?? 50 8B 08 FF 51 0C") + 0x1);
void** pVTable = *reinterpret_cast<void***>(DirectXDevice); // getting the vtable array
oEndScene = (f_EndScene)DetourFunction((PBYTE)pVTable[42], (PBYTE)Hooked_EndScene); // getting the virtual function at index 42 and detouring it to our own
I don't really understand what __stdcall does here, I do know it is used to call WINAPI functions but what for here?
HRESULT __stdcall Hooked_EndScene(IDirect3DDevice9* pDevice) { /* some code */ }
Note: that's the function I hook to the original EndScene.
Thank you very much. I'm sorry if there are a lot of questions, but I really can't wrap my head around this.
How do you know which functions you need to hook?
To put it bluntly, you have to be an experienced DirectX graphics programmer to find that out. Don't expect being able to hook into a framework that you don't understand. It just so happens that EndScene will always be called after all the other draw calls on the render target.
There are tons of D3D9 programming resources available, online and in paper form. Most of them are not free. I'm afraid this is not the answer you were hoping for.
What is the deal with pattern scanning, or creating a temporary D3D9 device?
Microsoft did not put any explicit effort into making EndScene hookable. It just happens to be hookable because every normal function is hookable. You need a way to find the function in memory, because the function will not always be at the same address.
One approach is to scan for known instructions that appear inside the function. Someone needs to be the first person to find out that pattern that you can scan for. You are far from the first person to hook EndScene, so many have reverse-engineered the function before and shared searchable patterns.
NOTE: The pattern does not necessarily need to be directly inside the target function. It might also lead you to something else first, in your case, the IDirect3DDevice9 instance. The important thing is that you can find your way to the EndScene function somehow.
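For a feel of what a pattern scan does, here is a minimal sketch of a wildcard byte search over a plain buffer. It is not the FindPattern from the question (whose source isn't shown); the names parse_pattern and find_pattern are made up, and a real scanner would walk the memory of the loaded module rather than a local array.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One pattern element: either a concrete byte or a wildcard ("??").
struct PatternByte {
    std::uint8_t value;
    bool wildcard;
};

// Parse a signature such as "A1 ?? ?? ?? ?? 50 8B 08 FF 51 0C".
std::vector<PatternByte> parse_pattern(const std::string& text) {
    std::vector<PatternByte> out;
    std::istringstream in(text);
    std::string token;
    while (in >> token) {
        if (token == "??" || token == "?") {
            out.push_back(PatternByte{0, true});
        } else {
            const unsigned long v = std::stoul(token, nullptr, 16);
            out.push_back(PatternByte{static_cast<std::uint8_t>(v), false});
        }
    }
    return out;
}

// Return the offset of the first match in [data, data + size), or -1 if none.
std::ptrdiff_t find_pattern(const std::uint8_t* data, std::size_t size,
                            const std::string& text) {
    const std::vector<PatternByte> pat = parse_pattern(text);
    if (pat.empty() || pat.size() > size) return -1;
    for (std::size_t i = 0; i + pat.size() <= size; ++i) {
        bool match = true;
        for (std::size_t j = 0; j < pat.size(); ++j) {
            if (!pat[j].wildcard && data[i + j] != pat[j].value) {
                match = false;
                break;
            }
        }
        if (match) return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}
```

In the question's signature, A1 is the opcode of mov eax,[address], so the four wildcard bytes are the address operand. That is presumably why the question's code adds 0x1 to the match and dereferences what it finds there.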
Another approach is to get a pointer to the function. If it was a regular C function, that would be easy. It's hard here because OOP tends to make these things hard - you have to fight your way through various interfaces to get the correct vtable.
Both methods have advantages and disadvantages -- creating a D3D9 device is safer, but also more intrusive, because the target process might not expect someone to just randomly create new devices.
Why does the hook function need __stdcall?
Since you replace the original function with your hooked version, the calling convention of the hooked function must be the same as the calling convention of the original function. The caller of EndScene expects (and was compiled with) a __stdcall convention, so the new function must also behave the same way, otherwise the stack will be corrupted. Your act of replacing the function does not change the way the caller calls it.

MSVC: Reading a specific 64 or 32 bit register (e.g. R10) in 64 bit code?

Is there any way with MSVC to read a specific 64 (or 32) bit register directly in a normal C++ function?
For example, can I read the contents of r10 somehow via any intrinsics or such?
For context:
I'm implementing a variadic function (let's call it my_func), which needs to forward its call to another variadic function, and add one more argument along the way (an ID if you will, any numeric type will do - a 16, 32 or 64 bit integer for example, doesn't matter too much).
I need to do this forwarding in as few instructions as possible, so I can't process the variadic list in the initial function and just forward the va_list or such.
So I've implemented my_func in assembly:
; This function needs to be as compact as possible
my_func PROC
    ; assume 123 is the ID to be passed along with the arguments that my_func is called with
    mov r10, 123
    jmp address_of_the_real_target_function
my_func ENDP
I just jump to the target function, and pass the ID in a separate register - R10 in this case.
ARG* the_real_target_function(ARG* arg0, ...)
{
    auto id = ReadRegister();
    // ... do stuff ...
}
This works well so far - only nuisance being that I needed another assembly helper function to read R10 back in the proper C++ function,
ReadRegister PROC
mov rax, r10
ret
ReadRegister ENDP
which is a bit annoying as that call won't get inlined.
Hence the question - is there any way to read this register directly in C++?
(Otherwise, I was thinking of maybe utilizing SSE registers, which should be readable via intrinsics - but curious if there's a way to do this with just 64 - or 32 - bit registers)
Thanks
--
edit: I believe this is not a duplicate of the linked topic. The solutions listed there are specific to other compilers or, in the case of MSVC, 32-bit only (inline assembly is not supported on x64)
--
edit 2: For more context on why I'm trying to do this.
This is intended to be an Excel Addin (which will host plugins and expose their functions to Excel, basically).
In order to register a function in Excel, I need to bind it to a specific function exported by my DLL. I don't know in advance (= at compile time) how many, or what plugin functions need to be registered and called.
So I need to implement loads of exported functions - thousands. Enough to always have registration slots for all plugins available.
In order to keep the overall size of the DLL in check, I need the registered functions to be very slim, and ideally also be capable of dealing with variadic args (as I don't know what shape the plugin functions have at compile-time; and due to the space-constraints, I want to avoid creating callbacks for any possible arity of arguments)
And for even more added fun, it needs to work in x64 and x86 - though in the latter case, the function is called by Excel via stdcall convention, so the usual C++ variadic args won't work. But, at least at runtime I can find out the number (and type) of args passed to the function, so I should be able to handle the stack myself.
So bottom line, my idea is to have these slim trampoline functions, which will forward all arguments, plus their ID, to some central handler (as per above in X64; and via stack in X86).
The handler then gets things a bit into order - i.e. creates some standardized iterator for the arguments, calls the actual plugin function registered via that ID etc.
A static thread_local variable would take a few instructions, so it is not as slim as you may want.
Yet it would be fully portable.
There's a less portable but more instruction-efficient way.
Note the arbitrary data slot in the TEB (ArbitraryUserPointer in NT_TIB).
So __readfsdword(0x14)/__writefsdword(0x14) on x86 and __readgsqword(0x28)/__writegsqword(0x28) on x64 may do the trick, provided no one else is using the same slot for another purpose.
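Here's a rough sketch of the portable thread_local variant, just to show the shape of it. The template my_func stands in for one of the assembly trampolines (a real one would store the ID and jmp, not call), g_pending_id is a made-up name, and the handler only sums ints so there is something to observe.

```cpp
#include <cstdarg>
#include <cstdint>

// Per-thread slot that carries the ID from the trampoline to the handler.
// (Hypothetical name; it plays the role R10 plays in the question.)
static thread_local std::uint32_t g_pending_id = 0;

// Stand-in for the real variadic target: reads the ID first, then sums
// `count` int arguments so the result shows both pieces of information.
long the_real_target_function(int count, ...) {
    const std::uint32_t id = g_pending_id;  // replaces ReadRegister()
    va_list args;
    va_start(args, count);
    long sum = 0;
    for (int i = 0; i < count; ++i) {
        sum += va_arg(args, int);
    }
    va_end(args);
    return static_cast<long>(id) * 1000 + sum;
}

// Stand-in for one trampoline: store the ID, then forward all arguments
// untouched. In the question this would be hand-written assembly.
template <std::uint32_t Id, typename... Args>
long my_func(Args... args) {
    g_pending_id = Id;
    return the_real_target_function(args...);
}
```

With this, my_func<123>(3, 1, 2, 3) reaches the handler with id 123 and the three ints intact. The handler has to read the slot before it calls anything that might run another trampoline on the same thread.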

Using a Windows kernel function via GetModuleHandle

I would like to use FsRtlIsDbcsInExpression (https://msdn.microsoft.com/en-us/library/windows/hardware/ff546803(v=vs.85).aspx) to do wildcard checking exactly the same way Windows does it natively, without having to re-implement it in my program. When I use:
auto module = GetModuleHandle(TEXT("NtosKrnl.exe"));
module turns up null. From what I can find on the internet, since this is a kernel mode function, KernelGetModuleBase is required. However, this function doesn't seem to resolve automatically and there are no MSDN docs on it, so I am doubtful that is the solution. Does anyone have pointers for how to use this function?
GetModuleHandle for ntoskrnl is going to fail because it's not loaded into your memory space. You can only call such functions from kernel.
You might want to try the PathMatchSpec function (https://msdn.microsoft.com/en-us/library/windows/desktop/bb773727%28v=vs.85%29.aspx). It appears to do the same job.

calling kernel32.dll function without including windows.h

If kernel32.dll is guaranteed to be loaded into a process's virtual memory, why couldn't I call a function such as Sleep without including windows.h?
Below is an excerpt from vividmachine.com:
5. So, what about windows? How do I find the addresses of my needed DLL functions? Don't these addresses change with every service pack upgrade?
There are multitudes of ways to find the addresses of the functions that you need to use in your shellcode. There are two methods for addressing functions; you can find the desired function at runtime or use hard coded addresses. This tutorial will mostly discuss the hard coded method. The only DLL that is guaranteed to be mapped into the shellcode's address space is kernel32.dll. This DLL will hold LoadLibrary and GetProcAddress, the two functions needed to obtain any functions address that can be mapped into the exploits process space. There is a problem with this method though, the address offsets will change with every new release of Windows (service packs, patches etc.). So, if you use this method your shellcode will ONLY work for a specific version of Windows. Further dynamic addressing will be referenced at the end of the paper in the Further Reading section.
The article you quoted focuses on getting the address of the function. You still need the function prototype of the function (which doesn't change across versions), in order to generate the code for calling the function - with appropriate handling of input and output arguments, register values, and stack.
The windows.h header provides the function prototype that you wish to call to the C/C++ compiler, so that the code for calling the function (the passing of arguments via register or stack, and getting the function's return value) can be generated.
After knowing the function prototype by reading windows.h, a skillful assembly programmer may also be able to write the assembly code to call the Sleep function. Together with the function's address, these are all you need to make the function call.
With some black magic you can ;). There have been many custom implementations of GetProcAddress, which would allow you to get away with not needing windows.h. This, however, isn't the be-all and end-all, and could end up causing problems due to internal Windows changes. Another method is using the Tool Help API to enumerate the modules in the process to get kernel32.dll's base, then spelunking its PE headers for the EAT and grabbing the address of GetProcAddress. From there you just need function pointer prototypes to call the addresses correctly (and any structure definitions needed), which isn't too hard, merely labour-intensive (if you have many functions). In fact, under Windows XP this is required to disable DEP due to service pack differences. Of course, you need windows.h as a reference to get this; you just don't need to include it.
You'd still need to declare the function in order to call it, and you'd need to link with kernel32.lib. The header file isn't anything magic, it's basically just a lot of function declarations.
I can do it with 1 line of assembly and then some helper functions to walk the PEB by hard-coding the correct offsets to different members.
You'll have to start here:
static void*
JMIM_ASM_GetBaseAddr_PEB_x64()
{
    void* base_address = 0;
    unsigned long long var_out = 0;
    // GCC-style inline assembly: on x64 Windows, gs:0x60 holds the address of the PEB.
    __asm__(
        " movq %%gs:0x60, %[sym_out] ; \n\t"
        :[sym_out] "=r" (var_out) //:OUTPUTS
    );
    //: printf("[var_out]:%d\n", (int)var_out);
    base_address=(void*)var_out;
    return( base_address );
}
Then use windbg on an executable file to inspect the data structures on your machine.
A lot of the values you'll be needing are hard to find and only really documented by random hackers. You'll find yourself on a lot of malware writing sites digging for answers.
dt nt!_PEB -r @$peb
was pretty useful in windbg to get information on the PEB.
There is a full working implementation of this in my game engine.
Just look in: /DEP/PEB2020 for the code.
https://github.com/KanjiCoder/AAC2020
I don't include <windows.h> in my game engine, yet I use "GetProcAddress" and "LoadLibraryA". It might be inadvisable to do this, but my thought was: the more moving parts, the more that can go wrong. So I figured I'd take "#define WIN32_LEAN_AND_MEAN" to its absurd conclusion and not include <windows.h> at all.

how to create a trampoline function using DetourAttachEx? (with MS detours)

I have a DLL and I wish to create a detour to one of its exported functions.
The DLL is not part of Windows.
I need to be able to call the real function after my detour (call the real function from the detoured one).
I know the exact signature of the function.
I have already been able to detour the function, but right now I can't call the real one.
I realize I need to use a trampoline function; I've seen examples online.
The problem is: all those examples show how to detour a Windows API function, and I need to do the same for a function I get through a DLL import.
Any help would be welcome.
--edit
Just to clarify, I have attempted to call the original function by its pointer, but that does not work.
I also tried using the method from this stack overflow article;
that doesn't even crash, but it looks like it goes into an infinite loop (I assume because in the original function there is a jump to the detoured one).
edit -- solved!
Not sure what solved it.
Used this as reference.
Stopped using GetProcAddress and started using DetourFindFunction instead.
Cleaned up the code (pretty sure I cleaned out whatever caused the issue).
It works.
Thanks anyway.
I don't use detours(I actually detest it!), but detouring any non hot-patchable function can be done in a generic manner, like so:
Step 1:
Insert a JMP <your code> at the start of the function. This takes 5 bytes, probably a little more to align to the nearest instruction boundary. As an example,
the start of the function to hook:
SUB ESP,3C
PUSH EDI
PUSH ESI
//more code
would become:
JMP MyFunc
//more code
One would do this by writing 0xE9 at the first byte, then writing the relative displacement (function_addr - (patch_addr + 5)) into the following DWORD; the displacement is measured from the end of the 5-byte JMP instruction. Writing should be done using WriteProcessMemory after setting read/write/execute permissions with VirtualProtectEx.
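The displacement arithmetic is easy to get wrong by a few bytes, so here is a small sketch that only builds the five patch bytes for a 32-bit process (it does not write them anywhere). The name encode_jmp_rel32 is made up for illustration.

```cpp
#include <array>
#include <cstdint>

// Build the 5 bytes of a near "JMP rel32" that sits at patch_addr and lands
// on target_addr. The CPU adds rel32 to the address of the *next*
// instruction, i.e. patch_addr + 5. Unsigned arithmetic wraps modulo 2^32,
// which gives the right two's-complement value for backward jumps too.
std::array<std::uint8_t, 5> encode_jmp_rel32(std::uint32_t patch_addr,
                                             std::uint32_t target_addr) {
    const std::uint32_t rel = target_addr - (patch_addr + 5);
    return {{
        0xE9,                                  // opcode: JMP rel32
        static_cast<std::uint8_t>(rel),        // displacement, little-endian
        static_cast<std::uint8_t>(rel >> 8),
        static_cast<std::uint8_t>(rel >> 16),
        static_cast<std::uint8_t>(rel >> 24),
    }};
}
```

For example, a patch at 0x1000 jumping to 0x2000 needs a displacement of 0x2000 - 0x1005 = 0xFFB, so the bytes come out as E9 FB 0F 00 00.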
Step 2:
next, we create an assembly interface:
void __declspec(naked) MyFunc()
{
    __asm
    {
        call Check ; call our filter func
        test eax,eax ; test if we let the call through
        je _EXIT
        sub esp,3Ch ; it's gone through, so we replicate what we overwrote
        push edi
        push esi
        jmp NextExecutionAddress ; now we jump back to the location just after our jump
    _EXIT:
        retn ; note, this must have the correct stack cleanup
    }
}
NextExecutionAddress will need to be filled at run time using ModuleBase + RVA.
To be honest, it's way easier, and better(!), to just EAT (Export Address Table) hook the export table of the DLL, or IAT (Import Address Table) hook the import tables of whatever is calling the functions you want to filter. Detours should have functions for these types of hooks; if not, there are other freely available libs to do it.
The other way would be to use Detours to hook every call in the apps using the DLL, to reroute them to a proxy function in your own code. This has the advantage of allowing one to filter only certain calls, and not everything across a binary (it is possible to do the same using _ReturnAddress, but that's more work). The disadvantage, though, is capturing the locations to patch (I use ollydbg + a custom patching engine), and it won't work on functions with non-regular calling conventions (like those made with #pragma aux in Watcom or the optimized calls generated by VC7+).
One important thing to note: if you're hooking a multithreaded app, your patches need to be done with the app suspended, or be done atomically using InterlockedExchange, InterlockedExchange64 and InterlockedExchangePointer (I use the latter for all IAT/EAT hooks, especially when hooking from a 'third party process').
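As a rough illustration of the atomic pointer swap, here is a portable sketch that uses std::atomic as a stand-in for InterlockedExchangePointer, with a global g_slot playing the role of a single IAT entry. All names are made up, and a real IAT hook would also have to locate the entry and make its page writable first.

```cpp
#include <atomic>

using TargetFn = int (*)(int);

// Stand-in for the function exported by the DLL.
int original_function(int x) { return x + 1; }

// Stand-in for one IAT entry: callers always go through this pointer.
std::atomic<TargetFn> g_slot{&original_function};

// Where the hook remembers the function it replaced.
std::atomic<TargetFn> g_original{nullptr};

// The filter: does its own work, then calls the real function.
int filter_function(int x) {
    const TargetFn next = g_original.load();
    return next ? next(x) * 10 : -1;
}

// Install the hook. The original is published before the slot is swapped,
// so a thread that enters filter_function right after the swap already
// sees a valid g_original.
void install_hook() {
    g_original.store(g_slot.load());
    g_slot.exchange(&filter_function);
}

// What a caller in the hooked module effectively does.
int call_through_slot(int x) { return g_slot.load()(x); }
```

Before install_hook, call_through_slot(4) goes straight to original_function; afterwards it goes through filter_function, which still reaches the original via the saved pointer. Because the slot changes in a single atomic store, no thread can observe a half-written pointer.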
Looking at the post you link to, the method there is horrible in my opinion, mainly due to the assembly :P But how are you calling this pointer you obtain, and how is it obtained?