Problem with hooking ntdll.dll calls - c++

I'm currently working on hooking ntdll.dll calls via dll injection.
First, I create a thread in an existing process via CreateRemoteThread(), then load my DLL via LoadLibrary, and finally install the hooks on DLL_PROCESS_ATTACH.
Injection works fine, but I then want to log all registry and file system queries, and that part doesn't work properly.
I decided to publish the code via PasteBin because it is pretty big. Here is the link:
http://pastebin.com/39r4Me6B
I'm trying to hook ZwOpenKey, log the key content, and then call the "true" function through a pointer. My NOpenKey function gets executed, but then the process stops without any errors.
Does anyone see any issues?

If you look at ZwOpenKey in OllyDbg, it starts with a 5-byte instruction, MOV EAX,77.
You can overwrite those 5 bytes with JMP _myZwOpenKey. From there you can do whatever you want with the values on the stack, then restore all registers, execute the overwritten MOV EAX,77 yourself (the system call number has to be back in EAX), and do a JMP 7C90D5B5, which is the address of ZwOpenKey + 5 bytes.
CPU Disasm
Address Hex dump Command Comments
7C90D5AF 90 NOP
7C90D5B0 /$ B8 77000000 MOV EAX,77 ; ntdll.ZwOpenKey(guessed Arg1,Arg2,Arg3)
7C90D5B5 |. BA 0003FE7F MOV EDX,7FFE0300
7C90D5BA |. FF12 CALL DWORD PTR DS:[EDX]
7C90D5BC \. C2 0C00 RETN 0C
I usually do these in assembly; that way I don't have to mess around much with type casting and all that. Hope this helps.
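For anyone who prefers to build the patch from C++, here is a minimal sketch of the arithmetic behind the 5-byte JMP. The names EncodeJmpRel32 and ResumeAddress are made up for illustration, and the sketch only computes bytes; actually writing them over ZwOpenKey (for example after making the page writable with VirtualProtect) is not shown.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: encode a 5-byte x86 near jump (opcode E9 + rel32)
// located at address `from` that transfers control to address `to`.
// The displacement is counted from the end of the instruction (from + 5).
std::array<std::uint8_t, 5> EncodeJmpRel32(std::uint32_t from, std::uint32_t to) {
    const std::uint32_t rel = to - (from + 5);  // unsigned wrap-around is intended
    return {0xE9,
            static_cast<std::uint8_t>(rel),        // little-endian byte order
            static_cast<std::uint8_t>(rel >> 8),
            static_cast<std::uint8_t>(rel >> 16),
            static_cast<std::uint8_t>(rel >> 24)};
}

// Hypothetical helper: the address the hook jumps back to, i.e. the first
// instruction after the overwritten bytes.
constexpr std::uint32_t ResumeAddress(std::uint32_t hooked, std::uint32_t patchSize = 5) {
    return hooked + patchSize;
}
```

With the listing above, ResumeAddress(0x7C90D5B0) gives 7C90D5B5, and EncodeJmpRel32(0x7C90D5B0, hookAddress) gives the bytes that replace MOV EAX,77, where hookAddress is wherever _myZwOpenKey ends up in the process.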


How can I debug CodeStubAssembler (CSA) code in V8 line by line

I have seen the good answer to my question in "Debugging CodeStubAssembler (CSA) code in V8". However, I really cannot understand the point "You can then step through the CSA code as it emits a Turbofan IR graph which the Turbofan backend will then translate to machine code" in upshot one. Can I debug CSA line by line, following the source code, in that way?
To express my needs more clearly, here are some code examples:
2864 TNode<Smi> CodeStubAssembler::BuildAppendJSArray(ElementsKind kind,
2865 TNode<JSArray> array,
2866 CodeStubArguments* args,
2867 TVariable<IntPtrT>* arg_index,
2868 Label* bailout) {
2869 Comment("BuildAppendJSArray: ", ElementsKindToString(kind));
2870 Label pre_bailout(this);
2871 Label success(this);
2872 TVARIABLE(Smi, var_tagged_length);
The above is CSA code. Can I enter 'n' in gdb and single-step from line 2869 to line 2870?
Just to be clear, can I get the following format when debugging CSA code?
[───────────────────────────────────────────────────────────────────────────────────────DISASM───────────────────────────────────────────────────────────────────────────────────────]
0x7f9fc9bcaca5 mov rax, qword ptr [rbp - 0x60]
0x7f9fc9bcaca9 mov rcx, qword ptr fs:[0x28]
0x7f9fc9bcacb2 mov rdx, qword ptr [rbp - 8]
0x7f9fc9bcacb6 cmp rcx, rdx
0x7f9fc9bcacb9 mov qword ptr [rbp - 0xb0], rax
► 0x7f9fc9bcacc0 jne 0x7f9fc9bcacd6
0x7f9fc9bcacc6 mov rax, qword ptr [rbp - 0xb0]
0x7f9fc9bcaccd add rsp, 0xb0
0x7f9fc9bcacd4 pop rbp
0x7f9fc9bcacd5 ret
0x7f9fc9bcacd6 call __stack_chk_fail@plt <0x7f9fcb191dc0>
[───────────────────────────────────────────────────────────────────────────────────────SOURCE───────────────────────────────────────────────────────────────────────────────────────]
457 // static
458 MaybeHandle<Object> Execution::Call(Isolate* isolate, Handle<Object> callable,
459 Handle<Object> receiver, int argc,
460 Handle<Object> argv[]) {
461 return Invoke(isolate, InvokeParams::SetUpForCall(isolate, callable, receiver,
462 argc, argv));
463 }
464
465 MaybeHandle<Object> Execution::CallBuiltin(Isolate* isolate,
466 Handle<JSFunction> builtin,
[───────────────────────────────────────────────────────────────────────────────────────STACK
Yes, you can do that, just like for any other C++ code.
Of course, this code runs as part of mksnapshot, and what it does is it creates (part of) a "builtin" code object that can handle appending elements to JavaScript arrays. Line 2869 will put a comment into the code object's comment section (if you are running with the --code-comments flag), line 2870 will define a label that will be used for conditional jumps later.
So just to be clear, this code does not run when you actually append elements to arrays. At that time, the builtin generated by this code will run, and debugging that requires different techniques (see the other answer).
EDIT to address questions in comments:
If I enter p kind at line 2870, can I get the value of kind? If I enter p ElementsKindToString in this function, can I get the address of the function ElementsKindToString?
Yes, of course, this is plain C++. (Also, why do you ask? Just try it!)
How could I break in gdb before the Turbofan backend translates this function to machine code, and get the debugging format I posted above?
Run mksnapshot in GDB and set a breakpoint on the line you want, then switch the view mode as desired. (Again, this is regular GDB usage; if you need a GDB tutorial then please search for one, there are plenty on the 'net.)
While you haven't directly asked for it, I have a suspicion that what you really want to do is step through the generated builtins instruction-by-instruction and see the CSA source that was responsible for generating them. That, unfortunately, is not possible, because the builtins and their generators run at different times (and even in different binaries!).
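To make the "different times" point concrete, here is a toy analogy in plain C++. Nothing in it is V8 API: Op, BuildAddOne and Run are invented names, and the "instructions" are just enum values. BuildAddOne plays the role of the CSA generator (it only records what the builtin should do), while Run plays the role of executing the finished builtin later on.

```cpp
#include <cstdint>
#include <vector>

// Toy "instructions" standing in for generated machine code.
enum class Op : std::uint8_t { Push1, Add, Halt };

// Toy generator: it runs once, ahead of time, and only records instructions.
// Stepping through it in a debugger shows the push_back calls,
// not the behaviour of the code being generated.
std::vector<Op> BuildAddOne() {
    std::vector<Op> code;
    code.push_back(Op::Push1);
    code.push_back(Op::Add);
    code.push_back(Op::Halt);
    return code;
}

// Toy runtime: executes code that was generated earlier.
// The generator's source lines are no longer involved at this point.
int Run(const std::vector<Op>& code, int input) {
    int acc = input;
    int operand = 0;
    for (Op op : code) {
        switch (op) {
            case Op::Push1: operand = 1; break;
            case Op::Add:   acc += operand; break;
            case Op::Halt:  return acc;
        }
    }
    return acc;
}
```

A breakpoint inside BuildAddOne is hit while the code is being built; a breakpoint inside Run is hit when it executes. Stepping with 'n' in the first never takes you through the second.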

What happens when I double click an executable, technically

tl;dr
I'm trying to understand the difference between running a program directly by double-clicking the executable and running it through a terminal or programmatically via CreateProcess in Windows 10.
long version
Here is my problem: running an old game (circa 2003, using D3D8) by double-clicking it in Windows 10 works okay. Running the game through a secondary executable (also circa 2003) that uses CreateProcess seems to work okay sometimes.
But running it through my new Go executable never works; I get a very tiny screen instead. So I want to understand what the technical difference is.
For reference, my Go code looks like this: (tiny screen)
cmd := exec.Command(filepath.Join(".", "Game.exe"))
err := cmd.Start()
Disassembling the secondary executable gives me this: (normal screen)
CPU Disasm
Address Hex dump Command Comments
004043AF |. 51 PUSH ECX ; /pProcessInformation = 59C7F521 -> {hProcess=???,hThread=???,ProcessID=???,ThreadID=???}
004043B0 |. 52 PUSH EDX ; |pStartupInfo => OFFSET LOCAL.16
004043B1 |. 68 28E54000 PUSH OFFSET 0040E528 ; |CurrentDirectory = "."
004043B6 |. 50 PUSH EAX ; |pEnvironment => NULL
004043B7 |. 50 PUSH EAX ; |CreationFlags => 0
004043B8 |. 6A 01 PUSH 1 ; |InheritHandles = TRUE
004043BA |. 50 PUSH EAX ; |pThreadSecurity => NULL
004043BB |. 50 PUSH EAX ; |pProcessSecurity => NULL
004043BC |. 68 10E54000 PUSH OFFSET 0040E510 ; |CommandLine = "Game.exe"
004043C1 |. 50 PUSH EAX ; |ApplicationName => NULL
004043C2 |. FF15 8CB04000 CALL DWORD PTR DS:[<&KERNEL32.CreateProc ; \KERNEL32.CreateProcessA
When I say the game is a tiny screen, it shows up like this:
versus this when it is executed directly with a double click: (don't mind it being black, it's just the normal startup)
Additional info: the problem actually only exists in Windows 10. Windows 7 is completely fine.
For Windows 10, the only way to get the normal screen is to use this setting:
When that is used, I get the normal screen when using double click or the secondary executable. But it's still a tiny screen with my Go app.

ORG alternative for C++

In assembly we use the org instruction to set the location counter to a specific location in the memory. This is particularly helpful in making Operating Systems. Here's an example boot loader (From wikibooks):
org 7C00h
jmp short Start ;Jump over the data (the 'short' keyword makes the jmp instruction smaller)
Msg: db "Hello World! "
EndMsg:
Start: mov bx, 000Fh ;Page 0, colour attribute 15 (white) for the int 10 calls below
mov cx, 1 ;We will want to write 1 character
xor dx, dx ;Start at top left corner
mov ds, dx ;Ensure ds = 0 (to let us load the message)
cld ;Ensure direction flag is cleared (for LODSB)
Print: mov si, Msg ;Loads the address of the first byte of the message, 7C02h in this case
;PC BIOS Interrupt 10 Subfunction 2 - Set cursor position
;AH = 2
Char: mov ah, 2 ;BH = page, DH = row, DL = column
int 10h
lodsb ;Load a byte of the message into AL.
;Remember that DS is 0 and SI holds the
;offset of one of the bytes of the message.
;PC BIOS Interrupt 10 Subfunction 9 - Write character and colour
;AH = 9
mov ah, 9 ;BH = page, AL = character, BL = attribute, CX = character count
int 10h
inc dl ;Advance cursor
cmp dl, 80 ;Wrap around edge of screen if necessary
jne Skip
xor dl, dl
inc dh
cmp dh, 25 ;Wrap around bottom of screen if necessary
jne Skip
xor dh, dh
Skip: cmp si, EndMsg ;If we're not at end of message,
jne Char ;continue loading characters
jmp Print ;otherwise restart from the beginning of the message
times 0200h - 2 - ($ - $$) db 0 ;Zerofill up to 510 bytes
dw 0AA55h ;Boot Sector signature
;OPTIONAL:
;To zerofill up to the size of a standard 1.44MB, 3.5" floppy disk
;times 1474560 - ($ - $$) db 0
Is it possible to accomplish this with C++? Is there any command, function, etc. like org with which I can change the location of the program?
No, it's not possible to do in any C compiler that I know of. You can, however, create your own linker script that places the code/data/bss segments at specific addresses.
Just for clarity, the org directive does not load the code at the specified address; it merely informs the assembler that the code will be loaded at that address. The code shown appears to be for Nasm (or similar). In AT&T syntax, the .org directive does something different: it pads the code to that address, similar to the times line in the Nasm code. Nasm can do this because in -f bin mode, it "acts as its own linker".
The important thing for the code to know is the address where Msg can be found. The jmps and jnes (and call and ret which your example doesn't have, but a compiler may generate) are relative addressing mode. We code jmp target but the bytes that are actually emitted say jmp distance_to_target (plus or minus) so the address doesn't matter.
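As a small worked example of that distinction, the sketch below computes both kinds of value for the boot sector above. The function names are made up for illustration; the offsets come from the listing (the 2-byte jmp short sits at offset 0, Msg at offset 2, and Start follows the 13-byte message at offset 15).

```cpp
#include <cstdint>

// Absolute address of a label: origin + offset within the image.
// This is the value that changes when the load address (the org) changes.
constexpr std::uint16_t AbsoluteAddress(std::uint16_t origin, std::uint16_t offset) {
    return static_cast<std::uint16_t>(origin + offset);
}

// Displacement byte of a short jump (EB rel8), counted from the end of the
// 2-byte instruction. Only the distance between the two addresses matters.
constexpr std::int8_t ShortJumpDisplacement(std::uint16_t jmpAddress,
                                            std::uint16_t targetAddress) {
    return static_cast<std::int8_t>(targetAddress - (jmpAddress + 2));
}

// mov si, Msg needs the absolute address, so it depends on org 7C00h.
static_assert(AbsoluteAddress(0x7C00, 2) == 0x7C02, "Msg with org 7C00h");
static_assert(AbsoluteAddress(0x0000, 2) == 0x0002, "Msg with org 0");

// jmp short Start encodes the same displacement wherever the code is loaded.
static_assert(ShortJumpDisplacement(0x7C00, 0x7C0F) == 13, "loaded at 7C00h");
static_assert(ShortJumpDisplacement(0x0000, 0x000F) == 13, "loaded at 0");
```

The first pair of checks is why the assembler (or linker) has to be told the load address; the second pair is why the jumps would work without it.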
Gas doesn't do this; it emits a linkable object file. To use ld without a linker script, the command line looks something like:
ld -o boot.bin boot.o --oformat binary -Ttext=0x7C00
(don't quote me on that exact syntax but "something like that") If you can get a linkable object file from your (16-bit capable!) C++ compiler, you might be able to do the same.
In the case of a bootsector, the code is loaded by the BIOS (or fake BIOS) at 0x7C00 - one of the few things we can assume about the bootsector. The sane thing for a bootsector to do is not fiddle-faddle around printing a message, but to load something else. You'll need to know how to find the something else on the disk and where you want to load it to (perhaps where your C++ compiler wants to put it by default) - and jmp there. This jmp will want to be a far jmp, which does need to know the address.
I'm guessing it's going to be some butt-ugly C++!

How do I get a program's window from an injected DLL?

I've injected a DLL into a program to implement a chat UI over the application's main window. I figured I can get the application's main window handle, then get its DC, and draw onto it. The window has a predictable title, which means I can use FindWindow to get the handle. The only problem is that the DLL is injected when the process starts. At that time, the window hasn't been created, which means FindWindow finds nothing!
What are some solutions to this? Could I create a thread in the DLL and sleep for a while until I know the window is created? This seems very unstable so I'd rather not do it.
What I tried to do was use SetWindowsHookEx in the DLL to hook the global WndProc. I could scan the messages until I find one from my window (which means it was created). Then I could save the handle and go on with my program. I'm not too worried about there being multiple windows with the same name at the time. The only problem is that my hook never gets called.
I create the hook like this:
m_hWndProcHook = SetWindowsHookEx(WH_CALLWNDPROC, (HOOKPROC)WndProc, m_hModule, 0);
if(!m_hWndProcHook)
{
    oss << "Failed to set wndproc hook. Error code: " << GetLastError();
    Log(oss.str().c_str());
    return false;
}
This returns a valid hook. The WndProc looks like this:
LRESULT CALLBACK CChatLibrary::WndProc(int code, WPARAM wParam, LPARAM lParam)
{
    CWPSTRUCT* pData;
    ostringstream oss;
    char wndName[256];
    gChatLib->Log("WNDPROC");
    if(code < 0)
        return CallNextHookEx(gChatLib->GetWndProcHookHandle(), code, wParam, lParam);
    else
    {
        //Get the data for the wndproc
        pData = (CWPSTRUCT*)lParam;
        //Log the message
        GetWindowText(pData->hwnd, wndName, 256);
        oss << "Message from window \"" << wndName << "\"";
        gChatLib->Log(oss.str().c_str());
        return CallNextHookEx(gChatLib->GetWndProcHookHandle(), code, wParam, lParam);
    }
}
But no "WNDPROC" messages are logged into my log file... Earlier, I had a MessageBox instead of a log to see if it worked, which turned out to be a terrible idea. All the programs froze because they were waiting for me to click "OK", and I had to do a hard reset... When I turned my computer back on and replaced the MessageBox with a log command, it didn't work. I know my log works, though, because it works everywhere else. I'm extremely confused about what's happening here.
Are there any other methods of obtaining the main window (preferably when it is created)? Or is my hook method good, but just executed wrong? Thank you for any feedback.
You can always inject the DLL after the application has already started. It's quite complicated nowadays because of ASLR in Windows Vista/7, but not impossible. You would have to write a short application which injects the selected DLL into the process with a given PID. Here is what should be done in order to inject a DLL into a running process:
Write shellcode which finds the address of the kernel32.dll library. Here is my old code in NASM:
[BITS 32]
_main:
xor eax, eax
mov esi, [FS:eax+0x30] ; ESI points at PEB
mov esi, [esi+0x0C] ; ESI points at PEB->Ldr
mov esi, [esi+0x1C] ; ESI points at PEB->Ldr.InInitOrder
mov edx, -1 ; EDX is now the current letter pointer
check_dll:
mov ebp, [esi+0x08] ; EBP points at base address InInitOrder[i]
mov edi, [esi+0x20] ; EDI points at InInitOrder[X] name
mov esi, [esi] ; ESI points at flink
mov edx, -1 ; set letter pointer at InInitOrder name
mov ebx, 0 ; set pattern letter pointer to null
check_small_name:
inc edx ; go to the next letter in InInitOrder name
cmp ebx, 0x7 ; check if we have checked all letters
je library_found ; if so and no error kernel32.dll found
mov al, BYTE[edi+edx] ; load byte to EAX from InInitOrder name
cmp al, 0x0 ; check if unicode complement
je check_small_name ; ignore if so
jmp s_kernel32
back1:
pop ecx
cmp BYTE[ecx+ebx], al ; compare characters
jne check_big_name ; if not equal check upper size
inc ebx ; if equal then go to the next letter in pattern
jmp check_small_name ; loop
check_big_name:
jmp b_kernel32
back2:
pop ecx
cmp BYTE[ecx+ebx], al ; check characters
jne check_dll ; if not equal then go to the next module
inc ebx ; if equal go increment the pattern pointer
jmp check_small_name ; loop
library_found:
mov eax, ebp ; move kernel32 base address into EAX
loop:
jmp loop
s_kernel32:
call back1
db "kernel32",10,0
b_kernel32:
call back2
db "KERNEL32",10,0
Load the compiled shellcode into memory from a file.
Attach to the target process as a debugger. Stop all threads in the application. Allocate some memory with 'read, write, execute' permissions and inject the shellcode there.
Get the main thread handle. Open the thread, back up its thread context, and then set a new context with the EIP register modified (set to the address of the allocated memory, i.e. the shellcode).
Resume the threads for some time (e.g. 5 s). Make sure that the process was activated and the shellcode had a chance to execute.
Attach to the target process as a debugger again. Read the EAX register, which should now hold the kernel32.dll base address in the target process (because of ASLR it might not be the same as in your injector process).
Look up the offset of the LoadLibraryA function within kernel32.dll in your own process.
The offset should be the same in the target process, so add it to the remote kernel32.dll base address to compute the address of LoadLibraryA in the remote process.
Call CreateRemoteThread, giving the computed address of LoadLibraryA as the function to call and the DLL path as its parameter.
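The offset arithmetic from the last few steps can be sketched as follows. RebaseToRemote is a made-up helper name. In a real injector the local values would presumably come from GetModuleHandleA("kernel32.dll") and GetProcAddress, and the remote base from the EAX value read earlier; the sketch takes all three as plain numbers.

```cpp
#include <cstdint>

// Hypothetical helper: translate the address of a function in our own copy of
// a module into the address of the same function in another process, where
// the same module is mapped at a different base (ASLR).
std::uintptr_t RebaseToRemote(std::uintptr_t localBase,
                              std::uintptr_t localFunction,
                              std::uintptr_t remoteBase) {
    // The offset of the function inside the module is the same in both
    // processes, as long as both have loaded the same build of the DLL.
    const std::uintptr_t offset = localFunction - localBase;
    return remoteBase + offset;
}
```

For example, with purely invented numbers, a local base of 0x76A00000, a local LoadLibraryA at 0x76A14980 and a remote base of 0x75300000 give 0x75314980 as the address to pass to CreateRemoteThread.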
I had to figure all this out on my own some time ago (I couldn't find any description), but recently I found something similar: http://syprog.blogspot.com/2012/05/createremotethread-bypass-windows.html
Happy hacking!

Flash crashes when stopping directshow source filter

Here's the call stack:
0480b000()
vcam.ax!CSourceStream::DoBufferProcessingLoop() + 0xe1 bytes
vcam.ax!CSourceStream::ThreadProc() + 0x13e bytes
vcam.ax!CAMThread::InitialThreadProc() + 0x51 bytes
kernel32.dll!7c80b713()
The callstack is from this thread:
0 > 0x000015b8 Worker Thread CAMThread::InitialThreadProc 0480b000 Normal 0
disassembly code:
017D0B5B push edx
017D0B5C mov eax,dword ptr [ecx+8]
017D0B5F call eax
017D0B61 cmp esi,esp
017D0B63 call @ILT+2525(__RTC_CheckEsp) (17C49E2h)
017D0B68 cmp dword ptr [ebp-2Ch],0
017D0B6C je CSourceStream::DoBufferProcessingLoop+10Ah (17D0B8Ah)
017D0B6E mov eax,dword ptr [ebp-2Ch]
The problem is at the line 017D0B5F call eax
This problem exists for most DirectShow filters. How can I fix it?
I believe vcam.ax's source code is here, so probably the best option is to compile the source code locally and then attach to the process that's crashing in the debugger. Then you can put a breakpoint in the DoBufferProcessingLoop() implementation, recreate the crash, and you should be able to figure out why you're crashing.
I've used vcam.ax and encountered the same problem as you. I solved it with the following step.
Add CAutoLock cAutoLock(&m_cSharedState); as the first line of each of the following functions:
CVCamStream::CVCamStream() // constructor
CVCamStream::~CVCamStream() // destructor
HRESULT CVCamStream::FillBuffer(IMediaSample *pms)
HRESULT CVCamStream::OnThreadCreate()
This may solve your problem.
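For readers who don't know the DirectShow base classes: CAutoLock is a scoped lock that takes a critical section when it is constructed and releases it when the function returns. Below is a minimal sketch of the same pattern using only the standard library, with std::mutex and std::lock_guard standing in for the critical section and CAutoLock. SharedFrameState and its members are invented for illustration and are not part of vcam.ax.

```cpp
#include <mutex>

// Hypothetical stand-in for the state that the streaming thread and the
// controlling thread both touch.
class SharedFrameState {
public:
    // Analogous to FillBuffer(): called repeatedly from the worker thread.
    void Fill(int value) {
        std::lock_guard<std::mutex> lock(mutex_);  // held until Fill returns
        last_ = value;
        ++count_;
    }

    // Analogous to the teardown path: called from another thread.
    void Stop() {
        std::lock_guard<std::mutex> lock(mutex_);  // waits for a running Fill
        stopped_ = true;
    }

    int Count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

    int Last() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return last_;
    }

    bool Stopped() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return stopped_;
    }

private:
    mutable std::mutex mutex_;  // plays the role of m_cSharedState
    int last_ = 0;
    int count_ = 0;
    bool stopped_ = false;
};
```

Because every member function takes the same lock first, a teardown call cannot run in the middle of a Fill call, which appears to be the idea behind putting CAutoLock on the first line of each of the functions listed above.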