Help: Application crashes on accessing source code - c++

Here is some simple asm code I have inserted into a VC++ project. addr_curr_ebp holds the current value of the EBP register, which points to the old EBP value saved inside the stack frame. 4 bytes after this is the return address inside the calling application function. I extract a single byte from the code section, 5 bytes before that return address. I run my code along with other applications like gtalk, vlc etc. The application always crashes when I include ProbStat 1 and 2 in my code. When I remove these statements everything works fine. What do you think the problem is?
__asm{
push eax
push ebx
push cx
mov ebx, addr_curr_ebp
mov eax, [ebx + 4]
mov cl, BYTE PTR [eax - 5] //ProbStat 1
mov ret_5, cl // ProbStat 2
pop cx
pop ebx
pop eax
}

Your code snippet doesn't show where "ret_5" is located. You'll get an automatic crash if it is a member of a class: the ecx register stores the "this" pointer, and you're messing it up.
I'm not sure what this does, but it sounds to me like you need to use the _ReturnAddress intrinsic. It returns the address of the instruction after the call instruction that called the current function. Assign it to an unsigned char*; there is no need for assembly this way.
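A minimal sketch of the intrinsic-based approach, for illustration only. The helper names caller_return_address and byte_before_return are hypothetical, and the GCC/Clang branch (which uses __builtin_return_address) is an addition that is not part of the answer above. On x86 a direct near call is 5 bytes long and starts with opcode E8, so the byte at "return address - 5" is often 0xE8, but that does not hold for indirect calls.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#include <intrin.h>
#pragma intrinsic(_ReturnAddress)
#define NOINLINE __declspec(noinline)
#define RETURN_ADDRESS() _ReturnAddress()
#else
// Assumption: GCC/Clang builtin used as a stand-in outside MSVC.
#define NOINLINE __attribute__((noinline))
#define RETURN_ADDRESS() __builtin_return_address(0)
#endif

// Hypothetical helper: returns the address at which execution resumes in
// the caller. It must not be inlined, otherwise it would report the return
// address of whichever function it was inlined into.
NOINLINE const unsigned char* caller_return_address() {
    return static_cast<const unsigned char*>(RETURN_ADDRESS());
}

// Hypothetical helper: reads the code byte `back` bytes before `ret`,
// mirroring the "[eax - 5]" access in the question.
inline unsigned char byte_before_return(const unsigned char* ret, int back) {
    assert(ret != nullptr);
    return *(ret - back);
}
```

Usage would be along the lines of `unsigned char ret_5 = byte_before_return(caller_return_address(), 5);`, which keeps ecx and the stack entirely under the compiler's control.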

Related

Calling putchar using x64 assembly through C++

So, I wrote a little library that allows me to execute raw bytecode, as in assembly instructions, in C++.
I thought about writing a brainfuck-to-x64 compiler with it. Everything worked until I had to implement the . brainfuck instruction, which prints a character to stdout.
I know I need to pass the (only) argument through rcx (according to cdecl). But I don't know how to set up the stack, or clean up after a function call. My ASM code is as follows:
push rbp ; This is the only thing I tried doing as an epilog
mov rcx, QWORD PTR [rbx+rax*4] ; rbx contains the address of an array (32-bit elements), and rax contains the index, the character byte is saved in that address
push rax ; Retrieve rax after it gets clobbered by putchar
push rcx ; Push rcx to use it as an argument
call r10 ; r10 contains the address of putchar
pop rcx ; Restore all clobbered registers
pop rax
pop rbp
This snippet of code works: a character gets written to stdout, but after that I just get "Access violation executing location 0x0000000000000000."
What am I missing?
It sounds like putchar is not returning correctly because rsp is corrupted, or something similar.
By the way, I got the address of putchar like this:
#include <cstdint>
#include <cstdio>
std::uint_least64_t putchar_addr = reinterpret_cast<std::uint_least64_t>(&std::putchar);
I need to get the pointer as an integer so I can append it to the code buffer as bytecode later.
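One possible explanation, sketched under assumptions the question does not state: if the target is Windows x64, the calling convention is the Microsoft x64 one (not cdecl), and the caller must reserve 32 bytes of shadow space and keep rsp 16-byte aligned at the call. Without that space, putchar is free to write into the slots holding the pushed rcx, rax and rbp and the generated code's own return address, which would fit a later jump to 0. The hypothetical helper emit_putchar_call below appends one way of encoding the call. It assumes rsp is 8 bytes off a 16-byte boundary when the sequence starts (as it is right after the generated code is entered by a call), and it reloads r10 each time because r10 is a volatile register.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Appends raw bytes to the code buffer.
inline void emit(std::vector<std::uint8_t>& code,
                 std::initializer_list<std::uint8_t> bytes) {
    code.insert(code.end(), bytes.begin(), bytes.end());
}

// Hypothetical emitter for the brainfuck '.' instruction.
// Register roles follow the question: rbx = cell array, rax = cell index.
inline void emit_putchar_call(std::vector<std::uint8_t>& code,
                              std::uint64_t putchar_addr) {
    const std::size_t start = code.size();
    emit(code, {0x50});                    // push rax  (save index, realign rsp)
    emit(code, {0x8B, 0x0C, 0x83});        // mov ecx, dword ptr [rbx+rax*4]
    emit(code, {0x48, 0x83, 0xEC, 0x20});  // sub rsp, 20h  (shadow space)
    emit(code, {0x49, 0xBA});              // mov r10, imm64
    for (int i = 0; i < 8; ++i)            // little-endian address bytes
        code.push_back(static_cast<std::uint8_t>(putchar_addr >> (8 * i)));
    emit(code, {0x41, 0xFF, 0xD2});        // call r10
    emit(code, {0x48, 0x83, 0xC4, 0x20});  // add rsp, 20h
    emit(code, {0x58});                    // pop rax   (restore index)
    assert(code.size() - start == 26);
}
```

rbx is nonvolatile in that convention, so it does not need saving around the call; rax, rcx and r10 do not survive it.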

fastcall how to use for more than 4 parameters

I was trying to build a function in assembly (FASM) that takes more than 4 parameters. In x86 it works fine, but I know that in x64 with fastcall you have to spill the parameters into the shadow space in the order rcx, rdx, r8, r9. I read that from the 5th parameter onward they are passed on the stack, but I don't know how to do this. This is what I tried, but it keeps saying invalid operand. I know that I am doing the first 4 parameters right because I have written x64 functions before; it is the last 3 that I don't know how to spill.
proc substr,inputstring,outputstring,buffer1,buffer2,buffer3,startposition,length
;spill
mov [inputstring],rcx
mov [outputstring],rdx
mov [buffer1],r8
mov [buffer2],r9
mov [buffer3],[rsp+8*4]
mov [startposition],[rsp+8*5]
mov [length],[rsp+8*6]
If I try
mov [buffer3],rsp+8*4
it says extra characters on the line.
I also saw that some people use rsp+20h, rsp+28h etc., but that does not work either.
How do I use more than 4 parameters with fastcall on x64?
Also, do I have to make room on the stack? I saw that some people put add rsp,20h right before their spill code. I tried that and it did not fix the invalid operand error.
Thanks
Update:
After playing around with it for a little bit, I found that the only way it seems to work is if I spill the first 4 parameters and then ignore the rest (5 onward):
proc substr,inputstring,outputstring,buffer1,buffer2,buffer3,startposition,length
;spill
mov [inputstring],rcx
mov [outputstring],rdx
mov [buffer1],r8
mov [buffer2],r9
;start the regular code. ignore spilling buffer3,startposition and length
On x86/x64 CPUs the following instructions do not exist, because mov cannot take two memory operands:
mov [buffer3],[rsp+8*4]
mov [startposition],[rsp+8*5]
mov [length],[rsp+8*6]
As a workaround, use the rax register as an intermediate: read each value from memory into rax, then write rax to the destination memory location:
mov rax,[rsp+8*4]
mov [buffer3],rax
mov rax,[rsp+8*5]
mov [startposition],rax
mov rax,[rsp+8*6]
mov [length],rax
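For orientation, here is a small sketch of where each argument sits relative to rsp at the moment a function is entered, assuming the Microsoft x64 calling convention. The helper names are hypothetical. [rsp] holds the return address, the next four 8-byte slots are the home (shadow) space for rcx, rdx, r8 and r9, and stack-passed arguments follow. These offsets only hold at entry: once a prologue pushes rbp or adjusts rsp, the rsp-relative offsets shift accordingly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte offset from rsp, at function entry, of the
// stack slot belonging to the n-th argument (1-based).
inline std::size_t entry_offset_of_arg(std::size_t n) {
    assert(n >= 1);
    return 8 * n;  // slot 0 is the return address
}

// Hypothetical helper: true if the n-th argument (1-based) arrives in a
// register (rcx, rdx, r8, r9) and its stack slot is only home space.
inline bool arrives_in_register(std::size_t n) {
    return n >= 1 && n <= 4;
}
```

Under these assumptions the 5th, 6th and 7th arguments are at [rsp+28h], [rsp+30h] and [rsp+38h] on entry, which matches the rsp+28h form mentioned in the question.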

when can an assignment fail to properly show in Visual C?

I compiled a C++ DLL (from the same source) with Visual Studio C++ 2010 Express on both 64-bit Windows 7 and 32-bit Windows XP. An external 64-bit app on Windows 7 calls the DLL and executes properly. The equivalent 32-bit app on Windows XP bombs on return from the DLL call with stack or memory corruption.
Trying to debug this, I put a breakpoint where the DLL copies the data from some internal structures to what the external app wants, the last step before returning. At a given point I'm looking at something like this in Visual Studio:
destination[i].field = source[i].field;
where both fields in the source and destination are doubles or longs.
Hovering over the source shows the correctly computed values. Hovering over the destination, before executing the statement, shows that it was properly initialized to zeros. After executing the statement the destination contains a different value, e.g. 36.3468 becomes 0.00104800000000122891, 6 becomes 10, etc.
This is strange. Maybe there is a structure element misalignment, but wouldn't that show up as a warning somewhere else? Maybe I'm stepping over memory (in the 32-bit version only!?), but then shouldn't the value at least appear correct right after stepping over the assignment? I haven't stepped into machine code in a while and don't know x86/x86_64 assembly that well. Do I have to do that to see what the code for that assignment is really doing?
Here is one of the lines that seems to not execute properly and the disassembly in both the 64 bit and 32 bit versions, in that order:
destination[i].field = source[i].field;
000007FEEF6B4DD3 movsxd rax,dword ptr [i]
000007FEEF6B4DDB imul rax,rax,4A68h
000007FEEF6B4DE2 movsxd rcx,dword ptr [i]
000007FEEF6B4DEA imul rcx,rcx,30h
000007FEEF6B4DEE mov rdx,qword ptr [destination]
000007FEEF6B4DF3 mov r8,qword ptr [source]
000007FEEF6B4DF8 movsd xmm0,mmword ptr [r8+rax+4A40h]
000007FEEF6B4E02 movsd mmword ptr [rdx+rcx+8],xmm0
destination[i].field = source[i].field;
09E64361 mov eax,dword ptr [i]
09E64364 imul eax,eax,4A38h
09E6436A mov ecx,dword ptr [i]
09E6436D imul ecx,ecx,2Ch
09E64370 mov edx,dword ptr [destination]
09E64373 mov esi,dword ptr [source]
09E64376 fld qword ptr [esi+eax+4A10h]
09E6437D fstp qword ptr [edx+ecx+4]
If I step over that line in the 64-bit version, VS shows me the proper value for destination[i].field, but not in the 32-bit version. It seems that the structures have different sizes in the two versions, hence the different strides and field offsets in the disassembly, but shouldn't VS show me the proper value at that point?
If I step over the fld instruction in the 32-bit version, I can see that st0 is loaded with the wrong value, i.e. not what is shown for source[i].field. For i=0, eax=0 and esi=source, so the 4A10h offset is probably wrong, and/or it is computed differently in the code than in whatever VS uses to show me the value. How is this possible?
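One way such a disagreement can arise, offered only as a possibility and not as a diagnosis of the code above: the same struct is laid out with different packing in different places (for example a #pragma pack or a packing compiler option that applies to one translation unit but not another), so the generated code and the debugger's type information use different field offsets. The struct names below are hypothetical and only illustrate how packing changes offsets and sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Default (natural) layout: the double is aligned as the ABI dictates.
struct NaturalRecord {
    std::int32_t id;
    double value;
};

// Same members, but packed to 4-byte boundaries.
#pragma pack(push, 4)
struct Packed4Record {
    std::int32_t id;
    double value;
};
#pragma pack(pop)

inline std::size_t natural_value_offset() { return offsetof(NaturalRecord, value); }
inline std::size_t packed_value_offset()  { return offsetof(Packed4Record, value); }

// With 4-byte packing the double directly follows the 32-bit id.
inline void verify_packed_layout() {
    assert(packed_value_offset() == 4);
    assert(sizeof(Packed4Record) == 12);
}
```

If one module reads a field at natural_value_offset() while another wrote it at packed_value_offset(), both the element stride and the field offset disagree, which is the kind of pattern visible in the two strides and offsets of the disassembly.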

Injecting 64 Bit DLL using code cave

I'm trying to inject a 64-bit DLL into a 64-bit process (explorer, for that matter).
I've tried the remote-thread and window-hook techniques, but some antivirus products flag my loader (a false positive).
After reading this article: Dll Injection by Darawk, I decided to use code caves.
It worked great for 32-bit, but because VS doesn't support inline assembly for 64-bit I had to write the opcodes and operands explicitly.
I looked at this article: 64Bit injection using code cave. As the article states, there are some differences:
There are several differences that had to be incorporated here:
MASM64 uses fastcall, so the function's argument has to be passed in a register and not on the stack.
The length of the addresses - 32 vs. 64 bit - must be taken into account.
MASM64 has no instruction that pushes all registers on the stack (like pushad in 32bit) so this had to be done by pushing all the registers explicitly.
I followed those guidelines and ran the article's example, but none of what I did worked.
The target process just crashed the moment I resumed the main thread, and I don't know how to really look into it because ollydbg has no 64-bit support.
This is how the code looks before I injected it:
codeToInject:
000000013FACD000 push 7741933Ah
000000013FACD005 pushfq
000000013FACD006 push rax
000000013FACD007 push rcx
000000013FACD008 push rdx
000000013FACD009 push rbx
000000013FACD00A push rbp
000000013FACD00B push rsi
000000013FACD00C push rdi
000000013FACD00D push r8
000000013FACD00F push r9
000000013FACD011 push r10
000000013FACD013 push r11
000000013FACD015 push r12
000000013FACD017 push r13
000000013FACD019 push r14
000000013FACD01B push r15
000000013FACD01D mov rcx,2CA0000h
000000013FACD027 mov rax,76E36F80h
000000013FACD031 call rax
000000013FACD033 pop r15
000000013FACD035 pop r14
000000013FACD037 pop r13
000000013FACD039 pop r12
000000013FACD03B pop r11
000000013FACD03D pop r10
000000013FACD03F pop r9
000000013FACD041 pop r8
000000013FACD043 pop rdi
000000013FACD044 pop rsi
000000013FACD045 pop rbp
000000013FACD046 pop rbx
000000013FACD047 pop rdx
000000013FACD048 pop rcx
000000013FACD049 pop rax
000000013FACD04A popfq
000000013FACD04B ret
Seems fine to me but I guess I'm missing something.
My complete code can be found here : Source code
Any ideas\suggestions\alternatives?
The first push, which stores the return address, only pushes a 32-bit immediate. dwOldIP in your code is a DWORD as well; it should be a DWORD64. Having to cast ctx.Rip to DWORD should have been enough of a hint ;)
Also, make sure the stack is 16-byte aligned upon entering the call to LoadLibrary. Some APIs throw exceptions if the stack is not aligned properly.
Apparently, the main problem was that I allocated the code cave without the PAGE_EXECUTE_READWRITE permission, and therefore the chunk was treated as data and not as executable code.
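To illustrate the first point: in 64-bit mode, push with an immediate only encodes 32 bits (sign-extended to 64 when pushed), so a full 64-bit return address needs a second step. A minimal sketch of one common encoding follows; the helper name emit_push_imm64 is hypothetical and this is not taken from the linked article or the asker's source. It pushes the low half, then overwrites the upper 4 bytes of the new stack slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Appends raw bytes to the code buffer.
inline void emit(std::vector<std::uint8_t>& code,
                 std::initializer_list<std::uint8_t> bytes) {
    code.insert(code.end(), bytes.begin(), bytes.end());
}

// Appends a 32-bit value in little-endian byte order.
inline void emit_u32(std::vector<std::uint8_t>& code, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        code.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Hypothetical emitter: leaves the full 64-bit `value` on top of the stack
// without touching any general-purpose register other than rsp.
inline void emit_push_imm64(std::vector<std::uint8_t>& code, std::uint64_t value) {
    const std::size_t start = code.size();
    emit(code, {0x68});                       // push imm32 (low half)
    emit_u32(code, static_cast<std::uint32_t>(value));
    emit(code, {0xC7, 0x44, 0x24, 0x04});     // mov dword ptr [rsp+4], imm32
    emit_u32(code, static_cast<std::uint32_t>(value >> 32));
    assert(code.size() - start == 13);
}
```

The value passed in would be the saved ctx.Rip held in a 64-bit variable, with no cast to DWORD.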

ASSEMBLY Offset to C++ Code question

I've been trying to convert this code to C++ without any inline assembly and I cannot figure it out.
Say you've got this line
sub edx, (offset loc_42C1F5+5)
My Hex-Rays gives me
edx -= (uint)((char*)loc_42C1F5 + 5);
But what would it really look like without loc_42C1F5?
I would think it would be
edx -= 0x42C1FA;
But is that correct? (I can't really step through this code in any assembler-level debugger, as it's damaged, or rather well protected.)
loc_42C1F5 is actually a label:
seg000:0042C1F5 loc_42C1F5: ; DATA XREF: sub_4464A0+2B5o
seg000:0042C1F5 mov edx, [esi+4D98h]
seg000:0042C1FB lea ebx, [esi+4D78h]
seg000:0042C201 xor eax, eax
seg000:0042C203 xor ecx, ecx
seg000:0042C205 mov [ebx], eax
loc_42C1F5 is a symbol. Given the information you've provided, I cannot say what its offset is. It may be 0x42C1F5 or it may be something else entirely.
If it is 0x42C1F5, then your translation should be correct.
IDA has incorrectly identified 0x42C1FA as an offset, and Hex-Rays used that interpretation. Just convert it to a plain number (press O) and all will be well. That's why it's called Interactive Disassembler :)
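The arithmetic can be written out as a short sketch, assuming (as the listing suggests, but as the first answer notes is not guaranteed) that loc_42C1F5 really sits at address 0x42C1F5. The names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the label's address equals the number in its name.
constexpr std::uint32_t kLoc42C1F5 = 0x42C1F5u;
constexpr std::uint32_t kImmediate = kLoc42C1F5 + 5u;
static_assert(kImmediate == 0x42C1FAu, "offset loc_42C1F5+5");

// Equivalent of "sub edx, (offset loc_42C1F5+5)" on a 32-bit register:
// plain unsigned subtraction that wraps modulo 2^32.
inline std::uint32_t sub_offset(std::uint32_t edx) {
    const std::uint32_t result = edx - kImmediate;
    assert(static_cast<std::uint32_t>(result + kImmediate) == edx);
    return result;
}
```

So under that assumption `edx -= 0x42C1FA;` is the plain-number form of the instruction.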