I ran the debugger on CodeBlocks and viewed the disassembly window.
The full source code for the program I debugged is the following:
int main(){}
and the assembly code I saw in the window was this:
00401020 push %ebp
00401021 mov %esp,%ebp
00401023 push %ebx
00401024 sub $0x34,%esp
00401027 movl $0x401150,(%esp)
0040102E call 0x401984 <SetUnhandledExceptionFilter@4>
00401033 sub $0x4,%esp
00401036 call 0x401330 <__cpu_features_init>
0040103B call 0x401740 <fpreset>
00401040 lea -0x10(%ebp),%eax
00401043 movl $0x0,-0x10(%ebp)
0040104A mov %eax,0x10(%esp)
0040104E mov 0x402000,%eax
00401053 movl $0x404004,0x4(%esp)
0040105B movl $0x404000,(%esp)
00401062 mov %eax,0xc(%esp)
00401066 lea -0xc(%ebp),%eax
00401069 mov %eax,0x8(%esp)
0040106D call 0x40192c <__getmainargs>
00401072 mov 0x404008,%eax
00401077 test %eax,%eax
00401079 jne 0x4010c5 <__mingw_CRTStartup+165>
0040107B call 0x401934 <__p__fmode>
00401080 mov 0x402004,%edx
00401086 mov %edx,(%eax)
00401088 call 0x4014f0 <_pei386_runtime_relocator>
0040108D and $0xfffffff0,%esp
00401090 call 0x401720 <__main>
00401095 call 0x40193c <__p__environ>
0040109A mov (%eax),%eax
0040109C mov %eax,0x8(%esp)
004010A0 mov 0x404004,%eax
004010A5 mov %eax,0x4(%esp)
004010A9 mov 0x404000,%eax
004010AE mov %eax,(%esp)
004010B1 call 0x401318 <main>
004010B6 mov %eax,%ebx
004010B8 call 0x401944 <_cexit>
004010BD mov %ebx,(%esp)
004010C0 call 0x40198c <ExitProcess@4>
004010C5 mov 0x4050f4,%ebx
004010CB mov %eax,0x402004
004010D0 mov %eax,0x4(%esp)
004010D4 mov 0x10(%ebx),%eax
004010D7 mov %eax,(%esp)
004010DA call 0x40194c <_setmode>
004010DF mov 0x404008,%eax
004010E4 mov %eax,0x4(%esp)
004010E8 mov 0x30(%ebx),%eax
004010EB mov %eax,(%esp)
004010EE call 0x40194c <_setmode>
004010F3 mov 0x404008,%eax
004010F8 mov %eax,0x4(%esp)
004010FC mov 0x50(%ebx),%eax
004010FF mov %eax,(%esp)
00401102 call 0x40194c <_setmode>
00401107 jmp 0x40107b <__mingw_CRTStartup+91>
0040110C lea 0x0(%esi,%eiz,1),%esi
Is it normal to get this much assembly code from so little C++ code?
By normal, I mean is this close to the average amount of assembly code the MinGW compiler generates relative to the amount of C++ source code I provided above?
Yes, this is fairly typical startup/shutdown code.
Before your main runs, a few things need to happen:
stdin/stdout/stderr get opened
cin/cout/cerr/clog get opened, referring to stdin/stdout/stderr
Any static objects you define get initialized
command line gets parsed to produce argc/argv
environment gets retrieved (maybe)
Likewise, after your main exits, a few more things have to happen:
Anything set up with atexit gets run
Your static objects get destroyed
cin/cout/cerr/clog get destroyed
all open output streams get flushed and closed
all open input streams get closed
Depending on the platform, there may be a few more things as well, such as setting up some default exception handlers (for either C++ exceptions, some platform-specific exceptions, or both).
Note that most of this is fixed code that gets linked into essentially every program, regardless of what it does or doesn't contain. In theory, they can use some tricks (e.g., "weak externals") to avoid linking in some of this code when it isn't needed, but most of what's above is used so close to universally (and the code to handle it is sufficiently trivial) that it's pretty rare to bother going to any work to eliminate this little bit of code, even when it's not going to be used (like your case, where nothing gets used at all).
Note that what you've shown is startup/shutdown code though. It's linked into your program, traditionally from a file named something like crt0 (along with, perhaps, some additional files).
If you look through your file for the code generated for main itself, you'll probably find that it's a lot shorter--possibly as short and simple as just ret. It may be so tiny that you missed the fact that it's there at all though.
This call 0x401318 <main>
is what your code resolved to, basically. main() is a function and there is code surrounding it, often called something like __start and __end.
What you see amounts, in part, to the CRT support code in __start, and cleanup afterward in __end.
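The ordering described in these answers can be sketched in C. This is a toy model under stated assumptions, not the MinGW runtime: fake_crt_startup, user_main, note, and startup_trace are hypothetical names, and each note() call only marks the place where the real startup code would call something like __getmainargs, __main, main, or _cexit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer: records which phase ran, in order. */
static char trace[64];

/* Append "step;" to the trace, checking that it still fits. */
static void note(const char *step) {
    assert(strlen(trace) + strlen(step) + 2 <= sizeof trace);
    strcat(trace, step);
    strcat(trace, ";");
}

/* Stand-in for the user's program: like int main(){}, it does nothing
   useful and returns 0. */
static int user_main(int argc, char **argv, char **envp) {
    (void)argc; (void)argv; (void)envp;
    note("main");
    return 0;
}

/* Sketch of a runtime entry routine: set things up, call main, clean up. */
int fake_crt_startup(void) {
    static char *argv[] = { "prog", NULL };
    static char *envp[] = { NULL };
    int status;

    trace[0] = '\0';
    note("init");     /* exception filter, FPU state, stdio modes */
    note("args");     /* build argc/argv from the command line */
    note("ctors");    /* initialize static objects */
    status = user_main(1, argv, envp);
    note("cleanup");  /* atexit handlers, destructors, flush streams */
    return status;    /* a real runtime would hand this to the OS */
}

/* Read back the recorded order of phases. */
const char *startup_trace(void) { return trace; }
```

Calling fake_crt_startup() and then reading startup_trace() shows that the user's main is only one step in the middle of the sequence, which is why the disassembly of the startup routine is so much longer than the program itself.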
I am writing simple programs and then analyzing them.
Today I've written this:
#include <stdio.h>
int x;
int main(void){
printf("Enter X:\n");
scanf("%d",&x);
printf("You enter %d...\n",x);
return 0;
}
It's compiled into this:
push rbp
mov rbp, rsp
lea rdi, s ; "Enter X:"
call _puts
lea rsi, x
lea rdi, aD ; "%d"
mov eax, 0
call ___isoc99_scanf
mov eax, cs:x <- don't understand this
mov esi, eax
lea rdi, format ; "You enter %d...\n"
mov eax, 0
call _printf
mov eax, 0
pop rbp
retn
I don't understand what cs:x means.
I use Ubuntu x64, GCC 10.3.0, and IDA pro 7.6.
TL;DR: IDA confusingly uses cs: to indicate a RIP-relative addressing mode in 64-bit code.
In IDA mov eax, x means mov eax, DWORD [x] which in turn means reading a DWORD from the variable x.
For completeness, mov rax, OFFSET x means mov rax, x (i.e. putting the address of x in rax).
In 64-bit mode, displacements are still 32 bits wide, so for a Position Independent Executable it's not always possible to address a variable by encoding its address (the address is 64-bit and would not fit into a 32-bit field). And in position-independent code, it's not desirable anyway.
Instead, RIP-relative addressing is used.
In NASM, RIP-relative addressing takes the form mov eax, [REL x], in gas it is mov x(%rip), %eax.
Also, in NASM, if DEFAULT REL is active, the instruction can be shortened to mov eax, [x] which is identical to the 32-bit syntax.
Each disassembler will disassemble a RIP-relative operand differently. As you commented, Ghidra gives mov eax, DWORD PTR [x].
IDA uses mov eax, cs:x to mean mov eax, [REL x]/mov x(%rip), %eax.
;IDA listing, 64-bit code
mov eax, x ;This is mov eax, [x] in NASM and most likely wrong unless your exec is not PIE and always loaded <= 4GiB
mov eax, cs:x ;This is mov eax, [REL x] in NASM and idiomatic to 64-bit programs
In short, you can mostly ignore the cs: because that's just the way variables are addressed in 64-bit mode.
Of course, as the listing above shows, the use or absence of RIP-relative addressing tells you whether the program can be loaded anywhere or only below 4GiB.
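As a rough illustration of the arithmetic behind a RIP-relative operand, here is a small C sketch. The function names are hypothetical and the sample addresses in the usage note are made up; the only facts assumed are that the displacement is a signed 32-bit value and that it is added to the address of the next instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when an absolute address fits the sign-extended 32-bit
   displacement field, i.e. it could be encoded without RIP-relative
   addressing. */
bool fits_disp32(uint64_t address) {
    int64_t s = (int64_t)address;
    return s >= INT32_MIN && s <= INT32_MAX;
}

/* Effective address of a RIP-relative operand. RIP already points at
   the next instruction when the displacement is added, so the
   instruction's own length is part of the calculation. */
uint64_t rip_relative_target(uint64_t insn_address, unsigned insn_length,
                             int32_t disp32) {
    uint64_t next_rip;

    assert(insn_length >= 1 && insn_length <= 15); /* x86 length limit */
    next_rip = insn_address + insn_length;
    return next_rip + (uint64_t)(int64_t)disp32;   /* sign-extend */
}
```

For example, a 6-byte instruction at 0x1000 with a displacement of 0x2ee2 refers to 0x3ee8 wherever the image is loaded, because both the instruction and the variable move together. An address like 0x555555554000 does not pass fits_disp32, which is the case where an absolute encoding is not available.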
The cs prefix shown by IDA threw me off.
I can see that it could mentally resemble "code" and thus the rip register but I don't think the RIP-relative addressing implies a cs segment override.
In 32-bit mode, the code segment is usually read-only, so an instruction like mov [cs:x], eax will fault.
In this scenario, putting a cs: in front of the operand would be wrong.
In 64-bit mode, segment overrides (other than fs/gs) are ignored (and the read-bit of the code segment is ignored anyway), so the presence of a cs: doesn't really matter because ds and cs are effectively indistinguishable. (Even an ss or ds override doesn't change the #GP or #SS exception for a non-canonical address.)
Probably the AGU doesn't even read the segment shadow registers anymore for segment bases other than fs or gs. (Although even in 32-bit mode, there's a lower latency fast path for the normal case of segment base = 0, so hardware may just let that do its job.)
Still cs: is misleading in my opinion - a 2E prefix byte is still possible in machine code as padding. Most tools still call it a CS prefix, although http://ref.x86asm.net/coder64.html calls it a "null prefix" in 64-bit mode. There's no such byte here, and cs: is not an obvious or clear way to imply RIP-relative addressing.
Dump of assembler code for function main():
0x000000000000071a <+0>: push rbp
0x000000000000071b <+1>: mov rbp,rsp
0x000000000000071e <+4>: sub rsp,0x20
0x0000000000000722 <+8>: mov rax,QWORD PTR fs:0x28
0x000000000000072b <+17>: mov QWORD PTR [rbp-0x8],rax
0x000000000000072f <+21>: xor eax,eax
0x0000000000000731 <+23>: lea rax,[rbp-0x20]
0x0000000000000735 <+27>: mov rdi,rax
0x0000000000000738 <+30>: call 0x764 <Test::Test()>
0x000000000000073d <+35>: lea rax,[rbp-0x20]
0x0000000000000741 <+39>: mov rdi,rax
0x0000000000000744 <+42>: call 0x7ae <Test::a()>
0x0000000000000749 <+47>: mov eax,0x0
0x000000000000074e <+52>: mov rdx,QWORD PTR [rbp-0x8]
0x0000000000000752 <+56>: xor rdx,QWORD PTR fs:0x28
0x000000000000075b <+65>: je 0x762 <main()+72>
0x000000000000075d <+67>: call 0x5f0 <__stack_chk_fail@plt>
0x0000000000000762 <+72>: leave
0x0000000000000763 <+73>: ret
End of assembler dump.
I have a problem: I'm trying to debug the program, but the addresses are weird and I can't read the registers (after start): "The program has no registers now."
This happens with every program I've compiled on my computer.
EDIT:
gef➤ break*0x0000000000000763
Breakpoint 1 at 0x763: file 1.cpp, line 36.
gef➤ r
Starting program: /root/Desktop/Challenges/AdvancedMemoryChallenges/1.bin
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x763
gef➤ info reg $rip
rip 0x7ffff7dd9c20 0x7ffff7dd9c20
gef➤
gef➤ start
[+] Breaking at '{int (void)} 0x55555555471a <main()>'
[!] Command 'entry-break' failed to execute properly, reason: Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x763
0x763 is an address before relocation. (It is unclear whether it is from an object file or the actual executable.)
The addresses of code in a running program are never this low in the address space.
You need to set a breakpoint on _start or main, start the program, and see which addresses the kernel assigns to the machine code in question. The GDB disassemble command will print such addresses.
GDB automatically disables address space layout randomization (ASLR), so the addresses will be constant as long as you do not change the program, its libraries, or the kernel (which sometimes results in process layout changes, too).
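The relationship between the pre-relocation addresses and the runtime ones is plain addition, which the following C sketch spells out. The function names are hypothetical; the numbers in the usage note come from the session above, where main is listed at 0x71a before the program runs and gef reports it at 0x55555555471a afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Load base = runtime address of a known symbol minus its
   pre-relocation offset. */
uint64_t load_base(uint64_t runtime_address, uint64_t file_offset) {
    assert(runtime_address >= file_offset);
    return runtime_address - file_offset;
}

/* Runtime address of any other offset in the same image. */
uint64_t relocate(uint64_t base, uint64_t file_offset) {
    return base + file_offset;
}
```

With those values, load_base(0x55555555471a, 0x71a) gives 0x555555554000, and relocate(0x555555554000, 0x763) gives 0x555555554763, which is the address the ret instruction would have in that particular run. In practice it is simpler to let GDB do this by breaking on main first and disassembling from there.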
Here is the walkthrough I'm using: https://i.imgur.com/LIImg.jpg
From what I'm seeing, to call a Windows function you put the arguments in specific registers. But where is it documented which registers to use, and in what order?
Look at the code section of that image: it just seems to use r8d, r9, edx and ecx. Does that mean it uses edx, ecx, r8d, r9d, r10d, etc.? What happens when you run out of registers for a function with many parameters?
Also why does it have to subtract from the stack? And why 0x28?
I was trying to build a function in assembly (FASM) that takes more than 4 parameters. In x86 it works fine, but I know that in x64 with fastcall you have to spill the parameters into the shadow space in the order rcx, rdx, r8, r9. I read that the 5th and later parameters are passed on the stack, but I don't know how to do this. This is what I tried, but it keeps saying invalid operand. I know that I am handling the first 4 parameters correctly because I have written x64 functions before; it is the last 3 that I don't know how to spill.
proc substr,inputstring,outputstring,buffer1,buffer2,buffer3,startposition,length
;spill
mov [inputstring],rcx
mov [outputstring],rdx
mov [buffer1],r8
mov [buffer2],r9
mov [buffer3],[rsp+8*4]
mov [startposition],[rsp+8*5]
mov [length],[rsp+8*6]
if I try
mov [buffer3],rsp+8*4
it says extra characters on the line.
I also saw that some people use rsp+20h, rsp+28h etc., but that does not work either.
How do I pass more than 4 parameters using fastcall on x64?
Also, do I have to make room on the stack? I saw some people put add rsp,20h right before their spill code. I tried that and it did not help with the invalid operand error.
Thanks
update
After playing around with it for a little bit, I found that the only way it seems to work is if I spill the first 4 parameters and then ignore the rest (the 5th onward).
proc substr,inputstring,outputstring,buffer1,buffer2,buffer3,startposition,length
;spill
mov [inputstring],rcx
mov [outputstring],rdx
mov [buffer1],r8
mov [buffer2],r9
;start the regular code. ignore spilling buffer3,startposition and length
On x86/x64 CPUs the following instructions do not exist, because mov cannot take two memory operands:
mov [buffer3],[rsp+8*4]
mov [startposition],[rsp+8*5]
mov [length],[rsp+8*6]
Workaround: use the rax register as a temporary, reading each value from one memory location and writing it to the other:
mov rax,[rsp+8*4]
mov [buffer3],rax
mov rax,[rsp+8*5]
mov [startposition],rax
mov rax,[rsp+8*6]
mov [length],rax
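For orientation, here is a C sketch of where each argument lives under the Microsoft x64 calling convention at the instant of function entry, before any push or sub rsp has run. The function names are hypothetical, and only integer or pointer arguments are considered. Which offset applies inside a FASM proc body depends on what the macro's prologue has already pushed, so treat the numbers as a starting point to check in a debugger rather than as values to paste in.

```c
#include <assert.h>
#include <stdbool.h>

/* Arguments 1..4 (1-based) arrive in rcx, rdx, r8, r9. */
bool win64_arg_in_register(unsigned n) {
    assert(n >= 1);
    return n <= 4;
}

/* Byte offset from rsp, at function entry, of the stack slot that
   belongs to argument n (1-based).
   [rsp+0]          return address
   [rsp+8..rsp+20h] home (shadow) space for rcx, rdx, r8, r9; the
                    callee may spill the registers there
   [rsp+28h] and up arguments 5, 6, 7, ... written by the caller */
unsigned win64_arg_slot_offset_at_entry(unsigned n) {
    assert(n >= 1);
    return 8u * n;
}
```

So the 5th, 6th and 7th arguments sit at offsets 28h, 30h and 38h from rsp at entry, and each one has to go through a register such as rax if it needs to be copied to another memory location.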
I have a directory change monitor process that reads updates from files within a set of directories. I have another process that performs small writes to a lot of files to those directories (test program). Figure about 100 directories with 10 files in each, and about 500 files being modified per second.
After running for a while, the directory monitor process hangs on a call to fclose() in a method that is basically tailing the file. In this method, I fopen() the file, check that the handle is valid, do a few seeks and reads, and then call fclose(). These reads are all performed by the same thread in the process. After the hang, the thread never progresses.
I couldn't find any good information on why fclose() might deadlock instead of returning some kind of error code. The documentation does mention _fclose_nolock(), but it doesn't seem to be available to me (Visual Studio 2003).
The hang occurs for both debug and release builds. In a debug build, I can see that fclose() calls _free_base(), which hangs before returning. Some kind of call into kernel32.dll => ntdll.dll => KernelBase.dll => ntdll.dll is spinning. Here's the assembly from ntdll.dll that loops indefinitely:
77CEB83F cmp dword ptr [edi+4Ch],0
77CEB843 lea esi,[ebx-8]
77CEB846 je 77CEB85E
77CEB848 mov eax,dword ptr [edi+50h]
77CEB84B xor dword ptr [esi],eax
77CEB84D mov al,byte ptr [esi+2]
77CEB850 xor al,byte ptr [esi+1]
77CEB853 xor al,byte ptr [esi]
77CEB855 cmp byte ptr [esi+3],al
77CEB858 jne 77D19A0B
77CEB85E mov eax,200h
77CEB863 cmp word ptr [esi],ax
77CEB866 ja 77CEB815
77CEB868 cmp dword ptr [edi+4Ch],0
77CEB86C je 77CEB87E
77CEB86E mov al,byte ptr [esi+2]
77CEB871 xor al,byte ptr [esi+1]
77CEB874 xor al,byte ptr [esi]
77CEB876 mov byte ptr [esi+3],al
77CEB879 mov eax,dword ptr [edi+50h]
77CEB87C xor dword ptr [esi],eax
77CEB87E mov ebx,dword ptr [ebx+4]
77CEB881 lea eax,[edi+0C4h]
77CEB887 cmp ebx,eax
77CEB889 jne 77CEB83F
Any ideas what might be happening here?
I posted this as a comment, but I realize this could be an answer in its own right...
Based on the disassembly, my guess is you've overwritten some internal heap structure maintained by ntdll, and it is looping forever iterating through a linked list.
In particular, at the start of the loop the current list node seems to be in ebx. At the end of the loop, the expected last node (or terminator, if you like -- it looks a bit like these are circular lists and the last node is the same as the first, the address of this node being edi+0C4h) is contained in eax. Probably cmp ebx, eax never finds them equal, because a heap corruption has introduced a cycle in the list.
I don't think this has anything to do with locks, otherwise we would see some atomic instructions (eg. lock cmpxchg, xchg, etc.) or calls to other synchronization functions.
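To make the suspected failure mode concrete, here is a minimal C sketch of a circular list walk that stops when it gets back to the head node. It is not the ntdll heap code: struct node and bounded_walk are hypothetical, and the step limit exists only so that the demonstration terminates where the real loop would spin.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified list node; real heap entries carry more fields. */
struct node { struct node *next; };

/* Walk from head->next until we arrive back at head, counting nodes.
   Returns the count, or -1 if the walk hits NULL or takes more than
   `limit` steps, which is what a corrupted link looks like. */
long bounded_walk(const struct node *head, long limit) {
    const struct node *p;
    long count = 0;

    assert(head != NULL && limit >= 0);
    for (p = head->next; p != head; p = p->next) {
        if (p == NULL || count == limit)
            return -1;
        count++;
    }
    return count;
}
```

With head -> a -> b -> c -> head the walk returns 3. If a stray write makes c point back to a, the cycle no longer passes through head, the comparison against head never succeeds, and an unbounded version of this loop would run forever, much like the disassembly above.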
I had the same problem with a file close function. In my case, I solved it by placing the close call inside the body of another function instead of giving it its own function.
I was also suspicious of:
(1) the file name being duplicated (2) Windows scheduling (file I/O wasn't completed before the next task's thread was started. Windows scheduling and multi-threading happen behind the curtain, so it is hard to verify, but I had a similar issue when I tried to save a lot of data as ASCII in a loop. Saving as binary solved it in that case.)
My environment: IDE: Visual Studio 2015, OS: Windows 7, language: C++