When I debugged a code, and found:
0x08048500 <+0>: push %ebp
0x08048501 <+1>: mov %esp,%ebp
...
0x08048563 <+99>: jmp 0x8048567 <Postion+103> <===0x8048567 doesn't exist an instruction.
0x08048565 <+101>: dec %edx
0x08048566 <+102>: cmp %bh,%al
0x08048568 <+104>: test %edx,%esp
Q: Why does "jmp 0x8048567" jump into <+103>? It doesn't exist an instruction. What's the point? Thanks.
Why does "jmp 0x8048567" jump into <+103>? It doesn't exist an instruction
It's very likely that the instruction at 0x8048567 does exist. You can see it with x/4i 0x8048567.
What is probably happening is that instruction at 0x8048565 doesn't really exist, but GDB doesn't know that, continues disassembling one instruction after another, and gets out of sync with the real instruction stream.
Related
This fragment of code
int foo(int a, int b)
{
return (a == b);
}
generates the following assembly (https://godbolt.org/z/fWsM1zo6q)
foo(int, int):
xorl %eax, %eax
cmpl %esi, %edi
sete %al
ret
According to https://www.felixcloutier.com/x86/setcc
[SETcc] Sets the destination operand to 0 or 1 depending on the settings of the status flags
So what is the point of initializing %eax with zero by doing xorl %eax, %eax first if it will be zero/one depending on result of a == b anyway? Isn't it a waste of CPU clocks that both gcc and clang can't avoid for some reason?
Because setcc sucks: only available in 8-bit operand-size. But you used 32-bit int for the return value, so you need that 8-bit result zero-extended to 32-bit.
Even if you did only want to return a bool or char, you might still do this to avoid a false dependency when writing AL. xor-zeroing doesn't cost "a cycle", it costs 1 uop (and is as cheap as a nop on Intel), but that's still not free. (https://agner.org/optimize/)
Unfortunately AMD64 didn't change setcc, nor did any later extensions, so producing a 32-bit 0/1 is still a pain on x86 even with -march=icelake-client or znver3. Having a 66 operand-size or rep prefix modify setcc to use 32-bit operand-size would have been helpful to avoid wasting an instruction (and front-end uop) for this, but neither vendor has ever bothered to introduce an extension like that. (Usually only extensions that can give major speedups in a few "hot" functions that you can do dynamic dispatch for, not things that need to be used everywhere to add up to a small improvement.)
xor-zeroing before the setcc is the least-bad way, when you have a spare register, as discussed at the bottom of my answer on What is the best way to set a register to zero in x86 assembly: xor, mov or and?.
The other options, if you do want to overwrite a compare input include:
1. mov-imm32=0 which you can do after a compare, not affecting FLAGS:
# for example if you want to replace a compare input with a boolean
cmp %ecx, %eax
mov $0, %eax
setcc %al
This wastes code-size (5 bytes vs. 2 for mov vs. xor), and (on Intel P6-family) has a partial register stall when reading EAX, because no xor-zeroing was used to set the internal RAX=AL upper-bytes-known-zero state.
The mov-immediate is off the critical path, so out-of-order exec can get it done early, before the compare inputs are ready, and have that zeroed register ready for setcc to write into.
(On Intel SnB-family CPUs, xor-zeroing is handled in the rename logic, so it doesn't have to execute early to get the zero ready; it's already done when it enters the back-end. e.g. after a front-end stall, xor-zeroing and setcc could enter the back-end in the same cycle, but the setcc could still execute in the first cycle after that, unlike if it was a mov-immediate that would have to actually run on a back-end execution unit to write a zero to a register.)
2. MOVZX on an 8-bit setcc result
cmp %ecx, %eax
setcc %cl
movzbl %cl, %eax
This is mostly even worse, except on P6-family where it avoids a partial-register stall.
But movzx is on the critical path from compare inputs being ready to 0/1 result being ready. (Although IvyBridge and later can run it with zero latency when it's between two separate registers, which is why I used %cl instead of %al. Compilers normally don't optimize for this, and would setcc %al / movzbl %al, %eax if they don't manage to xor-zero something first. This defeats mov-elimination even on Intel CPUs that have it.)
setcc %cl has a false dependency on RCX (except on Intel P6-family which renames low8 registers separately from the full register), but that's ok because RCX and RAX were both already part of the dependency chain leading to setcc.
If you're not overwriting one of the compare inputs, xor-zero the separate destination register. setcc %al / movzbl %al, %eax after cmp %esi, %edi would be the worst of all possible options, because RAX might have last been written by a cache-miss load of something independent, or a slow div or something like that before the function call, so you could be coupling this dependency chain into it.
Is GDB capable of changing certain program tasks? such as from "jle"(Jump less than or equal to) to "jge"(jump greater than or equal to).
from: 0x0000000000400563 <+45>: jle 0x400547 <main+17>
to: 0x0000000000400563 <+45>: jge 0x400547 <main+17>
Is GDB capable of changing memory arithmetics?
What is "memory arithmetic"?
such as from "jle"(Jump less than or equal to) to "jge"(jump greater than or equal to).
Your question is exceedingly unclear. I think you are asking: can I change a program to perform JGE instead of current JLE instruction in GDB?
If that is your question, the answer is yes.
Find the opcode that is currently being used with disas/r 0x400563,0x400564 command.
Change the opcode to that of JGE with an assignment to the instruction opcode.
Example:
(gdb) disas/r 0x400496,0x400497
Dump of assembler code from 0x400496 to 0x400497:
0x0000000000400496 <main(int, char**)+15>: 7e 07 jle 0x40049f <main(int, char**)+24>
End of assembler dump.
The opcode is 0x7E. The opcode for JGE is 0x7D (table).
Let's patch. You can do that for a running process:
(gdb) start
(gdb) set *(char*)0x400496 = 0x7d
(gdb) disas/r 0x400496,0x400497
Dump of assembler code from 0x400496 to 0x400497:
0x0000000000400496 <main(int, char**)+15>: 7d 07 jge 0x40049f <main(int, char**)+24>
End of assembler dump.
Or you can update the binary on disk, but invoking gdb --write a.out.
This question already has answers here:
GDB Cannot insert breakpoint, Cannot access memory at address XXX? [duplicate]
(2 answers)
Closed 5 years ago.
Dump of assembler code for function main():
0x000000000000071a <+0>: push rbp
0x000000000000071b <+1>: mov rbp,rsp
0x000000000000071e <+4>: sub rsp,0x20
0x0000000000000722 <+8>: mov rax,QWORD PTR fs:0x28
0x000000000000072b <+17>: mov QWORD PTR [rbp-0x8],rax
0x000000000000072f <+21>: xor eax,eax
0x0000000000000731 <+23>: lea rax,[rbp-0x20]
0x0000000000000735 <+27>: mov rdi,rax
0x0000000000000738 <+30>: call 0x764 <Test::Test()>
0x000000000000073d <+35>: lea rax,[rbp-0x20]
0x0000000000000741 <+39>: mov rdi,rax
0x0000000000000744 <+42>: call 0x7ae <Test::a()>
0x0000000000000749 <+47>: mov eax,0x0
0x000000000000074e <+52>: mov rdx,QWORD PTR [rbp-0x8]
0x0000000000000752 <+56>: xor rdx,QWORD PTR fs:0x28
0x000000000000075b <+65>: je 0x762 <main()+72>
0x000000000000075d <+67>: call 0x5f0 <__stack_chk_fail#plt>
0x0000000000000762 <+72>: leave
0x0000000000000763 <+73>: ret
End of assembler dump.
I have a problem.. I'm trying to debug the program but the addresses are weird and I can't read the registers(after start). "The program has no registers now."
and that's happens at any program that I've compiled in my computer.
EDIT:
gef➤ break*0x0000000000000763
Breakpoint 1 at 0x763: file 1.cpp, line 36.
gef➤ r
Starting program: /root/Desktop/Challenges/AdvancedMemoryChallenges/1.bin
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x763
gef➤ info reg $rip
rip 0x7ffff7dd9c20 0x7ffff7dd9c20
gef➤
gef➤ start
[+] Breaking at '{int (void)} 0x55555555471a <main()>'
[!] Command 'entry-break' failed to execute properly, reason: Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x763
0x763 is an address before relocation. (It is unclear whether it is from an object file or the actual executable.)
The addresses of code in a running program are never this low in the address space.
You need to set a breakpoint on _start or main, start the program, and see which addresses the kernel assigns to the machine code in question. The GDB disassemble command print will print such addresses.
GDB automatically disables address space layout randomization (ASLR), so the addresses will be constant as long as you do not change the program, its libraries, or the kernel (which sometimes results in process layout changes, too).
I ran the debugger on CodeBlocks and viewed the disassembly window.
The full source code for the program I debugged is the following:
int main(){}
and the assembly code I saw in the window was this:
00401020 push %ebp
00401021 mov %esp,%ebp
00401023 push %ebx
00401024 sub $0x34,%esp
00401027 movl $0x401150,(%esp)
0040102E call 0x401984 <SetUnhandledExceptionFilter#4>
00401033 sub $0x4,%esp
00401036 call 0x401330 <__cpu_features_init>
0040103B call 0x401740 <fpreset>
00401040 lea -0x10(%ebp),%eax
00401043 movl $0x0,-0x10(%ebp)
0040104A mov %eax,0x10(%esp)
0040104E mov 0x402000,%eax
00401053 movl $0x404004,0x4(%esp)
0040105B movl $0x404000,(%esp)
00401062 mov %eax,0xc(%esp)
00401066 lea -0xc(%ebp),%eax
00401069 mov %eax,0x8(%esp)
0040106D call 0x40192c <__getmainargs>
00401072 mov 0x404008,%eax
00401077 test %eax,%eax
00401079 jne 0x4010c5 <__mingw_CRTStartup+165>
0040107B call 0x401934 <__p__fmode>
00401080 mov 0x402004,%edx
00401086 mov %edx,(%eax)
00401088 call 0x4014f0 <_pei386_runtime_relocator>
0040108D and $0xfffffff0,%esp
00401090 call 0x401720 <__main>
00401095 call 0x40193c <__p__environ>
0040109A mov (%eax),%eax
0040109C mov %eax,0x8(%esp)
004010A0 mov 0x404004,%eax
004010A5 mov %eax,0x4(%esp)
004010A9 mov 0x404000,%eax
004010AE mov %eax,(%esp)
004010B1 call 0x401318 <main>
004010B6 mov %eax,%ebx
004010B8 call 0x401944 <_cexit>
004010BD mov %ebx,(%esp)
004010C0 call 0x40198c <ExitProcess#4>
004010C5 mov 0x4050f4,%ebx
004010CB mov %eax,0x402004
004010D0 mov %eax,0x4(%esp)
004010D4 mov 0x10(%ebx),%eax
004010D7 mov %eax,(%esp)
004010DA call 0x40194c <_setmode>
004010DF mov 0x404008,%eax
004010E4 mov %eax,0x4(%esp)
004010E8 mov 0x30(%ebx),%eax
004010EB mov %eax,(%esp)
004010EE call 0x40194c <_setmode>
004010F3 mov 0x404008,%eax
004010F8 mov %eax,0x4(%esp)
004010FC mov 0x50(%ebx),%eax
004010FF mov %eax,(%esp)
00401102 call 0x40194c <_setmode>
00401107 jmp 0x40107b <__mingw_CRTStartup+91>
0040110C lea 0x0(%esi,%eiz,1),%esi
Is it normal to get this much assembly code from so little C++ code?
By normal, I mean is this close to the average amount of assembly code the MinGW compiler generates relative to the amount of C++ source code I provided above?
Yes, this is fairly typical startup/shutdown code.
Before your main runs, a few things need to happen:
stdin/stdout/stderr get opened
cin/cout/cerr/clog get opened, referring to stdin/stdout/stderr
Any static objects you define get initialized
command line gets parsed to produce argc/argv
environment gets retrieved (maybe)
Likewise, after your main exits, a few more things have to happen:
Anything set up with atexit gets run
Your static objects get destroyed
cin/cout/cerr/clog get destroyed
all open output streams get flushed and closed
all open input streams get closed
Depending on the platform, there may be a few more things as well, such as setting up some default exception handlers (for either C++ exceptions, some platform-specific exceptions, or both).
Note that most of this is fixed code that gets linked into essentially every program, regardless of what it does or doesn't contain. In theory, they can use some tricks (e.g., "weak externals") to avoid linking in some of this code when it isn't needed, but most of what's above is used so close to universally (and the code to handle it is sufficiently trivial) that it's pretty rare to bother going to any work to eliminate this little bit of code, even when it's not going to be used (like your case, where nothing gets used at all).
Note that what you've shown is startup/shutdown code though. It's linked into your program, traditionally from a file named something like crt0 (along with, perhaps, some additional files).
If you look through your file for the code generated for main itself, you'll probably find that it's a lot shorter--possibly as short and simple as just ret. It may be so tiny that you missed the fact that it's there at all though.
This call 0x401318 <main>
is what you code resolved to, basically. main() is a function and there is code surrounding it, often called something like __start and __end.
What you see amounts, in part, to the CRT support code in __start, and cleanup afterward in __end.
Can anyone explain to me the assembly generated by GCC of the following C++ codes? espeically, the meaning of setg and test in the codes. thx!
.cpp codes:
1 /*for loop*/
2 int main()
3 {
4 int floop_id;
5 for(floop_id=100;floop_id>=1;floop_id--)
6 {}
7 return 0;
8 }
assembly codes:
3 push %ebp
3 mov %esp, %ebp
3 sub $0x10,%esp
5 movl $0x64,-0x4(%ebp)
5 jmp 8048457<main+0x13>
5 subl $0x1,-0x4(%esp)
5 cmpl $0x0,-0x4(%esp)
5 setg %al
5 test %al, %al
7 mov $0x0,%eax
8 leave
8 ret
cmpl $0x0,-0x4(%esp); setg %al means compare -0x4(%esp) (floop_id in your code) against 0, and set %al to 1 if it's greater, or 0 otherwise.
test %al, %al here isn't doing anything. I don't know why it's in the assembly. (Normally, testing a value with itself is used to get the signum of the value (i.e., zero, positive, or negative), but the result of this isn't being used here. Chances are, it was going to do a conditional branch (to implement the loop), but seeing as your loop is empty, it got removed.)
Your generated assembly code doesn't have the cycle in it (apparently the compiler decided that it is not needed), but it seems to contain some loose remains of that cycle. There are bits that load 100 into a variable, subtract 1 from it, compare it with 0. But there's no actual iteration in the code.
Trying to find any logic in this is a pointless exercise. The compiler, obviously, decided to remove the entire cycle. But why it left some "debris" behind is not clear. I'd say that what is left behind in the code is harmless, but at the same time has as much meaning as a value of an initialized variable.
BTW, where does the unconditional jmp lead to? It is not clear from your disassembly. Doesn't it just jump to 7 right away?