I've got a very simple main function, with a for loop inside it, like below:
#include<stdio.h>
int main()
{
for(int i=0;i<30;++i)
printf("%d\n",i);
return 0;
}
I tried to compile it like:
gcc 4.c -g
Then I debug it with gdb:
$ gdb a.out
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04)...
Reading symbols from a.out...done.
(gdb) list
1 #include<stdio.h>
2 int main()
3 {
4 for(int i=0;i<30;++i)
5 printf("%d\n",i);
6 return 0;
7 }
(gdb) b 5
Breakpoint 1 at 0x400537: file 4.c, line 5.
(gdb) b 6
Breakpoint 2 at 0x400555: file 4.c, line 6.
(gdb) r
Starting program: /home/a/cpp/a.out
Breakpoint 1, main () at 4.c:5
5 printf("%d\n",i);
(gdb) p i
$1 = 0
(gdb) u
0
4 for(int i=0;i<30;++i)
(gdb) u //not exiting for loop?
Breakpoint 1, main () at 4.c:5
5 printf("%d\n",i);
(gdb)
1
4 for(int i=0;i<30;++i)
(gdb) u
It seems the "u" command doesn't run through the rest of the for loop and stop at the next breakpoint; instead it behaves like the "n" command.
Why? Am I misunderstanding something?
Thanks.
It appears that gdb has to go through the loop once in order to understand the loop structure.
(gdb) list
1 #include<stdio.h>
2 int main()
3 {
4 for(int i=0;i<5;++i)
5 {
6 printf("%d\n",i);
7 }
8 return 0;
9 }
10
(gdb) b main
Breakpoint 1 at 0x400535: file junk.cpp, line 4.
(gdb) b 8
Breakpoint 2 at 0x40055c: file junk.cpp, line 8.
(gdb) r
Starting program: /tmp/local/matcher_server/bin/a.out
Breakpoint 1, main () at junk.cpp:4
4 for(int i=0;i<30;++i)
(gdb) n
6 printf("%d\n",i);
(gdb) n
0
4 for(int i=0;i<30;++i)
(gdb) u
1
2
3
4
Breakpoint 2, main () at junk.cpp:8
8 return 0;
To understand why, we need to look at the assembler for main:
(gdb) disass
Dump of assembler code for function main():
0x000000000040052d <+0>: push %rbp
0x000000000040052e <+1>: mov %rsp,%rbp
0x0000000000400531 <+4>: sub $0x10,%rsp
0x0000000000400535 <+8>: movl $0x0,-0x4(%rbp)
0x000000000040053c <+15>: jmp 0x400556 <main()+41>
0x000000000040053e <+17>: mov -0x4(%rbp),%eax
0x0000000000400541 <+20>: mov %eax,%esi
0x0000000000400543 <+22>: mov $0x4005f4,%edi
0x0000000000400548 <+27>: mov $0x0,%eax
0x000000000040054d <+32>: callq 0x400410 <printf@plt>
=> 0x0000000000400552 <+37>: addl $0x1,-0x4(%rbp)
0x0000000000400556 <+41>: cmpl $0x1d,-0x4(%rbp)
0x000000000040055a <+45>: jle 0x40053e <main()+17>
0x000000000040055c <+47>: mov $0x0,%eax
0x0000000000400561 <+52>: leaveq
0x0000000000400562 <+53>: retq
End of assembler dump.
and at the line details from dwarfdump:
.debug_line: line number info for a single cu
Source lines (from CU-DIE at .debug_info offset 0x0000000b):
<pc> [row,col] NS BB ET PE EB IS= DI= uri: "filepath"
NS new statement, BB new basic block, ET end of text sequence
PE prologue end, EB epilogue begin
IS=val ISA number, DI=val discriminator value
0x0040052d [ 3, 0] NS uri: "/tmp/local/matcher_server/bin/junk.cpp"
0x00400535 [ 4, 0] NS
0x0040053e [ 6, 0] NS DI=0x2
0x00400552 [ 4, 0] NS DI=0x2
0x00400556 [ 4, 0] DI=0x1
0x0040055c [ 8, 0] NS
0x00400561 [ 9, 0] NS
0x00400563 [ 9, 0] NS ET
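The bracketed [row,col] column, such as [ 3, 0], gives the source line and column number. As we can see, the loop makes the line numbers non-sequential: 3, 4, 6, then back to 4.
I suspect that the first time the program reaches line 6 and the 'u' command is given, gdb simply steps like 'n', because no backward jump has been encountered yet. The gdb manual describes 'until' as behaving like 'next', except that when it encounters a jump it keeps running until the program counter is greater than the address of the jump. Here the backward jump (the jle at <+45>) belongs to line 4, so 'u' only runs the loop to completion when it is issued from line 4, which is what happens on the second attempt. It may look like a small bug, but it is more likely an artifact of how the 'u' command is implemented.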
Note that gdb will still hit breakpoints during the 'u' command. In your example, you will need to remove the breakpoint on the printf.
I want to harden my program against hackers, so I have a program::validator class which validates my environment using some parameters. I:
Compile program::validator as a shared library.
Compile the program using -O2 and -ffast-math, and link it to libprogramvalidator.so.
Run the program under GDB.
Find the line which actually calls program::validator::is_valid_system().
I want to know whether I can skip the execution of that line.
I just want to avoid the call to the is_valid_system function in my ELF executable.
There are several easy ways. You can use GDB jump $address, or return commands to achieve this. Example:
#include <stdio.h>
int is_valid_system()
{
    return 0;
}
int main()
{
    if (is_valid_system()) {
        printf("Life is good\n");
        return 0;
    }
    printf("Invalid system detected\n");
    return 1;
}
As you can see, running the above program will always print Invalid system detected and exit with error code 1. Let's confirm that:
gcc t.c && gdb -q ./a.out
(gdb) run
Starting program: /tmp/a.out
Invalid system detected
[Inferior 1 (process 180727) exited with code 01]
Ok, now let's make the program print Life is good. Let's do that via return. To achieve that, set a breakpoint on the desired function, set the return register ($rax on x86_64) to the desired value, and use return to force the function to return immediately:
(gdb) b is_valid_system
Breakpoint 1 at 0x1139
(gdb) run
Starting program: /tmp/a.out
Breakpoint 1, 0x0000555555555139 in is_valid_system ()
(gdb) set $rax = 1
(gdb) return
#0 0x000055555555514e in main ()
(gdb) c
Continuing.
Life is good
[Inferior 1 (process 196141) exited normally]
Alternatively, you can "jump over" the function. Disassemble the caller, break on the CALL instruction, set the return register to the desired value, and jump to the next instruction:
(gdb) disas main
Dump of assembler code for function main:
0x0000555555555140 <+0>: push %rbp
0x0000555555555141 <+1>: mov %rsp,%rbp
0x0000555555555144 <+4>: mov $0x0,%eax
0x0000555555555149 <+9>: callq 0x555555555135 <is_valid_system>
0x000055555555514e <+14>: test %eax,%eax
0x0000555555555150 <+16>: je 0x555555555165 <main+37>
0x0000555555555152 <+18>: lea 0xeab(%rip),%rdi # 0x555555556004
0x0000555555555159 <+25>: callq 0x555555555030 <puts@plt>
0x000055555555515e <+30>: mov $0x0,%eax
0x0000555555555163 <+35>: jmp 0x555555555176 <main+54>
0x0000555555555165 <+37>: lea 0xea5(%rip),%rdi # 0x555555556011
0x000055555555516c <+44>: callq 0x555555555030 <puts@plt>
0x0000555555555171 <+49>: mov $0x1,%eax
0x0000555555555176 <+54>: pop %rbp
0x0000555555555177 <+55>: retq
End of assembler dump.
(gdb) b *0x0000555555555149
Breakpoint 2 at 0x555555555149
(gdb) run
Starting program: /tmp/a.out
Breakpoint 2, 0x0000555555555149 in main ()
(gdb) set $rax = 1
(gdb) jump *0x000055555555514e
Continuing at 0x55555555514e.
Life is good
[Inferior 1 (process 205378) exited normally]
You could also use GDB to temporarily or permanently patch the is_valid_system out. Details in this answer.
This is another variant of the common mistake "Thinking that you can trust your environment, even when you cannot trust your environment".
You implicitly trust that the compiler is a real compiler, the linker a real linker, GDB the real GDB, and the disassembler a real disassembler. You have given hackers not one but four ways to attack your program.
The C++ standard says that it is unspecified whether or not a reference requires storage (3.7). However, as far as I understand, gcc implements C++ references as pointers, and as such they can be corrupted.
Is it possible to get an address of a reference in gdb and put a hardware breakpoint on that address in order to find out what corrupts the memory where the reference resides? How can one set such a breakpoint?
GDB supports hardware watchpoints. You can use the watch command for this. Example:
main.cpp:
int main(int argc, char **argv)
{
int a = 0;
int& b = a;
int* c = &a;
*c = 1;
return 0;
}
Start debugging and set breakpoints at the start and at the end of main:
(gdb) b main
Breakpoint 1 at 0x401bc8: file /../main.cpp, line 60.
(gdb) b main.cpp:65
Breakpoint 2 at 0x401be9: file /../main.cpp, line 65.
(gdb) r
Get the address of reference b:
Breakpoint 1, main (argc=1, argv=0x7fffffffddd8) at /../main.cpp:60
60 int a = 0;
(gdb) disas /m
Dump of assembler code for function main(int, char**):
59 {
... Something code
60 int a = 0;
=> 0x0000000000401bc8 <+11>: movl $0x0,-0x14(%rbp)
61 int& b = a;
0x0000000000401bcf <+18>: lea -0x14(%rbp),%rax
0x0000000000401bd3 <+22>: mov %rax,-0x10(%rbp)
62 int* c = &a;
0x0000000000401bd7 <+26>: lea -0x14(%rbp),%rax
0x0000000000401bdb <+30>: mov %rax,-0x8(%rbp)
63 *c = 1;
0x0000000000401bdf <+34>: mov -0x8(%rbp),%rax
0x0000000000401be3 <+38>: movl $0x1,(%rax)
64
65 return 0;
0x0000000000401be9 <+44>: mov $0x0,%eax
66 }
0x0000000000401bee <+49>: pop %rbp
0x0000000000401bef <+50>: retq
End of assembler dump.
(gdb) p $rbp-0x10
$1 = (void *) 0x7fffffffdce0
p $rbp-0x10 prints the address of reference b, which is 0x7fffffffdce0.
Set a watchpoint on this address:
(gdb) watch *0x7fffffffdce0
Hardware watchpoint 3: *0x7fffffffdce0
(gdb) c
GDB stops only when the watched value changes:
(gdb) c
Continuing.
Hardware watchpoint 3: *0x7fffffffdce0
Old value = -8752
New value = -8996
main (argc=1, argv=0x7fffffffddd8) at /../main.cpp:62
62 int* c = &a;
Sorry for my english!
According to https://www.ethicalhacker.net/columns/heffner/intro-to-assembly-and-reverse-engineering
mov 0xffffffb4,0x1
moves the number 1 into 0xffffffb4.
So, I decided to test this on my own.
In GDB, x is the command to print the contents of a memory address.
However, when I run
x 0x00000000004004fc
I'm not getting the value 133 (decimal) or 0x85 (hexadecimal).
Instead, I'm getting 0x85f445c7. Any idea what this is?
me@box:~/c$ gdb -q test
Reading symbols from test...done.
(gdb) l
1 #include <stdio.h>
2
3 int main(){
4 int a = 1;
5 int b = 13;
6 int c = 133;
7 printf("Value of C : %d\n",c);
8 return 0;
9 }
(gdb) b 7
Breakpoint 1 at 0x400503: file test.c, line 7.
(gdb) r
Starting program: /home/me/c/test
Breakpoint 1, main () at test.c:7
7 printf("Value of C : %d\n",c);
(gdb)
Disassemble
(gdb) disas
Dump of assembler code for function main:
0x00000000004004e6 <+0>: push %rbp
0x00000000004004e7 <+1>: mov %rsp,%rbp
0x00000000004004ea <+4>: sub $0x10,%rsp
0x00000000004004ee <+8>: movl $0x1,-0x4(%rbp)
0x00000000004004f5 <+15>: movl $0xd,-0x8(%rbp)
0x00000000004004fc <+22>: movl $0x85,-0xc(%rbp)
=> 0x0000000000400503 <+29>: mov -0xc(%rbp),%eax
0x0000000000400506 <+32>: mov %eax,%esi
0x0000000000400508 <+34>: mov $0x4005a4,%edi
0x000000000040050d <+39>: mov $0x0,%eax
0x0000000000400512 <+44>: callq 0x4003c0 <printf@plt>
0x0000000000400517 <+49>: mov $0x0,%eax
0x000000000040051c <+54>: leaveq
0x000000000040051d <+55>: retq
End of assembler dump.
(gdb) x 0x00000000004004fc
0x4004fc <main+22>: 0x85f445c7
(gdb)
TL;DR
To print a value in GDB, use the print command (p in short form).
Your command
x 0x00000000004004fc
examines the memory at that address, which is where the movl instruction itself is stored, not the variable c. To print the value of c in hexadecimal, use p with the /x format, like below:
(gdb) p/x c
If the memory address points to some structure, then you have to cast the address to the right pointer type before dereferencing it. For example, if
struct node {
    int data;
    struct node *next;
};
is some structure and you have the address of an instance of it, then to view the contents of that memory location you have to use
(gdb) p *(struct node *) 0x00000000004004fc
Note:
The command
x 0x00000000004004fc
reads the raw bytes stored at that address, which are the encoding of this instruction:
0x00000000004004fc <+22>: movl $0x85,-0xc(%rbp)
... as you can see, the left column (address) is equal to the value used for the command (the address to read).
In this instruction, 0x85 is the immediate value being stored; the destination is -0xc(%rbp). It shows up in the printed value 0x85f445c7 because x reads the first four bytes of the instruction encoding, c7 45 f4 85, as one little-endian word, so the last of those bytes, 0x85, ends up as the most significant byte.
How can I disassemble a file in gdb after it has been processed with the strip command?
You can use the GDB x/i command, e.g.
(gdb) x/4i 0x400390
0x400390: xor %ebp,%ebp
0x400392: mov %rdx,%r9
0x400395: pop %rsi
0x400396: mov %rsp,%rdx
But what you are probably looking for is objdump -d a.out
You can also use the disassemble command. It works like x /i, but it has the optional /r and /m flags which, respectively, show you the raw encoding of the instructions and the corresponding source code line numbers.
With disassemble /rm:
(gdb) p free
$1 = {void (void *)} 0x7ffff7df0980 <free>
(gdb) disassemble /rm free,+13
Dump of assembler code from 0x7ffff7df0980 to 0x7ffff7df098d:
121 in dl-minimal.c
0x00007ffff7df0987 <free+7>: 53 push %rbx
0x00007ffff7df0988 <free+8>: 48 89 fb mov %rdi,%rbx
122 in dl-minimal.c
123 in dl-minimal.c
0x00007ffff7df0980 <free+0>: 48 3b 3d 49 d8 20 00 cmp 0x20d849(%rip),%rdi # 0x7ffff7ffe1d0 <alloc_last_block>
0x00007ffff7df098b <free+11>: 74 03 je 0x7ffff7df0990 <free+16>
End of assembler dump.
With x /i:
(gdb) p free
$3 = {void (void *)} 0x7ffff7df0980 <free>
(gdb) x /4i free
0x7ffff7df0980 <free>: cmp 0x20d849(%rip),%rdi # 0x7ffff7ffe1d0 <alloc_last_block>
0x7ffff7df0987 <free+7>: push %rbx
0x7ffff7df0988 <free+8>: mov %rdi,%rbx
0x7ffff7df098b <free+11>: je 0x7ffff7df0990 <free+16>
The advantage (depending on your needs) of x /i over disassemble though, is that x /i accepts a size in instructions whereas disassemble takes a size in bytes.
Consider this:
[mdstest:~/onkar/test]$cat test.c
#include<stdio.h>
int main(int argc,char **argv)
{
printf("%p\n",main);
return 0;
}
[mdstest:~/onkar/test]$make
gcc -g -Wall -o test test.c
[mdstest:~/onkar/test]$./test
0x8048368 ------------------------------------- (1)
[mdstest:~/onkar/test]$gdb test
:::::::::::
:::::::::::
(gdb) b main
Breakpoint 1 at 0x8048384: file test.c, line 5.
(gdb) r
Starting program: /home/mdstest/onkar/test/test
[Thread debugging using libthread_db enabled]
Breakpoint 1, main (argc=1, argv=0xbffff2d4) at test.c:5
5 printf("%p\n",main);
(gdb) disassemble
Dump of assembler code for function main:
0x08048368 <+0>: push %ebp
0x08048369 <+1>: mov %esp,%ebp
0x0804836b <+3>: sub $0x8,%esp
0x0804836e <+6>: and $0xfffffff0,%esp
0x08048371 <+9>: mov $0x0,%eax
0x08048376 <+14>: add $0xf,%eax
0x08048379 <+17>: add $0xf,%eax
0x0804837c <+20>: shr $0x4,%eax
0x0804837f <+23>: shl $0x4,%eax
0x08048382 <+26>: sub %eax,%esp
=> 0x08048384 <+28>: sub $0x8,%esp -----------------------------(2)
0x08048387 <+31>: push $0x8048368
0x0804838c <+36>: push $0x8048480
0x08048391 <+41>: call 0x80482b0 <printf@plt>
0x08048396 <+46>: add $0x10,%esp
0x08048399 <+49>: mov $0x0,%eax
0x0804839e <+54>: leave
0x0804839f <+55>: ret
End of assembler dump.
Why are the addresses at (1) and (2) different? That is, why does the program print one address in (1) while the debugger stops at a different location?
When a function is called, the calling function does a bit of stuff, and then issues a call instruction pointing to the function being called.
The callee then does a lot of boilerplate of its own: saving registers, shifting the stack pointer to allocate space for stack variables, etc.
When you ask gdb to break at the start of a function, it breaks after that boilerplate, at the start of your actual code - so the address of the function is going to be earlier than the point at which gdb breaks.
"The address of main" is indeed 0x08048368 -- the address of source line 5, where the breakpoint was set, is just after the standard start-of-function boilerplate, just before the code prepping printf's argument and calling it (so that a n will execute that printf-call statement, for example).