According to https://www.ethicalhacker.net/columns/heffner/intro-to-assembly-and-reverse-engineering
mov 0xffffffb4,0x1
moves the number 1 into 0xffffffb4.
So, I decided to test this on my own.
In GDB, x is the command to print the value of memory address.
However, when I run
x 0x00000000004004fc
I'm not getting the value of 133 (decimal) or 85 (hexadecimal)
Instead, I'm getting 0x85f445c7. Any idea what is this?
me#box:~/c$ gdb -q test
Reading symbols from test...done.
(gdb) l
1 #include <stdio.h>
2
3 int main(){
4 int a = 1;
5 int b = 13;
6 int c = 133;
7 printf("Value of C : %d\n",c);
8 return 0;
9 }
(gdb) b 7
Breakpoint 1 at 0x400503: file test.c, line 7.
(gdb) r
Starting program: /home/me/c/test
Breakpoint 1, main () at test.c:7
7 printf("Value of C : %d\n",c);
(gdb)
Disassemble
(gdb) disas
Dump of assembler code for function main:
0x00000000004004e6 <+0>: push %rbp
0x00000000004004e7 <+1>: mov %rsp,%rbp
0x00000000004004ea <+4>: sub $0x10,%rsp
0x00000000004004ee <+8>: movl $0x1,-0x4(%rbp)
0x00000000004004f5 <+15>: movl $0xd,-0x8(%rbp)
0x00000000004004fc <+22>: movl $0x85,-0xc(%rbp)
=> 0x0000000000400503 <+29>: mov -0xc(%rbp),%eax
0x0000000000400506 <+32>: mov %eax,%esi
0x0000000000400508 <+34>: mov $0x4005a4,%edi
0x000000000040050d <+39>: mov $0x0,%eax
0x0000000000400512 <+44>: callq 0x4003c0 <printf#plt>
0x0000000000400517 <+49>: mov $0x0,%eax
0x000000000040051c <+54>: leaveq
0x000000000040051d <+55>: retq
End of assembler dump.
(gdb) x 0x00000000004004fc
0x4004fc <main+22>: 0x85f445c7
(gdb)
;DRTL
To print a value in GDB use print or (p in short form) command.
in your command
x 0x00000000004004fc
You have missed p command. You have to use x with p command pair to print value as hexadecimal format, like below:
(gdb) p/x 0x00000000004004fc
If the memory address is some pointer to some structure then you have to cast the memory location before using the pointer. For example,
struct node {
int data;
struct node *next
};
is some structure and you have the address of that structure pointer, then to view the contents of that memory location you have to use
(gdb) p *(struct node *) 0x00000000004004fc
Notable:
The command
x 0x00000000004004fc
Will look at the instruction and related data for this instruction:
0x00000000004004fc <+22>: movl $0x85,-0xc(%rbp)
... as you can see that the left column (address) is equal to the value used for the command (the address to read)
In the instruction 0x85 is clearly the destination address for the mov, and reflected in the printed value; 0x85f445c7 - which stored as MSB (most significant byte) at the address.
Related
This question already has an answer here:
GDB - Address of breakpoint
(1 answer)
Closed 1 year ago.
Here is my function with line numbers
8 | void function(char* string) {
9 | char buffer[16];
10| strcpy(buffer,string);
11| }
Here is gdb disassemble function output
0x000011d4 <+0>: push %ebp
0x000011d5 <+1>: mov %esp,%ebp
0x000011d7 <+3>: push %ebx
0x000011d8 <+4>: sub $0x14,%esp
0x000011db <+7>: call 0x123d <__x86.get_pc_thunk.ax>
0x000011e0 <+12>: add $0x2e20,%eax
0x000011e5 <+17>: sub $0x8,%esp <---- I want Break point here
0x000011e8 <+20>: pushl 0x8(%ebp)
0x000011eb <+23>: lea -0x18(%ebp),%edx
0x000011ee <+26>: push %edx
0x000011ef <+27>: mov %eax,%ebx
0x000011f1 <+29>: call 0x1030 <strcpy#plt>
0x000011f6 <+34>: add $0x10,%esp
0x000011f9 <+37>: nop
0x000011fa <+38>: mov -0x4(%ebp),%ebx
0x000011fd <+41>: leave
0x000011fe <+42>: ret
If I set break point at 0x000011e5 using the following command,
(gdb) b *0x000011e5
and run the program, gdb ignores all breakpoints and exits.
But, if I specify,
b 9, it works.
Here is the output
(gdb) b 10
Breakpoint 1 at 0x4011e5: file hello.c, line 10.
Why are the address different ?
Why are the address different
Because you have a position-independent executable, which is linked at address 0, but relocated to a different address at runtime.
I have written this simple piece of code for testing buffer overflow:
#include <stdio.h>
#include <string.h>
using namespace std;
int f(int x, int y, char *s){
char buf[4];
strcpy(buf,s);
return 0;
}
int main(int argc, char** argv){
f(2,3,argv[1]);
return 0;
}
Then compiling and viewing its execution with gdb (g++ 4.8.4)
g++ -g -fno-stack-protector -o bo bo.c
gdb bo
...
b f
r "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
p $rbp // 0x7fffffffdc90
p $rsp // 0x7fffffffdc70
x/20xw $rsp
0x7fffffffdc70: 0xffffe0ef 0x00007fff 0x00000003 0x00000002
0x7fffffffdc80: 0xffffdcb0 0x00007fff 0x00000000 0x00000000
0x7fffffffdc90: 0xffffdcb0 0x00007fff 0x00400585 0x00000000
0x7fffffffdca0: 0xffffdd98 0x00007fff 0x00000000 0x00000002
0x7fffffffdcb0: 0x00000000 0x00000000 0xf7a36ec5 0x00007fff
My understanding is that the stack grows downward to lower addresses, but it looks this stack frame (from 0x7fffffffdc90 - 0x7fffffffdc90) is growing upward: the parameters are pushed upward (s, y then x). Why is that?
Looks like the return address (0x00400585) is pushed first. But what are the meanings of subsequent words? Are they:
Saved $rbp$?
What are the next 2 words?
To see what happens to your stack after the call of f, call disassembler in gdb:
(gdb) disas
Dump of assembler code for function f(int, int, char*):
0x000000000040052d <+0>: push %rbp
0x000000000040052e <+1>: mov %rsp,%rbp
0x0000000000400531 <+4>: sub $0x20,%rsp
0x0000000000400535 <+8>: mov %edi,-0x14(%rbp)
0x0000000000400538 <+11>: mov %esi,-0x18(%rbp)
0x000000000040053b <+14>: mov %rdx,-0x20(%rbp)
0x000000000040053f <+18>: mov -0x20(%rbp),%rdx
0x0000000000400543 <+22>: lea -0x10(%rbp),%rax
0x0000000000400547 <+26>: mov %rdx,%rsi
0x000000000040054a <+29>: mov %rax,%rdi
=> 0x000000000040054d <+32>: callq 0x400410 <strcpy#plt>
0x0000000000400552 <+37>: mov $0x0,%eax
0x0000000000400557 <+42>: leaveq
0x0000000000400558 <+43>: retq
Before the call to strcpy the stack looks like (I use 64bit formating rather than 32bit):
(gdb) x/6xg $rsp
0x7fffffffddb0: 0x00007fffffffe297 0x0000000200000003
0x7fffffffddc0: 0x00007fffffffddf0 0x0000000000000000
0x7fffffffddd0: 0x00007fffffffddf0 0x0000000000400585
So you can see:
0x0000000000400585 - return address of the function f.
right next to it 0x00007fffffffddf0 - pushed on the stack by 0x000000000040052d <+0>: push %rbp
the next 4 values were reserved on stack via
0x0000000000400531 <+4>: sub $0x20,%rsp
you can see parameters 2 and 3 being saved on the stack prior to the call of the strcpy (0x0000000200000003- because ints are only 4 byte long).
You can also deduce other values on the stack from the disassembly.
The top of the stack is at the beginning (address 0x7fffffffddb0) and the addresses get bigger (e.g. 0x7fffffffddd0 for the third line) so you can see the stack really grows downwards but is shown upside down by gdb.
I've set the disassembly-flavor of the gdb-debugger to Intel (both: su & normal user), but anyway it's still showing the assembly-code in AT&T notation:
patrick#localhost:~/Dokumente/Projekte$ gdb -q ./a.out
Reading symbols from ./a.out...done.
(gdb) break main
Breakpoint 1 at 0x40050e: file firstprog.c, line 5.
(gdb) run
Starting program: /home/patrick/Dokumente/Projekte/a.out
Breakpoint 1, main () at firstprog.c:5
5 for(i=0; i < 10; i++)
(gdb) show disassembly
The disassembly flavor is "intel".
(gdb) info registers
rax 0x400506 4195590
rbx 0x0 0
rcx 0x0 0
rdx 0x7fffffffe2d8 140737488347864
rsi 0x7fffffffe2c8 140737488347848
rdi 0x1 1
rbp 0x7fffffffe1e0 0x7fffffffe1e0
(gdb) info register eip
Invalid register `eip'
I did restart the computer. My OS is Kali Linux amd64.
I have the following questions:
Why is gdb still showing the AT&T notation?
Why is the register EIP (instruction pointer) shown as invalid register?
You are misunderstanding what disassembly flavour means. It means exactly that: what the disassembly looks like when you view machine code in a human-readable(ish) form.
To print registers (or use registers in any other context), you need to use $reg, such as $rip or $pc, $eax, etc.
If I disassemble one of my programs with at&t syntax, gdb shows this:
0x00000000007378f0 <+0>: push %rbp
0x00000000007378f1 <+1>: mov %rsp,%rbp
0x00000000007378f4 <+4>: sub $0x20,%rsp
0x00000000007378f8 <+8>: movl $0x0,-0x4(%rbp)
0x00000000007378ff <+15>: mov %edi,-0x8(%rbp)
0x0000000000737902 <+18>: mov %rsi,-0x10(%rbp)
=> 0x0000000000737906 <+22>: mov -0x10(%rbp),%rsi
0x000000000073790a <+26>: mov (%rsi),%rdi
0x000000000073790d <+29>: callq 0x737950 <FindLibPath(char const*)>
0x0000000000737912 <+34>: xor %eax,%eax
Then do this:
(gdb) set disassembly-flavor intel
(gdb) disass main
Dump of assembler code for function main(int, char**):
0x00000000007378f0 <+0>: push rbp
0x00000000007378f1 <+1>: mov rbp,rsp
0x00000000007378f4 <+4>: sub rsp,0x20
0x00000000007378f8 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x00000000007378ff <+15>: mov DWORD PTR [rbp-0x8],edi
0x0000000000737902 <+18>: mov QWORD PTR [rbp-0x10],rsi
=> 0x0000000000737906 <+22>: mov rsi,QWORD PTR [rbp-0x10]
0x000000000073790a <+26>: mov rdi,QWORD PTR [rsi]
0x000000000073790d <+29>: call 0x737950 <FindLibPath(char const*)>
0x0000000000737912 <+34>: xor eax,eax
and you can see the difference. But the names of registers and how you use registers on the gdb command-line isn't changing, you need a $reg in both cases.
When i execute the following commands i get different address of function()
(gdb) break function()
Breakpoint 1 at function() 0x804834a.
(gdb) print function()
Breakpoint 1 at function() 0x8048344.
Why there is difference in both address?
This output can't be correct, it would be if you did something as:
int func(void) {
int a = 10;
printf("%d\n", a);
return 1;
}
after loading it into the gdb:
(gdb) p func
$1 = {int (void)} 0x4016b0 <func>
(gdb) b func
Breakpoint 1 at 0x4016b6: file file.c, line 4.
(gdb) disassemble func
Dump of assembler code for function func:
0x004016b0 <+0>: push %ebp
0x004016b1 <+1>: mov %esp,%ebp
0x004016b3 <+3>: sub $0x28,%esp
0x004016b6 <+6>: movl $0xa,-0xc(%ebp)
0x004016bd <+13>: mov -0xc(%ebp),%eax
0x004016c0 <+16>: mov %eax,0x4(%esp)
0x004016c4 <+20>: movl $0x405064,(%esp)
0x004016cb <+27>: call 0x403678 <printf>
0x004016d0 <+32>: mov $0x1,%eax
0x004016d5 <+37>: leave
0x004016d6 <+38>: ret
End of assembler dump.
(gdb)
Here func points to the exact first instruction in the function, push %ebp, but when you setup a break point, gdb sets it after stack frame initialization instructions:
0x004016b0 <+0>: push %ebp
0x004016b1 <+1>: mov %esp,%ebp
0x004016b3 <+3>: sub $0x28,%esp
at where the instructions of the function actually begins:
=> 0x004016b6 <+6>: movl $0xa,-0xc(%ebp)
0x004016bd <+13>: mov -0xc(%ebp),%eax
0x004016c0 <+16>: mov %eax,0x4(%esp)
0x004016c4 <+20>: movl $0x405064,(%esp)
0x004016cb <+27>: call 0x403678 <printf>
0x004016d0 <+32>: mov $0x1,%eax
0x004016d5 <+37>: leave
0x004016d6 <+38>: ret
here this instruction:
movl $0xa,-0xc(%ebp) ; 0xa = 10
is this part:
int a = 10;
Gdb sets a breakpoint after function prologue, as before the things are properly set up it could not show the expected state like local variables, etc.
Break therefor sets breakpoint and prints address of first instruction after prologue, whereas print prints the address of actual first instruction in function.
You can set a breakpoint to actual first instruction by doing break *0x8048344, then observe the value of local variables there and after prologue.
consider this :
[mdstest:~/onkar/test]$cat test.c
#include<stdio.h>
int main(int argc,char **argv)
{
printf("%p\n",main);
return 0;
}
[mdstest:~/onkar/test]$make
gcc -g -Wall -o test test.c
[mdstest:~/onkar/test]$./test
0x8048368 ------------------------------------- (1)
[mdstest:~/onkar/test]$gdb test
:::::::::::
:::::::::::
(gdb) b main
Breakpoint 1 at 0x8048384: file test.c, line 5.
(gdb) r
Starting program: /home/mdstest/onkar/test/test
[Thread debugging using libthread_db enabled]
Breakpoint 1, main (argc=1, argv=0xbffff2d4) at test.c:5
5 printf("%p\n",main);
(gdb) disassemble
Dump of assembler code for function main:
0x08048368 <+0>: push %ebp
0x08048369 <+1>: mov %esp,%ebp
0x0804836b <+3>: sub $0x8,%esp
0x0804836e <+6>: and $0xfffffff0,%esp
0x08048371 <+9>: mov $0x0,%eax
0x08048376 <+14>: add $0xf,%eax
0x08048379 <+17>: add $0xf,%eax
0x0804837c <+20>: shr $0x4,%eax
0x0804837f <+23>: shl $0x4,%eax
0x08048382 <+26>: sub %eax,%esp
=> 0x08048384 <+28>: sub $0x8,%esp -----------------------------(2)
0x08048387 <+31>: push $0x8048368
0x0804838c <+36>: push $0x8048480
0x08048391 <+41>: call 0x80482b0 <printf#plt>
0x08048396 <+46>: add $0x10,%esp
0x08048399 <+49>: mov $0x0,%eax
0x0804839e <+54>: leave
0x0804839f <+55>: ret
End of assembler dump.
Why are (1) and (2) addresses different ? That is , why some other address
is getting printed in (1) whereas the debugger stops at some other location ?
When a function is called, the calling function does a bit of stuff, and then issues a call instruction pointing to the function being called.
The callee then does a lot of boilerplate of their own - saving registers, shifting the stack pointer to allocate space for stack variables, etc.
When you ask gdb to break at the start of a function, it breaks after that boilerplate, at the start of your actual code - so the address of the function is going to be earlier than the point at which gdb breaks.
"The address of main" is indeed 0x08048368 -- the address of source line 5, where the breakpoint was set, is just after the standard start-of-function boilerplate, just before the code prepping printf's argument and calling it (so that a n will execute that printf-call statement, for example).