How to disassemble elf stripped file in gdb? - gdb

How to disassemble file after use strip command in gdb?

You can use GDB x/i command, e.g.
(gdb) x/4i 0x400390
0x400390: xor %ebp,%ebp
0x400392: mov %rdx,%r9
0x400395: pop %rsi
0x400396: mov %rsp,%rdx
But what you are probably looking for is objdump -d a.out

You can also use the disassemble command. It works like x /i , but it has the optional r nd m flags which, respectively, show you the raw encoding of the instructions and the source code line number correspondance.
With disassemble /rm:
(gdb) p free
$1 = {void (void *)} 0x7ffff7df0980 <free>
(gdb) disassemble /rm free,+13
Dump of assembler code from 0x7ffff7df0980 to 0x7ffff7df098d:
121 in dl-minimal.c
0x00007ffff7df0987 <free+7>: 53 push %rbx
0x00007ffff7df0988 <free+8>: 48 89 fb mov %rdi,%rbx
122 in dl-minimal.c
123 in dl-minimal.c
0x00007ffff7df0980 <free+0>: 48 3b 3d 49 d8 20 00 cmp 0x20d849(%rip),%rdi # 0x7ffff7ffe1d0 <alloc_last_block>
0x00007ffff7df098b <free+11>: 74 03 je 0x7ffff7df0990 <free+16>
End of assembler dump
With x /i:
(gdb) p free
$3 = {void (void *)} 0x7ffff7df0980 <free>
(gdb) x /4i free
0x7ffff7df0980 <free>: cmp 0x20d849(%rip),%rdi # 0x7ffff7ffe1d0 <alloc_last_block>
0x7ffff7df0987 <free+7>: push %rbx
0x7ffff7df0988 <free+8>: mov %rdi,%rbx
0x7ffff7df098b <free+11>: je 0x7ffff7df0990 <free+16>
The advantage (depending on your needs) of x /i over disassemble though, is that x /i accepts a size in instructions whereas disassemble takes a size in bytes.

Related

Dwarf DW_AT_location objdump and dwarfdump inconsistent

I am playing around with CPython and trying to understand how a debugger works.
Specifically, I am trying to get the location of the last PyFrameObject so that I can traverse that and get the Python backtrace.
In the file ceval.c, line 689 has the definition of the function:
PyObject * PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
What I am interested in getting is the location of f on the stack. When dumping the binary with dwarfdump I get that f is at $rbp-824, but if I dump the binary with objdump I get that the location is $rbp-808 - a discrepancy of 16. Also, when debugging with GDB, I get that the correct answer is $rbp-808 like objdump gives me. Why the discrepancy, and why is dwarfdump incorrect? What am I not understanding?
How to technically recreate the problem:
Download python-2.7.17.tgz from Python website. Extract.
I compiled python-2.7.17 from source with debug symbols (./configure --enable-pydebug && make). Run the following commands on the resulting python binary:
dwarfdump Python-2.7.17/python has the following output:
DW_AT_name f
DW_AT_decl_file 0x00000001 /home/meir/code/python/Python-2.7.17/Python/ceval.c
DW_AT_decl_line 0x000002b1
DW_AT_type <0x00002916>
DW_AT_location len 0x0003: 91c879: DW_OP_fbreg -824
I know this is the correct f because the line the variable is declared on is 689 (0x2b1). As you can see the location is:
DW_AT_location len 0x0003: 91c879: DW_OP_fbreg -824: Meaning $rbp-824.
Running the command objdump -S Python-2.7.17/python has the following output:
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
f7577: 55 push %rbp
f7578: 48 89 e5 mov %rsp,%rbp
f757b: 41 57 push %r15
f757d: 41 56 push %r14
f757f: 41 55 push %r13
f7581: 41 54 push %r12
f7583: 53 push %rbx
f7584: 48 81 ec 38 03 00 00 sub $0x338,%rsp
f758b: 48 89 bd d8 fc ff ff mov %rdi,-0x328(%rbp)
f7592: 89 b5 d4 fc ff ff mov %esi,-0x32c(%rbp)
f7598: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
f759f: 00 00
f75a1: 48 89 45 c8 mov %rax,-0x38(%rbp)
f75a5: 31 c0 xor %eax,%eax
Debugging this output will show you that the relevant line is:
f758b: 48 89 bd d8 fc ff ff mov %rdi,-0x328(%rbp) where you can clearly see that f is being loaded from -0x328(%rbp) which is $rbp-808. Also, GDB supports this finding.
So again, the question is, what am I missing and why the 16 byte discrepency between dwarfdump and reality?
Thanks
Edit:
The dwarfdump including the function above is:
< 1><0x00004519> DW_TAG_subprogram
DW_AT_external yes(1)
DW_AT_name PyEval_EvalFrameEx
DW_AT_decl_file 0x00000001 /home/meir/code/python/Python-2.7.17/Python/ceval.c
DW_AT_decl_line 0x000002b1
DW_AT_prototyped yes(1)
DW_AT_type <0x00000817>
DW_AT_low_pc 0x000f7577
DW_AT_high_pc <offset-from-lowpc>53969
DW_AT_frame_base len 0x0001: 9c: DW_OP_call_frame_cfa
DW_AT_GNU_all_tail_call_sites yes(1)
DW_AT_sibling <0x00005bbe>
< 2><0x0000453b> DW_TAG_formal_parameter
DW_AT_name f
DW_AT_decl_file 0x00000001 /home/meir/code/python/Python-2.7.17/Python/ceval.c
DW_AT_decl_line 0x000002b1
DW_AT_type <0x00002916>
DW_AT_location len 0x0003: 91c879: DW_OP_fbreg -824
According to the answer below, DW_OP_fbreg is offset from the frame base - in my case DW_OP_call_frame_cfa. I am having trouble identifying the frame base. My registers are as following:
(gdb) info registers
rax 0xfffffffffffffdfe -514
rbx 0x7f6a4887d040 140094460121152
rcx 0x7f6a48e83ff7 140094466441207
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x7ffd24bcef00 0x7ffd24bcef00
rsp 0x7ffd24bceba0 0x7ffd24bceba0
r8 0x7ffd24bcea50 140725219813968
r9 0x0 0
r10 0x0 0
r11 0x246 582
r12 0x7f6a48870df0 140094460071408
r13 0x7f6a48874b58 140094460087128
r14 0x1 1
r15 0x7f6a48873794 140094460082068
rip 0x5559834e99c0 0x5559834e99c0 <PyEval_EvalFrameEx+46153>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
As stated above, I already know that %rbp-808 works. What is the correct way to do it with the registers that I have?
Edit:
I finally understood the answer. I needed to unwind one more function, and find the place my function was called. There, the variable I was looking for really was in $rsp and $rsp-824 was correct
DW_OP_fbreg -824: Meaning $rbp-824
It does not mean that. It means, offset -824 from frame base (virtual) register, which is not necessarily (nor usually) equal to $rbp.
You need to look for DW_AT_frame_base to know what the frame base in the current function is.
Most likely it's defined as DW_OP_call_frame_cfa, which is the value of $RSP just before current function was called, and is equal to $RBP-16 (8 bytes for return address saved by the CALL instruction, and 8 bytes for previous $RBP saved by the first instruction of your function).

GDB: Print the value of memory address

According to https://www.ethicalhacker.net/columns/heffner/intro-to-assembly-and-reverse-engineering
mov 0xffffffb4,0x1
moves the number 1 into 0xffffffb4.
So, I decided to test this on my own.
In GDB, x is the command to print the value of memory address.
However, when I run
x 0x00000000004004fc
I'm not getting the value of 133 (decimal) or 85 (hexadecimal)
Instead, I'm getting 0x85f445c7. Any idea what is this?
me#box:~/c$ gdb -q test
Reading symbols from test...done.
(gdb) l
1 #include <stdio.h>
2
3 int main(){
4 int a = 1;
5 int b = 13;
6 int c = 133;
7 printf("Value of C : %d\n",c);
8 return 0;
9 }
(gdb) b 7
Breakpoint 1 at 0x400503: file test.c, line 7.
(gdb) r
Starting program: /home/me/c/test
Breakpoint 1, main () at test.c:7
7 printf("Value of C : %d\n",c);
(gdb)
Disassemble
(gdb) disas
Dump of assembler code for function main:
0x00000000004004e6 <+0>: push %rbp
0x00000000004004e7 <+1>: mov %rsp,%rbp
0x00000000004004ea <+4>: sub $0x10,%rsp
0x00000000004004ee <+8>: movl $0x1,-0x4(%rbp)
0x00000000004004f5 <+15>: movl $0xd,-0x8(%rbp)
0x00000000004004fc <+22>: movl $0x85,-0xc(%rbp)
=> 0x0000000000400503 <+29>: mov -0xc(%rbp),%eax
0x0000000000400506 <+32>: mov %eax,%esi
0x0000000000400508 <+34>: mov $0x4005a4,%edi
0x000000000040050d <+39>: mov $0x0,%eax
0x0000000000400512 <+44>: callq 0x4003c0 <printf#plt>
0x0000000000400517 <+49>: mov $0x0,%eax
0x000000000040051c <+54>: leaveq
0x000000000040051d <+55>: retq
End of assembler dump.
(gdb) x 0x00000000004004fc
0x4004fc <main+22>: 0x85f445c7
(gdb)
;DRTL
To print a value in GDB use print or (p in short form) command.
in your command
x 0x00000000004004fc
You have missed p command. You have to use x with p command pair to print value as hexadecimal format, like below:
(gdb) p/x 0x00000000004004fc
If the memory address is some pointer to some structure then you have to cast the memory location before using the pointer. For example,
struct node {
int data;
struct node *next
};
is some structure and you have the address of that structure pointer, then to view the contents of that memory location you have to use
(gdb) p *(struct node *) 0x00000000004004fc
Notable:
The command
x 0x00000000004004fc
Will look at the instruction and related data for this instruction:
0x00000000004004fc <+22>: movl $0x85,-0xc(%rbp)
... as you can see that the left column (address) is equal to the value used for the command (the address to read)
In the instruction 0x85 is clearly the destination address for the mov, and reflected in the printed value; 0x85f445c7 - which stored as MSB (most significant byte) at the address.

selecting address to change value in memory

This question/answer on SO shows how to use GDB to change a value in memory, but in the example given, it chooses an address to set the value that wasn't previously being used
For example, to change the return value to 22, the author does
set {unsigned char}0x00000000004004b9 = 22
However, why would this address 0x00000000004004b9 be the address to change? If you look at the output of disas/r the address 0x00000000004004b9 isn't being used, so why use this one to set to 22? I'm trying to understand how to know which address needs to be changed to (in this example) change the return value, if the output of disas/r doesn't show it.
code
$ cat t.c
int main()
{
return 42;
}
$ gcc t.c && ./a.out; echo $?
42
$ gdb --write -q ./a.out
(gdb) disas/r main
Dump of assembler code for function main:
0x00000000004004b4 <+0>: 55 push %rbp
0x00000000004004b5 <+1>: 48 89 e5 mov %rsp,%rbp
0x00000000004004b8 <+4>: b8 2a 00 00 00 mov $0x2a,%eax
0x00000000004004bd <+9>: 5d pop %rbp
0x00000000004004be <+10>: c3 retq
End of assembler dump.
(gdb) set {unsigned char}0x00000000004004b9 = 22
(gdb) disas/r main
Dump of assembler code for function main:
0x00000000004004b4 <+0>: 55 push %rbp
0x00000000004004b5 <+1>: 48 89 e5 mov %rsp,%rbp
0x00000000004004b8 <+4>: b8 16 00 00 00 mov $0x16,%eax <<< ---changed
0x00000000004004bd <+9>: 5d pop %rbp
0x00000000004004be <+10>: c3 retq
End of assembler dump.
(gdb) q
$ ./a.out; echo $?
22 <<<--- Just as desired
I'm trying to understand how to know which address needs to be changed to (in this example) change the return value, if the output of disas/r doesn't show it.
To understand this, you need to understand instruction encoding. The instruction here is "move immediate 32-bit constant to register". The constant is part of the instruction (that's what "immediate" means). It may be helpful to compile this instead:
int foo() { return 0x41424344; }
int bar() { return 0x45464748; }
int main() { return foo() + bar(); }
When you do compile it, you should see something similar to:
(gdb) disas/r foo
Dump of assembler code for function foo:
0x00000000004004ed <+0>: 55 push %rbp
0x00000000004004ee <+1>: 48 89 e5 mov %rsp,%rbp
0x00000000004004f1 <+4>: b8 44 43 42 41 mov $0x41424344,%eax
0x00000000004004f6 <+9>: 5d pop %rbp
0x00000000004004f7 <+10>: c3 retq
End of assembler dump.
(gdb) disas/r bar
Dump of assembler code for function bar:
0x00000000004004f8 <+0>: 55 push %rbp
0x00000000004004f9 <+1>: 48 89 e5 mov %rsp,%rbp
0x00000000004004fc <+4>: b8 48 47 46 45 mov $0x45464748,%eax
0x0000000000400501 <+9>: 5d pop %rbp
0x0000000000400502 <+10>: c3 retq
End of assembler dump.
Now you can clearly see where in the instruction stream each byte of the immediate constant resides (and also that x86 uses little-endian encoding for them).
The standard reference on instruction encoding for x86 is Intel instruction set reference. You can find 0xB8 instruction on page 3-528.

Can I disas a specific statement in gdb?

(gdb) n
253 conf.log = log;
Like above,the next statement is conf.log = log;,how can I just disas that?
I tried simply disas,but gdb will disassembly all the current function(I don't need so much)...
(gdb) disas
Dump of assembler code for function ngx_init_cycle:
0x0000000000417c7c <ngx_init_cycle+0>: push %rbp
0x0000000000417c7d <ngx_init_cycle+1>: mov %rsp,%rbp
0x0000000000417c80 <ngx_init_cycle+4>: push %rbx
0x0000000000417c81 <ngx_init_cycle+5>: sub $0x258,%rsp
0x0000000000417c88 <ngx_init_cycle+12>: mov %rdi,-0x228(%rbp)
0x0000000000417c8f <ngx_init_cycle+19>: callq 0x42b2fc <ngx_timezone_update>
0x0000000000417c94 <ngx_init_cycle+24>: mov 0x2b00e5(%rip),%rax # 0x6c7d80 <ngx_cached_time>
0x0000000000417c9b <ngx_init_cycle+31>: mov %rax,-0x88(%rbp)
0x0000000000417ca2 <ngx_init_cycle+38>: mov -0x88(%rbp),%rax
0x0000000000417ca9 <ngx_init_cycle+45>: movq $0x0,(%rax)
0x0000000000417cb0 <ngx_init_cycle+52>: callq 0x4149e7 <ngx_time_update>
0x0000000000417cb5 <ngx_init_cycle+57>: mov -0x228(%rbp),%rax
0x0000000000417cbc <ngx_init_cycle+64>: mov 0x10(%rax),%rax
0x0000000000417cc0 <ngx_init_cycle+68>: mov %rax,-0x90(%rbp)
0x0000000000417cc7 <ngx_init_cycle+75>: mov -0x90(%rbp),%rsi
0x0000000000417cce <ngx_init_cycle+82>: mov $0x4000,%edi
0x0000000000417cd3 <ngx_init_cycle+87>: callq 0x405c6c <ngx_create_pool>
0x0000000000417cd8 <ngx_init_cycle+92>: mov %rax,-0x80(%rbp)
0x0000000000417cdc <ngx_init_cycle+96>: cmpq $0x0,-0x80(%rbp)
---Type <return> to continue, or q <return> to quit---q
UPDATE
(gdb) info line 98
Line 98 of "src/os/unix/ngx_process_cycle.c" starts at address 0x42f6f3 <ngx_master_process_cycle+31>
and ends at 0x42f704 <ngx_master_process_cycle+48>.
(gdb) disas 0x42f6f3,0x42f704
Dump of assembler code for function ngx_master_process_cycle:
0x000000000042f6d4 <ngx_master_process_cycle+0>: push %rbp
0x000000000042f6d5 <ngx_master_process_cycle+1>: mov %rsp,%rbp
0x000000000042f6d8 <ngx_master_process_cycle+4>: push %rbx
0x000000000042f6d9 <ngx_master_process_cycle+5>: sub $0x128,%rsp
0x000000000042f6e0 <ngx_master_process_cycle+12>: mov %rdi,-0x108(%rbp)
0x000000000042f6e7 <ngx_master_process_cycle+19>: lea -0xe0(%rbp),%rdi
0x000000000042f6ee <ngx_master_process_cycle+26>: callq 0x402988 <sigemptyset#plt>
0x000000000042f6f3 <ngx_master_process_cycle+31>: lea -0xe0(%rbp),%rdi
0x000000000042f6fa <ngx_master_process_cycle+38>: mov $0x11,%esi
0x000000000042f6ff <ngx_master_process_cycle+43>: callq 0x402878 <sigaddset#plt>
0x000000000042f704 <ngx_master_process_cycle+48>: lea -0xe0(%rbp),%rdi
0x000000000042f70b <ngx_master_process_cycle+55>: mov $0xe,%esi
0x000000000042f710 <ngx_master_process_cycle+60>: callq 0x402878 <sigaddset#plt>
0x000000000042f715 <ngx_master_process_cycle+65>: lea -0xe0(%rbp),%rdi
0x000000000042f71c <ngx_master_process_cycle+72>: mov $0x1d,%esi
0x000000000042f721 <ngx_master_process_cycle+77>: callq 0x402878 <sigaddset#plt>
0x000000000042f726 <ngx_master_process_cycle+82>: lea -0xe0(%rbp),%rdi
0x000000000042f72d <ngx_master_process_cycle+89>: mov $0x2,%esi
0x000000000042f732 <ngx_master_process_cycle+94>: callq 0x402878 <sigaddset#plt>
---Type <return> to continue, or q <return> to quit---
try something to the effect of:
(gdb) info line 12
Line 12 of "test.c" starts at address 0x4004f4 <main+24>
and ends at 0x4004fe <main+34>.
(gdb) disas 0x4004f4,0x4004fe
Dump of assembler code from 0x4004f4 to 0x4004fe:
0x00000000004004f4 <main+24>: mov $0x0,%eax
0x00000000004004f9 <main+29>: callq 0x4004d0 <bp3>
End of assembler dump.
Or:
(gdb) disas main+24,main+34
Dump of assembler code from 0x4004f4 to 0x4004fe:
0x00000000004004f4 <main+24>: mov $0x0,%eax
0x00000000004004f9 <main+29>: callq 0x4004d0 <bp3>
End of assembler dump.
not sure of a more automatic way offhand.

What does <value optimized out> mean in gdb?

(gdb) n
134 a = b = c = 0xdeadbeef + ((uint32_t)length) + initval;
(gdb) n
(gdb) p a
$30 = <value optimized out>
(gdb) p b
$31 = <value optimized out>
(gdb) p c
$32 = 3735928563
How can gdb optimize out my value??
It means you compiled with e.g. gcc -O3 and the gcc optimiser found that some of your variables were redundant in some way that allowed them to be optimised away. In this particular case you appear to have three variables a, b, c with the same value and presumably they can all be aliassed to a single variable. Compile with optimisation disabled, e.g. gcc -O0, if you want to see such variables (this is generally a good idea for debug builds in any case).
Minimal runnable example with disassembly analysis
As usual, I like to see some disassembly to get a better understanding of what is going on.
In this case, the insight we obtain is that if a variable is optimized to be stored only in a register rather than the stack, and then the register it was in gets overwritten, then it shows as <optimized out> as mentioned by R..
Of course, this can only happen if the variable in question is not needed anymore, otherwise the program would lose its value. Therefore it tends to happen that at the start of the function you can see the variable value, but then at the end it becomes <optimized out>.
One typical case which we often are interested in of this is that of the function arguments themselves, since these are:
always defined at the start of the function
may not get used towards the end of the function as more intermediate values are calculated.
tend to get overwritten by further function subcalls which must setup the exact same registers to satisfy the calling convention
This understanding actually has a concrete application: when using reverse debugging, you might be able to recover the value of variables of interest simply by stepping back to their last point of usage: How do I view the value of an <optimized out> variable in C++?
main.c
#include <stdio.h>
int __attribute__((noinline)) f3(int i) {
return i + 1;
}
int __attribute__((noinline)) f2(int i) {
return f3(i) + 1;
}
int __attribute__((noinline)) f1(int i) {
int j = 1, k = 2, l = 3;
i += 1;
j += f2(i);
k += f2(j);
l += f2(k);
return l;
}
int main(int argc, char *argv[]) {
printf("%d\n", f1(argc));
return 0;
}
Compile and run:
gcc -ggdb3 -O3 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
gdb -q -nh main.out
Then inside GDB, we have the following session:
Breakpoint 1, f1 (i=1) at main.c:13
13 i += 1;
(gdb) disas
Dump of assembler code for function f1:
=> 0x00005555555546c0 <+0>: add $0x1,%edi
0x00005555555546c3 <+3>: callq 0x5555555546b0 <f2>
0x00005555555546c8 <+8>: lea 0x1(%rax),%edi
0x00005555555546cb <+11>: callq 0x5555555546b0 <f2>
0x00005555555546d0 <+16>: lea 0x2(%rax),%edi
0x00005555555546d3 <+19>: callq 0x5555555546b0 <f2>
0x00005555555546d8 <+24>: add $0x3,%eax
0x00005555555546db <+27>: retq
End of assembler dump.
(gdb) p i
$1 = 1
(gdb) p j
$2 = 1
(gdb) n
14 j += f2(i);
(gdb) disas
Dump of assembler code for function f1:
0x00005555555546c0 <+0>: add $0x1,%edi
=> 0x00005555555546c3 <+3>: callq 0x5555555546b0 <f2>
0x00005555555546c8 <+8>: lea 0x1(%rax),%edi
0x00005555555546cb <+11>: callq 0x5555555546b0 <f2>
0x00005555555546d0 <+16>: lea 0x2(%rax),%edi
0x00005555555546d3 <+19>: callq 0x5555555546b0 <f2>
0x00005555555546d8 <+24>: add $0x3,%eax
0x00005555555546db <+27>: retq
End of assembler dump.
(gdb) p i
$3 = 2
(gdb) p j
$4 = 1
(gdb) n
15 k += f2(j);
(gdb) disas
Dump of assembler code for function f1:
0x00005555555546c0 <+0>: add $0x1,%edi
0x00005555555546c3 <+3>: callq 0x5555555546b0 <f2>
0x00005555555546c8 <+8>: lea 0x1(%rax),%edi
=> 0x00005555555546cb <+11>: callq 0x5555555546b0 <f2>
0x00005555555546d0 <+16>: lea 0x2(%rax),%edi
0x00005555555546d3 <+19>: callq 0x5555555546b0 <f2>
0x00005555555546d8 <+24>: add $0x3,%eax
0x00005555555546db <+27>: retq
End of assembler dump.
(gdb) p i
$5 = <optimized out>
(gdb) p j
$6 = 5
(gdb) n
16 l += f2(k);
(gdb) disas
Dump of assembler code for function f1:
0x00005555555546c0 <+0>: add $0x1,%edi
0x00005555555546c3 <+3>: callq 0x5555555546b0 <f2>
0x00005555555546c8 <+8>: lea 0x1(%rax),%edi
0x00005555555546cb <+11>: callq 0x5555555546b0 <f2>
0x00005555555546d0 <+16>: lea 0x2(%rax),%edi
=> 0x00005555555546d3 <+19>: callq 0x5555555546b0 <f2>
0x00005555555546d8 <+24>: add $0x3,%eax
0x00005555555546db <+27>: retq
End of assembler dump.
(gdb) p i
$7 = <optimized out>
(gdb) p j
$8 = <optimized out>
To understand what is going on, remember from the x86 Linux calling convention: What are the calling conventions for UNIX & Linux system calls on i386 and x86-64 you should know that:
RDI contains the first argument
RDI can get destroyed in function calls
RAX contains the return value
From this we deduce that:
add $0x1,%edi
corresponds to the:
i += 1;
since i is the first argument of f1, and therefore stored in RDI.
Now, while we were at both:
i += 1;
j += f2(i);
the value of RDI hadn't been modified, and therefore GDB could just query it at anytime in those lines.
However, as soon as the f2 call is made:
the value of i is not needed anymore in the program
lea 0x1(%rax),%edi does EDI = j + RAX + 1, which both:
initializes j = 1
sets up the first argument of the next f2 call to RDI = j
Therefore, when the following line is reached:
k += f2(j);
both of the following instructions have/may have modified RDI, which is the only place i was being stored (f2 may use it as a scratch register, and lea definitely set it to RAX + 1):
0x00005555555546c3 <+3>: callq 0x5555555546b0 <f2>
0x00005555555546c8 <+8>: lea 0x1(%rax),%edi
and so RDI does not contain the value of i anymore. In fact, the value of i was completely lost! Therefore the only possible outcome is:
$3 = <optimized out>
A similar thing happens to the value of j, although j only becomes unnecessary one line later afer the call to k += f2(j);.
Thinking about j also gives us some insight on how smart GDB is. Notably, at i += 1;, the value of j had not yet materialized in any register or memory address, and GDB must have known it based solely on debug information metadata.
-O0 analysis
If we use -O0 instead of -O3 for compilation:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
then the disassembly would look like:
11 int __attribute__((noinline)) f1(int i) {
=> 0x0000555555554673 <+0>: 55 push %rbp
0x0000555555554674 <+1>: 48 89 e5 mov %rsp,%rbp
0x0000555555554677 <+4>: 48 83 ec 18 sub $0x18,%rsp
0x000055555555467b <+8>: 89 7d ec mov %edi,-0x14(%rbp)
12 int j = 1, k = 2, l = 3;
0x000055555555467e <+11>: c7 45 f4 01 00 00 00 movl $0x1,-0xc(%rbp)
0x0000555555554685 <+18>: c7 45 f8 02 00 00 00 movl $0x2,-0x8(%rbp)
0x000055555555468c <+25>: c7 45 fc 03 00 00 00 movl $0x3,-0x4(%rbp)
13 i += 1;
0x0000555555554693 <+32>: 83 45 ec 01 addl $0x1,-0x14(%rbp)
14 j += f2(i);
0x0000555555554697 <+36>: 8b 45 ec mov -0x14(%rbp),%eax
0x000055555555469a <+39>: 89 c7 mov %eax,%edi
0x000055555555469c <+41>: e8 b8 ff ff ff callq 0x555555554659 <f2>
0x00005555555546a1 <+46>: 01 45 f4 add %eax,-0xc(%rbp)
15 k += f2(j);
0x00005555555546a4 <+49>: 8b 45 f4 mov -0xc(%rbp),%eax
0x00005555555546a7 <+52>: 89 c7 mov %eax,%edi
0x00005555555546a9 <+54>: e8 ab ff ff ff callq 0x555555554659 <f2>
0x00005555555546ae <+59>: 01 45 f8 add %eax,-0x8(%rbp)
16 l += f2(k);
0x00005555555546b1 <+62>: 8b 45 f8 mov -0x8(%rbp),%eax
0x00005555555546b4 <+65>: 89 c7 mov %eax,%edi
0x00005555555546b6 <+67>: e8 9e ff ff ff callq 0x555555554659 <f2>
0x00005555555546bb <+72>: 01 45 fc add %eax,-0x4(%rbp)
17 return l;
0x00005555555546be <+75>: 8b 45 fc mov -0x4(%rbp),%eax
18 }
0x00005555555546c1 <+78>: c9 leaveq
0x00005555555546c2 <+79>: c3 retq
From this horrendous disassembly, we see that the value of RDI is moved to the stack at the very start of program execution at:
mov %edi,-0x14(%rbp)
and it then gets retrieved from memory into registers whenever needed, e.g. at:
14 j += f2(i);
0x0000555555554697 <+36>: 8b 45 ec mov -0x14(%rbp),%eax
0x000055555555469a <+39>: 89 c7 mov %eax,%edi
0x000055555555469c <+41>: e8 b8 ff ff ff callq 0x555555554659 <f2>
0x00005555555546a1 <+46>: 01 45 f4 add %eax,-0xc(%rbp)
The same basically happens to j which gets immediately pushed to the stack when when it is initialized:
0x000055555555467e <+11>: c7 45 f4 01 00 00 00 movl $0x1,-0xc(%rbp)
Therefore, it is easy for GDB to find the values of those variables at any time: they are always present in memory!
This also gives us some insight on why it is not possible to avoid <optimized out> in optimized code: since the number of registers is limited, the only way to do that would be to actually push unneeded registers to memory, which would partly defeat the benefit of -O3.
Extend the lifetime of i
If we edited f1 to return l + i as in:
int __attribute__((noinline)) f1(int i) {
int j = 1, k = 2, l = 3;
i += 1;
j += f2(i);
k += f2(j);
l += f2(k);
return l + i;
}
then we observe that this effectively extends the visibility of i until the end of the function.
This is because with this we force GCC to use an extra variable to keep i around until the end:
0x00005555555546c0 <+0>: lea 0x1(%rdi),%edx
0x00005555555546c3 <+3>: mov %edx,%edi
0x00005555555546c5 <+5>: callq 0x5555555546b0 <f2>
0x00005555555546ca <+10>: lea 0x1(%rax),%edi
0x00005555555546cd <+13>: callq 0x5555555546b0 <f2>
0x00005555555546d2 <+18>: lea 0x2(%rax),%edi
0x00005555555546d5 <+21>: callq 0x5555555546b0 <f2>
0x00005555555546da <+26>: lea 0x3(%rdx,%rax,1),%eax
0x00005555555546de <+30>: retq
which the compiler does by storing i += i in RDX at the very first instruction.
Tested in Ubuntu 18.04, GCC 7.4.0, GDB 8.1.0.
It didn't. Your compiler did, but there's still a debug symbol for the original variable name.
From https://idlebox.net/2010/apidocs/gdb-7.0.zip/gdb_9.html
The values of arguments that were not saved in their stack frames are shown as `value optimized out'.
I'm guessing you compiled with -O(somevalue) and are accessing variables a,b,c in a function where optimization has occurred.
You need to turn off the compiler optimisation.
If you are interested in a particular variable in gdb, you can delare the variable as "volatile" and recompile the code. This will make the compiler turn off compiler optimization for that variable.
volatile int quantity = 0;
Just run "export COPTS='-g -O0';" and rebuild your code. After rebuild, debug it using gdb. You'll not see such error. Thanks.