GDB - Reading 1 words from the stack

GDB - Reading 1 words from the stack - gdb

I want to print 1 words from the top of stack in the form of hexadecimal. To do so, I typed the following:
(gdb) x/1xw $esp
but GDB keeps popping up:
0xffffffffffffe030: Cannot access memory at address 0xffffffffffffe030
The program I'm trying to debug has already pushed a value onto stack so just in case if you're wondering that I might be trying to access kernel variables at the very beginning of program, it's not so.
Any idea?

0xffffffffffffe030 is a 64-bit constant, so you are running in x64-bit mode. But $esp is a 32-bit register (which GDB sign-extends to 64 bits in this context). The 64-bit stack pointer is called $rsp. Try this instead:
(gdb) x/1xw $rsp

Related

Do memory addresses contain implicit hex digits?

What is the value of a memory address that is less than 12 hex digits on a 64-bit computer?
For instance, when I run gdb on a simple assembly program and run (gdb) info frame I get:
Stack level 0, frame at 0x7fffffffd970:
rip = 0x40052f in main (file.s:11); saved rip = 0x7ffff7a2d830
source language asm.
Arglist at 0x7fffffffd960, args:
Locals at 0x7fffffffd960, Previous frame's sp is 0x7fffffffd970
Saved registers:
rbp at 0x7fffffffd960, rip at 0x7fffffffd968
The first part of the second line rip = 0x40052f in main (file.s:11) I believe states the value of the instruction pointer when I called info frame. But why is the memory address it holds not 12 hex digits?
Also, if I type (gdb) x 0x7fffffffd968 (which I expect to be 0x7ffff7a2d830) I get:
0x7fffffffd968: 0xf7a2d830
Does this mean that any memory address with less than 12 hex digits contains an implicit 7ff...?

No. On x86 or x86_64, a memory address is simply a number, but is commonly displayed using hexadecimal. And like most number notation systems, a shorter number just means a much smaller value, or if you like, there are implicit zeros before it.
So just like the decimal string "12" is much smaller than "12654321", the address 0x40052f is much smaller than the address 0x7ffff7a2d830. The two addresses are almost certainly in different virtual memory maps. (On Linux, you can view virtual memory maps by cat /proc/{pid}/maps.)
When you used the gdb x command, you didn't see the value you expected because gdb took a guess at what kind of data your address points at. The first time you use x in a gdb session, it defaults to showing 4 bytes (32 bits) per element, as though the address points at an array of uint32_t. Since addresses on x86_64 are 8 bytes (64 bits), you need x/g to tell gdb the element size is 8 bytes.

How does GDB restore instruction after breakpoint

I have read that GDB puts a int 3 (opcode CC) at the wanted adress in the target program memory.
Si this operation is erasing a piece of instruction (1 byte) in the program memory.
My question is: How and When GDB replaces the original opcode when the program continues ?
When i type disassemble in GDB, i do not see CC opcode. Is this because GDB knows it is him that puts the CC ?
Is there a way to do a raw disassemble, in order to see exactly what opcodes are loaded in memory at this instant ?

How and When GDB replaces the original opcode when the program continues ?
I use this as an interview question ;-)
On Linux, to continue past the breakpoint, 0xCC is replaced with the original instruction, and ptrace(PTRACE_SINGLESTEP, ...) is done. When the thread stops on the next instruction, original code is again replaced by the 0xCC opcode (to restore the breakpoint), and the thread continued on its merry way.
On x86 platforms that do not have PTRACE_SINGLESTEP, trap flag is set in EFLAGS via ptrace(PTRACE_SETREGS, ...) and the thread is continued. The trap flag causes the thread to immediately stop again (on the next instruction, just like PTRACE_SINGLESTEP would).
When i type disassemble in GDB, i do not see CC opcode. Is this because GDB knows it is him that puts the CC ?
Correct. A program could examine and print its own instruction stream, and you can observe breakpoint 0xCC opcodes that way.
Is there a way to do a raw disassemble, in order to see exactly what opcodes are loaded in memory at this instant ?
I don't believe there is. You can use (gdb) set debug infrun to observe what GDB is doing to the inferior (being debugged) process.
What i do not understand in fact is the exact role of SIGTRAP. Who is sending/receiving this signal ? The debugger or the target program?
Neither: after PTRACE_ATTACH, the kernel stops the inferior, then notifies the debugger that it has done so, by making debugger's wait return.
I see a wait(NULL) after the ptrace attach. What is the meaning of this wait?
See the explanation above.
Thread specific breakpoints?
For a thread-specific breakpoint, the debugger inserts a process-wide breakpoint (via 0xCC opcode), then simply immediately resumes any thread which hits the breakpoint, unless the thread is the specific one that you wanted to stop.

What does <exp+6> mean in gdb?

I was debugging a C++ code using gdb. The program stopped due to a segmentation fault.
Program received signal SIGSEGV, Segmentation fault.
So I was trying to print out the value of variables to identify where the error is coming from. I have an array called 'ring' of type 'Link **', where Link is a class I defined. Each element in the array points to a 'Link *' variable. Here is the output when I print the first three elements of the 'ring' array.
(gdb) print ring[0]
$13 = (Link *) 0x8125290
(gdb) print ring[1]
$14 = (Link *) 0xb7e80b86 <exp+6>
(gdb) print ring[2]
$15 = (Link *) 0x8132e20
Why am I getting '' after the memory address when printing 'ring[1]'? What does it mean?
EDIT: Im using gdb 7.8 on Arch Linux (3.16.4-1-ARCH)

It means the pointer value is equal to the address of the exp symbol plus 6. It's just the debugger trying to be helpful—whenever it decodes any pointer value, it tries to see if the pointer happens to lie near any known symbols in the object code, and it prints out that information if so.
You might expect to see such notation when examining the disassembly of a function's code, e.g. in branch targets, but as a data pointer, that's very unusual (function pointers would tend to point directly at function symbols, not offset into them).
You almost certainly have some kind of memory corruption bug that just happens to produce that value as a side effect.

How to use a logical address with an FS or GS base in gdb?

gdb provides functionality to read or write to a specific linear address, for example:
(gdb) x/1wx 0x080483e4
0x80483e4 <main>: 0x83e58955
(gdb)
but how do you specify a logical address ? I came accross the following instruction:
0x0804841a <+6>: mov %gs:0x14,%eax
how can i read the memory at "%gs:0x14" in gdb, or translate this logical address to a linear address that i could use in x command ?
note: i know that i could simply read %eax after this instruction, but that is not my concern

how can i read the memory at "%gs:0x14" in gdb
You can't: there is no way for GDB to know how the segment to which %gs refers to has been set up.
or translate this logical address to a linear address that i could use in x command
Again, you can't do this in general. However, you appear to be on 32-bit x86 Linux, and there you can do that -- the %gs is set up to point to the thread descriptor via set_thread_area system call.
You can do catch syscall set_thread_area in GDB, and examine the parameters (each thread will have one such call). The code to actually do that is here. Once you know how %gs has been set up, just add 0x14 to the base_addr, and you are done.

As answered in Using GDB to read MSRs, this is possible with gdb 8, using the registers $fs_base and $gs_base.

I think the easiest way to do this is to read the content of EAX register as you can see the value of %gs:0x14 is moved to EAX.
In GDB, set a breakpoint at the address right after 0x0804841a with break. For example
break *0x0804841e
Then run the program and you can print the contents of EAX register with
info registers eax

stack traces stop at the leaf register (lr)

Often I see ARM stack traces (read: Android NDK stack traces) that terminate with an lr pointer, like so:
#00 pc 001c6c20 /data/data/com.audia.dev.qt/lib/libQtGui.so
#01 lr 80a356cc /data/data/com.audia.dev.rta/lib/librta.so
I know that lr stands for link register on ARM and other architectures, and that it's a quick way to store a return address, but I don't understand why it always seems to store a useless address. In this example, 80a356cc cannot be mapped to any code using addr2line or gdb.
Is there any way to get more information? Why must the trace stop at the lr address anyway?

Stumbled on the answer finally. I just had to be more observant. Look at the following short stack trace and the information that comes after it:
#00 pc 000099d6 /system/lib/libandroid.so
#01 lr 80b6c17c /data/data/com.audia.dev.rta/lib/librta.so
code around pc:
a9d899b4 bf00bd0e 2102b507 aa016d43 28004798
a9d899c4 9801bfa8 bf00bd0e 460eb573 93004615
a9d899d4 6d842105 462b4632 200047a0 bf00bd7c
a9d899e4 b100b510 f7fe3808 2800edf4 f04fbf14
a9d899f4 200030ff bf00bd10 b097b5f0 4614af01
code around lr:
80b6c15c e51b3078 e5933038 e5932024 e51b302c
80b6c16c e1a00002 e3a01000 e3a02000 ebfeee5c
80b6c17c e1a03000 e50b303c e51b303c e1a03fa3
80b6c18c e6ef3073 e3530000 0a000005 e59f34fc
80b6c19c e08f3003 e1a00003 e51b103c ebfeebe6
Now the lr address is still a 80xxxxxx address that isn't useful to us.
The address it prints from the pc is 000099d6, but look at the next section, code around pc. The first column is a list of addresses (you can tell from the fact that it increments by 16 each time.) None of those addresses looks like the pc address, unless you chop off the first 16 bits. Then you'll notice that the a9d899d4 must correspond to 000099d4, and the code where the program stopped is two bytes in from that.
Android's stack trace seems to have "chopped off" the first 2 bytes of the pc address for me, but for whatever reason it does not do it for addresses in the leaf register. Which brings us to the solution:
In short, I was able to chop off the first 16 bits from the 80b6c17c address to make it 0000c17c, and so far that has given me a valid code address every time that I can look up with gdb or addr2line. (edit: I've found it's actually usually the first 12 bits or first 3 hexadecimal digits. You can decide this for yourself by looking at the stack trace output like I described above.) I can confirm that it is the right code address as well. This has definitely made debugging a great bit easier!

Do you have all debugging info (-g3) on?
Gcc likes to use the lr as a normal register. Remember that a non-leaf function looks like
push {lr}
; .. setup args here etc.
bl foo ; call a function foo
; .. work with function results
pop {pc}
Once it pushed lr to the stack, the compiler can use it almost freely - lr will be overwritten only by function calls. So its quite likely that there is any intermediate value in lr.
This should be stated in the debugging information that the compiler generates, in order to let the debugger know it has to look at the stack value instead of lr.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js