How to use a logical address with an FS or GS base in gdb?

gdb provides functionality to read or write to a specific linear address, for example:
(gdb) x/1wx 0x080483e4
0x80483e4 <main>: 0x83e58955
(gdb)
but how do you specify a logical address? I came across the following instruction:
0x0804841a <+6>: mov %gs:0x14,%eax
How can I read the memory at "%gs:0x14" in gdb, or translate this logical address to a linear address that I could use in the x command?
Note: I know that I could simply read %eax after this instruction, but that is not my concern.

how can i read the memory at "%gs:0x14" in gdb
You can't: GDB has no way to know how the segment that %gs refers to has been set up.
or translate this logical address to a linear address that i could use in x command
Again, you can't do this in general. However, you appear to be on 32-bit x86 Linux, and there you can: %gs is set up to point to the thread descriptor via the set_thread_area system call.
You can run catch syscall set_thread_area in GDB and examine the parameters (each thread will have one such call). The code to actually do that is here. Once you know how %gs has been set up, just add 0x14 to the base_addr, and you are done.
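The final step (base_addr plus the offset) can be sketched in Python. This is only an illustration: the descriptor bytes are made up, the function name is hypothetical, and the layout assumed is the start of the 32-bit Linux struct user_desc (entry_number, base_addr, limit as unsigned 32-bit little-endian fields), which is what set_thread_area receives a pointer to.

```python
import struct

def gs_linear_address(user_desc_bytes, offset):
    """Return the linear address for %gs:offset given a raw struct user_desc.

    Assumes the 32-bit Linux layout: entry_number, base_addr, limit
    (each an unsigned 32-bit little-endian integer), followed by flag bits.
    """
    entry_number, base_addr, limit = struct.unpack_from("<III", user_desc_bytes, 0)
    # Linear address = segment base + offset, wrapped to 32 bits.
    return (base_addr + offset) & 0xFFFFFFFF

# Hypothetical descriptor contents, as they might be dumped from the inferior.
example_desc = struct.pack("<IIII", 6, 0xB7E156C0, 0xFFFFF, 0x51)
print(hex(gs_linear_address(example_desc, 0x14)))
```

The printed value is the address you would then hand to the x command, for example x/1wx followed by that address.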

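As answered in Using GDB to read MSRs, this is possible with gdb 8 or later, using the registers $fs_base and $gs_base, on targets where GDB exposes them (x86-64 Linux, for instance). For example, x/1wx $gs_base + 0x14 reads the word at %gs:0x14.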

I think the easiest way is to read the contents of the EAX register, since the instruction moves the value at %gs:0x14 into EAX.
In GDB, set a breakpoint with break at the address of the instruction right after 0x0804841a (check the disassemble output for the exact address). For example:
break *0x0804841e
Then run the program and print the contents of the EAX register with
info registers eax

Related

gdb only shows xmm registers

I have written a subroutine using the AVX2 instruction set (ymm registers), and now I want to debug it. My machine supports this instruction set, and the program can be executed without problems (no SIGILL exception).
But when I type 'tui reg vector' or 'info all-registers' in gdb, it only shows the xmm registers. The print command does not work either:
(gdb) p $ymm0
$1 = void
(gdb) p/x $ymm0
$2 = Value can't be converted to integer.
I use the most current version, gdb 8, so I think it should know the AVX2 instruction set. How can I persuade the debugger to show the whole ymm registers? Are there config files I can edit to tell gdb which instruction set it should use?
Kind regards

How does gdb read the register values of a program / process it's debugging? How are registers associated with a process?

I wrote a short program in c++ :
#include <iostream>
using namespace std;
int main() {
    int x = 10;
    int y = 20;
    cout << x + y << endl;
    return 0;
}
Just out of curiosity, I wanted to understand what a program does under the hood, so I was playing with gdb and came across the info registers command. When I use info registers in gdb I get output like this:
(gdb) info registers
rax 0x400756 4196182
rbx 0x0 0
rcx 0x6 6
rdx 0x7fffffffd418 140737488344088
rsi 0x7fffffffd408 140737488344072
rdi 0x1 1
rbp 0x7fffffffd320 0x7fffffffd320
rsp 0x7fffffffd320 0x7fffffffd320
r8 0x7ffff7ac1e80 140737348640384
r9 0x7ffff7dcfea0 140737351843488
r10 0x7fffffffd080 140737488343168
r11 0x7ffff773a410 140737344939024
r12 0x400660 4195936
r13 0x7fffffffd400 140737488344064
r14 0x0 0
r15 0x0 0
rip 0x40075a 0x40075a <main+4>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
I understand these are registers and their values, but what I want to know is how and why registers are associated with a process. Shouldn't the register values be changing continuously as different processes are scheduled by the operating system? I looked up the info registers command and this is what I found, but it is still confusing:
info registers -> Prints the names and values of all registers except
floating-point and vector registers (in the selected stack frame).

Registers change all the time. In fact, even the debugger changes register values, as it has to run itself.
However, while you look at your program with a debugger, the debugger suspends your running process. As part of suspending, the CPU state is saved to RAM. The debugger understands this, and can just look at the suspended state in RAM. Say that register R1 was saved to address 0x1234 on suspending, then the debugger can just print the bytes stored at that address.

Each thread/process has its own register values. The user-space "architectural state" (register values) is saved on entering the kernel via a system call or interrupt. (This is true on all OSes).
See What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? for a look at Linux's system-call entry points, with the hand-written asm that actually saves registers on the process's kernel stack. (Each thread has its own kernel stack, in Linux).
In multi-tasking OSes in general, every process/thread has its own memory space for saving state, so context switches work by restoring the saved state from the thread being switched to. This is a bit of a simplification, because there's kernel state vs. saved user-space state. [1]
So any time a process isn't actually running on a CPU core, its register values are saved in memory.
The OS provides an API for reading/writing the saved register state, and memory, of other processes.
In Linux, this API is the ptrace(2) system call; it's what GDB uses to read register values and to single-step. Thus, GDB reads saved register values of the target process from memory, indirectly via the kernel. GDB's own code doesn't use any special x86 instructions, or even load / store from any special addresses; it just makes system calls because access to another process's state has to go through the kernel. (Well I think a process could map another process's memory into its own address space, if Linux even has a system call for that, but I think memory reads/writes actually go through ptrace just like register accesses.)
(I think) If the target process was currently executing (instead of suspended) when another process made a ptrace system call that read or wrote one of its register values, the kernel would have to interrupt it so its current state would be saved to memory. This doesn't normally happen with GDB: it only tries to read register values when it's suspended the target process.
ptrace is also what strace uses to trace system calls. See Playing with ptrace, Part I from Linux Journal. strace ./my_program is fantastically useful for systems programming, especially when making system calls from hand-written asm, to decode the args you're actually passing, and the return values.
Footnotes:
[1] In Linux, the actual switch to a new thread happens inside the kernel, from kernel context to kernel context. This saves "only" the integer registers on the kernel stack, sets rsp to the right place in the other thread's kernel stack, then restores the saved registers. So there's a function call that, when it returns, is executing in kernel mode for the new thread, with per-CPU kernel variables set appropriately. User-space state for the new thread is eventually restored the same way it would have been if the system call or interrupt that originally entered the kernel from user-space had returned without calling the scheduler, i.e. from the state saved by the system call or interrupt kernel entry point. Lazy / eager FPU state saving is another complication; the kernel generally avoids touching the FPU so it can avoid saving/restoring FPU state when just entering the kernel and returning back to the same user-space process.

GDB - Reading 1 words from the stack

I want to print one word from the top of the stack in hexadecimal. To do so, I typed the following:
(gdb) x/1xw $esp
but GDB keeps popping up:
0xffffffffffffe030: Cannot access memory at address 0xffffffffffffe030
The program I'm trying to debug has already pushed a value onto the stack, so in case you're wondering whether I'm trying to access kernel variables at the very beginning of the program, that's not it.
Any idea?

0xffffffffffffe030 is a 64-bit constant, so you are running in 64-bit mode. But $esp is a 32-bit register (which GDB sign-extends to 64 bits in this context). The 64-bit stack pointer is called $rsp. Try this instead:
(gdb) x/1xw $rsp
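The sign extension described above can be reproduced with a few lines of Python. This is a sketch for illustration only; the function name is made up, and the 32-bit input is inferred from the low half of the address in the error message.

```python
def sign_extend_32_to_64(value):
    """Treat `value` as a signed 32-bit integer and widen it to a 64-bit pattern."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:      # top bit set: the value is negative as int32
        value -= 1 << 32
    return value & 0xFFFFFFFFFFFFFFFF

# Low 32 bits of the stack pointer, as seen through $esp.
print(hex(sign_extend_32_to_64(0xFFFFE030)))
```

Because bit 31 of the low half is set, the upper 32 bits are filled with ones, which yields the unmapped address from the error message.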

How does GDB restore instruction after breakpoint

I have read that GDB puts an int 3 (opcode CC) at the wanted address in the target program's memory.
So this operation overwrites a piece of an instruction (1 byte) in the program memory.
My question is: how and when does GDB restore the original opcode when the program continues?
When I type disassemble in GDB, I do not see the CC opcode. Is this because GDB knows that it put the CC there itself?
Is there a way to do a raw disassemble, in order to see exactly which opcodes are loaded in memory at this instant?

How and When GDB replaces the original opcode when the program continues ?
I use this as an interview question ;-)
On Linux, to continue past the breakpoint, the 0xCC is replaced with the original instruction byte (and the program counter is moved back by one, since the trap leaves it just past the 0xCC), and ptrace(PTRACE_SINGLESTEP, ...) is done. When the thread stops on the next instruction, the original byte is again replaced by the 0xCC opcode (to restore the breakpoint), and the thread is continued on its merry way.
On x86 platforms that do not have PTRACE_SINGLESTEP, the trap flag is set in EFLAGS via ptrace(PTRACE_SETREGS, ...) and the thread is continued. The trap flag causes the thread to immediately stop again (on the next instruction, just like PTRACE_SINGLESTEP would).
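The insert / restore / re-arm cycle can be modelled with a toy Python class. This is not GDB's code and makes no ptrace calls: a bytearray stands in for the inferior's memory, the single step is a caller-supplied callback, and all names are made up for the illustration.

```python
INT3 = 0xCC  # one-byte x86 breakpoint opcode

class SoftwareBreakpoints:
    def __init__(self, memory):
        self.memory = memory   # bytearray standing in for inferior memory
        self.saved = {}        # address -> original byte

    def insert(self, addr):
        """Remember the original byte and overwrite it with 0xCC."""
        if addr not in self.saved:
            self.saved[addr] = self.memory[addr]
            self.memory[addr] = INT3

    def remove(self, addr):
        """Put the original byte back for good."""
        self.memory[addr] = self.saved.pop(addr)

    def step_over(self, addr, single_step):
        """Restore the original byte, run one instruction, re-arm the breakpoint."""
        self.memory[addr] = self.saved[addr]
        single_step()
        self.memory[addr] = INT3

    def read_hiding_breakpoints(self, start, length):
        """Return memory with original bytes substituted for inserted 0xCCs."""
        view = bytearray(self.memory[start:start + length])
        for addr, original in self.saved.items():
            if start <= addr < start + length:
                view[addr - start] = original
        return bytes(view)

code = bytearray.fromhex("5589e583ec10")   # push %ebp; mov %esp,%ebp; sub $0x10,%esp
bps = SoftwareBreakpoints(code)
bps.insert(3)
print(code.hex())                               # raw memory, with the cc visible
print(bps.read_hiding_breakpoints(0, 6).hex())  # what a disassembler view would show
```

The last method mirrors why disassemble does not show the CC: the debugger keeps the saved byte and substitutes it when presenting memory.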
When i type disassemble in GDB, i do not see CC opcode. Is this because GDB knows it is him that puts the CC ?
Correct. A program could examine and print its own instruction stream, and you can observe breakpoint 0xCC opcodes that way.
Is there a way to do a raw disassemble, in order to see exactly what opcodes are loaded in memory at this instant ?
I don't believe there is. You can use (gdb) set debug infrun to observe what GDB is doing to the inferior (being debugged) process.
What i do not understand in fact is the exact role of SIGTRAP. Who is sending/receiving this signal ? The debugger or the target program?
Neither: after PTRACE_ATTACH, the kernel stops the inferior, then notifies the debugger that it has done so, by making debugger's wait return.
I see a wait(NULL) after the ptrace attach. What is the meaning of this wait?
See the explanation above.
Thread specific breakpoints?
For a thread-specific breakpoint, the debugger inserts a process-wide breakpoint (via 0xCC opcode), then simply immediately resumes any thread which hits the breakpoint, unless the thread is the specific one that you wanted to stop.

stack traces stop at the leaf register (lr)

Often I see ARM stack traces (read: Android NDK stack traces) that terminate with an lr pointer, like so:
#00 pc 001c6c20 /data/data/com.audia.dev.qt/lib/libQtGui.so
#01 lr 80a356cc /data/data/com.audia.dev.rta/lib/librta.so
I know that lr stands for link register on ARM and other architectures, and that it's a quick way to store a return address, but I don't understand why it always seems to store a useless address. In this example, 80a356cc cannot be mapped to any code using addr2line or gdb.
Is there any way to get more information? Why must the trace stop at the lr address anyway?

Stumbled on the answer finally. I just had to be more observant. Look at the following short stack trace and the information that comes after it:
#00 pc 000099d6 /system/lib/libandroid.so
#01 lr 80b6c17c /data/data/com.audia.dev.rta/lib/librta.so
code around pc:
a9d899b4 bf00bd0e 2102b507 aa016d43 28004798
a9d899c4 9801bfa8 bf00bd0e 460eb573 93004615
a9d899d4 6d842105 462b4632 200047a0 bf00bd7c
a9d899e4 b100b510 f7fe3808 2800edf4 f04fbf14
a9d899f4 200030ff bf00bd10 b097b5f0 4614af01
code around lr:
80b6c15c e51b3078 e5933038 e5932024 e51b302c
80b6c16c e1a00002 e3a01000 e3a02000 ebfeee5c
80b6c17c e1a03000 e50b303c e51b303c e1a03fa3
80b6c18c e6ef3073 e3530000 0a000005 e59f34fc
80b6c19c e08f3003 e1a00003 e51b103c ebfeebe6
Now the lr address is still a 80xxxxxx address that isn't useful to us.
The address it prints from the pc is 000099d6, but look at the next section, code around pc. The first column is a list of addresses (you can tell from the fact that it increments by 16 each time.) None of those addresses looks like the pc address, unless you chop off the first 16 bits. Then you'll notice that the a9d899d4 must correspond to 000099d4, and the code where the program stopped is two bytes in from that.
Android's stack trace seems to have "chopped off" the first 2 bytes of the pc address for me, but for whatever reason it does not do it for addresses in the link register (lr). Which brings us to the solution:
In short, I was able to chop off the first 16 bits from the 80b6c17c address to make it 0000c17c, and so far that has given me a valid code address every time that I can look up with gdb or addr2line. (edit: I've found it's actually usually the first 12 bits or first 3 hexadecimal digits. You can decide this for yourself by looking at the stack trace output like I described above.) I can confirm that it is the right code address as well. This has definitely made debugging a great bit easier!
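The chopping described above is a bit mask. A small Python sketch, using the addresses from the trace (the helper name is made up):

```python
def strip_high_bits(address, dropped_bits):
    """Drop the top `dropped_bits` bits of a 32-bit address."""
    keep_mask = (1 << (32 - dropped_bits)) - 1
    return address & keep_mask

print(hex(strip_high_bits(0xA9D899D4, 16)))  # row from "code around pc"
print(hex(strip_high_bits(0x80B6C17C, 16)))  # lr with the first 16 bits removed
print(hex(strip_high_bits(0x80B6C17C, 12)))  # lr with the first 12 bits removed
```

Masking only matches the library-relative offset when the library's load address has zeros in all the kept bits and the library is smaller than the kept range. In general the offset is the runtime address minus the load address, which is why the number of bits to drop varies from trace to trace.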

Do you have all debugging info (-g3) on?
GCC likes to use lr as a normal register. Remember that a non-leaf function looks like
push {lr}
; .. setup args here etc.
bl foo ; call a function foo
; .. work with function results
pop {pc}
Once it has pushed lr to the stack, the compiler can use it almost freely: lr will be overwritten only by function calls. So it's quite likely that lr holds some intermediate value.
This should be stated in the debugging information that the compiler generates, in order to let the debugger know it has to look at the stack value instead of lr.
