How does GDB restore instruction after breakpoint - gdb

I have read that GDB puts a int 3 (opcode CC) at the wanted adress in the target program memory.
Si this operation is erasing a piece of instruction (1 byte) in the program memory.
My question is: How and When GDB replaces the original opcode when the program continues ?
When i type disassemble in GDB, i do not see CC opcode. Is this because GDB knows it is him that puts the CC ?
Is there a way to do a raw disassemble, in order to see exactly what opcodes are loaded in memory at this instant ?

How and When GDB replaces the original opcode when the program continues ?
I use this as an interview question ;-)
On Linux, to continue past the breakpoint, 0xCC is replaced with the original instruction, and ptrace(PTRACE_SINGLESTEP, ...) is done. When the thread stops on the next instruction, original code is again replaced by the 0xCC opcode (to restore the breakpoint), and the thread continued on its merry way.
On x86 platforms that do not have PTRACE_SINGLESTEP, trap flag is set in EFLAGS via ptrace(PTRACE_SETREGS, ...) and the thread is continued. The trap flag causes the thread to immediately stop again (on the next instruction, just like PTRACE_SINGLESTEP would).
When i type disassemble in GDB, i do not see CC opcode. Is this because GDB knows it is him that puts the CC ?
Correct. A program could examine and print its own instruction stream, and you can observe breakpoint 0xCC opcodes that way.
Is there a way to do a raw disassemble, in order to see exactly what opcodes are loaded in memory at this instant ?
I don't believe there is. You can use (gdb) set debug infrun to observe what GDB is doing to the inferior (being debugged) process.
What i do not understand in fact is the exact role of SIGTRAP. Who is sending/receiving this signal ? The debugger or the target program?
Neither: after PTRACE_ATTACH, the kernel stops the inferior, then notifies the debugger that it has done so, by making debugger's wait return.
I see a wait(NULL) after the ptrace attach. What is the meaning of this wait?
See the explanation above.
Thread specific breakpoints?
For a thread-specific breakpoint, the debugger inserts a process-wide breakpoint (via 0xCC opcode), then simply immediately resumes any thread which hits the breakpoint, unless the thread is the specific one that you wanted to stop.

Related

gdb backtrace mechanism

The mechanism that allows gdb to perform backtrace 1 is well explained.
Starting from the current frame, look at the return address
Look for a function whose code section contains that address.
Theoretically, there might be hundreds of thousands of functions to consider.
I was wondering if there are any inherent limitations that prevent gdb
from creating a lookup table with return address -> function name.
What makes you think GDB does a straight search through all functions? This isn't what happens. GDB organises symbols into a couple of different data structures that allow for more efficient mapping between addresses and the enclosing function.
A good place to start might be here: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/blockframe.c;h=d9c28e0a0176a1d91fec1df089fdc4aa382e8672;hb=HEAD#l118
The mechanism that allows gdb to perform backtrace 1 is well explained.
This isn't at all how GDB performs a backtrace.
The address stored in the rip register points to the current instruction, and has nothing to do with return address.
The return address is stored on the stack, or possibly in another register. To find where it is stored (on x86_64, and assuming e.g. Linux/ELF/DWARF file format), GDB looks up unwind descriptor that covers the current value of RIP. The unwind descriptor also tells GDB how to restore other registers to the state they were just before the current function was called.
You can see unwind descriptors with e.g. readelf -wf a.out command.
Once GDB knows how to find return address and restore registers, it can effectively perform an up command, stepping from current (called) frame into previous (caller) frame.
Now this process repeats, until either GDB finds a special unwind descriptor which says "I am the last, don't try to unwind past me", or some error occurs (e.g. restored RIP is 0).
Notably, nowhere in this process does GDB have to consider thousands of functions.

Find out the source file (line number) where a global variable was initialized?

I have pretty large C++ code base of a shared library which is messed up with complicated conditional macro spaghetti so IDE has troubles with that. I examined it with GDB to find the initial value of a global variable as follows:
$ gdb libcomplex.so
(gdb) p some_global_var
$1 = 1024
So I figured out the value the variable was initialized with.
QUESTION: Is it possible to find out which source file (and maybe line number) it was initialized at with GDB?
I tried list some_global_var, but it simply prints nothing:
(gdb) list some_global_var
(gdb)
So on x86 you can put a limited number of hardware watchpoints on that variable being changed:
If you are lucky, on a global you can get away with
watch some_global_var
But the debugger may still decide that is not a fixed address, and do a software watchpoint.
So you need to get the address, and watch exactly that:
p &some_global_var
(int*)0x000123456789ABC
watch (int*)0x000123456789ABC
Now, when you restart, the debugger should pop out when the value is first initialised, perhaps to zero, and/or when it is initialised to the unexpected value. If you are lucky, listing the associated source code will tell you how it came to be initialised. As others have stated you may then need to deduce why that line of code generated that value, which can be a pain with complex macros.
If that doesn't help you, or it stops many times unexpectedly during startup, then you should initially disable the watchpoint, then starti to restart you program and stop as soon as possible. Then p your global, and if it does not yet have the magic value, enable the watchpoint and continue. Hopefully this will skip the irrelevant startup and zoom in on the problem value.
You could use rr (https://rr-project.org/) to record a trace of the program, then you could reverse-execute to find the location. E.g.:
rr replay
(gdb) continue
...
(gdb) watch -l some_global_var
(gdb) reverse-continue

GDB - Reading 1 words from the stack

I want to print 1 words from the top of stack in the form of hexadecimal. To do so, I typed the following:
(gdb) x/1xw $esp
but GDB keeps popping up:
0xffffffffffffe030: Cannot access memory at address 0xffffffffffffe030
The program I'm trying to debug has already pushed a value onto stack so just in case if you're wondering that I might be trying to access kernel variables at the very beginning of program, it's not so.
Any idea?
0xffffffffffffe030 is a 64-bit constant, so you are running in x64-bit mode. But $esp is a 32-bit register (which GDB sign-extends to 64 bits in this context). The 64-bit stack pointer is called $rsp. Try this instead:
(gdb) x/1xw $rsp

Custom debugger based on windows api - debugging multithreaded process

In my debugger i set particular memory address to 0xCC (int3) and while execution one of the threads reaches this address. Exception is thrown. While handling exception i subtract IP register to point one instruction before 0xCC and replace 0xCC with original byte. I also set flag in thread context to throw exception after execute one instruction - I need to set back 0xCC byte.
Problem:
Code executes correctly but I realized that there possibly is a bug. After receiving exception I set back original byte and set flag in thread to back to the debugger just right after it executes one instruction (it lets me to set back int3). It sounds good but I have detected that after original byte is executing another thread also executes this instruction without throwing exception (I think it can be related to threads switching).

How to use a logical address with an FS or GS base in gdb?

gdb provides functionality to read or write to a specific linear address, for example:
(gdb) x/1wx 0x080483e4
0x80483e4 <main>: 0x83e58955
(gdb)
but how do you specify a logical address ? I came accross the following instruction:
0x0804841a <+6>: mov %gs:0x14,%eax
how can i read the memory at "%gs:0x14" in gdb, or translate this logical address to a linear address that i could use in x command ?
note: i know that i could simply read %eax after this instruction, but that is not my concern
how can i read the memory at "%gs:0x14" in gdb
You can't: there is no way for GDB to know how the segment to which %gs refers to has been set up.
or translate this logical address to a linear address that i could use in x command
Again, you can't do this in general. However, you appear to be on 32-bit x86 Linux, and there you can do that -- the %gs is set up to point to the thread descriptor via set_thread_area system call.
You can do catch syscall set_thread_area in GDB, and examine the parameters (each thread will have one such call). The code to actually do that is here. Once you know how %gs has been set up, just add 0x14 to the base_addr, and you are done.
As answered in Using GDB to read MSRs, this is possible with gdb 8, using the registers $fs_base and $gs_base.
I think the easiest way to do this is to read the content of EAX register as you can see the value of %gs:0x14 is moved to EAX.
In GDB, set a breakpoint at the address right after 0x0804841a with break. For example
break *0x0804841e
Then run the program and you can print the contents of EAX register with
info registers eax