GDB watchpoint implementation - gdb

I'm writing a gdbstub for ARM and I have a question. I'm trying to implement watchpoints in my stub. GDB has special packets for the different types of watchpoints (read, write, access), but every time I set a watchpoint on some value, I get GDB's native implementation instead: single-stepping through the code and comparing values on each step. That behavior is described in the GDB documentation, but then why is there a special packet for write watchpoints at all?
Obviously, GDB's native implementation is slow. The packet is presumably there so the target can provide its own implementation. For example, when I set a watchpoint on an address rather than on a value, a Z2 packet really is sent. But I don't understand how GDB is supposed to know that the "S05" stop packet was sent because of a watchpoint.
With breakpoints, the decision is made by comparing the breakpoint address against the current program counter value.
How does it work with watchpoints?

When I tried this with gdbserver+gdb on an x86-64 Linux machine, gdbserver replied with a T packet in which it stated "watch" as the stop reason:
Packet received: T05watch:000000000058c460;06:00deffffff7f0000;07:f0ddffffff7f0000;10:9cd4410000000000;thread:p3425.3425;core:5;
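For a stub, that reply is the answer to the original question: instead of sending a bare "S05", send a "T05" stop reply that carries a watch/rwatch/awatch pair naming the data address that triggered. A minimal sketch of how a stub might build such a reply (the helper name and buffer handling are made up for illustration):

    #include <cstddef>
    #include <cstdio>

    // Hypothetical helper: format a stop reply telling GDB that a write
    // watchpoint at `addr` triggered. GDB recognizes the "watch"/"rwatch"/
    // "awatch" keys in a T reply and attributes the stop to that watchpoint.
    int format_watch_stop_reply(char* out, std::size_t out_size, unsigned long addr)
    {
        // 05 is SIGTRAP; the key:value pair carries the faulting data address.
        return std::snprintf(out, out_size, "T05watch:%lx;", addr);
    }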


How to Read the program counter / Instruction pointer of a specific core in Kernel Mode?

Windows 10, x64, x86
My current knowledge:
Let's say it is a quad-core machine; there will be 4 individual program counters pointing to 4 different locations in the code for parallel execution.
Each of these program counters indicates where a core is in its instruction sequence.
The address it points to changes after a context switch, when another thread's saved program counter is loaded into the register for execution.
What I want to do:
I'm in kernel mode, my thread is running on core 1, and I want to read the current instruction pointer of core 2.
Expected Results:
0x203123 is the value of the instruction pointer, this address belongs to this thread, and this thread belongs to this process, etc.
Does anyone know how to do this, or can anyone give me good book references, links, etc.?
Although I don't believe it's officially documented, there is a ZwGetContextThread exported from ntdll.dll. Being undocumented, things can change (and I haven't tried it in quite a while) but at least when I last tried it, you called it with a thread handle and a pointer to a CONTEXT structure, and it would return that thread's context.
I'm not certain exactly how up-to-date that is though. It's never mattered to me, so I haven't checked, but my guess would be that the IP in the CONTEXT you get is whatever was saved the last time the thread was suspended. So, if you want something (reasonably) current, you'd use ZwSuspendThread, get the context, then ZwResumeThread to start it running again.
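For illustration, here is roughly what that suspend/read/resume sequence looks like using the documented Win32 counterparts of those calls (SuspendThread, GetThreadContext, ResumeThread) on x64; the helper name is made up and error handling is minimal:

    #include <windows.h>

    // Hypothetical helper: read another thread's instruction pointer (x64).
    // hThread needs THREAD_SUSPEND_RESUME and THREAD_GET_CONTEXT access.
    bool ReadThreadRip(HANDLE hThread, DWORD64* rip)
    {
        if (SuspendThread(hThread) == (DWORD)-1)
            return false;

        CONTEXT ctx = {};
        ctx.ContextFlags = CONTEXT_CONTROL;          // control registers: Rip, Rsp, ...
        bool ok = GetThreadContext(hThread, &ctx) != 0;
        if (ok)
            *rip = ctx.Rip;                          // IP saved at the suspension point

        ResumeThread(hThread);
        return ok;
    }

As noted above, whatever you read this way is only as current as the moment the thread was suspended.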
Here I suppose I'm probably supposed to give the standard lines about undocumented functions being subject to change, about using them being a bad idea, and about how you should generally leave all of this alone. Ah well, I've been disappointing teachers and other authority figures for years, and I guess I'm not going to change right now.
On the other hand, there may be a practical problem here. If you really need data that's really current, this probably isn't going to work very well for you. What it gives you will be kind of current at best. On the other hand, really current is almost a meaningless concept with information that goes out of date every clock cycle.
Does anyone know how to do this, or can anyone give me good book references, links, etc.?
For 80x86 hardware (regardless of operating system), there are only 3 ways to do this (that I know of):
a) send an inter-processor interrupt to the other CPU, and have an interrupt handler that stores the "return EIP" (from its stack) at a known address in memory so that your CPU can read "value of EIP immediately before interrupt" (with synchronization so that your CPU doesn't read before the value is written, etc).
b) put the other CPU into some kind of "debug mode" (single-stepping, last branch recording, ...) so that either code in a debug exception handler or the CPU's hardware itself is constantly writing EIP values to memory that you can read.
Of course both of these options will ruin performance, and the value you get will probably be useless (because EIP will have changed after you obtained it but before you could use the obtained value). To ensure the value is still useful, you'd need the other CPU to wait until after you've consumed the obtained value (and are ready for the next value); and to do that you'd have to resort to single-step debugging facilities (with the waiting done in the debug exception handler), where you'll be lucky to get performance better than a thousand times slower (and could probably improve performance by simply disabling the other CPUs completely).
Also note that these still won't accurately tell you EIP in all cases (e.g. if the CPU is in SMM/System Management Mode and beyond the control of the OS), and I doubt the Windows kernel supports any of it (the kernel has to support single-stepping of user-space processes/threads so that debuggers work, but won't support single-stepping of the kernel itself and would probably lock up the computer due to various "waiting for a lock to be released for 6 days" problems).
The last of the 3 options is:
c) Run the OS inside an emulator/simulator instead of running it on real hardware. In that case you can probably modify the emulator/simulator's code to inject EIP values somewhere (maybe some kind of virtual "EIP reporting device"?). This will ruin performance of the emulator/simulator, but you may be able to hide that (e.g. "virtual time inside the emulator passes at a rate of one second per 1000 seconds of real time outside the emulator").

How do GDB rwatch and awatch commands work?

I see it is possible in GDB to set a breakpoint which will fire when a specific memory address is read or written.
I am wondering how it works. Does GDB keep a sort of copy of the process memory and check what has changed after each instruction? Or is there a syscall or kernel feature for that?
(Intel x86 32 and 64 bits architecture)
I am wondering how it works.
There are two ways: software watchpoints and hardware watchpoints (only available on some architectures).
Software watchpoints work by single-stepping the application and checking whether the value has changed after every instruction. These are painfully slow (on the order of 1000x slower) and in practice aren't usable for anything other than a toy program. They also can't detect accesses (reads), only a change of the value at the watched location.
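As a rough sketch of what that single-stepping loop amounts to (Linux/ptrace flavoured, assuming an already-traced, stopped child; an illustration of the idea, not GDB's actual code):

    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    // Software watchpoint: step one instruction at a time and re-read the
    // watched word after each step, reporting when it changes.
    void software_watch(pid_t pid, unsigned long addr)
    {
        long old = ptrace(PTRACE_PEEKDATA, pid, addr, nullptr);
        for (;;) {
            ptrace(PTRACE_SINGLESTEP, pid, nullptr, nullptr);  // run one instruction
            int status;
            waitpid(pid, &status, 0);
            if (WIFEXITED(status))
                break;
            long now = ptrace(PTRACE_PEEKDATA, pid, addr, nullptr);
            if (now != old) {
                // Watchpoint "hit": the value changed during the last instruction.
                old = now;
            }
        }
    }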
Hardware watchpoints require processor support. Intel x86 chips have debug registers, which can be programmed to watch for access (awatch, rwatch) or change (watch) of a given memory location. When the processor detects that the location of interest has been accessed, it raises a debug exception, which the OS translates into a signal, and (as usual) the signal is delivered to the debugger before the target sees it.
HW watchpoints execute at native speed, but (on x86) you can have only up to 4 distinct addresses (in practice, I've never needed more than 2).
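As an illustration of the hardware path, here is a hedged sketch of programming DR0/DR7 through ptrace on Linux x86-64 so that a 4-byte write to addr stops the traced child with SIGTRAP (error handling trimmed; real debuggers manage all four debug registers and per-thread state):

    #include <stddef.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>

    // Arm a hardware write watchpoint on 4 bytes at `addr` in a stopped tracee.
    int set_write_watchpoint(pid_t pid, unsigned long addr)
    {
        // DR0 holds the linear address to watch.
        if (ptrace(PTRACE_POKEUSER, pid,
                   offsetof(struct user, u_debugreg[0]), addr) == -1)
            return -1;

        // DR7: bit 0 enables DR0 locally; bits 16-17 = 01 (break on writes);
        // bits 18-19 = 11 (4-byte length).
        unsigned long dr7 = (1UL << 0) | (1UL << 16) | (3UL << 18);
        if (ptrace(PTRACE_POKEUSER, pid,
                   offsetof(struct user, u_debugreg[7]), dr7) == -1)
            return -1;

        return 0;   // the child now stops with SIGTRAP on any write to addr
    }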
Does execution of current instruction fire a watch read at eip address?
It should. You could trivially answer this yourself. Just try it.
Does push on stack fire a write on stack memory address?
Likewise.

How do I find out what writes to an address using C++?

I have an address that gets written to 1000x per second by 300 different instructions. How can I use C++ to find out the last instruction to write to that address?
I have already made it so that it alerts me the instant a specific value is written to the address, but how can I make it print the address of the last instruction that wrote that specific value?
I would do this in a debugger, but all of the debuggers I've found cannot handle a conditional breakpoint on an address that changes 1000x per second without freezing the program.
If I can't do this in C++, what are other ways I can do it? I need to find which instruction writes a specific value to a memory address that receives over 1000 writes per second from different instructions.
Update:
I am using Windows 7 x32 for those wondering.
Take a look at pin. Briefly, pin allows you to instrument your code at the x86 instruction level, allowing you to track reads and/or writes as you please. I've used it myself to model cache performance and found it fairly fast.
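For a sense of what that looks like, here is a hedged sketch of a Pin tool modeled on Pin's pinatrace example, logging the instruction address of every write to one watched location; WATCH_ADDR is a placeholder and details may vary between Pin versions:

    #include <cstdio>
    #include "pin.H"

    static const ADDRINT WATCH_ADDR = 0xdeadbeef;   // placeholder: the address you care about
    static FILE* trace;

    // Called before every instruction that writes memory.
    VOID RecordWrite(ADDRINT ip, ADDRINT ea)
    {
        if (ea == WATCH_ADDR)
            std::fprintf(trace, "write to %p by instruction at %p\n", (void*)ea, (void*)ip);
    }

    // Instrumentation callback: attach RecordWrite to every memory-writing operand.
    VOID Instruction(INS ins, VOID*)
    {
        UINT32 memOperands = INS_MemoryOperandCount(ins);
        for (UINT32 memOp = 0; memOp < memOperands; memOp++) {
            if (INS_MemoryOperandIsWritten(ins, memOp)) {
                INS_InsertPredicatedCall(ins, IPOINT_BEFORE, (AFUNPTR)RecordWrite,
                                         IARG_INST_PTR, IARG_MEMORYOP_EA, memOp,
                                         IARG_END);
            }
        }
    }

    int main(int argc, char* argv[])
    {
        if (PIN_Init(argc, argv)) return 1;
        trace = std::fopen("writes.out", "w");
        INS_AddInstrumentFunction(Instruction, 0);
        PIN_StartProgram();   // never returns
        return 0;
    }

You would then run the target program under Pin with this tool attached and read the resulting log.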
I have already made it so that it alerts me the instant a specific value is written to the address, but how can I make it print the address of the last instruction that wrote that specific value?
If it's just for one-off debugging, have the code that alerts you call system() or popen() to run pstack (http://www.linuxcommand.org/man_pages/pstack1.html) or a similar external program that dumps your call stack. Exactly which program to use is highly OS dependent, and you've said nothing of your environment. (This is a common technique for generating call stacks from signal handlers after invalid memory accesses, etc.)
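A minimal sketch of that idea on Linux (pstack and the helper name are illustrative; on the asker's Windows 7 box you would substitute an equivalent stack-dumping mechanism):

    #include <cstdlib>
    #include <string>
    #include <unistd.h>

    // Call this from the code path that detects the interesting write:
    // it shells out to pstack, which prints this process's call stack.
    void dump_own_stack()
    {
        std::string cmd = "pstack " + std::to_string(getpid());
        std::system(cmd.c_str());
    }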

how does gdb work?

I want to know how GDB works internally.
For example, I have a rough idea that it makes use of the ptrace() system call to monitor the traced program.
But I want to know how it handles signals, how it inserts new code, and other such fabulous things it does.
Check out the GDB Internals Manual, which covers some of the important aspects. There's also an older PDF version of this document.
From the manual:
This document documents the internals of the GNU debugger, gdb. It includes description of gdb's key algorithms and operations, as well as the mechanisms that adapt gdb to specific hosts and targets.
Taken from gdbint.pdf:
It can be done either as hardware breakpoints or as software breakpoints:

Hardware breakpoints are sometimes available as a builtin debugging feature with some chips. Typically these work by having a dedicated register into which the breakpoint address may be stored. If the PC (shorthand for program counter) ever matches a value in a breakpoint register, the CPU raises an exception and reports it to GDB.

Another possibility is when an emulator is in use; many emulators include circuitry that watches the address lines coming out from the processor, and force it to stop if the address matches a breakpoint's address.

A third possibility is that the target already has the ability to do breakpoints somehow; for instance, a ROM monitor may do its own software breakpoints. So although these are not literally hardware breakpoints, from GDB's point of view they work the same.

Software breakpoints require GDB to do somewhat more work. The basic theory is that GDB will replace a program instruction with a trap, illegal divide, or some other instruction that will cause an exception, and then when it's encountered, GDB will take the exception and stop the program. When the user says to continue, GDB will restore the original instruction, single-step, re-insert the trap, and continue on.
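As a rough sketch of that replace-with-trap scheme on Linux x86 via ptrace (illustrative only; GDB's own implementation is far more elaborate):

    #include <sys/ptrace.h>
    #include <sys/types.h>

    // Patch an INT3 (0xCC) over the first byte of the instruction at `addr`
    // in a stopped tracee; returns the original word so it can be restored.
    long insert_breakpoint(pid_t pid, unsigned long addr)
    {
        long original = ptrace(PTRACE_PEEKTEXT, pid, addr, nullptr);
        long patched  = (original & ~0xffL) | 0xcc;       // low byte -> INT3
        ptrace(PTRACE_POKETEXT, pid, addr, patched);
        return original;
    }

    // Put the real instruction back (done before single-stepping over it).
    void remove_breakpoint(pid_t pid, unsigned long addr, long original)
    {
        ptrace(PTRACE_POKETEXT, pid, addr, original);
    }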
The only way you'll find out is by studying the source.
You can also build it and debug it with itself. Step through the code, and you'll know exactly how it does what it does.
Reading GDB source is not for the faint of heart though -- it is chock-full of macros, and heavily uses libbfd, which itself is hard to understand.
It has to, because it is portable (and in particular, builds and works on platforms which do not have ptrace() at all).

What causes a Sigtrap in a Debug Session

In my C++ program I'm using a library which will "send?" a SIGTRAP on certain operations when I'm debugging it (using gdb as the debugger). I can then choose whether I wish to continue or stop the program. If I choose to continue, the program works as expected, but setting custom breakpoints after a SIGTRAP has been caught causes the debugger/program to crash.
So here are my questions:
What causes such a SIGTRAP? Is it a leftover line of code that can be removed, or is it caused by the debugger when it "finds something it doesn't like"?
Is a SIGTRAP, generally speaking, a bad thing, and if so, why does the program run flawlessly when I compile a Release version but not a Debug version?
What does a SIGTRAP indicate?
This is a more general version of a question I posted yesterday, Boost Filesystem: recursive_directory_iterator constructor causes SIGTRAPS and debug problems.
I think my question was far too specific, and I don't want you to solve my problem but to help me (and hopefully others) understand the background.
Thanks a lot.
With processors that support instruction breakpoints or data watchpoints, the debugger will ask the CPU to watch for instruction accesses to a specific address, or data reads/writes to a specific address, and then run full-speed.
When the processor detects the event, it will trap into the kernel, and the kernel will send SIGTRAP to the process being debugged. Normally, SIGTRAP would kill the process, but because it is being debugged, the debugger will be notified of the signal and handle it, mostly by letting you inspect the state of the process before continuing execution.
With processors that don't support breakpoints or watchpoints, the entire debugging environment is probably done through code interpretation and memory emulation, which is immensely slower. (I imagine clever tricks could be done by setting page-table flags to forbid reading or writing, whichever needs to be trapped, and letting the kernel fix up the page tables, signal the debugger, and then restrict the page flags again. This could probably support a near-arbitrary number of watchpoints and breakpoints, and run only marginally slower for cases where the watchpoint or breakpoint isn't frequently accessed.)
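A user-space cousin of that page-table trick can be sketched with mprotect() and a SIGSEGV handler (Linux-flavoured and illustrative only; a real implementation would single-step the faulting access and then re-arm the protection):

    #include <cstdint>
    #include <cstdio>
    #include <signal.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void fault_handler(int, siginfo_t* info, void*)
    {
        std::fprintf(stderr, "access to watched page at %p\n", info->si_addr);
        // Lift the protection so the faulting access can complete; a full
        // implementation would re-protect the page after one instruction.
        uintptr_t page = (uintptr_t)info->si_addr & ~(uintptr_t)(getpagesize() - 1);
        mprotect((void*)page, getpagesize(), PROT_READ | PROT_WRITE);
    }

    // Any read or write to the page containing watched_addr now faults first.
    void arm_page_watch(void* watched_addr)
    {
        struct sigaction sa = {};
        sa.sa_sigaction = fault_handler;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, nullptr);

        uintptr_t page = (uintptr_t)watched_addr & ~(uintptr_t)(getpagesize() - 1);
        mprotect((void*)page, getpagesize(), PROT_NONE);
    }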
The question I placed into the comment field looks apropos here, if only because Windows isn't actually sending a SIGTRAP, but rather signaling a breakpoint in its own native way. I assume that when you're debugging programs, debug versions of the system libraries are used, which check that memory accesses appear to make sense. You might have a bug in your program that is papered over at runtime but may in fact be causing further problems elsewhere.
I haven't done development on Windows, but perhaps you could get further details by looking through your Windows Event Log?
While working in Eclipse with the MinGW/gcc compiler, I realized it reacts very badly to vectors in my code, resulting in an unclear SIGTRAP signal and sometimes even abnormal debugger behavior (i.e. jumping somewhere up in the code and seemingly continuing execution in reverse order!).
I copied the files from my project into Visual Studio, resolved the issues there, then copied the changes back to Eclipse, and voila, it worked like a charm. The causes were things like confusing vector initialization via reserve() with resize(), or accessing elements out of the bounds of the vector.
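To illustrate the reserve()/resize() pitfall mentioned above (a small self-contained example, not the original code):

    #include <vector>

    int main()
    {
        std::vector<int> v;

        v.reserve(10);   // capacity is 10, but size is still 0
        // v[5] = 42;    // undefined behavior: element 5 does not exist yet

        v.resize(10);    // size is 10, elements value-initialized to 0
        v[5] = 42;       // fine now

        return 0;
    }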
Hope this will help someone else.
I received a SIGTRAP from my debugger and found out that the cause was a missing return value:
string getName() { printf("Name!");};
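A minimal fix, assuming the function is meant to return the name it prints (the return value below is a placeholder): falling off the end of a non-void function is undefined behavior, which is exactly the sort of thing that may only surface as a trap in a debug build.

    #include <cstdio>
    #include <string>

    std::string getName()
    {
        std::printf("Name!");
        return "Name";   // hypothetical return value; the original had none
    }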