GDB: catching a signal and continuing debugging

I am trying to catch a floating-point exception (SIGFPE) in GDB, not pass it to the process, and continue debugging from there.
I have given gdb this:
handle SIGFPE stop nopass
When a SIGFPE occurs, GDB stops at the correct place. The problem is that I don't know how to continue debugging from there.
I have tried giving GDB
continue
or
signal 0
but it still hangs on the offending line and refuses to continue.
Is there a way to continue debugging after receiving a signal?
I am using GDB 7.5.1, which I compiled myself, and I have also tried GDB 7.4, which ships with my Ubuntu 12.04 distribution. Both behave the same way.

The problem is that when you continue a program after a synchronous signal (one raised by the executing instruction itself, such as SIGFPE), it re-executes the same instruction that caused the signal, which means you'll just get the signal again. If you tell it to ignore the signal (either directly or via gdb), it will go into a tight loop, re-executing that instruction repeatedly.
If you want to actually continue the program somewhere after the instruction that causes the signal, you need to manually set the $pc register to the next (or some other) instruction before issuing the continue command.
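To experiment with this, a small reproducer can help. The sketch below is only an illustration and is not taken from the question: divide is a hypothetical function, and it assumes an x86 or x86-64 target, where integer division by zero raises SIGFPE (it is undefined behaviour in C++, and some architectures do not trap at all). The comments outline one possible GDB session; the address is left as a placeholder because it depends on the build.

```cpp
// Hypothetical helper used only for illustration. The volatile locals keep the
// compiler from folding the division away at compile time.
int divide(int numerator, int denominator) {
    volatile int n = numerator;
    volatile int d = denominator;
    return n / d;  // with d == 0 on x86/x86-64 this instruction raises SIGFPE
}

// One possible GDB session for a program that calls divide(1, 0):
//
//   (gdb) handle SIGFPE stop nopass
//   (gdb) run
//   ... Program received signal SIGFPE ...
//   (gdb) x/2i $pc      (shows the faulting instruction and the one after it)
//   (gdb) set var $pc = <address of the next instruction>
//   (gdb) continue
//
// After the instruction is skipped, the "result" of the division is simply
// whatever was already in the result register, so later values may be garbage.
```

Calling divide(6, 3) returns 2 as usual; divide(1, 0) is the call to run under GDB.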

Related

AddressSanitizer kills GDB state, even when following Sanitizer Github advice

I have a double-free bug. I can reproduce it with a debug build, and Address Sanitizer (AS) detects it, but when I run under GDB, AS kills the GDB session.
I found this Address Sanitizer page with instructions how to keep GDB:
https://github.com/google/sanitizers/wiki/AddressSanitizerAndDebugger
but when I do:
(gdb) break __asan::ReportGenericError
at the beginning of the session, the GDB state still disappears after the problem is detected:
(gdb) bt
No stack.
"the GDB state still disappears after the problem is detected"
There are several possible reasons for this:
1. Somehow you didn't set the breakpoint correctly.
2. It's actually a child process that is dying.
3. Somehow the thread in which the error is detected is not attached by GDB.
To eliminate 1, use catch syscall exit_group (and possibly also catch syscall exit) -- this way GDB is sure to stop before the process disappears.
For 2, the AddressSanitizer message should indicate the id of the thread in which the error is detected, and that id should match one of the threads in GDB's info threads output.
For 3, we'd need to understand more about how that thread was created.

GDB: breakpoint in inferior process

I have a piece of network software that I need to debug. It forks in multiple places, and I need to debug one particular function handling one particular request.
Is there any way to set up a global breakpoint that would be caught even when it is hit in an inferior process?
I cannot use follow-fork-mode child because this will follow the first request, not the one I need to debug.
One way to do this is to have gdb remain attached to all the processes. Then you would set your breakpoint and run the program as usual; the breakpoint would fire in any sub-process that happened to hit that location. You can use breakpoint conditions to try to reduce the number of hits.
To put gdb into multi-inferior mode, I use this:
set detach-on-fork off
set non-stop on
set pagination off
Depending on your version of gdb, you might also need set target-async on.
This mode can be a bit peculiar to work in. For example, when one thread stops, the others keep going. Also, breakpoint stops are reported but are not always obvious, and I think gdb doesn't immediately switch to the stopping thread (this may have changed in gdb git, I forget).
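As a rough way to try this out, here is a minimal sketch of a forking program with one function of interest. Everything in it is hypothetical (handle_request, serve_requests, and the request numbering are invented for illustration); it only assumes a POSIX system with fork and waitpid. The breakpoint condition in the comments shows how hits could be narrowed to one particular request.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical request handler: the function we want to break on.
int handle_request(int request_id) {
    return request_id * 2;
}

// Forks one child per request and returns how many children exited with the
// expected status. Request ids are kept small so the result fits in an exit code.
int serve_requests(int count) {
    assert(count >= 0 && count < 64);
    int succeeded = 0;
    for (int id = 0; id < count; ++id) {
        pid_t pid = fork();
        if (pid < 0) {
            break;  // fork failed; stop creating children
        }
        if (pid == 0) {
            _exit(handle_request(id));  // child: handle one request, then exit
        }
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
            WEXITSTATUS(status) == id * 2) {
            ++succeeded;
        }
    }
    return succeeded;
}

// With the settings from the answer, a session might look like:
//
//   (gdb) set detach-on-fork off
//   (gdb) set non-stop on
//   (gdb) set pagination off
//   (gdb) break handle_request if request_id == 2
//   (gdb) run
```

Because gdb stays attached to every child, the conditional breakpoint should fire only in the child that handles request 2, regardless of which fork produced it.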

GDB backtrace without stopping

I am trying to let my program run continuously under GDB.
Currently I have a bash script which starts GDB with my program and when it crashes it prints the backtrace and starts GDB again (endless loop).
Now I added a signal handler for my program which kills specific threads when the handler gets a signal from them. Now I can achieve that GDB does not stop by doing this:
handle SIGSEGV nostop
But this leads me to the problem that I do not get a GDB backtrace which I would like to print automatically without stopping the program (or at least continuing automatically).
Any help would be appreciated!
Continue to use handle to suppress ordinary stops from SEGV. Then set a catchpoint that does what you want:
(gdb) catch signal SIGSEGV
(gdb) commands
> silent # this bit is optional
> bt
> continue
> end
This will print a backtrace on SIGSEGV but not otherwise interfere with normal operation. You may also want handle SIGSEGV noprint.

How can I continue from a program that has stopped, using lldb?

I am trying to break out of a read-line loop into lldb, and then continue where I broke out of. When I try using C-C, the program just exits after the "continue" command is given to lldb.
Here is the sample code:
#include <iostream>
#include <string>
using namespace std;

int main() {
    string cmd;
    while (true) {
        if (!getline(cin, cmd)) {
            cout << "ending on eof" << endl;
            break;
        }
        else if (cmd == "GO INTO DEBUGGER") {
            //??
        }
        else
            cout << "Got line: " << cmd << endl;
    }
    cout << "Exiting program" << endl;
    return 0;
}
When this program is executed, it just echoes back the input line. When I interrupt the program using C-C, I bounce back into the debugger. When I then execute "continue" in the debugger, instead of returning to the loop, I just exit with the EOF message.
How can I return to the loop at the point where it was interrupted, either by using C-C or perhaps by using some kind of command in place of the "GO INTO DEBUGGER" clause? (Returning from "assert(0)" rarely works, I find.)
This is all compiled with clang++ on OS X Mavericks.
Note: for some reason the lldb backtrace says it received SIGSTOP, I thought C-C was SIGINT, but I guess I'm out of date.
This sort of problem comes because of the interaction between signals and system traps.
When a program that is sitting in a system trap waiting for input (or in any system trap, really) gets a signal, the system may need to get the thread sitting in the trap out of the kernel in order to deliver the signal to it. If it has to do that, the trap call will return with its usual error value, and the thread-local "errno" variable will be set to EINTR.
This is happening to you in the debugger because the debugger has to send a signal (lldb uses SIGSTOP rather than SIGINT for uninteresting reasons) to interrupt your program.
However, this isn't specific to the debugger; it can happen because of job-control signals or any other signal your program might receive. So, to be safe, when a read-type call (read, select, getline, etc.) returns an error, you are supposed to check errno and treat the error as an EOF only if errno is not EINTR.
That being said, it seems like getline is buggy w.r.t. signals. If I interrupt the program while it is sitting in getline, I get a 0 return and errno is correctly set to 4. But the next time I call into getline, it again returns with 0, but this time errno has not been reset which makes it kind of hard to use in this context. Interesting...
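One way to work around the stale errno is to clear it yourself before every call, since library functions generally never reset errno to zero on success. The following is a minimal sketch under that assumption, not a confirmed fix for the behaviour described above: read_line_retrying and count_lines are hypothetical helpers, and whether clearing the stream state (and stdin's error flag) is enough to resume reading after an interruption depends on the standard library implementation.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line, retrying when the failure was caused by
// an interrupted system call (EINTR) rather than a real end of input.
bool read_line_retrying(std::istream& in, std::string& line) {
    for (;;) {
        errno = 0;  // reset first, so a stale EINTR is never mistaken for a new one
        if (std::getline(in, line)) {
            return true;
        }
        if (errno != EINTR) {
            return false;  // genuine EOF or some other error
        }
        in.clear();  // drop failbit/eofbit so the stream can be read again
        if (&in == &std::cin) {
            std::clearerr(stdin);  // assumption: cin is backed by stdin
        }
    }
}

// Small usage example on an in-memory stream: counts the lines in a string.
std::size_t count_lines(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::size_t lines = 0;
    while (read_line_retrying(in, line)) {
        ++lines;
    }
    return lines;
}
```

In the program from the question, the loop condition would become if (!read_line_retrying(cin, cmd)), so that an interruption by the debugger is retried instead of being reported as EOF.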
Rather than using Control-C to stop your program, you should use a breakpoint in lldb. When you attach to your program, before starting execution, you can set a breakpoint by typing:
break foo.c:11
to break in the file foo.c on line 11. See the docs for more information.
Once the debugger stops at the breakpoint, you can examine variables and perform other actions, then type:
continue
to continue the execution of the program.

How does gdb attach to multi-threaded process?

I will try to be as specific as I can, but so far I have worded this problem so poorly that Google failed to return any useful results (hence my question here).
I am attaching gdb to a multi-threaded c++ server process. All I can say is that strange things have been happening while trying to do the usual set-breakpoint-break-investigate.
First, while waiting for the breakpoint to be hit (in 'Continuing' mode), I suddenly got back the (gdb) prompt with the message:
Continuing.
[Thread 0x54d5b940 (LWP 28503) exited]
[New Thread 0x54d5b940 (LWP 28726)]
Cannot get thread event message: debugger service failed
Second, also while waiting for the breakpoint to be hit, I'm suddenly told the program has received SIGSEGV and - back to the (gdb) prompt - backtrace tells me the segfault happened in pthread_cancel(). Note the process under investigation does not normally segfault.
I clearly lack enough information about how gdb works to even begin guessing what is happening. Am I doing anything wrong? The steps I take are the same each time:
gdb attach
break 'MyFunction()'
continue
Thoughts? Thanks.
I fought with similar gdb issues for a while. In my case, lots of threads were spawned that each executed a few functions and then exited.
It appears that if threads exit too quickly and there are many of them, gdb sometimes cannot keep up, and when it fails, it fails in style: it crashes :) Judging by the error message, I think it tries to attach to a thread that has already finished.
I have seen this issue in gdb 6.5 through 7.6, and it is still happening. I did not try older versions.
My advice is to look for this use case or a similar one in your code. Once I changed my design to have a thread serving a queue of requests, gdb worked flawlessly.
Design-wise, it is healthier to have pre-created threads that process queued actions than to keep spawning new threads.
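The queue-based design could look roughly like the sketch below. It is only an illustration of the idea, not the code from my project: RequestWorker and process_requests are hypothetical names, and a real server would probably want several workers and proper error handling.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical worker: one long-lived thread that serves a queue of requests,
// instead of a new short-lived thread per request.
class RequestWorker {
public:
    RequestWorker() : worker_([this] { run(); }) {}

    ~RequestWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();  // returns once the remaining requests have been handled
    }

    void submit(std::function<void()> request) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(request));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> request;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) {
                    return;  // stopping was requested and the queue is drained
                }
                request = std::move(queue_.front());
                queue_.pop();
            }
            request();  // run the request outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};

// Usage example: handles `count` trivial requests on the single worker thread.
int process_requests(int count) {
    std::atomic<int> handled{0};
    {
        RequestWorker worker;
        for (int i = 0; i < count; ++i) {
            worker.submit([&handled] { ++handled; });
        }
    }  // the destructor drains the queue and joins the thread
    return handled.load();
}
```

With this layout the debugger only ever sees one extra thread being created and destroyed, however many requests are processed.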
Still, the same code debugs without a problem in Visual Studio, so I have to say this is a small disappointment with gdb.
I use Eclipse, and looking at the GDB traces (usually enabled by default) will give you a better hint of where GDB fails. One of the buttons on the console shows you the GDB trace.