I'm trying to debug PHP segmentation fault in my CI. I managed to get the GDB stacktrace by running phpunit like this:
gdb php -ex "run vendor/bin/phpunit" -ex bt -ex quit
The problem is that this way I do not know if phpunit succeeded or failed (doesn't matter if it failed with segfault or with test failure). I need to somehow log the exit code of the vendor/bin/phpunit command and use it in an if statement later - to decide whether or not the build succeeded.
The problem is that this way I do not know if phpunit succeeded or failed (doesn't matter if it failed with segfault or with test failure)
You can print the value of $_exitsignal variable. In case of SIGSEGV signal it would be equal to 11. Also you can use $_isvoid function to know if the program exited or signalled. For this code:
#include <signal.h>
int main (int argc, char *argv[])
{
raise (SIGSEGV);
return 0;
}
you can invoke gdb this way:
$ gdb -q -ex run -ex bt -ex c -ex "print \$_isvoid(\$_exitsignal)" -ex quit a.out
Reading symbols from a.out...
Starting program: /tmp/a.out
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e1e9e5 in raise () from /lib64/libc.so.6
#0 0x00007ffff7e1e9e5 in raise () from /lib64/libc.so.6
#1 0x000000000040113f in main (argc=1, argv=0x7fffffffd6e8) at 1.c:5
Continuing.
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
$1 = 0
At the end of output you can see that the program ended with signal, $_isvoid($_exitsignal) returned 0:
$1 = 0
Related
It seems that gdb fails finding the code position of an assertion failure, after I recompile my code. More precisely, I expect the position of a signal raise, relative to an assertion failure, to be
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.`6
while instead I obtain
0x00007ffff7a5ff00 in ?? ()
For instance, consider the following code
#include <assert.h>
int main()
{
assert(0);
return 0;
}
compiled with debug symbols and debugged with gdb.
> gcc -g main.c
> gdb a.out
On the first run of gdb, the position is found, and the backtrace is reported correctly:
GNU gdb (Gentoo 8.0.1 p1) 8.0.1
...
(gdb) r
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
#1 0x00007ffff7a61baa in abort () from /lib64/libc.so.6
#2 0x00007ffff7a57cb7 in ?? () from /lib64/libc.so.6
#3 0x00007ffff7a57d72 in __assert_fail () from /lib64/libc.so.6
#4 0x00005555555546b3 in main () at main.c:5
(gdb)
The problem comes when I recompile the code. After recompiling, I issue the run command in the same gdb instance. Gdb re-reads the symbols, starts the program from the beginning, but does not find the right position:
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
`/home/myself/a.out' has changed; re-reading symbols.
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in ?? ()
(gdb) bt
#0 0x00007ffff7a5ff00 in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb) up
Initial frame selected; you cannot go up.
(gdb) n
Cannot find bounds of current function
At this point the debugger is unusable. One cannot go up, step forward.
As a workaround, I can manually reload the file, and positions are found again.
(gdb) file a.out
Load new symbol table from "a.out"? (y or n) y
Reading symbols from a.out...done.
(gdb) r
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
(gdb)
Unfortunately, after reloading the file this way, gdb fails resetting the breakpoints.
ERRATA CORRIGE: I was experiencing failure in resetting the breakpoints using gdb 7.12.1. After upgrading to 8.0.1 the problem vanished. Supposedly, this was related to the bugfix https://sourceware.org/bugzilla/show_bug.cgi?id=21555. However, code positions where assertions fail still cannot be found correctly.
Does anybody have any idea about what is going on here?
This has started happening after a system update. The system update recompiled all system libraries, including the glibc, as position independent code, i.e., compiled with -fPIC.
Also, the version of the gcc I am using is 6.4.0
Here is a workaround. Since file re-reads the symbols correctly, while run does not, we can define a hook for the command run so to execute file before:
define hook-run
pi gdb.execute("file %s" % gdb.current_progspace().filename)
end
after you change the source file and recompile u are generating a different file from the one loaded to GDB.
you need to stop the running debug cession and reload the file.
you cant save the previously defined breakpoints and watch points in the file to a changed source, since gdb is actually inserting additional code to your source to support breakpoints and registrar handlers.
if you change the source the the behavior is undefined and you need to reset those breakpoints.
you can refer to gdb manual regarding saving breakpoints in a file as
Mark Plotnick suggested, but it wont work if you change the file(from my experience)
https://sourceware.org/gdb/onlinedocs/gdb/Save-Breakpoints.html
I'm trying to get to the bottom of a bug in KDE 5.6. The locker screen breaks no matter how I lock it. Here's the relevant code: https://github.com/KDE/kscreenlocker/blob/master/abstractlocker.cpp#L51
When I run /usr/lib/kscreenlocker_greet --testing, I get an output of:
KCrash: Application 'kscreenlocker_greet' crashing...
Floating point exception (core dumped)
I'm trying to run it with gdb to try and pin the exact location of the bug, but I'm not sure where to set the breakpoints in order to isolate the bug. Should I be looking for calls to KCrash? Or perhaps a raise() call? Can I get gdb to print off the relevant line of code that causes SIGFPE?
Thanks for any advice you can offer.
but I'm not sure where to set the breakpoints in order to isolate the bug
You shouldn't need to set any breakpoints at all: when a process running under GDB encounters a fatal signal (such as SIGFPE), the OS notices that the process is being traced by the debugger, and notifies the debugger (instead of terminating the process). That in turn causes GDB to stop, and prompt you for additional commands. It is at that time that you can look around and understand what caused the crash.
Example:
cat -n t.c
1 #include <fenv.h>
2
3 int foo(double d) {
4 return 1/d;
5 }
6
7 int main()
8 {
9 feenableexcept(FE_DIVBYZERO);
10 return foo(0);
11 }
gcc -g t.c -lm
./a.out
Floating point exception
gdb -q ./a.out
(gdb) run
Starting program: /tmp/a.out
Program received signal SIGFPE, Arithmetic exception.
0x000000000040060e in foo (d=0) at t.c:4
4 return 1/d;
(gdb) bt
#0 0x000000000040060e in foo (d=0) at t.c:4
#1 0x0000000000400635 in main () at t.c:10
(gdb) q
Here, as you can see, GDB stops when SIGFPE is delivered, and allows you to look around and understand the crash.
In your case, you would want to first install debuginfo symbols for KDE, and then run
gdb --args /usr/lib/kscreenlocker_greet --testing
(gdb) run
In Linux, GDB doesn't allow set new-console on and in stead uses something called tty. With
set inferior-tty /dev/pts/[number of an active console],
in a .gdbinit file (requires editing the number every time) it redirects std::cout, but std::cin isn't working properly. It just interprets my input as if I'm sending a bash command and reports an error, and my program continues to wait for input. I can no longer type in the console after that, so I assume std::cin is being redirected, but doesn't work properly.
I tried looking up how to launch a terminal from the application itself. I could only find this answer, which also mentions a bug that it doesn't redirect input.
Is there any way to fix this issue and redirect std::cin (and std::cout) to a Linux terminal properly when debugging?
Background info: What I'm trying to do should be simple. Print a > in front of user input before using std::cin. I have simple code in place that prints the >, flushes cout and then calls getline(). It works when just running the program normally. But sadly, GDB refuses to flush the stream when there isn't a newline, so it doesn't print the >, ignores the first character of the user input and then does prints the >, immediately followed by the error message that my program sends because of the mutilated input string.
In Windows, I've solved it by making a .gdbinit file with set new-console on. This causes GDB to use a Windows console in stead of its own and that works as intended.
If you need to start the debugging after the program has a chance to pause, just run the program in a terminal, and then gdb - your-program-pid in another terminal. Otherwise there's a sequence of steps which is faster to do than to describe.
Start two terminal windows.
In one terminal, figure out the PID of the shell. ps should tell you.
In the other terminal, use these commands
$ gdb - pid-of-the-other-shell
(gdb) br main
(gdb) c
Now you have gdb attached to your shell in another terminal. Not very useful.
In the first terminal, type at the shell prompt
$ exec your-program
Now you have one terminal running gdb and another terminal running your program under gdb, stopped at main. Bear in mind that when the program exits, its terminal window will close. If you don't want this, start a second level shell in the first terminal and attach gdb to it.
You can also use set inferior-tty command, but you must make sure the other tty exists and no other program attempts to read from it. The easiest way is to run a shell in another terminal and give it a while true; do sleep 1000; done command. Note that you may get warning: GDB: Failed to set controlling terminal: Operation not permitted messages. They are harmless.
You can use gdb's multiprocess debugging feature and a terminal emulator like xterm to do this. (gnome-terminal won't work so well as explained later)
prompt.c
#include <stdio.h>
int main()
{
enum {
N = 100,
};
char name[N];
printf("> ");
scanf("%s", name);
printf("Hi, %s\n", name);
return 0;
}
xterm.gdb
set detach-on-fork off
set target-async on
set pagination off
set non-stop on
file /usr/bin/xterm
# -ut tells xterm not to write a utmp log record
# which involves root privileges
set args -ut -e ./prompt
catch exec
run
Sample Session
$ gdb -q -x xterm.gdb
Catchpoint 1 (exec)
<...>
[New process 7073]
<...>
Thread 0x7ffff7fb2780 (LWP 7073) is executing new program: /home/scottt/Dropbox/stackoverflow/gdb-stdin-new-terminal/prompt
Reading symbols from /home/scottt/Dropbox/stackoverflow/gdb-stdin-new-terminal/prompt...done.
Catchpoint 1 (exec'd /home/scottt/Dropbox/stackoverflow/gdb-stdin-new-terminal/prompt), 0x0000003d832011f0 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) inferior 2
[Switching to inferior 2 [process 7073] (/home/scottt/Dropbox/stackoverflow/gdb-stdin-new-terminal/prompt)]
[Switching to thread 2 (Thread 0x7ffff7fb2780 (LWP 7073))]
#0 0x0000003d832011f0 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) break prompt.c:11
Breakpoint 2 at 0x4004b4: file prompt.c, line 11.
(gdb) continue &
Continuing.
(gdb)
Breakpoint 2, main () at prompt.c:11
11 printf("> ");
next
12 scanf("%s", name);
(gdb) next
13 printf("Hi, %s\n", name);
A new xterm window will appear after executing gdb -q -x xterm.gdb and you can interact with prompt inside it without interfering with GDB.
Related GDB commands:
inferior 2 switches the process GDB's debugging much like the thread command switches threads. (GDB: Debugging Multiple Inferiors and Programs)
continue & means to continue executing in the background. (GDB: Background Execution)
I used xterm because gnome-terminal uses dbus-launch to ask another process to launch the new terminal outside of the parent-child process relationship that GDB relies on.
I seem to have some kind of multithreading bug in my code that makes it crash once every 30 runs of its test suite. The test suite is non-interactive. I want to run my test suite in gdb, and have gdb exit normally if the program exits normally, or break (and show a debugging prompt) if it crashes. This way I can let the test suite run repeatedly, go grab a cup of coffee, come back, and be presented with a nice debugging prompt. How can I do this with gdb?
This is a little hacky but you could do:
gdb -ex='set confirm on' -ex=run -ex=quit --args ./a.out
If a.out terminates normally, it will just drop you out of GDB. But if you crash, the program will still be active, so GDB will typically prompt if you really want to quit with an active inferior:
Program received signal SIGABRT, Aborted.
0x00007ffff72dad05 in raise (sig=...) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
A debugging session is active.
Inferior 1 [process 15126] will be killed.
Quit anyway? (y or n)
Like I said, not pretty, but it works, as long as you haven't toggled off the prompt to quit with an active process. There is probably a way to use gdb's quit command too: it takes a numeric argument which is the exit code for the debugging session. So maybe you can use --eval-command="quit stuff", where stuff is some GDB expression that reflects whether the inferior is running or not.
This program can be used to test it out:
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main() {
if (time(NULL) % 2) {
raise(SIGINT);
}
puts("no crash");
return EXIT_SUCCESS;
}
You can also trigger a backtrace when the program crashes and let gdb exit with the return code of the child process:
gdb -return-child-result -ex run -ex "thread apply all bt" -ex "quit" --args myProgram -myProgramArg
The easiest way is to use the Python API offered by gdb:
def exit_handler(event):
gdb.execute("quit")
gdb.events.exited.connect(exit_handler)
You can even do it with one line:
(gdb) python gdb.events.exited.connect(lambda x : gdb.execute("quit"))
You can also examine the return code to ensure it's the "normal" code you expected with event.exit_code.
You can use it in conjuction with --eval-command or --command as mentioned by #acm to register the event handler from the command line, or with a .gdbinit file.
Create a file named .gdbinit and it will be used when gdb is launched.
run
quit
Run with no options:
gdb --args prog arg1...
You are telling gdb to run and quit, but it should stop processing the file if an error occurs.
Make it dump core when it crashes. If you're on linux, read the man core man page and also the ulimit builtin if you're running bash.
This way when it crashes you'll find a nice corefile that you can feed to gdb:
$ ulimit -c unlimited
$ ... run program ..., gopher coffee (or reddit ;)
$ gdb progname corefile
If you put the following lines in your ~/.gdbinit file, gdb will exit when your program exits with a status code of 0.
python
def exit_handler ( event ):
if event .exit_code == 0:
gdb .execute ( "quit" )
gdb .events .exited .connect ( exit_handler )
end
The above is a refinement of Kevin's answer.
Are you not getting a core file when it crashes? Start gdb like this 'gdb -c core' and do a stack traceback.
More likely you will want to be using Valgrind.
I have this output when trying to debug
Program received signal SIGSEGV, Segmentation fault 0x43989029 in
std::string::compare (this=0x88fd430, __str=#0xbfff9060) at
/home/devsw/tmp/objdir/i686-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:253
253 { return memcmp(__s1, __s2, __n); }
Current language: auto; currently c++
Using valgrind I getting this output
==12485== Process terminating with default action of signal 11 (SIGSEGV)
==12485== Bad permissions for mapped region at address 0x0
==12485== at 0x1: (within path_to_my_executable_file/executable_file)
You don't need to use Valgrind, in fact you want to use the GNU DeBugger (GDB).
If you run the application via gdb (gdb path_to_my_executable_file/executable_file) and you've compiled the application with debugging enabled (-g or -ggdb for GNU C/C++ compilers), you can start the application (via run command at the gdb prompt) and once you arrive at the SegFault, do a backtrace (bt) to see what part of your program called std::string::compare which died.
Example (C):
mctaylor#mpc:~/stackoverflow$ gcc -ggdb crash.c -o crash
mctaylor#mpc:~/stackoverflow$ gdb -q ./crash
(gdb) run
Starting program: /home/mctaylor/stackoverflow/crash
Program received signal SIGSEGV, Segmentation fault.
0x00007f78521bdeb1 in memcpy () from /lib/libc.so.6
(gdb) bt
#0 0x00007f78521bdeb1 in memcpy () from /lib/libc.so.6
#1 0x00000000004004ef in main (argc=1, argv=0x7fff3ef4d848) at crash.c:5
(gdb)
So the error I'm interested in is located on crash.c line 5.
Good luck.
Just run the app in the debugger. At one point it will die and you will have a stack trace with the information you want.