gdb Breakpoint on assert in multithreaded C++ program - c++

I'm using assert from <cassert> to check invariants in my multithreaded C++11 program. When an assertion fails, I'd like to inspect the state of the failing function, with the backtrace, variable state, etc. still intact at the time of the failure. The issue seems to be some interaction between SIGABRT and my threads, as my std::threads are pthread_killed, presumably by some default signal handler. How can I pause gdb right at the time of the failed assertion?
Here are some things I've tried:
set a catchpoint on SIGABRT. This catch does occur, but it's too late (in __pthread_kill).
defined __assert_fail, which is extern declared in <assert.h>, and set a gdb breakpoint on it. This is never caught so presumably the pthread is being killed before this is called (?).
What's the recommended approach here?

I did the following:
Example program:
#include <cassert>
void f2()
{
assert(0);
}
void f1()
{
f2();
}
int main()
{
f1();
}
Now I set a breakpoint on f2, hoping to step down to the assert with stepi:
gdb > break f2
gdb > run
Breakpoint 11, f2 () at main.cpp:5
gdb > stepi // several times!!!!
0x080484b0 in __assert_fail@plt ()
Ahhh! As we can see, stepi lands on a symbol, which tells us that there is a function with that name. So simply set a breakpoint on __assert_fail@plt:
gdb > break __assert_fail@plt
gdb > run
Breakpoint 11, f2 () at main.cpp:5
(gdb) bt
#0 0x080484b0 in __assert_fail@plt ()
#1 0x080485f7 in f2 () at main.cpp:5
#2 0x08048602 in f1 () at main.cpp:10
#3 0x0804861b in main () at main.cpp:15
Works for me!
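If breaking on the libc symbol is awkward on your platform, another option is to route assertion failures through a function of your own and break on that instead. This is only a minimal sketch: DBG_ASSERT, debug_assert_failed and checked_div are hypothetical names made up for illustration (they are not part of <cassert> or gdb), and the attribute syntax assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical hook: every failed DBG_ASSERT calls this function before
// abort() raises SIGABRT, so a gdb breakpoint here stops in the frame
// chain of the thread that failed the check.
__attribute__((noinline, noreturn))
void debug_assert_failed(const char* expr, const char* file, int line)
{
    std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    std::abort();
}

// Hypothetical replacement for assert(); evaluates cond exactly once.
#define DBG_ASSERT(cond) \
    ((cond) ? (void)0 : debug_assert_failed(#cond, __FILE__, __LINE__))

// Example use: the invariant is that the divisor is non-zero.
int checked_div(int a, int b)
{
    DBG_ASSERT(b != 0);
    return a / b;
}
```

With this in place, break debug_assert_failed followed by run and bt should show the caller of the failed check one frame up, in the same way as the __assert_fail@plt breakpoint above.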

If you need a breakpoint on assert for some reason, Klaus's answer to break on __assert_fail is absolutely correct.
However, it turns out that setting a breakpoint to see stack traces in gdb on multithreaded programs is simply not necessary, as gdb already stops on SIGABRT and switches to the aborting thread. In my case I had a misconfigured set of libraries that led to this red herring. If you are trying to see stack traces from aborted code (SIGABRT) in gdb using multithreaded programs, you do not need to do anything in gdb, assuming the default signal handling is in place.
FYI, you can see how gdb handles each signal by running info signals, and the same for just SIGABRT by running info signals SIGABRT. On my machine, I see the output below, which shows that the program will be stopped, etc. If for some reason gdb is not set to stop on SIGABRT, you need to change that setting (for example with handle SIGABRT stop print). More info at https://sourceware.org/gdb/onlinedocs/gdb/Signals.html.
(gdb) info signals SIGABRT
Signal        Stop   Print   Pass to program   Description
SIGABRT       Yes    Yes     Yes               Aborted

Related

Print thread id at breakpoint

I am debugging some C++ code. When I am paused at breakpoint, if I do info thread, gdb shows me a list of all the threads in my process, and puts an asterisk next to the thread under execution at breakpoint. Is there a gdb command which makes gdb tell you the thread id when at breakpoint?
I am using catch throw and catch catch to debug around the time an exception is thrown on thread 1. But thread 2 is simultaneously throwing and catching exceptions too. Since I am only interested in throw and catch on thread 1, I plan to ask gdb for the thread id and script the breakpoint to continue if the thread id is 2.
(gdb) catch throw
Catchpoint 7 (throw)
(gdb) catch catch
Catchpoint 8 (catch)
(gdb) command 8
> if threadid == 2
> c
> end
Can you please show me how to write this line if threadid == 2?
Using the built-in $_thread convenience variable:
The debugger convenience variables $_thread and $_gthread contain,
respectively, the per-inferior thread number and the global thread
number of the current thread. You may find this useful in writing
breakpoint conditional expressions, command scripts, and so forth. See
Convenience Variables, for general information on convenience
variables.
catch catch if $_thread == 1
Using the Python API:
— Function: gdb.selected_thread () This function returns the thread
object for the selected thread. If there is no selected thread, this
will return None.
catch catch
command
python
if gdb.selected_thread().num != 1:
    gdb.execute('continue')
end
end
Generally speaking, when GDB lacks a feature, you can very likely implement it yourself using the Python API, since it lets you explore your running program and its context.
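To try either approach, it helps to have a small program where two threads throw and catch at the same time. The following is a sketch under my own assumptions, not the asker's code: throw_and_catch and run_both_threads are hypothetical names, and it needs to be built with -g -pthread.

```cpp
#include <cassert>
#include <stdexcept>
#include <thread>

// Throws and catches `rounds` exceptions on the calling thread and
// returns how many were caught.
int throw_and_catch(int rounds)
{
    int caught = 0;
    for (int i = 0; i < rounds; ++i) {
        try {
            throw std::runtime_error("test");
        } catch (const std::runtime_error&) {
            ++caught;
        }
    }
    return caught;
}

// Runs the same loop on the calling thread and on one extra thread,
// and returns the total number of exceptions caught by both.
int run_both_threads(int rounds)
{
    int from_worker = 0;
    std::thread worker([&] { from_worker = throw_and_catch(rounds); });
    int from_caller = throw_and_catch(rounds);
    worker.join();  // join before reading from_worker
    return from_caller + from_worker;
}
```

If main calls run_both_threads, the calling thread is normally gdb thread 1 and the worker is thread 2, so catch catch if $_thread == 1 should stop only for the catches made by the caller.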

STL type/function used in gdb conditional break, will crash the program?

I wish to test the following in my program below: when s="abc", break inside "f()" and see the value of "i".
#include<string>
using namespace std;
int i=0;
void f(const string& s1)
{
++i; // line 6
}
int main()
{
string s="a";
s+="b";
s+="c";
s+="d";
s+="e";
s+="f";
return 0;
}
Compile and run a.out, no problem. I then debug it
g++ 1.cpp -g
gdb a.out
...
(gdb) b main if strcmp(s.c_str(),"abc")==0
Breakpoint 1 at 0x400979: file 1.cpp, line 9.
(gdb) r
Starting program: /home/dev/a.out
Program received signal SIGSEGV, Segmentation fault.
__strcmp_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:31
31 ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: No such file or directory.
Error in testing breakpoint condition:
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(__strcmp_sse2_unaligned) will be abandoned.
When the function is done executing, GDB will silently stop.
Program received signal SIGSEGV, Segmentation fault.
Breakpoint 1, __strcmp_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:31
31 in ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S
If I change the break point declaration into:
(gdb) b main:6 if s.compare("abc")==0
Breakpoint 1 at 0x400979: file 1.cpp, line 9.
Then I seem to get another kind of crash:
(gdb) r
Starting program: /home/dev/a.out
Program received signal SIGSEGV, Segmentation fault.
__memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:1024
1024 ../sysdeps/x86_64/multiarch/memcmp-sse4.S: No such file or directory.
Error in testing breakpoint condition:
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(char const*) const) will be abandoned.
When the function is done executing, GDB will silently stop.
Program received signal SIGSEGV, Segmentation fault.
Breakpoint 1, __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:1024
1024 in ../sysdeps/x86_64/multiarch/memcmp-sse4.S
Is this crash caused by gdb, or by my command? If my command has a runtime problem, why doesn't gdb simply report an error rather than crash the program?
I hope to get some explanation, as I don't understand the cause of this error.
What is going on here is that your command:
(gdb) break main:6
... is interpreted by gdb as the same as break main. You can see this by typing the latter as well:
(gdb) b main:6
Breakpoint 1 at 0x400919: file q.cc, line 10.
(gdb) b main
Note: breakpoint 1 also set at pc 0x400919.
Breakpoint 2 at 0x400919: file q.cc, line 10.
Now, this is peculiar because gdb presumably ought to warn you that the trailing :6 is ignored. (I'd recommend filing a bug asking that this be made a syntax error.)
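Because the breakpoint therefore sits at the start of main, s has not been constructed yet when the condition is evaluated, so calling c_str() or compare() on it from gdb reads uninitialized memory in your program, and that is where the SIGSEGV comes from. If you want to break at a certain line in a file you must use the source file name. Presumably you meant to type: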
(gdb) break main.cc:6
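Once the breakpoint is on a line where s is already constructed, the condition itself can be kept simple by compiling a tiny comparison helper into the program and calling that from gdb. A minimal sketch, assuming GCC or Clang; debug_equals is a hypothetical name of my own, not something gdb or the standard library provides.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper meant to be called from a gdb breakpoint condition.
// noinline and used ask the compiler to keep a real, callable copy of the
// function in the binary even if the program itself never calls it.
__attribute__((noinline, used))
bool debug_equals(const std::string& s, const char* expected)
{
    return s == expected;
}
```

The condition would then look something like break 1.cpp:LINE if debug_equals(s, "abc"), where LINE is a line after the construction of s. Whether gdb can pass s and the string literal to the function depends on your gdb version and debug info, so treat this as something to try rather than a guarantee.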

How to use GDB to analyse stack trace leading to system libraries?

I'm trying to find the reason for a segfault that occurs at the level of system libraries.
I would like to get some hints on how to use gdb to examine the args of the getenv() call seen in the following stack trace.
As the trace shows, getenv() is not called directly by my code; the call is nested in a chain of library calls initiated by my code. It starts with my routine a_logmsg() trying to get a thread-safe local time via localtime_r(), and getenv() is called later somewhere within libc. The OS is Solaris 8/SPARC.
Program terminated with signal 11, Segmentation fault.
#0 0xfed3c9a0 in getenv () from /usr/lib/libc.so.1
(gdb) where
#0 0xfed3c9a0 in getenv () from /usr/lib/libc.so.1
#1 0xfed46ab0 in getsystemTZ () from /usr/lib/libc.so.1
#2 0xfed44918 in ltzset_u () from /usr/lib/libc.so.1
#3 0xfed44140 in localtime_r () from /usr/lib/libc.so.1
#4 0x00029c28 in a_logmsg (fmt=0xfd5d0 "%s: no changes to config.") at misc.c:155
#5 0x000273b8 in a_sync_device (device_group=0x11e3ed0 "none", hostname=0xfbbffe8d "router",
config_by=0xfbbffc8f "scheduled_archiving", platform=0x11e3ee0 "cisco", authset=0x11e3ef0 "set01",
arch_method=0xffffcfc8 <Address 0xffffcfc8 out of bounds>) at arch.c:256
#6 0x00027ce8 in a_archive_single (arg=0x1606f50) at arch.c:498
#7 0xfe775378 in _lwp_start () from /usr/lib/libthread.so.1
#8 0xfe775378 in _lwp_start () from /usr/lib/libthread.so.1
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I would like get some hints on how to use gdb to examine args of the getenv() call seen in the following stack trace.
The source for Solaris libc is available here.
You can examine argument to getenv by setting the breakpoint on it, and looking at the registers. You'll need to know the ABI that is used, but it's quite simple -- the argument to getenv is in register i0, and print (char*)$i0 at the (gdb) prompt should print "TZ".
Finally, the most likely reason for a crash in getenv is that you've corrupted the environment earlier. In particular, note that this code is bad:
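#include <stdlib.h>
#include <string.h>
void buggy()
{
    char buf[80];
    strcpy(buf, "FOO=BAR");
    putenv(buf); // <-- BUG! putenv keeps the pointer, but buf dies when buggy() returns
}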
You can usually examine the environment via one of these commands:
(gdb) x/100s __environ
(gdb) x/100s environ
Chances are, you'll see strings there which do not contain the = sign.
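For comparison, here is a sketch of a putenv call that does not have this problem, because the string handed to putenv is never freed or reused. put_env_persistent is a hypothetical helper name of my own; the copy is leaked on purpose, since the environment keeps pointing at it.

```cpp
#include <cassert>
#include <stdlib.h>  // putenv (POSIX), malloc, free, getenv
#include <string.h>  // strchr, strlen, memcpy, strcmp

// Copies a "NAME=value" string to the heap and hands the copy to putenv().
// The copy is intentionally never freed: putenv() stores the pointer itself,
// so the storage has to stay valid for as long as the variable is set.
bool put_env_persistent(const char* assignment)
{
    if (strchr(assignment, '=') == nullptr)
        return false;  // not of the form NAME=value

    size_t len = strlen(assignment) + 1;
    char* copy = static_cast<char*>(malloc(len));
    if (copy == nullptr)
        return false;
    memcpy(copy, assignment, len);

    if (putenv(copy) != 0) {
        free(copy);  // putenv failed, so the environment does not reference copy
        return false;
    }
    return true;
}
```

After put_env_persistent("FOO=BAR"), getenv("FOO") keeps returning "BAR" even once the calling function has returned, which is exactly what the stack buffer in buggy() cannot guarantee.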

How to break when a specific exception type is thrown in GDB?

According to the documentation I can break on specific exception type by using conditional breakpoints. However the syntax for the condition isn't very clear to me:
condition bnum <expression>
Looking at the expression syntax I think this is the pattern I need:
{type} addr
However, I don't know what I should pass for the addr argument. I tried the following:
(gdb) catch throw
(gdb) condition 1 boost::bad_function_call *
But it doesn't work (gdb breaks on all exception types).
Can anyone help?
Update
I also tried @Adam's suggestion, but it results in an error message:
(gdb) catch throw boost::bad_function_call
Junk at end of arguments.
Without boost:: namespace:
(gdb) catch throw bad_function_call
Junk at end of arguments.
Workaround
Breaking in the constructor of bad_function_call works.
EDIT
The documentation suggests that catch throw <exceptname> can be used to break whenever an exception of type <exceptname> is thrown; however, that doesn't seem to work in practice.
(gdb) help catch
Set catchpoints to catch events.
Raised signals may be caught:
catch signal - all signals
catch signal <signame> - a particular signal
Raised exceptions may be caught:
catch throw - all exceptions, when thrown
catch throw <exceptname> - a particular exception, when thrown
catch catch - all exceptions, when caught
catch catch <exceptname> - a particular exception, when caught
Thread or process events may be caught:
catch thread_start - any threads, just after creation
catch thread_exit - any threads, just before expiration
catch thread_join - any threads, just after joins
Process events may be caught:
catch start - any processes, just after creation
catch exit - any processes, just before expiration
catch fork - calls to fork()
catch vfork - calls to vfork()
catch exec - calls to exec()
Dynamically-linked library events may be caught:
catch load - loads of any library
catch load <libname> - loads of a particular library
catch unload - unloads of any library
catch unload <libname> - unloads of a particular library
The act of your program's execution stopping may also be caught:
catch stop
C++ exceptions may be caught:
catch throw - all exceptions, when thrown
catch catch - all exceptions, when caught
Ada exceptions may be caught:
catch exception - all exceptions, when raised
catch exception <name> - a particular exception, when raised
catch exception unhandled - all unhandled exceptions, when raised
catch assert - all failed assertions, when raised
Do "help set follow-fork-mode" for info on debugging your program
after a fork or vfork is caught.
Do "help breakpoints" for info on other commands dealing with breakpoints.
When gdb command 'catch throw' fails, try this workaround :
(tested with Linux g++ 4.4.5/gdb 6.6)
1/ Add this code anywhere in the program to debug :
#include <stdexcept>
#include <exception>
#include <typeinfo>
struct __cxa_exception {
    std::type_info *inf;
};
struct __cxa_eh_globals {
    __cxa_exception *exc;
};
extern "C" __cxa_eh_globals* __cxa_get_globals();
const char* what_exc() {
    __cxa_eh_globals* eh = __cxa_get_globals();
    if (eh && eh->exc && eh->exc->inf)
        return eh->exc->inf->name();
    return NULL;
}
2/ In gdb you will then be able to filter exceptions with :
(gdb) break __cxa_begin_catch
(gdb) cond N (what_exc()?strstr(what_exc(),"exception_name"):0!=0)
where N is the breakpoint number, and exception_name is the name of exception for which we wish to break.
From what I have understood from the question here, you want to break when a specific exception boost::bad_function_call is thrown in your application.
$> gdb /path/to/binary
(gdb) break boost::bad_function_call::bad_function_call()
(gdb) run --some-cli-options
So when the temporary boost::bad_function_call object is constructed in preparation for the throw, gdb will break.
I have tested this and it does work. If you know precisely how the exception object is being constructed, you can set a breakpoint on that specific constructor. Otherwise, as shown in the example below, you can omit the argument list, and gdb will set breakpoints on all the different flavours of the constructor.
$ gdb /path/to/binary
(gdb) break boost::bad_function_call::bad_function_call
Breakpoint 1 at 0x850f7bf: boost::bad_function_call::bad_function_call. (4 locations)
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y <MULTIPLE>
1.1 y 0x0850f7bf in boost::bad_function_call::bad_function_call() at /usr/include/boost/function/function_base.hpp:742
1.2 y 0x0850fdd5 in boost::bad_function_call::bad_function_call(boost::bad_function_call const&) at /usr/include/boost/function/function_base.hpp:739
1.3 y 0x0863b7d2 <boost::bad_function_call::bad_function_call()+4>
1.4 y 0x086490ee <boost::bad_function_call::bad_function_call(boost::bad_function_call const&)+6>
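The same trick works for your own exception types, as long as the constructor exists as a real function in the binary. A minimal sketch, assuming GCC or Clang: my_error and throws_my_error are hypothetical names, and the noinline attribute is only there to make it more likely that an optimized build keeps a constructor to break on.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type. The constructor is only declared here and is
// defined out of line below, so there is a concrete symbol for
// "break my_error::my_error" to resolve.
struct my_error : std::runtime_error {
    explicit my_error(const std::string& what_arg) __attribute__((noinline));
};

my_error::my_error(const std::string& what_arg)
    : std::runtime_error(what_arg)
{
}

// Throws my_error and reports whether it was caught with the same message.
bool throws_my_error(const std::string& message)
{
    try {
        throw my_error(message);
    } catch (const my_error& e) {
        return message == e.what();
    }
    return false;
}
```

With this, break my_error::my_error should stop while the exception object is being built, with the throwing function one frame up in the backtrace.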
Another approach is to rely on the tinfo argument available when the catch point is triggered, which is a pointer to the object returned by typeid(type).
So say if I want to catch exception std::bad_alloc being thrown, I could just do:
> p &typeid(std::bad_alloc)
> $1 = (__cxxabiv1::__si_class_type_info *) 0x8c6db60 <typeinfo for std::bad_alloc>
> catch throw if tinfo == 0x8c6db60
Let's assume you have the following main.cpp with a thread that throws an exception:
#include <iostream>
#include <thread>
void thr()
{
    while (true) {
        new int[1000000000000ul];
    }
}
int main(int argc, char* argv[]) {
    std::thread t(thr);
    t.join();
    std::cout << "Hello, World!" << std::endl;
    return 0;
}
Compile it using the following CMakeLists.txt:
cmake_minimum_required(VERSION 3.5)
project(tutorial)
set(CMAKE_CXX_STANDARD 11)
add_executable(test_exceptions main.cpp)
target_link_libraries(test_exceptions stdc++ pthread)
Now you can play with it; running it will give you an abort because of bad_alloc.
Before going on, it's better to install the libstdc++ debug symbols: sudo apt-get install libstdc++6-5-dbg, or whatever version you have.
Debug compilation
If you are compiling in Debug you can follow this answer https://stackoverflow.com/a/12434170/5639395 because constructors are usually defined.
Release compilation
If you are compiling in RelWithDebInfo you may not be able to find a proper constructor where to put your breakpoint, because of compiler optimization. In this case, you have some other options. Let's continue.
Source code change solution
If you can change the source code easily, this will work https://stackoverflow.com/a/9363680/5639395
Gdb catch throw easy solution
If you don't want to change the code, you can try to see if catch throw bad_alloc or in general catch throw exception_name works.
Gdb catch throw workaround
I will build on top of this answer https://stackoverflow.com/a/6849989/5639395
We will add a breakpoint in gdb in the function __cxxabiv1::__cxa_throw . This function takes a parameter called tinfo that has the information we need to conditionally check for the exception we care about.
We want something like catch throw if exception==bad_alloc, so how to find the proper comparison?
It turns out that tinfo is a pointer to a structure that has a variable called __name inside. This variable has a string with the mangled name of the exception type.
So we can do something like: catch throw if tinfo->__name == mangled_exception_name
We are almost there!
We need a way to do string comparison, and it turns out gdb has a built-in function $_streq(str1,str2) that does exactly what we need.
The mangled name of the exception is a little harder to find, but you can try to guess it or check the Appendix of this answer. Let's assume for now it is "St9bad_alloc".
The final instruction is:
catch throw if $_streq(tinfo->__name , "St9bad_alloc")
or equivalent
break __cxxabiv1::__cxa_throw if $_streq(tinfo->__name , "St9bad_alloc")
How to find the name of your exception
You have two options
Look for the symbol in the library
Assuming that you installed the libstd debug symbols, you can find the library name like this:
apt search libstd | grep dbg | grep installed
The name is something like this libstdc++6-5-dbg
Now check the files installed:
dpkg -L libstdc++6-5-dbg
Look for something that has a debug in the path, and a .so extension. In my pc I have /usr/lib/x86_64-linux-gnu/debug/libstdc++.so.6.0.21.
Finally, look for the exception you want in there.
nm /usr/lib/x86_64-linux-gnu/debug/libstdc++.so.6.0.21 | grep -i bad_alloc
Or
nm /usr/lib/x86_64-linux-gnu/debug/libstdc++.so.6.0.21 | grep -i runtime_error
etc.
In my case I found something like 00000000003a4b20 V _ZTISt9bad_alloc, which suggested using "St9bad_alloc" as the name.
Throw it in gdb and inspect the name in there
This is easy, just start gdb, catch throw everything and run the small executable I wrote before. When you are inside gdb, you can issue a p *tinfo and look for the __name description from gdb.
gdb -ex 'file test_exceptions' -ex 'catch throw' -ex 'run'
(gdb) p *tinfo
$1 = {_vptr.type_info = 0x406260 <vtable for __cxxabiv1::__si_class_type_info+16>,
__name = 0x7ffff7b8ae78 <typeinfo name for std::bad_alloc> "St9bad_alloc"}
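A third way to get the string, if you can compile a few lines with the same toolchain, is to ask the compiler directly. As far as I know, with GCC and Clang typeid(T).name() returns the same mangled name that gdb shows in tinfo->__name. This is just a sketch, and mangled_type_name is a hypothetical helper name; other compilers format the name differently.

```cpp
#include <cassert>
#include <new>        // std::bad_alloc
#include <stdexcept>  // std::runtime_error
#include <string>
#include <typeinfo>

// Returns the implementation-defined type name of E. With GCC and Clang this
// is the mangled name, which is the string to compare against in
// $_streq(tinfo->__name, "...").
template <typename E>
std::string mangled_type_name()
{
    return typeid(E).name();
}
```

Printing mangled_type_name<std::bad_alloc>() from a throwaway program built with the same compiler gives you the string to paste into the catch throw condition.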
As others have already mentioned, this functionality doesn't work in practice. As a workaround, you can put a condition on catch throw. When an exception is thrown, we end up in the __cxa_throw function. It has several parameters pointing to the exception class, so we can set a condition on one of them. In the sample gdb session below, I put a condition on the dest parameter of __cxa_throw. The only problem is that the value of dest (0x80486ec in this case) is not known in advance. You can find it, for example, by first running gdb without a condition on the breakpoint.
[root@localhost ~]#
[root@localhost ~]# gdb ./a.out
GNU gdb (GDB) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/a.out...done.
(gdb) catch throw
Catchpoint 1 (throw)
(gdb) condition 1 dest==0x80486ec
No symbol "dest" in current context.
(gdb) r
warning: failed to reevaluate condition for breakpoint 1: No symbol "dest" in current context.
warning: failed to reevaluate condition for breakpoint 1: No symbol "dest" in current context.
warning: failed to reevaluate condition for breakpoint 1: No symbol "dest" in current context.
Catchpoint 1 (exception thrown), __cxxabiv1::__cxa_throw (obj=0x804a080, tinfo=0x8049ca0, dest=0x80486ec <_ZNSt13runtime_errorD1Ev@plt>) at ../../../../gcc-4.4.3/libstdc++-v3/libsupc++/eh_throw.cc:68
68 ../../../../gcc-4.4.3/libstdc++-v3/libsupc++/eh_throw.cc: No such file or directory.
in ../../../../gcc-4.4.3/libstdc++-v3/libsupc++/eh_throw.cc
(gdb) bt
#0 __cxxabiv1::__cxa_throw (obj=0x804a080, tinfo=0x8049ca0, dest=0x80486ec <_ZNSt13runtime_errorD1Ev@plt>) at ../../../../gcc-4.4.3/libstdc++-v3/libsupc++/eh_throw.cc:68
#1 0x08048940 in main () at test.cpp:14
(gdb) i b
Num Type Disp Enb Address What
1 breakpoint keep y 0x008d9ddb exception throw
stop only if dest==0x80486ec
breakpoint already hit 1 time
(gdb)
Update
You must also load debug info for libstdc++ for this workaround to work.
I'm not sure if this is a recent fix, but with GNU gdb (Debian 9.1-2) 9.1, I have used catch throw std::logical_error successfully. I would hate to generalise prematurely, but it is possible this now works correctly in GDB (April 2020).
I think I can answer the part about setting conditional breaks. I won't answer the question regarding exceptions, as __raise_exception does not seem to exist in g++ 4.5.2 (?)
Let's assume that you have the following code (I use void* to get something similar to __raise_exception from the gdb docs):
void foo(void* x) {
}
int main() {
foo((void*)1);
foo((void*)2);
}
to break at foo(2) you use following commands
(gdb) break foo
Breakpoint 1 at 0x804851c: file q.cpp, line 20.
(gdb) condition 1 x == 2
If you run with
(gdb) r
you will see that it stops on the second foo call, but not on the first one.
I think what they meant in the docs is that you set a break on the function __raise_exception (very implementation dependent):
/* addr is where the exception identifier is stored
id is the exception identifier. */
void __raise_exception (void **addr, void *id);
and then set a conditional break on id as described above (you have to somehow determine what id is for your exception type).
Unfortunately
(gdb) break __raise_exception
results with (g++ 4.5.2)
Function "__raise_exception" not defined.
In case the problem is that there is no valid stack trace (not breaking in raise), it seems to be a problem with re-compiling without re-starting gdb (i.e. calling "make" inside the gdb console).
After having re-started gdb, it breaks correctly in raise.c
(my versions : GNU gdb 8.1.0.20180409-git, gcc 7.4.0, GNU make 4.1)

Analyze Core Dump

We have a binary that generates a core dump, so I ran gdb to analyze the issue. Please note that the binary and the code are in two different locations, and we cannot build the whole binary with debugging symbols. Given that, how can I proceed, and what details can I find from the backtrace below?
gdb binary corefile
(gdb) where
#0 0x101fa37a in f1()
#1 0x10203812 in operator f2< ()
#2 0x085f6244 in f3 ()
#3 0x085f1574 in f4()
#4 0x0805b27b in sigsegv_handler ()
#5 <signal handler called>
#6 0x1018d945 in f5()
#7 0x1018e021 in f6()
..................................
#29 0x08055c5c in main ()
(gdb)
Please give me gdb commands that I can issue to find what data is inside each stack frame, what the issue probably is, and where it is failing, as well as any other debugging methods.
You can use help in gdb. To navigate the stack: help stack
The main useful commands to navigate the stack are up and down. If you have debugging symbols at hand, you can use list to see where you are. Then, to get information, you need print (abbreviated as 'p'). For example, if you have an int called myInt, just type p myInt. With no debug info it will be harder. From your stack trace it seems that the problem is in f5(). One thing you can do is start your program inside gdb; it will stop right where the segfault happens. Once you have hints about which part of your code segfaults, you can compile that code unit with debugging options.
Those are the basics. Tell us more if you want more help.
my2c