There are two stacks in the program: one is created by the OS and the second is created by the program itself to run some code on.
When the program crashes while running on the second stack, I want to switch to the main stack in gdb and see its backtrace. Is that possible?
I tried saving rsp to a variable and restoring it after the crash, but the resulting backtrace was not right. I think gdb cannot tell the frames of the two stacks apart.
I think you were on the right track with the approach of simply restoring a few register values to point GDB at the other stack. It's hard to know exactly how your application works without seeing its source, but take the very simple makecontext/swapcontext application below:
#include <stdio.h>
#include <ucontext.h>
#include <unistd.h>

ucontext_t a_ctx, b_ctx;
char b_stack[4096];

void a2() {
    swapcontext(&a_ctx, &b_ctx);
}

void a1() { a2(); }

void b2() {
    printf("pausing");
    pause(); // interrupt here in the debugger
}

void b1() { b2(); }

int main() {
    getcontext(&b_ctx);
    b_ctx.uc_stack.ss_sp = b_stack;
    b_ctx.uc_stack.ss_size = sizeof(b_stack);
    makecontext(&b_ctx, b1, 0);
    a1();
}
From within GDB, the set $... command restores the registers to the values that swapcontext saved into a_ctx, at which point bt will find the old stack:
(gdb) bt
#0 0x00007ffff7e8cc23 in __libc_pause () at ../sysdeps/unix/sysv/linux/pause.c:29
#1 0x00005555555551c7 in b2 () at src/so.c:16
#2 0x00005555555551d8 in b1 () at src/so.c:19
#3 0x00007ffff7e13a60 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000000000 in ?? ()
(gdb) set $rbp = a_ctx.uc_mcontext.gregs[REG_RBP]
(gdb) set $rip = a_ctx.uc_mcontext.gregs[REG_RIP]
(gdb) set $rsp = a_ctx.uc_mcontext.gregs[REG_RSP]
(gdb) bt
#0 a2 () at src/so.c:10
#1 0x00005555555551a7 in a1 () at src/so.c:12
#2 0x0000555555555234 in main () at src/so.c:26
If the two stacks belong to two OS threads, you can list them with 'info threads'.
Once you know which thread's stack you want to see, select it with the 'thread' command followed by that thread's number.
Then just print the stack with 'bt', for example:
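(A hypothetical session for illustration only; the thread ids, addresses, and function names below are made up and not from the program above.)
(gdb) info threads
  Id   Target Id                                 Frame
* 1    Thread 0x7ffff7d99740 (LWP 4242) "a.out"  main () at main.c:30
  2    Thread 0x7ffff7598700 (LWP 4243) "a.out"  worker (arg=0x0) at main.c:12
(gdb) thread 2
[Switching to thread 2 (Thread 0x7ffff7598700 (LWP 4243))]
(gdb) bt
#0  worker (arg=0x0) at main.c:12
#1  0x00007ffff7bc6ea7 in start_thread () from /lib64/libpthread.so.0
#2  0x00007ffff78f3acf in clone () from /lib64/libc.so.6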
Goal: I want to modify internal information and read that information from many threads concurrently, as fast as possible.
I have simplified the code below, but this is how I tried to achieve it.
I have two shared pointers.
One is called m_mutable_data and the other is called m_const_data.
m_mutable_data is updated in a strand-guarded way. m_const_data is updated with the contents of m_mutable_data every 60 s, also in a strand-guarded way.
This is the only place where the m_const_data shared pointer is reset with new data. m_const_data is read concurrently by many threads, 1000+ times per second.
Code
class black_list_container : public std::enable_shared_from_this<black_list_container>
{
    struct meta_data
    {
        bool blacked;
    };
    struct black_list_data
    {
        std::unordered_map<uint32_t, meta_data> data;
    };
public:
#pragma optimize( "", off )
    bool is_blacked(uint32_t id)
    {
        // This is called from many different threads (1000+ calls per second);
        // it must be synchronous and as fast as possible
        auto c = m_const_data;
        return c->data[id].blacked;
    }
#pragma optimize( "", on )
#pragma optimize( "", off )
    void update_const_data()
    {
        // Called internally by a timer every 60s to update m_const_data
        // with the contents of m_mutable_data. Guarded with the strand.
        m_strand.post([self{shared_from_this()}]{
            auto snapshot = new black_list_data();
            snapshot->data = self->m_mutable_data->data;
            self->m_const_data.reset(snapshot);
        });
    }
#pragma optimize( "", on )
private:
    void internal_modification_mutable_data()
    {
        // Called internally by different metrics. Guarded with the strand.
        m_strand.post([self{shared_from_this()}]{
            // .... do some modification on internal m_mutable_data
        });
    }

    boost::asio::io_context::strand m_strand;
    std::shared_ptr<black_list_data> m_mutable_data;
    std::shared_ptr<black_list_data> m_const_data;
};
Very, very seldom this code crashes in method 'is_blacked' on line
auto c = m_const_data;
This is the backtrace
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./STRATUM-01'.
Program terminated with signal 6, Aborted.
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-307.el7.1.x86_64 libgcc-4.8.5-39.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64
(gdb) bt
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
#1 0x00007fe09aaf2a78 in abort () from /lib64/libc.so.6
#2 0x00007fe09ab33ed7 in __libc_message () from /lib64/libc.so.6
#3 0x00007fe09ab3c299 in _int_free () from /lib64/libc.so.6
#4 0x00000000005fae36 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fe0440aeaa0) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:154
#5 0x00000000006b9205 in ~__shared_count (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:684
#6 ~__shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:1123
#7 ~shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr.h:93
#8 black_list_container_impl::is_blacked (this=0x7fe08c287e50, id=23654) at /var/lib/jenkins/workspace/validator/src/black_list_container.cpp:69
I'm not exactly sure why the destruction of a shared_ptr is being run in frame #7.
Obviously I did not achieve my goal, so please point me to a pattern that actually achieves it in a thread-safe way.
I know I could have used
std::atomic<std::shared_ptr<black_list_data>> m_const_data;
but wouldn't that affect performance when reading from many different threads?
I think I found the answer to my question in the article Atomic Smart Pointers.
So I have to change the code in update_const_data() to
auto snapshot = std::make_shared<black_list_data>();
snapshot->data = m_mutable_data->data;
std::atomic_store(&m_const_data, snapshot);
and code in is_blacked() to
auto c = std::atomic_load(&m_const_data);
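Putting both changes together, a sketch of the two methods, keeping the member names from the question (std::atomic_load/std::atomic_store on shared_ptr are the pre-C++20 way to do this; note also that the reader should use find() rather than operator[], since operator[] would insert into the shared snapshot):

bool is_blacked(uint32_t id)
{
    // Atomically grab a reference to the current snapshot; this is safe
    // against a concurrent atomic_store() in update_const_data().
    auto c = std::atomic_load(&m_const_data);
    auto it = c->data.find(id);   // find(), not operator[], so readers never modify the snapshot
    return it != c->data.end() && it->second.blacked;
}

void update_const_data()
{
    m_strand.post([self{shared_from_this()}]{
        auto snapshot = std::make_shared<black_list_data>();
        snapshot->data = self->m_mutable_data->data;
        // Publish the snapshot atomically; readers see either the old
        // pointer or the new one, and reference counts stay consistent.
        std::atomic_store(&self->m_const_data, snapshot);
    });
}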
I do operations on an STL map in the following functions, all of which are protected by a mutex:-
static std::mutex track_active_lock_mtx;
typedef intrusive_ptr<WatchCtxInternal> WatchCtxInternal_h;
static std::map<WatchCtxInternal*, WatchCtxInternal_h> actives;
void* get_ptr(WatchCtxInternal_h ctx)
{
    unique_lock<mutex> trackActiveLock(track_active_lock_mtx);
    if (actives.find(ctx.get()) == actives.end()) {
        actives.insert(make_pair(ctx.get(), ctx));
    }
    trackActiveLock.unlock();
    return ctx.get();
}

void genericWatcher(void *watcherCtx)
{
    unique_lock<mutex> trackActiveLock(track_active_lock_mtx);
    auto it = actives.find((WatchCtxInternal*)watcherCtx);
    if (it == actives.end()) {
        return;
    }
    // do unrelated stuff
    actives.erase(it);
}
I got a segmentation fault in the first function:-
Program terminated with signal SIGSEGV, Segmentation fault.
#0 _M_lower_bound (this=<optimized out>, __k=<optimized out>, __y=0xf31256e8, __x=0x65687465) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_tree.h:1261
1261 if (!_M_impl._M_key_compare(_S_key(__x), __k))
(gdb) bt
#0 _M_lower_bound (this=<optimized out>, __k=<optimized out>, __y=0xf31256e8, __x=0x65687465) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_tree.h:1261
#1 find (__k=<optimized out>, this=0xf6ac8e2c <actives>) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_tree.h:1913
#2 find (__x=<optimized out>, this=0xf6ac8e2c <actives>) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_map.h:860
#3 get_ptr (ctx=...)
(gdb) fr 3
(gdb) p ctx
$4 = {px = 0xf3124d30}
EDIT: I managed to get a stack trace using Valgrind's Memcheck tool. What is happening is that the static map gets cleaned up as part of process exit, but a callback to genericWatcher occurs on the other thread before the process has completely exited:-
main.cpp
static void* thread1(void *arg) {
    // call genericWatcher repeatedly
}

int main() {
    if (fork() == 0) {
        pthread_create(..., thread1,..)
        // call get_ptr() repeatedly
    }
    return 0;
}
Is there any way to prevent this? I could allocate a singleton that holds the actives map, but I try to avoid using singletons.
The most likely point of failure is the erase call in your release callback, because it's the only access to your map without any guarding mechanism. Are you sure that, at that point, your WatchCtx is still among the map's keys? If not, it sounds as if the inserted entry may already have been released.
But, like Velkan already said, valgrind (or your debugger of choice) will give you certainty.
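As for the exit-time problem described in the EDIT: a common workaround, sketched below under the assumption that the static map's destructor really is racing with the late genericWatcher callback, is the construct-on-first-use idiom. The map is allocated on the heap and deliberately never destroyed, so static destruction order at process exit can no longer invalidate it:

// Construct-on-first-use: the map is created the first time it is needed
// and intentionally leaked, so it outlives every exit-time callback.
static std::map<WatchCtxInternal*, WatchCtxInternal_h>& get_actives()
{
    static auto* m = new std::map<WatchCtxInternal*, WatchCtxInternal_h>();
    return *m;
}

get_ptr and genericWatcher would then call get_actives() instead of touching the file-scope actives directly, still under track_active_lock_mtx.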
I am working on pthread code to do a repeated matrix-vector product. I first wrote the serial matrix-vector multiplication and then attempted to split the matrix-vector product across separate threads.
The code at https://github.com/viswans/parallel-computing-cs525/blob/pthread/pthread_page_rank/src/pthread/pagerankPthread.cpp does what I just described. When the number of threads is increased from 8 to 9, the binary crashes with a segmentation fault.
While debugging with gdb I noticed that a null pointer was being dereferenced, and I added a watchpoint on that pointer to see whether it was being set properly. What I saw was that the argument to the function called from pthread_create seems to be wiped and set to 0!
Old value = 37843
New value = 45242576
0x0000000000403436 in __gnu_cxx::new_allocator<(anonymous namespace)::ThreadStruct>::construct<(anonymous namespace)::ThreadStruct, (anonymous namespace)::ThreadStruct> (this=0x2b25970, __p=0x2b260e0) at /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.4/include/g++-v4/ext/new_allocator.h:120
120 { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }
(gdb) c
Continuing.
[New Thread 0x7ffff2985700 (LWP 3390)]
[New Thread 0x7ffff2184700 (LWP 3391)]
[New Thread 0x7ffff1983700 (LWP 3392)]
[New Thread 0x7ffff1182700 (LWP 3393)]
Hardware watchpoint 3: *(0x2b260e8)
Old value = 45242576
New value = 0
0x00007ffff708eedb in __memset_sse2 () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff708eedb in __memset_sse2 () from /lib64/libc.so.6
#1 0x00007ffff7ded2e2 in allocate_dtv () from /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff7ded9be in _dl_allocate_tls () from /lib64/ld-linux-x86-64.so.2
#3 0x00007ffff7bc9fc5 in pthread_create@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#4 0x0000000000402b47 in PageRank::PageRankPthread::calculatePageRank (matrix=std::shared_ptr (count 1, weak 0) 0x2b258d0,
input=std::vector of length 196591, capacity 196591 = {...}, num_threads=9, criterion=...) at src/pthread/pagerankPthread.cpp:84
#5 0x0000000000401d5d in mainPthread (argc=3, argv=0x7fffffffe6b8) at src/pthread/mainPthread.cpp:31
#6 0x000000000040be47 in main (argc=3, argv=0x7fffffffe6b8) at src/main.cpp:9
Any insight about why pthread_create would flush the arguments would be much appreciated.
Thanks
Sudharshan
You call push_back on the tstruct vector, which invalidates all pointers into that vector, causing the threads to access structures that have moved. One simple fix is to add tstruct.reserve(num_threads); after std::vector< ThreadStruct > tstruct;.
But you should really rethink this and do things in a more sensible way. Is a vector of structures a suitable collection to use when you need a pointer into the collection to remain valid as the collection is modified?
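A minimal, self-contained sketch of the failure mode and the one-line fix (the ThreadStruct contents, worker, and num_threads here are placeholders, not the actual repository code):

#include <pthread.h>
#include <vector>

struct ThreadStruct { int id; /* per-thread data */ };

static void* worker(void* arg)
{
    // This pointer dangles if the vector reallocated after it was passed in.
    ThreadStruct* ts = static_cast<ThreadStruct*>(arg);
    (void)ts;
    return nullptr;
}

int main()
{
    const int num_threads = 9;
    std::vector<ThreadStruct> tstruct;
    tstruct.reserve(num_threads);            // the fix: element addresses stay stable

    std::vector<pthread_t> threads(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        tstruct.push_back(ThreadStruct{t});
        // Without the reserve() above, a later push_back may reallocate the
        // vector, move every ThreadStruct, and leave this pointer dangling.
        pthread_create(&threads[t], nullptr, worker, &tstruct[t]);
    }
    for (int t = 0; t < num_threads; ++t)
        pthread_join(threads[t], nullptr);
    return 0;
}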
#include <iostream>
#include <pthread.h>

using namespace std;

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void* Func(void *)
{
    pthread_mutex_lock(&mutex);
    cout << "First thread execution" << endl;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

int main()
{
    pthread_t th1;
    pthread_create(&th1, NULL, Func, NULL);

    pthread_mutex_lock(&mutex);
    cout << "In main thread" << endl;
    pthread_mutex_unlock(&mutex);

    // pthread_join(th1, NULL); // Note this code is commented
    return 0;
}
I have executed the following program on Linux Fedora 22 (and also on http://www.cpp.sh/) around 20 times, and out of those 20 executions I have seen the following outputs:
Output1:
In main thread
First thread execution
Output2:
First thread execution
In main thread
Output3:
In main thread
Output4:
In main thread
First thread execution
First thread execution
Outputs 1 to 3 are expected, since the main thread does not wait for the child thread to exit. The execution order of the two threads (main and child) depends entirely on kernel thread scheduling.
But output 4 is strange!!! "First thread execution" gets printed two times!!!
If I run the program after un-commenting 'pthread_join(th1, NULL)', or after adding 'pthread_exit(NULL)', I never get the strange output (i.e. "First thread execution" is never printed twice), even if I run the code 10000 times.
My questions to the experts are:
Without pthread_join/pthread_exit, what is happening behind the scenes so that "First thread execution" gets printed twice?
The responsibility of pthread_join is to get the exit code of a particular thread, and after a successful call to pthread_join the kernel frees that thread's resources. If I do not call pthread_join on a joinable thread it results in a resource leak, but why the strange behavior above??
We might say this is undefined behavior, but it would be great if an expert could provide a technical explanation.
How can pthread_join/pthread_exit prevent the strange behavior above? What hidden thing do they do here so that the strange behavior doesn't appear?
Thanks to experts in advance..
I've observed this kind of double printing in a similar situation. Your thread was waiting in the write system call, doing its normal output, specifically in this stack:
#0 0x00007ffff78f4640 in write () from /lib64/libc.so.6
#1 0x00007ffff788fb93 in _IO_file_write () from /lib64/libc.so.6
#2 0x00007ffff788fa72 in new_do_write () from /lib64/libc.so.6
#3 0x00007ffff7890e05 in _IO_do_write () from /lib64/libc.so.6
#4 0x00007ffff789114f in _IO_file_overflow () from /lib64/libc.so.6
when the program was terminated normally. Normal termination caused the stdio subsystem to flush all buffers; the output buffer for stdout had not yet been marked empty (the write system call had not returned), so it was written out again:
#0 0x00007ffff78f4640 in write () from /lib64/libc.so.6
#1 0x00007ffff788fb93 in _IO_file_write () from /lib64/libc.so.6
#2 0x00007ffff788fa72 in new_do_write () from /lib64/libc.so.6
#3 0x00007ffff7890e05 in _IO_do_write () from /lib64/libc.so.6
#4 0x00007ffff7890140 in _IO_file_sync () from /lib64/libc.so.6
#5 0x00007ffff7891f56 in _IO_default_setbuf () from /lib64/libc.so.6
#6 0x00007ffff7890179 in _IO_file_setbuf () from /lib64/libc.so.6
#7 0x00007ffff7892703 in _IO_cleanup () from /lib64/libc.so.6
#8 0x00007ffff78512f8 in __run_exit_handlers () from /lib64/libc.so.
In any case, join your threads (if you had used C++ std::thread, it would have reminded you to do that) or otherwise synchronize access to the output stream.
The main thread might end earlier than the spawned thread.
Returning from the main thread ends the whole process, and all other threads are brought down abruptly. This can invoke undefined behaviour, so anything may happen.
To get around this (see the sketch below), either
join the spawned thread using pthread_join() from main(), or
end the main thread using pthread_exit(), which ends only the main thread and keeps the process alive.
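A sketch of the pthread_join variant, applied to the main() from the question (only main() changes; Func and mutex are as defined above):

int main()
{
    pthread_t th1;
    pthread_create(&th1, NULL, Func, NULL);

    pthread_mutex_lock(&mutex);
    cout << "In main thread" << endl;
    pthread_mutex_unlock(&mutex);

    // Wait for the worker before returning; returning from main() ends the
    // whole process and can flush stdio buffers while th1 is still writing.
    pthread_join(th1, NULL);
    return 0;
    // Alternative: replace the join + return with pthread_exit(NULL), which
    // ends only the main thread and lets th1 and the process finish cleanly.
}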
I wrote a program which uses massively parallel execution. I am working with an array of objects and an array of mutexes for synchronization. My code looks something like this:
std::vector<MyObject> objects;
std::vector<std::mutex> mutexes;
void work(int data)
{
    for (unsigned int i = 0; i < objects.size(); ++i)
    {
        // Check if data needs to be processed for objects[i]
        if (dontNeedToProcess) continue;

        mutexes[i].lock();
        // Work with data for objects[i]
        mutexes[i].unlock();
    }
}
The function "work" is called by multiple threads with different data. After some hours (sometimes even days) the program is stuck. When I run it with gdb I can see that the program hangs while locking the mutex.
The Problem is now that I compiled the progrem with optimization (-O2) and the mutexes and "i" are optimized out.
Is it possible that the optimization causes this behavior when using an Array of mutexes?
Edit:
All the threads are at the same position. The backtrace looks like the following:
#0 0xb7779d3c in __kernel_vsyscall ()
#1 0xb67ff672 in __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144
#2 0xb67fb1c2 in _L_lock_920 ()
from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3 0xb67fb043 in __GI___pthread_mutex_lock (mutex=0x9de56530)
at ../nptl/pthread_mutex_lock.c:114
#4 0xb69043c4 in pthread_mutex_lock (mutex=0x9de56530) at forward.c:192
#5 0xb72754f3 in pthread_mutex_lock ()
from /usr/lib/i386-linux-gnu/libasan.so.1
#6 0x0809ac85 in __gthread_mutex_lock (__mutex=<optimized out>)
at /usr/include/i386-linux-gnu/c++/4.9/bits/gthr-default.h:748
#7 lock (this=<optimized out>) at /usr/include/c++/4.9/mutex:135
...