deadlock in multi-threaded program between malloc and ctime_r - c++

I have a C++ program (running on Linux - Ubuntu 12.04 - gcc compiler), and i am getting a deadlock between 2 threads
T1 backtrace:
#0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
#1 0x00007f24bd454251 in _L_lock_10628 () at malloc.c:5253
#2 0x00007f24bd451f77 in __GI___libc_malloc (bytes=139793535074336) at malloc.c:2921
#3 0x00007f24bd457da2 in __GI___strdup (s=0x7f24bd54a4c2 "/etc/localtime") at strdup.c:43
T2 backtrace:
#0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
#1 0x00007f24bd480ee4 in _L_lock_2180 () at tzset.c:616
#2 0x00007f24bd480cf7 in __tz_convert (timer=0x7f24bd789ee8, use_localtime=1, tp=0x7f2455a0bc70) at tzset.c:619
#3 0x00007f24bd47e570 in ctime_r (t=<optimized out>, buf=0x7f2455a0bcf0 "\364\230\005\277$\177") at ctime_r.c:29
T2 is called from a static library i am using.
I also read ctime_r is not thread-safe.
How can i avoid deadlock on such case ?

Related

why is std::terminate not called when throwing in my destructor

I'm trying to "see" the call to std::terminate() when throwing from a destructor with the following code:
#include <stdexcept>
struct boom {
~boom() {
throw std::logic_error("something went wrong");
}
};
int main() {
boom();
}
compiled with g++ and run the code:
# ./a.out
terminate called after throwing an instance of 'std::logic_error'
what(): something went wrong
Aborted (core dumped)
so far so good, seems to work as expected (call terminate()). But when trying to break on terminate in gdb, the function does not get hit and the backtrace shows only abort:
(gdb) b std::terminate
Breakpoint 2 at 0x7f60a3772240 (2 locations)
(gdb) r
Starting program: a.out
terminate called after throwing an instance of 'std::logic_error'
what(): something went wrong
Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig#entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig#entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007f2c241eb921 in __GI_abort () at abort.c:79
#2 0x00007f2c24628957 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007f2c2462eae6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007f2c2462db49 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007f2c2462e4b8 in __gxx_personality_v0 () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f2c23c05573 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7 0x00007f2c23c05ad1 in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#8 0x00007f2c2462ed47 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x0000558cee09593a in boom::~boom() ()
#10 0x0000558cee0958dd in main ()
(gdb)
This behavior is puzzling me somehow, is there something I am missing?
EDIT:
same test using clang++ and the std::terminate breakpoint is hit:
Breakpoint 1, 0x0000000000400750 in std::terminate()#plt ()
(gdb) bt
#0 0x0000000000400750 in std::terminate()#plt ()
#1 0x00000000004009bf in __clang_call_terminate ()
#2 0x000000000187eef0 in ?? ()
#3 0x00000000004009a4 in boom::~boom() ()
#4 0x00000000004008f1 in main ()
(gdb)
You are running into the as-if rule.
C++ compilers are allowed to essentially rewrite your entire program as they see fit as long as the side effects remain the same.
What a side effect is very well defined, and "a certain function gets called" is not part of that. So the compiler is perfectly allowed to just call abort() directly if it can determine that this is the only side effect std::terminate() would have on the program.
You can get more details at https://en.cppreference.com/w/cpp/language/as_if

Core dump analysis using gdb shows failure in new

From the back trace of gdb am getting the below stack trace.
What might be the reason .
Is it that new failed to allocate memory or something else.
#0 0x00c69eff in raise () from /lib/tls/libc.so.6
#1 0x00c6b705 in abort () from /lib/tls/libc.so.6
#2 0x005564f7 in __cxa_call_unexpected () from /usr/lib/libstdc++.so.5
#3 0x00556544 in std::terminate () from /usr/lib/libstdc++.so.5
#4 0x005566b6 in __cxa_throw () from /usr/lib/libstdc++.so.5
#5 0x005568d2 in operator new () from /usr/lib/libstdc++.so.5
#6 0x005569bf in operator new[] () from /usr/lib/libstdc++.so.5

Suppression file for ThreadSanitizer does not work: What is wrong?

Clang has the -fsanitize-blacklist compile switch to suppress warnings from the ThreadSanitizer. Unfortunately, I cannot get it to work.
Here is an example that I want to suppress:
WARNING: ThreadSanitizer: data race (pid=21502)
Read of size 8 at 0x7f0dcf5b31a8 by thread T6:
#0 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&) /usr/include/tbb/partitioner.h:305 (exe+0x000000388b38)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
Previous write of size 8 at 0x7f0dcf5b31a8 by thread T1:
#0 auto_partition_type_base /usr/include/tbb/partitioner.h:299 (exe+0x000000388d9a)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#2 GhostSearch::Ghost3Search::SearchTask::execute_impl() /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1456 (exe+0x000000387a8a)
#3 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#4 GhostSearch::Ghost3Search::Ghost3SearchAlg::NullWindowSearch(int, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1640 (exe+0x000000388310)
#5 GhostSearch::PureMTDSearchAlg::FullWindowSearch(GhostSearch::SearchWindow, GhostSearch::SearchWindow, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/pure_mtd_search_alg.cpp:41 (exe+0x000000370e3f)
#6 GhostSearch::PureSearchAlgWrapper::RequestHandlerThread::EnterHandlerMainLoop() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:124 (exe+0x000000372d1b)
#7 operator() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:94 (exe+0x000000374683)
#8 execute_native_thread_routine /home/phil/tmp/gcc/src/gcc-4.8-20130725/libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000b26cf)
Thread T6 (tid=21518, running) created by thread T3 at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 <null> <null>:0 (libtbb.so.2+0x0000000198c0)
Thread T1 (tid=21513, running) created by main thread at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 __gthread_create /home/phil/tmp/gcc/src/gcc-build/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:662 (libstdc++.so.6+0x0000000b291e)
#2 GhostSearch::PureSearchAlgWrapper::StartRequestHandlerThread() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:77 (exe+0x0000003715c3)
#3 GhostSearch::Search::ExecuteSearch(GhostSearch::SEARCH_SETTINGS const&) /home/phil/ghost/search.cpp:243 (exe+0x00000033063f)
#4 GhostSearch::Search::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&, GhostInterfaces::UserInterface*) /home/phil/ghost/search.cpp:176 (exe+0x00000033037a)
#5 GhostInterfaces::UserInterface::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&) /home/phil/ghost/interface.cpp:1072 (exe+0x0000002ea220)
#6 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:576 (exe+0x0000002e9464)
#7 GhostInterfaces::Command_Analyze::Execute(GhostInterfaces::UserInterfaceData&) /home/phil/ghost/commands.cpp:1005 (exe+0x00000028756c)
#8 GhostInterfaces::UserInterface::FinishNextCommand() /home/phil/ghost/interface.cpp:1161 (exe+0x0000002e9ed0)
#9 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:571 (exe+0x0000002e9447)
#10 main /home/phil/ghost/ghost.cpp:54 (exe+0x000000274efd)
SUMMARY: ThreadSanitizer: data race /usr/include/tbb/partitioner.h:305 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&)
My tries for the suppression file so far (but it does not work):
# TBB
fun:tbb::*
src:/usr/include/tbb/partitioner.h
Do you know why it does not work?
(By the way, I would be happy to suppress all warnings from the TBB library.)
Finally, I got it working.
According to the documentation, each line must start with a valid "suppression_type" (race, thread, mutex, signal, deadlock, or called_from_lib).
In my example, the correct suppression_type is race.
Here is an example file called "sanitizer-thread-suppressions.txt", which suppresses two functions, which are known to contain data races:
race:Function1
race:MyNamespace::Function2
To test the suppress file, set the TSAN_OPTIONS environment variable and call the application (compiled with -fsanitize=thread):
$ TSAN_OPTIONS="suppressions=sanitizer-thread-suppressions.txt" ./myapp
If that works, you can apply the settings at compile time:
-fsanitize=thread -fsanitize-blacklist=sanitizer-thread-suppressions.txt

notify_all causes segmentation fault

I am using boost threads, upon calling notify_all() within the destructor i am seeing a segmentation fault. Here is the stack:
(gdb) where
#0 0x00007ffff752de84 in pthread_mutex_lock ()
from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007fffe85ab22e in boost::pthread::pthread_mutex_scoped_lock::pthread_mutex_scoped_lock (this=0x7fffffffdba0, m_=0x0)
at /usr/include/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:26
#2 0x00007fffe85abb5d in boost::condition_variable::notify_one (this=0x0)
at /usr/include/boost/thread/pthread/condition_variable.hpp:88
#3 0x00007fffe8690864 in CampaignFrequency::stopFlushThread (this=0x6ad590)
at /home/git/gitRTB/infinityplus/src/common/shm/CampaignFrequency.cpp:197
#4 0x00007fffe868ffd7 in CampaignFrequency::~CampaignFrequency (
this=0x6ad590, __in_chrg=<optimised out>)
at /home/git/gitRTB/infinityplus/src/common/shm/CampaignFrequency.cpp:81
#5 0x00007fffe85bdc37 in rtb_child_init (s=0x7ffff7fc3238)
at /home/git/gitRTB/infinityplus/src/bidder/mod_rtb.cpp:265
#6 0x000000000044784c in ap_run_child_init ()
#7 0x000000000042817c in ?? ()
#8 0x0000000000463594 in ?? ()
#9 0x00000000004635f4 in ?? ()
#10 0x00000000004643fd in ?? ()
#11 0x000000000042f026 in ap_run_mpm ()
#12 0x0000000000428d74 in main ()
Without actually seeing the code, this is mostly conjecture.
From your debug:
#2 0x00007fffe85abb5d in boost::condition_variable::notify_one (this=0x0)
at /usr/include/boost/thread/pthread/condition_variable.hpp:88
It's saying that this (inside the condition variable) is nullptr. It appears that you are calling cv->notify_all() where cv is nullptr (aka 0). Is it possible you are deleting the condition variable prior to trying to use it?

Is it possible to programmatically cause dlopen() in OS X to return NULL?

Let's call it x.dylib. I only want x.dylib to be loaded sometimes.
In the initialization of the dylib is there any way to have some logic which would cause the dlopen() call that tried to load x.dylib to fail to load x.dylib and return NULL?
Renaming x.dylib is not an option.
I looked through http://opensource.apple.com/source/dyld/dyld-210.2.3/src/dyldAPIs.cpp but I am unfamiliar with the code.
I thought maybe this would do it:
__attribute__((constructor))
void initializer(void) {
fprintf(stderr, "initializer\n");
throw;
}
but when I call dlopen() on the dylib with that initializer, i just get "initializer
terminate called without an active exceptionAbort trap: 6"
So I'm stumped; any help would be great.
Edit:
The stack trace, when viewed with gdb is as follows:
Program received signal SIGABRT, Aborted.
0x00007fff9128a82a in __kill ()
(gdb) bt
#0 0x00007fff9128a82a in __kill ()
#1 0x00007fff93539a9c in abort ()
#2 0x00007fff987f07bc in abort_message ()
#3 0x00007fff987edfcf in default_terminate ()
#4 0x00007fff987ee001 in safe_handler_caller ()
#5 0x00007fff987ee05c in std::terminate ()
#6 0x00007fff987ef152 in __cxa_throw ()
#7 0x0000000100003eb5 in initializer ()
#8 0x00007fff5fc0fda6 in __dyld__ZN16ImageLoaderMachO18doModInitFunctionsERKN11ImageLoader11LinkContextE ()
#9 0x00007fff5fc0faf2 in __dyld__ZN16ImageLoaderMachO16doInitializationERKN11ImageLoader11LinkContextE ()
#10 0x00007fff5fc0d2e4 in __dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEjRNS_21InitializerTimingListE ()
#11 0x00007fff5fc0d27d in __dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEjRNS_21InitializerTimingListE ()
#12 0x00007fff5fc0e0b7 in __dyld__ZN11ImageLoader15runInitializersERKNS_11LinkContextERNS_21InitializerTimingListE ()
#13 0x00007fff5fc034dd in __dyld__ZN4dyld24initializeMainExecutableEv ()
#14 0x00007fff5fc0760b in __dyld__ZN4dyld5_mainEPK12macho_headermiPPKcS5_S5_ ()
#15 0x00007fff5fc01059 in __dyld__dyld_start ()
which come from:
http://opensource.apple.com/source/dyld/dyld-210.2.3/src/dyld.cpp
http://opensource.apple.com/source/dyld/dyld-210.2.3/src/ImageLoader.cpp
http://opensource.apple.com/source/dyld/dyld-210.2.3/src/ImageLoaderMachO.cpp
I'm surprised that I don't see dlopen() in the stack trace though.