Is armadillo solve() thread safe? - c++

In my code I have a loop in which I construct an overdetermined linear system and try to solve it:
#pragma omp parallel for
for (int i = 0; i < n[0]+1; i++) {
for (int j = 0; j < n[1]+1; j++) {
for (int k = 0; k < n[2]+1; k++) {
arma::mat A(max_points, 2);
arma::mat y(max_points, 1);
// initialize A and y
arma::vec solution = solve(A,y);
}
}
}
Sometimes, quite randomly, the program hangs or the solution vector contains NaNs. If I wrap the call in a critical section like this:
arma::vec solution;
#pragma omp critical
{
solution = solve(weights*A,weights*y);
}
then these problems don't seem to happen anymore.
When it hangs, it does so because some threads are waiting at the OpenMP barrier:
Thread 2 (Thread 0x7fe4325a5700 (LWP 39839)):
#0 0x00007fe44d3c2084 in gomp_team_barrier_wait_end () from /usr/lib64/gcc-4.9.2/lib64/gcc/x86_64-redhat-linux-gnu/4.9.2/libgomp.so.1
#1 0x00007fe44d3bf8c2 in gomp_thread_start () at ../.././libgomp/team.c:118
#2 0x0000003f64607851 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003f642e890d in clone () from /lib64/libc.so.6
And the other threads are stuck inside Armadillo:
Thread 1 (Thread 0x7fe44afe2e60 (LWP 39800)):
#0 0x0000003ee541f748 in dscal_ () from /usr/lib64/libblas.so.3
#1 0x00007fe44c0d3666 in dlarfp_ () from /usr/lib64/atlas/liblapack.so.3
#2 0x00007fe44c058736 in dgelq2_ () from /usr/lib64/atlas/liblapack.so.3
#3 0x00007fe44c058ad9 in dgelqf_ () from /usr/lib64/atlas/liblapack.so.3
#4 0x00007fe44c059a32 in dgels_ () from /usr/lib64/atlas/liblapack.so.3
#5 0x00007fe44f09fb3d in bool arma::auxlib::solve_ud<double, arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times> >(arma::Mat<double>&, arma::Mat<double>&, arma::Base<double, arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times> > const&) () at /usr/include/armadillo_bits/lapack_wrapper.hpp:677
#6 0x00007fe44f0a0f87 in arma::Col<double>::Col<arma::Glue<arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times>, arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times>, arma::glue_solve> >(arma::Base<double, arma::Glue<arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times>, arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times>, arma::glue_solve> > const&) ()
at /usr/include/armadillo_bits/glue_solve_meat.hpp:39
As you can see from the stack trace, my version of Armadillo uses ATLAS. According to this documentation, ATLAS seems to be thread safe: ftp://lsec.cc.ac.cn/netlib/atlas/faq.html#tsafe
Update 9/11/2015
I finally got some time to run more tests, based on the suggestions of Vladimir F.
When I compile Armadillo with ATLAS's BLAS, I'm still able to reproduce the hangs and the NaNs. When it hangs, the only thing that changes in the stack trace is the call to BLAS:
#0 0x0000003fa8054718 in ATL_dscal_xp1yp0aXbX#plt () from /usr/lib64/atlas/libatlas.so.3
#1 0x0000003fb05e7666 in dlarfp_ () from /usr/lib64/atlas/liblapack.so.3
#2 0x0000003fb0576a61 in dgeqr2_ () from /usr/lib64/atlas/liblapack.so.3
#3 0x0000003fb0576e06 in dgeqrf_ () from /usr/lib64/atlas/liblapack.so.3
#4 0x0000003fb056d7d1 in dgels_ () from /usr/lib64/atlas/liblapack.so.3
#5 0x00007ff8f3de4c34 in void arma::lapack::gels<double>(char*, int*, int*, int*, double*, int*, double*, int*, double*, int*, int*) () at /usr/include/armadillo_bits/lapack_wrapper.hpp:677
#6 0x00007ff8f3de1787 in bool arma::auxlib::solve_od<double, arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times> >(arma::Mat<double>&, arma::Mat<double>&, arma::Base<double, arma::Glue<arma::Mat<double>, arma::Mat<double>, arma::glue_times> > const&) () at /usr/include/armadillo_bits/auxlib_meat.hpp:3434
Compiling without ATLAS, only with netlib BLAS and LAPACK, I was able to reproduce the NaNs but not the hangs.
In both cases, if I surround solve() with #pragma omp critical, I have no problems at all.

Are you sure your systems are overdetermined? solve_ud in your stack trace says otherwise. You have solve_od too, though, and that probably has nothing to do with the issue. Still, it doesn't hurt to find out why that's happening and fix it if you think the systems should be overdetermined.
Is armadillo solve() thread safe?
That I think depends on your lapack version, also see this. Looking at the code of solve_od all the variables accessed seem to be local. Note the warning in the code:
NOTE: the dgels() function in the lapack library supplied by ATLAS 3.6
seems to have problems
Thus it seems only lapack::gels can cause trouble for you. If fixing lapack is not possible, a workaround is to stack your systems and solve a single large system. That probably would be even more efficient if your individual systems are small.

The thread safety of Armadillo's solve() function depends (only) on the BLAS library that you use. The LAPACK implementations are thread safe when BLAS is. The Armadillo solve() function is not thread safe when linking to the reference BLAS library. However, it is thread safe when using OpenBLAS. Additionally, ATLAS provides a BLAS implementation that also mentions it is thread safe, and the Intel MKL is thread safe as well, but I have no experience with Armadillo linked to those libraries.
Of course, this only applies when you run solve() from multiple threads with different data.
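For the systems in the question, A has only two columns, so one way to sidestep the BLAS/LAPACK build entirely is to solve the 2 x 2 normal equations by hand. The following is only a sketch under that assumption: the names Fit2, System2, solve_two_column and solve_all are made up for illustration and are not part of Armadillo, and the normal equations are numerically less robust than the QR factorization that dgels uses, so treat it as a cross-check rather than a drop-in replacement.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of fitting y ~= x0 * a0 + x1 * a1 (hypothetical helper type).
struct Fit2 {
    double x0 = 0.0;
    double x1 = 0.0;
    bool ok = false;  // stays false when the columns are (nearly) collinear
};

// One overdetermined system with a two-column matrix A = [a0 a1].
struct System2 {
    std::vector<double> a0, a1, y;
};

// Least-squares solution via the normal equations (A^T A) x = A^T y.
// Only local variables are used, so concurrent calls on different data
// do not share any state.
Fit2 solve_two_column(const std::vector<double>& a0,
                      const std::vector<double>& a1,
                      const std::vector<double>& y) {
    Fit2 r;
    const std::size_t m = a0.size();
    if (m < 2 || a1.size() != m || y.size() != m) return r;

    double s00 = 0.0, s01 = 0.0, s11 = 0.0, b0 = 0.0, b1 = 0.0;
    for (std::size_t i = 0; i < m; ++i) {
        s00 += a0[i] * a0[i];
        s01 += a0[i] * a1[i];
        s11 += a1[i] * a1[i];
        b0 += a0[i] * y[i];
        b1 += a1[i] * y[i];
    }

    // Relative test for a singular 2 x 2 matrix; also rejects NaN input.
    const double det = s00 * s11 - s01 * s01;
    if (!(std::fabs(det) > 1e-12 * s00 * s11)) return r;

    r.x0 = (s11 * b0 - s01 * b1) / det;
    r.x1 = (s00 * b1 - s01 * b0) / det;
    r.ok = true;
    return r;
}

// Solves many independent systems; each iteration writes only its own slot.
// Without -fopenmp the pragma is ignored and the loop runs serially.
std::vector<Fit2> solve_all(const std::vector<System2>& systems) {
    std::vector<Fit2> out(systems.size());
    const long n = static_cast<long>(systems.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        out[i] = solve_two_column(systems[i].a0, systems[i].a1, systems[i].y);
    }
    return out;
}
```

Checking the ok flag for every system also gives an explicit failure signal instead of silently propagating NaNs, and comparing these results with those of solve() may help show whether the NaNs come from the library or from the input data.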

Related

tbb's private_server and false positive ThreadSanitizer data races

We frequently, but inconsistently, get false positive ThreadSanitizer (tsan) data race warnings. It is well known that tsan can give false positives, some of which can be suppressed via the TSAN_OPTIONS environment variable. However, one class of warnings we encounter appears to be specifically related to the use of tbb::detail::r1::rml::private_server by Intel's Threading Building Blocks (TBB), and it looks preventable if we had more control over when this private_server is stopped. Here is one such false positive tsan data race warning encountered during a Google Test run:
WARNING: ThreadSanitizer: data race (pid=5244)
Write of size 1 at 0x7ffda4d64fd8 by main thread:
#0 std::shared_lock<std::shared_mutex>::shared_lock(std::shared_mutex&, std::defer_lock_t) /usr/local/foo-deps/20220316/include/c++/9.4.0/shared_mutex:639 (FooTest+0x68d162)
#1 FooProxy::buildTranslationMapToOtherProxy(FooProxy*, std::vector<foo::StringOpInfo, std::allocator<foo::StringOpInfo> > const&) const /home/jenkins-slave/workspace/core-tsan-gcc/Foo/FooProxy.cpp:323 (FooTest+0x68d162)
#2 FooProxy_BuildTranslationMapToPartialOverlapProxy_Test::TestBody() /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:798 (FooTest+0x5c5284)
#3 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62d798)
#4 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62d798)
#5 testing::Test::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4043 (FooTest+0x618586)
#6 testing::TestInfo::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4219 (FooTest+0x6187d4)
#7 testing::TestSuite::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4351 (FooTest+0x618959)
#8 testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6892 (FooTest+0x618e7e)
#9 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62de38)
#10 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62de38)
#11 testing::UnitTest::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6479 (FooTest+0x619440)
#12 RUN_ALL_TESTS() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gtest/gtest.h:11696 (FooTest+0x5b401a)
#13 main /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:974 (FooTest+0x5b401a)
Previous read of size 8 at 0x7ffda4d64fd8 by thread T18:
[failed to restore the stack]
Location is stack of main thread.
Location is global '<null>' at 0x000000000000 ([stack]+0x00000001efd8)
Thread T18 (tid=5264, running) created by main thread at:
#0 pthread_create ../../.././libsanitizer/tsan/tsan_interceptors.cc:964 (libtsan.so.0+0x2cd6b)
#1 tbb::detail::r1::rml::private_server::wake_some(int) <null> (FooTest+0x8828ce)
#2 tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter>(tbb::detail::d1::task*, tbb::detail::r1::external_waiter&) <null> (FooTest+0x88b1c2)
#3 tbb::detail::r1::task_arena_impl::execute(tbb::detail::d1::task_arena_base&, tbb::detail::d1::delegate_base&) <null> (FooTest+0x86e74c)
#4 Foo::getStringViews() const /home/jenkins-slave/workspace/core-tsan-gcc/Foo/Foo.cpp:1869 (FooTest+0x63612c)
#5 Foo_GetStringViews_Test::TestBody() /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:141 (FooTest+0x5c625c)
#6 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62d798)
#7 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62d798)
#8 testing::Test::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4043 (FooTest+0x618586)
#9 testing::TestInfo::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4219 (FooTest+0x6187d4)
#10 testing::TestSuite::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4351 (FooTest+0x618959)
#11 testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6892 (FooTest+0x618e7e)
#12 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62de38)
#13 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62de38)
#14 testing::UnitTest::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6479 (FooTest+0x619440)
#15 RUN_ALL_TESTS() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gtest/gtest.h:11696 (FooTest+0x5b401a)
#16 main /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:974 (FooTest+0x5b401a)
SUMMARY: ThreadSanitizer: data race /usr/local/foo-deps/20220316/include/c++/9.4.0/shared_mutex:639 in std::shared_lock<std::shared_mutex>::shared_lock(std::shared_mutex&, std::defer_lock_t)
(Some names have been altered for anonymity.)
Summary of events in chronological order:
Google test Foo.GetStringViews is run (Thread T18 frame #5)
During this test, an instance ta of tbb::task_arena calls ta.execute([&] { tbb::parallel_for(...); });.
This appears to run tbb::detail::r1::rml::private_server::wake_some(int) which spawns a thread that survives in between Google tests.
Google test FooProxy.BuildTranslationMapToPartialOverlapProxy is run (main thread frame #2)
This test writes to address 0x7ffda4d64fd8 that was read by the previous test.
Our TSAN_OPTIONS environment variable is set to
suppressions=/path/to/tsan.suppressions, history_size=7, second_deadlock_stack=1, halt_on_error=1
We surmise that the false positive data race warning is due to 3 primary ingredients:
Two independent tests are run synchronously one after the other in which no data race is possible, but happen to read/write or write/write to/from the same memory address.
The stack of one of the threads exceeds the maximum history_size=7, so tsan reports [failed to restore the stack].
The first thread spawns a tbb::detail::r1::rml::private_server that survives through to the second test.
Because the tbb::detail::r1::rml::private_server from the first test is still running during the second test, tsan is confused into flagging this as a data race.
Question(s)
How can the tbb::detail::r1::rml::private_server thread be killed at the beginning or end of each test?
Alternatively, if that's not possible, is there something that we can add to our tsan.suppressions file or TSAN_OPTIONS environment variable that specifically suppresses this false warning without hiding real data races that may occur?
To kill the tbb::detail::r1::rml::private_server after each Google Test, we overrode the Test Fixture TearDown() method:
void TearDown() override {
// Expected to kill tbb::detail::r1::rml::private_server after each test,
// which can otherwise trigger false positive tsan data race warnings.
auto handle = tbb::task_scheduler_handle::get();
tbb::finalize(handle, std::nothrow_t{});
}
In our version of TBB we also had to #define TBB_PREVIEW_WAITING_FOR_WORKERS and #include <tbb/global_control.h>.
Credit: Pavel Kumbrasev for the suggestion.
You can replace the Mach semaphore with a dispatch semaphore to suppress the warnings.
Refer to the below link:
https://developer.apple.com/documentation/dispatch/dispatch_semaphore
You can also create a suppression file to specify the suppressions runtime flag
https://github.com/google/sanitizers/wiki/ThreadSanitizerSuppressions
If this helps, you can apply the settings at compile time:
-fsanitize=thread -fsanitize-blacklist=sanitizer-thread-suppressions.txt

directory_iterator runs into segfault

This is my code:
#include <iostream>
#include <filesystem>
int main(int argc, char *argv[]) {
auto iter = std::filesystem::directory_iterator("foo");
for (auto &entry : iter) {
std::cout << entry.path();
}
}
When I run it and the directory foo exists, I get a SIGSEGV. So I started gdb:
(gdb) run
Starting program: /home/krausefx/a.out
Program received signal SIGSEGV, Segmentation fault.
0x0000555555556a87 in std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector (
this=0x23) at /usr/include/c++/8/bits/stl_vector.h:567
567 std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
(gdb) backtrace
#0 0x0000555555556a87 in std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector (
this=0x23) at /usr/include/c++/8/bits/stl_vector.h:567
#1 0x00005555555566aa in std::filesystem::__cxx11::path::~path (this=0x3) at /usr/include/c++/8/bits/fs_path.h:208
#2 0x0000555555557ebe in std::filesystem::__cxx11::path::_Cmpt::~_Cmpt (this=<incomplete type>) at /usr/include/c++/8/bits/fs_path.h:643
#3 0x0000555555557ed9 in std::_Destroy<std::filesystem::__cxx11::path::_Cmpt> (__pointer=0x3) at /usr/include/c++/8/bits/stl_construct.h:98
#4 0x0000555555557ced in std::_Destroy_aux<false>::__destroy<std::filesystem::__cxx11::path::_Cmpt*> (__first=0x3, __last=0x0)
at /usr/include/c++/8/bits/stl_construct.h:108
#5 0x00005555555576de in std::_Destroy<std::filesystem::__cxx11::path::_Cmpt*> (__first=0x3, __last=0x0)
at /usr/include/c++/8/bits/stl_construct.h:137
#6 0x0000555555556fb9 in std::_Destroy<std::filesystem::__cxx11::path::_Cmpt*, std::filesystem::__cxx11::path::_Cmpt> (__first=0x3, __last=0x0)
at /usr/include/c++/8/bits/stl_construct.h:206
#7 0x0000555555556a9d in std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector (
this=0x7fffffffdcf0) at /usr/include/c++/8/bits/stl_vector.h:567
#8 0x00005555555566aa in std::filesystem::__cxx11::path::~path (this=0x7fffffffdcd0) at /usr/include/c++/8/bits/fs_path.h:208
#9 0x000055555555630d in main (argc=32767, argv=0x7ffff7fadf40 <std::wcout>) at test.cpp:5
(gdb) p this
$1 = (vector * const) 0x23
So apparently, when initializing the directory_iterator, the destructor of std::filesystem::path gets called for some reason, and somewhere in there, the destructor of std::vector is called on a this value of 0x23, which obviously is a bad thing and leads to a SIGSEGV.
What's happening here? Am I doing something wrong? Is this a compiler bug (compiler is g++ 8.3.0)?
I checked, and directory_iterator works fine with GCC 8 on Ubuntu.
Be sure to add the linker flag -lstdc++fs when compiling.
If you don't, compilation succeeds but, at least on my system, I get a segfault as you do when it starts iterating.
I don't think std::filesystem is stable. It caused segfaults and other problems in my project (especially std::filesystem::path in mingw-w64 that ships with msys2). Try updating your gcc package and check if the problem persists. If it does then you can file a bug report or just wait and hope that someone already reported it (in my case updating fixed the problem).
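To tell a linking problem apart from a runtime error such as a missing directory, it can help to iterate with the std::error_code overloads, which report failures without throwing. This is a minimal sketch; the helper name list_entries is made up for illustration. With GCC 8 it still has to be linked with -lstdc++fs (for example g++ -std=c++17 test.cpp -lstdc++fs); from GCC 9 on the extra library is no longer needed.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Collects the file names found directly inside `dir`.
// Errors (for example a missing directory) are reported through `ec`
// instead of an exception; the names gathered so far are returned.
std::vector<std::string> list_entries(const fs::path& dir, std::error_code& ec) {
    std::vector<std::string> names;
    fs::directory_iterator it(dir, ec);   // becomes the end iterator on error
    const fs::directory_iterator end;
    while (!ec && it != end) {
        names.push_back(it->path().filename().string());
        it.increment(ec);                 // non-throwing equivalent of ++it
    }
    return names;
}
```

A caller would check ec after the call and print ec.message() when it is set. If this version also crashes inside the path destructor, that points at the build (compiler and library mismatch or the missing -lstdc++fs) rather than at the directory contents.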

deadlock in multi-threaded program between malloc and ctime_r

I have a C++ program (running on Linux, Ubuntu 12.04, compiled with gcc), and I am getting a deadlock between two threads:
T1 backtrace:
#0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
#1 0x00007f24bd454251 in _L_lock_10628 () at malloc.c:5253
#2 0x00007f24bd451f77 in __GI___libc_malloc (bytes=139793535074336) at malloc.c:2921
#3 0x00007f24bd457da2 in __GI___strdup (s=0x7f24bd54a4c2 "/etc/localtime") at strdup.c:43
T2 backtrace:
#0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
#1 0x00007f24bd480ee4 in _L_lock_2180 () at tzset.c:616
#2 0x00007f24bd480cf7 in __tz_convert (timer=0x7f24bd789ee8, use_localtime=1, tp=0x7f2455a0bc70) at tzset.c:619
#3 0x00007f24bd47e570 in ctime_r (t=<optimized out>, buf=0x7f2455a0bcf0 "\364\230\005\277$\177") at ctime_r.c:29
T2 is called from a static library I am using.
I also read that ctime_r is not thread-safe.
How can I avoid a deadlock in such a case?

Suppression file for ThreadSanitizer does not work: What is wrong?

Clang has the -fsanitize-blacklist compile switch to suppress warnings from the ThreadSanitizer. Unfortunately, I cannot get it to work.
Here is an example that I want to suppress:
WARNING: ThreadSanitizer: data race (pid=21502)
Read of size 8 at 0x7f0dcf5b31a8 by thread T6:
#0 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&) /usr/include/tbb/partitioner.h:305 (exe+0x000000388b38)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
Previous write of size 8 at 0x7f0dcf5b31a8 by thread T1:
#0 auto_partition_type_base /usr/include/tbb/partitioner.h:299 (exe+0x000000388d9a)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#2 GhostSearch::Ghost3Search::SearchTask::execute_impl() /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1456 (exe+0x000000387a8a)
#3 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#4 GhostSearch::Ghost3Search::Ghost3SearchAlg::NullWindowSearch(int, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1640 (exe+0x000000388310)
#5 GhostSearch::PureMTDSearchAlg::FullWindowSearch(GhostSearch::SearchWindow, GhostSearch::SearchWindow, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/pure_mtd_search_alg.cpp:41 (exe+0x000000370e3f)
#6 GhostSearch::PureSearchAlgWrapper::RequestHandlerThread::EnterHandlerMainLoop() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:124 (exe+0x000000372d1b)
#7 operator() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:94 (exe+0x000000374683)
#8 execute_native_thread_routine /home/phil/tmp/gcc/src/gcc-4.8-20130725/libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000b26cf)
Thread T6 (tid=21518, running) created by thread T3 at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 <null> <null>:0 (libtbb.so.2+0x0000000198c0)
Thread T1 (tid=21513, running) created by main thread at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 __gthread_create /home/phil/tmp/gcc/src/gcc-build/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:662 (libstdc++.so.6+0x0000000b291e)
#2 GhostSearch::PureSearchAlgWrapper::StartRequestHandlerThread() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:77 (exe+0x0000003715c3)
#3 GhostSearch::Search::ExecuteSearch(GhostSearch::SEARCH_SETTINGS const&) /home/phil/ghost/search.cpp:243 (exe+0x00000033063f)
#4 GhostSearch::Search::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&, GhostInterfaces::UserInterface*) /home/phil/ghost/search.cpp:176 (exe+0x00000033037a)
#5 GhostInterfaces::UserInterface::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&) /home/phil/ghost/interface.cpp:1072 (exe+0x0000002ea220)
#6 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:576 (exe+0x0000002e9464)
#7 GhostInterfaces::Command_Analyze::Execute(GhostInterfaces::UserInterfaceData&) /home/phil/ghost/commands.cpp:1005 (exe+0x00000028756c)
#8 GhostInterfaces::UserInterface::FinishNextCommand() /home/phil/ghost/interface.cpp:1161 (exe+0x0000002e9ed0)
#9 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:571 (exe+0x0000002e9447)
#10 main /home/phil/ghost/ghost.cpp:54 (exe+0x000000274efd)
SUMMARY: ThreadSanitizer: data race /usr/include/tbb/partitioner.h:305 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&)
My tries for the suppression file so far (but it does not work):
# TBB
fun:tbb::*
src:/usr/include/tbb/partitioner.h
Do you know why it does not work?
(By the way, I would be happy to suppress all warnings from the TBB library.)
Finally, I got it working.
According to the documentation, each line must start with a valid "suppression_type" (race, thread, mutex, signal, deadlock, or called_from_lib).
In my example, the correct suppression_type is race.
Here is an example file called "sanitizer-thread-suppressions.txt", which suppresses two functions, which are known to contain data races:
race:Function1
race:MyNamespace::Function2
To test the suppression file, set the TSAN_OPTIONS environment variable and call the application (compiled with -fsanitize=thread):
$ TSAN_OPTIONS="suppressions=sanitizer-thread-suppressions.txt" ./myapp
If that works, you can apply the settings at compile time:
-fsanitize=thread -fsanitize-blacklist=sanitizer-thread-suppressions.txt

Using GCC's function instrumentation, why does using C++ STL containers or stream I/O cause a segfault?

I recently read about using GCC's code generation features (specifically, the -finstrument-functions compiler flag) to easily add instrumentation to my programs. I thought it sounded really cool and went to try it out on a previous C++ project. After several revisions of my patch, I found that any time I tried to use an STL container or print to stdout using C++ stream I/O, my program would immediately crash with a segfault. My first idea was to maintain a std::list of Event structs
typedef struct
{
unsigned char event_code;
intptr_t func_addr;
intptr_t caller_addr;
pthread_t thread_id;
timespec ts;
}Event;
list<Event> events;
which would be written to a file when the program terminated. GDB told me that when I tried to add an Event to the list, calling events.push_back(ev) itself initiated an instrumentation call. This wasn't terribly surprising and made sense after I thought about it for a bit, so on to plan 2.
The example in the blog which got me involved in all this mess didn't do anything crazy, it simply wrote a string to a file using fprintf(). I didn't think there would be any harm in using C++'s stream-based I/O instead of the older (f)printf(), but that assumption proved to be wrong. This time, instead of a nearly-infinite death spiral, GDB reported a fairly normal-looking descent into the standard library... followed by a segfault.
A Short Example
#include <list>
#include <string>
#include <iostream>
#include <stdio.h>
using namespace std;
extern "C" __attribute__ ((no_instrument_function)) void __cyg_profile_func_enter(void*, void*);
list<string> text;
extern "C" void __cyg_profile_func_enter(void* /* unused */, void* /* unused */)
{
// Method 1
text.push_back("NOPE");
// Method 2
cout << "This explodes" << endl;
// Method 3
printf("This works!");
}
Sample GDB Backtrace
Method 1
#0 _int_malloc (av=0x7ffff7380720, bytes=29) at malloc.c:3570
#1 0x00007ffff704ca45 in __GI___libc_malloc (bytes=29) at malloc.c:2924
#2 0x00007ffff7652ded in operator new(unsigned long) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff763ba89 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff763d495 in char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff763d5e3 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00000000004028c1 in __cyg_profile_func_enter () at src/instrumentation.cpp:82
#7 0x0000000000402c6f in std::move<std::string&> (__t=...) at /usr/include/c++/4.6/bits/move.h:82
#8 0x0000000000402af5 in std::list<std::string, std::allocator<std::string> >::push_back(std::string&&) (this=0x6055c0, __x=...) at /usr/include/c++/4.6/bits/stl_list.h:993
#9 0x00000000004028d2 in __cyg_profile_func_enter () at src/instrumentation.cpp:82
#10 0x0000000000402c6f in std::move<std::string&> (__t=...) at /usr/include/c++/4.6/bits/move.h:82
#11 0x0000000000402af5 in std::list<std::string, std::allocator<std::string> >::push_back(std::string&&) (this=0x6055c0, __x=...) at /usr/include/c++/4.6/bits/stl_list.h:993
#12 0x00000000004028d2 in __cyg_profile_func_enter () at src/instrumentation.cpp:82
#13 0x0000000000402c6f in std::move<std::string&> (__t=...) at /usr/include/c++/4.6/bits/move.h:82
#14 0x0000000000402af5 in std::list<std::string, std::allocator<std::string> >::push_back(std::string&
...
Method 2
#0 0x00007ffff76307d1 in std::ostream::sentry::sentry(std::ostream&) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00007ffff7630ee9 in std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007ffff76312ef in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x000000000040251e in __cyg_profile_func_enter () at src/instrumentation.cpp:81
#4 0x000000000040216d in _GLOBAL__sub_I__ZN8GLWindow7attribsE () at src/glwindow.cpp:164
#5 0x0000000000402f2d in __libc_csu_init ()
#6 0x00007ffff6feb700 in __libc_start_main (main=0x402cac <main()>, argc=1, ubp_av=0x7fffffffe268,
init=0x402ed0 <__libc_csu_init>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7fffffffe258) at libc-start.c:185
#7 0x0000000000401589 in _start ()
Environment:
Ubuntu Linux 12.04 (x64)
GCC 4.6.3
Intel 3750K CPU
8GB RAM
The problem with using cout in the instrumentation function is that the instrumentation function is being called by __libc_csu_init() which is a very early part of the runtime's initialization - before global C++ objects get a chance to be constructed (in fact, I think __libc_csu_init() is responsible for kicking off those constructors - at least indirectly).
So cout hasn't had a chance to be constructed yet and trying to use it doesn't work very well...
And that may well be the problem you run into with trying to use std::list after fixing the infinite recursion (mentioned in Dave S' answer).
If you're willing to lose some instrumentation during initialization, you can do something like:
#include <iostream>
#include <stdio.h>
int initialization_complete = 0;
using namespace std;
extern "C" __attribute__ ((no_instrument_function)) void __cyg_profile_func_enter(void*, void*);
extern "C" void __cyg_profile_func_enter(void* /* unused */, void* /* unused */)
{
if (!initialization_complete) return;
// Method 2
cout << "This explodes" << endl;
// Method 3
printf("This works! ");
}
void foo()
{
cout << "foo()" << endl;
}
int main()
{
initialization_complete = 1;
foo();
}
The first case seems to be an infinite loop, resulting in stack overflow. This is probably because std::list is a template, and its code is generated as part of the translation unit where you're using it. This causes it to get instrumented as well. So you call push_back, which calls the handler, which calls push_back, ...
The second, if I had to guess, might be similar, though it's harder to tell.
The solution is to compile the instrumentation functions separately, without -finstrument-functions. Note that the example blog compiled trace.c separately, without the option.
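Putting both answers together, a hook can avoid the two failure modes by touching nothing that needs a constructor or the heap and by refusing to re-enter itself. The following is a rough sketch, not the blog's code: the names Event, record, recorded_events and dump_events and the buffer size are made up for illustration. It is meant to live in its own file compiled without -finstrument-functions; the no_instrument_function attributes and the re-entrancy flag are only a second line of defense.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>

extern "C" {
// GCC calls these hooks on function entry and exit when the rest of the
// program is built with -finstrument-functions.
void __cyg_profile_func_enter(void* fn, void* caller)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void* fn, void* caller)
    __attribute__((no_instrument_function));
}

namespace {

struct Event {
    void* fn;
    void* caller;
    bool enter;
};

// Everything here is constant-initialized, so it is usable even when a hook
// fires before the constructors of global C++ objects (such as cout) have run.
constexpr std::size_t kMaxEvents = 4096;
Event g_events[kMaxEvents];
std::atomic<std::size_t> g_count{0};
thread_local bool g_in_hook = false;

__attribute__((no_instrument_function))
void record(void* fn, void* caller, bool enter) {
    if (g_in_hook) return;  // a call made from inside the hook: drop it
    g_in_hook = true;
    const std::size_t i = g_count.fetch_add(1);
    if (i < kMaxEvents) g_events[i] = Event{fn, caller, enter};
    g_in_hook = false;
}

}  // namespace

extern "C" void __cyg_profile_func_enter(void* fn, void* caller) {
    record(fn, caller, true);
}

extern "C" void __cyg_profile_func_exit(void* fn, void* caller) {
    record(fn, caller, false);
}

// Number of events stored so far (capped at the buffer size).
__attribute__((no_instrument_function))
std::size_t recorded_events() {
    const std::size_t n = g_count.load();
    return n < kMaxEvents ? n : kMaxEvents;
}

// Writes the stored events with fprintf; intended to be called once the
// program is fully initialized, for example at the end of main().
__attribute__((no_instrument_function))
void dump_events(std::FILE* out) {
    const std::size_t n = recorded_events();
    for (std::size_t i = 0; i < n; ++i) {
        std::fprintf(out, "%c %p %p\n", g_events[i].enter ? 'E' : 'X',
                     g_events[i].fn, g_events[i].caller);
    }
}
```

The fixed-size array trades completeness for safety: once the buffer is full, further events are silently dropped rather than triggering an allocation inside the hook.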