We are getting false-positive ThreadSanitizer (tsan) data race warnings frequently but inconsistently. It is well known that tsan can report false positives, some of which can be suppressed via the TSAN_OPTIONS environment variable. However, one particular class of warnings we encounter appears to be tied to the use of tbb::detail::r1::rml::private_server by Intel's Threading Building Blocks (tbb), and it looks preventable if we had more control over when that private_server is stopped. Here is one such false-positive tsan data race warning, encountered during a Google Test run:
WARNING: ThreadSanitizer: data race (pid=5244)
Write of size 1 at 0x7ffda4d64fd8 by main thread:
#0 std::shared_lock<std::shared_mutex>::shared_lock(std::shared_mutex&, std::defer_lock_t) /usr/local/foo-deps/20220316/include/c++/9.4.0/shared_mutex:639 (FooTest+0x68d162)
#1 FooProxy::buildTranslationMapToOtherProxy(FooProxy*, std::vector<foo::StringOpInfo, std::allocator<foo::StringOpInfo> > const&) const /home/jenkins-slave/workspace/core-tsan-gcc/Foo/FooProxy.cpp:323 (FooTest+0x68d162)
#2 FooProxy_BuildTranslationMapToPartialOverlapProxy_Test::TestBody() /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:798 (FooTest+0x5c5284)
#3 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62d798)
#4 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62d798)
#5 testing::Test::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4043 (FooTest+0x618586)
#6 testing::TestInfo::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4219 (FooTest+0x6187d4)
#7 testing::TestSuite::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4351 (FooTest+0x618959)
#8 testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6892 (FooTest+0x618e7e)
#9 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62de38)
#10 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62de38)
#11 testing::UnitTest::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6479 (FooTest+0x619440)
#12 RUN_ALL_TESTS() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gtest/gtest.h:11696 (FooTest+0x5b401a)
#13 main /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:974 (FooTest+0x5b401a)
Previous read of size 8 at 0x7ffda4d64fd8 by thread T18:
[failed to restore the stack]
Location is stack of main thread.
Location is global '<null>' at 0x000000000000 ([stack]+0x00000001efd8)
Thread T18 (tid=5264, running) created by main thread at:
#0 pthread_create ../../.././libsanitizer/tsan/tsan_interceptors.cc:964 (libtsan.so.0+0x2cd6b)
#1 tbb::detail::r1::rml::private_server::wake_some(int) <null> (FooTest+0x8828ce)
#2 tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter>(tbb::detail::d1::task*, tbb::detail::r1::external_waiter&) <null> (FooTest+0x88b1c2)
#3 tbb::detail::r1::task_arena_impl::execute(tbb::detail::d1::task_arena_base&, tbb::detail::d1::delegate_base&) <null> (FooTest+0x86e74c)
#4 Foo::getStringViews() const /home/jenkins-slave/workspace/core-tsan-gcc/Foo/Foo.cpp:1869 (FooTest+0x63612c)
#5 Foo_GetStringViews_Test::TestBody() /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:141 (FooTest+0x5c625c)
#6 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62d798)
#7 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62d798)
#8 testing::Test::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4043 (FooTest+0x618586)
#9 testing::TestInfo::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4219 (FooTest+0x6187d4)
#10 testing::TestSuite::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4351 (FooTest+0x618959)
#11 testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6892 (FooTest+0x618e7e)
#12 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62de38)
#13 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62de38)
#14 testing::UnitTest::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6479 (FooTest+0x619440)
#15 RUN_ALL_TESTS() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gtest/gtest.h:11696 (FooTest+0x5b401a)
#16 main /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:974 (FooTest+0x5b401a)
SUMMARY: ThreadSanitizer: data race /usr/local/foo-deps/20220316/include/c++/9.4.0/shared_mutex:639 in std::shared_lock<std::shared_mutex>::shared_lock(std::shared_mutex&, std::defer_lock_t)
(Some names have been altered for anonymity.)
Summary of events in chronological order:
Google test Foo.GetStringViews is run (Thread T18 frame #5)
During this test, an instance ta of tbb::task_arena calls ta.execute([&] { tbb::parallel_for(...); });.
This appears to run tbb::detail::r1::rml::private_server::wake_some(int) which spawns a thread that survives in between Google tests.
Google test FooProxy.BuildTranslationMapToPartialOverlapProxy is run (main thread frame #2)
This test writes to address 0x7ffda4d64fd8 that was read by the previous test.
Our TSAN_OPTIONS environment variable is set to
suppressions=/path/to/tsan.suppressions, history_size=7, second_deadlock_stack=1, halt_on_error=1
We surmise that the false positive data race warning is due to 3 primary ingredients:
Two independent tests are run synchronously one after the other in which no data race is possible, but happen to read/write or write/write to/from the same memory address.
One thread's stack exceeds what the maximum history_size=7 can record, so tsan reports [failed to restore the stack].
The first test spawns a tbb::detail::r1::rml::private_server thread that survives through to the second test.
Because the tbb::detail::r1::rml::private_server thread from the first test is still running concurrently with the second test, tsan is confused into flagging this as a data race.
Question(s)
How can the tbb::detail::r1::rml::private_server thread be killed at the beginning or end of each test?
Alternatively, if that's not possible, is there something that we can add to our tsan.suppressions file or TSAN_OPTIONS environment variable that specifically suppresses this false warning without hiding real data races that may occur?
To kill the tbb::detail::r1::rml::private_server after each Google Test, we overrode the Test Fixture TearDown() method:
void TearDown() override {
    // Expected to kill tbb::detail::r1::rml::private_server after each test,
    // which can otherwise trigger false positive tsan data race warnings.
    auto handle = tbb::task_scheduler_handle::get();
    tbb::finalize(handle, std::nothrow_t{});
}
In our version of TBB we also had to #define TBB_PREVIEW_WAITING_FOR_WORKERS and #include <tbb/global_control.h>.
Credit: Pavel Kumbrasev for the suggestion.
You can replace the Mach semaphore with a dispatch semaphore to suppress the warnings.
Refer to the below link:
https://developer.apple.com/documentation/dispatch/dispatch_semaphore
You can also create a suppression file to specify the suppressions runtime flag
https://github.com/google/sanitizers/wiki/ThreadSanitizerSuppressions
If this helps, you can apply the settings at compile time:
-fsanitize=thread -fsanitize-blacklist=sanitizer-thread-suppressions.txt
I have a C++ program in Code::Block with gcc and wxWidgets.
After my working thread throws a wxThreadEvent with a struct as payload my program crashes (actually not at the throw, but at the moment I want to get the payload in the main).
Does anyone have an idea what is wrong?
The working thread part:
wxThread::ExitCode NavigationThread::Entry()
{
    wxThreadEvent event(wxEVT_THREAD, ID_REFRESH_DIRECTION);
    position_variables positionPayload;
    positionPayload.latitude = latDouble;
    positionPayload.longitude = lonDouble;
    positionPayload.direction = direction;
    event.SetPayload(&positionPayload);
    m_parent->GetEventHandler()->AddPendingEvent(event);
}
The struct:
struct position_variables{
    double latitude;
    double longitude;
    wxString direction;
};
class NavigationThread : public wxThread
{
...
}
The main.cpp
WindowsDgpsGUIFrame::WindowsDgpsGUIFrame(wxWindow* parent,wxWindowID id)
{
    Bind(wxEVT_THREAD, &WindowsDgpsGUIFrame::onRefreshDirections, this, ID_REFRESH_DIRECTION);
}
void WindowsDgpsGUIFrame::onRefreshDirections(wxThreadEvent& event)
{
    position_variables answerDirections = event.GetPayload<position_variables>(); //Here it crashes
}
When the crash occurs in normal "run" mode, a window opens saying the program stopped working. In debug mode there is a small window in Code::Blocks saying something about SIGSEGV, segmentation fault (or something like that), and this is the call stack:
#0 00877A54 std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::basic_string(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) () (??:??)
#1 04A1E550 ?? () (??:??)
#2 007ED139 position_variables::position_variables(this=0x4a1e588) (D:/WindowsDgps/WindowsDgpsGUI/NavigationThread.h:54)
#3 00851B54 wxAny::As<position_variables>(this=0x4c6fe70) (C:/wxWidgets-3.0.2/include/wx/any.h:979)
#4 0084E70C wxEventAnyPayloadMixin::GetPayload<position_variables>(this=0x4c6fe58) (C:/wxWidgets-3.0.2/include/wx/event.h:1219)
#5 0043320E WindowsDgpsGUIFrame::onRefreshDirections(this=0x4be2a68, event=...) (D:\WindowsDgps\WindowsDgpsGUI\WindowsDgpsGUIMain.cpp:440)
#6 0063AA48 wxAppConsoleBase::HandleEvent (this=0x4b2bde0, handler=0x4be2a68, func=(void (wxEvtHandler::*)(wxEvtHandler * const, wxEvent &) (../../src/common/appbase.cpp:611)
#7 0063AAD9 wxAppConsoleBase::CallEventHandler(this=0x4b2bde0, handler=0x4be2a68, functor=..., event=...) (../../src/common/appbase.cpp:623)
#8 0062DEA1 wxEvtHandler::ProcessEventIfMatchesId(entry=..., handler=0x4be2a68, event=...) (../../src/common/event.cpp:1392)
#9 0062EB3A wxEvtHandler::SearchDynamicEventTable(this=0x4be2a68, event=...) (../../src/common/event.cpp:1751)
#10 0062E318 wxEvtHandler::TryHereOnly(this=0x4be2a68, event=...) (../../src/common/event.cpp:1585)
#11 007C50A0 wxEvtHandler::TryBeforeAndHere(this=0x4be2a68, event=...) (../../include/wx/event.h:3671)
#12 0062E157 wxEvtHandler::ProcessEventLocally(this=0x4be2a68, event=...) (../../src/common/event.cpp:1522)
#13 0062E0FF wxEvtHandler::ProcessEvent(this=0x4be2a68, event=...) (../../src/common/event.cpp:1495)
#14 0062DCEC wxEvtHandler::ProcessPendingEvents(this=0x4be2a68) (../../src/common/event.cpp:1359)
#15 0063A69C wxAppConsoleBase::ProcessPendingEvents(this=0x4b2bde0) (../../src/common/appbase.cpp:520)
#16 007F0883 wxIdleWakeUpModule::MsgHookProc(nCode=0, wParam=1, lParam=77720172) (../../src/msw/window.cpp:7454)
#17 746BE1A1 USER32!TrackMouseEvent() (C:\WINDOWS\SysWOW64\user32.dll:??)
#18 ?? ?? () (??:??)
with #2 highlighted red.
Maybe it has something to do with the Clone() part in SetPayload()? Though I don't quite get how I should use it or why my accessing of the payload would be problematic...
You can't use a pointer to a local variable as the payload: the variable is destroyed as soon as the function containing it exits, which leaves the pointer invalid. Use the object itself as the payload, not a pointer to it.
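To make the by-value idea concrete, here is a minimal sketch in standard C++ only. It uses std::any as a stand-in for the wxAny that shows up in the call stack, and std::string in place of wxString; PositionVariables, make_payload_by_value and read_payload are hypothetical names for illustration, not wxWidgets API. The point it shows is that storing a copy of the object keeps the payload valid after the producing function has returned.

```cpp
#include <any>
#include <string>

// Stand-in for the question's struct (std::string instead of wxString).
struct PositionVariables {
    double latitude;
    double longitude;
    std::string direction;
};

// Hypothetical producer: the local object is copied INTO the std::any,
// so the payload remains valid after this function returns.
std::any make_payload_by_value() {
    PositionVariables local{52.5, 13.4, "north"};
    return std::any(local);  // stores a copy, not &local
}

// Hypothetical consumer: requests the same type that was stored.
PositionVariables read_payload(const std::any& payload) {
    return std::any_cast<PositionVariables>(payload);
}
```

In the wxWidgets code from the question, the analogous change would presumably be passing positionPayload rather than &positionPayload to SetPayload(), so that the type stored matches the type requested by GetPayload<position_variables>().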
I've been trying to use gloox 1.0.14 for the first time. I think I'm using the most minimal example there is, but unfortunately I get a SIGSEGV. Can anyone reproduce this problem, or tell me why it happens and what I'm doing wrong? The JID seems to have an impact, but I'd expect an error to be thrown rather than a segfault. The JID looks valid to me, and even if the certificate is incorrect or something similar, I'd still expect it to throw instead.
#include <cstdlib>
#include "gloox/client.h"
int main() {
    gloox::JID jid("segv@jabber.de");
    gloox::Client client(jid, "password");
    client.connect();
    return EXIT_SUCCESS;
}
Sanitizer told me:
ASAN:SIGSEGV
=================================================================
==27028==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f6357f79acd bp 0x7ffcca7c1a50 sp 0x7ffcca7c17d0 T0)
#0 0x7f6357f79acc (/lib64/libgnutls.so.30+0x99acc)
#1 0x7f6357f7b49a (/lib64/libgnutls.so.30+0x9b49a)
#2 0x7f6357f7bb18 in gnutls_x509_crt_verify (/lib64/libgnutls.so.30+0x9bb18)
#3 0x7f635a4ad54b in gloox::GnuTLSClient::verifyAgainstCAs(gnutls_x509_crt_int*, gnutls_x509_crt_int**, int) (/lib64/libgloox.so.13+0xbd54b)
#4 0x7f635a4ad6bf in gloox::GnuTLSClient::getCertInfo() (/lib64/libgloox.so.13+0xbd6bf)
#5 0x7f635a4afd6c in gloox::GnuTLSBase::handshake() (/lib64/libgloox.so.13+0xbfd6c)
#6 0x7f635a4afbc0 in gloox::GnuTLSBase::decrypt(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (/lib64/libgloox.so.13+0xbfbc0)
#7 0x7f635a4490bb in gloox::ConnectionTCPClient::recv(int) (/lib64/libgloox.so.13+0x590bb)
#8 0x7f635a4c297d in gloox::ConnectionTCPBase::receive() (/lib64/libgloox.so.13+0xd297d)
#9 0x7f635a4541d7 in gloox::ClientBase::connect(bool) (/lib64/libgloox.so.13+0x641d7)
#10 0x4014d7 in main test.cc:8
#11 0x7f6358aa357f in __libc_start_main (/lib64/libc.so.6+0x2057f)
#12 0x401198 in _start (test+0x401198)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
==27028==ABORTING
I compiled via:
c++ -fsanitize=address -fsanitize=undefined -ggdb -std=c++14 -Wall -Wextra -Wpedantic -Wconversion -Wsign-conversion -lgloox main.cc
This looks like a bug in a recent version of gloox. Running the code under gdb or valgrind (without the sanitizer) shows a nice backtrace.
The full backtrace from valgrind points to where the problem is:
==29533== at 0x62B558D: verify_crt (verify.c:602)
==29533== by 0x62B6F57: _gnutls_verify_crt_status (verify.c:936)
==29533== by 0x62B75CC: gnutls_x509_crt_verify (verify.c:1329)
==29533== by 0x4EF254B: gloox::GnuTLSClient::verifyAgainstCAs(gnutls_x509_crt_int*, gnutls_x509_crt_int**, int) (tlsgnutlsclient.cpp:227)
==29533== by 0x4EF26BF: gloox::GnuTLSClient::getCertInfo() (tlsgnutlsclient.cpp:157)
==29533== by 0x4EF4D6C: gloox::GnuTLSBase::handshake() (tlsgnutlsbase.cpp:138)
==29533== by 0x4EF4BC0: gloox::GnuTLSBase::decrypt(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (tlsgnutlsbase.cpp:70)
==29533== by 0x4E8E0BB: gloox::ConnectionTCPClient::recv(int) (connectiontcpclient.cpp:169)
==29533== by 0x4F0797D: gloox::ConnectionTCPBase::receive() (connectiontcpbase.cpp:115)
==29533== by 0x4E991D7: gloox::ClientBase::connect(bool) (clientbase.cpp:212)
==29533== by 0x400DAC: main (main.cc:9)
Backtrace from gdb shows:
#0 verify_crt (cert=0xbebebebebebebebe, trusted_cas=trusted_cas@entry=0x0, tcas_size=tcas_size@entry=0, flags=flags@entry=0, output=output@entry=0x7fffffffce90, _issuer=_issuer@entry=0x7fffffffce98,
now=1455405710, max_path=0x7fffffffce94, end_cert=true, nc=0x602000007a50, func=0x0) at verify.c:602
#1 0x00007ffff46bcf58 in _gnutls_verify_crt_status (certificate_list=certificate_list@entry=0x7fffffffcf08, clist_size=clist_size@entry=1, trusted_cas=trusted_cas@entry=0x0, tcas_size=tcas_size@entry=0,
flags=flags@entry=0, purpose=purpose@entry=0x0, func=0x0) at verify.c:936
#2 0x00007ffff46bd5cd in gnutls_x509_crt_verify (cert=cert@entry=0xbebebebebebebebe, CA_list=CA_list@entry=0x0, CA_list_length=CA_list_length@entry=0, flags=flags@entry=0, verify=verify@entry=0x7fffffffcf24)
at verify.c:1329
#3 0x00007ffff6bee54c in gloox::GnuTLSClient::verifyAgainstCAs (this=this@entry=0x61400000fc40, cert=0xbebebebebebebebe, CAList=CAList@entry=0x0, CAListSize=CAListSize@entry=0) at tlsgnutlsclient.cpp:227
#4 0x00007ffff6bee6c0 in gloox::GnuTLSClient::getCertInfo (this=0x61400000fc40) at tlsgnutlsclient.cpp:157
#5 0x00007ffff6bf0d6d in gloox::GnuTLSBase::handshake (this=0x61400000fc40) at tlsgnutlsbase.cpp:138
#6 0x00007ffff6bf0bc1 in gloox::GnuTLSBase::decrypt (this=0x61400000fc40,
data="\024\003\003\000\001\001\026\003\003\000(\373\336\267\221q\256\266\344\363\022\367 C\022\233\351\251\065\036\355\070\362\217\264\370\003\206+\"\201r^\355\067I\203Y\213\350\301")
at tlsgnutlsbase.cpp:70
#7 0x00007ffff6b8a0bc in gloox::ConnectionTCPClient::recv (this=<optimized out>, timeout=<optimized out>) at connectiontcpclient.cpp:169
#8 0x00007ffff6c0397e in gloox::ConnectionTCPBase::receive (this=0x60c00000bec0) at connectiontcpbase.cpp:115
#9 0x00007ffff6b951d8 in gloox::ClientBase::connect (this=0x7fffffffd410, block=<optimized out>) at clientbase.cpp:212
#10 0x00000000004013a8 in main () at main.cc:9
The cert=0xbebebebebebebebe pointer is the point of failure. It arrives there from frame 4 in tlsgnutlsclient.cpp:157, which contains this funny construct:
157 m_certInfo.chain = verifyAgainstCAs( cert[certListSize], 0 /*CAList*/, 0 /*CAListSize*/ );
cert[certListSize] clearly points past the end of the existing array. I tried to trace the bug in the sources, but I am not very skilled with the svn command-line tools, so I am leaving it to the reporter to file an upstream bug (I can do that myself, but let me know if there is anything I can do for you).
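To illustrate the off-by-one: for an array of certListSize elements the valid indices run from 0 to certListSize - 1, so cert[certListSize] reads one slot past the end. Below is a minimal sketch in standard C++. last_element is a hypothetical helper, and the assumption that the last certificate in the list was the intended one is mine, not something confirmed from the gloox sources.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the last element of a list, or throws if the list is empty.
// Indexing with items.size() itself would read past the end of the storage.
template <typename T>
const T& last_element(const std::vector<T>& items) {
    if (items.empty()) {
        throw std::out_of_range("last_element: empty list");
    }
    const std::size_t last_index = items.size() - 1;  // highest valid index
    return items[last_index];
}
```

With a check like this, an empty certificate list would surface as an exception rather than as a read of uninitialized memory, which is closer to the behaviour the question expected.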
I have a simple Qt program that reads from shared memory in a separate thread. I call a routine that waits until there is data in the shared memory segment; it unblocks and returns, and then I emit a signal to the main GUI thread. The slot in the main GUI thread is called, and in that routine I just print to std::cout that the signal was received and then return, with no further data processing. The program runs fine for about 30 seconds and then segfaults.
Running the program under gdb catches the segfault, and a backtrace shows that the fault is deep in Qt's event processing. None of my own code is involved in this segfault. I am totally lost as to what the problem is.
Here is the gdb output. In the following gdb output, the readSM() routine is the one that blocks until there is data available to read from shared memory, and the readDataBuffer() is the slot from the main GUI thread that just returns without any further processing. There are tons of the std::cout lines for the signal and slot before the seg fault.
...many of the following 2 lines before the fault...
DataReader::readSM(): calling m_pSharedMemory->ReadWait()
readDataBuffer(): entered, returning
DataReader::readSM(): calling m_pSharedMemory->ReadWait()
readDataBuffer(): entered, returning
Program received signal SIGSEGV, Segmentation fault.
0xb72f5c54 in QGuiApplicationPrivate::processNativeEvent(QWindow*, QByteArray const&, void*, long*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Gui.so.5
(gdb)
(gdb) bt
#0 0xb72f5c54 in QGuiApplicationPrivate::processNativeEvent(QWindow*, QByteArray const&, void*, long*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Gui.so.5
#1 0xb72e3ff3 in QWindowSystemInterface::handleNativeEvent(QWindow*, QByteArray const&, void*, long*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Gui.so.5
#2 0xb3f40e2d in ?? () from /opt/Qt/5.2.1/gcc/plugins/platforms/libqxcb.so
#3 0xb3f33170 in ?? () from /opt/Qt/5.2.1/gcc/plugins/platforms/libqxcb.so
#4 0xb3f33dce in ?? () from /opt/Qt/5.2.1/gcc/plugins/platforms/libqxcb.so
#5 0xb3f74e8b in ?? () from /opt/Qt/5.2.1/gcc/plugins/platforms/libqxcb.so
#6 0xb701e943 in QMetaCallEvent::placeMetaCall(QObject*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#7 0xb7021d92 in QObject::event(QEvent*) () from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#8 0xb7913eb4 in QApplicationPrivate::notify_helper(QObject*, QEvent*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Widgets.so.5
#9 0xb7917d00 in QApplication::notify(QObject*, QEvent*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Widgets.so.5
#10 0xb6ff3c2e in QCoreApplication::notifyInternal(QObject*, QEvent*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#11 0xb6ff68ec in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#12 0xb6ff6e2c in QCoreApplication::sendPostedEvents(QObject*, int) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#13 0xb704ad14 in ?? () from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#14 0xb68a06d3 in g_main_context_dispatch () from /lib/i386-linux-gnu/libglib-2.0.so.0
#15 0xb68a0a70 in ?? () from /lib/i386-linux-gnu/libglib-2.0.so.0
#16 0xb68a0b51 in g_main_context_iteration () from /lib/i386-linux-gnu/libglib-2.0.so.0
#17 0xb704b128 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#18 0xb3f97836 in ?? () from /opt/Qt/5.2.1/gcc/plugins/platforms/libqxcb.so
#19 0xb6ff22e6 in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#20 0xb6ff272c in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) ()
from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#21 0xb6ff6ed2 in QCoreApplication::exec() () from /opt/Qt/5.2.1/gcc/lib/libQt5Core.so.5
#22 0xb72f5b04 in QGuiApplication::exec() () from /opt/Qt/5.2.1/gcc/lib/libQt5Gui.so.5
#23 0xb790e914 in QApplication::exec() () from /opt/Qt/5.2.1/gcc/lib/libQt5Widgets.so.5
#24 0x0805034b in main ()
(gdb)
Since I am not processing any of the data in the GUI slot, and since the program works fine for about 30 seconds with its usual signals and slots being called, I am at a loss as to what is causing this segfault in the Qt event processing.
Any help would be very appreciated.
Thanks,
-Andres
UPDATE:
Thank you, Andrew Medico, for your suggestion to use valgrind. I never thought of using it as a debugging tool, but that makes perfect sense. After using it, it became obvious that you were correct that something else was causing the problem, and I finally found the cause elsewhere in my code: I was overwriting a buffer. Thanks again, Andrew, for your assistance.
-Andres
Clang has the -fsanitize-blacklist compile switch to suppress warnings from the ThreadSanitizer. Unfortunately, I cannot get it to work.
Here is an example that I want to suppress:
WARNING: ThreadSanitizer: data race (pid=21502)
Read of size 8 at 0x7f0dcf5b31a8 by thread T6:
#0 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&) /usr/include/tbb/partitioner.h:305 (exe+0x000000388b38)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
Previous write of size 8 at 0x7f0dcf5b31a8 by thread T1:
#0 auto_partition_type_base /usr/include/tbb/partitioner.h:299 (exe+0x000000388d9a)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#2 GhostSearch::Ghost3Search::SearchTask::execute_impl() /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1456 (exe+0x000000387a8a)
#3 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#4 GhostSearch::Ghost3Search::Ghost3SearchAlg::NullWindowSearch(int, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1640 (exe+0x000000388310)
#5 GhostSearch::PureMTDSearchAlg::FullWindowSearch(GhostSearch::SearchWindow, GhostSearch::SearchWindow, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/pure_mtd_search_alg.cpp:41 (exe+0x000000370e3f)
#6 GhostSearch::PureSearchAlgWrapper::RequestHandlerThread::EnterHandlerMainLoop() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:124 (exe+0x000000372d1b)
#7 operator() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:94 (exe+0x000000374683)
#8 execute_native_thread_routine /home/phil/tmp/gcc/src/gcc-4.8-20130725/libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000b26cf)
Thread T6 (tid=21518, running) created by thread T3 at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 <null> <null>:0 (libtbb.so.2+0x0000000198c0)
Thread T1 (tid=21513, running) created by main thread at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 __gthread_create /home/phil/tmp/gcc/src/gcc-build/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:662 (libstdc++.so.6+0x0000000b291e)
#2 GhostSearch::PureSearchAlgWrapper::StartRequestHandlerThread() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:77 (exe+0x0000003715c3)
#3 GhostSearch::Search::ExecuteSearch(GhostSearch::SEARCH_SETTINGS const&) /home/phil/ghost/search.cpp:243 (exe+0x00000033063f)
#4 GhostSearch::Search::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&, GhostInterfaces::UserInterface*) /home/phil/ghost/search.cpp:176 (exe+0x00000033037a)
#5 GhostInterfaces::UserInterface::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&) /home/phil/ghost/interface.cpp:1072 (exe+0x0000002ea220)
#6 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:576 (exe+0x0000002e9464)
#7 GhostInterfaces::Command_Analyze::Execute(GhostInterfaces::UserInterfaceData&) /home/phil/ghost/commands.cpp:1005 (exe+0x00000028756c)
#8 GhostInterfaces::UserInterface::FinishNextCommand() /home/phil/ghost/interface.cpp:1161 (exe+0x0000002e9ed0)
#9 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:571 (exe+0x0000002e9447)
#10 main /home/phil/ghost/ghost.cpp:54 (exe+0x000000274efd)
SUMMARY: ThreadSanitizer: data race /usr/include/tbb/partitioner.h:305 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&)
My attempts at the suppression file so far (which do not work):
# TBB
fun:tbb::*
src:/usr/include/tbb/partitioner.h
Do you know why it does not work?
(By the way, I would be happy to suppress all warnings from the TBB library.)
Finally, I got it working.
According to the documentation, each line must start with a valid "suppression_type" (race, thread, mutex, signal, deadlock, or called_from_lib).
In my example, the correct suppression_type is race.
Here is an example file called "sanitizer-thread-suppressions.txt", which suppresses two functions, which are known to contain data races:
race:Function1
race:MyNamespace::Function2
To test the suppress file, set the TSAN_OPTIONS environment variable and call the application (compiled with -fsanitize=thread):
$ TSAN_OPTIONS="suppressions=sanitizer-thread-suppressions.txt" ./myapp
If that works, you can apply the settings at compile time:
-fsanitize=thread -fsanitize-blacklist=sanitizer-thread-suppressions.txt
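As a variation on the suppression file above, the same race: entries can be baked into the binary itself. To my knowledge the ThreadSanitizer runtime consults a __tsan_default_suppressions() hook at startup if the program defines one, and parses the returned text like the contents of a suppressions file. Treat this as a sketch and check it against the runtime shipped with your compiler; the two function names are simply the placeholders from the example file.

```cpp
// Define this once in the instrumented program. The ThreadSanitizer runtime
// is expected to call it at startup and read the returned text as if it were
// a suppressions file: one "suppression_type:pattern" entry per line.
extern "C" const char* __tsan_default_suppressions() {
    return "race:Function1\n"
           "race:MyNamespace::Function2\n";
}
```

This keeps the suppressions next to the code they apply to, so a test runner that forgets to set TSAN_OPTIONS still gets them.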
Let's call it x.dylib. I only want x.dylib to be loaded sometimes.
In the initialization of the dylib is there any way to have some logic which would cause the dlopen() call that tried to load x.dylib to fail to load x.dylib and return NULL?
Renaming x.dylib is not an option.
I looked through http://opensource.apple.com/source/dyld/dyld-210.2.3/src/dyldAPIs.cpp but I am unfamiliar with the code.
I thought maybe this would do it:
__attribute__((constructor))
void initializer(void) {
    fprintf(stderr, "initializer\n");
    throw;
}
but when I call dlopen() on the dylib with that initializer, I just get "initializer
terminate called without an active exceptionAbort trap: 6"
So I'm stumped; any help would be great.
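On the "terminate called without an active exception" message: a bare throw; rethrows the exception that is currently being handled, and when none is being handled the C++ runtime calls std::terminate. The sketch below only illustrates that rule in standard C++; has_active_exception and rethrow_if_active are hypothetical helpers, and this does not settle whether an initializer can make dlopen() return NULL.

```cpp
#include <exception>
#include <stdexcept>

// True only while an exception is being handled, i.e. inside a catch block
// or in a function called from one.
bool has_active_exception() {
    return std::current_exception() != nullptr;
}

// Rethrows the in-flight exception if there is one. Otherwise returns false
// instead of letting a bare `throw;` reach std::terminate.
bool rethrow_if_active() {
    if (!has_active_exception()) {
        return false;
    }
    throw;  // valid here: an exception is currently being handled
}
```

Called from the initializer in the question, has_active_exception() would return false, which is why the bare throw; there ends in abort rather than in an exception that a caller could catch.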
Edit:
The stack trace, when viewed with gdb, is as follows:
Program received signal SIGABRT, Aborted.
0x00007fff9128a82a in __kill ()
(gdb) bt
#0 0x00007fff9128a82a in __kill ()
#1 0x00007fff93539a9c in abort ()
#2 0x00007fff987f07bc in abort_message ()
#3 0x00007fff987edfcf in default_terminate ()
#4 0x00007fff987ee001 in safe_handler_caller ()
#5 0x00007fff987ee05c in std::terminate ()
#6 0x00007fff987ef152 in __cxa_throw ()
#7 0x0000000100003eb5 in initializer ()
#8 0x00007fff5fc0fda6 in __dyld__ZN16ImageLoaderMachO18doModInitFunctionsERKN11ImageLoader11LinkContextE ()
#9 0x00007fff5fc0faf2 in __dyld__ZN16ImageLoaderMachO16doInitializationERKN11ImageLoader11LinkContextE ()
#10 0x00007fff5fc0d2e4 in __dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEjRNS_21InitializerTimingListE ()
#11 0x00007fff5fc0d27d in __dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEjRNS_21InitializerTimingListE ()
#12 0x00007fff5fc0e0b7 in __dyld__ZN11ImageLoader15runInitializersERKNS_11LinkContextERNS_21InitializerTimingListE ()
#13 0x00007fff5fc034dd in __dyld__ZN4dyld24initializeMainExecutableEv ()
#14 0x00007fff5fc0760b in __dyld__ZN4dyld5_mainEPK12macho_headermiPPKcS5_S5_ ()
#15 0x00007fff5fc01059 in __dyld__dyld_start ()
which come from:
http://opensource.apple.com/source/dyld/dyld-210.2.3/src/dyld.cpp
http://opensource.apple.com/source/dyld/dyld-210.2.3/src/ImageLoader.cpp
http://opensource.apple.com/source/dyld/dyld-210.2.3/src/ImageLoaderMachO.cpp
I'm surprised that I don't see dlopen() in the stack trace though.