I am debugging a segfault reported by TSAN in the CI of Boost.Beast.
I strongly believe it to be a false positive, but I don't know what to look for in order to suppress it.
It seems to me from the stack trace that the code is correctly instrumented.
The code passes all other tests, including valgrind, ubsan, etc.
I'm hoping some kind expert may put me out of my misery.
Here is the output:
====== BEGIN OUTPUT ======
beast.http.read
ThreadSanitizer:DEADLYSIGNAL
==132842==ERROR: ThreadSanitizer: SEGV on unknown address 0x7ff5d9cff000 (pc 0x7ff5dceba0d0 bp 0x000000000000 sp 0x7ff5d9c3d910 T132844)
==132842==The signal is caused by a READ memory access.
#0 __sanitizer::StackDepotBase<__sanitizer::StackDepotNode, 1, 20>::Put(__sanitizer::StackTrace, bool*) <null> (libtsan.so.2+0xba0d0)
#1 __tsan::CurrentStackId(__tsan::ThreadState*, unsigned long) <null> (libtsan.so.2+0x8c48f)
#2 __sanitizer::DD::MutexInit(__sanitizer::DDCallback*, __sanitizer::DDMutex*) <null> (libtsan.so.2+0xac534)
#3 __tsan::DDMutexInit(__tsan::ThreadState*, unsigned long, __tsan::SyncVar*) <null> (libtsan.so.2+0x9a3f8)
#4 __tsan::MetaMap::GetSync(__tsan::ThreadState*, unsigned long, unsigned long, bool, bool) <null> (libtsan.so.2+0xa85dc)
#5 __tsan_atomic32_fetch_add <null> (libtsan.so.2+0x783e9)
#6 __gnu_cxx::__exchange_and_add(int volatile*, int) /usr/include/c++/12/ext/atomicity.h:66 (read+0x4188f8)
#7 __gnu_cxx::__exchange_and_add_dispatch(int*, int) /usr/include/c++/12/ext/atomicity.h:101 (read+0x4188f8)
#8 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release_last_use() /usr/include/c++/12/bits/shared_ptr_base.h:187 (read+0x4188f8)
#9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/include/c++/12/bits/shared_ptr_base.h:361 (read+0x40c592)
#10 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/include/c++/12/bits/shared_ptr_base.h:1071 (read+0x418fee)
#11 std::__shared_ptr<boost::asio::detail::strand_executor_service::strand_impl, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/include/c++/12/bits/shared_ptr_base.h:1524 (read+0x420f83)
#12 std::shared_ptr<boost::asio::detail::strand_executor_service::strand_impl>::~shared_ptr() /usr/include/c++/12/bits/shared_ptr.h:175 (read+0x420faf)
#13 boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul> const, void>::~invoker() <null> (read+0x471f5f)
#14 boost::asio::detail::executor_op<boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul> const, void>, boost::asio::detail::recycling_allocator<void, boost::asio::detail::thread_info_base::default_tag>, boost::asio::detail::scheduler_operation>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) <null> (read+0x48b0ac)
#15 boost::asio::detail::scheduler_operation::complete(void*, boost::system::error_code const&, unsigned long) boost/asio/detail/scheduler_operation.hpp:40 (read+0x4fa38e)
#16 boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) boost/asio/detail/impl/scheduler.ipp:492 (read+0x4e8835)
#17 boost::asio::detail::scheduler::run(boost::system::error_code&) boost/asio/detail/impl/scheduler.ipp:210 (read+0x4e74fb)
#18 boost::asio::io_context::run() boost/asio/impl/io_context.ipp:63 (read+0x4dc122)
#19 boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}::operator()() const <null> (read+0x412363)
#20 void std::__invoke_impl<void, boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}>(std::__invoke_other, boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}&&) <null> (read+0x4a1f92)
#21 std::__invoke_result<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}>::type std::__invoke<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}>(boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}&&) <null> (read+0x49f9dc)
#22 void std::thread::_Invoker<std::tuple<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (read+0x49c90a)
#23 std::thread::_Invoker<std::tuple<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}> >::operator()() <null> (read+0x49941e)
#24 std::thread::_State_impl<std::thread::_Invoker<std::tuple<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}> > >::_M_run() <null> (read+0x494c2a)
#25 execute_native_thread_routine <null> (libstdc++.so.6+0xdbb72)
#26 __tsan_thread_start_func <null> (libtsan.so.2+0x393ef)
#27 start_thread <null> (libc.so.6+0x8ce2c)
#28 clone3 <null> (libc.so.6+0x1121af)
ThreadSanitizer can not provide additional info.
SUMMARY: ThreadSanitizer: SEGV (/lib64/libtsan.so.2+0xba0d0) in __sanitizer::StackDepotBase<__sanitizer::StackDepotNode, 1, 20>::Put(__sanitizer::StackTrace, bool*)
==132842==ABORTING
The code being tested is latest master branch.
Command line to reproduce:
$ ./b2 toolset=gcc thread-sanitizer=norecover link=static variant=debug libs/beast/test -q -d+2 -j1
My compiler info:
$ gcc --version
gcc (GCC) 12.1.1 20220507 (Red Hat 12.1.1-1)
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
My OS is Fedora 36. But we see this happen on Ubuntu as well.
Using the b2 line, I could repro the SEGV on Linux using master (00293a6adb5 from the superproject).
I started with a - to me - more convenient setup based on CMake. I modified the CMake to use thread,undefined sanitizers instead of address,undefined for VARIANT=ubasan.
Interestingly, it doesn't segfault. It does however seem to have a legit TSAN violation in basic_stream.cpp, where the effective flags are:
{
"directory": "/backup/cloudbackup/custom_ex/superboost/libs/beast/build/test/beast/core",
"command": "/home/sehe/bin/g++-10 -DBOOST_ALL_STATIC_LINK=1 -DBOOST_ASIO_DISABLE_BOOST_ARRAY=1 -DBOOST_ASIO_DISABLE_BOOST_BIND=1 -DBOOST_ASIO_DISABLE_BOOST_DATE_TIME=1 -DBOOST_ASIO_DISABLE_BOOST_REGEX=1 -DBOOST_ASIO_NO_DEPRECATED=1 -DBOOST_ASIO_SEPARATE_COMPILATION=1 -DBOOST_BEAST_ALLOW_DEPRECATED -DBOOST_BEAST_SEPARATE_COMPILATION=1 -DBOOST_BEAST_TESTS -DBOOST_COROUTINES_NO_DEPRECATION_WARNING=1 -I/backup/cloudbackup/custom_ex/superboost/libs/beast/include -I/backup/cloudbackup/custom_ex/superboost/libs/beast/. -I/backup/cloudbackup/custom_ex/superboost/libs/beast/test/./extern -I/backup/cloudbackup/custom_ex/superboost/libs/beast/test/./extras/include -isystem /backup/cloudbackup/custom_ex/superboost -std=c++11 -Wall -Wextra -Wpedantic -Wno-unused-parameter -DBOOST_BEAST_NO_SLOW_TESTS=1 -msse4.2 -funsigned-char -fno-omit-frame-pointer -fsanitize=thread,undefined -O2 -g -DNDEBUG -pthread -o CMakeFiles/tests-beast-core.dir/basic_stream.cpp.o -c /backup/cloudbackup/custom_ex/superboost/libs/beast/test/beast/core/basic_stream.cpp",
"file": "/backup/cloudbackup/custom_ex/superboost/libs/beast/test/beast/core/basic_stream.cpp"
},
Breaking it down for readability:
g++-10
-DBOOST_ALL_STATIC_LINK=1
-DBOOST_ASIO_DISABLE_BOOST_ARRAY=1
-DBOOST_ASIO_DISABLE_BOOST_BIND=1
-DBOOST_ASIO_DISABLE_BOOST_DATE_TIME=1
-DBOOST_ASIO_DISABLE_BOOST_REGEX=1
-DBOOST_ASIO_NO_DEPRECATED=1
-DBOOST_ASIO_SEPARATE_COMPILATION=1
-DBOOST_BEAST_ALLOW_DEPRECATED
-DBOOST_BEAST_SEPARATE_COMPILATION=1
-DBOOST_BEAST_TESTS
-DBOOST_COROUTINES_NO_DEPRECATION_WARNING=1
-I/superboost/libs/beast/include
-I/superboost/libs/beast/.
-I/superboost/libs/beast/test/./extern
-I/superboost/libs/beast/test/./extras/include
-isystem /superboost
-std=c++11
-Wall
-Wextra
-Wpedantic
-Wno-unused-parameter
-DBOOST_BEAST_NO_SLOW_TESTS=1
-msse4.2
-funsigned-char
-fno-omit-frame-pointer
-fsanitize=thread,undefined
-O2 -g
-DNDEBUG
-pthread
-o CMakeFiles/tests-beast-core.dir/basic_stream.cpp.o
-c /superboost/libs/beast/test/beast/core/basic_stream.cpp
The reported diagnostic: https://paste.ubuntu.com/p/6SKjmZ9wFT/ (lines truncated for SO):
beast.core.basic_stream
==================
WARNING: ThreadSanitizer: data race (pid=24051)
Write of size 8 at 0x7b0400000030 by thread T1:
#0 pipe ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1726 (libtsan.so.0+0x3e574)
#1 __sanitizer::IsAccessibleMemoryRange(unsigned long, unsigned long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:281 (libu...
#2 operator() /superboost/libs/beast/test/beast/core/basic_stream.cpp:228 (tests-beast-core+0x4f0271)
#3 operator() /superboost/boost/asio/detail/bind_handler.hpp:171 (tests-beast-core+0x4f0271)
#4 invoke<boost::asio::detail::binder1<boost::beast::(anonymous namespace)::test_server::test_server(boost::beast::string_view, boost::asio::ip::tcp::end...
#5 complete<boost::asio::detail::binder1<boost::beast::(anonymous namespace)::test_server::test_server(boost::beast::string_view, boost::asio::ip::tcp::e...
#6 do_complete /superboost/boost/asio/detail/reactive_socket_accept_op.hpp:150 (tests-beast-core+0x4f0271)
#7 boost::asio::detail::scheduler_operation::complete(void*, boost::system::error_code const&, unsigned long) /superboost/boost/asio/detail/scheduler_ope...
#8 boost::asio::detail::epoll_reactor::descriptor_state::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, ...
#9 boost::asio::detail::scheduler_operation::complete(void*, boost::system::error_code const&, unsigned long) /superboost/boost/asio/detail/scheduler_ope...
#10 boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info...
#11 boost::asio::detail::scheduler::run(boost::system::error_code&) /superboost/boost/asio/detail/impl/scheduler.ipp:210 (tests-beast-core+0x90d424)
#12 boost::asio::io_context::run() /superboost/boost/asio/impl/io_context.ipp:63 (tests-beast-core+0x91bf39)
#13 operator() /superboost/libs/beast/test/beast/core/basic_stream.cpp:234 (tests-beast-core+0x4cf1f9)
#14 __invoke_impl<void, boost::beast::(anonymous namespace)::test_server::test_server(boost::beast::string_view, boost::asio::ip::tcp::endpoint, std::ost...
#15 __invoke<boost::beast::(anonymous namespace)::test_server::test_server(boost::beast::string_view, boost::asio::ip::tcp::endpoint, std::ostream&)::<la...
#16 _M_invoke<0> /usr/include/c++/10/thread:264 (tests-beast-core+0x4cf1f9)
#17 operator() /usr/include/c++/10/thread:271 (tests-beast-core+0x4cf1f9)
#18 _M_run /usr/include/c++/10/thread:215 (tests-beast-core+0x4cf1f9)
#19 execute_native_thread_routine ../../../../../src/libstdc++-v3/src/c++11/thread.cc:82 (libstdc++.so.6+0xd44bf)
Previous write of size 8 at 0x7b0400000030 by main thread:
#0 pipe ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1726 (libtsan.so.0+0x3e574)
#1 __sanitizer::IsAccessibleMemoryRange(unsigned long, unsigned long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:281 (libu...
#2 void boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul> >::initiate_async_...
#3 void boost::asio::detail::completion_handler_async_result<boost::beast::basic_stream<boost::asio::ip::tcp, boost::asio::io_context::basic_executor_typ...
#4 boost::asio::constraint<boost::asio::detail::async_result_has_initiate_memfn<boost::beast::basic_stream<boost::asio::ip::tcp, boost::asio::io_context:...
#5 decltype ((async_initiate<boost::beast::basic_stream<boost::asio::ip::tcp, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul>, bo...
#6 boost::beast::basic_stream<boost::asio::ip::tcp, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul>, boost::beast::unlimited_rate...
#7 boost::beast::basic_stream<boost::asio::ip::tcp, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul>, boost::beast::unlimited_rate...
#8 boost::beast::basic_stream<boost::asio::ip::tcp, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul>, boost::beast::unlimited_rate...
#9 void boost::beast::basic_stream<boost::asio::ip::tcp, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul>, boost::beast::unlimited...
#10 void boost::asio::detail::completion_handler_async_result<boost::beast::basic_stream_test::handler, void (boost::system::error_code, unsigned long)>:...
#11 boost::asio::constraint<boost::asio::detail::async_result_has_initiate_memfn<boost::beast::basic_stream_test::handler, void (boost::system::error_cod...
#12 boost::asio::async_result<std::decay<boost::beast::basic_stream_test::handler>::type, void (boost::system::error_code, unsigned long)>::return_type b...
#13 boost::beast::basic_stream_test::testRead() /superboost/libs/beast/test/beast/core/basic_stream.cpp:488 (tests-beast-core+0x678b01)
#14 boost::beast::basic_stream_test::run() /superboost/libs/beast/test/beast/core/basic_stream.cpp:1401 (tests-beast-core+0x70b193)
#15 void boost::beast::unit_test::suite::run<void>(boost::beast::unit_test::runner&) /superboost/libs/beast/include/boost/beast/_experimental/unit_test/s...
#16 void boost::beast::unit_test::suite::operator()<void>(boost::beast::unit_test::runner&) /superboost/libs/beast/include/boost/beast/_experimental/unit...
#17 boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<cha...
#18 void std::__invoke_impl<void, boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__cxx11::basic_string<char, std::char_tr...
#19 std::enable_if<std::__and_<std::is_void<void>, std::__is_invocable<boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__c...
#20 std::_Function_handler<void (boost::beast::unit_test::runner&), boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__cxx1...
#21 std::function<void (boost::beast::unit_test::runner&)>::operator()(boost::beast::unit_test::runner&) const /usr/include/c++/10/bits/std_function.h:62...
#22 boost::beast::unit_test::suite_info::run(boost::beast::unit_test::runner&) const /superboost/libs/beast/include/boost/beast/_experimental/unit_test/s...
#23 bool boost::beast::unit_test::runner::run<void>(boost::beast::unit_test::suite_info const&) /superboost/libs/beast/include/boost/beast/_experimental/...
#24 bool boost::beast::unit_test::runner::run_each<boost::beast::unit_test::suite_list>(boost::beast::unit_test::suite_list const&) /superboost/libs/beas...
#25 main /superboost/libs/beast/include/boost/beast/_experimental/unit_test/main.ipp:82 (tests-beast-core+0x4cc1d3)
Thread T1 (tid=24054, running) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x5fe84)
#1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) /build/gcc-11-YRKbe7/gcc-11-...
#2 boost::beast::basic_stream_test::testRead() /superboost/libs/beast/test/beast/core/basic_stream.cpp:484 (tests-beast-core+0x67891b)
#3 boost::beast::basic_stream_test::run() /superboost/libs/beast/test/beast/core/basic_stream.cpp:1401 (tests-beast-core+0x70b193)
#4 void boost::beast::unit_test::suite::run<void>(boost::beast::unit_test::runner&) /superboost/libs/beast/include/boost/beast/_experimental/unit_test/su...
#5 void boost::beast::unit_test::suite::operator()<void>(boost::beast::unit_test::runner&) /superboost/libs/beast/include/boost/beast/_experimental/unit_...
#6 boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char...
#7 void std::__invoke_impl<void, boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__cxx11::basic_string<char, std::char_tra...
#8 std::enable_if<std::__and_<std::is_void<void>, std::__is_invocable<boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__cx...
#9 std::_Function_handler<void (boost::beast::unit_test::runner&), boost::beast::unit_test::make_suite_info<boost::beast::basic_stream_test>(std::__cxx11...
#10 std::function<void (boost::beast::unit_test::runner&)>::operator()(boost::beast::unit_test::runner&) const /usr/include/c++/10/bits/std_function.h:62...
#11 boost::beast::unit_test::suite_info::run(boost::beast::unit_test::runner&) const /superboost/libs/beast/include/boost/beast/_experimental/unit_test/s...
#12 bool boost::beast::unit_test::runner::run<void>(boost::beast::unit_test::suite_info const&) /superboost/libs/beast/include/boost/beast/_experimental/...
#13 bool boost::beast::unit_test::runner::run_each<boost::beast::unit_test::suite_list>(boost::beast::unit_test::suite_list const&) /superboost/libs/beas...
#14 main /superboost/libs/beast/include/boost/beast/_experimental/unit_test/main.ipp:82 (tests-beast-core+0x4cc1d3)
SUMMARY: ThreadSanitizer: data race ../../../../src/libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:281 in __sanitizer::IsAccessibleMemoryRange(uns...
==================
559ms, 1 suite, 1 case, 217 tests total, 0 failures
ThreadSanitizer: reported 1 warnings
Given this observation, I thought to see whether excluding the offending TU (basic_stream.cpp) from the b2 build removes the SEGV. No such luck.
On the contrary, the SEGV manifests with the following TUs:
| module | TU |
| --- | --- |
| test/beast/core | buffered_read_stream.cpp |
| test/beast/http | read.cpp |
| test/beast/http | write.cpp |
| test/beast/websocket | close.cpp |
| test/beast/websocket | handshake.cpp |
| test/beast/websocket | ping.cpp |
| test/beast/websocket | read2.cpp |
| test/beast/websocket | write.cpp |
| test/doc | http_examples.cpp |
Dropping these TUs from their respective test/**/Jamfile allows all remaining tests to pass TSAN under b2. Now, I did some soul searching and e.g. unique include diving using a script like:
#!/bin/bash -i
alias PP='find build/ -name "*.cpp.i"'
PP -delete
make -C build/test/beast/core buffered_read_stream.i
make -C build/test/beast/http read.i
make -C build/test/beast/http write.i
make -C build/test/beast/websocket close.i
make -C build/test/beast/websocket handshake.i
make -C build/test/beast/websocket ping.i
make -C build/test/beast/websocket read2.i
make -C build/test/beast/websocket write.i
make -C build/test/doc http_examples.i
PP | nl
n=$(PP | wc -l)
set -x
export LANG=C
PP -exec sort -b {} \+ | uniq -dc | grep -wP "^\s*$n # 1" | grep -P '1( 3 4)?$' | nl
This uncovers a common subset of 854 includes. Of those, 40 are Beast headers and 208 are Asio headers.
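The intersection step that the shell pipeline performs can also be sketched in C++. This is only an illustration under assumptions: the helper name `common_includes` is hypothetical, and it expects the per-TU header sets to have been extracted already (for example from the `# 1 "file"` line markers in the preprocessed output).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: given the set of headers seen in each translation
// unit, return the headers that appear in every one of them.
std::set<std::string> common_includes(
    const std::vector<std::set<std::string>>& per_tu)
{
    if (per_tu.empty())
        return {};

    std::set<std::string> common = per_tu.front();
    for (std::size_t i = 1; i < per_tu.size(); ++i)
    {
        // Keep only the headers that are also present in TU number i.
        std::set<std::string> next;
        std::set_intersection(
            common.begin(), common.end(),
            per_tu[i].begin(), per_tu[i].end(),
            std::inserter(next, next.begin()));
        common.swap(next);
    }

    // Postcondition: an intersection is never larger than any input set.
    assert(common.size() <= per_tu.front().size());
    return common;
}
```

Running the same function over the TUs that do not crash and subtracting that result would give the candidates for question 2 below.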
Questions from here:
1. Why is SEGV not happening in the CMake build?
2. Are there headers in the common subset that do not appear in the TUs that don't trip TSAN up?
3. Is a recent change in Asio relevant?
I'll address these in the order 3, 1, 2 (optimizing for return on effort).
3. Is a recent Asio change involved? [YES]
Doing the b2 test with only Asio reverted to 1.79.0 (e929e5cf Merge asio from 'develop') passes all the tests cleanly.
Just to check that no compiler flags were harmed in the process: buffered_read_stream.cpp, for example, showed the exact same commands:
gcc.compile.c++ bin.v2/libs/beast/test/beast/core/buffered_read_stream.test/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/buffered_read_stream.o
"g++-10" -fvisibility-inlines-hidden -fsanitize=thread -fno-sanitize-recover=thread -fno-omit-frame-pointer -m64 -pthread -O0 -fno-inline -Wall -g -fvisibility=hidden -DBOOST_ALL_NO_LIB=1 -DBOOST_ASIO_DISABLE_BOOST_ARRAY=1 -DBOOST_ASIO_DISABLE_BOOST_BIND=1 -DBOOST_ASIO_DISABLE_BOOST_DATE_TIME=1 -DBOOST_ASIO_DISABLE_BOOST_REGEX=1 -DBOOST_ASIO_NO_DEPRECATED=1 -DBOOST_ASIO_SEPARATE_COMPILATION -DBOOST_ATOMIC_STATIC_LINK=1 -DBOOST_BEAST_ALLOW_DEPRECATED -DBOOST_BEAST_SEPARATE_COMPILATION -DBOOST_BEAST_TESTS -DBOOST_COROUTINES_NO_DEPRECATION_WARNING=1 -DBOOST_FILESYSTEM_STATIC_LINK=1 -D_GNU_SOURCE=1 -D_XOPEN_SOURCE=600 -I"." -I"libs/beast" -I"libs/beast/test/extern" -I"libs/beast/test/extras/include" -c -o "bin.v2/libs/beast/test/beast/core/buffered_read_stream.test/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/buffered_read_stream.o" "libs/beast/test/beast/core/buffered_read_stream.cpp"
gcc.link bin.v2/libs/beast/test/beast/core/buffered_read_stream.test/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/buffered_read_stream
"g++-10" -o "bin.v2/libs/beast/test/beast/core/buffered_read_stream.test/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/buffered_read_stream" -Wl,--start-group "bin.v2/libs/beast/test/beast/core/buffered_read_stream.test/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/buffered_read_stream.o" "bin.v2/libs/beast/test/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/lib-test.a" "bin.v2/libs/beast/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/lib-asio.a" "bin.v2/libs/beast/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/lib-beast.a" "bin.v2/libs/coroutine/build/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/libboost_coroutine.a" "bin.v2/libs/context/build/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/libboost_context.a" "bin.v2/libs/filesystem/build/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/libboost_filesystem.a" "bin.v2/libs/atomic/build/gcc-10.0/debug/link-static/thread-sanitizer-norecover/threading-multi/visibility-hidden/libboost_atomic.a" -Wl,-Bstatic -Wl,-Bdynamic -lrt -Wl,--end-group -m64 -pthread -g -fvisibility=hidden -fvisibility-inlines-hidden -fsanitize=thread -fno-sanitize-recover=thread -fno-omit-frame-pointer
So, now we know that something inside the 208 Asio headers must be involved.
1. Why is SEGV not happening in the CMake build?
Surely, here we should be able to spot difference in compilation flags? Ever-so-slightly redacted to remove spelling differences (left = CMake, right = b2):
By process of elimination, I figured out that the culprit is -fno-inline -O0. Somehow it leads to recursive TSAN errors:
beast.core.buffered_read_stream
ThreadSanitizer:DEADLYSIGNAL
==2827==ERROR: ThreadSanitizer: SEGV on unknown address 0x60000212fff8 (pc 0x7f319a938bfc bp 0x7f3196fbe1c0 sp 0x7f3196fbe1a8 T2829)
==2827==The signal is caused by a WRITE memory access.
ThreadSanitizer:DEADLYSIGNAL
ThreadSanitizer: nested bug in the same thread, aborting.
Observations without -fno-inline:
from -O2 the symptoms go away
at -O1 the symptoms go away if NDEBUG is defined
at -O0 the symptom is there regardless
This suggests that an NDEBUG-sensitive piece of code could be involved. This might be a lead to guide minimization.
Comparing preprocessed sources may highlight specific possible causes. My main suspicions are the revamped spawned_thread_base in asio/spawn.hpp, source locations, and/or cancellation slots.
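To make "NDEBUG-sensitive" concrete: anything guarded by `assert` or `#ifndef NDEBUG` compiles to different code in the two configurations, so the instrumented binary differs as well. A minimal sketch follows; the type `ring_index` and the function `debug_checks_enabled` are hypothetical and are not taken from Asio or Beast.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type, used only to show an NDEBUG-dependent body.
struct ring_index
{
    std::size_t pos = 0;
    std::size_t size = 8;

    // Advance the index, wrapping around at `size`.
    void advance()
    {
        // With NDEBUG defined this line expands to nothing, so the
        // generated code for this function differs between builds.
        assert(pos < size);
        pos = (pos + 1) % size;
    }
};

// Reports which configuration this translation unit was compiled in.
constexpr bool debug_checks_enabled()
{
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}
```

Diffing the two preprocessed variants of a crashing TU for regions like these is one way to narrow the search.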
As a courtesy, here's the preprocessed-buffered-stream-reader.zip containing 4 files (/home/sehe/{with,without}-NDEBUG.i.asio1.{79,80}.0).
Thanks everyone who chipped in on this.
I was able to get some help from the legendary Chris Kohlhoff, author of the Asio library.
Quoting:
It seems that thread-sanitizer does not correctly handle coroutine/fiber stacks that migrate between threads. This would either be considered a bug in thread-sanitizer, or perhaps a feature request for it.
The issue is in the thread sanitiser library itself.
The workaround in this case is to restructure the code so that the initiation of the fiber takes place on the same thread as the one in which it makes progress.
Here's the old (correct but not TSAN-compatible) code:
template<class F0, class... FN>
inline
void
enable_yield_to::
spawn(F0&& f, FN&&... fn)
{
    asio::spawn(ioc_,
        [&](yield_context yield)
        {
            f(yield);
            std::lock_guard<std::mutex> lock{m_};
            if(--running_ == 0)
                cv_.notify_all();
        }
        , boost::coroutines::attributes(2 * 1024 * 1024));
    spawn(fn...);
}
And here's the code with the workaround:
template<class F0, class... FN>
inline
void
enable_yield_to::
spawn(F0&& f, FN&&... fn)
{
    // dispatch of spawn is a workaround for
    // https://github.com/boostorg/beast/issues/2499
    asio::dispatch(ioc_,
        [&]
        {
            asio::spawn(ioc_,
                [&](yield_context yield)
                {
                    f(yield);
                    std::lock_guard<std::mutex> lock{m_};
                    if(--running_ == 0)
                        cv_.notify_all();
                }
                , boost::coroutines::attributes(2 * 1024 * 1024));
        });
    spawn(fn...);
}
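The principle behind the workaround (initiate the work on the same thread that later runs it) can be shown with the standard library alone. The sketch below is an analogy under assumptions, not Asio code and not a reproduction of the TSAN crash: `single_thread_loop`, `thread_ids` and `spawn_task` are hypothetical names, and the "task" is a plain function rather than a coroutine.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for an io_context that is run by one thread.
class single_thread_loop
{
public:
    single_thread_loop() : worker_([this] { run(); }) {}

    ~single_thread_loop()
    {
        post(nullptr); // an empty function is the stop marker
        worker_.join();
    }

    // Queue a function to run on the worker thread.
    void post(std::function<void()> fn)
    {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(fn));
        }
        cv_.notify_one();
    }

private:
    void run()
    {
        for (;;)
        {
            std::function<void()> fn;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return !q_.empty(); });
                fn = std::move(q_.front());
                q_.pop();
            }
            if (!fn)
                return;
            fn(); // invoked outside the lock, so fn may call post()
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    std::thread worker_; // declared last so the members above exist first
};

struct thread_ids
{
    std::thread::id initiated_on;
    std::thread::id ran_on;
};

// dispatch_first == false mirrors the old code: the task is initiated on
// the caller's thread. dispatch_first == true mirrors the workaround: the
// initiation itself is handed to the loop's thread.
thread_ids spawn_task(bool dispatch_first)
{
    thread_ids ids;
    std::promise<void> done;
    {
        single_thread_loop loop;
        auto initiate = [&] {
            ids.initiated_on = std::this_thread::get_id();
            loop.post([&] {
                ids.ran_on = std::this_thread::get_id();
                done.set_value();
            });
        };
        if (dispatch_first)
            loop.post(initiate);
        else
            initiate();
        done.get_future().wait();
    }
    assert(ids.ran_on != std::thread::id()); // the task did run
    return ids;
}
```

With `spawn_task(false)` the two recorded thread ids differ; with `spawn_task(true)` they match, which is the property the `asio::dispatch` wrapper restores for the coroutine.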
Related
I got a strange core dump from code that I copied from part of the example at http://en.cppreference.com/w/cpp/thread/packaged_task,
#include <future>
#include <iostream>
#include <cmath>

void task_lambda() {
    std::packaged_task<int(int,int)> task([](int a, int b) {
        return std::pow(a, b);
    });
    std::future<int> result = task.get_future();
    task(2, 9);
    std::cout << "task_lambda:\t" << result.get() << '\n';
}

int main() {
    task_lambda();
}
I got this
terminate called after throwing an instance of 'std::system_error'
what(): Unknown error -1
[1] 28373 abort (core dumped) ./a.out
The call stack is like below:
#0 0x00007ffff71a2428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007ffff71a402a in __GI_abort () at abort.c:89
#2 0x00007ffff7ae484d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff7ae26b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff7ae2701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff7ae2919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007ffff7b0b7fe in std::__throw_system_error(int) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x0000000000404961 in std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) (__once=..., __f=<unknown type in /home/ace/test/a.out, CU 0x0, DIE 0x1246d>)
at /usr/include/c++/5/mutex:746
#8 0x0000000000403eb2 in std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) (
this=0x61ec30, __res=..., __ignore_failure=false) at /usr/include/c++/5/future:387
#9 0x0000000000402b76 in std::__future_base::_Task_state<task_lambda()::<lambda(int, int)>, std::allocator<int>, int(int, int)>::_M_run(<unknown type in /home/ace/test/a.out, CU 0x0, DIE 0x17680>, <unknown type in /home/ace/test/a.out, CU 0x0, DIE 0x17685>) (this=0x61ec30, __args#0=<unknown type in /home/ace/test/a.out, CU 0x0, DIE 0x17680>,
__args#1=<unknown type in /home/ace/test/a.out, CU 0x0, DIE 0x17685>) at /usr/include/c++/5/future:1403
#10 0x00000000004051c1 in std::packaged_task<int (int, int)>::operator()(int, int) (this=0x7fffffffdca0, __args#0=2, __args#1=9) at /usr/include/c++/5/future:1547
#11 0x0000000000401c7d in task_lambda () at aa.cc:12
#12 0x0000000000401d1b in main () at aa.cc:19
Then I added some sample code into my program, it became
#include <iostream>
#include <cmath>
#include <future>
#include <thread>

int f(int x, int y) { return std::pow(x,y); }

void task_thread() {
    std::packaged_task<int(int,int)> task(f);
    std::future<int> result = task.get_future();
    std::thread task_td(std::move(task), 2, 10);
    task_td.join();
    std::cout << "task_thread:\t" << result.get() << '\n';
}

void task_lambda() {
    std::packaged_task<int(int,int)> task([](int a, int b) {
        return std::pow(a, b);
    });
    std::future<int> result = task.get_future();
    task(2, 9);
    std::cout << "task_lambda:\t" << result.get() << '\n';
}

int main() {
    task_lambda();
}
the error was gone. How can adding a function that I never even call fix the program?
gcc version
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
Program compiled with the command:
g++ -std=c++11 aa.cc -lpthread
With @fedepad's help, I got the correct output by replacing -lpthread with -pthread. But I'm still confused about how the second program works just by adding a dummy function!
I tried your first code snippet using the following version of g++:
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
and compiled with the following
g++ -o test_threads test_threads.cpp -std=c++11 -pthread
and I can run the program with no problems, getting the following output:
$ ./test_threads
task_lambda: 512
If I then use the -lpthread as you did with the following
g++ -o test_threads test_threads.cpp -std=c++11 -lpthread
I get
$ ./test_threads
terminate called after throwing an instance of 'std::system_error'
what(): Unknown error -1
[1] 7890 abort ./test_threads
So please use -pthread as a flag and not -lpthread.
This behavior is also mentioned in the following
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59830
There's a difference between -pthread and -lpthread.
Looking at the man page for g++
-pthread
Adds support for multithreading with the pthreads library. This option sets flags for both the preprocessor and linker.
To see which flags are activated for each, one can check with the following:
g++ -dumpspecs | grep pthread
g++ -dumpspecs | grep lpthread
As one can clearly see, there are some preprocessor macros that are not activated if one is using -lpthread.
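One of those macros can be checked from inside the program. The sketch below assumes GCC on Linux, where `-pthread` defines `_REENTRANT` for the preprocessor and `-lpthread` alone does not; the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// True if _REENTRANT was defined when this translation unit was compiled.
// Assumption: on GCC for Linux this macro is set by -pthread.
constexpr bool built_with_reentrant()
{
#ifdef _REENTRANT
    return true;
#else
    return false;
#endif
}

// Human-readable form, handy for printing at program start.
std::string reentrant_status()
{
    return built_with_reentrant() ? "_REENTRANT defined"
                                  : "_REENTRANT not defined";
}

// Hypothetical guard: call early in main() to fail fast in debug builds
// when the program was compiled without -pthread.
inline void require_pthread_flag()
{
    assert(built_with_reentrant() && "compile and link with -pthread");
}
```

Printing `reentrant_status()` from a build with `-pthread` and from one with `-lpthread` shows the preprocessor difference directly, without reading the specs output.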
I've been trying to use gloox 1.0.14 for the first time, and I think I'm using the most minimal example there is, but sadly I get a SIGSEGV. Can anyone reproduce this problem, or tell me why this happens and what I'm doing wrong? The JID seems to have an impact on this, but I'd expect an error to be thrown instead of a SEGV crash, and the JID seems valid to me. Even if the certificate is incorrect or whatever, I'd still expect it to throw instead.
#include <cstdlib>
#include "gloox/client.h"

int main() {
    gloox::JID jid("segv@jabber.de");
    gloox::Client client(jid, "password");
    client.connect();
    return EXIT_SUCCESS;
}
Sanitizer told me:
ASAN:SIGSEGV
=================================================================
==27028==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f6357f79acd bp 0x7ffcca7c1a50 sp 0x7ffcca7c17d0 T0)
#0 0x7f6357f79acc (/lib64/libgnutls.so.30+0x99acc)
#1 0x7f6357f7b49a (/lib64/libgnutls.so.30+0x9b49a)
#2 0x7f6357f7bb18 in gnutls_x509_crt_verify (/lib64/libgnutls.so.30+0x9bb18)
#3 0x7f635a4ad54b in gloox::GnuTLSClient::verifyAgainstCAs(gnutls_x509_crt_int*, gnutls_x509_crt_int**, int) (/lib64/libgloox.so.13+0xbd54b)
#4 0x7f635a4ad6bf in gloox::GnuTLSClient::getCertInfo() (/lib64/libgloox.so.13+0xbd6bf)
#5 0x7f635a4afd6c in gloox::GnuTLSBase::handshake() (/lib64/libgloox.so.13+0xbfd6c)
#6 0x7f635a4afbc0 in gloox::GnuTLSBase::decrypt(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (/lib64/libgloox.so.13+0xbfbc0)
#7 0x7f635a4490bb in gloox::ConnectionTCPClient::recv(int) (/lib64/libgloox.so.13+0x590bb)
#8 0x7f635a4c297d in gloox::ConnectionTCPBase::receive() (/lib64/libgloox.so.13+0xd297d)
#9 0x7f635a4541d7 in gloox::ClientBase::connect(bool) (/lib64/libgloox.so.13+0x641d7)
#10 0x4014d7 in main test.cc:8
#11 0x7f6358aa357f in __libc_start_main (/lib64/libc.so.6+0x2057f)
#12 0x401198 in _start (test+0x401198)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
==27028==ABORTING
I compiled via:
c++ -fsanitize=address -fsanitize=undefined -ggdb -std=c++14 -Wall -Wextra -Wpedantic -Wconversion -Wsign-conversion -lgloox main.cc
This looks like a bug in a recent version of gloox. Running the code under gdb or valgrind (without the sanitizer) shows a clean backtrace.
The full backtrace from valgrind points to the location of the problem:
==29533== at 0x62B558D: verify_crt (verify.c:602)
==29533== by 0x62B6F57: _gnutls_verify_crt_status (verify.c:936)
==29533== by 0x62B75CC: gnutls_x509_crt_verify (verify.c:1329)
==29533== by 0x4EF254B: gloox::GnuTLSClient::verifyAgainstCAs(gnutls_x509_crt_int*, gnutls_x509_crt_int**, int) (tlsgnutlsclient.cpp:227)
==29533== by 0x4EF26BF: gloox::GnuTLSClient::getCertInfo() (tlsgnutlsclient.cpp:157)
==29533== by 0x4EF4D6C: gloox::GnuTLSBase::handshake() (tlsgnutlsbase.cpp:138)
==29533== by 0x4EF4BC0: gloox::GnuTLSBase::decrypt(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (tlsgnutlsbase.cpp:70)
==29533== by 0x4E8E0BB: gloox::ConnectionTCPClient::recv(int) (connectiontcpclient.cpp:169)
==29533== by 0x4F0797D: gloox::ConnectionTCPBase::receive() (connectiontcpbase.cpp:115)
==29533== by 0x4E991D7: gloox::ClientBase::connect(bool) (clientbase.cpp:212)
==29533== by 0x400DAC: main (main.cc:9)
Backtrace from gdb shows:
#0 verify_crt (cert=0xbebebebebebebebe, trusted_cas=trusted_cas@entry=0x0, tcas_size=tcas_size@entry=0, flags=flags@entry=0, output=output@entry=0x7fffffffce90, _issuer=_issuer@entry=0x7fffffffce98,
now=1455405710, max_path=0x7fffffffce94, end_cert=true, nc=0x602000007a50, func=0x0) at verify.c:602
#1 0x00007ffff46bcf58 in _gnutls_verify_crt_status (certificate_list=certificate_list@entry=0x7fffffffcf08, clist_size=clist_size@entry=1, trusted_cas=trusted_cas@entry=0x0, tcas_size=tcas_size@entry=0,
flags=flags@entry=0, purpose=purpose@entry=0x0, func=0x0) at verify.c:936
#2 0x00007ffff46bd5cd in gnutls_x509_crt_verify (cert=cert@entry=0xbebebebebebebebe, CA_list=CA_list@entry=0x0, CA_list_length=CA_list_length@entry=0, flags=flags@entry=0, verify=verify@entry=0x7fffffffcf24)
at verify.c:1329
#3 0x00007ffff6bee54c in gloox::GnuTLSClient::verifyAgainstCAs (this=this@entry=0x61400000fc40, cert=0xbebebebebebebebe, CAList=CAList@entry=0x0, CAListSize=CAListSize@entry=0) at tlsgnutlsclient.cpp:227
#4 0x00007ffff6bee6c0 in gloox::GnuTLSClient::getCertInfo (this=0x61400000fc40) at tlsgnutlsclient.cpp:157
#5 0x00007ffff6bf0d6d in gloox::GnuTLSBase::handshake (this=0x61400000fc40) at tlsgnutlsbase.cpp:138
#6 0x00007ffff6bf0bc1 in gloox::GnuTLSBase::decrypt (this=0x61400000fc40,
data="\024\003\003\000\001\001\026\003\003\000(\373\336\267\221q\256\266\344\363\022\367 C\022\233\351\251\065\036\355\070\362\217\264\370\003\206+\"\201r^\355\067I\203Y\213\350\301")
at tlsgnutlsbase.cpp:70
#7 0x00007ffff6b8a0bc in gloox::ConnectionTCPClient::recv (this=<optimized out>, timeout=<optimized out>) at connectiontcpclient.cpp:169
#8 0x00007ffff6c0397e in gloox::ConnectionTCPBase::receive (this=0x60c00000bec0) at connectiontcpbase.cpp:115
#9 0x00007ffff6b951d8 in gloox::ClientBase::connect (this=0x7fffffffd410, block=<optimized out>) at clientbase.cpp:212
#10 0x00000000004013a8 in main () at main.cc:9
The pointer cert=0xbebebebebebebebe is the point of failure. It arrives there from frame 4, at tlsgnutlsclient.cpp:157, which contains this odd construct:
157 m_certInfo.chain = verifyAgainstCAs( cert[certListSize], 0 /*CAList*/, 0 /*CAListSize*/ );
cert[certListSize] clearly points past the end of the existing array. I tried to trace the bug in the sources, but I am not very skilled with the svn command-line tools, so I am leaving it to the reporter to file an upstream bug (I can do that too; let me know if there is anything I can do for you).
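To illustrate the off-by-one, here is a minimal sketch with a std::vector standing in for the certificate array. FakeCert and last_cert are hypothetical names, not gloox or GnuTLS types, and the assumption that the code meant to pick the last certificate of the chain is only a guess.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a certificate handle.
struct FakeCert { int id; };

// Returns the last certificate in the chain, or throws if the chain
// is empty. Valid indices are 0 .. size()-1, so chain[chain.size()]
// would be one past the end, like cert[certListSize] above.
const FakeCert& last_cert(const std::vector<FakeCert>& chain) {
    if (chain.empty())
        throw std::out_of_range("empty certificate chain");
    return chain[chain.size() - 1];
}
```

With a bounds-checked accessor such as chain.at(chain.size()), the same mistake raises std::out_of_range instead of reading an uninitialized pointer.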
I am having a problem with GCC's thread sanitizer that I cannot find on their bugzilla or on Stack Overflow, so I am unsure whether I am missing something or whether this really is a bug. If I create a main.cpp file containing:
#include <thread>
int main(){
std::thread t([](){});
t.join();
return 0;}
Now if I compile it using:
g++-4.9.2 -std=c++1y -fsanitize=thread -fPIE -pie -o TestProgram main.cpp
Running the resulting executable does not yield any problem. Yet if I add the debug info flag:
g++-4.9.2 -std=c++1y -fsanitize=thread -g -fPIE -pie -o TestProgram main.cpp
then the thread sanitizer detects a data race:
WARNING: ThreadSanitizer: data race (pid=22683)
Write of size 8 at 0x7d0c0000efd8 by thread T1:
#0 operator delete(void*) ../../../../gcc-4.9.2/libsanitizer/tsan/tsan_interceptors.cc:592 (libtsan.so.0+0x000000049490)
#1 deallocate /usr/local/include/c++/4.9.2/ext/new_allocator.h:110 (TestProgram+0x000000002089)
#2 deallocate /usr/local/include/c++/4.9.2/bits/alloc_traits.h:383 (TestProgram+0x000000001f78)
#3 _M_destroy /usr/local/include/c++/4.9.2/bits/shared_ptr_base.h:535 (TestProgram+0x0000000026f4)
#4 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /home/UserG/Compile/objdir/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/shared_ptr_base.h:166 (libstdc++.so.6+0x0000000b5c51)
#5 ~__shared_count /home/UserG/Compile/objdir/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/shared_ptr_base.h:666 (libstdc++.so.6+0x0000000b5c51)
#6 ~__shared_ptr /home/UserG/Compile/objdir/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/shared_ptr_base.h:914 (libstdc++.so.6+0x0000000b5c51)
#7 ~shared_ptr /home/UserG/Compile/objdir/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/shared_ptr.h:93 (libstdc++.so.6+0x0000000b5c51)
#8 execute_native_thread_routine ../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/thread.cc:95 (libstdc++.so.6+0x0000000b5c51)
Previous atomic write of size 4 at 0x7d0c0000efd8 by main thread:
#0 __tsan_atomic32_fetch_add ../../../../gcc-4.9.2/libsanitizer/tsan/tsan_interface_atomic.cc:468 (libtsan.so.0+0x0000000206ce)
#1 __exchange_and_add /usr/local/include/c++/4.9.2/ext/atomicity.h:49 (TestProgram+0x0000000014a0)
#2 __exchange_and_add_dispatch /usr/local/include/c++/4.9.2/ext/atomicity.h:82 (TestProgram+0x000000001557)
#3 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/local/include/c++/4.9.2/bits/shared_ptr_base.h:146 (TestProgram+0x000000002ceb)
#4 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/local/include/c++/4.9.2/bits/shared_ptr_base.h:666 (TestProgram+0x000000002cb6)
#5 std::__shared_ptr<std::thread::_Impl_base, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/local/include/c++/4.9.2/bits/shared_ptr_base.h:914 (TestProgram+0x000000002bc1)
#6 std::shared_ptr<std::thread::_Impl_base>::~shared_ptr() /usr/local/include/c++/4.9.2/bits/shared_ptr.h:93 (TestProgram+0x000000002bed)
#7 thread<main()::<lambda()> > /usr/local/include/c++/4.9.2/thread:135 (TestProgram+0x0000000016f1)
#8 main /home/UserG/main.cpp:3 (TestProgram+0x0000000015af)
Location is heap block of size 48 at 0x7d0c0000efd0 allocated by main thread:
#0 operator new(unsigned long) ../../../../gcc-4.9.2/libsanitizer/tsan/tsan_interceptors.cc:560 (libtsan.so.0+0x0000000496d2)
#1 allocate /usr/local/include/c++/4.9.2/ext/new_allocator.h:104 (TestProgram+0x000000001fe9)
#2 allocate /usr/local/include/c++/4.9.2/bits/alloc_traits.h:357 (TestProgram+0x000000001ecd)
#3 __shared_count<std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> > >, std::_Bind_simple<main()::<lambda()>()> > /usr/local/include/c++/4.9.2/bits/shared_ptr_base.h:616 (TestProgram+0x000000001d99)
#4 __shared_ptr<std::allocator<std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> > >, std::_Bind_simple<main()::<lambda()>()> > /usr/local/include/c++/4.9.2/bits/shared_ptr_base.h:1090 (TestProgram+0x000000001ccb)
#5 shared_ptr<std::allocator<std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> > >, std::_Bind_simple<main()::<lambda()>()> > /usr/local/include/c++/4.9.2/bits/shared_ptr.h:316 (TestProgram+0x000000001c5f)
#6 allocate_shared<std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> > >, std::_Bind_simple<main()::<lambda()>()> > /usr/local/include/c++/4.9.2/bits/shared_ptr.h:588 (TestProgram+0x000000001bf0)
#7 make_shared<std::thread::_Impl<std::_Bind_simple<main()::<lambda()>()> >, std::_Bind_simple<main()::<lambda()>()> > /usr/local/include/c++/4.9.2/bits/shared_ptr.h:604 (TestProgram+0x000000001ab0)
#8 _M_make_routine<std::_Bind_simple<main()::<lambda()>()> > /usr/local/include/c++/4.9.2/thread:193 (TestProgram+0x000000001919)
#9 thread<main()::<lambda()> > /usr/local/include/c++/4.9.2/thread:135 (TestProgram+0x0000000016bf)
#10 main /home/UserG/main.cpp:3 (TestProgram+0x0000000015af)
Thread T1 (tid=22685, running) created by main thread at:
#0 pthread_create ../../../../gcc-4.9.2/libsanitizer/tsan/tsan_interceptors.cc:877 (libtsan.so.0+0x000000047c03)
#1 __gthread_create /home/UserG/Compile/objdir/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:662 (libstdc++.so.6+0x0000000b5d00)
#2 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) ../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/thread.cc:142 (libstdc++.so.6+0x0000000b5d00)
#3 main /home/UserG/main.cpp:3 (TestProgram+0x0000000015af)
SUMMARY: ThreadSanitizer: data race /usr/local/include/c++/4.9.2/ext/new_allocator.h:110 deallocate
==================
ThreadSanitizer: reported 1 warnings
Now the exact same code compiled with clang++ (version 3.6.0 (trunk 221144)) does not detect a data race:
clang++ -std=c++1y -fsanitize=thread -g -fPIE -pie -o TestProgram main.cpp
I am a bit puzzled by this behavior from gcc because:
1) passing an empty lambda function as an argument to a thread seems legitimate to me
2) gcc's behavior depends on the -g flag which doesn't strike me as having much to do with the thread sanitizer
3) under similar circumstances clang adopts a behavior that I would consider correct
Many thanks,
You mean this bug? It was reported for 4.8, but a similar report exists for 4.9.2.
Clang has the -fsanitize-blacklist compile switch to suppress warnings from the ThreadSanitizer. Unfortunately, I cannot get it to work.
Here is an example that I want to suppress:
WARNING: ThreadSanitizer: data race (pid=21502)
Read of size 8 at 0x7f0dcf5b31a8 by thread T6:
#0 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&) /usr/include/tbb/partitioner.h:305 (exe+0x000000388b38)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
Previous write of size 8 at 0x7f0dcf5b31a8 by thread T1:
#0 auto_partition_type_base /usr/include/tbb/partitioner.h:299 (exe+0x000000388d9a)
#1 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#2 GhostSearch::Ghost3Search::SearchTask::execute_impl() /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1456 (exe+0x000000387a8a)
#3 <null> <null>:0 (libtbb.so.2+0x0000000224d9)
#4 GhostSearch::Ghost3Search::Ghost3SearchAlg::NullWindowSearch(int, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/ghost3/ghost3_search_alg.cpp:1640 (exe+0x000000388310)
#5 GhostSearch::PureMTDSearchAlg::FullWindowSearch(GhostSearch::SearchWindow, GhostSearch::SearchWindow, MOVE, int, std::vector<MOVE, std::allocator<MOVE> >&) /home/phil/ghost/search/pure_mtd_search_alg.cpp:41 (exe+0x000000370e3f)
#6 GhostSearch::PureSearchAlgWrapper::RequestHandlerThread::EnterHandlerMainLoop() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:124 (exe+0x000000372d1b)
#7 operator() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:94 (exe+0x000000374683)
#8 execute_native_thread_routine /home/phil/tmp/gcc/src/gcc-4.8-20130725/libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000b26cf)
Thread T6 (tid=21518, running) created by thread T3 at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 <null> <null>:0 (libtbb.so.2+0x0000000198c0)
Thread T1 (tid=21513, running) created by main thread at:
#0 pthread_create ??:0 (exe+0x0000002378e1)
#1 __gthread_create /home/phil/tmp/gcc/src/gcc-build/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:662 (libstdc++.so.6+0x0000000b291e)
#2 GhostSearch::PureSearchAlgWrapper::StartRequestHandlerThread() /home/phil/ghost/search/pure_search_alg_wrapper.cpp:77 (exe+0x0000003715c3)
#3 GhostSearch::Search::ExecuteSearch(GhostSearch::SEARCH_SETTINGS const&) /home/phil/ghost/search.cpp:243 (exe+0x00000033063f)
#4 GhostSearch::Search::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&, GhostInterfaces::UserInterface*) /home/phil/ghost/search.cpp:176 (exe+0x00000033037a)
#5 GhostInterfaces::UserInterface::StartSearch(GhostSearch::SEARCH_SETTINGS const&, UserBoard const&) /home/phil/ghost/interface.cpp:1072 (exe+0x0000002ea220)
#6 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:576 (exe+0x0000002e9464)
#7 GhostInterfaces::Command_Analyze::Execute(GhostInterfaces::UserInterfaceData&) /home/phil/ghost/commands.cpp:1005 (exe+0x00000028756c)
#8 GhostInterfaces::UserInterface::FinishNextCommand() /home/phil/ghost/interface.cpp:1161 (exe+0x0000002e9ed0)
#9 GhostInterfaces::UserInterface::MainLoop() /home/phil/ghost/interface.cpp:571 (exe+0x0000002e9447)
#10 main /home/phil/ghost/ghost.cpp:54 (exe+0x000000274efd)
SUMMARY: ThreadSanitizer: data race /usr/include/tbb/partitioner.h:305 tbb::interface6::internal::auto_partition_type_base<tbb::interface6::internal::auto_partition_type>::check_being_stolen(tbb::task&)
My attempts at the suppression file so far (they do not work):
# TBB
fun:tbb::*
src:/usr/include/tbb/partitioner.h
Do you know why it does not work?
(By the way, I would be happy to suppress all warnings from the TBB library.)
Finally, I got it working.
According to the documentation, each line must start with a valid "suppression_type" (race, thread, mutex, signal, deadlock, or called_from_lib).
In my example, the correct suppression_type is race.
Here is an example file called "sanitizer-thread-suppressions.txt", which suppresses two functions, which are known to contain data races:
race:Function1
race:MyNamespace::Function2
To test the suppression file, set the TSAN_OPTIONS environment variable and call the application (compiled with -fsanitize=thread):
$ TSAN_OPTIONS="suppressions=sanitizer-thread-suppressions.txt" ./myapp
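If that works, you can additionally exclude code at compile time. Note that the file passed to -fsanitize-blacklist uses its own entry format (src: and fun: lines), not the runtime suppression format shown above, so it has to be a separate file:
-fsanitize=thread -fsanitize-blacklist=<blacklist file>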
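The runtime suppressions can also be compiled into the binary, so that no environment variable is needed. This is a minimal sketch, assuming the ThreadSanitizer runtime in use honors the __tsan_default_suppressions hook (compiler-rt declares it as a weak function); the two function names in the string are the placeholders from the example file above.

```cpp
#include <cassert>
#include <cstring>

// Hook that the ThreadSanitizer runtime may call at startup to obtain
// built-in suppressions. The returned text uses the same syntax as the
// suppressions file: one "suppression_type:pattern" entry per line.
extern "C" const char* __tsan_default_suppressions() {
    return "race:Function1\n"
           "race:MyNamespace::Function2\n";
}
```

Without -fsanitize=thread the function is simply unused, so it can stay in the source unconditionally. Suppressions passed via TSAN_OPTIONS should still apply in addition to these.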
I have problems creating a ruby extension to export a C++ library I wrote to ruby under OSX. This simple example:
#include <boost/regex.hpp>
extern "C" void Init_bayeux()
{
boost::regex expression("^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");
}
results in a bad_cast exception being thrown:
#0 0x00000001014663bd in __cxa_throw ()
#1 0x00000001014cf6b2 in __cxa_bad_cast ()
#2 0x00000001014986f9 in std::use_facet<std::collate<char> > ()
#3 0x0000000101135a4f in boost::re_detail::cpp_regex_traits_base<char>::imbue (this=0x7fff5fbfe4d0, l=@0x7fff5fbfe520) at cpp_regex_traits.hpp:218
#4 0x0000000101138d42 in cpp_regex_traits_base (this=0x7fff5fbfe4d0, l=@0x7fff5fbfe520) at cpp_regex_traits.hpp:173
#5 0x000000010113eda6 in boost::re_detail::create_cpp_regex_traits<char> (l=@0x7fff5fbfe520) at cpp_regex_traits.hpp:859
#6 0x0000000101149bee in cpp_regex_traits (this=0x101600200) at cpp_regex_traits.hpp:880
#7 0x0000000101142758 in regex_traits (this=0x101600200) at regex_traits.hpp:75
#8 0x000000010113d68c in regex_traits_wrapper (this=0x101600200) at regex_traits.hpp:169
#9 0x000000010113bae1 in regex_data (this=0x101600060) at basic_regex.hpp:166
#10 0x000000010113981e in basic_regex_implementation (this=0x101600060) at basic_regex.hpp:202
#11 0x0000000101136e1a in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::do_assign (this=0x7fff5fbfe710, p1=0x100540ae0 "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?", p2=0x100540b19 "", f=0) at basic_regex.hpp:652
#12 0x0000000100540a66 in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::assign (this=0x7fff5fbfe710, p1=0x100540ae0 "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?", p2=0x100540b19 "", f=0) at basic_regex.hpp:379
#13 0x0000000100540a13 in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::assign (this=0x7fff5fbfe710, p=0x100540ae0 "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?", f=0) at basic_regex.hpp:364
#14 0x000000010054096e in basic_regex (this=0x7fff5fbfe710, p=0x100540ae0 "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?", f=0) at basic_regex.hpp:333
#15 0x00000001005407e2 in Init_bayeux () at bayeux.cpp:10
#16 0x0000000100004593 in dln_load (file=0x1008bc000 "/Users/todi/sioux/lib/debug/rack/bayeux.bundle") at dln.c:1293
I compile the extension with:
g++ ./source/rack/bayeux.cpp -o /Users/todi/sioux/obj/debug/rack/bayeux.o -Wall -pedantic -Wno-parentheses -Wno-sign-compare -fno-common -c -pipe -I/Users/todi/sioux/source -ggdb -O0
And finally link the dynamic library with:
g++ -o /Users/todi/sioux/lib/debug/rack/bayeux.bundle -bundle -ggdb /Users/todi/sioux/obj/debug/rack/bayeux.o -L/Users/todi/sioux/lib/debug -lrack -lboost_regex-mt-d -lruby
I've searched the web and tried all kinds of linker and compiler switches. If I build an executable, there is no such problem. Has anyone else had this problem and found a solution?
I've further investigated this and found that the function causing the exception looks like this:
std::locale loc = std::locale("C");
std::use_facet< std::collate<char> >( loc );
In the libstdc++ source of use_facet() I found the throw statement:
use_facet(const locale& __loc)
{
const size_t __i = _Facet::id._M_id();
const locale::facet** __facets = __loc._M_impl->_M_facets;
if (__i >= __loc._M_impl->_M_facets_size || !__facets[__i])
__throw_bad_cast();
#ifdef __GXX_RTTI
return dynamic_cast<const _Facet&>(*__facets[__i]);
#else
return static_cast<const _Facet&>(*__facets[__i]);
#endif
}
Does this make any sense to you?
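One way to probe this from the caller's side is to ask the locale whether it holds the facet before using it, and to catch std::bad_cast around the call. This is only a diagnostic sketch; has_collate and collate_usable are hypothetical helper names.

```cpp
#include <cassert>
#include <locale>
#include <typeinfo>

// True when the locale reports a std::collate<char> facet.
bool has_collate(const std::locale& loc) {
    return std::has_facet<std::collate<char>>(loc);
}

// True when std::use_facet succeeds, false when it throws std::bad_cast.
bool collate_usable(const std::locale& loc) {
    try {
        (void)std::use_facet<std::collate<char>>(loc);
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

In a normally linked executable both helpers should return true for the "C" locale. If they return false only inside the loaded bundle, the facet lookup itself is probably affected by how the bundle is loaded rather than by the locale's contents.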
Update: I've tried Jan's suggestion:
Todis-MacBook-Pro:rack todi$ g++ -shared -fpic -o bayeux.bundle bayeux.cpp
Todis-MacBook-Pro:rack todi$ ruby -I. -rbayeux -e 'puts :ok'
terminate called after throwing an instance of 'std::bad_cast'
what(): std::bad_cast
Abort trap
versions:
Todis-MacBook-Pro:rack todi$ ruby -v
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]
Todis-MacBook-Pro:rack todi$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/local/libexec/gcc/x86_64-apple-darwin10/4.5.2/lto-wrapper
Target: x86_64-apple-darwin10
Configured with: ../gcc-4.5.2/configure --prefix=/opt/local --build=x86_64-apple-darwin10 --enable-languages=c,c++,objc,obj-c++,fortran,java --libdir=/opt/local/lib/gcc45 --includedir=/opt/local/include/gcc45 --infodir=/opt/local/share/info --mandir=/opt/local/share/man --datarootdir=/opt/local/share/gcc-4.5 --with-local-prefix=/opt/local --with-system-zlib --disable-nls --program-suffix=-mp-4.5 --with-gxx-include-dir=/opt/local/include/gcc45/c++/ --with-gmp=/opt/local --with-mpfr=/opt/local --with-mpc=/opt/local --enable-stage1-checking --disable-multilib --enable-fully-dynamic-string
Thread model: posix
gcc version 4.5.2 (GCC)
Update:
It's not the bounds check in use_facet() that throws, but the next line, which actually does a dynamic_cast. The following example narrows it down to what may be an RTTI problem:
#define private public
#include <locale>
#include <iostream>
#include <typeinfo>
#include <cstdio> // for printf
extern "C" void Init_bayeux()
{
std::locale loc = std::locale("C");
printf( "size: %i\n", loc._M_impl->_M_facets_size );
printf( "id: %i\n", std::collate< char >::id._M_id() );
const std::locale::facet& fac = *loc._M_impl->_M_facets[ std::collate< char >::id._M_id() ];
printf( "name: %s\n", typeid( fac ).name());
printf( "name: %s\n", typeid( std::collate<char> ).name());
const std::type_info& a = typeid( fac );
const std::type_info& b = typeid( std::collate<char> );
printf( "equal: %i\n", !a.before( b ) && !b.before( a ) );
dynamic_cast< const std::collate< char >& >( fac );
}
I've used printf() because usage of cout also fails. The output of the code above is:
size: 28
id: 5
name: St7collateIcE
name: St7collateIcE
equal: 1
terminate called after throwing an instance of 'std::bad_cast'
what(): std::bad_cast
Abort trap
Build with:
g++ -shared -fpic -o bayeux.bundle bayeux.cpp && ruby -I. -rbayeux -e 'puts :ok'
Update:
If I rename Init_bayeux to main() and link it to an executable, the output is the same, but no call to terminate.
Update:
When I write a little program to load the shared library and to execute Init_bayeux(), again, no exception is thrown:
#include <dlfcn.h>
int main()
{
void* handle = dlopen("bayeux.bundle", RTLD_LAZY|RTLD_GLOBAL );
void(*f)(void) = (void(*)(void)) dlsym( handle, "Init_bayeux" ) ;
f();
}
So it looks to me as though it might be a problem with how the ruby executable was built. Does that make sense?
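The loader above ignores failures from dlopen() and dlsym(). A slightly more defensive variant, sketched below, reports them via dlerror(); call_init is a hypothetical helper name, and depending on the platform the program may need to be linked with -ldl.

```cpp
#include <cassert>
#include <cstdio>
#include <dlfcn.h>

// Loads `path`, looks up `symbol` and calls it as void(void).
// Returns false after printing the dlerror() message if any step fails.
bool call_init(const char* path, const char* symbol) {
    void* handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }
    dlerror();  // clear any stale error before dlsym
    void* sym = dlsym(handle, symbol);
    if (!sym) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is null");
        dlclose(handle);
        return false;
    }
    reinterpret_cast<void (*)()>(sym)();
    // The handle is left open on purpose, as in the original loader.
    return true;
}
```

Calling call_init("bayeux.bundle", "Init_bayeux") then distinguishes a bundle that cannot be loaded from one whose initializer fails.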
Update:
I had a look at the addresses containing the names of the two type_info objects: same content, but different addresses. I added the -flat_namespace switch to the link command, and now the dynamic_cast works. The original problem with the boost regex library still exists, but I think this might be solvable by linking boost statically into the shared library or by rebuilding the boost libraries with the -flat_namespace switch.
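The check described here can be written as two small helpers: one compares the type names by content, the other by address. This is a sketch of the diagnostic only; same_name and same_name_address are hypothetical names, and the idea that the runtime compares type_info by address in this setup is an assumption based on the observation above.

```cpp
#include <cassert>
#include <cstring>
#include <locale>
#include <typeinfo>

// True when both type_info objects carry the same name string.
bool same_name(const std::type_info& a, const std::type_info& b) {
    return std::strcmp(a.name(), b.name()) == 0;
}

// True when both name strings live at the same address. If this is
// false while same_name() is true, the type_info was emitted twice
// (for example once per loaded image).
bool same_name_address(const std::type_info& a, const std::type_info& b) {
    return a.name() == b.name();
}
```

Inside a single executable both helpers agree. In the failing case, passing typeid(fac) and typeid(std::collate<char>) from the example above would be expected to give true for the first and false for the second.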
Update:
Now I'm back to the very first example with the boost regex expression, built with this command:
g++ -shared -flat_namespace -fPIC -o bayeux.bundle /Users/todi/boost_1_49_0/stage/lib/libboost_regex.a bayeux.cpp
But when loading the extension into the ruby interpreter, the initialization of static symbols fails:
ruby(59384,0x7fff712b8cc0) malloc: *** error for object 0x7fff70b19500: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Program received signal SIGABRT, Aborted.
0x00007fff8a6ab0b6 in __kill ()
(gdb) bt
#0 0x00007fff8a6ab0b6 in __kill ()
#1 0x00007fff8a74b9f6 in abort ()
#2 0x00007fff8a663195 in free ()
#3 0x0000000100541023 in boost::re_detail::cpp_regex_traits_char_layer<char>::init (this=0x10060be50) at basic_string.h:237
#4 0x0000000100543904 in boost::object_cache<boost::re_detail::cpp_regex_traits_base<char>, boost::re_detail::cpp_regex_traits_implementation<char> >::do_get (k=@0x7fff5fbfddd0) at cpp_regex_traits.hpp:366
#5 0x000000010056005b in create_cpp_regex_traits<char> (l=<value temporarily unavailable, due to optimizations>) at pending/object_cache.hpp:69
#6 0x0000000100544c33 in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::do_assign (this=0x7fff5fbfe090, p1=0x100567158 "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?", p2=0x100567191 "", f=0) at cpp_regex_traits.hpp:880
#7 0x0000000100566280 in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::assign ()
#8 0x000000010056622d in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::assign ()
#9 0x0000000100566188 in boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::basic_regex ()
#10 0x0000000100566025 in Init_bayeux ()
#11 0x0000000100003a23 in dln_load (file=0x10201a000 "/Users/todi/sioux/source/rack/bayeux.bundle") at dln.c:1293
#12 0x000000010016569d in vm_pop_frame [inlined] () at /Users/todi/.rvm/src/ruby-1.9.2-p320/vm_insnhelper.c:1465
#13 0x000000010016569d in rb_vm_call_cfunc (recv=4303980440, func=0x100042520 <load_ext>, arg=4303803000, blockptr=0x1, filename=<value temporarily unavailable, due to optimizations>, filepath=<value temporarily unavailable, due to optimizations>) at vm.c:1467
#14 0x0000000100043382 in rb_require_safe (fname=4303904640, safe=0) at load.c:602
#15 0x000000010017cbf3 in vm_call_cfunc [inlined] () at /Users/todi/.rvm/src/ruby-1.9.2-p320/vm_insnhelper.c:402
#16 0x000000010017cbf3 in vm_call_method (th=0x1003016b0, cfp=0x1004ffef8, num=1, blockptr=0x1, flag=8, id=<value temporarily unavailable, due to optimizations>, me=0x10182cfa0, recv=4303980440) at vm_insnhelper.c:528
...
Again, this doesn't fail when I load the shared library with the little C program from above.
Update:
Now I link the first example statically:
g++ -shared -fPIC -flat_namespace -nodefaultlibs -o bayeux.bundle -static -lstdc++ -lpthread -lgcc_eh -lboost_regex-mt bayeux.cpp
With the same error:
ruby(15197,0x7fff708aecc0) malloc: *** error for object 0x7fff7027e500: pointer being freed was not allocated
otool -L confirmed that every library is linked statically:
bayeux.bundle:
bayeux.bundle (compatibility version 0.0.0, current version 0.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.11)
debug:
If I link against the boost debug version, then it works as expected.
For the record: I've now built boost and my application with the very same compiler (version 4.2.1, the official Apple version). No problems so far. Why it does not work as expected when the ruby extension links all libraries statically is a mystery to me. Thanks to all who put time into this issue.
Kind regards
Torsten