Boost ASIO segfault in release mode - c++

I have made this small illustrative code that exhibits the same issues the program I'm writing does: namely, it works fine in debug mode, segfaults in release. The problem seems to be that the ui_context, in release mode, when being called to run the work it has assigned, is nullptr.
Running on Fedora 33, with g++ (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) and clang version 11.0.0 (Fedora 11.0.0-2.fc33). Both compilers behave in the same way. Boost version is 1.75.
Code:
#include <iostream>
#include <vector>
#include <memory>
#include <chrono>
#include <thread>
#include <boost/asio.hpp>
#include <boost/signals2.hpp>
constexpr auto MAX_LOOP_COUNT = 100;
class network_client : public std::enable_shared_from_this<network_client>
{
private:
using Signal = boost::signals2::signal<void(int)>;
public:
network_client(boost::asio::io_context &context) :
strand(boost::asio::make_strand(context))
{
std::cout << "network client created" << std::endl;
}
void doNetworkWork()
{
std::cout << "doing network work" << std::endl;
boost::asio::post(strand,std::bind(&network_client::onWorkComplete,shared_from_this()));
}
void onWorkComplete()
{
std::this_thread::sleep_for(std::chrono::milliseconds(10));
std::cout << "signalling completion" << " from thread id:" << std::this_thread::get_id() << std::endl;
signal(42);
}
void workCompleteHandler(const typename Signal::slot_type &slot)
{
signal.connect(slot);
}
private :
boost::asio::strand<boost::asio::io_context::executor_type> strand;
Signal signal;
};
class network_client_producer
{
public :
network_client_producer() : work(boost::asio::make_work_guard(context))
{
using run_function = boost::asio::io_context::count_type (boost::asio::io_context::*)();
for (int i = 0; i < 2; i++)
{
context_threads.emplace_back(std::bind(static_cast<run_function>(&boost::asio::io_context::run), std::ref(context)));
}
}
~network_client_producer()
{
context.stop();
for(auto&& thread : context_threads)
{
if(thread.joinable())
{
thread.join();
}
}
}
using NetworkClientPtr = std::shared_ptr<network_client>;
NetworkClientPtr makeNetworkClient()
{
return std::make_shared<network_client>(context);
}
private :
boost::asio::io_context context;
std::vector<std::thread> context_threads;
boost::asio::executor_work_guard<boost::asio::io_context::executor_type> work;
};
class desktop : public std::enable_shared_from_this<desktop>
{
public:
desktop(const boost::asio::io_context::executor_type &executor):executor(executor)
{
}
void doSomeNetworkWork()
{
auto client = client_producer.makeNetworkClient();
client->workCompleteHandler([self = shared_from_this()](int i){
//post work into the UI thread
std::cout << "calling into the uiThreadWork with index " << i << " from thread id:" << std::this_thread::get_id() << std::endl;
boost::asio::post(self->executor, std::bind(&desktop::uiThreadWorkComplete, self, i));
});
client->doNetworkWork();
}
void showDesktop()
{
std::this_thread::sleep_for(std::chrono::milliseconds(20));
}
public:
void uiThreadWorkComplete(int i)
{
std::cout << "Called in the UI thread with index:" << i << ", on thread id:" << std::this_thread::get_id() << std::endl;
}
private:
const boost::asio::io_context::executor_type& executor;
network_client_producer client_producer;
};
int main()
{
std::cout << "Starting application. Main thread id:"<<std::this_thread::get_id() << std::endl;
int count = 0;
boost::asio::io_context ui_context;
auto work = boost::asio::make_work_guard(ui_context);
/*auto work = boost::asio::require(ui_context.get_executor(),
boost::asio::execution::outstanding_work.tracked);*/
auto ui_desktop = std::make_shared<desktop>(ui_context.get_executor());
ui_desktop->doSomeNetworkWork();
while(true)
{
ui_context.poll_one();
ui_desktop->showDesktop();
if (count >= MAX_LOOP_COUNT)
break;
count++;
}
ui_context.stop();
std::cout << "Stopping application" << std::endl;
return 0;
}
Compiling it with g++ -std=c++17 -g -o main -pthread -O3 main.cpp and
running it in gdb I get this:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Starting application. Main thread id:140737348183872
[New Thread 0x7ffff7a51640 (LWP 27082)]
[New Thread 0x7ffff7250640 (LWP 27083)]
network client created
doing network work
signalling completion from thread id:140737348179520
calling into the uiThreadWork with index 42 from thread id:140737348179520
Thread 2 "main" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7a51640 (LWP 27082)]
0x000000000040b7b8 in boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u>::execute<std::_Bind<void (desktop::*(std::shared_ptr<desktop>, int))(int)> >(std::_Bind<void (desktop::*(std::shared_ptr<desktop>, int))(int)>&&) const (this=<optimized out>, f=...) at /usr/local/include/boost/asio/impl/io_context.hpp:309
309 io_context_->impl_.post_immediate_completion(p.p,
Compiling without any optimizations (g++ -std=c++17 -g -o main -pthread -O0 main.cpp) works as expected.
I tried to keep it as close as I can to the real program that actually does network IO, which is why I have that strand in there.
It's obvious that I'm doing something horribly wrong here. The question is: what is the problem?
Thank you for any pointers.

Add the sanitizers -fsanitize=undefined,address:
Starting application. Main thread id:139902898299968
network client created
doing network work
signalling completion from thread id:139902399940352
calling into the uiThreadWork with index 42 from thread id:139902399940352
=================================================================
==29084==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffc8393d7c0 at pc 0x000000507ce0 bp 0x7f3d90d9e3d0 sp 0x7f3d90d9e3c8
READ of size 8 at 0x7ffc8393d7c0 thread T1
#0 0x507cdf in boost::asio::io_context::basic_executor_type<std::allocator<void>, ...
#1 0x507cdf in boost::asio::detail::initiate_post_with_executor<boost::asio::io_co...
#2 0x507cdf in auto boost::asio::post<boost::asio::io_context::basic_executor_type...
#3 0x5077cf in desktop::doSomeNetworkWork()::'lambda'(int)::operator()(int) const ...
#4 0x518ce2 in boost::function1<void, int>::operator()(int) const /home/sehe/custo...
#5 0x518481 in boost::signals2::detail::void_type boost::signals2::detail::call_wi...
#6 0x518481 in boost::signals2::detail::void_type boost::signals2::detail::variadi...
#7 0x517f43 in boost::signals2::detail::slot_call_iterator_t<boost::signals2::deta...
#8 0x516397 in void boost::signals2::optional_last_value<void>::operator()<boost::...
#9 0x516397 in void boost::signals2::detail::combiner_invoker<void>::operator()<bo...
#10 0x516397 in boost::signals2::detail::signal_impl<void (int), boost::signals2::...
#11 0x50d9d4 in network_client::onWorkComplete() /home/sehe/Projects/stackoverflow...
#12 0x51021d in void std::_Bind<void (network_client::* (std::shared_ptr<network_c...
#13 0x51021d in void boost::asio::asio_handler_invoke<std::_Bind<void (network_cli...
#14 0x51021d in void boost_asio_handler_invoke_helpers::invoke<std::_Bind<void (ne...
#15 0x51021d in boost::asio::detail::executor_op<std::_Bind<void (network_client::...
#16 0x51188e in boost::asio::detail::strand_executor_service::invoker<boost::asio:...
#17 0x514311 in void boost::asio::asio_handler_invoke<boost::asio::detail::strand_...
#18 0x514311 in void boost_asio_handler_invoke_helpers::invoke<boost::asio::detail...
#19 0x514311 in boost::asio::detail::executor_op<boost::asio::detail::strand_execu...
#20 0x4d8704 in boost::asio::detail::scheduler::do_run_one(boost::asio::detail::co...
#21 0x4d70dc in boost::asio::detail::scheduler::run(boost::system::error_code&) /h...
#22 0x523a6e in boost::asio::io_context::run() /home/sehe/custom/boost_1_75_0/boos...
#23 0x5258ef in unsigned long std::_Bind<unsigned long (boost::asio::io_context::*...
#24 0x5258ef in unsigned long std::__invoke_impl<unsigned long, std::_Bind<unsigne...
#25 0x5258ef in std::__invoke_result<std::_Bind<unsigned long (boost::asio::io_con...
#26 0x5258ef in unsigned long std::thread::_Invoker<std::tuple<std::_Bind<unsigned...
#27 0x7f3da660bd7f (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd0d7f)
#28 0x7f3da5f856da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
#29 0x7f3da5a9671e in clone /build/glibc-S7xCS9/glibc-2.27/misc/../sysdeps/unix/sy...
Address 0x7ffc8393d7c0 is located in stack of thread T0 at offset 224 in frame
#0 0x4cb30f in main /home/sehe/Projects/stackoverflow/test.cpp:109
This frame has 6 object(s):
[32, 40) 'ref.tmp.i85' (line 96)
[64, 80) 'ref.tmp.i'
[96, 112) 'ui_context' (line 113)
[128, 152) 'work' (line 114)
[192, 208) 'ui_desktop' (line 117)
[224, 240) 'ref.tmp' (line 117) <== Memory access at offset 224 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope /home/sehe/custom/boost_1_75_0/boost/asio/io_context.hpp:678:25 in boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u>::basic_executor_type(boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u> const&)
Shadow bytes around the buggy address:
0x10001071faa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001071fab0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2
0x10001071fac0: f8 f2 f2 f2 00 f2 f2 f2 00 00 f3 f3 00 00 00 00
0x10001071fad0: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
0x10001071fae0: 00 f2 f2 f2 f8 f8 f2 f2 00 00 f2 f2 00 00 00 f2
=>0x10001071faf0: f2 f2 f2 f2 00 00 f2 f2[f8]f8 f3 f3 00 00 00 00
0x10001071fb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001071fb10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001071fb20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001071fb30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001071fb40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
Thread T1 created by T0 here:
#0 0x483a6a in pthread_create (/home/sehe/Projects/stackoverflow/sotest+0x483a6a)
#1 0x7f3da660c014 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std...
#2 0x5241d4 in void std::vector<std::thread, std::allocator<std::thread> >::_M_realloc_ins...
#3 0x523517 in std::thread& std::vector<std::thread, std::allocator<std::thread> >::emplac...
==29084==ABORTING
There's your culprit.
Searching The Culprit
This frame has 6 object(s):
[32, 40) 'ref.tmp.i85' (line 96)
[64, 80) 'ref.tmp.i'
[96, 112) 'ui_context' (line 113)
[128, 152) 'work' (line 114)
[192, 208) 'ui_desktop' (line 117)
[224, 240) 'ref.tmp' (line 117) <== Memory access at offset 224 is inside this variable
What variable is that? Apparently in the line
auto ui_desktop = std::make_shared<desktop>(ui_context.get_executor());
there's a temporary that is being kept a reference to. It must be ui_context.get_executor() because ui_desktop is named and has "obvious" lifetime.
Sure enough, desktop declares its executor member by reference:
const boost::asio::io_context::executor_type& executor;
This is a clear error. Executors are not services nor execution contexts, and are designed to be cheaply copyable and passed by value. The fix is trivial:
boost::asio::io_context::executor_type executor;
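To illustrate why storing the handle by value is safe, here is a minimal sketch that does not use Boost at all. The types fake_context, fake_executor and desktop_by_value are hypothetical stand-ins invented for this example; they only mimic the relationship between an io_context and its executor (a small copyable handle pointing back at the context).

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an execution context (not the Boost type).
struct fake_context {
    int posted = 0;
};

// A cheap handle: copying it copies one pointer, similar in spirit to an executor.
struct fake_executor {
    fake_context* ctx;
    void post() const { ++ctx->posted; }
};

inline fake_executor get_executor(fake_context& c) { return fake_executor{&c}; }

// Stores the handle BY VALUE, so the object owns its own copy of the pointer.
class desktop_by_value {
public:
    explicit desktop_by_value(fake_executor ex) : executor_(ex) {}
    void notify() const { executor_.post(); }
private:
    fake_executor executor_;
};

// Returns how many notifications reached the context.
inline int run_by_value_demo() {
    fake_context ui;
    // The temporary returned by get_executor() dies at the end of this
    // statement, but the desktop already holds its own copy of the handle.
    auto d = std::make_shared<desktop_by_value>(get_executor(ui));
    d->notify();
    d->notify();
    return ui.posted;
}
```

Calling run_by_value_demo() from main should report two notifications. The context itself still has to outlive every handle that points at it; only the handle is copied.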
BONUS
As a bonus, here's a simplified version that runs the demo for half a second. Notes:
using a thread_pool instead of hand-rolling a flawed one
consider not using .stop() on execution contexts, or dropping the redundant work guards?
#include <iostream>
#include <chrono>
#include <functional>
#include <iomanip>
#include <memory>
#include <sstream>
#include <string>
#include <thread>
#include <boost/asio.hpp>
#include <boost/signals2.hpp>
namespace {
using namespace std::chrono_literals;
auto now = std::chrono::high_resolution_clock::now;
auto elapsed = [start=now()] { return (now()-start)/1ms; };
inline std::string thread_hash() {
static constexpr std::hash<std::thread::id> h{};
std::ostringstream oss;
oss << std::hex << std::setw(2) << std::setfill('0')
<< h(std::this_thread::get_id()) % 0xff;
return oss.str();
}
auto trace = [](auto const&... args) {
std::cout << "thread #" << thread_hash() << " at t+" << std::setw(3) << elapsed() << "ms\t";
(std::cout << ... << args) << std::endl;
};
} // namespace
struct network_client : std::enable_shared_from_this<network_client> {
explicit network_client(const boost::asio::any_io_executor& context) : strand(make_strand(context)) {
trace("network client created");
}
void doNetworkWork() {
trace("doing network work");
post(strand, std::bind(&network_client::onWorkComplete, shared_from_this()));
}
void onWorkComplete() {
std::this_thread::sleep_for(10ms);
trace("signalling completion");
signal(42);
}
template <typename F> void workCompleteHandler(F slot) {
signal.connect(std::move(slot));
}
private:
boost::asio::strand<boost::asio::any_io_executor> strand;
using Signal = boost::signals2::signal<void(int)>;
Signal signal;
};
struct network_client_producer {
auto makeNetworkClient() {
return std::make_shared<network_client>(context_threads.get_executor());
}
private :
boost::asio::thread_pool context_threads {2};
};
struct desktop : std::enable_shared_from_this<desktop> {
explicit desktop(boost::asio::io_context::executor_type executor) : executor(std::move(executor)) {}
void doSomeNetworkWork() {
auto client = client_producer.makeNetworkClient();
client->workCompleteHandler([this, self = shared_from_this()](int i) {
// post work into the UI thread
trace("calling into the uiThreadWork with index ", i);
post(executor, std::bind(&desktop::uiThreadWorkComplete, self, i));
});
client->doNetworkWork();
}
static void showDesktop() {
trace("showDesktop");
std::this_thread::sleep_for(20ms);
}
void uiThreadWorkComplete(int i) const {
trace("Called in the UI thread with index:", i);
}
private:
boost::asio::io_context::executor_type executor;
network_client_producer client_producer;
};
int main() {
trace("Starting application. Main thread is #", thread_hash());
boost::asio::io_context ui_context;
auto work = boost::asio::make_work_guard(ui_context);
/*auto work = boost::asio::require(ui_context.get_executor(),
boost::asio::execution::outstanding_work.tracked);*/
auto ui_desktop = std::make_shared<desktop>(ui_context.get_executor());
ui_desktop->doSomeNetworkWork();
for (auto deadline = now() + 0.5s; now() < deadline;) {
ui_context.poll_one();
ui_desktop->showDesktop();
}
trace("Stopping application");
work.reset();
ui_context.run();
// ui_context.stop();
trace("Bye\n");
}
Prints
thread #97 at t+ 0ms Starting application. Main thread is #97
thread #97 at t+ 0ms network client created
thread #97 at t+ 1ms doing network work
thread #97 at t+ 1ms showDesktop
thread #2d at t+ 11ms signalling completion
thread #2d at t+ 11ms calling into the uiThreadWork with index 42
thread #97 at t+ 21ms Called in the UI thread with index:42
thread #97 at t+ 21ms showDesktop
thread #97 at t+ 41ms showDesktop
thread #97 at t+ 61ms showDesktop
thread #97 at t+ 81ms showDesktop
thread #97 at t+101ms showDesktop
thread #97 at t+122ms showDesktop
thread #97 at t+142ms showDesktop
thread #97 at t+162ms showDesktop
thread #97 at t+182ms showDesktop
thread #97 at t+202ms showDesktop
thread #97 at t+222ms showDesktop
thread #97 at t+242ms showDesktop
thread #97 at t+262ms showDesktop
thread #97 at t+282ms showDesktop
thread #97 at t+302ms showDesktop
thread #97 at t+323ms showDesktop
thread #97 at t+343ms showDesktop
thread #97 at t+363ms showDesktop
thread #97 at t+383ms showDesktop
thread #97 at t+403ms showDesktop
thread #97 at t+423ms showDesktop
thread #97 at t+443ms showDesktop
thread #97 at t+463ms showDesktop
thread #97 at t+483ms showDesktop
thread #97 at t+503ms Stopping application
thread #97 at t+504ms Bye

The problem is that your executor is a reference to a temporary object. In your main method, you call ui_context.get_executor(), which returns a temporary object. You pass the temporary to the desktop constructor, which stores a reference to this object in the member variable executor. After the auto ui_desktop = ... line in main has completed, the temporary goes out-of-scope and the reference held by executor becomes invalid.
This problem is also detected when compiling your program with address sanitization enabled (-fsanitize=address):
==24629==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe8bee0270 at pc 0x5584b2d89b0c bp 0x7f3ac42fd970 sp 0x7f3ac42fd960
READ of size 8 at 0x7ffe8bee0270 thread T1
#0 0x5584b2d89b0b in boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u>::basic_executor_type(boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u> const&) /usr/include/boost/asio/io_context.hpp:678
...
I suspect that in your debug build the temporary object effectively lives slightly longer, i.e. the stack memory it occupied is not reused immediately after the temporary goes out of scope. In the release build, more aggressive optimizations cause that memory to be reused sooner, so the dangling reference points at unrelated data sooner and the program crashes once the reference is accessed.
To fix this, you have to ensure that the executor returned by get_executor does not go out-of-scope, so that the reference held by the ui_desktop object remains valid. For example, you could assign the result of get_executor to a variable in your main:
auto executor{ui_context.get_executor()};
auto ui_desktop = std::make_shared<desktop>(executor);
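The lifetime rule behind this can be checked without Boost. The following is a small sketch with hypothetical types (tracked and holder_by_ref are made up for the illustration): a temporary passed to a constructor is destroyed at the end of the statement, while a named variable stays alive for the rest of its scope. The dangling reference is deliberately never read.

```cpp
#include <cassert>

// Hypothetical type that reports its own destruction through a flag.
struct tracked {
    bool* destroyed;
    explicit tracked(bool* flag) : destroyed(flag) {}
    ~tracked() { *destroyed = true; }
};

// Mirrors the buggy pattern: the member is a reference, not a copy.
struct holder_by_ref {
    const tracked& ref;
    explicit holder_by_ref(const tracked& t) : ref(t) {}
};

// Returns true if the temporary was already destroyed once the
// statement that created the holder had finished.
inline bool temporary_dies_after_statement() {
    bool destroyed = false;
    holder_by_ref h(tracked{&destroyed});  // temporary bound to the constructor parameter
    (void)h;  // h.ref dangles from here on; it must not be read
    return destroyed;
}

// Returns true if a named object is still alive while the holder exists.
inline bool named_object_outlives_holder() {
    bool destroyed = false;
    tracked named{&destroyed};
    holder_by_ref h(named);
    (void)h;
    return !destroyed;
}
```

Both functions should return true. Keeping a named variable around works, but it only moves the lifetime burden to the caller; storing a copy in the class avoids the problem altogether.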

Initializing plog::RollingFileAppender on Windows XP Triggers Access Violation (Null Pointer)

Initializing a plog::RollingFileAppender triggers an access violation when using [plog][1] on Windows XP. In this case, the code is:
void LogInit(void)
{
static plog::RollingFileAppender<plog::TxtFormatter> fileAppender("log.log");
I am using Visual Studio 2019, but the project uses the platform toolset Visual Studio 2017 - Windows XP (v141_XP).
The output assembly is:
; COMDAT _LogInit
_TEXT SEGMENT
_status$1$ = -516 ; size = 4
_appender$66 = -516 ; size = 4
$T65 = -512 ; size = 256
$T64 = -512 ; size = 256
$T62 = -512 ; size = 256
$T60 = -512 ; size = 256
$T58 = -256 ; size = 256
$T57 = -256 ; size = 256
$T41 = -256 ; size = 256
_LogInit PROC ; COMDAT
; 108 : {
00000 55 push ebp
00001 8b ec mov ebp, esp
00003 83 e4 f8 and esp, -8 ; fffffff8H
; 109 : static plog::RollingFileAppender<plog::TxtFormatter> fileAppender("log.log");
00006 64 a1 00 00 00
00 mov eax, DWORD PTR fs:__tls_array
0000c 81 ec 04 02 00
00 sub esp, 516 ; 00000204H
00012 8b 0d 00 00 00
00 mov ecx, DWORD PTR __tls_index
00018 53 push ebx
00019 56 push esi
0001a 8b 34 88 mov esi, DWORD PTR [eax+ecx*4]
The null pointer access occurs because EAX (__tls_array) and ECX (__tls_index) are both null. Output from WinDbg:
TGLOBALFLAG: 70
APPLICATION_VERIFIER_FLAGS: 0
CONTEXT: (.ecxr)
eax=00000000 ebx=00000000 ecx=00000000 edx=7c90e4f4 esi=0012f624 edi=00000000
eip=1000366a esp=001afda4 ebp=001affb4 iopl=0 nv up ei pl nz ac pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010216
LogTest!LogInit+0x1a:
1000366a 8b3488 mov esi,dword ptr [eax+ecx*4] ds:0023:00000000=????????
Resetting default scope
EXCEPTION_RECORD: (.exr -1)
ExceptionAddress: 1000366a (LogTest!LogInit+0x0000001a)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000000
Parameter[1]: 00000000
Attempt to read from address 00000000
PROCESS_NAME: notepad.exe
READ_ADDRESS: 00000000
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.
EXCEPTION_CODE_STR: c0000005
EXCEPTION_PARAMETER1: 00000000
EXCEPTION_PARAMETER2: 00000000
FAULTING_LOCAL_VARIABLE_NAME: fileAppender
STACK_TEXT:
001affb4 7c80b713 00000000 00000000 0012f624 LogTest!LogInit+0x1a
001affec 00000000 10003650 00000000 00000000 kernel32!BaseThreadStart+0x37
STACK_COMMAND: ~1s; .ecxr ; kb
FAULTING_SOURCE_LINE: d:\test\logtest.cpp
FAULTING_SOURCE_FILE: d:\test\logtest.cpp
FAULTING_SOURCE_LINE_NUMBER: 109
FAULTING_SOURCE_CODE:
105:
106: // This is an example of an exported function.
107: LogInit_API void LogInit(void)
108: {
> 109: static plog::RollingFileAppender<plog::TxtFormatter> fileAppender(";pg.log");
110: plog::init(plog::info, &fileAppender);
111:
112:
113:
114:
SYMBOL_NAME: LogTest!LogInit+1a
MODULE_NAME: LogTest
IMAGE_NAME: LogTest.dll
FAILURE_BUCKET_ID: NULL_POINTER_READ_c0000005_LogTest.dll!LogInit
OS_VERSION: 5.1.2600.5512
BUILDLAB_STR: xpsp
OSPLATFORM_TYPE: x86
OSNAME: Windows XP
FAILURE_ID_HASH: {0218fa42-bce4-328f-5683-a7e3657927fc}
Followup: MachineOwner
---------
Code for affected class is:
namespace plog
{
template<class Formatter, class Converter = NativeEOLConverter<UTF8Converter> >
class PLOG_LINKAGE_HIDDEN RollingFileAppender : public IAppender
{
public:
RollingFileAppender(const util::nchar* fileName, size_t maxFileSize = 0, int maxFiles = 0)
: m_fileSize()
, m_maxFileSize()
, m_maxFiles(maxFiles)
, m_firstWrite(true)
{
setFileName(fileName);
setMaxFileSize(maxFileSize);
}
#ifdef _WIN32
RollingFileAppender(const char* fileName, size_t maxFileSize = 0, int maxFiles = 0)
: m_fileSize()
, m_maxFileSize()
, m_maxFiles(maxFiles)
, m_firstWrite(true)
{
setFileName(fileName);
setMaxFileSize(maxFileSize);
}
#endif
virtual void write(const Record& record)
{
util::MutexLock lock(m_mutex);
if (m_firstWrite)
{
openLogFile();
m_firstWrite = false;
}
else if (m_maxFiles > 0 && m_fileSize > m_maxFileSize && static_cast<size_t>(-1) != m_fileSize)
{
rollLogFiles();
}
size_t bytesWritten = m_file.write(Converter::convert(Formatter::format(record)));
if (static_cast<size_t>(-1) != bytesWritten)
{
m_fileSize += bytesWritten;
}
}
void setFileName(const util::nchar* fileName)
{
util::MutexLock lock(m_mutex);
util::splitFileName(fileName, m_fileNameNoExt, m_fileExt);
m_file.close();
m_firstWrite = true;
}
#ifdef _WIN32
void setFileName(const char* fileName)
{
setFileName(util::toWide(fileName).c_str());
}
#endif
void setMaxFiles(int maxFiles)
{
m_maxFiles = maxFiles;
}
void setMaxFileSize(size_t maxFileSize)
{
m_maxFileSize = (std::max)(maxFileSize, static_cast<size_t>(1000)); // set a lower limit for the maxFileSize
}
void rollLogFiles()
{
m_file.close();
util::nstring lastFileName = buildFileName(m_maxFiles - 1);
util::File::unlink(lastFileName.c_str());
for (int fileNumber = m_maxFiles - 2; fileNumber >= 0; --fileNumber)
{
util::nstring currentFileName = buildFileName(fileNumber);
util::nstring nextFileName = buildFileName(fileNumber + 1);
util::File::rename(currentFileName.c_str(), nextFileName.c_str());
}
openLogFile();
m_firstWrite = false;
}
private:
void openLogFile()
{
util::nstring fileName = buildFileName();
m_fileSize = m_file.open(fileName.c_str());
if (0 == m_fileSize)
{
size_t bytesWritten = m_file.write(Converter::header(Formatter::header()));
if (static_cast<size_t>(-1) != bytesWritten)
{
m_fileSize += bytesWritten;
}
}
}
util::nstring buildFileName(int fileNumber = 0)
{
util::nostringstream ss;
ss << m_fileNameNoExt;
if (fileNumber > 0)
{
ss << '.' << fileNumber;
}
if (!m_fileExt.empty())
{
ss << '.' << m_fileExt;
}
return ss.str();
}
private:
util::Mutex m_mutex;
util::File m_file;
size_t m_fileSize;
size_t m_maxFileSize;
int m_maxFiles;
util::nstring m_fileExt;
util::nstring m_fileNameNoExt;
bool m_firstWrite;
};
}
Are there code or compiler settings that can be modified to fix or remove the references to __tls_array / __tls_index?
This occurs in both debug and release builds.
[1]: https://github.com/SergiusTheBest/plog
Setting compiler option /Zc:threadSafeInit- removes the references to __tls_array and __tls_index and stops the access violation crash.
The Microsoft documentation for this option states:
In the C++11 standard, block scope variables with static or thread
storage duration must be zero-initialized before any other
initialization takes place. Initialization occurs when control first
passes through the declaration of the variable. If an exception is
thrown during initialization, the variable is considered
uninitialized, and initialization is re-attempted the next time
control passes through the declaration. If control enters the
declaration concurrently with initialization, the concurrent execution
blocks while initialization is completed. The behavior is undefined if
control re-enters the declaration recursively during initialization.
By default, Visual Studio starting in Visual Studio 2015 implements
this standard behavior. This behavior may be explicitly specified by
setting the /Zc:threadSafeInit compiler option.
The /Zc:threadSafeInit compiler option is on by default. The
/permissive- option does not affect /Zc:threadSafeInit.
Thread-safe initialization of static local variables relies on code
implemented in the Universal C run-time library (UCRT). To avoid
taking a dependency on the UCRT, or to preserve the non-thread-safe
initialization behavior of versions of Visual Studio prior to Visual
Studio 2015, use the /Zc:threadSafeInit- option. If you know that
thread-safety is not required, use this option to generate slightly
smaller, faster code around static local declarations.
Thread-safe static local variables use thread-local storage (TLS)
internally to provide efficient execution when the static has already
been initialized. The implementation of this feature relies on Windows
operating system support functions in Windows Vista and later
operating systems. Windows XP, Windows Server 2003, and older
operating systems do not have this support, so they do not get the
efficiency advantage. These operating systems also have a lower limit
on the number of TLS sections that can be loaded. Exceeding the TLS
section limit can cause a crash. If this is a problem in your code,
especially in code that must run on older operating systems, use
/Zc:threadSafeInit- to disable the thread-safe initialization code.
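The standard behaviour described in the quoted documentation (initialization on first pass through the declaration, and a retry if the initializer throws) can be observed with a few lines of portable C++. This is only a sketch; init_attempts, make_value and get_value are hypothetical names invented for the example, and it does not touch the concurrency or TLS aspects.

```cpp
#include <cassert>
#include <stdexcept>

// Counts how often the initializer of the block-scope static actually ran.
inline int& init_attempts() {
    static int attempts = 0;
    return attempts;
}

// Hypothetical initializer: fails on the first attempt, succeeds afterwards.
inline int make_value() {
    if (++init_attempts() == 1) {
        throw std::runtime_error("first initialization fails");
    }
    return 42;
}

inline int get_value() {
    // Initialized the first time control passes through this declaration.
    // If make_value() throws, the variable stays uninitialized and the
    // initialization is attempted again on the next call.
    static int value = make_value();
    return value;
}

// Returns the number of initialization attempts after three calls.
inline int run_static_init_demo() {
    try {
        get_value();          // attempt 1: throws
    } catch (const std::runtime_error&) {
    }
    int a = get_value();      // attempt 2: succeeds
    int b = get_value();      // already initialized, no further attempt
    assert(a == 42 && b == 42);
    return init_attempts();
}
```

run_static_init_demo() should return 2: one failed attempt and one successful one. The guard that tracks this "already initialized" state is what the compiler implements with TLS when /Zc:threadSafeInit is on.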

Crash dump analysis for multi threaded c++ application crash [closed]

This is a question about a generic scenario. I have a multi-threaded C++ application that crashed, and I have the crash dump. There might be hundreds of threads running, and any of them might have caused the crash.
What would be a good approach to start analyzing the crash dump?
Among the many threads recorded in the dump file, how do I find the specific thread that caused the crash? Are there specific criteria I should look for? I cannot go through all the threads and their stacks.
Any other useful information or clues are welcome.
Thank you very much in advance.
We call the following code a minimal reproducible example, and you should provide one as part of your question in the future.
It creates 100 threads and synchronizes them so that they all start running at the same time (needs C++20). Each thread throws an exception with a small random probability, so we don't know in advance which one will crash.
#include <chrono>
#include <exception>
#include <iostream>
#include <random>
#include <semaphore>
#include <thread>
#include <vector>
using namespace std::chrono_literals;
std::random_device rd;
std::mt19937 twister(rd());
std::uniform_int_distribution<int> dist(0, 100);
std::counting_semaphore<100> synchronizer(0);
void randomCrash()
{
synchronizer.acquire();
if (dist(twister) < 2)
{
throw std::exception();
}
std::this_thread::sleep_for(1000ms); // Ensure the thread is still there when we analyze the dump
}
int main()
{
std::vector<std::thread> threads;
for (int i = 0; i < 100; i++)
{
std::thread t(&randomCrash);
threads.push_back(std::move(t)); // Threads can't be copied, so move it
}
std::cout << "Created 100 threads.\r\n";
synchronizer.release(100);
std::cout << "100 threads running now.\r\n";
for (std::thread& th : threads)
{
if (th.joinable())
{
th.join();
}
}
std::cout << "Done. Ooops ... no exception happened? Well, that's randomness.\r\n";
}
If we now open the crash dump, we can see that it has already switched to thread 92 which caused the exception by looking at the prompt:
0:092>
But let's pretend that didn't work by using the command ~0s, so we're back on the main thread.
0:092> ~0s
ntdll!NtWaitForSingleObject+0x14:
00007ffc`cbcacc94 c3 ret
0:000>
Using the ~ command, you can identify the thread(s) which caused an exception:
0:000> ~
. 0 Id: 3edc.cc8 Suspend: 1 Teb: 0000004a`fb029000 Unfrozen
[...]
91 Id: 3edc.38d0 Suspend: 1 Teb: 0000004a`fb0df000 Unfrozen
# 92 Id: 3edc.2418 Suspend: 1 Teb: 0000004a`fb0e1000 Unfrozen
93 Id: 3edc.4788 Suspend: 1 Teb: 0000004a`fb0e3000 Unfrozen
[...]
103 Id: 3edc.43e4 Suspend: 1 Teb: 0000004a`fb0f7000 Unfrozen
The current thread has a dot (.) and the threads with an exception have a hash (#). Note that the dot may hide the hash if the current thread is the one which threw the exception. So you can easily switch to the thread
0:000> ~92s
ucrtbase!abort+0x4e:
00007ffc`c960286e cd29 int 29h
and look at the call stack
0:092> k
# Child-SP RetAddr Call Site
00 0000004a`80efe500 00007ffc`c9601f9f ucrtbase!abort+0x4e
01 0000004a`80efe530 00007ffc`b6e01aab ucrtbase!terminate+0x1f
02 0000004a`80efe560 00007ffc`b6e02317 VCRUNTIME140_1!FindHandler<__FrameHandler4>+0x45b [D:\...\frame.cpp # 693]
03 0000004a`80efe730 00007ffc`b6e040d9 VCRUNTIME140_1!__InternalCxxFrameHandler<__FrameHandler4>+0x267 [D:\...\frame.cpp # 357]
04 0000004a`80efe7d0 00007ffc`cbcb1f6f VCRUNTIME140_1!__CxxFrameHandler4+0xa9 [D:\...\risctrnsctrl.cpp # 306]
05 0000004a`80efe840 00007ffc`cbc61454 ntdll!RtlpExecuteHandlerForException+0xf
06 0000004a`80efe870 00007ffc`cbcb0a9e ntdll!RtlDispatchException+0x244
07 0000004a`80efef80 00007ffc`c96bd759 ntdll!KiUserExceptionDispatch+0x2e
08 0000004a`80eff6b0 00007ffc`a9f36480 KERNELBASE!RaiseException+0x69
09 0000004a`80eff790 00007ff7`49ec13fd VCRUNTIME140!_CxxThrowException+0x90 [D:\...\throw.cpp # 75]
0a 0000004a`80eff7f0 00007ff7`49ec1ecb WhichThreadCrashes!randomCrash+0x1bd [C:\...\WhichThreadCrashes.cpp # 19]
0b (Inline Function) --------`-------- WhichThreadCrashes!std::invoke+0x2 [C:\...\type_traits # 1585]
0c 0000004a`80eff850 00007ffc`c95b1bb2 WhichThreadCrashes!std::thread::_Invoke<std::tuple<void (__cdecl*)(void)>,0>+0xb [C:\...\thread # 55]
0d 0000004a`80eff880 00007ffc`cb7c7034 ucrtbase!thread_start<unsigned int (__cdecl*)(void *),1>+0x42
0e 0000004a`80eff8b0 00007ffc`cbc62651 kernel32!BaseThreadInitThunk+0x14
0f 0000004a`80eff8e0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
So we can see it crashes in randomCrash().
Once you know how it works, you can also switch to the thread with the exception directly by using ~#s:
0:092> ~0s
ntdll!NtWaitForSingleObject+0x14:
00007ffc`cbcacc94 c3 ret
0:000> ~#s
ucrtbase!abort+0x4e:
00007ffc`c960286e cd29 int 29h
0:092>
Also, !analyze -v should give you
0:000> !analyze -v
[...]
STACK_COMMAND: ~92s ; .ecxr ; kb
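If you save the output of the ~ command to a text file, the marker convention described above (a dot for the current thread, a hash for the thread with the exception) is easy to scan for. The helper below is a hypothetical sketch, not part of WinDbg; it assumes the marker is the first non-blank character on the line, as in the listing shown earlier.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Scans saved output of the `~` command and returns the number of the first
// thread marked with '#'. Assumption: the marker is the first non-blank
// character on the line, followed by the thread number.
inline std::optional<int> find_exception_thread(const std::string& listing) {
    std::istringstream lines(listing);
    std::string line;
    while (std::getline(lines, line)) {
        const auto pos = line.find_first_not_of(" \t");
        if (pos == std::string::npos || line[pos] != '#') continue;
        std::istringstream rest(line.substr(pos + 1));
        int thread_number = 0;
        if (rest >> thread_number) return thread_number;
    }
    return std::nullopt;  // no '#' marker found (it may be hidden by the dot)
}
```

For the listing in this answer it would yield 92. An empty result does not prove that no thread threw, because the dot of the current thread can hide the hash.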

Boost deadline_timer causes stack-buffer-overflow

I have been stuck on a really weird bug with Boost deadline_timer for the last few days.
Desktop: Ubuntu 18.04
Boost: v1.65.01
When I create a new deadline_timer within the constructor of my class AddressSanitizer catches a stack-buffer-overflow coming from inside the Boost libraries.
I have a few observations:
I also notice that something is wrong without AddressSanitizer: the timer either times out all the time because expiry_time is negative, or never expires. So it seems as if something somewhere is changing that memory region.
The class I am working with is quite big and is using the same Boost io_service to send data over UDP.
I am not able to reproduce the bug in just a standalone source file.
When I remove code to isolate the issue, it remains no matter how much I remove. I have gone down to just a main file that creates an io_service and a deadline_timer, and it still throws that error. If I duplicate that in another file and duplicate the CMakeLists entry, I am still not able to reproduce it.
The structure of the class is not very complicated and here is an example class which essentially does the same
udp_timer.hpp
#include "boost/asio.hpp"
class UdpTimer {
public:
UdpTimer();
~UdpTimer();
void run();
void timer_callback(const boost::system::error_code &e);
void udp_callback(const boost::system::error_code &e, size_t bytes_recvd);
boost::asio::io_service io;
private:
boost::asio::ip::udp::socket *socket;
boost::asio::ip::udp::endpoint *ep;
boost::asio::deadline_timer *timer;
char recv_buf[2048];
unsigned int tot_bytes_recved;
};
udp_timer.cpp
#include "udp_timer.hpp"
#include "boost/bind.hpp"
#include <iostream>
UdpTimer::UdpTimer() {
// Set up UDP part
ep = new boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(), 30042);
socket = new boost::asio::ip::udp::socket(io, *ep);
socket->async_receive_from(
boost::asio::buffer(recv_buf, 2048), *ep,
boost::bind(&UdpTimer::udp_callback, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred)
);
tot_bytes_recved = 0;
timer = new boost::asio::deadline_timer(io, boost::posix_time::seconds(1));
timer->async_wait(boost::bind(&UdpTimer::timer_callback, this, boost::asio::placeholders::error));
}
UdpTimer::~UdpTimer() {
delete ep;
delete socket;
delete timer;
}
void UdpTimer::run() {
io.run(); // Never returns
}
// Timer callback. Print info and reset timer
void UdpTimer::timer_callback(const boost::system::error_code &e) {
if (e) return;
static int count = 0;
std::cout <<"Timer Callback #" <<count++ <<"Bytes received = " <<tot_bytes_recved <<std::endl;
std::cout <<recv_buf <<std::endl;
timer->expires_from_now(boost::posix_time::seconds(1));
timer->async_wait(boost::bind(&UdpTimer::timer_callback, this, boost::asio::placeholders::error));
}
// Udp callback. Update bytes received count
void UdpTimer::udp_callback(const boost::system::error_code &e, size_t bytes_recvd) {
if (e) return;
tot_bytes_recved += bytes_recvd;
socket->async_receive_from(
boost::asio::buffer(recv_buf, 2048), *ep,
boost::bind(&UdpTimer::udp_callback, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred)
);
}
int main(void) {
UdpTimer udp_timer;
udp_timer.run();
}
This placed inside the program is enough to generate that error.
=================================================================
==20441==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe4a7621d0 at pc 0x55d73239950c bp 0x7ffe4a761f50 sp 0x7ffe4a761f40
WRITE of size 16 at 0x7ffe4a7621d0 thread T0
#0 0x55d73239950b in boost::date_time::base_time<boost::posix_time::ptime, boost::date_time::split_timedate_system<boost::posix_time::posix_time_system_config> >::base_time(boost::gregorian::date const&, boost::posix_time::time_duration const&, boost::date_time::dst_flags) (/home/erl/dev/test/build/prog_ins+0x61950b)
#1 0x55d732396495 in boost::posix_time::ptime::ptime(boost::gregorian::date, boost::posix_time::time_duration) /usr/include/boost/date_time/posix_time/ptime.hpp:40
#2 0x55d7323d4855 in boost::date_time::microsec_clock<boost::posix_time::ptime>::create_time(tm* (*)(long const*, tm*)) /usr/include/boost/date_time/microsec_time_clock.hpp:116
#3 0x55d7323d12f6 in boost::date_time::microsec_clock<boost::posix_time::ptime>::universal_time() /usr/include/boost/date_time/microsec_time_clock.hpp:76
#4 0x55d7323cb501 in boost::asio::time_traits<boost::posix_time::ptime>::now() /usr/include/boost/asio/time_traits.hpp:48
#5 0x55d7323db197 in boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::expires_from_now(boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::implementation_type&, boost::posix_time::time_duration const&, boost::system::error_code&) (/home/erl/dev/test/build/prog_ins+0x65b197)
#6 0x55d7323d6a25 in boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> >::expires_from_now(boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::implementation_type&, boost::posix_time::time_duration const&, boost::system::error_code&) /usr/include/boost/asio/deadline_timer_service.hpp:129
#7 0x55d7323d2ca8 in boost::asio::basic_deadline_timer<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime>, boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >::basic_deadline_timer(boost::asio::io_service&, boost::posix_time::time_duration const&) /usr/include/boost/asio/basic_deadline_timer.hpp:187
#8 0x55d7323b7f22 in InsHandler::InsHandler(InsConfig*, spdlog::logger*) /home/erl/dev/test/src/InsHandler.cpp:57
#9 0x55d7323a3fb0 in main /home/erl/dev/test/src/prog_ins.cpp:74
#10 0x7f369ed89bf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
#11 0x55d7322894d9 in _start (/home/erl/dev/test/build/prog_ins+0x5094d9)
Address 0x7ffe4a7621d0 is located in stack of thread T0 at offset 480 in frame
#0 0x55d7323d426f in boost::date_time::microsec_clock<boost::posix_time::ptime>::create_time(tm* (*)(long const*, tm*)) /usr/include/boost/date_time/microsec_time_clock.hpp:80
This frame has 10 object(s):
[32, 34) '<unknown>'
[96, 98) '<unknown>'
[160, 162) '<unknown>'
[224, 228) 'd'
[288, 296) 't'
[352, 360) 'td'
[416, 424) '<unknown>'
[480, 488) '<unknown>' <== Memory access at offset 480 partially overflows this variable
[544, 560) 'tv'
[608, 664) 'curr'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/erl/dev/test/build/prog_ins+0x61950b) in boost::date_time::base_time<boost::posix_time::ptime, boost::date_time::split_timedate_system<boost::posix_time::posix_time_system_config> >::base_time(boost::gregorian::date const&, boost::posix_time::time_duration const&, boost::date_time::dst_flags)
Shadow bytes around the buggy address:
0x1000494e43e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000494e43f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
0x1000494e4400: f1 f1 f8 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f2 f2
0x1000494e4410: f2 f2 f8 f2 f2 f2 f2 f2 f2 f2 04 f2 f2 f2 f2 f2
0x1000494e4420: f2 f2 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2 f2 f2
=>0x1000494e4430: f2 f2 00 f2 f2 f2 f2 f2 f2 f2[00]f2 f2 f2 f2 f2
0x1000494e4440: f2 f2 00 00 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00
0x1000494e4450: 00 f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1000494e4460: 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 f2 f2 f2 f2
0x1000494e4470: 00 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
0x1000494e4480: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 f2 f2
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==20441==ABORTING
From this error printout it seems as if there is a bug in the Boost library that writes 16 bytes to something that only had 8 bytes allocated. But why should that surface so intermittently? I also notice that a few words are marked f8, which is stack-use-after-scope. Could that mean that another part of the program is using a pointer to a stack-allocated object after it has gone out of scope?
Running with valgrind gives me this
==27251== Conditional jump or move depends on uninitialised value(s)
==27251== at 0x578FA1: boost::date_time::int_adapter<long>::is_infinity() const (int_adapter.hpp:114)
==27251== by 0x5772A9: boost::date_time::int_adapter<long>::is_special() const (int_adapter.hpp:131)
==27251== by 0x5A1069: boost::date_time::counted_time_rep<boost::posix_time::millisec_posix_time_system_config>::is_special() const (time_system_counted.hpp:108)
==27251== by 0x59FCD3: boost::date_time::counted_time_system<boost::date_time::counted_time_rep<boost::posix_time::millisec_posix_time_system_config> >::add_time_duration(boost::date_time::counted_time_rep<boost::posix_time::millisec_posix_time_system_config> const&, boost::posix_time::time_duration) (time_system_counted.hpp:226)
==27251== by 0x59EA90: boost::date_time::base_time<boost::posix_time::ptime, boost::date_time::counted_time_system<boost::date_time::counted_time_rep<boost::posix_time::millisec_posix_time_system_config> > >::operator+(boost::posix_time::time_duration const&) const (time.hpp:163)
==27251== by 0x59E46B: boost::asio::time_traits<boost::posix_time::ptime>::add(boost::posix_time::ptime const&, boost::posix_time::time_duration const&) (time_traits.hpp:57)
==27251== by 0x5A1BEC: boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::expires_from_now(boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::implementation_type&, boost::posix_time::time_duration const&, boost::system::error_code&) (deadline_timer_service.hpp:161)
==27251== by 0x5A0811: boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> >::expires_from_now(boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::implementation_type&, boost::posix_time::time_duration const&, boost::system::error_code&) (deadline_timer_service.hpp:129)
==27251== by 0x59F20B: boost::asio::basic_deadline_timer<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime>, boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >::basic_deadline_timer(boost::asio::io_service&, boost::posix_time::time_duration const&) (basic_deadline_timer.hpp:187)
==27251== by 0x59DA57: OutputTimer::OutputTimer(boost::asio::io_service*, unsigned int, boost::function<OutputStates ()>) (output_timer.cpp:5)
==27251== by 0x5877D5: InsHandler::InsHandler(InsConfig*, spdlog::logger*) (InsHandler.cpp:57)
==27251== by 0x57B149: main (senti_ins.cpp:74)
==27251== Uninitialised value was created by a stack allocation
==27251== at 0x59FB3C: boost::date_time::microsec_clock<boost::posix_time::ptime>::create_time(tm* (*)(long const*, tm*)) (microsec_time_clock.hpp:80)
I am really lost here. There is really no connection between the changes I make to the source code and the resulting behaviour. I am able to remove the error by removing an inclusion of a totally unrelated header file. But the error surfaces again when including a mock_header with some function definitions and enums. So it really seems to be random when this error surfaces.
I would be extremely grateful for any advice on how to attack such a problem.
UPDATE to the edited question
I see loads of dynamic allocation (Why should C++ programmers minimize use of 'new'?).
I see repeated magic constants (1s, 2048), failure to NUL-terminate the recv_buf and then treating it as a C string, swallowing errors.
Removing all these:
Live On Coliru
Live On Wandbox
udp_timer.hpp
//#define BOOST_BIND_NO_PLACEHOLDERS
#include <boost/asio.hpp>
#include <array>
#include <chrono>
using boost::asio::ip::udp;
using namespace std::chrono_literals;
class UdpTimer {
public:
UdpTimer();
void run();
private:
using error_code = boost::system::error_code;
void timer_callback(error_code e);
void udp_callback(error_code e, size_t bytes_recvd);
void do_recv();
void do_timer();
boost::asio::io_service io;
udp::endpoint ep { {}, 30042 };
udp::socket socket { io, ep };
boost::asio::steady_timer timer { io };
std::array<char, 2048> recv_buf{};
unsigned int tot_bytes_recved = 0;
};
udp_timer.cpp
#include "udp_timer.hpp"
using namespace boost::asio::placeholders;
#include <boost/bind/bind.hpp>
#include <cassert>
#include <iostream>
#include <iomanip>
UdpTimer::UdpTimer() {
do_recv();
do_timer();
}
void UdpTimer::do_recv() {
socket.async_receive_from(boost::asio::buffer(recv_buf), ep,
boost::bind(&UdpTimer::udp_callback, this, error, bytes_transferred));
}
void UdpTimer::do_timer() {
timer.expires_from_now(1s);
timer.async_wait(boost::bind(&UdpTimer::timer_callback, this, error));
}
void UdpTimer::run() {
io.run(); // Never returns
}
// Timer callback. Print info and reset timer
void UdpTimer::timer_callback(error_code e)
{
if (e) {
std::cout << "timer_callback: " << e.message() << std::endl;
return;
}
static int count = 0;
std::cout << "Timer Callback #" << count++
<< " Bytes received = " << tot_bytes_recved << std::endl
<< " Last received: " << std::quoted(recv_buf.data()) << std::endl;
do_timer();
}
// Udp callback. Update bytes received count
void UdpTimer::udp_callback(error_code e, size_t bytes_recvd) {
if (e) {
std::cout << "udp_callback: " << e.message() << std::endl;
recv_buf[0] = '\0';
return;
}
// because you want to print the buffer, you will also want to make sure it
// is actually NUL terminated
assert(bytes_recvd < recv_buf.size());
recv_buf[bytes_recvd] = '\0';
tot_bytes_recved += bytes_recvd;
do_recv();
}
main.cpp
int main()
{
UdpTimer udp_timer;
udp_timer.run();
}
Running Demo, with ASAN+UBSAN enabled
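The NUL-termination point raised above can also be sidestepped by never treating the receive buffer as a C string. The following is a minimal standard-library sketch, not part of the original answer: `received_text` is a hypothetical helper that views exactly the bytes a receive reported, clamped to the buffer size.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not from the original answer): view exactly the bytes
// that a receive reported, clamped to the buffer size, so that printing never
// depends on a NUL terminator being present.
template <std::size_t N>
std::string_view received_text(const std::array<char, N>& buf,
                               std::size_t bytes_recvd) {
    return std::string_view(buf.data(), std::min(bytes_recvd, N));
}
```

In a handler like udp_callback, one would stream `received_text(recv_buf, bytes_recvd)` instead of `recv_buf.data()`. This assumes C++17 for std::string_view.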
OLD ANSWER:
boost::asio::io_service io2;
boost::asio::deadline_timer* t = new boost::asio::deadline_timer(io2, boost::posix_time::seconds(1));
This is merely a memory leak, but in the absence of other code it cannot possibly lead to any symptom, simply because no more code is generated: Live On Compiler Explorer
Now all the other observations make you suspicious. And rightfully so!
I am not able to reproduce the bug in just a standalone source file.
This is the key. There is Undefined Behaviour in your code. It may or may not have something to do with the timer, but it certainly isn't caused by this instantiation.
One obvious problem with the code is the memory leak, and the fact that you're doing manual allocation in the first place. This opens up the door for lifetime issues.
E.g. it is conceivable that
you have these lines in a function, io2 goes out of scope, and the timer holds a stale reference to it.
In fact this directly corresponds to the "stack-use-after-scope" detection
many other scenarios, assuming that you also call t->async_wait() somewhere
Side observations are that io2 implies that you use two io services (why?). Besides all of this I hope you use better names in your real code, because it's really easy to get lost in a sea of io2, i, m3, t etc :)
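The stale-reference scenario described above can be avoided by letting one object own both the service and the timer by value. The sketch below uses only the standard library: `Service`, `Timer`, and `Owner` are hypothetical stand-ins (not Boost types) that record their destruction order, to show that a member declared later is destroyed first and so never outlives what it references.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for io_service: logs when it is destroyed.
struct Service {
    std::vector<std::string>& log;
    explicit Service(std::vector<std::string>& l) : log(l) {}
    ~Service() { log.push_back("service"); }
};

// Hypothetical stand-in for deadline_timer: keeps a reference to the service.
struct Timer {
    Service& svc;
    explicit Timer(Service& s) : svc(s) {}
    ~Timer() { svc.log.push_back("timer"); }  // service must still be alive here
};

// Members are destroyed in reverse declaration order, so the timer goes first.
struct Owner {
    Service service;  // declared first, destroyed last
    Timer timer;      // declared second, destroyed first
    explicit Owner(std::vector<std::string>& log) : service(log), timer(service) {}
};

std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    { Owner owner(log); }  // owner destroyed at the end of this block
    return log;
}
```

The same ordering is what the rewritten UdpTimer relies on when it declares io before socket and timer.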

Puzzled over memory leaks in MFC project that disappear if _CrtDumpMemoryLeaks() is never called

I have an MFC (C++) dialog-based project that is compiled with Visual Studio 2017. I've added the following code to track for possible memory leaks as I build it:
From within ProjectName.cpp before my CWinApp-derived class is initialized.
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
#include <Wtsapi32.h>
#pragma comment(lib, "Wtsapi32.lib")
struct CatchMemLeaks{
CatchMemLeaks()
{
HANDLE ghDebugLogFile = ::CreateFile(L".\\dbg_output.txt",
GENERIC_READ | GENERIC_WRITE,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
//Enable logging into that file
_CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
_CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE | _CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE));
_CrtSetReportFile(_CRT_WARN, ghDebugLogFile);
_CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE | _CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE));
_CrtSetReportFile(_CRT_ERROR, ghDebugLogFile);
_CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE | _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE));
_CrtSetReportFile(_CRT_ASSERT, ghDebugLogFile);
//Try to break on the error reported
_CrtSetBreakAlloc(75);
}
~CatchMemLeaks()
{
if(_CrtDumpMemoryLeaks())
{
DWORD dwRespMsgBx;
::WTSSendMessage(NULL, ::WTSGetActiveConsoleSessionId(),
L"MemLeak", lstrlen(L"MemLeak") * sizeof(WCHAR),
L"MemLeak", lstrlen(L"MemLeak") * sizeof(WCHAR),
MB_OK | MB_ICONERROR | MB_SYSTEMMODAL,
0, &dwRespMsgBx, TRUE);
}
}
};
CatchMemLeaks cml;
//Then the usual MFC CWinApp-app derived class stuff:
// CProjectNameApp
BEGIN_MESSAGE_MAP(CProjectNameApp, CWinApp)
ON_COMMAND(ID_HELP, &CWinApp::OnHelp)
END_MESSAGE_MAP()
// CProjectNameApp construction
CProjectNameApp::CProjectNameApp()
{
// support Restart Manager
m_dwRestartManagerSupportFlags = AFX_RESTART_MANAGER_SUPPORT_RESTART;
// TODO: add construction code here,
// Place all significant initialization in InitInstance
}
// The one and only CProjectNameApp object
CProjectNameApp theApp;
//....
Then when the project runs and then exits, I'm getting my WTSSendMessage triggered:
Which gives me the following output:
Detected memory leaks!
Dumping objects ->
{75} normal block at 0x0000029BA5EA75D0, 16 bytes long.
Data: < G > B0 86 D0 47 F7 7F 00 00 00 00 00 00 00 00 00 00
{74} normal block at 0x0000029BA5ECE930, 48 bytes long.
Data: <0 0 > 30 E9 EC A5 9B 02 00 00 30 E9 EC A5 9B 02 00 00
{73} normal block at 0x0000029BA5EA82F0, 16 bytes long.
Data: <p G > 70 86 D0 47 F7 7F 00 00 00 00 00 00 00 00 00 00
{72} normal block at 0x0000029BA5ECEA80, 48 bytes long.
Data: < > 80 EA EC A5 9B 02 00 00 80 EA EC A5 9B 02 00 00
{71} normal block at 0x0000029BA5EA8070, 16 bytes long.
Data: < G > 20 86 D0 47 F7 7F 00 00 00 00 00 00 00 00 00 00
{70} normal block at 0x0000029BA5E98BA0, 120 bytes long.
Data: < > A0 8B E9 A5 9B 02 00 00 A0 8B E9 A5 9B 02 00 00
Object dump complete.
But then on the next debug run, when I add the _CrtSetBreakAlloc(75); line shown in the code above, the breakpoint on allocation 75 never triggers, although the output remains the same.
Then another interesting discovery is that if I remove the _CrtDumpMemoryLeaks() function from my ~CatchMemLeaks destructor, those memory leaks go away.
PS. I know that this is something peculiar for this particular project because I don't get the same behavior if I try it with a stock MFC-dialog-based app.
Any idea how to track where those leaks are coming from?
Oh shoot, I got it. (Thanks to @RbMm in the comments!) The catch is to make this leak-detecting code initialize before (and uninitialize after) all other CRT and MFC constructors and other stuff. The trick is to use the #pragma init_seg(compiler) directive. My original mistake was to use it in the .cpp file where the CWinApp-derived class was defined. That caused a crash when the app was exiting because that #pragma directive applies to the entire .cpp file.
So the solution is to create separate .h and .cpp files for my CatchMemLeaks class and set that #pragma directive there, as such:
CatchMemLeaks.h file:
#pragma once
//Only debugger builds
#ifdef _DEBUG
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
#include <Strsafe.h>
#include <Wtsapi32.h>
#pragma comment(lib, "Wtsapi32.lib")
struct CatchMemLeaks{
CatchMemLeaks(int nMemLeakCodeToCatch);
~CatchMemLeaks();
};
#endif
and CatchMemLeaks.cpp file:
#include "StdAfx.h"
#include "CatchMemLeaks.h"
//Only debugger builds
#ifdef _DEBUG
#pragma warning( push )
#pragma warning( disable : 4074)
#pragma init_seg(compiler) //Make this code execute before any other code in this project (including other static constructors).
//This will also make its destructors run last.
//WARNING: Because of this do not call any CRT functions from this .cpp file!
#pragma warning( pop )
CatchMemLeaks cml(0); //Set to (0) to monitor memory leaks, or to any other value to break on a specific leak number
CatchMemLeaks::CatchMemLeaks(int nMemLeakNumberToBreakOn)
{
HANDLE ghDebugLogFile = ::CreateFile(L".\\dbg_output.txt",
GENERIC_READ | GENERIC_WRITE,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
//Enable logging into that file
_CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
_CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE | _CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE));
_CrtSetReportFile(_CRT_WARN, ghDebugLogFile);
_CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE | _CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE));
_CrtSetReportFile(_CRT_ERROR, ghDebugLogFile);
_CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE | _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE));
_CrtSetReportFile(_CRT_ASSERT, ghDebugLogFile);
if(nMemLeakNumberToBreakOn)
{
_CrtSetBreakAlloc(nMemLeakNumberToBreakOn);
}
}
CatchMemLeaks::~CatchMemLeaks()
{
//Dump memory leaks, if any
if(_CrtDumpMemoryLeaks())
{
DWORD dwRespMsgBx;
::WTSSendMessage(NULL, ::WTSGetActiveConsoleSessionId(),
L"MemLeak", lstrlen(L"MemLeak") * sizeof(WCHAR),
L"MemLeak", lstrlen(L"MemLeak") * sizeof(WCHAR),
MB_OK | MB_ICONERROR | MB_SYSTEMMODAL,
0, &dwRespMsgBx, TRUE);
}
}
#endif
then lastly include it in the stdafx.h file:
#include "CatchMemLeaks.h"
It's most likely that your CatchMemLeaks object is created and then destroyed before all other objects in your program are cleaned up, so in effect you're reporting a false positive (the other objects will get cleaned up afterwards).
But it's hard to tell without a fully running program.
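The false-positive mechanism can be sketched with block-scope objects, which follow the same rule as static objects within one translation unit: destruction happens in reverse order of construction. Everything here is hypothetical illustration code, not MFC or CRT API: `g_live` stands in for the debug heap's count of outstanding blocks, `Resource` for any object that frees its memory in its destructor, and `Checker` for CatchMemLeaks.

```cpp
#include <cassert>

int g_live = 0;  // hypothetical counter of outstanding allocations

// Stands in for any object that releases its memory in its destructor.
struct Resource {
    Resource() { ++g_live; }
    ~Resource() { --g_live; }
};

// Stands in for the leak checker: reports what is still live when it dies.
struct Checker {
    int& out;
    explicit Checker(int& o) : out(o) {}
    ~Checker() { out = g_live; }
};

// Returns the number of "leaks" the checker sees for a given construction order.
int leaks_seen(bool checker_constructed_first) {
    int seen = -1;
    g_live = 0;
    if (checker_constructed_first) {
        Checker c(seen);  // constructed first, so destroyed last
        Resource r;
    } else {
        Resource r;
        Checker c(seen);  // constructed last, so destroyed first while r is alive
    }
    return seen;
}
```

With the checker constructed first it reports 0, otherwise it reports 1 even though nothing actually leaks. In the MSVC setup above, #pragma init_seg(compiler) is what moves the checker to the front of the construction order.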

Segmentation fault on class destruction with boost::timer as a member of the class with periodic invocation

I'm working on a simple class which, upon creation, schedules a periodic timer for invoking one of its methods. The method is virtual, so that derived classes can override it with whatever periodic work they need.
In my test of this class, however, I randomly experience segmentation fault and can't figure out why. Here's the code and example of good and bad outputs:
#include <iostream>
#include <boost/asio/io_service.hpp>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/lock_guard.hpp>
#include <boost/asio/steady_timer.hpp>
#include <boost/chrono.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/function.hpp>
#include <boost/atomic.hpp>
#include <boost/make_shared.hpp>
#include <boost/bind.hpp>
//******************************************************************************
class PeriodicImpl;
class Periodic {
public:
Periodic(boost::asio::io_service& io, unsigned int periodMs);
~Periodic();
virtual unsigned int periodicInvocation() = 0;
private:
boost::shared_ptr<PeriodicImpl> pimpl_;
};
//******************************************************************************
class PeriodicImpl : public boost::enable_shared_from_this<PeriodicImpl>
{
public:
PeriodicImpl(boost::asio::io_service& io, unsigned int periodMs,
boost::function<unsigned int(void)> workFunc);
~PeriodicImpl();
void setupTimer(unsigned int intervalMs);
boost::atomic<bool> isRunning_;
unsigned int periodMs_;
boost::asio::io_service& io_;
boost::function<unsigned int(void)> workFunc_;
boost::asio::steady_timer timer_;
};
//******************************************************************************
Periodic::Periodic(boost::asio::io_service& io, unsigned int periodMs):
pimpl_(boost::make_shared<PeriodicImpl>(io, periodMs, boost::bind(&Periodic::periodicInvocation, this)))
{
std::cout << "periodic ctor " << pimpl_.use_count() << std::endl;
pimpl_->setupTimer(periodMs);
}
Periodic::~Periodic()
{
std::cout << "periodic dtor " << pimpl_.use_count() << std::endl;
pimpl_->isRunning_ = false;
pimpl_->timer_.cancel();
std::cout << "periodic dtor end " << pimpl_.use_count() << std::endl;
}
//******************************************************************************
PeriodicImpl::PeriodicImpl(boost::asio::io_service& io, unsigned int periodMs,
boost::function<unsigned int(void)> workFunc):
isRunning_(true),
io_(io), periodMs_(periodMs), workFunc_(workFunc), timer_(io_)
{
}
PeriodicImpl::~PeriodicImpl()
{
std::cout << "periodic impl dtor" << std::endl;
}
void
PeriodicImpl::setupTimer(unsigned int intervalMs)
{
std::cout << "schedule new " << intervalMs << std::endl;
boost::shared_ptr<PeriodicImpl> self(shared_from_this());
timer_.expires_from_now(boost::chrono::milliseconds(intervalMs));
timer_.async_wait([self, this](const boost::system::error_code& e){
std::cout << "hello invoke" << std::endl;
if (!e)
{
if (isRunning_)
{
std::cout << "invoking" << std::endl;
unsigned int nextIntervalMs = workFunc_();
if (nextIntervalMs)
setupTimer(nextIntervalMs);
}
else
std::cout << "invoke not running" << std::endl;
}
else
std::cout << "invoke cancel" << std::endl;
});
std::cout << "scheduled " << self.use_count() << std::endl;
}
//******************************************************************************
class PeriodicTest : public Periodic
{
public:
PeriodicTest(boost::asio::io_service& io, unsigned int periodMs):
Periodic(io, periodMs), periodMs_(periodMs), workCounter_(0){}
~PeriodicTest(){
std::cout << "periodic test dtor" << std::endl;
}
unsigned int periodicInvocation() {
std::cout << "invocation " << workCounter_ << std::endl;
workCounter_++;
return periodMs_;
}
unsigned int periodMs_;
unsigned int workCounter_;
};
//******************************************************************************
int main()
{
boost::asio::io_service io;
boost::shared_ptr<boost::asio::io_service::work> work(new boost::asio::io_service::work(io));
boost::thread t([&io](){
io.run();
});
unsigned int workCounter = 0;
{
PeriodicTest p(io, 50);
boost::this_thread::sleep_for(boost::chrono::milliseconds(550));
workCounter = p.workCounter_;
}
work.reset();
//EXPECT_EQ(10, workCounter);
}
Good output:
hello invoke
invoking
invocation 9
schedule new 50
scheduled 5
periodic test dtor
periodic dtor 2
periodic dtor end 2
hello invoke
invoke cancel
periodic impl dtor
Bad output:
hello invoke
invoking
invocation 9
schedule new 50
scheduled 5
periodic test dtor
periodic dtor 2
periodic dtor end 2
periodic impl dtor
Segmentation fault: 11
Apparently, the segmentation fault happens because PeriodicImpl is destroyed, and with it its timer timer_. But the timer is still scheduled, and this leads to the SEGFAULT. I can't understand why the PeriodicImpl destructor is called in this case, because a shared_ptr to PeriodicImpl was copied into the lambda passed as the timer's handler during the setupTimer call, and this should have retained PeriodicImpl and prevented the destructor invocation.
Any ideas?
The problem turned out not to be in the code in question at all, but in the code that tested it.
I enabled saving a core dump file by running ulimit -c unlimited and then used lldb to read it:
$ lldb bin/tests/test-segment-controller -c /cores/core.75876
(lldb) bt all
* thread #1: tid = 0x0000, 0x00007fff8eb800f9 libsystem_malloc.dylib`szone_malloc_should_clear + 2642, stop reason = signal SIGSTOP
* frame #0: 0x00007fff8eb800f9 libsystem_malloc.dylib`szone_malloc_should_clear + 2642
frame #1: 0x00007fff8eb7f667 libsystem_malloc.dylib`malloc_zone_malloc + 71
frame #2: 0x00007fff8eb7e187 libsystem_malloc.dylib`malloc + 42
frame #3: 0x00007fff9569923e libc++abi.dylib`operator new(unsigned long) + 30
frame #4: 0x000000010da4b516 test-periodic`testing::Message::Message(this=0x00007fff521e8450) + 38 at gtest.cc:946
frame #5: 0x000000010da4a645 test-periodic`testing::Message::Message(this=0x00007fff521e8450) + 21 at gtest.cc:946
frame #6: 0x000000010da6c027 test-periodic`std::string testing::internal::StreamableToString<long long>(streamable=0x00007fff521e84b0) + 39 at gtest-message.h:244
frame #7: 0x000000010da558e8 test-periodic`testing::internal::PrettyUnitTestResultPrinter::OnTestEnd(this=0x00007fe733421570, test_info=0x00007fe7334211c0) + 216 at gtest.cc:3141
frame #8: 0x000000010da56a28 test-periodic`testing::internal::TestEventRepeater::OnTestEnd(this=0x00007fe733421520, parameter=0x00007fe7334211c0) + 136 at gtest.cc:3321
frame #9: 0x000000010da53957 test-periodic`testing::TestInfo::Run(this=0x00007fe7334211c0) + 343 at gtest.cc:2667
frame #10: 0x000000010da540c7 test-periodic`testing::TestCase::Run(this=0x00007fe733421660) + 231 at gtest.cc:2774
frame #11: 0x000000010da5b5d6 test-periodic`testing::internal::UnitTestImpl::RunAllTests(this=0x00007fe733421310) + 726 at gtest.cc:4649
frame #12: 0x000000010da83263 test-periodic`bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(object=0x00007fe733421310, method=0x000000010da5b300, location="auxiliary test code (environments or event listeners)")(), char const*) + 131 at gtest.cc:2402
frame #13: 0x000000010da6cde1 test-periodic`bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(object=0x00007fe733421310, method=0x000000010da5b300, location="auxiliary test code (environments or event listeners)")(), char const*) + 113 at gtest.cc:2438
frame #14: 0x000000010da5b2a2 test-periodic`testing::UnitTest::Run(this=0x000000010dab18e8) + 210 at gtest.cc:4257
frame #15: 0x000000010da19541 test-periodic`RUN_ALL_TESTS() + 17 at gtest.h:2233
frame #16: 0x000000010da1818b test-periodic`main(argc=1, argv=0x00007fff521e88b8) + 43 at test-periodic.cc:57
frame #17: 0x00007fff9557b5c9 libdyld.dylib`start + 1
frame #18: 0x00007fff9557b5c9 libdyld.dylib`start + 1
thread #2: tid = 0x0001, 0x00007fff8ab404cd libsystem_pthread.dylib`_pthread_mutex_lock + 23, stop reason = signal SIGSTOP
frame #0: 0x00007fff8ab404cd libsystem_pthread.dylib`_pthread_mutex_lock + 23
frame #1: 0x000000010da1c8d5 test-periodic`boost::asio::detail::posix_mutex::lock(this=0x0000000000000030) + 21 at posix_mutex.hpp:52
frame #2: 0x000000010da1c883 test-periodic`boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>::scoped_lock(this=0x000000010e4fac38, m=0x0000000000000030) + 51 at scoped_lock.hpp:46
frame #3: 0x000000010da1c79d test-periodic`boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>::scoped_lock(this=0x000000010e4fac38, m=0x0000000000000030) + 29 at scoped_lock.hpp:45
frame #4: 0x000000010da227a7 test-periodic`boost::asio::detail::kqueue_reactor::run(this=0x0000000000000000, block=true, ops=0x000000010e4fbda8) + 103 at kqueue_reactor.ipp:355
frame #5: 0x000000010da2223c test-periodic`boost::asio::detail::task_io_service::do_run_one(this=0x00007fe733421900, lock=0x000000010e4fbd60, this_thread=0x000000010e4fbd98, ec=0x000000010e4fbe58) + 348 at task_io_service.ipp:368
frame #6: 0x000000010da21e9f test-periodic`boost::asio::detail::task_io_service::run(this=0x00007fe733421900, ec=0x000000010e4fbe58) + 303 at task_io_service.ipp:153
frame #7: 0x000000010da21d51 test-periodic`boost::asio::io_service::run(this=0x00007fff521e8338) + 49 at io_service.ipp:59
frame #8: 0x000000010da184b8 test-periodic`TestPeriodic_TestDestructionDifferentThread_Test::TestBody(this=0x00007fe733421e28)::$_0::operator()() const + 24 at test-periodic.cc:41
frame #9: 0x000000010da1846c test-periodic`boost::detail::thread_data<TestPeriodic_TestDestructionDifferentThread_Test::TestBody()::$_0>::run(this=0x00007fe733421c10) + 28 at thread.hpp:117
frame #10: 0x000000010da8849c test-periodic`boost::(anonymous namespace)::thread_proxy(param=<unavailable>) + 124 at thread.cpp:164
frame #11: 0x00007fff8ab4305a libsystem_pthread.dylib`_pthread_body + 131
frame #12: 0x00007fff8ab42fd7 libsystem_pthread.dylib`_pthread_start + 176
frame #13: 0x00007fff8ab403ed libsystem_pthread.dylib`thread_start + 13
Apparently, thread 2 causes the crash as it tries to lock a mutex which is already destroyed. However, I'm not using any mutexes, so this must be something internal to io_service. This might happen if io_service is still being used after its destruction. Looking closely at my main() function, I noticed that the thread t I created is left dangling, i.e. there is no join() call on it. Consequently, this sometimes creates a situation where the io object is already destroyed (at the end of main) but thread t still tries to use it.
Thus, the problem was fixed by adding a t.join() call at the end of the main() function:
int main()
{
boost::asio::io_service io;
boost::shared_ptr<boost::asio::io_service::work> work(new boost::asio::io_service::work(io));
boost::thread t([&io](){
io.run();
});
unsigned int workCounter = 0;
{
PeriodicTest p(io, 50);
boost::this_thread::sleep_for(boost::chrono::milliseconds(550));
workCounter = p.workCounter_;
}
work.reset();
t.join();
//EXPECT_EQ(10, workCounter);
}
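The same join-before-destruction rule can be shown without Boost. This is a minimal sketch using only the standard library: `Context` is a hypothetical stand-in for io_service whose run() loops until asked to stop, and `run_and_join` is an illustrative name. The worker thread only holds a reference to the stack object, so the function joins before that object goes out of scope.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for io_service: run() loops until told to stop.
struct Context {
    std::atomic<bool> stop{false};
    std::atomic<int> iterations{0};
    void run() {
        do {
            ++iterations;
            std::this_thread::yield();
        } while (!stop.load());
    }
};

// Returns the number of loop iterations the worker performed.
int run_and_join() {
    Context ctx;                           // lives on this function's stack
    std::thread t([&ctx] { ctx.run(); });  // worker only holds a reference
    ctx.stop.store(true);                  // plays the role of work.reset()
    t.join();                              // worker has finished before ctx dies
    return ctx.iterations.load();
}
```

Without the join, the worker could still be inside run() when ctx is destroyed, which is the situation the core dump above shows for io_service. Note that std::thread, unlike the boost::thread in the question, calls std::terminate if it is destroyed while still joinable, so the omission would fail loudly instead of intermittently.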
I ran your program. Regrettably, it failed to compile at first.
I added to your program and modified the following code:
timer_.expires_from_now(boost::chrono::milliseconds(intervalMs));
Modified:
timer_.expires_from_now(std::chrono::milliseconds(intervalMs));
With that change I get the same result as your "Good output", and I never get your "Bad output".