std::thread::join() hangs if called after main() exits when using VS2012 RC - c++

The following example runs successfully (i.e. doesn't hang) if compiled using Clang 3.2 or GCC 4.7 on Ubuntu 12.04, but hangs if I compile using VS11 Beta or VS2012 RC.
#include <iostream>
#include <string>
#include <thread>
#include "boost/thread/thread.hpp"
void SleepFor(int ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}

template<typename T>
class ThreadTest {
public:
    ThreadTest() : thread_([] { SleepFor(10); }) {}
    ~ThreadTest() {
        std::cout << "About to join\t" << id() << '\n';
        thread_.join();
        std::cout << "Joined\t\t" << id() << '\n';
    }
private:
    std::string id() const { return typeid(decltype(thread_)).name(); }
    T thread_;
};

int main() {
    static ThreadTest<std::thread> std_test;
    static ThreadTest<boost::thread> boost_test;
    // SleepFor(100);
}
The issue appears to be that std::thread::join() never returns if it is invoked after main has exited. It is blocked at WaitForSingleObject in _Thrd_join defined in cthread.c.
Uncommenting SleepFor(100); at the end of main allows the program to exit properly, as does making std_test non-static. Using boost::thread also avoids the issue.
So I'd like to know if I'm invoking undefined behaviour here (seems unlikely to me), or if I should be filing a bug against VS2012?

Tracing through Fraser's sample code in his Connect bug (https://connect.microsoft.com/VisualStudio/feedback/details/747145)
with VS2012 RTM seems to show a fairly straightforward case of deadlock. This probably isn't specific to std::thread - code that uses _beginthreadex directly likely suffers the same fate.
What I see in the debugger is the following:
On the main thread, the main() function has completed, the process cleanup code has acquired a critical section called _EXIT_LOCK1, called the destructor of ThreadTest, and is waiting (indefinitely) on the second thread to exit (via the call to join()).
The second thread's anonymous function completed and is in the thread cleanup code waiting to acquire the _EXIT_LOCK1 critical section. Unfortunately, due to the timing of things (whereby the second thread's anonymous function's lifetime exceeds that of the main() function) the main thread already owns that critical section.
DEADLOCK.
Anything that extends the lifetime of main() such that the second thread can acquire _EXIT_LOCK1 before the main thread does avoids the deadlock. That's why uncommenting the sleep in main() results in a clean shutdown.
Alternatively if you remove the static keyword from the ThreadTest local variable, the destructor call is moved up to the end of the main() function (instead of in the process cleanup code) which then blocks until the second thread has exited - avoiding the deadlock situation.
Or you could add a function to ThreadTest that calls join() and call that function at the end of main() - again avoiding the deadlock situation.
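For instance, a minimal sketch of that last variant (the Join() member and its placement are illustrative, reusing the SleepFor helper from the question):
#include <chrono>
#include <thread>

void SleepFor(int ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}

template<typename T>
class ThreadTest {
public:
    ThreadTest() : thread_([] { SleepFor(10); }) {}
    ~ThreadTest() { if (thread_.joinable()) thread_.join(); }  // no-op if already joined
    void Join()   { if (thread_.joinable()) thread_.join(); }
private:
    T thread_;
};

int main() {
    static ThreadTest<std::thread> std_test;
    std_test.Join();  // the worker is gone before process cleanup acquires _EXIT_LOCK1
}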

I realize this is an old question regarding VS2012, but the bug is still present in VS2013. For those who are stuck on VS2013, perhaps due to Microsoft's refusal to provide upgrade pricing for VS2015, I offer the following analysis and workaround.
The problem is that the mutex (at_thread_exit_mutex) used by _Cnd_do_broadcast_at_thread_exit() is either not yet initialized, or has already been destroyed, depending on the exact circumstances. In the former case, _Cnd_do_broadcast_at_thread_exit() tries to initialize the mutex during shutdown, causing a deadlock. In the latter case, where the mutex has already been destroyed via the atexit stack, the program will crash on the way out.
The solution I found is to explicitly call _Cnd_do_broadcast_at_thread_exit() (which thankfully is declared publicly) early during program startup. This has the effect of creating the mutex before anyone else tries to access it, as well as ensuring that the mutex continues to exist until the last possible moment.
So, to fix the problem, insert the following code at the bottom of a source module, for instance somewhere below main().
#pragma warning(disable:4073) // initializers put in library initialization area
#pragma init_seg(lib)

#if _MSC_VER < 1900
struct VS2013_threading_fix
{
    VS2013_threading_fix()
    {
        _Cnd_do_broadcast_at_thread_exit();
    }
} threading_fix;
#endif

I believe your threads have already been terminated and their resources freed following the termination of your main function and before static destruction. This is the behavior of the VC runtimes dating back to at least VC6.
Do child threads exit when the parent thread terminates
boost thread and process cleanup on windows

My answer is rather late, but I hope it will help someone.
I was stuck on this bug and found a trick that solves the problem; it worked in my code.
int main()
{
    ThreadTest<std::thread> trick_obj; // trick... You can put this line of code anywhere
    static ThreadTest<std::thread> std_test;
    return 1;
}

I have been battling this bug for a day, and found the following workaround, which turned out to be the least dirty trick:
Instead of returning, one can use the standard Windows API function call ExitThread() to terminate the thread. This method of course may mess up the internal state of the std::thread object and associated library, but since the program is going to terminate anyway, well, so be it.
#include <windows.h>

template<typename T>
class ThreadTest {
public:
    ThreadTest() : thread_([] { SleepFor(10); ExitThread(NULL); }) {}
    ~ThreadTest() {
        std::cout << "About to join\t" << id() << '\n';
        thread_.join();
        std::cout << "Joined\t\t" << id() << '\n';
    }
private:
    std::string id() const { return typeid(decltype(thread_)).name(); }
    T thread_;
};
The join() call apparently works correctly. However, I chose to use a safer method in our solution. One can get the thread HANDLE via std::thread::native_handle(). With this handle we can call the Windows API directly to join the thread:
WaitForSingleObject(thread_.native_handle(), INFINITE);
CloseHandle(thread_.native_handle());
Thereafter, the std::thread object must not be destroyed, because it is still joinable and its destructor would call std::terminate(). So we just leave the std::thread object dangling at program exit.
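For illustration, a sketch of what the destructor might look like with this approach, under the assumption (MSVC-specific) that native_handle() is the Windows HANDLE and that detach() on the already-finished thread only releases its bookkeeping, so nothing is left dangling:
#include <windows.h>
#include <thread>

template<typename T>
class ThreadTest {
public:
    ThreadTest() : thread_([] { /* work */ }) {}
    ~ThreadTest() {
        // Join via the Win32 API instead of std::thread::join().
        WaitForSingleObject(thread_.native_handle(), INFINITE);
        // detach() makes thread_ non-joinable, so ~thread() does nothing;
        // it also releases the handle, so no manual CloseHandle here.
        thread_.detach();
    }
private:
    T thread_;
};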

Related

How to delete singleton instance from separate module before system killed running threads

I have a C++ Windows program (compiled with Visual Studio 2019) that uses shared libraries. A shared library uses a singleton of a class that creates a thread. The class destructor stops the thread cleanly, so there should be no memory leak. However, I see that the destructor is invoked after the system has already killed all running threads on exit, so it is too late: the thread does not exit cleanly, and this introduces a memory leak (and possibly other problems, depending on the code being processed by the thread).
Here is a MCVE:
#include <thread>
#include <atomic>

class Single
{
public:
    static Single& GetInstance()
    {
        static Single single;
        return single;
    }

    int doSomething()
    {
        while ( !started )
            std::this_thread::sleep_for( std::chrono::milliseconds(100) );
        return 0;
    }

private:
    Single() :
        started( false ),
        continueThread( true )
    {
        thread = new std::thread( &Single::threadFunc, this );
    }

    ~Single()
    {
        continueThread = false;
        thread->join();
        delete thread;
    }

    void threadFunc()
    {
        started = true;
        while ( continueThread )
        {
            std::this_thread::sleep_for( std::chrono::milliseconds(1) );
        }
    }

    std::atomic_bool started;
    std::atomic_bool continueThread;
    std::thread* thread;
};

int main( int argc, char* argv[] )
{
    return Single::GetInstance().doSomething();
}
If this is copied to a single main.cpp file and executed, everything works fine. When ~Single is executed, in the debugger, I see the threadFunc thread is running and it gets stopped cleanly.
Now, if the Single definition and implementation are moved to a separate DLL, then when ~Single is executed I see in the debugger that the threadFunc thread is no longer running (the system already stopped it) and the code can't stop it cleanly. Visual Leak Detector then reports a memory leak.
Is there any flag (in code or at compiler level) that could be set to guarantee threads are not destroyed before the singleton gets deleted?
I know I could call a deinit function manually from the main function, but at some point main may not even know there is a singleton running a thread in the shared library it uses... the shared library itself should be able to exit cleanly.
No.
Multithreading, automatic cleanup, and DLL unloading are basically a huge mess on Windows once they interact.
The solution is to not have singletons, or any static lifetime variables (globals, local statics, class statics) with non-trivial destruction semantics. Make an instance of your thing in main()/WinMain(). Pass a reference to whoever needs it. Let the destructor clean it up before main exits and thus before everything gets unloaded.
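A minimal sketch of that idea against the question's code, assuming purely for illustration that Single is given an accessible constructor and destructor:
// The instance is owned by main(), so ~Single() (which joins the worker
// thread) runs before static/DLL cleanup kills the threads.
int main(int argc, char* argv[])
{
    Single single;                 // hypothetical: needs a public constructor
    return single.doSomething();   // destructor runs as main() returns
}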
Or simply ignore the memory leak. The process is exiting anyway.
This is a common case of SUOF (Static Uninitialization Order Fiasco), caused by giving up control over the object instance's lifetime by using a static local variable. The solution is to take back control of the instance's lifetime by adding a pair of initialization/uninitialization routines (probably wrapped in RAII) that ensure the object is created before its first use and destroyed after its last use, but before the DLL gets unloaded / the main function returns.
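A rough sketch of such routines; all names here are illustrative, and it assumes Single's constructor and destructor are made reachable by the library code:
// In the shared library: explicit lifetime control instead of a local static.
static Single* g_single = nullptr;

void LibInit()     { g_single = new Single(); }              // call once at startup
void LibShutdown() { delete g_single; g_single = nullptr; }  // call before exit
Single& Instance() { return *g_single; }

// In the host application: a small RAII guard keeps the pair balanced.
struct LibGuard {
    LibGuard()  { LibInit(); }
    ~LibGuard() { LibShutdown(); }
};

int main()
{
    LibGuard guard;                   // destroyed before main() returns,
    return Instance().doSomething();  // i.e. before the DLL is unloaded
}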

Program abort hangs the named mutex

I have several processes, but only one should be running at a time. This means that if, say, Process1 is running and Process2 gets launched, then Process2 should wait until Process1 is complete. I am considering boost named_mutex for this purpose. In order to avoid a scenario where the mutex may not get released if some exception is thrown, it looks like boost::lock_guard could be useful. I came up with the following simplified version of the code.
#include <iostream>
#include <boost/interprocess/sync/named_mutex.hpp>
#include <boost/thread.hpp>
#include <chrono>
#include <thread>

using namespace boost::interprocess;

#pragma warning(disable: 4996)

int main()
{
    std::cout << "Before taking lock" << std::endl;
    named_mutex mutex(open_or_create, "some_name");
    boost::lock_guard<named_mutex> guard(mutex);

    // Some work that is simulated by sleep
    std::cout << "now wait for 10 second" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(10));
    std::cout << "Hello World";
}
So far, so good. While this program is running, I hit Ctrl+C so the program gets aborted (a rough simulation of a program crash, unhandled exception, etc.). After that, when I run the application again, the program hangs on the following lines of code.
named_mutex mutex(open_or_create, "some_name");
boost::lock_guard<named_mutex> guard(mutex) ;
If I change the mutex name, then it works fine without hanging. However, it looks like the mutex named some_name is somehow "remembered" on the machine in some sort of bad state, so any application that tries to acquire a mutex named some_name hangs on this line of code. If I change the mutex name to, let's say, some_name2, the program works fine again.
Can someone please explain what is causing this behavior?
How can I reset the behavior for this particular mutex?
Most importantly, how to avoid this scenario in a real application?
As explained in this answer to the question linked by @ppetraki above, boost::interprocess::named_mutex, unfortunately, uses a file lock on Windows rather than an actual mutex. If your application terminates abnormally, that file lock will not be removed from the system. This is actually the subject of an open issue.
Looking at the source code, we see that, if BOOST_INTERPROCESS_USE_WINDOWS is defined, internal_mutex_type maps to a windows_named_mutex which, internally, uses a windows_named_sync, which seems to just be using a file lock in the end. I'm not sure what exactly is the rationale of this choice of implementation. Whatever it may be, there does not seem to be any way to get boost::interprocess to use a proper named mutex on Windows. I would suggest to simply create a named mutex yourself using CreateMutex, for example:
#include <type_traits>
#include <memory>
#include <stdexcept>
#include <mutex>
#include <iostream>

#define NOMINMAX
#define WIN32_LEAN_AND_MEAN
#include <windows.h>

struct CloseHandleDeleter { void operator ()(HANDLE h) const { CloseHandle(h); } };

class NamedMutex
{
    std::unique_ptr<std::remove_pointer_t<HANDLE>, CloseHandleDeleter> m;

public:
    NamedMutex(const wchar_t* name)
        : m(CreateMutexW(nullptr, FALSE, name))
    {
        if (!m)
            throw std::runtime_error("failed to create mutex");
    }

    void lock()
    {
        if (WaitForSingleObject(m.get(), INFINITE) == WAIT_FAILED)
            throw std::runtime_error("something bad happened");
    }

    void unlock()
    {
        ReleaseMutex(m.get());
    }
};

int main()
{
    try
    {
        NamedMutex mutex(L"blub");
        std::lock_guard lock(mutex);
        std::cout << "Hello, World!" << std::endl;
    }
    catch (...)
    {
        std::cerr << "something went wrong\n";
        return -1;
    }
    return 0;
}
Can someone please explain what is causing this behavior?
The mutex is global.
How can I reset the behavior for this particular mutex?
Call boost::interprocess::named_mutex::remove("mutex_name");
Most importantly, how to avoid this scenario in a real application?
It depends on what your outer problem is. Perhaps a more sensible solution is to use a file lock instead. A file lock will go away when a process is destroyed.
Updates:
I understand the mutex is global, but what happens with that mutex that causes the program to hang?
The first program acquired the mutex and never released it so the mutex is still held. Mutexes are typically held while shared state is put into an inconsistent state, so automatically releasing the mutex would be disastrous.
How can I determine whether that mutex_name is in a bad state, so it's time to call remove on it?
In your case you really can't because you picked the wrong tool for the job. The same logic you would use to tell if the mutex was in a sane state would just solve your whole problem, so the mutex just made things harder. Instead, use a file lock. It may be useful to write the process name and process ID into the file to help in troubleshooting.
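As a sketch of the file-lock alternative (the path is illustrative; boost::interprocess::file_lock requires the file to exist, and the PID write below is best-effort since it happens before the lock is taken):
#include <fstream>
#include <windows.h>
#include <boost/interprocess/sync/file_lock.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>

int main()
{
    const char* lockPath = "myapp.lock";

    // Ensure the file exists and record our PID to help troubleshooting.
    std::ofstream(lockPath) << GetCurrentProcessId() << '\n';

    boost::interprocess::file_lock flock(lockPath);
    boost::interprocess::scoped_lock<boost::interprocess::file_lock> guard(flock);

    // ... exclusive work; the OS releases the lock automatically if this
    // process crashes or is killed, so no stale state is left behind ...
}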

C++ program unexpectedly blocks / throws

I'm learning about mutexes in C++ and have a problem with the following code (taken from N. Josuttis' "The C++ Standard Library").
I don't understand why it blocks / throws unless I add this_thread::sleep_for in the main thread (then it doesn't block and all three calls are carried out).
The compiler is cl.exe used from the command line.
#include <future>
#include <mutex>
#include <iostream>
#include <string>
#include <thread>
#include <chrono>

std::mutex printMutex;

void print(const std::string& s)
{
    std::lock_guard<std::mutex> lg(printMutex);
    for (char c : s)
    {
        std::cout.put(c);
    }
    std::cout << std::endl;
}

int main()
{
    auto f1 = std::async(std::launch::async, print, "Hello from thread 1");
    auto f2 = std::async(std::launch::async, print, "Hello from thread 2");
    // std::this_thread::sleep_for(std::chrono::seconds(1));
    print(std::string("Hello from main"));
}
I think what you are seeing is an issue with the conformance of the MSVC implementation of async (in combination with future). I believe it is not conformant. I am able to reproduce it with VS2013, but unable to reproduce the issue with gcc.
The crash is because the main thread exits (and starts to clean up) before the other two threads complete.
Hence a simple delay (the sleep_for), or .get() or .wait() on the two futures, should fix it for you. So the modified main could look like:
int main()
{
    auto f1 = std::async(std::launch::async, print, "Hello from thread 1");
    auto f2 = std::async(std::launch::async, print, "Hello from thread 2");
    print(std::string("Hello from main"));
    f1.get();
    f2.get();
}
Favour the explicit wait or get over the timed "sleep".
Notes on the conformance
There was a proposal from Herb Sutter to change the wait or block on the shared state of the future returned from async. This may be the reason for the behaviour in MSVC; it could be seen as having implemented the proposal. I'm not sure what the final result of the proposal was, or how much of it (if any) was integrated into C++14. At least w.r.t. the blocking of the future returned from async, it looks like the MSVC behaviour did not make it into the specification.
It is interesting to note that the wording in §30.6.8/5 changed:
From C++11:
a call to a waiting function on an asynchronous return object that shares the shared state created by this async call shall block until the associated thread has completed, as if joined
To C++14:
a call to a waiting function on an asynchronous return object that shares the shared state created by this async call shall block until the associated thread has completed, as if joined, or else time out
I'm not sure how the "time out" would be specified, I would imagine it is implementation defined.
std::async returns a future. Its destructor blocks if get or wait has not been called:
it may block if all of the following are true: the shared state was created by a call to std::async, the shared state is not yet ready, and this was the last reference to the shared state.
See std::futures from std::async aren't special! for a detailed treatment of the subject.
Add these 2 lines at the end of main:
f1.wait();
f2.wait();
This will make sure the threads finish before main exits.

Actor calculation model using boost::thread

I'm trying to implement an Actor computation model on top of threads in C++ using boost::thread.
But the program throws a weird exception during execution. The exception isn't consistent, and sometimes the program works correctly.
Here is my code:
actor.hpp
class Actor {
public:
    typedef boost::function<int()> Job;

private:
    std::queue<Job>           d_jobQueue;
    boost::mutex              d_jobQueueMutex;
    boost::condition_variable d_hasJob;
    boost::atomic<bool>       d_keepWorkerRunning;
    boost::thread             d_worker;

    void workerThread();

public:
    Actor();
    virtual ~Actor();

    void execJobAsync(const Job& job);
    int  execJobSync(const Job& job);
};
actor.cpp
namespace {

int executeJobSync(std::string *error,
                   boost::promise<int> *promise,
                   const Actor::Job *job)
{
    int rc = (*job)();
    promise->set_value(rc);
    return 0;
}

} // namespace

void Actor::workerThread()
{
    while (d_keepWorkerRunning) try {
        Job job;
        {
            boost::unique_lock<boost::mutex> g(d_jobQueueMutex);
            while (d_jobQueue.empty()) {
                d_hasJob.wait(g);
            }
            job = d_jobQueue.front();
            d_jobQueue.pop();
        }
        job();
    }
    catch (...) {
        // Log error
    }
}

void Actor::execJobAsync(const Job& job)
{
    boost::mutex::scoped_lock g(d_jobQueueMutex);
    d_jobQueue.push(job);
    d_hasJob.notify_one();
}

int Actor::execJobSync(const Job& job)
{
    std::string error;
    boost::promise<int> promise;
    boost::unique_future<int> future = promise.get_future();
    {
        boost::mutex::scoped_lock g(d_jobQueueMutex);
        d_jobQueue.push(boost::bind(executeJobSync, &error, &promise, &job));
        d_hasJob.notify_one();
    }
    int rc = future.get();
    if (rc) {
        ErrorUtil::setLastError(rc, error.c_str());
    }
    return rc;
}

Actor::Actor()
: d_keepWorkerRunning(true)
, d_worker(&Actor::workerThread, this)
{
}

Actor::~Actor()
{
    d_keepWorkerRunning = false;
    {
        boost::mutex::scoped_lock g(d_jobQueueMutex);
        d_hasJob.notify_one();
    }
    d_worker.join();
}
The exception actually thrown is boost::thread_interrupted, on the int rc = future.get(); line. But from the Boost docs I can't see a reason for this exception. The docs say:
Throws: - boost::thread_interrupted if the result associated with *this is not ready at the point of the call, and the current thread is interrupted.
But my worker thread should never be in an interrupted state.
When I use gdb and set "catch throw", I see that the backtrace looks like this:
throw thread_interrupted
boost::detail::interruption_checker::check_for_interruption
boost::detail::interruption_checker::interruption_checker
boost::condition_variable::wait
boost::detail::future_object_base::wait_internal
boost::detail::future_object_base::wait
boost::detail::future_object::get
boost::unique_future::get
I looked into the Boost sources but can't work out why interruption_checker decided that the worker thread was interrupted. So, C++ gurus, please help me. What do I need to do to get correct code?
I'm using:
boost 1_53
Linux version 2.6.18-194.32.1.el5 Red Hat 4.1.2-48
gcc 4.7
EDIT
Fixed it! Thanks to Evgeny Panasyuk and Lazin. The problem was in TLS management. boost::thread and boost::thread_specific_ptr use the same TLS storage for their purposes. In my case there was a problem when they both tried to change this storage on creation (unfortunately I didn't work out the details of why it happens), so the TLS became corrupted.
I replaced boost::thread_specific_ptr in my code with a __thread variable.
Off-topic: during debugging I found memory corruption in an external library and fixed it =)
EDIT 2
I found the exact problem... It is a bug in GCC =)
The _GLIBCXX_DEBUG compilation flag breaks the ABI.
You can see the discussion on the Boost bug tracker:
https://svn.boost.org/trac/boost/ticket/7666
I have found several bugs:
The Actor::workerThread function double-unlocks d_jobQueueMutex: the first unlock is the manual d_jobQueueMutex.unlock();, the second happens in the destructor of boost::unique_lock<boost::mutex>.
You should prevent one of the unlocks, for example by releasing the association between the unique_lock and the mutex:
g.release(); // <------------ PATCH
d_jobQueueMutex.unlock();
Or add an additional code block plus a default-constructed Job.
It is possible that workerThread will never leave the following loop:
while (d_jobQueue.empty()) {
    d_hasJob.wait(g);
}
Imagine the following case: d_jobQueue is empty, Actor::~Actor() is called, it sets the flag and notifies the worker thread:
d_keepWorkerRunning = false;
d_hasJob.notify_one();
workerThread wakes up in the while loop, sees that the queue is empty, and sleeps again.
It is common practice to send a special final job to stop the worker thread:
~Actor()
{
    execJobSync([this]()->int
    {
        d_keepWorkerRunning = false;
        return 0;
    });
    d_worker.join();
}
In this case, d_keepWorkerRunning is not required to be atomic.
LIVE DEMO on Coliru
EDIT:
I have added the event queue code to your example.
You have a concurrent queue in both EventQueueImpl and Actor, but for different types. It is possible to extract the common part into a separate entity, concurrent_queue<T>, which works for any type. It would be much easier to debug and test the queue in one place than to catch bugs scattered across different classes.
So, you can try to use this concurrent_queue<T> (on Coliru).
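For reference, a rough sketch of the kind of concurrent_queue<T> meant here (this is not the code behind the Coliru link, just the general shape: one mutex plus one condition variable guarding a std::queue):
#include <queue>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

template <typename T>
class concurrent_queue {
    std::queue<T>             d_queue;
    boost::mutex              d_mutex;
    boost::condition_variable d_notEmpty;

public:
    void push(const T& value)
    {
        {
            boost::mutex::scoped_lock guard(d_mutex);
            d_queue.push(value);
        }
        d_notEmpty.notify_one();
    }

    T pop()  // blocks until an element is available
    {
        boost::mutex::scoped_lock guard(d_mutex);
        while (d_queue.empty()) {
            d_notEmpty.wait(guard);
        }
        T value = d_queue.front();
        d_queue.pop();
        return value;
    }
};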
This is just a guess. I think that some code may actually call boost::thread::interrupt(). You can set a breakpoint on this function and see what code is responsible. You can also test for interruption in execJobSync:
int Actor::execJobSync(const Job& job)
{
    if (boost::this_thread::interruption_requested())
        std::cout << "Interruption requested!" << std::endl;

    std::string error;
    boost::promise<int> promise;
    boost::unique_future<int> future = promise.get_future();
The most suspicious code in this case is code that holds a reference to the thread object.
It is good practice to make your boost::thread code interruption-aware anyway. It is also possible to disable interruption for some scope.
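A small sketch of the scoped-disable idea, using the RAII guard Boost provides for this (boost::this_thread::disable_interruption):
#include <boost/thread/thread.hpp>

void do_work_without_interruption()
{
    // While 'di' is alive, interruption points in this thread (waits,
    // sleeps, joins) will not throw boost::thread_interrupted.
    boost::this_thread::disable_interruption di;

    // ... code containing interruption points ...

}   // the previous interruption state is restored here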
If this is not the case, you need to check code that works with thread-local storage, because the thread interruption flag is stored in TLS. Maybe some of your code overwrites it. You can check for interruption before and after such a code fragment.
Another possibility is that your memory is corrupt. If no code is calling boost::thread::interrupt() and you don't work with TLS, this is the hardest case; try to use a dynamic analyzer such as Valgrind or Clang's memory sanitizer.
Off-topic:
You probably need to use a proper concurrent queue. std::queue will be very slow because of high memory contention, and you will end up with poor cache performance. A good concurrent queue allows your code to enqueue and dequeue elements in parallel.
Also, an actor is not something that is supposed to execute arbitrary code. An actor's queue should receive simple messages, not functions! You're writing a job queue :) You should take a look at an actor system like Akka or libcppa.

std::thread c++. More threads same data

I'm using Visual Studio 2012 and C++11. I don't understand why this does not work:
void client_loop(bool &run)
{
    while ( run );
}

int main()
{
    bool running = true;
    std::thread t(&client_loop, std::ref(running));
    running = false;
    t.join();
}
In this case, the loop in thread t never finishes, even though I explicitly set running to false. run and running refer to the same location. I tried making running a global variable, but nothing happened. I also tried passing a pointer instead, but nothing.
The threads use the same heap. I really don't understand. Can anyone help me?
Your program has Undefined Behavior, because it introduces a data race on the running variable (one thread writes it, another thread reads it).
You should use a mutex to synchronize access, or make running an atomic<bool>:
#include <iostream>
#include <thread>
#include <atomic>

void client_loop(std::atomic<bool> const& run)
{
    while (run.load());
}

int main()
{
    std::atomic<bool> running(true);
    std::thread t(&client_loop, std::ref(running));
    running = false;
    t.join();
    std::cout << "Arrived";
}
See a working live example.
The const probably doesn't affect the compiler's view of the code. In a single-threaded application, the value won't change (and this particular program is meaningless). In a multi-threaded application, since it's an atomic type, the compiler can't optimize out the load, so in fact there's no real issue here. It's really more a matter of style; since main modifies the value, and client_loop looks for that modification, it doesn't seem right to me to say that the value is const.