thread destructors in C++0x vs boost

I am currently reading the PDF "Designing MT programs". It explains that the user MUST explicitly call detach() on an object of class std::thread in C++0x before that object goes out of scope. If you don't call it, std::terminate() will be called and the application will die.
I usually use boost::thread for threading in C++. Correct me if I am wrong, but a boost::thread object detaches automatically when it goes out of scope.
It seems to me that the Boost approach follows the RAII principle and the std one doesn't.
Do you know if there is some particular reason for this?

This is indeed true, and the choice is explained in N3225 in a note regarding the std::thread destructor:
If joinable() then terminate(), otherwise no effects. [ Note:
Either implicitly detaching or joining
a joinable() thread in its
destructor could result in difficult
to debug correctness (for detach) or
performance (for join) bugs
encountered only when an exception is
raised. Thus the programmer must
ensure that the destructor is never
executed while the thread is still
joinable. —end note ]
Apparently the committee went for the lesser of two evils.
EDIT: I just found this interesting paper, which explains why the initial wording:
If joinable() then detach(), otherwise no effects.
was changed to the one quoted above.
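The "encountered only when an exception is raised" case from the note can be sketched roughly like this. It is a minimal illustration, not code from N3225: worker, may_throw, run_with_join and propagated are hypothetical names, and the point is only that the thread has to be joined on the exceptional path as well as the normal one.

```cpp
#include <atomic>
#include <stdexcept>
#include <thread>

std::atomic<int> finished_workers{0};   // hypothetical counter, only for the demo

void worker() { ++finished_workers; }

// Hypothetical stand-in for work done while the thread is running.
void may_throw(bool fail)
{
    if (fail)
        throw std::runtime_error("work failed");
}

// Joins the thread on both the normal and the exceptional path, so
// ~thread() never runs while the thread is still joinable.
void run_with_join(bool fail)
{
    std::thread t(worker);
    try {
        may_throw(fail);
    } catch (...) {
        t.join();   // without this, unwinding would destroy a joinable thread
        throw;      // and std::terminate() would be called
    }
    t.join();       // normal path
}

// Reports whether the exception reached the caller.
bool propagated(bool fail)
{
    try {
        run_with_join(fail);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Forgetting the join in the catch block is exactly the kind of bug that only shows up when something throws, which is presumably why the committee did not want the destructor to paper over it silently.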

Here's one way to implement RAII threads.
#include <memory>
#include <thread>
void run() { /* thread runs here */ }
struct ThreadGuard
{
    // Deleter for std::unique_ptr: joins the thread (if needed) before deleting it.
    void operator()(std::thread* thread) const
    {
        if (thread->joinable())
            thread->join(); // this is safe, but it blocks when scoped_thread goes out of scope
        //thread->detach(); // this is unsafe, check twice you know what you are doing
        delete thread;
    }
};
auto scoped_thread = std::unique_ptr<std::thread, ThreadGuard>(new std::thread(&run), ThreadGuard());
If you want to use this to detach a thread, read this first.

Related

Why must one call join() or detach() before thread destruction?

I don't understand why, when a std::thread is destructed, it must be in a joined or detached state.
Join waits for the thread to finish, and detach doesn't.
It seems that there is some middle state which I'm not understanding.
My understanding is that join and detach are complementary: if I don't call join(), then detach() is the default.
Put it this way, let's say you're writing a program that creates a thread and only later in the life of this thread you call join(), so up until you call join the thread was basically running as if it was detached, no?
Logically detach() should be the default behavior for threads because that is the definition of what threads are, they are parallelly executed irrespective of other threads.
So when the thread object gets destructed why is terminate() called? Why can't the standard simply treat the thread as being detached?
I don't understand the rationale behind terminating a program when neither join() nor detach() was called before the thread object was destructed. What is the purpose of this?
UPDATE:
I recently came across this. Anthony Williams states in his book, C++ Concurrency in Action, "One of the proposals for C++17 was for a joining_thread class that would be similar to std::thread, except that it would automatically join in the destructor much like scoped_thread does. This didn’t get consensus in the committee, so it wasn’t accepted into the standard (though it’s still on track for C++20 as std::jthread)..."

Technically the answer is "because the spec says so" but that is an obtuse answer. We can't read the designers' minds, but here are some issues that may have contributed:
With POSIX pthreads, child threads must be joined after they have exited, or else they continue to occupy system resources (like a process table entry in the kernel). This is done via pthread_join().
Windows has a somewhat analogous issue if the process holds a HANDLE to the child thread; although Windows doesn't require a full join, the process must still call CloseHandle() to release its refcount on the thread.
Since std::thread is a cross-platform abstraction, it is constrained by the POSIX requirement to join.
In theory the std::thread destructor could have called pthread_join() instead of calling std::terminate(), but that (subjectively) may increase the risk of deadlock, whereas a properly written program knows when it is safe to insert the join.
See also:
https://en.wikipedia.org/wiki/Zombie_process
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessa
https://learn.microsoft.com/en-us/windows/win32/procthread/terminating-a-process

You're getting confused because you're conflating the std::thread object with the thread of execution it refers to. A std::thread object is a C++ object (a bunch of bytes in memory) that acts as a reference to a thread of execution. When you call std::thread::detach what happens is that the std::thread object is "detached" from the thread of execution -- it no longer refers to (any) thread of execution, and the thread of execution continues running independently. But the std::thread object still exists, until it is destroyed.
When a thread of execution completes, it stores its exit info into the std::thread object that refers to it, if there is one. (If it was detached, then there isn't one, so the exit info is just thrown away.) It has no other effect on the std::thread object. In particular, the std::thread object is not destroyed and continues to exist until someone else destroys it.
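To make the distinction concrete, here is a small sketch that watches joinable() on the std::thread object while the thread of execution does its own thing. The States struct and observe_states are hypothetical names for this illustration, and the promise/future pair is only there so the result of the detached thread can be observed.

```cpp
#include <future>
#include <thread>
#include <utility>

struct States {
    bool before_start;    // default-constructed object: refers to no thread
    bool while_attached;  // after a thread of execution was started
    bool after_detach;    // after detach(): the object refers to nothing again
    int  result;          // value the detached thread still delivered
};

States observe_states()
{
    States s{};

    std::thread t;                        // a C++ object, no thread of execution yet
    s.before_start = t.joinable();

    std::promise<int> p;
    std::future<int> f = p.get_future();
    t = std::thread([](std::promise<int> pr) { pr.set_value(42); }, std::move(p));
    s.while_attached = t.joinable();

    t.detach();                           // the object lets go of the thread...
    s.after_detach = t.joinable();

    s.result = f.get();                   // ...but the thread ran on independently
    return s;                             // t is destroyed here without terminate()
}
```

The std::thread object t exists the whole time; only what it refers to changes.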

You might want a thread to completely clean up after itself when it's done leaving no traces. This would mean that you could start a thread and then forget about it.
But you might also want to be able to manage a thread while it was running and get any return value it had provided when it was done. In this case, if a thread cleaned up after itself when it was done, your attempt to manage it could cause a crash because you would be accessing a handle that might be invalid. And to check for the return value when the thread finishes, the return value has to be stored somewhere, which means the thread can't be fully cleaned up because the place where the return value is stored has to be left around.
In most frameworks, by default, you get the second option. You can manage the thread (by interrupting it, sending signals to it, joining it, or whatever) but it can't clean up after itself. If you prefer the first option, there's a function to get that behavior (detach) but that means that you may not be able to access the thread because it may or may not continue to exist.

When a thread handle for an active thread goes out of scope you have a couple of options:
join
detach
kill thread
kill program
Each one of these options is terrible. No matter which one you pick it will be surprising, confusing and not what you wanted in most situations.
Arguably the joining thread you mentioned already exists in the form of std::async which gives you a std::future that blocks until the created thread is done, so doing an implicit join. But the many questions about why
std::async(std::launch::async, f);
g();
does not run f and g concurrently indicate how confusing that is. The best approach I'm aware of is to define it to be a programming error and have the programmer fix it, so an assert would be most appropriate. Unfortunately the standard went with std::terminate instead.
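The std::async surprise can be reproduced with a sketch along these lines. The log parameter, record, discarded_future and kept_future are hypothetical additions for the demo, and the 100 ms sleep is just an assumption to make the ordering visible.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <mutex>
#include <string>
#include <thread>

std::mutex log_mutex;

void record(std::string& log, char c)
{
    std::lock_guard<std::mutex> lock(log_mutex);
    log += c;
}

void f(std::string& log)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    record(log, 'f');
}

void g(std::string& log) { record(log, 'g'); }

// The future returned by std::async is a temporary here. Its destructor
// blocks until f has finished, so g only starts afterwards.
std::string discarded_future()
{
    std::string log;
    (void)std::async(std::launch::async, f, std::ref(log));
    g(log);
    return log;
}

// Keeping the future alive lets g run while f is still sleeping.
std::string kept_future()
{
    std::string log;
    auto fut = std::async(std::launch::async, f, std::ref(log));
    g(log);
    fut.wait();
    return log;
}
```

With the discarded future the log reads "fg"; with the kept future g normally gets in first, although that ordering depends on timing rather than on any guarantee.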
If you really want a detaching thread just write a little wrapper around std::thread that does if (thread.joinable()) thread.detach(); in its destructor or whichever handler you want.
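A wrapper of that kind might look roughly like this. DetachingThread and run_detached are hypothetical names, and the sketch deliberately hands the thread everything it needs by value, since a detached thread must not rely on locals of the scope that started it.

```cpp
#include <future>
#include <thread>
#include <utility>

class DetachingThread {
public:
    explicit DetachingThread(std::thread t) : thread_(std::move(t)) {}
    DetachingThread(DetachingThread&&) = default;
    DetachingThread(const DetachingThread&) = delete;
    DetachingThread& operator=(const DetachingThread&) = delete;

    ~DetachingThread()
    {
        if (thread_.joinable())
            thread_.detach();   // instead of letting ~thread() call std::terminate()
    }

private:
    std::thread thread_;
};

int run_detached()
{
    std::promise<int> p;
    std::future<int> result = p.get_future();
    {
        DetachingThread guard(std::thread(
            [](std::promise<int> pr) { pr.set_value(7); }, std::move(p)));
    }   // guard is destroyed here: the thread is detached, not terminated
    return result.get();
}
```

Swapping detach() for join() in the destructor gives the joining flavour instead; which one you want is the policy decision the standard refuses to make for you.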

Question: "So when the thread object gets destructed why is terminate() called? Why can't the standard simply treat the thread as being detached?"
Answer: Yes, I agree that it terminates the program abruptly, but the design has its reasons. Without the std::terminate() call in std::thread::~thread, if the user really intended to call join() but for some reason the join was never executed (e.g. because an exception was thrown), the new thread would keep running in the background just as if detach() had been called. This might cause undefined behavior, because the user never intended to have a detached thread.

In Visual Studio, `thread_local` variables' destructor not called when used with std::async, is this a bug?

The following code
#include <chrono>
#include <iostream>
#include <future>
#include <thread>
#include <mutex>
std::mutex m;
struct Foo {
    Foo() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo Created in thread " <<std::this_thread::get_id() <<"\n";
    }
    ~Foo() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo Deleted in thread " <<std::this_thread::get_id() <<"\n";
    }
    void proveMyExistance() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo this = " << this <<"\n";
    }
};
int threadFunc() {
    static thread_local Foo some_thread_var;
    // Prove the variable initialized
    some_thread_var.proveMyExistance();
    // The thread runs for some time
    std::this_thread::sleep_for(std::chrono::milliseconds{100});
    return 1;
}
int main() {
    auto a1 = std::async(std::launch::async, threadFunc);
    auto a2 = std::async(std::launch::async, threadFunc);
    auto a3 = std::async(std::launch::async, threadFunc);
    a1.wait();
    a2.wait();
    a3.wait();
    std::this_thread::sleep_for(std::chrono::milliseconds{1000});
    return 0;
}
Compiled and run with clang on macOS:
clang++ test.cpp -std=c++14 -pthread
./a.out
Result:
Foo Created in thread 0x70000d9f2000
Foo Created in thread 0x70000daf8000
Foo Created in thread 0x70000da75000
Foo this = 0x7fd871d00000
Foo this = 0x7fd871c02af0
Foo this = 0x7fd871e00000
Foo Deleted in thread 0x70000daf8000
Foo Deleted in thread 0x70000da75000
Foo Deleted in thread 0x70000d9f2000
Compiled and run in Visual Studio 2015 Update 3:
Foo Created in thread 7180
Foo this = 00000223B3344120
Foo Created in thread 8712
Foo this = 00000223B3346750
Foo Created in thread 11220
Foo this = 00000223B3347E60
The destructors are not called.
Is this a bug or some undefined grey zone?
P.S.
If the sleep std::this_thread::sleep_for(std::chrono::milliseconds{1000}); at the end is not long enough, you may sometimes not see all 3 "Deleted" messages.
When using std::thread instead of std::async, the destructors get called on both platforms, and all 3 "Deleted" messages are always printed.
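The std::thread variant mentioned in the P.S. can be sketched as follows, with counters instead of printing. Tracked, thread_func and all_destroyed_after_join are hypothetical names; the sketch assumes each task gets its own std::thread and that every thread is joined before the counters are read.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> created{0};
std::atomic<int> destroyed{0};

struct Tracked {
    Tracked()  { ++created; }
    ~Tracked() { ++destroyed; }
};

void thread_func()
{
    // Constructed when each thread first reaches this declaration,
    // destroyed when that thread exits.
    thread_local Tracked per_thread;
}

// Returns true if every thread-local object had been destroyed by the
// time all three threads were joined.
bool all_destroyed_after_join()
{
    std::thread t1(thread_func), t2(thread_func), t3(thread_func);
    t1.join();
    t2.join();
    t3.join();
    return created.load() == 3 && destroyed.load() == 3;
}
```

Because each std::thread really exits, the thread-local objects are destroyed as part of thread exit, and join() does not return before that has happened.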

Introductory Note: I have now learned a lot more about this and have therefore rewritten my answer. Thanks to @super, @M.M and (latterly) @DavidHaim and @NoSenseEtAl for putting me on the right track.
tl;dr Microsoft's implementation of std::async is non-conformant, but they have their reasons, and what they have done can actually be useful once you understand it properly.
For those who don't want that, it is not too difficult to code up a drop-in replacement for std::async which works the same way on all platforms. I have posted one here.
Edit: Wow, how open MS are being these days, I like it, see: https://github.com/MicrosoftDocs/cpp-docs/issues/308
Let's begin at the beginning. cppreference has this to say (emphasis and strikethrough mine):
The template function async runs the function f asynchronously (potentially optionally in a separate thread which may be part of a thread pool).
However, the C++ standard says this:
If launch::async is set in policy, [std::async] calls [the function f] as if in a new thread of execution ...
So which is correct? The two statements have very different semantics as the OP has discovered. Well of course the standard is correct, as both clang and gcc show, so why does the Windows implementation differ? And like so many things, it comes down to history.
The (oldish) link that M.M dredged up has this to say, amongst other things:
... Microsoft has its implementation of [std::async] in the form of PPL (Parallel Pattern Library) ... [and] I can understand the eagerness of those companies to bend the rules and make these libraries accessible through std::async, especially if they can dramatically improve performance...
... Microsoft wanted to change the semantics of std::async when called with launch_policy::async. I think this was pretty much ruled out in the ensuing discussion ... (rationale follows, if you want to know more then read the link, it's well worth it).
And PPL is based on Windows' built-in support for ThreadPools, so @super was right.
So what does the Windows thread pool do and what is it good for? Well, it's intended to manage frequently-scheduled, short-running tasks in an efficient way, so point 1 is: don't abuse it. My simple tests show that if this is your use case then it can offer significant efficiencies. It does, essentially, two things:
It recycles threads, rather than having to always start a new one for each asynchronous task you launch.
It limits the total number of background threads it uses, after which a call to std::async will block until a thread becomes free. On my machine, this number is 768.
So knowing all that, we can now explain the OP's observations:
A new thread is created for each of the three tasks started by main() (because none of them terminates immediately).
Each of these three threads creates a new thread-local variable Foo some_thread_var.
These three tasks all run to completion but the threads they are running on remain in existence (sleeping).
The program then sleeps for a short while and then exits, leaving the 3 thread-local variables un-destructed.
I ran a number of tests and in addition to this I found a few key things:
When a thread is recycled, the thread-local variables are re-used. Specifically, they are not destroyed and then re-created (you have been warned!).
If all the asynchronous tasks complete and you wait long enough, the thread pool terminates all the associated threads and the thread-local variables are then destroyed. (No doubt the actual rules are more complex than that, but that's what I observed.)
As new asynchronous tasks are submitted, the thread pool limits the rate at which new threads are created, in the hope that one will become free before it needs to perform all that work (creating new threads is expensive). A call to std::async might therefore take a while to return (up to 300ms in my tests). In the meantime, it's just hanging around, hoping that its ship will come in. This behaviour is documented but I call it out here in case it takes you by surprise.
Conclusions:
Microsoft's implementation of std::async is non-conformant, but it is clearly designed with a specific purpose, and that purpose is to make good use of the Win32 ThreadPool API. You can beat them up for blatantly flouting the standard, but it's been this way for a long time and they probably have (important!) customers who rely on it. I will ask them to call this out in their documentation. Not doing that is criminal.
It is not safe to use thread_local variables in std::async tasks on Windows. Just don't do it, it will end in tears.

Looks like just another of many bugs in VC++.
Consider this quote from n4750
All variables declared with the thread_local keyword have thread
storage duration . The storage for these entities shall last for the
duration of the thread in which they are created. There is a distinct
object or reference per thread, and use of the declared name refers to
the entity associated with the current thread. 2 A variable with
thread storage duration shall be initialized before its first odr-use
(6.2) and, if constructed, shall be destroyed on thread exit.
plus this:
If the implementation chooses the launch::async policy, — (5.3) a call
to a waiting function on an asynchronous return object that shares the
shared state created by this async call shall block until the
associated thread has completed, as if joined, or else time out
(33.3.2.5);
I could be wrong ("thread exit" vs "thread completed"), but I feel this means that thread_local variables need to be destroyed before the .wait() call unblocks.

Is std::exception_ptr thread safe?

I have a worker thread that is constantly running, created & managed through a std::thread. At the top level of my worker thread, I have a try/catch block with a while loop inside it. If an exception leaks through to the top level of the thread, I catch it and store it in a std::exception_ptr, which is a member of the class that also owns the non-static thread function:
// In class header (inside class declaration)
std::exception_ptr m_threadException;
// In class CPP file
void MyClass::MyThreadFunction()
{
    try {
        while (true) {
            // Do thread stuff
        }
    }
    catch (std::exception const& e) {
        m_threadException = std::current_exception();
    }
}
Once the thread dies due to this kind of exception, my class (which is also primarily used by the main thread) doesn't know it yet. My plan was to add thread checkpoints to the start of all the class's main functions, like so:
void MyClass::SomethingMainThreadCalls()
{
    if (m_threadException) {
        std::rethrow_exception(m_threadException);
        m_threadException = nullptr; // Somehow reset it back to null; not sure if this will work
    }
    // Do normal function stuff
}
Assuming this is even a good idea, there's a possible race condition between when my main thread is checking if the exception_ptr is null (when calling SomethingMainThreadCalls()) and when the worker thread assigns to it. I haven't found any information (haven't checked the C++11 draft yet) about whether or not this is inherently thread safe (guaranteed by the standard) or if I am responsible for thread synchronization in this case.
If the latter, is using std::atomic a good idea to keep it simple? Example:
std::atomic<std::exception_ptr> m_threadException;
Something like that? I'd be interested in recommendations and information on best practice here.

There is no special statement about exception_ptr with regard to its thread safety in the standard. As such, it provides the default standard guarantee: concurrently accessing separate instances is fine; concurrently accessing the same instance is not.
I would suggest using atomic<bool> instead (if for no other reason than that exception_ptr is not trivially copyable and therefore can't be put in an atomic<T>) to let the other code know that the exception_ptr has been set. You'll be fine so long as:
You set m_threadException before setting the flag.
You read m_threadException after checking the flag.
You use the appropriate load/store memory orders to set/check the flag. The defaults are fine.
You only write m_threadException exactly once.
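Those four conditions could be put together roughly like this. Worker, thread_function, checkpoint and run_worker are hypothetical names, the thrown runtime_error stands in for real work, and the flag uses the default (sequentially consistent) memory order as suggested above.

```cpp
#include <atomic>
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

class Worker {
public:
    // Runs on the worker thread.
    void thread_function()
    {
        try {
            throw std::runtime_error("worker failed");  // stands in for real work
        } catch (...) {
            error_ = std::current_exception();  // 1. write the pointer, exactly once
            has_error_.store(true);             // 2. only then set the flag
        }
    }

    // Runs on the main thread.
    void checkpoint()
    {
        if (has_error_.load())                  // 3. check the flag first
            std::rethrow_exception(error_);     // 4. only then read the pointer
    }

private:
    std::exception_ptr error_;
    std::atomic<bool> has_error_{false};
};

std::string run_worker()
{
    Worker w;
    std::thread t(&Worker::thread_function, &w);
    t.join();   // joined here only to keep the demo deterministic
    try {
        w.checkpoint();
    } catch (const std::exception& e) {
        return e.what();
    }
    return "no error";
}
```

Note that this sketch never resets the pointer: once the flag is set, every later checkpoint rethrows, which matches the "write exactly once" condition.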

The standard doesn't specify the implementation of std::exception_ptr, so its thread safety is also unspecified.
Just wrap the exception pointer with a lock and the code will be fine.
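A lock-based version might look something like this. GuardedException, set, take and run_guarded are hypothetical names; the sketch assumes a single slot protected by one std::mutex, and take() hands the pointer out and clears it in the same critical section so the reset cannot be skipped.

```cpp
#include <exception>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>

class GuardedException {
public:
    void set(std::exception_ptr e)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        error_ = e;
    }

    // Returns the stored pointer and resets the slot to null, so each
    // exception is reported only once.
    std::exception_ptr take()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        std::exception_ptr e = error_;
        error_ = nullptr;
        return e;
    }

private:
    std::mutex mutex_;
    std::exception_ptr error_;
};

std::string run_guarded()
{
    GuardedException slot;
    std::thread t([&slot] {
        try {
            throw std::runtime_error("worker failed");  // stands in for real work
        } catch (...) {
            slot.set(std::current_exception());
        }
    });
    t.join();   // joined here only to keep the demo deterministic

    std::string first = "none";
    if (std::exception_ptr e = slot.take()) {
        try {
            std::rethrow_exception(e);
        } catch (const std::exception& ex) {
            first = ex.what();
        }
    }
    bool second_is_empty = !slot.take();
    return first + (second_is_empty ? "/empty" : "/still set");
}
```

Taking a local copy before rethrowing also sidesteps the problem in the question's checkpoint, where the line that resets the member after std::rethrow_exception can never run.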

Just tried to do this, but std::atomic requires a trivially copyable type, and std::exception_ptr is not one. You should get a compilation error, as I do (using MSVC, VS2019, C++14).

std::thread termination without calling join() [duplicate]

Given below:
#include <chrono>
#include <thread>
void test()
{
    std::chrono::seconds dura( 20 );
    std::this_thread::sleep_for( dura );
}
int main()
{
    std::thread th1(test);
    std::chrono::seconds dura( 5 );
    std::this_thread::sleep_for( dura );
    return 0;
}
main will exit after 5 seconds, what will happen to th1 that's still executing?
Does it continue executing until completion even if the th1 thread object you defined in main goes out of scope and gets destroyed?
Does th1 simply sit there after it has finished executing, or does it somehow get cleaned up when the program terminates?
What if the thread was created in a function, not main: does the thread stay around until the program terminates, or until the function goes out of scope?
Is it safe to simply not call join for a thread if you want some type of timeout behavior on the thread?

If you have not detached or joined a thread when its destructor is called, the destructor calls std::terminate. We can see this in the draft C++11 standard, where section 30.3.1.3 (thread destructor) says:
If joinable(), calls std::terminate(). Otherwise, has no effects. [
Note: Either implicitly detaching or joining a joinable() thread in
its destructor could result in difficult to debug correctness (for
detach) or performance (for join) bugs encountered only when an
exception is raised. Thus the programmer must ensure that the
destructor is never executed while the thread is still joinable. —end
note ]
As for a rationale for this behavior, we can find a good summary in (Not) using std::thread:
Why does the destructor of a joinable thread have to call
std::terminate? After all, the destructor could join with the child
thread, or it could detach from the child thread, or it could cancel
the thread. In short, you cannot join in the destructor as this would
result in unexpected (not indicated explicitly in the code) program
freeze in case f2 throws.
An example follows, and the article also says:
You cannot detach as it would risk the situation where main thread
leaves the scope which the child thread was launched in, and the child
thread keeps running and keeps references to the scope that is already
gone.
The article references N2802: A plea to reconsider detach-on-destruction for thread objects, which argues against the previous proposal of detaching on destruction if joinable. It notes that one of the two alternatives would be to join, which could lead to deadlocks; the other alternative is what we have today, which is std::terminate on destruction if joinable.
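The dangling-reference hazard in the second quote is the reason a detached thread should own whatever it touches. One possible way to arrange that is sketched below; start_detached is a hypothetical name, and sharing the data through std::shared_ptr is just one option, not something the article prescribes.

```cpp
#include <cstddef>
#include <future>
#include <memory>
#include <string>
#include <thread>
#include <utility>

std::future<std::size_t> start_detached()
{
    // Jointly owned by this scope and the thread, instead of being a
    // local variable that the thread merely references.
    auto message = std::make_shared<std::string>("still alive");

    std::promise<std::size_t> p;
    std::future<std::size_t> result = p.get_future();

    std::thread t(
        [message](std::promise<std::size_t> pr) { pr.set_value(message->size()); },
        std::move(p));
    t.detach();

    // This scope ends here, but the thread's copy of the shared_ptr
    // keeps the string alive for as long as the thread needs it.
    return result;
}
```

Capturing message by reference instead would be exactly the bug the article warns about: the function could return before the thread reads the string.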

std::thread::~thread()
If *this has an associated thread (joinable() == true), std::terminate() is called
Source: http://en.cppreference.com/w/cpp/thread/thread/~thread
This means that a program like this is not safe: it ends in a call to std::terminate().
Note, however, that boost::thread::~thread() calls detach() instead in this case.
(as user dyp stated in comments, this behavior is deprecated in more recent versions)
You could always work around this using RAII. Just wrap your thread inside another class that has the desired behavior on destruction.

In C++11, you must explicitly specify what happens when the newly created thread object goes out of scope (or its destructor is called). Sometimes, when we are sure that the main thread is continuing and our threads act as a 'pipeline', it is safe to detach() them; and sometimes, when we are waiting for our worker threads to complete their operations, we join() them.
As this says, the programmer must ensure that the destructor is never executed while the thread is still joinable.
Specify your multi-threaded strategy. In this example, std::terminate() is called.
