To be specific, boost::future doesn't seem to block in the destructor. And boost documentation didn't mention it. However std::future did mention it.
If I want to chain some .then() to boost::future I also need to chain a .get() call at the end to force the temporary object to block to get the correct behavior. Is that the right way of using it?
void a()
{
boost::async([] {
boost::this_thread::sleep_for(boost::chrono::seconds{1});
std::cout << "Finished sleeping\n";
return 1;
});
std::cout << "End of a()\n";
}
void b()
{
std::async([] {
std::this_thread::sleep_for(std::chrono::seconds{1});
std::cout << "Finished sleeping\n";
return 1;
});
std::cout << "End of b()\n";
}
int main()
{
a();//No finished sleeping printed
std::cout << "End of main()\n";
}
int main()
{
b();//finished sleeping will print
std::cout << "End of main()\n";
}
https://www.boost.org/doc/libs/1_75_0/doc/html/thread/synchronization.html
The returned futures behave as the ones returned from boost::async, the destructor of the future object returned from then will block. This could be subject to change in future versions.
(2015) https://www.boost.org/doc/libs/1_75_0/doc/html/thread/changes.html
Version 4.5.0 - boost 1.58
Fixed Bugs:
#10968 The futures returned by async() and future::then() are not blocking.
(2017) https://github.com/boostorg/thread/issues/194
I'm aware of the issue. Boost.thread did it by design following std::async and the first proposals of std::future::then.
The problem is that changing the behavior (even if it is not desired in some contexts) would be a breaking change. I would need to crate a new version, but Idon't want to create a new version using compiler flags as this has become a nightmare to maintain.
So I will need to create a boost/thread_v5 which will produce different binaries lib_boost_thread_v5.a. Boost.Threads is not structured this way, and this will mean a lot of work.
This is the single way to avoid breaking changes, but before doing that I would like to enumerate all the breaking changes that should be on this new version.
It seems the destructor of future object returned from async has really changed to non-blocking at some time point after 2017, so you should use .get() to block.
I can not make it block because:
BOOST_THREAD_FUTURE_BLOCKING(not documented) is not #defined in all cases;
if you #define BOOST_THREAD_FUTURE_BLOCKING manually, you can see that ~future_async_shared_state_base() will be called when the future is destroyed and the lambda passed to async has finished running (i.e. when the boost::shared_ptr's use_count drops from 2 to 1 to 0), so the destructor of future object returned from async (when the boost::shared_ptr's use_count drops from 2 to 1) will still not block.
https://www.boost.org/doc/libs/1_75_0/boost/thread/detail/config.hpp
#if BOOST_THREAD_VERSION>=5
//#define BOOST_THREAD_FUTURE_BLOCKING
#if ! defined BOOST_THREAD_PROVIDES_EXECUTORS \
&& ! defined BOOST_THREAD_DONT_PROVIDE_EXECUTORS
#define BOOST_THREAD_PROVIDES_EXECUTORS
#endif
#else
//#define BOOST_THREAD_FUTURE_BLOCKING
#define BOOST_THREAD_ASYNC_FUTURE_WAITS
#endif
https://www.boost.org/doc/libs/1_75_0/boost/thread/future.hpp
in async:
void init(BOOST_THREAD_FWD_REF(Fp) f)
{
#ifdef BOOST_THREAD_FUTURE_BLOCKING
this->thr_ = boost::thread(&future_async_shared_state::run, static_shared_from_this(this), boost::forward<Fp>(f));
#else
boost::thread(&future_async_shared_state::run, static_shared_from_this(this), boost::forward<Fp>(f)).detach();
#endif
}
when the future is destroyed and the lambda passed to async has finished running:
~future_async_shared_state_base()
{
#ifdef BOOST_THREAD_FUTURE_BLOCKING
join();
#elif defined BOOST_THREAD_ASYNC_FUTURE_WAITS
unique_lock<boost::mutex> lk(this->mutex);
this->waiters.wait(lk, boost::bind(&shared_state_base::is_done, boost::ref(*this)));
#endif
}
C++20 has moved to coroutine which is able to split control flow in more complex cases than future.then: CppCon 2015: Gor Nishanov “C++ Coroutines - a negative overhead abstraction". You can try libraries like cppcoro or boost.fiber and pause the task function inside complex control structures (e.g. co_await inside if inside while inside for).
C++23 Executor TS: in initial versions of A Unified Executors Proposal for C++ there were OneWayExecutor execute(fire-and-forget) TwoWayExecutor twoway_execute(return a future) then_execute(take a future and a function, return a future) etc but in recent versions they were replaced by sender and receiver.
Related
When I run several std::threads in parallell and need to cancel other threads in a deferred manner if one thread fails I use a std::atomic<bool> flag:
#include <thread>
#include <chrono>
#include <iostream>
void threadFunction(unsigned int id, std::atomic<bool>& terminated) {
srand(id);
while (!terminated) {
int r = rand() % 100;
if (r == 0) {
std::cerr << "Thread " << id << ": an error occured.\n";
terminated = true; // without this line we have to wait for other thread to finish
return;
}
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
}
int main()
{
std::atomic<bool> terminated = false;
std::thread t1(&threadFunction, 1, std::ref(terminated));
std::thread t2(&threadFunction, 2, std::ref(terminated));
t1.join();
t2.join();
std::cerr << "Both threads finished.\n";
int k;
std::cin >> k;
}
However now I am reading about std::stop_sourceand std::stop_token.
I find that I can achieve the same as above by passing both a std::stop_sourceby reference and std::stop_token by value to the thread function?
How would that be superior?
I understand that when using std::jthread the std::stop_token is very convenient if I want to stop threads from outside the threads.
I could then call std::jthread::request_stop() from the main program.
However in the case where I want to stop threads from a thread is it still better?
I managed to achieve the same thing as in my code using std::stop_source:
void threadFunction(std::stop_token stoken, unsigned int id, std::stop_source source) {
srand(id);
while (!stoken.stop_requested()) {
int r = rand() % 100;
if (r == 0) {
std::cerr << "Thread " << id << ": an error occured.\n";
source.request_stop(); // without this line we have to wait for other thread to finish
return;
}
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
}
int main()
{
std::stop_source source;
std::stop_token stoken = source.get_token();
std::thread t1(&threadFunction, stoken, 1, source);
std::thread t2(&threadFunction, stoken, 2, source);
t1.join();
t2.join();
std::cerr << "Both threads finished.\n";
int k;
std::cin >> k;
}
Using std::jthread would have resulted in more compact code:
std::jthread t1(&threadFunction, 1, source);
std::jthread t2(&threadFunction, 2, source);
But that did not seem to work.
It didn't work because std::jthread has a special feature where, if the first parameter of a thread-function is a std::stop_token, it fills that token in by an internal stop_source object.
What you ought to do is only pass a stop_source (by value, not by reference), and extract the token from it within your thread function.
As for why this is better than a reference to an atomic, there are a myriad of reasons. The first being that stop_source is a lot safer than a bare reference to an object whose lifetime is not under the local control of the thread function. The second being that you don't have to do std::ref gymnastics to pass parameters. This can be a source of bugs since you might accidentally forget to do that in some place.
The standard stop_token mechanism has features beyond just requesting and responding to a stop. Since the response to a stop happens at an arbitrary time after issuing it, it may be necessary to execute some code when the stop is actually requested rather than when it is responded to. The stop_callback mechanism allows you to register a callback with a stop_token. This callback will be called in the thread of the stop_source::request_stop call (unless you register the callback after the stop was requested, in which case it's called right when you register it). This can be useful in limited cases, and it's not simple code to write yourself. Especially when all you have is an atomic<bool>.
And then there's simple readability. Passing a stop_source tells you exactly what is going on without having to even see the name of a parameter. Passing an atomic<bool> tells you very little from just the typename; you have to look at the parameter name or its usage in the function to know that it is for halting the thread.
Apart from being more expressive and communicating intentions better, stop_token and friends achieve something really important for jthread. To understand it you have to consider its destructor which looks something like this:
~jthread()
{
if(joinable())
{
// Not only user code, but the destructor as well
// will let your callback know it's time to go.
request_stop();
join();
}
}
by encapsulating a stop_source, jthread facilitates what is called cooperative cancellation. As you've also noted, you never have to pass the stop_token to a jthread, just provide a callback that accepts the token as its first parameter. What happens next is that the class can detect that your callback accepts a stop token and pass a token to its internal stop source when calling it.
What does this mean for cooperative cancellation? Safer termination of course! Since jthread will always attempt to join on destruction, it now has the means to prevent endless loops and deadlocks where two or more threads wait for each other to finish. By using stop_token your code can make sure that it can safely join when it's time to go.
However in the case where I want to stop threads from a thread is it still better?
Now regarding the feature you are requesting, that's what C# calls "linked cancellation". Yes, there are requests and discussions to add a parameter in the jthread constructor so that it can refer to an external stop source, but that's not yet available (and has many implications). Doing something similar purely with stop tokens would require a stop_callback to tie all cancellations together, but still it could be suboptimal (as shown in the link). The bottom line is that jthread needs stop_token, but in some cases you may not need jthread, especially if the following solution does not appeal to you:
stop_source ssource;
std::stop_callback cb {ssource.get_token(), [&] {
t1.request_stop();
t2.request_stop();
}};
ssource.request_stop(); // This stops boths threads.
The good news is that if you don't fall into the suboptimal pattern described in the link (i.e. you don't need an asynchronous termination), then this functionality is easy to abstract into a utility, something like:
auto linked_cancellations = [](auto&... jthreads) {
stop_source s;
return std::make_pair(s, std::stop_callback{
s.get_token(), [&]{ (jthreads.request_stop(), ...); }});
};
which you'd use as
auto [stop_source, cb] = linked_cancellations(t1, t2);
// or as many thread objects as you want to link ^^^
stop_source.request_stop(); // Stops all the threads that you linked.
Now if you want to control the linked threads from within the thread, I'd use the initial pattern (std::atomic<bool>), since having a callback with both a stop token and a stop source is somewhat confusing.
I got to know the reason that future returned from std::async has some special shared state through which wait on returned future happened in the destructor of future. But when we use std::pakaged_task, its future does not exhibit the same behavior.
To complete a packaged task, you have to explicitly call get() on future object from packaged_task.
Now my questions are:
What could be the internal implementation of future (thinking std::async vs std::packaged_task)?
Why the same behavior was not applied to future returned from std::packaged_task? Or, in other words, how is the same behavior stopped for std::packaged_task future?
To see the context, please see the code below:
It does not wait to finish countdown task. However, if I un-comment // int value = ret.get();, it would finish countdown and is obvious because we are literally blocking on returned future.
// packaged_task example
#include <iostream> // std::cout
#include <future> // std::packaged_task, std::future
#include <chrono> // std::chrono::seconds
#include <thread> // std::thread, std::this_thread::sleep_for
// count down taking a second for each value:
int countdown (int from, int to) {
for (int i=from; i!=to; --i) {
std::cout << i << std::endl;
std::this_thread::sleep_for(std::chrono::seconds(1));
}
std::cout << "Lift off!" <<std::endl;
return from-to;
}
int main ()
{
std::cout << "Start " << std::endl;
std::packaged_task<int(int,int)> tsk (countdown); // set up packaged_task
std::future<int> ret = tsk.get_future(); // get future
std::thread th (std::move(tsk),10,0); // spawn thread to count down from 10 to 0
// int value = ret.get(); // wait for the task to finish and get result
std::cout << "The countdown lasted for " << std::endl;//<< value << " seconds.\n";
th.detach();
return 0;
}
If I use std::async to execute task countdown on another thread, no matter if I use get() on returned future object or not, it will always finish the task.
// packaged_task example
#include <iostream> // std::cout
#include <future> // std::packaged_task, std::future
#include <chrono> // std::chrono::seconds
#include <thread> // std::thread, std::this_thread::sleep_for
// count down taking a second for each value:
int countdown (int from, int to) {
for (int i=from; i!=to; --i) {
std::cout << i << std::endl;
std::this_thread::sleep_for(std::chrono::seconds(1));
}
std::cout << "Lift off!" <<std::endl;
return from-to;
}
int main ()
{
std::cout << "Start " << std::endl;
std::packaged_task<int(int,int)> tsk (countdown); // set up packaged_task
std::future<int> ret = tsk.get_future(); // get future
auto fut = std::async(std::move(tsk), 10, 0);
// int value = fut.get(); // wait for the task to finish and get result
std::cout << "The countdown lasted for " << std::endl;//<< value << " seconds.\n";
return 0;
}
std::async has definite knowledge of how and where the task it is given is executed. That is its job: to execute the task. To do that, it has to actually put it somewhere. That somewhere could be a thread pool, a newly created thread, or in a place to be executed by whomever destroys the future.
Because async knows how the function will be executed, it has 100% of the information it needs to build a mechanism that can communicate when that potentially asynchronous execution has concluded, as well as to ensure that if you destroy the future, then whatever mechanism that's going to execute that function will eventually get around to actually executing it. After all, it knows what that mechanism is.
But packaged_task doesn't. All packaged_task does is store a callable object which can be called with the given arguments, create a promise with the type of the function's return value, and provide a means to both get a future and to execute the function that generates the value.
When and where the task actually gets executed is none of packaged_task's business. Without that knowledge, the synchronization needed to make future's destructor synchronize with the task simply can't be built.
Let's say you want to execute the task on a freshly-created thread. OK, so to synchronize its execution with the future's destruction, you'd need a mutex which the destructor will block on until the task thread finishes.
But what if you want to execute the task in the same thread as the caller of the future's destructor? Well, then you can't use a mutex to synchronize that since it all on the same thread. Instead, you need to make the destructor invoke the task. That's a completely different mechanism, and it is contingent on how you plan to execute.
Because packaged_task doesn't know how you intend to execute it, it cannot do any of that.
Note that this is not unique to packaged_task. All futures created from a user-created promise object will not have the special property of async's futures.
So the question really ought to be why async works this way, not why everyone else doesn't.
If you want to know that, it's because of two competing needs: async needed to be a high-level, brain-dead simple way to get asynchronous execution (for which sychronization-on-destruction makes sense), and nobody wanted to create a new future type that was identical to the existing one save for the behavior of its destructor. So they decided to overload how future works, complicating its implementation and usage.
#Nicol Bolas has already answered this question quite satisfactorily. So I'll attempt to answer the question slightly from different perspective, elaborating the points already mentioned by #Nicol Bolas.
The design of related things and their goals
Consider this simple function which we want to execute, in various ways:
int add(int a, int b) {
std::cout << "adding: " << a << ", "<< b << std::endl;
return a + b;
}
Forget std::packaged_task, std ::future and std::async for a while, let's take one step back and revisit how std::function works and what problem it causes.
case 1 — std::function isn't good enough for executing things in different threads
std::function<int(int,int)> f { add };
Once we have f, we can execute it, in the same thread, like:
int result = f(1, 2); //note we can get the result here
Or, in a different thread, like this:
std::thread t { std::move(f), 3, 4 };
t.join();
If we see carefully, we realize that executing f in a different thread creates a new problem: how do we get the result of the function? Executing f in the same thread does not have that problem — we get the result as returned value, but when executed it in a different thread, we don't have any way to get the result. That is exactly what is solved by std::packaged_task.
case 2 — std::packaged_task solves the problem which std::function does not solve
In particular, it creates a channel between threads to send the result to the other thread. Apart from that, it is more or less same as std::function.
std::packaged_task<int(int,int)> f { add }; // almost same as before
std::future<int> channel = f.get_future(); // get the channel
std::thread t{ std::move(f), 30, 40 }; // same as before
t.join(); // same as before
int result = channel.get(); // problem solved: get the result from the channel
Now you see how std::packaged_task solves the problem created by std::function. That however does not mean that std::packaged_task has to be executed in a different thread. You can execute it in the same thread as well, just like std::function, though you will still get the result from the channel.
std::packaged_task<int(int,int)> f { add }; // same as before
std::future<int> channel = f.get_future(); // same as before
f(10, 20); // execute it in the current thread !!
int result = channel.get(); // same as before
So fundamentally std::function and std::packaged_task are similar kind of thing: they simply wrap callable entity, with one difference: std::packaged_task is multithreading-friendly, because it provides a channel through which it can pass the result to other threads. Both of them do NOT execute the wrapped callable entity by themselves. One needs to invoke them, either in the same thread, or in another thread, to execute the wrapped callable entity. So basically there are two kinds of thing in this space:
what is executed i.e regular functions, std::function, std::packaged_task, etc.
how/where is executed i.e threads, thread pools, executors, etc.
case 3: std::async is an entirely different thing
It's a different thing because it combines what-is-executed with how/where-is-executed.
std::future<int> fut = std::async(add, 100, 200);
int result = fut.get();
Note that in this case, the future created has an associated executor, which means that the future will complete at some point as there is someone executing things behind the scene. However, in case of the future created by std::packaged_task, there is not necessarily an executor and that future may never complete if the created task is never given to any executor.
Hope that helps you understand how things work behind the scene. See the online demo.
The difference between two kinds of std::future
Well, at this point, it becomes pretty much clear that there are two kinds of std::future which can be created:
One kind can be created by std::async. Such future has an associated executor and thus can complete.
Other kind can be created by std::packaged_task or things like that. Such future does not necessarily have an associated executor and thus may or may not complete.
Since, in the second case the future does not necessarily have an associated executor, its destructor is not designed for its completion/wait because it may never complete:
{
std::packaged_task<int(int,int)> f { add };
std::future<int> fut = f.get_future();
} // fut goes out of scope, but there is no point
// in waiting in its destructor, as it cannot complete
// because as `f` is not given to any executor.
Hope this answer helps you understand things from a different perspective.
The change in behaviour is due to the difference between std::thread and std::async.
In the first example, you have created a daemon thread by detaching. Where you print std::cout << "The countdown lasted for " << std::endl; in your main thread, may occur before, during or after the print statements inside the countdown thread function. Because the main thread does not await the spawned thread, you will likely not even see all of the print outs.
In the second example, you launch the thread function with the std::launch::deferred policy. The behaviour for std::async is:
If the async policy is chosen, the associated thread completion synchronizes-with the successful return from the first function that is waiting on the shared state, or with the return of the last function that releases the shared state, whichever comes first.
In this example, you have two futures for the same shared state. Before their dtors are called when exiting main, the async task must complete. Even if you had not explicitly defined any futures, the temporary future that gets created and destroyed (returned from the call to std::async) will mean that the task completes before the main thread exits.
Here is a great blog post by Scott Meyers, clarifying the behaviour of std::future & std::async.
Related SO post.
I'm looking for a way to compose asynchronous operations. The ultimate goal is to execute an asynchronous operation, and either have it run to completion, or return after a user-defined timeout.
For exemplary purposes, assume that I'm looking for a way to combine the following coroutines1:
IAsyncOperation<IBuffer> read(IBuffer buffer, uint32_t count)
{
auto&& result{ co_await socket_.InputStream().ReadAsync(buffer, count, InputStreamOptions::None) };
co_return result;
}
with socket_ being a StreamSocket instance.
And the timeout coroutine:
IAsyncAction timeout()
{
co_await 5s;
}
I'm looking for a way to combine these coroutines in a way, that returns as soon as possible, either once the data has been read, or the timeout has expired.
These are the options I have evaluated so far:
C++20 coroutines: As far as I understand P1056R0, there is currently no library or language feature "to enable creation and composition of coroutines".
Windows Runtime supplied asynchronous task types, ultimately derived from IAsyncInfo: Again, I didn't find any facilities that would allow me to combine the tasks the way I need.
Concurrency Runtime: This looks promising, particularly the when_any function template looks to be exactly what I need.
From that it looks like I need to go with the Concurrency Runtime. However, I'm having a hard time bringing all the pieces together. I'm particularly confused about how to handle exceptions, and whether cancellation of the respective other concurrent task is required.
The question is two-fold:
Is the Concurrency Runtime the only option (UWP application)?
What would an implementation look like?
1 The methods are internal to the application. It is not required to have them return Windows Runtime compatible types.
I think the easiest would be to use the concurrency library. You need to modify your timeout to return the same type as the first method, even if it returns null.
(I realize this is only a partial answer...)
My C++ sucks, but I think this is close...
array<task<IBuffer>, 2> tasks =
{
concurrency::create_task([]{return read(buffer, count).get();}),
concurrency::create_task([]{return modifiedTimeout.get();})
};
concurrency::when_any(begin(tasks), end(tasks)).then([](IBuffer buffer)
{
//do something
});
As suggested by Lee McPherson in another answer, the Concurrency Runtime looks like a viable option. It provides tasks, that can be combined with others, chained up using continuations, as well as seamlessly integrate with the Windows Runtime asynchronous model (see Creating Asynchronous Operations in C++ for UWP Apps). As a bonus, including the <pplawait.h> header provides adapters for concurrency::task class template instantiations to be used as C++20 coroutine awaitables.
I wasn't able to answer all of the questions, but this is what I eventually came up with. For simplicity (and ease of verification) I'm using Sleep in place of the actual read operation, and return an int instead of an IBuffer.
Composition of tasks
The ConcRT provides several ways to combine tasks. Given the requirements concurrency::when_any can be used to create a task that returns, when any of the supplied tasks completes. When only 2 tasks are supplied as input, there's also a convenience operator (operator||) available.
Exception propagation
Exceptions raised from either of the input tasks do not count as a successful completion. When used with the when_any task, throwing an exception will not suffice the wait condition. As a consequence, exceptions cannot be used to break out of combined tasks. To deal with this I opted to return a std::optional, and raise appropriate exceptions in a then continuation.
Task cancellation
This is still a mystery to me. It appears that once a task satisfies the wait condition of the when_any task, there is no requirement to cancel the respective other outstanding tasks. Once those complete (successfully or otherwise), they are silently dealt with.
Following is the code, using the simplifications mentioned earlier. It creates a task consisting of the actual workload and a timeout task, both returning a std::optional. The then continuation examines the return value, and throws an exception in case there isn't one (i.e. the timeout_task finished first).
#include <Windows.h>
#include <cstdint>
#include <iostream>
#include <optional>
#include <ppltasks.h>
#include <stdexcept>
using namespace concurrency;
task<int> read_with_timeout(uint32_t read_duration, uint32_t timeout)
{
auto&& read_task
{
create_task([read_duration]
{
::Sleep(read_duration);
return std::optional<int>{42};
})
};
auto&& timeout_task
{
create_task([timeout]
{
::Sleep(timeout);
return std::optional<int>{};
})
};
auto&& task
{
(read_task || timeout_task)
.then([](std::optional<int> result)
{
if (!result.has_value())
{
throw std::runtime_error("timeout");
}
return result.value();
})
};
return task;
}
The following test code
int main()
{
try
{
auto res1{ read_with_timeout(3000, 5000).get() };
std::cout << "Succeeded. Result = " << res1 << std::endl;
auto res2{ read_with_timeout(5000, 3000).get() };
std::cout << "Succeeded. Result = " << res2 << std::endl;
}
catch( std::runtime_error const& e )
{
std::cout << "Failed. Exception = " << e.what() << std::endl;
}
}
produces this output:
Succeeded. Result = 42
Failed. Exception = timeout
I wrote this sample code to test boost::future continuations to use in my application.
#include <iostream>
#include <functional>
#include <unistd.h>
#include <exception>
#define BOOST_THREAD_PROVIDES_FUTURE
#define BOOST_THREAD_PROVIDES_FUTURE_CONTINUATION
#include <boost/thread/future.hpp>
void magicNumber(std::shared_ptr<boost::promise<long>> p)
{
sleep(5);
p->set_value(0xcafebabe);
}
boost::future<long> foo()
{
std::shared_ptr<boost::promise<long>> p =
std::make_shared<boost::promise<long>>();
boost::future<long> f = p->get_future();
boost::thread t([p](){magicNumber(p);});
t.detach();
return f;
}
void bar()
{
auto f = foo();
f.then([](boost::future<long> f) { std::cout << f.get() << std::endl; });
std::cout << "Should have blocked?" << std::endl;
}
int main()
{
bar();
sleep (6);
return 0;
}
When compiled, linked and run with boost version 1.64.0_1, I am getting following output:
Should have blocked?
3405691582
But according to boost::future::then's documentation here.
The execution should be blocked at f.then() in function bar() because the temporary variable of type boost::future<void> should block at destruction, and the output should be
3405691582
Should have blocked?
In my application though, the call to f.then() is blocking the execution till continuation is not invoked.
What is happening here?
Note that the only time a future would ever block in the destructor used to be documented as when you use std::async with a launch-policy of launch::async.
See Why is the destructor of a future returned from `std::async` blocking?
The answer lists the many discussions that have taken place around this subject. The proposal N3776 made it into C++14:
This paper provides proposed wording to implement a positive SG1 straw poll to clarify that ~future and
~shared_future don’t block except possibly in the presence of async.
cppreference.com documents std::async
Your code never used async, so it would be surprising if any future derived would block on destruction.
More gener
ally, it is clear that the consensus is that blocking destruction is an unfortunate design wart, not something you'd expect being introduced on newer extensions (such as .then continuations).
I can only assume this is a case of documentation error where the wording
The returned futures behave as the ones returned from boost::async, the destructor of the future object returned from then will block. This could be subject to change in future versions.
should be removed.
The following example runs successfully (i.e. doesn't hang) if compiled using Clang 3.2 or GCC 4.7 on Ubuntu 12.04, but hangs if I compile using VS11 Beta or VS2012 RC.
#include <iostream>
#include <string>
#include <thread>
#include "boost/thread/thread.hpp"
void SleepFor(int ms) {
std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
template<typename T>
class ThreadTest {
public:
ThreadTest() : thread_([] { SleepFor(10); }) {}
~ThreadTest() {
std::cout << "About to join\t" << id() << '\n';
thread_.join();
std::cout << "Joined\t\t" << id() << '\n';
}
private:
std::string id() const { return typeid(decltype(thread_)).name(); }
T thread_;
};
int main() {
static ThreadTest<std::thread> std_test;
static ThreadTest<boost::thread> boost_test;
// SleepFor(100);
}
The issue appears to be that std::thread::join() never returns if it is invoked after main has exited. It is blocked at WaitForSingleObject in _Thrd_join defined in cthread.c.
Uncommenting SleepFor(100); at the end of main allows the program to exit properly, as does making std_test non-static. Using boost::thread also avoids the issue.
So I'd like to know if I'm invoking undefined behaviour here (seems unlikely to me), or if I should be filing a bug against VS2012?
Tracing through Fraser's sample code in his connect bug (https://connect.microsoft.com/VisualStudio/feedback/details/747145)
with VS2012 RTM seems to show a fairly straightforward case of deadlocking. This likely isn't specific to std::thread - likely _beginthreadex suffers the same fate.
What I see in the debugger is the following:
On the main thread, the main() function has completed, the process cleanup code has acquired a critical section called _EXIT_LOCK1, called the destructor of ThreadTest, and is waiting (indefinitely) on the second thread to exit (via the call to join()).
The second thread's anonymous function completed and is in the thread cleanup code waiting to acquire the _EXIT_LOCK1 critical section. Unfortunately, due to the timing of things (whereby the second thread's anonymous function's lifetime exceeds that of the main() function) the main thread already owns that critical section.
DEADLOCK.
Anything that extends the lifetime of main() such that the second thread can acquire _EXIT_LOCK1 before the main thread avoids the deadlock situation. That's why the uncommenting the sleep in main() results in a clean shutdown.
Alternatively if you remove the static keyword from the ThreadTest local variable, the destructor call is moved up to the end of the main() function (instead of in the process cleanup code) which then blocks until the second thread has exited - avoiding the deadlock situation.
Or you could add a function to ThreadTest that calls join() and call that function at the end of main() - again avoiding the deadlock situation.
I realize this is an old question regarding VS2012, but the bug is still present in VS2013. For those who are stuck on VS2013, perhaps due to Microsoft's refusal to provide upgrade pricing for VS2015, I offer the following analysis and workaround.
The problem is that the mutex (at_thread_exit_mutex) used by _Cnd_do_broadcast_at_thread_exit() is either not yet initialized, or has already been destroyed, depending on the exact circumstances. In the former case, _Cnd_do_broadcast_at_thread_exit() tries to initialize the mutex during shutdown, causing a deadlock. In the latter case, where the mutex has already been destroyed via the atexit stack, the program will crash on the way out.
The solution I found is to explicitly call _Cnd_do_broadcast_at_thread_exit() (which thankfully is declared publicly) early during program startup. This has the effect of creating the mutex before anyone else tries to access it, as well as ensuring that the mutex continues to exist until the last possible moment.
So, to fix the problem, insert the following code at the bottom of a source module, for instance somewhere below main().
#pragma warning(disable:4073) // initializers put in library initialization area
#pragma init_seg(lib)
#if _MSC_VER < 1900
struct VS2013_threading_fix
{
VS2013_threading_fix()
{
_Cnd_do_broadcast_at_thread_exit();
}
} threading_fix;
#endif
I believe your threads have already been terminated and their resources freed following the termination of your main function and before static destruction. This is the behavior of the VC runtimes dating back to at least VC6.
Do child threads exit when the parent thread terminates
boost thread and process cleanup on windows
My answer is too far late, but hope will help someone.
I was stucked by this bug, and i find a trick to solve this problem,it worked in my code.
int main()
{
ThreadTest trick_obj; //trick... You can put this line of code anywhere
static ThreadTest std_test;
return 1;
}
I have been battling this bug for a day, and found the following work-around, which turned out the be the least dirty trick:
Instead of returning, one can use the standard Windows API function call ExitThread() to terminate the thread. This method of course may mess up the internal state of the std::thread object and associated library, but since the program is going to terminate anyway, well, so be it.
#include <windows.h>
template<typename T>
class ThreadTest {
public:
ThreadTest() : thread_([] { SleepFor(10); ExitThread(NULL); }) {}
~ThreadTest() {
std::cout << "About to join\t" << id() << '\n';
thread_.join();
std::cout << "Joined\t\t" << id() << '\n';
}
private:
std::string id() const { return typeid(decltype(thread_)).name(); }
T thread_;
};
The join() call apparently works correctly. However, I chose to use a more safe method in our solution. One can get the thread HANDLE via std::thread::native_handle(). With this handle we can call the Windows API directly to join the thread:
WaitForSingleObject(thread_.native_handle(), INFINITE);
CloseHandle(thread_.native_handle());
Thereafter, the std::thread object must not be destroyed, as the destructor would try to join the thread a second time. So we just leave the std::thread object dangling at program exit.