Why should I use std::async? - c++

I'm trying to explore all the options of the new C++11 standard in depth, and while using std::async and reading its definition, I noticed two things, at least on Linux with GCC 4.8.1:
it's called async, but it has a really "sequential behaviour": basically, on the line where you call get on the future associated with your async function foo, the program blocks until the execution of foo is completed.
it depends on exactly the same external library as other, better, non-blocking solutions, namely pthread: if you want to use std::async you need pthread.
At this point it's natural to ask: why choose std::async over even a simple set of functors? It's a solution that doesn't even scale: the more futures you call, the less responsive your program will be.
Am I missing something? Can you show an example that is guaranteed to be executed in an async, non-blocking way?

it's called async, but it has a really "sequential behaviour",
No, if you use the std::launch::async policy then it runs asynchronously in a new thread. If you don't specify a policy it might run in a new thread.
basically, on the line where you call get on the future associated with your async function foo, the program blocks until the execution of foo is completed.
It only blocks if foo hasn't completed, but if it was run asynchronously (e.g. because you use the std::launch::async policy) it might have completed before you need it.
it depends on exactly the same external library as other, better, non-blocking solutions, namely pthread: if you want to use std::async you need pthread.
Wrong: it doesn't have to be implemented using Pthreads (and on Windows it isn't; it uses the ConcRT features).
At this point it's natural to ask: why choose std::async over even a simple set of functors?
Because it guarantees thread-safety and propagates exceptions across threads. Can you do that with a simple set of functors?
It's a solution that doesn't even scale: the more futures you call, the less responsive your program will be.
Not necessarily. If you don't specify the launch policy then a smart implementation can decide whether to start a new thread, or return a deferred function, or return something that decides later, when more resources may be available.
Now, it's true that with GCC's implementation, if you don't provide a launch policy then with current releases it will never run in a new thread (there's a bugzilla report for that) but that's a property of that implementation, not of std::async in general. You should not confuse the specification in the standard with a particular implementation. Reading the implementation of one standard library is a poor way to learn about C++11.
Can you show an example that is guaranteed to be executed in an async, non-blocking way?
This shouldn't block:
auto fut = std::async(std::launch::async, doSomethingThatTakesTenSeconds);
auto result1 = doSomethingThatTakesTwentySeconds();
auto result2 = fut.get();
By specifying the launch policy you force asynchronous execution, and if you do other work while it's executing then the result will be ready when you need it.

If you need the result of an asynchronous operation, then you have to block, no matter what library you use. The idea is that you get to choose when to block, and, hopefully when you do that, you block for a negligible time because all the work has already been done.
Note also that std::async can be launched with policies std::launch::async or std::launch::deferred. If you don't specify it, the implementation is allowed to choose, and it could well choose to use deferred evaluation, which would result in all the work being done when you attempt to get the result from the future, resulting in a longer block. So if you want to make sure that the work is done asynchronously, use std::launch::async.

I think your problem is with std::future saying that it blocks on get. It only blocks if the result isn't already ready.
If you can arrange for the result to be already ready, this isn't a problem.
There are many ways to know that the result is already ready. You can poll the future and ask it (relatively simple); you could use locks or atomic data to relay the fact that it is ready; you could build up a framework to deliver "finished" future items into a queue that consumers can interact with; or you could use signals of some kind (which is just blocking on multiple things at once, or polling).
Or, you could finish all the work you can do locally, and then block on the remote work.
As an example, imagine a parallel recursive merge sort. It splits the array into two chunks, then does an async sort on one chunk while sorting the other chunk. Once it is done sorting its half, the originating thread cannot progress until the second task is finished. So it does a .get() and blocks. Once both halves have been sorted, it can then do a merge (in theory, the merge can be done at least partially in parallel as well).
This task behaves like a linear task to those interacting with it on the outside -- when it is done, the array is sorted.
We can then wrap this in a std::async task, and have a future sorted array. If we want, we could add a signalling procedure to let us know that the future is finished, but that only makes sense if we have a thread waiting on the signals.

In the reference: http://en.cppreference.com/w/cpp/thread/async
If the async flag is set (i.e. policy & std::launch::async != 0), then async executes the function f on a separate thread of execution as if spawned by std::thread(f, args...), except that if the function f returns a value or throws an exception, it is stored in the shared state accessible through the std::future that async returns to the caller.
Storing thrown exceptions in the shared state is a nice property to have.

http://www.cplusplus.com/reference/future/async/
There are three types of policy:
launch::async
launch::deferred
launch::async|launch::deferred
By default, launch::async|launch::deferred is passed to std::async.

Related

Spawn a new thread as soon as another has finished

I have an expensive function that needs to be executed 1000 times. Execution can take between 5 seconds and 10 minutes, so it has a high variation.
I'd like to have multiple threads working on it. My current implementation divides these 1000 calls into 4 batches of 250 calls and spawns 4 threads. However, if one thread has a "bad day", it takes much longer to finish compared to the other 3 threads.
Hence I'd like to make a new call to the function whenever a thread has finished a previous call, until all 1000 calls have been made.
I think a thread pool would work, but if at all possible I'd like a simple method (= as little additional code as possible). Task-based design also goes in this direction (I think). Is there an easy solution for this?
Initialize a semaphore with 1000 units. Have each of the 4 threads loop around a semaphore wait() and the work function.
All the threads will then work on the function until it has been executed 1000 times. Even if three of the threads get stuck and take ages, the fourth will handle the other 997 calls.
[Edit]
Meh... apparently, the standard C++11 library does not include semaphores. A semaphore is, however, a basic OS synchronization primitive, and so should be easy enough to call, e.g. via POSIX.
You can use either one of the reference implementations of Executors and then call the function via:
#include <experimental/thread_pool>

using std::experimental::post;
using std::experimental::thread_pool;

thread_pool pool_{1};

void do_big_task()
{
    for (int i = 0; i < n; ++i)  // n = number of work items
    {
        post(pool_, [=]
        {
            // do your work here
        });
    }
}
Executors are coming in C++17 so I thought I would get in early.
Or if you want to try another flavour of executors then there is a more recent implementation with a slightly different syntax.
Given that you have already been able to segment the calls into separate entities, one approach is to use std::packaged_task (with its associated std::future) to handle the function calls, and place them in a queue of some sort. In turn, each thread can pick up the packaged tasks and process them.
You will need to lock the queue for concurrent access, and there may be some bottlenecking here, but compared to the concern that a thread can have "a bad day", this should be minimal. This is effectively a thread pool, but it allows you some control over the execution of the tasks.
Another alternative is to use std::async and specify its launch policy as std::launch::async. The disadvantage is that you do not control the thread creation itself, so you are dependent on how efficiently your standard library manages the threads relative to how many cores you have.
Either approach would work; the key is to measure the performance of the approaches over a reasonable sample size. The measurement should cover both time and resource use (threads and keeping the cores busy). Most OSes include ways of measuring the resource usage of a process.

C++/Windows Multi threaded synchronization/Data Sharing

My requirement is that a single frame of data is to be processed by two methods in parallel (they need to run in parallel because they are computationally demanding).
Based on the result of either of the threads, the other need to be stopped.
That is if method 1 returns TRUE first, method 2 should be stopped.
If method 1 returns FALSE first, method 2 should not be stopped.
Similarly, if method 2 returns TRUE first, method 1 should be stopped.
If method 2 returns FALSE first, method 1 should not be stopped.
Please note that method 1 and method 2 are library calls (black box) and I don't have access to their internals. All I know is that they are computationally intense.
How can I implement it in C++/Windows? Any suggestions?
Take a look at the concurrency runtime.
Specifically the task namespace (http://msdn.microsoft.com/en-us/library/dd492427.aspx) and the when_any function (http://msdn.microsoft.com/en-us/library/hh749973.aspx).
concurrency::when_any will create a task that completes when any of the input tasks complete.
No matter if you use plain Windows threads, std::thread, Task Parallelism, or whatever library you prefer, you're still not going to achieve what you want given the details you provided in your question.
While you can certainly figure out when the first thread/task is finished (e.g. @j-w's answer), you cannot really stop the other task gracefully without telling your "black box" library function to stop (unless it provides a way for explicit early cancellation). You didn't indicate that the black-box function can be told to cancel midway, so I'm assuming it cannot.
You cannot simply kill the thread/task, since this would create resource leaks and maybe even other nasty things such as deadlocks, etc., depending on what your black-box function does.
So you could go with something like when_any or other synchronization/signaling primitives and just let the other thread/task continue to run even though you don't need the result; or "un-blackbox" your library functions and add cancellation support; or forget about it altogether.

cancel a c++ 11 async task

How can I stop/cancel an asynchronous task created with std::async and the std::launch::async policy? In other words, I have started a task running on another thread and hold the associated future object. Is there a way to cancel or stop the running task?
In short no.
Longer explanation: There is no safe way to cancel any threads in standard C++. This would require thread cancellation. This feature has been discussed many times during the C++11 standardisation and the general consensus is that there is no safe way to do so. To my knowledge there were three main considered ways to do thread cancellation in C++.
Abort the thread. This would be rather like an emergency stop. Unfortunately it would result in no stack unwinding and no destructors being called. The thread could have been in any state, so possibly holding mutexes, owning heap-allocated data which would be leaked, etc. This was clearly never going to be considered for long, since it would make the entire program undefined. If you want to do this yourself, you can use native_handle to do it; it will, however, be non-portable.
Compulsory cancellation/interruption points. When a thread cancel is requested it internally sets some variable, so that the next time any of a predefined set of interruption points (such as sleep, wait, etc.) is called, it throws some exception. This causes the stack to unwind so cleanup can be done. Unfortunately this type of system makes it very difficult to make any code exception-safe, since most multithreaded code can then suddenly throw. This is the model that Boost.Thread uses; it provides disable_interruption to work around some of the problems, but it is still exceedingly difficult to get right for anything but the simplest of cases. This model has always been considered risky, and understandably it was not accepted into the standard.
Voluntary cancellation/interruption points. Ultimately this boils down to checking some condition yourself when you want to, and, if appropriate, exiting the thread yourself in a controlled fashion. I vaguely recall some talk about adding library features to help with this, but it was never agreed upon.
I would just use a variation of 3. If you are using lambdas, for instance, it would be quite easy to reference an atomic "cancel" variable which you can check from time to time.
In C++11 (I think) there is no standard way to cancel a thread. If you get std::thread::native_handle(), you can do something with it but that's not portable.
Maybe you can do it this way, by checking some condition:
class Timer {
public:
    Timer() : timer_destroy(false) {}

    ~Timer() {
        timer_destroy = true;
        for (auto& result : async_result) {
            result.get();
        }
    }

    void register_event() {
        async_result.push_back(
            std::async(std::launch::async, [](std::atomic<bool>& destroy) {
                while (!destroy) {
                    // do something
                }
                return 0;
            }, std::ref(timer_destroy))
        );
    }

private:
    std::vector<std::future<int>> async_result;
    std::atomic<bool> timer_destroy;
};

What is the purpose of .then construction in PPL tasks?

I'm interested in the purpose of the .then construction in PPL and where I can test it. It seems Visual Studio 2012 doesn't support it yet (maybe some future CTPs will?). And does it have an equivalent in the standard C++11 async library?
The purpose is for you to be able to express asynchronous tasks that have to be executed in sequence.
For example, let's say I'm in a GUI application. When the user presses a button, I want to start a task asynchronously to retrieve a file online, then process it to retrieve some kind of data, then use this data to update the GUI. While this is happening, there are tons of other tasks going on, mainly to keep the GUI responsive.
This can be done by using callbacks that call callbacks.
The .then() feature, combined with lambdas, allows you to write all the callback content where you instantiate it (you can still use separate callbacks if you want).
It also doesn't guarantee that the work of each separate task will be done by the same thread, making it possible for free threads to steal tasks if the initial thread already has too much work to do.
The .then() function doesn't exist in C++11, but it has been proposed for addition to the std::future class (which is basically a handle to a task, or a task result).
Klaim already made a great answer, but I thought I'd give a specific example.
.then attaches a continuation to the task, and is the async equivalent to the synchronous .get, basically.
C++11 has std::future, which is the equivalent to a concurrency::task. std::future currently only has .get, but there is a proposal to add .then (and other good stuff).
std::async(calculate_answer, the_question_of_everything)
    .then([](std::future<int> f){ std::cout << f.get() << "\n"; });
The above snippet will create an asynchronous task (launched with std::async), and then attach a continuation which gets passed the std::future of the finished task as soon as the aforementioned task is done. This actually returns another std::future for that task; the current C++11 standard would block in its destructor, but there is another proposal to make the destructor non-blocking. So with the above code, you create a fire-and-forget task that prints the answer as soon as it's calculated.
The blocking equivalent would be:
auto f = std::async(calculate_answer, the_question_of_everything);
std::cout << f.get() << "\n";
This code will block in f.get() until the answer becomes available.

How do Clojure futures and promises differ?

Both futures and promises block until they have calculated their values, so what is the difference between them?
Answering in Clojure terms, here are some examples from Sean Devlin's screencast:
(def a-promise (promise))
(deliver a-promise :fred)
(def f (future (some-sexp)))
(deref f)
Note that in the promise you are explicitly delivering a value that you select in a later computation (:fred in this case). The future, on the other hand, is being consumed in the same place that it was created. The some-sexp is presumably launched behind the scenes and calculated in tandem (eventually), but if it remains unevaluated by the time it is accessed, the thread blocks until it is available.
edited to add
To help further distinguish between a promise and a future, note the following:
promise
You create a promise. That promise object can now be passed to any thread.
You continue with calculations. These can be very complicated calculations involving side-effects, downloading data, user input, database access, other promises -- whatever you like. The code will look very much like your mainline code in any program.
When you're finished, you can deliver the results to that promise object.
Any item that tries to deref your promise before you're finished with your calculation will block until you're done. Once you're done and you've delivered the promise, the promise won't block any longer.
future
You create your future. Part of your future is an expression for calculation.
The future may or may not execute concurrently. It could be assigned a thread, possibly from a pool. It could just wait and do nothing. From your perspective you cannot tell.
At some point you (or another thread) derefs the future. If the calculation has already completed, you get the results of it. If it has not already completed, you block until it has. (Presumably if it hasn't started yet, derefing it means that it starts to execute, but this, too, is not guaranteed.)
While you could make the expression in the future as complicated as the code that follows the creation of a promise, it's doubtful that's desirable. This means that futures are really more suited to quick, background-able calculations, while promises are really more suited to large, complicated execution paths. Also, promises seem, in terms of the calculations available, a little more flexible and oriented toward the promise creator doing the work and another thread reaping the harvest. Futures are more oriented toward automatically starting a thread (without the ugly and error-prone overhead) and going on with other things until you -- the originating thread -- need the results.
Both Future and Promise are mechanisms to communicate result of asynchronous computation from Producer to Consumer(s).
In case of Future the computation is defined at the time of Future creation and async execution begins "ASAP". It also "knows" how to spawn an asynchronous computation.
In case of Promise the computation, its start time and [possible] asynchronous invocation are decoupled from the delivery mechanism. When computation result is available Producer must call deliver explicitly, which also means that Producer controls when result becomes available.
For Promises Clojure makes a design mistake by using the same object (result of promise call) to both produce (deliver) and consume (deref) the result of computation. These are two very distinct capabilities and should be treated as such.
There are already excellent answers so only adding the "how to use" summary:
Both
Creating a promise or future returns a reference immediately. This reference blocks on deref (or @) until the result of the computation is provided by another thread.
Future
When creating a future you provide a synchronous job to be done. It's executed in a thread from a dedicated unbounded pool.
Promise
You give no arguments when creating a promise. The reference should be passed to another 'user' thread that will deliver the result.
In Clojure, promise, future, and delay are promise-like objects. They all represent a computation that clients can await by using deref (or @). Clients reuse the result, so the computation is not run several times.
They differ in the way the computation is performed:
future will start the computation in a different worker thread. deref will block until the result is ready.
delay will perform the computation lazily, when the first client uses deref, or force.
promise offers the most flexibility, as its result is delivered in any custom way by using deliver. You use it when neither future nor delay matches your use case.
I think chapter 9 of Clojure for the Brave and True has the best explanation of the difference between delay, future, and promise.
The idea which unifies these three concepts is this: task lifecycle. A task can be thought of as going through three stages: a task is defined, a task is executed, a task's result is used.
Some programming languages (like JavaScript) have similarly named constructs (like JS's Promise) which couple together several (or all) of the stages in the task lifecycle. In JS, for instance, it is impossible to construct a Promise object without providing it with either the function (task) which will compute its value, or resolving it immediately with a constant value.
Clojure, however, eschews such coupling, and for this reason it has three separate constructs, each corresponding to a single stage in the task lifecycle.
delay: task definition
future: task execution
promise: task result
Each construct is concerned with its own stage of the task lifecycle and nothing else, thus disentangling higher order constructs like JS's Promise and separating them into their proper parts.
We see now that in JavaScript, a Promise is the combination of all three Clojure constructs listed above. Example:
const promise = new Promise((resolve) => resolve(6))
Let's break it down:
task definition: resolve(6) is the task.
task execution: there is an implied execution context here, namely that this task will be run on a future cycle of the event loop. You don't get a say in this; you can't, for instance, require that this task be resolved synchronously, because asynchronicity is baked into Promise itself. Notice how in constructing a Promise you've already scheduled your task to run (at some unspecified time). You can't say "let me pass this around to a different component of my system and let it decide when it wants to run this task".
task result: the result of the task is baked into the Promise object and can be obtained by thening or awaiting. There's no way to create an "empty" promised result to be filled out later by some yet unknown part of your system; you have to both define the task and simultaneously schedule it for execution.
PS: The separation which Clojure imposes allows these constructs to assume roles for which they would have been unsuited had they been tightly coupled. For instance, a Clojure promise, having been separated from task definition and execution, can now be used as a unit of transfer between threads.
Firstly, a Promise is a Future. I think you want to know the difference between a Promise and a FutureTask.
A Future represents a value that is not currently known but will be known in the future.
A FutureTask represents the result of a computation that will happen in future (maybe in some thread pool). When you try to access the result, if the computation has not happened yet, it blocks. Otherwise the result is returned immediately. There is no other party involved in the computing the result as the computation is specified by you in advance.
A Promise represents a result that will be delivered by the promiser to the promisee in the future. In this case you are the promisee, and the promiser is the one who gave you the Promise object. Similar to the FutureTask, if you try to access the result before the Promise has been fulfilled, it blocks until the promiser fulfills the Promise. Once the Promise is fulfilled, you always get the same value, immediately. Unlike a FutureTask, there is another party involved here: the one which made the Promise. That other party is responsible for doing the computation and fulfilling the Promise.
In that sense, a FutureTask is a Promise you made to yourself.