What's the overhead of destructing std::future?
When I was reading the PDF, I noticed that:
// Example 1
// (a)
{
async( []{ f(); } );
async( []{ g(); } );
}
// (b)
{
auto f1 = async( []{ f(); } );
auto f2 = async( []{ g(); } );
}
Users are often surprised to discover that (a) and (b) do not have the same behavior, because normally we ignore (and/or don’t look at) return values if we end up deciding we’re not interested in the value and doing so does not change the meaning of our program.
But when I checked the difference on quick-bench, the result was the opposite of what I expected. Am I missing any point of the discussion?
https://quick-bench.com/q/L0HtxBWmgvswK7Q90AqOPmsg_F8
Benchmarking
I suspect std::this_thread::sleep_for(1000ms) is the culprit here. Putting a thread to sleep and waking it up again is not a precise operation: sleep_for only guarantees a minimum duration, and the scheduler may wake the thread noticeably later. If you remove the sleep (or use a deterministic operation instead), you can see that the two versions are basically identical.
The async Issue
Skimming the PDF, the issue Herb is talking about is that if you chain multiple async calls, they may execute differently depending on whether you assign the future to a variable. The future's destructor still runs in all scenarios, just at different points in time. Remember that C++ cannot overload a function on its return type. That means that whether or not you use the return value, it is still constructed and therefore has to be destroyed. The only difference is when the destruction happens.
Scenario (a) is different, because both async calls happen synchronously, since the future is destroyed immediately, which blocks the thread:
{
async( []{ f(); } );
// ^ this future is a temporary, it will be destroyed here, directly after the function call
async( []{ g(); } );
// ^ this future is a temporary, it will be destroyed here, directly after the function call
}
In scenario (b), on the other hand, both async calls can happen before either future is destroyed, because both futures are only destroyed after both async calls have been made.
{
auto f1 = async( []{ f(); } );
auto f2 = async( []{ g(); } );
}
// ^ variables went out of scope here, both futures will be destroyed now
The point of the PDF is that when you do:
async( []{ f(); } );
async( []{ g(); } );
The hidden destructor of the future after each async call makes the current thread wait for that task to complete, thereby making the code synchronous, i.e. the code does:
launch task1
wait for task1
launch task2
wait for task2
Changing to:
auto f1 = async( []{ f(); } );
auto f2 = async( []{ g(); } );
Now both tasks are launched before either future is destroyed, allowing the tasks to run in parallel (assuming std::async is actually using multiple threads).
Your benchmark only runs a single task in each loop and therefore doesn't demonstrate this problem. I imagine most of the difference in your results comes from the randomness of std::this_thread::sleep_for.
Changing the benchmark to have two tasks does show a small difference: https://quick-bench.com/q/wy6yPq4yMBi_VGSJ7HI_hLccZmc (though again it might just be within the margin of error). I'm not sure that the quick-bench website even offers multiple CPUs.
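To see the difference without a benchmark harness, here is a minimal sketch that times both forms directly. The names slow_task, elapsed_with_temporaries_ms and elapsed_with_named_futures_ms are made up for illustration, and the 100 ms sleep is just a stand-in for f() and g(); std::launch::async is passed explicitly so the tasks really get their own threads.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for f() and g(): a task that takes roughly 100 ms.
inline void slow_task() {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
}

inline long long ms_since(Clock::time_point start) {
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               Clock::now() - start).count();
}

// Case (a): each future is a temporary, so its destructor blocks at the end
// of the statement, before the next std::async call is made.
long long elapsed_with_temporaries_ms() {
    const auto start = Clock::now();
    {
        // (void) only silences the C++20 [[nodiscard]] warning; the temporary
        // future is still destroyed at the end of each statement.
        (void)std::async(std::launch::async, slow_task);
        (void)std::async(std::launch::async, slow_task);
    }
    return ms_since(start);
}

// Case (b): both futures live until the closing brace, so both tasks are
// already running when the destructors start waiting.
long long elapsed_with_named_futures_ms() {
    const auto start = Clock::now();
    {
        auto f1 = std::async(std::launch::async, slow_task);
        auto f2 = std::async(std::launch::async, slow_task);
    }
    return ms_since(start);
}
```

Calling both functions should show case (a) taking about twice as long as case (b), since the two sleeps run one after the other in (a) and overlap in (b).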
Related
void func() {
std::future<int> fut = std::async(std::launch::async, []{
std::this_thread::sleep_for(std::chrono::seconds(10));
return 8;
});
return;
}
Let's say that I have such a function. An object fut of type std::future<int> is initialized with a std::async job, which will return an integer in the future. But fut will be destroyed as soon as the function func returns.
Is there an issue here? When the std::async job returns and tries to assign the 8 to the object fut, the object has already been destroyed...
If so, a core dump from SIGSEGV may occur...
Am I right about it? Or does std::future have some mechanism to avoid this issue?
Futures created with std::async have a blocking destructor that waits for the task to finish.
Even if you instead create a future from std::promise, the docs for ~future() don't mention the problem you're asking about.
The docs use the term "shared state" a lot, implying that a promise and a future share a (heap-allocated?) state, which holds the return value, and isn't destroyed until both die.
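A minimal sketch of that lifetime rule, using a future obtained from std::promise (so no blocking destructor is involved). The function names are hypothetical and only serve the illustration: in the first one the future dies before the value is set, in the second the promise dies before the value is read.

```cpp
#include <cassert>
#include <future>
#include <utility>

// The future is destroyed before the value is set. set_value still has a
// valid shared state to write to, because the promise keeps it alive.
bool set_after_future_destroyed() {
    std::promise<int> p;
    {
        std::future<int> fut = p.get_future();
    }   // fut is destroyed here; it does not block (it is not from std::async)
    p.set_value(8);   // writes into the shared state, nothing dangles
    return true;
}

// The promise is destroyed before the value is read. The future keeps the
// shared state (and the stored 8) alive.
int read_after_promise_destroyed() {
    std::future<int> fut;
    {
        std::promise<int> p;
        fut = p.get_future();
        p.set_value(8);
    }   // p is destroyed here
    return fut.get();
}
```

In both directions the result lives in the shared state rather than in the std::future object itself, so there is nothing for the producer to write into that could already be gone.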
While working with the threaded model of C++11, I noticed that
std::packaged_task<int(int,int)> task([](int a, int b) { return a + b; });
auto f = task.get_future();
task(2,3);
std::cout << f.get() << '\n';
and
auto f = std::async(std::launch::async,
[](int a, int b) { return a + b; }, 2, 3);
std::cout << f.get() << '\n';
seem to do exactly the same thing. I understand that there could be a major difference if I ran std::async with std::launch::deferred, but is there one in this case?
What is the difference between these two approaches, and more importantly, in what use cases should I use one over the other?
Actually the example you just gave shows the differences if you use a rather long function, such as
//! sleeps for one second and returns 1
auto sleep = [](){
std::this_thread::sleep_for(std::chrono::seconds(1));
return 1;
};
Packaged task
A packaged_task won't start on its own, you have to invoke it:
std::packaged_task<int()> task(sleep);
auto f = task.get_future();
task(); // invoke the function
// You have to wait until task returns. Since task calls sleep
// you will have to wait at least 1 second.
std::cout << "You can see this after 1 second\n";
// However, f.get() will be available, since task has already finished.
std::cout << f.get() << std::endl;
std::async
On the other hand, std::async with launch::async will try to run the task in a different thread:
auto f = std::async(std::launch::async, sleep);
std::cout << "You can see this immediately!\n";
// However, the value of the future will be available after sleep has finished
// so f.get() can block up to 1 second.
std::cout << f.get() << "This will be shown after a second!\n";
Drawback
But before you try to use async for everything, keep in mind that the returned future has a special shared state, which demands that future::~future blocks:
std::async(do_work1); // ~future blocks
std::async(do_work2); // ~future blocks
/* output: (assuming that do_work* log their progress)
do_work1() started;
do_work1() stopped;
do_work2() started;
do_work2() stopped;
*/
So if you want truly asynchronous execution, you need to keep the returned future. Note that the destructor still blocks if circumstances change and you no longer care about the result:
{
auto pizza = std::async(get_pizza);
/* ... */
if(need_to_go)
return; // ~future will block
else
eat(pizza.get());
}
For more information on this, see Herb Sutter's article async and ~future, which describes the problem, and Scott Meyers's std::futures from std::async aren't special, which describes the insights. Also note that this behavior was specified in C++14 and up, but was also commonly implemented in C++11.
Further differences
By using std::async you cannot run your task on a specific thread anymore, where std::packaged_task can be moved to other threads.
std::packaged_task<int(int,int)> task(...);
auto f = task.get_future();
std::thread myThread(std::move(task),2,3);
std::cout << f.get() << "\n";
Also, a packaged_task needs to be invoked before you call f.get(), otherwise your program will freeze, as the future will never become ready:
std::packaged_task<int(int,int)> task(...);
auto f = task.get_future();
std::cout << f.get() << "\n"; // oops!
task(2,3);
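If you are unsure whether the task has already been invoked, you can ask the future without blocking. The following is a small sketch; is_ready, ReadyStates and invoke_then_get are names invented here, not part of the standard library.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper: reports whether a future is ready, without blocking.
template <typename T>
bool is_ready(const std::future<T>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

struct ReadyStates {
    bool before;  // ready before the task was invoked?
    bool after;   // ready after the task was invoked?
    int value;    // result obtained through the future
};

ReadyStates invoke_then_get() {
    std::packaged_task<int(int, int)> task([](int a, int b) { return a + b; });
    std::future<int> f = task.get_future();

    ReadyStates s{};
    s.before = is_ready(f);  // the task has not run yet, so f.get() would block
    task(2, 3);              // runs the callable on this thread
    s.after = is_ready(f);   // the result is now stored in the shared state
    s.value = f.get();       // returns immediately
    return s;
}
```

Before the call to task(2, 3) the future reports "not ready", which is exactly the state in which f.get() would hang forever in the snippet above.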
TL;DR
Use std::async if you want some things done and don't really care when they're done, and std::packaged_task if you want to wrap up things in order to move them to other threads or call them later. Or, to quote Christian:
In the end a std::packaged_task is just a lower level feature for implementing std::async (which is why it can do more than std::async if used together with other lower level stuff, like std::thread). Simply spoken a std::packaged_task is a std::function linked to a std::future and std::async wraps and calls a std::packaged_task (possibly in a different thread).
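To make that relationship concrete, here is a rough sketch of an async-like launcher built from std::packaged_task and std::thread. The name spawn_task is an assumption for this example, and this is not how any particular standard library implements std::async; it only shows that the pieces fit together.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical launcher: wraps f in a packaged_task, hands the task to a
// detached thread, and returns the future linked to it.
template <typename F>
auto spawn_task(F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    std::packaged_task<R()> task(std::move(f));
    std::future<R> result = task.get_future();
    std::thread(std::move(task)).detach();  // the task runs on its own thread
    return result;
}
```

A call such as spawn_task([] { return 2 + 3; }).get() yields 5. Unlike a future returned by std::async, the future returned here does not block in its destructor, so dropping it lets the detached thread simply run to completion on its own.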
TL;DR
std::packaged_task allows us to get the std::future "bound" to some callable, and then control when and where this callable will be executed without the need of that future object.
std::async enables the first, but not the second. Namely, it allows us to get the future for some callable, but then, we have no control of its execution without that future object.
Practical example
Here is a practical example of a problem that can be solved with std::packaged_task but not with std::async.
Consider you want to implement a thread pool. It consists of a fixed number of worker threads and a shared queue. But shared queue of what? std::packaged_task is quite suitable here.
template <typename T>
class ThreadPool {
public:
using task_type = std::packaged_task<T()>;
std::future<T> enqueue(task_type task) {
// could be passed by reference as well...
// ...or implemented with perfect forwarding
std::future<T> res = task.get_future();
{ std::lock_guard<std::mutex> lock(mutex_);
tasks_.push(std::move(task));
}
cv_.notify_one();
return res;
}
void worker() {
while (true) { // supposed to be run forever for simplicity
task_type task;
{ std::unique_lock<std::mutex> lock(mutex_);
cv_.wait(lock, [this]{ return !this->tasks_.empty(); });
task = std::move(tasks_.front()); // std::queue has front(), not top()
tasks_.pop();
}
task();
}
}
... // constructors, destructor,...
private:
std::vector<std::thread> workers_;
std::queue<task_type> tasks_;
std::mutex mutex_;
std::condition_variable cv_;
};
Such functionality cannot be implemented with std::async. We need to return a std::future from enqueue(). If we called std::async there (even with the deferred policy) and returned its std::future, we would have no way to execute the callable in worker(). Note that you cannot create multiple futures for the same shared state (futures are non-copyable).
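For completeness, here is one way the elided constructor and destructor could be filled in so the pool actually runs and shuts down. This is a sketch under my own assumptions, not the answer's original code: the class name SimplePool, the stop_ flag and the helper sum_of_squares_on_pool are all invented for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

template <typename T>
class SimplePool {  // hypothetical name
public:
    using task_type = std::packaged_task<T()>;

    explicit SimplePool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { worker(); });
    }

    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;  // assumption: workers drain the queue, then exit
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    std::future<T> enqueue(task_type task) {
        std::future<T> res = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
        return res;
    }

private:
    void worker() {
        while (true) {
            task_type task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // only reachable once stop_ is set
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // runs outside the lock; the result lands in the future
        }
    }

    std::vector<std::thread> workers_;
    std::queue<task_type> tasks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
};

// Usage sketch: four small tasks on two workers.
int sum_of_squares_on_pool() {
    SimplePool<int> pool(2);
    std::vector<std::future<int>> results;
    for (int i = 1; i <= 4; ++i)
        results.push_back(
            pool.enqueue(SimplePool<int>::task_type([i] { return i * i; })));
    int sum = 0;
    for (auto& r : results) sum += r.get();
    return sum;  // 1 + 4 + 9 + 16
}
```

The futures handed out by enqueue() come from packaged_task, so they do not block in their destructors; only get() or wait() makes the caller wait for a worker.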
Packaged Task vs async
A packaged task holds a task (a function or function object) and a future/promise pair. When the task executes a return statement, it causes set_value(..) to be called on the packaged_task's promise.
Given future, promise and packaged_task, we can create simple tasks without worrying too much about threads (a thread is just something we give a task to run on).
However, we still need to consider how many threads to use, or whether a task is best run on the current thread or on another, etc. Such decisions can be handled by a thread launcher called async(), which decides whether to create a new thread, recycle an old one, or simply run the task on the current thread. It returns a future.
"The class template std::packaged_task wraps any callable target
(function, lambda expression, bind expression, or another function
object) so that it can be invoked asynchronously. Its return value or
exception thrown is stored in a shared state which can be accessed
through std::future objects."
"The template function async runs the function f asynchronously
(potentially in a separate thread) and returns a std::future that will
eventually hold the result of that function call."
I am a little bit confused by the std::async function.
The specification says:
asynchronous operation being executed "as if in a new thread of execution" (C++11 §30.6.8/11).
Now, what is that supposed to mean?
In my understanding, the code
std::future<double> fut = std::async(std::launch::async, pow2, num);
should launch the function pow2 on a new thread and pass the variable num to the thread by value, then sometime in the future, when the function is done, place the result in fut (as long as the function pow2 has a signature like double pow2(double);). But the specification states "as if", which makes the whole thing kinda foggy for me.
The question is:
Is a new thread always launched in this case? I hope so. I mean for me, the parameter std::launch::async makes sense in a way that I am explicitly stating I indeed want to create a new thread.
And the code
std::future<double> fut = std::async(std::launch::deferred, pow2, num);
should make lazy evaluation possible, by delaying the pow2 function call to the point where I write something like var = fut.get();. In this case the parameter std::launch::deferred should mean that I am explicitly stating that I don't want a new thread, I just want to make sure the function gets called when there is need for its return value.
Are my assumptions correct? If not, please explain.
Also, I know that by default the function is called as follows:
std::future<double> fut = std::async(std::launch::deferred | std::launch::async, pow2, num);
In this case, I was told that whether a new thread will be launched or not depends on the implementation. Again, what is that supposed to mean?
The std::async (part of the <future> header) function template is used to start a (possibly) asynchronous task. It returns a std::future object, which will eventually hold the return value of std::async's parameter function.
When the value is needed, we call get() on the std::future instance; this blocks the thread until the future is ready and then returns the value. std::launch::async or std::launch::deferred can be specified as the first parameter to std::async in order to specify how the task is run.
1. std::launch::async indicates that the function call must be run on its own (new) thread. (Take user #T.C.'s comment into account).
2. std::launch::deferred indicates that the function call is to be deferred until either wait() or get() is called on the future. Ownership of the future can be transferred to another thread before this happens.
3. std::launch::async | std::launch::deferred indicates that the implementation may choose. This is the default option (when you don't specify one yourself). It can decide to run synchronously.
Is a new thread always launched in this case?
From 1., we can say that a new thread is always launched.
Are my assumptions [on std::launch::deferred] correct?
From 2., we can say that your assumptions are correct.
What is that supposed to mean? [in relation to a new thread being launched or not depending on the implementation]
From 3., as std::launch::async | std::launch::deferred is the default option, it means that the implementation of the template function std::async will decide whether it will create a new thread or not. This is because some implementations may be checking for over scheduling.
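The first two policies can be checked with a few lines of code. The sketch below compares thread ids to see where the task ran; runs_on_calling_thread and reports_deferred are hypothetical helper names chosen for this example.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Returns true if a task launched with `policy` ran on the thread that
// called get().
bool runs_on_calling_thread(std::launch policy) {
    const std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> fut =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return fut.get() == caller;
}

// A deferred task has not started yet; wait_for reports that right away
// instead of waiting for the timeout.
bool reports_deferred() {
    auto fut = std::async(std::launch::deferred, [] { return 1; });
    return fut.wait_for(std::chrono::seconds(0)) ==
           std::future_status::deferred;
}
```

With std::launch::deferred the first helper returns true, because the callable runs inside get() on the caller's thread; with std::launch::async it returns false. For the default policy the answer depends on what the implementation picks, which is why checking for future_status::deferred is useful before polling a future in a loop.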
WARNING
The following section is not related to your question, but I think that it is important to keep in mind.
The C++ standard says that if a std::future holds the last reference to the shared state corresponding to a call to an asynchronous function, that std::future's destructor must block until the thread for the asynchronously running function finishes. An instance of std::future returned by std::async will thus block in its destructor.
void operation()
{
auto func = [] { std::this_thread::sleep_for( std::chrono::seconds( 2 ) ); };
std::async( std::launch::async, func );
std::async( std::launch::async, func );
std::future<void> f{ std::async( std::launch::async, func ) };
}
This code is misleading: it can make you think that the std::async calls are asynchronous, but they are actually synchronous. The std::future instances returned by the first two std::async calls are temporaries, and because they are not assigned to a variable their destructors run, and block, right when std::async returns.
The first call to std::async will block for 2 seconds, followed by another 2 seconds of blocking from the second call to std::async. We may think that the last call to std::async does not block, since we store its returned std::future instance in a variable, but since that is a local variable that is destroyed at the end of the scope, it will actually block for an additional 2 seconds at the end of the scope of the function, when local variable f is destroyed.
In other words, calling the operation() function will block whatever thread it is called on synchronously for approximately 6 seconds. Such requirements might not exist in a future version of the C++ standard.
Sources of information I used to compile these notes:
C++ Concurrency in Action: Practical Multithreading, Anthony Williams
Scott Meyers' blog post: http://scottmeyers.blogspot.ca/2013/03/stdfutures-from-stdasync-arent-special.html
I was also confused by this and ran a quick test on Windows, which shows that the async future will be run on the OS thread pool threads. A simple application can demonstrate this; breaking into the debugger in Visual Studio will also show the executing threads named as "TppWorkerThread".
#include <future>
#include <thread>
#include <iostream>
using namespace std;
int main()
{
cout << "main thread id " << this_thread::get_id() << endl;
future<int> f1 = async(launch::async, [](){
cout << "future run on thread " << this_thread::get_id() << endl;
return 1;
});
f1.get();
future<int> f2 = async(launch::async, [](){
cout << "future run on thread " << this_thread::get_id() << endl;
return 1;
});
f2.get();
future<int> f3 = async(launch::async, [](){
cout << "future run on thread " << this_thread::get_id() << endl;
return 1;
});
f3.get();
cin.ignore();
return 0;
}
Will result in an output similar to:
main thread id 4164
future run on thread 4188
future run on thread 4188
future run on thread 4188
That is not actually true.
Add a thread_local variable and you will see that std::async actually runs the f1, f2 and f3 tasks in different threads, just with the same std::thread::id.
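One way to run that experiment is sketched below. The names calls_on_this_thread and counts_seen_by_tasks are made up for the example; what the values look like on a given platform is for you to observe, since it depends on how the implementation maps tasks to threads.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Each thread of execution gets its own copy, initialised to 0.
thread_local int calls_on_this_thread = 0;

// Runs n tasks one after another with std::launch::async and records the
// value of the thread-local counter that each task saw after incrementing it.
std::vector<int> counts_seen_by_tasks(int n) {
    std::vector<int> seen;
    for (int i = 0; i < n; ++i) {
        auto fut = std::async(std::launch::async,
                              [] { return ++calls_on_this_thread; });
        seen.push_back(fut.get());
    }
    return seen;
}
```

If every task sees a fresh counter (each entry is 1), the tasks behaved as if each ran in a new thread, even when the printed thread ids repeat. Increasing values would indicate that thread-local state is being carried over between tasks.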
What is the difference between the two statements below in terms of execution?
async([]() { ... });
thread([]() { ... }).detach();
std::async ([]() { ... }); // (1)
std::thread ([]() { ... }).detach (); // (2)
Most often when std::async is being discussed, the first thing noted is that it's broken: the name implies something which doesn't hold when the returned value isn't honored (assigned to a variable to be destructed at the end of the current scope).
In this case the brokenness of std::async is exactly what is going to result in a huge difference between (1) and (2); one will block, the other won't.
Why does std::async block in this context?
The return-value of std::async is a std::future which has a blocking destructor that must execute before the code continues.
In an example such as the one below, g won't execute until f has finished, simply because the unused return value of (3) can't be destroyed until all work in the relevant statement is done.
std::async (f); // (3)
std::async (g); // (4)
What is the purpose of std::thread (...).detach ()?
When detaching from a std::thread we are simply saying; "I don't care about this thread handle anymore, please just execute the damn thing."
To continue with an example similar to the previous one (about std::async) the difference is notably clear; both f and g will execute simultaneously.
std::thread (f).detach ();
std::thread (g).detach ();
I know a good answer was given to your question but if we were to change your question a little something interesting would occur.
Imagine you kept the future returned by the async and didn't detach the thread but instead made a variable for it like this,
Asynchronous code
auto fut=std::async([]() { ... });
std::thread th([]() { ... });
Now you have the setup to what makes these 2 constructs different.
th.join(); // you're here until the thread function returns
fut.wait_for(std::chrono::seconds(1)); // wait for 1 sec then continue.
A thread is an all-or-nothing thing when joining it, whereas an async can be checked and you can go do other stuff.
wait_for actually returns a status so you can do things like this.
int numOfDots = 0;
// Assumes fut was launched with std::launch::async; a deferred future never
// becomes ready in this loop.
// While not ready after waiting 1 sec, do some stuff and then check again
while (fut.wait_for(std::chrono::seconds(1)) != std::future_status::ready)
{
numOfDots = (numOfDots + 1) % 20;
// Tell the user you're still working on it.
std::cout << "Working on it" << std::string(numOfDots, '.') << "\r" << std::flush;
}
std::cout << "Thanks for waiting!\nHere's your answer: " << fut.get() << std::endl;
async returns a future object, detach does not. All detach does is allow the execution to continue independently. In order to achieve a similar effect as async, you must use join. For example:
{
std::async(std::launch::async, []{ f(); });
std::async(std::launch::async, []{ g(); }); // does not run until f() completes
}
{
thread1.join();
thread2.join();
}
When should I use std::promise over std::async or std::packaged_task?
Can you give me practical examples of when to use each one of them?
std::async
std::async is a neat and easy way to get a std::future, but:
It does not always start a new thread; the enum value std::launch::async can be passed as the first argument to std::async in order to ensure that a new thread is created to execute the task specified by func, thus ensuring that func executes asynchronously.
auto f = std::async( std::launch::async, func );
The destructor of std::future can block until the new thread completes.
auto sleep = [](int s) { std::this_thread::sleep_for(std::chrono::seconds(s)); };
{
auto f = std::async( std::launch::async, sleep, 5 );
}
Normally we expect that only .get() or .wait() blocks, but for a std::future returned from std::async, the destructor also may block, so be careful not to block your main thread just by forgetting about it.
If the std::future is stored in a temporary object, the std::async call will block at the point of the object's destruction, so the following block will take 10 seconds if you remove the auto f = initializations. It will only block for 5 seconds otherwise, because the two sleeps will be concurrent, with a wait for both to complete resulting from the destruction of the two objects at the end of the block:
auto sleep = [](int s) { std::this_thread::sleep_for(std::chrono::seconds(s)); };
{
auto f1 = std::async( std::launch::async, sleep, 5 );
auto f2 = std::async( std::launch::async, sleep, 5 );
}
std::packaged_task
std::packaged_task by itself has nothing to do with threads: it is just a functor and a related std::future. Consider the following:
auto task = [](int i) {
std::this_thread::sleep_for(std::chrono::seconds(5));
return i+100;
};
std::packaged_task< int(int) > package{ task };
std::future<int> f = package.get_future();
package(1);
std::cout << f.get() << "\n";
Here we just run the task by package(1), and after it returns, f is ready so no blocking on .get().
There is a feature of std::packaged_task that makes it very useful for threads. Instead of just a function, you can initialize std::thread with a std::packaged_task, which gives a really nice way of getting to the std::future. Consider the following:
std::packaged_task< int(int) > package{ task };
std::future<int> f = package.get_future();
std::thread t { std::move(package), 5 };
std::cout << f.get() << "\n"; //block here until t finishes
t.join();
Because std::packaged_task is not copyable, you must move it to the new thread with std::move.
std::promise
std::promise is a powerful mechanism. For example, you can pass a value to a new thread without the need for any additional synchronization.
auto task = [](std::future<int> i) {
std::cout << i.get() << std::flush;
};
std::promise<int> p;
std::thread t{ task, p.get_future() };
std::this_thread::sleep_for(std::chrono::seconds(5));
p.set_value(5);
t.join();
The new thread will wait for us on .get().
So, in general, answering your question:
Use std::async only for simple things, e.g. to make some call non-blocking, but bear in mind the comments on blocking above.
Use std::packaged_task to easily get to a std::future, and run it on a separate thread:
std::thread{ std::move(package), param }.detach();
or
std::thread t { std::move(package), param };
Use std::promise when you need more control over the future.
See also std::shared_future and on passing exceptions between threads std::promise::set_exception
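As a small illustration of passing an exception between threads with std::promise::set_exception, consider the sketch below. The function name message_from_worker and the error text are invented for the example.

```cpp
#include <cassert>
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

// The worker fails and stores the exception in the promise; the caller sees
// it rethrown from get() and returns its message.
std::string message_from_worker() {
    std::promise<int> p;
    std::future<int> f = p.get_future();

    std::thread t([](std::promise<int> pr) {
        try {
            throw std::runtime_error("worker failed");
        } catch (...) {
            // Hand the in-flight exception over to the shared state.
            pr.set_exception(std::current_exception());
        }
    }, std::move(p));

    std::string msg;
    try {
        f.get();  // rethrows the stored exception on this thread
    } catch (const std::runtime_error& e) {
        msg = e.what();
    }
    t.join();
    return msg;
}
```

The caller ends up with the string "worker failed", even though the throw happened on the other thread. std::packaged_task and std::async do the same forwarding automatically for exceptions that escape the wrapped callable.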
A promise is used to store a value that was calculated using e.g. a std::async.
See http://en.cppreference.com/w/cpp/thread/promise
I can imagine you wonder about the difference between std::packaged_task and std::async. In the most common approach, std::async starts a separate thread NOW to run a function/lambda/etc. with a (likely) expensive calculation.
A std::packaged_task is used to wrap a function/lambda/etc. with the current values of arguments so you can run it LATER, either synchronously or in a separate thread.
Both std::packaged_task and std::async provide a std::future that will contain the RESULT of the wrapped function/lambda/etc once run.
Internally, the std::future uses a std::promise to hold that result.