I wanted to know how appropriate its to use std::async in performance oriented code. Specifically
Is there any penalty in catching the exception from worker thread to main thread?
How are the values returned from worker to main?
Are the input arguments passed by ref actually never get copied or not?
I am planning to pass a heavy session object to a thread or write std::async.
bool fun(MySession& sessRef);
MySession sess;
auto r = std::async(&fun, sess);
EDIT:
I am planning to use it with GCC 4.9.1 and VS2013 both since the application is platform agnostic. However most deployments will be *nix based so atleast GCC should be performant.
We can't tell exactly "how is std::async implemented", since you're not referring to a particular toolchain that provides that implementation actually.
1. Is there any penalty in catching the exception from worker thread to main thread?
Define "Penalty" by which means exactly? That can't be answered unless you clarify about your concerns/requirements.
Usually there shouldn't be any penalty, by just catching the exception in the thread, that created the throwing one. It's just about the exception may be provided to the creating thread via the join(), and this causes some cost for keeping that particular exception through handling of join().
2. How are the values returned from worker to main?
To cite what's the c++ standards definition saying about this point:
30.6.8 Function template async
4 Returns: An object of type future<typename result_of<typename decay<F>::type(typename decay<Args>::type...)>::type> that refers to the shared state created by this call to async.
3. Are the input arguments passed by ref actually never get copied or
not?
That point is answered in detail here: Passing arguments to std::async by reference fails. As you see, the default case they are copied.
According to #Yakk's comment, it might be possible to pass these parameters via std::ref to avoid operating on copies, but take references.
how is std::async implemented
I can tell only for the c++ standards requirements, how it should be implemented, unless you're referring to a particular toolchain, that tries to provide a proper implementation of std::async.
Related
This question already has an answer here:
Why doesn't C++ have a std::thread_pool in the standard library?
(1 answer)
Closed 4 months ago.
Since C++11 there has been a surge in the amount of parallel/concurrent programming tools in C++: threads, async functions, parallel algorithms, coroutines… But what about a popular parallel programming pattern: thread pool?
As far as I can see, nothing in the standard library implements this directly. Threading via std::thread can be used to implement a thread pool, but this requires manual labor. Asynchronous function via std::async can be launched either in a new thread (std::launch::async) or in the calling thread (std::launch::deferred).
I think std::async could've been easily made to support thread pooling: via another launch policy (std::launch::thread_pool) which executes the task in an implicitly created global thread pool; or there could be a std::thread_pool object plus an overload of std::async which takes a thread pool.
Was something like this considered, and if so, why was it rejected? Or is there a standard solution that I am missing?
In principal std::async could use a thread pool and is seems to me that allowing this was the intention. But in practice the existence of thread_local makes it difficult.
From cppreference on std::async with std::launch::async:
[...] execute the callable object f on a new thread of execution (with all thread-locals initialized) as if spawned by std::thread(std::forward<F>(f), std::forward<Args>(args)...) [...]
If the function contains any local thread_local variables, and if std::async used a thread pool, the behavior of running the function would std::async could be different from the behavior of std::thread.
One example may be that the thread_local might not have not have it's initial value the second time its called by the same thread. If you were to use std::thread instead, it would always have the initial value.
Another way the behavior would diverge is thread_local object's destructors would not run in the same way for std::async and std::thread. This is illustrated by Microsoft's attempt at using a thread pool, which I suspect scared off others from trying it. You can read about this non-conformance here : In Visual Studio, thread_local variables' destructor not called when used with std::async, is this a bug?.
To be completely conforming, the implementation would need to "reset" all the thread_local objects anyway. This requires compiler support and starts to look an awful lot like starting a new thread anyway.
I am using std::thread to create a thread, but looks like there is no API can tell if this operation success or failure. Is there any way to know this information?
The constructor of std::thread throws an exception in the case of a failure.
https://en.cppreference.com/w/cpp/thread/thread/thread
Exceptions 3)
std::system_error if the thread could not be started.
The exception may represent the error condition
std::errc::resource_unavailable_try_again or another
implementation-specific error condition.
If you use OS-specific threading mechanics, you may look up their respective (C) APIs:
Windows
Linux
See reference for std::thread constructor:
std::system_error if the thread could not be started.
This is the general practice in C++. When constructor fails, you raise an exception, otherwise your object is in an awkward invalid state, which is something standard library usually avoids.
There are two popular approaches to determining success of unreliable operations in modern programming languages.
Have a returned status code or readable error code, which indicates whether the operation was successful. This is the older approach, used in C programs and hence the POSIX interfaces many of them use. With this style, functions and methods have the semantics of only trying to perform an operation. On return from the function or method you know that an attempt has been made, but you do not know whether it was successful.
Throw an exception if, and only if, the operation failed. This is the newer approach, used in well written C++ programs and the C++ standard library. With this style, functions, methods and constructors have the semantics of doing an operation. On return from the function, method or constructor you know that the operation has been successfull.
So, in your particular case, return from the thread constructor means the thread was successfully created, and there is no need to check a status code.
As the title of the question says, why C++ threads (std::thread and pthread) are movable but not copiable? What consequences are there, if we do make it copiable?
Regarding copying, consider the following snippet:
void foo();
std::thread first (foo);
std::thread second = first; // (*)
When the line marked (*) takes place, presumably some of foo already executed. What would the expected behavior be, then? Execute foo from the start? Halt the thread, copy the registers and state, and rerun it from there?
In particular, given that function objects are now part of the standard, it's very easy to launch another thread that performs exactly the same operation as some earlier thread, by reusing the function object.
There's not much motivation to begin with for this, therefore.
Regarding moves, though, consider the following:
std::vector<std::thread> threads;
without move semantics, it would be problematic: when the vector needs to internally resize, how would it move its elements to another buffer? See more on this here.
If the thread objects are copyable, who is finally responsible for the single thread of execution associated with the thread objects? In particular, what would join() do for each of the thread objects?
There are several possible outcomes, but that is the problem, there are several possible outcomes with no real overlap that can be codified (standardised) as a general use case.
Hence, the most reasonable outcome is that 1 thread of execution is associated with at most 1 thread object.
That is not to say some shared state cannot be provided, it is just that the user then needs to take further action in this regard, such as using a std::shared_ptr.
The C++11 standard says:
30.6.6 Class template future
(3) "The effect of calling any member function other than the destructor,
the move-assignment operator, or valid on a future object for which
valid() == false is undefined."
So, does it mean that the following code might encounter undefined behaviour?
void wait_for_future(std::future<void> & f)
{
if (f.valid()) {
// what if another thread meanwhile calls get() on f (which invalidates f)?
f.wait();
}
else {
return;
}
}
Q1: Is this really a possible undefined behaviour?
Q2: Is there any standard compliant way to avoid the possible undefined behaviour?
Note that the standard has an interesting note [also in 30.6.6 (3)]:
"[Note: Implementations are encouraged
to detect this case and throw an object of type future_error with an
error condition of future_errc::no_state. —endnote]"
Q3: Is it ok if I just rely on the standard's note and just use f.wait() without checking f's validity?
void wait_for_future(std::future<void> & f)
{
try {
f.wait();
}
catch (std::future_error const & err) {
return;
}
}
EDIT: Summary after receiving the answers and further research on the topic
As it turned out, the real problem with my example was not directly due to parallel modifications (a single modifying get was called from a single thread, the other thread called valid and wait which shall be safe).
The real problem was that the std::future object's get function was accessed from a different thread, which is not the intended use case! The std::future object shall only be used from a single thread!
The only other thread that is involved is the thread that sets the shared state: via return from the function passed to std::async or calling set_value on the related std::promise object, etc.
More: even waiting on an std::future object from another thread is not intended behaviour (due to the very same UB as in my example#1). We shall use std::shared_future for this use case, having each thread its own copy of an std::shared_future object. Note that all these are not through the same shared std::future object, but through separate (related) objects!
Bottom line:
These objects shall not be shared between threads. Use a separate (related) object in each thread.
A normal std::future is not threadsafe by itself. So yes it is UB, if you call modifying functions from multiple threads on a single std::future as you have a potential race condition. Though, calling wait from multiple threads is ok as it's const/non-modifying.
However, if you really need to access the return value of a std::future from multiple threads you can first call std::future::share on the future to get a std::shared_future which you can copy to each thread and then each thread can call get. Note that it's important that each thread has its own std::shared_future object.
You only need to check valid if it is somehow possible that your future might be invalid which is not the case for the normal usecases(std::async etc.) and proper usage(e.g.: not callig get twice).
Futures allow you to store the state from one thread and retrieve it from another. They don't provide any further thread safety.
Is this really a possible undefined behaviour?
If you have two threads trying to get the future's state without synchronisation, yes. I've no idea why you might do that though.
Is there any standard compliant way to avoid the possible undefined behaviour?
Only try to get the state from one thread; or, if you genuinely need to share it between threads, use a mutex or other synchronisation.
Is it ok if I just rely on the standard's note
If you known that the only implementations you need to support follow that recommendation, yes. But there should be no need.
and just use f.wait() without checking f's validity?
If you're not doing any weird shenanigans with multiple threads accessing the future, then you can just assume that it's valid until you've retrieved the state (or moved it to another future).
I'm trying to explore all the options of the new C++11 standard in depth, while using std::async and reading its definition, I noticed 2 things, at least under linux with gcc 4.8.1 :
it's called async, but it got a really "sequential behaviour", basically in the row where you call the future associated with your async function foo, the program blocks until the execution of foo it's completed.
it depends on the exact same external library as others, and better, non-blocking solutions, which means pthread, if you want to use std::async you need pthread.
at this point it's natural for me asking why choosing std::async over even a simple set of functors ? It's a solution that doesn't even scale at all, the more future you call, the less responsive your program will be.
Am I missing something ? Can you show an example that is granted to be executed in an async, non blocking, way ?
it's called async, but it got a really "sequential behaviour",
No, if you use the std::launch::async policy then it runs asynchronously in a new thread. If you don't specify a policy it might run in a new thread.
basically in the row where you call the future associated with your async function foo, the program blocks until the execution of foo it's completed.
It only blocks if foo hasn't completed, but if it was run asynchronously (e.g. because you use the std::launch::async policy) it might have completed before you need it.
it depends on the exact same external library as others, and better, non-blocking solutions, which means pthread, if you want to use std::async you need pthread.
Wrong, it doesn't have to be implemented using Pthreads (and on Windows it isn't, it uses the ConcRT features.)
at this point it's natural for me asking why choosing std::async over even a simple set of functors ?
Because it guarantees thread-safety and propagates exceptions across threads. Can you do that with a simple set of functors?
It's a solution that doesn't even scale at all, the more future you call, the less responsive your program will be.
Not necessarily. If you don't specify the launch policy then a smart implementation can decide whether to start a new thread, or return a deferred function, or return something that decides later, when more resources may be available.
Now, it's true that with GCC's implementation, if you don't provide a launch policy then with current releases it will never run in a new thread (there's a bugzilla report for that) but that's a property of that implementation, not of std::async in general. You should not confuse the specification in the standard with a particular implementation. Reading the implementation of one standard library is a poor way to learn about C++11.
Can you show an example that is granted to be executed in an async, non blocking, way ?
This shouldn't block:
auto fut = std::async(std::launch::async, doSomethingThatTakesTenSeconds);
auto result1 = doSomethingThatTakesTwentySeconds();
auto result2 = fut.get();
By specifying the launch policy you force asynchronous execution, and if you do other work while it's executing then the result will be ready when you need it.
If you need the result of an asynchronous operation, then you have to block, no matter what library you use. The idea is that you get to choose when to block, and, hopefully when you do that, you block for a negligible time because all the work has already been done.
Note also that std::async can be launched with policies std::launch::async or std::launch::deferred. If you don't specify it, the implementation is allowed to choose, and it could well choose to use deferred evaluation, which would result in all the work being done when you attempt to get the result from the future, resulting in a longer block. So if you want to make sure that the work is done asynchronously, use std::launch::async.
I think your problem is with std::future saying that it blocks on get. It only blocks if the result isn't already ready.
If you can arrange for the result to be already ready, this isn't a problem.
There are many ways to know that the result is already ready. You can poll the future and ask it (relatively simple), you could use locks or atomic data to relay the fact that it is ready, you could build up a framework to deliver "finished" future items into a queue that consumers can interact with, you could use signals of some kind (which is just blocking on multiple things at once, or polling).
Or, you could finish all the work you can do locally, and then block on the remote work.
As an example, imagine a parallel recursive merge sort. It splits the array into two chunks, then does an async sort on one chunk while sorting the other chunk. Once it is done sorting its half, the originating thread cannot progress until the second task is finished. So it does a .get() and blocks. Once both halves have been sorted, it can then do a merge (in theory, the merge can be done at least partially in parallel as well).
This task behaves like a linear task to those interacting with it on the outside -- when it is done, the array is sorted.
We can then wrap this in a std::async task, and have a future sorted array. If we want, we could add in a signally procedure to let us know that the future is finished, but that only makes sense if we have a thread waiting on the signals.
In the reference: http://en.cppreference.com/w/cpp/thread/async
If the async flag is set (i.e. policy & std::launch::async != 0), then
async executes the function f on a separate thread of execution as if
spawned by std::thread(f, args...), except that if the function f
returns a value or throws an exception, it is stored in the shared
state accessible through the std::future that async returns to the
caller.
It is a nice property to keep a record of exceptions thrown.
http://www.cplusplus.com/reference/future/async/
there are three type of policy,
launch::async
launch::deferred
launch::async|launch::deferred
by default launch::async|launch::deferred is passed to std::async.