One of the proposals for C++14 is Resumable Functions, which gives C++ what is available in C# today with the async/await mechanism. The basic idea is that a function can be paused while waiting for an asynchronous operation to complete. When the asynchronous operation completes, the function can be resumed where it was paused. This is done in a non-blocking way, so that the thread from which the resumable function was invoked is not blocked.
It is not obvious to me in which context (thread) the function will be resumed. Will it be resumed by the thread from which the function was paused (this is how it is done in C# as I understand it) or does it use another thread?
If it is resumed by the thread from which it was paused, does the thread have to be put in some special state, or will the scheduler handle this?
To quote from N3564:
After suspending, a resumable function may be resumed by the scheduling logic of the runtime and will eventually complete its logic, at which point it executes a return statement (explicit or implicit) and sets the function’s result value in the placeholder.
It should thus be noted that there is an asymmetry between the function’s observed behavior from the outside (caller) and the inside: the outside perspective is that function returns a value of type future at the first suspension point, while the inside perspective is that the function returns a value of type T via a return statement, functions returning future/shared_future behaving somewhat different still.
A resumable function may continue execution on another thread after resuming following a suspension of its execution.
This essentially means that:
When first called, a resumable function executes in the thread context of its caller.
After each suspension point, the implementation can freely choose on which thread to continue the execution of a resumable function.
From the perspective of the calling code, a resumable function works like an asynchronous function, where part of the (observable) behaviour is reliably executed by the time the function call returns, but the final result might not be in yet (the returned future<T> does not have to be in a ready state).
As a programmer, you don't have to jump through hoops to get a resumable function to resume.
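For a concrete picture, here is what such a function looked like in the proposal's syntax. This is only a hedged sketch: the resumable keyword and the await operator come from N3564 and were never standardized (C++20 later adopted co_await/co_return instead), and read_async is a hypothetical asynchronous operation.

// N3564 proposal syntax (illustrative only; no standard compiler accepts it).
// read_async is a hypothetical asynchronous operation returning future<int>.
future<int> add_one(int x) resumable
{
    int v = await read_async(x); // caller already holds the future<int> here;
                                 // the body may resume on another thread
    return v + 1;                // completes that future (note: return, not a future)
}

This shows the asymmetry quoted above: the caller sees a future<int> as soon as the function first suspends, while the body returns a plain int.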
I have a question regarding the working of co_await in C++. I have the following code snippet:
// Downloads url to cache and
// returns cache file path.
future<path> cacheUrl(string url)
{
    cout << "Downloading url.";
    string text = co_await downloadAsync(url); // suspend coroutine
    cout << "Saving in cache.";
    path p = randomFileName();
    co_await saveInCacheAsync(p, text); // suspend coroutine
    co_return p;
}

int main(void) {
    future<path> filePath = cacheUrl("https://localhost:808/");
    return 0;
}
The co_await keyword is used to suspend the execution of a coroutine. There are two places in the above code where it is used. In the main function, we get access to the coroutine. When the program executes the line co_await downloadAsync(url), will it invoke downloadAsync or just suspend the coroutine?
Also, for the next saveInCacheAsync(p, text) call to execute, does the main function have to call resume on the coroutine, or will it happen automatically?
The coroutine model in C++ is opaque: the caller of a coroutine sees it as an ordinary function call that synchronously returns a value of the declared type (here, future<path>). That value is but a placeholder, though: the function body executes only when that result is awaited—but not necessarily co_awaited, since the caller need not be a coroutine (the opacity again).
Separately, co_await may suspend a coroutine, but need not do so (consider that it might be “waiting” on a coroutine with an empty function body). It’s also quite separate from calling the coroutine: one may write
auto cr = coroutine(…);
do_useful_work();
co_await cr;
to create the placeholder long before using it.
co_await in C++ is an operator, just like prefix * or whatever. If you saw *downloadAsync(...), you would expect the function call to happen, then the * operator would act on the value returned by that function. So too with co_await.
The objects that downloadAsync and saveInCacheAsync return are expected to have some mechanism in them to determine when and where to continue the execution of the coroutine once their asynchronous processes have concluded. The co_await expression (potentially) suspends execution of the coroutine and then accesses those mechanisms, scheduling the resumption of the coroutine's execution with that mechanism.
The future object return value defined by your coroutine function is meant to be able to shepherd the co_returned value from your function to whomever asks for it. How that works depends entirely on how you wrote your promise/future machinery for your coroutine.
The typical way to handle it is to be able to block the thread that asks for the value (e.g. with a mutex) until the asynchronous process computing that value has completed. But it could do something else. Indeed, being able to co_await on such things, and thereby form long chains of asynchronous continuations to build more complex values, is a common part of most coroutine machinery.
But at some point, someone has to actually retrieve the value resulting from all of this.
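To make that mechanism concrete, here is a minimal, hedged sketch of what such an object could look like. Everything here (the async_op name, the single string result) is illustrative rather than taken from the question, and it is not thread-safe: a real awaiter must guard against the operation completing concurrently with the suspension.

#include <coroutine>
#include <optional>
#include <string>

// Illustrative awaitable: stores the coroutine handle on suspension and
// resumes the coroutine when the asynchronous work completes.
struct async_op {
    std::optional<std::string> value;      // set when the operation finishes
    std::coroutine_handle<> continuation;  // the coroutine waiting on us, if any

    // Called by the I/O machinery when the work is done. Whatever thread
    // calls this is the thread on which the coroutine body continues.
    void complete(std::string v) {
        value = std::move(v);
        if (continuation) continuation.resume();
    }

    // The awaiter interface used by co_await:
    bool await_ready() const noexcept { return value.has_value(); }
    void await_suspend(std::coroutine_handle<> h) { continuation = h; }
    std::string await_resume() { return *value; }
};

A coroutine that executes co_await op; suspends (unless the value is already there), and the later call to op.complete(...) resumes the rest of the coroutine body.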
There are plenty of tutorials that explain how easy it is to use coroutines in C++, but I've spent a lot of time figuring out how to schedule "detached" coroutines.
Assume, I have the following definition of coroutine result type:
#include <coroutine>

struct task {
    struct promise_type {
        auto initial_suspend() const noexcept { return std::suspend_never{}; }
        auto final_suspend() const noexcept { return std::suspend_never{}; }
        void return_void() const noexcept { }
        void unhandled_exception() const { std::terminate(); }
        task get_return_object() const noexcept { return {}; }
    };
};
And there is also a method that runs a "detached" coroutine, i.e. runs it asynchronously.
/// Handler should have an overloaded operator() returning task.
template<class Handler>
void schedule_coroutine(Handler&& handler) {
    std::thread([handler = std::forward<Handler>(handler)]() { handler(); }).detach();
}
Obviously, I cannot pass lambda functions or any other stateful function object into this method, because once the coroutine is suspended, the lambda passed to std::thread will be destroyed along with all its captured variables.
task coroutine_1() {
    std::vector<object> objects;
    // ...
    schedule_coroutine([objects]() -> task {
        // ...
        co_await something;
        // ...
        co_return;
    });
    // ...
    co_return;
}

int main() {
    // ...
    schedule_coroutine(coroutine_1);
    // ...
}
I think there should be a way to save the handler somehow (preferably near or within the coroutine promise) so that the next time the coroutine is resumed it won't try to access the destroyed object's data. But unfortunately I have no idea how to do it.
I think your problem is a general (and common) misunderstanding of how co_await coroutines are intended to work.
When a function performs co_await <expr>, this (generally) means that the function suspends execution until expr resumes it. That is, your function is waiting until some process completes (and typically returns a value). That process, represented by expr, is the one that is supposed to resume the function (generally).
The whole point of this is to make code that executes asynchronously look like synchronous code as much as possible. In synchronous code, you would do something like <expr>.wait(), where wait is a function that waits for the task represented by expr to complete. Instead of "waiting" on it, you "a-wait" or "asynchronously wait" on it. The rest of your function executes asynchronously relative to your caller, based on when expr completes and how it decides to resume your function's execution. In this way, co_await <expr> looks and appears to act very much like <expr>.wait().
Compiler Magic™ then goes in behind the scenes to make it asynchronous.
So the idea of launching a "detached coroutine" doesn't make sense within this framework. The caller of a coroutine function (usually) isn't the one who determines where the coroutine executes; it's the processes the coroutine invokes during its execution that decide that.
Your schedule_coroutine function really ought to just be a regular "execute a function asynchronously" operation. It shouldn't have any particular association with coroutines, nor any expectation that the given functor is or represents some asynchronous task, or whether it happens to invoke co_await. The function is just going to create a new thread and execute a function on it.
Just as you would have done pre-C++20.
If your task type represents an asynchronous task, then in proper RAII style, its destructor ought to wait until the task is completed before exiting (this includes any resumptions of coroutines scheduled by that task, throughout the entire execution of said task. The task isn't done until it is entirely done). Therefore, if handler() in your schedule_coroutine call returns a task, then that task will be initialized and immediately destroyed. Since the destructor waits for the asynchronous task to complete, the thread will not die until the task is done. And since the thread's functor is copied/moved from the function object given to the thread constructor, any captures will continue to exist until the thread itself exits.
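As a hedged sketch of that RAII idea, a task type along these lines would make the destructor block until the coroutine body has fully finished. The std::promise-based signalling and all names are illustrative, not the only way to do this; production code would also handle exceptions and resumption races more carefully.

#include <coroutine>
#include <future>

// Illustrative task whose destructor waits for the coroutine to finish.
struct task {
    struct promise_type {
        std::promise<void> finished;                 // signalled on completion
        task get_return_object() { return task(finished.get_future()); }
        std::suspend_never initial_suspend() noexcept { return {}; }
        std::suspend_never final_suspend() noexcept { return {}; }
        void return_void() { finished.set_value(); } // the task is now done
        void unhandled_exception() { std::terminate(); }
    };

    explicit task(std::future<void> f) : done(std::move(f)) {}
    task(task&&) = default;
    ~task() { if (done.valid()) done.wait(); }       // block until fully done

    std::future<void> done;
};

With such a task, the temporary returned by handler() inside schedule_coroutine blocks the detached thread until the coroutine has completely finished, so the thread's functor, and therefore the lambda's captures, stay alive long enough.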
I hope I got you right, but I think there might be a couple of misconceptions here. First off, you clearly cannot detach a coroutine; that would not make any sense at all. But you can certainly execute asynchronous tasks inside a coroutine, even though in my opinion this defeats its purpose entirely.
But let's take a look at the second block of code you posted. Here you invoke std::async and forward a handler to it. Now, in order to prevent any kind of early destruction, you should use std::move instead and pass the handler into the lambda, so it is kept alive for as long as the scope of the lambda function is valid. This should probably already answer your final question as well, because the place where you want this handler to be stored would be the lambda capture itself.
Another thing that bothers me is the usage of std::async. The call returns a std::future-like type that blocks until the lambda has been executed. But this will only happen if you set the launch type to std::launch::async; otherwise you will need to call .get() or .wait() on the future, as the default launch policy lets the implementation choose between std::launch::async and std::launch::deferred, and a deferred task only fires lazily (meaning: when you actually request the result).
So, in your case, and if you really wanted to use coroutines that way, I would suggest using a std::thread instead and storing it somewhere pseudo-global for a later join(). But again, I don't think you really want to use coroutine mechanics that way.
Your question makes perfect sense; the misunderstanding is that C++20 coroutines are actually generators, mistakenly occupying the coroutine header name.
Let me explain how generators work and then answer how to schedule a detached coroutine.
How generators work
Your question "Scheduling a detached coroutine" then becomes "How to schedule a detached generator", and the answer is: not possible, because a special convention transforms a regular function into a generator function.
What is not obvious here is that yielding a value must take place inside the generator function body. When you want to call a helper function that yields a value for you, you can't. Instead you also turn the helper function into a generator and then await it instead of just calling it. This effectively chains generators and might feel like writing synchronous code that executes asynchronously.
In JavaScript the special convention is the async keyword. In Python it is the yield keyword in place of return.
C++20 coroutines are a low-level mechanism for implementing JavaScript-like async/await.
There is nothing wrong with including this low-level mechanism in the C++ language, except for placing it in a header named coroutine.
How to schedule detached coroutine
This question makes sense if you want green threads or fibers and you are writing scheduler logic that uses symmetric or asymmetric coroutines to accomplish this.
Now others might ask: why should anyone bother with fibers (not Windows fibers ;)) when you have generators? The answer is that you can have encapsulated concurrency and parallelism logic, meaning the rest of your team isn't required to learn and apply additional mental gymnastics while working on the project.
The result is true asynchronous programming, where the rest of the team writes linear code without callbacks and such, with a simple concept of concurrency, for example a single spawn() library function, avoiding any locks/mutexes and other multithreading complexity.
The beauty of encapsulation shows when all details are hidden in low-level I/O methods. All context switching, scheduling, etc. happens deep inside I/O classes like Channel, Queue, or File.
Everyone involved in async programming should experience working like this. The feeling is intense.
To accomplish this, instead of C++20 coroutines use Boost.Fiber, which includes a scheduler, or Boost.Context, which allows symmetric coroutines. Symmetric coroutines let you suspend and switch to any other coroutine, while asymmetric coroutines suspend and resume the calling coroutine.
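For illustration, a minimal sketch of that style with Boost.Fiber (the channel type and its API are from the Boost.Fiber documentation; the program itself is made up):

#include <boost/fiber/all.hpp>
#include <iostream>

// Two fibers communicating through a channel: plain linear code, no
// callbacks. Blocking on the channel suspends only the fiber, not the
// thread; the fiber scheduler switches to another fiber meanwhile.
int main() {
    boost::fibers::buffered_channel<int> chan{2}; // capacity: power of two

    boost::fibers::fiber producer([&chan] {
        for (int i = 0; i < 3; ++i)
            chan.push(i);   // suspends this fiber if the channel is full
        chan.close();
    });

    boost::fibers::fiber consumer([&chan] {
        int v;
        while (chan.pop(v) == boost::fibers::channel_op_status::success)
            std::cout << v << '\n';
    });

    producer.join();
    consumer.join();
}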
Per this latest C++ TS: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4628.pdf, and based on the understanding of C# async/await language support, I'm wondering what is the "execution context" (terminology borrowed from C#) of the C++ coroutines?
My simple test code in Visual C++ 2017 RC reveals that coroutines always seem to execute on a thread pool thread, and that little control is given to the application developer over the threading context in which the coroutines are executed. E.g., could an application force all coroutines (with the compiler-generated state machine code) to be executed on the main thread only, without involving any thread pool thread?
In C#, SynchronizationContext is a way to specify the "context" where all the coroutine "halves" (compiler generated state machine code) will be posted and executed, as illustrated in this post: https://blogs.msdn.microsoft.com/pfxteam/2012/01/20/await-synchronizationcontext-and-console-apps/, while the current coroutine implementation in Visual C++ 2017 RC seems to always rely on the concurrency runtime, which by default executes the generated state machine code on a thread pool thread. Is there a similar concept of synchronization context that the user application can use to bind coroutine execution to a specific thread?
Also, what is the current default "scheduler" behavior of the coroutines as implemented in Visual C++ 2017 RC? I.e., 1) how exactly is a wait condition specified? and 2) when a wait condition is satisfied, who invokes the "bottom half" of the suspended coroutine?
My (naive) speculation regarding Task scheduling in C# is that C# "implements" the wait condition purely by task continuation. A wait condition is synthesized by a task owned by a TaskCompletionSource, and any code logic that needs to wait is chained as a continuation to it. When the wait condition is satisfied, e.g. when a full message is received from the low-level network handler, it calls TaskCompletionSource.SetValue, which transitions the underlying task to the completed state, effectively allowing the chained continuation logic to start execution (moving the task from the created state into the ready state/list). In C++ coroutines, I'm speculating that std::future and std::promise would be used as a similar mechanism (std::future being the task, while std::promise is the TaskCompletionSource; the usage is surprisingly similar, too!). So does the C++ coroutine scheduler, if any, rely on a similar mechanism to carry out this behavior?
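To make the analogy concrete, the plain (non-coroutine) C++ counterpart of that mechanism looks like this; it only illustrates std::promise/std::future themselves, not whatever the coroutine scheduler actually does:

#include <future>
#include <iostream>
#include <thread>

int main() {
    std::promise<int> tcs;                     // plays TaskCompletionSource
    std::future<int> task = tcs.get_future();  // plays the task

    std::thread io([&tcs] {
        // ... a full message arrives in the low-level network handler ...
        tcs.set_value(42);                     // like TaskCompletionSource.SetValue
    });

    std::cout << task.get() << '\n';           // the "continuation" side unblocks here
    io.join();
}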
[EDIT]: after doing some further research, I was able to code a very simple yet very powerful abstraction called awaitable that supports single-threaded, cooperative multitasking, and features a simple thread_local-based scheduler that can execute coroutines on the thread on which the root coroutine is started. The code can be found in this GitHub repo: https://github.com/llint/Awaitable
Awaitable is composable in the sense that it maintains correct invocation ordering at nested levels, and it features primitive yielding, timed waits, and setting readiness from somewhere else. Very complex usage patterns can be derived from this (such as infinitely looping coroutines that only get woken up when certain events happen). The programming model follows the C# Task-based async/await pattern closely. Please feel free to give your feedback.
The opposite!
C++ coroutines are all about control. The key point here is the void await_suspend(std::experimental::coroutine_handle<> handle) function.
Every co_await expects an awaitable type. In a nutshell, an awaitable type is a type which provides these three functions:
bool await_ready() - is the result already available, so that the coroutine does not need to be suspended at all?
void await_suspend(handle) - the program passes you a continuation context for that coroutine frame. If you activate the handle (for example, by calling the operator() that the handle provides), the current thread resumes the coroutine immediately.
T await_resume() - tells the thread that resumes the coroutine what to do when resuming it and what to return from the co_await expression.
So when you call co_await on an awaitable type, the program asks the awaitable whether the coroutine should be suspended (await_ready returns false), and if so, you get a coroutine handle with which you can do whatever you like.
For example, you can pass the coroutine handle to a thread pool. In this case a thread-pool thread will resume the coroutine.
You can pass the coroutine handle to a plain std::thread - your own newly created thread will resume the coroutine.
You can attach the coroutine handle to a derived class of OVERLAPPED and resume the coroutine when the asynchronous IO finishes.
As you can see, you can control where and when the coroutine is suspended and resumed, by managing the coroutine handle passed to await_suspend. There is no "default scheduler": how you implement your awaitable type decides how the coroutine is scheduled.
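For illustration, here is about the smallest awaitable that exercises this control; it hands the handle to a fresh std::thread, and that thread resumes the coroutine. This is written against the C++20 <coroutine> header (in the TS era the handle was spelled std::experimental::coroutine_handle), and a real implementation would use a thread pool rather than one thread per suspension.

#include <coroutine>
#include <thread>

// Awaitable that resumes the awaiting coroutine on a newly created thread.
struct resume_on_new_thread {
    bool await_ready() const noexcept { return false; }  // always suspend
    void await_suspend(std::coroutine_handle<> h) {
        std::thread([h] { h.resume(); }).detach();        // that thread resumes us
    }
    void await_resume() const noexcept {}
};

Inside a coroutine, co_await resume_on_new_thread{}; makes everything after that line run on the new thread.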
So, what happens in VC++? Unfortunately, std::future still doesn't have a then function, so you can't pass the coroutine handle to a std::future. If you await on a std::future, the program will just open a new thread. Look at the source code given in the future header:
template<class _Ty>
void await_suspend(future<_Ty>& _Fut,
        experimental::coroutine_handle<> _ResumeCb)
{   // change to .then when future gets .then
    thread _WaitingThread([&_Fut, _ResumeCb] {
        _Fut.wait();
        _ResumeCb();
    });
    _WaitingThread.detach();
}
So why did you see a Win32 threadpool thread if the coroutines are launched in a regular std::thread? Because it wasn't the coroutine. std::async calls concurrency::create_task behind the scenes, and a concurrency::task is launched on the Win32 thread pool by default. After all, the whole purpose of std::async is to launch the callable in another thread.
I'm currently trying to get my hands on boost::asio strands. Doing so, I keep reading about invoking strand post/dispatch "inside or outside a strand". Somehow I can't figure out how "inside a strand" differs from "through a strand", and therefore can't grasp the concept of invoking a strand function outside the strand at all.
Probably there is just a small piece missing in my puzzle. Can somebody please give an example how calls to a strand can be inside or outside it?
What I think I've understood so far is that posting something through a strand would be
m_strand.post(myfunctor);
or
io_svc.post(m_strand.wrap(myfunctor));
Is the latter considered a call to dispatch outside the strand (as opposed to the other being a call to post inside it)? Is there some relation between the strand's "inside realm" and the threads the strand operates on?
If being inside a strand simply meant to invoke a strand's function, then the strand class's documentation would be pointless. It states that strand::post can be invoked outside the strand... That's precisely the part I don't understand.
Even I had some trouble understanding this concept, but it became clear once I started working on libdispatch. It helped me map things to asio better.
Now let's see how to make some sense of strands. Consider a strand as a serial queue of handlers that need to be executed.
Now, where do these handlers get executed? Within the worker threads.
Where did these worker threads come from? From the io_service object you passed while creating the strand.
Something like:
asio::strand s(io_serv_obj);
Now, as you may know, io_service::run can be called by a single thread or by multiple threads. The threads calling the run method of io_serv_obj are the worker threads for that strand in our case. So it could be either single-threaded or multithreaded.
Coming back to strands: when you post a handler, that handler is always enqueued in the serial queue we talked about. The worker threads pick up handlers from the queue one after the other.
Now, when you do a dispatch, asio does some optimization for you:
It checks whether you are calling it from inside one of the worker threads or from some other thread (maybe a thread of some other io_service instance). When it is called outside the current execution context of the strand, that's when it is called outside the strand. In the outside case, dispatch will either just enqueue the handler like post does, when there are other handlers waiting in the queue, or call it directly, when it can guarantee that it will not run concurrently with any other handler from that queue that may be running in one of the worker threads at that moment.
UPDATE:
As noted in the comments section, inside means called within another handler, e.g. I posted a handler A and inside that handler I dispatch another handler. Now, as explained above, if there are no other handlers waiting in the strand's serial queue, the dispatched handler will be called synchronously. If this condition is not met, the dispatch counts as called from outside.
Now, if you call dispatch from outside the strand, i.e. not within the current execution context, asio checks its call stack to see whether any other handler from its serial queue is currently running. If not, it will call that handler directly, synchronously. So there is no cost of enqueueing the handler (I think no extra allocation will be done either, though I'm not sure).
Let's look at the documentation now:
s.dispatch(a) happens-before s.post(b), where the former is performed
outside the strand
This means that if dispatch was called from outside the current execution context, or there are other handlers already enqueued, then it needs to enqueue the handler; it simply cannot call it synchronously. Since it's a serial queue, a will get executed before b.
Had there been another call s.dispatch(c), along with a and b but enqueued before them (in the mentioned order), then c would get executed before a and b, but in no way could b get executed before a.
Hope this clears your doubt.
For a given strand object s, running outside s implies that s.running_in_this_thread() returns false. This returns true if the calling thread is executing a handler that was submitted to the strand via post(), dispatch(), or wrap(). Otherwise, it returns false:
io_service.post(handler); // handler will run outside of strand
strand.post(handler); // handler will run inside of strand
strand.dispatch(handler); // handler will run inside of strand
io_service.post(strand.wrap(handler)); // handler will run inside of strand
Given:
a strand object s
a function object f1 that is added to strand s via s.post(), or s.dispatch() when s.running_in_this_thread() == false
a function object f2 that is added to strand s via s.post(), or s.dispatch() when s.running_in_this_thread() == false
then the strand provides a guarantee of ordering and non-concurrency, such that f1 and f2 will not be invoked concurrently. Furthermore, if the addition of f1 happens before the addition of f2, then f1 will be invoked before f2.
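For illustration, a minimal, self-contained example of these guarantees (io_service-era API to match the question; the handlers and thread count are made up):

#include <boost/asio.hpp>
#include <iostream>
#include <thread>

// f1 and f2 go through the same strand, so they never run concurrently
// even though two worker threads service the io_service; f1, added
// first, is invoked first.
int main() {
    boost::asio::io_service io;
    boost::asio::io_service::strand s(io);

    s.post([] { std::cout << "f1\n"; });  // added first -> runs first
    s.post([] { std::cout << "f2\n"; });  // never concurrent with f1

    std::thread t1([&io] { io.run(); });  // two worker threads, one strand
    std::thread t2([&io] { io.run(); });
    t1.join();
    t2.join();
}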
I evaluate JavaScript in my Qt application using QScriptEngine::evaluate(QString code). Let's say I evaluate a buggy piece of JavaScript which loops forever (or takes too long to wait for the result). How can I abort such an execution?
I want to control an evaluation via two buttons Run and Abort in a GUI. (But only one execution is allowed at a time.)
I thought of running the script via QtConcurrent::run, keeping the QFuture and calling cancel() when Abort was pressed. But the documentation says that I can't abort such executions. It seems that QFuture only cancels between items of a job, i.e. when reducing or filtering a collection. For QtConcurrent::run this means that I can't use the future to abort its execution.
The other possibility I came up with was using a QThread and calling quit(), but there I have a similar problem: it only quits the thread if / as soon as the thread is waiting in an event loop. But since my execution is a single function call, this is no option either.
QThread also has terminate(), but the documentation makes me worry a bit. Although my code itself doesn't involve mutexes, maybe QScriptEngine::evaluate does behind the scenes?
Warning: This function is dangerous and its use is discouraged. The thread can be terminated at any point in its code path. Threads can be terminated while modifying data. There is no chance for the thread to clean up after itself, unlock any held mutexes, etc. In short, use this function only if absolutely necessary.
Is there another option I am missing, maybe some asynchronous evaluation feature?
http://doc.qt.io/qt-4.8/qscriptengine.html#details
It has a few sections that address your concerns:
http://doc.qt.io/qt-4.8/qscriptengine.html#long-running-scripts
http://doc.qt.io/qt-4.8/qscriptengine.html#script-exceptions
http://doc.qt.io/qt-4.8/qscriptengine.html#abortEvaluation
http://doc.qt.io/qt-4.8/qscriptengine.html#setProcessEventsInterval
Hope that helps.
While the concurrent task itself can't be aborted "from outside", the QScriptEngine can be told (of course from another thread, like your GUI thread) to abort the execution:
QScriptEngine::abortEvaluation(const QScriptValue & result = QScriptValue())
The optional parameter is used as the "pseudo result" which is passed to the caller of evaluate().
You should either set a flag somewhere or use a special result value in abortEvaluation() to make it possible for the calling routine to detect that the execution was aborted.
Note: Using isEvaluating() you can see if an evaluation is currently running.
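As a hedged sketch of how the pieces could fit together (Qt 5 style QtConcurrent::run with a lambda; the function names and the "aborted" pseudo result are illustrative, and this follows the premise above that abortEvaluation() may be called from another thread):

#include <QScriptEngine>
#include <QScriptValue>
#include <QtConcurrent/QtConcurrent>

// Run the evaluation on a worker thread; connect this to the Run button.
void runScript(QScriptEngine& engine, const QString& code) {
    QtConcurrent::run([&engine, code] {
        QScriptValue result = engine.evaluate(code);
        // If aborted, result is the pseudo result passed to abortEvaluation().
    });
}

// Abort from the GUI thread; connect this to the Abort button.
void abortScript(QScriptEngine& engine) {
    if (engine.isEvaluating())                           // only if a run is active
        engine.abortEvaluation(QScriptValue("aborted")); // unblocks evaluate()
}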