C++20 coroutines using final_suspend for continuations - c++

BACKGROUND
After being convinced that C++ stackless coroutines are pretty awesome, I have been implementing coroutines for my codebase and noticed an oddity in final_suspend.
CONTEXT
Let’s say you have the following final_suspend function:
final_awaitable final_suspend() noexcept
{
return {};
}
And, final_awaitable was implemented as follows:
struct final_awaitable
{
bool await_ready() const noexcept
{
return false;
}
default_handle_t await_suspend( promise_handle_t h ) const noexcept
{
return h.promise().continuation();
}
void await_resume() const noexcept {}
};
If the continuation here is retrieved atomically from a task queue, and that queue may be empty (which could happen at any time between await_ready and await_suspend), then await_suspend must be able to return a blank continuation.
It is my understanding that when await_suspend returns a handle, the returned handle is immediately resumed (5.1 in the N4775 draft). So, if there is no available continuation, the application crashes because resume is called on the invalid coroutine handle returned from await_suspend.
The following is the execution order:
final_suspend Constructs final_awaitable.
final_awaitable::await_ready Returns false, triggering await_suspend.
final_awaitable::await_suspend Returns a continuation (or empty continuation).
continuation::resume This could be null if it was retrieved from an empty work queue.
No check appears to be specified for a valid handle (as it is if await_suspend returns bool).
QUESTION
How are you supposed to add a worker queue to await_suspend without a lock in this case? I am looking for a scalable solution.
Why doesn't the underlying coroutine implementation check for a valid handle?
A contrived example causing the crash is here.
SOLUTION IDEAS
Using a dummy task that is an infinite loop of co_yield. This wastes cycles and I would prefer not to do it; I would also need to create separate handles to the dummy task for every thread of execution, which just seems silly.
Creating a specialisation of std::coroutine_handle where resume does nothing, and returning an instance of that handle. I'd prefer not to specialise the standard library. This also doesn't work because done() and resume() on coroutine_handle<> are not virtual.
EDIT 1 16/03/2020 Call continuation() to atomically retrieve a continuation and store the result in the final_awaitable structure; await_ready would return true if there wasn't a continuation available. If there was a continuation available, await_ready would return false, await_suspend would then be called and the continuation returned (immediately resuming it).
This doesn't work because the value returned by a task is stored in the coroutine frame and if the value is still needed then the coroutine frame must not be destroyed. In this case it is destroyed after await_resume is called on the final_awaitable.
This is only an issue if the task is the last in a chain of continuations.
EDIT 2 - 20/03/2020 Ignore the possibility of returning a usable coroutine handle from await_suspend. Only resume the continuation from the top-level coroutine. This doesn't appear to be as efficient.
01/04/2020
I still haven't found a solution that doesn't have substantial disadvantages. I suppose the reason I'm caught up on this is that await_suspend appears to be designed to solve this exact problem (being able to return a coroutine_handle). I just cannot figure out the pattern that was intended.

You can use std::noop_coroutine as a blank continuation.

What about: (Just a large comment in fact.)
struct final_awaitable
{
bool await_ready() const noexcept
{
return false;
}
bool await_suspend( promise_handle_t h ) const noexcept
{
auto continuation = h.promise().atomically_pop_a_continuation();
if (continuation)
continuation.handle().resume();
return true; // or whatever is meaningful for your case.
}
void await_resume() const noexcept {}
};

Related

What is the benefit of `await_ready` in C++ coroutine awaiters

In C++ coroutines, awaiter types (i.e. the argument types of co_await and the return types of initial_suspend/final_suspend) require the three methods await_ready(), await_suspend(std::coroutine_handle<...>) and await_resume(). I am aware of the order of invocation [1] when a coroutine is suspended and resumed.
But it is unclear to me what the rationale is for requiring a distinct await_ready method. When it returns true, await_suspend is not invoked (and with it suspension is skipped altogether). However, the same effect is achievable by simply returning false from await_suspend.
In this blog article the author David Mazières says
await_ready is an optimization. If it returns true, then co_await does not suspend the function. Of course, you could achieve the same effect in await_suspend, by resuming (or not suspending) the current coroutine, but before calling await_suspend, the compiler must bundle all state into the heap object referenced by the coroutine handle, which is potentially expensive.
However, to me this does not seem to be a valid argument: the (possibly heap-)allocated coroutine state needs to be created before promise_type::get_return_object() is invoked, which always happens, since the promise_type instance is part of the coroutine state. Therefore the coroutine state referenced by the handle is created anyway, regardless of whether we will invoke await_suspend or not.
Back to the question, putting my thoughts into code: I do not see in which case the rewriting of
struct awaiter
{
bool await_ready() { /* <arbitrary-await-ready-body> */ }
void /*| bool*/ await_suspend(std::coroutine_handle<...>) { /* <arbitrary-await-suspend-body> */ }
<any> await_resume() { ... }
};
to
struct awaiter
{
bool await_ready() { return false; } // always false
bool await_suspend(std::coroutine_handle<...>)
{
if (await_ready_result()) // effectively moved await_ready into await_suspend
return false;
/* <arbitrary-await-suspend-body> */
return true;
}
<any> await_resume() { ... }
private:
bool await_ready_result() { /* <arbitrary-await-ready-body> */ }; // could just be "in-lined" into await_suspend
};
would not be applicable. In the latter case await_ready is obsolete without giving away any flexibility [2]. So what is the use of a distinct await_ready at all?
One thing that came to my mind was that await_ready could be kept noexcept when await_suspend's implementation is arbitrarily complex, which might allow for some optimization but that seems overly specific to be the actual rationale.
1: See for example the pseudo code block in section The Awaiter Workflow of article https://www.modernescpp.com/index.php/a-generic-data-stream-with-coroutines-in-c-20
2: I neglect the case of await_suspend returning some std::coroutine_handle, mainly because I don't see the particular use of that either. After all you could simply just call some_handle.resume() as last statement instead of returning it and instead return void/true. Feel free to enlighten me about that one as well.

Ensuring that only one instance of a function is running?

I'm just getting into concurrent programming. Most probably my issue is very common, but since I can't find a good name for it, I can't google it.
I have a C++ UWP application where I try to apply MVVM pattern, but I guess that the pattern or even being UWP is not relevant.
First, I have a service interface that exposes an operation:
struct IService
{
virtual task<int> Operation() = 0;
};
Of course, I provide a concrete implementation, but it is not relevant for this discussion. The operation is potentially long-running: it makes an HTTP request.
Then I have a class that uses the service (again, irrelevant details omitted):
class ViewModel
{
unique_ptr<IService> service;
public:
task<void> Refresh();
};
I use coroutines:
task<void> ViewModel::Refresh()
{
auto result = co_await service->Operation();
// use result to update UI
}
The Refresh function is invoked on timer every minute, or in response to a user request. What I want is: if a Refresh operation is already in progress when a new one is started or requested, then abandon the second one and just wait for the first one to finish (or time out). In other words, I don't want to queue all the calls to Refresh - if a call is already in progress, I prefer to skip a call until the next timer tick.
My attempt (probably very naive) was:
mutex refresh;
task<void> ViewModel::Refresh()
{
unique_lock<mutex> lock(refresh, try_to_lock);
if (!lock)
{
// lock.release(); commented out as harmless but useless => irrelevant
co_return;
}
auto result = co_await service->Operation();
// use result to update UI
}
Edit after the original post: I commented out the line in the code snippet above, as it makes no difference. The issue is still the same.
But of course an assertion fails: unlock of unowned mutex. I guess that the problem is the unlock of mutex by unique_lock destructor, which happens in the continuation of the coroutine and on a different thread (other than the one it was originally locked on).
Using Visual C++ 2017.
Use std::atomic_bool:
std::atomic_bool isRunning = false;
if (isRunning.exchange(true, std::memory_order_acq_rel) == false){
try{
auto result = co_await Refresh();
isRunning.store(false, std::memory_order_release);
//use result
}
catch(...){
isRunning.store(false, std::memory_order_release);
throw;
}
}
Two possible improvements: wrap isRunning.store in a RAII class, and use std::shared_ptr<std::atomic_bool> if the lifetime of the atomic_bool is scoped.
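The RAII suggestion could look roughly like the sketch below. The names running_guard and run_exclusive are hypothetical, and the work is an ordinary callable rather than a coroutine so that the example stays self-contained. Unlike a mutex, an atomic flag has no owning thread, so clearing it from a different thread than the one that set it is fine.

```cpp
#include <atomic>

// Hypothetical RAII helper: tries to claim the flag on construction and,
// if the claim succeeded, clears the flag again on destruction.
class running_guard {
public:
    explicit running_guard(std::atomic_bool& flag) noexcept
        : flag_(flag),
          owns_(!flag.exchange(true, std::memory_order_acq_rel)) {}

    ~running_guard() {
        if (owns_) flag_.store(false, std::memory_order_release);
    }

    running_guard(const running_guard&) = delete;
    running_guard& operator=(const running_guard&) = delete;

    // True if this guard claimed the flag, i.e. nobody else was running.
    explicit operator bool() const noexcept { return owns_; }

private:
    std::atomic_bool& flag_;
    bool owns_;
};

// Returns true if the work ran, false if another call was already running.
template <class Work>
bool run_exclusive(std::atomic_bool& flag, Work&& work) {
    running_guard guard(flag);
    if (!guard) return false;  // skip this call instead of queueing it
    work();                    // the flag is cleared even if work() throws
    return true;
}
```

In a coroutine, the guard would live in the coroutine frame across the co_await and release the flag when the coroutine body finishes or exits by exception.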

Throwing exception vs return code

I'm implementing my own queue which blocks on .pop(). This function also accepts additional argument which is a timeout. So at the moment I have such code:
template <class T>
class BlockingQueue {
private:
std::queue<T> m_queue;
std::mutex m_mutex;
std::condition_variable m_condition;
public:
T pop(uint64_t t_millis) {
std::unique_lock<std::mutex> lock(m_mutex);
auto status = m_condition.wait_for(
lock,
std::chrono::milliseconds(t_millis),
[this] {
return !m_queue.empty();
}
);
if (!status) {
throw exceptions::Timeout();
}
T next(std::move(m_queue.front()));
m_queue.pop();
return next;
}
};
where exceptions::Timeout is my custom exception. Now I've been thinking about this exception throwing from the performance point of view. Would it be better to return some kind of return code from that function? How does that affect performance?
Also since .pop already returns something how would you implement additional return code? I suppose some new structure that holds both T and a return code would be needed. Is that increase in complexity really worth it?
Throw exceptions when an expectation has not been met, return a status code when you're querying for status.
For example:
/// pops an object from the stack
/// @returns an object of type T
/// @pre there is an object on the stack
/// @exception std::logic_error if precondition not met
T pop();
/// queries how many objects are on the stack
/// @returns a count of objects on the stack
std::size_t object_count() const;
/// Queries the thing for the last transport error
/// @returns the most recent error or an empty error_code
std::error_code last_error() const;
and then there's the asio-style reactor route coupled with executor-based futures:
/// Asynchronously wait for an event to be available on the stack.
/// The handler will be called exactly once.
/// To cancel the wait, call the cancel() method.
/// @param handler is the handler to call either on error or when
/// an item is available
/// @note Handler has the call signature void(const error_code&, T)
///
template<class Handler>
auto async_pop(Handler handler);
which could be called like this:
queue.async_pop(asio::use_future).then([](auto& f) {
try {
auto thing = f.get();
// use the thing we just popped
}
catch(const system_error& e) {
// e.code() indicates why the pop failed
}
});
One way to signal an error in a situation like this, without throwing an exception, would be to use something like Andrei Alexandrescu's expected<T> template.
He gave a nice talk about it a while back. The idea is, expected<T> either contains a T, or it contains an exception / error code object describing why the T couldn't be produced.
You can use his implementation, or easily adapt the idea for your own purposes. For instance you can build such a class on top of boost::variant<T, error_code> quite easily.
This is just another style of error handling, distinct from C-style integer error codes and C++ exceptions. Using a variant type does not imply any extra dynamic allocations -- such code can be efficient and doesn't add much complexity.
This is actually pretty close to how error handling is done idiomatically in Rust.
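As a rough illustration of the variant idea, the sketch below uses std::variant<T, std::error_code> from C++17 in place of boost::variant. This is not Alexandrescu's expected<T>, just the simplest stand-in. The push method and the result alias are additions made for this example, and std::errc::timed_out is an arbitrary choice of error code.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <system_error>
#include <utility>
#include <variant>

template <class T>
class BlockingQueue {
public:
    // Either the popped value or the reason why no value was produced.
    using result = std::variant<T, std::error_code>;

    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(value));
        }
        m_condition.notify_one();
    }

    result pop(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_mutex);
        const bool has_item = m_condition.wait_for(
            lock, timeout, [this] { return !m_queue.empty(); });
        if (!has_item) {
            // Timeout is reported as a value, not thrown.
            return result{std::in_place_index<1>,
                          std::make_error_code(std::errc::timed_out)};
        }
        T next(std::move(m_queue.front()));
        m_queue.pop();
        return result{std::in_place_index<0>, std::move(next)};
    }

private:
    std::queue<T> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_condition;
};
```

A caller would check std::holds_alternative<T>(r) (or use std::get_if) before reading the value, which keeps the timeout path free of exceptions and does not require T to be default constructible.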
"Also since .pop already returns something how would you implement additional return code? I suppose some new structure that holds both T and a return code would be needed."
Going with this approach would put an extra requirement on the types that can be used with your BlockingQueue: they must be default constructible. It can be avoided if pop() returns the result through a std::unique_ptr (signaling the timeout with a nullptr), but that will introduce noticeable overhead.
I see no disadvantage of using exceptions here. If you are measuring your timeouts in milliseconds, then handling an exception in case of a timeout should be negligible.
An exception is not necessary here. A "timeout" is just as expected an outcome as getting an item from the queue. Without the timeout, the program is essentially equivalent to the halting problem. Let's say the client specified that they want an indefinite timeout. Would the exception ever throw? How would you handle such an exception (assuming you're still alive in this post-apocalyptic scenario?)
Instead I find these two design choices more logical (though they're not the only ones):
Block until an item is available. Create a function named wait that polls and returns false if it times out, or true when an item is available. The rest of your pop() function can remain unchanged.
Don't block. Instead return a status:
If the operation would block, return "busy"
If the queue is empty, return "empty"
Otherwise, you can "pop" and return "success"
Since you have a mutex, these options seem preferable to, e.g., a non-waiting function.

Pattern for future conversion

Currently we are using asynchronous values very heavily.
Assume that I have a function which does something like this:
int do_something(const boost::posix_time::time_duration& sleep_time)
{
BOOST_MESSAGE("Sleeping a bit");
boost::this_thread::sleep(sleep_time);
BOOST_MESSAGE("Finished taking a nap");
return 42;
}
At some point in code we create a task which creates a future to such an int value which will be set by a packaged_task, like this (working_queue is a boost::asio::io_service in this example):
boost::unique_future<int> createAsynchronousValue(const boost::posix_time::seconds& sleep)
{
boost::shared_ptr< boost::packaged_task<int> > task(
new boost::packaged_task<int>(boost::bind(do_something, sleep)));
boost::unique_future<int> ret = task->get_future();
// Trigger execution
working_queue.post(boost::bind(&boost::packaged_task<int>::operator (), task));
return boost::move(ret);
}
At another point in code I want to wrap this function to return some higher level object which should also be a future. I need a conversion function which takes the first value and transforms it to another value (in our actual code we have some layering and doing asynchronous RPC which returns futures to responses - these responses should be converted to futures to real objects, PODs or even void future to be able to wait on it or catch exceptions). So this is the conversion function in this example:
float converter(boost::shared_future<int> value)
{
BOOST_MESSAGE("Converting value " << value.get());
return 1.0f * value.get();
}
Then I thought of creating a lazy future as described in the Boost docs to do this conversion only if wanted:
void invoke_lazy_task(boost::packaged_task<float>& task)
{
try
{
task();
}
catch(boost::task_already_started&)
{}
}
And then I have a function (might be a higher level API) to create a wrapped future:
boost::unique_future<float> createWrappedFuture(const boost::posix_time::seconds& sleep)
{
boost::shared_future<int> int_future(createAsynchronousValue(sleep));
BOOST_MESSAGE("Creating converter task");
boost::packaged_task<float> wrapper(boost::bind(converter, int_future));
BOOST_MESSAGE("Setting wait callback");
wrapper.set_wait_callback(invoke_lazy_task);
BOOST_MESSAGE("Creating future to converter task");
boost::unique_future<float> future = wrapper.get_future();
BOOST_MESSAGE("Returning the future");
return boost::move(future);
}
At the end I want to be able to use it like this:
{
boost::unique_future<float> future = createWrappedFuture(boost::posix_time::seconds(1));
BOOST_MESSAGE("Waiting for the future");
future.wait();
BOOST_CHECK_EQUAL(future.get(), 42.0f);
}
But here I end up getting an exception about a broken promise. The reason seems pretty clear to me: the packaged_task which does the conversion goes out of scope.
So my question is: how do I deal with such situations? How can I prevent the task from being destroyed? Is there a pattern for this?
Bests,
Ronny
You need to manage the lifetime of the task object properly.
The most correct way is to return boost::packaged_task<float> instead of boost::unique_future<float> from createWrappedFuture(). The caller is then responsible for getting the future object and for prolonging the task's lifetime until the future value is ready.
Or you can place the task object into some 'pending' queue (global or class member), similar to what you did in createAsynchronousValue. But in this case you will need to explicitly manage the task's lifetime and remove it from the queue after completion. So I don't think this solution has advantages over returning the task object itself.
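The first option can be sketched as follows. To keep the example self-contained it uses the standard library (std::packaged_task, std::shared_future) rather than Boost, and the names createWrappedTask and runWrapped are hypothetical. std::packaged_task has no set_wait_callback, so in this sketch the caller runs the task explicitly instead of relying on a lazy wait callback.

```cpp
#include <future>
#include <utility>

float converter(std::shared_future<int> value) {
    return 1.0f * value.get();
}

// Returns the task itself, so the caller decides how long it lives.
std::packaged_task<float()> createWrappedTask(std::shared_future<int> source) {
    return std::packaged_task<float()>(
        [source = std::move(source)] { return converter(source); });
}

// Usage: the task outlives the future's consumer, so no broken promise.
float runWrapped(int input) {
    std::promise<int> producer;  // stands in for the asynchronous value
    std::packaged_task<float()> task =
        createWrappedTask(producer.get_future().share());
    std::future<float> result = task.get_future();

    producer.set_value(input);  // the underlying value becomes available
    task();                     // run the conversion while the task is alive
    return result.get();
}
```

The key point is that the packaged_task is still in scope when the future is consumed, so its shared state is fulfilled rather than abandoned.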

std::async doesn't seem to spawn thread with std::launch::async

I am writing a DCPU-16 emulator and I am calculating the real-time clock speed of the CPU by calling the function getRealTimeCPUClock() in a separate thread. The problem is that the future object's "valid" attribute seems to be true even when it has not returned a value. As a result, calling futureObj.get() blocks until getRealTimeCPUClock() returns.
With a launch policy of async (as opposed to deferred) isn't it supposed to launch the function into the background and then when it returns set the valid attribute to true?
Is this the wrong usage?
int getRealTimeCPUClock() {
int cyclesBeforeTimer = totalCycles;
sleep(1);
return totalCycles - cyclesBeforeTimer;
}
void startExecutionOfProgram(char *programFileName)
{
size_t lengthOfProgramInWords = loadProgramIntoRAM(programFileName);
auto futureRealTimeClockSpeed = std::async(std::launch::async, getRealTimeCPUClock);
while(programCounter < lengthOfProgramInWords) {
if(futureRealTimeClockSpeed.valid()) {
realTimeClockSpeed = futureRealTimeClockSpeed.get();
futureRealTimeClockSpeed = std::async(std::launch::async, getRealTimeCPUClock);
}
step();
}
}
valid() does not do what you think it does (although the entry in cppreference suggests otherwise).
Here is what the Standard says about valid():
(§ 30.6.6/18)
bool valid() const noexcept;
Returns: true only if *this refers to a shared state.
The value returned by valid() will be true as long as the future object is associated with a valid shared state, which is generally the case after you launched it using std::async and before you retrieve the result (using get()). The future will also be invalidated when you use the share() method to create a shared_future. None of this is related to what you are trying to do, i.e. checking whether the result is available.
To determine whether the result of a future is ready, I suggest using the wait_for() function with a delay of 0:
if (futureRealTimeClockSpeed.wait_for(std::chrono::seconds(0))
== std::future_status::ready)
/*...*/
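The check above can be wrapped in a small helper. The name is_ready is hypothetical; the extra valid() test is there because calling wait_for on a future without shared state is undefined behaviour.

```cpp
#include <chrono>
#include <future>

// Returns true only if the future has a shared state whose result is
// already available, so that get() would not block.
template <class T>
bool is_ready(const std::future<T>& f) {
    return f.valid() &&
           f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

In the loop from the question, the condition would then read if (is_ready(futureRealTimeClockSpeed)) instead of testing valid() alone.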