Future with Coroutines co_await - c++

While watching a C++ lecture (https://youtu.be/DLLt4anKXKU?t=1589), I tried to understand how std::future works with co_await. The example:
auto compute = []() -> std::future<int> {
int fst = co_await std::async(get_first);
int snd = co_await std::async(get_second);
co_return fst + snd;
};
auto f = compute();
/* some heavy task */
f.get();
I can't understand how and when co_await std::async(get_first) returns control to compute, i.e., how std::future implements an awaitable interface (type).

how std::future implements an awaitable interface
Well, as far as C++20 is concerned, it doesn't. C++20 provides co_await and its attendant language functionality, but apart from the trivial std::suspend_always and std::suspend_never, it doesn't provide any actual awaitable types.
How std::future could implement the awaitable interface is basically the same as how std::experimental::future from the Concurrency TS implements future::then. then takes a function to be continued when the future's value becomes available. The return value of then is a new future<U> (the old future<T> now becomes non-functional), where U is the new value that the given continuation function returns. That new future will only have a U available when the original value is available and when the continuation has processed it into the new value. In that order.
The exact details about how .then works depend entirely on how future is implemented. And it may depend on how the specific future was created, as futures from std::async have special properties that other futures don't.
co_await just makes this process much more digestible visually. A co_awaitable future would simply shove the coroutine handle into future::then, thereby altering the future.
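To make the continuation idea concrete, here is a minimal sketch of a then-style helper for std::future. The name then_async is hypothetical and not part of the standard library. Unlike a real future::then, it parks a helper thread in get() instead of registering a callback, so it only illustrates the data flow from future<T> to future<U>.

```cpp
#include <cassert>
#include <future>
#include <utility>

// Hypothetical helper (an assumption for illustration, not a standard API):
// consumes `fut` and returns a new future holding continuation(value).
template <class T, class F>
auto then_async(std::future<T> fut, F continuation) {
    assert(fut.valid());
    return std::async(std::launch::async,
        [fut = std::move(fut), continuation = std::move(continuation)]() mutable {
            // Blocks this helper thread until the original value is available,
            // then processes it into the new value, in that order.
            return continuation(fut.get());
        });
}
```

As with future::then, the original future is consumed: after the call, only the returned future is usable.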

Here is a complete program that awaits std::future objects with C++20 coroutines. I wrote it recently as a learning exercise.
#include <cassert>
#include <chrono>
#include <coroutine>
#include <future>
#include <iostream>
#include <optional>
#include <thread>
#include <type_traits>
#include <utility>
using namespace std::literals;
template <class T>
class FutureAwaitable {
public:
template <class U> struct BasicPromiseType {
auto get_return_object() {
return FutureAwaitable<T>(CoroHandle::from_promise(*this));
}
std::suspend_always initial_suspend() noexcept {
std::cout << "Initial suspend\n";
return {};
}
// Stay suspended at the end so that done() and get() can still inspect the frame.
std::suspend_always final_suspend() noexcept {
std::cout << "Final suspend\n";
return {};
}
template <class V>
requires std::is_convertible_v<V, T>
void return_value(V v) { _value = v; }
void unhandled_exception() { throw; }
std::optional<T> _value;
};
using promise_type = BasicPromiseType<FutureAwaitable<T>>;
using CoroHandle = std::coroutine_handle<promise_type>;
explicit FutureAwaitable(CoroHandle h) : _parent(h) { }
~FutureAwaitable() {
}
bool is_ready() const {
// Non-blocking poll of the awaited future.
return _f && _f->wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
FutureAwaitable(std::future<T> && f) {
_f = &f;
}
T get() const { return promise()._value.value(); }
std::future<T> & std_future() const {
assert(_f->valid());
return *_f;
}
bool await_ready() {
// Ready means the result can be fetched without blocking.
if (_f->wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
std::cout << "Await ready IS ready\n";
return true;
}
std::cout << "Await ready NOT ready\n";
return false;
}
auto await_resume() {
std::cout << "Await resume" << std::endl;
return std_future().get();
}
bool await_suspend(CoroHandle parent) {
_parent = parent;
std::cout << "Await suspend\n";
return true;
}
void resume() {
assert(_parent);
_parent.resume();
}
auto parent() const { return _parent; }
bool done() const noexcept {
return _parent.done();
}
private:
auto & promise() const noexcept { return _parent.promise(); }
CoroHandle _parent = nullptr;
std::future<T> * _f = nullptr;
};
template <class T> auto operator co_await(std::future<T> &&f) {
return FutureAwaitable<T>(std::forward<std::future<T>>(f));
}
// Lvalue overload: the awaiter only keeps a pointer, so the caller's future must outlive the co_await.
template <class T> auto operator co_await(std::future<T> & f) {
return FutureAwaitable<T>(std::move(f));
}
FutureAwaitable<int> coroutine() {
std::promise<int> p;
auto fut = p.get_future();
p.set_value(31);
std::cout << "Entered func()" << std::endl;
auto res = co_await std::move(fut);
std::cout << "Continue func(): " << res << std::endl;
auto computation = co_await std::async(std::launch::async, [] {
int j = 0;
for (int i = 0; i < 1000; ++i) {
j += i;
}
return j;
});
auto computation2 = std::async(std::launch::async, [] {
int j = 0;
std::this_thread::sleep_for(20s);
for (int i = 0; i < 1000; ++i) {
j += i;
}
return j;
});
auto computation3 = std::async(std::launch::async, [] {
int j = 0;
std::this_thread::sleep_for(20s);
for (int i = 0; i < 1000; ++i) {
j += i;
}
return j;
});
co_await computation2;
co_await computation3;
std::cout << "Computation result is " << computation << std::endl;
co_return computation;
}
#define ASYNC_MAIN(coro) \
int main() { \
FutureAwaitable<int> c = coro(); \
do { c.resume(); } while (!c.done()); \
std::cout << "The coroutine returned " << c.get(); \
return 0; \
}
ASYNC_MAIN(coroutine)
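The readiness check that await_ready relies on can be tried in isolation. This is a small sketch with a hypothetical free function is_ready. It assumes the future is valid and was not created with std::launch::deferred (a deferred future reports future_status::deferred and never becomes ready by polling).

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper: non-blocking poll of a std::future.
template <class T>
bool is_ready(const std::future<T>& f) {
    assert(f.valid());  // wait_for on an invalid future is undefined behavior
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

A zero timeout makes wait_for return immediately, which is what makes it suitable for await_ready.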

Related

A workaround of the crash caused by calling destroy() from final_suspend()

Two days ago, in my previous post, I provided code that works with GCC but crashes with MSVC2022, which calls the task destructor twice.
Today I made it work with both MSVC2022 and GCC by replacing my former await_suspend implementation:
std::coroutine_handle<> await_suspend(std::coroutine_handle<UpdatePromise> h) noexcept
{
// resume the awaiting coroutine; if there is none, return a special coroutine that does
// nothing
std::coroutine_handle<> val = awaiting_coroutine ? awaiting_coroutine : std::noop_coroutine();
h.destroy();
return val;
}
with the following:
void await_suspend(std::coroutine_handle<UpdatePromise> h) noexcept
{
auto coro = awaiting_coroutine;
h.destroy();
if (coro)
{
coro.resume();
}
}
What is the difference between these two implementations?
If they are equivalent, why does the compiler support different return types for await_suspend? What are they for? Is it just syntactic sugar?
Now the full example looks like this:
#include <chrono>
#include <coroutine>
#include <exception>
#include <iostream>
#include <optional>
#include <queue>
#include <stdexcept>
#include <thread>
#include <vector>
// simple timers
// stored timer tasks
struct timer_task
{
std::chrono::steady_clock::time_point target_time;
std::coroutine_handle<> handle;
};
// comparator
struct timer_task_before_cmp
{
bool operator()(const timer_task& left, const timer_task& right) const
{
return left.target_time > right.target_time;
}
};
std::priority_queue<timer_task, std::vector<timer_task>, timer_task_before_cmp> timers;
inline void submit_timer_task(std::coroutine_handle<> handle, std::chrono::nanoseconds timeout)
{
timers.push(timer_task{ std::chrono::steady_clock::now() + timeout, handle });
}
//template <bool owning>
struct UpdatePromise;
//template <bool owning>
struct UpdateTask
{
// declare promise type
using promise_type = UpdatePromise;
UpdateTask(std::coroutine_handle<promise_type> handle) :
handle(handle)
{
std::cout << "UpdateTask constructor." << std::endl;
}
UpdateTask(const UpdateTask&) = delete;
UpdateTask(UpdateTask&& other) : handle(other.handle)
{
std::cout << "UpdateTask move constructor." << std::endl;
}
UpdateTask& operator = (const UpdateTask&) = delete;
UpdateTask& operator = (UpdateTask&& other)
{
handle = other.handle;
std::cout << "UpdateTask move assignment." << std::endl;
return *this;
}
~UpdateTask()
{
std::cout << "UpdateTask destructor." << std::endl;
}
std::coroutine_handle<promise_type> handle;
};
struct UpdatePromise
{
std::coroutine_handle<> awaiting_coroutine;
UpdateTask get_return_object();
std::suspend_never initial_suspend()
{
return {};
}
void unhandled_exception()
{
std::terminate();
}
auto final_suspend() noexcept
{
// if there is a coroutine that is awaiting on this coroutine resume it
struct transfer_awaitable
{
std::coroutine_handle<> awaiting_coroutine;
// always stop at final suspend
bool await_ready() noexcept
{
return false;
}
//Results in a crash with MSVC2022, but not with GCC.
/*
std::coroutine_handle<> await_suspend(std::coroutine_handle<UpdatePromise> h) noexcept
{
// resume the awaiting coroutine; if there is none, return a special coroutine that does
// nothing
std::coroutine_handle<> val = awaiting_coroutine ? awaiting_coroutine : std::noop_coroutine();
h.destroy();
return val;
}*/
//Does not crash.
void await_suspend(std::coroutine_handle<UpdatePromise> h) noexcept
{
auto coro = awaiting_coroutine;
h.destroy();
if (coro)
{
coro.resume();
}
}
void await_resume() noexcept {}
};
return transfer_awaitable{ awaiting_coroutine };
}
void return_void() {}
// use `co_await std::chrono::seconds{n}` to wait specified amount of time
auto await_transform(std::chrono::milliseconds d)
{
struct timer_awaitable
{
std::chrono::milliseconds m_d;
// suspend unless the duration is zero or negative
bool await_ready()
{
return m_d <= std::chrono::milliseconds(0);
}
// h is a handler for current coroutine which is suspended
void await_suspend(std::coroutine_handle<> h)
{
// submit suspended coroutine to be resumed after timeout
submit_timer_task(h, m_d);
}
void await_resume() {}
};
return timer_awaitable{ d };
}
// also we can await other UpdateTask<T>
auto await_transform(UpdateTask& update_task)
{
if (!update_task.handle)
{
throw std::runtime_error("coroutine without promise awaited");
}
if (update_task.handle.promise().awaiting_coroutine)
{
throw std::runtime_error("coroutine already awaited");
}
struct task_awaitable
{
std::coroutine_handle<UpdatePromise> handle;
// check if this UpdateTask already has value computed
bool await_ready()
{
return handle.done();
}
// h - is a handle to coroutine that calls co_await
// store coroutine handle to be resumed after computing UpdateTask value
void await_suspend(std::coroutine_handle<> h)
{
handle.promise().awaiting_coroutine = h;
}
// when ready return value to a consumer
auto await_resume()
{
}
};
return task_awaitable{ update_task.handle };
}
};
inline UpdateTask UpdatePromise::get_return_object()
{
return { std::coroutine_handle<UpdatePromise>::from_promise(*this) };
}
// timer loop
void loop()
{
while (!timers.empty())
{
auto& timer = timers.top();
// if it is time to run a coroutine
if (timer.target_time < std::chrono::steady_clock::now())
{
auto handle = timer.handle;
timers.pop();
handle.resume();
}
else
{
std::this_thread::sleep_until(timer.target_time);
}
}
}
// example
using namespace std::chrono_literals;
UpdateTask TestTimerAwait()
{
using namespace std::chrono_literals;
std::cout << "testTimerAwait started." << std::endl;
co_await 1s;
std::cout << "testTimerAwait finished." << std::endl;
}
UpdateTask TestNestedTimerAwait()
{
using namespace std::chrono_literals;
std::cout << "testNestedTimerAwait started." << std::endl;
auto task = TestTimerAwait();
co_await 2s;
//We can't wait for a destroyed coroutine.
//co_await task;
std::cout << "testNestedTimerAwait finished." << std::endl;
}
// main can't be a coroutine and usually needs some sort of looper (io_service or the timer loop in this example)
int main()
{
auto task = TestNestedTimerAwait();
// execute deferred coroutines
loop();
}
I compile the example with Microsoft (R) C/C++ Optimizing Compiler Version 19.30.30709 for x86 using the following command:
cl /std:c++latest /EHsc a.cpp
EDIT1:
The code was slightly incorrect, so I commented out co_await task:
//We can't wait for a destroyed coroutine.
//co_await task;
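As a side note on the timer queue in the example: it depends on the comparator inverting the default order, so that the earliest deadline ends up on top of the std::priority_queue. Here is a minimal sketch of just that part. The names pending_timer, due_later and drain are hypothetical, and an int id stands in for the coroutine handle.

```cpp
#include <cassert>
#include <chrono>
#include <queue>
#include <vector>

struct pending_timer {
    std::chrono::steady_clock::time_point due;
    int id;  // stand-in for std::coroutine_handle<>
};

// std::priority_queue keeps the "largest" element on top, so treating a later
// deadline as "smaller" puts the earliest deadline on top.
struct due_later {
    bool operator()(const pending_timer& a, const pending_timer& b) const {
        return a.due > b.due;
    }
};

using timer_queue =
    std::priority_queue<pending_timer, std::vector<pending_timer>, due_later>;

// Pops every timer and returns the ids in the order they would be resumed.
std::vector<int> drain(timer_queue& q) {
    std::vector<int> order;
    while (!q.empty()) {
        order.push_back(q.top().id);
        q.pop();
    }
    return order;
}
```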

Determining function time using a wrapper

I'm looking for a generic way of measuring a function's execution time, like here, but for C++.
My main goal is to avoid cluttering the code with snippets like this everywhere:
auto t1 = std::chrono::high_resolution_clock::now();
function(arg1, arg2);
auto t2 = std::chrono::high_resolution_clock::now();
auto tDur = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1);
But rather have a nice wrapper around the function.
What I got so far is:
timing.hpp:
#pragma once
#include <chrono>
#include <functional>
template <typename Tret, typename Tin1, typename Tin2> unsigned int getDuration(std::function<Tret(Tin1, Tin2)> function, Tin1 arg1, Tin2 arg2, Tret& retValue)
{
auto t1 = std::chrono::high_resolution_clock::now();
retValue = function(arg1, arg2);
auto t2 = std::chrono::high_resolution_clock::now();
auto tDur = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1);
return tDur.count();
}
main.cpp:
#include "timing.hpp"
#include "matrix.hpp"
constexpr int G_MATRIXSIZE = 2000;
int main(int argc, char** argv)
{
CMatrix<double> myMatrix(G_MATRIXSIZE);
bool ret;
// this call is quite ugly
std::function<bool(int, std::vector<double>)> fillRow = std::bind(&CMatrix<double>::fillRow, &myMatrix, 0, fillVec);
auto duration = getDuration(fillRow, 5, fillVec, ret );
std::cout << "duration(us): " << duration << std::endl; // getDuration returns microseconds
}
In case somebody wants to test the code, matrix.hpp:
#pragma once
#include <iostream>
#include <string>
#include <sstream>
#include <vector>
template<typename T> class CMatrix {
public:
// ctor
CMatrix(int size) :
m_size(size)
{
m_matrixData = new std::vector<std::vector<T>>;
createUnityMatrix();
}
// dtor
~CMatrix()
{
std::cout << "Destructor of CMatrix called" << std::endl;
delete m_matrixData;
}
// print to std::out
void printMatrix()
{
std::ostringstream oss;
for (int i = 0; i < m_size; i++)
{
for (int j = 0; j < m_size; j++)
{
oss << m_matrixData->at(i).at(j) << ";";
}
oss << "\n";
}
std::cout << oss.str() << std::endl;
}
bool fillRow(int index, std::vector<T> row)
{
// checks
if (!indexValid(index))
{
return false;
}
if (row.size() != m_size)
{
return false;
}
// data replacement
for (int j = 0; j < m_size; j++)
{
m_matrixData->at(index).at(j) = row.at(j);
}
return true;
}
bool fillColumn(int index, std::vector<T> column)
{
// checks
if (!indexValid(index))
{
return false;
}
if (column.size() != m_size)
{
return false;
}
// data replacement
for (int j = 0; j < m_size; j++)
{
m_matrixData->at(j).at(index) = column.at(j); // row j, column index
}
return true;
}
private:
// variables
std::vector<std::vector<T>>* m_matrixData;
int m_size;
bool indexValid(int index)
{
if (index + 1 > m_size)
{
return false;
}
return true;
}
// functions
void createUnityMatrix()
{
for (int i = 0; i < m_size; i++)
{
std::vector<T> _vector;
for (int j = 0; j < m_size; j++)
{
if (i == j)
{
_vector.push_back(1);
}
else
{
_vector.push_back(0);
}
}
m_matrixData->push_back(_vector);
}
}
};
The thing is, this code is still quite ugly due to the std::function usage. Is there a better and/or simpler option ?
(Also, I'm sure I messed something up with the std::bind; I think I need to use std::placeholders since I want to set the arguments later on.)
// Edit: correct use of placeholders in main:
std::function<bool(int, std::vector<double>)> fillRow = std::bind(&CMatrix<double>::fillRow, &myMatrix, std::placeholders::_1, std::placeholders::_2);
auto duration = getDuration(fillRow, 18, fillVec, ret );
You can utilize RAII to implement a timer that records the execution time of a code block and a template function that wraps the function you would like to execute with the timer.
#include <chrono>
#include <cstdio>
#include <string>
#include <utility>
#include <unistd.h> // usleep (POSIX)
struct Timer
{
std::string fn, title;
std::chrono::time_point<std::chrono::steady_clock> start;
Timer(std::string fn, std::string title)
: fn(std::move(fn)), title(std::move(title)), start(std::chrono::steady_clock::now())
{
}
~Timer()
{
const auto elapsed =
std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::steady_clock::now() - start).count();
printf("%s: function=%s; elapsed=%f ms\n", title.c_str(), fn.c_str(), elapsed / 1000.0);
}
};
#ifndef ENABLE_BENCHMARK
static constexpr inline void dummy_fn() { }
#define START_BENCHMARK_TIMER(...) dummy_fn()
#else
// Timer lives in the global namespace here, so no namespace qualifier is needed.
#define START_BENCHMARK_TIMER(title) Timer timer(__FUNCTION__, title)
#endif
template<typename F, typename ...Args>
auto time_fn(F&& fn, Args&&... args) {
START_BENCHMARK_TIMER("wrapped fn");
return fn(std::forward<Args>(args)...);
}
int foo(int i) {
usleep(70000);
return i;
}
int main()
{
printf("%d\n", time_fn(foo, 3));
}
stdout (built with -DENABLE_BENCHMARK):
wrapped fn: function=time_fn; elapsed=71.785000 ms
3
General Idea:
1. time_fn is a simple template function that calls START_BENCHMARK_TIMER and then calls fn with the provided arguments.
2. START_BENCHMARK_TIMER creates a Timer object, which records the current time in start. Note that __FUNCTION__ expands to the name of the function in which the macro is used (here, time_fn), not the name of the wrapped function.
3. When the provided fn returns or throws an exception, the Timer object from step 2 is destroyed and its destructor is called. The destructor computes the difference between the current time and the recorded start time and prints it to stdout.
Note:
Even though declaring start and end in time_fn instead of the RAII timer will work, having an RAII timer will allow you to cleanly handle the situation when fn throws an exception
If you are on C++11, you will need to change the time_fn declaration to typename std::result_of<F &&(Args &&...)>::type time_fn(F&& fn, Args&&... args).
Edit: Updated the response to include a wrapper function approach.
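If you would rather get the duration back as a value instead of printing it, a variation on the wrapper could look like the sketch below. This is not the approach from the answer above. The name timed_invoke is hypothetical, it assumes C++17 (std::invoke), and it only handles callables that return a value (a void-returning callable would need a separate overload). Because std::invoke accepts member function pointers directly, it avoids the std::function/std::bind boilerplate from the question.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper: calls fn(args...) and returns {result, elapsed time}.
template <class F, class... Args>
auto timed_invoke(F&& fn, Args&&... args) {
    const auto start = std::chrono::steady_clock::now();
    auto result = std::invoke(std::forward<F>(fn), std::forward<Args>(args)...);
    const auto stop = std::chrono::steady_clock::now();
    return std::make_pair(
        std::move(result),
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start));
}
```

With the matrix from the question, a call might read auto [ok, elapsed] = timed_invoke(&CMatrix<double>::fillRow, myMatrix, 18, fillVec); and elapsed.count() would give the microseconds.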

C++20 coroutines. When yield is called empty value is retrieved

I watched Björn Fahller's Meeting C++ online talk "Asynchronous I/O and coroutines for smooth data streaming". Following up on the presentation, I tried to write a similar example myself. There is a bug in my code: when yield is called, the value that is printed is zero. While debugging, I found that yield_value and await_resume are called on different promise objects.
I am confused, and I do not know how to call yield_value on the correct promise object.
#include <iostream>
#include <coroutine>
#include <optional>
#include <string>
#include <memory>
using namespace std;
template<typename T>
struct promise;
template<typename T>
struct task
{
using promise_type = promise<T>;
auto operator co_await() const noexcept
{
struct awaitable
{
awaitable(promise<T> & promise)
:m_promise(promise)
{
}
bool await_ready() const noexcept
{
return m_promise.isready();
}
void await_suspend(coroutine_handle<promise_type> next)
{
m_promise.m_continuation = next;
}
T await_resume() const
{
std::cout << "await_resume m_promise::" << &m_promise << std::endl;
return m_promise.get();
}
promise<T> & m_promise;
};
return awaitable(_coroutine.promise());
}
task(promise_type& promise) : _coroutine(coroutine_handle<promise_type>::from_promise(promise))
{
promise.m_continuation = _coroutine;
}
task() = default;
task(task const&) = delete;
task& operator=(task const&) = delete;
task(task && other) : _coroutine(other._coroutine)
{
other._coroutine = nullptr;
}
task& operator=(task&& other)
{
if (&other != this) {
_coroutine = other._coroutine;
other._coroutine = nullptr;
}
return *this;
}
static task<T> make()
{
std::cout << "Enter make" << std::endl;
co_await suspend_always{};
std::cout << "Enter exit" << std::endl;
}
auto get_promise()
{
std::cout << "get_promise " << &_coroutine.promise() << std::endl;
return _coroutine.promise();
}
~task()
{
if (_coroutine) {
_coroutine.destroy();
}
}
private:
friend class promise<T>;
coroutine_handle<promise_type> _coroutine;
};
template<typename T>
struct promise
{
task<T> get_return_object() noexcept
{
return {*this};
}
suspend_never initial_suspend() noexcept{return {};}
suspend_always final_suspend() noexcept{return {};}
bool isready() const noexcept
{
return m_value.has_value();
}
T get()
{
return m_value.has_value()? m_value.value(): 0;
}
void unhandled_exception()
{
auto ex = std::current_exception();
std::rethrow_exception(ex);
//// MSVC bug? should be possible to rethrow with "throw;"
//// rethrow exception immediately
// throw;
}
template<typename U>
suspend_always yield_value(U && u)
{
std::cout << "yield_value::" << &m_continuation.promise() << std::endl;
m_value.emplace(std::forward<U>(u));
m_continuation.resume();
//m_continuation.
return {};
}
void return_void(){}
coroutine_handle<promise<T>> m_continuation;
optional<T> m_value;
};
template<typename T>
task<T> print_all(task<T> & values)
{
std::cout << "print all" << std::endl;
for(;;)
{
auto v = co_await values;
std::cout << v << "\n" << std::flush;
}
}
int main(int argc, const char * argv[]) {
auto incoming = task<int>::make();
auto h = print_all(incoming);
auto promise = incoming.get_promise();
promise.yield_value(4);
}
Any help?
This is returning a copy of the promise:
auto get_promise()
{
std::cout << "get_promise " << &_coroutine.promise() << std::endl;
return _coroutine.promise();
}
So instead of calling into the promise for the task, you're calling into just some other, unrelated promise object.
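The same effect can be reproduced without any coroutine machinery. In this sketch (the types state and holder are hypothetical stand-ins for the promise and the task), a plain auto return type deduces a value type and copies, while auto& hands back the member itself:

```cpp
#include <cassert>

struct state {
    int value = 0;
};

struct holder {
    state s;
    auto copy_of_state() { return s; }  // deduces state: returns a copy
    auto& ref_to_state() { return s; }  // deduces state&: refers to the member
};
```

Applied to the question, that would mean declaring auto& get_promise() and also binding the result with auto& promise = incoming.get_promise(); since a plain auto at the call site copies again.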
Once you fix that, you'll find that your code has an infinite loop. Your promise is "ready" when it has a value. But once it has a value, it always has a value - it's always ready. One way to fix this is to ensure that await_resume consumes the value. For instance, by changing get() to:
T get()
{
assert(m_value.has_value());
T v = *std::move(m_value);
m_value.reset();
return v;
}
That ensures that the next co_await actually suspends.
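The ready/consume cycle can be checked on its own with a tiny one-element mailbox. This is only a sketch. The class slot and its member names are hypothetical and mirror isready() and get() from the promise.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical one-element mailbox mirroring the promise's m_value handling.
template <class T>
class slot {
    std::optional<T> value_;
public:
    bool ready() const noexcept { return value_.has_value(); }
    void put(T v) { value_.emplace(std::move(v)); }
    // Hands the value out and empties the slot, so ready() turns false again.
    T take() {
        assert(value_.has_value());
        T v = std::move(*value_);
        value_.reset();
        return v;
    }
};
```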

C++20 coroutine use after free issue

I'm making an attempt to learn and implement C++20 coroutines, and I'm experiencing a bug.
Generator class:
template<class ReturnType = void>
class enumerable
{
public:
class promise_type;
using handle_type = std::coroutine_handle<promise_type>;
class promise_type
{
public:
ReturnType current_value{};
auto get_return_object()
{
return enumerable{ handle_type::from_promise(*this) };
}
auto initial_suspend()
{
return std::suspend_always{};
}
auto final_suspend() noexcept
{
return std::suspend_always();
}
void unhandled_exception()
{
// TODO:
}
void return_void()
{
}
auto yield_value(ReturnType& value) noexcept
{
current_value = std::move(value);
return std::suspend_always{};
}
auto yield_value(ReturnType&& value) noexcept
{
return yield_value(value);
}
};
class iterator
{
using iterator_category = std::forward_iterator_tag;
using difference_type = std::ptrdiff_t;
using value_type = ReturnType;
using pointer = ReturnType*;
using reference = ReturnType&;
private:
handle_type handle;
public:
iterator(handle_type handle)
: handle(handle)
{
}
reference operator*() const
{
return this->handle.promise().current_value;
}
pointer operator->()
{
return &this->handle.promise().current_value;
}
iterator& operator++()
{
this->handle.resume();
return *this;
}
friend bool operator==(const iterator& it, std::default_sentinel_t s) noexcept
{
return !it.handle || it.handle.done();
}
friend bool operator!=(const iterator& it, std::default_sentinel_t s) noexcept
{
return !(it == s);
}
friend bool operator==(std::default_sentinel_t s, const iterator& it) noexcept
{
return (it == s);
}
friend bool operator!=(std::default_sentinel_t s, const iterator& it) noexcept
{
return it != s;
}
};
handle_type handle;
enumerable() = delete;
enumerable(handle_type h)
: handle(h)
{
std::cout << "enumerable constructed: " << this << " : " << this->handle.address() << '\n';
};
iterator begin()
{
this->handle.resume();
return iterator(this->handle);
}
std::default_sentinel_t end()
{
return {};
}
//Filters a sequence of values based on a predicate.
template<class Predicate>
enumerable<ReturnType> where(Predicate&& pred)
{
std::cout << "where: " << this << " : " << this->handle.address() << '\n';
for (auto& i : *this)
{
if(pred(i))
co_yield i;
}
}
~enumerable()
{
std::cout << "enumerable destructed: " << this << " : " << this->handle.address() << '\n';
}
};
Test code:
enumerable<int> numbers()
{
co_yield 1;
co_yield 2;
co_yield 3;
co_yield 4;
}
enumerable<int> filtered_numbers()
{
return numbers().where([](int i) { return true; });
}
// Crashes
int main()
{
for (auto& i : filtered_numbers())
{
std::cout << "value: " << i << '\n';
}
return 0;
}
Output:
enumerable constructed: 000000FF0550F560:000002959E3B5290
enumerable constructed: 000000FF0550F5D8:000002959E3B6470
destructed: 000000FF0550F560 : 000002959E3B5290
where: 000000FF0550F560 : 000000FF0550F640
//Works, despite "this" inside "where" still being destructed before use, can be observed with the couts.
int main()
{
for(auto i : numbers().where([](int i) { return true; }))
{
std::cout << "value: " << i << '\n';
}
return 0;
}
Output:
enumerable constructed: 000000C9EDD2FD78:000001DADD1A61D0
enumerable constructed: 000000C9EDD2FD28:000001DADD1A73B0
destructed: 000000C9EDD2FD78 : 000001DADD1A61D0
where: 000000C9EDD2FD78 : 000001DADD1A61D0
value: 1
value: 2
value: 3
value: 4
destructed: 000000C9EDD2FD28 : 000001DADD1A73B0
Could somebody explain what is happening here? I'd like to come up with a workaround if possible. The crash does not happen if we return "std::suspend_never" from our promise_type's "initial_suspend", but not suspending in initial_suspend isn't ideal behavior for a generator.
This:
return numbers().where([](int i) { return true; });
Creates a temporary (numbers()), then stores a reference to that temporary in a coroutine (the *this used in the loop), and then the temporary goes away.
That's bad. If you want to do chaining of coroutines, each step in that chain needs to be an object on someone's stack. where could be a non-member function that takes an enumerable by value. That would allow the where coroutine to preserve the existence of the enumerable.
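The ownership idea can be shown without coroutines. The sketch below is only an analogy, with hypothetical names (filtered, and free functions where and numbers): the lazy stage stores its source by value, so it stays valid after the temporary it was built from is gone.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical lazy pipeline stage (an analogy, not the enumerable class above).
template <class Pred>
struct filtered {
    std::vector<int> source;  // held by value, so it outlives the temporary it came from
    Pred pred;                // also by value, for the same reason

    // Nothing is evaluated until collect() is called.
    std::vector<int> collect() const {
        std::vector<int> out;
        for (int v : source)
            if (pred(v)) out.push_back(v);
        return out;
    }
};

// Free function taking the source by value: the temporary is moved in.
template <class Pred>
filtered<Pred> where(std::vector<int> source, Pred pred) {
    return filtered<Pred>{std::move(source), std::move(pred)};
}

std::vector<int> numbers() { return {1, 2, 3, 4}; }
```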
//Filters a sequence of values based on a predicate.
template<class Predicate>
enumerable<ReturnType> where(Predicate pred)&
{
std::cout << "where: " << this << " : " << this->handle.address() << '\n';
for (auto& i : *this)
{
if(pred(i))
co_yield i;
}
}
//Captures *this as well as above.
template<class Predicate>
enumerable<ReturnType> where(Predicate pred)&&
{
auto self=std::move(*this);
std::cout << "where: " << this << " : " << this->handle.address() << '\n';
for (auto& i : self)
{
if(pred(i))
co_yield i;
}
}
Two changes:
1. I take Predicate by value, to avoid the dangling reference problem.
2. I have a && overload that moves from *this and stores the result within the coroutine.
This still doesn't work.
The first thing that happens is that our coroutine is suspended before any of its code runs. So auto self=std::move(*this) only executes the first time we try to get a value, by which point the temporary has already been destroyed.
We can work around this in a few ways. One of them is to bounce to a free function and let it copy the enumerable<int>:
template<class Predicate>
friend enumerable<ReturnType> where( enumerable<ReturnType> self, Predicate pred ) {
for (auto& i: self)
if (pred(i))
co_yield i;
}
//Filters a sequence of values based on a predicate.
template<class Predicate>
enumerable<ReturnType> where(Predicate pred)&
{
return where( *this, std::move(pred) );
}
template<class Predicate>
enumerable<ReturnType> where(Predicate pred)&&
{
return where( std::move(*this), std::move(pred) );
}
A second way is to modify enumerable<ReturnType> to support a setup phase.
struct init_done {};
auto initial_suspend() {
return std::suspend_never{};
}
auto yield_value(init_done) noexcept {
return std::suspend_always{};
}
and modify enumerable<int> returning functions to first co_yield init_done{}; after their setup is finished.
We'd do this on the first line of the numbers() coroutine, and after we copy *this into the local variable self in the where() coroutine.
This is probably simplest:
template<class F>
friend
enumerable<ReturnType> where2(enumerable<ReturnType> self, F f )
{
for (auto i : self.where(std::move(f)))
co_yield i;
}
template<class F>
enumerable<ReturnType> where(F f)&&
{
return where2(std::move(*this), std::move(f));
}
template<class F>
enumerable<ReturnType> where(F f)&
{
for (auto i : *this)
{
if (f(i))
co_yield i;
}
}

Concurrent program compiled with clang runs fine, but hangs with gcc

I wrote a class to share a limited number of resources (for instance network interfaces) between a larger number of threads. The resources are pooled and, if not in use, they are borrowed out to the requesting thread, which otherwise waits on a condition_variable.
Nothing really exotic: apart from the fancy scoped_lock, which requires C++17, it should be good old C++11.
Both gcc10.2 and clang11 compile the test main fine, but while the latter produces an executable which does pretty much what is expected, the former hangs without consuming CPU (deadlock?).
With the help of https://godbolt.org/ I tried older versions of gcc and also icc (passing options -O3 -std=c++17 -pthread), all reproducing the bad result, while even there clang shows the proper behavior.
I wonder if I made a mistake or if the code triggers some compiler misbehavior and, if so, how to work around it.
#include <iostream>
#include <vector>
#include <stdexcept>
#include <mutex>
#include <condition_variable>
template <typename T>
class Pool {
///////////////////////////
class Borrowed {
friend class Pool<T>;
Pool<T>& pool;
const size_t id;
T * val;
public:
Borrowed(Pool & p, size_t i, T& v): pool(p), id(i), val(&v) {}
~Borrowed() { release(); }
T& get() const {
if (!val) throw std::runtime_error("Borrowed::get() this resource was collected back by the pool");
return *val;
}
void release() { pool.collect(*this); }
};
///////////////////////////
struct Resource {
T val;
bool available = true;
Resource(T v): val(std::move(v)) {}
};
///////////////////////////
std::vector<Resource> vres;
size_t hint = 0;
std::condition_variable cv;
std::mutex mtx;
size_t available_cnt;
public:
Pool(std::initializer_list<T> l): available_cnt(l.size()) {
vres.reserve(l.size());
for (T t: l) {
vres.emplace_back(std::move(t));
}
std::cout << "Pool has size " << vres.size() << std::endl;
}
~Pool() {
for ( auto & res: vres ) {
if ( ! res.available ) {
std::cerr << "WARNING Pool::~Pool resources are still in use\n";
}
}
}
Borrowed borrow() {
std::unique_lock<std::mutex> lk(mtx);
cv.wait(lk, [&](){return available_cnt > 0;});
if ( vres[hint].available ) {
// quick path, if hint points to an available resource
std::cout << "hint good" << std::endl;
vres[hint].available = false;
--available_cnt;
Borrowed b(*this, hint, vres[hint].val);
if ( hint + 1 < vres.size() ) ++hint;
return b; // <--- gcc seems to hang here
} else {
// full scan to find the available resource
std::cout << "hint bad" << std::endl;
for ( hint = 0; hint < vres.size(); ++hint ) {
if ( vres[hint].available ) {
vres[hint].available = false;
--available_cnt;
return Borrowed(*this, hint, vres[hint].val);
}
}
}
throw std::runtime_error("Pool::borrow() no resource is available - internal logic error");
}
void collect(Borrowed & b) {
if ( &(b.pool) != this )
throw std::runtime_error("Pool::collect() trying to collect resource owned by another pool!");
if ( b.val ) {
b.val = nullptr;
{
std::scoped_lock<std::mutex> lk(mtx);
hint = b.id;
vres[hint].available = true;
++available_cnt;
}
cv.notify_one();
}
}
};
///////////////////////////////////////////////////////////////////
#include <thread>
#include <chrono>
int main() {
Pool<std::string> pool{"hello","world"};
std::vector<std::thread> vt;
for (int i = 10; i > 0; --i) {
vt.emplace_back( [&pool, i]()
{
auto res = pool.borrow();
std::this_thread::sleep_for(std::chrono::milliseconds(i*300));
std::cout << res.get() << std::endl;
}
);
}
for (auto & t: vt) t.join();
return 0;
}
You're running into undefined behavior because you effectively relock a mutex that is already held. With MSVC I obtained a helpful call stack to diagnose this. Here is a fixed example that works for me (see the changes within the borrow() method; it could be redesigned further, since locking inside a destructor is questionable):
#include <iostream>
#include <vector>
#include <stdexcept>
#include <mutex>
#include <condition_variable>
template <typename T>
class Pool {
///////////////////////////
class Borrowed {
friend class Pool<T>;
Pool<T>& pool;
const size_t id;
T * val;
public:
Borrowed(Pool & p, size_t i, T& v) : pool(p), id(i), val(&v) {}
~Borrowed() { release(); }
T& get() const {
if (!val) throw std::runtime_error("Borrowed::get() this resource was collected back by the pool");
return *val;
}
void release() { pool.collect(*this); }
};
///////////////////////////
struct Resource {
T val;
bool available = true;
Resource(T v) : val(std::move(v)) {}
};
///////////////////////////
std::vector<Resource> vres;
size_t hint = 0;
std::condition_variable cv;
std::mutex mtx;
size_t available_cnt;
public:
Pool(std::initializer_list<T> l) : available_cnt(l.size()) {
vres.reserve(l.size());
for (T t : l) {
vres.emplace_back(std::move(t));
}
std::cout << "Pool has size " << vres.size() << std::endl;
}
~Pool() {
for (auto & res : vres) {
if (!res.available) {
std::cerr << "WARNING Pool::~Pool resources are still in use\n";
}
}
}
Borrowed borrow() {
std::unique_lock<std::mutex> lk(mtx);
while (available_cnt == 0) cv.wait(lk);
if (vres[hint].available) {
// quick path, if hint points to an available resource
std::cout << "hint good" << std::endl;
vres[hint].available = false;
--available_cnt;
Borrowed b(*this, hint, vres[hint].val);
if (hint + 1 < vres.size()) ++hint;
lk.unlock();
return b; // <--- gcc seems to hang here
}
else {
// full scan to find the available resource
std::cout << "hint bad" << std::endl;
for (hint = 0; hint < vres.size(); ++hint) {
if (vres[hint].available) {
vres[hint].available = false;
--available_cnt;
lk.unlock();
return Borrowed(*this, hint, vres[hint].val);
}
}
}
throw std::runtime_error("Pool::borrow() no resource is available - internal logic error");
}
void collect(Borrowed & b) {
if (&(b.pool) != this)
throw std::runtime_error("Pool::collect() trying to collect resource owned by another pool!");
if (b.val) {
b.val = nullptr;
{
std::scoped_lock<std::mutex> lk(mtx);
hint = b.id;
vres[hint].available = true;
++available_cnt;
cv.notify_one();
}
}
}
};
///////////////////////////////////////////////////////////////////
#include <thread>
#include <chrono>
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
int main()
{
try
{
Pool<std::string> pool{ "hello","world" };
std::vector<std::thread> vt;
for (int i = 10; i > 0; --i) {
vt.emplace_back([&pool, i]()
{
auto res = pool.borrow();
std::this_thread::sleep_for(std::chrono::milliseconds(i * 300));
std::cout << res.get() << std::endl;
}
);
}
for (auto & t : vt) t.join();
return 0;
}
catch(const std::exception& e)
{
std::cout << "exception occurred: " << e.what();
}
return 0;
}
A locking destructor coupled with a missed NRVO caused the issue (credits to Secundi for pointing this out in the comments).
If the compiler skips NRVO, the few lines below (inside the if branch) call the destructor of b. The destructor tries to acquire the mutex before the unique_lock releases it, resulting in a deadlock.
Borrowed b(*this, hint, vres[hint].val);
if ( hint + 1 < vres.size() ) ++hint;
return b; // <--- gcc seems to hang here
It is crucial here to avoid destroying b. In fact, even though manually releasing the unique_lock before returning avoids the deadlock, the destructor of b would still mark the pooled resource as available while it is being borrowed out, making the code wrong.
A possible fix consists of replacing the lines above with:
const auto tmp = hint;
if ( hint + 1 < vres.size() ) ++hint;
return Borrowed(*this, tmp, vres[tmp].val);
Another possibility (which does not exclude the former) is to delete the (evil) copy ctor of Borrowed and only provide a move ctor:
Borrowed(const Borrowed &) = delete;
Borrowed(Borrowed && b): pool(b.pool), id(b.id), val(b.val) { b.val = nullptr; }
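The effect of the move-only design can be checked with a miniature version. In this sketch the names counter, handle_guard and make_guard are hypothetical, and a plain counter stands in for the pool so that no mutex is involved. Whether or not the compiler applies NRVO, the resource is released exactly once, and only when the final owner goes away.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the pool: just counts releases.
struct counter {
    int released = 0;
};

class handle_guard {
    counter* owner;
public:
    explicit handle_guard(counter& c) : owner(&c) {}
    handle_guard(const handle_guard&) = delete;
    handle_guard& operator=(const handle_guard&) = delete;
    // The moved-from guard gives up ownership, so its destructor does nothing.
    handle_guard(handle_guard&& other) noexcept
        : owner(std::exchange(other.owner, nullptr)) {}
    ~handle_guard() { if (owner) ++owner->released; }
};

handle_guard make_guard(counter& c) {
    handle_guard g(c);
    return g;  // moved or elided, never copied
}
```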