I'm wondering how to develop an asynchronous API using promises and futures.
The application uses a single data stream that carries both unsolicited periodic data and request/reply communication.
For the request/reply part, blocking until the reply is received is not an option, and I don't want to litter the code with callbacks, so I'd like to write some kind of SendMessage that accepts the id of the expected reply and completes only upon its reception. It's up to the caller to read the reply.
A candidate API could be:
std::future<void> sendMessage(Message msg, id expected)
{
// Write the message
auto promise = std::make_shared<std::promise<void>>();
// Memorize the promise somewhere accessible to the receiving thread
return promise->get_future();
}
The worker thread, upon reception of a message, should be able to query a data structure to know whether someone is waiting for it, and "release" the future.
Given that promises are not re-usable, what I'm trying to understand is what kind of data structure I should use to manage the "in flight" promises.
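For illustration, the simplest thing I can imagine is a mutex-guarded map from reply id to promise (a rough sketch; int stands in for the real id type):
#include <future>
#include <map>
#include <mutex>
struct InFlightPromises {
    std::mutex mtx;
    std::map<int, std::promise<void>> pending;
    // Called by sendMessage(): register interest in a reply id.
    std::future<void> add(int id) {
        std::lock_guard<std::mutex> lock(mtx);
        return pending[id].get_future();
    }
    // Called by the receiving thread: release the waiter, if any.
    bool fulfill(int id) {
        std::lock_guard<std::mutex> lock(mtx);
        auto it = pending.find(id);
        if (it == pending.end())
            return false; // nobody is waiting for this reply
        it->second.set_value();
        pending.erase(it);
        return true;
    }
};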
Setting a shared flag can let the worker know whether the other side, say the boss, is still expecting the result.
The shared flag, along with the promise and the future, can be enclosed in a class (template), say Request. The boss sets the flag by destructing his copy of the request, and the worker queries whether the boss still expects the request to be completed by calling a member function on his own copy of the request.
Simultaneous reading/writing of the flag should probably be synchronized.
The boss may not access the promise, and the worker may not access the future.
There should be at most two copies of the request, because the flag is set on destruction of a request object. To achieve this, we can declare the corresponding member functions as deleted or private, and hand out exactly two copies on construction.
Here follows a simple implementation of Request:
#include <atomic>
#include <future>
#include <memory>
#include <utility>
// Placeholder tag types; replace with whatever identifies your worker/boss.
struct WorkerType {};
struct BossType {};
template <class T>
class Request {
public:
struct Detail {
std::atomic<bool> is_canceled_{false};
std::promise<T> promise_;
std::future<T> future_ = promise_.get_future();
};
static auto NewRequest() {
std::unique_ptr<Request> copy1{new Request()};
std::unique_ptr<Request> copy2{new Request(*copy1)};
return std::make_pair(std::move(copy1), std::move(copy2));
}
Request(Request &&) = delete;
~Request() {
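// Destroying either copy marks the request as canceled for the other side.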
detail_->is_canceled_.store(true);
}
Request &operator=(const Request &) = delete;
Request &operator=(Request &&) = delete;
// simple api
std::promise<T> &Promise(const WorkerType &) {
return detail_->promise_;
}
std::future<T> &Future(const BossType &) {
return detail_->future_;
}
// return value:
// true if available, false otherwise
bool CheckAvailable() {
return detail_->is_canceled_.load() == false;
}
private:
Request() : detail_(new Detail{}) {}
Request(const Request &) = default;
std::shared_ptr<Detail> detail_;
};
template <class T>
auto SendMessage() {
auto result = Request<T>::NewRequest();
// TODO: send result.second (the other copy) to the worker
return std::move(result.first);
}
A new request is constructed by the factory function NewRequest; the return value is a std::pair containing two std::unique_ptrs, each holding one copy of the newly created request.
The worker can now use the member function CheckAvailable() to check whether the request has been canceled.
And the shared state is managed properly (I believe) by the std::shared_ptr.
Note on std::promise<T> &Promise(const WorkerType &): the const reference parameter (which should be replaced with a proper type according to your implementation) prevents the boss from calling this function by accident, while the worker can easily provide a proper argument for calling it. The same goes for std::future<T> &Future(const BossType &).
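A quick usage sketch, using the placeholder tag types from above:
auto pair = Request<int>::NewRequest();
auto boss_copy = std::move(pair.first);
// hand pair.second (the worker's copy) over to the worker thread, then:
// worker side:
//     if (worker_copy->CheckAvailable())
//         worker_copy->Promise(WorkerType{}).set_value(42);
// boss side:
//     int result = boss_copy->Future(BossType{}).get();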
I have a bidirectional streaming async grpc client that uses ClientAsyncReaderWriter for communication with the server. The RPC code looks like:
rpc Process (stream Request) returns (stream Response)
For simplicity, Request and Response are byte arrays (byte[]). I send several chunks of data to the server, and when the server has accumulated enough data, it processes them, sends back a response, and continues accumulating data for the next responses. After several responses, the server sends a final response and closes the connection.
For the async client I am using a CompletionQueue. The code looks like:
...
CompletionQueue cq;
std::unique_ptr<Stub> stub;
grpc::ClientContext context;
std::unique_ptr<grpc::ClientAsyncReaderWriter<Request,Response>> responder = stub->AsyncProcess(&context, &cq, handler);
// thread for completion queue
std::thread t(
[&]{
void *handler = nullptr;
bool ok = false;
while (cq.Next(&handler, &ok)) {
if (can_read) {
// how do you know whether read data is available?
// Do read
} else {
// do write
...
Request request = prepare_request();
responder->Write(request, handler);
}
}
}
);
);
...
// wait
What is the proper way to do async reading? Can I try to read if no data is available? Is it a blocking call?
Sequencing Read() calls
Can I try to read if no data is available?
Yep, and it's going to be the case more often than not. Read() will do nothing until data is available, and only then put its passed tag into the completion queue. (See below for details.)
Is it a blocking call?
Nope. Read() and Write() return immediately. However, you can only have one of each in flight at any given moment. If you try to send a second one before the previous has completed, it (the second one) will fail.
What is the proper way to do async reading?
Each time a Read() is done, start a new one. For that, you need to be able to tell when a Read() is done. This is where tags come in!
When you call Read(&msg, tag) or Write(request, tag), you are telling grpc to put tag into the completion queue associated with that responder once that operation has completed. grpc doesn't care what the tag is; it just hands it off.
So the general strategy you will want to go for is:
As soon as you are ready to start receiving messages:
call responder->Read() once with some tag that you will recognize as a "read done".
Whenever cq_.Next() gives you back that tag, and ok == true:
consume the message
Queue up a new responder->Read() with that same tag.
Obviously, you'll also want to do something similar for your calls to Write().
But since you still want to be able to look up the handler instance from a given tag, you'll need a way to pack a reference to the handler, as well as information about which operation is finishing, into a single tag.
Completion queues
Look up the handler instance from a given tag? Why?
The true raison d'être of completion queues is unfortunately not evident from the examples. They allow multiple asynchronous rpcs to share the same thread. Unless your application only ever makes a single rpc call, the handling thread should not be associated with a specific responder. Instead, that thread should be a general-purpose worker that dispatches events to the correct handler based on the content of the tag.
The official examples tend to do that by using a pointer to the handler object as the tag. That works when there's a specific sequence of events to expect, since you can easily predict what a handler is reacting to. You often can't do that with async bidirectional streams, since any given completion event could be a Read() or a Write() finishing.
Example
Here's a general outline of what I personally consider to be a clean way to go about all that:
// Base class for async bidir RPCs handlers.
// This is so that the handling thread is not associated with a specific rpc method.
class RpcHandler {
// This will be used as the "tag" argument to the various grpc calls.
struct TagData {
enum class Type {
start_done,
read_done,
write_done,
// add more as needed...
};
RpcHandler* handler;
Type evt;
};
struct TagSet {
TagSet(RpcHandler* self)
: start_done{self, TagData::Type::start_done},
read_done{self, TagData::Type::read_done},
write_done{self, TagData::Type::write_done} {}
TagData start_done;
TagData read_done;
TagData write_done;
};
public:
RpcHandler() : tags(this) {}
virtual ~RpcHandler() = default;
// The actual tag objects we'll be passing
TagSet tags;
virtual void on_ready() = 0;
virtual void on_recv() = 0;
virtual void on_write_done() = 0;
static void handling_thread_main(grpc::CompletionQueue* cq) {
void* raw_tag = nullptr;
bool ok = false;
while (cq->Next(&raw_tag, &ok)) {
TagData* tag = reinterpret_cast<TagData*>(raw_tag);
if(!ok) {
// Handle error
}
else {
switch (tag->evt) {
case TagData::Type::start_done:
tag->handler->on_ready();
break;
case TagData::Type::read_done:
tag->handler->on_recv();
break;
case TagData::Type::write_done:
tag->handler->on_write_done();
break;
}
}
}
}
};
void do_something_with_response(Response const&);
class MyHandler final : public RpcHandler {
public:
using responder_ptr =
std::unique_ptr<grpc::ClientAsyncReaderWriter<Request, Response>>;
MyHandler(responder_ptr responder) : responder_(std::move(responder)) {
// This lock is needed because StartCall() can
// cause the handler thread to access the object.
std::lock_guard lock(mutex_);
responder_->StartCall(&tags.start_done);
}
~MyHandler() {
// TODO: finish/abort the streaming rpc as appropriate.
}
void send(const Request& msg) {
std::lock_guard lock(mutex_);
if (!sending_) {
sending_ = true;
responder_->Write(msg, &tags.write_done);
} else {
// TODO: add some form of synchronous wait, or outright failure
// if the queue starts to get too big.
queued_msgs_.push(msg);
}
}
private:
// When the rpc is ready, queue the first read
void on_ready() override {
std::lock_guard l(mutex_); // To synchronize with the constructor
responder_->Read(&incoming_, &tags.read_done);
};
// When a message arrives, use it, and start reading the next one
void on_recv() override {
// incoming_ never leaves the handling thread, so no need to lock.
// ------ If handling is cheap and stays in the handling thread:
do_something_with_response(incoming_);
responder_->Read(&incoming_, &tags.read_done);
// ------ If handling is expensive or involves another thread:
// Response msg = std::move(incoming_);
// responder_->Read(&incoming_, &tags.read_done);
// do_something_with_response(msg);
};
// When a message has been sent, send the next one if there is any
void on_write_done() override {
std::lock_guard lock(mutex_);
if (!queued_msgs_.empty()) {
responder_->Write(queued_msgs_.front(), &tags.write_done);
queued_msgs_.pop();
} else {
sending_ = false;
}
};
responder_ptr responder_;
// Only ever touched by the handler thread post-construction.
Response incoming_;
bool sending_ = false;
std::queue<Request> queued_msgs_;
std::mutex mutex_; // grpc might be thread-safe, MyHandler isn't...
};
int main() {
// Start the thread as soon as you have a completion queue.
auto cq = std::make_unique<grpc::CompletionQueue>();
std::thread t(RpcHandler::handling_thread_main, cq.get());
// Multiple concurrent RPCs sharing the same handling thread:
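// (outline only: in real code each call needs its own grpc::ClientContext,
// and the handling thread should be joined before the queue is destroyed)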
MyHandler handler1(serviceA->MethodA(&context, cq.get()));
MyHandler handler2(serviceA->MethodA(&context, cq.get()));
MyHandlerB handler3(serviceA->MethodB(&context, cq.get()));
MyHandlerC handler4(serviceB->MethodC(&context, cq.get()));
}
If you have a keen eye, you will notice that the code above stores a bunch (one per event type) of redundant this pointers in the handler. It's generally not a big deal, and while it is possible to do without them via multiple inheritance and downcasting, that's starting to be somewhat beyond the scope of this question.
I apologize for the ambiguous title, but I'll try to elaborate further here:
I have an application which includes (among others) a control class and a TCP server class.
Communication between the TCP server and the control class is done via this implementation:
#include <boost/signals2.hpp>
// T - Observable object type
// S - Function signature
template <class T, typename S> class observer {
using F = std::function<S>;
public:
void register_notifier(T &obj, F f)
{
connection_ = obj.connect_notifier(std::forward<F>(f));
}
protected:
boost::signals2::scoped_connection connection_;
};
// S - Function signature
template <typename S> class observable {
public:
boost::signals2::scoped_connection connect_notifier(std::function<S> f)
{
return notify.connect(std::move(f));
}
protected:
boost::signals2::signal<S> notify;
};
Where the TCP server class is the observable, and the control class is the observer.
The TCP server is running on a separate thread from the control class, and uses boost::asio::async_read. Whenever a message is received, the server object sends a notification via the 'notify' member, thus triggering the callback registered in the control class, and then waits to read the next message.
The problem is that I need to somehow safely and efficiently store the data currently held in the TCP server buffer and pass it to the control class before it's overwritten by the next message.
i.e. :
inline void ctl::tcp::server::handle_data_read(/* ... */)
{
if (!error) {
/* .... */
notify(/* ? */); // Using a pointer to the buffer
// would obviously fail, as it
// is overwritten by the next read
}
/* .... */
}
Those were my ideas for a solution so far (a rough sketch of the first one follows below):
Allocating heap memory and passing a pointer to it using unique_ptr, but I'm not sure if boost.signals2 is move-aware.
Using an unordered map (shared between the objects) that maps an integer index to a unique_ptr of the data type (std::unordered_map<int, std::unique_ptr<data_type>>), then only passing the index of the element and 'popping' it in the control class callback, but that feels like overkill.
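For illustration, the first idea could look roughly like this (a sketch with invented names; using shared_ptr instead of unique_ptr sidesteps the question of whether signals2 moves or copies the argument):
// Sketch only: buffer_ is the server's read buffer, and notify has the
// signature void(std::shared_ptr<const data_type>).
using data_type = std::vector<std::uint8_t>;
inline void ctl::tcp::server::handle_data_read(
    const boost::system::error_code &error, std::size_t bytes_read)
{
    if (!error) {
        // Copy the received bytes out of the reusable read buffer into
        // freshly allocated storage that the slots can share ownership of.
        auto msg = std::make_shared<const data_type>(
            buffer_.begin(), buffer_.begin() + bytes_read);
        notify(msg); // buffer_ can now be reused for the next read
    }
    /* ... queue the next async_read ... */
}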
What I'm really looking for is an idea for a simple and efficient solution to pass the TCP buffer contents for each message between the threads.
Note that I'm also open to suggestions to redesign my communication method between the objects if it's completely wrong.
The story :
I make use of the QtConcurrent API for every "long" operation in my application.
It works pretty well, but I face some problems with QObject creation.
Consider this piece of code, which uses a thread to create a "Foo" object:
QFuture<Foo*> future = QtConcurrent::run([=]()
{
Data* data = /*long operation to acquire the data*/
Foo* result = new Foo(data);
return result;
});
It works well, but if the "Foo" class is derived from QObject, the "result" instance belongs to the QThread that created the object.
So to use signals/slots properly with the "result" instance, one should do something like this:
QFuture<Foo*> future = QtConcurrent::run([=]()
{
Data* data = /*long operation to acquire the data*/
Foo* result = new Foo(data);
// Move "result" to the main application thread
result->moveToThread(qApp->thread());
return result;
});
Now everything works as expected, and I think this is the normal behaviour and the nominal solution.
The problem :
I have a lot of this kind of code, which sometimes creates objects, which can in turn create other objects. Most of them are moved properly with a "moveToThread" call.
But sometimes I miss one "moveToThread" call.
And then a lot of things look like they don't work (because that object's slots are "broken"), without any Qt warning.
I sometimes spend a lot of time figuring out why something doesn't work, before understanding that it's only because the slots are no longer called on a particular object instance.
The question :
Is there any way to help me prevent/detect/debug this kind of situation?
For example :
having a warning logged every time a QThread is deleted while objects that belong to it are still alive?
having a warning logged every time a signal is emitted to an object whose QThread has been deleted?
having a warning logged every time a signal is emitted to an object (in another thread) and not processed before a timeout?
Thanks
It is possible to track an object's movement among threads. Just before an object is moved to a new thread, it is sent a ThreadChange event. You can filter that event and have your code take note of when an object leaves a thread. But at that point it's too early to know where the object is going. To detect that, you need to post a metacall (see this question) to the object's queue, to be executed as soon as the object's event processing resumes in the new thread. You'd also attach to QThread::finished to get a chance to look through your object list and check whether any of them live on the thread that's about to die.
But all this is fairly involved: each thread will need its own tracker/filter object, as event filters must live in the object's thread. We're probably talking about more than 200 lines of code to do it right, handling all corner cases.
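Just to illustrate the first ingredient (nowhere near the full tracker described above), an event filter that notices objects leaving a thread could look like:
#include <QDebug>
#include <QEvent>
#include <QObject>
// Sketch only: logs when a watched object is about to move to another thread.
// The filter must live in the same thread as the objects it watches.
class ThreadChangeWatcher : public QObject {
public:
    explicit ThreadChangeWatcher(QObject *parent = nullptr) : QObject(parent) {}
    bool eventFilter(QObject *watched, QEvent *event) override {
        if (event->type() == QEvent::ThreadChange)
            qDebug() << watched << "is about to leave thread" << watched->thread();
        return QObject::eventFilter(watched, event); // don't consume the event
    }
};
// usage: watchedObject->installEventFilter(&watcher);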
Instead, you can leverage RAII and hold your objects using handles that manage thread affinity as a resource (because it is one!):
// https://github.com/KubaO/stackoverflown/tree/master/questions/thread-track-38611886
#include <QtConcurrent>
template <typename T>
class MainResult {
Q_DISABLE_COPY(MainResult)
T * m_obj;
public:
template<typename... Args>
MainResult(Args&&... args) : m_obj{ new T(std::forward<Args>(args)...) } {}
MainResult(T * obj) : m_obj{obj} {}
T* operator->() const { return m_obj; }
operator T*() const { return m_obj; }
T* operator()() const { return m_obj; }
~MainResult() { m_obj->moveToThread(qApp->thread()); }
};
struct Foo : QObject { Foo(int) {} };
You can return a MainResult by value, but the return type of the functor must be explicitly given:
QFuture<Foo*> test1() {
return QtConcurrent::run([=]()->Foo*{ // explicit return type
MainResult<Foo> obj{1};
obj->setObjectName("Hello");
return obj; // return by value
});
}
Alternatively, you can return the result of calling the MainResult: it's a functor itself, to save a bit of typing. This might be considered a hack, though, and perhaps you should convert operator()() to a method with a short name.
QFuture<Foo*> test2() {
return QtConcurrent::run([=](){ // deduced return type
MainResult<Foo> obj{1};
obj->setObjectName("Hello");
return obj(); // return by call
});
}
While it's preferable to construct the object along with the handle, it's also possible to pass an instance pointer to the handle's constructor:
MainResult<Foo> obj{ new Foo{1} };
I was able to create a handler for a boost deadline_timer (which is a member) by declaring it static. Unfortunately this prevents access to non-static member data.
I have a series of timeouts, so my idea was to have a single deadline_timer while maintaining an ordered list of timeout events. Every time the next timeout event fires, the class would retrigger the timer with the next timeout event in the list, calculating the remaining time for that event.
For this concept to work, the handler would need to manipulate non-static data. But this is not possible, since boost::asio requires a static handler.
Does anybody have an idea how to handle this?
class TimerController {
public:
void setTimer(const eibaddr_t gad, const timesecs_t timedelay);
void cancelTimer(const eibaddr_t gad);
bool isRunning(const eibaddr_t gad);
void setGad(const eibaddr_t gad);
static void timerHandler(const boost::system::error_code &ec);
private:
boost::asio::deadline_timer* m_pTimer;
struct timerList_s
{
eibaddr_t gad;
boost::posix_time::ptime absTimeOut;
timerList_s(const timerList_s& elem) : gad(elem.gad),
absTimeOut(elem.absTimeOut)
{
};
timerList_s(const eibaddr_t& pgad, const boost::posix_time::ptime pato)
: gad(pgad),
absTimeOut(pato)
{
};
timerList_s& operator= (const timerList_s& elem)
{
gad = elem.gad;
absTimeOut = elem.absTimeOut;
return *this;
};
bool operator< (const timerList_s& elem) const
{
return (absTimeOut < elem.absTimeOut);
};
bool operator== (const timerList_s& elem) const
{
return (gad == elem.gad);
};
};
std::list<timerList_s> m_timers;
};
It is possible to use the deadline_timer class with non-static member data by using boost::bind, in the following way: deadline_.async_wait(bind(&client::check_deadline, this));. Details are available in ASIO's examples, for instance here.
I have a series of timeouts, so my idea was to have a single
deadline_timer while maintaining an ordered list of timeout events.
Every time the next timeout event fires, the class would retrigger
the timer with the next timeout event in the list, calculating the
remaining time for that event.
This is a very odd design.
For this concept to work, the handler would need to manipulate
non-static data. But this is not possible, since boost::asio requires a
static handler.
boost::asio does not require a static handler; see the documentation. It requires a handler with the signature:
void handler(
const boost::system::error_code& error // Result of operation.
);
The typical recipe here is to use boost::bind to bind a member function to the handler. The async TCP client example shows one way to do this. The author of the asio library has an excellent blog post describing this concept in detail if you have trouble understanding it.
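A minimal sketch of that recipe, reusing names from the question (the retrigger logic is only hinted at in comments):
#include <boost/asio.hpp>
#include <boost/bind.hpp>
class TimerController {
public:
    explicit TimerController(boost::asio::io_service &io) : m_timer(io) {}
    void startNext(const boost::posix_time::time_duration &delay) {
        m_timer.expires_from_now(delay);
        // Bind `this` so the handler can be an ordinary member function.
        m_timer.async_wait(boost::bind(&TimerController::timerHandler, this,
                                       boost::asio::placeholders::error));
    }
private:
    void timerHandler(const boost::system::error_code &ec) {
        if (ec == boost::asio::error::operation_aborted)
            return; // timer was cancelled or retriggered
        // Pop the front of m_timers here, then call startNext() with the
        // remaining time of the new front element.
    }
    boost::asio::deadline_timer m_timer;
};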
Is it possible to perform an asynchronous (read: non-blocking) wait on a condition variable in boost::asio? If it isn't directly supported, any hints on implementing it would be appreciated.
I could implement a timer and fire a wakeup event every few ms, but this approach is vastly inferior. I find it hard to believe that condition variable synchronization is not implemented/documented.
If I understand the intent correctly, you want to launch an event handler when some condition variable is signaled, in the context of the asio thread pool? I think it would be sufficient to wait on the condition variable at the beginning of the handler, and io_service::post() itself back into the pool at the end; something of this sort:
#include <iostream>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
boost::asio::io_service io;
boost::mutex mx;
boost::condition_variable cv;
void handler()
{
boost::unique_lock<boost::mutex> lk(mx);
cv.wait(lk);
std::cout << "handler awakened\n";
io.post(handler);
}
void buzzer()
{
for(;;)
{
boost::this_thread::sleep(boost::posix_time::seconds(1));
boost::lock_guard<boost::mutex> lk(mx);
cv.notify_all();
}
}
int main()
{
io.post(handler);
boost::thread bt(buzzer);
io.run();
}
I can suggest a solution based on boost::asio::deadline_timer which works fine for me. It is a kind of async event in the boost::asio environment.
One very important thing is that the 'handler' must be serialised through the same 'strand_' as 'cancel', because using 'boost::asio::deadline_timer' from multiple threads is not thread safe.
class async_event
{
public:
async_event(
boost::asio::io_service& io_service,
boost::asio::strand<boost::asio::io_context::executor_type>& strand)
: strand_(strand)
, deadline_timer_(io_service, boost::posix_time::ptime(boost::posix_time::pos_infin))
{}
// 'handler' must be serialised through the same 'strand_' as 'cancel' or 'cancel_one'
// because using 'boost::asio::deadline_timer' from multiple threads is not thread safe
template<class WaitHandler>
void async_wait(WaitHandler&& handler) {
deadline_timer_.async_wait(std::forward<WaitHandler>(handler));
}
void async_notify_one() {
boost::asio::post(strand_, boost::bind(&async_event::async_notify_one_serialized, this));
}
void async_notify_all() {
boost::asio::post(strand_, boost::bind(&async_event::async_notify_all_serialized, this));
}
private:
void async_notify_one_serialized() {
deadline_timer_.cancel_one();
}
void async_notify_all_serialized() {
deadline_timer_.cancel();
}
boost::asio::strand<boost::asio::io_context::executor_type>& strand_;
boost::asio::deadline_timer deadline_timer_;
};
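A hypothetical usage sketch (note that the wait handler is bound to the same strand, as required above):
#include <boost/asio.hpp>
#include <iostream>
int main() {
    boost::asio::io_context io;
    boost::asio::strand<boost::asio::io_context::executor_type> strand(io.get_executor());
    async_event event(io, strand);
    event.async_wait(boost::asio::bind_executor(strand,
        [](const boost::system::error_code &) {
            std::cout << "event notified\n";
        }));
    event.async_notify_one(); // wakes the waiter via cancel_one()
    io.run();
}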
Unfortunately, Boost ASIO doesn't have an async_wait_for_condvar() method.
In most cases, you also won't need it. Programming the ASIO way usually means that you use strands, not mutexes or condition variables, to protect shared resources. Except for rare cases, which usually revolve around correct construction or destruction order at startup and exit, you won't need mutexes or condition variables at all.
When modifying a shared resource, the classic, partially synchronous threaded way is as follows:
Lock the mutex protecting the resource
Update whatever needs to be updated
Signal a condition variable, if further processing by a waiting thread is required
Unlock the mutex
The fully asynchronous ASIO way, though, is:
Generate a message that contains everything needed to update the resource
Post a call to an update handler with that message to the resource's strand
If further processing is needed, let that update handler create further message(s) and post them to the appropriate resources' strands.
If jobs can be executed on fully private data, then post them directly to the io-context instead.
Here is an example of a class some_shared_resource that receives a string state and triggers some further processing depending on the state received. Please note that all processing in the private method some_shared_resource::receive_state() is fully thread-safe, as the strand serializes all calls.
Of course, the example is not complete; some_other_resource needs a similar send_code_red() method, like some_shared_resource::send_state().
#include <boost/asio.hpp>
#include <memory>
#include <string>
#include <utility>
using asio_context = boost::asio::io_context;
using asio_executor_type = asio_context::executor_type;
using asio_strand = boost::asio::strand<asio_executor_type>;
class some_other_resource;
class some_shared_resource : public std::enable_shared_from_this<some_shared_resource> {
asio_strand strand;
std::shared_ptr<some_other_resource> other;
std::string state;
void receive_state(std::string&& new_state) {
std::string oldstate = std::exchange(state, std::move(new_state));
if(state == "red" && oldstate != "red") {
// state transition to "red":
other->send_code_red(true);
} else if(state != "red" && oldstate == "red") {
// state transition from "red":
other->send_code_red(false);
}
}
}
public:
some_shared_resource(asio_context& ctx, const std::shared_ptr<some_other_resource>& other)
: strand(ctx.get_executor()), other(other) {}
void send_state(std::string&& new_state) {
boost::asio::post(strand, [me = weak_from_this(), new_state = std::move(new_state)]() mutable {
if(auto self = me.lock(); self) {
self->receive_state(std::move(new_state));
}
});
}
};
As you can see, always posting into ASIO's strands can be a bit tedious at first. But you can move most of that "equip a class with a strand" code into a template.
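For instance, a small sketch of such a helper, reusing the aliases from above (the name strand_holder is mine):
// Sketch: a reusable base class that owns a strand and posts work onto it.
class strand_holder {
    asio_strand strand;
protected:
    explicit strand_holder(asio_context& ctx) : strand(ctx.get_executor()) {}
    // Run a callable on this object's strand; all state updates go through here.
    template <class F>
    void post_self(F&& f) {
        boost::asio::post(strand, std::forward<F>(f));
    }
};
// some_shared_resource could then derive from strand_holder and implement
// send_state() simply as: post_self([...] { ... });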
The good thing about message passing: as you are not using mutexes, you cannot deadlock yourself anymore, even in extreme situations. Also, with message passing it is often easier to achieve a high level of parallelism than with classical multithreading. On the downside, moving and copying all these message objects around is time consuming, which can slow down your application.
A last note: using the weak pointer in the message formed by send_state() facilitates the reliable destruction of some_shared_resource objects. Otherwise, if A calls B and B calls C and C calls A (possibly only after a timeout or similar), using shared pointers instead of weak pointers in the messages would create cyclic references, which would prevent object destruction. If you are sure that you will never have cycles, and that processing messages from to-be-deleted objects doesn't pose a problem, you can use shared_from_this() instead of weak_from_this(), of course. If you are sure that objects won't get deleted before ASIO has been stopped (and all worker threads joined back to the main thread), then you can also capture the this pointer directly instead.
FWIW, I implemented an asynchronous mutex using the rather good continuable library:
class async_mutex
{
cti::continuable<> tail_{cti::make_ready_continuable()};
std::mutex mutex_;
public:
async_mutex() = default;
async_mutex(const async_mutex&) = delete;
const async_mutex& operator=(const async_mutex&) = delete;
[[nodiscard]] cti::continuable<std::shared_ptr<int>> lock()
{
std::shared_ptr<int> result;
cti::continuable<> tail = cti::make_continuable<void>(
[&result](auto&& promise) {
result = std::shared_ptr<int>((int*)1,
[promise = std::move(promise)](auto) mutable {
promise.set_value();
}
);
}
);
{
std::lock_guard _{mutex_};
std::swap(tail, tail_);
}
co_await std::move(tail);
co_return result;
}
};
Usage, e.g.:
async_mutex mutex;
...
{
const auto _ = co_await mutex.lock();
// only one lock per mutex-instance
}