Am I paranoid while using boost::asio? - c++

I am writing an app using boost::asio.
I have a single thread calling io_service::run() and many worker threads. Any worker thread may send a message at any time.
Here is how I implement send_msg():
// Note: send_msg() may be called from any thread.
// 'msg' must be allocated with malloc(); its ownership is transferred to '_send_q'.
//
// NetLibConnection derives from tcp::socket and boost::enable_shared_from_this.
void NetLibConnection::send_msg(PlainNetLibMsg* msg)
{
    AutoLocker __dummy(this->_lock_4_send_q); // _lock_4_send_q is a mutex
    // _send_q is a std::deque<PlainNetLibMsg*>; the send-queue mechanism is
    // borrowed from boost_asio_example/cpp03/chat
    bool write_in_progress = !this->_send_q.empty();
    this->_send_q.push_back(msg);
    if (write_in_progress)
    {
        return;
    }
    // queue the send operation on the single io_service::run() thread
    this->get_io_service().post(
        boost::bind(&NetLibConnection::async_send_front_of_q
            , boost::dynamic_pointer_cast<NetLibConnection>(shared_from_this())
        )
    );
}
void NetLibConnection::async_send_front_of_q()
{
    boost::asio::async_write(*this
        , boost::asio::buffer(this->_send_q.front(), _send_q.front()->header.DataSize + sizeof(NetLibChunkHeader))
        // This great post convinced me that I should use a strand per connection:
        // https://stackoverflow.com/questions/12794107/why-do-i-need-strand-per-connection-when-using-boostasio/
        , this->_strand.wrap(
            boost::bind(&NetLibConnection::handle_send
                , boost::dynamic_pointer_cast<NetLibConnection>(shared_from_this())
                , boost::asio::placeholders::error
            )
        )
    );
}
The code works fine, but I am not satisfied with its verbosity. I feel that send_q plays the same role as the strand, since all real async_write calls happen in the single io_service::run() thread and are queued one by one via send_q.
Do I still need the strand?

Yes, indeed: you are being paranoid, and the strand is not needed here. The documentation details this:
Threads And Boost Asio
By only calling io_service::run() from a single thread, the user's code can avoid the development complexity associated with synchronisation. For example, a library user can implement scalable servers that are single-threaded (from the user's point of view).
Thinking a bit more broadly, your scenario is the simplest form of having a single logical strand. There are other ways in which you can maintain logical strands (by chaining handlers), see this most excellent answer on the subject: Why do I need strand per connection when using boost::asio?
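To make the "single logical strand" point concrete, here is a minimal sketch of the send-queue chaining from the question. It deliberately does not use Boost: MiniLoop is a hypothetical hand-rolled stand-in for an io_service run by one thread, Connection and its members are illustrative names, and the mutex from the question is omitted because everything here runs on one thread. The sketch only shows that the queue keeps at most one write outstanding, which is why no extra strand is required in this setup.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a single-threaded io_service: a FIFO of handlers
// that run() drains on the calling thread.
struct MiniLoop {
    std::deque<std::function<void()>> handlers;
    void post(std::function<void()> h) { handlers.push_back(std::move(h)); }
    void run() {
        while (!handlers.empty()) {
            std::function<void()> h = std::move(handlers.front());
            handlers.pop_front();
            h();  // handlers run strictly one after another
        }
    }
};

// Hypothetical connection that mimics the send_q chaining from the question.
struct Connection {
    MiniLoop& loop;
    std::deque<std::string> send_q;
    std::vector<std::string> wire;  // what "reached the socket", in order
    int in_flight = 0;
    int max_in_flight = 0;

    explicit Connection(MiniLoop& l) : loop(l) {}

    void send_msg(std::string msg) {
        bool write_in_progress = !send_q.empty();
        send_q.push_back(std::move(msg));
        if (!write_in_progress) {
            loop.post([this] { async_send_front_of_q(); });
        }
    }

    // Simulated async_write: start the operation now, complete it later.
    void async_send_front_of_q() {
        ++in_flight;
        if (in_flight > max_in_flight) max_in_flight = in_flight;
        loop.post([this] { handle_send(); });
    }

    // Completion handler: pop the sent message and chain the next write.
    void handle_send() {
        --in_flight;
        wire.push_back(send_q.front());
        send_q.pop_front();
        if (!send_q.empty()) async_send_front_of_q();
    }
};

// Queues three messages, drains the loop, and reports the peak number of
// simultaneously outstanding writes.
int demo_max_in_flight() {
    MiniLoop loop;
    Connection c(loop);
    c.send_msg("a");
    c.send_msg("b");
    c.send_msg("c");
    loop.run();
    assert(c.wire.size() == 3 && c.wire[0] == "a" && c.wire[2] == "c");
    return c.max_in_flight;
}
```

Calling demo_max_in_flight() drains the queue in order and never has more than one simulated write pending. If several threads called io_service::run(), this guarantee would no longer come for free and a strand would be needed again.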

Related

How to integrate Cap'n'Proto threads with non Cap'n'Proto threads?

How do I properly integrate Cap'n'Proto client usage with surrounding multi-threaded code? The Cap'n'Proto docs say that each Cap'n'Proto interface is single-threaded with a dedicated event loop. Additionally they recommend using Cap'n'Proto to communicate between threads. However, the docs don't seem to describe how non-Cap'n'Proto threads (e.g. the UI loop) could integrate with that. Even if I could integrate Cap'n'Proto event loops with the UI loop in some places, other models like thread pools (Android Binder, global libdispatch queues) seem more challenging.
I think the solution is to cache the client thread's executor in a synchronized place where the non-capnp thread can access it.
I believe though that the calling thread always needs to be on its own event loop as well to marry them but I just want to make sure that's actually the case. My initial attempt to do that in a simple unit test is failing. I created a KjLooperEventPort class (following the structure for the node libuv adapter) to marry KJ & ALooper on Android.
Then my test code is:
TEST(KjLooper, CrossThreadPromise) {
std::thread::id kjThreadId;
ConditionVariable<const kj::Executor*> executorCv{nullptr};
ConditionVariable<std::pair<bool, kj::Promise<void>>> looperThreadFinished{false, nullptr};
std::thread looperThread([&] {
auto looper = android::newLooper();
android::KjLooperEventPort kjEventPort{looper};
kj::WaitScope waitScope(kjEventPort.getKjLoop());
auto finished = kj::newPromiseAndFulfiller<void>();
looperThreadFinished.constructValueAndNotifyAll(true, kj::mv(finished.promise));
executorCv.waitNotValue(nullptr);
auto executor = executorCv.readCopy();
kj::Promise<void> asyncPromise = executor->executeAsync([&] {
ASSERT_EQ(std::this_thread::get_id(), kjThreadId);
});
asyncPromise = asyncPromise.then([tid = std::this_thread::get_id(), kjThreadId, &finished] {
std::cerr << "Running promise completion on original thread\n";
ASSERT_NE(tid, kjThreadId);
ASSERT_EQ(std::this_thread::get_id(), tid);
std::cerr << "Fulfilling\n";
finished.fulfiller->fulfill();
std::cerr << "Fulfilled\n";
});
asyncPromise.wait(waitScope);
});
std::thread kjThread([&] {
kj::Promise<void> finished = kj::NEVER_DONE;
looperThreadFinished.wait([&](auto& promise) {
finished = kj::mv(promise.second);
return promise.first;
});
auto ioContext = kj::setupAsyncIo();
kjThreadId = std::this_thread::get_id();
executorCv.setValueAndNotifyAll(&kj::getCurrentThreadExecutor());
finished.wait(ioContext.waitScope);
});
looperThread.join();
kjThread.join();
}
This crashes when fulfilling the promise back to the kj thread:
terminating with uncaught exception of type kj::ExceptionImpl: kj/async.c++:1269: failed: expected threadLocalEventLoop == &loop || threadLocalEventLoop == nullptr; Event armed from different thread than it was created in. You must use Executor to queue events cross-thread.
Most Cap'n Proto RPC and KJ Promise-related objects can only be accessed in the thread that created them. Resolving a promise cross-thread, for example, will fail, as you saw.
Some ways you could solve this include:
You can use kj::Executor to schedule code to run on a different thread's event loop. The calling thread does NOT need to be a KJ event loop thread if you use executeSync() -- however, this function blocks until the other thread has had a chance to wake up and execute the function. I'm not sure how well this will perform in practice; if it's a problem, there is probably room to extend the Executor interface to handle this use case more efficiently.
You can communicate between threads by passing messages over pipes or socketpairs (but sending big messages this way would involve a lot of unnecessary copying to/from the socket buffer).
You could signal another thread's event loop to wake up using a pipe, signal, or (on Linux) eventfd, then have it look for messages in a mutex-protected queue. (But kj::Executor mostly obsoletes this technique.)
It's possible, though not easy, to adapt KJ's event loop to run on top of other event loops, so that both can run in the same thread. For example, node-capnp adapts KJ to run on top of libuv.
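As a rough illustration of the mutex-protected queue technique, and of what a blocking cross-thread call looks like from the caller's side, here is a sketch that uses only the standard library. It is not the KJ API: LoopThread and execute_sync are hypothetical names, a condition variable stands in for the pipe or eventfd wake-up, and the task is restricted to returning int for brevity.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-threaded event loop; NOT the KJ API.
class LoopThread {
public:
    LoopThread() : worker_([this] { run(); }) {}

    ~LoopThread() {
        post([this] { stop_ = true; });  // stop_ is only touched on the loop thread
        worker_.join();
    }

    // Blocks the caller until fn has run on the loop thread, then returns its
    // result. The caller does not need an event loop of its own.
    int execute_sync(std::function<int()> fn) {
        auto result = std::make_shared<std::promise<int>>();
        std::future<int> done = result->get_future();
        post([result, fn] { result->set_value(fn()); });
        return done.get();
    }

    std::thread::id id() const { return worker_.get_id(); }

private:
    // Enqueue a task under the mutex, then wake the loop thread.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(task));
        }
        wake_.notify_one();
    }

    void run() {
        while (!stop_) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return !queue_.empty(); });
                task = std::move(queue_.front());
                queue_.pop_front();
            }
            task();  // run outside the lock so other threads can keep posting
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> queue_;
    bool stop_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};

// Returns true if the task ran on the loop thread rather than on the caller.
bool demo_runs_on_loop_thread() {
    LoopThread loop;
    std::thread::id seen;
    int value = loop.execute_sync([&] {
        seen = std::this_thread::get_id();
        return 42;
    });
    return value == 42 && seen == loop.id() && seen != std::this_thread::get_id();
}
```

The trade-off is the one mentioned above: the calling thread is parked until the loop thread gets around to the task, so this suits occasional calls better than high-frequency traffic.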

boost io_service non-blocking parallel execution?

I have run into a dilemma whilst using boost::asio and boost::asio::io_service.
My classes wrap around the async client example provided by boost for socket connections.
I use another class which encapsulates:
class service_controller
{
...
/// IO service
boost::asio::io_service __io_service;
/// Endpoint Resolver
boost::asio::ip::tcp::resolver::query __query;
/// Resolution for TCP
boost::asio::ip::tcp::resolver __resolver;
}
So, when I construct my clients, the constructor takes references:
asio_service_client (
boost::asio::ip::tcp::resolver::query & query,
boost::asio::ip::tcp::resolver & resolver,
boost::asio::io_service & io_service
);
Everything works fine, but I have to call
io_service.run()
at the end, after creating all my clients.
If I give each client its own separate io_service object, I essentially lose the asynchronous nature of the I/O, as each one will block until it's finished.
Therefore, I decided to form a type of group, by making all client objects use the same io_service.
io_service::poll() does not appear to work at all (nothing happens), nor does io_service::run_one().
In fact, the only thing that appears to work, is:
// with a callback => the callback will run once finished
rapp::services::asio_service_client c1( ctrl.Query(), ctrl.Resolver(), ctrl.Services() );
// without a callback => asio_service_client::handle_reply will run once finished
rapp::services::asio_service_client c2 ( ctrl.Query(), ctrl.Resolver(), ctrl.Services() );
rapp::services::asio_service_client c3 ( ctrl.Query(), ctrl.Resolver(), ctrl.Services() );
rapp::services::asio_service_client c4 ( ctrl.Query(), ctrl.Resolver(), ctrl.Services() );
// Run services c1, c2
c1.Run( header, post,
[&]( boost::asio::streambuf & buffer )
{
std::string raw ( ( std::istreambuf_iterator<char>( &buffer ) ), std::istreambuf_iterator<char>() );
std::cout << raw << std::endl;
});
c2.Run( header, post );
ctrl.Services().run();
/// Run remaining services ( c3, c4 )
c3.Run( header, post );
c4.Run( header, post );
ctrl.Services().reset();
ctrl.Services().run();
Unless of course, if I request a group to be run altogether (e.g., ask for c1, c2, c3 and c4 Run).
Is there some way, or some class pattern, where I could automate a queue, where I create objects, add them, and they are run asynchronously? Ideally with threads, but without will also work.
Some kind of a stack, where whilst I add objects, they are asynchronously executed, as they are added.
If I try something like:
Scheduler::Execute ( asio_service_client & client )
{
client.Run( ... )
io_service.reset();
io_service.run();
}
I will reset previous running services, and start all over, which is not what I want.
My only obvious option, is to either accept and assign a separate io_service for each added asio_service_client, or force them to be added all together in a job group, which is then executed?
The other solution I can think of, is using threads, thereby, each asio_service_client will run in its own thread, and thus won't block other asio_service_clients, executing in parallel?
You probably want to share a single io_service instance and keep an io_service::work object alive on it, so that it stays active even when no client currently has any pending async operations:
boost::asio::io_service io_service;
auto work = boost::make_shared<boost::asio::io_service::work>(io_service);
// any client can post its asynchronous operations on this service object, from any thread
// completion handlers will be invoked on any thread that runs `io_service.run()`
// once you want the `io_service` to empty the queue and return:
work.reset();
// now `run()` will return when it runs out of queued tasks
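To show why the work object changes when run() returns, here is a small model of the idea built only on the standard library. MiniService, add_work and remove_work are hypothetical names, not Boost.Asio calls; the sketch only mimics the documented behaviour that run() keeps waiting while outstanding work exists and returns once the queue is empty and the work is gone.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for io_service; NOT Boost.Asio.
class MiniService {
public:
    void post(std::function<void()> handler) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(handler));
        }
        wake_.notify_one();
    }

    // Plays the role of constructing an io_service::work object.
    void add_work() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++work_;
    }

    // Plays the role of destroying it (work.reset()).
    void remove_work() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            --work_;
        }
        wake_.notify_all();
    }

    // Runs handlers until the queue is empty AND no work is outstanding.
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return !queue_.empty() || work_ == 0; });
            if (queue_.empty()) return;  // woken with nothing queued: work_ is 0
            std::function<void()> handler = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();
            handler();  // run the handler without holding the lock
            lock.lock();
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> queue_;
    int work_ = 0;
};

// Starts the runner thread before any handler exists, posts three handlers
// afterwards, then releases the work. Returns how many handlers ran.
int demo_work_guard() {
    MiniService service;
    std::atomic<int> ran{0};
    service.add_work();
    std::thread runner([&] { service.run(); });
    for (int i = 0; i < 3; ++i) {
        service.post([&] { ++ran; });
    }
    service.remove_work();
    runner.join();
    return ran.load();
}
```

Without the add_work() call, the runner thread could find an empty queue and return before the first handler is posted, which is the situation that forced the reset()/run() pairs in the question.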

ZMQ C++ Event Loop Within Class

My overall goal in using ZMQ is to avoid having to get into the weeds of asynchronous message passing; and ZMQ seemed like a portable and practical solution. Most of the ZeroMQ docs, however, like this, and many of the other zmq examples I have Googled upon are based on the helloworld.c format. That is, they are all simple procedural code inside int main(){}.
My problem is that I want to "embed" a zmq "listener" inside a c++ singleton-like class. I want to "listen" for messages and then process them. I'm planning on using zmq's PUSH -> PULL sockets, on the off chance that matters. What I cannot figure out is how to have an internal "event loop".
class foomgr {
public:
    static foomgr& get_foomgr();
    // ...
private:
    foomgr();
    foomgr(const foomgr&);
    // ...
    void listener_() {
        // EVENT LOOP HERE
        // RECV and PROCESS ZMQ MSGS
        // while(true) DOES NOT WORK HERE
    }
    // ...
    zmq::context_t zmqcntx_;
    zmq::socket_t zmqsock_;
    const int zmqsock_linger_ = 1000;
    // ....
};
I obviously cannot use the while(true) construct in listener_, since wherever I call it from will block. Since one of the advantages of using ZMQ is that I do not have to manage "listener" threads myself, it seems silly to have to figure out how to create my own thread to wrap listener_ in. I'm lost for solutions.
Note: I'm a c++ newb, so what might be obvious to most is not to me. Also, I'm trying to use generic "words", not library or language specific to avoid confusion. The code is built with -std=c++11, so those constructs are fine.
The ZMQ C++ library does not implement a listener pattern for message polling. It leaves that task up to you to wrap in your own classes. It does support a non-blocking mode of polling for new messages, however.
So using the right code you can wrap it up in a small loop in a non-blocking fashion.
See this Polling Example here on GitHub, written in C++. Note that it polls two sockets, so you'll need to modify it a little to remove the extra code.
The important part that you'll need to wrap inside your own observer implementation is below:
zmq::message_t message;
zmq::poll (&items [0], 2, -1);
if (items [0].revents & ZMQ_POLLIN) {
receiver.recv(&message);
// Process task
}
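One way to wrap such a polling loop in a class is to let the class own a background thread that polls with a finite timeout and checks a stop flag between polls. The sketch below uses only the standard library: Listener, Poll and Handler are hypothetical names, and the injected poll function merely stands in for a zmq::poll call with a timeout followed by recv. In a real version the socket would be created and used inside loop(), so that it never leaves the listener thread.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical wrapper: owns a thread that polls for messages until destroyed.
class Listener {
public:
    // Returns true and fills the string if a message arrived within the timeout.
    using Poll = std::function<bool(std::string&, std::chrono::milliseconds)>;
    using Handler = std::function<void(const std::string&)>;

    Listener(Poll poll, Handler handler)
        : poll_(std::move(poll)),
          handler_(std::move(handler)),
          thread_([this] { loop(); }) {}

    ~Listener() {
        stop_ = true;    // noticed after the current poll times out or returns
        thread_.join();
    }

private:
    void loop() {
        std::string msg;
        while (!stop_) {
            if (poll_(msg, std::chrono::milliseconds(10))) {
                handler_(msg);
            }
        }
    }

    Poll poll_;
    Handler handler_;
    std::atomic<bool> stop_{false};
    std::thread thread_;  // last member: starts after everything else is initialised
};

// Feeds three fake messages through a Listener and returns how many were handled.
int demo_listener() {
    std::vector<std::string> pending = {"a", "b", "c"};  // touched only by the listener thread
    std::atomic<int> handled{0};
    {
        Listener listener(
            [&](std::string& out, std::chrono::milliseconds timeout) -> bool {
                if (pending.empty()) {
                    std::this_thread::sleep_for(timeout);  // simulate a poll timeout
                    return false;
                }
                out = pending.front();
                pending.erase(pending.begin());
                return true;
            },
            [&](const std::string&) { ++handled; });
        while (handled < 3) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }  // the destructor sets the stop flag and joins the thread
    return handled.load();
}
```

The finite timeout matters: with an infinite timeout (the -1 in the snippet above) the loop would only notice the stop flag after the next message arrives.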
ZMQ sockets are not thread-safe by design (in all versions so far). In fact, ZMQ stresses:
Do not use or close sockets except in the thread that created them.
PERIOD.
Callbacks shouldn't be used, because the thread invoking the callback will certainly be different from the thread that created the socket, which is forbidden.
You may find zmqHelper useful: it is a small library (only two classes and a few functions) that makes ZMQ easier to use from C++ and guarantees that threads can't share sockets.
In the example sections, you will find how to do the most frequent tasks.
Hope it helps.
Code snippet: polling using zmqHelper in a ROUTER-DEALER broker.
zmq::context_t theContext {1}; // 1 I/O thread in the context
SocketAdaptor< ZMQ_ROUTER > frontend_ROUTER {theContext};
SocketAdaptor< ZMQ_DEALER > backend_DEALER {theContext};
frontend_ROUTER.bind ("tcp://*:8000");
backend_DEALER.bind ("tcp://*:8001");
while (true) {
std::vector<std::string> lines;
//
// wait (blocking poll) for data in any socket
//
std::vector< zmqHelper::ZmqSocketType * > list
= { frontend_ROUTER.getZmqSocket(), backend_DEALER.getZmqSocket() };
zmqHelper::ZmqSocketType * from = zmqHelper::waitForDataInSockets ( list );
//
// there is data, where is it from?
//
if ( from == frontend_ROUTER.getZmqSocket() ) {
// from frontend, read ...
frontend_ROUTER.receiveText (lines);
// ... and resend
backend_DEALER.sendText( lines );
}
else if ( from == backend_DEALER.getZmqSocket() ) {
// from backend, read ...
backend_DEALER.receiveText (lines);
// ... and resend
frontend_ROUTER.sendText( lines );
}
else if ( from == nullptr ) {
std::cerr << "Error in poll ?\n";
}
} // while (true)

Keeping two cross-communicating asio io_service objects busy

I am using boost::asio with multiple io_services to keep different forms of blocking I/O separate. E.g. I have one io_service for blocking file I/O, and another for long-running CPU-bound tasks (and this could be extended to a third for blocking network I/O, etc.) Generally speaking I want to ensure that one form of blocking I/O cannot starve the others.
The problem I am having is that since tasks running in one io_service can post events to other io_service (e.g. a CPU-bound task may need to start a file I/O operation, or a completed file I/O operation may invoke a CPU-bound callback), I don't know how to keep both io_services running until they are both out of events.
Normally with a single I/O service, you do something like:
shared_ptr<asio::io_service> io_service (new asio::io_service);
shared_ptr<asio::io_service::work> work (
new asio::io_service::work(*io_service));
// Create worker thread(s) that call io_service->run()
io_service->post(/* some event */);
work.reset();
// Join worker thread(s)
However if I simply do this for both io_services, the one into which I did not post an initial event finishes immediately. And even if I post initial events to both, if the initial event on io_service B finishes before the task on io_service A posts a new event to B, io_service B will finish prematurely.
How can I keep io_service B running while io_service A is still processing events (because one of the queued events in service A might post a new event to B), and vice-versa, while still ensuring that both io_services exit their run() methods if they are ever both out of events at the same time?
Figured out a way to do this, so documenting it for the record in case anyone else finds this question in a search:
1. Create each of the N cross-communicating io_services, create a work object for each of them, and then start their worker threads.
2. Create a "master" io_service object which will not run any worker threads.
3. Do not allow posting events directly to the services. Instead, create accessor functions to the io_services which will:
   a. Create a work object on the master io_service.
   b. Wrap the callback in a function that runs the real callback, then deletes the work.
   c. Post this wrapped callback instead.
4. In the main flow of execution, once all of the N io_services have started and you have posted work to at least one of them, call run() on the master io_service.
5. When the master io_service's run() method returns, delete all of the initial work on the N cross-communicating io_services, and join all worker threads.
Having the master io_service's thread own work on each of the other io_services ensures that they will not terminate until the master io_service runs out of work. Having each of the other io_services own work on the master io_service for every posted callback ensures that the master io_service will not run out of work until every one of the other io_services no longer has any posted callbacks left to process.
An example (could be encapsulated in a class):
#include <vector>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/function.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>

boost::shared_ptr<boost::asio::io_service> master_io_service;

void RunWorker(boost::shared_ptr<boost::asio::io_service> io_service) {
    io_service->run();
}

// Runs the real callback, then releases the work held on the master service.
void RunCallbackAndDeleteWork(boost::function<void()> callback,
                              boost::asio::io_service::work* work) {
    callback();
    delete work;
}

// All new posted callbacks must come through here, rather than being posted
// directly to the io_service object.
void PostToService(boost::shared_ptr<boost::asio::io_service> io_service,
                   boost::function<void()> callback) {
    io_service->post(boost::bind(
        &RunCallbackAndDeleteWork, callback,
        new boost::asio::io_service::work(*master_io_service)));
}

int main() {
    std::vector<boost::shared_ptr<boost::asio::io_service> > io_services;
    std::vector<boost::shared_ptr<boost::asio::io_service::work> > initial_work;
    boost::thread_group worker_threads;
    master_io_service.reset(new boost::asio::io_service);
    const int kNumServices = X;           // placeholder: number of services
    const int kNumWorkersPerService = Y;  // placeholder: threads per service
    for (int i = 0; i < kNumServices; ++i) {
        boost::shared_ptr<boost::asio::io_service> io_service(new boost::asio::io_service);
        io_services.push_back(io_service);
        initial_work.push_back(boost::shared_ptr<boost::asio::io_service::work>(
            new boost::asio::io_service::work(*io_service)));
        for (int j = 0; j < kNumWorkersPerService; ++j) {
            worker_threads.create_thread(boost::bind(&RunWorker, io_service));
        }
    }
    // Use PostToService to start initial task(s) on at least one of the services
    master_io_service->run();
    // At this point, there is no real work left in the services, only the work
    // objects in the initial_work vector.
    initial_work.clear();
    worker_threads.join_all();
    return 0;
}
The HTTP server example 2 does something similar that you may find useful. It uses the concept of an io_service pool that retains vectors of shared_ptr<boost::asio::io_service> and a shared_ptr<boost::asio::io_service::work> for each io_service. It uses a thread pool to run each service.
The example uses round-robin scheduling to dole out work to the I/O services; I don't think that will apply in your case, since you have specific tasks for io_service A and io_service B.

Boost Asio callback doesn't get called

I'm using Boost.Asio for network operations. They have to remain pretty low level (and actually can, since there are no complex data structures or anything), because I can't afford the luxury of serialization overhead (and the libs I found that offered good enough performance seemed badly suited to my case).
The problem is with an async write I'm doing from the client (in Qt, but that should probably be irrelevant here). The callback specified in the async_write doesn't get called, ever, and I'm at a complete loss as to why. The code is:
void SpikingMatrixClient::addMatrix() {
std::cout << "entered add matrix" << std::endl;
int action = protocol::Actions::AddMatrix;
int matrixSize = this->ui->editNetworkSize->text().toInt();
std::ostream out(&buf);
out.write(reinterpret_cast<const char*>(&action), sizeof(action));
out.write(reinterpret_cast<const char*>(&matrixSize), sizeof(matrixSize));
boost::asio::async_write(*connection.socket(), buf.data(),
boost::bind(&SpikingMatrixClient::onAddMatrix, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
}
which calls the first write. The callback is
void SpikingMatrixClient::onAddMatrix(const boost::system::error_code& error, size_t bytes_transferred) {
std::cout << "entered onAddMatrix" << std::endl;
if (!error) {
buf.consume(bytes_transferred);
requestMatrixList();
} else {
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
}
The callback never gets called, even though the server receives all the data. Can anyone think of any reason why it might be doing that?
P.S. There was a wrapper for that connection, and yes there will probably be one again. Ditched it a day or two ago because I couldn't find the problem with this callback.
As suggested, posting a solution I found to be the most suitable (at least for now).
The client application is [being] written in Qt, and I need the IO to be async. For the most part, the client receives calculation data from the server application and has to render various graphical representations of them.
Now, there are some key aspects to consider:
1. The GUI has to be responsive; it should not be blocked by the IO.
2. The client can be connected / disconnected.
3. The traffic is pretty intense: data gets sent / refreshed to the client every few secs, and the client has to remain responsive (as per item 1).
As per the Boost.Asio documentation,
Multiple threads may call io_service::run() to set up a pool of threads from which completion handlers may be invoked.
Note that all threads that have joined an io_service's pool are considered equivalent, and the io_service may distribute work across them in an arbitrary fashion.
Note that io_service.run() blocks until the io_service runs out of work.
With this in mind, the clear solution is to run io_service.run() from another thread. The relevant code snippets are
void SpikingMatrixClient::connect() {
Ui::ConnectDialog ui;
QDialog *dialog = new QDialog;
ui.setupUi(dialog);
if (dialog->exec()) {
QString host = ui.lineEditHost->text();
QString port = ui.lineEditPort->text();
connection = TcpConnection::create(io);
boost::system::error_code error = connection->connect(host, port);
if (!error) {
io = boost::shared_ptr<boost::asio::io_service>(new boost::asio::io_service);
work = boost::shared_ptr<boost::asio::io_service::work>(new boost::asio::io_service::work(*io));
io_threads.create_thread(boost::bind(&SpikingMatrixClient::runIo, this, io));
}
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
}
for connecting and starting IO, where:
work is a private boost::shared_ptr to the boost::asio::io_service::work object created for io,
io is a private boost::shared_ptr to a boost::asio::io_service,
connection is a boost::shared_ptr to my connection wrapper class, and the connect() call uses a resolver etc. to connect the socket (there are plenty of examples of that around),
and io_threads is a private boost::thread_group.
Surely it could be shortened with some typedefs if needed.
TcpConnection is my own connection wrapper implementation, which sort of lacks functionality for now, and I suppose I could move the whole thread thing into it when it gets reinstated. This snippet should be enough to get the idea anyway...
The disconnecting part goes like this:
void SpikingMatrixClient::disconnect() {
work.reset();
io_threads.join_all();
boost::system::error_code error = connection->disconnect();
if (!error) {
connection.reset();
}
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
the work object is destroyed, so that the io_service can run out of work eventually,
the threads are joined, meaning that all work gets finished before disconnecting, thus data shouldn't get corrupted,
the disconnect() calls shutdown() and close() on the socket behind the scenes, and if there's no error, destroys the connection pointer.
Note, that there's no error handling in case of an error while disconnecting in this snippet, but it could very well be done, either by checking the error code (which seems more C-like), or throwing from the disconnect() if the error code within it represents an error after trying to disconnect.
I encountered a similar problem (callbacks not fired) but the circumstances are different from this question (io_service had jobs but still would not fire the handlers ). I will post this anyway and maybe it will help someone.
In my program, I set up an async_connect() then followed by io_service.run(), which blocks as expected.
async_connect() goes to on_connect_handler() as expected, which in turn fires async_write().
on_write_complete_handler() does not fire, even though the other end of the connection has received all the data and has even sent back a response.
I discovered that it was caused by my placing program logic in on_connect_handler(). Specifically, after the connection was established and after I called async_write(), I entered an infinite loop to perform arbitrary logic, which never let on_connect_handler() return. I assume this prevents the io_service from executing other handlers, even when their conditions are met, because it is stuck here. (I had many misconceptions, and thought that io_service would automagically spawn threads for each async_x() call.)
Hope that helps.
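The effect described in this last answer can be reproduced without any networking. The sketch below is a simplified model, not Boost.Asio: a plain deque of handlers stands in for an io_service run by a single thread, and the handler names in the log are illustrative. It shows that a handler queued from inside another handler cannot start until the first one returns.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Runs a tiny single-threaded handler queue and returns the order of events.
std::string demo_handler_order() {
    std::deque<std::function<void()>> ready;  // stand-in for the io_service queue
    std::string log;

    // The "connect" handler starts a write whose completion is queued right
    // away, then keeps doing its own work before returning.
    ready.push_back([&] {
        log += "connect:start ";
        ready.push_back([&] { log += "write_done "; });
        log += "connect:end ";
    });

    // Stand-in for io_service::run() on one thread: one handler at a time.
    while (!ready.empty()) {
        std::function<void()> handler = std::move(ready.front());
        ready.pop_front();
        handler();
    }
    return log;
}
```

Here "write_done" is always logged after "connect:end". If the first handler looped forever instead of returning, the completion handler would stay queued indefinitely, which matches the behaviour observed above.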