Trying to understand Boost.Asio custom service implementation - C++

I'm thinking about writing a custom Asio service on top of an existing proprietary 3rd party networking protocol that we are currently using.
According to the Highscore Asio guide, you need to implement three classes to create a custom Asio service:
A class derived from boost::asio::basic_io_object representing the new I/O object.
A class derived from boost::asio::io_service::service representing a service that is registered with the I/O service and can be accessed from the I/O object.
A class not derived from any other class representing the service implementation.
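In skeleton form, I picture the three classes roughly like this (my own placeholder names, error handling omitted):
#include <boost/asio.hpp>
#include <boost/shared_ptr.hpp>

// (3) The service implementation: a plain class wrapping the proprietary API.
//     This is where I'd run the 3rd party event-loop in a worker thread.
class protocol_impl
{
};

// (2) The service: one instance is registered per io_service.
class protocol_service : public boost::asio::io_service::service
{
public:
    static boost::asio::io_service::id id; // key used for service registration

    typedef boost::shared_ptr<protocol_impl> implementation_type;

    explicit protocol_service(boost::asio::io_service& io_service)
        : boost::asio::io_service::service(io_service) {}

    // Called by basic_io_object to manage per-object implementations.
    void construct(implementation_type& impl) { impl.reset(new protocol_impl); }
    void destroy(implementation_type& impl) { impl.reset(); }

private:
    void shutdown_service() {} // invoked when the owning io_service is destroyed
};

boost::asio::io_service::id protocol_service::id;

// (1) The I/O object: the handle users instantiate and call operations on.
class protocol_object : public boost::asio::basic_io_object<protocol_service>
{
public:
    explicit protocol_object(boost::asio::io_service& io_service)
        : boost::asio::basic_io_object<protocol_service>(io_service) {}

    // Async operations would forward to the service, e.g.:
    // get_service().async_foo(get_implementation(), handler);
};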
The network protocol implementation already provides asynchronous operations and has a (blocking) event-loop. So I thought I would put it into my service implementation class and run the event-loop in an internal worker thread. So far so good.
Looking at some examples of custom services, I noticed that the service classes spawn their own internal threads (in fact they instantiate their own internal io_service instances). For example:
The Highscore page provides a directory monitor example. It is essentially a wrapper around inotify. The interesting classes are inotify/basic_dir_monitor_service.hpp and inotify/dir_monitor_impl.hpp. Dir_monitor_impl handles the actual interaction with inotify, which is blocking, and therefore runs in a background thread. I agree with that. But the basic_dir_monitor_service also has an internal worker thread, and all it seems to be doing is shuffling requests between main's io_service and the dir_monitor_impl. I played around with the code, removed the worker thread in basic_dir_monitor_service and instead posted requests directly to the main io_service, and the program still ran as before.
In Asio's custom logger service example, I've noticed the same approach. The logger_service spawns an internal worker thread to handle the logging requests. I haven't had time to play around with that code, but I think it should be possible to post these requests directly to the main io_service as well.
What is the advantage of having these "intermediary workers"? Couldn't you post all work to the main io_service all the time? Did I miss some crucial aspect of the Proactor pattern?
I should probably mention that I'm writing software for an underpowered single-core embedded system. Having these additional threads in place just seems to impose unnecessary context switches which I'd like to avoid if possible.

In short, consistency. The services attempt to meet user expectations set forth by the services Boost.Asio provides.
Using an internal io_service provides a clear separation of ownership and control of handlers. If a custom service posts its internal handlers into the user's io_service, then execution of the service's internal handlers becomes implicitly coupled with the user's handlers. Consider how this would impact user expectations with the Boost.Asio Logger Service example:
The logger_service writes to the file stream within a handler. Thus, a program that never processes the io_service event loop, such as one that only uses the synchronous API, would never have log messages written.
The logger_service would no longer be thread-safe, potentially invoking undefined behavior if the io_service is processed by multiple threads.
The lifetime of the logger_service's internal operations is constrained by that of the io_service. For example, when a service's shutdown_service() function is invoked, the lifetime of the owning io_service has already ended. Hence, messages could not be logged via logger_service::log() within shutdown_service(), as it would attempt to post an internal handler into the io_service whose lifetime has already ended.
The user may no longer assume a one-to-one mapping between an operation and handler. For example:
boost::asio::io_service io_service;
debug_stream_socket socket(io_service);
boost::asio::async_connect(socket, ..., &connect_handler);
io_service.poll();
// Can no longer assume connect_handler has been invoked.
In this case, io_service.poll() may invoke the handler internal to the logger_service, rather than connect_handler().
Furthermore, these internal threads attempt to mimic the behavior used internally by Boost.Asio itself:
The implementation of this library for a particular platform may make use of one or more internal threads to emulate asynchronicity. As far as possible, these threads must be invisible to the library user.
Directory Monitor example
In the directory monitor example, an internal thread is used to prevent indefinitely blocking the user's io_service while waiting for an event. Once an event has occurred, the completion handler is ready to be invoked, so the internal thread posts the user handler into the user's io_service for deferred invocation. This implementation emulates asynchronicity with an internal thread that is mostly invisible to the user.
In detail: when an asynchronous monitor operation is initiated via dir_monitor::async_monitor(), a basic_dir_monitor_service::monitor_operation is posted into the internal io_service. When invoked, this operation invokes dir_monitor_impl::popfront_event(), a potentially blocking call. Hence, if the monitor_operation were posted into the user's io_service, the user's thread could be indefinitely blocked. Consider the effect on the following code:
boost::asio::io_service io_service;
boost::asio::dir_monitor dir_monitor(io_service);
dir_monitor.add_directory(dir_name);
// Post monitor_operation into io_service.
dir_monitor.async_monitor(...);
io_service.post(&user_handler);
io_service.run();
In the above code, if io_service.run() invokes monitor_operation first, then user_handler() will not be invoked until dir_monitor observes an event on the dir_name directory. Therefore, the dir_monitor service's implementation would not behave in the consistent manner that most users expect from other services.
Asio Logger Service
The use of an internal thread and io_service:
Mitigates the overhead of logging on the user's thread(s) by performing potentially blocking or expensive calls within the internal thread.
Guarantees the thread-safety of std::ofstream, as only the single internal thread writes to the stream. If logging were done directly within logger_service::log() or if logger_service posted its handlers into the user's io_service, then explicit synchronization would be required for thread-safety. Other synchronization mechanisms may introduce greater overhead or complexity into the implementation.
Allows services to log messages within shutdown_service(). During destruction, the io_service will:
Shutdown each of its services.
Destroy all uninvoked handlers that were scheduled for deferred invocation in the io_service or any of its associated strands.
Destroy each of its services.
As the lifetime of the user's io_service has ended, its event queue is neither being processed nor can additional handlers be posted. By having its own internal io_service that is processed by its own thread, logger_service enables other services to log messages during their shutdown_service().
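Stripped to its essentials, the logger_service pattern looks something like this sketch (logging_core, its members, and the log file name are placeholders, not the example's verbatim code):
#include <fstream>
#include <string>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/scoped_ptr.hpp>
#include <boost/thread.hpp>

class logging_core
{
public:
    logging_core()
        : work_(new boost::asio::io_service::work(io_service_)),
          thread_(boost::bind(&boost::asio::io_service::run, &io_service_)),
          file_("log.txt") // placeholder file name
    {}

    ~logging_core()
    {
        work_.reset();  // allow run() to exit once the queue drains
        thread_.join();
    }

    // Safe to call from any thread, even from another service's
    // shutdown_service(): the handler is queued on the private io_service,
    // not the user's.
    void log(const std::string& line)
    {
        io_service_.post(boost::bind(&logging_core::do_log, this, line));
    }

private:
    // Only ever runs on the single internal thread, so the ofstream
    // needs no explicit synchronization.
    void do_log(std::string line) { file_ << line << std::endl; }

    boost::asio::io_service io_service_; // private: never the user's io_service
    boost::scoped_ptr<boost::asio::io_service::work> work_;
    boost::thread thread_;
    std::ofstream file_;
};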
Additional Considerations
When implementing a custom service, here are a few points to consider:
Block all signals on internal threads.
Never invoke the user's code directly.
How to track and post user handlers when an implementation is destroyed.
Resource(s) owned by the service that are shared between the service's implementations.
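For the first point, a common POSIX technique is to mask all signals around thread creation so the worker inherits a fully blocked mask. A sketch, where run_event_loop stands in for the service's internal loop:
#include <csignal>
#include <pthread.h>
#include <boost/thread.hpp>

void run_event_loop(); // placeholder: the service's internal worker function

// Block every signal; the new thread inherits the blocked mask, so process
// signals are only ever delivered to the user's threads.
sigset_t new_mask, old_mask;
sigfillset(&new_mask);
pthread_sigmask(SIG_BLOCK, &new_mask, &old_mask);

boost::thread worker(&run_event_loop); // starts with all signals blocked

// Restore the original mask on the spawning thread.
pthread_sigmask(SIG_SETMASK, &old_mask, 0);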
For the last two points, the dir_monitor I/O object exhibits behavior that users may not expect. As the single thread within the service invokes a blocking operation on a single implementation's event queue, it effectively blocks operations that could potentially complete immediately for their respective implementation:
boost::asio::io_service io_service;
boost::asio::dir_monitor dir_monitor1(io_service);
dir_monitor1.add_directory(dir_name1);
dir_monitor1.async_monitor(&handler_A);
boost::asio::dir_monitor dir_monitor2(io_service);
dir_monitor2.add_directory(dir_name2);
dir_monitor2.async_monitor(&handler_B);
// ... Add file to dir_name2.
{
// Use scope to enforce lifetime.
boost::asio::dir_monitor dir_monitor3(io_service);
dir_monitor3.add_directory(dir_name3);
dir_monitor3.async_monitor(&handler_C);
}
io_service.run();
Although the operations associated with handler_B() (success) and handler_C() (aborted) would not block, the single thread in basic_dir_monitor_service is blocked waiting for a change to dir_name1.

Related

How to implement robust, leak-free session object destruction in Boost.ASIO based applications?

I have a WebSocket server done with Boost.ASIO and Boost.Beast. It follows the idiomatic ASIO design: session (connection) objects own the communication socket and derive from std::enable_shared_from_this. Async completion handlers capture a std::shared_ptr to self, keeping the object alive while there are pending operations, and the objects get destructed automatically when the chain of async ops ends. The io_context runs on a single thread, so everything is in an implicit strand.
All this is fairly simple when there's only one chain of async handlers. The session objects I have contain an additional TCP socket and a timer. Read operations are concurrently pending on two sockets forwarding messages back and forth, while the timer runs periodically to clean up things. To kill such an object I created a destroySession method that calls cancel on all resources, and eventually the completion handlers get called with operation_aborted. When these all return without scheduling any new async op, the object gets destructed. destroySession calls are carefully placed at every location where a critical error happens that should result in session termination.
Question 1: Is there a better way to destruct such an object? With the above solution I feel like I'm back in the '90s, where I forget a delete somewhere and I get a leak...
Given that all the destroySession calls are there, is it still possible to leak objects? In some test environments I see 1 session object in 1000 that fails to destruct. I'm thinking of a scenario like this:
websocket closure and timer expiry happens at the same time
websocket completion handler gets invoked, timer handler enqueued
websocket completion handler cancels everything
timer expiry handler gets called (seeing no error) and reschedules the timeout
timer cancel handler gets invoked and simply returns, object remains alive (by the timer)
Is this scenario plausible?
Question 2: After calling cancel on a timer/socket, can ASIO invoke an already enqueued completion handler with a status other than operation_aborted?
Nice description. Even though code is missing, I have a very good sense of both your design and your understanding of Asio. Both of which seem fine :)
First thoughts:
I kind of agree with the sentiment that destroySession might be a code smell in itself. I can't really state it for lack of details. In my code, I make sure to cancel the "complementary async chain", not just do a broad cancel of everything. And the need rarely arises outside the common case of an async timer.
Also, I'm a little worried about the vague "timer runs periodically to clean up things" - in the sketched design there is nothing to clean up, so I worry whether the things you're leaving out of the description might be causing the symptoms you're trying to explain.
The Timer Scenario
Yes, this is a plausible scenario. In fact it's a common pitfall with Asio timers:
Cancelling boost asio deadline timer safely
SUMMARY TL;DR
Cancelling a timer only cancels asynchronous operations in flight.
If you want to shutdown an asynchronous call chain, you'll have to use additional logic for that. An example is given below.
The answer goes into detail on how to trace cases like this, and also an approach to fixing it.
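Condensed, that additional logic usually amounts to an explicit stop flag checked by every handler, since a handler that was already queued with a success code can no longer be cancelled. A minimal sketch with made-up names, using the io_context API to match your Beast setup:
#include <chrono>
#include <memory>
#include <boost/asio.hpp>

class session : public std::enable_shared_from_this<session>
{
public:
    explicit session(boost::asio::io_context& io) : timer_(io) {}

    void arm_timer()
    {
        timer_.expires_after(std::chrono::seconds(5));
        auto self = shared_from_this();
        timer_.async_wait([this, self](boost::system::error_code ec) {
            // Check the flag as well as the error: the handler may already
            // have been queued with success when cancel() was called.
            if (ec == boost::asio::error::operation_aborted || stopped_)
                return;      // don't rearm; this shared_ptr ref now drops
            cleanup();       // periodic housekeeping
            arm_timer();     // rearm only while the session is alive
        });
    }

    void destroySession()
    {
        stopped_ = true;     // queued handlers see this and bail out
        timer_.cancel();
        // ... also cancel/close the websocket and TCP socket here ...
    }

private:
    void cleanup() {}
    boost::asio::steady_timer timer_;
    bool stopped_ = false;
};
Because everything runs on your single io_context thread (the implicit strand), a plain bool is enough; with multiple threads the flag itself would need strand protection.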

Boost ASIO - What is async

I've been doing a lot of reading, but I just cannot wrap my head around the difference between synchronous and asynchronous calls in Boost ASIO: what they are, how they work, and why to pick one over the other.
My model is a server which accepts connections and appends the new connection to a list. A different thread loops over the list and sends each registered connection data as it becomes available. Each write operation should be safe. It should have a timeout so that it cannot hang, it should not allocate arbitrarily large amounts of memory, and in general it should not cause the main application to crash.
Confusion:
How does async_accept differ from regular accept? Is a new thread allocated for each connection accepted? From the examples I've seen, it looks like after a connection is accepted, a request handler is called. This request handler must tell the acceptor to prepare to accept again. Nothing about this seems asynchronous. If the request handler hangs, then the acceptor blocks.
In the boost mailing list the OP was told to use async_write with a timer instead of regular write. In this configuration I don't see any asynchronous behaviour or why it would be recommended. From the Boost docs, async_write seems more dangerous than write because the user must not call async_write again before the first one completes.
Asynchronous calls return immediately.
That's the important bit.
Now how do you control "the next thing" that happens when the asynchronous operation has completed? You got it, you supply the completion handler.
The strength of asynchrony is that you can have an IO operation (or similar) run "in the background" without necessarily incurring any thread switch or synchronization overhead. This way you can handle many asynchronous control flows at the same time, on a single thread.
Indeed asynchronous operations can be more complicated and require more thought (e.g. about lifetime of references used in the completion handler). However, when you need it, you need it.
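To make that concrete, a minimal side-by-side sketch (sock, buf, handle_data and io_service are assumed to exist already):
// Synchronous: blocks the calling thread until data has arrived.
std::size_t n = sock.read_some(boost::asio::buffer(buf));
handle_data(buf, n);

// Asynchronous: returns immediately; the "next thing" is the handler,
// which runs later from a thread executing io_service.run().
sock.async_read_some(boost::asio::buffer(buf),
    [&](const boost::system::error_code& ec, std::size_t n) {
        if (!ec) handle_data(buf, n);
    });
// ... free to initiate many more operations here, all on one thread ...
io_service.run(); // handlers are invoked from inside run()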
Boost.Asio basic overview from the official site explains it well:
http://www.boost.org/doc/libs/1_61_0/doc/html/boost_asio/overview/core/basics.html
The io_service object is what handles the multiple operations.
Calls to io_service.run() should be made carefully (that could explain the "dangerous async_write")

How to execute async operations sequentially with c++ boost::asio?

I would like to have a way to add async tasks from multiple threads and execute them sequentially in a C++ boost::asio application.
Update: I would like to make a server-to-server communication with only one persistent socket between them, and I need to sequence the multiple requests through it. It needs to keep the incoming requests in a queue, fire the top one, wait for its response, and pick up the next. I'm trying to avoid using zeromq because it needs a dedicated thread.
Update 2: OK, here is what I ended up with: the concurrent worker threads are "queued" for the use of the server-to-server socket with a simple mutex. The communication is blocking write / wait for response / read, then release the mutex. Simple, isn't it? :)
From the ASIO documentation:
Asynchronous completion handlers will only be called from threads that are currently calling io_service::run().
If you're already calling io_service::run() from multiple threads, you can wrap your async calls in an io_service::strand as described here.
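In outline, that looks like the following sketch (socket, buf, read_handler and on_queue_drained are placeholders):
boost::asio::io_service io_service;
boost::asio::io_service::strand strand(io_service);

// Handlers wrapped by the same strand are guaranteed not to run
// concurrently, even with many threads inside io_service.run().
socket.async_read_some(boost::asio::buffer(buf),
    strand.wrap(&read_handler));

// Work posted through the strand is serialized the same way.
strand.post(&on_queue_drained);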
Not sure if I understand you correctly either, but what's wrong with the approach in the chat client example? Messages are posted to the io_service thread, queued while a write is in progress, and popped/sent in the write completion handler. If more messages were added in the meantime, the write handler launches the next async write.
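Condensed, that pattern looks like this sketch, where client and write_msgs_ stand in for the example's chat_client and its deque (message is typedef'd to std::string for brevity):
#include <deque>
#include <string>
#include <boost/asio.hpp>
#include <boost/bind.hpp>

class client
{
public:
    typedef std::string message;

    explicit client(boost::asio::io_service& io_service)
        : io_service_(io_service), socket_(io_service) {}

    // May be called from any thread: defers the real work to the io_service.
    void write(const message& msg)
    {
        io_service_.post(boost::bind(&client::do_write, this, msg));
    }

private:
    void do_write(message msg) // runs on the io_service thread: no mutex
    {
        bool write_in_progress = !write_msgs_.empty();
        write_msgs_.push_back(msg);
        if (!write_in_progress)   // no async_write outstanding: start one
            start_write();
    }

    void start_write()
    {
        boost::asio::async_write(socket_,
            boost::asio::buffer(write_msgs_.front()),
            boost::bind(&client::handle_write, this,
                        boost::asio::placeholders::error));
    }

    void handle_write(const boost::system::error_code& ec)
    {
        if (ec) return;           // real code would close the socket here
        write_msgs_.pop_front();
        if (!write_msgs_.empty()) // more queued in the meantime: keep going
            start_write();
    }

    boost::asio::io_service& io_service_;
    boost::asio::ip::tcp::socket socket_;
    std::deque<message> write_msgs_;
};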
Based on your comment to Sean, I also don't understand the benefit of having multiple threads calling io_service::run, since you can only execute one async_write/async_read on one persistent socket at a time, i.e. you can only call async_write again once the handler has returned. The number of calling threads might require you to lock the queue with a mutex though.
AFAICT the benefit of having multiple threads calling io_service::run is to increase the scalability of a server that is serving multiple requests simultaneously.

using boost sockets, do I need only one io_service?

I have several connections in several different threads. I'm basically writing a base class that uses boost/asio.hpp and the TCP facilities there.
now i was reading this: http://www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/tutorial/tutdaytime1.html
it says that "All programs that use asio need to have at least one io_service object."
so should my base class have a static io_service (which means there will be only one for the whole program, and all the different threads and connections will use the same io_service object), or should each connection have its own io_service?
thanks in advance!
update:
OK, so basically what I wish to do is a class for a basic client which will have a socket in it.
For each socket I'm going to have a thread that always receives and a different thread that sometimes sends packets.
after looking in here: www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/reference/ip__tcp/socket.html (can't make it a hyperlink since I'm new here, so only one hyperlink per post) I can see that the socket class isn't entirely thread-safe...
so a few questions:
1. Based on the design I just wrote, do I need one io_service for all the sockets (meaning make it a static class member), or should I have one for each?
2. How can I make it thread-safe? Should I put it inside a "thread safe environment", meaning making a new socket class that has mutexes and such so that you can't send and receive at the same time, or do you have other suggestions?
3. Maybe I should go with an async design? (Of course each socket would have a different thread, but the sending and receiving would be on the same thread?)
Just to clarify: I'm doing a TCP client that connects to a lot of servers.
You need to decide first which style of socket communication you are going to use:
synchronous - means that all low-level operations are blocking, and typically you need a thread for the accept, and then threads (read thread or io_service) to handle each client.
asynchronous - means that all low-level operations are non-blocking, and here you only need a single thread (io_service), and you need to be able to handle callbacks when certain things happen (i.e. accepts, partial writes, result of reads etc.)
The advantage of approach 1 is that it's arguably a lot simpler to code than 2; however, I find that 2 is the most flexible. In fact, with 2 you get a single-threaded application by default (internally the event callbacks are invoked in a separate thread to the main dispatching thread), the downside of course being that your processing delays the next read/write operations... Of course you can make multi-threaded applications with approach 2, but not vice versa (i.e. single-threaded with 1) - hence the flexibility...
So, fundamentally, it all depends on the selection of style...
EDIT: updated for the new information. This is quite long, so rather than write the code (there is plenty in the boost docs), I'll simply describe what is happening for your benefit...
[main thread]
- declare an instance of io_service
- for each of the servers you are connecting to (I'm assuming this information is available at start), create a class (say ServerConnection), and in this class create a tcp::socket using the same io_service instance from above, and in the constructor itself call async_connect. NOTE: this call schedules a request to connect rather than performing the real connection operation (that doesn't happen till later)
- once all the ServerConnection objects are created (and their respective async_connects queued up), call run() on the instance of io_service. Now the main thread is blocked, dispatching events in the io_service queue.
[asio thread] The io_service dispatches scheduled events in the thread(s) that call run(); to implement a "multi-threaded" program you can increase the number of threads that the io_service uses, but for the moment stick with one - it will make your life simple...
asio will invoke methods in your ServerConnection class depending on which events are ready from the scheduled list. The first event you queued up (before calling run()) was async_connect; now asio will call you back when a connection is established to a server. Typically, you will implement a handle_connect method which will get called (you pass the method in to the async_connect call). In handle_connect, all you have to do is schedule the next request - in this case, you want to read some data (potentially from this socket), so you call async_read_some and pass in a function to be notified when there is data. Once done, the main asio dispatch thread will continue dispatching other events which are ready (this could be the other connect requests or even the async_read_some requests that you added).
Let's say you get called because there is some data on one of the server sockets. This is passed to you via your handler for async_read_some - you can then process this data, do as you need to. But - and this is the most important bit - once done, schedule the next async_read_some; this way asio will deliver more data as it becomes available. VERY IMPORTANT NOTE: if you no longer schedule any requests (i.e. exit from the handler without queueing), then the io_service will run out of events to dispatch, and run() (which you called in the main thread) will end.
Now, as for writing, this is slightly trickier. If all your writes are done as part of the handling of data from a read call (i.e. in the asio thread), then you don't need to worry about locking (unless your io_service has multiple threads). Otherwise, in your write method, append the data to a buffer and schedule an async_write_some request (with a write_handler that will get called when the buffer is written, either partially or completely). When asio handles this request, it will invoke your handler once the data is written, and you have the option of calling async_write_some again if there is more data left in the buffer, or, if none, you don't have to bother scheduling a write. At this point I will mention one technique to consider: double buffering - I'll leave it at that. If you have a completely different thread that is outside of the io_service and you want to write, you must call the io_service::post method and pass in a method to execute (in your ServerConnection class) along with the data; the io_service will then invoke this method when it can, and within that method you can buffer the data and optionally call async_write_some if a write is currently not in progress.
Now there is one VERY important thing that you must be careful about: you must NEVER schedule async_read_some or async_write_some if there is already one in progress. Say you called async_read_some on a socket; until this event is invoked by asio, you must not schedule another async_read_some, else you'll have lots of crap in your buffers!
A good starting point is the asio chat server/client that you find in the boost docs; it shows how the async_xxx methods are used. And keep this in mind: all async_xxx calls return immediately (within some tens of microseconds), so there are no blocking operations - it all happens asynchronously. http://www.boost.org/doc/libs/1_39_0/doc/html/boost_asio/example/chat/chat_client.cpp is the example I was referring to.
Now if you find that the performance of this mechanism is too slow and you want threading, all you need to do is increase the number of threads that are available to the main io_service and implement the appropriate locking in your read/write methods in ServerConnection, and you're done.
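To tie the description together, a bare-bones ServerConnection along those lines might look like this sketch (not production code; the endpoint is made up):
#include <cstddef>
#include <boost/asio.hpp>
#include <boost/bind.hpp>

class ServerConnection
{
public:
    ServerConnection(boost::asio::io_service& io_service,
                     const boost::asio::ip::tcp::endpoint& server)
        : socket_(io_service)
    {
        // Only schedules the connect; nothing happens until run() dispatches it.
        socket_.async_connect(server,
            boost::bind(&ServerConnection::handle_connect, this,
                        boost::asio::placeholders::error));
    }

private:
    void handle_connect(const boost::system::error_code& ec)
    {
        if (!ec) start_read(); // connected: queue the first read
    }

    void start_read()
    {
        socket_.async_read_some(boost::asio::buffer(buffer_),
            boost::bind(&ServerConnection::handle_read, this,
                        boost::asio::placeholders::error,
                        boost::asio::placeholders::bytes_transferred));
    }

    void handle_read(const boost::system::error_code& ec, std::size_t n)
    {
        if (ec) return;   // no new request queued: this connection goes idle
        // ... process buffer_[0..n) here ...
        start_read();     // always schedule the next read, or run() may end
    }

    boost::asio::ip::tcp::socket socket_;
    char buffer_[4096];
};

int main()
{
    boost::asio::io_service io_service;
    ServerConnection connection(io_service,
        boost::asio::ip::tcp::endpoint(
            boost::asio::ip::address::from_string("127.0.0.1"), 5000));
    io_service.run(); // main thread blocks here, dispatching events
}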
For asynchronous operations, you should use a single io_service object for the entire program. Whether it's a static member of a class or instantiated elsewhere is up to you. Multiple threads can invoke its run method; this is described in Inverse's answer:
Multiple threads may call io_service::run() to set up a pool of threads from which completion handlers may be invoked. This approach may also be used with io_service::post() as a means to perform any computational tasks across a thread pool.
Note that all threads that have joined an io_service's pool are considered equivalent, and the io_service may distribute work across them in an arbitrary fashion.
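In code, that pool amounts to something like this sketch (some_task is a placeholder):
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/thread.hpp>

void some_task() {} // placeholder for any computational work

int main()
{
    boost::asio::io_service io_service;
    boost::asio::io_service::work work(io_service); // keeps run() from returning

    boost::thread_group pool;
    for (int i = 0; i < 4; ++i)
        pool.create_thread(
            boost::bind(&boost::asio::io_service::run, &io_service));

    io_service.post(&some_task); // executed by one of the four pool
                                 // threads - you don't choose which

    // ... post more work over the program's lifetime ...

    io_service.stop(); // to drain gracefully instead, hold `work` in a
    pool.join_all();   // scoped_ptr and reset it, then let run() return
}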
If you have handlers that are not thread-safe, read about strands:
A strand is defined as a strictly sequential invocation of event handlers (i.e. no concurrent invocation). Use of strands allows execution of code in a multithreaded program without the need for explicit locking (e.g. using mutexes).
The io_service is what invokes all the handler functions for your connections. So you should have it running in one or more threads in order to distribute the work across those threads. Here is a page explaining io_service and threads:
Threads and Boost.Asio

io_service, why and how is it used?

Trying to learn asio, and I'm following the examples from the website.
Why is io_service needed and what does it do exactly? Why do I need to pass it to almost every other function while performing asynchronous operations, and why can't it "create" itself after the first "binding"?
Asio's io_service is the facilitator for operating on asynchronous functions. Once an async operation is ready, it uses one of io_service's running threads to call you back. If no such thread exists, it uses its own internal thread to call you.
Think of it as a queue containing operations. It guarantees that those operations, when run, will only do so on the threads that called its run() or run_one() methods, or, when dealing with sockets and async IO, on its internal thread.
The reason you must pass it to everyone is basically that someone has to wait for async operations to be ready, and, as stated in its own documentation, io_service is ASIO's link to the operating system's I/O services: it abstracts away the platform's own async notifiers, such as kqueue, /dev/poll and epoll, and the methods to operate on those, such as select().
Primarily I end up using io_service to demultiplex callbacks from several parts of the system, and make sure they operate on the same thread, eliminating the need for explicit locking, since the operations are serialized. It is a very powerful idiom for asynchronous applications.
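That idiom in miniature (the two callbacks are placeholders for events arriving from different subsystems):
#include <boost/asio.hpp>

void on_sensor_data() {}    // placeholder callbacks from different
void on_network_event() {}  // parts of the system

int main()
{
    boost::asio::io_service io_service;

    // Callbacks can be posted from any thread in the system...
    io_service.post(&on_sensor_data);
    io_service.post(&on_network_event);

    // ...but they are all invoked here, one at a time, on the single
    // thread calling run() - so they can share state without locks.
    io_service.run(); // returns once the queue is empty
}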
You can take a look at the core documentation to get a better feeling of why io_service is needed and what it does.