Can we get the epoll descriptor underlying the boost asio io_service object? We have multiple boost asio io_service objects in our application: one comes from a library and the other is native to the application. The io_service object from the library is a server object serving multiple connections. We are investigating how best to coordinate multiple io_service objects efficiently.
I'd say you don't need to break the documented interface to combine both.
In fact the documented interface is enough to combine several io_service objects efficiently.
Basically, the only point of efficiency you could be looking for is to avoid running separate event loops for them (as that might require more threads than you're prepared to spend).
As the docs say:
The library interface is decoupled from interfaces for thread creation and management, and permits implementations on platforms where threads are not available.
And the platform-specific implementation notes promise (in slightly different but essentially similar wording):
Demultiplexing using epoll is performed in one of the threads that calls io_context::run(), io_context::run_one(), io_context::poll() or io_context::poll_one().
This is your clue. You can knit together many io_services into a single event loop using poll_one() (or even run_one()). In fact, this mechanism can be used to integrate with whatever third-party event subsystem you want (libev, Qt idle work, etc.). You could even call poll_one() in response to a hardware interrupt on systems that don't support threads in the first place.
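Here's a minimal sketch of that single-loop idea, assuming one io_service belongs to the library and one to the application (the timers are only stand-ins for real work):

#include <boost/asio.hpp>
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    boost::asio::io_service lib_io;  // e.g. the library's io_service
    boost::asio::io_service app_io;  // the application's own io_service

    // Stand-in work so each io_service has something to do.
    boost::asio::deadline_timer t1(lib_io, boost::posix_time::milliseconds(100));
    boost::asio::deadline_timer t2(app_io, boost::posix_time::milliseconds(150));
    t1.async_wait([](boost::system::error_code) { std::cout << "library work\n"; });
    t2.async_wait([](boost::system::error_code) { std::cout << "application work\n"; });

    // One event loop, one thread: give each io_service a slice in turn.
    while (!lib_io.stopped() || !app_io.stopped()) {
        std::size_t handled = 0;
        handled += lib_io.poll_one();
        handled += app_io.poll_one();
        if (handled == 0) // nothing ready; back off instead of busy-spinning
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

In a real integration you would wire the poll_one() calls into your existing loop (or the third-party one) rather than sleeping.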
As a side-note, the inverse is to let other libraries do native socket operations with ASIO doing the polling: Reactor Style Operations.
Both these approaches can be combined.
Summary
Boost Asio is designed to be extensible and to be unintrusive to your design choices. You're most likely able to "fix" your third-party library integration worries using the public interface.
In recent versions of Boost there are the methods run_for and run_until: we can wait on the first io_service object for a specified time and call poll on the second io_service object when the first one returns or times out.
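A hedged sketch of that approach, assuming Boost 1.66 or newer (where io_context::run_for is available); the function name pump_once and the 50 ms budget are purely illustrative:

#include <boost/asio.hpp>
#include <chrono>

// Drive two io_contexts from one thread: block on the first for a bounded
// time, then run whatever is already ready on the second.
void pump_once(boost::asio::io_context& first, boost::asio::io_context& second) {
    first.run_for(std::chrono::milliseconds(50)); // returns on timeout or when out of work
    second.poll();                                // runs only handlers that are ready now
    // Note: if either context has stopped (ran out of work), call restart()
    // on it before pumping it again.
}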
I have written multithreaded C++ code using Boost.
I have the below code in my main thread:
while (!mInputQueue.empty() && mStartProcessJobs)
    mProcessJobs.wait(lock);
The second line should be executed immediately after the first line, and no context switch should occur in between. How can I do this?
Depending on the nature of the jobs, you can use an asynchronous service provider.
Often these exist for asynchronous I/O (e.g. sockets in non-blocking mode, I/O completion ports on Windows, libaio, etc.).
Boost Asio harnesses all these interfaces (and some more, related to timers, platform-specific handles, or e.g. serial ports) into a service object. This enables you to run many jobs asynchronously, potentially all on a single thread, which means there is no context switching.
Asio's io_service has several ways of posting/dispatching jobs. Depending on which you use, jobs might even execute immediately and synchronously.
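A small, hedged illustration of the difference (nothing here is specific to your jobs): post() always queues the handler, while dispatch() may run it immediately if the caller is already inside run().

#include <boost/asio.hpp>
#include <iostream>

int main() {
    boost::asio::io_service io;

    io.post([] { std::cout << "post: always deferred until run()\n"; });

    io.dispatch([&io] {
        std::cout << "outer handler\n";
        // This body executes inside run(), so the nested dispatch() runs inline.
        io.dispatch([] { std::cout << "dispatch: ran synchronously\n"; });
    });

    io.run(); // a single thread drains everything; no context switching
}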
I suggest you look at some of the samples, as it looks to be precisely what you need.
PS. There are other, more low-level, libraries outside of Boost that have the same kind of features, but I haven't used them. I think the most popular are libuv and libevent (IIRC).
The boost::asio library provides an interesting synchronization model using "strands" to serialize accesses to a resource that would normally require locks. This increases parallelism by essentially turning every lock operation into an enqueue.
Searching for "strands" only yields relevant results about asio, even though they seem like an exceptionally useful primitive for multithreading. Is there some other term for them that I'm missing?
Link to the asio strand documentation: http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference/io_service__strand.html
I am not aware of an official name for the construct.
The proposal based on Boost.Asio (N2175 - Networking Library Proposal for TR2) documents the strand class, but does not reference any relevant material. Also, the Intel compiler documentation makes a few references to strand in its execution model, defining it as "any sequence of instructions without any parallel control structures."
I've started doing a bit of programming in the iOS and Mac OS X domain, where there is a similar concept to strands: the serial dispatch queue from Grand Central Dispatch. Tasks are executed in the order they are added to the queue, just like a strand. Similarly, the thread that executes a task is not defined, just as with Asio when multiple threads invoke io_service::run().
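To make the analogy concrete, here is a minimal hedged sketch: handlers posted through one strand are run one at a time, in order, on whichever thread happens to be running the io_service, much like tasks on a GCD serial queue.

#include <boost/asio.hpp>
#include <iostream>
#include <thread>

int main() {
    boost::asio::io_service io;
    boost::asio::io_service::strand strand(io);
    int counter = 0; // shared state guarded only by the strand, no mutex

    // Enqueue instead of locking: handlers on the same strand never overlap.
    for (int i = 0; i < 1000; ++i)
        strand.post([&counter] { ++counter; });

    // Two worker threads; either may run a given handler, but never two at once.
    std::thread t1([&io] { io.run(); });
    std::thread t2([&io] { io.run(); });
    t1.join();
    t2.join();

    std::cout << counter << std::endl; // always 1000
}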
For some time now I have been googling a lot to learn about the various ways to achieve asynchronous programming/behavior on *nix machines and (as was already known to me) got confirmation that there is still no TRULY async pattern (concurrency using a single thread) for Linux like the one available on Windows (IOCP).
Below are the few alternatives present for linux:
select/poll/epoll :: Cannot be done using a single thread, as epoll is still a blocking call. Also, the monitored file descriptors must be opened in non-blocking mode.
libaio :: What I have come to know is that its implementation sucks and it is still notification-based instead of completion-based, as with Windows I/O completion ports.
Boost ASIO :: It uses epoll under Linux and is thus not a true async pattern, as it spawns threads, which are completely abstracted from user code, to achieve the proactor design pattern.
libevent :: Any reason to go for it if I prefer ASIO?
Now here come the questions :)
What would be the best design pattern for writing a fast, scalable network server using epoll (of course, will have to use threads here :( )?
I had read somewhere that "only sockets can be opened in non-blocking mode", hence epoll supports only sockets and cannot be used for disk I/O.
How true is the above statement, and why can't async programming be done on disk I/O using epoll?
Boost ASIO uses one big lock around the epoll call. I didn't actually understand what its implications can be and how to overcome it using Asio itself. Similar question
How can I modify the ASIO pattern to work with disk files? Is there any recommended design pattern?
Hope somebody will be able to answer all the questions, with nice explanations as well. Any link to sources where the implementation details of epoll and AIO design patterns are explained is also appreciated.
Boost ASIO :: It uses epoll under Linux and is thus not a true async pattern, as it spawns threads, which are completely abstracted from user code, to achieve the proactor design pattern.
This is not correct. The Asio library uses epoll() by default on most recent Linux kernel versions; however, threads invoking io_service::run() will invoke callback handlers as needed. There is only one place in the Asio library where a thread is used to emulate an asynchronous interface, and it is well described in the documentation:
An additional thread per io_service is used to emulate asynchronous
host resolution. This thread is created on the first call to either
ip::tcp::resolver::async_resolve() or
ip::udp::resolver::async_resolve().
This does not make the library "not a true async pattern" as you claim; in fact, its name would disagree with you by definition.
1) What would be the best design pattern for writing fast scalable network server using epoll (of course, will have to use threads here :( )
I suggest using Boost Asio; it uses the proactor design pattern.
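To illustrate what that looks like in practice, here is a hedged sketch of a minimal asynchronous echo server (class names and port are made up, error handling is trimmed): every operation is started with an async_* call plus a completion handler, which is the proactor shape.

#include <boost/asio.hpp>
#include <memory>

using boost::asio::ip::tcp;

class Session : public std::enable_shared_from_this<Session> {
public:
    explicit Session(tcp::socket socket) : socket_(std::move(socket)) {}
    void start() { do_read(); }

private:
    void do_read() {
        auto self = shared_from_this();
        socket_.async_read_some(boost::asio::buffer(data_),
            [this, self](boost::system::error_code ec, std::size_t n) {
                if (!ec) do_write(n); // completion handler, proactor style
            });
    }
    void do_write(std::size_t n) {
        auto self = shared_from_this();
        boost::asio::async_write(socket_, boost::asio::buffer(data_, n),
            [this, self](boost::system::error_code ec, std::size_t) {
                if (!ec) do_read();
            });
    }
    tcp::socket socket_;
    char data_[1024];
};

class Server {
public:
    Server(boost::asio::io_service& io, unsigned short port)
        : acceptor_(io, tcp::endpoint(tcp::v4(), port)), socket_(io) { do_accept(); }

private:
    void do_accept() {
        acceptor_.async_accept(socket_, [this](boost::system::error_code ec) {
            if (!ec) std::make_shared<Session>(std::move(socket_))->start();
            do_accept(); // keep accepting
        });
    }
    tcp::acceptor acceptor_;
    tcp::socket socket_;
};

int main() {
    boost::asio::io_service io;
    Server server(io, 12345);
    io.run(); // add more threads calling run() to scale across cores
}

To scale across cores you simply call io.run() from several threads, or use an io_service per CPU as discussed below.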
3) Boost ASIO uses one big lock around the epoll call. I didn't actually understand what its implications can be and how to overcome it using Asio itself
The epoll reactor uses a mutex to dispatch handlers, though in practice this is not a big concern for most applications. There are application-specific ways to mitigate this behavior, such as using an io_service per CPU to exploit data locality, as sketched below. See my answer to a similar question on this topic. It is also discussed frequently on the Asio mailing list.
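A hedged sketch of the io_service-per-CPU idea (the structure is illustrative, not the only way to do it): each io_service gets its own thread, so each has its own epoll reactor and its own dispatch mutex, which reduces contention.

#include <boost/asio.hpp>
#include <algorithm>
#include <memory>
#include <thread>
#include <vector>

int main() {
    unsigned n = std::max(1u, std::thread::hardware_concurrency());

    std::vector<std::unique_ptr<boost::asio::io_service>> services;
    std::vector<std::unique_ptr<boost::asio::io_service::work>> work;
    std::vector<std::thread> threads;

    for (unsigned i = 0; i < n; ++i) {
        services.emplace_back(new boost::asio::io_service);
        work.emplace_back(new boost::asio::io_service::work(*services.back()));
        auto* svc = services.back().get();
        threads.emplace_back([svc] { svc->run(); }); // one thread per io_service
    }

    // New connections would be assigned to services[i], e.g. round-robin (not shown).

    for (auto& w : work) w.reset();   // let each run() return once it is idle
    for (auto& t : threads) t.join();
}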
4) How can I modify ASIO pattern to work with disk files? Is there any recommended design pattern?
The Asio library does not natively support file I/O, as you noted. There have been several attempts to add it to the library; I'd suggest discussing it on the mailing list.
First of all:
got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).
You probably have a small misconception: asynchronous behavior can be built on top of a "polling" API.
More than that, a "reactor" (epoll-like) API is more powerful than a "proactor" API (IOCP), as the second can be implemented in terms of the first (but not the other way around); a sketch of this follows below.
Also, some operations are "truly" asynchronous, like disk I/O for example, and some other tools, such as a combination of signals and the Linux-specific signalfd, can provide coverage of some other cases.
Bottom line: epoll is truly asynchronous I/O.
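To make the "reactor can emulate proactor" point concrete, here is a hedged, Linux-specific sketch (MiniProactor is a made-up name, not a real library): a completion-style read built on top of an epoll readiness loop.

#include <sys/epoll.h>
#include <unistd.h>
#include <functional>
#include <map>
#include <vector>

struct MiniProactor {
    int ep = epoll_create1(0);
    std::map<int, std::function<void(std::vector<char>)>> handlers; // per-fd completion handlers

    // "async_read": register interest in the fd plus a completion handler.
    void async_read(int fd, std::function<void(std::vector<char>)> handler) {
        handlers[fd] = std::move(handler);
        epoll_event ev{};
        ev.events = EPOLLIN;
        ev.data.fd = fd;
        epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
    }

    // One reactor iteration: on readiness, do the read ourselves and then
    // invoke the completion handler; that is the proactor layer on top of epoll.
    void run_one() {
        epoll_event ev;
        if (epoll_wait(ep, &ev, 1, -1) == 1) {
            std::vector<char> buf(4096);
            ssize_t n = read(ev.data.fd, buf.data(), buf.size());
            buf.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
            epoll_ctl(ep, EPOLL_CTL_DEL, ev.data.fd, nullptr);
            auto handler = std::move(handlers[ev.data.fd]);
            handlers.erase(ev.data.fd);
            handler(std::move(buf)); // completion callback, IOCP-style
        }
    }
};

Usage would look like p.async_read(fd, [](std::vector<char> data) { /* consume */ }); followed by calls to p.run_one() in your event loop.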
I am working on an RPC framework. I want to use a multi-io_service design to decouple the io_objects that perform the I/O (the front-end) from the threads that perform the RPC work (the back-end).
The front-end should be single-threaded and the back-end should have a thread pool. I was considering a design to get the front-end and back-end to synchronise using condition variables. However, it seems boost::thread and boost::asio do not commingle, i.e., it seems condition variable async_wait support is not available. I have a question open on this matter here.
It occurred to me that io_service::post() might be used to synchronise the two io_service objects. I have attached a diagram below; I just want to know if I understand the post mechanism correctly, and whether this is a sensible implementation.
I assume that you use "a single io_service and a thread pool calling io_service::run()"
Also, I assume that your front-end is single-threaded just to avoid a race condition writing from multiple threads to the same socket.
The same goal can be achieved using io_service::strand (tutorial). Your front-end can be MT-synchronized by io_service::strand. All posts from back-end to front-end (and handlers from front-end to front-end, like handle_connect, etc.) should be wrapped by the strand, something like this:
back-end -> front-end:
io_service.post(front_end.strand.wrap(
    boost::bind(&Front_end::send_response, front_end_ptr)));
or front-end -> front-end:
socket.async_connect(endpoint, strand.wrap(
    boost::bind(&Front_end::handle_connect, shared_from_this(),
        boost::asio::placeholders::error)));
And all posts from front-end to back-end shouldn't be wrapped by strand.
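For symmetry, a hedged fragment of the opposite direction (back_end_io_service, Back_end::process_request, back_end_ptr and request are illustrative placeholders): the back-end thread pool picks the handler up on whichever of its threads is free, so no strand wrap is needed there.

back_end_io_service.post(
    boost::bind(&Back_end::process_request, back_end_ptr, request));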
If your back-end is a thread pool calling any of the io_service::run(), io_service::run_one(), io_service::poll(), or io_service::poll_one() functions, and your handler(s) require access to shared resources, then you still have to take care to lock those shared resources somehow in the handlers themselves.
Given the limited amount of information posted in the question, I would assume this would work fine given the caveat above.
However, when posting there is some measurable overhead for setting up the necessary completion ports and waiting -- overhead you could avoid by using a different implementation of your back-end "queue".
Without knowing the exact details of what you need to accomplish, I would suggest that you look into Intel Threading Building Blocks for pipelines, or perhaps more simply its concurrent queue.
I'm trying to learn Asio, and I'm following the examples from the website.
Why is io_service needed, and what does it do exactly? Why do I need to pass it to almost every other function while performing asynchronous operations, and why can't it "create" itself after the first "binding"?
Asio's io_service is the facilitator for operating on asynchronous functions. Once an async operation is ready, it uses one of io_service's running threads to call you back. If no such thread exists it uses its own internal thread to call you.
Think of it as a queue containing operations. It guarantees that those operations, when run, will only do so on the threads that called its run() or run_one() methods, or, when dealing with sockets and async I/O, on its internal thread.
The reason you must pass it to everyone is basically that someone has to wait for async operations to be ready, and as stated in its own documentation, io_service is Asio's link to the Operating System's I/O services: it abstracts away the platform's own async notifiers, such as kqueue, /dev/poll, epoll, and the methods to operate on those, such as select().
Primarily I end up using io_service to demultiplex callbacks from several parts of the system, and make sure they operate on the same thread, eliminating the need for explicit locking, since the operations are serialized. It is a very powerful idiom for asynchronous applications.
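A hedged sketch of that idiom (the producer/event names are made up): several threads post callbacks into one io_service, a single thread runs it, and the shared state therefore needs no explicit lock.

#include <boost/asio.hpp>
#include <iostream>
#include <memory>
#include <thread>
#include <vector>

int main() {
    boost::asio::io_service io;
    // Keep run() alive until we explicitly drop the work object.
    std::unique_ptr<boost::asio::io_service::work> keep_alive(
        new boost::asio::io_service::work(io));

    int events = 0; // only ever touched from the single io_service thread

    std::thread io_thread([&io] { io.run(); });

    // Several parts of the system post callbacks; no mutex around 'events'.
    std::vector<std::thread> producers;
    for (int i = 0; i < 4; ++i)
        producers.emplace_back([&io, &events, i] {
            io.post([&events, i] {
                ++events;
                std::cout << "callback from producer " << i << "\n";
            });
        });

    for (auto& p : producers) p.join();
    keep_alive.reset(); // let run() return once the queue drains
    io_thread.join();
    std::cout << "total events: " << events << "\n"; // always 4
}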
You can take a look at the core documentation to get a better feeling of why io_service is needed and what it does.