I'm developing a network server based on Boost::Asio.
I have a boost::thread_group of IO worker threads which I use to call boost::asio::io_service::run( )
When network activity occurs ASIO uses one of these worker threads to process the activity (eg. Accept or Receive).
My application then does some work, possibly some calculation, possibly some other IO (via boost) and possibly some database activity.
I'd like to know what the implications are of doing said work within these threads. Specifically:
Does carrying out ( possibly significant work ) on the IO threads cause
the io_service any grief?
And less specifically: any other issues I should be thinking about.
Does carrying out ( possibly significant work ) on the IO threads
cause the io_service any grief?
It really depends what you mean by grief. Performing long running operations in a handler invoked by an io_service can block additional handlers from being invoked by the io_service. Consider the simplest example with a single thread invoking io_service::run(). If the handler invoked by an asynchronous operation, say async_read() then performs some database operations that could be long running, additional outstanding asynchronous operations will not have their handlers invoked until the long running operation is complete and the handler returns control to the io_service.
This is typically mitigated by invoking io_service::run() from multiple threads, and using a strand to ensure exclusive access to handlers that used shared data. Though if all of your handlers perform some long running operations, you might need to investigate alternative designs.
Related
I have hard time finding out how exactly does thread pool built with boost::asio::io_service behave.
The documentation says:
Multiple threads may call the run() function to set up a pool of
threads from which the io_service may execute handlers. All threads
that are waiting in the pool are equivalent and the io_service may
choose any one of them to invoke a handler.
I would imagine, that when threads executing run() are taking a handler to execute, they execute it, and then come back to wait for next handlers to execute. When executing a handler, a thread is not considered waiting, and hence no new handlers to execute are assigned to it. Is that correct? Or does io_service assign work to threads, without considering whether these are busy or not?
I am asking, because in one project that we are using (OSRM), that uses boost::asio::io_service based thread pool to handle incoming HTTP requests, I noticed that long running request, sometimes block other, fast requests, even though more threads and cores are available.
When executing a handler, a thread is not considered waiting, and hence no new handlers to execute are assigned to it. Is that correct?
Yes. It's a pull model queue.
A notable "apparent" exception is when strands are used: handlers wrapped on a on a strand do synchronize with other handlers running on that same strand.
In my app I will receive various events that I would like to process asynchronously in a prioritised order.
I could do this with a boost::asio::io_service, but my application is single threaded. I don't want to pay for locks and mallocs you might need for a multi threaded program (the performance cost really is significant to me). I'm basically looking for a boost::asio::io_service that is written for single threaded execution.
I'm pretty sure I could implement this myself using boost::coroutine, but before I do, does something like a boost::asio::io_service that is written for single threaded execution exist already? I scanned the list of boost libraries already and nothing stood out to me
Be aware that you have to pay for synchronization as soon as you use any non-blocking calls of Asio.
Even though you might use a single thread for scheduling work and processing the resulting callbacks, Asio might still have to spawn additional threads internally for executing asynchronous calls. Those will access the io_service concurrently.
Think of an async_read on a socket: As soon as the received data becomes available, the socket has to notify the io_service. This happens concurrent to your main thread, so additional synchronization is required.
For blocking I/O this problem goes away in theory, but since asynchronous I/O is sort of the whole point of the library, I would not expect to find too many optimizations for this case in the implementation.
As was pointed out in the comments already, the contention on the io_service will be very low with only one main thread, so unless profiling indicates a clear performance bottleneck there, you should not worry about it too much.
I suggest to use boost::asio together with boost::coroutine -> boost::asio::yield_context (does already the coupling between coroutine + io_service). If you detect an task with higher priority you could suspend the current task and start processing the task with higher priority.
The problem is that you have to define/call certain check-points in the code of your task in order to suspend the task if the condition (higher prio task enqueued) is given.
I'm writing a thread pool class in C++ which receives tasks to be executed in parallel. I want all cores to be busy, if possible, but sometimes some threads are idle because they are blocked for a time for synchronization purposes. When this happens I would like to start a new thread, so that there are always approximately as many threads awake as there are cpu cores. For this purpose I need a way to find out whether a certain thread is awake or sleeping (blocked). How can I find this out?
I'd prefer to use the C++11 standard library or boost for portability purposes. But if necessary I would also use WinAPI. I'm using Visual Studio 2012 on Windows 7. But really, I'd like to have a portable way of doing this.
Preferably this thread-pool should be able to master cases like
MyThreadPool pool;
for ( int i = 0; i < 100; ++i )
pool.addTask( &block_until_this_function_has_been_called_a_hundred_times );
pool.join(); // waits until all tasks have been dispatched.
where the function block_until_this_function_has_been_called_a_hundred_times() blocks until 100 threads have called it. At this time all threads should continue running. One requirement for the thread-pool is that it should not deadlock because of a too low number of threads in the pool.
Add a facility to your thread pool for a thread to say "I'm blocked" and then "I'm no longer blocked". Before every significant blocking action (see below for what I mean by that) signal "I'm blocked", and then "I'm no longer blocked" afterwards.
What constitutes a "significant blocking action"? Certainly not a simple mutex lock: mutexes should only be held for a short period of time, so blocking on a mutex is not a big deal. I mean things like:
Waiting for I/O to complete
Waiting for another pool task to complete
Waiting for data on a shared queue
and other similar events.
Use Boost Asio. It has its own thread pool management and scheduling framework. The basic idea is to push tasks to the io_service object using the post() method, and call run() from as many threads as many CPU cores you have. You should create a work object while the calculation is running to avoid the threads from exiting if they don't have enough jobs.
The important thing about Asio is never to use any blocking calls. For I/O calls, use the asynchronous calls of Asio's own I/O objects. For synchronization, use strand objects instead of mutexes. If you post functions to the io service that is wrapped in a strand, then it ensures that at any time at most one task runs that belongs to a certain strand. If there is a conflict, the task remains in Asio's event queue instead of blocking a working thread.
There is one drawback of using asynchronous programming though. It is much harder to read a code that is scattered into several asynchronous calls than one with a clear control flow. You should be aware of this when designing your program.
having several connections in several different threads.. I'm basically doing a base class that uses boost/asio.hpp and the tcp stuff there..
now i was reading this: http://www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/tutorial/tutdaytime1.html
it says that "All programs that use asio need to have at least one io_service object."
so should my base class has a static io_service (which means there will be only 1 for all the program and a all the different threads and connections will use the same io_service object)
or make each connection its own io_service?
thanks in front!
update:
OK so basically what I wish to do is a class for a basic client which will have a socket n it.
For each socket I'm going to have a thread that always-receives and a different thread that sometimes sends packets.
after looking in here: www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/reference/ip__tcp/socket.html (cant make hyperlink since im new here.. so only 1 hyperling per post) I can see that socket class isn't entirely thread-safe..
so 2 questions:
1. Based on the design I just wrote, do I need 1 io_service for all the sockets (meaning make it a static class member) or I should have one for each?
2. How can I make it thread-safe to do? should I put it inside a "thread safe environment" meaning making a new socket class that has mutexes and stuff that doesn't let u send and receive at the same time or you have other suggestions?
3. Maybe I should go on a asynch design? (ofc each socket will have a different thread but the sending and receiving would be on the same thread?)
just to clarify: im doing a tcp client that connects to a lot of servers.
You need to decide first which style of socket communication you are going to use:
synchronous - means that all low-level operations are blocking, and typically you need a thread for the accept, and then threads (read thread or io_service) to handle each client.
asynchronous - means that all low-level operations are non-blocking, and here you only need a single thread (io_service), and you need to be able to handle callbacks when certain things happen (i.e. accepts, partial writes, result of reads etc.)
Advantage of approach 1 is that it's a lot simpler to code (??) than 2, however I find that 2 is most flexible, and in fact with 2, by default you have a single threaded application (internally the event callbacks are done in a separate thread to the main dispatching thread), downside of 2 of course is that your processing delay hits the next read/write operations... Of course you can make multi-threaded applications with approach 2, but not vice-versa (i.e. single threaded with 1) - hence the flexibility...
So, fundamentally, it all depends on the selection of style...
EDIT: updated for the new information, this is quite long, I can't be bothered to write the code, there is plenty in the boost docs, I'll simply describe what is happening for your benefit...
[main thread]
- declare an instance of io_service
- for each of the servers you are connecting to (I'm assuming that this information is available at start), create a class (say ServerConnection), and in this class, create a tcp::socket using the same io_service instance from above, and in the constructor itself, call async_connect, NOTE: this call is a scheduling a request for connect rather than the real connection operation (this doesn't happen till later)
- once all the ServerConnection objects (and their respective async_connects queued up), call run() on the instance of io_service. Now the main thread is blocked dispatching events in the io_service queue.
[asio thread] io_service by default has a thread in which scheduled events are invoked, you don't control this thread, and to implement a "multi-threaded" program, you can increase the number of threads that the io_service uses, but for the moment stick with one, it will make your life simple...
asio will invoke methods in your ServerConnection class depending on which events are ready from the scheduled list. The first event you queued up (before calling run()) was async_connect, now asio will call you back when a connection is established to a server, typically, you will implement a handle_connect method which will get called (you pass the method in to the async_connect call). On handle_connect, all you have to do is schedule the next request - in this case, you want to read some data (potentially from this socket), so you call async_read_some and pass in a function to be notified when there is data. Once done, then the main asio dispatch thread will continue dispatching other events which are ready (this could be the other connect requests or even the async_read_some requests that you added).
Let's say you get called because there is some data on one of the server sockets, this is passed to you via your handler for async_read_some - you can then process this data, do as you need to, but and this is the most important bit - once done, schedule the next async_read_some, this way asio will deliver more data as it becomes available. VERY IMPORTANT NOTE: if you no longer schedule any requests (i.e. exit from the handler without queueing), then the io_service will run out of events to dispatch, and run() (which you called in the main thread) will end.
Now, as for writing, this is slightly trickier. If all your writes are done as part of the handling of data from a read call (i.e. in the asio thread), then you don't need to worry about locking (unless your io_service has multiple threads), else in your write method, append the data to a buffer, and schedule an async_write_some request (with a write_handler that will get called when the buffer is written, either partially or completely). When asio handles this request, it will invoke your handler once the data is written and you have the option of calling async_write_some again if there is more data left in the buffer or if none, you don't have to bother scheduling a write. At this point, I will mention one technique, consider double buffering - I'll leave it at that. If you have a completely different thread that is outside of the io_service and you want to write, you must call the io_service::post method and pass in a method to execute (in your ServerConnection class) along with the data, the io_service will then invoke this method when it can, and within that method, you can then buffer the data and optionally call async_write_some if a write is currently not in progress.
Now there is one VERY important thing that you must be careful about, you must NEVER schedule async_read_some or async_write_some if there is already one in progress, i.e. let's say you called async_read_some on a socket, until this event is invoked by asio, you must not schedule another async_read_some, else you'll have lots of crap in your buffers!
A good starting point is the asio chat server/client that you find in the boost docs, it shows how the async_xxx methods are used. And keep this in mind, all async_xxx calls return immediately (within some tens of microseconds), so there are no blocking operations, it all happens asynchronously. http://www.boost.org/doc/libs/1_39_0/doc/html/boost_asio/example/chat/chat_client.cpp, is the example I was referring to.
Now if you find that performance of this mechanism is too slow and you want to have threading, all you need to do is increase the number of threads that are available to the main io_service and implement the appropriate locking in your read/write methods in ServerConnection and you're done.
For asynchronous operations, you should use a single io_service object for the entire program. Whether its a static member of a class, or instantiated elsewhere is up to you. Multiple threads can invoke its run method, this is described in Inverse's answer.
Multiple threads may call
io_service::run() to set up a pool of
threads from which completion handlers
may be invoked. This approach may also
be used with io_service::post() to use
a means to perform any computational
tasks across a thread pool.
Note that all threads that have joined
an io_service's pool are considered
equivalent, and the io_service may
distribute work across them in an
arbitrary fashion.
if you have handlers that are not thread safe, read about strands.
A strand is defined as a strictly
sequential invocation of event
handlers (i.e. no concurrent
invocation). Use of strands allows
execution of code in a multithreaded
program without the need for explicit
locking (e.g. using mutexes).
The io_service is what invokes all the handler functions for you connections. So you should have one running for thread in order to distribute the work across threads. Here is a page explain the io_service and threads:
Threads and Boost.Asio
Trying to learn asio, and I'm following the examples from the website.
Why is io_service needed and what does it do exactly? Why do I need to send it to almost every other functions while performing asynchronous operations, why can't it "create" itself after the first "binding".
Asio's io_service is the facilitator for operating on asynchronous functions. Once an async operation is ready, it uses one of io_service's running threads to call you back. If no such thread exists it uses its own internal thread to call you.
Think of it as a queue containing operations. It guarantees you that those operations, when run, will only do so on the threads that called its run() or run_once() methods, or when dealing with sockets and async IO, its internal thread.
The reason you must pass it to everyone is basically that someone has to wait for async operations to be ready, and as stated in its own documentation io_service is ASIO's link to the Operating System's I/O service so it abstracts away the platform's own async notifiers, such as kqueue, /dev/pool/, epoll, and the methods to operate on those, such as select().
Primarily I end up using io_service to demultiplex callbacks from several parts of the system, and make sure they operate on the same thread, eliminating the need for explicit locking, since the operations are serialized. It is a very powerful idiom for asynchronous applications.
You can take a look at the core documentation to get a better feeling of why io_service is needed and what it does.