I want to create an HTTP client using Boost.Asio. To get a structured and optimized design, I have looked into the Boost.Asio examples to get some idea of what a good implementation should look like.
Mostly, I have followed the structure of HTTP Server, so I have a connection manager that holds a set of pointers to each individual connection. Now, the big difference here is that already in the constructor of server.cpp an asynchronous function is called, namely
acceptor_.async_accept(new_connection_->socket(),
boost::bind(&server::handle_accept, this,
boost::asio::placeholders::error));
and in the winmain.cpp the io_service is started through a function call to server::run():
io_service_.run();
In my implementation, since it's a client and not a server, I want to wait for the user to call a send() function before I start connecting to the server. I have therefore moved all connecting-to-server-related function calls into the connection class. When a user requests to send a msg to the server the following is called:
resolver.async_resolve(query,
boost::bind(&connection::handle_resolve, boost::ref(*this),
boost::asio::placeholders::error,
boost::asio::placeholders::iterator));
io_service_.run();
I want to start every connection object in its own thread, and this is really the background of my question. How do I do that and still end up with structured and optimized code?
I have tried, as in the HTTP Server 2 example, to set up a thread pool of io_services and assign work to them so that they will not return until stopped. This seems like a good idea since I would then have the io_services running in the background all the time. Consequently, I start the thread pool from my equivalent of server.cpp, in a thread:
boost::thread t(boost::bind(&geocast::enabler::io_service_pool::run, &io_service_pool_));
BUT, from my own trial and error analysis, it seems that you cannot start the io_service BEFORE you have issued an asynchronous function, is that true? Because my program gets stuck. In my case I want to call async_resolve only when a user means to send a POST request or a GET request. To support my theory: the Chat Client starts off by calling async_connect with an async_read as callback, so it can safely call io_service.run() just after the client has been created. I don't want to read from the server all the time just to be able to start the io_service, because that is not how a normal client works, right? A browser does not read from every possible server on the planet without the user having navigated to a website...
If I don't use the thread pool from example 2 but start every connection class in a separate thread, each of which owns its own io_service, everything works fine. But a thread pool with a simple round-robin routine to select an appropriate io_service seems really attractive. What is the best approach for me to go multi-threaded? Am I just being picky, and should I stick to the one-connection-one-io_service approach?
"I have tried, as HTTP Server 2 example, to set up a thread pool of io_services and assigning work to them so that they will not return until stopped."
When using asynchronous programming, I strongly suggest using the following designs in order:
1. single io_service with a single thread
2. pool of threads invoking a single io_service
3. io_service per thread or other exotic designs
You should only move to the next design if, after profiling, your current design proves to be a bottleneck.
"BUT, from my own trial and error analysis, it seems as you cannot start io_service BEFORE you have issued an asynchronous function, is that true?"
You are correct; the io_service::run() documentation spells this out very clearly:
"The run() function blocks until all work has finished and there are no more handlers to be dispatched, or until the io_service has been stopped."
The correct way to prevent io_service::run() from returning immediately is to queue up some handlers, or instantiate an io_service::work object and keep it in scope for as long as you want run() to stay active.
When using ASIO, you are giving up control of the flow of your program to ASIO. You can share control if you change your code to use a thread pool and call run_one() instead of run(). Each call to run_one() runs at most one handler, so if several events are pending in the io_service you will have to call run_one() several times.
Have you thought about spawning a new thread as your boss thread and then having your boss thread create a bunch of worker threads? Your boss thread could call run() and then your UI's thread could call post() to dispatch a new unit of work. Along with not having to manually call and schedule tasks in the io_service, it also makes the cleanup and shutdown more straightforward, since your boss thread would block when it calls run().
Related
Given that boost::asio::ip::tcp::acceptor and boost::asio::ip::tcp::socket are both marked as non-thread safe as of Boost 1.52.0, is it possible to shut down a tcp::acceptor currently blocking on accept() from a separate thread?
I've looked at calling boost::asio::io_service::stop() and this looks possible as io_service is thread safe. Would this leave the io_service event loop running until any processing being done on the socket is complete?
I am operating synchronously, as this is a simple event loop that is part of a bigger program, and I don't want to create additional threads without good reason, which I understand async will do.
Having spent some time looking into this there is only 1 thread safe manner in which this can be achieved: by sending a message to the socket (on a thread not waiting on accept()) telling the thread to close the socket and the acceptor. By doing this the socket and acceptor can be wholly owned by a single thread.
As pointed out separately, io_service is only of use for asynchronous operations.
If your acceptor is in async_accept, you can call ip::tcp::acceptor::cancel() to cancel any async operations on it. Note this may fire handlers in this acceptor with the boost::asio::error::operation_aborted error code.
If you're using synchronous accept, it seems impossible since I think it's not related to io_service at all.
I think you're overthinking this a little. Use a non-blocking accept or a native accept with a timeout within a conditional loop. Add a mutex lock and it's thread safe. You can also use a native select and accept when a new connection arrives. Set a timeout and a conditional loop for the select.
I am looking at writing a multithreaded TCP server using Boost.Asio. I have read through the tutorials and had a look at some of the examples and just want to check that my understanding is correct.
The server will accept connections then service requests from multiple clients.
My understanding is as follows:
1. The server uses "a single io_service and a thread pool calling io_service::run()"
2. All threads call io_service::run().
3. The calls to io_service::run() are not within a strand, ergo completion handlers can run simultaneously.
4. When a request arrives one of the threads is chosen, and its read handler will be called
5. Another request may arrive, starting the read handler on a second thread
6. When one of the threads has finished handling the request it calls async_write, from within a strand
7. Another thread also finishes processing its request, and it also calls async_write, from within a strand
8. The writes to the io_service are serialised via the strand, ergo they are thread safe.
9. When the write operation completes the thread calls async_read()
10. This call is not protected by a strand and the thread will be used for handling requests
Is my understanding correct? Is this solution vulnerable to race conditions?
As Sam Miller said, your assumptions are quite correct.
However I would like to point out an issue that you may have not spotted.
It is right that strands will serialize the async_write calls, and therefore they will be thread safe.
But the issue is not here: async_write is by itself thread safe if not used on the same socket. And strands will not help here, since you should not interleave async_write calls on the same socket.
Strands will not wait for the previous async_write to finish before calling the next one. You will have to create a structure that calls async_write only if none is already in progress on the socket.
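One possible shape for such a structure is sketched below. It is deliberately written without Boost so the queueing logic stands on its own: write_queue and its start_write callback are hypothetical names, and start_write is a stand-in for the real async_write call on the socket. The sketch assumes send() and on_write_complete() are always called from the same thread or strand.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: keeps at most one write in flight per socket.
// `start_write` stands in for a call to async_write on the real socket.
class write_queue {
public:
    explicit write_queue(std::function<void(const std::string&)> start_write)
        : start_write_(std::move(start_write)) {}

    // Queue a message; start writing only if nothing is in flight.
    void send(std::string message) {
        const bool write_in_progress = !pending_.empty();
        pending_.push_back(std::move(message));
        if (!write_in_progress) start_write_(pending_.front());
    }

    // Call this from the write completion handler.
    void on_write_complete() {
        assert(!pending_.empty());
        pending_.pop_front();
        if (!pending_.empty()) start_write_(pending_.front());
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::function<void(const std::string&)> start_write_;
    std::deque<std::string> pending_;  // front() is the write in flight
};
```

The message being written stays at the front of the deque until its completion handler fires, so the buffer handed to the write remains valid for the whole operation.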
I would like to have a way to add async tasks from multiple threads and execute them sequentially in a C++ boost::asio application.
Update: I would like to make a server-to-server communication with only one persistent socket between them and I need to sequence the multiple requests through it. It needs to keep the incoming requests in a queue, fire the top one, wait for its response and pick up the next. I'm trying to avoid using zeromq because it needs a dedicated thread.
Update2: OK, here is what I ended up with: the concurrent worker threads are "queued" for the use of the server-to-server socket with a simple mutex. The communication is a blocking write, wait for response, read, then release the mutex. Simple isn't it :)
From the ASIO documentation:
"Asynchronous completion handlers will only be called from threads that are currently calling io_service::run()."
If you're already calling io_service::run() from multiple threads, you can wrap your async calls in an io_service::strand as described here.
Not sure if I understand you correctly either, but what's wrong with the approach in the client chat example? Messages are posted to the io_service thread, queued while a write is in progress and popped/sent in the write completion handler. If more messages were added in the meantime, the write handler launches the next async write.
Based on your comment to Sean, I also don't understand the benefit of having multiple threads calling io_service::run since you can only execute one async_write/async_read on one persistent socket at a time i.e. you can only call async_write again once the handler has returned? The number of calling threads might require you to lock the queue with a mutex though.
AFAICT the benefit of having multiple threads calling io_service::run is to increase the scalability of a server that is serving multiple requests simultaneously.
having several connections in several different threads.. I'm basically doing a base class that uses boost/asio.hpp and the tcp stuff there..
now i was reading this: http://www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/tutorial/tutdaytime1.html
it says that "All programs that use asio need to have at least one io_service object."
so should my base class have a static io_service (which means there will be only 1 for the whole program, and all the different threads and connections will use the same io_service object)
or make each connection its own io_service?
Thanks in advance!
update:
OK so basically what I wish to do is a class for a basic client which will have a socket in it.
For each socket I'm going to have a thread that always-receives and a different thread that sometimes sends packets.
after looking in here: www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/reference/ip__tcp/socket.html (can't make hyperlink since I'm new here.. so only 1 hyperlink per post) I can see that the socket class isn't entirely thread-safe..
so a few questions:
1. Based on the design I just wrote, do I need 1 io_service for all the sockets (meaning make it a static class member) or should I have one for each?
2. How can I make it thread-safe? Should I put it inside a "thread safe environment", meaning making a new socket class that has mutexes and stuff that doesn't let you send and receive at the same time, or do you have other suggestions?
3. Maybe I should go for an async design? (ofc each socket will have a different thread but the sending and receiving would be on the same thread?)
just to clarify: im doing a tcp client that connects to a lot of servers.
You need to decide first which style of socket communication you are going to use:
synchronous - means that all low-level operations are blocking, and typically you need a thread for the accept, and then threads (read thread or io_service) to handle each client.
asynchronous - means that all low-level operations are non-blocking, and here you only need a single thread (io_service), and you need to be able to handle callbacks when certain things happen (i.e. accepts, partial writes, result of reads etc.)
Advantage of approach 1 is that it's a lot simpler to code (??) than 2, however I find that 2 is most flexible, and in fact with 2, by default you have a single threaded application (the event callbacks are invoked in whichever thread calls io_service::run()), downside of 2 of course is that your processing delay hits the next read/write operations... Of course you can make multi-threaded applications with approach 2, but not vice-versa (i.e. single threaded with 1) - hence the flexibility...
So, fundamentally, it all depends on the selection of style...
EDIT: updated for the new information, this is quite long, I can't be bothered to write the code, there is plenty in the boost docs, I'll simply describe what is happening for your benefit...
[main thread]
- declare an instance of io_service
- for each of the servers you are connecting to (I'm assuming that this information is available at start), create a class (say ServerConnection), and in this class, create a tcp::socket using the same io_service instance from above, and in the constructor itself, call async_connect. NOTE: this call is scheduling a request for connect rather than doing the real connection operation (this doesn't happen till later)
- once all the ServerConnection objects are created (and their respective async_connects queued up), call run() on the instance of io_service. Now the main thread is blocked dispatching events in the io_service queue.
[asio thread] io_service does not create a thread of its own for your handlers: scheduled events are only invoked from threads that call run(), so here the "asio thread" is the main thread that is blocked in run(). To implement a "multi-threaded" program, you can increase the number of threads that call run() on the io_service, but for the moment stick with one, it will make your life simple...
asio will invoke methods in your ServerConnection class depending on which events are ready from the scheduled list. The first event you queued up (before calling run()) was async_connect, now asio will call you back when a connection is established to a server, typically, you will implement a handle_connect method which will get called (you pass the method in to the async_connect call). On handle_connect, all you have to do is schedule the next request - in this case, you want to read some data (potentially from this socket), so you call async_read_some and pass in a function to be notified when there is data. Once done, then the main asio dispatch thread will continue dispatching other events which are ready (this could be the other connect requests or even the async_read_some requests that you added).
Let's say you get called because there is some data on one of the server sockets; this is passed to you via your handler for async_read_some. You can then process this data and do as you need to, but (and this is the most important bit) once done, schedule the next async_read_some, so that asio will deliver more data as it becomes available. VERY IMPORTANT NOTE: if you no longer schedule any requests (i.e. exit from the handler without queueing), then the io_service will run out of events to dispatch, and run() (which you called in the main thread) will end.
Now, as for writing, this is slightly trickier. If all your writes are done as part of the handling of data from a read call (i.e. in the asio thread), then you don't need to worry about locking (unless your io_service has multiple threads), else in your write method, append the data to a buffer, and schedule an async_write_some request (with a write_handler that will get called when the buffer is written, either partially or completely). When asio handles this request, it will invoke your handler once the data is written and you have the option of calling async_write_some again if there is more data left in the buffer or if none, you don't have to bother scheduling a write. At this point, I will mention one technique, consider double buffering - I'll leave it at that. If you have a completely different thread that is outside of the io_service and you want to write, you must call the io_service::post method and pass in a method to execute (in your ServerConnection class) along with the data, the io_service will then invoke this method when it can, and within that method, you can then buffer the data and optionally call async_write_some if a write is currently not in progress.
Now there is one VERY important thing that you must be careful about, you must NEVER schedule async_read_some or async_write_some if there is already one in progress, i.e. let's say you called async_read_some on a socket, until this event is invoked by asio, you must not schedule another async_read_some, else you'll have lots of crap in your buffers!
A good starting point is the asio chat server/client that you find in the boost docs, it shows how the async_xxx methods are used. And keep this in mind, all async_xxx calls return immediately (within some tens of microseconds), so there are no blocking operations, it all happens asynchronously. http://www.boost.org/doc/libs/1_39_0/doc/html/boost_asio/example/chat/chat_client.cpp, is the example I was referring to.
Now if you find that performance of this mechanism is too slow and you want to have threading, all you need to do is increase the number of threads that are available to the main io_service and implement the appropriate locking in your read/write methods in ServerConnection and you're done.
For asynchronous operations, you should use a single io_service object for the entire program. Whether it's a static member of a class or instantiated elsewhere is up to you. Multiple threads can invoke its run method; this is described in Inverse's answer.
"Multiple threads may call io_service::run() to set up a pool of threads from which completion handlers may be invoked. This approach may also be used with io_service::post() to use a means to perform any computational tasks across a thread pool."
"Note that all threads that have joined an io_service's pool are considered equivalent, and the io_service may distribute work across them in an arbitrary fashion."
If you have handlers that are not thread safe, read about strands.
"A strand is defined as a strictly sequential invocation of event handlers (i.e. no concurrent invocation). Use of strands allows execution of code in a multithreaded program without the need for explicit locking (e.g. using mutexes)."
The io_service is what invokes all the handler functions for your connections. So you should have one running per thread in order to distribute the work across threads. Here is a page explaining the io_service and threads:
Threads and Boost.Asio
I have already used wininet to send some synchronous HTTP requests. Now, I want to go one step further and want to request some content asynchronously.
The goal is to get something "reverse proxy"-like. I send an HTTP request which gets answered delayed - as soon as someone wants to contact me. My thread should continue as if there was nothing in the meanwhile, and a callback should be called in this thread as soon as the response arrives. Note that I don't want a second thread which handles the reply (if it is necessary, it should only provide some mechanism which interrupts the main thread to invoke the callback there)!
Update: Maybe, the best way to describe what I want is a behaviour like in JavaScript where you have only one thread but can send AJAX requests which then result in a callback being invoked in this main thread.
Since I want to understand how it works, I don't want library solutions. Does anybody know a good tutorial which explains how to achieve the behavior I want?
"My thread should continue as if there was nothing in the meanwhile, and a callback should be called in this thread as soon as the response arrives."
What you're asking for here is basically COME FROM (as opposed to GO TO). This is a mythical instruction which doesn't really exist. The only way you can get your code called is to either poll in the issuing thread, or to have a separate thread which is performing the synchronous IO and then executing the callback (in that thread, or in yet another spawned thread) with the results.
When I was working in C++ with sockets I set up a dedicated thread to iterate over all the open sockets, poll for data which would be available without blocking, take the data and stuff it in a buffer, sending the buffer to a callback on a given circumstance (EOL, EOF, that sort of thing).
Unless your main thread is listening to something like a message queue there isn't really a way to just hijack it and start it executing code other than what it is currently doing.
Take a look at how boost::asio works; it basically lets you asynchronously do connects, reads, writes, etc... For example you start an async read with the primary (or any) thread, and asio then uses overlapped IO to ask the OS to notify it of IO completion. When the async read completes your callback will be executed by one of the worker threads.
All you need to do is to be sure to call io_service::run() with either your main thread or a worker thread to handle the IO completion queue. Any threads that you call run with will be the ones that execute the callback.
Asio has some guarantees that make this method of multithreading fairly robust if you follow the rules.
Take a look at the documentation for asio even if you don't plan to use it, a lot of the patterns and ideas are quite interesting if this is something you want to tackle yourself.
If you don't want to look at it, remember, on Windows the method of doing async IO is called "Overlapped IO".