I have a server application that uses the Boost.Asio framework. The application logic is simple: it listens on several ports for incoming connections, accepts them, does some processing, and closes the connection. Of course, several clients are allowed to connect to the server at the same time. I use the asynchronous approach to accept connections and to read and write data. The problem is that at some point the io_service just stops processing handlers.
Let me describe the symptoms in more detail. After the problem appears, the application continues to listen on the specified ports, and netstat confirms that. A client can establish a connection to the server, but not a single handler (Server::Session) is called.
Here is the code that accepts connections:
void Server::StartAccept()
{
    socket_ptr sock(new boost::asio::ip::tcp::socket(ioService_));
    acceptor_.async_accept(*sock,
        boost::bind(&Server::Session, shared_from_this(), sock,
                    boost::asio::placeholders::error));
}

void Server::Session(socket_ptr sock, const boost::system::error_code& error)
{
    StartAccept();
    if (error)
    {
        boost::system::error_code ec;
        sock->shutdown(boost::asio::ip::tcp::socket::shutdown_both, ec);
        sock->close(ec);
        return;
    }
    //Processing...
}
Here is the code that starts the server:
void run_service()
{
    for (;;)
    {
        try
        {
            io_service.run();
            break;
        }
        catch (...)
        {
        }
    }
}

boost::thread_group threads;
for (int i = 0; i < size; ++i)
    threads.create_thread(run_service);
threads.join_all();
I found out that if I replace the line
    io_service.run();
with
    while (!io_service.stopped())
        io_service.run_one();
then the loop gets stuck right when the error appears, and run_one() never returns.
My guesses as to why this could happen:
One of the handlers that was called never returns.
There is some sort of deadlock in Boost internals (I don't do any locking myself).
The questions are:
What other reasons could there be for such strange behaviour?
What is the best way to fix it?
How can I figure out which handler run_one() called before it got stuck?
The problem was in a handler that waited for another network activity to finish. That activity had no timeout and in some cases lasted forever. Thanks for the comments. Defining BOOST_ASIO_ENABLE_HANDLER_TRACKING is a really good step toward detecting this kind of problem.
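For anyone hitting the same thing: the fix boils down to never waiting without an upper bound inside a handler. Below is a minimal sketch of a bounded wait using only the standard library. The helper name wait_with_timeout is made up for this illustration and is not part of Boost.Asio, and it assumes the other activity is represented by a std::future (for example one obtained from a std::promise).

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper (not a Boost.Asio API): waits for `activity` to finish,
// but gives up after `timeout` so that the calling handler can return.
// Returns true if the activity finished in time, false if the wait timed out.
inline bool wait_with_timeout(std::future<void>& activity,
                              std::chrono::milliseconds timeout)
{
    assert(activity.valid());  // waiting on an empty future is undefined
    return activity.wait_for(timeout) == std::future_status::ready;
}
```

On a false return the handler can log the timeout, abandon the request and return, which frees the thread for other handlers. Within Asio itself the more usual approach is probably to avoid blocking waits in handlers altogether and to pair the asynchronous operation with a deadline timer.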
Example WebSocket server:
std::thread{ wsserver }.detach();
void wsserver()
{
    boost::asio::ip::tcp::socket socket{ ioc };
    acceptor.accept(socket);
}
The thread runs separately from the rest of the server.
I think this may lead to erroneous results in some interlinked operations,
so I want to run this work inside an infinite loop instead.
In short, I do not want to use multiple threads.
What I want is something like this:
while(loop());
int loop()
{
    boost::asio::ip::tcp::socket socket{ ioc };
    acceptor.accept(socket);
    return 1;
}
But the loop never gets to idle,
because acceptor.accept blocks until a client connects.
What can I use instead of accept so that the call does not block?
How can I get rid of the separate thread?
Can I do this with smart data?
I hope I have explained this clearly.
On the Boost website, there is a good example about timeouts for async operations. However, in that example, the socket is closed to cancel operations. There is also socket::cancel(), but both the documentation and a compiler warning state that it is problematic in terms of portability.
Among the stack of Boost.Asio timeout questions on SO, there are several kinds of answers. The first one is probably introducing a custom event loop, i.e., looping over io_service::run_one() and cancelling the event loop on deadline. I am using io_service::run() in a worker thread. That's not the kind of solution I would like to employ, if possible, as I do not want to change my code base.
A second option is directly changing the options of the native socket. However, I would like to stick to Boost.Asio if possible and avoid platform-specific code as much as possible.
The example in the documentation is for an old version of Boost.Asio, but it works properly, other than being forced to close the socket to cancel the operations. Using the documentation example, I have the following:
void check_deadline(const boost::system::error_code &ec)
{
    if(!running) {
        return;
    }
    if(timer.expires_at() <= boost::asio::deadline_timer::traits_type::now()) {
        // cancel all operations
        boost::system::error_code errorcode;
        boost::asio::ip::tcp::endpoint endpoint = socket.remote_endpoint();
        socket.close(errorcode);
        if(errorcode) {
            SLOGERROR(mutex, errorcode.message(), "check_deadline()");
        }
        else {
            SLOG(mutex, "timed out", "check_deadline()");
            // connect again
            Connect(endpoint);
            if(errorcode) {
                SLOGERROR(mutex, errorcode.message(), "check_deadline()");
            }
        }
        // set timer to infinity, so that it won't expire
        // until a proper deadline is set
        timer.expires_at(boost::posix_time::pos_infin);
    }
    // keep waiting
    timer.async_wait(std::bind(&TCPClient::check_deadline, this, std::placeholders::_1));
}
This is the only callback function registered with async_wait. The very first solution I could come up with is reconnecting after closing the socket. Now my question is: is there a better way? By a better way, I mean cancelling the operations based on a timer without actually disrupting the connection (i.e., without closing the socket).
In the Daytime.3 tutorial for boost::asio (asynchronous TCP server), the class tcp_server contains the following two methods:
void start_accept()
{
    tcp_connection::pointer new_connection =
        tcp_connection::create(acceptor_.get_io_service());
    acceptor_.async_accept(new_connection->socket(),
        boost::bind(&tcp_server::handle_accept, this, new_connection,
                    boost::asio::placeholders::error));
}

void handle_accept(tcp_connection::pointer new_connection,
    const boost::system::error_code& error)
{
    if (!error) new_connection->start(); // ***
    start_accept();
}
My concern is the line marked ***. What if this operation takes a long time to complete? Even if it doesn't, there must be some time gap between the *** line and the call to start_accept, during which the server will fail to accept incoming connections. Wouldn't it make more sense for async_accept to register an OS handler that doesn't halt when it accepts its first connection? Also, is this a real issue and how would I fix it?
The server won't "fail to accept incoming connections"; that's what the second parameter of the listen() function in the sockets API is for: connections that arrive in the meantime wait in the listen backlog. But you are correct that the server can be delayed in handling a client request. A single-threaded application that does a lot of computation will cause issues, which is why this particular example only performs I/O. If your server really does need to do something CPU-intensive, then the handler should pass that work to a task manager of some sort.
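To make "a task manager of some sort" a bit more concrete, here is a rough sketch using only the standard library. The class name TaskManager and its submit() method are invented for this illustration; it assumes one background thread is enough and that jobs are plain callables. A handler would call submit() and return straight away, leaving the I/O thread free to accept and read.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical "task manager": one background thread that runs queued jobs,
// so an I/O handler can hand off CPU-heavy work and return immediately.
class TaskManager {
public:
    TaskManager() : done_(false), worker_([this] { run(); }) {}

    ~TaskManager() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();  // the worker drains the remaining jobs before exiting
    }

    // Called from a handler: enqueue the job and return without waiting for it.
    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // done_ is set and nothing is left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // runs outside the lock, off the I/O thread
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> jobs_;
    bool done_;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

With Asio, a similar effect can be had by running a second io_service on its own threads and posting the heavy work to it; the sketch above only shows the shape of the idea.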
I currently have a very simple boost::asio server that sends a status update upon connecting (using Google Protocol Buffers):
try
{
    boost::asio::io_service io_service;
    tcp::acceptor acceptor(io_service, tcp::endpoint(tcp::v4(), 13));
    for (;;)
    {
        tcp::socket socket(io_service);
        acceptor.accept(socket);
        ...
        std::stringstream message;
        protoMsg.SerializeToOstream(&message);
        boost::system::error_code ignored_error;
        boost::asio::write(socket, boost::asio::buffer(message.str()), ignored_error);
    }
}
catch (std::exception& e) { }
I would like to extend it to first read after accepting a new connection, check what request was received, and send different messages back depending on this message. I'd also like to keep the TCP connection open so the client doesn't have to re-connect, and would like to handle multiple clients (not many, maybe 2 or 3).
I had a look at a few examples on boost asio, namely the async time tcp server and the chat server, but both are a bit over my head tbh. I don't even understand whether I need an async server. I guess I could just do a read after acceptor.accept(socket), but I guess then I wouldn't keep on listening for further requests. And if I go into a loop I guess that would mean I could only handle one client. So I guess that means I have to go async? Is there a simpler example maybe that isn't 250 lines of code? Or do I just have to bite my way through those examples? Thanks
The examples you mention from the Boost.Asio documentation are actually pretty good for seeing how things work. You're right that they might look a bit difficult at first, especially if you're new to these concepts. However, I would recommend that you start with the chat server example and get it built on your machine. This will let you look at things more closely and start changing them in order to learn how it works. Let me guide you through a few things I find important to get started.
From your description of what you want to do, the chat server seems a good starting point, as it already has pieces similar to the ones you need. An asynchronous server is what you want, because you can then quite easily handle multiple clients with a single thread. Nothing too complicated from the start.
Simplified, asynchronous in this case means that your server works off a queue: it takes a handler (task) and executes it. If there is nothing on the queue, it just waits for something to be put there. In your case that could be a connect from a client, a new message read from a client, or something like that. For this to work, each handler (the function that reacts to a particular event) needs to be set up.
Let me explain a bit using code from the chat server example.
In the server source file, you see the chat_server class which calls start_accept in the constructor. Here the accept handler gets set up.
void start_accept()
{
    chat_session_ptr new_session(new chat_session(io_service_, room_)); // 1
    acceptor_.async_accept(new_session->socket(), // 2
        boost::bind(&chat_server::handle_accept, this, new_session, // 3
                    boost::asio::placeholders::error)); // 4
}
Line 1: A chat_session object is created which represents a session between one client and the server. A session is created for the accept (no client has connected yet).
Line 2: An asynchronous accept for the socket...
Line 3: ...bound to call chat_server::handle_accept when it happens. The session is passed along to be used by the first client which connects.
Now, if we look at the handle_accept we see that upon client connect, start is called for the session (this just starts stuff between the server and this client). Lastly a new accept is put outstanding in case other clients want to connect as well.
void handle_accept(chat_session_ptr session,
    const boost::system::error_code& error)
{
    if (!error)
    {
        session->start();
    }
    start_accept();
}
This is what you want to have as well. An outstanding accept for incoming connections. And if multiple clients can connect, there should always be one of these outstanding so the server can handle the accept.
How the server and the client(s) interact is all in the session and you could follow the same design and modify this to do what you want. You mention that the server needs to look at what is sent and do different things. Take a look at chat_session and the start function which was called by the server in handle_accept.
void start()
{
    room_.join(shared_from_this());
    boost::asio::async_read(socket_,
        boost::asio::buffer(read_msg_.data(), chat_message::header_length),
        boost::bind(
            &chat_session::handle_read_header, shared_from_this(),
            boost::asio::placeholders::error));
}
What is important here is the call to boost::asio::async_read. This is what you want too. It puts an outstanding read on the socket, so the server can read what the client sends. A handler (function) is bound to this event: chat_session::handle_read_header. It is called once the header (chat_message::header_length bytes) has been read from the socket, or when an error occurs. In this handler function you could start putting your specific code to determine what to do when a particular message is received, and so on.
What is important to know is that when you call one of these asynchronous boost::asio functions, things do not happen within that call (i.e. the socket is not read during the call to async_read). This is the asynchronous aspect. You just register a handler for something, and your code is called back when it happens. Hence, the async_read call returns immediately and you're back in handle_accept for the server (if you follow how things get called). And if you remember, there we also call start_accept to set up another asynchronous accept. At this point you have two outstanding handlers, waiting for either another client to connect or the first client to send something. Whichever happens first, that specific handler will be called.
It is also important to understand that whenever a handler runs, it runs uninterrupted until everything it needs to do has been done. Other handlers have to wait, even if there are outstanding events that would trigger them.
Finally, in order to run the server you'll need the io_service which is a central concept in Asio.
io_service.run();
This is one line you see in the main function. It says that the thread (only one in the example) should run the io_service, which is the queue where handlers get enqueued when there is work to be done. When no handler is ready, the io_service just waits (blocking the main thread there, of course).
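If the queue picture is still a bit abstract, the following toy model may help. It is only an illustration written with the standard library, not how Asio is implemented: HandlerQueue, post() and run() here are stand-ins invented for this sketch, and there is no real I/O, so "events" are simply handlers posted by other handlers.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an io_service (an assumption for illustration only):
// a FIFO queue of handlers that a single thread runs one after another.
class HandlerQueue {
public:
    void post(std::function<void()> handler) {
        pending_.push_back(std::move(handler));
    }

    // Runs handlers until the queue is empty; returns how many were executed.
    std::size_t run() {
        std::size_t executed = 0;
        while (!pending_.empty()) {
            std::function<void()> handler = std::move(pending_.front());
            pending_.pop_front();
            handler();  // runs to completion; nothing else runs in the meantime
            ++executed;
        }
        return executed;
    }

private:
    std::deque<std::function<void()>> pending_;
};

// Mimics the accept -> read chain: each handler registers the next one and returns.
inline std::vector<std::string> simulate_session() {
    HandlerQueue queue;
    std::vector<std::string> log;
    queue.post([&] {
        log.push_back("accept");
        queue.post([&] { log.push_back("read header"); });  // returns at once
        log.push_back("accept returns");
    });
    queue.run();
    return log;
}
```

Calling simulate_session() records "accept", then "accept returns", and only afterwards "read header": registering the read does not run it, and the read handler gets its turn only after the accept handler has returned. One difference from the real thing: this toy run() returns as soon as the queue is empty, whereas io_service::run() keeps waiting while asynchronous operations are still outstanding.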
I hope this helps you get started with what you want to do. There is a lot of stuff you can do and things to learn. I find it a great piece of software! Good luck!
In case anyone else wants to do this, here is the minimum to get above going: (similar to the tutorials, but a bit shorter and a bit different)
class Session : public boost::enable_shared_from_this<Session>
{
    tcp::socket socket;
    char buf[1000];
public:
    Session(boost::asio::io_service& io_service)
        : socket(io_service) { }
    tcp::socket& SocketRef() { return socket; }
    void Read() {
        boost::asio::async_read(socket, boost::asio::buffer(buf),
            boost::asio::transfer_at_least(1),
            boost::bind(&Session::Handle_Read, shared_from_this(),
                        boost::asio::placeholders::error));
    }
    void Handle_Read(const boost::system::error_code& error) {
        if (!error)
        {
            //read from buffer and handle requests
            //if you want to write sth, you can do it sync. here: e.g. boost::asio::write(socket, ..., ignored_error);
            Read();
        }
    }
};
typedef boost::shared_ptr<Session> SessionPtr;

class Server
{
    boost::asio::io_service io_service;
    tcp::acceptor acceptor;
public:
    Server() : acceptor(io_service, tcp::endpoint(tcp::v4(), 13)) { }
    ~Server() { }
    void operator()() { StartAccept(); io_service.run(); }
    void StartAccept() {
        SessionPtr session_ptr(new Session(io_service));
        acceptor.async_accept(session_ptr->SocketRef(),
            boost::bind(&Server::HandleAccept, this, session_ptr,
                        boost::asio::placeholders::error));
    }
    void HandleAccept(SessionPtr session, const boost::system::error_code& error) {
        if (!error)
            session->Read();
        StartAccept();
    }
};
From what I gathered through trial and error and reading: I kick it off in the operator()() so you can have it run in the background in an additional thread. You run one Server instance. To handle multiple clients, you need an extra class, I called this a session class. For asio to clean up dead sessions, you need a shared pointer as pointed out above. Otherwise the code should get you started.
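The shared pointer part is easy to miss, so here is a small sketch of why it cleans up dead sessions. It uses std::shared_ptr and std::function instead of the boost:: types and has no sockets at all; the Session below is a stripped-down stand-in, and make_read_handler and session_dies_with_last_handler are names made up for this example.

```cpp
#include <functional>
#include <memory>

// Stripped-down stand-in for the Session class above (no socket, no Asio).
struct Session : std::enable_shared_from_this<Session> {
    int reads = 0;

    // Returns a handler that co-owns the session, which is what binding
    // shared_from_this() into the completion handler achieves.
    std::function<void()> make_read_handler() {
        std::shared_ptr<Session> self = shared_from_this();
        return [self] { ++self->reads; };
    }
};

// Returns true if the session stayed alive while a handler was pending and
// was destroyed as soon as the last pending handler was dropped.
inline bool session_dies_with_last_handler() {
    std::weak_ptr<Session> watcher;
    std::function<void()> pending;
    {
        std::shared_ptr<Session> session = std::make_shared<Session>();
        watcher = session;
        pending = session->make_read_handler();
    }  // the local owner is gone; only the pending handler keeps the session alive
    const bool alive_while_pending = !watcher.expired();
    pending();          // the "completion" can still safely touch the session
    pending = nullptr;  // no new read was started, so the last owner disappears
    return alive_while_pending && watcher.expired();
}
```

This mirrors Handle_Read above: on success it calls Read() again, which binds a fresh shared pointer into the next handler, and on error it does not, so the session is destroyed once that last handler goes away.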
I'm using Boost.Asio for network operations, they have to (and actually, can, there's no complex data structures or anything) remain pretty low level since I can't afford the luxury of serialization overhead (and the libs I found that did offer well enough performance seemed to be badly suited for my case).
The problem is with an async write I'm doing from the client (in Qt, but that should probably be irrelevant here). The callback specified in the async_write never gets called, and I'm at a complete loss as to why. The code is:
void SpikingMatrixClient::addMatrix() {
    std::cout << "entered add matrix" << std::endl;
    int action = protocol::Actions::AddMatrix;
    int matrixSize = this->ui->editNetworkSize->text().toInt();
    std::ostream out(&buf);
    out.write(reinterpret_cast<const char*>(&action), sizeof(action));
    out.write(reinterpret_cast<const char*>(&matrixSize), sizeof(matrixSize));
    boost::asio::async_write(*connection.socket(), buf.data(),
        boost::bind(&SpikingMatrixClient::onAddMatrix, this,
                    boost::asio::placeholders::error,
                    boost::asio::placeholders::bytes_transferred));
}
which calls the first write. The callback is
void SpikingMatrixClient::onAddMatrix(const boost::system::error_code& error, size_t bytes_transferred) {
    std::cout << "entered onAddMatrix" << std::endl;
    if (!error) {
        buf.consume(bytes_transferred);
        requestMatrixList();
    } else {
        QString message = QString::fromStdString(error.message());
        this->ui->statusBar->showMessage(message, 15000);
    }
}
The callback never gets called, even though the server receives all the data. Can anyone think of any reason why it might be doing that?
P.S. There was a wrapper for that connection, and yes there will probably be one again. Ditched it a day or two ago because I couldn't find the problem with this callback.
As suggested, I am posting the solution I found to be the most suitable (at least for now).
The client application is [being] written in Qt, and I need the I/O to be asynchronous. For the most part, the client receives calculation data from the server application and has to render various graphical representations of it.
Now, there are some key aspects to consider:
1. The GUI has to be responsive; it should not be blocked by the I/O.
2. The client can be connected / disconnected.
3. The traffic is pretty intense: data gets sent / refreshed to the client every few seconds, and it has to remain responsive (as per item 1).
As per the Boost.Asio documentation,
Multiple threads may call io_service::run() to set up a pool of threads from which completion handlers may be invoked.
Note that all threads that have joined an io_service's pool are considered equivalent, and the io_service may distribute work across them in an arbitrary fashion.
Note that io_service.run() blocks until the io_service runs out of work.
With this in mind, the clear solution is to run io_service.run() from another thread. The relevant code snippets are
void SpikingMatrixClient::connect() {
    Ui::ConnectDialog ui;
    QDialog *dialog = new QDialog;
    ui.setupUi(dialog);
    if (dialog->exec()) {
        QString host = ui.lineEditHost->text();
        QString port = ui.lineEditPort->text();
        // Create the io_service and its work object first, so that the
        // connection is bound to the same io_service the thread will run.
        io = boost::shared_ptr<boost::asio::io_service>(new boost::asio::io_service);
        work = boost::shared_ptr<boost::asio::io_service::work>(new boost::asio::io_service::work(*io));
        connection = TcpConnection::create(io);
        boost::system::error_code error = connection->connect(host, port);
        if (!error) {
            io_threads.create_thread(boost::bind(&SpikingMatrixClient::runIo, this, io));
        }
        QString message = QString::fromStdString(error.message());
        this->ui->statusBar->showMessage(message, 15000);
    }
}
for connecting & starting I/O, where:
work is a private boost::shared_ptr to the boost::asio::io_service::work object that keeps the io_service running,
io is a private boost::shared_ptr to a boost::asio::io_service,
connection is a boost::shared_ptr to my connection wrapper class, and the connect() call uses a resolver etc. to connect the socket (there are plenty of examples of that around),
and io_threads is a private boost::thread_group.
Surely it could be shortened with some typedefs if needed.
TcpConnection is my own connection wrapper implementation, which sort of lacks functionality for now, and I suppose I could move the whole thread handling into it when it gets reinstated. This snippet should be enough to get the idea anyway...
The disconnecting part goes like this:
void SpikingMatrixClient::disconnect() {
    work.reset();
    io_threads.join_all();
    boost::system::error_code error = connection->disconnect();
    if (!error) {
        connection.reset();
    }
    QString message = QString::fromStdString(error.message());
    this->ui->statusBar->showMessage(message, 15000);
}
Here, the work object is destroyed, so that the io_service can eventually run out of work,
the threads are joined, meaning that all work gets finished before disconnecting, so data shouldn't get corrupted,
and disconnect() calls shutdown() and close() on the socket behind the scenes and, if there's no error, the connection pointer is reset.
Note that this snippet does no error handling for a failed disconnect, but it could very well be done, either by checking the error code (which seems more C-like) or by throwing from disconnect() if the error code indicates a failure.
I encountered a similar problem (callbacks not fired), but the circumstances were different from this question (the io_service had jobs but still would not fire the handlers). I will post this anyway; maybe it will help someone.
In my program, I set up an async_connect() followed by io_service.run(), which blocks as expected.
async_connect() leads to on_connect_handler() as expected, which in turn calls async_write().
on_write_complete_handler() does not fire, even though the other end of the connection has received all the data and has even sent back a response.
I discovered that this was caused by my placing program logic in on_connect_handler(). Specifically, after the connection was established and after I called async_write(), I entered an infinite loop to perform arbitrary logic, which never let on_connect_handler() exit. I assume this prevents the io_service from executing other handlers, even if their conditions are met, because it is stuck here. (I had many misconceptions, and thought that io_service would automagically spawn threads for each async_x() call.)
Hope that helps.
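One way out, sketched below with the standard library only, is to cut the long-running logic into small steps that re-queue themselves instead of looping inside the handler. The queue here is a toy std::deque standing in for the io_service, and work_step and run_demo are hypothetical names; in real Asio code the re-queueing would presumably be done with io_service::post().

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy single-threaded handler queue standing in for an io_service.
using Queue = std::deque<std::function<void()>>;

// Does one slice of a long job, then re-queues itself instead of looping,
// so other handlers get a turn between slices.
inline void work_step(Queue& queue, std::vector<std::string>& log, int remaining) {
    if (remaining == 0) return;
    log.push_back("work");
    queue.push_back([&queue, &log, remaining] {
        work_step(queue, log, remaining - 1);
    });
}

inline std::vector<std::string> run_demo() {
    Queue queue;
    std::vector<std::string> log;
    // Stand-in for on_connect_handler: start the write (its completion is
    // modelled as a queued handler), schedule the logic, and return.
    queue.push_back([&] {
        queue.push_back([&] { log.push_back("write complete"); });
        work_step(queue, log, 3);
    });
    // Stand-in for io_service::run() on a single thread.
    while (!queue.empty()) {
        std::function<void()> handler = std::move(queue.front());
        queue.pop_front();
        handler();
    }
    return log;
}
```

In this model the write completion runs after the first slice of work rather than never, because each slice returns control to the loop. If the logic cannot be sliced, moving it to a separate thread is the other obvious option.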