Handling Boost.Asio timer cancellation - C++

Consider the following code that simulates a synchronous connect with a timeout by using an asynchronous connect:
{
    boost::asio::steady_timer timer{io_service, timeout};
    timer.async_wait([&socket](boost::system::error_code const& code) {
        if (!code)
            socket.close();
    });
    auto future = socket.async_connect(endpoint, boost::asio::use_future);
    future.get();
}
/* detect post-success cancellation? */
if (!socket.is_open()) {
}
If I understand the Asio documentation correctly, I cannot guarantee that the timer handler won't close the socket after is_open() has already returned true, because this sequence of events is possible:
connect completes successfully
timer expires, queuing the handler with code == success
timer is destroyed, but the already queued handler can't be recalled
is_open() returns true, so we think we're golden
handler runs, closing our socket because code == success
subsequent operations on the socket fail because we erroneously believed it was still open
How do I fix this code to make it safe against this scenario?
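One way to make this safe (a minimal sketch, assuming a single thread runs io_service so the two handlers below are serialized, and that <future> and <memory> are included): instead of use_future, let the connect handler itself record success and cancel the timer, and let the timer handler close the socket only while the connect is still pending. A timer handler that was already queued with code == success then becomes a harmless no-op. The connected flag and the done promise are illustrative names, not an Asio facility.
auto connected = std::make_shared<bool>(false);
auto timer = std::make_shared<boost::asio::steady_timer>(io_service, timeout);
timer->async_wait([&socket, connected](boost::system::error_code const& code) {
    // Runs on the io_service thread, serialized with the connect handler.
    if (!code && !*connected) // expired while the connect is still pending
        socket.close();
});
auto done = std::make_shared<std::promise<boost::system::error_code>>();
auto result = done->get_future();
socket.async_connect(endpoint,
    [connected, timer, done](boost::system::error_code const& code) {
        if (!code) {
            *connected = true; // seen by any later run of the timer handler
            timer->cancel();   // recall the timer if it has not fired yet
        }
        done->set_value(code);
    });
if (result.get()) {
    /* connect failed or was aborted by the deadline */
}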

Related

asio::async_write incredibly difficult to synchronize on a high volume stream

I am currently using the Asio C++ library and wrote a client wrapper around it. My original approach was very basic and only needed to stream in a single direction. Requirements have changed and I've switched over to using all asynchronous calls. Most of the migration has been easy, except for asio::async_write(...). I have tried a few different approaches and inevitably run into a deadlock with each one.
The application streams data continuously at high volume. I have stayed away from strands because they do not block and can lead to memory issues, especially when the server is under heavy load: jobs back up and the application's heap grows indefinitely.
So I created a blocking queue, only to find out the hard way that using locks across callbacks and/or blocking events leads to unknown behavior.
The wrapper is a very large class, so I will try to explain my landscape in its current state and hopefully get some good suggestions:
I have an asio::steady_timer that runs on a fixed schedule to push a heartbeat message directly into the blocking queue.
A thread dedicated to reading events and pushing them to the blocking queue
A thread dedicated to consumption of the blocking queue
For example, my queue has queue::block() and queue::unblock(), which are just wrappers around a condition variable and mutex.
std::thread consumer([this]() {
    std::string message_buffer;
    while (queue.pop(message_buffer)) {
        queue.stage_block();
        asio::async_write(*socket, asio::buffer(message_buffer),
            std::bind(&networking::handle_write, this,
                std::placeholders::_1, std::placeholders::_2));
        queue.block();
    }
});

void networking::handle_write(const std::error_code& error, size_t bytes_transferred) {
    queue.unblock();
}
When the socket backs up and the server can no longer accept data because of the current load, the queue fills up and leads to a deadlock where handle_write(...) is never called.
The other approach eliminates the consumer thread entirely and relies on handle_write(...) to pop the queue. Like so:
void networking::write(const std::string& data) {
    if (!queue.closed()) {
        std::stringstream stream_buffer;
        stream_buffer << data << std::endl;
        spdlog::get("console")->debug("pushing to queue {}", queue.size());
        queue.push(stream_buffer.str());
        if (queue.size() == 1) {
            spdlog::get("console")->debug("handle_write: {}", stream_buffer.str());
            asio::async_write(*socket, asio::buffer(stream_buffer.str()),
                std::bind(&networking::handle_write, this,
                    std::placeholders::_1, std::placeholders::_2));
        }
    }
}

void networking::handle_write(const std::error_code& error, size_t bytes_transferred) {
    std::string message;
    queue.pop(message);
    if (!queue.closed() && !queue.empty()) {
        std::string front = queue.front();
        asio::async_write(*socket, asio::buffer(queue.front()),
            std::bind(&networking::handle_write, this,
                std::placeholders::_1, std::placeholders::_2));
    }
}
This also resulted in a deadlock and obviously has other race problems. When I disabled my heartbeat callback, I had absolutely no issues. However, the heartbeat is a requirement.
What am I doing wrong? What is a better approach?
It appears all my pain derived from the heartbeat. Disabling the heartbeat in each variation of my asynchronous write operations seemed to cure my problems, which led me to believe it was a result of using the built-in asio::async_wait(...) and the asio::steady_timer.
Asio synchronizes its work internally and waits for jobs to complete before executing the next job. Using asio::async_wait(...) to construct my heartbeat functionality was my design flaw, because it operated on the same thread that waited on pending jobs. It created a deadlock with Asio when the heartbeat blocked in queue::push(...). This would explain why the asio::async_write(...) completion handler never executed in my first example.
The solution was to put the heartbeat on its own thread and let it work independently of Asio. I am still using my blocking queue to synchronize calls to asio::async_write(...), but have modified my consumer thread to use std::future and std::promise (a sketch of that consumer follows the heartbeat code below). This synchronizes the callback with my consumer thread cleanly.
std::thread networking::heartbeat_worker() {
    return std::thread([&]() {
        while (socket_opened) {
            spdlog::get("console")->trace("heartbeat pending");
            write(heartbeat_message);
            spdlog::get("console")->trace("heartbeat sent");
            std::unique_lock<std::mutex> lock(mutex);
            socket_closed_event.wait_for(lock,
                std::chrono::milliseconds(heartbeat_interval), [&]() {
                    return !socket_opened;
                });
        }
        spdlog::get("console")->trace("heartbeat thread exited gracefully");
    });
}
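A minimal sketch of the consumer described above, assuming the same members (queue, socket) as the earlier snippets and <future>; each asio::async_write is paired with a std::promise that the completion handler fulfills, so the consumer blocks until the write has completed before popping the next message:
std::thread consumer([this]() {
    std::string message_buffer;
    while (queue.pop(message_buffer)) {
        std::promise<std::error_code> write_done;
        auto result = write_done.get_future();
        asio::async_write(*socket, asio::buffer(message_buffer),
            [&write_done](const std::error_code& error, std::size_t /*bytes*/) {
                write_done.set_value(error);
            });
        if (result.get()) // write failed; stop consuming
            break;
    }
});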

Boost Asio, async_read/connect timeout

On the Boost website there is a good example of a timeout for async operations. However, in that example the socket is closed to cancel the operations. There is also socket::cancel(), but both the documentation and a compiler warning state that it is problematic in terms of portability.
Among the stack of Boost.Asio timeout questions on SO, there are several kinds of answers. The first is probably to introduce a custom event loop, i.e., loop on io_service::run_one() and cancel the event loop on deadline. I am using io_service::run() in a worker thread, so that's not the kind of solution I would like to employ if possible, as I do not want to change my code base.
A second option is directly changing the options of the native socket. However, I would like to stick to Boost.Asio and avoid any sort of platform-specific code as much as possible.
The example in the documentation is for an old version of Boost.Asio, but it works properly, apart from being forced to close the socket to cancel the operations. Using the documentation example, I have the following:
void check_deadline(const boost::system::error_code &ec)
{
    if (!running) {
        return;
    }
    if (timer.expires_at() <= boost::asio::deadline_timer::traits_type::now()) {
        // cancel all operations
        boost::system::error_code errorcode;
        boost::asio::ip::tcp::endpoint endpoint = socket.remote_endpoint();
        socket.close(errorcode);
        if (errorcode) {
            SLOGERROR(mutex, errorcode.message(), "check_deadline()");
        }
        else {
            SLOG(mutex, "timed out", "check_deadline()");
            // connect again
            Connect(endpoint);
            if (errorcode) {
                SLOGERROR(mutex, errorcode.message(), "check_deadline()");
            }
        }
        // set timer to infinity, so that it won't expire
        // until a proper deadline is set
        timer.expires_at(boost::posix_time::pos_infin);
    }
    // keep waiting
    timer.async_wait(std::bind(&TCPClient::check_deadline, this, std::placeholders::_1));
}
This is the only callback function registered to async_wait. The very first solution I could come up with is reconnecting after closing the socket. Now my question is: is there a better way? By better, I mean canceling the operations based on a timer without actually disrupting the connection (i.e., without closing the socket).
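For reference, the cancel-based variant asked about here would only change the timed-out branch: socket.cancel() completes the outstanding asynchronous operations with boost::asio::error::operation_aborted while the connection itself stays up, subject to the portability caveats documented for basic_socket::cancel(). A sketch of just that branch:
if (timer.expires_at() <= boost::asio::deadline_timer::traits_type::now()) {
    boost::system::error_code errorcode;
    socket.cancel(errorcode); // pending handlers fire with operation_aborted
    if (errorcode) {
        SLOGERROR(mutex, errorcode.message(), "check_deadline()");
    }
    // set timer to infinity, so that it won't expire
    // until a proper deadline is set
    timer.expires_at(boost::posix_time::pos_infin);
}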

Check for data with timing?

Is there a way to check for data for a certain amount of time in Asio?
I have a client with an Asio socket which has a method:
bool ASIOClient::hasData()
{
    return m_socket->available();
}
I'd like to have some kind of delay here, so that it checks for data for at most 1 second and returns earlier if data arrives. Moreover, I don't want to poll it, for the obvious reason that it might take a second. The reason I use this is that I send data to a client and wait for the response; if it doesn't respond within a certain time, I'd close the socket. That's what hasData is meant for.
I know that this is natively possible with a select and an fd_set.
The Asio client is created in an accept method of the server socket class and is later used to handle requests and send back data to the one who connected:
int ASIOServer::accept(const bool& blocking)
{
    auto l_sock = std::make_shared<asio::ip::tcp::socket>(m_io_service);
    m_acceptor.accept(*l_sock);
    auto l_client = std::make_shared<ASIOClient>(l_sock);
    return 0;
}
You just need to attempt to read.
The usual approach is to define deadlines for all asynchronous operations that could take "long" (or even indefinitely long).
This is quite natural in asynchronous execution:
Just add a deadline timer:
boost::asio::deadline_timer tim(svc);
tim.expires_from_now(boost::posix_time::seconds(2));
tim.async_wait([&](error_code ec) {
    if (!ec) // timer was not canceled, so it expired
    {
        socket_.cancel(); // cancel pending async operation
    }
});
If you want to use it with synchronous calls, you can, with judicious use of poll() instead of run(). See this answer: boost::asio + std::future - Access violation after closing socket, which implements a helper await_operation that runs a single operation synchronously but under a timeout.
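Putting the timer and a read together for the original hasData() use case, a minimal sketch (using the Boost flavor of the snippet above; svc, m_socket and data are assumed names) that waits for data for at most one second on a single-threaded io_service:
bool got_data = false;
boost::asio::deadline_timer tim(svc);
tim.expires_from_now(boost::posix_time::seconds(1));
tim.async_wait([&](boost::system::error_code ec) {
    if (!ec)
        m_socket->cancel(); // deadline reached: abort the pending read
});
m_socket->async_read_some(boost::asio::buffer(data),
    [&](boost::system::error_code ec, std::size_t n) {
        tim.cancel(); // data (or an error) arrived: stop the deadline
        got_data = !ec && n > 0;
    });
svc.reset(); // allow run() to be invoked again if it returned earlier
svc.run();   // returns once both handlers have completed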

Boost asio stops processing after some amount of time

I have a server application which uses the Boost.Asio framework. The application logic is simple: it listens on several ports for incoming connections, accepts them, does some processing and closes the connection. Of course, multiple clients are allowed to connect to the server at the same time. I use the asynchronous approach to accept connections and to read and write data. The problem is that at some point in time io_service just stops processing handlers.
Let me describe the symptoms in more detail. After the problem appears, the application continues to listen on the specified ports, and the netstat command can verify that. A client can establish a connection to the server, but not a single handler (Server::Session) is called.
Here is the code that accepts connections:
void Server::StartAccept()
{
    socket_ptr sock(new boost::asio::ip::tcp::socket(ioService_));
    acceptor_.async_accept(*sock,
        boost::bind(&Server::Session, shared_from_this(), sock,
            boost::asio::placeholders::error));
}

void Server::Session(socket_ptr sock, const boost::system::error_code& error)
{
    StartAccept();
    if (error)
    {
        boost::system::error_code ec;
        sock->shutdown(boost::asio::ip::tcp::socket::shutdown_both, ec);
        sock->close(ec);
        return;
    }
    //Processing...
}
Here is the code which starts the server:
void run_service()
{
    for (;;)
    {
        try
        {
            io_service.run();
            break;
        }
        catch (...)
        {
        }
    }
}

boost::thread_group threads;
for (int i = 0; i < size; ++i)
    threads.create_thread(run_service);
threads.join_all();
I found out that if I replace the line
io_service.run();
with
while (!io_service.stopped())
    io_service.run_one();
then this loop gets stuck as soon as the problem appears, and run_one() never returns.
My assumptions on why that could happen:
One of the handlers that was called never returns.
This is some sort of deadlock in Boost internals (because I don't do any locking).
The questions are:
What other reasons could there be for such strange behaviour?
What is the best way to fix that?
How can I figure out which handler run_one() is executing before it gets stuck?
The problem was in a handler which waited for another network activity to finish. That activity didn't have a timeout and in some cases lasted forever. Thanks for the comments. Defining BOOST_ASIO_ENABLE_HANDLER_TRACKING is a really good step towards detecting the problem.
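For reference, handler tracking is a compile-time switch: define the macro before any Asio header is included (or pass -DBOOST_ASIO_ENABLE_HANDLER_TRACKING on the compiler command line), and Asio logs every handler's creation, invocation and completion to stderr as "@asio|..." lines, which makes it possible to see which handler run_one() dispatched last before the hang:
#define BOOST_ASIO_ENABLE_HANDLER_TRACKING
#include <boost/asio.hpp>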

Checking if a boost timed thread has completed

I have been reading the Boost thread documentation and cannot find an example of what I need. I need to run a method in a timed thread, and if it has not completed within a number of milliseconds, raise a timeout error.
So I have a method called invokeWithTimeout() that looks like this:
// Method to invoke a request with a timeout.
bool devices::server::CDeviceServer::invokeWithTimeout(CDeviceClientRequest& request,
                                                       CDeviceServerResponse& response)
{
    // Retrieve the timeout from the device.
    int timeout = getTimeout();
    timeout += 100; // Add 100ms to cover invocation time.

    // TODO: insert code here.

    // Invoke the request on the device.
    invoke(request, response);

    // Return success.
    return true;
}
I need to call invoke(request, response), and if it has not completed within the timeout, the method needs to return false.
Can someone supply a quick boost::thread example of how to do this, please?
Note: The timeout is in milliseconds. Both getTimeout() and invoke() are pure virtual functions that have been implemented in the device subclasses.
Simplest solution: Launch invoke in a separate thread and use a future to indicate when invoke finishes:
boost::promise<void> p;
boost::future<void> f = p.get_future();
boost::thread t([&]() { invoke(request, response); p.set_value(); });

bool did_finish = (f.wait_for(boost::chrono::milliseconds(timeout)) == boost::future_status::ready);
did_finish will be true if and only if the invoke finished before the timeout.
The interesting question is what to do if that is not the case. You still need to shut down the thread t gracefully, so you will need some mechanism to cancel the pending invoke and do a proper join before destroying the thread. While in theory you could simply detach the thread, that is a very bad idea in practice: you lose all means of interacting with the thread and could, for example, end up with hundreds of deadlocked threads without noticing.
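For illustration, the timeout branch might look like the sketch below; cancelInvoke() is a hypothetical hook for whatever abort mechanism the device subclass provides, not a Boost facility:
if (!did_finish) {
    cancelInvoke(); // hypothetical: unblocks the pending invoke(request, response)
    t.join();       // the thread can now finish; join before destroying it
    return false;   // report the timeout to the caller
}
t.join();
return true;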