async_connect() timeout with multiple threads performing io_service.run() - c++

I am trying to implement async_connect() with a timeout.
async_connect_with_timeout(socket_type & s,
    std::function<void(BoostAndCustomError const & error)> const & connect_handler,
    time_type timeout);
When the operation completes, connect_handler(error) is called with an error indicating the result (including timeout).
I was hoping to use the code from the Boost 1.51 timeouts example. The biggest difference is that I am using multiple worker threads calling io_service.run().
What changes are necessary to keep the example code working?
My issues are:
When calling:
Start() {
    socket_.async_connect(HandleConnect);
    deadline_.async_wait(HandleTimeout);
}
HandleConnect() can run in another thread even before async_wait() is called (unlikely, but possible). Do I have to strand-wrap Start(), HandleConnect(), and HandleTimeout()?
What if HandleConnect() is called first without error, but deadline_timer.cancel() or deadline_timer.expires_from_now() fails because HandleTimeout() has already "been queued for invocation in the near future"? It looks like the example code lets HandleTimeout() close the socket in that case. Such behavior (the timer closing the connection after we have happily started other operations following the connect) can easily lead to a serious headache.
What if HandleTimeout() and socket.close() are called first? Is it possible for HandleConnect() to already be queued without an error? The documentation says: "Any asynchronous send, receive or connect operations will be cancelled immediately, and will complete with the boost::asio::error::operation_aborted error". What does "immediately" mean in a multithreaded environment?

You should wrap each handler with a strand if you want to prevent their parallel execution in different threads. I guess some completion handlers would access socket_ or the timer, so you'll definitely have to wrap Start() with the strand as well. But wouldn't it be much simpler to use the io_service-per-CPU model, i.e. to base your application on an io_service pool? IMHO, you'd get much less headache.
Yes, it's possible. Why is it a headache? The socket gets closed because of a "false timeout", and you start the re-connection (or whatever) procedure just as if it were closed due to a network failure.
Yes, it's also possible, but again, it shouldn't cause any problem for a correctly designed program: if in HandleConnect you try to issue some operation on a closed socket, you'll get the appropriate error. Anyway, when you attempt to send/receive data you don't really know the current socket/network status.
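For illustration, a minimal sketch of such strand wrapping (strand_, endpoint_, socket_ and deadline_ are assumed members, not taken from the question's code):
void Start() {
    strand_.post([this] {                 // enter the strand before touching socket_/deadline_
        socket_.async_connect(endpoint_,
            strand_.wrap([this](boost::system::error_code const& ec) {
                HandleConnect(ec);        // serialized with HandleTimeout
            }));
        deadline_.async_wait(
            strand_.wrap([this](boost::system::error_code const& ec) {
                HandleTimeout(ec);        // serialized with HandleConnect
            }));
    });
}
Here strand_ would be a boost::asio::io_service::strand member; wrap() guarantees the two handlers never run concurrently even with multiple threads calling io_service.run().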

Related

Design pattern to ensure on_close() is called once after all async r/w's are finished?

This question is asked from the context of Boost ASIO (C++).
Say you are using a library to do some async i/o on a socket, where:
you are always waiting to receive data
you occasionally send some data
Since you are always waiting to receive data (e.g. you trigger another async_read() from your completion handler), at any given time you will be in one of two scenarios:
1. an async read operation in progress
2. an async read operation and an async write operation in progress
Now say you wanted to call some other function, on_close(), when the connection closes. In Boost ASIO, a connection error or cancel() will cause any outstanding async reads/writes to give an error to your completion handler. But there is no guarantee whether you are in scenario 1 or 2, nor is there a guarantee that the write will error before the read or vice versa. So to implement this, I can only imagine adding two variables, is_reading and is_writing, which are set to true by async_read() and async_write() respectively, and set to false by the completion handlers. Then, from either completion handler, when there is an error and I think the connection may be closing, I would check whether there is still an async operation in the opposite direction, and call on_close() if not.
The code, more or less:
atomic_bool is_writing;
atomic_bool is_reading;
...
void read_callback(error_code& error, size_t bytes_transferred)
{
    is_reading = false;
    if (error)
    {
        if (!is_writing) on_close();
    }
    else
    {
        process_data(bytes_transferred);
        async_read(BUF_SIZE); // this will set is_reading to true
    }
}

void write_callback(error_code& error, size_t bytes_transferred)
{
    is_writing = false;
    if (error)
    {
        if (!is_reading) on_close();
    }
}
Assume that this is a single-threaded app, but the thread is handling multiple sockets so you can't just let the thread end.
Is there a better way to design this? To make sure on_close() is called after the last async operation finishes?
One of the most common patterns is to use enable_shared_from_this and bind all completion handlers ("continuations") to the shared pointer.
That way, when the async call chain ends (be it due to error or regular completion), the object the shared_ptr refers to is freed.
You can see many, many examples of mine using Asio/Beast on this site.
You can put your close logic in a destructor, or, if that too involves async calls, you can post it on the same strand/chain.
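A minimal sketch of that pattern (class and member names are hypothetical, not from the question): the session is kept alive solely by the handlers holding shared_from_this(), so when the last pending operation completes, the destructor runs, and the close logic with it:
#include <boost/asio.hpp>
#include <array>
#include <memory>

class session : public std::enable_shared_from_this<session> {
public:
    explicit session(boost::asio::ip::tcp::socket socket)
        : socket_(std::move(socket)) {}

    ~session() { /* on_close logic: runs exactly once, after the last handler */ }

    void start() { do_read(); }

private:
    void do_read() {
        auto self = shared_from_this();            // keeps *this alive while pending
        socket_.async_read_some(boost::asio::buffer(buf_),
            [this, self](boost::system::error_code ec, std::size_t n) {
                if (ec) return;                    // chain ends; self is released
                // ... consume n bytes from buf_ ...
                do_read();                         // continue the chain
            });
    }

    boost::asio::ip::tcp::socket socket_;
    std::array<char, 1024> buf_;
};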
Advanced Ideas
If your traffic is full-duplex and one side fails in a way that necessitates cancelling the other direction, you can post the cancellation on the strand, and the async call will abort (e.g. with error_code boost::asio::error::operation_aborted).
Even more involved would be to create a custom IO service, where the lifetime of certain "backend" entities is governed by "handle" types. This is probably overkill in most cases, but if you are writing a foundational framework that will be used in a large number of places, you might consider it. I think this is a good starter: How to design proper release of a boost::asio socket or wrapper thereof (be sure to follow the comment links).
You can leave the error-handling logic inside read_callback only: since a read is always pending, a connection failure will surface there as well.

How to detect a closed connection on a socket without a blocking call in boost asio

I've written a tcp communication scheme using boost asio and it works quite well if I use a separate thread for io_service::run(). However, the communicator should also be used as part of an MPI-parallel program (please do not question this), where forking and threading is not a good idea and may not be permitted.
Therefore, the communicator has a function work() which calls io_service::run_one. Now, if I directly use async_read, it will block in the call to run_one until something is read or an error occurs. So I wrote the check_available function shown below, which first checks whether there is something on the socket before calling async_read in my function read_message_header.
This also works smoothly until the peer closes the connection. Unfortunately, this seems not to be flagged as an eof error by socket::available(error), as it is in async_read; it returns 0 as the number of available bytes, so read_message_header is never called, which would otherwise detect the eof. Checking socket::is_open() did not work for this purpose either, because only the peer's socket is closed, not the receiving socket on this instance.
void TCPConnection::check_available() {
    if (!socket_.is_open()) {
        handle_read_error(boost::asio::error::eof);
        return;
    }
    boost::system::error_code error;
    size_t nbytes = socket_.available(error); // does not detect the eof
    if (error) {
        handle_read_error(error);
        return;
    }
    if (nbytes > 0)
        read_message_header();
    else
        socket_.get_io_service().post(
            boost::bind(
                &TCPConnection::check_available,
                shared_from_this()
            )
        );
}
Is there a way to detect the eof without any blocking calls?
I have a couple of remarks to make:
"which calls io_service::run_one. Now, if I directly use async_read, it will block in the call to run_one until something is read or an error occurs"
You can easily subvert this blocking behaviour by posting a fake work handler into the io_service queue: as long as there are handlers to execute, run_one won't wait for any events; it just returns immediately.
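A minimal sketch of that trick applied to the questioner's work() wrapper (the exact shape of work() is assumed):
void TCPConnection::work() {
    boost::asio::io_service& ios = socket_.get_io_service();
    ios.post([] { /* no-op: guarantees run_one() has a handler to run */ });
    ios.run_one();  // executes a real completion if one is ready, else the no-op
}
Either way, run_one() returns immediately instead of waiting on the socket.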
Speaking of size_t nbytes = socket_.available(error): available() returns the number of bytes that may be read without blocking, and as mentioned in a comment, you may need to actually read those bytes to see the EOF; available() has nothing to do with it.

Force asynchronous socket read to finish early using Boost.Asio

I have a tcp::socket for reading and writing data. There is a read loop made up of chained async_read_some() calls and the handler on_data_read(), which calls async_read_some() again after dealing with the data read. A shared_ptr<tcp::socket> is passed along the loop. The loop ends when on_data_read() is called with a non-success error_code such as asio::error::eof, in which case async_read_some() is not called.
There may also be asynchronous writes on the socket, which is done by repeated calls to async_write_some() until all data is written. That is, the handler on_data_written() for async_write_some() calls async_write_some() again if the data is only partially written. The relevant shared_ptr<tcp::socket> is also passed along the call-chain.
Now, I want to close the socket in a safe manner. Specifically, I want to force the asynchronous read to finish early with a non-success error code to end the read loop. If there is no pending write call-chain, the only shared_ptr<tcp::socket> (the one passed along the read loop) gets destroyed, with its managed tcp::socket closed and destroyed. If there is a pending write call-chain, it continues until all data is written. When the write call-chain reaches its end, the last shared_ptr<tcp::socket> (the one passed along the write call-chain) gets destroyed. This procedure is safe in the sense that the socket is closed only after any pending writes.
The problem is
how can I force the asynchronous socket read to finish with a non-success error code?
I've checked the linger option. But it won't work, since I'm using chained-up async_write_some() instead of a single async_write(), so the socket may be closed while on_data_written() is being called. cancel() and close() won't work either, since they interrupt not only the read loop but also the write call-chain. And although shutdown() can be applied to the read direction only, it prevents future async_read_some() calls only and has no effect on what is already in progress. I've worked out a workaround: call cancel(), but have on_data_written() ignore the error code caused by cancel() and continue the write call-chain. I'm not satisfied with this solution (see the remarks section here). I'm wondering if there is a more direct and elegant way to achieve what I want, or whether the whole design is just flawed?
In my opinion you've summed it up pretty nicely.
You cannot really do better than a full-blown cancel. Indeed, you may resume any cancelled writes.
I don't think there is anything more elegant. I would not say the design is flawed, but you might want to consider not actually cancelling pending operations, and instead just keeping a flag indicating whether a "logical read cancel" is pending, and prevent chaining more reads in that case.
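A minimal sketch of that flag-based idea (names are hypothetical; a plain bool is enough if all handlers run on one thread or strand):
#include <boost/asio.hpp>
#include <array>
#include <memory>

std::array<char, 4096> read_buf;    // read buffer (sketch only)
bool read_cancel_pending = false;   // flip to true for a "logical read cancel"

void on_data_read(std::shared_ptr<boost::asio::ip::tcp::socket> sock,
                  boost::system::error_code ec, std::size_t n)
{
    if (ec || read_cancel_pending)
        return;  // do not chain another read; the loop's shared_ptr dies here
    // ... process the n bytes in read_buf ...
    sock->async_read_some(boost::asio::buffer(read_buf),
        [sock](boost::system::error_code e, std::size_t m) {
            on_data_read(sock, e, m);
        });
}
The write chain stays untouched; once it drains, its shared_ptr releases the socket exactly as described in the question.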
When you used shutdown, e.g. socket.shutdown(...), did you use the shutdown_both option? That should handle both read and write closures:
socket.shutdown(boost::asio::ip::tcp::socket::shutdown_both, errorcode);
if (errorcode)
{
    cerr << "socket.shutdown error: " << errorcode.message() << endl;
}
Also, if you have the io_service handle, you can call io_service.stop() as a last resort, which will shut down all operations.
Shut down the socket for input. That will cause all pending reads to encounter end-of-stream and return accordingly.
And although shutdown() can be applied to the read loop only, it prevents future async_read_some() calls only, and has no effect on what is already done.
I don't know where you got this misinformation from. I don't even know what it means. Shutdown applies to the socket, not a read loop.
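For reference, a minimal sketch of the input-side shutdown this answer describes (socket_ assumed to be an open tcp::socket); per the answer, pending and subsequent reads then complete with boost::asio::error::eof while the write chain proceeds untouched:
boost::system::error_code ec;
socket_.shutdown(boost::asio::ip::tcp::socket::shutdown_receive, ec);
if (ec)
    std::cerr << "shutdown(receive) failed: " << ec.message() << "\n";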

IOCP: If operation returns immediately with error, can I still receive completion notification?

If FILE_SKIP_COMPLETION_PORT_ON_SUCCESS is not enabled, then even if the operation completes immediately with success, I still get a completion notification on the completion port. I'd like to know if this is the case if it completes immediately with errors as well.
I process completions with handlers that I store as std::function in an extended OVERLAPPED struct; they are executed by the thread pool looping on the completion port. Having FILE_SKIP_COMPLETION_PORT_ON_SUCCESS disabled means that I don't have to worry about handlers forming a recursive chain and, in the worst case, running out of stack space if operations often complete immediately. With the skip enabled, the handler for a new operation would have to be called immediately whenever the operation returns right away.
The issue is that the handlers are supposed to execute both on success and on error. However, I don't know whether an overlapped ReadFile/WriteFile/WSARecv/WSASend that returns immediately with an error would still queue a completion packet, so that it can be handled by the thread pool just as in the success case. Is this doable? Is it something that only applies to certain types of errors and not others? Workarounds?
This knowledge base article says that SUCCESS and ERROR_IO_PENDING result in a completion packet being generated and other results do not.
See Tip 4
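Assuming that article is right that an immediate failure queues no packet, one workaround is to queue the packet yourself on the error path so the stored handler still runs on a pool thread. A rough Win32 sketch, where sock, wsabuf, ctx (the extended OVERLAPPED) and iocp are assumed to exist, and ctx->error is a hypothetical field:
DWORD flags = 0;
int rc = WSARecv(sock, &wsabuf, 1, nullptr, &flags, &ctx->overlapped, nullptr);
if (rc == SOCKET_ERROR) {
    DWORD err = WSAGetLastError();
    if (err != WSA_IO_PENDING) {
        // Immediate failure: no completion packet is coming, so post one
        // manually; the pool thread then invokes the stored handler as usual.
        ctx->error = err;  // let the handler see why the operation failed
        PostQueuedCompletionStatus(iocp, 0, 0, &ctx->overlapped);
    }
}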
Based on this blog from Raymond Chen, all completions will be queued to the completion port even if the operation completes synchronously (successfully or with an error condition).

Boost asio - stopping io_service

I'm using boost::asio to do some very basic UDP packet collection. The io_service object is instantiated in a worker thread, and io_service.run() is called from inside that thread. My problem is getting io_service.run() to return when I am done collecting packets.
I'm not clear on what methods of io_service can be called from other threads when it comes time to stop my worker thread. I have a reference to the io_service object, and from a different thread I make this call:
ios.dispatch( boost::bind( &udp_server::handle_kill, this ) );
In my udp_server class, the handler for that function cancels the pending work from a single boost::asio::ip::udp::socket and a single boost::asio::deadline_timer object. Both have pending async work to do. At that point I call ios.stop():
void udp_server::handle_kill()
{
    m_socket.cancel();
    m_timer.cancel();
    m_ios.stop();
}
With no work pending, I expect at this point that my call to ios.run() should return - but this does not happen.
So why does it not return? The most likely explanation to me is that I shouldn't be calling io_service::dispatch() from another thread. But the dispatch() method kind of seems like it was built to do just that - dispatch a function call in the thread that io_service::run() is working in. And it seems to do just that.
So this leaves me with a few related questions:
Am I using io_service::dispatch() correctly?
If all tasks are canceled, is there any reason that io_service::run() should not return?
socket::udp::cancel() doesn't seem to be the right way to close a socket and abort all work. What is the right way?
asio is behaving pretty well for me, but I need to get a better understanding of this bit of architecture.
More data
socket::udp::cancel() is apparently an unsupported operation on an open socket under Win32 - so this operation fails by throwing an exception - which does in fact cause an exit from io_service::run(), but definitely not the desired exit.
socket::udp::close() doesn't seem to cancel the pending async_receive_from() task, so calling it instead of socket::udp::cancel() seems to leave the thread somewhere inside io_service::run().
Invoking io_service::stop from another thread is safe; this is well described in the documentation:
Thread Safety
Distinct objects: Safe.
Shared objects: Safe, with the exception that calling reset() while there are unfinished run(), run_one(), poll() or poll_one() calls results in undefined behaviour.
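So, per that quote, the controlling thread can simply call stop() directly instead of dispatching a handler; a minimal sketch (worker is assumed to be the thread running run()):
#include <boost/asio.hpp>
#include <boost/thread.hpp>

void shutdown_io(boost::asio::io_service& ios, boost::thread& worker)
{
    ios.stop();     // thread-safe; run() in the worker returns soon after
    worker.join();  // then cancel/close sockets and timers from one thread
}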
As the comments to your question indicate, you really need to boil this down to a reproducible example.