SO_SNDTIMEO and SO_RCVTIMEO not affecting Boost.Asio operations - c++

Below is my code
boost::asio::io_service io;
boost::asio::ip::tcp::acceptor::reuse_address option(true);
boost::asio::ip::tcp::acceptor accept(io);
boost::asio::ip::tcp::resolver resolver(io);
boost::asio::ip::tcp::resolver::query query("0.0.0.0", "8080");
boost::asio::ip::tcp::endpoint endpoint = *resolver.resolve(query);
accept.open(endpoint.protocol());
accept.set_option(option);
accept.bind(endpoint);
accept.listen(30);
boost::asio::ip::tcp::socket ps(io);
accept.accept(ps);
struct timeval tv;
tv.tv_sec = 1;
tv.tv_usec = 0;
//setsockopt(ps.native(), SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
setsockopt(ps.native(), SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
char buf[1024];
ps.async_receive(boost::asio::buffer(buf, 1024), boost::bind(fun));
io.run();
When I connect with Telnet but do not send any data, the connection is not dropped after the timeout. What do I need to do to make setsockopt take effect?
Thanks!
I have changed SO_RCVTIMEO to SO_SNDTIMEO. It still does not time out within the specified time.

Using SO_RCVTIMEO and SO_SNDTIMEO socket options with Boost.Asio will rarely produce the desired behavior. Consider using either of the following two patterns:
Composed Operation With async_wait()
One can compose an asynchronous read operation with a timeout by pairing a Boost.Asio timer and its async_wait() operation with an async_receive() operation. This approach is demonstrated in the Boost.Asio timeout examples, and looks something like:
// Start a timeout for the read.
boost::asio::deadline_timer timer(io_service);
timer.expires_from_now(boost::posix_time::seconds(1));
timer.async_wait(
[&socket, &timer](const boost::system::error_code& error)
{
// On error, such as cancellation, return early.
if (error) return;
// Timer has expired, but the read operation's completion handler
// may have already run, setting expiration to be in the future.
if (timer.expires_at() > boost::asio::deadline_timer::traits_type::now())
{
return;
}
// The read operation's completion handler has not run.
boost::system::error_code ignored_ec;
socket.close(ignored_ec);
});
// Start the read operation.
socket.async_receive(buffer,
[&socket, &timer](const boost::system::error_code& error,
std::size_t bytes_transferred)
{
// Update timeout state to indicate the handler has run. This
// will cancel any pending timeouts.
timer.expires_at(boost::posix_time::pos_infin);
// On error, such as cancellation, return early.
if (error) return;
// At this point, the read was successful and buffer is populated.
// However, if the timeout occurred and its completion handler ran first,
// then the socket is closed (!socket.is_open()).
});
Be aware that it is possible for both asynchronous operations to complete in the same iteration, making both completion handlers ready to run with success. This is why both completion handlers need to update and check state. See this answer for more details on how to manage state.
Use std::future
Boost.Asio provides support for C++11 futures. When boost::asio::use_future is provided as the completion handler to an asynchronous operation, the initiating function will return a std::future that will be fulfilled once the operation completes. As std::future supports timed waits, one can leverage it for timing out an operation. Do note that as the calling thread will be blocked waiting for the future, at least one other thread must be processing the io_service to allow the async_receive() operation to progress and fulfill the promise:
// Use an asynchronous operation so that it can be cancelled on timeout.
std::future<std::size_t> read_result = socket.async_receive(
buffer, boost::asio::use_future);
// If timeout occurs, then cancel the read operation.
if (read_result.wait_for(std::chrono::seconds(1)) ==
std::future_status::timeout)
{
socket.cancel();
}
// Otherwise, the operation completed (with success or error).
else
{
// If the operation failed, then read_result.get() will throw a
// boost::system::system_error.
auto bytes_transferred = read_result.get();
// process buffer
}
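The timed wait itself can be tried without Boost. The following is a minimal sketch using only the standard library; start_fake_read() and wait_with_timeout() are hypothetical names, and the worker thread merely stands in for the thread that would be running the io_service:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <thread>

// Hypothetical stand-in for an asynchronous read. A worker thread produces
// the byte count after `delay`, much as a thread running the io_service
// would fulfil the future returned by async_receive(..., use_future).
std::future<std::size_t> start_fake_read(std::chrono::milliseconds delay,
                                         std::size_t bytes)
{
    return std::async(std::launch::async, [delay, bytes]() {
        std::this_thread::sleep_for(delay);
        return bytes;
    });
}

// Waits up to `timeout` for the result. Returns true and stores the byte
// count on completion; returns false on timeout, which is the point where
// real code would call socket.cancel().
bool wait_with_timeout(std::future<std::size_t>& result,
                       std::chrono::milliseconds timeout,
                       std::size_t& bytes_transferred)
{
    assert(result.valid());
    if (result.wait_for(timeout) == std::future_status::timeout)
        return false;
    bytes_transferred = result.get();
    return true;
}
```

Calling wait_with_timeout() with a 50 ms timeout on a fake read that takes 500 ms returns false, while a fake read that finishes within the timeout yields its byte count. Note that a future obtained from std::async blocks in its destructor until the task finishes, which is a property of this stand-in and not of the future returned by Boost.Asio.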
Why SO_RCVTIMEO Will Not Work
System Behavior
The SO_RCVTIMEO documentation notes that the option only affects system calls that perform socket I/O, such as read() and recvmsg(). It does not affect event demultiplexers, such as select() and poll(), that only watch the file descriptors to determine when I/O can occur without blocking. Furthermore, when a timeout does occur, the I/O call fails returning -1 and sets errno to EAGAIN or EWOULDBLOCK.
Specify the receiving or sending timeouts until reporting an error. [...] if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK [...] Timeouts only have effect for system calls that perform socket I/O (e.g., read(), recvmsg(), [...]; timeouts have no effect for select(), poll(), epoll_wait(), and so on.
When the underlying file descriptor is set to non-blocking, system calls performing socket I/O will return immediately with EAGAIN or EWOULDBLOCK if resources are not immediately available. For a non-blocking socket, SO_RCVTIMEO will not have any effect, as the call will return immediately with success or failure. Thus, for SO_RCVTIMEO to affect system I/O calls, the socket must be blocking.
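The three cases described above (blocking recv(), poll(), and non-blocking recv()) can be observed on a plain socket pair. This is a rough sketch for a POSIX system; the helper names are made up for illustration, and the socket pair never receives any data:

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Outcome of one call on a socket that never receives any data.
struct Attempt {
    long elapsed_ms;   // wall-clock time spent inside the call
    bool would_block;  // recv() failed with EAGAIN or EWOULDBLOCK
};

static long ms_since(std::chrono::steady_clock::time_point start)
{
    return static_cast<long>(
        std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - start).count());
}

// Creates a connected socket pair and sets SO_RCVTIMEO on fds[0].
void make_pair_with_rcvtimeo(int fds[2], int timeout_ms)
{
    int rc = ::socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    rc = ::setsockopt(fds[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
    assert(rc == 0);
    (void)rc;
}

// Calls recv() once and reports how long it took and how it failed.
Attempt timed_recv(int fd)
{
    char buf[16];
    auto start = std::chrono::steady_clock::now();
    ssize_t n = ::recv(fd, buf, sizeof buf, 0);
    int err = errno;
    Attempt a;
    a.elapsed_ms = ms_since(start);
    a.would_block = (n == -1) && (err == EAGAIN || err == EWOULDBLOCK);
    return a;
}

// Waits for readability with poll() and reports how long the wait took.
long timed_poll(int fd, int poll_timeout_ms)
{
    pollfd p;
    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;
    auto start = std::chrono::steady_clock::now();
    ::poll(&p, 1, poll_timeout_ms);
    return ms_since(start);
}

// Puts the descriptor into non-blocking mode.
void set_non_blocking(int fd)
{
    int flags = ::fcntl(fd, F_GETFL, 0);
    ::fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

With SO_RCVTIMEO set to 200 ms, timed_recv() on the blocking descriptor should come back after roughly 200 ms with would_block set, timed_poll() with a 500 ms argument should still take the full 500 ms, and after set_non_blocking() the same timed_recv() should fail almost immediately.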
Boost.Asio Behavior
First, asynchronous I/O operations in Boost.Asio will use an event demultiplexer, such as select() or poll(). Hence, SO_RCVTIMEO will not affect asynchronous operations.
Next, Boost.Asio's sockets have the concept of two non-blocking modes (both of which default to false):
native_non_blocking() mode that roughly corresponds to the file descriptor's non-blocking state. This mode affects system I/O calls. For example, if one invokes socket.native_non_blocking(true), then recv(socket.native_handle(), ...) may fail with errno set to EAGAIN or EWOULDBLOCK. Anytime an asynchronous operation is initiated on a socket, Boost.Asio will enable this mode.
non_blocking() mode that affects Boost.Asio's synchronous socket operations. When set to true, Boost.Asio will set the underlying file descriptor to be non-blocking and synchronous Boost.Asio socket operations can fail with boost::asio::error::would_block (or the equivalent system error). When set to false, Boost.Asio will block, even if the underlying file descriptor is non-blocking, by polling the file descriptor and re-attempting system I/O operations if EAGAIN or EWOULDBLOCK are returned.
The behavior of non_blocking() prevents SO_RCVTIMEO from producing desired behavior. Assuming socket.receive() is invoked and data is neither available nor received:
If non_blocking() is false, the system I/O call will time out per SO_RCVTIMEO. However, Boost.Asio will then immediately block polling on the file descriptor to be readable, which is not affected by SO_RCVTIMEO. The final result is the caller blocked in socket.receive() until either data has been received or failure, such as the remote peer closing the connection.
If non_blocking() is true, then the underlying file descriptor is also non-blocking. Hence, the system I/O call will ignore SO_RCVTIMEO, immediately return with EAGAIN or EWOULDBLOCK, causing socket.receive() to fail with boost::asio::error::would_block.
Ideally, for SO_RCVTIMEO to function with Boost.Asio, one needs native_non_blocking() set to false so that SO_RCVTIMEO can take effect, but also have non_blocking() set to true to prevent polling on the descriptor. However, Boost.Asio does not support this:
socket::native_non_blocking(bool mode)
If the mode is false, but the current value of non_blocking() is true, this function fails with boost::asio::error::invalid_argument, as the combination does not make sense.
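To make the non_blocking() == false case concrete, here is a small emulation of the retry loop as it is described above. It is not Boost.Asio's actual source; receive_with_repoll(), set_receive_timeout(), and the poll_budget_ms bound are assumptions added so the demonstration terminates, whereas the loop described above would wait indefinitely:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

// Demo-only sentinel: the bounded poll() gave up. The loop described in the
// text has no such bound and would simply keep waiting.
const ssize_t kStillWaiting = -2;

// Sets SO_RCVTIMEO on a descriptor; returns true on success.
bool set_receive_timeout(int fd, int timeout_ms)
{
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return ::setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == 0;
}

// Attempts the read; on EAGAIN/EWOULDBLOCK waits for readability and retries.
ssize_t receive_with_repoll(int fd, char* buf, std::size_t len,
                            int poll_budget_ms)
{
    assert(len > 0);
    for (;;) {
        ssize_t n = ::recv(fd, buf, len, 0);
        if (n >= 0) return n;
        if (errno != EAGAIN && errno != EWOULDBLOCK) return -1;

        // recv() gave up, either because SO_RCVTIMEO fired or because the
        // descriptor is non-blocking. The wait below is a poll(), so
        // SO_RCVTIMEO has no influence on how long it lasts.
        pollfd p;
        p.fd = fd;
        p.events = POLLIN;
        p.revents = 0;
        int ready = ::poll(&p, 1, poll_budget_ms);
        if (ready == 0) return kStillWaiting;
        if (ready < 0) return -1;
    }
}
```

On a socket with a 50 ms SO_RCVTIMEO and no incoming data, the recv() inside the loop gives up after about 50 ms, but the caller stays inside the function until the poll() budget runs out. Once the peer sends data, the same call returns the number of bytes received.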

Since you are receiving data, you may want to set SO_RCVTIMEO instead of SO_SNDTIMEO, although mixing Boost and system calls may not produce the expected results.
For reference:
SO_RCVTIMEO
Sets the timeout value that specifies the maximum amount of time an input function waits until it completes. It accepts a timeval
structure with the number of seconds and microseconds specifying the
limit on how long to wait for an input operation to complete. If a
receive operation has blocked for this much time without receiving
additional data, it shall return with a partial count or errno set to
[EAGAIN] or [EWOULDBLOCK] if no data is received. The default for this
option is zero, which indicates that a receive operation shall not
time out. This option takes a timeval structure. Note that not all
implementations allow this option to be set.
However, this option only has an effect on read operations, not on other low-level functions that may wait on the socket in an asynchronous implementation (e.g. select and epoll), and it seems that it does not affect asynchronous Asio operations either.
I found an example code from boost that may work for your case here.
An oversimplified example (to be compiled as C++11):
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <iostream>
void myclose(boost::asio::ip::tcp::socket& ps) { ps.close(); }
int main()
{
boost::asio::io_service io;
boost::asio::ip::tcp::acceptor::reuse_address option(true);
boost::asio::ip::tcp::acceptor accept(io);
boost::asio::ip::tcp::resolver resolver(io);
boost::asio::ip::tcp::resolver::query query("0.0.0.0", "8080");
boost::asio::ip::tcp::endpoint endpoint = *resolver.resolve(query);
accept.open(endpoint.protocol());
accept.set_option(option);
accept.bind(endpoint);
accept.listen(30);
boost::asio::ip::tcp::socket ps(io);
accept.accept(ps);
char buf[1024];
boost::asio::deadline_timer timer(io, boost::posix_time::seconds(1));
timer.async_wait(boost::bind(myclose, boost::ref(ps)));
ps.async_receive(boost::asio::buffer(buf, 1024),
[](const boost::system::error_code& error,
std::size_t bytes_transferred )
{
std::cout << bytes_transferred << std::endl;
});
io.run();
return 0;
}

Related

How to recover from network interruption using boost::asio

I am writing a server that accepts data from a device and processes it. Everything works fine unless there is an interruption in the network (i.e., if I unplug the Ethernet cable, then reconnect it). I'm using read_until() because the protocol that the device uses terminates the packet with a specific sequence of bytes. When the data stream is interrupted, read_until() blocks, as expected. However when the stream starts up again, it remains blocked. If I look at the data stream with Wireshark, the device continues transmitting and each packet is being ACK'ed by the network stack. But if I look at bytes_readable it is always 0. How can I detect the interruption and how to re-establish a connection to the data stream? Below is a code snippet and thanks in advance for any help you can offer. [Go easy on me, this is my first Stack Overflow question....and yes I did try to search for an answer.]
using boost::asio::ip::tcp;
boost::asio::io_service IOservice;
tcp::acceptor acceptor(IOservice, tcp::endpoint(tcp::v4(), listenPort));
tcp::socket socket(IOservice);
acceptor.accept(socket);
for (;;)
{
len = boost::asio::read_until(socket, sbuf, end);
// Process sbuf
// etc.
}
Remember, the client initiates a connection, so the only thing you need to achieve is to re-create the socket and start accepting again. I will keep the format of your snippet but I hope your real code is properly encapsulated.
using SocketType = boost::asio::ip::tcp::socket;
std::unique_ptr<SocketType> CreateSocketAndAccept(
boost::asio::io_service& io_service,
boost::asio::ip::tcp::acceptor& acceptor) {
auto socket = std::make_unique<boost::asio::ip::tcp::socket>(io_service);
boost::system::error_code ec;
acceptor.accept(*socket.get(), ec);
if (ec) {
//TODO: Add handler.
}
return socket;
}
...
auto socket = CreateSocketAndAccept(IOservice, acceptor);
for (;;) {
boost::system::error_code ec;
auto len = boost::asio::read_until(*socket.get(), sbuf, end, ec);
if (ec) // you could be more picky here of course,
// e.g. check against connection_reset, connection_aborted
socket = CreateSocketAndAccept(IOservice, acceptor);
...
}
Footnote: Should go without saying, socket needs to stay in scope.
Edit: Based on the comments below.
The listening socket itself does not know whether a client is silent or whether it got cut off. All operations, especially synchronous ones, should impose a time limit on completion. Consider setting SO_RCVTIMEO or SO_KEEPALIVE (per socket or system-wide; for more information, see "How to use SO_KEEPALIVE option properly to detect that the client at the other end is down?").
Another option is to go async and implement a full-fledged "shared" socket server (the Boost example page is a great start).
Either way, you might run into data consistency issues and be forced to deal with it, e.g. when the client detects an interrupted connection, it would resend the data. (or something more complex using higher level protocols)
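As a minimal sketch of the per-socket SO_KEEPALIVE setting mentioned above, the option can be toggled on the native descriptor with plain POSIX calls. The helper names are hypothetical; with Boost.Asio the same option is available as boost::asio::socket_base::keep_alive via socket.set_option():

```cpp
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Enables TCP keepalive probes on the socket; returns true on success.
bool enable_keepalive(int fd)
{
    assert(fd >= 0);
    int on = 1;
    return ::setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == 0;
}

// Reads the option back; returns true only if keepalive is enabled.
bool keepalive_enabled(int fd)
{
    assert(fd >= 0);
    int value = 0;
    socklen_t len = sizeof value;
    if (::getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &value, &len) != 0)
        return false;
    return value != 0;
}
```

Enabling the option only asks the TCP stack to probe an idle connection. How long it takes before a dead peer is reported depends on the system's keepalive settings, which on Linux typically default to an idle time of about two hours unless tuned.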
If you want to stay synchronous, the way I've seen things handled is to destroy the socket when you detect an interruption. The blocking call should throw an exception that you can catch and then start accepting connections again.
for (;;)
{
try {
len = boost::asio::read_until(socket, sbuf, end);
// Process sbuf
// etc.
}
catch (const boost::system::system_error& e) {
// clean up. Start accepting new connections.
}
}
As Tom mentions in his answer, there is no difference between inactivity and ungraceful disconnection so you need an external mechanism to detect this.
If you're expecting continuous data transfer, maybe a timeout per connection on the server side is enough. A simple ping could also work. After accepting a connection, ping your client every X seconds and declare the connection dead if it doesn't answer.
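The bookkeeping for such a per-connection timeout is small. Below is one possible sketch; IdleTracker is a hypothetical class, and the current time is passed in explicitly so the logic can be exercised without real waiting:

```cpp
#include <cassert>
#include <chrono>

// Tracks the last time a connection showed any activity (data or a ping
// reply) and reports when it has been silent for too long.
class IdleTracker {
public:
    using Clock = std::chrono::steady_clock;

    IdleTracker(Clock::duration limit, Clock::time_point now)
        : limit_(limit), last_activity_(now)
    {
        assert(limit > Clock::duration::zero());
    }

    // Call whenever data or a ping reply arrives from the client.
    void on_activity(Clock::time_point now) { last_activity_ = now; }

    // True once the connection has been silent for at least the limit.
    bool expired(Clock::time_point now) const
    {
        return now - last_activity_ >= limit_;
    }

private:
    Clock::duration limit_;
    Clock::time_point last_activity_;
};
```

A server loop would call on_activity(IdleTracker::Clock::now()) after each successful read and check expired(IdleTracker::Clock::now()) from a periodic timer, closing the socket and accepting again when it returns true.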

UnrealEngine4: Recv function would keep blocking when TCP server shutdown

I use a blocking FSocket on the client side, connected to a TCP server. If there is no message from the server, the socket thread blocks in FSocket::Recv(). If the TCP server shuts down, the socket thread stays blocked in this function. With a blocking socket from the BSD socket API, however, the thread returns from recv() with an errno when the TCP server shuts down. Is this a defect in FSocket?
uint32 HRecvThread::Run()
{
uint8* recv_buf = new uint8[RECV_BUF_SIZE];
uint8* const recv_buf_head = recv_buf;
int readLenSeq = 0;
while (Started)
{
//if (TcpClient->Connected() && ClientSocket->GetConnectionState() != SCS_Connected)
//{
// // server disconnected
// TcpClient->SetConnected(false);
// break;
//}
int32 bytesRead = 0;
// The socket is blocking, so the thread blocks in Recv if there is no message
ClientSocket->Recv(recv_buf, readLenSeq, bytesRead);
.....
//some logic of resolution for tcp msg bytes
.....
}
delete[] recv_buf;
return 0;
}
As I expected, you are ignoring the return code, which presumably indicates success or failure, so you are looping indefinitely (not blocking) on an error or end of stream condition.
NB You should allocate the recv_buf on the stack, not dynamically. Don't use the heap when you don't have to.
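For comparison, this is roughly what checking the result looks like with the plain BSD socket API that the question mentions. It is a sketch with made-up helper names, not UE4 code; the same idea applies to the value returned by FSocket::Recv:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum class RecvOutcome { Data, PeerClosed, Error };

// Performs one blocking recv() and classifies the result.
RecvOutcome recv_once(int fd, char* buf, std::size_t len,
                      std::size_t& bytes_read)
{
    // A zero-length recv() also returns 0, which would be indistinguishable
    // from the peer closing the connection.
    assert(len > 0);
    bytes_read = 0;
    ssize_t n;
    do {
        n = ::recv(fd, buf, len, 0);
    } while (n == -1 && errno == EINTR);  // retry if interrupted by a signal

    if (n > 0) {
        bytes_read = static_cast<std::size_t>(n);
        return RecvOutcome::Data;
    }
    if (n == 0) return RecvOutcome::PeerClosed;  // orderly shutdown by peer
    return RecvOutcome::Error;                   // inspect errno
}

// Reads until the stream ends or fails and reports why the loop stopped.
RecvOutcome read_until_closed(int fd, std::size_t& total_bytes)
{
    char buf[256];  // stack buffer, as suggested above
    total_bytes = 0;
    for (;;) {
        std::size_t n = 0;
        RecvOutcome outcome = recv_once(fd, buf, sizeof buf, n);
        if (outcome != RecvOutcome::Data) return outcome;
        total_bytes += n;
    }
}
```

The important part is that the loop leaves as soon as the outcome is anything other than Data, instead of calling the receive function again on a closed or failed socket.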
There is a similar question on the forums in the UE4 C++ Programming section. Here is the discussion:
https://forums.unrealengine.com/showthread.php?111552-Recv-function-would-keep-blocking-when-TCP-server-shutdown
Long story short, in the UE4 Source, they ignore EWOULDBLOCK as an error. The code comments state that they do not view it as an error.
Also, there are several helper functions you should be using when opening the port and when polling the port (I assume you are polling since you are using blocking calls):
FSocket::Connect returns a bool, so make sure to check that return value.
FSocket::GetLastError returns the UE4-translated error code if an error occurred with the socket.
FSocket::HasPendingData will return a value that informs you if it is safe to read from the socket.
FSocket::HasPendingConnection can check to see your connection state.
FSocket::GetConnectionState will tell you your active connection state.
Using these helper functions for error checking before making a call to FSocket::Recv will help you make sure you are in a good state before trying to read data. Also, it was noted in the forum posts that using the non-blocking code worked as expected. So, if you do not have a specific reason to use blocking code, just use the non-blocking implementation.
Also, as a final hint, using FSocket::Wait will block until your socket is in a desirable state of your choosing with a timeout, i.e. is readable or has data.

Set timeout for boost socket.connect

I am using boost::asio::connect on a tcp::socket. When all goes well, the connect returns immediately, but on a poor network the connect times out after a long wait of 15 seconds. I cannot afford to wait that long, so I want to reduce the timeout. Unfortunately, I have not come across any solution so far.
I see solutions where async_wait is being used together with deadline_timer, but all those examples are for receive/send operations and not for connect.
Can anyone help me with sample code for boost::asio::connect(socket, endpoints);? The requirement is that it should time out in 5 seconds instead of 15.
Have you taken a look at the following example? It contains sample code for an async_connect with a timeout.
The connect with timeout method could be implemented using the following code:
void connect(const std::string& host, const std::string& service,
boost::posix_time::time_duration timeout) {
// Resolve the host name and service to a list of endpoints.
tcp::resolver::query query(host, service);
tcp::resolver::iterator iter = tcp::resolver(io_service_).resolve(query);
// Set a deadline for the asynchronous operation. As a host name may
// resolve to multiple endpoints, this function uses the composed operation
// async_connect. The deadline applies to the entire operation, rather than
// individual connection attempts.
deadline_.expires_from_now(timeout);
// Set up the variable that receives the result of the asynchronous
// operation. The error code is set to would_block to signal that the
// operation is incomplete. Asio guarantees that its asynchronous
// operations will never fail with would_block, so any other value in
// ec indicates completion.
boost::system::error_code ec = boost::asio::error::would_block;
// Start the asynchronous operation itself. The boost::lambda function
// object is used as a callback and will update the ec variable when the
// operation completes. The blocking_udp_client.cpp example shows how you
// can use boost::bind rather than boost::lambda.
boost::asio::async_connect(socket_, iter, var(ec) = _1);
// Block until the asynchronous operation has completed.
do io_service_.run_one(); while (ec == boost::asio::error::would_block);
// Determine whether a connection was successfully established. The
// deadline actor may have had a chance to run and close our socket, even
// though the connect operation notionally succeeded. Therefore we must
// check whether the socket is still open before deciding if we succeeded
// or failed.
if (ec || !socket_.is_open())
throw boost::system::system_error(
ec ? ec : boost::asio::error::operation_aborted);
}

Using pselect for synchronous wait

In server code, I want to use pselect to wait for clients to connect as well as to monitor the standard output of the processes that I create and send it to the client (like a simplified remote shell).
I tried to find examples on how to use pselect but I haven't found any. The socket where the client can connect is already set up and works, as I verified that with accept(). SIGTERM is blocked.
Here is the code where I try to use pselect:
waitClient()
{
fd_set readers;
fd_set writers;
fd_set exceptions;
struct timespec ts;
// Loop until we get a sigterm to shutdown
while(getSigTERM() == false)
{
FD_ZERO(&readers);
FD_ZERO(&writers);
FD_ZERO(&exceptions);
FD_SET(fileno(stdin), &readers);
FD_SET(fileno(stdout), &writers);
FD_SET(fileno(stderr), &writers);
FD_SET(getServerSocket()->getSocketId(), &readers);
//FD_SET(getServerSocket()->getSocketId(), &writers);
memset(&ts, 0, sizeof(struct timespec));
pret = pselect(FD_SETSIZE, &readers, &writers, &exceptions, &ts, &mSignalMask);
// Here pselect always returns with 2. What does this mean?
cout << "pselect returned..." << pret << endl;
cout.flush();
}
}
So what I want to know is how to wait with pselect until an event is received, because currently pselect always returns immediately with a value of 2. I tried setting the timeout to NULL, but that doesn't change anything.
Is the return value of pselect (if positive) the file descriptor that caused the event?
I'm using fork() to create new processes (not implemented yet), and I know that I have to wait() on them. Can I wait on them as well? I suppose I need to catch the SIGCHLD signal, so how would I use that? wait() on the child would also block. Can I just do a peek and then continue with pselect? Otherwise I would have two concurrent blocking waits.
It returns immediately because the file descriptors in the writers set are ready. The standard output streams will almost always be ready for writing.
And if you check a select manual page you will see that the return value is -1 on error, 0 on timeout, or a positive number telling you how many file descriptors are ready.
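The return value can be checked with a pipe standing in for the real descriptors. This is a small sketch, with wait_ready() and the Ready struct as hypothetical names; it watches one descriptor for reading and, optionally, one for writing:

```cpp
#include <cassert>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

struct Ready {
    int count;      // pselect() return value: number of ready descriptors
    bool readable;  // read_fd has data (or end of file) pending
    bool writable;  // write_fd can be written without blocking
};

// Waits up to timeout_ms. Pass write_fd = -1 to leave the write set empty.
Ready wait_ready(int read_fd, int write_fd, int timeout_ms)
{
    assert(read_fd >= 0 && read_fd < FD_SETSIZE);
    assert(write_fd < FD_SETSIZE);

    fd_set readers;
    fd_set writers;
    FD_ZERO(&readers);
    FD_ZERO(&writers);
    FD_SET(read_fd, &readers);
    int max_fd = read_fd;
    if (write_fd >= 0) {
        FD_SET(write_fd, &writers);
        if (write_fd > max_fd) max_fd = write_fd;
    }

    // A zeroed timespec means "do not wait at all"; a null pointer means
    // "wait indefinitely".
    timespec ts;
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    Ready r;
    r.count = ::pselect(max_fd + 1, &readers, &writers, nullptr, &ts, nullptr);
    // After the call the sets contain only the descriptors that are ready.
    r.readable = r.count > 0 && FD_ISSET(read_fd, &readers);
    r.writable = r.count > 0 && write_fd >= 0 && FD_ISSET(write_fd, &writers);
    return r;
}
```

With an empty pipe, watching only the read end times out with a count of 0, while adding the write end to the write set makes the call return at once with a count of 1. That is consistent with the value 2 seen in the question if both stdout and stderr were writable. A descriptor generally belongs in the write set only while there is data waiting to be written to it, and FD_ISSET identifies which descriptors are ready.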

boost asio async_connect success after close

Single-threaded application.
It does not happen every time, only after 1.5 hours of high load.
1. tcp::socket::async_connect
2. tcp::socket::close (by deadline_timer)
3. async_connect_handler gives a success error_code (one time in a million), but the socket has been closed by (2). 99.999% of the time it gives errno=125 (ECANCELED).
Is it possible that the socket implementation or Boost.Asio somehow does this:
1. async_connect
2. async success posted to io_service
3. close by timer
4. async success handled by me, not affected by close
Right now I have solved it by saving state in my variables and ignoring accept success.
Linux 2.6 (Fedora).
Boost 1.46.0
PS: of course, a bug on my part is possible... But it runs smoothly for days apart from this.
As Igor mentions in the comments, the completion handler is already queued.
This scenario is the result of a separation in time between when an operation executes and when a handler is invoked. The documentation for io_service::run(), io_service::run_one(), io_service::poll(), and io_service::poll_one() is specific to mention handlers, and not operations. In the scenario, the socket::async_connect() operation and deadline_timer::async_wait() operation complete in the same event loop iteration. This results in both handlers being added to the io_service for deferred invocation, in an unspecified order.
Consider the following snippet that accentuates the scenario:
void handle_wait(const boost::system::error_code& error)
{
if (error) return;
socket_.close();
}
timer_.expires_from_now(boost::posix_time::seconds(30));
timer_.async_wait(&handle_wait);
socket_.async_connect(endpoint_, handle_connect);
boost::this_thread::sleep(boost::posix_time::seconds(60));
io_service_.run_one();
When io_service_.run_one() is invoked, both socket::async_connect() and deadline_timer::async_wait() operations may have completed, causing handle_wait and handle_connect to be ready for invocation from within the io_service in an unspecified order. To properly handle this unspecified order, additional logic needs to occur from within handle_wait() and handle_connect() to query the current state, and determine if the other handler has been invoked, rather than depending solely on the status (error_code) of the operation.
The easiest way to determine if the other handler has been invoked is:
In handle_connect(), check if the socket is still open via is_open(). If the socket is still open, then handle_wait() has not been invoked. A clean way to indicate to handle_wait() that handle_connect() has run is to update the expiry time.
In handle_wait(), check if the expiry time has passed. If this is true, then handle_connect() has not run, so close the socket.
The resulting handlers could look like the following:
void handle_wait(const boost::system::error_code& error)
{
// On error, return early.
if (error) return;
// If the timer expires in the future, then the connect handler must have
// run first.
if (timer_.expires_at() > deadline_timer::traits_type::now()) return;
// Timeout has occurred, so close the socket.
socket_.close();
}
void handle_connect(const boost::system::error_code& error)
{
// The async_connect() function automatically opens the socket at the start
// of the asynchronous operation. If the socket is closed at this time then
// the timeout handler must have run first.
if (!socket_.is_open()) return;
// On error, return early.
if (error) return;
// Otherwise, a connection has been established. Update the timer state
// so that the timeout handler does not close the socket.
timer_.expires_at(boost::posix_time::pos_infin);
}
Boost.Asio provides some examples for handling timeouts.
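The effect of the two state checks can be seen in a small simulation that needs no sockets. Everything here is a stand-in: Connection, kNever, and the integer clock are hypothetical replacements for the socket, pos_infin, and the timer's clock, and run_both() invokes the two handlers in either order as if both operations had completed successfully:

```cpp
#include <cassert>
#include <limits>

// Stand-in for boost::posix_time::pos_infin on a fake integer clock.
const long kNever = std::numeric_limits<long>::max();

struct Connection {
    bool socket_open;  // stand-in for socket_.is_open()
    bool connected;    // set once the connect handler accepts the result
    long deadline;     // stand-in for timer_.expires_at(), in fake ticks
};

void handle_wait(Connection& c, bool error, long now)
{
    if (error) return;
    // Deadline moved into the future: the connect handler ran first.
    if (c.deadline > now) return;
    c.socket_open = false;  // timeout: close the socket
}

void handle_connect(Connection& c, bool error)
{
    // Socket already closed: the timeout handler ran first.
    if (!c.socket_open) return;
    if (error) return;
    c.deadline = kNever;  // tell the timeout handler not to close the socket
    c.connected = true;
}

// Both operations completed with success by tick `now`; only the order in
// which their handlers run differs.
Connection run_both(bool wait_first)
{
    const long now = 100;
    Connection c = {true, false, 30};
    assert(c.deadline <= now);  // the timer really has expired
    if (wait_first) {
        handle_wait(c, false, now);
        handle_connect(c, false);
    } else {
        handle_connect(c, false);
        handle_wait(c, false, now);
    }
    return c;
}
```

Whichever handler runs first wins: run_both(true) ends with the socket closed and no connection reported, and run_both(false) ends with an open, connected socket. In neither order does the code report a connection on a closed socket, which is the situation described in the question.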
I accept twsansbury's answer, just want to add some more info.
About shutdown():
void async_recv_handler( boost::system::error_code ec_recv, std::size_t count )
{
if ( !m_socket.is_open() )
return; // first check: don't trust ec_recv alone
if ( ec_recv )
{
// oops, we have error
// log
// close
return;
}
// no error in ec_recv, so we can gracefully shut down the connection,
// but shutdown may fail! This check works for me.
boost::system::error_code ec_shutdown;
// second check: still don't trust ec_recv alone
m_socket.shutdown( t, ec_shutdown );
if ( !ec_shutdown )
return;
// this error code is expected
if ( ec_shutdown == boost::asio::error::not_connected )
return;
// other error codes are unexpected for me
// log << ec_shutdown.message()
throw boost::system::system_error(ec_shutdown);
}