About socket connection timeout in C++ (UNIX)

I have been working on putting a 15-second timeout on a socket so that it doesn't block. So I configured it as non-blocking, then used the select function, and it works fine... when the computer is connected to a network!
If the computer is disconnected from the network (for example, if the Wi-Fi is turned off, or if you remove the ethernet plug), calling connect returns immediately with the "Network is unreachable" error.
Since I have a loop to reconnect when something like this happens, it tries to connect MANY times, so I'm getting log files that are gigabytes in size.
So what I want is to set some kind of timeout for that too. It's not really a timeout: I want it to wait 15 seconds before it tries to connect again, to avoid this problem. I was wrong to think that the timeout I set up as explained in the first paragraph would fix this too. How can I do this?

You could test errno after the failed connect and sleep if it's ENETUNREACH.
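For instance, a minimal sketch of that idea (blocking connect shown for brevity; the function name and the 15-second delay are only illustrative, and the non-blocking/select part you already have is left out):

#include <sys/socket.h>
#include <netinet/in.h>
#include <cerrno>
#include <unistd.h>

// Reconnect loop: when connect() fails because the network itself is down,
// wait before the next attempt so the log is not flooded with identical errors.
int connect_with_backoff(const sockaddr_in& addr)
{
    for (;;) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        if (connect(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0)
            return fd;                        // connected

        int err = errno;
        close(fd);                            // a failed socket should not be reused

        if (err == ENETUNREACH || err == EHOSTUNREACH) {
            sleep(15);                        // network is down: back off, then retry
            continue;
        }
        return -1;                            // some other error: let the caller decide
    }
}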

Related

Win API serial port needs to wait after initialization

I have the following problem. I have a serial port device that is supposed to communicate with a computer. In fact it is an Arduino Due board, but I don't think that is relevant.
I use CreateFile to open the port, and then set the parameters using GetCommState()&SetCommState() and GetCommTimeouts()&SetCommTimeouts().
The port is opened correctly - no problem there. But at this point I want to check whether the device is connected. So I send a specific message. The device is supposed to respond in a certain way so that I know it is connected.
Now to the problem: it only works if I put Sleep(1000) after creating the port (before sending the handshake request). It looks as if the WinAPI needs some time before it can begin to use the port. Because the Sleep solution is not generally usable, I need to find some alternative...
By "it doesn't work" I mean ReadFile times out. It times out even if the timeout is set to something like 5 seconds - note that the Sleep interval is only one second. So it looks like the handshake request is not even sent. If I set the timeout to 1 second and the Sleep interval to one second, it works. If I set the timeout to 5 seconds but there is no Sleep, it doesn't work. See the problem?
I am going to try some NetworkMonitor, but I'm kinda sure the problem is not with the device...
OK, I should have searched a little more before posting this question.
The thing is that the Arduino restarts itself when you open a connection from your PC.
When you use a terminal you connect first and only start typing a few seconds later, so the Arduino board has enough time to boot up and you never notice it. Which is what confused me enough to write the question.
There are 3 solutions to this, only 2 of which are worth mentioning at all:
1) the solution I used without knowing all this (you wait about a second for the board to boot up again, as sketched below...)
2) you disable auto-reset by modifying your Arduino board
Both of them are stupid if you ask me; there should be a switch or a flash variable to control this...
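For reference, a rough Win32 sketch of solution 1 (the port name, baud rate and the one-second delay are illustrative, and error handling is trimmed):

#include <windows.h>

// Open and configure the serial port, then give the Arduino time to finish
// its auto-reset before the handshake request is sent.
HANDLE open_arduino(const char* port = "COM3")
{
    HANDLE h = CreateFileA(port, GENERIC_READ | GENERIC_WRITE,
                           0, nullptr, OPEN_EXISTING, 0, nullptr);
    if (h == INVALID_HANDLE_VALUE)
        return h;

    DCB dcb = {};
    dcb.DCBlength = sizeof(dcb);
    GetCommState(h, &dcb);
    dcb.BaudRate = CBR_115200;
    dcb.ByteSize = 8;
    dcb.Parity   = NOPARITY;
    dcb.StopBits = ONESTOPBIT;
    SetCommState(h, &dcb);

    COMMTIMEOUTS timeouts = {};
    timeouts.ReadTotalTimeoutConstant = 5000;   // 5 s read timeout
    SetCommTimeouts(h, &timeouts);

    Sleep(1000);   // opening the port toggles DTR and resets the board; let it boot
    return h;
}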

Intermittent Delays in C++ TCP Communication in Linux

I have a device which sends data every 20 milliseconds over TCP. I have an application which connects to this device and starts the socket communication. My application listens on a separate thread and reads the data as soon as it is ready, puts the data aside, and some other thread processes it. The device is directly connected to the computer via an ethernet cable.
I see a strange problem and I am trying to understand the reason why: roughly once a minute, it takes approximately 50 milliseconds to receive a packet from the device. I do a blocking read which will try reading for a second and finishes as soon as data is ready; normally it takes approximately 20 ms as I would expect, but as I said, there are times it takes 50 ms, even though it is very rare (1 in 3000). What I noticed is that the packets after the late packet arrive immediately, which makes me think there is some delay at the network layer. I also examined the timestamps of the packets (which are set by the device), and they consistently increase by 20 ms.
Is it normal to see delays like that when the device is directly connected to the computer? Since it is TCP there may be a lot of work under the hood (CRC checks, out-of-order packets, retransmissions, etc.). I would still rather find a way to prevent this delay than accept the fact that it might happen.
Any insights will be greatly appreciated.
It's probably the result of Nagle's algorithm, which is turned on by default for TCP/IP sockets.
Use setsockopt() to set the TCP_NODELAY flag on the socket that sends data to turn it off.
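A minimal sketch of that call, assuming a connected POSIX socket descriptor (on Windows the call is the same, but the option value is passed as a char/BOOL):

#include <netinet/tcp.h>
#include <sys/socket.h>

// Disable Nagle's algorithm so small writes are sent immediately instead of
// being coalesced into larger segments.
void disable_nagle(int sockfd)
{
    int flag = 1;
    setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}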

Shorter timeout on connect

connect is failing with WSAETIMEDOUT. That's fine, but is there any way to make the timeout period shorter? Maybe something like 2-3 seconds? Currently it seems to be something higher, like 10 seconds.
The OS is Windows, using Winsock with C++.
Put the socket into non-blocking mode before calling connect(). When it returns with a WSAEWOULDBLOCK error, call select() with whatever timeout interval you want. If select() reports that the socket has become writable, the connection was successful. If select() reports a timeout instead, close the socket.
This has been asked before: WINSOCK - Setting a timeout for a connection attempt on a non existing IP?
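A sketch of that sequence with Winsock (error handling is trimmed; a fuller version would also pass the socket in the except set of select() to catch a connection that fails outright):

#include <winsock2.h>
#pragma comment(lib, "ws2_32.lib")

// Non-blocking connect() followed by select() with a caller-supplied timeout.
bool connect_with_timeout(SOCKET s, const sockaddr_in& addr, long timeout_sec)
{
    u_long nonblocking = 1;
    ioctlsocket(s, FIONBIO, &nonblocking);

    if (connect(s, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0)
        return true;                            // connected immediately
    if (WSAGetLastError() != WSAEWOULDBLOCK)
        return false;                           // real failure, not "in progress"

    fd_set writable;
    FD_ZERO(&writable);
    FD_SET(s, &writable);
    timeval tv = { timeout_sec, 0 };

    if (select(0, nullptr, &writable, nullptr, &tv) == 1)
        return true;                            // socket became writable: connected

    closesocket(s);                             // timed out
    return false;
}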
No, it's handled by the IP stack. You'll have to start a timer and kill the connection if you need to change this functionality.

Ensuring data is being read with async_read

I am currently testing my network application in very low bandwidth environments. I currently have code that attempts to ensure that the connection is good by making sure I am still receiving information.
Traditionally I have done this by recording the timestamp in my ReadHandler function so that each time it gets called I know I have received data on the socket. With very low bandwidths this isn't sufficient because my ReadHandler is not getting called frequently enough.
I was toying around with the idea of writing my own completion condition function (right now I am using transfer_at_least(1)), thinking it would get called more frequently and I could record my timestamp there, but I was wondering if there isn't some other, more standard way to go about this.
We had a similar issue in production: some of our connections may be idle for days, but we must detect if the remote is dead ASAP.
We solved it by enabling the TCP_KEEPALIVE option:
boost::asio::socket_base::keep_alive option(true);
mSocketTCP.set_option(option);
which had to be accompanied by a new startup script that writes sensible values to /proc/sys/net/ipv4/tcp_keepalive_*, which have very long timeouts by default (on Linux).
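If changing the system-wide /proc values is not an option, the same limits can be set per socket on Linux through the native descriptor (native_handle() in newer Boost versions, native() in older ones); the option names are Linux-specific and the values below are only examples:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Tighten the keep-alive timing for this socket only: start probing after
// 60 s of idle, probe every 10 s, and give up after 5 missed probes.
void tune_keepalive(int fd)
{
    int idle = 60, interval = 10, count = 5;
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,     sizeof(idle));
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval));
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,   &count,    sizeof(count));
}

// e.g. tune_keepalive(mSocketTCP.native_handle());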
You can use the read_some method to get partial reads and deal with the bookkeeping yourself. This is more efficient than transfer_at_least(1), but you still have to keep track of what is going on.
However, a cleaner approach is just to use a concurrent deadline_timer. If the timer goes off before you are finished, the operation is taking too long, so cancel whatever is going on. If not, just stop the timer and continue. Something like:
boost::asio::deadline_timer t(io_service);   // same io_service as the socket
t.expires_from_now(boost::posix_time::seconds(20));
t.async_wait(boost::bind(&Class::timed_out, this, boost::asio::placeholders::error));
// Do stuff.
if (!t.cancel()) {
    // cancel() returned 0: the timer had already fired, so the operation timed out - abort
}
// And the timeout method
void Class::timed_out(boost::system::error_code const& error)
{
    if (error == boost::asio::error::operation_aborted) return;   // timer was cancelled in time
    // Deal with the timeout, close the socket, etc.
}
I don't know how to handle network latency from within the application. Can you even be sure whether it is network latency, or whether the peer server or peer application is busy and reacting slowly? Does it matter whether the network, the server, or the application is at fault?
Even if you can detect the network latency and find it is large, what are you going to do?
You cannot improve the situation.
Consider another critical case which is a subset of what you're trying to handle - the network is down (e.g. you disconnect the cable from your machine). Since it is a subset of your problem, you want to handle it too.
Let's examine the effect of the network being down on an active TCP connection. How can you discover whether your active TCP connection is still alive? Calling send() will succeed, but that merely means the message was queued in the kernel's outgoing TCP queue. The TCP stack will try to send it, but since no TCP ACK comes back, the TCP stack on your side will retry again and again. You can see your message in the netstat output (Send-Q column).
I'm aware of the following ways to deal with it:
One standard way is the TCP keep-alive proposed by @Cubby.
Another way is to implement your own keep-alive mechanism: send a keep-alive request message, and the peer is obligated to send back a keep-alive ack message.
If you don't receive the ack message within a predefined timeout, try to send the keep-alive request N more times (e.g. N=2). If there is still no success, close the socket and open it again. If the peer server is not available, you will not be able to open the connection, since the TCP three-way handshake requires the peer to respond.
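A rough sketch of that request/ack probe, assuming a blocking POSIX socket; the one-byte message format (KEEPALIVE_REQ/KEEPALIVE_ACK) is made up for illustration and the socket is assumed to be otherwise idle while probing:

#include <sys/socket.h>
#include <sys/time.h>

const char KEEPALIVE_REQ = 0x01;   // hypothetical probe byte
const char KEEPALIVE_ACK = 0x02;   // hypothetical reply byte

// Returns true if the peer answered within the timeout, retrying N more times
// after the first attempt; otherwise the caller should close and reconnect.
bool peer_alive(int fd, int retries = 2, int timeout_sec = 5)
{
    timeval tv = { timeout_sec, 0 };
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));   // bound each recv()

    for (int attempt = 0; attempt <= retries; ++attempt) {
        if (send(fd, &KEEPALIVE_REQ, 1, 0) != 1)
            return false;                          // send itself failed

        char reply = 0;
        if (recv(fd, &reply, 1, 0) == 1 && reply == KEEPALIVE_ACK)
            return true;                           // peer responded in time
        // recv timed out or returned something unexpected: try again
    }
    return false;
}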

Should I implement my own TCP/IP socket timeouts?

The software I'm working on needs to be able to connect to many servers in a short period of time, using TCP/IP. The software runs under Win32. If a server does not respond, I want to be able to quickly continue with the next server in the list.
Sometimes when a remote server does not respond, I get a connection timeout error after roughly 20 seconds. Often the timeout comes quicker.
My problem is that these 20 seconds hurt the performance of my software, and I would like it to give up sooner (after, say, 5 seconds). I assume that the TCP/IP stack (?) in Windows automatically adjusts the timeout based on some parameters?
Is it sane to override this timeout in my application, and close the socket if I'm unable to connect within X seconds?
(It's probably irrelevant, but the app is built using C++ and uses I/O completion ports for asynchronous network communication)
If you use I/O completion ports and async operations, why do you need to wait for a connect to complete before continuing with the next server on the list? Use ConnectEx and pass in an overlapped structure. This way the individual server connect times will not add up; the total connect time is the maximum server connect time, not the sum.
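An outline of what that looks like (Winsock-specific: the ConnectEx pointer has to be fetched at run time, the socket must be bound before the call, and it is assumed the socket has already been associated with your completion port elsewhere; error checks are trimmed):

#include <winsock2.h>
#include <mswsock.h>
#pragma comment(lib, "ws2_32.lib")

// Start an overlapped connect; completion is reported later through the
// I/O completion port the socket is associated with.
bool begin_connect(SOCKET s, const sockaddr_in& remote, OVERLAPPED* ov)
{
    // ConnectEx is an extension function and must be looked up at run time.
    LPFN_CONNECTEX connect_ex = nullptr;
    GUID guid = WSAID_CONNECTEX;
    DWORD bytes = 0;
    WSAIoctl(s, SIO_GET_EXTENSION_FUNCTION_POINTER, &guid, sizeof(guid),
             &connect_ex, sizeof(connect_ex), &bytes, nullptr, nullptr);

    // ConnectEx requires the socket to be bound before the call.
    sockaddr_in local = {};
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = INADDR_ANY;
    local.sin_port = 0;
    bind(s, reinterpret_cast<sockaddr*>(&local), sizeof(local));

    BOOL ok = connect_ex(s, reinterpret_cast<const sockaddr*>(&remote),
                         sizeof(remote), nullptr, 0, nullptr, ov);
    return ok || WSAGetLastError() == WSA_IO_PENDING;   // pending = still connecting
}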
On Linux you can
int syncnt = 1;   // number of SYN retransmits before connect() gives up
int syncnt_sz = sizeof(syncnt);
setsockopt(sockfd, IPPROTO_TCP, TCP_SYNCNT, &syncnt, syncnt_sz);
to reduce (or increase) the number of SYN retries per connect per socket. Unfortunately, it's not portable to Windows.
As for your proposed solution: closing a socket while it is still in connecting state should be fine, and it's probably the easiest way. But since it sounds like you're already using asynchronous completions, can you simply try to open four connections at a time? If all four time out, at least it will only take 20 seconds instead of 80.
All configurable TCP/IP parameters for Windows are here
See TcpMaxConnectRetransmissions
You might consider trying to open many connections at once (each with its own socket), and then work with the one that responds first. The others can be closed.
You could do this with non-blocking connect calls, or with blocking calls and threads. Then the lag waiting for a connection to open shouldn't be any more than is minimally necessary.
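A sketch of that "race several connects" idea, shown with POSIX calls for brevity (on Winsock, ioctlsocket(FIONBIO) and closesocket take the place of fcntl and close); a real version would also verify the winning socket with getsockopt(SO_ERROR), which is only noted in a comment here:

#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Start a non-blocking connect to every candidate address, wait once for the
// first socket to become writable, and close all the others.
int connect_first(const std::vector<sockaddr_in>& candidates, int timeout_sec)
{
    std::vector<int> fds;
    fd_set wset;
    FD_ZERO(&wset);
    int maxfd = -1;

    for (const sockaddr_in& addr : candidates) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
        connect(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr));
        FD_SET(fd, &wset);
        fds.push_back(fd);
        if (fd > maxfd) maxfd = fd;
    }

    timeval tv = { timeout_sec, 0 };
    int winner = -1;
    if (select(maxfd + 1, nullptr, &wset, nullptr, &tv) > 0) {
        for (int fd : fds) {
            // take the first writable socket (check SO_ERROR before trusting it
            // in real code, since writability can also signal a failed connect)
            if (winner < 0 && FD_ISSET(fd, &wset))
                winner = fd;
        }
    }
    for (int fd : fds)
        if (fd != winner) close(fd);
    return winner;                    // -1 if nothing connected within the timeout
}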
You have to be careful when you override the socket timeout. If you are too aggressive and attempt to connect to many servers very quickly, the Windows TCP/IP stack will assume your application is an Internet worm and throttle it down. If this happens, the performance of your application will become even worse.
The details of exactly when the throttling kicks in are not advertised, but the timeout you propose (5 seconds) should be OK, in my experience.
The details that are available about this can be found here