recv() with errno=107:(transport endpoint connected) - c++

well..I use a typical model of epoll+multithread to handle massive sockets, that is, I have a thread called epollWorkThread that use epoll_wait to handle i/o sockets. While there's an event of EPOLLIN, recv() will do the work and I do use the noblocking mode to allow immediate return. And recv() is indeed in a while(true) loop.
Everything is fine in the intial time(maybe a couple of hours or maybe minutes or if I'm lucky days), I can receive the information. But some time later, recv() insists to return -1 with the errno = 107(ENOTCONN). The other peer of the transport is written in AS3 which makes sure that the socket is connected. So I'm confused by the recv() behaviour. Thank you in advance and any comment is appreciated!

Errno 107 means that the socket is NOT connected (any more).
There are several reasons why this could happen. Assuming you're right and both sides of the connection claim that the socket is still open, an intermediate router/switch may have dropped the connection due to a timeout. The safest way to avoid such things from happen is to periodically send a 'health' or 'keep-alive' message. (Thus the intermediate router/switch accepts the connection as living...)=

Related

Socket recv() function

In c++, in windows OS, on the recv() call for TCP socket , if the socket connection is closed somehow, the recv() will return immediately or will it hang?
What would be result(immediately returns or hangs) in the blocking and non blocking socket?
I am using socket version 2.
Thanks in advance.
As documented depending of how the connection was closed it should more or less immediately with a return value of SOCKET_ERROR in a non graceful case and set WSAGetLastError to one of the possible reasons like WSAENOTCONN or would just return 0 in the graceful connection close scenario. This would be the same between blocking and nonblocking sockets.
If the socket is connection oriented and the remote side has shut down the connection gracefully, and all data has been received, a recv will complete immediately with zero bytes received. If the connection has been reset, a recv will fail with the error WSAECONNRESET.
However, since I know the Windows API does not always work as documented, I recommend to test it.
If your recv() is based on BSD - as almost all are - it will return immediately if the connection is closed (per the local side's status), regardless of whether the socket is blocking or non-blocking.
recv() will return 0 upon a graceful disconnect, ie the peer shutdown its end of the connection and its socket stack sent a FIN packet to your socket stack. You are pretty much guaranteed to get this result immediately, whether you use a blocking or non-blocking socket.
recv() will return -1 on any other error, including an abnormal connection loss.
You need to use WSAGetLastError() to find out what actually happened. For a lost connection on a blocking socket, you will usually get an error code such as WSAECONNRESET or WSAECONNABORTED. For a lost connection on a non-blocking socket, it is possible that recv() may report an WSAEWOULDBLOCK error immediately and then report the actual error at some later time, maybe via select() with an exception fd_set, or an asynchronous notification, depending on how you implement your non-blocking logic.
However, either way, you are NOT guaranteed to get a failure result on a lost connection in any timely manner! It MAY take some time (seconds, minutes, can even be hours in rare cases) before the OS deems the connection is actually lost and invalidates the socket connection. TCP is designed to recover lost connections when possible, so it has to account for temporary network outages and such, so there are internal timeouts. You don't see that in your code, it happens in the background.
If you don't want to wait for the OS to timeout internally, you can always use your own timeout in your code, such as via select(), setsocktopt(SO_RCVTIMEO), TCP keep-alives (setsockopt(SO_KEEPALIVE) or WSAIoCtl(SIO_KEEPALIVE_VALS)), etc. You may still not get a failure immediately, but you will get it sooner rather than later.

How to make 'send' non-blocking in winsock

I am making a program which sends UDP packets to a server at a fixed interval, something like this:
while (!stop) {
Sleep(fixedInterval);
send(sock, pkt, payloadSize, flags);
}
However the periodicity cannot be guaranteed because send is a blocking call (e.g., when fixedInterval is 20ms, and a call to send is > 20ms ). Do you know how I can turn the send into a non-blocking operation?
You need to use a non-blocking socket. The send/receive functions are the same functions for blocking or non-blocking operations, but you must set the socket itself to non-blocking.
u_long mode = 1; // 1 to enable non-blocking socket
ioctlsocket(sock, FIONBIO, &mode);
Also, be aware that working with non-blocking sockets is quite different. You'll need to make sure you handle WSAEWOULDBLOCK errors as success! :)
So, using non-blocking sockets may help, but still will not guarantee an exact period. You would be better to drive this from a timer, rather than this simple loop, so that any latency from calling send, even in non-blocking mode, will not affect the timing.
The API ioctlsocket can do it.You can use it as below.But why don't you use I/O models in winsock?
ioctlsocket(hsock,FIOBIO,(unsigned long *)&ul);
My memory is fuzzy here since it's probably been 15 years since I've used UDP non-blocking.
However, there are some things of which you should be aware.
Send only smallish packets if you're going over a public network. The PATH MTU can trip you up if either the client or the server is not written to take care of incomplete packets.
Make sure you check that you have sent the number of bytes you think you have to send. It can get weird when you're expecting to see 300 bytes sent and the receiving end only gets 248. Both client side and server side have to be aware of this issue.
See here for some good advice from the Linux folks.
See here for the Unix Socket FAQ for UDP
This is a good, general network programming FAQ and example page.
How about measuring the time Send takes and then just sleeping the time missing up to the 20ms?

Non-blocking TCP socket and flushing right after send?

I am using Windows socket for my application(winsock2.h). Since the blocking socket doesn't let me control connection timeout, I am using non-blocking one. Right after send command I am using shutdown command to flush(I have to). My timeout is 50ms and the thing I want to know is if the data to be sent is so big, is there a risk of sending only a portion of data or sending nothing at all? Thanks in advance...
hSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
u_long iMode=1;
ioctlsocket(hSocket,FIONBIO,&iMode);
connect(hSocket, (sockaddr*)(&sockAddr),sockAddrSize);
send(hSocket, sendbuf, sendlen, 0);
shutdown(hSocket, SD_BOTH);
Sleep(50);
closesocket(hSocket);
Non-blocking TCP socket and flushing right after send?
There is no such thing as flushing a TCP socket.
Since the blocking socket doesn't let me control connection timeout
False. You can use select() on a blocking socket.
I am using non-blocking one.
Non sequitur.
Right after send command I am using shutdown command to flush(I have to).
You don't have to, and shutdown() doesn't flush anything.
My timeout is 50ms
Why? The time to send data depends on the size of the data. Obviously. It does not make any sense whatsoever to use a fixed timeout for a send.
and the thing I want to know is if the data to be sent is so big, is there a risk of sending only a portion of data or sending nothing at all?
In blocking mode, all the data you provided to send() will be sent if possible. In non-blocking mode, the amount of data represented by the return value of send() will be sent, if possible. In either case the connection will be reset if the send fails. Whatever timeout mechanism you superimpose can't possibly change any of that: specifically, closing the socket asynchronously after a timeout will only cause the close to be appended to the data being sent. It will not cause the send to be aborted.
Your code wouldn't pass any code review known to man. There is zero error checking; the sleep is completely pointless; and shutdown before close is redundant. If the sleep is intended to implement a timeout, it doesn't.
I want to be sending data as fast as possible.
You can't. TCP implements flow control. There is exactly nothing you can do about that. You are rate-limited by the receiver.
Also the 2 possible cases are: server waits too long to accept connection
There is no such case. The client can complete a connection before the server ever calls accept(). If you're trying to implement a connect timeout shorter than the default of about a minute, use select().
or receive.
Nothing you can do about that: see above.
So both connecting and writing should be done in max of 50ms since the time is very important in my situation.
See above. It doesn't make sense to implement a fixed timeout for operations that take variable time. And 50ms is far too short for a connect timeout. If that's a real issue you should keep the connection open so that the connect delay only happens once: in fact you should keep TCP connections open as long as possible anyway.
I have to flush both write and read streams
You can't. There is no operation in TCP that will flush either a read stream or a write stream.
because the server keeps sending me unnecessarly big data and I have limited internet connection.
Another non sequitur. If the server sends you data, you have to read it, otherwise you will stall the server, and that doesn't have anything to do with flushing your own write stream.
Actually I don't even want a single byte from the server
Bad luck. You have to read it. [If you were on BSD Unix you could shutdown the socket for input, which would cause data from the server to be thrown away, but that doesn't work on Windows: it causes the server to get a connection reset.]
Thanks to EJP and Martin, now I have created a second thread to check. Also in the code I posted in my question, I added "counter=0;" line after the "send" line and removed shutdown. It works just as I wanted now. It never waits more than 50ms :) Really big thanks
unsigned __stdcall SecondThreadFunc( void* pArguments )
{
while(1)
{
counter++;
if (counter > 49)
{
closesocket(hSocket);
counter = 0;
printf("\rtimeout");
}
Sleep(1);
}
return 0;
}

Gracefully shut down a TCP socket

I'm writing an IRC client in C++ and currently I'm having an issue where, upon exit, I do:
Send("QUIT :Quit\r\n"); // just an inline, variadic send() wrapper
shutdown(m_hSocket, SD_BOTH);
closesocket(m_hSocket);
WSAShutdown();
However, the issue is that the QUIT message is not being sent. I've sniffed the packets coming from the client and infact this message is never sent. I believe this is an issue with the socket not being flushed, but I have no idea how to do this and Google suggested disabling Nagle's algorithm but I doubt this is good practice.
Thanks in advance.
First of all you should check the return value of send: are the data you attempt to send actually accepted by the network stack? (In general this should be done after each and every send call, not just in this case).
Assuming the data is accepted, then AFAIK it should be actually transmitted as a result of calling shutdown. You might try using SO_LINGER to see if it makes a difference, see Graceful Shutdown, Linger Options, and Socket Closure on MSDN.

select(), recv() and EWOULDBLOCK on non-blocking sockets

I would like to know if the following scenario is real?!
select() (RD) on non-blocking TCP socket says that the socket is ready
following recv() would return EWOULDBLOCK despite the call to select()
For recv() you would get EAGAIN rather than EWOULDBLOCK, and yes it is possible. Since you have just checked with select() then one of two things happened:
Something else (another thread) has drained the input buffer between select() and recv().
A receive timeout was set on the socket and it expired without data being received.
It's possible, but only in a situation where you have multiple threads/processes trying to read from the same socket.
On Linux it's even documented that this can happen, as I read it.
See this question:
Spurious readiness notification for Select System call
I am aware of an error in a popular desktop operating where O_NONBLOCK TCP sockets, particularly those running over the loopback interface, can sometimes return EAGAIN from recv() after select() reports the socket is ready for reading. In my case, this happens after the other side half-closes the sending stream.
For more details, see the source code for t_nx.ml in the NX library of my OCaml Network Application Environment distribution. (link)
Though my application is a single-threaded one, I noticed that the described behavior is not uncommon in RHEL5. Both with TCP and UDP sockets that were set to O_NONBLOCK (the only socket option that is set). select() reports that the socket is ready but the following recv() returns EAGAIN.
Yes, it's real. Here's one way it can happen:
A future modification to the TCP protocol adds the ability for one side to "revoke" information it sent provided it hasn't been received yet by the other side's application layer. This feature is negotiated on the connection. The other side sends you some data, you get a select hit. Before you can call recv, the other side "revokes" the data using this new extension. Your read gets a "would block" error because no data is available to be read.
The select function is a status-reporting function that does not come with future guarantees. Assuming that a hit on select now assures that a subsequent operation won't block is as invalid as using any other status-reporting function this way. It's as bad as using access to try to ensure a subsequent operation won't fail due to incorrect permissions or using statfs to try to ensure a subsequent write won't fail due to a full disk.
It is possible in a multithreaded environment where two threads are reading from the socket. Is this a multithreaded application?
If you do not call any other syscall between select() and recv() on this socket, then recv() will never return EAGAIN or EWOULDBLOCK.
I don't know what they mean with recv-timeout, however, the POSIX standard does not mention it here so you can be safe calling recv().