select(), recv() and EWOULDBLOCK on non-blocking sockets - c++

I would like to know if the following scenario is real?!
select() (RD) on non-blocking TCP socket says that the socket is ready
following recv() would return EWOULDBLOCK despite the call to select()

For recv() you would get EAGAIN rather than EWOULDBLOCK, and yes it is possible. Since you have just checked with select() then one of two things happened:
Something else (another thread) has drained the input buffer between select() and recv().
A receive timeout was set on the socket and it expired without data being received.

It's possible, but only in a situation where you have multiple threads/processes trying to read from the same socket.

On Linux it's even documented that this can happen, as I read it.
See this question:
Spurious readiness notification for Select System call

I am aware of an error in a popular desktop operating where O_NONBLOCK TCP sockets, particularly those running over the loopback interface, can sometimes return EAGAIN from recv() after select() reports the socket is ready for reading. In my case, this happens after the other side half-closes the sending stream.
For more details, see the source code for t_nx.ml in the NX library of my OCaml Network Application Environment distribution. (link)

Though my application is a single-threaded one, I noticed that the described behavior is not uncommon in RHEL5. Both with TCP and UDP sockets that were set to O_NONBLOCK (the only socket option that is set). select() reports that the socket is ready but the following recv() returns EAGAIN.

Yes, it's real. Here's one way it can happen:
A future modification to the TCP protocol adds the ability for one side to "revoke" information it sent provided it hasn't been received yet by the other side's application layer. This feature is negotiated on the connection. The other side sends you some data, you get a select hit. Before you can call recv, the other side "revokes" the data using this new extension. Your read gets a "would block" error because no data is available to be read.
The select function is a status-reporting function that does not come with future guarantees. Assuming that a hit on select now assures that a subsequent operation won't block is as invalid as using any other status-reporting function this way. It's as bad as using access to try to ensure a subsequent operation won't fail due to incorrect permissions or using statfs to try to ensure a subsequent write won't fail due to a full disk.

It is possible in a multithreaded environment where two threads are reading from the socket. Is this a multithreaded application?

If you do not call any other syscall between select() and recv() on this socket, then recv() will never return EAGAIN or EWOULDBLOCK.
I don't know what they mean with recv-timeout, however, the POSIX standard does not mention it here so you can be safe calling recv().

Related

How to make 'send' as non-blocking in winsock [duplicate]

I am making a program which sends UDP packets to a server at a fixed interval, something like this:
while (!stop) {
Sleep(fixedInterval);
send(sock, pkt, payloadSize, flags);
}
However the periodicity cannot be guaranteed because send is a blocking call (e.g., when fixedInterval is 20ms, and a call to send is > 20ms ). Do you know how I can turn the send into a non-blocking operation?
You need to use a non-blocking socket. The send/receive functions are the same functions for blocking or non-blocking operations, but you must set the socket itself to non-blocking.
u_long mode = 1; // 1 to enable non-blocking socket
ioctlsocket(sock, FIONBIO, &mode);
Also, be aware that working with non-blocking sockets is quite different. You'll need to make sure you handle WSAEWOULDBLOCK errors as success! :)
So, using non-blocking sockets may help, but still will not guarantee an exact period. You would be better to drive this from a timer, rather than this simple loop, so that any latency from calling send, even in non-blocking mode, will not affect the timing.
The API ioctlsocket can do it.You can use it as below.But why don't you use I/O models in winsock?
ioctlsocket(hsock,FIOBIO,(unsigned long *)&ul);
My memory is fuzzy here since it's probably been 15 years since I've used UDP non-blocking.
However, there are some things of which you should be aware.
Send only smallish packets if you're going over a public network. The PATH MTU can trip you up if either the client or the server is not written to take care of incomplete packets.
Make sure you check that you have sent the number of bytes you think you have to send. It can get weird when you're expecting to see 300 bytes sent and the receiving end only gets 248. Both client side and server side have to be aware of this issue.
See here for some good advice from the Linux folks.
See here for the Unix Socket FAQ for UDP
This is a good, general network programming FAQ and example page.
How about measuring the time Send takes and then just sleeping the time missing up to the 20ms?

How to make 'send' non-blocking in winsock

I am making a program which sends UDP packets to a server at a fixed interval, something like this:
while (!stop) {
Sleep(fixedInterval);
send(sock, pkt, payloadSize, flags);
}
However the periodicity cannot be guaranteed because send is a blocking call (e.g., when fixedInterval is 20ms, and a call to send is > 20ms ). Do you know how I can turn the send into a non-blocking operation?
You need to use a non-blocking socket. The send/receive functions are the same functions for blocking or non-blocking operations, but you must set the socket itself to non-blocking.
u_long mode = 1; // 1 to enable non-blocking socket
ioctlsocket(sock, FIONBIO, &mode);
Also, be aware that working with non-blocking sockets is quite different. You'll need to make sure you handle WSAEWOULDBLOCK errors as success! :)
So, using non-blocking sockets may help, but still will not guarantee an exact period. You would be better to drive this from a timer, rather than this simple loop, so that any latency from calling send, even in non-blocking mode, will not affect the timing.
The API ioctlsocket can do it.You can use it as below.But why don't you use I/O models in winsock?
ioctlsocket(hsock,FIOBIO,(unsigned long *)&ul);
My memory is fuzzy here since it's probably been 15 years since I've used UDP non-blocking.
However, there are some things of which you should be aware.
Send only smallish packets if you're going over a public network. The PATH MTU can trip you up if either the client or the server is not written to take care of incomplete packets.
Make sure you check that you have sent the number of bytes you think you have to send. It can get weird when you're expecting to see 300 bytes sent and the receiving end only gets 248. Both client side and server side have to be aware of this issue.
See here for some good advice from the Linux folks.
See here for the Unix Socket FAQ for UDP
This is a good, general network programming FAQ and example page.
How about measuring the time Send takes and then just sleeping the time missing up to the 20ms?

Handling POSIX socket read() errors

Currently I am implementing a simple client-server program with just the basic functionalities of read/write.
However I noticed that if for example my server calls a write() to reply my client, and if my client does not have a corresponding read() function, my server program will just hang there.
Currently I am thinking of using a simple timer to define a timeout count, and then to disconnect the client after a certain count, but I am wondering if there is a more elegant/or standard way of handling such errors?
There are two general approaches to prevent server blocking and to handle multiple clients by a single server instance:
use POSIX threads to handle each client's connection. If one thread blocks because of erroneous client, other threads will still continue to run. If the remote client has just disappeared (crashed, network down, etc.), then sooner or later the TCP stack will signal a timeout and the blocked write operation will fail with error.
use non-blocking I/O together with a polling mechanism, e.g. select(2) or poll(2). It is quite harder to program using polling calls though. Network sockets are made non-blocking using fcntl(2) and in cases where a normal write(2) or read(2) on the socket would block an EAGAIN error is returned instead. You can use select(2) or poll(2) to wait for something to happen on the socket with an adjustable timeout period. For example, waiting for the socket to become writable, means that you will be notified when there is enough socket send buffer space, e.g. previously written data was flushed to the client machine TCP stack.
If the client side isn't going to read from the socket anymore, it should close down the socket with close. And if you don't want to do that because the client still might want to write to the socket, then you should at least close the read half with shutdown(fd, SHUT_RD).
This will set it up so the server gets an EPIPE on the write call.
If you don't control the clients... if random clients you didn't write can connect, the server should handle clients actively attempting to be malicious. One way for a client to be malicious is to attempt to force your server to hang. You should use a combination of non-blocking sockets and the timeout mechanism you describe to keep this from happening.
In general you should write the protocols for how the server and client communicate so that neither the server or client are trying to write to the socket when the other side isn't going to be reading. This doesn't mean you have to synchronize them tightly or anything. But, for example, HTTP is defined in such a way that it's quite clear for either side as to whether or not the other side is really expecting them to write anything at any given point in the protocol.

Gracefully shut down a TCP socket

I'm writing an IRC client in C++ and currently I'm having an issue where, upon exit, I do:
Send("QUIT :Quit\r\n"); // just an inline, variadic send() wrapper
shutdown(m_hSocket, SD_BOTH);
closesocket(m_hSocket);
WSAShutdown();
However, the issue is that the QUIT message is not being sent. I've sniffed the packets coming from the client and infact this message is never sent. I believe this is an issue with the socket not being flushed, but I have no idea how to do this and Google suggested disabling Nagle's algorithm but I doubt this is good practice.
Thanks in advance.
First of all you should check the return value of send: are the data you attempt to send actually accepted by the network stack? (In general this should be done after each and every send call, not just in this case).
Assuming the data is accepted, then AFAIK it should be actually transmitted as a result of calling shutdown. You might try using SO_LINGER to see if it makes a difference, see Graceful Shutdown, Linger Options, and Socket Closure on MSDN.

recv() with errno=107:(transport endpoint connected)

well..I use a typical model of epoll+multithread to handle massive sockets, that is, I have a thread called epollWorkThread that use epoll_wait to handle i/o sockets. While there's an event of EPOLLIN, recv() will do the work and I do use the noblocking mode to allow immediate return. And recv() is indeed in a while(true) loop.
Everything is fine in the intial time(maybe a couple of hours or maybe minutes or if I'm lucky days), I can receive the information. But some time later, recv() insists to return -1 with the errno = 107(ENOTCONN). The other peer of the transport is written in AS3 which makes sure that the socket is connected. So I'm confused by the recv() behaviour. Thank you in advance and any comment is appreciated!
Errno 107 means that the socket is NOT connected (any more).
There are several reasons why this could happen. Assuming you're right and both sides of the connection claim that the socket is still open, an intermediate router/switch may have dropped the connection due to a timeout. The safest way to avoid such things from happen is to periodically send a 'health' or 'keep-alive' message. (Thus the intermediate router/switch accepts the connection as living...)=