Currently I am implementing a simple client-server program with just the basic read/write functionality.
However, I noticed that if, for example, my server calls write() to reply to my client and the client does not have a corresponding read() call, my server program will just hang there.
Currently I am thinking of using a simple timer to define a timeout count and disconnecting the client after a certain count, but I am wondering if there is a more elegant or standard way of handling such errors?
There are two general approaches to prevent server blocking and to handle multiple clients with a single server instance:
use POSIX threads to handle each client's connection. If one thread blocks because of an erroneous client, the other threads will continue to run. If the remote client has simply disappeared (crashed, network down, etc.), then sooner or later the TCP stack will signal a timeout and the blocked write operation will fail with an error.
use non-blocking I/O together with a polling mechanism, e.g. select(2) or poll(2). It is considerably harder to program with polling calls, though. Network sockets are made non-blocking using fcntl(2), and in cases where a normal write(2) or read(2) on the socket would block, an EAGAIN error is returned instead. You can use select(2) or poll(2) to wait, with an adjustable timeout period, for something to happen on the socket. For example, waiting for the socket to become writable means that you will be notified when there is enough space in the socket send buffer, e.g. because previously written data has been flushed to the client machine's TCP stack. A minimal sketch of this approach follows the list.
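For the second approach, here is a sketch of the idea, assuming an already connected TCP socket fd (the helper names are just illustrative, error handling is minimal):

#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Put an already connected socket into non-blocking mode. */
static int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Try to write, but give up if the peer does not drain its side of the
 * connection within timeout_sec seconds.  Returns the number of bytes
 * written, 0 on timeout, -1 on error. */
static ssize_t write_with_timeout(int fd, const void *buf, size_t len,
                                  int timeout_sec)
{
    fd_set wfds;
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);

    int ready = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (ready == -1)
        return -1;              /* select() itself failed */
    if (ready == 0)
        return 0;               /* timed out: candidate for disconnecting */

    ssize_t n = write(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;               /* send buffer filled up again in the meantime */
    return n;
}

The same select() call with the read set instead of the write set covers the blocking-read case in the same way.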
If the client side isn't going to read from the socket anymore, it should close the socket with close. And if you don't want to do that because the client might still want to write to the socket, then you should at least close the read half with shutdown(fd, SHUT_RD).
This will set it up so the server gets an EPIPE on the write call.
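For reference, the two client-side options described above look roughly like this (a sketch; the helper name is made up):

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative helper: the client tells the system it will never read
 * from this connection again. */
static void client_done_reading(int fd, int still_writing)
{
    if (still_writing) {
        /* Keep the write half open, close only the read half. */
        if (shutdown(fd, SHUT_RD) == -1)
            perror("shutdown");
    } else {
        /* Completely done with the connection. */
        if (close(fd) == -1)
            perror("close");
    }
}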
If you don't control the clients (that is, if random clients you didn't write can connect), the server should handle clients that actively attempt to be malicious. One way for a client to be malicious is to try to force your server to hang. You should use a combination of non-blocking sockets and the timeout mechanism you describe to keep this from happening.
In general, you should design the protocol the server and client use to communicate so that neither the server nor the client is trying to write to the socket when the other side isn't going to be reading. This doesn't mean you have to synchronize them tightly or anything. But, for example, HTTP is defined in such a way that it's quite clear to either side whether or not the other side is really expecting it to write anything at any given point in the protocol.
Related
I've recently begun to implement a UDP socket receiver with Registered I/O on Win32. I've stumbled upon the following issue: I don't see any way to cancel pending RIOReceive()/RIOReceiveEx() operations without closing the socket.
To summarize the basic situation:
During normal operation I want to queue quite a few RIOReceive()/RIOReceiveEx() operations in the request queue to ensure that I get the best possible performance when it comes to receiving the UDP packets.
However, at some point I may want to stop what I'm doing. If UDP packets are still arriving at that point, fine, I can just wait until all pending requests have been processed. Unfortunately, if the sender has also stopped sending UDP packets, I still have the pending receive operations.
That in and by itself is not a problem, because I can just keep going once operations start again.
However, if I want to reconfigure the buffers in between, I run into an issue: the documentation states that it's an error to deregister a buffer with RIO while it's still in use, and as long as receive operations are still pending, the buffers are officially still in use, so I can't do that.
What I've tried so far related to cancellation of these requests:
CancelIo() on the socket (no effect)
CancelSynchronousIo(GetCurrentThread()) (no effect)
shutdown(s, SD_RECEIVE) (success, but no effect, the socket even receives packets afterwards -- though shutdown probably wouldn't have been helpful anyway)
WSAIoctl(s, SIO_FLUSH, ...) because the docs of RIOReceiveEx() mentioned it, but that just gives me WSAEOPNOTSUPP on UDP sockets (probably only useful for TCP and probably also only useful for sending, not receiving)
Just for fun I tried setting setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, ...) with 1 ms as the timeout -- and that doesn't appear to have any effect on RIO, regardless of whether I call it before or after I queue the RIOReceive()/RIOReceiveEx() calls.
Only closing the socket will successfully cancel the I/O.
I've also thought about doing RIOCloseCompletionQueue(), but there I wouldn't even know how to proceed afterwards, since there's no way of reassigning a completion queue to a request queue, as far as I can see, and you can only ever create a single request queue for a socket. (If there was something like RIOCloseRequestQueue() and that did cancel the pending requests, I'd be happy, but the docs only mention that closesocket() will free resources associated with the request queue.)
So what I'm left with is the following:
Either I have to write my logic so that the buffers that are being used are always fixed once the socket is opened, because I can't really ever change them in practice due to requests that could still be pending.
Or I have to close the socket and reopen it every time I want to change something here. But that introduces a race condition, because I'd have to bind the socket again, and I'd really like to avoid that if possible.
I've tested sending UDP packets to my own socket from a newly created different socket until all of the requests have been 'eaten up' -- and while that works in principle, I really don't like it, because if any kind of firewall rule decides to not allow this, the code would deadlock instantly.
On Linux io_uring I can just cancel existing operations, or even exit the uring, and once that's done, I'm sure that there are no receive operations still active, but the socket is still there and accessible. (And on Linux it's nice that the socket still behaves like a normal socket, on Windows if I create the socket with the WSA_FLAG_REGISTERED_IO flag, I can't use it outside of RIO except for operations such as bind().)
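For comparison, this is roughly what I mean on the Linux side, with liburing (just a sketch; the helper names are mine and error handling is omitted):

#include <liburing.h>

/* Sketch: queue a receive tagged with `tag`, then later cancel it again
 * without touching the socket. */
static void queue_recv(struct io_uring *ring, int sockfd,
                       void *buf, size_t len, void *tag)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_recv(sqe, sockfd, buf, len, 0);
    io_uring_sqe_set_data(sqe, tag);       /* identifies this request */
    io_uring_submit(ring);
}

static void cancel_recv(struct io_uring *ring, void *tag)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_cancel(sqe, tag, 0);     /* match by user_data */
    io_uring_submit(ring);
    /* The cancelled receive completes with -ECANCELED; once its CQE has
     * been reaped, the buffer is no longer in use and can be swapped out. */
}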
Am I missing something here or is this simply not possible with Registered I/O?
I am currently planning how to develop a man-in-the-middle network application for a TCP server that would transfer data between the server and a client. It would behave as a regular client towards the server and as a server towards the remote client, without modifying any data. It will optionally be used to detect and measure how long the server or the client is unable to receive data that is ready to be received while the connection is inactive.
I am planning to use the blocking send and recv functions. Before any data transfer I would call setsockopt to set SO_SNDTIMEO and SO_RCVTIMEO to about 10-20 milliseconds (roughly as sketched below), assuming this will force the blocking send and recv functions to return early so that data from another active connection can be routed. Running a thread per connection looks too expensive. I would not use asynchronous sockets here because I cannot find any guarantee that they will complete within a fraction of a second, especially when a large amount of data is being sent or received; high data delays do not look good. I would use very small buffers here, but calling a function for each received byte looks like overkill.
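For concreteness, the timeout setup I have in mind looks roughly like this (a sketch; 15 ms is just an example value):

#include <winsock2.h>

/* Sketch: ask blocking send()/recv() on s to give up after roughly 15 ms.
 * On Windows the option value is a DWORD holding milliseconds. */
static int set_short_timeouts(SOCKET s)
{
    DWORD timeout_ms = 15;   /* example value in the 10-20 ms range */

    if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO,
                   (const char *)&timeout_ms, sizeof(timeout_ms)) != 0)
        return SOCKET_ERROR;
    return setsockopt(s, SOL_SOCKET, SO_SNDTIMEO,
                      (const char *)&timeout_ms, sizeof(timeout_ms));
}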
My next assumption is that it is safe to call send or recv again later if a previous call was terminated by the timeout and less data was transferred than requested.
But I am confused by contradictory information available on MSDN.
send function
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740149%28v=vs.85%29.aspx
If no error occurs, send returns the total number of bytes sent, which
can be less than the number requested to be sent in the len parameter.
SOL_SOCKET Socket Options
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740532%28v=vs.85%29.aspx
SO_SNDTIMEO - The timeout, in milliseconds, for blocking send calls.
The default for this option is zero, which indicates that a send
operation will not time out. If a blocking send call times out, the
connection is in an indeterminate state and should be closed.
Are my assumptions correct, and can I use these functions like this? Or is there a more effective way to do this?
Thanks for any answers.
While you MIGHT implement something along the lines of the ideas given in your question, there are preferable alternatives on all major systems.
Namely:
kqueue on FreeBSD and family, and on Mac OS X.
epoll on Linux and related operating systems.
I/O completion ports on Windows.
Using those technologies allows you to process traffic on multiple sockets in an efficient, reactive manner, without timeout logic or polling loops. They can all be considered successors of the venerable select() function in the socket API.
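To illustrate, a bare-bones epoll loop on Linux might look like this (a sketch; the per-connection handler is a trivial placeholder and error handling is omitted):

#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

/* Placeholder per-connection handler: drain whatever is readable. */
static void handle_io(int fd)
{
    char buf[4096];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);
    if (n <= 0)
        close(fd);          /* EOF or error: drop the connection */
    /* else: process buf[0..n) */
}

/* Skeleton of a reactive event loop: one thread, many sockets, no
 * timeout logic -- epoll_wait() sleeps until there is actual work. */
static void event_loop(int listen_fd)
{
    struct epoll_event ev, events[MAX_EVENTS];
    int epfd = epoll_create1(0);

    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                /* New connection: register it with epoll as well. */
                int client = accept(listen_fd, NULL, NULL);
                ev.events = EPOLLIN;
                ev.data.fd = client;
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &ev);
            } else {
                handle_io(events[i].data.fd);
            }
        }
    }
}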
As for the send() documentation quoted in your question, it is not really confusing or contradictory. Useful network protocols implement a mechanism to create "backpressure" for situations where a sender tries to send more data than the receiver (and/or the transport channel) can accommodate. So an application can only hand more data to send() if the network stack has buffer space ready for it.
If, for example, an application tries to send 3 KB worth of data and the TCP/IP stack only has room for 800 bytes, send() might succeed and report that it consumed 800 of the offered bytes.
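The caller therefore has to cope with a short count and retry the remainder later, for example along these lines (a sketch for a non-blocking socket):

#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hand as much of buf to the kernel as it will take right now.
 * Returns the number of bytes accepted (possibly fewer than len),
 * 0 if the send buffer is currently full, or -1 on a real error. */
static ssize_t send_some(int fd, const char *buf, size_t len)
{
    ssize_t n = send(fd, buf, len, 0);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;       /* no room at all: retry when the socket is writable */
    return n;
}

The unsent tail (buf + n, len - n) stays with the application until the socket reports writable again, which is exactly the backpressure signal described above.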
The basic approach to forwarding data on a connection is: do not read from the incoming socket until you know you can send that data to the outgoing socket. If you read greedily (and buffer at the application layer), you deprive the communication channel of its backpressure mechanism.
So basically, the "send capability" should drive the receive actions.
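Concretely, in a poll(2)-based relay this means only asking for readability on the inbound socket while the staging buffer is empty, and only asking for writability on the outbound socket while there is something left to flush. A sketch of that gating (one direction only, illustrative names, error handling trimmed):

#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Relay in_fd -> out_fd through one small staging buffer (one direction
 * only; both descriptors assumed non-blocking).  Readability is only
 * requested while the buffer is empty, so the upstream sender is
 * throttled by how fast the downstream receiver drains the data. */
static void relay_one_way(int in_fd, int out_fd)
{
    char buf[2048];
    size_t have = 0, sent = 0;

    for (;;) {
        struct pollfd pfd[2] = {
            { .fd = in_fd,  .events = (short)(have == 0 ? POLLIN  : 0) },
            { .fd = out_fd, .events = (short)(have > 0  ? POLLOUT : 0) },
        };

        if (poll(pfd, 2, -1) <= 0)
            break;

        if (pfd[0].revents & POLLIN) {
            ssize_t n = read(in_fd, buf, sizeof(buf));
            if (n <= 0)
                break;                  /* EOF or error: stop relaying */
            have = (size_t)n;
            sent = 0;
        }

        if (pfd[1].revents & POLLOUT) {
            ssize_t n = write(out_fd, buf + sent, have - sent);
            if (n < 0)
                break;
            sent += (size_t)n;
            if (sent == have)
                have = sent = 0;        /* drained: allow reading again */
        }

        if ((pfd[0].revents | pfd[1].revents) & (POLLERR | POLLNVAL))
            break;                      /* one of the sockets went bad */
    }
}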
As for using timeouts for this "middle man", there are two major scenarios:
You know the sending behavior of the sender application, i.e. whether it intends to send any data within your chosen receive timeout at any given time. Some applications send only sporadically, and any chosen receive timeout could be wrong. Even if the application is supposed to send at a specific interval, your timeouts will cause trouble as soon as someone debugs the sending application.
You want the "middle man" to work for unknown applications (which, of course, must not use encryption if the middle man is to have a chance). In that case you cannot pick an "adequate" timeout value, because you know nothing about the sending behavior of the involved application(s).
As a previous poster has suggested, I strongly urge you to reconsider the design of your server so that it employs an asynchronous I/O strategy. This may very well require that you spend significant time learning about each operating systems' preferred approach. It will be time well-spent.
For anything other than a toy application, using blocking I/O in the manner that you suggest will not perform well. Even with short timeouts, it sounds to me as though you won't be able to service new connections until you have completed the work for the current connection. You may also find (with short timeouts) that you're burning more CPU time spinning waiting for work to do than actually doing work.
A previous poster wisely suggested taking a look at Windows I/O completion ports. Have a look at this article I wrote in 2007 for Dr. Dobb's. It's not perfect, but I try to do a decent job of explaining how you can design a simple server that uses a small thread pool to handle potentially large numbers of connections:
Windows I/O Completion Ports
http://www.drdobbs.com/cpp/multithreaded-asynchronous-io-io-comple/201202921
If you're on Linux/FreeBSD/MacOSX, take a look at libevent:
Libevent
http://libevent.org/
Finally, a good, practical book on writing TCP/IP servers and clients is "Practical TCP/IP Sockets in C" by Michael Donahoo and Kenneth Calvert. You could also check out the W. Richard Stevens texts (which cover the topic thoroughly for UNIX).
In summary, I think you should take some time to learn more about asynchronous socket I/O and the established, best-of-breed approaches for developing servers.
Feel free to private message me if you have questions down the road.
I have a server application written in C++. When a client connects, the server creates a new thread for it. In that thread there is a BLOCKING read from a socket. Because a client can accidentally disconnect and leave behind a thread still hanging in the read call, there is a thread that checks whether the sockets are still alive by sending "heartbeat messages". The message consists of a single character and is "ignored" by the client (it is not processed like other messages). The write looks like this:
write(fd, ";", 1);
It works fine, but is it really necessary to send a random character through the socket? I tried to send an empty message ("" with length 0), but that didn't work. Is there a better way to check whether the socket is still alive?
Edit:
I'm using BSD sockets (TCP).
I'm assuming that when you say "socket", you mean a TCP network socket.
If that's true, then the TCP protocol gives you a keepalive option that you would need to ask the OS to use.
I think this StackOverflow answer gets at what you would need to do, assuming a BSDish socket library.
In my experience, using heartbeat messages on TCP (and checking for responses, e.g. NOP/NOP-ACK) is the easiest way to get reliable and timely indication of connectivity at the application layer. The network layer can do some interesting things but getting notification in your application can be tricky.
If you can switch to UDP, you'll have more control and flexibility at the application layer, and probably reduced traffic overall since you can customize the communications, but you'll need to handle reliability, packet ordering, etc. yourself.
You can enable TCP keepalive on the connection. You may be interested in this link: http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html
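A minimal sketch of turning it on (the interval values below are arbitrary examples):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keepalive probing on a connected TCP socket: start probing after
 * 30 s of idle time, probe every 5 s, and give up after 3 missed probes.
 * The TCP_KEEP* knobs are Linux-specific; SO_KEEPALIVE itself is portable. */
static int enable_keepalive(int fd)
{
    int on = 1, idle = 30, interval = 5, count = 3;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) == -1)
        return -1;
    return setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));
}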
It is OK to create a thread for each incoming connection if this is only a toy. Most of the time, though, I use poll, i.e. non-blocking I/O, for better performance.
Suppose I have a server application - the connection is over TCP, using UNIX sockets.
The connection is asynchronous - in other words, clients' and servers' sockets are non-blocking.
Suppose the following situation: in some conditions, the server may decide to send some data to a connected client and immediately close the connection: using shutdown with SHUT_RDWR.
So, my question is: is it guaranteed that when the client calls recv, it will receive the data sent by the server?
Or must recv be called before the server's shutdown in order to receive the data? If so, what should I do (or, to be more precise, how should I do it) to make sure the data is received by the client?
You can control this behavior with "setsockopt(SO_LINGER)":
man setsockopt
SO_LINGER
Waits to complete the close function if data is present. When this option is enabled and there is unsent data present when the close function is called, the calling application is blocked during the close function until the data is transmitted or the connection has timed out. When this option is disabled, the close function returns without blocking the caller.
This option has meaning only for stream sockets.
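For illustration, enabling it looks roughly like this (the 10-second value is just an example):

#include <sys/socket.h>

/* Make close() block for up to 10 seconds while unsent data is still
 * being transmitted, instead of returning immediately. */
static int linger_on_close(int fd)
{
    struct linger lg = { .l_onoff = 1, .l_linger = 10 };   /* seconds */

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}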
See also:
man read
Beej's Guide to Network Programming
There's no guarantee that you will receive any data, let alone this particular data, but the data pending when the socket is closed is subject to the same guarantees as all other data: if it arrives at all, it will arrive in order and undamaged, subject to TCP's best efforts.
NB 'Asynchronous' and 'non-blocking' are two different things, not two terms for the same thing.
Once you have successfully written the data to the socket, it is in the kernel's buffer, where it will stay until it has been sent and acknowledged. Shutdown doesn't cause the buffered data to get lost. Closing the socket doesn't cause the buffered data to get lost. Not even the death of the sending process would cause the buffered data to get lost.
You can observe the size of the buffer with netstat. The SendQ column is how much data the kernel still wants to transmit.
After the client has acknowledged everything, the port disappears from the server. This may happen before the client has read the data, in which case the data will be in the RecvQ on the client. Basically you have nothing to worry about: after a successful write to a TCP socket, every component involved tries as hard as it can to make sure that your data gets to the destination unharmed, regardless of what happens to the sending socket and/or process.
Well, maybe one thing to worry about: if the client tries to send anything after the server has done its shutdown, it could get a SIGPIPE and die before it has read all the data still available on the socket.
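If that is a concern, a common mitigation on the client (assuming a POSIX system) is to turn the signal into an ordinary error return:

#include <signal.h>
#include <sys/socket.h>

/* Call once at startup: a send on a broken connection then fails with
 * errno == EPIPE instead of raising SIGPIPE and killing the process. */
static void ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}

/* Alternative (Linux): suppress the signal for a single send() call. */
static ssize_t send_nosignal(int fd, const void *buf, size_t len)
{
    return send(fd, buf, len, MSG_NOSIGNAL);
}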
I would like to know if the following scenario is real:
select() (RD) on non-blocking TCP socket says that the socket is ready
following recv() would return EWOULDBLOCK despite the call to select()
For recv() you would get EAGAIN rather than EWOULDBLOCK, and yes, it is possible. Since you have just checked with select(), one of two things happened (either way, see the handling sketch after this list):
Something else (another thread) has drained the input buffer between select() and recv().
A receive timeout was set on the socket and it expired without data being received.
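Either way, the defensive pattern is the same: keep the socket non-blocking and treat EAGAIN/EWOULDBLOCK from recv() as "not ready after all", then go back to select(). A minimal sketch (the helper name is mine):

#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Call after select()/poll() reported fd readable (fd is non-blocking).
 * Returns bytes read, 0 on orderly EOF, -1 on a real error, and -2 when
 * the readiness report turned out to be spurious. */
static ssize_t recv_after_select(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;      /* nothing there after all: go back to the event loop */
    return n;
}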
It's possible, but only in a situation where you have multiple threads/processes trying to read from the same socket.
On Linux it's even documented that this can happen, as I read it.
See this question:
Spurious readiness notification for Select System call
I am aware of an error in a popular desktop operating system where O_NONBLOCK TCP sockets, particularly those running over the loopback interface, can sometimes return EAGAIN from recv() after select() reports that the socket is ready for reading. In my case, this happens after the other side half-closes the sending stream.
For more details, see the source code for t_nx.ml in the NX library of my OCaml Network Application Environment distribution. (link)
Though my application is single-threaded, I noticed that the described behavior is not uncommon on RHEL5, both with TCP and UDP sockets that were set to O_NONBLOCK (the only socket option that is set): select() reports that the socket is ready, but the following recv() returns EAGAIN.
Yes, it's real. Here's one way it can happen:
A future modification to the TCP protocol adds the ability for one side to "revoke" information it sent provided it hasn't been received yet by the other side's application layer. This feature is negotiated on the connection. The other side sends you some data, you get a select hit. Before you can call recv, the other side "revokes" the data using this new extension. Your read gets a "would block" error because no data is available to be read.
The select function is a status-reporting function that does not come with future guarantees. Assuming that a hit on select now assures that a subsequent operation won't block is as invalid as using any other status-reporting function this way. It's as bad as using access to try to ensure a subsequent operation won't fail due to incorrect permissions or using statfs to try to ensure a subsequent write won't fail due to a full disk.
It is possible in a multithreaded environment where two threads are reading from the socket. Is this a multithreaded application?
If you do not call any other syscall between select() and recv() on this socket, then recv() will never return EAGAIN or EWOULDBLOCK.
I don't know what they mean by a recv timeout; however, the POSIX standard does not mention it here, so you should be safe calling recv().