Send/Recv Socket Blocking Issues - c++

Another question about my beloved sockets.
I'll first explain my setup, and then what's bothering me.
I have a client and a server. Both applications are written in C++ using Winsock2. The connection runs over TCP and WLAN. The WLAN part is very important, because it is probably causing the issue and is definitely going to be the communication channel.
I'm connecting two sockets to the server: a SendSocket and a ReceiveSocket. I'm constantly sending video data to the server through the SendSocket. The data is processed, sent back to the client, and displayed. Each socket has its own thread.
The video data is encoded, so I get about 500 kB/s. Let's treat this rate as fixed, without further explanation.
Perfect communication viewed by the client:
Send Data
Recv Data
Send Data
Recv Data
...
This is the case for about 100 frames.
But every couple of frames, the stream freezes for about 4 frames (roughly 500 ms) and then continues.
That's the issue I'm facing.
What happens to the stream is the following:
Send Data
Recv Data
Send Data
Send Data
Send Data1 -> blocked send
Recv Data
Recv Data
Send Data2 -> not blocked anymore.
The data is sent properly on the server side.
Since WLAN is not duplex (as far as I know), I thought that the send calls are prioritized for some reason, and that afterwards the receive calls are prioritized, so the send call blocks until the recv calls are done.
Maybe you can tell me what is happening in the lower layers that could cause the problem.
By the way, I'm definitely not sure that it isn't just a bandwidth issue, but I thought WLAN should be able to handle 500 kB/s. These 500 kB/s are upstream and downstream combined.
Important note: reducing the frame rate to 1/5 does not fix the issue.
I know it's hard to fix this issue with so little insight. I would be happy if you could share your knowledge so that I may be able to fix it myself.
EDIT: It's perfectly fine if the client recv hangs a little. But it must not block the send. The server needs data continuously.

A blocked send means that the socket send buffer is full, which in turn means either (a) that the socket receive buffer at the receiver is full, so the receiver isn't reading as fast as you're sending; or (b) that there are network losses that are causing the sender to retransmit. In either case there is nothing you can do about it at the sending end.
Someone is bound to mention non-blocking I/O as a solution, but it isn't: at the point where a blocking sender blocks, a non-blocking sender will get -1 from send() with 'errno == EAGAIN/EWOULDBLOCK' (WSAEWOULDBLOCK from WSAGetLastError() on Winsock), which doesn't solve the actual problem at all.
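To make the point about non-blocking senders concrete, here is a minimal sketch, assuming a POSIX system rather than Winsock. The helper name fillUntilWouldBlock is made up for illustration. It keeps sending on a non-blocking socket whose peer never reads and stops at the first EAGAIN/EWOULDBLOCK, which is the same "send buffer full" condition that would make a blocking send() stall.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: put fd into non-blocking mode and send until the
// kernel refuses more data. Returns the number of bytes queued, or -1 on
// any error other than "would block".
long fillUntilWouldBlock(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    char chunk[4096] = {};
    long queued = 0;
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof chunk, 0);
        if (n > 0) {            // possibly a partial send; count what was taken
            queued += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;           // interrupted by a signal, just retry
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return queued;      // buffer full: a blocking sender would block here
        return -1;              // some other failure
    }
}
```

The return value only tells you how much the kernel was willing to buffer. The remaining data still cannot go anywhere until the receiver reads, so the application has to wait either way.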

Alright then. It was definitely a WLAN issue. I tested over the eduroam WLAN at my university (I don't know if anybody knows it). Now I have tested it with a simple router and it worked fine. It seems the eduroam WLAN has some trouble with bandwidth or direction changes. I won't look into that...

Related

Winsock send() issue with single byte transmissions

I'm experiencing a frustrating behaviour of Windows sockets that I can't find any info on, so I thought I'd try here.
My problem is as follows:
I have a C++ application that serves as a device driver, communicating with a serial device connected through a serial to TCP/IP converter.
The serial protocol requires a lot of single byte messages to be communicated between the device and my software. I noticed that these small messages are only sent about 3 times after startup, after which they are no longer actually transmitted (checked with Wireshark). All the while, send() keeps returning > 0, indicating that the message has been copied to its send buffer.
I'm using blocking sockets.
I discovered this issue because this particular driver eventually has to drop its connection when the send buffer is completely filled (select() fails due to this after about 5 hours, but it happens much sooner when I reduce the SO_SNDBUF size).
I checked, and noticed that when I call send with messages of 2 bytes or larger, transmission never fails.
Any input would be very much appreciated, I am out of ideas how to fix this.
This is a rare case when you should set TCP_NODELAY so that the sends are written individually, not coalesced. But I think you have another problem as well. Are you sure you're reading everything that's being sent back? And acting on it properly? It sounds like an application protocol problem to me.
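For reference, a minimal sketch of setting TCP_NODELAY, assuming a POSIX socket API. The helper names disableNagle and nagleDisabled are made up for illustration. On Winsock the call has the same name and option constants, but the socket is a SOCKET and the option value is passed as a const char*.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical helper: turn off Nagle's algorithm so that small writes are
// sent individually instead of being coalesced. Returns true on success.
bool disableNagle(int fd) {
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on) == 0;
}

// Hypothetical helper: report whether TCP_NODELAY is currently set on fd.
bool nagleDisabled(int fd) {
    int value = 0;
    socklen_t len = sizeof value;
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return false;
    return value != 0;
}
```

This only changes when small segments leave the sender. It does nothing for a peer that is not reading, which is the other problem suspected above.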

Forced server-side socket close without SO_LINGER > 0 can lose data, right?

I'm writing a cross-platform client application that uses sockets, written in C++. I'm having problems where the server is doing a hard close on the socket when it's done sending me info.
I've been reading other posts on this topic, and I'm not so much interested in the rights or wrongs of this approach, but it seems the server is either explicitly setting SO_LINGER=0, or that's the default behavior on that system (not sure, it's a Linux box).
I can see (in Wireshark) that the data was sent to me followed within milliseconds by an RST, indicating a hard close by the server. I personally don't agree with this approach, as it should be up to the client to shut down the socket.
The server team are saying there's nothing wrong with that approach (doing a hard close rather than a shutdown), and that it's typical on servers to avoid accumulating TIME_WAIT sockets. On Windows my select() returns indicating there's something to read (while I haven't read any of this "in transit" data yet).
However, because of the quick arrival of the RST, on Windows recv() returns -1 and I'm seeing a 10054 for the error code (connection reset by peer). This wouldn't be too bad if I could at least get the data that was sent, but it seems that once my client's socket stack sees the RST any unread bytes are no longer made available to me.
On Linux (client), there's no problem. It seems the TCP stack is behaving slightly differently, in that I can read the outstanding bytes before the RST is honoured. I'm having trouble convincing the server guys they have a bug, given that it works for a Linux client.
First off, am I correct? Is this a server-side issue? I can't see that the client end is doing anything wrong, so it must be right?
It seems the server team are adamant that they want to perform the close, and they don't want to have TIME_WAITs, so I was going to push for them to add a SO_LINGER of, say, 2 seconds. Does that sound like it will solve my problem? From what I understand this will stop the server from sending out a RST so soon after sending data, and should give me a chance to read the outstanding bytes.
Found a definitive answer to my own question:
"...Upon reception of RST segment, the receiving side will immediately abort the connection. This statement has more implications than just meaning that you will not be able to receive or send any more data to/from this connection. It also implies that any unread data still in the TCP reception buffer will be lost..." It cites the book "TCP/IP Internetworking Volume II". I don't have that book, so I can only take his word for it. It doesn't seem to discard data on Linux, only on Windows...
Olivier Langlois's blog
The side-effect of fiddling with SO_LINGER to force a reset is that all pending data is lost. The fact that you don't receive it is all the proof you need that the server team is wrong to do this.
RFC 793 cited below says 'this command [ABORT] causes all pending SENDs and RECEIVEs to be aborted, ... and a special RESET message to be sent to the TCP on the other side of the connection.' See also W.R. Stevens, TCP/IP Illustrated, Vol. 1, p. 287: 'Aborting a connection provides two features to the application: (1) any queued data is thrown away and the reset is sent immediately, and (2) the receiver of the RST can tell that the other end did an abort instead of a normal close'. There is similar wording, along with an extract from the BSD code that implements it, in Vol. 2.
The TIME_WAIT state only occurs on a socket which sends a FIN before it has received one: see RFC 793. So the server should be waiting for a FIN from the client, with a suitable timeout, rather than resetting. This will also permit the client to do connection pooling.

Receiving data from already closed socket?

Suppose I have a server application - the connection is over TCP, using UNIX sockets.
The connection is asynchronous - in other words, clients' and servers' sockets are non-blocking.
Suppose the following situation: in some conditions, the server may decide to send some data to a connected client and immediately close the connection: using shutdown with SHUT_RDWR.
So, my question is: is it guaranteed that when the client calls recv, it will receive the data sent by the server?
Or, to receive the data, recv must be called before the server's shutdown? If so, what should I do (or, to be more precise, how should I do this), to make sure, that the data is received by the client?
You can control this behavior with "setsockopt(SO_LINGER)":
man setsockopt
SO_LINGER
Waits to complete the close function if data is present. When this option is enabled and there is unsent data present when the close
function is called, the calling application is blocked during the
close function until the data is transmitted or the connection has
timed out. The close function returns without blocking the caller.
This option has meaning only for stream sockets.
See also:
man read
Beej's Guide to Network Programming
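As a rough illustration of the option quoted above, here is a sketch of setting SO_LINGER, assuming a POSIX socket API. The helper name setLinger is made up for illustration. With the option enabled and a positive timeout, close() may block for up to that many seconds while unsent data is transmitted. With the option enabled and a timeout of 0, a TCP close is abortive: pending data is discarded and a reset is sent.

```cpp
#include <sys/socket.h>

// Hypothetical helper: configure how close() treats unsent data.
//   enabled == false         : default behaviour, close() returns at once and
//                              the kernel keeps trying to deliver the data
//   enabled, seconds > 0     : close() may block up to `seconds`
//   enabled, seconds == 0    : abortive close (RST) on TCP, pending data lost
bool setLinger(int fd, bool enabled, int seconds) {
    struct linger lg{};
    lg.l_onoff = enabled ? 1 : 0;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) == 0;
}
```

For example, setLinger(fd, true, 2) would ask for a lingering close of at most 2 seconds.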
There's no guarantee you will receive any data, let alone this data, but the data pending when the socket is closed is subject to the same guarantees as all the other data: if it arrives it will arrive in order and undamaged and subject to TCP's best efforts.
NB 'Asynchronous' and 'non-blocking' are two different things, not two terms for the same thing.
Once you have successfully written the data to the socket, it is in the kernel's buffer, where it will stay until it has been sent and acknowledged. Shutdown doesn't cause the buffered data to get lost. Closing the socket doesn't cause the buffered data to get lost. Not even the death of the sending process would cause the buffered data to get lost.
You can observe the amount of buffered data with netstat. The Send-Q column is how much data the kernel still wants to transmit.
After the client has acknowledged everything, the port disappears from the server. This may happen before the client has read the data, in which case it will be in Recv-Q on the client. Basically you have nothing to worry about. After a successful write to a TCP socket, every component is trying as hard as it can to make sure that your data gets to the destination unharmed regardless of what happens to the sending socket and/or process.
Well, maybe one thing to worry about: If the client tries to send anything after the server has done its shutdown, it could get a SIGPIPE and die before it has read all the available data from the socket.
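The claim that already-written data survives the sender's shutdown can be tried out locally. This is a minimal sketch, assuming a POSIX system. The helper name readUntilEof is made up for illustration, and a test would typically use a local socketpair() as a stand-in for a real TCP connection, so it shows the API behaviour rather than anything about the network.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: read from fd until the peer's end-of-stream
// (recv returns 0) and return everything that was received. It also stops,
// returning what it has so far, on a genuine error.
std::string readUntilEof(int fd) {
    std::string data;
    char buf[4096];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n > 0) {
            data.append(buf, static_cast<std::size_t>(n));
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;   // interrupted by a signal, retry
        break;          // n == 0: orderly shutdown by the peer, or an error
    }
    return data;
}
```

If one end sends a message and then calls shutdown() with SHUT_RDWR, the other end calling readUntilEof afterwards should still get the message first and the end-of-stream second.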

I'm using TCP for very many small sends, should I turn off Nagle's algorithm? (People also know this as TCP_NODELAY)

I remade this post because my title choice was horrible, sorry about that. My new post can be found here: After sending a lot, my send() call causes my program to stall completely. How is this possible?
Thank you very much everyone. The problem was that the clients are actually bots and they never read from the connections. (Feels foolish)
TCP_NODELAY might help the latency of small packets from sender to receiver, but the description you gave points in a different direction. I can imagine the following:
Sending more data than the receivers actually consume eventually fills the sender's buffer (SO_SNDBUF) and causes the server process to appear "stuck" in the send(2) system call. At this point the kernel waits for the other end to acknowledge some of the outstanding data, but the receiver does not expect any data, so it never calls recv(2).
There are probably other explanations, but it's hard to tell without seeing the code.
If send() is blocking on a TCP socket, it indicates that the send buffer is full, which in turn indicates that the peer on the other end of the connection isn't reading data fast enough. Maybe that client is completely stuck and not calling recv() often enough.
Nagle's wouldn't cause "disappearing into the kernel", which is why disabling it doesn't help you. Nagle's will just buffer data for a little while, but will eventually send it without any prompting from the user.
There is some other culprit.
Edit for the updated question.
You must make sure that the client is receiving all of the sent data, and that it is receiving it quickly. Have each client write to a log or something to verify.
For example, if a client is waiting for the server to accept its 23-byte update, then it might not be receiving the data. That can cause the server's send buffer to fill up, which would cause degradation and eventual deadlock.
If this is indeed the culprit, the solution would be some asynchronous communication, like Boost's Asio library.

Facing an issue with recv() and send() Winsock API. recv() hangs while receiving the last packet

I am facing an issue with the recv() and send() Winsock API. recv() hangs while receiving the last packet.
Problem description:
System A's app is writing data over a non-blocking socket and system B's app is receiving data over a blocking socket in chunks of 64k.
It seems that while reading what is probably the last chunk, which may be less than or equal to 64k, the receive freezes. I am not sure whether the receive or the send of the last packet is the issue, but I am observing this intermittently in our legacy applications.
Has anyone faced a similar issue before? If yes, can you please provide your input?
If not, can you please suggest some troubleshooting techniques to narrow down the root cause?
Just for information I have win2k3 servers.
Thanks,
Varun
Wireshark is a great tool for troubleshooting networking code. It'll show you exactly what packets are entering and leaving your network interface in near real time.
As to your specific issue: are you saying that the last chunk of data might be shorter than 64k? If so, your protocol should include some message length information so the receiver knows how much data to look for.
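One common way to carry that length information is a fixed-size length prefix in front of every message. This is a minimal sketch and not the poster's protocol. The 4-byte big-endian header and the names encodeFrame and tryDecodeFrame are assumptions made for illustration. The receiver appends whatever recv() returns to a buffer and only hands a message on once all of it has arrived.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Prepend a 4-byte big-endian length to the payload (assumed wire format).
std::vector<std::uint8_t> encodeFrame(const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::vector<std::uint8_t> out = {
        static_cast<std::uint8_t>(n >> 24), static_cast<std::uint8_t>(n >> 16),
        static_cast<std::uint8_t>(n >> 8),  static_cast<std::uint8_t>(n)};
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Try to take one complete message off the front of `buffer`, which holds
// the bytes accumulated from recv() so far. Returns true and fills `payload`
// only when header and body are both complete; otherwise `buffer` is left
// untouched and the caller should receive more data.
bool tryDecodeFrame(std::vector<std::uint8_t>& buffer, std::string& payload) {
    if (buffer.size() < 4)
        return false;                       // header not complete yet
    const std::uint32_t n = (std::uint32_t(buffer[0]) << 24) |
                            (std::uint32_t(buffer[1]) << 16) |
                            (std::uint32_t(buffer[2]) << 8) |
                            std::uint32_t(buffer[3]);
    if (buffer.size() - 4 < n)
        return false;                       // body not complete yet
    payload.assign(buffer.begin() + 4, buffer.begin() + 4 + n);
    buffer.erase(buffer.begin(), buffer.begin() + 4 + n);
    return true;
}
```

With a header like this, a final chunk shorter than 64k is no longer special: the receiver asks for exactly the number of bytes announced and never waits for more.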
A couple of guesses...
If you are using UDP, perhaps one or more packets are being dropped en route (which UDP is permitted to do whenever it feels like). In that case, your receiver might end up waiting for data that is simply never going to arrive; to fix this you would need to either implement some way of automatically resending the lost data, or (if you don't strictly need all the data), some way for the sender to notify the receiver that he is done transmitting, so the receiver might as well stop waiting. (of course you would need to handle the case where this notification gets dropped, as well... it can get complicated if you want 100% robustness)
If you are using TCP, perhaps you are not carefully checking the values returned by send() on the sending side? If you are assuming that send() will always send the number of bytes you asked it to, you might end up thinking send() sent all the bytes when in fact it only sent some (or none) of them... so the sender would think the transmission was complete, while the receiver would be stuck waiting for data that isn't going to arrive.
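The usual remedy for the TCP case is a loop around send(). This is a minimal sketch, assuming a POSIX API and a blocking socket. The helper name sendAll is made up for illustration. On a non-blocking socket, as on system A in the question, an EAGAIN/EWOULDBLOCK result would additionally have to be handled by waiting for writability (for example with select()) before retrying, which this sketch simply treats as a failure.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keep calling send() until all `len` bytes have been
// handed to the kernel. Returns false on the first unrecoverable error.
bool sendAll(int fd, const char* data, std::size_t len) {
    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, data + sent, len - sent, 0);
        if (n > 0) {
            sent += static_cast<std::size_t>(n);   // may be a partial send
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;                               // interrupted, retry
        return false;                               // real error (or EAGAIN on
                                                    // a non-blocking socket)
    }
    return true;
}
```

A true return only means the kernel accepted every byte, not that the peer has read them.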
You might have a problem with the server sending data down the wire faster than the receiver is able to read it. You could try increasing the receive buffer:
int nSocketBuffer = 131072; // requested receive buffer size: 128 KiB
if (setsockopt(m_sSocket, SOL_SOCKET, SO_RCVBUF,
               (LPCSTR)&nSocketBuffer, sizeof(nSocketBuffer)) == SOCKET_ERROR)
{
    // setsockopt failed; WSAGetLastError() gives the reason
    return false;
}