How to increase the TCP receive window for a specific socket?
I know how to do so for all sockets by setting the registry key TcpWindowSize, but how do I do that for a specific one?
According to Microsoft's documentation, the way is:
Calling the Windows Sockets function setsockopt, which sets the receive window on a per-socket basis.
But the setsockopt documentation says this about SO_RCVBUF:
Specifies the total per-socket buffer space reserved for receives. This is unrelated to SO_MAX_MSG_SIZE and does not necessarily correspond to the size of the TCP receive window.
So is it possible? How?
Thanks.
SO_MAX_MSG_SIZE is for UDP. Here's what MSDN says:
SO_MAX_MSG_SIZE - Returns the maximum outbound message size for message-oriented sockets supported by the protocol. Has no meaning for stream-oriented sockets.
It's also not settable.
For TCP just use SO_(SND|RCV)BUF.
I am fairly sure that SO_RCVBUF is what you want. The first link says that SO_RCVBUF has the highest priority for determining the TCP window size, over and above anything set on the system. The way I read it, all the second passage is saying is that the SO_RCVBUF size does not have to match the system-wide receive window size. In other words, it can be a different size that you set.
You need to be careful tuning this and testing the results. Windows Vista and above have a smart adaptive auto-tuning feature which specifically tunes the window size to work well both on LANs and on long fat networks such as 3G, as well as on high-loss networks. Setting the window size yourself overrides this, so Windows can no longer tune the window size automatically. This may hurt your performance should you ever need to run over a particularly high-latency network such as a cellular network.
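For illustration, a minimal Winsock sketch of setting the option on one socket (the 256 KB figure and the names are examples only, not a recommendation; WSAStartup is assumed to have been called already):

#include <winsock2.h>

SOCKET make_socket_with_big_window()
{
    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    int rcvbuf = 256 * 1024;              // example value; tune and test
    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF,
                   (const char*)&rcvbuf, sizeof(rcvbuf)) == SOCKET_ERROR) {
        // inspect WSAGetLastError() on failure
    }
    return s;                             // connect() afterwards as usual; set the
                                          // option before the handshake so it can
                                          // influence the advertised window
}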
Related
IMPORTANT NOTE: I'm aware that UDP is an unreliable protocol. But, as I'm not the manufacturer of the device that delivers the data, I can only try to minimize the impact. Hence, please don't post any more statements about UDP being unreliable. I need suggestions to reduce the loss to a minimum instead.
I've implemented a C++ application which needs to receive a large number of UDP packets in a short time and needs to work under Windows (Winsock). The program works, but seems to drop packets once the data rate (or packet rate) per UDP stream reaches a certain level... Note that I cannot change the camera interface to use TCP.
Details: It's a client for Gigabit Ethernet cameras, which send their images to the computer using UDP packets. The data rate per camera is often close to the capacity of the network interface (~120 megabytes per second), which means that even with 8 KB jumbo frames the packet rate is 10'000 to 15'000 per second per camera. Currently we have connected 4 cameras to one computer... and this means up to 60'000 packets per second.
The software handles all cameras at the same time; the stream receiver for each camera is implemented as a separate thread and has its own receiving UDP socket.
At a certain frame rate the software seems to miss a few UDP frames every few minutes (even though only ~60-70% of the network capacity is used).
Hardware Details
Cameras are from foreign manufacturers! They send UDP streams to a configurable UDP endpoint via Ethernet. No TCP support...
Cameras are connected via their own dedicated network interface (1GBit/s)
Direct connection, no switch used (!)
Cables are CAT6e or CAT7
Implementation Details
So far I set the SO_RCVBUF to a large value:
int32_t rbufsize = 4100 * 3100 * 2; // two 12 MP images
if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, (char*)&rbufsize, sizeof(rbufsize)) == -1) {
perror("SO_RCVBUF");
throw runtime_error("Could not set socket option SO_RCVBUF.");
}
The exception is not thrown, hence I assume the value was accepted.
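One way to confirm that assumption (a sketch, reusing the socket s from the code above) is to read the value back with getsockopt, since the stack may clamp the size rather than fail:

int actual = 0;
int optlen = sizeof(actual);
if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, (char*)&actual, &optlen) == 0) {
    // 'actual' is the buffer size the stack really granted
    printf("SO_RCVBUF is now %d bytes\n", actual);
}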
I also set the priority of the main process to HIGH_PRIORITY_CLASS using the following code:
SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
However, I didn't find any possibility to change the thread priorities. The threads are created after the process priority is set...
The receiver threads use blocking IO to receive one packet at a time (with a 1000 ms timeout to allow the thread to react to a global shutdown signal). If a packet is received, it's stored in a buffer and the loop immediately continues to receive any further packets.
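A minimal sketch of that loop (the use of SO_RCVTIMEO for the 1000 ms timeout and all names are my assumptions; the original code is not shown):

#include <winsock2.h>
#include <atomic>
#include <stdexcept>

void receive_loop(SOCKET s, const std::atomic<bool>& shutdown_requested)
{
    // 1000 ms receive timeout so the loop can notice a shutdown request.
    DWORD timeout_ms = 1000;
    if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO,
                   (const char*)&timeout_ms, sizeof(timeout_ms)) == SOCKET_ERROR) {
        throw std::runtime_error("Could not set SO_RCVTIMEO.");
    }

    char packet[9000];                    // large enough for an 8 KB jumbo-frame payload
    while (!shutdown_requested) {
        int n = recv(s, packet, sizeof(packet), 0);
        if (n > 0) {
            // ...store packet[0..n) in the image buffer and loop immediately...
        } else if (n == SOCKET_ERROR && WSAGetLastError() == WSAETIMEDOUT) {
            continue;                     // timeout: re-check the shutdown flag
        } else {
            break;                        // closed or unexpected error
        }
    }
}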
Questions
Is there any other way I can reduce the probability of packet loss? Is there any possibility to receive all packets that are stored in the socket's buffer with one call? (I don't need any information about the sender side; just the contained payload.)
Maybe you can also suggest some registry/network-card settings to check...
To increase the UDP Rx performance for GigE cameras on Windows you may want to look into writing a custom filter driver (NDIS). This allows you to intercept the messages in the kernel, stop them from reaching user space, pack them into a buffer, and then send them to user space via a custom ioctl to your application. I have done this; it took about a week of work to get done. There is a sample available from Microsoft which I used as a base.
It is also possible to use an existing generic driver such as pcap, which I also tried; that took about half a week. This is not as good, because pcap cannot determine when the frames end, so packet grouping will be suboptimal.
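For reference, the pcap-based variant looks roughly like this (a sketch only; the device name is a placeholder and the header parsing is omitted):

#include <pcap.h>

// Called once per captured frame; hdr->caplen bytes of the raw Ethernet frame
// are in 'bytes'. Strip the Ethernet/IP/UDP headers and copy the payload out.
static void on_packet(u_char* /*user*/, const struct pcap_pkthdr* hdr, const u_char* bytes)
{
    (void)hdr; (void)bytes;
}

int capture(const char* device)           // e.g. an Npcap/WinPcap device name
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t* h = pcap_open_live(device, 65536, 1 /*promiscuous*/, 1 /*read timeout, ms*/, errbuf);
    if (!h) return -1;
    return pcap_loop(h, -1, on_packet, nullptr);  // run until pcap_breakloop()
}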
I would suggest first digging deep in the network stack settings and making sure that the PC is not starved for resources. Look at guides for tuning e.g. Intel network cards for this type of load, that could potentially have a larger impact than a custom driver.
(I know this is an older thread and you have probably solved your problem. But things like this are good to document for future adventurers..)
Use IOCP and WSARecv in overlapped mode; you can have around ~60k WSARecv calls outstanding.
On the thread that handles GetQueuedCompletionStatus, process the data and also post another WSARecv from that thread to compensate for the one consumed when the data was received (see the sketch below).
Please note that your UDP packet size should stay below the MTU; going above it will cause drops, depending on the network hardware between the camera and the software.
Write some UDP testers that mimic the camera to test the network, just to be sure that the hardware will support the load.
https://www.winsocketdotnetworkprogramming.com/winsock2programming/winsock2advancediomethod5e.html
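A condensed sketch of that structure (error handling, socket setup, and buffer ownership are omitted; the struct and function names are illustrative, not from a real implementation):

#include <winsock2.h>
#include <windows.h>

// One outstanding receive. The OVERLAPPED must be the first member so the
// pointer returned by GetQueuedCompletionStatus can be cast back to RecvCtx.
struct RecvCtx {
    WSAOVERLAPPED ov;
    WSABUF        wsabuf;
    char          data[9000];            // room for a jumbo-frame payload
};

void post_recv(SOCKET s, RecvCtx* ctx)
{
    ZeroMemory(&ctx->ov, sizeof(ctx->ov));
    ctx->wsabuf.buf = ctx->data;
    ctx->wsabuf.len = sizeof(ctx->data);
    DWORD flags = 0;
    // Completion (or WSA_IO_PENDING) is reported through the completion port.
    WSARecv(s, &ctx->wsabuf, 1, nullptr, &flags, &ctx->ov, nullptr);
}

void run(SOCKET s, int outstanding)
{
    // Create a completion port and associate the socket with it.
    HANDLE iocp = CreateIoCompletionPort((HANDLE)s, nullptr, 0, 0);

    for (int i = 0; i < outstanding; ++i)
        post_recv(s, new RecvCtx);       // keep many receives queued

    for (;;) {
        DWORD bytes = 0;
        ULONG_PTR key = 0;
        OVERLAPPED* ov = nullptr;
        if (GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE) && ov) {
            RecvCtx* ctx = reinterpret_cast<RecvCtx*>(ov);
            // ...process ctx->data[0..bytes) here...
            post_recv(s, ctx);           // immediately re-arm this receive
        }
    }
}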
I have a network application reading from two sockets from Port A and Port B. The sender of data to Port A is very quick (flooding data), while the one on Port B is very slow.
If the application is very slow in consuming the data, a 'TCP Zero Window' will show up and whoever sends the data to Port A will be blocked.
Do you know if a 'TCP Zero Window' is something that affects ALL remaining ports and ALL remaining sockets open at that very moment?
Do you know if also the sender of data to Port B might be blocked as well when the TCP buffer is filled?
I am using C/C++ in Linux.
TCP flow control is applied on a per-connection basis. The sliding window size on port A has no effect on port B's window size at all.
When the window size reaches zero the sender uses a periodic timer to keep probing the window size to check when your end is ready again. Allowing the window size to hit zero is bad for throughput but I'm sure you're aware of this already.
I have some doubts about increasing the TCP window size in an application. In my C++ software application, we send data packets of around 1 KB from client to server using a blocking TCP/IP socket. Recently I came across the concept of TCP window size, so I tried increasing the value to 64 KB using setsockopt() for both SO_SNDBUF and SO_RCVBUF. After increasing this value, I see some performance improvement on the WAN connection but not on the LAN connection.
As per my understanding of TCP window size:
The client sends data packets to the server. Upon filling the TCP window, it waits until an ACK is received from the server for the first packet in the window. On the WAN connection, the ACK from the server is delayed because of an RTT of around 100 ms. So in this case, increasing the TCP window size compensates for the ACK wait time and thereby improves performance.
I want to understand how the performance improves in my application.
In my application, even though the TCP window size (both send and receive buffers) is increased using setsockopt at the socket level, we still keep the same packet size of 1 KB (i.e. the bytes we send from client to server in a single socket send). We also disabled the Nagle algorithm (the built-in option that consolidates small packets into a larger packet before sending); a sketch of these socket options follows.
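A Winsock-flavored sketch of those option calls, assuming a connected blocking SOCKET named sock (the names and the Windows-style casts are illustrative; the original code is not shown):

#include <winsock2.h>

void configure(SOCKET sock)
{
    int buf = 64 * 1024;                  // 64 KB, as described above
    setsockopt(sock, SOL_SOCKET, SO_SNDBUF, (const char*)&buf, sizeof(buf));
    setsockopt(sock, SOL_SOCKET, SO_RCVBUF, (const char*)&buf, sizeof(buf));

    int nodelay = 1;                      // disable the Nagle algorithm
    setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, (const char*)&nodelay, sizeof(nodelay));
}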
My doubts are as follows:
Since I am using a blocking socket, for each data packet send of 1k, it should block if an ACK doesn't come from the server. Then how does the performance improve after increasing the TCP window size on the WAN connection alone? If I have misunderstood the concept of TCP window size, please correct me.
For sending 64K of data, I believe I still need to call the socket send function 64 times (since I am sending 1k per send through a blocking socket) even though I increased my TCP window size to 64K. Please confirm this.
What is the maximum limit of the TCP window size with window scaling enabled per RFC 1323?
I am not so good in my English. If you couldn't understand any of the above, please let me know.
First of all, there is a big misconception evident from your question: that the TCP window size is what is controlled by SO_SNDBUF and SO_RCVBUF. This is not true.
What is the TCP window size?
In a nutshell, the TCP window size determines how much follow-up data (packets) your network stack is willing to put on the wire before receiving acknowledgement for the earliest packet that has not been acknowledged yet.
The TCP stack has to live with and account for the fact that once a packet has been determined to be lost or mangled during transmission, every packet sent, from that one onwards, has to be re-sent since packets may only be acknowledged in order by the receiver. Therefore, allowing too many unacknowledged packets to exist at the same time consumes the connection's bandwidth speculatively: there is no guarantee that the bandwidth used will actually produce anything useful.
On the other hand, not allowing multiple unacknowledged packets at the same time would simply kill the bandwidth of connections that have a high bandwidth-delay product. Therefore, the TCP stack has to strike a balance between using up bandwidth for no benefit and not driving the pipe aggressively enough (and thus allowing some of its capacity to go unused).
The TCP window size determines where this balance is struck.
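As a rough illustration using the figures from the question above (100 ms RTT, 64 KB window): a connection can have at most one window of unacknowledged data in flight per round trip, so throughput is capped at about 65536 bytes / 0.1 s ≈ 640 KB/s (~5 Mbit/s) no matter how fast the link is; a larger window raises that ceiling proportionally.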
What do SO_SNDBUF and SO_RCVBUF do?
They control the amount of buffer space that the network stack has reserved for servicing your socket. These buffers serve to accumulate outgoing data that the stack has not yet been able to put on the wire and data that has been received from the wire but not yet read by your application respectively.
If one of these buffers is full you won't be able to send or receive more data until some space is freed. Note that these buffers only affect how the network stack handles data on the "near" side of the network interface (before they have been sent or after they have arrived), while the TCP window affects how the stack manages data on the "far" side of the interface (i.e. on the wire).
Answers to your questions
No. If that were the case then you would incur a roundtrip delay for each packet sent, which would totally destroy the bandwidth of connections with high latency.
Yes, but that has nothing to do with either the TCP window size or with the size of the buffers allocated to that socket.
According to all sources I have been able to find (example), scaling allows the window to reach a maximum size of 1GB.
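As a quick check of that figure: RFC 1323 limits the scale factor to 14, so the largest advertisable window is 65535 × 2^14 = 1,073,725,440 bytes, i.e. just under 1 GiB.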
Since I am using a blocking socket, for each data packet send of 1k, it should block if an ACK doesn't come from the server.
Wrong. Sending in TCP is asynchronous. send() just transfers the data to the socket send buffer and returns. It only blocks while the socket send buffer is full.
Then how does the performance improve after increasing the TCP window size on the WAN connection alone?
Because you were wrong about it blocking until it got an ACK.
For sending 64K of data, I believe I still need to call the socket send function 64 times
Why? You could just call it once with the 64k data buffer.
(since I am sending 1k per send through a blocking socket)
Why? Or is this a repetition of your misconception under (1)?
even though I increased my TCP window size to 64K. Please confirm this.
No. You can send it all at once. No loop required.
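To illustrate, a single logical send of the whole 64 KB might look like the sketch below (the small loop only guards against a partial send being reported; there is no per-1 KB call). This is a Winsock-flavored sketch with illustrative names, not code from the question:

bool send_all(SOCKET s, const char* data, int len)
{
    int sent = 0;
    while (sent < len) {                  // normally completes in one pass
        int n = send(s, data + sent, len - sent, 0);
        if (n == SOCKET_ERROR) return false;  // check WSAGetLastError()
        sent += n;
    }
    return true;
}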
What is the maximum limit of the TCP window size with window scaling enabled per RFC 1323?
Much bigger than you will ever need.
I'm a beginner at network programming. I read some resources that I could find on the internet, where I came across TCP window scaling. As I understand it, the scaling factor is negotiated when the connection is first established, in the SYN packet. So does this mean that TCP window scaling cannot be set by the code that we would write for socket programming? Is it the operating system which does this? Say, in a Windows environment, how does this happen and is there a way for us to manually/dynamically change it?
Window scaling is enabled automatically if you set a socket receive buffer size of more than 64k, via setsockopt().
As the window scaling negotiation happens during the connection handshake, you have to do that before connecting the socket. In the case of sockets accepted by a server via a listening socket, this is obviously impossible, so you have to do the apparently odd operation of setting the socket receive buffer size on the listening socket instead, from where it is inherited by all sockets accepted from it.
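A sketch of the server-side variant just described (Winsock; the buffer size and names are examples only, and WSAStartup plus error handling are assumed/omitted):

#include <winsock2.h>

SOCKET make_listener(const sockaddr_in& addr)
{
    SOCKET listener = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    // Must happen before the handshake; accepted sockets inherit the value.
    int rcvbuf = 256 * 1024;              // > 64 KB, so a scale factor is required
    setsockopt(listener, SOL_SOCKET, SO_RCVBUF,
               (const char*)&rcvbuf, sizeof(rcvbuf));

    bind(listener, (const sockaddr*)&addr, sizeof(addr));
    listen(listener, SOMAXCONN);
    return listener;                      // accept() from here as usual
}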
No, I believe this can only be set at a global level. There is a registry setting for this under the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters key.
It's called GlobalMaxTcpWindowSize. See here: http://technet.microsoft.com/en-us/library/cc957546.aspx
If what you actually mean is to change the size of the socket receive and transmit buffers then these can be changed using Winsock. See SO_RCVBUF and SO_SNDBUF.
The window size of TCP packets is managed by the operating system. So far you cannot change it dynamically. For a static way to change the window size for the whole system, see Nick's answer.
There is just one very hard way: with WinPcap you can write out every TCP packet you want yourself. But that is a real pain.
Using 2 PCs with Windows XP, 64 kB TCP window size, connected with a crossover cable
Using Qt 4.5.3, QTcpServer and QTcpSocket
Sending 2000 messages of 40kB takes 2 seconds (40MB/s)
Sending 1 message of 80MB takes 80 seconds (1MB/s)
Does anyone have an explanation for this? I would expect the larger message to go faster, since the lower layers can then fill the TCP packets more efficiently.
This is hard to comment on without seeing your code.
How are you timing this on the sending side? When do you know you're done?
How does the client read the data, does it read into fixed sized buffers and throw the data away or does it somehow know (from the framing) that the "message" is 80MB and try and build up the "message" into a single data buffer to pass up to the application layer?
It's unlikely to be the underlying Windows sockets code that's making this work poorly.
TCP, from the application side, is stream-based, which means there are no packets, just a sequence of bytes. The kernel may collect multiple writes to the connection before sending them out, and the receiving side may make any amount of the received data available to each "read" call.
TCP, on the IP side, is packets. Since standard Ethernet has an MTU (maximum transfer unit) of 1500 bytes and both TCP and IP have 20-byte headers, each packet transferred over Ethernet will pass 1460 bytes (or less) of the TCP stream to the other side. 40KB or 80MB writes from the application will make no difference here.
How long it appears to take data to transfer will depend on how and where you measure it. Writing 40KB will likely return immediately since that amount of data will simply get dropped in TCP's "send window" inside the kernel. An 80MB write will block waiting for it all to get transferred (well, all but the last 64KB which will fit, pending, in the window).
TCP transfer speed is also affected by the receiver. It has a "receive window" that contains everything received from the peer but not fetched by the application. The amount of space available in this window is passed to the sender with every return ACK so if it's not being emptied quickly enough by the receiving application, the sender will eventually pause. WireShark may provide some insight here.
In the end, both methods should transfer in the same amount of time since an application can easily fill the outgoing window faster than TCP can transfer it no matter how that data is chunked.
I can't speak for the operation of Qt, however.
Bug in Qt 4.5.3