UDP packets not sent on time - C++

I am working on a C++ application that can be described as a router. This application receives UDP packets on a given port (nearly 37 bytes each second) and must multicast them to other destinations within a 10 ms window. However, the retransmission after packet reception sometimes exceeds the 10 ms limit and can reach 100 ms. These out-of-limit delays occur at random.
The application receives, on the same Ethernet interface but on a different port, another kind of packet (up to 200 packets of nearly 100 bytes each second). I am not sure whether this second flow is disrupting the first one, because the delay peaks are too scarce (2 packets out of 10,000).
What could be causing these sporadic delays, and how can I solve them?
P.S. My application is running on Linux 2.6.18-238.el5PAE. Delays are measured between the reception of a packet and the successful completion of its retransmission!
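For reference, this measurement can be sketched with a monotonic clock as below; receive_packet() and multicast_packet() are hypothetical stand-ins for the application's own I/O calls, not real functions from my code.

    #include <ctime>
    #include <cstdio>

    void forward_one_packet() {
        // receive_packet();        // hypothetical blocking UDP receive
        timespec t0{};
        clock_gettime(CLOCK_MONOTONIC, &t0);   // packet is now in hand

        // multicast_packet();      // hypothetical retransmission to the destinations
        timespec t1{};
        clock_gettime(CLOCK_MONOTONIC, &t1);   // retransmission has completed

        double ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
        if (ms > 10.0)
            std::fprintf(stderr, "deadline missed: %.3f ms\n", ms);
    }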

10 ms is a tough deadline for a non-real-time OS.
Assign your process to one of the real-time scheduling policies, e.g. SCHED_RR or SCHED_FIFO. This can be done in code via sched_setscheduler() or from the command line via chrt. Adjust the priority as well while you're at it.
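A minimal sketch of the sched_setscheduler() route (the priority value of 50 is illustrative; the call needs root or CAP_SYS_NICE):

    #include <sched.h>
    #include <cstdio>

    int main() {
        sched_param sp{};
        sp.sched_priority = 50;   // SCHED_FIFO priorities run 1..99; 50 is arbitrary here
        // pid 0 = calling process
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
            std::perror("sched_setscheduler");
            return 1;
        }
        // ... the receive/retransmit loop then runs under the real-time policy ...
        return 0;
    }

From the command line, chrt -f 50 <your binary> is roughly equivalent.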
Make sure your code doesn't consume more CPU than it has to, or it will hurt overall system performance.
You may also need the RT_PREEMPT patch.
Overall, generating precisely scheduled Ethernet traffic on Linux is not an easy task. See, for example, BRUTE, a high-performance traffic generator; you may find something useful in its code or in the accompanying research paper.

Related

High-speed Ethernet messages without congestion

I have two programs that transmit UDP messages between them. The first program is a simulator and the other is a controller.
Both the simulator and the controller are written in C++. I want the simulator to transmit the simulation state to the controller, which in turn sends a control signal back to the simulator. All messages will be under 1 kB.
I am currently using UDP over the Ethernet connection. The speed is good for the first 2-3 seconds, and then it drops significantly, to about 1/10th of the original speed. I suspect this is due to network congestion.
The problem:
I thought that UDP was faster than TCP, but do you think TCP will end up being faster because of the congestion? Is there anything I can do to increase the speed?
EDIT: How I measure the speed
I have just run some simulations of varying length:
Simulating 5 seconds: 1.10s
Simulating 50 seconds: 26s
Simulating 100 seconds: 61s
You can see that the speed drops when the simulation is run for longer periods.
EDIT2: blocking/non-blocking
I am using non-blocking sending and blocking receiving. The simulation is written in Simulink, and receive and send are implemented as two C++ blocks.
My guess is that the sending happens first, and then the receive block is run. There are no fixed intervals; the sending happens when the calculations are finished. It's all in one thread.
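For reference, a minimal sketch of that non-blocking-send / blocking-receive pattern (the address, port, and buffer sizes below are purely illustrative, not taken from my code):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);

        sockaddr_in controller{};
        controller.sin_family = AF_INET;
        controller.sin_port = htons(5000);                        // illustrative port
        inet_pton(AF_INET, "192.168.0.2", &controller.sin_addr);  // illustrative address

        char state[512] = "simulation state";                     // payload < 1 kB, as in the question

        // Non-blocking send: returns immediately; may fail with EWOULDBLOCK
        // if the socket's send buffer is full.
        sendto(sock, state, sizeof(state), MSG_DONTWAIT,
               reinterpret_cast<sockaddr*>(&controller), sizeof(controller));

        // Blocking receive: waits until the control signal comes back.
        char control[512];
        ssize_t n = recv(sock, control, sizeof(control), 0);
        if (n > 0)
            std::printf("received %zd-byte control message\n", n);

        close(sock);
        return 0;
    }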

Latency measurement over UDP on Linux

I want to measure UDP latency and drop rate between two machines on Linux. Preferably (but not crucially) I would like to perform the measurement between multiple machines at the same time.
As a result I want to get a histogram, e.g. the RTT of each individual packet at every moment during the measurement. The expected rate is about 10 packets per second.
Do you know of any tool that I can use for this purpose?
What I have tried so far:
ping - uses ICMP instead of UDP
iperf - measures only jitter, not latency
D-ITG - measures per-flow statistics, no histograms
tshark - uses TCP for pings instead of UDP
I have also created a simple C++ socket program with a client and a server on each side, where I send UDP packets carrying a counter and a timestamp. My program seems to work OK, but since I am not a network programmer I am not 100% sure that I handled the buffers correctly (specifically in the case of partial packets, etc.), so I would prefer to use some proven software for this task.
Can you recommend something?
Thanks
It depends. If all you want is a trace with timestamps, Wireshark is your friend: https://www.wireshark.org/
I would like to remind you that UDP is a message-based protocol and datagrams have definite boundaries, so there is no such thing as receiving a partial packet. You will either get the complete message or nothing at all; you need not worry about partial packets with UDP.
The method of calculating packet drops with a counter and latency with a time delta is fine for UDP. The important point to take into consideration, however, is that the system clocks of the client and the server must be synchronized.
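If you do end up sticking with your own tool, a minimal sketch of the counter-plus-timestamp sender could look like the following (the port, address, and packet layout are assumptions for illustration, not any standard):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>
    #include <cstdint>
    #include <ctime>

    // Illustrative probe layout: a sequence number plus the send timestamp.
    // A real tool would serialize these fields explicitly (byte order, packing).
    struct Probe {
        uint64_t seq;
        timespec sent;          // CLOCK_REALTIME at send time
    };

    int main() {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);

        sockaddr_in server{};
        server.sin_family = AF_INET;
        server.sin_port = htons(9000);                     // assumed port
        inet_pton(AF_INET, "10.0.0.2", &server.sin_addr);  // assumed server address

        for (uint64_t seq = 0; seq < 1000; ++seq) {
            Probe p{};
            p.seq = seq;
            clock_gettime(CLOCK_REALTIME, &p.sent);
            sendto(sock, &p, sizeof(p), 0,
                   reinterpret_cast<sockaddr*>(&server), sizeof(server));
            usleep(100000);    // ~10 packets per second, as in the question
        }
        close(sock);
        return 0;
    }

On the receiving side, the latency of each probe is the receiver's clock minus the embedded timestamp (hence the synchronization requirement above), and drops show up as gaps in the sequence numbers.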

Intermittent delays in C++ TCP communication on Linux

I have a device which sends data every 20 milliseconds over TCP. I have an application which connects to this device and starts the socket communication. My application listens on a separate thread, reads the data as soon as it is ready, sets it aside, and another thread processes it. The device is directly connected to the computer via an Ethernet cable.
I see a strange problem and am trying to understand the reason for it. Roughly once every minute, it takes approximately 50 milliseconds to receive a packet from the device. I do a blocking read that will wait for up to a second and returns as soon as data is ready; normally it takes approximately 20 ms, as I would expect, but as I said, there are times it takes 50 ms, even though this is very rare (1 in 3000). What I noticed is that the packets after a late packet arrive immediately, which makes me think there is some delay at the network layer. I also examined the timestamps of the packets (assigned by the device), and they consistently increase by 20 ms.
Is it normal to see delays like that when the device is directly connected to the computer? Since it is TCP there may be a lot of work going on under the hood (checksums, out-of-order packets, retransmissions, etc.). I would still rather find a way to prevent this delay than accept the fact that it might happen.
Any insights will be greatly appreciated.
It is probably the result of Nagle's algorithm, which is turned on by default for TCP sockets.
Use setsockopt() to set the TCP_NODELAY option on the socket that sends the data in order to turn it off.
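A minimal sketch of that call (sock is assumed to be the connected TCP socket descriptor on the sending side):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <cstdio>

    // Disable Nagle's algorithm so small segments are sent immediately
    // instead of being coalesced while waiting for outstanding ACKs.
    void disable_nagle(int sock) {
        int flag = 1;
        if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) != 0)
            std::perror("setsockopt(TCP_NODELAY)");
    }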

How to figure out why UDP is only accepting packets at a relatively slow rate?

I'm using Interix on Windows XP to make my C++ Linux application easier to port to Windows XP. My application sends and receives packets over a socket to and from a nearby machine running Linux. When sending, I'm only getting throughput of around 180 KB/sec, and when receiving I'm getting around 525 KB/sec. The same code running on Linux gets closer to 2,500 KB/sec.
When I attempt to send at a higher rate than 180 KB/sec, packets get dropped to bring the rate back down to about that level.
I feel like I should be able to get better sending throughput than 180 KB/sec, but I am not sure how to determine the cause of the dropped packets.
How might I go about investigating this slowness in the hopes of improving throughput?
--Some More History--
To reach the above numbers, I have already improved the throughput a bit by doing the following (these made no difference on Linux, but helped throughput on Interix):
I changed SO_RCVBUF and SO_SNDBUF from 256 KB to 25 MB; this improved throughput by about 20%.
I ran an optimized build instead of a debug build; this improved throughput by about 15%.
I turned off all logging messages going to stdout and a log file; this doubled throughput.
So it would seem that CPU is a limiting factor on Interix but not on Linux. Further, I am running in a virtual machine hosted in a hypervisor; the Windows XP guest is given 2 cores and 2 GB of memory.
I notice that the profiler shows the CPU on the two cores never exceeding 50% utilization on average. This holds even when I have two instances of my application running; utilization still hovers around 50% on both cores. Perhaps my application, which is multi-threaded with a dedicated thread for reading from the UDP socket and a dedicated thread for writing to it (only one is active at any given time), is not being scheduled well on Interix, and thus my packets are being dropped?
In answering your question, I am making the following assumptions based on your description of the problem:
(1) You are using the exact same program on Linux when achieving the throughput of 2,500 KB/sec, other than the socket library, which is of course going to be different between Windows and Linux. If this assumption is correct, we probably shouldn't have to worry about other pieces of your code affecting the throughput.
(2) When using Linux to achieve 2,500 KB/sec throughput, the node is in the exact same location in the network. If this assumption is correct, we don't have to worry about network issues affecting your throughput.
Given these two assumptions, I would say that you likely have a problem with your socket settings on the Windows side. I would suggest checking the size of the send buffer first; it is 8192 bytes by default. If you increase it, you should see an increase in throughput. Use setsockopt() to change it. Here is the usage manual: http://msdn.microsoft.com/en-us/library/windows/desktop/ms740476(v=vs.85).aspx
EDIT: It looks like I misread your post going through it too quickly the first time. I just noticed you're using Interix, which means you're probably not using a different socket library. Nevertheless, I suggest checking the send buffer size first.
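For what it's worth, a minimal sketch of requesting a larger send buffer and verifying what the system actually granted (the 25 MB figure simply mirrors what you already tried; the OS may clamp it):

    #include <sys/socket.h>
    #include <cstdio>

    // Request a larger send buffer and report what the OS actually granted
    // (many systems silently clamp the requested value).
    void grow_send_buffer(int sock) {
        int requested = 25 * 1024 * 1024;   // 25 MB, mirroring the question
        if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested)) != 0)
            std::perror("setsockopt(SO_SNDBUF)");

        int granted = 0;
        socklen_t len = sizeof(granted);
        if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &granted, &len) == 0)
            std::printf("send buffer is now %d bytes\n", granted);
    }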

Measure data transfer rate (bandwidth) between 2 apps across a network using C++: how to get an unbiased and accurate result?

I am trying to measure the IO data transfer rate (bandwidth) between two simulation applications (written in C++). I created a very simple perfclient and perfserver program just to verify that my approach to calculating the network bandwidth is correct before implementing it in the real applications. So in this case I need to do it programmatically (NOT using iperf).
I tried running my perfclient and perfserver programs in various settings (localhost, computers connected via Ethernet, and computers connected via wireless). However, I always get roughly the same bandwidth in each of these settings, around 1900 Mbps (tested using a data size of 1472 bytes). Is this a reasonable result, or can I get a better and more accurate figure?
Should I use 1472 bytes (the maximum UDP payload for the standard Ethernet MTU, headers excluded) as the maximum data size for each send() and recv(), and why or why not? I also tried different data sizes, and here are the average bandwidths I get (tested over the Ethernet connection), which did not make sense to me because the numbers exceed 1 Gbps and reach something like 28 Gbps.
SIZE BANDWIDTH
1KB 1396 Mbps
2KB 2689 Mbps
4KB 5044 Mbps
8KB 9146 Mbps
16KB 16815 Mbps
32KB 22486 Mbps
64KB 28560 Mbps
Here is my current approach:
I use a basic ping-pong loop, in which the client continuously sends a stream of bytes to the server program. The server reads the data and reflects (sends) it back to the client, and the client then reads the reflected data (two-way transmission). This operation is repeated 1000 times, and I divide the total time by 1000 to get the average round-trip time. Next, I divide that average by 2 to get the one-way transmission time. The bandwidth can then be calculated as follows:
bandwidth = total bytes sent / average 1-way transmission time
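For reference, a minimal sketch of this ping-pong timing loop (sock is assumed to be an already-connected UDP socket; error handling is omitted):

    #include <sys/socket.h>
    #include <chrono>
    #include <cstddef>
    #include <vector>

    // Ping-pong loop: send a payload, wait for the echo, repeat, then average.
    double measure_bandwidth_mbps(int sock, std::size_t msg_size, int iterations = 1000) {
        std::vector<char> buf(msg_size, 'x');

        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < iterations; ++i) {
            send(sock, buf.data(), buf.size(), 0);   // client -> server
            recv(sock, buf.data(), buf.size(), 0);   // server reflects it back
        }
        auto stop = std::chrono::steady_clock::now();

        double avg_rtt = std::chrono::duration<double>(stop - start).count() / iterations;
        double one_way = avg_rtt / 2.0;              // assumes a symmetric path
        return (msg_size * 8.0) / one_way / 1e6;     // bits per second -> Mbps
    }

A steady (monotonic) clock avoids bias from system clock adjustments during the run.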
Is there anything wrong with my approach? How can I make sure that my result is not biased? Once I get this right, I will need to apply this approach to my original application (not this simple test application), and I want to include the performance results in a scientific paper.
EDIT:
I have solved this problem. Check out the answer that I posted below.
Unless you have a need to reinvent the wheel, iperf was made to handle exactly this problem.
Iperf was developed by NLANR/DAST as a modern alternative for measuring maximum TCP and UDP bandwidth performance. Iperf allows the tuning of various parameters and UDP characteristics, and it reports bandwidth, delay jitter, and datagram loss.
I was finally able to figure this out and solve it :-)
As I mentioned in the question, regardless of the network architecture I used (localhost, 1 Gbps Ethernet card, wireless connection, etc.), my measured bandwidth scaled up to 28 Gbps. I had tried binding the server to several different IP addresses, as follows:
127.0.0.1
IP address given by my LAN connection
IP address given by my wireless connection
I thought this would give me the correct result; in fact, it didn't.
This was mainly because I was running both the client and the server program on the same computer (in different terminal windows), even though the client and server were bound to different IP addresses. My guess is that the traffic was still routed over the internal loopback. This is the main reason why the result was so biased and inaccurate.
Anyway, I then ran the client on one workstation and the server on another, tested them over the different network connections, and everything worked as expected :-)
On the 1 Gbps connection I got about 0.96 Gbps, and on the 10 Gbps connection about 9.86 Gbps. So this works exactly as I expected, and my approach is correct. Perfect!