I am developing a simple file transfer protocol based on UDP.
To make sure packets arrive intact, I am checksumming them; on receipt, corrupt packets are dropped. I began by testing my protocol at home on my home network. I have seen the connection sustain several MB/s of upload bandwidth to the internet, so I expected it to perform nicely with two computers connected to the same wifi router.
What happens is that once I reach about 10000 packets per second (the packets are only a few bytes each!), a large fraction of them (about 40% to 60%) arrive corrupt (the checksum fails). What could be the cause of this problem? Any help would be really appreciated!
UDP is a connectionless protocol - meaning you can send UDP packets at any time; if someone is listening, they'll get the packet. If they aren't, they won't. Packets are NOT guaranteed to arrive.
You cannot send UDP packets the same way you would with TCP. You have to handle each packet on its own. With a TCP socket, for example, you can write as much data as you want and TCP will get it to the other side unless you overflow the socket itself. It's reliable.
UDP is not. If you send a UDP packet and it gets lost, it's lost forever and there is no way to recover it - you'll have to do the recovery yourself in your own protocol layered above UDP. There is no resend; it's not a reliable connection.
Although UDP has a checksum of its own, it's optional (over IPv4) and not always used.
UDP is great for streaming data, such as music, voice, etc. There are protocols layered above UDP, such as RTP for voice, that allow the voice coders themselves to recover from lost data.
I bet that if you put a counter in the UDP packet, you'll notice that some of them do not arrive once you exceed a certain bandwidth, and you will definitely run into this if the connection goes through a switch/network. If you make a direct connection between two computers, it may work at a very high bandwidth, though.
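To illustrate (just a sketch, not your protocol - the 4-byte counter and the fixed buffer size are assumptions): the sender prepends a sequence number to every datagram, and the receiver can then count the gaps.

    // Sketch: prepend a 32-bit sequence counter to each datagram (sender side).
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    void send_numbered(int sock, const sockaddr_in& dst,
                       const char* payload, size_t len, uint32_t& seq)
    {
        char buf[1500];
        if (len > sizeof(buf) - sizeof(uint32_t))
            return;                                       // too big for this sketch
        uint32_t net_seq = htonl(seq++);
        std::memcpy(buf, &net_seq, sizeof(net_seq));      // counter first
        std::memcpy(buf + sizeof(net_seq), payload, len); // payload after it
        sendto(sock, buf, sizeof(net_seq) + len, 0,
               reinterpret_cast<const sockaddr*>(&dst), sizeof(dst));
    }

    // On the receiver, compare each counter with the previous one: any jump
    // larger than 1 means datagrams were dropped (or reordered) on the way.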
I have built a UDP server with C++ and I have a couple of questions about it.
Goal:
I have incoming TCP traffic and I need to forward it as UDP traffic. My own UDP server then processes this UDP data.
The size of the TCP packets can vary.
Details:
In my example I have a TCP packet that consists of a total of 2000 bytes (4 random bytes, 1995 'a' (0x61) bytes and the last byte being 'b' (0x62)).
My UDP server has a buffer (the recvfrom buffer) with a size larger than 2000 bytes.
My MTU size is 1500 everywhere.
My server is receiving this packet correctly. In my UDP server I can see the received packet has a length of 2000 and if I check the last byte buffer[1999], it prints 'b' (0x62), which is correct. But if I open tcpdump -i eth0 I see only one UDP packet: 09:06:01.143207 IP 192.168.1.1.5472 > 192.168.1.2.9000: UDP, bad length 2004 > 1472.
With the tcpdump -i eth0 -X command, I see the data of the packet, but only ~1472 bytes, which does not include the 'b' (0x62) byte.
The ethtool -k eth0 command prints udp-fragmentation-offload: off.
So my questions are:
Why do I only see one packet and not two (fragmented part 1 and 2)?
Why don't I see the 'b' (0x62) byte in the tcpdump?
In my C++ server, what buffer size is best to use? I have it at 65535 now because the incoming TCP packets can be any size.
What will happen if the size exceeds 65535 bytes? Will I have to implement my own fragmentation scheme before sending the TCP data as UDP?
OK, circumstances are more complicated than they appear from the question; extracted from your comments, the following information is available:
Some client sends data to a server – both not modifiable – via TCP.
In between the two resides a firewall which only allows uni-directional communication to the server, via UDP.
You now intend to implement a proxy consisting of two servers sitting in between, tunneling the TCP data via UDP.
The server not being able to reply backwards does not pose a problem either.
My personal approach would then be as follows:
Let the proxy servers be entirely data-unaware! Let the outbound receiver accept (recv or recvfrom, depending on whether a single client or multiple clients are involved) chunks of data that still fit into UDP packets and simply forward them as they are.
Apply some means to ensure that lost data is at least detected, or better, can be reconstructed. As confirmation or re-request messages are impossible due to the firewall limitation, the only way to increase reliability is redundancy.
Configure the final target server to listen on loopback only.
Let the inbound proxy connect to the target server via TCP and as long as no (non-recoverable) errors occur just forward any incoming data as is.
To be able to detect lost messages I'd at the very least prepend a packet counter to every UDP message sent. If two subsequent messages do not carry consecutive counter values, then a message has been lost in between.
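A minimal receive-side sketch of that counter idea (the 4-byte header layout is an assumption, adjust it to whatever you actually prepend):

    // Sketch: detect gaps in datagrams that carry a leading 32-bit counter.
    #include <arpa/inet.h>
    #include <sys/socket.h>
    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    void inbound_loop(int sock)
    {
        uint32_t expected = 0;
        char buf[2048];
        for (;;) {
            ssize_t n = recv(sock, buf, sizeof(buf), 0);
            if (n < (ssize_t)sizeof(uint32_t))
                continue;                                  // ignore runt datagrams
            uint32_t seq;
            std::memcpy(&seq, buf, sizeof(seq));
            seq = ntohl(seq);
            if (seq != expected)
                std::fprintf(stderr, "gap: expected %u, got %u\n", expected, seq);
            expected = seq + 1;
            // forward buf + 4, n - 4 to the TCP side of the proxy here
        }
    }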
As no communication backwards is possible, the only way to increase reliability is unconditional redundancy, trading away some of your transfer rate - e.g. by sending every message more than once and ignoring surplus duplicates on the receiving side.
A more elaborate approach might distribute redundant data over several packets such that a missing one can be reconstructed from the remaining ones - maybe similar to what RAID level 5 does. Admittedly, you need to be pretty committed to try that...
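For the committed: the simplest RAID-5-like scheme is one XOR parity datagram per group of equal-sized payloads - if exactly one packet of the group is missing, it can be rebuilt from the rest. A sketch under the assumption of fixed-size payloads and a separate group/sequence header:

    // Sketch: XOR parity over a group of equal-sized payloads.
    // Assumes fixed payload size; real code must also carry group/sequence ids.
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    std::vector<uint8_t> make_parity(const std::vector<std::vector<uint8_t>>& group)
    {
        std::vector<uint8_t> parity(group.front().size(), 0);
        for (const auto& pkt : group)
            for (std::size_t i = 0; i < parity.size(); ++i)
                parity[i] ^= pkt[i];
        return parity;   // sent as one extra datagram per group
    }

    // Reconstruction: XOR the parity datagram with every packet of the group
    // that did arrive; the result is exactly the one missing payload.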
The final question is what the routing looks like. With UDP there's no guarantee that packets are received in the same order they are sent. If there's really only one fixed route available from the outbound proxy to the inbound one via the firewall, then packets shouldn't overtake one another - you might still want to at least log the inbound UDP packets to a file so you can monitor them, and if errors do occur, apply appropriate means (buffering packets and re-ordering them if need be).
The size of the TCP packets can vary.
While there is no code shown, the sentence above and your description suggest that you are working with wrong assumptions about how TCP works.
Contrary to UDP, TCP is not a message-based protocol but a byte stream. In particular, it does not guarantee that a single send at the sender will be matched by a single recv at the recipient. Thus, even if the send is done with 2000 bytes, it might still be that the first recv only gets 1400 bytes while another recv gets the rest - no matter whether everything would fit into the socket buffer at once.
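The usual consequence is that the receiver has to loop on recv() until it has the complete message; a sketch (the fixed message size is only for illustration, real code would use some framing such as a length prefix):

    // Sketch: read exactly `want` bytes from a TCP socket, looping over recv().
    #include <sys/socket.h>
    #include <cstddef>

    bool recv_exact(int sock, char* buf, size_t want)
    {
        size_t got = 0;
        while (got < want) {
            ssize_t n = recv(sock, buf + got, want - got, 0);
            if (n <= 0)
                return false;                 // error or peer closed the connection
            got += static_cast<size_t>(n);
        }
        return true;                          // buf now holds all `want` bytes
    }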
I wrote a server in C which receives UDP data from a client on port X. I used an epoll (non-blocking) socket for the UDP listener and have only one worker thread. The pseudo-code is as follows:
    on_data_receive(socket) {
        process();              // takes 2-4 milliseconds
        send_response(socket);
    }
But when I send 5000 concurrent requests (using threads), the server misses 5-10% of them: on_data_receive() is never called for those 5-10% of requests. I am testing on a local network, so you can assume there is no packet loss on the wire. My question is: why isn't on_data_receive() called for some requests? What is the connection limit for a socket? As the number of concurrent requests increases, the loss ratio increases too.
Note: I used a random sleep of up to 200 milliseconds before sending each request to the server.
There is no 'connection' for UDP. All packets are just sent between peers, and the OS does some magic buffering to avoid packet loss to some degree.
But when too many packets arrive, or if the receiving application is too slow reading the packets, some packets get dropped without notice. This is not an error.
For example Linux has a UDP receive buffer which is about 128k by default (I think). You can probably change that, but it is unlikely to solve the systematic problem that UDP may expose packet loss.
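If you suspect the receive buffer is overflowing, you can ask for a bigger one (the kernel may cap it; on Linux the cap is net.core.rmem_max). A sketch, the 4 MB value is arbitrary:

    // Sketch: request a larger UDP receive buffer and print what we actually got.
    #include <sys/socket.h>
    #include <cstdio>

    void grow_rcvbuf(int sock)
    {
        int want = 4 * 1024 * 1024;                    // 4 MB, arbitrary example
        setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want));

        int got = 0;
        socklen_t len = sizeof(got);
        getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &got, &len);
        std::printf("SO_RCVBUF is now %d bytes\n", got);
    }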
With UDP there is no congestion control like there is for TCP. The raw characteristics of the underlying transport (Ethernet, local network) are exposed. Your 5000 senders probably get more CPU time in total than your receiver, so they can send more packets than the receiver can receive. With UDP, senders do not get blocked (e.g. in sendto()) when the receiver cannot keep up receiving the packets. With UDP the sender always needs to control and limit the data rate explicitly. There is no back pressure from the network side (*).
(*) Theoretically there is no back pressure in UDP. But on some operating systems (e.g. Linux) you can observe that there is back pressure (at least to some degree) when sending for example over a local Ethernet. The OS blocks in sendto() when the network driver of the physical network interface reports that it is busy (or that its buffer is full). But this back pressure stops working when the local network adapter cannot determine the network 'being busy' for the whole network path. See also "Ethernet flow control (Pause Frames)". Through this the sending side can block the sending application even when the receive-buffer on the receiving side is full. This explains why often there seems to be a UDP back-pressure like a TCP back pressure, although there is nothing in the UDP protocol to support back pressure.
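As a rough illustration of limiting the data rate on the sender side yourself (a sketch, not thread-safe, with a made-up packets-per-second budget):

    // Sketch: pace UDP sends to at most `pps` datagrams per second.
    #include <sys/socket.h>
    #include <chrono>
    #include <cstddef>
    #include <thread>

    void send_paced(int sock, const sockaddr* dst, socklen_t dstlen,
                    const char* data, size_t len, int pps)
    {
        using clock = std::chrono::steady_clock;
        static clock::time_point next = clock::now();    // shared send budget
        std::this_thread::sleep_until(next);              // wait for our slot
        sendto(sock, data, len, 0, dst, dstlen);
        next += std::chrono::microseconds(1000000 / pps);
    }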
I am making a real-time application. I can't say much about it, but it's an online real-time application that needs as little latency as possible. I am using sockets directly, no library. I also need full bandwidth. Should I use TCP or UDP? I don't mind programming a bit more to get UDP working. Thanks in advance.
Depends on the nature of client connections.
TCP is a stateful, session-based protocol. If you have a lot of clients connected at the same time, you may suffer port exhaustion. If clients connect and disconnect frequently, establishing and tearing down TCP sessions adds latency, CPU load and bandwidth. If your client connections are more or less permanent and not too many clients are connected at the same time, TCP is only slightly worse than UDP.
UDP is much better suited for low-latency communications. Beware of NAT firewalls, however - not all are capable of, or set up for, UDP address mapping.
Also be aware that TCP is a stream and, as such, does not provide message packetization. Your application has to assemble messages from the TCP stream, with additional overhead.
A UDP datagram is by definition a complete message, i.e. it arrives as the packet that was sent. Beware that delivery is not guaranteed, so the application may need to provide an acknowledgement and resending layer.
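To illustrate the packetization point for TCP: messages are usually framed by the application itself, e.g. with a length prefix (a sketch; the 4-byte big-endian length is just one common convention, not something TCP prescribes):

    // Sketch: length-prefixed message framing on top of a TCP stream.
    #include <arpa/inet.h>
    #include <sys/socket.h>
    #include <cstdint>
    #include <vector>

    void send_message(int sock, const std::vector<uint8_t>& msg)
    {
        uint32_t len = htonl(static_cast<uint32_t>(msg.size()));
        send(sock, &len, sizeof(len), 0);          // 4-byte length first
        send(sock, msg.data(), msg.size(), 0);     // then the payload
    }

    // The receiver first reads the 4 length bytes, then keeps calling recv()
    // until that many payload bytes have arrived - one message reassembled.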
TCP/IP implements a stream; that is, it wants to deliver to the recipient everything sent through it on the other end. To this end, it adds a lot of protocol machinery to handle situations where a portion of the stream is missing: it retries sends, keeps track of how many bytes still need to be sent (with a window), etc. The TCP/IP connection is used to add the guarantees of the stream; if TCP/IP fails to deliver packets and cannot recover, the connection will drop.
UDP/IP implements a datagram - that is, it wants to send a particular packet (with a small, limited size) from A to B. If a datagram is lost, there's no built-in way to resend it; lost UDP packets are simply dropped.
The UDP lack of guarantees is actually a benefit. Say you're modeling something like "health" - a monster attacks you twice on a server. Your health drops from 90 to 78, then to 60. Using TCP/IP, you must receive the 78 first - so if that packet was dropped, it has to be resent, and in order, because this is a stream. But you really don't care: your health now is 60, and that's what you want to get across. If the 78 was dropped, by the time your health reaches 60, who cares - 78 is old history and not important any more. You need health to say 60.
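In code, "only the latest state matters" usually boils down to stamping every state datagram with a tick/sequence number and discarding anything older than what was already applied (a sketch, names invented for illustration):

    // Sketch: apply only the newest health update; silently drop stale ones.
    #include <cstdint>

    struct HealthPacket {
        uint32_t tick;     // increases with every state update the server sends
        int32_t  health;   // absolute value, not a delta
    };

    void on_health_packet(const HealthPacket& pkt,
                          uint32_t& last_tick, int32_t& health)
    {
        if (pkt.tick <= last_tick)
            return;                // older than what we already have, ignore it
        last_tick = pkt.tick;
        health = pkt.health;       // jumps straight from 90 to 60 if 78 was lost
    }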
TCP is also a benefit - you don't want to use UDP for in-game chat. You want everything said to you to arrive, and in order.
TCP also adds congestion control; with UDP you'd have to implement it, or somehow take care that you throttle UDP such that it doesn't saturate the unknown network characteristics between the server and the player.
So yes, you want to use "both"; but "importance" isn't quite the criteria you need to aim for. Use TCP/IP for delivering streams, easy congestion control, etc. Use UDP for real time states, and other situations where the stream abstraction interferes with the purpose rather than aligning with it.
Both UDP and TCP have benefits in terms of latency. If all of the below is true:
You have a single source of data in your client
Send small messages but are worried about their latency
Your application can deal with losing messages from time to time
Your application can deal with receiving messages out of order
UDP may be a better option. UDP is also good for sending data to multiple recipients.
On the other hand, if any of the following is true:
Any of the above does not hold
Your application sends as much data as possible, as fast as possible
You have multiple connections to maintain
then you should definitely use TCP.
This post will tell you why UDP is not necessarily the fastest.
If you are planning to transfer a large quantity of data over TCP, you should consider the effect of the bandwidth-delay product. Although you seem to be more worried about latency than throughput, it may be of interest to you.
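For example (illustrative numbers only): on a 100 Mbit/s path with a 50 ms round-trip time, the bandwidth-delay product is roughly 100,000,000 / 8 × 0.05 ≈ 625 KB, so TCP needs about that much data in flight (and a window at least that large) to keep the link full.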
I've seen it asked elsewhere but no one answers it to my satisfaction: how can I receive and send raw packets?
By "raw packets", I mean where I have to generate all the headers and data, so that the bytes are completely arbitrary, and I am not restricted in any way. This is why Microsofts RAW sockets won't work, because you can't send TCP or UDP packets with incorrect source addresses.
I know you can send packets like I want to with WinPCAP but you cannot receive raw information with it, which I also need to do.
First of all decide what protocol layer you want to test malformed data on:
Ethernet
If you want to generate and receive invalid Ethernet frames with a wrong Ethernet checksum, you are more or less out of luck, as the checksumming is often done in hardware; in the cases where it's not, the driver for the NIC performs the checksumming and there's no way around that, at least on Windows. NetBSD does provide that option for most of its drivers that do Ethernet checksumming in the OS driver, though.
The alternative is to buy specialized hardware (e.g. cards from Napatech; you might find cheaper ones), which provides an API for sending and receiving Ethernet frames however invalid you want them to be.
Be aware that when sending invalid Ethernet frames, the receiving end or a router in between will just throw the frames away; they will never reach the application nor the OS IP layer. You'll be testing the NIC or NIC driver on the receiving end.
IP
If all you want is to send/receive invalid IP packets, WinPcap lets you do this. Generate the packets, set up WinPcap to capture packets, and use WinPcap to send them.
Be aware that for packets with an invalid IP checksum or other invalid fields, the TCP/IP stack the receiving application runs on will just throw the IP packets away, as will any IP/layer-3 router in between the sender and the receiver. They will not reach the application. If you're generating valid IP packets, you'll also need to generate valid UDP, and implement a TCP session with valid TCP packets yourself, for the application to process them; otherwise they'll also be thrown away by the TCP/IP stack.
You'll be testing the lower part of the TCP/IP stack on the receiving end.
TCP/UDP
This is not that different from sending/receiving invalid IP packets. You can do all this with WinPcap, and routers will not throw the packets away as long as the Ethernet/IP headers are ok. An application will not receive these packets, though; they'll be thrown away by the TCP/IP stack.
You'll be testing the upper part of the TCP/IP stack on the receiving end.
Application Layer
This is the (sane) way of actually testing the application (unless your "application" actually is a TCP/IP stack, or something lower). You send/receive data as any application would, using sockets, but generate malformed application data as you want. The application will receive this data; it is not thrown away by lower protocol layers.
One particular kind of test can be hard to do with TCP, though - namely varying the TCP segments sent, e.g. if you want to test that an application correctly interprets the TCP data as a stream (say, you want to send the string "hello" in 5 segments and somehow cause the receiving application to read() the characters one by one). If you don't need speed, you can usually get that behaviour by inserting pauses in the sending and turning off Nagle's algorithm (TCP_NODELAY) and/or tuning the NIC MTU.
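A small sketch of that trick, sending "hello" one byte per send() with Nagle's algorithm disabled (the 200 ms pause is arbitrary, and the receiver still isn't guaranteed to see exactly five segments):

    // Sketch: try to get "hello" delivered to the peer in several TCP segments.
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>
    #include <unistd.h>

    void send_hello_slowly(int sock)
    {
        int one = 1;
        setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)); // no Nagle
        const char* msg = "hello";
        for (int i = 0; i < 5; ++i) {
            send(sock, msg + i, 1, 0);   // one byte per send()
            usleep(200000);              // pause so the bytes are not coalesced
        }
    }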
Remember that any muckery with lower-level protocols in a TCP stream, e.g. causing one of the packets to have an invalid/different IP source address, just gets thrown away by the lower-level layers.
You'll be testing an application running on top of TCP/UDP (or any other IP protocol).
Alternatives
Switch to another OS where you can at least use raw sockets without the restrictions of recent Windows versions.
Implement a transparent drop-insert solution based on the "Ethernet" or "IP" alternatives above. That is, you keep your normal client application and your normal server application, break the cable between them, and insert a box with 2 NICs where you programmatically alter bytes of the frames received and send them back out again on the other NIC. This also allows you to easily introduce packet delays into the system. Linux' netfilter already has this capability, which you can easily build on top of, often with just configuration or scripting.
If you can alter the receiving application you want to test, have it read data from something else, such as a file or a pipe, and feed it random bytes/packets as you wish.
A hybrid model, mainly for TCP application testing, but also useful for e.g. testing UDP ICMP responses: set up a TCP connection using sockets and send your invalid application data using sockets, then introduce random malformed packets (much easier than programming with raw sockets to set up a TCP session and then introduce lower-layer errors). Send the malformed IP or UDP/TCP packets, or perhaps ICMP packets, using WinPcap, but have the socket code communicate with the WinPcap code so you get the addresses/ports correct, such that the receiving application sees them.
Check out NS/2
I have to develop software to send the same packets to multiple destinations.
But I must not use a multicast scheme!!!! (because my boss is a stupid man)
So, anyway, the problem is this:
I have the same packets and multiple IP addresses (clients), and I cannot use multicast.
How can I do that in the best way?
I must use C++ as the language and Linux as the platform.
So please help me.
Thanks
If your boss said you can't use multicast, maybe he/she has a reason. I guess broadcasting is out of the game too?
If these are the requisites, your only chance is to establish a TCP connection with every remote host you want to send packets to.
EDIT
UDP, conversely, would not provide much benefit over multicasting if your application runs over a LAN whose configuration you are in charge of; that's the reason I specified TCP.
Maybe you have to describe your scenario a little better.
This could be done with either TCP or UDP depending on your reliability requirements. Can you tolerate lost or reordered packets? Are you prepared to handle timeouts and retransmission? If both answers are "yes", pick UDP. Otherwise stay with TCP. Then:
TCP case. Instead of a single multicast UDP socket you would have a number of TCP sockets, one per destination. You will have to figure out the best scheme for connection establishment; regular listening and accepting of connecting clients works as usual. Then you just iterate over the connected sockets and send your data to each one.
UDP case. This can be done with a single UDP socket on the server side. If you know the IPs and ports of the clients (the data receivers), use sendto(2) with the same data for each address/port. The clients would have to be recv(2)-ing at that time. If you don't know your clients up front, you'd need to devise a scheme for clients to request the data, or just register with the server. That's where recvfrom(2) is useful - it gives you the address of the client.
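The UDP variant then really is just a loop over the known client addresses on one socket (a sketch; how you collect those addresses depends on your registration scheme):

    // Sketch: send the same payload to every known client from one UDP socket.
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <cstddef>
    #include <vector>

    void send_to_all(int sock, const std::vector<sockaddr_in>& clients,
                     const char* data, size_t len)
    {
        for (const auto& addr : clients)
            sendto(sock, data, len, 0,
                   reinterpret_cast<const sockaddr*>(&addr), sizeof(addr));
    }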
You have restricted yourself by saying no to multicast. I guess sending packets to multiple clients is just a part of your requirement, and unless you shed more light on it, it will be difficult to provide a complete solution.
Are you expecting two-way communication between the client and the server? In that case, choosing multicast may prove complex. Please clarify.
You have to iterate through the clients and send packets one after another. You may want to persist the sessions if you are expecting responses back from the clients.
The choice of UDP or TCP again depends on the nature of the data being sent. With UDP you would need to handle out-of-sequence packets and also implement re-transmission.
You'll have to create a TCP listener on your server, running on a particular port and listening for incoming TCP client connections (sockets).
Every time a client connects, you'll have to cache it in some kind of data structure, like a name-value pair (the name being a unique name for the client and the value being the network stream of that client, obtained from the TCP socket).
Then, when you are finally ready to transmit the data, you can either iterate through this collection of name-value pair connections and send the data as a byte array to each client one by one, or spawn off one thread per connected client and have it send the data concurrently.
TCP is a bulky protocol (due to its connection-oriented nature) and transmission of large data (like videos/images) can be quite slow.
UDP is definitely the choice for streaming large data, but you'll have to trade off the delivery guarantee.