UDP transfer is too fast, Apache MINA doesn't handle it - C++

We decided to use UDP to send a lot of data like coordinates between:
client [C++] (using poll)
server [JAVA] [Apache MINA]
My datagrams are at most 512 bytes, to avoid fragmentation during the transfer as much as possible.
Each datagram has a header I added (with an ID inside; see the sketch after this list), so that I can monitor:
how many datagrams are received
which ones are received
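For illustration, such a header might look something like this in C++ (the field names and layout are my assumptions, not the poster's actual format):

#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header layout: a sequence ID so the receiver can count which
// datagrams arrived. The field names are assumptions, not the poster's format.
#pragma pack(push, 1)
struct DatagramHeader {
    uint32_t sequenceId;   // increases by one per datagram sent
    uint16_t payloadSize;  // number of payload bytes following the header
};
#pragma pack(pop)

// Build one datagram (header + payload), keeping the total under 512 bytes.
// Note: no htonl/htons here; a real header should pick an explicit byte order.
std::vector<char> buildDatagram(uint32_t seq, const char* data, uint16_t size) {
    DatagramHeader header{seq, size};
    std::vector<char> buf(sizeof(header) + size);
    std::memcpy(buf.data(), &header, sizeof(header));
    std::memcpy(buf.data() + sizeof(header), data, size);
    return buf;
}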
The problem is that we are sending the datagrams too fast. We receive the first ones, then have a big loss, then get some more, then another big loss. The sequence of datagram IDs received looks like [1], [2], [250], [251].....
The problem also happens locally (using localhost, 1 network card only).
I do not care about losing some datagrams, but here it is not simple loss due to the network (which I can deal with).
So my questions here are:
On the client, how can I get the best:
settings, or socket settings?
way to send as much as I can without sending too much?
On the server, Apache MINA seems to say that it manages the socket buffer size itself, but are there still settings to care about?
Is it possible to reach something like 1MB/s, knowing that our connection already allows at least this bandwidth when downloading regular files?
Currently, when we want to transfer ~4KB of coordinate information, we have to add sleep time, so we end up waiting 5 minutes or more for it to finish. That is a big issue for us, since we need to send at least 10MB of coordinate information every minute.

If you want reliable transport, you should use TCP. This will let you send almost as fast as the slower of the network and the client, with no losses.
If you want a highly optimized low-latency transport which does not need to be reliable, you need UDP. This will let you send exactly as fast as the network can handle, but you can also send faster than the network can carry, or faster than the client can read, and then you'll lose packets.
If you want reliable highly optimized low-latency transport with fine-grained control, you're going to end up implementing a custom subset of TCP on top of UDP. It doesn't sound like you could or should do this.
... how can I get the best settings, or socket settings
Typically by experimentation.
If the reason you're losing packets is because the client is slow, you need to make the client faster. Larger receive buffers only buy a fixed amount of headroom (say to soak up bursts), but if you're systematically slower any sanely-sized buffer will fill up eventually.
Note however that this only cures excessive or avoidable drops. The various network stack layers (even without leaving a single box) are allowed to drop packets even if your client can keep up, so you still can't treat it as reliable without custom retransmit logic (and we're back to implementing TCP).
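As a concrete example of the "larger receive buffers" point, this is how a bigger buffer can be requested at the socket level with setsockopt (a POSIX C++ sketch; the 1 MB figure is just an example and the OS may clamp it):

#include <sys/types.h>
#include <sys/socket.h>
#include <cstdio>

// Sketch: ask the OS for a larger UDP receive buffer to absorb bursts.
// The requested size is arbitrary; the kernel may clamp it (on Linux,
// the ceiling is net.core.rmem_max).
bool enlargeReceiveBuffer(int sock) {
    int bytes = 1 << 20;  // 1 MB requested
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0) {
        perror("setsockopt(SO_RCVBUF)");
        return false;
    }
    return true;
}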
... way to send as much as I can without sending too much?
You need some kind of ack/nack/back-pressure/throttling/congestion/whatever message from the receiver back to the source. This is exactly the kind of thing TCP gives you for free, and which is relatively tricky to implement well yourself.
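For illustration only, here is a sketch of what simple sender-side back-pressure could look like if the receiver periodically reports the highest sequence number it has seen (the message format, names, and window size are all assumptions, not anything the answer prescribes):

#include <cstdint>

// Sketch: window-based throttling. The sender stops transmitting when too
// many datagrams are unacknowledged and resumes once the receiver's feedback
// catches up. Sequence wrap-around is ignored for brevity.
class SendWindow {
public:
    explicit SendWindow(uint32_t maxInFlight) : maxInFlight_(maxInFlight) {}

    bool canSend() const { return nextSeq_ - highestAcked_ < maxInFlight_; }

    uint32_t onSend() { return nextSeq_++; }    // returns the seq to stamp on the datagram

    void onFeedback(uint32_t highestSeqSeen) {  // taken from the receiver's feedback datagram
        if (highestSeqSeen > highestAcked_) highestAcked_ = highestSeqSeen;
    }

private:
    uint32_t maxInFlight_;
    uint32_t nextSeq_ = 0;
    uint32_t highestAcked_ = 0;
};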
Is it possible to reach something like 1MB/s ...
I just saw 8MB/s using scp over loopback, so I would say yes. That uses TCP and apparently chose AES128 to encrypt and decrypt the file on the fly - it should be trivial to get equivalent performance if you're just sending plaintext.

UDP is only a viable choice when any number of datagrams can be lost without sacrificing QoS. I am not familiar with Apache MINA, but the scenario described resembles a server that handles every datagram sequentially. In that case, datagrams that arrive while one is being serviced may be lost once the socket buffer fills up - there is no further queuing of UDP datagrams. Like I said, I do not know whether MINA can be tuned for parallel datagram processing, but if it can't, it is simply the wrong choice of tool.

Related

Efficiently send a stream of UDP packets

I know how to open a UDP socket in C++, and I also know how to send packets through it. When I send a packet, I correctly receive it on the other end, and everything works fine.
EDIT: I also built a fully working acknowledgement system: packets are numbered, checksummed and acknowledged, so at any time I know how many of the packets I sent during, say, the last second were actually received by the other endpoint. Now, the data I am sending will be readable only when ALL the packets are received, so I really don't care about packet ordering: I just need them all to arrive. They could arrive in any order and it would still be fine, since having them sequentially ordered would be useless anyway.
Now, I have to transfer a very big chunk of data (say 1 GB) and I need it to be transferred as fast as possible. So I split the data into, say, 512-byte chunks and send them through the UDP socket.
Now, since UDP is connectionless, it obviously doesn't provide any speed or transfer efficiency diagnostics. So if I just try to push a ton of packets through my socket, the socket will just accept them, they will be sent all at once, and my router will forward the first couple and then start dropping the rest. So this is NOT the most efficient way to get this done.
What I did then was making a cycle:
Sleep for a while
Send a bunch of packets
Sleep again and so on
I tried to do some calibration and achieved pretty good transfer rates: I have a thread that continuously sends packets in small bunches, but I have nothing more than an experimental idea of what the interval and the bunch size should be. In principle, I can imagine that sleeping for a really small amount of time and then sending just one packet at a time would be the best solution for the router, but it is completely unfeasible in terms of CPU performance (I would probably need to busy-wait, since the time between two consecutive packets would be really small).
So is there any other solution? Any widely accepted solution? I assume that my router has a buffer or something like that, so that it can accept SOME packets all at once, and then it needs some time to process them. How big is that buffer?
I am not an expert in this so any explanation would be great.
Please note, however, that for technical reasons there is no way at all I can use TCP.
As mentioned in some other comments, what you're describing is a flow control system. The Wikipedia article has a good overview of various ways of doing this:
http://en.wikipedia.org/wiki/Flow_control_%28data%29
The solution that you have in place (sleeping for a hard-coded period between packet groups) will work in principle, but to get reasonable performance in a real-world system you need to be able to react to changes in the network. This means implementing some kind of feedback where you automatically adjust both the outgoing data rate and the packet size in response to network characteristics, such as throughput and packet loss.
One simple way of doing this is to use the number of re-transmitted packets as an input into your flow control system. The basic idea would be that when you have a lot of re-transmitted packets, you would reduce the packet size, reduce the data rate, or both. If you have very few re-transmitted packets, you would increase packet size & data rate until you see an increase in re-transmitted packets.
That's something of a gross oversimplification, but I think you get the idea.
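To make that concrete, here is a rough sketch of additive-increase/multiplicative-decrease style rate adjustment driven by the retransmission count; the thresholds and step sizes are placeholders, not tuned values:

// Sketch: adjust the outgoing rate from the retransmission count observed
// over the last interval. Constants are placeholders, not tuned values.
struct RateController {
    double packetsPerSecond = 1000.0;

    void update(int retransmitsLastInterval) {
        if (retransmitsLastInterval > 10) {
            packetsPerSecond *= 0.5;               // heavy loss: back off multiplicatively
        } else if (retransmitsLastInterval == 0) {
            packetsPerSecond += 50.0;              // no loss: probe for more bandwidth
        }
        if (packetsPerSecond < 10.0) packetsPerSecond = 10.0;  // keep a sensible floor
    }
};

The same feedback could also drive the packet size, as the answer suggests; the sketch only shows the rate side.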

Reliable UDP Algorithm?

I'm working on reliable UDP networking and I need to understand something. I think a reliable UDP algorithm works like this (I don't know, I'm guessing):
Server send: (header:6)abcdef
Client receive: (header:6)abdf, sends back "I got 4 data, they are abdf"
Server send: (header:2)ce
Client receive: (header:2)ce, OK I'm going to combine them!
Now, is this the right way to do reliable UDP?
EDIT (after the answer; maybe this can be helpful for someone): I'm going to use TCP, because reliable UDP is not a good way to handle my operations. I'll be sending positions, which are unimportant, temporary values. If I built an algorithm for reliable UDP, the reliability process would take 3-4 UDP send/receive round trips; in that time I could send 3-4 other unreliable position updates instead, and since I'm sending small data, that can be more efficient than reliable UDP.
The "true way" to get reliable UDP is to use TCP.
If you still want to do it over UDP, you can verify the integrity of the message by sending a checksum with the message, and then recalculating the checksum at the other end to see if it matches the checksum you sent.
If it doesn't match, request the packet again. Note that this is essentially reinventing TCP.
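For illustration, a sketch of that per-packet checksum idea, here using a Fletcher-16 checksum as one arbitrary choice (the framing and function names are mine, not from the answer):

#include <cstdint>
#include <cstddef>

// Sketch: Fletcher-16 checksum appended to each UDP payload so the receiver
// can detect corruption and request the packet again. (UDP already carries a
// 16-bit checksum of its own, so this is belt-and-braces.)
uint16_t fletcher16(const uint8_t* data, size_t len) {
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; ++i) {
        sum1 = (sum1 + data[i]) % 255;
        sum2 = (sum2 + sum1) % 255;
    }
    return static_cast<uint16_t>((sum2 << 8) | sum1);
}

// Receiver side: recompute and compare against the transmitted value.
bool checksumMatches(const uint8_t* payload, size_t len, uint16_t transmitted) {
    return fletcher16(payload, len) == transmitted;
}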
Well, even with:
- Client receive: (header:6)abdf, sends back "I got 4 data, they are abdf"
- Server send: (header:2)ce
what if the server does not receive your response (which may happen with UDP)? So switching to TCP is a much better option, if you're not concerned about connection speed.
Your problem sounds like it's tailor-made for the Data Distribution Service.
I'll be sending positions, which are unimportant, temporary values
In fact, location coordinates are a popular example for many of its vendors. RTI has a demonstration that goes well with your use case.
Yeah, a lot of folks groan when they hear "IDL", but I'd recommend that you give it a fair shake. DDS is unlike many popular pub-sub/distribution/etc. protocols in that it's not a simple encapsulation/pipeline.
I think the really cool thing is that often a lot of logic and design effort goes into the problem of "how do I react when the underlying network or my peer(s) misbehave(s)?" DDS offers quality-of-service negotiation and hooks for your code to react when the QoS terms aren't met.
I would recommend against taking this decision lightly, it's a good deal more complex than TCP, UDP, AMQP, etc. But if you can afford the complexity and can amortize it over a large enough system -- it can pay real dividends.
In the end, DDS does deliver "reliable" messages over UDP. It's designed to support many different transports, and many different dimensions of QoS. It's truly dizzying when you see all the different dimensions of QoS that are considered by this service.

Data Transfer Protocol Design

I am writing a protocol to transfer gigabytes of data over a network using TCP, to try to teach myself a little bit about protocol programming. I am unsure how to design this transfer protocol so that it transfers the data in the fastest and most efficient way.
I am using Qt on windows.
At the moment, my design of my application protocol (the part to transfer the data) is as follows:
First send the login details.
Write the first 4-kilobyte data packet into the socket, then wait for the server to confirm it has received the packet.
When the server confirms receiving the data packet (by writing the int "1"), write the next 4 kilobytes.
When all data has been transferred, send the md5sum of the transferred data to the server.
If the server confirms again with an int 8, the data transfer is complete.
At the moment, I am not able to get speeds higher than 166KB/sec on the same computer when transferring over 127.0.0.1. I have been trying to read other protocol designs, but there is hardly any documentation on data transfer protocols that one can write for their application.
Is the protocol design that I've posted wrong or suffering from some serious issues?
Should the protocol wait for each packet to be confirmed by the server or should I write it continuously?
First, I would recommend spending some time reading about TCP and about the Sliding Window Protocol.
I think there are two reasons why your implementation is so slow: first, you wait for an acknowledgement of each packet - very slow; you should use a sliding window.
Second, you use MD5 checksumming. There is nothing wrong with that, but TCP already implements some basic checksumming, and the MD5 implementation you use may be slow.
And finally, the typical way to find out why something is slow is to use a profiler.
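To illustrate the sliding-window point: with TCP you can simply keep writing chunks and let TCP's own window keep data in flight, verifying integrity once at the end rather than per packet. A rough sketch with plain POSIX sockets (the asker uses Qt, so this is not their code; the chunk size is just an example):

#include <sys/types.h>
#include <sys/socket.h>
#include <cstddef>
#include <cstdio>

// Sketch: write the whole buffer in 4 KB chunks without waiting for an
// application-level acknowledgement after each one. TCP's sliding window keeps
// several chunks in flight; send() only blocks once the socket buffer is full.
bool sendAll(int sock, const char* data, size_t len) {
    const size_t kChunk = 4096;
    size_t sent = 0;
    while (sent < len) {
        size_t toSend = (len - sent < kChunk) ? len - sent : kChunk;
        ssize_t n = send(sock, data + sent, toSend, 0);
        if (n < 0) { perror("send"); return false; }
        sent += static_cast<size_t>(n);
    }
    return true;  // verify integrity once at the end (e.g. one checksum), not per chunk
}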

Optimizing Sockets in Symbian

I have a TCP connection open between Symbian and a server machine, and I would like to transfer large chunks of data (around 32K) between these two endpoints. Unfortunately, the performance figures are pretty poor and I am looking for ideas on how I could improve my implementation. One of the things I tried was to increase the number of bytes that can be buffered by the socket for sending and receiving to 64K.
iSocket.SetOpt(KSoTcpSendWinSize, KSolInetTcp, 0x10000);
iSocket.SetOpt(KSoTcpRecvWinSize, KSolInetTcp, 0x10000);
Are there any other things that could be optimized at a socket level for better throughput?
It is also possible that my socket code does something stupid. It follows a simple request/response protocol. I have to use the blocking WaitForRequest routine to be sure that the data has been sent/received, so that I can then process it.
//store request info in reqbuff and send it to the server; wait for iStatus
iSocket.Send( reqbuff, 0, iStatus, len );
User::WaitForRequest(iStatus);
//store 32K file in resbuff; wait for iStatus to be sure that all data has
//been received
iSocket.Recv(resbuff, 0, iStatus, len);
User::WaitForRequest(iStatus);
//do something with the 32K received
Would be thankful for every comment!
You can send and receive in parallel if you use active objects. There should be example code in the SDK. Obviously, it depends on the application and the protocol used whether that will help.
I'm no TCP expert, but I think there are parameters on the socket that can cause your usage pattern (sending one large buffer, then receiving a large buffer) to use the network less optimally than sending approximately equal amounts of data in both directions would.
Everything about TCP sockets that can be configured on other OSes should be possible to configure on Symbian as well, but first you need to figure out what to configure. I suggest you ask another, TCP-general question and get some pointers. Then you can figure out how to set that up on Symbian.
Are you positive that the
//do something with the 32K received
doesn't take particularly long? Your app appears to be single-threaded, so if this is holding up the line, that's an obvious bottleneck.
Also, what do you mean by poor performance? Have you compared the performance to other TCP apps?
Lastly, if performance is a big issue, you can switch over to raw sockets/datagram sockets and optimize your own validation protocol for your specific data.

What should I know about UDP programming?

I don't mean how to connect to a socket. What should I know about UDP programming?
Do I need to worry about bad data in my socket?
Should I assume that if I send 200 bytes, I may get 120 and 60 bytes separately?
Should I worry about another connection sending me bad data on the same port?
If data doesn't arrive, how long might I typically not see data for (250ms? 1 second? 1.75sec?)
What do I really need to know?
"i should assume if i send 200bytes i
may get 120 and 60bytes separately?"
When you're sending UDP datagrams your read size will equal your write size. This is because UDP is a datagram protocol, vs TCP's stream protocol. However, you can only write data up to the size of the MTU before the packet could be fragmented or dropped by a router. For general internet use, the safe MTU is 576 bytes including headers.
"i should worry about another
connection sending me bad data on the
same port?"
You don't have a connection, you have a port. You will receive any data sent to that port, regardless of where it's from. It's up to you to determine if it's from the right address.
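For illustration, here is a hedged sketch of that check with POSIX sockets: read each datagram with recvfrom and drop anything that did not come from the peer you expect (the "expected" address is whatever you decided to trust; the names are mine):

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

// Sketch: read a datagram and ignore it unless it came from the expected peer.
// 'expected' would be filled in elsewhere with the trusted address and port.
// Returns bytes read, 0 if the datagram was ignored, -1 on error.
ssize_t readFromPeer(int sock, char* buf, size_t bufLen, const sockaddr_in& expected) {
    sockaddr_in from{};
    socklen_t fromLen = sizeof(from);
    ssize_t n = recvfrom(sock, buf, bufLen, 0,
                         reinterpret_cast<sockaddr*>(&from), &fromLen);
    if (n < 0) return -1;
    if (from.sin_addr.s_addr != expected.sin_addr.s_addr ||
        from.sin_port != expected.sin_port) {
        return 0;   // wrong sender: drop the datagram (a sketch-level convention)
    }
    return n;
}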
If data doesn't arrive, how long might I typically not see data for (250ms? 1 second? 1.75sec?)
Data can be lost forever, data can be delayed, and data can arrive out of order. If any of those things bothers you, use TCP. Writing a reliable protocol on top of UDP is a very non-trivial task, and for almost all applications there is no reason to do so.
Should I worry about another connection sending me bad data on the same port?
Yes, you should worry about it. Any application can send data to your open UDP port at any time. One of the big uses of UDP is many-to-one style communication, where you multiplex communication with several peers on a single port, using the address passed back by recvfrom to differentiate between peers.
However, if you want to avoid this and only accept packets from a single peer, you can actually call connect on your UDP socket. This causes the IP stack to reject packets coming from any host:port combination (socket) other than the one you want to talk to.
A second advantage of calling connect on your UDP socket is that on many OSes it gives a significant speed/latency improvement. When you call sendto on an unconnected UDP socket, the OS actually temporarily connects the socket, sends your data, and then disconnects the socket, adding significant overhead.
A third advantage of using a connected UDP socket is that it lets ICMP error messages from the network, such as routing failures or host-unreachable after a crash, be delivered back to your application. If the UDP socket isn't connected, the OS won't know which socket to deliver those ICMP errors to and will silently discard them, potentially leaving your app hanging while waiting for a response from a crashed host (or waiting for your select to time out).
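A minimal sketch of a connected UDP socket with POSIX sockets (the address and port are placeholders, not anything from the question):

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    sockaddr_in peer{};
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9000);                       // placeholder port
    inet_pton(AF_INET, "192.0.2.1", &peer.sin_addr);   // placeholder address

    // connect() on a UDP socket: only datagrams from this peer are delivered,
    // and plain send()/recv() can be used instead of sendto()/recvfrom().
    if (connect(sock, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) < 0) {
        perror("connect");
        return 1;
    }

    const char msg[] = "hello";
    if (send(sock, msg, sizeof(msg), 0) < 0) perror("send");  // ICMP errors can surface here

    char buf[1500];
    ssize_t n = recv(sock, buf, sizeof(buf), 0);
    if (n < 0) perror("recv");

    close(sock);
    return 0;
}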
Your packet may not get there.
Your packet may get there twice or even more often.
Your packets may not be in order.
You have a size limitation on your packets imposed by the underlying network layers. The packet size may be quite small (possibly 576 bytes).
None of this says "don't use UDP". However you should be aware of all the above and think about what recovery options you may want to take.
Fragmentation and reassembly happen at the IP level, so you need not worry about that (Wikipedia). (This means that you won't receive split or truncated packets.)
UDP packets have a checksum for the data and the header, so receiving bogus data is unlikely, but possible. Lost or duplicate packets are also possible. You should check your data in any case anyway.
There's no congestion control, so you may wish to consider that, if you plan on clogging the tubes with a lot of UDP packets.
UDP is a connectionless protocol. Data sent over UDP may reach the receiver, but it can also get lost during transmission. UDP is ideal for things like broadcasting and streaming audio or video (i.e. a dropped packet is not a real problem in those situations). So if you need to ensure your data gets to the other side, stick with TCP.
UDP has less overhead than TCP and is therefore faster. (TCP needs to build a connection first and also checks data packets for data corruption which takes time.)
Fragmented UDP packets (i.e. packets bigger than about half a KB) will probably be dropped by routers, so split your data into small chunks before sending it. (In some cases, the OS can take care of that.) Note that it is always a whole packet that makes it, or not; half packets aren't processed.
Latency over long distances can be quite high. If you want to do retransmission of data, I would go with something like 5 to 10 times the average latency over the current connection. (You can measure the latency by sending and receiving a few packets.)
Hope this helps.
I won't follow suit with the other people who answered this; they all seem to push you toward TCP, and that's not for gaming at all, except maybe for login/chat info. Let's go in order:
Do I need to worry about bad data in my socket?
Yes. Even though UDP carries an extremely simple checksum, it is not 100% effective at catching errors. You can add your own checksum mechanism, but most of the time UDP is used when reliability is already not an issue, so data that doesn't check out should just be dropped.
Should I assume that if I send 200 bytes, I may get 120 and 60 bytes separately?
No, UDP is a direct datagram write and read. However, if the datagram is too large, some routers will drop it and you lose that data permanently. Some have said roughly 576 bytes including headers; I personally wouldn't use more than 256 bytes (a nice round power of two).
Should I worry about another connection sending me bad data on the same port?
UDP listens for data from any computer on a port, so in that sense, yes. Also note that UDP is primitive, and raw sockets can be used to fake the sender address, so you should use some sort of "key" so the listener can verify the sender against their IP.
If data doesn't arrive, how long might I typically not see data for (250ms? 1 second? 1.75sec?)
Data sent over UDP is usually disposable, so if you don't receive it, it can easily be ignored. However, sometimes you want "semi-reliable" delivery without the ordered reliability that TCP provides; 1 second is a reasonable estimate for declaring a drop. You can number your packets in a rotation and write your own ACK scheme: when a packet is received, the receiver records the number and sends back a bitfield letting the sender know which packets it has received. You can read this unfinished document for more information (although unfinished, it still yields valuable info):
http://gafferongames.com/networking-for-game-programmers/
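As a rough illustration of the sequence-number-plus-ack-bitfield idea described there (this is my own sketch, not code from the linked article; 16-bit sequence wrap-around handling is omitted):

#include <cstdint>

// Sketch of a reliability header: each packet carries its own sequence number,
// the most recent sequence received from the other side, and a bitfield acking
// the 32 packets before that one. Layout and names are illustrative only.
#pragma pack(push, 1)
struct PacketHeader {
    uint16_t sequence;   // this packet's sequence number
    uint16_t ack;        // most recent remote sequence received
    uint32_t ackBits;    // bit n set => packet (ack - n - 1) was also received
};
#pragma pack(pop)

// Receiver side: fold a newly received remote sequence into ack/ackBits.
// Sequence wrap-around is not handled here.
void recordReceived(uint16_t remoteSeq, uint16_t& ack, uint32_t& ackBits) {
    if (remoteSeq > ack) {
        uint16_t shift = remoteSeq - ack;
        if (shift < 32) {
            ackBits = (ackBits << shift) | (1u << (shift - 1));
        } else if (shift == 32) {
            ackBits = 1u << 31;   // only the old ack still fits in the window
        } else {
            ackBits = 0;          // everything older fell out of the window
        }
        ack = remoteSeq;
    } else if (remoteSeq != ack && ack - remoteSeq <= 32) {
        ackBits |= 1u << (ack - remoteSeq - 1);   // late out-of-order packet
    }
}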
The big thing to know when attempting to use UDP is:
Your packets might not all make it over the line, which means parts of your data may simply never arrive.
If you're working on an application where 100% of the data needs to arrive reliably to provide functionality, use TCP. If you're working on an application where some loss is allowable (streaming media, etc.) then go for UDP but don't expect everything to get from one of the pipe to the other intact.
One way to look at the difference between applications appropriate for UDP vs. TCP is that TCP is good when data delivery is "better late than never", UDP is good when data delivery is "better never than late".
Another aspect is that the stateless, best-effort nature of most UDP-based applications can make scalability a bit easier to achieve. Also note that UDP can be multicast while TCP can't.
In addition to don.neufeld's recommendation to use TCP.
For most applications, TCP is easier to implement. If you need to maintain packet boundaries in a TCP stream, a good way is to transmit a two-byte header before the data to delimit the messages. The header should contain the message length. At the receiving end, just read two bytes and evaluate the value, then wait until you have received that many bytes. You then have a complete message and are ready to receive the next two-byte header.
This gives you some of the benefit of UDP without the hassle of lost data, out-of-order packet arrival etc.
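For illustration, a sketch of that two-byte length-prefix framing over plain POSIX TCP sockets (the big-endian byte order and the helper names are my assumptions, not something the answer specifies):

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   // htons/ntohs
#include <cstdint>
#include <vector>

// Helper: read exactly 'len' bytes (TCP may deliver them in smaller pieces).
static bool readExact(int sock, char* buf, size_t len) {
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sock, buf + got, len - got, 0);
        if (n <= 0) return false;   // error or connection closed
        got += static_cast<size_t>(n);
    }
    return true;
}

// Send one message: 2-byte big-endian length, then the payload.
bool sendMessage(int sock, const char* data, uint16_t len) {
    uint16_t beLen = htons(len);
    if (send(sock, &beLen, sizeof(beLen), 0) != static_cast<ssize_t>(sizeof(beLen)))
        return false;
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sock, data + sent, len - sent, 0);
        if (n <= 0) return false;
        sent += static_cast<size_t>(n);
    }
    return true;
}

// Receive one message framed as above.
bool recvMessage(int sock, std::vector<char>& out) {
    uint16_t beLen;
    if (!readExact(sock, reinterpret_cast<char*>(&beLen), sizeof(beLen))) return false;
    out.resize(ntohs(beLen));
    return readExact(sock, out.data(), out.size());
}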
And don't assume that if you send a packet it got there.
If there is a packet size limitation imposed by some router along the way, your UDP packets could be silently truncated to that size.
Two things:
1) You may or may not receive what was sent
2) Whatever you receive may not be in the same order it was sent.