I've been learning boost asio recently, especially UDP. I am familiar with the basics, but had a question regarding how UDP handles incoming messages. In the tutorial (see source code here: http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/tutorial/tutdaytime6/src.html), the UDP server operates something like (very pseudo-code):
startReceive(){
async_receive(boost::bind(handler),...other params);
}
handler(){
doStuffToDataReceived();
startReceive(); //start the receiving process over again to allow it to receive more data
}
My question is, if data arrives to the server during the time that it is in "doStuffToDataReceived()", before it startReceives over again, does that data get lost, or does it sit there and wait for startReceive to happen again and then is immediately retrieved?
Thanks!
UDP stack has a buffer, so the data in the above example wouldn't get lost.
Note however, that UDP is allowed to drop packets under various circumstances. So, as the throughput of the UDP server grows, the timings of doStuffToDataReceived might become much more critical.
Related
I've been using boost asio sockets (UDP and TCP) to handle a custom protocol between my client server program. Its been working great until I discovered that on TCP async_send/async_recieve calls that data can arrived in combined chunks.
For example, if I make two send calls each with it's own packet, they can arrive combined at a single receive call. I wrongly assumed that every send corresponds to a receive, but I'm obviously wrong. It however has worked well for the longest time until I found the issue running the client for a different OS.
So my question is: are there any guarantees to the completeness of the data on arrival for every receive call? (e.g. async_send 128 bytes arrive in multiples of 128 bytes, or how it arrives must always be treated as random, like 1 bytes arrives then 127 bytes is possible)
More specifically, does this mean that:
Data can arrive concatenated or partial for every send call, and I
have to always handle the concatenated/partial data manually
Is this true for both UDP and TCP asio sockets?
I searched around and couldn't find any documentation on this so I was wondering if anyone have any idea.
First its important to understand that boost asio socket receive and sends methods just mean that they ordered the underlying network stack to receive or send data. By network stack this could be the windows socket API.
If you are sending data right to the same computer, via so called loopback addresses, the operating system (if there is any) can just "give" it to the listening i.e. receiving program. Thats the scenario where you would be most lucky to get things in order and always complete for all cases.
However if you want you are addressing another computer or because the operating system is in the mood, you will have different behaviour:
TCP was designed that you will get you data in the order you have send it. But the chunks or packet size if will be sent differs even on the same connection and is a key feature of TCP. Your OS or hardware network adapter might do some send or receive buffering too, before informing you. However things won't get lost.
So in short for TCP: You can make sure the data is complete by waiting for a certain point in your data async_read_until is just there for this case. Data from multiple send calls might be in one receive or many
UDP was designed to have a low latency in contrast to TCP, but without its ordering and completeness guarantees. So when you send a UDP datagram i.e. packet, usually the OS and network adapter will try to send it out ASAP. However on the way to the other computer, the internet might loose it, or hold one packet back until the one you send after the first, so that data you send later, could be received later, while you can also get the sent first, later, or might not. But when you receive a datagram it's complete in it self.
So in short for UDP: Data will arrive in datagram chunks, but some datagrams might be missing, or might arrive in another order than sent. The data from one send might be in one receive, might not, or later
So after some more testing here's what I concluded: the answer is no. Boost Asio sockets does not have magic that can enforce data completeness beyond what the TCP/UDP protocols enforces.
Edit:
So here's more of my research:
For TCP, it acts like a data stream. So packets may arrive partial or combined and is complete. So the user application need to handle deserialization of combined or partial data.
For UDP, because it is a datagram packet, if the packet arrives, it is guaranteed to be independent and complete. So there is no need to handle partial or combined packets.
As a homework assignment, I wrote UDP server-client application that tries to correct errors in the UDP communication using checksums and through confirming correctly received packets.
The problem is that on localhost, all packets are received without a problem. I tried some packet tampering programs, but they all require communication through network interface.
How to simulate UDP packet loss on localhost loopback address?
UDP is easy to deal with--just write a bit of code in the sender or receiver which drops a certain percentage of the messages, and perhaps occasionally reorders some too.
If you can't modify the actual sender or receiver, it is easy enough to write a third program which simply sits in the middle, forwarding packets with some drops and reordering.
If you're using Linux, you can probably set up iptables to drop packets for you: http://code.nomad-labs.com/2010/03/11/simulating-dropped-packets-aka-crappy-internets-with-iptables/ - this seems like it might work even on loopback ports.
I am currently planning how to develop a man in the middle network application for TCP server that would transfer data between server and client. It would behave as regular client for server and server for remote client without modifying any data. It will be optionally used to detect and measure how long server or client is not able to receive data that is ready to be received in situation when connection is inactive.
I am planning to use blocking send and recv functions. Before any data transfer I would call a setsockopt function to set SO_SNDTIMEO and SO_RCVTIMEO to about 10 - 20 miliseconds assuming it will force blocking send and recv functions to return early in order to let another active connection data to be routed. Running thread per connection looks too expensive. I would not use async sockets here because I can not find guarantee that they will get complete in a parts of second especially when large data amount is being sent or received. High data delays does not look good. I would use very small buffers here but calling function for each received byte looks overkill.
My next assumption would be that is safe to call send or recv later if it has previously terminated by timeout and data was received less than requested.
But I am confused by contradicting information available at msdn.
send function
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740149%28v=vs.85%29.aspx
If no error occurs, send returns the total number of bytes sent, which
can be less than the number requested to be sent in the len parameter.
SOL_SOCKET Socket Options
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740532%28v=vs.85%29.aspx
SO_SNDTIMEO - The timeout, in milliseconds, for blocking send calls.
The default for this option is zero, which indicates that a send
operation will not time out. If a blocking send call times out, the
connection is in an indeterminate state and should be closed.
Are my assumptions correct that I can use these functions like this? Maybe there is more effective way to do this?
Thanks for answers
While you MIGHT implement something along the ideas you have given in your question, there are preferable alternatives on all major systems.
Namely:
kqueue on FreeBSD and family. And on MAC OSX.
epoll on linux and related types of operating systems.
IO completion ports on Windows.
Using those technologies allows you to process traffic on multiple sockets without timeout logics and polling in an efficient, reactive manner. They all can be considered successors of the ancient select() function in socket API.
As for the quoted documentation for send() in your question, it is not really confusing or contradicting. Useful network protocols implement a mechanism to create "backpressure" for situations where a sender tries to send more data than a receiver (and/or the transport channel) can accomodate for. So, an application can only provide more data to send() if the network stack has buffer space ready for it.
If, for example an application tries to send 3Kb worth of data and the tcp/ip stack has only room for 800 bytes, send() might succeed and return that it used 800 bytes of the 3k offered bytes.
The basic approach to forwarding the data on a connection is: Do not read from the incoming socket until you know you can send that data to the outgoing socket. If you read greedily (and buffer on application layer), you deprive the communication channel of its backpressure mechanism.
So basically, the "send capability" should drive the receive actions.
As for using timeouts for this "middle man", there are 2 major scenarios:
You know the sending behavior of the sender application. I.e. if it has some intent on sending any data within your chosen receive timeout at any time. Some applications only send sporadically and any chosen value for a receive timeout could be wrong. Even if it is supposed to send at a specific time interval, your timeouts will cause trouble once someone debugs the sending application.
You want the "middle man" to work for unknown applications (which must not use some encryption for middle man to have a chance, of course). There, you cannot pick any "adequate" timeout value because you know nothing about the sending behavior of the involved application(s).
As a previous poster has suggested, I strongly urge you to reconsider the design of your server so that it employs an asynchronous I/O strategy. This may very well require that you spend significant time learning about each operating systems' preferred approach. It will be time well-spent.
For anything other than a toy application, using blocking I/O in the manner that you suggest will not perform well. Even with short timeouts, it sounds to me as though you won't be able to service new connections until you have completed the work for the current connection. You may also find (with short timeouts) that you're burning more CPU time spinning waiting for work to do than actually doing work.
A previous poster wisely suggested taking a look at Windows I/O completion ports. Take a look at this article I wrote in 2007 for Dr. Dobbs. It's not perfect, but I try to do a decent job of explaining how you can design a simple server that uses a small thread pool to handle potentially large numbers of connections:
Windows I/O Completion Ports
http://www.drdobbs.com/cpp/multithreaded-asynchronous-io-io-comple/201202921
If you're on Linux/FreeBSD/MacOSX, take a look at libevent:
Libevent
http://libevent.org/
Finally, a good, practical book on writing TCP/IP servers and clients is "Practical TCP/IP Sockets in C" by Michael Donahoe and Kenneth Calvert. You could also check out the W. Richard Stevens texts (which cover the topic completely for UNIX.)
In summary, I think you should take some time to learn more about asynchronous socket I/O and the established, best-of-breed approaches for developing servers.
Feel free to private message me if you have questions down the road.
I have a server application written in C++. When a client connects, it creates a new thread for him. In that thread there is a BLOCKING reading from a socket. Because there is a possibility for a client to accidentally disconnect and left behind a thread still hanging on the read function, there is a thread that checks if the sockets are still alive by sending "heartbeat messages". The message consists of 1 character and is "ignored" by the client (it is not processed like other messages). The write looks like this:
write(fd, ";", 1);
It works fine, but is it really necessary to send a random character through the socket? I tried to send an empty message ("" with length 0), but it didn't work. Is there any better way to solve this socket checking?
Edit:
I'm using BSD sockets (TCP).
I'm assuming when you say, "socket, you mean a TCP network socket.
If that's true, then the TCP protocol gives you a keepalive option that you would need to ask the OS to use.
I think this StackOverflow answer gets at what you would need to do, assuming a BSDish socket library.
In my experience, using heartbeat messages on TCP (and checking for responses, e.g. NOP/NOP-ACK) is the easiest way to get reliable and timely indication of connectivity at the application layer. The network layer can do some interesting things but getting notification in your application can be tricky.
If you can switch to UDP, you'll have more control and flexibility at the application layer, and probably reduced traffic overall since you can customize the communications, but you'll need to handle reliability, packet ordering, etc. yourself.
You can set connection KEEPALIVE. You may have interests in this link: http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html
It is ok you create a thread for each new coming requests if it is only toy. In most of time, i use poll, that is non-blocking io, for performance improvement.
I am using SDL and Net2 lib for a client-server application.
The problem I am facing is that I am not receiving all of my TCP packets from my client unless I place a delay before sending each packet from client.
Removing the delay I get only one packet.
A TCP connection is a stream of bytes. Your client could send 20 packets of 5 bytes each, and the server read it as one 100-byte sequence. You'll need to split the data up yourself.
Well you're not guaranteed (in regular sockets) to receive all packets at one time, you may have to call your receive function more than once, to receive all data. This is of course depends on your definition of a "packet" are you receiving all of your data?
+1 erik
Although it is not guaranteed to be reliable, you most likely want to use UDP, not TCP. Net2 handles UDP very well. UDP is actually very reliable. UDP is message oriented. UDP messages tend to get sent quickly and get special treatment by routers (not always a good thing :-). UDP is often used in games.
BTW, if you had asked this question on the SDL mailing list, or sent it to me directly, you would have gotten this advice many months ago.
I wrote Net2 and I hang out on the SDL list. I do not hang out here because this place is an infinite time sink.
Bob Pendleton