UDP Read entire socket buffer in one shot - c++

I have 3 components: client, proxy, and server. At times, when the proxy gets heavily loaded, the socket buffer (configured to, say, 1 MB) gets filled. Is there a way to read the entire 1 MB buffer in one shot and then process it?
FYI:
all the datagrams never go beyond the MTU size and are in a pre-defined structured format, in which the length of each packet is also included.
The proxy routes data between client & server, so I tried having producer & consumer threads, but the problem is NOT solved.

Short answer: no.
Long answer:
The Berkeley-style socket API lets you receive or send only one datagram per call. Therefore it is not possible to read the complete buffered stream in one call and replay it at the other side.
One reason is that your UDP socket can receive data from several sources. The interface has to pass meta information, such as the sender's socket address and at least the packet size, to the caller for each datagram. This bunch of data would have to be parsed so you could pick the packets that meet your criteria, and finally you could build the batch of packets to send.
Since you need the ability to check each packet, to see whether it is really the one you expect, you need a function that reads one packet at a time from the bunch. That function is recvfrom.
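If the real goal is just to get everything that has piled up in the receive buffer, the usual workaround is to drain it one datagram at a time. A minimal sketch, assuming a non-blocking UDP socket; drain_socket and handle_packet are hypothetical names invented for this example, not part of the question:

#include <netinet/in.h>
#include <sys/socket.h>

// handle_packet() is a hypothetical application callback.
void handle_packet(const char* data, ssize_t len, const sockaddr_in& from);

// Read every datagram currently queued on a non-blocking UDP socket.
void drain_socket(int fd)
{
    char buf[2048];                                   // comfortably above one MTU
    for (;;) {
        sockaddr_in from{};
        socklen_t fromlen = sizeof(from);
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                             reinterpret_cast<sockaddr*>(&from), &fromlen);
        if (n < 0)
            break;  // EAGAIN/EWOULDBLOCK means the queue is empty; other errno values need real handling
        handle_packet(buf, n, from);                  // exactly one datagram per recvfrom() call
    }
}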

Related

TCP sockets: Where does incoming data go after ack(leaves tcp read buffer) but before read()/recv()?

If I have a TCP connection that transfers data at 200 KB/sec but I only read()/recv() from the socket once a second, where are those 200 KB of data stored in the meanwhile?
As far as I know, data leaves the TCP socket's read buffer after an ACK gets sent to the sender, and that buffer is too small anyway to hold 200 KB of data. So where does the data wait until it can be read()/recv()'d by my client?
Thanks!!
The following answer claims data leaves the TCP read buffer as soon as it is ACK'ed, before being read()/recv()d:
https://stackoverflow.com/a/12934115/2378033
"The size of the receiver's socket receive buffer determines how much data can be in flight without acknowledgement"
Could it be that my assumption is wrong and the data gets ACK'd only after it is read()/recv()d by the userspace program?
data leaves the TCP socket's read buffer after an ack gets sent to the sender
No. It leaves the receive buffer when you read it, via recv(), recvfrom(), read(), etc.
The following answer claims data leaves the TCP read buffer as soon as it is ACK'ed
Fiddlesticks. I wrote it, and it positively and absolutely doesn't 'claim' any such thing.
You are thinking of the send buffer. Data is removed from the sender's send buffer when it is ACKed by the receiver. That's because the sender now knows it has arrived and doesn't need it for any more resends.
Could it be that my assumption is wrong and the data gets ACK'd only after it is read()/recv()d by the userspace program?
Yes, your assumption is wrong, and so is this alternative speculation. The data gets ACK'd on arrival, and removed by read()/recv().
When data is correctly received it enters the TCP read buffer and is subject to acknowledgement immediately. That doesn't mean that the acknowledgement is sent immediately, as it will be more efficient to combine the acknowledgement with a window size update, or with data being sent over the connection in the other direction, or acknowledgement of more data.
For example, suppose you are sending one byte at a time, corresponding to a user's typing, and the other side has a receive buffer of 50000 bytes. It tells you that the window size is 50000 bytes, meaning that you can send that many bytes of data without receiving anything further. Every byte of data you send closes the window by one byte.

Now the receiver could send a packet acknowledging the single byte as soon as it was correctly received and entered the TCP receive buffer, with a window size of 49999 bytes, because that is how much space is left in the receive buffer. The acknowledgement would allow you to remove the byte from your send buffer, since you now know that the byte was received correctly and will not need to be resent. Then, when the application reads the byte from the TCP receive buffer using read() or recv(), that would make space in the buffer for one additional byte of data, so the receiver could send another packet updating the TCP window size to allow you to once again send 50000 bytes, rather than 49999. Then the application might echo the character or send some other response to the data, causing a third packet to be sent.

Fortunately, a well-designed TCP implementation will not do that, as it would create a lot of overhead. It will ideally send a single packet containing any data going in the other direction as well as any acknowledgement and window size update. It might appear that the acknowledgement is sent when the application reads the data and it leaves the receive buffer, but that may simply be the event that triggered the sending of the packet. However, TCP will not always delay an acknowledgement and will not delay it indefinitely; after a short timeout with no other activity it will send any delayed acknowledgement.
As for the size of the receive buffer, which contains the received data not yet read by the application, that can be controlled using setsockopt() with the SO_RCVBUF option. The default may vary by OS, memory size, and other parameters. For example a fast connection with high latency (e.g. satellite) may warrant larger buffers, although that will increase memory use. There is also a send buffer (SO_SNDBUF) which includes data that has either not yet been transmitted, or has been transmitted but not yet acknowledged.
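A minimal sketch of tuning those buffers with setsockopt(), assuming an already-created POSIX socket descriptor fd; the 1 MB figures are illustrative, not recommendations:

#include <sys/socket.h>

// Ask the kernel for larger send/receive buffers on socket `fd`.
// The OS may clamp (or, on Linux, double) the requested values.
void enlarge_buffers(int fd)
{
    int rcv = 1 << 20;                                        // 1 MB receive buffer
    int snd = 1 << 20;                                        // 1 MB send buffer
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcv, sizeof(rcv));
    setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &snd, sizeof(snd));

    // Read back what was actually granted.
    socklen_t len = sizeof(rcv);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcv, &len);
}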
Your OS will buffer a certain amount of incoming TCP data. For example on Solaris this defaults to 56K but can be reasonably configured for up to several MB if heavy bursts are expected. Linux appears to default to much smaller values, but you can see instructions on this web page for increasing those defaults: http://www.cyberciti.biz/faq/linux-tcp-tuning/

Packet Size modification over Sockets

I am doing socket programming in QT and I have to design a protocol to transfer data over TCP/IP.
Now my protocol design is simple. It sends commands such that the first byte of the data written to the socket for every write is the command. So whenever I write to the socket using socket->write("CDATA"), the first byte, "C" in this case, will mean a command for the server to do something.
I just want to know one thing: will the write be broken down into multiple reads on the server? I know there will be a buffer size on the server for the read. But can the socket->write() on the client be received in multiple reads on the server, even when the write is within the buffer limits of the server?
To clarify the question I will give an example. Let's say the buffer read size of the socket on the server is 4096 bytes. The client writes socket->write("CDATA") to the server. Now is there any possibility that the server will receive this in more than one read? Because I have a while loop on the server:
while (socket->bytesAvailable() > 0) {
    QByteArray str = socket->read(4096);
    // What is the command in the first byte?
    if (str[0] == 'C') {
        // Do something
    }
}
If the data sent by the client is received in more than one read (even though the client sent it in one write) my protocol design will fail.
Now is there any possibility that server will receive this in more than one read?
Yes, TCP/IP can fragment messages any way it likes. TCP is a stateful stream protocol: you are guaranteed that bytes you put in on one end will come out the other end in the same order. IP is connectionless and datagram based. Due to the nature of carrying TCP over IP, circumstances can arise in which data packets are split, merged, or otherwise processed in transit.
You should find a way to insulate your program from the intricacies of network communication. You can:
Use a datagram protocol like UDP (you lose the guarantee of getting data in the order it was sent, and dropped packets become a possibility as well. Today's networks are fairly robust; this is not usually a problem).
[DATAGRAM (size specified in datagram header)]
Always read blocks of a fixed size from the network
[DATA - block of data of some fixed size]
Include the size of the incoming data as a header attached to the front
[LENGTH - 4 byte integer][DATA - block of data of size LENGTH]
Use some sort of delimiter to indicate end-of-data and continue reading until you get it
[DATA - indeterminately sized data][DELIMITER - end-of-data control sequence]
Chances are you can use library methods to implement this behavior for you, requiring very little code on your part.
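For example, the length-header option could look roughly like this in Qt; handleCommand, onReadyRead, and the persistent buffer are hypothetical names invented for this sketch, not part of the question's code:

#include <QtEndian>
#include <QTcpSocket>

// handleCommand() is a hypothetical handler for one complete message.
void handleCommand(const QByteArray& msg);

// Extract complete [quint32 length, big-endian][payload] messages from a
// growing buffer; `buffer` must persist between readyRead() notifications.
void onReadyRead(QTcpSocket* socket, QByteArray& buffer)
{
    buffer.append(socket->readAll());

    while (buffer.size() >= 4) {
        quint32 len = qFromBigEndian<quint32>(
            reinterpret_cast<const uchar*>(buffer.constData()));
        if (buffer.size() < int(4 + len))
            break;                                   // wait for the rest of this message
        QByteArray msg = buffer.mid(4, int(len));    // one complete framed message
        buffer.remove(0, int(4 + len));
        handleCommand(msg);                          // e.g. check msg[0] == 'C'
    }
}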

UDP Server Socket Buffer Overflow

I am writing a C++ application on Linux. My application has a UDP server which sends data to clients on some events. The UDP server also receives some feedback/acknowledgement back from the clients.
To implement this application I used a single UDP socket (e.g. int fdSocket) to send and receive data from all the clients. I bound this socket to port 8080 and set it to non-blocking mode.
I created two threads. In one thread I wait for some event to happen; if an event occurs, I use fdSocket to send data to all the clients (in a for loop).
In the other thread I use fdSocket to receive data from the clients (recvfrom()). This thread is scheduled to run every 4 seconds (i.e. every 4 seconds it will call recvfrom() to retrieve the data from the socket buffer; since the socket is non-blocking, recvfrom() returns immediately if no UDP data is available, and then I go to sleep for 4 seconds).
The UDP feedback/acknowledgement from all the clients has a fixed payload whose size is 20 bytes.
Now I have two questions related to this implementation:
Is it correct to use the same socket for sending/receiving UDP data with multiple clients?
How do I find the maximum number of UDP feedback/acknowledgement packets my application can handle without UDP socket buffer overflow (since I am reading only every 4 seconds, if I receive a lot of packets within those 4 seconds I might lose some packets; i.e., I need to find the rate in packets/sec I can handle safely)?
I tried to get the Linux socket buffer size for my socket (fdSocket) using the call getsockopt(fdSocket, SOL_SOCKET, SO_RCVBUF, (void *)&n, &m);. From this I discovered that my socket buffer size is 110592. But I am not clear on what data is stored in this socket buffer: will it store only the UDP payload, the entire UDP packet, or even the entire Ethernet frame? I referred to this link to get some idea but got confused.
Currently my code is a little bit dirty; I will clean it up and post it here soon.
The following are the links I have referred before posting this question.
Linux Networking
UDP SentTo and Recvfrom Max Buffer Size
UDP Socket Buffer Overflow Detection
UDP broadcast and unicast through the same socket?
Sending from the same UDP socket in multiple threads
How to flush Input Buffer of an UDP Socket in C?
How to find the socket buffer size of linux
How do I get amount of queued data for UDP socket?
Having the socket read at a fixed interval of four seconds definitely sets you up for losing packets. The conventional, tried-and-true approach to non-blocking I/O is the de-multiplexing system calls select(2)/poll(2)/epoll(7). See if you can use these to capture/react to your other events.
On the other hand, since you are already using threads, you can just do blocking recv(2) without that four second sleep.
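A minimal sketch of the poll(2) variant for the receive thread, assuming the same UDP descriptor; process_feedback() and receive_loop() are hypothetical names, not from the question:

#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>

// process_feedback() is a hypothetical handler supplied elsewhere.
void process_feedback(const char* data, ssize_t len, const sockaddr_in& from);

// Receive thread: block until data arrives instead of sleeping for 4 seconds.
void receive_loop(int fdSocket)
{
    pollfd pfd{fdSocket, POLLIN, 0};
    for (;;) {
        if (poll(&pfd, 1, -1) <= 0)                   // -1: wait indefinitely for readability
            continue;
        char buf[64];                                  // the feedback payload is 20 bytes
        sockaddr_in from{};
        socklen_t fromlen = sizeof(from);
        ssize_t n = recvfrom(fdSocket, buf, sizeof(buf), 0,
                             reinterpret_cast<sockaddr*>(&from), &fromlen);
        if (n > 0)
            process_feedback(buf, n, from);
    }
}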
Read Stevens for an explanation of SO_RCVBUF.
You can see the maximum allowed buffer size:
sysctl net.core.rmem_max
You can set the maximum buffer size you can use by:
sysctl -w net.core.rmem_max=8388608
You can also set the buffer size at run-time (not exceeding the max above) by using setsockopt and changing SO_RCVBUF. You can see the buffer level by looking at /proc/net/udp.
The buffer is used to store the UDP header and application data; the rest belongs to lower layers.
Q: Is it correct to use the same socket for sending/receiving UDP data with multiple clients?
A: Yes, it is correct.
Q: How do I find the maximum number of UDP feedback/acknowledgement packets my application can handle without UDP socket buffer overflow (since I am reading only every 4 seconds, if I receive a lot of packets within those 4 seconds I might lose some packets; i.e., I need to find the rate in packets/sec I can handle safely)?
A: The bottleneck might be the network bandwidth, or CPU, or memory. You could simply run a test, using a client which sends acknowledgements to the server with consecutive sequence numbers, and verify whether there is packet loss at the server.
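A sketch of the receiving side of such a test, assuming (purely for illustration) that each 20-byte feedback packet starts with a 4-byte sequence number; LossCounter is a made-up helper, not part of the question:

#include <cstddef>
#include <cstdint>
#include <cstring>

// Count gaps in consecutive sequence numbers to detect drops at the server.
// Assumes the first 4 bytes of each feedback payload carry a sequence number
// in the sender's native byte order; reordering is not handled in this sketch.
struct LossCounter {
    uint32_t expected = 0;
    uint64_t lost = 0;

    void onPacket(const char* payload, std::size_t len) {
        if (len < sizeof(uint32_t)) return;
        uint32_t seq;
        std::memcpy(&seq, payload, sizeof(seq));
        if (seq > expected)
            lost += seq - expected;   // this many packets never arrived
        expected = seq + 1;
    }
};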

Receive packet by packet data from TCP socket?

I have a TCP socket on which I receive a video stream. I want to receive data packet by packet from the socket so that I can remove the packet header and keep only the stream data. How can I do this?
Any help will be appreciated.
You can't. TCP doesn't work with packets, messages, etc. TCP works with bytes: you get a stream of bytes. The problem is that there's no guarantee regarding the number of bytes you'll get each time you read from a socket. The usual way to handle this:
When you want to send a "packet" include as the first thing a length
When you read from the socket, keep reading until you have at least that length
Your message could be:
|Message Length:4bytes|Additional header Information:whatever1|Message Data:whatever2|
What you'll then have to do is read 4 bytes and then read as much as those 4 bytes tell you. Then you'll be able to strip the header and get the data.
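A minimal sketch of that loop with plain POSIX sockets; recv_exact() and recv_message() are helper names made up for this example, and the framing here is just a 4-byte length followed by the payload:

#include <sys/socket.h>
#include <arpa/inet.h>
#include <cstdint>
#include <vector>

// Keep calling recv() until exactly `len` bytes have arrived (or the peer closes).
static bool recv_exact(int fd, char* buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, buf + got, len - got, 0);
        if (n <= 0)
            return false;                     // connection closed or error
        got += n;
    }
    return true;
}

// Read one [uint32 length (network byte order)][payload] message.
bool recv_message(int fd, std::vector<char>& payload)
{
    uint32_t len_net;
    if (!recv_exact(fd, reinterpret_cast<char*>(&len_net), sizeof(len_net)))
        return false;
    payload.resize(ntohl(len_net));
    return recv_exact(fd, payload.data(), payload.size());
}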
As others have mentioned, TCP is a streaming protocol. This means from an API point of view, there is no concept of "packet". As a user, all you can expect is a stream of data.
Internally, TCP will break the stream into segments that can be placed into IP packets. These packets will be sent along with control data, over IP, to the remote end. The remote end will receive these IP packets. It may discard certain IP packets (in the case of duplicates), reorder the packets or withhold data until earlier packets have arrived. All this is internal to TCP meaning the concept of a "TCP packet" is meaningless.
You might be able to use raw sockets to receive the raw IP packets but this will mean you will have to reimplement much of the TCP stack (like sending ACKs and adjusting window size) to get the remote end to perform correctly. You do not want to do this.
UDP, on the other hand, is a datagram protocol. This means that the user is made aware of how the data is sent over the network. If the concept of packets or datagrams are important to you, you will need to build your own protocol on top of UDP.
TCP is a stream protocol and it doesn't guarantee that when you call the socket read function you will receive one complete packet. UDP and SCTP are packet-oriented protocols and do guarantee this. With TCP you can get part of a packet or several packets at once. You have to build your own application protocol on top of TCP and fragment/defragment messages manually.
TCP is a streaming protocol. You get bytes with no message boundaries. The solution is to buffer all your reads and extract/process full video packets from the buffer.
Algorithm:
1. Initialize an empty buffer.
2. Examine the buffer for a complete packet.
3. If found, remove the complete packet from the beginning of the buffer and process it.
4. If not found, append data from a recv() to the buffer and go to step 2.
What a "complete packet" contains should be defined by the video streaming protocol.
Are you sure about this approach? In my opinion this "preprocessing" will introduce additional overhead to the system. And of course this is handled by a lower layer (read about the OSI model), so it is not easy to change. Note that most of the existing streaming protocols are already optimized for the best performance.

Confusion about UDP/IP and sendto/recvfrom return values

I'm working with UDP sockets in C++ for the first time, and I'm not sure I understand how they work. I know that sendto/recvfrom and send/recv normally return the number of bytes actually sent or received. I've heard this value can be arbitrarily small (but at least 1), and depends on how much data is in the socket's buffer (when reading) or how much free space is left in the buffer (when writing).
If sendto and recvfrom only guarantee that 1 byte will be sent or received at a time, and datagrams can be received out of order, how can any UDP protocol remain coherent? Doesn't this imply that the bytes in a message can be arbitrarily shuffled when I receive them? Is there a way to guarantee that a message gets sent or received all at once?
It's a little stronger than that. UDP does deliver a full packet; the buffer size can be arbitrarily small, but it has to include all the data sent in the packet. But there's also a size limit: if you want to send a lot of data, you have to break it into packets and be able to reassemble them yourself. There's also no guaranteed delivery, so you have to check to make sure everything comes through.
But since you can implement all of TCP with UDP, it has to be possible.
Usually, what you do with UDP is make small packets that are discrete.
Metaphorically, think of UDP like sending postcards and TCP like making a phone call. When you send a postcard, you have no guarantee of delivery, so you need to do something like have an acknowledgement come back. With a phone call, you know the connection exists, and you hear the answers right away.
Actually you can send a UDP datagram of 0 bytes length. All that gets sent is the IP and UDP headers. The UDP recvfrom() on the other side will return with a length of 0. Unlike TCP this does not mean that the peer closed the connection because with UDP there is no "connection".
No. With sendto you send out packets, which can contain down to a single byte.
If you send 10 bytes as a single sendto call, these 10 bytes get sent in a single packet, which will be received whole, as you would expect.
Of course, if you decide to send those 10 bytes one by one, each of them with a sendto call, then indeed you send and receive 10 different packets (each one containing 1 byte), and they could be in arbitrary order.
It's similar to sending a book via postal service. You can package the book as a whole into a single box, or tear down every page and send each one as an individual letter. In the first case, the package is bulkier but you receive the book as a single, ordered entity. In the latter, each package is very light, but good luck reading that ;)
I have a client program that uses a blocking select (NULL timeout parameter) in a thread dedicated to waiting for incoming data on a UDP socket. Even though it is blocking, the select would sometimes return with an indication that the single read descriptor was "ready". A subsequent recvfrom returned 0.
After some experimentation, I have found that on Windows at least, sending a UDP packet to a port on a host that's not expecting it can result in a subsequent recvfrom getting 0 bytes. I suspect some kind of rejection notice might be coming from the other end. I now use this as a reminder that I've forgotten to start the process on the server that looks for the client's incoming traffic.
BTW, if I instead "sendto" a valid but unused IP address, then the select does not return a ready status and blocks as expected. I've also found that blocking vs. non-blocking sockets makes no difference.