Receive packet by packet data from TCP socket?

Receive packet by packet data from TCP socket? - c++

I have a TCP socket on which I receive a video stream. I want to receive the data packet by packet so that I can remove each packet header and keep only the stream data. How can I do this?
Any help will be appreciated.

You can't. TCP doesn't work with packets or messages; it works with bytes. You get a stream of bytes, and there's no guarantee regarding the number of bytes you'll get each time you read from the socket. The usual way to handle this:
When you want to send a "packet", include its length as the first thing you send.
When you read from the socket, keep reading until you have received that many bytes.
Your message could be:
|Message Length:4bytes|Additional header Information:whatever1|Message Data:whatever2|
What you'll then have to do is read 4 bytes, and then read as many bytes as those 4 bytes tell you to. Then you'll be able to strip the header and get the data.

As others have mentioned, TCP is a streaming protocol. This means from an API point of view, there is no concept of "packet". As a user, all you can expect is a stream of data.
Internally, TCP will break the stream into segments that can be placed into IP packets. These packets will be sent along with control data, over IP, to the remote end. The remote end will receive these IP packets. It may discard certain IP packets (in the case of duplicates), reorder the packets or withhold data until earlier packets have arrived. All this is internal to TCP meaning the concept of a "TCP packet" is meaningless.
You might be able to use raw sockets to receive the raw IP packets but this will mean you will have to reimplement much of the TCP stack (like sending ACKs and adjusting window size) to get the remote end to perform correctly. You do not want to do this.
UDP, on the other hand, is a datagram protocol. This means that the user is made aware of how the data is sent over the network. If the concept of packets or datagrams is important to you, you will need to build your own protocol on top of UDP.

TCP is a stream protocol, and it doesn't guarantee that a call to the socket read function will return exactly one complete packet. UDP and SCTP are packet-oriented protocols and do guarantee this. With TCP you can get part of a packet or several packets at once. You have to build your own application protocol on top of TCP and fragment/defragment messages manually.

TCP is a streaming protocol. You get bytes with no message boundaries. The solution is to buffer all your reads and extract/process full video packets from the buffer.
Algorithm:
1. Initialize an empty buffer.
2. Examine buffer for a complete packet.
3. If found, remove complete packet from beginning of buffer and process it.
4. If not found, append data from a recv() to the buffer and go to step 2.
What a "complete packet" contains should be defined by the video streaming protocol.
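The buffering algorithm above might be sketched like this. The real definition of a "complete packet" depends on the video streaming protocol, which the question does not name, so this sketch assumes a hypothetical framing of a 4-byte big-endian payload length followed by the payload. The class name PacketBuffer is also made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Accumulates raw bytes from recv() and hands back complete packets.
// Assumed framing (not from the original protocol): a 4-byte big-endian
// payload length followed by the payload itself.
class PacketBuffer {
public:
    // Append whatever recv() returned to the end of the buffer.
    void append(const unsigned char* data, std::size_t n) {
        buf_.insert(buf_.end(), data, data + n);
    }

    // If a complete packet sits at the front of the buffer, move its payload
    // into `packet`, remove it from the buffer, and return true.
    bool pop(std::vector<unsigned char>& packet) {
        if (buf_.size() < 4) return false;               // length field incomplete
        const std::uint32_t len = (std::uint32_t(buf_[0]) << 24) |
                                  (std::uint32_t(buf_[1]) << 16) |
                                  (std::uint32_t(buf_[2]) << 8) |
                                   std::uint32_t(buf_[3]);
        if (buf_.size() - 4 < len) return false;         // payload incomplete
        packet.assign(buf_.begin() + 4, buf_.begin() + 4 + len);
        buf_.erase(buf_.begin(), buf_.begin() + 4 + len);
        return true;
    }

private:
    std::vector<unsigned char> buf_;
};
```

A receive loop would call append() with each chunk returned by recv() and then call pop() repeatedly until it returns false, since one recv() can deliver several packets or only a fragment of one.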

Are you sure about this approach? In my opinion this kind of "preprocessing" will introduce additional overhead to the system. And of course packetization is handled by a lower layer (read about the OSI model), so it is not easy to change. Note that most of the existing streaming protocols are already optimized for the best performance.

Related

UDP Read entire socket buffer in one shot

I have 3 components, client-proxy-server. At times, when the proxy gets heavily loaded, the socket buffers (configured to, say, 1 MB) get filled. Is there a way to read the entire 1 MB buffer in one shot and then process it?
FYI:
All the datagrams stay within the MTU size and are in a predefined structured format, in which the length of each packet is also included.
The proxy routes data between client & server. I tried having producer & consumer threads, but the problem is NOT solved.

Short answer: no.
Long answer:
The Berkeley-style socket interface lets you receive or send only one datagram per call. Therefore it is not possible to read a complete network stream in one call and replay it at the other side.
One reason is that your UDP socket can receive data from several sources. The interface has to be able to pass meta information, such as the sender's socket address and at least the packet size, to the caller. A bulk read would therefore return a bunch of data that you would have to parse, picking the packets that meet your criteria. Finally, you could build the bunch of packets to send.
Since you need to be able to check whether each packet is really expected, you need a function that reads one packet from the bunch. That function is recvfrom.

Which method to send/receive data properly in a network game (UDP, but why not TCP)

I have a C++ application with GUI that runs (on PC 1) just like a network game, and receives data packets from another computer (2) via WiFi (ad-hoc, so it's quite reliable) at fairly regular intervals (like 40ms), once per loop on program (2). I use send/read.
Here is the problem:
- Packets are not always fully sent (but apparently you can simply keep send()ing the remaining data until all is sent, and that works well)
- More importantly, packets are stacked in the socket during (1)'s loop until the read() occurs, and then there is no way to distinguish packets in the big stream of data, or know if you were already in the middle of a packet.
I tried to fix this with ID headers (you find an ID as first bytes and you know the length of the packet), but I often get lost (unknown ID : we are not at the beginning of the packet) and am forced to ignore all the remaining data.
So my question is:
Why do packets stack? (generally I have 400B of data whereas my packets are <100B long and fps (1) and (2) are not very different)
How can I have a more reliable way to receive actual packets, say, 80% of packets (discarding packet loss, it's not a question of UDP/TCP)?
Would a separate thread for receiving packets work? (on (1), the server)
How do real-time network games do that (including multiple client management)?
Thanks in advance.
(Sorry I do not have the code here, but I tried to be as clear as I could)

Well:
1) UDP transfers MESSAGES, but is unreliable.
2) TCP transfers BYTE STREAMS, and is reliable.
UDP cannot reliably transfer messages. Anything more reliable requires a protocol on top of UDP.
TCP cannot transfer messages unless they are one byte long. Anything more complex requires a protocol on top of TCP.

Why do packets stack? (generally I have 400B of data whereas my packets are <100B long and fps (1) and (2) are not very different)
Because the time to send packets across the net varies, it typically does not make sense to send packets at a high rate, so most networking libraries (e.g. RakNet) will queue up packets and do a send every 10 ms.
In the case of TCP, there is Nagle's algorithm, which is a more principled way of doing the same thing. You can turn Nagle's algorithm off by setting the TCP_NODELAY socket option.
How can I have a more reliable way to receive actual packets, say, 80% of packets (discarding packet loss, it's not a question of UDP/TCP)?
If you use TCP, you will receive all of the packets and in the right order. The penalty for using TCP is if a packet is dropped, the packets after it wait until that packet can be resent before they are processed. This results in a noticeable delay, so any games that use TCP have sophisticated prediction techniques to hide this delay and other techniques to smoothly "catch up" once the missing packet arrives.
If you use UDP and the order of the packets doesn't matter, you can implement a layer on top that gives you reliability without ordering: send a counter with each packet and have the receiver repeatedly notify the sender of gaps in the counts. You can also implement ordering in a similar way. Of course, if you enforce both, you are creating your own TCP layer. See http://www.jenkinssoftware.com/raknet/manual/reliabilitytypes.html for more details.
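The receiver side of the counter-and-gaps idea could be tracked roughly as below. This is an illustrative sketch only: the class name GapTracker is hypothetical, sequence numbers are assumed to start at 0, and counter wrap-around and limits on very large jumps are deliberately ignored.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Tracks which sequence numbers have not arrived yet, so the receiver can
// tell the sender what to resend. Wrap-around is not handled in this sketch.
class GapTracker {
public:
    // Records an arriving sequence number.
    // Returns false if the packet is a duplicate, true otherwise.
    bool on_receive(std::uint32_t seq) {
        if (seq < next_expected_) {
            // Either fills a known gap (erase succeeds) or is a duplicate.
            return missing_.erase(seq) > 0;
        }
        // Everything between the expected number and this one is now a gap.
        for (std::uint32_t s = next_expected_; s < seq; ++s) missing_.insert(s);
        next_expected_ = seq + 1;
        return true;
    }

    // Sequence numbers the receiver should report to the sender as missing.
    std::vector<std::uint32_t> gaps() const {
        return std::vector<std::uint32_t>(missing_.begin(), missing_.end());
    }

private:
    std::uint32_t next_expected_ = 0;
    std::set<std::uint32_t> missing_;
};
```

The receiver would send the result of gaps() back to the sender periodically, and the sender would retransmit those packets from its own history.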

What you describe is what would happen if you are using TCP without a protocol on top of it to structure your transmitted data. Your idea of using an ID header and packet length is one such protocol. If you send a 4-byte ID followed by a 4-byte length followed by X number of bytes, then the receiver knows that it has to read 4 bytes followed by 4 bytes followed by X bytes to receive a complete packet. It doesn't get much simpler than that. The fact that you are still having problems reading packets with such a simple protocol suggests that your underlying socket reading code is flawed to begin with. Without seeing your actual code, it is difficult to tell you what you are doing wrong.

Receiving data in packets on TCP client

Does a recv() call receive data in packets, or can I get data packets with timestamps?

On a datagram socket (like UDP), recv gets data in datagrams. TCP is a stream-mode socket, however, and recv gets a collection of bytes with no regard for packets.
It's possible, using low-level APIs, to get the packets, but if you were hoping to see boundaries between send calls you are out of luck... that information is not present in the packets.

Recv gets data from a socket that has been successfully received. It does not tell you when that happened; i.e. no timestamp.
Would you elaborate on what problem you're trying to solve ("why do you need this?") instead of your attempted solution? (Or have I completely misunderstood your question?)

If your own code is sending the data to the remote machine where you are receiving it, then you can define your own application-level data format, such as sending a timestamp (some specified number of bytes) before the data.
This information can be extracted at the receiving end. However, as mentioned, the connection is TCP, so the data arrives as a stream and not as complete packets as in the case of UDP.

C++ UDP sockets packet queuing

I am using the same UDP socket for sending and receiving data. I am wondering if packet queuing for DGRAM sockets is already present, or do we have to handle it separately.
If the user code has to handle queueing, how is it done? Do we have separate threads to recvfrom for the socket and put the packet in the reciver_queue and to sendto from another sending_queue?
An example code will be absolutely awesome. Thanks for your help.

There is a packet queue. However when the packet queue is filled then UDP packets start getting discarded. When they are discarded they are lost forever so make sure you keep reading data!

As Goz has noted, there is a packet queue. There is more than one, actually, at various places of the whole pipeline that ends in your application. There are usually some buffers on the NIC, then there are some managed by the kernel. The kernel buffers often can be sized for individual sockets using setsockopt().
As Goz has already noted, UDP packets can be lost on their way to you, or they can arrive in a different order. If you need both reliability and ordering and you cannot use TCP instead, you will have to implement some kind of protocol that provides both atop UDP, e.g. a sliding window protocol.

With UDP there's actually only the receive socket buffer. While there is an SO_SNDBUF socket option, the value supplied is just the upper limit for the datagram size. The outbound datagram is either given to the hardware whole, or in fragments (if it's bigger than the MTU), or discarded. The hardware usually has some ring buffers, but those really have to do with DMA and are of no concern to userland apps.
The most straightforward technique for packet queueing in the application is, again, a circular buffer - make it large enough for normal usage, lose some packets during heavy spikes. Surely there are other approaches.
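A circular buffer of packets that drops new arrivals when full might look like the sketch below. The class name PacketRing is hypothetical, and the sketch is single-threaded: sharing it between a receiving thread and a processing thread would need a mutex or a lock-free design on top.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Fixed-capacity circular queue of packets. When the ring is full, new
// packets are dropped, mirroring what happens during heavy traffic spikes.
class PacketRing {
public:
    using Packet = std::vector<unsigned char>;

    explicit PacketRing(std::size_t capacity) : slots_(capacity) {}

    // Producer side: returns false (packet dropped) when the ring is full.
    bool push(Packet p) {
        if (count_ == slots_.size()) return false;
        slots_[(head_ + count_) % slots_.size()] = std::move(p);
        ++count_;
        return true;
    }

    // Consumer side: returns false when the ring is empty.
    bool pop(Packet& out) {
        if (count_ == 0) return false;
        out = std::move(slots_[head_]);
        head_ = (head_ + 1) % slots_.size();
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::vector<Packet> slots_;
    std::size_t head_ = 0;    // index of the oldest queued packet
    std::size_t count_ = 0;   // number of packets currently queued
};
```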

What should I know about UDP programming?

I don't mean how to connect to a socket. What should I know about UDP programming?
Do I need to worry about bad data in my socket?
I should assume if I send 200bytes I may get 120 and 60 bytes separately?
Should I worry about another connection sending me bad data on the same port?
If data doesn't arrive, how long may I typically not see data for (250ms? 1 second? 1.75sec?)
What do I really need to know?

"i should assume if i send 200bytes i may get 120 and 60bytes separately?"
When you're sending UDP datagrams your read size will equal your write size. This is because UDP is a datagram protocol, vs TCP's stream protocol. However, you can only write data up to the size of the MTU before the packet could be fragmented or dropped by a router. For general internet use, the safe MTU is 576 bytes including headers.
"i should worry about another connection sending me bad data on the same port?"
You don't have a connection, you have a port. You will receive any data sent to that port, regardless of where it's from. It's up to you to determine if it's from the right address.
"If data doesnt arrive typically how long may i (typically) not see data for (250ms? 1 second? 1.75sec?)"
Data can be lost forever, data can be delayed, and data can arrive out of order. If any of those things bother you, use TCP. Writing a reliable protocol on top of UDP is a very non trivial task and there is no reason to do so for almost all applications.

"Should I worry about another connection sending me bad data on the same port?"
Yes, you should worry about it. Any application can send data to your open UDP port at any time. One of the big uses of UDP is many-to-one style communication, where you multiplex communications with several peers on a single port, using the address passed back by recvfrom to differentiate between peers.
However, if you want to avoid this and only accept packets from a single peer, you can actually call connect on your UDP socket. This causes the IP stack to reject packets coming from any host:port combo (socket) other than the one you want to talk to.
A second advantage of calling connect on your UDP socket is that in many OS's it gives a significant speed / latency improvement. When you call sendto on an unconnected UDP socket the OS actually temporarily connects the socket, sends your data and then disconnects the socket adding significant overhead.
A third advantage of using connected UDP sockets is it allows you to receive ICMP error messages back to your application, such as routing or host unknown due to a crash. If the UDP socket isn't connected the OS won't know where to deliver ICMP error messages from the network to and will silently discard them, potentially leading to your app hanging while waiting for a response from a crashed host ( or waiting for your select to time out ).
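Calling connect on a UDP socket, as described above, could look like this with POSIX sockets and IPv4. The function name connected_udp_socket is hypothetical, and the caller supplies the peer address and port.

```cpp
#include <arpa/inet.h>    // inet_pton, htons
#include <netinet/in.h>   // sockaddr_in
#include <sys/socket.h>   // socket, connect
#include <unistd.h>       // close
#include <cstdint>
#include <cstring>

// Creates a UDP socket "connected" to a single IPv4 peer. No handshake is
// performed; connect() only records the peer so the stack filters out other
// senders. Returns the socket descriptor, or -1 on failure.
int connected_udp_socket(const char* peer_ip, std::uint16_t peer_port) {
    sockaddr_in peer;
    std::memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_port = htons(peer_port);
    if (::inet_pton(AF_INET, peer_ip, &peer.sin_addr) != 1) return -1;

    int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    if (::connect(fd, reinterpret_cast<sockaddr*>(&peer), sizeof peer) != 0) {
        ::close(fd);
        return -1;
    }
    return fd;
}
```

Once the socket is connected, plain send() and recv() can be used in place of sendto() and recvfrom().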

Your packet may not get there.
Your packet may get there twice or even more often.
Your packets may not be in order.
You have a size limitation on your packets imposed by the underlying network layers. The packet size may be quite small (possibly 576 bytes).
None of this says "don't use UDP". However you should be aware of all the above and think about what recovery options you may want to take.

Fragmentation and reassembly happens at the IP level, so you need not worry about that (Wikipedia). (This means that you won't receive split or truncated packets).
UDP packets have a checksum for the data and the header, so receiving bogus data is unlikely, but possible. Lost or duplicate packets are also possible. You should check your data in any case anyway.
There's no congestion control, so you may wish to consider that, if you plan on clogging the tubes with a lot of UDP packets.

UDP is a connectionless protocol. Data sent over UDP can reach the receiver, but it can also get lost during transmission. UDP is well suited to things like broadcasting and streaming audio or video (i.e. situations where an occasional dropped packet is tolerable). So if you need to ensure your data gets to the other side, stick with TCP.
UDP has less overhead than TCP and is therefore faster. (TCP needs to establish a connection first and also acknowledges and retransmits data, which takes time.)
Fragmented UDP packets (i.e. packets bigger than about half a KB) will probably be dropped by routers, so split your data into small chunks before sending it over. (In some cases, the OS can take care of that.) Note that a packet either arrives whole or not at all; partial packets are never delivered.
Latency over long distances can be quite big. If you want to do retransmission of data, I would go with a timeout of something like 5 to 10 times the average latency over the current connection. (You can measure the latency by sending and receiving a few packets.)
Hope this helps.

I won't follow suit with the other people who answered this, they all seem to push you toward TCP, and that's not for gaming at all, except maybe for login/chat info. Let's go in order:
Do I need to worry about bad data in my socket?
Yes. Even though UDP contains an extremely simple checksum for routers and such, it is not 100% effective. You can add your own checksum, but most of the time UDP is used when reliability is already not an issue, so data that doesn't conform should just be dropped.
I should assume if I send 200bytes I may get 120 and 60 bytes separately?
No, UDP is direct data write and read. However, if the data is too large, some routers will truncate and you lose part of the data permanently. Some have said roughly 576 bytes with header, I personally wouldn't use more than 256 bytes (nice round log2 number).
Should I worry about another connection sending me bad data on the same port?
UDP listens for any data from any computer on a port, so in this sense yes. Also note that UDP is a primitive protocol and the sender address can be faked with raw packets, so you should use some sort of "key" that lets the listener verify the sender in addition to its IP.
If data doesnt arrive typically how long may I (typically) not see data for (250ms? 1 second? 1.75sec?)
Data sent over UDP is usually disposable, so if you don't receive data, it can simply be ignored. However, sometimes you want "semi-reliable" delivery without the 'ordered reliable' behavior of TCP; in that case, 1 second is a good estimate for treating a packet as dropped. You can number your packets on a rotation and write your own ACK communication: when a packet is received, the receiver records its number and sends back a bitfield letting the sender know which packets it received. You can read this unfinished document for more information (although unfinished, it still yields valuable info):
http://gafferongames.com/networking-for-game-programmers/
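One possible shape for the receiver's ACK bitfield is sketched below. The layout is an assumption made for illustration, not something the answer specifies: a 16-bit rotating sequence number, plus a 32-bit field in which bit n means "packet (latest - 1 - n) was received". The struct name AckState is hypothetical.

```cpp
#include <cstdint>

// Receiver-side record of which recent packets have arrived.
// Assumed layout: `latest` is the newest sequence number seen, and bit n of
// `bits` is set when packet (latest - 1 - n) has been received.
struct AckState {
    std::uint16_t latest = 0;
    std::uint32_t bits = 0;
    bool any = false;   // false until the first packet arrives

    void on_receive(std::uint16_t seq) {
        if (!any) { latest = seq; bits = 0; any = true; return; }
        // Distance from `latest` to `seq`, modulo 2^16, so rotation works.
        const std::uint16_t diff = static_cast<std::uint16_t>(seq - latest);
        if (diff == 0) return;                        // duplicate of the newest packet
        if (diff < 0x8000) {                          // seq is newer than latest
            bits = (diff >= 32) ? 0 : (bits << diff); // age the existing history
            if (diff <= 32) bits |= (1u << (diff - 1)); // remember the old `latest`
            latest = seq;
        } else {                                      // seq is older than latest
            const std::uint16_t back = static_cast<std::uint16_t>(latest - seq);
            if (back <= 32) bits |= (1u << (back - 1));
        }
    }
};
```

The receiver would send latest and bits back with its own outgoing packets, and the sender would treat any packet that stays unacknowledged past its timeout as dropped.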

The big thing to know when attempting to use UDP is:
Your packets might not all make it over the line, which means there is going to be possible data corruption.
If you're working on an application where 100% of the data needs to arrive reliably to provide functionality, use TCP. If you're working on an application where some loss is allowable (streaming media, etc.) then go for UDP, but don't expect everything to get from one end of the pipe to the other intact.

One way to look at the difference between applications appropriate for UDP vs. TCP is that TCP is good when data delivery is "better late than never", UDP is good when data delivery is "better never than late".
Another aspect is that the stateless, best-effort nature of most UDP-based applications can make scalability a bit easier to achieve. Also note that UDP can be multicast while TCP can't.

In addition to don.neufeld's recommendation to use TCP.
For most applications TCP is easier to implement. If you need to maintain packet boundaries in a TCP stream, a good way is to transmit a two byte header before the data to delimit the messages. The header should contain the message length. At the receiving end just read two bytes and evaluate the value. Then just wait until you have received that many bytes. You then have a complete message and are ready to receive the next 2-byte header.
This gives you some of the benefit of UDP without the hassle of lost data, out-of-order packet arrival etc.
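The sending side of the two-byte header scheme can be sketched as follows. The answer does not say which byte order to use, so big-endian is an assumption here, and the function names frame_message and frame_length are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Prefixes `payload` with a 2-byte big-endian length, so the receiver can
// find the message boundary. A 2-byte field limits payloads to 65535 bytes.
std::vector<unsigned char> frame_message(const std::string& payload) {
    if (payload.size() > 0xFFFF) {
        throw std::length_error("payload does not fit a 2-byte length field");
    }
    std::vector<unsigned char> out;
    out.reserve(payload.size() + 2);
    out.push_back(static_cast<unsigned char>(payload.size() >> 8));
    out.push_back(static_cast<unsigned char>(payload.size() & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Decodes the 2-byte header read by the receiver into the number of payload
// bytes that follow.
std::size_t frame_length(const unsigned char header[2]) {
    return (static_cast<std::size_t>(header[0]) << 8) | header[1];
}
```

The receiver reads two bytes, calls frame_length, and then keeps reading until that many payload bytes have arrived before looking for the next header.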

And don't assume that if you send a packet it got there.

If there is a packet size limitation imposed by some router along the way, your UDP packets could be silently truncated to that size.

Two things:
1) You may or may not receive what was sent
2) Whatever you receive may not be in the same order it was sent.
