How to receive more than 1440 bytes through a socket - C++

I wrote two simple programs, a server and a client, using sockets in C++ (Linux). Initially it was a sample client-server application (sending an echo message and receiving the answer). I then changed the client to perform an HTTP GET (I no longer use my sample server). It works, but whatever buffer size I set, the client receives only 1440 bytes. I want to receive the whole page into the buffer. I think this is related to how TCP works and that I need some kind of loop in my client's code to capture all the parts of the answer, but I don't know exactly what I should do.
This is my code:
...
int bytesSent = send(sock, tmpCharArr, message.size()+1, 0);
// Wait for the answer. Receive it into the buffer defined.
int bytesRecieved = recv(sock, resultBuf, 2048*100, 0);
...
2048*100 is the buffer size, and I think this is more than enough for the relatively small web page used for testing. But as I mentioned, I receive only 1440 bytes.
What can I do with the recv() function call to capture all the reply "parts" when the server's response is larger than 1440 bytes?
Thanks in advance.

The amount of data returned by a single recv() call is determined by factors outside your control (routers, ADSL links, IP stacks, etc.). The standard way to transfer large volumes of data is to call recv() repeatedly.

HTTP works over TCP, and to understand the workings of TCP sockets you have to think of them as a stream rather than as packets.
For further clarity, read my earlier post: recombine split TCP packet with flash sockets
As to why you receive only 1440 (or so) bytes per call, you have to understand MTU and fragmentation. In short, the MTU (Maximum Transmission Unit) is the largest single packet a link can carry; the effective MTU of an entire path is the lowest MTU of all the routers involved. Fragmentation is the splitting up of a packet when you try to send a single packet larger than the MTU of that path.
For a better understanding of MTU and Fragmentation, read: http://www.miislita.com/internet-engineering/ip-packet-fragmentation-tutorial.pdf
Now, as for how to receive the entire page into the buffer, one alternative is to keep calling recv() and appending the data you get to a buffer until recv() returns zero (sketched in code below). This works because a web server will typically close the TCP connection after it sends you the response. However, this technique fails if the web server doesn't close the session (for example, if keep-alive is configured).
Therefore, the correct solution is to keep receiving until you have the complete HTTP header, parse it to determine the length of the entire HTTP response (Content-Length:), and then keep receiving until you have received exactly that many bytes.
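For example, a minimal sketch of the first approach (keep reading until recv() returns zero); recvAll is just an illustrative name and assumes sock is the already-connected socket from the question's code:

#include <string>
#include <sys/socket.h>

// Keep reading from `sock` until the server closes the connection.
// This works for servers that close the TCP connection once the whole
// response has been sent.
std::string recvAll(int sock)
{
    std::string response;
    char chunk[4096];
    for (;;) {
        ssize_t n = recv(sock, chunk, sizeof(chunk), 0);
        if (n > 0)
            response.append(chunk, n);   // append this piece of the stream
        else if (n == 0)
            break;                       // orderly close by the peer: response is complete
        else
            break;                       // error (check errno; EINTR may be worth retrying)
    }
    return response;
}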

Related

Increasing TCP Window Size

I have some doubts about increasing the TCP window size in an application. In my C++ application, we send data packets of around 1 KB from client to server over a blocking TCP/IP socket. Recently I came across the concept of TCP window size, so I tried increasing the value to 64 KB using setsockopt() for both SO_SNDBUF and SO_RCVBUF. After increasing this value, I see some performance improvement on a WAN connection but not on a LAN connection.
As I understand TCP window size:
The client sends data packets to the server. Once a full window of unacknowledged data is outstanding, it waits for the ACK of the first packet in the window. On a WAN connection the ACK is delayed because of the round-trip latency of around 100 ms, so increasing the TCP window size compensates for the ACK wait time and thereby improves performance.
I want to understand how the performance improves in my application.
In my application, even though the TCP window size (both send and receive buffers) is increased using setsockopt() at the socket level, we still keep the same packet size of 1 KB (i.e. the bytes we send from client to server in a single socket send). We also disabled the Nagle algorithm (the built-in option that consolidates small packets into a larger packet, thereby avoiding frequent socket calls).
My doubts are as follows:
1. Since I am using a blocking socket, for each 1 KB data packet send, it should block if the ACK doesn't come back from the server. So how does performance improve after increasing the TCP window size on the WAN connection alone? If I have misunderstood the concept of TCP window size, please correct me.
2. For sending 64 KB of data, I believe I still need to call the socket send function 64 times (since I am sending 1 KB per send through a blocking socket) even though I increased my TCP window size to 64 KB. Please confirm this.
3. What is the maximum TCP window size with window scaling enabled, per RFC 1323?
My English is not very good. If you can't understand any of the above, please let me know.
First of all, there is a big misconception evident from your question: that the TCP window size is what is controlled by SO_SNDBUF and SO_RCVBUF. This is not true.
What is the TCP window size?
In a nutshell, the TCP window size determines how much follow-up data (packets) your network stack is willing to put on the wire before receiving acknowledgement for the earliest packet that has not been acknowledged yet.
The TCP stack has to live with and account for the fact that once a packet has been determined to be lost or mangled during transmission, every packet sent, from that one onwards, has to be re-sent since packets may only be acknowledged in order by the receiver. Therefore, allowing too many unacknowledged packets to exist at the same time consumes the connection's bandwidth speculatively: there is no guarantee that the bandwidth used will actually produce anything useful.
On the other hand, not allowing multiple unacknowledged packets at the same time would simply kill the bandwidth of connections that have a high bandwidth-delay product. Therefore, the TCP stack has to strike a balance between using up bandwidth for no benefit and not driving the pipe aggressively enough (and thus allowing some of its capacity to go unused).
The TCP window size determines where this balance is struck.
What do SO_SNDBUF and SO_RCVBUF do?
They control the amount of buffer space that the network stack has reserved for servicing your socket. These buffers serve to accumulate outgoing data that the stack has not yet been able to put on the wire and data that has been received from the wire but not yet read by your application respectively.
If one of these buffers is full you won't be able to send or receive more data until some space is freed. Note that these buffers only affect how the network stack handles data on the "near" side of the network interface (before they have been sent or after they have arrived), while the TCP window affects how the stack manages data on the "far" side of the interface (i.e. on the wire).
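For reference, setting those buffer sizes looks roughly like this (a POSIX-style sketch; on Windows the value pointer is cast to const char*, and setBufferSizes is just an illustrative name):

#include <sys/socket.h>

// Reserve 64 KB of send and receive buffer space for this socket.
// Note: this sizes the per-socket buffers; it does not set the TCP window
// directly, although the receive buffer does cap the window the stack can advertise.
void setBufferSizes(int sock)
{
    int bufSize = 64 * 1024;
    setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bufSize, sizeof(bufSize));
    setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bufSize, sizeof(bufSize));
}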
Answers to your questions
1. No. If that were the case you would incur a round-trip delay for each packet sent, which would totally destroy the bandwidth of connections with high latency.
2. Yes, but that has nothing to do with either the TCP window size or with the size of the buffers allocated to that socket.
3. According to all the sources I have been able to find (example), scaling allows the window to reach a maximum size of 1 GB.
Since I am using a blocking socket, for each 1 KB data packet send, it should block if the ACK doesn't come from the server.
Wrong. Sending in TCP is asynchronous. send() just transfers the data to the socket send buffer and returns. It only blocks while the socket send buffer is full.
Then how does performance improve after increasing the TCP window size on the WAN connection alone?
Because you were wrong about it blocking until it got an ACK.
For sending 64 KB of data, I believe I still need to call the socket send function 64 times
Why? You could just call it once with the 64k data buffer.
(since I am sending 1 KB per send through a blocking socket)
Why? Or is this a repetition of your misconception under (1)?
even though I increased my TCP window size to 64 KB. Please confirm this.
No. You can send it all at once. No loop required.
What is the maximum TCP window size with window scaling enabled, per RFC 1323?
Much bigger than you will ever need.
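To illustrate the point above about sending the whole 64 KB at once: a single send() on a blocking socket will normally accept the entire buffer, but because send() is permitted to accept fewer bytes than requested, a robust sender loops over the remainder. A minimal sketch (sendAll is an illustrative name, not part of the asker's code):

#include <cstddef>
#include <sys/socket.h>

// Hand all `len` bytes of `data` to the TCP stack, looping over partial sends.
// Returns true on success, false on error.
bool sendAll(int sock, const char* data, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sock, data + sent, len - sent, 0);
        if (n <= 0)
            return false;   // error (or the connection went away)
        sent += n;          // send() may accept fewer bytes than requested
    }
    return true;            // everything is now queued in the socket send buffer
}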

Limitations on sending through UDP sockets

I have a big 1 GB file which I am trying to send to another node. After the sender sends about 200 packets (well before the complete file is sent), the code bails out with the error "sendto: no send space available". What can be the problem and how can I deal with it?
Apart from this, we need maximum throughput in this transfer. What send buffer size should we use to be efficient?
What is the maximum MTU we can use to transfer the file without fragmentation?
Thanks
Ritu
Thank you for the answers. Actually, our project requires us to use UDP, with some additional code to take care of lost packets.
Now I am able to send the complete file using blocking UDP sockets.
I am running the whole setup in an Emulab-like environment called DETER. I have set the link loss to 0, but some of my packets are still getting lost. What could be the reason for that? The loss persists even if I add a delay after sending every packet (on the assumption that the receiver drops packets when its buffer is full).
It's possible to use UDP for high speed data transfer, but you have to make sure not to send() the data out faster than your network card can pump it onto the wire. In practice that means either using blocking I/O, or blocking on select() and only sending the next packet when select() indicates that the socket is ready-for-write. (ideally you'd also not send the data faster than the receiving machine can receive it, but that's less of an issue these days since modern CPU speeds are generally much faster than modern network I/O speeds)
Once you have that logic working properly, the size of your send-buffer isn't terribly important. (i.e. your send buffer will never be large enough to hold a 1GB file anyway, so making sure your program doesn't overflow the send buffer is the key issue whether the send buffer is large or small) The size of the receive-buffer on the receiver is important though... best to make that as large as possible, so the receiving computer won't drop packets if the receiving process gets held off of the CPU by another program.
Regarding MTU, if you want to avoid packet fragmentation (and assuming your packets travel over Ethernet), then you shouldn't place more than 1472 bytes into each UDP packet (or 1452 bytes if you're using IPv6). (Calculated by subtracting the size of the necessary IP and UDP headers from Ethernet's 1500-byte frame size.)
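A minimal sketch of that approach, pushing a large buffer out as MTU-friendly datagrams over a blocking UDP socket (the function name and chunk size are illustrative; a real transfer would also add sequence numbers for the loss-recovery layer discussed above):

#include <algorithm>
#include <cstddef>
#include <sys/socket.h>

// Send `len` bytes of `data` as a sequence of MTU-friendly UDP datagrams.
// 1500 (Ethernet) - 20 (IPv4 header) - 8 (UDP header) = 1472 payload bytes.
// With a blocking socket, sendto() waits when the send buffer is full
// instead of failing with "no send space available".
void sendAsDatagrams(int sock, const sockaddr* dest, socklen_t destLen,
                     const char* data, size_t len)
{
    const size_t kChunk = 1472;
    for (size_t off = 0; off < len; off += kChunk) {
        size_t n = std::min(kChunk, len - off);
        sendto(sock, data + off, n, 0, dest, destLen);   // blocks rather than overflowing
    }
}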
I also agree with #jonfen: no UDP for high-speed file transfer.
UDP incurs less protocol overhead. However, at maximum transfer rates, transmission errors (such as packet loss) are inevitable, so you must incorporate a TCP-like error-correction scheme. The end result is performance lower than TCP's.

Using TCP, why do large blocks of data get transmitted with lower bandwidth than small blocks of data?

Using 2 PCs with Windows XP, a 64 KB TCP window size, connected with a crossover cable
Using Qt 4.5.3, QTcpServer and QTcpSocket
Sending 2000 messages of 40kB takes 2 seconds (40MB/s)
Sending 1 message of 80MB takes 80 seconds (1MB/s)
Does anyone have an explanation for this? I would expect the larger message to go faster, since the lower layers can then fill the TCP packets more efficiently.
This is hard to comment on without seeing your code.
How are you timing this on the sending side? When do you know you're done?
How does the client read the data, does it read into fixed sized buffers and throw the data away or does it somehow know (from the framing) that the "message" is 80MB and try and build up the "message" into a single data buffer to pass up to the application layer?
It's unlikely to be the underlying Windows sockets code that's making this work poorly.
TCP, from the application side, is stream-based which means there are no packets, just a sequence of bytes. The kernel may collect multiple writes to the connection before sending it out and the receiving side may make any amount of the received data available to each "read" call.
TCP, on the IP side, is packets. Since standard Ethernet has an MTU (maximum transmission unit) of 1500 bytes and both TCP and IP have 20-byte headers, each packet transferred over Ethernet will carry 1460 bytes (or less) of the TCP stream to the other side. 40 KB or 80 MB writes from the application make no difference here.
How long it appears to take data to transfer will depend on how and where you measure it. Writing 40KB will likely return immediately since that amount of data will simply get dropped in TCP's "send window" inside the kernel. An 80MB write will block waiting for it all to get transferred (well, all but the last 64KB which will fit, pending, in the window).
TCP transfer speed is also affected by the receiver. It has a "receive window" that contains everything received from the peer but not fetched by the application. The amount of space available in this window is passed to the sender with every return ACK so if it's not being emptied quickly enough by the receiving application, the sender will eventually pause. WireShark may provide some insight here.
In the end, both methods should transfer in the same amount of time since an application can easily fill the outgoing window faster than TCP can transfer it no matter how that data is chunked.
I can't speak for the operation of QT, however.
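To illustrate the "when do you know you're done" point with plain sockets (the question uses QTcpSocket, but the idea is the same; drainExactly is a made-up helper): the receiver has to keep reading fixed-size chunks until the full byte count has arrived, and only then should the clock stop.

#include <cstddef>
#include <sys/socket.h>

// Read and discard exactly `expected` bytes from `sock` using a fixed-size buffer.
// Only when this returns true has the whole message really arrived, which is
// the point at which the receiver should stop its timer.
bool drainExactly(int sock, size_t expected)
{
    char buf[64 * 1024];
    size_t total = 0;
    while (total < expected) {
        ssize_t n = recv(sock, buf, sizeof(buf), 0);
        if (n <= 0)
            return false;   // connection closed or error before everything arrived
        total += n;         // recv() hands over whatever slice of the stream is ready
    }
    return true;
}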
Bug in Qt 4.5.3

What should I know about UDP programming?

I don't mean how to connect to a socket. What should I know about UDP programming?
Do I need to worry about bad data in my socket?
Should I assume that if I send 200 bytes, I may get 120 and 60 bytes separately?
Should I worry about another connection sending me bad data on the same port?
If data doesn't arrive, how long might I typically not see data for (250 ms? 1 second? 1.75 s?)
What do I really need to know?
"i should assume if i send 200bytes i
may get 120 and 60bytes separately?"
When you're sending UDP datagrams your read size will equal your write size. This is because UDP is a datagram protocol, vs TCP's stream protocol. However, you can only write data up to the size of the MTU before the packet could be fragmented or dropped by a router. For general internet use, the safe MTU is 576 bytes including headers.
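To make the datagram-boundary point concrete, here is a small sketch (the function names, the 200-byte payload and the 1500-byte buffer are just for illustration):

#include <sys/socket.h>

// Sender side: one 200-byte datagram goes out as a single unit.
void sendOne(int sock, const sockaddr* dest, socklen_t destLen)
{
    char msg[200] = {};                          // 200-byte payload, purely as an example
    sendto(sock, msg, sizeof(msg), 0, dest, destLen);
}

// Receiver side: a single recvfrom() returns the whole 200-byte datagram,
// never 120 bytes now and 60 bytes later as could happen with a TCP stream.
// If the buffer were smaller than the datagram, the excess would be discarded,
// which is another reason to keep datagrams small.
ssize_t recvOne(int sock)
{
    char buf[1500];                              // larger than any datagram we expect
    sockaddr_storage from;
    socklen_t fromLen = sizeof(from);
    return recvfrom(sock, buf, sizeof(buf), 0,
                    reinterpret_cast<sockaddr*>(&from), &fromLen);
}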
"i should worry about another
connection sending me bad data on the
same port?"
You don't have a connection, you have a port. You will receive any data sent to that port, regardless of where it's from. It's up to you to determine if it's from the right address.
If data doesn't arrive, how long might I typically not see data for (250 ms? 1 second? 1.75 s?)
Data can be lost forever, data can be delayed, and data can arrive out of order. If any of those things bothers you, use TCP. Writing a reliable protocol on top of UDP is a very non-trivial task, and for almost all applications there is no reason to do so.
Should I worry about another connection sending me bad data on the same port?
Yes, you should worry about it. Any application can send data to your open UDP port at any time. One of the big uses of UDP is many-to-one communication, where you multiplex communication with several peers on a single port, using the address passed back by recvfrom() to differentiate between peers.
However, if you want to avoid this and only accept packets from a single peer, you can actually call connect() on your UDP socket. This causes the IP stack to reject packets coming from any host:port combination (socket) other than the one you want to talk to.
A second advantage of calling connect() on your UDP socket is that on many OSes it gives a significant speed/latency improvement. When you call sendto() on an unconnected UDP socket, the OS temporarily connects the socket, sends your data and then disconnects the socket, adding significant overhead.
A third advantage of connected UDP sockets is that they allow your application to receive ICMP error messages, such as routing failures or "host unreachable" when the peer has crashed. If the UDP socket isn't connected, the OS won't know which socket to deliver ICMP errors from the network to and will silently discard them, potentially leaving your app hanging while waiting for a response from a crashed host (or waiting for your select() to time out).
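A sketch of what that looks like in code (the address and port are placeholders, and error handling is omitted):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

// Create a UDP socket that only talks to one peer.
int makeConnectedUdpSocket()
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    sockaddr_in peer{};
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(5000);                      // placeholder port
    inet_pton(AF_INET, "192.0.2.10", &peer.sin_addr);   // placeholder address

    // After connect(), only datagrams from 192.0.2.10:5000 reach this socket,
    // plain send()/recv() can be used instead of sendto()/recvfrom(), and ICMP
    // errors (e.g. port unreachable) come back as errors on the socket.
    connect(sock, reinterpret_cast<sockaddr*>(&peer), sizeof(peer));
    return sock;
}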
Your packet may not get there.
Your packet may get there twice or even more often.
Your packets may not be in order.
You have a size limitation on your packets imposed by the underlying network layers. The packet size may be quite small (possibly 576 bytes).
None of this says "don't use UDP". However you should be aware of all the above and think about what recovery options you may want to take.
Fragmentation and reassembly happen at the IP level, so you need not worry about that (Wikipedia). (This means that you won't receive split or truncated packets.)
UDP packets have a checksum covering the data and the header, so receiving bogus data is unlikely, but possible. Lost or duplicate packets are also possible. You should validate your data in any case.
There's no congestion control, so you may wish to consider that, if you plan on clogging the tubes with a lot of UDP packets.
UDP is a connectionless protocol. Data sent over UDP may reach the receiver, but it can also be lost in transit. UDP is ideal for things like broadcasting and streaming audio or video (i.e. situations where an occasional dropped packet is not a problem). So if you need to ensure your data gets to the other side, stick with TCP.
UDP has less overhead than TCP and is therefore faster. (TCP needs to establish a connection first and also checks data packets for corruption, which takes time.)
Fragmented UDP packets (i.e. packets bigger than about half a kilobyte) will probably be dropped by routers, so split your data into small chunks before sending it. (In some cases, the OS can take care of that.) Note that it is always a whole packet that either makes it or doesn't; half packets aren't delivered.
Latency over long distances can be quite large. If you want to do retransmission of data, I would go with something like 5 to 10 times the average latency of the current connection. (You can measure the latency by sending and receiving a few packets.)
Hope this helps.
I won't follow suit with the other people who answered this; they all seem to push you toward TCP, and that's not suitable for gaming at all, except maybe for login/chat info. Let's go in order:
Do I need to worry about bad data in my socket?
Yes. Even though UDP carries a simple checksum over the header and data, it is not 100% reliable. You can add your own checksum mechanism, but most of the time UDP is used when reliability is already not an issue, so data that doesn't check out should just be dropped.
Should I assume that if I send 200 bytes, I may get 120 and 60 bytes separately?
No, UDP writes and reads are whole datagrams. However, if the datagram is too large it may be fragmented or dropped along the way, and you can lose the data permanently. Some have said roughly 576 bytes including headers; I personally wouldn't use more than 256 bytes (a nice round power of two).
Should I worry about another connection sending me bad data on the same port?
UDP listens for data from any computer on a port, so in that sense, yes. Also note that UDP is primitive, and raw sockets can be used to spoof the sender, so you should use some sort of "key" so the listener can verify the sender against their IP.
If data doesn't arrive, how long might I typically not see data for (250 ms? 1 second? 1.75 s?)
Data sent over UDP is usually disposable, so if you don't receive it, it can simply be ignored. However, sometimes you want "semi-reliable" delivery without the fully ordered reliability TCP provides; in that case 1 second is a reasonable estimate for treating a packet as dropped. You can number your packets with a rotating sequence counter and write your own ACK scheme: when a packet is received, the receiver records its number and sends back a bitfield letting the sender know which packets it has received (a possible header layout is sketched after the link below). You can read this unfinished document for more information (although unfinished, it still contains valuable information):
http://gafferongames.com/networking-for-game-programmers/
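A rough sketch of the kind of per-packet header that scheme uses (the field names and the 32-packet bitfield are one possible layout loosely following the linked article, not a fixed format):

#include <cstdint>

// Prepended to every UDP payload. The receiver records incoming sequence numbers
// and echoes back the most recent one it has seen plus a bitfield covering the
// 32 packets before it, so the sender learns which of its recent packets arrived
// without needing a separate ACK for each one.
struct PacketHeader {
    uint32_t sequence;   // this packet's sequence number (wraps around)
    uint32_t ack;        // most recent sequence number received from the peer
    uint32_t ackBits;    // bit i set => packet (ack - 1 - i) was also received
};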
The big thing to know when attempting to use UDP is:
Your packets might not all make it over the line, which means some of your data may simply never arrive.
If you're working on an application where 100% of the data needs to arrive reliably for it to function, use TCP. If you're working on an application where some loss is acceptable (streaming media, etc.), then go for UDP, but don't expect everything to get from one end of the pipe to the other intact.
One way to look at the difference between applications appropriate for UDP vs. TCP is that TCP is good when data delivery is "better late than never", UDP is good when data delivery is "better never than late".
Another aspect is that the stateless, best-effort nature of most UDP-based applications can make scalability a bit easier to achieve. Also note that UDP can be multicast while TCP can't.
In addition to don.neufeld's recommendation to use TCP.
For most applications TCP is easier to implement. If you need to maintain message boundaries in a TCP stream, a good way is to transmit a two-byte header before the data to delimit the messages. The header should contain the message length. At the receiving end, just read two bytes and evaluate the value, then wait until you have received that many bytes. You then have a complete message and are ready to receive the next two-byte header.
This gives you some of the benefits of UDP without the hassle of lost data, out-of-order packet arrival, etc.
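A sketch of that framing with plain sockets (recvExact, sendMessage and recvMessage are illustrative names; a production version would also loop over partial send() results the same way recvExact loops over recv()):

#include <cstdint>
#include <string>
#include <arpa/inet.h>   // htons / ntohs
#include <sys/socket.h>

// Read exactly `len` bytes; a TCP read may return less than asked for, so loop.
static bool recvExact(int sock, char* buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sock, buf + got, len - got, 0);
        if (n <= 0)
            return false;
        got += n;
    }
    return true;
}

// Sender side: 2-byte big-endian length header, then the message body.
bool sendMessage(int sock, const std::string& msg)
{
    uint16_t len = htons(static_cast<uint16_t>(msg.size()));
    return send(sock, &len, sizeof(len), 0) == static_cast<ssize_t>(sizeof(len)) &&
           send(sock, msg.data(), msg.size(), 0) == static_cast<ssize_t>(msg.size());
}

// Receiver side: read the 2-byte header, then wait for exactly that many bytes.
bool recvMessage(int sock, std::string& msg)
{
    uint16_t len = 0;
    if (!recvExact(sock, reinterpret_cast<char*>(&len), sizeof(len)))
        return false;
    msg.resize(ntohs(len));
    return recvExact(sock, &msg[0], msg.size());
}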
And don't assume that if you send a packet it got there.
If there is a packet size limitation imposed by some router along the way, your UDP packets could be silently truncated to that size.
Two things:
1) You may or may not receive what was sent
2) Whatever you receive may not be in the same order it was sent.

Count the number of packets sent to a server from a client?

So I'm almost done with an assignment involving Win32 programming and sockets, but I have to generate and analyze some statistics about the transfers. The only part I'm having trouble with is figuring out the number of packets that were sent to the server from the client.
The data sent can be variable-length, so I can't just divide the total bytes received by a #define'd value.
We have to use asynchronous calls for everything, so I've been trying to increment a counter for every FD_READ message I get for the server's socket. However, because I have to be able to accept a potentially large file, I have to call recv/recvfrom with a buffer size of around 64 KB. If I send a small packet (a-z), there are no problems, but if I send a string of 1024 characters 10 times, the server reports 2 or 3 packets received, yet 0% data loss in terms of bytes sent/received.
Any idea how to get the number of packets?
Thanks in advance :)
This really boils down to what you mean by 'packet.'
As you are probably aware, when a TCP/UDP message is sent on the wire, the data being sent is 'wrapped,' or prepended, with a corresponding TCP/UDP header. This is then 'wrapped' in an IP header, which is in turn 'wrapped' in an Ethernet frame. You can see this breakout if you use a sniffing package like Wireshark.
The point is this. When I hear the term 'packet,' I think of data at the IP level. IP data is truly packetized on the wire, so packet counts make sense when talking about IP. However, if you're using regular sockets to send and receive your data, the IP headers, as well as the TCP/UDP headers, are stripped off, i.e., you don't get this information from the socket. And without that information, it is impossible to determine the number of 'packets' (again, I'm thinking IP) that were transmitted.
You could do what others are suggesting by adding your own header with a length and a counter. This information will help you accurately size your receive buffers, but it won't help you determine the number of packets (again, IP...), especially if you're doing TCP.
If you want to accurately determine the number of packets using Winsock sockets, I would suggest creating a 'raw' socket as suggested here. This socket will collect all IP traffic seen by your local NIC. Use the IP and TCP/UDP headers to filter the data based on your client and server sockets, i.e., IP addresses and port numbers. This will give an accurate picture of how many IP packets were actually used to transmit your data.
Not a direct answer to your question but rather a suggestion for a different solution.
What if you send a length-descriptor in front of the data you want to transfer? That way you can already allocate the correct buffer size (not too much, not too little) on the client and also check if there were any losses when the transfer is over.
With TCP you should have no problem at all because the protocol itself handles the error-free transmission or otherwise you should get a meaningful error.
Maybe with UDP you could also split up your transfer into fixed-size chunks with a proper sequence ID. You'd have to accumulate all incoming packets before you sort them (UDP makes no guarantee about receive order) and paste the data back together.
On the other hand, you should think about whether it is really necessary to support UDP, as there is quite a bit of manual work involved in making that protocol error-safe (see the Wikipedia article on TCP for a list of the problems to work around).
Do your packets have a fixed header, or are you allowed to define your own? If you can define your own, include a packet counter in the header along with the length. You'll have to keep a running total that accounts for rollover in your counter, but this will ensure you're counting packets sent rather than packets received. For a simple assignment you probably won't encounter loss (with UDP, obviously), but if you did, a packet counter would make sure your statistics accurately reflected what was sent.
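For instance, a header along these lines (just one possible layout for the assignment, not a prescribed format) carries both the counter and the length:

#include <cstdint>

// Sent in front of every message. The receiver counts messages by tracking
// `counter` (allowing for wrap-around) and sizes its reads using `length`,
// so the "packets sent" statistic reflects what the client transmitted even
// if some datagrams never arrive.
struct MessageHeader {
    uint32_t counter;   // increments by one for every message the client sends
    uint32_t length;    // number of payload bytes that follow this header
};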