recv reads incomplete packet - c++

I have a simple function that is responsible for receiving packets via a socket.
if((recv_size = recv(sock , rx , 50000 ,0)) == SOCKET_ERROR)
{
...
} else
{
...
}
I found that sometimes I receive an incomplete packet. Why? Maybe I should call recv several times? The packet length never exceeds 50000 bytes.
I use a TCP socket.

If you're using TCP, this is expected. TCP is a streaming protocol: it doesn't have "packets" or message boundaries, so a single recv call may return all of a "message", part of it, or even multiple messages. So you might have to call recv multiple times to receive a complete message.
However, since TCP doesn't have message boundaries, you have to implement them yourself on top of TCP, for example by sending the length of the message in a fixed-size header, or have some special end-of-message marker.
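A minimal sketch of the fixed-size length header approach, assuming POSIX sockets (Winsock's recv behaves the same way, but uses SOCKET handles and SOCKET_ERROR). The names recv_exact and recv_message and the 4-byte big-endian header are assumptions made for illustration, not something from the question. EINTR handling is left out for brevity.

```c
#include <arpa/inet.h>   /* ntohl */
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling recv() until exactly len bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;            /* n < 0: error, n == 0: connection closed */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical framing: 4-byte big-endian length, then the payload.
 * Returns the payload length, or -1 on error or if it exceeds cap. */
long recv_message(int fd, char *out, size_t cap)
{
    uint32_t be_len;
    if (recv_exact(fd, &be_len, sizeof be_len) != 0)
        return -1;
    uint32_t len = ntohl(be_len);
    if (len > cap)
        return -1;                /* message would not fit in the buffer */
    if (recv_exact(fd, out, len) != 0)
        return -1;
    return (long)len;
}
```

The sender has to write the same header (htonl of the payload length) before each payload for this to work.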

Related

How to properly recv with winsock?

I'm writing a simple http server for a test and I'm rather confused as to how one is supposed to tell where the end of a request is.
recv() returns a negative number on error, 0 on connection close and a positive number receiving data, when there is no more data it just blocks.
I could create some Frankenstein that continuously calls recv on one thread and checks whether it blocked on another thread, but there has got to be a better way to do this... How can I tell if there are no more bytes to read for the time being without blocking?
First of all, you should follow the HTTP protocol when reading the HTTP request:
1. Continue reading from the socket until \r\n\r\n is received.
2. Parse the header.
3. If Content-Length is specified, additionally read that many bytes of the request payload.
4. Process the HTTP request.
5. Send the HTTP response.
6. Close the socket (HTTP/1.0) or (HTTP/1.1) handle keep-alive, content-encoding, transfer-encoding, trailers, etc., potentially repeating from step 1.
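The first three steps can be sketched as two small buffer helpers. This is only an illustration: header_end and content_length are hypothetical names, the code is plain C with POSIX strncasecmp (on Windows, _strnicmp would play that role), and it ignores chunked transfer encoding entirely.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Returns the offset just past "\r\n\r\n", or 0 if the header is incomplete. */
size_t header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    return 0;
}

/* Scans a complete header block for Content-Length.
 * Returns its value, or -1 if the field is absent or malformed. */
long content_length(const char *hdr, size_t hdr_len)
{
    static const char name[] = "Content-Length:";
    const size_t name_len = sizeof name - 1;
    size_t line = 0;
    while (line < hdr_len) {
        const char *eol = memchr(hdr + line, '\n', hdr_len - line);
        size_t next = eol ? (size_t)(eol - hdr) + 1 : hdr_len;
        if (next - line > name_len &&
            strncasecmp(hdr + line, name, name_len) == 0) {
            const char *p = hdr + line + name_len, *end = hdr + next;
            while (p < end && (*p == ' ' || *p == '\t'))
                p++;                              /* skip optional whitespace */
            if (p == end || !isdigit((unsigned char)*p))
                return -1;
            long v = 0;
            while (p < end && isdigit((unsigned char)*p)) {
                if (v > (LONG_MAX - 9) / 10)
                    return -1;                    /* would overflow */
                v = v * 10 + (*p++ - '0');
            }
            return v;
        }
        line = next;
    }
    return -1;
}
```

A receive loop would append each recv result to a buffer, call header_end after every append, and once it returns non-zero read content_length more bytes of body (some of which may already be in the buffer).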
To deal with potentially misbehaving clients, when using blocking sockets it is customary to set a socket timeout prior to issuing recv or send calls.
DWORD recvTimeoutMs = 20000;
// SO_RCVTIMEO controls how long recv may block
setsockopt(socket, SOL_SOCKET, SO_RCVTIMEO, (const char *)&recvTimeoutMs, sizeof(recvTimeoutMs));
DWORD sendTimeoutMs = 30000;
// SO_SNDTIMEO controls how long send may block
setsockopt(socket, SOL_SOCKET, SO_SNDTIMEO, (const char *)&sendTimeoutMs, sizeof(sendTimeoutMs));
When a recv or send times out, it will fail with WSAGetLastError giving WSAETIMEDOUT (10060).

boost asio detecting / avoiding reception buffer overflow

Consider a client sending data to a server using TCP, with boost::asio, in "synchronous mode" (aka "blocking" functions).
Client code (skipped the part about query and io_service):
tcp::resolver::iterator endpoint_iterator = resolver.resolve(query);
tcp::socket socket( io_service );
boost::asio::connect( socket, endpoint_iterator );
std::array<char, 1000> buf = { /* some data */ };
size_t n = socket.send( boost::asio::buffer(buf) );
This will send the whole buffer (1000 bytes) to the connected machine.
Now the server code:
tcp::acceptor acceptor( io_service, tcp::endpoint( tcp::v4(), port ) );
tcp::socket socket( io_service );
acceptor.accept( socket ); // wait for the client to connect before reading
boost::system::error_code err;
std::array<char, 500> buff;
size_t n = socket.read_some( boost::asio::buffer(buff), err );
std::cout << "err=" << err.message() << '\n';
What this does: the client sends 1000 bytes through the connection, and the server attempts to store them in a 500-byte buffer.
What I expected: a server error status saying that the buffer is too small and/or too much data was received.
What I get: A "Success" error value, and n=1000 in the server.
What did I miss here? Can't ASIO detect the buffer overflow?
Should I proceed using some other classes/functions (streams, maybe)?
Refs (for 1.54, which is the one I use):
buffer function
TCP socket read_some()
TCP socket send()
You're seriously misunderstanding TCP.
TCP is a byte stream. There's no packet boundary inside a TCP stream. Until you close the socket, all bytes form a single stream. (unlike UDP)
Boost.Asio knows this. As long as the stream is open, it can't say how big the stream will eventually be. If you've got a 500 byte buffer, Boost Asio can fill it with the first 500 bytes of the (potentially unbounded) TCP stream.
However, read_some just looks at what's already available. In your case, with just 1000 bytes, it's entirely expected that the whole 1000 bytes have already arrived at your machine. There's no error in that part. It just doesn't fit in your buffer, but that's not a problem on the network side.
Neither TCP nor UDP has a way to communicate back that the receiver was expecting a smaller packet. That's application-level logic, and you handle it on the application level. For instance, HTTP has 413 Payload Too Large. Therefore, Boost.Asio doesn't offer a standard mechanism.
You did receive 500 bytes and may read the last 500 bytes by calling read_some again. Just saying this as it seems to me that you misunderstood the behaviour of asio.

What about partial recv() on two byte header containing message length?

I have been reading some socket guides such as Beej's Guide to Network Programming. It is quite clear now that there is no guarantee on how many bytes are received in a single recv() call. Therefore a mechanism is needed, e.g. the first two bytes state the message length and the message follows. So the receiver receives the first two bytes and then receives in a loop until the whole message has been received. All fine and dandy!?
I was asked by a colleague about messages going out of sync. E.g. what if, somehow, I receive two bytes in one recv() call that are actually in the middle of the message itself and they appear to be an integer of some value? Does that mean that the rest of the data sent will be out of sync? And what about receiving the header partially, i.e. one byte at a time?
Maybe this is overthinking, but I can't find this mentioned anywhere and I just want to be sure that I would handle this if it could be a possible threat to the integrity of the communication.
Thanks.
It is not overthinking. TCP presents a stream so you should treat it this way. A lot of problems concerning TCP are due to network issues and will probably not happen during development.
Start a message with a (4-byte) magic value that you can look for, followed by a (4-byte) length in an agreed byte order (normally big endian). When receiving, read the header one byte at a time, so you can handle it whichever way the bytes were received. Based on that you can accept messages over a long-lived TCP connection.
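As a rough sketch of that header handling, here is a small parser that is fed one byte at a time. The magic value "MSG1", the 8-byte layout and the names hdr_parser and hdr_feed are all assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_SIZE 8   /* 4-byte magic + 4-byte big-endian length */
static const unsigned char MAGIC[4] = { 'M', 'S', 'G', '1' };  /* hypothetical */

struct hdr_parser {
    unsigned char bytes[HDR_SIZE];
    size_t have;     /* number of header bytes collected so far */
};

/* Feed one received byte.
 * Returns 1 when a complete header is in place (payload length in *len),
 * 0 if more bytes are needed, -1 if the magic does not match. */
int hdr_feed(struct hdr_parser *p, unsigned char b, uint32_t *len)
{
    if (p->have < 4 && b != MAGIC[p->have]) {
        p->have = 0;             /* out of sync: start over */
        return -1;
    }
    p->bytes[p->have++] = b;
    if (p->have < HDR_SIZE)
        return 0;
    *len = ((uint32_t)p->bytes[4] << 24) | ((uint32_t)p->bytes[5] << 16) |
           ((uint32_t)p->bytes[6] << 8)  |  (uint32_t)p->bytes[7];
    p->have = 0;                 /* ready for the next header */
    return 1;
}
```

Because the parser keeps its own state, it does not matter whether the eight header bytes arrive in one recv() call or in eight.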
Mind you that when starting a new connection per message, you know the starting point. However, it doesn't hurt sending a magic either, if only to filter out some invalid messages.
A checksum is not necessary because TCP presents a reliable stream of bytes that has already been checked by the receiving side of TCP, and resyncing will only be needed if there was a coding issue with sending/receiving.
On the other hand, UDP sends packets, so you know what to expect, but then the delivery and order is not guaranteed.
Your colleague is mistaken. TCP data cannot arrive out of order. However you should investigate the MSG_WAITALL flag to recv() to overcome the possibility of the two length bytes arriving separately, and to eliminate the need for a loop when receiving the message body.
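For illustration, reading the two-byte length with MSG_WAITALL could look like the sketch below. It assumes POSIX sockets and a big-endian length (the question does not state the byte order), and recv_len16 is a made-up name. Note that even with MSG_WAITALL, recv() can return fewer bytes if a signal is caught, an error occurs or the peer disconnects, so the return value still has to be checked.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Reads a two-byte big-endian length prefix in one blocking call.
 * Returns 0 on success, -1 on error, EOF or a short read. */
int recv_len16(int fd, uint16_t *len)
{
    unsigned char hdr[2];
    ssize_t n = recv(fd, hdr, sizeof hdr, MSG_WAITALL);
    if (n != (ssize_t)sizeof hdr)
        return -1;
    *len = (uint16_t)((hdr[0] << 8) | hdr[1]);
    return 0;
}
```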
It's your responsibility to keep your client and server in sync. However, TCP never delivers data out of order: if you got something by calling recv(), you can be sure there is nothing before it that you haven't received.
So the question is how to synchronize sender and receiver. It's easy: as stefaanv said, the sender and receiver both know their starting point, so you can define a protocol for your network communication. For example, a protocol could be defined this way:
A 4-byte header containing the message type and payload length
The rest of the message is the payload, whose length is given in the header
With this, you have to send the 4-byte header before sending the actual payload.
Because TCP guarantees reliable, in-order delivery, you can make two recv() calls for each packet: one recv() call with a length of 4 bytes to get the next payload size, and another recv() call with the size specified in the header. Both recv() calls must be blocking to stay synchronized all the time.
An example would be like this:
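#define MAX_BUF_SIZE 1024 // something you know
char buf[MAX_BUF_SIZE];
// Peek first to check whether the full 4-byte header has arrived.
int recvLen = recv(fd, buf, 4, MSG_PEEK);
if (recvLen == 4) {
    recvLen = recv(fd, buf, 4, 0); // consume the header
    if (recvLen != 4) {
        // fatal error
    }
    int payloadLen = extractPayloadLenFromHeader(buf);
    if (payloadLen < 0 || payloadLen > MAX_BUF_SIZE) {
        // fatal error: payload does not fit in buf
    }
    // Peek again to check whether the whole payload has arrived.
    recvLen = recv(fd, buf, payloadLen, MSG_PEEK);
    if (recvLen == payloadLen) {
        recvLen = recv(fd, buf, payloadLen, 0); // actual recv
        if (recvLen != payloadLen) {
            // fatal error
        }
        // do something with received payload
    }
}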
As you can see, I first call recv with the MSG_PEEK flag to check whether 4 bytes are really available, and only then receive the actual header. The same goes for the payload.

TCP Socket - read most recent data from input queue [duplicate]

I've been reading through Beej's Guide to Network Programming to get a handle on TCP connections. In one of the samples the client code for a simple TCP stream client looks like:
if ((numbytes = recv(sockfd, buf, MAXDATASIZE-1, 0)) == -1) {
perror("recv");
exit(1);
}
buf[numbytes] = '\0';
printf("Client: received '%s'\n", buf);
close(sockfd);
I've set the buffer to be smaller than the total number of bytes that I'm sending. I'm not quite sure how I can get the other bytes. Do I have to loop over recv() until I receive '\0'?
*Note on the server side I'm also implementing his sendall() function, so it should actually be sending everything to the client.
See also 6.1. A Simple Stream Server in the guide.
Yes, you will need multiple recv() calls, until you have all data.
To know when that is, using the return status from recv() is no good - it only tells you how many bytes you have received, not how many bytes are available, as some may still be in transit.
It is better if the data you receive somehow encodes the length of the total data. Read as much data as needed until you know what the length is, then keep reading until you have received that many bytes. To do that, various approaches are possible; the common one is to allocate a buffer large enough to hold all the data once you know what the length is.
Another approach is to use fixed-size buffers, and always try to receive min(missing, bufsize), decreasing missing after each recv().
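The fixed-size buffer variant might look roughly like this, assuming POSIX sockets and that the total length is already known. The names recv_chunked and count_bytes, the 512-byte buffer and the callback interface are illustrative choices only.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

#define BUFSIZE 512   /* hypothetical fixed buffer size */

/* Receives exactly `missing` bytes through a fixed-size buffer and hands
 * each chunk to `sink`. Returns 0 on success, -1 on error or early close. */
int recv_chunked(int fd, size_t missing,
                 void (*sink)(const char *chunk, size_t n, void *ctx),
                 void *ctx)
{
    char buf[BUFSIZE];
    while (missing > 0) {
        /* never ask for more than is still missing, so the next
         * message on the stream is left untouched */
        size_t want = missing < BUFSIZE ? missing : BUFSIZE;
        ssize_t n = recv(fd, buf, want, 0);
        if (n <= 0)
            return -1;
        sink(buf, (size_t)n, ctx);
        missing -= (size_t)n;
    }
    return 0;
}

/* Example sink: just counts the bytes it is given. */
void count_bytes(const char *chunk, size_t n, void *ctx)
{
    (void)chunk;
    *(size_t *)ctx += n;
}
```

Capping each request at min(missing, bufsize) is what keeps the receiver from swallowing the start of the following message.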
The first thing you need to learn when doing TCP/IP programming: 1 write/send call might take several recv calls to receive, and several write/send calls might need just 1 recv call to receive. And anything in-between.
You'll need to loop until you have all data. The return value of recv() tells you how much data you received. If you simply want to receive all data on the TCP connection, you can loop until recv() returns 0 - provided that the other end closes the TCP connection when it is done sending.
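A sketch of the "loop until recv() returns 0" case, assuming POSIX sockets and a peer that closes (or shuts down) its side when it is done. The function name recv_until_close and the doubling growth strategy are arbitrary choices for the example.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Reads until the peer closes the connection.
 * Returns a malloc'd buffer (caller frees) and stores its size in *out_len,
 * or returns NULL on a recv or allocation error. */
char *recv_until_close(int fd, size_t *out_len)
{
    size_t cap = 4096, len = 0;
    char *data = malloc(cap);
    if (!data)
        return NULL;
    for (;;) {
        if (len == cap) {                 /* buffer full: grow it */
            char *bigger = realloc(data, cap * 2);
            if (!bigger) {
                free(data);
                return NULL;
            }
            data = bigger;
            cap *= 2;
        }
        ssize_t n = recv(fd, data + len, cap - len, 0);
        if (n < 0) {
            free(data);
            return NULL;
        }
        if (n == 0)                       /* orderly close: all data received */
            break;
        len += (size_t)n;
    }
    *out_len = len;
    return data;
}
```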
If you're sending records/lines/packets/commands or something similar, you need to make your own protocol over TCP, which might be as simple as "commands are delimited with \n".
The simplest way to read and parse such a command would be to read 1 byte at a time, building up a buffer with the received bytes and checking for a \n byte every time. However, reading 1 byte at a time is extremely inefficient, so you should read larger chunks.
Since TCP is stream oriented and does not provide record/message boundaries, that becomes a bit more tricky: you have to recv a chunk of bytes and check the received buffer for a \n byte. If it's there, append the bytes up to it to the previously received bytes and output that message. If it's not there, append the whole chunk and receive more. Then check the remainder of the buffer after the \n, which might contain another whole message or just the start of another message.
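The buffering logic for newline-delimited commands could be sketched as follows. It is independent of the socket calls: each chunk returned by recv() is appended, and complete lines are then pulled out one by one. The struct and function names and the 1024-byte limit are assumptions for the example.

```c
#include <stddef.h>
#include <string.h>

#define LINE_MAX_LEN 1024   /* hypothetical upper bound on buffered data */

struct line_buf {
    char data[LINE_MAX_LEN];
    size_t len;             /* bytes currently buffered */
};

/* Append freshly received bytes. Returns 0, or -1 if they do not fit. */
int lb_append(struct line_buf *lb, const char *chunk, size_t n)
{
    if (n > sizeof lb->data - lb->len)
        return -1;
    memcpy(lb->data + lb->len, chunk, n);
    lb->len += n;
    return 0;
}

/* Extract the next complete line (without the '\n') into out as a C string.
 * Returns 1 if a line was extracted, 0 if no complete line is buffered yet,
 * -1 if out is too small. Leftover bytes stay buffered for the next call. */
int lb_next_line(struct line_buf *lb, char *out, size_t out_cap)
{
    char *nl = memchr(lb->data, '\n', lb->len);
    if (!nl)
        return 0;
    size_t line_len = (size_t)(nl - lb->data);
    if (line_len + 1 > out_cap)
        return -1;
    memcpy(out, lb->data, line_len);
    out[line_len] = '\0';
    size_t consumed = line_len + 1;       /* line plus its '\n' */
    memmove(lb->data, lb->data + consumed, lb->len - consumed);
    lb->len -= consumed;
    return 1;
}
```

After every recv(), call lb_append once and then lb_next_line in a loop until it returns 0, since one chunk may contain several commands.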
Yes, you have to loop over recv() until you receive '\0', an error happens (negative value from recv), or recv returns 0.
For the first option: only if this zero is part of your protocol (the server sends it). However, from your code it seems that the zero is just there to be able to use the buffer content as a C-string (on the client side).
The check for a return value of 0 from recv: this means that the connection was closed (it could be part of your protocol that this happens).
