Take the following:
struct Header{
std::size_t body_size;
};
struct Body{
std::string data;
};
struct Packet{
Header header;
Body body;
};
Suppose now, that I want to send a Packet object over a tcp socket.
To do this, I want to serialise the Header into a std::string, which contains information about how large the body is, and send that string over through the socket.
But this serialised header string itself has a variable size since the body_size is not fixed, so how would I know how many bytes to read (of the serialised header string)
So what kind of protocols are used in sending data like this?
The most commonly used protocol for sending data over a TCP socket like you propose is HTTP.
In HTTP, the body size is sent as a string in the relevant header [request, response or chunk] with special characters (\r\n for a request or response, ; for a chunk) to identify the end of the size string.
You'll need to decide on your own special character; NULL would be the easiest for null terminated strings.
One of the key issues of sending data as you propose, is that TCP can split the data up into a number of (what are commonly called) packets, so the receiver may not receive the whole of your Packet at once.
BTW Packet is a poor name in this context.
You don't need to know the size of your complete Packet before you send it. You can call boost:asio async_write with a collection of const_buffers to send both the Header and Body of your Packet.
On the receive side, I recommend using async_read_some with a string buffer large enough for your whole Packet including the Header string.
You should be able to read the Body size from the characters before the NULL and calculate your Packet size from the Body size plus the number of characters before the NULL + 1 (for the NULL). If this is less that the size reported by your ReadHandler then your Body has been split across multiple TCP packets, so you'll need to receive the other packets to reconstruct your Packet Body.
Or, you could save yourself the trouble and simply use a tried and tested HTTP library like via-httplib or boost::beast.
Related
I am writing a server-client application using Winsock in c++ for sending a file line by line and I have a problem in sending huge string. The line size is very huge.
For getting the message from the client by the server I use the code below.
int result;
char message[200];
while (true)
{
recv(newSd, (char*)&message, sizeof(message), 0);
cout << "The Message from client: " << message << ";";
}
The above code working fine if I send small length of the message. But, what I wanted is to send an unknown size of lines in a file.
How to send a big unknown string instead of char message[200];
TCP is a byte stream, it knows nothing about messages or lines or anything like that. When you send data over TCP, all it knows about is raw bytes, not what the bytes represent. It is your responsibility to implement a messaging protocol on top of TCP to delimit the data in some meaningful way so the receiver can know when the data is complete. There are two ways to do that:
send the data length before sending the actual data. The receiver reads the length first, then reads however many bytes the length says.
send a unique terminator after sending the data. Make sure the terminator never appears in the data. The receiver can then read until the terminator is received.
You are not handling either of those in your recv() code, so I suspect you are not handling either of them in your send() code, too (which you did not show).
Since you are sending a text file, you can either:
send the file size, such as in a uint32_t or uint64_t (depending on how large the file is), then send the raw file bytes.
send each text line individually as-is, terminated by a CRLF or bare-LF line break after each line, and then send a final terminator after the last line.
You are also ignoring the return value of recv(), which tells you how many bytes were actually received. It can, and usually does, return fewer bytes than requested, so you must be prepared to call recv() multiple times, usually in a loop, to receive data completely. Same with send().
I have a server that receives a compressed string (compressed with zlib) from a client, and I was using async_receive from the boost::asio library to receive this string, it turns out however that there is no guarantee that all bytes will be received, so I now have to change it to async_read. The problem I face is that the size of the bytes received is variable, so I am not sure how to use async_read without knowing the number of bytes to be received. With the async_receive I just have a boost::array<char, 1024>, however this is a buffer that is not necessarily filled completely.
I wondered if anyone can suggest a solution where I can use async_read even though I do not know the number of bytes to be received in advance?
void tcp_connection::start(boost::shared_ptr<ResolverQueueHandler> queue_handler)
{
if (!_queue_handler.get())
_queue_handler = queue_handler;
std::fill(buff.begin(), buff.end(), 0);
//socket_.async_receive(boost::asio::buffer(buff), boost::bind(&tcp_connection::handle_read, shared_from_this(), boost::asio::placeholders::error));
boost::asio::async_read(socket_, boost::asio::buffer(buff), boost::bind(&tcp_connection::handle_read, shared_from_this(), boost::asio::placeholders::error));
}
buff is a boost::array<char, 1024>
How were you expecting to do this using any other method?
There are a few general methods to sending data of variable sizes in an async manor:
By message - meaning that you have a header that defines the length of the expected message followed by a body which contains data of the specified length.
By stream - meaning that you have some marker (and this is very broad) method of knowing when you've gotten a complete packet.
By connection - each complete packet of data is sent in a single connection which is closed once the data is complete.
So can your data be parsed, or a length sent etc...
Use async_read_until and create your own match condition, or change your protocol to send a header including the number of bytes to expect in the compressed string.
A single IP packet is limited to an MTU size of ~1500 bytes, and yet still you can download gigabyte-large files from your favourite website, and watch megabyte-sized videos on YouTube.
You need to send a header indicating the actual size of the raw data, and then receive the data by pieces on smaller chunks until you finish receiving all the bytes.
For example, when you download a large file over HTTP, there is a field on the header indicating the size of the file: Content-Length:.
Currently I am receiving data synchronously in the following manner
boost::array<char, 2000> buf;
while(true)
{
std::string dt;
size_t len = connect_sock->receive(boost::asio::buffer(buf, 3000));
std::copy(buf.begin(), buf.begin()+len, std::back_inserter(dt));
std::cout << dt;
}
My question is whether this method is efficient enough to receive data that exceed the buffer size . Is there any way that I could know exactly how much data is available so that I could adjust the buffer size accordingly ? (The reason for this is that my server sends a particular response to a request that needs to be processed only when an entire response has been stored in a string variable.
If you are sending data using TCP, you have to take care of this at the application protocol level.
For example, you could prefix each request with a header that would include the number of bytes that make up the request. The receiver, having read and parsed the header, would know how many more bytes it would need to read to get the rest of the request. Then it could repeatedly call receive() until it gets the correct amount of data.
I have a server that receives a compressed string (compressed with zlib) from a client, and I was using async_receive from the boost::asio library to receive this string, it turns out however that there is no guarantee that all bytes will be received, so I now have to change it to async_read. The problem I face is that the size of the bytes received is variable, so I am not sure how to use async_read without knowing the number of bytes to be received. With the async_receive I just have a boost::array<char, 1024>, however this is a buffer that is not necessarily filled completely.
I wondered if anyone can suggest a solution where I can use async_read even though I do not know the number of bytes to be received in advance?
void tcp_connection::start(boost::shared_ptr<ResolverQueueHandler> queue_handler)
{
if (!_queue_handler.get())
_queue_handler = queue_handler;
std::fill(buff.begin(), buff.end(), 0);
//socket_.async_receive(boost::asio::buffer(buff), boost::bind(&tcp_connection::handle_read, shared_from_this(), boost::asio::placeholders::error));
boost::asio::async_read(socket_, boost::asio::buffer(buff), boost::bind(&tcp_connection::handle_read, shared_from_this(), boost::asio::placeholders::error));
}
buff is a boost::array<char, 1024>
How were you expecting to do this using any other method?
There are a few general methods to sending data of variable sizes in an async manor:
By message - meaning that you have a header that defines the length of the expected message followed by a body which contains data of the specified length.
By stream - meaning that you have some marker (and this is very broad) method of knowing when you've gotten a complete packet.
By connection - each complete packet of data is sent in a single connection which is closed once the data is complete.
So can your data be parsed, or a length sent etc...
Use async_read_until and create your own match condition, or change your protocol to send a header including the number of bytes to expect in the compressed string.
A single IP packet is limited to an MTU size of ~1500 bytes, and yet still you can download gigabyte-large files from your favourite website, and watch megabyte-sized videos on YouTube.
You need to send a header indicating the actual size of the raw data, and then receive the data by pieces on smaller chunks until you finish receiving all the bytes.
For example, when you download a large file over HTTP, there is a field on the header indicating the size of the file: Content-Length:.
Our server is seemingly packet based. It is an adaptation from an old serial based system. It has been added, modified, re-built, etc over the years. Since TCP is a stream protocol and not a packet protocol, sometimes the packets get broken up. The ServerSocket is designed in such a way that when the Client sends data, part of the data contains the size of our message such as 55. Sometimes these packets are split into multiple pieces. They arrive in order but since we do not know how the messages will be split, our server sometimes does not know how to identify the split message.
So, having given you the background information. What is the best method to rebuild the packets as they come in if they are split? We are using C++ Builder 5 (yes I know, old IDE but this is all we can work with at the moment. ALOT of work to re-design in .NET or newer technology).
TCP guarantees that the data will arrive in the same order it was sent.
That beeing said, you can just append all the incoming data to a buffer. Then check if your buffer contains one or more packets, and remove them from the buffer, keeping all the remaining data into the buffer for future check.
This, of course, suppose that your packets have some header that indicates the size of the following data.
Lets consider packets have the following structure:
[LEN] X X X...
Where LEN is the size of the data and each X is an byte.
If you receive:
4 X X X
[--1--]
The packet is not complete, you can leave it in the buffer. Then, other data arrives, you just append it to the buffer:
4 X X X X 3 X X X
[---2---]
You then have 2 complete messages that you can easily parse.
If you do it, don't forget to send any length in a host-independant form (ntohs and ntohl can help).
This is often accomplished by prefixing messages with a one or two-byte length value which, like you said, gives the length of the remaining data. If I've understood you correctly, you're sending this as plain text (i.e., '5', '5') and this might get split up. Since you don't know the length of a decimal number, it's somewhat ambiguous. If you absolutely need to go with plain text, perhaps you could encode the length as a 16-bit hex value, i.e.:
00ff <255 bytes data>
000a <10 bytes data>
This way, the length of the size header is fixed to 4 bytes and can be used as a minimum read length when receiving on the socket.
Edit: Perhaps I misunderstood -- if reading the length value isn't a problem, deal with splits by concatenating incoming data to a string, byte buffer, or whatever until its length is equal to the value you read in the beginning. TCP will take care of the rest.
Take extra precautions to make sure that you can't get stuck in a blocking read state should the client not send a complete message. For example, say you receive the length header, and start a loop that keeps reading through blocking recv() calls until the buffer is filled. If a malicious client intentionally stops sending data, your server might be locked until the client either disconnects, or starts sending.
I would have a function called readBytes or something that takes a buffer and a length parameter and reads until that many bytes have been read. You'll need to capture the number of bytes actually read and if it's less than the number you're expecting, advance your buffer pointer and read the rest. Keep looping until you've read them all.
Then call this function once for the header (containing the length), assuming that the header is a fixed length. Once you have the length of the actual data, call this function again.