QTcpSocket and TCP padding - c++

Good day.
I am sending a custom protocol for logging via TCP which looks like this:
Timestamp (uint32_t -> 4 bytes)
Length of message (uint8_t -> 1 byte)
Message (char -> Length of message)
The Timestamp is converted to BigEndian for the transport and everything goes out correctly, except for one little detail: Padding
The Timestamp is sent on its own, however instead of just sending the timestamp (4 bytes) my application (using BSD sockets under Ubuntu) automatically appends two bytes of padding to the message.
Wireshark recognizes this correctly and marks the two extraneous bytes as padding, however the QTcpSocket (Qt 5.8, mingw 5.3.0) apparently assumes that the two extra bytes are actually payload, which obviously messes up my protocol.
Is there any way for me to 'teach' QTcpSocket to ignore the padding (like it should) or any way to get rid of the padding?
I'd like to avoid the whole 'create a sufficiently large buffer and preassemble the entire packet in it so it will be sent out in one go' method if possible.
Thank you very much.
Because it was asked, the code used to send the data is:
return
C->sendInt(entry.TS) &&
C->send(&entry.LogLen, 1) &&
C->send(&entry.LogMsg, entry.LogLen);
where sendInt is declared as (Src being the parameter):
Src = htonl(Src);
return send(&Src, 4);
where 'send' is declared as (Source and Len being the parameters):
char *Src = (char *)Source;
while(Len) {
int BCount = ::send(Sock, Src, Len, 0);
if(BCount < 1) return false;
Src += BCount;
Len -= BCount;
}
return true;
::send is the standard BSD send function.
Reading is done via QTcpSocket:
uint32_t timestamp;
if (Sock.read((char *)&timestamp, sizeof(timestamp)) > 0)
{
uint8_t logLen;
char message[256];
if (Sock.read((char *)&logLen, sizeof(logLen)) > 0 &&
logLen > 0 &&
Sock.read(message, logLen) == logLen
) addToLog(qFromBigEndian(timestamp), message);
}
Sock is the QTcpSocket instance, already connected to the host and addToLog is the processing function.
Also note that the sending side needs to run on an embedded system, so using QTcpServer is not an option.

Your read logic appears to be incorrect. You have...
uint32_t timestamp;
if (Sock.read((char *)&timestamp, sizeof(timestamp)) > 0)
{
uint8_t logLen;
char message[256];
if (Sock.read((char *)&logLen, sizeof(logLen)) > 0 &&
logLen > 0 &&
Sock.read(message, logLen) == logLen
) addToLog(qFromBigEndian(timestamp), message);
}
From the documentation for QTcpSocket::read(data, MaxSize) it...
Reads at most maxSize bytes from the device into data, and returns the
number of bytes read
What if one of your calls to Sock.read reads partial data? You essentially discard that data rather than buffering it for reuse next time.
Assuming you have a suitably scoped QByteArray...
QByteArray data;
your reading logic should be more along the lines of...
/*
* Append all available data to `data'.
*/
data.append(Sock.readAll());
/*
* Now repeatedly read/trim messages from data until
* we have no further complete messages.
*/
while (contains_complete_log_message(data)) {
auto message = read_message_from_data(data);
data = data.right(data.size() - message.size());
}
/*
* At this point `data' may be non-empty but doesn't
* contain enough data for a complete message.
*/
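The two helpers used in the loop above are left undefined. A minimal sketch of what they might look like for the layout in the question (4-byte big-endian timestamp, 1-byte length, payload) follows. It works on std::vector<std::uint8_t> rather than QByteArray so that it compiles without Qt, and the LogMessage struct, its wireSize field, and the exact signatures are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical decoded form of one log entry.
struct LogMessage {
    std::uint32_t timestamp;
    std::string text;
    std::size_t wireSize;  // number of bytes this entry occupied in the stream
};

constexpr std::size_t kHeaderSize = 5;  // 4-byte timestamp + 1-byte length

// True if `data` starts with a complete entry (header plus full payload).
bool contains_complete_log_message(const std::vector<std::uint8_t>& data) {
    if (data.size() < kHeaderSize) return false;
    return data.size() >= kHeaderSize + data[4];
}

// Decodes the first entry. Call only after the check above succeeded.
LogMessage read_message_from_data(const std::vector<std::uint8_t>& data) {
    LogMessage m;
    // Assemble the big-endian timestamp byte by byte (no host byte order involved).
    m.timestamp = (std::uint32_t(data[0]) << 24) | (std::uint32_t(data[1]) << 16) |
                  (std::uint32_t(data[2]) << 8) | std::uint32_t(data[3]);
    const std::size_t len = data[4];
    m.text.assign(data.begin() + kHeaderSize, data.begin() + kHeaderSize + len);
    m.wireSize = kHeaderSize + len;
    return m;
}
```

With this shape, the trimming step in the loop would remove wireSize bytes from the front of the buffer, so that a partially received next entry stays buffered until the rest arrives.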

If the length of the padding is always fixed then just add socket->read(2); to ignore the 2 bytes.
On the other hand it might be just the tip of the iceberg. What are you using to read and write?

You should not invoke send three times but only once. For conversion into BigEndian you might use the Qt functions and write everything into a single buffer and only call send once. It is not what you want, but I assume it is what you'll need to do and it should be easy, as you already know the size of your message. You also will not need to leave the Qt world for sending the messages.
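As a rough illustration of the single-buffer approach, the sketch below assembles one entry in the question's wire format without any Qt dependency, so it could also run on the embedded sender. The function name encodeLogEntry and the decision to truncate messages at 255 bytes are assumptions, not part of the original code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Builds one wire-format entry: 4-byte big-endian timestamp, 1-byte length, message bytes.
// Assumption: messages longer than 255 bytes are truncated, because the length field is one byte.
std::vector<std::uint8_t> encodeLogEntry(std::uint32_t timestamp, const std::string& message) {
    const std::size_t len = message.size() > 255 ? 255 : message.size();
    std::vector<std::uint8_t> out;
    out.reserve(5 + len);
    // Shifting writes the most significant byte first regardless of host byte order.
    out.push_back(static_cast<std::uint8_t>(timestamp >> 24));
    out.push_back(static_cast<std::uint8_t>(timestamp >> 16));
    out.push_back(static_cast<std::uint8_t>(timestamp >> 8));
    out.push_back(static_cast<std::uint8_t>(timestamp));
    out.push_back(static_cast<std::uint8_t>(len));
    out.insert(out.end(), message.begin(), message.begin() + len);
    return out;
}
```

The resulting vector can then be handed to the existing send loop in one call, passing data() and size(). Because every byte is written explicitly, no struct layout or alignment is involved in what goes onto the wire.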

Related

Creating and sending raw data packets in C/C++

I know what my packet looks like. It has 6 header fields (1 byte each, each header has 8 fields) and then it has the payload (data).
I would like to build a raw packet in C or C++ (it should look the same I think).
Here's what I think I should do:
unsigned char packet[11];
packet[0] = (0x81); // first header with 8 fields
packet[1] = (0x8c); // second header with 8 fields
packet[2] = (0xfe);
packet[3] = (0x84);
packet[4] = (0x1d);
packet[5] = (0x79);
packet[6] = (0x96); // payload, the 'h' letter, masked
packet[7] = (0xe1); // 'e'
packet[8] = (0x71); // 'l'
packet[9] = (0x15); // 'l'
packet[10] = (0x91);// 'o'
Where, for instance, 0x81 is the first byte (I simply converted every field (bit) of my first header to hex).
Then I simply want to send it to the server with send(sockfd, packet, sizeof(packet), 0).
Receiving and printing the response:
unsigned char buffer[1024];
if ((recv(sockfd, buffer, len, 0)) == 0)
{
if (errno != 0)
{
exit(1);
}
}
int i;
for(i = 0; i<len; i++)
printf("%x ", buffer[i]);
Am I right?
Other than mishandling the return value from recv, your code looks okay.
if ((recv(sockfd, buffer, len, 0)) == 0)
{
if (errno != 0)
{
exit(1);
}
}
A zero return indicates normal close of the connection. There's no reason to check errno if it returns zero.
A return value of -1 indicates an error. In that case, it does make sense to check errno.
A value greater than zero indicates that number of bytes have been received. Be aware that it is perfectly normal for recv to return fewer bytes than you asked it for. If you want to receive exactly some number of bytes, you must call recv in a loop.
TCP is a byte-stream protocol and has no idea where your "packets" (really, messages) begin and end.
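The three cases for the return value can be sketched as follows, assuming POSIX sockets. The helper names toHex and receiveAndFormat are made up for this example. Note that only the bytes actually received are formatted, not the whole buffer.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Formats the first `count` bytes of `data` as space-separated hex.
std::string toHex(const unsigned char* data, std::size_t count) {
    std::string out;
    char tmp[4];  // two hex digits, one space, terminating NUL
    for (std::size_t i = 0; i < count; ++i) {
        std::snprintf(tmp, sizeof(tmp), "%02x ", static_cast<unsigned>(data[i]));
        out += tmp;
    }
    return out;
}

// Performs one recv() and reports what happened.
// Returns >0: number of bytes received (their hex form is stored in `hex`),
//          0: the peer closed the connection normally,
//         -1: an error occurred and errno describes it.
ssize_t receiveAndFormat(int sockfd, std::string& hex) {
    unsigned char buffer[1024];
    ssize_t n;
    do {
        n = recv(sockfd, buffer, sizeof(buffer), 0);
    } while (n < 0 && errno == EINTR);  // retry if interrupted by a signal
    if (n > 0) hex = toHex(buffer, static_cast<std::size_t>(n));
    return n;
}
```

A caller that needs a whole message would invoke this repeatedly and concatenate the results until its own protocol says the message is complete.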
Your code does not look error-prone.
But a good practice would be:
const std::uint32_t BUFFER_SIZE = 11;
std::vector<std::uint8_t> buffer;
buffer.reserve(BUFFER_SIZE);
buffer = {0x81,0x8c.....};
send( sockfd,
reinterpret_cast <const char*> ( buffer.data() ),
static_cast <int> ( buffer.size() ),
0
);
Doing so ties the length passed to send to the actual size of the data, and the std::vector manages its own memory.
You may also benefit from taking a look at ZeroMQ, as an example of a ready-made, high-performance asynchronous messaging library, aimed at use in distributed or concurrent applications.

C++ TCP recv unknown buffer size

I want to use the function recv(socket, buf, len, flags) to receive an incoming packet. However I do not know the length of this packet prior to runtime so the first 8 bytes are supposed to tell me the length of this packet. I don't want to just allocate an arbitrarily large len to accomplish this so is it possible to set len = 8 have buf be a type of uint64_t. Then afterwards
memcpy(dest, &buf, buf)?
Since TCP is stream-based, I'm not sure what type of packages you mean. I will assume that you are referring to application level packages. I mean packages which are defined by your application and not by underlying protocols like TCP. I will call them messages instead to avoid confusion.
I will show two possibilities. First I will show how you could read a message without knowing its length before you have finished reading. The second example makes two calls: first it reads the size of the message, then it reads the whole message at once.
Read data until the message is complete
Since TCP is stream-based, you will not lose any data when your buffer is not big enough. So you can read a fixed number of bytes. If something is missing, you can call recv again. Here is an extensive example. I just wrote it without testing, so I hope everything works.
std::size_t offset = 0; // number of valid bytes at the front of buf
std::vector<char> buf(512);
std::vector<char> readMessage() {
while (true) {
if (offset > 0) {
buf.resize(offset); // Truncate buffer to the bytes received so far
if (isMessageComplete(buf)) {
// Message is complete.
std::vector<char> msg = std::move(buf);
std::size_t msgLen = getSizeOfMessage(msg);
// msg may already contain the beginning of the next message.
// Write it back to buf (assign also gives the moved-from buf a defined state).
buf.assign(msg.begin() + msgLen, msg.end());
msg.resize(msgLen);
offset = buf.size();
buf.resize(std::max<std::size_t>(2 * offset, 512)); // prepare buffer for next message
return msg;
}
// Message is not complete right now. Make room and read more...
buf.resize(std::max<std::size_t>(2 * offset, 512)); // double available memory
}
ssize_t ret = recv(fd, buf.data() + offset, buf.size() - offset, 0);
if (ret < 0) {
if (errno == EINTR) {
// Interrupted, just try again ...
continue;
} else {
// Error occurred. Throw exception.
throw IOException(strerror(errno));
}
} else if (ret == 0) {
// No data available anymore.
if (offset == 0) {
// Client did just close the connection
return std::vector<char>(); // return empty vector
} else {
// Client did close connection while sending package?
// It is not a clean shutdown. Throw exception.
throw ProtocolException("Unexpected end of stream");
}
} else {
offset += ret;
}
}
}
You have to define bool isMessageComplete(std::vector<char>) and std::size_t getSizeOfMessage(std::vector<char>) by yourself.
Read the header and check the length of the package
The second possibility is to read the header first, in your case just the 8 bytes which contain the size of the package. After that, you know the size of the package. This means you can allocate enough storage and read the whole message at once:
/// Reads n bytes from fd.
bool readNBytes(int fd, void *buf, std::size_t n) {
std::size_t offset = 0;
char *cbuf = reinterpret_cast<char*>(buf);
while (true) {
ssize_t ret = recv(fd, cbuf + offset, n - offset, MSG_WAITALL);
if (ret < 0) {
if (errno != EINTR) {
// Error occurred
throw IOException(strerror(errno));
}
} else if (ret == 0) {
// No data available anymore
if (offset == 0) return false;
else throw ProtocolException("Unexpected end of stream");
} else if (offset + ret == n) {
// All n bytes read
return true;
} else {
offset += ret;
}
}
}
/// Reads message from fd
std::vector<char> readMessage(int fd) {
std::uint64_t size;
if (readNBytes(fd, &size, sizeof(size))) {
std::vector<char> buf(size);
if (readNBytes(fd, buf.data(), size)) {
return buf;
} else {
throw ProtocolException("Unexpected end of stream");
}
} else {
// connection was closed
return std::vector<char>();
}
}
The flag MSG_WAITALL requests that the function blocks until the full amount of data is available. However, you cannot rely on that. You have to check it and read again if something is missing. Just like I did it above.
readNBytes(fd, buf, n) reads n bytes. As long as the connection is not closed by the other side, the function will not return without reading n bytes. If the connection was closed by the other side, the function returns false. If the connection was closed in the middle of a message, an exception is thrown. If an I/O error occurred, another exception is thrown.
readMessage reads 8 bytes [sizeof(std::uint64_t)] and uses them as the size of the next message. Then it reads the message.
If you want platform independence, you should convert size to a defined byte order. Computers with x86 architecture use little endian. It is common to use big endian in network traffic.
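For a 64-bit size field there is no standard htonl equivalent, so one option is to convert byte by byte. The sketch below shows one way this might be done. The function names are assumptions, and it presumes both sides agree on big endian for the 8-byte header.

```cpp
#include <cstdint>

// Writes `value` into out[0..7], most significant byte first (network byte order).
void encodeBigEndian64(std::uint64_t value, unsigned char* out) {
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<unsigned char>(value & 0xff);
        value >>= 8;
    }
}

// Reads 8 bytes in network byte order from `in` and returns the host value.
std::uint64_t decodeBigEndian64(const unsigned char* in) {
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value = (value << 8) | in[i];
    }
    return value;
}
```

On the receiving side, the 8 header bytes would be read into an unsigned char array and passed through decodeBigEndian64 before being used as the message size. Since the conversion uses shifts only, it gives the same result on little-endian and big-endian hosts.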
Note: With MSG_PEEK it is possible to implement this functionality for UDP. You can request the header while using this flag. Then you can allocate enough space for the whole package.
A fairly common technique is to read leading message length field, then issue a read for the exact size of the expected message.
HOWEVER! Do not assume that the first read will give you all eight bytes (see Note), or that the second read will give you the entire message/packet.
You must always check the number of bytes read and issue another read (or two (or three, or...)) to get all the data you want.
Note: Because TCP is a streaming protocol and because the packet size "on the wire" varies in accordance with a very arcane algorithm designed to maximize network performance, you could easily issue a read for eight bytes and the read could return having only read three (or seven or ...) bytes. The guarantee is that unless there is an unrecoverable error you will receive at least one byte and at most the number of bytes you requested. Because of this you must be prepared to do byte address arithmetic and issue all reads in a loop that repeats until the desired number of bytes is returned.
Since TCP is streaming there isn't really any end to the data you receive, not until the connection is closed or there is an error.
Instead you need to implement your own protocol on top of TCP, one that either contains a specific end-of-message marker, a length-of-data header field, or possibly a command-based protocol where the data of each command is of a well-known size.
That way you can read into a small fixed-sized buffer and append to a larger (possibly expanding) buffer as needed. The "possibly expanding" part is ridiculously easy in C++, what with std::vector and std::string (depending on the data you have)
There is another important thing to remember, that since TCP is stream-based, a single read or recv call may not actually fetch all the data you request. You need to receive the data in a loop until you have received everything.
In my personal opinion, I suggest receiving the size of the message (a fixed 4-byte integer) first:
recv(socket, "size of message written in integer", "size of integer")
and then receiving the real message:
recv(socket, "real message", "size of message written in integer")
This technique can also be used for sending files, images, and long messages.

Winsock - read integer from Java client in C++

I have a client-server application, with the server part written in C++ (Winsock) and the client part in Java.
When sending data from the client, I first send its length followed by the actual data. For sending the length, this is the code:
clientSender.print(text.length());
where clientSender is of type PrintWriter.
On the server side, the code that reads this is
int iDataLength;
if(recv(client, (char *)&iDataLength, sizeof(iDataLength), 0) != SOCKET_ERROR)
//do something
I tried printing the value of iDataLength within the if and it always turns out to be some random large integer. If I change iDataLength's type to char, I get the correct value. However, the actual value could well exceed a char's capacity.
What is the correct way to read an integer passed over a socket in C++ ?
I think the problem is that PrintWriter is writing text and you are trying to read a binary number.
Here is what PrintWriter does with the integer it sends:
http://docs.oracle.com/javase/7/docs/api/java/io/PrintWriter.html#print%28int%29
Prints an integer. The string produced by String.valueOf(int) is
translated into bytes according to the platform's default character
encoding, and these bytes are written in exactly the manner of the
write(int) method.
Try something like this:
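#include <sys/socket.h>
#include <cerrno> // for errno
#include <cstring> // for std::strerror()
#include <iostream> // for std::cerr
#include <string> // for std::string, std::stoi()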
// ... stuff
char buf[1024]; // buffer to receive text
int len;
if((len = recv(client, buf, sizeof(buf), 0)) == -1)
{
std::cerr << "ERROR: " << std::strerror(errno) << std::endl;
return 1;
}
std::string s(buf, len);
int iDataLength = std::stoi(s); // convert text back to integer
// use iDataLength here (after sanity checks)
Are you sure the endianness is not the issue? (Maybe Java encodes it as big endian and you read it as little endian).
Besides, you might need to implement a receive-all function (similar to sendall) to make sure you receive the exact number of bytes specified, because recv may receive fewer bytes than it was told to.
You have a confusion between numeric values and their ASCII representation.
When in Java you write clientSender.print(text.length()); you are actually writing an ascii string - if length is 15, you will send characters 1 (code ASCII 0x31) and 5 (code ASCII 0x35)
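The difference can be made concrete with a small sketch. The helper names asText and misreadAsBinary are invented for this illustration: the first produces the bytes that arrive when a number is sent as decimal text, the second mimics a receiver that copies those bytes straight into an integer.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Bytes that go on the wire when the number is sent as decimal text.
std::vector<unsigned char> asText(int value) {
    const std::string s = std::to_string(value);
    return std::vector<unsigned char>(s.begin(), s.end());
}

// What a receiver gets if it copies the received bytes directly into a 32-bit integer.
// Bytes that were never received are left as zero here. In the question's code they
// would be whatever happened to be in the variable.
std::int32_t misreadAsBinary(const std::vector<unsigned char>& bytes) {
    std::int32_t v = 0;
    std::memcpy(&v, bytes.data(), bytes.size() < sizeof(v) ? bytes.size() : sizeof(v));
    return v;
}
```

For the value 15, asText yields the two bytes 0x31 and 0x35, and interpreting those as a binary integer gives a number unrelated to 15 on any byte order.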
So you must either :
send a binary length in a portable way (in C or C++ you have hton and ntoh, but unsure in Java)
add a separator (newline) after the textual length from Java side and decode that in C++ :
char buffer[1024]; // a size big enough to read the packet
int iDataLength, l;
l = recv(client, buffer, sizeof(buffer) - 1, 0); // read text into buffer, leaving room for the terminator
if (l != SOCKET_ERROR) {
buffer[l] = 0;
sscanf(buffer, "%d", &iDataLength); // parse the textual length
char *ptr = strchr(buffer, '\n');
if (ptr == NULL) {
// should never happen : peer does not respect protocol
...
}
ptr += 1; // ptr now points after the length
//do something
}
Java part should be : clientSender.println(text.length());
EDIT :
From Remy Lebeau's comment, There is no 1-to-1 relationship between sends and reads in TCP. recv() can and does return arbitrary amounts of data, so you cannot assume that a single recv() will read the entire line of text.
Above code should not do a simple recv but be ready to concatenate multiple reads to find the separator (left as exercise for the reader :-) )
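One possible shape for that exercise is sketched below. It is independent of the socket API: each chunk returned by recv() is fed into a small accumulator, which reports the length only once the newline separator has arrived. The class name LengthLineReader and its methods are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

// Accumulates received chunks until a full "<decimal length>\n" line is present.
class LengthLineReader {
public:
    // Appends one chunk, e.g. the bytes returned by a single recv().
    void feed(const char* chunk, std::size_t size) { pending_.append(chunk, size); }

    // If a complete line is buffered, stores the parsed length, removes the line
    // (keeping whatever follows it) and returns true. Returns false if more data
    // is needed. std::stoi throws std::invalid_argument or std::out_of_range
    // if the line does not start with a valid number.
    bool tryReadLength(int& length) {
        const std::size_t newline = pending_.find('\n');
        if (newline == std::string::npos) return false;
        length = std::stoi(pending_.substr(0, newline));
        pending_.erase(0, newline + 1);
        return true;
    }

    // Bytes received after the length line (the start of the actual data).
    const std::string& remaining() const { return pending_; }

private:
    std::string pending_;
};
```

The receive loop would call feed after every successful recv() and keep reading while tryReadLength returns false. Anything left in remaining() already belongs to the message body.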

Prepending a message with the size of the message

I'm writing something server-client related, and I have this code snippet here:
char serverReceiveBuf[65536];
client->read(serverReceiveBuf, client->bytesAvailable());
handleConnection(serverReceiveBuf);
that reads data whenever a readyRead() signal is emitted by the server. Using bytesAvailable() is fine when I test on my local network since there's no latency, but when I deploy the program I want to make sure the entire message is received before I handleConnection().
I was thinking of ways to do this, but read and write only accept chars, so the maximum message size indicator I can send in one char is 127. I want the maximum size to be 65536, but the only way I can think of doing that is have a size-of-size-of-message variable first.
I reworked the code to look like this:
char serverReceiveBuf[65536];
char messageSizeBuffer[512];
int messageSize = 0, i = 0; //max value of messageSize = 65536
client->read(messageSizeBuffer,512);
while((int)messageSizeBuffer[i] != 0 || i <= 512){
messageSize += (int) messageSizeBuffer[i];
//client will always send 512 bytes for size of message size
//if message size < 512 bytes, rest of buffer will be 0
}
client->read(serverReceiveBuf, messageSize);
handleConnection(serverReceiveBuf);
but I'd like a more elegant solution if one exists.
It is a very common technique when sending messages over a stream to send a fixed-sized header before the message payload. This header can include many different pieces of information, but it always includes the payload size. In the simplest case, you can send the message size encoded as a uint16_t for a maximum payload size of 65535 (or uint32_t if that's not sufficient). Just make sure you handle byte ordering with ntohs and htons.
uint16_t messageSize;
client->read((char*)&messageSize, sizeof(uint16_t));
messageSize = ntohs(messageSize);
client->read(serverReceiveBuf, messageSize);
handleConnection(serverReceiveBuf);
read and write work with byte streams. It does not matter to them if the bytes are chars or any other form of data. You can send a 4-byte integer by casting its address to char* and sending 4 bytes. On the receiving end cast the 4 bytes back to an int. (If the machines are of different types you may also have endian issues, requiring the bytes to be rearranged into an int. See htonl and its cousins.)
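Since a readyRead() signal may deliver only part of a message, or several messages at once, the header-plus-payload idea is often combined with a buffer that persists between reads. The following is a minimal sketch of that idea in plain C++, assuming a 2-byte big-endian size header. The class name FrameAssembler and its feed method are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects arbitrary chunks and yields complete payloads framed as
// [2-byte big-endian size][payload].
class FrameAssembler {
public:
    // Appends a chunk (for example, everything read in one readyRead() handler)
    // and returns all payloads that are now complete, in arrival order.
    std::vector<std::string> feed(const char* chunk, std::size_t size) {
        pending_.append(chunk, size);
        std::vector<std::string> frames;
        while (pending_.size() >= 2) {
            // Decode the header byte by byte, independent of host byte order.
            const std::size_t payloadSize =
                (static_cast<unsigned char>(pending_[0]) << 8) |
                static_cast<unsigned char>(pending_[1]);
            if (pending_.size() < 2 + payloadSize) break;  // wait for the rest
            frames.push_back(pending_.substr(2, payloadSize));
            pending_.erase(0, 2 + payloadSize);
        }
        return frames;
    }

private:
    std::string pending_;  // bytes received but not yet part of a complete frame
};
```

In the Qt handler, the bytes from client->readAll() could be passed to feed via constData() and size(), with handleConnection called once per returned payload. The assembler must live as long as the connection, not just for one handler call.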

Partial receipt of packets from socket C++

I have a problem: my server application sends a packet 8 bytes long (AABBCC1122334455), but my application receives this packet in two parts, AABBCC1122 and 334455, via the "recv" function. How can I fix that?
Thanks!
To sum up a little bit:
TCP connection doesn't operate with packets or messages on the application level, you're dealing with stream of bytes. From this point of view it's similar to writing and reading from a file.
Both send and recv can send and receive less data than provided in the argument. You have to deal with it correctly (usually by applying proper loop around the call).
As you're dealing with streams, you have to find the way to convert it to meaningful data in your application. In other words, you have to design serialisation protocol.
From what you've already mentioned, you most probably want to send some kind of messages (well, it's usually what people do). The key thing is to discover the boundaries of messages properly. If your messages are of fixed size, you simply grab the same amount of data from the stream and translate it to your message; otherwise, you need a different approach:
If you can come up with a character which cannot exist in your message, it could be your delimiter. You can then read the stream until you reach the character and it'll be your message. If you transfer ASCII characters (strings) you can use zero as a separator.
If you transfer binary data (raw integers etc.), all characters can appear in your message, so nothing can act as a delimiter. Probably the most common approach in this case is to use a fixed-size prefix containing the size of your message. The size of this extra field depends on the maximum size of your message (you will probably be safe with 4 bytes, but if you know the maximum size, you can use a smaller field). Then your packet would look like SSSS|PPPPPPPPP... (stream of bytes), where S is the additional size field and P is your payload (the real message in your application; the number of P bytes is determined by the value of S). You know every packet starts with 4 special bytes (S bytes), so you can read them as a 32-bit integer. Once you know the size of the encapsulated message, you read all the P bytes. After you're done with one packet, you're ready to read another one from the socket.
Good news though, you can come up with something completely different. All you need to know is how to deserialise your message from a stream of bytes and how send/recv behave. Good luck!
EDIT:
Example of function receiving arbitrary number of bytes into array:
bool recv_full(int sock, char *buffer, size_t size)
{
size_t received = 0;
while (received < size)
{
ssize_t r = recv(sock, buffer + received, size - received, 0);
if (r <= 0) break;
received += r;
}
return received == size;
}
And example of receiving packet with 2-byte prefix defining size of payload (size of payload is then limited to 65kB):
uint16_t msgSize = 0;
char msg[0xffff];
if (recv_full(sock, reinterpret_cast<char *>(&msgSize), sizeof(msgSize)) &&
recv_full(sock, msg, msgSize))
{
// Got the message in msg array
}
else
{
// Something bad happened to the connection
}
That's just how recv() works on most platforms. You have to check the number of bytes you receive and continue calling it in a loop until you get the number that you need.
You "fix" that by reading from TCP socket in a loop until you get enough bytes to make sense to your application.
my server application sends packet 8 bytes length
Not really. Your server sends 8 individual bytes, not a packet 8 bytes long. TCP data is sent over a byte stream, not a packet stream. TCP neither respects nor maintains any "packet" boundary that you might have in mind.
If you know that your data is provided in quanta of N bytes, then call recv in a loop:
std::vector<char> read_packet(int N) {
std::vector<char> buffer(N);
int total = 0, count;
while ( total < N && (count = recv(sock_fd, &buffer[total], N-total, 0)) > 0 )
total += count;
return buffer;
}
std::vector<char> packet = read_packet(8);
If your packet is variable length, try sending its length before the data itself:
int read_int() {
std::vector<char> buffer = read_packet(sizeof (int));
int result;
memcpy((void*)&result, (void*)&buffer[0], sizeof(int));
return result;
}
int length = read_int();
std::vector<char> data = read_packet(length);