How to read a QTcpSocket from R - c++

I'm trying to send data from Qt to R. I am new to the QtNetwork module and relatively new to Qt overall. As such I am also trying to figure out how QIODevice encodes data for the purposes of reading and writing.
If I run the Fortune Server Example and connect to it with the following code in R:
connection <- socketConnection(host="localhost", port=50743, open="rb", timeout=10)
readBin(connection, what="raw", n = 1000)
the following raw hexadecimal vector is returned
00 00 00 56 00 59 00 6f 00 75 00 20 00 77 00 69 00 6c 00 6c 00 20 00 66 00 65 00 65 00 6c 00 20 00 68 00 75 00 6e 00 67 00 72 00 79 00 20 00 61 00 67 00 61 00 69 00 6e 00 20 00 69 00 6e 00 20 00 61 00 6e 00 6f 00 74 00 68 00 65 00 72 00 20 00 68 00 6f 00 75 00 72 00 2e
Removing the first five bytes and all the remaining null characters and converting to char I get:
"You will feel hungry again in another hour."
So what I want to know is where do all the characters that are not part of the fortune come from? The fourth byte seems to be the byte length of the message from the sixth byte to the end, the rest of the "non-fortune" characters are all null.
I read that QByteArray terminates each byte with a null character and QByteArray is converted to a QBuffer before being written by QTcpSocket, is that what is happening here? QBuffer adds the length of the message (but what of the other four bytes) and every second byte of a QByteArray is the null character? Also, the last byte is not null (did the readBin operation consume it/ how did readBin know where the message ended)?
Is this the only way to write data to the socket? If I wanted to transmit values of type double would I have to convert them to QByteArray to transmit them in this fashion? Is there not some non-text way of transmitting data through a socket?
Any enlightenment would be much appreciated!
EDIT:
Thanks for the answer! For completeness sake here is how you might decode the string in R
connection <- socketConnection(host="localhost", port=50743, open="rb", timeout=10)
# Read first 32 bits, which contains the size of the string in bytes
len.raw <- readBin(connection, what="raw", n = 4)
# convert to integer
len <- strtoi(paste(c("0x",len.raw),collapse=""))
# Read raw message
msg.raw <- readBin(connection, what="raw", n = len)
# convert to char using UTF-16BE
msg <- iconv(list(msg.raw),from="UTF-16BE")
close(connection)
cat(msg)

If you take a look at how the Fortune Server Example is implemented, you can see that it uses QDataStream to serialize fortunes (QStrings) over the socket:
QByteArray block;
QDataStream out(&block, QIODevice::WriteOnly);
out.setVersion(QDataStream::Qt_4_0);
out << fortunes.at(qrand() % fortunes.size());
So, the question is reduced to "How does QDataStream serialize QStrings?", and this is answered extensively in the documentation page about serializing Qt data types. You can see that a QString's serialization looks like this:
If the string is null: 0xFFFFFFFF (quint32)
Otherwise: The string length in bytes (quint32) followed by the data in UTF-16
And this is exactly what you are seeing in your question. The first four bytes are the string length in bytes, and the "nulls" you are seeing later appear because of using UTF-16 encoding.
Is this the only way to write data to the socket? If I wanted to transmit values of type double would I have to convert them to QByteArray to transmit them in this fashion? Is there not some non-text way of transmitting data through a socket?
You can use any serialization format you like. QDataStream is widely used in Qt since it supports most Qt data types out of the box. This has nothing to do with using QByteArray, you can let QDataStream write to the socket directly. QDataStream is, actually, a binary format (non-text) as you can see. If want textual human-readable formats, you can use JSON.
But if you are aiming to send data from Qt to R using QDataStream, you'll have to write your QDataStream deserializer for R. I would recommend using some common data serialization that has implementations in C++ and R (in lieu of re-inventing the wheel). I believe JSON meets this criterion, and if you want to use a binary format, msgpack might be interesting for you, since it supports a lot of programming languages (including R and C++).

Related

Problems with Pican2 sending messages

I've been experimenting with Pican2 and the python-can libraries and I've been able to read the bus and interpret many messages in my car. The problem is when I send a message to the bus (for example, turn on A/C), it quickly appears once in the candump printout and then reverts to its previous state. For example:
[436] 00 08 00 10 FE 00 00 01
[436] 04 10 00 10 FE 00 00 01
[436] 00 08 00 10 FE 00 00 01
[436] 00 08 00 10 FE 00 00 01
...
04 10 occur when A/C is on and fan speed is at level 1. I am sending this data... 00 08 is A/C is off, this overrides my can message on its own.
It seems as though I have to send the message in a loop for it to take. Is there something I am missing? I feel like I should just be able to send the message once and have the canbus accept it.
Many functions controlled by CAN messages need periodic messages to keep them doing what they are doing. In your log, it looks like there are periodic messages controlling the fan.
Further reading: https://www.sans.org/reading-room/whitepapers/threats/hacking-bus-basic-manipulation-modern-automobile-through-bus-reverse-engineering-37825

How to calculate checksum of UDP packet embedded inside IP packet

I have a UDP packet which is embedded inside IP packet and not able to calculate the checksum of UDP properly but I can correctly find the CHecksum of IP. Can someone help how the UDP checksum is found.
[45 00 00 53 00 80 00 00 40 11 66 16 0A 00 00 03 0A 00 00 02] CA B1 CA B1 00 3F DF A5
The bits enclosed in bracket is IP packet and the checksum is given in bold.
**UDP Packet**
CA B1 Source port
CA B1 Destination port
00 3F Length
DF A5 Checksum
Here how the checksum "DF A5" came. I did 16 bit addition and took the 1s complement but still not getting the value. Whether I need to consider IP header also to calculate the Checksum of UDP

Memory leak when using anything involving wxFileName

I'm making use of wxWidgets in my program for directory management and compressing/uncompromising collections of files. As I've been building my file system, I've noticed that I get memory leaks every run. After a lot of testing, I realized that any time I use any functions related to wxFileName, I get a memory leak. I'm using wx widgets 3.0.1, and my standalone example is as follows.
#include <wx\filename.h>
int main()
{
wxFileName::Mkdir("Test");
return 0;
}
The result is the same if I make an instance of the wxFileName class.
How do I make wx widgets not create a memory leak? I want to be able to package large collections of files in one file, and read the data from them with various other libraries (via extracting the zip to a temporary folder and reading the data from there). I haven't been able to get any other library to zip/unzip entire folders, so I really need to be able to use wxWidgets without a memory leak.
I read in another thread that the visual studios debugger is falsely identifying the memory leaks, but I ran it through AQtime and it confirmed that there was indeed a memory leak.
The exact debug output involving the memory leak is as follows:
Detected memory leaks!
Dumping objects ->
{1087} normal block at 0x009B4BC0, 64 bytes long.
Data: <\+= d+= l+= t+= > 5C 2B 3D 00 64 2B 3D 00 6C 2B 3D 00 74 2B 3D 00
{1086} normal block at 0x009B4880, 772 bytes long.
Data: < > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
{1085} normal block at 0x009B4680, 28 bytes long.
Data: < H > 80 48 9B 00 C1 00 00 00 00 00 00 00 CD CD CD CD
Object dump complete.
After a bit of digging (it WOULD be the digging I did AFTER posting the question) I found that when you're using wxWidgets without creating a wxWidgets app object, you need to use the following two functions:
wxInitialize()
and
wxUninitialize()
So the fixed version of my code is as follows:
#include <wx/app.h>
#include <wx\filename.h>
int main()
{
wxInitialize();
wxFileName::Mkdir("Waka Waka");
wxUninitialize();
return 0;
}
I suggest if anyone is using wxWidgets purely for the file management to either call these functions in the constructor and destructor of whatever class handles files, or at the beginning and end of your program's main loop.

Setting endianness of VS debugger

I am using VS 2012 and programming in C++. I have a wide string
wchar_t *str = L"Hello world".
Technically I read the string from a file but I don't know if that makes a difference. When I look at str in the memory window it looks like this:
00 48 00 65 00 6c 00 6c 00 6f 00 2c 00 20 00 77 00 6f 00 72 00 6c 00 64 00 21 00
As you can see the string is stored in memory as big-endian.
When I hover my mouse over the string I get:
L"䠀攀氀氀漀Ⰰ 眀漀爀氀搀℀"
And after I reverse the endianness of str the memory looks like:
48 00 65 00 6c 00 6c 00 6f 00 2c 00 20 00 77 00 6f 00 72 00 6c 00 64 00 21 00 00
And the hover over looks like:
L"Hello, world!"
It seems that the debugger displays UTF-16 in little-endian by default. My program reads big-endian files so it is very tedious to keep reversing the endianness of all strings to debug them. Is there any way to change the endianness of the debugger's display?
Except for debug purposes I can do all my processing in big endian.
It's not only the debugger. The wchar_t function of Visual Studio are little endian as the host is. When you want to process the data you need to reverse the string endianess to little endian anyway.
It's worth to have this change even if you output the strings to a file with a different endianess. Strings are defined as a byte sequence, your endianess applied to a string looks strange anyhow.
Your best shot in getting this to work is to define your own type and create a debugger type visualizer for it (see Customizing the Visual Studio Debugger Display of Your Data, or here).
Or maybe you can quick-hack it by shifting the address by 1 byte in watch window.
You're working with a non-native string format that just happens to "feel" similar to the native format. So you are tempted to think there should be almost a way to do it. But to the debugger, it's just a foreign binary format. The debugger is not designed to handle foreign endianness just as it does not handle visualizing an OGG stream packet.
If you want to use available tools for manipulating native-endian Unicode strings, you'll need to convert to native-endian Unicode format.
As has been pointed out, VS uses the native endianness, which is
little endian on an Intel/AMD. The problem is that you're not
reading the strings correctly; you should imbue the
std::istream with a locale which reads UTF-16BE (since this is
apparently the encoding form you're trying to read).
std::istream (or rather the backing std::filebuf) will
automatically do the code translation on the fly when reading
and writing.
You can set the endianness of the Memory window using the context menu. Right-click in the Memory window and check "Big Endian".

zlib decompression failing

I'm writing an application that needs to uncompress data compressed by another application (which is outside my control - I cannot make changes to it's source code). The producer application uses zlib to compress data using the z_stream mechanism. It uses the Z_FULL_FLUSH frequently (probably too frequently, in my opinion, but that's another matter). This third party application is also able to uncompress it's own data, so I'm pretty confident that the data itself is correct.
In my test, I'm using this third party app to compress the following simple text file (in hex):
48 65 6c 6c 6f 20 57 6f 72 6c 64 21 0d 0a
The compressed bytes I receive from the app look like this (again, in hex):
78 9c f2 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 00 00 ff ff
If I try and compress the same data, I get very similar results:
78 9c f3 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 24 e9 04 55
There are two differences that I can see:
First, the fourth byte is F2, rather than F3, so the deflate "final block" bit has not been set. I assume this is because the stream interface never knows when the end of the incoming data will be, so never sets that bit?
Finally, the last four bytes in the external data is 00 00 FF FF, whereas in my test data it is 24 E9 04 55. Searching around I found on this page
http://www.bolet.org/~pornin/deflate-flush.html
...that this is a signature of a sync or full flush.
When I try and decompress my own data using the decompress() function, everything works perfectly. However, when I try and decompress the external data the decompress() function call fails with a return code of Z_DATA_ERROR, indicating corrupt data.
I have a few questions:
Should I be able to use the zlib "uncompress" function to uncompress data that has been compressed with the z_stream method?
In the example above, what is the significance of the last four bytes? Given that both the externally compressed data stream and my own test data stream are the same length, what do my last four bytes represent?
Cheers
Thanks to the zlib authors, I have found the answer. The third party app is generating zlib streams that are not finished correctly:
78 9c f2 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 00 00 ff ff
That is a partial zlib stream,
consisting of a zlib header and a
partial deflate stream. There are two
blocks, neither of which is a last
block. The second block is an empty
stored block, used as a marker when
flushing. A zlib decoder would
correctly decode what's there, and
then continue to look for data after
those bytes.
78 9c f3 48 cd c9 c9 57 08 cf 2f ca 49 51 e4 e5 02 00 24 e9 04 55
That is a complete zlib stream,
consisting of a zlib header, a single
block marked as the last block, and a
zlib trailer. The trailer is the
Adler-32 checksum of the uncompressed
data.
So My decompression is failing - probably because the CRC is missing, or the decompression code keeps looking for more data that does not exist.
solution is here:
http://technology.amis.nl/2010/03/13/utl_compress-gzip-and-zlib/
this is decompression and compression functions for start with 78 9C signature
compressed database blob (or stream).