How to convert QByteArray to a byte string?

I have a QByteArray object with 256 bytes inside of it. However, when I try to convert it to a byte string (std::string), the resulting length is anywhere from 0 to 256. The string's length varies from run to run, even though there are always 256 bytes in the array. This data is encrypted, so I expect 256 characters of garbage output, but instead I get a seemingly random number of bytes.
Here is the code I used to try to convert it:
// Fills array with 256 bytes (I have tried read(256) and got the same random output)
QByteArray byteRecv = socket->read(2048);
// Gives out random garbage (not the 256-character garbage that I need)
string recv = byteRecv.constData();
The socket object is a QTcpSocket* if it's necessary to know.
Is there any way I can get an exact representation of what's in the array? I have tried converting it to a QString and using the QByteArray::toStdString() method, but neither of those worked to solve the problem.

QByteArray::constData() member function returns a raw pointer const char*. The constructor of std::string from a raw pointer
std::string(const char* s);
constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. If s does not point to such a string, the behaviour is undefined.
Your buffer is not a null-terminated string and can contain null characters in the middle. So you should use another constructor
std::string(const char* s, size_type count);
that constructs the string with the first count characters of character string pointed to by s.
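That is, passing the array's actual size rather than a hard-coded 256 (read() may return fewer bytes than expected):
std::string recv(byteRecv.constData(), byteRecv.size());
For a collection of raw bytes, std::vector might be a better choice. You can construct it from two pointers:
std::vector<char> recv(byteRecv.constData(), byteRecv.constData() + byteRecv.size());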
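The difference between the two constructors can be seen without Qt. The following is a minimal sketch in which a plain char buffer stands in for the QByteArray contents; the helper names bytesToString and cStringToString are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <string>

// Copies exactly `size` bytes, so embedded '\0' bytes survive.
std::string bytesToString(const char* data, std::size_t size) {
    return std::string(data, size);
}

// Shown for contrast only: copying stops at the first '\0'.
// `data` must point to a null-terminated string.
std::string cStringToString(const char* data) {
    return std::string(data);
}
```

With a QByteArray, the first helper would presumably be called as bytesToString(byteRecv.constData(), static_cast<std::size_t>(byteRecv.size())).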

Related

C++ Windows function "LockResource()" returns half the data in the resource

I am trying to read an embedded resource from a DLL; it contains an encrypted file. Reading it via LockResource() only returns half the data.
The funny thing is that I checked SizeofResource() and the size of the resource is what it is supposed to be.
So I tried to access the file without it being an embedded resource:
std::ifstream enc("Logs.enc" , std::ios::binary); // Accessing encrypted file
std::string ciphertext = std::string((std::istreambuf_iterator<char>(enc)), std::istreambuf_iterator<char>());
int size = ciphertext.size(); // Returns the correct size
This worked. Looking for something the two cases have in common, I removed std::ios::binary, and the behavior was similar to accessing the file as a resource.
Here is my attempt to access it as a resource:
HGLOBAL SHEET_DATA; // Imagine this has the encrypted file
if (SHEET_DATA) {
    char* datac = nullptr;
    datac = (char*)LockResource(SHEET_DATA);
    std::string data = datac;
    long size_sheet = SizeofResource(dll, SHEET);
    int real_size = data.size(); // Returns the wrong size
}
I searched for something like a LockResource() function that accesses the data in binary mode, but I couldn't find any results.
Thank you

strlen is assuming the parameter is a zero terminated string. It counts the chars until it gets to the zero termination.
In your case it seems like the resource is binary. In this case it may contain bytes with the value 0, which strlen treats as the end of the string.
Therefore what strlen returns is irrelevant. You can use size_sheet returned from SizeofResource to know the size of the data pointed by datac.
Update:
The updated question does not contain a usage of strlen anymore. But the line:
std::string data = datac;
creates a similar problem. Initializing a std::string from a char* assumes the char* points to a zero-terminated string. So if the buffer contains zeroes, the resulting string will contain only the characters up to the first zero.
You can initialize the std::string the following way to avoid the assumption of the zero termination:
std::string data(datac, size_sheet);
Giving the length of the buffer to the ctor of std::string will force initializing with the complete buffer (ignoring the zeroes).
Update 2: As @IInspectable commented below, if the data is not really a string, it is better to hold it in a more suitable container, e.g. std::vector<char>. It can be constructed from a pair of pointers delimiting the buffer: std::vector<char> data(datac, datac + size_sheet);
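For binary data that arrives as an untyped pointer plus a size (as with LockResource() and SizeofResource()), copying into a byte vector could look like the sketch below. The helper name copyBytes is hypothetical, and the choice of std::uint8_t as the element type is an assumption made here rather than something the answer prescribes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies `size` raw bytes starting at `data` into an owning container.
// No byte value is treated specially, so zeros are preserved.
std::vector<std::uint8_t> copyBytes(const void* data, std::size_t size) {
    const auto* first = static_cast<const std::uint8_t*>(data);
    return std::vector<std::uint8_t>(first, first + size);
}
```

The vector's size() then always matches the size that was passed in, regardless of what the bytes contain.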

The problem is this line:
std::string data = datac;
This constructs a std::string from a null-terminated string. But datac is not a null-terminated string, as you said it's binary data. Instead, use the string (const char* s, size_t n); ctor overload:
std::string data(datac, size_sheet);

Get the length of a string containing Null terminated string

I'm using the XOR encryption so when I'm going to decrypt my string I need to get the length of that string.
I tried in this way:
string to_decode = "abcd\0lom";
int size = to_decode.size();
or in this way:
string to_decode = "abcd\0lom";
int size = to_decode.length();
Both are wrong because the string contains \0.
So how can I have the right length of my string?

The problem is with the initialisation, not with the size. If you use the constructor taking a const char *, it interprets that argument as a NUL-terminated string. So your std::string is only initialised with the string abcd.
You need to use a range-based constructor:
const char data[] = "abcd\0lom";
std::string to_decode(data, data + (sizeof data) - 1); // -1 to not include terminating NUL
However, be careful with such strings. While std::string can deal with embedded NULs perfectly fine, the result of c_str() will behave as "truncated" as far as all NUL-terminated APIs are concerned.
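The "truncated" behaviour of c_str() can be made concrete with a small sketch. The function names are hypothetical; the example relies on the standard `s` string-literal suffix (C++14), which keeps embedded NULs because the literal's length is known at compile time.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by a NUL-terminated API: counts up to the first '\0'.
std::size_t lengthSeenByCApi(const std::string& s) {
    return std::strlen(s.c_str());
}

// Builds the example string; the `s` suffix preserves the embedded '\0'.
std::string makeExample() {
    using namespace std::string_literals;
    return "abcd\0lom"s;
}
```

Here makeExample().size() reports all 8 characters, while lengthSeenByCApi(makeExample()) stops after "abcd".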

When you initialize the std::string from a literal with a \0 in the middle, you lose all data after it. The constructor taking a const char* treats its argument as a C string, and that is terminated by the first \0. If the \0 doesn't have any meaning in the string, you could escape the backslash, like this:
string to_decode = "abcd\\0lom";
and the size would be 9. Otherwise, you could use a container (e.g. std::vector) of chars for the data storage.

As others have said, the problem is that the code uses the constructor that takes const char*, and that only copies up to the \0. But, by a very strange coincidence, std::string has a constructor that can handle that case:
const char text[] = "abcd\0lom";
std::string to_decode(text, sizeof(text) - 1);
int size = to_decode.size();
The constructor will copy as many characters as you tell it to.
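Since the question is about XOR encryption, it may help to see why zero bytes show up at all: any byte equal to the key becomes '\0' after the XOR. The sketch below assumes a single-byte key and uses a hypothetical helper name, xorWithKey. It iterates over the string's own size, so embedded NULs neither shorten the input nor the output.

```cpp
#include <string>

// XORs every byte of `data` with `key`. Applying it twice with the
// same key restores the original bytes.
std::string xorWithKey(std::string data, char key) {
    for (char& c : data) {
        c = static_cast<char>(c ^ key);
    }
    return data;
}
```

As long as the result is kept in a std::string (or passed around with its length), size() remains correct for decryption.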

How to initialize std::string with char* containing null values

I have a function that has the following signature
void serialize(const string& data)
I have an array of characters with possible null values
const char* serializedString
(so some characters have the value '\0')
I need to call the given function with the given string!
What I do to achieve that is as following:
string messageContents = string(serializedString);
serialize(messageContents.c_str());
The problem is the following. The string assignment ignores all characters occurring after the first '\0' character.
Even if I call size() on the string, I get the number of elements before the first '\0'.
P.S. I know the 'real' size of the char array (the whole size of the array containing the characters, including '\0' characters)
So how do I call the method correctly?

Construct the string with the length so it doesn't only contain the characters up to the first '\0' i.e.
string messageContents = string(serializedString, length);
or simply:
string messageContents(serializedString, length);
And stop calling c_str(), serialize() takes a string so pass it a string:
serialize(messageContents);
Otherwise you'll construct a new string from the const char*, and that will only read up to the first '\0' again.
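The effect of passing c_str() instead of the string itself can be sketched as follows. serializedSize is a hypothetical stand-in for serialize() that only reports how many bytes it received, and makeMessage is likewise an illustrative name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for serialize(): returns the number of bytes
// that actually arrived in `data`.
std::size_t serializedSize(const std::string& data) {
    return data.size();
}

// Builds the message from a pointer and an explicit length.
std::string makeMessage(const char* bytes, std::size_t length) {
    return std::string(bytes, length);
}
```

Calling serializedSize(msg) sees the full message, whereas serializedSize(msg.c_str()) silently builds a temporary std::string from the const char* and sees only the bytes before the first '\0'.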

Not sure why I am getting different lengths when using a string or a char

When I call gethostname using a char array my length is 25, but when I use a string my length is 64. Not really sure why. In both cases I am declaring the same size, HOST_NAME_MAX.
char hostname[HOST_NAME_MAX];
BOOL host = gethostname(hostname, sizeof hostname);
expectedComputerName = hostname;
int size2 = expectedComputerName.length();
std::string test(HOST_NAME_MAX, 0);
host = gethostname(&test[0], test.length());
int testSize = test.length();

An std::string object can contain NULs (i.e. '\0' characters). You are storing the name in the first bytes of a string object that was created with a size of HOST_NAME_MAX length.
Storing something at the beginning of the string's data does not change the string's length, which therefore remains HOST_NAME_MAX.
When a string is created from a char pointer instead, the resulting std::string object contains the characters up to, but excluding, the first NUL character (0x00). The reason is that a C string cannot contain NULs, because the first NUL marks the end of the string.

Consider what you're doing in each case. In the former code snippet, you're declaring a character array capable of holding HOST_NAME_MAX-1 characters (1 for the null terminator). You then load some string data into that buffer via the call to gethostname and then print out the length of buffer by assigning it to a std::string object using std::string::operator= that takes a const char *. One of the effects of this is that it will change an internal size variable of std::string to be strlen of the buffer, which is not necessarily the same as HOST_NAME_MAX. A call to std::string::length simply returns that variable.
In the latter case, you're using the std::string constructor that takes a size and initial character to construct test. This constructor sets the internal size variable to whatever size you passed in, which is HOST_NAME_MAX. The fact that you then copy in some data to std::string's internal buffer has no bearing on its size variable. As with the other case, a call to the length() member function simply returns the size, which is HOST_NAME_MAX, regardless of whether or not the actual length of the underlying buffer is smaller than HOST_NAME_MAX.
As @MattMcNabb mentioned in the comments, you could fix this by:
test.resize( strlen(test.c_str()) );
Why might you want to do this? Consistency with the char buffer approach might be one reason, but another may be performance. In the latter case you're not only setting the length of the string to HOST_NAME_MAX, but also its capacity (ignoring the SSO for brevity), which you can find starting on line 242 of libstdc++'s std::string implementation. In terms of performance, this means that even though only, say, 25 characters are actually in your test string, the next time you append to that string (via +=, std::string::append, etc.), it will more than likely have to reallocate and grow, because the internal size and internal capacity are equal. Following @MattMcNabb's suggestion, however, the string's internal size is reduced to the length of the actual payload while the capacity stays the same as before, so you avoid the almost immediate re-growth and re-copy of the string.
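A variation on the resize idea is sketched below, using std::string::find instead of strlen. This is an assumption-driven alternative rather than the commenter's exact suggestion: it leaves the string untouched when the buffer holds no NUL at all, which may matter if the C API filled the whole buffer without terminating it. The helper name trimAtFirstNul is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Shrinks `s` so that it ends just before its first '\0'.
// Leaves `s` unchanged if it contains no '\0'.
void trimAtFirstNul(std::string& s) {
    const std::size_t pos = s.find('\0');
    if (pos != std::string::npos) {
        s.resize(pos);
    }
}
```

After a call such as gethostname(&test[0], test.length()), trimAtFirstNul(test) would bring test.length() in line with the length of the name that was written.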

Null bytes in char* in QByteArray with QDataStream

I discovered that the char* data in a QByteArray has null bytes. Code:
QByteArray arr;
QDataStream stream(&arr, QIODevice::WriteOnly);
stream << "hello";
Look at debugger variable view:
I don't understand why I have three empty bytes at the beginning. I know that byte [3] is the string length. Can I remove the last byte? I know it's a null-terminated string, but for my application I must have raw bytes (with one byte at the beginning to store the length).
Even weirder to me is what happens when I use QString:
QString str = "hello";
[rest of code same as above]
stream << str;
It doesn't have a null at the end, so I think maybe the null byte before each char indicates that the next byte is a char?
Just two questions:
Why so many null bytes?
How can I remove them, including the last null byte?

I don't understand why I have three empty bytes at the beginning.
It's a fixed-size, uint32_t (4-byte) header. It's four bytes so that it can specify data lengths as long as (2^32-1) bytes. If it was only a single byte, then it would only be able to describe strings up to 255 bytes long, because that's the largest integer value that can fit into a single byte.
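To see how three zero bytes plus one non-zero byte form a single length value, the header can be decoded by hand. The sketch below assumes the stream uses big-endian byte order (QDataStream's documented default) and uses a hypothetical helper name, readBigEndian32.

```cpp
#include <cstdint>

// Combines four bytes, most significant first, into one 32-bit value.
std::uint32_t readBigEndian32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}
```

For a short string, only the last of the four bytes is non-zero, which matches the three leading empty bytes seen in the debugger.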
Can I remove last byte? I know it's null-terminated string, but for my application I must have raw bytes (with one byte at beginning for store length).
Sure, as long as the code that will later parse the data array is not depending on the presence of a trailing NUL byte to work correctly.
More weird for me is when I use QString [...] it's don't have null at end, so I think maybe null bytes before each char informs that next byte is char?
Per the Qt serialization documentation page, a QString is serialized as:
- If the string is null: 0xFFFFFFFF (quint32)
- Otherwise: The string length in bytes (quint32) followed by the data in UTF-16.
If you don't like that format, instead of serializing the QString directly, you could do something like
stream << str.toUtf8();
instead, and that way the data in your QByteArray would be in a simpler format (UTF-8).
Why so much null bytes?
They are used in fixed-size header fields when the length-values being encoded are small; or to indicate the end of NUL-terminated C strings.
How I can remove it, including last null byte?
You could add the string in your preferred format (no NUL terminator but with a single length header-byte) like this:
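const char * hello = "hello";
// quint8 keeps lengths 128..255 from going negative, as a plain (possibly signed) char would
const quint8 slen = static_cast<quint8>(strlen(hello));
stream.writeRawData(reinterpret_cast<const char *>(&slen), 1);
stream.writeRawData(hello, slen);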
... but if you have the choice, I highly recommend just keeping the NUL-terminator bytes at the end of the strings, for these reasons:
- A single preceding length-byte will limit your strings to 255 bytes long (or less), which is an unnecessary restriction that will likely haunt you in the future.
- Avoiding the NUL-terminator byte doesn't actually save any space, because you've added a string-length byte to compensate.
- If the NUL-terminator byte is there, you can simply pass a pointer to the first byte of the string directly to any code that expects a C-style string, and it will be able to use the string immediately (without any data-conversion steps). If you rely on a different convention instead, you'll end up having to make a copy of the entire string before you can pass it to that code, just so that you can append a NUL byte to the end of the string so that the C-string-expecting code can use it. That will be CPU-inefficient and error-prone.
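The one-byte length-prefix layout discussed above, together with its 255-byte limit, can be sketched without Qt. The function name encodeShortString is hypothetical, and throwing std::length_error for oversized input is a design choice made for this sketch rather than anything the answer specifies.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Encodes `text` as one length byte followed by the raw characters,
// with no NUL terminator. Strings longer than 255 bytes cannot be
// represented in this format and are rejected.
std::vector<unsigned char> encodeShortString(const std::string& text) {
    if (text.size() > 255) {
        throw std::length_error("string too long for a one-byte length prefix");
    }
    std::vector<unsigned char> out;
    out.reserve(text.size() + 1);
    out.push_back(static_cast<unsigned char>(text.size()));
    for (char c : text) {
        out.push_back(static_cast<unsigned char>(c));
    }
    return out;
}
```

For "hello" this yields six bytes in total, the same count as the five characters plus a NUL terminator, which illustrates the point that the length byte saves no space.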
