Replace method changes size of QByteArray - c++

I want to manipulate a 32 bit write command which I have stored in a QByteArray. But the thing that confuses me is that my QByteArray changes size and I cannot figure out why that happens.
My code:
const char CMREFCTL[] = {0x85,0x00,0x00,0x0B};
QByteArray test = QByteArray::fromRawData(CMREFCTL, sizeof(CMREFCTL));
qDebug()<<test.toHex();
const char last1 = 0x0B;
const char last2 = 0x0A;
test.replace(3,1,&last2);
qDebug()<<test.toHex();
test.replace(3,1,&last1);
qDebug()<<test.toHex();
Generates:
"0x8500000b"
"0x8500000a0ba86789"
"0x8500000ba867890ba86789"
I expected the following output:
"0x8500000b"
"0x8500000a"
"0x8500000b"
Using test.replace(3,1,&last2,1) works, but I don't see why my code above doesn't give the same result.
Best regards!

Here is the documentation for the relevant method:
QByteArray & QByteArray::replace ( int pos, int len, const char * after )
This is an overloaded function.
Replaces len bytes from index position pos with the zero terminated string after.
Notice: this can change the length of the byte array.
You are not giving the byte array a zero-terminated string, but a pointer to a single char. So it will scan forward in memory from that pointer until it hits a 0, and treat all that memory as the string to replace with.
If you just want to change a single byte, test[3] = last2; should do what you want.
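The same pair of overloads exists on std::string, so the effect can be sketched without Qt. This is only an analogy, not the QByteArray implementation: replace_one_byte is a hypothetical helper, and it uses the overload that takes an explicit count, so nothing is read beyond the single char it is given.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: overwrite exactly one byte at `pos`.
// The four-argument overload takes an explicit length (1), so the
// replacement is never scanned for a terminating zero byte.
std::string replace_one_byte(std::string bytes, std::size_t pos, char value) {
    bytes.replace(pos, 1, &value, 1);
    return bytes;
}
```

Calling the three-argument overload with &value instead would treat the pointer as a zero-terminated string, which is the mistake described above.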

Related

Subsetting char array without copying it in C++

I have a long array of char (coming from a raster file via GDAL), all composed of 0 and 1. To compact the data, I want to convert it to an array of bits (thus dividing the size by 8), 4 bytes at a time, writing the result to a different file. This is what I have come up with by now:
uint32_t bytes2bits(char b[33]) {
b[32] = 0;
return strtoul(b,0,2);
}
const char data[36] = "00000000000000000000000010000000101"; // 101 is to be ignored
char word[33];
strncpy(word,data,32);
uint32_t byte = bytes2bits(word);
printf("Data: %d\n",byte); // 128
The code is working, and the result is going to be written in a separate file. What I'd like to know is: can I do that without copying the characters to a new array?
EDIT: I'm using a const variable here just to make a minimal, reproducible example. In my program it's a char *, which is continually changing value inside a loop.
Yes, you can, as long as you can modify the source string (in your example code you can't because it is a constant, but I assume in reality you have the string in writable memory):
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
// You would need to make sure that the `data` argument always has
// at least 33 characters in length (the null terminator at the end
// of the original string counts)
char temp = data[32];
data[32] = 0;
uint32_t byte = bytes2bits(data);
data[32] = temp;
printf("Data: %d\n",byte); // 128
}
In this example, because the long data is held in a writable char* buffer, there is no need to copy each part into a temporary buffer before converting it to a long.
Just use a variable to step through the buffer in periods of 32 characters; the character right after each period must temporarily be replaced by the terminating 0 byte.
So your code would look like:
uint32_t bytes2bits(const char* b) {
    return strtoul(b,0,2);
}
void compress (char* data) {
    size_t dataLen = strlen(data);
    const size_t periodLen = 32;
    char* periodStr = data;       // start of the current period
    size_t periodPos = periodLen; // index of the first character after the current period
    char tmp;
    uint32_t byte;
    while(periodPos < dataLen)
    {
        tmp = data[periodPos];
        data[periodPos] = 0;      // temporarily terminate the current period
        byte = bytes2bits(periodStr);
        printf("Data: %u\n",byte); // 128
        data[periodPos] = tmp;    // restore the overwritten character
        periodStr = data + periodPos;
        periodPos += periodLen;
    }
    if(*periodStr != 0)
    {
        // the last period ends at the string's own terminator
        byte = bytes2bits(periodStr);
        printf("Data: %u\n",byte);
    }
}
Then be careful with the last period, which could be shorter than 32 characters.
const char data[36]
You are in violation of your contract with the compiler if you declare something as const and then modify it.
Generally speaking, the compiler won't let you modify it...so to even try to do so with a const declaration you'd have to cast it (but don't)
char *sneaky_ptr = (char*)data;
sneaky_ptr[0] = 'U'; /* the U is for "undefined behavior" */
See: Can we change the value of an object defined with const through pointers?
So if you wanted to do this, you'd have to be sure the data was legitimately non-const.
The right way to do this in modern C++ is by using std::string to hold your string and std::string_view to process parts of that string without copying it.
You can use string_view with the char array you already have, though. It is commonly used to modernize code that works with classical null-terminated const char* strings.
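A minimal sketch of the string_view idea, assuming C++17: std::from_chars parses a character range without needing a null terminator, so neither a copy nor a temporary write into the buffer is required. pack_bits is a hypothetical helper name. Unlike the question's example, it also packs a trailing chunk shorter than 32 digits instead of ignoring it.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: converts a run of '0'/'1' characters into 32-bit words.
std::vector<std::uint32_t> pack_bits(std::string_view digits) {
    std::vector<std::uint32_t> words;
    for (std::size_t pos = 0; pos < digits.size(); pos += 32) {
        // substr on a string_view only adjusts pointer and length; no copy is made.
        std::string_view chunk = digits.substr(pos, 32);
        const char* last = chunk.data() + chunk.size();
        std::uint32_t value = 0;
        // Base 2 parse of [chunk.data(), last); no terminator needed.
        auto result = std::from_chars(chunk.data(), last, value, 2);
        if (result.ec != std::errc{} || result.ptr != last)
            throw std::invalid_argument("input must contain only '0' and '1'");
        words.push_back(value);
    }
    return words;
}
```

Because string_view can be constructed from a pointer and a length, the same function works on a char* that changes inside a loop.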

Convert char * to QString and remove zeros

In my app I read a string field from a file in local (not Unicode) charset.
The field is 10 bytes long; the remainder is filled with zeros if the string is shorter than 10 bytes.
char str ="STRING\0\0\0\0"; // that was read from file
QByteArray fieldArr(str,10); // fieldArr now is STRING\000\000\000\000
fieldArr = fieldArr.trimmed(); // for some reason the array still contains zeros
QTextCodec *textCodec = QTextCodec::codecForLocale();
QString field = textCodec->toUnicode(fieldArr).trimmed(); // also does not remove the zeros
So my question: how can I remove trailing zeros from a string?
P.S. I see the zeros in the "Locals and Expressions" window while debugging.
I'm going to assume that str is supposed to be char const * instead of char.
Just don't go through QByteArray: QTextCodec can handle a C string, which ends at the first null byte:
QString field = textCodec->toUnicode(str).trimmed();
Addendum: Since the string might not be zero-terminated, adding storage for a null byte to the end seems to be impossible, and making a copy to prepare for making a copy seems wasteful, I suggest calculating the length ourselves and using the toUnicode overload that accepts a char pointer and a length.
std::find is good for this, since it returns the ending iterator of the given range if an element is not found in it. This makes special-case handling unnecessary:
QString field = textCodec->toUnicode(str, std::find(str, str + 10, '\0') - str).trimmed();
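The std::find step can be sketched on its own, without Qt. field_to_string is a hypothetical helper that returns the bytes of a fixed-width field up to the first zero byte, or the whole field when it contains none.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: trims the zero padding of a fixed-width field.
std::string field_to_string(const char* field, std::size_t width) {
    // std::find returns field + width when no '\0' is present,
    // so a completely filled field needs no special case.
    const char* end = std::find(field, field + width, '\0');
    return std::string(field, end);
}
```

The difference end - field is the length that the toUnicode call above receives.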
Does this work for you?
#include <QDebug>
#include <QByteArray>
int main()
{
char str[] = "STRING\0\0\0\0";
auto ba = QByteArray::fromRawData(str, 10);
qDebug() << ba.trimmed(); // does not work
qDebug() << ba.simplified(); // does not work
auto index = ba.indexOf('\0');
if (index != -1)
ba.truncate(index);
qDebug() << ba;
return 0;
}
Using fromRawData() saves an extra copy. Make sure that the str
stays around until you delete the ba.
indexOf() is safe even if you have filled the whole str since
QByteArray knows you only have 10 bytes you can safely access. It
won't touch 11th or later. No buffer overrun.
Once you have removed the extra \0 bytes, it's trivial to convert to a QString.
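You can truncate the string at the first \0:
const char * str = "STRING\0\0\0\0"; // Assuming that was read from file
QString field = QString::fromLocal8Bit(str, 10); // explicit length, so field keeps the '\0' padding
int nul = field.indexOf(QChar::Null);
if (nul != -1) // truncate(-1) would clear the whole string
    field.truncate(nul); // field == "STRING" (without '\0' at the end)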
I would do it like this:
const char* str = "STRING\0\0\0\0";
QByteArray fieldArr;
for(quint32 i = 0; i < 10; i++)
{
if(str[i] != '\0')
{
fieldArr.append(str[i]);
}
}
QString can be constructed from a char array pointer using fromLocal8Bit. The codec is chosen the same way you do manually in your code.
You need to set the length manually to 10, since you say you have no guarantee that a terminating null byte is present.
Then you can use remove() to get rid of all null bytes. Caution: STRI\0\0\0\0NG will also result in STRING but you said that this does not happen.
const char *str = "STRING\0\0\0\0"; // that was read from file
QString field = QString::fromLocal8Bit(str, 10);
field.remove(QChar::Null);

String is not null terminated error

I'm getting a "string is not null terminated" error, though I'm not entirely sure why. The use of std::string in the second part of the code is one of my attempts to fix this problem, although it still doesn't work.
My initial code just used the buffer and copied everything into client_id[]. The error then occurred. If the error is correct, that means either client_id or theBuffer does not have a null terminator. I'm pretty sure client_id is fine, since I can see it in debug mode. The strange thing is that the buffer also has a null terminator. I have no idea what is wrong.
char * next_token1 = NULL;
char * theWholeMessage = &(inStream[3]);
theTarget = strtok_s(theWholeMessage, " ",&next_token1);
sendTalkPackets(next_token1, sizeof(next_token1) + 1, id_clientUse, (unsigned int)std::stoi(theTarget));
The body of sendTalkPackets is below. I'm getting the "string is not null terminated" error at the last line.
void ServerGame::sendTalkPackets(char * buffer, unsigned int buffersize, unsigned int theSender, unsigned int theReceiver)
{
std::string theMessage(buffer);
theMessage += "0";
const unsigned int packet_size = sizeof(Packet);
char packet_data[packet_size];
Packet packet;
packet.packet_type = TALK;
char client_id[MAX_MESSAGE_SIZE];
char theBuffer[MAX_MESSAGE_SIZE];
strcpy_s(theBuffer, theMessage.c_str());
//Quick hot fix for error "string not null terminated"
const char * test = theMessage.c_str();
sprintf_s(client_id, "User %s whispered: ", Usernames.find(theSender)->second.c_str());
printf("This is it %s ", buffer);
strcat_s(client_id, buffersize , theBuffer);
Methinks that problem lies in this line:
sendTalkPackets(next_token1, sizeof(next_token1) + 1, id_clientUse, (unsigned int)std::stoi(theTarget));
sizeof(next_token1) + 1 will always give 5 (on a 32-bit platform) because sizeof returns the size of the pointer, not the size of the char array.
One thing which could be causing this (or other problems): As
buffersize, you pass sizeof(next_token1) + 1. next_token1 is
a pointer, which will have a constant size of (typically) 4 or 8. You
almost certainly want strlen(next_token1) + 1. (Or maybe without the
+ 1; conventions for passing sizes like this generally only include
the '\0' if it is an output buffer.) There are a couple of other
places where you're using sizeof, which may have similar problems.
But it would probably be better to redo the whole logic to use
std::string everywhere, rather than all of these C routines. No
worries about buffer sizes and '\0' terminators. (For protocol
buffers, I've also found std::vector<char> or std::vector<unsigned char>
quite useful. This was before the memory in std::string was
guaranteed to be contiguous, but even today, it seems to correspond more
closely to the abstraction I'm dealing with.)
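A small sketch of both points, under stated assumptions: the first two functions only contrast sizeof on a pointer with strlen, and make_whisper is a hypothetical std::string replacement for the sprintf_s/strcat_s pair in the question, not the poster's actual packet code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// What the question passes: the size of the pointer itself (typically 4 or 8),
// no matter how long the text is.
std::size_t size_of_pointer(const char* text) {
    return sizeof(text);
}

// What was probably intended: the number of characters plus the '\0'.
std::size_t size_with_terminator(const char* text) {
    return std::strlen(text) + 1;
}

// Hypothetical helper: builds the whisper text without fixed-size buffers.
std::string make_whisper(const std::string& sender_name, const char* message) {
    return "User " + sender_name + " whispered: " + message;
}
```

With std::string, the final copy into the packet only needs the string's size() plus one byte for the terminator returned by c_str().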
You can't just do
std::string theMessage(buffer);
theMessage += "0";
This fails on two fronts:
The std::string constructor doesn't know where buffer ends, if buffer is not 0-terminated. So theMessage will potentially be garbage and include random stuff until some zero byte was found in the memory beyond the buffer.
Appending string "0" to theMessage doesn't help. What you want is to put a zero byte somewhere, not value 0x30 (which is the ascii code for displaying a zero).
The right way to approach this is to poke a literal zero byte buffersize slots beyond the start of the buffer. You can't do that in buffer itself, because buffer may not be large enough to accommodate that extra zero byte. A possibility is:
char *newbuffer = static_cast<char *>(malloc(buffersize + 1)); // the cast is required in C++
strncpy(newbuffer, buffer, buffersize);
newbuffer[buffersize] = 0; // literal zero value
Or you can construct a std::string, whichever you prefer.
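The std::string alternative mentioned above might look like the following sketch. copy_bounded is a hypothetical helper; it relies on the std::string constructor that takes a pointer and an explicit length, so the source buffer does not need to be terminated.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly `buffersize` bytes from `buffer`.
// The resulting string manages its own storage, and c_str() always
// returns a zero-terminated view of it.
std::string copy_bounded(const char* buffer, std::size_t buffersize) {
    return std::string(buffer, buffersize);
}
```

Any zero bytes inside the first buffersize bytes are kept as part of the string, so functions that expect a C string will still stop at the first of them.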

searching an unsigned char array for characters

I have a binary data file that I am trying to read. The values in the file are 8-bit unsigned integers, with "record" delimiters that are ASCII text ($MSG, $GRP, for example). I read the data as one big chunk, as follows:
unsigned char *inBuff = (unsigned char*)malloc(file_size*sizeof(unsigned char));
result = fread(inBuff, sizeof(unsigned char), file_size, pFile);
I need to search this array to find records that start with $GRP (so I can then read the data that follows), can someone suggest a good way to do this? I have tried several things, and none of them have worked. For example, my most recent attempt was:
std::stringstream str1;
str1 << inBuff;
std::string strTxt = str1.str();
However, when I check the length on this, it is only 5. I looked at the file in Notepad, and noticed that the sixth character is a NULL. So it seems like it is cutting off there because of the NULL. Any ideas?
The return value of fread tells you how many bytes were actually read and are therefore available to search; it is smaller than file_size if the read failed or hit the end of the file early.
It is unreasonable to expect a string search to work on binary data, as there may be NUL characters in the binary data, which will cause the length function to terminate early.
One possible way to search the data is to use memcmp at each position in the buffer, with your search key and the length of the search key.
(As per my comment)
C str functions assume zero-terminated strings. Any C string function will stop at the very first binary 0. Use memchr to locate the $ and then use strncmp or memcmp. In particular, do not assume the byte immediately after the 4-byte identifier is a binary 0.
In code (C, not tested):
/* recordId should point to a simple string such as "$GRP" */
unsigned char *find_record (unsigned char *data, size_t max_length, const char *recordId)
{
    unsigned char *ptr = data;
    size_t id_length = strlen(recordId);
    size_t remaining_length = max_length;
    if (id_length == 0 || id_length > max_length)
        return NULL;
    /* stop as soon as the identifier can no longer fit in what is left */
    while (remaining_length >= id_length)
    {
        /* fast scan for the first character only; the scan range is limited
           so that a hit always leaves room for the whole identifier */
        ptr = (unsigned char *)memchr (ptr, recordId[0], remaining_length - id_length + 1);
        if (!ptr)
            return NULL;
        /* first character matches, test entire string */
        if (!memcmp (ptr, recordId, id_length))
            return ptr;
        /* no match; test onwards from the next possible position */
        ptr++;
        /* ptr never passes data + max_length, so this cannot go negative */
        remaining_length = max_length - (size_t)(ptr - data);
    }
    return NULL;
}
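In C++, the same search can be sketched with std::search, which compares ranges element by element and is therefore unaffected by zero bytes in the data. find_records is a hypothetical helper that returns the offset of every occurrence of the identifier.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: offsets of every occurrence of `id` in a binary buffer.
std::vector<std::size_t> find_records(const unsigned char* data, std::size_t size,
                                      std::string_view id) {
    std::vector<std::size_t> offsets;
    if (id.empty())
        return offsets;
    const unsigned char* end = data + size;
    const unsigned char* pos = data;
    while (true) {
        // Range-based search: zero bytes are ordinary values here.
        pos = std::search(pos, end, id.begin(), id.end(),
                          [](unsigned char a, char b) {
                              return a == static_cast<unsigned char>(b);
                          });
        if (pos == end)
            break;
        offsets.push_back(static_cast<std::size_t>(pos - data));
        ++pos; // continue after the start of this match
    }
    return offsets;
}
```

For the buffer in the question this would be called as find_records(inBuff, result, "$GRP"), with result being the byte count returned by fread.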

Using istringstream to process a memory block of variable length

I'm trying to use istringstream to recreate an encoded wstring from some memory. The memory is laid out as follows:
1 byte to indicate the start of the wstring encoding. Arbitrarily this is '!'.
n bytes to store the character length of the string in text format, e.g. 0x31, 0x32, 0x33 would be "123", i.e. a 123-character string
1 byte separator (the space character)
n bytes which are the wchars which make up the string, where wchar_t's are 2-bytes each.
For example, the byte sequence:
21 36 20 66 00 6f 00 6f 00
is "!6 f.o.o." (using dots to represent char 0)
All I've got is a char* pointer (let's call it pData) to the start of the memory block with this encoded data in it. What's the 'best' way to consume the data to reconstruct the wstring ("foo"), and also move the pointer to the next byte past the end of the encoded data?
I was toying with using an istringstream to allow me to consume the prefix byte, the length of the string, and the separator. After that I can calculate how many bytes to read and use the stream's read() function to insert into a suitably-resized wstring. The problem is, how do I get this memory into the istringstream in the first place? I could try constructing a string first and then pass that into the istringstream, e.g.
std::string s((const char*)pData);
but that doesn't work because the string is truncated at the first null byte. Or, I could use the string's other constructor to explicitly state how many bytes to use:
std::string s((const char*)pData, len);
which works, but only if I know what len is beforehand. That's tricky given that the data is variable length.
This seems like a really solvable problem. Does my rookie status with strings and streams mean I'm overlooking an easy solution? Or am I barking up the wrong tree with the whole string approach?
Try setting your stringstream's rdbuf:
char* buffer = something;
std::stringstream ss;
std::stringbuf *pbuf = ss.rdbuf(); // the stream's own buffer
pbuf->sputn(buffer, bufferlength); // copies bufferlength bytes, embedded zeros included
// use your ss
Edit: I see that this solution will have a similar problem to your string(char*, len) situation. Can you tell us more about your buffer object? If you don't know the length, and it isn't null terminated, it's going to be very hard to deal with.
Is it possible to modify how you encode the length, and make that a fixed size?
unsigned long size = 6; // known string length
char* buffer = new char[1 + sizeof(unsigned long) + 1 + size];
buffer[0] = '!';
memcpy(buffer+1, &size, sizeof(unsigned long));
buffer should hold the start indicator (1 byte), the actual size (size of unsigned long), the delimiter (1 byte) and the text itself (size).
This way, you can get the size fairly easily, then set the pointer to point beyond the overhead, and then use the len variable in the string constructor.
unsigned long len;
memcpy(&len, pData+1, sizeof(unsigned long)); // +1 to avoid the start indicator
// len now contains 6
char* actualData = pData + 1 + sizeof(unsigned long) + 1;
std::string s(actualData, len);
It's low level and error prone :) (for instance if you read anything that isn't encoded the way that you expect it to be, the len can get pretty big), but you avoid dynamically reading the length of the string.
It seems like something on this order should work:
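std::wstring make_string(char const *input) {
    if (*input != '!')
        return std::wstring();
    // Assumes the length is stored as a single raw byte directly after the '!'
    // and that the payload is suitably aligned for wchar_t.
    char length = *++input;
    return std::wstring(reinterpret_cast<wchar_t const *>(++input), length);
}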
The difficult part is dealing with the variable length of the size. Without something to specify the length it's hard to guess when to stop treating the data as specifying the length of the string.
As for moving the pointer, if you're going to do it inside a function, you'll need to pass a reference to the pointer, but otherwise it's a simple matter of adding the size you found to the pointer you received.
It's tempting to (ab)use the (deprecated but nevertheless standard) std::istrstream here:
// Maximum size to read is
// 1 for the exclamation mark
// Digits for the character count (digits10() + 1)
// 1 for the space
const std::streamsize max_size = 3 + std::numeric_limits<std::size_t>::digits10;
std::istrstream s(buf, max_size);
if (std::istream::traits_type::to_char_type(s.get()) != '!'){
throw "missing exclamation";
}
std::size_t size;
s >> size;
if (std::istream::traits_type::to_char_type(s.get()) != ' '){
throw "missing space";
}
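// str() points at the start of the buffer, so skip what has already been consumed
std::wstring result(reinterpret_cast<wchar_t*>(s.rdbuf()->str() + static_cast<std::streamoff>(s.tellg())), size/sizeof(wchar_t));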
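For comparison, the format can also be decoded by hand without any stream. The sketch below rests on several assumptions that are not in the question: the end of the memory block is known (`end`), the number after '!' is a byte count as in the "!6 f.o.o." example, the 2-byte units are in the host's byte order, and std::u16string stands in for a 2-byte wchar_t. decode_record is a hypothetical helper and does not guard against overflow of very long digit runs.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical helper: decodes one "!<byte count> <payload>" record at `cursor`.
// On success `cursor` is moved to the first byte past the record.
std::optional<std::u16string> decode_record(const char*& cursor, const char* end) {
    const char* p = cursor;
    if (p == end || *p != '!')
        return std::nullopt;
    ++p;
    // Read the decimal byte count up to the separator.
    std::size_t byte_count = 0;
    const char* digits_begin = p;
    while (p != end && *p >= '0' && *p <= '9') {
        byte_count = byte_count * 10 + static_cast<std::size_t>(*p - '0');
        ++p;
    }
    if (p == digits_begin || p == end || *p != ' ')
        return std::nullopt;
    ++p;
    // The payload must consist of whole 2-byte units and fit in the block.
    if (byte_count % 2 != 0 || static_cast<std::size_t>(end - p) < byte_count)
        return std::nullopt;
    std::u16string text(byte_count / 2, u'\0');
    // memcpy sidesteps the alignment problems of casting the char pointer.
    std::memcpy(text.data(), p, byte_count);
    cursor = p + byte_count;
    return text;
}
```

Passing the pointer by reference is what lets the function advance it, as suggested earlier in this thread.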