How to initialize std::string with char* containing null values - c++

I have a function that has the following signature
void serialize(const string& data)
I have an array of characters with possible null values
const char* serializedString
(so some characters have the value '\0')
I need to call the given function with the given string!
What I do to achieve that is the following:
string messageContents = string(serializedString);
serialize(messageContents.c_str());
The problem is the following: the string assignment ignores all characters occurring after the first '\0' character.
Even if I call size() on the string, I get the number of elements before the first '\0'.
P.S. I know the 'real' size of the char array (the whole size of the array containing the characters, including the '\0' characters).
So how do I call the method correctly?

Construct the string with the length so it doesn't only contain the characters up to the first '\0' i.e.
string messageContents = string(serializedString, length);
or simply:
string messageContents(serializedString, length);
And stop calling c_str(): serialize() takes a string, so pass it a string:
serialize(messageContents);
Otherwise you'll construct a new string from the const char*, and that will only read up to the first '\0' again.
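To make the difference concrete, here is a minimal sketch. The serialize() below is only a hypothetical stand-in that reports how many bytes it received (the real function body is not shown in the question), and raw, rawLength and the two helper functions are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the real serialize(): it only reports how many
// bytes it was given, so the effect of each call style is visible.
std::size_t serialize(const std::string& data) {
    return data.size();
}

// Illustrative buffer: 7 bytes with embedded '\0' characters.
const char raw[] = {'a', 'b', '\0', 'c', 'd', '\0', 'e'};
const std::size_t rawLength = sizeof raw;

std::size_t sendWholeBuffer() {
    std::string messageContents(raw, rawLength);  // copies all 7 bytes
    assert(messageContents.size() == rawLength);
    return serialize(messageContents);            // passes the string itself
}

std::size_t sendTruncated() {
    std::string messageContents(raw, rawLength);
    // c_str() yields a const char*, so a temporary std::string is built from
    // it, and that construction stops at the first '\0'.
    return serialize(messageContents.c_str());
}
```

With this buffer, sendWholeBuffer() hands all 7 bytes to serialize(), while sendTruncated() hands over only the 2 bytes before the first '\0'.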

Related

How to convert QByteArray to a byte string?

I have a QByteArray object with 256 bytes inside of it. However, when I try to convert it to a byte string (std::string), it comes up with a length of 0 to 256. It is very inconsistent with how long the string is, but there are always 256 bytes in the array. This data is encrypted and as such I expect a 256-character garbage output, but instead I am getting a random number of bytes.
Here is the code I used to try to convert it:
// Fills array with 256 bytes (I have tried read(256) and got the same random output)
QByteArray byteRecv = socket->read(2048);
// Gives out random garbage (not the 256-character garbage that I need)
string recv = byteRecv.constData();
The socket object is a QTcpSocket* if it's necessary to know.
Is there any way I can get an exact representation of what's in the array? I have tried converting it to a QString and using the QByteArray::toStdString() method, but neither of those worked to solve the problem.
QByteArray::constData() member function returns a raw pointer const char*. The constructor of std::string from a raw pointer
std::string(const char* s);
constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. If s does not point to such a string, the behaviour is undefined.
Your buffer is not a null-terminated string and can contain null characters in the middle. So you should use another constructor
std::string(const char* s, std::size_type count);
that constructs the string with the first count characters of character string pointed to by s.
That is:
std::string recv(byteRecv.constData(), 256);
For a collection of raw bytes, std::vector might be a better choice. You can construct it from two pointers:
std::vector<char> recv(byteRecv.constData(), byteRecv.constData() + 256);
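The same distinction can be tried without Qt. In this sketch a plain char array stands in for the bytes that constData() would point at, and kSize stands in for the known byte count; both names and the byte values are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the bytes a QByteArray would hold (hypothetical data).
// The zero byte at index 3 mimics a '\0' inside encrypted output.
const char buffer[] = {'\x7f', '\x10', '\x42', '\0', '\x55', '\x01', '\0', '\x33'};
const std::size_t kSize = sizeof buffer;  // 8 bytes

// Pointer-only constructor: reads up to the first zero byte.
std::size_t lengthFromPointerOnly() {
    std::string s(buffer);
    return s.size();
}

// Pointer-plus-count constructor: copies exactly kSize bytes.
std::string stringWithCount() {
    return std::string(buffer, kSize);
}

// Two-pointer (range) constructor of std::vector: also copies every byte.
std::vector<char> vectorFromRange() {
    return std::vector<char>(buffer, buffer + kSize);
}
```

Only the pointer-only version depends on where a zero byte happens to fall, which would explain a length that varies from one read to the next.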

Why does std::string("\x00") report length of 0?

I have a function which needs to encode strings, which needs to be able to accept 0x00 as a valid 'byte'. My program needs to check the length of the string, however if I pass in "\x00" to std::string the length() method returns 0.
How can I get the actual length even if the string is a single null character?
std::string is perfectly capable of storing nulls. However, you have to be wary, as const char* is not, and you very briefly construct a const char*, from which you create the std::string.
std::string a("\x00");
This creates a constant C string containing only the null character, followed by a null terminator. But C strings don't know how long they are; so the string thinks it runs until the first null terminator, which is the first character. Hence, a zero-length string is created.
std::string b("");
b.push_back('\0');
std::string is null-clean: any of its characters can be the zero byte ('\0'). So here nothing stops the string from storing the null character correctly, and the length of b will be 1.
In general, you need to avoid constructing C strings containing null characters. If you read the input from a file directly into std::string or make sure to push the characters one at a time, you can get the result you want. If you really need a constant string with null characters, consider using some other sentinel character instead of \0 and then (if you really need it) replace those characters with '\0' after loading into std::string.
You're passing in an empty string. Use std::string(1, '\0') instead.
Or std::string{ '\0' } (thanks, @zett42)
With C++14, you can use a string literal operator to store strings with null bytes:
using namespace std::string_literals;
std::string a = "\0"s;
std::string aa = "\0\0"s; // two null bytes are supported too
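The options mentioned in these answers can be compared side by side. This is just a sketch; the function names are invented for the comparison, and the "\0"s form needs C++14 or later.

```cpp
#include <cassert>
#include <string>

using namespace std::string_literals;  // enables the "..."s suffix (C++14)

// const char* constructor: the literal is read as a C string, so it is empty.
std::string fromLiteral()  { return std::string("\x00"); }

// Count-and-character constructor: one copy of '\0'.
std::string fromCount()    { return std::string(1, '\0'); }

// Initializer-list constructor: a list holding the single character '\0'.
std::string fromInitList() { return std::string{'\0'}; }

// String literal operator: the literal's length is known at compile time.
std::string fromSuffix()   { return "\0"s; }

// Appending the character to an empty string.
std::string fromPushBack() {
    std::string b;
    b.push_back('\0');
    return b;
}

void checkLengths() {
    assert(fromLiteral().length() == 0);
    assert(fromCount().length() == 1);
    assert(fromInitList().length() == 1);
    assert(fromSuffix().length() == 1);
    assert(fromPushBack().length() == 1);
}
```

Only the first form goes through a bare const char*, which is why it is the only one that loses the null character.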

Get the length of a string containing Null terminated string

I'm using the XOR encryption so when I'm going to decrypt my string I need to get the length of that string.
I tried in this way:
string to_decode = "abcd\0lom";
int size = to_decode.size();
or in this way:
string to_decode = "abcd\0lom";
int size = to_decode.length();
Both are wrong because the string contains \0.
So how can I have the right length of my string?
The problem is with the initialisation, not with the size. If you use the constructor taking a const char *, it interprets that argument as a NUL-terminated string. So your std::string is only initialised with the string abcd.
You need to use a range-based constructor:
const char data[] = "abcd\0lom";
std::string to_decode(data, data + (sizeof data) - 1); // -1 to not include terminating NUL
However, be careful with such strings. While std::string can deal with embedded NULs perfectly fine, the result of c_str() will behave as "truncated" as far as all NUL-terminated APIs are concerned.
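The c_str() caveat can be seen with std::strlen, used here as a representative NUL-terminated API. This is a small sketch; bytes, makeFull and the two size helpers are names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

const char bytes[] = "abcd\0lom";  // 8 characters plus the terminating NUL

// Range constructor: keeps the embedded NUL and everything after it.
std::string makeFull() {
    return std::string(bytes, bytes + sizeof bytes - 1);
}

// What std::string itself reports.
std::size_t sizeSeenByString() {
    return makeFull().size();
}

// What a NUL-terminated API sees through c_str(): it stops at the embedded NUL.
std::size_t sizeSeenByCApi() {
    return std::strlen(makeFull().c_str());
}
```

The string holds 8 characters, but anything that walks c_str() looking for a terminator sees only the first 4.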
When you initialize the std::string from a literal with a \0 in the middle, you lose all the data after it, because the const char* constructor treats its argument as a C string, which ends at the first \0. If the \0 doesn't have any meaning in the string, then you could escape it, like this:
string to_decode = "abcd\\0lom";
and the size would be 9. Otherwise, you could use a container of chars (e.g. std::vector<char>) for the data storage.
As others have said, the problem is that the code uses the constructor that takes const char*, and that only copies up to the \0. But, by a very strange coincidence, std::string has a constructor that can handle that case:
const char text[] = "abcd\0lom";
std::string to_decode(text, sizeof(text) - 1);
int size = to_decode.size();
The constructor will copy as many characters as you tell it to.
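Since the question is about XOR decryption, here is a rough sketch of why the explicit length matters there: XOR-ing a byte with an equal key byte produces '\0', so encrypted data can contain nulls anywhere. The xorWith helper and the single-byte key are assumptions made for the example, not the asker's actual scheme.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: XOR every byte of the text with a one-byte key.
// It iterates over size(), so embedded '\0' bytes are processed like any other.
std::string xorWith(std::string text, char key) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        text[i] = static_cast<char>(text[i] ^ key);
    }
    return text;
}

// Encrypts and decrypts the 8-character input from the question.
std::string roundTrip() {
    const char text[] = "abcd\0lom";
    std::string original(text, sizeof(text) - 1);    // 8 characters
    std::string encrypted = xorWith(original, 'a');  // first byte becomes '\0'
    assert(encrypted[0] == '\0');
    return xorWith(encrypted, 'a');                  // XOR twice restores the input
}
```

As long as the data stays in std::string and the loop runs over size(), the null bytes never cut anything short.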

null character inside string

From the question "Rules for C++ string literals escape character", Eli's answer:
std::string ("0\0" "0", 3) // String concatenation
works because this version of the constructor takes a char array; if you try to just pass "0\0" "0" as a const char*, it will treat it as a C string and only copy everything up until the null character.
Does that mean space isn't allotted for the entire string, i.e. the part of the string after \0 is written to unallotted space?
Moreover, the above question is about C++ strings; I observed the same behaviour for C strings too.
Are C and C++ strings the same when I add a null char in the middle of a string during declaration?
The char array is copied into the new object. If you don't specify how long the char array is, C++ will copy until the first null character. How much additional space is allocated is outside the scope of the specification. Like vectors, strings have a capacity that can exceed the amount required to store the string, which allows appending characters without reallocating the string.
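A short sketch of the points above, with made-up function names. It also looks at the plain C array case: the literal itself always occupies its full size, and only functions that search for a terminator stop early.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Adjacent literals are concatenated: the bytes are '0', '\0', '0', '\0'.
const char cstr[] = "0\0" "0";

// With an explicit count, all three characters are copied.
std::string builtWithCount() {
    return std::string("0\0" "0", 3);
}

// Without a count, copying stops at the first null character.
std::string builtFromPointer() {
    return std::string("0\0" "0");
}

void checkSizes() {
    assert(sizeof cstr == 4);         // storage exists for the whole literal
    assert(std::strlen(cstr) == 1);   // but strlen stops at the first '\0'
    assert(builtWithCount().size() == 3);
    assert(builtFromPointer().size() == 1);
    // capacity() may be larger than size(); by how much is up to the implementation.
    assert(builtWithCount().capacity() >= builtWithCount().size());
}
```

So nothing is written to unallotted space in either case: the array holds every byte, and the std::string holds its own copy of however many bytes it was told (or found) to copy.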
The std::string constructor that takes a const char* assumes the input is a C string.
C strings are '\0'-terminated, so the constructor stops when it reaches the '\0' character.
If you really want to play with embedded nulls, you need to use the constructor that builds the string from a
char array and a length (not a C string).
STL sequence containers may automatically allocate more space than the amount required to store their contents.
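Example:
#include <iostream>
#include <string>
using namespace std;
int main() {
    string s = "Elephant";
    s[4] = '\0';          // std::string does not treat '\0' as a terminator
    cout << s << endl;    // writes Elep, a null byte, then ant
    string x("xy\0ab");   // C string assumed: stops at the first '\0'
    cout << x;            // xy
    return 0;
}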

cin.getline() equivalent when getting a char from a function.

From what I understand, cin.getline takes a pointer to the first char of a buffer and a length. I have used it when reading from cin into a char array. I have a function that returns a pointer to the first char in an array. Is there an equivalent that gets the rest of the array into a char array, so that I can use the entire array? I explain below what I am trying to do. The function works fine, but if it would help I could post it.
cmd_str[0]=infile();// get the pointer from a function
cout<<"pp1>";
cout<< "test1"<<endl;
// cin.getline(cmd_str,500); something like this with the array from the function
cout<<cmd_str<<endl; // this would print out the entire array
cout<<"test2"<<endl;
length=0;
length= shell(cmd_str);// so I could pass it to this function
You could use a string stream:
char const * p = get_data(); // assume null-terminated
std::istringstream iss{std::string(p)}; // braces: with parentheses this line would declare a function named iss
for (std::string line; std::getline(iss, line); )
{
// process "line"
}
If the character array is not null-terminated but has a given size N, say std::string(p, N) instead.
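For the sized, non-terminated case, a sketch could look like this. splitLines and the kCommands buffer are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Illustrative buffer with no terminating '\0': two lines, 6 bytes in total.
const char kCommands[] = {'l', 's', '\n', 'p', 'w', 'd'};

// Splits the first n bytes at p into lines using std::getline.
std::vector<std::string> splitLines(const char* p, std::size_t n) {
    // std::string(p, n) copies exactly n bytes, so no terminator is needed.
    std::istringstream iss{std::string(p, n)};
    std::vector<std::string> lines;
    for (std::string line; std::getline(iss, line); ) {
        lines.push_back(line);
    }
    return lines;
}
```

Calling splitLines(kCommands, sizeof kCommands) yields the two lines "ls" and "pwd"; the last line is returned even though it has no trailing newline.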
First, if cmd_str is an array of char and infile returns a pointer to a string, that first assignment will give you an error. It tries to assign a pointer to a single char.
What you seem to want is strncpy:
strncpy(cmd_str, infile(), ARRAY_LENGTH - 1);
cmd_str[ARRAY_LENGTH - 1] = '\0';
I make sure to add the string terminator to the array, because if strncpy copies all ARRAY_LENGTH - 1 characters, it will not append the terminator.
If cmd_str is a proper array (i.e. declared like char cmd_str[ARRAY_LENGTH];) then you can use sizeof(cmd_str) - 1 instead of ARRAY_LENGTH - 1 in my example. However, if cmd_str is passed as a pointer to a function, this will not work.
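Put together, the copy might be sketched as follows. The infile() here is only a stand-in that returns a fixed string (the real function is not shown in the question), and copyCommand and the value of ARRAY_LENGTH are assumptions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t ARRAY_LENGTH = 8;  // assumed size, kept small to show truncation

// Hypothetical stand-in for infile(): returns a pointer to a C string.
const char* infile() {
    return "echo hello world";
}

// Copies at most dest_size - 1 characters and always adds the terminator.
// dest is a pointer here, so sizeof(dest) would be the pointer size, not the
// array size; that is why the size is passed in explicitly.
void copyCommand(char* dest, std::size_t dest_size, const char* src) {
    std::strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';
}

std::size_t copiedLength() {
    char cmd_str[ARRAY_LENGTH];
    // cmd_str is a real array in this scope, so sizeof(cmd_str) == ARRAY_LENGTH.
    copyCommand(cmd_str, sizeof(cmd_str), infile());
    return std::strlen(cmd_str);
}
```

With an 8-byte array, only the first 7 characters of the source are kept and the last element holds the terminator.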