Unexpected behavior on adding '\0' to std::string - c++

Why does the C++ standard allow the following?
#include <iostream>
#include <string>
int main()
{
std::string s(10, '\0'); // s.length() now is 10
std::cout << "string is " << s << ", length is " << s.length() << std::endl;
s.append(5, '\0'); // s.length() now is 15
std::cout << "string is " << s << ", length is " << s.length() << std::endl;
// the same with += char and push_back
// but:
s += "hello"; // s.length() returns 20 string is "hello"
std::cout << "string is " << s << ", length is " << s.length() << std::endl;
return 0;
}
Why does it add 0 and count it?
It looks like broken integrity of string, doesn't it? But I checked standard and it is correct behavior.

Why does standard allows following?
Because the people designing C++ strings decided that such things should be allowed. I'm not sure if anyone that was part of the team that designed C++ strings are on SO... But since you yourself say that the standard allows it, that's the way it is, and I doubt it's about to change.
It's sometimes quite practical to have a string that can contain "anything". I can think of a few instances when I've had to work around the fact that C style strings can't contain zero-bytes. Along with the fact that long C style strings take a long time to find the length of, the main benefit of C++ strings is that they are not restricted to "what you can put in them" - that's a good thing in my book.

Not sure what is problem here.
Adding '\0' in the middle of the std::string changes nothing - null character is treated like any other. The only thing that can change is if you use .c_str() with function that accepts null-terminated strings. But then it's not problem of .c_str(), only with the function that treats '\0' specially.
If you want to know how many characters has this string as if treated like null-terminated string, use
size_t len = strlen(s.c_str());
Note that it's O(n) operation, because that's how strlen works.
If you ask why += operator doesn't add the implicit null character of string literal "hello" to the string, I say the reverse (adding it) is unclear and definitely not what you want 99% of the time. On the other hand, if you want to add '\0' to your string, just append it like a buffer:
char buffer[] = "Hello";
s.append(buffer, sizeof(buffer));
or (even better) drop the char arrays and null-terminated strings altogether and use C++-style replacements like std::string as NTS-replacement, std::vector<char> as contiguous buffer, std::vector as dynamic array with pointers replacement, and std::array (C++11) as standard C array replacement.
Also, (as mentioned by #AdamRosenfield in comments), your string after adding "hello" does have in fact 20 characters, it's probably only that your terminal doesn't print nulls.

NUL char '\0' is the ending character for c style string, not std::strings. However, it supports this character to get values from a const char pointer so that it can find the end of a c-style string. Otherwise, it is treated just like other characters

std::string is more of a container for characters than anything else and \0 is a character. As a real world example, take a look at the CreateProcess function in Windows. The lpEnvironment parameter takes a null-terminated block of null-terminated strings (i.e. A=1\0B=2\0C=3\0\0). If you're building a block it's convenient to use an std::string.

Related

Passing a .war file using a http POST request not working [duplicate]

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.

Creating binary (custom length) string in C++ [duplicate]

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.

What is wrong with this string assignment?

string s="abcdefghijklmnopqrstuvwxyz"
char f[]=" " (s.substr(s.length()-10,9)).c_str() " ";
I want to get the last 9 characters of s and add " " to the beginning and the end of the substring, and store it as a char[]. I don't understand why this doesn't work even though char f[]=" " "a" " " does.
Is (s.substr(s.length()-10,9)).c_str() not a string literal?
No, it's not a string literal. String literals always have the form "<content>" or expand to that (macros, like __FILE__ for example).
Just use another std::string instead of char[].
std::string f = " " + s.substr(s.size()-10, 9) + " ";
First, consider whether you should be using cstrings. In C++, generally, use string.
However, if you want to use cstrings, the concatenation of "abc" "123" -> "abc123" is a preprocessor operation and so cannot be used with string::c_str(). Instead, the easiest way is to construct a new string and take the .c_str() of that:
string s="abcdefghijklmnopqrstuvwxyz"
char f[]= (string(" ") + s.substr(s.length()-10,9) + " ").c_str();
(EDIT: You know what, on second thought, that's a really bad idea. The cstring should be deallocated after the end of this statement, so using f can cause a segfault. Just don't use cstrings unless you're prepared to mess with strcpy and all that ugly stuff. Seriously.)
If you want to use strings instead, consider something like the following:
#include <sstream>
...
string s="abcdefghijklmnopqrstuvwxyz"
stringstream tmp;
tmp << " " << s.substr(s.length()-10,9) << " ";
string f = tmp.str();
#Xeo tells you how to solve your problem. Here's some complimentary background on how string literals are handled in the compilation process.
From section A.12 Preprocessing of The C Programming language:
Escape sequences in character constants and string literals (Pars. A.2.5.2, A.2.6) are
replaced by their equivalents; then adjacent string literals are concatenated.
It's the Preprocessor, not the compiler, who's responsible for the concatenation. (You asked for a C++ answer. I expect that C++ treats string literals the same way as C). The preprocessor has only a limited knowledge of the C/C++ language; the (s.substr(s.length()-10,9)).c_str() part is not evaluated at the preprocessor stage.

Why does std::ends cause string comparison to fail?

I spent about 4 hours yesterday trying to fix this issue in my code. I simplified the problem to the example below.
The idea is to store a string in a stringstream ending with std::ends, then retrieve it later and compare it to the original string.
#include <sstream>
#include <iostream>
#include <string>
int main( int argc, char** argv )
{
const std::string HELLO( "hello" );
std::stringstream testStream;
testStream << HELLO << std::ends;
std::string hi = testStream.str();
if( HELLO == hi )
{
std::cout << HELLO << "==" << hi << std::endl;
}
return 0;
}
As you can probably guess, the above code when executed will not print anything out.
Although, if printed out, or looked at in the debugger (VS2005), HELLO and hi look identical, their .length() in fact differs by 1. That's what I am guessing is causing the == operator to fail.
My question is why. I do not understand why std::ends is an invisible character added to string hi, making hi and HELLO different lengths even though they have identical content. Moreover, this invisible character does not get trimmed with boost trim. However, if you use strcmp to compare .c_str() of the two strings, the comparison works correctly.
The reason I used std::ends in the first place is because I've had issues in the past with stringstream retaining garbage data at the end of the stream. std::ends solved that for me.
std::ends inserts a null character into the stream. Getting the content as a std::string will retain that null character and create a string with that null character at the respective positions.
So indeed a std::string can contain embedded null characters. The following std::string contents are different:
ABC
ABC\0
A binary zero is not whitespace. But it's also not printable, so you won't see it (unless your terminal displays it specially).
Comparing using strcmp will interpret the content of a std::string as a C string when you pass .c_str(). It will say
Hmm, characters before the first \0 (terminating null character) are ABC, so i take it the string is ABC
And thus, it will not see any difference between the two above. You are probably having this issue:
std::stringstream s;
s << "hello";
s.seekp(0);
s << "b";
assert(s.str() == "b"); // will fail!
The assert will fail, because the sequence that the stringstream uses is still the old one that contains "hello". What you did is just overwriting the first character. You want to do this:
std::stringstream s;
s << "hello";
s.str(""); // reset the sequence
s << "b";
assert(s.str() == "b"); // will succeed!
Also read this answer: How to reuse an ostringstream
std::ends is simply a null character. Traditionally, strings in C and C++ are terminated with a null (ascii 0) character, however it turns out that std::string doesn't really require this thing. Anyway to step through your code point by point we see a few interesting things going on:
int main( int argc, char** argv )
{
The string literal "hello" is a traditional zero terminated string constant. We copy that whole into the std::string HELLO.
const std::string HELLO( "hello" );
std::stringstream testStream;
We now put the string HELLO (including the trailing 0) into the stream, followed by a second null which is put there by the call to std::ends.
testStream << HELLO << std::ends;
We extract out a copy of the stuff we put into the stream (the literal string "hello", plus the two null terminators).
std::string hi = testStream.str();
We then compare the two strings using the operator == on the std::string class. This operator (probably) compares the length of the string objects - including how ever many trailing null characters. Note that the std::string class does not require the underlying character array to end with a trailing null - put another way it allows the string to contain null characters so the first of the two trailing null characters is treated as part of the string hi.
Since the two strings are different in the number of trailing nulls, the comparison fails.
if( HELLO == hi )
{
std::cout << HELLO << "==" << hi << std::endl;
}
return 0;
}
Although, if printed out, or looked at
in the debugger (VS2005), HELLO and hi
look identical, their .length() in
fact differs by 1. That's what I am
guessing is causing the "==" operator
to fail.
Reason being, the length is different by one trailing null character.
My question is why. I do not
understand why std::ends is an
invisible character added to string
hi, making hi and HELLO different
lengths even though they have
identical content. Moreover, this
invisible character does not get
trimmed with boost trim. However, if
you use strcmp to compare .c_str() of
the two strings, the comparison works
correctly.
strcmp is different from std::string - it is written from back in the early days when strings were terminated with a null - so when it gets to the first trailing null in hi it stops looking.
The reason I used std::ends in the
first place is because I've had issues
in the past with stringstream
retaining garbage data at the end of
the stream. std::ends solved that for
me.
Sometimes it is a good idea to understand the underlying representation.
You're adding a NULL char to HELLO with std::ends. When you initialize hi with str() you are removing the NULL char. The strings are different. strcmp doesn't compare std::strings, it compares char* (it's a C function).
std::ends adds a null terminator, (char)'\0'. You'd use it with the deprecated strstream classes, to add the null terminator.
You don't need it with stringstream, and in fact it screws things up, because the null terminator isn't "the special null terminator that ends a string" to stringstream, to stringstream it's just another character, the zeroth character. stringstream just adds it, and that increases the character count (in your case) to seven, and makes the comparison to "hello" fail.
I think to have a good way to compare strings is to use std::find method. Do not mix C methods and std::string ones!

How do you construct a std::string with an embedded null?

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.