Optimize inserting std::uint32_t's into a std::vector<char> - c++

I'm attempting to insert an array of unsigned ints into a std::vector.
Here is my current code:
auto add_chars(std::vector<char> & vec, unsigned val[]){
std::string tmp;
tmp.resize(11) // Max chars a uint can be represented by, including the '\n' for sprintf
for (auto x = 0; x< 10; x++){
auto char_count = sprintf(tmp.data(),"%u", val[x]);
vec.insert(vec.begin()+vec.size(),tmp.data(), tmp.data()+char_count);
}
}
int main(){
std::vector<char> chars;
unsigned val[10] {1,200,3,4,5,6000,7,8,9000};
add_chars(chars,val);
for (auto & item : chars){
std::cout << item;
}
}
This solution works, however I question its efficiency (and elegance).
Two questions:
Is there a more idiomatic way of doing this?
Is there a more efficient way of doing this?
*edit Fixed a bug in the code made while transferring over to here.
Also, i'm aware that '9000' can't be represented as 1 char, whats why im using the buffer and sprintf to generate multiple chars for the one uint.

Is there a more idiomatic way of doing this?
A character stream is idiomatic for this. Unfortunately, the standard only has a stream for building a string; not a vector. You can copy the string into a vector though. This is not most efficient way:
std::ostringstream ss;
unsigned val[10] {1,200,3,4,5,6000,7,8,9000};
for (auto v : val)
ss << v;
std::string str = ss.str();
// if you need a vector for some reason
std::vector<char> chars(std::begin(str), std::end(str));
Or you could write your own custom vector stream, but that will be a lot of boilerplate.

I would make a couple of changes (and fix the bug).
Firstly the number of digits in an integer is limited so there's no need to use a dynamic object like std::string, a simple char array will do. Since you are using uint32_t and decimal digits 10 characters are sufficient, 11 if you include a nul terminator.
Secondly sprintf and similar are inefficient because they have to interpret the format string, "%u" in your case. A hand written function to perform the conversion from uint32_t to digits would be more efficient.

Related

Passing a .war file using a http POST request not working [duplicate]

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.

Creating binary (custom length) string in C++ [duplicate]

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.

Writing struct of vector to a binary file in c++

I have a struct and I would like to write it to a binary file (c++ / visual studio 2008).
The struct is:
struct DataItem
{
std::string tag;
std::vector<int> data_block;
DataItem(): data_block(1024 * 1024){}
};
I am filling tha data_block vector with random values:
DataItem createSampleData ()
{
DataItem data;
std::srand(std::time(NULL));
std::generate(data.data_block.begin(), data.data_block.end(), std::rand);
data.tag = "test";
return data;
}
And trying to write the struct to file:
void writeData (DataItem data, long fileName)
{
ostringstream ss;
ss << fileName;
string s(ss.str());
s += ".bin";
char szPathedFileName[MAX_PATH] = {0};
strcat(szPathedFileName,ROOT_DIR);
strcat(szPathedFileName,s.c_str());
ofstream f(szPathedFileName, ios::out | ios::binary | ios::app);
// ******* first I tried to write this way then one by one
//f.write(reinterpret_cast<char *>(&data), sizeof(data));
// *******************************************************
f.write(reinterpret_cast<const char *>(&data.tag), sizeof(data.tag));
f.write(reinterpret_cast<const char *>(&data.data_block), sizeof(data.data_block));
f.close();
}
And the main is:
int main()
{
DataItem data = createSampleData();
for (int i=0; i<5; i++) {
writeData(data,i);
}
}
So I expect a file size at least (1024 * 1024) * 4 (for vector)+ 48 (for tag) but it just writes the tag to the file and creates 1KB file to hard drive.
I can see the contents in while I'm debugging but it doesn't write it to file...
What's wrong with this code, why can't I write the strcut to vector to file? Is there a better/faster or probably efficient way to write it?
Do I have to serialize the data?
Thanks...
Casting a std::string to char * will not produce the result you expect. Neither will using sizeof on it. The same for a std::vector.
For the vector you need to use either the std::vector::data method, or using e.g. &data.data_block[0]. As for the size, use data.data_block.size() * sizeof(int).
Writing the string is another matter though, especially if it can be of variable length. You either have to write it as a fixed-length string, or write the length (in a fixed-size format) followed by the actual string, or write a terminator at the end of the string. To get a C-style pointer to the string use std::string::c_str.
Welcome to the merry world of C++ std::
Basically, vectors are meant to be used as opaque containers.
You can forget about reinterpret_cast right away.
Trying to shut the compiler up will allow you to create an executable, but it will produce silly results.
Basically, you can forget about most of the std::vector syntactic sugar that has to do with iterators, since your fstream will not access binary data through them (it would output a textual representation of your data).
But all is not lost.
You can access the vector underlying array using the newly (C++11) introduced .data() method, though that defeats the point of using an opaque type.
const int * raw_ptr = data.data_block.data();
that will gain you 100 points of cool factor instead of using the puny
const int * raw_ptr = &data.data_block.data[0];
You could also use the even more cryptic &data.data_block.front() for a cool factor bonus of 50 points.
You can then write your glob of ints in one go:
f.write (raw_ptr, sizeof (raw_ptr[0])*data.data_block.size());
Now if you want to do something really too simple, try this:
for (int i = 0 ; i != data.data_block.size() ; i++)
f.write (&data.data_block[i], sizeof (data.data_block[i]));
This will consume a few more microseconds, which will be lost in background noise since the disk I/O will take much more time to complete the write.
Totally not cool, though.

Reviewing C++: Int to Char

It's been a while since I have worked with C++, I'm currently catching up for an upcoming programming test. I have the following function that has this signature:
void MyIntToChar(int *arrayOfInt,char* output)
Int is an array of integers and char* output is a buffer that should be long enough to hold the string representation of the integers that the function receives.
Here is an example of the usage of such function:
int numbers[3] = {11, 26, 81};
char* output = ""; // this I'm sure is not valid, any suggestions on how to
// to properly initialize this string?
MyIntToChar(numbers,output);
cout << output << endl; // this should print "11 26 81" or "11, 26, 81".
// i.e. formatting should not be a problem.
I have been reviewing my old c++ notes from college, but I keep having problems with these. I'm hating myself right now for going to the Java world and not working in this.
Thanks.
void MyIntToChar(int *arrayOfInt, char* output);
That's wrong in several ways. First, of all, it's a misnomer. You cannot, in general, convert an integer into one character, because only ten of all intergers (0...9) would fit into one. So I will assume you want to convert integers into _strings instead.
Then, if you pass arrays to functions, they decay to pointers to their first element, and all information about the array's size is lost. So when you pass arrays to function, you need to pass size information, too.
Either use the C way of doing this and pass in the number of elements as std::size_t (to be obtained as sizeof(myarray)/sizeof(myarray[0])):
void MyIntToStr(int *arrayOfInt, std::size_t arraySize, char* output);
Or do it the C++ way and pass in two iterators, one pointing at the first element (so-called begin iterator) and the other pointing to one behind the last (end iterator):
void MyIntToStr(int *begin, int *end, char* output);
You can improve on that by not insisting on the iterators being int*, but anything which, when dereferenced, yields an int:
template< typename FwdIt >
void MyIntToStr(FwdIt begin, FwdIt end, char* output);
(Templates would require you to implement the algorithm in an header.)
Then there's the problems with the output. First of all, do you really expect all the numbers to be written into one string? If so, how should they be separated? Nothing? Whitespace? Comma?
Or do you expect an array of strings to be returned?
Assuming you really want one string, if I pass the array {1, 2, 3, 4, 5} into your function, it needs space for five single-digit integers plus the space needed for four separators. Your function signature suggests you want me to allocate that upfront, but frankly, if I have to calculate this myself, I might just as well do the conversions myself. Further, I have no way of telling you how much memory that char* points to, so you can't check whether I was right. As generations of developers have found out, this is so hard to get right every time, that several computer languages have been invented to make things easier for programmers. One of those is C++, which nowadays comes with a dynamically resizing string class.
It would be much easier (for you and for me), if I could pass you a stirng and you write into that:
template< typename FwdIt >
void MyIntToChar(FwdIt begin, FwdIt end, std::string& output);
Note that I this passes the string per non-const reference. This allows you to modify my string and let's me see the changes you made.
However, once we're doing this, you might just as well return a new string instead of requireing me to pass one to you:
template< typename FwdIt >
std::string MyIntToChar(FwdIt begin, FwdIt end);
If, however, you actually wanted an array of strings returned, you shouldn't take one string to write to, but a means where to write them to. The naive way of doing this would be to pass a dynamically re-sizable array of dynamically re-sizable string. In C++, this is spelled std::vector<std::string>:
template< typename FwdIt >
void MyIntToStr(FwdIt begin, FwdIt end, std::vector<std::string>& output);
Again, it might be better you return such an array (although some would disagree since copying an array of string might be considered to expensive). However, the best way to do this would not require me to accept the result in form of a 'std::vector'. What if I needed the strings in a (linked) list instead? Or written to some stream?
The best way to do this would be for your function to accept an output iterator to which you write your result:
template< typename FwdIt, typename OutIt >
void MyIntToStr(FwdIt begin, FwdIt end, OutIt output);
Of course, now that's so general that it's hard to see what it does, so it's good we gave it a good name. However, looking at it I immediately think that this should build on another function which is needed probably even more than this one: A function that takes one integer and converts it to one string. Assuming that we have such a function:
std::string MyIntToStr(int i);
it's very easy to implement the array versions:
template< typename FwdIt, typename OutIt >
void MyIntToStr(FwdIt begin, FwdIt end, OutIt output)
{
while(begin != end)
*output++ = MyIntToStr(*begin++);
}
Now all that remains for you to be done is to implement that std::string MyIntToStr(int i); function. As someone else already wrote, that's easily done using string streams and you shouldn't have a problem to find some good examples for that. However, it's even easier to find bad examples, so I'd rather give you one here:
std::string MyIntToStr(int i);
{
std::ostringstream oss;
oss << i:
if(!oss) throw "bah!"; // put your error reporting mechanism here
return oss.str();
}
Of course, given templates, that easy to generalize to accepting anything that's streamable:
template< typename T >
std::string MyIntToStr(const T& obj);
{
std::ostringstream oss;
oss << obj:
if(!oss) throw "bah!"; // put your error reporting mechanism here
return oss.str();
}
Note that, given this general function template, the MyIntToStr() working on arrays now automatically works on arrays of any type the function template working on one object works on.
So, at the end of this (rather epic, I apologies) journey, this is what we arrived at: a generalized function template to convert anything (which can be written to a stream) into a string, and a generalized function template to convert the contents of any array of objects (which can be written to a stream) into a stream.
Note that, if you had at least a dozen 90mins lectures on C++ and your instructors failed to teach you enough to at least understand what I've written here, you have not been taught well according to modern C++ teaching standards.
Well an integer converted to string will require a max of 12 bytes including sign (assuming 32bit), so you can allocate something like this
char * output= new char[12*sizeof(numbers)/sizeof(int)];
First of all, it is impossible to use your method that way your example says:
char* output has only a size of 1 byte (don't forget the null-terminator '\0'). So you can't put a whole string in it. You will get segmentation faults. So, here you are going to make use of the heap. This is already implemented in std::string and std::stringstream. So use them for this problem.
Let's have a look:
#include <string>
#include <sstream>
#include <iostream>
std::string intArrayToString(int *integers, int numberOfInts)
{
std::stringstream ss;
for (int i = 0; i < numberOfInts; i++)
{
ss << integers[i] << ", ";
}
std::string temp = ss.str();
return temp.substr(0, temp.size() - 2); // Cut of the extra ", "
}
And if you want to convert it to char*, you can use yourString.c_str();
Here's a possibility if you are willing to reconsider a change in prototype of the function
template<int n>
void MyIntToChar(int (&iarr)[n], string &output){
stringstream ss;
for(size_t id = 0; id < n; ++id){
ss << iarr[id];
if(id != n - 1) ss << " ";
}
output = ss.str();
}
int main(){
int numbers[3] = {11, 26, 81};
string out = "";
MyIntToChar(numbers, out);
}
You should take a look at the std::stringstream, or, more C-ish (as char* type instead of strings might suggest) sprintf.
did you try sprintf(),it will do your work.For char * initialization,you have to either initialize it by calling malloc or you can take is as a char array and pass the address to the function rather then value.
Sounds like you would like to use C lang. Here's an example. There's an extra ", " at the end of the output but it should give you a feel for the concept. Also, I changed the return type so that I would know how many bytes of output were used. The alternative would be to initialize output would nulls.
int MyIntToChar(int *arrayOfInt, char* output) {
int bytes_used = 0; // use to bump the address past what has been used
for (int i = 0 ; i < sizeof(arrayOfInt); ++i)
bytes_used += sprintf(output + bytes_used, "%u, ", arrayOfInt[i]);
return bytes_used;
}
int main() {
int numbers[5] = {5, 2, 11, 26, 81}; // to properly initialize this string?
char output[sizeof(int)*sizeof(numbers)/sizeof(int) + sizeof(numbers)*2]; // int size plus ", " in string
int bytes_used = MyIntToChar(numbers, output);
printf("%*s", bytes_used, output);// this should print "11 26 81" or "11, 26, 81".
}

How do you construct a std::string with an embedded null?

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.