Related
If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.
If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.
Apparently boost::asio::async_read doesn't like strings, as the only overload of boost::asio::buffer allows me to create const_buffers, so I'm stuck with reading everything into a streambuf.
Now I want to copy the contents of the streambuf into a string, but it apparently only supports writing to char* (sgetn()), creating an istream with the streambuf and using getline().
Is there any other way to create a string with the streambufs contents without excessive copying?
I don't know whether it counts as "excessive copying", but you can use a stringstream:
std::ostringstream ss;
ss << someStreamBuf;
std::string s = ss.str();
Like, to read everything from stdin into a string, do
std::ostringstream ss;
ss << std::cin.rdbuf();
std::string s = ss.str();
Alternatively, you may also use a istreambuf_iterator. You will have to measure whether this or the above way is faster - i don't know.
std::string s((istreambuf_iterator<char>(someStreamBuf)),
istreambuf_iterator<char>());
Note that someStreamBuf above is meant to represent a streambuf*, so take its address as appropriate. Also note the additional parentheses around the first argument in the last example, so that it doesn't interpret it as a function declaration returning a string and taking an iterator and another function pointer ("most vexing parse").
It's really buried in the docs...
Given boost::asio::streambuf b, with size_t buf_size ...
boost::asio::streambuf::const_buffers_type bufs = b.data();
std::string str(boost::asio::buffers_begin(bufs),
boost::asio::buffers_begin(bufs) + buf_size);
Another possibility with boost::asio::streambuf is to use boost::asio::buffer_cast<const char*>() in conjunction with boost::asio::streambuf::data() and boost::asio::streambuf::consume() like this:
const char* header=boost::asio::buffer_cast<const char*>(readbuffer.data());
//Do stuff with header, maybe construct a std::string with std::string(header,header+length)
readbuffer.consume(length);
This won't work with normal streambufs and might be considered dirty, but it seems to be the fastest way of doing it.
For boost::asio::streambuf you may find a solution like this:
boost::asio::streambuf buf;
/*put data into buf*/
std::istream is(&buf);
std::string line;
std::getline(is, line);
Print out the string :
std::cout << line << std::endl;
You may find here: http://www.boost.org/doc/libs/1_49_0/doc/html/boost_asio/reference/async_read_until/overload3.html
One can also obtain the characters from asio::streambuf using std::basic_streambuf::sgetn:
asio::streambuf in;
// ...
char cbuf[in.size()+1]; int rc = in.sgetn (cbuf, sizeof cbuf); cbuf[rc] = 0;
std::string str (cbuf, rc);
The reason you can only create const_buffer from std::string is because std::string explicitly doesn't support direct pointer-based writing in its contract. You could do something evil like resize your string to a certain size, then const_cast the constness from c_str() and treat it like a raw char* buffer, but that's very naughty and will get you in trouble someday.
I use std::vector for my buffers because as long as the vector doesn't resize (or you are careful to deal with resizing), you can do direct pointer writing just fine. If I need some of the data as a std::string, I have to copy it out, but the way I deal with my read buffers, anything that needs to last beyond the read callback needs to be copied out regardless.
I didn't see an existing answer for reading exactly n chars into a std::stringstream, so here is how that can be done:
std::stringstream ss;
boost::asio::streambuf sb;
const auto len = 10;
std::copy_n(boost::asio::buffers_begin(sb.data()), len,
std::ostream_iterator<decltype(ss)::char_type>(ss));
Compiler explorer
A simpler answer would be to convert it in std::string and manipulate it some what like this
std::string buffer_to_string(const boost::asio::streambuf &buffer)
{
using boost::asio::buffers_begin;
auto bufs = buffer.data();
std::string result(buffers_begin(bufs), buffers_begin(bufs) + buffer.size());
return result;
}
Giving a very concise code for the task.
I mostly don't like answers that say "You don't want X, you want Y instead and here's how to do Y" but in this instance I'm pretty sure I know what tstenner wanted.
In Boost 1.66, the dynamic string buffer type was added so async_read can directly resize and write to a string buffer.
I tested the first answer and got a compiler error when compiling using "g++ -std=c++11"
What worked for me was:
#include <string>
#include <boost/asio.hpp>
#include <sstream>
//other code ...
boost::asio::streambuf response;
//more code
std::ostringstream sline;
sline << &response; //need '&' or you a compiler error
std::string line = sline.str();
This compiled and ran.
I think it's more like:
streambuf.commit( number_of_bytes_read );
istream istr( &streambuf );
string s;
istr >> s;
I haven't looked into the basic_streambuf code, but I believe that should be just one copy into the string.
Is it possible to use an std::string for read() ?
Example :
std::string data;
read(fd, data, 42);
Normaly, we have to use char* but is it possible to directly use a std::string ? (I prefer don't create a char* for store the result)
Thank's
Well, you'll need to create a char* somehow, since that's what the
function requires. (BTW: you are talking about the Posix function
read, aren't you, and not std::istream::read?) The problem isn't
the char*, it's what the char* points to (which I suspect is what
you actually meant).
The simplest and usual solution here would be to use a local array:
char buffer[43];
int len = read(fd, buffer, 42);
if ( len < 0 ) {
// read error...
} else if ( len == 0 ) {
// eof...
} else {
std::string data(buffer, len);
}
If you want to capture directly into an std::string, however, this is
possible (although not necessarily a good idea):
std::string data;
data.resize( 42 );
int len = read( fd, &data[0], data.size() );
// error handling as above...
data.resize( len ); // If no error...
This avoids the copy, but quite frankly... The copy is insignificant
compared to the time necessary for the actual read and for the
allocation of the memory in the string. This also has the (probably
negligible) disadvantage of the resulting string having an actual buffer
of 42 bytes (rounded up to whatever), rather than just the minimum
necessary for the characters actually read.
(And since people sometimes raise the issue, with regards to the
contiguity of the memory in std:;string: this was an issue ten or more
years ago. The original specifications for std::string were designed
expressedly to allow non-contiguous implementations, along the lines of
the then popular rope class. In practice, no implementor found this
to be useful, and people did start assuming contiguity. At which point,
the standards committee decided to align the standard with existing
practice, and require contiguity. So... no implementation has ever not
been contiguous, and no future implementation will forego contiguity,
given the requirements in C++11.)
No, you cannot and you should not. Usually, std::string implementations internally store other information such as the size of the allocated memory and the length of the actual string. C++ documentation explicitly states that modifying values returned by c_str() or data() results in undefined behaviour.
If the read function requires a char *, then no. You could use the address of the first element of a std::vector of char as long as it's been resized first. I don't think old (pre C++11) strings are guarenteed to have contiguous memory otherwise you could do something similar with the string.
No, but
std::string data;
cin >> data;
works just fine. If you really want the behaviour of read(2), then you need to allocate and manage your own buffer of chars.
Because read() is intended for raw data input, std::string is actually a bad choice, because std::string handles text. std::vector seems like the right choice to handle raw data.
Using std::getline from the strings library - see cplusplus.com - can read from an stream and write directly into a string object. Example (again ripped from cplusplus.com - 1st hit on google for getline):
int main () {
string str;
cout << "Please enter full name: ";
getline (cin,str);
cout << "Thank you, " << str << ".\n";
}
So will work when reading from stdin (cin) and from a file (ifstream).
If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.