String data type variable initialization - c++

How do I initialize a string variable with the null character ('\0')?
string txt='\0';
Is this right or wrong?

This is wrong, because std::string has no constructor that takes a single char. If you want to construct a std::string that contains a null character (for whatever reason), use a different constructor:
std::string txt(1, '\0');
Please note that this answer takes the question at face value. You probably do not need this at all, but I cannot know that for sure.

First of all, understand that to make std::string::c_str() work, std::string always keeps room for a terminating zero, so most probably the default constructor will do the job for you.
On the other hand, if you need a string whose length is not zero and which contains a zero character, there are two approaches:
Use std::string txt(1, '\0'); as @SergeyA suggested in the other answer (this also works with old C++03).
Use the string literal suffix operator (C++14), which is handier IMO:
using namespace std::literals::string_literals;
auto s2 = "\0"s;
Here is live example.
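For comparison, here is a minimal sketch of how the constructions discussed above differ in the resulting size. The helper names are made up for illustration, and the literal suffix needs C++14 or later.

```cpp
#include <cassert>
#include <string>

// Default construction: size 0, yet c_str() still points at a '\0'.
inline std::string make_empty() { return std::string(); }

// Count constructor (works in C++03): exactly one character, the '\0'.
inline std::string make_with_count() { return std::string(1, '\0'); }

// Literal suffix (C++14): the length comes from the literal itself,
// so the embedded '\0' is kept as a real character.
inline std::string make_with_suffix() {
    using namespace std::literals::string_literals;
    return "\0"s;
}

// Plain const char* construction measures the text like strlen and stops
// at the first '\0', so this yields an empty string.
inline std::string make_from_c_string() { return std::string("\0"); }

// Usage: the sizes each construction produces.
inline void check_sizes() {
    assert(make_empty().size() == 0);
    assert(make_with_count().size() == 1);
    assert(make_with_suffix().size() == 1);
    assert(make_from_c_string().size() == 0);
}
```

Note that std::string("\0") silently produces an empty string, which is probably the closest working relative of what the question was trying to write.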

Converting string_view to null terminated c string [duplicate]

I have a method that takes a std::string_view and calls a function that takes a null-terminated string as a parameter. For example:
void stringFunc(std::experimental::string_view str) {
some_c_library_func(/* Expects null terminated string */);
}
The question is, what is the proper way to handle this situation? Is str.to_string().c_str() the only option? And I really want to use std::string_view in this method, because I pass different types of strings in it.
I solved this problem by creating an alternate string_view class called zstring_view. It's privately inherited from string_view and contains much of its interface.
The principal difference is that zstring_view cannot be created from a string_view. Also, any string_view APIs that would remove elements from the end are not part of the interface or they return a string_view instead of a zstring_view.
They can be created from any NUL-terminated string source: std::string and so forth. I even created special user-defined literal suffixes for them: _zsv.
The idea being that, so long as you don't put a non-NUL-terminated string into zstring_view manually, all zstring_views should be NUL-terminated. Like std::string, the NUL character is not part of the size of the string, but it is there.
I find it very useful for dealing with C interfacing.
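The answer does not show its class, so the following is only a rough sketch of what such a zstring_view might look like. It assumes C++17 std::string_view rather than the experimental one, and every member name other than zstring_view itself (c_str, suffix, and so on) is a guess made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sketch: a view that can only be built from sources that are
// known to be NUL-terminated.
class zstring_view : private std::string_view {
public:
    zstring_view(const char* s) : std::string_view(s) {}
    zstring_view(const std::string& s) : std::string_view(s) {}
    // Deliberately no constructor taking a std::string_view.

    using std::string_view::size;
    using std::string_view::empty;
    using std::string_view::operator[];

    // Safe to hand to C APIs because of the invariant above.
    const char* c_str() const noexcept { return data(); }

    // Trimming the front keeps the terminator in place.
    zstring_view suffix(std::size_t pos) const {
        assert(pos <= size());
        zstring_view result(*this);
        result.remove_prefix(pos);
        return result;
    }

    // Trimming the back loses the guarantee, so a plain view is returned.
    std::string_view substr(std::size_t pos, std::size_t count = npos) const {
        return std::string_view::substr(pos, count);
    }
};
```

With this shape, some_c_library_func(z.c_str()) needs no copy, while anything that could cut off the terminator degrades to an ordinary std::string_view.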
You cannot alter a string through std::string_view, so you cannot add a terminating '\0' character. Hence you need to copy the string somewhere else to add the '\0' terminator. You could avoid heap allocations by putting the copy on the stack if it's short enough. If you know that the std::string_view is part of a null-terminated string, then you may check whether the character past the end is a '\0' character and avoid the copy in that case. Other than that, I don't see much room for optimization.
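A minimal sketch of the stack-buffer idea, assuming C++17 std::string_view. The 256-byte limit is an arbitrary choice, and some_c_library_func here is a stand-in that merely measures its argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for the C function from the question.
inline std::size_t some_c_library_func(const char* s) { return std::strlen(s); }

inline std::size_t stringFunc(std::string_view str) {
    constexpr std::size_t kStackLimit = 256;  // assumed cut-off, tune as needed
    if (str.size() < kStackLimit) {
        char buf[kStackLimit];
        str.copy(buf, str.size());   // copy the characters onto the stack
        buf[str.size()] = '\0';      // add the terminator the view lacks
        return some_c_library_func(buf);
    }
    // Too long for the stack buffer: fall back to a heap-allocated copy.
    return some_c_library_func(std::string(str).c_str());
}

// Usage: a view of part of a longer string is passed on correctly.
inline void check_string_func() {
    std::string_view whole = "abcdef";
    assert(stringFunc(whole.substr(0, 3)) == 3);         // "abc", not "abcdef"
    assert(stringFunc(std::string(1000, 'x')) == 1000);  // heap path
}
```

The sketch leaves out the "peek past the end" shortcut, since reading that character is only valid when the caller can guarantee it exists.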
You certainly shouldn't call data on std::experimental::string_view:
Unlike basic_string::data() and string literals, data() may return a
pointer to a buffer that is not null-terminated.
So call to_string and c_str on that:
void stringFunc(std::experimental::string_view str) {
some_c_library_func(str.to_string().c_str());
}
or:
void stringFunc(std::experimental::string_view str) {
std::string real_str(str);
some_c_library_func(real_str.c_str());
}
In some cases C-style functions have counterparts that accept the length of the string as a separate argument.
E.g. instead of using strcasecmp() it is worth switching to strncasecmp().
Admittedly, in this particular case it requires additional logic for the case when the strings are not equal but their first n characters are.
But it could be a good alternative to writing a custom class for string views.
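The extra logic mentioned above could look roughly like this. The sketch uses std::strncmp as a portable stand-in for the POSIX strncasecmp(), so it is case-sensitive, and equals_c_string is a made-up name.

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

// True when the view and the NUL-terminated string hold the same characters.
inline bool equals_c_string(std::string_view sv, const char* cstr) {
    // A C string cannot contain '\0', so such a view can never match.
    if (sv.find('\0') != std::string_view::npos) return false;
    if (sv.empty()) return cstr[0] == '\0';
    // Compare only the first sv.size() characters; sv need not be terminated.
    if (std::strncmp(sv.data(), cstr, sv.size()) != 0) return false;
    // The prefixes match, so the C string must also end exactly here.
    return cstr[sv.size()] == '\0';
}

// Usage: a view into the middle of a buffer compared against a C string.
inline void check_equals() {
    std::string_view whole = "abcdef";
    assert(equals_c_string(whole.substr(0, 3), "abc"));
    assert(!equals_c_string(whole.substr(0, 3), "abcd"));  // equal prefix only
}
```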

std::string& vs boost::string_ref

Does it matter anymore if I use boost::string_ref over std::string& ? I mean, is it really more efficient to use boost::string_ref over the std version when you are processing strings ? I don't really get the explanation offered here: http://www.boost.org/doc/libs/1_61_0/libs/utility/doc/html/string_ref.html . What really confuses me is the fact that std::string is also a handle class that only points to the allocated memory, and since c++11, with move semantics the copy operations noted in the article above are not going to happen. So, which one is more efficient ?
The use case for string_ref (or string_view in recent Boost and C++17) is for substring references.
The case where
the source string happens to be a std::string
and the full length of the source string is referenced
is an (atypical) special case, where it does indeed resemble std::string const&.
Note also that operations on string_ref (like sref.substr(...)) automatically return more string_ref objects, instead of allocating a new std::string.
I have never used it, but it seems to me that its purpose is to provide an interface similar to std::string without having to allocate a string for manipulation. Take the example given, extract_part(): it is passed a hard-coded C array "ABCDEFG", but because the initial function takes a std::string, an allocation takes place (the std::string will have its own copy of "ABCDEFG"). Using string_ref, no allocation occurs; it refers to the initial "ABCDEFG". The constraint is that the string is read-only.
This answer uses the new name string_view to mean the same as string_ref.
What really confuses me is the fact that std::string is also a handle class that only points to the allocated memory
A string allocates, owns, and manages its own memory. A string_view is a handle to some memory that was already allocated. The memory is managed by some other mechanism, unrelated to the string_view.
If you already have some text data, for example in a char array, then the additional memory allocation involved in constructing a string might be redundant. A string_view could be more efficient because it would allow you to operate directly on the original data in the char array. However, it would not permit the data to be modified; string_view allows no non-const access, because it doesn't own the data it refers to.
and since c++11, with move semantics the copy operations noted in the article above are not going to happen.
You can only move from an object that is ready to be discarded. Copying still serves a purpose and is necessary in many cases.
The example in the article constructs two new strings (not copies) and also constructs two copies of existing strings. In C++98 the copies could already be elided by RVO without move semantics, so they're not a big deal. By using string_view it avoids constructing the two new strings. Move semantics are irrelevant here.
In the call to extract_part("ABCDEFG") a string_view is constructed which refers to the char array represented by the string literal. Constructing a string here would have involved a memory allocation and a copy of the char array.
In the call to bar.substr(2,3) a string_view is constructed which refers to parts of the data already referred to by the first string_view. Using a string here would have involved another memory allocation and copy of part of the data.
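As a sketch of the two calls just described, here is a version of extract_part written with C++17 std::string_view. It is modelled on the description above rather than copied from the article.

```cpp
#include <cassert>
#include <string_view>

// Returns a view of up to three characters starting at index 2.
// No std::string is constructed, so nothing is allocated or copied.
// substr() throws std::out_of_range if bar has fewer than two characters.
inline std::string_view extract_part(std::string_view bar) {
    return bar.substr(2, 3);
}

// Usage: the result refers to the caller's character array.
inline void check_extract_part() {
    const char* text = "ABCDEFG";
    std::string_view part = extract_part(text);
    assert(part == "CDE");
    assert(part.data() == text + 2);  // points into the original array
}
```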
So, which one is more efficient?
This is a bit like asking if a hammer is more efficient than a screwdriver. They serve different purposes, so it depends what it is you're trying to accomplish.
You need to be careful when using string_view that the memory it refers to remains valid throughout its lifetime.
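A small sketch of that lifetime rule, using C++17 std::string_view. The function first_word is a made-up example.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Returns a view of the text before the first space (or of everything).
inline std::string_view first_word(std::string_view s) {
    return s.substr(0, s.find(' '));
}

inline void check_first_word() {
    std::string text = "hello world";
    std::string_view word = first_word(text);  // fine: text outlives word
    assert(word == "hello");

    // Not fine: the temporary std::string dies at the end of the statement,
    // leaving the view dangling.
    // std::string_view bad = first_word(std::string("hello world"));

    // Copy into an owning string when the data must outlive its source.
    std::string owned(first_word(text));
    text.clear();
    assert(owned == "hello");
}
```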
If you stick to std::string it does not matter, but boost::string_ref also supports const char*. That is, do you intend to call your string processing function foo with std::string only?
void foo(const std::string&);
foo("compiles, but is not free"); // a temporary std::string is constructed from the `const char*`
Since boost::string_ref is constructible from const char* without copying the characters, it is more flexible: it works efficiently with both const char* and std::string.
The proposal N3442 might be helpful.
In short: The main benefit of std::string_view over const std::string& is that you can pass both const char* and std::string objects without doing a copy. As others have said, it also allows you to pass substrings without copying, although (in my experience) this is somewhat less often important.
Consider the following (silly) function (yes I know you could just call s.at(2)):
char getThird(std::string s)
{
if (s.size() < 3) throw std::runtime_error("String too short");
return s[2];
}
This function works, but the string is passed by value. This means the whole length of the string is copied even though we don't look at all of it, and it also (often) incurs a dynamic memory allocation. Doing this in a tight loop can be very expensive. One solution to this is to pass the string by const reference instead:
char getThird(const std::string& s);
This works a lot better if you have a std::string variable and you pass it as a parameter to getThird. But now there's a problem: what if you have a null-terminated const char* string? When you call this function, a temporary std::string will get constructed, so you still get the copy and dynamic memory allocation.
Here's another attempt:
char getThird(const char* s)
{
if (std::strlen(s) < 3) throw std::runtime_error("String too short");
return s[2];
}
This will obviously now work fine for const char* variables. It will also work for std::string variables, but calling it is a little awkward: getThird(myStr.c_str()). What's more, std::string supports embedded null characters, and getThird will misinterpret the string as ended at the first of these. At worst this could cause a security vulnerability - imagine if the function were called checkStringForBadHacks!
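To make the embedded-null problem concrete, here is a small sketch. The test string is made up, and getThird is the const char* version from above.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

inline char getThird(const char* s) {
    if (std::strlen(s) < 3) throw std::runtime_error("String too short");
    return s[2];
}

// Returns true if the const char* version refuses a string that, as a
// std::string, is five characters long.
inline bool rejects_embedded_null() {
    const std::string s("a\0bcd", 5);  // 'a', '\0', 'b', 'c', 'd'
    assert(s.size() == 5);
    assert(s[2] == 'b');               // the real third character
    try {
        getThird(s.c_str());           // strlen sees only "a"
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```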
Another problem is simply that it's annoying to write a function in terms of old null-terminated strings instead of std::string objects with their handy methods. Did you notice, for example, that this function looks at the whole length of the string even though only the first few characters are important? It's hidden in std::strlen, which iterates over all characters looking for the null terminator. We could replace that with a manual check that the first three characters aren't null, but you can see this is a lot less convenient than the other versions.
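The manual check mentioned above might be written like this. It is a sketch, and the name getThirdNoStrlen is made up to keep it apart from the earlier versions.

```cpp
#include <cassert>
#include <stdexcept>

// Looks at no more than the first three characters. The short-circuiting
// || stops at the first terminator, so nothing past it is read.
inline char getThirdNoStrlen(const char* s) {
    if (s[0] == '\0' || s[1] == '\0' || s[2] == '\0')
        throw std::runtime_error("String too short");
    return s[2];
}

// Returns true if the call throws for the given input.
inline bool throws_for(const char* s) {
    try {
        getThirdNoStrlen(s);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

It does the job, but the length check is now spelled out by hand and would have to be rewritten for any other index.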
Enter std::string_view (or boost::string_view, previously known as boost::string_ref):
char getThird(std::string_view s)
{
if (s.size() < 3) throw std::runtime_error("String too short");
return s[2];
}
This gives you the nice methods you expect from a proper string class, like .size(), and it works in both of the situations discussed above:
It works with std::string objects, which can be implicitly converted to std::string_view objects.
It works with const char* null-terminated strings, which can also be implicitly converted to std::string_view objects.
This does have the potential disadvantage that constructing the std::string_view from a const char* requires iterating over the whole string to find the length, even if the function that uses it never needs it (as is the case here). However, if a caller is using a const char* as a parameter to several functions (or one function in a loop) that take std::string_view objects, it could always manually construct that object beforehand. This could even give a performance increase, because if those functions do need the length then it is computed once and reused.
As other answers have mentioned, it also avoids a copy when you only want to pass a substring. For example, this is very useful in parsing. But std::string_view is justified even without this feature.
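A short usage sketch of the std::string_view version with each kind of caller discussed above, assuming C++17. The sample strings are made up.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>

inline char getThird(std::string_view s) {
    if (s.size() < 3) throw std::runtime_error("String too short");
    return s[2];
}

inline void check_get_third() {
    std::string owned = "abcdef";
    const char* raw = "uvwxyz";

    assert(getThird(owned) == 'c');  // std::string converts implicitly
    assert(getThird(raw) == 'w');    // length is computed here, on each call

    std::string_view view = raw;     // length computed once...
    assert(getThird(view) == 'w');   // ...and reused by every later call
    assert(getThird(view.substr(3)) == 'z');  // substring without a copy
}
```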
It's worth noting that there is a case where the original function signature, taking a std::string by value, may actually be better than a std::string_view. That's where you were going to make a copy of the string anyway, for example to store in some other variable or to return from the function. Imagine this function:
std::string changeThird(std::string s, char c)
{
if (s.size() < 3) throw std::runtime_error("String too short");
s[2] = c;
return s;
}
// vs.
std::string changeThird(std::string_view s, char c)
{
if (s.size() < 3) throw std::runtime_error("String too short");
std::string result(s); // the conversion from std::string_view is explicit
result[2] = c;
return result;
}
Note that both of these involve exactly one copy: in the first case this is done implicitly when the parameter s is constructed from whatever is passed in (including if it is another std::string). In the second case we do it explicitly when we create result. But the return statement does not do a copy, because it uses move semantics (as if we had done std::move(result)), or more likely uses the return value optimisation.
The reason the first version can be better is that it is actually possible for it to perform zero copies, if the caller moves the argument:
std::string something = getMyString();
std::string other = changeThird(std::move(something), 'x');
In this case, the first changeThird does not involve any copy at all, whereas the second one does.

Function that takes a char array as a parameter

There is a function I want to use that takes char str[] as a parameter. I want to call the function giving a string input.
void someFunction (char str[]) {
/* ... */
}
// Works.
someFunction("1010101");
// Does not work.
string someString;
someFunction(someString);
How can I get the second call to work?
EDIT: I cannot change the function's input parameters.
Depends on the nature of the string manipulations. If you read but don't write the string, change the prototype to const char str[] and use someString.c_str(), like others are suggesting.
If you change the characters but not the length of the string, use &*someString.begin().
If you extend/truncate the string, it's easier to pass a string& and work in terms of the string object. Less trouble, honestly.
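For the middle case (characters change, length does not), a minimal sketch follows. The body of someFunction is a made-up stand-in, and the approach relies on std::string storage being contiguous, which is guaranteed from C++11 on.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in: rewrites characters but never changes the length.
inline void someFunction(char str[]) {
    for (; *str != '\0'; ++str)
        if (*str == '0') *str = '1';
}

inline void callInPlace(std::string& someString) {
    if (someString.empty()) return;  // nothing to modify
    someFunction(&someString[0]);    // same address as &*someString.begin()
}
```

If the function wrote a '\0' into the middle or wrote past the end, the string object's size would no longer match its contents, which is why this only suits the fixed-length case.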
You should be able to do:
someFunction(const_cast<char*>(someString.c_str()));
Although I'm not sure what will happen if str gets modified.
It's probably best if you just modify the original function to take a different parameter type.
What you want for std::string is void someFunction(std::string& str);
There's a reason for the issue -- a std::string's data is not guaranteed to be contiguous memory (at least, before C++11). Therefore, manipulating its buffer as a contiguous allocation (char[]) is a very bad idea.
Casting away the const of std::string::c_str() is also a bad idea. One immediate problem you may face is that a std::string implementation may share backing string allocations with other std::string instances (copy-on-write), and you will end up modifying the values of other std::strings. Of course, there are many other bad things that could go wrong in their own implementation-defined ways -- the standard left this very flexible for the implementors of standard libraries.
EDIT: I cannot change the function's input parameters.
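Use a std::vector<char> instead: copy the string's characters into it, append a terminating '\0', and pass a pointer to its first element.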
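One way to read the std::vector suggestion is sketched below. The body of someFunction is a made-up stand-in for the function that cannot be changed, and callWithCopy is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the function that cannot be changed.
inline void someFunction(char str[]) {
    for (; *str != '\0'; ++str)
        if (*str == '0') *str = '1';
}

// Copies the string into a writable, NUL-terminated buffer, calls the
// function on the buffer, and returns whatever the function left in it.
inline std::string callWithCopy(const std::string& someString) {
    std::vector<char> buffer(someString.begin(), someString.end());
    buffer.push_back('\0');
    someFunction(buffer.data());        // &buffer[0] before C++11
    return std::string(buffer.data());  // reads up to the first '\0'
}
```

The original string is never touched, so this also works when the function shortens the text. If it may lengthen the text, the buffer has to be sized for the worst case first.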
You could have your function take a std::string instead:
void someFunction (std::string &str) { /* ... */ }

C++, strings, and pointers

I know this is rudimentary but I know nothing about C++. Is it necessary to do:
string *str = getNextRequest();
rather than
string str = getNextRequest();
in order to reference str later on in the same block of code? If so, what type of error would the latter produce?
That depends entirely on the return type of getNextRequest.
Strings can be used and reused throughout the scope they're declared in. They essentially contain a mutable C string and some handling information, helper methods, etc.
You can, very safely, return a string from a function and the compiler will make a copy or move it as necessary. That string (str here) can then be used normally, with no worries about using out-of-scope locals or such.
There are times when a pointer to a string is needed, but unless you're using it as an out parameter, those tend to be rare and indicate some design oddity.
Which you use depends on what getNextRequest() returns. If it returns a string *, then use the first line; if it returns a string, then use the second.
So if the declaration of getNextRequest is like this:
string getNextRequest();
Then
string str = getNextRequest();
is correct. If the declaration is like this:
string *getNextRequest();
Then you can go with
string *str = getNextRequest();
or
string str = *getNextRequest();
string str = getNextRequest();
will create a copy of the string returned by getNextRequest. If you want to alter the contents of str and wish that these changes are also within the string returned by getNextRequest you have to return a pointer or reference.
If this is what you want, then you should define getNextRequest as:
string& getNextRequest()
and use it like:
string& str = getNextRequest();
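A small sketch of the difference between the copy and the reference. The function body is hypothetical: it returns a reference to a function-local static so that the referenced string outlives the call.

```cpp
#include <cassert>
#include <string>

// Hypothetical: hands out a reference to a string that outlives the call.
inline std::string& getNextRequest() {
    static std::string request = "GET /";
    return request;
}

inline void check_copy_vs_reference() {
    std::string copy = getNextRequest();  // independent copy
    copy += "ignored";
    assert(getNextRequest() == "GET /");  // original is untouched

    std::string& ref = getNextRequest();  // alias for the original
    ref += "index.html";
    assert(getNextRequest() == "GET /index.html");
}
```

Returning a reference to a non-static local variable instead would leave the caller with a dangling reference.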
string str* = getNextRequest();
As noted by @dasblinkenlight, that would be a syntax error.
But to answer your original question, is it necessary? No. In general, you should not use pointers unless you must.
Especially with the STL. The STL is not designed to be used with pointers--it does dynamic memory management for you. Unless you have a good reason, you should always use vector<int> v and string s rather than vector<int>* or string*.
You will probably need to provide a little bit more information regarding this function getNextRequest(). Where is it from? Library? API? Purpose?
If the return type of the function is a string* (pointer to a string), then the string typically lives on the "heap" (or has static storage), independent of the caller's scope. This means it does not matter which block of code you reference the string from: as long as you keep the pointer and the string has not been destroyed, you will be able to access it.
If the return type of the function is simply a string (meaning not a pointer), it returns the value, not the address, of the string. In essence, you would be "copying" the string into your new variable. In this case, the variable is allocated on the stack, and you can only reference it within the scope of the enclosing code block.