For instance, let's say I have
std::string str = 0;
Is the 0 being converted to a const char*?
Is it being converted to a char* and then to a const char* when it's passed to the constructor?
I understand initializing a char* to 0 is undefined behavior, and as far as I know the same goes for const char*, but I don't understand the process of what's going on when I pass 0 to the std::string constructor.
Edit: I was wrong.
You guessed correctly.
If we look at e.g. this std::string constructor reference, we can see that the only suitable constructor is number 5. Therefore your definition is equivalent to
std::string str = std::string(0);
And as noted in the reference:
The behavior is undefined if s does not point at an array of at least Traits::length(s)+1 elements of CharT, including the case when s is a null pointer.
[Emphasis mine]
So yes, it constructs a std::string from a null pointer, which is indeed UB.
I understand initializing a char* to 0 is undefined behavior
You understand wrong. A 0 literal is a null pointer constant, and it can be converted to a null pointer value of any pointer type. There's nothing undefined there. The issues come when there's overloading involved, and the 0 can be converted not just to a pointer, but to another integral type. But that conversion itself is not problematic on its own.
Which brings us to what std::string str = 0; does. It initializes str, a class type, from 0. So we need to examine the constructors. The only applicable one for 0 is this one:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
So it indeed initializes str from a null pointer. And that is what's undefined.
Related
It's not listed explicitly in the std::string constructor docs (EDIT: folks here say I should cite cppreference rather than cplusplus.com), but apparently it works. That means it's roughly the equivalent of strncpy, isn't it?
Does it work because it implicitly first initializes another std::string object that's a copy of the const char* string passed in? Does it mean it does extra work of copying the entire string though, even if it eventually only extracts a certain length of substring?
Also, this construction seems to be much like string (const char* s+pos, size_t len), except that the reference says that form causes undefined behavior if len is greater than the string length; yet with string (const char* s, size_t pos, size_t len = npos), a len that runs past the null terminator is just fine. Presumably that's because the latter internally works at the level of C++ string objects, while the former works with raw pointers.
And why doesn't that behavior get listed in the C++ reference docs?
My guess is that it's a combination of internally copying to a std::string object and then applying string (const string& str, size_t pos, size_t len = npos) to it, so it's not considered "standard". That said, I find this super useful when I have to take input as char* and don't much care about copying the entire string once: I can get away without any malloc and strncpy, and I don't have to write branching code to make sure the size limit len doesn't go out of bounds.
This works because of the presence of constructor:
std::basic_string( const basic_string& other,
size_type pos,
size_type count = std::basic_string::npos,
const Allocator& alloc = Allocator() );
const char * is implicitly convertible to std::basic_string, so the above constructor is called after said conversion when you write (for example) std::string s {"abc", 1, 2};
To address your question of efficiency, the implicit conversion from char * to std::basic_string involves construction of a temporary, so yes, the string is copied.
To my knowledge a reference cannot be null, but when I run code like this:
#include <iostream>
#include <string>
void test(int i, const std::string& s = nullptr) {
std::cout << i << " " << s << std::endl;
}
int main() {
test(1, "test");
test(2);
}
the optional parameter s can apparently be null, and the code builds. What's more, when test(2) runs, the program throws an exception instead of printing some random string.
When I changed s to a basic type like int, it failed to compile, so I think the magic lies inside the string class, but how?
And how can I check whether s is null or not? If I use if(s==nullptr) or if(s.empty()), it fails to compile.
test initializes its argument using constructor number 5 of std::basic_string<char>:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
It does so because it needs to materialize a temporary std::string to bind to that reference. A reference must be bound to an object of the correct type, which std::nullptr_t isn't. And said constructor requires that the pointer passed to it is not null, so calling test without an explicit argument leads to undefined behavior.
To be perfectly clear, there is no such thing as a null reference in a well-formed C++ program. A reference must be bound to a valid object. Trying to initialize one with nullptr only makes the compiler look for a conversion.
Since a std::string is an object with a well-defined "empty" state, a fixed version can simply pass in a default initialized string:
void test(int i, const std::string& s = {}); // Empty string by default.
Once the contract violation is fixed, s.empty() should give meaningful results again.
A reference indeed cannot be null; however, const std::string& s = nullptr does not do what you think it does. When the second argument is not specified, the compiler creates a string object by invoking the implicit string constructor that takes a pointer to a null-terminated string as its first parameter. So the test(2); invocation looks like this:
test(2, ::std::string(static_cast<char const *>(nullptr), ::std::string::allocator_type()));
Note that passing nullptr as this first parameter causes Undefined Behavior.
I'm getting a weird problem and I want to know why it behaves like that. I have a class with a member function that returns std::string. My goal is to convert this string to const char*, so I did the following
const char* c;
c = robot.pose_Str().c_str(); // is this safe??????
udp_slave.sendData(c);
The problem is that I'm getting weird characters on the master side. However, if I do the following
const char* c;
std::string data(robot.pose_Str());
c = data.c_str();
udp_slave.sendData(c);
I'm getting what I'm expecting. My question is what is the difference between the two aforementioned methods?
It's a matter of pointing to a temporary.
If you return by value but don't store the string, the temporary is destroyed at the end of the full-expression that created it (in practice, at the semicolon).
If you store it in a variable, then the pointer points to something that actually exists for the duration of your UDP send.
Consider the following:
int f() { return 2; }
int*p = &f();
Now that seems silly on its face, doesn't it? You are pointing at a value that is being copied back from f. You have no idea how long it's going to live.
Your string is the same way.
.c_str() returns a char const* by value, which means you get a copy of the pointer. But after that statement, the actual character array that it points to is destroyed along with the temporary string. That is why you get garbage. In the latter case you create a new string that copies the characters from their original location. So although the original character array is destroyed, the copy remains in the string object.
You can't use the data pointed to by c_str() past the lifetime of the std::string object from whence it came. Sometimes it's not clear what the lifetime is, such as the code below. The solution is also shown:
#include <string>
#include <cstddef>
#include <cstring>
std::string foo() { return "hello"; }
char *
make_copy(const char *s) {
std::size_t sz = std::strlen(s);
char *p = new char[sz + 1]; // +1 for the terminating '\0' that strcpy writes
std::strcpy(p, s);
return p;
}
int
main() {
const char *p1 = foo().c_str(); // Whoops, can't use p1 after this statement.
const char *p2 = make_copy(foo().c_str()); // Okay, but you have to delete [] when done.
}
From c_str():
The pointer obtained from c_str() may be invalidated by:
Passing a non-const reference to the string to any standard library function, or
Calling non-const member functions on the string, excluding operator[], at(), front(), back(), begin(), rbegin(), end() and rend().
This means that if the string returned by robot.pose_Str() is destroyed or changed by any non-const function, the pointer to its contents is invalidated. Since robot.pose_Str() may be returning a temporary copy, the pointer returned by c_str() on it becomes invalid right after that statement.
Yet, if you return a reference to the inner string you may be holding, instead of a temporary copy, you can either:
be sure it is going to work, in case your function udp_send is synchronous;
or rely on an invalid pointer, and thus experience undefined behavior if udp_send may finish after some possible modification on the inner contents of the original string.
Q
const char* c;
c = robot.pose_Str().c_str(); // is this safe??????
udp_slave.sendData(c);
A
This is potentially unsafe. It depends on what robot.pose_Str() returns. If the life of the returned std::string is longer than the life of c, then it is safe. Otherwise, it is not.
You are storing an address in c that is going to be invalid right after the statement is finished executing.
std::string s = robot.pose_Str();
const char* c = s.c_str(); // This is safe
udp_slave.sendData(c);
Here, you are storing an address in c that will be valid until you leave the scope in which s and c are defined.
I was working on a little project and came to a situation where the following happened:
std::string myString;
// GetValue() returns a char*
myString = myObject.GetValue();
My question is if GetValue() returns NULL myString becomes an empty string? Is it undefined? or it will segfault?
Interesting little question. According to the C++11 standard, sect. 21.4.2.9,
basic_string(const charT* s, const Allocator& a = Allocator());
Requires: s shall not be a null pointer.
Since the standard does not ask the library to throw an exception when this particular requirement is not met, it would appear that passing a null pointer provokes undefined behavior.
It is a runtime error (strictly speaking, undefined behavior, so a crash is likely but not guaranteed).
You should do this:
myString = ValueOrEmpty(myObject.GetValue());
where ValueOrEmpty is defined as:
std::string ValueOrEmpty(const char* s)
{
return s == nullptr ? std::string() : s;
}
Or you could return const char* (it makes better sense):
const char* ValueOrEmpty(const char* s)
{
return s == nullptr ? "" : s;
}
If you return const char*, then at the call-site, it will convert into std::string.
My question is if GetValue() returns NULL myString becomes an empty string? Is it undefined? or it will segfault?
It's undefined behavior. The compiler and runtime can do whatever they want and still be compliant.
Update:
Since C++23 adopted P2166, constructing a std::string from nullptr is forbidden. That is, std::string s = nullptr and std::string s = 0 are no longer well-formed.
const char cp[]="jkasdkasjsad";
string a=static_cast<string>(cp);//"const string a" also runs without any error
I have been stuck on the above code for the whole afternoon. C++ Primer only gives code like
const char cp[]="jkasdkasjsad";
static_cast<string>(cp);
Could someone tell me whether my code is legal? Could I call it "casting away const", since there is no "const" before "string a"?
Any well-defined type conversion, other than those involving low-level const, can be
requested using a static_cast. For example, we can force our expression to use
floating-point division by casting one of the operands to double:
I was confused by the description above. What does "those involving low-level const" mean? Involving the left side or the right side of an assignment?
Can anyone help me? Many thanks!
The characters of the cp array are copied into a new string, and that string variable is not const.
const char cp[] = "jkasdkasjsad";
std::string a = static_cast<std::string>(cp);
is equivalent to:
std::string ab = cp;
cp decays to a pointer to the first element of the cp array.
No real cast happens in this case.
static_cast<string>(cp);
is equivalent to a call to the string constructor
string(cp);
The static_cast yields a temporary of type string constructed from cp. Since I assume we are talking about std::string, this is the constructor that gets called:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
Constructs the string with the contents initialized with a copy of the
null-terminated character string pointed to by s. The length of the
string is determined by the first null character.
Your code is perfectly legal according to clause 5.2.9/4 of the C++ standard:
An expression e can be explicitly converted to a type T using a
static_cast of the form static_cast<T>(e) if the declaration T t(e);
is well-formed, for some invented temporary variable t (8.5). The
effect of such an explicit conversion is the same as performing the
declaration and initialization and then using the temporary variable
as the result of the conversion.
In your example, T is std::string and e is cp. No constness is cast away, because a new object is created from a copy of the characters. Compare with this:
char* p = static_cast<char*>(cp); // error