About the lifetime of string.c_str() - C++

I wonder whether void func(const char *str); receives a valid str if I write the following:
auto str = string("hello").c_str();
func(str);
How is it different from code below?
func(string("hello").c_str())

In both cases, the string object is a temporary, destroyed at the end of the statement.
In the first case, str ends up dangling - pointing to memory that was managed by the temporary string, but which has now been destroyed. Doing anything with it is an error, giving undefined behaviour.
In the second case, the temporary string is not destroyed until after the function returns. So this is fine, as long as the function doesn't keep hold of the pointer for something else to use later.

The difference is that the first creates a temporary string object that gets destroyed at the end of the first statement, so str becomes a dangling pointer. The second also creates a temporary, but it exists throughout the call to func because the temporary object doesn't get destroyed until after the call to func returns.

From Paragraph 12.2/3 of the C++11 Standard:
[...] Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception. [...]
This means that the temporary created within the expression that contains the call to func() will live until function call returns.
On the other hand, the lifetime of the temporary in the first code snippet ends before func() is invoked, so str will be dangling. Using it results in Undefined Behavior.
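The ordering described in these answers can be observed without touching a dangling pointer. The following is only a sketch: Traced is a hypothetical stand-in for std::string that logs its own destruction, and its c_str() returns a pointer to a string literal, so nothing here is actually undefined.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log for the sketch; purely illustrative.
static std::vector<std::string> events;

// Hypothetical stand-in for std::string that records when it is destroyed.
struct Traced {
    const char* text;
    explicit Traced(const char* t) : text(t) {}
    ~Traced() { events.push_back("destroyed"); }
    // Mimics std::string::c_str(). Here it points at a string literal,
    // so the pointer stays usable even after the Traced object is gone.
    const char* c_str() const { return text; }
};

void func(const char* s) {
    assert(s != nullptr);
    events.push_back("func called");
}

// Case 1: the temporary is destroyed at the end of the initialisation.
std::vector<std::string> pointer_stored_first() {
    events.clear();
    const char* p = Traced("hello").c_str();  // temporary dies at this ';'
    func(p);  // with a real std::string, p would already be dangling here
    return events;
}

// Case 2: the temporary lives until the full-expression with the call ends.
std::vector<std::string> passed_directly() {
    events.clear();
    func(Traced("hello").c_str());  // temporary dies after func returns
    return events;
}
```

In the first function the log reads "destroyed" and then "func called"; in the second the order is reversed, which is exactly the difference between the two snippets in the question.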

Related

How to comprehend "Temporary objects are destroyed as the last step in evaluating the full-expression"? Could anyone make it clear with a simple example?

The documentation says:
When an implementation introduces a temporary object of a class that
has a non-trivial constructor ([class.default.ctor],
[class.copy.ctor]), it shall ensure that a constructor is called for
the temporary object. Similarly, the destructor shall be called for a
temporary with a non-trivial destructor ([class.dtor]). Temporary
objects are destroyed as the last step in evaluating the
full-expression ([intro.execution]) that (lexically) contains the
point where they were created. This is true even if that evaluation
ends in throwing an exception. The value computations and side effects
of destroying a temporary object are associated only with the
full-expression, not with any specific subexpression.
How to comprehend "Temporary objects are destroyed as the last step in evaluating the full-expression ([intro.execution]) that (lexically) contains the point where they were created."? Could anybody make it clear with some simple examples?
Simple example. This expression produces a temporary object:
std::string("test")
Here, that expression is used as a subexpression:
function(std::string("test"));
// point A
At point A, the temporary object has been destroyed because the point is after the full-expression where the temporary object was created.
Here is an example of how to write a bug if this rule is not understood:
const std::string& function(const std::string& arg) {
return arg;
}
const std::string& ref = function("test");
std::cout << ref;
Here, the temporary object that was created as the argument is destroyed after the full expression, and therefore ref has become invalid - a dangling reference. The behaviour is undefined when the invalid reference is inserted into the output stream.
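One possible way to avoid this bug, sketched under the assumption that copying the string is acceptable, is to return by value. The names function_by_value and by_value_demo are made up for this illustration. Binding the returned prvalue directly to a const reference extends its lifetime to that of the reference, which is one of the exceptions to the general rule.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of function() that returns a copy, not a reference.
std::string function_by_value(const std::string& arg) {
    return arg;  // copies arg while the caller's temporary is still alive
}

std::string by_value_demo() {
    // The temporary std::string built from "test" still dies at the ';',
    // but the returned copy is bound directly to ref, so its lifetime is
    // extended until ref goes out of scope.
    const std::string& ref = function_by_value("test");
    assert(ref == "test");
    return ref;
}
```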
An explanation that works in many cases is that temporary objects are destroyed when execution reaches the semicolon at the end of the statement. Some language constructs (such as a for loop) are not covered by this explanation, so don't push it too hard. For a better explanation of the exceptions, see Statement that isn't a full-expression.
As one example:
i = foo1(foo2(std::string("test")));
The temporary string is kept alive until after the assignment, as the assignment occurs before the end of the statement. (In this case, the full expression is the statement.)
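A rough way to watch this happen is to log the destructor call. In the sketch below Noisy is a hypothetical wrapper around std::string, and the bodies of foo1 and foo2 are invented purely so that something can be returned and logged.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

static std::vector<std::string> log_lines;

// Hypothetical string wrapper that records its destruction.
struct Noisy {
    std::string value;
    explicit Noisy(std::string v) : value(std::move(v)) {}
    ~Noisy() { log_lines.push_back("~Noisy"); }
};

// Invented bodies: foo2 returns the length, foo1 doubles it.
int foo2(const Noisy& n) {
    log_lines.push_back("foo2");
    return static_cast<int>(n.value.size());
}

int foo1(int x) {
    log_lines.push_back("foo1");
    return x * 2;
}

std::vector<std::string> run_nested() {
    log_lines.clear();
    int i = 0;
    i = foo1(foo2(Noisy("test")));  // Noisy lives until the end of this statement
    assert(i == 8);                 // "test" has 4 characters, doubled by foo1
    log_lines.push_back("after statement, i=" + std::to_string(i));
    return log_lines;
}
```

The destructor entry appears after both function calls and before the line that follows the statement.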

Best practices with references

Purely out of curiosity, and to clarify my understanding, I would like to ask whether the way I use references and values is good practice.
Theoretically:
class ComplexGraphicalShape {
...
public:
void setRasterImageURL(const QString &rasterImageURL);
const QString &rasterImageURL() const;
...
private:
const QString *_rasterImageURL;
};
...
void ShadowGram::setRasterImageURL(const QString &rasterImageURL) {
safeDelete(_rasterImageURL); // handle deletion
_rasterImageURL = new QString(rasterImageURL);
}
const QString &ShadowGram::rasterImageURL() const{
// Question 2: Why is it a problem if I return
// return "www.url.com/shape_url.jpg"
return *_rasterImageURL; // that is the right way
}
...
complexGraphicalShape().setRasterImageURL(kURLImagesToShare + imageName);
complexGraphicalShape().setRasterImageURL("www.url.com/url.jpg"); // Question 1.
My first question: for how long can I use the reference to the temporary object that is created for the setRasterImageURL function call? Where does that object live? (On the stack, if I am not mistaken, but what if I call another function with that temporary reference?)
My second question: why do I get a warning in the Question 2 section if I use return "www.url.com/shape_url.jpg"? It seems similar. How long can I use that temporary object?
Thanks for your time, answers, and explanations.
The temporary exists until setRasterImageURL returns, so you can safely pass a reference to it along, but you need to be careful not to save the reference for later. The temporary is stored wherever the compiler wants to. The reference is most likely passed either in a register or on the stack.
It is a problem because you're returning a reference to a temporary QString object, and that object is destroyed when the function returns. You're not allowed to use the reference at all.
Passing a reference "inwards" to a function is (usually) safe as long as you don't store it, while passing a reference "outwards" from a function requires you to make sure that the referenced object still exists when the function returns.
Q1: The temporary string exists as long as the temporary reference that is "bound" to it. That is - as long as you are "inside" setRasterImageURL() function. This - of course - includes all functions called "within" this function. Note that storing another reference to this temporary string does NOT prolong the lifetime of the temporary object.
complexGraphicalShape().setRasterImageURL("www.url.com/url.jpg");
// the temporary object is destroyed at the end of this statement, so it is only usable during the called function
Q2: The problem with returning is that you use "C string" (array of characters) to create a temporary QString object (on stack, still inside the function) and return reference to that temporary. As this temporary object is destroyed right after this function returns, your reference is never valid and refers to a dead object. On the other hand - returning a reference to a member variable works, because this object is not destroyed, so the reference is valid as long as your main object lives.
const QString &ShadowGram::rasterImageURL() const{
return "www.url.com/shape_url.jpg";
// the temporary QString is destroyed at the end of the return statement, so the returned reference is never valid
}
My first question is that how long can I use the temporary object reference which is created inside setRasterImageURL functioncall?
It's not created inside the function call, it's created on the caller's stack before the function is called, and is destroyed after the function returns.
Where exist that variable?(in the stack If I am not mistaken, but what if I call another function with that temporary reference.
Yes, on the stack. It is destroyed at the ; after the function call returns (at the end of the "full expression").
That thing is kind of similar. How long can I use that temporary object?
Until the end of the full expression that creates the temporary, which is the return statement, so it goes out of scope immediately before the function has even finished returning. That's why you get a warning - the returned reference is bound to an object which no longer exists, and is never safe to use.
Both these cases are covered by 12.2 [class.temporary] paragraph 5 in the standard:
— A temporary object bound to a reference parameter in a function call (5.2.2) persists until the completion of the full-expression containing the call.
— The lifetime of a temporary bound to the returned value in a function return statement (6.6.3) is not extended; the temporary is destroyed at the end of the full-expression in the return statement.
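Both rules can be turned into a pattern that stays safe. The following is a minimal sketch that uses std::string in place of QString so that it compiles without Qt; Shape, setUrl, url, defaultUrl and shape_demo are hypothetical names, not the asker's API. The setter copies its argument instead of keeping the reference, and the getters only return references to objects that outlive the call.

```cpp
#include <cassert>
#include <string>

// Hypothetical class; std::string stands in for QString.
class Shape {
public:
    // The argument may be a temporary: copy it, never store the reference.
    void setUrl(const std::string& url) { url_ = url; }

    // Refers to a member, so it stays valid as long as *this is alive.
    const std::string& url() const { return url_; }

    // A fixed URL can be returned by reference if the object is static.
    static const std::string& defaultUrl() {
        static const std::string kDefault = "www.url.com/shape_url.jpg";
        return kDefault;
    }

private:
    std::string url_;  // stored by value, so no manual new/delete is needed
};

void shape_demo() {
    Shape shape;
    // The concatenated temporary is destroyed at the ';', the copy survives.
    shape.setUrl(std::string("www.url.com/") + "url.jpg");
    assert(shape.url() == "www.url.com/url.jpg");
}
```

Storing the string by value rather than through an owning raw pointer is a design choice of this sketch, not something the answers above require.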

What's the scope of this string?

If I have the following code:
{
UnicodeString sFish = L"FISH";
char *szFish = AnsiString(sFish).c_str();
CallFunc(szFish);
}
Then what is the scope of the temporary AnsiString that's created, and for how long is szFish pointing to valid data? Will it still be valid for the CallFunc function?
Will its scope last just the one line, or the whole block?
szFish is invalid before the call to CallFunc(), because AnsiString is a temporary object that is destructed immediately and szFish is pointing to its internal buffer which will have just been deleted.
Ensure that the AnsiString instance is valid for the invocation of CallFunc(). For example:
CallFunc(AnsiString(sFish).c_str());
I would replace:
char *szFish = AnsiString(sFish).c_str();
with:
AnsiString as(sFish);
char *szFish = as.c_str();
I don't know the AnsiString class but in your code its destructor will fire before your call to CallFunc(), and will most probably release the string you point to with *szFish. When you replace the temporary object with a "named" object on stack its lifetime will extend until the end of the block it is defined in.
The C++11 standard, §12.2/3, says:
When an implementation introduces a temporary object of a class that
has a non-trivial constructor (12.1, 12.8), it shall ensure that a
constructor is called for the temporary object. Similarly, the
destructor shall be called for a temporary with a non-trivial
destructor (12.4). Temporary objects are destroyed as the last step in
evaluating the full-expression (1.9) that (lexically) contains the
point where they were created. This is true even if that evaluation
ends in throwing an exception. The value computations and side effects
of destroying a temporary object are associated only with the
full-expression, not with any specific subexpression.
(emphasis mine)
There are additional caveats to this, but they don't apply in this situation. In your case the full expression is the indicated part of this statement:
char *szFish = AnsiString(sFish).c_str();
// ^^^^^^^^^^^^^^^^^^^^^^^^^
So, the instant szFish is initialized, the destructor of your temporary object (i.e. AnsiString(sFish)) is called and its internal buffer (which c_str() points to) is released. Thus szFish immediately becomes a dangling pointer, and any access through it is undefined behaviour.
You can get around this by saying
CallFunc(AnsiString(sFish).c_str());
instead, as here, the temporary will be destroyed (again) after the full expression (that is, right at the ;) and CallFunc will be able to read the raw string.
The scope of the AnsiString in this case is "from right before the call to c_str(), until right after."
It may help to think of it this way:
char *szFish;
{
AnsiString tmpString(sFish);
szFish = tmpString.c_str();
}
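For contrast, the named-object fix suggested above can be sketched as follows. AnsiString is specific to C++Builder, so this sketch substitutes a hypothetical Buffer class built on std::string, and the body of CallFunc is invented. The log shows that a named object outlives the call and is destroyed only at the end of its block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

static std::vector<std::string> trace;

// Hypothetical stand-in for AnsiString: owns its characters, logs destruction.
struct Buffer {
    std::string data;
    explicit Buffer(const std::string& s) : data(s) {}
    ~Buffer() { trace.push_back("Buffer destroyed"); }
    const char* c_str() const { return data.c_str(); }
};

// Invented body: just measures the string it is given.
std::size_t CallFunc(const char* s) {
    assert(s != nullptr);
    trace.push_back("CallFunc");
    return std::strlen(s);
}

std::vector<std::string> named_object() {
    trace.clear();
    {
        Buffer as("FISH");                // named object: lives to the end of the block
        const char* szFish = as.c_str();  // valid for as long as 'as' is alive and unmodified
        CallFunc(szFish);
    }  // 'as' is destroyed here, after the call
    trace.push_back("block left");
    return trace;
}
```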

When do temporary parameter values go out of scope? [duplicate]

#include <cstring> // strlen
#include <string>
int LegacyFunction(const char *s) {
// do something with s, like print it to standard output
// this function does NOT retain any pointer to s after it returns.
return strlen(s);
}
std::string ModernFunction() {
// do something that returns a string
return "Hello";
}
LegacyFunction(ModernFunction().c_str());
The above example could easily be rewritten to use smart pointers instead of strings; I've encountered both of these situations many times. Anyway, the above example will construct an STL string in ModernFunction, return it, then get a pointer to a C-style string inside of the string object, and then pass that pointer to the legacy function.
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction? (Remember that the string object is managing the memory that c_str() return value points to...)
If the above code is not safe, why is it not safe, and is there a better, equally concise way to write it than adding a temporary variable when making the function calls? If it's safe, why?
LegacyFunction(ModernFunction().c_str());
The temporary returned by ModernFunction() is destroyed after the full-expression has been evaluated (i.e. after LegacyFunction returns).
n3337 12.2/3
Temporary objects are destroyed as the last step
in evaluating the full-expression (1.9) that (lexically) contains the point where they were created.
n3337 1.9/10
A full-expression is an expression that is not a subexpression of another expression. If a language construct
is defined to produce an implicit call of a function, a use of the language construct is considered to be an
expression for the purposes of this definition. A call to a destructor generated at the end of the lifetime of
an object other than a temporary object is an implicit full-expression. Conversions applied to the result of
an expression in order to satisfy the requirements of the language construct in which the expression appears
are also considered to be part of the full-expression.
[ Example:
struct S {
S(int i): I(i) { }
int& v() { return I; }
private:
int I;
};
S s1(1); // full-expression is call of S::S(int)
S s2 = 2; // full-expression is call of S::S(int)
void f() {
if (S(3).v()) // full-expression includes lvalue-to-rvalue and
// int to bool conversions, performed before
// temporary is deleted at end of full-expression
{ }
}
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Strictly speaking, it's never in scope. Scope is a property of a name, not an object. It just so happens that automatic variables have a very close association between scope and lifetime. Objects that aren't automatic variables are different.
Temporary objects are destroyed at the end of the full-expression in which they appear, with a couple of exceptions that aren't relevant here. Anyway the special cases extend the lifetime of the temporary, they don't reduce it.
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction
No, because the full-expression is LegacyFunction(ModernFunction().c_str()) (excluding the semi-colon: feel that pedantry), so the temporary that is the return value of ModernFunction is not destroyed until LegacyFunction has returned.
If it's safe, why?
Because the lifetime of the temporary is long enough.
In general with c_str, you have to worry about two things. First, the pointer it returns becomes invalid if the string is destroyed (which is what you're asking). Second, the pointer it returns becomes invalid if the string is modified. You haven't worried about that here, but it's OK, you don't need to, because nothing modifies the string either.
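Both concerns can be handled with ordinary std::string operations. This is a small sketch: ModernFunction and LegacyFunction follow the question, with LegacyFunction changed to return std::size_t, while safe_direct_call and keep_for_later are hypothetical helpers added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

std::string ModernFunction() { return "Hello"; }

std::size_t LegacyFunction(const char* s) {
    assert(s != nullptr);
    return std::strlen(s);  // does not retain s after returning
}

// Concern 1 (destruction): passing c_str() directly is fine, because the
// temporary string lives until the whole call expression has been evaluated.
std::size_t safe_direct_call() {
    return LegacyFunction(ModernFunction().c_str());
}

// Concern 2 (modification): copy the characters before changing the string,
// since appending may reallocate and invalidate earlier c_str() pointers.
std::string keep_for_later() {
    std::string owned = ModernFunction();
    std::string snapshot = owned.c_str();  // independent copy of the characters
    owned += ", world";                    // pointers taken before this are stale
    return snapshot;
}
```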

Is this a valid function?

What happens to the reference in function parameter, if it gets destroyed when the function returns, then how const int *i is still a valid pointer?
const int* func(const int &x = 5)
{
return &x;
}
int main()
{
const int *i = func();
}
§12.2/5:
"A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call."
That means as i is being initialized, it's getting the address of a temporary object that does exist at that point. As soon as i is initialized, however, the temporary object will be destroyed, and i will become just another dangling pointer.
As such, yes, the function is valid -- but with the surrounding code as you've written it, any code you added afterward that attempted to dereference i would give undefined behavior.
Just because a pointer has a value doesn't mean it's a valid pointer.
In this case it holds an address which used to be that of x, and chances are that address still has the value 5, but it's not valid pointer and you can't count on that value being there.
const int *i points to a patch of memory that is unsafe to access; it is not a valid pointer.
The variable "i" is still a pointer, but even reading the value it points to gives undefined behavior. That's why you should never write a function like func.
I think that x is created as an un-named temporary on the stack in setting up the call to func(). This temporary will exist until at least the end of the statement in the caller. So the int* i is perfectly valid. It only ceases to be valid at the end of the statement - which means that you cannot use it.
There is something in the standard about un-named temporaries being retained until the last reference to them goes out of scope, but I don't think it covers this explicit and hidden indirection.
[ Happy to have someone tell me otherwise.]
5 is program data. It is in the data segment, not the stack or heap.
So a pointer or reference to it will remain valid for the duration of the program.
Default arguments are evaluated every time the function is called, so the call func() is actually func(5), which binds a temporary to a reference-to-const. That temporary persists until the end of the full-expression containing the call and is then destroyed. Any pointer to the object after that is invalid, and dereferencing it is undefined behaviour.
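The per-call evaluation of default arguments can be checked with a counter, and returning by value sidesteps the dangling pointer altogether. This is only a sketch: next_default, by_value and default_argument_demo are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <vector>

static int evaluations = 0;

// Hypothetical default-argument expression that counts its own evaluations.
int next_default() { return ++evaluations; }

// Returns a copy, so nothing refers to the temporary after the call.
int by_value(const int& x = next_default()) { return x; }

std::vector<int> default_argument_demo() {
    evaluations = 0;
    std::vector<int> results;
    results.push_back(by_value());    // default evaluated: fresh temporary holding 1
    results.push_back(by_value());    // evaluated again: fresh temporary holding 2
    results.push_back(by_value(10));  // explicit argument: default not evaluated
    results.push_back(evaluations);   // total number of default evaluations
    assert(results.size() == 4);
    return results;
}
```

Each call without an argument creates a new temporary, so a pointer into one of them cannot be expected to remain meaningful after that call's full-expression ends.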