Is this well defined behavior?
const char* p = (std::string("Hello") + std::string("World")).c_str();
std::cout << p;
I am not sure. Reasons?
No, this is undefined behavior. Both std::string temporaries and the temporary returned by operator+ only live until the end of the initialization of your const char* (end of full expression). Then they are destroyed and p points to uncertain memory.
No, the behaviour is undefined, because p points to deallocated storage by the time it is used in std::cout << p;
A temporary is created to hold std::string("Hello") + std::string("World"). A C-style string is then retrieved from that object. At the end of the full expression that temporary is destroyed, leaving p pointing to deallocated storage.
Using p then invokes Undefined Behavior.
12.2/4 says
There are two contexts in which temporaries are destroyed at a different point than the end of the full-expression. The first context is when an expression appears as an initializer for a declarator defining an object. In that context, the temporary that holds the result of the expression shall persist until the object’s initialization is complete.
....
It won't compile because of a missing semicolon:
const char* p = (std::string("Hello") + std::string("World")).c_str(); //<< important here
std::cout << p;
NOW the rule applies that the temporary is deleted at the end of the expression it is used in, which is at the semicolon. So you have a pointer to deleted memory which causes undefined behaviour.
I recently read this excellent book http://www.amazon.co.uk/Gotchas-Avoiding-Addison-Wesley-Professional-Computing/dp/0321125185/ref and if I recall correctly this is pretty much one of the examples given there.
I believe the data returned from c_str is only valid as long as the string object that returned it is live. The string objects in your example are only live for the duration of the expression.
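A minimal sketch of the usual fix, assuming you just need the characters to stay valid while you use them: give the string a name so it outlives the pointer. The helper names make_greeting and greeting_length are made up for illustration and are not from the question.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: returns the concatenated string by value.
std::string make_greeting() {
    return std::string("Hello") + std::string("World");
}

// The named object `s` owns the buffer for as long as `p` is in use,
// so `p` is not dangling when it is read.
std::size_t greeting_length() {
    const std::string s = make_greeting();
    const char* p = s.c_str();  // valid while s is alive and unmodified
    return std::strlen(p);
}
```

The pointer is only ever used inside the scope of s, which is the property the original one-liner was missing.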
Related
In the following example:
http://coliru.stacked-crooked.com/a/7a1df22bb73f6030
struct D{
int i;
auto test2(int&& j){
return [&](){ // captured by reference!
cout << i*(j);
};
}
};
int main()
{
D d{10};
{
auto fn = d.test2(10);
fn(); // 1. wrong result here
d.test2(10)(); // 2. but ok here
}
}
Why does d.test2(10)(); work?
Should it really work, or is that just my undefined behavior happening to give the correct result?
P.S. After reading this I see only one explanation: in (2) the temporary's lifetime lasts until the end of the full expression, and the call happens in the same expression as the && creation, while (1) actually consists of two expressions:
a temporary bound to a reference parameter in a function call exists
until the end of the full expression containing that function call: if
the function returns a reference, which outlives the full expression,
it becomes a dangling reference.
Is this the case?
A temporary object lasts until the end of the line (well, full expression) where it is created, unless the lifetime is extended.
Your code does not extend the lifetimes of any temporaries. Lifetime extension through binding to references does not "commute", only the first binding extends lifetime.
So the first case is UB, as you have a dangling reference. The referred-to temporary goes away at the end of the line; on the next line you follow the reference, and chaos happens.
In the second case, your reference does not extend the lifetime of the temporary, but the temporary lasts longer than the reference that binds to it does! They both die at the end of the line, in reverse order of construction.
So the call works.
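If you want the stored closure to be safe as well, one option is to capture copies instead of references. This is only a sketch under that assumption: it reuses the names D and test2 from the question, returns the product rather than printing it so the result is easy to check, and needs C++14 for the init-captures.

```cpp
struct D {
    int i;

    // Copies both values into the closure, so nothing dangles once the
    // temporary bound to `j` is destroyed at the end of the full expression.
    auto test2(int&& j) const {
        return [factor = i, value = j]() { return factor * value; };
    }
};
```

With this version both auto fn = d.test2(10); fn(); and d.test2(10)(); read only data owned by the closure itself.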
Should it really work, or is that just my undefined behavior happening to give the correct result?
Seems like it. In the example you linked, you have these warnings:
warning: '<anonymous>' is used uninitialized in this function [-Wuninitialized]
Uninitialized objects have indeterminate values, and trying to access those values results in undefined behavior.
I just wrote this without thinking too hard about it. It seems to work fine, but I'm not sure if it's strictly safe.
class Foo
{
struct Buffer
{
char data [sizeof ("output will look like this XXXX YYYY ZZZZ")];
};
const char * print (const char * format = DEFAULT_FORMAT, Buffer && buf = Buffer ())
{
sort_of_sprintf_thing (format, buf .data, sizeof (buf.data), ...);
return buf .data;
}
};
std :: cout << Foo () .print ();
So I think the semantics are that the temporary Buffer will remain in existence until the whole cout statement completes. Is that right, or will it go out of scope before then, in which case this is UB?
Yes, your code is well-defined.
[class.temporary]
3 - [...] Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. [...]
[intro.execution]
11 - [ Note: The evaluation of a full-expression can include the evaluation of subexpressions that are not lexically
part of the full-expression. For example, subexpressions involved in evaluating default arguments (8.3.6) are
considered to be created in the expression that calls the function, not the expression that defines the default
argument. — end note ]
That doesn't mean it's particularly good, though - it would be far too easy to bind the result of Foo().print() to a char const* variable, which would on the next full-expression become a dangling pointer.
The code is bad, the problem being not on the calling site, but rather on the print function. You are taking an rvalue (only thing that will bind to an rvalue-reference) and returning a pointer to its internals which is a recipe for Undefined Behavior (if the user dereferences the const char* returned).
In the particular example that you have, the Foo() temporary will live long enough, but this code is prone to clients storing the const char* beyond the full expression and causing undefined behavior.
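One way to remove the trap, sketched here under the assumption that an extra copy is acceptable, is to return an owning std::string instead of a pointer into a temporary buffer. The "value %d" format and the int parameter are placeholders invented for the example, not the real format from the question.

```cpp
#include <cstdio>
#include <string>

class Foo {
public:
    // Returns an owning std::string, so callers may store the result
    // beyond the full expression that made the call.
    std::string print(int x = 0) const {
        char buf[32];
        std::snprintf(buf, sizeof buf, "value %d", x);  // placeholder format
        return std::string(buf);
    }
};
```

std::cout << Foo().print(); still works as before, and std::string s = Foo().print(); no longer leaves anything dangling.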
Given void func(const char *str);, I wonder whether str refers to valid memory if I write the following:
auto str = string("hello").c_str();
func(str);
How is it different from code below?
func(string("hello").c_str())
In both cases, the string object is a temporary, destroyed at the end of the statement.
In the first case, str ends up dangling - pointing to memory that was managed by the temporary string, but which has now been destroyed. Doing anything with it is an error, giving undefined behaviour.
In the second case, the temporary string is not destroyed until after the function returns. So this is fine, as long as the function doesn't keep hold of the pointer for something else to use later.
The difference is that the first creates a temporary string object that gets destroyed at the end of the first statement, so str becomes a dangling pointer. The second also creates a temporary, but it exists throughout the call to func because the temporary object doesn't get destroyed until after the call to func returns.
From Paragraph 12.2/3 of the C++11 Standard:
[...] Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception. [...]
This means that the temporary created within the expression that contains the call to func() will live until function call returns.
On the other hand, the lifetime of the temporary in the first code snippet will end before func() is invoked, and str will be dangling. Using it results in Undefined Behavior.
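The two safe variants can be sketched side by side. Here func is given a body that returns the length (an assumption, since the question only declares it), and the two wrapper function names are made up for the example.

```cpp
#include <cstring>
#include <string>

// Assumed body for illustration: uses the pointer only during the call.
std::size_t func(const char* str) { return std::strlen(str); }

// The temporary string lives until the end of the full expression,
// which is after func has returned.
std::size_t call_in_one_expression() {
    return func(std::string("hello").c_str());
}

// A named string outlives the pointer taken from it.
std::size_t call_with_named_string() {
    const std::string s("hello");
    const char* str = s.c_str();
    return func(str);
}
```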
int LegacyFunction(const char *s) {
// do something with s, like print it to standard output
// this function does NOT retain any pointer to s after it returns.
return strlen(s);
}
std::string ModernFunction() {
// do something that returns a string
return "Hello";
}
LegacyFunction(ModernFunction().c_str());
The above example could easily be rewritten to use smart pointers instead of strings; I've encountered both of these situations many times. Anyway, the above example will construct an STL string in ModernFunction, return it, then get a pointer to a C-style string inside of the string object, and then pass that pointer to the legacy function.
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction? (Remember that the string object is managing the memory that c_str() return value points to...)
If the above code is not safe, why is it not safe, and is there a better, equally concise way to write it than adding a temporary variable when making the function calls? If it's safe, why?
LegacyFunction(ModernFunction().c_str());
The temporary will be destroyed after evaluation of the full expression (i.e. after LegacyFunction returns).
n3337 12.2/3
Temporary objects are destroyed as the last step
in evaluating the full-expression (1.9) that (lexically) contains the point where they were created.
n3337 1.9/10
A full-expression is an expression that is not a subexpression of another expression. If a language construct
is defined to produce an implicit call of a function, a use of the language construct is considered to be an
expression for the purposes of this definition. A call to a destructor generated at the end of the lifetime of
an object other than a temporary object is an implicit full-expression. Conversions applied to the result of
an expression in order to satisfy the requirements of the language construct in which the expression appears
are also considered to be part of the full-expression.
[ Example:
struct S {
S(int i): I(i) { }
int& v() { return I; }
private:
int I;
};
S s1(1); // full-expression is call of S::S(int)
S s2 = 2; // full-expression is call of S::S(int)
void f() {
if (S(3).v()) // full-expression includes lvalue-to-rvalue and
// int to bool conversions, performed before
// temporary is deleted at end of full-expression
{ }
}
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Strictly speaking, it's never in scope. Scope is a property of a name, not an object. It just so happens that automatic variables have a very close association between scope and lifetime. Objects that aren't automatic variables are different.
Temporary objects are destroyed at the end of the full-expression in which they appear, with a couple of exceptions that aren't relevant here. Anyway the special cases extend the lifetime of the temporary, they don't reduce it.
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction
No, because the full-expression is LegacyFunction(ModernFunction().c_str()) (excluding the semi-colon: feel that pedantry), so the temporary that is the return value of ModernFunction is not destroyed until LegacyFunction has returned.
If it's safe, why?
Because the lifetime of the temporary is long enough.
In general with c_str, you have to worry about two things. First, the pointer it returns becomes invalid if the string is destroyed (which is what you're asking). Second, the pointer it returns becomes invalid if the string is modified. You haven't worried about that here, but it's OK, you don't need to, because nothing modifies the string either.
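The second point (invalidation by modification) is easy to trip over, so here is a small sketch of one way to stay safe, assuming you need the old contents after changing the string: copy the characters before modifying. The function name is hypothetical.

```cpp
#include <string>

std::string snapshot_then_modify() {
    std::string s = "Hello";
    // Copy the characters now; a raw pointer from c_str() may be
    // invalidated by the modification below.
    const std::string before(s.c_str());
    s += " World, with enough extra text that a reallocation is likely";
    // Any pointer obtained from s.c_str() before += must not be used here.
    return before;
}
```

If you only need the new contents, calling c_str() again after the modification is enough.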
What happens to the reference in function parameter, if it gets destroyed when the function returns, then how const int *i is still a valid pointer?
const int* func(const int &x = 5)
{
return &x;
}
int main()
{
const int *i = func();
}
§12.2/5:
"A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call."
That means as i is being initialized, it's getting the address of a temporary object that does exist at that point. As soon as i is initialized, however, the temporary object will be destroyed, and i will become just another dangling pointer.
As such, yes, the function is valid -- but with the surrounding code as you've written it, any code you added afterward that attempted to dereference i would give undefined behavior.
Just because a pointer has a value doesn't mean it's a valid pointer.
In this case it holds an address which used to be that of x, and chances are that address still has the value 5, but it's not a valid pointer and you can't count on that value being there.
i points to a patch of memory that is unsafe to access; it is not a valid pointer.
The variable "i" is still a pointer, but even reading the value it points to will give you undefined behavior. That's why you should never write a function like func.
I think that x is created as an un-named temporary on the stack in setting up the call to func(). This temporary will exist until at least the end of the statement in the caller. So the int* i is perfectly valid. It only ceases to be valid at the end of the statement - which means that you cannot use it.
There is something in the standard about un-named temporaries being retained until the last reference to them goes out of scope, but I don't think it covers this explicit and hidden indirection.
[ Happy to have someone tell me otherwise.]
5 is program data. It is in the data segment, not the stack or heap.
So a pointer or reference to it will remain valid for the duration of the program.
Default arguments are evaluated every time the function is called, so the call func() is effectively func(5), which binds a temporary to a reference-to-const. That temporary persists until the end of the full expression containing the call (here, the initialization of i) and is then destroyed. Any pointer to the object after that is invalid, and dereferencing it is undefined behaviour.
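A short sketch of what is still safe, assuming you keep the pointer-returning signature: dereference inside the same full expression as the call, or simply return by value. The names address_of, read_immediately and value_of are invented for the example.

```cpp
// Same shape as func in the question: returns the address of its argument.
const int* address_of(const int& x = 5) { return &x; }

// The dereference happens inside the full expression that contains the
// call, so the temporary holding 5 still exists when it is read.
int read_immediately() {
    return *address_of();
}

// Simpler alternative: return a copy, so no pointer escapes at all.
int value_of(const int& x = 5) { return x; }
```

Storing the result of address_of() in a pointer and reading it on a later line is the case the answers above describe as undefined behaviour.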