When are temporaries created as part of a function call destroyed? - c++

Is a temporary created as part of an argument to a function call guaranteed to stay around until the called function ends, even if the temporary isn't passed directly to the function?
There's virtually no chance that was coherent, so here's an example:
#include <cstdio>
class A {
public:
A(int x) : x(x) {printf("Constructed A(%d)\n", x);}
~A() {printf("Destroyed A\n");}
int x;
int* y() {return &x;}
};
void foo(int* bar) {
printf("foo(): %d\n", *bar);
}
int main(int argc, char** argv) {
foo(A(4).y());
}
If A(4) were passed directly to foo it would definitely not be destroyed until after the foo call ended, but instead I'm calling a method on the temporary and losing any reference to it. I would instinctively think the temporary A would be destroyed before foo even starts, but testing with GCC 4.3.4 shows it isn't; the output is:
Constructed A(4)
foo(): 4
Destroyed A
The question is, is GCC's behavior guaranteed by the spec? Or is a compiler allowed to destroy the temporary A before the call to foo, invalidating the pointer to its member I'm using?

Temporary objects exist up until the end of the full expression in which they are created.
In your example, the A object created by A(4) will exist at least until the expression ends just after the return from the call to foo().
This behavior is guaranteed by the language standard:
Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception (C++03 §12.2/3).
The lifetime of the temporary may be extended by binding a reference to it (in which case its lifetime is extended until the end of the lifetime of the reference), or by using it as an initializer in a constructor's initializer list (in which case its lifetime is extended until the object being constructed is fully constructed).

§12.2/3: "Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created."
IOW, you're safe -- the A object must not be destroyed until after foo returns.

The temporary lasts until the end of the expression it is part of - which in this case is a function call.

The lifetime of your temp object A(4) will last long enough to call y()
The memory pointed to in the return of y() is not reliable, depending on threading and allocations it may be reallocated and the value changed before the call to foo() makes use of it.

Related

How to comprehend "Temporary objects are destroyed as the last step in evaluating the full-expression"? Could anyone make it clear with a simple example?

The documentation says:
When an implementation introduces a temporary object of a class that
has a non-trivial constructor ([class.default.ctor],
[class.copy.ctor]), it shall ensure that a constructor is called for
the temporary object. Similarly, the destructor shall be called for a
temporary with a non-trivial destructor ([class.dtor]). Temporary
objects are destroyed as the last step in evaluating the
full-expression ([intro.execution]) that (lexically) contains the
point where they were created. This is true even if that evaluation
ends in throwing an exception. The value computations and side effects
of destroying a temporary object are associated only with the
full-expression, not with any specific subexpression.
How to comprehend "Temporary objects are destroyed as the last step in evaluating the full-expression ([intro.execution]) that (lexically) contains the point where they were created."? Could anybody make it clear with some simple examples?
Simple example. This expression produces a temporary object:
std::string("test")
Here, that expression is used as a subexpression:
function(std::string("test"));
// point A
At point A, the temporary object has been destroyed because the point is after the full-expression where the temporary object was created.
Here is an example of how to write a bug if this rule is not understood:
const std::string& function(const std::string& arg) {
return arg;
}
const std::string& ref = function("test");
std::cout << ref;
Here, the temporary object that was created as the argument is destroyed after the full expression, and therefore ref has become invalid - a dangling reference. The behaviour is undefined when the invalid reference is inserted into the output stream.
An explanation that works in many cases is that temporary objects are destroyed when execution reaches the semicolon at the end of the statement. Some language constructs (such as a for loop) are not covered by this explanation, so don't push it too hard. For a better explanation of the exceptions, see Statement that isn't a full-expression.
As one example:
i = foo1(foo2(std::string("test")));
The temporary string is kept alive until after the assignment, as the assignment occurs before the end of the statement. (In this case, the full expression is the statement.)

What's the rationale of the exceptions of temporary object lifetime expansion when bound to a reference?

In 12.2 of C++11 standard:
The temporary to which the reference is bound or the temporary that is
the complete object of a subobject to which the reference is bound
persists for the lifetime of the reference except:
1. A temporary bound to a reference member in a constructor’s ctor-initializer (12.6.2) persists until the constructor exits.
2. A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full-expression containing the call.
3. The lifetime of a temporary bound to the returned value in a function return statement (6.6.3) is not extended; the temporary is destroyed at the end of the full-expression in the return statement.
4. A temporary bound to a reference in a new-initializer (5.3.4) persists until the completion of the full-expression containing the new-initializer.
And there is an example of the last case in the standard:
struct S {
int mi;
const std::pair<int,int>& mp;
};
S a { 1,{2,3} }; // No problem.
S* p = new S{ 1, {2,3} }; // Creates dangling reference
To me, 2. and 3. make sense and are easy to agree with. But what's the reason behind 1. and 4.? The example looks just evil to me.
As with many things in C and C++, I think this boils down to what can be reasonably (and efficiently) implemented.
Temporaries are generally allocated on the stack, and the code that calls their constructors and destructors is emitted into the function itself. So if we expand your first example into what the compiler is actually doing, it would look something like:
struct S {
int mi;
const std::pair<int,int>& mp;
};
// Case 1:
std::pair<int,int> tmp{ 2, 3 };
S a { 1, tmp };
The compiler can easily extend the life of the tmp temporary long enough to keep "S" valid because we know that "S" will be destroyed before the end of the function.
But this doesn't work in the "new S" case:
struct S {
int mi;
const std::pair<int,int>& mp;
};
// Case 2:
std::pair<int,int> tmp{ 2, 3 };
// Whoops, this heap object will outlive the stack-allocated
// temporary!
S* p = new S{ 1, tmp };
To avoid the dangling reference, we would need to allocate the temporary on the heap instead of the stack, something like:
// Case 2a -- compiler tries to be clever?
// Note that the compiler won't actually do this.
std::pair<int,int>* tmp = new std::pair<int,int>{ 2, 3 };
S* p = new S{ 1, *tmp };
But then a corresponding delete p would need to free this heap memory! This is quite contrary to the behavior of references, and would break anything that uses normal reference semantics:
// No way to implement this that satisfies case 2a but doesn't
// break normal reference semantics.
delete p;
So the answer to your question is: the rules are defined that way because it is sort of the only practical solution given C++'s semantics around the stack, heap, and object lifetimes.
WARNING: @Potatoswatter notes below that this doesn't seem to be implemented consistently across C++ compilers, and therefore is non-portable at best for now. See his example for how Clang doesn't do what the standard seems to mandate here. He also says that the situation "may be more dire than that" -- I don't know exactly what this means, but it appears that in practice this case in C++ has some uncertainty surrounding it.
The main thrust is that lifetime extension only occurs when the lifetime can be determined easily and deterministically, and when this can be deduced at the line of code where the temporary is created.
When you call a function, the lifetime is extended to the end of the current full-expression (roughly, the current line). That is long enough, and easy to determine.
When you create an automatic storage reference "on the stack", the scope of that automatic storage reference can be deterministically determined. The temporary can be cleaned up at that point. (Basically, create an anonymous automatic storage variable to store the temporary)
In a new expression, the point of destruction cannot be statically determined at the point of creation. It is whenever the delete occurs. If we wanted the delete to (sometimes) destroy the temporary, then our reference "binary" implementation would have to be more complicated than a pointer, instead of less or equal. It would sometimes own the referred to data, and sometimes not. So that is a pointer, plus a bool. And in C++ you don't pay for what you don't use.
The same holds in a constructor, because you cannot know if the constructor was in a new or a stack allocation. So any lifetime extension cannot be statically understood at the line in question.
How long do you want the temporary object to last? It has to be allocated somewhere.
It can't be on the heap because it would leak; there is no applicable automatic memory management. It can't be static because there can be more than one. It must be on the stack. Then it either lasts until the end of the expression or the end of the function.
Other temporaries in the expression, perhaps bound to function call parameters, are destroyed at the end of the expression, and persisting until the end of the function or "{}" scope would be an exception to the general rules. So by deduction and extrapolation of the other cases, the full-expression is the most reasonable lifetime.
I'm not sure why you say this is no problem:
S a { 1,{2,3} }; // No problem.
The dangling reference is the same whether or not you use new.
Instrumenting your program and running it in Clang produces these results:
#include <iostream>
struct noisy {
int n;
~noisy() { std::cout << "destroy " << n << "\n"; }
};
struct s {
noisy const & r;
};
int main() {
std::cout << "create 1 on stack\n";
s a {noisy{ 1 }}; // Temporary created and destroyed.
std::cout << "create 2 on heap\n";
s* p = new s{noisy{ 2 }}; // Creates dangling reference
}
 
create 1 on stack
destroy 1
create 2 on heap
destroy 2
The object bound to the class member reference does not have an extended lifetime.
Actually I'm sure this is the subject of a known defect in the standard, but I don't have time to delve in right now…

Rvalue references under the hood

Consider the foo function
void foo(X x);
with X a matrix class, and the function
X foobar();
Suppose that I run
foo(foobar());
What happens with temporary objects in this case step by step? My understanding is that
foobar returns a temporary object, say Xtemp1;
foo copies Xtemp1 to a temporary object of its own, say Xtemp2, and then destroys Xtemp1;
foo performs calculations on Xtemp2.
On the other side, if I overload foo as
void foo(X& x);
void foo(X&& x);
then the picture will be different and, in particular,
foobar returns the temporary Xtemp1;
foo does not create a new temporary, but acts directly on Xtemp1 through its reference.
Is this picture correct or, if not, could someone point out and fix my mistakes? Thank you very much.
foo(foobar());
1. foobar's return value is a value, so it's a temporary value.
2. [1] This function will move/copy the return value into local storage for a temporary.
3. [1] This function will move/copy the value in the local storage into the parameter for the call to foo.
4. The call to foo executes.
5. [2] As foo returns, its parameter is destructed.
6. [2] When foo returns, the value in local storage is destructed.
7. foobar's return value is destructed.
[1] These moves/copies may be elided by the compiler. This means that these copies/moves don't have to happen (and any compiler worth using will elide them).
[2] If the moves/copies above are elided, then the destruction of their respective variables is naturally unnecessary, since they don't exist.
On the other side, if I overload foo as
1. foobar's return value is a value, so it's a temporary value.
2. [1] This function will move/copy the return value into local storage for a temporary.
3. Overload resolution selects foo(X &&) to call.
4. An rvalue-reference parameter variable is created and initialized with the local storage value.
5. The call to foo executes.
6. As foo returns, its reference parameter is destructed (not the value it references).
7. [2] When foo returns, the value in local storage is destructed.
8. foobar's return value is destructed.
Note the key differences here. Steps 4 and 6 cannot be elided. Therefore, if X is a small type like int, then the function will have no choice but to create a fairly worthless reference to an integer. References are internally implemented as pointers, so it's not really possible for the compiler to just optimize that away as a register. The local storage must therefore be on the stack rather than a register.
So you will be guaranteed of fewer move/copies. But again, any decent compiler will elide them. So the question is, will X generally be too large to fit into a register?
Your understanding is almost correct. The only difference is that in step 2., the temporary Xtemp1 is not copied to an arbitrary-named temporary Xtemp2, but to the space of the formal parameter x (from declaration foo(X x)).
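Also, it's possible for copy elision to kick in, which could mean that the return value of foobar() is constructed directly in the space of foo's formal parameter x, so no copying would occur. Before C++17 this is allowed by the standard but not guaranteed; since C++17, initializing the parameter from a prvalue such as foobar() no longer involves a copy or move at all.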
Your take on the r-value reference case is correct.

When are default argument objects destroyed?

void foo(const Object & o = Object()) {
return;
}
In the function above, when is ~Object supposed to be called? When the function exits, or at the end of the block surrounding the call site?
The default argument will be destroyed at the end of the complete expression that contains the function call.
To elaborate a bit on what David said, the standard says in section 12.2 [class.temporary]:
There are two contexts in which temporaries are destroyed at a
different point than the end of the full-expression. [...] The second
context is when a reference is bound to a temporary. The temporary to
which the reference is bound or the temporary that is the complete
object of a subobject to which the reference is bound persists for the
lifetime of the reference except:
...
A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full-expression
containing the call.
...
So they are neither destroyed when the function exits nor when the block containing the call ends, but at the end of the complete statement that contains the function call (simply said, at the first semicolon after the function call, in the calling context).
EDIT: So say we got:
int foo(const Object & o = Object());
some_stuff();
std::cout << (foo() + 7);
other_stuff();
This should be roughly equivalent to the following (mind the conceptual scope block):
some_stuff();
{
Object o; // create temporary
int i = foo(o); // and use it
int j = i + 7; // do other things
std::cout << j; // while o still alive
} // finally destroy o
other_stuff();
EDIT: As pointed out by Michael in his comment, this "statement/semicolon"-analogy I gave is rather a simplification of the term "full-expression" and there are cases where it is a bit different, like his example:
if(foo()) bar();
Which would destroy the temporary before bar is called and thus be different from the expression statement:
foo() ? bar() : 0;
But nevertheless, the "semicolon"-analogy is often a good fit, even if a full-expression is not necessarily the same as a statement (which can consist of multiple full-expressions).
I don't think this code should compile. You can't bind a reference to a temporary unless it's const. And if it was const the temporary should be kept alive until the end of the function expression. Just the same as a local variable defined within it.

When do temporary parameter values go out of scope? [duplicate]

#include <cstring>
#include <string>
int LegacyFunction(const char *s) {
// do something with s, like print it to standard output
// this function does NOT retain any pointer to s after it returns.
return static_cast<int>(strlen(s));
}
std::string ModernFunction() {
// do something that returns a string
return "Hello";
}
LegacyFunction(ModernFunction().c_str());
The above example could easily be rewritten to use smart pointers instead of strings; I've encountered both of these situations many times. Anyway, the above example will construct an STL string in ModernFunction, return it, then get a pointer to a C-style string inside of the string object, and then pass that pointer to the legacy function.
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction? (Remember that the string object is managing the memory that c_str() return value points to...)
If the above code is not safe, why is it not safe, and is there a better, equally concise way to write it than adding a temporary variable when making the function calls? If it's safe, why?
LegacyFunction(ModernFunction().c_str());
The temporary string is destroyed after the full-expression has been evaluated (i.e. after LegacyFunction returns).
n3337 12.2/3
Temporary objects are destroyed as the last step
in evaluating the full-expression (1.9) that (lexically) contains the point where they were created.
n3337 1.9/10
A full-expression is an expression that is not a subexpression of another expression. If a language construct
is defined to produce an implicit call of a function, a use of the language construct is considered to be an
expression for the purposes of this definition. A call to a destructor generated at the end of the lifetime of
an object other than a temporary object is an implicit full-expression. Conversions applied to the result of
an expression in order to satisfy the requirements of the language construct in which the expression appears
are also considered to be part of the full-expression.
[ Example:
struct S {
S(int i): I(i) { }
int& v() { return I; }
private:
int I;
};
S s1(1); // full-expression is call of S::S(int)
S s2 = 2; // full-expression is call of S::S(int)
void f() {
if (S(3).v()) // full-expression includes lvalue-to-rvalue and
// int to bool conversions, performed before
// temporary is deleted at end of full-expression
{ }
}
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Strictly speaking, it's never in scope. Scope is a property of a name, not an object. It just so happens that automatic variables have a very close association between scope and lifetime. Objects that aren't automatic variables are different.
Temporary objects are destroyed at the end of the full-expression in which they appear, with a couple of exceptions that aren't relevant here. Anyway the special cases extend the lifetime of the temporary, they don't reduce it.
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction?
No, because the full-expression is LegacyFunction(ModernFunction().c_str()) (excluding the semi-colon: feel that pedantry), so the temporary that is the return value of ModernFunction is not destroyed until LegacyFunction has returned.
If it's safe, why?
Because the lifetime of the temporary is long enough.
In general with c_str, you have to worry about two things. First, the pointer it returns becomes invalid if the string is destroyed (which is what you're asking). Second, the pointer it returns becomes invalid if the string is modified. You haven't worried about that here, but it's OK, you don't need to, because nothing modifies the string either.