What is the order of destruction of function arguments? - c++

If some function f with parameters p_1, ..., p_n of types T_1, ..., T_n respectively is called with arguments a_1, ..., a_n and its body throws an exception, finishes or returns, in what order are the arguments destroyed and why? Please provide a reference to the standard, if possible.
EDIT: I actually wanted to ask about function "parameters", but as T.C. and Columbo managed to clear my confusion, I'm leaving this question be about the arguments and asked a new separate question about the parameters. See the comments on this question for the distinction.

I did not manage to find the answer in the standard, but I was able to test this on the 3 most popular C++ compliant compilers. The answer of R Sahu pretty much explains that the order is left to the implementation.
§5.2.2/8: The evaluations of the postfix expression and of the arguments are all unsequenced relative to one
another. All side effects of argument evaluations are sequenced before the function is entered.
Visual Studio C++ Compiler (Windows) and gcc (Debian)
Arguments are constructed in the reverse of their declaration order and destroyed in the reverse of their construction order (thus destroyed in order of declaration):
2
1
-1
-2
Clang (FreeBSD)
Arguments are constructed in order of their declaration and destroyed in reversed order:
1
2
-2
-1
All compilers were instructed to treat the source code as C++11 and I used the following snippet to demonstrate the situation:
#include <iostream>
struct A
{
    A(int) { std::cout << "1" << std::endl; }
    ~A() { std::cout << "-1" << std::endl; }
};
struct B
{
    B(double) { std::cout << "2" << std::endl; }
    ~B() { std::cout << "-2" << std::endl; }
};
void f(A, B) { }
int main()
{
    f(4, 5.);
}

In §5.2.2[4] N3337 is quite explicit on what happens (online draft):
During the initialization of a parameter, an implementation may avoid the construction of extra temporaries by combining the conversions on the associated argument and/or the construction of temporaries with the
initialization of the parameter (see 12.2). The lifetime of a parameter ends when the function in which it is defined returns.
So for example in
f(g(h()));
the return value of the call h() is a temporary that will be destroyed at the end of the full-expression. However, the compiler is allowed to avoid this temporary and use the value to initialize the parameter of g() directly. In that case the object will be destroyed once g() returns (i.e. BEFORE calling f()).
If I understood the standard correctly, however, the value returned from h() is not permitted to survive to the end of the full-expression unless a copy is made (the parameter), and that copy is destroyed once g() returns.
The two scenarios are:
h return value is used to directly initialize g parameter. This object is destroyed when g returns and before calling f.
h return value is a temporary. A copy is made to initialize g parameter and it is destroyed when g returns. The original temporary is destroyed at the end of the full expression instead.
I don't know if implementations are following the rules on this.

The order in which the arguments to a function are evaluated is not specified by the standard. From the C++11 Standard (online draft):
5.2.2 Function call
8 [ Note: The evaluations of the postfix expression and of the argument expressions are all unsequenced relative to one another. All side effects of argument expression evaluations are sequenced before the function
is entered (see 1.9). —end note ]
Hence, it is entirely up to an implementation to decide in what order to evaluate the arguments to a function. This, in turn, implies that the order of construction of the arguments is also implementation dependent.
A sensible implementation would destroy the objects in the reverse order of their construction.
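A minimal sketch of how this could be checked without relying on any particular order. The names events, Tracked, callee and record_call are hypothetical and not taken from the question. The only things the code relies on are that both parameters are initialized before the body runs and destroyed after it finishes; the relative order inside each group is left to the compiler.

```cpp
#include <cassert>
#include <vector>

// Hypothetical event log: +id on construction, -id on destruction, 0 for the body.
std::vector<int>& events() {
    static std::vector<int> log;
    return log;
}

struct Tracked {
    int id;
    Tracked(int i) : id(i) { events().push_back(id); }  // non-explicit on purpose
    Tracked(const Tracked&) = delete;                    // rules out hidden copies
    ~Tracked() { events().push_back(-id); }
};

void callee(Tracked, Tracked) { events().push_back(0); }

// Performs one call and returns a copy of the recorded events.
std::vector<int> record_call() {
    events().clear();
    callee({1}, {2});  // copy-list-initialization builds each parameter in place
    // Two constructions, the body marker, two destructions.
    assert(events().size() == 5);
    return events();
}
```

Printing the returned vector on different compilers should show which of the construction and destruction orders each one picks, while the position of the 0 marker in the middle stays the same.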

Related

Order of evaluation with function pointers in C++17

Consider the following program (and its alternative in the comment) in C++17:
#include <iostream>
void a(int) {
    std::cout << "a\n";
}
void b(int) {
    std::cout << "b\n";
}
int main() {
    using T = void(*)(int);
    T f = a;
    (T(f))((f=b,0)); // alternatively: f((f=b,0))
}
With -O2 option, Clang 9.0.0 prints a and GCC 9.2 prints b. Both warn me about unsequenced modification and access to f. See godbolt.org.
My expectation was that this program has well-defined behavior and will print a, because C++17 guarantees that the left-hand expression of the call (T(f)) is sequenced before any evaluation of the arguments. Because the result of the expression (T(f)) is a new pointer to a, the later modification of f should have no impact on the call at all. Am I wrong?
Both compilers give the same output if I use f((f=b,0)); instead of (T(f))((f=b,0));. Here I am slightly unsure about the undefined behavior aspect. Would this be undefined behavior because f still refers to the declared function pointer after evaluation, which will have been modified by the evaluation of the arguments and if so, why exactly would that cause undefined behavior rather than calling b?
I have asked a related question with regards to order of evaluation of non-static member function calls in C++17 here. I am aware that writing code like this is dangerous and unnecessary, but I want to understand the details of the C++ standard better.
Edit: GCC trunk now also prints a after the bug filed by Barry (see his answer below) has been fixed. Both Clang and GCC trunk do still show false-positive warnings with -Wall, though.
The C++17 rule is, from [expr.call]/8:
The postfix-expression is sequenced before each expression in the expression-list and any default argument. The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter.
In (T(f))((f=b,0));, (T(f)) is sequenced before the initialization of the parameter from (f=b, 0). All of this is well-defined and the program should print "a". That is, it should behave just like:
auto __tmp = T(f);
__tmp((f=b, 0));
The same is true even if we change your program such that this were valid:
T{f}(f=b, 0); // two parameters now, instead of one
The f=b and 0 expressions are indeterminately sequenced with each other, but T{f} is still sequenced before both, so this would still invoke a.
Filed 91974.
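For code that has to behave the same way before C++17 or on a compiler with the bug mentioned above, the snapshot can be taken in its own statement. This is only a sketch of that idea: last_called, call_then_rebind and the local callee are hypothetical names, and a and b record a character instead of printing so that the result can be inspected.

```cpp
#include <cassert>

// Hypothetical stand-ins for a and b: each records which one ran.
char last_called = '?';
void a(int) { last_called = 'a'; }
void b(int) { last_called = 'b'; }

using T = void (*)(int);

// Copies the callee first, then evaluates an argument that rebinds f.
// Returns the function that was actually invoked.
char call_then_rebind() {
    T f = a;
    T callee = f;        // snapshot taken in an earlier full-expression
    callee((f = b, 0));  // the argument modifies f, not callee
    assert(f == b);      // f was rebound by the argument expression
    return last_called;
}
```

Because the copy into callee is complete before the call statement starts, the result does not depend on how the postfix-expression and the arguments are sequenced relative to each other.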

Intent of [basic.execution] p5 sentence 2

[basic.execution] p5 sentence 2 states:
If a language construct is defined to produce an implicit call of a function, a use of the language construct is considered to be an expression for the purposes of this definition.
However, the intent of this sentence is not immediately clear. My best guess is that it is here in order to ensure proper sequencing and to make sure that temporaries are not destroyed before any implicit function calls complete, however, I cannot see a situation where this would apply and change the meaning of some code. For example:
struct S { };
const S& f() { return {}; }
Here, the return statement would be considered an expression, and the operand {} would also be considered an expression, and therefore a subexpression of the return statement. Is this the intent of the sentence? Where else would this apply and have a meaningful effect?
The key phrase is "for the purposes of this definition", i.e. the definition of full-expression.
It's just saying that the rules of a full-expression (e.g. temporary lifetime) will also apply for the entirety of your return statement, even though it's not otherwise enumerated in the list of things that constitute a full-expression.
And that's because it involves an implicit function call (a ctor call); if it didn't, then the point would be moot.
It doesn't "change the meaning" of any code.
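A small sketch of the same idea, using a declaration with an initializer rather than a return statement (that substitution is an assumption made here to keep the example free of dangling references). The names trace, Temp, S and init_trace are hypothetical. No call expression is written for the constructor of S, yet the temporary Temp lives until the whole initialization is complete and is gone before the next statement.

```cpp
#include <cassert>
#include <string>

// Hypothetical trace buffer shared by the types below.
std::string& trace() {
    static std::string s;
    return s;
}

struct Temp {
    Temp()  { trace() += "T+ "; }
    ~Temp() { trace() += "T- "; }
};

struct S {
    S(const Temp&) { trace() += "S+ "; }  // converting constructor, called implicitly
};

// Records what happens around an initialization that implicitly calls S(const Temp&).
std::string init_trace() {
    trace().clear();
    S s = Temp{};        // the whole initialization is treated as one full-expression
    trace() += "next";   // runs only after the temporary has been destroyed
    (void)s;
    return trace();
}
```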

Weird behavior when passing argument by value

Stumbled upon few articles claiming that passing by value could improve performance if function is gonna make a copy anyway.
I never really thought about how pass-by-value might be implemented under the hood. Exactly what happens on stack when you do smth like this: F v = f(g(h()))?
After pondering a bit I came to conclusion that I'd implement it in such way that value returned by g() is created in locations where f() expects it to be. So, basically, no copy/move constructor calls -- f() will simply take ownership of object returned by g() and destroy it when execution leaves f()'s scope. Same for g() -- it'll take ownership of object returned by h() and destroy it on return.
Alas, compilers seem to disagree. Here is the test code:
#include <cstdio>
using std::printf;
struct H
{
    H() { printf("H ctor\n"); }
    ~H() { printf("H dtor\n"); }
    H(H const&) {}
    // H(H&&) {}
    // H(H const&) = default;
    // H(H&&) = default;
};
H h() { return H(); }
struct G
{
    G() { printf("G ctor\n"); }
    ~G() { printf("G dtor\n"); }
    G(G const&) {}
    // G(G&&) {}
    // G(G const&) = default;
    // G(G&&) = default;
};
G g(H) { return G(); }
struct F
{
    F() { printf("F ctor\n"); }
    ~F() { printf("F dtor\n"); }
};
F f(G) { return F(); }
int main()
{
    F v = f(g(h()));
    return 0;
}
On MSVC 2015 its output is exactly what I expected:
H ctor
G ctor
H dtor
F ctor
G dtor
F dtor
But if you comment out copy constructors it looks like this:
H ctor
G ctor
H dtor
F ctor
G dtor
G dtor
H dtor
F dtor
I suspect that removing user-provided copy constructor causes compiler to generate move-constructor, which in turn causes unnecessary 'move' which doesn't go away no matter how big objects in question are (try adding 1MB array as member variable). I.e. compiler prefers 'move' so much that it chooses it over not doing anything at all.
It seems like a bug in MSVC, but I would really like someone to explain (and/or justify) what is going on here. This is question #1.
Now, if you try GCC 5.4.0 output simply doesn't make any sense:
H ctor
G ctor
F ctor
G dtor
H dtor
F dtor
H has to be destroyed before F is created! H is local to g()'s scope! Note that playing with constructors has zero effect on GCC here.
Same as with MSVC -- looks like a bug to me, but can someone explain/justify what is going on here? That is question #2.
It is really silly that after many years of working with C++ professionally I run into issues like this... After almost 4 decades compilers still can't agree on how to pass values around?
For passing a parameter by value, the parameter is a local variable to the function, and it's initialized from the corresponding argument to the function call.
When returning by value, there is a value called the return value. This is initialized by the "argument" to the return expression. Its lifetime is until the end of the full-expression containing the function call.
Also there is an optimization called copy elision which can apply in a few cases. Two of those cases apply to returning by value:
If the return value is initialized by another object of the same type, then the same memory location can be used for both objects, and the copy/move step skipped (there are some conditions on exactly when this is allowed or disallowed)
If the calling code uses the return value to initialize an object of the same type, then the same memory location can be used for both the return value and the destination object, and the copy/move step is skipped. (Here the "object of the same type" includes function parameters).
It is possible for both of these to apply simultaneously. Also, as of C++14, copy elision is optional for the compiler.
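One way to see how many of these copy/move steps a given compiler actually performs is to count them. The following is a rough sketch with hypothetical names (Counts, counts, Probe, make, measure); it assumes nothing about the result beyond the fact that at most two moves can occur, one per elision opportunity.

```cpp
#include <cassert>

// Hypothetical counters for the special member functions of Probe.
struct Counts { int ctor = 0, copy = 0, move = 0, dtor = 0; };
Counts& counts() {
    static Counts c;
    return c;
}

struct Probe {
    Probe() { ++counts().ctor; }
    Probe(const Probe&) { ++counts().copy; }
    Probe(Probe&&) noexcept { ++counts().move; }
    ~Probe() { ++counts().dtor; }
};

Probe make() { return Probe(); }  // first opportunity: temporary -> return value

// Runs the experiment and returns the counters once every object is gone.
Counts measure() {
    counts() = Counts{};
    {
        Probe v = make();         // second opportunity: return value -> v
        (void)v;
    }
    // Every object that was constructed has been destroyed again.
    assert(counts().dtor == counts().ctor + counts().copy + counts().move);
    return counts();
}
```

A move count of 0 would mean both elisions were applied, while 1 or 2 would mean the compiler skipped one or both of them.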
In your call f(g(h())), here is the list of objects (without copy elision):
1. H default-constructed by return H();
2. H, the return value of h(), is copy-constructed from (step 1).
3. ~H (step 1)
4. H, the parameter of g, is copy-constructed from (step 2).
5. G default-constructed by return G();
6. G, the return value of g(), is copy-constructed from (step 5).
7. ~G (step 5)
8. ~H (step 4) (see below)
9. G, the parameter of f, is copy-constructed from (step 6).
10. F default-constructed by return F();
11. F, the return value of f(), is move-constructed from (step 10).
12. ~F (step 10)
13. ~G (step 9) (see below)
14. F v is move-constructed from (step 11).
15. ~F, ~G, ~H (steps 2, 6, 11) are destroyed - I think there is no required ordering of the three
16. ~F (step 14)
For copy elision, steps 1+2+3 can be combined into "Return value of h() is default-constructed". Similarly for 5+6+7 and 10+11+12. However it is also possible to combine 2+4 on their own into "Parameter of g is copy-constructed from 1", and also possible for both of these elisions to apply simultaneously, giving "Parameter of g is default-constructed".
Because copy elision is optional you may see different results from different compilers. It doesn't mean there is a compiler bug. You'll be glad to hear that in C++17 some copy elision scenarios are being made mandatory.
Your output in the second MSVC case would be more instructive if you included output text for the move-constructor. I would guess that in the first MSVC case it performed both simultaneous elisions that I mentioned above, whereas the second case omits the "2+4" and "6+9" elisions.
below: gcc and clang delay destruction of function parameters until the end of the full-expression that enclosed the function call. This is why your gcc output differs from MSVC.
As of the C++17 drafting process, it is implementation-defined whether these destructions occur where I had them in my list, or at the end of the full-expression. It could be argued that it was insufficiently specified in the earlier published standards. See here for further discussion.
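Which of the two choices a given compiler makes can be probed from inside a single full-expression. This is a sketch only; destroyed, Param, take and destroyed_at_function_return are hypothetical names, and the return value is expected to differ between implementations.

```cpp
#include <cassert>

int destroyed = 0;  // hypothetical counter bumped by the destructor

struct Param {
    Param() {}
    Param(const Param&) = delete;  // rules out hidden copies
    ~Param() { ++destroyed; }
};

void take(Param) {}

// Returns true if the parameter was already destroyed when control came back
// from take(), false if it survived until the end of the full-expression.
bool destroyed_at_function_return() {
    destroyed = 0;
    int seen = -1;
    (take({}), seen = destroyed);  // the comma keeps both in one full-expression
    assert(destroyed == 1);        // either way it is gone after the statement
    return seen == 1;
}
```

The comma operator sequences the call before the read of destroyed, so the value captured in seen reflects the state right after take() returns but before the enclosing full-expression ends.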
This behavior is because of an optimization technique called copy elision. In a nutshell, all of the outputs you mentioned are valid! Yep! That is because this technique is one of the few that are allowed to change the observable behavior of the program. More information can be found at What are copy elision and return value optimization?
Both M.M's and Ahmad's answers were sending me in right direction, but they both weren't fully correct. So I opted to write down a proper answer below...
Function call and return in C++ have the following semantics:
value passed as function argument gets copied into function scope and function gets invoked
return value gets copied into caller's scope, gets destroyed (when we reach end of return full expression) and execution leaves function scope
When it comes to implementing this on IA-32-like architecture it becomes painfully obvious that these copies are not required -- it is trivial to allocate uninitialized space on stack (for return value) and define function calling conventions in such way that it knows where to construct return value.
Same for argument passing -- if we pass rvalue as function argument, compiler can direct creation of that rvalue in such way that it will be created right where (subsequently called) function expects it to be.
I imagine this is main reason why copy elision was introduced to standard (and is made mandatory in C++17).
I am familiar with copy elision in general and read this page before. Unfortunately I missed two things:
the fact that this also applies to initialization of function arguments with rvalue (C++11 12.8.p32):
when a temporary class object that has not been bound to a reference
(12.2) would be copied/moved to a class object with the same
cv-unqualified type, the copy/move operation can be omitted by
constructing the temporary object directly into the target of the
omitted copy/move
when copy elision kicks in it affects object lifetime in a very peculiar way:
When certain criteria are met, an implementation is allowed to omit
the copy/move construction of a class object, even if the copy/move
constructor and/or destructor for the object have side effects. In
such cases, the implementation treats the source and target of the
omitted copy/move operation as simply two different ways of referring
to the same object, and the destruction of that object occurs at the
later of the times when the two objects would have been destroyed
without the optimization. This elision of copy/move operations, called
copy elision, is permitted in the following circumstances (which may
be combined to eliminate multiple copies)
This explains GCC output -- we pass some rvalue into a function, copy elision kicks in and we end up with one object being referred via two different ways and lifetime = longest of all of them (which is a lifetime of temporary in our F v = ...; expression). So, basically, GCC output is completely standard compliant.
Now, this also means that MSVC is not standard compliant! It successfully applied both copy elisions, but resulting object lifetime is too short.
Second MSVC output conforms to the standard -- it applied RVO, but decided to not apply copy elision for function parameter. I still think it is a bug in MSVC, even though code is ok from standard point of view.
Thank you both M.M and Ahmad for pushing me in right direction.
Now little rant about lifetime rule enforced by standard -- I think it was meant to be used only with RVO.
Alas it doesn't make a lot of sense when applied to eliding copy of function argument. In fact, combined with C++17 mandatory copy elision rule it permits crazy code like this:
T bar();
T* foo(T a) { return &a; }
auto v = foo(bar())->my_method();
this rule forces T to be destroyed only at the end of full expression. This code will become correct in C++17. It is ugly and should not be allowed in my opinion. Plus, you'll end up destroying these objects on caller side (instead of inside of a function) -- needlessly increasing code size and complicating process of figuring out if given function is a nothrow or not.
In other words, I personally prefer MSVC output #1 (as most 'natural'). Both MSVC output #2 and GCC output should be banned. I wonder if this idea can be sold to C++ standardization committee...
edit: apparently in C++17 lifetime of temporary will become 'unspecified' thus allowing MSVC's behavior. Yet another unnecessary dark corner in the language. They should have simply mandated MSVC's behavior.

Does standard C++11 guarantee that temporary object passed to a function will have been destroyed after the end of the function?

It is known that the C++11 standard guarantees that a temporary object passed to a function will have been created before the function call: Does standard C++11 guarantee that temporary object passed to a function will have been created before function call?
But does the C++11 standard guarantee that a temporary object passed to a function will be destroyed after the end of the function (not before)?
Working Draft, Standard for Programming Language C++ 2016-07-12: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4606.pdf
§ 12.2 Temporary objects
§ 12.2 / 5
There are three contexts in which temporaries are destroyed at a
different point than the end of the full expression. The first context
is when a default constructor is called to initialize an element of an
array with no corresponding initializer (8.6). The second context is
when a copy constructor is called to copy an element of an array while
the entire array is copied (5.1.5, 12.8). In either case, if the
constructor has one or more default arguments, the destruction of
every temporary created in a default argument is sequenced before the
construction of the next array element, if any. The third context is
when a reference is bound to a temporary.
Also:
§ 1.9 / 10
A full-expression is an expression that is not a subexpression of
another expression. [ Note: in some contexts, such as unevaluated
operands, a syntactic subexpression is considered a full-expression
(Clause 5). — end note ] If a language construct is defined to produce
an implicit call of a function, a use of the language construct is
considered to be an expression for the purposes of this definition. A
call to a destructor generated at the end of the lifetime of an object
other than a temporary object is an implicit full-expression.
Conversions applied to the result of an expression in order to satisfy
the requirements of the language construct in which the expression
appears are also considered to be part of the full-expression.
Does this mean that the C++11 standard guarantees that a temporary object passed to a function will not be destroyed before the function ends, but exactly at the end of the full-expression?
http://ideone.com/GbEPaK
#include <iostream>
using namespace std;
struct T {
    T() { std::cout << "T created \n"; }
    int val = 0;
    ~T() { std::cout << "T destroyed \n"; }
};
void function(T t_obj, T &&t, int &&val) {
    std::cout << "func-start \n";
    std::cout << t_obj.val << ", " << t.val << ", " << val << std::endl;
    std::cout << "func-end \n";
}
int main() {
    function(T(), T(), T().val);
    return 0;
}
Output:
T created
T created
T created
func-start
0, 0, 0
func-end
T destroyed
T destroyed
T destroyed
Can we say that the T destroyed will always be after the func-end?
And this:
function(T(), T(), T().val);
Is always equal to this:
{
T tmp1; T tmp2; T tmp3;
function(tmp1, tmp2, tmp3.val);
}
Well, you already quoted all the text that tells us the temporary's lifetime ends at the end of the full-expression. So, yes, "T destroyed" will always come last.
If the destruction had no observable side-effects then, per the as-if rule, it could actually happen at any time afterwards… but that's moot because, well, it wouldn't be observable.
However, the final two snippets you presented are not generally equivalent, because you fixed the order of construction/initialisation in a way that it wasn't before. Function arguments have an unspecified evaluation order. Again, though, for this particular T the difference is not observable.
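The ordering guarantee for temporaries bound to reference parameters can be turned into a check. A minimal sketch, assuming a log of strings instead of console output; lines, Tmp, callee and record are hypothetical names, and the by-value parameter of the original example is left out so that only true temporaries are involved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of events in the order they happen.
std::vector<std::string>& lines() {
    static std::vector<std::string> v;
    return v;
}

struct Tmp {
    int val = 0;
    Tmp()  { lines().push_back("created"); }
    ~Tmp() { lines().push_back("destroyed"); }
};

void callee(Tmp&& t, int&& val) {
    lines().push_back("func-start");
    (void)t;
    (void)val;
    lines().push_back("func-end");
}

// Calls callee with two temporaries and returns a copy of the log.
std::vector<std::string> record() {
    lines().clear();
    callee(Tmp(), Tmp().val);       // both temporaries live until the end of this statement
    lines().push_back("after-call");
    return lines();
}
```

Since both temporaries produce the same log text, the unspecified order in which the two arguments are evaluated does not show up in the result.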

When an array is created by a subexpression, what happens with the temporaries therein?

I was reading these two paragraphs of the FDIS (12.2p{4,5}):
There are two contexts in which temporaries are destroyed at a different point than the end of the full-expression. The first context is when a default constructor is called to initialize an element of an array. If the constructor has one or more default arguments, the destruction of every temporary created in a default argument is sequenced before the construction of the next array element, if any.
and
The second context is when a reference is bound to a temporary. The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except:
[...]
A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full-expression containing the call.
These two seem to contradict each other for the following case
#include <iostream>
struct A {
    A() { std::cout << "C" << std::endl; }
    ~A() { std::cout << "D" << std::endl; }
};
struct B {
    B(A const& a = A()) { }
};
typedef B array[2];
int main() {
    array{};
}
Will this output CDCD as required by the first context, or will this output CCDD as required by the second context? GCC seems to follow the second context description and outputs CCDD. Have I overlooked something important?
EDIT: I don't think it needs C++0x. This new-expression is affected too by my question:
new array(); /* CDCD or CCDD ?? */
In this case though, GCC follows the first context, and outputs CDCD.
I don't think there's a contradiction.
5.2.2 clearly says what a function call is. A function call is a postfix expression followed by parentheses containing a possibly empty, comma-separated list of expressions which constitute the arguments to the function.
There doesn't seem to be a function call to B::B(A const&) anywhere in your program, so I don't see how the second passage applies.
EDIT the above is probably incorrect, given 1.9p10 etc.
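To compare compilers on this point without reading console output, the letters can be collected in a string. This sketch uses a plain default-initialized local array, which is an assumption made here and differs from both array{} and new array() in the question; trace and array_trace are hypothetical names. The function should return either CDCD or CCDD, depending on which of the two readings the compiler follows.

```cpp
#include <cassert>
#include <string>

// Hypothetical trace buffer: 'C' for each A constructed, 'D' for each A destroyed.
std::string& trace() {
    static std::string s;
    return s;
}

struct A {
    A()  { trace() += 'C'; }
    ~A() { trace() += 'D'; }
};

struct B {
    B(A const& a = A()) { (void)a; }  // the default argument creates a temporary A
};

// Default-initializes a two-element array and returns the recorded letters.
std::string array_trace() {
    trace().clear();
    {
        B elements[2];  // B's default constructor runs once per element
        (void)elements;
    }
    return trace();
}
```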