What happens during initialization of a class? - c++

Here is the code which confuses me:
#include <iostream>
using namespace std;
class B {
public:
B() {
cout << "constructor\n";
}
B(const B& rhs) {
cout << "copy ctor\n";
}
B & operator=(const B & rhs) {
cout << "assignment\n";
return *this; // a non-void function must return a value
}
~B() {
cout << "destructed\n";
}
B(int i) : data(i) {
cout << "constructed by parameter " << data << endl;
}
private:
int data;
};
B play(B b)
{
return b;
}
int main(int argc, char *argv[])
{
#if 1
B t1;
t1 = play(5);
#endif
#if 0
B t1 = play(5);
#endif
return 0;
}
Environment is g++ 4.6.0 on Fedora 15.
The first code fragment output is as follows:
constructor
constructed by parameter 5
copy ctor
assignment
destructed
destructed
destructed
And the second code fragment's output is:
constructed by parameter 5
copy ctor
destructed
destructed
Why are three destructors called in the first example, while only two are called in the second?

First Case:
B t1;
t1 = play(5);
1. Creates an object t1 by calling the default constructor of B.
2. In order to call play(), a temporary object of B is created by using B(int i): 5 is passed as the argument, an object of B is created, and play() is called.
3. return b; inside play() causes the copy constructor to be called to return a copy of the object.
4. t1 = calls the assignment operator to assign the returned copy to t1.
5. The first destructor destructs the temporary object created in #3.
6. The second destructor destructs the temporary object created in #2.
7. The third destructor destructs the object t1.
Second case:
B t1 = play(5);
1. A temporary object of class B is created by calling the parameterized constructor of B, which takes an int as a parameter.
2. This temporary object is used to call the copy constructor of class B.
3. The first destructor destructs the temporary created in #1.
4. The second destructor destructs the object t1.
There is one fewer destructor call in the second case because there the compiler uses return value optimization and elides the creation of an additional temporary object when returning from play(). Instead, the returned object is created in the location to which the temporary would have been assigned.
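The two cases can be compared by counting calls instead of printing them. This is only a sketch: Counts, g_counts, value(), assign_case() and init_case() are names made up for illustration, and B here initializes data in every constructor, unlike the class in the question.

```cpp
#include <cassert>

// Hypothetical tally of special member function calls.
struct Counts {
    int ctors = 0;    // every constructor call, of any kind
    int copies = 0;   // copy-constructor calls only
    int assigns = 0;  // copy-assignment calls
    int dtors = 0;    // destructor calls
};

Counts g_counts;  // global tally, reset at the start of each scenario

class B {
public:
    B() : data(0) { ++g_counts.ctors; }
    B(int i) : data(i) { ++g_counts.ctors; }
    B(const B& rhs) : data(rhs.data) { ++g_counts.ctors; ++g_counts.copies; }
    B& operator=(const B& rhs) { data = rhs.data; ++g_counts.assigns; return *this; }
    ~B() { ++g_counts.dtors; }
    int value() const { return data; }
private:
    int data;
};

B play(B b) { return b; }

// First case: t1 already exists, so the result of play(5) is assigned to it.
Counts assign_case() {
    g_counts = Counts();
    {
        B t1;
        t1 = play(5);
        assert(t1.value() == 5);
    }  // t1 is destroyed here, before the tally is read
    return g_counts;
}

// Second case: t1 is initialized from the result of play(5).
Counts init_case() {
    g_counts = Counts();
    {
        B t1 = play(5);
        assert(t1.value() == 5);
    }
    return g_counts;
}
```

With a typical g++ build and default flags I would expect assign_case() to report three destructor calls and init_case() two, matching the output in the question. Before C++17 the exact numbers depend on how much elision the compiler performs, so treat them as typical rather than guaranteed.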

First, examine the sub-expression play(5). This expression is the same in both cases.
In a function call expression each parameter is copy-initialized from its argument (ISO/IEC 14882:2003 5.2.2/4). In this case this involves converting 5 to a B by using the non-explicit constructor taking an int to create a temporary B and then using the copy-constructor to initialize the parameter b. However, the implementation is permitted to eliminate the temporary by directly initializing b using the converting constructor from int under the rules specified in 12.8.
The type of play(5) is B and, as the result of a function returning a non-reference, it is an rvalue.
The return statement implicitly converts the return expression to the type of the return value (6.6.3) and then copy-initializes (8.5/12) the return object with the converted expression.
In this case the return expression is already of the correct type, so no conversion is required but the copy initialization is still required.
Aside on return value optimizations
The named return value optimization (NRVO) refers to the situation where the return statement is of the form return x; where x is an automatic object local to the function. When this occurs, the implementation is allowed to construct x in the location for the return value and eliminate the copy-initialization at the point of return.
Although it is not named as such in the standard, NRVO usually refers to the first situation described in 12.8/15.
This particular optimization is not possible in play because b is not an object local to the function body, it is the name of the parameter which has already been constructed by the time the function is entered.
The (unnamed) return value optimization (RVO) has even less agreement on what it refers to but is usually used to refer to the situation where the return expression is not a named object but an expression where the conversion to the return type and copy-initialization of the return object can be combined so that the return object is initialized straight from the result of the conversion eliminating one temporary object.
The RVO doesn't apply in play because b is already of type B so the copy-initialization is equivalent to direct-initialization and no temporary object is necessary.
In both cases play(5) requires the construction of a B using B(int) for the parameter and a copy-initialization of B to the return object. It may also use a second copy in the initialization of the parameter but many compilers eliminate this copy even when optimizations are not explicitly requested. Both (or all) of these objects are temporaries.
In the expression statement t1 = play(5); the copy assignment operator will be called to copy the value of the return value of play to t1 and the two temporaries (parameter and return value of play) will be destroyed. Naturally t1 must have been constructed prior to this statement and its destructor will be called at the end of its lifetime.
In the declaration statement B t1 = play(5);, logically t1 is initialized with the return value of play and exactly the same number of temporaries will be used as in the expression statement t1 = play(5);. However, this is the second of the situations covered in 12.8/15 where the implementation is allowed to eliminate the temporary used for the return value of play and instead allow the return object to alias t1. The play function operates in exactly the same way, but because the return object is just an alias for t1, its return statement effectively directly initializes t1 and there is no separate temporary object for the return value that needs to be destroyed.
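To see the difference between returning a local object and returning a parameter, one can count copy constructions. The following is a minimal sketch; Tracer, return_local, return_param and the two helper functions are hypothetical names, not anything from the question.

```cpp
#include <cassert>

// Hypothetical tracer type; only copy construction is counted.
// Declaring the copy constructor suppresses the implicit move constructor,
// so every transfer of a Tracer shows up as a copy.
struct Tracer {
    static int copies;
    Tracer() {}
    Tracer(const Tracer&) { ++copies; }
};
int Tracer::copies = 0;

// 'local' is an automatic object of the function body, so the implementation
// may construct it directly in the return object (NRVO).
Tracer return_local() {
    Tracer local;
    return local;
}

// 'param' is constructed by the caller before the body runs, so the copy
// into the return object cannot be elided.
Tracer return_param(Tracer param) {
    return param;
}

int copies_for_local() {
    Tracer::copies = 0;
    Tracer t = return_local();
    (void)t;
    return Tracer::copies;
}

int copies_for_param() {
    Tracer::copies = 0;
    Tracer t = return_param(Tracer());
    (void)t;
    assert(Tracer::copies >= 1);  // the copy at 'return param;' is mandatory
    return Tracer::copies;
}
```

Compilers that perform NRVO will typically report zero copies for copies_for_local(), but that is permitted rather than required; the copy counted in copies_for_param() is the one that can never disappear.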

The first fragment constructs three objects:
B t1
B(5) <- from (int) constructor; this is temporary object for play function
return b; or B(b) <- copy ctor
This is my guess, although it looks inefficient.

Refer to what Als posted for a play-by-play of the first scenario.
I think (EDIT: wrongly; see below) the difference with the second case is that the compiler was smart enough to use the NRVO (named return value optimization) and elide the middle copy: Instead of creating a temporary copy on return (from play), the compiler used the actual "b" inside of the play function as the rvalue for t1's copy constructor.
Dave Abrahams has an article on copy elision, and here's Wikipedia on the return value optimization.
EDIT: Actually, Als added a play-by-play of the second scenario, too. :)
Further edits: Actually, I was incorrect above. The NRVO is not being used in either case, because the standard forbids eliding copies directly from function arguments (b in play) to the return value location of a function (at least without inlining), according to the accepted answer for this question.
Even if the NRVO were allowed, we can tell that it's not being used in the first case at least: If it were, the first case would not involve a copy constructor whatsoever. The copy constructor in the first case comes from the hidden copy from the named value b (in the play function) to the hidden return value location for play. The first case involves no explicit copy construction, so that is the only place where it can arise.
What's actually going on is this: NRVO is not occurring in either case, and a hidden copy is being created on return...but in the second case, the compiler was able to construct the hidden return copy directly at t1's location. So, the copy from b to the return value was not elided, but the copy from the return value to t1 was. However, the compiler had a harder time doing this optimization for the first case where t1 was already constructed (read: it didn't do it ;)). If t1 is already constructed at an address incompatible with the return value's location, the compiler isn't able to use t1's address directly for the hidden return value copy.

In your first example, you're calling three constructors:
The B() constructor when you declare B t1;, which is also a definition if B() is public. In other words, the compiler will try to initialize any declared objects to some basic valid state, and treats B() as the method for transforming a B-sized block of memory into said basic valid state, so that methods called on t1 won't break the program.
The B(int) constructor, used as an implicit conversion; play() takes a B but was given an int, but B(int) is considered a method for converting int to B.
The B(const B& rhs) copy constructor, which will copy the value of the B returned by play() into a temporary value so that it will have scope long enough to survive being used in an assignment operator.
Each of the above constructors must be matched with a destructor when the scope exits.
In your second example, however, you are explicitly initializing the value of t1 with the result of play(), so the compiler doesn't need to waste cycles providing a basic state to t1 before it assigns a copy of play()'s result to the new variable. So you only call
B(int) to get a useful argument for play(B)
B(const B& rhs) so that t1 will be initialized with (whatever your copy constructor decides is) a proper copy of play()'s results.
You don't see a third constructor in this case because the compiler is "eliding" the returned value of play() into t1; that is, it knew that t1 did not exist in a valid state before play() returns, so it's just writing the return value directly into the memory set aside for t1.

Related

Temporary object argument lifetime in a function

I've read several posts about the lifetime of temporary objects. In short, I've learned that:
the temporary is destroyed after the end of the full-expression containing it.
But this code doesn't do what I expected:
#include <memory>
#include <iostream>
void fun(std::shared_ptr<int> sp)
{
std::cout << "fun: sp.use_count() == " << sp.use_count() << '\n';
//I expect to get 2 not 1
}
int main()
{
fun(std::make_shared<int>(5));
}
So I think I have 2 smart pointer objects here, one is std::make_shared<int>(5), the temporary unnamed object and the other sp which is a local variable inside the function. So based on my understanding, the temporary one won't "die" before completing the function call. I expect output to be 2 not 1. What's wrong here?
Pre-C++17, sp is move-constructed from the temporary if the move is not elided to begin with. In either case, sp is the sole owner of the resource, so the use count is rightly reported as 1. This is overload 10)† in this reference.
While the temporary still exists, if not elided, it is in a moved-from state and no longer holds any resource, so it doesn't contribute to the resource's use count.
Since C++17, no temporary is created thanks to guaranteed copy/move elision, and sp is constructed in place.
† Exact wording from said reference:
10) Move-constructs a shared_ptr from r. After the construction, *this contains a copy of the previous state of r, r is empty and its stored pointer is null. [...]
In our case, r refers to the temporary and *this to sp.
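The effect of that move constructor on the use count can be checked directly. This is a small sketch; count_in_callee, pass_temporary, pass_named and source_count_after_move are hypothetical helper names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Reports the use count seen by a by-value parameter.
long count_in_callee(std::shared_ptr<int> sp) {
    return sp.use_count();
}

// Passing a temporary: whether the move is elided or not,
// sp ends up as the only owner.
long pass_temporary() {
    return count_in_callee(std::make_shared<int>(5));
}

// Passing a named pointer copies it, so two owners exist during the call.
long pass_named() {
    std::shared_ptr<int> owner = std::make_shared<int>(5);
    return count_in_callee(owner);
}

// An explicit move leaves the source empty; an empty shared_ptr
// reports a use count of 0.
long source_count_after_move() {
    std::shared_ptr<int> source = std::make_shared<int>(5);
    std::shared_ptr<int> target(std::move(source));
    assert(target.use_count() == 1);
    return source.use_count();
}
```

Only pass_named() produces the count of 2 that the question expected, because that is the only call in which two shared_ptr objects share the resource at the same time.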
c++ has a strange concept known as elision.
Elision is a process whereby the compiler is allowed to take the lifetime of two objects and merge them. Typically people say that the copy or move constructor "is elided", but what is really elided is the identity of two seemingly distinct objects.
As a rule of thumb, when an anonymous temporary object is used to directly construct another object, their lifetimes can be elided together. So:
A a = A{}; // A{} is elided with a
void f(A);
f(A{}); // temporary A{} is elided with argument of f
A g();
f(g()); // return value of g is elided with argument of f
There are also situations where named variables can be elided with return values, and more than two objects can be elided together:
A g() {
A a;
return a; // a is elided with return value of g
}
A f() {
A x = g(); // x is elided with return value of g
// which is elided with a within g
return x; // and is then elided with return value of f
}
A bob = f(); // and then elided with bob.
Only one instance of A exists in the above code; it just has many names.
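How many objects a given compiler really creates for that chain can be measured with counters. The sketch below assumes nothing beyond the A, g and f shown above; the static counters and transfers_in_chain are hypothetical additions.

```cpp
#include <cassert>

struct A {
    static int constructed;  // default constructions
    static int transferred;  // copy or move constructions
    static int destroyed;    // destructor calls
    A() { ++constructed; }
    A(const A&) { ++transferred; }
    A(A&&) { ++transferred; }
    ~A() { ++destroyed; }
};
int A::constructed = 0;
int A::transferred = 0;
int A::destroyed = 0;

A g() {
    A a;
    return a;  // a may be elided with the return value of g
}

A f() {
    A x = g();  // x may be elided with the return value of g
    return x;   // and then with the return value of f
}

// Returns how many copy/move constructions the whole chain needed.
int transfers_in_chain() {
    A::constructed = A::transferred = A::destroyed = 0;
    {
        A bob = f();
        (void)bob;
    }
    // Every object that was created has been destroyed again.
    assert(A::destroyed == A::constructed + A::transferred);
    return A::transferred;
}
```

If every elision in the chain is performed, transfers_in_chain() returns 0 and a single A exists under four names. With all elision disabled before C++17 it can be as high as 4, one for each hand-off.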
In c++17 things go even further. Prior to that, the objects in question had to logically be copyable/movable, and elision simply eliminated calls to the constructor and shared the objects' identity.
After c++17 some things that used to be elision are (in some sense) "guaranteed elision", which is really a different thing. "Guaranteed elision" is basically the idea that prvalues (things that used to be temporaries in pre-c++17) are now abstract instructions on how to create an object.
In certain circumstances temporaries are instantiated from them, but in others they are just used to construct some other object in some other spot.
So in c++17 you should think of this function:
A f();
as a function that returns instructions on how to create a A. When you do this:
A a = f();
you are saying "use the instructions that f returns to construct an A named a".
Similarly, A{} is no longer a temporary but instructions on how to create an A. If you put it on a line by itself those instructions are used to create a temporary, but in most contexts no temporary logically or actually exists.
template<class T, class...Us>
std::shared_ptr<T> make_shared(Us&&...);
this is a function that returns instructions on how to create a shared_ptr<T>.
fun(std::make_shared<int>(5));
here you apply these instructions to the argument of fun, which is of type std::shared_ptr<int>.
Before C++17, without hostile compiler flags, the result with elision is practically the same here. In that case, the temporary's identity is merged with the argument of fun.
In no practical case will there be a temporary shared_ptr with a reference count of 0; other answers which claim this are wrong. The one way that can occur is if you pass in flags that prevent your compiler from performing elision (the above hostile compiler flags).
If you do pass in such flags, the temporary shared_ptr is moved from into the argument of fun, and the temporary then exists with a reference count of 0. The argument's use_count is still 1.
In addition to the move construction of std::shared_ptr, there is another aspect to consider: in-place creation of function argument passed by value. This is an optimization that compilers usually do. Consider the exemplary type
struct A {
A() { std::cout << "ctor\n"; }
A(const A&) { std::cout << "copy ctor\n"; }
};
together with a function that takes an instance of A by value
void f(A) {}
When the function parameter is passed as an rvalue like this
f(A{});
the copy constructor won't be called unless you explicitly compile with -fno-elide-constructors. In C++17, you can even delete the copy constructor
A(const A&) = delete;
because the copy elision is guaranteed. With this in mind: the temporary object that you pass as a function argument is "destroyed after the end of the full-expression containing it" only if there is a temporary, and a code snippet might suggest the existence of one even though it's easily (and since C++17: guaranteed to be) optimized out.

Can't return unique_ptr element from an array by value [duplicate]

unique_ptr<T> does not allow copy construction, instead it supports move semantics. Yet, I can return a unique_ptr<T> from a function and assign the returned value to a variable.
#include <iostream>
#include <memory>
using namespace std;
unique_ptr<int> foo()
{
unique_ptr<int> p( new int(10) );
return p; // 1
//return move( p ); // 2
}
int main()
{
unique_ptr<int> p = foo();
cout << *p << endl;
return 0;
}
The code above compiles and works as intended. So how is it that line 1 doesn't invoke the copy constructor and result in compiler errors? If I had to use line 2 instead it'd make sense (using line 2 works as well, but we're not required to do so).
I know C++0x allows this exception to unique_ptr since the return value is a temporary object that will be destroyed as soon as the function exits, thus guaranteeing the uniqueness of the returned pointer. I'm curious about how this is implemented, is it special cased in the compiler or is there some other clause in the language specification that this exploits?
is there some other clause in the language specification that this exploits?
Yes, see 12.8 §34 and §35:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object [...]
This elision of copy/move operations, called copy elision, is permitted [...]
in a return statement in a function with a class return type, when the expression is the name of
a non-volatile automatic object with the same cv-unqualified type as the function return type [...]
When the criteria for elision of a copy operation are met and the object to be copied is designated by an lvalue,
overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
I just want to add one more point: returning by value should be the default choice here, because in the worst case (that is, without elision) a named value in the return statement is treated as an rvalue in C++11, C++14 and C++17. So, for example, the following function compiles with the -fno-elide-constructors flag
std::unique_ptr<int> get_unique() {
auto ptr = std::unique_ptr<int>{new int{2}}; // <- 1
return ptr; // <- 2, moved into the to be returned unique_ptr
}
...
auto int_uptr = get_unique(); // <- 3
With the flag set on compilation there are two moves (1 and 2) happening in this function and then one move later on (3).
This is in no way specific to std::unique_ptr, but applies to any class that is movable. It's guaranteed by the language rules since you are returning by value. The compiler tries to elide copies, invokes a move constructor if it can't remove copies, calls a copy constructor if it can't move, and fails to compile if it can't copy.
If you had a function that accepts std::unique_ptr as an argument you wouldn't be able to pass p to it. You would have to explicitly invoke move constructor, but in this case you shouldn't use variable p after the call to bar().
void bar(std::unique_ptr<int> p)
{
// ...
}
int main()
{
unique_ptr<int> p = foo();
bar(p); // error, can't implicitly invoke move constructor on lvalue
bar(std::move(p)); // OK but don't use p afterwards
return 0;
}
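A compilable variant of the foo()/bar() pair shows what "don't use p afterwards" means in practice. This is a sketch: bar is given a hypothetical body that returns the value it received, and source_is_empty_after_move is a made-up helper.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// unique_ptr cannot be copied at all; it can only be moved.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr is move-only");

std::unique_ptr<int> foo() {
    std::unique_ptr<int> p(new int(10));
    return p;  // no std::move needed: p is treated as an rvalue here
}

// Takes ownership of the pointer; returns the pointed-to value, or -1 if empty.
int bar(std::unique_ptr<int> p) {
    return p ? *p : -1;
}

// Returns true if p is empty after being moved into bar().
bool source_is_empty_after_move() {
    std::unique_ptr<int> p = foo();
    int seen = bar(std::move(p));
    assert(seen == 10);
    (void)seen;
    return p == nullptr;  // the moved-from unique_ptr holds nothing
}
```

After bar(std::move(p)) the variable p is still a valid object, but it is null, so dereferencing it would be undefined behaviour.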
unique_ptr doesn't have the traditional copy constructor. Instead it has a "move constructor" that uses rvalue references:
unique_ptr::unique_ptr(unique_ptr && src);
An rvalue reference (the double ampersand) will only bind to an rvalue. That's why you get an error when you try to pass an lvalue unique_ptr to a function. On the other hand, a value that is returned from a function is treated as an rvalue, so the move constructor is called automatically.
By the way, this will work correctly:
bar(unique_ptr<int>(new int(44)));
The temporary unique_ptr here is an rvalue.
I think it's perfectly explained in item 25 of Scott Meyers' Effective Modern C++. Here's an excerpt:
The part of the Standard blessing the RVO goes on to say that if the conditions for the RVO are met, but compilers choose not to perform copy elision, the object being returned must be treated as an rvalue. In effect, the Standard requires that when the RVO is permitted, either copy elision takes place or std::move is implicitly applied to local objects being returned.
Here, RVO refers to return value optimization, and "if the conditions for the RVO are met" means returning the local object declared inside the function, the case in which you would expect the RVO to happen. This is also nicely explained in item 25 of his book by referring to the standard (here the local object includes the temporary objects created by the return statement). The biggest takeaway from the excerpt is that either copy elision takes place or std::move is implicitly applied to local objects being returned. Scott mentions in item 25 that std::move is implicitly applied when the compiler chooses not to elide the copy, and that the programmer should not apply it explicitly.
In your case, the code is clearly a candidate for RVO as it returns the local object p and the type of p is the same as the return type, which results in copy elision. And if the compiler chooses not to elide the copy, for whatever reason, std::move would've kicked in to line 1.
One thing I didn't see in the other answers: there is a difference between returning a std::unique_ptr that was created within a function and one that was passed to that function.
The example could be like this:
class Test
{int i;};
std::unique_ptr<Test> foo1()
{
std::unique_ptr<Test> res(new Test);
return res;
}
std::unique_ptr<Test> foo2(std::unique_ptr<Test>&& t)
{
// return t; // this will produce an error!
return std::move(t);
}
//...
auto test1=foo1();
auto test2=foo2(std::unique_ptr<Test>(new Test));
I would like to mention one case where you must use std::move() otherwise it will give an error.
Case: If the return type of the function differs from the type of the local variable.
class Base { ... };
class Derived : public Base { ... };
...
std::unique_ptr<Base> Foo() {
std::unique_ptr<Derived> derived(new Derived());
return std::move(derived); //std::move() must
}
Reference: https://www.chromium.org/developers/smart-pointer-guidelines
I know it's an old question, but I think an important and clear reference is missing here.
From https://en.cppreference.com/w/cpp/language/copy_elision :
(Since C++11) In a return statement or a throw-expression, if the compiler cannot perform copy elision but the conditions for copy elision are met or would be met, except that the source is a function parameter, the compiler will attempt to use the move constructor even if the object is designated by an lvalue; see return statement for details.
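The function-parameter case from that quote can be tried with a move-only type. This is a minimal sketch; pass_through and round_trip are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// p is an lvalue inside the body and unique_ptr cannot be copied, yet
// 'return p;' compiles: overload resolution first treats p as an rvalue,
// which selects the move constructor.
std::unique_ptr<int> pass_through(std::unique_ptr<int> p) {
    return p;
}

// Sends a value through pass_through and reads it back.
int round_trip(int value) {
    std::unique_ptr<int> in(new int(value));
    std::unique_ptr<int> out = pass_through(std::move(in));
    assert(in == nullptr);  // ownership left 'in' when the parameter was built
    return *out;
}
```

The caller still has to write std::move(in), because in is a named variable there; only the return statement gets the implicit rvalue treatment.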

Can we use the return value optimization when possible and fall back on move, not copy, semantics when not?

Is it possible to write C++ code where we rely on the return value optimization (RVO) when possible, but fall back on move semantics when not? For example, the following code can not use the RVO due to the conditional, so it copies the result back:
#include <iostream>
struct Foo {
Foo() {
std::cout << "constructor" << std::endl;
}
Foo(Foo && x) {
std::cout << "move" << std::endl;
}
Foo(Foo const & x) {
std::cout << "copy" << std::endl;
}
~Foo() {
std::cout << "destructor" << std::endl;
}
};
Foo f(bool b) {
Foo x;
Foo y;
return b ? x : y;
}
int main() {
Foo x(f(true));
std::cout << "fin" << std::endl;
}
This yields
constructor
constructor
copy
destructor
destructor
fin
destructor
which makes sense. Now, I could force the move constructor to be called in the above code by changing the line
return b ? x : y;
to
return std::move(b ? x : y);
This gives the output
constructor
constructor
move
destructor
destructor
fin
destructor
However, I don't really like to call std::move directly.
Really, the issue is that I'm in a situation where I absolutely, positively, can not call the copy constructor even when the constructor exists. In my use case, there's too much memory to copy and although it'd be nice to just delete the copy constructor, it's not an option for a variety of reasons. At the same time, I'd like to return these objects from a function and would prefer to use the RVO. Now, I don't really want to have to remember all of the nuances of the RVO when coding, and when it's applied and when it's not applied. Mostly, I want the object to be returned and I don't want the copy constructor called. Certainly, the RVO is better, but the move semantics are fine. Is there a way to use the RVO when possible and the move semantics when not?
Edit 1
The following question helped me figure out what's going on. Basically, 12.8.32 of the standard states:
When the criteria for elision of a copy operation are met or would be
met save for the fact that the source object is a function parameter,
and the object to be copied is designated by an lvalue, overload
resolution to select the constructor for the copy is first performed
as if the object were designated by an rvalue. If overload resolution
fails, or if the type of the first parameter of the selected
constructor is not an rvalue reference to the object’s type (possibly
cv-qualified), overload resolution is performed again, considering the
object as an lvalue. [ Note: This two-stage overload resolution must
be performed regardless of whether copy elision will occur. It
determines the constructor to be called if elision is not performed,
and the selected constructor must be accessible even if the call is
elided. —end note ]
Alright, so to figure out what the criteria for copy elision are, we look at 12.8.31
in a return statement in a function with a class return type, when the
expression is the name of a non-volatile automatic object (other than
a function or catch-clause parameter) with the same cv-unqualified type
as the function return type, the copy/move operation can be omitted by
constructing the automatic object directly into the function’s return
value
As such, if we define the code for f as:
Foo f(bool b) {
Foo x;
Foo y;
if(b) return x;
return y;
}
Then each of our return values is an automatic object, so 12.8.31 says that it qualifies for copy elision. That kicks over to 12.8.32, which says that the copy is performed as if it were an rvalue. Now, the RVO doesn't happen because we don't know a priori which path to take, but the move constructor is called due to the requirements in 12.8.32. Technically, one move constructor is avoided when copying into x. Basically, when running, we get:
constructor
constructor
move
destructor
destructor
fin
destructor
Turning off constructor elision generates:
constructor
constructor
move
destructor
destructor
move
destructor
fin
destructor
Now, say we go back to
Foo f(bool b) {
Foo x;
Foo y;
return b ? x : y;
}
We have to look at the semantics for the conditional operator in 5.16.4
If the second and third operands are glvalues of the same value
category and have the same type, the result is of that type and value
category and it is a bit-field if the second or the third operand is a
bit-field, or if both are bit-fields.
Since both x and y are lvalues, the conditional operator is an lvalue, but not an automatic object. Therefore, 12.8.32 doesn't kick in and we treat the return value as an lvalue and not an rvalue. This requires that the copy constructor be called. Hence, we get
constructor
constructor
copy
destructor
destructor
fin
destructor
Now, since the conditional operator in this case is basically copying out the value category, that means that the code
Foo f(bool b) {
return b ? Foo() : Foo();
}
will return an rvalue because both branches of the conditional operator are rvalues. We see this with:
constructor
fin
destructor
If we turn off constructor elision, we see the moves
constructor
move
destructor
move
destructor
fin
destructor
Basically, the idea is that if we return an rvalue we'll call the move constructor. If we return an lvalue, we'll call the copy constructor. When we return a non-volatile automatic object whose type matches that of the return type, we return an rvalue. If we have a decent compiler, these copies and moves may be elided with the RVO. However, at the very least, we know what constructor is called in case the RVO can't be applied.
When the expression in the return statement is a non-volatile automatic duration object, and not a function or catch-clause parameter, with the same cv-unqualified type as the function return type, the resulting copy/move is eligible for copy elision. The standard also goes on to say that, if the only reason copy elision was forbidden was that the source object was a function parameter, and if the compiler is unable to elide a copy, the overload resolution for the copy should be done as if the expression was an rvalue. Thus, it would prefer the move constructor.
OTOH, since you are using the ternary expression, none of the conditions hold and you are stuck with a regular copy. Changing your code to
if(b)
return x;
return y;
calls the move constructor.
Note that there is a distinction between RVO and copy elision - copy elision is what the standard allows, while RVO is a technique commonly used to elide copies in a subset of the cases where the standard allows copy elision.
Yes, there is. Don't return the result of a ternary operator; use if/else instead. When you return a local variable directly, move semantics are used when possible. However, in your case you're not returning a local directly -- you're returning the result of an expression.
If you change your function to read like this:
Foo f(bool b) {
Foo x;
Foo y;
if (b) { return x; }
return y;
}
Then you should note that your move constructor is called instead of your copy constructor.
If you stick to returning a single local value per return statement then move semantics will be used if supported by the type.
If you don't like this approach then I would suggest that you stick with std::move. You may not like it, but you have to pick your poison -- the language is the way that it is.
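The three ways of writing the return can be compared side by side by counting constructor calls. This is a sketch only: the static counters replace the printing in the question's Foo, and with_ternary, with_if, with_explicit_move and copies_caused_by are hypothetical names.

```cpp
#include <cassert>
#include <utility>

struct Foo {
    static int copies;
    static int moves;
    Foo() {}
    Foo(const Foo&) { ++copies; }
    Foo(Foo&&) { ++moves; }
};
int Foo::copies = 0;
int Foo::moves = 0;

Foo with_ternary(bool b) {
    Foo x;
    Foo y;
    return b ? x : y;  // an lvalue expression, not a variable name: copied
}

Foo with_if(bool b) {
    Foo x;
    Foo y;
    if (b) return x;   // a plain variable name: moved if not elided
    return y;
}

Foo with_explicit_move(bool b) {
    Foo x;
    Foo y;
    return std::move(b ? x : y);  // an rvalue: moved, and never elided
}

// Runs one variant and reports how many copy constructions it caused.
template <typename F>
int copies_caused_by(F make) {
    Foo::copies = Foo::moves = 0;
    Foo result(make(true));
    (void)result;
    return Foo::copies;
}
```

Only the ternary version should ever reach the copy constructor. How many moves the other two perform depends on how much elision the compiler applies when initializing the caller's object.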

Understanding function call in context of temporary objects

Look at this simple code:
class A
{};
A f(A a)
{
return a;
}
int main(void)
{
A a;
A b = f(a);
return 0;
}
It creates a local variable a, calls a function f() and assigns its return value to another variable b. But I'd like to know what happens during the function call.
Could someone describe to me, step by step, what objects (temporary or otherwise) are created during the process, what constructors, destructors and assign/move operators are called and when?
When in doubt bring out the Noisy class:
struct Noisy {
Noisy() { std::cout << "Default construct" << std::endl; }
Noisy(const Noisy&) { std::cout << "Copy construct" << std::endl; }
Noisy(Noisy&&) { std::cout << "Move construct" << std::endl; }
Noisy& operator=(const Noisy&) { std::cout << "Copy assignment" << std::endl; return *this; }
Noisy& operator=(Noisy&&) { std::cout << "Move assignment" << std::endl; return *this; }
~Noisy() { std::cout << "Destructor" << std::endl; }
};
Noisy f(Noisy a) {
return a;
}
int main(void) {
Noisy a;
Noisy b = f(a);
}
Compiled with gcc-4.9.1 using options g++ -fno-elide-constructors -std=c++11 t.cc gives output:
Default construct // 1. 'a' is default constructed.
Copy construct // 2. Local argument 'a' in function 'f' is copied.
Move construct // 3. Return value is move constructed (*see note below).
Move construct // 4. 'b' is move constructed from return value.
Destructor // 5. Local argument 'a' is destroyed.
Destructor // 6. Return value is destroyed.
Destructor // 7. 'b' is destroyed.
Destructor // 8. 'a' is destroyed.
Note: Even though local argument a is an lvalue, the compiler knows it's about to go out of scope and considers it as an rvalue.
Compiling without option -fno-elide-constructors will enable compiler copy elision optimizations and yields output:
Default construct // 1. 'a' is default constructed.
Copy construct // 2. Local argument 'a' in function 'f' is copied.
Move construct // 3. 'b' is move constructed from argument 'a' (elision).
Destructor // 4. Local argument 'a' is destroyed.
Destructor // 5. 'b' is destroyed.
Destructor // 6. 'a' is destroyed.
Compiling with -std=c++03 i.e. C++03 will result in all moves being replaced with copies.
For more info about copy elision see here: What are copy elision and return value optimization?
A f(A a)
{
return a;
}
A a;
A b = f(a);
The parameter (a) is copy-initialized with the corresponding argument (a). That simply involves the copy-constructor.
The return-value temporary is copy-initialized with a.
b is copy-initialized with the return value of the function call. The implicitly-defined move-constructor is called (as the initializer is a (p)rvalue).
Note that copy elision doesn't apply here as, in return statements, it only works for variables that aren't function (or catch-clause) parameters.
int main(void)
{
A a; // creates `a` using default constructor of `A`
A b = f(a); // initializes `b` using implicitly-defined
// move-constructor from temporary copy of `a` (see [1])
/* where
A f(A a) // gets copy of `A` object as argument
{
return a; // return-value is copy-initialized
}
*/
return 0;
}
[1] The implicit generation of move constructors
Assuming the optimizer does not simplify the process:
The function f you wrote takes its argument a by value. So calling f involves setting up its parameter, which means copying the local variable a of your main function into the stack space used for passing the parameter to f (a temporary object). This object is created using the copy constructor.
As f returns by value, all typical C++ implementations work by passing a pointer to storage space for an object of type A (no created object yet) as a hidden parameter to f. In the most simple case, this is a temporary object created on the stack of main for the time of the expression involving the function call.
Now that the parameters for f are set up, the function f is entered. As its sole statement is a return statement, the only thing f does is copy the parameter a into the storage provided by the caller. This is done using the move constructor, but you hit a corner case in the language specification here, so compiler behaviour might vary, and you might get a copy instead. More on this in the last paragraph.
After having constructed the return value, f exits. main regains control and creates the local variable b, taking the temporary object returned by f as source. As this temporary object is an rvalue, the initialization of b is done using move construction (on C++11).
As the statement is completed now, the temporary objects (the parameter and the return value) are destroyed. I don't think the order of destruction is specified, but if it is, it will be last-constructed first, so the return value would be destroyed before the parameter.
Typically, a "move elision" optimization is applied to your code, though. The hidden parameter of f is then not given the address of a temporary object that is afterwards move-constructed into b; instead, f constructs the return value directly into b.
The (non-named) return value optimization is not a concern for the code given, as the return statement does not consist of a constructor call.
The named return-value optimization is also not applicable. This optimization would place the object you return (i.e. the parameter a) at the place the caller provided for the return value, so a copy/move operation can be avoided. In the function you wrote, the returned object is a parameter, so the compiler gets no chance to "put it where the return value is going to be" while compiling f, as the machine calling convention dictates where that object is to be found.
The aforementioned corner case in the language specification (whether the return value of f is move- or copy-constructed) is rooted in an implicit "std::move", allowing moving, on the return statement. This is specified in clause [class.copy] (12.8 in n3337) in paragraph 32. It specifies that if a copy is allowed to be elided (according to the previous paragraph), an l-value given as copy source (in this case the name of the parameter a) is treated as an r-value (i.e. can be moved from). The criteria for allowed copy elisions are given in 12.8/31, which amongst other criteria lists:
a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
This is, in fact, the definition of the named return value optimization! And as already explained above, the named return value optimization cannot work here, as parameters and returned objects are located in different spaces. But now let's go back to [12.8/32] and look at the precise wording:
When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue
The part "or would be met save for the fact that the source object is a function parameter" is our life-saver here. It enables treatment as r-value even in the case where the named return value optimization is not possible just because of object storage locations and not because of semantic constraints.
EDIT: The reason I called this a corner case was missing: the life-saver clause was added very late in the C++11 standardization process, so there are some partly conforming compilers that do not implicitly move from parameters.
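One way to check whether a given compiler implements the "save for the fact that the source object is a function parameter" rule is to return a by-value parameter of a move-only type. The sketch below uses a made-up type MoveOnly and made-up function names. If the compiler does not implicitly move from parameters, pass_through should fail to compile because the copy constructor is deleted.

```cpp
#include <cassert>

// Hypothetical move-only type: copying is disabled, moving transfers `value`.
struct MoveOnly {
    int value;
    explicit MoveOnly(int v) : value(v) {}
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly& operator=(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&& other) : value(other.value) { other.value = -1; }
};

// `m` is an lvalue and a function parameter, yet `return m;` compiles:
// overload resolution first treats `m` as an rvalue and selects the
// move constructor.
MoveOnly pass_through(MoveOnly m) {
    return m;
}

// Sends a value through pass_through and reads it back.
int pass_value(int v) {
    MoveOnly result = pass_through(MoveOnly(v));
    return result.value;
}
```

On a conforming C++11 compiler, pass_value(5) should simply give back 5.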

Returning unique_ptr from functions

unique_ptr<T> does not allow copy construction; instead it supports move semantics. Yet, I can return a unique_ptr<T> from a function and assign the returned value to a variable.
#include <iostream>
#include <memory>
using namespace std;
unique_ptr<int> foo()
{
unique_ptr<int> p( new int(10) );
return p; // 1
//return move( p ); // 2
}
int main()
{
unique_ptr<int> p = foo();
cout << *p << endl;
return 0;
}
The code above compiles and works as intended. So how is it that line 1 doesn't invoke the copy constructor and result in compiler errors? If I had to use line 2 instead it'd make sense (using line 2 works as well, but we're not required to do so).
I know C++0x allows this exception to unique_ptr since the return value is a temporary object that will be destroyed as soon as the function exits, thus guaranteeing the uniqueness of the returned pointer. I'm curious about how this is implemented: is it special-cased in the compiler, or is there some other clause in the language specification that this exploits?
is there some other clause in the language specification that this exploits?
Yes, see 12.8 §34 and §35:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object [...]
This elision of copy/move operations, called copy elision, is permitted [...]
in a return statement in a function with a class return type, when the expression is the name of
a non-volatile automatic object with the same cv-unqualified type as the function return type [...]
When the criteria for elision of a copy operation are met and the object to be copied is designated by an lvalue,
overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
Just wanted to add one more point: returning by value should be the default choice here, because in the worst case (i.e. without elision) a named value in the return statement is treated as an rvalue in C++11, C++14 and C++17. So, for example, the following function compiles even with the -fno-elide-constructors flag:
std::unique_ptr<int> get_unique() {
auto ptr = std::unique_ptr<int>{new int{2}}; // <- 1
return ptr; // <- 2, moved into the to be returned unique_ptr
}
...
auto int_uptr = get_unique(); // <- 3
With the flag set on compilation there are two moves (1 and 2) happening in this function and then one move later on (3).
This is in no way specific to std::unique_ptr, but applies to any class that is movable. It's guaranteed by the language rules since you are returning by value. The compiler tries to elide copies, invokes a move constructor if it can't remove copies, calls a copy constructor if it can't move, and fails to compile if it can't copy.
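The "calls a copy constructor if it can't move" step of that chain can be sketched with a type that has a copy constructor but no move constructor. CopyOnly and the helper names are made up for this illustration. The parameter is returned on purpose, since that is a case where the return itself cannot be elided.

```cpp
#include <cassert>

// Hypothetical type with a user-declared copy constructor. Declaring it
// suppresses the implicit move constructor, so only copying is available.
struct CopyOnly {
    static int copies;
    int value;
    explicit CopyOnly(int v) : value(v) {}
    CopyOnly(const CopyOnly& other) : value(other.value) { ++copies; }
};
int CopyOnly::copies = 0;

// Returning a by-value parameter: there is no move constructor to pick,
// so the return value is built with the copy constructor.
CopyOnly echo(CopyOnly c) {
    return c;
}

// Number of copy constructions observed for one call to echo().
int copies_for_echo(int v) {
    CopyOnly::copies = 0;
    CopyOnly r = echo(CopyOnly(v));
    assert(r.value == v);
    return CopyOnly::copies;
}
```

copies_for_echo should report at least one copy; how many more depends on which other copies the compiler elides.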
If you had a function that accepts std::unique_ptr as an argument, you wouldn't be able to pass p to it directly. You would have to explicitly invoke the move constructor with std::move, and in that case you shouldn't use the variable p after the call to bar().
void bar(std::unique_ptr<int> p)
{
// ...
}
int main()
{
unique_ptr<int> p = foo();
bar(p); // error, can't implicitly invoke move constructor on lvalue
bar(std::move(p)); // OK but don't use p afterwards
return 0;
}
unique_ptr doesn't have the traditional copy constructor. Instead it has a "move constructor" that uses rvalue references:
unique_ptr::unique_ptr(unique_ptr && src);
An rvalue reference (the double ampersand) will only bind to an rvalue. That's why you get an error when you try to pass an lvalue unique_ptr to a function. On the other hand, a value that is returned from a function is treated as an rvalue, so the move constructor is called automatically.
By the way, this will work correctly:
bar(unique_ptr<int>(new int(44)));
The temporary unique_ptr here is an rvalue.
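The binding rules can be checked at compile time with the standard type traits. This is only a sketch: consume and demo_consume are hypothetical names, and consume plays the role of bar from the example above.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// unique_ptr has no usable copy constructor, only a move constructor.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must not be copyable");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must be movable");

// Hypothetical sink that takes ownership by value.
int consume(std::unique_ptr<int> p) { return p ? *p : -1; }

int demo_consume() {
    std::unique_ptr<int> owned(new int(44));
    // A temporary is an rvalue, so the move constructor is used implicitly.
    int from_temporary = consume(std::unique_ptr<int>(new int(1)));
    // A named variable is an lvalue and needs an explicit std::move.
    int from_moved = consume(std::move(owned));
    // consume(owned);  // would not compile: `owned` is an lvalue
    assert(!owned);     // ownership has left `owned`
    return from_temporary + from_moved;
}
```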
I think it's perfectly explained in item 25 of Scott Meyers' Effective Modern C++. Here's an excerpt:
The part of the Standard blessing the RVO goes on to say that if the conditions for the RVO are met, but compilers choose not to perform copy elision, the object being returned must be treated as an rvalue. In effect, the Standard requires that when the RVO is permitted, either copy elision takes place or std::move is implicitly applied to local objects being returned.
Here, RVO refers to return value optimization, and "if the conditions for the RVO are met" means returning a local object declared inside the function, for which you would expect the RVO to apply. This is also nicely explained in item 25 of his book by referring to the standard (here the local objects include the temporary objects created by the return statement). The biggest takeaway from the excerpt is "either copy elision takes place or std::move is implicitly applied to local objects being returned". Scott mentions in item 25 that std::move is implicitly applied when the compiler chooses not to elide the copy, and that the programmer should not apply it explicitly.
In your case, the code is clearly a candidate for RVO as it returns the local object p and the type of p is the same as the return type, which results in copy elision. And if the compiler chooses not to elide the copy, for whatever reason, std::move would've kicked in to line 1.
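To illustrate why writing std::move by hand on a returned local is discouraged, here is a rough sketch with a made-up counting type Tracked. Some compilers warn about the forced_move variant (GCC calls this a pessimizing move), which is the point of the example. Exact counts depend on compiler and flags, so the comments only state the relationship.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts move constructions.
struct Tracked {
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) {}
    Tracked(Tracked&&) { ++moves; }
};
int Tracked::moves = 0;

// Plain return of a local: eligible for elision, otherwise implicitly moved.
Tracked plain_return() {
    Tracked t;
    return t;
}

// Explicit std::move: the expression is no longer just the name of a local,
// so `t` cannot be constructed in the return slot and a move always happens.
Tracked forced_move() {
    Tracked t;
    return std::move(t);
}

// Number of move constructions observed for one call to `make`.
int count_moves(Tracked (*make)()) {
    Tracked::moves = 0;
    Tracked result = make();
    (void)result;
    return Tracked::moves;
}
```

count_moves(plain_return) should never be larger than count_moves(forced_move), and with elision enabled it is typically zero.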
One thing that I didn't see in the other answers: there is a difference between returning a std::unique_ptr that has been created within a function and one that has been passed to that function.
The example could be like this:
class Test
{int i;};
std::unique_ptr<Test> foo1()
{
std::unique_ptr<Test> res(new Test);
return res;
}
std::unique_ptr<Test> foo2(std::unique_ptr<Test>&& t)
{
// return t; // error before C++20: t is a named rvalue reference, i.e. an lvalue
return std::move(t);
}
//...
auto test1=foo1();
auto test2=foo2(std::unique_ptr<Test>(new Test));
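For comparison, a function that takes the unique_ptr by value (rather than by rvalue reference) can return it without std::move, because by-value parameters are covered by the implicit-move rule. The sketch below uses hypothetical names (Widget, by_value, by_rvalue_ref) and checks that the same object is handed through both variants.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Widget { int id; };  // hypothetical payload type

// By-value parameter: `return w;` compiles, the parameter is implicitly
// treated as an rvalue in the return statement.
std::unique_ptr<Widget> by_value(std::unique_ptr<Widget> w) {
    return w;
}

// Rvalue-reference parameter: `w` only refers to the caller's object,
// so the move has to be requested explicitly.
std::unique_ptr<Widget> by_rvalue_ref(std::unique_ptr<Widget>&& w) {
    return std::move(w);
}

// Passes one object through both functions and checks it is still the same.
bool same_object_survives() {
    std::unique_ptr<Widget> original(new Widget{7});
    Widget* raw = original.get();
    std::unique_ptr<Widget> a = by_value(std::move(original));
    std::unique_ptr<Widget> b = by_rvalue_ref(std::move(a));
    return b.get() == raw && !original && !a && b->id == 7;
}
```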
I would like to mention one case where you must use std::move(); otherwise the code fails to compile.
Case: If the return type of the function differs from the type of the local variable.
class Base { ... };
class Derived : public Base { ... };
...
std::unique_ptr<Base> Foo() {
std::unique_ptr<Derived> derived(new Derived());
return std::move(derived); // std::move() is required here
}
Reference: https://www.chromium.org/developers/smart-pointer-guidelines
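A complete version of that case might look like the sketch below. Shape and Triangle are made-up stand-ins for Base and Derived, and the virtual destructor is an addition of mine so that deleting through the base pointer is well defined.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for Base and Derived.
struct Shape {
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};

// The local has type unique_ptr<Triangle>, the return type is
// unique_ptr<Shape>. std::move selects the converting move constructor.
std::unique_ptr<Shape> make_shape() {
    std::unique_ptr<Triangle> t(new Triangle());
    return std::move(t);
}
```

Calling make_shape()->sides() should dispatch to Triangle and give 3.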
I know it's an old question, but I think an important and clear reference is missing here.
From https://en.cppreference.com/w/cpp/language/copy_elision :
(Since C++11) In a return statement or a throw-expression, if the compiler cannot perform copy elision but the conditions for copy elision are met or would be met, except that the source is a function parameter, the compiler will attempt to use the move constructor even if the object is designated by an lvalue; see return statement for details.