Why isn't the copy constructor elided here? - c++

(I'm using gcc with -O2.)
This seems like a straightforward opportunity to elide the copy constructor, since there are no side-effects to accessing the value of a field in a bar's copy of a foo; but the copy constructor is called, since I get the output meep meep!.
#include <iostream>
struct foo {
foo(): a(5) { }
foo(const foo& f): a(f.a) { std::cout << "meep meep!\n"; }
int a;
};
struct bar {
foo F() const { return f; }
foo f;
};
int main()
{
bar b;
int a = b.F().a;
return 0;
}

It is neither of the two legal cases of copy ctor elision described in 12.8/15:
Return value optimisation (where an automatic variable is returned from a function, and the copying of that automatic to the return value is elided by constructing the automatic directly in the return value) - nope. f is not an automatic variable.
Temporary initializer (where a temporary is copied to an object, and instead of constructing the temporary and copying it, the temporary value is constructed directly into the destination) - nope f is not a temporary either. b.F() is a temporary, but it isn't copied anywhere, it just has a data member accessed, so by the time you get out of F() there's nothing to elide.
Since neither of the legal cases of copy ctor elision apples, and the copying of f to the return value of F() affects the observable behaviour of the program, the standard forbids it to be elided. If you got replaced the printing with some non-observable activity, and examined the assembly, you might see that this copy constructor has been optimised away. But that would be under the "as-if" rule, not under the copy constructor elision rule.

Copy elision happens only when a copy isn't really necessary. In particular, it's when there's one object (call it A) that exists for the duration of the execution of a function, and a second object (call it B) that will be copy constructed from the first object, and immediately after that, A will be destroyed (i.e. upon exit from the function).
In this very specific case, the standard gives permission for the compiler to coalesce A and B into two separate ways of referring to the same object. Instead of requiring that A be created, then B be copy constructed from A, and then A be destroyed, it allows A and B to be considered two ways of referring to the same object, so the (one) object is created as A, and after the function returns starts to be referred to as B, but even if the copy constructor has side effects, the copy that creates B from A can still be skipped over. Also, note that in this case A (as an object separate from B) is never destroyed either -- e.g., if your dtor also had side effects, they could (would) be omitted as well.
Your code doesn't fit that pattern -- the first object does not cease to exist immediately after being used to initialize the second object. After F() returns, there are two instances of the object. That being the case, the [Named] Return Value Optimization (aka. copy elision) simply does not apply.
Demo code when copy elision would apply:
#include <iostream>
struct foo {
foo(): a(5) { }
foo(const foo& f): a(f.a) { std::cout << "meep meep!\n"; }
int a;
};
int F() {
// RVO
std::cout << "F\n";
return foo();
}
int G() {
// NRVO
std::cout << "G\n";
foo x;
return x;
}
int main() {
foo a = F();
foo b = G();
return 0;
}
Both MS VC++ and g++ optimize away both copy ctors from this code with optimization turned on. g++ optimizes both away even if optimization is turned off. With optimization turned off, VC++ optimizes away the anonymous return, but uses the copy ctor for the named return.

The copy constructor is called because a) there is no guarantee you are copying the field value without modification, and b) because your copy constructor has a side effect (prints a message).

A better way to think about copy elision is in terms of the temporary object. That is how the standard describes it. A temporary is allowed to be "folded" into a permanent object if it is copied into the permanent object immediately before its destruction.
Here you construct a temporary object in the function return. It doesn't really participate in anything, so you want it to be skipped. But what if you had done
b.F().a = 5;
if the copy were elided, and you operated on the original object, you would have modified b through a non-reference.

Related

Why does Return Value Optimization not happen if no destructor is defined?

I expected to see copy elision from Named Return Value Optimization (NRVO) from this test program but its output is "Addresses do not match!" so NRVO didn't happen. Why is this?
// test.cpp
// Compile using:
// g++ -Wall -std=c++17 -o test test.cpp
#include <string>
#include <iostream>
void *addr = NULL;
class A
{
public:
int i;
int j;
#if 0
~A() {}
#endif
};
A fn()
{
A fn_a;
addr = &fn_a;
return fn_a;
}
int main()
{
A a = fn();
if (addr == &a)
std::cout << "Addresses match!\n";
else
std::cout << "Addresses do not match!\n";
}
Notes:
If a destructor is defined by enabling the #if above, then the NRVO does happen (and it also happens in some other cases such as defining a virtual method or adding a std::string member).
No methods have been defined so A is a POD struct, or in more recent terminology a trivial class. I don't see an explicit exclusion for this in the above links.
Adding compiler optimisation (to a more complicated example that doesn't just reduce to the empty program!) doesn't make any difference.
Looking at the assembly for a second example shows that this even happens when I would expect mandatory Return Value Optimization (RVO), so the NRVO above was not prevented by taking the address of fn_a in fn(). Clang, GCC, ICC and MSVC on x86-64 show the same behaviour suggesting this behaviour is intentional and not a bug in a specific compiler.
class A
{
public:
int i;
int j;
#if 0
~A() {}
#endif
};
A fn()
{
return A();
}
int main()
{
// Where NRVO occurs the call to fn() is preceded on x86-64 by a move
// to RDI, otherwise it is followed by a move from RAX.
A a = fn();
}
The language rule which allows this in case of returning a prvalue (the second example) is:
[class.temporary]
When an object of class type X is passed to or returned from a function, if X has at least one eligible copy or move constructor ([special]), each such constructor is trivial, and the destructor of X is either trivial or deleted, implementations are permitted to create a temporary object to hold the function parameter or result object.
The temporary object is constructed from the function argument or return value, respectively, and the function's parameter or return object is initialized as if by using the eligible trivial constructor to copy the temporary (even if that constructor is inaccessible or would not be selected by overload resolution to perform a copy or move of the object).
[Note: This latitude is granted to allow objects of class type to be passed to or returned from functions in registers.
— end note
]
Why does Return Value Optimization not happen [in some cases]?
The motivation for the rule is explained in the note of the quoted rule. Essentially, RVO is sometimes less efficient than no RVO.
If a destructor is defined by enabling the #if above, then the RVO does happen (and it also happens in some other cases such as defining a virtual method or adding a std::string member).
In the second case, this is explained by the rule because creating the temporary is only allowed when the destructor is trivial.
In the NRVO case, I suppose this is up to the language implementation.
On many ABIs, if a return value is a trivially copyable object whose size/alignment is equal to or less than that of a pointer/register, then the ABI will not permit elision. The reason being that it is more efficient to just return the value via a register than via a stack memory address.
Note that when you get the address either of the object in the function or the returned object, the compiler will force the object onto the stack. But the actual passing of the object will be via a register.

Why is using the move constructor in a return statement legal?

Consider the following:
#include <iostream>
#define trace(name) std::cout << #name << " (" << this << "), i = " << i << std::endl
class C
{
C(C const&);
C& operator=(C const&);
public:
int i;
C() : i(42) { trace(CTOR); }
C(C&& other) : i(other.i) { trace(MOVE); other.i = 0; }
C& operator=(C&& other) { trace(ASGN); other.i = 0; return *this; }
~C() { trace(DTOR); }
};
C
func1(bool c)
{
C local;
if (c)
return local;
else
return C();
}
int
main()
{
C local(func1(true));
return 0;
}
Both MSC and g++ allow the return local, and use the move constructor (as shown by the output) when doing so. While this makes a lot of sense to me, and I think it probably should be the case, I can't find the text in the standard that authorizes it. As far as I can see, the argument to the move constructor must be either a prvalue (which it clearly isn't) or an xvalue; it is in fact an lvalue, which would make the return just as illegal as C other = local; in the function body (which does fail to compile).
When move semantics where added to C++ in C++11, decisions where made about where to automatically do move construction.
The general rule that was followed was that implicit move should occur when copy elision would be legal.
Copy elision is legal when you have an anonymous object that you are copying to another instance. The compiler can legally elide the copy, and treat the two objects as one, with a lifetime equal to the union of the two objects lifetime.
The other time that copy elision is legal is when you return a local variable from a function. This was known as NRVO (named return value optimization). This optimization in C++03 let you declare a return value, and it is constructed directly in the place where the return value goes. When you return retval;, no copy happens.
This is basically impossible to do without inlining when you return multiple different objects at different spots in your function.
However, when move semantics where added, this place was the other spot where implicit move can occur. A return local_var; in a function returning a variable the same type as local_var can be elided, and failing that, you can implicitly move from local_var into the return value.
C++11 12.8/32: When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
Returning a local automatic variable meets the criteria for elision; treating it as if it were an rvalue selects the move constructor.

Returning a value from a function with move semantics or the return value optimization, but not the copy constructor

Is there a good way to return a value from a function in C++ where we guarantee that the copy constructor is not called? Either the return value optimization or the move constructor are fine. For example, with the following code
#include <iostream>
struct Foo {
private:
// Disallow the copy and default constructor as well as the assignment
// operator
Foo();
Foo(Foo const & foo);
Foo & operator = (Foo const & foo);
public:
// Store a little bit of data
int data;
Foo(int const & data_) : data(data_) { }
// Write a move constructor
Foo(Foo && foo) {
std::cout << "Move constructor" << std::endl;
data=foo.data;
}
};
// Write a function that creates and returns a Foo
Foo Bar() {
Foo foo(3);
return foo;
}
// See if we can mix things up
Foo Baz(int x) {
Foo foo2(2);
Foo foo3(3);
return x>2 ? foo2 : foo3;
}
int main() {
// This is using the return value optimization (RVO)
Foo foo1 = Bar();
std::cout << foo1.data << std::endl;
// This should make the RVO fail
Foo foo2 = Baz(3);
std::cout << foo2.data << std::endl;
}
We have a compiler error
$ make
g++ -std=c++11 test01.cpp -o test01
test01.cpp: In function 'Foo Baz(int)':
test01.cpp:10:5: error: 'Foo::Foo(const Foo&)' is private
test01.cpp:35:25: error: within this context
make: *** [all] Error 1
since the copy constructor is private. Now, if we modify the Baz function to
// See if we can mix things up
Foo Baz(int x) {
Foo foo2(2);
Foo foo3(3);
return std::move(x>2 ? foo2 : foo3);
}
we do in fact run correctly. However, this seems to preclude the RVO from being used ever. Is there a better way to structure these functions if we must guarantee that the copy constructor is not called?
From the C++ standard:
[class.copy]/31: When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects. ... This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which may be combined to eliminate multiple copies):
in a return statement in a function with a class return type, when the expression is the name of a
non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-
unqualified type as the function return type, the copy/move operation can be omitted by constructing
the automatic object directly into the function’s return value
Since x > 2 ? foo2 : foo3 is not the name of an automatic object, copy elision is not permitted.
Interestingly, your example is addressed in n1377:
With this language feature in place, move/copy elision, although still
important, is no longer critical. There are some functions where NRVO
is allowed, but can be exceedingly difficult to implement. For
example:
A
f(bool b)
{
A a1, a2;
// ...
return b ? a1 : a2;
}
It is somewhere between difficult and impossible to decide whether to
construct a1 or a2 in the caller's preferred location. Using A's move
constructor (instead of copy constructor) to send a1 or a2 back to the
caller is the best solution.
We could require that the author of operator+ explicitly request the
move semantics. But what would be the point? The current language
already allows for the elision of this copy, so the coder already can
not rely on destruction order of the local, nor can he rely on the
copy constructor being called. The auto-local is about to be
conceptually destructed anyway, so it is very "rvalue-like". The move
is not detectable except by measuring performance, or counting copies
(which may be elided anyway).
Note that this language addition permits movable, but non-copyable
objects (such as move_ptr) to be returned by value, since a move
constructor is found and used (or elided) instead of the inaccessible
copy constructor.
Their example of solving this (in favor of move semantics) is:
// Or just call std::move
// return x>2 ? static_cast<Foo&&>(foo2) : static_cast<Foo&&>(foo3);
return static_cast<Foo&&>(x>2 ? foo2 : foo3);
The logic resulting from this implicit cast results in an automatic
hierarchy of "move semantics" from best to worst:
If you can elide the move/copy, do so (by present language rules)
Else if there is a move constructor, use it
Else if there is a copy constructor, use it
Else the program is ill formed
Or as Xeo mentions, you can structure it this way:
Foo Baz(int x) {
Foo foo2(2);
Foo foo3(3);
if (x > 2)
return foo2;
else
return foo3;
}
You already provided an example in the OP, but the standard provides one for eliding the move/copy constructor (it equally applies):
class Thing {
public:
Thing() { }
~Thing() { }
Thing(Thing&& thing) {
std::cout << "hi there";
}
};
Thing f() {
Thing t;
return t;
}
Thing t2 = f();
// does not print "hi there"
But if you supply both the move and copy constructor, the move constructor seems to be preferred.

Why isn't object returned by std::move destroyed immediately

I tested the following code:
#include <iostream>
using namespace std;
class foo{
public:
foo() {cout<<"foo()"<<endl;}
~foo() {cout<<"~foo()"<<endl;}
};
int main()
{
foo f;
move(f);
cout<<"statement \"move(f);\" done."<<endl;
return 0;
}
The output was:
foo()
statement "move(f);" done.
~foo()
However, I expected:
foo()
~foo()
statement "move(f);" done.
According to the source code of the function move:
template<typename _Tp>
constexpr typename std::remove_reference<_Tp>::type&&
move(_Tp&& __t) noexcept
{ return static_cast<typename std::remove_reference<_Tp>::type&&>(__t); }
The returned object is a right value, So Why isn't it destroyed immediately?
-----------------------------------------------------------------
I think I just confused rvalue and rvalue reference.
I modified my code:
#include <iostream>
template<typename _Tp>
constexpr typename /**/std::remove_reference<_Tp>::type /* no && */
/**/ mymove /**/ (_Tp&& __t) noexcept
{ return static_cast<typename std::remove_reference<_Tp>::type&&>(__t); }
using namespace std;
class foo{
public:
foo() {cout<<"foo() at "<<this<<endl;} /* use address to trace different objects */
~foo() {cout<<"~foo() at "<<this<<endl;} /* use address to trace different objects */
};
int main()
{
foo f;
mymove(f);
cout<<"statement \"mymove(f);\" done."<<endl;
return 0;
}
And now I get what I've been expecting:
foo() at 0x22fefe
~foo() at 0x22feff
statement "mymove(f);" done.
~foo() at 0x22fefe
Moving from an object doesn't change its lifetime, only its current value. Your object foo is destroyed on return from main, which is after your output.
Futhermore, std::move doesn't move from the object. It just returns an rvalue reference whose referand is the object, making it possible to move from the object.
Objects get destroyed when they go out of scope. Moving from an object doesn't change that; depending on what the move constructor or move assignment operator does, the state of the object can be different after the move, but is hasn't yet been destroyed, so the practical rule is that moving from an object must leave it in a state that can be destroyed.
Beyond that, as #R.MartinhoFernandes points out, std::move doesn't do anything. It's the object's move constructor or move assignment operator that does whatever needs to be done, and that isn't applied in a call to std::move; it's applied when the moved-from object is used to construct a new object (move constructor) or is assigned to an existing object (move assignment operator). Like this:
foo f;
foo f1(f); // applies foo's copy constructor
foo f2(std::move(f)); // applies foo's move constructor
foo f3, f4;
f3 = f; // applies foo's copy assignment operator
f4 = std::move(f1); // applies foo's move assignment operator
A std::move doesn't change objects lifetime. Roughly speaking it's nothing more than a static_cast that casts a non const lvalue to a non const rvalue reference.
The usefulness of this is overload resolution. Indeed, some functions take parameters by const lvalue reference (e.g. copy constructors) and other take by non const rvalue reference (e.g. move constructors). If the passed object is a temporary, then the compiler calls the second overload. The idea is that just after the function is called the a temporary can no longer be used (and will be destroyed). Therefore the second overload could take ownership of the temporary's resources instead of coping them.
However, the compiler will not do it for a non-temporary object (or, to be more correct for an lvalue). The reason is that the passed object has a name that remains in scope and therefore is alive could still be used (as your code demonstrate). So its internal resources might still be required and it would be a problem if they have had been moved to the other object. Nevertheless, you can instruct the compiler that it can call the second overload by using std::move. It casts the argument to a rvalue reference and, by overload resolution, the second overload is called.
The slight changed code below illustrate this point.
#include <iostream>
using namespace std;
class foo{
public:
foo() { cout << "foo()" << endl; }
~foo() { cout << "~foo()" << endl; }
};
void g(const foo&) { cout << "lref" << endl; }
void g(foo&&) { cout << "rref" << endl; }
int main()
{
foo f;
g(f);
g(move(f));
// f is still in scope and can be referenced.
// For instance, we can call g(f) again.
// Imagine what would happen if f had been destroyed as the question's author
// originally though?
g(static_cast<foo&&>(f)); // This is equivalent to the previous line
cout<<"statement \"move(f);\" done."<<endl;
return 0;
}
The output is
foo()
lref
rref
rref
statement "move(f);" done.
~foo()
Update: (After the question has been changed to use mymove)
Notice that the new code doesn't give exactly what you said at the very beginning. Indeed it reports two calls to ~foo() rather than one.
From the displayed addresses we can see that the original object of type foo, namely, f is destroyed at the very end. Exactly as it used to be with the original code. As many have pointed out, f is destroyed only at the end of its scope (the body of function main). This is still the case.
The extra call to ~foo() reported just before the statement "mymove(f);" done. destroys another object which is a copy of f. If you add a reporting copy constructor to foo:
foo(const foo& orig) { cout << "copy foo from " << &orig << " to " << this << endl;}
Then you get an output similar to:
foo() at 0xa74203de
copy foo from 0xa74203de to 0xa74203df
~foo() at 0xa74203df
statement "move(f);" done.
~foo() at 0xa74203de
We can deduce that calling mymove yields a call to the copy constructor to copy f to another foo object. Then, this newly created object is destroyed before execution reaches the line that displays statement "move(f);" done.
The natural question now is where this copy come from? Well, notice the return type of mymove:
constexpr typename /**/std::remove_reference<_Tp>::type /* no && */`
In this example, after a simplification for clarity, this boils down to foo. That is, mymove returns a foo by value. Therefore, a copy is made to create a temporary object. As I said before, a temporary is destroyed just after the expression that creates it finishes to be evaluated (well, there are exceptions to this rule but they don't apply to this code). That explains the extra call to ~foo().
Because in the general case, the move could happen in another translation unit. In your example, the object wasn't even moved, only marked as movable. This means that the caller of std::move will not know if the object was moved or not, all he knows is, that there is an object and that it has to call the destructor at the end of the scope/lifetime of that object. std::move only marks the object as movable, it does not perform the move operation or create a moved copy that can be further moved or anything like that.
Consider:
// translation unit 1
void f( std::vector< int >&& v )
{
if( v.size() > 8 ) {
// move it
}
else {
// copy it as it's just a few entries
}
}
// translation unit 2
void f( std::vector< int >&& );
std::vector< int > g();
int main()
{
// v is created here
std::vector< int > v = g();
// it is maybe moved here
f( std::move( v ) );
// here, v still exists as an object
// when main() ends, it will be destroyed
}
In the example above, how would translation unit 2 decide whether or not to call the destructor after the std::move?
You're getting confused by the name -- std::move doesn't actually move anything. It just converts (casts) an lvalue reference into an rvalue reference, and is used to make someone else move something.
Where std::move is useful is when you have an overloaded function that takes either an lvalue or an rvalue reference. If you call such a function with a simple variable, overload resolution means you'll call the version with the lvalue reference. You can add an explicit call to std::move to instead call the rvalue reference version. No moves involved, except within the function that takes the rvalue reference.
Now the reason it is called move is the common usage where you have two constructors, one of which takes an lvalue reference (commonly called the copy constructor) and one that takes an rvalue reference (commonly called the move constructor). In this case, adding an explicit call to std::move means you call the move constructor instead of the copy constructor.
In the more general case, it's common practice to have overloaded functions that take lvalue/rvalue references where the lvalue version makes a copy of the object and the rvalue version moves the object (implicitly modifying the source object to take over any memory it uses).
Suppose code like this:
void bar(Foo& a) {
if (a.noLongerneeded())
lastUnneeded = std::move(a);
}
In this case, the caller of bar can not know that the function might in some cases end up calling the destructor of the passed object. It will feel responsible for that object, and make sure to call its destructor at any later point.
So the rule is that move may turn a valid object into a different but still valid object. Still valid means that calling methods or the destructor on that object should still yield well-defined results. Strictly speaking, move by itself does nothing except tell the receiver of such a reference that it may change the object if doing so makes sense. So it's the recipient, e.g. move constructor or move assignment operator, which does the actual moving. These operations will usually either not change the object at all, or set some pointers to nullptr, or some lengths to zero or something like that. They will however never call the destructor, since that task is left to the owner of the object.

Return value optimization (RVO) when using temporaries in C++

Lets say I have two functions A and B.
where function A returns an object X and function B gets object X as an argument. For example.
X A() {
X x;
return x;
}
void B(X x) {
write(x.data, x.size);
}
int main() { B(A()); }
Is this object X constructed only once as a temporary of B using RVO or do I need to use move semantics.
The compiler is only allowed to perform RVO, but not obliged to by the standard. Hence, it may or may not optimise your code, entirely depending on the compiler used and its parameters.
Another question might help with related details: When should RVO kick-in?
If copy elision is used, the X will only be constructed once.
Even if copy elision is disabled (e.g. passing the -fno-elide-constructors to gcc), the move constructors will be used automatically.
You can visualize it yourself:
#include <cstdio>
struct X
{
X() { printf("default constructed\n"); }
~X() { printf("destructed\n"); }
X(const X&) { printf("copy constructed\n"); }
X(X&&) { printf("move constructed\n"); }
};
X A() {
X x;
return x;
}
void B(X x) {
}
int main() {
B(A());
}
With RVO it prints
default constructed
destructed
With no RVO it prints
default constructed
move constructed
destructed
move constructed
destructed
destructed
It will most probably be the same object (i.e. no extra copy will be created). This isn't RVO though, it's called copy elision.
The quote is not from C++11, but C++03:
12.2/2 Temporary objects
Here, an implementation might use a temporary in which to construct X(2) before passing it to f() using X's copy-constructor; alternatively, X(2) might be constructed in the space used to hold the argument. /.../
I'd recommend that you just test it using whatever compiler you're interested in -- i.e., define X's copy constructor to have some side effect (eg printing a message), and see what the code you gave does for your compiler with your preferred optimization settings.