When you pass an object to a function by value or return an object from a function by value, the copy constructor must be called. However, with some compilers this does not happen. Is there an explanation?
I assume they are referring to return-value optimization implemented in many compilers where the code:
CThing DoSomething();
gets turned into
void DoSomething(CThing& thing);
with thing being declared on the stack and passed in to DoSomething:
CThing thing;
DoSomething(thing);
which prevents CThing from needing to be copied.
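One way to see this from inside a program is to compare addresses. The following is a minimal sketch, not the compiler's actual transformation: CThing here is a hypothetical instrumented class that records where its constructor ran, and demo_same_object is a helper name made up for the example. It assumes the copy is elided, which C++17 requires when a temporary is returned like this and which common compilers also do by default in older language modes.

```cpp
// Hypothetical instrumented type: remembers where it was constructed.
struct CThing {
    static const CThing* constructed_at;  // address seen by the constructor
    static int copies;                    // number of copy-constructor calls

    CThing() { constructed_at = this; }
    CThing(const CThing&) { ++copies; }   // user-provided, so never trivial
};

const CThing* CThing::constructed_at = nullptr;
int CThing::copies = 0;

// Returns a temporary; with elision it is built directly in the caller's object.
CThing DoSomething() {
    return CThing();
}

// True if the object built inside DoSomething() is the caller's own object
// and no copy constructor ran along the way.
bool demo_same_object() {
    CThing::copies = 0;
    CThing thing = DoSomething();
    return CThing::constructed_at == &thing && CThing::copies == 0;
}
```

If demo_same_object() returns true, the constructor inside DoSomething ran directly on the caller's thing, which is the effect the hidden reference parameter describes.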
It often doesn't happen because it doesn't need to happen. This is called copy elision. In many cases, the function doesn't need to make copies, so the compiler optimizes them away. For example, with the following function:
big_type foo(big_type bar)
{
return bar + 1;
}
big_type a = foo(b);
This will get converted to something like:
void foo(const big_type& bar, big_type& out)
{
out = bar + 1;
}
big_type a;
foo(b, a);
The removal of the return value is called the "Return Value Optimization" (RVO), and is implemented by most compilers, even when optimizations are turned off!
The compiler may call the copy constructor for pass-by-value or return-by-value, but it doesn't have to. The standard allows for optimizing it away (in standardese it's called copy elision) and in practice many compilers will do so, even if you don't have optimizations turned on. The explanation is pretty detailed, so I'll point you at C++ FAQ LITE.
Short version:
struct Foo
{
int a, b;
Foo(int A, int B) : a(A), b(B) {}
};
Foo make_me_a_foo(int x)
{
// ...blah, blah blah...
return Foo(x, x+1); // (1)
}
Foo bar = make_me_a_foo(42); // (2)
The trick here is the compiler is allowed to construct bar from line (2) directly in line (1) without incurring any overhead of constructing temporary Foo objects.
Consider the following program:
#include <functional>
#include <iostream>
class RvoObj {
public:
RvoObj(int x) : x_{x} {}
RvoObj(const RvoObj& obj) : x_{obj.x_} { std::cout << "copied\n"; }
RvoObj(RvoObj&& obj) : x_{obj.x_} { std::cout << "moved\n"; }
int x() const { return x_; }
void set_x(int x) { x_ = x; }
private:
int x_;
};
class Finally {
public:
Finally(std::function<void()> f) : f_{f} {}
~Finally() { f_(); }
private:
std::function<void()> f_;
};
RvoObj BuildRvoObj() {
RvoObj obj{3};
Finally run{[&obj]() { obj.set_x(5); }};
return obj;
}
int main() {
auto obj = BuildRvoObj();
std::cout << obj.x() << '\n';
return 0;
}
Both clang and gcc (demo) output 5 without invoking the copy or move constructors.
Is this behavior well-defined and guaranteed by the C++17 standard?
Copy elision only permits an implementation to remove the presence of the object being generated by a function. That is, it can remove the copy from obj to the return value object of BuildRvoObj and the destructor of obj. However, the implementation can't change anything else.
The copy to the return value would happen before destructors for local objects in the function are called. And the destructor of obj would happen after the destructor of run, because destructors for automatic variables are executed in reverse-order of their construction.
This means that it is safe for run to access obj in its destructor. Whether the object denoted by obj is destroyed after run completes or not does not change this fact.
However, there is one problem. See, return <variable_name>; for a local variable is required to invoke a move operation. In your case, moving from RvoObj is the same as copying from it. So for your specific code, it'll be fine.
But if RvoObj were, for example, unique_ptr<T>, you'd be in trouble. Why? Because the move operation to the return value happens before destructors for local variables are called. So in this case obj will be in the moved-from state, which for unique_ptr means that it's empty.
That's bad.
If the move is elided, then there's no problem. But since elision is not required, there is potentially a problem, since your code will behave differently based on whether elision happens or not, and that choice is left to the implementation.
So generally speaking, it's best not to have destructors rely on the existence of local variables that you're returning.
The above purely relates to your question about undefined behavior. It isn't UB to do something that changes behavior based on whether elision happens or not. The standard defines that one or the other will happen.
However, you cannot and should not rely upon it.
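To make the unique_ptr case concrete, here is a small sketch of the non-elided path. It is an illustration under assumptions, not code from the question: BuildWithForcedMove and guard_saw_empty are made-up names, and std::move in the return statement is used only to force the move so the result does not depend on what the compiler chooses to elide.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Same idea as the Finally class in the question: run a callback on scope exit.
class Finally {
public:
    explicit Finally(std::function<void()> f) : f_(std::move(f)) {}
    ~Finally() { f_(); }
private:
    std::function<void()> f_;
};

// Set by the guard: was the local already empty when the guard's destructor ran?
bool guard_saw_empty = false;

std::unique_ptr<int> BuildWithForcedMove() {
    std::unique_ptr<int> obj(new int(3));
    Finally run([&obj] { guard_saw_empty = (obj == nullptr); });
    // std::move(obj) is not the plain name of a local, so the move cannot be
    // elided. This stands in for a compiler that decides not to elide.
    return std::move(obj);
}
```

After calling BuildWithForcedMove(), guard_saw_empty is true: the return value was move-constructed first, so the guard observed an empty obj. A guard that dereferenced obj at that point would be dereferencing a null pointer.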
Short answer: due to NRVO, the output of the program may be either 3 or 5. Both are valid.
For background, see first:
in C++ which happens first, the copy of a return object or local object's destructors?
What are copy elision and return value optimization?
Guideline:
Avoid destructors that modify return values.
For example, when we see the following pattern:
T f() {
T ret;
A a(ret); // or similar
return ret;
}
We need to ask ourselves: does A::~A() modify our return value somehow? If yes, then our program most likely has a bug.
For example:
A type that prints the return value on destruction is fine.
A type that computes the return value on destruction is not fine.
[From https://stackoverflow.com/a/54566080/9305398 ]
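One way to follow that guideline while keeping a scope guard is to end the guard's scope before the return statement. This is only a sketch: Obj and BuildWithScopedGuard are hypothetical names, and Finally is re-declared along the lines of the class in the question so the snippet stands on its own.

```cpp
#include <functional>
#include <utility>

// Runs a callback when it goes out of scope.
class Finally {
public:
    explicit Finally(std::function<void()> f) : f_(std::move(f)) {}
    ~Finally() { f_(); }
private:
    std::function<void()> f_;
};

// Hypothetical return type for the example.
struct Obj {
    int x;
};

Obj BuildWithScopedGuard() {
    Obj obj{3};
    {
        Finally run([&obj] { obj.x = 5; });
        // ... work that needs the guard ...
    }  // run is destroyed here, before the return statement executes
    return obj;  // x is already 5, whether or not the copy is elided
}
```

Because the guard has finished before the return value is created, the result no longer depends on NRVO.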
I have a function that "builds" a structure to return:
struct stuff {
int a;
double b;
Foo c;
};
stuff generate_stuff() {
Foo c = generate_foo();
//do stuff to Foo, that changes Foo:
//...
return {1, 2.0, c}; //should this be return {1, 2.0, move(c)};?
}
Should I be moving c out of the function? I realize that frequently, (N)RVO can build the object in place, however there might be times when this ISN'T the case. When can't (N)RVO be done, and thus, when should I move an object out of a function?
Put another way, this is obviously going to be RVO of the temporary that is returned. The question becomes, will NRVO (named return value optimization) happen with c? Will c be constructed in place (at the call site of the function, inside the temporary stuff structure), or will c be constructed in the function, and then copied into the structure at the call site.
What your call to move(c) does is not moving c out of the function, but moving it into the temporary struct that is returned from the function. The temporary return value should always benefit from RVO. However, I believe that the move/copy of c into the temporary cannot be optimized away, as c itself is not a temporary. So here the move version should always be at least as efficient as the copy version (tested for a simple scenario with g++, clang++ and MSVC++).
If you have to absolutely minimize the number of copy/move operations, then you could write
struct stuff {
int a;
double b;
Foo c;
};
stuff generate_stuff() {
stuff s{ 1, 2.0, generate_foo() };
//use s.c instead of c
//...
return s;
}
which would result in only a single construction of Foo and no copies/moves thanks to NRVO.
EDIT:
As @dyp pointed out in the comments to your question, the in-place construction of stuff isn't actually a case of RVO, but required by the standard. Anyway, the important part is that the move/copy of c cannot be elided, so using move should never result in a performance penalty.
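The difference between the two return statements can be counted. Below is a rough sketch in which Foo is a hypothetical instrumented stand-in for the Foo in the question (the real one is not shown), and generate_stuff_copy and generate_stuff_move are names invented for the two variants.

```cpp
#include <utility>

// Hypothetical instrumented stand-in for the question's Foo.
struct Foo {
    static int copies;
    static int moves;
    Foo() {}
    Foo(const Foo&) { ++copies; }
    Foo(Foo&&) noexcept { ++moves; }
};
int Foo::copies = 0;
int Foo::moves = 0;

struct stuff {
    int a;
    double b;
    Foo c;
};

Foo generate_foo() { return Foo(); }

// c is a named local, so placing it into the aggregate copies it.
stuff generate_stuff_copy() {
    Foo c = generate_foo();
    return {1, 2.0, c};
}

// std::move turns c into an rvalue, so the member is move-constructed instead.
stuff generate_stuff_move() {
    Foo c = generate_foo();
    return {1, 2.0, std::move(c)};
}
```

With the counters reset before each call, the first variant records one copy and no moves, and the second records one move and no copies. Neither the copy nor the move of c is elided in either case.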
Consider the following code:
#include<memory>
struct A {
std::auto_ptr<int> i;
};
A F() {
A a;
return a;
}
int main(int argc, char **argv) {
A a = F();
return 0;
}
When compiling, I receive a compilation error:
error: no matching function for call to ‘A::A(A)’
A a = F();
^
To my understanding, A::A(A) isn't even allowed to exist, so why is the compiler requesting it? Secondly, why is it not using RVO?
If it is because a std::auto_ptr cannot be returned from a function, why does the following compile and run?
#include<memory>
std::auto_ptr<int> F() {
std::auto_ptr<int> ap;
return ap;
}
int main(int argc, char **argv) {
std::auto_ptr<int> ap = F();
return 0;
}
I cannot use C++11 in my current work unfortunately, hence the use of auto_ptr.
I tried searching but couldn't find a relevant Q&A, even though I know this is a duplicate. So I'm answering instead of voting to close as a duplicate. Apologies.
The reason it needs a copy constructor is because the line:
A a = F();
is really (from the compiler's perspective):
A a(F());
even if copy elision/RVO is used. That is, the compiler does not do:
// This is NOT what the compiler does for A a = F();
A a;
a = F();
Even with copy elision/RVO, A a(F()); won't work. From a C++ standards perspective, the code needs to be legal, whether or not the compiler does copy elision. Copy elision doesn't relax the requirement of needing a copy constructor (even if it doesn't actually use it; it still needs to be there in order to ensure "legality" of the code).
This doesn't work, because std::auto_ptr's copy constructor doesn't take a const reference, so A's copy constructor doesn't exist. F() returns a temporary A, which can only be captured by a const reference, which means that line of code is trying to use a non-existent copy constructor.
The compiler normally generates the default copy constructor
A(const A&)
However in this case it is not possible as there is no auto_ptr::auto_ptr(const auto_ptr &rhs) so it creates the following instead:
A(A&)
Now, when F returns, it will not let the return value a be modified (I think because it might be a persisting object such as a global or a reference). When it finds no A(const A&), it will instead look for A(A), as that's the only other way to return the value without modifying a (even though it's stupid). Even using RVO, it must still be valid code, as @Cornstalks mentions in his answer (to an extent, see below).
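The A(const A&) versus A(A&) distinction can be checked at compile time without auto_ptr itself. This is a sketch under assumptions: Handle is a made-up stand-in that copies through a non-const reference the way auto_ptr does, and it needs a C++11 compiler for type_traits, so it illustrates the rule rather than reproducing the C++03 build from the question.

```cpp
#include <type_traits>

// Hypothetical stand-in for std::auto_ptr: "copying" takes a non-const
// reference because it transfers ownership away from the source.
struct Handle {
    int* p;
    Handle() : p(nullptr) {}
    Handle(Handle& other) : p(other.p) { other.p = nullptr; }
};

// No user-declared constructors: the copy constructor is declared implicitly,
// and because of the Handle member its signature is A(A&), not A(const A&).
struct A {
    Handle i;
};

static_assert(std::is_constructible<A, A&>::value,
              "a non-const lvalue A can be copied");
static_assert(!std::is_constructible<A, const A&>::value,
              "a const A cannot be copied");
static_assert(!std::is_constructible<A, A>::value,
              "an rvalue A can be neither copied nor moved");
```

The last assertion is the situation in A a = F(); from the question: the value returned by F() is an rvalue, and A(A&) cannot bind to it.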
auto_ptr gets around this by creating a temporary auto_ptr_ref object using the following (ref)
auto_ptr::operator auto_ptr_ref()
auto_ptr(auto_ptr_ref)
For some reason, the compiler accepts the typecast to auto_ptr_ref despite it not being a const function and rejecting the use of A(A&).
To get around this issue one can simply declare A(const A&) without implementing it
struct A {
A() { }
A(const A&);
std::auto_ptr<int> i;
};
The compiler thinks that the return is legal but applies RVO before the linker sees it and so the missing implementation is never needed. Of course it precludes the use of
A a;
A b(a);
An inelegant fix, but effective nonetheless.
I ran across this article on copy elision in C++ and I've seen comments about it in the boost library. This is appealing, as I prefer my functions to look like
verylargereturntype DoSomething(...)
rather than
void DoSomething(..., verylargereturntype& retval)
So, I have two questions about this
Google has virtually no documentation on this at all. How real is this?
How can I check that this optimization is actually occurring? I assume it involves looking at the assembly, but let's just say that isn't my strong suit. If anyone can give a very basic example of what successful elision looks like, that would be very useful.
I won't be using copy elision just to prettify things, but if I can be guaranteed that it works, it sounds pretty useful.
I think this is a very commonly applied optimization because:
it's not difficult for the compiler to do
it can be a huge gain
it's an area of C++ that was commonly critiqued before the optimization became common
If you're just curious, put a debug printf() in your copy constructor:
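#include <cstdio>
class foo {
public:
    foo() : x(0) {}
    foo(int x_) : x(x_) {}
    // The copy constructor announces itself so that copies are visible.
    foo(foo const& other) : x(other.x) {
        std::printf("copied a foo\n");
    }
    static foo foobar() {
        foo tmp(2);
        return tmp;  // the copy of tmp into the return value may be elided
    }
private:
    int x;
};
int main()
{
    foo myFoo;
    myFoo = foo::foobar();
    return 0;
}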
Prints out "copied a foo" when I run an unoptimized build, but nothing when I build optimized.
From your cited article:
Although copy elision is never required by the standard, recent versions of every compiler I’ve tested do perform these optimizations today. But even if you don’t feel comfortable returning heavyweight objects by value, copy elision should still change the way you write code.
It is better known as Return Value Optimization.
The only way to know for sure is to look at the assembly, but you're asking the wrong question. You don't need to know if the compiler is eliding the copy unless it matters to the program timing. A profiler should easily tell you if you're spending too much time in the copy constructor.
The poor man's way to figure it out is to put a static counter in the copy constructor and try both forms of your function. If the counts are the same, you've successfully avoided the copy.
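A sketch of that counter approach follows. Big, make_by_value and make_into are hypothetical names standing in for verylargereturntype and the two forms of DoSomething; the count for the by-value form depends on the compiler and its settings, so no particular number is promised for it here.

```cpp
#include <algorithm>

// Hypothetical heavyweight type with a counter in its copy constructor.
struct Big {
    static int copies;
    int payload[1024];
    Big() : payload() {}
    Big(const Big& other) {
        std::copy(other.payload, other.payload + 1024, payload);
        ++copies;
    }
};
int Big::copies = 0;

// Form 1: return by value.
Big make_by_value() {
    Big result;
    result.payload[0] = 42;
    return result;
}

// Form 2: fill an out-parameter.
void make_into(Big& out) {
    out.payload[0] = 42;
}

// Number of copy-constructor calls needed to obtain a Big by value.
int copies_when_returning_by_value() {
    Big::copies = 0;
    Big b = make_by_value();
    (void)b;
    return Big::copies;
}

// Number of copy-constructor calls needed with the out-parameter form.
int copies_when_using_out_parameter() {
    Big::copies = 0;
    Big b;
    make_into(b);
    return Big::copies;
}
```

The out-parameter form never copies. If copies_when_returning_by_value() also reports zero on your compiler and flags, the named return value was elided.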
Google "Named Return Value Optimization" and "Return Value Optimization" instead. Modern compilers will in fact not perform the copy in many cases.
You can check if it's occurring by returning a type with side effects -- such as printing a message. Wikipedia has some good examples of where program output changes when RVO and/or NRVO is in effect.
Example of what it looks like:
#include <iostream>
struct Foo {
int a;
Foo(int a) : a(a) {}
Foo(const Foo &rhs) : a(rhs.a) { std::cout << "copying\n"; }
};
int main() {
Foo f = Foo(1);
}
If you see no output, then copy elision has taken place. That's elision of a copy from an initializer. The other legal case of copy elision is a return value, and is tested by:
Foo getFoo() {
return Foo(1);
}
int main() {
Foo f = getFoo();
}
or more excitingly for a named return value:
Foo getFoo() {
Foo f(1);
return f;
}
int main() {
Foo f = getFoo();
}
g++ performs all those elisions for me with no optimisation flags, but you can't really know whether more complex code will outwit the compiler.
Note that copy elision doesn't help with assignment, so the following will always result in a call to operator= if that operator prints anything:
Foo f(1);
f = getFoo();
Returning by value therefore can still result in "a copy", even if copy constructor elision is performed. So for chunking great classes it's still a performance consideration at the design stage. You don't want to write your code such that fixing it later will be a big deal if it turns out your app spends a significant proportion of its time in copying that could have been avoided.
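The assignment case can be counted the same way. This is a small sketch with a re-declared Foo that tracks calls to operator= instead of printing; the counter names and assignments_for_existing_object are invented for the example.

```cpp
// Instrumented variant of Foo: counts copy constructions and assignments.
struct Foo {
    static int copy_constructions;
    static int assignments;
    int a;
    explicit Foo(int a_) : a(a_) {}
    Foo(const Foo& rhs) : a(rhs.a) { ++copy_constructions; }
    Foo& operator=(const Foo& rhs) {
        a = rhs.a;
        ++assignments;
        return *this;
    }
};
int Foo::copy_constructions = 0;
int Foo::assignments = 0;

Foo getFoo() {
    return Foo(1);
}

// f already exists when getFoo() returns, so operator= has to run.
int assignments_for_existing_object() {
    Foo::assignments = 0;
    Foo f(0);
    f = getFoo();
    return Foo::assignments;
}
```

assignments_for_existing_object() returns 1 regardless of how the return value itself was constructed, since elision only applies to initialization of a new object.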
To answer question 2, you could write a demo program where you write a class DemoReturnType; which has instrumented constructors and destructors which just write to cout when they are called. This should give you enough information about what your compiler is capable of.
Rvalue references solve this problem in C++0x. Whether or not you can obtain an rvalue-enabled compiler is another question - last time I checked only Visual Studio 2010 supports it.
(I'm using gcc with -O2.)
This seems like a straightforward opportunity to elide the copy constructor, since there are no side-effects to accessing the value of a field in a bar's copy of a foo; but the copy constructor is called, since I get the output meep meep!.
#include <iostream>
struct foo {
foo(): a(5) { }
foo(const foo& f): a(f.a) { std::cout << "meep meep!\n"; }
int a;
};
struct bar {
foo F() const { return f; }
foo f;
};
int main()
{
bar b;
int a = b.F().a;
return 0;
}
It is neither of the two legal cases of copy ctor elision described in 12.8/15:
Return value optimisation (where an automatic variable is returned from a function, and the copying of that automatic to the return value is elided by constructing the automatic directly in the return value) - nope. f is not an automatic variable.
Temporary initializer (where a temporary is copied to an object, and instead of constructing the temporary and copying it, the temporary value is constructed directly into the destination) - nope, f is not a temporary either. b.F() is a temporary, but it isn't copied anywhere, it just has a data member accessed, so by the time you get out of F() there's nothing to elide.
Since neither of the legal cases of copy ctor elision applies, and the copying of f to the return value of F() affects the observable behaviour of the program, the standard forbids it to be elided. If you replaced the printing with some non-observable activity, and examined the assembly, you might see that this copy constructor has been optimised away. But that would be under the "as-if" rule, not under the copy constructor elision rule.
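If the aim is simply to read the field without a copy, returning a reference avoids the question of elision altogether. A sketch, with by_value and by_reference as hypothetical names for the two accessor styles and a counter in place of the printing:

```cpp
struct foo {
    static int copies;
    foo() : a(5) {}
    foo(const foo& f) : a(f.a) { ++copies; }
    int a;
};
int foo::copies = 0;

struct bar {
    // Copies the member into the return value; this copy cannot be elided.
    foo by_value() const { return f; }
    // Hands out a reference to the member; nothing is copied.
    const foo& by_reference() const { return f; }
    foo f;
};

// Number of copy-constructor calls when reading the field through a copy.
int copies_via_value() {
    bar b;
    foo::copies = 0;
    int a = b.by_value().a;
    (void)a;
    return foo::copies;
}

// Number of copy-constructor calls when reading the field through a reference.
int copies_via_reference() {
    bar b;
    foo::copies = 0;
    int a = b.by_reference().a;
    (void)a;
    return foo::copies;
}
```

copies_via_value() reports one copy and copies_via_reference() reports none. The usual caveat applies: the reference is only valid while the bar object is alive.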
Copy elision happens only when a copy isn't really necessary. In particular, it's when there's one object (call it A) that exists for the duration of the execution of a function, and a second object (call it B) that will be copy constructed from the first object, and immediately after that, A will be destroyed (i.e. upon exit from the function).
In this very specific case, the standard gives permission for the compiler to coalesce A and B into two separate ways of referring to the same object. Instead of requiring that A be created, then B be copy constructed from A, and then A be destroyed, it allows A and B to be considered two ways of referring to the same object, so the (one) object is created as A, and after the function returns starts to be referred to as B, but even if the copy constructor has side effects, the copy that creates B from A can still be skipped over. Also, note that in this case A (as an object separate from B) is never destroyed either -- e.g., if your dtor also had side effects, they could (would) be omitted as well.
Your code doesn't fit that pattern -- the first object does not cease to exist immediately after being used to initialize the second object. After F() returns, there are two instances of the object. That being the case, the [Named] Return Value Optimization (aka. copy elision) simply does not apply.
Demo code when copy elision would apply:
#include <iostream>
struct foo {
foo(): a(5) { }
foo(const foo& f): a(f.a) { std::cout << "meep meep!\n"; }
int a;
};
foo F() {
// RVO
std::cout << "F\n";
return foo();
}
foo G() {
// NRVO
std::cout << "G\n";
foo x;
return x;
}
int main() {
foo a = F();
foo b = G();
return 0;
}
Both MS VC++ and g++ optimize away both copy ctors from this code with optimization turned on. g++ optimizes both away even if optimization is turned off. With optimization turned off, VC++ optimizes away the anonymous return, but uses the copy ctor for the named return.
The copy constructor is called because a) there is no guarantee you are copying the field value without modification, and b) because your copy constructor has a side effect (prints a message).
A better way to think about copy elision is in terms of the temporary object. That is how the standard describes it. A temporary is allowed to be "folded" into a permanent object if it is copied into the permanent object immediately before its destruction.
Here you construct a temporary object in the function return. It doesn't really participate in anything, so you want it to be skipped. But what if you had done
b.F().a = 5;
If the copy were elided, and you operated on the original object, you would have modified b through a non-reference.