copy elision method - c++

From the standard definition of copy elision method:
In C++ computer programming, copy elision refers to a compiler optimization technique that eliminates unnecessary copying of objects.
Let us consider following code:
#include <cstdlib>
#include <iostream>
using namespace std;
int n=0;
struct C
{
C (int) {}
C(const C&) {++n;}
};
int main(int argc, char *argv[])
{
C c1(42);
C c2=42;
return n;
}
This line "return n" will returns either 0 or 1, depending on whether the copy was elided.
Also consider this code:
#include <iostream>
struct C {
C() {}
C(const C&) { std::cout << "Hello World!\n"; }
};
void f() {
C c;
throw c; // copying the named object c into the exception object.
} // It is unclear whether this copy may be elided.
int main() {
try {
f();
}
catch(C c) {
}
}
It says that
// copying the exception object into the temporary in the exception declaration.
//It is also unclear whether this copy may be elided.
So my question is how useful is implement such optimization method, if sometimes results are undefined? And in general how often it is used?

The important bit is that the standard explicitly allows for this, and that means that you cannot assume that the side effects of a copy constructor will be executed as the copies might be elided. The standard requires that the implementation of a copy-constructor has copy-constructor semantics, that is, has as whole purpose the generation of a second object semantically equivalent in your domain to the original object. If your program complies with that, then the optimization will not affect the program.
It is true, on the other hand, that this is the only situation I can think where the standard allows for different visible outcomes from the same program depending on what the compiler does, but you have been advised that you should not have side effects in your copy constructor (or rather, you cannot depend on the exact number of copies performed).
As to whether it is worth it, yes it is. In many cases copies are quite expensive (I am intentionally ignoring move-constructors in C++11 from the discussion). Consider a function that returns a vector<int>, if the copy is not elided, then another dynamic allocation is required, copy of all of the vector contents and then release of the original block of memory all three operations can be expensive.
Alternatively, you could force users to change their code to create an empty object and pass it by reference, but that will make code harder to read.

Related

Does the C++ standard guarantee that a function return value has a constant address?

Consider this program:
#include <stdio.h>
struct S {
S() { print(); }
void print() { printf("%p\n", (void *) this); }
};
S f() { return {}; }
int main() { f().print(); }
As far as I can tell, there is exactly one S object constructed here. There is no copy elision taking place: there is no copy to be elided in the first place, and indeed, if I explicitly delete the copy and/or move constructor, compilers continue to accept the program.
However, I see two different pointer values printed. This happens because my platform's ABI returns trivially copyable types such as this one in CPU registers, so there is no way with that ABI of avoiding a copy. clang preserves this behaviour even when optimising away the function call altogether. If I give S a non-trivial copy constructor, even if it's inaccessible, then I do see the same value printed twice.
The initial call to print() happens during construction, which is before the start of the object's lifetime, but using this inside a constructor is normally valid so long as it isn't used in a way that requires the construction to have finished -- no casting to a derived class, for instance -- and as far as I know, printing or storing its value doesn't require the construction to have finished.
Does the standard allow this program to print two different pointer values?
Note: I'm aware that the standard allows this program to print two different representations of the same pointer value, and technically, I haven't ruled that out. I could create a different program that avoids comparing pointer representations, but it would be more difficult to understand, so I would like to avoid that if possible.
T.C. pointed out in the comments that this is a defect in the standard. It's core language issue 1590. It's a subtly different issue than my example, but the same root cause:
Some ABIs require that an object of certain class types be passed in a register [...]. The Standard should be changed to permit this usage.
The current suggested wording would cover this by adding a new rule to the standard:
When an object of class type X is passed to or returned from a function, if each copy constructor, move constructor, and destructor of X is either trivial or deleted, and X has at least one non-deleted copy or move constructor, implementations are permitted to create a temporary object to hold the function parameter or result object. [...]
For the most part, this would permit the current GCC/clang behaviour.
There is a small corner case: currently, when a type has only a deleted copy or move constructor that would be trivial if defaulted, by the current rules of the standard, that constructor is still trivial if deleted:
12.8 Copying and moving class objects [class.copy]
12 A copy/move constructor for class X is trivial if it is not user-provided [...]
A deleted copy constructor is not user-provided, and nothing of what follows would render such a copy constructor non-trivial. So as specified by the standard, such a constructor is trivial, and as specified by my platform's ABI, because of the trivial constructor, GCC and clang create an extra copy in that case too. A one-line addition to my test program demonstrates this:
#include <stdio.h>
struct S {
S() { print(); }
S(const S &) = delete;
void print() { printf("%p\n", (void *) this); }
};
S f() { return {}; }
int main() { f().print(); }
This prints two different addresses with both GCC and clang, even though even the proposed resolution would require the same address to be printed twice. This appears to suggest that while we will get an update to the standard to not require a radically incompatible ABI, we will still need to get an update to the ABI to handle the corner case in a manner compatible with what the standard will require.
This is not an answer, rather a note on the different behavior of g++ and clang in this case, depending on the -O optimization flag.
Consider the following code:
#include <stdio.h>
struct S {
int i;
S(int _i): i(_i) {
int* p = print("from ctor");
printf("about to put 5 in %p\n", (void *)&i);
*p = 5;
}
int* print(const char* s) {
printf("%s: %p %d %p\n", s, (void *) this, i, (void *)&i);
return &i;
}
};
S f() { return {3}; }
int main() {
f().print("from main");
}
We can see that clang (3.8) and g++ (6.1) are taking it a bit differently, but both get to the right answer.
clang (for no -O, -O1, -O2) and g++ (for no -O, -O1)
from ctor: 0x7fff9d5e86b8 3 0x7fff9d5e86b8
about to put 5 in 0x7fff9d5e86b8
from main: 0x7fff9d5e86b0 5 0x7fff9d5e86b0
g++ (for -O2)
from ctor: 0x7fff52a36010 3 0x7fff52a36010
about to put 5 in 0x7fff52a36010
from main: 0x7fff52a36010 5 0x7fff52a36010
It seems that they both do it right in both cases - when they decide to skip the register optimization (g++ -O2) and when they go with the register optimization but copy the value to the actual i on time (all other cases).

Is return value optimization dependable in C++?

As this wiki page says (code exerted as below), return value optimization is an allowed by C++ compiler, but still depends on the implementation. To reduce the cost of copying, is it recommended to do optimize it manually (assign the object of function to a reference, like const C& obj = f();) or leave the compiler to do such optimization in practice?
#include <iostream>
struct C {
C() {}
C(const C&) { std::cout << "A copy was made.\n"; }
};
C f() {
return C();
}
int main() {
std::cout << "Hello World!\n";
C obj = f();
}
EDIT: Update the change as const reference.
You can't (portably) use the temporary return value to initialise a non-const reference, so that's certainly not recommended.
Using it to initialise a const reference wouldn't have any effect on whether or not the copy/move of the return expression's value might be elided; although it would eliminate the notional copy/move used to initialise the variable from the returned value, whether or not that might have been elided. Of course, that's not the same as initialising a (non-reference) variable, since you can't modify it.
In practice, any compiler with a decent optimiser will elide copies and moves wherever it's allowed to. If you're not using a decent optimiser, then you can't expect decent performance anyway.
To manually make sure you don't get any redundant copies of objects, you need to do a bit more than what you did so far. Earlier I answered that it wouldn't be possible, but I was wrong. Also, your use of const& may disallow certain operations on the returned value that you do want to allow. Here's what I would do if you need to do the optimisations manually:
#include <iostream>
struct S {
S()
{ std::cout << "default constructor\n"; }
S(const S &)
{ std::cout << "copy constructor\n"; }
S(S &&)
{ std::cout << "move constructor\n"; }
~S()
{ std::cout << "destructor\n"; }
};
S f() { return {}; }
int main() {
auto&&s = f();
std::cout << "main\n";
}
This prints "default constructor", followed by "main", and then "destructor". This is the output regardless of whether any copy elision takes place. Inside main, s is a named reference, so it is an lvalue, and it is not const-qualified. You can do everything with it that you otherwise could.
Given that it turns out fairly easy to avoid relying on copy elision in cases such as these, as long as you take care to pay attention to it from the start, it may be worth your efforts if you have to worry about other compilers not performing copy elision. Most compilers are capable of that, and there is a fair chance that if a compiler doesn't, it will have other, bigger, problems anyway, so there is a good argument for not worrying about it.
However, at the same time, copy elision is somewhat unreliable: even current optimising compilers do not always perform it, simply because there may be corner cases where copy elision would make sense, but is not permitted by the standard or not possible for that particular implementation. Forcing yourself to write code that doesn't rely on copy elision means you cannot get stuck in that situation.
That said, there are still some cases where copy elision can only realistically be eliminated by optimising compilers, so you may have no choice but to rely on it:
Suppose we add void m(); to S's definition. Suppose we now edit f to
S f() {
S s;
s.m();
return s;
}
This is more difficult to rewrite into a form that guarantees no redundant copies. Yet at the same time, copies are unnecessary, as can easily be determined from the fact that with GCC (and probably other compilers too), by default, no copies are made.
My final conclusion is that it's probably not worth optimising for compilers that don't perform RVO, but it is worth thinking carefully about what exactly makes it work, and writing code in such a way that RVO remains not only possible, but becomes something a compiler is very likely to do.

How to write move so that it can potentially be optimized away?

Given the following code:
struct obj {
int i;
obj() : i(1) {}
obj(obj &&other) : i(other.i) {}
};
void f() {
obj o2(obj(obj(obj{})));
}
I expect release builds to only really create one object and never call a move constructor because the result is the same as if my code was executed. Most code is not that simple though, I can think of a few hard to predict side effects that could stop the optimizer from proving the "as if":
changes to global or "outside" things in either the move constructor or destructor.
potential exceptions in the move constructor or destructor (probably bad design anyway)
internal counting or caching mechanisms changing.
Since I don't use any of these often can I expect most of my moves in and out of functions which are later inlined to be optimized away or am I forgetting something?
P.S. I know that just because an optimization is possible does not mean it will be made by any given compiler.
This doesn't really have anything to do with the as-if rule. The compiler is allowed to elide moves and copies even if they have some side effect. It is the single optimization that a compiler is allowed to do that might change the result of your program. From §12.8/31:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects.
So the compiler doesn't have to bother inspecting what happens inside your move constructor, it will likely get rid of any moves here anyway. To demonstrate this, consider the following example:
#include <iostream>
struct bad_mover
{
static int move_count;
bad_mover() = default;
bad_mover(bad_mover&& other) { move_count++; }
};
int bad_mover::move_count = 0;
int main(int argc, const char* argv[])
{
bad_mover b{bad_mover(bad_mover(bad_mover()))};
std::cout << "Move count: " << bad_mover::move_count << std::endl;
return 0;
}
Compiled with g++ -std=c++0x:
Move count: 0
Compiled with g++ -std=c++0x -fno-elide-constructors:
Move count: 3
However, I would question any reason you have for providing a move constructor that has additional side effects. The idea in allowing this optimization regardless of side effects is that a copy or move constructor shouldn't do anything other than copy or move. The program with the copy or move should be exactly the same as without.
Nonetheless, your calls to std::move are unnecessary. std::move is used to change an lvalue expression to an rvalue expression, but an expression that creates a temporary object is already an rvalue expression.
Using std::move( tmp(...) ) is completely pointless, the temporary tmp is already an rvalue, you don't need to use std::move to cast it to an rvalue.
Read this series of articles: Want Speed? Pass By Value
You'll learn more and understand better than you will by asking a question on Stackoverflow

When I initialize a C++ container (such as a std::list) is the copy constructor called?

When I initialize a STL container such as a list< vector<char> > using e.g. my_list.push_back(vector<char>(5000, 'T')) is this copied after construction? Or does the compiler invoke the constructor inside list< vector<char> > itself?
In C++03 push_back is defined as void push_back(const T& x);. That means that you are constructing a vector and a const reference to such temporal is being passed to the list. Then the list internally invokes the copy constructor in order to store a copy of such element.
In C++11 there is an extra definition for void push_back(T&& x); that takes an rvalue reference to your temporal vector, and would result in the move constructor being called internally to initialize the copy held by the list.
Compilers are smart. Really smart. In this case, there is an optimization called "copy elision." The C++ standard allows the compiler to omit a copy when a temporary object is used to initialize an object of the same type and the copy constructor of said object has no side effects.
This is in the same class of optimizations as the more popular "as if" rule. That rule allows the compiler to get away with nearly anything it wants, as long as the observable behavior of the resulting program is the same "as if" the standard had been followed exactly.
Here is an example program. On gcc 4.4.5 with both -O0 and -O3 this code results in a "1" being printed. I think that GCC is wrong here... some compilers will output "2" indicating a copy took place. This is where things get tricky in trying to detect behavior that is supposed to be undetectable. In one of those compilers, the only way to tell will be to dive into the resulting assembly.
#include <iostream>
struct elision
{
explicit elision(int i) : v(i) {
}
elision(elision const &copy) : v(copy.v+1) {
}
int v;
};
int main()
{
elision e(elision(1));
std::cout << e.v << std::endl;
return 0;
}

compiler optimization

So I have a question for you. :)
Can you tell me the output the following code should produce?
#include <iostream>
struct Optimized
{
Optimized() { std::cout << "ctor" << std::endl; }
~Optimized() { std::cout << "dtor" << std::endl; }
Optimized(const Optimized& copy) { std::cout << "copy ctor" << std::endl; }
Optimized(Optimized&& move) { std::cout << "move ctor" << std::endl; }
const Optimized& operator=(const Optimized& rhs) { std::cout << "assignment operator" << std::endl; return *this; }
Optimized& operator=(Optimized&& lhs) { std::cout << "move assignment operator" << std::endl; return *this; }
};
Optimized TestFunction()
{
Optimized a;
Optimized b = a;
return b;
}
int main(int argc, char* argv[])
{
Optimized test = TestFunction();
return 0;
}
My first response would be:
ctor
copy ctor
move ctor
dtor
dtor
dtor
and it IS true, but only if compiler optimization is turned off. When optimization is turned ON then the output is entirely different. With optimization turned on, the output is:
ctor
copy ctor
dtor
dtor
With compiler optimization, the test variable is the return variable.
My question is, what conditions would cause this to not be optimized this way?
I have always been taught that returning a struct/class which results in extra copy constructors could better be optimized by being passed in as a reference but the compiler is doing that for me. So is return a structure still considered bad form?
This is known as Copy Elision and is a special handling instead of copying/moving.
The optimization is specifically allowed by the Standard, as long as it would be possible to copy/move (ie, the method is declared and accessible).
The implementation in a compiler is generally referred to, in this case, as Return Value Optimization. There are two variations:
RVO: when you return a temporary (return "aa" + someString;)
NRVO: N for Named, when you return an object that has a name
Both are implemented by major compilers, but the latter may kick in only at higher optimization levels as it is more difficult to detect.
Therefore, to answer your question about returning structs: I would recommend it. Consider:
// Bad
Foo foo;
bar(foo);
-- foo can be modified here
// Good
Foo const foo = bar();
The latter is not only clearer, it also allows const enforcement!
Both outputs are permissible. The C++03 language standard says, in clause 12.8/15:
When certain criteria are met, an implementation is allowed to omit the copy construction of a class object,
even if the copy constructor and/or destructor for the object have side effects. In such cases, the implementation
treats the source and target of the omitted copy operation as simply two different ways of referring to
the same object, and the destruction of that object occurs at the later of the times when the two objects
would have been destroyed without the optimization.111) This elision of copy operations is permitted in the
following circumstances (which may be combined to eliminate multiple copies):
in a return statement in a function with a class return type, when the expression is the name of a
non-volatile automatic object with the same cv-unqualified type as the function return type, the copy
operation can be omitted by constructing the automatic object directly into the function’s return value
when a temporary class object that has not been bound to a reference (12.2) would be copied to a class
object with the same cv-unqualified type, the copy operation can be omitted by constructing the temporary
object directly into the target of the omitted copy
The output this code will produce is unpredictable, since the language specification explicitly allows optional elimination (elision) of "unnecessary" temporary copies of class objects even if their copy constructors have side effects.
Whether this will happen or not might depend on may factors, including the compiler optimization settings.
In my opinion calling the above copy elision an "optimization" is not entirely correct (although the desire to use this term here is perfectly understandable and it is widely used for this purpose). I'd say that the term optimization should be reserved to situations when the compiler deviates from the behavior of the abstract C++ machine while preserving the observable behavior of the program. In other words, true optimization implies violation of the abstract requirements of the language specification. Since in this case there's no violation (the copy elision is explicitly allowed by the standard), there's no real "optimization". What we observe here is just how the C++ language works at its abstract level. No need to involve the concept of "optimization" at all.
Even when passing back by value the compiler can optimise the extra copy away using Return Value Optimisation see; http://en.wikipedia.org/wiki/Return_value_optimization