Expensive to move types - c++

I am reading the official C++ Core Guidelines to understand when it is reliable to count on RVO and when it is not.
Guideline F.20 says:
If a type is expensive to move (e.g., array), consider
allocating it on the free store and return a handle (e.g.,
unique_ptr), or passing it in a reference to non-const target object
to fill (to be used as an out-parameter)
I understand that non-STL types are not necessarily optimized for moving, but how can I easily detect other types that are expensive to move, so that I will not rely on RVO for them?

You seem to have misunderstood what "RVO" is. "RVO" stands for "return value optimization" and it's a compiler optimization that prevents any move or copy constructor from being invoked. E.g.
std::vector<huge_thing> foo()
{
    std::vector<huge_thing> result{/* ... */};
    return result;
}
void bar()
{
    auto v = foo(); // (0)
}
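Any decent compiler will skip the copy/move entirely and simply construct v in place at (0). Strictly speaking, returning the named local result is NRVO, which compilers are allowed but not required to perform; if it is not applied, the return falls back to a move. Since C++17, elision is mandatory only when the returned expression is a prvalue (for example return std::vector<huge_thing>{/* ... */};), thanks to the changes to prvalues.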
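The difference between the two kinds of elision can be observed with an instrumented type. This is only a sketch: Tracer, make_prvalue and make_named are hypothetical names, not part of the question or of any library.

```cpp
#include <cassert>

// Hypothetical tracer type: counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns a prvalue: since C++17 the result is constructed directly in the
// caller's object, so neither constructor runs.
Tracer make_prvalue() { return Tracer{}; }

// Returns a named local (NRVO): elision is permitted but not required.
// Without elision the return falls back to a move, never to a copy.
Tracer make_named() {
    Tracer result;
    return result;
}

void demo() {
    Tracer a = make_prvalue();
    (void)a;
    assert(Tracer::copies == 0 && Tracer::moves == 0);

    Tracer b = make_named();
    (void)b;
    assert(Tracer::copies == 0);  // never copied
    assert(Tracer::moves <= 1);   // 0 with NRVO, 1 without
}
```

Calling demo() passes whether or not the compiler applies NRVO; only the prvalue case is checked for exactly zero operations.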
In terms of expensive moves: sure, there can be types expensive to move - but I cannot think of any instance where a move would be more expensive than a copy.
Therefore:
Rely on RVO, especially in C++17 - this does not incur any cost even for types "expensive to move".
If a type is expensive to move, it's also expensive to copy - so you don't really have a choice there. Redesign your code so that you don't need the copy/move if possible.

Related

in c++11, is it necessary to provide rvalue overrides for functions move-assigning large objects? [duplicate]

Since we have move semantics in C++, nowadays it is usual to do
void set_a(A a) { _a = std::move(a); }
The reasoning is that if a is an rvalue, the copy will be elided and there will be just one move.
But what happens if a is an lvalue? It seems there will be a copy construction and then a move assignment (assuming A has a proper move assignment operator). Move assignments can be costly if the object has too many member variables.
On the other hand, if we do
void set_a(const A& a) { _a = a; }
There will be just one copy assignment. Can we say this way is preferred over the pass-by-value idiom if we will pass lvalues?
Expensive-to-move types are rare in modern C++ usage. If you are concerned about the cost of the move, write both overloads:
void set_a(const A& a) { _a = a; }
void set_a(A&& a) { _a = std::move(a); }
or a perfect-forwarding setter:
template <typename T>
void set_a(T&& a) { _a = std::forward<T>(a); }
that will accept lvalues, rvalues, and anything else implicitly convertible to decltype(_a) without requiring extra copies or moves.
Despite requiring an extra move when setting from an lvalue, the idiom is not bad since (a) the vast majority of types provide constant-time moves and (b) copy-and-swap provides exception safety and near-optimal performance in a single line of code.
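As a rough sketch of the perfect-forwarding setter in context, assuming the member is a std::string (Holder and the accessor a() are hypothetical names added for illustration):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical holder class; the member name _a follows the question.
class Holder {
public:
    // Accepts lvalues, rvalues, and anything assignable to std::string.
    template <typename T>
    void set_a(T&& a) { _a = std::forward<T>(a); }

    const std::string& a() const { return _a; }

private:
    std::string _a;
};

void demo() {
    Holder h;
    std::string s = "lvalue";

    h.set_a(s);                       // copy-assigns, s is left intact
    assert(h.a() == "lvalue" && s == "lvalue");

    h.set_a(std::string("rvalue"));   // move-assigns from the temporary
    assert(h.a() == "rvalue");

    h.set_a("literal");               // assigns from const char* directly,
    assert(h.a() == "literal");       // no temporary std::string is needed
}
```

The price is that set_a is now a template, so it has to live in a header and accepts any argument type until the assignment inside fails to compile.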
But what happens if a is an lvalue? It seems there will be a copy
construction and then a move assignment (assuming A has a proper move
assignment operator). Move assignments can be costly if the object has
too many member variables.
Problem well spotted. I wouldn't go as far as to say that the pass-by-value-and-then-move construct is a bad idiom but it definitely has its potential pitfalls.
If your type is expensive to move and / or moving it is essentially just a copy, then the pass-by-value approach is suboptimal. Examples of such types would include types with a fixed size array as a member: It may be relatively expensive to move and a move is just a copy. See also
Small String Optimization and Move Operations and
"Want speed? Measure." (by Howard Hinnant)
in this context.
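To make the fixed-size array case concrete, here is a small sketch. Inline is a hypothetical type and the array size is arbitrary; the pointer comparison for std::vector reflects what mainstream implementations do rather than a quoted guarantee.

```cpp
#include <array>
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical type with a fixed-size array member: all data lives inside
// the object, so there is no pointer a move could steal.
struct Inline {
    std::array<int, 1024> data{};
};

static_assert(std::is_trivially_copyable<Inline>::value,
              "moving Inline is a plain element-wise copy");
static_assert(sizeof(Inline) >= 1024 * sizeof(int),
              "the whole payload is part of the object");

void demo() {
    Inline a;
    a.data[0] = 42;
    Inline b = std::move(a);   // copies 1024 ints
    assert(b.data[0] == 42);
    assert(a.data[0] == 42);   // the source is untouched: the "move" was a copy

    std::vector<int> v(1024, 7);
    const int* buffer = v.data();
    std::vector<int> w = std::move(v);  // hands over the heap buffer
    assert(w.size() == 1024);
    assert(w.data() == buffer);  // same allocation on mainstream implementations
}
```

A large sizeof combined with std::is_trivially_copyable is a reasonable first hint that moving a type will not be cheaper than copying it.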
The pass-by-value approach has the advantage that you only need to maintain one function but you pay for this with performance. It depends on your application whether this maintenance advantage outweighs the loss in performance.
The pass by lvalue and rvalue reference approach can lead to maintenance headaches quickly if you have multiple arguments. Consider this:
#include <utility>
#include <vector>
using namespace std;
struct A { vector<int> v; };
struct B { vector<int> v; };
struct C {
    A a;
    B b;
    C(const A&  a, const B&  b) : a(a), b(b) { }
    C(const A&  a,       B&& b) : a(a), b(move(b)) { }
    C(      A&& a, const B&  b) : a(move(a)), b(b) { }
    C(      A&& a,       B&& b) : a(move(a)), b(move(b)) { }
};
If you have multiple arguments, you will have a permutation problem. In this very simple example, it is probably still not that bad to maintain these 4 constructors. However, already in this simple case, I would seriously consider using the pass-by-value approach with a single function
C(A a, B b) : a(move(a)), b(move(b)) { }
instead of the above 4 constructors.
So long story short, neither approach is without drawbacks. Make your decisions based on actual profiling information, instead of optimizing prematurely.
The current answers are quite incomplete. Instead, I will try to conclude based on the lists of pros and cons I find.
Short answer
In short, it may be OK, but sometimes bad.
This idiom, namely the unifying interface, has better clarity (both in conceptual design and implementation) compared to forwarding templates or different overloads. It is sometimes used with copy-and-swap (actually, as well as move-and-swap in this case).
Detailed analysis
The pros are:
It needs only one function for each parameter list.
It indeed needs only one, not multiple ordinary overloads (or even 2^n overloads when you have n parameters, each of which can be unqualified or const-qualified).
Like in a forwarding template, parameters passed by value are compatible not only with const but also with volatile, which reduces the number of ordinary overloads even further.
Combined with the bullet above, you don't need 4^n overloads to serve the {unqualified, const, volatile, const volatile} combinations for n parameters.
Compared to a forwarding template, it can be a non-template function as long as the parameters do not need to be generic (parameterized through template type parameters). This allows out-of-line definitions instead of template definitions that have to be instantiated for each instance in each translation unit, which can significantly improve translation-time performance (typically during both compiling and linking).
It also makes other overloads (if any) easier to implement.
If you have a forwarding template for a parameter object type T, it may still clash with overloads having a parameter const T& in the same position, because the argument can be an lvalue of type T, and the template instantiated with type T& (rather than const T&) is then preferred by the overload resolution rules when nothing else distinguishes the best candidate. This inconsistency may be quite surprising.
In particular, consider a forwarding constructor template with one parameter of type P&& in a class C. How many times will you forget to exclude the instances where P is a possibly cv-qualified C by SFINAE (e.g. by adding typename = enable_if_t<!is_same<C, decay_t<P>>::value> to the template-parameter-list), to ensure it does not clash with the copy/move constructors (even when the latter are explicitly user-provided)?
Since the parameter is passed by value as a non-reference type, it can force the argument to be passed as a prvalue. This can make a difference when the argument is of a literal class type. Consider a static constexpr data member of such a type, declared in some class without an out-of-class definition: when it is used as an argument to a parameter of lvalue reference type, the program may eventually fail to link, because the member is odr-used and there is no definition of it.
Note that since ISO C++17 the rules for static constexpr data members have changed to introduce a definition implicitly, so the difference is not significant in this case.
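The clash between a forwarding constructor and the copy constructor mentioned in the list above can be sketched as follows. Widget and its members are hypothetical; the constraint is the enable_if_t form from the text, which requires C++14.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class with a forwarding constructor. Without the constraint,
// copying a non-const lvalue Widget would pick the template (P = Widget&),
// because it is a better match than the copy constructor's const Widget&.
class Widget {
public:
    template <typename P,
              typename = std::enable_if_t<
                  !std::is_same<Widget, std::decay_t<P>>::value>>
    explicit Widget(P&& p) : name_(std::forward<P>(p)), from_template_(true) {}

    Widget(const Widget& other) : name_(other.name_), from_template_(false) {}

    bool from_template() const { return from_template_; }
    const std::string& name() const { return name_; }

private:
    std::string name_;
    bool from_template_;
};

void demo() {
    Widget w("gadget");          // forwarding constructor
    assert(w.from_template());
    Widget copy(w);              // non-const lvalue: the copy constructor wins
    assert(!copy.from_template());
    assert(copy.name() == "gadget");
}
```

A by-value constructor such as Widget(std::string name) avoids this bookkeeping entirely, which is the point the list makes.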
The cons are:
A unifying interface cannot replace copy and move constructors, where the parameter object type is identical to the class. Otherwise, copy-initialization of the parameter would recurse infinitely, because it would call the unifying constructor, and that constructor would then call itself.
As mentioned by other answers, if the cost of a copy is not ignorable (cheap and predictable enough), you will almost always see degraded performance in the calls where the copy is not needed, because copy-initialization of a unifying passed-by-value parameter unconditionally introduces a copy (either copied-to or moved-to) of the argument unless elided.
Even with mandatory elision since C++17, copy-initialization of a parameter object is still hard to remove, unless the implementation tries very hard to prove that the behavior is unchanged under the as-if rule instead of the dedicated copy elision rules applicable here, which may sometimes be impossible without whole-program analysis.
Likewise, the cost of destruction may not be ignorable either, particularly when non-trivial subobjects are taken into account (e.g. in the case of containers). The difference is that it applies not only to the copy-initialization introduced by copy construction, but also to that introduced by move construction. Making the move cheaper than the copy in constructors cannot improve the situation. The more copy-initialization costs, the more destruction cost you have to pay.
A minor shortcoming is that there is no way to tweak the interface differently for each case as separate overloads can, for example by specifying different noexcept-specifiers for the const& and && parameter types.
OTOH, in this example, a unifying interface will usually give you a noexcept(false) copy plus a noexcept move if you specify noexcept, or always noexcept(false) when you specify nothing (or an explicit noexcept(false)). (Note that in the former case, noexcept does not prevent throwing during the copy, because that only occurs during evaluation of the arguments, which is outside the function body.) There is no further chance to tune them separately.
This is considered minor because it is not frequently needed in reality.
Even if such overloads are used, they are probably confusing by nature: different specifiers may hide subtle but important behavioral differences that are difficult to reason about. Why not use different names instead of overloads?
Note that the noexcept example may be particularly problematic since C++17, because the noexcept-specification now affects the function type. (Some unexpected compatibility issues can be diagnosed by a Clang++ warning.)
Sometimes the unconditional copy is actually useful. Because a composition of operations with the strong exception guarantee does not by nature retain the guarantee, a copy can be used as a transactional state holder when the strong exception guarantee is required and the operation cannot be broken down into a sequence of operations with guarantees that are no less strict (no-throw or strong). (This includes the copy-and-swap idiom, although assignments are not recommended to be unified for other reasons in general; see below.) However, this does not mean the copy is otherwise unacceptable. If the intention of the interface is always to create some object of type T, and the cost of moving T is ignorable, the copy can be moved to the target without unwanted overhead.
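The remark about noexcept and the copy happening outside the function body can be illustrated with a small sketch. ThrowingCopy, store and copy_throws_in_caller are hypothetical names; the copy constructor throws unconditionally just to make the effect visible.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical type whose copy constructor always throws.
struct ThrowingCopy {
    ThrowingCopy() = default;
    ThrowingCopy(const ThrowingCopy&) { throw std::runtime_error("copy failed"); }
    ThrowingCopy(ThrowingCopy&&) noexcept = default;
};

// Unifying interface marked noexcept: the guarantee covers the body only.
void store(ThrowingCopy value) noexcept {
    ThrowingCopy kept = std::move(value);  // noexcept move inside the body
    (void)kept;
}

bool copy_throws_in_caller() {
    ThrowingCopy source;
    try {
        store(source);   // the parameter is copy-initialized in the caller
    } catch (const std::runtime_error&) {
        return true;     // caught normally; std::terminate is not called
    }
    return false;
}
```

Here copy_throws_in_caller() returns true: the exception leaves the argument initialization before store is entered, so the noexcept on store is never violated.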
Conclusions
So for some given operations, here are suggestions about whether to use a unifying interface to replace them:
If not all of the parameter types match the unifying interface, or if there is a behavioral difference other than the cost of new copies among the operations being unified, there cannot be a unifying interface.
If the following conditions do not hold for all parameters, there cannot be a unifying interface. (But the operation can still be broken down into differently named functions, delegating one call to another.)
For any parameter of type T, if a copy of each argument is needed for all operations, use unifying.
If both copy and move construction of T have ignorable cost, use unifying.
If the intention of the interface is always to create some object of type T, and the cost of the move construction of T is ignorable, use unifying.
Otherwise, avoid unifying.
Here are some examples where unifying should be avoided:
Assignment operations (including assignment to subobjects, typically with the copy-and-swap idiom) for a T whose copy and move constructions do not have ignorable cost do not meet the criteria for unifying, because the intention of assignment is not to create the object but to replace its content. The copied object will eventually be destroyed, which incurs unnecessary overhead. This is even more obvious in the case of self-assignment.
Insertion of values into a container does not meet the criteria, unless both the copy-initialization and the destruction have ignorable cost. If the operation fails (due to allocation failure, duplicate values and so on) after copy-initialization, the parameters have to be destroyed, which incurs unnecessary overhead.
Conditional creation of an object based on parameters will incur the overhead when it does not actually create the object (e.g. std::map::insert_or_assign-like container insertion, even apart from the failure case above).
Note that the exact limit of "ignorable" cost is somewhat subjective, because it eventually depends on how much cost can be tolerated by the developers and/or the users, and it may vary case by case.
Practically, I (conservatively) assume that any trivially copyable and trivially destructible type whose size is not more than one machine word (like a pointer) meets the criteria of ignorable cost in general. If the resulting code actually costs too much in such a case, it suggests that either the build tool is misconfigured or the toolchain is not ready for production.
Do profile if there is any further doubt on performance.
Additional case study
There are some other well-known types that are preferably passed by value or not, depending on convention:
Types that need to preserve reference values by convention should not be passed by value.
A canonical example is the argument forwarding call wrapper defined in ISO C++, which is required to forward references. Note that in the caller position it may also preserve the reference with respect to the ref-qualifier.
An instance of this example is std::bind. See also the resolution of LWG 817.
Some generic code may directly copy some parameters. It may even do so without std::move, because the cost of the copy is assumed to be ignorable and a move does not necessarily make it better.
Such parameters include iterators and function objects (except the case of argument forwarding call wrappers discussed above).
Note that the constructor template of std::function (but not the assignment operator template) also uses a pass-by-value functor parameter.
Types presumably having a cost comparable to pass-by-value parameter types with ignorable cost are also preferably passed by value. (Sometimes they are used as dedicated alternatives.) For example, instances of std::initializer_list and std::basic_string_view are more or less two pointers or a pointer plus a size. This fact makes them cheap enough to be passed directly without using references.
Some types are better not passed by value unless you do need a copy. There are different reasons.
Avoid the copy by default, because it may be quite expensive, or at least it is not easy to guarantee that the copy is cheap without some inspection of the runtime properties of the value being copied. Containers are typical examples of this sort.
Without statically knowing how many elements are in a container, it is generally not safe (in the sense of a DoS attack, for example) to copy it.
A nested container (of other containers) will easily make the performance problem of copying worse.
Even empty containers are not guaranteed to be cheap to copy. (Strictly speaking, this depends on the concrete implementation of the container, e.g. the existence of a "sentinel" element for some node-based containers... But no, keep it simple: just avoid copying by default.)
Avoid the copy by default, even when performance is of no interest at all, because there can be some unexpected side effects.
In particular, allocator-aware containers and some other types with similar treatment of allocators ("container semantics", in David Krauss' words) should not be passed by value: allocator propagation is just another big semantic can of worms.
A few other types depend on convention. For example, see GotW #91 for shared_ptr instances. (However, not all smart pointers are like that; observer_ptr instances are more like raw pointers.)
For the general case where the value will be stored, pass-by-value alone is a good compromise.
For the case where you know that only lvalues will be passed (some tightly coupled code), it is unreasonable.
For the case where one suspects a speed improvement by providing both, first THINK TWICE, and if that didn't help, MEASURE.
Where the value will not be stored I prefer pass by reference, because that prevents umpteen needless copy operations.
Finally, if programming could be reduced to unthinking application of rules, we could leave it to robots. So IMHO it's not a good idea to focus so much on rules. Better to focus on what the advantages and costs are, for different situations. Costs include not only speed, but also e.g. code size and clarity. Rules can't generally handle such conflicts of interest.
Pass by value, then move, is actually a good idiom for objects that you know are movable.
As you mentioned, if an rvalue is passed, the copy is either elided or replaced by a move, and then within the constructor the value is moved again.
You could provide overloads taking const lvalue references and rvalue references explicitly; however, it gets more complicated if you have more than one parameter.
Consider the example,
class Obj {
public:
    Obj(std::vector<int> x, std::vector<int> y)
        : X(std::move(x)), Y(std::move(y)) {}
private:
    /* Our internal data. */
    std::vector<int> X, Y;
}; // Obj
Suppose you wanted to provide explicit versions; you would end up with 4 constructors like so:
class Obj {
public:
    Obj(std::vector<int> &&x, std::vector<int> &&y)
        : X(std::move(x)), Y(std::move(y)) {}
    Obj(std::vector<int> &&x, const std::vector<int> &y)
        : X(std::move(x)), Y(y) {}
    Obj(const std::vector<int> &x, std::vector<int> &&y)
        : X(x), Y(std::move(y)) {}
    Obj(const std::vector<int> &x, const std::vector<int> &y)
        : X(x), Y(y) {}
private:
    /* Our internal data. */
    std::vector<int> X, Y;
}; // Obj
As you can see, as you increase the number of parameters, the number of necessary constructors grows exponentially (2^n constructors for n parameters).
If you don't have a concrete type but have a templatized constructor, you can use perfect-forwarding like so:
class Obj {
public:
    template <typename T, typename U>
    Obj(T &&x, U &&y)
        : X(std::forward<T>(x)), Y(std::forward<U>(y)) {}
private:
    std::vector<int> X, Y;
}; // Obj
References:
Want Speed? Pass by Value
C++ Seasoning
I am answering myself because I will try to summarize some of the answers. How many moves/copies do we have in each case?
(A) Pass by value and move assignment construct, passing an X parameter. If X is a...
Temporary: 1 move (the copy is elided)
Lvalue: 1 copy 1 move
std::move(lvalue): 2 moves
(B) Pass by reference and copy assignment, the usual (pre-C++11) construct. If X is a...
Temporary: 1 copy
Lvalue: 1 copy
std::move(lvalue): 1 copy
We can assume the three kinds of parameters are equally probable. So for every 3 calls we have (A) 4 moves and 1 copy, or (B) 3 copies. I.e., on average, (A) 1.33 moves and 0.33 copies per call or (B) 1 copy per call.
If we come to a situation where our classes consist mostly of PODs, moves are as expensive as copies. So we would have 1.67 copies (or moves) per call to the setter in case (A) and 1 copy in case (B).
We can say that in some circumstances (POD-based types), the pass-by-value-and-then-move construct is a very bad idea. It is about 67% slower and it depends on a C++11 feature.
On the other hand, if our classes include containers (which make use of dynamic memory), (A) should be much faster (except if we mostly pass lvalues).
Please, correct me if I'm wrong.
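The counts listed for (A) and (B) can be checked with an instrumented type. A minimal sketch, assuming a hypothetical type X that counts constructions and assignments together, and a hypothetical Holder with one setter per variant:

```cpp
#include <cassert>
#include <utility>

// Hypothetical payload type that counts copy and move operations
// (constructions and assignments alike).
struct X {
    static int copies;
    static int moves;
    X() = default;
    X(const X&) { ++copies; }
    X(X&&) noexcept { ++moves; }
    X& operator=(const X&) { ++copies; return *this; }
    X& operator=(X&&) noexcept { ++moves; return *this; }
};
int X::copies = 0;
int X::moves = 0;

struct Holder {
    X x;
    void set_by_value(X v) { x = std::move(v); }   // (A)
    void set_by_cref(const X& v) { x = v; }        // (B)
};

void reset() { X::copies = 0; X::moves = 0; }

void demo() {
    Holder h;
    X lvalue;

    reset(); h.set_by_value(X{});                // temporary
    assert(X::copies == 0 && X::moves == 1);
    reset(); h.set_by_value(lvalue);             // lvalue
    assert(X::copies == 1 && X::moves == 1);
    reset(); h.set_by_value(std::move(lvalue));  // std::move(lvalue)
    assert(X::copies == 0 && X::moves == 2);

    reset(); h.set_by_cref(X{});
    assert(X::copies == 1 && X::moves == 0);
    reset(); h.set_by_cref(lvalue);
    assert(X::copies == 1 && X::moves == 0);
    reset(); h.set_by_cref(std::move(lvalue));
    assert(X::copies == 1 && X::moves == 0);
}
```

The temporary case in (A) relies on the copy into the parameter being elided, which is guaranteed since C++17 and done by mainstream compilers before that.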
Readability in the declaration:
void foo1( A a );         // easy to read, but unless you see the implementation
                          // you don't know for sure if a std::move() is used.
void foo2( const A & a ); // longer declaration, but the interface shows
                          // that no copy is required on calling foo2().
Performance:
A a;
foo1( a ); // copy + move
foo2( a ); // pass by reference + copy
Responsibilities:
A a;
foo1( a ); // caller copies, foo1 moves
foo2( a ); // foo2 copies
For typical inline code there is usually no difference when optimized.
But foo2() might do the copy only on certain conditions (e.g. insert into map if key does not exist), whereas for foo1() the copy will always be done.
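The map case can be sketched like this. Value, Table and the two insert functions are hypothetical stand-ins for foo1 and foo2; only copies are counted.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical value type that counts copies.
struct Value {
    static int copies;
    Value() = default;
    Value(const Value&) { ++copies; }
    Value(Value&&) noexcept {}
    Value& operator=(const Value&) { ++copies; return *this; }
    Value& operator=(Value&&) noexcept { return *this; }
};
int Value::copies = 0;

using Table = std::map<int, Value>;

// foo1-style: the caller pays for the copy even if nothing is inserted.
bool insert_by_value(Table& t, int key, Value v) {
    if (t.count(key) != 0) return false;
    t.emplace(key, std::move(v));
    return true;
}

// foo2-style: the copy happens only when the key is actually missing.
bool insert_by_cref(Table& t, int key, const Value& v) {
    if (t.count(key) != 0) return false;
    t.emplace(key, v);
    return true;
}

void demo() {
    Table t;
    Value v;
    bool inserted = insert_by_cref(t, 1, v);   // key missing: one copy
    assert(inserted && Value::copies == 1);
    inserted = insert_by_cref(t, 1, v);        // key present: no copy
    assert(!inserted && Value::copies == 1);
    inserted = insert_by_value(t, 1, v);       // key present: copied anyway
    assert(!inserted && Value::copies == 2);
}
```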

Is it undesirable to defensively apply std::move to trivially-copyable types?

Imagine we have a trivially-copyable type:
struct Trivial
{
    float A{};
    int B{};
};
which gets constructed and stored in an std::vector:
class ClientCode
{
    std::vector<Trivial> storage{};
    ...
    void some_function()
    {
        ...
        Trivial t{};
        fill_trivial_from_some_api(t, other_args);
        storage.push_back(std::move(t)); // Redundant std::move.
        ...
    }
};
Normally, this is a pointless operation, as the object will be copied anyway.
However, an advantage of keeping the std::move call is that if the Trivial type would be changed to no longer be trivially-copyable, the client code will not silently perform an extra copy operation, but a more appropriate move. (The situation is quite possible in my scenario, where the trivial type is used for managing external resources.)
So my question is whether there are any technical downsides to applying the redundant std::move?
However, an advantage of keeping the std::move call is that if the Trivial type would be changed to no longer be trivially-copyable, the client code will not silently perform an extra copy operation, but a more appropriate move.
This is correct and something you should think about.
So my question is whether there any technical downsides to applying the redundant std::move?
Depends on where the moved object is being consumed. In the case of push_back, everything is fine, as push_back has both const T& and T&& overloads that behave intuitively.
Imagine another function that had a T&& overload that has completely different behavior from const T&: the semantics of your code will change with std::move.
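For the push_back case, a short sketch shows what the redundant std::move does today. The static_assert is an optional addition (not from the question) that would flag the moment Trivial stops being trivially copyable.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

struct Trivial {
    float A{};
    int B{};
};

static_assert(std::is_trivially_copyable<Trivial>::value,
              "std::move on Trivial currently degenerates to a copy");

void demo() {
    std::vector<Trivial> storage;
    Trivial t{};
    t.A = 1.5f;
    t.B = 7;
    storage.push_back(std::move(t));  // selects push_back(T&&); copies the bytes
    assert(storage.back().B == 7);
    assert(t.B == 7);                 // source unchanged for a trivially copyable type
}
```

Once the type gains a real move constructor, the same call site starts moving and t must be treated as moved-from afterwards.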

Can modern C++ get you performance for free?

It is sometimes claimed that C++11/14 can get you a performance boost even when merely compiling C++98 code. The justification is usually along the lines of move semantics, as in some cases the rvalue constructors are automatically generated or now part of the STL. Now I'm wondering whether these cases were previously actually already handled by RVO or similar compiler optimizations.
My question then is if you could give me an actual example of a piece of C++98 code that, without modification, runs faster using a compiler supporting the new language features. I do understand that a standard conforming compiler is not required to do the copy elision and just by that reason move semantics might bring about speed, but I'd like to see a less pathological case, if you will.
EDIT: Just to be clear, I am not asking whether new compilers are faster than old compilers, but rather if there is code whereby adding -std=c++14 to my compiler flags it would run faster (avoid copies, but if you can come up with anything else besides move semantics, I'd be interested, too)
I am aware of 5 general categories where recompiling C++03 code as C++11 can cause unbounded performance increases that are practically unrelated to quality of implementation. These are all variations of move semantics.
std::vector reallocate
struct bar{
    std::vector<int> data;
};
std::vector<bar> foo(1);
foo.back().data.push_back(3);
foo.reserve(10); // two allocations and a delete occur in C++03
Every time foo's buffer is reallocated, C++03 copies the std::vector inside every bar.
In C++11 it instead moves each bar::data, which is basically free.
In this case, this relies on optimizations inside the std container vector. In every case below, the use of std containers is just because they are C++ objects that have efficient move semantics in C++11 "automatically" when you upgrade your compiler. Objects that don't block it that contain a std container also inherit the automatic improved move constructors.
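The reallocation behavior can be observed with a counting element type. This is a sketch with a hypothetical Elem; note that std::vector only moves elements on growth when the move constructor is noexcept (or the type is not copyable), which the implicitly generated move constructor of bar satisfies.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts copies and moves.
struct Elem {
    static int copies;
    static int moves;
    Elem() = default;
    Elem(const Elem&) { ++copies; }
    Elem(Elem&&) noexcept { ++moves; }   // noexcept lets vector move on growth
};
int Elem::copies = 0;
int Elem::moves = 0;

void demo() {
    std::vector<Elem> v(1);              // one default-constructed element
    Elem::copies = 0;
    Elem::moves = 0;
    v.reserve(v.capacity() + 10);        // forces a reallocation
    assert(Elem::copies == 0);           // C++11 and later: relocated by move
    assert(Elem::moves == 1);
}
```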
NRVO failure
When NRVO (named return value optimization) fails, in C++03 it falls back on copy, on C++11 it falls back on move. Failures of NRVO are easy:
std::vector<int> foo(int count){
    std::vector<int> v; // oops
    if (count<=0) return std::vector<int>();
    v.reserve(count);
    for(int i=0;i<count;++i)
        v.push_back(i);
    return v;
}
or even:
std::vector<int> foo(bool which) {
    std::vector<int> a, b;
    // do work, filling a and b, using the other for calculations
    if (which)
        return a;
    else
        return b;
}
We have three values -- the return value, and two different values within the function. Elision allows the values within the function to be 'merged' with the return value, but not with each other. They both cannot be merged with the return value without merging with each other.
The basic issue is that NRVO elision is fragile, and code with changes not near the return site can suddenly have massive performance reductions at that spot with no diagnostic emitted. In most NRVO failure cases C++11 ends up with a move, while C++03 ends up with a copy.
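The fallback from elision to move can be checked with a counting type. A minimal sketch, with Tracked and pick as hypothetical names mirroring the two-variable example above:

```cpp
#include <cassert>

// Hypothetical type that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Two candidates for the return slot: both cannot be elided at once.
Tracked pick(bool which) {
    Tracked a, b;
    if (which)
        return a;   // implicit move if not elided
    else
        return b;
}

void demo() {
    Tracked r = pick(true);
    (void)r;
    assert(Tracked::copies == 0);  // a copy never happens since C++11
    assert(Tracked::moves <= 1);   // 0 if the compiler managed to elide, else 1
}
```

In C++03 the same function would have had to copy whenever the compiler could not elide.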
Returning a function argument
Elision is also impossible here:
std::set<int> func(std::set<int> in){
    return in;
}
In C++11 this is cheap; in C++03 there is no way to avoid the copy. Arguments to functions cannot be elided with the return value, because the lifetime and location of the parameter and return value are managed by the calling code.
However, C++11 can move from one to the other. (In a less toy example, something might be done to the set).
push_back or insert
Finally, elision into containers does not happen, but C++11 adds rvalue overloads of push_back and insert that move instead of copying, which saves copies.
struct whatever {
    std::string data;
    int count;
    whatever( std::string d, int c ):data(d), count(c) {}
};
std::vector<whatever> v;
v.push_back( whatever("some long string goes here", 3) );
In C++03 a temporary whatever is created, then it is copied into the vector v. Two std::string buffers are allocated, each with identical data, and one is discarded.
In C++11 a temporary whatever is created. The whatever&& push_back overload then moves that temporary into the vector v. One std::string buffer is allocated, and moved into the vector. An empty std::string is discarded.
Assignment
Stolen from @Jarod42's answer below.
Elision cannot occur with assignment, but move-from can.
std::set<int> some_function();
std::set<int> some_value;
// code
some_value = some_function();
Here some_function returns a candidate to elide from, but because it is not used to construct an object directly, it cannot be elided. In C++03, the above results in the contents of the temporary being copied into some_value. In C++11, it is moved into some_value, which is basically free.
For the full effect of the above, you need a compiler that synthesizes move constructors and assignment for you.
MSVC 2013 implements move constructors in std containers, but does not synthesize move constructors on your types.
So types containing std::vectors and similar do not get such improvements in MSVC2013, but will start getting them in MSVC2015.
clang and gcc have long since implemented implicit move constructors. Intel's 2013 compiler will support implicit generation of move constructors if you pass -Qoption,cpp,--gen_move_operations (they don't do it by default in an effort to be cross-compatible with MSVC2013).
If you have something like:
std::vector<int> foo(); // function declaration.
std::vector<int> v;
// some code
v = foo();
You get a copy in C++03, whereas you get a move assignment in C++11,
so you get a free optimization in that case.

Is the pass-by-value-and-then-move construct a bad idiom?

Since we have move semantics in C++, nowadays it is usual to do
void set_a(A a) { _a = std::move(a); }
The reasoning is that if a is an rvalue, the copy will be elided and there will be just one move.
But what happens if a is an lvalue? It seems there will be a copy construction and then a move assignment (assuming A has a proper move assignment operator). Move assignments can be costly if the object has too many member variables.
On the other hand, if we do
void set_a(const A& a) { _a = a; }
There will be just one copy assignment. Can we say this way is preferred over the pass-by-value idiom if we will pass lvalues?
Expensive-to-move types are rare in modern C++ usage. If you are concerned about the cost of the move, write both overloads:
void set_a(const A& a) { _a = a; }
void set_a(A&& a) { _a = std::move(a); }
or a perfect-forwarding setter:
template <typename T>
void set_a(T&& a) { _a = std::forward<T>(a); }
that will accept lvalues, rvalues, and anything else implicitly convertible to decltype(_a) without requiring extra copies or moves.
Despite requiring an extra move when setting from an lvalue, the idiom is not bad since (a) the vast majority of types provide constant-time moves and (b) copy-and-swap provides exception safety and near-optimal performance in a single line of code.
But what happens if a is an lvalue? It seems there will be a copy
construction and then a move assignment (assuming A has a proper move
assignment operator). Move assignments can be costly if the object has
too many member variables.
Problem well spotted. I wouldn't go as far as to say that the pass-by-value-and-then-move construct is a bad idiom but it definitely has its potential pitfalls.
If your type is expensive to move and / or moving it is essentially just a copy, then the pass-by-value approach is suboptimal. Examples of such types would include types with a fixed size array as a member: It may be relatively expensive to move and a move is just a copy. See also
Small String Optimization and Move Operations and
"Want speed? Measure." (by Howard Hinnant)
in this context.
The pass-by-value approach has the advantage that you only need to maintain one function but you pay for this with performance. It depends on your application whether this maintenance advantage outweighs the loss in performance.
The pass by lvalue and rvalue reference approach can lead to maintenance headaches quickly if you have multiple arguments. Consider this:
#include <vector>
using namespace std;
struct A { vector<int> v; };
struct B { vector<int> v; };
struct C {
A a;
B b;
C(const A& a, const B& b) : a(a), b(b) { }
C(const A& a, B&& b) : a(a), b(move(b)) { }
C( A&& a, const B& b) : a(move(a)), b(b) { }
C( A&& a, B&& b) : a(move(a)), b(move(b)) { }
};
If you have multiple arguments, you will have a permutation problem. In this very simple example, it is probably still not that bad to maintain these 4 constructors. However, already in this simple case, I would seriously consider using the pass-by-value approach with a single function
C(A a, B b) : a(move(a)), b(move(b)) { }
instead of the above 4 constructors.
So long story short, neither approach is without drawbacks. Make your decisions based on actual profiling information, instead of optimizing prematurely.
The current answers are quite incomplete. Instead, I will try to conclude based on the lists of pros and cons I find.
Short answer
In short, it may be OK, but sometimes bad.
This idiom, namely the unifying interface, has better clarity (both in conceptual design and implementation) compared to forwarding templates or different overloads. It is sometimes used with copy-and-swap (actually, as well as move-and-swap in this case).
Detailed analysis
The pros are:
It needs only one function for each parameter list.
Indeed it needs only one, not multiple ordinary overloads (or even 2^n overloads when you have n parameters, each of which can be unqualified or const-qualified).
Like within a forwarding template, parameters passed by value are compatible not only with const but also with volatile, which reduces the number of ordinary overloads even more.
Combined with the bullet above, you don't need 4^n overloads to serve the {unqualified, const, volatile, const volatile} combinations for n parameters.
Compared to a forwarding template, it can be a non-templated function as long as the parameters are not needed to be generic (parameterized through template type parameters). This allows out-of-line definitions instead of template definitions needed to be instantiated for each instance in each translation unit, which can make significant improvement to translation-time performance (typically, during both compiling and linking).
It also makes other overloads (if any) easier to implement.
If you have a forwarding template for a parameter object type T, it may still clash with overloads having a parameter const T& in the same position, because the argument can be an lvalue of type T, and the template instantiated with type T& (rather than const T&) can be preferred by the overload resolution rules when there is no other way to determine the best candidate. This inconsistency may be quite surprising.
In particular, consider a forwarding constructor template with one parameter of type P&& in a class C. How many times will you forget to exclude the instances where P&& refers to a possibly cv-qualified C by SFINAE (e.g. by adding typename = enable_if_t<!is_same<C, decay_t<P>>::value> to the template-parameter-list), to ensure it does not clash with the copy/move constructors (even when the latter are explicitly user-provided)?
Since the parameter is passed by value of a non-reference type, it can force the argument to be passed as a prvalue. This can make a difference when the argument is of a literal class type. Suppose a static constexpr data member of such a type is declared in some class without an out-of-class definition: when it is used as an argument to a parameter of lvalue reference type, the program may eventually fail to link, because the member is odr-used and there is no definition of it.
Note since ISO C++ 17 the rules of static constexpr data member have changed to introduce a definition implicitly, so the difference is not significant in this case.
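The forwarding-constructor clash from the pros list can be sketched as follows. Greedy and Guarded are hypothetical classes invented for this illustration, and the picked member only records which constructor ran. It is a sketch assuming C++14 or later (for enable_if_t and decay_t).

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical class with an unconstrained forwarding constructor.
struct Greedy {
    int picked = 0;  // 1 = forwarding template, 2 = copy constructor
    Greedy() = default;
    Greedy(const Greedy&) : picked(2) {}
    template <typename P>
    Greedy(P&&) : picked(1) {}
};

// Same class, but the template is disabled when P decays to Guarded itself.
struct Guarded {
    int picked = 0;
    Guarded() = default;
    Guarded(const Guarded&) : picked(2) {}
    template <typename P,
              typename = std::enable_if_t<
                  !std::is_same<Guarded, std::decay_t<P>>::value>>
    Guarded(P&&) : picked(1) {}
};

// Copying a non-const lvalue: the template deduces P = Greedy& and wins.
int greedy_from_lvalue() { Greedy g; Greedy h(g); return h.picked; }

// Copying through a const reference: exact tie, so the non-template wins.
int greedy_from_const() { Greedy g; const Greedy& cg = g; Greedy h(cg); return h.picked; }

// With the SFINAE guard the copy constructor is chosen as expected.
int guarded_from_lvalue() { Guarded g; Guarded h(g); return h.picked; }

// Other argument types still reach the forwarding template.
int guarded_from_int() { Guarded h(5); return h.picked; }
```

A by-value parameter of a concrete type has none of this overload-resolution subtlety, which is part of why the unifying interface is easier to get right.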
The cons are:
A unifying interface cannot replace copy and move constructors where the parameter object type is identical to the class. Otherwise, copy-initialization of the parameter would be infinite recursion, because it would call the unifying constructor, and the constructor would then call itself.
As mentioned by other answers, if the cost of a copy is not ignorable (cheap and predictable enough), you will almost always see degraded performance in calls where the copy is not needed, because copy-initialization of a unifying passed-by-value parameter unconditionally introduces a copy (either copied-to or moved-to) of the argument unless elided.
Even with mandatory elision since C++17, copy-initialization of a parameter object can hardly be removed freely, unless the implementation tries very hard to prove that the behavior is unchanged under the as-if rule instead of the dedicated copy elision rules applicable here, which might sometimes be impossible without whole-program analysis.
Likewise, the cost of destruction may not be ignorable either, particularly when non-trivial subobjects are taken into account (e.g. in the case of containers). The difference is that it applies not only to the copy-initialization introduced by copy construction, but also to that introduced by move construction. Making move cheaper than copy in constructors cannot improve the situation. The more copy-initialization costs, the more destruction cost you have to afford.
A minor shortcoming is that there is no way to tweak the interface in different ways as separate overloads can, for example, by specifying different noexcept-specifiers for the const& and && overloads.
OTOH, in this example, the unifying interface will usually give you a noexcept(false) copy + noexcept move if you specify noexcept, or always noexcept(false) when you specify nothing (or an explicit noexcept(false)). (Note that in the former case, noexcept does not prevent throwing during the copy, because that only occurs during evaluation of the arguments, which is outside the function body.) There is no further chance to tune them separately.
This is considered minor because it is not frequently needed in reality.
Even if such overloads are used, they are probably confusing by nature: different specifiers may hide subtle but important behavioral differences which are difficult to reason about. Why not different names instead of overloads?
Note the example of noexcept may be particularly problematic since C++17 because the noexcept-specification now affects the function type. (Some unexpected compatibility issues can be diagnosed by a Clang++ warning.)
Sometimes the unconditional copy is actually useful. Because composition of operations with strong-exception guarantee does not hold the guarantee in nature, a copy can be used as a transactional state holder when the strong-exception guarantee is required and the operation cannot be broken down as sequence of operations with no less strict (no-exception or strong) exception guarantee. (This includes the copy-and-swap idiom, although assignments are not recommended to be unified for other reasons in general, see below.) However, this does not mean the copy is otherwise unacceptable. If the intention of the interface is always to create some object of type T, and the cost of moving T is ignorable, the copy can be moved to the target without unwanted overhead.
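The remark about noexcept in the cons above (the copy happens while the arguments are evaluated, outside the function body) can be checked with a small sketch. ThrowsOnCopy and Holder are hypothetical types made up for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical type whose copy constructor always throws.
struct ThrowsOnCopy {
    ThrowsOnCopy() = default;
    ThrowsOnCopy(const ThrowsOnCopy&) { throw std::runtime_error("copy failed"); }
    ThrowsOnCopy(ThrowsOnCopy&&) noexcept = default;
    ThrowsOnCopy& operator=(ThrowsOnCopy&&) noexcept = default;
};

struct Holder {
    ThrowsOnCopy value;
    // The body only moves, so marking the function noexcept is accurate.
    void set(ThrowsOnCopy x) noexcept { value = std::move(x); }
};

// The by-value parameter is initialized in the caller's context, so the
// exception from the copy propagates normally instead of hitting the
// noexcept boundary and terminating the program.
bool copy_failure_reaches_caller() {
    Holder h;
    ThrowsOnCopy source;
    try {
        h.set(source);  // the copy of 'source' throws before set() is entered
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

So a noexcept by-value setter promises nothing about the copy; callers passing lvalues still have to expect exceptions at the call site.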
Conclusions
So for some given operations, here are suggestions on whether to use a unifying interface to replace them:
If not all of the parameter types match the unifying interface, or if there is a behavioral difference other than the cost of new copies among the operations being unified, there cannot be a unifying interface.
If the following conditions are not met for all parameters, there cannot be a unifying interface. (But it can still be broken down into differently named functions, delegating one call to another.)
For any parameter of type T, if a copy of each argument is needed for all operations, use unifying.
If both copy and move construction of T have ignorable cost, use unifying.
If the intention of the interface is always to create some object of type T, and the cost of the move construction of T is ignorable, use unifying.
Otherwise, avoid unifying.
Here are some examples where unifying should be avoided:
Assignment operations (including assignment to the subobjects thereof, typically with the copy-and-swap idiom) for T without ignorable cost in copy and move construction do not meet the criteria for unifying, because the intention of assignment is not to create (but to replace the content of) the object. The copied object will eventually be destroyed, which incurs unnecessary overhead. This is even more obvious for cases of self-assignment.
Insertion of values into a container does not meet the criteria, unless both the copy-initialization and the destruction have ignorable cost. If the operation fails (due to allocation failure, duplicate values and so on) after copy-initialization, the parameters have to be destroyed, which incurs unnecessary overhead.
Conditional creation of an object based on parameters will incur the overhead when it does not actually create the object (e.g. std::map::insert_or_assign-like container insertion, even apart from the failure cases above).
Note the accurate limit of "ignorable" cost is somewhat subjective because it eventually depends on how much cost can be tolerated by the developers and/or the users, and it may vary case by case.
Practically, I (conservatively) assume that any trivially copyable and trivially destructible type whose size is not more than one machine word (like a pointer) meets the criteria of ignorable cost in general. If the resulting code actually costs too much in such a case, it suggests that either the build tool is misconfigured or the toolchain is not ready for production.
Do profile if there is any further doubt on performance.
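The conservative rule of thumb above (trivially copyable, trivially destructible, no larger than a machine word) can be written down as a compile-time check. The trait name is_cheap_to_pass_by_value is hypothetical, and sizeof(void*) is used here as an assumed stand-in for "one machine word".

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical trait encoding the rule of thumb; not a standard facility.
template <typename T>
struct is_cheap_to_pass_by_value
    : std::integral_constant<bool,
          std::is_trivially_copyable<T>::value &&
          std::is_trivially_destructible<T>::value &&
          (sizeof(T) <= sizeof(void*))> {};

// Scalars and raw pointers qualify.
static_assert(is_cheap_to_pass_by_value<int>::value, "int is cheap");
static_assert(is_cheap_to_pass_by_value<const char*>::value, "pointer is cheap");

// Containers and strings do not: their copies may allocate.
static_assert(!is_cheap_to_pass_by_value<std::string>::value, "string is not");
static_assert(!is_cheap_to_pass_by_value<std::vector<int>>::value, "vector is not");
```

The threshold is deliberately strict; types that fail it are not necessarily expensive, they just need a closer look or a profile.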
Additional case study
There are some other well-known types preferred to be passed by value or not, depending on the conventions:
Types that need to preserve references by convention should not be passed by value.
A canonical example is the argument forwarding call wrapper defined in ISO C++, which is required to forward references. Note that in the caller position it may also preserve the reference with respect to the ref-qualifier.
An instance of this example is std::bind. See also the resolution of LWG 817.
Some generic code may directly copy some parameters. It may be even without std::move, because the cost of the copy is assumed to be ignorable and a move does not necessarily make it better.
Such parameters include iterators and function objects (except the case of argument forwarding caller wrappers discussed above).
Note the constructor template of std::function (but not the assignment operator template) also uses the pass-by-value functor parameter.
Types presumably having the cost comparable to pass-by-value parameter types with ignorable cost are also preferred to be pass-by-value. (Sometimes they are used as dedicated alternatives.) For example, instances of std::initializer_list and std::basic_string_view are more or less two pointers or a pointer plus a size. This fact makes them cheap enough to be directly passed without using references.
Some types are better not passed by value unless you do need a copy. There are different reasons.
Avoid copying by default, because the copy may be quite expensive, or at least it is not easy to guarantee that the copy is cheap without some inspection of the runtime properties of the value being copied. Containers are typical examples of this sort.
Without statically knowing how many elements a container holds, it is generally not safe (in the sense of a DoS attack, for example) to copy it.
A nested container (of other containers) will easily make the performance problem of copying worse.
Even empty containers are not guaranteed to be cheap to copy. (Strictly speaking, this depends on the concrete implementation of the container, e.g. the existence of the "sentinel" element for some node-based containers... But no, keep it simple, just avoid copying by default.)
Avoid copying by default, even when performance is of no interest at all, because there can be some unexpected side effects.
In particular, allocator-aware containers and some other types with similar treatment of allocators ("container semantics", in David Krauss' words) should not be passed by value: allocator propagation is just another big semantic can of worms.
A few other types depend on convention. For example, see GotW #91 for shared_ptr instances. (However, not all smart pointers are like that; observer_ptr is more like a raw pointer.)
For the general case where the value will be stored, pass-by-value alone is a good compromise.
For the case where you know that only lvalues will be passed (some tightly coupled code), it's unreasonable.
For the case where one suspects a speed improvement by providing both, first THINK TWICE, and if that didn't help, MEASURE.
Where the value will not be stored I prefer the pass by reference, because that prevents umpteen needless copy operations.
Finally, if programming could be reduced to unthinking application of rules, we could leave it to robots. So IMHO it's not a good idea to focus so much on rules. Better to focus on what the advantages and costs are, for different situations. Costs include not only speed, but also e.g. code size and clarity. Rules can't generally handle such conflicts of interest.
Pass by value, then move is actually a good idiom for objects that you know are movable.
As you mentioned, if an rvalue is passed, it'll either elide the copy, or be moved, then within the constructor it will be moved.
You could instead provide explicit const-reference and rvalue-reference overloads; however, it gets more complicated if you have more than one parameter.
Consider the example,
class Obj {
public:
Obj(std::vector<int> x, std::vector<int> y)
: X(std::move(x)), Y(std::move(y)) {}
private:
/* Our internal data. */
std::vector<int> X, Y;
}; // Obj
If you wanted to provide explicit versions, you would end up with 4 constructors like so:
class Obj {
public:
Obj(std::vector<int> &&x, std::vector<int> &&y)
: X(std::move(x)), Y(std::move(y)) {}
Obj(std::vector<int> &&x, const std::vector<int> &y)
: X(std::move(x)), Y(y) {}
Obj(const std::vector<int> &x, std::vector<int> &&y)
: X(x), Y(std::move(y)) {}
Obj(const std::vector<int> &x, const std::vector<int> &y)
: X(x), Y(y) {}
private:
/* Our internal data. */
std::vector<int> X, Y;
}; // Obj
As you can see, as you increase the number of parameters, the number of necessary constructors grows exponentially (2^n for n parameters).
If you don't have a concrete type but have a templatized constructor, you can use perfect-forwarding like so:
class Obj {
public:
template <typename T, typename U>
Obj(T &&x, U &&y)
: X(std::forward<T>(x)), Y(std::forward<U>(y)) {}
private:
std::vector<int> X, Y;
}; // Obj
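A usage sketch for the perfect-forwarding version: an lvalue argument is copied and an rvalue argument is moved, with a single constructor. The size accessors, the Sizes struct and the helper function are hypothetical additions for this illustration and are not part of the class above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Obj {
public:
    // Forwarding constructor: each argument keeps its value category.
    template <typename T, typename U>
    Obj(T&& x, U&& y)
        : X(std::forward<T>(x)), Y(std::forward<U>(y)) {}

    // Hypothetical accessors, added only so the result can be inspected.
    std::size_t x_size() const { return X.size(); }
    std::size_t y_size() const { return Y.size(); }

private:
    std::vector<int> X, Y;
};

// Hypothetical result holder for the demonstration below.
struct Sizes {
    std::size_t kept;  // size of the lvalue source after the call
    std::size_t x;
    std::size_t y;
};

Sizes build_from_lvalue_and_rvalue() {
    std::vector<int> keep{1, 2, 3};
    std::vector<int> give{4, 5};
    Obj o(keep, std::move(give));  // 'keep' is copied, 'give' is moved from
    return {keep.size(), o.x_size(), o.y_size()};
}
```

Note that a braced list such as {1, 2, 3} cannot be passed directly to this constructor, because template argument deduction does not work on braced lists; that is one of the usability costs compared to the by-value version.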
References:
Want Speed? Pass by Value
C++ Seasoning
I am answering myself because I will try to summarize some of the answers. How many moves/copies do we have in each case?
(A) Pass by value and move assignment construct, passing a X parameter. If X is a...
Temporary: 1 move (the copy is elided)
Lvalue: 1 copy 1 move
std::move(lvalue): 2 moves
(B) Pass by reference and copy assignment usual (pre C++11) construct. If X is a...
Temporary: 1 copy
Lvalue: 1 copy
std::move(lvalue): 1 copy
We can assume the three kinds of parameters are equally probable. So every 3 calls we have (A) 4 moves and 1 copy, or (B) 3 copies. I.e., on average, (A) 1.33 moves and 0.33 copies per call or (B) 1 copy per call.
If we come to a situation when our classes consist mostly of PODs, moves are as expensive as copies. So we would have about 1.67 copies (or moves) per call to the setter in case (A) and 1 copy in case (B).
We can say that in some circumstances (POD-based types), the pass-by-value-and-then-move construct is a very bad idea. It is about 67% slower and it depends on a C++11 feature.
On the other hand, if our classes include containers (which make use of dynamic memory), (A) should be much faster (except if we mostly pass lvalues).
Please, correct me if I'm wrong.
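The tally above can be checked with an instrumented type. This is a minimal sketch: Tracked, HolderA, HolderB, Tally and measure are hypothetical names, and the "temporary" case assumes the compiler elides the construction of the parameter (guaranteed since C++17, commonly done before).

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts its copy and move operations.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++moves; return *this; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// (A) pass by value, then move-assign.
struct HolderA {
    Tracked a;
    void set(Tracked x) { a = std::move(x); }
};

// (B) pass by const reference, then copy-assign.
struct HolderB {
    Tracked a;
    void set(const Tracked& x) { a = x; }
};

struct Tally {
    int copies;
    int moves;
};

// Resets the counters, runs the call, and reports what it cost.
template <typename F>
Tally measure(F&& call) {
    Tracked::copies = 0;
    Tracked::moves = 0;
    call();
    return {Tracked::copies, Tracked::moves};
}
```

Calling measure with a lambda around each setter call, for a temporary, an lvalue and std::move(lvalue), gives the per-case counts listed for (A) and (B).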
Readability in the declaration:
void foo1( A a ); // easy to read, but unless you see the implementation
// you don't know for sure if a std::move() is used.
void foo2( const A & a ); // longer declaration, but the interface shows
// that no copy is required on calling foo().
Performance:
A a;
foo1( a ); // copy + move
foo2( a ); // pass by reference + copy
Responsibilities:
A a;
foo1( a ); // caller copies, foo1 moves
foo2( a ); // foo2 copies
For typical inline code there is usually no difference when optimized.
But foo2() might do the copy only on certain conditions (e.g. insert into map if key does not exist), whereas for foo1() the copy will always be done.
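The conditional-copy point can be sketched with a map insertion. Payload, Table, add_by_value and add_by_ref are hypothetical names for this illustration; only copies are counted.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical type that counts copies only.
struct Payload {
    static int copies;
    Payload() = default;
    Payload(const Payload&) { ++copies; }
    Payload(Payload&&) noexcept {}
    Payload& operator=(const Payload&) { ++copies; return *this; }
    Payload& operator=(Payload&&) noexcept { return *this; }
};
int Payload::copies = 0;

using Table = std::map<int, Payload>;

// foo1 style: the caller has already copied 'p' before the key is checked.
bool add_by_value(Table& t, int key, Payload p) {
    if (t.count(key) != 0) return false;  // the copy was wasted
    t.emplace(key, std::move(p));
    return true;
}

// foo2 style: the copy is made only when the insertion really happens.
bool add_by_ref(Table& t, int key, const Payload& p) {
    if (t.count(key) != 0) return false;  // no copy at all
    t.emplace(key, p);
    return true;
}
```

When the key already exists, add_by_value has paid for a copy and a destruction that add_by_ref avoids entirely.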

Will RVO happen when returning std::pair?

A function needs to return two values to the caller. What is the best way to implement?
Option 1:
pair<U,V> myfunc()
{
...
return make_pair(getU(),getV());
}
pair<U,V> mypair = myfunc();
Option 1.1:
// Same defn
U u; V v;
tie(u,v) = myfunc();
Option 2:
void myfunc(U& u , V& v)
{
u = getU(); v= getV();
}
U u; V v;
myfunc(u,v);
I know that with Option 2 there are no copies/moves, but it looks ugly. Will any copies/moves occur in Options 1 and 1.1? Let's assume U and V are huge objects supporting both copy and move operations.
Q: Is it theoretically possible for any RVO/NRVO optimizations as per the standard? If yes, has gcc or any other compiler implemented yet?
Will RVO happen when returning std::pair?
Yes it can.
Is it guaranteed to happen?
No it is not.
C++11 standard: Section 12.8/31:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects.
Copy elision is not a guaranteed feature. It is an optimization compilers are allowed to perform whenever they can. There is nothing special w.r.t std::pair. If a compiler is good enough to detect an optimization opportunity it will do so. So your question is compiler specific but yes same rule applies to std::pair as to any other class.
While RVO is not guaranteed, in C++11 the function as you have defined it I believe MUST move-return at the very least, so I would suggest leaving the clearer definition rather than warping it to accept output-variables (Unless you have a specific policy for using them).
Also, even if this example did use RVO, your explicit use of make_pair creates a separate temporary pair, and the compiler has to elide it (guaranteed only since C++17) to avoid an extra move of each element. Changing it to return a brace-initialized expression initializes the return value directly:
return { getU(), getV() };
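To see what the brace-initialized return costs, here is a sketch with an instrumented element type. Big, Tally, option1 and option1_1 are hypothetical names, getU/getV are stand-ins for the question's functions, and the counts assume the compiler elides the returned prvalues (guaranteed since C++17).

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical "huge" type that counts its copy and move operations.
struct Big {
    static int copies;
    static int moves;
    Big() = default;
    Big(const Big&) { ++copies; }
    Big(Big&&) noexcept { ++moves; }
    Big& operator=(const Big&) { ++copies; return *this; }
    Big& operator=(Big&&) noexcept { ++moves; return *this; }
};
int Big::copies = 0;
int Big::moves = 0;

Big getU() { return Big{}; }
Big getV() { return Big{}; }

// The pair is constructed directly in the return slot.
std::pair<Big, Big> myfunc() { return {getU(), getV()}; }

struct Tally {
    int copies;
    int moves;
};

// Option 1: initialize a pair from the call.
Tally option1() {
    Big::copies = Big::moves = 0;
    std::pair<Big, Big> p = myfunc();
    (void)p;
    return {Big::copies, Big::moves};
}

// Option 1.1: assign into existing objects through std::tie.
Tally option1_1() {
    Big u, v;
    Big::copies = Big::moves = 0;
    std::tie(u, v) = myfunc();
    return {Big::copies, Big::moves};
}
```

In this sketch Option 1 moves each element once into the pair and nothing else, while Option 1.1 adds one move assignment per element on top of that; neither makes a copy.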
RVO or copy elision is dependent on the compiler, so if you want to avoid the call to the copy constructor regardless of RVO, the best option is to use pointers.
In our product we use pointers and Boost pointer containers to avoid the copy constructor, and this indeed gives a performance boost of around 10%.
Coming to your question,
In Option 1, U's and V's copy constructors will not be called for the return itself, since you are not returning U or V but a std::pair object. It is the pair's copy constructor that would be called, and most compilers will definitely use RVO here to avoid that.
If you need to do additional work on u and v after having created the pair, I find the following pattern pretty flexible in C++17:
pair<U,V> myfunc()
{
auto out = make_pair(getU(),getV());
auto& [u, v] = out;
// Work with u and v
return out;
}
This should be a pretty easy case for the compiler to apply named return value optimization.
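A complete version of that pattern might look like the sketch below, assuming C++17 for structured bindings. The concrete element types and the bodies of getU and getV are placeholders chosen for illustration, not the asker's real functions.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Placeholder producers standing in for the question's getU() and getV().
std::string getU() { return "header"; }
std::vector<int> getV() { return {1, 2, 3}; }

std::pair<std::string, std::vector<int>> myfunc() {
    auto out = std::make_pair(getU(), getV());
    auto& [u, v] = out;  // u and v are names for out.first and out.second

    // Work with u and v; the changes land directly in 'out'.
    u += "!";
    v.push_back(4);

    return out;  // a named local of the return type: eligible for NRVO
}
```

The reference in auto& matters: with plain auto the bindings would name elements of a copy of out, and the modifications would not show up in the returned pair.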