Is the compiler allowed to eliminate the copy that is required for the by-value capture?
vector<Image> movie1;
apply( [movie1](){ return movie1.size(); } );
Is there any circumstance that the compiler does not need to copy movie1?
Maybe if the compiler could know that apply does not actually change movie1?
Or does it help that lambdas are const functors by default in any case?
Does it help at all that vector has a move constructor and move assign?
If yes, is it required to add these to Image as well, to prevent an expensive copy here?
Is there a difference in when and how a copy is required for by-value capture compared to by-value arguments, e.g. void operate(vector<Image> movie)?
I'm fairly sure it cannot.
Even if the outer function no longer explicitly uses the variable, moving the variable would change the semantics of destruction.
Having a move constructor for Image doesn't help: a vector can move or swap without moving its elements.
If the variable is read-only from this point forward, why not capture by reference? You could even create a const reference and capture that.
If the variable is not read-only, the copy is required. It doesn't matter whether the outer function or the lambda performs the modification, the compiler cannot allow that modification to become visible to the other.
The only difference I see between by-value capture and by-value argument passing is that the capture is named, it cannot be a temporary. So argument passing optimizations applicable to temporaries cannot be used.
There is always the "as-if" rule. As long as it looks as if the rules had been followed, the compiler can do whatever it likes. So for objects where the copy constructor and destructor have no side effects, and where no changes are made to the copy, or the original object isn't accessed afterwards (so no one will notice if we make changes to the object), the compiler could prove that eliminating the copy is legal under the "as-if" rule.
But other than that, no, it can't just eliminate the copy, as @Ben said. The "regular" copy elision rules don't cover this case.
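To make the difference between the capture modes concrete, here is a minimal sketch. The Image type with a copy counter and the apply helper are stand-ins invented for illustration (the question does not show them), and the last variant uses a C++14 init-capture, which goes beyond what the question asked about.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for Image that counts element copies.
struct Image {
    static int copies;
    Image() = default;
    Image(const Image&) { ++copies; }
    Image(Image&&) noexcept {}
};
int Image::copies = 0;

// Hypothetical apply: takes the callable by value and invokes it.
template <class F>
std::size_t apply(F f) { return f(); }

// By-value capture: the vector (and every Image in it) is copied.
int copies_with_value_capture() {
    std::vector<Image> movie1(3);
    Image::copies = 0;
    apply([movie1]() { return movie1.size(); });
    return Image::copies;
}

// Capture of a const reference: nothing is copied.
int copies_with_reference_capture() {
    std::vector<Image> movie1(3);
    const std::vector<Image>& view = movie1;
    Image::copies = 0;
    apply([&view]() { return view.size(); });
    return Image::copies;
}

// C++14 init-capture: the vector is moved into the closure,
// so movie1 must not be relied on afterwards.
int copies_with_move_capture() {
    std::vector<Image> movie1(3);
    Image::copies = 0;
    apply([m = std::move(movie1)]() { return m.size(); });
    return Image::copies;
}
```

Because the counter makes each copy observable, the compiler cannot drop the copy in the first variant under the "as-if" rule; the other two variants avoid it by construction.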
When should I declare my function as:
void foo(Widget w);
as opposed to:
void foo(Widget&& w);?
Assume this is the only overload (as in, I pick one or the other, not both, and no other overloads). No templates involved. Assume that the function foo requires ownership of the Widget (e.g. const Widget& is not part of this discussion). I'm not interested in any answer outside the scope of these circumstances. (See addendum at end of post for why these constraints are part of the question.)
The primary difference that my colleagues and I can come up with is that the rvalue reference parameter forces you to be explicit about copies. The caller is responsible for making an explicit copy and then passing it in with std::move when you want a copy. In the pass by value case, the cost of the copy is hidden:
//If foo is a pass by value function, calling + making a copy:
Widget x{};
foo(x); //Implicit copy
//Not shown: continues to use x locally
//If foo is a pass by rvalue reference function, calling + making a copy:
Widget x{};
//foo(x); //This would be a compiler error
auto copy = x; //Explicit copy
foo(std::move(copy));
//Not shown: continues to use x locally
Other than forcing people to be explicit about copying and changing how much syntactic sugar you get when calling the function, how else are these different? What do they say differently about the interface? Are they more or less efficient than one another?
Other things that my colleagues and I have already thought of:
The rvalue reference parameter means that you may move the argument, but does not mandate it. It is possible that the argument you passed in at the call site will be in its original state afterwards. It's also possible the function would eat/change the argument without ever calling a move constructor but assume that because it was an rvalue reference, the caller relinquished control. Pass by value, if you move into it, you must assume that a move happened; there's no choice.
Assuming no elisions, a single move constructor call is eliminated with pass by rvalue.
The compiler has better opportunity to elide copies/moves with pass by value. Can anyone substantiate this claim? Preferably with a link to gcc.godbolt.org showing optimized generated code from gcc/clang rather than a line in the standard. My attempt at showing this was probably not able to successfully isolate the behavior: https://godbolt.org/g/4yomtt
Addendum: why am I constraining this problem so much?
No overloads - if there were other overloads, this would devolve into a discussion of pass by value vs a set of overloads that include both const reference and rvalue reference, at which point the set of overloads is obviously more efficient and wins. This is well known, and therefore not interesting.
No templates - I'm not interested in how forwarding references fit into the picture. If you have a forwarding reference, you call std::forward anyway. The goal with a forwarding reference is to pass things as you received them. Copies aren't relevant because you just pass an lvalue instead. It's well known, and not interesting.
foo requires ownership of Widget (aka no const Widget&) - We're not talking about read-only functions. If the function was read-only or didn't need to own or extend the lifetime of the Widget, then the answer trivially becomes const Widget&, which again, is well known, and not interesting. I also refer you to why we don't want to talk about overloads.
What do rvalue usages say about an interface versus copying?
rvalue suggests to the caller that the function both wants to own the value and has no intention of letting the caller know of any changes it has made. Consider the following (I know you said no lvalue references in your example, but bear with me):
//Hello. I want my own local copy of your Widget that I will manipulate,
//but I don't want my changes to affect the one you have. I may or may not
//hold onto it for later, but that's none of your business.
void foo(Widget w);
//Hello. I want to take your Widget and play with it. It may be in a
//different state than when you gave it to me, but it'll still be yours
//when I'm finished. Trust me!
void foo(Widget& w);
//Hello. Can I see that Widget of yours? I don't want to mess with it;
//I just want to check something out on it. Read that one value from it,
//or observe what state it's in. I won't touch it and I won't keep it.
void foo(const Widget& w);
//Hello. Ooh, I like that Widget you have. You're not going to use it
//anymore, are you? Please just give it to me. Thank you! It's my
//responsibility now, so don't worry about it anymore, m'kay?
void foo(Widget&& w);
For another way of looking at it:
//Here, let me buy you a new car just like mine. I don't care if you wreck
//it or give it a new paint job; you have yours and I have mine.
void foo(Car c);
//Here are the keys to my car. I understand that it may come back...
//not quite the same... as I lent it to you, but I'm okay with that.
void foo(Car& c);
//Here are the keys to my car as long as you promise to not give it a
//paint job or anything like that
void foo(const Car& c);
//I don't need my car anymore, so I'm signing the title over to you now.
//Happy birthday!
void foo(Car&& c);
Now, if Widgets have to remain unique (as actual widgets in, say, GTK do) then the first option cannot work. The second, third and fourth options make sense, because there's still only one real representation of the data. Anyway, that's what those semantics say to me when I see them in code.
Now, as for efficiency: it depends. rvalue references can save a lot of time if Widget has a pointer to a data member whose pointed-to contents can be rather large (think an array). Since the caller used an rvalue, they're saying they don't care about what they're giving you anymore. So, if you want to move the caller's Widget's contents into your Widget, just take their pointer. No need to meticulously copy each element in the data structure their pointer points to. This can lead to pretty good improvements in speed (again, think arrays). But if the Widget class doesn't have any such thing, this benefit is nowhere to be seen.
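As a rough sketch of the "just take their pointer" idea, assume a Widget that owns a heap array. The members size_ and data_ and the two helper functions are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical Widget that owns a heap-allocated array.
class Widget {
public:
    explicit Widget(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~Widget() { delete[] data_; }

    // Copy: allocate a new buffer and copy every element.
    Widget(const Widget& other)
        : size_(other.size_), data_(new int[other.size_]) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    // Move: take the source's pointer and leave the source empty.
    Widget(Widget&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Assignment is left out to keep the sketch short.
    Widget& operator=(const Widget&) = delete;
    Widget& operator=(Widget&&) = delete;

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

// After a move, the new Widget holds the very same buffer.
bool move_reuses_buffer() {
    Widget a(1000);
    const int* original = a.data();
    Widget b(std::move(a));
    return b.data() == original && a.data() == nullptr;
}

// After a copy, both Widgets hold separate buffers of equal size.
bool copy_allocates_new_buffer() {
    Widget a(1000);
    Widget b(a);
    return b.data() != a.data() && b.size() == a.size();
}
```

The move costs two assignments regardless of the array size, while the copy is linear in the number of elements.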
Hopefully that gets at what you were asking; if not, I can perhaps expand/clarify things.
The rvalue reference parameter forces you to be explicit about copies.
Yes, that is a point in favor of pass-by-rvalue-reference.
The rvalue reference parameter means that you may move the argument, but does not mandate it.
Yes, that is a point in favor of pass-by-value.
But it also gives pass-by-rvalue-reference the opportunity to offer an exception guarantee: if foo throws, the widget's value is not necessarily consumed.
For move-only types (as std::unique_ptr), pass-by-value seems to be the norm (mostly for your second point, and first point is not applicable anyway).
EDIT: standard library contradicts my previous sentence, one of shared_ptr's constructor takes std::unique_ptr<T, D>&&.
For types which have both copy and move (such as std::shared_ptr), we can choose between consistency with the previous types and forcing callers to be explicit about copies.
Unless you want to guarantee there is no unwanted copy, I would use pass-by-value for coherency.
Unless you want guaranteed and/or immediate sink, I would use pass-by-rvalue.
For existing code base, I would keep consistency.
Unless the type is a move-only type you normally have an option to pass by reference-to-const and it seems arbitrary to make it "not part of the discussion" but I will try.
I think the choice partly depends on what foo is going to do with the parameter.
The function needs a local copy
Let's say Widget is an iterator and you want to implement your own std::next function. next needs its own copy to advance and then return. In this case your choice is something like:
Widget next(Widget it, int n = 1){
std::advance(it, n);
return it;
}
vs
Widget next(Widget&& it, int n = 1){
std::advance(it, n);
return std::move(it);
}
I think by-value is better here. From the signature you can see it is taking a copy. If the caller wants to avoid a copy they can do a std::move and guarantee the variable is moved from but they can still pass lvalues if they want to.
With pass-by-rvalue-reference the caller cannot guarantee that the variable has been moved from.
Move-assignment to a copy
Let's say you have a class WidgetHolder:
class WidgetHolder {
Widget widget;
//...
};
and you need to implement a setWidget member function. I'm going to assume you already have an overload that takes a reference-to-const:
void WidgetHolder::setWidget(const Widget& w) {
widget = w;
}
but after measuring performance you decide you need to optimize for r-values. You have a choice between replacing it with:
void WidgetHolder::setWidget(Widget w) {
widget = std::move(w);
}
Or overloading with:
void WidgetHolder::setWidget(Widget&& w) {
widget = std::move(w);
}
This one is a little bit more tricky. It is tempting to choose pass-by-value because it accepts both rvalues and lvalues, so you don't need two overloads. However, it unconditionally takes a copy, so you can't take advantage of any existing capacity in the member variable. The pass-by-reference-to-const and pass-by-rvalue-reference overloads use assignment without taking a copy, which might be faster.
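One way to see which operations each variant performs is to count them. The sketch below uses an instrumented Widget and two holder classes, HolderByValue and HolderOverloaded; all of these names are invented for the illustration and the counters say nothing about actual timing.

```cpp
#include <cassert>
#include <utility>

// Counters for the special member functions of the hypothetical Widget.
struct Counts {
    int copy_ctor = 0;
    int copy_assign = 0;
    int move_ctor = 0;
    int move_assign = 0;
};
Counts counts;

struct Widget {
    Widget() = default;
    Widget(const Widget&) { ++counts.copy_ctor; }
    Widget(Widget&&) noexcept { ++counts.move_ctor; }
    Widget& operator=(const Widget&) { ++counts.copy_assign; return *this; }
    Widget& operator=(Widget&&) noexcept { ++counts.move_assign; return *this; }
};

// Single setter taking its argument by value.
struct HolderByValue {
    Widget widget;
    void setWidget(Widget w) { widget = std::move(w); }
};

// Pair of overloads: reference-to-const and rvalue reference.
struct HolderOverloaded {
    Widget widget;
    void setWidget(const Widget& w) { widget = w; }
    void setWidget(Widget&& w) { widget = std::move(w); }
};

Counts lvalue_into_by_value_setter() {
    HolderByValue h; Widget x;
    counts = Counts{};
    h.setWidget(x);              // copy construction + move assignment
    return counts;
}

Counts lvalue_into_overloaded_setter() {
    HolderOverloaded h; Widget x;
    counts = Counts{};
    h.setWidget(x);              // one copy assignment, can reuse capacity
    return counts;
}

Counts rvalue_into_by_value_setter() {
    HolderByValue h; Widget x;
    counts = Counts{};
    h.setWidget(std::move(x));   // move construction + move assignment
    return counts;
}

Counts rvalue_into_overloaded_setter() {
    HolderOverloaded h; Widget x;
    counts = Counts{};
    h.setWidget(std::move(x));   // one move assignment
    return counts;
}
```

For an lvalue argument, the by-value setter always goes through a copy construction, whereas the overload set performs a copy assignment into the existing member, which is where a type like std::vector or std::string can reuse storage it already has.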
Move-construct a copy
Now let's say you are writing the constructor for WidgetHolder and, as before, you have already implemented a constructor that takes a reference-to-const:
WidgetHolder::WidgetHolder(const Widget& w) : widget(w) {
}
and as before you have measured performance and decided you need to optimize for rvalues. You have a choice between replacing it with:
WidgetHolder::WidgetHolder(Widget w) : widget(std::move(w)) {
}
Or overloading with:
WidgetHolder::WidgetHolder(Widget&& w) : widget(std::move(w)) {
}
In this case, the member variable cannot have any existing capacity since this is the constructor. You are move-constructing a copy. Also, constructors often take many parameters, so it can be quite a pain to write all the different permutations of overloads to optimize for rvalue references. So in this case it is a good idea to use pass-by-value, especially if the constructor takes many such parameters.
Passing unique_ptr
With unique_ptr the efficiency concerns are less important given that a move is so cheap and it doesn't have any capacity. More important is expressiveness and correctness. There is a good discussion of how to pass unique_ptr here.
When you pass by rvalue reference object lifetimes get complicated. If the callee does not move out of the argument, the destruction of the argument is delayed. I think this is interesting in two cases.
First, you have an RAII class
void fn(RAII &&);
RAII x{underlying_resource};
fn(std::move(x));
// later in the code
RAII y{underlying_resource};
When initializing y, the resource could still be held by x if fn doesn't move out of the rvalue reference. In the pass by value code, we know that x gets moved out of, and fn releases x. This is probably a case where you would want to pass by value, and the copy constructor would likely be deleted, so you wouldn't have to worry about accidental copies.
Second, if the argument is a large object and the function doesn't move out, the lifetime of the vector's data is longer than in the pass-by-value case.
vector<B> fn1(vector<A> &&x);
vector<C> fn2(vector<B> &&x);
vector<A> va; // large vector
vector<B> vb = fn1(std::move(va));
vector<C> vc = fn2(std::move(vb));
In the example above, if fn1 and fn2 don't move out of x, then you will end up with the data of all of the vectors still alive. If you instead pass by value, only the last vector's data will still be alive (assuming vector's move constructor clears the source vector).
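The lifetime effect can be shown with a small counting sketch. The Lock class below is a hypothetical RAII handle (it only bumps a counter instead of holding a real resource), and both callee functions deliberately do nothing with their argument.

```cpp
#include <cassert>
#include <utility>

// Hypothetical move-only RAII handle; 'held' counts live resources.
class Lock {
public:
    Lock() : owns_(true) { ++held; }
    Lock(Lock&& other) noexcept : owns_(other.owns_) { other.owns_ = false; }
    Lock(const Lock&) = delete;
    Lock& operator=(const Lock&) = delete;
    ~Lock() { if (owns_) --held; }
    static int held;
private:
    bool owns_;
};
int Lock::held = 0;

// Never moves out of its argument, so the caller's object keeps the resource.
void by_rvalue_ref(Lock&&) {}

// The parameter is a separate object; it releases the resource when destroyed.
void by_value(Lock) {}

int held_after_rvalue_ref_call() {
    Lock x;
    by_rvalue_ref(std::move(x));
    return Lock::held;   // x still owns the resource here
}

int held_after_by_value_call() {
    Lock x;
    by_value(std::move(x));
    return Lock::held;   // the resource was released with the parameter
}
```

With the rvalue-reference signature, whether the resource is still held after the call depends entirely on what the callee chose to do; with the by-value signature it is gone once the call expression has finished.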
One issue not mentioned in the other answers is the idea of exception-safety.
In general, if the function throws an exception, we would ideally like to have the strong exception guarantee, meaning that the call has no effect other than raising the exception. If pass-by-value uses the move constructor, then such an effect is essentially unavoidable. So an rvalue-reference argument may be superior in some cases. (Of course, there are various cases where the strong exception guarantee isn't achievable either way, as well as various cases where the no-throw guarantee is available either way. So this is not relevant in 100% of cases. But it's relevant sometimes.)
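A small sketch of that difference, using std::unique_ptr because its moved-from state is guaranteed to be null. The two sink functions and their fail flag are hypothetical and exist only to force an exception before the argument is used.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// By value: the caller's pointer has already been moved into p
// before the body runs, so a throw cannot undo that.
void sink_by_value(std::unique_ptr<int> p, bool fail) {
    if (fail) throw std::runtime_error("failed before using p");
    std::unique_ptr<int> mine = std::move(p);
}

// By rvalue reference: nothing is moved until the body decides to.
void sink_by_rvalue_ref(std::unique_ptr<int>&& p, bool fail) {
    if (fail) throw std::runtime_error("failed before using p");
    std::unique_ptr<int> mine = std::move(p);
}

bool caller_keeps_value_after_by_value_throw() {
    std::unique_ptr<int> p(new int(42));
    try {
        sink_by_value(std::move(p), true);
    } catch (const std::runtime_error&) {
    }
    return p != nullptr;
}

bool caller_keeps_value_after_rvalue_ref_throw() {
    std::unique_ptr<int> p(new int(42));
    try {
        sink_by_rvalue_ref(std::move(p), true);
    } catch (const std::runtime_error&) {
    }
    return p != nullptr && *p == 42;
}
```

The rvalue-reference version only keeps this property as long as the function throws before it moves out of p; once the move has happened, both signatures behave the same.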
Choosing between by-value and by-rvalue-ref, with no other overloads, is not meaningful.
With pass by value the actual argument can be an lvalue expression.
With pass by rvalue-ref the actual argument must be an rvalue.
If the function is storing a copy of the argument, then a sensible choice is between pass-by-value, and a set of overloads with pass-by-ref-to-const and pass-by-rvalue-ref. For an rvalue expression as actual argument the set of overloads can avoid one move. It's an engineering gut-feeling decision whether the micro-optimization is worth the added complexity and typing.
One notable difference is that if you move into a pass-by-value function:
void foo(Widget w);
foo(std::move(copy));
the compiler must generate a move-constructor call Widget(Widget&&) to create the value object. With pass-by-rvalue-reference no such call is needed, as the rvalue reference is passed directly to the function. Usually this does not matter, as move constructors are trivial (or defaulted) and are inlined most of the time.
(you can check it on gcc.godbolt.org -- in your example declare move constructor Widget(Widget&&); and it will show up in assembly)
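If you would rather count the calls than read assembly, a sketch like the following works. The Widget with a move counter is made up for the illustration, and both foo variants are assumed to keep the argument by moving it into a local object.

```cpp
#include <cassert>
#include <utility>

// Hypothetical Widget that counts move-constructor calls.
struct Widget {
    static int moves;
    Widget() = default;
    Widget(const Widget&) = default;
    Widget(Widget&&) noexcept { ++moves; }
};
int Widget::moves = 0;

// Takes ownership through a by-value parameter, then stores it.
void foo_by_value(Widget w) {
    Widget kept(std::move(w));
    (void)kept;
}

// Takes ownership through an rvalue reference, then stores it.
void foo_by_rvalue_ref(Widget&& w) {
    Widget kept(std::move(w));
    (void)kept;
}

int moves_with_by_value() {
    Widget copy;
    Widget::moves = 0;
    foo_by_value(std::move(copy));      // move into w, then move into kept
    return Widget::moves;
}

int moves_with_rvalue_ref() {
    Widget copy;
    Widget::moves = 0;
    foo_by_rvalue_ref(std::move(copy)); // only the move into kept
    return Widget::moves;
}
```

The extra move in the by-value case cannot be elided here, because std::move(copy) is an xvalue that refers to an existing object rather than a prvalue.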
So my rule of thumb is this:
if the object represents a unique resource (without copy semantics) I prefer to use pass-by-rvalue-reference,
otherwise if it logically makes sense to either move or copy the object, I use pass-by-value.
I recently attended a C++11 seminar and the following tidbit of advice was given.
when you have && and you are unsure, you will almost always use std::move
Could any one explain to me why you should use std::move as opposed to some alternatives and some cases when you should not use std::move?
First, there's probably a misconception in the question I'll address:
Whenever you see T&& t in code (and T is an actual type, not a template type), keep in mind that the value category of t is an lvalue (reference), not an rvalue (temporary) anymore. It's very confusing. The T&& merely means that t is constructed from an object that was an rvalue (see note 1), but t itself is an lvalue, not an rvalue. If it has a name (in this case, t) then it's an lvalue and won't automatically move, but if it has no name (the result of 3+4) then it is an rvalue and will automatically move into its result if it can. The type (in this case T&&) has almost nothing to do with the value category of the variable (in this case, an lvalue).
That being said, if you have T&& t written in your code, that means you have a reference to a variable that was a temporary, and it is OK to destroy it if you want to. If you need to access the variable multiple times, you do not want to std::move from it, or else it would lose its value. But the last time you access t it is safe to std::move its value to another T if you wish. (And 95% of the time, that's what you want to do.) All of this also applies to auto&& variables.
1. if T is a template type, T&& is a forwarding reference instead, in which case you use std::forward<T>(t) instead of std::move(t) the last time. See this question.
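Overload resolution makes the "a named T&& is an lvalue" point visible. In this sketch the classify overloads and the Category enum are invented for the demonstration.

```cpp
#include <cassert>
#include <string>
#include <utility>

enum class Category { Lvalue, Rvalue };

// Two overloads that report which kind of argument they received.
Category classify(const std::string&) { return Category::Lvalue; }
Category classify(std::string&&) { return Category::Rvalue; }

// t has type std::string&&, but the expression 't' is an lvalue.
Category pass_name_directly(std::string&& t) {
    return classify(t);
}

// std::move(t) turns the named reference back into an rvalue.
Category pass_with_move(std::string&& t) {
    return classify(std::move(t));
}
```

Calling pass_name_directly(std::string("x")) selects the const std::string& overload even though the caller passed a temporary, which is why the last use of t needs an explicit std::move.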
I found this article to be pretty enlightening on the subject of rvalue references in general. He mentions std::move towards the end. This is probably the most relevant quote:
We need to use std::move, from <utility> -- std::move is a way of
saying, "ok, honest to God I know I have an lvalue, but I want it to
be an rvalue." std::move does not, in and of itself, move anything; it
just turns an lvalue into an rvalue, so that you can invoke the move
constructor.
Say you have a move constructor that looks like this:
MyClass::MyClass(MyClass&& other): myMember(other.myMember)
{
// Whatever else.
}
When you use the statement other.myMember, the value that's returned is an lvalue. Thus the code uses the copy constructor to initialize this->myMember. But since this is a move constructor, we know that other is a temporary object, and therefore so are its members. So we really want to use the more-efficient move constructor to initialize this->myMember. Using std::move makes sure that the compiler treats other.myMember like an rvalue reference and calls the move constructor, as you'd want it to:
MyClass::MyClass(MyClass&& other): myMember(std::move(other.myMember))
{
// Whatever else.
}
Just don't use std::move on objects you need to keep around - move constructors are pretty much guaranteed to muck up any objects passed into them. That's why they're only used with temporaries.
Hope that helps!
When you have an object of type T&&, an rvalue, it means that this object is safe to be moved from, as no one else will depend on its internal state later.
As moving should never be more expensive than copying, you will almost always want to move it. And to move it, you have to use the std::move function.
When should you avoid std::move, even if it would be safe? I wouldn't use it in trivial examples, e.g.,:
int x = 0;
int y = std::move(x);
Beside that, I see no downsides. If it does not complicate the code, moving should be done whenever possible IMHO.
Another case where you don't want to use std::move is return values. The language guarantees that return values are (at least) moved, so you should not write
return std::move(x); // not recommended
(If you are lucky, return value optimization hits, which is even better than a move operation.)
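A counting sketch of the "at least moved" guarantee, with a made-up Tracked type. It only checks that no copy happens; whether you get a move or full elision depends on the compiler and language version.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returning a local by name: elided or moved, never copied.
Tracked make_plain() {
    Tracked x;
    return x;
}

// Returning a by-value parameter by name: moved, never copied.
Tracked pass_through(Tracked x) {
    return x;
}

int copies_when_returning_local() {
    Tracked::copies = 0;
    Tracked r = make_plain();
    (void)r;
    return Tracked::copies;
}

int copies_when_returning_parameter() {
    Tracked::copies = 0;
    Tracked r = pass_through(Tracked());
    (void)r;
    return Tracked::copies;
}
```

Writing return std::move(x); in make_plain would still avoid the copy, but it forces a move and rules out the elision, which is the reason it is not recommended.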
You can use move when you need to "transfer" the content of an object somewhere else, without doing a copy. It's also possible for an object to take the content of a temporary object without doing a copy, with std::move.
Read more on Rvalue references and move constructors from wikipedia.
I want to construct an object with another using rvalue.
class BigDataClass{
public:
BigDataClass(); //some default BigData
BigDataClass(BigDataClass&& anotherBigData);
private:
BigDataClass(BigDataClass& anotherBigData);
BigDataPtr m_data;
};
So now I want to do something like:
BigDataClass someData;
BigDataClass anotherData(std::move(someData));
So now anotherData gets rValue. It's an eXpiring Value in fact, so as http://en.cppreference.com/w/cpp/utility/move states compiler now has an opportunity to optimize the initialization of anotherData with moving someData to another.
In my opinion we can in fact get 2 different things:
Optimized approach: data moved. It's optimized, fast and we're happy
Nonoptimized approach: data not moved. We have to copy data from one object to another AND delete data from the first one (as far as I know, after changing an object to an rvalue we cannot use it anymore, because it no longer owns the data it held). In fact it can be even slower than initialization with an lvalue reference due to the deletion operation.
Can we really get so unoptimized way of data initialization?
You said:
So now anotherData gets rValue. It's an eXpiring Value in fact, so as http://en.cppreference.com/w/cpp/utility/move states compiler now has an opportunity to optimize the initialization of anotherData with moving someData to another.
Actually, what it stated was:
Code that receives such an xvalue has the opportunity to optimize away unnecessary overhead by moving data out of the argument, leaving it in a valid but unspecified state.
That is, it's the code that's responsible for optimization here, not the compiler. All std::move(someData) does is cast its argument to an rvalue reference. Given that BigDataClass has a constructor that takes an rvalue reference, that constructor is preferred, and that constructor will be the one that is called. There isn't any room for change here from the compiler's point of view. Thus the code will do whatever the BigDataClass(BigDataClass&&) constructor does.
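To illustrate that std::move only casts and that the class decides what happens, here is a sketch with two made-up types: CopyOnly has no move constructor, Movable has one.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Declaring a copy constructor suppresses the implicit move constructor.
struct CopyOnly {
    static int copies;
    CopyOnly() = default;
    CopyOnly(const CopyOnly&) { ++copies; }
};
int CopyOnly::copies = 0;

struct Movable {
    static int copies;
    static int moves;
    Movable() = default;
    Movable(const Movable&) { ++copies; }
    Movable(Movable&&) noexcept { ++moves; }
};
int Movable::copies = 0;
int Movable::moves = 0;

// std::move does no work of its own: it just yields an rvalue reference.
static_assert(
    std::is_same<decltype(std::move(std::declval<Movable&>())), Movable&&>::value,
    "std::move is only a cast");

// Without a move constructor, the rvalue binds to const CopyOnly& and is copied.
int copies_for_copy_only() {
    CopyOnly a;
    CopyOnly::copies = 0;
    CopyOnly b(std::move(a));
    (void)b;
    return CopyOnly::copies;
}

// With a move constructor, overload resolution must pick it.
int moves_for_movable() {
    Movable a;
    Movable::copies = 0;
    Movable::moves = 0;
    Movable b(std::move(a));
    (void)b;
    return Movable::moves;
}
```

So the "unoptimized" outcome from the question can only happen if the class author did not provide a move constructor, or wrote one that copies.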
Looks like you are confused with what is optimization and what is not. Using move constructor (when available) is not an optimization, it is mandated by standard. It is not that the compiler has this opportunity, it has to do this.
On the other hand, copy elision is an optimization which the compiler has an opportunity to perform. How reliable it is depends on your compiler (though they apply it pretty uniformly) and the actual function code.
You are thinking about what the optimizer can do with move semantics. By itself, simply nothing! You, the coder, have to implement the code that is the optimization compared with the constructor taking a const ref.
The question can go to the opposite:
If the compiler already knows that you have an rvalue which is passed as const ref to a constructor, the compiler is able to do the construction as if the value were generated in the constructor itself. Copy elision is done very often by up-to-date compilers. The question here (for me) is how much effort I should spend writing rvalue reference constructors to get the same result the compiler already produces for me on the fly.
OK, in C++11 you have a lot of opportunities to handle forwarding and moving in your algorithms. And yes, some benefit can be gained. But I see the benefit only for templated code where I need to move/forward some of the parameters to (meta)template functions.
And on the other hand: rvalue references must be handled with care, and the meaning of a valid but unspecified state raises questions for every user of your interface. See also: What can I do with a moved-from object?
What kind of optimizations does rvalue guarantee
Simply nothing. You have to implement it!
As I read some articles, rvalue references and move semantics are usually described together. However as I understand, rvalue references are just references to rvalues and have nothing to do on their own with move semantics. And move semantics could be implemented probably without even using rvalue references. So the question is, why move constructor/operator= use rvalue references? Was it just to make it easier to write the code?
Consider the problem. There are two basic move operations we want to support: move "construction" and move "assignment". I use quotations there because we don't necessarily have to implement them with constructors or move assignment operators; we could use something else.
Move "construction" means creating a new object by transferring the contents from an existing object, such that deleting the old object doesn't deallocate resources now used in the new one. Move "assignment" means taking a pre-existing object and transferring the contents from an existing object, such that deleting the old object doesn't deallocate resources now used in the new one.
OK, so these are the operations we want to do. Well, how to do it?
Take move "construction". While we don't have to implement this with a constructor call, we really want to. We don't want to force people to do two-stage move construction, even if it's behind some magical function call. So we want to be able to implement movement as a constructor. OK, fine.
Here's problem 1: constructors have no names. Therefore, you can only differentiate them based on argument types and overloading resolution. And we know that the move constructor for an object of type T must take an object of type T as a parameter. And since it only needs one argument, it therefore looks exactly like a copy constructor.
OK, so now we need some way to satisfy overloading. We could introduce some standard library type, a std::move_ref. It would be like std::reference_wrapper, but it would be a distinct type. Therefore, you could say that a move constructor is a constructor that takes a std::move_ref<T>. Alright, fine: problem solved.
Only not; we now have new problems. Consider this code:
std::string MakeAString() { return std::string("foo"); }
std::string data = MakeAString();
Ignoring elision, C++11's expression value category rules state that the result of calling a function that returns by value is a prvalue. And therefore, it will automatically be used by move constructors/assignment operators wherever possible. No need for std::move or the like.
To do it your way would require this:
std::string MakeAString() { return std::move(std::string("foo")); }
std::string data = std::move(MakeAString());
Both of those std::move calls would be needed to avoid copying. You have to move out of the temporary and into the return value, and then move out of the return value and into data (again, ignoring elision).
If you think that this is merely a minor annoyance, consider what else rvalue references buy us: perfect forwarding. Without the special reference-collapsing rules, you could not write a proper forwarding function that forwards copy and move semantics perfectly. std::move_ref would be a real C++ type; you couldn't just slap arbitrary rules like reference collapsing onto it like you can with rvalue references.
At the end of the day, you need some kind of language construct in place, not merely a library type. By making it a new kind of reference, you get to be able to define new rules for what can bind to that reference (and what cannot). And you get to define special reference-collapsing rules that make perfect forwarding possible.
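For reference, this is roughly what perfect forwarding looks like with the language feature as it exists. The relay template, the classify overloads and the Category enum are illustrative names, not anything from the question.

```cpp
#include <cassert>
#include <string>
#include <utility>

enum class Category { Lvalue, Rvalue };

// Overloads that report the value category they were called with.
Category classify(const std::string&) { return Category::Lvalue; }
Category classify(std::string&&) { return Category::Rvalue; }

// T&& on a deduced T is a forwarding reference: for an lvalue argument
// T is deduced as std::string& and T&& collapses to std::string&;
// for an rvalue argument T is std::string and T&& stays std::string&&.
template <class T>
Category relay(T&& value) {
    return classify(std::forward<T>(value));
}

Category relay_lvalue() {
    std::string s = "abc";
    return relay(s);
}

Category relay_xvalue() {
    std::string s = "abc";
    return relay(std::move(s));
}

Category relay_temporary() {
    return relay(std::string("abc"));
}
```

A single function template preserves both cases, which is the part that a plain library wrapper type without reference collapsing could not express.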
The connection is that it is safe to move from an rvalue (because (in the absence of casts) rvalues refer to objects that are at the end of their lifespans), so a constructor that takes an rvalue reference can be safely implemented by pilfering/moving from the referenced object.
From a C++-language point of view, this is the end of the connection, but the standard library further expands on this connection by consistently making construction from lvalues copy and construction from rvalues move, and by providing helper functions (such as std::move) which make it straightforward to choose whether to move or copy a particular object (by changing around the value category of the object in the expression that causes the copy/move).
Move semantics can be implemented without rvalue-references, but it would be a lot less neat. A number of problems would need to be solved:
How to capture an rvalue by non-const reference?
How to distinguish between a constructor that copies and a constructor that moves?
How to ensure that moves are used wherever they would be a safe optimization?
How to write generic code that works with both movable and copyable objects?
How many copies happen/object exist in the following, assuming that normal compiler optimizations are enabled:
std::vector<MyClass> v;
v.push_back(MyClass());
If it is not exactly 1 object creation and 0 copying, What can I do (including changes in MyClass) to achieve that, since it seems to me that that is all that should really be necessary?
If the constructor of MyClass has side-effects, then in C++03 the copy is not permitted to be elided. That's because the temporary object that's the source of the copy has been bound to a reference (the parameter of push_back).
If the copy constructor of MyClass has no side-effects then the compiler is permitted to optimize it away under the "as-if" rule. I think the only sensible way to determine whether it actually has done so with "normal optimizations" is to inspect the emitted code. Different people have different ideas what's normal, and a given compiler might be sensitive to the details of MyClass. My guess is that what this amounts to is whether or not the compiler (or linker) inlines everything in sight. If it does then it will probably optimize, if it doesn't then it won't. So even the size of the constructor code might be relevant, never mind what it does.
So I think the main thing you can do is to ensure that both the default and the copy constructor of MyClass have no side-effects and are available to be inlined. If they're not available then of course the compiler will assume that they could have side-effects and will do the copy. If link-time optimization is a normal compiler option for you, then you don't have to do much to make them available. Otherwise, if they're user-defined then do it in the header file that defines MyClass. You might be able to get away with the default constructor having certain kinds of side-effects: if the effects don't depend on the address of the temporary being different from the address of the vector element then "as-if" still applies.
In C++11 you have a move (that likewise must not be elided if it has side-effects), but you can use v.emplace_back() to avoid that. The move would call the move constructor of MyClass if it has one, otherwise the copy constructor, and everything I say above about "as-if" applies to moves. emplace_back() calls the no-args constructor to construct the vector element (or if you pass arguments to emplace_back then whatever constructor matches those args), which I think is exactly what you want.
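A quick way to check this on your own compiler is to give MyClass counting constructors, as in the sketch below. The counters are an assumption added for the demonstration; since they are observable side effects, the move in the push_back case cannot be optimized away here.

```cpp
#include <cassert>
#include <vector>

// Counters for the instrumented (hypothetical) MyClass.
struct Counts {
    int constructed = 0;
    int copied = 0;
    int moved = 0;
};
Counts counts;

struct MyClass {
    MyClass() { ++counts.constructed; }
    MyClass(const MyClass&) { ++counts.copied; }
    MyClass(MyClass&&) noexcept { ++counts.moved; }
};

// push_back(MyClass()): build a temporary, then move it into the vector.
Counts with_push_back() {
    std::vector<MyClass> v;
    v.reserve(1);            // rule out reallocation
    counts = Counts{};
    v.push_back(MyClass());
    return counts;
}

// emplace_back(): construct the element directly in the vector's storage.
Counts with_emplace_back() {
    std::vector<MyClass> v;
    v.reserve(1);
    counts = Counts{};
    v.emplace_back();
    return counts;
}
```

With a MyClass whose constructors have no side effects and are visible for inlining, an optimizer may well produce the same machine code for both variants, but only emplace_back gives "one construction, nothing else" by construction.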
You mean:
std::vector<MyClass> v;
v.push_back(MyClass());
None. The temporary will cause the move version of push_back to be called. Even the move construction will most likely be elided.
If you have a C++11 compiler, you can use emplace_back to construct the element at the end of the vector, zero copies necessary.
In C++03, you would have a construction and a copy, plus destruction of the temporary.
If your compiler supports C++11 and MyClass defines a move constructor, then you have one construction and a move.
As mentioned by Timbo, you can also use emplace_back to avoid the move, the object being constructed in-place.