This question already has answers here:
What is std::move(), and when should it be used?
(9 answers)
Closed 3 years ago.
I am currently learning more about the C++11/14 features and wondering when to use std::move in function calls.
I know I should not use it when returning local variables, because this breaks return value optimisation, but I do not really understand where in function calls casting to a rvalue actually helps.
When a function accepts an rvalue reference, you have to provide an rvalue (either by already having a prvalue, or by using std::move to create an xvalue). E.g.
void foo(std::string&& s);
std::string s;
foo(s); // Compile-time error
foo(std::move(s)); // OK
foo(std::string{}); // OK
When a function accepts a value, you can use std::move to move-construct the function argument instead of copy-constructing. E.g.
void bar(std::string s);
std::string s;
bar(s); // Copies into the parameter `s`
bar(std::move(s)); // Moves into the parameter `s`
When a function accepts a forwarding reference, you can use std::move to allow the function to move the object further down the call stack. E.g.
template <typename T>
void pipe(T&& x)
{
sink(std::forward<T>(x));
}
std::string s;
pipe(s); // `std::forward` will do nothing
pipe(std::move(s)); // `std::forward` will move
pipe(std::string{}); // `std::forward` will move
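To see the three cases above in numbers, here is a minimal sketch. `Tracker`, `Counts`, `relay` (standing in for `pipe`) and the helper functions are names I made up for illustration; `sink` is assumed to take its argument by value so that the forwarded value category decides between copy and move.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

struct Counts { int copies; int moves; };

// Assumed to take by value, so the argument's value category matters.
void sink(Tracker) {}

template <typename T>
void relay(T&& x) {
    sink(std::forward<T>(x));  // lvalues stay lvalues, rvalues stay rvalues
}

Counts relay_lvalue() {
    Tracker::reset();
    Tracker t;
    relay(t);                  // T deduced as Tracker&: sink copies
    return {Tracker::copies, Tracker::moves};
}

Counts relay_xvalue() {
    Tracker::reset();
    Tracker t;
    relay(std::move(t));       // T deduced as Tracker: sink moves
    return {Tracker::copies, Tracker::moves};
}

// Quick self-check; call it from main().
void run_checks() {
    assert(relay_lvalue().copies == 1 && relay_lvalue().moves == 0);
    assert(relay_xvalue().copies == 0 && relay_xvalue().moves == 1);
}
```

So std::forward only passes on whatever the caller decided at the call site; the std::move there is what turns the copy into a move.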
When you have some substantial object, and you're passing it as an argument to a function (e.g. an API, or a container emplace operation), and you will no longer need it at the callsite, so you want to transfer ownership, rather than copying then "immediately" losing the original. That's when you move it.
void StoreThing(std::vector<int> v);
int main()
{
std::vector<int> v{1,2,3,4,5,6/*,.....*/};
StoreThing(v);
}
// Copies `v` then lets it go out of scope. Pointless!
versus:
void StoreThing(std::vector<int> v);
int main()
{
std::vector<int> v{1,2,3,4,5,6/*,.....*/};
StoreThing(std::move(v));
}
// Better! We didn't need `v` in `main` any more...
This happens automatically when returning local variables, if RVO hasn't been applied (and note that since C++17 such an "optimisation" is mandated when the returned expression is a prvalue, while eliding a named local remains optional; either way you're right to say that adding a "redundant" std::move in that case can actually be harmful).
Also it's pointless to std::move if you're passing something really small (particularly a non-class thing which cannot possibly have a move constructor, let alone a meaningful one!) or you know you're passing into a function that accepts its arguments const-ly; in that case it's up to you as to whether you want to save the added source code distraction of a std::move that won't do anything: on the surface it's wise, but in a template you may not be so sure.
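A small sketch of those two "pointless" cases, with made-up names (`Tracker`, `observe`, and the helper functions are not from any library): passing to a function that takes const& constructs nothing regardless of std::move, and "moving" a built-in type is just a copy.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Accepts its argument "const-ly": only a reference is bound.
void observe(const Tracker&) {}

// Number of Tracker constructions caused by the call itself.
int constructions_via_const_ref() {
    Tracker::reset();
    Tracker t;
    observe(std::move(t));  // std::move is only a cast; no constructor runs
    return Tracker::copies + Tracker::moves;
}

// A built-in type has no move constructor; the source keeps its value.
int source_after_moving_int() {
    int a = 5;
    int b = std::move(a);
    (void)b;
    return a;
}

// Quick self-check; call it from main().
void run_checks() {
    assert(constructions_via_const_ref() == 0);
    assert(source_after_moving_int() == 5);
}
```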
I think that a constructor with a universal reference parameter has better performance than one without a reference.
From cppreference (https://en.cppreference.com/w/cpp/utility/functional/function/function), we can see that the template constructor for std::function does not take a reference.
template< class F > function( F f );
Is it a mistake? If not, why doesn't the standard require the constructor to use a universal reference?
EDIT:
Let's think about two cases.
F is a built-in type, here a function pointer whose size is 4 bytes.
Using universal reference:
lvalue case: 4 bytes are copied.
rvalue case: 4 bytes are copied. (Will you apply a std::move to built-in type?)
Passing by value:
all cases: 4 bytes are copied.
F is a type with a non-trivial copy constructor, e.g. a lambda type which captures a std::string by value. This example stands for most cases, do you agree?
Using universal reference:
lvalue case: copy constructor is invoked once
rvalue case: move constructor is invoked once
Passing by value:
lvalue case: a move and a copy
rvalue case: a move
Is this classification complete? If so, we can come to the conclusion that a universal reference is no worse than passing by value. Then it returns to the original question.
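The counts in this classification can be checked with a sketch like the following. `Tracker`, `Counts`, `ByValue`, `ByForward` and the `measure_*` helpers are hypothetical names for illustration only. One caveat: the "rvalue case: a move" figure for pass-by-value holds for a temporary (whose construction into the parameter is typically elided); moving from a named object costs two moves, which is what is measured here.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

struct Counts { int copies; int moves; };

// Sink constructor taking its argument by value.
struct ByValue {
    Tracker stored;
    explicit ByValue(Tracker t) : stored(std::move(t)) {}
};

// Sink constructor taking a forwarding (universal) reference.
struct ByForward {
    Tracker stored;
    template <typename T>
    explicit ByForward(T&& t) : stored(std::forward<T>(t)) {}
};

template <typename Holder>
Counts measure_lvalue() {
    Tracker::reset();
    Tracker t;
    Holder h(t);
    (void)h;
    return {Tracker::copies, Tracker::moves};
}

template <typename Holder>
Counts measure_xvalue() {
    Tracker::reset();
    Tracker t;
    Holder h(std::move(t));
    (void)h;
    return {Tracker::copies, Tracker::moves};
}

// Quick self-check; call it from main().
void run_checks() {
    assert(measure_lvalue<ByValue>().copies == 1);    // copy into parameter
    assert(measure_lvalue<ByValue>().moves == 1);     // move into member
    assert(measure_xvalue<ByValue>().moves == 2);     // move, then move
    assert(measure_lvalue<ByForward>().copies == 1);  // one copy, no move
    assert(measure_xvalue<ByForward>().moves == 1);   // one move
}
```

So by value costs at most one extra move compared to the forwarding reference, never an extra copy.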
EDIT again:
Maybe lambdas without captures are the most common cases, where passing by value is actually passing nothing while passing by reference is passing a pointer. This may be the key. LWG 2774 is related to this question. See it in comments.
Because that constructor moves its argument, accepting a reference is pointless. This comes down to the usual advice on when to take values.
If you pass a primitive, like an int, passing by reference is a pessimisation. If you pass a complex type, like a std::string, you can already std::move it into the argument (and, if you don't, then that's because you wanted a copy anyway). You get the best of both worlds.
// Bog-standard choice between value and ref; original value only observed
void foo(const int& x) { cout << x << '\n'; }
void foo(const int x) { cout << x << '\n'; } // Better!
void foo(const std::string& x) { cout << x << '\n'; } // Better!
void foo(const std::string x) { cout << x << '\n'; }
// When we want to store
void storeACopy(int);
void foo(const int& x) { storeACopy(x); }
void foo(const int x) { storeACopy(x); } // Better!
void storeACopy(std::string);
void foo(const std::string& x) { storeACopy(x); } // Meh
void foo(std::string x) { storeACopy(std::move(x)); } // Cheap!
// So, best of both worlds:
template <typename T>
void foo(T x) { storeACopy(std::move(x)); }
// This performs a copy when needed, but allows callsite to
// move instead (i.e. total flexibility) *and* doesn't require silly
// refs-to-primitives
It also signals to the call site that the original object will not be modified unless you specifically yield ownership with std::move.
If instead the function took a reference, okay, you'd potentially save one move if you want to yield. But moves are supposed to be super-cheap, so we're not concerned about that. And if you didn't want to yield, then suddenly you have to go through the rigmarole of an object copy at the callsite. Ugh!
This pattern is key to making the most out of move semantics.
I think that a constructor with a universal reference parameter has better performance than one without a reference.
Why do you think that? Have you measured? What if we're talking about passing int? Reference (universal / forwarding or not) is not always a boost for performance.
Is it a mistake?
No, it's the standard convention to pass function objects by value. They are meant to be cheap to copy, and when implementing your own function objects you should keep that in mind.
If not, why doesn't the standard require the constructor to use a universal reference?
Ultimately, why would it? :)
This question already has answers here:
c++11 Return value optimization or move? [duplicate]
(4 answers)
Closed 5 years ago.
In this case
struct Foo {};
Foo meh() {
return std::move(Foo());
}
I'm pretty sure that the move is unnecessary, because the newly created Foo will be a prvalue (a temporary).
But what about cases like this?
struct Foo {};
Foo meh() {
Foo foo;
//do something, but knowing that foo can safely be disposed of
//but does the compiler necessarily know it?
//we may have references/pointers to foo. how could the compiler know?
return std::move(foo); //so here the move is needed, right?
}
There the move is needed, I suppose?
In the case of return std::move(foo); the move is superfluous because of 12.8/32:
When the criteria for elision of a copy operation are met or would be
met save for the fact that the source object is a function parameter,
and the object to be copied is designated by an lvalue, overload
resolution to select the constructor for the copy is first performed as
if the object were designated by an rvalue.
return foo; is a case of NRVO, so copy elision is permitted. foo is an lvalue. So the constructor selected for the "copy" from foo to the return value of meh is required to be the move constructor if one exists.
Adding move does have a potential effect, though: it prevents the move being elided, because return std::move(foo); is not eligible for NRVO.
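The effect can be observed with a counting type. This is only a sketch: `Tracker`, `Counts` and the function names are invented here, and whether NRVO actually fires in `make_plain` is up to the compiler, so the check only bounds the number of moves instead of fixing it.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

struct Counts { int copies; int moves; };

Tracker make_plain() {
    Tracker foo;
    return foo;             // NRVO candidate; otherwise an implicit move
}

Tracker make_with_move() {
    Tracker foo;
    return std::move(foo);  // not a plain name any more: the move cannot be elided
}

Counts count_plain() {
    Tracker::reset();
    Tracker r = make_plain();
    (void)r;
    return {Tracker::copies, Tracker::moves};
}

Counts count_with_move() {
    Tracker::reset();
    Tracker r = make_with_move();
    (void)r;
    return {Tracker::copies, Tracker::moves};
}

// Quick self-check; call it from main().
void run_checks() {
    assert(count_plain().copies == 0);
    assert(count_plain().moves <= 1);      // 0 when NRVO is applied
    assert(count_with_move().copies == 0);
    assert(count_with_move().moves >= 1);  // the explicit move always runs
}
```

Recent GCC and Clang versions also flag the second form with a pessimizing-move warning.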
As far as I know, 12.8/32 lays out the only conditions under which a copy from an lvalue can be replaced by a move. The compiler is not permitted in general to detect that an lvalue is unused after the copy (using DFA, say), and make the change on its own initiative. I'm assuming here that there's an observable difference between the two -- if the observable behavior is the same then the "as-if" rule applies.
So, to answer the question in the title, use std::move on a return value when you want it to be moved and it would not get moved anyway. That is:
you want it to be moved, and
it is an lvalue, and
it is not eligible for copy elision, and
it is not the name of a by-value function parameter.
Considering that this is quite fiddly and moves are usually cheap, you might like to say that in non-template code you can simplify this a bit. Use std::move when:
you want it to be moved, and
it is an lvalue, and
you can't be bothered worrying about it.
By following the simplified rules you sacrifice some move elision. For types like std::vector that are cheap to move you'll probably never notice (and if you do notice you can optimize). For types like std::array that are expensive to move, or for templates where you have no idea whether moves are cheap or not, you're more likely to be bothered worrying about it.
The move is unnecessary in both cases. In the second case, std::move is superfluous because you are returning a local variable by value, and the compiler will understand that since you're not going to use that local variable anymore, it can be moved from rather than being copied.
On a return value, if the return expression refers directly to the name of a local lvalue (i.e. at this point an xvalue) there is no need for the std::move. On the other hand, if the return expression is not the identifier, it will not be moved automatically, so for example, you would need the explicit std::move in this case:
T foo(bool which) {
T a = ..., b = ...;
return std::move(which? a : b);
// alternatively: return which ? std::move(a) : std::move(b);
}
When returning a named local variable or a temporary expression directly, you should avoid the explicit std::move. The compiler must (and will in the future) move automatically in those cases, and adding std::move might affect other optimizations.
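Here is a runnable sketch of the conditional-operator case, using an invented counting type (`Tracker`, `Counts`, `pick_copy`, `pick_move` and the `count_*` helpers are hypothetical names). Because `which ? a : b` is an lvalue expression and not the name of a local, the return copies unless std::move is spelled out.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

struct Counts { int copies; int moves; };

Tracker pick_copy(bool which) {
    Tracker a, b;
    return which ? a : b;                        // lvalue expression: copied
}

Tracker pick_move(bool which) {
    Tracker a, b;
    return which ? std::move(a) : std::move(b);  // xvalue: moved
}

Counts count_pick_copy() {
    Tracker::reset();
    Tracker r = pick_copy(true);
    (void)r;
    return {Tracker::copies, Tracker::moves};
}

Counts count_pick_move() {
    Tracker::reset();
    Tracker r = pick_move(true);
    (void)r;
    return {Tracker::copies, Tracker::moves};
}

// Quick self-check; call it from main().
void run_checks() {
    assert(count_pick_copy().copies == 1);
    assert(count_pick_move().copies == 0);
    assert(count_pick_move().moves >= 1);
}
```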
There are lots of answers about when it shouldn't be moved, but the question is "when should it be moved?"
Here is a contrived example of when it should be used:
std::vector<int> append(std::vector<int>&& v, int x) {
v.push_back(x);
return std::move(v);
}
i.e., when you have a function that takes an rvalue reference, modifies it, and then returns a copy of it. (In C++20 the behavior here changes.) Now, in practice, this design is almost always better:
std::vector<int> append(std::vector<int> v, int x) {
v.push_back(x);
return v;
}
which also allows you to take non-rvalue parameters.
Basically, if you have an rvalue reference within a function that you want to return by moving, you have to call std::move. If you have a local variable (be it a parameter or not), returning it implicitly moves (and this implicit move can be elided away, while an explicit move cannot). If you have a function or operation that takes local variables, and returns a reference to said local variable, you have to std::move to get move to occur (as an example, the ternary ?: operator).
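One way to check that neither `append` variant deep-copies the vector is to compare the heap buffer address before and after. This is a sketch with invented names (`append_ref`, `append_val`, `reuses_buffer_*`); it assumes that a moved std::vector with the default allocator hands over its buffer and keeps its capacity, which is how the mainstream standard libraries behave.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Rvalue-reference parameter: `v` is a named reference, so an explicit
// std::move is needed to move it into the return value (before C++20).
std::vector<int> append_ref(std::vector<int>&& v, int x) {
    v.push_back(x);
    return std::move(v);
}

// By-value parameter: returning it by name moves implicitly.
std::vector<int> append_val(std::vector<int> v, int x) {
    v.push_back(x);
    return v;
}

// True if the result still uses the caller's original heap buffer.
bool reuses_buffer_ref() {
    std::vector<int> v{1, 2, 3};
    v.reserve(16);                 // so push_back does not reallocate
    const int* before = v.data();
    std::vector<int> r = append_ref(std::move(v), 4);
    return r.data() == before && r.size() == 4;
}

bool reuses_buffer_val() {
    std::vector<int> v{1, 2, 3};
    v.reserve(16);
    const int* before = v.data();
    std::vector<int> r = append_val(std::move(v), 4);
    return r.data() == before && r.size() == 4;
}

// Quick self-check; call it from main().
void run_checks() {
    assert(reuses_buffer_ref());
    assert(reuses_buffer_val());
}
```

The by-value version additionally accepts lvalues (at the cost of one copy), which is why it is usually the better signature.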
A C++ compiler is free to use std::move(foo):
if it is known that foo is at the end of its lifetime, and
the implicit use of std::move won't have any effect on the semantics of the C++ code other than the semantic effects allowed by the C++ specification.
It depends on the optimization capabilities of the C++ compiler whether it is able to compute which transformations from f(foo); foo.~Foo(); to f(std::move(foo)); foo.~Foo(); are profitable in terms of performance or in terms of memory consumption, while adhering to the C++ specification rules.
Conceptually speaking, year-2017 C++ compilers, such as GCC 6.3.0, are able to optimize this code:
Foo meh() {
Foo foo(args);
foo.method(xyz);
bar();
return foo;
}
into this code:
void meh(Foo *retval) {
new (retval) Foo(args);
retval->method(xyz);
bar();
}
which avoids calling the copy-constructor and the destructor of Foo.
Year-2017 C++ compilers, such as GCC 6.3.0, are unable to optimize this code:
Foo meh_value() {
Foo foo(args);
Foo retval(foo);
return retval;
}
Foo meh_pointer() {
Foo *foo = get_foo();
Foo retval(*foo);
delete foo;
return retval;
}
into this code:
Foo meh_value() {
Foo foo(args);
Foo retval(std::move(foo));
return retval;
}
Foo meh_pointer() {
Foo *foo = get_foo();
Foo retval(std::move(*foo));
delete foo;
return retval;
}
which means that a year-2017 programmer needs to specify such optimizations explicitly.
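A sketch of the `meh_value` case with an invented counting type (`Tracker` and the function names are hypothetical): constructing a named object from a named lvalue always copies, and only the explicit std::move turns that into a move.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

Tracker meh_copy() {
    Tracker foo;
    Tracker retval(foo);             // copy: the compiler does not rewrite this
    return retval;
}

Tracker meh_move() {
    Tracker foo;
    Tracker retval(std::move(foo));  // move, because the programmer said so
    return retval;
}

int copies_in_meh_copy() {
    Tracker::reset();
    Tracker r = meh_copy();
    (void)r;
    return Tracker::copies;
}

int copies_in_meh_move() {
    Tracker::reset();
    Tracker r = meh_move();
    (void)r;
    return Tracker::copies;
}

// Quick self-check; call it from main().
void run_checks() {
    assert(copies_in_meh_copy() == 1);
    assert(copies_in_meh_move() == 0);
}
```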
std::move is totally unnecessary when returning from a function, and really gets into the realm of you -- the programmer -- trying to babysit things that you should leave to the compiler.
What happens when you std::move something out of a function that isn't a variable local to that function? You can say that you'll never write code like that, but what happens if you write code that's just fine, and then refactor it and absent-mindedly don't change the std::move. You'll have fun tracking that bug down.
The compiler, on the other hand, is mostly incapable of making these kinds of mistakes.
Also: Important to note that returning a local variable from a function does not necessarily create an rvalue or use move semantics.
See here.
This question already has answers here:
How to pass parameters correctly?
(5 answers)
Closed 8 years ago.
Ok, I'm thinking about the following C++ code:
void foo(std::string str) {
// do whatever
}
void foo(const char *c_str) {
foo(std::string(c_str));
}
I look at this code and think it needs to be rewritten to pass by reference. Basically, my fear is that the constructor will get called twice, once in the const char * version of foo and once again when the argument is passed to foo as a std::string, since it is set to pass by copy. My question is: am I right, or is g++ smart enough to take the constructor in the c string version and call it good? It seems like g++ wouldn't be able to do that but I'm just hoping someone who really knows can clarify it.
In theory two constructors (one to create the temporary, plus the copy constructor for the pass-by-copy) would be involved; in practice, the compiler is explicitly allowed to perform copy elision (C++11 §12.8 ¶31).
But you don't need the two overloads to begin with.
The normal way to go is to just have a version of that function that takes a const std::string &. If the caller already has an std::string, no copy is performed, since we are passing by reference. If instead it has a char *, a temporary std::string is created (since it has a non-explicit constructor from const char*) and is passed to the function (since const references can be bound to temporaries).
You can just idiomatically write
void foo(std::string const& str) {
// do whatever
}
No need for the overload, since you can implicitly construct a temporary:
foo("yes");
If you intend to store the value of the argument somewhere, you could take an rvalue reference:
void foo(std::string&& str) {
my_member = std::move(str);
}
But to avoid overload explosion, taking the argument by value is often a good middle ground:
How true is "Want Speed? Pass by value"
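A minimal sketch of that by-value middle ground. The class `Holder` and its accessor `value()` are made up for the example; `foo` and `my_member` follow the snippet above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that stores the string it is given.
class Holder {
public:
    // One overload serves literals, lvalues and rvalues alike.
    void foo(std::string str) { my_member = std::move(str); }
    const std::string& value() const { return my_member; }

private:
    std::string my_member;
};

// Quick self-check; call it from main().
void run_checks() {
    Holder h;

    h.foo("yes");              // temporary std::string built from the literal
    assert(h.value() == "yes");

    std::string s = "kept";
    h.foo(s);                  // lvalue: copied into the parameter, s untouched
    assert(s == "kept");
    assert(h.value() == "kept");

    h.foo(std::move(s));       // caller gives up s: moved all the way in
    assert(h.value() == "kept");
}
```

The price compared to a const&/&& overload pair is at most one extra move per call.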
Regardless of all this good advice about idiomatic parameter-passing, yes, the compiler can optimize away the spurious copies under the as-if rule (although the copy constructor is required to be accessible as if the copy were performed).
Since the temporary passed to foo is unnamed it seems like it would be a fairly simple optimization to construct directly into the parameter, eliminating the copy, although this isn't guaranteed by the standard (as no optimizations are).
More generally speaking, however, you should pass by constant reference unless your function would be taking a copy of the parameter itself already (perhaps to copy-and-mutate, for example).
Passing a lambda is really easy in c++11:
func( []( int arg ) {
// code
} ) ;
But I'm wondering, what is the cost of passing a lambda to a function like this? What if func passes the lambda to other functions?
void func( function< void (int arg) > f ) {
doSomethingElse( f ) ;
}
Is the passing of the lambda expensive? Since a function object can be assigned 0,
function< void (int arg) > f = 0 ; // 0 means "not init"
it leads me to think that function objects kind of act like pointers. But without use of new, then it means they might be like value-typed struct or classes, which defaults to stack allocation and member-wise copy.
How is a C++11 "code body" and group of captured variables passed when you pass a function object "by value"? Is there a lot of excess copying of the code body? Should I have to mark each function object passed with const& so that a copy is not made:
void func( const function< void (int arg) >& f ) {
}
Or do function objects somehow pass differently than regular C++ structs?
Disclaimer: my answer is somewhat simplified compared to the reality (I put some details aside) but the big picture is here. Also, the Standard does not fully specify how lambdas or std::function must be implemented internally (the implementation has some freedom) so, like any discussion on implementation details, your compiler may or may not do it exactly this way.
But again, this is a subject quite similar to VTables: the Standard doesn't mandate much but any sensible compiler is still quite likely to do it this way, so I believe it is worth digging into it a little. :)
Lambdas
The most straightforward way to implement a lambda is kind of an unnamed struct:
auto lambda = [](Args...) -> Return { /*...*/ };
// roughly equivalent to:
struct {
Return operator ()(Args...) { /*...*/ }
}
lambda; // instance of the unnamed struct
Just like any other class, when you pass its instances around you never have to copy the code, just the actual data (here, none at all).
Objects captured by value are copied into the struct:
Value v;
auto lambda = [=](Args...) -> Return { /*... use v, captured by value...*/ };
// roughly equivalent to:
struct Temporary { // note: we can't make it an unnamed struct any more since we need
// a constructor, but that's just a syntax quirk
const Value v; // note: capture by value is const by default unless the lambda is mutable
Temporary(Value v_) : v(v_) {}
Return operator ()(Args...) { /*... use v, captured by value...*/ }
}
lambda(v); // instance of the struct
Again, passing it around only means that you pass the data (v) not the code itself.
Likewise, objects captured by reference are referenced into the struct:
Value v;
auto lambda = [&](Args...) -> Return { /*... use v, captured by reference...*/ };
// roughly equivalent to:
struct Temporary {
Value& v; // note: capture by reference is non-const
Temporary(Value& v_) : v(v_) {}
Return operator ()(Args...) { /*... use v, captured by reference...*/ }
}
lambda(v); // instance of the struct
That's pretty much all when it comes to lambdas themselves (except the few implementation details I omitted, but which are not relevant to understanding how it works).
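To make the "roughly equivalent" claim concrete, here is a compilable sketch comparing a capturing lambda with a hand-written functor. `AddOffset`, `call_lambda` and `call_struct` are names I invented; the point is only that both carry the same data (one int) and the same call operator.

```cpp
#include <cassert>

// Hand-written counterpart of a lambda that captures `offset` by value.
struct AddOffset {
    int offset;                      // the captured data
    int operator()(int x) const {    // the "code body"
        return x + offset;
    }
};

int call_lambda(int offset, int x) {
    auto lambda = [offset](int v) { return v + offset; };
    auto copy = lambda;              // copies only the captured int
    return copy(x);
}

int call_struct(int offset, int x) {
    AddOffset functor{offset};
    AddOffset copy = functor;        // likewise copies only the int
    return copy(x);
}

// Quick self-check; call it from main().
void run_checks() {
    assert(call_lambda(10, 5) == 15);
    assert(call_struct(10, 5) == 15);
}
```

On common compilers sizeof of such a closure matches sizeof(int), but the standard leaves the layout to the implementation, so the sketch does not assert it.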
std::function
std::function is a generic wrapper around any kind of functor (lambdas, standalone/static/member functions, functor classes like the ones I showed, ...).
The internals of std::function are pretty complicated because they must support all those cases. Depending on the exact type of functor this requires at least the following data (give or take implementation details):
A pointer to a standalone/static function.
Or,
A pointer to a copy[see note below] of the functor (dynamically allocated to allow any type of functor, as you rightly noted it).
A pointer to the member function to be called.
A pointer to an allocator that is able to both copy the functor and itself (since any type of functor can be used, the pointer-to-functor should be void* and thus there has to be such a mechanism -- probably using polymorphism aka. base class + virtual methods, the derived class being generated locally in the template<class Functor> function(Functor) constructors).
Since it doesn't know beforehand which kind of functor it will have to store (and this is made obvious by the fact that std::function can be reassigned) then it has to cope with all possible cases and make the decision at runtime.
Note: I don't know where the Standard mandates it but this is definitely a new copy, the underlying functor is not shared:
int v = 0;
std::function<void()> f = [=]() mutable { std::cout << v++ << std::endl; };
std::function<void()> g = f;
f(); // 0
f(); // 1
g(); // 0
g(); // 1
So, when you pass a std::function around it involves at least those four pointers (and indeed on 64-bit GCC 4.7, sizeof(std::function<void()>) is 32, which is four 64-bit pointers) and optionally a dynamically allocated copy of the functor (which, as I already said, only contains the captured objects, you don't copy the code).
Answer to the question
what is the cost of passing a lambda to a function like this?[context of the question: by value]
Well, as you can see it depends mainly on your functor (either a hand-made struct functor or a lambda) and the variables it contains. The overhead compared to directly passing a struct functor by value is quite negligible, but it is of course much higher than passing a struct functor by reference.
Should I have to mark each function object passed with const& so that a copy is not made?
I'm afraid this is very hard to answer in a generic way. Sometimes you'll want to pass by const reference, sometimes by value, sometimes by rvalue reference so that you can move it. It really depends on the semantics of your code.
The rules concerning which one you should choose are a totally different topic IMO, just remember that they are the same as for any other object.
Anyway, you now have all the keys to make an informed decision (again, depending on your code and its semantics).
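For reference, here is a sketch of two common ways to accept a callable. The function names are hypothetical. The first takes a std::function by const reference, so a lambda argument is wrapped once in a temporary; the second is a template that receives the closure object itself with no wrapper at all.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Type-erased: any callable with a matching signature is wrapped in a
// std::function (possibly with a heap allocation for large captures).
int call_via_function(const std::function<int(int)>& f, int arg) {
    return f(arg);
}

// Templated: the closure type is deduced, nothing is wrapped.
template <typename F>
int call_via_template(F&& f, int arg) {
    return std::forward<F>(f)(arg);
}

int demo_function() {
    int offset = 10;
    return call_via_function([offset](int x) { return x + offset; }, 5);
}

int demo_template() {
    int offset = 10;
    return call_via_template([offset](int x) { return x + offset; }, 5);
}

// Quick self-check; call it from main().
void run_checks() {
    assert(demo_function() == 15);
    assert(demo_template() == 15);
}
```

Which one fits depends on whether the callee needs to store the callable (std::function is convenient there) or just call it.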
See also C++11 lambda implementation and memory model
A lambda-expression is just that: an expression. Once compiled, it results in a closure object at runtime.
5.1.2 Lambda expressions [expr.prim.lambda]
The evaluation of a lambda-expression results in a prvalue temporary
(12.2). This temporary is called the closure object.
The object itself is implementation-defined and may vary from compiler to compiler.
Here is the original implementation of lambdas in clang
https://github.com/faisalv/clang-glambda
If the lambda can be made as a simple function (i.e. it does not capture anything), then it is made exactly the same way. Especially as the standard requires it to be compatible with the old-style pointer-to-function with the same signature. [EDIT: it's not accurate, see discussion in comments]
For the rest it is up to the implementation, but I'd not worry ahead. The most straightforward implementation does nothing but carry the information around. Exactly as much as you asked for in the capture. So the effect would be the same as if you did it manually creating a class. Or use some std::bind variant.
Let's take the following method as an example:
void Asset::Load( const std::string& path )
{
// complicated method....
}
General use of this method would be as follows:
Asset exampleAsset;
exampleAsset.Load("image0.png");
Since we know most of the time the path is a temporary rvalue, does it make sense to add an rvalue version of this method? And if so, is this a correct implementation?
void Asset::Load( const std::string& path )
{
// complicated method....
}
void Asset::Load( std::string&& path )
{
Load(path); // call the above method
}
Is this a correct approach to writing rvalue versions of methods?
For your particular case, the second overload is useless.
With the original code, which has just one overload for Load, this function is called for lvalues and rvalues.
With the new code, the first overload is called for lvalues and the second is called for rvalues. However, the second overload calls the first one. At the end, the effect of calling one or the other implies that the same operation (whatever the first overload does) will be performed.
Therefore, the effects of the original code and the new code are the same but the first code is just simpler.
Deciding whether a function must take an argument by value, lvalue reference or rvalue reference depends very much on what it does. You should provide an overload taking rvalue references when you want to move the passed argument. There are several good references on move semantics out there, so I won't cover it here.
Bonus:
To help me make my point consider this simple probe class:
struct probe {
probe(const char* ) { std::cout << "ctr " << std::endl; }
probe(const probe& ) { std::cout << "copy" << std::endl; }
probe(probe&& ) { std::cout << "move" << std::endl; }
};
Now consider this function:
void f(const probe& p) {
probe q(p);
// use q;
}
Calling f("foo"); produces the following output:
ctr
copy
No surprises here: we create a temporary probe passing the const char* "foo". Hence the first output line. Then, this temporary is bound to p and a copy q of p is created inside f. Hence the second output line.
Now, consider taking p by value, that is, change f to:
void f(probe p) {
// use p;
}
The output of f("foo"); is now
ctr
Some will be surprised that in this case there's no copy! In general, if you take an argument by reference and copy it inside your function, then it's better to take the argument by value. In this case, instead of creating a temporary and copying it, the compiler can construct the argument (p in this case) directly from the input ("foo"). For more information, see Want Speed? Pass by Value. by Dave Abrahams.
There are two notable exceptions to this guideline: constructors and assignment operators.
Consider this class:
struct foo {
probe p;
foo(const probe& q) : p(q) { }
};
The constructor takes a probe by const reference and then copies it to p. In this case, following the guideline above doesn't bring any performance improvement and probe's copy constructor will be called anyway. However, taking q by value might create an overload resolution issue similar to the one with the assignment operator that I shall cover now.
Suppose that our class probe has a non-throwing swap method. Then the suggested implementation of its assignment operator (thinking in C++03 terms for the time being) is
probe& operator =(const probe& other) {
probe tmp(other);
swap(tmp);
return *this;
}
Then, according to the guideline above, it's better to write it like this
probe& operator =(probe tmp) {
swap(tmp);
return *this;
}
Now enter C++11 with rvalue references and move semantics. You decided to add a move assignment operator:
probe& operator =(probe&&);
Now calling the assignment operator on a temporary creates an ambiguity because both overloads are viable and neither is preferred over the other. To resolve this issue, use the original implementation of the assignment operator (taking the argument by const reference).
Actually, this issue is not particular to constructors and assignment operators and might happen with any function. (It's more likely that you will experience it with constructors and assignment operators though.) For instance, calling g("foo"); when g has the following two overloads raises the ambiguity:
void g(probe);
void g(probe&&);
Unless you're doing something other than calling the lvalue reference version of Load, you don't need the second function, as an rvalue will bind to a const lvalue reference.
Since we know most of the time the Path is a temporary rvalue, does it make sense to add an Rvalue version of this method?
Probably not... Unless you need to do something tricky inside Load() that requires a non-const parameter. For example, maybe you want to std::move(Path) into another thread. In that case it might make sense to use move semantics.
Is this a correct approach to writing rvalue versions of methods?
No, you should do it the other way around:
void Asset::load( const std::string& path )
{
auto path_copy = path;
load(std::move(path_copy)); // call the below method
}
void Asset::load( std::string&& path )
{
// complicated method....
}
It's generally a question of whether internally you will make a copy (explicitly, or implicitly) of the incoming object (provide T&& argument), or you will just use it (stick to [const] T&).
If your Load member function doesn't assign from the incoming string, you should simply provide void Asset::Load(const std::string& Path).
If you do assign from the incoming path, say to a member variable, then there's a scenario where it could be slightly more efficient to provide void Asset::Load(std::string&& Path) too, but you'd need a different implementation that assigns ala loaded_from_path_ = std::move(Path);.
The potential benefit is to the caller, in that with the && version they might receive the free-store region that had been owned by the member variable, avoiding a pessimistic delete[]ion of that buffer inside void Asset::Load(const std::string& Path) and possible re-allocation next time the caller's string is assigned to (assuming the buffer's large enough to fit its next value too).
In your stated scenario, you're usually passing in string literals; such callers will get no benefit from any && overload as there's no caller-owned std::string instance to receive the existing data member's buffer.
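A minimal sketch of the two-overload version described above, assuming Load only needs to remember the path. The accessor `path()` is made up for the example; `loaded_from_path_` follows the naming used above.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Asset {
public:
    // Lvalue callers: copy-assign, the caller's string is left untouched.
    void Load(const std::string& path) { loaded_from_path_ = path; }

    // Rvalue callers: move-assign, taking over the argument's buffer.
    void Load(std::string&& path) { loaded_from_path_ = std::move(path); }

    const std::string& path() const { return loaded_from_path_; }

private:
    std::string loaded_from_path_;
};

// Quick self-check; call it from main().
void run_checks() {
    Asset asset;
    std::string p = "textures/image0.png";

    asset.Load(p);                 // const& overload
    assert(p == "textures/image0.png");
    assert(asset.path() == "textures/image0.png");

    asset.Load("other.png");       // temporary std::string: && overload
    assert(asset.path() == "other.png");

    asset.Load(std::move(p));      // && overload; p is valid but unspecified after
    assert(asset.path() == "textures/image0.png");
}
```

If Load never stores the string, the && overload buys nothing and the single const& version is the one to keep.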
Here's what I do when trying to decide on the function signature
1. (const std::string& const_lvalue) argument is read only
2. (std::string& lvalue) I can modify the argument (usually put something in) so the change would be VISIBLE to the caller
3. (std::string&& rvalue) I can modify the argument (usually steal something from it) with zero consequences, since the caller would no longer see/use this argument (consider it self-destroyed after the function returns). An rvalue reference binds to a temporary object.
All three of them are "pass-by-reference", but they show different intentions. 2+3 are similar, they can both modify the argument but 2 wants the modification to be seen by the caller whereas 3 doesn't.
// (2) caller sees the changed argument
void ModifyInPlace(Foo& lvalue){
delete lvalue.data_pointer;
lvalue.data_pointer = nullptr;
}
// (3) move constructor, caller ignores the change to the argument
Foo(Foo&& rvalue)
{
this->data_pointer = rvalue.data_pointer;
rvalue.data_pointer = nullptr;
}
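Put together as something that compiles, cases (2) and (3) might look like the sketch below. The int payload, the value constructor, the destructor and the deleted copy operations are my additions so the example is self-contained; only `data_pointer`, `ModifyInPlace` and the move constructor come from the snippets above.

```cpp
#include <cassert>
#include <utility>

struct Foo {
    int* data_pointer;

    explicit Foo(int value) : data_pointer(new int(value)) {}

    // (3) move constructor: steal the pointer, leave the source empty.
    Foo(Foo&& rvalue) noexcept : data_pointer(rvalue.data_pointer) {
        rvalue.data_pointer = nullptr;
    }

    // Copying is disabled to keep single ownership in this sketch.
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;

    ~Foo() { delete data_pointer; }  // deleting nullptr is a no-op
};

// (2) the caller sees the change to its own object.
void ModifyInPlace(Foo& lvalue) {
    delete lvalue.data_pointer;
    lvalue.data_pointer = nullptr;
}

bool move_leaves_source_empty() {
    Foo a(42);
    Foo b(std::move(a));
    return a.data_pointer == nullptr && *b.data_pointer == 42;
}

bool modify_in_place_is_visible() {
    Foo a(1);
    ModifyInPlace(a);
    return a.data_pointer == nullptr;
}

// Quick self-check; call it from main().
void run_checks() {
    assert(move_leaves_source_empty());
    assert(modify_in_place_is_visible());
}
```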