Following on from a comment I made on this:
passing std::vector to constructor and move semantics
Is the std::move necessary in the following code, to ensure that the returned value is an xvalue?
std::vector<string> buildVector()
{
std::vector<string> local;
// .... build a vector
return std::move(local);
}
It is my understanding that this is required. I have often seen this used when returning a std::unique_ptr from a function, however GManNickG made the following comment:
It is my understanding that in a return statement all local variables are automatically xvalues (expiring values) and will be moved, but I'm unsure if that only applies to the returned object itself. So OP should go ahead and put that in there until I'm more confident it shouldn't have to be. :)
Can anyone clarify if the std::move is necessary?
Is the behaviour compiler dependent?
You're guaranteed that local will be returned as an rvalue in this situation. Usually compilers would perform return-value optimization though before this even becomes an issue, and you probably wouldn't see any actual move at all, since the local object would be constructed directly at the call site.
A relevant Note in 6.6.3 ["The return statement"] (2):
A copy or move operation associated with a return statement may be elided or considered as an rvalue for the purpose of overload resolution in selecting a constructor (12.8).
To clarify, this is to say that the returned object can be move-constructed from the local object (even though in practice RVO will skip this step entirely). The normative part of the standard is 12.8 ["Copying and moving class objects"] (31, 32), on copy elision and rvalues (thanks @Mankarse!).
Here's a silly example:
#include <utility>
struct Foo
{
Foo() = default;
Foo(Foo const &) = delete;
Foo(Foo &&) = default;
};
Foo f(Foo & x)
{
Foo y;
// return x; // error: use of deleted function ‘Foo::Foo(const Foo&)’
return std::move(x); // OK
return std::move(y); // OK
return y; // OK (!!)
}
Contrast this with returning an actual rvalue reference:
Foo && g()
{
Foo y;
// return y; // error: cannot bind ‘Foo’ lvalue to ‘Foo&&’
return std::move(y); // OK type-wise (but undefined behaviour, thanks @GMNG)
}
Although both return std::move(local) and return local work in the sense that they compile, their behavior is different, and probably only the latter was intended.
If you write a function which returns a std::vector<string>, you have to return a std::vector<string> and exactly that. std::move(local) has the type std::vector<string>&&, which is not a std::vector<string>, so it has to be converted to one using the move constructor.
The standard says in 6.6.3.2:
The value of the expression is implicitly
converted to the return type of the function in which it appears.
That means, return std::move(local) is equivalent to
std::vector<std::string> converted(std::move(local)); // move constructor
return converted; // not yet a copy constructor call (which will be elided anyway)
whereas return local only is
return local; // not yet a copy constructor call (which will be elided anyway)
This spares you one operation.
To give you a short example of what that means:
#include <iostream>
struct test {
test() { std::cout << " construct\n"; }
test(const test&) { std::cout << " copy\n"; }
test(test&&) { std::cout << " move\n"; }
};
test f1() { test t; return t; }
test f2() { test t; return std::move(t); }
int main()
{
std::cout << "f1():\n"; test t1 = f1();
std::cout << "f2():\n"; test t2 = f2();
}
This will output
f1():
construct
f2():
construct
move
I think the answer is no. Though officially only a note, §5/6 summarizes what expressions are/aren't xvalues:
An expression is an xvalue if it is:
the result of calling a function, whether implicitly or explicitly, whose return type is an rvalue reference to object type,
a cast to an rvalue reference to object type,
a class member access expression designating a non-static data member of non-reference type in which the object expression is an xvalue, or
a .* pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
In general, the effect of this rule is that named rvalue references are treated as lvalues and unnamed rvalue references to objects are treated as xvalues; rvalue references to functions are treated as lvalues whether named or not.
The first bullet point seems to apply here. Since the function in question returns a value rather than an rvalue reference, the result won't be an xvalue.
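The category rules above can be checked at compile time, because decltype((expr)) yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. Here is a minimal sketch, assuming nothing beyond the standard library; Cat, cat_of, CATEGORY_OF, make_string and check_categories are hypothetical names introduced only for illustration:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helpers, not part of the standard or of the posts above.
enum class Cat { prvalue, lvalue, xvalue };

// Maps the type produced by decltype((expr)) to a value category.
template <class T>
constexpr Cat cat_of() {
    return std::is_lvalue_reference<T>::value   ? Cat::lvalue
           : std::is_rvalue_reference<T>::value ? Cat::xvalue
                                                : Cat::prvalue;
}

// The double parentheses matter: decltype((x)) inspects the expression x,
// not the declared type of the variable x.
#define CATEGORY_OF(expr) (cat_of<decltype((expr))>())

inline std::string make_string() { return "value"; }

inline void check_categories() {
    std::string local;
    static_assert(CATEGORY_OF(local) == Cat::lvalue,
                  "naming a variable gives an lvalue");
    static_assert(CATEGORY_OF(std::move(local)) == Cat::xvalue,
                  "a call returning T&& gives an xvalue");
    static_assert(CATEGORY_OF(make_string()) == Cat::prvalue,
                  "a call returning T gives a prvalue");
    (void)local;
}
```

So a call to a function that returns by value, like make_string() here or buildVector() in the question, is a prvalue at the call site, not an xvalue.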
In the example snippet code below, the local variable which is used in return statement doesn't convert to r-value implicitly to match the conversion operator. However for move constructor it works.
I want to know whether it is a standard behavior or a bug. And if it is a standard behavior, what is the reason?
I tested it in Microsoft Visual Studio 2019 (Version 16.8.3) in 'permissive-' mode and it produced a compiler error. But in 'permissive' mode, it was OK.
#include <string>
class X
{
std::string m_str;
public:
X() = default;
X(X&& that)
{
m_str = std::move(that.m_str);
}
operator std::string() &&
{
return std::move(m_str);
}
};
X f()
{
X x;
return x;
}
std::string g()
{
X x;
return x; // Conformance mode: Yes (/permissive-) ==> error C2440: 'return': cannot convert from 'X' to 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>'
//return std::move(x); // OK
// return X{}; // OK
}
int main()
{
f();
g();
return 0;
}
The reason that f works under the C++11 standard (link is to a close-enough draft) is this clause
[class.copy]/32
When the criteria for elision of a copy operation are met or would be met save for the fact that the source
object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to
select the constructor for the copy is first performed as if the object were designated by an rvalue. ...
And the "criteri[on] for elision of a copy operation" that is relevant in this case is
[class.copy]/31.1
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
This works for f, since x in return x is the "name of a non-volatile automatic object ... with the same cv-unqualified type as the function return type"; that type is X. This does not work for g, since the return type std::string is not the type X of the object named by x.
I think it might be important to understand why this rule is here in the first place. This rule isn't really about implicitly moving function-local variables into function return values, even though that's what it literally says. It's about making NRVO possible. Consider what you would have to write for f without these rules:
X f() {
X x;
return std::move(x);
}
But then NRVO cannot apply, since you aren't returning a variable; you're returning the result of a function call! So the clause [class.copy]/32 is about making your code
X f() {
X x;
return x;
}
syntactically legal, while the semantics as described by the clause (using a move constructor) are to be ignored (assuming your implementation isn't too stupid) because we're actually just going to do NRVO, which doesn't call anything.
You see that, really, [class.copy]/32 doesn't have to work for g. Its purpose in f is to make it possible to execute zero copy/move constructors. But g has to execute the conversion operator; there's no other sensible way for the language to pull out a std::string when you give it an X. So NRVO cannot apply in g, so there's no need to write return x;, so you can just write
std::string g() {
X x;
return std::move(x);
}
and not be worried that will cause a missed optimization.
We see that the C++11 rule [class.copy]/32 is designed so that it affects the minimal portion of the cases possible. It applies to those cases where we'd like NRVO but don't have a copy constructor, and makes NRVO possible by telling us to pretend we'll call the move constructor. But when actually writing code, that means it's a mind-twister of a rule to remember: "to minimize copies/moves, if the return type is the same as the type of the variable, return the_variable;, and otherwise return std::move(the_variable)." That's why the C++20 standard completely rephrases [class.copy]/32 into
[class.copy.elision]/3
An implicitly movable entity is a variable of automatic storage duration that is either a non-volatile object or an rvalue reference to a non-volatile object type. In the following copy-initialization contexts, a move operation is first considered before attempting a copy operation:
If the expression in a return ([stmt.return]) or co_return ([stmt.return.coroutine]) statement is a (possibly parenthesized) id-expression that names an implicitly movable entity declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, or
...
overload resolution to select the constructor for the copy or the return_value overload to call is first performed as if the expression or operand were an rvalue. ...
This does not require that the return type be the same as the variable's type for an implicit move; it can be summed up as the conceptually simpler rule "returning a variable tries to move, and then tries to copy". That leads to the conceptually simpler principle "When returning a variable from a function, just return the_variable; and it will do the right thing". (Of course, none of GCC, Clang, or MSVC seem to have gotten the memo. That has to be some kind of record...)
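For code that has to build under both the C++11 rule and the C++20 rule, the explicit std::move spelling is the portable one when the types differ. Here is a minimal sketch of the two cases side by side. Token, make_token and take_text are hypothetical names that mirror X, f and g from the question, with a payload added so the result can be observed:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for X from the question.
class Token {
    std::string text_;
public:
    explicit Token(std::string text) : text_(std::move(text)) {}
    // Only callable on rvalues, like X::operator std::string() &&.
    operator std::string() && { return std::move(text_); }
};

// Variable type == return type: plain `return t;` keeps NRVO possible
// and otherwise falls back to the move constructor.
Token make_token() {
    Token t("hello");
    return t;
}

// Variable type != return type: NRVO cannot apply anyway, so an explicit
// std::move costs nothing and selects the &&-qualified conversion
// operator under every standard version.
std::string take_text() {
    Token t("hello");
    return std::move(t);
}
```

Whether a plain return t; is also accepted in take_text depends on which version of the rule the compiler implements, which is exactly the difference the question ran into.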
Let’s say I have the class MyClass with a correct move constructor and whose copy constructor is deleted. Now I am returning this class like this:
MyClass func()
{
return MyClass();
}
In this case the move constructor gets called when returning the class object and everything works as expected.
Now let’s say MyClass has an implementation of the << operator:
MyClass& operator<<(MyClass& target, const int& source)
{
target.add(source);
return target;
}
When I change the code above:
MyClass func()
{
return MyClass() << 5;
}
I get the compiler error, that the copy constructor cannot be accessed because it is deleted. But why is the copy constructor being used at all in this case?
Now I am returning this class via lvalue like this:
MyClass func()
{
return MyClass();
}
No, the returned expression is a prvalue (a kind of rvalue), used to initialise the result for return-by-value (things are a little more complicated since C++17, but this is still the gist of it; besides, you're on C++11).
In this case the move constructor gets called when returning the class object and everything works as expected.
Indeed; an rvalue will initialise an rvalue reference and thus the whole thing can match move constructors.
When I change the code above:
… now the expression is MyClass() << 5, which has type MyClass&. This is never an rvalue. It's an lvalue. It's an expression that refers to an existing object.
So, without an explicit std::move, that'll be used to copy-initialise the result. And, since your copy constructor is deleted, that can't work.
I'm surprised the example compiles at all, since a temporary can't be used to initialise an lvalue reference (your operator's first argument), though some toolchains (MSVS) are known to accept this as an extension.
then would return std::move(MyClass() << 5); work?
Yes, I believe so.
However that is very strange to look at, and makes the reader double-check to ensure there are no dangling references. This suggests there's a better way to accomplish this that results in clearer code:
MyClass func()
{
MyClass m;
m << 5;
return m;
}
Now you're still getting a move (because that's a special rule when returning local variables) without any strange antics. And, as a bonus, the << call is completely standard-compliant.
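To make that concrete, here is a minimal sketch with a move-only type. Bag and make_bag are hypothetical names standing in for MyClass and func, and the vector member is an assumption added only so there is something to observe:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for MyClass: movable, but not copyable.
class Bag {
    std::vector<int> items_;
public:
    Bag() = default;
    Bag(const Bag&) = delete;
    Bag(Bag&&) = default;
    void add(int value) { items_.push_back(value); }
    std::size_t size() const { return items_.size(); }
};

// Same shape as the operator<< in the question: takes and returns an lvalue.
Bag& operator<<(Bag& target, int source) {
    target.add(source);
    return target;
}

Bag make_bag() {
    Bag b;
    b << 5 << 7;  // binds to a named object, so no extension is needed
    return b;     // named local: elided or moved, never copied
}

static_assert(!std::is_copy_constructible<Bag>::value,
              "Bag is move-only, so a copy here would not compile");
```

Since the copy constructor is deleted, the fact that make_bag compiles at all shows that the return statement is not asking for a copy.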
Your operator returns by MyClass&. So you are returning an lvalue, not an rvalue that can be moved automatically.
You can avoid the copy by relying on the standard guarantees regarding NRVO.
MyClass func()
{
MyClass m;
m << 5;
return m;
}
This will either elide the object entirely, or move it. All on account of it being a function local object.
Another option, seeing as you are trying to call operator<< on an rvalue, is to supply an overload dealing in rvalue references.
MyClass&& operator<<(MyClass&& target, int i) {
target << i; // Reuse the operator you have, here target is an lvalue
return std::move(target);
}
That will make MyClass() << 5 itself well formed (see the other answer for why it isn't), and return an xvalue from which the return object may be constructed. Though such an overload for operator<< is not commonly seen.
Your operator<< takes its first parameter as a non-const reference. You can't bind a non-const reference to a temporary. But MyClass() returns the newly-created instance as a temporary.
Also, while func returns a value, operator<< returns a reference. So what else can it do but make a copy to return?
Why is the copy constructor called when returning from bar instead of the move constructor?
#include <iostream>
using namespace std;
class Alpha {
public:
Alpha() { cout << "ctor" << endl; }
Alpha(Alpha &) { cout << "copy ctor" << endl; }
Alpha(Alpha &&) { cout << "move ctor" << endl; }
Alpha &operator=(Alpha &) { cout << "copy asgn op" << endl; return *this; }
Alpha &operator=(Alpha &&) { cout << "move asgn op" << endl; return *this; }
};
Alpha foo(Alpha a) {
return a; // Move ctor is called (expected).
}
Alpha bar(Alpha &&a) {
return a; // Copy ctor is called (unexpected).
}
int main() {
Alpha a, b;
a = foo(a);
a = foo(Alpha());
a = bar(Alpha());
b = a;
return 0;
}
If bar does return move(a) then the behavior is as expected. I do not understand why a call to std::move is necessary given that foo calls the move constructor when returning.
There are 2 things to understand in this situation:
a in bar(Alpha &&a) is a named rvalue reference; therefore, treated as an lvalue.
a is still a reference.
Part 1
Since a in bar(Alpha &&a) is a named rvalue reference, it's treated as an lvalue. The motivation behind treating named rvalue references as lvalues is to provide safety. Consider the following,
Alpha bar(Alpha &&a) {
baz(a);
qux(a);
return a;
}
If baz(a) considered a as an rvalue then it is free to call the move constructor and qux(a) may be invalid. The standard avoids this problem by treating named rvalue references as lvalues.
Part 2
Since a is still a reference (and may refer to an object outside of the scope of bar), bar calls the copy constructor when returning. The motivation behind this behavior is to provide safety.
References
SO Q&A - return by rvalue reference
Comment by Kerrek SB
Yeah, very confusing. I would like to cite another SO post here, on implicit move, where I find the following comment somewhat convincing:
And therefore, the standards committee decided that you had to be
explicit about the move for any named variable, regardless of its
reference type
Actually, "&&" already signals that the caller is letting go of the object, so by the time you reach the return statement it would be safe enough to move.
It is probably just a choice the standard committee made.
Item 25 of "Effective Modern C++" by Scott Meyers also summarizes this, without giving much explanation.
Alpha foo() {
Alpha a;
return a; // RVO by decent compiler
}
Alpha foo(Alpha a) {
return a; // implicit std::move by compiler
}
Alpha bar(Alpha &&a) {
return a; // Copy ctor due to lvalue
}
Alpha bar(Alpha &&a) {
return std::move(a); // has to be explicit by developer
}
This is a very very common mistake to make as people first learn about rvalue references. The basic problem is a confusion between type and value category.
int is a type. int& is a different type. int&& is yet another type. These are all different types.
lvalues and rvalues are things called value categories. Please check out the fantastic chart here: What are rvalues, lvalues, xvalues, glvalues, and prvalues?. You can see that in addition to lvalues and rvalues, we also have prvalues, glvalues and xvalues, and they relate to each other in a Venn-diagram sort of way.
C++ has rules that say which expressions variables of various types can bind to. An expression's reference type, however, is discarded (people often say that expressions do not have reference type). Instead, the expression has a value category, which determines which variables can bind to it.
Put another way: rvalue references and lvalue references are only directly relevant on the left hand of the assignment, the variable being created/bound. On the right side, we are talking about expressions and not variables, and rvalue/lvalue reference-ness is only relevant in the context of determining value category.
A very simple example to start with is simply looking at things of purely type int. A variable of type int, used as an expression, is an lvalue. However, an expression consisting of evaluating a function that returns an int is an rvalue. This makes intuitive sense to most people; the key thing, though, is to separate out the type of an expression (even before references are discarded) and its value category.
What this is leading to is that even though variables of type int&& can only bind to rvalues, that does not mean that all expressions with type int&& are rvalues. In fact, as the rules at http://en.cppreference.com/w/cpp/language/value_category say, any expression consisting of naming a variable is always an lvalue, no matter the type.
That's why you need std::move in order to pass along rvalue references into subsequent functions that take by rvalue reference. It's because rvalue references do not bind to other rvalue references. They bind to rvalues. If you want to get the move constructor, you need to give it an rvalue to bind to, and a named rvalue reference is not an rvalue.
std::move is a function that returns an rvalue reference. And what's the value category of such an expression? It's an xvalue, which is a kind of rvalue: one that refers to an existing object whose resources may be reused.
In both foo and bar, the expression a is an lvalue. The statement return a; means to initialize the return value object from the initializer a, and return that object.
The difference between the two cases is that overload resolution for this initialization is performed differently depending on whether or not a was declared as a non-volatile automatic object within the innermost enclosing block, or as a function parameter.
Which it is for foo but not bar. (In bar , a is declared as a reference). So return a; in foo selects the move constructor to initialize the return value, but return a; in bar selects the copy constructor.
The full text is C++14 [class.copy]/32:
When the criteria for elision of a copy/move operation are met, but not for an exception-declaration , and the object to be copied is designated by an lvalue, or when the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic storage duration declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If the first overload resolution fails or was not performed, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object’s type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. —end note ]
where "criteria for elision of a copy/move operation are met" refers to [class.copy]/31.1:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing
the automatic object directly into the function’s return value
Note, these texts will change for C++17.
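The two parameter cases can be observed with an instrumented type. This is a minimal sketch: Tracked, pass_by_value and pass_by_rref are hypothetical names, not taken from the question. It asserts only on the explicit std::move form for the rvalue-reference parameter, because the C++20 [class.copy.elision]/3 wording also treats rvalue references as implicitly movable, so what a plain return a; does there depends on the compiler and the language mode.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// By-value parameter: `a` is an automatic object of this function, so
// `return a;` first tries overload resolution as if `a` were an rvalue.
Tracked pass_by_value(Tracked a) {
    return a;
}

// Rvalue-reference parameter: `a` names a reference, so the move is
// spelled out to get the same result under every standard version.
Tracked pass_by_rref(Tracked&& a) {
    return std::move(a);
}
```

In both functions the copy counter should stay at zero. The move counter is at least one, and can be higher when the compiler does not elide the other temporaries involved.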
unique_ptr<T> does not allow copy construction, instead it supports move semantics. Yet, I can return a unique_ptr<T> from a function and assign the returned value to a variable.
#include <iostream>
#include <memory>
using namespace std;
unique_ptr<int> foo()
{
unique_ptr<int> p( new int(10) );
return p; // 1
//return move( p ); // 2
}
int main()
{
unique_ptr<int> p = foo();
cout << *p << endl;
return 0;
}
The code above compiles and works as intended. So how is it that line 1 doesn't invoke the copy constructor and result in compiler errors? If I had to use line 2 instead it'd make sense (using line 2 works as well, but we're not required to do so).
I know C++0x allows this exception to unique_ptr since the return value is a temporary object that will be destroyed as soon as the function exits, thus guaranteeing the uniqueness of the returned pointer. I'm curious about how this is implemented, is it special cased in the compiler or is there some other clause in the language specification that this exploits?
is there some other clause in the language specification that this exploits?
Yes, see 12.8 §34 and §35:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object [...]
This elision of copy/move operations, called copy elision, is permitted [...]
in a return statement in a function with a class return type, when the expression is the name of
a non-volatile automatic object with the same cv-unqualified type as the function return type [...]
When the criteria for elision of a copy operation are met and the object to be copied is designated by an lvalue,
overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
One more point: returning by value should be the default choice here, because even in the worst case (no elision at all, in C++11, C++14 and C++17) a named local in the return statement is treated as an rvalue. For example, the following function still compiles with the -fno-elide-constructors flag:
std::unique_ptr<int> get_unique() {
auto ptr = std::unique_ptr<int>{new int{2}}; // <- 1
return ptr; // <- 2, moved into the to be returned unique_ptr
}
...
auto int_uptr = get_unique(); // <- 3
With the flag set on compilation there are two moves (1 and 2) happening in this function and then one move later on (3).
This is in no way specific to std::unique_ptr, but applies to any class that is movable. It's guaranteed by the language rules since you are returning by value. The compiler tries to elide copies, invokes a move constructor if it can't remove copies, calls a copy constructor if it can't move, and fails to compile if it can't copy.
If you had a function that accepts std::unique_ptr as an argument, you wouldn't be able to pass p to it directly. You would have to invoke the move constructor explicitly, and in that case you shouldn't use the variable p after the call to bar().
void bar(std::unique_ptr<int> p)
{
// ...
}
int main()
{
unique_ptr<int> p = foo();
bar(p); // error, can't implicitly invoke move constructor on lvalue
bar(std::move(p)); // OK but don't use p afterwards
return 0;
}
unique_ptr doesn't have the traditional copy constructor. Instead it has a "move constructor" that uses rvalue references:
unique_ptr::unique_ptr(unique_ptr && src);
An rvalue reference (the double ampersand) will only bind to an rvalue. That's why you get an error when you try to pass an lvalue unique_ptr to a function. On the other hand, a value that is returned from a function is treated as an rvalue, so the move constructor is called automatically.
By the way, this will work correctly:
bar(unique_ptr<int>(new int(44)));
The temporary unique_ptr here is an rvalue.
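Putting the pieces of this answer together, here is a minimal sketch of returning a local unique_ptr, passing an lvalue with std::move, and passing a temporary. make_ten and consume are hypothetical names playing the roles of foo and bar:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Plays the role of foo(): returns a named local unique_ptr.
std::unique_ptr<int> make_ten() {
    std::unique_ptr<int> p(new int(10));
    return p;  // treated as an rvalue (or elided); no std::move needed
}

// Plays the role of bar(): takes ownership by value.
// Returns the pointee, or -1 if it was handed a null pointer.
int consume(std::unique_ptr<int> p) {
    return p ? *p : -1;
}
```

A caller then writes consume(std::move(p)) for a named pointer, after which p is null, or consume(std::unique_ptr<int>(new int(44))) for a temporary.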
I think it's perfectly explained in item 25 of Scott Meyers' Effective Modern C++. Here's an excerpt:
The part of the Standard blessing the RVO goes on to say that if the conditions for the RVO are met, but compilers choose not to perform copy elision, the object being returned must be treated as an rvalue. In effect, the Standard requires that when the RVO is permitted, either copy elision takes place or std::move is implicitly applied to local objects being returned.
Here, RVO refers to return value optimization, and "if the conditions for the RVO are met" means returning a local object declared inside the function, for which you would expect the RVO to happen. This is also nicely explained in item 25 of his book by referring to the standard (here the local object includes the temporary objects created by the return statement). The biggest takeaway from the excerpt is that either copy elision takes place or std::move is implicitly applied to local objects being returned. Scott mentions in item 25 that std::move is implicitly applied when the compiler chooses not to elide the copy, and that the programmer should not apply it explicitly.
In your case, the code is clearly a candidate for RVO as it returns the local object p and the type of p is the same as the return type, which results in copy elision. And if the compiler chooses not to elide the copy, for whatever reason, std::move would've kicked in to line 1.
One thing I didn't see in the other answers: there is a difference between returning a std::unique_ptr that was created within a function and one that was passed into that function.
The example could be like this:
class Test
{int i;};
std::unique_ptr<Test> foo1()
{
std::unique_ptr<Test> res(new Test);
return res;
}
std::unique_ptr<Test> foo2(std::unique_ptr<Test>&& t)
{
// return t; // this will produce an error!
return std::move(t);
}
//...
auto test1=foo1();
auto test2=foo2(std::unique_ptr<Test>(new Test));
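I would like to mention one case where you must use std::move(), otherwise it will give an error. (This holds under the original C++11 rules; compilers that implement the later resolution of core issue CWG 1579 also accept a plain return derived; here.)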
Case: If the return type of the function differs from the type of the local variable.
class Base { ... };
class Derived : public Base { ... };
...
std::unique_ptr<Base> Foo() {
std::unique_ptr<Derived> derived(new Derived());
return std::move(derived); // std::move() is required here
}
Reference: https://www.chromium.org/developers/smart-pointer-guidelines
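Since Base and Derived are elided above, here is a self-contained sketch of the same pattern. Shape, Triangle and make_shape are hypothetical names chosen for illustration, and the virtual destructor is an assumption added so that deleting through the base pointer is well defined:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical hierarchy standing in for Base and Derived.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

std::unique_ptr<Shape> make_shape() {
    std::unique_ptr<Triangle> t(new Triangle());
    // The variable's type differs from the return type, so the return
    // goes through unique_ptr's converting move constructor.
    return std::move(t);
}
```

The explicit std::move form is accepted whichever version of the implicit-move rule the compiler implements.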
I know it's an old question, but I think an important and clear reference is missing here.
From https://en.cppreference.com/w/cpp/language/copy_elision :
(Since C++11) In a return statement or a throw-expression, if the compiler cannot perform copy elision but the conditions for copy elision are met or would be met, except that the source is a function parameter, the compiler will attempt to use the move constructor even if the object is designated by an lvalue; see return statement for details.
In the code below, what is used to avoid the copy: elision, or an rvalue reference and the move constructor?
std::string get(){return "...";}
void foo(std::string var){}
foo( get() ); //<--- here
std::string get(){
// this is similar to return std::string("..."), which is
// copied/moved into the return value object.
return "...";
}
RVO allows it to construct the temporary string object directly into the return value object of get().
foo( get() );
RVO allows it to construct the temporary string object (the return value object) directly into the parameter object of foo.
These are the RVO scenarios allowed. If your compiler cannot apply them, it has to use move constructors (if available) to move the return value into the return value object and the parameter object, respectively. In this case that is not surprising because both temporary objects are or are treated as rvalues anyway. (For the first scenario, no expression corresponds to the created temporary, so the treatment is only for the purpose of selecting what constructor is used for copying/moving the temporary into the return value object).
For other cases, the compiler has to consider things as rvalues even if they are otherwise lvalues
std::string get(){
std::string s = "...";
// this is similar to return move(s)
return s;
}
The spec says when it could potentially apply RVO (or NRVO) to an lvalue by the rules it sets forth, the implementation is required to treat the expressions as rvalues and use move constructors if available, and only if it couldn't find a suitable constructor, it should use the expression as an lvalue. It would be a pity for the programmer to write explicit moves in these cases, when it's clear the programmer would always want a move instead of a copy.
Example:
struct A { A(); A(A&); };
struct B { B(); B(B&&); };
A f() { A a; return a; }
B g() { B b; return b; }
For the first, it takes a as an rvalue, but cannot find constructors that accept this rvalue (A& cannot bind to rvalues). Therefore, it then again treats a as what it is (an lvalue). For the second, it takes b as an rvalue, and has B(B&&) take that rvalue and move it. If it had taken b as an lvalue (which it is), then the copy initialization would have failed, because B has no copy constructor implicitly declared.
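The two-stage lookup can be tried out with definitions filled in. This is a minimal sketch: CopyFromLvalueOnly, MoveOnly, make_a and make_b are hypothetical names mirroring A and B, with an id member added so the result of each return can be observed.

```cpp
#include <cassert>

// Mirrors A: the only copying constructor takes a non-const lvalue reference.
struct CopyFromLvalueOnly {
    int id = 0;
    CopyFromLvalueOnly() = default;
    CopyFromLvalueOnly(CopyFromLvalueOnly& other) : id(other.id) {}
};

// Mirrors B: only a move constructor is available.
struct MoveOnly {
    int id = 0;
    MoveOnly() = default;
    MoveOnly(MoveOnly&& other) noexcept : id(other.id) {}
};

// First pass (treat `a` as an rvalue) finds no viable constructor, so the
// second pass treats `a` as an lvalue and selects the A&-style constructor.
CopyFromLvalueOnly make_a() {
    CopyFromLvalueOnly a;
    a.id = 1;
    return a;
}

// First pass succeeds with MoveOnly(MoveOnly&&).
MoveOnly make_b() {
    MoveOnly b;
    b.id = 2;
    return b;
}
```

Either constructor call may still be elided; the overload resolution only decides what would be called, and whether the return statement compiles at all.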
Note that returning and parameter passing use the rules of copy initialization, which means
u -> T (where u's type is different from T) =>
T rvalue_tmp = u;
T target(rvalue_tmp);
t -> T (where t's type is T) =>
T target = t;
Hence, in the example where we return a "...", we first create an rvalue temporary and then move that into the target. For the case where we return an expression of the type of the return value / parameter, we will directly move / copy the expression into the target.
Most likely copy elision, but if your compiler cannot apply it in this case, which can happen if the functions are more complex, then you're looking at a move. Moves are extremely efficient, so I wouldn't panic here if elision is not performed.
Implementation defined, but most likely copy elision.
Similarly, RVO/NRVO will most likely kick in before move semantics when returning an object value from a function.