Classes, Rvalues and Rvalue References - c++

An lvalue is a value bound to a definitive region of memory whereas an rvalue is an expression value whose existence is temporary and who does not necessarily refer to a definitive region of memory. Whenever an lvalue is used in a position in which an rvalue is expected, the compiler performs an lvalue-to-rvalue conversion and then proceeds with evaluation.
http://www.eetimes.com/discussion/programming-pointers/4023341/Lvalues-and-Rvalues
Whenever we construct a temporary (anonymous) class object or return a temporary class object from a function, although the object is temporary, it is addressable. However, the object still is a valid rvalue. This means that the object is a) an addressable rvalue or b) is being implicitly converted from an lvalue to an rvalue when the compiler expects an lvalue to be used.
For instance:
class A
{
public:
int x;
A(int a) { x = a; std::cout << "int conversion ctor\n"; }
A(A&) { std::cout << "lvalue copy ctor\n"; }
A(A&&) { std::cout << "rvalue copy ctor\n"; }
};
A ret_a(A a)
{
return a;
}
int main(void)
{
&A(5); // A(5) is an addressable object
A&& rvalue = A(5); // A(5) is also an rvalue
}
We also know that temporary objects returned (in the following case a) by functions are lvalues as this code segment:
int main(void)
{
ret_a(A(5));
}
yields the following output:
int conversion ctor
lvalue copy ctor
Indicating that the call to the function ret_a using actual argument A(5) calls the conversion constructor A::A(int) which constructs the function's formal argument a with the value 5.
When the function completes execution, it then constructs a temporary A object using a as its argument, which invokes A::A(A&). However, if we were to remove A::A(A&) from the list of overloaded constructors, the returned temporary object would still match the rvalue-reference constructor A::A(A&&).
This is what I'm not quite understanding: how can the object a match both an rvalue reference and an lvalue reference? It is clear that A::A(A&) is a better match than A::A(A&&) (and therefore a must be an lvalue). But, because an rvalue reference cannot be initialized to an lvalue, given that the formal argument a is an lvalue, it should not be able to match the call to A::A(A&&). If the compiler is making an lvalue-to-rvalue conversion it would be trivial. The fact that a conversion from 'A' to 'A&' is also trivial, both functions should have identical implicit conversion sequence ranks and therefore, the compiler should not be able to deduce the best-matching function when both A::A(A&) and A::A(A&&) are in the overloaded function candidate set.
Moreover, the question (which I previously asked) is:
How can a given object match both an rvalue reference and an lvalue reference?

For me:
int main(void)
{
ret_a(A(5));
}
Yields:
int conversion ctor
rvalue copy ctor
(i.e. rvalue, not lvalue). This is a bug in your compiler. However it is understandable as the rules for this behavior changed only a few months ago (Nov. 2010). More on this below.
When the function completes execution,
it then constructs a temporary A
object using a as its argument, which
invokes A::A(A&).
Actually no. When the function ret_a completes execution, it then constructs a temporary A object using a as its argument, which invokes A:A(A&&). This is due to [class.copy]/p33]1:
When the criteria for elision of a
copy operation are met or would be met
save for the fact that the source
object is a function parameter, and
the object to be copied is designated
by an lvalue, overload resolution to
select the constructor for the copy is
first performed as if the object were
designated by an rvalue. If overload
resolution fails, or if the type of
the first parameter of the selected
constructor is not an rvalue reference
to the object’s type (possibly
cv-qualified), overload resolution is
performed again, considering the
object as an lvalue. [ Note: This
two-stage overload resolution must be
performed regardless of whether copy
elision will occur. It determines the
constructor to be called if elision is
not performed, and the selected
constructor must be accessible even if
the call is elided. — end note ]
However if you remove the A::A(A&&) constructor, then A::A(&) will be chosen for the return. Although in this case, then the construction of the argument a will fail because you can't construct it using an rvalue. However ignoring that for the moment, I believe your ultimate question is:
How can a given object match both an
rvalue reference and an lvalue
reference?
in referring to the statement:
return a;
And the answer is in the above quoted paragraph from the draft standard: First overload resolution is tried as if a is an rvalue. And if that fails, overload resolution is tried again using a as an lvalue. This two-stage process is tried only in the context wherein copy elision is permissible (such as a return statement).
The C++0x draft has just recently been changed to allow the two-stage overload resolution process when returning arguments that have been passed by value (as in your example). And that is the reason for the varying behavior from different compilers that we are seeing.

Related

Initialization in return statements of functions that return by-value

My question originates from delving into std::move in return statements, such as in the following example:
struct A
{
A() { std::cout << "Constructed " << this << std::endl; }
A(A&&) noexcept { std::cout << "Moved " << this << std::endl; }
};
A nrvo()
{
A local;
return local;
}
A no_nrvo()
{
A local;
return std::move(local);
}
int main()
{
A a1(nrvo());
A a2(no_nrvo());
}
which prints (MSVC, /std:c++17, release)
Constructed 0000000C0BD4F990
Constructed 0000000C0BD4F991
Moved 0000000C0BD4F992
I am interested in the general initialization rules for return statements in functions that return by-value and which rules apply when returning a local variable with std::move as shown above.
The general case
Regarding return statements you can read
Evaluates the expression, terminates the current function and returns the result of the expression to the caller after implicit conversion to the function return type. [...]
on cppreference.com.
Amongst others Copy initialization happens
when returning from a function that returns by value
like so
return other;
Coming back to my example, according to my current knowledge - and in contrast to the above-named rule - A a1(nrvo()); is a statement that direct-initializes a1 with the prvalue nrvo(). So which object exactly is copy-initialized as described at cppreference.com for return statements?
The std::move case
For this case, I've referred to ipc's answer on Are returned locals automatically xvalues. I want to make sure that the following is correct: std::move(local) has the type A&& but no_nrvo() is declared to return the type A, so here the
returns the result of the expression to the caller after implicit conversion to the function return type
part should come into play. I think this should be an Lvalue to rvalue conversion:
A glvalue of any non-function, non-array type T can be implicitly converted to a prvalue of the same type. [...] For a class type, this conversion [...] converts the glvalue to a prvalue whose result object is copy-initialized by the glvalue.
To convert from A&& to A A's move constructor is used, which is also why NRVO is disabled here. Are those the rules that apply in this case, and did I understand them correctly? Also, again they say copy-initialized by the glvalue but A a2(no_nrvo()); is a direct initialization. So this also touches on the first case.
You have to be careful with cppreference.com when diving into such nitty-gritty, as it's not an authoritative source.
So which object exactly is copy-initialized as described at cppreference.com for return statements?
In this case, none. That's what copy elision is: The copy that would normally happen is skipped. The cppreference (4) clause could be written as "when returning from a function that returns by value, and the copy is not elided", but that's kind of redundant. The standard: [stmt.return] is a lot clearer on the subject.
To convert from A&& to A A's move constructor is used, which is also why NRVO is disabled here. Are those the rules that apply in this case, and did I understand them correctly?
That's not quite right. NRVO only applies to names of non-volatile objects. However, in return std::move(local);, it's not local that is being returned, it's the A&& that is the result of the call to std::move(). This has no name, thus mandatory NRVO does not apply.
I think this should be an Lvalue to rvalue conversion:
The A&& returned by std::move() is decidedly not an Lvalue. It's an xvalue, and thus an rvalue already. There is no Lvalue to rvalue conversion happening here.
but A a2(no_nrvo()); is a direct initialization. So this also touches on the first case.
Not really. Whether a function has to perform copy-initialization of its result as part of a return statement is not impacted in any way by how that function is invoked. Similarly, how a function's return argument is used at the callsite is not impacted by the function's definition.
In both cases, an is direct-initialized by the result of the function. In practice, this means that the compiler will use the same memory location for the an object as for the return value of the function.
In A a1(nrvo());, thanks to NRVO, the memory location assigned to local is the same as the function's result value, which happens to be a1 already. Effectively, local and a1 were the same object all along.
In A a2(no_nrvo()), local has its own storage, and the result of the function, aka a2 is move-constructed from it. Effectively, local is moved into a2.

How to do type conversion on an "implicit" rvalue in the return statement

Is this invalid? gcc accepts it, clang and msvc don't.
#include <memory>
class Holder {
std::unique_ptr<int> data;
public:
operator std::unique_ptr<int>() && { return std::move(data); }
};
std::unique_ptr<int> test()
{
Holder val;
return val;
}
Assuming that I don't want to add something like std::unique_ptr<int> Holder::TakeData() { return std::move(data); }, the only other workaround I could think of is moving in the return value:
std::unique_ptr<int> test()
{
Holder val;
return std::move(val); // lets the conversion proceed
}
But then gcc 9.3+ has the gall to tell me that the std::move is redundant (with all warnings enabled). WTF? I mean yeah, gcc doesn't need the move, sure, but nothing else accepts the code then. And if it won't be gcc, then some humans inevitably will balk at it later.
What's the authoritative last word on whether it should compile as-is or not?
How should such code be written? Should I put in this seemingly noisy TakeData function and use it? Worse yet - should I maybe make the TakeData function limited to rvalue context, i.e. having to do return std::move(val).TakeData() ?
Adding operator std::unique_ptr<int>() & { return std::move(data); } is not an option, since it obviously leads to nasty bugs - it can be invoked in wrong context.
The "implicit" rvalue conversion is standard mandated. But depending on which standard version you are using, which compiler is "correct" varies.
In C++17
[class.copy.elision] (emphasis mine)
3 In the following copy-initialization contexts, a move operation
might be used instead of a copy operation:
If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic
storage duration declared in the body or parameter-declaration-clause
of the innermost enclosing function or lambda-expression, or
...
overload resolution to select the constructor for the copy is first
performed as if the object were designated by an rvalue. If the first
overload resolution fails or was not performed, or if the type of the
first parameter of the selected constructor is not an rvalue reference
to the object's type (possibly cv-qualified), overload resolution is
performed again, considering the object as an lvalue. [ Note: This
two-stage overload resolution must be performed regardless of whether
copy elision will occur. It determines the constructor to be called if
elision is not performed, and the selected constructor must be
accessible even if the call is elided.  — end note ]
Up to C++17, GCC is wrong. Using val implicitly as an rvalue should fail to initialize the return type on account of the sentence I marked in bold (the rvalue reference in the unique_ptr c'tor doesn't bind directly to val). But come C++20, that sentence is no longer there.
C++20
3 An implicitly movable entity is a variable of automatic storage
duration that is either a non-volatile object or an rvalue reference
to a non-volatile object type. In the following copy-initialization
contexts, a move operation might be used instead of a copy operation:
If the expression in a return ([stmt.return]) or co_­return ([stmt.return.coroutine]) statement is a (possibly parenthesized)
id-expression that names an implicitly movable entity declared in the
body or parameter-declaration-clause of the innermost enclosing
function or lambda-expression, or
[...]
overload resolution to select the constructor for the copy or the
return_­value overload to call is first performed as if the expression
or operand were an rvalue. If the first overload resolution fails or
was not performed, overload resolution is performed again, considering
the expression or operand as an lvalue. [ Note: This two-stage
overload resolution must be performed regardless of whether copy
elision will occur. It determines the constructor or the return_­value
overload to be called if elision is not performed, and the selected
constructor or return_­value overload must be accessible even if the
call is elided. — end note ]
The correctness of the code is thus subject to the time travel properties of your compiler(s).
As far as how should code like that should be written. If you aren't getting consistent results, an option would be to use the exact return type of the function
std::unique_ptr<int> test()
{
Holder val;
std::unique_ptr<int> ret_val = std::move(val);
return ret_val;
}
I agree from the get go that this may not look as appealing as simply returning val, but at least it plays nice with NRVO. So we aren't likely to get more copies of unique_ptr than we desired originally.
If that is too unappealing still, then I find your idea of a resource stealing member function to be most to my liking. But no accounting for taste.

Copy ctor called instead of move ctor

Why is the copy constructor called when returning from bar instead of the move constructor?
#include <iostream>
using namespace std;
class Alpha {
public:
Alpha() { cout << "ctor" << endl; }
Alpha(Alpha &) { cout << "copy ctor" << endl; }
Alpha(Alpha &&) { cout << "move ctor" << endl; }
Alpha &operator=(Alpha &) { cout << "copy asgn op" << endl; }
Alpha &operator=(Alpha &&) { cout << "move asgn op" << endl; }
};
Alpha foo(Alpha a) {
return a; // Move ctor is called (expected).
}
Alpha bar(Alpha &&a) {
return a; // Copy ctor is called (unexpected).
}
int main() {
Alpha a, b;
a = foo(a);
a = foo(Alpha());
a = bar(Alpha());
b = a;
return 0;
}
If bar does return move(a) then the behavior is as expected. I do not understand why a call to std::move is necessary given that foo calls the move constructor when returning.
There are 2 things to understand in this situation:
a in bar(Alpha &&a) is a named rvalue reference; therefore, treated as an lvalue.
a is still a reference.
Part 1
Since a in bar(Alpha &&a) is a named rvalue reference, its treated as an lvalue. The motivation behind treating named rvalue references as lvalues is to provide safety. Consider the following,
Alpha bar(Alpha &&a) {
baz(a);
qux(a);
return a;
}
If baz(a) considered a as an rvalue then it is free to call the move constructor and qux(a) may be invalid. The standard avoids this problem by treating named rvalue references as lvalues.
Part 2
Since a is still a reference (and may refer to an object outside of the scope of bar), bar calls the copy constructor when returning. The motivation behind this behavior is to provide safety.
References
SO Q&A - return by rvalue reference
Comment by Kerrek SB
yeah, very confusing. I would like to cite another SO post here implicite move. where I find the following comments a bit convincing,
And therefore, the standards committee decided that you had to be
explicit about the move for any named variable, regardless of its
reference type
Actually "&&" is already indicating let-go and at the time when you do "return", it is safe enough to do move.
probably it is just the choice from standard committee.
item 25 of "effective modern c++" by scott meyers, also summarized this, without giving much explanations.
Alpha foo() {
Alpha a
return a; // RVO by decent compiler
}
Alpha foo(Alpha a) {
return a; // implicit std::move by compiler
}
Alpha bar(Alpha &&a) {
return a; // Copy ctor due to lvalue
}
Alpha bar(Alpha &&a) {
return std:move(a); // has to be explicit by developer
}
This is a very very common mistake to make as people first learn about rvalue references. The basic problem is a confusion between type and value category.
int is a type. int& is a different type. int&& is yet another type. These are all different types.
lvalues and rvalues are things called value categories. Please check out the fantastic chart here: What are rvalues, lvalues, xvalues, glvalues, and prvalues?. You can see that in addition to lvalues and rvalues, we also have prvalues and glvalues and xvalues, and they form a various venn diagram sort of relation.
C++ has rules that say that variables of various types can bind to expressions. An expressions reference type however, is discarded (people often say that expressions do not have reference type). Instead, the expression have a value category, which determines which variables can bind to it.
Put another way: rvalue references and lvalue references are only directly relevant on the left hand of the assignment, the variable being created/bound. On the right side, we are talking about expressions and not variables, and rvalue/lvalue reference-ness is only relevant in the context of determining value category.
A very simple example to start with is simple looking at things of purely type int. A variable of type int as an expression, is an lvalue. However, an expression consisting of evaluating a function that returns an int, is an rvalue. This makes intuitive sense to most people; the key thing though is to separate out the type of an expression (even before references are discarded) and its value category.
What this is leading to, is that even though variables of type int&& can only bind to rvalues, does not mean that all expressions with type int&&, are rvalues. In fact, as the rules at http://en.cppreference.com/w/cpp/language/value_category say, any expression consisting of naming a variable, is always an lvalue, no matter the type.
That's why you need std::move in order to pass along rvalue references into subsequent functions that take by rvalue reference. It's because rvalue references do not bind to other rvalue references. They bind to rvalues. If you want to get the move constructor, you need to give it an rvalue to bind to, and a named rvalue reference is not an rvalue.
std::move is a function that returns an rvalue reference. And what's the value category of such an expression? An rvalue? Nope. It's an xvalue. Which is basically an rvalue, with some additional properties.
In both foo and bar, the expression a is an lvalue. The statement return a; means to initialize the return value object from the initializer a, and return that object.
The difference between the two cases is that overload resolution for this initialization is performed differently depending on whether or not a declared as a non-volatile automatic object within the innermost enclosing block, or a function parameter.
Which it is for foo but not bar. (In bar , a is declared as a reference). So return a; in foo selects the move constructor to initialize the return value, but return a; in bar selects the copy constructor.
The full text is C++14 [class.copy]/32:
When the criteria for elision of a copy/move operation are met, but not for an exception-declaration , and the object to be copied is designated by an lvalue, or when the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic storage duration declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If the first overload resolution fails or was not performed, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object’s type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. —end note ]
where "criteria for elision of a copy/move operation are met" refers to [class.copy]/31.1:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing
the automatic object directly into the function’s return value
Note, these texts will change for C++17.

Result of ternary operator not an rvalue

If you compile this program with a C++11 compiler, the vector is not moved out of the function.
#include <vector>
using namespace std;
vector<int> create(bool cond) {
vector<int> a(1);
vector<int> b(2);
return cond ? a : b;
}
int main() {
vector<int> v = create(true);
return 0;
}
If you return the instance like this, it is moved.
if(cond) return a;
else return b;
Here is a demo on ideone.
I tried it with gcc 4.7.0 and MSVC10. Both behave the same way.
My guess why this happens is this:
The ternary operators type is an lvalue because it is evaluated before return statement is executed. At this point a and b are not yet xvalues (soon to expire).
Is this explanation correct?
Is this a defect in the standard?
This is clearly not the intended behaviour and a very common case in my opinion.
Here are the relevant Standard quotes:
12.8 paragraph 32:
Copy elision is permitted in the following circumstances [...]
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function's return value
[when throwing, with conditions]
[when the source is a temporary, with conditions]
[when catching by value, with conditions]
paragraph 33:
When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If overload resolution fails, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object's type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. - end note]
Since the expression in return (cond ? a : b); is not a simple variable name, it's not eligible for copy elision or rvalue treatment. Maybe a bit unfortunate, but it's easy to imagine stretching the example a little bit further at a time until you create a headache of an expectation for compiler implementations.
You can of course get around all this by explicitly saying to std::move the return value when you know it's safe.
This will fix it
return cond ? std::move(a) : std::move(b);
Consider the ternary operator as a function, like your code is
return ternary(cond, a, b);
The parameters won't be moved implicitly, you need to make it explicit.

C++11: Move/Copy construction ambiguity?

In C++11 we can define copy and move constructors, but are both allowed on the same class? If so, how do you disambiguate their usage? For example:
Foo MoveAFoo() {
Foo f;
return f;
}
Is the above a copy? A move? How do I know?
Usually it will be neither due to RVO.
If that optimisation can't be performed, then it will be a move, because the object being returned is going out of scope (and will be destroyed just after). If it can't be moved, then it will be copied. If it can't be copied, it won't compile.
The whole point of move constructors is that when a copy is going to be made of an object that is just about to be destroyed, it is often unnecessary to make a whole copy, and the resources can be moved from the dying object to the object being created instead.
You can tell when either the copy or move constructor is going to be called based on what is about to happen to the object being moved/copied. Is it about to go out of scope and be destructed? If so, the move constructor will be called. If not, the copy constructor.
Naturally, this means you may have both a move constructor and copy constructor in the same class. You can also have a copy assignment operator and a move assignment operator as well.
Update: It may be unclear as to exactly when the move constructor/assignment operator is called versus the plain copy constructor/assignment operator. If I understand correctly, the move constructor is called if an object is initialised with an xvalue (eXpiring value). §3.10.1 of the standard says
An xvalue (an “eXpiring” value) also refers to an object, usually near
the end of its lifetime (so that its resources may be moved, for
example). An xvalue is the result of certain kinds of expressions
involving rvalue references (8.3.2). [ Example: The result of calling
a function whose return type is an rvalue reference is an xvalue. —end
example ]
And the beginning of §5 of the standard says:
[ Note: An expression is an xvalue if it is:
the result of calling a
function, whether implicitly or explicitly, whose return type is an
rvalue reference to object type,
a cast to an rvalue reference to
object type,
a class member access expression designating a
non-static data member of non-reference type in which the object
expression is an xvalue, or
a .* pointer-to-member expression in
which the first operand is an xvalue and the second operand is a
pointer to data member.
In general, the effect of this rule is that
named rvalue references are treated as lvalues and unnamed rvalue
references to objects are treated as xvalues; rvalue references to
functions are treated as lvalues whether named or not. —end note ]
As an example, if NRVO can be done, it's like this:
void MoveAFoo(Foo* f) {
new (f) Foo;
}
Foo myfoo; // pretend this isn't default constructed
MoveAFoo(&myfoo);
If NRVO can't be done but Foo is moveable, then your example is a little like this:
void MoveAFoo(Foo* fparam) {
Foo f;
new (fparam) Foo(std::move(f));
}
Foo f; // pretend this isn't being default constructed
MoveAFoo(&f);
And if it can't be moved but it can be copied, then it's like this
void MoveAFoo(Foo* fparam) {
Foo f;
new (fparam) Foo((Foo&)f);
}
Foo f; // pretend this isn't default constructed
MoveAFoo(&f);
To back up #Seth, here's the relevant paragraph from the standard:
§12.8 [class.copy] p32
When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If overload resolution fails, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object’s type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [ Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. —end note ]
The "disambiguation" is just your old friend, overload resolution:
Foo y;
Foo x(y); // copy
Foo x(std::move(y)); // move
The expression y in the first example is an lvalue of type Foo, which binds to Foo const & (and also Foo & if you have such a constructor); the type of the expression std::move(y) in the second example is Foo &&, so it'll bind to Foo && (and also Foo const & absent the former).
In your example, the result of MoveAFoo() is a temporary of type Foo, so it'll bind to the Foo &&-constructor if one is available, and to a const-copy constructor otherwise.
Finally, in a function returning Foo (by value), the statement return x; is equivalent to return std::move(x); if x is a local variable of type Foo -- this is a special new rule to make the use of move semantics easier.
Foo MoveAFoo() {
Foo f;
return f;
}
This is definition of function MoveAFoo that returns object of type Foo. In its body local Foo f; is created and destructed when it goes out of its scope.
In this code:
Foo x = MoveAFoo();
object Foo f; is created inside of MoveAFoo function and directly assigned into x, which means that copy constructor is not called.
But in this code:
Foo x;
x = MoveAFoo();
object Foo f; is created inside of MoveAFoo function, then the copy of f is created and stored into x and original f is destructed.