Choosing function overload with rvalue references [duplicate] - c++

Given all three functions, this call is ambiguous.
int f( int );
int f( int && );
int f( int const & );
int q = f( 3 );
Removing f( int ) causes both Clang and GCC to prefer the rvalue reference over the lvalue reference. But instead removing either reference overload results in ambiguity with f( int ).
Overload resolution is usually done in terms of a strict partial ordering, but int seems to be equivalent to two things which are not equivalent to each other. What are the rules here? I seem to recall a defect report about this.
Is there any chance int && may be preferred over int in a future standard? The reference must bind to an initializer, whereas the object type is not so constrained. So overloading between T and T && could effectively mean "use the existing object if I've been given ownership, otherwise make a copy." (This is similar to pure pass-by-value, but saves the overhead of moving.) As these compilers currently work, this must be done by overloading T const & and T &&, and explicitly copying. But I'm not sure even that is strictly standard.

What are the rules here?
As there is only one parameter, the rule is that one of the three viable parameter initializations of that parameter must be a better match than both the other two. When two initializations are compared, either one is better than the other, or neither is better (they are indistinguishable).
Without special rules about direct reference binding, all three initializations mentioned would be indistinguishable (in all three comparisons).
The special rules about direct reference binding make int&& better than const int&, but neither is better or worse than int. Therefore there is no best match:
S1      S2            Result
int     int&&         indistinguishable
int     const int&    indistinguishable
int&&   const int&    S1 better
int&& is better than const int& because of 13.3.3.2:
S1 and S2 are reference bindings (8.5.3) and neither refers to an implicit object parameter of a non-static member function declared without a ref-qualifier, and S1 binds an rvalue reference to an rvalue and S2 binds an lvalue reference.
But this rule does not apply when one of the initializations is not a reference binding.
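The ranking described above can be checked with a minimal sketch. Only the two reference overloads are declared here, and the returned tag values are an assumption made for illustration so that the selected overload is observable.

```cpp
#include <cassert>

// Arbitrary tag values (an assumption for illustration) identify the overload.
inline int f(int&&)      { return 1; }
inline int f(const int&) { return 2; }

inline void demo() {
    int n = 3;
    // n is an lvalue, so f(int&&) is not viable and f(const int&) is called.
    assert(f(n) == 2);
    // 3 is an rvalue: both overloads are viable, and binding an rvalue
    // reference to an rvalue ranks better than binding an lvalue reference.
    assert(f(3) == 1);
    // Declaring a third overload `int f(int);` would make f(3) ambiguous,
    // because int is indistinguishable from either reference binding.
}
```

Calling demo() runs both checks; adding the by-value overload back reproduces the ambiguity from the question.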
Is there any chance int && may be preferred over int in a future standard? The reference must bind to an initializer, whereas the object type is not so constrained. So overloading between T and T && could effectively mean "use the existing object if I've been given ownership, otherwise make a copy."
You propose to make a reference binding a better match than a non-reference binding. Why not post your idea to the isocpp Future Proposals forum? SO is not the best place for subjective discussion or opinion.

Related

cv-qualifier propagation in structured binding

As quoted in dcl.struct.bind,
Let cv denote the cv-qualifiers in the decl-specifier-seq.
Designating the non-static data members of E as m_0, m_1, m_2, ... (in declaration order), each v_i is the name of an lvalue that refers to the member m_i of e and whose type is cv T_i, where T_i is the declared type of that member;
If I'm understanding correctly, the cv-qualifiers are propagated from the declaration of the structured binding.
Say I have a simple struct,
struct Foo {
int x;
double y;
};
Consider the two scenarios,
const Foo f{1, 1.0};
auto& [x, y] = f;
// static_assert(std::is_same<decltype(x), int>::value); // Fails!
static_assert(std::is_same<decltype(x), const int>::value); // Succeeds
Does the cv-qualifier of x come from the deduction auto?
The second one,
Foo f{1, 1.0};
const auto& [x, y] = f;
const auto& rf = f;
static_assert(std::is_same<decltype(x), const int>::value); // with const
static_assert(std::is_same<decltype(rf.x), int>::value); // without const
The result complies with the standard, which makes sense.
My second question: is there any reason to propagate the cv-qualifiers? Isn't it somewhat inconsistent with initializing a reference with auto?
decltype has a special rule when the member of a class is named directly as an unparenthesized member access expression. Instead of producing the result it would usually if the expression was treated as an expression, it will result in the declared type of the member.
So decltype(rf.x) gives int, because x is declared as int. You can force decltype to behave as it would for other expressions by putting extra parentheses (decltype((rf.x))), in which case it will give const int& since it is an lvalue expression and an access through a const reference.
Similarly there are special rules for decltype if a structured binding is named directly (without parentheses), which is why you don't get const int& for decltype(x).
However the rules for structured bindings take the type from the member access expression as an expression if the member is not a reference type, which is why const is propagated. At least that is the case since the post-C++20 resolution of CWG issue 2312 which intends to make the const propagation work correctly with mutable members.
Before the resolution the type of the structured binding was actually specified to just be the declared type of the member with the cv-qualifiers of the structured binding declaration added, as you are quoting in your question.
I might be missing some detail on what declared type refers to exactly, but it seems to me that this didn't actually specify x to have type const int& in your first snippet (and decltype hence also not const), although that seems to be how all compilers always handled that case and is also the only behavior that makes sense. Maybe it was another defect, silently or unintentionally fixed by CWG 2312.
So, practically speaking, both rf.x and x in your example are const int lvalue expressions when you use them as expressions. The only oddity here is in how decltype behaves.
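The decltype special cases can be checked side by side. The following is a minimal sketch, assuming a C++17 compiler and reusing the Foo struct from the question; the function name demo is only for illustration.

```cpp
#include <type_traits>

struct Foo {
    int x;
    double y;
};

inline int demo() {
    Foo f{1, 1.0};
    const auto& [x, y] = f;
    const auto& rf = f;
    static_cast<void>(y);  // y is not needed in this sketch

    // Unparenthesized: decltype reports the binding's referenced type and
    // the member's declared type, respectively.
    static_assert(std::is_same<decltype(x), const int>::value, "binding is const int");
    static_assert(std::is_same<decltype(rf.x), int>::value, "declared member type");

    // Parenthesized: both are treated as ordinary lvalue expressions.
    static_assert(std::is_same<decltype((x)), const int&>::value, "const lvalue");
    static_assert(std::is_same<decltype((rf.x)), const int&>::value, "const lvalue");

    return x + rf.x;  // both read the same member, so this returns 2
}
```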
The cvref-qualifiers don't always propagate. They apply only to the single hidden variable that stores a copy/reference to the initializer.
Ref-qualifiers don't affect the identifiers created by the binding, those always act as references to the said common hidden variable. But if that variable itself is a copy, this causes them to behave almost like they were true copies.
How cv-qualifiers propagate is affected by what you're binding:
For non-tuple-like classes and arrays, the behavior is defined in terms of . and [] respectively, which normally propagate const, except for mutable class members.
For tuple-like classes, it depends on how get<I>() is implemented for your type. For tuples and similar classes, it propagates const only if the element is not a reference.
E.g. for a tuple of non-const references, the binding will always produce non-const identifiers.
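The tuple case can be sketched as follows, assuming C++17 and std::tuple; the variable names are made up for illustration. The reference element stays non-const even though the tuple itself is const, while the value element picks up const.

```cpp
#include <tuple>
#include <type_traits>

inline int demo() {
    int a = 1;
    const std::tuple<int&, int> t{a, 2};
    auto& [r, v] = t;  // auto deduces const std::tuple<int&, int>

    static_assert(std::is_same<decltype(r), int&>::value,
                  "reference element stays non-const");
    static_assert(std::is_same<decltype(v), const int>::value,
                  "value element picks up const");

    r = 5;         // writes through to a despite the const tuple
    return a + v;  // 5 + 2
}
```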
The inconsistency you're observing for decltype(x) vs decltype(rf.x) is caused by both of them being different special cases for decltype. Arguably the former makes more sense, but it's too late to change the latter.
If you add a second pair of parentheses, you'll get the same behavior for both.

Passing rvalue reference in function argument [duplicate]

I think there's something I'm not quite understanding about rvalue references. Why does the following fail to compile (VS2012) with the error 'foo' : cannot convert parameter 1 from 'int' to 'int &&'?
void foo(int &&) {}
void bar(int &&x) { foo(x); };
I would have assumed that the type int && would be preserved when passed from bar into foo. Why does it get transformed into int once inside the function body?
I know the answer is to use std::forward:
void bar(int &&x) { foo(std::forward<int>(x)); }
so maybe I just don't have a clear grasp on why. (Also, why not std::move?)
I always remember an lvalue as a value that has a name or can be addressed. Since x has a name, it is passed as an lvalue. The purpose of a reference to an rvalue is to allow the function to completely clobber the value in any way it sees fit. If x were passed on as an rvalue implicitly, as in your example, then we would have no way of knowing whether it is safe to do this:
void foo(int &&) {}
void bar(int &&x) {
foo(x);
x.DoSomething(); // what could x be?
};
Doing foo(std::move(x)); is explicitly telling the compiler that you are done with x and no longer need it. Without that move, bad things could happen to existing code. The std::move is a safeguard.
std::forward is used for perfect forwarding in templates.
Why does it get transformed into int once inside the function body?
It doesn't; it's still a reference to an rvalue.
When a name appears in an expression, it's an lvalue - even if it happens to be a reference to an rvalue. It can be converted into an rvalue if the expression requires that (i.e. if its value is needed); but it can't be bound to an rvalue reference.
So as you say, in order to bind it to another rvalue reference, you have to explicitly convert it to an unnamed rvalue. std::forward and std::move are convenient ways to do that.
Also, why not std::move?
Why not indeed? That would make more sense than std::forward, which is intended for templates that don't know whether the argument is a reference.
It's the "no name rule". Inside bar, x has a name ... x. So it's now an lvalue. Passing something to a function as an rvalue reference doesn't make it an rvalue inside the function.
If you don't see why it must be this way, ask yourself -- what is x after foo returns? (Remember, foo is free to move x.)
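A small sketch of the "no name rule" follows. The category helpers and the pass_named/pass_moved wrappers are hypothetical names introduced only to make the selected overload visible.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helpers: the returned character reports which overload ran.
inline char category(int&)  { return 'l'; }
inline char category(int&&) { return 'r'; }

// x has type int&&, but the expression x is an lvalue because it has a name.
inline char pass_named(int&& x) { return category(x); }

// std::move casts x back to an rvalue, so the int&& overload is selected.
inline char pass_moved(int&& x) { return category(std::move(x)); }

inline void demo() {
    assert(pass_named(1) == 'l');
    assert(pass_moved(1) == 'r');
}
```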
rvalue and lvalue are categories of expressions.
rvalue reference and lvalue reference are categories of references.
Inside a declaration T&& x = <initializer expression>, the variable x has type T&&, and it can be bound to an expression (the <initializer expression>) which is an rvalue expression. Thus, T&& has been named rvalue reference type, because it refers to an rvalue expression.
Inside a declaration T& x = <initializer expression>, the variable x has type T&, and it can be bound to an expression (the <initializer expression>) which is an lvalue expression. Thus, T& has been named lvalue reference type, because it can refer to an lvalue expression.
It is important then, in C++, to make a difference between the naming of an entity, that appears inside a declaration, and when this name appears inside an expression.
When a name appears inside an expression as in foo(x), the name x alone is an expression, called an id-expression. An id-expression that names a variable is always an lvalue expression, and an lvalue expression cannot be bound to an rvalue reference.
When talking about rvalue references it's important to distinguish between two key unrelated steps in the lifetime of a reference - binding and value semantics.
Binding here refers to the exact way a value is matched to the parameter type when calling a function.
For example, if you have the function overloads:
void foo(int a) {}
void foo(int&& a) {}
Then when calling foo(x), the act of selecting the proper overload involves binding the value x to the parameter of foo.
rvalue references are only about binding semantics.
Inside the bodies of both foo functions the variable a acts as a regular lvalue. That is, if we rewrite the second function like this:
void foo(int&& a) {
foo(a);
}
then intuitively this should result in a stack overflow. But it doesn't: rvalue references are all about binding and never about value semantics. Since a is a regular lvalue inside the function body, the first overload foo(int) is called at that point and no stack overflow occurs. Infinite recursion can only come into play if we explicitly change the value category of a, e.g. by using std::move:
void foo(int&& a) {
foo(std::move(a));
}
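Now the argument is an rvalue, so foo(int&&) is a candidate for the call again. If foo(int&&) were the only overload, the function would call itself forever and overflow the stack. With the by-value foo(int) overload also declared, the call does not even compile: for an rvalue argument, int and int&& are equally good matches, so the call is ambiguous.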
This is in my opinion the most confusing feature of rvalue references - that the type works differently during and after binding. It's an rvalue reference when binding but it acts like an lvalue reference after that. In all respects a variable of type rvalue reference acts like a variable of type lvalue reference after binding is done.
The only difference between an lvalue and an rvalue reference comes when binding: if both an lvalue-reference and an rvalue-reference overload are available, then rvalues (prvalues such as temporaries and literals, and xvalues, i.e. eXpiring values) will be preferentially bound to rvalue references:
void goo(const int& x) {}
void goo(int&& x) {}
goo(5); // this will call goo(int&&) because 5 is a prvalue, i.e. an rvalue
That's the only difference. Technically there is nothing stopping you from using rvalue references like lvalue references, other than convention:
void doit(int&& x) {
x = 123;
}
int a;
doit(std::move(a));
std::cout << a; // totally valid, prints 123, but please don't do that
And the keyword here is "convention". Since rvalue references preferentially bind to temporary objects, it's reasonable to assume that you can gut the temporary object, i.e. move all of its data away from it, because after the call it's not accessible in any way and is going to be destroyed anyway:
std::vector<std::string> strings;
strings.push_back(std::string("abc"));
In the above snippet the temporary object std::string("abc") cannot be used in any way after the statement in which it appears, because it's not bound to any variable. Therefore push_back is allowed to move away its contents instead of copying it and therefore save an extra allocation and deallocation.
That is, unless you use std::move:
std::vector<std::string> strings;
std::string mystr("abc");
strings.push_back(std::move(mystr));
Now the object mystr is still accessible after the call to push_back, but push_back doesn't know this - it's still assuming that it's allowed to gut the object, because it's passed in as an rvalue reference. This is why the behavior of std::move() is one of convention and also why std::move() by itself doesn't actually do anything - in particular it doesn't do any movement. It just marks its argument as "ready to get gutted".
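That std::move is only a cast can be seen in a short sketch. The function name demo and the local reference ref are assumptions added for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

inline std::vector<std::string> demo() {
    std::vector<std::string> strings;
    std::string mystr("abc");

    // std::move is only a cast: binding its result to a reference moves nothing.
    std::string&& ref = std::move(mystr);
    assert(&ref == &mystr);
    assert(mystr == "abc");

    // The actual move happens inside push_back(std::string&&).
    strings.push_back(std::move(mystr));
    // mystr is still a valid object here, but its contents are unspecified.
    return strings;
}
```

The sketch deliberately makes no claim about what mystr holds after push_back, since the standard only guarantees a valid but unspecified state.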
The final point is: rvalue references are only useful when used in tandem with lvalue references. There is no case where an rvalue argument is useful by itself (exaggerating here).
Say you have a function accepting a string:
void foo(std::string);
If the function is going to simply inspect the string and not make a copy of it, then use const&:
void foo(const std::string&);
This always avoids a copy when calling the function.
If the function is going to modify or store a copy of the string, then use pass-by-value:
void foo(std::string s);
In this case you'll receive a copy if the caller passes an lvalue and temporary objects will be constructed in-place, avoiding a copy. Then use std::move(s) if you want to store the value of s, e.g. in a member variable. Note that this will work efficiently even if the caller passes an rvalue reference, that is foo(std::move(mystring)); because std::string provides a move constructor.
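The pass-by-value pattern might look like the following sketch. The Person class and its set_name member are hypothetical, chosen only to show a by-value parameter being moved into a member.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class used only to illustrate the "sink" parameter pattern.
class Person {
public:
    // s is the function's own object, so it can be moved into the member.
    void set_name(std::string s) { name_ = std::move(s); }
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

inline void demo() {
    Person p;
    std::string original("abc");

    p.set_name(original);             // lvalue: copied into s, then moved
    assert(original == "abc");        // the caller's string is untouched
    assert(p.name() == "abc");

    p.set_name(std::string("xyz"));   // temporary: no copy of the contents
    assert(p.name() == "xyz");

    p.set_name(std::move(original));  // caller explicitly gives up original
    assert(p.name() == "abc");
}
```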
Using an rvalue here is a poor choice:
void foo(std::string&&);
because it places the burden of preparing the object on the caller. In particular, if the caller wants to pass a copy of a string to this function, they have to do that explicitly:
std::string s;
foo(s); // XXX: doesn't compile
foo(std::string(s)); // have to create copy manually
And if you want to pass a mutable reference to a variable, just use a regular lvalue reference:
void foo(std::string&);
Using rvalue references in this case is technically possible, but semantically improper and totally confusing.
The only, only place where an rvalue reference makes sense is in a move constructor or move assignment operator. In any other situation pass-by-value or lvalue references are usually the right choice and avoid a lot of confusion.
Note: do not confuse rvalue references with forwarding references that look exactly the same but work totally differently, as in:
template <class T>
void foo(T&& t) {
}
In the above example t looks like an rvalue reference parameter, but is actually a forwarding reference (because T is a deduced template parameter), which is an entirely different can of worms.
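The difference can be sketched briefly. The category and relay names are hypothetical; relay forwards its argument with std::forward so that the original value category is preserved.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helpers: the returned character reports which overload ran.
inline char category(int&)  { return 'l'; }
inline char category(int&&) { return 'r'; }

// T&& with a deduced T is a forwarding reference: it accepts lvalues
// (T deduced as int&) as well as rvalues (T deduced as int).
template <class T>
char relay(T&& t) {
    return category(std::forward<T>(t));
}

inline void demo() {
    int n = 0;
    assert(relay(n) == 'l');  // lvalue in, lvalue forwarded
    assert(relay(0) == 'r');  // rvalue in, rvalue forwarded
}
```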

Need Meyers Effective C++ Widget rvalue example explanation

I have a little C++ question.
On the first pages of Effective Modern C++, there is an example:
class Widget {
public:
Widget(Widget&& rhs);
};
Also, there is a comment: 'rhs is an lvalue, though it has an rvalue reference type'.
I understood nothing, to be honest. What does it mean that 'rhs is an lvalue, but its type is rvalue reference'?
Keep in mind that there are two distinct things here:
One is related to the type of variables: there are two types of references: lvalue references (&) and rvalue references (&&).
This determines what the function preferentially accepts and is always "obvious" because you can just read it from the type signature (or use decltype).
The other is a property of expressions (or values): an expression can be an lvalue or an rvalue (actually, it's more complicated than that...).
This property is not manifest in the code directly (but there is a rule of thumb, see below), but you can see its effects in the overload resolution. In particular,
lvalue arguments prefer to bind to lvalue-reference parameters, whereas
rvalue arguments prefer to bind to rvalue-reference parameters.
These properties are closely related (and in some sense "dual" to each other), but they don't necessarily agree with each other. In particular, it's important to realize that variables and expressions are actually different things, so formally speaking they aren't even comparable, "apples to oranges".
There is this rule in C++ that, even though you have declared rhs to be an rvalue reference (meaning that it will preferentially match arguments that are rvalues), within the block of move constructor, the variable rhs itself will still behave as an lvalue, and thus preferentially match functions that accept lvalue references.
void test(Widget&&) { std::cout << "test(Widget&&): called\n"; }
void test(Widget&) { std::cout << "test(Widget&): called\n"; }
Widget::Widget(Widget&& rhs) {
// here, `rhs` is an lvalue, not an rvalue even though
// its type is declared to be an rvalue reference
// for example, this will match `test(Widget&)`
// rather than the `test(Widget&&)` overload, which may be
// a bit counter-intuitive
test(rhs);
// if you really want to match `test(Widget&&)`
// you must use `std::move` to "wrap" the variable
// so that it can be treated as an rvalue
test(std::move(rhs));
}
The rationale for this was to prevent unintended moves within the move constructor.
The general rule of thumb is: if the expression has a name (i.e. consists of a single, named variable) then it's an lvalue. If the expression is anonymous, then it's an rvalue. (As dyp noted, this is not technically correct -- see his comment for a more formal description.)
Short and simple explanation :P
Widget(Widget&& rhs);
is a move constructor. It will accept an rvalue as a parameter. Inside the definition of the move constructor, you can refer to the other Widget using the name rhs, therefore it is an lvalue.
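In practice this is why a move constructor still has to call std::move on the members of rhs. The sketch below assumes a Widget with a single std::string member named name_, which is not part of the book's example.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Widget {
public:
    explicit Widget(std::string name) : name_(std::move(name)) {}

    // rhs has type Widget&&, but the expression rhs.name_ is an lvalue,
    // so std::move is needed to select std::string's move constructor.
    Widget(Widget&& rhs) : name_(std::move(rhs.name_)) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;  // hypothetical member, for illustration only
};

inline void demo() {
    Widget a("abc");
    Widget b(std::move(a));  // std::move(a) is an rvalue: move constructor runs
    assert(b.name() == "abc");
}
```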

C++0x rvalue references and temporaries

(I asked a variation of this question on comp.std.c++ but didn't get an answer.)
Why does the call to f(arg) in this code call the const ref overload of f?
void f(const std::string &); //less efficient
void f(std::string &&); //more efficient
void g(const char * arg)
{
f(arg);
}
My intuition says that the f(string &&) overload should be chosen, because arg needs to be converted to a temporary no matter what, and the temporary matches the rvalue reference better than the lvalue reference.
This is not what happens in GCC and MSVC (edit: Thanks Sumant: it doesn't happen in GCC 4.3-4.5). In at least G++ and MSVC, any lvalue does not bind to an rvalue reference argument, even if there is an intermediate temporary created. Indeed, if the const ref overload isn't present, the compilers diagnose an error. However, writing f(arg + 0) or f(std::string(arg)) does choose the rvalue reference overload as you would expect.
From my reading of the C++0x standard, it seems like the implicit conversion of a const char * to a string should be considered when considering if f(string &&) is viable, just as when passing a const lvalue ref arguments. Section 13.3 (overload resolution) doesn't differentiate between rvalue refs and const references in too many places. Also, it seems that the rule that prevents lvalues from binding to rvalue references (13.3.3.1.4/3) shouldn't apply if there's an intermediate temporary - after all, it's perfectly safe to move from the temporary.
Is this:
Me misreading/misunderstanding the standard, where the implemented behavior is the intended behavior, and there's some good reason why my example should behave the way it does?
A mistake that the compiler vendors have somehow all made? Or a mistake based on common implementation strategies? Or a mistake in e.g. GCC (where this lvalue/rvalue reference binding rule was first implemented), that was copied by other vendors?
A defect in the standard, or an unintended consequence, or something that should be clarified?
EDIT: I have a follow-on question that is related: C++0x rvalue references - lvalues-rvalue binding
GCC is doing it wrong according to the FCD. The FCD says this at 8.5.3 about reference binding:
If the reference is an lvalue reference and the initializer expression is an [lvalue / class type] ...
Otherwise, the reference shall be an lvalue reference to a non-volatile const type (i.e., cv1 shall be const), or the reference shall be an rvalue reference and the initializer expression shall be an rvalue or have a function type.
Your case for the call to the std::string && matches none of them, because the initializer is an lvalue. It doesn't get to the place to create a temporary rvalue, because that toplevel bullet already requires an rvalue.
Now, overload resolution doesn't directly use reference binding to see whether an implicit conversion sequence exists. Instead, it says at 13.3.3.1.4/2:
When a parameter of reference type is not bound directly to an argument expression, the conversion sequence is the one required to convert the argument expression to the underlying type of the reference according to 13.3.3.1.
Thus, overload resolution figures out a winner, even though that winner may actually not be able to bind to that argument. For example:
struct B { B(int) { /* ... */ } };
struct A { int bits: 1; };
void f(int&);
void f(B);
int main() { A a; f(a.bits); }
Reference binding at 8.5 forbids binding bit-fields to non-const lvalue references. But overload resolution says that the conversion sequence is the one converting to int, so it succeeds, even though the call that is made afterwards is ill-formed. Thus my bit-field example is ill-formed. If it were to choose the B version, it would have succeeded, but that would have needed a user-defined conversion.
However, there exist two exceptions for that rule. These are
Except for an implicit object parameter, for which see 13.3.1, a standard conversion sequence cannot be formed if it requires binding an lvalue reference to non-const to an rvalue or binding an rvalue reference to an lvalue.
Thus, the following call is valid:
struct B { B(int) { /* ... */ } };
struct A { int bits: 1; };
void f(int&); /* binding an lvalue ref to non-const to rvalue! */
void f(B);
int main() { A a; f(1); }
And thus, your example calls the const T& version
void f(const std::string &);
void f(std::string &&); // would bind to lvalue!
void g(const char * arg) { f(arg); }
However, if you say f(arg + 0), you create an rvalue, and thus the second function is viable.
It was a defect in the standard draft you read. This defect got in as a side effect of some eager editing to disallow binding of rvalue references to lvalues for safety reasons.
Your intuition is right. Of course, there is no harm in allowing an rvalue reference to refer to some unnamed temporary even if the initializer was an lvalue expression. After all, this is what rvalue references are for. The issue you observed has been fixed last year. The upcoming standard will mandate that the second overload will be picked in your example where the rvalue reference will refer to some temporary string object.
The rule fix made it into the draft n3225.pdf (2010-11-27):
[...]
Otherwise, the reference shall be an lvalue reference to a non-volatile const type (i.e., cv1 shall be const), or the reference shall be an rvalue reference and the initializer expression shall be an rvalue or have a function type. [...]
[...]
Otherwise, a temporary of [...] is created [...]
double&& rrd3 = i; // rrd3 refers to temporary with value 2.0
But N3225 does not say what i is in this example. The latest draft, N3290, contains these examples:
double d2 = 1.0;
double&& rrd2 = d2; // error: copying lvalue of related type
int i3 = 2;
double&& rrd3 = i3; // rrd3 refers to temporary with value 2.0
Since your MSVC version was released before this issue got fixed, it still handles rvalue references according to the old rules. The next MSVC version is expected to implement the new rvalue reference rules (dubbed "rvalue references 2.1" by MSVC developers).
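Under the fixed rules, the behavior can be checked with a sketch like the one below. It assumes a compiler that implements the final C++11 reference-binding rules; the returned tag characters and the demo function are additions for illustration.

```cpp
#include <cassert>
#include <string>

// Tag characters (an assumption for illustration) identify the overload.
inline char f(const std::string&) { return 'c'; }
inline char f(std::string&&)      { return 'r'; }

// arg is an lvalue of type const char*, but calling f requires a temporary
// std::string, and that temporary is an rvalue.
inline char g(const char* arg) { return f(arg); }

inline void demo() {
    assert(g("abc") == 'r');  // the temporary binds to std::string&&

    std::string s("abc");
    assert(f(s) == 'c');      // an lvalue std::string can only use const&

    int i3 = 2;
    double&& rrd3 = i3;       // binds to a temporary double, as in N3290
    assert(rrd3 == 2.0);
}
```

A compiler that still follows the older draft rules, as described for VS2010 above, would be expected to pick the const reference overload for g instead.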
I did not see the behavior mentioned by Doug on g++. g++ 4.5 and 4.4.3 both call f(string &&) as expected but VS2010 calls f(const string &). Which g++ version are you using?
A lot of things in the current draft of the standard need clarification, if you ask me. And the compilers are still developing, so it's hard to trust their help.
It looks pretty clear that your intuition is right… temporaries of any kind are supposed to bind to rvalue references. For example, §3.10, the new "taxonomy" section, categorically defines temporaries as rvalues.
The problem may be that the RR argument specification is insufficient to invoke the creation of a temporary. §5.2.2/5: "Where a parameter is of const reference type a temporary object is introduced if needed." That sounds suspiciously exclusive.
Seems to slip through the cracks again at §13.3.3.1/6: (emphasis mine)
When the parameter type is not a reference, the implicit conversion sequence models a copy-initialization of the parameter from the argument expression. The implicit conversion sequence is the one required to convert the argument expression to a prvalue of the type of the parameter.
Note that copy-initialization string &&rr = "hello"; works fine in GCC.
EDIT: Actually the problem doesn't exist on my version of GCC. I'm still trying to figure out how the second standard conversion of the user-defined conversion sequence relates to forming an rvalue reference. (Is RR formation a conversion at all? Or is it dictated by scattered tidbits like 5.2.2/5?)
Take a look at this:
http://blogs.msdn.com/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx
rvalue references: overload resolution
It looks like your case is: "Lvalues strongly prefer binding to lvalue references".
I don't know if that has changed in the latest versions of the standard, but it used to say something like "if in doubt, don't use the rvalue reference". Probably for compatibility reasons.
If you want the move semantics, use f(std::move(arg)), that works with both compilers.