Why const for implicit conversion? - c++

After extensive reading of ISO/IEC 14882, Programming language – C++, I'm still unsure why const is needed for implicit conversion to a user-defined type with a single-argument constructor like the following:
#include <cstdio>
class X {
public:
X( int value ) {
std::printf("constructor initialized with %i",value);
}
};
void implicit_conversion_func( const X& value ) {
//produces "constructor initialized with 99"
}
int main (int argc, char * const argv[]) {
implicit_conversion_func(99);
}
Starting with section 4 line 3
An expression e can be implicitly converted to a type T if and only if the declaration T t=e; is well-formed, for some invented temporary variable t (8.5). Certain language constructs require that an expression be converted to a Boolean value. An expression e appearing in such a context is said to be contextually converted to bool and is well-formed if and only if the declaration bool t(e); is well-formed, for some invented temporary variable t (8.5). The effect of either implicit conversion is the same as performing the declaration and initialization and then using the temporary variable as the result of the conversion. The result is an lvalue if T is an lvalue reference type (8.3.2), and an rvalue otherwise. The expression e is used as an lvalue if and only if the initialization uses it as an lvalue.
Following that I found the section on initializers related to user-defined types in 8.5 line 6
If a program calls for the default initialization of an object of a const-qualified type T, T shall be a class type with a user-provided default constructor.
Finally I ended up at 12.3 line 2 about user-defined conversions which states
User-defined conversions are applied only where they are unambiguous (10.2, 12.3.2).
Needless to say, 10.2 and 12.3.2 didn't answer my question.
Can someone shed some light on what effect const has on implicit conversions?
Does the use of const make the conversion "unambiguous" per 12.3 line 2?
Does const somehow affect lvalue vs. rvalue talked about in section 4?

It doesn't really have much to do with the conversion being implicit. Moreover, it doesn't really have much to do with conversions. It is really about rvalues vs. lvalues.
When you convert 99 to type X, the result is an rvalue. In C++ results of conversions are always rvalues (unless you convert to reference type). It is illegal in C++ to attach non-const references to rvalues.
For example, this code will not compile
X& r = X(99); // ERROR
because it attempts to attach a non-const reference to an rvalue. On the other hand, this code is fine
const X& cr = X(99); // OK
because it is perfectly OK to attach a const reference to an rvalue.
The same thing happens in your code as well. The fact that it involves an implicit conversion is kinda beside the point. You can replace the implicit conversion with an explicit one
implicit_conversion_func(X(99));
and end up with the same situation: with const it compiles, without const it doesn't.
Again, the only role the conversion (explicit or implicit) plays here is that it helps us to produce an rvalue. In general, you can produce an rvalue in some other way and run into the same issue
int &ir = 3 + 2; // ERROR
const int &cir = 3 + 2; // OK
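The binding rules above can also be checked at compile time. This is only a sketch: X mirrors the class from the question, and the constant names are made up for illustration. std::is_convertible<From, To> asks whether an expression of type From (an rvalue, unless From is an lvalue reference type) can implicitly initialize a To.

```cpp
#include <type_traits>

// Same shape as the X in the question: one converting constructor.
struct X {
    X(int) {}
};

// Hypothetical names; each constant records one binding attempt.
constexpr bool int_rvalue_to_const_ref = std::is_convertible<int, const X&>::value; // f(99) with const X&
constexpr bool int_rvalue_to_plain_ref = std::is_convertible<int, X&>::value;       // f(99) with X&
constexpr bool x_rvalue_to_plain_ref   = std::is_convertible<X, X&>::value;         // X& r = X(99);
constexpr bool x_lvalue_to_plain_ref   = std::is_convertible<X&, X&>::value;        // X x(99); X& r = x;

static_assert(int_rvalue_to_const_ref,  "a const reference accepts the converted rvalue");
static_assert(!int_rvalue_to_plain_ref, "a non-const reference rejects the converted rvalue");
static_assert(!x_rvalue_to_plain_ref,   "a non-const reference rejects any rvalue");
static_assert(x_lvalue_to_plain_ref,    "a non-const reference accepts an lvalue");
```

Only the lvalue/rvalue category and the const on the reference decide the outcome; whether the rvalue came from a conversion does not matter.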

Per section 5.2.2 paragraph 5, when a function parameter is of const reference type, a temporary object is automatically introduced if needed. In your example, the rvalue result of X(99) has to be put into a temporary so that it can be passed by const reference to implicit_conversion_func.

Related

Check the viability of a conversion function [duplicate]

I have the following code:
struct S {
operator int(); // F1
operator double(); // F2
};
int main() {
int res = S();
}
Since neither F1 nor F2 is cv-qualified, the type of the implicit object parameter is S&, and the corresponding argument to be matched against is S().
Now, per [over.match.viable]/4: (emphasis mine)
Third, for F to be a viable function, there shall exist for each
argument an implicit conversion sequence that converts that argument
to the corresponding parameter of F. If the parameter has reference
type, the implicit conversion sequence includes the operation of
binding the reference, and the fact that an lvalue reference to
non-const cannot bind to an rvalue and that an rvalue reference cannot
bind to an lvalue can affect the viability of the function (see
[over.ics.ref]).
According to the above quote (bold), I'm expecting that neither F1 nor F2 is viable, because the implicit object parameter for both is of type S& and it cannot bind to an rvalue, S().
But, when I tried to compile the code, I found out both functions are viable candidates. Which is best match is not what I am asking about here.
So, why are both F1 and F2 viable candidates, even though the implicit object parameter (of type S&) cannot bind to class prvalue S()?
The main concern expressed in your question appears to be how a given rvalue argument can bind to an implicitly declared lvalue reference parameter. (I'm not even attempting here to adjudicate the extensive discussion in the comments to your question about whether or not any actual overloads are involved in your code sample.)
This (main) concern is addressed – quite clearly, IMHO – in another part of the [over.match.funcs] section of the C++ Standard you cite (bold emphasis mine):
12.2.2.1 General [over.match.funcs.general]
…
5 During overload resolution, the implied
object argument is indistinguishable from other arguments. The implicit
object parameter, however, retains its identity since no user-defined
conversions can be applied to achieve a type match with it. For
implicit object member functions declared without a ref-qualifier,
even if the implicit object parameter is not const-qualified, an
rvalue can be bound to the parameter as long as in all other respects
the argument can be converted to the type of the implicit object
parameter.
Without this paragraph, implicit conversion functions would lose a great deal of their usefulness, such as the use-case you have provided in your example.
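A small compile-time sketch of that special case. S follows the question (one conversion function, no ref-qualifier); Strict is a hypothetical contrast type whose conversion function is &-qualified, so its implicit object parameter behaves like an ordinary S& and the special permission no longer applies.

```cpp
#include <type_traits>

struct S {
    operator int() { return 1; }     // no ref-qualifier, like F1 in the question
};

struct Strict {                      // hypothetical type, not from the question
    operator int() & { return 2; }   // &-qualified: the object argument must be an lvalue
};

// Hypothetical constant names.
constexpr bool rvalue_S_converts      = std::is_convertible<S, int>::value;        // int res = S();
constexpr bool rvalue_Strict_converts = std::is_convertible<Strict, int>::value;   // int res = Strict();
constexpr bool lvalue_Strict_converts = std::is_convertible<Strict&, int>::value;  // Strict s; int res = s;

static_assert(rvalue_S_converts,       "unqualified conversion function is viable for an rvalue");
static_assert(!rvalue_Strict_converts, "&-qualified conversion function is not viable for an rvalue");
static_assert(lvalue_Strict_converts,  "&-qualified conversion function is viable for an lvalue");
```

So the viability of F1 and F2 in the question comes from the missing ref-qualifier, not from any relaxation of the general reference-binding rule.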

Clang and GCC disagree on legality of direct initialization with conversion operator

The latest version of clang (3.9) rejects this code on the second line of f; the latest version of gcc (6.2) accepts it:
struct Y {
Y();
Y(const Y&);
Y(Y&&);
};
struct X {
operator const Y();
};
void f() {
X x;
Y y(x);
}
If any of these changes are made, clang will then accept the code:
Remove Y's move constructor
Remove const from the conversion operator
Replace Y y(x) with Y y = x
Is the original example legal? Which compiler is wrong? After checking the sections about conversion functions and overload resolution in the standard I have not been able to find a clear answer.
When we're enumerating the constructors and check their viability - i.e. whether there is an implicit conversion sequence - for the move constructor, [dcl.init.ref]/5 falls through to the last bullet point (5.2.2), which was modified by core issues 1604 and 1571 (in that order).
The bottom line of these resolutions is that
If T1 or T2 is a class type and T1 is not reference-related to T2, user-defined conversions are
considered using the rules for copy-initialization of an object of type “cv1 T1” by user-defined conversion (8.6, 13.3.1.4, 13.3.1.5); the program is ill-formed if the corresponding non-reference copy-initialization would be ill-formed. The result of the call to the conversion function, as described for the non-reference copy-initialization, is then used to direct-initialize the reference.
The first part just causes the conversion operator to be selected. So, according to the boldfaced part, we use const Y to direct-initialize Y&&. Again, we fall through until the last bullet point, which fails due to (5.2.2.3):
If T1 is reference-related to T2: — cv1 shall be the same
cv-qualification as, or greater cv-qualification than, cv2 ; and
However, this does not appertain to our original overload resolution anymore, which only sees that the conversion operator shall be used to direct-initialize the reference. In your example, overload resolution selects the move constructor because [over.ics.rank]/(3.2.5), and then the above paragraph makes the program ill-formed. This is a defect and has been filed as core issue 2077. A sensible solution would discard the move constructor during overload resolution.
All this makes sense with respect to your fixes: removing const would prevent the fall-through since the types are now reference-compatible, and removing the move constructor leaves the copy constructor, which takes a const reference (i.e. works as well). Finally, when we write Y y = x;, instead of [dcl.init]/(17.6.2), (17.6.3) applies;
Otherwise (i.e., for the remaining copy-initialization cases), user-defined conversion sequences that can convert from the source type to the destination type or (when a conversion function is used) to a derived class thereof are enumerated as described in 13.3.1.4, and the best one is chosen through overload resolution (13.3). [...]. The call is used to direct-initialize, according to the rules above, the object that is the destination of the copy-initialization.
I.e. the initialization is effectively the same as Y y(x.operator const Y());, which succeeds, because the move constructor is not viable (Y&& y = const Y fails shallowly enough) and the copy constructor is selected.
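For completeness, a sketch of the copy-initialization form, which the question reports both compilers accepting. The function bodies and the constant name copy_init_ok are filled in for illustration only. The direct-initialization form Y y(x) is deliberately not checked here, since that is exactly where the compilers disagree.

```cpp
#include <type_traits>

struct Y {
    Y() {}
    Y(const Y&) {}
    Y(Y&&) {}
};

struct X {
    operator const Y() { return Y(); }
};

// std::is_convertible models copy-initialization, i.e. the "Y y = x;" form.
constexpr bool copy_init_ok = std::is_convertible<X&, Y>::value;
static_assert(copy_init_ok, "Y y = x; is expected to be well-formed");

inline void f() {
    X x;
    Y y = x;   // copy-initialization instead of Y y(x);
    (void)y;   // silence the unused-variable warning
}
```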
I think this is a clang bug.
We start with [over.match.ctor]:
When objects of class type are direct-initialized (8.6), copy-initialized from an expression of the same or a
derived class type (8.6), or default-initialized (8.6), overload resolution selects the constructor. For direct-initialization
or default-initialization that is not in the context of copy-initialization, the candidate functions
are all the constructors of the class of the object being initialized.
So we consider, for instance, the copy constructor. Is the copy constructor viable?
From [dcl.init.ref]:
— If the initializer expression [...] has a class type (i.e., T2 is a class type), where T1 is not reference-related to T2, and can be
converted to an rvalue of type “cv3 T3”, where “cv1 T1” is reference-compatible with “cv3
T3” (see 13.3.1.6) then the reference is bound to the value of the initializer expression in the first case and to
the result of the conversion in the second case.
Those candidate functions in [over.match.ref] are:
For direct-initialization, those explicit conversion functions that
are not hidden within S and yield type “lvalue reference to cv2 T2” or “cv2 T2” or “rvalue reference to
cv2 T2”, respectively, where T2 is the same type as T or can be converted to type T with a qualification
conversion (4.5), are also candidate functions.
Which includes our operator const Y(). Hence the copy constructor is viable. The move constructor is not (since you can't bind a non-const rvalue reference to a const rvalue), so we have exactly one viable candidate, which makes the program well-formed.
Er, as a followup, this is LLVM bug 16682, which makes it seem much more complicated than what I've initially laid out.
When you write code where two decent compilers disagree whether it is legal or not, you are programming too close to the edge. Let's say I'm a maintenance programmer supporting that code. How do you expect me to know whether this is legal, and what exactly the semantics of this code are, if even gcc and clang cannot agree?
Change your code. Make it simpler so that both less "clever" programmers and compilers understand it without problems. There are no prizes for being the most "clever" programmer around.
Look at Columbo's answer: I have no doubt that his analysis of the situation is perfectly fine and correct. But I wouldn't want to support code that requires a very clever 50 line analysis to demonstrate that it is correct. If you are writing C++ compilers, you should carefully study his answer. If you are writing application code, you should never write code that requires looking at his answer.

Reference to const T initialized by value of type other than T

For the following code:
struct A {
explicit A(int) {}
};
const A& a(1); // error on g++/clang++/vc++
const A& b = 1; // error on g++/clang++/vc++
const A& c{1}; // ok on g++/clang++, error on vc++
const A& d = {1}; // error on g++/clang++/vc++
Which of the four initializations are legal?
If we ignore vc++ first, it seems that the difference between direct-init and copy-init is not behaving consistently here. If the third one is well-formed because it's direct-init, why does the first one, which is also direct-init, fail to compile? What's the logic behind this? Or is it just a bug in g++/clang++, with vc++ handling it correctly?
If you're using braced-init-lists and the destination type of an initialization is a reference:
[dcl.init.list]/3 (from n3690)
Otherwise, if the initializer list has a single element of type E and either T is not a reference type or its referenced type is
reference-related to E, the object or reference is initialized from
that element; if a narrowing conversion (see below) is required to
convert the element to T, the program is ill-formed.
Otherwise, if T is a reference type, a prvalue temporary of the type referenced by T is copy-list-initialized or
direct-list-initialized, depending on the kind of initialization for
the reference, and the reference is bound to that temporary. [Note: As
usual, the binding will fail and the program is ill-formed if the
reference type is an lvalue reference to a non-const type. — end note]
Otherwise, if the initializer list has no elements, the object is value-initialized.
For the two examples const A& c{1}; and const A& d = {1};, the second bullet of the quotation above applies. The first one direct-list-initializes a const A, the second one copy-list-initializes a const A. Copy-initialization selecting an explicit constructor is ill-formed, see [over.match.list]/1.
If you're not using braced-init-lists, there's no difference between copy-initialization and direct-initialization as far as I can tell. The last bullet of [dcl.init.ref]/5 applies in all cases:
Otherwise, a temporary of type “cv1 T1” is created and initialized from the initializer expression
using the rules for a non-reference copy-initialization (8.5). The reference is then bound to the
temporary.
Copy-initialization cannot select an explicit ctor, see [over.match.copy]/1 (it's not viable).
Conclusion:
const A& c{1};
is legal. The others are not, because they either use copy-initialization or copy-list-initialization and the only viable / selected ctor is explicit.
Your struct can only be created by an explicit call to its constructor. No implicit conversion is allowed.
A const reference must point to an existing object of that type on construction.
None of your lines create an object of type A by calling its explicit constructor. So I don't see why any of those lines should properly initialize a const reference.

C++0x rvalue references and temporaries

(I asked a variation of this question on comp.std.c++ but didn't get an answer.)
Why does the call to f(arg) in this code call the const ref overload of f?
#include <string>
void f(const std::string &); //less efficient
void f(std::string &&); //more efficient
void g(const char * arg)
{
f(arg);
}
My intuition says that the f(string &&) overload should be chosen, because arg needs to be converted to a temporary no matter what, and the temporary matches the rvalue reference better than the lvalue reference.
This is not what happens in GCC and MSVC (edit: Thanks Sumant: it doesn't happen in GCC 4.3-4.5). In at least G++ and MSVC, any lvalue does not bind to an rvalue reference argument, even if there is an intermediate temporary created. Indeed, if the const ref overload isn't present, the compilers diagnose an error. However, writing f(arg + 0) or f(std::string(arg)) does choose the rvalue reference overload as you would expect.
From my reading of the C++0x standard, it seems like the implicit conversion of a const char * to a string should be considered when considering if f(string &&) is viable, just as when passing a const lvalue ref arguments. Section 13.3 (overload resolution) doesn't differentiate between rvalue refs and const references in too many places. Also, it seems that the rule that prevents lvalues from binding to rvalue references (13.3.3.1.4/3) shouldn't apply if there's an intermediate temporary - after all, it's perfectly safe to move from the temporary.
Is this:
Me misreading/misunderstanding the standard, where the implemented behavior is the intended behavior, and there's some good reason why my example should behave the way it does?
A mistake that the compiler vendors have somehow all made? Or a mistake based on common implementation strategies? Or a mistake in e.g. GCC (where this lvalue/rvalue reference binding rule was first implemented), that was copied by other vendors?
A defect in the standard, or an unintended consequence, or something that should be clarified?
EDIT: I have a follow-on question that is related: C++0x rvalue references - lvalues-rvalue binding
GCC is doing it wrong according to the FCD. The FCD says at 8.5.3 about reference binding
If the reference is an lvalue reference and the initializer expression is an [lvalue / class type] ...
Otherwise, the reference shall be an lvalue reference to a non-volatile const type (i.e., cv1 shall be const), or the reference shall be an rvalue reference and the initializer expression shall be an rvalue or have a function type.
Your case for the call to the std::string && matches none of them, because the initializer is an lvalue. It doesn't get to the place to create a temporary rvalue, because that toplevel bullet already requires an rvalue.
Now, overload resolution doesn't directly use reference binding to see whether there exists an implicit conversion sequence. Instead, it says at 13.3.3.1.4/2
When a parameter of reference type is not bound directly to an argument expression, the conversion sequence is the one required to convert the argument expression to the underlying type of the reference according to 13.3.3.1.
Thus, overload resolution figures out a winner, even though that winner may actually not be able to bind to that argument. For example:
struct B { B(int) { /* ... */ } };
struct A { int bits: 1; };
void f(int&);
void f(B);
int main() { A a; f(a.bits); }
Reference binding at 8.5 forbids bitfields to bind to lvalue references. But overload resolution says that the conversion sequence is the one converting to int, thus succeeding even though when the call is made later, the call is ill-formed. Thus my bitfields example is ill-formed. If it was to choose the B version, it would have succeeded, but needed a user defined conversion.
However, there exist two exceptions for that rule. These are
Except for an implicit object parameter, for which see 13.3.1, a standard conversion sequence cannot be formed if it requires binding an lvalue reference to non-const to an rvalue or binding an rvalue reference to an lvalue.
Thus, the following call is valid:
struct B { B(int) { /* ... */ } };
struct A { int bits: 1; };
void f(int&); /* binding an lvalue ref to non-const to rvalue! */
void f(B);
int main() { A a; f(1); }
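Which overload gets picked for f(1) can be made visible with tag return types. This is only a sketch: took_int_ref, took_B and the two aliases are hypothetical names added for illustration.

```cpp
#include <type_traits>
#include <utility>

struct B { B(int) {} };

// Hypothetical tag types: the return type reveals the selected overload.
struct took_int_ref {};
struct took_B {};

inline took_int_ref f(int&) { return {}; }
inline took_B       f(B)    { return {}; }

using for_literal = decltype(f(1));                     // rvalue: f(int&) is not viable
using for_lvalue  = decltype(f(std::declval<int&>()));  // lvalue int: f(int&) is an exact match

static_assert(std::is_same<for_literal, took_B>::value,
              "f(1) falls back to f(B) through the converting constructor");
static_assert(std::is_same<for_lvalue, took_int_ref>::value,
              "an int lvalue selects f(int&)");
```

Because f(int&) is removed from the candidate set for an rvalue argument, the call is well-formed instead of selecting an overload that cannot bind.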
And thus, your example calls the const T& version
void f(const std::string &);
void f(std::string &&); // would bind to lvalue!
void g(const char * arg) { f(arg); }
However, if you say f(arg + 0), you create an rvalue, and thus the second function is viable.
It was a defect in the standard draft you read. This defect got in as a side effect of some eager editing to disallow binding of rvalue references to lvalues for safety reasons.
Your intuition is right. Of course, there is no harm in allowing an rvalue reference to refer to some unnamed temporary even if the initializer was an lvalue expression. After all, this is what rvalue references are for. The issue you observed was fixed last year. The upcoming standard will mandate that the second overload will be picked in your example where the rvalue reference will refer to some temporary string object.
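A sketch of the fixed behaviour, assuming a compiler that implements the final C++11 rules (older GCC and MSVC releases discussed in this thread may differ). The tag types and alias names are hypothetical; the two f overloads have the same parameter types as in the question.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical tag types: the return type reveals the selected overload.
struct took_const_ref {};
struct took_rvalue_ref {};

inline took_const_ref  f(const std::string&) { return {}; }
inline took_rvalue_ref f(std::string&&)      { return {}; }

using for_char_pointer = decltype(f(std::declval<const char*&>()));  // f(arg), arg is an lvalue
using for_named_string = decltype(f(std::declval<std::string&>()));  // std::string s; f(s);
using for_temporary    = decltype(f(std::declval<std::string>()));   // f(std::string(arg))

static_assert(std::is_same<for_char_pointer, took_rvalue_ref>::value,
              "the converted temporary binds to std::string&&");
static_assert(std::is_same<for_named_string, took_const_ref>::value,
              "a named string still selects const std::string&");
static_assert(std::is_same<for_temporary, took_rvalue_ref>::value,
              "an explicit temporary selects std::string&&");
```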
The rule fix made it into the draft n3225.pdf (2010-11-27):
[...]
Otherwise, the reference shall be an lvalue reference to a non-volatile const type (i.e., cv1 shall be const), or the reference shall be an rvalue reference and the initializer expression shall be an rvalue or have a function type. [...]
[...]
Otherwise, a temporary of [...] is created [...]
double&& rrd3 = i; // rrd3 refers to temporary with value 2.0
But N3225 seems to have omitted the declaration of i in this example. The latest draft N3290 contains these examples:
double d2 = 1.0;
double&& rrd2 = d2; // error: copying lvalue of related type
int i3 = 2;
double&& rrd3 = i3; // rrd3 refers to temporary with value 2.0
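The two N3290 examples can be restated as compile-time checks, plus a small function showing that rrd3 really refers to a separate temporary. The constant names and temporary_is_independent are hypothetical helpers, not part of the draft.

```cpp
#include <type_traits>

// Hypothetical constant names mirroring the two draft examples.
constexpr bool int_lvalue_to_double_rref =
    std::is_convertible<int&, double&&>::value;     // double&& rrd3 = i3;
constexpr bool double_lvalue_to_double_rref =
    std::is_convertible<double&, double&&>::value;  // double&& rrd2 = d2;

static_assert(int_lvalue_to_double_rref,
              "an int lvalue is converted to a temporary double, which can be bound");
static_assert(!double_lvalue_to_double_rref,
              "a double lvalue cannot bind directly to double&&");

inline bool temporary_is_independent() {
    int i3 = 2;
    double&& rrd3 = i3;  // binds to a temporary double holding 2.0
    rrd3 = 5.0;          // modifies the temporary only
    return i3 == 2;      // i3 is unchanged
}
```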
Since your MSVC version was released before this issue got fixed, it still handles rvalue references according to the old rules. The next MSVC version is expected to implement the new rvalue reference rules (dubbed "rvalue references 2.1" by MSVC developers) see link.
I did not see the behavior mentioned by Doug on g++. g++ 4.5 and 4.4.3 both call f(string &&) as expected but VS2010 calls f(const string &). Which g++ version are you using?
A lot of things in the current draft of the standard need clarification, if you ask me. And the compilers are still developing, so it's hard to trust their help.
It looks pretty clear that your intuition is right… temporaries of any kind are supposed to bind to rvalue references. For example, §3.10, the new "taxonomy" section, categorically defines temporaries as rvalues.
The problem may be that the RR argument specification is insufficient to invoke the creation of a temporary. §5.2.2/5: "Where a parameter is of const reference type a temporary object is introduced if needed." That sounds suspiciously exclusive.
Seems to slip through the cracks again at §13.3.3.1/6: (emphasis mine)
When the parameter type is not a reference, the implicit conversion sequence models a copy-initialization of the parameter from the argument expression. The implicit conversion sequence is the one required to convert the argument expression to a prvalue of the type of the parameter.
Note that copy-initialization string &&rr = "hello"; works fine in GCC.
EDIT: Actually the problem doesn't exist on my version of GCC. I'm still trying to figure out how the second standard conversion of the user-defined conversion sequence relates to forming an rvalue reference. (Is RR formation a conversion at all? Or is it dictated by scattered tidbits like 5.2.2/5?)
Take a look at this:
http://blogs.msdn.com/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx
rvalue references: overload resolution
It looks like your case is: "Lvalues strongly prefer binding to lvalue references".
I don't know if that has changed in the latest versions of the standard, but it used to say something like "if in doubt, don't use the rvalue reference". Probably for compatibility reasons.
If you want the move semantics, use f(std::move(arg)), that works with both compilers.

non-class rvalues always have cv-unqualified types

§3.10 section 9 says "non-class rvalues always have cv-unqualified types". That made me wonder...
#include <iostream>
int foo()
{
return 5;
}
const int bar()
{
return 5;
}
void pass_int(int&& i)
{
std::cout << "rvalue\n";
}
void pass_int(const int&& i)
{
std::cout << "const rvalue\n";
}
int main()
{
pass_int(foo()); // prints "rvalue"
pass_int(bar()); // prints "const rvalue"
}
According to the standard, there is no such thing as a const rvalue for non-class types, yet bar() prefers to bind to const int&&. Is this a compiler bug?
EDIT: Apparently, this is also a const rvalue :)
EDIT: This issue seems to be fixed in g++ 4.5.0, both lines print "rvalue" now.
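The fixed behaviour can be checked at compile time, with a class type added for contrast. This is a sketch: C, make_c, the tag types and the pass overloads are hypothetical additions; foo and bar are the functions from the question.

```cpp
#include <type_traits>

struct C {};  // hypothetical class type for contrast

inline int       foo()    { return 5; }
inline const int bar()    { return 5; }    // const is dropped from the non-class prvalue
inline const C   make_c() { return C(); }  // class prvalues keep their cv-qualification

// Hypothetical tag types: the return type reveals the selected overload.
struct took_rref {};
struct took_const_rref {};

inline took_rref       pass(int&&)       { return {}; }
inline took_const_rref pass(const int&&) { return {}; }
inline took_rref       pass(C&&)         { return {}; }
inline took_const_rref pass(const C&&)   { return {}; }

static_assert(std::is_same<decltype(bar()), int>::value,
              "bar() has type int, not const int");
static_assert(std::is_same<decltype(pass(foo())), took_rref>::value,
              "foo() binds to int&&");
static_assert(std::is_same<decltype(pass(bar())), took_rref>::value,
              "bar() also binds to int&&");
static_assert(std::is_same<decltype(pass(make_c())), took_const_rref>::value,
              "a const class prvalue needs const C&&");
```

Some compilers warn that the const on bar()'s return type has no effect, which is the same rule seen from the other side.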
The committee already seems to be aware that there's a problem in this part of the standard. CWG issue 690 talks about a somewhat similar problem with exactly the same part of the standard (in the "additional note" from September, 2009). I'd guess new language will be drafted for that part of the standard soon.
Edit: I've just submitted a post on comp.std.c++, noting the problem and suggesting new wording for the relevant piece of the standard. Unfortunately, being a moderated newsgroup, nearly everybody will probably have forgotten this question by the time it makes it through the approval queue there.
Good point. I guess there are two things to look at: 1) as you pointed out, the non-class rvalue thingy and 2) how overload resolution works:
The selection criteria for the best
function are the number of arguments,
how well the arguments match the
parameter-type-list of the candidate
function, [...]
I haven't seen anything in the standard that tells me non-class rvalues are treated specially during overload resolution.
Your question is covered in the draft of the standard I have though (N-4411) somewhat:
What does come into play is however a parallel reading of reference binding, implicit conversion sequences, references, and overload resolution in general:
13.3.3.1.4 Reference binding
2 When a parameter of reference type
is not bound directly to an argument
expression, the conversion sequence
is the one required to convert the argument expression to the underlying
type of the reference according
to 13.3.3.1.
and
13.3.3.2 Ranking implicit conversion sequences
3 Two implicit conversion sequences of
the same form are indistinguishable
conversion sequences unless one of the
following rules applies:
— Standard conversion sequence S1 is a better conversion sequence than
standard
conversion sequence S2 if
— S1 and S2 are reference bindings (8.5.3) and neither refers to an
implicit object parameter of a
nonstatic
member function declared without a ref-qualifier, and either S1 binds an
lvalue reference
to an lvalue and S2 binds an rvalue reference or S1 binds an rvalue
reference to an rvalue and S2
binds an lvalue reference.
[ Example:
int i;
int f();
int g(const int&);
int g(const int&&);
int j = g(i); // calls g(const int&)
int k = g(f()); // calls g(const int&&)