Difference in behavior of pointer-to-member access operators - c++

In C++, I'm searching for the crucial sections of the standard
explaining the subtle difference in behavior I've observed between the
language's two pointer-to-member access operators, .* and ->*.
According to my test program shown below, whilst ->* seems to allow its
right-hand expression to be of any type implicitly convertible to
pointer to member of S, .* does not. When compiling with
gcc and clang, both compilers yield errors for the line marked '(2)'
stating that my class Offset cannot be used as a member pointer.
Test Program
https://godbolt.org/z/46nMPvKxE
#include <iostream>
struct S { int m; };
template<typename C, typename M>
struct Offset
{
M C::* value;
operator M C::* () { return value; } // implicit conversion function
};
int main()
{
S s{42};
S* ps = &s;
Offset<S, int> offset{&S::m};
std::cout << ps->*offset << '\n'; // (1) ok
std::cout << s.*offset << '\n'; // (2) error
std::cout.flush();
}
Compiler Output
GCC 12.2:
'offset' cannot be used as a member pointer, since it is of type 'Offset<S, int>'
clang 15.0:
right hand operand to .* has non-pointer-to-member type 'Offset<S, int>'
Program Variation
In order to prove that ->* actually performs an implicit conversion using Offset's
conversion function in the test program shown above, I declared it explicit for test purposes,
explicit operator M C::* () { return value; } // no longer implicit conversion function
which causes both compilers to reject the line marked '(1)' as well:
GCC 12.2:
error: no match for 'operator->*' (operand types are 'S*' and 'Offset<S, int>')
note: candidate: 'operator->*(S*, int S::*)' (built-in)
note: no known conversion for argument 2 from 'Offset<S, int>' to 'int S::*'
clang 15.0:
error: right hand operand to ->* has non-pointer-to-member type 'Offset<S, int>'
Research
Whilst there is a well-documented difference between the two operators in that
->* is overloadable and .* is not, my code obviously does not make use of this option
but rather relies on the built-in operator ->* defined for raw pointer type S*.
Besides differences in overloadability, I merely found documentation stating the similarity
of the expressions. Cited from the standard (https://open-std.org/jtc1/sc22/wg21/docs/papers/2020/n4868.pdf):
[7.6.4.2] The binary operator .* binds its second operand, which shall be of type “pointer to member of T” to its first
operand, which shall be a glvalue of class T or of a class of which T is an unambiguous and accessible base
class. The result is an object or a function of the type specified by the second operand.
[7.6.4.3] [...] The expression E1->*E2 is converted into the equivalent form (*(E1)).*E2.
And cited from cppreference.com (https://en.cppreference.com/w/cpp/language/operator_member_access#Built-in_pointer-to-member_access_operators):
The second operand of both operators is an expression of type pointer to member (data or function) of T or pointer to member of an unambiguous and accessible base class B of T.
The expression E1->*E2 is exactly equivalent to (*E1).*E2 for built-in types; that is why the following rules address only E1.*E2.
Nowhere have I found a notion of conversion of the right hand operand.
Question
What have I overlooked? Can someone point me to an explanation of this difference in behavior?

When overloadable operators are used and at least one operand is of class or enumeration type, overload resolution is performed using a candidate set that includes built-in candidates ([over.match.oper]/3) - for ->* in particular, see [over.built]/9.
In this case, a built-in candidate is selected, so the implicit conversion is applied to the second operand, and then ->* is interpreted as the built-in operator ([over.match.oper]/11).
With .*, there's no overload resolution at all, so no implicit conversion either.

Related

Overload resolution of user-defined type conversion

I am considering a conversion from one type to another that is defined in two ways, i.e., by a converting constructor and by a conversion function.
struct to_type;
struct from_type{
operator to_type()const;
};
struct to_type{
to_type() = default;
to_type(const from_type&){}
};
from_type::operator to_type()const{return to_type();}
int main(){
from_type From;
to_type To;
To = From;
return 0;
}
gcc (v13.0.0, but it seems to be the same even in v4.9.4) doesn't report any error and just calls the converting constructor in the above code.
On the other hand, clang (v16.0.0, but it seems to be the same even in v7.6.0) reports an "ambiguous" compile error like the following.
prog.cc:14:10: error: reference initialization of type 'to_type &&' with initializer of type 'from_type' is ambiguous
To = From;
^~~~
prog.cc:3:4: note: candidate function
operator to_type()const;
^
prog.cc:7:4: note: candidate constructor
to_type(const from_type&){}
^
prog.cc:5:8: note: passing argument to parameter here
struct to_type{
^
1 error generated.
It seems curious that two major compilers give different results for such simple code. Does one of the compilers fail to conform to the C++ standard? I guess this overload resolution is related to [over.ics.rank], but I could not determine which compiler's behavior matches the standard.
Or does my source code contain undefined behavior?
[ADD 2022-12-21T00:20Z]
Following the comment by Artyer, I tried gcc with the -pedantic compile option, and now gcc also reports an error.
prog.cc: In function 'int main()':
prog.cc:14:10: error: conversion from 'from_type' to 'to_type' is ambiguous
14 | To = From;
| ^~~~
prog.cc:9:1: note: candidate: 'from_type::operator to_type() const'
9 | from_type::operator to_type()const{return to_type();}
| ^~~~~~~~~
prog.cc:7:4: note: candidate: 'to_type::to_type(const from_type&)'
7 | to_type(const from_type&){}
| ^~~~~~~
prog.cc:5:8: note: initializing argument 1 of 'constexpr to_type& to_type::operator=(to_type&&)'
5 | struct to_type{
| ^~~~~~~
This suggests that at least the default behavior of gcc without -pedantic does not meet the requirements of the C++ standard for this source code.
In this case clang is right. This behaviour is defined in [over.best.ics.general], and the standard even says explicitly that such a conversion is ambiguous in the sample code under [over.best.ics.general]/10. (The example there actually illustrates another kind of ambiguity, but the choice between a user-defined converting constructor and a conversion function is one of its candidates, so I removed the part of the code that concerns the other candidate.)
class B;
class A { A (B&);};
class B { operator A (); };
...
void f(A) { }
...
B b;
f(b); // ... an (ambiguous) conversion b → A (via constructor or conversion function)
To break the overload resolution down, I'd like to view the assignment (To = From;) as a call to an assignment operator function:
to_type& operator=(to_type&& param) { } // defined implicitly by the compiler
The actual conversion happens when param has to be created from the argument From of type from_type. The compiler then needs to decide step by step:
Of which kind is the conversion sequence from from_type to to_type&& (standard, user-defined or ellipsis)?
The type from_type is user-defined, but to_type&& is a reference type, and reference binding could be considered an identity (i.e. standard) conversion. However, that is not the case here, since from_type and to_type are not the same type and the reference cannot be bound directly ([over.ics.ref]/2):
When a parameter of reference type is not bound directly to an argument expression, the conversion sequence is the one required to convert the argument expression to the referenced type according to [over.best.ics].
Without reference binding, no other standard conversion sequence fits here. Let's consider a user-defined conversion. [over.ics.user] gives us the following definition:
A user-defined conversion sequence consists of an initial standard conversion sequence followed by a user-defined conversion ([class.conv]) followed by a second standard conversion sequence. If the user-defined conversion is specified by a constructor ([class.conv.ctor]), the initial standard conversion sequence converts the source type to the type of the first parameter of that constructor. If the user-defined conversion is specified by a conversion function, the initial standard conversion sequence converts the source type to the type of the implicit object parameter of that conversion function.
This sounds about right to me: we need to convert the from_type argument to a to_type temporary so that to_type&& can bind to it. The sequence is therefore either
from_type -> const from_type& for the converting constructor argument to_type(const from_type&) -> to_type&& for the move-assignment operator of to_type& operator=(to_type&&)
OR
from_type -> implicit-object-parameter of type const from_type& for conversion function operator to_type() const -> to_type&& for the move-assignment operator of to_type& operator=(to_type&&).
Now we have two possible conversion sequences of the same kind (user-defined). For this scenario [over.best.ics.general]/10 says the following:
If there are multiple well-formed implicit conversion sequences converting the argument to the parameter type, the implicit conversion sequence associated with the parameter is defined to be the unique conversion sequence designated the ambiguous conversion sequence. For the purpose of ranking implicit conversion sequences as described in [over.ics.rank], the ambiguous conversion sequence is treated as a user-defined conversion sequence that is indistinguishable from any other user-defined conversion sequence.
[over.ics.rank] (Ranking implicit conversion sequences) then gives the following clue about which conversion should take precedence among user-defined conversion sequences ([over.ics.rank]/3.3):
User-defined conversion sequence U1 is a better conversion sequence than another user-defined conversion sequence U2 if they contain the same user-defined conversion function or constructor or they initialize the same class in an aggregate initialization and in either case the second standard conversion sequence of U1 is better than the second standard conversion sequence of U2
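Here we go: the two candidate sequences use different user-defined conversions (the converting constructor versus the conversion function), so this rule cannot prefer one of them, and in both cases the second standard conversion is the same (binding a to_type temporary to to_type&&). The sequences are therefore indistinguishable, and the conversion is ambiguous.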
Clang is wrong to reject the program, because an overload taking T&& is a better match than an overload taking const T&. Note that gcc is also wrong, because it uses the constructor to_type(const from_type&) instead of operator to_type() const. Only msvc is right; both gcc and clang are wrong. MSVC correctly uses the conversion function.
S1       S2            Ranking
int      int&&         indistinguishable
int      const int&    indistinguishable
int&&    const int&    S1 better
Consider the contrived example:
#include <iostream>
struct to_type;
struct from_type{
operator to_type()const;
};
struct to_type{
to_type() = default;
to_type(const from_type&){std::cout <<"copy ctor";}
};
from_type::operator to_type()const{
std::cout<<"to_type operator";
return to_type();}
void f(to_type&&){}
int main(){
from_type From;
f(From); //valid and this should use from_type::operator to_type() const
}
The above program is rejected by clang (with the same error as you're getting) but accepted by gcc and msvc. Note that even though gcc accepts the program, it is still wrong, because the conversion function should be used instead of the constructor that prints "copy ctor". MSVC, on the other hand, correctly uses the conversion function.

How to overload operator + for const char* and int

I know this is silly and ugly, but I'm migrating some code automatically. My source language allows implicit conversion between strings and ints, and for example this is allowed:
var = "hello " + 2
print(var) # prints "hello 2"
How can I in C++ overload the + operator for const char* and int? I'm getting the error:
error: ‘std::string operator+(char* const&, int)’ must have an
argument of class or enumerated type
What you are asking for is illegal
To legally overload an operator, at least one of the operands involved has to be of a user-defined type. Since neither char* nor int is user-defined, what you are trying to accomplish isn't possible.
What you are trying to do is intentionally and explicitly disallowed by the standard. Don't you think it would be weird if suddenly 1+3 = 42 because someone "clever" had defined an overload for operator+(int, int)?
What does the Standard say? (n3337)
13.3.1.2p1-2 Operators in expressions [over.match.oper]
If no operand of an operator in an expression has a type that is a class or an enumeration, the operator is assumed to be a built-in operator and interpreted according to Clause 5.
If either operand has a type that is a class or an enumeration, a user-defined operator function might be declared that implements this operator or a user-defined conversion can be necessary to convert the operand to a type that is appropriate for a built-in operator.
(Note: the wording is the same in C++03 and in the next revision of the standard, C++14.)

Is it possible to define an implicit conversion operator to std::array?

I am trying to have a C++ class that can be implicitly converted to std::array. Conversion works, but it is not implicit.
#include <array>
class A {
private:
std::array<float, 7> data;
public:
operator std::array<float, 7>&() { return data; }
operator const std::array<float, 7>&() const { return data; }
};
int main() {
A a;
a[1] = 0.5f; // fails to compile
auto it = a.begin(); // fails to compile
A b;
static_cast<std::array<float, 7>>(b)[1] = 0.5f; //ok
auto it2 = static_cast<std::array<float, 7>>(b).begin(); //ok
return 0;
}
I understand the above example is quite convoluted, as it basically completely exposes a private member of the class. But this is an oversimplified example; I am just trying to understand why implicit conversions to std::array do not work.
I have tried the above example with both clang-3.2 and gcc-4.8. Neither compiles.
Even more perplexing is that if I use implicit conversion to pointer, compilation apparently succeeds:
operator float *() { return data.begin(); }
operator const float *() const { return data.cbegin(); }
But of course, this means losing the many niceties of std::array, which I will accept if there isn't a better solution.
I'm answering your question from a comment:
Could you please elaborate on why my conversion does not make sense? While trying to resolve operator[], why should the compiler not consider possible conversions?
Short answer: because that's how it works. A conversion function to a built-in type can be used here, but not one to a user-defined type.
A bit longer answer:
When an operator is used in an expression, overload resolution follows the rules laid out in 13.3.1.2.
First:
2 If either operand has a type that is a class or an enumeration, a user-defined operator function might be
declared that implements this operator or a user-defined conversion can be necessary to convert the operand
to a type that is appropriate for a built-in operator. In this case, overload resolution is used to determine
which operator function or built-in operator is to be invoked to implement the operator [...].
For this purpose, a[1] is interpreted as a.operator[](1), as shown in Table 11 in the same section.
The lookup is then performed as follows:
3 For a unary operator # with an operand of a type whose cv-unqualified version is T1, and for a binary
operator # with a left operand of a type whose cv-unqualified version is T1 and a right operand of a type
whose cv-unqualified version is T2, three sets of candidate functions, designated member candidates, non-
member candidates and built-in candidates, are constructed as follows:
— If T1 is a complete class type, the set of member candidates is the result of the qualified lookup of
T1::operator# (13.3.1.1.1); otherwise, the set of member candidates is empty. [1]
— The set of non-member candidates is the result of the unqualified lookup of operator# in the context
of the expression according to the usual rules for name lookup in unqualified function calls (3.4.2)
except that all member functions are ignored. However, if no operand has a class type, only those
non-member functions in the lookup set that have a first parameter of type T1 or “reference to (possibly
cv-qualified) T1”, when T1 is an enumeration type, or (if there is a right operand) a second parameter
of type T2 or “reference to (possibly cv-qualified) T2”, when T2 is an enumeration type, are candidate
functions. [2]
— For the operator ,, the unary operator &, or the operator ->, the built-in candidates set is empty.
For all other operators, the built-in candidates include all of the candidate operator functions defined
in 13.6 that, compared to the given operator,
— have the same operator name, and
— accept the same number of operands, and
— accept operand types to which the given operand or operands can be converted according to
13.3.3.1, and [3]
— do not have the same parameter-type-list as any non-template non-member candidate.
The result is as follows:
[1] finds nothing (there's no operator[] in your class)
[2] finds nothing (there's no free function operator[], and neither operand is of enumeration type)
[3] finds the built-in operator[](float*, std::ptrdiff_t) because A declares a conversion to float* (with only the conversions to std::array, there is no matching built-in candidate)
You can get them to work by overloading operator[] and begin() on A, or publicly inheriting from array (not recommended though).
The implicit conversion only works when it makes sense (say if you passed an A to a function that expects a std::array<float, 7>), not in your case. And that's a good thing if you ask me.

If the first operand of an additive expression is convertible to both pointer and integer, which conversion is chosen?

In the following example, which conversion function should be called? Why should that one be chosen over the other?
struct A
{
operator int();
operator int*();
};
A x;
int i = x + 1;
The compiler chooses operator int(), but why?
Here are some relevant quotes from C++03:
From [expr.add]
For addition, either both operands shall have arithmetic or enumeration type, or one operand shall be a pointer to a completely defined object type and the other shall have integral or enumeration type.
From [conv]
expressions with a given type will be implicitly converted to other types in several contexts:
When used as operands of operators. The operator’s requirements for its operands dictate the destination type
The reason for this behavior is that the built-in operator which accepts a pointer as its left hand operand accepts an object of type std::ptrdiff_t as its right hand operand. This is specified in § 13.6 of the C++11 Standard:
For every cv-qualified or cv-unqualified object type T there exist candidate operator functions of the form
T * operator+(T *, std::ptrdiff_t);
[...]
Since 1 has type int, the compiler considers the built-in operator + that takes two ints a better choice, because it only requires a (user-defined) conversion for the first argument.
If you provided an argument of type std::ptrdiff_t as the right hand operand of operator +, you would see the expected ambiguity:
int i = x + static_cast<std::ptrdiff_t>(1); // AMBIGUOUS!

Unary + on pointers

I was just browsing through the draft of the C++11 standard and found the following puzzling statement (§13.6/8):
For every type T there exist candidate operator functions of the form
T* operator+(T*);
How should this "unary +" operator on pointer be understood? Is this just a no-op in the normal case, which can nevertheless be overloaded? Or is there some deeper point I am missing here?
Unary + on a pointer is a no-op, except that it turns the operand into an rvalue. It is sometimes handy if you want to decay arrays or functions:
int a[] = { 1, 2, 3 };
auto &&x = +a;
Now x is an int*&& and not an int(&)[3]. If you want to pass x or +a to templates, this difference might become important. a + 0 is not always equivalent, consider
struct forward_decl;
extern forward_decl a[];
auto &&x = +a; // well-formed
auto &&y = a + 0; // ill-formed
The last line is ill-formed, because adding anything to a pointer requires the pointer's pointed-to class type to be completely defined (because it advances by sizeof(forward_decl) * N bytes).
The answer to your question is just a page above the quote you cited — §13.6/1:
The candidate operator functions that represent the built-in operators defined in Clause 5 are specified in this subclause. These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose. [ Note: Because built-in operators take only operands with non-class type, and operator overload resolution occurs only when an operand expression originally has class or enumeration type, operator overload resolution can resolve to a built-in operator only when an operand has a class type that has a user-defined conversion to a non-class type appropriate for the operator, or when an operand has an enumeration type that can be converted to a type appropriate for the operator. Also note that some of the candidate operator functions given in this subclause are more permissive than the built-in operators themselves. As described in 13.3.1.2, after a built-in operator is selected by overload resolution the expression is subject to the requirements for the built-in operator given in Clause 5, and therefore to any additional semantic constraints given there. If there is a user-written candidate with the same name and parameter types as a built-in candidate operator function, the built-in operator function is hidden and is not included in the set of candidate functions. —end note ]
Well, you could overload it to do whatever you want, but it's just there for symmetry with the unary - operator. As you mention, it's just a no-op most of the time.