There are some new rules about rewritten comparison operators in C++20, and I'm trying to understand how they work. I've run into the following program:
struct B {};
struct A
{
bool operator==(B const&); // #1
};
bool operator==(B const&, A const&); // #2
int main()
{
B{} == A{}; // C++17: calls #2
// C++20: calls #1
}
which actually breaks existing code. I'm a little surprised by this; #2 actually still looks better to me :p
So how do these new rules change the meaning of existing code?
That particular aspect is a simple form of rewriting, reversing the operands. The primary operators == and <=> can be reversed, the secondaries !=, <, >, <=, and >=, can be rewritten in terms of the primaries.
The reversing aspect can be illustrated with a relatively simple example.
If you don't have a specific B::operator==(A) to handle b == a, you can use the reverse to do it instead: A::operator==(B). This makes sense because equality is a bi-directional relationship: (a == b) => (b == a).
Rewriting for secondary operators, on the other hand, involves using different operators. Consider a > b. If you cannot locate a function to do that directly, such as A::operator>(B), the language will go looking for things like A::operator<=>(B) then simply calculating the result from that.
That's a simplistic view of the process but it's one that most of my students seem to understand. If you want more details, it's covered in the [over.match.oper] section of C++20, part of overload resolution (# is a placeholder for the operator):
For the relational and equality operators, the rewritten candidates include all member, non-member, and built-in candidates for the operator <=> for which the rewritten expression (x <=> y) # 0 is well-formed using that operator<=>.
For the relational, equality, and three-way comparison operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each member, non-member, and built-in candidate for the
operator <=> for which the rewritten expression 0 # (y <=> x) is well-formed using that operator<=>.
Hence gone are the days of having to provide a real operator== and operator<, then boiler-plating:
operator!= as ! operator==
operator> as ! (operator== || operator<)
operator<= as operator== || operator<
operator>= as ! operator<
Don't complain if I've gotten one or more of those wrong, that just illustrates my point on how much better C++20 is, since you now only have to provide a minimal set (most likely just operator<=> plus whatever else you want for efficiency) and let the compiler look after it :-)
The question as to why one is being selected over the other can be discerned with this code:
#include <iostream>
struct B {};
struct A {
bool operator==(B const&) { std::cout << "1\n"; return true; }
};
bool operator==(B const&, A const&) { std::cout << "2\n"; return true; }
int main() {
auto b = B{}; auto a = A{};
b == a; // outputs: 1
(const B)b == a; // 1
b == (const A)a; // 2
(const B)b == (const A)a; // 2
}
The output of that indicates that it's the const-ness of a deciding which is the better candidate.
As an aside, you may want to have a look at this article, which offers a more in-depth look.
From a non-language-lawyer sense, it works like this. C++20 requires that operator== compute whether the two objects are equal. The concept of equality is commutative: if A == B, then B == A. As such, if there are two operator== functions that could be called by C++20's argument reversal rules, then your code should behave identically either way.
Basically, what C++20 is saying is that if it matters which one gets called, you're defining "equality" incorrectly.
So let's get into the details. And by "the details", I mean the most horrifying chapter of the standard: function overload resolution.
[over.match.oper]/3 defines the mechanism by which the candidate function set for an operator overload is built. C++20 adds to this by introducing "rewritten candidates": a set of candidate functions discovered by rewriting the expression in a way that C++20 deems to be logically equivalent. This only applies to the relational and in/equality operators.
The set is built in accord with the following:
For the relational ([expr.rel]) operators, the rewritten candidates include all non-rewritten candidates for the expression x <=> y.
For the relational ([expr.rel]) and three-way comparison ([expr.spaceship]) operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each non-rewritten candidate for the expression y <=> x.
For the != operator ([expr.eq]), the rewritten candidates include all non-rewritten candidates for the expression x == y.
For the equality operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each non-rewritten candidate for the expression y == x.
For all other operators, the rewritten candidate set is empty.
Note the particular concept of a "synthesized candidate". This is standard-speak for "reversing the arguments".
The rest of the section details what it means if one of the rewritten candidates gets chosen (aka: how to synthesize the call). To find which candidate gets chosen, we must delve into the most horrifying part of the most horrifying chapter of the C++ standard:
Best viable function matching.
What matters here is this statement:
a viable function F1 is defined to be a better function than another viable function F2 if for all arguments i, ICSi(F1) is not a worse conversion sequence than ICSi(F2), and then
And that matters... because of this. Literally.
By the rules of [over.ics.scs], an identity conversion is a better match than a conversion that adds a qualifier.
A{} is a prvalue, and... it's not const. Neither is the this parameter to the member function. So it's an identity conversion, which is a better conversion sequence than one that goes to the const A& of the non-member function.
Yes, there is a rule further down that explicitly makes rewritten functions in the candidate list less viable. But it doesn't matter, because the rewritten call is a better match on function arguments alone.
If you use explicit variables and declare one like this A const a{};, then [over.match.best]/2.8 gets involved and de-prioritizes the rewritten version. As seen here. Similarly, if you make the member function const, you also get consistent behavior.
Related
In the following simplified program, struct C inherits from two structs A and B. The former defines both spaceship operator <=> and less operator, the latter – only spaceship operator. Then less operation is performed with objects of class C:
#include <compare>
struct A {
auto operator <=>(const A&) const = default;
bool operator <(const A&) const = default;
};
struct B {
auto operator <=>(const B&) const = default;
};
struct C : A, B {};
int main() {
C c;
c.operator<(c); //ok everywhere
return c < c; //error in GCC
}
The surprising moment here is that the explicit call c.operator<(c) succeeds in all compliers, but the similar call c<c is permitted by Clang but rejected in GCC:
error: request for member 'operator<=>' is ambiguous
<source>:8:10: note: candidates are: 'auto B::operator<=>(const B&) const'
<source>:4:10: note: 'constexpr auto A::operator<=>(const A&) const'
Demo: https://gcc.godbolt.org/z/xn7W9PaPc
There is a possibly related question: GCC can't differentiate between operator++() and operator++(int) . But in that question the explicit operator (++) call is rejected by all compilers, unlike this question where explicit operator call is accepted by all.
I thought that only one operator < is present in C, which was derived from A, and starship operator shall not be considered at all. Is it so and what compiler is right here?
gcc is correct here.
When you do:
c.operator<(c);
You are performing name lookup on something literally named operator<. There is only one such function (the one in A) so this succeeds.
But when you do c < c, you're not doing lookup for operator<. You're doing two things:
a specific lookup for c < c which finds operator< candidates (member, non-member, or builtin)
finding all rewritten candidates for c <=> c
Now, the first lookup succeeds and finds the same A::operator< as before. But the second lookup fails - because c <=> c is ambiguous (between the candidates in A and B). And the rule, from [class.member.lookup]/6 is:
The result of the search is the declaration set of S(N,T).
If it is an invalid set, the program is ill-formed.
We have an invalid set as the result of the search, so the program is ill-formed. It's not that we find nothing, it's that the whole lookup fails. Just because in this context we're looking up a rewritten candidate rather than a primary candidate doesn't matter, it's still a failed lookup.
And it's actually good that it fails because if we fix this ambiguous merge set issue in the usual way:
struct C : A, B {
+ using A::operator<=>;
+ using B::operator<=>;
};
Then our lookup would be ambiguous! Because now our lookup for the rewritten candidates finds two operator<=>s, so we end up with three candidates:
operator<(A const&, A const&)
operator<=>(A const&, A const&)
operator<=>(B const&, B const&)
1 is better than 2 (because a primary candidate is better than a rewritten candidate), but 1 vs 3 is ambiguous (neither is better than the other).
So the fact that the original fails, and this one also fails, is good: it's up to you as the class author to come up with the right thing to do - since it's not obvious what that is.
I reported the issue to Microsoft and they tell me that their and Clang behavior is right, while GCC is wrong: https://developercommunity.visualstudio.com/t/False-acceptance-of-ambiguity-in-case-of/1534112
Let me quote their answer here for completeness:
The compiler behavior you’re observing is by design as per the resolution outlined in https://cdacamar.github.io/wg21papers/proposed/spaceship-dr.html. The reason is that the compiler will find both A::operator<=>, A::operator<, and B::operator<=> as possible overload resolution candidates. Because A::operator< does not require rewriting the expression the compiler will not consider A::operator<=> because A::operator< is declared in the same scope with the same signature so the only remaining candidates are A::operator< and B::operator<=> but since <=> requires rewriting the expression it is dropped later and A::operator< is selected. You can observe that Clang has the same behavior here: https://godbolt.org/z/M6ffr95Ej
I'm running into a strange behavior with the new spaceship operator <=> in C++20. I'm using Visual Studio 2019 compiler with /std:c++latest.
This code compiles fine, as expected:
#include <compare>
struct X
{
int Dummy = 0;
auto operator<=>(const X&) const = default; // Default implementation
};
int main()
{
X a, b;
a == b; // OK!
return 0;
}
However, if I change X to this:
struct X
{
int Dummy = 0;
auto operator<=>(const X& other) const
{
return Dummy <=> other.Dummy;
}
};
I get the following compiler error:
error C2676: binary '==': 'X' does not define this operator or a conversion to a type acceptable to the predefined operator
I tried this on clang as well, and I get similar behavior.
I would appreciate some explanation on why the default implementation generates operator== correctly, but the custom one doesn't.
This is by design.
[class.compare.default] (emphasis mine)
3 If the class definition does not explicitly declare an ==
operator function, but declares a defaulted three-way comparison
operator function, an == operator function is declared implicitly
with the same access as the three-way comparison operator function.
The implicitly-declared == operator for a class X is an inline
member and is defined as defaulted in the definition of X.
Only a defaulted <=> allows a synthesized == to exist. The rationale is that classes like std::vector should not use a non-defaulted <=> for equality tests. Using <=> for == is not the most efficient way to compare vectors. <=> must give the exact ordering, whereas == may bail early by comparing sizes first.
If a class does something special in its three-way comparison, it will likely need to do something special in its ==. Thus, instead of generating a potentially non-sensible default, the language leaves it up to the programmer.
During the standardization of this feature, it was decided that equality and ordering should logically be separated. As such, uses of equality testing (== and !=) will never invoke operator<=>. However, it was still seen as useful to be able to default both of them with a single declaration. So if you default operator<=>, it was decided that you also meant to default operator== (unless you define it later or had defined it earlier).
As to why this decision was made, the basic reasoning goes like this. Consider std::string. Ordering of two strings is lexicographical; each character has its integer value compared against each character in the other string. The first inequality results in the result of ordering.
However, equality testing of strings has a short-circuit. If the two strings aren't of equal length, then there's no point in doing character-wise comparison at all; they aren't equal. So if someone is doing equality testing, you don't want to do it long-form if you can short-circuit it.
It turns out that many types that need a user-defined ordering will also offer some short-circuit mechanism for equality testing. To prevent people from implementing only operator<=> and throwing away potential performance, we effectively force everyone to do both.
The other answers explain really well why the language is like this. I just wanted to add that in case it's not obvious, it is of course possible to have a user-provided operator<=> with a defaulted operator==. You just need to explicitly write the defaulted operator==:
struct X
{
int Dummy = 0;
auto operator<=>(const X& other) const
{
return Dummy <=> other.Dummy;
}
bool operator==(const X& other) const = default;
};
Note that the defaulted operator== performs memberwise == comparisons. That is to say, it is not implemented in terms of the user-provided operator<=>. So requiring the programmer to explicitly ask for this is a minor safety feature to help prevent surprises.
I'm running into a strange behavior with the new spaceship operator <=> in C++20. I'm using Visual Studio 2019 compiler with /std:c++latest.
This code compiles fine, as expected:
#include <compare>
struct X
{
int Dummy = 0;
auto operator<=>(const X&) const = default; // Default implementation
};
int main()
{
X a, b;
a == b; // OK!
return 0;
}
However, if I change X to this:
struct X
{
int Dummy = 0;
auto operator<=>(const X& other) const
{
return Dummy <=> other.Dummy;
}
};
I get the following compiler error:
error C2676: binary '==': 'X' does not define this operator or a conversion to a type acceptable to the predefined operator
I tried this on clang as well, and I get similar behavior.
I would appreciate some explanation on why the default implementation generates operator== correctly, but the custom one doesn't.
This is by design.
[class.compare.default] (emphasis mine)
3 If the class definition does not explicitly declare an ==
operator function, but declares a defaulted three-way comparison
operator function, an == operator function is declared implicitly
with the same access as the three-way comparison operator function.
The implicitly-declared == operator for a class X is an inline
member and is defined as defaulted in the definition of X.
Only a defaulted <=> allows a synthesized == to exist. The rationale is that classes like std::vector should not use a non-defaulted <=> for equality tests. Using <=> for == is not the most efficient way to compare vectors. <=> must give the exact ordering, whereas == may bail early by comparing sizes first.
If a class does something special in its three-way comparison, it will likely need to do something special in its ==. Thus, instead of generating a potentially non-sensible default, the language leaves it up to the programmer.
During the standardization of this feature, it was decided that equality and ordering should logically be separated. As such, uses of equality testing (== and !=) will never invoke operator<=>. However, it was still seen as useful to be able to default both of them with a single declaration. So if you default operator<=>, it was decided that you also meant to default operator== (unless you define it later or had defined it earlier).
As to why this decision was made, the basic reasoning goes like this. Consider std::string. Ordering of two strings is lexicographical; each character has its integer value compared against each character in the other string. The first inequality results in the result of ordering.
However, equality testing of strings has a short-circuit. If the two strings aren't of equal length, then there's no point in doing character-wise comparison at all; they aren't equal. So if someone is doing equality testing, you don't want to do it long-form if you can short-circuit it.
It turns out that many types that need a user-defined ordering will also offer some short-circuit mechanism for equality testing. To prevent people from implementing only operator<=> and throwing away potential performance, we effectively force everyone to do both.
The other answers explain really well why the language is like this. I just wanted to add that in case it's not obvious, it is of course possible to have a user-provided operator<=> with a defaulted operator==. You just need to explicitly write the defaulted operator==:
struct X
{
int Dummy = 0;
auto operator<=>(const X& other) const
{
return Dummy <=> other.Dummy;
}
bool operator==(const X& other) const = default;
};
Note that the defaulted operator== performs memberwise == comparisons. That is to say, it is not implemented in terms of the user-provided operator<=>. So requiring the programmer to explicitly ask for this is a minor safety feature to help prevent surprises.
I'm running into a strange behavior with the new spaceship operator <=> in C++20. I'm using Visual Studio 2019 compiler with /std:c++latest.
This code compiles fine, as expected:
#include <compare>
struct X
{
int Dummy = 0;
auto operator<=>(const X&) const = default; // Default implementation
};
int main()
{
X a, b;
a == b; // OK!
return 0;
}
However, if I change X to this:
struct X
{
int Dummy = 0;
auto operator<=>(const X& other) const
{
return Dummy <=> other.Dummy;
}
};
I get the following compiler error:
error C2676: binary '==': 'X' does not define this operator or a conversion to a type acceptable to the predefined operator
I tried this on clang as well, and I get similar behavior.
I would appreciate some explanation on why the default implementation generates operator== correctly, but the custom one doesn't.
This is by design.
[class.compare.default] (emphasis mine)
3 If the class definition does not explicitly declare an ==
operator function, but declares a defaulted three-way comparison
operator function, an == operator function is declared implicitly
with the same access as the three-way comparison operator function.
The implicitly-declared == operator for a class X is an inline
member and is defined as defaulted in the definition of X.
Only a defaulted <=> allows a synthesized == to exist. The rationale is that classes like std::vector should not use a non-defaulted <=> for equality tests. Using <=> for == is not the most efficient way to compare vectors. <=> must give the exact ordering, whereas == may bail early by comparing sizes first.
If a class does something special in its three-way comparison, it will likely need to do something special in its ==. Thus, instead of generating a potentially non-sensible default, the language leaves it up to the programmer.
During the standardization of this feature, it was decided that equality and ordering should logically be separated. As such, uses of equality testing (== and !=) will never invoke operator<=>. However, it was still seen as useful to be able to default both of them with a single declaration. So if you default operator<=>, it was decided that you also meant to default operator== (unless you define it later or had defined it earlier).
As to why this decision was made, the basic reasoning goes like this. Consider std::string. Ordering of two strings is lexicographical; each character has its integer value compared against each character in the other string. The first inequality results in the result of ordering.
However, equality testing of strings has a short-circuit. If the two strings aren't of equal length, then there's no point in doing character-wise comparison at all; they aren't equal. So if someone is doing equality testing, you don't want to do it long-form if you can short-circuit it.
It turns out that many types that need a user-defined ordering will also offer some short-circuit mechanism for equality testing. To prevent people from implementing only operator<=> and throwing away potential performance, we effectively force everyone to do both.
The other answers explain really well why the language is like this. I just wanted to add that in case it's not obvious, it is of course possible to have a user-provided operator<=> with a defaulted operator==. You just need to explicitly write the defaulted operator==:
struct X
{
int Dummy = 0;
auto operator<=>(const X& other) const
{
return Dummy <=> other.Dummy;
}
bool operator==(const X& other) const = default;
};
Note that the defaulted operator== performs memberwise == comparisons. That is to say, it is not implemented in terms of the user-provided operator<=>. So requiring the programmer to explicitly ask for this is a minor safety feature to help prevent surprises.
Consider the following simple example
struct C
{
template <typename T> operator T () {return 0.5;}
operator int () {return 1;}
operator bool () {return false;}
};
int main ()
{
C c;
double x = c;
std::cout << x << std::endl;
}
When compiled with Clang, it gives the following error
test.cpp:11:12: error: conversion from 'C' to 'double' is ambiguous
double x = c;
^ ~
test.cpp:4:5: note: candidate function
operator int () {return 1;}
^
test.cpp:5:5: note: candidate function
operator bool () {return false;}
^
test.cpp:3:27: note: candidate function [with T = double]
template <typename T> operator T () {return 0.5;}
^
1 error generated.
Other compilers generate similar errors, e.g., GCC and Intel iclc
If I remove operator int and operator bool. It compiles fine and work as expected. If only remove one of them, that is keep the template operator and say operator int, then the non-template version is always chosen.
My understanding is that only when the template and non-template overloaded functions are equal in the sense that they are both perfect match or both require the same conversion sequence, the non-template version will be preferred. However in this case, it appears that the compiler does not see the operator template as a perfect match. And when both the bool and int overloading are present, then naturally it considers they are ambiguous.
In summary, my question is that why the operator template is not considered a perfect match in this case?
Let's break this down into two different problems:
1. Why does this generate a compiler error?
struct C
{
operator bool () {return false;}
operator int () {return 1;}
};
As both int and bool can be implicitly converted to double, the compiler can not know which function it should use. There are two functions which it could use and neither one takes precedence over the other one.
2. Why isn't the templated version a perfect match?
struct C
{
template <typename T> operator T () {return 0.5;}
operator int () {return 1;}
};
Why is operator int() called when requesting a double?
The non-template function is called because a non-template function takes precedence in overload resolution. (Overloading function templates)
EDIT:
I was wrong! As Yan Zhou mentioned in his comment, and as it is stated in the link I provided, a perfect match in the templated function takes precedence over the non-templated function.
I tested your code (compiled with g++ 4.7.2), and it worked as expected: it returned 0.5, in other words, the templated function was used!
EDIT2:
I now tried with clang and I can reproduce the behaviour you described. As it works correctly in gcc, this seems to be a bug in clang.
This is interesting. There are two ways to read a critical part of section 13.3.3. The original example should definitely call the function template, but the version where one of the non-templates is removed might be argued to be ambiguous.
13.3.3:
A viable function F1 is defined to be a better function than another viable function F2 if for all arguments i, ICS_i(F1) is not a worse conversion sequence than ICS_i(F2), and then
for some argument j, ICS_j(F1) is a better conversion sequence than ICS_j(F2), or, if not that,
the context is an initialization by user-defined conversion (see 8.5, 13.3.1.5, and 13.3.1.6) and the standard conversion sequence from the return type of F1 to the destination type (i.e., the type of the entity being initialized) is a better conversion sequence than the standard conversion sequence from the return type of F2 to the destination type, or, if not that,
F1 is a non-template function and F2 is a function template specialization, or, if not that,
F1 and F2 are function template specializations, and the function template for F1 is more specialized than the template for F2 according to the partial ordering rules described in 14.5.6.2.
If there is exactly one viable function that is a better function than all other viable functions, then it is the one selected by overload resolution; otherwise the call is ill-formed.
In the example, clang correctly identifies the set of three viable candidate functions:
C::operator int()
C::operator bool()
C::operator double<double>()
The third is a function template specialization. (I don't think the syntax above is legal, but you get the idea: at this point of overload resolution it's not treated as a template, but as a specialization with a definite function type.)
The only Implicit Conversion Sequence on arguments here (ICS1) is the exact match "lvalue C" to "C&" on the implicit parameter, so that won't make a difference.
This example is exactly the situation described in the second bullet, so the function returning double is clearly better than the other two.
Here's where it gets weird: By a very literal reading, operator int is also better than the template specialization, because of the third bullet. "Wait a minute, shouldn't 'better than' be antisymmetric? How can you say F1 is better than F2 AND F2 is better than F1?" Unfortunately, the Standard doesn't explicitly say anything of the sort. "Doesn't the second bullet take priority over the third bullet because of the 'if not that' phrase?" Yes, for constant F1 and F2. But the Standard doesn't say that satisfying the second bullet for (F1,F2) makes the third bullet for (F2,F1) not applicable.
Of course, since operator int is not better than operator bool and vice versa, there is still "exactly one viable function that is a better function than all other viable functions".
I'm not exactly endorsing this weird reading, except maybe to report it as a Standard defect. Going with that would have bizarre consequences (like removing an overload which was not the best from this example changes a program from well-formed to ambiguous!). I think the intent is for the second bullet to be considered both ways before the third bullet is considered at all.
Which would mean the function template should be selected by overload resolution, and this is a clang bug.