Given a programmer-defined POD struct that will be stored in an unordered_map, is there any particular advantage in defining:
namespace std {
template<>
struct equal_to<MyType> {
bool operator()(const MyType& lhs, const MyType& rhs) const {
...
}
};
}
over simply defining:
bool operator==(const MyType& lhs, const MyType& rhs)
(I'm already aware of the potential advantage of using an "inlineable" function object rather than a function pointer for the hashing function).
I would say operator== has more uses than a specialization of equal_to<> because people normally write a == b, not equal_to<T>()(a, b). And the default equal_to<> is implemented in terms of operator==, not the other way around.
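As a rough illustration of the point above, here is a minimal sketch in which only operator== is defined and the container's default std::equal_to<MyType> picks it up. The fields of MyType and the MyTypeHash hasher are hypothetical and not taken from the question.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical POD type; the fields are illustrative only.
struct MyType {
    int id;
    int version;
};

// Plain operator==; the default std::equal_to<MyType> forwards to it.
inline bool operator==(const MyType& lhs, const MyType& rhs) {
    return lhs.id == rhs.id && lhs.version == rhs.version;
}

// Hypothetical hasher; the multiplier is an arbitrary choice.
struct MyTypeHash {
    std::size_t operator()(const MyType& v) const {
        return std::hash<int>()(v.id) * 31u + std::hash<int>()(v.version);
    }
};

// Looks a key up in a map that relies on the default KeyEqual argument.
inline std::string lookup_demo() {
    std::unordered_map<MyType, std::string, MyTypeHash> names;
    names[MyType{1, 2}] = "first";
    names[MyType{1, 3}] = "second";
    return names.at(MyType{1, 3});
}
```

No specialization in namespace std is needed for this to work; the same operator== also serves ordinary a == b expressions.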
If you need to specialize std::equal_to because it must behave differently from operator==, then a better idea may be to implement a custom my_equal_to predicate class, not related to std::equal_to in order to follow the principle of least surprise.
Also, there is an interface deficiency in std::equal_to<T>: it only accepts two arguments of the same type. C++14 std::equal_to<void> fixes this by accepting arguments of different types and forwarding them to operator==.
operator==, on the other hand, can have multiple overloads for different types (e.g. operator==(std::string const&, char const*)).
This means that in C++14 std::equal_to<void> and overloaded operator== work nicely together; see N3657, "Adding heterogeneous comparison lookup to associative containers", for more details.
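The interplay between std::equal_to<void> and overloaded operator== might look like the sketch below. The Name type and its overloads are hypothetical; std::equal_to<> requires C++14.

```cpp
#include <functional>
#include <string>

// Hypothetical type with heterogeneous operator== overloads.
struct Name {
    std::string text;
};

inline bool operator==(const Name& lhs, const Name& rhs) { return lhs.text == rhs.text; }
inline bool operator==(const Name& lhs, const char* rhs) { return lhs.text == rhs; }
inline bool operator==(const char* lhs, const Name& rhs) { return rhs == lhs; }

inline bool heterogeneous_demo() {
    const Name n{"alice"};
    std::equal_to<> eq;  // C++14: shorthand for std::equal_to<void>
    // Each call forwards both arguments, unconverted, to the matching operator==.
    return eq(n, "alice") && eq("alice", n) && eq(n, Name{"alice"}) && !eq(n, "bob");
}
```

With std::equal_to<Name> the mixed calls would not compile here, since Name has no implicit conversion from const char*.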
I'm already aware of the potential advantage of using an "inlineable" function object rather than a function pointer for the hashing function
Function pointers do not apply here: the default equal_to<> calls operator== directly, not through a pointer.
If you have a class hierarchy and want to compare two objects of the child class using the parent class's equality, you can use the equal_to variant instantiated for the parent type, but not the == variant, which will choose the comparator for the child class.
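A sketch of what such a hierarchy might look like follows. The Parent and Child classes are hypothetical and only illustrate the lookup difference.

```cpp
#include <functional>

// Hypothetical hierarchy; both classes define their own operator==.
struct Parent {
    int id;
    bool operator==(const Parent& other) const { return id == other.id; }
};

struct Child : Parent {
    int extra;
    Child(int i, int e) : Parent{i}, extra(e) {}
    bool operator==(const Child& other) const {
        return Parent::operator==(other) && extra == other.extra;
    }
};

inline bool parent_equal(const Child& a, const Child& b) {
    // Both arguments bind to const Parent&, so Parent::operator== runs.
    return std::equal_to<Parent>()(a, b);
}

inline bool child_equal(const Child& a, const Child& b) {
    // Overload resolution picks Child::operator== here.
    return a == b;
}
```

Two children with the same id but a different extra compare equal through parent_equal and unequal through child_equal.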
Note: I presume that this is technically a duplicate of this question, but:
the changes to == in C++20 are quite radical, and I am not sure that reviving a 9-year-old question is the proper thing to do.
I ask specifically about the operators == and <=> that are rewritten by the compiler, not, for example, operator<.
P.S. I have my own opinion at the moment (based on a talk by foonathan), but it is just a current preference and I prefer not to bias the potential answers with it.
I would argue that in C++20, comparisons should be member functions unless you have a strongly compelling reason otherwise.
Lemme first start with the C++17 calculus: we would often write our comparisons as non-members. The reason for this is that it was the only way to allow two-sided comparisons. If I had a type X that I wanted to be comparable to int, I can't make 1 == X{} work with a member function - it has to be a free function:
struct X { };
bool operator==(X, int);
bool operator==(int lhs, X rhs) { return rhs == lhs; }
There wasn't much choice in the matter. Now, writing these as purely free functions is sub-optimal because we're polluting the namespace and increasing the number of candidates in lookup, so it's better to make them hidden friends:
struct X {
friend bool operator==(X, int);
friend bool operator==(int lhs, X rhs) { return rhs == lhs; }
};
In C++20, we don't have this issue because the comparisons are themselves symmetric. You can just write:
struct X {
bool operator==(int) const;
};
That declaration alone already allows both X{} == 1 and 1 == X{}, without contributing extra candidates to name lookup (it will only be a candidate if one side or the other is an X).
Moreover, in C++20, you can default comparisons if they're declared within the declaration of the class. These could be either member functions or hidden friends, but not external free functions.
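A small sketch of both points, assuming a compiler that sets __cplusplus to at least 202002L in C++20 mode. The value member and the Point type are hypothetical. The C++20-only parts are guarded so the snippet still compiles under older standards, where only the member form is available.

```cpp
// Hypothetical wrapper type; `value` is illustrative.
struct X {
    int value;
    bool operator==(int rhs) const { return value == rhs; }
};

// Works in every standard: the X operand is on the left.
inline bool member_form(const X& x, int i) { return x == i; }

#if __cplusplus >= 202002L
// C++20 only: the compiler rewrites `i == x` as `x == i`.
inline bool reversed_form(int i, const X& x) { return i == x; }

// C++20 only: a comparison declared inside the class can be defaulted.
struct Point {
    int x;
    int y;
    bool operator==(const Point&) const = default;  // memberwise comparison
};
#endif
```

Before C++20, the body of reversed_form would be rejected because no operator==(int, X) exists.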
One interesting case for a reason to provide non-member comparison is what I ran into with std::string. The comparisons for that type are currently non-member function templates:
template<class charT, class traits, class Allocator>
constexpr bool
operator==(const basic_string<charT, traits, Allocator>& lhs,
const basic_string<charT, traits, Allocator>& rhs) noexcept;
This has importantly different semantics from making this a member (non-template) function or a hidden friend (non-template) function in that it doesn't allow implicit conversions, by way of being a template. As I pointed out, turning this comparison operator into a non-template would have the effect of suddenly allowing implicit conversions on both sides which can break code that wasn't previously aware of this possibility.
But in any case, if you have a class template and want to avoid conversions on your comparisons, that might be a good reason to stick with a non-member function template for your comparison operator. But that's about it.
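The difference can be checked at compile time with a small detection trait. This is only a sketch: is_eq_comparable, TemplateCmp and FriendCmp are hypothetical names, not anything from the standard library.

```cpp
#include <type_traits>
#include <utility>

// Detects whether `L == R` is a valid expression.
template <class L, class R, class = void>
struct is_eq_comparable : std::false_type {};

template <class L, class R>
struct is_eq_comparable<L, R,
    decltype(void(std::declval<L>() == std::declval<R>()))> : std::true_type {};

// Comparison is a non-member function template: no implicit conversions.
template <class T>
struct TemplateCmp {
    T value;
    TemplateCmp(T v) : value(v) {}  // implicit on purpose
};

template <class T>
bool operator==(const TemplateCmp<T>& a, const TemplateCmp<T>& b) {
    return a.value == b.value;
}

// Comparison is a non-template hidden friend: conversions apply on both sides.
template <class T>
struct FriendCmp {
    T value;
    FriendCmp(T v) : value(v) {}  // implicit on purpose
    friend bool operator==(const FriendCmp& a, const FriendCmp& b) {
        return a.value == b.value;
    }
};

static_assert(is_eq_comparable<TemplateCmp<int>, TemplateCmp<int>>::value,
              "same-type comparison works");
static_assert(!is_eq_comparable<TemplateCmp<int>, int>::value,
              "deduction fails for int, so no conversion is attempted");
static_assert(is_eq_comparable<FriendCmp<int>, int>::value,
              "int converts to FriendCmp<int>");
static_assert(is_eq_comparable<int, FriendCmp<int>>::value,
              "conversion also applies on the left");
```

The function template never considers the converting constructor because template argument deduction fails first, which is the behaviour the std::string operators rely on.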
From a software engineering standpoint, I'd argue that one should always prefer free functions over member functions when possible, and I believe this holds for all functions. Why? It improves encapsulation and frees the function from "knowing" how the class is implemented. Of course, comparison functions often need to access private members, and then it is fine to use a friend or a member function (I'd still prefer a friend). Scott Meyers writes a bit about this in Effective C++, Item 23.
Here is an article by Scott that reiterates this thought.
I've been combing through the internet to find an answer, but I couldn't find any. The only reasons given seem to be relevant for comparing with objects of a different type (e.g. MyClass == int). But the most common use case is comparing a class instance to another instance of the same class, not to any unrelated type.
In other words, I do understand the problems with:
struct A {
bool operator==(int b);
};
But I cannot find any good reason not to use a member function in the most obvious use case:
struct A {
bool operator==(const A&);
};
The most canonical duplicate, What are the basic rules and idioms for operator overloading?, says "overload binary operators as non-member" as a rule of thumb.
Operator overloading : member function vs. non-member function? gives the example mentioned above, where the operator is used with an instance of another class or a primitive type.
CppCoreGuidelines has a vague explanation, "If you use member functions, you need two", which I assume applies to comparing with an object of a different type.
Why should operator< be non-member function? mentions that "non-member functions play better with implicit conversion", but this again seems to be the case of the left-hand operand not being an instance of the class.
On the other hand, a member overload seems to have a couple of upsides:
No need to befriend the function or to provide getters for members
It is always available to class users (although this might be the downside also)
No problems with lookup (which seems to be common in our GoogleTests for some reason)
Is overloading operator== as a non-member function just a convention to keep it consistent with possible overloads in other classes? Or are there other reasons to make it a non-member?
Well, in your question, you did forget to const qualify the member function, and it would be harder to write bool operator==(A&, const A&); by accident.
If you had an implicit constructor, a class with an implicit conversion to A, or a base class with an operator== of higher priority, the member function wouldn't work when the converted operand is on the left, but would when it is on the right. Although most of the time implicit conversions are a bad idea, inheritance could reasonably lead to a problem.
struct A {
A(int); // Implicit constructor
A();
bool operator==(const A&) const;
};
struct B : A {
bool operator==(const B&) const;
};
void test() {
A a;
B b;
// 1 == a; // Doesn't work
a == 1;
// b == a; // Doesn't work; Picks `B::operator==(const B&) const;`
a == b; // Picks `A::operator==(const A&) const`, converting `b` to an `A&`.
// Equality is no longer symmetric as expected
}
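For contrast, here is a sketch of the same two classes with a single non-member operator==. The value member and the constructor defaults are hypothetical additions so that the snippet is self-contained.

```cpp
// Hypothetical variant of the A/B example using one non-member operator==.
struct A {
    int value;
    A(int v = 0) : value(v) {}  // implicit constructor, as in the example above
};

struct B : A {
    B(int v = 0) : A(v) {}
};

// Both operands are ordinary parameters, so conversions apply to either side.
inline bool operator==(const A& lhs, const A& rhs) { return lhs.value == rhs.value; }

inline bool symmetric_demo() {
    A a(1);
    B b(1);
    // All four spellings resolve to the same function.
    return (1 == a) && (a == 1) && (b == a) && (a == b);
}
```

Whether comparing a B as if it were an A is desirable is a separate design question; the sketch only shows that the non-member form treats both sides alike.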
In the future, with the C++20 operator<=>, you will most likely always implement this as a member function (namely as auto operator<=>(const T&) const = default;), so we know that this guideline may change.
The arguments for using non-member operator overloads for symmetric operations are based on style and consistency. They are not very strong arguments [1]. Non-member overloads are typically preferred because a weak argument is still a little better than no argument at all.
Your arguments for a member operator overload don't seem to be any stronger. Consider the following:
No need to befriend the function
On the other hand, if you use a non-member overload, then you have no need to declare a member function. Is befriending the non-member somehow worse?
or to provide getters for members
There is no need for that if you befriend the overload.
It is always available to class users (although this might be the downside also)
It is unclear how this differs from the non-member overloads. Are they also not always available to the class users?
No problems with lookup (which seems to be common in our GoogleTests for some reason)
Are there lookup problems with non-member overloads? Could you demonstrate the problem with an example, and show how the problem is solved by using a member overload instead?
If it does solve the problem, then you can of course use that. Just because some guidelines recommend that you prefer one alternative as a rule of thumb, doesn't mean that is the only alternative to be used in all use cases.
[1] Although, see answer https://stackoverflow.com/a/57927564/2079303, which is arguably stronger than just stylistic.
The STL functors are implemented like this:
template<class T>
struct less{
bool operator()(T const& lhs, T const& rhs){
return lhs < rhs;
}
};
This makes us mention the (possibly long) type every time we create such a functor. Why are they not implemented as shown below? Any reasons?
struct less{
template<class T>
bool operator()(T const& lhs, T const& rhs){
return lhs < rhs;
}
};
That would make them usable without any mentioning of (possibly long) types.
It would also make it impossible to specialize them for user defined types.
They are supposed to be a customization point.
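A minimal sketch of using std::less as a customization point in this way follows. The Employee type and the ordering by id are hypothetical; the specialization is expected to meet the usual strict-weak-ordering requirements.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical user-defined type with no operator<.
struct Employee {
    std::string name;
    int id;
};

namespace std {
// Full specialization for a user-defined type; it must still be a strict weak ordering.
template <>
struct less<Employee> {
    bool operator()(const Employee& lhs, const Employee& rhs) const {
        return lhs.id < rhs.id;
    }
};
}  // namespace std

inline int smallest_id() {
    std::set<Employee> staff;  // uses std::less<Employee> by default
    staff.insert(Employee{"Zoe", 7});
    staff.insert(Employee{"Adam", 42});
    staff.insert(Employee{"Max", 3});
    return staff.begin()->id;
}
```

With a non-template less that has a templated operator(), there would be no class template left to specialize like this.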
To summarize the discussions in the comments:
Although it is technically possible to do like Xeo suggests, the language standard doesn't allow it.
It is very hard to write a working class template if users are allowed to specialize individual functions of the template. In some cases it might however be a good idea to specialize the whole class for a user defined type.
Therefore the C++98 standard writes (17.4.3.1):
It is undefined for a C++ program to add declarations or definitions to namespace std or namespaces within namespace std unless otherwise specified. A program may add template specializations for any standard library template to namespace std.
Since it isn't "otherwise specified" that Xeo's code is allowed, we are to understand that it is not, or that "template specializations" applies only to classes. Perhaps not totally obvious!
The new C++11 standard has had this part expanded, and spells it out in more detail (17.6.4.2):
The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace within namespace std unless otherwise specified. A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.
The behavior of a C++ program is undefined if it declares
- an explicit specialization of any member function of a standard library class template, or
- an explicit specialization of any member function template of a standard library class or class template, or
- an explicit or partial specialization of any member class template of a standard library class or class template.
A program may explicitly instantiate a template defined in the standard library only if the declaration depends on the name of a user-defined type and the instantiation meets the standard library requirements for the original template.
Maybe:
std::multiset<long, std::less<int> > moduloset;
Odd thing to do, but the point is that std::less<int>, std::less<long>, std::less<unsigned int> implement different mathematical functions which produce different results when passed (the result of converting) certain argument expressions. Various algorithms and other standard library components work by specifying a functor, so it makes sense to me that there are different functors to represent those different mathematical functions, not just different overloads of operator() on one functor.
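A tiny sketch of two std::less instantiations giving different answers for the same argument expressions. It uses int versus unsigned int rather than long versus int, because the conversion to unsigned is well defined on every platform.

```cpp
#include <functional>

// The arguments are converted to the functor's parameter type before comparing.
// -1 stays -1 as an int, so it compares less than 1.
inline bool less_as_int() { return std::less<int>()(-1, 1); }

// -1 becomes the largest unsigned value, so it does not compare less than 1.
inline bool less_as_unsigned() { return std::less<unsigned int>()(-1, 1); }
```

A single functor with an overloaded or templated operator() could not represent both of these orderings at once.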
Furthermore, a functor with a template operator() can't be an Adaptable Binary Predicate, since it doesn't have argument types (an argument can have any type). So if std::less were defined as you suggest then it couldn't participate in the stuff in <functional>.
Also on a highly speculative note -- std::less was probably designed before support for template member functions was at all widespread, since there are various notes in the SGI STL documentation that say, "if your implementation doesn't support member templates then this isn't available". For such a simple component there would, I guess, be an incentive to do something that works today. Once it exists, the standardization could then have removed it in favour of something else, but was it worth disrupting existing code? If it was that big a deal, then either you or the standard could introduce a flexible_less functor as you describe.
Finally, why
template<class T>
bool operator()(T const& lhs, T const& rhs){
return lhs < rhs;
}
rather than
template<class T, class U>
bool operator()(T const& lhs, U const& rhs){
return lhs < rhs;
}
For user-defined types, the two might not be the same. Yes, this is an unfair question, since I don't know why there's no two-template-argument version of std::less ;-)
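For illustration, a two-argument version along these lines could be sketched as follows. flexible_less is the hypothetical name used above; std::less<>, which was added in C++14, takes a similar approach with forwarding references.

```cpp
#include <functional>
#include <string>

// Hypothetical `flexible_less` with two independently deduced argument types.
struct flexible_less {
    template <class T, class U>
    bool operator()(const T& lhs, const U& rhs) const {
        return lhs < rhs;
    }
};

inline bool flexible_demo() {
    const std::string s = "apple";
    flexible_less fl;
    std::less<> transparent;  // C++14: shorthand for std::less<void>
    // Neither call converts the string literal to std::string first.
    return fl(s, "banana") && transparent(s, "banana") && !fl("banana", s);
}
```

Because nothing is converted up front, the result depends on which operator< overload exists for the exact pair of types, which is why the two forms may differ for user-defined types.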
In C++0x (n3126), smart pointers can be compared, both relationally and for equality. However, the way this is done seems inconsistent to me.
For example, shared_ptr defines operator< to be equivalent to:
template <typename T, typename U>
bool operator<(const shared_ptr<T>& a, const shared_ptr<U>& b)
{
return std::less<void*>()(a.get(), b.get());
}
Using std::less provides total ordering with respect to pointer values, unlike a vanilla relational pointer comparison, which is unspecified.
However, unique_ptr defines the same operator as:
template <typename T1, typename D1, typename T2, typename D2>
bool operator<(const unique_ptr<T1, D1>& a, const unique_ptr<T2, D2>& b)
{
return a.get() < b.get();
}
It also defines the other relational operators in a similar fashion.
Why the change in method and "completeness"? That is, why does shared_ptr use std::less while unique_ptr uses the built-in operator<? And why doesn't shared_ptr also provide the other relational operators, like unique_ptr?
I can understand the rationale behind either choice:
with respect to method: it represents a pointer so just use the built-in pointer operators, versus it needs to be usable within an associative container so provide total ordering (like a vanilla pointer would get with the default std::less predicate template argument)
with respect to completeness: it represents a pointer so provide all the same comparisons as a pointer, versus it is a class type and only needs to be less-than comparable to be used in an associative container, so only provide that requirement
But I don't see why the choice changes depending on the smart pointer type. What am I missing?
Bonus/related: std::shared_ptr seems to have followed from boost::shared_ptr, and the latter omits the other relational operators "by design" (and so std::shared_ptr does too). Why is this?
This was a defect in drafts of C++11; a defect report was opened to change the std::unique_ptr relational operator overloads to use std::less: see LWG Defect 1297.
This was fixed in time for the final C++11 specification. C++11 §20.7.1.4 [unique.ptr.special]/5 specifies that the operator< overload:
Returns: less<CT>()(x.get(), y.get())
where x and y are the two operands of the operator and CT is the common type of the two pointers (since pointers to different types, e.g. with different cv-qualifications, can be compared).
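The quoted wording can be mimicked with a free function, shown here as a sketch only. less_via_common_type is a hypothetical helper, not the library's implementation, and it obtains CT through std::common_type on the two pointer types.

```cpp
#include <functional>
#include <memory>
#include <type_traits>

// Orders two unique_ptrs through std::less on the common type of their
// pointer types, following the "Returns:" wording quoted above.
template <class T1, class D1, class T2, class D2>
bool less_via_common_type(const std::unique_ptr<T1, D1>& x,
                          const std::unique_ptr<T2, D2>& y) {
    using CT = typename std::common_type<
        typename std::unique_ptr<T1, D1>::pointer,
        typename std::unique_ptr<T2, D2>::pointer>::type;
    return std::less<CT>()(x.get(), y.get());
}

inline bool ordering_demo() {
    std::unique_ptr<int> a(new int(1));
    std::unique_ptr<const int> b(new int(2));  // different cv-qualification
    // The pointers are distinct, so exactly one direction compares less.
    const bool ab = less_via_common_type(a, b);
    const bool ba = less_via_common_type(b, a);
    return ab != ba && ab == (a < b);
}
```

Here CT is const int*, and std::less on pointers gives a total order even for pointers into unrelated objects, which the built-in < does not guarantee.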