Why isn't std::find() using my operator==? - c++

In the following snippet of code, I've overloaded the operator== to compare my pair type with string. But for some reason, the compiler isn't finding my operator as a match for the find function. Why not?
Edit: Thanks for all the suggestions for alternatives, but I'd still like to understand why. The code looks like it should work; I'd like to know why it doesn't.
#include <vector>
#include <utility>
#include <string>
#include <algorithm>
typedef std::pair<std::string, int> RegPair;
typedef std::vector<RegPair> RegPairSeq;
bool operator== (const RegPair& lhs, const std::string& rhs)
{
return lhs.first == rhs;
}
int main()
{
RegPairSeq sequence;
std::string foo("foo");
// stuff that's not important
std::find(sequence.begin(), sequence.end(), foo);
// g++: error: no match for 'operator==' in '__first. __gnu_cxx::__normal_iterator<_Iterator, _Container>::operator* [with _Iterator = std::pair<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>*, _Container = std::vector<std::pair<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int>, std::allocator<std::pair<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > >]() == __val'
// clang++: error: invalid operands to binary expression ('std::pair<std::basic_string<char>, int>' and 'std::basic_string<char> const')
}

The problem is that std::find is a function template and it uses argument-dependent lookup (ADL) to find the right operator== to use.
Both of the arguments are in the std namespace (std::pair<std::string, int> and std::string), so ADL starts by looking in the std namespace. There it finds some operator== (which one, it doesn't matter; there are lots in the Standard Library and if you've included <string>, at least the one that compares two std::basic_string<T> objects could be found).
Because an operator== overload is found in the std namespace, ADL stops searching enclosing scopes. Your overload, which is located in the global namespace, is never found. Name lookup occurs before overload resolution; it doesn't matter during name lookup whether the arguments match.

The cleanest solution is to make a predicate and use find_if:
struct StringFinder
{
StringFinder(const std::string & st) : s(st) { }
const std::string s;
bool operator()(const RegPair& lhs) const { return lhs.first == s; }
};
std::find_if(sequence.begin(), sequence.end(), StringFinder(foo));
If you have C++11 you can use a lambda instead.
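A minimal sketch of the lambda variant, assuming a C++11 compiler. The helper name index_of and the demo function are made up for illustration; only RegPair and RegPairSeq come from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> RegPair;
typedef std::vector<RegPair> RegPairSeq;

// Hypothetical helper: index of the first pair whose .first equals key,
// or sequence.size() when there is no such pair.
std::size_t index_of(const RegPairSeq& sequence, const std::string& key)
{
    RegPairSeq::const_iterator it = std::find_if(
        sequence.begin(), sequence.end(),
        // The lambda plays the role of StringFinder: it captures the key
        // and compares it against the first member of each pair.
        [&key](const RegPair& p) { return p.first == key; });
    return static_cast<std::size_t>(it - sequence.begin());
}

// Small usage example.
void demo()
{
    RegPairSeq sequence;
    sequence.push_back(RegPair("bar", 1));
    sequence.push_back(RegPair("foo", 2));
    assert(index_of(sequence, "foo") == 1);
    assert(index_of(sequence, "missing") == sequence.size());
}
```

Because the comparison is written inside the lambda, no operator== has to be found by lookup from within the standard library at all.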

The accepted answer is, unfortunately, misleading.
Candidates for the operator == used inside the std::find function template are found by both regular unqualified lookup and argument-dependent lookup (ADL).
Regular lookup is performed in accordance with usual rules of unqualified name lookup. It is looked up from the definition of std::find in standard library. Obviously, the above user-provided declaration of operator == is not visible from there.
ADL is a different story. Theoretically ADL can see names defined later, e.g. names visible from the point of std::find invocation inside main. However, ADL does not just see everything. ADL is restricted to searching only inside so-called associated namespaces. These namespaces are brought into consideration by the types of the arguments used in the invocation of operator ==, in accordance with the rules of 6.4.2/2.
In this example the types of both arguments of == belong to namespace std. One template argument of std::pair<> is also from std. The other is the fundamental type int, which has no associated namespace. Therefore std is the only associated namespace in this case. ADL looks in std and only in std. The above user-provided declaration of operator == is not found, since it resides in the global namespace.
It is incorrect to say that ADL stops looking after finding some "other" definitions of operator == inside std. ADL does not work in "inside-out" fashion as other forms of lookup often do. ADL searches in associated namespaces and that's it. Regardless of whether any other forms of operator == were found in std or not, ADL does not attempt to continue its search in global namespace. This is the incorrect/misleading part of the accepted answer.
Here's a more compact example that illustrates the same issue:
namespace N
{
struct S {};
}
template<typename T> void foo(T a)
{
bar(a);
}
void bar(N::S s) {}
int main()
{
N::S a;
foo(a);
}
Ordinary lookup fails since there's no bar declared above foo. Seeing that bar is called with an argument of type N::S, ADL will look for bar in the associated namespace N. There's no bar in N either. The code is ill-formed. Note that the absence of bar in N does not make ADL expand its search into the global namespace and find the global bar.
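For contrast, here is a sketch of the same example with bar moved into namespace N. The value member, the int return types and the demo function are additions made only so the result can be checked; the rest mirrors the example above.

```cpp
#include <cassert>

namespace N
{
    struct S { int value; };
}

// No bar is declared before this template, so ordinary lookup from the
// definition context still finds nothing. Only ADL can supply a candidate.
template<typename T> int foo(T a)
{
    return bar(a);
}

namespace N
{
    // Declared after foo, but inside the namespace associated with S,
    // so ADL finds it when foo<N::S> is instantiated.
    int bar(S s) { return s.value; }
}

void demo()
{
    N::S a = { 42 };
    assert(foo(a) == 42);
}
```

The only change that matters is where bar lives: it is still declared after foo, yet the call now resolves because N is an associated namespace of N::S.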
It is quite easy to inadvertently change the set of associated namespaces used by ADL, which is why such issues often come and go after seemingly innocent and unrelated changes in the code. For example, if we change the declaration of RegPair as follows:
enum E { A, B, C };
typedef std::pair<std::string, E> RegPair;
the error will suddenly disappear. After this change global namespace also becomes associated for ADL, along with std, which is why ADL finds the user-provided declaration of operator ==.
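A more deliberate way to get the same effect, sketched under the assumption that you are free to replace the typedef with a type of your own: put the element type and its operator== into one namespace, so that this namespace is associated with the argument. The names reg, Entry and demo are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

namespace reg   // hypothetical namespace name
{
    // A user-defined type instead of a typedef for std::pair, so that
    // namespace reg (not only std) is associated with it for ADL.
    struct Entry
    {
        std::string name;
        int value;
    };

    // Lives next to Entry, so ADL from inside std::find can see it.
    bool operator==(const Entry& lhs, const std::string& rhs)
    {
        return lhs.name == rhs;
    }
}

void demo()
{
    std::vector<reg::Entry> sequence;
    sequence.push_back(reg::Entry{"bar", 1});
    sequence.push_back(reg::Entry{"foo", 2});

    std::vector<reg::Entry>::iterator it =
        std::find(sequence.begin(), sequence.end(), std::string("foo"));
    assert(it != sequence.end() && it->value == 2);
}
```

Unlike the enum trick, this does not rely on the global namespace happening to become associated; the operator sits in the namespace ADL is guaranteed to search for Entry.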

Another "correct" solution:
struct RegPair : std::pair<std::string, int>
{
bool operator== (const std::string& rhs) const;
};
bool RegPair::operator== (const std::string& rhs) const
{
return first == rhs;
}

Related

Namespace resolution with operator== in the STL

Consider a simple type, in a namespace, with an operator==:
namespace ANamespace {
struct Foo { int i; float f; };
}
#ifdef INSIDE
namespace ANamespace {
bool operator==(const Foo& l, const Foo& r)
{
return l.i == r.i && l.f == r.f;
}
}
#else
bool operator==(const ANamespace::Foo& l, const ANamespace::Foo& r)
{
return l.i == r.i && l.f == r.f;
}
#endif
bool compareElements(const std::vector<ANamespace::Foo>& l, const std::vector<ANamespace::Foo>& r)
{
return l == r;
}
If operator== is defined inside ANamespace (by defining INSIDE), the example compiles. But if operator== is defined in the global namespace (the #else case), the function compareElements() doesn't compile - both in GCC and Clang, and with both libstdc++ and libc++. All emit a template error along the lines of:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/9.2.0/../../../../include/c++/9.2.0/vector:60:
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/9.2.0/../../../../include/c++/9.2.0/bits/stl_algobase.h:820:22: error: invalid operands to binary expression ('const ANamespace::Foo' and 'const ANamespace::Foo')
if (!(*__first1 == *__first2))
~~~~~~~~~ ^ ~~~~~~~~~
...
However, directly comparing two Foos in a function, e.g.,
bool compareDirectly(const ANamespace::Foo& l, const ANamespace::Foo& r)
{
return l == r;
}
seems to work fine regardless of where operator== is defined.
Are there rules in the standard about where the STL expects operator== to be defined?
The expression !(*__first1 == *__first2) takes place in std::operator==, a function template, so it is considered a dependent unqualified function call expression. During overload resolution, the only candidates are therefore functions found within the definition context of std::operator== and functions found via ADL.
Clearly there is no operator==(const Foo&, const Foo&) declared within the context of the definition of the standard comparison operator. Argument-dependent lookup (ADL), on the other hand, searches the namespaces associated with each argument for a viable function for the call, which is why defining operator== inside ANamespace works.
In short, declaring operator== in the same namespace in which your class is declared guarantees that argument-dependent lookup will find it, so that's what you should do. The standard does not mandate that you follow this convention, but in practice it is the only way to obtain the guarantee. This also applies to other operators that the standard library might invoke on your types.
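As a rough illustration of that advice, the sketch below declares both operator== and an operator< next to Foo and lets std::vector's comparison and std::sort pick them up. The operator< and its ordering by i are assumptions added here to show the "other operators" case; they are not part of the question.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace ANamespace
{
    struct Foo { int i; float f; };

    // Declared next to Foo, so ADL finds both operators from inside
    // the standard library templates that use them.
    bool operator==(const Foo& l, const Foo& r)
    {
        return l.i == r.i && l.f == r.f;
    }

    bool operator<(const Foo& l, const Foo& r)   // hypothetical ordering
    {
        return l.i < r.i;
    }
}

void demo()
{
    std::vector<ANamespace::Foo> a;
    a.push_back(ANamespace::Foo{2, 2.0f});
    a.push_back(ANamespace::Foo{1, 1.0f});

    std::vector<ANamespace::Foo> b = a;
    assert(a == b);                 // vector comparison uses Foo's ==

    std::sort(a.begin(), a.end());  // std::sort uses Foo's <
    assert(a.front().i == 1);
    assert(!(a == b));              // order now differs from b
}
```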
If you choose to declare operator== in the global namespace but your type is not declared in the global namespace, there is a chance that the standard library algorithm will still be able to find your operator== through unqualified name lookup. However, there's no guarantee that this works, since unqualified name lookup will stop at the innermost enclosing namespace in which operator== is found. In other words, in an algorithm of the form
namespace std {
template< class InputIt1, class InputIt2 >
constexpr bool equal( InputIt1 first1, InputIt1 last1,
InputIt2 first2 ) {
// ...
}
}
the unqualified name lookup of operator== will find any operator==s declared in the std namespace (which of course, will not be applicable to your user-defined type) and then, if it found anything in std, even though it may not be a viable overload, will not look in the global namespace.
You need to read up on "ADL" aka "Argument Dependent Lookup".
Basically, when you write v1 == v2, the compiler looks for an operator== taking two arguments of the correct type ANamespace::Foo in the current namespace. (Note: We're ignoring conversions here). If it can't find one, then it will look in the namespace that the type is defined in (ANamespace).
Wikipedia has an article about this.

Name hiding by using declaration

#include <iostream>
struct H
{
void swap(H &rhs);
};
void swap(H &, H &)
{
std::cout << "swap(H &t1, H &t2)" << std::endl;
}
void H::swap(H &rhs)
{
using std::swap;
swap(*this, rhs);
}
int main(void)
{
H a;
H b;
a.swap(b);
}
And this is the result:
swap(H &t1, H &t2)
In the code above, I try to define a swap function for H. In the function void H::swap(H &rhs), I use a using declaration to make the name std::swap visible. Without the using declaration, the code does not compile because there is no usable swap function with two parameters in class H.
I have a question here. In my opinion, the using declaration (using std::swap) just makes std::swap, the function template in the STL, visible. So I thought that the STL's swap would be invoked in H::swap(). But the result shows that void swap(H &t1, H &t2) was invoked instead.
So here is my question:
Why can't I invoke swap without a using declaration? (I guess it is because there is no swap function with two parameters in the class, but I am not sure.)
Why will the swap of my definition be invoked instead of the STL swap in the H::swap?
Why can't I invoke swap without a using declaration?
We start in the nearest enclosing scope and work our way outwards until we find something. With this:
void H::swap(H &rhs)
{
swap(*this, rhs);
}
Unqualified swap finds H::swap(). Then we do argument-dependent lookup. But the rule there is, from [basic.lookup.argdep]:
Let X be the lookup set produced by unqualified lookup (3.4.1) and let Y be the lookup set produced by
argument dependent lookup (defined as follows). If X contains
— a declaration of a class member, or
— a block-scope function declaration that is not a using-declaration, or
— a declaration that is neither a function or a function template
then Y is empty. Otherwise Y is the set of declarations found in the namespaces associated with the argument types as described below. [...]
Since the unqualified lookup set finds a class member, the argument-dependent lookup set is empty (that is, it doesn't find swap(H&, H&)).
Why will the swap of my definition be invoked instead of the STL swap in the H::swap?
When you add:
void H::swap(H &rhs)
{
using std::swap;
swap(*this, rhs);
}
now unqualified swap finds std::swap() and not H::swap(), since the former is declared in a more inner scope. using std::swap; does not match any of the criteria in the above-stated rule that would lead to Y being empty (it's not a class member, it is a using-declaration, and it is a function template). As a result, the argument-dependent lookup set does include declarations found in associated namespaces - which includes swap(H&, H&) (since H is in the global namespace). We end up with two overload candidates - and yours is preferred since it's the non-template.
See Xeo's answer on the preferred way to add swap to your class. Basically, you want to write:
struct H {
friend void swap(H&, H&) { ... }
};
This will be found by ADL (and only by ADL). And then whenever anybody calls swap correctly:
using std::swap;
swap(a, b);
Lookup will find yours where appropriate.
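Putting the pieces together, a minimal sketch of the friend approach might look like this. The value member, the swap_calls counter, the generic swap_values caller and demo are all hypothetical and exist only to make the behaviour observable.

```cpp
#include <cassert>
#include <utility>

struct H
{
    int value;

    // Call counter, present only so the demo can observe which swap ran.
    static int& swap_calls()
    {
        static int count = 0;
        return count;
    }

    // Friend defined in the class: visible to ADL only.
    friend void swap(H& a, H& b)
    {
        const int tmp = a.value;
        a.value = b.value;
        b.value = tmp;
        ++H::swap_calls();
    }
};

// Hypothetical generic caller using the two-step idiom.
template<typename T>
void swap_values(T& a, T& b)
{
    using std::swap;   // fallback for types without their own swap
    swap(a, b);        // ADL adds H's friend; the non-template is preferred
}

void demo()
{
    H a = { 1 };
    H b = { 2 };
    const int before = H::swap_calls();
    swap_values(a, b);
    assert(a.value == 2 && b.value == 1);
    assert(H::swap_calls() == before + 1);   // the friend ran, not std::swap

    int x = 1, y = 2;
    swap_values(x, y);                       // no custom swap: std::swap is used
    assert(x == 2 && y == 1);
}
```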

Can't use function name distance

The following code compiles fine:
#include <string>
int dist(std::string& a, std::string& b) {
return 0;
}
int main() {
std::string a, b;
dist(a, b);
return 0;
}
But when I rename the function from dist to distance:
#include <string>
int distance(std::string& a, std::string& b) {
return 0;
}
int main() {
std::string a, b;
distance(a, b);
return 0;
}
I get this error when compiling (gcc 4.2.1):
/usr/include/c++/4.2.1/bits/stl_iterator_base_types.h: In instantiation of ‘std::iterator_traits<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >’:
b.cpp:9: instantiated from here
/usr/include/c++/4.2.1/bits/stl_iterator_base_types.h:129: error: no type named ‘iterator_category’ in ‘struct std::basic_string<char, std::char_traits<char>, std::allocator<char> >’
Why can't I name the function distance?
The reason is that a standard algorithm called std::distance exists, which is found by ADL (Argument Dependent Lookup): although your call is not qualified with the std namespace, the type of your arguments a and b (i.e. std::string) lives in the same namespace as the std::distance function (i.e. std), and therefore std::distance() is also considered for overload resolution.
If you really want to call your function distance() (I'd suggest you don't), you can either put it in a namespace of your own and fully qualify the function name when you call it, or leave it in the global namespace and invoke it this way:
::distance(a, b);
// ^^
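The first option (a namespace of your own plus a qualified call) could be sketched as follows. The namespace name text and the toy length-difference metric are assumptions made for illustration only.

```cpp
#include <cassert>
#include <string>

namespace text   // hypothetical namespace name
{
    // Toy metric for illustration: absolute difference in length.
    int distance(const std::string& a, const std::string& b)
    {
        return a.size() > b.size()
            ? static_cast<int>(a.size() - b.size())
            : static_cast<int>(b.size() - a.size());
    }
}

void demo()
{
    std::string a("kitten"), b("sit");
    // Qualified call: ADL is not performed, so std::distance is never
    // considered, whatever the library's iterator_traits looks like.
    assert(text::distance(a, b) == 3);
}
```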
Notice, however, that ADL alone might not cause your program to fail to compile if your implementation of the Standard Library provides a SFINAE-friendly version of iterator_traits (more details in this Q&A on StackOverflow - courtesy of MooingDuck).
With a SFINAE-friendly implementation of iterator_traits, your compiler should recognize that the std::distance() function template (because it is a template) cannot be instantiated when given arguments of type std::string, because of its return type:
template< class InputIt >
typename std::iterator_traits<InputIt>::difference_type
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// Trying to instantiate this with InputIt = std::string
// may result in a soft error during type deduction if
// your implementation is SFINAE-friendly, and in a hard
// error otherwise.
distance( InputIt first, InputIt last );
In this case, the compiler would simply discard this template for the purposes of overload resolution and pick your distance() function.
However, if your implementation of the Standard Library does not provide a SFINAE-friendly version of iterator_traits, substitution failure may occur in a context that does not qualify for SFINAE, thus resulting in a (hard) compilation error.
This live example shows your original program compiling with GCC 4.8.0, which comes with a version of libstdc++ that implements a SFINAE-friendly iterator_traits.

Ambiguous call to templated function due to ADL

I've been bitten by this problem a couple of times and so have my colleagues. When compiling
#include <deque>
#include <boost/algorithm/string/find.hpp>
#include <boost/operators.hpp>
template< class Rng, class T >
typename boost::range_iterator<Rng>::type find( Rng& rng, T const& t ) {
return std::find( boost::begin(rng), boost::end(rng), t );
}
struct STest {
bool operator==(STest const& test) const { return true; }
};
struct STest2 : boost::equality_comparable<STest2> {
bool operator==(STest2 const& test) const { return true; }
};
int main() {
std::deque<STest> deq;
find( deq, STest() ); // works
find( deq, STest2() ); // C2668: 'find' : ambiguous call to overloaded function
}
...the VS9 compiler fails when compiling the second find. This is due to the fact that STest2 inherits from a type that is defined in boost namespace which triggers the compiler to try ADL which finds boost::algorithm::find(RangeT& Input, const FinderT& Finder).
An obvious solution is to prefix the call to find(…) with "::" but why is this necessary? There is a perfectly valid match in the global namespace, so why invoke Argument-Dependent Lookup? Can anybody explain the rationale here?
ADL isn't a fallback mechanism to use when "normal" overload resolution fails; functions found by ADL are just as viable as functions found by normal lookup.
If ADL were a fallback solution, you could easily fall into the trap where a function is used even though another function is a better match but only visible via ADL. This would seem especially strange in the case of (for example) operator overloads. You wouldn't want two objects to be compared via an operator== for types that they could be implicitly converted to when there exists a perfectly good operator== in the appropriate namespace.
I'll add the obvious answer myself because I just did some research on this problem:
C++03 3.4.2
§2 For each argument type T in the function call, there is a set of zero or more associated namespaces [...] The sets of namespaces and classes are determined in the following way:
[...]
— If T is a class type (including unions), its associated classes are: the class itself; the class of which it is a
member, if any; and its direct and indirect base classes. Its associated namespaces are the namespaces
in which its associated classes are defined.
§ 2a If the ordinary unqualified lookup of the name finds the declaration of a class member function, the associated
namespaces and classes are not considered. Otherwise the set of declarations found by the lookup of
the function name is the union of the set of declarations found using ordinary unqualified lookup and the set
of declarations found in the namespaces and classes associated with the argument types.
At least it's standard conformant, but I still don't understand the rationale here.
Consider a mystream which inherits from std::ostream. You would like that your type would support all the << operators that are defined for std::ostream normally in the std namespace. So base classes are associated classes for ADL.
I think this also follows from the substitution principle - and functions in a class' namespace are considered part of its interface (see Herb Sutter's "What's in a class?"). So an interface that works on the base class should remain working on a derived class.
You can also work around this by disabling ADL:
(find)( deq, STest2() );
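To see what the parentheses change, here is a small sketch with made-up names (lib, Tag, locate, demo). It differs from the question in that the ADL candidate is a better match rather than an ambiguous one, which makes the effect easy to check.

```cpp
#include <cassert>

// Global candidate, found by ordinary unqualified lookup.
template<typename T>
int locate(const T&) { return 1; }

namespace lib   // hypothetical library namespace
{
    struct Tag {};

    // From outside lib, this is reachable only through ADL.
    int locate(const Tag&) { return 2; }
}

void demo()
{
    lib::Tag t;
    assert(locate(t) == 2);     // ADL adds lib::locate; the non-template wins
    assert((locate)(t) == 1);   // parenthesized name: ADL is suppressed
    assert(::locate(t) == 1);   // qualified name: ADL is suppressed too
}
```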
I think you stated the problem yourself:
in the global namespace
Functions in the global namespace are considered last. It's the outermost scope by definition. Any function with the same name (not necessarily applicable) that is found in a closer scope (from the call point of view) will be picked up first.
template <typename Rng, typename T>
typename Rng::iterator find( Rng& rng, T const& t );
namespace foo
{
bool find(std::vector<int> const& v, int);
void method()
{
std::deque<std::string> deque;
auto it = find(deque, "bar");
}
}
Here (unless vector or deque include algorithm, which is allowed), the only method that will be picked up during name look-up will be:
bool foo::find(std::vector<int> const&, int);
If algorithm is somehow included, there will also be:
template <typename FwdIt>
FwdIt std::find(FwdIt begin, FwdIt end,
typename std::iterator_traits<FwdIt>::value_type const& value);
And of course, overload resolution will fail stating that there is no match.
Note that name lookup is extremely dumb: neither arity nor argument types are considered!
Therefore, there are only two kinds of free functions that you should use in C++:
Those which are part of the interface of a class, declared in the same namespace, picked up by ADL
Those which are not, and which you should explicitly qualify to avoid issues of this type
If you fall out of these rules, it might work, or not, depending on what's included, and that's very awkward.
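The second kind can be sketched like this: a global helper that is hidden by a same-named function in a closer scope, and a qualified call that reaches it anyway. All names here (count_equal, foo::method, demo) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Global helper that is not part of any class interface.
template<typename Rng, typename T>
std::size_t count_equal(const Rng& rng, const T& value)
{
    return static_cast<std::size_t>(
        std::count(rng.begin(), rng.end(), value));
}

namespace foo
{
    // Same name in a closer scope: hides the global template for
    // unqualified calls made inside foo.
    bool count_equal(const std::vector<int>& v, int x)
    {
        return std::find(v.begin(), v.end(), x) != v.end();
    }

    std::size_t method()
    {
        std::vector<std::string> words;
        words.push_back("bar");
        words.push_back("baz");
        words.push_back("bar");
        // An unqualified count_equal(words, ...) would see only
        // foo::count_equal here and fail to compile.
        return ::count_equal(words, std::string("bar"));
    }
}

void demo()
{
    assert(foo::method() == 2);
}
```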