The following code is causing a little headache for us: clang and MSVC accepts the following code, while GCC rejects it. We believe GCC is right this time, but I wanted to make it sure before filing the bugreports. So, are there any special rules for operator[] lookup that I'm unaware of?
struct X{};
struct Y{};
template<typename T>
struct B
{
void f(X) { }
void operator[](X){}
};
template<typename T>
struct C
{
void f(Y) { }
void operator[](Y){}
};
template<typename T> struct D : B<T>, C<T> {};
int main()
{
D<float> d;
//d.f(X()); //This is erroneous in all compilers
d[Y()];//this is accepted by clang and MSVC
}
So is the above code is correct in resolving the operator[] call in the main function?
It's not 100% clear in which compiler the issue lie. The standard goes over a lot of rules for name lookup (which is what this is an issue with), but more specifically section 13.5.5 covers the operator[] overload:
13.5.5 Subscripting [over.sub]
1 - operator[] shall be a non-static member function with exactly one parameter. It implements the subscripting syntax
postfix-expression [ expr-or-braced-init-list ]
Thus, a subscripting expression x[y] is interpreted as x.operator[](y) for a class object x of type T if T::operator[](T1) exists and if the operator is selected as the best match function by the overload resolution mechanism (13.3.3).
Looking at the standard on Overloading (chapter 13):
13 Overloading [over]
1 - When two or more different declarations are specified for a single name in the same scope, that name is said to be overloaded. By extension, two declarations in the same scope that declare the same name but with different types are called overloaded declarations. Only function and function template declarations can be overloaded; variable and type declarations cannot be overloaded.
2 - When an overloaded function name is used in a call, which overloaded function declaration is being referenced is determined by comparing the types of the arguments at the point of use with the types of the parameters in the overloaded declarations that are visible at the point of use. This function selection process is called overload resolution and is defined in 13.3.
...
13.2 Declaration matching [over.dcl]
1 - Two function declarations of the same name refer to the same function if they are in the same scope and have equivalent parameter declarations (13.1). A function member of a derived class is not in the same scope as a function member of the same name in a base class.
So according to this and section 10.2 on derived classes, since you've declared struct D : B, C, both B and C have member functions for operator[] but different types, thus the operator[] function is overloaded within the scope of D (since there's no using nor is operator[] overridden or hidden directly in D).
Based on this, MSVC and Clang are incorrect in their implementations since d[Y()] should be evaluated to d.operator[](Y()), which would produce an ambiguous name resolution; so the question is why do they accept the syntax of d[Y()] at all?
The only other areas I could see with regards to the subscript ([]) syntax make reference to section 5.2.1 (which states what a subscript expression is) and 13.5.5 (stated above), which means that those compilers are using other rules to further compile the d[Y()] expression.
If we look at name lookup, we see that 3.4.1 Unqualified name lookup paragraph 3 states that
The lookup for an unqualified name used as the postfix-expression of a function call is described in 3.4.2.
Where 3.4.2 states:
3.4.2 Argument-dependent name lookup [basic.lookup.argdep]
1 - When the postfix-expression in a function call (5.2.2) is an unqualified-id, other namespaces not considered during the usual unqualified lookup (3.4.1) may be searched, and in those namespaces, namespace-scope friend function or function template declarations (11.3) not otherwise visible may be found.
2 - For each argument type T in the function call, there is a set of zero or more associated namespaces and a set of zero or more associated classes to be considered. The sets of namespaces and classes is determined entirely by the types of the function arguments (and the namespace of any template template argument). Typedef names and using-declarations used to specify the types do not contribute to this set. The sets of namespaces and classes are determined in the following way:
...
(2.2) - If T is a class type (including unions), its associated classes are: the class itself; the class of which it is a member, if any; and its direct and indirect base classes. Its associated namespaces are the innermost enclosing namespaces of its associated classes. Furthermore, if T is a class template specialization, its associated namespaces and classes also include: the namespaces and classes associated with the types of the template arguments provided for template type parameters (excluding template template parameters); the namespaces of which any template template arguments are members; and the classes of which any member templates used as template template arguments are members. [ Note: Non-type template arguments do not contribute to the set of associated namespaces.—end note ]
Note the emphasis on may.
With the above points and a couple of others from 3.4 (name lookup), one could believe that Clang and MSVC are using these rules to find d[] first (and thus finding it as C::operator[]) vs. using 13.5.5 to turn d[] into d.operator[] and continuing compilation.
It should be noted that bringing the operators of the base classes into scope of the D class or using explicit scope does, however, 'fix' this issue across all three compilers (as is expected based on the using declaration clauses in the references), example:
struct X{};
struct Y{};
template<typename T>
struct B
{
void f(X) { }
void operator[](X) {}
};
template<typename T>
struct C
{
void f(Y) { }
void operator[](Y) {}
};
template<typename T>
struct D : B<T>, C<T>
{
using B<T>::operator[];
using C<T>::operator[];
};
int main()
{
D<float> d;
d.B<float>::operator[](X()); // OK
//d.B<float>::operator[](Y()); // Error
//d.C<float>::operator[](X()); // Error
d.C<float>::operator[](Y()); // OK
d[Y()]; // calls C<T>::operator[](Y)
return 0;
}
Since the standard is ultimately left to the interpretation of the implementer, I'm not sure which compiler would be technically correct in this instance since MSVC and Clang might be using other rules to compile this though, given the subscripting paragraphs from the standard, I'm inclined to say they are not strictly adhering to the standard as much as GCC is in this instance.
I hope this can add some insight into the problem.
I believe that Clang and MSVC are incorrect, and GCC is correct to reject this code. This is an example of the principle that names in different scopes do not overload with each other. I submitted this to Clang as llvm bug 26850, we'll see if they agree.
There is nothing special about operator[] vs f(). From [over.sub]:
operator[] shall be a non-static member function with exactly one parameter. [...] Thus, a subscripting expression x[y] is interpreted as x.operator[](y) for a class object x of type T
if T::operator[](T1) exists and if the operator is selected as the best match function by the overload
resolution mechanism
So the rules governing the lookup of d[Y()] are the same as the rules governing d.f(X()). All the compilers were correct to reject the latter, and should have also rejected the former. Moreover, both Clang and MSVC reject
d.operator[](Y());
where both they accept:
d[Y()];
despite the two having identical meaning. There is no non-member operator[], and this is not a function call so there is no argument-dependent lookup either.
What follows is an explanation of why the call should be viewed as ambiguous, despite one of the two inherited member functions seeming like it's a better match.
The rules for member name lookup are defined in [class.member.lookup]. This is already a little difficult to parse, plus it refers to C as the object we're looking up in (which in OP is named D, whereas C is a subobject). We have this notion of lookup set:
The lookup set for f in C, called S(f,C), consists of two component sets: the declaration set, a set of
members named f; and the subobject set, a set of subobjects where declarations of these members (possibly
including using-declarations) were found. In the declaration set, using-declarations are replaced by the set
of designated members that are not hidden or overridden by members of the derived class (7.3.3), and type
declarations (including injected-class-names) are replaced by the types they designate.
The declaration set for operator[] in D<float> is empty: there is neither an explicit declaration nor a using-declaration.
Otherwise (i.e., C does not contain a declaration of f or the resulting declaration set is empty), S(f,C) is
initially empty. If C has base classes, calculate the lookup set for f in each direct base class subobject Bi,
and merge each such lookup set S(f,Bi) in turn into S(f,C).
So we look into B<float> and C<float>.
The following steps define the result of merging lookup set S(f,Bi)
into the intermediate S(f,C):
— If each of the subobject members of S(f,Bi) is a base class subobject of at least one of the subobject
members of S(f,C), or if S(f,Bi) is empty, S(f,C) is unchanged and the merge is complete. Conversely,
if each of the subobject members of S(f,C) is a base class subobject of at least one of the
subobject members of S(f,Bi), or if S(f,C) is empty, the new S(f,C) is a copy of S(f,Bi).
— Otherwise, if the declaration sets of S(f,Bi) and S(f,C) differ, the merge is ambiguous: the new
S(f,C) is a lookup set with an invalid declaration set and the union of the subobject sets. In subsequent
merges, an invalid declaration set is considered different from any other.
— Otherwise, the new S(f,C) is a lookup set with the shared set of declarations and the union of the
subobject sets.
The result of name lookup for f in C is the declaration set of S(f,C). If it is an invalid set, the program is
ill-formed. [ Example:
struct A { int x; }; // S(x,A) = { { A::x }, { A } }
struct B { float x; }; // S(x,B) = { { B::x }, { B } }
struct C: public A, public B { }; // S(x,C) = { invalid, { A in C, B in C } }
struct D: public virtual C { }; // S(x,D) = S(x,C)
struct E: public virtual C { char x; }; // S(x,E) = { { E::x }, { E } }
struct F: public D, public E { }; // S(x,F) = S(x,E)
int main() {
F f;
f.x = 0; // OK, lookup finds E::x
}
S(x, F) is unambiguous because the A and B base subobjects of D are also base subobjects of E, so S(x,D)
is discarded in the first merge step. —end example ]
So here's what happens. First, we try to merge the empty declaration set of operator[] in D<float> with that of B<float>. This gives us the set {operator[](X)}.
Next, we merge that with the declaration set of operator[] in C<float>. This latter declaration set is {operator[](Y)}. These merge sets differ, so the merge is ambiguous. Note that overload resolution is not considered here. We are simply looking up the name.
The fix, by the way, is to add using-declarations to D<T> such that there is no merge step done:
template<typename T> struct D : B<T>, C<T> {
using B<T>::operator[];
using C<T>::operator[];
};
Related
#include <iostream>
template<class T>
struct A{
int c = T::a;
};
struct B{
A<B> cc;
static const int a = 0;
};
int main(){
}
GCC and Clang both accept the above example. Regarding how the lookup rule applies to a name in a class template specialization, which might be a member of a class that causes the implicit instantiation of the class template specialization is not clear in c++20 or before. Fortunately, it's clear in the current draft.
[basic.lookup#class.member.lookup-3]
The declaration set is the result of a single search in the scope of C for N from immediately after the class-specifier of C if P is in a complete-class context of C or from P otherwise. If the resulting declaration set is not empty, the subobject set contains C itself, and calculation is complete.
It seems that the rule can be used to interpret most cases that I said in the above, however, consider the above example, it's a bit special.
The definition of non-static member cc would cause the implicit instantiation of specialization A<B>, at this point, it's not the complete context of class B. Hence, the subsequent lookup for a name in the scope B should be from this point. If T::a is used as the type-specifier of a non-static member of class template A, It's 100% sure that the compiler would report an error that says no identifier a can be found in the scope of B.
However, except for the following rule
temp.res#general-1
If the name is dependent (as specified in [temp.dep]), it is looked up for each specialization (after substitution) because the lookup depends on a template parameter.
I cannot find any wording in the standard that states whether an implicit instantiation of A can cause the lookup rule to be applied to the name within the default member initializer or is not at that point.
At first glance of the above rule, It should mean that the lookup should be performed for the name when the enclosing class template specialization is instantiating, then a cannot be found in the scope of B since it's not deemed as a complete class at that point and only these declarations that are reachable at point P can be found.
Obviously, the behavior of GCC and Clang consider T is a complete type such that a can be found in T. In other words, it seems that these compilers do not immediately perform the lookup for T::a when A is instantiating. If I miss a certain special rule, please point them out.
First, consider this C++ code:
#include <stdio.h>
struct foo_int {
void print(int x) {
printf("int %d\n", x);
}
};
struct foo_str {
void print(const char* x) {
printf("str %s\n", x);
}
};
struct foo : foo_int, foo_str {
//using foo_int::print;
//using foo_str::print;
};
int main() {
foo f;
f.print(123);
f.print("abc");
}
As expected according to the Standard, this fails to compile, because print is considered separately in each base class for the purpose of overload resolution, and thus the calls are ambiguous. This is the case on Clang (4.0), gcc (6.3) and MSVC (17.0) - see godbolt results here.
Now consider the following snippet, the sole difference of which is that we use operator() instead of print:
#include <stdio.h>
struct foo_int {
void operator() (int x) {
printf("int %d\n", x);
}
};
struct foo_str {
void operator() (const char* x) {
printf("str %s\n", x);
}
};
struct foo : foo_int, foo_str {
//using foo_int::operator();
//using foo_str::operator();
};
int main() {
foo f;
f(123);
f("abc");
}
I would expect the results to be identical to the previous case, but it is not the case - while gcc still complains, Clang and MSVC can compile this fine!
Question #1: who is correct in this case? I expect it to be gcc, but the fact that two other unrelated compilers give a consistently different result here makes me wonder whether I'm missing something in the Standard, and things are different for operators when they're not invoked using function syntax.
Also note that if you only uncomment one of the using declarations, but not the other, then all three compilers will fail to compile, because they will only consider the function brought in by using during overload resolution, and thus one of the calls will fail due to type mismatch. Remember this; we'll get back to it later.
Now consider the following code:
#include <stdio.h>
auto print_int = [](int x) {
printf("int %d\n", x);
};
typedef decltype(print_int) foo_int;
auto print_str = [](const char* x) {
printf("str %s\n", x);
};
typedef decltype(print_str) foo_str;
struct foo : foo_int, foo_str {
//using foo_int::operator();
//using foo_str::operator();
foo(): foo_int(print_int), foo_str(print_str) {}
};
int main() {
foo f;
f(123);
f("foo");
}
Again, same as before, except now we don't define operator() explicitly, but instead get it from a lambda type. Again, you'd expect the results to be consistent with the previous snippet; and this is true for the case where both using declarations are commented out, or if both are uncommented. But if you only comment out one but not the other, things are suddenly different again: now only MSVC complains as I would expect it to, while Clang and gcc both think it's fine - and use both inherited members for overload resolution, despite only one being brought in by using!
Question #2: who is correct in this case? Again, I'd expect it to be MSVC, but then why do both Clang and gcc disagree? And, more importantly, why this is different from the previous snippet? I would expect the lambda type to behave exactly the same as a manually defined type with overloaded operator()...
Barry got #1 right. Your #2 hit a corner case: captureless nongeneric lambdas have an implicit conversion to function pointer, which got used in the mismatch case. That is, given
struct foo : foo_int, foo_str {
using foo_int::operator();
//using foo_str::operator();
foo(): foo_int(print_int), foo_str(print_str) {}
} f;
using fptr_str = void(*)(const char*);
f("hello") is equivalent to f.operator fptr_str()("hello"), converting the foo to an pointer-to-function and calling that. If you compile at -O0 you can actually see the call to the conversion function in the assembly before it gets optimized away. Put an init-capture in print_str, and you'll see an error since the implicit conversion goes away.
For more, see [over.call.object].
The rule for name lookup in base classes of a class C only happens if C itself doesn't directly contain the name is [class.member.lookup]/6:
The following steps define the result of merging lookup set S(f,Bi)
into the intermediate S(f,C):
If each of the subobject members of S(f,Bi) is a base class subobject of at least one of the subobject members of S(f,C), or if S(f,Bi) is empty, S(f,C) is unchanged and the merge is complete. Conversely, if each of the subobject members of S(f,C) is a base class subobject of at least one of the subobject members of S(f,Bi), or if S(f,C) is empty, the new S(f,C) is a copy of S(f,Bi).
Otherwise, if the declaration sets of S(f,Bi) and S(f,C) differ, the merge is ambiguous: the new S(f,C) is a lookup set with an invalid declaration set and the union of the subobject sets. In subsequent merges, an invalid declaration set is considered different from any other.
Otherwise, the new S(f,C) is a lookup set with the shared set of declarations and the union of the subobject sets.
If we have two base classes, that each declare the same name, that the derived class does not bring in with a using-declaration, lookup of that name in the derived class would run afoul of that second bullet point and the lookup should fail. All of your examples are basically the same in this regard.
Question #1: who is correct in this case?
gcc is correct. The only difference between print and operator() is the name that we're looking up.
Question #2: who is correct in this case?
This is the same question as #1 - except we have lambdas (which give you unnamed class types with overload operator()) instead of explicit class types. The code should be ill-formed for the same reason. At least for gcc, this is bug 58820.
Your analysis of the first code is incorrect. There is no overload resolution.
The name lookup process occurs wholly before overload resolution. Name lookup determines which scope an id-expression is resolved to.
If a unique scope is found via the name look-up rules, then overload resolution begins: all instances of that name within that scope form the overload set.
But in your code, name lookup fails. The name is not declared in foo, so base classes are searched. If the name is found in more than one immediate base class then the program is ill-formed and the error message describes it as an ambiguous name.
The name lookup rules do not have special cases for overloaded operators. You should find that the code:
f.operator()(123);
fails for the same reason as f.print failed. However, there is another issue in your second code. f(123) is NOT defined as always meaning f.operator()(123);. In fact the definition in C++14 is in [over.call]:
operator() shall be a non-static member function with an arbitrary number of parameters. It can have default arguments. It implements the function call syntax
postfix-expression ( expression-list opt )
where the postfix-expression evaluates to a class object and the possibly empty expression-list matches the parameter list of an operator() member function of the class. Thus, a call x(arg1,...) is interpreted as x.operator()(arg1, ...) for a class object x of type T if T::operator()(T1, T2, T3) exists and if the operator is selected as the best match function by the overload resolution mechanism (13.3.3).
This actually seems an imprecise specification to me so I can understand different compilers coming out with different results. What is T1,T2,T3? Does it mean the types of the arguments? (I suspect not). What are T1,T2,T3 when multiple operator() function exist, only taking one argument?
And what is meant by "if T::operator() exists" anyway? It could perhaps mean any of the following:
operator() is declared in T.
Unqualified lookup of operator() in the scope of T succeeds and performing overload resolution on that lookup set with the given arguments succeeds.
Qualified lookup of T::operator() in the calling context succeeds and performing overload resolution on that lookup set with the given arguments succeeds.
Something else?
To proceed from here (for me anyway) I would like to understand why the standard didn't simply say that f(123) means f.operator()(123);, the former being ill-formed if and only if the latter is ill-formed. The motivation behind the actual wording might reveal the intent and therefore which compiler's behaviour matches the intent.
Consider the following code:
struct A {
int propose();
};
struct A1 : A {
int propose(int);
using A::propose;
};
struct B1 : A1 {
protected:
using A1::propose;
public:
using A::propose;
};
int main() {
B1().propose();
}
Let's compile this: g++ -std=c++11 main.cpp.
I'm getting the following compiler error using GNU 4.8.1:
main.cpp: In function 'int main()':
main.cpp:2:9: error: 'int A::propose()' is inaccessible
int propose();
^
main.cpp:18:18: error: within this context
B1().propose();
However, this code compiles in AppleClang 6.0.0.6000056.
I understand that there is no need for the using in B1, (in my code was necessary, but I had 1 using too much by mistake). In any case, why Clang compiles it? Is this expected?
In [namespace.udecl], we have:
When a using-declaration brings names from a base class into a derived class scope, member functions and
member function templates in the derived class override and/or hide member functions and member function
templates with the same name, parameter-type-list (8.3.5), cv-qualification, and ref-qualifier (if any) in a
base class (rather than conflicting).
The standard explicitly says that names brought in will not conflict with names in a base class. But it doesn't say anything about bringing in conflicting names.
The section also says:
A using-declaration is a declaration and can therefore be used repeatedly where (and only where) multiple
declarations are allowed. [ Example:
struct B {
int i;
};
struct X : B {
using B::i;
using B::i; // error: double member declaration
};
—end example ]
And interestingly, in the following example it's GCC that happily compiles it (and prints A) while Clang allows the construction of a C but rejects the call to foo as ambiguous:
struct A {
void foo() { std::cout << "A\n"; }
};
struct B {
void foo() { std::cout << "B\n"; }
};
struct C : A, B {
using A::foo;
using B::foo;
};
int main()
{
C{}.foo();
return 0;
}
So the short answer is - I suspect this is underspecified in the standard and that both compilers are doing acceptable things. I would just avoid writing this sort of code for general sanity.
The declaration is legal.
Calling it is legal and should work anywhere, and it can only be called from the class and derived classes, and it can be called from within any class. You'll note that this makes little sense.
There are no rules that ban that construct in declarations (importing the name twice from two different base classes with the same signature), and it is even used in "real" code where the derived class goes and hides the name after they are imported.
If you don't hide it, you are in the strange situation where the same function A::propose is both protected and public at the same time, as it is named twice (legally) in the same scope with different access control. This is ... unusual.
If you are within a class, a sub-clause says you can use it:
[class.access.base]/5.1
A member m is accessible at the point R when named in class N if — (5.1) m as a member of N is public
and propose is clearly public. (it is also protected but we don't have to keep reading for that case!)
Elsewhere, we have a contradiction. You are told you can use it everywhere without restriction [class.access]/1(3). And you are told that you can only use it in certain circumstances [class.access]/1(2).
I am uncertain how to resolve that ambiguity.
The rest of the logic train:
In [namespace.udecl]/10 we have:
A using-declaration is a declaration and can therefore be used repeatedly where (and only where) multiple declarations are allowed.
And [namespace.udecl]/13:
Since a using-declaration is a declaration, the restrictions on declarations of the same name in the same declarative region
so each of those using X::propose; are declarations.
[basic.scope] has no applicable restrictions on two functions of the same name in a scope, other than [basic.scope.class]/1(3) which states that if reordering of declarations changes the program, the program is ill-formed. So we cannot say that the later one wins.
Two declarations of member functions in the same scope are legal under [basic.scope]. However, under [over], there are restrictions on two member functions with the same name.
[over]/1 states:
When two or more different declarations are specified for a single name in the same scope, that name is said to be overloaded
And there are some restrictions on overloading. This is what usually prevents
struct foo {
int method();
int method();
};
from being legal. However:
[over.load]/1 states:
Not all function declarations can be overloaded. Those that cannot be overloaded are specified here. A program is ill-formed if it contains two such non-overloadable declarations in the same scope. [Note: This
restriction applies to explicit declarations in a scope, and between such declarations and declarations made through a using-declaration (7.3.3). It does not apply to sets of functions fabricated as a result of name lookup (e.g., because of using-directives) or overload resolution (e.g., for operator functions). —end note
the note explicitly permits symbols introduced via two using-declarations from being considered by the overloading restrictions! The rules only apply to two explicit declarations within the scope, or between an explicit declaration within the scope and a using declaration.
There are zero restrictions on two using-declarations. They can have the same name, and their signatures can conflict as much as you'd like.
This is useful, because usually you can go and then hide their declaration (with a declaration in the derived class), and nothing goes wrong [namespace.udecl]/15:
When a using-declaration brings names from a base class into a derived class scope, member functions and member function templates in the derived class override and/or hide member functions and member function templates with the same name, parameter-type-list (8.3.5), cv-qualification, and ref-qualifier (if any) in a base class (rather than conflicting).
However, this is not done here. We then call the method. Overload resolution occurs.
See [namespace.udecl]/16:
For the purpose of overload resolution, the functions which are introduced by a
using-declaration into a derived class will be treated as though they were members of the derived class. In particular, the implicit this parameter shall be treated as if it were a pointer to the derived class rather than to the base class. This has no effect on the type of the function, and in all other respects the function remains a member of
the base class.
So we have to treat them as if they are members of the derived class for the purpose of overload resolution. But there are still 3 declarations here:
protected:
int A::propose(); // X
int A1::propose(int); // Y
public:
int A::propose(); // Z
Thus the call to B1().propose() considers all 3 declarations. Both X and Z are equal. They, however, refer to the same function, and overload resolution states there is an ambiguity if two different functions are selected. So the result is not ambiguous. There may be access control violations, or not, depending on how you read the standard.
[over.match]/3
If a best viable function exists and is unique, overload resolution succeeds and produces it as the result. Otherwise overload resolution fails and the invocation is ill-formed. When overload resolution succeeds, and the best viable function is not accessible (Clause 11) in the context in which it is used, the program is ill-formed.
Consider this code:
struct A; // incomplete type
template<class T>
struct D { T d; };
template <class T>
struct B { int * p = nullptr; };
int main() {
B<D<A>> u, v;
u = v; // doesn't compile; complain that D<A>::d has incomplete type
u.operator=(v); // compiles
}
Demo. Since u.operator=(v) compiles but u = v; doesn't, somewhere during the overload resolution for the latter expression the compiler must have implicitly instantiated D<A> - but I don't see why that instantiation is required.
To make things more interesting, this code compiles:
struct A; // incomplete type
template<class T>
struct D; // undefined
template <class T>
struct B { int * p = nullptr; };
int main() {
B<D<A>> u, v;
u = v;
u.operator=(v);
}
Demo.
What's going on here? Why does u = v; cause the implicit instantiation of D<A> - a type that's nowhere used in the body of B's definition - in the first case but not the second?
The entire point of the matter is ADL kicking in:
N3797 - [basic.lookup.argdep]
When the postfix-expression in a function call (5.2.2) is an unqualified-id, other namespaces not considered
during the usual unqualified lookup (3.4.1) may be searched, and in those namespaces, namespace-scope
friend function or function template declarations (11.3) not otherwise visible may be found.
following:
For each argument type T in the function call, there is a set of zero or more associated namespaces and a
set of zero or more associated classes to be considered. [...] The sets of
namespaces and classes are determined in the following way:
If T is a class type [..] its associated classes are: ...
furthemore if T is a class template specialization its associated namespaces and classes also include: the namespaces and classes associated with the
types of the template arguments provided for template type parameters
D<A> is an associated class and therefore in the list waiting for its turn.
Now for the interesting part [temp.inst]/1
Unless a class template specialization has been explicitly instantiated (14.7.2) or explicitly specialized (14.7.3),
the class template specialization is implicitly instantiated [...] when the completeness of the class type affects the semantics of the program
One could think that the completeness of the type D<A> doesn't affect at all the semantic of that program, however [basic.lookup.argdep]/4 says:
When considering an associated namespace, the lookup is the same as the lookup performed when the associated namespace is used as a qualifier (3.4.3.2)
except that:
[...]
Any namespace-scope friend functions or friend function templates declared in associated classes are visible within their respective
namespaces even if they are not visible during an ordinary lookup (11.3)
i.e. the completeness of the class type actually affects friends declarations -> the completeness of the class type therefore affects the semantics of the program.
That is also the reason why your second sample works.
TL;DR D<A> is instantiated.
The last interesting point regards why ADL starts in the first place for
u = v; // Triggers ADL
u.operator=(v); // Doesn't trigger ADL
§13.3.1.2/2 dictates that there can be no non-member operator= (other than the built-in ones). Join this to [over.match.oper]/2:
The set of non-member candidates is the result of the unqualified lookup of operator# in the context
of the expression according to the usual rules for name lookup in unqualified function calls (3.4.2)
except that all member functions are ignored.
and the logical conclusion is: there's no point in performing the ADL lookup if there's no non-member form in table 11. However [temp.inst]p7 relaxes this:
If the overload resolution process can determine the correct function to call without instantiating a class template definition, it is unspecified whether that instantiation actually takes place.
and that's the reason why clang triggered the entire ADL -> implicit instantiation process in the first place.
Starting from r218330 (at the time of writing this, it has been committed a few minutes ago) this behavior was changed not to perform ADL for operator= at all.
References
r218330
clang sources / Sema module
cfe-dev
N3797
Thanks to Richard Smith and David Blaikie for helping me figuring this out.
Well, I think in Visual Studio 2013 code should look like(without = nullptr):
struct A; // incomplete type
template<class T>
struct D { T d; };
template <class T>
struct B { int * p; };
int void_main() {
B<D<A>> u, v;
u = v; // compiles
u.operator=(v); // compiles
return 0;
}
In this case it should compile well just because incomplete types can be used for specific template class specialization usage.
As for the run-time error - the variable v used without initialization - it's correct - struct B does not have any constructor => B::p is not initialized and could contain garbage.
I reproduce below the argument-dependent lookup (ADL) example given in pages 396 and 397 of Stroustrup book (4th edition):
namespace N {
struct S { int i; };
void f(S);
void g(S);
void h(int);
};
struct Base {
void f(N::S);
};
struct D : Base {
void mf(N::S);
void g(N::S x)
{
f(x); // call Base::f()
mf(x); // call D::mf()
h(1); // error: no h(int) available
}
};
What the comments say above is correct (I've tested it), but that doesn't seem to agree with what the author says in the next paragraph:
In the standard, the rules for argument-dependent lookup are phrased
in terms of associated namespaces (iso §3.4.2). Basically:
If an argument is a class member , the associated namespaces are the class itself (including its base classes) and the class's
enclosing namespaces.
If an argument is a member of a namespace, the associated namespaces are the enclosing namespaces.
If an argument is a built-in type, there are no associated namespaces.
In the example, x, which has type N::S is not a member of class D, nor of its base Base. But it's a member of namespace N. According to the second bullet above, the function N::f(S) should be the one called, instead of Base::f().
The result above also doesn't seem to agree with the second bullet in paragraph 3.4.2p2 in the Standard, which says:
If T is a class type (including unions), its associated classes are:
the class itself; the class of which it is a member, if any; and its
direct and indirect base classes. Its associated namespaces are the
namespaces of which its associated classes are members. Furthermore,
if T is a class template specialization, its associated namespaces and
classes also include: the namespaces and classes associated with the
types of the template arguments provided for template type parameters
(excluding template template parameters); the namespaces of which any
template template arguments are members; and the classes of which any
member templates used as template template arguments are members.
3.4.2/3 Let X be the lookup set produced by unqualified lookup (3.4.1) and let Y be the lookup set produced by argument dependent
lookup (defined as follows). If X contains
a declaration of a class member, or
a block-scope function declaration that is not a using-declaration, or
a declaration that is neither a function or a function template
then Y is empty. Otherwise...
So basically, ADL doesn't kick in when the ordinary lookup finds a member function or local (block-scope) function declaration (or something that's not a function). It does kick in when the ordinary lookup finds a stand-alone namespace-scope function, or when it finds nothing at all.