Ambiguous operator overload in Clang - c++

Consider the following:
template<typename T>
struct C {};
template<typename T, typename U>
void operator +(C<T>&, U);
struct D: C<D> {};
struct E {};
template<typename T>
void operator +(C<T>&, E);
void F() { D d; E e; d + e; }
This code compiles fine on both GCC-7 and Clang-5. The selected overload for operator + is that of struct E.
Now, if the following change takes place:
/* Put `operator +` inside the class. */
template<typename T>
struct C {
template<typename U>
void operator +(U);
};
that is, if operator + is defined inside the class template, instead of outside, then Clang yields ambiguity between both operator +s present in the code. GCC still compiles fine.
Why does this happen? Is this a bug in either GCC or Clang?

This is a bug in gcc; specifically, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53499 .
The problem is that gcc is regarding the implicit object parameter of a class template member function as having a dependent type; that is, during function template partial ordering gcc transforms
C<D>::template<class U> void operator+(U); // #1
into
template<class T, class U> void operator+(C<T>&, U); // #1a (gcc, wrong)
when it should be transformed into
template<class U> void operator+(C<D>&, U); // #1b (clang, correct)
We can see that when compared to your
template<class T> void operator+(C<T>&, E); // #2
#2 is better than the erroneous #1a, but is ambiguous with #1b.
Observe that gcc incorrectly accepts even when C<D> is not a template at all - i.e., when C<D> is a class template full specialization:
template<class> struct C;
struct D;
template<> struct C<D> {
// ...
This is covered by [temp.func.order]/3, with clarification in the example. Note that again, gcc miscompiles that example, incorrectly rejecting it but for the same reason.

Edit: The original version of this answer said that GCC was correct. I now believe that Clang is correct according to the wording of the standard, but I can see how GCC's interpretation could also be correct.
Let's look at your first example, where the two declarations are:
template<typename T, typename U>
void operator +(C<T>&, U);
template<typename T>
void operator +(C<T>&, E);
Both are viable, but it is obvious that the second template is more specialized than the first. So GCC and Clang both resolve the call to the second template. But let's walk through [temp.func.order] to see why, in the wording of the standard, the second template is more specialized.
The partial ordering rules tell us to replace each type template parameter with a unique synthesized type and then perform deduction against the other template. Under this scheme, the first overload type becomes
void(C<X1>&, X2)
and deduction against the second template fails since the latter only accepts E. The second overload type becomes
void(C<X3>&, E)
and deduction against the first template succeeds (with T = X3 and U = E). Since the deduction succeeded in only one direction, the template that accepted the other's transformed type (the first one) is considered less specialized, and thus, the second overload is chosen as the more specialized one.
When the second overload is moved into class C, both overloads are still found and the overload resolution process should apply in exactly the same way. First, the argument list is constructed for both overloads, and since the first overload is a non-static class member, an implied object parameter is inserted. According to [over.match.funcs], the type of that implied object parameter should be "lvalue reference to C<T>" since the function does not have a ref-qualifier. So the two argument lists are both (C<D>&, E). Since this fails to effect a choice between the two overloads, the partial ordering test kicks in again.
The partial ordering test, described in [temp.func.order], also inserts an implied object parameter:
If only one of the function templates M is a non-static member of
some class A, M is considered to have a new first parameter inserted in its function parameter list. Given cv
as the cv-qualifiers of M (if any), the new parameter is of type “rvalue reference to cv A” if the optional
ref-qualifier of M is && or if M has no ref-qualifier and the first parameter of the other template has rvalue
reference type. Otherwise, the new parameter is of type “lvalue reference to cv A”. [ Note: This allows a
non-static member to be ordered with respect to a non-member function and for the results to be equivalent
to the ordering of two equivalent non-members. — end note ]
This is the step where, presumably, GCC and Clang take different interpretations of the standard.
My take: The member operator+ has already been found in the class C<D>. The template parameter T for the class C is not being deduced; it is known because the name lookup process entered the concrete base class C<D> of D. The actual operator+ that is submitted to partial ordering therefore does not have a free T parameter; it is not void operator+(C<T>&, U), but rather, void operator+(C<D>&, U).
Thus, for the member overload, the transformed function type should not be void(C<X1>&, X2), but rather void(C<D>&, X2). For the non-member overload, the transformed function type is still void(C<X3>&, E) as before. But now we see that void(C<D>&, X2) is not a match for the non-member template void(C<T>&, E) nor is void(C<X3>&, E) a match for the member template void(C<D>&, U). So partial ordering fails, and overload resolution returns an ambiguous result.
GCC's decision to continue to select the non-member overload makes sense if you assume that it is constructing the transformed function type for the member lexically, making it still void(C<X1>&, X2), while Clang substitutes D into the template, leaving only U as a free parameter, before beginning the partial ordering test.

Related

Ambigous operator overload on clang [duplicate]

Consider the following code, which mixes a member and a non-member operator|
template <typename T>
struct S
{
template <typename U>
void operator|(U)
{}
};
template <typename T>
void operator|(S<T>,int) {}
int main()
{
S<int>() | 42;
}
In Clang this code fails to compile stating that the call to operator| is ambiguous, while gcc and msvc can compile it. I would expect the non-member overload to be selected since it is more specialized. What is the behavior mandated by the standard? Is this a bug in Clang?
(Note that moving the operator template outside the struct resolves the ambiguity.)
I do believe clang is correct marking the call as ambiguous.
Converting the member to a free-standing function
First, the following snippet is NOT equal to the code you have posted w.r.t S<int>() | 42; call.
template <typename T>
struct S
{
};
template <typename T,typename U>
void operator|(S<T>,U)
{}
template <typename T>
void operator|(S<T>,int) {}
In this case the now-non-member implementation must deduce both T and U, making the second template more specialized and thus chosen. All compilers agree on this as you have observed.
Overload resolution
Given the code you have posted.
For overload resolution to take place, a name lookup is initiated first. That consists of finding all symbols named operator| in the current context, among all members of S<int>, and the namespaces given by ADL rules which is not relevant here. Crucially, T is resolved at this stage, before overload resolution even happens, it must be. The found symbols are thus
template <typename T> void operator|(S<T>,int)
template <typename U> void S<int>::operator|(U)
For the purpose of picking a better candidate, all member functions are treated as non-members with a special *this parameter. S<int>& in our case.
[over.match.funcs.4]
For implicit object member functions, the type of the implicit object parameter is
(4.1)
“lvalue reference to cv X” for functions declared without a ref-qualifier or with the & ref-qualifier
(4.2)
“rvalue reference to cv X” for functions declared with the && ref-qualifier
where X is the class of which the function is a member and cv is the cv-qualification on the member function declaration.
This leads to:
template <typename T> void operator|(S<T>,int)
template <typename U> void operator|(S<int>&,U)
Looking at this, one might assume that because the second function cannot bind the rvalue used in the call, the first function must be chosen. Nope, there is special rule that covers this:
[over.match.funcs.5][Emphasis mine]
During overload resolution, the implied object argument is indistinguishable from other arguments.
The implicit object parameter, however, retains its identity since no user-defined conversions can be applied to achieve a type match with it.
For implicit object member functions declared without a ref-qualifier, even if the implicit object parameter is not const-qualified, an rvalue can be bound to the parameter as long as in all other respects the argument can be converted to the type of the implicit object parameter.
Due to some other rules, U is deduced to be int, not const int& or int&. S<T> can also be easily deduced as S<int> and S<int> is copyable.
So both candidates are still valid. Furthermore neither is more specialized than the other. I won't go through the process step by step because I am not able to who can blame me if even all the compilers did not get the rules right here. But it is ambiguous exactly for the same reason as foo(42) is for
void foo(int){}
void foo(const int&){}
I.e. there is no preference between copy and reference as long as the reference can bind the value. Which is true in our case even for S<int>& due to the rule above. The resolution is just done for two arguments instead of one.

SFINAE works with deduction but fails with substitution

Consider the following MCVE
struct A {};
template<class T>
void test(T, T) {
}
template<class T>
class Wrapper {
using type = typename T::type;
};
template<class T>
void test(Wrapper<T>, Wrapper<T>) {
}
int main() {
A a, b;
test(a, b); // works
test<A>(a, b); // doesn't work
return 0;
}
Here test(a, b); works and test<A>(a, b); fails with:
<source>:11:30: error: no type named 'type' in 'A'
using type = typename T::type;
~~~~~~~~~~~~^~~~
<source>:23:13: note: in instantiation of template class 'Wrap<A>' requested here
test<A>(a, b); // doesn't work
^
<source>:23:5: note: while substituting deduced template arguments into function template 'test' [with T = A]
test<A>(a, b); // doesn't work
LIVE DEMO
Question: Why is that? Shouldn't SFINAE work during substitution? Yet here it seems to work during deduction only.
Self introduction
Hello everyone, I am an innocent compiler.
The first call
test(a, b); // works
In this call, the argument type is A. Let me first consider the first overload:
template <class T>
void test(T, T);
Easy. T = A.
Now consider the second:
template <class T>
void test(Wrapper<T>, Wrapper<T>);
Hmm ... what? Wrapper<T> for A? I have to instantiate Wrapper<T> for every possible type T in the world just to make sure that a parameter of type Wrapper<T>, which might be specialized, can't be initialized with an argument of type A? Well ... I don't think I'm going to do that ...
Hence I will not instantiate any Wrapper<T>. I will choose the first overload.
The second call
test<A>(a, b); // doesn't work
test<A>? Aha, I don't have to do deduction. Let me just check the two overloads.
template <class T>
void test(T, T);
T = A. Now substitute — the signature is (A, A). Perfect.
template <class T>
void test(Wrapper<T>, Wrapper<T>);
T = A. Now subst ... Wait, I never instantiated Wrapper<A>? I can't substitute then. How can I know whether this would be a viable overload for the call? Well, I have to instantiate it first. (instantiating) Wait ...
using type = typename T::type;
A::type? Error!
Back to L. F.
Hello everyone, I am L. F. Let's review what the compiler has done.
Was the compiler innocent enough? Did he (she?) conform to the standard?
#YSC has pointed out that [temp.over]/1 says:
When a call to the name of a function or function template is written
(explicitly, or implicitly using the operator notation), template
argument deduction ([temp.deduct]) and checking of any explicit
template arguments ([temp.arg]) are performed for each function
template to find the template argument values (if any) that can be
used with that function template to instantiate a function template
specialization that can be invoked with the call arguments. For each
function template, if the argument deduction and checking succeeds,
the template-arguments (deduced and/or explicit) are used to
synthesize the declaration of a single function template
specialization which is added to the candidate functions set to be
used in overload resolution. If, for a given function template,
argument deduction fails or the synthesized function template
specialization would be ill-formed, no such function is added to the
set of candidate functions for that template. The complete set of
candidate functions includes all the synthesized declarations and all
of the non-template overloaded functions of the same name. The
synthesized declarations are treated like any other functions in the
remainder of overload resolution, except as explicitly noted in
[over.match.best].
The missing type leads to a hard error. Read https://stackoverflow.com/a/15261234. Basically, we have two stages when determining whether template<class T> void test(Wrapper<T>, Wrapper<T>) is the desired overload:
Instantiation. In this case, we (fully) instantiate Wrapper<A>. In this stage, using type = typename T::type; is problematic because A::type is nonexistent. Problems that occur in this stage are hard errors.
Substitution. Since the first stage already fails, this stage is not even reached in this case. Problems that occur in this stage are subject to SFINAE.
So yeah, the innocent compiler has done the right thing.
I'm not a language lawyer but I don't think that defining a using type = typename T::type; inside a class is, itself, usable as SFINAE to enable/disable a function receiving an object of that class.
If you want a solution, you can apply SFINAE to the Wrapper version as follows
template<class T>
auto test(Wrapper<T>, Wrapper<T>)
-> decltype( T::type, void() )
{ }
This way, this test() function is enabled only for T types with a type type defined inside it.
In your version, is enabled for every T type but gives error when T is incompatible with Wrapper.
-- EDIT --
The OP precises and asks
My Wrapper has many more dependencies on T, it would be impractical to duplicate them all in a SFINAE expression. Isn't there a way to check if Wrapper itself can be instantiated?
As suggested by Holt, you can create a custom type traits to see if a type is a Wrapper<something> type; by example
template <typename>
struct is_wrapper : public std::false_type
{ };
template <typename T>
struct is_wrapper<Wrapper<T>> : public std::true_type
{ using type = T; };
Then you can modify the Wrapper version to receive a U type and check if U is a Wrapper<something> type
template <typename U>
std::enable_if_t<is_wrapper<U>{}> test (U, U)
{ using T = typename is_wrapper<U>::type; }
Observe that you can recover the original T type (if you need it) using the type definition inside the is_wrapper struct.
If you need a non-Wrapper version of test(), with this solution you have to explicity disable it when T is a Wrapper<something> type to avoid collision
template <typename T>
std::enable_if_t<!is_wrapper<T>{}> test(T, T)
{ }
Deduction of the function called in a function call expression is performed in two steps:
Determination of the set of viable functions;
Determination of the best viable function.
The set of viable function can only contain function declaration and template function specialization declaration.
So when a call expression (test(a,b) or test<A>(a,b)) names a template function, it is necessary to determine all template arguments: this is called template argument deduction. This is performed in three steps [temp.deduct]:
Subsitution of explicitly provided template arguments (in names<A>(x,y) A is explicitly provided);(substitution means that in the function template delcaration, the template parameter are replaced by their argument)
Deduction of template arguments that are not provided;
Substitution of deduced template argument.
Call expression test(a,b)
There are no explictly provided template argument.
T is deduced to A for the first template function, deduction fails for the second template function [temp.deduct.type]/8. So the second template function will not participate to overload resolution
A is subsituted in the declaration of the first template function. The subsitution succeeds.
So there is only one overload in the set and it is selected by overload resolution.
Call expression test<A>(a,b)
(Edit after the pertinent remarks of #T.C. and #geza)
The template argument is provided: A and it is substituted in the declaration of the two template functions. This substitution only involves the instantiation of the declaration of the function template specialization. So it is fine for the two template
No deduction of template argument
No substitution of deduced template argument.
So the two template specializations, test<A>(A,A) and test<A>(Wrapper<A>,Wrapper<A>), participate in overload resolution. First the compiler must determine which function are viable. To do that the compiler needs to find an implicit conversion sequence that converts the function argument to the function parameter type [over.match.viable]/4:
Third, for F to be a viable function, there shall exist for each argument an implicit conversion sequence that converts that argument to the corresponding parameter of F.
For the second overload, in order to find a conversion to Wrapper<A> the compiler needs the definition of this class. So it (implicitly) instantiates it. This is this instantiation that causes the observed error generated by compilers.

Are the template partial ordering rules underspecified? [duplicate]

Consider the following simple (to the extent that template questions ever are) example:
#include <iostream>
template <typename T>
struct identity;
template <>
struct identity<int> {
using type = int;
};
template<typename T> void bar(T, T ) { std::cout << "a\n"; }
template<typename T> void bar(T, typename identity<T>::type) { std::cout << "b\n"; }
int main ()
{
bar(0, 0);
}
Both clang and gcc print "a" there. According to the rules in [temp.deduct.partial] and [temp.func.order], to determine partial ordering, we need to synthesize some unique types. So we have two attempts at deduction:
+---+-------------------------------+-------------------------------------------+
| | Parameters | Arguments |
+---+-------------------------------+-------------------------------------------+
| a | T, typename identity<T>::type | UniqueA, UniqueA |
| b | T, T | UniqueB, typename identity<UniqueB>::type |
+---+-------------------------------+-------------------------------------------+
For deduction on "b", according to Richard Corden's answer, the expression typename identity<UniqueB>::type is treated as a type and is not evaluated. That is, this will be synthesized as if it were:
+---+-------------------------------+--------------------+
| | Parameters | Arguments |
+---+-------------------------------+--------------------+
| a | T, typename identity<T>::type | UniqueA, UniqueA |
| b | T, T | UniqueB, UniqueB_2 |
+---+-------------------------------+--------------------+
It's clear that deduction on "b" fails. Those are two different types so you cannot deduce T to both of them.
However, it seems to me that the deduction on A should fail. For the first argument, you'd match T == UniqueA. The second argument is a non-deduced context - so wouldn't that deduction succeed iff UniqueA were convertible to identity<UniqueA>::type? The latter is a substitution failure, so I don't see how this deduction could succeed either.
How and why do gcc and clang prefer the "a" overload in this scenario?
As discussed in the comments, I believe there are several aspects of the function template partial ordering algorithm that are unclear or not specified at all in the standard, and this shows in your example.
To make things even more interesting, MSVC (I tested 12 and 14) rejects the call as ambiguous. I don't think there's anything in the standard to conclusively prove which compiler is right, but I think I might have a clue about where the difference comes from; there's a note about that below.
Your question (and this one) challenged me to do some more investigation into how things work. I decided to write this answer not because I consider it authoritative, but rather to organize the information I have found in one place (it wouldn't fit in comments). I hope it will be useful.
First, the proposed resolution for issue 1391. We discussed it extensively in comments and chat. I think that, while it does provide some clarification, it also introduces some issues. It changes [14.8.2.4p4] to (new text in bold):
Each type nominated above from the parameter template and the
corresponding type from the argument template are used as the types of
P and A. If a particular P contains no template-parameters that
participate in template argument deduction, that P is not used to
determine the ordering.
Not a good idea in my opinion, for several reasons:
If P is non-dependent, it doesn't contain any template parameters at all, so it doesn't contain any that participate in argument deduction either, which would make the bold statement apply to it. However, that would make template<class T> f(T, int) and template<class T, class U> f(T, U) unordered, which doesn't make sense. This is arguably a matter of interpretation of the wording, but it could cause confusion.
It messes with the notion of used to determine the ordering, which affects [14.8.2.4p11]. This makes template<class T> void f(T) and template<class T> void f(typename A<T>::a) unordered (deduction succeeds from first to second, because T is not used in a type used for partial ordering according to the new rule, so it can remain without a value). Currently, all compilers I've tested report the second as more specialized.
It would make #2 more specialized than #1 in the following example:
#include <iostream>
template<class T> struct A { using a = T; };
struct D { };
template<class T> struct B { B() = default; B(D) { } };
template<class T> struct C { C() = default; C(D) { } };
template<class T> void f(T, B<T>) { std::cout << "#1\n"; } // #1
template<class T> void f(T, C<typename A<T>::a>) { std::cout << "#2\n"; } // #2
int main()
{
f<int>(1, D());
}
(#2's second parameter is not used for partial ordering, so deduction succeeds from #1 to #2 but not the other way around). Currently, the call is ambiguous, and should arguably remain so.
After looking at Clang's implementation of the partial ordering algorithm, here's how I think the standard text could be changed to reflect what actually happens.
Leave [p4] as it is and add the following between [p8] and [p9]:
For a P / A pair:
If P is non-dependent, deduction is considered successful if and only if P and A are the same type.
Substitution of deduced template parameters into the non-deduced contexts appearing in P is not performed and does not affect the outcome of the deduction process.
If template argument values are successfully deduced for all template parameters of P except the ones that appear only in non-deduced contexts, then deduction is considered successful (even if some parameters used in P remain without a value at the end of the deduction process for that particular P / A pair).
Notes:
About the second bullet point: [14.8.2.5p1] talks about finding template argument values that will make P, after substitution of the deduced values (call it the deduced A), compatible with A. This could cause confusion about what actually happens during partial ordering; there's no substitution going on.
MSVC doesn't seem to implement the third bullet point in some cases. See the next section for details.
The second and third bullet points are intented to also cover cases where P has forms like A<T, typename U::b>, which aren't covered by the wording in issue 1391.
Change the current [p10] to:
Function template F is at least as specialized as function template
G if and only if:
for each pair of types used to determine the ordering, the type from F is at least as specialized as the type from G, and,
when performing deduction using the transformed F as the argument template and G as the parameter template, after deduction is done
for all pairs of types, all template parameters used in the types from
G that are used to determine the ordering have values, and those
values are consistent across all pairs of types.
F is more specialized than G if F is at least as specialized
as G and G is not at least as specialized as F.
Make the entire current [p11] a note.
(The note added by the resolution of 1391 to [14.8.2.5p4] needs to be adjusted as well - it's fine for [14.8.2.1], but not for [14.8.2.4].)
For MSVC, in some cases, it looks like all template parameters in P need to receive values during deduction for that specific P / A pair in order for deduction to succeed from A to P. I think this could be what causes implementation divergence in your example and others, but I've seen at least one case where the above doesn't seem to apply, so I'm not sure what to believe.
Another example where the statement above does seem to apply: changing template<typename T> void bar(T, T) to template<typename T, typename U> void bar(T, U) in your example swaps results around: the call is ambiguous in Clang and GCC, but resolves to b in MSVC.
One example where it doesn't:
#include <iostream>
template<class T> struct A { using a = T; };
template<class, class> struct B { };
template<class T, class U> void f(B<U, T>) { std::cout << "#1\n"; }
template<class T, class U> void f(B<U, typename A<T>::a>) { std::cout << "#2\n"; }
int main()
{
f<int>(B<int, int>());
}
This selects #2 in Clang and GCC, as expected, but MSVC rejects the call as ambiguous; no idea why.
The partial ordering algorithm as described in the standard speaks of synthesizing a unique type, value, or class template in order to generate the arguments. Clang manages that by... not synthesizing anything. It just uses the original forms of the dependent types (as declared) and matches them both ways. This makes sense, as substituting the synthesized types doesn't add any new information. It can't change the forms of the A types, since there's generally no way to tell what concrete types the substituted forms could resolve to. The synthesized types are unknown, which makes them pretty similar to template parameters.
When encountering a P that is a non-deduced context, Clang's template argument deduction algorithm simply skips it, by returning "success" for that particular step. This happens not only during partial ordering, but for all types of deductions, and not just at the top level in a function parameter list, but recursively whenever a non-deduced context is encountered in the form of a compound type. For some reason, I found that surprising the first time I saw it. Thinking about it, it does, of course, make sense, and is according to the standard ([...] does not participate in type deduction [...] in [14.8.2.5p4]).
This is consistent with Richard Corden's comments to his answer, but I had to actually see the compiler code to understand all the implications (not a fault of his answer, but rather of my own - programmer thinking in code and all that).
I've included some more information about Clang's implementation in this answer.
I believe the key is with the following statement:
The second argument is a non-deduced context - so wouldn't that deduction succeed iff UniqueA were convertible to identity::type?
Type deduction does not perform checking of "conversions". Those checks take place using the real explicit and deduced arguments as part of overload resolution.
This is my summary of the steps that are taken to select the function template to call (all references taken from N3937, ~ C++ '14):
Explicit arguments are replaced and the resulting function type checked that it is valid. (14.8.2/2)
Type deduction is performed and the resulting deduced arguments are replaced. Again the resulting type must be valid. (14.8.2/5)
The function templates that succeeded in Steps 1 and 2 are specialized and included in the overload set for overload resolution. (14.8.3/1)
Conversion sequences are compared by overload resolution. (13.3.3)
If the conversion sequences of two function specializations are not 'better' the partial ordering algorithm is used to find the more specialized function template. (13.3.3)
The partial ordering algorithm checks only that type deduction succeeds. (14.5.6.2/2)
The compiler already knows by step 4 that both specializations can be called when the real arguments are used. Steps 5 and 6 are being used to determine which of the functions is more specialized.

Template partial ordering - why does partial deduction succeed here

Consider the following simple (to the extent that template questions ever are) example:
#include <iostream>
template <typename T>
struct identity;
template <>
struct identity<int> {
using type = int;
};
template<typename T> void bar(T, T ) { std::cout << "a\n"; }
template<typename T> void bar(T, typename identity<T>::type) { std::cout << "b\n"; }
int main ()
{
bar(0, 0);
}
Both clang and gcc print "a" there. According to the rules in [temp.deduct.partial] and [temp.func.order], to determine partial ordering, we need to synthesize some unique types. So we have two attempts at deduction:
+---+-------------------------------+-------------------------------------------+
| | Parameters | Arguments |
+---+-------------------------------+-------------------------------------------+
| a | T, typename identity<T>::type | UniqueA, UniqueA |
| b | T, T | UniqueB, typename identity<UniqueB>::type |
+---+-------------------------------+-------------------------------------------+
For deduction on "b", according to Richard Corden's answer, the expression typename identity<UniqueB>::type is treated as a type and is not evaluated. That is, this will be synthesized as if it were:
+---+-------------------------------+--------------------+
| | Parameters | Arguments |
+---+-------------------------------+--------------------+
| a | T, typename identity<T>::type | UniqueA, UniqueA |
| b | T, T | UniqueB, UniqueB_2 |
+---+-------------------------------+--------------------+
It's clear that deduction on "b" fails. Those are two different types so you cannot deduce T to both of them.
However, it seems to me that the deduction on A should fail. For the first argument, you'd match T == UniqueA. The second argument is a non-deduced context - so wouldn't that deduction succeed iff UniqueA were convertible to identity<UniqueA>::type? The latter is a substitution failure, so I don't see how this deduction could succeed either.
How and why do gcc and clang prefer the "a" overload in this scenario?
As discussed in the comments, I believe there are several aspects of the function template partial ordering algorithm that are unclear or not specified at all in the standard, and this shows in your example.
To make things even more interesting, MSVC (I tested 12 and 14) rejects the call as ambiguous. I don't think there's anything in the standard to conclusively prove which compiler is right, but I think I might have a clue about where the difference comes from; there's a note about that below.
Your question (and this one) challenged me to do some more investigation into how things work. I decided to write this answer not because I consider it authoritative, but rather to organize the information I have found in one place (it wouldn't fit in comments). I hope it will be useful.
First, the proposed resolution for issue 1391. We discussed it extensively in comments and chat. I think that, while it does provide some clarification, it also introduces some issues. It changes [14.8.2.4p4] to (new text in bold):
Each type nominated above from the parameter template and the
corresponding type from the argument template are used as the types of
P and A. If a particular P contains no template-parameters that
participate in template argument deduction, that P is not used to
determine the ordering.
Not a good idea in my opinion, for several reasons:
If P is non-dependent, it doesn't contain any template parameters at all, so it doesn't contain any that participate in argument deduction either, which would make the bold statement apply to it. However, that would make template<class T> f(T, int) and template<class T, class U> f(T, U) unordered, which doesn't make sense. This is arguably a matter of interpretation of the wording, but it could cause confusion.
It messes with the notion of used to determine the ordering, which affects [14.8.2.4p11]. This makes template<class T> void f(T) and template<class T> void f(typename A<T>::a) unordered (deduction succeeds from first to second, because T is not used in a type used for partial ordering according to the new rule, so it can remain without a value). Currently, all compilers I've tested report the second as more specialized.
It would make #2 more specialized than #1 in the following example:
#include <iostream>
template<class T> struct A { using a = T; };
struct D { };
template<class T> struct B { B() = default; B(D) { } };
template<class T> struct C { C() = default; C(D) { } };
template<class T> void f(T, B<T>) { std::cout << "#1\n"; } // #1
template<class T> void f(T, C<typename A<T>::a>) { std::cout << "#2\n"; } // #2
int main()
{
f<int>(1, D());
}
(#2's second parameter is not used for partial ordering, so deduction succeeds from #1 to #2 but not the other way around). Currently, the call is ambiguous, and should arguably remain so.
After looking at Clang's implementation of the partial ordering algorithm, here's how I think the standard text could be changed to reflect what actually happens.
Leave [p4] as it is and add the following between [p8] and [p9]:
For a P / A pair:
If P is non-dependent, deduction is considered successful if and only if P and A are the same type.
Substitution of deduced template parameters into the non-deduced contexts appearing in P is not performed and does not affect the outcome of the deduction process.
If template argument values are successfully deduced for all template parameters of P except the ones that appear only in non-deduced contexts, then deduction is considered successful (even if some parameters used in P remain without a value at the end of the deduction process for that particular P / A pair).
Notes:
About the second bullet point: [14.8.2.5p1] talks about finding template argument values that will make P, after substitution of the deduced values (call it the deduced A), compatible with A. This could cause confusion about what actually happens during partial ordering; there's no substitution going on.
MSVC doesn't seem to implement the third bullet point in some cases. See the next section for details.
The second and third bullet points are intented to also cover cases where P has forms like A<T, typename U::b>, which aren't covered by the wording in issue 1391.
Change the current [p10] to:
Function template F is at least as specialized as function template
G if and only if:
for each pair of types used to determine the ordering, the type from F is at least as specialized as the type from G, and,
when performing deduction using the transformed F as the argument template and G as the parameter template, after deduction is done
for all pairs of types, all template parameters used in the types from
G that are used to determine the ordering have values, and those
values are consistent across all pairs of types.
F is more specialized than G if F is at least as specialized
as G and G is not at least as specialized as F.
Make the entire current [p11] a note.
(The note added by the resolution of 1391 to [14.8.2.5p4] needs to be adjusted as well - it's fine for [14.8.2.1], but not for [14.8.2.4].)
For MSVC, in some cases, it looks like all template parameters in P need to receive values during deduction for that specific P / A pair in order for deduction to succeed from A to P. I think this could be what causes implementation divergence in your example and others, but I've seen at least one case where the above doesn't seem to apply, so I'm not sure what to believe.
Another example where the statement above does seem to apply: changing template<typename T> void bar(T, T) to template<typename T, typename U> void bar(T, U) in your example swaps results around: the call is ambiguous in Clang and GCC, but resolves to b in MSVC.
One example where it doesn't:
#include <iostream>
template<class T> struct A { using a = T; };
template<class, class> struct B { };
template<class T, class U> void f(B<U, T>) { std::cout << "#1\n"; }
template<class T, class U> void f(B<U, typename A<T>::a>) { std::cout << "#2\n"; }
int main()
{
f<int>(B<int, int>());
}
This selects #2 in Clang and GCC, as expected, but MSVC rejects the call as ambiguous; no idea why.
The partial ordering algorithm as described in the standard speaks of synthesizing a unique type, value, or class template in order to generate the arguments. Clang manages that by... not synthesizing anything. It just uses the original forms of the dependent types (as declared) and matches them both ways. This makes sense, as substituting the synthesized types doesn't add any new information. It can't change the forms of the A types, since there's generally no way to tell what concrete types the substituted forms could resolve to. The synthesized types are unknown, which makes them pretty similar to template parameters.
When encountering a P that is a non-deduced context, Clang's template argument deduction algorithm simply skips it, by returning "success" for that particular step. This happens not only during partial ordering, but for all types of deductions, and not just at the top level in a function parameter list, but recursively whenever a non-deduced context is encountered in the form of a compound type. For some reason, I found that surprising the first time I saw it. Thinking about it, it does, of course, make sense, and is according to the standard ([...] does not participate in type deduction [...] in [14.8.2.5p4]).
This is consistent with Richard Corden's comments to his answer, but I had to actually see the compiler code to understand all the implications (not a fault of his answer, but rather of my own - programmer thinking in code and all that).
I've included some more information about Clang's implementation in this answer.
I believe the key is with the following statement:
The second argument is a non-deduced context - so wouldn't that deduction succeed iff UniqueA were convertible to identity::type?
Type deduction does not perform checking of "conversions". Those checks take place using the real explicit and deduced arguments as part of overload resolution.
This is my summary of the steps that are taken to select the function template to call (all references taken from N3937, ~ C++ '14):
Explicit arguments are replaced and the resulting function type checked that it is valid. (14.8.2/2)
Type deduction is performed and the resulting deduced arguments are replaced. Again the resulting type must be valid. (14.8.2/5)
The function templates that succeeded in Steps 1 and 2 are specialized and included in the overload set for overload resolution. (14.8.3/1)
Conversion sequences are compared by overload resolution. (13.3.3)
If the conversion sequences of two function specializations are not 'better' the partial ordering algorithm is used to find the more specialized function template. (13.3.3)
The partial ordering algorithm checks only that type deduction succeeds. (14.5.6.2/2)
The compiler already knows by step 4 that both specializations can be called when the real arguments are used. Steps 5 and 6 are being used to determine which of the functions is more specialized.

What are the rules for choosing between a variadic template method and a usual template method?

I have two template operators in class:
template<class T>
size_t operator()(const T& t) const {
static_assert(boost::is_pod<T>(), "Not a POD type");
return sizeof t;
}
template<typename... T>
size_t operator()(const boost::variant<T...>& t) const
{
return boost::apply_visitor(boost::bind(*this, _1), t);
}
I pass boost::variant<some, pod, types, here> as an argument to these operators. GCC 4.8 and llvm 6.0 compile the code fine, choosing boost::variant parameterized operator. gcc 4.7 chooses const T& t parameterized operator and thus fails to compile due to static assert.
So, I have a question, what are the rules for choosing between these two?
I think gcc 4.7 must have a bug, but I don't have any proof.
The key section is in [temp.deduct.partial]:
Two sets of types are used to determine the partial ordering. For each of the templates involved there is
the original function type and the transformed function type. [ Note: The creation of the transformed type
is described in 14.5.6.2. —end note ] The deduction process uses the transformed type as the argument
template and the original type of the other template as the parameter template. This process is done twice
for each type involved in the partial ordering comparison: once using the transformed template-1 as the
argument template and template-2 as the parameter template and again using the transformed template-2
as the argument template and template-1 as the parameter template.
That's really dense, even for the C++ standard, but what it basically means is this. Take our two overloads:
template <class T> // #1
size_t operator()(const T& t) const
template <typename... T> // #2
size_t operator()(const boost::variant<T...>& t)
And we're going to basically assign some unique type(s) to each one and try to see if the other applies. So let's pick some type A for the #1, and B,C,D for #2. Does operator()(const A&) work for #2? No. Does operator()(const boost::variant<B,C,D>&) work for #1? Yes. Thus, the partial ordering rules indicate #2 is more specialized than #1.
And so, from [temp.func.order]:
The deduction process determines whether one of the templates is more specialized than the other. If
so, the more specialized template is the one chosen by the partial ordering process.
And from [over.match.best]:
[A] viable function F1 is defined to be a better function than another viable function
F2 if
— [..]
— F1 and F2 are function template specializations, and the function template for F1 is more specialized
than the template for F2 according to the partial ordering rules described in 14.5.6.2.
Thus, #2 should be chosen in any case where it applies. If GCC chooses #1, that is nonconforming behavior and is a bug.
In general the compiler just treats all deduced template instantiations as potential overloads, picking the "best viable function" (§ 13.3.3).
Indeed this means GCC 4.7 has a bug then.
See §14.8.3: Overload resolution
describes that all template instances will join in the set of candidates as any non-template declared overload:
A function template can be overloaded either by (non-template) functions of its
name or by (other) function templates of the same name. When a call to that
name is written (explicitly, or implicitly using the operator notation),
template argument deduction (14.8.2) and checking of any explicit template
arguments (14.3) are performed for each function template to find the template
argument values (if any) that can be used with that function template to
instantiate a function template specialization that can be invoked with the
call arguments. For each function template, if the argument deduction and
checking succeeds, the template- arguments (deduced and/or explicit) are used
to synthesize the declaration of a single function template specialization
which is added to the candidate functions set to be used in overload
resolution. If, for a given function template, argument deduction fails, no
such function is added to the set of candidate functions for that template. The
complete set of candidate functions includes all the synthesized declarations
and all of the non-template overloaded functions of the same name. The
synthesized declarations are treated like any other functions in the remainder
of overload resolution, except as explicitly noted in 13.3.3.
In the case of your question, the overloads end up being indistinguishable (credit: #Piotr S). In such cases "partial ordering" is applied (§14.5.6.2):
F1 and F2 are function template specializations, and the function template for F1 is more specialized than the template for F2
Note that things can get pretty tricky, when e.g. the "open template" version took a T& instead of T const& (non const references are preferred, all else being equal).
When you had several overloads that end up having the same "rank" for overload resolution, the call is ill-formed and the compiler will diagnose an ambiguous function invocation.