Is a pointer to function (sometimes/always?) a function declarator? - c++

(This question has been broken out from the discussion to this answer, which highlights CWG 1892)
Some paragraphs of the standard apply specific rules to function declarators; e.g. [dcl.spec.auto]/3 regarding placeholder types [emphasis mine]:
The placeholder type can appear with a function declarator in the decl-specifier-seq, type-specifier-seq, conversion-function-id, or trailing-return-type, in any context where such a declarator is valid. If the function declarator includes a trailing-return-type ([dcl.fct]), that trailing-return-type specifies the declared return type of the function. Otherwise, the function declarator shall declare a function. [...]
restricts where placeholder types may appear with(in) a function declarator. We may study the following example:
int f() { return 0; }
auto (*g)() = f; // #1
which both GCC and Clang accept, deducing the type of g as int(*)().
Is a pointer to function (sometimes/always?) a function declarator?
Or, alternatively, applied to the example, should #1 be rejected as per [dcl.spec.auto]/3, or does the latter not apply here as a pointer to function is not a function declarator (instead allowing #1 as per [dcl.spec.auto]/4 regarding variable type deduction from initializer)?
The rules for determining what a given declarator is are not entirely easy to follow, but we may note, from [dcl.decl]/1
A declarator declares a single variable, function, or type, within a declaration.
that a given declarator is one of a variable declarator, a function declarator or a type declarator.
[dcl.ptr] covers (variable) declarators that are pointers, but does not explicitly (/normatively) mention pointers to functions, although it does so non-normatively in [dcl.ptr]/4.
[dcl.fct] covers function declarators but does not mention function pointers as part of function declarations, other than in a note that function types are checked during assignment/initialization to function pointers (which is not relevant to what a function declarator is).
My interpretation is that #1 is legal (as per the current standard), as it falls under a variable declarator. If this is actually correct, then the extended question (from the linked thread) is whether
template<auto (*g)()>
int f() { return g(); }
is legal or not (/intended to be legal or not as per CWG 1892); as the template parameter arguably contains a declarator that is a function pointer declarator, and not a function declarator.
We may finally note, as similarly pointed out in the linked to answer, that
template<auto g()> // #2
int f() { return g(); }
is arguably ill-formed (although this example is also accepted by both GCC and Clang), as the non-type template parameter at #2 is a function declarator and is thus used in an illegal context as per [dcl.spec.auto]/3, as it does not contain a trailing return type and does not declare a function.

The confusion here arises from two different meanings of "declarator": one is the portion of a declaration (after the specifiers) that pertains to one entity (or typedef-name), while the other is any of the several syntactic constructs used to form the former kind. The latter meaning gives rise to the grammar productions ptr-declarator (which also covers references) and noptr-declarator (which includes functions and arrays). That meaning is also necessary to give any meaning to a restriction that a "function declarator shall declare a function". Moreover, if we took the variable declaration
auto (*g)() = /*…*/;
to not involve a "function declarator" for the purposes of [dcl.spec.auto.general]/3, we would not be able to write
auto (*g)() -> int;
which is universally accepted (just as is the similar example in the question).
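The two variable declarations discussed above can be checked with a small sketch. It only records what an implementation deduces and does not settle what the standard intends; the name h is introduced here for illustration, and the initializer on the trailing-return-type form is an addition to the declaration quoted above.

```cpp
#include <cassert>
#include <type_traits>

int f() { return 0; }

// Placeholder return type nested inside a pointer-to-function declarator:
// the type of g is deduced from the initializer.
auto (*g)() = f;
static_assert(std::is_same<decltype(g), int (*)()>::value,
              "g is deduced as pointer to function returning int");

// Same shape with a trailing-return-type: nothing is deduced here,
// the return type is spelled out after the arrow.
// (h is a hypothetical name used only for this sketch.)
auto (*h)() -> int = f;
static_assert(std::is_same<decltype(h), int (*)()>::value,
              "h has the same type as g");
```

Both static_asserts compare against int (*)(), so an implementation that accepts the declarations treats them as having the same type.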
Moreover, while the statement that checks whether "the function declarator includes a trailing-return-type" inevitably refers to an overall declarator (which is what supports a trailing-return-type), it does so in its capacity as a "declaration operator" because it still allows the above cases with nested use of such operators. (What that limitation forbids is just
auto *f() -> int*;
where deduction would work but isn't performed at all here because it would always be useless.)
Meanwhile, there is some evidence, beyond implementation consensus, that the answer to the higher-level question is that auto in these cases should be allowed: [dcl.spec.auto.general]/1 says that auto in a function parameter serves to declare a generic lambda or abbreviated function template "if it is not the auto type-specifier introducing a trailing-return-type" rather than if it is not used with a function declarator at all.

Related

What is correct syntax for explicit call to conversion operator with deduced return type (auto) [duplicate]

struct A{
operator auto(){
return 0;
}
};
int main(){
A a;
a.operator auto(); // #1
a.operator int(); // #2
}
GCC accepts that #2 is the right way to call the conversion function explicitly while Clang accepts #1.
It seems that #1 is ill-formed due to the following rule:
dcl.spec.auto#6
A program that uses auto or decltype(auto) in a context not explicitly allowed in this section is ill-formed.
This usage, a.operator auto(), is not explicitly allowed in section [dcl.spec.auto], hence it should be ill-formed. However, for the second usage, which is accepted by GCC, the standard does not say that the conversion-function-id in which the conversion-type-id is replaced by the deduced type denotes the name of the conversion function. In other words, the conversion-function-id declared in the declaration is operator auto rather than operator int. The former consists of the same tokens as the declarator-id of the declaration. According to the grammar, the unqualified-id operator auto should be the name of that conversion function. So, how should this conversion function be called explicitly? Is the standard underspecified as to which is the name of the conversion function when it contains a placeholder specifier?
It seems that this is not specified precisely enough.
From 10.1.7.4 The auto specifier:
The placeholder type can appear with a function declarator in the
decl-specifier-seq, type-specifier-seq, conversion-function-id, or
trailing-return-type, in any context where such a declarator is valid.
Reading precisely, one might distinguish here between "can" and the stronger "can only", which potentially leaves room for implementations to differ (strictly ill-formed vs. unspecified behavior).
And 3.4.5 class member access says:
7 If the id-expression is a conversion-function-id, its
conversion-type-id is first looked up in the class of the object
expression and the name, if found, is used.
This again leaves room for interpretation as to whether the auto keyword can effectively be a fully qualified conversion-type-id in this context.
Your question might have to be split further, namely:
What are the overloading rules for operator auto() in detail, i.e. should it already take part in the regular competition between candidates at class definition level? (This is not the case for Clang and GCC: both accept the operator a priori alongside an extra operator int() ...)
Can operator auto() be called through an explicit member operator reference (your case #1), i.e. does it effectively have a (unique) accessible name? Allowing that would contradict all the other explicitly allowed use cases for the keyword.
I've seen explicit tests for this in several Clang revisions, so its behavior is evidently not an artefact of how an implicit naming convention is applied but an explicitly intended behavior.
As already mentioned in the comments, Clang's behavior is somewhat more consistent overall here, at least compared to GCC, since it is entirely clear where the auto keyword is used for type deduction and where it is used for name / function-id resolution. There, operator auto() is handled as a more explicit entity of its own, whereas in GCC it has an anonymous character similar to a lambda, but still takes part in the competition between candidates even for the explicit member operator access.
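Since the two spellings of the explicit call are accepted by different compilers, one way to sidestep the naming question is not to name the conversion function at all. The sketch below is a workaround rather than an answer to what the standard intends; the helper names to_int and to_int_implicit are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct A {
    operator auto() { return 0; }  // return type deduced as int
};

// The conversion function is found by ordinary conversion rules,
// so neither "operator auto" nor "operator int" has to be written.
static_assert(std::is_convertible<A, int>::value,
              "A converts to int through the deduced conversion function");

// Hypothetical helpers: explicit cast and copy-initialization.
int to_int(A a) { return static_cast<int>(a); }
int to_int_implicit(A a) { int result = a; return result; }
```

Both helpers rely on conversion to the deduced type int, so they do not depend on which conversion-function-id an implementation accepts in a member access expression.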

Is using explicit return type in one translation unit and deduced return type in another allowed?

My question is similar to this one, but subtly different.
Suppose I have two translation units, exec.cpp and lib.cpp, as follows:
// exec.cpp
int foo();
int main() {
return foo();
}
and
// lib.cpp
auto foo() {
return 42;
}
Is it legal to compile and link them together? Or is it ill-formed NDR?
Note: both g++ and clang generate the expected executable (i.e. one that returns 42) with the command <compiler> exec.cpp lib.cpp -o program
Note: Arguably this is bad practice (the return type can change if the implementation changes, breaking the code). But I would still like to know the answer.
All standard references below refer to N4861: March 2020 post-Prague working draft/C++20 DIS.
From [basic.link]/11 [emphasis mine]:
After all adjustments of types (during which typedefs are replaced by their definitions), the types specified by all declarations referring to a given variable or function shall be identical, except that declarations for an array object can specify array types that differ by the presence or absence of a major array bound ([dcl.array]). A violation of this rule on type identity does not require a diagnostic.
[dcl.spec.auto]/3 states that a placeholder type can appear with a function declarator and that, if this declarator does not include a trailing-return-type (as in OP's example),
[...] Otherwise [no trailing-return-type], the function declarator shall declare a function.
where
[...] the return type of the function is deduced from non-discarded return statements, if any, in the body of the function ([stmt.if]).
[dcl.fct]/1 covers function declarators that do not include a trailing-return-type [emphasis mine, removing opt parts of the grammar that do not apply in this particular example]:
In a declaration T D where D has the form [...] the type of the declarator-id in D is “derived-declarator-type-list function of parameter-type-list returning T” [...]
Thus, the two declarations
int foo(); // #1
auto foo() { // #2
// [dcl.spec.auto]/3:
// return type deduced to 'int'
}
both declare functions where the type of the associated declarator-id in D of these T D declarations is
“derived-declarator-type-list function of parameter-type-list returning T”
where in both cases, T is int:
explicitly specified in #1,
deduced as per [dcl.spec.auto]/3 in #2.
Thus, the declarations #1 and #2, after all adjustments of types, have identical (function) types. They therefore fulfil [basic.link]/11, and the OP's example is well-formed. Any slight variation of the definition of auto foo(), however, could lead to a deduced return type which is not int, in which case [basic.link]/11 is violated, NDR.
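The claim about the deduced function type can be checked inside a single translation unit. This is only a sketch of the type identity that the argument relies on and does not itself exercise linking across translation units; bar is a hypothetical variation added to show how easily the deduced type changes.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as the definition in lib.cpp.
auto foo() { return 42; }

// After deduction the function type is "function of () returning int".
static_assert(std::is_same<decltype(foo), int()>::value,
              "foo has type int()");

// Hypothetical variation: a different literal changes the deduced type,
// so a declaration 'int bar();' elsewhere would no longer match.
auto bar() { return 42L; }
static_assert(std::is_same<decltype(bar), long()>::value,
              "bar has type long(), not int()");
```

The second assertion illustrates the caveat above: a small edit to the function body silently changes the type that other translation units would have to declare.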

why a conversion function declaration does not require at least one defining-type-specifier

Except in a declaration of a constructor, destructor, or conversion function, at least one defining-type-specifier that is not a cv-qualifier shall appear in a complete type-specifier-seq or a complete decl-specifier-seq.
A constructor is an exception because it can be declared like constructor(){}, with no defining-type-specifier in the declaration; the same goes for a destructor.
For a conversion function, however, I can't see why it would not need a type-specifier. In the quote above, defining-type-specifier includes type-specifier: the phrase "defining-type-specifier that is not a cv-qualifier" implies as much, since only type-specifier includes cv-qualifier. Hence I think a conversion function declaration must contain at least one defining-type-specifier (more narrowly, a type-specifier, which is a subset of defining-type-specifier). The grammar of a conversion function definition looks as follows, per [dcl.fct.def#general-1]:
attribute-specifier-seq(opt) decl-specifier-seq(opt) declarator virt-specifier-seq(opt) function-body
Its declarator then looks like the following:
operator conversion-type-id
conversion-type-id
type-specifier-seq conversion-declarator(opt)
However, [class.conv.fct] says:
A decl-specifier in the decl-specifier-seq of a conversion function (if any) shall be neither a defining-type-specifier nor static.
This means that a decl-specifier in the decl-specifier-seq(opt) can be neither a defining-type-specifier nor static.
But the type-specifier-seq of a conversion-type-id must contain a defining-type-specifier:
operator Type(){ // Type is a defining-type-specifier (more exactly, a type-specifier)
return Type{};
}
and you wouldn't define a conversion function like this:
operator (){ // there's no defining-type-specifier in type-specifier-seq
//...
}
[dcl.fct#11]
Types shall not be defined in return or parameter types.
This restricts which kinds of defining-type-specifier may appear in a function declaration, because a defining-type-specifier is one of:
type-specifier
class-specifier
enum-specifier
A class-specifier or enum-specifier can't be used in the decl-specifier-seq of a function declaration because of the last quote; only a type-specifier is permitted.
So, as far as I can tell, the standard deliberately uses a broader term than type-specifier, namely defining-type-specifier, because you can indeed declare a variable like struct A{} variable;, and a class-specifier is not a type-specifier. Hence, as a general rule, the standard uses the wording defining-type-specifier to cover such cases.
So why is a conversion function an exception in the first rule? If there are any misunderstandings in the above analysis, please correct me.
Questions:
Why is a conversion function an exception in the first quote?
If a conversion function is an exception, why aren't other functions?
I agree - the paragraph [dcl.type]/3 should instead say something like:
Except in a declaration of a constructor, destructor, or conversion function, or in a lambda-declarator, a complete decl-specifier-seq shall contain at least one defining-type-specifier that is not a cv-qualifier. A complete type-specifier-seq shall contain at least one type-specifier that is not a cv-qualifier.
You're correct that:
defining-type-specifier parses a wider set of input token sequences than type-specifier.
decl-specifier parses an even wider set of input token sequences than defining-type-specifier.
const and volatile are valid in parsing any of the three.
The syntaxes including class ClassName {...}; and enum EnumName {...}; are valid in a defining-type-specifier or a decl-specifier but not in a type-specifier.
The C++ grammar uses type-specifier-seq and decl-specifier-seq in many places where the name of a type is expected (plus a few where they do not name a type). The quoted paragraph [dcl.type]/3 requiring "at least one defining-type-specifier that is not a cv-qualifier" in these sequences is mainly saying that in all of those contexts, variations which don't name a type at all are not permitted: you can't say auto v1 = new const; or static constexpr typedef f();. Most individual uses have additional restrictions on what sorts of type-specifier can and cannot appear there, but those rules are in addition to this basic one. In particular, many of them don't allow defining types within the specifier sequence. But since decl-specifier-seq is used within simple-declaration as the ordinary way to define classes and enumerations, this rule is not the place for that restriction.
The reason for excluding constructors, destructors, and conversion functions is that they might not have a decl-specifier-seq at all, or might use a decl-specifier-seq which does not contain any defining-type-specifier. For example, in
class MyClass {
public:
explicit MyClass(int);
};
the constructor declaration has a decl-specifier-seq whose only specifier is the explicit keyword, which is not a defining-type-specifier.
(However, in looking through, I found another context where the rule should NOT apply: A lambda expression allows an optional decl-specifier-seq after its parameter list, where the only permitted specifiers are mutable and constexpr; neither is a defining-type-specifier.)
I'm guessing this paragraph version came along with or following a change in the grammar between C++14 and C++17. The initial decl-specifier-seq in a simple-declaration was changed from optional to required. A new grammar symbol nodeclspec-function-declaration was added to cover cases of friend declarations and template-related declarations which declare constructors, destructors, or conversion functions with no initial specifiers and without defining them. Other declarations of constructors, destructors, and conversion functions are actually covered by either function-definition or member-declaration, which still use an optional decl-specifier-seq, so the changes to simple-declaration didn't affect them.
For conversion functions, the text in [class.conv.fct]/1 saying
A decl-specifier in the decl-specifier-seq of a conversion function (if any) shall be neither a defining-type-specifier nor static.
forms the actual requirement: The [dcl.type] sentence excludes a conversion function's decl-specifier-seq from its usual requirement, so it doesn't say anything about what is and isn't legal. This [class.conv.fct] sentence gives the actual rule for this case.
A conversion function may be declared:
by a function-definition (if it has a body, including =default; or =delete;)
by a member-declaration (inside a class definition, if the declaration does not have a body)
by a simple-declaration or nodeclspec-function-declaration (if in a friend declaration, explicit specialization, or explicit instantiation)
A nodeclspec-function-declaration allows no initial specifiers, but the other three symbols all have a rule in which a decl-specifier-seq (either required or optional) is followed by either a declarator or an init-declarator-list. As you noted, the declarator for a conversion function contains the operator keyword followed by a type-specifier-seq. The declarator must also contain () or (void) or equivalent so that it declares a function with no arguments.
With a few more assumptions, the general form of a conversion function declaration is either
attribute-specifier-seq(opt) decl-specifier-seq(opt) operator type-specifier-seq conversion-declarator(opt) attribute-specifier-seq(opt) parameters-and-qualifiers virt-specifier-seq(opt) pure-specifier(opt) ;
or
attribute-specifier-seq(opt) decl-specifier-seq(opt) operator type-specifier-seq conversion-declarator(opt) attribute-specifier-seq(opt) parameters-and-qualifiers virt-specifier-seq(opt) function-body
So there's an optional decl-specifier-seq before the operator keyword and a required type-specifier-seq after it. It's the decl-specifier-seq which might not be present at all, and which must not contain a defining-type-specifier (because you don't put a type before the operator keyword) or static (because a conversion function must always be a non-static member). But the decl-specifier-seq may contain constexpr, inline, virtual, or explicit, or combinations of those.
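A minimal sketch of the two situations described above: a conversion function whose decl-specifier-seq holds only non-type specifiers, and one with no decl-specifier-seq at all. The types Celsius and Flag are hypothetical and exist only for illustration.

```cpp
#include <cassert>

struct Celsius {
    double degrees;
    // decl-specifier-seq before 'operator': "explicit constexpr"
    // (no defining-type-specifier, no static).
    // type-specifier-seq after 'operator': "double".
    explicit constexpr operator double() const { return degrees; }
};

struct Flag {
    bool set;
    // No decl-specifier-seq at all; only the required
    // type-specifier-seq "bool" after 'operator'.
    operator bool() const;
};

// Out-of-class definition, again without any leading specifiers.
Flag::operator bool() const { return set; }

static_assert(static_cast<double>(Celsius{21.5}) == 21.5,
              "the constexpr conversion is usable in a constant expression");
```

In both declarations the target type appears only after the operator keyword, which is why the specifiers before it never need to name a type.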
The trouble you've noticed is that the wording of [dcl.type]/3 also means it technically doesn't apply to the type-specifier-seq in such a declaration which names the target type for the conversion. ([dcl.pre]/4 clears up many similar statements about grammar symbols in a declaration, but doesn't apply to this case since there's no intervening scope involved.) We could still infer that a defining-type-specifier is needed from phrases in the Standard like "the type specified by the conversion-type-id". But it would be better if the rule in [dcl.type]/3 applied to this type-specifier-seq like it does to most of them.
Saying that a function declaration must have a defining-type-specifier simply means that it must have the form:
Type f();
// ^^^^ defining-type-specifier (in this case, a type-specifier)
// this must be an existing type
and can't be of the form:
f(); // error, no defining-type-specifier
The quoted rule from dcl.fct:
Types shall not be defined in return or parameter types.
has nothing to do with defining-type-specifiers (despite the similar terminology). It simply means that you can't define a type in the declaration of a function.
struct A{} f(); // error, can't define a type in return
void f(struct A{}); // error, can't define a type in parameter
so this is not in conflict with the exceptions quoted at the beginning of your question.
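For contrast with the two rejected declarations, here is a sketch of the forms that are allowed: a class-specifier in a variable declaration, and functions that merely name an existing type. Point, origin, make_point and sum are hypothetical names.

```cpp
#include <cassert>

// A class-specifier (a type definition) is fine in the decl-specifier-seq
// of a variable declaration: this defines Point and the variable origin.
struct Point { int x; int y; } origin = {0, 0};

// In function declarations the type is only named, not defined:
// 'Point' here is a type-specifier referring to the existing type.
Point make_point(int x, int y) { return Point{x, y}; }
int sum(Point p) { return p.x + p.y; }
```

The function declarations still satisfy the requirement for a defining-type-specifier, because a plain type name is one; they just do not define a new type in the return or parameter position.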

Type-id ambiguity in trailing return type

Quote from the Standard:
The type-id in a trailing-return-type includes the longest possible sequence of abstract-declarators.
Note: This resolves the ambiguous binding of array and function declarators.
Example:
auto f()->int(*)[4]; // function returning a pointer to array[4] of int
// not function returning array[4] of pointer to int
I'm wondering, what is ambiguous with this given code?
Ambiguities that are mentioned in the Standard are usually due to the ambiguity of the grammar itself, while in this case given character sequence is always interpreted as a trailing return type (i.e. a single type), and there shouldn't be ambiguity on which one it is.
By the way, why is type-id referred to? I mean, formally it can only occur in different places of code than trailing-return-type. Or is it just informally mentioned since everything that can be parsed (by itself) as trailing-return-type can be parsed as type-id (by itself) as well? I'm just not that well-aware of when nonterminals are used in the Standard...
In N2541, a trailing-return-type could appear in any function declarator. That meant auto (*f() -> int); was a valid declaration. At that time auto f()->int(*)[4]; could be interpreted as having the same meaning as auto (f()->int(*))[4]; (declare f as function returning array[4] of pointer to int). It could also be interpreted in the way we normally expect, i.e. [4] is a part of the trailing-return-type and the declaration declares a function returning a pointer to array[4] of int. The quoted paragraph was added at that time to resolve this ambiguity.
After N2541 was voted into the standard, CWG 681 changed the grammar to ensure that a trailing-return-type may only appear in the top level function declarator; therefore [4] has to be a part of the trailing-return-type. There is no longer any ambiguity. However, CWG 681 did not remove the disambiguation rule, which seems to be an oversight.
That oversight was recently corrected by CWG 2040, which removed the now-useless paragraph.
N2541 also allows a type-id to appear after the symbol ->, which brings an ambiguity in declarations like auto f() -> struct S { };. This is CWG 770 and is resolved by N2927, which defines trailing-return-type as the symbol -> followed by a trailing-type-specifier-seq followed by an optional abstract-declarator. However N2927 does not modify the disambiguation rule, even though it makes no sense now that there's no type-id in a trailing-return-type.
A type-id is a type-specifier-seq followed by an optional abstract-declarator. At that time a class type or enumeration type definition might appear in a type-specifier-seq but not in a trailing-type-specifier-seq. This made -> struct S {} an invalid trailing-return-type, though struct S {} was a valid type-id.
The grammar was changed again lately by CWG 2141, which renames type-specifier-seq to defining-type-specifier-seq and renames trailing-type-specifier-seq to type-specifier-seq. A type-id is still defined as a type-specifier-seq followed by an optional abstract-declarator. The end result is that a class type or enumeration type definition now may not appear in a type-id. Now again, any type-id may appear after the symbol -> in a trailing-return-type.
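The interpretation that remains under the current grammar can be confirmed with a short sketch: the [4] binds inside the trailing-return-type, so f returns a pointer to an array. The names storage and PointerToArray are introduced here for illustration.

```cpp
#include <cassert>
#include <type_traits>

int storage[4] = {10, 20, 30, 40};  // hypothetical backing array

// Same declarator as in the Standard's example, here with a body.
auto f() -> int (*)[4] { return &storage; }

using PointerToArray = int (*)[4];

// f is a function returning pointer to array[4] of int ...
static_assert(std::is_same<decltype(f), PointerToArray()>::value,
              "f returns a pointer to array[4] of int");
// ... so dereferencing the result yields the array itself.
static_assert(std::is_same<decltype(*f()), int (&)[4]>::value,
              "*f() is an lvalue of type int[4]");
```

The other reading, a function returning an array of pointers, would not be a valid function type in any case, since functions cannot return arrays.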