Function default argument value depending on argument name in C++ [duplicate]

If one defines a new variable in C++, then the name of the variable can be used in the initialization expression, for example:
int x = sizeof(x);
And what about default value of a function argument? Is it allowed there to reference the argument by its name? For example:
void f(int y = sizeof(y)) {}
This function is accepted in Clang, but rejected in GCC with the error:
'y' was not declared in this scope
Demo: https://gcc.godbolt.org/z/YsvYnhjTb
Which compiler is right here?

According to the C++17 standard (11.3.6 Default arguments)
9 A default argument is evaluated each time the function is called
with no argument for the corresponding parameter. A parameter shall
not appear as a potentially-evaluated expression in a default
argument. Parameters of a function declared before a default
argument are in scope and can hide namespace and class member names.
It provides the following example:
int h(int a, int b = sizeof(a)); // OK, unevaluated operand
So, this function declaration
void f(int y = sizeof(y)) {}
is correct because, in this expression sizeof(y), y is not an evaluated operand, based on C++17 8.3.3 Sizeof:
1 The sizeof operator yields the number of bytes in the object
representation of its operand. The operand is either an expression,
which is an unevaluated operand (Clause 8), or a parenthesized
type-id.
and C++17 6.3.2 Point of declaration:
1 The point of declaration for a name is immediately after its
complete declarator (Clause 11) and before its initializer (if any),
except as noted below.

The code does not appear to be ill-formed, so Clang is right.
[basic.scope.pdecl]
1 The point of declaration for a name is immediately after its complete declarator ([dcl.decl]) and before its initializer (if any), except as noted below.
This is the notorious passage that is under discussion. I bring it here just to mention that "except as noted below" doesn't include any mention of default arguments. So y is declared right before = sizeof(y).
The other relevant paragraph is
[dcl.fct.default]
9 A default argument is evaluated each time the function is called with no argument for the corresponding parameter. A parameter shall not appear as a potentially-evaluated expression in a default argument. Parameters of a function declared before a default argument are in scope and can hide namespace and class member names.
sizeof(y) is not potentially evaluated, so this is also fine.
Seeing as the first paragraph makes y available as a name, and it is used in a way that is not illegal, it must be some quirk of GCC that rejects the code.
Though personally, I don't see it as a great loss. This is not the most practical bit of code.
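For code that has to build with both compilers, here is a minimal sketch that stays within the standard's own example: the default argument names an earlier parameter, and only inside sizeof, which is an unevaluated operand. The names size_of_first and self_sized are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: the default argument refers to the earlier parameter
// `a` only inside sizeof (an unevaluated operand), mirroring the standard's
// example `int h(int a, int b = sizeof(a));`.
std::size_t size_of_first(char a, std::size_t b = sizeof(a)) {
    (void)a;  // `a` is only needed for its type here
    return b;
}

// A variable may name itself in its own initializer in the same way.
std::size_t self_sized = sizeof(self_sized);
```

Calling size_of_first('x') picks up the default, while size_of_first('x', 7) overrides it.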

Is a pointer to function (sometimes/always?) a function declarator?

(This question has been broken out from the discussion to this answer, which highlights CWG 1892)
Some paragraphs of the standard apply specific rules to function declarators; e.g. [dcl.spec.auto]/3, regarding placeholder types [emphasis mine]:
The placeholder type can appear with a function declarator in the decl-specifier-seq, type-specifier-seq, conversion-function-id, or trailing-return-type, in any context where such a declarator is valid. If the function declarator includes a trailing-return-type ([dcl.fct]), that trailing-return-type specifies the declared return type of the function. Otherwise, the function declarator shall declare a function. [...]
restricts where placeholder types may appear with(in) a function declarator. We may study the following example:
int f() { return 0; }
auto (*g)() = f; // #1
which both GCC and Clang accept, deducing the type of g as int(*)().
Is a pointer to function (sometimes/always?) a function declarator?
Or, alternatively, applied to the example, should #1 be rejected as per [dcl.spec.auto]/3, or does the latter not apply here as a pointer to function is not a function declarator (instead allowing #1 as per [dcl.spec.auto]/4 regarding variable type deduction from initializer)?
The rules for what a given declarator is are not entirely easy to follow, but we may note, from [dcl.decl]/1
A declarator declares a single variable, function, or type, within a declaration.
that a given declarator is one of a variable declarator, a function declarator or a type declarator.
[dcl.ptr] covers (variable) declarators that are pointers, but does not explicitly (/normatively) mention pointers to functions, although it does so non-normatively in [dcl.ptr]/4.
[dcl.fct] covers function declarators but does not mention function pointers as part of function declarations, other than a note that function types are checked during assignment/initialization to function pointers (which is not relevant for what a function declarator is).
My interpretation is that #1 is legal (as per the current standard), as it falls under a variable declarator. If this is actually correct, then the extended question (from the linked thread) is whether
template<auto (*g)()>
int f() { return g(); }
is legal or not (/intended to be legal or not as per CWG 1892); as the template parameter arguably contains a declarator that is a function pointer declarator, and not a function declarator.
Finally, we may note, as similarly pointed out in the linked answer, that
template<auto g()> // #2
int f() { return g(); }
is arguably ill-formed (although this example is also accepted by both GCC and Clang), as the non-type template parameter at #2 is a function declarator and is thus used in an illegal context as per [dcl.spec.auto]/3, as it does not contain a trailing return type and does not declare a function.
The confusion here arises from two different meanings of "declarator": one is the portion of a declaration (after the specifiers) that pertains to one entity (or typedef-name), while the other is any of the several syntactic constructs used to form the former kind. The latter meaning gives rise to the grammar productions ptr-declarator (which also covers references) and noptr-declarator (which includes functions and arrays). That meaning is also necessary to give any meaning to a restriction that a "function declarator shall declare a function". Moreover, if we took the variable declaration
auto (*g)() = /*…*/;
to not involve a "function declarator" for the purposes of [dcl.spec.auto.general]/3, we would not be able to write
auto (*g)() -> int;
which is universally accepted (just as is the similar example in the question).
Moreover, while the statement that checks whether "the function declarator includes a trailing-return-type" inevitably refers to an overall declarator (which is what supports a trailing-return-type), it does so in its capacity as a "declaration operator" because it still allows the above cases with nested use of such operators. (What that limitation forbids is just
auto *f() -> int*;
where deduction would work but isn't performed at all here because it would always be useless.)
Meanwhile, there is some evidence, beyond implementation consensus, that the answer to the higher-level question is that auto in these cases should be allowed: [dcl.spec.auto.general]/1 says that auto in a function parameter serves to declare a generic lambda or abbreviated function template "if it is not the auto type-specifier introducing a trailing-return-type" rather than if it is not used with a function declarator at all.
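The two variable declarations discussed above can be checked directly. This is only a sketch of what current implementations accept, not a ruling on the wording; the name h is added here for the trailing-return-type variant.

```cpp
#include <type_traits>

int f() { return 0; }

// Placeholder return type deduced from the initializer (#1 in the question).
auto (*g)() = f;

// Same declarator shape, but with an explicit trailing return type.
auto (*h)() -> int = f;

static_assert(std::is_same<decltype(g), int (*)()>::value, "g is int(*)()");
static_assert(std::is_same<decltype(h), int (*)()>::value, "h is int(*)()");
```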

How come you need initialisers for all variables when using auto in multiple declarations?

I would've expected an initialiser would only be necessary for the first declaration. e.g.
auto x = 2, y;
I would expect this to deduce x's type as int and then implicitly replace "auto" with the base type "int", meaning y would then be a default-initialised integer. In fact the entire thing doesn't compile, because y explicitly needs an initialiser. Similarly, it's odd to me that
auto x = 2, y = 3.3;
causes an error too. I would've expected y to be initialised to 3 in a double-to-int conversion, but:
error: inconsistent deduction for 'auto': 'int' and then 'double'
I read through http://en.cppreference.com/w/cpp/language/auto and couldn't explicitly find an explanation. Actually it seemed like that link was on my side:
Once the type of the initializer has been determined, the compiler determines the type that will replace the keyword auto using the rules for template argument deduction from a function call (see template argument deduction#Other contexts for details).
Is it simply "just cause"?
Is it simply "just cause"?
Yes.
Both variables have a deduced type, and both variables thus need an initialiser. The logic that requires both to have the same type is applied post-deduction.
[C++11: 7.1.6.4/7]: If the list of declarators contains more than one declarator, the type of each declared variable is determined as described above. If the type deduced for the template parameter U is not the same in each deduction, the program is ill-formed.
[C++14: 7.1.6.4/8]: If the init-declarator-list contains more than one init-declarator, they shall all form declarations of variables. The type of each declared variable is determined as described above, and if the type that replaces the placeholder type is not the same in each deduction, the program is ill-formed.
Call it a C++ oddity, but I imagine it's there to help keep the standard wording simple. After all, wouldn't it be a little confusing (and by that I mean more confusing/unclear than auto already is) if your example worked as you describe?
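A small sketch of what the quoted rule does allow: every declarator carries its own initialiser, and each deduction must arrive at the same type for auto. The variable names are illustrative only.

```cpp
#include <type_traits>

// Every declarator has an initialiser and every deduction yields int.
auto x = 2, y = 3;

// Deduction is done per declarator; the pointer and reference operators are
// applied after the placeholder itself has been deduced as int each time.
auto a = 1, *p = &a, &r = a;

static_assert(std::is_same<decltype(y), int>::value, "y is int");
static_assert(std::is_same<decltype(p), int*>::value, "p is int*");
static_assert(std::is_same<decltype(r), int&>::value, "r is int&");

// auto u = 2, v;        // ill-formed: v has no initialiser
// auto s = 2, t = 3.3;  // ill-formed: 'int' and then 'double'
```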

When do extra parentheses have an effect, other than on operator precedence?

Parentheses in C++ are used in many places: e.g. in function calls and grouping expressions to override operator precedence. Apart from illegal extra parentheses (such as around function call argument lists), a general, but not absolute, rule of C++ is that extra parentheses never hurt:
5.1 Primary expressions [expr.prim]
5.1.1 General [expr.prim.general]
6 A parenthesized expression is a primary expression whose type and
value are identical to those of the enclosed expression. The presence
of parentheses does not affect whether the expression is an lvalue.
The parenthesized expression can be used in exactly the same contexts
as those where the enclosed expression can be used, and with the same
meaning, except as otherwise indicated.
Question: in which contexts do extra parentheses change the meaning of a C++ program, other than overriding basic operator precedence?
NOTE: I consider the restriction of pointer-to-member syntax to &qualified-id without parentheses to be outside the scope because it restricts syntax rather than allowing two syntaxes with different meanings. Similarly, the use of parentheses inside preprocessor macro definitions also guards against unwanted operator precedence.
TL;DR
Extra parentheses change the meaning of a C++ program in the following contexts:
preventing argument-dependent name lookup
enabling the comma operator in list contexts
ambiguity resolution of vexing parses
deducing referenceness in decltype expressions
preventing preprocessor macro errors
Preventing argument-dependent name lookup
As is detailed in Annex A of the Standard, a post-fix expression of the form (expression) is a primary expression, but not an id-expression, and therefore not an unqualified-id. This means that argument-dependent name lookup is prevented in function calls of the form (fun)(arg) compared to the conventional form fun(arg).
3.4.2 Argument-dependent name lookup [basic.lookup.argdep]
1 When the postfix-expression in a function call (5.2.2) is an
unqualified-id, other namespaces not considered during the usual
unqualified lookup (3.4.1) may be searched, and in those namespaces,
namespace-scope friend function or function template declarations
(11.3) not otherwise visible may be found. These modifications to the
search depend on the types of the arguments (and for template template
arguments, the namespace of the template argument). [ Example:
namespace N {
struct S { };
void f(S);
}
void g() {
N::S s;
f(s); // OK: calls N::f
(f)(s); // error: N::f not considered; parentheses
// prevent argument-dependent lookup
}
—end example ]
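The standard's example fails to compile at the parenthesized call. To observe the difference at run time instead, here is a sketch that adds a hypothetical global fallback template f, so that both call forms compile but select different functions.

```cpp
namespace N {
struct S {};
constexpr int f(S) { return 1; }  // reachable from outside N only through ADL
}  // namespace N

// Hypothetical global fallback so that the parenthesized call still compiles.
template <class T>
constexpr int f(T) { return 2; }

// Unparenthesized: ADL adds N::f, and the non-template is the better match.
static_assert(f(N::S{}) == 1, "ADL finds N::f");

// Parenthesized: (f) is not an unqualified-id, so only ::f is considered.
static_assert((f)(N::S{}) == 2, "ADL suppressed");
```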
Enabling the comma operator in list contexts
The comma operator has a special meaning in most list-like contexts (function and template arguments, initializer lists etc.). Parentheses of the form a, (b, c), d in such contexts can enable the comma operator compared to the regular form a, b, c, d where the comma operator does not apply.
5.18 Comma operator [expr.comma]
2 In contexts where comma is given a special meaning, [ Example: in
lists of arguments to functions (5.2.2) and lists of initializers
(8.5) —end example ] the comma operator as described in Clause 5 can
appear only in parentheses. [ Example:
f(a, (t=3, t+2), c);
has three arguments, the second of which has the value 5. —end example
]
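The standard's f(a, (t=3, t+2), c) example can be made concrete as follows. The helper middle and the wrapper comma_demo are hypothetical names.

```cpp
// Hypothetical helper that simply returns its middle argument.
int middle(int, int b, int) { return b; }

int comma_demo() {
    int t = 0;
    // Three arguments: the parenthesized (t = 3, t + 2) is a single comma
    // expression that first assigns 3 to t and then yields t + 2.
    return middle(1, (t = 3, t + 2), 4);
}
```

Without the inner parentheses the same tokens would form a four-argument call, which middle does not accept.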
Ambiguity resolution of vexing parses
Backward compatibility with C and its arcane function declaration syntax can lead to surprising parsing ambiguities, known as vexing parses. Essentially, anything that can be parsed as a declaration will be parsed as one, even though a competing parse would also apply.
6.8 Ambiguity resolution [stmt.ambig]
1 There is an ambiguity in the grammar involving expression-statements
and declarations: An expression-statement with a function-style
explicit type conversion (5.2.3) as its leftmost subexpression can be
indistinguishable from a declaration where the first declarator starts
with a (. In those cases the statement is a declaration.
8.2 Ambiguity resolution [dcl.ambig.res]
1 The ambiguity arising from the similarity between a function-style
cast and a declaration mentioned in 6.8 can also occur in the context
of a declaration. In that context, the choice is between a function
declaration with a redundant set of parentheses around a parameter
name and an object declaration with a function-style cast as the
initializer. Just as for the ambiguities mentioned in 6.8, the
resolution is to consider any construct that could possibly be a
declaration a declaration. [ Note: A declaration can be explicitly
disambiguated by a nonfunction-style cast, by an = to indicate
initialization or by removing the redundant parentheses around the
parameter name. —end note ] [ Example:
struct S {
S(int);
};
void foo(double a) {
S w(int(a)); // function declaration
S x(int()); // function declaration
S y((int)a); // object declaration
S z = int(a); // object declaration
}
—end example ]
A famous example of this is the Most Vexing Parse, a name popularized by Scott Meyers in Item 6 of his Effective STL book:
ifstream dataFile("ints.dat");
list<int> data(istream_iterator<int>(dataFile), // warning! this doesn't do
istream_iterator<int>()); // what you think it does
This declares a function, data, whose return type is list<int>. The
function data takes two parameters:
The first parameter is named dataFile. Its type is istream_iterator<int>. The
parentheses around dataFile are superfluous and are ignored.
The second parameter has no name. Its type is pointer to function taking
nothing and returning an istream_iterator<int>.
Placing extra parentheses around the first function argument (parentheses around the second argument are illegal) will resolve the ambiguity
list<int> data((istream_iterator<int>(dataFile)), // note new parens
istream_iterator<int>()); // around first argument
// to list's constructor
C++11 has brace-initializer syntax that makes it possible to side-step such parsing problems in many contexts.
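A self-contained sketch of the parenthesized fix, using a string stream in place of the file from the book example. The function name read_ints is made up for illustration.

```cpp
#include <iterator>
#include <list>
#include <sstream>
#include <string>

// Hypothetical helper: reads whitespace-separated ints from a string.
std::list<int> read_ints(const std::string& text) {
    std::istringstream in(text);
    // The extra parentheses around the first argument mean it cannot be
    // parsed as a parameter declaration, so `data` is an object, not a
    // function declaration.
    std::list<int> data((std::istream_iterator<int>(in)),
                        std::istream_iterator<int>());
    return data;
}
```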
Deducing referenceness in decltype expressions
In contrast to auto type deduction, decltype allows referenceness (lvalue and rvalue references) to be deduced. The rules distinguish between decltype(e) and decltype((e)) expressions:
7.1.6.2 Simple type specifiers [dcl.type.simple]
4 For an expression e, the type denoted by decltype(e) is defined as
follows:
— if e is an unparenthesized id-expression or an
unparenthesized class member access (5.2.5), decltype(e) is the type
of the entity named by e. If there is no such entity, or if e names a
set of overloaded functions, the program is ill-formed;
— otherwise,
if e is an xvalue, decltype(e) is T&&, where T is the type of e;
—
otherwise, if e is an lvalue, decltype(e) is T&, where T is the type
of e;
— otherwise, decltype(e) is the type of e.
The operand of the
decltype specifier is an unevaluated operand (Clause 5). [ Example:
const int&& foo();
int i;
struct A { double x; };
const A* a = new A();
decltype(foo()) x1 = 0; // type is const int&&
decltype(i) x2; // type is int
decltype(a->x) x3; // type is double
decltype((a->x)) x4 = x3; // type is const double&
—end example ] [ Note: The rules for determining types involving
decltype(auto) are specified in 7.1.6.4. —end note ]
The rules for decltype(auto) have a similar meaning for extra parentheses in the RHS of the initializing expression. Here's an example from the C++FAQ and this related Q&A
decltype(auto) look_up_a_string_1() { auto str = lookup1(); return str; } //A
decltype(auto) look_up_a_string_2() { auto str = lookup1(); return(str); } //B
The first returns string, the second returns string &, which is a reference to the local variable str.
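Both effects can be verified with static_assert. The sketch below assumes C++14 or later for decltype(auto), and uses a hypothetical global named counter so that the reference-returning variant does not dangle the way the string example does.

```cpp
#include <type_traits>

int counter = 0;  // hypothetical global, so the returned reference stays valid

// Unparenthesized id-expression: the declared type of the entity.
static_assert(std::is_same<decltype(counter), int>::value, "int");
// Parenthesized: an lvalue expression of type int, hence int&.
static_assert(std::is_same<decltype((counter)), int&>::value, "int&");

decltype(auto) get_copy() { return counter; }    // return type is int
decltype(auto) get_ref()  { return (counter); }  // return type is int&

static_assert(std::is_same<decltype(get_copy()), int>::value, "by value");
static_assert(std::is_same<decltype(get_ref()), int&>::value, "by reference");
```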
Preventing preprocessor macro related errors
There is a host of subtleties with preprocessor macros in their interaction with the C++ language proper, the most common of which are listed below
using parentheses around macro parameters inside the macro definition #define TIMES(A, B) (A) * (B); in order to avoid unwanted operator precedence (e.g. in TIMES(1 + 2, 2 + 1), which yields 9 but would yield 6 without the parentheses around (A) and (B))
using parentheses around macro arguments having commas inside: assert((std::is_same<int, int>::value)); which would otherwise not compile
using parentheses around a function to protect against macro expansion in included headers: (min)(a, b) (with the unwanted side effect of also disabling ADL)
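The first and third points can be sketched like this. The macro names TIMES_UNSAFE and TIMES_SAFE and the function smaller are hypothetical, and the safe variant also wraps the whole expansion in parentheses, which goes slightly beyond the definition quoted above.

```cpp
#include <algorithm>

// Hypothetical macro names, for comparison only.
#define TIMES_UNSAFE(A, B) A * B
#define TIMES_SAFE(A, B) ((A) * (B))

// Expands to 1 + 2 * 2 + 1, so precedence gives 6.
static_assert(TIMES_UNSAFE(1 + 2, 2 + 1) == 6, "unparenthesized parameters");
// Expands to ((1 + 2) * (2 + 1)), which is 9.
static_assert(TIMES_SAFE(1 + 2, 2 + 1) == 9, "parenthesized parameters");

// (std::min) is not followed directly by '(', so a function-like macro
// named min from some included header would not be expanded here.
int smaller(int a, int b) { return (std::min)(a, b); }
```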
In general, in programming languages, "extra" parentheses imply that they are not changing the syntactical parsing order or meaning. They are added to clarify the order (operator precedence) for the benefit of people reading the code, and their only effects would be to slightly slow the compile process and to reduce human errors in understanding the code (probably speeding up the overall development process).
If a set of parentheses actually changes the way an expression is parsed, then they are by definition not extra. Parentheses that turn an illegal/invalid parse into a legal one are not "extra", although that may point out a poor language design.

C++11 initializer with ambiguous function id-expression?

In the following C++11 code:
void f(int) {}
void f(double) {}
void (*p)(int) = f;
There are two functions.
The third f identifier is an id-expression and the initializer of p.
5.1.1 [expr.prim.general]/8 says:
The type of the [id-expression] is the type of the identifier.
The result is the entity denoted by the identifier. The result is an lvalue if the entity is a function, variable, or data member and a prvalue otherwise.
Given that f could be referring to two different entities with two different types, there is no "the entity" or "the type".
Is there some other text in the standard that addresses this situation?
Do implementations just disambiguate this as an extension or is it required somewhere? (Without some other text one could argue that an implementation could reject the f id-expression as ambiguous.)
The standard (at § 13.4) defines that:
A use of an overloaded function name without arguments is resolved in
certain contexts to a function, a pointer to function or a pointer to
member function for a specific function from the overload set. A
function template name is considered to name a set of overloaded
functions in such contexts. The function selected is the one whose
type is identical to the function type of the target type required in
the context.
Emphasis mine.
After the quote, there is an example (at § 13.4/5) that resembles yours:
int f(double);
int f(int);
int (*pfd)(double) = &f; // selects f(double)
int (*pfi)(int) = &f; // selects f(int)
As far as the unary & is concerned, the standard specifies that (at § 5.3.1/6 and thanks to jogojapan):
The address of an overloaded function can be taken only in a context
that uniquely determines which version of the overloaded function is
referred to.
but can also be omitted (at § 13.4/1):
The overloaded function name can be preceded by the & operator.
(again, emphasis mine) just like you did, in your example.
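A runnable variant of the same idea, with the overloads returning distinct values so that the selection is observable. The names pfi, pfd and pfc are illustrative, and the static_cast line is an additional context that supplies a target type.

```cpp
int f(int) { return 1; }
int f(double) { return 2; }

// The target type of each pointer selects one overload; & is optional.
int (*pfi)(int) = f;      // selects f(int)
int (*pfd)(double) = &f;  // selects f(double)

// Where no target type is available (e.g. with auto), a cast can supply one.
auto pfc = static_cast<int (*)(double)>(f);
```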

Can a (C/C++) array initialization reference itself? [duplicate]

I was wondering about an initialization of the following form:
int array[] = {
v - 1,
array[0] + 1
} ;
In the initialization of the second element, the value of the first is used, but the entire array is not yet initialized. This happens to compile with g++, but I was unsure whether this is actually a portable and well-defined construct.
See 3.3.2 Point of declaration:
The point of declaration for a name is immediately after its complete declarator (Clause 8) and before its
initializer (if any), except as noted below. [ Example:
int x = 12;
{ int x = x; }
Here the second x is initialized with its own (indeterminate) value. —end example ]
So you are referring to the array correctly; its name is known after the =.
Then, 8.5.1 Aggregates:
An aggregate is an array or a class [...]
17: The full-expressions in an initializer-clause are evaluated in the order in which they appear.
However, I see no reference to when the evaluated values are actually written into the array, so I wouldn't rely on this and would even go so far as to declare your code not well defined.
As far as I can see, this is not well defined. The standard (C++11, 8.5.1/17) specifies that "The full-expressions in an initializer-clause are evaluated in the order in which they appear", but I can't see anything that requires each aggregate element to be initialised from the result of its initializer-clause before the next is evaluated.
Can a (C/C++) array initialization reference itself?
This is also valid C code.
C has a corresponding paragraph (emphasis mine).
(C99, 6.2.1p7) "Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. Each enumeration constant has scope that begins just after the appearance of its defining enumerator in an enumerator list. Any other identifier has scope that begins just after the completion of its declarator."
I think this is handled by http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1343 . Initially my report was only about non-class initializers for namespace scope objects (see When exactly is an initializer temporary destroyed?), but the problem exists for aggregate elements just as well if they are non-class. And as the additional recent note explains, it even seems to exist for the entire aggregate initialization as well, even if it is a class object, because then no constructor call happens that would enlarge the full-expression of the initializer.
If you had used a class instead of int, and the initialization were a constructor call, then that constructor call would be part of the same full-expression that encloses the aggregate initializer element, so that here the order would be OK and your code would be well-defined.
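Given the doubt expressed in the answers above, one way to sidestep the question is to name the first value before the array is declared. This is only a sketch of that workaround; the function second_element and the local first are hypothetical.

```cpp
// Hypothetical helper: builds the same two values without reading `array`
// inside its own initializer.
int second_element(int v) {
    const int first = v - 1;           // computed before the array exists
    int array[] = {first, first + 1};  // no element refers back to `array`
    return array[1];
}
```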