A branch of an if-statement cannot be just a declaration? - c++

A branch of an if-statement cannot be just a declaration. If we need to introduce a name in a branch, it must be enclosed in a block. (TC++PL, 4th edition)
void f1(int i)
{
    if (i)
        int x = i + 2; // error: declaration of if-statement branch
}
But it compiles with both VS2013 and GCC 4.8.
And the working draft (N3242) says:
If the substatement in a selection-statement is a single statement and not a compound-statement, it is as if it was rewritten to be a compound-statement containing the original substatement.
So the code can be equivalently rewritten as:
void f1(int i)
{
    if (i) {
        int x = i + 2;
    }
}
Thus after the if statement, x is no longer in scope.
So what is the standard?

Syntactically, a declaration can be a statement. Specifically, it's a declaration-statement.
(I say "can be" because not all declarations are statements. For example, a declaration at file scope, where a statement cannot legally appear, is not a statement.)
Since the syntax of an if statement is:
if ( condition ) statement
or
if ( condition ) statement else statement
it's perfectly legal for either branch of an if statement to be a declaration.
In your example, it's not particularly useful, but I can easily imagine that you might want to declare an object for the purpose of executing its constructor and/or destructor.
A branch of an if-statement cannot be just a declaration. If we need
to introduce a name in a branch, it must be enclosed in a
block. (TC++PL, 4th edition)
I have a PDF copy of "The C++ Programming Language", 4th edition, 4th printing, dated April 2015. The statement in 9.3 that "A declaration is a statement." is still there. That statement is incorrect, or at least incomplete. The statement that:
A branch of an if-statement cannot be just a declaration. If we need
to introduce a name in a branch, it must be enclosed in a block.
is no longer there; it's been updated to:
The scope of a declaration of a branch of an if-statement is just that
branch. If we need to introduce a name in a branch, it must be
enclosed in a block (§9.2).
That's not strictly correct, but it's true that if you want a declaration in a branch, you need to enclose it in a block if you want to refer to it in a statement. A branch consisting of just a declaration is legal but usually not useful.
(Incidentally, C has different rules. C has permitted mixed declarations and statements since the 1999 standard, but it doesn't treat declarations as statements, so in C a declaration can't be a branch of an if statement.)

Related

When was the ability to declare a variable in the if statement introduced in C++?

In C++, we can declare variables directly in the if statement and its value is used as the condition, e.g.
if (SubClass *subObject = dynamic_cast<SubClass *>(baseObject)) {
// ...
}
For some reason, I always assumed that this was a "relatively" new feature, introduced in C++11 at the earliest, but when I tried to confirm this, I found no information about this, only that C++17 expands the syntax even further. When I tried compiling a minimal example with -std=c++98, it worked. So has this been a feature of C++ from the beginning?
The first formal C++ Standard was ISO/IEC 14882:1998 (a.k.a. C++98). A draft version of that Standard explicitly mentions the declaration of a variable inside the condition of an if statement:
6.4 Selection statements       [stmt.select]
…
3     A name
introduced by a declaration in a condition (either introduced by the
type-specifier-seq or the declarator of the condition) is in scope
from its point of declaration until the end of the substatements
controlled by the condition. If the name is re-declared in the
outermost block of a substatement controlled by the condition, the
declaration that re-declares the name is ill-formed. [Example:
if (int x = f()) {
int x; // ill-formed, redeclaration of x
} else {
int x; // ill-formed, redeclaration of x
}
—end example]
So, in terms of formal Standards: Yes, it's "been there since the beginning."

Is it undefined behavior to redeclare a variable within enclosed scope?

#include <iostream>
using namespace std;
int main() {
    int i = 0;
    if(true) {
        int i = 5;
        cout << i << '\n';
    }
    return 0;
}
I tried running the above code on Ideone to see if it's legal. The results perplex me:
We have a compilation error (1), (2)
Or this code prints 5 as expected (1)
Or it prints nothing (1), (2)
As you can see from my links, this same code behaves radically differently each time it is compiled on Ideone! This smells like undefined behavior (UB).
OK, C++ is known for its unintuitive behaviors, but (and I admit it's just my intuition) I wouldn't expect even C++ to make redeclaring a variable in an inner scope UB! I'd expect either shadowing or a mandatory compilation error.
Is my code really UB according to the C++ standard, or is it just a peculiarity of Ideone and/or gcc? If it's UB, is it UB because I redeclared i, or for some other reason I'm failing to notice now?
Is it UB to redeclare a variable within enclosed scope?
No, it is not.
The compiler error you are seeing is most likely caused by the fact that the outer i is declared but not used.
Otherwise, your code is just fine.
It works fine for me at https://ideone.com/AwVJqZ as well as my desktop.
There is no undefined behavior. The standard allows name hiding, which is covered in [basic.scope.hiding]:
A declaration of a name in a nested declarative region hides a declaration of the same name in an enclosing declarative region; see [basic.scope.declarative] and [basic.lookup.unqual].
and [basic.scope.declarative] says:
Every name is introduced in some portion of program text called a declarative region, which is the largest part of the program in which that name is valid, that is, in which that name may be used as an unqualified name to refer to the same entity.
In general, each particular name is valid only within some possibly discontiguous portion of program text called its scope.
To determine the scope of a declaration, it is sometimes convenient to refer to the potential scope of a declaration.
The scope of a declaration is the same as its potential scope unless the potential scope contains another declaration of the same name.
In that case, the potential scope of the declaration in the inner (contained) declarative region is excluded from the scope of the declaration in the outer (containing) declarative region.
and gives the following example:
[ Example: In
int j = 24;
int main() {
int i = j, j;
j = 42;
}
the identifier j is declared twice as a name (and used twice). The
declarative region of the first j includes the entire example. The
potential scope of the first j begins immediately after that j and
extends to the end of the program, but its (actual) scope excludes the
text between the , and the }. The declarative region of the second
declaration of j (the j immediately before the semicolon) includes all
the text between { and }, but its potential scope excludes the
declaration of i. The scope of the second declaration of j is the same
as its potential scope. — end example  ]
Why you see such variable results from Ideone, I don't know. It does not provide many knobs for figuring out what is going on. Wandbox is one of several alternatives that provide more knobs, and it does not exhibit the same variability for this case.

Is "int (x), 1;" an ambiguous statement?

void f(int x) {
int (x), 1;
}
Clang compiles it, GCC doesn't. Which compiler is correct?
IMO, the wording in [stmt.ambig] is clear enough on this:
An expression-statement with a function-style explicit type conversion as its leftmost subexpression can be indistinguishable from a declaration where the first declarator starts with a (. In those cases the statement is a declaration.
[Note: If the statement cannot syntactically be a declaration, there is no ambiguity, so this rule does not apply. The whole statement might need to be examined to determine whether this is the case.
The wording speaks of an entire (expression-)statement.
Your statement cannot be parsed as a declaration, because the lexeme 1 is grammatically not a declarator. There is no ambiguity: it might look ambiguous if we looked solely at int(x), but the standard quite explicitly denies that if some prefix of the statement parses as a declaration, the entire statement is considered a potential declaration.
In fact, core experts had a very similar discussion back in 2002 over core issue 340 (I highlighted the important bits). Here, again, we have a supposed declaration that contains an incompatible sub-construct.
Consider the following program:
struct Point {
Point(int){}
};
struct Lattice {
Lattice(Point, Point, int){}
};
int main(void) {
int a, b;
Lattice latt(Point(a), Point(b), 3); /* Line X */
}
The problem concerns the line marked /* Line X */, which is an ambiguous
declarations for either an object or a function. The clause that
governs this ambiguity is 8.2 [dcl.ambig.res] paragraph 1, and reads:
The ambiguity arising from the similarity between a function-style
cast and a declaration mentioned in 6.8 [stmt.ambig] [..]
Based on this clause there are two
possible interpretations of the declaration in line X:
The declaration of latt declares a function with a return value of the
type Lattice and taking three arguments. The type of the first two
arguments is Point and each of these arguments is followed by a
parameter name in redundant parentheses. The type of the third
argument can not be determined, because it is a literal. This will
result in a syntax error.
The declaration of latt declares an object,
because the other option (a function declaration) would result in a
syntax error. Note that the last sentence before the "[Note:" is not
much help, because both options are declarations.
Steve Adamczyk: a number of people replied to this posting on
comp.std.c++ saying that they did not see a problem.
The original
poster replied:
I can't do anything but agree with your argumentation. So there is
only one correct interpretation of clause 8.2 [dcl.ambig.res]
paragraph 1, but I have to say that with some rewording, the clause
can be made a lot clearer, like stating explicitly that the entire
declaration must be taken into account and that function declarations
are preferred over object declarations.
I would like to suggest the following as replacement for the current
clause 8.2 [dcl.ambig.res] paragraph 1:
The ambiguity arising from the similarity between a function-style cast
and a declaration mentioned in 6.8 [stmt.ambig] […]
The working group felt that the current wording is clear enough.

What does the standard mean by "a subsequent condition of that statement"?

The standard, as of N4567, forbids some kinds of redeclaration of a name previously declared in a condition (§3.3.3/4):
Names declared in the for-init-statement, the for-range-declaration, and in the condition of if, while, for, and switch statements are local to the if, while, for, or switch statement (including the controlled statement), and shall not be redeclared in a subsequent condition of that statement nor in the outermost block (or, for the if statement, any of the outermost blocks) of the controlled statement; see 6.4.
However, considering the fact that the following code compiles fine,
int main(void) {
if (int i=10)
if (int i=20)
;
return 0;
}
it seems unclear to me what exactly "a subsequent condition of that statement" stands for.
The highlighted "that" statement means the if, while, for, and switch statement that has defined the name, and not the substatement controlled by the condition or the iteration.
This is explained in:
6.4/3: A name introduced by a declaration in a condition (either introduced by the decl-specifier-seq or the declarator of the
condition) is in scope from its point of declaration until the end of
the substatements controlled by the condition. If the name is
re-declared in the outermost block of a substatement controlled by the
condition, the declaration that re-declares the name is ill-formed.
This is why the following statement is valid:
if (int i=10)
if (int i=20)
;
The compiler treats if (int i=20) not as a further condition of the same if-statement, but as the controlled substatement. And because the second declaration of i takes place in a condition, it is not considered to be in the outermost block of the controlled statement.
By contrast, the following almost equivalent statement is not valid, as it breaks the outermost-block constraint:
if (int k=10) {
    int k=20; // <===== ouch! redeclaration in the outermost block
    if (k)
        cout <<"oops";
}
Hence the only case where you can have a "subsequent condition of that statement" is the for statement. The standard highlights this special situation by restating the constraint you quoted in clearer wording:
6.5.3/1: (...) names declared in the for-init-statement are in the same declarative-region as those declared in the condition,
i.e. declaring the same name in both the init-statement and the condition would be an ill-formed redeclaration within a single declarative region.

Defining variables in control structures

According to the standard, what is the difference in behavior between declaring variables in control structures versus declaring variables elsewhere? I can't seem to find any mention of it.
If what I'm referring to isn't clear, here's an example:
if (std::shared_ptr<Object> obj = objWeakPtr.lock())
As you can see, I'm declaring and initializing a local variable, obj, in the if block.
Also, is there any technical reason as to why this syntax isn't given any special behavior when used in place of a conditional? For example, adding an additional set of brackets results in a compiler error; this also prevents the variable from being chained with other conditions.
// Extra brackets, won't compile.
if ((std::shared_ptr<Object> obj = objWeakPtr.lock()))
// If the above were valid, something like this could be desirable.
if ((std::shared_ptr<Object> obj = objWeakPtr.lock()) && obj->someCondition())
According to the standard, what is the difference in behavior between declaring variables in control structures versus declaring variables elsewhere? I can't seem to find any mention of it.
Declarations inside control structure introductions are no different than declarations elsewhere. That's why you can't find any differences.
6.4/3 does describe some specific semantics for this, but there are no surprises:
[n3290: 6.4/3]: A name introduced by a declaration in a condition
(either introduced by the type-specifier-seq or the declarator of the
condition) is in scope from its point of declaration until the end of
the substatements controlled by the condition. If the name is
re-declared in the outermost block of a substatement controlled by the
condition, the declaration that re-declares the name is ill-formed. [..]
Also, is there any technical reason as to why this syntax isn't given any special behavior when used in place of a conditional? For example, adding an additional set of brackets results in a compiler error; this also prevents the variable from being chained with other conditions.
An if condition can be either a declaration or an expression. No expression may contain a declaration, so you can't mix them either.
[n3290: 6.4/1]: Selection statements choose one of several flows of control.
selection-statement:
    if ( condition ) statement
    if ( condition ) statement else statement
    switch ( condition ) statement
condition:
    expression
    attribute-specifier-seq[opt] decl-specifier-seq declarator = initializer-clause
    attribute-specifier-seq[opt] decl-specifier-seq declarator braced-init-list
It all just follows from the grammar productions.
The difference between declaring and initializing a variable in the condition and declaring it elsewhere is that the variable's value is used as the condition, and the variable is in scope inside the statements controlled by the if, but out of scope after the if statement. Also, it's not legal to redeclare the variable in the outermost block of a controlled statement. So
bool x=something();
if(x) {
    bool y=x; // legal, x in scope
    int x=3; // legal
    ...
}
while (x=something_else()) // legal, x still in scope
    ...
but:
if(bool x=something()) {
    bool y=x; // still legal
    int x=3; // not legal to redeclare
    ...
}
while (x=something_else()) // x declared in condition not in scope any more