Is it undefined behavior to redeclare a variable within enclosed scope? - c++

#include <iostream>
using namespace std;
int main() {
int i = 0;
if(true) {
int i = 5;
cout << i << '\n';
}
return 0;
}
Tried running the above code on Ideone to see if it's legal. The results perplex me:
We have a compilation error (1), (2)
Or this code prints 5 as expected (1)
Or it prints nothing (1), (2)
As you can see from my links this same code behaves radically differently each time it is compiled on Ideone! This smells like undefined behavior (UB).
OK, C++ is known for its unintuitive behaviors, BUT (I admit it's just my intuition) I wouldn't expect even C++ to make redeclaring a variable in an inner scope UB! I'd expect either shadowing or a mandatory compilation error.
Is my code really UB according to the C++ standard, or is it just a peculiarity of Ideone and/or gcc? If it's UB, is it UB because I redeclared i or for some other reason I'm failing to notice now?

Is it UB to redeclare a variable within enclosed scope?
No, it is not.
The compiler error you are seeing is most likely caused by the fact that the outer i is declared but not used.
Otherwise, your code is just fine.
It works fine for me at https://ideone.com/AwVJqZ as well as my desktop.
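To make the point concrete, here is a minimal sketch of the same situation with the outer i actually used. The function name outer_after_shadowing is made up for illustration; the inner i is a separate variable, so changing it leaves the outer one alone.

```cpp
#include <cassert>

// Hypothetical helper: returns the outer i after an inner block has
// declared and modified its own, unrelated i.
int outer_after_shadowing() {
    int i = 0;            // outer i
    if (true) {
        int i = 5;        // a new variable that hides the outer i inside this block
        i += 1;           // modifies only the inner i
        assert(i == 6);
    }
    return i;             // the outer i is visible again and was never touched
}
```

Calling outer_after_shadowing() returns 0, which is well-defined shadowing rather than UB.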

There is no undefined behavior, the standard allows name hiding, it is covered in [basic.scope.hiding]:
A declaration of a name in a nested declarative region hides a declaration of the same name in an enclosing declarative region; see [basic.scope.declarative] and [basic.lookup.unqual].
and [basic.scope.declarative] says:
Every name is introduced in some portion of program text called a declarative region, which is the largest part of the program in which that name is valid, that is, in which that name may be used as an unqualified name to refer to the same entity.
In general, each particular name is valid only within some possibly discontiguous portion of program text called its scope.
To determine the scope of a declaration, it is sometimes convenient to refer to the potential scope of a declaration.
The scope of a declaration is the same as its potential scope unless the potential scope contains another declaration of the same name.
In that case, the potential scope of the declaration in the inner (contained) declarative region is excluded from the scope of the declaration in the outer (containing) declarative region.
and gives the following example:
[ Example: In
int j = 24;
int main() {
int i = j, j;
j = 42;
}
the identifier j is declared twice as a name (and used twice). The
declarative region of the first j includes the entire example. The
potential scope of the first j begins immediately after that j and
extends to the end of the program, but its (actual) scope excludes the
text between the , and the }. The declarative region of the second
declaration of j (the j immediately before the semicolon) includes all
the text between { and }, but its potential scope excludes the
declaration of i. The scope of the second declaration of j is the same
as its potential scope. — end example  ]
Why you see such variable results from Ideone, I don't know. It does not provide a lot of knobs for figuring out what is going on. Wandbox is one of several alternatives that provide a lot of knobs, and it does not exhibit the same variability for this case.
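The standard's example can be turned into something observable. This is only a sketch: the function name first_j_seen_by_i is invented here, and the body is the example's main with a return value added so the effect of the scopes can be checked.

```cpp
#include <cassert>

int j = 24;                 // the first j (namespace scope)

// Hypothetical wrapper around the example from [basic.scope.declarative].
int first_j_seen_by_i() {
    int i = j, j;           // the initializer of i still sees ::j; the local j
                            // only comes into scope after its own declarator
    j = 42;                 // assigns to the local j, which now hides ::j
    assert(j == 42);
    assert(::j == 24);      // the global j is unchanged
    return i;               // i was copied from the global j
}
```

The function returns 24, matching the description that the scope of the first j excludes only the text between the comma and the closing brace.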

Related

Which subclause of C++ standard prohibits redeclaration / redefinition in a same block?

I'm reading the Standard for Programming Language C++ and I cannot find a subclause prohibiting code like the following, which will obviously not compile:
/* Code A */
int main() {
int i;
int i;
}
while this one will compile:
/* Code B */
int main() {
int i;
{ int i; }
}
I've found something related, but I failed to find a matching one:
[basic.def.odr#1]: No translation unit shall contain more than one definition of any variable...
If it's this subclause, I cannot find a subclause explaining why the 2 i's are not the same variable in Code B but are the same variable in Code A;
[basic.scope.block#1]:A name declared in a block ([stmt.block]) is local to that block; it has block scope. Its potential scope begins at its point of declaration ([basic.scope.pdecl]) and ends at the end of its block. A variable declared at block scope is a local variable.
In fact I tried to look for something like or more general than "A name of variable with a block scope cannot be redeclared within its potential scope, excluding nested blocks" like [temp.local#6], but I failed:
[temp.local#6]: The name of a template-parameter shall not be redeclared within its scope (including nested scopes). ...
So can someone give me some help? Thanks!
You are looking for [basic.scope.scope]/5
Two declarations potentially conflict if they correspond and cause their shared name to denote different entities ([basic.link]). The program is ill-formed if, in any scope, a name is bound to two declarations that potentially conflict and one precedes the other ([basic.lookup]).
emphasis mine
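A small sketch of how the two cases differ in practice, assuming a hypothetical function sum_of_two_distinct_i. The Code A redeclaration is left as a comment because enabling it makes the program ill-formed; the nested block from Code B is a different scope, so its i is a different object.

```cpp
#include <cassert>

// Hypothetical helper illustrating Code A (commented out) versus Code B.
int sum_of_two_distinct_i() {
    int i = 1;
    int* outer = &i;
    int sum = 0;
    // int i = 2;           // Code A: same scope, two declarations that
    //                      // potentially conflict, so this would be ill-formed
    {
        int i = 2;          // Code B: declared in a nested scope, so no conflict
        assert(outer != &i);    // two distinct objects
        sum = *outer + i;       // outer i (1) plus inner i (2)
    }
    return sum;
}
```

The function returns 3, since both variables exist side by side while the inner block is active.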

How is a type that's forward declared in a function parameter list visible outside the function scope? [duplicate]

The following program compiles, which I find strange.
void f(class s);
using u = s; // ok, but why?
s is a forward declaration of a class inside a function parameter list, and it seems to me it should not be visible outside the function scope.
basic.scope.param seems the obvious place I would find this rule, but I can't work it out. The wording could be somewhere in dcl.dcl, but I'm not sure where to look.
What rule covers this? Optionally, an explanation of why this rule exists would be nice.
To start with, this rule is not particularly new. It existed since C++'s inception, pretty much. As for C++20, it is written as follows:
[basic.scope.pdecl]
7 The point of declaration of a class first declared in an elaborated-type-specifier is as follows:
...
for an elaborated-type-specifier of the form
class-key identifier
if the elaborated-type-specifier is used in the decl-specifier-seq or parameter-declaration-clause of a function defined in namespace scope, the identifier is declared as a class-name in the namespace that contains the declaration; otherwise, except as a friend declaration, the identifier is declared in the smallest namespace or block scope that contains the declaration.
But you are looking in the latest greatest draft head. You can't find it because the draft has P1787 merged in. It changes the normative wording and moves it with the intent of fixing some outstanding wording issues and improving the standard's approach in a world where modules exist.
Today, the relevant part resides in
[dcl.type.elab]
3 Otherwise, an elaborated-type-specifier E shall not have an attribute-specifier-seq. If E contains an identifier but no nested-name-specifier and (unqualified) lookup for the identifier finds nothing, E shall not be introduced by the enum keyword and declares the identifier as a class-name. The target scope of E is the nearest enclosing namespace or block scope.
And essentially, it means the same thing the C++20 wording does. It introduces the class name, as if by forward declaration, into the nearest enclosing namespace or block scope.
As for why this rule exists: C has no such rule to this day, which creates some fairly obscure problems for the uninitiated. Consider this simple program:
void func(struct foo*);
struct foo { int bar; };
int main() {
struct foo f;
func(&f);
}
void func(struct foo* pf) {
pf->bar = 0;
}
It produces a slew of diagnostics, which frankly don't seem justified. IMHO it's a shortcoming of C, which in turn is motivation enough for C++ to do things the way it does. Compile the exact same program with a C++ compiler, and it's well formed.
class s is a forward declaration. That is equivalent to
class s;
void f(s);
s is not the name of a variable, but it's a type. So you are just saying that function f takes a parameter of type s.
Also the next line tells that u is equivalent to s. No problem. You don't need the full definition to do that.
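The following sketch extends the question's two lines into something that can be compiled and checked. The return type int, the member value, and the helper call_f are additions made up for illustration; only the first two declarations mirror the original snippet.

```cpp
#include <cassert>
#include <type_traits>

int f(class s);          // the elaborated-type-specifier declares s at namespace scope

using u = s;             // fine: an incomplete type is enough for an alias

class s {                // the class is completed later
public:
    int value = 7;
};

static_assert(std::is_same<u, s>::value, "u and s name the same class");

int f(s arg) { return arg.value; }   // definition needs the complete type

// Hypothetical caller that goes through the alias.
int call_f() {
    u obj;
    assert(obj.value == 7);
    return f(obj);
}
```

Note that declaring f with an incomplete parameter type is allowed; only the definition and the call need s to be complete.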

C++ Order of Declaration (in Multi-variable Declaration Line)

I use the following in my C++ code:
int a = 0, b = a;
I would like to know if this behaviour is reliable and well defined (left to right order of name declaration) and that my code will not break with other compilers with an undeclared name error.
If not reliable, I would break the statement:
int a = 0;
int b = a;
Thank you.
I believe the answer is no.
It is subject to core active issue 1342 which says:
It is not clear what, if anything, in the existing specification requires that the initialization of multiple init-declarators within a single declaration be performed in declaration order.
We have non-normative note in [dcl.decl]p3 which says:
...[ Note: A declaration with several declarators is usually
equivalent to the corresponding sequence of declarations each with a
single declarator. That is
T D1, D2, ... Dn;
is usually equivalent to
T D1; T D2; ... T Dn;
...
but it is non-normative and it does not cover the initialization case at all and as far as I can tell no normative wording says the same thing.
Although the standard does cover the scope of names in [basic.scope.pdecl]p1 which says:
The point of declaration for a name is immediately after its complete
declarator and before its initializer (if any), except as noted below.
[ Example:
unsigned char x = 12;
{ unsigned char x = x; }
Here the second x is initialized with its own (indeterminate) value.
— end example  ]
The fact that you thought to ask this question suggests that the style is not great. Even though the one-line version is almost guaranteed† to work, I would still go with the two-line approach for the greater clarity to human readers.
† I initially said it was guaranteed, but I will step back from that. After reviewing the relevant portion of the spec, I can see how language lawyers would complain that this guarantee is not explicitly stated. (As Shafik Yaghmour points out, core active issue 1342 notes the lack of an explicit guarantee, albeit with phrasing that suggests that such a guarantee should be present.)
I will step back only to "almost guaranteed", though, as it is strongly implied by "Each init-declarator in a declaration is analyzed separately as if it was in a declaration by itself.". That is, the analysis of int a = 0, b = a; has two parts: one where a variable named a is initialized to 0, and one where a variable named b is initialized to the value of a. If you are truly keeping these parts separate, then the first part would have to finish before the second part begins (otherwise they are not as if each was in a declaration by itself), so a would have the value 0 before b is initialized. I accept that this might be not definite enough for the language lawyers, but it should be good enough for a compiler's bug report if there is a compiler for which that line does not work as intended.
My apologies for not looking up the spec earlier. The "language-lawyer" tag was not present when I initially answered.
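As a sketch of what the one-line form is expected to do, the hypothetical function below uses a in the initializer of b. The name lookup part is guaranteed by the point-of-declaration rule quoted above; the left-to-right initialization is what mainstream compilers do, and is the part core issue 1342 says is not spelled out explicitly.

```cpp
#include <cassert>

// Hypothetical helper: b is initialized from a within a single declaration.
int b_from_a() {
    int a = 10, b = a + 1;   // a is already in scope in b's initializer
    assert(a == 10);
    return b;
}
```

With the usual left-to-right initialization this returns 11. Splitting the declaration into two statements gives the same result without relying on that reading.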
A declaration statement that defines multiple variables separated by commas is equivalent to multiple declaration statements that each define a single variable, in the same order, because the scope of a variable begins just after its declarator. There are (at least) two exceptions:
1) When a variable declaration hides a type with the same name, as in:
struct S {};
S S, T;
Is different from
struct S {};
S S;
S T; //error: S is a variable name
But
struct S {};
S S, T{S};
Is equivalent to
struct S{};
S S;
struct S T{S};
2) When using the auto and decltype(auto) specifiers:
auto i{0}, j{i}, k{2.0}; // error: deduction for auto fails, double or int?
Is different from
auto i{0};
auto j{i};
auto k{2.0};
In any case, evaluation order is always from left to right.
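The first exception can be checked with a short sketch. The member tag and the function declare_both are invented for illustration; the point is that the type name S is looked up once for the whole declaration, before the variable S hides it.

```cpp
#include <cassert>
#include <type_traits>

struct S { int tag = 3; };

// Hypothetical helper showing "S S, T;" inside a function.
int declare_both() {
    S S, T;              // both variables have class type S
    static_assert(std::is_same<decltype(T), struct S>::value,
                  "T has the class type S");
    // S U;              // would not compile here: S now names the variable
    struct S U;          // the elaborated form still finds the class
    assert(S.tag == 3);
    return S.tag + T.tag + U.tag;
}
```

Each object carries tag 3, so the function returns 9.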

Where is name lookup rule defined that finds the most immediate declaration of a name?

int i;
void f()
{
int i{};
{
int a = i; // local or global 'i'?
}
}
My question is not which i gets chosen, as it's clear that it's the local one, but rather, where in the standard that is specified.
The closest rule I could find is [basic.lookup.unqual]p6, which says:
In the definition of a function that is a member of namespace N, a name used after the function's declarator-id shall be declared before its use in the block in which it is used or in one of its enclosing blocks ([stmt.block]) or shall be declared before its use in namespace N or, if N is a nested namespace, shall be declared before its use in one of N's enclosing namespaces.
But there it just says that the name has to be declared sometime before the use; it's not what I'm looking for. The example in the same paragraph makes everything clearer as it says what scopes are searched in what order, but it's an example and as such not normative.
Every other paragraph in [basic.lookup.unqual] doesn't apply to non-member functions. So my question is where in the standard is this specified?
In [basic.scope.declarative] we have:
Every name is introduced in some portion of program text called a declarative region, which is the largest part of the program in which that name is valid, that is, in which that name may be used as an unqualified name to refer to the same entity.
In general, each particular name is valid only within some possibly discontiguous portion of program text called its scope.
To determine the scope of a declaration, it is sometimes convenient to refer to the potential scope of a declaration.
The scope of a declaration is the same as its potential scope unless the potential scope contains another declaration of the same name.
In that case, the potential scope of the declaration in the inner (contained) declarative region is excluded from the scope of the declaration in the outer (containing) declarative region.
[ Example: In
int j = 24;
int main() {
int i = j, j;
j = 42;
}
the identifier j is declared twice as a name (and used twice).
The declarative region of the first j includes the entire example.
The potential scope of the first j begins immediately after that j and extends to the end of the program, but its (actual) scope excludes the text between the , and the }.
The declarative region of the second declaration of j (the j immediately before the semicolon) includes all the text between { and }, but its potential scope excludes the declaration of i.
The scope of the second declaration of j is the same as its potential scope.
— end example ]
(Emphasis mine.)
In your
int a = i;
example, i must refer to the local i because the global i is literally not in scope here.
As it says at the beginning of [basic.lookup.unqual]:
In all the cases listed in [basic.lookup.unqual], the scopes are searched for a declaration in the order listed in each of the respective categories [...]
But it doesn't matter which search order we choose if only one declaration is in scope in the first place.
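A sketch of the question's snippet with a return value added (the function name innermost_i is made up here). The unqualified i finds the local variable, while the qualified name ::i still reaches the hidden global.

```cpp
#include <cassert>

int i = 1;                   // global i

// Hypothetical variant of f() from the question that reports what it read.
int innermost_i() {
    int i = 2;               // local i hides the global one from here on
    int result = 0;
    {
        int a = i;           // unqualified lookup: only the local i is in scope
        assert(::i == 1);    // the global is still reachable by qualified name
        result = a;
    }
    return result;
}
```

The function returns 2, the value of the local i.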

Defining variables in control structures

According to the standard, what is the difference in behavior between declaring variables in control structures versus declaring variables elsewhere? I can't seem to find any mention of it.
If what I'm referring to isn't clear, here's an example:
if (std::shared_ptr<Object> obj = objWeakPtr.lock())
As you can see, I'm declaring and initializing a local variable, obj, in the if condition.
Also, is there any technical reason as to why this syntax isn't given any special behavior when used in place of a conditional? For example, adding an additional set of brackets results in a compiler error; this also prevents the variable from being chained with other conditions.
// Extra brackets, won't compile.
if ((std::shared_ptr<Object> obj = objWeakPtr.lock()))
// If the above were valid, something like this could be desirable.
if ((std::shared_ptr<Object> obj = objWeakPtr.lock()) && obj->someCondition())
According to the standard, what is the difference in behavior between declaring variables in control structures versus declaring variables elsewhere? I can't seem to find any mention of it.
Declarations inside control structure introductions are no different than declarations elsewhere. That's why you can't find any differences.
6.4/3 does describe some specific semantics for this, but there are no surprises:
[n3290: 6.4/3]: A name introduced by a declaration in a condition
(either introduced by the type-specifier-seq or the declarator of the
condition) is in scope from its point of declaration until the end of
the substatements controlled by the condition. If the name is
re-declared in the outermost block of a substatement controlled by the
condition, the declaration that re-declares the name is ill-formed. [..]
Also, is there any technical reason as to why this syntax isn't given any special behavior when used in place of a conditional? For example, adding an additional set of brackets results in a compiler error; this also prevents the variable from being chained with other conditions.
An if condition can be either a declaration or an expression. No expression may contain a declaration, so you can't mix them either.
[n3290: 6.4/1]: Selection statements choose one of several flows of control.
selection-statement:
if ( condition ) statement
if ( condition ) statement else statement
switch ( condition ) statement
condition:
expression
attribute-specifier-seq[opt] decl-specifier-seq declarator = initializer-clause
attribute-specifier-seq[opt] decl-specifier-seq declarator braced-init-list
It all just follows from the grammar productions.
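Since the grammar rules out mixing the declaration with further conditions, one way to get the chained behavior is a nested if. The sketch below assumes a made-up Object type with a someCondition() member and a hypothetical classify function; it also shows that the name declared in the condition stays in scope in the else branch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the question's Object type.
struct Object {
    bool ready = false;
    bool someCondition() const { return ready; }
};

// Returns 0 if the object is gone, 1 if it is alive, 2 if alive and ready.
int classify(const std::weak_ptr<Object>& objWeakPtr) {
    if (std::shared_ptr<Object> obj = objWeakPtr.lock()) {
        if (obj->someCondition())   // the "chained" test goes in a nested if
            return 2;
        return 1;
    } else {
        assert(!obj);               // obj is still in scope here, and empty
        return 0;
    }
    // obj is no longer in scope at this point
}
```

A caller would typically keep a std::shared_ptr<Object> alive elsewhere and pass a std::weak_ptr made from it.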
The difference between declaring and initializing a variable in the condition and declaring it elsewhere is that the variable is used as the condition, and is in scope inside the statements controlled by the if, but out of scope after them. Also, it's not legal to re-declare the variable in the outermost block of the controlled statement. So
bool x=something();
if(x) {
bool y=x; // legal, x in scope
int x=3; // legal
...
}
while (x=something_else()) // legal, x still in scope
...
but:
if(bool x=something()) {
bool y=x; // still legal
int x=3; // not legal to redeclare
...
}
while (x=something_else()) // x declared in condition not in scope any more