I have found multiple questions regarding the subject of defining variables inside a switch construct, but I have not yet found a clear answer to this question.
Chapter 5.3.2 of the book C++ Primer says the following:
As we’ve seen, execution in a switch can jump across case labels. When execution jumps to a particular case, any code that occurred inside the switch before that label is ignored.
Considering this information, I do not understand why the example below is legal. If control jumps to the false case, it should ignore the true case. This means that assigning to i should be illegal, because it was never declared. Why is this construct legal?
case true:
int i;
break;
case false:
i = 42;
break;
Declaration is a compile-time thing, and what happens at runtime is irrelevant to that fact. i is visible at any point within the same or child scope after its declaration.
There is nothing that causes a scope change between the two cases, so i remains visible in the false case regardless of whether the true case executed.
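A minimal sketch of the point above. It uses an int selector instead of bool only to sidestep compiler warnings about switching on a boolean, and the function name pick is made up for illustration. The declaration of i has no initializer, so jumping past it to the second label is allowed, and i is still in scope there:

```cpp
// Hypothetical helper: returns the value stored in i by whichever case runs.
int pick(int selector) {
    int result = -1;
    switch (selector) {
    case 1:
        int i;        // declaration only; there is no initializer to skip
        i = 1;
        result = i;
        break;
    case 0:
        i = 42;       // same scope as the declaration above, so i is visible
        result = i;
        break;
    }
    return result;
}
```

Calling pick(0) takes the second label without ever executing the first case, yet the assignment to i compiles and works because scope is decided at compile time.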
This is why you may see anonymous blocks ({ }) used to artificially constrain scope in switch cases. It's to prevent exactly this potential issue (though in this case it's not an issue).
case true: {
int i;
break;
} // Closing this block causes the end of i's lifetime.
case false: {
i = 42; // Compile-time error; i is no longer in scope.
break;
}
Note that your code becomes illegal just by initializing i. A jump cannot bypass an initialization and land at a point where the variable is in scope.
case true:
int i = 0;
break;
case false: // error: jump to case label crosses initialization
i = 42;
break;
Also, any variable of a type that is not trivial cannot have a lifetime spanning cases even if it is not explicitly initialized.
case true:
std::string i;
break;
case false: // error: jump to case label crosses initialization
i = "42";
break;
The fix in this case is to use anonymous blocks to constrain the scope of i's declaration so that it does not span multiple cases.
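As a sketch of that fix (the function name describe and the int selector are assumptions made for illustration), each case gets its own block, so no case label sits inside the scope of a std::string that another case initializes:

```cpp
#include <string>

// Hypothetical helper illustrating per-case blocks.
std::string describe(int selector) {
    std::string result;
    switch (selector) {
    case 1: {
        std::string s = "yes";   // s lives only inside this block
        result = s;
        break;
    }
    case 0: {
        std::string s = "42";    // a separate, unrelated s
        result = s;
        break;
    }
    }
    return result;
}
```

Because each s goes out of scope at its closing brace, the jump to either label never enters the scope of an initialized variable.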
The relevant standardese:
It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps* from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer.
-- C++14 (N4296) [stmt.dcl.3]
The footnote (*) regarding jumps:
The transfer from the condition of a switch statement to a case label is considered a jump in this respect.
Related
Consider this example
#include <iostream>
int main(){
switch(int a = 1){ //#condition
case 1: switch(int a = 2){}
case 2: switch(int a = 2){}
}
}
Why are the redeclarations of a well-formed in this example?
According to the following rule:
basic.scope.block#3
Names declared in the init-statement, the for-range-declaration, and in the condition of if, while, for, and switch statements are local to the if, while, for, or switch statement (including the controlled statement), and shall not be redeclared in a subsequent condition of that statement nor in the outermost block (or, for the if statement, any of the outermost blocks) of the controlled statement.
IIUC, the declarations in both nested statements, switch(int a = 2){} after case 1 and switch(int a = 2){} after case 2, are in the outermost block of the controlled statement, which is a compound-statement.
By contrast:
#include <iostream>
int main(){
switch(int a = 1){ //#condition
case 1: int a = 2;
}
}
The redeclaration of a after case 1 is ill-formed since it's redeclared in the outermost block of that statement.
Clarify
According to stmt.block, "block" is a synonym for compound-statement. So the rule above talks about blocks in the syntactic sense, regardless of scope. The rule is equivalent to:
shall not be redeclared in the outermost compound-statement of the controlled statement.
So what I can't understand is this: since there is no block between the condition of the inner switch and the outermost block of the first switch, how can we say that the condition of the inner switch is not in the outermost block of the outer switch?
switch(int a = 1){ <- outermost block of the primary `switch`
case 1: switch(int a = 2 /* no block contains this condition */){}
}
By contrast:
switch(int a = 1){ <- outermost block of the primary `switch`
case 1: { /* a block sits between `int a = 2` and the outermost block of the primary `switch`, so the condition is definitely not in the outermost block */
switch(int a = 2 ){}
}
}
Is there a rule in the standard that I have missed, describing a transformation similar to stmt.while#2, that would place the condition inside an invented block (compound-statement)?
The real question is the meaning of in here. Anything between the { and } that delimit the switch is of course in that “outermost block”, but the use of “outermost” clearly implies that we’re not supposed to consider portions of the program that are (also) inside a nested block. The simplest way to reach that interpretation is to read “in” as “directly in”, in the same sense that “a function declared in a namespace” does not usually include member functions of classes in that namespace. The condition of a nested switch is then exempt because declarations in it are not directly in any block.
P1787R6, which was adopted in November 2020, clarifies the situation by rewriting [basic.scope.block] to specifically refer to the singular scope associated with the substatement, independently of transformations like that in [stmt.while]/2.
There's no such rule. The code in stmt.while/2 is misleadingly redundant; it could equally be written
label:
if ( condition ) {
statement
goto label ;
}
It is necessary to look at the meaning of the word "in" in stmt.pre/5 and basic.scope.block/3:
Names declared in the init-statement, the for-range-declaration, and in the condition of if, while, for, and switch statements are local to the if, while, for, or switch statement (including the controlled statement), and shall not be redeclared in a subsequent condition of that statement nor in the outermost block (or, for the if statement, any of the outermost blocks) of the controlled statement.
Here, "in" means "immediately in"; it's referring to declaration-statements which would have scope the remainder of the block. Declarations via the condition of a selection statement are not "in" the enclosing block, they are "in" that selection statement.
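A small sketch of the well-formed case under that reading (the function name nested_condition is hypothetical). The inner condition declares its own a, which hides the outer one only inside the inner switch:

```cpp
// Hypothetical helper: reports which `a` is visible at each point.
int nested_condition(int outer) {
    int seen = -1;
    switch (int a = outer) {      // outer a, local to the outer switch
    case 1:
        switch (int a = 2) {      // inner a: declared in the inner condition,
        default:                  // not directly in the outer switch's block
            seen = a;             // refers to the inner a
            break;
        }
        break;
    default:
        seen = a;                 // refers to the outer a
        break;
    }
    return seen;
}
```

With outer equal to 1 the function reports the inner value 2; for any other argument it reports the outer a unchanged.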
In the following code, why is the variable i not assigned the value 1?
#include <stdio.h>
int main(void)
{
int val = 0;
switch (val) {
int i = 1; //i is defined here
case 0:
printf("value: %d\n", i);
break;
default:
printf("value: %d\n", i);
break;
}
return 0;
}
When I compile, I get a warning about i not being initialized, despite int i = 1; which clearly initializes it.
$ gcc -Wall test.c
warning: ‘i’ is used uninitialized in this function [-Wuninitialized]
printf("value %d\n", i);
^
If val = 0, then the output is 0.
If val = 1 or anything else, then the output is also 0.
Please explain to me why the variable i is declared but not defined inside the switch. The object whose identifier is i exists with automatic storage duration (within the block) but is never initialized. Why?
According to the C standard (6.8 Statements and blocks), emphasis mine:
3 A block allows a set of declarations and statements to be grouped
into one syntactic unit. The initializers of objects that have
automatic storage duration, and the variable length array declarators
of ordinary identifiers with block scope, are evaluated and the values
are stored in the objects (including storing an indeterminate value
in objects without an initializer) each time the declaration is
reached in the order of execution, as if it were a statement, and
within each declaration in the order that declarators appear.
And (6.8.4.2 The switch statement)
4 A switch statement causes control to jump to, into, or past the
statement that is the switch body, depending on the value of a
controlling expression, and on the presence of a default label and the
values of any case labels on or in the switch body. A case or default
label is accessible only within the closest enclosing switch
statement.
Thus the initializer of variable i is never evaluated because the declaration
switch (val) {
int i = 1; //i is defined here
//...
is not reached in the order of execution, because control jumps straight to a case label. Like any object with automatic storage duration that has not been initialized, i therefore has an indeterminate value.
See also this example from 6.8.4.2/7:
EXAMPLE In the artificial program fragment
switch (expr)
{
int i = 4;
f(i);
case 0:
i = 17; /* falls through into default code */
default:
printf("%d\n", i);
}
the object whose identifier is i exists with
automatic storage duration (within the block) but is never
initialized, and thus if the controlling expression has a nonzero
value, the call to the printf function will access an indeterminate
value. Similarly, the call to the function f cannot be reached.
In the case when val is not zero, the execution jumps directly to the label default. This means that the variable i, while defined in the block, isn't initialized and its value is indeterminate.
6.8.4.2 The switch statement
A switch statement causes control to jump to, into, or past the statement that is the
switch body, depending on the value of a controlling expression, and on the presence of a
default label and the values of any case labels on or in the switch body. A case or
default label is accessible only within the closest enclosing switch statement.
Indeed, your i is declared inside the switch block, so it only exists inside the switch. However, its initialization is never reached, so it stays uninitialized when val is not 0.
It is a bit like the following code:
{
int i;
if (val==0) goto zerovalued;
else goto nonzerovalued;
i=1; // statement never reached
zerovalued:
i = 10;
printf("value:%d\n",i);
goto next;
nonzerovalued:
printf("value:%d\n",i);
goto next;
next:
return 0;
}
Intuitively, think of a raw declaration as asking the compiler for some location (on the call frame in your call stack, or in a register, or whatever), and think of initialization as an assignment statement. These are separate steps, and you could look at an initializing declaration in C like int i=1; as syntactic sugar for the raw declaration int i; followed by the initializing assignment i=1;.
(actually, things are slightly more complex e.g. with int i= i!=i; and even more complex in C++)
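Following that mental model, here is a sketch in which the declaration carries no initializer and every label performs its own assignment. The function name assign_in_each_case is made up for illustration; the code is written as C++ but should also be accepted as C:

```cpp
// Hypothetical helper: the declaration reserves storage, the cases assign.
int assign_in_each_case(int val) {
    int result = 0;
    switch (val) {
        int i;          // raw declaration; control never passes through here
    case 0:
        i = 10;         // plain assignment, reached via the case label
        result = i;
        break;
    default:
        i = 20;         // every path assigns before reading
        result = i;
        break;
    }
    return result;
}
```

Since no path reads i before writing it, the skipped declaration is harmless here, unlike the int i = 1; in the question.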
The line that initializes i, int i = 1;, is never executed, because it comes before every case label and the switch jumps straight to one of those labels.
The initialization of variables with automatic storage durations is detailed in C11 6.2.4p6:
For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
I.e. the lifetime of i in
switch(a) {
int i = 2;
case 1: printf("%d",i);
break;
default: printf("Hello\n");
}
is from { to }. Its value is indeterminate, unless the declaration int i = 2; is reached in the execution of the block. Since the declaration is before any case label, the declaration cannot be ever reached, since the switch jumps to the corresponding case label - and over the initialization.
Therefore i remains uninitialized. Since its address is also never taken, using the uninitialized value is undefined behaviour per C11 6.3.2.1p2:
[...] If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
(Notice that the standard itself here words the contents in the clarifying parenthesis incorrectly - it is declared with an initializer but the initializer is not executed).
The issue of variable declaration in switch-case statements is well discussed in this SO post, and the answers cover most aspects. But I ran into a problem for which I couldn't find a solid explanation. Can someone please explain what is wrong with this code?
switch (var)
{
case 0:
int test;
test = 0; // no error
int test2 = 0; // ERROR: initialization of 'test2' is skipped by 'case' label
std::string str;
str = "test"; // ERROR: initialization of 'str' is skipped by 'case' label
break;
case 1:;
break;
}
I know why the 6th line results in error. But what is wrong with the next two lines? I think this may have something to do with the difference between native types and class types, but I am not sure.
This is not a duplicate of "Why can't variables be declared in a switch statement?"; I have linked to that question above. Please read both questions and note the difference. AFAIK, this issue is not discussed in the original question.
It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer (8.5).
([stmt.dcl]/3)
The intuitive explanation is that you can only skip a declaration if the initialization it performs is effectively a no-op. If a value is provided, you can't skip it. If there is any code in the constructor of a class, you can't skip it.
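The sketch below illustrates that rule of thumb. Point, skippable and sum_point are hypothetical names, and the type-trait check only approximates the wording of [stmt.dcl]/3 rather than reproducing it exactly:

```cpp
#include <string>
#include <type_traits>

struct Point { int x; int y; };   // hypothetical type, no user-provided ctor/dtor

// Approximates the [stmt.dcl]/3 condition for declarations without an initializer.
template <typename T>
constexpr bool skippable() {
    return std::is_trivially_default_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

static_assert(skippable<int>(), "scalar type");
static_assert(skippable<Point>(), "trivial class type");
static_assert(!skippable<std::string>(), "std::string has a non-trivial constructor");

int sum_point(int selector) {
    int result = -1;
    switch (selector) {
    case 0:
        Point p;             // no initializer, trivial type: may be jumped over
        p.x = 1; p.y = 2;
        result = p.x + p.y;
        break;
    case 1:
        p.x = 10; p.y = 20;  // p is in scope even though case 0 did not run
        result = p.x + p.y;
        break;
    }
    return result;
}
```

Replacing Point with std::string in sum_point, or adding an initializer to p, would bring back the "skipped by case label" error from the question.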
The following syntax is valid:
while (int i = get_data())
{
}
But the following is not:
do
{
} while (int i = get_data());
We can see why via the draft standard N4140 section 6.4:
1 [...]
condition:
expression
attribute-specifier-seq_opt decl-specifier-seq declarator = initializer-clause
attribute-specifier-seq_opt decl-specifier-seq declarator braced-init-list
2 The rules for conditions apply both to selection-statements and
to the for and while statements (6.5). [...]
and section 6.5
1 Iteration statements specify looping.
iteration-statement:
while ( condition ) statement
do statement while ( expression ) ;
Instead, you're forced to do something ugly like:
int i = get_data();
do
{
} while ((i = get_data())); // double parentheses sic
What is the rationale for this?
It seems like scoping would be the issue: what would be the scope of i declared in the while portion of a do-while statement? It would seem rather unnatural to have a variable available within the loop when its declaration is actually below the loop itself. You don't have this issue with the other loops, since their declarations come before the body of the loop.
If we look at the draft C++ standard section [stmt.while]p2 we see that for the while statement that:
while (T t = x) statement
is equivalent to:
label:
{ // start of condition scope
T t = x;
if (t) {
statement
goto label;
}
} // end of condition scope
and:
The variable created in a condition is destroyed and created with each iteration of the loop.
How would we formulate this for the do while case?
And, as cdhowie points out, [stmt.do]p2 says (emphasis mine):
In the do statement the substatement is executed repeatedly until the
value of the expression becomes false. The test takes place after each
execution of the statement.
which means the body of the loop is evaluated before we would even reach the declaration.
While we could create an exception for this case, it would violate our intuitive sense that, in general, the point of declaration for a name comes after we have seen the complete declaration (with some exceptions, for example class member variables), and the benefits are unclear. Point of declaration is covered in section 3.3.2.
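The per-iteration lifetime that [stmt.while]p2 describes for a while condition variable can be observed with a small counting type. Tracker and count_down are hypothetical names, and the sketch assumes a non-negative start value:

```cpp
// Hypothetical type that counts its own constructions and destructions.
struct Tracker {
    static int constructed;
    static int destroyed;
    int value;
    explicit Tracker(int v) : value(v) { ++constructed; }
    ~Tracker() { ++destroyed; }
    explicit operator bool() const { return value != 0; }
};
int Tracker::constructed = 0;
int Tracker::destroyed = 0;

// Counts loop iterations while the condition variable is re-created each time.
int count_down(int start) {
    Tracker::constructed = 0;
    Tracker::destroyed = 0;
    int remaining = start;
    int iterations = 0;
    while (Tracker t{remaining}) {   // a fresh t for every test of the condition
        ++iterations;
        --remaining;
    }                                // t is destroyed at the end of each pass
    return iterations;
}
```

For count_down(3) the body runs three times, but four Tracker objects are constructed and destroyed, because the final failing test also creates one. A do-while condition has no earlier point at which such a variable could be created for the body to use.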
There are several reasons for why it would be difficult to allow.
The language sticks to the general rule that everything should be declared above the point of use. Here, the variable declared in do-while would be declared below its expected natural scope (the loop body). Making it accessible inside the loop would require special treatment for do-while loops. Even though such special treatment exists elsewhere (e.g. in-class member function bodies can see all class members, including those declared below), there is probably not much practical sense in doing it for do-while loops.
For do-while, these special rules would also need a meaningful way to handle initialization of variables declared in this fashion. Note that in C++ the lifetime of such a variable is limited to one iteration of the loop, i.e. the variable is created and destroyed on each iteration. That means that in a do-while loop the variable would always remain uninitialized, unless some rule somehow moved the initialization to the beginning of the loop body. That would be quite confusing in my opinion.
It would be very unnatural to have a declaration of i after the block and then be able to access it in the block. Declarations in for and while are convenient shorthands that give limited scope to a variable needed in the loop logic.
Cleaner to do it this way:
int i;
do {
i = get_data();
// whatever you want to do with i;
} while (i != 0);
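If the goal is also to keep i out of the enclosing scope, one alternative (not taken from the answers here) is an unconditional loop with the test at the bottom. In this sketch, sum_until_zero is a made-up name and the vector stands in for successive get_data() results:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for repeated get_data() calls: values are read from
// `data` in order, and 0 is reported once the input is exhausted.
int sum_until_zero(const std::vector<int>& data) {
    std::size_t next = 0;
    int sum = 0;
    for (;;) {
        const int i = next < data.size() ? data[next++] : 0;  // scoped to one iteration
        sum += i;             // body runs at least once, as in do-while
        if (i == 0) break;    // the test sits at the bottom of the loop
    }
    return sum;
}
```

This keeps the do-while behaviour (body first, test last) while i is declared before its first use and never leaks outside the loop.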
This is because everything else follows the practice of declaring variables before you use them, eg:
public static void main(String[] args){
// scope of args
}
for(int i=1; i<10; i++){
// scope of i
}
{
...
int somevar;
//begin scope of var
...
//end of scope of var
}
This is because code is parsed top down, and because following this convention keeps things intuitive. That is why you can write while (int i = get_data()): the scope of that variable is the loop, which comes after the declaration.
Declaring a variable in the condition of a do-while makes little sense, because its scope would end at the point where it is checked, which is when the block has already finished.
Add this
#define do(cond) switch (cond) do default:
at the beginning of your code.
Now, you can write
do (int i = get_data())
{
// your code
} while ((i = get_data()));
It is important that this #define does not break the original syntax of the do keyword in do-while loop.
However, I admit that it is obscure.
Your first syntax is valid while the second is not.
However, your while loop will loop forever, even if your function get_data() returns 0.
Not sure if that's exactly what you want to happen.
I have a program with a switch statement similar to this:
switch(n)
{
case 0:
/* stuff */
break;
int foo;
case 1:
foo = 5;
break;
case 2:
foo = 6;
break;
}
Notice the int foo; between case 0 and case 1. This statement is unreachable: if you walk through the program, you'll never step over it.
This compiles without warnings or errors with Clang, but it seemed to be jacked up when I ran it (though that could be due to other causes).
Is it well-defined behavior to declare a variable in an unreachable statement and use it in reachable statements, and is it going to work?
It is well-defined behavior as long as the variable has trivial construction, and has (approximately) the same effect as if the variable was declared in a larger scope.
If any initialization is needed, you'll get an error.
Section 6.7 ([stmt.dcl]) says:
It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer.