The issue of variable declaration in switch-case statements is well discussed in this SO post, and the answers cover most aspects. But I ran into a problem for which I couldn't find a solid explanation. Can someone please explain what is wrong with this code?
switch (var)
{
case 0:
int test;
test = 0; // no error
int test2 = 0; // ERROR: initialization of 'test2' is skipped by 'case' label
std::string str;
str = "test"; // ERROR: initialization of 'str' is skipped by 'case' label
break;
case 1:;
break;
}
I know why the 6th line results in an error. But what is wrong with the next two lines? I think this may have something to do with the difference between native types and class types, but I am not sure.
This is not a duplicate of Why can't variables be declared in a switch statement?, to which I have provided a link above. Please read the two questions and note the difference. AFAIK, this issue is not discussed in the original question.
It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A
program that jumps from a point where a variable with automatic storage duration is not in scope to a
point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default
constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the
preceding types and is declared without an initializer (8.5).
([stmt.dcl]/3)
The intuitive explanation is that you can only skip a declaration if the initialization it performs is effectively a no-op. If a value is provided, you can't skip it. If there is any code in the constructor of a class, you can't skip it.
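As a rough illustration of the rule quoted above, the condition for a declaration without an initializer can be approximated with standard type traits. This is only a sketch: can_skip_declaration, Plain, and WithCtor are hypothetical names introduced here, and the trait check approximates the standard's wording rather than restating it.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: approximates the [stmt.dcl]/3 condition for a
// declaration that has no initializer. Not part of any library.
template <typename T>
constexpr bool can_skip_declaration() {
    return std::is_trivially_default_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

struct Plain { int x; };                   // trivial default constructor and destructor
struct WithCtor { WithCtor() {} int x; };  // user-provided constructor runs code

static_assert(can_skip_declaration<int>(), "scalar type");
static_assert(can_skip_declaration<Plain>(), "trivial class type");
static_assert(!can_skip_declaration<WithCtor>(), "constructor contains code");
static_assert(!can_skip_declaration<std::string>(), "non-trivial class type");
```

Under this approximation, int test; may be jumped over, while std::string str; may not, which matches the two errors in the question.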
I want all the variables I create in my program to be initialized to zero without my doing it explicitly.
For example, suppose I create a variable like this:
int i;
We know that it may contain a garbage value; to give it the value 0 by default we need to declare it like this:
int i=0;
But I want every variable in the program to contain 0 rather than some garbage value.
In my code I want every variable (class members, global variables, local variables) to be automatically initialized to 0 without doing it explicitly.
The approach I have in mind comes from C, where I wrote a "Hello World" program without a main() function. To do that I overrode the _start function, which is the first function to run; it sets everything up and then calls main(). So I think there must be a similar function that is called whenever a variable is created, and I thought we could set all variables to 0 there. Please help me with this problem. If there is some other way to solve it, please share it; I am open to all solutions, but please don't tell me to explicitly initialize them to 0.
As a person who spends much of his working life looking at other people's broken code, I really have to say this, although I doubt it will be popular...
If it matters what the initial value of a variable is, then you should initialize it explicitly.
That's true even if you initialize it to what is, in fact, the default value. When I look at a statement like this:
int i = 0;
I immediately know (or think I know) that the programmer really thought about the value, and set it. If I read this:
int i;
then I assume that the programmer does not care about the value -- presumably because it will be assigned a value later.
So far as automatic variables are concerned, it would be easy enough for the compiler to generate code that zeroed the relevant part of the stack frame on entry to a function. I suspect it would be hard to do in application code; but why would you want to? Not only would it make the program behave in a way that appears to violate the language specification, it would encourage sloppy, unreadable programming practices.
Just initialize your variables, and have done with it. Or, if you don't want to initialize it because you know that the compiler will initialize it in the way you want, insert a comment to that effect. You'll thank yourself when you have to fix a bug five years later.
Default initialization and some words regarding the complexity of initialization in C++
To limit the scope of this discussion, let T be any kind of type (a fundamental type such as int, or a class type, aggregate as well as non-aggregate), and let t be a variable of automatic storage duration:
int main() {
T t; // default initialization
}
The declaration of t means that t will be initialized by means of default initialization. Default initialization acts differently for different kinds of types:
For fundamental types such as int, bool, float and so on, the effect is that t is left in an uninitialized state, and reading from it before explicitly initializing it (later) is undefined behavior.
For class types, overload resolution resolves to a default constructor (which may be implicitly or explicitly defined), which will initialize the object, but its data members could end up in an uninitialized state, depending on the definition of the selected default constructor.
For array types, every element of the array is default-initialized, following the rules above.
C++ initialization is complex, and there are many special rules and gotchas that can leave variables, or data members of variables, uninitialized, so that reading them results in UB.
Hence a long-standing recommendation is to always explicitly initialize your variables (with automatic storage duration), and not to rely on default initialization, which may or may not result in a fully initialized variable. A common approach is to (attempt to) use value initialization, by initializing the variable with empty braces:
int main() {
int i{}; // value-initialization -> zero-initialization
SomeAggregateClass ac{}; // aggregate initialization
}
However even this approach can fail for class types if one is not careful whether the class is an aggregate or not:
struct A {
A() = default; // not user-provided.
int a;
};
struct B {
B(); // user-provided.
int b;
};
// Out of line definition: a user-provided
// explicitly-defaulted constructor.
B::B() = default;
In this example (in C++11 through C++17), A is an aggregate, whereas B is not. This, in turn, means that initialization of B by means of an empty direct-list-init will result in its data member b being left in an uninitialized state. For A, however, the same initialization syntax will result in zero-initialization of its data member a (via aggregate initialization of the A object and subsequent value-initialization of its data member a):
A a{};
// Empty brace direct-list-init:
// -> A has no user-provided constructor
// -> aggregate initialization
// -> data member 'a' is value-initialized
// -> data member 'a' is zero-initialized
B b{};
// Empty brace direct-list-init:
// -> B has a user-provided constructor
// -> value-initialization
// -> default-initialization
// -> the explicitly-defaulted constructor will
// not initialize the data member 'b'
// -> data member 'b' is left in an uninitialized state
This may come as a surprise, and it carries the obvious risk of reading the uninitialized data member b, which is undefined behaviour:
A a{};
B b{}; // may appear as a sound and complete initialization of 'b'.
a.a = b.b; // reading uninitialized 'b.b': undefined behaviour.
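One way to avoid depending on whether the class is an aggregate is a default member initializer, which a defaulted constructor does apply. A minimal sketch, where SafeB and value_after_empty_braces are hypothetical names that are not part of the example above:

```cpp
#include <cassert>

// Hypothetical variant of B: same user-provided, explicitly-defaulted
// constructor, plus a default member initializer (C++11 and later).
struct SafeB {
    SafeB();      // user-provided, so SafeB is not an aggregate
    int b = 0;    // applied by the defaulted constructor
};
SafeB::SafeB() = default;

int value_after_empty_braces() {
    SafeB sb{};          // value-initialization calls the constructor
    assert(sb.b == 0);   // 'b' was set by the default member initializer
    return sb.b;
}
```

With the initializer on the member itself, the result no longer depends on where or how the constructor is defaulted.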
I have found multiple questions regarding the subject of defining variables inside a switch construct, but I have not yet found a clear answer to this question.
Chapter 5.3.2 of the book C++ Primer says the following:
As we’ve seen, execution in a switch can jump across case labels. When execution jumps to a particular case, any code that occurred inside the switch before that label is ignored.
Considering this information, I do not understand why the example below is legal. If control jumps to the false case, it should ignore the true case. This means that, assigning to i should be illegal, because it was never declared. Why is this construct legal?
case true:
int i;
break;
case false:
i = 42;
break;
Declaration is a compile-time thing, and what happens at runtime is irrelevant to that fact. i is visible at any point within the same or child scope after its declaration.
There is nothing that causes a scope change between the two cases, so i remains visible in the false case regardless of whether the true case executed.
This is why you may see anonymous blocks ({ }) used to artificially constrain scope in switch cases. It's to prevent exactly this potential issue (though in this case it's not an issue).
case true: {
int i;
break;
} // Closing this block causes the end of i's lifetime.
case false: {
i = 42; // Compile-time error; i is no longer in scope.
break;
}
Note that your code becomes illegal just by initializing i. A jump cannot enter the scope of a variable by skipping over its initialization.
case true:
int i = 0;
break;
case false: // error: jump to case label crosses initialization
i = 42;
break;
Also, a variable of a type that is not trivial cannot have a scope spanning cases, even if it is not explicitly initialized.
case true:
std::string i;
break;
case false: // error: jump to case label crosses initialization
i = "42";
break;
The fix in this case is to use anonymous blocks to constrain the scope of each declaration of i so that it does not span multiple cases.
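A minimal sketch of that fix follows. The function name describe and the int selector (1 and 0 standing in for true and false) are assumptions made for illustration; each case gets its own block, so each i may be initialized freely.

```cpp
#include <cassert>
#include <string>

// Hypothetical function: 1 and 0 stand in for the true/false cases above.
std::string describe(int which) {
    assert(which == 0 || which == 1);
    std::string result;
    switch (which) {
    case 1: {
        std::string i = "true case";  // scope of this i ends at the closing brace
        result = i;
        break;
    }
    case 0: {
        std::string i = "42";         // a different i; no jump crosses its initialization
        result = i;
        break;
    }
    }
    return result;
}
```

Because neither i is in scope at the other case label, the jump from the switch condition never bypasses an initialization.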
The relevant standardese:
It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps* from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer.
-- C++14 (N4296) [stmt.dcl.3]
The footnote (*) regarding jumps:
The transfer from the condition of a switch statement to a case label is considered a jump in this respect.
I have a program with a switch statement similar to this:
switch(n)
{
case 0:
/* stuff */
break;
int foo;
case 1:
foo = 5;
break;
case 2:
foo = 6;
break;
}
Notice the int foo; between case 0 and case 1. This statement is unreachable: if you walk through the program, you'll never step over it.
This compiles without warnings or errors with Clang, but it seemed to be jacked up when I ran it (though that could be due to other causes).
Is it well-defined behavior to declare a variable in an unreachable statement and use it in reachable statements, and is it going to work?
It is well-defined behavior as long as the variable has trivial construction, and it has (approximately) the same effect as if the variable was declared in a larger scope.
If any initialization is needed, you'll get an error.
Section 6.7 says:
It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer.
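The question's switch can be reduced to a small function to see this in action. A sketch, assuming the hypothetical function name pick and a result variable added only so the value can be observed:

```cpp
#include <cassert>

// Hypothetical reduction of the question's switch statement.
int pick(int n) {
    int result = -1;
    switch (n) {
    case 0:
        break;
        int foo;   // never executed, but in scope for the cases below
    case 1:
        foo = 5;   // fine: foo is a scalar declared without an initializer
        result = foo;
        break;
    case 2:
        foo = 6;
        result = foo;
        break;
    }
    return result;
}

// Small self-check; call it from main() to exercise the sketch.
void check_pick() {
    assert(pick(0) == -1);
    assert(pick(1) == 5);
    assert(pick(2) == 6);
}
```

Changing the declaration to int foo = 0; should turn the jumps to case 1 and case 2 into errors, as the quoted paragraph describes.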
Out of curiosity, I've tried this code, resulting from an interview question[*]
#include <cstdio>
int main(int argc, char *argv[])
{
int a = 1234;
printf("Outer: %d\n", a);
{
int a(a);
printf("Inner: %d\n", a);
}
}
When compiled on Linux (both g++ 4.6.3 and clang++ 3.0) it outputs:
Outer: 1234
Inner: -1217375632
However on Windows (VS2010) it prints:
Outer: 1234
Inner: 1234
The rationale would be that, until the copy-constructor of the second 'a' variable has finished, the first 'a' variable is still accessible. However I'm not sure if this is standard behaviour, or just a(nother) Microsoft quirk.
Any idea?
[*] The actual question was:
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
{
// Not at global scope here
int a = 1234;
{
int a;
// how do you set this a to the value of the containing scope a ?
}
}
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
Unless the outer scope can be explicitly named you cannot do this. You can explicitly name the global scope, namespace scopes, and class scopes, but not function or block statement scopes.
C++11 [basic.scope.pdecl] 3.3.2 p1 states:
The point of declaration for a name is immediately after its complete declarator (Clause 8) and before its initializer (if any), except as noted below. [ Example:
int x = 12;
{ int x = x; }
Here the second x is initialized with its own (indeterminate) value. —end example ]
MSVC correctly implements this example; however, it does not correctly implement the rule when the initializer uses parentheses instead of assignment syntax. There's a bug filed about this on Microsoft Connect.
Here's an example program with incorrect behavior in VS as a result of this bug.
#include <iostream>
int foo(char) { return 0; }
int foo(int) { return 1; }
int main()
{
char x = 'a';
{
int x = foo(static_cast<decltype(x)>(0));
std::cout << "'=' initialization has correct behavior? " << (x?"Yes":"No") << ".\n";
}
{
int x(foo(static_cast<decltype(x)>(0)));
std::cout << "'()' initialization has correct behavior? " << (x?"Yes":"No") << ".\n";
}
}
The C++ standard includes the following note.
[ Note: Operations involving indeterminate values may cause undefined behavior. —end note ]
However, this note indicates that operations may cause undefined behavior, not that they necessarily do. The above linked bug report includes an acknowledgement from Microsoft that this is a bug and not that the program triggers undefined behavior.
Edit: And now I've changed the example so that the object with indeterminate value is only 'used' in an unevaluated context, and I believe that this absolutely rules out the possibility of undefined behavior on any platform, while still demonstrating the bug in Visual Studio.
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
If you want to get technical about the wording, it's pretty easy. A "temporary" has a specific meaning in C++ (see §12.2); any named variable you create is not a temporary. As such, you can just create a local variable (which is not a temporary) initialized with the correct value:
int a = 1234;
{
int b = a;
int a = b;
}
An even more defensible possibility would be to use a reference to the variable in the outer scope:
int a = 1234;
{
int &ref_a = a;
int a = ref_a;
}
This doesn't create an extra variable at all -- it just creates an alias to the variable at the outer scope. Since the alias has a different name, we retain access to the variable at the outer scope, without defining a variable (temporary or otherwise) to do so. Many references are implemented as pointers internally, but in this case (at least with a modern compiler and optimization turned on) I'd expect it not to be -- that the alias really would just be treated as a different name referring to the variable at the outer scope (and a quick test with VC++ shows that it works this way -- the generated assembly language doesn't use ref_a at all).
Another possibility along the same lines would be like this:
const int a = 10;
{
enum { a_val = a };
int a = a_val;
}
This is somewhat similar to the reference, except that in this case there's not even room for argument about whether a_val could be called a variable -- it absolutely is not a variable. The problem is that an enumeration can only be initialized with a constant expression, so we have to define the outer variable as const for it to work.
I doubt any of these is what the interviewer really intended, but all of them answer the question as stated. The first is (admittedly) a pure technicality about definitions of terms. The second might still be open to some argument (many people think of references as variables). Though it restricts the scope, there's no room for question or argument about the third.
What you are doing, initializing a variable with itself, is undefined behavior. All your test cases got it right, this is not a quirk. An implementation could also initialize a to 123456789 and it would still be standard.
Update: The comments on this answer point that initializing a variable with itself is not undefined behavior, but trying to read such variable is.
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
You can't. As soon as the identical name is declared, the outer name is inaccessible for the rest of the scope. You'd need a copy or an alias of the outer variable, which means you'd need a temporary variable.
I'm surprised that, even with the warning level cranked up, VC++ doesn't complain on this line:
int a(a);
Visual C++ will sometimes warn you about hiding a variable (maybe that's only for members of derived classes). It's also usually pretty good about telling you you're using a value before it has been initialized, which is the case here.
Looking at the generated code, it happens to initialize the inner a to the same value as the outer a because that's what's left behind in a register.
I had a look at the standard, it's actually a grey area but here's my 2 cents...
3.1 Declarations and definitions [basic.def]
A declaration introduces names into a translation unit or redeclares names introduced by previous declarations.
A declaration is a definition unless... [non relevant cases follow]
3.3.1 Point of declaration
The point of declaration for a name is immediately after its complete declarator and before its initializer (if any), except as noted below [self-assignment example].
A nonlocal name remains visible up to the point of declaration of the local name that hides it.
Now, if we assume that this is the point of declaration of the inner 'a' (3.3.1/1)
int a (a);
^
then the outer 'a' should be visible up to that point (3.3.1/2), where the inner 'a' is defined.
Problem is that in this case, according to 3.1/2, a declaration IS a definition. This means the inner 'a' should be created. Until then, I can't understand from the standard whether the outer 'a' is still visible or not. VS2010 assumes that it is, and all that falls within the parentheses refers to the outer scope. However clang++ and g++ treat that line as a case of self-assignment, which results in undefined behaviour.
I'm not sure which approach is correct, but I find VS2010 to be more consistent: the outer scope is still visible until the inner 'a' is fully created.
In my mind, always, definition means storage allocation.
In the following code, int i allocates (typically) 4 bytes of storage on the program stack and binds it to i, and i = 3 assigns 3 to that storage. But because of the goto, the definition is bypassed, which means there is no storage allocated for i.
I have heard that local variables are allocated either at the entry of the function in which they reside (f() in this case) or at the point of definition.
But either way, how can i be used when it hasn't been defined yet (no storage at all)? Where is the value 3 stored when i = 3 executes?
void f()
{
goto label;
int i;
label:
i = 3;
cout << i << endl; //prints 3 successfully
}
Long story short: goto results in a runtime jump, whereas a variable definition/declaration results in storage being reserved at compile time.
The compiler sees the definition and decides how much storage to reserve for an int; it also arranges for that storage to be set to 3 when execution hits i = 3;.
That memory location will be there even if there is a goto at the start of your function, before the declaration/definition, just as in your example.
Very silly simile
If I place a log on the ground and my friend runs (with his eyes closed) and jumps over it, the log will still be there - even if he hasn't seen or felt it.
It's realistic to say that he could turn around (at a later time) and set it on fire, if he wanted to. His jump doesn't make the log magically disappear.
Your code is fine. The variable lives wherever it would live had the goto not been there.
Note that there are situations where you can't jump over a declaration:
C++11 6.7 Declaration statement [stmt.dcl]
3 It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A
program that jumps from a point where a variable with automatic storage duration is not in scope to a
point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default
constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the
preceding types and is declared without an initializer (8.5). [ Example:
void f()
{
// ...
goto lx; // ill-formed: jump into scope of `a'
// ...
ly:
X a = 1;
// ...
lx:
goto ly; // ok, jump implies destructor
// call for `a' followed by construction
// again immediately following label ly
}
—end example ]
Definitions are not executable code. They are just instructions to the compiler, letting it know the size and the type of the variable. In this sense, the definition is not bypassed by the goto statement.
If you use a class with a constructor instead of an int, the call of the constructor would be bypassed by the goto, but the storage would be allocated anyway. The class instance would remain uninitialized, however, so using it before control reaches its definition/initialization line is an error.
In my mind, always, definition means storage allocation.
This is not correct. The storage for the variable is reserved by the compiler when it creates the stack-layout for the function. The goto just bypasses the initialization. Since you assign a value before printing, everything is fine.
Control flow has nothing to do with a variable's storage, which is reserved at compile time by the compiler.
The goto statement only affects the dynamic initialization of the object. For built-in types and POD types this doesn't matter, for they can remain uninitialized. However, for non-POD types it results in a compilation error. For example, see this:
struct A{ A(){} }; //it is a non-POD type
void f()
{
goto label;
A a; //error - you cannot skip this!
label:
return;
}
Error:
prog.cpp: In function ‘void f()’:
prog.cpp:8: error: jump to label ‘label’
prog.cpp:5: error: from here
prog.cpp:6: error: crosses initialization of ‘A a’
See here : http://ideone.com/p6kau
In this example A is a non-POD type because it has a user-defined constructor, which means the object needs to be dynamically initialized; since the goto statement attempts to skip this, the compiler generates an error, as it should.
Please note that objects of only built-in types and POD types can remain uninitialized.
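If the jump is really needed, one way to keep such an object is to confine it to its own block so that the goto never enters its scope. A sketch with a hypothetical type Counted (standing in for A) whose constructor bumps a counter so the effect is observable:

```cpp
#include <cassert>

// Hypothetical non-POD type: the constructor counts how often it runs.
struct Counted {
    static int constructed;
    Counted() { ++constructed; }
};
int Counted::constructed = 0;

// Returns how many Counted objects were constructed during the call.
int f(bool skip) {
    Counted::constructed = 0;
    if (skip)
        goto label;   // legal: label lies outside the scope of `c`
    {
        Counted c;    // scope ends at the closing brace, before the label
    }
label:
    return Counted::constructed;
}

// Small self-check; call it from main() to exercise the sketch.
void check_f() {
    assert(f(true) == 0);   // jump taken, constructor never ran
    assert(f(false) == 1);  // no jump, constructor ran once
}
```

Here the goto skips the whole block rather than jumping into the scope of an object whose constructor has not run, so the compiler has nothing to object to.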
To make it short, variable declaration is lexical, i.e. pertaining to the lexical {}-enclosed blocks. The binding is valid from the line it is declared to the end of the block. It is unaffected by flow control (goto).
Assignment to local (stack) variables, on the other hand, is a runtime operation performed when control flow gets there. So goto has an influence on that.
Things get a bit more tricky when object construction becomes involved, but that's not your case here.
The position of the declaration of i is irrelevant to the compiler. You can prove this to yourself by compiling your code with int i before the goto and then after and comparing the generated assembly:
g++ -S test_with_i_before_goto.cpp -o test1.asm
g++ -S test_with_i_after_goto.cpp -o test2.asm
diff -u test1.asm test2.asm
The only difference in this case is the source file name (.file) reference.
The definition of a variable DOES NOT allocate memory for the variable. It does tell the compiler to prepare appropriate memory space to store the variable, but the memory is not allocated at the moment control passes the definition.
What really matters here is initialization.