I'm working my way through Accelerated C++ right now, and I've encountered a fundamental lack of understanding regarding scope and code blocks on my part.
There is an exercise at the end of chapter 1 that wants you to decide whether this code is will run:
#include <iostream>
#include <string>
int main()
{
{
const std::string s = "a string";
std::cout << s << std::endl;
{
const std::string s = "another string";
std::cout << s << std::endl;
}
}
return 0;
}
I was sure it wouldn't, but it does. My hobby programming experience was that variables declared in a block are available to other blocks contained within it, but not those outside of it.
And that must be at least half-true, since removing the second declaration of s will output "a string" twice, giving me the impression that s as declared in the second block is also present in the third block.
I also tried removing the braces of the third block entirely, resulting in the compilation error I expected in the first place. But how is that different from declaring a constant that already exists in the scope of the third block? Does the declaration of a constant only carry over to a smaller scope if there is no second declaration in that smaller scope?
I've gone through everything in the book up until this point again to see if I missed something, but I can't find any information on how variable and const declarations are affected by curly braces.
This doesn't just apply to constants though, that doesn't matter.
But how is that different from declaring a constant that already exists in the scope of the third block?
You are introducing another scope, where the variable s has not been defined, so it is perfectly legal to define one. If you remove one, you'll get a redefinition error, because you already have an s in the same scope.
Does the declaration of a constant only carry over to a smaller scope if there is no second declaration in that smaller scope?
Not really. Your second s is shadowing the first one. Technically, both of them exist, but you have no way of accessing the first one. Sometimes you do, with the help of the scope resolution operator, but in your case, no.
// global scope
int a;
void f() {
int a = 0;
a = 4; // local 'a'.
::a = 4; // global 'a'.
}
I can't find any information on how variable and const declarations are affected by curly braces.
Curly braces in general introduce a new scope (although there are some exceptions). As long as a variable with a given name is not defined in the current scope, you can define it. It doesn't matter if there is a variable with the same name in a scope outside of it, but your compiler will likely warn you about it.
Java and C++ are one of the few that allows this.
C# does not allow this.
It is called Variable Shadowing. If you declare a variable with same name in a smaller inner block as outer blocks variable then you get name masking.
So it doesnt matter that you have a const variable because the inner variable is a different variable altogether.
Related
I tend to use the words define, declare and assign interchangeably but this seems to cause offense to some people. Is this justified? Should I only use the word declare for the first time I assign to a variable? Or is there more to it than that?
A definition is where a value or function is described, i.e. the compiler or programmer is told precisely what it is, e.g.
int foo()
{
return 1;
}
int var; // or, e.g. int var = 5; but this is clearer.
A declaration tells the compiler, or programmer that the function or variable exists. e.g.
int foo();
extern int var;
An assignment is when a variable has its value set, usually with the = operator. e.g.
a = b;
a = foo();
Define and declare are similar but assign is very different.
Here I am declaring (or defining) a variable:
int x;
Here I am assigning a value to that variable:
x = 0;
Here I am doing both in one statement:
int x = 0;
Note
Not all languages support declaration and assignment in one statement:
T-SQL
declare x int;
set x = 0;
Some languages require that you assign a value to a variable upon declaration. This requirement allows the compiler or interpreter of the language to infer a type for the variable:
Python
x = 0
It is important to use the correct terminology, otherwise people will not know what you are talking about, or incorrectly assume that you don't know what you are talking about.
These terms often have precise meanings in the standards for various languages. When that is the case they should not be conflated.
In c for instance:
a function may be defined only once (when you say what it does), but it may also be declared before that (when you say what arguments it takes and what type it returns).
likewise a variable is declared when you say what type it is, and this happens only once for each scope. But you may assign a value repeatedly. (Some languages also differentiate between initialization (giving a variable a value at declaration time) and assignment (changing the value later).)
General Role:
Definition = declaration + reserved space.
Definition, declaration, and assignment have two cases:
for Variables.
for Functions.
For Variables:
-- Definition:
To tell the compiler to reserve memory for the variable.
int x;
-- Declaration:
To tell the compiler that the variable defined in somewhere else.
extern int x;
-- Assignment:
To tell the compiler to put the value in the variable.
x = 0;
For Functions:
-- Definition:
int functionDef(int x){
int x;
...
...
...
return x;
}
-- Declaration:
It is just the prototype of the function.
int functionDef(int x);
The differences can seem subtle, but they are important. Not every language makes the same distinctions, but in C++ a variable declaration makes the type and name of the variable known to the compiler
int i;
A variable definition allocates storage and specifies an initial value for the variable.
i = 1;
You can combine a variable declaration and definition into one statement, as is commonly done.
int x = 1;
Declaring a variable inside a function will also set aside memory for the variable, so the following code implicitly defines variable a as a part of its declaration.
int main()
{
int a;
return 0;
}
Since variable a is automatically defined by the compiler, it will contain whatever value was in the memory location that was allocated for it. This is why it is not safe to use automatic variables until you've explicitly assigned a known value to them.
An assignment takes place any time you change the value of a variable in your program.
x = 2;
x++;
x += 4;
A function declaration, similar to the variable declaration, makes the function signature known to the compiler. This allows you to call a function in your source code before it is defined without causing a compiler error.
int doSomething(float x);
A function definition specifies the return type, name, parameter list, and instructions for a function. The first three of these elements must match the function declaration. A function must only be defined once in a given program.
int doSomething(float x)
{
if( x < 0 )
{
x = -x;
}
return static_cast<int>(x);
}
You can combine the function decalartion and definition into one, but you must do so before the function is called anywhere in your program.
It might depend on the language, as has been said. I think it really depends on whether the words are used for things like classes. For most of the data types discussed here, the question might not have much relevance. In C++ (see c++ - What is the difference between a definition and a declaration?), a class or struct always has precisely one definition but can be declared zero or more times. A class cannot be declared without a definition. So "declared" might be synonymous with "used".
In most languages, simple types such as integers do not need definitions in the manner that classes do.
The correct answer depends on which language you're talking about. Computer languages often have specific terminology, either because of the language specification or the community grown up around the language. COBOL, back when I used it, had a much different terminology than more mainstream languages (in the sense of languages closer to the mainstream of language development, not mainstream business). Forth developed some strange terminology.
If you know English, you can usually get a good idea as to what a word means from its normal meaning, but never count on it too much. The same is true with specific words across languages or language communities.
I have encountered a problem in my learning of C++, where a local variable in a function is being passed to the local variable with the same name in another function, both of these functions run in main().
When this is run,
#include <iostream>
using namespace std;
void next();
void again();
int main()
{
int a = 2;
cout << a << endl;
next();
again();
return 0;
}
void next()
{
int a = 5;
cout << a << endl;
}
void again()
{
int a;
cout << a << endl;
}
it outputs:
2
5
5
I expected that again() would say null or 0 since 'a' is declared again there, and yet it seems to use the value that 'a' was assigned in next().
Why does next() pass the value of local variable 'a' to again() if 'a' is declared another time in again()?
http://en.cppreference.com/w/cpp/language/ub
You're correct, an uninitialized variable is a no-no. However, you are allowed to declare a variable and not initialize it until later. Memory is set aside to hold the integer, but what value happens to be in that memory until you do so can be anything at all. Some compilers will auto-initialize variables to junk values (to help you catch bugs), some will auto-initialize to default values, and some do nothing at all. C++ itself promises nothing, hence it's undefined behavior. In your case, with your simple program, it's easy enough to imagine how the compiler created assembly code that reused that exact same piece of memory without altering it. However, that's blind luck, and even in your simple program isn't guaranteed to happen. These types of bugs can actually be fairly insidious, so make it a rule: Be vigilant about uninitialized variables.
An uninitialized non-static local variable of *built-in type (phew! that was a mouthful) has an indeterminate value. Except for the char types, using that value yields formally Undefined Behavior, a.k.a. UB. Anything can happen, including the behavior that you see.
Apparently with your compiler and options, the stack area that was used for a in the call of next, was not used for something else until the call of again, where it was reused for the a in again, now with the same value as before.
But you cannot rely on that. With UB anything, or nothing, can happen.
* Or more generally of POD type, Plain Old Data. The standard's specification of this is somewhat complicated. In C++11 it starts with §8.5/11, “If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value.”. Where “automatic … storage duration” includes the case of local non-static variable. And where the “no initialization” can occur in two ways via §8.5/6 that defines default initialization, namely either via a do-nothing default constructor, or via the object not being of class or array type.
This is completely coincidental and undefined behavior.
What's happened is that you have two functions called immediately after one another. Both will have more or less identical function prologs and both reserve a variable of exactly the same size on the stack.
Since there are no other variables in play and the stack is not modified between the calls, you just happen to end up with the local variable in the second function "landing" in the same place as the previous function's local variable.
Clearly, this is not good to rely upon. In fact, it's a perfect example of why you should always initialize variables!
I wonder why I can not declare a variable inside the following expression.
string finalgrade = ( ( int grade = 100 ) < 60 ) ? "fail" : "pass";
While we can declare a variable inside a for statement.
In C++, declarations are only allowed within a declaration statement and within the control structures if, while and for.
Since the purpose of a declaration is to introduce a name, a declaration only makes sense when there is a containing scope in which the name is visible, and that makes those options the only sensible ones. Declaration statements introduce the name into the surrounding scope, and the three control structures each contain their own, inner scopes into which the respective declarations introduce the name.
It's just a matter of how the syntax of C++ is defined. You can't put statements in expressions and declaring a variable is a decl-statement. A for loop takes a decl-statement as its initializer, so a variable can be declared there.
You could theoretically have a syntax where the code you wrote would work. But it would probably be fairy confusing. What's the scope of the declared variable? And programmers would be required to look deep into expressions to see whether a variable wasn't secretly declared somewhere.
I think it may correspond to the fact that in C++ specs the declaration/initialization statement like int var = val is a piece of code that is basically equivalent to two others: int var which is statement and var = 5 which is expression and gives a chunk of memory to a variable with the assigned value. So, basically what you are trying to do is to compare statement that has no return value with a number.
To illustrate it better try to run the following code:
int main(){
int i;
if ((i = 100) > 60){
std::cout << "It worked!";
}
return 0;
}
You will see that this code compiles since now you compare expression i = 100 that returns value and a number.
I ran into a case where I had to swap out the value of a certain object. Due to my own sloppy copy and paste, I accidentally copied the type declaration as well. Here is a simplified example:
int main()
{
int i = 42;
cout << "i = " << i++ << endl;
// ... much later
if( isSwapRequired == true )
{
int i = 24;
cout << "i = " << i++ << endl;
}
cout << "i = " << i++ << endl;
}
To my dismay, the compiler did not catch this and further went on to let i = 24 live in its own little scope. Then later, it turns out that outside the scope, i remains as 43. I noticed that if both i were in the same level, then the compiler would obligingly catch this mistake. Is there a reason for the compiler to treat the multiple declarations differently?
If it matters, I am using VS10.
This program is perfectly valid and correct as per rules laid out by the standard, the compiler does not need to catch anything, there is nothing to catch.
The standard allows same named variables to exist in their respective scopes and it clearly defines the rules as to which variable will be referenced when you use them in particular scope.Same named variables hide or shadow the variables at global scope.
Within in your local scope(within the conditional if block) the locally declared i hides the global i. If you need to access global i within this scope you need to use ::i.
Outside the conditional block, the only i that exists is the globally declared i.
Answer to question in comments:
Though compilers don't really have to warn of this, most compilers will provide you this diagnostic if you compile your program with highest warning level enabled or you explicitly tell the compiler to warn of this specific behavior.
For GCC, you can use -Wshadow.
-Wshadow
Warn whenever a local variable or type declaration shadows another variable, parameter, type, or class member (in C++), or whenever a built-in function is shadowed. Note that in C++, the compiler warns if a local variable shadows an explicit typedef, but not if it shadows a struct/class/enum.
This is no multiple declaration, because each of your is has a different scope and the local scope is always in favor of the global scope.
If you want to use your i from top-level of main() use ::i.
See here for a tutorial.
Out of curiosity, I've tried this code, resulting from an interview question[*]
int main(int argc, char *argv[])
{
int a = 1234;
printf("Outer: %d\n", a);
{
int a(a);
printf("Inner: %d\n", a);
}
}
When compiled on Linux (both g++ 4.6.3 and clang++ 3.0) it outputs:
Outer: 1234
Inner: -1217375632
However on Windows (VS2010) it prints:
Outer: 1234
Inner: 1234
The rationale would be that, until the copy-constructor of the second 'a' variable has finished, the first 'a' variable is still accessible. However I'm not sure if this is standard behaviour, or just a(nother) Microsoft quirk.
Any idea?
[*] The actual question was:
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
{
// Not at global scope here
int a = 1234;
{
int a;
// how do you set this a to the value of the containing scope a ?
}
}
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
Unless the outer scope can be explicitly named you cannot do this. You can explicitly name the global scope, namespace scopes, and class scopes, but not function or block statement scopes.
C++11 [basic.scope.pdecl 3.3.2 p1 states:
The point of declaration for a name is immediately after its complete declarator (Clause 8) and before its initializer (if any), except as noted below. [ Example:
int x = 12;
{ int x = x; }
Here the second x is initialized with its own (indeterminate) value. —end example ]
MSVC correctly implements this example, however it does not correctly implement this when the initializer uses parentheses instead of assignment syntax. There's a bug filed about this on microsoft connect.
Here's an example program with incorrect behavior in VS as a result of this bug.
#include <iostream>
int foo(char) { return 0; }
int foo(int) { return 1; }
int main()
{
char x = 'a';
{
int x = foo(static_cast<decltype(x)>(0));
std::cout << "'=' initialization has correct behavior? " << (x?"Yes":"No") << ".\n";
}
{
int x(foo(static_cast<decltype(x)>(0)));
std::cout << "'()' initialization has correct behavior? " << (x?"Yes":"No") << ".\n";
}
}
C++ includes the following note.
[ Note: Operations involving indeterminate values may cause undefined behavior. —end note ]
However, this note indicates that operations may cause undefined behavior, not that they necessarily do. The above linked bug report includes an acknowledgement from Microsoft that this is a bug and not that the program triggers undefined behavior.
Edit: And now I've changed the example so that the object with indeterminate value is only 'used' in an unevaluated context, and I believe that this absolutely rules out the possibility of undefined behavior on any platform, while still demonstrating the bug in Visual Studio.
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
If you want to get technical about the wording, it's pretty easy. A "temporary" has a specific meaning in C++ (see §12.2); any named variable you create is not a temporary. As such, you can just create a local variable (which is not a temporary) initialized with the correct value:
int a = 1234;
{
int b = a;
int a = b;
}
An even more defensible possibility would be to use a reference to the variable in the outer scope:
int a = 1234;
{
int &ref_a = a;
int a = ref_a;
}
This doesn't create an extra variable at all -- it just creates an alias to the variable at the outer scope. Since the alias has a different name, we retain access to the variable at the outer scope, without defining a variable (temporary or otherwise) to do so. Many references are implemented as pointers internally, but in this case (at least with a modern compiler and optimization turned on) I'd expect it not to be -- that the alias really would just be treated as a different name referring to the variable at the outer scope (and a quick test with VC++ shows that it works this way -- the generated assembly language doesn't use ref_a at all).
Another possibility along the same lines would be like this:
const int a = 10;
{
enum { a_val = a };
int a = a_val;
}
This is somewhat similar to the reference, except that in this case there's not even room for argument about whether a_val could be called a variable -- it absolutely is not a variable. The problem is that an enumeration can only be initialized with a constant expression, so we have to define the outer variable as const for it to work.
I doubt any of these is what the interviewer really intended, but all of them answer the question as stated. The first is (admittedly) a pure technicality about definitions of terms. The second might still be open to some argument (many people think of references as variables). Though it restricts the scope, there's no room for question or argument about the third.
What you are doing, initializing a variable with itself, is undefined behavior. All your test cases got it right, this is not a quirk. An implementation could also initialize a to 123456789 and it would still be standard.
Update: The comments on this answer point that initializing a variable with itself is not undefined behavior, but trying to read such variable is.
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
You can't. As soon as the identical name is declared, the outer name is inaccessible for the rest of the scope. You'd need a copy or an alias of the outer variable, which means you'd need a temporary variable.
I'm surprised that, even with the warning level cranked up, VC++ doesn't complain on this line:
int a(a);
Visual C++ will sometimes warn you about hiding a variable (maybe that's only for members of derived classes). It's also usually pretty good about telling you you're using a value before it has been initialized, which is the case here.
Looking at the code generated, it happens to initialize the inner a to the same value of the outer a because that's what's left behind in a register.
I had a look at the standard, it's actually a grey area but here's my 2 cents...
3.1 Declarations and definitions [basic.def]
A declaration introduces names into a translation unit or redeclares names introduced by previous declarations.
A declaration is a definition unless... [non relevant cases follow]
3.3.1 Point of declaration
The point of declaration for a name is immediately after its complete declarator and before its initializer (if any), except as noted below [self-assignment example].
A nonlocal name remains visible up to the point of declaration of the local name that hides it.
Now, if we assume that this is the point of declaration of the inner 'a' (3.3.1/1)
int a (a);
^
then the outer 'a' should be visible up to that point (3.3.1/2), where the inner 'a' is defined.
Problem is that in this case, according to 3.1/2, a declaration IS a definition. This means the inner 'a' should be created. Until then, I can't understand from the standard whether the outer 'a' is still visible or not. VS2010 assumes that it is, and all that falls within the parentheses refers to the outer scope. However clang++ and g++ treat that line as a case of self-assignment, which results in undefined behaviour.
I'm not sure which approach is correct, but I find VS2010 to be more consistent: the outer scope is still visible until the inner 'a' is fully created.