The example below compiles fine apart from the commented-out last line, which produces the error shown. I'd like to know the ins and outs of this 'scoping' within a scope, and what it is called, if it has a name.
Consider these brackets:
void func()
{
int i = 0;
{ // nice comment to describe this scope
while( i < 10 )
++i;
}
{ // nice comment to describe this scope
int j= 0;
while( j< 10 )
++j;
}
i = 0; // OK
// j = 0; // error C2065
}
consider this:
error C2065: 'j' : undeclared identifier
edit:
Accepted answer is from bitmask, although I think everyone should place it in the context of anio's answer. Especially, quote: "perhaps you should break your function into 2 functions"
Do. By all means!
Keeping data as local as possible and as const as possible has two main advantages:
side effects are reduced and the code becomes more functional
with complex objects, destructors can be invoked early within a function, as soon as the data is no longer needed
Additionally, this can be useful for documentation to summarise the job a particular part of a function does.
I've heard this being referred to as explicit or dummy scoping.
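To illustrate the point about destructors running early, here is a minimal sketch. The Tracer type and run_scoped function are hypothetical names invented for this example; they only record the order in which things happen.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that records when it is constructed and destroyed.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    Tracer(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("ctor " + name);
    }
    ~Tracer() { log.push_back("dtor " + name); }
};

// Returns the sequence of events so the ordering can be inspected.
std::vector<std::string> run_scoped() {
    std::vector<std::string> log;
    {   // use the object only inside this block
        Tracer big(log, "big");
        log.push_back("use big");
    }   // big is destroyed here, before the rest of the function runs
    log.push_back("rest of function");
    return log;
}
```

Without the inner braces, the destructor would only run when the function returns, after "rest of function" has been logged.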
I personally don't find much value in adding additional scoping within a function. If you are relying on it to separate parts of your function, perhaps you should break your function into 2 functions. Smaller functions are better than larger ones. You should strive to have small easily understood functions.
The one legitimate use of scopes within a function is for limiting the duration of a lock:
int doX()
{
// Do some work
{
//Acquire lock
} // Lock automatically released because of RAII
}
The inner scope effectively limits the code over which the lock is held. I believe this is common practice.
Yes, definitely - it's a great habit to always keep your variables as local as possible! Some examples:
for (std::string line; std::getline(std::cin, line); ) // does not
{ // leak "line"
// process "line" // into ambient
} // scope
int result = 1; // initialized so that *= below is well-defined
{ // putting this in a separate scope
int a = foo(); // allows us to copy/paste the entire
a += 3; // block without worrying about
int b = bar(a); // repeated declarators
result *= (a + 2*b);
}
{ // ...and we never really needed
int a = foo(); // a and b outside of this anyway!
a += 3;
int b = bar(a);
result *= (a + 2*b);
}
Sometimes a scope is necessary for synchronisation, and you want to keep the critical section as short as possible:
int global_counter = 0;
std::mutex gctr_mx;
void I_run_many_times_concurrently()
{
int a = expensive_computation();
{
std::lock_guard<std::mutex> _(gctr_mx);
global_counter += a;
}
expensive_cleanup();
}
The explicit scoping is usually not done for commenting purposes, but I don't see any harm in doing it if you feel it makes your code more readable.
Typical usage is for avoiding name clashes and controlling when the destructors are called.
A pair of curly braces defines a scope. Names declared or defined within a scope are not visible outside that scope, which is why j is not defined at the end. If a name in a scope is the same as a name defined earlier and outside that scope, it hides the outer name.
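The hiding rule can be seen in a small sketch. The function name is made up for this example; the asserts only document what each i refers to.

```cpp
#include <cassert>

// Hypothetical demo: returns the outer i after an inner block has hidden it.
int outer_value_after_inner_block() {
    int i = 1;
    {
        int i = 2;        // a new variable that hides the outer i
        i += 10;          // changes only the inner i
        assert(i == 12);
    }                     // the inner i ends here
    return i;             // the outer i again, unchanged
}
```

The two variables merely share a name; nothing done to the inner one affects the outer one.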
Related
How to get the following code to work?
int main(){
bool flag = true;
if(flag){
// note that I donot want to define variable globally.
int a = 5;
}
if(flag){
// but I still want to use this local variable within the same condition.
a++;
}
}
Note that I don't want to define this variable globally or use a static variable.
I'm curious if there is a way for c++ to make local variables available in all regions with the same conditions?
What you are literally asking for is a local variable that is not a local variable. That's not possible.
On the other hand, what you basically want is data + code, and that's a class. If you wrap it in a class, your function can look like this:
int main(){
Foo f;
f.doSomething();
}
And the class can be this
struct Foo {
bool flag = false;
int a = 0;
void doSomething() {
if (flag) ++a;
}
};
What you're directly asking for isn't possible. You'll have to declare something up front. If it's about avoiding construction of objects until you have some relevant detail, you could use std::optional
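#include <optional>
int main()
{
bool flag = true; // assumed here; the question's flag
std::optional<int> a;
if(flag)
{
a = 10;
}
if(a)
{
++*a; // increment the contained value (*a++ would parse as *(a++))
}
}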
You set the variable's scope to be the scope you want it to be.
int main(){
bool flag = true;
// declare it in this scope if you want it to persist
// thru this scope
int a;
if(flag){
a = 5;
}
if(flag){
a++;
}
}
No.
According to section 6.4.3, basic.scope.block:
1 Each
(1.1) selection or iteration statement ([stmt.select], [stmt.iter]),
[...]
(1.4) compound statement ([stmt.block]) that is not the compound-statement of a handler
introduces a block scope that includes that statement or handler.
A variable that belongs to a block scope is a block variable.
According to section 6.7.5.4, basic.stc.auto, clause 1:
Variables that belong to a block or parameter scope and are not explicitly declared static, thread_local, or extern have automatic storage duration. The storage for these entities lasts until the block in which they are created exits.
I am writing a toy compiler that compiles a C/C++-like language to C++.
I am using bison, but with this structure it is hard to handle a variable going out of scope.
In the source language, there can be only one variable with a given name in the whole main function. That is fine, but there is one problem I cannot solve.
I cannot generate C++ code like this, because the C++ compiler reports a semantic error:
'var' was not declared in this scope.
int main()
{
if (true)
{
int var = 4;
}
if (true)
{
var = 5;
}
}
The source language has while, if and if/else statements.
I have to report a semantic error if a declared variable is assigned outside its scope.
For example:
This should be a semantic error, so I must not generate this code:
int main()
{
while(0)
{
int var = 1;
}
if (1)
{
var = 2;
}
}
This also has to be a semantic error:
int main()
{
if (0)
{
int var = 1;
}
else
{
if (1)
{
var = 5;
}
}
}
And this is allowed, I can generate this code:
int main()
{
if (0)
{
}
else
{
int var = 1;
if (1)
{
while (0)
{
while (0)
{
var = 2;
}
}
}
}
}
I have tried a lot of things, but I cannot solve this when there are nested if, if/else or while statements.
I have read tons of tutorials about symbol tables, but none of them explains properly how to manage a variable once it goes out of scope.
If you are familiar with this topic and with bison, please do not just give me hints like "use a stack, and mark a variable if it goes out of scope". I have found a lot of articles about that.
Instead, please give me pseudocode or a concrete implementation sketch.
I think it cannot be that difficult, because, as I wrote, there can be only one variable with a given name in the whole main function.
Symbol table:
struct SymbolType
{
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};
class Symbol
{
public:
void AddVariable(int _lineNumber, std::string _identifier, int _identifierValue, Type _type, int _functionArgumentNumber, Type _functionReturnType, Type _markType, bool _outOfScope);
void AddMarker(int _lineNumber, std::string _scopeName, Type _markType);
bool FindVariable(std::string _identifier);
int FindVariableValue(std::string _identifier);
void Remove();
void Print();
std::vector<SymbolType> symbolTable;
private:
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};
Now let's assume the following: While you are in a nested scope you cannot add a variable to a parent scope. So we can work e.g. with a stack like structure (push/pop at the end only suffices, but with read access to all entries – the latter requirement disqualifying std::stack, so we'd operate e.g. on std::vector instead).
1. Encountering the declaration of a new variable: Run up the entire stack to see if that variable exists already. If so, issue an error ('duplicate declaration/definition').
2. Encountering accessing a variable: Run up the entire stack to see if that variable exists; if not, issue an error ('not declared/defined' – I wouldn't differentiate between the variable not having been defined ever or having left the scope).
3. On leaving a scope, run up the stack and remove any variable that resides in that scope.
To be able to do 3. you have (at least) two options:
With every stack entry, store an identifier for the respective scope; this could be a simple counter. On leaving a scope, delete all variables that carry the current counter value. If you fear the counter might overflow, decrement it on leaving a scope as well (it then always represents the current scope depth).
Have a sentinel type – that one would be pushed to the stack on opening a new scope, compare unequal to any variable name, and on leaving a scope you'd delete all variables until you encounter the sentinel – and the sentinel itself.
This processing makes your outOfScope member obsolete.
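The counter-based variant could be sketched roughly as follows. This is only an illustration under assumptions: the class and member names (ScopedSymbols, enterScope, leaveScope, declare, isVisible) are invented here, only names are stored, and the calls would have to be made from whichever grammar actions open and close a block in your bison file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the counter-based variant; all names here are hypothetical.
class ScopedSymbols {
public:
    // Call when a block opens.
    void enterScope() { ++depth_; }

    // Call when a block closes: drop every variable declared in that block.
    void leaveScope() {
        assert(depth_ > 0);
        while (!entries_.empty() && entries_.back().depth == depth_) {
            entries_.pop_back();
        }
        --depth_;
    }

    // Returns false if the name is already visible (duplicate declaration).
    bool declare(const std::string& name) {
        if (isVisible(name)) return false;
        entries_.push_back(Entry{name, depth_});
        return true;
    }

    // True if the name is declared in the current or an enclosing scope.
    bool isVisible(const std::string& name) const {
        for (const Entry& e : entries_) {
            if (e.name == name) return true;
        }
        return false;
    }

private:
    struct Entry {
        std::string name;
        int depth;  // scope depth at which the variable was declared
    };
    std::vector<Entry> entries_;
    int depth_ = 0;
};
```

An assignment such as var = 5 would then be accepted only if isVisible("var") is true at that point. After the block that declared var has been closed, the lookup fails and the semantic error can be reported.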
Your addVariable function takes too many parameters, by the way – why should a variable need a return type, for instance???
I recommend adding multiple functions for every specific semantic type your language might provide (addVariable, addFunction, ...). Each function accepts what actually is necessary to be configurable and sets the rest to appropriate defaults (e.g. Type to Function within addFunction).
Edit – processing the example from your comment:
while(condition)
{ // opens a new scope:
// either place a sentinel or increment the counter (scope depth)
int var = 1; // push an entry onto the stack
// (with current counter value, if not using sentinels)
if (true)
{ // again a new scope, see above
var = 2; // finds 'var' on the stack, so fine
} // closes a scope:
// either you find immediately the sentinel, pop it from the stack
// and are done
//
// or you discover 'var' having a counter value less than current
// counter so you are done as well
} // closing next scope:
// either you find 'var', pop it from the stack, then the sentinel,
// pop it as well and are done
//
// or you discover 'var' having same counter value as current one,
// so pop it, then next variable has lower counter value again or the
// stack is empty, thus you decrement the counter and are done again
I know that in C++ I can declare variables outside of any function, but I cannot run any code or statements there, except as part of initializing global/static variables.
IDEA
Is it a good idea to use below tricky code in order to (for example) do some std::map manipulation ?
Here I use void *fakeVar and initialize it through Fake::initializer() and do whatever I want in it !
std::map<std::string, int> myMap;
class Fake
{
public:
static void* initializer()
{
myMap["test"]=222;
// Do whatever with your global Variables
return NULL;
}
};
// myMap["Error"] = 111; => Error
// Fake::initializer(); => Error
void *fakeVar = Fake::initializer(); //=> OK
int main()
{
std::cout<<"Map size: " << myMap.size() << std::endl; // Show myMap has initialized correctly :)
}
One way of solving it is to have a class with a constructor that does things, then declare a dummy variable of that class. Like
struct Initializer
{
Initializer()
{
// Do pre-main initialization here
}
};
Initializer initializer;
You can of course have multiple such classes doing miscellaneous initialization. The order in each translation unit is specified to be top-down, but the order between translation units is not specified.
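As a small sketch of the top-down ordering within one translation unit (all names below are made up for illustration): each Step object appends its id to a log while it is being constructed, and the log itself is a function-local static so that it exists before its first use.

```cpp
#include <vector>

// Hypothetical log; a function-local static, so it is created on first use.
std::vector<int>& init_log() {
    static std::vector<int> log;
    return log;
}

// Hypothetical initializer type that records when its constructor runs.
struct Step {
    explicit Step(int id) { init_log().push_back(id); }
};

// Same translation unit: these are initialized in order of definition.
Step first_step{1};
Step second_step{2};
```

If first_step and second_step lived in different source files, their relative order would not be specified.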
You don't need a fake class... you can initialize using a lambda
auto myMap = []{
std::map<std::string, int> m;
m["test"] = 222;
return m;
}();
Or, if it's just plain data, initialize the map:
std::map<std::string, int> myMap { { "test", 222 } };
Is it a good idea to use below tricky code in order to (for example)
do some std::map manipulation ?
No.
Any solution entailing mutable non-local variables is a terrible idea.
Is it a good idea...?
Not really. What if someone decides that in their "tricky initialisation" they want to use your map, but on some system or other, or for not obvious reason after a particular relink, your map ends up being initialised after their attempted use? If you instead have them call a static function that returns a reference to the map, then it can initialise it on first call. Make the map a static local variable inside that function and you stop any accidental use without this protection.
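A minimal sketch of that suggestion, assuming an accessor named getMyMap (a name invented here): the map is a static local variable, so it is initialized the first time the function is called, regardless of what other initialization code runs before or after.

```cpp
#include <map>
#include <string>

// Hypothetical accessor: the map is built on the first call and reused afterwards.
std::map<std::string, int>& getMyMap() {
    static std::map<std::string, int> instance = [] {
        std::map<std::string, int> m;
        m["test"] = 222;
        return m;
    }();
    return instance;
}
```

Callers write getMyMap()["key"] instead of touching a global directly, so there is no way to reach the map before it has been initialized.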
§ 8.5.2 states
Except for objects declared with the constexpr specifier, for which
see 7.1.5, an initializer in the definition of a variable can consist
of arbitrary expressions involving literals and previously declared
variables and functions, regardless of the variable’s storage duration
therefore what you're doing is perfectly allowed by the C++ standard. That said, if you need to perform "initialization operations" it might be better to just use a class constructor (e.g. a wrapper).
What you've done is perfectly legal C++. So, if it works for you and is maintainable and understandable by anybody else who works with the code, it's fine. Joachim Pileborg's sample is clearer to me though.
One problem with initializing global variables like this can occur if they use each other during initialization. In that case it can be tricky to ensure that variables are initialized in the correct order. For that reason, I prefer to create InitializeX, InitializeY, etc functions, and explicitly call them in the correct order from the Main function.
Wrong ordering can also cause problems during program exit where globals still try to use each other when some of them may have been destroyed. Again, some explicit destruction calls in the correct order before Main returns can make it clearer.
So, go for it if it works for you, but be aware of the pitfalls. The same advice applies to pretty much every feature in C++!
You said in your question that you yourself think the code is 'tricky'. There is no need to overcomplicate things for the sake of it. So, if you have an alternative that appears less 'tricky' to you... that might be better.
When I hear "tricky code", I immediately think of code smells and maintenance nightmares. To answer your question, no, it isn't a good idea. While it is valid C++ code, it is bad practice. There are other, much more explicit and meaningful alternatives to this problem. To elaborate, the fact that your initializer() method returns void* NULL is meaningless as far as the intention of your program goes (i.e. each line of your code should have meaningful purpose), and you now have yet another unnecessary global variable fakeVar, which needlessly points to NULL.
Let's consider some less "tricky" alternatives:
If it's extremely important that you only ever have one global instance of myMap, perhaps using the Singleton Pattern would be more fitting, and you would be able to lazily initialize the contents of myMap when they are needed. Keep in mind that the Singleton Pattern has issues of its own.
Have a static method create and return the map or use a global namespace. For example, something along the lines of this:
// global.h
namespace Global
{
extern std::map<std::string, int> myMap;
};
// global.cpp
namespace Global
{
std::map<std::string, int> initMap()
{
std::map<std::string, int> map;
map["test"] = 222;
return map;
}
std::map<std::string, int> myMap = initMap();
};
// main.cpp
#include "global.h"
int main()
{
std::cout << Global::myMap.size() << std::endl;
return 0;
}
If this is a map with specialized functionality, create your own class (best option)! While this isn't a complete example, you get the idea:
class MyMap
{
private:
std::map<std::string, int> map;
public:
MyMap()
{
map["test"] = 222;
}
void put(std::string key, int value)
{
map[key] = value;
}
unsigned int size() const
{
return map.size();
}
// Overload operator[] and create any other methods you need
// ...
};
MyMap myMap;
int main()
{
std::cout << myMap.size() << std::endl;
return 0;
}
In C++, you cannot have statements outside any function. However, you can declare global objects, and the constructors (initializers) of these global objects are called automatically before main starts. In your example, fakeVar is a global pointer that gets initialized through a static member function of a class; this is absolutely fine.
Even a global object would do, provided that its constructor performs the desired initialization.
For example,
class Fake
{
public:
Fake() {
myMap["test"]=222;
// Do whatever with your global Variables
}
};
Fake fake;
This is a case where unity builds (single translation unit builds) can be very powerful. The __COUNTER__ macro is a de facto standard among C and C++ compilers, and with it you can write arbitrary imperative code at global scope:
// At the beginning of the file...
template <uint64_t N> void global_function() { global_function<N - 1>(); } // This default-case skips "gaps" in the specializations, in case __COUNTER__ is used for some other purpose.
template <> void global_function<__COUNTER__>() {} // This is the base case.
void run_global_functions();
#define global_n(N, ...) \
template <> void global_function<N>() { \
global_function<N - 1>(); /* Recurse and call the previous specialization */ \
__VA_ARGS__; /* Run the user code. */ \
}
#define global(...) global_n(__COUNTER__, __VA_ARGS__)
// ...
std::map<std::string, int> myMap;
global({
myMap["test"]=222;
// Do whatever with your global variables
})
global(myMap["Error"] = 111);
int main() {
run_global_functions();
std::cout << "Map size: " << myMap.size() << std::endl; // Show myMap has initialized correctly :)
}
global(std::cout << "This will be the last global code run before main!");
// ...At the end of the file
void run_global_functions() {
global_function<__COUNTER__ - 1>();
}
This is especially powerful once you realize that you can use it to initialize static variables without a dependency on the C runtime. This means you can generate very small executables without having to eschew non-zero global variables:
// At the beginning of the file...
extern bool has_static_init;
#define default_construct(x) x{}; global(if (!has_static_init) new (&x) decltype(x){})
// Or if you don't want placement new:
// #define default_construct(x) x{}; global(if (!has_static_init) x = decltype(x){})
struct Complicated {
int x = 42;
Complicated() { std::cout << "Constructor!"; }
};
Complicated default_construct(my_complicated_instance); // Will be zero-initialized if the CRT is not linked into the program.
int main() {
run_global_functions();
}
// ...At the end of the file
static bool get_static_init() {
volatile bool result = true; // This function can't be inlined, so the CRT *must* run it.
return result;
}
bool has_static_init = get_static_init(); // Will stay zero without CRT
This answer is similar to Some programmer dude's answer, but may be considered a bit cleaner. As of C++17 (that's when std::invoke() was added), you could do something like this:
#include <functional>
auto initializer = std::invoke([]() {
// Do initialization here...
// The following return statement is arbitrary. Without something like it,
// the auto will resolve to void, which will not compile:
return true;
});
C++ compilers emit warnings when a local variable may be uninitialized on first use. However, sometimes I know that the variable will always be written before being read, so I do not need to initialize it. When I do this, the compiler emits a warning, of course. Since my team is building with -Werror, the code will not compile. How can I turn off this warning for specific local variables? I have the following restrictions:
I am not allowed to change compiler flags
The solution must work on all compilers (i.e., no gnu-extensions or other compiler specific attributes)
I want to use this only on specific local variables. Other uninitialized locals should still trigger a warning
The solution should not generate any instructions.
I cannot alter the class of the local variable. I.e., I cannot simply add a "do nothing" constructor.
Of course, the easiest solution would be to initialize the variable. However, the variable is of a type that is costly to initialize (even default initialization is costly) and the code is used in a very hot loop, so I do not want to waste the CPU cycles for an initialization that is guaranteed to be overwritten before it is read anyway.
So is there a platform-independent, compiler-independent way of telling the compiler that a local variable does not need to be initialized?
Here is some example code that might trigger such a warning:
void foo(){
T t;
for(int i = 0; i < 100; i++){
if (i == 0) t = ...;
if (i == 1) doSomethingWith(t);
}
}
As you see, the first loop cycle initializes t and the second one uses it, so t will never be read uninitialized. However, the compiler is not able to deduce this, so it emits a warning. Note that this code is quite simplified for the sake of brevity.
My answer will recommend another approach: instead of disabling the warning code, just do some reformulation on the implementation. I see two approaches:
First Option
You can use pointers instead of a real object and guarantee that it will be initialized just when you need it, something like:
std::unique_ptr<T> t;
for(int i=0; i<100; i++)
{
// braces keep both statements under the i == ... test; !t checks for an empty pointer
if(i == 0) { if(!t) t = std::unique_ptr<T>(new T); *t = ...; }
if(i == 1) { if(!t) t = std::unique_ptr<T>(new T); doSomethingWith(*t); }
}
It's interesting to note that when i==0 you probably don't need to construct t with the default constructor at all. I can't guess how your operator= is implemented, but I suppose you are assigning an object that has already been allocated in the code omitted in the ... segment.
Second Option
Since your code suffers such a large performance loss, I infer that T is not a basic type (int, float, etc.). So, instead of using pointers, you can reimplement your class T so that the heavy work is done in an init method rather than in the constructor. A boolean can indicate whether the object has been initialized:
class FooClass
{
public:
FooClass() : initialized_(false){ ... }
//Class implementation
void init()
{
//Do your heavy initialization code here.
initialized_ = true;
}
bool initialized() const { return initialized_; }
private:
bool initialized_; // trailing underscore avoids a clash with initialized()
};
Then you will be able to write it like this:
T t;
for(int i=0; i<100; i++)
{
if(i == 0) { if(!t.initialized()) t.init(); t = ...; }
if(i == 1) { if(!t.initialized()) t.init(); doSomethingWith(t); }
}
If the code is not very complex, I usually unroll one of the iterations:
void foo(){
T t;
t = ...;
for(int i = 1; i < 100; i++){
doSomethingWith(t);
}
}
while(true){
bool flag;
while(true){
if (conditions) {
flag=true;
break;
}
}
}
In this case, is flag reset to false after the inner while loop exits? From the console output, it seems to still be true.
No, there is no "reset". There is no magic whatsoever. In fact, flag will not even be magically initialized to false for you, you'll have to do it yourself.
I think you're thinking of classic examples of scope and shadowing:
int a = 4;
//a is 4 here
{
int a = 3;
//a is 3 here
}
//a is 4 here
But there is no magic here, either. There are two different variables a which happen to share a name. a in the inner block refers to the second integer. If you could refer to the first integer, you'd be reading a completely different integer.
Here is some magic:
SomeClass x; //x's constructor is called
{
SomeOtherClass y; //y's constructor is called
} //y's destructor is called
Since y is automatic, it gets destroyed at the end of its scope. (So did the second a, by the way, only there was no way to tell.) If it has a destructor, it will be called. If its destructor does something fancy such as "resetting some flag", you'll see the results. (Only not through y, which will be gone.)
The fact that the {} have no if/while/function/etc. attached to them is irrelevant.
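To make the "destructor that resets a flag" case concrete, here is a small sketch. FlagResetter and flag_after_block are hypothetical names chosen for this example.

```cpp
#include <cassert>

// Hypothetical guard: sets a flag on construction and clears it on destruction.
struct FlagResetter {
    bool& flag;
    explicit FlagResetter(bool& f) : flag(f) { flag = true; }
    ~FlagResetter() { flag = false; }
};

// Returns the flag's value after the bare block has ended.
bool flag_after_block() {
    bool busy = false;
    {
        FlagResetter guard(busy);
        assert(busy);   // inside the block the flag is set
    }                   // guard's destructor runs here and clears the flag
    return busy;
}
```

Here the "reset" is not magic either: it is ordinary code in the destructor, which runs because guard reaches the end of its scope.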