In symbol table how to mark variable out of scope? - c++

I am writing a toy compiler that compiles a C/C++-like language to C++.
I am using bison, but with this structure it is hard to track when a variable goes out of scope.
In the source language, a variable name can be declared only once in the whole main function. That helps, but there is still a problem I cannot solve.
I cannot generate C++ code like this, because the C++ compiler reports a semantic error:
'var' was not declared in this scope.
int main()
{
if (true)
{
int var = 4;
}
if (true)
{
var = 5;
}
}
The source language has while, if and if/else statements.
I have to report a semantic error if a declared variable is assigned outside its scope.
For example:
This should be a semantic error, so I must not generate this code:
int main()
{
while(0)
{
int var = 1;
}
if (1)
{
var = 2;
}
}
This also has to be a semantic error:
int main()
{
if (0)
{
int var = 1;
}
else
{
if (1)
{
var = 5;
}
}
}
And this is allowed, I can generate this code:
int main()
{
if (0)
{
}
else
{
int var = 1;
if (1)
{
while (0)
{
while (0)
{
var = 2;
}
}
}
}
}
I tried a lot of things, but I cannot handle nested if, if/else or while statements.
I have read tons of tutorials about symbol tables, but none of them explains properly how to manage a variable once it goes out of scope.
If you are familiar with this topic and with bison, please do not just give me hints like "use a stack, and mark a variable when it goes out of scope". I have found a lot of articles saying that.
Instead, please give me pseudocode or a concrete implementation sketch.
I think it cannot be that difficult, because, as I wrote, a variable name can appear only once in the whole main function.
Symbol table:
struct SymbolType
{
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};
class Symbol
{
public:
void AddVariable(int _lineNumber, std::string _identifier, int _identifierValue, Type _type, int _functionArgumentNumber, Type _functionReturnType, Type _markType, bool _outOfScope);
void AddMarker(int _lineNumber, std::string _scopeName, Type _markType);
bool FindVariable(std::string _identifier);
int FindVariableValue(std::string _identifier);
void Remove();
void Print();
std::vector<SymbolType> symbolTable;
private:
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};

Now let's assume the following: While you are in a nested scope you cannot add a variable to a parent scope. So we can work e.g. with a stack like structure (push/pop at the end only suffices, but with read access to all entries – the latter requirement disqualifying std::stack, so we'd operate e.g. on std::vector instead).
1. Encountering the declaration of a new variable: Run up the entire stack to see if that variable exists already. If so, issue an error ('duplicate declaration/definition').
2. Encountering accessing a variable: Run up the entire stack to see if that variable exists; if not, issue an error ('not declared/defined' – I wouldn't differentiate between the variable not having been defined ever or having left the scope).
3. On leaving a scope, run up the stack and remove any variable that resides in that scope.
To be able to do 3. you have (at least) two options:
With every stack entry provide an identifier for the respective scope, which could be a simple counter that is incremented when a scope opens. On leaving a scope, delete all those variables that have the current counter value. If you fear the counter might overflow, decrement it on leaving a scope as well (it then always represents the current scope depth).
Have a sentinel type – that one would be pushed to the stack on opening a new scope, compare unequal to any variable name, and on leaving a scope you'd delete all variables until you encounter the sentinel – and the sentinel itself.
This processing makes your outOfScope member obsolete.
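Since the question asks for something concrete, the counter option could be sketched roughly like this. The names ScopedSymbol, ScopedSymbolTable, enterScope, leaveScope, declare and isVisible are made up for illustration and are not part of the asker's Symbol class; the sketch assumes the parser calls enterScope and leaveScope when it sees the opening and closing brace of a block.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal entry: only what scope tracking needs.
struct ScopedSymbol {
    std::string identifier;
    int depth;  // scope depth at which the variable was declared
};

class ScopedSymbolTable {
public:
    // Call when the parser sees '{'.
    void enterScope() { ++depth_; }

    // Call when the parser sees '}': drop everything declared in this scope.
    void leaveScope() {
        while (!entries_.empty() && entries_.back().depth == depth_) {
            entries_.pop_back();
        }
        --depth_;
    }

    // Returns false on a duplicate declaration (a semantic error).
    bool declare(const std::string& name) {
        if (isVisible(name)) return false;
        entries_.push_back({name, depth_});
        return true;
    }

    // Returns false if the name was never declared or has left its scope.
    bool isVisible(const std::string& name) const {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it) {
            if (it->identifier == name) return true;
        }
        return false;
    }

private:
    std::vector<ScopedSymbol> entries_;
    int depth_ = 0;
};
```

An assignment such as var = 5 would then be accepted only if isVisible("var") returns true at that point. If the source language forbids reusing a name anywhere in main, even after the first declaration has left its scope, a separate set of every name ever declared would be needed on top of this; that part is not shown.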
Your addVariable function takes too many parameters, by the way – why should a variable need a return type, for instance???
I recommend adding multiple functions for every specific semantic type your language might provide (addVariable, addFunction, ...). Each function accepts what actually is necessary to be configurable and sets the rest to appropriate defaults (e.g. Type to Function within addFunction).
Edit – processing the example from your comment:
while(condition)
{ // opens a new scope:
// either place a sentinel or increment the counter (scope depth)
int var = 1; // push an entry onto the stack
// (with current counter value, if not using sentinels)
if (true)
{ // again a new scope, see above
var = 2; // finds 'var' on the stack, so fine
} // closes a scope:
// either you find immediately the sentinel, pop it from the stack
// and are done
//
// or you discover 'var' having a counter value less than current
// counter so you are done as well
} // closing next scope:
// either you find 'var', pop it from the stack, then the sentinel,
// pop it as well and are done
//
// or you discover 'var' having same counter value as current one,
// so pop it, then next variable has lower counter value again or the
// stack is empty, thus you decrement the counter and are done again
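The sentinel option from the walkthrough above might look like the following sketch. It assumes the empty string can serve as the sentinel because it is never a valid identifier; SentinelSymbolStack and its member names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class SentinelSymbolStack {
public:
    // Opening a scope pushes the sentinel (an empty string).
    void enterScope() { stack_.push_back(std::string{}); }

    // Closing a scope pops variables down to the sentinel, then the sentinel itself.
    void leaveScope() {
        while (!stack_.empty() && !stack_.back().empty()) {
            stack_.pop_back();
        }
        if (!stack_.empty()) {
            stack_.pop_back();
        }
    }

    // Returns false for an empty name or a duplicate declaration.
    bool declare(const std::string& name) {
        if (name.empty() || isVisible(name)) return false;
        stack_.push_back(name);
        return true;
    }

    // Searches the whole stack; sentinels never match a real name.
    bool isVisible(const std::string& name) const {
        if (name.empty()) return false;
        for (const auto& entry : stack_) {
            if (entry == name) return true;
        }
        return false;
    }

    std::size_t size() const { return stack_.size(); }

private:
    std::vector<std::string> stack_;
};
```

Replaying the example: entering the while block pushes a sentinel, declaring var pushes it, the if block pushes and later pops only its own sentinel, and leaving the while block pops var and the first sentinel, which leaves the stack empty.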

Related

How to implement local variables that can be used in other places with the same conditions in C++

How to get the following code to work?
int main(){
bool flag = true;
if(flag){
// note that I do not want to define the variable globally.
int a = 5;
}
if(flag){
// but I still want to use this local variable within the same condition.
a++;
}
}
Note that I don't want to define this variable globally or use a static variable.
I'm curious if there is a way for c++ to make local variables available in all regions with the same conditions?
What you ask for literally is a local variable that is not a local variable. That's not possible.
On the other hand, you basically want data + code, and that's a class. If you wrap it in a class, your function can look like this:
int main(){
Foo f;
f.doSomething();
}
And the class can be this
struct Foo {
bool flag = false;
int a = 0;
void doSomething() {
if (flag) ++a;
}
};
What you're directly asking for isn't possible. You'll have to declare something up front. If it's about avoiding construction of objects until you have some relevant detail, you could use std::optional:
#include <optional>
int main()
{
bool flag = true;
std::optional<int> a;
if(flag)
{
a = 10;
}
if(a)
{
++*a; // increment the contained value; *a++ would try to increment the optional itself
}
}
You set the variable's scope to be the scope you want it to be.
int main(){
bool flag = true;
// declare it in this scope if you want it to persist
// thru this scope
int a;
if(flag){
a = 5;
}
if(flag){
a++;
}
}
No.
According to section 6.4.3, basic.scope.block:
1 Each
(1.1) selection or iteration statement ([stmt.select], [stmt.iter]),
[...]
(1.4) compound statement ([stmt.block]) that is not the compound-statement of a handler
introduces a block scope that includes that statement or handler.
A variable that belongs to a block scope is a block variable.
According to section 6.7.5.4, basic.stc.auto, clause 1:
Variables that belong to a block or parameter scope and are not explicitly declared static, thread_local, or extern have automatic storage duration. The storage for these entities lasts until the block in which they are created exits.
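To see what the quoted rules mean in practice, here is a small sketch that records when a block variable is created and destroyed. Tracer and runBlocks are hypothetical names used only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper that logs its own construction and destruction.
struct Tracer {
    std::vector<std::string>& log;
    explicit Tracer(std::vector<std::string>& l) : log(l) { log.push_back("created"); }
    ~Tracer() { log.push_back("destroyed"); }
};

std::vector<std::string> runBlocks(bool flag) {
    std::vector<std::string> log;
    if (flag) {
        Tracer t(log);                  // block variable, automatic storage duration
        log.push_back("first block");
    }                                   // t is destroyed here, when its block exits
    if (flag) {
        log.push_back("second block");  // t no longer exists at this point
    }
    return log;
}
```

With flag set to true, the log reads created, first block, destroyed, second block: the variable is already gone before the second if is reached, which is why it cannot be named there.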

How to make functions variables public, they are not in a class C++ [duplicate]

I would like to know how I can make a function's variable public to other functions.
Example:
void InHere()
{
int one = 1; // I want to be public
}
int main()
{
InHere(); // This will set int one = 1
one = 2; // If the variable is public, I should be able to do this
return 0;
}
Does anyone know how to do this? The only things I find when searching is for classes, as you can see nothing is in a class and I don't want them to be in one.
Any help is really appreciated!
A variable defined locally to a function is generally inaccessible outside that function unless the function explicitly supplies a reference/pointer to that variable.
One option is for the function to explicitly return a reference or pointer to that variable to the caller. That gives undefined behaviour if the variable is not static, as it does not exist after the function returns.
int &InHere()
{
static int one = 1;
return one;
}
void some_other_func()
{
InHere() = 2;
}
This causes undefined behaviour if the variable one is not static since, as far as the program as a whole is concerned, the variable only comes into existence when InHere() is called and ceases to exist as it returns (so the caller receives a dangling reference - a reference to something that no longer exists).
Another option is for the function to pass a pointer or reference to the variable as an argument to another function.
void do_something(int &variable)
{
variable = 2;
}
void InHere()
{
int one = 1;
do_something(one);
std::cout << one << '\n'; // will print 2
}
The downside is that this only provides access to functions CALLED BY InHere(). Although the variable does not need to be static in this case, the variable still ceases to exist as InHere() returns (so if you want to combine option 1 and option 2 in some way, the variable needs to be static)
A third option is to define the variable at file scope, so it has static storage duration (i.e. its lifetime is not related to the function);
int one;
void InHere()
{
one = 1;
}
void another_function()
{
one = 2;
}
int main()
{
InHere();
// one has value 1
another_function();
// one has value 2
}
A global variable can be accessed in any function that has visibility of a declaration of the variable. For example, we could do
extern int one; // declaration but not definition of one
int one; // definition of one. This can only appear ONCE in the entire program
void InHere()
{
one = 1;
}
And, in another source file
extern int one; // this provides visibility to one but relies on it
// being defined in another source file
void another_function()
{
one = 2;
}
int main()
{
InHere();
// one has value 1
another_function();
// one has value 2
}
Be careful with that though - there are numerous down-sides of global/static variables, to the extent they are usually considered VERY BAD programming technique. Have a look at this link (and pages linked to from there) for a description of some of the problems.
Just set the variable as a global variable. Then you can access it from other functions.
int one;
void InHere()
{
one = 1; // I want to be public
}
int main()
{
InHere(); // This will set int one = 1
one = 2; // If the variable is public, I should be able to do this
return 0;
}
If you want it inside a class, try the code below:
#include <iostream>
using namespace std;
class My_class
{
// private members
public: // public section
// public members, methods or attributes
int one;
void InHere();
};
void My_class::InHere()
{
one = 1; // it is public now
}
int main()
{
My_class obj;
obj.InHere(); // This will set one = 1
cout<<obj.one;
obj.one = 2; // If the variable is public, I should be able to do this
cout<<obj.one;
return 0;
}

Local static const variables that may need to refer to different variables

I have a function that has a variable called static const int initial_var = some_var so that on subsequent runs to the function, initial_var is guaranteed to not change. The issue is however the function may be called for different some_vars and because initial_var is used in calculations, this can screw things up.
func() is meant to operate on DIFFERENT variables, all named some_var. Their state needs to be remembered so I use a static const variable, but that will only remember the state for ONE variable.
void func()
{
static const int initial_var = some_var;
some_var = initial_var; // This is the part where things may screw up if some_var
// is a different variable
}
What's an elegant way to fix this?
You say "Their state needs to be remembered", so you can just put them in an array.
int array[10]; // 10 elements.
int count = 0;
void storeVariable(int temp)
{
array[count] = temp;
count++;
// Reset if full.
if(count >= 10)
count = 0;
}
That seems fairly simple enough.
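If the goal is one remembered initial value per variable rather than a fixed-size ring of values, a different technique from the array above is a map keyed by the variable's address. This is only a sketch: resetToInitial is a hypothetical replacement for func() that takes the variable as a parameter, and it assumes each variable stays alive (and keeps its address) for as long as it is used this way.

```cpp
#include <unordered_map>

// Hypothetical replacement for func(): the variable is passed in explicitly.
void resetToInitial(int& some_var) {
    // Assumption: a variable is identified by its address.
    static std::unordered_map<const int*, int> initial_values;
    // emplace stores the value only on the first call for this address;
    // later calls find the existing entry and leave it untouched.
    const auto it = initial_values.emplace(&some_var, some_var).first;
    some_var = it->second;
}
```

The first call for a given variable records its current value; every later call restores that recorded value, independently for each variable.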

Trouble implementing min Heaps in C++

I'm trying to implement a minheap in C++. However the following code keeps eliciting errors such as :
heap.cpp:24:4: error: cannot convert 'complex int' to 'int' in assignment
l=2i;
^
heap.cpp:25:4: error: cannot convert 'complex int' to 'int' in assignment
r=2i+1;
^
heap.cpp: In member function 'int Heap::main()':
heap.cpp:47:16: error: no matching function for call to 'Heap::heapify(int [11], int&)'
heapify(a,i);
^
heap.cpp:47:16: note: candidate is:
heap.cpp:21:5: note: int Heap::heapify(int)
int heapify(int i) //i is the parent index, a[] is the heap array
^
heap.cpp:21:5: note: candidate expects 1 argument, 2 provided
make: *** [heap] Error 1
#include <iostream>
using namespace std;
#define HEAPSIZE 10
class Heap
{
int a[HEAPSIZE+1];
Heap()
{
for (j=1;j<(HEAPISZE+1);j++)
{
cin>>a[j];
cout<<"\n";
}
}
int heapify(int i) //i is the parent index, a[] is the heap array
{
int l,r,smallest,temp;
l=2i;
r=2i+1;
if (l<11 && a[l]<a[i])
smallest=l;
else
smallest=i;
if (r<11 && a[r]<a[smallest])
smallest=r;
if (smallest != i)
{
temp = a[smallest];
a[smallest] = a[i];
a[i]=temp;
heapify(smallest);
}
}
int main()
{
int i;
for (i=1;i<=HEAPSIZE;i++)
{
heapify(a,i);
}
}
}
Ultimately, the problem with this code is that it was written by someone who skipped chapters 1, 2 and 3 of "C++ for Beginners". Let's start with some basics.
#include <iostream>
using namespace std;
#define HEAPSIZE 10
Here, we have included the C++ header for I/O (input output). A fine start. Then, we have issued a directive that says "Put everything that is in namespace std into the global namespace". This saves you some typing, but means that all of the thousands of things that were carefully compartmentalized into std:: can now conflict with names you want to use in your code. This is A Bad Thing(TM). Try to avoid doing it.
Then we went ahead and used a C-ism, a #define. There are times when you'll still need to do this in C++, but it's better to avoid it. We'll come back to this.
The next problem, at least in the code you posted, is a misunderstanding of the C++ class.
The 'C' language that C++ is based on has the concept of a struct for describing a collection of data items.
struct
{
int id;
char name[64];
double wage;
};
It's important to notice the syntax - the trailing ';'. This is because you can describe a struct and declare variables of its type at the same time.
struct { int id; char name[64]; } earner, manager, ceo;
This declares a struct, which has no type name, and variables earner, manager and ceo of that type. The semicolon tells the compiler when we're done with this statement. Learning when you need a semicolon after a '}' takes a little while; usually you don't, but in struct/class definition you do.
C++ added lots of things to C, but one common misunderstanding is that struct and class are somehow radically different.
C++ originally extended the struct concept by allowing you to describe functions in the context of the struct and by allowing you to describe members/functions as private, protected or public, and allowing inheritance.
When you declare a struct, it defaults to public. A class is nothing more than a struct which starts out private.
struct
{
int id;
char name[64];
double wage;
};
class
{
public:
int id;
char name[64];
double wage;
};
The resulting definitions are both identical.
Your code does not have an access specifier, so everything in your Heap class is private. The first and most problematic issue this causes is: Nobody can call ANY of your functions, because they are private, they can only be called from other class members. That includes the constructor.
class Foo { Foo () {} };
int main()
{
Foo f;
return 0;
}
The above code will fail to compile, because main is not a member of Foo and thus cannot call anything private.
This brings us to another problem. In your code, as posted, main is a member of Heap. The entry point of a C++ program is main, not Foo::main or std::main or Foo::bar::herp::main. Just, good old int main(int argc, const char* argv[]) or int main().
In C, with structs, because C doesn't have member functions, you would never be in a case where you were using struct-members directly without prefixing that with a pointer or member reference, e.g. foo.id or ptr->wage. In C++, in a member function, member variables can be referenced just like local function variables or parameters. This can lead to some confusion:
class Foo
{
int a, b;
public:
void Set(int a, int b)
{
a = a; // Erh,
b = b; // wat???
}
};
There are many ways to work around this, but one of the most common is to prefix member variables with m_.
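As a sketch of that convention, here is the same class with the members renamed; the accessors A and B are added only so the effect can be observed.

```cpp
class Foo
{
    int m_a = 0;
    int m_b = 0;
public:
    void Set(int a, int b)
    {
        m_a = a; // member on the left, parameter on the right
        m_b = b;
    }
    int A() const { return m_a; }
    int B() const { return m_b; }
};
```

Writing this->a = a; with the original names would also work, since this->a always refers to the member.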
Your code runs afoul of this. Apparently the original C version passed the array to heapify, and the array was in a local variable a. When you made a into a member, leaving the variable name exactly the same allowed you to miss the fact that you no longer need to pass it to the function (and indeed, your heapify member function no longer takes an array parameter, leading to one of your compile errors).
The next problem we encounter, not directly part of your problem yet, is your function Heap(). Firstly, it is private - you used class and haven't said public yet. But secondly, you have missed the significance of this function.
In C++ every struct/class has an implied function of the same name as the definition. For class Heap that would be Heap(). This is the 'default constructor'. This is the function that will be executed any time someone creates an instance of Heap without any parameters.
That means it's going to be invoked when the compiler creates a short-term temporary Heap, or when you create a vector of Heap()s and allocate a new temporary.
These functions have one purpose: To prepare the storage the object occupies for usage. You should try and avoid as much other work as possible until later. Using std::cin to populate members in a constructor is one of the most awful things you can do.
We now have a basis to begin to write the outer-shell of the code in a fashion that will work.
The last change is the replacement of "HEAPSIZE" with a class enum. This is part of encapsulation. You could leave HEAPSIZE as a #define but you should expose it within your class so that external code doesn't have to rely on it but can instead say things like Heap::Size or heapInstance.size() etc.
#include <iostream>
#include <cstddef> // for size_t
#include <array> // C++11 encapsulation for arrays.
struct Heap // Because we want to start 'public' not 'private'.
{
enum { Size = 10 };
private:
std::array<int, Size> m_array; // meaningful names ftw.
public:
Heap() // default constructor, do as little as possible.
: m_array() // says 'call m_array()s default ctor'
{}
// Function to load values from an istream into this heap.
void load(std::istream& in)
{
for (size_t i = 0; i < Size; ++i)
{
in >> m_array[i];
}
}
void write(std::ostream& out)
{
for (size_t i = 0; i < Size; ++i)
{
if (i > 0)
out << ','; // separator
out << m_array[i];
}
}
void heapify(size_t index)
{
// implement your code here.
}
}; // <-- important.
int main(int argc, const char* argv[])
{
Heap myHeap; // << constructed but not populated.
myHeap.load(std::cin); // read from cin
for (size_t i = 1; i < myHeap.Size; ++i)
{
myHeap.heapify(i);
}
myHeap.write(std::cout);
return 0;
}
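Lastly, we run into a simple, fundamental problem with your code. C++ does not have implicit multiplication. 2i is the number 2 with a suffix (GCC reads the i suffix as an imaginary constant, which is where the 'complex int' in the error message comes from). It is not the same as 2 * i.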
int l = 2 * i;
There is also a peculiarity with your code that suggests you are mixing between 0-based and 1-based implementation. Pick one and stick with it.
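For reference, a consistently 0-based version of the sift-down could look like the sketch below. It uses free functions over a std::array instead of the member function above, and kSize and buildMinHeap are names introduced here for illustration. Note that the build loop runs from the last parent down to the root; looping upward as in the question does not produce a heap in general.

```cpp
#include <array>
#include <cstddef>
#include <utility>

constexpr std::size_t kSize = 10; // mirrors HEAPSIZE from the question

// Sift the element at index i down until neither child is smaller (0-based).
void heapify(std::array<int, kSize>& a, std::size_t i) {
    for (;;) {
        const std::size_t l = 2 * i + 1; // left child
        const std::size_t r = 2 * i + 2; // right child
        std::size_t smallest = i;
        if (l < kSize && a[l] < a[smallest]) smallest = l;
        if (r < kSize && a[r] < a[smallest]) smallest = r;
        if (smallest == i) return;
        std::swap(a[i], a[smallest]);
        i = smallest;
    }
}

// Build a min-heap by heapifying every parent, last parent first.
void buildMinHeap(std::array<int, kSize>& a) {
    for (std::size_t i = kSize / 2; i-- > 0;) {
        heapify(a, i);
    }
}
```

After buildMinHeap, every element is no smaller than its parent at index (i - 1) / 2, so the minimum sits at index 0. The standard library offers the same operation as std::make_heap with std::greater<int> as the comparator.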
--- EDIT ---
Technically, this:
myHeap.load(std::cin); // read from cin
for (size_t i = 1; i < myHeap.Size; ++i)
{
myHeap.heapify(i);
}
is poor encapsulation. I wrote it this way to draw on the original code layout, but I want to point out that one reason for separating construction and initialization is that it allows initialization to be assured that everything is ready to go.
So, it would be more correct to move the heapify calls into the load function. After all, what better time to heapify than as we add new values, keeping the list in order the entire time.
for (size_t i = 0; i < Size; ++i)
{
in >> m_array[i];
heapify(i);
}
Now you've simplified your class's API, and users don't have to be aware of the internal machinery.
Heap myHeap;
myHeap.load(std::cin);
myHeap.write(std::cout);
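One caveat when heapifying during loading: if heapify sifts down, as in the question, calling it on the slot that was just filled has no effect, because that slot has no children yet. Keeping the heap property while values arrive one at a time is usually done by sifting the new element up instead. A minimal sketch, using a std::vector and a hypothetical MinHeap class rather than the fixed-size Heap above:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class MinHeap {
public:
    // Append the value, then swap it upward while it is smaller than its parent.
    void push(int value) {
        m_data.push_back(value);
        std::size_t i = m_data.size() - 1;
        while (i > 0) {
            const std::size_t parent = (i - 1) / 2;
            if (m_data[parent] <= m_data[i]) break;
            std::swap(m_data[parent], m_data[i]);
            i = parent;
        }
    }

    int top() const { return m_data.front(); } // precondition: heap is not empty
    std::size_t size() const { return m_data.size(); }

private:
    std::vector<int> m_data;
};
```

A load function could then call push for each value it reads, and the smallest value seen so far is always available through top.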

Scope within a scope, do or don't?

Although the example below compiles fine except for the last line with the error, I'd like to know the ins and outs of this 'scoping' within a scope? Also the terminology of this, if any.
Consider these brackets:
void func()
{
int i = 0;
{ // nice comment to describe this scope
while( i < 10 )
++i;
}
{ // nice comment to describe this scope
int j= 0;
while( j< 10 )
++j;
}
i = 0; // OK
// j = 0; // error C2065
}
consider this:
error C2065: 'j' : undeclared identifier
edit:
Accepted answer is from bitmask, although I think everyone should place it in the context of anio's answer. Especially, quote: "perhaps you should break your function into 2 functions"
Do. By all means!
Keeping data as local as possible and as const as possible has two main advantages:
- side effects are reduced and the code becomes more functional
- with complex objects, destructors can be invoked early within a function, as soon as the data is not needed any more
Additionally, this can be useful for documentation to summarise the job a particular part of a function does.
I've heard this being referred to as explicit or dummy scoping.
I personally don't find much value in adding additional scoping within a function. If you are relying on it to separate parts of your function, perhaps you should break your function into 2 functions. Smaller functions are better than larger ones. You should strive to have small easily understood functions.
The one legitimate use of scopes within a function is for limiting the duration of a lock:
int doX()
{
// Do some work
{
//Acquire lock
} // Lock automatically released because of RAII
}
The inner scope effectively limits the code over which the lock is held. I believe this is common practice.
Yes, definitely - it's a great habit to always keep your variables as local as possible! Some examples:
for (std::string line; std::getline(std::cin, line); ) // does not
{ // leak "line"
// process "line" // into ambient
} // scope
int result = 1; // initialised so that the *= below is well defined
{ // putting this in a separate scope
int a = foo(); // allows us to copy/paste the entire
a += 3; // block without worrying about
int b = bar(a); // repeated declarators
result *= (a + 2*b);
}
{ // ...and we never really needed
int a = foo(); // a and b outside of this anyway!
a += 3;
int b = bar(a);
result *= (a + 2*b);
}
Sometimes a scope is necessary for synchronisation, and you want to keep the critical section as short as possible:
int global_counter = 0;
std::mutex gctr_mx;
void I_run_many_times_concurrently()
{
int a = expensive_computation();
{
std::lock_guard<std::mutex> _(gctr_mx);
global_counter += a;
}
expensive_cleanup();
}
The explicit scoping is usually not done for commenting purposes, but I don't see any harm in doing it if you feel it makes your code more readable.
Typical usage is for avoiding name clashes and controlling when the destructors are called.
A pair of curly braces defines a scope. Names declared or defined within a scope are not visible outside that scope, which is why j is not defined at the end. If a name in a scope is the same as a name defined earlier and outside that scope, it hides the outer name.
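The hiding rule in the last sentence can be illustrated with a short sketch; shadowDemo and seenInside are names made up for this example.

```cpp
int shadowDemo() {
    int i = 1;
    int seenInside = 0;
    {
        int i = 2;       // hides the outer i for the rest of this block
        seenInside = i;  // reads the inner i
    }
    // Back outside the block, i refers to the outer variable again.
    return i * 10 + seenInside; // outer i is still 1, so this returns 12
}
```

The outer i is untouched by anything done to the inner one, which is exactly why reusing a name in a nested scope is legal but easy to misread.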