So I have been working on a question and this think stumbled upon me. When I declare a variable outside the main function, the program works correctly, that is it reaches the else case of "Friendship is magic" but if the variable is declared inside it returns Chris instead of Friendship statement.
int mis, chr;
int main() {
int a, n, m;
cin >> a;
for (int i = 0; i < a; i++) {
//code here
}
if(mis > chr)
cout << "Mishka";
else if(chr > mis)
cout << "Chris";
else
cout << "Friendship is magic!^^";
}
The input that I am using makes the value of chr and mis equal so it should be evaluating to the else statement but instead it just stops at else if.
"With great power (provided by C++) there must also come great responsibility"
And
"With uninitialized variables comes the undefined behavior"
Variables that are declared at global scope are being initialized by the compiler. However, the variables defined inside any function (i.e, having automatic storage) may contain garbage values (that could be different in each invocation of the program). I recommend always initializing your variables to some value.
int main()
{
int mis = 0, chr = 0;
// ...
return 0;
}
Let's come to your program now:
When I declare a variable outside the main function, the program works correctly, that is it reaches the else case of "Friendship is magic"
It is happening because the variables (on which your if ladder dependent) are being initialized to 0. Since, both variables have same value (0), the else part of the if statement is being executed.
but if the variable is declared inside it returns Chris instead of Friendship statement.
It's a perfect example of undefined behavior. If they are defined inside your function, they will be holding some garbage value and that might not be equal. Hence, what you are observing is an undefined behavior and you might get different results in different machines or even sometimes in a same machine.
I tried your code and it works the same for me in both cases.
So, I would say it changes with editor to editor and always initialize the variable before using it.
Otherwise, we might face the same problem as you face. As it initializes the variables with garbage values.
http://en.cppreference.com/w/cpp/language/storage_duration
Static local variables
Static variables declared at block scope are initialized the first
time control passes through their declaration (unless their
initialization is zero- or constant-initialization, which can be
performed before the block is first entered). On all further calls,
the declaration is skipped.
What is the meaning of 'static' in this quote? Is it:
static storage duration. The storage for the object is allocated when
the program begins and deallocated when the program ends.
If so, then it doesn't explain what happens to int k; in the main or any other function in terms of initialization, because k is not a static variable (it doesn't get allocated when the program begins and deallocated when the program ends - wait a second, you might say that the main function starts running when the program begins and returns when the program ends, but that's not how it works I guess).
In the main function:
int k;
k++;
results in an error: uninitialized local variable 'k' used.
So if k above is not a static local variable, could you give an example of such variables?
And could anyone tell me why the following code compiles and runs without any issues, even though k is not initialized?
Given a function with no body:
void foo(int* number) { }
And calling it like this in main:
int k;
foo(&k);
k++;
It now compiles and runs with no problems, but the value of k is -858993459. The compiler didn't like the fact I attempted to increment it without initiation, but passing it to foo caused that the compiler forgot about it. Why?
Probably depends on your compiler implementation.
I guess the compiler sees that you variable is passed to a function in a non-const way, so the function can potentially modify/initialize the value of k.
Your compiler doesn't dig into the function to extrapolate the logic, so it let you compile the code.
EDIT : I don't get the same behaviour with this online compier : http://cpp.sh/9axqz
But the compilation warning disapears as soon as I do something with the number in the function.
A static local variable is something like this:
void example() {
static int i = 0; // a static local variable
int j = 0; // not static
// ...
}
Builtin types such as int are not automatically initialized. Therefore you may get a warning when you try to read from that variable before having set it to something meaningful because it can hold just about anything.
Just because your particular compiler no longer gave you a warning does not mean that your variable is initialized properly. As a rule of thumb you should always explicitly initialize your variables to avoid accidentally reading from it without having set it first:
int i; // not initialized
int j = 0; // explicitly set to 0
Here is a link to the C++ Core Guidelines on the topic of initialization: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es20-always-initialize-an-object, it may be worth a read.
A static variable can be defined like this:
int nextIndex()
{
static int index = -1;
return ++index;
}
So when you first call this function the static variable index will be initialized with -1;. The next line will increase index by 1 and return the result.
All subsequent calls of nextIndex will only execute the return ++index; part and skip the initialization.
Also static varibales are automatically initialized with zero if not implemented different.
So if you change above function to
int nextIndex()
{
static int index;
return ++index;
}
the first value it returns will be one. Also the first call won't initialize the index, but it will be initialized as soon as your program starts.
Question #1: Is declaring a variable inside a loop a good practice or bad practice?
I've read the other threads about whether or not there is a performance issue (most said no), and that you should always declare variables as close to where they are going to be used. What I'm wondering is whether or not this should be avoided or if it's actually preferred.
Example:
for(int counter = 0; counter <= 10; counter++)
{
string someString = "testing";
cout << someString;
}
Question #2: Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
This is excellent practice.
By creating variables inside loops, you ensure their scope is restricted to inside the loop. It cannot be referenced nor called outside of the loop.
This way:
If the name of the variable is a bit "generic" (like "i"), there is no risk to mix it with another variable of same name somewhere later in your code (can also be mitigated using the -Wshadow warning instruction on GCC)
The compiler knows that the variable scope is limited to inside the loop, and therefore will issue a proper error message if the variable is by mistake referenced elsewhere.
Last but not least, some dedicated optimization can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, no need to store the result for later re-use.
In short, you are right to do it.
Note however that the variable is not supposed to retain its value between each loop. In such case, you may need to initialize it every time. You can also create a larger block, encompassing the loop, whose sole purpose is to declare variables which must retain their value from one loop to another. This typically includes the loop counter itself.
{
int i, retainValue;
for (i=0; i<N; i++)
{
int tmpValue;
/* tmpValue is uninitialized */
/* retainValue still has its previous value from previous loop */
/* Do some stuff here */
}
/* Here, retainValue is still valid; tmpValue no longer */
}
For question #2:
The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated, just re-using some free slot (from other variable whose scope has ended).
With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with less states (i.e. variables) to worry about when reading other parts of the code.
This is true even outside of an if(){...} block. Typically, instead of :
int result;
(...)
result = f1();
if (result) then { (...) }
(...)
result = f2();
if (result) then { (...) }
it's safer to write :
(...)
{
int const result = f1();
if (result) then { (...) }
}
(...)
{
int const result = f2();
if (result) then { (...) }
}
The difference may seem minor, especially on such a small example.
But on a larger code base, it will help : now there is no risk to transport some result value from f1() to f2() block. Each result is strictly limited to its own scope, making its role more accurate. From a reviewer perspective, it's much nicer, since he has less long range state variables to worry about and track.
Even the compiler will help better : assuming that, in the future, after some erroneous change of code, result is not properly initialized with f2(). The second version will simply refuse to work, stating a clear error message at compile time (way better than run time). The first version will not spot anything, the result of f1() will simply be tested a second time, being confused for the result of f2().
Complementary information
The open-source tool CppCheck (a static analysis tool for C/C++ code) provides some excellent hints regarding optimal scope of variables.
In response to comment on allocation:
The above rule is true in C, but might not be for some C++ classes.
For standard types and structures, the size of variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated into the stack (without any initialization), when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.
However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler shall be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.
Generally, it's a very good practice to keep it very close.
In some cases, there will be a consideration such as performance which justifies pulling the variable out of the loop.
In your example, the program creates and destroys the string each time. Some libraries use a small string optimization (SSO), so the dynamic allocation could be avoided in some cases.
Suppose you wanted to avoid those redundant creations/allocations, you would write it as:
for (int counter = 0; counter <= 10; counter++) {
// compiler can pull this out
const char testing[] = "testing";
cout << testing;
}
or you can pull the constant out:
const std::string testing = "testing";
for (int counter = 0; counter <= 10; counter++) {
cout << testing;
}
Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
It can reuse the space the variable consumes, and it can pull invariants out of your loop. In the case of the const char array (above) - that array could be pulled out. However, the constructor and destructor must be executed at each iteration in the case of an object (such as std::string). In the case of the std::string, that 'space' includes a pointer which contains the dynamic allocation representing the characters. So this:
for (int counter = 0; counter <= 10; counter++) {
string testing = "testing";
cout << testing;
}
would require redundant copying in each case, and dynamic allocation and free if the variable sits above the threshold for SSO character count (and SSO is implemented by your std library).
Doing this:
string testing;
for (int counter = 0; counter <= 10; counter++) {
testing = "testing";
cout << testing;
}
would still require a physical copy of the characters at each iteration, but the form could result in one dynamic allocation because you assign the string and the implementation should see there is no need to resize the string's backing allocation. Of course, you wouldn't do that in this example (because multiple superior alternatives have already been demonstrated), but you might consider it when the string or vector's content varies.
So what do you do with all those options (and more)? Keep it very close as a default -- until you understand the costs well and know when you should deviate.
I didn't post to answer JeremyRR's questions (as they have already been answered); instead, I posted merely to give a suggestion.
To JeremyRR, you could do this:
{
string someString = "testing";
for(int counter = 0; counter <= 10; counter++)
{
cout << someString;
}
// The variable is in scope.
}
// The variable is no longer in scope.
I don't know if you realize (I didn't when I first started programming), that brackets (as long they are in pairs) can be placed anywhere within the code, not just after "if", "for", "while", etc.
My code compiled in Microsoft Visual C++ 2010 Express, so I know it works; also, I have tried to to use the variable outside of the brackets that it was defined in and I received an error, so I know that the variable was "destroyed".
I don't know if it is bad practice to use this method, as a lot of unlabeled brackets could quickly make the code unreadable, but maybe some comments could clear things up.
For C++ it depends on what you are doing.
OK, it is stupid code but imagine
class myTimeEatingClass
{
public:
//constructor
myTimeEatingClass()
{
sleep(2000);
ms_usedTime+=2;
}
~myTimeEatingClass()
{
sleep(3000);
ms_usedTime+=3;
}
const unsigned int getTime() const
{
return ms_usedTime;
}
static unsigned int ms_usedTime;
};
myTimeEatingClass::ms_CreationTime=0;
myFunc()
{
for (int counter = 0; counter <= 10; counter++) {
myTimeEatingClass timeEater();
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
myOtherFunc()
{
myTimeEatingClass timeEater();
for (int counter = 0; counter <= 10; counter++) {
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
You will wait 55 seconds until you get the output of myFunc.
Just because each loop constructor and destructor together need 5 seconds to finish.
You will need 5 seconds until you get the output of myOtherFunc.
Of course, this is a crazy example.
But it illustrates that it might become a performance issue when each loop the same construction is done when the constructor and / or destructor needs some time.
Since your second question is more concrete, I'm going to address it first, and then take up your first question with the context given by the second. I wanted to give a more evidence-based answer than what's here already.
Question #2: Do most compilers realize that the variable has already
been declared and just skip that portion, or does it actually create a
spot for it in memory each time?
You can answer this question for yourself by stopping your compiler before the assembler is run and looking at the asm. (Use the -S flag if your compiler has a gcc-style interface, and -masm=intel if you want the syntax style I'm using here.)
In any case, with modern compilers (gcc 10.2, clang 11.0) for x86-64, they only reload the variable on each loop pass if you disable optimizations. Consider the following C++ program—for intuitive mapping to asm, I'm keeping things mostly C-style and using an integer instead of a string, although the same principles apply in the string case:
#include <iostream>
static constexpr std::size_t LEN = 10;
void fill_arr(int a[LEN])
{
/* *** */
for (std::size_t i = 0; i < LEN; ++i) {
const int t = 8;
a[i] = t;
}
/* *** */
}
int main(void)
{
int a[LEN];
fill_arr(a);
for (std::size_t i = 0; i < LEN; ++i) {
std::cout << a[i] << " ";
}
std::cout << "\n";
return 0;
}
We can compare this to a version with the following difference:
/* *** */
const int t = 8;
for (std::size_t i = 0; i < LEN; ++i) {
a[i] = t;
}
/* *** */
With optimization disabled, gcc 10.2 puts 8 on the stack on every pass of the loop for the declaration-in-loop version:
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
mov DWORD PTR -12[rbp], 8 ;✷
whereas it only does it once for the out-of-loop version:
mov DWORD PTR -12[rbp], 8 ;✷
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
Does this make a performance impact? I didn't see an appreciable difference in runtime between them with my CPU (Intel i7-7700K) until I pushed the number of iterations into the billions, and even then the average difference was less than 0.01s. It's only a single extra operation in the loop, after all. (For a string, the difference in in-loop operations is obviously a bit greater, but not dramatically so.)
What's more, the question is largely academic, because with an optimization level of -O1 or higher gcc outputs identical asm for both source files, as does clang. So, at least for simple cases like this, it's unlikely to make any performance impact either way. Of course, in a real-world program, you should always profile rather than make assumptions.
Question #1: Is declaring a variable inside a loop a good practice or
bad practice?
As with practically every question like this, it depends. If the declaration is inside a very tight loop and you're compiling without optimizations, say for debugging purposes, it's theoretically possible that moving it outside the loop would improve performance enough to be handy during your debugging efforts. If so, it might be sensible, at least while you're debugging. And although I don't think it's likely to make any difference in an optimized build, if you do observe one, you/your pair/your team can make a judgement call as to whether it's worth it.
At the same time, you have to consider not only how the compiler reads your code, but also how it comes off to humans, yourself included. I think you'll agree that a variable declared in the smallest scope possible is easier to keep track of. If it's outside the loop, it implies that it's needed outside the loop, which is confusing if that's not actually the case. In a big codebase, little confusions like this add up over time and become fatiguing after hours of work, and can lead to silly bugs. That can be much more costly than what you reap from a slight performance improvement, depending on the use case.
Once upon a time (pre C++98); the following would break:
{
for (int i=0; i<.; ++i) {std::string foo;}
for (int i=0; i<.; ++i) {std::string foo;}
}
with the warning that i was already declared (foo was fine as that's scoped within the {}). This is likely the WHY people would first argue it's bad. It stopped being true a long time ago though.
If you STILL have to support such an old compiler (some people are on Borland) then the answer is yes, a case could be made to put the i out the loop, because not doing so makes it makes it "harder" for people to put multiple loops in with the same variable, though honestly the compiler will still fail, which is all you want if there's going to be a problem.
If you no longer have to support such an old compiler, variables should be kept to the smallest scope you can get them so that you not only minimise the memory usage; but also make understanding the project easier. It's a bit like asking why don't you have all your variables global. Same argument applies, but the scopes just change a bit.
It's a very good practice, as all above answer provide very good theoretical aspect of the question let me give a glimpse of code, i was trying to solve DFS over GEEKSFORGEEKS, i encounter the optimization problem......
If you try to solve the code declaring the integer outside the loop will give you Optimization Error..
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
int flag=0;
int top=0;
while(!st.empty()){
top = st.top();
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
Now put integers inside the loop this will give you correct answer...
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
// int flag=0;
// int top=0;
while(!st.empty()){
int top = st.top();
int flag = 0;
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
this completely reflect what sir #justin was saying in 2nd comment....
try this here
https://practice.geeksforgeeks.org/problems/depth-first-traversal-for-a-graph/1. just give it a shot.... you will get it.Hope this help.
Chapter 4.8 Block Structure in K&R's The C Programming Language 2.Ed.:
An automatic variable declared and initialized in a
block is initialized each time the block is entered.
I might have missed seeing the relevant description in the book like:
An automatic variable declared and initialized in a
block is allocated only one time before the block is entered.
But a simple test can prove the assumption held:
#include <stdio.h>
int main(int argc, char *argv[]) {
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
int k;
printf("%p\n", &k);
}
}
return 0;
}
The two snippets below generate the same assembly.
// snippet 1
void test() {
int var;
while(1) var = 4;
}
// snippet 2
void test() {
while(1) int var = 4;
}
output:
test():
push rbp
mov rbp, rsp
.L2:
mov DWORD PTR [rbp-4], 4
jmp .L2
Link: https://godbolt.org/z/36hsM6Pen
So, until profiling opposes or computation extensive constructor is involved, keeping declation close to its usage should be the default approach.
I was reading C++ Primer by Stanley B. Lippman and at the part of flow control it shows an example of a for-loop like this one:
#include <iostream>
int main(void){
int sum=0;
for (int val=1; val <= 10; val++)
sum +=val;
std::cout << sum << std::endl;
return 0;
}
If I try std::cout << val; outside of the for-loop, the IDE gives me an error. But I want to understand why it happens and how it is different from this code:
#include <iostream>
int main(void){
int sum=0;
int val;
for ( val=1; val <= 10; val++)
sum +=val;
std::cout << sum << std::endl;
std::cout << val;
return 0;
}
Where I can actually print the val value without any problem.
Does it have something to do with local variable considering the for-loop a function that we are using inside of the main?
Every variable has scope, which is (loosely speaking) its lifetime.
A variable declared in the head of a for loop has scope limited to the for loop. When control passes out of the loop, the variable passes out of scope. Exactly the same thing happens to variables declared in a function, when control passes out of the function.
Does it have something to do with local variable considering the for-loop a function that we are using inside of the main?
Nothing to do with functions; there are plenty of other ways to get a scope, including a block:
int main()
{
{
int x = 0;
}
std::cout << x; // error
}
And a for loop is another example.
Its deliberate. Its called scope, Whenever you introduce a block { /*stuff here */ } (and a few other places) the variables inside the block are local to the block and supersede (hide) variables of the same name defined outside the block. It helps to ensure block-local code does not tread on other variables. Can be useful when editing an old codebase and you don't want to accidentally use a wider scoped variable.
The concept behind this is called Scope of the variable. Whenever a variable is created a memory location is assigned to it. So in order to optimize the memory usage scope of the variable is defined.
In the first case, the variable is limited only to the for loop. The variable is created inside at the beginning of the loop, utilized inside the loop and destroyed when the execution comes out of the loop.
In second case, the variable is created outside the loop so it persists till the execution of the program and hence you were able to access it even after the loop.
The variable declared inside the for loop is an example of local variable and that declared outside is an example of global variables.
Not only with the loops, the local variables will also be used in if..else blocks, functions and their usage is limited only within the scope.
Hope this answers your question.
Running this exact code does not turn up any errors for me. I get a value of 55 for sum and 11 for val as expected. Because you defined val outside of your for loop but in main, it should be good everywhere within the main() function to the best of my knowledge. As a general rule, if you declare a variable in between two curly braces without any fancy keywords, it should stay valid within those two curly braces (being "local" to that block of code). I am no c++ guru, so there might be exceptions to that (I am also not sure if "if" statements count as code blocks whose locally defined variables only work within the statement).
Alright, I wanna know why this code is working, I just realized that I have two variables with the same name within the same scope.
I'm using g++ (gcc 4.4).
for(int k = 0 ; k < n ; k++)
{
while(true)
{
i = Tools::randomInt(0, n);
bool exists = false;
for(int k = 0 ; k < p_new_solution_size ; k++)
if( i == p_new_solution[k] )
{
exists = true;
break;
}
if(!exists)
break;
}
p_new_solution[p_new_solution_size] = i;
p_new_solution_size++;
}
The k in the inner for loop shadows (or hides) the k in the outer for loop.
You can declare multiple variables with the same name at different scopes. A very simple example would be the following:
int main()
{
int a; // 'a' refers to the int until it is shadowed or its block ends
{
float a; // 'a' refers to the float until the end of this block
} // 'a' now refers to the int again
}
Alright, I wanna know why this code is working, I just realized that I have two variables with the same name within the same scope.
You seem confused about scopes. They're not "within the same" scope... the for loop's k has it's own nested/inner scope. More importantly, to see why the language allows it, consider:
#define DO_SOMETHING \
do { for (int i = 1; i <= 2; ++i) std::cout << i << '\n'; } while (false)
void f()
{
for (int i = 1; i <= 10; ++i)
DO_SOMETHING();
}
Here, the text substituted by the macro "DO_SOMETHING" gets evaluated in the same scope as i. If you're writing DO_SOMETHING, you may need its expansion to store something in a variable, and settle on the identifier i - obviously you have no way of knowing if it'll already exist in the calling context. You could try to pick something more obscure, but you'd have people using such convoluted variable names that their code maintainability suffered, and regardless sooner or later there would be a clash. So, the language just lets the inner scopes introduce variables with the same name: the innermost match is used until its scope exits.
Even when you're not dealing with macros, it's a pain to have to stop and think about whether some outer scope is already using the same name. If you know you just want a quick operation you can pop it an indepedent (nested) scope without considering that larger context (as long as you don't have code in there that actually wants to use the outer-scope variable: if you do then you can sometimes specify it explicitly (if it's scoped by namespaces and classes, but if it's in a function body you do end up needing to change your own loop variable's name (or create a reference or something to it before introducing your same-named variable)).
From the standard docs, Sec 3.3.1
Every name is introduced in some portion of program text called a declarative region, which is the largest part of the
program in which that name is valid, that is, in which that name may be used as an unqualified name to refer to the
same entity. In general, each particular name is valid only within some possibly discontiguous portion of program text
called its scope. To determine the scope of a declaration, it is sometimes convenient to refer to the potential scope of
a declaration. The scope of a declaration is the same as its potential scope unless the potential scope contains another
declaration of the same name. In that case, the potential scope of the declaration in the inner (contained) declarative
region is excluded from the scope of the declaration in the outer (containing) declarative region.
It might sound confusing in your first read, but it does answer your question.
The potential scope is same as the scope of the declaration unless another (inner) declaration occurs. If occurred, the potential scope of the outer declaration is removed and just the inner declaration's holds.
Hope am clear and it helps..
Because you are allowed to have two variables of the same name within the same scope, but not within the same declaration space. The compiler takes the most-local variable of the appropriate name, similar to how you can 'hide' global variables of name X with a member variable of name X. You should be getting a warning though.
In C/C++ the variable scope is limited by the braces so the following code is valid for the compiler:
int k()
{
int k;
{
int k;
{
int k;
}
}
}