C++ Loops, scope, and stack (why does this work?) - c++

I'm finding it difficult to find specific answers for questions like this, but I'll explain the way I think think this is supposed to (not) work, and maybe someone can tell me where I'm wrong.
I create an integer array of size 10 in main which lives on the stack. Then I use a for loop to load integers 0-9 into that array. When the for loop ends, the stack pointer returns to where it was after initializing arr1, but those values still exist in memory, and the array is still pointing to them.
Then, I use another for loop to flood the stack, which should overwrite the values created in the first for loop, and now when I go to access my values in arr1, they should be pointing to integers with values of 5000.
When I print the array, it still prints out numbers 0-9. Why does this work? I thought data declared in a loop was popped off the stack when it goes out of scope. Thanks
int main(void)
{
int arr1[10];
for(int i = 0; i < 10; i++)
arr1[i] = i;
for(int i = 0; i < 500; i++)
int a = 5000;
for(int i = 0; i < 10; i++)
cout << arr1[i] << endl;
system("PAUSE");
return 0;
}

You seem to have misunderstood data lifetime in C++. When local data is declared in a scope block (whether in a function, loop, or other control structure), it remains on the stack until that scope block ends.
That means the entirety of arr1 remains allocated and on the stack until main() finishes, no matter what else you do. It will not be overwritten by subsequent local variables in the same function, or variables nested in deeper scope blocks.
In your second loop, variable a is (theoretically) being created and destroyed on every iteration of the loop body. That means you don't have 500 instances of a on the stack during the loop -- you only ever have one. (In practice, the compiler almost certainly optimises that out though, since it's not doing anything useful.)

No, the array is not "pointing to" the values. arr1 is 10 integers, it is not 10 pointers to integers. The values are stored in the array. Creating and then immediately destroying another variable a, even doing it 500 times, does not change what is stored in the array arr1.
It's possible that you have come from a language where variables and/or array elements by definition are references to objects. That is not the case in C++.
And for what it's worth, C++ implementations generally do not move the stack pointer around when you enter and exit a for loop. Generally they only generate one stack frame per function containing all the variables that function needs, although they are permitted to do what they like provided that they call destructors at the correct time.

As you say "data declared in a loop was popped off the stack when it goes out of scope." That is true. But in your sample, the array is NOT declared inside a loop. It is declared outside of the loop and is still in scope for the rest of main().
If you had something like this:
char * arr1[10];
for(int i = 0; i < 10; i++)
{
char * text[16];
sprintf(text, "%d", i);
arr1[i] = text;
}
for(int j = 0; j < 10; ++j)
printf ("%s\n", arr1[i]);
then this would crash because the text[] arrays are declared inside the first loop and they go out of scope at the end of the loop, leaving pointers to the variables that are out of scope. I think this is what you had in mind.

arr1 and a are in totally independent memory locations. You can prove this with this line:
printf("%lu %p %p %p", \
sizeof(int), (void *)&arr1[0], \
(void *)&arr1[9], (void *)&a);
For example, on my machine I got this output:
4 0x7fff5541e640 0x7fff5541e664 0x7fff5541e63c
As you can see, &a is four bytes (sizeof(int)) from the first element of arr1.
The following statement:
for(int i = 0; i < 500; i++)
int a = 5000;
Does not have the effect that you're suspecting. It simply assigns the value 5000 to the variable a over and over again, or, has been optimized and only does it once or even not at all because a is not used.

Related

pointer value change C++ [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 4 years ago.
I'm relatively new too C++ programming. While I was working on a code about arguments passing with an array of character pointers. I encountered a problem where the value of my pointers are changed after certain operations. Below is my code.
#include <iostream>
using namespace std;
void input(char* argv[], int &i)
{
char buff[10][20]; //buffer string array
while (cin.peek() != '\n') {
cin >> buff[i++];
}
for (int j = 0; j < i; j++) {
argv[j] = buff[j];
}
argv[i] = NULL; // putting a NULL at the end
}
int main(int argc, char* argv[])
{
char *arg[10];
int i = 0;
input(arg, i); //input the arguments
for (int j = 0; j < i; j++) {
cout << arg[j] << endl; //output the arguments entered
}
return 0;
}
The sub-function void input(char* argv[], int &i) is supposed to let me input my arguments as many as 9 times or when an enter key is pressed. While i indicates the total number of arguments.
The arguments are then stored as an array of character pointers and then pass it back to the main function's char *arg[10] to hold.
However, I found that after
cout << arg[j] << endl;
The values of arg are lost, and random values are being printed.
You're creating a two-dimensional array of characters buff on the stack, and then you're returning pointers into that array through the argv parameter. But buff lives on the stack and ceases to exist as soon as the input function exits. The memory used by buff will be overwritten by other functions that you call after calling input.
You should allocate buff in main and then pass it into input so it continues to live in the scope of main after input returns.
Another option would be to allocate heap space for buff in input. In this case the main function would be responsible for freeing the memory after it was done with it.
Obviously there are more advanced C++ features you could use to avoid some of this overhead. Though this is a C++ program, it's effectively written as C. But understanding how memory and pointers work is essential to understanding the problems that the newer C++ features solve.
the value of my pointers are changed
The pointers are the only things that weren't damaged. The problem is the memory they point to.
You can prove the first part by printing the value of each of these pointers, or just inspecting them in the debugger. (You can print the address rather than the C-string it points to by casting to void, like cout << static_cast<void*>(arg[j]) << '\n').
So what happened to your C strings? Well, you declared an automatic-scope array variable inside the function input. That array ceases to exist when the function exits, just like any other automatic-scope variable. Accessing the memory where a variable used to live, after the variable ceases to exist, is illegal.
The fact that you returned pointers into this array doesn't make it legal to read through (dereference) them after the array itself goes out of scope, and this is in fact undefined behaviour.
The contents being overwritten is actually the best case, because it meant you noticed the bug: it could legally have crashed or, even worse, appeared to work flawlessly until after you submitted/deployed/sold the program, and crashed every run thereafter.
Think of the stack as being a large (but not unlimited) amount of memory. It is allocated and freed simply be moving the stack pointer down and up (the directions will depend on the hardware).
Here's your code with some annotations.
input(arg, i);
// when you get here the stack pointer will have been moved up, freeing the space
// that was allocated for 'buf' in 'input'
// the space for 'j' could overwrite the space where 'buf' was
for (int j = 0; j < i; j++) {
// the calls to 'cout' and 'end;' could overwrite the space where 'buf was'
cout << arg[j] << endl;
}

Getting sementation fault (core dumped)

Everything seems to run okay up until the return part of shuffle_array(), but I'm not sure what.
int * shuffle_array(int initialArray[], int userSize)
{
// Variables
int shuffledArray[userSize]; // Create new array for shuffled
srand(time(0));
for (int i = 0; i < userSize; i++) // Copy initial array into new array
{
shuffledArray[i] = initialArray[i];
}
for(int i = userSize - 1; i > 0; i--)
{
int randomPosition = (rand() % userSize;
temp = shuffledArray[i];
shuffledArray[i] = shuffledArray[randomPosition];
shuffledArray[randomPosition] = temp;
}
cout << "The numbers in the initial array are: ";
for (int i = 0; i < userSize; i++)
{
cout << initialArray[i] << " ";
}
cout << endl;
cout << "The numbers in the shuffled array are: ";
for (int i = 0; i < userSize; i++)
{
cout << shuffledArray[i] << " ";
}
cout << endl;
return shuffledArray;
}
Sorry if spacing is off here, not sure how to copy and past code into here, so I had to do it by hand.
EDIT: Should also mention that this is just a fraction of code, not the whole project I'm working on.
There are several issues of varying severity, and here's my best attempt at flagging them:
int shuffledArray[userSize];
This array has a variable length. I don't think that it's as bad as other users point out, but you should know that this isn't allowed by the C++ standard, so you can't expect it to work on every compiler that you try (GCC and Clang will let you do it, but MSVC won't, for instance).
srand(time(0));
This is most likely outside the scope of your assignment (you've probably been told "use rand/srand" as a simplification), but rand is actually a terrible random number generator compared to what else the C++ language offers. It is rather slow, it repeats quickly (calling rand() in sequence will eventually start returning the same sequence that it did before), it is easy to predict based on just a few samples, and it is not uniform (some values have a much higher probability of being returned than others). If you pursue C++, you should look into the <random> header (and, realistically, how to use it, because it's unfortunately not a shining example of simplicity).
Additionally, seeding with time(0) will give you sequences that change only once per second. This means that if you call shuffle_array twice quickly in succession, you're likely to get the same "random" order. (This is one reason that often people will call srand once, in main, instead.)
for(int i = userSize - 1; i > 0; i--)
By iterating to i > 0, you will never enter the loop with i == 0. This means that there's a chance that you'll never swap the zeroth element. (It could still be swapped by another iteration, depending on your luck, but this is clearly a bug.)
int randomPosition = (rand() % userSize);
You should know that this is biased: because the maximum value of rand() is likely not divisible by userSize, you are marginally more likely to get small values than large values. You can probably just read up the explanation and move on for the purposes of your assignment.
return shuffledArray;
This is a hard error: it is never legal to return storage that was allocated for a function. In this case, the memory for shuffledArray is allocated automatically at the beginning at the function, and importantly, it is deallocated automatically at the end: this means that your program will reuse it for other purposes. Reading from it is likely to return values that have been overwritten by some code, and writing to it is likely to overwrite memory that is currently used by other code, which can have catastrophic consequences.
Of course, I'm writing all of this assuming that you use the result of shuffle_array. If you don't use it, you should just not return it (although in this case, it's unlikely to be the reason that your program crashes).
Inside a function, it's fine to pass a pointer to automatic storage to another function, but it's never okay to return that. If you can't use std::vector (which is the best option here, IMO), you have three other options:
have shuffle_array accept a shuffledArray[] that is the same size as initialArray already, and return nothing;
have shuffle_array modify initialArray instead (the shuffling algorithm that you are using is in-place, meaning that you'll get correct results even if you don't copy the original input)
dynamically allocate the memory for shuffledArray using new, which will prevent it from being automatically reclaimed at the end of the function.
Option 3 requires you to use manual memory management, which is generally frowned upon these days. I think that option 1 or 2 are best. Option 1 would look like this:
void shuffle_array(int initialArray[], int shuffledArray[], int userSize) { ... }
where userSize is the size of both initialArray and shuffledArray. In this scenario, the caller needs to own the storage for shuffledArray.
You should NOT return a pointer to local variable. After the function returns, shuffledArray gets deallocated and you're left with a dangling pointer.
You cannot return a local array. The local array's memory is released when you return (did the compiler warn you about that). If you do not want to use std::vector then create yr result array using new
int *shuffledArray = new int[userSize];
your caller will have to delete[] it (not true with std::vector)
When you define any non static variables inside a function, those variables will reside in function's stack. Once you return from function, the function's stack is gone. In your program, you are trying to return a local array which will be gone once control is outside of shuffle_array().
To solve this, either you need to define the array globally (which I won't prefer because using global variables are dangerous) or use dynamic memory allocation for the array which will create space for the array in heap rather than allocating the space on the function's stack. You can use std::vectors also, if you are familiar with vectors.
To allocate memory dynamically, you have to use new as mentioned below.
int *shuffledArray[] = new int[userSize];
and once you completed using shuffledArray, you need to free the memory as below.
delete [] shuffledArray;
otherwise your program will leak memory.

Declaring variables in C/C++

Someone told me: "declaring variables close to their use have value". He corrected me:
void student_score(size_t student_list_size) {
// int exam;
// int average;
// int digit;
// int counter_digits;
for (size_t i = 0; i < student_list_size; i++) {
int exam;
int average;
int digit;
int counter_digits;
I think it's bad, because here variables initialized every loop. What's true?
I encourage to declare them in as local a scope as possible, and as close to the first use as possible. This makes it easier for the reader to find the declaration and see what type the variable is and what it was initialized to. And of course, compiler will optimize it.
Both methods would be optimised to the same thing by the compiler. But for readability and ease of future maintenance, declaring the variables within the loop might be preferred by some.
I think it's bad, because here variables initialized every loop. What's true?
The code given the variables are not initialised at all.
So it is just a matter of personal taste.
It depends. C and C++ work differently in that regard. In C, variables MUST be declared at the start of its scope (forget that, it depends on your compiler although it holds true in pre-C99 compilers, as pointed out in the comments - thanks guys!), while in C++ you can declare them anywhere.
Now, it depends. Let's suppose the following two pieces of code:
int i = 0;
while (i < 5) {
i++;
}
while (i < 5) {
int i = 0;
i++;
}
In this case, the first piece of code is what's gonna work, because in the second case you declare the variable at each loop. But, let's suppose the following...
int i = 0;
while (i < 5) {
std::String str = "The number is now " + std::to_string(i);
cout << str << endl;
i++;
}
In short, declare the variables where it makes most sense to you and your code. Unless you're microoptimizing, like most things, it all depends on context.
This seems like a discussion with many personal opinions, so I'll add mine: I am a C programmer and like the C rules of having to declare your variables at the begining of a function (or block). For one thing, it tells me the stack lay-out so if I create another bug (I do "occassionaly") and overwrite something, I can determine the cause from just the stack lay-out. Then, when having to go through C++ code, I always have to search where the variable is declared. I rather look at the function beginning and see them all neatly declared.
I do ocassionally want to write for (int i=0;... and know the variable goes out of scope following the for loop (I hope it goes out of scope, as it is declared before the block begins).
Then there is an example in another answer:
while (i < 5) {
int i = 0;
i++;
}
and I hope it doesn't work because if it works it means there must be a variable i declared before this loop and so you have a clash of variables and scopes which can be horrible to debug (the loop will never terminate because the wrong i is incremented). I mean, I hope all variables must have been declared before their use.
I believe you must have a discipline in declaring your variables. The discipline can vary per person, just as long as it is easy to find where a variable is declared and what its scope is.
First of all that is not valid C code unless you are using -std=c99.
Furthermore, as long as you aren't creating dangling pointers in the loop, there is no problem with it.
The advantage you gain from doing it this way is that these variables are in a tight knit scope.
You need to know some knowledge of what local variable is ?
A variable declared inside a function is known as local variable. Local variables are also called automatic variables.
Scope: The area where a variable can be accessed is known as scope of variable.
Scope of local variable:
Local variable can be used only in the function in which it is declared.
Lifetime:
The time period for which a variable exists in the memory is known as lifetime of variable.
Lifetime of local variable:
Lifetime of local variables starts when control enters the function in which it is declared and it is destroyed when control exists from the function or block.
void student_score(size_t student_list_size) {
// int exam;
// int average;
// int digit;
// int counter_digits;
/* The above variable declared are local to the function student_score and can't be used
in other blocks. In your case its the for loop. */
for (size_t i = 0; i < student_list_size; i++) {
int exam;
int average;
int digit;
int counter_digits;
...
/* The above declared variable are local to the for loop and not to the function.*/
Consider this example:
#include<stdio.h>
void fun();
int main()
{
fun();
return 0;
}
void fun()
{
int i,temp=100,avg=200;
for (i=0;i<2;i++)
{
int temp,avg;
temp = 10 + 20;
avg = temp / 2;
printf("inside for loop: %d %d",temp,avg);
printf("\n");
}
printf("outside for loop: %d %d\n",temp,avg);
}
Output:
inside for loop: 30 15
inside for loop: 30 15
outside for loop: 100 200
If you are declaring the variable in loops the then declare the variable as static (In case if the value of the variable to be saved/used for further iteration).
Even compiler spend lot of time to initialize the variable in loops.
My suggestion is to declare at the beginning of the function. its a good programming practice.
I think, i made my point.

Declaration and initialize pointer-to-pointer char-array

Concerning a pointer to pointer I wonder if its ok with the following:
char** _charPp = new char*[10];
for (int i = 0; i < 10; i++)
ยด _charPp[i] = new char[100];
That is - if I want a pointer to pointer that points to 10 char-arrays.
The question - is this ok now or do I have to make some kind of initializing of every char-arrays? Then - how do I do that?
I will later on in the program fill these arrays with certain chars but i suspect that the arrays should be initialized with values before such as "0"
Yes, it looks pretty much suitable,
int arraySizeX = 100;
int arraySizeY = 100;
char** array2d = new char*[arraySizeX] // you've just allocated memory that
// holds 100 * 4 (average) bytes
for(int i = 0; i < arraySizeX; ++i)
{
array2d[i] = new char[arraySizeY] // you've just allocated a chunk that
//holds 100 char values. It is 100 * 1 (average) bytes
memset(array2d[i], 0, arraySizeY) // to make every element NULL
}
I will later on in the program fill these arrays with certain chars
but i suspect that the arrays should be initialized with values before
such as "0"
No, there's no "default value" in C++. You need to assign 0 to the pointer to make it null, or use memset() to do it with arrays.
Netherwire's answer shows correctly how to initialize the chars to 0. I'll address the first part of your question.
There is no requirement to initialize them yet. As long as you initialize the chars before they're read, it's correct.
Depending on the structure of your code and how you use the arrays, it may be safer to do anyway since it can be hard to find the bug if you eventually do read some of them before initialization.
Also, remember to later delete[] the arrays that you've allocated. Or better yet, consider using std::vector instead (and possibly std::string depending on what you use those arrays for.)

100% of array correct in function, 75% of array correct in CALLING function - C

Note: i'm using the c++ compiler, hence why I can use pass by reference
i have a strange problem, and I don't really know what's going on.
Basically, I have a text file: http://pastebin.com/mCp6K3HB
and I'm reading the contents of the text file in to an array of atoms:
typedef struct{
char * name;
char * symbol;
int atomic_number;
double atomic_weight;
int electrons;
int neutrons;
int protons;
} atom;
this is my type definition of atom.
void set_up_temp(atom (&element_record)[DIM1])
{
char temp_array[826][20];
char temp2[128][20];
int i=0;
int j=0;
int ctr=0;
FILE *f=fopen("atoms.txt","r");
for (i = 0; f && !feof(f) && i < 827; i++ )
{
fgets(temp_array[i],sizeof(temp_array[0]),f);
}
for (j = 0; j < 128; j++)
{
element_record[j].name = temp_array[ctr];
element_record[j].symbol = temp_array[ctr+1];
element_record[j].atomic_number = atol(temp_array[ctr+2]);
element_record[j].atomic_weight = atol(temp_array[ctr+3]);
element_record[j].electrons = atol(temp_array[ctr+4]);
element_record[j].neutrons = atol(temp_array[ctr+5]);
element_record[j].protons = atol(temp_array[ctr+6]);
ctr = ctr + 7;
}
//Close the file to free up memory and prevent leaks
fclose(f);
} //AT THIS POINT THE DATA IS FINE
Here is the function I'm using to read the data. When i debug this function, and let it run right up to the end, I use the debugger to check it's contents, and the array has 100% correct data, that is, all elements are what they should be relative to the text file.
http://i.imgur.com/SEq9w7Q.png This image shows what I'm talking about. On the left, all the elements, 0, up to 127, are perfect.
Then, I go down to the function I'm calling it from.
atom myAtoms[118];
set_up_temp(myAtoms); //AT THIS POINT DATA IS FINE
region current_button_pressed; // NOW IT'S BROKEN
load_font_named("arial", "cour.ttf", 20);
panel p1 = load_panel("atomicpanel.txt");
panel p2 = load_panel("NumberPanel.txt");
As soon as ANYTHING is called, after i call set_up_temp, the elements 103 to 127 of my array turn in to jibberish. As more things get called, EVEN MORE of the array turns to jibberish. This is weird, I don't know what's happening... Does anyone have any idea? Thanks.
for (j = 0; j < 128; j++)
{
element_record[j].name = temp_array[ctr];
You are storing, and then returning, pointers into temp_array, which is on the stack. The moment you return from the function, all of temp_array becomes invalid -- it's undefined behavior to dereference any of those pointers after that point. "Undefined behavior" includes the possibility that you can still read elements 0 through 102 with no trouble, but 103 through 127 turn to gibberish, as you say. You need to allocate space for these strings that will live as long as the atom object. Since as you say you are using C++, the easiest fix is to change both char * members to std::string. (If you don't want to use std::string, the second easiest fix is to use strdup, but then you have to free that memory explicitly.)
This may not be the only bug in this code, but it's probably the one causing your immediate problem.
In case you're curious, the reason the high end of the data is getting corrupted is that on most (but not all) computers, including the one you're using, the stack grows downward, i.e. from high addresses to low. Arrays, however, always index from low addresses to high. So the high end of the memory area that used to be temp_array is the part that's closest to the stack pointer in the caller, and thus most likely to be overwritten by subsequent function calls.
Casual inspection yields this:
char temp_array[826][20];
...
for (i = 0; f && !feof(f) && i < 827; i++ )
Your code potentially allows i to become 826. Which means you're accessing the 827th element of temp_array. Which is one past the end. Oops.
Additionally, you are allocating an array of 118 atoms (atom myAtoms[118];) but you are setting 128 of them inside of set_up_temp in the for (j = 0; j < 128; j++) loop.
The moral of this story: Mind your indices and since you use C++ leverage things like std::vector and std::string and avoid playing with arrays directly.
Update
As Zack pointed out, you're returning pointers to stack-allocated variables which will go away when the set_up_temp function returns. Additionally, the fgets you use doesn't do what you think it does and it's HORRIBLE code to begin with. Please read the documentation for fgets and ask yourself what your code does.
You are allocating an array with space for 118 elements but the code sets 128 of them, thus overwriting whatever happens to live right after the array.
Also as other noted you're storing in the array pointers to data that is temporary to the function (a no-no).
My suggestion is to start by reading a good book about C++ before programming because otherwise you're making your life harder for no reason. C++ is not a language in which you can hope to make serious progress by experimentation.