Doesn't the space occupied by a variable get deallocated as soon as the control is returned from the function??
I thought it got deallocated.
Here I have written a function that works fine even though CoinDenom returns the address of a local array. I use it to print the minimum number of coins required to make up a sum.
How is it able to print the right answer if the space got deallocated??
int* CoinDenom(int CoinVal[],int NumCoins,int Sum) {
int min[Sum+1];
int i,j;
min[0]=0;
for(i=1;i<=Sum;i++) {
min[i]=INT_MAX;
}
for(i=1;i<=Sum;i++) {
for(j=0;j< NumCoins;j++) {
if(CoinVal[j]<=i && min[i-CoinVal[j]]+1<min[i]) {
min[i]=min[i-CoinVal[j]]+1;
}
}
}
return min; //returning address of a local array
}
int main() {
int Coins[50],Num,Sum,*min;
cout<<"Enter Sum:";
cin>>Sum;
cout<<"Enter Number of coins :";
cin>>Num;
cout<<"Enter Values";
for(int i=0;i<Num;i++) {
cin>>Coins[i];
}
min=CoinDenom(Coins,Num,Sum);
cout<<"Min Coins required are:"<< min[Sum];
return 0;
}
The contents of the memory taken by local variables are undefined after the function returns, but in practice they'll stay unchanged until something actively changes them.
If you change your code to do some significant work between populating that memory and then using it, you'll see it fail.
What you have posted is not C++ code - the following is illegal in C++:
int min[Sum+1];
But in general, your program exhibits undefined behaviour. That means anything could happen - it could even appear to work.
The space is "deallocated" when the function returns - but that doesn't mean the data isn't still there in memory. The data will still be on the stack until some other function overwrites it. That is why these kinds of bugs are so tricky - sometimes it'll work just fine (until all of a sudden it doesn't).
You need to allocate memory on the heap for the array you return.
int* CoinDenom(int CoinVal[],int NumCoins,int Sum) {
int *min= new int[Sum+1];
int i,j;
min[0]=0;
for(i=1;i<=Sum;i++) {
min[i]=INT_MAX;
}
for(i=1;i<=Sum;i++) {
for(j=0;j< NumCoins;j++) {
if(CoinVal[j]<=i && min[i-CoinVal[j]]+1<min[i]) {
min[i]=min[i-CoinVal[j]]+1;
}
}
}
return min; //returning pointer to heap memory; the caller must delete[] it
}
min=CoinDenom(Coins,Num,Sum);
cout<<"Min Coins required are:"<< min[Sum];
delete[] min;
return 0;
In your case you are able to see the correct values only because nothing has overwritten them yet. In general this situation is unpredictable.
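As an alternative to the new[]/delete[] version above, here is a minimal sketch that returns a std::vector by value, so the table outlives the call and nobody has to remember delete[]. The name coinDenom and its signature are my own, not the asker's; it assumes sum is non-negative, ignores non-positive coin values, and skips unreachable sub-sums so that INT_MAX + 1 is never computed.

```cpp
#include <climits>
#include <vector>

// Hypothetical rewrite of CoinDenom: the table is returned by value,
// so its storage is owned by the caller and released automatically.
std::vector<int> coinDenom(const std::vector<int>& coinVal, int sum) {
    std::vector<int> best(sum + 1, INT_MAX); // INT_MAX marks "not reachable"
    best[0] = 0;
    for (int i = 1; i <= sum; ++i) {
        for (int c : coinVal) {
            // Only extend sub-sums that are reachable, so INT_MAX + 1 never happens.
            if (c > 0 && c <= i && best[i - c] != INT_MAX && best[i - c] + 1 < best[i]) {
                best[i] = best[i - c] + 1;
            }
        }
    }
    return best;
}
```

A caller would read the answer as coinDenom(coins, sum)[sum]; an entry that is still INT_MAX means the sum cannot be formed from the given coins.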
That array is on the stack, which in most implementations, is a pre-allocated contiguous block of memory. You have a stack pointer that points to the top of the stack, and growing the stack means just moving the pointer along it.
When the function returned, the stack pointer was set back, but the memory is still there and if you have a pointer to it, you could access it, but it's not legal to do so -- nothing will stop you, though. The memory values in the array's old space will change the next time the stack depth runs over the area where the array is.
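The stack-pointer behaviour described above can be imitated with a toy model in well-defined C++. This is only a sketch: ToyStack is a made-up type, it does no bounds checking, and a real call stack is managed by the compiler rather than by code like this.

```cpp
#include <array>
#include <cstddef>

// Toy model of a call stack: a fixed block of memory plus a "stack pointer".
// Hypothetical type for illustration only; callers must stay within 64 slots.
struct ToyStack {
    std::array<int, 64> memory{}; // the pre-allocated contiguous block
    std::size_t top = 0;          // index of the next free slot

    // "Allocate" n ints by moving the stack pointer; returns the start of the block.
    int* push(std::size_t n) {
        int* block = &memory[top];
        top += n;
        return block;
    }

    // "Deallocate" n ints: only the pointer moves, the contents are untouched.
    void pop(std::size_t n) { top -= n; }
};
```

After a push, a write and a pop, the written value is still sitting in memory, and the next push of the same size hands out the same address and may overwrite it. That mirrors why the stale array in the question kept its values until something else used that region.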
The variable you use for the array is allocated on the stack, and the stack is fully available to the program - the space is not blocked or otherwise hidden.
It is deallocated in the sense that it can be reused later for other function calls and in the sense that destructors get called for variables allocated there. Destructors for integers are trivial and don't do anything. That's why you can access it and it can happen that the data has not been overwritten yet and you can read it.
The answer is that there's a difference between what the language standard allows, and what turns out to work (in this case) because of how the specific implementation works.
The standard says that the memory is no longer used, and so must not be referenced.
In practice, local variables live on the stack. The stack memory is not freed until the application terminates, which means you will typically not get an access violation/segmentation fault for reading or writing stack memory. But you're still violating the rules of C++, and it won't always work. The compiler is free to overwrite it at any time.
In your case, the array data has simply not been overwritten by anything else yet, so your code appears to work. Call another function, and the data gets overwritten.
How is it able to print the right answer if the space got deallocated??
When memory is deallocated, it still exists, but it may be reused for something else.
In your example, the array has been deallocated but its memory hasn't yet been reused, so its contents haven't yet been overwritten with other values, which is why you're still able to read from it the values that you wrote.
The fact that it won't have been reused yet is not guaranteed; and the fact that you can even read from it at all after it's deallocated is also not guaranteed: so don't do it.
This might or might not work, behaviour is undefined and it's definitely wrong to do it like this. Most compilers also give a compiler warning, for example GCC:
test.cpp:8: warning: address of local variable `min' returned
Memory is like clay that never hardens. Allocating memory is like taking some clay out of the clay pot. Maybe you make a cat and a cheeseburger. When you have finished, you deallocate the clay by putting your figures back into the pot, but just being put into the pot does not make them lose their shape: if you or someone else looks into the pot, they will continue to observe your cat and cheeseburger sitting on the top of the clay stack until someone else comes along and makes them into something else.
The NAND gates in the memory chips are the clay, the substrate that holds the NAND gates is the clay pot, and the particular voltages that represent the value of your variables are your sculptures. Those voltages do not change just because your program has taken them off the list of things it cares about.
You need to understand the stack. Add this function,
void f()
{
int a[5000];
memset( a, 0, sizeof(a) );
}
and then call it immediately after calling CoinDenom() but before writing to cout. You'll find that it no longer works.
Your local variables are stored on the stack. CoinDenom() returns a memory address that points into the stack. Very simplified and leaving out lots of details, say the stack pointer is pointing to 0x1000 just before you call CoinDenom. An int* (Coins) is pushed on the stack. This becomes CoinVal[]. Then an int, Num, which becomes NumCoins. Then another int, Sum, which becomes Sum. That's 3 ints at 4 bytes/int. Then space for the local variables:
int min[Sum+1];
int i,j;
which would be (Sum + 3) * 4 bytes/int. Say Sum = 2, that gives us another 20 bytes total, so the stack pointer gets incremented by 32 bytes to 0x1020. (All of main's locals are below 0x1000 on the stack.) min is going to point to 0x100c. When CoinDenom() returns, the stack pointer is decremented, "freeing" that memory, but unless another function is called to have its own locals allocated in that memory, nothing's going to happen to change what's stored in that memory.
See http://en.wikipedia.org/wiki/Calling_convention for more detail on how the stack is managed.
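The arithmetic in the walkthrough above can be checked with a few lines of code. This is a sketch of that simplified model only: it assumes 4-byte ints and pointers, a stack that grows upward from 0x1000, and no padding or return address. FrameLayout and coinDenomFrame are names made up for the illustration.

```cpp
#include <cstdint>

// Result of the simplified layout: where min starts and where the stack
// pointer ends up once the arguments and locals have been reserved.
struct FrameLayout {
    std::uint32_t minAddress;
    std::uint32_t stackPointer;
};

// Simplified model (assumption): 4 bytes per int and per pointer, no padding.
inline FrameLayout coinDenomFrame(std::uint32_t base, std::uint32_t sum) {
    const std::uint32_t argBytes = 3 * 4;           // CoinVal, NumCoins, Sum
    const std::uint32_t localBytes = (sum + 3) * 4; // min[Sum+1] plus i and j
    return FrameLayout{base + argBytes, base + argBytes + localBytes};
}
```

With base 0x1000 and Sum = 2 this gives 12 argument bytes plus 20 local bytes, so min starts at 0x100c and the stack pointer ends at 0x1020, matching the numbers in the text.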
Even in C (not just C++) you can declare variables at the start of a code block, which is enclosed in curly braces.
Example:
#include <stdio.h>
void use_stack(int cnt)
{
if (cnt<=16) {
int a[16];
int i;
a[0]=3;
a[1]=5;
for (i=2;i<cnt;i++) {
a[i]=a[i-1]+a[i-2];
}
printf("a[%d] == %d\n",cnt-1,a[cnt-1]);
}
else {
printf("cnt is too big\n");
}
}
Now I know that variables like the array a[16] are allocated on the stack in this case.
I was wondering if the space for this array is allocated at the start of the function (first opening curly brace) or at the start of the block where it is declared (opening curly brace after if).
From examining the assembler code it seems the compiler allocates the space for a[16] directly at the entry of the function.
I actually expected that the stack would be allocated (stack pointer decreased) at the declaration of a[16] and that the stack would be de-allocated (stack pointer increased) at the end of the corresponding if code block.
But this does not happen it seems (stack for a[16] is allocated directly at function entry, even if a[16] is not used in the else branch).
Has anyone an explanation why this is the case ?
So is there any part of the C language standard, which explains this behavior, or does it have to do with things like "longjmp" or signal handling, which maybe require that the stack pointer is "constant" inside a function ?
Note: The reason why I assumed the stack would be allocated/deallocated at the start/end of the code block is, because in C++ the constructor/destructor of objects allocated on the stack will be called at the start/end of the code block. So if you examine the assembler code of a C++ program you will notice that the stack is still allocated at the function entry; just the constructor/destructor call will be done at the start/end of the code block.
I am explicitly interested why stack is not allocated/deallocated at the start/end of a code block using curly braces.
The question: At what exact moment is a local variable allocated storage? is only about a local variable allocated at the start of a function. I am surprised that stack allocation for variables allocated later inside a code block is also done at the function entry.
So far the answers have been:
Something to do with optimization
Might be different for C and C++
Stack is not even mentioned in the C language specification
So: I am interested in the answer for C... (and I strongly believe that the answer will apply to C++ also, but I am not asking about C++ :-)).
Optimization: Here is an example which will directly demonstrate why I am so surprised and why I am quite sure that this is not an optimization:
#include <stdio.h>
static char *stackA;
static char *stackB;
static void calc(int c,int *array)
{
int result;
if (c<=0) {
// base case c<=0:
stackB=(char *)(&result);
printf("stack ptr calc() = %p\n",stackB);
if (array==NULL) {
printf("a[0] == 1\n");
} else {
array[0]=1;
}
return;
}
// here: c>0
if (array==NULL) {
// no array allocated, so allocate it now
int i;
int a[2500];
// calculate array entries recursively
calc(c-1,a);
// calculate current array entry a[c]
a[c]=a[c-1]+3;
// print full array
for(i=0;i<=c;i++) {
printf("a[%d] = %d\n",i,a[i]);
}
} else {
// array already allocated
calc(c-1,array);
// calculate current array entry a[c]
array[c]=array[c-1]+3;
}
}
int main()
{
int a;
stackA=(char *)(&a);
printf("stack ptr main() = %p\n",stackA);
calc(9,NULL);
printf("used stack = %d\n",(int)(stackA-stackB));
}
I am aware that this is an ugly program :-).
The function calc calculates n*3 + 1 for all 0<=n<=c in a recursive fashion.
If you look at the code for calc you notice that the array a[2500] is only declared when the input parameter array to the function is NULL.
Now this only happens in the call to calc which is done in main.
The stackA and stackB pointers are used to calculate a rough estimate how much stack is used by this program.
Now: int a[2500] should consume around 10000 bytes (4 bytes per integer, 2500 entries). So you could expect that the whole program consumes around 10000 bytes of stack + something additional (for overhead when calc is called recursively).
But: It turns out this program consumes around 100000 bytes of stack (10 times as much as expected). The reason is, that for each call of calc the array a[2500] is allocated, even if it is only used in the first call. There are 10 calls to calc (0<=c<=9) and so you consume 100000 bytes of stack.
It does not matter if you compile the program with or without optimization
GCC-4.8.4 and clang for x64, MS Visual C++ 2010, Windriver for DIAB (for PowerPC) all exhibit this behavior
Even weirder: C99 introduces Variable Length Arrays. If I replace int a[2500]; in the above code with int a[2500+c]; then the program uses less stack space (around 90000 bytes less).
Note: If I only change the call to calc in main to calc(1000,NULL); the program will crash (stack overflow == segmentation fault). If I additionally change to int a[2500+c]; the program works and uses less than 100KB stack. I still would like to see an answer, which explains why a Variable Length Array does not lead to a stack overflow whereas a fixed length array does lead to a stack overflow, in particular since this fixed length array is out of scope (except for the first invocation of calc).
So what's the reason for this behavior in C ?
I do not believe that GCC/clang both simply are not able to do better; I strongly believe there has to be a technical reason for this. Any ideas ?
Answer by Google
After more googling: I strongly believe this has something to do with "setjmp/longjmp" behavior. Google for "Variable Length Array longjmp" and see for yourself. It seems longjmp is hard to implement if you do not allocate all arrays at function entry.
The language rules for automatic storage only guarantee that the last allocated is the first deallocated.
A compiler can implement this logical stack any way it sees fit.
If it can prove that a function isn't recursive it can even allocate the storage at program start-up.
I believe that the above applies to C as well as C++, but I'm no C expert.
Please, when you ask about the details of a programming language, limit the question to one language at a time.
There is no technical reason for this other than choices that compiler makers made. It's less generated code and faster executing code to always reserve all the stack space we'll need at the beginning of the function. So all the compilers made the same reasonable performance tradeoff.
Try using a variable length array and you'll see that the compiler is fully capable of generating code that "allocates" stack just for a block. Something like this:
void
foo(int sz, int x)
{
extern void bar(char *);
if (x) {
char a[sz];
bar(a);
} else {
char a[10];
bar(a);
}
}
My compiler generates code that always reserves stack space for the x is false part, but the space for the true part is only reserved if x is true.
How this is done isn't regulated by any standard. The C and C++ standards don't mention the stack at all, in theory those languages could be used even on computers that don't have a stack.
On computers that do have a stack, how this is done is specified by the ABI of the given system. Often, stack space is reserved at the point when the program enters the function. But compilers are free to optimize the code so that the stack space is only reserved when a certain variable is used.
At any rate, the point where you declare the variable has no relation to when it gets allocated. In your example, int a[16] is either allocated when the function is entered, or it is allocated just before the first place where it is used. It doesn't matter if a is declared inside the if statement or at the top of the function.
In C++ however, there is the concept of constructors. If your variable is an object with a constructor, then that constructor will get executed at the point where the variable is declared. Meaning that the variable must be allocated before that happens.
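To see that construction and destruction follow the braces regardless of when the compiler reserves the stack space, here is a small sketch. Tracer and traceBlock are hypothetical names; the log only records the order of events and says nothing about where the storage lives.

```cpp
#include <string>
#include <vector>

// Hypothetical helper that records when its constructor and destructor run.
struct Tracer {
    std::vector<std::string>& log;
    explicit Tracer(std::vector<std::string>& l) : log(l) { log.push_back("ctor"); }
    ~Tracer() { log.push_back("dtor"); }
};

// Returns the sequence of events for one call, with or without entering the block.
std::vector<std::string> traceBlock(bool enter) {
    std::vector<std::string> log;
    log.push_back("function entry");
    if (enter) {
        Tracer t(log);                 // constructor runs here, at the declaration
        log.push_back("inside block");
    }                                  // destructor runs here, at the closing brace
    log.push_back("function exit");
    return log;
}
```

Calling traceBlock(true) records the constructor after "function entry" and the destructor before "function exit"; traceBlock(false) records neither, even though a compiler may well have reserved room for t on entry in both cases.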
Alf has explained the limitations and freedoms that the standard specifies (or rather, what it doesn't specify).
I'll suggest an answer to the question
why stack is not allocated/deallocated at the start/end of a code block using curly braces.
The compilers that you tested chose to do so (I'm actually just guessing, I didn't write any of them), because of better run-time performance and simplicity (and of course, because it is allowed).
Allocating 96 bytes (arbitrary example) once takes about half as much time as allocating 48 bytes twice, and a third as much time as allocating 32 bytes thrice.
Consider a loop as an extreme case:
for(int i = 0; i < 1000000; i++) {
int j;
}
If j is allocated at the start of the function, there is one allocation. If j is allocated within the loop body, then there will be a million allocations. Fewer allocations is better (faster).
Note: The reason why I assumed the stack would be allocated/deallocated at the start/end of the code block is, because in C++ the constructor/destructor of objects allocated on the stack will be called at the start/end of the code block.
Now you know that you were wrong to have assumed so. As written in a fine answer in the linked question, the allocation/deallocation need not coincide with the construction/destruction.
Imagine I would use a[100000]
That is approaching a very significant fraction of total stack space. You should allocate large blocks of memory like that dynamically.
case 1:
int main()
{
int T=5;
while(T--){
int a;
cout<<&a<<"\n";
}
}
it prints the same address 5 times.
i suppose it should print 5 different addresses.
case 2:
int main()
{
int T=5;
while(T--){
int* a=new int;
cout<<a<<"\n";
}
}
prints 5 different addresses
My question is:
Why isn't new memory allocated every time a variable declaration is encountered in the first case?
and the difference between 1st case and 2nd case.
In the first case, a is located on the stack. Basically, a gets "constructed" (a better wording might be "assigned space") there in each iteration and released afterwards. So after each iteration, the space previously allocated to a is free again, and the new a gets that space in the next iteration. This is why the address is the same.
In the second case, you allocate memory on the heap and (additionally) do not free it again. So the memory can't be reassigned in the next iteration.
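The second case can be made checkable: objects that are all alive at the same time must have different addresses. The sketch below keeps the allocations alive with std::unique_ptr (so, unlike case 2, nothing leaks) and counts the distinct addresses. distinctLiveAddresses is a made-up helper and needs C++14 for std::make_unique.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <vector>

// Allocates `count` ints that are all alive at once and counts how many
// different addresses were handed out.
std::size_t distinctLiveAddresses(std::size_t count) {
    std::vector<std::unique_ptr<int>> alive; // keeps every allocation alive
    std::set<const int*> addresses;
    for (std::size_t k = 0; k < count; ++k) {
        alive.push_back(std::make_unique<int>(0));
        addresses.insert(alive.back().get());
    }
    return addresses.size();                 // everything is freed when alive is destroyed
}
```

The count always equals the number of allocations. The stack case gives no such guarantee in either direction, because each iteration's a has already ended its lifetime before the next one begins.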
In theory, the absolute position on the stack is allocated every time the variable comes into scope and deallocated every time it goes out of scope. The LIFO nature of the stack in which it is allocated then makes sure the same location is allocated each time.
But in practice, the compiler allocates relative positions on the stack at compile time whenever doing so is practical (which in this case is trivially true). With pre allocated relative positions, the simple act of entering the function effectively allocates all instances of all local variables. A local object in a loop like that would be constructed and/or initialized for each instance, but allocation was done once in advance for all instances. So the addresses are the same for an even more fundamental reason than the LIFO nature of a stack. They are the same because the allocation was only done once.
If your C++ compiler supports a common C99 feature, you could construct tests that might distinguish the above two cases. Something roughly like:
for (int i=0; i<2; ++i) {
int unpredictable[ f(i) ];
for (int j=0; j<2; ++j) {
int T=5;
// does the location of T vary as i changes ??
int U[ f(j) ]; // I'm pretty sure the location of U varies
}}
We want the values of f(0) and f(1) to be easy at run time, but hard for the optimizer to see at compile time. That is most robust if f is declared in this module but defined in another.
By preventing the compiler from doing all the allocation at compile time, maybe we prevent it from doing some easy allocation at compile time, or maybe it still sorts out the ones that can be allocated at compile time and run time allocation is used only as needed.
It depends on the compiler. Since the variable is declared in the innermost scope, the compiler thinks it is okay to reuse the same location for the variable. Of course it can be located in different addresses.
My code is as below: (Tried to make my original code simpler here)
class ExClass {
public:
int check;
ExClass() { }
};
//in main
int main()
{
ExClass *classPtr = new ExClass();
if(classPtr->check == -1)
{
cout<<"check is negative";
}
}
Here the variable 'check' is not initialized in constructor, so it should take some garbage value(as I know).
But my problem here is, it is always printing "check is negative".(so 'check' is -1)
How does this happen every time? Why is the variable 'check' always -1?
Thanks for help !
Your check is what's called an uninitialized variable:
A common assumption made by novice programmers is that all variables are set to a known value, such as zero, when they are declared. While this is true for many languages, it is not true for all of them, and so the potential for error is there. Languages such as C use stack space for variables, and the collection of variables allocated for a subroutine is known as a stack frame. While the computer will set aside the appropriate amount of space for the stack frame, it usually does so simply by adjusting the value of the stack pointer, and does not set the memory itself to any new state (typically out of efficiency concerns). Therefore, whatever contents of that memory at the time will appear as initial values of the variables which occupy those addresses.
This means the value isn't quite random. But more of a 'whatever the value of the thing before was' thing.
Also note that some compilers (such as visual c++) initialise uninitialised memory with magic numbers (In Visual Studio C++, what are the memory allocation representations?) while in debugging mode.
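One way to avoid depending on leftover memory contents is to give the member a known value. The sketch below uses a C++11 default member initializer; ExClassInit and readCheckFromHeap are hypothetical names, not the asker's code, and std::make_unique needs C++14.

```cpp
#include <memory>

// Hypothetical variant of ExClass in which check always has a known value.
class ExClassInit {
public:
    int check = 0;   // default member initializer (C++11)
    ExClassInit() {} // check is still initialized to 0 by the line above
};

// Creates an object on the heap, as in the question, and reads its member.
int readCheckFromHeap() {
    auto p = std::make_unique<ExClassInit>();
    return p->check; // well defined: always 0
}
```

Initializing check in the constructor's member initializer list, as in ExClassInit() : check(0) {}, would have the same effect.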
static const int MAX_SIZE = 256; //I assume this is static Data
bool initialiseArray(int* arrayParam, int sizeParam) //where does this lie?
{
if(sizeParam > MAX_SIZE)
{
return false;
}
for(int i=0; i<sizeParam; i++)
{
arrayParam[i] = 9;
}
return true;
}
void main()
{
int* myArray = new int[30]; //I assume this is allocated on heap memory
bool res = initialiseArray(myArray, 30); //Where does this lie?
delete myArray;
}
We're currently learning the different categories of memory, i know that theres
-Code Memory
-Static Data
-Run-Time Stack
-Free Store(Heap)
I have commented where I'm unsure; just wondering if anyone could help me out. My definition of the Run-Time Stack says that it is used for functions, but my definition of Code Memory says that it contains all instructions for the methods/functions, so I'm just a bit confused.
Can anyone lend a hand?
static const int MAX_SIZE = 256; //I assume this is static Data
Yes indeed. In fact, because it's const, this value might not be kept in your final executable at all, because the compiler can just substitute "256" anywhere it sees MAX_SIZE.
bool initialiseArray(int* arrayParam, int sizeParam) //where does this lie?
The code for the initialiseArray() function will be in the code (text) section of your executable. You can get a pointer to the memory address, and call the function via that address, but other than that there's not much else you can do with it.
The arrayParam and sizeParam arguments will be passed to the function by value, on the stack. Likewise, the bool return value will be placed into the calling function's stack area.
int* myArray = new int[30]; //I assume this is allocated on heap memory
Correct.
bool res = initialiseArray(myArray, 30); //Where does this lie?
Effectively, the myArray pointer and the literal 30 are copied into the stack area of initialiseArray(), which then operates on them, and then the resulting bool is copied into the stack area of the calling function.
The actual details of argument passing are a lot more grisly and depend on calling conventions (of which there are several, particularly on Windows), but unless you're doing something really specialised then they're not really important :-)
The stack is used for automatic variables - that is, variables declared within functions, or as function parameters. These variables are destroyed automatically when the program leaves the block of code they were declared in.
You're correct that MAX_SIZE has a static lifetime - it is destroyed automatically at the end of the program. You're also correct that the array allocated with new[] is on the heap (having a dynamic lifetime) - it won't be destroyed automatically, so it needs to be deleted. By the way, you need delete [] myArray; to match the use of new [].
The pointer to it (myArray) is an automatic variable, on the stack, as are res and the function arguments.
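Putting the categories together, here is a compilable sketch of the question's program with the storage of each piece noted in comments. It departs from the original in a few labelled ways: it checks sizeParam (the question's size is not declared), it rejects negative sizes, and it uses std::unique_ptr<int[]>, which calls delete[] for you. fillThirty is a made-up wrapper standing in for main.

```cpp
#include <memory>

static const int MAX_SIZE = 256; // static storage duration

// The machine code of this function lives in the code section;
// arrayParam, sizeParam and i are automatic variables.
bool initialiseArray(int* arrayParam, int sizeParam) {
    if (sizeParam < 0 || sizeParam > MAX_SIZE) {
        return false;
    }
    for (int i = 0; i < sizeParam; i++) {
        arrayParam[i] = 9;
    }
    return true;
}

// Hypothetical wrapper around what the question does in main.
bool fillThirty() {
    // The 30 ints are on the free store; the myArray object itself is automatic.
    std::unique_ptr<int[]> myArray(new int[30]);
    bool res = initialiseArray(myArray.get(), 30); // res is automatic too
    return res && myArray[0] == 9 && myArray[29] == 9;
} // unique_ptr<int[]> runs delete[] here
```

The same layout applies if you keep the raw pointer; the only difference is that you then have to write delete [] myArray; yourself.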
There is just one type of memory... it is a memory :D
What is different is where it is and how you access it.
If you go deep into the exe loader in Windows (or in any kind of OS, actually), what it really does is store the information about your sections (the parts of your exe), and at run time it lays them out properly in memory and applies access rights. So generally the code section, where your "program" is, lives in the same memory (your RAM) as your data section. The difference is that the access rights are different: the code section usually has only read + execute, while the data has read + write (and no execute).
The stack is still memory. It is special in the sense that it is again controlled by the OS; the stack size is simply how big your stack is in bytes. Its purpose is to hold immediate values between function calls (as per stdcall) and local variables (how exactly depends on the compiler). Because it is memory you CAN use it yourself, for example to allocate a 10000-byte string on the stack: in assembly you have direct access through the stack pointer ESP (with EBP as the frame pointer), and in C/C++ you can use alloca.
The new and the delete operators are built ins for the C++ language but as far as I know they use the same system allocators as you do, in fact you can override them and use malloc/free and it should work which means that again this is the same memory.
The difference between using new/delete and an os specific function is that you let the language handle the allocation but in the end you will get a pointer just like you would with any other function.
On top of this there are special ways, but those change how the memory is handled. In Windows this is virtual memory, for example: VirtualAlloc and VirtualFree let you specify what you will do with the memory you want to use, which allows the OS to optimize better. For instance, you can tell it "I want 2 GB of memory, BUT it doesn't have to be in RAM", so it may keep it on disk, yet you STILL access it through memory pointers.
And about your questions :
static const int MAX_SIZE = 256; //I assume this is static Data
It usually depends on the compiler but mostly they will treat this as const ( static is something else ) which means that it will be in the const section of the exe which in turn means that this memory block will be read-only.
int* myArray = new int[30]; //I assume this is allocated on heap memory
Yes this will be on the heap, but how it is allocated depends on the implementation and on whether you override the new operator. If you do, you can for example force it to be in virtual memory, so in fact it could be on the disk or in RAM, but this is a silly thing to do, so yes, it will be on the heap.
bool res = initialiseArray(myArray, 30); //Where does this lie?
Multiple things happen here. Because the compiler knows that the first parameter of initialiseArray must be a pointer to an int, it passes the value of myArray (which is already such a pointer), so both that pointer and the value 30 go on the stack, and then the function is called.
The function, which sits in memory (the code section), runs and gets the parameters ( int* arrayParam, int sizeParam ) from the stack. It knows that you want to write through arrayParam and that it is a pointer, so it writes into the location arrayParam points to. Where exactly is specified by arrayParam[i]: i offsets the memory pointer to the correct element. Again C++ does some magic by adjusting the pointer for you: since the adjustment in machine code is in bytes, it moves the memory pointer by 4 per element, since ( usually ) int == 4 bytes.
To get a better overview of where goes what and how it works, use a debugger or a disassembler ( like OllyDbg ) and see it for yourself, if you want know more about how the stack is used look up the stdcall calling convention.
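The pointer adjustment described above can be observed without a debugger. The sketch below measures, in bytes, how far element i of an int array is from its start; byteOffsetOfElement is a made-up helper, and the caller is assumed to pass an index inside the array.

```cpp
#include <cstddef>

// Distance in bytes between the start of an int array and its i-th element.
// Assumes i is a valid index into the array that base points to.
std::size_t byteOffsetOfElement(const int* base, std::size_t i) {
    const char* first = reinterpret_cast<const char*>(base);
    const char* elem = reinterpret_cast<const char*>(base + i); // same address as &base[i]
    return static_cast<std::size_t>(elem - first);
}
```

The result is always i * sizeof(int), which is i * 4 on the common platforms where int is 4 bytes; the compiler does that scaling for you when you write arrayParam[i].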
#include<iostream.h>
#include<conio.h>
int *p;
void Set()
{
int i;
i=7;
p=&i;
}
int Use()
{
double d;
d=3.0;
d+=*p;
//if i replace the above 3 statements with --> double d=3.0+*p; then output is 10 otherwise the output is some random value(address of random memory location)
//why is this happening? Isn't it the same?
return d;
}
void main()
{
clrscr();
Set();
cout<<Use();
getch();
}
My question is as mentioned in the comments above. I want to know the exact reason for the difference in outputs. With the code above, the output is some random value (the address of a random memory location), and I understand that this is because i is a local variable of Set(). But then how is it visible in the second case, that is, after replacing the three statements with double d=3.0+*p;? Then the output is 10 (7+3), although 7 should not have been visible.
The result of using pointer p is undefined; it could also give you a segfault or just return 42. The technical reasons behind the results you're getting are probably:
i inside Set is placed on the stack. The value 7 is stored there and p points to that location in memory. When you return from Set the value remains in memory: the stack is not "destroyed", it's just the stack pointer which is reset. p still points to this location, which still contains the integer representation of 7.
Inside Use the same location on the stack is reused for d.
When the compiler is not optimizing, in the first case (i.e. the whole computation in one line), it first uses the value 7 (which is still there in memory with p pointing to it), does the computation, overwrites the value (since you assign it to d which is at the same location) and returns it.
In the second case (three separate statements), it first overwrites the value with the double value 3.0 and then takes the first 4 bytes, interpreted as an integer value, for evaluating *p in d+=*p.
This case shows why returning pointers/references to local variables is such a bad thing: when writing the function Set you could even write some kind of unit tests and they would not detect the error. It might get unnoticed just until the software goes into production and has to perform some really critical task and just fails then.
This applies to all kinds of "undefined behaviour", especially in "low level" languages like C/C++. The bad thing is that "undefined" may very well mean "perfectly working until it's too late"...
After exiting function Set the value of pointer p becomes invalid due to destroying local variable i. The program has undefined behavior.
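A well-defined version of the same computation passes the value itself instead of a pointer to a dead local. This is only a sketch with made-up names (setValue, useValue), written in standard C++ rather than the Turbo C++ dialect of the question.

```cpp
// Hypothetical rewrite: the value is returned by copy instead of being
// reached through a global pointer to a local that no longer exists.
int setValue() {
    int i = 7;
    return i; // copies 7 out before i is destroyed
}

int useValue(int value) {
    double d;
    d = 3.0;
    d += value; // no dependence on what the stack happens to contain
    return static_cast<int>(d);
}
```

Here useValue(setValue()) is 10 whether the computation is written as three statements or as one, because nothing reads memory whose lifetime has ended.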