When exactly is stack allocated [duplicate] - c++

This question already has answers here:
At what exact moment is a local variable allocated storage?
(5 answers)
Closed 6 years ago.
Even in C (not just C++) you can declare variables at the start of a code block, which is enclosed in curly braces.
Example:
#include <stdio.h>
void use_stack(int cnt)
{
if (cnt<=16) {
int a[16];
int i;
a[0]=3;
a[1]=5;
for (i=2;i<cnt;i++) {
a[i]=a[i-1]+a[i-2];
}
printf("a[%d] == %d\n",cnt-1,a[cnt-1]);
}
else {
printf("cnt is too big\n");
}
}
Now I know that variables like the array a[16] are allocated on the stack in this case.
I was wondering if the space for this array is allocated at the start of the function (first opening curly brace) or at the start of the block where it is declared (opening curly brace after if).
From examining the assembler code it seems the compiler allocates the space for a[16] directly at the entry of the function.
I actually expected that the stack would be allocated (stack pointer decreased) at the declaration of a[16] and that the stack would be de-allocated (stack pointer increased) at the end of the corresponding if code block.
But this does not happen it seems (stack for a[16] is allocated directly at function entry, even if a[16] is not used in the else branch).
Has anyone an explanation why this is the case ?
So is there any part of the C language standard, which explains this behavior, or does it have to do with things like "longjmp" or signal handling, which maybe require that the stack pointer is "constant" inside a function ?
Note: The reason why I assumed the stack would be allocated/deallocated at the start/end of the code block is, because in C++ the constructor/destructor of objects allocated on the stack will be called at the start/end of the code block. So if you examine the assembler code of a C++ program you will notice that the stack is still allocated at the function entry; just the constructor/destructor call will be done at the start/end of the code block.
I am explicitly interested why stack is not allocated/deallocated at the start/end of a code block using curly braces.
The question: At what exact moment is a local variable allocated storage? is only about a local variable allocated at the start of a function. I am surprised that stack allocation for variables allocated later inside a code block is also done at the function entry.
So far the answers have been:
Something to do with optimization
Might different for C, C++
Stack is not even mentioned in the C language specification
So: I am interested in the answer for C... (and I strongly believe that the answer will apply to C++ also, but I am not asking about C++ :-)).
Optimization: Here is an example which will directly demonstrate why I am so surprised and why I am quite sure that this is not an optimization:
#include <stdio.h>
static char *stackA;
static char *stackB;
static void calc(int c,int *array)
{
int result;
if (c<=0) {
// base case c<=0:
stackB=(char *)(&result);
printf("stack ptr calc() = %p\n",stackB);
if (array==NULL) {
printf("a[0] == 1\n");
} else {
array[0]=1;
}
return;
}
// here: c>0
if (array==NULL) {
// no array allocated, so allocate it now
int i;
int a[2500];
// calculate array entries recursively
calc(c-1,a);
// calculate current array entry a[c]
a[c]=a[c-1]+3;
// print full array
for(i=0;i<=c;i++) {
printf("a[%d] = %d\n",i,a[i]);
}
} else {
// array already allocated
calc(c-1,array);
// calculate current array entry a[c]
array[c]=array[c-1]+3;
}
}
int main()
{
int a;
stackA=(char *)(&a);
printf("stack ptr main() = %p\n",stackA);
calc(9,NULL);
printf("used stack = %d\n",(int)(stackA-stackB));
}
I am aware that this is an ugly program :-).
The function calc calculates n*3 + 1 for all 0<=n<=c in a recursive fashion.
If you look at the code for calc you notice that the array a[2500] is only declared when the input parameter array to the function is NULL.
Now this only happens in the call to calc which is done in main.
The stackA and stackB pointers are used to calculate a rough estimate how much stack is used by this program.
Now: int a[2500] should consume around 10000 bytes (4 bytes per integer, 2500 entries). So you could expect that the whole program consumes around 10000 bytes of stack + something additional (for overhead when calc is called recursively).
But: It turns out this program consumes around 100000 bytes of stack (10 times as much as expected). The reason is, that for each call of calc the array a[2500] is allocated, even if it is only used in the first call. There are 10 calls to calc (0<=c<=9) and so you consume 100000 bytes of stack.
It does not matter if you compile the program with or without optimization
GCC-4.8.4 and clang for x64, MS Visual C++ 2010, Windriver for DIAB (for PowerPC) all exhibit this behavior
Even weirder: C99 introduces Variable Length Arrays. If I replace int a[2500]; in the above code with int a[2500+c]; then the program uses less stack space (around 90000 bytes less).
Note: If I only change the call to calc in main to calc(1000,NULL); the program will crash (stack overflow == segmentation fault). If I additionally change to int a[2500+c]; the program works and uses less than 100KB stack. I still would like to see an answer, which explains why a Variable Length Array does not lead to a stack overflow whereas a fixed length array does lead to a stack overflow, in particular since this fixed length array is out of scope (except for the first invocation of calc).
So what's the reason for this behavior in C ?
I do not believe that GCC/clang both simply are not able to do better; I strongly believe there has to be a technical reason for this. Any ideas ?
Answer by Google
After more googling: I strongly believe this has something to do with "setjmp/longjmp" behavior. Google for "Variable Length Array longjmp" and see for yourself. It seems longjmp is hard to implement if you do not allocate all arrays at function entry.

The language rules for automatic storage only guarantees that the last allocated is the first deallocated.
A compiler can implement this logical stack any way it sees fit.
If it can prove that a function isn't recursive it can even allocated the storage at program start-up.
I believe that the above applies to C as well as C++, but I'm no C expert.
Please, when you ask about the details of a programming language, limit the question to one language at a time.

There is no technical reason for this other than choices that compiler makers made. It's less generated code and faster executing code to always reserve all the stack space we'll need at the beginning of the function. So all the compilers made the same reasonable performance tradeoff.
Try using a variable length array and you'll see that the compiler is fully capable of generating code that "allocates" stack just for a block. Something like this:
void
foo(int sz, int x)
{
extern void bar(char *);
if (x) {
char a[sz];
bar(a);
} else {
char a[10];
bar(a);
}
}
My compiler generates code that always reserves stack space for the x is false part, but the space for the true part is only reserved if x is true.

How this is done isn't regulated by any standard. The C and C++ standards don't mention the stack at all, in theory those languages could be used even on computers that don't have a stack.
On computers that do have a stack, how this is done is specified by the ABI of the given system. Often, stack space is reserved at the point when the program enters the function. But compilers are free to optimize the code so that the stack space is only reserved when a certain variable is used.
At any rate, the point where you declare the variable has no relation to when it gets allocated. In your example, int a[16] is either allocated when the function is entered, or it is allocated just before the first place where it is used. It doesn't matter if a is declared inside the if statement or at the top of the function.
In C++ however, there is the concept of constructors. If your variable is an object with a constructor, then that constructor will get executed at the point where the variable is declared. Meaning that the variable must be allocated before that happens.

Alf has explained the limitations and freedoms that the standard specifies (or rather, what it doesn't specify).
I'll suggest an answer to the question
why stack is not allocated/deallocated at the start/end of a code block using curly braces.
The compilers that you tested chose to do so (I'm actually just guessing, I didn't write any of them), because of better run-time performance and simplicity (and of course, because it is allowed).
Allocating 96 bytes (arbitrary example) once takes about half as much time as allocating 48 bytes twice. And third as much times as allocating 32 bytes thrice.
Consider a loop as an extreme case:
for(int i = 0; i < 1000000; i++) {
int j;
}
If j is allocated at the start of the function, there is one allocation. If j is allocated within the loop body, then there will be a million allocations. Less allocations is better (faster).
Note: The reason why I assumed the stack would be allocated/deallocated at the start/end of the code block is, because in C++ the constructor/destructor of objects allocated on the stack will be called at the start/end of the code block.
Now you know that you were wrong to have assumed so. As written in a fine answer in the linked question, the allocation/dealocation need not coincide with the construction/destruction.
Imagine I would use a[100000]
That is approaching a very significant fraction of total stack space. You should allocate large blocks of memory like that dynamically.

Related

Memory allocation in looping variable declaration

case 1:
int main()
{
int T=5;
while(T--){
int a;
cout<<&a<<"\n";
}
}
it prints the same address 5 times.
i suppose it should print 5 different addresses.
case 2:
int main()
{
int T=5;
while(T--){
int* a=new int;
cout<<a<<"\n";
}
}
prints 5 different addresses
My question is:
Why does'nt new memory is allocated every time a variable declaration is encountered in first case?
and the difference between 1st case and 2nd case.
In the first case, a is located on the stack. Basically, a gets "constructed" (a better wording might be "assigned space") there in each iteration and released afterwards. So after each iteration, the space previously allocated to a is free again, and the new a gets that space in the next iteration. This is why the address is the same.
In the second case, you allocate memory on the heap and (additionally) do not free it again. So the memory can't be reassigned in the next iteration.
In theory, the absolute position on the stack is allocated every time the variable comes into scope and deallocated every time it goes out of scope. The LIFO nature of the stack in which it is allocated then makes sure the same location is allocated each time.
But in practice, the compiler allocates relative positions on the stack at compile time whenever doing so is practical (which in this case is trivially true). With pre allocated relative positions, the simple act of entering the function effectively allocates all instances of all local variables. A local object in a loop like that would be constructed and/or initialized for each instance, but allocation was done once in advance for all instances. So the addresses are the same for an even more fundamental reason than the LIFO nature of a stack. They are the same because the allocation was only done once.
If your C++ compiler supports a common C99 feature, you could construct tests that might distinguish the above two cases. Something roughly like:
for (int i=0; i<2; ++i) {
int unpredictable[ f(i) ];
for (int j=0; j<2; ++j) {
int T=5;
// does the location of T vary as i changes ??
int U[ f(j) ]; // I'm pretty sure the location of U varies
}}
We want the values of f(0) and f(1) to be easy at run time, but hard for the optimizer to see at compile time. That is most robust if f is declared in this module but defined in another.
By preventing the compiler from doing all the allocation at compile time, maybe we prevent it from doing some easy allocation at compile time, or maybe it still sorts out the ones that can be allocated at compile time and run time allocation is used only as needed.
It depends on the compiler. Since the variable is declared in the innermost scope, the compiler thinks it is okay to reuse the same location for the variable. Of course it can be located in different addresses.

default value for a class member in c++

My code is as below: (Tried to made my original code simpler here)
class ExClass {
public:
int check;
ExClass() { }
}
//in main
int main()
{
ExClass *classPtr = new ExClass();
if(classPtr->check == -1)
{
cout<<"check is negative";
}
}
Here the variable 'check' is not initialized in constructor, so it should take some garbage value(as I know).
But my problem here is, it is always printing "check is negative".(so 'check' is -1)
How this happens everytime ? How the variable 'check' is -1 always ?
Thanks for help !
your check is what's called an uninitialized variable:
A common assumption made by novice programmers is that all variables are set to a known value, such as zero, when they are declared. While this is true for many languages, it is not true for all of them, and so the potential for error is there. Languages such as C use stack space for variables, and the collection of variables allocated for a subroutine is known as a stack frame. While the computer will set aside the appropriate amount of space for the stack frame, it usually does so simply by adjusting the value of the stack pointer, and does not set the memory itself to any new state (typically out of efficiency concerns). Therefore, whatever contents of that memory at the time will appear as initial values of the variables which occupy those addresses.
This means the value isn't quite random. But more of a 'whatever the value of the thing before was' thing.
Also note that some compilers (such as visual c++) initialise uninitialised memory with magic numbers (In Visual Studio C++, what are the memory allocation representations?) while in debugging mode.

3d array declaration causes segmentation fault [duplicate]

This question already has answers here:
Segmentation fault on large array sizes
(7 answers)
Closed 4 years ago.
I'm converting a program from fortran to C++.
My code seems to run fine until I add this array declaration:
float TC[100][100][100];
And then when I run it I get a segmentation fault error. This array should only take up 8Mb of memory and my machine has 3 Gb. Is there a problem with this declaration? My c++ is pretty rusty.
That array is about 4 megabyte large. If this definition is inside a function (as local variable), then the compiler tries to store it on the stack, which on most systems cannot grow that large.
The Fortran compiler probably allocated it statically (Fortran routines are not allowed to be called recursively unless explicitly marked as recursive, so static allocation for local variables works there for non-recursive functions), and therefore the error doesn't occur there.
A simple fix would be to explicitly declare the variable static, assuming the Fortran function was not declared recursive. However this may bite you later, if you ever try to call that function recursively from a revised version. So a better solution would probably be to allocate it dynamically. However that costs extra time and therefore depending on the nature of the code, may hurt your performance too much (Fortran code quite often is numerical code where performance matters).
If you choose to make the array static, you can build in a protection against accidental recursive calls:
void yourfunction()
{
static bool active;
static float TC[100][100][100];
assert(!active);
active = true;
// your code
active = false;
}
I'm guessing TC is being allocated as an auto local variable. This means it's being stored on the stack. You don't get 4mb of stack memory, so it's causing a stack overflow.
To solve it, use dynamic allocation with a structured container or new.
This looks like a stack-based declaration. Try allocating from the heap (i.e. using the new operator).
If you are declaring it inside of a function, as a local variable, it may be that you stack is not big enough to fit the array. You may try to allocate in the heap, with new or malloc(), or, if your design allows, make it a global variable.
In C++ the stack has a limited amount of space. MSVC defaults this size to 1MB. If the stack uses more than 1MB it will segfault or stackoverflow or something. You will have to move that structure to dynamic memory. To move it to dynamic memory, you want something like this:
typedef float (bigarray)[100][100][100];
bigarray& TC() {
static bigarray* ptr = NULL;
if (ptr == NULL) {
ptr = new float[100][100][100];
for(int j=0; j<100; j++) {
ptr[j] = new float[100][100];
for(int i=0; i<100; i++)
ptr[j][i] = new float[100];
}
}
return *ptr;
}
That will allocate the structure in dynamic memory the first time it is accessed, as a jagged array. You can get more performance out of a rectangle array, but you have to change types:
typedef std::vector<std::array<std::array<float, 100>, 100> bigarray;
bigarray TC(100);
According to http://cs.nyu.edu/exact/core/doc/stackOverflow.txt, gcc/linux defaults the stack size to 8MB, which isn't big enough for your structure and int main() If you really want to, MSVC has flags to increase the stack sizes up to 32MB. Linux has a ulimit command to increase the stack size up to 32MB.

What is the difference in execution time for allocating to the heap vs the stack?

I am a little stuck on this question in my homework:
Write three functions in C or C++: one that declares a large array statically, one that declares the same large array on the stack, and one that creates the same large array from the heap. Call each of the subprograms a large number of times {at least 100,000) and output the time required by each. Explain the results.
int main(void)
{
int staticIntArray[ARRAY_SIZE];//array on the stack
int *ptrArray = new int[ARRAY_SIZE]; // pointer on the stack but array on the heap.
double timeItTakes;
clock_t tStart = clock();
fillWithRandomNumbers(staticIntArray, ARRAY_SIZE);
double time = static_cast<double>(clock() - static_cast<double>(tStart)/static_cast<double>(CLOCKS_PER_SEC));
printf ("Array on stack time is %.10f\n", time);
clock_t tStart2 = clock();
fillWithRandomNumbers(ptrArray, ARRAY_SIZE);
double time2 = static_cast<double>(clock() - static_cast<double>(tStart2)/static_cast<double>(CLOCKS_PER_SEC));
printf ("Array on heap time is %.10f\n", time2);
//cout << "Array on the heap time is " << (timeIntStack - time(NULL));
}
void fillWithRandomNumbers(int intArray[], int size)
{
for(int i = 0; i<size; i++)
intArray[i] = rand();
}
The output is:
Array on stack time is 1.9990000000
Array on heap time is 2.9980000000
Press any key to continue . . .
What I understand is that the stack is much smaller allocation of memory that is for local variables and parameters while the heap is a large pool of dynamically allocated memory. Here are my questions...
Does using the random class affect the time it takes for the function to execute? Is allocating large arrays on the stack slower because there is less memory available?
I am not trying ask you to do my homework but I just need a little help clarifying the concepts... Any help would be much appreciated...
First, allow me a slight rant.
The assignment -- presumably in a C++ programming class -- is a bad one. It is diverting your focus to the performance implications of dynamic allocation versus static or automatic allocation, but that is not the main reason you should choose one form of allocation over another. Rather, lifetime and visibility requirements in addition to ownership semantics should be considered long before performance when deciding how to allocate a chunk of memory. Even setting this argument aside, the test is still invalid because the hardware you run the code on, the size of the individual elements in the array and the array itself, the operating system, how the kernel blocks when allocating and the optimization the compiler is permitted to employ are all going to effect the execution speed in any real code you write. But this assignment seems to be suggesting that you should conclude, "see? Dynamic allocation is slower. We should never use it." That reasoning is incorrect and teaches you to employ premature optimization.
OK, end of my rant.
On to your assignment. You are doing two main things wrong.
The assignment never asks you to populate the array.
The assignment asks you to allocate the arrays 10k times, but you're doing it only once.
(Bonus!) The assignment doesn't ask you to deallocate the dynamically allocated array -- but it should.
You are only allocating your arrays once, at the start of the program, and only writing their contents in the fillWithRandomNumbers loop. The program didn't measure allocation at all; for this to happen, the new operator should have been within a loop.
Try to follow the assignment here: Write three functions, with each function allocating the array in a different way.
First you are not doing what the assignment asks for.
Second you should expect the stack allocation to be much faster, since the only thing that is does internally is to move the stack pointer.

When does the space occupied by a variable get deallocated in c++?

Doesn't the space occupied by a variable get deallocated as soon as the control is returned from the function??
I thought it got deallocated.
Here I have written a function which is working fine even after returning a local reference of an array from function CoinDenom,Using it to print the result of minimum number of coins required to denominate a sum.
How is it able to print the right answer if the space got deallocated??
int* CoinDenom(int CoinVal[],int NumCoins,int Sum) {
int min[Sum+1];
int i,j;
min[0]=0;
for(i=1;i<=Sum;i++) {
min[i]=INT_MAX;
}
for(i=1;i<=Sum;i++) {
for(j=0;j< NumCoins;j++) {
if(CoinVal[j]<=i && min[i-CoinVal[j]]+1<min[i]) {
min[i]=min[i-CoinVal[j]]+1;
}
}
}
return min; //returning address of a local array
}
int main() {
int Coins[50],Num,Sum,*min;
cout<<"Enter Sum:";
cin>>Sum;
cout<<"Enter Number of coins :";
cin>>Num;
cout<<"Enter Values";
for(int i=0;i<Num;i++) {
cin>>Coins[i];
}
min=CoinDenom(Coins,Num,Sum);
cout<<"Min Coins required are:"<< min[Sum];
return 0;
}
The contents of the memory taken by local variables is undefined after the function returns, but in practice it'll stay unchanged until something actively changes it.
If you change your code to do some significant work between populating that memory and then using it, you'll see it fail.
What you have posted is not C++ code - the following is illegal in C++:
int min[Sum+1];
But in general, your program exhibits undefined behaviour. That means anything could happen - it could even appear to work.
The space is "deallocated" when the function returns - but that doesn't mean the data isn't still there in memory. The data will still be on the stack until some other function overwrites it. That is why these kinds of bugs are so tricky - sometimes it'll work just fine (until all the sudden it doesn't)
You need to allocate memory on the heap for return variable.
int* CoinDenom(int CoinVal[],int NumCoins,int Sum) {
int *min= new int[Sum+1];
int i,j;
min[0]=0;
for(i=1;i<=Sum;i++) {
min[i]=INT_MAX;
}
for(i=1;i<=Sum;i++) {
for(j=0;j< NumCoins;j++) {
if(CoinVal[j]<=i && min[i-CoinVal[j]]+1<min[i]) {
min[i]=min[i-CoinVal[j]]+1;
}
}
}
return min; //returning address of a local array
}
min=CoinDenom(Coins,Num,Sum);
cout<<"Min Coins required are:"<< min[Sum];
delete[] min;
return 0;
In your case you able to see the correct values only, because no one tried to change it. In general this is unpredictable situation.
That array is on the stack, which in most implementations, is a pre-allocated contiguous block of memory. You have a stack pointer that points to the top of the stack, and growing the stack means just moving the pointer along it.
When the function returned, the stack pointer was set back, but the memory is still there and if you have a pointer to it, you could access it, but it's not legal to do so -- nothing will stop you, though. The memory values in the array's old space will change the next time the stack depth runs over the area where the array is.
The variable you use for the array is allocated on stack and stack is fully available to the program - the space is not blocked or otherwise hidden.
It is deallocated in the sense that it can be reused later for other function calls and in the sense that destructors get called for variables allocated there. Destructors for integers are trivial and don't do anything. That's why you can access it and it can happen that the data has not been overwritten yet and you can read it.
The answer is that there's a difference between what the language standard allows, and what turns out to work (in this case) because of how the specific implementation works.
The standard says that the memory is no longer used, and so must not be referenced.
In practice, local variables on the stack. The stack memory is not freed until the application terminates, which means you'll never get an access violation/segmentation fault for writing to stack memory. But you're still violating the rules of C++, and it won't always work. The compiler is free to overwrite it at any time.
In your case, the array data has simply not been overwritten by anything else yet, so your code appears to work. Call another function, and the data gets overwritten.
How is it able to print the right answer if the space got deallocated??
When memory is deallocated, it still exists, but it may be reused for something else.
In your example, the array has been deallocated but its memory hasn't yet been reused, so its contents haven't yet been overwritten with other values, which is why you're still able from it the values that you wrote.
The fact that it won't have been reused yet is not guaranteed; and the fact that you can even read from it at all after it's deallocated is also not guaranteed: so don't do it.
This might or might not work, behaviour is undefined and it's definitely wrong to do it like this. Most compilers also give a compiler warning, for example GCC:
test.cpp:8: warning: address of local variable `min' returned
Memory is like clay that never hardens. Allocating memory is like taking some clay out of the clay pot. Maybe you make a cat and a cheeseburger. When you have finished, you deallocate the clay by putting your figures back into the pot, but just being put into the pot does not make them lose their shape: if you or someone else looks into the pot, they will continue to observe your cat and cheeseburger sitting on the top of the clay stack until someone else comes along and makes them into something else.
The NAND gates in the memory chips are the clay, the strata that holds the NAND gates is the clay pot, and the particular voltages that represent the value of your variables are your sculptures. Those voltages do not change just because your program has taken them off the list of things it cares about.
You need to understand the stack. Add this function,
void f()
{
int a[5000];
memset( a, 0, sizeof(a) );
}
and then call it immediately after calling CoinDenom() but before writing to cout. You'll find that it no longer works.
Your local variables are stored on the stack. CoinDenom() returns a memory address that points into the stack. Very simplified and leaving out lots of details, say the stack pointer is pointing to 0x1000 just before you call CoinDenom. An int* (Coins) is pushed on the stack. This becomes CoinVal[]. Then an int, Num which becomes NumCoins. Then another int, Sum which becomes Sum. Thats 3 ints at 4 bytes/int. Then space for the local variables:
int min[Sum+1];
int i,j;
which would be (Sum + 3) * 4 bytes/int. Say Sum = 2, that gives us another 20 bytes total, so the stack pointer gets incremented by 32 bytes to 0x1020. (All of main's locals are below 0x1000 on the stack.) min is going to point to 0x100c. When CoinDenom() returns, the stack pointer is decremented "freeing" that memory but unless another function is called to have it's own locals allocated in that memory, nothing's going to happen to change what's stored in that memory.
See http://en.wikipedia.org/wiki/Calling_convention for more detail on how the stack is managed.