main function stack size - c++

The maximum stack size is limited and can be quickly exhausted if we use big stack variables or get careless with recursive functions.
But main's stack isn't really a stack, is it? main is called exactly once and never recursively. For all intents and purposes, main's stack frame behaves like static storage: it is allocated at the very beginning and lives until the very end. Does that mean I can allocate big arrays in main's stack frame?
int main()
{
    double a[5000000];
}

main is just a normal function. Stack size is system dependent.
Also remember your process has only one stack, shared by all function calls. Items are pushed onto and popped from that stack as functions are called from main.

It's implementation-defined (the language standard doesn't talk about stacks, AFAIK). But typically, main lives on the stack just like any other function.

It's 100% compiler and system dependent, like most of this kind of funny business. Heck, even the existence of the stack isn't mandated by the standard.
In practice, yes, it's on the stack, and no, you can't allocate things like that on the stack without running into trouble.

When you allocate an array in that manner, it is allocated on the stack. There is a platform-dependent maximum size the stack can grow to. And yes, you've exceeded it.

On second thought, I just remembered - main can be called recursively. Check out this obfuscated code:
http://en.wikipedia.org/wiki/Obfuscated_code
It calls main many times and works wonders :) It's a fun link anyway. So, it's definitely stack allocated, sorry about that!

The stack is something that is used by all functions - the way you've worded your question suggests that each function is given a stack which is not the case.
Stack usage grows with each function call - main() being the first. The allocation that you used in your example is just as bad as making a stack allocation in another function.

For most modern systems, there is no real reason the stack size needs to be limited. You can probably adjust an operating system parameter and that program will work fine. (As will any that allocates an equal amount of data on the stack, main or not.)
However, if you really want an object with a lifetime equal to the duration of the program, create a global variable instead of a local inside main. Most platforms do not artificially limit the size of global objects — they can usually be as large as the memory map allows.
By the way, main is not active for the duration of a C++ program. It may be preceded by construction of global objects and followed by destruction of same and atexit handlers.
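To make the global-variable alternative concrete, here is a minimal sketch (the array size mirrors the question's example; the function name is just for illustration). The array has static storage duration, so it lives in the program's data/BSS segment rather than on any thread's stack:

```cpp
#include <cstddef>

// Static storage: this ~40 MB array is not carved out of the stack,
// so its size is not capped by the stack limit.
static double big[5000000];

// Touch both ends of the array. The same array declared as a local
// inside main would very likely overflow a typical 1-8 MB stack.
double fill_and_sum_ends() {
    big[0] = 1.0;
    big[4999999] = 2.0;
    return big[0] + big[4999999];
}
```

The global's lifetime also matches what the question wanted: it is constructed before main runs and destroyed after main returns.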

Related

How can I avoid putting this variable on the stack?

I'm currently adapting some example Arduino code to fit my needs. The following snippet confuses me:
// Dont put this on the stack:
uint8_t buf[RH_RF95_MAX_MESSAGE_LEN];
What does it mean to put the buf variable on the stack? How can I avoid doing this? What bad things could happen if I did it?
The program stack has a limited size (even on desktop computers, it's typically capped in megabytes, and on an Arduino, it may be much smaller).
All function local variables for functions are stored there, in a LIFO manner; the variables of your main method are at the bottom of the stack, the variables of the functions called in main on top of that, and so on; space is (typically) reserved on entering a function, and not reclaimed until a function returns. If a function allocates a truly huge buffer (or multiple functions in a call chain allocate slightly smaller buffers) you can rapidly approach the stack limit, which will cause your program to crash.
It sounds like your array is being allocated outside of a function, putting it at global scope. The downside to this is there is only one shared buffer (so two functions can't use it simultaneously without coordinating access, while a stack buffer would be independently reserved for each function), but the upside is that it doesn't cost stack to use it; it's allocated from a separate section of program memory (a section that's usually unbounded, or at least has limits in the gigabyte, rather than megabyte range).
So to answer your questions:
What does it mean to put the buf variable on the stack?
It would be on the stack if it:
Is declared in function scope rather than global scope, and
Is not declared as static (or thread_local, though that's more complicated than you should care about right now); if it's declared static at function scope, it's basically global memory that can only be referenced directly in that specific function
How can I avoid doing this?
Don't declare huge non-static arrays at function scope.
What bad things could happen if I did it?
If the array is large enough, you could suffer a stack overflow from running out of available stack space, crashing your program.
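As a sketch of the two placements side by side (the constant here is a stand-in for RH_RF95_MAX_MESSAGE_LEN, whose real value comes from the RadioHead library header):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for RH_RF95_MAX_MESSAGE_LEN.
const std::size_t MAX_MESSAGE_LEN = 251;

// Automatic version: the buffer is carved out of this function's
// stack frame on every call and released when the call returns.
std::size_t use_stack_buffer() {
    std::uint8_t buf[MAX_MESSAGE_LEN];
    buf[0] = 42;
    return sizeof buf;
}

// Static version: one shared buffer with static storage duration.
// It costs no stack space, but every call sees the same bytes, so
// concurrent users must coordinate access.
std::size_t use_static_buffer() {
    static std::uint8_t buf[MAX_MESSAGE_LEN];
    buf[0] = 42;
    return sizeof buf;
}
```

On a desktop, 251 bytes either way is nothing; on an Arduino with 2 KB of RAM and a much smaller stack, the difference matters.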

Can stack memory be allocated within a function automatically?

I'm sorry if this has been asked before, but I didn't find anything...
For a "normal" x86 architecture:
When I call a large function in C++, is the memory then allocated immediately for all stack variables?
Or are there compilers which can (and do) modify the stack size even if the function is not finished.
For example if a new scope starts:
int largeFunction(){
    int a = 1;
    int b = 2;
    // .... long code ....
    { // new scope
        int c = 5;
        // .... code again ....
    }
    // .....
    return a + b; // an int function must return a value
}
Could the call stack "grow" also for the variable c at the beginning of the separate scope and "shrink" at its end?
Or will current compilers always produce code which affects the stack pointer at the entry and return value of the function?
Thanks for your answer in advance.
1) How long a function is has nothing to do with the allocation of memory, whether on the stack or the heap.
2) When stack space is "allocated" depends only on how the compiler generates the most efficient code. "Efficient" covers a wide range of requirements: all compilers have options to tune the optimizer for speed or size, and most can also optimize for lower stack consumption and other parameters.
3) Automatic variables can go on the stack, but that is not a must. A lot of variables should end up in registers of your CPU instead. This speeds up the code a lot and saves stack. But it depends very much on the CPU platform.
4) When a compiler creates a new stack frame is also a question of code optimization. Compilers can reorder operations if that saves resources or fits the architecture better, so the question of exactly when a stack frame comes into use cannot be answered in general. A new scope (an opening brace) can be the point where a new stack frame is allocated, but that is never a guarantee. Sometimes it is simply not efficient to recalculate all the stack-relative addresses used by the code reachable from the current scope.
5) Some compilers can also use heap memory for automatic variables. This is often seen on embedded cores, where access via special instructions can be faster than stack-relative addressing.
But normally it is not very important when the compiler does what. The one thing worth remembering is that you have to guarantee that your stack is large enough. System calls that create new threads often take a parameter to set the stack size, so you need to know how much stack your implementation requires. In all other cases: don't worry about it. That job is done perfectly well by your compiler's developers.
I don't know the answer (and I hope you only want to know because you're curious, as no valid program should be able to tell the difference), but you could test the behaviour of your compiler by calling a function like this before the new scope and again inside it:
#include <cstdint> // needed for std::intptr_t

std::intptr_t stackaddr()
{
    int i;
    return reinterpret_cast<std::intptr_t>(&i);
}
If you get the same result then it means the stack was already adjusted in advance of creating c.
There was a change in G++ 4.7 which allows the compiler to re-use the stack space of c after its scope ends, where previously any new variables after that point would have increased the stack usage: "G++ now properly re-uses stack space allocated for temporary objects when their lifetime ends, which can significantly lower stack consumption for some C++ functions." But I think that only affects how much stack is reserved on entry to the function, not when/where it's reserved.
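A minimal harness around that probe might look like the following (entirely implementation-specific poking, not portable C++, and the result may change with optimization flags):

```cpp
#include <cstdint>

// Probe from the answer above: returns the address of one of its own
// locals as an integer so probes from different points can be compared.
std::intptr_t stackaddr() {
    int i;
    return reinterpret_cast<std::intptr_t>(&i);
}

// Compare the probe before and inside a nested scope. If both calls
// report the same address, the frame was sized once at function entry;
// any other result is equally legal under the standard.
bool frame_sized_at_entry() {
    std::intptr_t before = stackaddr();
    std::intptr_t inside = 0;
    {
        int c = 5;
        (void)c;
        inside = stackaddr();
    }
    return inside == before;
}
```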
This is entirely dependent on the runtime conventions of the system you are using; however, the CPU architecture usually plays a big part in the decision, because the architecture defines what stack management can safely be used. On the old PowerPCs under MacOS X, for instance, stack frames were always of fixed size; a single atomic store of the stack pointer at the low end of a new stack frame would allocate it, and dereferencing the stack pointer was equivalent to popping an entire stack frame.
Current systems like Linux and (correct me if I'm wrong) Windows on x86 have a more dynamic approach with atomic push and pop instructions (there is no atomic pop on PowerPC), where the parameters to a function call are pushed onto the stack before each call, effectively resizing the allocated stack frame each time.
So, yes, on many current systems the compiler can resize the stack frame, but on other systems such an operation is at least hard to accomplish (never impossible, though).

Stack-allocated objects still taking memory after going out of scope?

People always talk about how objects created without the new keyword are destroyed when they go out of scope, but when I think about this, it seems like that's wrong. Perhaps the destructor is called when the variable goes out of scope, but how do we know that it is no longer taking up space in the stack? For example, consider the following:
void DoSomething()
{
    {
        My_Object obj;
        obj.DoSomethingElse();
    }
    AnotherFuncCall();
}
Is it guaranteed that obj will not be saved on the stack when AnotherFuncCall is executed? Because people are always saying it, there must be some truth to what they say, so I assume that the destructor must be called when obj goes out of scope, before AnotherFuncCall. Is that a fair assumption?
You are confusing two different concepts.
Yes, your object's destructor will be called when it leaves its enclosing scope. This is guaranteed by the standard.
No, there is no guarantee that an implementation of the language uses a stack to implement automatic storage (i.e., what you refer to as "stack allocated objects".)
Since most compilers use a fixed size stack I'm not even sure what your question is. It is typically implemented as a fixed size memory region where a pointer move is all that is required to "clean up" the stack as that memory will be used again soon enough.
So, since the memory region used to implement a stack is fixed in size there is no need to set the memory your object took to 0 or something else. It can live there until it is needed again, no harm done.
I believe it depends where in the stack the object was created. If it was on the bottom (assuming stack grows down) then I think the second function may overwrite the destroyed objects space. If the object was inside the stack, then probably that space is wasted, since all further objects would have to be shifted.
Your stack is not dynamically allocated and deallocated, it's just there. Your objects constructors and destructors will get called but you don't get the memory back.
Because people are always saying it, there must be some truth to what they say, so I assume that the destructor must be called when obj goes out of scope, before AnotherFuncCall. Is that a fair assumption?
This is correct. Note that this final question says nothing about a "stack". Whether an implementation uses a stack, or something else, is up to the implementation.
Objects created "on the stack" in local scope have what is called automatic storage duration. The Standard says:
C++03 3.7.2 Automatic storage duration
1/ Local objects explicitly declared auto or register or not explicitly declared static or extern have automatic storage duration. The storage for these objects lasts until the block in which they are created exits.
2/ [Note: these objects are initialized and destroyed as described in 6.7.]
On the destruction of these objects:
6.7 Declaration statement
2/ Variables with automatic storage duration (3.7.2) are initialized each time their declaration-statement is executed. Variables with automatic storage duration declared in the block are destroyed on exit from the block (6.6).
Hence, according to the Standard, when object with local scope fall out of scope, the destructor is called and the storage is released.
Whether or not that storage is on a stack, the Standard doesn't say. It just says the storage is released, wherever it might be.
Some architectures don't have stacks in the same sense a PC has. C++ is meant to work on any kind of programmable device. That's why it never mentions anything about stacks, heaps, etc.
On a typical PC-type platform running Windows and user-mode code, these automatic variables are stored on a stack. These stacks are fixed-size, and are created when the thread starts. As they become instantiated, they take up more of the space on the stack, and the stack pointer moves. If you allocate enough of these variables, you will overflow the stack and your program will die an ugly death.
Try running this on a Windows PC and see what happens for an example:
int main()
{
    int boom[10000000];
    for( int* it = &boom[0]; it != &boom[sizeof(boom)/sizeof(boom[0])]; ++it )
        *it = 42;
}
What people say is indeed true. The object still remains in the memory location. However, the way the stack works means that the object does not take any memory space from the stack.
What usually happens when memory is allocated on the stack is that the stack pointer is decremented by sizeof(type), and when the variable goes out of scope and the object is freed, the stack pointer is incremented, thus shrinking the effective size of data allocated on the stack. Indeed, the data still resides at its original address; it is not destroyed or deleted at all.
And just to clarify, the C++ standard says absolutely nothing about this! The C++ standard is completely unaware of anything called stack or heap in sense of memory allocation because they are platform specific implementation details.
Your local variables on stack do not take extra memory. The system provides some memory from each thread's stack, and the variables on the stack just use part of it. After running out of the scope, the compiler can reuse the same part of the stack for other variables (used later in the same function).
how do we know that it is no longer taking up space in the stack?
We don't. There are ways to see whether they do or don't, but those are architecture and ABI specific. Generally, functions do pop whatever they pushed to the stack when they return control to the caller. What C/C++ guarantees is that the destructor of a high-level object will be called when it leaves scope (though some older compilers, like MSVC 6, had terrible bugs at times where they did not).
Is it guaranteed that obj will not be saved on the stack when AnotherFuncCall is executed?
No. It is up to the compiler to decide when and how to push and pop stack frames as long as that way complies with ABI requirements.
The question "Is something taking up space in the stack" is a bit of a loaded question, because in reality, there is no such thing as free space (at a hardware level.) A lot of people (myself included, at one point) thought that space on a computer is freed by actually clearing it, i.e. changing the data to zeroes. However, this is actually not the case, as doing so would be a lot of extra work. It takes less time to do nothing to memory than it does to clear it. So if you don't need to clear it, don't! This is true for the stack as well as files you delete from your computer. (Ever noticed that "emptying the recycle bin" takes less time than copying those same files to another folder? That's why - they're not actually deleted, the computer just forgets where they're stored.)
Generally speaking, most hardware stacks are implemented with a stack pointer, which tells the CPU where the next empty slot in the stack is. (Or the most recent item pushed on the stack, again, this depends on the CPU architecture.)
When you enter a function, the assembly code subtracts from the stack pointer to create enough room for your local variables, etc. Once the function ends, and you exit scope, the stack pointer is increased by the same amount it was originally decreased, before returning. This increasing of the stack pointer is what is meant by "the local variables on the stack have been freed." It's less that they've been freed and more like "the CPU is now willing to overwrite them with whatever it wants to without a second thought."
Now you may be asking, if our local variables from a previous scope still exist, why can't we use them? Reason being, there's no guarantee they'll still be there from the time you exit scope and the time you try to read them again.
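The slot-reuse behaviour described above can be observed with a small, implementation-specific experiment (nothing here is guaranteed by the standard, and the result may differ between compilers and optimization levels):

```cpp
#include <cstdint>

// Record where two locals in back-to-back scopes were placed. Many
// compilers give both locals the same frame slot, because the first
// is already dead when the second is born.
bool successive_scopes_share_a_slot() {
    std::uintptr_t first = 0, second = 0;
    {
        int a = 1;
        first = reinterpret_cast<std::uintptr_t>(&a);
    }
    {
        int b = 2;
        second = reinterpret_cast<std::uintptr_t>(&b);
    }
    // Only the recorded integer values are compared; no dangling
    // pointer is dereferenced after its scope ends.
    return first == second;
}
```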

Can I exhaust stack?

I know that by using operator new() I can exhaust memory, and I know how to protect myself against such a case, but can I exhaust memory by creating objects on the stack? And if yes, how can I check whether object creation was successful?
Thank you.
You can exhaust a stack. In such cases, your program will probably crash with the stack overflow exception immediately.
A stack has a size too, so you can look at it as simply a block of memory. Variables inside functions, for example, are allocated here. Also, when you call a function, the call itself (return address and so on) is stored on the stack (very simplified, I know). So if you write an infinite recursion (as mentioned in another answer) the stack keeps filling but is never emptied (the emptying normally happens when a function returns and the information about the call is "deleted"), so at some point you will fill the whole space allocated for your program's stack and your app will crash.
Note that there are ways how to determine/change the size of stack.
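For example, on Linux and other POSIX systems one way to inspect the limit is getrlimit (a sketch; this is platform-specific and not portable to Windows):

```cpp
#include <sys/resource.h>   // POSIX only

// Query the soft limit on the main thread's stack, in bytes.
// Returns -1 on error, -2 if the stack is marked unlimited.
long stack_limit_bytes() {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -2;
    return static_cast<long>(rl.rlim_cur);
}
```

The matching setrlimit call (or `ulimit -s` in the shell) changes the limit; thread libraries typically expose a per-thread stack size at creation time as well.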
Just look at the title of this site and you will see the answer.
Write some infinite recursion if you want to see "live" what happens.
i.e.
void fun() { fun(); }
Yes, you can exhaust the stack. On common systems, the hardware/OS traps that and aborts your program. However, it is hard to do so. You would have to either create huge objects on the stack (automatic arrays) or do deep recursion.
Note that, if you use common abstractions such as std::string, std::vector etc., you can hardly ever exhaust the stack, because while they live on the stack, they have their data on the heap. (This is true for all STL containers coming with the std lib except for std::tr1::array.)
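To illustrate the point about containers, here is a small sketch: the vector object you declare locally is just a small control block (typically three pointers), while the element data it manages lives on the heap. The exact sizeof is implementation-dependent, but it is tiny compared to the data:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A "stack" vector holding ~4 MB of data uses only a few dozen
// bytes of actual stack for its control block.
std::size_t stack_footprint_of_big_vector() {
    std::vector<int> v(1000000, 7);   // element data lives on the heap
    assert(v[999999] == 7);           // the data is really there
    return sizeof v;                  // size of the control block only
}
```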
Yes, see the site's name. You can't really check that the object creation is successful -- the program simply crashes on stack overflow.
Memory is not infinite, so wherever you allocate objects you will eventually run out of it.
Ok, but you'll need sharp reactions to spot when the 'object creation' succeeds.
class MyObject {
private:
    int x;
public:
    MyObject() { x = 0; }
};

void IWantToExhaustTheStack(); // forward declaration so main compiles

int main(int argc, char **argv) {
    IWantToExhaustTheStack();
    return 0;
}

void IWantToExhaustTheStack() {
    MyObject o;
    IWantToExhaustTheStack();
}
Now compile and run this, for a very short while your object creation will work. You will know that the object creation has failed, when your program fails.
Joking aside, and in response to your updated question, there is no standard way to determine the stack size. See : This Stackoverflow Question in relation to Win32. However, the stack is used to call methods and hold local temporary and return variables. If you are allocating large objects on the stack, you really should be thinking of putting them on the heap.
Yes, you can exhaust the stack and you cannot test whether object creation fails because after failure this is already too late.
Generally, the only way to protect from stack overflow is to design application in a such way that it will not exceed given limit. I.e. if recursion modifies an image then put limit on image size or use other algorithm for huge images.
Watch recursion (not too deep) and alloca (not too much). Watch for peaks when examining stack usage.
In OpenSolaris there are a few functions that let you control the stack.

Proper stack and heap usage in C++?

I've been programming for a while but It's been mostly Java and C#. I've never actually had to manage memory on my own. I recently began programming in C++ and I'm a little confused as to when I should store things on the stack and when to store them on the heap.
My understanding is that variables which are accessed very frequently should be stored on the stack and objects, rarely used variables, and large data structures should all be stored on the heap. Is this correct or am I incorrect?
No, the difference between stack and heap isn't performance. It's lifespan: any local variable inside a function (anything you do not malloc() or new) lives on the stack. It goes away when you return from the function. If you want something to live longer than the function that declared it, you must allocate it on the heap.
class Thingy { /* members omitted; defined here so the example compiles */ };

Thingy* foo( )
{
    int a;                             // this int lives on the stack
    Thingy B;                          // this Thingy lives on the stack and is destroyed when we return from foo
    Thingy *pointerToB = &B;           // this points to an address on the stack
    Thingy *pointerToC = new Thingy(); // this makes a Thingy on the heap.
                                       // pointerToC contains its address.

    // this is safe: the new Thingy lives on the heap and outlives foo().
    // Whoever you pass this to must remember to delete it!
    return pointerToC;

    // this is NOT SAFE: B lives on the stack and is destroyed when foo() returns.
    // whoever uses this returned pointer will probably cause a crash!
    return pointerToB;
}
For a clearer understanding of what the stack is, come at it from the other end -- rather than try to understand what the stack does in terms of a high level language, look up "call stack" and "calling convention" and see what the machine really does when you call a function. Computer memory is just a series of addresses; "heap" and "stack" are inventions of the compiler.
I would say:
Store it on the stack, if you CAN.
Store it on the heap, if you NEED TO.
Therefore, prefer the stack to the heap. Some possible reasons that you can't store something on the stack are:
It's too big - in multithreaded programs on a 32-bit OS, the stack has a small size that is fixed (at thread-creation time, at least) - typically just a few megs. This is so that you can create lots of threads without exhausting address space. For 64-bit programs, or single-threaded (Linux, anyway) programs, this is not a major issue. Under 32-bit Linux, single-threaded programs usually use dynamic stacks which can keep growing until they reach the top of the heap.
You need to access it outside the scope of the original stack frame - this is really the main reason.
It is possible, with sensible compilers, to allocate non-fixed-size objects on the stack (usually arrays whose size is not known at compile time), e.g. via alloca or C99-style variable-length arrays.
It's more subtle than the other answers suggest. There is no absolute divide between data on the stack and data on the heap based on how you declare it. For example:
std::vector<int> v(10);
In the body of a function, that declares a vector (dynamic array) of ten integers on the stack. But the storage managed by the vector is not on the stack.
Ah, but (the other answers suggest) the lifetime of that storage is bounded by the lifetime of the vector itself, which here is stack-based, so it makes no difference how it's implemented - we can only treat it as a stack-based object with value semantics.
Not so. Suppose the function was:
void GetSomeNumbers(std::vector<int> &result)
{
std::vector<int> v(10);
// fill v with numbers
result.swap(v);
}
So anything with a swap function (and any complex value type should have one) can serve as a kind of rebindable reference to some heap data, under a system which guarantees a single owner of that data.
Therefore the modern C++ approach is to never store the address of heap data in naked local pointer variables. All heap allocations must be hidden inside classes.
If you do that, you can think of all variables in your program as if they were simple value types, and forget about the heap altogether (except when writing a new value-like wrapper class for some heap data, which ought to be unusual).
You merely have to retain one special bit of knowledge to help you optimise: where possible, instead of assigning one variable to another like this:
a = b;
swap them like this:
a.swap(b);
because it's much faster and it doesn't throw exceptions. The only requirement is that you don't need b to continue to hold the same value (it's going to get a's value instead, which would be trashed in a = b).
The downside is that this approach forces you to return values from functions via output parameters instead of the actual return value. But they're fixing that in C++0x with rvalue references.
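A minimal sketch of the swap-instead-of-assign idea (names are illustrative): a.swap(b) exchanges the two strings' heap buffers in constant time with no character copying and no chance of an allocation failure, whereas a = b would copy a million characters.

```cpp
#include <string>

// After the swap, a holds b's old contents (and vice versa).
char swapped_first_char() {
    std::string a(1000000, 'x');
    std::string b(1000000, 'y');
    a.swap(b);          // constant-time pointer exchange, no copy
    return a[0];        // a now owns b's old buffer
}
```

(In C++11 and later, `a = std::move(b);` expresses the same intent more directly; swap was the pre-rvalue-reference idiom this answer describes.)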
In the most complicated situations of all, you would take this idea to the general extreme and use a smart pointer class such as shared_ptr which is already in tr1. (Although I'd argue that if you seem to need it, you've possibly moved outside Standard C++'s sweet spot of applicability.)
You also would store an item on the heap if it needs to be used outside the scope of the function in which it is created. One idiom used with stack objects is called RAII - this involves using the stack based object as a wrapper for a resource, when the object is destroyed, the resource would be cleaned up. Stack based objects are easier to keep track of when you might be throwing exceptions - you don't need to concern yourself with deleting a heap based object in an exception handler. This is why raw pointers are not normally used in modern C++, you would use a smart pointer which can be a stack based wrapper for a raw pointer to a heap based object.
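The RAII idiom described above can be sketched as follows (std::unique_ptr is the modern spelling; the answer's era would have used std::auto_ptr or boost::scoped_ptr, and the Resource type here is purely illustrative):

```cpp
#include <memory>

// A stack-based smart pointer owns a heap object. When the wrapper
// goes out of scope - by normal return or by an exception unwinding
// the stack - its destructor deletes the heap object, so no explicit
// delete (and no cleanup code in exception handlers) is needed.
struct Resource {
    static int live;            // how many Resources currently exist
    Resource()  { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

int use_resource() {
    std::unique_ptr<Resource> p(new Resource); // stack wrapper, heap object
    return Resource::live;      // the Resource is alive while p owns it
}                               // p destroyed here -> Resource deleted
```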
To add to the other answers, it can also be about performance, at least a little bit. Not that you should worry about this unless it's relevant for you, but:
Allocating on the heap requires finding and tracking a block of memory, which is not a constant-time operation (and takes some cycles and overhead). It can get slower as memory becomes fragmented, and/or as you get close to using 100% of your address space. On the other hand, stack allocations are constant-time, basically "free" operations.
Another thing to consider (again, really only important if it becomes an issue) is that typically the stack size is fixed, and can be much lower than the heap size. So if you're allocating large objects or many small objects, you probably want to use the heap; if you run out of stack space, your program dies with this site's titular error: a stack overflow. Not usually a big deal, but another thing to consider.
The stack is more efficient, and easier for managing scoped data.
But the heap should be used for anything larger than a few KB (it's easy in C++, just create a boost::scoped_ptr on the stack to hold a pointer to the allocated memory).
Consider a recursive algorithm that keeps calling into itself: it's very hard to limit or even guess the total stack usage! Whereas on the heap, the allocator (malloc() or new) can indicate out-of-memory by returning NULL or throwing.
Source: the Linux kernel, whose per-thread stacks are no larger than 8KB!
For completeness, you may read Miro Samek's article about the problems of using the heap in the context of embedded software.
A Heap of Problems
The choice of whether to allocate on the heap or on the stack is made for you, depending on how your variable is allocated. If you allocate something dynamically, using a "new" call, you are allocating from the heap. If you declare a local variable or a function parameter, it is allocated on the stack. (Global variables, despite what is sometimes said, are not on the stack; they get static storage in a separate data segment.)
In my opinion there are two deciding factors
1) Scope of variable
2) Performance.
I would prefer to use stack in most cases but if you need access to variable outside scope you can use heap.
To improve performance when using the heap, you can also create a dedicated heap (a memory pool) and allocate related objects from one block, which can gain performance compared to allocating each variable at a scattered memory location.
Probably this has been answered quite well already. I would like to point you to the series of articles below for a deeper understanding of the low-level details. Alex Darby has a series of articles where he walks you through the machinery with a debugger. Here is Part 3, about the stack:
http://www.altdevblogaday.com/2011/12/14/c-c-low-level-curriculum-part-3-the-stack/