C++: stack overflow error in recursion

When does recursion cause a stack overflow error in C++? How much memory is consumed while working with recursion? Is it 4 bytes per function invocation (4 being the size of a pointer)? Does that mean there is a different pointer associated with each call?

Every platform has limits on how much stack a program can use. When a recursive function does not meet its terminating criterion soon enough, it will lead to stack overflow.
In Microsoft Visual Studio compilers, you can specify the stack size using the compiler option /F (there is also a linker option, /STACK). Without these, the default stack size is 1 MB. You can get more information at http://msdn.microsoft.com/en-us/library/tdkhxaks.aspx.
Each stack frame needs a different amount of memory, determined by the number and types of local variables, the return type, and the number and types of parameters. Hence, the number of stack frames you can use without causing a stack overflow varies.
g++/gcc also have ways of passing a stack size to the linker (for example -Wl,-stack_size on macOS or -Wl,--stack with MinGW; on Linux the limit is normally controlled with ulimit/setrlimit instead). You can find more on that subject at Change stack size for a C++ application in Linux during compilation with GNU compiler.
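To make the failure mode concrete, here is a minimal sketch, assuming default stack settings and a build without optimizations (an optimizer may turn this recursion into a loop):

#include <iostream>

// Recursive sum of 1..n. The terminating criterion (n == 0) is never met for
// negative input, so each call keeps pushing a new stack frame until the
// stack limit is exceeded and the program crashes with a stack overflow.
long long sum_to(long long n)
{
    if (n == 0)
        return 0;
    return n + sum_to(n - 1);
}

int main()
{
    std::cout << sum_to(10) << '\n';   // fine: about 10 nested frames
    std::cout << sum_to(-1) << '\n';   // terminating criterion never met: stack overflow
}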

C++ as a language has no notion of a "stack" or "stack overflow".
The stack is an implementation detail. The amount consumed per call depends on your platform, your compiler, the actual code etc. As a rule of thumb, you can expect the return address and all of the function's arguments to be pushed onto the stack. Additionally, automatic variables usually live on the stack (but see below).
This is, however, a simplification: in some cases the compiler might be able to eliminate function calls altogether or turn them into jump instructions. Arguments are commonly passed in registers. Automatic variables can be optimized away or stored in registers. Etc etc.
If you want to know for sure, compile your code to assembly and carefully study the result. Alternatively, rig up some representative benchmarks and run them until the stack is exhausted.
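For instance, here is a rough, hedged sketch of such a benchmark; the probe function is illustrative, the result is only an estimate, and it should be built without optimizations (e.g. -O0) or the compiler may collapse the recursive frames:

#include <cstdint>
#include <iostream>

// Estimate the stack consumed per call by measuring how far apart a local
// variable lands in nested calls. Non-portable by nature: the answer depends
// on platform, compiler, flags and calling convention.
std::uintptr_t probe(int depth, std::uintptr_t top)
{
    volatile char marker = 0;   // volatile so it is actually kept on the stack
    std::uintptr_t here = reinterpret_cast<std::uintptr_t>(&marker);
    if (depth == 0)
        return top > here ? top - here : here - top;
    return probe(depth - 1, top);   // tail call: build with -O0 to keep the frames
}

int main()
{
    volatile char start = 0;
    std::uintptr_t top = reinterpret_cast<std::uintptr_t>(&start);
    std::cout << "approx. bytes per call: " << probe(100, top) / 100 << '\n';
}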
Last but not least, the amount of stack available to an application is often configurable at the OS level.

Related

Using whole stack memory

Hello, I heard that in C++ stack memory is used for "normal" variables. How do I make the stack full? I tried to use a ton of arrays but it didn't help. How big is the stack and where is it located?
The C++ language doesn't specify such thing as "stack". It is an implementation detail, and as such it doesn't make sense deliberating about unless we are discussing a particular implementation of C++.
But yes, in a typical C++ implementation, automatic variables are stored on the execution stack.
How do I make the stack full?
Step 1: Use a language implementation that has limited stack size. This is quite common.
Step 2: Create an automatic variable that exceeds the limit. Or nest too many non-tail-recursive function calls. If you're lucky, the program may crash.
You wouldn't want the stack to be exhausted in production use.
How big is the stack
Depends on language implementation. It may even be configurable. The default is one to a few megabytes on common desktop/server systems. Less on embedded systems.
and where is it located?
Somewhere in memory where the language implementation has chosen.
The most important thing to take out of this is that the memory available for automatic variables is typically limited. As such:
Don't use large automatic variables (a short illustration follows this list).
Don't use recursion when asymptotic growth of depth is linear or worse.
Don't let user input affect the amount or size of automatic variables or depth of recursion without constraint.
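A minimal, hedged illustration of the first rule; the sizes are only indicative, and a typical default stack is on the order of 1-8 MB:

#include <vector>

void risky()
{
    double samples[1000000];              // ~8 MB automatic array: can overflow a 1-8 MB stack
    samples[0] = 0.0;                     // touch it so it is not trivially optimized away
}

void safer()
{
    std::vector<double> samples(1000000); // same amount of storage, but allocated on the heap
    samples[0] = 0.0;
}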
Hello, I heard that in C++ stack memory is used for "normal" variables.
Local (automatic) variables declared in a function or in main are usually allocated on the stack (or kept in registers) and are deallocated when execution leaves their scope.
How do I make the stack full? I tried to use a ton of arrays but it didn't help.
Using a ton of arrays, making many recursive calls, and passing large structs that contain tons of arrays by value as parameters are all ways to do it. Another way is to reduce the stack size, e.g. -Wl,--stack,<number> (for GCC toolchains whose linker supports it, such as MinGW; on Linux the limit is usually set with ulimit -s).
How big is the stack and where is it located?
It depends on the platform, the operating system, and so on. The standard does not specify any stack size. Its location is determined by the OS before the program starts; the OS allocates memory for the stack from virtual memory.
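Putting those ingredients together, here is a minimal, hedged sketch that deliberately fills the stack; the frame size and the depth reached before the crash vary by platform, compiler and settings:

#include <iostream>

// Each call reserves roughly 1 MB of automatic storage, so with a typical
// 1-8 MB default stack only a handful of nested calls fit before the crash.
void burn(int depth)
{
    volatile char big[1024 * 1024];       // ~1 MB automatic array per frame
    big[0] = static_cast<char>(depth);    // touch it so it is really allocated
    std::cout << "reached depth " << depth << '\n';
    burn(depth + 1);                      // no terminating criterion
}

int main()
{
    burn(0);                              // crashes once the stack is exhausted
}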

How to calculate the remaining size of the stack?

I'm architecting a small software engine and I'd like to make extensive use of the stack for rapid iteration over large number sets. But then it occurred to me that this might be a bad idea, since the stack isn't as large a memory store as the heap. But I am attracted to the stack's speed and the fact that it avoids dynamic-allocation coding practices.
Is there a way to find out how far I can push the stack on a given platform? I am looking mainly at mobile devices, but the issue could come up on any platform.
On *nix, use getrlimit:
RLIMIT_STACK
The maximum size of the process stack, in bytes. Upon
reaching this limit, a SIGSEGV signal is generated. To handle
this signal, a process must employ an alternate signal stack
(sigaltstack(2)).
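A minimal sketch of that query on a POSIX system (rlim_cur, the soft limit, is the value actually enforced; RLIM_INFINITY means unlimited):

#include <sys/resource.h>
#include <cstdio>

int main()
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        std::perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        std::printf("stack soft limit: unlimited\n");
    else
        std::printf("stack soft limit: %llu bytes\n",
                    static_cast<unsigned long long>(rl.rlim_cur));
    return 0;
}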
On Windows, use VirtualQuery:
For the first call, pass it the address of any value on the stack to
get the base address and size, in bytes, of the committed stack space.
On an x86 machine where the stack grows downwards, subtract the size
from the base address and VirtualQuery again: this will give you the
size of the space reserved for the stack (assuming you're not
precisely on the limit of stack size at the time). Summing the two
naturally gives you the total stack size.
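And a hedged sketch following the Windows recipe just quoted. It ignores the guard page and assumes the reserved region is at least as large as the committed one, so treat the result as approximate:

#include <windows.h>
#include <cstdio>

int main()
{
    // First query: the region containing a local variable describes the
    // committed portion of this thread's stack.
    MEMORY_BASIC_INFORMATION committed{};
    int local = 0;
    VirtualQuery(&local, &committed, sizeof committed);

    // Second query, below the committed region: the reserved-but-uncommitted
    // portion (per the recipe above; the guard page is not accounted for).
    MEMORY_BASIC_INFORMATION reserved{};
    const char* lower = static_cast<const char*>(committed.BaseAddress) - committed.RegionSize;
    VirtualQuery(lower, &reserved, sizeof reserved);

    std::printf("committed: %zu bytes, reserved: %zu bytes, total: ~%zu bytes\n",
                committed.RegionSize, reserved.RegionSize,
                committed.RegionSize + reserved.RegionSize);
}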
There is no platform-independent method, since the stack size is logically left to the implementation and the host system: on an embedded mini-SoC there are fewer resources to distribute than on a 128 GB RAM server. You can, however, influence the stack size of a specific thread on all OSes with API-specific calls.
A possible portable solution is to write an allocator yourself.
You do not have to make use of the process stack, just simulate it in the heap.
Allocate a large amount of memory in the beginning, and write a stack allocator on top of it to use it while allocating.
Google 'Allocator Requirements' for information on how to achieve it in C++.
I'm not sure if the term 'stack allocator' is canonical, but I mean that you have to put stack-like restrictions on where allocation and deallocation can happen.
Since you said that your algorithm is suited to this pattern, I think it'd be easy.
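As a hedged sketch of that idea, here is a tiny heap-backed, LIFO-only arena; the names (StackArena, allocate, mark, release_to) are illustrative, not any standard API:

#include <cstddef>
#include <cstdlib>
#include <new>

// A LIFO arena: allocation bumps a top-of-stack offset, and deallocation is
// only permitted back to a previously saved mark: the same restriction the
// real call stack imposes, but with a size you choose yourself.
class StackArena {
public:
    explicit StackArena(std::size_t bytes)
        : buffer_(static_cast<char*>(std::malloc(bytes))), size_(bytes), top_(0)
    {
        if (!buffer_) throw std::bad_alloc();
    }
    ~StackArena() { std::free(buffer_); }

    void* allocate(std::size_t n)
    {
        // Round up to max_align_t so any object type can live at the result.
        n = (n + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
        if (top_ + n > size_) throw std::bad_alloc();
        void* p = buffer_ + top_;
        top_ += n;
        return p;
    }

    std::size_t mark() const { return top_; }
    void release_to(std::size_t m) { top_ = m; }   // "pop" everything allocated after m

private:
    char*       buffer_;
    std::size_t size_;
    std::size_t top_;
};

int main()
{
    StackArena arena(16u * 1024 * 1024);                 // a 16 MB "stack" on the heap
    std::size_t m = arena.mark();
    int* nums = static_cast<int*>(arena.allocate(100000 * sizeof(int)));
    nums[0] = 42;
    arena.release_to(m);                                 // LIFO cleanup, no per-object free
}

A production version would wrap this in a class satisfying the standard Allocator requirements, as the answer above suggests, so it can be plugged into containers.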
In standard C++, definitely not. In a portable way, probably not. On a particular OS, sometimes. If nothing else, you could open your own executable and inspect its headers to see its stack size. [The next problem is of course "how much of the stack was used before this bit of code", which can be difficult to determine.]
If you run the code in a separate thread, many of the (low-level) thread interfaces allow you to specify a stack (or stack size), e.g. POSIX threads' pthread_attr_setstacksize or Microsoft's _beginthread. Again, you don't know EXACTLY how much space has been used up before it gets to the actual thread code, but it's probably not a huge amount.
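For example, a minimal POSIX sketch (build with -pthread) that runs the work in a thread whose stack size is chosen explicitly, so at least that upper bound is known up front:

#include <pthread.h>
#include <cstdio>

void* work(void*)
{
    std::puts("running on a thread with a 16 MB stack chosen at creation time");
    return nullptr;
}

int main()
{
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 16u * 1024 * 1024);  // must be at least PTHREAD_STACK_MIN

    pthread_t tid;
    if (pthread_create(&tid, &attr, work, nullptr) != 0)
        return 1;
    pthread_join(tid, nullptr);
    pthread_attr_destroy(&attr);
}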
Of course, in an embedded system (e.g. a mobile phone), the stack size is typically quite small: 4 KB, 12 KB or 64 KB is very much normal, and sometimes it is even a lot smaller than that.
Another potential problem is that you can't really know how much space is ACTUALLY used on the stack. You can measure after the fact in a compiled system, and of course, if you have a stack-local array of int array[25];, we know it takes up at least 25 * sizeof(int), but there may be padding, the compiler may save registers on the stack, and so on.
Edit, as an afterthought:
I also don't really see much benefit in having two code-paths:
if (enough_stack_space_for_something)
use_stack_based_algorithm();
else
use_heap_based_algorithm();
This would add a fair amount of extra overhead, and more code is generally not a good plan in an embedded/mobile system.
Edit 2: Also, if allocating memory is a major part of the runtime, perhaps look at why that is; for example, would creating objects in blocks help?
To expand on the answers already given about why there is no portable way to do this, the entire concept of an actual stack is not part of the standard. You could write a C or C++ runtime that doesn't use a stack at all other than the function call records (which might internally be a linked list or something else).
The stack is an implementation detail of a particular machine/OS/compiler. Hence any technique to access stack metrics will be specific to machine/OS/compiler.
While not an actual answer to your specific question (Niels covered that quite well), here is some advice for your problem domain: just allocate a large chunk of memory on the heap. There's no reason, aside from convenience, that the "real" stack is any different. Highly recursive (non-tail-recursive) algorithms often need to do this to ensure that they have a virtually unbounded "stack". Scripting languages that want to give a runtime error/exception rather than crash the host application also often do this. To be efficient about it, you can either implement a "split stack" (like a std::deque would give you) or you can just be sure to preallocate a stack big enough for your needs.
There's no standard way to do it from within the language. I'm not even aware of a documented extension that is able to query.
However, some compilers have options to set the stack size, and the platform may specify what it does when launching a process, and/or provide ways to set the stack size of a new thread, maybe even to manipulate an existing one.
For small platforms it's usual to know the whole memory size, have all the data segments at one end and a fixed-size arena for the heap (possibly zero), with the rest being stack, approaching from the other side.

UNIX: What should the stack size (ulimit -s) be in UNIX?

How can I calculate the minimum stack size required for my program on UNIX, so that it never crashes?
Suppose my program is
int main()
{
    int number;
    number++;
    return 0;
}
1) What stack size is required to run this program? How is it calculated?
2) My UNIX system gives ulimit -s 512000. Is this value of 512 MB really required for my small program?
3) And what if I have a big program with multiple threads, some 500 functions, libraries, macros, dynamically allocated memory, etc.? How much stack is required for that?
Your program in itself uses a few bytes - 1 int, but there is of course the part of the runtime that comes BEFORE main as well to take into account. But it's unlikely to be more than a few dozen bytes, maybe a couple of hundred bytes at a stretch. Since the minimum stack size in any modern OS is "one page" = 4KB, this should easily fit in that.
512000 KB is 500 MB, which seems quite high. On my Linux Fedora 16 x86-64 machine, the default is 8192 KB (8 MB).
Threads don't really matter, as each thread has its own stack. The number of functions is in itself not a huge contributor to stack usage. Running out of stack is nearly always caused by large local variables and/or deep recursion. For any program that is more than a little bit complex, calculating precise stack usage can be quite tricky. Typically, it involves running the program a lot and seeing if the stack "explodes". If not, you have enough stack. Library functions, generally speaking, tend not to use huge amounts of stack, but there are always exceptions.
To exemplify:
void func()
{
    int x, y, z;
    float w;
    ...
}
This function takes up approximately 16 bytes of stack, plus the general overhead of calling a function, typically 1-3 "machine words" (4-12 bytes on a 32-bit machine, 8-24 bytes for a 64-bit machine).
void func2()
{
    int x[10000];
    ...
}
This function will take 40000 bytes of stack space (assuming a 4-byte int). Obviously, you don't need many recursive calls to this function to run out of stack.
There is no magic way to tell how much space your program will require on the stack. It depends on what the code is actually doing. Infinite (or very deep) recursion will result in stack overflow even if the program doesn't seem to do anything.
As an example, see the following:
$ ulimit -s
unlimited
$ echo "foo(){foo();} main(){foo();}" | gcc -x c -
$ ./a.out
Segmentation fault (core dumped)
Most people rely on the stack being “large” and their programs not using all of it, simply because the size has been set so large that programs rarely fail because they run out of stack space unless they use very large arrays with automatic storage duration.
This is an engineering failure, in the sense that it is not engineering: A known and largely preventable source of complete failure is uncontrolled.
In general, it can be difficult to compute the actual stack needs of a program. Especially when there is recursion, a compiler cannot generally predict how many times a routine will be called recursively, so it cannot know how many times that routine will need stack space. Another complication is calls to addresses prepared at run-time, such as calls to virtual functions or through other pointers-to-functions.
However, compilers and linkers could provide some assistance. For any routine that uses a fixed amount of stack space, a compiler, in theory, could provide that information. A routine may include blocks that are or are not executed, and each block might have different stack space requirements. This would interfere with a compiler providing a fixed number for the routine, but a compiler might provide information about each block individually and/or a maximum for the routine.
Linkers could, in theory, examine the call tree and, if it is static and not recursive, provide a maximum stack use for the linked program. They could also provide the stack use along a particular call subchain (e.g., from one routine through the chain of calls that leads to the same routine being called recursively) so that a human could then apply knowledge of the algorithm to multiply the stack use of the subchain by the maximum number of times it might be called recursively.
I have not seen compilers or linkers with these features. This suggests there is little economic incentive for developing these features.
There are times when stack use information is important. Operating system kernels may have a stack that is much more limited than user processes, so the maximum stack use of the kernel code ought (as a good engineering practice) to be calculated so that the stack size can be set appropriately (or the code redesigned to use less stack).
If you have a critical need for calculating stack space requirements, you can examine the assembly code generated by the compiler. In many routines on many computing platforms, a fixed number is subtracted from the stack pointer at the beginning of the routine. In the absence of additional subtractions or “push” instructions, this is the stack use of the routine, excluding further stack used by subroutines it calls. However, routines may contain blocks of code that contain additional stack allocations, so you must be careful about examining the generated assembly code to ensure you have found all stack adjustments.
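As an illustration only (not a guaranteed output), here is a routine with a fixed-size frame and the kind of prologue a typical x86-64 compiler emits for it unoptimized; the exact instructions and constants vary with compiler, flags and ABI:

// A function whose stack need is fixed at compile time.
int sum_buffer()
{
    int buf[64];                     // 256 bytes of automatic storage
    int total = 0;
    for (int i = 0; i < 64; ++i) {
        buf[i] = i;
        total += buf[i];
    }
    return total;
}

// Typical unoptimized x86-64 prologue for such a routine (illustrative only):
//     push  rbp
//     mov   rbp, rsp
//     sub   rsp, 272               ; fixed reservation for buf, total, i, padding
// The constant operand of "sub rsp, ..." is the number to record for this routine.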
Routines may also contain stack allocations computed at run-time. In a situation where calculating stack space is critical, you might avoid writing code that causes such allocations (e.g., avoid using C’s variable-length array feature).
Once you have determined the stack use of each routine, you can determine the total stack use of the program by adding the stack use of each routine along various routine-call paths (including the stack use of the start routine that runs before main is called).
This sort of calculation of the stack use of a complete program is generally difficult and is rarely performed.
You can generally estimate the stack use of a program by knowing how much data it “needs” to do its work. Each routine generally needs stack space for the objects it uses with automatic storage duration plus some overhead for saving processor registers, passing parameters to subroutines, some scratch work, and so on. Many things can alter stack use, so only an estimate can be obtained this way. For example, your sample program does not need any space for number. Since no result of declaring or using number is ever printed, the optimizer in your compiler can eliminate it. Your program only needs stack space for the start routine; the main routine does not need to do anything except return zero.

How can I determine appropriate stack and heap sizes for ARM Cortex, using C++

The Cortex-M3 processor startup file allows you to specify the amount of RAM dedicated to the stack and the heap. For a C++ code base, is there a general rule of thumb, or perhaps some more explicit way, to determine the values for the stack and heap sizes? For example, would you count the number and size of unique objects, or maybe use the compiled code size?
The Cortex-M3 processor startup file allows you to specify the amount of RAM dedicated to the stack and the heap.
That is not a feature of the Cortex-M3, but rather the start-up code provided by your development toolchain. It is the way the Keil ARM-MDK default start-up files for M3 work. It is slightly unusual; more commonly you would specify a stack size, and any remaining memory after stack and static memory allocation by the linker becomes the heap; this is arguably better since you do not end up with a pool of unusable memory. You could modify that and use an alternative scheme, but you'd need to know what you are doing.
If you are using Keil ARM-MDK, the linker options --info=stack and --callgraph add information to the map file that aids stack requirement analysis. These and other techniques are described here.
If you are using an RTOS or multi-tasking kernel, each task will have its own stack. The OS may provide stack analysis tools, Keil's RTX kernel viewer shows current stack usage but not peak stack usage (so is mostly useless, and it only works correctly for tasks with default stack lengths).
If you have to implement stack-checking tools yourself, the normal method is to fill the stack with a known value and then, starting from the far end of the stack (the lowest address for a descending stack), inspect the values until you find the first one that no longer holds the fill value; this gives the likely high-tide mark of the stack. You can implement code to do this, or you can manually fill the memory from the debugger and then monitor stack usage in a debugger memory window.
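Here is a hedged sketch of the scanning half of that technique for a single-stack Cortex-M style setup. The symbols __StackLimit and __StackTop are assumptions standing in for whatever your linker script defines, and the region is assumed to have been pre-filled with the pattern by the startup code or from the debugger:

#include <cstdint>
#include <cstddef>

extern "C" std::uint32_t __StackLimit;   // lowest stack address (assumed linker symbol)
extern "C" std::uint32_t __StackTop;     // just past the highest stack address (assumed linker symbol)

static constexpr std::uint32_t kFill = 0xDEADBEEFu;   // the pre-filled pattern

// Scan upward from the bottom of the stack region: the first word that no
// longer holds the fill pattern marks the deepest point reached so far.
std::size_t max_stack_used()
{
    const volatile std::uint32_t* p = &__StackLimit;
    while (p < &__StackTop && *p == kFill)
        ++p;
    return reinterpret_cast<std::uintptr_t>(&__StackTop) -
           reinterpret_cast<std::uintptr_t>(p);
}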
Heap requirements will depend on the run-time behaviour of your code; you'll have to analyse that yourself. However, in ARM/Keil RealView, the MemManage exception handler will be called when C++'s new throws an exception; I am not sure whether malloc() does that or simply returns NULL. You can place a breakpoint in the exception handler, or modify the handler to emit an error message, to detect heap exhaustion during testing. There is also a __heapstats() function that can be used to output heap information. It has a somewhat cumbersome interface; I wrapped it thus:
#include <cstdio>   // std::fprintf, stdout

// __heapstats() is an ARM C library extension: it reports heap statistics
// through the print function passed to it.
void heapinfo()
{
    typedef int (*__heapprt)(void*, char const*, ...);
    __heapstats((__heapprt)std::fprintf, stdout);
}
The compiled code size will not help as the code does not run in the stack nor the heap. Cortex-M3 devices are typically implemented on microcontrollers with built in Flash and a relatively small amount of RAM. In this configuration, the code will typically run from Flash.
The heap is used for dynamic memory allocation. Counting the number of unique objects will give you a rough estimate but you also have to account for any other elements that use dynamic memory allocation (using the new keyword in C++). Generally, dynamic memory allocation is avoided in embedded systems for the precise reason that heap size is hard to manage.
The stack will be used for parameter passing, local variables, and context saving during exception-handling routines. It is generally hard to get a good idea of stack usage unless your code allocates a large block of local memory or large objects. One technique that may help is to allocate all of the available RAM you have for the stack, fill the stack with a known pattern (0x00 or 0xFF are not the best choices, since these values occur frequently), run the system for a while, then examine the stack to see how much was used. Admittedly, this is not a very precise or scientific approach, but it is still helpful in many cases.
The latest version of the IAR Compiler has a feature that will determine what stack size you need, based on a static analysis of your code (assuming you don't have any recursion).
The general approach, if you don't have an exact number, is to make the stack as big as you can, and then, when you start running out of memory, trim it down until your program crashes due to a stack overflow. I wish that was a joke, but that is the way it is usually done.
Reducing until it crashes is a quick ad-hoc way. You can also fill the stack with a known value, say, 0xCCCC, and then monitor maximum stack usage by scanning for the 0xCCCC.
It's imperfect, but much better than looking for a crash.
The rationale being that reducing the stack size does not guarantee that a stack overflow will munch something "visible".