Let me start by saying that my question is not about what stack overflows are, but about ways to make one happen without compile-time errors or warnings.
I know (first hand) you can overflow a stack with recursion:
void endlessRecursion()
{
    int x = 1;
    if (x) endlessRecursion(); // the 'if' is just to hush the compiler
}
My question is: is it possible to overflow the stack by declaring too many local variables?
The obvious way is to just declare a huge array, like so:
void myStackOverflow()
{
    char maxedArrSize[0x3FFFFFFF]; // < 1GB, compiler didn't yell
}
In practice, even 0xFFFFF bytes causes a stack overflow on my machine.
So, I was wondering:
1. Since I haven't tried it: if I declare enough variables, would the stack overflow?
2. Is there a way to use the preprocessor or other compile-time "tools" (like C++ template meta-programming) to do the first thing, i.e. make it declare a lot of local variables by somehow causing it to loop? If so, how?
3. This is theoretical: is there a way to know at compile time whether a program's stack would overflow? If so, please explain.
Yes, allocating a large amount of memory on the stack will cause a stack overflow. It shouldn't matter whether you allocate one large variable or a lot of small ones; the total size is what's relevant.
You can't do a compile-time loop with the preprocessor, but you can implement some shortcuts that let you generate large amounts of code without typing it all. For example:
#define DECLARE1 { int i;
#define END1 }
#define DECLARE2 DECLARE1 DECLARE1
#define END2 END1 END1
#define DECLARE4 DECLARE2 DECLARE2
#define END4 END2 END2
and so on. This puts the multiple int i; declarations in nested blocks, ensuring that all the objects exist at the same time while avoiding name conflicts. (I couldn't think of a way to give all the variables distinct names.)
DECLARE4 END4
expands to:
{ int i; { int i; { int i; { int i; } } } }
This won't work if your compiler imposes a limit on the length of a line after preprocessing.
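Continuing the doubling pattern, here is a sketch of how the macros might be used inside a function (the extra macro levels and the function name burnStack are my own additions; note that an optimizing compiler may elide unused locals, so this is most convincing with optimizations disabled):
#define DECLARE8  DECLARE4 DECLARE4
#define END8      END4 END4
#define DECLARE16 DECLARE8 DECLARE8
#define END16     END8 END8
/* ...keep doubling until the expansion is large enough... */

void burnStack(void)
{
    DECLARE16
        i = 0; /* the innermost i is in scope here; touching it discourages the compiler from dropping it */
    END16
}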
The lesson here is that the preprocessor isn't really designed for this kind of thing. It's much easier and more flexible to write a program in your favorite scripting language that generates the declarations. For example, in bash:
for i in {1..100} ; do
    echo " int i$i;"
done
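The generated declarations can then be dropped into a function body. One convenient way (the file names gen.sh and decls.h are my own) is to redirect the script's output to a header and include it mid-function, which both C and C++ permit:
/* bash gen.sh > decls.h, then: */
void manyLocals(void)
{
#include "decls.h" /* expands to: int i1; int i2; ... int i100; */
}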
On question 3, I believe the answer is no. The compiler can know how much stack space each function uses. But the total stack space depends on your call sequences, which depend on logic evaluated at runtime. As long as there are no recursive calls, it seems possible to determine an upper bound for the stack space used. If that upper bound is smaller than the available stack space, you could be certain that the stack will not overflow. A lower bound seems possible as well. If that is higher than the stack size, you could be certain that the stack will overflow.
Except for very trivial programs, this would only give you boundaries, not an exact amount of stack space. And once recursion gets involved, I don't think there's an upper bound that you can statically determine in the general case. That almost starts sounding like the halting problem.
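In practice, some compilers expose the per-function half of this analysis. For example, GCC's -fstack-usage option writes each function's stack usage to a .su file, and -Wstack-usage=N warns when a function's usage may exceed N bytes; the invocation below is just an illustration:
gcc -fstack-usage -Wstack-usage=8192 -c foo.c   # emits foo.su with per-function stack usage
Neither option follows call chains for you, so this is a building block, not a whole-program answer.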
All of the very limited options above obviously assume that you have a given stack size. As other posters mentioned, the compiler can generally not know the stack size, because it's often part of the system configuration.
The closest I have seen are static analyzers. I seem to remember that some of them flag large stack variables. But I doubt that they try to analyze actual stack usage. It's probably just a simple heuristic that basically tells you that having large variables on the stack is a bad idea, and that you may want to avoid it.
Yes, too many variables will blow the stack.
Since I haven't tried it, if I declare enough variables would the stack overflow?
Yes; declaring a single large array and declaring many variables in the same scope have a similar effect. What matters is the total size.
Is there a way to use the preprocessor to do the first thing, i.e. make it declare a lot of local variables, by somehow causing it to loop?
I don't think so. Memory for locals is only allocated at runtime, once execution starts from main(). Whatever you declare using preprocessor directives is expanded during the preprocessing stage, and that stage doesn't involve any memory allocation.
This is theoretical - Is there a way to know if a program's stack would overflow?
Yes; on a Linux system you can find out the amount of stack memory allocated to your program, and anything more than that will lead to a stack overflow. You can read this link for details on how to find the stack size of a process.
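For instance, here is a minimal sketch using the POSIX getrlimit() call to query the current (soft) stack limit of the running process:
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) == 0)
        printf("soft stack limit: %ld bytes\n", (long)rl.rlim_cur); /* may be RLIM_INFINITY */
    return 0;
}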
As to #3
Is there a way to know if a program's stack would overflow at compilation time?
Yes. This is a standard feature of some PIC compilers for embedded processors, especially those using a Harvard architecture. But it comes at a cost: no recursion and no VLAs. At compile time, an analysis of the code reports the maximum stack depth in the main processor code as well as the maximum depth while handling interrupts. But the analysis does not prove that the maximum combined depth of the two will actually occur.
Depending on the processor type, an ample stack can then be allocated at compile time, preventing possible overflow.
Related
I was doing some trick questions about C++, ran into code similar to this, and then modified it to see what would happen.
I don't understand why this recursion works at first (it prints values from 2 to 4764) and then suddenly throws an exception.
I also don't understand why I can write return in a void function and actually return something other than plain return;.
Can anyone explain these two problems?
#include <iostream>
#include <cstdlib> // for system()
using namespace std;

void function(int& a) {
    a++;
    cout << a << endl;
    return function(a);
}

int main() {
    int b = 2;
    function(b);
    system("pause>0");
}
The comments have correctly identified that your infinite recursion is causing a stack overflow: each new call to the same function takes up more RAM, until you use up the amount allocated for this purpose (the default C++ stack size varies greatly by environment, from tens of kB on old systems to 10+ MB on the upper end). While the function itself is doing very little in terms of memory, the stack frames (which keep track of which function called which other ongoing function, with what parameters) can take up quite a lot.
While useful for certain data structures, recursive programs should not need to go several thousand layers deep; they usually add a stop condition (in this case, even checking whether a > some_limit) to identify the point where they have gone too deep and need to stop adding more calls to the stack (with a plain return;).
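A sketch of the same function with such a stop condition added (the limit value of 100000 is arbitrary):
void function(int& a) {
    if (a > 100000) return; // stop condition: bail out before the stack blows
    a++;
    cout << a << endl;
    return function(a);     // legal: a void function may return an expression of type void
}
That last comment also answers the second question: return function(a); compiles because C++ explicitly allows returning an expression of type void from a function returning void.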
In this case, the exact same output can be achieved with a simple for loop, so I guess these trick questions are purely experimental.
On x86-64 platforms, like your laptop or desktop, functions get called one of two ways:
with a call assembly instruction
with a jmp assembly instruction
What's the difference? A call instruction pushes a return address onto the stack: after the function is called, the code will return to the place it was called from, and that bookkeeping uses memory on the stack. If a recursive function calls itself using call, then as it recurses it'll use up more and more of the stack, eventually resulting in a stack overflow.
On the other hand, a jmp instruction just tells the CPU to jump to the section of code where the other function is stored. If a function is calling itself, then the CPU will just jmp back up to the top of the function and start it over with the updated parameters. This is called a tail-call optimization, and it prevents stack overflow entirely in a lot of common cases because the stack doesn't grow.
If you compile your code at a higher optimization level (say, -O2 on GCC), then the compiler will use tail-call optimization and your code won't have a stack overflow.
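One way to check this yourself (the flags are GCC's; exact output varies by compiler and version) is to inspect the generated assembly:
g++ -O2 -S main.cpp   # in main.s, the recursive call should appear as jmp rather than call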
Given a number X, and reading X numbers into a one-dimensional array, which of the following ways is the best (fastest in execution time)?
Please note that X is a number between 1 and 1000000
scanf("%d", &x);
int array[x];
//continue reading X numbers into array
Or
scanf("%d", &x);
int array[1000000];
//continue reading X ...
Or
scanf("%d", &x);
int * array = malloc(x*sizeof(int));
//same as above
free(array);
Or the C++ dynamic allocation method?
Note 1: I am posting this from a mobile phone, so I hope the format of the code above is fine; if not, I ask nicely for somebody (<3) to edit it, since it is painful to indent code from a phone.
Note 2: How could I test what I asked above?
Since scanf appears here (and the comments assume there are another million calls to scanf), any question regarding the memory allocation in combination with "which is fastest?" can be universally answered with: "Yes" (read as: irrelevant).
While automatic storage ("stack allocation") is generally faster than freestore, it is entirely insignificant compared to the time you will spend in scanf. That being said, it is usually (not necessarily, but usually) dynamic deallocation which is slow, not allocation.
A couple of points to note in general on that code:
1. Reading an integer from some external source (file, network, argv, whatever) and doing an allocation based on that number without doing a sanity check first is massively bad karma. This is bound to cause a problem one day; it is how many existing real-world exploits came into being. Do not blindly trust that any number you got from somewhere is automatically valid. Even if no malice is involved, accident may still provide an invalid number which will cause catastrophic failure. (A sketch of such a check appears below.)
2. Allocating a non-constant-sized array on the stack will work under recent versions of C, and will "work" as an extension even under C++ if you use GCC, but it is normally not allowable in C++ (meaning it will fail to compile).
3. Allocating a million integers means roughly 4 MB of memory, which is pretty harsh towards your maximum stack size (often only 1 MB). Expect a stack overflow.
4. Allocating an unknown number of integers (but expecting the number to be up to a million) is similar to (3).
The worst thing re (3) and (4) is that it may actually succeed. Which possibly means your program will unexpectedly crash later (encountering a stack overflow), in an entirely unrelated innocent piece of code. And you will wonder why that happens, since the code that crashes looks like it is perfectly valid (and it is, indeed!).
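As a sketch of the sanity check from point 1 (the bounds are taken from the question; error handling is kept minimal):
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int x;
    if (scanf("%d", &x) != 1 || x < 1 || x > 1000000) {
        fprintf(stderr, "invalid input\n");
        return 1;
    }
    int *array = (int *)malloc(x * sizeof *array);
    if (array == NULL) {
        fprintf(stderr, "out of memory\n");
        return 1;
    }
    /* ...read x numbers into array... */
    free(array);
    return 0;
}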
In standard C++ you'll get a compilation error for this code:
scanf("%d", &x);
int array[x];
Here x would need to be known at compile time. (C99 allows this as a variable-length array, and GCC accepts it in C++ as an extension.)
When using int array[1000000] you allocate memory on the stack, not on the heap, so it is fundamentally different from malloc or the new operator. It is faster because it typically takes only a single CPU instruction to adjust the stack pointer.
Comparing malloc and new: malloc will be faster, because new eventually calls malloc internally. But the performance gain is tiny; it isn't worth optimizing your C++ program this way. Just use the C++ mechanism when you need to allocate dynamic memory.
When writing C++ code I've learned that using the stack to store data is a good idea.
But recently I ran into a problem:
I had an experiment with code that looked like this:
void fun(const unsigned int N) {
    float data_1[N*N];
    float data_2[N*N];
    /* Do magic */
}
The code exploded with a segmentation fault at random, and I had no idea why.
It turned out that the problem was that I was trying to store things that were too big on the stack. Is there a way of detecting this, or at least detecting that it has gone wrong?
float data_1[N*N];
float data_2[N*N];
These are variable-length arrays (VLAs), as N is not a constant expression. The const in the parameter only ensures that N is read-only; it doesn't make N a constant expression.
VLAs were introduced in C99; earlier C standards and all versions of C++ disallow them. However, some compilers provide VLAs as an extension. If you're compiling with GCC, try the -pedantic option; it will tell you they are not allowed.
Now, why does your program segfault? Probably because of a stack overflow due to the large value of N * N.
Consider using std::vector as:
#include <vector>

void fun(const unsigned int N)
{
    std::vector<float> data_1(N*N);
    std::vector<float> data_2(N*N);
    // your code
}
It's extremely difficult to detect that the stack is full, and not at all portable. One of the biggest problems is that stack frames are of variable size (especially when using variable-length arrays, which are really just a more standard way of doing what people were doing before with alloca()) so you can't use simple proxies like the number of stack frames.
One of the simplest methods that is mostly portable is to put a variable (probably of type char, so that a pointer to it is a char*) at a known depth on the stack, and to then measure the distance from that point to a variable (of the same type) in the current stack frame by simple pointer arithmetic. Add in an estimate of how much space you're about to allocate, and you can make a good guess as to whether the stack is about to blow up on you. The problems with this are that you don't know the direction the stack is growing in (no, they don't all grow in the same direction!) and working out the size of the stack space is itself rather messy (you can try things like system limits, but they're really quite awkward). Plus the hack factor is very high.
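A sketch of that pointer-distance trick, with all the caveats just mentioned baked in (it assumes a downward-growing stack, and subtracting pointers into different objects is formally undefined behavior, so treat it strictly as a heuristic):
#include <stddef.h>

static char *stack_base; /* set near the top of main() */

size_t approx_stack_used(void)
{
    char marker;
    return (size_t)(stack_base - &marker); /* distance from base to the current frame */
}

int main(void)
{
    char base;
    stack_base = &base;
    /* ...compare approx_stack_used() + upcoming allocation against the stack size... */
    return 0;
}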
The other trick I've seen used on 32-bit Windows only was to try to alloca() sufficient space and handle the system exception that would occur if there was insufficient room.
#include <malloc.h>  /* alloca() (a.k.a. _alloca) on Windows */
#include <windows.h> /* EXCEPTION_EXECUTE_HANDLER */

int have_enough_stack_space(void) {
    int enough_space = 0;
    __try { /* Yes, that's got a double-underscore. */
        alloca(SOME_VALUE_THAT_MEANS_ENOUGH_SPACE);
        enough_space = 1;
    } __except (EXCEPTION_EXECUTE_HANDLER) {}
    return enough_space;
}
This code is very non-portable (e.g., don't count on it working on 64-bit Windows) and building with older gcc requires some nasty inline assembler instead! Structured exception handling (which this is a use of) is amongst the blackest of black arts on Windows. (And don't return from inside the __try construct.)
Instead, try using functions like malloc. It returns NULL explicitly if it fails to find a block of memory of the size you requested.
Of course, in that case don't forget to free this memory at the end of the function, after you are done.
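Applied to the arrays from the question, a sketch might look like this (sizes as in the original code):
#include <stdlib.h>

void fun(const unsigned int N)
{
    float *data_1 = (float *)malloc((size_t)N * N * sizeof *data_1);
    float *data_2 = (float *)malloc((size_t)N * N * sizeof *data_2);
    if (data_1 != NULL && data_2 != NULL) {
        /* Do magic */
    }
    free(data_2); /* free(NULL) is harmless */
    free(data_1);
}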
Also, you can check your compiler's settings to see what stack size limit it builds binaries with.
One of the reasons people say it is better to use the stack instead of heap memory is that variables allocated on the stack are popped off automatically when you leave the body of the function. For storing big blocks of information it is usual to use heap memory and other data structures like linked lists or trees. Also, memory allocated on the stack is limited, and much smaller than what you can allocate in heap space. I think it is better to manage memory allocation and release carefully instead of trying to use the stack to store big data.
You can use a framework that manages your memory allocations. You can also use VDL to check for memory leaks and memory that is never released.
is there a way of detecting this?
No, in general.
Stack size is platform dependent. Typically, the operating system decides the size of the stack, so you can check your OS (ulimit -s on Linux) to see how much stack memory it allocates for your program.
If your compiler supports stackavail(), you can check it. It's better to use heap-allocated memory in situations where you are unsure whether you'd exceed the stack limit.
While solving a DP-related problem, I observed that the first snippet below works but the second segfaults.
What is the actual reason, and what is the memory limit for just using int?
int main() {
    static int a[3160][3160];
    return 0;
}

int main() {
    int a[3160][3160];
    return 0;
}
Because you probably don't have enough stack memory to store that big an array: 3160 × 3160 × 4-byte ints come to roughly 40 MB, while typical default stacks are only 1 to 8 MB.
The second example creates the array on the stack, while the first example creates an array that is not located on the stack but somewhere in the data/BSS segment, since you explicitly specified the storage using the static qualifier.
Note that the C++ standard does not specify stack, heap, data segment, or BSS segment; these are all implementation-defined details. The standard only specifies the behavior expected of variables declared with different storage durations. So where the variables are actually created is implementation-defined, but one thing is for sure: both your examples create the arrays in different memory regions, and the second one crashes because there is not enough memory in its region.
Also, if you are creating an array of such huge dimensions in an actual implementation, your design is probably flawed and you might want to consider revisiting it.
You might also want to consider using std::array or std::vector, instead of the traditional c-style arrays.
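For instance, a sketch of the second program with heap-backed storage instead (a flat std::vector indexed manually; the dimensions are from the question):
#include <vector>

int main() {
    std::vector<int> a(3160 * 3160); // zero-initialized, lives on the heap
    // element (i, j) is a[i * 3160 + j]
    return 0;
}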
A stack allocation that large is not safe unless you can guarantee that much stack space will actually be available.
The stack size varies by platform and hardware, so the "memory limit" varies dramatically. If you use huge stack arrays like this, be prepared to see this error often when your program runs on a processor other than the one you use for development. If you absolutely need a stack that large, you must create your own threads with explicit stack sizes.
However, that measure is not needed because you should just use a dynamic allocation here.
static is not a good choice if you need it to be reentrant.
As Als noted (+1) - the reason for the runtime error is very likely the stack size.
Why would you ever want to use alloca() when you could always allocate a fixed size buffer on the stack large enough to fit all uses? This is not a rhetorical question...
It could be useful if the size of the buffer varies at runtime, or if you only sometimes need it: this would use less stack space overall than a fixed-size buffer in each call. Particularly if the function is high up the stack or recursive.
You might want to use it if there's no way to know the maximum size you might need at compile time.
Whether you should is another question - it's not standard, and there's no way to tell whether it might cause a stack overflow.
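A minimal sketch of such a runtime-sized scratch buffer (alloca() is non-standard; the header is <alloca.h> on glibc and <malloc.h> on Windows, and the function name here is made up):
#include <alloca.h>
#include <string.h>

void useScratch(const char *s) {
    size_t n = strlen(s) + 1;
    char *tmp = (char *)alloca(n); /* released automatically when the function returns */
    memcpy(tmp, s, n);
    /* ...use tmp... */
}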
In which cases is alloca() useful?
The only time I ever saw alloca being used was in Open Dynamics Engine.
AFAIK they were allocating HUGE matrices with it (so the compiled program could require a 100 MB stack), which were automatically freed when the function returned (it looks like a smart-pointer rip-off to me). This was quite a while ago.
Although it was probably much faster than new/malloc, I still think it was a bad idea.
Instead of politely reporting that it had run out of RAM, the program could crash with a (misleading) stack overflow when the scene became too complex to handle. Not nice behavior, IMO, especially for a physics engine, where you can easily expect someone to throw a few thousand bricks into the scene and see what happens when they all collide at once. Plus, you had to set the stack size manually, i.e. on a system with more RAM the program would still be limited by the stack size.
a fixed size buffer on the stack large enough to fit all uses? This is not a rhetorical question...
If you need a fixed-size buffer for all uses, then you could just as well put it in a static/global variable or use heap memory.
Using alloca() may be reasonable when you are unable to use malloc() (or new in C++, or another memory allocator) reliably, or at all, but you can assume there's more space available on your stack - that is, when you can't really do anything else.
For example, in glibc's segfault.c, we have:
/* This function is called when a segmentation fault is caught.  The system
   is in an unstable state now.  This means especially that malloc() might
   not work anymore.  */
static void
catch_segfault (int signal, SIGCONTEXT ctx)
{
  void **arr;
  /* ... */
  /* Get the backtrace.  */
  arr = alloca (256 * sizeof (void *));
  /* ... */
}
Never - it's not part of C++, and not useful in C. However, you cannot allocate "a static buffer on the stack" - static buffers are allocated at compile time, and not on the stack.
The point of alloca() is of course that it is not fixed sized, it is on the stack, and that it is freed automatically when a function exits. Both C++ and C have better mechanisms to handle this.
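For example, in C++ a std::vector provides a runtime-sized buffer that is freed automatically when the function exits (a sketch):
#include <cstddef>
#include <vector>

void f(std::size_t n) {
    std::vector<char> buf(n); // heap-backed, sized at runtime
    // ...use buf.data()...
}                             // buf's memory is released here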
The alloca() function is virtually never needed; for memory allocation purposes, you can use malloc()/free() in C (or one of the collection of possibilities in C++) and achieve pretty much the same practical effect. This has the advantage of coping better with smaller stack sizes.
However I have seen[1] one legit (if hacky!) use of it: for detecting potential stack overflow on Windows; if the allocation (of the amount of slop space you wanted to access) failed, you were out but had enough room to recover gracefully. It was wrapped in __try/__except so that it didn't crash, and needed extra assembler tricks to avoid gcc-induced trouble. As I said, a hack. But a clever one that is the only valid use for alloca() that I've ever seen.
But don't do that. Better to write the code to not need such games.
[1] It was in Tcl 8.4 (and possibly earlier versions of Tcl); later versions removed it because it was finicky, very tricky and deeply troubling. 8.6 uses a stackless implementation of the execution engine instead of that sort of funkiness.