I'm converting a program from Fortran to C++.
My code seems to run fine until I add this array declaration:
float TC[100][100][100];
And then when I run it I get a segmentation fault error. This array should only take up 8 MB of memory and my machine has 3 GB. Is there a problem with this declaration? My C++ is pretty rusty.
That array is about 4 MB in size (100 × 100 × 100 floats at 4 bytes each). If this definition is inside a function (as a local variable), then the compiler tries to store it on the stack, which on most systems cannot grow that large.
The Fortran compiler probably allocated it statically (Fortran routines are not allowed to be called recursively unless explicitly marked as recursive, so static allocation for local variables works there for non-recursive functions), and therefore the error doesn't occur there.
A simple fix would be to explicitly declare the variable static, assuming the Fortran function was not declared recursive. However, this may bite you later if you ever try to call that function recursively from a revised version. So a better solution would probably be to allocate it dynamically. That costs extra time, though, and depending on the nature of the code it may hurt your performance too much (Fortran code is quite often numerical code where performance matters).
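For reference, a minimal sketch of the dynamic-allocation route (the function name and the indexing helper here are illustrative, not from the original code):

#include <vector>

void compute()
{
    const int N = 100;
    std::vector<float> TC(N * N * N);      // ~4 MB, allocated on the heap, zero-initialized

    // Helper for 3D indexing into the flat storage (row-major order).
    auto at = [&](int i, int j, int k) -> float& {
        return TC[(i * N + j) * N + k];
    };

    at(1, 2, 3) = 42.0f;                   // example access
}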
If you choose to make the array static, you can build in a protection against accidental recursive calls:
#include <cassert>

void yourfunction()
{
    static bool active = false;
    static float TC[100][100][100];
    assert(!active);      // trips if the function is ever entered recursively
    active = true;
    // your code
    active = false;
}
I'm guessing TC is being allocated as an automatic local variable. This means it's being stored on the stack. You don't get 4 MB of stack memory by default, so it's causing a stack overflow.
To solve it, use dynamic allocation with a structured container or new.
This looks like a stack-based declaration. Try allocating from the heap (i.e. using the new operator).
If you are declaring it inside a function, as a local variable, it may be that your stack is not big enough to fit the array. You may try to allocate it on the heap, with new or malloc(), or, if your design allows, make it a global variable.
In C++ the stack has a limited amount of space. MSVC defaults this size to 1 MB. If the stack uses more than 1 MB, the program crashes with a segmentation fault or stack overflow. You will have to move that structure to dynamic memory. To move it to dynamic memory, you want something like this:
float*** TC() {
    static float*** ptr = NULL;
    if (ptr == NULL) {
        ptr = new float**[100];
        for (int j = 0; j < 100; j++) {
            ptr[j] = new float*[100];
            for (int i = 0; i < 100; i++)
                ptr[j][i] = new float[100];
        }
    }
    return ptr;
}
That will allocate the structure in dynamic memory the first time it is accessed, as a jagged array. You can get more performance out of a rectangular array, but you have to change types:
typedef std::vector<std::array<std::array<float, 100>, 100>> bigarray;
bigarray TC(100);
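For completeness, a short sketch of how that vector-of-arrays version might be set up and indexed (assuming the obvious headers):

#include <array>
#include <vector>

typedef std::vector<std::array<std::array<float, 100>, 100>> bigarray;

int main()
{
    bigarray TC(100);       // 100 planes of 100x100 floats, storage lives on the heap
    TC[5][6][7] = 1.0f;     // indexed just like the original 3D array
    return 0;
}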
According to http://cs.nyu.edu/exact/core/doc/stackOverflow.txt, gcc/Linux defaults the stack size to 8 MB, which may not leave enough room for your structure plus everything else main() puts on the stack. If you really want to, MSVC has flags to increase the stack size up to 32 MB, and on Linux the ulimit command can raise the stack size up to 32 MB.
Even in C (not just C++) you can declare variables at the start of a code block, which is enclosed in curly braces.
Example:
#include <stdio.h>
void use_stack(int cnt)
{
if (cnt<=16) {
int a[16];
int i;
a[0]=3;
a[1]=5;
for (i=2;i<cnt;i++) {
a[i]=a[i-1]+a[i-2];
}
printf("a[%d] == %d\n",cnt-1,a[cnt-1]);
}
else {
printf("cnt is too big\n");
}
}
Now I know that variables like the array a[16] are allocated on the stack in this case.
I was wondering if the space for this array is allocated at the start of the function (first opening curly brace) or at the start of the block where it is declared (opening curly brace after if).
From examining the assembler code it seems the compiler allocates the space for a[16] directly at the entry of the function.
I actually expected that the stack would be allocated (stack pointer decreased) at the declaration of a[16] and that the stack would be de-allocated (stack pointer increased) at the end of the corresponding if code block.
But this does not happen it seems (stack for a[16] is allocated directly at function entry, even if a[16] is not used in the else branch).
Does anyone have an explanation for why this is the case?
Is there any part of the C language standard that explains this behavior, or does it have to do with things like "longjmp" or signal handling, which might require that the stack pointer stays constant within a function?
Note: The reason why I assumed the stack would be allocated/deallocated at the start/end of the code block is, because in C++ the constructor/destructor of objects allocated on the stack will be called at the start/end of the code block. So if you examine the assembler code of a C++ program you will notice that the stack is still allocated at the function entry; just the constructor/destructor call will be done at the start/end of the code block.
I am explicitly interested why stack is not allocated/deallocated at the start/end of a code block using curly braces.
The question: At what exact moment is a local variable allocated storage? is only about a local variable allocated at the start of a function. I am surprised that stack allocation for variables allocated later inside a code block is also done at the function entry.
So far the answers have been:
Something to do with optimization
Might be different for C and C++
Stack is not even mentioned in the C language specification
So: I am interested in the answer for C... (and I strongly believe that the answer will apply to C++ also, but I am not asking about C++ :-)).
Optimization: Here is an example which will directly demonstrate why I am so surprised and why I am quite sure that this is not an optimization:
#include <stdio.h>
static char *stackA;
static char *stackB;
static void calc(int c,int *array)
{
int result;
if (c<=0) {
// base case c<=0:
stackB=(char *)(&result);
printf("stack ptr calc() = %p\n",stackB);
if (array==NULL) {
printf("a[0] == 1\n");
} else {
array[0]=1;
}
return;
}
// here: c>0
if (array==NULL) {
// no array allocated, so allocate it now
int i;
int a[2500];
// calculate array entries recursively
calc(c-1,a);
// calculate current array entry a[c]
a[c]=a[c-1]+3;
// print full array
for(i=0;i<=c;i++) {
printf("a[%d] = %d\n",i,a[i]);
}
} else {
// array already allocated
calc(c-1,array);
// calculate current array entry a[c]
array[c]=array[c-1]+3;
}
}
int main()
{
int a;
stackA=(char *)(&a);
printf("stack ptr main() = %p\n",stackA);
calc(9,NULL);
printf("used stack = %d\n",(int)(stackA-stackB));
}
I am aware that this is an ugly program :-).
The function calc calculates n*3 + 1 for all 0<=n<=c in a recursive fashion.
If you look at the code for calc you notice that the array a[2500] is only declared when the input parameter array to the function is NULL.
Now this only happens in the call to calc which is done in main.
The stackA and stackB pointers are used to calculate a rough estimate how much stack is used by this program.
Now: int a[2500] should consume around 10000 bytes (4 bytes per integer, 2500 entries). So you could expect that the whole program consumes around 10000 bytes of stack + something additional (for overhead when calc is called recursively).
But: It turns out this program consumes around 100000 bytes of stack (10 times as much as expected). The reason is, that for each call of calc the array a[2500] is allocated, even if it is only used in the first call. There are 10 calls to calc (0<=c<=9) and so you consume 100000 bytes of stack.
It does not matter whether you compile the program with or without optimization.
GCC 4.8.4 and clang for x64, MS Visual C++ 2010, and Wind River Diab (for PowerPC) all exhibit this behavior.
Even weirder: C99 introduces Variable Length Arrays. If I replace int a[2500]; in the above code with int a[2500+c]; then the program uses less stack space (around 90000 bytes less).
Note: If I only change the call to calc in main to calc(1000,NULL); the program will crash (stack overflow == segmentation fault). If I additionally change to int a[2500+c]; the program works and uses less than 100KB stack. I still would like to see an answer, which explains why a Variable Length Array does not lead to a stack overflow whereas a fixed length array does lead to a stack overflow, in particular since this fixed length array is out of scope (except for the first invocation of calc).
So what's the reason for this behavior in C ?
I do not believe that GCC/clang both simply are not able to do better; I strongly believe there has to be a technical reason for this. Any ideas ?
Update: answer found by googling
After more googling: I strongly believe this has something to do with "setjmp/longjmp" behavior. Google for "Variable Length Array longjmp" and see for yourself. It seems longjmp is hard to implement if you do not allocate all arrays at function entry.
The language rules for automatic storage only guarantee that the last object allocated is the first one deallocated.
A compiler can implement this logical stack any way it sees fit.
If it can prove that a function isn't recursive, it can even allocate the storage at program start-up.
I believe that the above applies to C as well as C++, but I'm no C expert.
Please, when you ask about the details of a programming language, limit the question to one language at a time.
There is no technical reason for this other than choices that compiler makers made. It's less generated code and faster executing code to always reserve all the stack space we'll need at the beginning of the function. So all the compilers made the same reasonable performance tradeoff.
Try using a variable length array and you'll see that the compiler is fully capable of generating code that "allocates" stack just for a block. Something like this:
void
foo(int sz, int x)
{
extern void bar(char *);
if (x) {
char a[sz];
bar(a);
} else {
char a[10];
bar(a);
}
}
My compiler generates code that always reserves stack space for the x is false part, but the space for the true part is only reserved if x is true.
How this is done isn't regulated by any standard. The C and C++ standards don't mention the stack at all, in theory those languages could be used even on computers that don't have a stack.
On computers that do have a stack, how this is done is specified by the ABI of the given system. Often, stack space is reserved at the point when the program enters the function. But compilers are free to optimize the code so that the stack space is only reserved when a certain variable is used.
At any rate, the point where you declare the variable has no relation to when it gets allocated. In your example, int a[16] is either allocated when the function is entered, or it is allocated just before the first place where it is used. It doesn't matter if a is declared inside the if statement or at the top of the function.
In C++, however, there is the concept of constructors. If your variable is an object with a constructor, then that constructor will get executed at the point where the variable is declared, which means the variable must be allocated before that happens.
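To make that concrete, here is a small sketch of my own (not from the question): the storage for obj may well be reserved as soon as f() is entered, but the constructor only runs when control reaches the declaration.

#include <cstdio>

struct Tracer {
    Tracer()  { std::puts("constructed"); }
    ~Tracer() { std::puts("destructed"); }
};

void f(bool use_it)
{
    std::puts("entered f");     // stack space for obj is typically reserved by this point
    if (use_it) {
        Tracer obj;             // constructor runs here, at the declaration
        std::puts("inside block");
    }                           // destructor runs here, at the end of the block
    std::puts("leaving f");
}

int main()
{
    f(true);
    return 0;
}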
Alf has explained the limitations and freedoms that the standard specifies (or rather, what it doesn't specify).
I'll suggest an answer to the question
why stack is not allocated/deallocated at the start/end of a code block using curly braces.
The compilers that you tested chose to do so (I'm actually just guessing, I didn't write any of them), because of better run-time performance and simplicity (and of course, because it is allowed).
Allocating 96 bytes (an arbitrary example) once takes about half as much time as allocating 48 bytes twice, and a third as much time as allocating 32 bytes three times.
Consider a loop as an extreme case:
for(int i = 0; i < 1000000; i++) {
int j;
}
If j is allocated at the start of the function, there is one allocation. If j is allocated within the loop body, then there will be a million allocations. Fewer allocations are better (faster).
Note: The reason why I assumed the stack would be allocated/deallocated at the start/end of the code block is, because in C++ the constructor/destructor of objects allocated on the stack will be called at the start/end of the code block.
Now you know that you were wrong to have assumed so. As written in a fine answer to the linked question, the allocation/deallocation need not coincide with construction/destruction.
Imagine I were to use a[100000]
That is approaching a very significant fraction of total stack space. You should allocate large blocks of memory like that dynamically.
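For instance, a minimal sketch (the size and the function name are just placeholders):

#include <vector>

void work()
{
    std::vector<int> a(100000);   // ~400 KB on the heap instead of the stack
    a[0] = 3;
}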
static const int MAX_SIZE = 256; //I assume this is static Data
bool initialiseArray(int* arrayParam, int sizeParam) //where does this lie?
{
if(sizeParam > MAX_SIZE)
{
return false;
}
for(int i=0; i<sizeParam; i++)
{
arrayParam[i] = 9;
}
return true;
}
int main()
{
int* myArray = new int[30]; //I assume this is allocated on heap memory
bool res = initialiseArray(myArray, 30); //Where does this lie?
delete myArray;
}
We're currently learning the different categories of memory. I know that there's
-Code Memory
-Static Data
-Run-Time Stack
-Free Store(Heap)
I have commented where I'm unsure, just wondering if anyone could help me out. My definition of the run-time stack says it is used for functions, but my definition of code memory says it contains all the instructions for methods/functions, so I'm a bit confused.
Can anyone lend a hand?
static const int MAX_SIZE = 256; //I assume this is static Data
Yes indeed. In fact, because it's const, this value might not be kept in your final executable at all, because the compiler can just substitute "256" anywhere it sees MAX_SIZE.
bool initialiseArray(int* arrayParam, int sizeParam) //where does this lie?
The code for the initialiseArray() function will be in the code section of your executable. You can get a pointer to its address and call the function via that pointer, but other than that there's not much else you can do with it.
The arrayParam and sizeParam arguments will be passed to the function by value, on the stack. Likewise, the bool return value will be placed into the calling function's stack area.
int* myArray = new int[30]; //I assume this is allocated on heap memory
Correct.
bool res = initialiseArray(myArray, 30); //Where does this lie?
Effectively, the myArray pointer and the literal 30 are copied into the stack area of initialiseArray(), which then operates on them, and then the resulting bool is copied into the stack area of the calling function.
The actual details of argument passing are a lot more gory and depend on calling conventions (of which there are several, particularly on Windows), but unless you're doing something really specialised they're not really important :-)
The stack is used for automatic variables - that is, variables declared within functions, or as function parameters. These variables are destroyed automatically when the program leaves the block of code they were declared in.
You're correct that MAX_SIZE has a static lifetime - it is destroyed automatically at the end of the program. You're also correct that the array allocated with new[] is on the heap (it has a dynamic lifetime) - it won't be destroyed automatically, so it needs to be deleted. By the way, you need delete [] myArray; to match the use of new [].
The pointer to it (myArray) is an automatic variable, on the stack, as are res and the function arguments.
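A small sketch of what the corrected cleanup could look like (the std::vector alternative is my addition, not part of the original question):

#include <vector>

// initialiseArray as in the question, minus the size check, for brevity
bool initialiseArray(int* arrayParam, int sizeParam)
{
    for (int i = 0; i < sizeParam; i++)
        arrayParam[i] = 9;
    return true;
}

int main()
{
    int* myArray = new int[30];              // heap allocation
    bool res = initialiseArray(myArray, 30);
    (void)res;
    delete [] myArray;                       // new[] must be paired with delete []

    std::vector<int> myVec(30);              // alternative: the vector frees its storage itself
    initialiseArray(myVec.data(), 30);
    return 0;
}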
There is just one type of memory... it is a memory :D
What is different is where it is and how you access it.
If you go deep into the exe loader in Windows (or in any kind of OS, actually), what it really does is store the information about your sections (parts of your exe) and, at run time, lay them out properly in memory and apply access rights. So generally the code section, where your "program" lives, is in the same memory (your RAM) as your data section. The difference is that the access rights are different: the code section usually has only read + execute, the data section just read + write (and no execute).
The stack is still memory; it is special in the sense that it is again controlled by the OS. The stack size is how big your stack is in bytes, but its purpose here is to hold parameter and intermediate values between function calls (as per stdcall) and local variables (exactly how depends on the compiler). Because it is memory you CAN use it, but it is unwise to, let's say, allocate a 10000-byte string on the stack. In assembly you have direct access through the stack pointer register ESP (and the frame pointer EBP), or in C/C++ you can use alloca.
The new and delete operators are built into the C++ language, but as far as I know they use the same system allocators you do; in fact you can override them and use malloc/free and it should still work, which means that, again, this is the same memory.
The difference between using new/delete and an OS-specific function is that you let the language handle the allocation, but in the end you get a pointer just like you would from any other allocation function.
On top of this there are special ways that change how the memory is handled. In Windows this is virtual memory, for example: VirtualAlloc and VirtualFree let you specify what you will do with the memory you want to use, which allows the OS to optimize better. You can tell it, say, "I want 2 GB of memory BUT it doesn't have to be in RAM", so it may page it out to disk, but you STILL access it with ordinary memory pointers.
And about your questions :
static const int MAX_SIZE = 256; //I assume this is static Data
It usually depends on the compiler, but mostly it will be treated as a const (static here means something else), which means it will end up in the const (read-only) section of the exe, so this memory block will be read-only.
int* myArray = new int[30]; //I assume this is allocated on heap memory
Yes, this will be on the heap, but how it is allocated depends on the implementation and on whether you override the new operator. If you do, you could for example force it into virtual memory, so it could in fact be on disk or in RAM, but that would be a silly thing to do, so yes, it will be on the heap.
bool res = initialiseArray(myArray, 30); //Where does this lie?
Multiple things happen here. Because the compiler knows that the first parameter of initialiseArray must be a pointer to an int, it passes the pointer value stored in myArray, so both that pointer and the value 30 go on the stack, and then the function is called.
The function, which lives in the code section of memory, runs and reads its parameters (int* arrayParam, int sizeParam) from the stack. The compiler knows you want to write through arrayParam and that it is a pointer, so it writes to the location arrayParam points to. Exactly where is specified by arrayParam[i]: i offsets the memory pointer to the right element. Again C++ does some magic by adjusting the pointer for you; since the adjustment has to be in bytes, it moves the pointer by 4 per element, because (usually) an int is 4 bytes.
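A tiny sketch of my own illustrating that pointer adjustment:

#include <cstdio>

int main()
{
    int data[4] = {10, 20, 30, 40};
    int* p = data;

    // p[2] and *(p + 2) name the same element; the compiler scales the
    // offset by sizeof(int) (usually 4 bytes) behind the scenes.
    std::printf("%d %d\n", p[2], *(p + 2));   // prints "30 30"
    return 0;
}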
To get a better overview of what goes where and how it works, use a debugger or a disassembler (like OllyDbg) and see for yourself; if you want to know more about how the stack is used, look up the stdcall calling convention.
The following code gives me a segmentation fault:
bool primeNums[100000000]; // index corresponds to number, t = prime, f = not prime
for (int i = 0; i < 100000000; ++i)
{
primeNums[i] = false;
}
However, if I change the array declaration to be dynamic:
bool *primeNums = new bool[100000000];
I don't get a seg-fault. I have a general idea of why this is: in the first example, the memory's being put on the stack while in the dynamic case it's being put on the heap.
Could you explain this in more detail?
bool primeNums[100000000];
uses up all your stack space; therefore, you get a segmentation fault because there is not enough stack space left to hold an automatic array of that size.
A dynamic array is allocated on the heap, so it is much harder to hit this limit. Dynamic arrays are created using new in C++; it calls operator new to allocate memory, then calls the constructor to initialize the allocated memory.
More information about how operator new works is quoted from the standard below [new.delete.single]:
Required behavior:
Return a nonnull pointer to suitably aligned storage (3.7.3), or else throw a bad_alloc exception. This requirement is binding on a replacement version of this function.
Default behavior:
— Executes a loop: Within the loop, the function first attempts to allocate the requested storage. Whether the attempt involves a call to the Standard C library function malloc is unspecified.
— Returns a pointer to the allocated storage if the attempt is successful. Otherwise, if the last argument to set_new_handler() was a null pointer, throw bad_alloc.
— Otherwise, the function calls the current new_handler (18.4.2.2). If the called function returns, the loop repeats.
— The loop terminates when an attempt to allocate the requested storage is successful or when a called new_handler function does not return.
So with a dynamic array created with new, when there is not enough space it throws bad_alloc by default; in that case you see an exception rather than a segmentation fault. When your array size is huge, it is better to use a dynamic array or a standard container such as std::vector.
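For example, a sketch of the same sieve array as a standard container (my illustration, not from the original answer):

#include <vector>

int main()
{
    // Heap-backed, and on failure std::vector reports bad_alloc
    // instead of silently overflowing the stack.
    std::vector<bool> primeNums(100000000, false);
    primeNums[2] = true;
    return 0;
}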
bool primeNums[100000000];
This declaration allocates memory in the stack space. The stack space is a memory block reserved when your application is launched. It is usually in the range of a few kilobytes or megabytes (it depends on the language implementation, compiler, OS, and other factors).
This space is used to store local variables and function-call bookkeeping, so you have to be gentle and not overuse it. Because this is a stack, all allocations are contiguous (no empty space between allocations).
bool *primeNums = new bool[100000000];
In this case the memory is allocated on the heap. This is free space where large new chunks of memory can be allocated.
Some compilers or operating systems limit the size of the stack. On Windows the default is 1 MB, but it can be changed.
In the first case you allocate memory on the stack:
bool primeNums[100000000]; // put 100000000 bools on stack
for (int i = 0; i < 100000000; ++i)
{
primeNums[i] = false;
}
However, this is an allocation on the heap:
bool *primeNums = new bool[100000000]; // put 100000000 bools in the heap
and since the stack is (very) limited, this is the reason for the segfault.
"Process is terminated due to StackOverflowException" is the error I receive when I run the code below. If I change 63993 to 63992 or smaller there are no errors. I would like to initialize the structure to 100,000 or larger.
#include <Windows.h>
#include <vector>
using namespace std;
struct Point
{
double x;
double y;
};
int main()
{
Point dxF4struct[63993]; // if < 63992, runs fine, over, stack overflow
Point dxF4point;
vector<Point> dxF4storage;
for (int i = 0; i < 1000; i++) {
dxF4point.x = i; // arbitrary values
dxF4point.y = i;
dxF4storage.push_back(dxF4point);
}
for (int i = 0; i < dxF4storage.size(); i++) {
dxF4struct[i].x = dxF4storage.at(i).x;
dxF4struct[i].y = dxF4storage.at(i).y;
}
Sleep(2000);
return 0;
}
You are simply running out of stack space - it's not infinite, so you have to take care not to run out.
Three obvious choices:
Use std::vector<Point>
Use a global variable.
Use dynamic allocation - e.g. Point *dxF4struct = new Point[64000]. Don't forget to call delete [] dxF4struct; at the end.
I listed the above in order that I think is preferable.
[Technically, before someone else points that out, yes, you can increase the stack, but that's really just moving the problem up a level somewhere else, and if you keep going at it and putting large structures on the stack, you will run out of stack eventually no matter how large you make the stack]
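A rough sketch of the first option, adapted from the question's code (my adaptation, not the original poster's):

#include <vector>

struct Point
{
    double x;
    double y;
};

int main()
{
    std::vector<Point> dxF4struct(100000);   // heap storage; 100,000 elements is no problem
    std::vector<Point> dxF4storage;

    for (int i = 0; i < 1000; i++) {
        Point p;
        p.x = i;        // arbitrary values, as in the question
        p.y = i;
        dxF4storage.push_back(p);
    }
    for (size_t i = 0; i < dxF4storage.size(); i++) {
        dxF4struct[i] = dxF4storage[i];
    }
    return 0;
}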
Increase the stack size. On Linux, you can use ulimit to query and set the stack size. On Windows, the stack size is part of the executable and can be set during compilation.
If you do not want to change the stack size, allocate the array on the heap using the new operator.
Well, you're getting a stack overflow, so the allocated stack is too small for this much data. You could probably tell your compiler to reserve a larger stack for your executable, though just allocating the data on the heap (std::vector, which you're already using) is what I would recommend.
Point dxF4struct[63993]; // if < 63992, runs fine, over, stack overflow
That line allocates all your Point structs on the stack. I'm not sure of the exact stack size, but the default is around 1 MB. Since your struct is 16 bytes and you're allocating 63993 of them, you have 16 bytes * 63993 > 1 MB, which causes a stack overflow (funny posting about a stack overflow on Stack Overflow...).
So you can either tell your environment to allocate more stack space, or allocate the object on the heap.
If you allocate your Point array on the heap, you should be able to allocate 100,000 easily (assuming this isn't running on some embedded processor with less than 1 MB of memory)
Point *dxF4struct = new Point[63993];
As a commenter wrote, it's important to know that if you "new" memory on the heap, it's your responsibility to "delete" it. Since this uses array new[], you need the corresponding array delete[] operator. Modern C++ has a smart pointer that helps manage the lifetime of the array.
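For instance, a minimal sketch using std::unique_ptr (C++11 or later; my illustration):

#include <memory>

struct Point
{
    double x;
    double y;
};

int main()
{
    // unique_ptr<T[]> calls delete [] automatically when it goes out of scope
    std::unique_ptr<Point[]> dxF4struct(new Point[100000]);
    dxF4struct[0].x = 1.0;
    dxF4struct[0].y = 2.0;
    return 0;
}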
I am a little stuck on this question in my homework:
Write three functions in C or C++: one that declares a large array statically, one that declares the same large array on the stack, and one that creates the same large array from the heap. Call each of the subprograms a large number of times (at least 100,000) and output the time required by each. Explain the results.
int main(void)
{
int staticIntArray[ARRAY_SIZE];//array on the stack
int *ptrArray = new int[ARRAY_SIZE]; // pointer on the stack but array on the heap.
double timeItTakes;
clock_t tStart = clock();
fillWithRandomNumbers(staticIntArray, ARRAY_SIZE);
double time = static_cast<double>(clock() - static_cast<double>(tStart)/static_cast<double>(CLOCKS_PER_SEC));
printf ("Array on stack time is %.10f\n", time);
clock_t tStart2 = clock();
fillWithRandomNumbers(ptrArray, ARRAY_SIZE);
double time2 = static_cast<double>(clock() - static_cast<double>(tStart2)/static_cast<double>(CLOCKS_PER_SEC));
printf ("Array on heap time is %.10f\n", time2);
//cout << "Array on the heap time is " << (timeIntStack - time(NULL));
}
void fillWithRandomNumbers(int intArray[], int size)
{
for(int i = 0; i<size; i++)
intArray[i] = rand();
}
The output is:
Array on stack time is 1.9990000000
Array on heap time is 2.9980000000
Press any key to continue . . .
What I understand is that the stack is much smaller allocation of memory that is for local variables and parameters while the heap is a large pool of dynamically allocated memory. Here are my questions...
Does using the random number generator affect the time it takes for the function to execute? Is allocating large arrays on the stack slower because there is less memory available?
I am not trying ask you to do my homework but I just need a little help clarifying the concepts... Any help would be much appreciated...
First, allow me a slight rant.
The assignment -- presumably in a C++ programming class -- is a bad one. It is diverting your focus to the performance implications of dynamic allocation versus static or automatic allocation, but that is not the main reason you should choose one form of allocation over another. Rather, lifetime and visibility requirements, in addition to ownership semantics, should be considered long before performance when deciding how to allocate a chunk of memory. Even setting this argument aside, the test is still invalid because the hardware you run the code on, the size of the individual elements and of the array itself, the operating system, how the kernel blocks when allocating, and the optimizations the compiler is permitted to employ are all going to affect the execution speed in any real code you write. But this assignment seems to be suggesting that you should conclude, "see? Dynamic allocation is slower. We should never use it." That reasoning is incorrect and teaches you to employ premature optimization.
OK, end of my rant.
On to your assignment. You are doing two main things wrong.
The assignment never asks you to populate the array.
The assignment asks you to allocate the arrays at least 100,000 times, but you're doing it only once.
(Bonus!) The assignment doesn't ask you to deallocate the dynamically allocated array -- but it should.
You are only allocating your arrays once, at the start of the program, and only writing their contents in the fillWithRandomNumbers loop. The program didn't measure allocation at all; for this to happen, the new operator should have been within a loop.
Try to follow the assignment here: Write three functions, with each function allocating the array in a different way.
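A rough sketch of that structure might look like the following (sizes, names, and timing details are placeholders for illustration, not a finished solution):

#include <cstdio>
#include <ctime>

const int ARRAY_SIZE = 10000;

void useStatic() { static int a[ARRAY_SIZE]; a[0] = 1; }                    // static storage
void useStack()  { int a[ARRAY_SIZE];        a[0] = 1; }                    // automatic (stack) storage
void useHeap()   { int* a = new int[ARRAY_SIZE]; a[0] = 1; delete [] a; }   // heap storage

int main()
{
    const int CALLS = 100000;

    clock_t t = clock();
    for (int i = 0; i < CALLS; i++) useStatic();
    std::printf("static: %f s\n", double(clock() - t) / CLOCKS_PER_SEC);

    t = clock();
    for (int i = 0; i < CALLS; i++) useStack();
    std::printf("stack:  %f s\n", double(clock() - t) / CLOCKS_PER_SEC);

    t = clock();
    for (int i = 0; i < CALLS; i++) useHeap();
    std::printf("heap:   %f s\n", double(clock() - t) / CLOCKS_PER_SEC);
    return 0;
}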
First, you are not doing what the assignment asks for.
Second, you should expect the stack allocation to be much faster, since the only thing it does internally is move the stack pointer.