Stack overflow with large array but not with equally large vector? - c++

I ran into a funny issue today working with large data structures. I initially was using a vector to store upwards of 1000000 ints but later decided I didn't actually need the dynamic functionality of the vector (I was reserving 1000000 spots as soon as it was declared anyway) and it would be beneficial to, instead, be able to add values any place in the data structure. So I switched it to an array and BAM stack overflow. I'm guessing this is because declaring the size of the array at compile time puts it in the stack and making use of a dynamic vector instead placed it on the heap (which I'm guessing is larger?).
So what's the right answer here? Move back to a dynamic memory system just so it gets put on the heap? Increase the size of the stack? Or am I way off base on the whole thing here...?
Thanks!

I initially was using a vector to store upwards of 1000000 ints
Good idea.
but later decided I didn't actually need the dynamic functionality of the vector (I was reserving 1000000 spots as soon as it was declared anyway)
Not such a good idea. You did need it.
and it would be beneficial to, instead, be able to add values any place in the data structure.
I don't follow.
I'm guessing this is because declaring the size of the array at compile time puts it in the stack and making use of a dynamic vector instead placed it on the heap (which I'm guessing is larger?).
Much. The call stack is typically of the order of 1MB-2MB in size by default. Your "heap" (free store) is only really bounded by your available RAM.
So what's the right answer here? Move back to a dynamic memory system just so it gets put on the heap?
Yes.
[edit: Joachim's right — static is another possible answer.]
Increase the size of the stack?
You could but even if you could stretch 4MB out of it, you've left yourself no wiggle room for other local data variables. Best use dynamic memory — that's the appropriate thing to do.
Or am I way off base on the whole thing here...?
No.
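A minimal sketch of the vector-based approach, assuming the goal is a block of 1000000 ints that can be written at any index. The names `make_table` and `set_value` are hypothetical. Constructing the vector with a size (rather than calling reserve) creates the elements right away, so any position can be assigned without push_back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a table of `count` ints, all zero-initialised.
// The vector object itself is small; its elements live on the free store,
// so the size of the call stack does not limit `count`.
std::vector<int> make_table(std::size_t count) {
    std::vector<int> table(count);   // sized up front, no reserve() needed
    assert(table.size() == count);
    return table;
}

// Writes a value at an arbitrary position, as a plain array would allow.
void set_value(std::vector<int>& table, std::size_t index, int value) {
    table.at(index) = value;         // at() throws std::out_of_range on a bad index
}
```

A call such as `auto t = make_table(1000000); set_value(t, 123456, 42);` would then behave like the array version, without the stack limit.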

Related

Stack overflow error in C++ [duplicate]

I am using Dev C++ to write a simulation program. For it, I need to declare a single dimensional array with the data type double. It contains 4200000 elements - like double n[4200000].
The compiler shows no error, but the program exits on execution. I have checked, and the program executes just fine for an array having 5000 elements.
Now, I know that declaring such a large array on the stack is not recommended. However, the thing is that the simulation requires me to call specific elements from the array multiple times - for example, I might need the value of n[234] or n[46664] for a given calculation. Therefore, I need an array in which it is easier to sift through elements.
Is there a way I can declare this array on the stack?
No, there is no (let's say "reasonable") way to declare this array on the stack. You can, however, declare a pointer on the stack and allocate the array itself on the heap:
double *n = new double[4200000];
Accessing n[234] through this pointer should be no slower than accessing n[234] in a small array declared on the stack like this:
double n[500];
Or, even better, you could use a vector:
std::vector<double> someElements(4200000);
someElements[234]; // as fast as n[234] in the other examples when optimized (-O3); without optimization the difference is negligible for small programs (about +5%)
With -O3 a vector is just as fast as an array, and much safer. With the
double *n = new double[4200000];
solution you will leak memory unless you do this:
delete[] n;
Once exceptions and early returns are involved, it is easy to skip that delete[] by accident, which makes manual new/delete a very unsafe way of doing things.
You can increase your stack size. Try adding these options to your link flags:
-Wl,--stack,36000000
It might be too large though (I'm not sure if Windows places an upper limit on stack size.) In reality though, you shouldn't do that even if it works. Use dynamic memory allocation, as pointed out in the other answers.
(Weird, writing an answer and hoping it won't get accepted... :-P)
Yes, you can declare this array on the stack (with a little extra work), but it is not wise.
There is no justifiable reason why the array has to live on the stack.
The overhead of dynamically allocating a single array once is negligible (you could say "zero"), and a smart pointer will safely take care of not leaking memory, if that is your concern.
Stack allocated memory is not in any way different from heap allocated memory (apart from some caching effects for small objects, but these do not apply here).
So, just don't do it.
If you insist that you must allocate the array on the stack, you will first need to reserve 32 megabytes of stack space (preferably a bit more), since 4200000 doubles at 8 bytes each come to 33600000 bytes. With Dev-C++ (which presumes Windows and MinGW) you can either set the reserved stack size for your executable with linker options such as -Wl,--stack,34000000 (this reserves somewhat more than 32 MiB), or create a thread (which lets you specify a reserved stack size for that thread).
But really, again, just don't do that. There's nothing wrong with allocating a huge array dynamically.
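A minimal sketch of the smart-pointer variant mentioned above, assuming a C++14 compiler. The names `make_samples` and `demo_sum` are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocates `count` doubles on the free store.
// std::make_unique<double[]> value-initialises the elements to 0.0 (C++14).
std::unique_ptr<double[]> make_samples(std::size_t count) {
    return std::make_unique<double[]>(count);
}

// Example use: the array is released automatically when `n` goes out of
// scope, including when an exception propagates, so no delete[] is needed.
double demo_sum() {
    const std::size_t count = 4200000;
    std::unique_ptr<double[]> n = make_samples(count);
    assert(n != nullptr);
    n[234] = 1.5;
    n[46664] = 2.5;
    return n[234] + n[46664];
}
```

Indexing with n[234] works exactly as it does for a built-in array, so the calling code barely changes.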
Are there any reasons you want this on the stack specifically?
I'm asking because the following will give you a construct that can be used in a similar way (especially accessing values using array[index]), but it is a lot less limited in size (total max size depending on 32bit/64bit memory model and available memory (RAM and swap memory)) because it is allocated from the heap.
int arraysize= 4200000;
int *heaparray= new int[arraysize];
...
k= heaparray[456];
...
delete [] heaparray;
return;

Is the size of a dynamically-allocated array stored somewhere?

It seems to me that delete[] knows the size of a dynamically allocated array. My question is: is there any way to get at it, so that we don't need to keep track of the size explicitly when coding?
The method used by delete[] to figure out how many items it has to deal with is implementation dependent. You can't get to it or use it in any way.
Read C++ FAQ [16.14] After p = new Fred[n], how does the compiler know there are n objects to be destructed during delete[] p? (and the whole section for a general idea on free store management.)
My question is: is there any way to get at it, so that we don't need to keep track of the size explicitly when coding?
You don't need to provide it; you just call delete [], with no size.
The way the compiler stores the size is an implementation detail and is not specified. Most implementations store it in memory just before the start of the array (not after, as others mentioned).
See this related question : How does delete[] "know" the size of the operand array?
Edited:
Since delete [] needs to call destructors for all elements of the array, the length must definitely be stored somewhere. As for why this memory is not made accessible, to prevent errors such as walking outside of the array, I am not really sure. Strictly speaking, the length of statically allocated arrays must be known at compile time and the length of dynamically allocated arrays must be stored by the runtime, so in both cases buffer overflow errors are theoretically 100% preventable, and yet both static and dynamic arrays are unsafe. My guess is that this is for performance: bounds checking would make access slower, and raw (C-style) arrays offer the best performance at zero safety.
The implementation varies between compiler and runtime vendors. There may be implementations where the length is accessible and usable, but relying on that would not be standard or recommended practice. The logical place for the length to be stored is somewhere in the header of the allocated memory block, before the address you get for the first element of the array.
The C++ compiler has the size of dynamically allocated arrays buried deep somewhere; however, this is not accessible in any way while coding in C++ - so you'll have to store the size somewhere after allocation.
[Edit]: Though some versions of the Visual Studio compiler suite store the size at index -1, this is not to be trusted across compilers, or to be used at all when coding.
I think it is compiler-dependent and you cannot get it for your application to use. The following links show two methods that compilers use.
http://www.parashift.com/c++-faq-lite/compiler-dependencies.html#faq-38.7
http://www.parashift.com/c++-faq-lite/compiler-dependencies.html#faq-38.8
Compilers follow different approaches to recording how much memory a new call allocated.
This is one of the approaches; I read about it somewhere.
When the compiler allocates memory for a new call, it sets aside some extra space, typically at the beginning of the block, where it stores how much memory was allocated.
So when it encounters a delete call, it uses this stored value to decide how much memory has to be deallocated.
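Since the stored count cannot be queried portably, the usual answer is to keep the size next to the pointer yourself (which is what std::vector does internally). A minimal sketch, assuming C++14; `SizedArray`, `make_sized_array` and `sum` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical wrapper: pairs a new[]-allocated buffer with its element
// count, because the count kept internally for delete[] is not accessible.
struct SizedArray {
    std::unique_ptr<int[]> data;
    std::size_t size;
};

// Allocates `n` zero-initialised ints and records `n` alongside them.
SizedArray make_sized_array(std::size_t n) {
    return SizedArray{std::make_unique<int[]>(n), n};
}

// Sums the elements, using the stored size as the loop bound.
long long sum(const SizedArray& a) {
    assert(a.size == 0 || a.data != nullptr);
    long long total = 0;
    for (std::size_t i = 0; i < a.size; ++i) total += a.data[i];
    return total;
}
```

In practice std::vector<int> and its size() member cover the same need without a custom type.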

Efficiently collect data from multiple 1-D arrays in to a single 1-D array

I've got a prewritten function in C that fills an 1-D array with data, e.g.
int myFunction(myData **arr, ...);   /* allocates *arr internally, returns the element count */
myData *array;
int arraySize;
arraySize = myFunction(&array, ...);
I would like to call the function n times in a row with slightly different parameters (n is dependent on user input), and I need all the data collected in a single C array afterwards. The size of the returned array is not always fixed. Oh, and myFunction does the memory allocation internally. I want to do this in a memory-efficient way, but using realloc in each iteration does not sound like a good idea.
I do have all the C++ functionality available (the project is in C++, just using a C library), but using std::vector is no good because the collected data is later sent in to a function with a definition similar to:
void otherFunction(myData *data, int numData, ...);
Any ideas? Only things I can think of are realloc or using a std::vector and copying the data into an array afterwards, and those don't sound too promising.
Using realloc() in each iteration sounds like a very fine idea to me, for two reasons:
"does not sound like a good idea" is what people usually say when they have not established a performance requirement for their software, and they have not tested their software against the performance requirement to see if there is any need to improve it.
Instead of reallocating a new block each time, the realloc method will simply keep expanding your memory block which will presumably be at the top of the memory heap, so it won't be wasting any time either traversing memory block lists, or copying data around. This holds true provided that whatever memory allocated by myFunction() gets freed before it returns. You can verify it by looking at the pointer returned by realloc() and seeing that it always (or almost always(*1)) is the exact same pointer as the one you gave it to reallocate.
EDIT (*1) some C++ runtimes implement two heaps, one for small allocations and one for large allocations, so if your block gets allocated in the heap for small blocks, and then it grows large, there is a possibility that it will be moved once to the heap for large blocks. So, don't expect the pointer to always be the same; just most of the time.
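The realloc approach could be sketched as follows. This assumes myData is a plain C struct that can be copied with memcpy; its real layout is not given in the question, so the definition here is a placeholder, and `append_chunk` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct myData { int value; };  // placeholder layout, assumed for illustration

// Appends `count` items from `chunk` to the buffer `*total`, which currently
// holds `*totalCount` items. Returns false (leaving the buffer untouched and
// still valid) if realloc cannot grow the block.
bool append_chunk(myData** total, std::size_t* totalCount,
                  const myData* chunk, std::size_t count) {
    assert(total != nullptr && totalCount != nullptr);
    if (count == 0) return true;
    void* grown = std::realloc(*total, (*totalCount + count) * sizeof(myData));
    if (grown == nullptr) return false;
    *total = static_cast<myData*>(grown);   // may or may not be the same address
    std::memcpy(*total + *totalCount, chunk, count * sizeof(myData));
    *totalCount += count;
    return true;
}
```

Starting from a null pointer and a count of zero, each array returned by myFunction would be appended and then freed; the collected buffer is released with std::free at the end.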
Just copy all of the data into an std::vector. You can call otherFunction on a vector v with
otherFunction(&v[0], v.size(), ...)
or
otherFunction(v.data(), v.size(), ...)
As for your efficiency requirement: it looks to me like you're optimizing prematurely. First try this option, then measure how fast it is, and only look for other solutions if it's really too slow.
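A minimal sketch of the std::vector approach. The myData definition is a placeholder (the real one is not shown in the question), `collect` is a hypothetical helper, and `consume` is only a stand-in with the same leading parameters as otherFunction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct myData { int value; };  // placeholder layout, assumed for illustration

// Appends one returned chunk to the collected vector. The vector grows
// geometrically, so repeated appends are amortised constant time per element.
void collect(std::vector<myData>& all, const myData* chunk, std::size_t count) {
    all.insert(all.end(), chunk, chunk + count);
}

// Stand-in for otherFunction: takes a raw pointer and a count, as a C API would.
long long consume(myData* data, int numData) {
    assert(numData == 0 || data != nullptr);
    long long total = 0;
    for (int i = 0; i < numData; ++i) total += data[i].value;
    return total;
}
```

Because a vector stores its elements contiguously, `all.data()` can be handed to a C function exactly like a malloc'd array.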
If you know that you are going to call the function N times, and returned arrays are always M long, then why don't you just allocate one array M*N initially? Or if you don't know one of M or N, then set a worst case maximum. Or are M and N both dependent on user-input?
Then, change how you call your user-input-getting function, such that the array pointer you pass it is actually an offset into that large array, so that it stores the data in the right location. Then, next iteration, offset further, and call again.
I think the best solution would be to write your own 1-D array class with the methods you need.
The result you get will depend on how you write the class.

Should a list of objects be stored on the heap or stack?

I have an object(A) which has a list composed of objects (B). The objects in the list(B) are pointers, but should the list itself be a pointer? I'm migrating from Java to C++ and still haven't gotten fully accustomed to the stack/heap. The list will not be passed outside of class A, only the elements in the list. Is it good practice to allocate the list itself on the heap just in case?
Also, should the class that contains the list(A) also be on the heap itself? Like the list, it will not be passed around.
Bear in mind that
1. The list would only be on the stack if Object-A was also on the stack.
2. Even if the list itself is not on the heap, it may allocate its storage from the heap. This is how std::list, std::vector and most C++ lists work; the reason is that stack-based elements cannot grow.
These days most stacks are around 1 MB, so you'd need a pretty big list of pretty big objects before you need to worry about it. Even if your stack was only about 32 KB, you could store close to eight thousand 4-byte pointers before it would be an issue.
IMO people new to the explicit memory management in C/C++ can have a tendency to overthink these things.
Unless you're writing something that you know will have thousands of sizable objects just put the list on the stack. Unless you're using giant C-style arrays in a function the chances are the memory used by the list will end up in the heap anyway due to #1 and #2 above.
You're better off storing a list, if it can grow, on the heap. Since you never know what the runtime stack will be, overflow is a real danger, and the consequences are fatal.
If you absolutely know the upper bound of the list, and it's small compared to the size of your stack, you can probably get away with stack allocating the list.
I work in environments where the stack can be small and heap fragmentation needs to be avoided, so I'd use these rules:
If the list is small and a known fixed size, stack.
If the list is small and of a fixed size that is unknown until run time, you can consider both the heap and alloca(). Using the heap would be a fine choice if you can guarantee that your function doesn't allocate anything else on the heap for as long as your allocation is live. If you can't guarantee this, you're asking for fragmentation, and alloca() would be a better choice.
If the list is large or will need to grow, use the heap. If you can't guarantee it won't fragment, there are some recourses, which we tend to have built into our memory manager, such as top-down allocations and separate heaps.
Most situations don't call for people to worry about fragmentation, in which case they'd probably not recommend the usage of alloca.
With respect to the class containing the list, if it's local to the function scope I would put it on the stack provided that the internal data structures are not extremely large.
What do you mean by "list"? If it's std::list (or std::vector or any other STL container), then it stores its elements on the heap rather than on the stack, so don't worry.
If you're in any doubt, look at sizeof(A) and that tells you how much memory it will use when it's on the stack.
But ... the decision should mainly be based on the lifetime of the object. Stack-based objects are destroyed as soon as they go out of scope.
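The sizeof check can be sketched like this, assuming class A holds a std::vector of pointers to B as in the question. The member names and the helper `element_bytes` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct B { int id; };

// Hypothetical class A owning a list of pointers to B.
struct A {
    std::vector<B*> items;
};

// Bytes the vector has allocated on the free store for its elements.
// sizeof(A) does not include these; it only covers the vector's own
// bookkeeping (typically a few pointers).
std::size_t element_bytes(const A& a) {
    return a.items.capacity() * sizeof(B*);
}
```

After `a.items.resize(100000, nullptr)`, sizeof(A) is unchanged while `element_bytes(a)` grows with the element count, which is why a local A costs very little stack space.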
The stack is always simplest. Unfortunately it has one big drawback - you have to know the number of elements ahead of time.