C++ Variables bigger than the stack (stack overflow)


I need to create an array like this:
double grid[15000][40];
but the default stack size in Visual Studio 2012 is only 1 MB. How can I use variables of this size or bigger?
Does this mean that if I create a
std::vector<int>
and call push_back 600,000 times, it will overflow the stack? This seems like a big limitation. How can it be solved?
Thank you in advance.

Large objects should have either static or dynamic storage duration.
Static:
int a[1000000];
void f()
{
a[3] = 12; // fine
}
Beware of shared, concurrent accesses to the static memory, though.
Dynamic (but managed properly by a suitable class):
void f()
{
std::vector<int> a(1000000); // dynamic objects managed by std::vector
a[3] = 12;
}
Here each function call will create and manage its own dynamic allocation (and the complexities of concurrency are delegated to the memory allocator, so you don't have to think about those).

There is no problem here.
Does this mean that if I create a std::vector and call push_back 600,000 times, it will overflow the stack? This seems like a big limitation.
No, because vector elements do not have automatic storage duration (they don't "live on the stack"). They couldn't.
How can it be solved?
There is nothing to solve. Vector elements are dynamically allocated.

You can define this array as
static double grid [15000][40];
As for std::vector, it allocates memory for its elements on the heap, not on the stack.

The memory of a vector is heap-allocated, so you don't have to worry about the stack.
Instead of calling push_back repeatedly, you can use the member function resize:
std::vector<double> grid;
grid.resize(15000*40);
Alternatively, if the grid has a fixed size, you can let a unique pointer (std::unique_ptr) own the array.
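A minimal sketch of the unique pointer suggestion, assuming C++14 for std::make_unique. The names kRows, kCols, Grid and make_grid are made up for illustration and do not come from the answer:

```cpp
#include <cstddef>
#include <memory>

// Dimensions taken from the question; the constant names are hypothetical.
constexpr std::size_t kRows = 15000;
constexpr std::size_t kCols = 40;

// Owns a kRows x kCols block of doubles on the heap. The unique_ptr
// calls delete[] automatically when it goes out of scope.
using Grid = std::unique_ptr<double[][kCols]>;

Grid make_grid() {
    // The array form of make_unique value-initializes the elements,
    // so every cell starts at 0.0.
    return std::make_unique<double[][kCols]>(kRows);
}
```

A caller would write auto grid = make_grid(); and then index it as grid[row][col], exactly like the built-in array in the question. Only the pointer itself lives on the stack.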

If you don't want to use std::vector for whatever reason, an alternative solution is:
int main()
{
double (*grid)[40] = new double[15000][40];
// work with grid
grid[0][0] = 0.0;
grid[14999][39] = 42.0;
delete [] grid;
}
The usual caveats apply about raw pointers to dynamic storage causing memory leaks if the scope is terminated by an exception, so that the delete never executes.
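One way to sidestep those caveats while keeping the grid[row][col] syntax is to let a std::vector own rows of fixed width. This is only a sketch, and Row and make_grid are illustrative names rather than anything from the answer above:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Each row has a width fixed at compile time (40, as in the question).
using Row = std::array<double, 40>;

// Returns a grid with the requested number of rows. The rows are stored
// contiguously on the heap and are zero-initialized.
std::vector<Row> make_grid(std::size_t rows) {
    return std::vector<Row>(rows);
}
```

Because the vector releases its storage in its destructor, the memory is freed even if an exception leaves the scope early, and no delete[] is needed.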

Related

Creating a global array of structs

For a project I am working on, I need a global array of entry structs. I am having trouble, though, because I can't allocate the memory until my program is running and has determined the size of a file. The overall goal of this project is to create a word reference. So far, this is how I am doing it:
struct info{
//stores the specific character
std::string c;
//stores the number of times a word has come up in the file
float num;
};
info info_store[];
This project is meant to teach arrays, so I need to use an array.

You can:
- use new/delete[]
info* p_array=new info[100]; // create an array of size 100
p_array[10].num; // member access example
delete[] p_array; // release memory
- use std::unique_ptr
std::unique_ptr<info[]> array(new info[size]);
-> The advantage is that your memory is automatically released when array is destroyed (no more delete[])
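A small sketch of how the std::unique_ptr variant might be wrapped so that the size travels with the array. The info_array type and make_info_array function are hypothetical helpers, and std::make_unique assumes C++14:

```cpp
#include <cstddef>
#include <memory>
#include <string>

struct info {
    std::string c;    // the word or character being counted
    float num = 0;    // how many times it has come up in the file
};

// Hypothetical helper type: a heap array plus its length.
struct info_array {
    std::unique_ptr<info[]> data;
    std::size_t size = 0;
};

// Allocates n entries once the size is known at run time.
info_array make_info_array(std::size_t n) {
    info_array a;
    a.data = std::make_unique<info[]>(n);  // released automatically later
    a.size = n;
    return a;
}
```

A global info_array can be declared empty and assigned from make_info_array once the file size has been determined.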

First of all, use std::vector or any other STL container.
Second, you can use dynamic arrays.
auto length = count_structs(file);
auto data = new info[length];
Something like this. Then just fill the array.
Oh, and make sure you call delete [] data when you are done, to prevent memory leaks.

Fast way to push_back a vector many times

I have identified a bottleneck in my c++ code, and my goal is to speed it up. I am moving items from one vector to another vector if a condition is true.
In python, the pythonic way of doing this would be to use a list comprehension:
my_vector = [x for x in data_vector if x > 1]
I have hacked a way to do this in C++, and it is working fine. However, I am calling this millions of times in a while-loop and it is slow. I do not understand much about memory allocation, but I assume that my problem has to do with allocating memory over-and-over again using push_back. Is there a way to allocate my memory differently to speed up this code? (I do not know how large my_vector should be until the for-loop has completed).
std::vector<float> data_vector;
// Put a bunch of floats into data_vector
std::vector<float> my_vector;
while (some_condition_is_true) {
my_vector.clear();
for (std::size_t i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Use my_vector to render graphics on the GPU, but do not change the elements of my_vector
// Change the elements of data_vector, but not the size of data_vector
}

Use std::copy_if, and reserve data_vector.size() for my_vector initially (as this is the maximum possible number of elements for which your predicate could evaluate to true):
std::vector<float> my_vector;
my_vector.reserve(data_vector.size());
std::copy_if(data_vector.begin(), data_vector.end(), std::back_inserter(my_vector),
[](float el) { return el > 1; });
Note that you could avoid the reserve call here if you expect that the number of times that your predicate evaluates to true will be much less than the size of the data_vector.

Although others have posted great solutions, there is still not much explanation of memory allocation, which you say you do not understand well, so I would like to share what I know about the topic.
First, a C++ program typically uses several regions of memory: the stack, the heap, and the data segment.
The stack holds local variables. They are deallocated automatically and allocating them is very fast, but the stack's size depends on the OS and toolchain and is small (commonly around 1 MB to 8 MB by default), so placing a few megabytes of data on it may cause a stack overflow.
Heap memory can be reached from anywhere in the program through pointers. It can grow when needed and is much larger than the stack, but allocating from it is slower, and memory obtained with new has to be released manually (modern operating systems reclaim whatever is left when the program ends).
The data segment holds global and static variables. It can be divided into smaller parts, e.g. BSS, which holds zero-initialized data.
In your case, a vector is used. The elements of a vector are stored in its internal dynamic array, that is, an array whose size is chosen at run time. Standard C++ has no stack-allocated arrays of run-time size (variable-length arrays are a C99 feature that some compilers accept as an extension), so a dynamic array has to be created on the heap. Therefore, the elements of a vector live in an internal dynamic array on the heap.
To grow that array, the vector has to reallocate: it allocates a larger block, moves the elements over, and frees the old block. If a vector keeps growing, the cumulative cost of reallocation is high. To reduce it, a vector allocates more memory than it currently needs, leaving spare capacity for future elements. Therefore, in your code, reallocation is not performed every time push_back() is called. However, if many elements are added, the spare capacity runs out and reallocation occurs. To avoid that, reserve() can be used to request enough capacity up front.
I am a newbie. Hopefully, I have not made any mistake in my sharing.
Hope this helps.
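To make the effect of spare capacity visible, here is a rough sketch that counts how often the capacity of a vector changes while elements are pushed. The function name count_reallocations is invented for this illustration, and the exact count without reserve depends on the standard library's growth policy:

```cpp
#include <cstddef>
#include <vector>

// Returns how many times the capacity changed while pushing n elements.
// Each capacity change corresponds to one reallocation of the buffer.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<float> v;
    if (reserve_first) {
        v.reserve(n);  // request all the capacity up front
    }
    std::size_t changes = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(1.0f);
        if (v.capacity() != last_capacity) {
            ++changes;
            last_capacity = v.capacity();
        }
    }
    return changes;
}
```

With reserve_first set to true the function returns 0, because the buffer never has to grow. Without it, the result is small compared to n, since the capacity grows geometrically.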

Run the loop twice, the first time only counting how many new elements you will need. Then use reserve to allocate all the memory you need in one go.
while (some_condition_is_true) {
my_vector.clear();
std::size_t newLength = 0;
// First pass: count the elements that will be copied
for (std::size_t i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
newLength++;
}
}
my_vector.reserve(newLength);
// Second pass: copy them
for (std::size_t i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Do stuff with my_vector and change data_vector
}

I doubt allocating my_vector is the problem, especially if the while loop is executed many times as the capacity of my_vector should quickly become sufficient.
But to be sure you can just reserve capacity in my_vector corresponding to the size of data_vector:
my_vector.reserve(data_vector.size());
while (some_condition_is_true) {
my_vector.clear();
for (auto value : data_vector) {
if (value > 1)
my_vector.push_back(value);
}
}
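The reuse of capacity across iterations can be sketched as a small helper. The name filter_into is made up for this example, and the threshold of 1 is taken from the question:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Refills `out` with the elements of `data` that are greater than 1.
// clear() sets the size to 0 but keeps the capacity, so after the first
// few calls no further allocation should be needed.
void filter_into(const std::vector<float>& data, std::vector<float>& out) {
    out.clear();
    std::copy_if(data.begin(), data.end(), std::back_inserter(out),
                 [](float x) { return x > 1.0f; });
}
```

Calling filter_into(data_vector, my_vector) inside the while loop would replace the hand-written for loop while keeping the same buffer alive between iterations.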

If you are on Linux, you can reserve memory for my_vector to prevent the std::vector reallocations, which are the likely bottleneck in your case. Because Linux typically overcommits memory, reserved pages that are never touched generally do not consume physical memory, so any rough upper estimate for the reserve value will do. In your case the size of data_vector is enough. This line of code before the while loop should fix the bottleneck:
my_vector.reserve(data_vector.size());

Delete, Free, or Deallocate?

I'm running into a problem where I use too much memory on the stack. I'm using several large arrays that I only need between steps in my code. Basically I need to know how to release the memory used by an array variable that's created as:
float arrayName[length][width];

To intentionally release some automatic storage (items on the 'stack'), simply limit the scope of your variables.
change code from:
//...
float arrayName[length][width];
// ...
change code to:
//...
{
float arrayName[length][width];
// use arrayName here
//... still in-scope
} // scope limit
// all of arrayName released from stack
{
// stack is available for other use, so try
uint32_t u32[3][length][width];
// use u32 here
//... still in-scope
} // scope ended
// all of u32 released from stack
// better yet, use std::vector or another container
std::vector<uint32_t> bigArry;
// NOTE: the vector object itself uses a small, fixed amount of stack (24 bytes on my system),
// regardless of how many elements you put into it!

You should use vectors for things like this. It is a part of the C++ standard library and is very optimized in most implementations. The memory taken up by the vector will automatically get released when the vector goes out of scope. So you will never have to free up the memory yourself.
Another benefit with using a vector is that you do not have to worry about running out of stack space since all the "array" memory taken up by the vector is located on the heap of the program.
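As an illustration of both points, the sketch below uses a vector as a temporary buffer for one step of a computation. The function run_step and the fill value are invented for the example:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Runs one step that needs a length x width scratch buffer.
// The buffer lives on the heap and is released when the function
// returns, so nothing large is left on the stack between steps.
float run_step(std::size_t length, std::size_t width) {
    std::vector<float> scratch(length * width, 1.0f);
    float total = std::accumulate(scratch.begin(), scratch.end(), 0.0f);
    return total;
}  // scratch's destructor frees the memory here
```

Unlike float arrayName[length][width], the dimensions here may be ordinary run-time values.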
For examples, see http://en.cppreference.com/w/cpp/container/vector/vector
Other than that if you think your program memory is never going to be enough then you should consider using the disk as another storage mechanism. Databases work this way. They store most of their data on disk.

You won't need any special statements.
The array will be released on function return or exiting the scope if it is local variable having automatic storage duration, or on exiting the program if it is static variable (declared outside functions).

You may want to allocate the memory on the heap if you are running into a situation where you are running out of memory on the stack. In this case you'll want to new up the array.
float** my_array = new float* [rowCount];
for(int i = 0; i < rowCount; ++i)
{
my_array[i] = new float[columnCount];
}
// and delete it later
for(int i = 0; i < rowCount; ++i)
{
delete [] my_array[i];
}
delete [] my_array;
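A possible alternative to the array of row pointers is a single contiguous buffer with computed indices, which needs one allocation and no manual delete. The Matrix class below is only a sketch with invented names, not part of the answer above:

```cpp
#include <cstddef>
#include <vector>

// A rows x cols matrix stored row by row in one heap buffer.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Element access: m(r, c) maps to index r * cols + c.
    float& operator()(std::size_t r, std::size_t c) {
        return data_[r * cols_ + c];
    }
    float operator()(std::size_t r, std::size_t c) const {
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<float> data_;  // zero-initialized, freed automatically
};
```

Usage would look like Matrix m(rowCount, columnCount); m(2, 3) = 5.0f; with the memory released when m goes out of scope.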

Should I use vectors instead of arrays?

If I have a fixed number of elements of class MyClass, should I use arrays or vectors?, ie:
MyClass* myArray[];
or
std::vector<MyClass*> myVector;
?

Use std::array or raw arrays for a small, static number of elements.
If you have a lot of elements (more than, say, 100 KB), you hog the stack and are asking for stack corruption or overflow. In that case, or if the number of elements can only be known at runtime, use std::vector.

If you know the number at compile time, use a static array.
If the number is dynamic (obtained from the user), a vector is much better and saves you the hassle of managing the memory.

"Fixed" has two meanings in this context. The usual one is set once, never change, such as a value read from input. This value is known at runtime and requires dynamic allocation on the heap. Your options are a C-style array with new or a vector; it is highly recommended you use a vector.
#include <vector>
#include <iostream>
int main() {
int size;
std::cin >> size;
int *myArray = new int[size]; // C-style: must be released manually
std::vector<int> myVector(size); // released automatically
delete[] myArray;
}
"Fixed" can also mean a compile-time constant, meaning it is constant for any run of the program. You can use a C-style array or a C++ array (automatic memory allocation on the stack).
#include <array>
int main() {
const int size = 50;
int myArray[size]; // C-style array
std::array<int, size> myStdArray; // C++ array
}
These are faster, but your program needs to have access to sufficient stack memory, which is something you can change in your project settings. See this topic for more info. If the size of the array is really big, you may want to consider allocating on the heap anyway (vector).

Vector of vector pointer memory allocation

I have a vector that contains thousands of vectors, and each of these inner vectors contains thousands of numbers. I want to keep memory management safe and memory usage as low as possible.
Suppose I have code similar to this:
int size = 10;
vector<vector<double>>* something = new vector<vector<double>>(size);
vector<double>* insideOfSomething;
for(int i = 0; i < size; i++){
insideOfSomething = &(something->at(i));
//...
//do something with insideOfSomething
//...
}
I know that 'something' will be created on the heap. What I don't understand is where the vectors that 'insideOfSomething' points to are placed. If they are created on the stack, does this mean that I have a pointer to a vector on the heap which contains vectors created on the stack? (I'm very confused right now.)
And if I have code similar to this:
vector<vector<double>*>* something = new vector<vector<double>*>(size);
vector<double>* insideOfSomething;
for(int i = 0; i < size; i++){
something->at(i) = new vector<double>();
insideOfSomething = something->at(i);
//...
//do something with inside insideOfSomething
//...
}
then all of my vectors are stored on the heap, right?
Which one is better in terms of memory management?

You should avoid allocating vectors on the heap and just declare them on the stack, since the vector will manage its objects on the heap for you. Anywhere you want to avoid creating a copy, you can just use a reference or const reference (whichever is appropriate).
vector<vector<double> > something(size);
for(int i = 0; i < size; i++)
{
vector<double> &insideOfSomething = something.at(i);
//use insideOfSomething
}

Let's take a random, simplistic implementation of vector, as I think this will help you.
template <class T, class Alloc>
class vector
{
private:
T* buffer;
std::size_t vector_size;
std::size_t vector_capacity;
Alloc alloc;
public:
...
};
In this case, if we write:
vector<int> v;
v.push_back(123);
... the pointer, buffer, the integrals: vector_size and vector_capacity, and the allocator object, alloc, will all be created on the stack (along with allocating any additional memory necessary for structure padding and alignment).
However, vector itself will allocate memory on the heap to which this buffer pointer will store its base address. That will always be on the heap and will contain the actual contents of the vector as we think of them.
This is still more efficient than this:
vector<int>* v = new vector<int>;
v->push_back(123);
...
delete v;
... as this would involve a heap allocation/deallocation for the vector itself (including its data members) in addition to the memory vector itself allocates for its internal contents (the buffer). It also introduces an additional level of indirection.
Now if we have a vector of Somethings (vector of vector or anything else):
vector<Something> v;
Those Something instances are always going to be allocated within a contiguous heap buffer since they would reside in the dynamically allocated memory blocks that vector creates and destroys internally.
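The claim can be checked informally by comparing addresses. The sketch below uses a helper invented for this purpose, elements_live_outside_object, and assumes a platform where pointers convert to std::uintptr_t in the usual flat way:

```cpp
#include <cstdint>
#include <vector>

// Returns true if the element buffer lies outside the bytes occupied
// by the vector object itself (its pointer, size and capacity members).
bool elements_live_outside_object(const std::vector<int>& v) {
    auto object_begin = reinterpret_cast<std::uintptr_t>(&v);
    auto object_end = object_begin + sizeof(v);
    auto buffer = reinterpret_cast<std::uintptr_t>(v.data());
    return buffer < object_begin || buffer >= object_end;
}
```

For a non-empty vector declared as a local variable this returns true: the object sits on the stack while v.data() points into the separately allocated, contiguous buffer.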

In vector<>, all element data is stored on the heap.
I think you should simply use
vector< vector<double> > something;

I want to keep memory management safe and memory usage at minimum as much as possible.
Then
vector<vector<double>>* something = new vector<vector<double>>(size);
is already not good. As said in the other answers, vector already has its data on the heap, no need to mess around with new to achieve this. In fact, the objects' location is like
        S t a c k                               H e a p
                                      (vector<double>) sthng[0]
(vector<vector<double>>) sthng        (vector<double>) sthng[1]
                                      ...
                                      - - - - - -
                                      (double) sthng[0][0]
                                      (double) sthng[0][1]
                                      ...
                                      - - - - - -
                                      (double) sthng[1][0]
                                      (double) sthng[1][1]
                                      ...
(of course, there is no particular ordering of the blocks on the heap)
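To get a feel for how much heap memory such a layout takes, one can add up the capacities of the outer and inner buffers. This is a rough sketch with an invented function name, approx_heap_bytes, and it ignores allocator bookkeeping overhead:

```cpp
#include <cstddef>
#include <vector>

// Estimates the heap bytes held by a vector of vectors: one buffer of
// inner vector objects, plus one buffer of doubles per inner vector.
std::size_t approx_heap_bytes(const std::vector<std::vector<double>>& outer) {
    std::size_t bytes = outer.capacity() * sizeof(std::vector<double>);
    for (const auto& inner : outer) {
        bytes += inner.capacity() * sizeof(double);
    }
    return bytes;
}
```

The stack cost, by contrast, stays at sizeof(std::vector<std::vector<double>>) no matter how large the contents are.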

Joe and hired777's answers explain that a vector's data will be allocated on the heap no matter what. I'll try to give some insight into the reason for this.
A vector is a resizable container. It typically grows by a constant factor (often 1.5 or 2, depending on the implementation) when it reaches capacity, which means it needs to be able to allocate more memory than it had already allocated. Hence, even when you declare a vector inside a function, and hence on the stack, internally it holds a pointer to its data on the heap, and when it goes out of the function's scope, its destructor will delete this data from the heap.
