How can I allocate memory for the std::vector<std::vector<int> > data type? - c++

Suppose the type typedef std::vector<std::vector<int> > MYARRAY is defined. For a variable MYARRAY var, how can I allocate memory for it before pushing data into it? For example,
std::vector<int> p1;
p1.push_back(1);
p1.push_back(2);
std::vector<int> p2;
p2.push_back(22);
p2.push_back(33);
var.push_back(p1);
var.push_back(p2);
If we do not allocate memory for var ourselves, it allocates memory automatically. So how can I allocate memory for var before pushing data into it? If it were a std::vector<int> var2, I could just use var2.reserve(n) to allocate memory before using it.
EDIT:
Two suggestions have been made, but neither works:
Solution 1: allocate memory for each element
var.reserve(3);
for(int i=0; i<3; i++)
var[i].reserve(20);
I compiled and ran this code with VC 2010 in debug mode, and an error message is given.
Solution 2: create the object in the beginning
std::vector<std::vector<int> > var(3, std::vector<int>(5));
After creating this variable, inspecting it in VC 2010 shows that its contents are already there. Therefore, if you push data onto this variable, it will allocate memory once again.
EDIT 2:
Someone asked why I need to allocate memory before using this variable. The main reason is the Windows run-time library. The variable var is defined in an executable as an empty variable, and its contents are filled in by a function defined in a dynamic library. If both used the dynamic run-time library, this would not be an issue. But in my case both are linked against the static run-time library, which means that each module is in charge of its own memory allocation. Since var is defined in the executable, the executable should also take care of its memory allocation.

The way you reserve memory for a vector is independent of what type the vector contains. If you want to reserve space for n elements in a MYARRAY named var, call var.reserve(n);.
Its contents are already there. Therefore, if you push data on this variable, it will allocate memory once again.
Right. Do you want to reserve memory or not? If you want to reserve memory, you'll have to use the memory you reserved. In order for you to "push data on this variable", you'd have to have the data somewhere, somewhere other than the memory you reserved. If you reserve memory and use that memory, you'll never have anything to push, because that would imply that you have something someplace other than in the memory you reserved that you need to add to the vector, which you couldn't possibly have.
You basically have three choices:
1) Don't reserve memory. Assemble the objects wherever you want and then push them into the vector. vec.push_back(myVector);
2) Reserve memory. Assemble the objects in place in the vector. vec[n].push_back(myInt);
3) Reserve memory. Assemble the objects wherever you want and then assign them into the memory you reserved. vec[n]=myIntVector
Notice that in none of these cases do you reserve memory and then push into the vector.

Like you already pointed out, std::vector has the reserve method that will reserve space for more data items. If you were to do p1.reserve(3) the vector would attempt to allocate space for 3 integers. If you run var.reserve(3), var will attempt to allocate space for 3 std::vector<int>'s, which is what it sounds like you want to do.
To allocate for the std::vector's inside of var, you could do:
for(int x=0; x<var.size(); x++) var[x].reserve(n);
EDIT
If you want to allocate space before inserting the vectors, you can declare var as:
std::vector<std::vector<int> > var(VEC_COUNT, std::vector<int>(n));
And then copy the new vectors in.

You cannot reserve space for a vector's data before the vector exists. I believe you're looking for this:
void reserve(MYARRAY &arr, size_t dim1, size_t dim2)
{
arr.resize(dim1);
for (size_t idx = 0; idx < dim1; ++idx) {
arr[idx].reserve(dim2);
}
}
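As a rough sketch of how that helper behaves, assuming the MYARRAY typedef from the question: the names reserve_rows and fill_without_realloc are hypothetical, chosen only for this illustration. The point is that resize() creates the rows, so arr[idx] is valid, while reserve() only sets aside capacity, so later push_back calls on a row do not reallocate until that capacity is exceeded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int> > MYARRAY;

// Create dim1 empty rows, each with capacity for at least dim2 ints.
// resize() constructs the rows; reserve() only preallocates their storage.
void reserve_rows(MYARRAY &arr, std::size_t dim1, std::size_t dim2)
{
    arr.resize(dim1);
    for (std::size_t idx = 0; idx < dim1; ++idx) {
        arr[idx].reserve(dim2);
    }
}

// Hypothetical check: push `count` values into one row and report whether
// the row's capacity stayed the same, i.e. no reallocation took place.
bool fill_without_realloc(MYARRAY &arr, std::size_t row, int count)
{
    assert(row < arr.size());
    const std::size_t capacity_before = arr[row].capacity();
    for (int i = 0; i < count; ++i) {
        arr[row].push_back(i);
    }
    return arr[row].capacity() == capacity_before;
}
```

Calling reserve_rows(var, 3, 20) and then fill_without_realloc(var, 1, 20) should report true, since the 20 values fit in the capacity reserved for that row.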

Related

Creating a global array of structs

For a project I am working on, I need a global array of entry structs. I am having trouble because I can't allocate the memory until the program is running and has determined the size of a file. The overall goal of the project is to create a word reference. So far, this is what I have:
struct info{
//stores the specific character
std::string c;
//stores the number of times a word has come up in the file
float num;
};
info info_store[];
This project is to learn about arrays, so I need to use an array.
You can:
- use new/delete[]
info* p_array=new info[100]; // create an array of size 100
p_array[10].num; // member access example
delete[] p_array; // release memory
- use std::unique_ptr
std::unique_ptr<info[]> array(new info[size]);
-> The advantage is that your memory is automatically released when array is destroyed (no more delete[])
First of all, use std::vector or any other STL container.
Second, you can use dynamic arrays.
auto length = count_structs(file);
auto data = new info[length];
Something like this. Then just fill this array.
Ohh, and make sure you have delete [] data to prevent memory leaks.
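To tie the two suggestions together, here is a minimal sketch of a global array whose size is only known at run time. It assumes the info struct from the question; the names info_count, allocate_info_store and entry are hypothetical, and how the count is obtained from the file is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

struct info {
    std::string c;  // the stored text
    float num;      // how many times it has come up
};

// Global handle to the array; it stays empty until the size is known.
std::unique_ptr<info[]> info_store;
std::size_t info_count = 0;

// Allocate the array once the number of entries has been determined.
void allocate_info_store(std::size_t count)
{
    info_store.reset(new info[count]());  // () value-initializes, so num starts at 0
    info_count = count;
}

// Bounds-checked access to one entry (checked in debug builds only).
info &entry(std::size_t i)
{
    assert(i < info_count);
    return info_store[i];
}
```

The unique_ptr releases the array automatically at program exit, so no explicit delete[] is needed in this variant.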

Fast way to push_back a vector many times

I have identified a bottleneck in my c++ code, and my goal is to speed it up. I am moving items from one vector to another vector if a condition is true.
In python, the pythonic way of doing this would be to use a list comprehension:
my_vector = [x for x in data_vector if x > 1]
I have hacked a way to do this in C++, and it is working fine. However, I am calling this millions of times in a while-loop and it is slow. I do not understand much about memory allocation, but I assume that my problem has to do with allocating memory over-and-over again using push_back. Is there a way to allocate my memory differently to speed up this code? (I do not know how large my_vector should be until the for-loop has completed).
std::vector<float> data_vector;
// Put a bunch of floats into data_vector
std::vector<float> my_vector;
while (some_condition_is_true) {
my_vector.clear();
for (std::size_t i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Use my_vector to render graphics on the GPU, but do not change the elements of my_vector
// Change the elements of data_vector, but not the size of data_vector
}
Use std::copy_if, and reserve data_vector.size() for my_vector initially (as this is the maximum possible number of elements for which your predicate could evaluate to true):
std::vector<float> my_vector;
my_vector.reserve(data_vector.size());
std::copy_if(data_vector.begin(), data_vector.end(), std::back_inserter(my_vector),
[](float el) { return el > 1; });
Note that you could avoid the reserve call here if you expect that the number of times that your predicate evaluates to true will be much less than the size of the data_vector.
Although others have posted various good solutions, there is still not much explanation of memory allocation, which you say you do not understand well, so I would like to share what I know about the topic.
Firstly, a C++ program typically uses several kinds of memory: the stack, the heap, and the data segment.
The stack holds local variables. Its important features: variables on it are deallocated automatically, operations on it are very fast, and its size is OS-dependent and fairly small (commonly a few megabytes by default on desktop systems), so storing large arrays on the stack may overflow it, et cetera.
Heap memory can be reached from anywhere in the program through pointers. Its important features: it can grow dynamically as needed and is much larger than the stack, allocation on it is slower than on the stack, and memory must be deallocated manually (modern operating systems reclaim it when the program exits), et cetera.
The data segment holds global and static variables. In fact, this piece of memory can be divided into even smaller parts, e.g. BSS.
In your case a vector is used. The elements of a vector are stored in an internal dynamic array, that is, an internal array whose size can change. Standard C++ has no way to create an array with a run-time size on the stack (variable-length arrays are a C99 feature and only a compiler extension in C++), so a dynamic array has to be created on the heap. Therefore, the elements of a vector are stored in an internal dynamic array on the heap. To grow such an array, a process called reallocation is needed: a larger block is allocated and the existing elements are moved into it. If the user keeps enlarging the vector, the total cost of reallocation would be high. To deal with this, a vector allocates more memory than it currently needs, keeping the extra for future use. Therefore, in your code, it is not the case that memory is reallocated every time push_back() is called. However, if the number of elements to be copied is large, the memory held for future use will not be enough, and reallocation will occur. To avoid that, vector.reserve() may be used.
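To make the reallocation behaviour concrete, here is a small sketch that counts how often a vector's capacity changes while elements are pushed. The function name count_reallocations is hypothetical, and the exact growth factor depends on the standard library implementation, so only the rough shape of the result should be relied on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how often the capacity changes while pushing n floats.
// If reserve_first is true, the capacity is requested up front.
std::size_t count_reallocations(std::size_t n, bool reserve_first)
{
    std::vector<float> v;
    if (reserve_first) {
        v.reserve(n);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<float>(i));
        if (v.capacity() != last_capacity) {  // the buffer was replaced
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

With reserve_first set to true the count should be zero; without it, the count is small compared with n because the capacity grows geometrically.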
I am a newbie. Hopefully, I have not made any mistake in my sharing.
Hope this helps.
Run the loop twice: the first pass only counts how many elements you will need. Then use reserve to allocate all the memory you need up front.
while (some_condition_is_true) {
my_vector.clear();
std::size_t newLength = 0;
// First pass: count the matching elements
for (std::size_t i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
newLength++;
}
}
my_vector.reserve(newLength);
// Second pass: copy the matching elements
for (std::size_t i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Do stuff with my_vector and change data_vector
}
I doubt allocating my_vector is the problem, especially if the while loop is executed many times as the capacity of my_vector should quickly become sufficient.
But to be sure you can just reserve capacity in my_vector corresponding to the size of data_vector:
my_vector.reserve(data_vector.size());
while (some_condition_is_true) {
my_vector.clear();
for (auto value : data_vector) {
if (value > 1)
my_vector.push_back(value);
}
}
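The reasoning behind this is that clear() resets the size but leaves the capacity alone, so the buffer is reused on every pass of the while loop. A minimal sketch of that, with a hypothetical helper name filter_into and the threshold passed in as a parameter:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy every element greater than `threshold` from data into out,
// reusing whatever buffer out already owns.
void filter_into(const std::vector<float> &data, float threshold,
                 std::vector<float> &out)
{
    assert(&data != &out);  // input and output must be different vectors
    out.clear();            // size becomes 0; capacity is left unchanged
    for (float value : data) {
        if (value > threshold) {
            out.push_back(value);
        }
    }
}
```

If out has been reserved to data.size() once before the loop, repeated calls never need to reallocate, because at most data.size() elements can pass the filter.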
If you are on Linux, you can reserve memory for my_vector to prevent std::vector reallocations, which are the bottleneck in your case. Note that reserve will typically not waste physical memory, thanks to overcommit, so any rough upper estimate works as the reserve value. In your case the size of data_vector is enough. This line of code before the while loop should fix the bottleneck:
my_vector.reserve(data_vector.size());

Delete, Free, or Deallocate?

I'm running into a problem where I use too much memory on the stack. I'm using several large arrays that I only need between steps in my code. Basically I need to know how to release the memory used by an array variable that's created as:
float arrayName[length][width];
To intentionally release some automatic storage (items on the 'stack'), you can do the following: simply limit the scope of your variables.
change code from:
//...
float arrayName[length][width];
// ...
change code to:
//...
{
float arrayName[length][width];
// use arrayName here
//... still in-scope
} // scope limit
// all of arrayName released from stack
{
// stack is available for other use, so try
uint32_t u32[3][length][width];
// use u32 here
//... still in-scope
} // scope ended
// all of u32 released from stack
// better yet, use std::vector or another container
std::vector<uint32_t> bigArry;
NOTE: a vector object itself uses a small, fixed amount of stack (24 bytes on my system),
regardless of how many elements you put into it!
You should use vectors for things like this. It is a part of the C++ standard library and is very optimized in most implementations. The memory taken up by the vector will automatically get released when the vector goes out of scope. So you will never have to free up the memory yourself.
Another benefit with using a vector is that you do not have to worry about running out of stack space since all the "array" memory taken up by the vector is located on the heap of the program.
For examples, see http://en.cppreference.com/w/cpp/container/vector/vector
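As a rough sketch of what replacing float arrayName[length][width] with a vector could look like: the class name Grid and its members are hypothetical, and a single flat vector indexed as row * width + col is just one possible layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: a length x width grid of floats in one heap buffer.
class Grid {
public:
    Grid(std::size_t length, std::size_t width)
        : width_(width), cells_(length * width, 0.0f) {}

    // Element access; the index is checked in debug builds only.
    float &at(std::size_t row, std::size_t col)
    {
        assert(row * width_ + col < cells_.size());
        return cells_[row * width_ + col];
    }

    std::size_t size() const { return cells_.size(); }

private:
    std::size_t width_;
    std::vector<float> cells_;  // released automatically when Grid is destroyed
};
```

A Grid declared inside a block takes only a few bytes of stack, and its heap buffer is freed at the closing brace of that block.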
Other than that if you think your program memory is never going to be enough then you should consider using the disk as another storage mechanism. Databases work this way. They store most of their data on disk.
You won't need any special statements.
The array will be released on function return or on exiting the scope if it is a local variable with automatic storage duration, or on exiting the program if it is a static variable (for example, one declared outside functions).
You may want to allocate the memory on the heap if you are running into a situation where you are running out of memory on the stack. In this case you'll want to new up the array.
float** my_array = new float* [rowCount];
for(int i = 0; i < rowCount; ++i)
{
my_array[i] = new float[columnCount];
}
// and delete it later
for(int i = 0; i < rowCount; ++i)
{
delete [] my_array[i];
}
delete [] my_array;

Vector of vector pointer memory allocation

First, I want to say that I have a vector that contains thousands of vectors. Each of these inner vectors contains thousands of numbers. I want to keep memory management safe and memory usage at minimum as much as possible.
I want to ask: if I have code similar to the one below,
int size = 10;
vector<vector<double>>* something = new vector<vector<double>>(size);
vector<double>* insideOfSomething;
for(int i = 0; i < size; i++){
insideOfSomething = &(something->at(i));
//...
//do something with insideOfSomething
//...
}
I know that 'something' will be created on the heap. What I don't understand is where the vectors that 'insideOfSomething' points to are placed. If they are created on the stack, does this mean that I have a vector pointer, which points to a vector on the heap, which contains vectors that are created on the stack? (I'm very confused right now.)
If I have code similar to the one below,
vector<vector<double>*>* something = new vector<vector<double>*>(size);
vector<double>* insideOfSomething;
for(int i = 0; i < size; i++){
something->at(i) = new vector<double>();
insideOfSomething = something->at(i);
//...
//do something with inside insideOfSomething
//...
}
then all of my vectors are stored on the heap, right?
Which one is more useful in terms of memory management?
You should avoid allocating vectors on the heap and just declare them on the stack, since the vector will manage its objects on the heap for you. Anywhere you want to avoid creating a copy, you can just use a reference or const reference (whichever is necessary).
vector<vector<double> > something(size);
for(int i = 0; i < size; i++)
{
vector<double> &insideOfSomething = something.at(i);
//use insideOfSomething
}
Let's take a random, simplistic implementation of vector, as I think this will help you.
template <class T, class Alloc>
class vector
{
private:
T* buffer;
std::size_t vector_size;
std::size_t vector_capacity;
Alloc alloc;
public:
...
};
In this case, if we write:
vector<int> v;
v.push_back(123);
... the pointer, buffer, the integrals: vector_size and vector_capacity, and the allocator object, alloc, will all be created on the stack (along with allocating any additional memory necessary for structure padding and alignment).
However, the vector itself will allocate memory on the heap, and the buffer pointer will store that block's base address. That block will always be on the heap and will contain the actual contents of the vector as we think of them.
This is still more efficient than this:
vector<int>* v = new vector<int>;
v->push_back(123);
...
delete v;
... as this would involve a heap allocation/deallocation for the vector itself (including its data members) in addition to the memory vector itself allocates for its internal contents (the buffer). It also introduces an additional level of indirection.
Now if we have a vector of Somethings (vector of vector or anything else):
vector<Something> v;
Those Something instances are always going to be allocated within a contiguous heap buffer since they would reside in the dynamically allocated memory blocks that vector creates and destroys internally.
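One way to check this for yourself is to compare the address of the vector object with the address of its elements. This is only an illustrative sketch; the function name elements_inside_object is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the elements are stored inside the bytes of the vector
// object itself, false if they live in a separate buffer.
bool elements_inside_object(const std::vector<int> &v)
{
    assert(!v.empty());  // data() is only meaningful here for a non-empty vector
    const std::uintptr_t obj_begin = reinterpret_cast<std::uintptr_t>(&v);
    const std::uintptr_t obj_end = obj_begin + sizeof(v);
    const std::uintptr_t data = reinterpret_cast<std::uintptr_t>(v.data());
    return data >= obj_begin && data < obj_end;
}
```

For a vector declared as a local variable, this should return false: the object on the stack holds only the bookkeeping members, while the elements sit in one contiguous block allocated elsewhere.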
In vector<>, all the data is stored on the heap.
And I think you should simply use
vector< vector<double> > something;
I want to keep memory management safe and memory usage at minimum as much as possible.
Then
vector<vector<double>>* something = new vector<vector<double>>(size);
is already not good. As said in the other answers, vector already has its data on the heap, no need to mess around with new to achieve this. In fact, the objects' location is like
S t a c k                          H e a p
                                   (vector<double>) sthng[0]
(vector<vector<double>>) sthng     (vector<double>) sthng[1]
                                   ...
                                   - - - - - -
                                   (double) sthng[0][0]
                                   (double) sthng[0][1]
                                   ...
                                   - - - - - -
                                   (double) sthng[1][0]
                                   (double) sthng[1][1]
                                   ...
(of course, there is no particular ordering of the blocks on the heap)
Joe and hired777's answers explain that a vector's elements will be allocated on the heap no matter what. I'll try to give some insight into the reason for this.
A vector is a resizable container. When it reaches capacity it typically grows by a constant factor (often doubling), which means it needs to be able to allocate more memory than it had already allocated. Hence even when you declare a vector inside a function, and hence on the stack, internally it holds a pointer to its data on the heap, and when it goes out of the function's scope, its destructor will delete this data from the heap.

How to expand an array dynamically in C++? {like in vector }

Let's say I have
int *p;
p = new int[5];
for(int i=0;i<5;i++)
*(p+i)=i;
Now I want to add a 6th element to the array. How do I do it?
You have to reallocate the array and copy the data:
int *p;
p = new int[5];
for(int i=0;i<5;i++)
*(p+i)=i;
// realloc
int* temp = new int[6];
std::copy(p, p + 5, temp); // Suggested by comments from Nick and Bojan
delete [] p;
p = temp;
You cannot. You must use a dynamic container, such as an STL vector, for this. Or else you can make another array that is larger, and then copy the data from your first array into it.
The reason is that an array represents a contiguous region in memory. For your example above, let us say that p points to address 0x1000, and the five ints occupy twenty bytes, so the array ends at the boundary of 0x1014. Other objects are free to be placed in the memory starting at 0x1014; for example, an int i might occupy 0x1014..0x1018. If you then extended the array so that it occupied four more bytes, what would happen?
If you allocate the initial buffer using malloc you can use realloc to resize the buffer. You shouldn't use realloc to resize a new-ed buffer.
int * array = (int*)malloc(sizeof(int) * arrayLength);
array = (int*)realloc(array, sizeof(int) * newLength);
However, this is a C-ish way to do things. You should consider using vector.
Why don't you look in the sources to see how vector does that? You can find the implementation of this mechanism right in the folder where your C++ include files reside!
Here's what it does on gcc 4.3.2:
1) Allocate a new contiguous chunk of memory using the vector's allocator (remember that vector is vector<Type, Allocator = std::allocator<Type> >, which in libstdc++ is built on new_allocator). The default allocator calls operator new() (not just new!) to allocate this chunk, so it does not have to deal with new[]/delete[];
2) Copy the contents of the existing array to the newly allocated one;
3) Dispose of the previously allocated chunk with the allocator; the default one uses operator delete().
(Note that if you're going to write your own vector, its capacity should grow by a constant factor, not by a fixed amount. This gives amortized constant time for appending. For example, if the capacity doubles each time the limit is exceeded, each element is copied about once on average.)
As others are saying; but if you're resizing the array often, one strategy is to double its size each time you resize. There's an expense to constantly creating new arrays and destroying old ones, so doubling tries to mitigate this by ensuring that there's sufficient room for future elements as well.
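Putting the reallocate-and-copy step together with the doubling strategy, a minimal sketch of a growable int array might look like the following. The class name IntArray and its members are hypothetical; this is not how std::vector is actually implemented, and for real code std::vector remains the better choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical minimal growable array of int that doubles its capacity.
class IntArray {
public:
    IntArray() : data_(nullptr), size_(0), capacity_(0) {}
    ~IntArray() { delete[] data_; }
    IntArray(const IntArray &) = delete;             // copying is left out
    IntArray &operator=(const IntArray &) = delete;  // to keep the sketch short

    void push_back(int value)
    {
        if (size_ == capacity_) {
            grow(capacity_ == 0 ? 1 : capacity_ * 2);  // double when full
        }
        data_[size_++] = value;
    }

    int operator[](std::size_t i) const
    {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    // Allocate a larger block, copy the old elements, release the old block.
    void grow(std::size_t new_capacity)
    {
        int *temp = new int[new_capacity];
        std::copy(data_, data_ + size_, temp);
        delete[] data_;
        data_ = temp;
        capacity_ = new_capacity;
    }

    int *data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

Adding a 6th element to a full 5-element array then costs one reallocation, but the following pushes are free until the doubled capacity is used up.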