Understanding C++ map: why isn't heap memory released with clear()?

Suppose I have an infinite loop that creates a hashmap:
#include <climits>
#include <map>
using namespace std;

void createMap() {
    map<int, int> mymap;
    for (int i = 0; i < INT_MAX; i++) {
        mymap[i] = i;
    }
    mymap.clear(); // <-- this line doesn't seem to make a difference in memory growth
}

int main(void) {
    while (1) {
        createMap();
    }
    return 0;
}
I ran the code on macOS and watched it in Activity Monitor: the application's memory usage keeps growing, with or without the mymap.clear() at the end of the createMap() function.
Shouldn't memory usage be constant for the case where mymap.clear() is used?
What's the general recommendation for using STL data containers? Do I need to call .clear() before the end of the function?

I asked in another forum, and the folks there helped me understand the answer. It turns out I didn't wait long enough for createMap to return, nor do I have enough memory to sustain this program.
It takes INT_MAX = 2147483647 elements to be created; each pair<int, int> element is 8 bytes, plus about 24 bytes for the map object itself.
Total minimum memory = 2147483647 * 8 + 24 = 17179869200 bytes ≈ 17.2 GB.
I reduced the number of elements and tested both with and without .clear(); the program grew and shrank in size accordingly.

The container you create is bound to the scope of your function. When the function returns, its lifetime ends, and since std::map owns its data, the memory it allocates is freed upon destruction.
Your code therefore constantly allocates and frees the same amount of memory, so memory consumption is constant overall, although the exact memory locations will probably differ. This also means that you should not manually call clear at the end of this function. Use clear when you want to empty a container that you intend to keep using afterwards.
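To illustrate the distinction, here is a minimal sketch (names are mine): clear() is for emptying a container you keep using, while a container that simply goes out of scope releases its memory on its own:

#include <map>

void reuseMap() {
    std::map<int, int> m;
    for (int pass = 0; pass < 3; ++pass) {
        for (int i = 0; i < 1000; ++i) {
            m[i] = i;
        }
        m.clear(); // emptied because the same container is reused next pass
    }
} // m is destroyed here; its memory is released with no clear() needed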
As a side note, std::map is not a hash map (std::unordered_map is one).

Related

How to free memory for vector of vectors (C++)

I have a vector<vector<double>> elem and I want to deallocate its memory many times in my program.
I tried using
vector<vector<double>>().swap(elem);
Or even a for loop:
for (int i = 0; i < elem.size(); i++)
    vector<double>().swap(elem[i]);
vector<vector<double>>().swap(elem);
elem.resize(dim, vector<double>(0));
(I want the first dimension to be a certain number dim)
But when I call
cout << elem[0].size();
numerous times in my program, the output keeps growing, even though I've just used the aforementioned method. This issue isn't present with the "main" size of the vector.
i.e.
cout << elem.size();
always outputs dim
EDIT: I know about clear(), but I want to deallocate the vector; shrink_to_fit() doesn't work either. Also, this is implemented in a function outside the main one, as follows:
void arrayReset(vector<vector<double>> elem) {
    for (int i = 0; i < elem.size(); i++)
        vector<double>().swap(elem[i]);
    vector<vector<double>>().swap(elem);
    elem.resize(dim, vector<double>(0));
}
Your new function void arrayReset(vector<vector<double>> elem) { gets a COPY of your vector and [possibly] cleans it; the calling function never sees the change.
If you pass your vector by reference instead, you will manipulate the original vector.
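A minimal sketch of the by-reference version (here dim is passed as a parameter rather than relying on the asker's global):

#include <cstddef>
#include <vector>
using std::vector;

// elem is now a reference, so the caller's vector is the one modified.
void arrayReset(vector<vector<double>>& elem, std::size_t dim) {
    vector<vector<double>>().swap(elem); // frees the old storage outright
    elem.resize(dim);                    // leaves dim empty inner vectors
}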
How to free memory for vector
The way is the same for all vectors regardless of the element type.
Step 1: Remove the elements of the vector. The simplest way is the clear member function. After this step, the size member function will return 0.
Step 2: Call the shrink_to_fit member function, which requests the memory to be deallocated. After this step, capacity may return 0.
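A minimal sketch of the two steps together:

#include <vector>

int main() {
    std::vector<std::vector<double>> elem(100, std::vector<double>(100));
    elem.clear();         // step 1: size() is now 0
    elem.shrink_to_fit(); // step 2: request that capacity() drop to 0
}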
Technically, shrink_to_fit is a request that is not required to be honoured by the language implementation. The only guaranteed way to deallocate the memory is to destroy the vector. Example:
{
    std::vector<std::vector<double>> v;
    // use v here
}
// memory has been deallocated
I want to deallocate its memory many times in my program.
Note that this is typically slower than not deallocating many times. I recommend making sure that this is something you actually need.

Fast way to push_back a vector many times

I have identified a bottleneck in my C++ code, and my goal is to speed it up. I am moving items from one vector to another vector if a condition is true.
In python, the pythonic way of doing this would be to use a list comprehension:
my_vector = [x for x in data_vector if x > 1]
I have hacked a way to do this in C++, and it is working fine. However, I am calling this millions of times in a while-loop and it is slow. I do not understand much about memory allocation, but I assume that my problem has to do with allocating memory over-and-over again using push_back. Is there a way to allocate my memory differently to speed up this code? (I do not know how large my_vector should be until the for-loop has completed).
std::vector<float> data_vector;
// Put a bunch of floats into data_vector

std::vector<float> my_vector;
while (some_condition_is_true) {
    my_vector.clear();
    for (i = 0; i < data_vector.size(); i++) {
        if (data_vector[i] > 1) {
            my_vector.push_back(data_vector[i]);
        }
    }
    // Use my_vector to render graphics on the GPU, but do not change the elements of my_vector
    // Change the elements of data_vector, but not the size of data_vector
}
Use std::copy_if, and reserve data_vector.size() for my_vector initially (as this is the maximum possible number of elements for which your predicate could evaluate to true):
std::vector<float> my_vec;
my_vec.reserve(data_vec.size());
std::copy_if(data_vec.begin(), data_vec.end(), std::back_inserter(my_vec),
             [](const auto& el) { return el > 1; });
Note that you could avoid the reserve call here if you expect that the number of times that your predicate evaluates to true will be much less than the size of the data_vector.
Though there are various great solutions posted by others for your query, it seems there is still not much explanation of the memory allocation that you say you do not fully understand, so I would like to share my knowledge of this topic with you. Hope this helps.
Firstly, in C++ there are several kinds of memory: the stack, the heap, and the data segment.
The stack is for local variables. It has some important properties: variables on it are deallocated automatically, operations on it are very fast, and its size is OS-dependent and small, so storing large arrays on the stack can cause an overflow.
Heap memory can be accessed globally. Its size can be dynamically extended as needed and is much larger than the stack, operations on it are slower than on the stack, and manual deallocation is required (on modern operating systems the memory is automatically freed when the program ends).
The data segment is for global and static variables. This piece of memory can be further divided into smaller parts, e.g. the BSS segment.
In your case, a vector is used. The elements of a vector are stored in an internal array whose size changes dynamically, and such a dynamic array has to be created on the heap; therefore the elements of your vector live in heap memory. To grow the array, a process called memory reallocation is needed, and if a user keeps enlarging a vector, the overhead of repeated reallocation becomes high. To deal with this, a vector allocates more memory than it currently needs, i.e. it reserves memory for potential future use. So in your code, reallocation does not happen on every push_back() call. However, when the vector being filled becomes large, the memory reserved for future use runs out and another reallocation occurs. To avoid this, vector.reserve() can be used.
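A small sketch that makes this growth policy visible (the exact capacity values are implementation-defined):

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<float> v;
    std::size_t last_cap = 0;
    for (int i = 0; i < 1000; ++i) {
        v.push_back(1.0f);
        if (v.capacity() != last_cap) { // a reallocation just happened
            last_cap = v.capacity();
            std::cout << "size " << v.size()
                      << " -> capacity " << last_cap << '\n';
        }
    }
}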
I am a newbie, so hopefully I have not made any mistakes in my sharing.
Run the code twice: the first time, only count how many new elements you will need. Then use reserve to allocate all the memory you need up front.
while (some_condition_is_true) {
    my_vector.clear();
    // First pass: count the matching elements.
    int newLength = 0;
    for (i = 0; i < data_vector.size(); i++) {
        if (data_vector[i] > 1) {
            newLength++;
        }
    }
    // Allocate once, then copy in the second pass.
    my_vector.reserve(newLength);
    for (i = 0; i < data_vector.size(); i++) {
        if (data_vector[i] > 1) {
            my_vector.push_back(data_vector[i]);
        }
    }
    // Do stuff with my_vector and change data_vector
}
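For reference, the same two-pass idea can be written with standard algorithms (a sketch; the function and parameter names are mine):

#include <algorithm>
#include <iterator>
#include <vector>

// Count the matches first, reserve exactly once, then copy.
void rebuild(const std::vector<float>& data_vector,
             std::vector<float>& my_vector) {
    my_vector.clear();
    my_vector.reserve(std::count_if(data_vector.begin(), data_vector.end(),
                                    [](float x) { return x > 1; }));
    std::copy_if(data_vector.begin(), data_vector.end(),
                 std::back_inserter(my_vector),
                 [](float x) { return x > 1; });
}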
I doubt allocating my_vector is the problem, especially if the while loop is executed many times as the capacity of my_vector should quickly become sufficient.
But to be sure you can just reserve capacity in my_vector corresponding to the size of data_vector:
my_vector.reserve(data_vector.size());
while (some_condition_is_true) {
    my_vector.clear();
    for (auto value : data_vector) {
        if (value > 1)
            my_vector.push_back(value);
    }
}
If you are on Linux you can reserve memory for my_vector to prevent the std::vector reallocations, which are the bottleneck in your case. Note that reserve will not waste memory, due to overcommit, so any rough upper estimate for the reserve value will fit your needs. In your case the size of data_vector will be enough. This line of code before the while loop should fix the bottleneck:
my_vector.reserve(data_vector.size());

Delete, Free, or Deallocate?

I'm running into a problem where I use too much memory on the stack. I'm using several large arrays that I only need between steps in my code. Basically I need to know how to release the memory used by an array variable that's created as:
float arrayName[length][width];
To intentionally release some automatic storage (items on the 'stack'), you can do the following: simply limit the scope of your variables.
Change code from:
// ...
float arrayName[length][width];
// ...
to:
// ...
{
    float arrayName[length][width];
    // use arrayName here
    // ... still in scope
} // scope limit
// all of arrayName released from stack

{
    // stack is available for other use, so try
    uint32_t u32[3][length][width];
    // use u32 here
    // ... still in scope
} // scope ended
// all of u32 released from stack

// better yet, use std::vector or another container
std::vector<uint32_t> bigArray;
NOTE: a vector itself uses only a small, fixed amount of stack (24 bytes on my system), regardless of how many elements you put into it!
You should use vectors for things like this. It is a part of the C++ standard library and is very optimized in most implementations. The memory taken up by the vector will automatically get released when the vector goes out of scope. So you will never have to free up the memory yourself.
Another benefit with using a vector is that you do not have to worry about running out of stack space since all the "array" memory taken up by the vector is located on the heap of the program.
For examples, see http://en.cppreference.com/w/cpp/container/vector/vector
Other than that, if you think your program's memory is never going to be enough, you should consider using the disk as another storage mechanism. Databases work this way: they store most of their data on disk.
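As a sketch of that advice applied to the asker's array (the dimensions here are placeholders):

#include <cstddef>
#include <vector>

int main() {
    const std::size_t length = 1000, width = 1000;
    // Heap-backed replacement for float arrayName[length][width]:
    std::vector<std::vector<float>> arrayName(
        length, std::vector<float>(width, 0.0f));
    // ... use arrayName[i][j] here ...
} // memory is released automatically when arrayName goes out of scope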
You won't need any special statements.
The array will be released on function return or on exiting its scope if it is a local variable with automatic storage duration, or on exiting the program if it is a static variable (declared outside functions).
You may want to allocate the memory on the heap if you are running into a situation where you are running out of memory on the stack. In this case you'll want to new up the array.
float** my_array = new float*[rowCount];
for (int i = 0; i < rowCount; ++i)
{
    my_array[i] = new float[columnCount];
}

// and delete it later
for (int i = 0; i < rowCount; ++i)
{
    delete[] my_array[i];
}
delete[] my_array;

C++ Block Allocator for creating new objects faster

I have a piece of code that creates thousand of objects, and appends them to a vector.
The following code is just an example of what is being done; even though the constructor has some parameters and the for loop does not actually have that condition, it serves the purpose of showing that it runs thousands of times.
vector<VolumeInformation*> volumes;
for (int i = 0; i < 5000; ++i) {
    VolumeInformation* info = new VolumeInformation();
    volumes.push_back(info);
}
The code takes a lot of time to run, and I was trying to find a faster way of creating all the objects. I read about block allocators, but I am unsure if this is really meant for what I am trying to do, and whether it really helps get this done faster. I would want to allocate memory for a thousand objects (for example), keep using that memory while it is still available, and then allocate some more when needed, avoiding having to allocate memory for a single object every time. Can this be done? Can you point me to an example of how to tell 'new' to use previously allocated memory? If not for the objects themselves, can an allocator be used for the memory of the vector (even though the objects are what really needs speeding up)?
Thank you.
** UPDATE **
After all the answers and comments, I decided to change the code so the vector stores the objects instead of pointers, letting me use reserve to pre-allocate memory for the vector and save time by allocating memory for several object instances at once. However, after some performance benchmarking, I found that the change performs much worse unless I know the exact final size of the vector ahead of time. Here are my findings; I was wondering if someone could shed light on why this happens, whether I am missing something, or whether the approach I was using before is really the best one.
Here is the code I used for benchmarking:
vector<int> v = vector<int>();
v.push_back(1);
v.push_back(3);
v.push_back(4);
v.push_back(5);
v.push_back(7);
v.push_back(9);

int testAmount = 200000;
int reserve = 500000;

Stopwatch w = Stopwatch();
vector<VolumeInformation> infos = vector<VolumeInformation>();
infos.reserve(reserve);
for (int i = 0; i < testAmount; ++i) {
    infos.emplace_back(&v, 1, 0, 0);
}
int elapsed = w.Elapsed();

w = Stopwatch();
vector<VolumeInformation*> infoPointers = vector<VolumeInformation*>();
infoPointers.reserve(reserve);
for (int i = 0; i < testAmount; ++i) {
    infoPointers.emplace_back(new VolumeInformation(&v, 1, 0, 0));
}
int elapsed2 = w.Elapsed();
If I comment out both reserve() lines, the version without pointers takes 32.701 seconds, while the pointer version takes 6.159 seconds! That is over 5 times less than using a vector of objects.
If I use reserve, but set the amount of items to reserve to a value lower than the number of iterations, the vector of objects version still takes more time than the pointer version.
If I use reserve with a value higher or equal to the amount of iterations, the vector of objects version becomes a lot faster, taking only 270ms, against 8.901 seconds of the pointer version. The main issue here is that I do not know in advance the size that the vector will reach, as the iterations are not based in a hardcoded number, this was only to do the benchmarking.
Can someone explain why this happens, if there is another way around this, or if I am making anything wrong here?
vector is perfectly capable of pre-allocating a large block and using it for all the elements, if you just use it correctly:
// create 5000 default-constructed X objects
std::vector<X> v(5000);
Or if you need to pass constructor arguments:
std::vector<X> v;
v.reserve(5000); // allocate a block of memory for 5000 objects
for (int i = 0; i < 5000; ++i)
    v.emplace_back(arg1, arg2, i % 2 ? arg3 : arg4);
The last line constructs an X in the pre-allocated memory, with no copying, passing the function arguments to the X constructor.
I would want to allocate memory for a thousand objects (for example), and keep on using that memory while it is still available, and then allocate some more when needed, avoiding having to allocate memory for a single object every time.
std::vector does that automatically, you should probably stop using new and just have a vector<VolumeInformation> and put objects into it directly, instead of allocating individual objects and storing pointers to them.
Memory allocation is slow (see Why should C++ programmers minimize use of 'new'?), so stop allocating individual objects. Both the examples above will do 1 allocation, and 5000 constructor calls. Your original code does at least 5001 allocations and 5000 constructor calls (in typical C++ implementations it would do 5013 allocations and 5000 constructor calls).
** UPDATE **
If I comment out both reserve() lines, the version without pointers takes 32.701 seconds, while the pointer version takes 6.159! It takes 5+ times less than using a vector of objects.
Since you haven't actually shown a complete working program, you're asking people to guess (always show the actual code!), but it suggests your class has a very slow copy constructor, which is used when the vector grows and the existing elements need to be copied over to the new memory (the old elements are then destroyed).
If you can add a noexcept move constructor that is more efficient than the copy constructor then std::vector will use that when the vector needs to grow and will run much faster.
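A minimal sketch of that pattern (the member here is hypothetical; the point is the noexcept move operations):

#include <vector>

class VolumeInformation {
public:
    VolumeInformation() = default;
    VolumeInformation(const VolumeInformation&) = default;
    VolumeInformation& operator=(const VolumeInformation&) = default;

    // noexcept move operations let std::vector move elements instead of
    // copying them when it reallocates during growth.
    VolumeInformation(VolumeInformation&&) noexcept = default;
    VolumeInformation& operator=(VolumeInformation&&) noexcept = default;

private:
    std::vector<double> samples_; // hypothetical expensive-to-copy member
};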
The main issue here is that I do not know in advance the size that the vector will reach, as the iterations are not based in a hardcoded number, this was only to do the benchmarking.
You could just reserve more elements than you are ever likely to need, trading higher memory usage for better performance.
You probably want to reserve space for your 5000 elements ahead of the loop:
vector.reserve(5000);
for (int i = 0; i < 5000; ++i) {
    VolumeInformation* info = new VolumeInformation();
    vector.push_back(info);
}
This could save time by eliminating several resizes as the vector grows, especially if VolumeInformation costs a lot (in time) to copy.

Does defining a large array in a for-loop compromise performance?

Code looks like this:
for (int i = 0; i <= LARGE_NUMBER; ++i) {
    int x[LARGE_NUMBER] = {0};
    // do something with x
}
I think the array x will be created each time the for-loop runs through 0 to LARGE_NUMBER, so will this compromise performance? Will -O2 help?
Your array will be zeroed on each iteration, so it definitely will.
This code is linear time, every element of the array will be zero-initialized:
int x[LARGE_NUMBER] = {0};
And this is constant time, just an adjustment of the stack pointer:
int x[LARGE_NUMBER];
Performance will depend on whether LARGE_NUMBER is really large or not. If LARGE_NUMBER is the size of one or two cache lines, you won't notice a difference between the first and second versions. But if LARGE_NUMBER is really large, you will. And if your array is so large that the performance difference is noticeable, then you definitely need to move it to the heap. Stack space is expensive, and allocating megabytes of data there is wrong.
If your array is really large, you can allocate it on the heap once and call memset between iterations.
How large is LARGE_NUMBER expected to be?
Consider that an on-stack allocation of a wide object can be larger than the stack space the system gives a thread, so you will probably face an "out of memory" problem even before performance becomes an issue. (The stack needs to be fast, and hence is no more than a few megabytes: ideally it should fit in the processor cache.)
If that's the case, a std::vector (which lives on the stack, but manages its allocation from the heap) plays better.
But defining it inside the loop makes it be created / destroyed on every iteration. Now: do those creations / destructions make sense (I mean: do they take some action that is meaningful to repeat every time), or is your problem just to initialize to zero on every iteration? If that's the case, I would probably do something like:
{ // just begin a scope block
    std::vector<int> v(LARGE_NUMBER); // allocates LARGE_NUMBER ints on the heap
    for (int i = 0; i <= LARGE_NUMBER; ++i)
    {
        std::fill(v.begin(), v.end(), 0); // reset at every iteration
        // other stuff with v
    }
} // here the vector and associated memory is finally eliminated
Note that the performance of std::fill is linear, just like the initialization of an array, but this way you avoid allocating/deallocating on every cycle.
In any case, your problem has O(n²) complexity by definition: LARGE_NUMBER iterations, each touching LARGE_NUMBER elements.
It depends on your application... I assume that you have a fixed array size? You could use memset (http://www.cplusplus.com/reference/cstring/memset/):
#include <string.h>

int* x = new int[LARGE_NUMBER];
for (int i = 0; i <= LARGE_NUMBER; ++i) {
    memset(x, 0, LARGE_NUMBER * sizeof(int)); // size is in bytes, not elements
    // do something with x
}
// some more stuff that needs x
delete[] x;
BTW: I have no C/C++ at hand to test the code.