Changing the reserved memory of a C++ vector

I have a vector with 1000 "nodes":
if(count + 1 > m_listItems.capacity())
m_listItems.reserve(count + 100);
The problem is I also clear it out when I'm about to refill it.
m_listItems.clear();
The capacity doesn't change.
I've used resize(1), but that doesn't seem to alter the capacity.
So how does one shrink the reserved capacity?

vector<Item>(m_listItems).swap(m_listItems);
will shrink m_listItems again: http://www.gotw.ca/gotw/054.htm (Herb Sutter)
If you want to clear it anyway, swap with an empty vector:
vector<Item>().swap(m_listItems);
which of course is far more efficient. (Note that swapping vectors basically just swaps a few pointers; nothing time-consuming is going on.)
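A minimal sketch of both idioms, using int in place of Item (the printed capacities are implementation-dependent):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v(1000);
    v.erase(v.begin() + 10, v.end());   // shrink the size down to 10
    std::cout << v.capacity() << '\n';  // still >= 1000

    std::vector<int>(v).swap(v);        // shrink-to-fit: trimmed copy, then swap
    std::cout << v.capacity() << '\n';  // now about 10

    std::vector<int>().swap(v);         // clear-and-free: swap with an empty vector
    std::cout << v.capacity() << '\n';  // typically 0
    return 0;
}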

You can swap the vector as others have suggested, and as described in http://www.gotw.ca/gotw/054.htm, but be aware that it is not free: you're performing a copy of every element, because the vector has to allocate a new, smaller chunk of memory and copy all the old contents over. (The swap operation itself is essentially free, but you're swapping with a temporary initialized with a copy of the original vector's data, which is not free.)
If you know in advance how big the vector is, you should allocate the right size to begin with, so no resizing is necessary:
std::vector<foo> v(1000); // Create a vector of 1000 default-constructed elements (size and capacity are both 1000)
And if you don't know the capacity in advance, why does it matter if it wastes a bit of space? Is it worth the time spent copying every element to a new, smaller vector (which is what std::vector<foo>(v).swap(v) does) just to save a few kilobytes of memory?
Similarly, when you clear the vector, if you intend to refill it anyway, setting its capacity to zero seems to be an impressive waste of time.
Edit:
baash05: what if you had 1000000 items and 10 megs of ram. would you say reducing the amount of overhead is important?
No. Resizing the vector requires more memory, temporarily, so if you're memory-limited, that might break your app. (You have to have the original vector in memory, and the temporary, before you can swap them, so you end up using up to twice as much RAM at that point). Afterwards, you might save a small amount of memory (up to a couple of MB), but this doesn't matter, because the excess capacity in the vector would never be accessed, so it would get pushed to the pagefile, and so not count towards your RAM limit in the first place.
If you have 1000000 items, then you should initialize the vector to the correct size in the first place.
And if you can't do that, then you'll typically be better off leaving the capacity alone. Especially since you stated that you're going to refill the vector, you should definitely reuse the capacity that has already been allocated, rather than allocating, reallocating, copying and freeing everything constantly.
You have two possible cases: either you know how many elements you need to store, or you don't. If you know, create the vector with the correct size in the first place, so it never needs to resize. If you don't, you might as well keep the excess capacity, so at least the vector won't have to grow again when you refill it.
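A minimal sketch of that reuse (illustrative; in practice clear() keeps the allocated storage, though the exact capacity is implementation-dependent):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> items;
    items.reserve(1000);                   // one up-front allocation
    for (int i = 0; i < 1000; ++i)
        items.push_back(i);

    items.clear();                         // destroys the elements...
    std::cout << items.capacity() << '\n'; // ...but the storage is kept (>= 1000)

    for (int i = 0; i < 1000; ++i)         // refill: the old storage is reused,
        items.push_back(i);                // so no reallocation happens
    return 0;
}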

You could try this technique (described in http://www.gotw.ca/gotw/054.htm):
std::vector< int > v;
// ... fill v with stuff...
std::vector< int >().swap( v );

You can swap it with a new vector that has the desired capacity.
vector< int > tmp;
old.swap( tmp );

As far as I can tell, you can't reallocate a vector to a lower capacity than it has ever reached; you can only make it larger. There are good reasons for this; among them is that reallocation is computationally expensive. If you really need a smaller vector, free the old one and create a new one that's smaller. That's actually computationally much simpler than having the vector shrink in place.
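A minimal sketch of that free-and-recreate approach (for what it's worth, C++11 later added vector::shrink_to_fit(), which makes the same request directly but is non-binding):

#include <vector>

int main() {
    std::vector<int> v(1000000);
    v.resize(10);                                 // capacity stays at 1000000

    // Build a right-sized copy and swap it in; the old buffer is freed:
    std::vector<int>(v.begin(), v.end()).swap(v); // capacity is now about 10

    // C++11 alternative, a direct but non-binding request:
    // v.shrink_to_fit();
    return 0;
}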

Related

std::ifstream read into vector results in 0 sized vector [duplicate]

I am pre-allocating some memory for a vector member variable. The code below is the minimal part:
class A {
    vector<string> t_Names;
public:
    A() : t_Names(1000) {}
};
Now, at some point, if t_Names.size() equals 1000, I intend to increase the size by 100; then if it reaches 1100, increase by 100 again, and so on.
My question is: what should I choose between vector::resize() and vector::reserve()? Is there a better choice in this kind of scenario?
Edit: I have a fairly precise estimate for t_Names: around 700 to 800. However, in certain (rare) situations it can grow beyond 1000.
The two functions do vastly different things!
The resize() method (and passing an argument to the constructor is equivalent to that) will insert or delete the appropriate number of elements to bring the vector to the given size (it has an optional second argument to specify their value). It affects size(): iteration will go over all those elements, push_back will insert after them, and you can access them directly using operator[].
The reserve() method only allocates memory and leaves it uninitialized. It affects only capacity(); size() is unchanged. There are no values in the new storage, because nothing has been added to the vector. If you then insert elements, no reallocation will happen, because it was done in advance; but that is the only effect.
So it depends on what you want. If you want an array of 1000 default items, use resize(). If you want an array into which you expect to insert 1000 items and want to avoid a couple of allocations, use reserve().
EDIT: Blastfurnace's comment made me read the question again and realize that in your case the correct answer is: don't preallocate manually. Just keep inserting elements at the end as needed. The vector will automatically reallocate as necessary, and will do it more efficiently than the manual way mentioned. The only case where reserve() makes sense is when you have a reasonably precise estimate of the total size, easily available in advance.
EDIT2: Regarding the question edit: if you have an initial estimate, then reserve() that estimate. If it turns out to be not enough, just let the vector do its thing.
resize() not only allocates memory, it also creates as many instances as the desired size you pass to it as an argument. But reserve() only allocates memory; it doesn't create instances. That is,
#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<int> v1;
    v1.resize(1000);                          // allocation + instance creation
    cout << (v1.size() == 1000) << endl;      // prints 1
    cout << (v1.capacity() == 1000) << endl;  // prints 1

    vector<int> v2;
    v2.reserve(1000);                         // only allocation
    cout << (v2.size() == 1000) << endl;      // prints 0
    cout << (v2.capacity() == 1000) << endl;  // prints 1
}
Output (online demo):
1
1
0
1
So resize() may not be desirable if you don't want the default-constructed objects; it will be slower as well. Besides, if you push_back() new elements onto it, the vector will grow past its current size() by allocating new memory (which also means moving the existing elements to the newly allocated space). If you have used reserve() at the start to ensure there is already enough allocated memory, the size() of the vector will still increase as you push_back(), but it will not allocate new memory again until it runs out of the space you reserved.
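A minimal sketch of that last point (C++11, for data(); the buffer address only changes when a reallocation happens):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    v.reserve(1000);                            // one allocation up front
    const int* before = v.data();
    for (int i = 0; i < 1000; ++i)
        v.push_back(i);                         // size() grows, no reallocation
    std::cout << (v.data() == before) << '\n';  // prints 1: the buffer never moved
    return 0;
}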
From your description, it looks like you want to "reserve" the allocated storage space of the vector t_Names.
Note that resize() initializes the newly allocated elements, whereas reserve() just allocates without constructing. Hence, reserve() is much faster than resize().
You can refer to the documentation regarding the difference between resize() and reserve().
Use reserve() when you do not want the objects to be initialized at reservation time. Also, you may prefer to logically distinguish the allocated count from the in-use count, which resize() conflates. So there is a behavioral difference in the interface: after a reserve() the vector represents the same number of elements, whereas after a resize() it would be 100 elements larger in your scenario.
Is there any better choice in this kind of scenario?
It depends entirely on your aims when fighting the default behavior. Some people favor customized allocators, but we really need a better idea of what you are attempting to solve in your program to advise you well.
FWIW, many vector implementations simply double the allocated element count when they must grow. Are you trying to minimize peak allocation size, or to reserve enough space for some lock-free program, or something else?
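A minimal sketch to observe your implementation's growth policy (the factor is implementation-defined; 1.5x and 2x are common):

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    std::size_t last = v.capacity();
    for (int i = 0; i < 10000; ++i) {
        v.push_back(i);
        if (v.capacity() != last) {  // a reallocation just happened
            last = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << last << '\n';
        }
    }
    return 0;
}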

How much memory is really used by a vector to which objects are added and then removed?

I have a vector to which I keep adding objects (thousands). Once I am done with the objects, I remove them from the vector.
So, if I added 1000 objects to my vector and then removed 900, how much memory is my vector really using? Is it the total amount of memory ever reserved?
EDIT: How does one reclaim memory, so that the amount of memory used is only for the number of objects actually stored in the vector?
EDIT: I understand that removing items does not reduce capacity, and adding items in the future won't require reallocating memory as long as I stay within capacity. But in my scenario, I know that after a certain point in my app's lifecycle I will not be adding to this vector, so it makes sense to try to reclaim the memory.
Removing elements from a vector does not necessarily change its capacity (the number of elements that fit in its allocated storage, which is not the same as its size). So it is possible that the capacity remains 1000 after removing 900 elements.
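A minimal sketch of the scenario, plus the reclaiming idiom discussed in the other answers (exact capacities are implementation-dependent):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v(1000);
    v.erase(v.begin() + 100, v.end());  // keep 100 elements, remove 900
    std::cout << v.capacity() << '\n';  // still >= 1000: the storage is kept

    std::vector<int>(v).swap(v);        // copy-and-swap: trims to about size()
    std::cout << v.capacity() << '\n';  // now about 100
    return 0;
}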

vector reserve c++

I have a very large multidimensional vector that changes in size all the time.
Is there any point in using the vector.reserve() function when I only know a good approximation of the sizes?
So basically I have a vector
A[256*256][x][y]
where x goes from 0 to 50 for every iteration in the program and then back to 0 again. The y values can differ every time, which means that for each of the [256*256][x] elements the innermost vector can have a different size, but still smaller than 256.
So to clarify my problem this is what I have:
vector<vector<vector<int>>> A;
for (int i = 0; i < 256*256; i++) {
    A.push_back(vector<vector<int>>());
    A[i].push_back(vector<int>());
    A[i][0].push_back(SOME_VALUE);
}
Add elements to the vector...
A.clear();
And after this I do the same thing again from the top.
When and how should I reserve space for the vectors?
If I have understood this correctly, using reserve would save me a lot of time, since I change the sizes all the time. Is that right?
What would be the negative/positive sides of reserving the maximum size my vector can have, which would be [256*256][50][256] in some cases?
BTW. I am aware of different Matrix Templates and Boost, but have decided to go with vectors on this one...
EDIT:
I was also wondering how to use the reserve function with multidimensional vectors.
If I only reserve the vector in two dimensions, will it then copy the whole thing if I exceed its capacity in the third dimension?
To help with discussion you can consider the following typedefs:
typedef std::vector<int> int_t; // internal vector
typedef std::vector<int_t> mid_t; // intermediate
typedef std::vector<mid_t> ext_t; // external
The cost of growing (a capacity increase in) an int_t affects only the contents of that particular vector and no other element. The cost of growing a mid_t requires copying all of the elements stored in that vector, i.e. all of its int_t vectors, which is considerably more costly. The cost of growing the ext_t is huge: it requires copying every element already stored in the container.
Now, to increase performance, it is most important to get the ext_t size correct (it seems fixed at 256*256 in your question), and then to get the intermediate mid_t size correct, so that expensive reallocations are rare.
The amount of memory you are talking about is huge, so you might want to consider less standard ways to solve your problem. The first thing that comes to mind is adding an extra level of indirection: if instead of holding the actual vectors you hold smart pointers to the vectors, you can reduce the cost of growing the mid_t and ext_t vectors (if the ext_t size is fixed, just use a vector of mid_t). This will make code that uses your data structure more complex (or, better, add a wrapper that takes care of the indirection). Each int_t vector will be allocated once in memory and will never move in either mid_t or ext_t reallocations. The cost of reallocating a mid_t is proportional to the number of allocated int_t vectors, not to the actual number of inserted integers.
using std::tr1::shared_ptr; // or boost::shared_ptr
typedef std::vector<int> int_t;
typedef std::vector< shared_ptr<int_t> > mid_t;
typedef std::vector< shared_ptr<mid_t> > ext_t;
Another thing you should take into account is that std::vector::clear() does not free the allocated internal storage of the vector; it only destroys the contained objects and sets the size to 0. That is, calling clear() will never release memory. The pattern for actually releasing the allocated memory of a vector is:
typedef std::vector<...> myvector_type;
myvector_type myvector;
...
myvector_type().swap( myvector ); // swap with a default-constructed temporary
Whenever you push a vector into another vector, set the size in the pushed vector's constructor:
A.push_back(vector<vector<int> >( somesize ));
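Putting this together, a minimal sketch of sizing the structure once and reusing it across iterations (the dimensions are from the question; the mid-level reserve figure follows the stated x range):

#include <cstddef>
#include <vector>

int main() {
    const std::size_t OUTER = 256 * 256;

    // Fix the outer dimension once; it never reallocates afterwards.
    std::vector<std::vector<std::vector<int>>> A(OUTER);
    for (auto& mid : A)
        mid.reserve(50);   // x runs 0..50 per iteration

    // ... fill A and use it ...

    // Between iterations, drop the elements but keep the outer allocations:
    for (auto& mid : A)
        mid.clear();       // mid's buffer is kept; its inner vectors are destroyed
    return 0;
}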
You have a working implementation but are concerned about the performance. If your profiling shows it to be a bottleneck, you can consider using a naked C-style array of integers rather than the vector of vectors of vectors.
See how-do-i-work-with-dynamic-multi-dimensional-arrays-in-c for an example
You can re-use the same allocation each time, reallocating as necessary and eventually keeping it at the high-water mark of usage.
If the vectors are indeed the bottleneck, then beyond avoiding the per-iteration sizing operations, performance will likely be dominated by your access pattern into the array. Try to access the highest-order dimension sequentially.
If you know the size of a vector at construction time, pass the size to the c'tor and assign using operator[] instead of push_back. If you're not totally sure about the final size, make a guess (maybe add a little bit more) and use reserve to have the vector reserve enough memory upfront.
What would be the negative/positive sides of reserving the maximum size my vector can have which would be [256*256][50][256] in some cases.
Negative side: potential waste of memory. Positive side: less CPU time, less heap fragmentation. It's a memory/CPU trade-off; the optimal choice depends on your application. If you're not memory-bound (on most consumer machines there's more than enough RAM), consider reserving upfront.
To decide how much memory to reserve, look at the average memory consumption, not at the peak (reserving 256*256*50*256 is not a good idea unless such dimensions are needed regularly).

Why don't resize and clear work in GotW 54?

Referring to the article GotW #54 by Herb Sutter, he explains "The Right Way To 'Shrink-To-Fit' a vector or deque" and "The Right Way to Completely Clear a vector or deque".
Can we just use container.resize() and container.clear() for the above tasks, or am I missing something?
There are two different things that a vector tracks: size vs. capacity. If you just resize the vector, there is no guarantee that the capacity (how much memory is reserved) will change. resize is an operation concerned with how much of the vector you are using, not with how much the vector has reserved.
So, for example:
size == how much you are using
capacity == how much memory is reserved
vector<int> v(10);
v.resize(5); // size == 5, but capacity may or may not change
v.clear();   // size == 0, but capacity may or may not change
In the end, the capacity should not change on every operation, because that would incur a lot of memory allocation/deallocation overhead. He is saying that if you need to "deallocate" the memory reserved by the vector, do it as the article describes.
Neither resize() nor clear() work. The .capacity() of a vector is guaranteed to be at least as big as the current size() of the vector, and guaranteed to be at least as big as the reserve()d capacity. Also, this .capacity() doesn't shrink, so it is also at least as big as any previous size() or reserve()ation.
Now, the .capacity() of a vector is merely the memory it has reserved. Often not all of that memory contains objects. Resizing removes objects, but doesn't recycle the memory. A vector can only recycle its memory buffer when allocating a larger buffer.
The swap trick works by copying all objects to a smaller, more appropriate memory buffer. Afterwards, the original memory buffer can be recycled. This appears to violate the previous statement that the memory buffer of a vector can only grow; however, with the swap trick you temporarily have two vectors.
The vector has a size and a capacity: it may hold X elements but have uninitialized memory in store for Y more. In a typical implementation, erase, resize (when resizing to a smaller size), and clear don't affect the capacity: the vector keeps the memory around for itself, in case you want to add new items to it at a later time.
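A minimal sketch of that behavior (the capacity values are implementation-dependent, but in practice none of these operations shrinks the buffer):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v(100);
    std::cout << v.size() << ' ' << v.capacity() << '\n'; // 100 100 (typically)

    v.resize(10);
    std::cout << v.size() << ' ' << v.capacity() << '\n'; // 10 100

    v.erase(v.begin(), v.begin() + 5);
    std::cout << v.size() << ' ' << v.capacity() << '\n'; // 5 100

    v.clear();
    std::cout << v.size() << ' ' << v.capacity() << '\n'; // 0 100
    return 0;
}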