Benefits of using reserve() in a vector - C++

What is the benefit of using reserve() when dealing with vectors? When should I use it? I couldn't find a clear-cut answer on this, but I assume it is faster to reserve in advance before filling the vector.
What say you, people smarter than I?

It's useful if you have an idea how many elements the vector will ultimately hold - it can help the vector avoid repeatedly allocating memory (and having to move the data to the new memory).
In general it's a potential optimization that you shouldn't need to worry about, but it's not harmful either (at worst you end up wasting memory if you overestimate).
One area where it can be more than an optimization is when you want to ensure that existing iterators do not get invalidated by adding new elements.
For example, a push_back() call may invalidate existing iterators to the vector (if a reallocation occurs). However if you've reserved enough elements you can ensure that the reallocation will not occur. This is a technique that doesn't need to be used very often though.
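The invalidation point can be checked with a small sketch. The function name first_element_stays_put is made up for illustration; it relies only on the guarantee that no reallocation happens while the size stays within the reserved capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends `extra` elements after reserving room for them and reports whether
// a pointer to the first element is still valid (i.e. no reallocation happened).
bool first_element_stays_put(std::size_t extra) {
    std::vector<int> v;
    v.reserve(extra + 1);            // room for the first element plus `extra` more
    v.push_back(42);
    const int* first = v.data();     // would dangle after a reallocation
    for (std::size_t i = 0; i < extra; ++i) {
        v.push_back(static_cast<int>(i));
    }
    assert(v.size() == extra + 1);
    return first == v.data() && *first == 42;
}
```

Without the reserve() call the same pointer comparison may or may not hold, depending on when the implementation decides to grow the buffer, which is exactly why the pointer must not be relied on in that case.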

It can be, especially if you are going to be adding a lot of elements to your vector over time and you want to avoid the automatic memory expansion that the container performs when it runs out of available slots.
For instance, back-insertions (i.e., std::vector::push_back) are considered an amortized O(1), or constant-time, process. The "amortized" matters: if an insertion is made at the back of a vector that is out of space, the vector must allocate memory for a new array of elements, copy the old elements into the new array, and only then copy in the element you were trying to insert. That step is O(N), or linear-time, and for a large vector it could take quite a bit of time. Using reserve() allows you to pre-allocate memory for the vector if you know it's going to reach at least a certain size, and so avoid reallocating every time space runs out. This is especially useful for back-insertions inside performance-critical code, where you want each insertion to remain an actual O(1) operation and not incur a hidden reallocation of the array. Granted, your copy constructor would have to be O(1) as well to get true O(1) complexity for the entire back-insertion, but as far as the container's own algorithm is concerned, you can keep it at a known complexity if the memory for the slot is already allocated.
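A rough way to see the hidden reallocations is to watch capacity() change while pushing. This is a minimal sketch; count_reallocations is a hypothetical helper, and the count without reserve() depends on the library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often the capacity changes while n elements are pushed back.
// Each change corresponds to one reallocation of the underlying array.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);                      // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;               // push_back had to grow the buffer
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

With reserve_first set to true the count is zero, so every push_back in the loop is a plain O(1) append. Without it, the count is small relative to n (growth is geometric) but each of those steps copies the whole array.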

This excellent article deeply explains differences between deque and vector containers. Section "Experiment 2" shows the benefits of vector::reserve().

If you know the eventual size of the vector then reserve is worth using.
Otherwise, whenever the vector runs out of internal room it will reallocate the buffer. This usually involves growing the internal buffer to 2 (or 1.5) times its current size, which can be expensive if it happens a lot.
The really expensive bit is invoking the copy constructor on each element to copy it from the old buffer to the new buffer, followed by calling the destructor on each element in the old buffer.
If the copy constructor is expensive then it can be a problem.
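To make the copy cost visible, here is a sketch with a made-up element type, Counted, whose copy constructor bumps an external counter. Because it declares a copy constructor and no move constructor, the vector has to copy elements when it reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type: every copy construction increments *copies.
struct Counted {
    int* copies;
    explicit Counted(int* counter) : copies(counter) {}
    Counted(const Counted& other) : copies(other.copies) { ++*copies; }
};

// Returns how many copy constructions happened while filling the vector.
int copies_while_filling(std::size_t n, bool reserve_first) {
    int copies = 0;
    {
        std::vector<Counted> v;
        if (reserve_first) {
            v.reserve(n);
        }
        for (std::size_t i = 0; i < n; ++i) {
            v.emplace_back(&copies);   // constructed in place, not copied
        }
        assert(v.size() == n);
    }
    return copies;
}
```

With reserve the count stays at zero. Without it, every growth step copies all existing elements, so the total is implementation-dependent but typically on the order of n.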

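Faster and saves memory.
If you push_back another element onto a full vector, it will typically allocate double the memory it's currently using, since allocating and copying is expensive and it tries to do so rarely. Reserving the size you need avoids both the copies and the over-allocation.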

Don't know about people smarter than you, but I would say that you should call reserve in advance if you are going to perform lots of insertion operations and you already know or can estimate the total number of elements, at least the order of magnitude. It can save you a lot of reallocations in good circumstances.

Although it's an old question, here is my implementation to show the difference.
#include <iostream>
#include <chrono>
#include <vector>
using namespace std;
int main(){
vector<int> v1;
chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
for(int i = 0; i < 1000000; ++i){
v1.push_back(1);
}
chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
chrono::duration<double> time_first = chrono::duration_cast<chrono::duration<double>>(t2-t1);
cout << "Time for 1000000 insertion without reserve: " << time_first.count() * 1000 << " milliseconds." << endl;
vector<int> v2;
v2.reserve(1000000);
chrono::steady_clock::time_point t3 = chrono::steady_clock::now();
for(int i = 0; i < 1000000; ++i){
v2.push_back(1);
}
chrono::steady_clock::time_point t4 = chrono::steady_clock::now();
chrono::duration<double> time_second = chrono::duration_cast<chrono::duration<double>>(t4-t3);
cout << "Time for 1000000 insertion with reserve: " << time_second.count() * 1000 << " milliseconds." << endl;
return 0;
}
When you compile and run this program, it outputs:
Time for 1000000 insertion without reserve: 24.5573 milliseconds.
Time for 1000000 insertion with reserve: 17.1771 milliseconds.
There seems to be some improvement with reserve, but not that much. I think the improvement would be larger for complex objects, but I am not sure. Any suggestions, changes and comments are welcome.

It's always useful to know the total space needed before requesting any from the system, so that you only have to request space once. Otherwise the system may have to move your data to a larger free zone (this is optimized, but it is not always free, because a copy of all the data is required). Even the compiler will try to help you, but the best approach is to tell it what you know, i.e. to reserve the total space your process requires. That's what I think. Greetings.

There is one more advantage of reserve that is not much related to performance but instead to code style and code cleanliness.
Imagine I want to create a vector by iterating over another vector of objects. Something like the following:
std::vector<int> result;
for (const auto& object : objects) {
result.push_back(object.foo());
}
Now, apparently the size of result is going to be the same as objects.size() and I decide to pre-define the size of result.
The simplest way to do it is in the constructor.
std::vector<int> result(objects.size());
But now the rest of my code is invalidated because the size of result is not 0 anymore; it is objects.size(). The subsequent push_back calls are going to increase the size of the vector. So, to correct this mistake, I now have to change how I construct my for-loop. I have to use indices and overwrite the corresponding memory locations.
std::vector<int> result(objects.size());
for (std::size_t i = 0; i < objects.size(); ++i) {
result[i] = objects[i].foo();
}
And I don't like it. Indices are everywhere in the code. This is also more vulnerable to making accidental copies because of the [] operator. This example uses integers and directly assigns values to result[i], but in a more complex for-loop with complex data structures, it could be relevant.
Coming back to the main topic, it is very easy to adjust the first code by using reserve. reserve does not change the size of the vector but only the capacity. Hence, I can leave my nice for loop as it is.
std::vector<int> result;
result.reserve(objects.size());
for (const auto& object : objects) {
result.push_back(object.foo());
}

Related

Performance impact when resizing vector within capacity

I have the following synthesized example of my code:
#include <vector>
#include <array>
#include <cstdlib>
#define CAPACITY 10000
int main() {
std::vector<std::vector<int>> a;
std::vector<std::array<int, 2>> b;
a.resize(CAPACITY, std::vector<int> {0, 0});
b.resize(CAPACITY, std::array<int, 2> {0, 0});
for (;;) {
size_t new_rand_size = (std::rand() % CAPACITY);
a.resize(new_rand_size);
b.resize(new_rand_size);
for (size_t i = 0; i < new_rand_size; ++i) {
a[i][0] = std::rand();
a[i][1] = std::rand();
b[i][0] = std::rand();
b[i][1] = std::rand();
}
process(a); // respectively process(b)
}
}
so obviously, the array version is better, because it requires less allocation, as the array is fixed in size and continuous in memory (correct?). It just gets reinitialized when up-resizing again within capacity.
Since I'm going to overwrite anyway, I was wondering if there's a way to skip initialization (e.g. by overwriting the allocator or similar) to optimize the code even further.
so obviously,
The word "obviously" is typically used to mean "I really, really want the following to be true, so I'm going to skip the part where I determine if it is true." ;) (Admittedly, you did better than most since you did bring up some reasons for your conclusion.)
the array version is better, because it requires less allocation, as the array is fixed in size and continuous in memory (correct?).
The truth of this depends on the implementation, but there is some validity here. I would go with a less micro-managementy approach and say that the array version is preferable because the final size is fixed. Using a tool designed for your specialized situation (fixed size array) tends to incur less overhead than using a tool for a more general situation. Not always less, though.
Another factor to consider is the cost of default-initializing the elements. When a std::array is constructed, all of its elements are constructed as well. With a std::vector, you can defer constructing elements until you have the parameters for construction. For objects that are expensive to default-construct, you might be able to measure a performance gain using a vector instead of an array. (If you cannot measure a difference, don't worry about it.)
When you do a comparison, make sure the vector is given a fair chance by using it well. Since the size is known in advance, reserve the required space right away. Also, use emplace_back to avoid a needless copy.
Final note: "contiguous" is a bit more accurate/descriptive than "continuous".
It just gets reinitialized when up-resizing again within capacity.
This is a factor that affects both approaches. In fact, this causes your code to exhibit undefined behavior. For example, let's suppose that your first iteration resizes the outer vector to 1, while the second resizes it to 5. Compare what your code does to the following:
std::vector<std::vector<int>> a;
a.resize(CAPACITY, std::vector<int> {0, 0});
a.resize(1);
a.resize(5);
std::cout << "Size " << a[1].size() <<".\n";
The output indicates that the size is zero at this point, yet your code would assign a value to a[1][0]. If you want each element of a to default to a vector of 2 elements, you need to specify that default each time you resize a, not just initially.
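The same shrink-then-grow sequence can be wrapped in a small function to compare the two ways of growing. This is only a sketch; inner_size_after_regrow is a name invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shrinks the outer vector to 1 element, grows it back to 5, and returns the
// size of the inner vector at index 1 (one of the re-created elements).
std::size_t inner_size_after_regrow(bool pass_default_on_regrow) {
    std::vector<std::vector<int>> a;
    a.resize(10, std::vector<int>{0, 0});    // every inner vector has 2 elements
    a.resize(1);                             // elements 1..9 are destroyed
    if (pass_default_on_regrow) {
        a.resize(5, std::vector<int>{0, 0}); // new elements copy the 2-element default
    } else {
        a.resize(5);                         // new elements are empty vectors
    }
    assert(a.size() == 5);
    return a[1].size();
}
```

Calling it with false returns 0, which is the case where writing to a[1][0] would be out of bounds. Calling it with true returns 2.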
Since I'm going to overwrite anyway, I was wondering if there's a way to skip initialization (e.g. by overwriting the allocator or similar) to optimize the code even further.
Yes, you can skip the initialization. In fact, it is advisable to do so. Use the tool designed for the task at hand. Your initialization serves to increase the capacity of your vectors. So use the method whose sole purpose is to increase the capacity of a vector: vector::reserve.
Another option, depending on the exact situation, might be to not resize at all. Start with an array of arrays, and track the last usable element in the outer array. This is sort of a step backwards in that you now have a separate variable for tracking the size, but if your real code has enough iterations, the savings from not calling destructors when the size decreases might make this approach worth it. (For cleaner code, write a class that wraps the array of arrays and that tracks the usable size.)
Since I'm going to overwrite anyway, I was wondering if there's a way to skip initialization
Yes: Don't resize. Instead, reserve the capacity and push (or emplace) the new elements.
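A minimal sketch of the reserve-and-push pattern for the std::array variant follows. The function name and the deterministic stand-in for std::rand() % CAPACITY are assumptions made so the example is reproducible. It also checks that clear() followed by refilling within the reserved capacity does not reallocate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Reserves once, then repeatedly clears and refills the vector with a varying
// number of elements. Returns true if the capacity never changed.
bool refill_keeps_capacity(std::size_t capacity, std::size_t rounds) {
    assert(capacity > 0);
    std::vector<std::array<int, 2>> b;
    b.reserve(capacity);                          // allocate, construct nothing
    const std::size_t reserved = b.capacity();
    for (std::size_t round = 0; round < rounds; ++round) {
        // Stand-in for std::rand() % CAPACITY; always below `capacity`.
        const std::size_t count = (round * 37) % capacity;
        b.clear();                                // size 0, storage kept
        for (std::size_t i = 0; i < count; ++i) {
            const int value = static_cast<int>(i);
            // Each element is constructed directly with its final values.
            b.push_back(std::array<int, 2>{{value, value + 1}});
        }
        if (b.size() != count || b.capacity() != reserved) {
            return false;
        }
    }
    return true;
}
```

Note that push_back is used here because std::array is an aggregate, so emplace_back(value, value + 1) would not compile before C++20.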

C++ Block Allocator for creating new objects faster

I have a piece of code that creates thousands of objects, and appends them to a vector.
The following code is just an example of what is being done, even though the constructor has some parameters, and the for does not actually have that condition, but it serves the purpose of showing that it runs thousands of times.
vector<VolumeInformation*> vector = vector<VolumeInformation*>();
for (int i = 0; i < 5000; ++i) {
VolumeInformation* info = new VolumeInformation();
vector.push_back(info);
}
The code takes a lot of time to run, and I was trying to find a faster way of creating all the objects. I read about block allocators, but I am unsure if this is really meant for what I am trying to do, and if it really helps on getting this done faster. I would want to allocate memory for a thousand objects (for example), and keep on using that memory while it is still available, and then allocate some more when needed, avoiding having to allocate memory for a single object every time. Can this be done? Can you point me to somewhere where I can find an example on how to tell 'new' to use the previously allocated memory? If not for the objects itself, can the allocator be used for the memory of the vector (even though the object is what really needs speeding up)?
Thank you.
** UPDATE **
After all the answers and comments, I decided to change the code so that the vector stores the objects instead of pointers. That way I could use reserve to pre-allocate memory for the vector and save time by allocating memory for several object instances at once. However, after some performance benchmarking, I found that the change performs much worse unless I know, ahead of time, the exact size of the vector. Here are my findings. I was wondering if someone could shed light on this and let me know why it happens, whether I am missing something, or whether the approach I was using before really is the best one.
Here is the code I used for benchmarking:
vector<int> v = vector<int>();
v.push_back(1);
v.push_back(3);
v.push_back(4);
v.push_back(5);
v.push_back(7);
v.push_back(9);
int testAmount = 200000;
int reserve = 500000;
Stopwatch w = Stopwatch();
w = Stopwatch();
vector<VolumeInformation> infos = vector<VolumeInformation>();
infos.reserve(reserve);
for (int i = 0; i < testAmount; ++i) {
infos.emplace_back(&v, 1, 0, 0);
}
int elapsed = w.Elapsed();
w = Stopwatch();
vector<VolumeInformation*> infoPointers = vector<VolumeInformation*>();
infoPointers.reserve(reserve);
for (int i = 0; i < testAmount; ++i) {
infoPointers.emplace_back(new VolumeInformation(&v, 1, 0, 0));
}
int elapsed2 = w.Elapsed();
If I comment out both reserve() lines, the version without pointers takes 32.701 seconds, while the pointer version takes 6.159! It takes 5+ times less than using a vector of objects.
If I use reserve, but set the amount of items to reserve to a value lower than the number of iterations, the vector of objects version still takes more time than the pointer version.
If I use reserve with a value higher or equal to the amount of iterations, the vector of objects version becomes a lot faster, taking only 270ms, against 8.901 seconds of the pointer version. The main issue here is that I do not know in advance the size that the vector will reach, as the iterations are not based in a hardcoded number, this was only to do the benchmarking.
Can someone explain why this happens, if there is another way around this, or if I am making anything wrong here?
vector is perfectly capable of pre-allocating a large block and using it for all the elements, if you just use it correctly:
// create 5000 default-constructed X objects
std::vector<X> v(5000);
Or if you need to pass constructor arguments:
std::vector<X> v;
v.reserve(5000); // allocate block of memory for 5000 objects
for (int i = 0; i < 5000; ++i) // v.size() is still 0 after reserve(), so count to 5000
v.emplace_back(arg1, arg2, i % 2 ? arg3 : arg4);
The last line constructs an X in the pre-allocated memory, with no copying, passing the function arguments to the X constructor.
I would want to allocate memory for a thousand objects (for example), and keep on using that memory while it is still available, and then allocate some more when needed, avoiding having to allocate memory for a single object every time.
std::vector does that automatically, you should probably stop using new and just have a vector<VolumeInformation> and put objects into it directly, instead of allocating individual objects and storing pointers to them.
Memory allocation is slow (see Why should C++ programmers minimize use of 'new'?), so stop allocating individual objects. Both the examples above will do 1 allocation, and 5000 constructor calls. Your original code does at least 5001 allocations and 5000 constructor calls (in typical C++ implementations it would do 5013 allocations and 5000 constructor calls).
** UPDATE **
If I comment out both reserve() lines, the version without pointers takes 32.701 seconds, while the pointer version takes 6.159! It takes 5+ times less than using a vector of objects.
Since you haven't actually shown a complete working program you're asking people to guess (always show the actual code!) but it suggests your class has a very slow copy constructor, which is used when the vector grows and the existing elements need to be copied over to the new memory (and the old elements are then destroyed).
If you can add a noexcept move constructor that is more efficient than the copy constructor then std::vector will use that when the vector needs to grow and will run much faster.
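The effect of noexcept on the move constructor can be observed with a sketch like the following. Item and Stats are hypothetical types written for this illustration; the template parameter only toggles whether the move constructor is declared noexcept.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Stats {
    int copies;
    int moves;
};

// Hypothetical element type that records copies and moves in an external Stats.
template <bool NoexceptMove>
struct Item {
    Stats* stats;
    explicit Item(Stats* s) : stats(s) {}
    Item(const Item& other) : stats(other.stats) { ++stats->copies; }
    Item(Item&& other) noexcept(NoexceptMove) : stats(other.stats) { ++stats->moves; }
};

// Fills a vector without reserve() so that it has to grow several times,
// then reports how the existing elements were transferred on each growth.
template <bool NoexceptMove>
Stats grow_without_reserve(std::size_t n) {
    Stats stats = {0, 0};
    {
        std::vector<Item<NoexceptMove>> v;
        for (std::size_t i = 0; i < n; ++i) {
            v.emplace_back(&stats);    // new element is built in place
        }
        assert(v.size() == n);
    }
    return stats;
}
```

With the noexcept move constructor, reallocation moves the elements and the copy count stays at zero. With a potentially throwing move constructor, the vector falls back to copying so that it can keep the strong exception guarantee.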
The main issue here is that I do not know in advance the size that the vector will reach, as the iterations are not based in a hardcoded number, this was only to do the benchmarking.
You could just reserve more elements than you are ever likely to need, trading higher memory usage for better performance.
You probably want to reserve space for your 5000 elements ahead of the loop:
vector.reserve(5000);
for (int i = 0; i < 5000; ++i) {
VolumeInformation* info = new VolumeInformation();
vector.push_back(info);
}
This could save time by eliminating several reallocations as the vector grows, particularly if VolumeInformation costs a lot (in time) to copy.

The fastest way to populate std::vector of unknown size

I have a long array of data (n entities). Every object in this array has some values (let's say, m values for an object). And I have a cycle like:
myType* A;
// reading the array of objects
std::vector<anotherType> targetArray;
int i, j, k = 0;
for (i = 0; i < n; i++)
for (j = 0; j < m; j++)
{
if (check(A[i].fields[j]))
{
// creating and adding the object to targetArray
targetArray[k] = someGenerator(A[i].fields[j]);
k++;
}
}
In some cases I have n * m valid objects, in some (n * m) /10 or less.
The question is: how do I allocate memory for targetArray? I can see three options:
1) Reserve the maximum and shrink afterwards:
targetArray.reserve(n*m);
// Do work
targetArray.shrink_to_fit();
2) Count the elements without generating objects, then allocate as much memory as I need and go through the cycle one more time.
3) Resize the array on every iteration where new objects are being created.
I see a huge tactical mistake in each of my methods. Is there another way to do it?
What you are doing here is called premature optimization. By default, std::vector will exponentially increase its capacity as it runs out of room to store new objects. For example, in some implementations the first push_back allocates room for 2 elements, the third push_back doubles the capacity, and so on. Just stick with push_back and get your code working.
You should start thinking about memory allocation optimization only when the above approach proves itself as a bottleneck in your design. If that ever happens, I think the best bet would be to come up with a good approximation for the number of valid objects and just call reserve() on the vector. Something like your first approach. Just make sure the shrinking step actually works, because vectors don't like to shrink: shrink_to_fit() is only a non-binding request, and before C++11 you had to use the swap trick.
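For reference, the swap approach mentioned above usually looks like the sketch below. The function name shrink_with_swap is invented for this example. The capacity of the copy is not specified by the standard, although common implementations allocate just enough for the copied elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Classic pre-C++11 idiom: copy the elements into a temporary vector, which
// typically gets a tight capacity, then swap buffers with the original.
void shrink_with_swap(std::vector<int>& v) {
    const std::size_t old_size = v.size();
    std::vector<int>(v).swap(v);   // the temporary takes the oversized buffer and frees it
    assert(v.size() == old_size);  // contents are preserved
}
```

In C++11 and later, v.shrink_to_fit() expresses the same intent more directly, with the caveat that the library is allowed to ignore the request.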
Resizing array on every step is no good and std::vector won't really do it unless you try hard.
Doing an extra cycle through the list of objects can help, but it may also hurt as you could easily waste CPU cycles, bloat CPU cache etc. If in doubt - profile it.
The typical way would be to use targetArray.push_back(). This reallocates the memory when needed and avoids two passes through your data. It has a system for reallocating the memory that makes it pretty efficient, doing fewer reallocations as the vector gets larger.
However, if your check() function is very fast, you might get better performance by going through the data twice, determining how much memory you need and making your vector the right size to begin with. I would only do this if profiling has determined it is really necessary though.
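The two-pass variant could be sketched as follows. The helpers is_valid and make_value are stand-ins invented here for check() and someGenerator(), and the input is a flat vector of ints rather than the nested structure in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for check(): accepts even numbers only.
bool is_valid(int x) { return x % 2 == 0; }

// Stand-in for someGenerator(): builds the stored value from the input.
int make_value(int x) { return x * x; }

// First pass counts the valid inputs, second pass fills a vector that was
// reserved to exactly that size, so no reallocation happens while filling.
std::vector<int> two_pass_collect(const std::vector<int>& input) {
    const std::size_t valid = static_cast<std::size_t>(
        std::count_if(input.begin(), input.end(), is_valid));
    std::vector<int> out;
    out.reserve(valid);
    for (int x : input) {
        if (is_valid(x)) {
            out.push_back(make_value(x));
        }
    }
    assert(out.size() == valid);
    return out;
}
```

Whether the extra pass pays off depends on how cheap the check is compared with the reallocations it avoids, so this is worth measuring before adopting it.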

Efficiency when populating a vector

Which would be more efficient, and why?
vector<int> numbers;
for (int i = 0; i < 10; ++i)
numbers.push_back(1);
or
vector<int> numbers(10,0);
for (int i = 0; i < 10; ++i)
numbers[i] = 1;
Thanks
The fastest would be:
vector <int> numbers(10, 1);
As for your two methods, usually the second one: although the first one avoids the initial zeroing of the vector in the constructor, the second allocates enough memory from the beginning, avoiding reallocation.
In a benchmark I did some time ago the second method won even when reserve was called before the loop, because the overhead of push_back (which has to check on each insert whether the capacity is enough for another item, and reallocate if necessary) still outweighed the zeroing overhead of the second method.
Note that this holds for primitive types. If you start to have objects with complicated copy constructors generally the best performing solution is reserve + push_back, since you avoid all the useless calls to the default constructor, which are usually heavier than the cost of the push_back.
In general the second one is faster because the first might involve one or more reallocations of the underlying array that stores the data. This can be alleviated with the reserve function like so:
vector<int> numbers;
numbers.reserve(10);
for (int i = 0; i < 10; ++i)
numbers.push_back(1);
This would be close in performance to your 2nd example, since reserve tells the vector to allocate enough space for all the elements you are going to add, so no reallocations occur in the for loop. However, push_back still has to check whether the vector's size exceeds its current capacity and increment the value indicating the size of the vector, so this will still be slightly slower than your 2nd example.
In general, probably the second, since push_back() may cause reallocations and resizing as you proceed through the loop, while in the second instance, you are pre-sizing your vector.
Use the second, and if you have iota available (C++11 has it, in <numeric>) use that instead of the for loop. Note that iota fills the vector with 0, 1, 2, ... rather than with ones.
std::vector<int> numbers(10);
std::iota(numbers.begin(), numbers.end(), 0);
The second one is faster because of preallocation of memory. In the first variant of code you could also use numbers.reserve(10); which will allocate some memory for you at once, and not at every iteration (maybe some implementation does more bulky reservation, but don't rely on this).
Also, you'd better use iterators rather than direct indexing, because iterator operations are more predictable and can be easily optimized.
#include <algorithm>
#include <vector>
using namespace std;
static const size_t N_ELEMS = 10;
void some_func() {
vector<int> numbers(N_ELEMS);
// Verbose variant
vector<int>::iterator it = numbers.begin();
while(it != numbers.end())
*it++ = 1;
// Or more tight (using C++11 lambdas)
// assuming vector size is adjusted
generate(numbers.begin(), numbers.end(), []{ return 1; });
}
There is a middle case, where you use reserve() then call push_back() a lot of times. This is always going to be at least as efficient as just calling push_back() if you know how many elements to insert.
The advantage of calling reserve() rather than resize() is that it does not need to initialise the members until you are about to write to them. Where you have a vector of objects of a class that needs construction, initialising them up front can be expensive, especially if the default constructor is non-trivial, but even a simple constructor has some cost.
The overhead of calling push_back though is that each time you call it, it needs to check the current size against the capacity to see if it needs to re-allocate.
So it's a case of N initializations vs N comparisons. When the type is int, there may well be an optimization with the initializations (memset or whatever) allowing this to be faster, but with objects I would say the comparisons (reserve and push_back) will almost certainly be quicker.
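The "N initializations" side of that trade-off can be counted directly. Widget below is a made-up class with a static counter in its default constructor; the two functions fill a vector the two ways being compared.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class that counts how often it is default-constructed.
struct Widget {
    static int default_constructions;
    int value;
    Widget() : value(0) { ++default_constructions; }
    explicit Widget(int v) : value(v) {}
};
int Widget::default_constructions = 0;

// Pre-sizes the vector, then overwrites each element by index.
int defaults_with_resize(std::size_t n) {
    Widget::default_constructions = 0;
    std::vector<Widget> v;
    v.resize(n);                                   // n default constructions
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = Widget(static_cast<int>(i));        // then n assignments
    }
    return Widget::default_constructions;
}

// Reserves capacity only, then constructs each element with its final value.
int defaults_with_reserve(std::size_t n) {
    Widget::default_constructions = 0;
    std::vector<Widget> v;
    v.reserve(n);                                  // no elements constructed
    for (std::size_t i = 0; i < n; ++i) {
        v.emplace_back(static_cast<int>(i));       // capacity check + one construction
    }
    assert(v.size() == n);
    return Widget::default_constructions;
}
```

For a type like int the default constructions are nearly free, which is why the resize version can win there. For a class with a costly default constructor the reserve version skips that work entirely.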

push_back for vector, deque and lists

I am trying to optimize a C++ routine. The main bottleneck in this routine is the push_back() of a vector of objects. I tried using a deque instead and even tried a list. But strangely (and contrary to theory) deque and list implementations run much slower than the vector counterpart.
In fact even clear() runs much slower for the deque and list implementations than the vector counterpart. In this case too, Vector implementation seems to be the fastest while list implementation is the slowest.
Any pointers?
Note: vector reserve() could have sped the implementation but cannot be done as it is unknown in size.
Thanks.
vector being faster to build or clear than deque or list is to be expected; it's a simpler data structure.
With regard to vector::push_back, it has to do two things:
1) check the vector is big enough to hold the new item.
2) insert the new item.
You can generally speed things up by eliminating step 1 by simply resizing the vector and using operator[] to set items.
UPDATE:
Original poster asked for an example.
The code below times 128 mega insertions, and outputs
push_back : 2.04s
reserve & push_back : 1.73s
resize & place : 0.48s
when compiled and run with g++ -O3 on Debian/Lenny on an old P4 machine.
#include <iostream>
#include <time.h>
#include <vector>
int main(int,char**)
{
const size_t n=(128<<20);
const clock_t t0=clock();
{
std::vector<unsigned char> a;
for (size_t i=0;i<n;i++) a.push_back(i);
}
const clock_t t1=clock();
{
std::vector<unsigned char> a;
a.reserve(n);
for (size_t i=0;i<n;i++) a.push_back(i);
}
const clock_t t2=clock();
{
std::vector<unsigned char> a;
a.resize(n);
for (size_t i=0;i<n;i++) a[i]=i;
}
const clock_t t3=clock();
std::cout << "push_back : " << (t1-t0)/static_cast<float>(CLOCKS_PER_SEC) << "s" << std::endl;
std::cout << "reserve & push_back : " << (t2-t1)/static_cast<float>(CLOCKS_PER_SEC) << "s" << std::endl;
std::cout << "resize & place : " << (t3-t2)/static_cast<float>(CLOCKS_PER_SEC) << "s" << std::endl;
return 0;
}
If you don't know how many objects you'll be adding it's very difficult to come up with an optimal solution. All you can do is try to minimize the cost that you know is happening, which in this case is that your vector is being constantly resized.
You could do this in two ways:
1) Split your operation into building and finalizing. This is where you build the list into a vector that is guaranteed to be big enough and when done copy it to another vector.
E.g.
std::vector<Foo> hugeVec;
hugeVec.reserve(1000); // enough for 1000 foo's
// add stuff
std::vector<Foo> finalVec;
finalVec = hugeVec;
2) Alternatively, when your vector is full, call reserve with enough room for another set of objects:
if (vec.capacity() == vec.size())
vec.reserve(vec.size() + 16); // alloc space for 16 more objects
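One thing worth measuring with the fixed-increment approach is how many reallocations it causes compared with the vector's built-in growth. The sketch below counts both; the function names are invented for illustration, and the exact numbers depend on the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grows by a fixed `step` whenever the vector is full, as in option 2.
std::size_t reallocations_fixed_step(std::size_t n, std::size_t step) {
    assert(step > 0);
    std::vector<int> v;
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (v.size() == v.capacity()) {
            v.reserve(v.size() + step);   // room for `step` more elements
            ++count;
        }
        v.push_back(static_cast<int>(i));
    }
    return count;
}

// Lets push_back grow the vector on its own and counts capacity changes.
std::size_t reallocations_default_growth(std::size_t n) {
    std::vector<int> v;
    std::size_t count = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++count;
            last_capacity = v.capacity();
        }
    }
    return count;
}
```

With a step of 16 the number of reallocations grows roughly linearly with n, whereas the default geometric growth needs only a logarithmic number. A fixed increment therefore tends to help only when the step is large relative to the final size.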
You could choose a different container that did not result in all elements being copied upon a resize, but your bottleneck may then become the individual memory allocations for the new elements.
Are you pushing back the objects themselves, or a pointer to them? Pointers will usually be much faster as it's only 4-8 bytes to copy, compared to whatever the size of the objects are.
"push_back()" can be slow if the copy of an object is slow. If the default constructor is fast and you have a way to use swap to avoid the copy, you could have a much faster program.
void test_vector1()
{
vector<vector<int> > vvi;
for(size_t i=0; i<100; i++)
{
vector<int> vi(100000, 5);
vvi.push_back(vi); // copy of a large object
}
}
void test_vector2()
{
vector<int> vi0;
vector<vector<int> > vvi;
for(size_t i=0; i<100; i++)
{
vector<int> vi(100000, 5);
vvi.push_back(vi0); // copy of a small object
vvi.back().swap(vi); // swap is fast
}
}
Results :
VS2005-debug
* test_vector1 -> 297
* test_vector2 -> 172
VS2005-release
* test_vector1 -> 203
* test_vector2 -> 94
gcc
* test_vector1 -> 343
* test_vector2 -> 188
gcc -O2
* test_vector1 -> 250
* test_vector2 -> 156
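With C++11 and later, the same saving is usually obtained by moving the inner vector instead of the push-empty-then-swap trick. The following is a sketch under that assumption; build_rows_by_moving is a name made up here, and the timings above do not apply to it.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Builds a vector of rows, handing each row's buffer over to the outer vector
// instead of copying its elements.
std::vector<std::vector<int>> build_rows_by_moving(std::size_t rows, std::size_t cols) {
    std::vector<std::vector<int>> vvi;
    vvi.reserve(rows);                           // outer vector never reallocates
    for (std::size_t i = 0; i < rows; ++i) {
        std::vector<int> vi(cols, 5);
        const int* buffer = vi.data();
        vvi.push_back(std::move(vi));            // transfers ownership of vi's storage
        assert(vvi.back().data() == buffer);     // same buffer, no element copies
    }
    return vvi;
}
```

After the move, vi is left in a valid but unspecified state, which is fine here because it goes out of scope at the end of each iteration.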
If you want vector to be fast, you must reserve() enough space. It makes a huge difference, because each growth step is terribly expensive. If you don't know the size, make a good guess.
You'll need to give more information on the behavior of the routine.
In one place you're concerned about the speed of push_back() in another you're concerned about clear(). Are you building up the container, doing something then dumping it?
The results you see for clear() are because vector<> only has to release a single block of memory, deque<> has to release several, and list<> has to release one for each element.
Deque has a more complex structure than vector and the speed differences between the two will be heavily dependent on both the specific implementation and the actual number of elements pushed back, but for large amounts of data it should be faster. clear() may be slower because it may choose to get rid of the more complex underlying structures. Much the same goes for list.
Regarding push_back() being slow and reserve being no help, the implementation of STL used in MSVC works something like this: When you first create a vector it reserves space for I think 10 elements. From then on, whenever it gets full, it reserves space for 1.5 times the number of elements in the vector. So, something like 10, 15, 22, 33, 49, 73, 105, 157... The re-allocations are expensive.
Even if you don't know the exact size, reserve() can be useful. reserve() doesn't prevent the vector from growing if it needs to. If you reserve() and the vector grows beyond that size, you have still improved things because of the reserve. If the vector turns out to be much smaller, well, maybe that's ok because the performance in general works better with smaller sizes.
You need to profile in RELEASE mode to know for sure what strategy works best.
You have to choose your container according to what you're going to do with it.
Relevant actions are: extending (with push), insertion (may not be needed at all), extraction, deletion.
At cplusplus.com, there is a very nice overview of the operations per container type.
If the operation is push-bound, it makes sense that the vector beats all others. The good thing about deque is that it allocates fixed chunks, so will make more efficient use of fragmented memory.