How does compiler decide capacity of a vector? [duplicate] - c++

What is the capacity() of an std::vector which is created using the default constuctor? I know that the size() is zero. Can we state that a default constructed vector does not call heap memory allocation?
This way it would be possible to create an array with an arbitrary reserve using a single allocation, like std::vector<int> iv; iv.reserve(2345);. Let's say that for some reason, I do not want to start the size() on 2345.
For example, on Linux (g++ 4.4.5, kernel 2.6.32 amd64)
#include <iostream>
#include <vector>
int main()
{
using namespace std;
cout << vector<int>().capacity() << "," << vector<int>(10).capacity() << endl;
return 0;
}
printed 0,10. Is it a rule, or is it STL vendor dependent?

The standard doesn't specify what the initial capacity of a container should be, so you're relying on the implementation. A common implementation will start the capacity at zero, but there's no guarantee. On the other hand there's no way to better your strategy of std::vector<int> iv; iv.reserve(2345); so stick with it.

Storage implementations of std::vector vary significantly, but all the ones I've come across start from 0.
The following code:
#include <iostream>
#include <vector>
int main()
{
using namespace std;
vector<int> normal;
cout << normal.capacity() << endl;
for (unsigned int loop = 0; loop != 10; ++loop)
{
normal.push_back(1);
cout << normal.capacity() << endl;
}
cin.get();
return 0;
}
Gives the following output:
0
1
2
4
4
8
8
8
8
16
16
under GCC 5.1, 11.2 - Clang 12.0.1 and:
0
1
2
3
4
6
6
9
9
9
13
under MSVC 2013.

As far as I understood the standard (though I could actually not name a reference), container instanciation and memory allocation have intentionally been decoupled for good reason. Therefor you have distinct, separate calls for
constructor to create the container itself
reserve() to pre allocate a suitably large memory block to accomodate at least(!) a given number of objects
And this makes a lot of sense. The only right to exist for reserve() is to give you the opportunity to code around possibly expensive reallocations when growing the vector. In order to be useful you have to know the number of objects to store or at least need to be able to make an educated guess. If this is not given you better stay away from reserve() as you will just change reallocation for wasted memory.
So putting it all together:
The standard intentionally does not specify a constructor that allows you to pre allocate a memory block for a specific number of objects (which would be at least more desirable than allocating an implementation specific, fixed "something" under the hood).
Allocation shouldn't be implicit. So, to preallocate a block you need to make a separate call to reserve() and this need not be at the same place of construction (could/should of course be later, after you became aware of the required size to accomodate)
Thus if a vector would always preallocate a memory block of implementation defined size this would foil the intended job of reserve(), wouldn't it?
What would be the advantage of preallocating a block if the STL naturally cannot know the intended purpose and expected size of a vector? It'll be rather nonsensical, if not counter-productive.
The proper solution instead is to allocate and implementation specific block with the first push_back() - if not already explicitely allocated before by reserve().
In case of a necessary reallocation the increase in block size is implementation specific as well. The vector implementations I know of start with an exponential increase in size but will cap the increment rate at a certain maximum to avoid wasting huge amounts of memory or even blowing it.
All this comes to full operation and advantage only if not disturbed by an allocating constructor. You have reasonable defaults for common scenarios that can be overriden on demand by reserve() (and shrink_to_fit()). So, even if the standard does not explicitely state so, I'm quite sure assuming that a newly constructed vector does not preallocate is a pretty safe bet for all current implementations.

As a slight addition to the other answers, I found that when running under debug conditions with Visual Studio a default constructed vector will still allocate on the heap even though the capacity starts at zero.
Specifically if _ITERATOR_DEBUG_LEVEL != 0 then vector will allocate some space to help with iterator checking.
https://learn.microsoft.com/en-gb/cpp/standard-library/iterator-debug-level
I just found this slightly annoying since I was using a custom allocator at the time and was not expecting the extra allocation.

This is an old question, and all answers here have rightly explained the standard's point of view and the way you can get an initial capacity in a portable manner by using std::vector::reserve;
However, I'll explain why it doesn't make sense for any STL implementation to allocate memory upon construction of an std::vector<T> object;
std::vector<T> of incomplete types;
Prior to C++17, it was undefined behavior to construct a std::vector<T> if the definition of T is still unknown at point of instantiation. However, that constraint was relaxed in C++17.
In order to efficiently allocate memory for an object, you need to know its size. From C++17 and beyond, your clients may have cases where your std::vector<T> class does not know the size of T. Does it makes sense to have memory allocation characteristics dependent on type completeness?
Unwanted Memory allocations
There are many, many, many times you'll need model a graph in software. (A tree is a graph); You are most likely going to model it like:
class Node {
....
std::vector<Node> children; //or std::vector< *some pointer type* > children;
....
};
Now think for a moment and imagine if you had lots of terminal nodes. You would be very pissed if your STL implementation allocates extra memory simply in anticipation of having objects in children.
This is just one example, feel free to think of more...

Standard doesnt specify initial value for capacity but the STL container automatically grows to accomodate as much data as you put in, provided you don't exceed the maximum size(use max_size member function to know).
For vector and string, growth is handled by realloc whenever more space is needed. Suppose you'd like to create a vector holding value 1-1000. Without using reserve, the code will typically result in between
2 and 18 reallocations during following loop:
vector<int> v;
for ( int i = 1; i <= 1000; i++) v.push_back(i);
Modifying the code to use reserve might result in 0 allocations during the loop:
vector<int> v;
v.reserve(1000);
for ( int i = 1; i <= 1000; i++) v.push_back(i);
Roughly to say, vector and string capacities grow by a factor of between 1.5 and 2 each time.

Related

Performance impact when resizing vector within capacity

I have the following synthesized example of my code:
#include <vector>
#include <array>
#include <cstdlib>
#define CAPACITY 10000
int main() {
std::vector<std::vector<int>> a;
std::vector<std::array<int, 2>> b;
a.resize(CAPACITY, std::vector<int> {0, 0})
b.resize(CAPACITY, std::array<int, 2> {0, 0})
for (;;) {
size_t new_rand_size = (std::rand() % CAPACITY);
a.resize(new_rand_size);
b.resize(new_rand_size);
for (size_t i = 0; i < new_rand_size; ++i) {
a[i][0] = std::rand();
a[i][1] = std::rand();
b[i][0] = std::rand();
b[i][1] = std::rand();
}
process(a); // respectively process(b)
}
}
so obviously, the array version is better, because it requires less allocation, as the array is fixed in size and continuous in memory (correct?). It just gets reinitialized when up-resizing again within capacity.
Since I'm going to overwrite anyway, I was wondering if there's a way to skip initialization (e.g. by overwriting the allocator or similar) to optimize the code even further.
so obviously,
The word "obviously" is typically used to mean "I really, really want the following to be true, so I'm going to skip the part where I determine if it is true." ;) (Admittedly, you did better than most since you did bring up some reasons for your conclusion.)
the array version is better, because it requires less allocation, as the array is fixed in size and continuous in memory (correct?).
The truth of this depends on the implementation, but the there is some validity here. I would go with a less micro-managementy approach and say that the array version is preferable because the final size is fixed. Using a tool designed for your specialized situation (fixed size array) tends to incur less overhead than using a tool for a more general situation. Not always less, though.
Another factor to consider is the cost of default-initializing the elements. When a std::array is constructed, all of its elements are constructed as well. With a std::vector, you can defer constructing elements until you have the parameters for construction. For objects that are expensive to default-construct, you might be able to measure a performance gain using a vector instead of an array. (If you cannot measure a difference, don't worry about it.)
When you do a comparison, make sure the vector is given a fair chance by using it well. Since the size is known in advance, reserve the required space right away. Also, use emplace_back to avoid a needless copy.
Final note: "contiguous" is a bit more accurate/descriptive than "continuous".
It just gets reinitialized when up-resizing again within capacity.
This is a factor that affects both approaches. In fact, this causes your code to exhibit undefined behavior. For example, let's suppose that your first iteration resizes the outer vector to 1, while the second resizes it to 5. Compare what your code does to the following:
std::vector<std::vector<int>> a;
a.resize(CAPACITY, std::vector<int> {0, 0});
a.resize(1);
a.resize(5);
std::cout << "Size " << a[1].size() <<".\n";
The output indicates that the size is zero at this point, yet your code would assign a value to a[1][0]. If you want each element of a to default to a vector of 2 elements, you need to specify that default each time you resize a, not just initially.
Since I'm going to overwrite anyway, I was wondering if there's a way to skip initialization (e.g. by overwriting the allocator or similar) to optimize the code even further.
Yes, you can skip the initialization. In fact, it is advisable to do so. Use the tool designed for the task at hand. Your initialization serves to increase the capacity of your vectors. So use the method whose sole purpose is to increase the capacity of a vector: vector::reserve.
Another option – depending on the exact situation — might be to not resize at all. Start with an array of arrays, and track the last usable element in the outer array. This is sort of a step backwards in that you now have a separate variable for tracking the size, but if your real code has enough iterations, the savings from not calling destructors when the size decreases might make this approach worth it. (For cleaner code, write a class that wraps the array of arrays and that tracks the usable size.)
Since I'm going to overwrite anyway, I was wondering if there's a way to skip initialization
Yes: Don't resize. Instead, reserve the capacity and push (or emplace) the new elements.

Any way that we can prevent the breakdown between reference and object when resize a vector?

I was debugging an issue and realized that when a vector is resizing, the reference will not work anymore. To illustrate this point, below is the minimal code. The output is 0 instead of 1. Is there anyway that we can prevent this happen except reserving a large space for x?
#include <iostream>
#include <vector>
using namespace std;
vector<int> x{};
int main(){
x.reserve(1);
x.push_back(0);
int & y = x[0];
x.resize(10);
y=1;
cout << x[0] << endl;
return 0;
}
This is called invalidation and the only way you can prevent it is if you make sure that the vector capacity does not change.
x.reserve(10);
x.push_back(0);
int &y = x[0];
x.resize(10);
The only way I can think of is to use std::deque instead of std::vector.
The reason for suggesting std::deque is this (from cppreference):
The storage of a deque is automatically expanded and contracted as
needed. Expansion of a deque is cheaper than the expansion of a
std::vector because it does not involve copying of the existing
elements to a new memory location.
That line about not copying is really the answer to your question. It means that the objects remain where you placed them (in memory) as long as the deque is alive.
However, on the very next line it says:
On the other hand, deques typically have large minimal memory cost; a
deque holding just one element has to allocate its full internal array
(e.g. 8 times the object size on 64-bit libstdc++; 16 times the object
size or 4096 bytes, whichever is larger, on 64-bit libc++).
It's now up to you to decide which is better - higher initial memory cost or changing your program's logic not to require referencing the items in the vector like that. You might also want to consider std::set or std::unordered_set for quickly finding an object within the container
There are several choices:
Don't use a vector.
Don't keep a reference.
Create a "smart reference" class that tracks the vector and the index and so it will obtain the appropriate object even if the vector moves.
You can create a vector of std::shared_ptr<> as well and keep the values instead of the interators.

Fast way to push_back a vector many times

I have identified a bottleneck in my c++ code, and my goal is to speed it up. I am moving items from one vector to another vector if a condition is true.
In python, the pythonic way of doing this would be to use a list comprehension:
my_vector = [x for x in data_vector if x > 1]
I have hacked a way to do this in C++, and it is working fine. However, I am calling this millions of times in a while-loop and it is slow. I do not understand much about memory allocation, but I assume that my problem has to do with allocating memory over-and-over again using push_back. Is there a way to allocate my memory differently to speed up this code? (I do not know how large my_vector should be until the for-loop has completed).
std::vector<float> data_vector;
// Put a bunch of floats into data_vector
std::vector<float> my_vector;
while (some_condition_is_true) {
my_vector.clear();
for (i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Use my_vector to render graphics on the GPU, but do not change the elements of my_vector
// Change the elements of data_vector, but not the size of data_vector
}
Use std::copy_if, and reserve data_vector.size() for my_vector initially (as this is the maximum possible number of elements for which your predicate could evaluate to true):
std::vector<int> my_vec;
my_vec.reserve(data_vec.size());
std::copy_if(data_vec.begin(), data_vec.end(), std::back_inserter(my_vec),
[](const auto& el) { return el > 1; });
Note that you could avoid the reserve call here if you expect that the number of times that your predicate evaluates to true will be much less than the size of the data_vector.
Though there are various great solutions posted by others for your query, it seems there is still no much explanation for the memory allocation, which you do not much understand, so I would like to share my knowledge about this topic with you. Hope this helps.
Firstly, in C++, there are several types of memory: stack, heap, data segment.
Stack is for local variables. There are some important features associated with it, for example, they will be automatically deallocated, operation on it is very fast, its size is OS-dependent and small such that storing some KB of data in the stack may cause an overflow of memory, et cetera.
Heap's memory can be accessed globally. As for its important features, we have, its size can be dynamically extended if needed and its size is larger(much larger than stack), operation on it is slower than stack, manual deallocation of memory is needed (in nowadays's OS, the memory will be automatically freed in the end of program), et cetera.
Data segment is for global and static variables. In fact, this piece of memory can be divided into even smaller parts, e.g. BBS.
In your case, vector is used. In fact, the elements of vector are stored into its internal dynamic array, that is an internal array with a dynamic array size. In the early C++, a dynamic array can be created on the stack memory, however, it is no longer that case. To create a dynamic array, ones have to create it on heap. Therefore, the elements of vector are stored in an internal dynamic array on heap. In fact, to dynamically increase the size of an array, a process namely memory reallocation is needed. However, if a vector user keeps enlarging his or her vector, then the overhead cost of reallocation cost will be high. To deal with it, a vector would firstly allocate a piece of memory that is larger than the current need, that is allocating memory for potential future use. Therefore, in your code, it is not that case that memory reallocation is performed every time push_back() is called. However, if the vector to be copied is quite large, the memory reserved for future use will be not enough. Then, memory allocation will occur. To tackle it, vector.reserve() may be used.
I am a newbie. Hopefully, I have not made any mistake in my sharing.
Hope this helps.
Run the code twice, first time only counting, how many new elements you will need. Then use reserve to already allocate all the memory you need.
while (some_condition_is_true) {
my_vector.clear();
int newLength = 0;
for (i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
newLength++;
my_vector.reserve(newLength);
for (i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Do stuff with my_vector and change data_vector
}
I doubt allocating my_vector is the problem, especially if the while loop is executed many times as the capacity of my_vector should quickly become sufficient.
But to be sure you can just reserve capacity in my_vector corresponding to the size of data_vector:
my_vector.reserve(data_vector.size());
while (some_condition_is_true) {
my_vector.clear();
for (auto value : data_vector) {
if (value > 1)
my_vector.push_back(value);
}
}
If you are on Linux you can reserve memory for my_vector to prevent std::vector reallocations which is bottleneck in your case. Note that reserve will not waste memory due to overcommit, so any rough upper estimate for reserve value will fit your needs. In your case the size of data_vector will be enough. This line of code before while loop should fix the bottleneck:
my_vector.reserve(data_vector.size());

C++ Block Allocator for creating new objects faster

I have a piece of code that creates thousand of objects, and appends them to a vector.
The following code is just an example of what is being done, even though the constructor has some parameters, and the for does not actually have that condition, but it serves the purpose of showing that it runs thousands of times.
vector<VolumeInformation*> vector = vector<VolumeInformation*>();
for (int i = 0; i < 5000; ++i) {
VolumeInformation* info = new VolumeInformation();
vector.push_back(info);
}
The code takes a lot of time to run, and I was trying to find a faster way of creating all the objects. I read about block allocators, but I am unsure if this is really meant for what I am trying to do, and if it really helps on getting this done faster. I would want to allocate memory for a thousand objects (for example), and keep on using that memory while it is still available, and then allocate some more when needed, avoiding having to allocate memory for a single object every time. Can this be done? Can you point me to somewhere where I can find an example on how to tell 'new' to use the previously allocated memory? If not for the objects itself, can the allocator be used for the memory of the vector (even though the object is what really needs speeding up)?
Thank you.
** UPDATE **
After all the answers and comments, I decided making a change in the code, so the vector would store the objects instead of the pointers, so I could use reserve to pre-allocate some memory for the vector, allowing to save some time by allocating memory for several object instances at once. Although, after doing some performance benchmark, I verify that the change I made is performing much worse, unless I know, ahead of time, the exact size of the vector. Here are my findings, I was wondering if someone could shed light into this, letting me know why this happens, if I am missing something here, or if the approach I was using before is really the best one.
Here is the code I used for benchmarking:
vector<int> v = vector<int>();
v.push_back(1);
v.push_back(3);
v.push_back(4);
v.push_back(5);
v.push_back(7);
v.push_back(9);
int testAmount = 200000;
int reserve = 500000;
Stopwatch w = Stopwatch();
w = Stopwatch();
vector<VolumeInformation> infos = vector<VolumeInformation>();
infos.reserve(reserve);
for (int i = 0; i < testAmount; ++i) {
infos.emplace_back(&v, 1, 0, 0);
}
int elapsed = w.Elapsed();
w = Stopwatch();
vector<VolumeInformation*> infoPointers = vector<VolumeInformation*>();
infoPointers.reserve(reserve);
for (int i = 0; i < testAmount; ++i) {
infoPointers.emplace_back(new VolumeInformation(&v, 1, 0, 0));
}
int elapsed2 = w.Elapsed();
If I comment out both reserve() lines, the version without pointers takes 32.701 seconds, while the pointer version takes 6.159! It takes 5+ times less than using a vector of objects.
If I use reserve, but set the amount of items to reserve to a value lower than the number of iterations, the vector of objects version still takes more time than the pointer version.
If I use reserve with a value higher or equal to the amount of iterations, the vector of objects version becomes a lot faster, taking only 270ms, against 8.901 seconds of the pointer version. The main issue here is that I do not know in advance the size that the vector will reach, as the iterations are not based in a hardcoded number, this was only to do the benchmarking.
Can someone explain why this happens, if there is another way around this, or if I am making anything wrong here?
vector is perfectly capable of pre-allocating a large block and using it for all the elements, if you just use it correctly:
// create 5000 default-constructed X objects
std::vector<X> v(5000);
Or if you need to pass constructor arguments:
std::vector<X> v;
v.reserve(5000); // allocate block of memory for 5000 objects
for (int i=0 ; i < v.size(); ++i)
v.emplace_back(arg1, arg2, i % 2 ? arg3 : arg4);
The last line constructs an X in the pre-allocated memory, with no copying, passing the function arguments to the X constructor.
I would want to allocate memory for a thousand objects (for example), and keep on using that memory while it is still available, and then allocate some more when needed, avoiding having to allocate memory for a single object every time.
std::vector does that automatically, you should probably stop using new and just have a vector<VolumeInformation> and put objects into it directly, instead of allocating individual objects and storing pointers to them.
Memory allocation is slow (see Why should C++ programmers minimize use of 'new'?), so stop allocating individual objects. Both the examples above will do 1 allocation, and 5000 constructor calls. Your original code does at least 5001 allocations and 5000 constructor calls (in typical C++ implementations it would do 5013 allocations and 5000 constructor calls).
** UPDATE **
If I comment out both reserve() lines, the version without pointers takes 32.701 seconds, while the pointer version takes 6.159! It takes 5+ times less than using a vector of objects.
Since you haven't actually shown a complete working program you're asking people to guess (always show the actual code!) but it suggests your class has a very slow copy constructor, which is used when the vector grows and the existing elements need to be copied over to the new memory (and the old elements are then destroyed).
If you can add a noexcept move constructor that is more efficient than the copy constructor then std::vector will use that when the vector needs to grow and will run much faster.
The main issue here is that I do not know in advance the size that the vector will reach, as the iterations are not based in a hardcoded number, this was only to do the benchmarking.
You could just reserve more elements than you are ever likely to need, trading higher memory usage for better performance.
You probably want to reserve space for your 5000 elements ahead of the loop:
vector.reserve(5000);
for (int i = 0; i < 5000; ++i) {
VolumeInformation info = new VolumeInformation();
vector.push_back(info);
}
this could save time by eliminating severals resizes as vector grows and if VolumeInformation costs a lot (in time) to copy.

Benefits of using reserve() in a vector - C++

What is the benefit of using reserve when dealing with vectors. When should I use them? Couldn't find a clear cut answer on this but I assume it is faster when you reserve in advance before using them.
What say you people smarter than I?
It's useful if you have an idea how many elements the vector will ultimately hold - it can help the vector avoid repeatedly allocating memory (and having to move the data to the new memory).
In general it's probably a potential optimization that you shouldn't need to worry about, but it's not harmful either (at worst you end up wasting memory if you over estimate).
One area where it can be more than an optimization is when you want to ensure that existing iterators do not get invalidated by adding new elements.
For example, a push_back() call may invalidate existing iterators to the vector (if a reallocation occurs). However if you've reserved enough elements you can ensure that the reallocation will not occur. This is a technique that doesn't need to be used very often though.
It can be ... especially if you are going to be adding a lot of elements to you vector over time, and you want to avoid the automatic memory expansion that the container will make when it runs out of available slots.
For instance, back-insertions (i.e., std::vector::push_back) are considered an ammortized O(1) or constant-time process, but that is because if an insertion at the back of a vector is made, and the vector is out of space, it must then reallocate memory for a new array of elements, copy the old elements into the new array, and then it can copy the element you were trying to insert into the container. That process is O(N), or linear-time complexity, and for a large vector, could take quite a bit of time. Using the reserve() method allows you to pre-allocate memory for the vector if you know it's going to be at least some certain size, and avoid reallocating memory every time space runs out, especially if you are going to be doing back-insertions inside some performance-critical code where you want to make sure that the time to-do the insertion remains an actual O(1) complexity-process, and doesn't incurr some hidden memory reallocation for the array. Granted, your copy constructor would have to be O(1) complexity as well to get true O(1) complexity for the entire back-insertion process, but in regards to the actual algorithm for back-insertion into the vector by the container itself, you can keep it a known complexity if the memory for the slot is already pre-allocated.
This excellent article deeply explains differences between deque and vector containers. Section "Experiment 2" shows the benefits of vector::reserve().
If you know the eventual size of the vector then reserve is worth using.
Otherwise whenever the vector runs out of internal room it will re-size the buffer. This usually involves doubling (or 1.5 * current size) the size of the internal buffer (can be expensive if you do this a lot).
The real expensive bit is invoking the copy constructor on each element to copy it from the old buffer to the new buffer, followed by calling the destructor on each element in the old buffer.
If the copy constructor is expensive then it can be a problem.
Faster and saves memory
If you push_back another element, then a full vector will typically allocate double the memory it's currently using - since allocate + copy is expensive
Don't know about people smarter than you, but I would say that you should call reserve in advance if you are going to perform lots in insertion operations and you already know or can estimate the total number of elements, at least the order of magnitude. It can save you a lot of reallocations in good circumstances.
Although its an old question, Here is my implementation for the differences.
#include <iostream>
#include <chrono>
#include <vector>
using namespace std;
int main(){
vector<int> v1;
chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
for(int i = 0; i < 1000000; ++i){
v1.push_back(1);
}
chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
chrono::duration<double> time_first = chrono::duration_cast<chrono::duration<double>>(t2-t1);
cout << "Time for 1000000 insertion without reserve: " << time_first.count() * 1000 << " miliseconds." << endl;
vector<int> v2;
v2.reserve(1000000);
chrono::steady_clock::time_point t3 = chrono::steady_clock::now();
for(int i = 0; i < 1000000; ++i){
v2.push_back(1);
}
chrono::steady_clock::time_point t4 = chrono::steady_clock::now();
chrono::duration<double> time_second = chrono::duration_cast<chrono::duration<double>>(t4-t3);
cout << "Time for 1000000 insertion with reserve: " << time_second.count() * 1000 << " miliseconds." << endl;
return 0;
}
When you compile and run this program, it outputs:
Time for 1000000 insertion without reserve: 24.5573 miliseconds.
Time for 1000000 insertion with reserve: 17.1771 miliseconds.
Seems to be some improvement with reserve, but not that too much improvement. I think it will be more improvement for complex objects, I am not sure. Any suggestions, changes and comments are welcome.
It's always interesting to know the final total needed space before to request any space from the system, so you just require space once. In other cases the system may have to move you in a larger free zone (it's optimized but not always a free operation because a whole data copy is required). Even the compiler will try to help you, but the best is to to tell what you know (to reserve the total space required by your process). That's what i think. Greetings.
There is one more advantage of reserve that is not much related to performance but instead to code style and code cleanliness.
Imagine I want to create a vector by iterating over another vector of objects. Something like the following:
std::vector<int> result;
for (const auto& object : objects) {
result.push_back(object.foo());
}
Now, apparently the size of result is going to be the same as objects.size() and I decide to pre-define the size of result.
The simplest way to do it is in the constructor.
std::vector<int> result(objects.size());
But now the rest of my code is invalidated because the size of result is not 0 anymore; it is objects.size(). The subsequent push_back calls are going to increase the size of the vector. So, to correct this mistake, I now have to change how I construct my for-loop. I have to use indices and overwrite the corresponding memory locations.
std::vector<int> result(objects.size());
for (int i = 0; i < objects.size(); ++i) {
result[i] = objects[i].foo();
}
And I don't like it. Indices are everywhere in the code. This is also more vulnerable to making accidental copies because of the [] operator. This example uses integers and directly assigns values to result[i], but in a more complex for-loop with complex data structures, it could be relevant.
Coming back to the main topic, it is very easy to adjust the first code by using reserve. reserve does not change the size of the vector but only the capacity. Hence, I can leave my nice for loop as it is.
std::vector<int> result;
result.reserve(objects.size());
for (const auto& object : objects) {
result.push_back(object.foo());
}