When does vector::push_back increase capacity? - c++

I'm using a bunch of std::vectors by setting their capacity at the beginning and using push_back to slowly fill them up. Most of these vectors will have the same size (16 elements), although some might get larger. If I use push_back 16 times on a vector with size 0 and capacity 16 initially, can I be sure that capacity will be exactly 16 after the push_backs?

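Yes -- once you reserve a specific capacity, the vector will not be reallocated until you exceed the capacity you've set. Exactly how many more items you may be able to push without reallocation isn't specified, but you are guaranteed at least that many. Note that reserve(16) only guarantees capacity() >= 16, so the capacity may be more than 16; whatever value it reports will not change during those 16 push_backs.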
In particular, pointers and iterators into the vector are guaranteed to remain valid until you exceed the specified capacity.
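A minimal sketch of that guarantee, assuming a hypothetical helper `fill_reserved` and report struct `FillReport` (neither is from the question). It reserves room for n elements (n >= 1 is assumed), pushes n elements, and reports whether the capacity or the address of the first element changed along the way:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical report type, for illustration only.
struct FillReport {
    std::size_t capacity_before;  // capacity() right after reserve(n)
    std::size_t capacity_after;   // capacity() after n push_backs
    bool storage_moved;           // true if the first element changed address
};

// Assumes n >= 1.
FillReport fill_reserved(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);                          // capacity() is now >= n
    const std::size_t cap = v.capacity();

    v.push_back(0);
    const int* const first = &v.front();   // pointer into the vector
    for (std::size_t i = 1; i < n; ++i) {
        v.push_back(static_cast<int>(i));  // size() never exceeds cap
    }
    return {cap, v.capacity(), &v.front() != first};
}
```

For `fill_reserved(16)` the two capacities should be equal and `storage_moved` should be false. The comparison is against the capacity observed after `reserve`, not against the literal 16, because `reserve` is allowed to allocate more than requested.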

23.3.6.5 [vector modifiers]
void push_back(const T& x);
void push_back(T&& x);
Remarks: Causes reallocation if the new size is greater than the old capacity
Pretty much self-explanatory.

Yup.
23.3.6.3p6:
No reallocation shall take place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity().

Related

Amortizing in std::vector::resize and std::vector::push_back

We know that the reallocation mechanism allocates more memory than we actually need when calling std::vector::push_back().
Usually the capacity grows by a factor of 2 or by a factor close to the golden ratio (~1.618).
Assume we add elements as follows:
std::vector<int> v;
for (unsigned i = 0; i < 100000; ++i)
{
    v.resize(v.size() + 1);
}
Is it guaranteed that the capacity of the vector is "doubled" if the reallocation takes place?
In other words: would the "+1 resize" allocate the memory the same way as it is done for push_back.
Or is it purely implementation-dependent?
Is it guaranteed that the capacity of the vector is "doubled" if the reallocation takes place?
No. The complexity of memory reallocation is amortized constant. Whether the capacity of the object is doubled when needed or increased by another factor is implementation dependent.
would the "+1 resize" allocate the memory the same way as it is done for push_back
Yes.
std::vector::resize(size_type sz) appends sz - size() value-initialized elements to the sequence when sz is greater than size(). That is equivalent to:
insert(end(), sz-size(), <value initialized object>);
std::vector::insert, std::vector::emplace, and std::vector::push_back have the same complexity for memory allocation - amortized constant.
A vector is a sequence container that supports (amortized) constant
time insert and erase operations at the end; [vector.overview]
and
If size() < sz , appends sz - size() default-inserted elements to the
sequence.
for resize. IMHO that means resize goes through the same amortized growth as push_back when a reallocation takes place, but the growth factor itself (whether the capacity is exactly "doubled") is not fixed by the standard.
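One way to check what a particular implementation does is to count how often capacity() changes for both growth styles. This is only a sketch; `count_capacity_changes` is a hypothetical helper, and the numbers it returns are implementation-specific rather than anything the standard promises:

```cpp
#include <cstddef>
#include <vector>

// Grows a vector one element at a time and counts how often capacity()
// changes. use_resize selects v.resize(v.size() + 1) over v.push_back(0).
std::size_t count_capacity_changes(std::size_t n, bool use_resize) {
    std::vector<int> v;
    std::size_t changes = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        if (use_resize) {
            v.resize(v.size() + 1);
        } else {
            v.push_back(0);
        }
        if (v.capacity() != last_capacity) {  // a reallocation happened
            ++changes;
            last_capacity = v.capacity();
        }
    }
    return changes;
}
```

Comparing `count_capacity_changes(100000, false)` with `count_capacity_changes(100000, true)` shows whether the "+1 resize" reallocates as rarely as push_back on your standard library. If both counts are small compared to 100000, both paths grow geometrically there.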

STL vector with reserve and pop/push_back cost

I am not exactly good at working out algorithm costs, so I am asking here.
Here is a vector initially initialized with 1000 elements:
vector<unsigned int> mFreeIndexes(1000);
I will continuously pop_back/push_back elements to the vector, but never push_back over 1000 (so never force vector to reallocate).
In this case will the pop_back/push_back operations be O(1) or O(n)?
From the C++ standard 23.3.7.5:
void push_back(const T& x);
void push_back(T&& x);
Remarks: Causes reallocation if the new size is greater than the old capacity (...)
Note that it doesn't say that it can't reallocate in the other scenario but this would be a very unusual implementation of the standard. I think you can safely assume that push_back won't reallocate when there's still capacity.
The thing with pop_back is a bit more complicated. The standard does not say anything about reallocation in the pop_back context, but it seems to be the common implementation (with no known exception) that pop_back does not reallocate. There are some guarantees though, see this:
Can pop_back() ever reduce the capacity of a vector? (C++)
Anyway, as long as you don't go over the predefined size, you are safe to assume that no reallocation happens and the complexity is indeed O(1).
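A quick way to confirm this on your own toolchain is to watch capacity() and data() across many pop_back/push_back pairs. This is a sketch with a hypothetical helper name, `storage_stable_under_pop_push`; the vector mirrors the 1000-element one from the question:

```cpp
#include <cstddef>
#include <vector>

// Returns true if neither capacity() nor data() changed while doing
// `rounds` pop_back/push_back pairs on a vector that starts with 1000
// elements and never grows beyond that size.
bool storage_stable_under_pop_push(std::size_t rounds) {
    std::vector<unsigned int> free_indexes(1000);
    const std::size_t cap = free_indexes.capacity();
    const unsigned int* const storage = free_indexes.data();

    for (std::size_t i = 0; i < rounds; ++i) {
        free_indexes.pop_back();                               // size 999
        free_indexes.push_back(static_cast<unsigned int>(i));  // size 1000
    }
    return free_indexes.capacity() == cap && free_indexes.data() == storage;
}
```

If this returns true, no reallocation took place and each operation only touched the last element, which is the O(1) behaviour described above.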

Use std::vector::data after reserve

I have a std::vector on which I call reserve with a large value. Afterwards I retrieve data().
Since iterating data is then crashing I am wondering whether this is even allowed. Is reserve forced to update data to the allocated memory range?
The guarantee of reserve is that subsequent insertions do not reallocate, and thus do not cause invalidation. That's it. There are no further guarantees.
Is reserve forced to update data to the allocated memory range?
No. The standard only guarantees that std::vector::data returns a pointer such that [data(), data() + size()) is a valid range; the capacity plays no part in that guarantee.
§23.3.11.4/1 vector data
[vector.data]:
Returns: A pointer such that [data(), data() + size()) is a valid
range. For a non-empty vector, data() == addressof(front()).
There is no requirement that data() returns a dereferenceable pointer for an empty (size() == 0) vector, even if it has nonzero capacity. It might return nullptr or some arbitrary value (the only requirement in this case is that it can be compared with itself and that 0 can be added to it without invoking UB).
I'd say the documentation is pretty clear on this topic: anything after data() + size() may be allocated but not initialized memory: if you want to also initialize this memory you should use vector::resize.
void reserve (size_type n);
Request a change in capacity
Requests that the vector capacity be at least enough to contain n elements.
If n is greater than the current vector capacity, the function causes
the container to reallocate its storage increasing its capacity to n
(or greater).
In all other cases, the function call does not cause a reallocation
and the vector capacity is not affected.
This function has no effect on the vector size and cannot alter its
elements.
I'm not sure why you would want to access anything after data() + size() after reserve() in the first place: the intended use of reserve() is to prevent unnecessary reallocations when you know or can estimate the expected size of your container, while at the same time avoiding unnecessary initialization of memory, which may be either inefficient or impractical (e.g. non-trivial data for initialization is not available). In this situation you could replace roughly log(N) reallocations and copies with a single one, improving performance.
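To make the difference concrete, here is a small sketch with hypothetical helpers (`sum_via_data`, `make_reserved`, `make_resized`). It only ever reads [data(), data() + size()), which is the range the standard declares valid:

```cpp
#include <cstddef>
#include <vector>

// Sums the elements reachable through data(). Only the valid range
// [data(), data() + size()) is read, never the spare capacity.
long sum_via_data(const std::vector<int>& v) {
    long total = 0;
    const int* p = v.data();
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += p[i];
    }
    return total;
}

// reserve() allocates storage but creates no elements: size() stays 0.
std::vector<int> make_reserved(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    return v;
}

// resize() creates n elements (here all set to 1), so data()[0..n-1]
// are real objects that may be read and written.
std::vector<int> make_resized(std::size_t n) {
    std::vector<int> v;
    v.resize(n, 1);
    return v;
}
```

`make_reserved(100)` has nothing to iterate over, while `make_resized(100)` exposes 100 elements through data(). If the crash in the question comes from walking data() up to the reserved count, switching to resize (or pushing elements first) should be the fix.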

Is std::vector::reserve(0); legal?

Is std::vector::reserve(0); legal and what will it do?
There's nothing to prohibit it. The effect of reserve is:
After reserve(), capacity() is greater or equal to the argument of reserve if
reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens
at this point if and only if the current capacity is less than the argument of reserve(). [1]
Since the value of capacity() can never be less than 0 (it's unsigned), this can never have any effect; it can never cause a reallocation.
[1] C++ standard, [vector.capacity]
Yes, it is a legal no-op.
If new_cap is greater than the current capacity(), new storage is allocated, otherwise the method does nothing.
(Source, emphasis mine.)
Since capacity() will always be >= 0 (due to size_type being unsigned), passing a zero is guaranteed to do nothing.
According to the C++ Standard
After reserve(), capacity() is greater or equal to the argument of
reserve if reallocation happens; and equal to the previous value of
capacity() otherwise. Reallocation happens at this point if and only
if the current capacity is less than the argument of reserve().
So there simply will not be a reallocation if the argument of reserve is equal to 0.
The function itself throws an exception only in one case
Throws: length_error if n > max_size().
Take into account that reserve( 0 ) is not equivalent to resize( 0 ). In the last case all elements of the vector will be removed.
It is legal and will reserve no space. More generally, if the argument is not greater than the current capacity, the call does nothing.
The documentation provides a clear answer to this:
Increase the capacity of the container to a value that's greater or equal to new_cap. If new_cap is greater than the current capacity(), new storage is allocated, otherwise the method does nothing.
capacity() returns a value that cannot be negative. Hence, passing zero for new_cap always falls into the second category - i.e. when the function does nothing.
void reserve (size_type n);
If n is greater than the current vector capacity, the function causes the container to reallocate its storage increasing its capacity to n (or greater).
In all other cases, the function call does not cause a reallocation and the vector capacity is not affected.
First of all, you should try to understand how std::vector works. It is a dynamic array that reserves extra memory so that it can store new values without reallocating every time, which makes insertion faster and more efficient.
With std::vector::reserve() you can determine the amount of memory that you want to reserve, in your case, zero.
In case you want to add another value to your vector and the reserve space is zero, it will work with no problem at all, but the operation will be slower. It could be a problem if you want to do this for a lot of values, but probably you won't notice this if you do it just a few times.
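The no-op behaviour is easy to check directly. This is a sketch with a hypothetical helper, `reserve_zero_is_noop`, which takes its vector by value so the caller's object is left alone:

```cpp
#include <cstddef>
#include <vector>

// Returns true if reserve(0) left the capacity and the contents untouched.
bool reserve_zero_is_noop(std::vector<int> v) {
    const std::vector<int> before = v;     // snapshot of size and elements
    const std::size_t cap = v.capacity();
    v.reserve(0);                          // 0 is never greater than capacity()
    return v.capacity() == cap && v == before;
}
```

This holds for an empty vector as well as for one that already has elements. Contrast this with resize(0), which would remove every element.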

Vector Resize vs Reserve for nested vectors

I'm trying to use another project's code and they have structs of this form:
#include <vector>

// Defined first so that data can hold vectors of it.
struct sparse_array {
    std::vector<unsigned int> idxs;
    std::vector<double> values;
    void add(unsigned int idx, double value) {
        idxs.push_back(idx);
        values.push_back(value);
    }
};

struct data {
    std::vector<sparse_array> cols, rows;
};
For my code, I tried using the following lines:
data prob;
prob.cols.reserve(num_cols);
prob.rows.reserve(num_rows);
// Some loop that calls
prob.cols[i].add(idx, value);
prob.rows[i].add(idx, value);
When I output the values prob.rows[i].values[j] to a file, I get all zeros. But when I use resize instead of reserve, I get the actual values that I read in. Can someone give me an explanation for this?
Function reserve() simply allocates a contiguous region of memory big enough to hold the number of items you specify and moves the vector's old content into this new block, which makes sure no more reallocations for the vectors' storage will be done upon insertions as long as the specified capacity is not exceeded. This function is used to reduce the number of reallocations (which also invalidate iterators), but does not insert any new items at the end of your vector.
From the C++11 Standard, Paragraph 23.3.6.3/1 about reserve():
A directive that informs a vector of a planned change in size, so that it can manage the storage
allocation accordingly. After reserve(), capacity() is greater or equal to the argument of reserve if reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens at this point if and only if the current capacity is less than the argument of reserve(). If an exception is thrown other than by the move constructor of a non-CopyInsertable type, there are no effects.
Notice that by doing prob.cols[i].add(idx, value); you are likely to get undefined behavior, since i is probably an out-of-bounds index.
On the other hand, function resize() does insert items at the end of your vector, so that the final size of the vector will be the one you specified (this means it can even erase elements, if you specify a size smaller than the current one). If you specify no second argument to a call to resize(), the newly inserted items will be value-initialized. Otherwise, they will be copy-initialized from the value you provide.
From the C++11 Standard, Paragraph 23.3.6.3/9 about resize():
If sz <= size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends
sz - size() value-initialized elements to the sequence.
So to sum it up, the reason why accessing your vector after invoking resize() gives the expected result is that items are actually being added to the vector. On the other hand, since the call to reserve() does not add any item, subsequent accesses to non-existing elements will give you undefined behavior.
If the vector is empty, then std::vector::resize(n) expands the content of this vector by inserting n new elements at the end. std::vector::reserve(n) only reallocates the memory block that your vector uses for storing its elements so that it's big enough to hold n elements.
Then when you call prob.cols[i], you are trying to access the element at index i. If you only used reserve before, this accesses memory where no element resides yet, which is undefined behavior.
So just use resize in this case :)
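For completeness, a sketch of the resize-based setup. The struct definitions follow the question (reordered so they compile), while `make_problem` is a hypothetical helper that is not part of the original project:

```cpp
#include <cstddef>
#include <vector>

struct sparse_array {
    std::vector<unsigned int> idxs;
    std::vector<double> values;
    void add(unsigned int idx, double value) {
        idxs.push_back(idx);
        values.push_back(value);
    }
};

struct data {
    std::vector<sparse_array> cols, rows;
};

// Hypothetical helper: builds a data object whose cols and rows really
// contain num_cols and num_rows (empty) sparse_array elements.
data make_problem(std::size_t num_cols, std::size_t num_rows) {
    data prob;
    prob.cols.resize(num_cols);  // creates the elements, unlike reserve()
    prob.rows.resize(num_rows);
    return prob;
}
```

After `data prob = make_problem(num_cols, num_rows);`, calls such as `prob.cols[i].add(idx, value);` are valid for every i < num_cols. reserve is still useful one level down: if you can estimate how many entries each sparse_array will receive, calling reserve on its idxs and values avoids repeated reallocation while add() does the actual insertion.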