I am not exactly good at coming up with algorithm costs, so here I am asking.
Here is a vector initially initialized with 1000 elements:
vector<unsigned int> mFreeIndexes(1000);
I will continuously pop_back/push_back elements to the vector, but never push_back past 1000 elements (so the vector is never forced to reallocate).
In this case will the pop_back/push_back operations be O(1) or O(n)?
From the C++ standard 23.3.7.5:
void push_back(const T& x);
void push_back(T&& x);
Remarks: Causes reallocation if the new size is greater than the old capacity (...)
Note that it doesn't say it can't reallocate in the other scenario, but that would be a very unusual implementation of the standard. I think you can safely assume that push_back won't reallocate while there's still capacity.
The situation with pop_back is a bit more complicated. The standard says nothing about reallocation in the pop_back context, but it seems to be a universal implementation choice (with no known exception) that pop_back does not reallocate. There are some guarantees, though; see this:
Can pop_back() ever reduce the capacity of a vector? (C++)
Anyway, as long as you don't go over the predefined size, you are safe to assume that no reallocation happens and the complexity is indeed O(1).
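A minimal sketch of the scenario above (the loop bound and the value pushed are arbitrary): since the size never exceeds the initial 1000 elements, no reallocation should occur, which you can observe by checking that data() never moves.
#include <cassert>
#include <vector>
int main() {
    std::vector<unsigned int> mFreeIndexes(1000); // size 1000, capacity >= 1000
    const unsigned int* before = mFreeIndexes.data();
    // Pop and push within the existing capacity: no reallocation is expected,
    // so each operation is O(1) and the buffer address stays put.
    for (int i = 0; i < 100000; ++i) {
        mFreeIndexes.pop_back();
        mFreeIndexes.push_back(42u);
    }
    assert(mFreeIndexes.data() == before); // holds on every mainstream implementation
}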
Related
I have a std::vector on which I call reserve with a large value. Afterwards I retrieve data().
Since iterating over data() then crashes, I am wondering whether this is even allowed. Is reserve forced to update data() to the allocated memory range?
The guarantee of reserve is that subsequent insertions do not reallocate, and thus do not cause invalidation. That's it. There are no further guarantees.
Is reserve forced to update data to the allocated memory range?
No. The standard only guarantees that std::vector::data returns a pointer such that [data(), data() + size()) is a valid range; capacity is not involved.
§23.3.11.4/1 [vector.data]:
Returns: A pointer such that [data(), data() + size()) is a valid
range. For a non-empty vector, data() == addressof(front()).
There is no requirement that data() returns a dereferenceable pointer for an empty (size() == 0) vector, even if it has nonzero capacity. It might return nullptr or some arbitrary value (the only requirements in this case are that the pointer can be compared with itself and that 0 can be added to it without invoking UB).
I'd say the documentation is pretty clear on this topic: anything after data() + size() may be allocated but uninitialized memory. If you want this memory initialized as well, use vector::resize.
void reserve (size_type n);
Request a change in capacity
Requests that the vector capacity be at least enough to contain n elements.
If n is greater than the current vector capacity, the function causes
the container to reallocate its storage increasing its capacity to n
(or greater).
In all other cases, the function call does not cause a reallocation
and the vector capacity is not affected.
This function has no effect on the vector size and cannot alter its
elements.
I'm not sure why you would want to access anything after data() + size() after reserve() in the first place. The intended use of reserve() is to prevent unnecessary reallocations when you know or can estimate the expected size of your container, while avoiding the unnecessary initialization of memory, which may be inefficient or impractical (e.g. when non-trivial data for the initialization is not available). In this situation you can replace log(N) reallocations and copies with a single allocation, improving performance.
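A short sketch of the distinction (the size 16 is arbitrary): after reserve() only [data(), data() + size()) may be touched, so an empty-but-reserved vector exposes nothing through data(); resize() is what makes elements accessible.
#include <cstdio>
#include <vector>
int main() {
    std::vector<int> v;
    v.reserve(16);          // capacity() >= 16, size() still 0
    // v.data()[0] = 1;     // not allowed: [data(), data() + size()) is an empty range
    v.resize(16);           // size() == 16, elements value-initialized to 0
    v.data()[0] = 1;        // fine: index 0 is now inside [data(), data() + size())
    std::printf("size=%zu capacity=%zu\n", v.size(), v.capacity());
}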
I understand how random access iterators work for contiguous containers like std::vector: the iterator simply maintains a pointer to the current element and any additions/subtractions are applied to the pointer.
However, I'm baffled as to how similar functionality could be implemented for a non-contiguous container. My first guess for how std::deque::iterator works is that it maintains a pointer to some table of the groups of contiguous memory it contains, but I'm not sure.
How would a typical standard library implement this?
You can satisfy the requirements of a std::deque with, roughly, a std::vector<std::unique_ptr<std::array<T,N>>> plus a low/high water mark telling you where the first/last elements are (for an implementation-defined N that could vary with T; the std::arrays are actually blocks of properly aligned uninitialized memory and not real std::arrays, but you get the idea).
Use the usual exponential growth, but at both the front and the back.
Lookup simply computes (index + first) / N to find the block and (index + first) % N to find the element within it.
This is more expensive than a std::vector lookup, but is O(1).
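Here is a runnable toy version of that lookup; the block size, the fixed two-block setup, and the variable names are all invented for illustration.
#include <cstddef>
#include <cstdio>
#include <memory>
#include <vector>
constexpr std::size_t N = 4; // block size; real implementations pick N based on sizeof(T)
int main() {
    // Two fixed-size blocks holding the values 0..7, plus a "first element" offset.
    std::vector<std::unique_ptr<int[]>> blocks;
    for (int b = 0; b < 2; ++b) {
        blocks.push_back(std::make_unique<int[]>(N));
        for (std::size_t i = 0; i < N; ++i)
            blocks[b][i] = static_cast<int>(b * N + i);
    }
    std::size_t first = 0; // would move as elements are pushed/popped at the front
    // O(1) lookup: one division and one modulo, then two dereferences.
    std::size_t index = 5;
    std::size_t pos = index + first;
    std::printf("element %zu = %d\n", index, blocks[pos / N][pos % N]);
}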
A deque iterator can be implemented by storing both a pointer to the referenced value and a double pointer to the contiguous block of memory in which that value is located. The double pointer points into a contiguous array of pointers to blocks managed by the deque.
class deque_iterator
{
    T* value;  // points at the referenced element
    T** block; // points into the deque's contiguous array of block pointers
    …
};
Because both value and block point into contiguous memory, you can implement operations such as finding the distance between iterators in constant time (example adapted from libc++).
difference_type operator-(deque_iterator const& x, deque_iterator const& y)
{
return (x.block - y.block) * block_size
+ (x.value - *x.block)
- (y.value - *y.block);
}
Note that, while value will not be invalidated by operations such as push_front and push_back, block might be, which is why deque_iterator is invalidated by such operations.
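To make the advancing mechanics concrete, here is a self-contained toy model of that layout; toy_iterator, block_size, and the two stack-allocated blocks are all invented for the demo, and real implementations are more involved. Incrementing walks within a block and hops to the next block pointer at the boundary.
#include <cstdio>
constexpr int block_size = 4;
struct toy_iterator {
    int*  value; // current element
    int** block; // pointer into the array of block pointers
    toy_iterator& operator++() {
        ++value;
        if (value - *block == block_size) { // walked off the end of this block
            ++block;                        // move to the next block pointer...
            value = *block;                 // ...and to its first element
        }
        return *this;
    }
};
int main() {
    int b0[block_size] = {0, 1, 2, 3};
    int b1[block_size] = {4, 5, 6, 7};
    int* map[] = {b0, b1}; // the deque's "map" of blocks
    toy_iterator it{b0, map};
    for (int i = 0; i < 7; ++i, ++it)
        std::printf("%d ", *it.value); // prints 0 1 2 3 4 5 6, crossing the boundary
    std::printf("\n");
}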
We recently got into a discussion about whether the old memory is freed or not after a std::vector's size is increased.
When a std::vector's capacity is increased due to an insert, a new contiguous block of memory is allocated (typically twice the current capacity) and the old contents are copied over. And the old memory is freed.
Some in the discussion believe that the old memory is not freed but kept around, so over multiple resizes the std::vector starts accumulating memory that is not really needed.
To my understanding it frees the old memory, but I don't have any concrete documentation on it. However, my understanding may be wrong! I would appreciate it if anyone who knows the details would share them!
The vector definitely does not keep the memory. The allocator might, or operator new/delete might. Even the OS might keep the memory reserved for your program.
As you probably know, reallocations invalidate all pointers and iterators to the elements of the vector, according to the standard. If the old memory was kept somehow, the pointers and iterators would continue to be valid, since they would point at the same objects as before. Therefore, the standard implicitly says that memory is immediately released.
Of course, this does not mean that the runtime is forced to immediately wipe out that memory. In fact, it will most likely remain as it was until your product is deployed at the customer's site. Then it will explode in his face.
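A quick way to observe the reallocation (the element count is arbitrary, and recording the address as an integer sidesteps comparing a dangling pointer):
#include <cstdint>
#include <cstdio>
#include <vector>
int main() {
    std::vector<int> v(4, 0);
    auto old_addr = reinterpret_cast<std::uintptr_t>(v.data());
    v.resize(v.capacity() + 1); // size exceeds capacity -> reallocation
    auto new_addr = reinterpret_cast<std::uintptr_t>(v.data());
    std::printf("reallocated: %s\n", old_addr == new_addr ? "no" : "yes");
    // Any pointers/iterators saved before the resize are now invalid;
    // the old block has been handed back to the allocator.
}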
from N3690
23.3.7.5 vector modifiers [vector.modifiers]
iterator insert(const_iterator position, const T& x);
iterator insert(const_iterator position, T&& x);
iterator insert(const_iterator position, size_type n, const T& x);
template <class InputIterator>
  iterator insert(const_iterator position, InputIterator first, InputIterator last);
iterator insert(const_iterator position, initializer_list<T> il);
template <class... Args> void emplace_back(Args&&... args);
template <class... Args> iterator emplace(const_iterator position, Args&&... args);
void push_back(const T& x);
void push_back(T&& x);
1 Remarks: Causes reallocation if the new size is greater than the old
capacity. If no reallocation happens, all the iterators and references
before the insertion point remain valid. If an exception is thrown
other than by the copy constructor, move constructor, assignment
operator, or move assignment operator of T or by any InputIterator
operation there are no effects. If an exception is thrown by the move
constructor of a non-CopyInsertable T, the effects are unspecified.
2 Complexity: The complexity is linear in the number of elements
inserted plus the distance to the end of the vector.
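A small sketch of the non-reallocating case described in the Remarks (the reserved capacity and element values are chosen arbitrarily): with spare capacity, the insert cannot reallocate, and pointers before the insertion point stay valid.
#include <cassert>
#include <vector>
int main() {
    std::vector<int> v{1, 2, 3};
    v.reserve(8); // spare capacity, so the insert below must not reallocate
    int* before_insertion_point = &v[0];
    v.insert(v.begin() + 1, 42); // shifts the elements at positions 1 and 2
    // Per the quoted Remarks: no reallocation, and pointers and references
    // *before* the insertion point remain valid.
    assert(before_insertion_point == &v[0] && v[0] == 1);
}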
I'm using a bunch of std::vectors by setting their capacity at the beginning and using push_back to slowly fill them up. Most of these vectors will have the same size (16 elements), although some might get larger. If I use push_back 16 times on a vector with size 0 and capacity 16 initially, can I be sure that capacity will be exactly 16 after the push_backs?
Yes -- once you reserve a specific capacity, the vector will not be reallocated until you exceed the capacity you've set. Exactly how many more items you may be able to push without reallocation isn't specified, but you are guaranteed at least that many.
In particular, pointers and iterators into the vector are guaranteed to remain valid until you exceed the specified capacity.
23.3.6.5 [vector modifiers]
void push_back(const T& x);
void push_back(T&& x);
Remarks: Causes reallocation if the new size is greater than the old capacity
Pretty much self-explanatory.
Yup.
23.3.6.3p6:
No reallocation shall take place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity().
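Putting the guarantee together with the question, a minimal check; note that whether capacity() is exactly 16 after reserve(16) is implementation-specific, though mainstream implementations do give exactly 16.
#include <cassert>
#include <cstddef>
#include <vector>
int main() {
    std::vector<int> v;
    v.reserve(16);
    std::size_t cap = v.capacity(); // >= 16; exactly 16 on the usual implementations
    for (int i = 0; i < 16; ++i)
        v.push_back(i); // size never exceeds the reserved capacity: no reallocation
    assert(v.size() == 16 && v.capacity() == cap); // capacity unchanged by the push_backs
}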
The C++ standard seems to make no statement regarding side effects on capacity by either resize(n) with n < size(), or clear().
It does make a statement about the amortized cost of push_back and pop_back: O(1).
I can envision an implementation that does the usual sort of capacity changes à la CLRS (Cormen, Leiserson, Rivest, Stein), e.g. doubling when enlarging and halving when the size decreases to less than capacity()/4.
Does anyone have a reference for any implementation restrictions?
Calling resize() with a smaller size has no effect on the capacity of a vector. It will not free memory.
The standard idiom for freeing memory from a vector is to swap() it with an empty temporary vector: std::vector<T>().swap(vec);. If you want to resize downwards you'd need to copy from your original vector into a new local temporary vector and then swap the resulting vector with your original.
Update: C++11 added a member function shrink_to_fit() for this purpose; it's a non-binding request to reduce capacity() to size().
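Both techniques side by side (the sizes are arbitrary; the copy-and-swap variant keeps the elements, unlike swapping with an empty temporary):
#include <cstdio>
#include <vector>
int main() {
    std::vector<int> vec(1000);
    vec.resize(10); // size shrinks to 10, capacity typically stays around 1000
    // Pre-C++11 idiom: copy into a right-sized temporary, then swap buffers.
    std::vector<int>(vec).swap(vec);
    std::printf("after swap idiom:    capacity=%zu\n", vec.capacity());
    // C++11 alternative: a non-binding request to drop the excess capacity.
    vec.reserve(1000);
    vec.shrink_to_fit();
    std::printf("after shrink_to_fit: capacity=%zu\n", vec.capacity());
}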
Actually, the standard does specify what should happen:
This is from vector, but the theme is the same for all the containers (list, deque, etc...)
23.2.4.2 vector capacity [lib.vector.capacity]
void resize(size_type sz, T c = T());
6) Effects:
if (sz > size())
    insert(end(), sz-size(), c);
else if (sz < size())
    erase(begin()+sz, end());
else
    ; // do nothing
That is to say: If the size specified to resize is less than the number of elements, those elements will be erased from the container. Regarding capacity(), this depends on what erase() does to it.
I cannot locate it in the standard, but I'm pretty sure clear() is defined to be:
void clear()
{
erase(begin(), end());
}
Therefore, the effects clear() has on capacity() is also tied to the effects erase() has on it. According to the standard:
23.2.4.3 vector modifiers [lib.vector.modifiers]
iterator erase(iterator position);
iterator erase(iterator first, iterator last);
4) Complexity: The destructor of T is called the number of times equal to the number of the elements erased....
This means that the elements will be destructed, but the memory will remain intact. erase() has no effect on capacity, therefore resize() and clear() also have no effect.
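A quick demonstration of that conclusion (the initial size of 100 is arbitrary, and the capacity behaviour shown is what common implementations do rather than a hard standard guarantee):
#include <cstdio>
#include <vector>
int main() {
    std::vector<int> v(100);
    std::printf("capacity after construction: %zu\n", v.capacity()); // >= 100
    v.resize(10); // destroys 90 elements
    v.clear();    // destroys the remaining 10
    std::printf("size=%zu capacity=%zu\n", v.size(), v.capacity());
    // On common implementations capacity() is still >= 100 here:
    // the storage is kept around for reuse.
}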
The capacity will never decrease. I'm not sure if the standard states this explicitly, but it is implied: iterators and references to vector's elements must not be invalidated by resize(n) if n < capacity().
As I checked for gcc (MinGW), the only way to free vector capacity is what mattnewport says: swapping it with another temporary vector. This code does it for gcc.
#include <utility> // for std::swap

template<typename C> void shrinkContainer(C &container) {
    if (container.size() != container.capacity()) {
        C tmp = container;    // the copy is typically allocated with capacity == size
        using std::swap;
        swap(container, tmp); // container takes the tight buffer; tmp frees the old one
    }
    // now container.size() == container.capacity()
}