Complexity of downsizing STL vector


What is the time complexity of downsizing (reducing the size) an std::vector<int>?
I get that it does not reallocate memory. On custom classes, it may need
to call destructors for all elements that are removed. But with integers,
will the downsizing happen in constant time?

Depends on what you mean by "reducing the size" of a vector.
Usually, people remove elements from a vector by calling erase. If you erase stuff at the end of the vector, then things are simple, and all that happens is that the elements are destructed - which, as Remy pointed out, is a no-op for ints.
If you're erasing from somewhere other than the end, then the elements after the erased one have to be shifted down, and that takes time. Fortunately for your use case, copying an int is cheap, but it's not free. So removing an element from the beginning or middle of a vector cannot be constant time; it is linear in the number of elements that follow it.
Note: calling resize on a vector to make it smaller removes elements at the end.
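As a rough illustration of the point above, here is a minimal sketch that shrinks a vector<int> with resize and checks that the buffer was not moved. The helper name shrink_in_place is made up for this example; it is not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: shrinks v to new_size (must be <= v.size()) and
// reports whether the storage stayed where it was.
bool shrink_in_place(std::vector<int>& v, std::size_t new_size) {
    assert(new_size <= v.size());
    const int* before = v.data();
    v.resize(new_size);          // drops elements at the end; nothing to destroy for int
    return v.data() == before;   // same buffer: no reallocation, no copying
}
```

Calling shrink_in_place(v, 10) on a vector of 1000 ints should return true and leave v.size() at 10. On the common implementations capacity() stays at its old value as well; use shrink_to_fit() if you actually want the memory back (and note that even that is a non-binding request).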

Related

When it comes to a sequence, is "vector[n].push_back()" always O(1)?

I've used vector<int> v[N] a lot.
It's a very powerful tool for me.
I wonder whether v[n].push_back() costs O(1) on average.
I know that when a vector is full, it needs to grow, typically by doubling its capacity.
But isn't the sequence of vectors attached to each other?
If so, I think all vectors need to shift to the left which means it costs more than O(n).
To sum up, when it comes to sequence of vector, is v[n].push_back() always O(1)?
Please give me some help :D

It's not always O(1) anyway. See here (emphasis mine):
Complexity
Constant (amortized time, reallocation may happen).
If a reallocation happens, the reallocation is itself up to linear in
the entire size.
So even with just one vector, it's not guaranteed to be constant.
But isn't the sequence of vector attached each other? If so, I think
all vectors need to be shift which means it costs more than O(1).
This doesn't affect the runtime. The vectors in the array are independent of each other and manage their own storage dynamically (and separate from each other). The actual vector object in the array is always of the same size. When you modify an object in an array, it doesn't change the size of the object and move the rest of the array.

If so, I think all vectors need to shift to the left which means it costs more than O(n).
No, that's not the case.
A std::vector dynamically allocates memory for an array to place the elements in, and merely stores a pointer to that array (together with size and capacity). sizeof(std::vector<T>) is a compile-time constant. The amount of memory occupied by a std::vector inside a C array does not change when you add elements to the vector.
Hence, the complexity of push_back is not affected by placing the vector in an array.
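To make that concrete, here is a small sketch (the function name neighbour_untouched is invented for this example) that grows one vector in a C array and checks that a neighbouring vector's buffer is left alone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: grow one vector in a C array and report whether
// a neighbouring vector's buffer was left alone.
bool neighbour_untouched(std::size_t pushes) {
    std::vector<int> v[3];
    v[0] = {1, 2, 3};
    const int* neighbour_buffer = v[0].data();

    for (std::size_t i = 0; i < pushes; ++i) {
        v[1].push_back(static_cast<int>(i));   // may reallocate v[1]'s own heap buffer
    }
    // Each array slot holds only the fixed-size vector object.
    static_assert(sizeof(v) == 3 * sizeof(std::vector<int>),
                  "array slots have a fixed size");
    assert(v[1].size() == pushes);
    return v[0].data() == neighbour_buffer;
}
```

However many elements are pushed into v[1], only v[1]'s own heap buffer is ever reallocated; v[0] and v[2] are not shifted.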

Does std::vector::insert reserve by definition?

When calling the insert member function on a std::vector, will it reserve before "pushing back" the new items? I mean does the standard guarantee that or not?
In other words, should I do it like this:
std::vector<int> a{1,2,3,4,5};
std::vector<int> b{6,7,8,9,10};
a.insert(a.end(),b.begin(),b.end());
or like this:
std::vector<int> a{1,2,3,4,5};
std::vector<int> b{6,7,8,9,10};
a.reserve(a.size()+b.size());
a.insert(a.end(),b.begin(),b.end());
or another better approach?

Regarding the complexity of the function [link]:
Linear on the number of elements inserted (copy/move construction)
plus the number of elements after position (moving).
Additionally, if InputIterator in the range insert (3) is not at least
of a forward iterator category (i.e., just an input iterator) the new
capacity cannot be determined beforehand and the insertion incurs in
additional logarithmic complexity in size (reallocations).
Hence, there are two cases:
The new capacity can be determined beforehand, so you won't need to call reserve.
The new capacity can't be determined, so a call to reserve should be useful.

Does std::vector::insert reserve by definition?
Not always; depends on the current capacity.
From the draft N4567, §23.3.6.5/1 ([vector.modifiers]):
Causes reallocation if the new size is greater than the old capacity.
If the allocated memory capacity in the vector is large enough to contain the new elements, no additional allocations for the vector are needed. So no, then it won't reserve memory.
If the vector capacity is not large enough, then a new block is allocated, the current contents moved/copied over and the new elements are inserted. The exact allocation algorithm is not specified, but typically it would be as used in the reserve() method.
... or another better approach?
If you are concerned about too many allocations whilst inserting elements into the vector, then calling the reserve method with the size of the number of expected elements to be added does minimise the allocations.
Does the vector call reserve before the/any insertions? I.e. does it allocate enough capacity in a single allocation?
No guarantees. How would it know the distance between the two input iterators? Given that the insert method can take an InputIterator (i.e. a single-pass iterator), it has no way of calculating the expected size. Could the method calculate the size if the iterators were something else (e.g. pointers or a RandomAccessIterator)? Yes, it could. Would it? That depends on the implementation and the optimisations it makes.
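If you want to be sure rather than rely on the implementation, a sketch along these lines reserves first and then inserts. The helper name append_reserved is an assumption for illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: appends src to dst, reserving first so that the
// range insert cannot reallocate. Returns true if the buffer stayed put
// during the insert itself.
bool append_reserved(std::vector<int>& dst, const std::vector<int>& src) {
    assert(&dst != &src);                   // inserting a vector into itself is not allowed
    dst.reserve(dst.size() + src.size());   // at most one reallocation, and it happens here
    const int* buffer = dst.data();
    dst.insert(dst.end(), src.begin(), src.end());
    return dst.data() == buffer;            // new size <= capacity, so no reallocation
}
```

With the a and b from the question, append_reserved(a, b) leaves a holding 1 through 10. For forward iterators like these, the common implementations already size the buffer in one step, so the explicit reserve mostly matters for true input iterators.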

From the documentation, it seems that:
Causes reallocation if the new size() is greater than the old capacity().
Be aware also that in such a case all the iterators and references are invalidated.
So insert takes care of any reallocation itself. If you look at the insertions one at a time, each one behaves as if it first made sure there was capacity for size() + 1 elements; how much extra it actually allocates is up to the implementation.
You can argue that a reserve call at the top of the insertion would speed up everything in those cases when more than one reallocation takes place... Well, right, it could help, but it mostly depends on your actual problem.

will stl deque reallocate my elements (c++)?

Hi, I need an STL container which can be indexed like a vector but does not move old elements in memory the way a vector would on resize or reserve (unless I call reserve once at the beginning with a capacity large enough for all elements, which is not good for me). (Note that I bind addresses to the elements, so I expect the address of these elements to never change.) So I've found deque. Do you think it is good for this purpose? Important: I only need push_back, but I need to grow the container on demand in small chunks.

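std::deque "never invalidates pointers or references to the rest of the elements" when adding or removing elements at its back or front, so yes, when you only push_back the elements stay in place. Note that iterators into the deque are still invalidated by such insertions, so hold on to pointers or references rather than iterators.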
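A quick way to convince yourself is a sketch like the following (first_element_stays_put is a made-up name), which keeps a pointer to the first element while pushing many more at the back.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical check: keep a pointer to the first element, push many more
// at the back, and report whether the pointer still refers to that element.
bool first_element_stays_put(std::size_t extra) {
    std::deque<int> d;
    d.push_back(42);
    const int* first = &d.front();
    for (std::size_t i = 0; i < extra; ++i) {
        d.push_back(static_cast<int>(i));   // never relocates existing elements
    }
    assert(d.size() == extra + 1);
    return first == &d.front() && *first == 42;
}
```

The same test with a std::vector would be undefined behaviour as soon as a reallocation happened, because the saved pointer would dangle.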

A careful reading of the documentation seems to indicate that so long as you insert at the beginning or the end it will not invalidate pointers, and invalidating pointers is a sign that the data is being copied or moved.
The way it's constructed is not quite like a linked list, where each element is allocated individually; it is typically a sequence of fixed-size arrays, presumably for performance reasons. Inserting or erasing elements in the middle will necessitate moving data around.

c++ inserting elements at the end of a vector

I am experiencing a problem with the vector container. I am trying to improve the performance of inserting a lot of elements into one vector.
Basically I am using vector::reserve to expand my vector _children if needed:
if (_children.capacity() == _children.size())
{
    _children.reserve(_children.size() * 2);
}
and using vector::at() to insert a new element at the end of _children instead of vector::push_back():
_children.at(_children.size()) = child;
_children has already one element in it, so the first element should be inserted at position 1, and the capacity at this time is 2.
Despite this, an out_of_range error is thrown. Can someone explain to me, what I misunderstood here? Is it not possible to just insert an extra element even though the chosen position is less than the vector capacity? I can post some more code if needed.
Thanks in advance.
/mads

Increasing the capacity doesn't increase the number of elements in the vector. It simply ensures that the vector has capacity to grow up to the required size without having to reallocate memory. I.e., you still need to call push_back().
Mind you, calling reserve() to increase capacity geometrically is a waste of effort. std::vector already does this.
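Here is a minimal reproduction of the situation in the question, as a sketch; the function name at_past_size_throws and the values are invented for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical reproduction of the question's setup: one element, capacity
// raised by reserve(), then an attempt to write through at(size()).
bool at_past_size_throws() {
    std::vector<int> children{10};
    children.reserve(2);                     // capacity() >= 2, size() still 1
    assert(children.size() == 1);
    try {
        children.at(children.size()) = 20;   // index 1 is not < size(), so this throws
    } catch (const std::out_of_range&) {
        children.push_back(20);              // the supported way to add an element
        return children.size() == 2 && children[1] == 20;
    }
    return false;                            // not reached: at() always throws here
}
```

The reserved slot exists as raw storage only. No element lives there until push_back (or emplace_back, insert, resize) constructs one.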

This causes accesses out of bounds. Reserving memory does not affect the size of the vector.
Basically, you are doing manually what push_back does internally. Why do you think it would be any more efficient?

That's not what at() is for. at() is a checked version of [], i.e. accessing an element. But reserve() does not change the number of elements.
You should just use reserve() followed by push_back or emplace_back or insert (at the end); all those will be efficient, since they will not cause reallocations if you stay under the capacity limit.
Note that the vector already behaves exactly like you do manually: When it reaches capacity, it resizes the allocated memory to a multiple of the current size. This is mandated by the requirement that adding elements have amortized constant time complexity.

Neither at nor reserve increases the size of the vector (the latter increases the capacity but not the size).
Also, your attempted optimization is almost certainly redundant; you should simply push_back the elements into the vector and rely on std::vector to expand its capacity in an intelligent manner.

You have to differentiate between the capacity and the size. You can only assign within size, and reserve only affects the capacity.

vector::reserve is only internally reserving space but is not constructing objects and is not changing the external size of the vector. If you use reserve you need to use push_back.
Additionally, vector::at does range checking, which adds some overhead compared to vector::operator[].
What you are doing is trying to mimic part of the behaviour vector already implements internally. It is going to grow its capacity by a certain factor (usually around 1.5 or 2) every time it runs out of space. If you know that you are pushing back many objects and only want one reallocation, use:
vec.reserve(vec.size() + nbElementsToAdd);
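If you do this repeatedly for small numbers of elements, it is potentially worse than the default behaviour of vector, because each reserve may reallocate to just the requested capacity and defeat the geometric growth.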
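To see the effect of reserving up front, a sketch like this counts how often the capacity changes while pushing n elements. The function name count_capacity_changes is an assumption for illustration; a capacity change is used here as a stand-in for a reallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: pushes n ints and counts how often capacity() changes
// while doing so. With reserve_first, the capacity is requested up front.
std::size_t count_capacity_changes(std::size_t n, bool reserve_first) {
    std::vector<int> vec;
    if (reserve_first) {
        vec.reserve(vec.size() + n);        // same pattern as in the answer above
    }
    std::size_t changes = 0;
    std::size_t capacity = vec.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        vec.push_back(static_cast<int>(i));
        if (vec.capacity() != capacity) {   // capacity changed, so storage was reallocated
            ++changes;
            capacity = vec.capacity();
        }
    }
    assert(vec.size() == n);
    return changes;
}
```

With reserve_first set, the count is zero. Without it, the count is small compared to n on the usual implementations, since capacity grows geometrically; the exact number depends on the growth factor.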

The capacity of a vector is not the number of elements it has, but the number of elements it can hold without allocating more memory. The capacity is equal to or larger than the number of elements in the vector.
In your example, _children.size() is 1, but there is no element at position 1. You can only use assignment to replace existing elements, not for adding new ones. Per definition, the last element is at _children.at(_children.size()-1).
The correct way is just to use push_back(), which is highly optimized, and faster than inserting at an index. If you know beforehand how many elements you want to add, you can of course use reserve() as an optimization.
It's not necessary to call reserve manually, as the vector will automatically grow the internal storage if necessary. Actually, I believe what you do in your example is similar to what the vector does internally anyway: when it reaches capacity, it reserves about twice the current size.
See also http://www.cplusplus.com/reference/stl/vector/capacity/

What is a truly empty std::vector in C++?

I've got two vectors in class A that contain objects of other classes, B and C. I know exactly how many elements these vectors are supposed to hold at maximum. In the initializer list of class A's constructor, I initialize these vectors to their max sizes (constants).
If I understand this correctly, I now have a vector of objects of class B that have been initialized using their default constructor. Right? When I wrote this code, I thought this was the only way to deal with things. However, I've since learned about std::vector::reserve() and I'd like to achieve something different.
I'd like to allocate memory for these vectors to grow as large as possible because adding to them is controlled by user-input, so I don't want frequent resizings. However, I iterate through this vector many, many times per second and I only currently work on objects I've flagged as "active". To have to check a boolean member of class B/C on every iteration is silly. I don't want these objects to even BE there for my iterators to see when I run through this list.
Is reserving the max space ahead of time and using push_back to add a new object to the vector a solution to this?

A vector has capacity and it has size. The capacity is the number of elements for which memory has been allocated. Size is the number of elements which are actually in the vector. A vector is empty when its size is 0. So, size() returns 0 and empty() returns true. That says nothing about the capacity of the vector at that point (that would depend on things like the number of insertions and erasures that have been done to the vector since it was created). capacity() will tell you the current capacity - that is the number of elements that the vector can hold before it will have to reallocate its internal storage in order to hold more.
So, when you construct a vector, it has a certain size and a certain capacity. A default-constructed vector will have a size of zero and an implementation-defined capacity. You can insert elements into the vector freely without worrying about whether the vector is large enough - up to max_size() - max_size() being the maximum capacity/size that a vector can have on that system (typically large enough not to worry about). Each time that you insert an item into the vector, if it has sufficient capacity, then no memory is allocated. However, if inserting that element would exceed the capacity of the vector, then the vector's memory is internally reallocated so that it has enough capacity to hold the new element as well as an implementation-defined number of further elements (typically, the vector will double in capacity or grow by a similar factor), and the element is inserted into the vector. This happens without you having to worry about increasing the vector's capacity. And it happens in amortized constant time, so you don't generally need to worry about it being a performance problem.
If you do find that you're adding to a vector often enough that many reallocations occur, and it's a performance problem, then you can call reserve(), which will set the capacity to at least the given value. Typically, you'd do this when you have a very good idea of how many elements your vector is likely to hold. However, unless you know that it's going to be a performance issue, it's probably a bad idea. It's just going to complicate your code. And amortized constant time will generally be good enough to avoid performance issues.
You can also construct a vector with a given number of default-constructed elements as you mentioned, but unless you really want those elements, then that would be a bad idea. vector is supposed to make it so that you don't have to worry about reallocating the container when you insert elements into it (like you would have to with an array), and default-constructing elements in it for the purposes of allocating memory is defeating that. If you really want to do that, use reserve(). But again, don't bother with reserve() unless you're certain that it's going to improve performance. And as was pointed out in another answer, if you're inserting elements into the vector based on user input, then odds are that the time cost of the I/O will far exceed the time cost in reallocating memory for the vector on those relatively rare occasions when it runs out of capacity.
Capacity-related functions:
capacity() // Returns the number of elements that the vector can hold
reserve() // Sets the minimum capacity of the vector.
Size-related functions:
clear() // Removes all elements from the vector.
empty() // Returns true if the vector has no elements.
resize() // Changes the size of the vector.
size() // Returns the number of items in the vector.

Yes, reserve(n) will allocate space without actually putting elements there - increasing capacity() without increasing size().
BTW, if "adding to them is controlled by user-input" means that the user hits "insert X" and you insert X into the vector, you need not worry about the overhead of resizing. Waiting for user input is many times slower than the amortized constant resizing performance.

Your question is a little confusing, so let me try to answer what I think you asked.
Let's say you have a vector<B> which you default-construct. You then call vec.reserve(100). Now, vec contains 0 elements. It's empty. vec.empty() returns true and vec.size() returns 0. Every time you call push_back, you will insert one element, and until vec contains 100 elements, there will be no reallocation.
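The difference between constructing a vector with a size and reserving capacity can be sketched as follows, assuming C++11 or later. Tracked and both function names are hypothetical stand-ins for the question's class B, made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts its default constructions.
struct Tracked {
    static int default_constructed;
    Tracked() { ++default_constructed; }
};
int Tracked::default_constructed = 0;

// Returns how many elements were default-constructed by reserve(n).
int constructions_from_reserve(std::size_t n) {
    Tracked::default_constructed = 0;
    std::vector<Tracked> vec;
    vec.reserve(n);                          // raw storage only, no objects yet
    assert(vec.empty() && vec.capacity() >= n);
    return Tracked::default_constructed;
}

// Returns how many elements were default-constructed by vector(n).
int constructions_from_sized(std::size_t n) {
    Tracked::default_constructed = 0;
    std::vector<Tracked> vec(n);             // n real elements that iteration will visit
    assert(vec.size() == n);
    return Tracked::default_constructed;
}
```

constructions_from_reserve(100) reports 0 and constructions_from_sized(100) reports 100, which is why reserve plus push_back keeps inactive objects out of the iteration.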
