The main issue with vectors and pointers to their elements is that they can be reallocated in memory whenever a push_back is called, rendering the pointer invalid.
I am trying to implement a suffix trie, where I store a node data structure in a vector of nodes. I know that for a string of size n, the number n(n+1)/2 is an upper bound for the number of nodes in the trie.
So will the code
std::string T = "Hello StackOverflow!";
std::vector<Node> nodes;
int n = T.length();
nodes.reserve(n*(n+1)/2);
guarantee that any pointers I create referring to elements of nodes will not be invalidated? i.e. will this guarantee that the vector is not reallocated?
Edit: I've implemented this and I keep getting the following error at runtime.
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::at: __n (which is 0) >= this->size() (which is 0)
Aborted (core dumped)
Any ideas what could be causing this?
According to the standard (N4140):
23.3.6.3 vector capacity
....
void reserve(size_type n);
....
After reserve(), capacity() is greater or equal to the argument of reserve if
reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens
at this point if and only if the current capacity is less than the argument of reserve().
and
23.3.6.5 vector modifiers
....
void push_back(const T& x);
void push_back(T&& x);
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens,
all the iterators and references before the insertion point remain valid.
You can be certain that your pointers will not be invalidated if you are careful. See std::vector::push_back. It says this about invalidation:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
Simply make sure you do not push_back beyond capacity() or call other member functions that may invalidate. The 'Iterator invalidation' section of the std::vector reference lists the member functions that invalidate.
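To make the rule above concrete, here is a minimal sketch. The Node type and the function name are placeholders of mine, not the asker's code. It reserves n(n+1)/2 slots, takes a pointer to the first element, and then fills the vector up to (but not past) its capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Node type.
struct Node {
    char label;
};

// Returns true if a pointer taken after reserve() still addresses the first
// element once the vector has been filled up to its capacity.
bool pointers_survive_within_capacity(const std::string& text) {
    const std::size_t n = text.length();
    std::vector<Node> nodes;
    nodes.reserve(n * (n + 1) / 2);          // capacity() >= n(n+1)/2 from here on

    nodes.push_back(Node{'^'});              // root node
    Node* root = &nodes[0];                  // pointer taken after reserve()
    const std::size_t cap = nodes.capacity();

    // Stay within the capacity: none of these push_back calls may reallocate.
    while (nodes.size() < cap) {
        nodes.push_back(Node{'x'});
    }
    assert(nodes.capacity() == cap);         // no reallocation happened
    return root == nodes.data() && root->label == '^';
}
```

One more push_back after that loop would exceed the capacity, and root would then dangle.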
Which operation is most costly in C++?
1. Resize of a vector (decrease size by 1)
2. Remove last element in vector
From http://en.cppreference.com/w/cpp/container/vector which essentially quotes the Standard:
void pop_back();
Removes the last element of the container.
No iterators or references except for back() and end() are invalidated.
void resize( size_type count );
Resizes the container to contain count elements. If the current size
is greater than count, the container is reduced to its first count
elements as if by repeatedly calling pop_back().
So in this case, calling resize(size() - 1) should be equivalent to calling pop_back(). However, calling pop_back() is the right thing to do as it expresses your intent.
NOTE: the answer is reflecting the changed interface of C++11's std::vector::resize(), which used to contain a hidden default argument which was being copied around (and which may or may not have been optimized away).
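A small sketch of that equivalence, assuming a non-empty vector (both calls require one, and size() - 1 would wrap around on an empty vector). The function name is mine.

```cpp
#include <cassert>
#include <vector>

// Removes the last element in two ways and reports whether the results match.
bool pop_back_matches_resize() {
    std::vector<int> a{1, 2, 3, 4};
    std::vector<int> b = a;
    assert(!a.empty() && !b.empty());   // precondition for both calls

    const int* first_a = a.data();
    const int* first_b = b.data();

    a.pop_back();                       // states the intent directly
    b.resize(b.size() - 1);             // same effect, less obvious at the call site

    // Same contents, and neither call moved the remaining elements.
    return a == b && a.data() == first_a && b.data() == first_b;
}
```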
In my opinion they are equivalent. Both operations remove the last element and decrease the size. :)
According to the C++ Standard
void resize(size_type sz); 12 Effects: If sz <= size(), equivalent to
calling pop_back() size() - sz times
So they are simply equivalent, both in my opinion and according to the Standard. :)
Also, if we consider the member function erase instead of pop_back (they do the same thing in this case), then according to the same Standard
4 Complexity: The destructor of T is called the number of times equal
to the number of the elements erased, but the move assignment
operator of T is called the number of times equal to the number of
elements in the vector after the erased elements.
As no elements follow the last one, no move assignments are needed, so the cost is the same.
Because vectors use an array as their underlying storage, inserting
elements in positions other than the vector end causes the container
to relocate all the elements that were after position to their new
positions.
< http://www.cplusplus.com/reference/vector/vector/insert/ >
I thought that this is the reason that iterator it becomes no longer valid after the last line in code below:
std::vector<int> myvector (3,100);
std::vector<int>::iterator it;
it = myvector.begin();
it = myvector.insert ( it , 200 );
myvector.insert (it,2,300);
But if I change its initialization to myvector.end();, the result is still the same. What is the reason behind this? How exactly does it work, and are there situations where the iterator passed to insert is still valid after inserting one or more elements into the vector?
Well yes, that is the reason why iterators to elements at and after the insertion point are invalidated (at, because the insertion is done before the given element). If you insert at the end, then there are no elements whose iterators could be invalidated. The end iterator is always invalidated, no matter where you insert. The more relevant description on that page:
Iterator validity
If a reallocation happens, all iterators, pointers and references related to the container are invalidated.
Otherwise, only those pointing to position and beyond are invalidated, with all iterators, pointers and references to elements before position guaranteed to keep referring to the same elements they were referring to before the call.
Here's what it points to if you change the first assignment to end.
it = myvector.end();
it points to end, good.
it = myvector.insert ( it , 200 );
Inserting to end does not invalidate any pointers to elements, but it does invalidate the end iterator which is the old value for it. Luckily, you now assign to the iterator returned by insert. That iterator does not point to the end of the vector but to the newly inserted element.
myvector.insert (it,2,300);
Now it is invalidated again, but you don't reassign it, so it remains so.
Of course, then there is the possibility, after each insert, that the vector was reallocated in which case all previous iterators to any part of the vector would be invalidated. That can be avoided by guaranteeing sufficient space with vector::reserve before initializing the iterators. The new iterator returned by insert will always be valid, even if the vector was reallocated.
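As a sketch of that advice, here is the question's snippet rewritten so that it is reassigned from every insert. The reserve(16) call and the function name are my additions. Since C++11 the count overload of insert also returns an iterator, pointing at the first inserted element.

```cpp
#include <cassert>
#include <vector>

// Rebuilds the question's example, keeping `it` valid after every insert.
std::vector<int> insert_with_valid_iterator() {
    std::vector<int> myvector(3, 100);
    myvector.reserve(16);                       // optional: rules out reallocation below

    std::vector<int>::iterator it = myvector.begin();
    it = myvector.insert(it, 200);              // it now refers to the new 200
    it = myvector.insert(it, 2, 300);           // it now refers to the first new 300
    assert(*it == 300);                         // safe: it was just reassigned
    return myvector;
}
```

Without the reassignments, the second insert would receive an iterator that is valid, but it would be invalid afterwards, exactly as described above.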
Here is a better reference and explanation.
Causes reallocation if the new size() is greater than the old capacity(). If the new size() is greater than capacity(), all iterators and references are invalidated. Otherwise, only the iterators and references before the insertion point remain valid. The past-the-end iterator is also invalidated.
— http://en.cppreference.com/w/cpp/container/vector/insert
In the case where you use end(), size() still goes above capacity(). Try setting the capacity to something larger before insert().
I'm trying to use another project's code and they have structs of this form:
#include <vector>

// Defined first so that data can hold vectors of it.
struct sparse_array {
    std::vector<unsigned int> idxs;
    std::vector<double> values;
    // Appends one (index, value) pair.
    void add(unsigned int idx, double value) {
        idxs.push_back(idx);
        values.push_back(value);
    }
};

struct data {
    std::vector<sparse_array> cols, rows;
};
For my code, I tried using the following lines:
data prob;
prob.cols.reserve(num_cols);
prob.rows.reserve(num_rows);
// Some loop that calls
prob.cols[i].add(idx, value);
prob.rows[i].add(idx, value);
And when I output the values prob.rows[i].values[j] to a file, I get all zeros. But when I use resize instead of reserve, I get the actual values that I read in. Can someone give me an explanation for this?
Function reserve() simply allocates a contiguous region of memory big enough to hold the number of items you specify and moves the vector's old content into this new block, which makes sure no more reallocations for the vectors' storage will be done upon insertions as long as the specified capacity is not exceeded. This function is used to reduce the number of reallocations (which also invalidate iterators), but does not insert any new items at the end of your vector.
From the C++11 Standard, Paragraph 23.3.6.3/1 about reserve():
A directive that informs a vector of a planned change in size, so that it can manage the storage
allocation accordingly. After reserve(), capacity() is greater or equal to the argument of reserve if reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens at this point if and only if the current capacity is less than the argument of reserve(). If an exception is thrown other than by the move constructor of a non-CopyInsertable type, there are no effects.
Notice that by doing prob.cols[i].add(idx, value); you get undefined behavior, since after reserve() alone the vector is still empty and every index i is out of bounds.
On the other hand, function resize() does insert items at the end of your vector, so that the final size of the vector will be the one you specified (this means it can even erase elements, if you specify a size smaller than the current one). If you specify no second argument to a call to resize(), the newly inserted items will be value-initialized. Otherwise, they will be copy-initialized from the value you provide.
From the C++11 Standard, Paragraph 23.3.6.3/9 about resize():
If sz <= size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends
sz - size() value-initialized elements to the sequence.
So to sum it up, the reason why accessing your vector after invoking resize() gives the expected result is that items are actually being added to the vector. On the other hand, since the call to reserve() does not add any item, subsequent accesses to non-existing elements will give you undefined behavior.
If the vector is empty, then std::vector::resize(n) expands the content of this vector by inserting n new elements at the end. std::vector::reserve(n) only reallocates the memory block that your vector uses for storing its elements so that it's big enough to hold n elements.
Then when you call prob.cols[i], you are trying to access the element at index i. If you only called reserve before, this accesses memory where no element exists yet, which is undefined behavior.
So just use resize in this case :)
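A minimal sketch of the difference, reusing the sparse_array struct from the question. The make_cols helper and the values it stores are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct sparse_array {
    std::vector<unsigned int> idxs;
    std::vector<double> values;
    void add(unsigned int idx, double value) {
        idxs.push_back(idx);
        values.push_back(value);
    }
};

// Builds `count` columns and stores one entry in each.
std::vector<sparse_array> make_cols(std::size_t count) {
    std::vector<sparse_array> cols;
    cols.reserve(count);            // capacity only: cols.size() is still 0
    assert(cols.empty());           // so cols[i] would be out of bounds here
    cols.resize(count);             // now cols[0] .. cols[count - 1] exist
    for (std::size_t i = 0; i < cols.size(); ++i) {
        cols[i].add(static_cast<unsigned int>(i), 1.5 * static_cast<double>(i));
    }
    return cols;
}
```

The reserve() call is redundant once resize() is used; it is kept only to show that it leaves the size at zero.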
According to Stroustrup, The C++ Programming Language:
"When a vector is resized to accommodate more (or fewer) elements, all of its elements may be
moved to new locations."
Does this hold true even if the vector is resized to a smaller size?
Case 1: If the new size being requested is greater than the current std::vector::capacity() then all elements will be relocated.
Case 2: If the new size being requested is not greater than the current std::vector::capacity(), then there will be no relocation of elements.
Standardese evidence:
The standard defines effect of vector::resize() as:
C++11 Standard 23.3.6.3/12 vector capacity:
void resize(size_type sz, const T& c);
Effect:
if (sz > size())
insert(end(), sz-size(), c);
else if (sz < size())
erase(begin()+sz, end());
else
; // do nothing
As @DavidRodríguez-dribeas correctly points out, the iterator invalidation rules for the std::vector::insert() operation are:
23.3.6.5 vector modifiers
1 [insert,push_back,emplace,emplace_back]
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid.
Essentially this means:
All iterators and references before the point of insertion are unaffected unless the new container size is greater than the previous capacity, because in that case all elements might be moved to new locations, invalidating pointers and iterators to the original locations. Since resize() only erases or inserts elements at the end of the container [Note 1], the governing factor boils down to the requested size versus the current capacity.
Hence the Case 1 result.
In Case 2 std::vector::erase() will be called and the invalidation rule in this case is:
23.3.6.5 vector modifiers
iterator erase(const_iterator position);
3 Effects: Invalidates iterators and references at or after the point of the erase.
As stated in [Note 1], elements are only removed at the end, so there is no need to relocate the remaining elements.
...elements may be moved to new locations."
Notice how it says may be moved. So that would imply that it depends on what kind of resize it is.
Iterators in a vector are invalidated for two reasons: (1) an element is inserted or removed before the location of the iterator, or (2) the whole buffer is relocated because the vector needs to grow its capacity. The key here is a change to the capacity().
resize() only inserts or removes at the end of the container. When the vector shrinks, only those iterators referring to the removed elements (and end()) become invalidated. When the vector grows, no iterator to an element becomes invalid if the new size does not exceed capacity(), and all iterators are invalidated if it does.
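The three situations can be observed with a sketch like the following. The struct and function names are mine, and comparing data() before and after is just a convenient way to detect relocation.

```cpp
#include <cassert>
#include <vector>

// Records whether the elements stayed in place across three resize() calls.
struct ResizeReport {
    bool shrink_kept_address;
    bool grow_within_capacity_kept_address;
    bool grow_past_capacity_kept_values;
};

ResizeReport observe_resize() {
    std::vector<int> v(8, 7);
    v.reserve(32);
    assert(v.capacity() >= 32);
    const int* before = v.data();                  // taken after reserve()
    ResizeReport r{};

    v.resize(4);                                   // erases v[4] .. v[7] only
    r.shrink_kept_address = (v.data() == before);

    v.resize(16);                                  // 16 <= capacity(): no reallocation
    r.grow_within_capacity_kept_address = (v.data() == before);

    v.resize(v.capacity() + 1);                    // exceeds capacity(): reallocates
    // `before` must not be dereferenced from here on; only the values carry over.
    r.grow_past_capacity_kept_values = (v[0] == 7 && v[3] == 7);
    return r;
}
```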
Since Als provided incorrect evidence [1], I am adding the correct quotes here:
23.3.6.5 vector modifiers
1 [insert,push_back,emplace,emplace_back]
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid.
2 [erase]
Effects: Invalidates iterators and references at or after the point of the erase.
Similar quotes can be found in C++03.
[1] I am not duplicating the quote that establishes the equivalence of resize to either insert or erase; that part is right.
The answer is in the body of the question: "When a vector is resized to accommodate more (or fewer) elements, all of its elements may be moved to new locations."
The C++ standard seems to make no statement regarding side-effects on capacity by either
resize(n), with n < size(), or clear().
It does make a statement about amortized cost of push_back and pop_back - O(1)
I can envision an implementation that does the usual sort of capacity changes
à la CLRS Algorithms (e.g. double when enlarging, halve when decreasing size to < capacity()/4).
(Cormen, Leiserson, Rivest, Stein)
Does anyone have a reference for any implementation restrictions?
Calling resize() with a smaller size has no effect on the capacity of a vector. It will not free memory.
The standard idiom for freeing memory from a vector is to swap() it with an empty temporary vector: std::vector<T>().swap(vec);. If you want to resize downwards you'd need to copy from your original vector into a new local temporary vector and then swap the resulting vector with your original.
Updated: C++11 added a member function shrink_to_fit() for this purpose, it's a non-binding request to reduce capacity() to size().
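A sketch of both approaches, with helper names of my own choosing. How much capacity is left afterwards is up to the implementation, so the code only relies on the size and contents being preserved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shrinks v to `keep` elements and returns the capacity observed afterwards.
std::size_t resize_down(std::vector<int>& v, std::size_t keep) {
    assert(keep <= v.size());
    v.resize(keep);                  // size drops, the buffer is kept
    return v.capacity();
}

// Pre-C++11 idiom: swap with a temporary copy that allocated only what it needs.
void trim_with_swap(std::vector<int>& v) {
    std::vector<int>(v).swap(v);
}

// C++11: a non-binding request to reduce capacity() to size().
void trim_with_shrink_to_fit(std::vector<int>& v) {
    v.shrink_to_fit();
}
```

On the implementations I know of, resize_down leaves capacity() where it was, and either trim function brings it down to about size().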
Actually, the standard does specify what should happen:
This is from vector, but the theme is the same for all the containers (list, deque, etc...)
23.2.4.2 vector capacity [lib.vector.capacity]
void resize(size_type sz, T c = T());
6) Effects:
if (sz > size())
insert(end(), sz-size(), c);
else if (sz < size())
erase(begin()+sz, end());
else
; //do nothing
That is to say: If the size specified to resize is less than the number of elements, those elements will be erased from the container. Regarding capacity(), this depends on what erase() does to it.
I cannot locate it in the standard, but I'm pretty sure clear() is defined to be:
void clear()
{
erase(begin(), end());
}
Therefore, the effect clear() has on capacity() is also tied to the effect erase() has on it. According to the standard:
23.2.4.3 vector modifiers [lib.vector.modifiers]
iterator erase(iterator position);
iterator erase(iterator first, iterator last);
4) Complexity: The destructor of T is called the number of times equal to the number of the elements erased....
This means that the elements will be destructed, but the memory will remain intact. erase() has no effect on capacity, therefore resize() and clear() also have no effect.
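This is easy to check empirically. The following sketch (the function name is mine) only reports what a given implementation does; it is not a proof of what the standard requires.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if neither erase() nor clear() changed the capacity of a test vector.
bool erase_and_clear_keep_capacity() {
    std::vector<int> v(100, 42);
    const std::size_t cap = v.capacity();
    assert(cap >= 100);

    v.erase(v.begin() + 10, v.end());        // destroys 90 elements
    const bool after_erase = (v.size() == 10 && v.capacity() == cap);

    v.clear();                               // destroys the remaining 10
    const bool after_clear = (v.empty() && v.capacity() == cap);

    return after_erase && after_clear;
}
```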
The capacity will never decrease. I'm not sure if the standard states this explicitly, but it is implied: iterators and references to vector's elements must not be invalidated by resize(n) if n < capacity().
As I checked for gcc (MinGW), the only way to free vector capacity is what mattnewport says:
swapping it with another temporary vector.
This code does it for gcc.
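#include <utility>  // std::swap

template<typename C> void shrinkContainer(C &container) {
    if (container.size() != container.capacity()) {
        C tmp = container;       // the copy usually allocates only size() elements
        using std::swap;
        swap(container, tmp);    // container takes over the smaller buffer
    }
    // In practice container.size() == container.capacity() now (not guaranteed by the standard)
}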