Erasing while traversing in C++ stl map giving runtime error - c++

Following lines of C++ code gives runtime error but if erase operation mymap.erase(v) is removed it works:
map<int,int> mymap = {{1,0},{2,1},{9,2},{10,3},{11,4}};
for(auto it=mymap.rbegin();it!=mymap.rend();){
int v=it->first;
++it;
mymap.erase(v);
}
demo
Here iterator it is changed before deleting its value v, so iterator it should remain unaffected I believe.

When you are calling erase(v), you are invalidating the base iterator that the next reverse_iterator (from ++it) is using. So you need to create a new reverse_iterator from the base iterator that precedes the erased value.
Also, rather than erasing the value that the reverse_iterator is referring to, you should erase the base iterator instead, since you already know which element you want to erase. There is no need to make the map go hunting for the value again.
This works for me:
map<int,int> mymap = {{1,0},{2,1},{9,2},{10,3},{11,4}};
for(auto it = mymap.rbegin(); it != mymap.rend(); ){
auto v = --(it.base());
v = mymap.erase(v);
it = map<int,int>::reverse_iterator(v);
}
Demo
On the other hand, this loop is essentially just erase()'ing all elements from mymap, so a better option is to use mymap.clear() instead.

Indeed, std::map::erase:
References and iterators to the erased elements are invalidated. Other references and iterators are not affected.
But std::reverse_iterator
For a reverse iterator r constructed from an iterator i, the relationship &*r == &*(i-1) is always true (as long as r is dereferenceable); thus a reverse iterator constructed from a one-past-the-end iterator dereferences to the last element in a sequence.
Perhaps it gets more clear when you look at the image on the cppreference page.
The crucial part is "Reverse iterator stores an iterator to the next element than the one it actually refers to".
As a consequence (small modification to your code)
auto& element = *it; // fails in the next iteration, because...
int v = element.first;
++it; // now it stores an iterator to element
mymap.erase(v); // iterators to element are invalidated
You are erasing the element that is used by it in the next iteration.

Related

Got singular iterator error in looping with iterator and pop_back

Giving the below code (say it's named deque.cpp)
#include <cstdio>
#include <deque>
int main()
{
std::deque<int> d = {1, 2, 3};
for (auto it = d.rbegin(); it != d.rend();) {
printf("it: %d\n", *it);
++it;
d.pop_back();
}
return 0;
}
Compile with g++ -std=c++11 -o deque deque.cpp, it runs well:
$ ./deque
it: 3
it: 2
it: 1
But if compile with -D_GLIBCXX_DEBUG (g++ -std=c++11 -o deque_debug deque.cpp -D_GLIBCXX_DEBUG, it gets below error:
$ ./deque_debug
it: 3
/usr/include/c++/4.8/debug/safe_iterator.h:171:error: attempt to copy-
construct an iterator from a singular iterator.
...
It looks like the second loop's ++it is constructing from an singular iterator.
But I think after the first loop's ++it, the iterator points to 2, and pop_back() should not invalidate it. Then why the error occurs?
Note: I know the code could be re-write as below:
while (!d.empty()) {
auto it = d.rbegin();
printf("it: %d\n", *it);
d.pop_back();
}
And the error will be gone.
But I do want to know what exactly happens on the error code. (Does it mean the reverse iterator dose not actually point to the node I expect, but the one after it?)
Update: #Barry's answer resolved the question.
Please let me put an extra related question: the code
for (auto it = d.rbegin(); it != d.rend();) {
printf("it: %d\n", *it);
d.pop_back();
++it; // <== moved below pop_back()
}
is expected to be wrong, where ++it should be operating on an invalidated iterator. But why the code does not cause error?
The problem here arises from what reverse iterators actually are. The relevant relationship for a reverse iterator is:
For a reverse iterator r constructed from an iterator i, the relationship &*r == &*(i-1) is always true (as long as r is dereferenceable); thus a reverse iterator constructed from a one-past-the-end iterator dereferences to the last element in a sequence.
When we then do std::deque::pop_back(), we invalidate:
Iterators and references to the erased element are invalidated. The past-the-end iterator is also invalidated. Other references and iterators are not affected.
rbegin() is constructed from end(). After we increment it the first time, it will dereference to the 2 but its underlying base iterator is pointing to the 3 - that's the erased element. So the iterators referring to it include your now-advanced reverse iterator. That's why it's invalidated and that's why you're seeing the error that you're seeing.
Reverse iterators are complicated.
Instead of incrementing it, you could reassign it to rbegin():
for (auto it = d.rbegin(); it != d.rend();) {
printf("it: %d\n", *it);
d.pop_back();
// 'it' and 'it+1' are both invalidated
it = d.rbegin();
}
Erasing from the underlying container invalidates an iterator. Quoting from the rules:
Iterators are not dereferenceable if
they are past-the-end iterators
(including pointers past the end of an array) or before-begin
iterators. Such iterators may be dereferenceable in a particular
implementation, but the library never assumes that they are.
they are singular iterators, that is, iterators that are not associated with any sequence. A null pointer, as well as a default-constructed pointer
(holding an indeterminate value) is singular
they were invalidated by
one of the iterator-invalidating operations on the sequence to which
they refer.
Your code causes the iterator to be invalidated by the pop_back operation, and thus as per the third point above, it becomes non dereferenceable.
In your while loop you're avoiding this problem by getting a (new) valid iterator in each loop repetition.

Iterator to last element of std::vector using end()--

I have a std::vector and I want the iterator to the last element in the vector; I will be storing this iterator for later use.
NOTE: I want an iterator reference to it, not std::vector::back. Because I want to be able to compute the index of this object from the std::vector::begin later on.
The following is my logic to get the iterator to the last element:
std::vector<int> container;
std::vector<int>::iterator it = container.end()--;
Since std::vector::end has O(1) time complexity, is there a better way to do this?
I think you mean either:
std::vector<int>::iterator it = --container.end();
std::vector<int>::iterator it = container.end() - 1;
std::vector<int>::iterator it = std::prev(container.end());
You're unintentionally just returning end(). But the problem with all of these is what happens when the vector is empty, otherwise they're all do the right thing in constant time. Though if the vector is empty, there's no last element anyway.
Also be careful when storing iterators - they can get invalidated.
Note that if vector<T>::iterator is just T* (which would be valid), the first form above is ill-formed. The second two work regardless, so are preferable.
You have rbegin that does what you need
cplusplus reference
auto last = container.rbegin();
The way you are doing it will give you the wrong iterator because post increment will not change the value until after the assignment.
There is always this:
auto it = std::prev(container.end());
Remember to check first that the container is not empty so your iterator exists in a valid range.

Is it safe to invoke std::map::erase with std::map::begin?

We (all) know, erasing an element, pointer by an iterator invalidates the iterator, for example:
std::map< .. > map_;
std::map< .. >::iterator iter;
// ..
map_.erase( iter ); // this will invalidate `iter`.
But, what about:
map_.erase( map_.begin() );
is this safe? Will map_.begin() be a valid iterator, pointing to the (new) first element of the map?
"test it" is not a solution.
begin() is not an iterator but returns an iterator. After erasing the first element, begin()returns another (valid) iterator.
std::map<int, int> m;
m[1] = 2;
m[2] = 3;
m.erase(m.begin()); // <- begin() points to 1:2
std::cout << m.begin()->second; // <- begin() points to 2:3 now
On cppreference, we see:
All iterators (pos, first, last) must be valid and dereferenceable,
that is, the end() iterator (which is valid, but is not
dereferencable) cannot be used.
That pretty much answers your question. As long as the iterator returned by begin() is valid and dereferencable, it is OK to be used in std::map::erase(). A good way then to check if begin() is OK to be used in std::map::erase is by checking it if it is not equal to end():
if(map.begin() != map.end()) {
map.erase(map.begin());
}
Alternatively, you could also check if the map is empty, and use std::map::erase if it isn't
if(!map.empty()) {
map.erase(map.begin());
}
Of course it is.
map::begin returns a valid iterator referring to the first element in the map container.
http://www.cplusplus.com/reference/map/map/begin/
Pay attention to empty map.
is this safe?
Yes. It invalidates the temporary iterator returned by this call to begin(), and that iterator is destroyed at the end of the statement.
Will map_.begin() be a valid iterator, pointing to the (new) first element of the map?
Yes, unless the map is now empty. Erasing an element does not prevent you from creating new iterators to the remaining elements; that would make the map unusable.

Erase by iterator on a C++ STL map

I'm curious about the rationale behind the following code. For a given map, I can delete a range up to, but not including, end() (obviously,) using the following code:
map<string, int> myMap;
myMap["one"] = 1;
myMap["two"] = 2;
myMap["three"] = 3;
map<string, int>::iterator it = myMap.find("two");
myMap.erase( it, myMap.end() );
This erases the last two items using the range. However, if I used the single iterator version of erase, I half expected passing myMap.end() to result in no action as the iterator was clearly at the end of the collection. This is as distinct from a corrupt or invalid iterator which would clearly lead to undefined behaviour.
However, when I do this:
myMap.erase( myMap.end() );
I simply get a segmentation fault. I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case. Is there some subtle reason for this that I'm missing? I noticed that even this works:
myMap.erase( myMap.end(), myMap.end() );
(i.e. does nothing)
The reason I ask is that I have some code which receives a valid iterator to the collection (but which could be end()) and I wanted to simply pass this into erase rather than having to check first like this:
if ( it != myMap.end() )
myMap.erase( it );
which seems a bit clunky to me. The alternative is to re code so I can use the by-key-type erase overload but I'd rather not re-write too much if I can help it.
The key is that in the standard library ranges determined by two iterators are half-opened ranges. In math notation [a,b) They include the first but not the last iterator (if both are the same, the range is empty). At the same time, end() returns an iterator that is one beyond the last element, which perfectly matches the half-open range notation.
When you use the range version of erase it will never try to delete the element referenced by the last iterator. Consider a modified example:
map<int,int> m;
for (int i = 0; i < 5; ++i)
m[i] = i;
m.erase( m.find(1), m.find(4) );
At the end of the execution the map will hold two keys 0 and 4. Note that the element referred by the second iterator was not erased from the container.
On the other hand, the single iterator operation will erase the element referenced by the iterator. If the code above was changed to:
for (int i = 1; i <= 4; ++i )
m.erase( m.find(i) );
The element with key 4 will be deleted. In your case you will attempt to delete the end iterator that does not refer to a valid object.
I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case.
No, it is not hard to do, but the function was designed with a different contract in mind: the caller must pass in an iterator into an element in the container. Part of the reason for this is that in C++ most of the features are designed so that the incur the minimum cost possible, allowing the user to balance the safety/performance on their side. The user can test the iterator before calling erase, but if that test was inside the library then the user would not be able to opt out of testing when she knows that the iterator is valid.
n3337 23.2.4 Table 102
a.erase( q1, q2)
erases all the elements in the range [q1,q2). Returns q2.
So, iterator returning from map::end() is not in range in case of myMap.erase(myMap.end(), myMap.end());
a.erase(q)
erases the element pointed to by q. Returns an iterator pointing to the element immediately following q prior to the element being erased. If no such element exists, returns a.end().
I wouldn't have thought it difficult for map to check whether the
iterator equalled end() and not take action in that case. Is there
some subtle reason for this that I'm missing?
Reason is same, that std::vector::operator[] can don't check, that index is in range, of course.
When you use two iterators to specify a range, the range consists of the elements from the element that the first iterator points to up to but not including the element that the second iterator points to. So erase(it, myMap.end()) says to erase everything from it up to but not including end(). You could equally well pass an iterator that points to a "real" element as the second one, and the element that that iterator points to would not be erased.
When you use erase(it) it says to erase the element that it points to. The end() iterator does not point to a valid element, so erase(end()) doesn't do anything sensible. It would be possible for the library to diagnose this situation, and a debugging library will do that, but it imposes a cost on every call to erase to check what the iterator points to. The standard library doesn't impose that cost on users. You're on your own. <g>

Does pop_back() really invalidate *all* iterators on an std::vector?

std::vector<int> ints;
// ... fill ints with random values
for(std::vector<int>::iterator it = ints.begin(); it != ints.end(); )
{
if(*it < 10)
{
*it = ints.back();
ints.pop_back();
continue;
}
it++;
}
This code is not working because when pop_back() is called, it is invalidated. But I don't find any doc talking about invalidation of iterators in std::vector::pop_back().
Do you have some links about that?
The call to pop_back() removes the last element in the vector and so the iterator to that element is invalidated. The pop_back() call does not invalidate iterators to items before the last element, only reallocation will do that. From Josuttis' "C++ Standard Library Reference":
Inserting or removing elements
invalidates references, pointers, and
iterators that refer to the following
element. If an insertion causes
reallocation, it invalidates all
references, iterators, and pointers.
Here is your answer, directly from The Holy Standard:
23.2.4.2 A vector satisfies all of the requirements of a container and of a reversible container (given in two tables in 23.1) and of a sequence, including most of the optional sequence requirements (23.1.1).
23.1.1.12 Table 68
expressiona.pop_back()
return typevoid
operational semanticsa.erase(--a.end())
containervector, list, deque
Notice that a.pop_back is equivalent to a.erase(--a.end()). Looking at vector's specifics on erase:
23.2.4.3.3 - iterator erase(iterator position) - effects - Invalidates all the iterators and references after the point of the erase
Therefore, once you call pop_back, any iterators to the previously final element (which now no longer exists) are invalidated.
Looking at your code, the problem is that when you remove the final element and the list becomes empty, you still increment it and walk off the end of the list.
(I use the numbering scheme as used in the C++0x working draft, obtainable here
Table 94 at page 732 says that pop_back (if it exists in a sequence container) has the following effect:
{ iterator tmp = a.end();
--tmp;
a.erase(tmp); }
23.1.1, point 12 states that:
Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container
member function or passing a container as an argument to a library function shall not invalidate iterators to, or change
the values of, objects within that container.
Both accessing end() as applying prefix-- have no such effect, erase() however:
23.2.6.4 (concerning vector.erase() point 4):
Effects: Invalidates iterators and references at or after the point of the erase.
So in conclusion: pop_back() will only invalidate an iterator to the last element, per the standard.
Here is a quote from SGI's STL documentation (http://www.sgi.com/tech/stl/Vector.html):
[5] A vector's iterators are invalidated when its memory is reallocated. Additionally, inserting or deleting an element in the middle of a vector invalidates all iterators that point to elements following the insertion or deletion point. It follows that you can prevent a vector's iterators from being invalidated if you use reserve() to preallocate as much memory as the vector will ever use, and if all insertions and deletions are at the vector's end.
I think it follows that pop_back only invalidates the iterator pointing at the last element and the end() iterator. We really need to see the data for which the code fails, as well as the manner in which it fails to decide what's going on. As far as I can tell, the code should work - the usual problem in such code is that removal of element and ++ on iterator happen in the same iteration, the way #mikhaild points out. However, in this code it's not the case: it++ does not happen when pop_back is called.
Something bad may still happen when it is pointing to the last element, and the last element is less than 10. We're now comparing an invalidated it and end(). It may still work, but no guarantees can be made.
Iterators are only invalidated on reallocation of storage. Google is your friend: see footnote 5.
Your code is not working for other reasons.
pop_back() invalidates only iterators that point to the last element. From C++ Standard Library Reference:
Inserting or removing elements
invalidates references, pointers, and
iterators that refer to the following
element. If an insertion causes
reallocation, it invalidates all
references, iterators, and pointers.
So to answer your question, no it does not invalidate all iterators.
However, in your code example, it can invalidate it when it is pointing to the last element and the value is below 10. In which case Visual Studio debug STL will mark iterator as invalidated, and further check for it not being equal to end() will show an assert.
If iterators are implemented as pure pointers (as they would in probably all non-debug STL vector cases), your code should just work. If iterators are more than pointers, then your code does not handle this case of removing the last element correctly.
Error is that when "it" points to the last element of vector and if this element is less than 10, this last element is removed. And now "it" points to ints.end(), next "it++" moves pointer to ints.end()+1, so now "it" running away from ints.end(), and you got infinite loop scanning all your memory :).
The "official specification" is the C++ Standard. If you don't have access to a copy of C++03, you can get the latest draft of C++0x from the Committee's website: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2723.pdf
The "Operational Semantics" section of container requirements specifies that pop_back() is equivalent to { iterator i = end(); --i; erase(i); }. the [vector.modifiers] section for erase says "Effects: Invalidates iterators and references at or after the point of the erase."
If you want the intuition argument, pop_back is no-fail (since destruction of value_types in standard containers are not allowed to throw exceptions), so it cannot do any copy or allocation (since they can throw), which means that you can guess that the iterator to the erased element and the end iterator are invalidated, but the remainder are not.
pop_back() will only invalidate it if it was pointing to the last item in the vector. Your code will therefore fail whenever the last int in the vector is less than 10, as follows:
*it = ints.back(); // Set *it to the value it already has
ints.pop_back(); // Invalidate the iterator
continue; // Loop round and access the invalid iterator
You might want to consider using the return value of erase instead of swapping the back element to the deleted position an popping back. For sequences erase returns an iterator pointing the the element one beyond the element being deleted. Note that this method may cause more copying than your original algorithm.
for(std::vector<int>::iterator it = ints.begin(); it != ints.end(); )
{
if(*it < 10)
it = ints.erase( it );
else
++it;
}
std::remove_if could also be an alternative solution.
struct LessThanTen { bool operator()( int n ) { return n < 10; } };
ints.erase( std::remove_if( ints.begin(), ints.end(), LessThanTen() ), ints.end() );
std::remove_if is (like my first algorithm) stable, so it may not be the most efficient way of doing this, but it is succinct.
Check out the information here (cplusplus.com):
Delete last element
Removes the last element in the vector, effectively reducing the vector size by one and invalidating all iterators and references to it.