vector::erase and reverse_iterator - c++

I have a collection of elements in a std::vector that are sorted in a descending order starting from the first element. I have to use a vector because I need to have the elements in a contiguous chunk of memory. And I have a collection holding many instances of vectors with the described characteristics (always sorted in a descending order).
Now, sometimes, when I find out that I have too many elements in the greater collection (the one that holds these vectors), I discard the smallest elements from these vectors some way similar to this pseudo-code:
grand_collection: collection that holds these vectors
T: type argument of my vector
C: the type that is a member of T, that participates in the < comparison (this is what sorts data before they hit any of the vectors).
std::map<C, std::pair<T::const_reverse_iterator, std::vector<T>&>> what_to_delete;
iterate(it = grand_collection.begin() -> grand_collection.end())
{
iterate(vect_rit = it->rbegin() -> it->rend())
{
// ...
what_to_delete <- (vect_rit->C, pair(vect_rit, *it))
if (what_to_delete.size() > threshold)
what_to_delete.erase(what_to_delete.begin());
// ...
}
}
Now, after running this code, in what_to_delete I have a collection of iterators pointing to the original vectors that I want to remove from these vectors (overall smallest values). Remember, the original vectors are sorted before they hit this code, which means that for any what_to_delete[0 - n] there is no way that an iterator on position n - m would point to an element further from the beginning of the same vector than n, where m > 0.
When erasing elements from the original vectors, I have to convert a reverse_iterator to iterator. To do this, I rely on C++11's §24.4.1/1:
The relationship between reverse_iterator and iterator is
&*(reverse_iterator(i)) == &*(i- 1)
Which means that to delete a vect_rit, I use:
vector.erase(--vect_rit.base());
Now, according to C++11 standard §23.3.6.5/3:
iterator erase(const_iterator position); Effects: Invalidates
iterators and references at or after the point of the erase.
How does this work with reverse_iterators? Are reverse_iterators internally implemented with a reference to a vector's real beginning (vector[0]) and transforming that vect_rit to a classic iterator so then erasing would be safe? Or does reverse_iterator use rbegin() (which is vector[vector.size()]) as a reference point and deleting anything that is further from vector's 0-index would still invalidate my reverse iterator?
Edit:
Looks like reverse_iterator uses rbegin() as its reference point. Erasing elements the way I described was giving me errors about a non-deferenceable iterator after the first element was deleted. Whereas when storing classic iterators (converting to const_iterator) while inserting to what_to_delete worked correctly.
Now, for future reference, does The Standard specify what should be treated as a reference point in case of a random-access reverse_iterator? Or this is an implementation detail?
Thanks!

In the question you have already quoted exactly what the standard says a reverse_iterator is:
The relationship between reverse_iterator and iterator is &*(reverse_iterator(i)) == &*(i- 1)
Remember that a reverse_iterator is just an 'adaptor' on top of the underlying iterator (reverse_iterator::current). The 'reference point', as you put it, for a reverse_iterator is that wrapped iterator, current. All operations on the reverse_iterator really occur on that underlying iterator. You can obtain that iterator using the reverse_iterator::base() function.
If you erase --vect_rit.base(), you are in effect erasing --current, so current will be invalidated.
As a side note, the expression --vect_rit.base() might not always compile. If the iterator is actually just a raw pointer (as might be the case for a vector), then vect_rit.base() returns an rvalue (a prvalue in C++11 terms), so the pre-decrement operator won't work on it since that operator needs a modifiable lvalue. See "Item 28: Understand how to use a reverse_iterator's base iterator" in "Effective STL" by Scott Meyers. (an early version of the item can be found online in "Guideline 3" of http://www.drdobbs.com/three-guidelines-for-effective-iterator/184401406).
You can use the even uglier expression, (++vect_rit).base(), to avoid that problem. Or since you're dealing with a vector and random access iterators: vect_rit.base() - 1
Either way, vect_rit is invalidated by the erase because vect_rit.current is invalidated.
However, remember that vector::erase() returns a valid iterator to the new location of the element that followed the one that was just erased. You can use that to 're-synchronize' vect_rit:
vect_rit = vector_type::reverse_iterator( vector.erase(vect_rit.base() - 1));

From a standardese point of view (and I'll admit, I'm not an expert on the standard): From §24.5.1.1:
namespace std {
template <class Iterator>
class reverse_iterator ...
{
...
Iterator base() const; // explicit
...
protected:
Iterator current;
...
};
}
And from §24.5.1.3.3:
Iterator base() const; // explicit
Returns: current.
Thus it seems to me that so long as you don't erase anything in the vector before what one of your reverse_iterators points to, said reverse_iterator should remain valid.
Of course, given your description, there is one catch: if you have two contiguous elements in your vector that you end up wanting to delete, the fact that you vector.erase(--vector_rit.base()) means that you've invalidated the reverse_iterator "pointing" to the immediately preceeding element, and so your next vector.erase(...) is undefined behavior.
Just in case that's clear as mud, let me say that differently:
std::vector<T> v=...;
...
// it_1 and it_2 are contiguous
std::vector<T>::reverse_iterator it_1=v.rend();
std::vector<T>::reverse_iterator it_2=it_1;
--it_2;
// Erase everything after it_1's pointee:
// convert from reverse_iterator to iterator
std::vector<T>::iterator tmp_it=it_1.base();
// but that points one too far in, so decrement;
--tmp_it;
// of course, now tmp_it points at it_2's base:
assert(tmp_it == it_2.base());
// perform erasure
v.erase(tmp_it); // invalidates all iterators pointing at or past *tmp_it
// (like, say it_2.base()...)
// now delete it_2's pointee:
std::vector<T>::iterator tmp_it_2=it_2.base(); // note, invalid iterator!
// undefined behavior:
--tmp_it_2;
v.erase(tmp_it_2);
In practice, I suspect that you'll run into two possible implementations: more commonly, the underlying iterator will be little more than a (suitably wrapped) raw pointer, and so everything will work perfectly happily. Less commonly, the iterator might actually try to track invalidations/perform bounds checking (didn't Dinkumware STL do such things when compiled in debug mode at one point?), and just might yell at you.

The reverse_iterator, just like the normal iterator, points to a certain position in the vector. Implementation details are irrelevant, but if you must know, they both are (in a typical implementation) just plain old pointers inside. The difference is the direction. The reverse iterator has its + and - reversed w.r.t. the regular iterator (and also ++ and --, > and < etc).
This is interesting to know, but doesn't really imply an answer to the main question.
If you read the language carefully, it says:
Invalidates iterators and references at or after the point of the erase.
References do not have a built-in sense of direction. Hence, the language clearly refers to the container's own sense of direction. Positions after the point of the erase are those with higher indices. Hence, the iterator's direction is irrelevant here.

Related

C++ formal requirement of behaviour of iterators over a container

I'm sure that I'm not alone in expecting that I could add several elements in some order to a vector or list, and then could use an iterator to retrieve those elements in the same order. For example, in:
#include <vector>
#include <cassert>
int main(int argc, char **argv)
{
using namespace std;
vector<int> v;
v.push_back(4);
v.push_back(10);
v.push_back(100);
auto i = v.begin();
assert(*i++ == 4);
assert(*i++ == 10);
assert(*i++ == 100);
return 0;
}
... all assertions should pass and the program should terminate normally (assuming that no std::bad_alloc exception is thrown during construction of the vector or adding the elements to it).
However, I'm having trouble reconciling this with any requirement in the C++ standard (I'm looking at C++11, but would like answers for other standards also if they are markedly different).
The requirement for begin() is just (23.2.1 para 6):
begin() returns an iterator referring to the first element in the container.
What I'm looking for is the requirement, or combination of requirements that in turn logically requires, that if i = v.begin(), then ++i shall refer to the second element in the vector (assuming that such an element exists) - or indeed, even the requirement that successive increments of an iterator will return each of the elements in the vector.
Edit:
A more general question is, what (if any) text in the standard requires that successfully incrementing an iterator obtained by calling begin() on a sequence (ordered or unordered) actually visits every element of the sequence?
There's isn't in the standard something straightforward to state that
if i = v.begin(), then ++i shall refer to the second element in the
vector.
However, for vector's iterators why can imply it from the following wording in the draft standard N4527 24.2.1/p5 In general [iterator.requirements.general]:
Iterators that further satisfy the requirement that, for integral
values n and dereferenceable iterator values a and (a + n), *(a + n) is equivalent to *(addressof(*a) + n), are called contiguous
iterators.
Now, std::vector's iterator satisfy this requirement, consequently we can imply that ++i is equivalent to i + 1 and thus to addressof(*i) + 1. Which indeed is the second element in the vector due to its contiguous nature.
Edit:
There was indeed a turbidness on the matter about random access iterators and contiguous storage containers in C++11 and C++14 standards. Thus, the commity decided to refine them by putting an extra group of iterators named contiguous iterators. You can find more info in the relative proposal N3884.
It looks to me like we need to put two separate parts of the standard together to get a solid requirement here. We can start with table 101, which requires that a[n] be equivalent to *(a.begin() + n) for sequence containers (specifically, basic_string, array, deqeue and vector) (and the same requirement for a.at(n), for the same containers).
Then we look at table 111 in [random.access.iterators], where it requires that the expression r += n be equivalent to:
{
difference_type m = n;
if (m >= 0)
while (m--)
++r;
else
while (m++)
--r;
return r;
}
[indentation added]
Between the two, these imply that for any n, *(begin() + n) refers to the nth item in the vector. Just in case you want to cover the last base I see open, let's cover the requirement that push_back actually append to the collection. That's also in table 101: a.push_back(t) "Appends a copy of t" (again for basic_string, string, deque, list, and vector).
[C++14: 23.2.3/1]: A sequence container organizes a finite set of objects, all of the same type, into a strictly linear arrangement. [..]
I don't know how else you'd interpret it.
The specification isn't just in the iterators. It is also in the specification of the containers, and the operations that modify those containers.
The thing is, you are not going to find a single clause that says "incrementing begin() repeatedly will access all elements of a vector in order". You need to look at the specification of every operation on every container (since these define an order of elements in the container) and the specification of iterators (and operations on them) which is essentially that "incrementing moves to the next element in the order that operations on the container defined, until we pass the end". It is the combination of numerous clauses in the standard that give the end effect.
The general concepts, however, are ....
All containers maintain some range of zero or more elements. That range has three key properties: a beginning (corresponding to the first element in an order that is meaningful to the container), and an end (corresponding to the last element), and an order (which determines the sequence in which elements will be retrieved one after the other - i.e. defines the meaning of "next").
An iterator is an object that either references an element in a range, or has a "past the end" value. An iterator that references an element in the range other than the end (last), when incremented, will reference the next element. An iterator that references the end (last) element in the range, when incremented, will be an end (past the end) iterator.
The begin() method returns an iterator that references (or points to) the first in the range (or an end iterator if the range has zero elements). The end() method returns an end iterator - one that corresponds to "one past the the end of the range". That means, if an iterator is initialised using the begin(), incrementing it repeatedly will move sequentially through the range until the end iterator is reached.
Then it is necessary to look at the specification for the various modifiers of the container - the member functions that add or remove elements. For example, push_back() is specified as adding an element to the end of the existing range for that container. It extends the range by adding an element to the end.
It is that combination of specifications - of iterators and of operations that modify containers - that guarantees the order. The net effect is that, if elements are added to a container in some order, then a iterator initialised using begin() will - when incremented repeatedly - reference the elements in the order in which they were placed in the container.
Obviously, some container modifiers are a bit more complicated - for example, std::vector's insert() is given an iterator, and adds elements there, shuffling subsequent elements to make room. However, the key point is that the modifiers place elements into the container in a defined order (or remove, in the case of operations like std::vector::erase()) and iterators will access elements in that defined order.

Why is "!=" used with iterators instead of "<"?

I'm used to writing loops like this:
for (std::size_t index = 0; index < foo.size(); index++)
{
// Do stuff with foo[index].
}
But when I see iterator loops in others' code, they look like this:
for (Foo::Iterator iterator = foo.begin(); iterator != foo.end(); iterator++)
{
// Do stuff with *Iterator.
}
I find the iterator != foo.end() to be offputting. It can also be dangerous if iterator is incremented by more than one.
It seems more "correct" to use iterator < foo.end(), but I never see that in real code. Why not?
All iterators are equality comparable. Only random access iterators are relationally comparable. Input iterators, forward iterators, and bidirectional iterators are not relationally comparable.
Thus, the comparison using != is more generic and flexible than the comparison using <.
There are different categories of iterators because not all ranges of elements have the same access properties. For example,
if you have an iterators into an array (a contiguous sequence of elements), it's trivial to relationally compare them; you just have to compare the indices of the pointed to elements (or the pointers to them, since the iterators likely just contain pointers to the elements);
if you have iterators into a linked list and you want to test whether one iterator is "less than" another iterator, you have to walk the nodes of the linked list from the one iterator until either you reach the other iterator or you reach the end of the list.
The rule is that all operations on an iterator should have constant time complexity (or, at a minimum, sublinear time complexity). You can always perform an equality comparison in constant time since you just have to compare whether the iterators point to the same object. So, all iterators are equality comparable.
Further, you aren't allowed to increment an iterator past the end of the range into which it points. So, if you end up in a scenario where it != foo.end() does not do the same thing as it < foo.end(), you already have undefined behavior because you've iterated past the end of the range.
The same is true for pointers into an array: you aren't allowed to increment a pointer beyond one-past-the-end of the array; a program that does so exhibits undefined behavior. (The same is obviously not true for indices, since indices are just integers.)
Some Standard Library implementations (like the Visual C++ Standard Library implementation) have helpful debug code that will raise an assertion when you do something illegal with an iterator like this.
Short answer: Because Iterator is not a number, it's an object.
Longer answer: There are more collections than linear arrays. Trees and hashes, for example, don't really lend themselves to "this index is before this other index". For a tree, two indices that live on separate branches, for example. Or, any two indices in a hash -- they have no order at all, so any order you impose on them is arbitrary.
You don't have to worry about "missing" End(). It is also not a number, it is an object that represents the end of the collection. It doesn't make sense to have an iterator that goes past it, and indeed it cannot.

Does std::vector::swap invalidate iterators?

If I swap two vectors, will their iterators remain valid, now just pointing to the "other" container, or will the iterator be invalidated?
That is, given:
using namespace std;
vector<int> x(42, 42);
vector<int> y;
vector<int>::iterator a = x.begin();
vector<int>::iterator b = x.end();
x.swap(y);
// a and b still valid? Pointing to x or y?
It seems the std mentions nothing about this:
[n3092 - 23.3.6.2]
void swap(vector<T,Allocator>& x);
Effects:
Exchanges the contents and capacity()
of *this with that of x.
Note that since I'm on VS 2005 I'm also interested in the effects of iterator debug checks etc. (_SECURE_SCL)
The behavior of swap has been clarified considerably in C++11, in large part to permit the Standard Library algorithms to use argument dependent lookup (ADL) to find swap functions for user-defined types. C++11 adds a swappable concept (C++11 §17.6.3.2[swappable.requirements]) to make this legal (and required).
The text in the C++11 language standard that addresses your question is the following text from the container requirements (§23.2.1[container.requirements.general]/8), which defines the behavior of the swap member function of a container:
Every iterator referring to an element in one container before the swap shall refer to the same element in the other container after the swap.
It is unspecified whether an iterator with value a.end() before the swap will have value b.end() after the swap.
In your example, a is guaranteed to be valid after the swap, but b is not because it is an end iterator. The reason end iterators are not guaranteed to be valid is explained in a note at §23.2.1/10:
[Note: the end() iterator does not refer to any element, so it may be
invalidated. --end note]
This is the same behavior that is defined in C++03, just substantially clarified. The original language from C++03 is at C++03 §23.1/10:
no swap() function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped.
It's not immediately obvious in the original text, but the phrase "to the elements of the containers" is extremely important, because end() iterators do not point to elements.
Swapping two vectors does not invalidate the iterators, pointers, and references to its elements (C++03, 23.1.11).
Typically the iterator would contain knowledge of its container, and the swap operation maintains this for a given iterator.
In VC++ 10 the vector container is managed using this structure in <xutility>, for example:
struct _Container_proxy
{ // store head of iterator chain and back pointer
_Container_proxy()
: _Mycont(0), _Myfirstiter(0)
{ // construct from pointers
}
const _Container_base12 *_Mycont;
_Iterator_base12 *_Myfirstiter;
};
All iterators that refer to the elements of the containers remain valid
As for Visual Studio 2005, I have just tested it.
I think it should always work, as the vector::swap function even contains an explicit step to swap everything:
// vector-header
void swap(_Myt& _Right)
{ // exchange contents with _Right
if (this->_Alval == _Right._Alval)
{ // same allocator, swap control information
#if _HAS_ITERATOR_DEBUGGING
this->_Swap_all(_Right);
#endif /* _HAS_ITERATOR_DEBUGGING */
...
The iterators point to their original elements in the now-swapped vector object. (I.e. w/rg to the OP, they first pointed to elements in x, after the swap they point to elements in y.)
Note that in the n3092 draft the requirement is laid out in §23.2.1/9 :
Every iterator referring to an
element in one container before the
swap shall refer to the same element
in the other container after the swap.

Checking if an iterator is valid

Is there any way to check if an iterator (whether it is from a vector, a list, a deque...) is (still) dereferenceable, i.e. has not been invalidated?
I have been using try-catch, but is there a more direct way to do this?
Example: (which doesn't work)
list<int> l;
for (i = 1; i<10; i++) {
l.push_back(i * 10);
}
itd = l.begin();
itd++;
if (something) {
l.erase(itd);
}
/* now, in other place.. check if it points to somewhere meaningful */
if (itd != l.end())
{
// blablabla
}
I assume you mean "is an iterator valid," that it hasn't been invalidated due to changes to the container (e.g., inserting/erasing to/from a vector). In that case, no, you cannot determine if an iterator is (safely) dereferencable.
As jdehaan said, if the iterator wasn't invalidated and points into a container, you can check by comparing it to container.end().
Note, however, that if the iterator is singular -- because it wasn't initialized or it became invalid after a mutating operation on the container (vector's iterators are invalidated when you increase the vector's capacity, for example) -- the only operation that you are allowed to perform on it is assignment. In other words, you can't check whether an iterator is singular or not.
std::vector<int>::iterator iter = vec.begin();
vec.resize(vec.capacity() + 1);
// iter is now singular, you may only perform assignment on it,
// there is no way in general to determine whether it is singular or not
Non-portable answer: Yes - in Visual Studio
Visual Studio's STL iterators have a "debugging" mode which do exactly this. You wouldn't want to enable this in ship builds (there is overhead) but useful in checked builds.
Read about it on VC10 here (this system can and in fact does change every release, so find the docs specific to your version).
Edit Also, I should add: debug iterators in visual studio are designed to immediately explode when you use them (instead undefined behavior); not to allow "querying" of their state.
Usually you test it by checking if it is different from the end(), like
if (it != container.end())
{
// then dereference
}
Moreover using exception handling for replacing logic is bad in terms of design and performance. Your question is very good and it is definitively worth a replacement in your code. Exception handling like the names says shall only be used for rare unexpected issues.
Is there any way to check if a iterator (whether it is from a vector, a list, a deque...) is (still) dereferencable, i.e has not been invalidated ?
No, there isn't. Instead you need to control access to the container while your iterator exists, for example:
Your thread should not modify the container (invalidating the iterator) while it is still using an instantiated iterator for that container
If there's a risk that other threads might modify the container while your thread is iterating, then in order to make this scenario thread-safe your thread must acquire some kind of lock on the container (so that it prevents other threads from modifying the container while it's using an iterator)
Work-arounds like catching an exception won't work.
This is a specific instance of the more general problem, "can I test/detect whether a pointer is valid?", the answer to which is typically "no, you can't test for it: instead you have to manage all memory allocations and deletions in order to know whether any given pointer is still valid".
Trying and catching is not safe, you will not, or at least seldom throw if your iterator is "out of bounds".
what alemjerus say, an iterator can always be dereferenced. No matter what uglyness lies beneath. It is quite possible to iterate into other areas of memory and write to other areas that might keep other objects. I have been looking at code, watching variables change for no particular reason. That is a bug that is really hard to detect.
Also it is wise to remember that inserting and removing elements might potentially invalidate all references, pointers and iterators.
My best advice would be to keep you iterators under control, and always keep an "end" iterator at hand to be able to test if you are at the "end of the line" so to speak.
In some of the STL containers, the current iterator becomes invalid when you erase the current value of the iterator. This happens because the erase operation changes the internal memory structure of the container and increment operator on existing iterator points to an undefined locations.
When you do the following, iterator is incementented before it is passed to erase function.
if (something) l.erase(itd++);
Is there any way to check if an iterator is dereferencable
Yes, with gcc debugging containers available as GNU extensions. For std::list you can use __gnu_debug::list instead. The following code will abort as soon as invalid iterator is attempted to be used. As debugging containers impose extra overhead they are intended only when debugging.
#include <debug/list>
int main() {
__gnu_debug::list<int> l;
for (int i = 1; i < 10; i++) {
l.push_back(i * 10);
}
auto itd = l.begin();
itd++;
l.erase(itd);
/* now, in other place.. check if itd points to somewhere meaningful */
if (itd != l.end()) {
// blablabla
}
}
$ ./a.out
/usr/include/c++/7/debug/safe_iterator.h:552:
Error: attempt to compare a singular iterator to a past-the-end iterator.
Objects involved in the operation:
iterator "lhs" # 0x0x7ffda4c57fc0 {
type = __gnu_debug::_Safe_iterator<std::_List_iterator<int>, std::__debug::list<int, std::allocator<int> > > (mutable iterator);
state = singular;
references sequence with type 'std::__debug::list<int, std::allocator<int> >' # 0x0x7ffda4c57ff0
}
iterator "rhs" # 0x0x7ffda4c580c0 {
type = __gnu_debug::_Safe_iterator<std::_List_iterator<int>, std::__debug::list<int, std::allocator<int> > > (mutable iterator);
state = past-the-end;
references sequence with type 'std::__debug::list<int, std::allocator<int> >' # 0x0x7ffda4c57ff0
}
Aborted (core dumped)
The type of the parameters of the erase function of any std container (as you have listed in your question, i.e. whether it is from a vector, a list, a deque...) is always iterator of this container only.
This function uses the first given iterator to exclude from the container the element that this iterator points at and even those that follow. Some containers erase only one element for one iterator, and some other containers erase all elements followed by one iterator (including the element pointed by this iterator) to the end of the container. If the erase function receives two iterators, then the two elements, pointed by each iterator, are erased from the container and all the rest between them are erased from the container as well, but the point is that every iterator that is passed to the erase function of any std container becomes invalid! Also:
Each iterator that was pointing at some element that has been erased from the container becomes invalid, but it doesn't pass the end of the container!
This means that an iterator that was pointing at some element that has been erased from the container cannot be compared to container.end().
This iterator is invalid, and so it is not dereferencable, i.e. you cannot use neither the * nor -> operators, it is also not incrementable, i.e. you cannot use the ++ operator, and it is also not decrementable, i.e. you cannot use the -- operator.
It is also not comparable!!! I.E. you cannot even use neither == nor != operators
Actually you cannot use any operator that is declared and defined in the std iterator.
You cannot do anything with this iterator, like null pointer.
Doing something with an invalid iterator immediately stops the program and even causes the program to crash and an assertion dialog window appears. There is no way to continue program no matter what options you choose, what buttons you click. You just can terminate the program and the process by clicking the Abort button.
You don't do anything else with an invalid iterator, unless you can either set it to the begin of the container, or just ignore it.
But before you decide what to do with an iterator, first you must know if this iterator is either invalid or not, if you call the erase function of the container you are using.
I have made by myself a function that checks, tests, knows and returns true whether a given iterator is either invalid or not. You can use the memcpy function to get the state of any object, item, structure, class and etc, and of course we always use the memset function at first to either clear or empty a new buffer, structure, class or any object or item:
bool IsNull(list<int>::iterator& i) //In your example, you have used list<int>, but if your container is not list, then you have to change this parameter to the type of the container you are using, if it is either a vector or deque, and also the type of the element inside the container if necessary.
{
byte buffer[sizeof(i)];
memset(buffer, 0, sizeof(i));
memcpy(buffer, &i, sizeof(i));
return *buffer == 0; //I found that the size of any iterator is 12 bytes long. I also found that if the first byte of the iterator that I copy to the buffer is zero, then the iterator is invalid. Otherwise it is valid. I like to call invalid iterators also as "null iterators".
}
I have already tested this function before I posted it there and found that this function is working for me.
I very hope that I have fully answered your question and also helped you very much!
There is a way, but is ugly... you can use the std::distance function
#include <algorithms>
using namespace std
auto distance_to_iter = distance(container.begin(), your_iter);
auto distance_to_end = distance(container.begin(),container.end());
bool is_your_iter_still_valid = distance_to_iter != distance_to_end;
use erase with increment :
if (something) l.erase(itd++);
so you can test the validity of the iterator.
if (iterator != container.end()) {
iterator is dereferencable !
}
If your iterator doesnt equal container.end(), and is not dereferencable, youre doing something wrong.

Does pop_back() really invalidate *all* iterators on an std::vector?

std::vector<int> ints;
// ... fill ints with random values
for(std::vector<int>::iterator it = ints.begin(); it != ints.end(); )
{
if(*it < 10)
{
*it = ints.back();
ints.pop_back();
continue;
}
it++;
}
This code is not working because when pop_back() is called, it is invalidated. But I don't find any doc talking about invalidation of iterators in std::vector::pop_back().
Do you have some links about that?
The call to pop_back() removes the last element in the vector and so the iterator to that element is invalidated. The pop_back() call does not invalidate iterators to items before the last element, only reallocation will do that. From Josuttis' "C++ Standard Library Reference":
Inserting or removing elements
invalidates references, pointers, and
iterators that refer to the following
element. If an insertion causes
reallocation, it invalidates all
references, iterators, and pointers.
Here is your answer, directly from The Holy Standard:
23.2.4.2 A vector satisfies all of the requirements of a container and of a reversible container (given in two tables in 23.1) and of a sequence, including most of the optional sequence requirements (23.1.1).
23.1.1.12 Table 68
expressiona.pop_back()
return typevoid
operational semanticsa.erase(--a.end())
containervector, list, deque
Notice that a.pop_back is equivalent to a.erase(--a.end()). Looking at vector's specifics on erase:
23.2.4.3.3 - iterator erase(iterator position) - effects - Invalidates all the iterators and references after the point of the erase
Therefore, once you call pop_back, any iterators to the previously final element (which now no longer exists) are invalidated.
Looking at your code, the problem is that when you remove the final element and the list becomes empty, you still increment it and walk off the end of the list.
(I use the numbering scheme as used in the C++0x working draft, obtainable here
Table 94 at page 732 says that pop_back (if it exists in a sequence container) has the following effect:
{ iterator tmp = a.end();
--tmp;
a.erase(tmp); }
23.1.1, point 12 states that:
Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container
member function or passing a container as an argument to a library function shall not invalidate iterators to, or change
the values of, objects within that container.
Both accessing end() as applying prefix-- have no such effect, erase() however:
23.2.6.4 (concerning vector.erase() point 4):
Effects: Invalidates iterators and references at or after the point of the erase.
So in conclusion: pop_back() will only invalidate an iterator to the last element, per the standard.
Here is a quote from SGI's STL documentation (http://www.sgi.com/tech/stl/Vector.html):
[5] A vector's iterators are invalidated when its memory is reallocated. Additionally, inserting or deleting an element in the middle of a vector invalidates all iterators that point to elements following the insertion or deletion point. It follows that you can prevent a vector's iterators from being invalidated if you use reserve() to preallocate as much memory as the vector will ever use, and if all insertions and deletions are at the vector's end.
I think it follows that pop_back only invalidates the iterator pointing at the last element and the end() iterator. We really need to see the data for which the code fails, as well as the manner in which it fails to decide what's going on. As far as I can tell, the code should work - the usual problem in such code is that removal of element and ++ on iterator happen in the same iteration, the way #mikhaild points out. However, in this code it's not the case: it++ does not happen when pop_back is called.
Something bad may still happen when it is pointing to the last element, and the last element is less than 10. We're now comparing an invalidated it and end(). It may still work, but no guarantees can be made.
Iterators are only invalidated on reallocation of storage. Google is your friend: see footnote 5.
Your code is not working for other reasons.
pop_back() invalidates only iterators that point to the last element. From C++ Standard Library Reference:
Inserting or removing elements
invalidates references, pointers, and
iterators that refer to the following
element. If an insertion causes
reallocation, it invalidates all
references, iterators, and pointers.
So to answer your question, no it does not invalidate all iterators.
However, in your code example, it can invalidate it when it is pointing to the last element and the value is below 10. In which case Visual Studio debug STL will mark iterator as invalidated, and further check for it not being equal to end() will show an assert.
If iterators are implemented as pure pointers (as they would in probably all non-debug STL vector cases), your code should just work. If iterators are more than pointers, then your code does not handle this case of removing the last element correctly.
Error is that when "it" points to the last element of vector and if this element is less than 10, this last element is removed. And now "it" points to ints.end(), next "it++" moves pointer to ints.end()+1, so now "it" running away from ints.end(), and you got infinite loop scanning all your memory :).
The "official specification" is the C++ Standard. If you don't have access to a copy of C++03, you can get the latest draft of C++0x from the Committee's website: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2723.pdf
The "Operational Semantics" section of container requirements specifies that pop_back() is equivalent to { iterator i = end(); --i; erase(i); }. the [vector.modifiers] section for erase says "Effects: Invalidates iterators and references at or after the point of the erase."
If you want the intuition argument, pop_back is no-fail (since destruction of value_types in standard containers are not allowed to throw exceptions), so it cannot do any copy or allocation (since they can throw), which means that you can guess that the iterator to the erased element and the end iterator are invalidated, but the remainder are not.
pop_back() will only invalidate it if it was pointing to the last item in the vector. Your code will therefore fail whenever the last int in the vector is less than 10, as follows:
*it = ints.back(); // Set *it to the value it already has
ints.pop_back(); // Invalidate the iterator
continue; // Loop round and access the invalid iterator
You might want to consider using the return value of erase instead of swapping the back element to the deleted position an popping back. For sequences erase returns an iterator pointing the the element one beyond the element being deleted. Note that this method may cause more copying than your original algorithm.
for(std::vector<int>::iterator it = ints.begin(); it != ints.end(); )
{
if(*it < 10)
it = ints.erase( it );
else
++it;
}
std::remove_if could also be an alternative solution.
struct LessThanTen { bool operator()( int n ) { return n < 10; } };
ints.erase( std::remove_if( ints.begin(), ints.end(), LessThanTen() ), ints.end() );
std::remove_if is (like my first algorithm) stable, so it may not be the most efficient way of doing this, but it is succinct.
Check out the information here (cplusplus.com):
Delete last element
Removes the last element in the vector, effectively reducing the vector size by one and invalidating all iterators and references to it.