Is end() required to be constant in an STL map/set? - c++

§23.1.2.8 in the standard states that insertion/deletion operations on a set/map will not invalidate any iterators to those objects (except iterators pointing to a deleted element).
Now, consider the following situation: you want to implement a graph with uniquely numbered nodes, where every node has a fixed number (let's say 4) of neighbors. Taking advantage of the above rule, you do it like this:
class Node {
private:
// iterators to neighboring nodes
std::map<int, Node>::iterator neighbors[4];
friend class Graph;
};
class Graph {
private:
std::map<int, Node> nodes;
};
(EDIT: Not literally like this due to the incompleteness of Node in line 4 (see responses/comments), but along these lines anyway)
This is good, because this way you can insert and delete nodes without invalidating the consistency of the structure (assuming you keep track of deletions and remove the deleted iterator from every node's array).
But let's say you also want to be able to store an "invalid" or "nonexistent" neighbor value. Not to worry, we can just use nodes.end()... or can we? Is there some sort of guarantee that nodes.end() at 8 AM will be the same as nodes.end() at 10 PM after a zillion insertions/deletions? That is, can I safely == compare an iterator received as a parameter to nodes.end() in some method of Graph?
And if not, would this work?
class Graph {
private:
std::map<int, Node> nodes;
std::map<int, Node>::iterator _INVALID;
public:
Graph() { _INVALID = nodes.end(); }
};
That is, can I store nodes.end() in a variable upon construction, and then use this variable whenever I want to set a neighbor to invalid state, or to compare it against a parameter in a method? Or is it possible that somewhere down the line a valid iterator pointing to an existing object will compare equal to _INVALID?
And if this doesn't work either, what can I do to leave room for an invalid neighbor value?

You write (emphasis by me):
§23.1.2.8 in the standard states that insertion/deletion operations on a set/map will not invalidate any iterators to those objects (except iterators pointing to a deleted element).
Actually, the text of 23.1.2/8 is a bit different (again, emphasis by me):
The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.
I read this as: If you have a map, and somehow obtain an iterator into this map (again: it doesn't say to an object in the map), this iterator will stay valid despite insertion and removal of elements. Assuming std::map<K,V>::end() obtains an "iterator into the map", it should not be invalidated by insertion/removal.
This, of course, leaves the question whether "not invalidated" means it will always have the same value. My personal assumption is that this is not specified. However, in order for the "not invalidated" phrase to make sense, all results of std::map<K,V>::end() for the same map must always compare equal even in the face of insertions/removal:
my_map_t::iterator old_end = my_map.end();
// wildly change my_map
assert( old_end == my_map.end() );
My interpretation is that, if old_end remains "valid" throughout changes to the map (as the standard promisses), then that assertion should pass.
Disclaimer: I am not a native speaker and have a very hard time digesting that dreaded legaleze of the Holy PDF. In fact, in general I avoid it like the plague.
Oh, and my first thought also was: The question is interesting from an academic POV, but why doesn't he simply store keys instead of iterators?

23.1/7 says that end() returns an iterator that
is the past-the-end value for the container.
First, it confirms that what end() returns is the iterator. Second, it says that the iterator doesn't point to a particular element. Since deletion can only invalidate iterators that point somewhere (to the element being deleted), deletions can't invalidate end().

Well, there's nothing preventing particular collection implementation from having end() depend on the instance of collection and time of day, as long as comparisons and such work. Which means, that, perhaps, end() value may change, but old_end == end() comparison should still yield true. (edit: although after reading the comment from j_random_hacker, I doubt this paragraph itself evaluates to true ;-), not universally — see the discussion below )
I also doubt you can use std::map<int,Node>::iterator in the Node class due to the type being incomplete, yet (not sure, though).
Also, since your nodes are uniquely numbered, you can use int for keying them and reserve some value for invalid.

Iterators in (multi)sets and (multi)maps won't be invalidated in insertions and deletions and thus comparing .end() against previous stored values of .end() will always yield true.
Take as an example GNU libstdc++ implementation where .end() in maps returns the default intialized value of Rb_tree_node
From stl_tree.h:
_M_initialize()
{
this->_M_header._M_color = _S_red;
this->_M_header._M_parent = 0;
this->_M_header._M_left = &this->_M_header;
this->_M_header._M_right = &this->_M_header;
}

Assuming that (1) map implemented with red-black tree (2) you use same instance "after a zillion insertions/deletions"- answer "Yes".
Relative implmentation I can tell that all incarnation of stl I ever know use the tree algorithm.

A couple points:
1) end() references an element that is past the end of the container. It doesn't change when inserts or deletes change the container because it's not pointing to an element.
2) I think perhaps your idea of storing an array of 4 iterators in the Node could be changed to make the entire problem make more sense. What you want is to add a new iterator type to the Graph object that is capable of iterating over a single node's neighbours. The implementation of this iterator will need to access the members of the map, which possibly leads you down the path of making the Graph class extend the map collection. With the Graph class being an extended std::map, then the language changes, and you no longer need to store an invalid iterator, but instead simply need to write the algorithm to determine who is the 'next neighbour' in the map.

I think it is clear:
end() returns an iterator to the element one past the end.
Insertion/Deletion do not affect existing iterators so the returned values are always valid (unless you try to delete the element one past the end (but that would result in undefined behavior anyway)).
Thus any new iterator generated by end() (would be different but) when compared with the original using operator== would return true.
Also any intermediate values generated using the assignment operator= have a post condition that they compare equal with operator== and operator== is transitive for iterators.
So yes, it is valid to store the iterator returned by end() (but only because of the guarantees with associative containers, therefor it would not be valid for vector etc).
Remember the iterator is not necessarily a pointer. It can potentially be an object where the designer of the container has defined all the operations on the class.

I believe that this depends entirely on what type of iterator is being used.
In a vector, end() is the one past the end pointer and it will obviously change as elements are inserted and removed.
In another kind of container, the end() iterator might be a special value like NULL or a default constructed element. In this case it doesn't change because it doesn't point at anything. Instead of being a pointer-like thing, end() is just a value to compare against.
I believe that set and map iterators are the second kind, but I don't know of anything that requires them to be implemented in that way.

C++ Standard states that iterators should stay valid. And it is. Standard clearly states that in 23.1.2/8:
The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.
And in 21.1/7:
end() returns an iterator which is the past-the-end value for the container.
So iterators old_end and new_end will be valid. That means that we could get --old_end (call it it1) and --new_end (call it it2) and it will be the-end value iterators (from definition of what end() returns), since iterator of an associative container is of the bidirectional iterator category (according to 23.1.2/6) and according to definition of --r operation (Table 75).
Now it1 should be equal it2 since it gives the-end value, which is only one (23.1.2/9). Then from 24.1.3 follows that: The condition that a == b implies ++a == ++b. And ++it1 and ++it2 will give old_end and new_end iterators (from definition of ++r operation Table 74). Now we get that old_end and new_end should be equal.

I had a similar question recently, but I was wondering if calling end() to retrieve an iterator for comparison purposes could possibly have race conditions.
According to the standard, two iterators are considered equivalent if both can be dereferenced and &*a == &*b or if neither can be dereferenced. Finding the bolded statement took a while and is very relevant here.
Because an std::map::iterator cannot be invalidated unless the element it points to has been removed, you're guaranteed that two iterators returned by end, regardless of what the state of the map was when they were obtained, will always compare to each other as true.

Related

Are iterators still valid when the underlying elements have been moved?

If I have an iterator pointing to an element in an STL container, and I moved the element with the iterator, does the standard guarantee that the iterator is still valid? Can I use it with container's method, e.g. container::erase?
Also does it matter, if the container is a continuous one, e.g. vector, or non-continuous one, e.g. list?
std::list<std::string> l{"a", "b", "c"};
auto iter = l.begin();
auto s = std::move(*iter);
l.erase(iter); // <----- is it valid to erase it, whose underlying element has been removed?
Yes, you've modified the object in the container. You've not modified the container itself so the iterator is still valid
"Moving" an underlying element may not be the best name to use in this context. The name of this operation express the intention behind it but not how it really works.
In fact, the move operation is a form of copy operation with one difference: it is allowed to change the state of the "copied" object if it speeds up the execution. In case of the std::string this means that the internal buffer containing characters may be not deep-copied but just copied by address. The original object has to be then set to an empty state, to tell it to not use this buffer anymore. (Emptying the source string is not guaranteed. Optimizations of std::string are more complicated than I described.)
The important thing is that after the move operation, the original object is still there. It's just not guaranteed to have any specific state.
In this particular case you've done nothing to the iterator, but much rather to the object within it, so yes: The iterator remains valid.
But if you look at std::list::erase, it sports a line such as "References and iterators to the erased elements are invalidated. Other references and iterators are not affected."
So if you tried to do *iter after erase, it would cause your program to fail.
This may seem obvious for erase, but there are other operations where it is not as obvious.
For std::list for example, the reference page says:
Adding, removing and moving the elements within the list or across several lists does not invalidate the iterators or references. An iterator is invalidated only when the corresponding element is deleted.
For std::vector on the other hand, the reference for the push_back method says:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
That means, unlike with std::list, it is not generally safe to keep an iterator to an element around, if the vector grows (because the underlying storage location of the item changes).

Get pointer to node in std::list or std::forward_list

I am planning to use std::list in my code, I decided not to use std::forward_list, because for deletions (I figured) the whole list will have to traversed, O(N) complexity for std::forward_list (being a single link list). However, when I looked into the documentation I noticed both the stl containers have O(N) complexity to remove an item.
http://www.cplusplus.com/reference/forward_list/forward_list/remove/
http://www.cplusplus.com/reference/list/list/remove/
After some thinking I figured out why (I think). It's because in both cases, the whole list has to be scanned to find the node first, and then delete it. Is this right?
I then looked into the "erase" and "erase_after" methods, and their complexity is "Linear in the number of elements erased (destructions).". It's because, I am passing an iterator to the node (which is kind of like a "pointer"). However, I cannot (or prefer not to) pass this iterator around in my code to access the data in the node. I am not sure if this iterator will be valid if the list is modified? Thoughts?
My question is, is there a way I can get a pointer to the node in the list. That way, I know it will be valid throughout the lifetime of my program, pass it around. And I can just look into it to get access to my data.
However, I cannot (or prefer not to) pass this iterator around in my code to access the data in the node.
Why not? Iterators are easy to use and are quite lightweight. A pointer isn't better in any way.
I am not sure if this iterator will be valid if the list is modified?
For list, any iterator will remain valid, even if the list is modified. Except, of course, if you erase the particular element that is the iterator points to. But that's kind of obvious, you can' expect to have an iterator (or pointer) to something that doesn't exist any more.
(vector is more dangerous. One small change to a vector can invalidate all its iterators.)
You can take a pointer to any individual element in the list.
list<int> iterator it = find(l.begin(), l.end(), 7); // get an iterator
int * ptr = &*it; // get a pointer to the same element.
The pointer is similar to the iterator in many respects. But the iterator is a little more powerful. An iterator can be incremented or decremented, to access neighbouring elements in the list. And an iterator can be used to delete an element from the list. A pointer cannot do either of those things.
Both the iterator and pointer remain valid as long as that particular element isn't removed.
I am not sure if this iterator will be valid if the list is modified
Yeah, in the general case, storing iterators is risky unless you keep a close eye on the operations performed on your container.
Problem is, this is just the same for a pointer. In fact, for many containers, iterators are implemented as pointers.
So either store an iterator or a pointer if you like but, either way, keep an eye on the iterator invalidation rules:
Iterator invalidation rules
For lists, an iterator is valid even if other items in the list are erased. It becomes garbage when that item the iterator references in the list is removed.
So, as long as you know the iterator you're passing around isn't being removed by some other piece of code, they're safe to hold onto. This seems fragile though.
Even if there was a construct outside of iterators to reference a node in the list, it would suffer from the same fragility.
However, you can have each node contain an std::shared_ptr to the data it stores instead of the object itself and then pass around std::weak_ptr's to those objects and check for expired before accessing those weak_ptr's.
eg
instead of
std::list<MyClass> foo;
you would have
std::list<std::shared_ptr<MyClass>> foo;
have a look here for info on weak_ptr's
is there a way I can get a pointer to the node in the list
Yes, in your particular implementation.
No, in a standard-compliant way.
If you look at the std::list documentation, there is not a single word about a node. While it is hard to imagine a different way to implement the std::list other than using a doubly linked list, there is nothing that prevents it.
You should almost never come into any contact with undocumented internals of libraries.
Adding, removing and moving the elements within the list or across several lists does not invalidate the iterators or references. An iterator is invalidated only when the corresponding element is deleted.
Source: https://en.cppreference.com/w/cpp/container/list
So a std::list<>::iterator is only invalidated when the corresponding element is deleted. So yes, as long as you make sure that the corresponding element exists (which you will anyway have to do in your scenario of storing/passing around a pointer to anything) you can save and/or pass around the iterator throughout the lifetime of your program.
Now, an iterator is nothing but a pointer in disguise. So, if you prefer to save/pass around the corresponding pointer instead of iterator, you can always first convert the iterator to the pointer as #Aaron McDaid suggested.
int * ptr = &*it; // get a pointer to the same element.

For which standard container, if any, is the iterator returned by end() persistent?

I need a way to quickly access data in a container.
So I remember iterator of that data position. Container maybe modified (elements added and removed) after that, but if I use container type that does not invalidate my iterator (like std::map or std::list) I am fine.
Also my data may not be in the container (yet), so I set an iterator to container.end() to reflect that.
Which standard container guarantees that end() would not change when elements added and removed? So I can still compare my iterator to the value returned by container.end() and not get false negative.
23.2.4/9 says of Associative Containers:
The insert and emplace members shall not affect the validity of
iterators and references to the container, and the erase members shall
invalidate only iterators and references to the erased elements
Now, there are some places where the standard talks about not invalidating "iterators and references to elements of the container", thus excluding end(). I don't believe that this is one of them - I'm pretty sure that an end() iterator is an "iterator to the container".
23.3.5.4/1 says for std::list that insert "Does not affect the validity of iterators and references", and 23.3.5.4/3 says that erase "invalidates only the iterators and references to the erased elements". Again, end() iterators are iterators and so their validity isn't excluded.
One thing to watch out for is that for any container, swap can invalidate end() iterators (I assume this is because there are two "natural" behaviors, either that the end iterator points to the end of the same container or else to the end of the one it was swapped with, but the standard doesn't want to dictate which or rule out other possibilities). But you aren't swapping, just adding and removing elements.
From my experience, iterators from std::vector and std::dequeue break upon resizing from erasure or addition (this applies to std::string's storage as well). std::list and std::map don't allocate blocks of memory: they usually allocate individual nodes (and most std::unordered_map implementations have buckets as a linked list of items (e.g., std::list)).
If you need to have a surviving end() iterator, choose a std::list (I do this for my Signal/Slots implementation, for their tokens) or do your own personal bookkeeping with std::vector/std::dequeue.
EDIT:
So, std::list is a good way to have your iterators be always valid, provided your list itself never dies (which they aren't). From another answer, if you need standardese clarity:
23.3.5.4/1 says for std::list that insert "Does not affect the validity of iterators and references", and 23.3.5.4/3 says that erase "invalidates only the iterators and references to the erased elements". Again, end() iterators are iterators and so their validity isn't excluded. - Another Answer
Instead of iterators, use a vector and store index values. They'll survive any restructuring. Iterators are primarily intended for use in pairs that designate a range; hanging on to individual iterators gets messy, as you've seen.

Iterator equivalent to null pointer?

In an algorithm I'm currently implementing, I need to manipulate a std::list of struct T.
T holds a reference to another instance of T, but this reference can also be "unassigned".
At first, I wanted to use a pointer to hold this reference, but using an iterator instead makes it easier to remove from the list.
My question is : how to represent the equivalent to null pointer with my iterator?
I read general solution is to use myList.end(), but in my case, I need to test whether the iterator is "null" or not, and I may add or remove elements to the list between the moment when I store the iterator and the moment I remove it from list... Should I make the iterator point to a known list containing the "null" element? Or is there a more elegant solution?
According to this (emphasis by me):
Compared to the other base sequence
containers (vector and deque), lists
are the most efficient container doing
insertions at some position other than
the beginning or the end of the
sequence, and, unlike in these, all of
the previously obtained iterators and
references remain valid after the
insertion and refer to the same
elements they were referring before.
The same applies to erasure (with the obvious exception of iterators referring to a deleted element becoming invalidated). So yes, obtaining end() will always point to the same "invalid" element and should be safe to use.

Does a vector sort invalidate iterators?

std::vector<string> names;
std::vector<string>::iterator start = names.begin();
std::vector<string>::iterator end = names.end();
sort (start,end);
//are my start and end valid at this point?
//or they do not point to front and tail resp?
According to the C++ Standard §23.1/11:
Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking
a container member function or passing a container as an argument to a library function shall not invalidate
iterators to, or change the values of, objects within that container.
§25.3 "Sorting and related operations" doesn't specify that iterators will be invalidated, so iterators in the question should stay valid.
They still point to the beginning and end. The values in those slots of the vector have probably changed, but the storage location in which each resides remains the same.
std::sort will not invalidate iterators to a vector. The sort template uses the * operator on the iterators to access and modify the contents of the vector, and modifying a vector element though an iterator to an element already in the vector will not invalidate any iterators.
In summary,
your existing iterators will not be invalidated
however, the elements they point to may have been modified
In addition to the support for the standard provided by Kirill V. Lyadvinsky (Does a vector sort invalidate iterators?):
25/5 "Algorithms library"
If an algorithm’s Effects section says
that a value pointed to by any
iterator passed as an argument is
modified, then that algorithm has an
additional type requirement: The type
of that argument shall satisfy the
requirements of a mutable iterator
(24.1).
24.1/4 "Iterator requirements"
Besides its category, a forward,
bidirectional, or random access
iterator can also be mutable or
constant depending on whether the
result of the expression *i behaves as
a reference or as a reference to a
constant.
std::vector keeps its elements in contiguous memory. std::sort takes arguments (iterators) by value and re-arranges the sequence between them. The net result is your local variables start and end are still pointing to first and one-past-the-last elements of the vector.