I'm working on implementing my own list. I see that std::list::end() returns an iterator to one past the last element in the list container. I'm wondering how the position of this past-the-end element is determined, given that list elements are stored in non-contiguous memory locations.
std::list<int> ls;
ls.push_back(1);
ls.push_back(2);
std::list<int>::iterator it = ls.end();
std::cout << &(*it) << std::endl << &(*++it) << std::endl << &(*++it) << std::endl;
As the code above shows, I can even increment the iterator to point to the next elements. How can it be known at which positions (in memory) the next elements will be stored?
How can it be known at which positions (in memory) the next elements will be stored?
It is not. Also, using that memory address as (part of) the past-the-end iterator would be incorrect.
Is not:
An iterator is not (necessarily) a pointer. An iterator is not required to store a memory address. What is required is that the de-reference operator be able to calculate a memory address (returned in the form of a reference). Good news, everyone! Applying the de-reference operator to the past-the-end iterator is undefined behavior. So even this reduced requirement is not applicable to the past-the-end iterator. If you are storing an address, go ahead and store whatever you want. (Just be consistent since two past-the-end iterators must compare equal.)
If your iterator does store a pointer (which admittedly is probably common), a simple approach would be to store whatever you would put in the next field of the last node in the list. This is typically either nullptr or a pointer to the list's sentinel node.
Would be incorrect:
A std::list does not invalidate iterators when elements are added to the list. This includes the past-the-end iterator. (See cppreference.com.) If your past-the-end iterator pointed to where the next element would be stored, it would be invalidated by adding that element to the list. Thus, you would fail to meet the iterator invalidation requirements for a std::list. So not only is storing that address in the past-the-end iterator impossible, it's not allowed.
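To make the sentinel idea concrete, here is a minimal sketch of a doubly linked list whose past-the-end iterator simply points at a sentinel node owned by the list. The names IntList, Node, and Iter are hypothetical and chosen for illustration; this is not how any particular standard library implements std::list, only one arrangement that satisfies the requirements described above.

```cpp
#include <cstddef>

// Hypothetical node type; `value` is unused in the sentinel node.
struct Node {
    Node* prev;
    Node* next;
    int value;
};

// Hypothetical minimal list of ints with a sentinel node.
class IntList {
public:
    struct Iter {
        Node* node;
        int& operator*() const { return node->value; }  // must not be used on end()
        Iter& operator++() { node = node->next; return *this; }
        Iter& operator--() { node = node->prev; return *this; }
        bool operator==(const Iter& o) const { return node == o.node; }
        bool operator!=(const Iter& o) const { return node != o.node; }
    };

    IntList() {
        // An empty list is a sentinel linked to itself.
        sentinel_.prev = &sentinel_;
        sentinel_.next = &sentinel_;
        sentinel_.value = 0;
    }
    ~IntList() {
        Node* n = sentinel_.next;
        while (n != &sentinel_) {
            Node* next = n->next;
            delete n;
            n = next;
        }
    }
    IntList(const IntList&) = delete;
    IntList& operator=(const IntList&) = delete;

    Iter begin() { return Iter{sentinel_.next}; }
    Iter end() { return Iter{&sentinel_}; }  // always the sentinel, never a "future" node

    void push_back(int v) {
        // Link the new node between the current last node and the sentinel.
        Node* n = new Node{sentinel_.prev, &sentinel_, v};
        sentinel_.prev->next = n;
        sentinel_.prev = n;
    }

private:
    Node sentinel_;
};
```

Because end() always refers to the sentinel, an end iterator obtained before a push_back still compares equal to end() afterwards, which is the invalidation guarantee discussed above.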
The text below is a snippet from Effective STL, Item 9:
ofstream logFile; // log file to write to
AssocContainer<int> c;
…
for (AssocContainer<int>::iterator i = c.begin(); // loop conditions are the
i !=c.end();){ //same as before
if (badValue(*i)){
logFile << "Erasing " << *i <<'\n'; // write log file
c.erase(i++); // erase element
}
else ++i;
}
It's vector, string, and deque that now give us trouble. We can't use the erase-remove idiom any longer, because there's no way to get erase or remove to write the log file. Furthermore, we can't use the loop we just developed for associative containers, because it yields undefined behavior for vectors, strings, and deques! Recall that for such containers, invoking erase not only invalidates all iterators pointing to the erased element, it also invalidates all iterators beyond the erased element. In our case, that includes all iterators beyond i. It doesn't matter if we write i++, ++i, or anything else you can think of, because none of the resulting iterators is valid.
We must take a different tack with vector, string, and deque. In particular, we must take advantage of erase's return value. That return value is exactly what we need: it's a valid iterator pointing to the element following the erased element once the erase has been accomplished. In other words, we write this:
for (SeqContainer<int>::iterator i = c.begin();
i != c.end();){
if (badValue(*i)){
logFile << "Erasing " << *i << '\n';
i = c.erase(i); // keep i valid by assigning
} //erase's return value to it
else ++i;
}
My question: the author says that using erase on vectors, strings, and deques "invalidates all iterators pointing to the erased element, it also invalidates all iterators beyond the erased element", but the later statement seems to contradict this, because it says we can use the return value of erase. Question: doesn't erase invalidate all iterators beyond the erased element?
Question: doesn't erase invalidate all iterators beyond the erased element?
It does, but that is irrelevant to you in this case. erase() invalidates existing iterators at or after the erased element that were obtained before the call to erase(), but what erase() itself returns is a new, valid iterator in any case.
I am trying to delete an object from a vector at a specific index. The vector iterator keeps track of the index throughout the program. In the code below, the first IF statement works perfectly. But if the iterator is pointing anywhere OTHER than the last element, I erase the element from the vector and then increment the iterator. The program crashes and says "iterator not incrementable".
I ran the debugger several times and everything looks correct, so I cannot see what I am missing?
vector<Card> myVector; //container to hold collection of cards.
vector<Card>::iterator myVectorIterator; //points to each "card" in the collection.
Card Collection::remove()
{
if (myVectorIterator== myVector.end()-1) { //at the last card
//erase the "current" card
myVector.erase(myVectorIterator);
//update to the first card.
myVectorIterator= myVector.begin();
}
else
{
myVector.erase(myVectorIterator);
//crashes here!
myVectorIterator++;
}
return *myVectorIterator;
}
erase invalidates the iterator, so you can't use it afterwards. But it helpfully returns an iterator to the element after the removed one:
myVectorIterator = myVector.erase(myVectorIterator);
This is because the call to erase invalidates every iterator at or after the erased element, including the one you passed in. Imagine you have something pointing to an element and that element disappears: what should it point to?
Worse still, using the invalidated iterator is undefined behavior, so what actually happens depends on the implementation. The correct approach is to use the return value from erase, which will be an iterator to the next element in the vector.
myVectorIterator = myVector.erase(myVectorIterator);
Note that you now have to remove the increment on the next line (or you will skip an item).
In general, when working with STL containers, the documentation tells you whether a given operation affects iterators. If you take a look at this link, you will see that for vector::erase this is the case:
"Iterators, pointers and references pointing to position (or first) and beyond are invalidated, with all iterators, pointers and references to elements before position (or first) are guaranteed to keep referring to the same elements they were referring to before the call."
Different containers may have different guarantees when it comes to iterator validity.
You have to do this
myVectorIterator = myVector.erase(myVectorIterator);
It erases the element the iterator refers to, then assigns an iterator to the next element back to it.
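Putting the advice from these answers together, a remove() that keeps a "current card" iterator valid might look like the sketch below. The Card struct and the members cards_, current_, next(), and empty() are hypothetical stand-ins, since the question does not show the full class. Note two details the one-line fix does not cover by itself: erase can return end() when the last card was removed, so the iterator is wrapped back to begin(), and nothing is dereferenced when the collection is empty.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Card type.
struct Card {
    int id;
};

class Collection {
public:
    explicit Collection(std::vector<Card> cards)
        : cards_(std::move(cards)), current_(cards_.begin()) {}

    // The stored iterator refers into cards_, so copying is disabled here.
    Collection(const Collection&) = delete;
    Collection& operator=(const Collection&) = delete;

    bool empty() const { return cards_.empty(); }

    // Precondition: !empty().
    const Card& current() const { return *current_; }

    // Advances to the next card, wrapping around to the first one.
    void next() {
        if (cards_.empty()) return;
        ++current_;
        if (current_ == cards_.end()) current_ = cards_.begin();
    }

    // Removes the current card and makes the following card current,
    // wrapping to the first card if the last one was removed.
    // Returns false if there was nothing to remove.
    bool remove() {
        if (cards_.empty()) return false;
        current_ = cards_.erase(current_);  // the old iterator is invalid; use the returned one
        if (current_ == cards_.end()) current_ = cards_.begin();
        return true;
    }

private:
    std::vector<Card> cards_;
    std::vector<Card>::iterator current_;
};
```

With this shape there is no special case for "the last card": both branches of the original code collapse into a single erase followed by a wrap-around check.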
Because vectors use an array as their underlying storage, inserting
elements in positions other than the vector end causes the container
to relocate all the elements that were after position to their new
positions.
< http://www.cplusplus.com/reference/vector/vector/insert/ >
I thought that this is the reason that iterator it becomes no longer valid after the last line in code below:
std::vector<int> myvector (3,100);
std::vector<int>::iterator it;
it = myvector.begin();
it = myvector.insert ( it , 200 );
myvector.insert (it,2,300);
But if I change its initialization to myvector.end(), the result is still the same. What is the reason behind this? How exactly does it work, and are there situations where an iterator passed to insert can still be valid after inserting some elements (or a single one) into the vector?
Well, yes: that is the reason iterators to elements after the insertion point (and at it, because insertion is done before the given element) are invalidated. If you insert at the end, there are no elements whose iterators could be invalidated. The end iterator, however, is always invalidated, no matter where you insert. The more relevant description on that page:
Iterator validity
If a reallocation happens, all iterators, pointers and references related to the container are invalidated.
Otherwise, only those pointing to position and beyond are invalidated, with all iterators, pointers and references to elements before position guaranteed to keep referring to the same elements they were referring to before the call.
Here's what it points to if you change the first assignment to end.
it = myvector.end();
it points to end, good.
it = myvector.insert ( it , 200 );
Inserting to end does not invalidate any pointers to elements, but it does invalidate the end iterator which is the old value for it. Luckily, you now assign to the iterator returned by insert. That iterator does not point to the end of the vector but to the newly inserted element.
myvector.insert (it,2,300);
Now it is invalidated again, but you don't reassign it, so it remains so.
Of course, then there is the possibility, after each insert, that the vector was reallocated in which case all previous iterators to any part of the vector would be invalidated. That can be avoided by guaranteeing sufficient space with vector::reserve before initializing the iterators. The new iterator returned by insert will always be valid, even if the vector was reallocated.
Here is a better reference and explanation.
Causes reallocation if the new size() is greater than the old capacity(). If the new size() is greater than capacity(), all iterators and references are invalidated. Otherwise, only the iterators and references before the insertion point remain valid. The past-the-end iterator is also invalidated.
— http://en.cppreference.com/w/cpp/container/vector/insert
In the case where you use end(), size() still goes above capacity(). Try setting the capacity to something larger before insert().
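The two pieces of advice above (reserve enough capacity first, and always reassign from the return value of insert) can be combined as in the following sketch. The function name insertWithValidIterators is hypothetical, and the code assumes C++11 or later, where the count overload of insert returns an iterator to the first inserted element (before C++11 it returned void).

```cpp
#include <vector>

// Precondition (an assumption of this sketch): v is not empty.
// Appends 200, then inserts two 300s before it, keeping every iterator
// that is used afterwards valid. Returns true if the checks at the end hold.
inline bool insertWithValidIterators(std::vector<int>& v) {
    // Room for the three new elements, so the inserts below cannot reallocate.
    v.reserve(v.size() + 3);

    // Refers to an element before both insertion points; without
    // reallocation it stays valid.
    const std::vector<int>::iterator first = v.begin();

    // Inserting at end() invalidates the old end iterator; keep the returned one.
    std::vector<int>::iterator it = v.insert(v.end(), 200);

    // This insert invalidates `it` (it is at the insertion point),
    // so reassign it from the return value again.
    it = v.insert(it, 2, 300);

    return first == v.begin() && *it == 300;
}
```

Starting from the question's vector of three 100s, the result is 100 100 100 300 300 200, with `it` referring to the first 300.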
typedef struct value
{
char* contents;
int size;
}Value;
hash_map<Key,list<Value>,hash<Key>,eqKey> dspace;
hash_map<Key, list<Value>, hash<Key>, eqKey>::iterator itr;
list<Value> vallist;
list<Value>::iterator valitr;
Value * ptr;
itr=dspace.find(searchKey);
valitr=(itr->second).begin();
valitr++;
ptr=&*valitr;
Here, ptr points to the element that the iterator valitr refers to. Now I want to erase this element from the list using this pointer. I have found that the list.erase function does this, but I have to provide the position or iterator.
Please give me some idea how I can erase this element using the pointer instead of going through the list.
valitr denotes the position in the list. *valitr dereferences the iterator, giving you a reference to the value stored at that position; that reference (and any pointer taken from it) carries no information about the list the value is stored in.
If you do need to erase a particular element in the list, and not just the second one, you have to scan the list from begin() to end(), check each element against your condition, and call erase with the iterator to the matching element.
The API of the list does not intend to have elements deleted by pointer. You need the iterator.
Depending on the implementation you are using, there might be ways to get the element's iterator from a pointer, but that is not guaranteed, and it might change later.
Try to keep the iterator somehow.
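If only the pointer is available, the scan described above can be written as a small helper that compares element addresses. This is a sketch: the name eraseByAddress is hypothetical, Value is declared here as a plain struct with the same members as in the question, and the cost is linear in the length of the list.

```cpp
#include <list>

// Same members as the question's Value type.
struct Value {
    char* contents;
    int size;
};

// Erases the element of `values` that lives at address `ptr`.
// Returns true if such an element was found and erased, false otherwise.
inline bool eraseByAddress(std::list<Value>& values, const Value* ptr) {
    for (std::list<Value>::iterator it = values.begin(); it != values.end(); ++it) {
        if (&*it == ptr) {     // compare addresses, not contents
            values.erase(it);  // only this element's iterators and pointers become invalid
            return true;
        }
    }
    return false;
}
```

In the question's code this would be called as eraseByAddress(itr->second, ptr). Keeping valitr around and calling erase(valitr) directly avoids the scan entirely.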
std::vector<int> ints;
// ... fill ints with random values
for(std::vector<int>::iterator it = ints.begin(); it != ints.end(); )
{
if(*it < 10)
{
*it = ints.back();
ints.pop_back();
continue;
}
it++;
}
This code is not working because when pop_back() is called, it is invalidated. But I can't find any documentation about iterator invalidation in std::vector::pop_back().
Do you have some links about that?
The call to pop_back() removes the last element in the vector and so the iterator to that element is invalidated. The pop_back() call does not invalidate iterators to items before the last element, only reallocation will do that. From Josuttis' "C++ Standard Library Reference":
Inserting or removing elements
invalidates references, pointers, and
iterators that refer to the following
element. If an insertion causes
reallocation, it invalidates all
references, iterators, and pointers.
Here is your answer, directly from The Holy Standard:
23.2.4.2 A vector satisfies all of the requirements of a container and of a reversible container (given in two tables in 23.1) and of a sequence, including most of the optional sequence requirements (23.1.1).
23.1.1.12 Table 68
expression: a.pop_back()
return type: void
operational semantics: a.erase(--a.end())
container: vector, list, deque
Notice that a.pop_back is equivalent to a.erase(--a.end()). Looking at vector's specifics on erase:
23.2.4.3.3 - iterator erase(iterator position) - effects - Invalidates all the iterators and references after the point of the erase
Therefore, once you call pop_back, any iterators to the previously final element (which now no longer exists) are invalidated.
Looking at your code, the problem is that when you remove the final element and the vector becomes empty, you still increment it and walk off the end of the vector.
I use the numbering scheme of the C++0x working draft.
Table 94 at page 732 says that pop_back (if it exists in a sequence container) has the following effect:
{ iterator tmp = a.end();
--tmp;
a.erase(tmp); }
23.1.1, point 12 states that:
Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container
member function or passing a container as an argument to a library function shall not invalidate iterators to, or change
the values of, objects within that container.
Neither accessing end() nor applying prefix -- has such an effect; erase(), however, does:
23.2.6.4 (concerning vector.erase() point 4):
Effects: Invalidates iterators and references at or after the point of the erase.
So in conclusion: pop_back() will only invalidate an iterator to the last element, per the standard.
Here is a quote from SGI's STL documentation (http://www.sgi.com/tech/stl/Vector.html):
[5] A vector's iterators are invalidated when its memory is reallocated. Additionally, inserting or deleting an element in the middle of a vector invalidates all iterators that point to elements following the insertion or deletion point. It follows that you can prevent a vector's iterators from being invalidated if you use reserve() to preallocate as much memory as the vector will ever use, and if all insertions and deletions are at the vector's end.
I think it follows that pop_back only invalidates the iterator pointing at the last element and the end() iterator. We really need to see the data for which the code fails, as well as the manner in which it fails to decide what's going on. As far as I can tell, the code should work - the usual problem in such code is that removal of element and ++ on iterator happen in the same iteration, the way #mikhaild points out. However, in this code it's not the case: it++ does not happen when pop_back is called.
Something bad may still happen when it is pointing to the last element, and the last element is less than 10. We're now comparing an invalidated it and end(). It may still work, but no guarantees can be made.
Iterators are only invalidated on reallocation of storage. Google is your friend: see footnote 5.
Your code is not working for other reasons.
pop_back() invalidates only iterators that point to the last element. From C++ Standard Library Reference:
Inserting or removing elements
invalidates references, pointers, and
iterators that refer to the following
element. If an insertion causes
reallocation, it invalidates all
references, iterators, and pointers.
So to answer your question, no it does not invalidate all iterators.
However, in your code example, pop_back() can invalidate it when it is pointing to the last element and the value is below 10. In that case the Visual Studio debug STL will mark the iterator as invalidated, and the subsequent check that it is not equal to end() will trigger an assertion.
If iterators are implemented as plain pointers (as they probably are in all non-debug STL vector implementations), your code should just work. If iterators are more than pointers, then your code does not handle this case of removing the last element correctly.
The error is that when "it" points to the last element of the vector and this element is less than 10, the last element is removed. Now "it" points to ints.end(); the next "it++" moves it to ints.end()+1, so "it" runs away past ints.end() and you get an infinite loop scanning all your memory :).
The "official specification" is the C++ Standard. If you don't have access to a copy of C++03, you can get the latest draft of C++0x from the Committee's website: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2723.pdf
The "Operational Semantics" section of the container requirements specifies that pop_back() is equivalent to { iterator i = end(); --i; erase(i); }. The [vector.modifiers] section for erase says "Effects: Invalidates iterators and references at or after the point of the erase."
If you want the intuitive argument: pop_back is no-fail (since destructors of value_types in standard containers are not allowed to throw exceptions), so it cannot do any copying or allocation (since those can throw), which means you can infer that the iterator to the erased element and the end iterator are invalidated, but the rest are not.
pop_back() will only invalidate it if it was pointing to the last item in the vector. Your code will therefore fail whenever the last int in the vector is less than 10, as follows:
*it = ints.back(); // Set *it to the value it already has
ints.pop_back(); // Invalidate the iterator
continue; // Loop round and access the invalid iterator
You might want to consider using the return value of erase instead of swapping the back element into the deleted position and popping back. For sequences, erase returns an iterator pointing to the element one beyond the element being deleted. Note that this method may cause more copying than your original algorithm.
for(std::vector<int>::iterator it = ints.begin(); it != ints.end(); )
{
if(*it < 10)
it = ints.erase( it );
else
++it;
}
std::remove_if could also be an alternative solution.
struct LessThanTen { bool operator()( int n ) const { return n < 10; } };
ints.erase( std::remove_if( ints.begin(), ints.end(), LessThanTen() ), ints.end() );
std::remove_if is (like my first algorithm) stable, so it may not be the most efficient way of doing this, but it is succinct.
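If order does not matter and the swap-and-pop approach from the question is preferred for its lower copying cost, one way to sidestep the invalidation problem is to walk the vector by index instead of by iterator. The sketch below is an assumption about what the question intended; the name unstableRemoveBelow and the limit parameter are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Removes every value below `limit` without preserving element order.
// Indices are used instead of iterators, so pop_back() cannot leave
// the loop variable invalidated.
inline void unstableRemoveBelow(std::vector<int>& ints, int limit) {
    std::size_t i = 0;
    while (i < ints.size()) {
        if (ints[i] < limit) {
            ints[i] = ints.back();  // overwrite with the last element (a no-op if i is the last index)
            ints.pop_back();        // shrink; slot i is re-checked on the next pass if it still exists
        } else {
            ++i;
        }
    }
}
```

When the element at the last index is removed, size() drops to i and the loop condition ends the loop, which is exactly the case that leaves `it` invalidated in the iterator version.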
Check out the information here (cplusplus.com):
Delete last element
Removes the last element in the vector, effectively reducing the vector size by one and invalidating all iterators and references to it.