Iterators invalidation - c++

Hi I read in C++ primer that adding elements to a vector invalidates the iterators. I don't understand why deleting elements doesn't invalidate them as the following code works
std::vector<int> a = {1,2,3,4,5,6};
auto b = a.begin();
while (b != a.end()){
if (*b%2 != 0)
a.erase(b);
else
b++;
}
NOTE: This code if from C++ primer itself and cpp reference as well

Not actually an answer to the question, but I think it is worth mentioning that in modern C++ you should try to avoid iterators by using algorithms and range-based for loops. In this particular case use std::erase_if:
std::vector<int> a = {1,2,3,4,5,6};
std::erase_if(a, [](int x) { return x%2 != 0; });

In general this code snippet
auto b = a.begin();
while (b != a.end()){
if (*b%2 != 0)
a.erase(b);
else
b++;
}
is invalid. It works because the container std::vector satisfies the concept of contiguous ranges. If instead of the vector you will use for example std::list<int> when the iterator b will be invalid.
It would be correctly to write
auto b = a.begin();
while (b != a.end()){
if (*b%2 != 0)
b = a.erase(b);
else
b++;
}

Common idiom. From cppreference: (erase) 'Invalidates iterators and references at or after the point of the erase, including the end() iterator.
Others have pointed out it should be written like this:
#include <vector>
std::vector<int> vec = { 1, 2, 3, 4 };
for (auto it = vec.begin(); it != vec.end(); )
{
if (*it % 2 != 0)
{
it = vec.erase(it);
}
else
{
++it;
}
}
Adjust if one prefers 'while' over 'for'. If performance is paramount one can start from the end though this may be less cache friendly.
Edit: code snippet is literally the cppreference link.

Adding elements to a vector may lead to a complete reallocation of the vector.
This invalidates all iterators held previously.
If you delete an entry of the vector the erase method returns:
a) a iterator to the next valid element
b) the end iterator.
But you have to use it. In your case:
b = a.erase(b);

As many pointed out it works by chance. Do not do this in prod.
Iterators are designed to be as lightweight as possible so there won't be a flag saying it's invalid. That would be too wasteful.
std::vector iterator is probably implemented as a pointer with some helper functions. Removing one element will shift everything so that same pointer now points to the new element where the old one used to be. It only works because elements are stored in contiguous memory without gaps.

Related

c++ – Disappearing Variable

I am writing a function to take the intersection of two sorted vector<size_t>s named a and b. The function iterates through both vectors removing anything from a that is not also in b so that whatever remains in a is the intersection of the two. Code here:
void intersect(vector<size_t> &a, vector<size_t> &b) {
vector<size_t>::iterator aItr = a.begin();
vector<size_t>::iterator bItr = b.begin();
vector<size_t>::iterator aEnd = a.end();
vector<size_t>::iterator bEnd = b.end();
while(aItr != aEnd) {
while(*bItr < *aItr) {
bItr++;
if(bItr == bEnd) {
a.erase(aItr, aEnd);
return;
}
}
if (*aItr == *bItr) aItr++;
else aItr = a.erase(aItr, aItr+1);
}
}
I am getting a very bug. I am stepping the debugger and once it passes line 8 "while(*bItr < *aItr)" b seems to disappear. The debugger seems not to know that b even exists! When b comes back into existence after it goes back to the top of the loop it has now taken on the values of a!
This is the kind of behavior that I expect to see in dynamic memory error, but as you can see I am not managing any dynamic memory here. I am super confused and could really use some help.
Thanks in advance!
Well, perhaps you should first address a major issue with your code: iterator invalidation.
See: Iterator invalidation rules here on StackOverflow.
When you erase an element in a vector, iterators into that vector at the point of deletion and further on are not guaranteed to be valid. Your code, though, assumes such validity for aEnd (thanks #SidS).
I would guess either this is the reason for what you're seeing, or maybe it's your compiler optimization flags which can change the execution flow, the lifetimes of variables which are not necessary, etc.
Plus, as #KT. notes, your erases can be really expensive, making your algorithm potentially quadratic-time in the length of a.
You are making the assumption that b contains at least one element. To address that, you can add this prior to your first loop :
if (bItr == bEnd)
{
a.clear();
return;
}
Also, since you're erasing elements from a, aEnd will become invalid. Replace every use of aEnd with a.end().
std::set_intersection could do all of this for you :
void intersect(vector<size_t> &a, const vector<size_t> &b)
{
auto it = set_intersection(a.begin(), a.end(), b.begin(), b.end(), a.begin());
a.erase(it, a.end());
}

C++ comparing iterator with int

Is there a simple way to compare an iterator with an int?
I have a loop like this:
for (std::vector<mystruct>::const_iterator it = vec->begin(); it != vec->end(); ++it)
Instead of looping over the entire vector, I would like to just loop over the first 3 elements. However, the following does not compile:
for (std::vector<mystruct>::const_iterator it = vec->begin(); it < 3; ++it)
It there a good way to achieve the same effect?
since it's a vector, why not just access its position directly ?
if (vec->size() > 0)
{
for (int i =0; i<3 && i< vec->size(); i++)
{
// vec[i] can be accessed directly
//do stuff
}
}
std::next(vec->begin(), 3); will be the the iterator 3 places after the first, and so you can compare to it:
for (std::vector<mystruct>::const_iterator it = vec->begin(); it != std::next(vec->begin(), 3); ++it)
Your vector will need to have at least 3 elements inside it though.
I'd want to be careful, because you can easily run into fencepost bugs.
This works on random access containers (like vector and array), but doesn't do ADL on begin because I'm lazy:
template<typename Container>
auto nth_element( Container&& c, std::size_t n )->decltype( std::begin(c) )
{
auto retval = std::begin(c);
std::size_t size = std::end(c) - retval;
retval += std::min( size, n );
return retval;
}
It returns std::end(c) if n is too big.
So you get:
for( auto it = vec->cbegin(); it != nth_element(vec, 3); ++it) {
// code
}
which deals with vectors whose size is less than 3 gracefully.
The basic core of this is that on random access iterators, the difference of iterators is ptrdiff_t -- an integral type -- and you can add integral types to iterators to move around. I just threw in a helper function, because you should only do non-trivial pointer arithmetic (and arithmetic on iterators is pointer arithmetic) in isolated functions if you can help it.
Supporting non-random access iterators is a matter of doing some traits checks. I wouldn't worry about that unless you really need it.
Note that this answer depends on some C++11 features, but no obscure ones. You'll need to #include <iterator> for std::begin and std::end and maybe <algorithm> for std::min.
Sure, you can simply go three elements past the beginning.
for (std::vector<mystruct>::const_iterator it = vec->cbegin(); it != vec->cbegin() + 3; ++it)
However, that might be error prone since you might try to access beyond the end in the case that the vector is fewer than 3 elements. I think you'd get an exception when that happens but you could prevent it by:
for(std::vector<mystruct>::const_iterator it = vec->cbegin(); it != vec->cend() && it != vec->cbegin() + 3; ++it)
Note the use of cbegin() and cend() since you asked for a const_iterator, although these are only available in c++11. You could just as easily use begin() and end() with your const_iterator.

Using std::deque::iterator (in C++ STL) for searching and deleting certain elements

I have encountered a problem invoking the following code:
#include<deque>
using namespace std;
deque<int> deq = {0,1,2,3,4,5,6,7,8};
for(auto it = deq.begin(); it != deq.end(); it++){
if(*it%2 == 0)
deq.erase(it);
}
which resulted in a segmentation fault. After looking into the problem I found that the problem resides in the way the STL manages iterators for deques: if the element being erased is closer to the end of the deque, the iterator used to point to the erased element will now point to the NEXT element, but not the previous element as vector::iterator does. I understand that modifying the loop condition from it != deq.end() to it < deq.end() could possibly solve the problem, but I just wonder if there is a way to traverse & erase certain element in a deque in the "standard form" so that the code can be compatible to other container types as well.
http://en.cppreference.com/w/cpp/container/deque/erase
All iterators and references are invalidated [...]
Return value : iterator following the last removed element.
This is a common pattern when removing elements from an STL container inside a loop:
for (auto i = c.begin(); i != c.end() ; /*NOTE: no incrementation of the iterator here*/) {
if (condition)
i = c.erase(i); // erase returns the next iterator
else
++i; // otherwise increment it by yourself
}
Or as chris mentioned you could just use std::remove_if.
To use the erase-remove idiom, you'd do something like:
deq.erase(std::remove_if(deq.begin(),
deq.end(),
[](int i) { return i%2 == 0; }),
deq.end());
Be sure to #include <algorithm> to make std::remove_if available.

std::vector iterator invalidation

There have been a few questions regarding this issue before; my understanding is that calling std::vector::erase will only invalidate iterators which are at a position after the erased element. However, after erasing an element, is the iterator at that position still valid (provided, of course, that it doesn't point to end() after the erase)?
My understanding of how a vector would be implemented seems to suggest that the iterator is definitely usable, but I'm not entirely sure if it could lead to undefined behavior.
As an example of what I'm talking about, the following code removes all odd integers from a vector. Does this code cause undefined behavior?
typedef std::vector<int> vectype;
vectype vec;
for (int i = 0; i < 100; ++i) vec.push_back(i);
vectype::iterator it = vec.begin();
while (it != vec.end()) {
if (*it % 2 == 1) vec.erase(it);
else ++it;
}
The code runs fine on my machine, but that doesn't convince me that it's valid.
after erasing an element, is the iterator at that position still valid
No; all of the iterators at or after the iterator(s) passed to erase are invalidated.
However, erase returns a new iterator that points to the element immediately after the element(s) that were erased (or to the end if there is no such element). You can use this iterator to resume iteration.
Note that this particular method of removing odd elements is quite inefficient: each time you remove an element, all of the elements after it have to be moved one position to the left in the vector (this is O(n2)). You can accomplish this task much more efficiently using the erase-remove idiom (O(n)). You can create an is_odd predicate:
bool is_odd(int x) { return (x % 2) == 1; }
Then this can be passed to remove_if:
vec.erase(std::remove_if(vec.begin(), vec.end(), is_odd), vec.end());
Or:
class CIsOdd
{
public:
bool operator()(const int& x) { return (x % 2) == 1; }
};
vec.erase(std::remove_if(vec.begin(), vec.end(), CIsOdd()), vec.end());

C++ iterators problem

I'm working with iterators on C++ and I'm having some trouble here. It says "Debug Assertion Failed" on expression (this->_Has_container()) on line interIterator++.
Distance list is a vector< vector< DistanceNode > >. What I'm I doing wrong?
vector< vector<DistanceNode> >::iterator externIterator = distanceList.begin();
while (externIterator != distanceList.end()) {
vector<DistanceNode>::iterator interIterator = externIterator->begin();
while (interIterator != externIterator->end()){
if (interIterator->getReference() == tmp){
//remove element pointed by interIterator
externIterator->erase(interIterator);
} // if
interIterator++;
} // while
externIterator++;
} // while
vector's erase() returns a new iterator to the next element. All iterators to the erased element and to elements after it become invalidated. Your loop ignores this, however, and continues to use interIterator.
Your code should look something like this:
if (condition)
interIterator = externIterator->erase(interIterator);
else
++interIterator; // (generally better practice to use pre-increment)
You can't remove elements from a sequence container while iterating over it — at least not the way you are doing it — because calling erase invalidates the iterator. You should assign the return value from erase to the iterator and suppress the increment:
while (interIterator != externIterator->end()){
if (interIterator->getReference() == tmp){
interIterator = externIterator->erase(interIterator);
} else {
++interIterator;
}
}
Also, never use post-increment (i++) when pre-increment (++i) will do.
I'll take the liberty to rewrite the code:
class ByReference: public std::unary_function<bool, DistanceNode>
{
public:
explicit ByReference(const Reference& r): mReference(r) {}
bool operator()(const DistanceNode& node) const
{
return node.getReference() == r;
}
private:
Reference mReference;
};
typedef std::vector< std::vector< DistanceNode > >::iterator iterator_t;
for (iterator_t it = dl.begin(), end = dl.end(); it != end; ++it)
{
it->erase(
std::remove_if(it->begin(), it->end(), ByReference(tmp)),
it->end()
);
}
Why ?
The first loop (externIterator) iterates over a full range of elements without ever modifying the range itself, it's what a for is for, this way you won't forget to increment (admittedly a for_each would be better, but the syntax can be awkward)
The second loop is tricky: simply speaking you're actually cutting the branch you're sitting on when you call erase, which requires jumping around (using the value returned). In this case the operation you want to accomplish (purging the list according to a certain criteria) is exactly what the remove-erase idiom is tailored for.
Note that the code could be tidied up if we had true lambda support at our disposal. In C++0x we would write:
std::for_each(distanceList.begin(), distanceList.end(),
[const& tmp](std::vector<DistanceNode>& vec)
{
vec.erase(
std::remove_if(vec.begin(), vec.end(),
[const& tmp](const DistanceNode& dn) { return dn.getReference() == tmp; }
),
vec.end()
);
}
);
As you can see, we don't see any iterator incrementing / dereferencing taking place any longer, it's all wrapped in dedicated algorithms which ensure that everything is handled appropriately.
I'll grant you the syntax looks strange, but I guess it's because we are not used to it yet.