Following one of the "deleting while iterating" patterns on a vector, I don't understand why this code works, or if it's making use of undefined behavior:
The Code:
#include <vector>
#include <string>
#include <iostream>
int main(int argc, char* argv[], char* envz[])
{
std::vector<std::string> myVec;
myVec.push_back("1");
myVec.push_back("2");
myVec.push_back("3");
for (std::vector<std::string>::iterator i = myVec.begin();
i != myVec.end();
++i)
{
if ("1" == *i)
{
std::cout << "Erasing " << *i << std::endl;
i = myVec.erase(i);
--i;
continue;
}
std::cout << *i << std::endl;
}
return 0;
}
The Output:
>g++ -g main.cpp
>./a.out
Erasing 1
2
3
Question:
Consider the first iteration of the for-loop:
i is myVec.begin(), which "points to" 1.
We enter the conditional block.
1 is erased and i is set to one past the erased element, i.e. 2, which is now also pointed to by myVec.begin()
I decrement i, so now it points to...one prior to myVec.begin() ???
I'm confused by why this seems to work, as evidenced by the output, but something feels fishy about decrementing the iterator. This code is easy enough to rationalize if the conditional is if ("2" == *i), because the iterator decrement still places it at a valid entry in the vector. I.e. if we conditionally erased 2, i would be set to point to 3, but then manually decremented and thus point to 1, followed by the for-loop increment, setting it to point back to 3 again. Conditionally erasing the last element is likewise easy to follow.
What Else I Tried:
This observation made me hypothesize that decrementing prior to vector::begin() was idempotent, so I tried adding an additional decrement, like so:
#include <vector>
#include <string>
#include <iostream>
int main(int argc, char* argv[], char* envz[])
{
std::vector<std::string> myVec;
myVec.push_back("1");
myVec.push_back("2");
myVec.push_back("3");
for (std::vector<std::string>::iterator i = myVec.begin();
i != myVec.end();
++i)
{
if ("1" == *i)
{
std::cout << "Erasing " << *i << std::endl;
i = myVec.erase(i);
--i;
--i; /*** I thought this would be idempotent ***/
continue;
}
std::cout << *i << std::endl;
}
return 0;
}
But this resulted in a segfault:
Erasing 1
Segmentation fault (core dumped)
Can someone explain why the first code block works, and specifically why the single decrement after erasing the first element is valid?
No, your code has undefined behaviour: if i == myVec.begin(), then i = myVec.erase(i); results in i again being (the new value of) myVec.begin(), and --i has undefined behaviour since it goes outside the valid range for the iterator.
If you don't want to use the erase-remove idiom (i.e. myVec.erase(std::remove(myVec.begin(), myVec.end(), "1"), myVec.end())), then the manual loop-while-mutating looks like this:
for (auto it = myVec.begin(); it != myVec.end(); /* no increment! */) {
if (*it == "1") {
it = myVec.erase(it);
} else {
++it;
}
}
Regardless, the crucial point both here and in your original code is that erase invalidates iterators, and thus the iterator must be re-assigned with a valid value after the erasing. We achieve this thanks to the return value of erase, which is precisely that new, valid iterator that we need.
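To make the two approaches above concrete, here is a minimal sketch that wraps each of them in a function. The helper names erase_loop and erase_remove are made up for illustration and are not part of the original code; both should leave the vector in the same state.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: manual loop that re-assigns the iterator from erase().
inline void erase_loop(std::vector<std::string>& v, const std::string& value)
{
    for (auto it = v.begin(); it != v.end(); /* no increment */)
    {
        if (*it == value)
            it = v.erase(it);   // valid iterator to the element after the erased one
        else
            ++it;
    }
}

// Hypothetical helper: the erase-remove idiom, one pass plus a single erase.
inline void erase_remove(std::vector<std::string>& v, const std::string& value)
{
    v.erase(std::remove(v.begin(), v.end(), value), v.end());
}
```

Neither version ever decrements the iterator, so erasing the first element needs no special treatment.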
This might work with some compilers but fail with others (e.g. an implementation might check at run time that you are not decrementing below begin() and throw an exception in that case; I believe at least one compiler does this, but I don't remember which one).
In this case the general pattern is to not increment in the for but inside the loop:
for (std::vector<std::string>::iterator i = myVec.begin();
i != myVec.end();
/* no increment here */)
{
if ("1" == *i)
{
std::cout << "Erasing " << *i << std::endl;
i = myVec.erase(i);
continue;
}
std::cout << *i << std::endl;
++i;
}
With vector the wrong iteration might actually work in more cases, but you'd have a very bad time if you tried that with e.g. std::map or std::set.
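The reason it appears to work is the continue right after the decrement.
Because of it, the loop's ++i runs before i is dereferenced again, so i is back at begin() by the time it is used. That is only an accident of the implementation, though: decrementing an iterator before begin() is still undefined behaviour.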
Related
How do I remove from a map while iterating it? like:
std::map<K, V> map;
for(auto i : map)
if(needs_removing(i))
// remove it from the map
If I use map.erase it will invalidate the iterators
The standard associative-container erase idiom:
for (auto it = m.cbegin(); it != m.cend() /* not hoisted */; /* no increment */)
{
if (must_delete)
{
m.erase(it++); // or "it = m.erase(it)" since C++11
}
else
{
++it;
}
}
Note that we really want an ordinary for loop here, since we are modifying the container itself. The range-based loop should be strictly reserved for situations where we only care about the elements. The syntax for the RBFL makes this clear by not even exposing the container inside the loop body.
Edit. Pre-C++11, you could not erase const-iterators. There you would have to say:
for (std::map<K,V>::iterator it = m.begin(); it != m.end(); ) { /* ... */ }
Erasing an element from a container is not at odds with constness of the element. By analogy, it has always been perfectly legitimate to delete p where p is a pointer-to-constant. Constness does not constrain lifetime; const values in C++ can still stop existing.
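As a concrete instance of the idiom above, here is a small sketch that removes every entry with an even key from a std::map<int, int>. The function name and the "even key" condition are assumptions chosen purely for illustration; it relies on the C++11 erase that returns an iterator.

```cpp
#include <map>

// Hypothetical example: erase all entries whose key is even.
// Requires C++11 for the iterator-returning erase.
inline void erase_even_keys(std::map<int, int>& m)
{
    for (auto it = m.begin(); it != m.end(); /* no increment */)
    {
        if (it->first % 2 == 0)
            it = m.erase(it);   // iterator following the removed element
        else
            ++it;
    }
}
```

Only the erased element's iterator is invalidated, which is why re-evaluating m.end() each time and taking the return value of erase is enough.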
I personally prefer this pattern which is slightly clearer and simpler, at the expense of an extra variable:
for (auto it = m.cbegin(), next_it = it; it != m.cend(); it = next_it)
{
++next_it;
if (must_delete)
{
m.erase(it);
}
}
Advantages of this approach:
the for loop incrementor makes sense as an incrementor;
the erase operation is a simple erase, rather than being mixed in with increment logic;
after the first line of the loop body, the meaning of it and next_it remains fixed throughout the iteration, allowing you to easily add additional statements referring to them without headscratching over whether they will work as intended (except of course that you cannot use it after erasing it).
Assuming C++11, here is a one-liner loop body, if this is consistent with your programming style:
using Map = std::map<K,V>;
Map map;
// Erase members that satisfy needs_removing(itr)
for (Map::const_iterator itr = map.cbegin() ; itr != map.cend() ; )
itr = needs_removing(itr) ? map.erase(itr) : std::next(itr);
A couple of other minor style changes:
Show declared type (Map::const_iterator) when possible/convenient, over using auto.
Use using for template types, to make ancillary types (Map::const_iterator) easier to read/maintain.
C++20 provides the convenience function std::erase_if.
So you can use that function to do it as a one-liner.
std::map<K, V> map_obj;
// calls needs_removing for each element and erases it if true was returned
std::erase_if(map_obj, needs_removing);
// if you need to pass only part of the key/value pair
std::erase_if(map_obj, [](const auto& kv){ return needs_removing(kv.first); });
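If C++20 is not available, a helper along these lines should behave similarly for std::map from C++11 onwards. The name erase_if_compat is hypothetical and not a standard function; like the C++20 function, this sketch returns the number of erased elements.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-in for std::erase_if on std::map, usable from C++11.
// Returns the number of erased elements.
template <class K, class V, class Pred>
std::size_t erase_if_compat(std::map<K, V>& m, Pred pred)
{
    const std::size_t old_size = m.size();
    for (auto it = m.begin(); it != m.end(); /* no increment */)
    {
        if (pred(*it))
            it = m.erase(it);   // C++11: erase returns the next iterator
        else
            ++it;
    }
    return old_size - m.size();
}
```

The predicate receives the whole key/value pair, so it can inspect the key, the value, or both.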
In short "How do I remove from a map while iterating it?"
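With the old (pre-C++11) map impl: not through erase's return value, because erase(iterator) returns void; you have to use m.erase(it++) or collect the keys first and erase them afterwards.
With the new map impl: almost as @KerrekSB suggested. But there are some syntax issues in what he posted.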
From GCC map impl (note GXX_EXPERIMENTAL_CXX0X):
#ifdef __GXX_EXPERIMENTAL_CXX0X__
// _GLIBCXX_RESOLVE_LIB_DEFECTS
// DR 130. Associative erase should return an iterator.
/**
* @brief Erases an element from a %map.
* @param position An iterator pointing to the element to be erased.
* @return An iterator pointing to the element immediately following
* @a position prior to the element being erased. If no such
* element exists, end() is returned.
*
* This function erases an element, pointed to by the given
* iterator, from a %map. Note that this function only erases
* the element, and that if the element is itself a pointer,
* the pointed-to memory is not touched in any way. Managing
* the pointer is the user's responsibility.
*/
iterator
erase(iterator __position)
{ return _M_t.erase(__position); }
#else
/**
* @brief Erases an element from a %map.
* @param position An iterator pointing to the element to be erased.
*
* This function erases an element, pointed to by the given
* iterator, from a %map. Note that this function only erases
* the element, and that if the element is itself a pointer,
* the pointed-to memory is not touched in any way. Managing
* the pointer is the user's responsibility.
*/
void
erase(iterator __position)
{ _M_t.erase(__position); }
#endif
Example with old and new style:
#include <iostream>
#include <map>
#include <vector>
#include <algorithm>
using namespace std;
typedef map<int, int> t_myMap;
typedef vector<t_myMap::key_type> t_myVec;
int main() {
cout << "main() ENTRY" << endl;
t_myMap mi;
mi.insert(t_myMap::value_type(1,1));
mi.insert(t_myMap::value_type(2,1));
mi.insert(t_myMap::value_type(3,1));
mi.insert(t_myMap::value_type(4,1));
mi.insert(t_myMap::value_type(5,1));
mi.insert(t_myMap::value_type(6,1));
cout << "Init" << endl;
for(t_myMap::const_iterator i = mi.begin(); i != mi.end(); i++)
cout << '\t' << i->first << '-' << i->second << endl;
t_myVec markedForDeath;
for (t_myMap::const_iterator it = mi.begin(); it != mi.end() ; it++)
if (it->first > 2 && it->first < 5)
markedForDeath.push_back(it->first);
for(size_t i = 0; i < markedForDeath.size(); i++)
// old style: erase by key (no iterator is returned)
mi.erase(markedForDeath[i]);
cout << "after old style erase of 3 & 4.." << endl;
for(t_myMap::const_iterator i = mi.begin(); i != mi.end(); i++)
cout << '\t' << i->first << '-' << i->second << endl;
for (auto it = mi.begin(); it != mi.end(); ) {
if (it->first == 5)
// new erase() that returns iter..
it = mi.erase(it);
else
++it;
}
cout << "after new style erase of 5" << endl;
// new cend/cbegin and lambda..
for_each(mi.cbegin(), mi.cend(), [](t_myMap::const_reference it){cout << '\t' << it.first << '-' << it.second << endl;});
return 0;
}
prints:
main() ENTRY
Init
1-1
2-1
3-1
4-1
5-1
6-1
after old style erase of 3 & 4..
1-1
2-1
5-1
6-1
after new style erase of 5
1-1
2-1
6-1
Pretty sad, eh? The way I usually do it is build up a container of iterators instead of deleting during traversal. Then loop through the container and use map.erase()
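std::map<K,V> map;
std::list< std::map<K,V>::iterator > iteratorList;
// a range-based for yields elements, not iterators, so use an explicit iterator loop
for (auto it = map.begin(); it != map.end(); ++it) {
    if (needs_removing(*it)) {
        iteratorList.push_back(it);
    }
}
// erasing one element does not invalidate the stored iterators to the others
for (auto it : iteratorList) {
    map.erase(it);
}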
I'm trying to detect whether every single element of the vector fulfills a given condition; let's say it must be an even number.
#include <iostream>
#include <vector>
#include <algorithm>
bool isOdd(int i)
{
return i%2==0;
}
int main()
{
int arr[5]={1,2,3,4,5};
std::vector<int> myVec(arr, arr + sizeof(arr)/sizeof(arr[0]));
std::vector<int>::iterator it = std::find_if(myVec.begin(), myVec.end(),
isOdd());
// This piece of code is probably causing some problems;
while(myVec.empty()!=false) // while my vector IS NOT EMPTY
{
std::cout << *it << " "; // print out the value of elements that
// fullfiled the condition given in isOdd
}
return 0;
}
What is wrong with my way of thinking? Is the condition in the while loop wrong, or have I completely missed the logic?
Can you please provide me with a thorough explanation of what is wrong with this piece of code?
Thank you in advance.
P.S. I know that there is a possibility to use lambda function instead, but I don't want to get too confused :)
The problem with your approach is that you are finding the odd number only once, and then for some reason you expect the vector to change, without making any modifications.
You should make a loop that calls find_if repeatedly, like this:
bool isOdd(int i) {
return i%2!=0;
}
...
std::vector<int>::iterator it = myVec.begin();
for (;;) {
it = std::find_if(it, myVec.end(), isOdd);
if (it == myVec.end()) {
break;
}
std::cout << *it << " ";
++it;
}
Note: I changed your isOdd function to return true for odd numbers. The original version was returning true for even numbers.
find_if returns the iterator pointing to the first value which meets the given condition. It stops there. You can put this in a loop to find all such elements, until it returns the end iterator.
The following line does the exact opposite of what you meant:
while(myVec.empty()!=false) // while my vector IS NOT EMPTY
Either write
while(myVec.empty()==false)
or
while(myVec.empty()!=true)
or simpler
while(!myVec.empty())
You could write it as a for-loop:
for (auto it = std::find_if(std::begin(myVec), std::end(myVec), isOdd);
     it != std::end(myVec);
     it = std::find_if(std::next(it), std::end(myVec), isOdd)) // resume after the match, or the loop never advances
{
    // do something with "it"
}
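If the goal is only to detect whether every element satisfies the condition, as the question states, std::all_of (C++11) may be a more direct fit than repeated find_if calls. The following is a minimal sketch; the names isEven and allEven are made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Predicate named for what it tests: true for even numbers.
inline bool isEven(int i) { return i % 2 == 0; }

// True when every element of v is even (vacuously true for an empty vector).
inline bool allEven(const std::vector<int>& v)
{
    return std::all_of(v.begin(), v.end(), isEven);
}
```

std::all_of stops at the first element that fails the predicate, so it does no more work than a hand-written loop would.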
I ran into the following problem using std::multimap::equal_range() and insert().
According to both cplusplus.com and cppreference.com, std::multimap::insert does not invalidate any iterators, and yet the following code causes an infinite loop:
#include <iostream>
#include <map>
#include <string>
int main(int argc, char* argv[])
{
std::multimap<std::string,int> testMap;
testMap.insert(std::pair<std::string,int>("a", 1));
testMap.insert(std::pair<std::string,int>("a", 2));
testMap.insert(std::pair<std::string,int>("a", 3));
auto range = testMap.equal_range(std::string("a"));
for (auto it = range.first; it != range.second; ++it)
{
testMap.insert(std::pair<std::string,int>("b", it->second));
// this loop becomes infinite
}
// never gets here
for (auto it = testMap.begin(); it != testMap.end(); ++it)
{
std::cout << it->first << " - " << it->second << std::endl;
}
return 0;
}
The intent is to take all existing items in the multimap with a particular key ("a" in this case) and duplicate them under a second key ("b"). In practice, what happens is that the first loop never exits, because it never ends up matching range.second. After the third element in the map is processed, ++it leaves the iterator pointing at the first of the newly inserted items.
I've tried this with VS2012, Clang, and GCC and the same thing seems to happen in all compilers, so I assume it's "correct". Am I reading too much into the statement "No iterators or references are invalidated."? Does end() not count as an iterator in this case?
multimap::equal_range returns a pair whose second element in this case is an iterator to the past-the-end element ("which is the past-the-end value for the container" [container.requirements.general]/6).
I'll rewrite the code a bit to point something out:
auto iBeg = testMap.begin();
auto iEnd = testMap.end();
for(auto i = iBeg; i != iEnd; ++i)
{
testMap.insert( std::make_pair("b", i->second) );
}
Here, iEnd contains a past-the-end iterator. The call to multimap::insert doesn't invalidate this iterator; it stays a valid past-the-end iterator. Therefore the loop is equivalent to:
for(auto i = iBeg; i != testMap.end(); ++i)
Which is of course an infinite loop if you keep adding elements.
The end-iterator range.second is not invalidated.
The reason that the loop is infinite, is that each repetition of the loop body:
inserts a new element at the end of the map, thus increasing the distance between it and the end by one (so, after this insert, range no longer represents the equal_range for the key "a" because you have inserted a new key within the range it does represent, from the first "a" to the end of the container).
increments it, reducing the distance between it and the end by one.
Hence, it never reaches the end.
Here's how I might write the loop you want:
for (auto it = testMap.lower_bound("a"); it != testMap.end() && it->first == "a"; ++it)
{
testMap.insert(std::pair<std::string,int>("b", it->second));
}
A solution to make it work as expected (feel free to improve, it's a community wiki)
auto range = testMap.equal_range(std::string("a"));
if(range.first != range.second)
{
--range.second;
for (auto it = range.first; it != std::next(range.second); ++it)
{
testMap.insert(std::pair<std::string,int>("b", it->second));
}
}
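Another way to sidestep the moving end of the range is to snapshot the values first and insert afterwards, so the inserting loop never walks the container it is modifying. This is only a sketch of that idea; the helper name duplicate_key is an assumption and not part of the original code.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: copy every value stored under `from` to the key `to`.
// The values are collected first, so the second loop iterates over a plain
// vector rather than over the multimap being modified.
inline void duplicate_key(std::multimap<std::string, int>& m,
                          const std::string& from, const std::string& to)
{
    std::vector<int> values;
    auto range = m.equal_range(from);
    for (auto it = range.first; it != range.second; ++it)
        values.push_back(it->second);

    for (int v : values)
        m.insert(std::make_pair(to, v));
}
```

The cost is a temporary copy of the values, which seems a reasonable trade for a loop whose termination does not depend on where new elements land.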
I have this sample code to insert entries to a multimap. I am trying to delete particular entries of a specified key. But this code goes into infinite loop. Can someone help me with this code?
#include <iostream>
#include <map>
#include <string>
using namespace std;
int main()
{
multimap<string, string> names;
string n;
names.insert(pair<string, string>("Z", "F"));
names.insert(pair<string, string>("Z", "A"));
names.insert(pair<string, string>("S", "T"));
names.insert(pair<string, string>("S", "A"));
names.insert(pair<string, string>("S", "J"));
names.insert(pair<string, string>("D", "H"));
names.insert(pair<string, string>("D", "W"));
names.insert(pair<string, string>("D", "R"));
multimap<string, string>::iterator p;
p = names.find("Z");
if(p != names.end()) { // found a name
do {
cout << n << ", " << p->second;
cout << endl;
if (p->second.compare("A") == 0) {
names.erase(p);
p++;
} else {
p++;
}
} while (p != names.upper_bound("Z"));
}
else{
cout << "Name not found.\n";
}
p = names.find("Z");
if(p != names.end()) { // found a name
do {
cout << n << ", " << p->second;
cout << endl;
} while (p != names.upper_bound("Z"));
}
else{
cout << "Name not found.\n";
}
return 0;
}
In the above I am looking up using Key value "Z" and want to delete "A".
multimap::erase invalidates any iterators to the erased elements, so the lines
names.erase(p);
p++;
erase the element p points to, thus invalidating p, and then attempt to increment an invalid iterator. You can fix this by copying p to a temporary, incrementing p, and then erasing the temporary iterator.
multimap<string, string>::iterator temp = p;
++p;
names.erase(temp);
Alternatively if you're using C++11 then multimap::erase returns the next iterator in the container
p = names.erase(p);
Edit: the above isn't actually the source of your infinite loop. In the second loop you don't increment p, so it goes forever. However it is still something you should fix as it can cause unpredictable and difficult to track down bugs.
As said by others, advancing an iterator that points to an element that was just erased is not guaranteed to work. What you can do instead is to use the postfix ++ operator to retrieve an iterator to the element that followed the erased one before it was erased:
names.erase(p++);
In C++11, you can alternatively retrieve the return value of erase, which points to the following element (or is end() if there is no more element):
p = names.erase(p);
It has also been said already that your second loop is an infinite loop by definition because it never increments the counter.
However, there is one more thing that should be said: Your method of checking if the last element in a range of elements has been reached is not very efficient: You call upper_bound in every iteration of the loop, which will cause a new O(log(n)) tree search each time, although the iterator returned will always be the same.
You can obviously improve this by running upper_bound before you enter the loop and storing the result. But even better, I'd suggest you run the equal_range function once, and then simply iterate through the range it returned:
typedef multimap<string,string>::iterator mapit; // before C++11, erase() does not accept a const_iterator
std::pair<mapit,mapit> range = names.equal_range("Z");
mapit it = range.first;
while (it != range.second)
if (it->second == "A")
names.erase(it++);
else
++it;
In C++11, the use of auto will make this look even better.
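For reference, a C++11 version of the same loop might look like the sketch below. The function name erase_value_for_key is hypothetical; it combines the single equal_range lookup with the iterator returned by erase.

```cpp
#include <map>
#include <string>

// Hypothetical helper: erase every entry under `key` whose value equals `value`.
inline void erase_value_for_key(std::multimap<std::string, std::string>& names,
                                const std::string& key, const std::string& value)
{
    auto range = names.equal_range(key);      // one O(log n) lookup
    for (auto it = range.first; it != range.second; /* no increment */)
    {
        if (it->second == value)
            it = names.erase(it);             // C++11: returns the next iterator
        else
            ++it;
    }
}
```

range.second stays valid throughout, because erase only invalidates iterators to the erased elements and the end of the range is never erased.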
#include "stdafx.h"
int _tmain(int argc, _TCHAR* argv[])
{
string s = "Haven't got an idea why.";
auto beg = s.begin();
auto end = s.end();
while (beg < end)
{
cout << *beg << '\n';
if (*beg == 'a')
{//whithout if construct it works perfectly
beg = s.erase(beg);
}
++beg;
}
return 0;
}
Why if I erase one or more chars from this string this code breaks? I suppose it has something to do with returned iterator after erase operation being created at higher address than end iterator but I'm not sure and it surely isn't right behaviour. Or is it?
There are several problems with this code.
Don't cache the value of s.end(); it changes as you delete elements.
Don't use beg < end. The idiomatic approach is to write beg != end. If you try to iterate past end, the result is undefined, and a debug version of the string library may deliberately crash your process, so it is meaningless to use <.
The iterator returned from s.erase(beg) might be s.end(), in which case ++beg takes you past the end.
Here's a (I think) correct version:
int _tmain(int argc, _TCHAR* argv[])
{
string s = "Haven't got an idea why.";
for (auto beg = s.begin(); beg != s.end();)
{
cout << *beg << '\n';
if (*beg == 'a')
{ // erase returns an iterator to the character after the erased one
beg = s.erase(beg);
}
else
{
++beg;
}
}
}
EDIT: I suggest accepting FredOverflow's answer. It is simpler and faster than the above.
Erasing elements one by one from vectors or strings has quadratic complexity. There are better solutions with linear complexity:
#include <string>
#include <algorithm>
#include <iostream>
int main()
{
std::string s = "Haven't got an idea why.";
s.erase(std::remove(s.begin(), s.end(), 'a'), s.end());
std::cout << s << std::endl;
}
The previous s.end() value stored in end is invalid after s.erase(). Hence, do not use it.
Note the semantics of a basic_string and its iterators.
From www.ski.com/tech/stl
Note also that, according to the C++ standard, basic_string has very unusual iterator invalidation semantics. Iterators may be invalidated by swap, reserve, insert, and erase (and by functions that are equivalent to insert and/or erase, such as clear, resize, append, and replace). Additionally, however, the first call to any non-const member function, including the non-const version of begin() or operator[], may invalidate iterators. (The intent of these iterator invalidation rules is to give implementors greater freedom in implementation techniques.)
Also, what happens if
beg = s.erase(beg);
returns an iterator equivalent to end()? The following ++beg then steps past the end.
Calling erase invalidates the stored end iterator, so call s.end() in the while-loop condition instead of caching it.
You have to iterate from .end()-1 to .begin(). At the same time, it is not safe to use comparison operators other than == and !=.
Here's my code:
vector<long long> myVector (my, my+myCount);
//sort and iterate through top correlation data counts
sort (myVector.begin(), myVector.end());
cout << endl;
int TopCorrelationDataCount = 0;
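// assumes myVector is not empty; end()-1 is undefined for an empty vector
vector<long long>::iterator myVectorIterator=myVector.end()-1;
// with a single element, the first item processed is also the last one
bool myVectorIterator_lastItem = (myVectorIterator==myVector.begin());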
while (true) {
long long storedData = *myVectorIterator;
cout << TopCorrelationDataCount << " " << storedData << endl;
//prepare for next item
TopCorrelationDataCount++;
//if (TopCorrelationDataCount >= this->TopCorrelationDataSize) break;
if (myVectorIterator_lastItem) break;
myVectorIterator--;
if (myVectorIterator==myVector.begin())
{
myVectorIterator_lastItem = true;
}
}
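Note: With a forward iterator walked backwards like this, an ordinary for loop doesn't fit well, because you have to detect ==.begin() and treat that as your last iteration. You can't test against .begin()-1, because forming an iterator before begin() is undefined behaviour and may fail at run time.
If you only want to use the top X items in the vector, use TopCorrelationDataCount (see the commented-out check in the loop).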
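An alternative that avoids the begin() bookkeeping altogether is to walk the vector with reverse iterators. The following is a sketch under that assumption; the helper name top_items and its limit parameter are made up for illustration and stand in for the TopCorrelationDataCount check.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: return up to `limit` elements of `v`, last element first.
// rbegin()/rend() walk the vector backwards, so begin() needs no special
// handling, and an empty vector simply produces an empty result.
inline std::vector<long long> top_items(const std::vector<long long>& v, std::size_t limit)
{
    std::vector<long long> result;
    for (auto it = v.rbegin(); it != v.rend() && result.size() < limit; ++it)
        result.push_back(*it);
    return result;
}
```

Called on a vector sorted in ascending order, this yields the largest values first.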