Iterating through STL containers and removing/adding multiple items

Iterating through STL containers and removing/adding multiple items - c++

One of the most frequent errors that occur in my code is that STL containers are modified during a loop.
Elements are removed or added during a loop execution so I usually run into out of bounds exceptions.
My for loops usually looks like this:
for (auto& Item : Items) { // Will not work when Items container is modified
//... loop logic
}
When multiple items can be removed, I use this monstrosity:
for (int Index=Items.size()-1;Index<=0;Index--) {
if (Index<Items.size()) { //Because multiple items can be removed in a single loop
//... loop logic
}
}
This looks bad and it makes me feel bad using that second option. The reason multiple items can be removed is due to events, where a single event can remove any number of elements.
Here is some pseudo code to illustrate when this occurs:
// for each button in vector<button> {
// process button events
// event adds more buttons to vector<button>
// *ERROR* vector<button> is modified during loop.
// }
In another example, imagine a vector with following items:
// 0 1 2 3 4 5 6 7 8 9
We start our loop at 0 and go element by element. At 4, I want to remove elements 1,4 and 9 so we can't use a normal loop here.

Use std::remove_if with a predicate that decide if a button needs to be removed:
bool needsRemoved(const Button& button);
vec.erase(std::remove_if(vec.begin(), vec.end(), &needsRemoved), vec.end());
EDIT: For your last example, the quadratic (i.e. bad for performance) algorithm is:
std::vector<int> vec = {0,1,2,3,4,5,6,7,8,9};
auto end = vec.end();
for (auto it = vec.begin(); it < end; ++it)
{
std::set<int> bad = {1, 4, 9};
end = std::remove_if
(vec.begin(), end,
[bad](int x) { return (bad.find(x) != bad.end()); });
}
vec.erase(end, vec.end());
You will probably be better off using a container with fast lookup though (like a set, or a map).

There are pretty much two ways to do this reliably:
Iterate over a copy of the original container and manipulate the original. This may not be feasible unless your container stores pointers, not the actual elements directly.
Don't allow direct manipulation of the container, but instead mark the to-be-deleted elements somehow and sweep them after iterating. You can also support adding new elements by inserting them into a separate temporary container and appending to the original after the loop is done - you can also do this with the removed elements, obviating the need to store a "removed" flag in the elements themselves. This can of course be abstracted out with suitable add and remove functions.
Edit: The removal part of solution #2 can be nicely done with the erase-remove idiom shown by rectummelancolique.

Since there are buttons (and I hope there are not too many) you might want to add a flag to each button, which tells, if it has been processed completely or something like that. Then you look for the first item in the array which has not been processed and process it. You repeat this until all items have been processed.
for (;;) // breaks, when all items have been processed.
{
auto it = std::find( std::begin(Items), std::end(Items),
[](const Item & item){ return item.hasBeenProcessed(); }
if ( it == std::end(Items) )
break;
process( *it );
}
This should be safe. Note that this can have quadratic time complexity with respect to the number of items. As I said, there will hopefully not be too many items. If this is an issue, you might want to optimize this loop a little, for example starting the search where you left the last time. But do this only when it becomes an issue.

Since you're talking about buttons and button events: the
simplest solution is simply to reset the loop to the start when
you process an event:
for ( auto current = items.begin(); current != items.end(); ++ current ) {
if ( current->hasEventWhichNeedsProcessing() ) {
current->processEvent(); // possibly invalidates current
current = items.begin(); // revalidates current
}
}
If we're talking about button events (which occur due to
a human user action), this should be safe, since you should
normally be able to process all of the events before a new event
occurs. (For very rapidly occuring events, you may never reach
the final entry.)
I'm still not sure that it's the best solution, however.
Regardless of how you iterator, it means that you may treat
events in a different order than they arrive. A better solution
would be to push the events themselves onto a list, and then
process this list in order (as a queue).

Related

Constraining remove_if on only part of a C++ list

I have a C++11 list of complex elements that are defined by a structure node_info. A node_info element, in particular, contains a field time and is inserted into the list in an ordered fashion according to its time field value. That is, the list contains various node_info elements that are time ordered. I want to remove from this list all the nodes that verify some specific condition specified by coincidence_detect, which I am currently implementing as a predicate for a remove_if operation.
Since my list can be very large (order of 100k -- 10M elements), and for the way I am building my list this coincidence_detect condition is only verified by few (thousands) elements closer to the "lower" end of the list -- that is the one that contains elements whose time value is less than some t_xv, I thought that to improve speed of my code I don't need to run remove_if through the whole list, but just restrict it to all those elements in the list whose time < t_xv.
remove_if() though does not seem however to allow the user to control up to which point I can iterate through the list.
My current code.
The list elements:
struct node_info {
char *type = "x";
int ID = -1;
double time = 0.0;
bool spk = true;
};
The predicate/condition for remove_if:
// Remove all events occurring at t_event
class coincident_events {
double t_event; // Event time
bool spk; // Spike condition
public:
coincident_events(double time,bool spk_) : t_event(time), spk(spk_){}
bool operator()(node_info node_event){
return ((node_event.time==t_event)&&(node_event.spk==spk)&&(strcmp(node_event.type,"x")!=0));
}
};
The actual removing from the list:
void remove_from_list(double t_event, bool spk_){
// Remove all events occurring at t_event
coincident_events coincidence(t_event,spk_);
event_heap.remove_if(coincidence);
}
Pseudo main:
int main(){
// My list
std::list<node_info> event_heap;
...
// Populate list with elements with random time values, yet ordered in ascending order
...
remove_from_list(0.5, true);
return 1;
}
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for cycle as suggested for example in this post?

It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for loop?
Yes and yes. Don't fight to use code that is preventing you from reaching your goals. Keep it simple. Loops are nothing to be ashamed of in C++.

First thing, comparing double exactly is not a good idea as you are subject to floating point errors.
You could always search the point up to where you want to do a search using lower_bound (I assume you list is properly sorted).
The you could use free function algorithm std::remove_if followed by std::erase to remove items between the iterator returned by remove_if and the one returned by lower_bound.
However, doing that you would do multiple passes in the data and you would move nodes so it would affect performance.
See also: https://en.cppreference.com/w/cpp/algorithm/remove
So in the end, it is probably preferable to do you own loop on the whole container and for each each check if it need to be removed. If not, then check if you should break out of the loop.
for (auto it = event_heap.begin(); it != event_heap.end(); )
{
if (coincidence(*it))
{
auto itErase = it;
++it;
event_heap.erase(itErase)
}
else if (it->time < t_xv)
{
++it;
}
else
{
break;
}
}
As you can see, code can easily become quite long for something that should be simple. Thus, if you need to do that kind of algorithm often, consider writing you own generic algorithm.
Also, in practice you might not need to do a complete search for the end using the first solution if you process you data in increasing time order.
Finally, you might consider using an std::set instead. It could lead to simpler and more optimized code.

Thanks. I used your comments and came up with this solution, which seemingly increases speed by a factor of 5-to-10.
void remove_from_list(double t_event,bool spk_){
coincident_events coincidence(t_event,spk_);
for(auto it=event_heap.begin();it!=event_heap.end();){
if(t_event>=it->time){
if(coincidence(*it)) {
it = event_heap.erase(it);
}
else
++it;
}
else
break;
}
}
The idea to make erase return it (as already ++it) was suggested by this other post. Note that in this implementation I am actually erasing all list elements up to t_event value (meaning, I pass whatever I want for t_xv).

How to avoid out of range exception when erasing vector in a loop?

My apologies for the lengthy explanation.
I am working on a C++ application that loads two files into two 2D string vectors, rearranges those vectors, builds another 2D string vector, and outputs it all in a report. The first element of the two vectors is a code that identifies the owner of the item and the item in the vector. I pass the owner's identification to the program on start and loop through the two vectors in a nested while loop to find those that have matching first elements. When I do, I build a third vector with components of the first two, and I then need to capture any that don't match.
I was using the syntax "vector.erase(vector.begin() + i)" to remove elements from the two original arrays when they matched. When the loop completed, I had my new third vector, and I was left with two vectors that only had elements, which didn't match and that is what I needed. This was working fine as I tried the various owners in the files (the program accepts one owner at a time). Then I tried one that generated an out of range error.
I could not figure out how to do the erase inside of the loop without throwing the error (it didn't seem that swap and pop or erase-remove were feasible solutions). I solved my problem for the program with two extra nested while loops after building my third vector in this one.
I'd like to know how to make the erase method work here (as it seems a simpler solution) or at least how to check for my out of range error (and avoid it). There were a lot of "rows" for this particular owner; so debugging was tedious. Before giving up and going on to the nested while solution, I determined that the second erase was throwing the error. How can I make this work, or are my nested whiles after the fact, the best I can do? Here is the code:
i = 0;
while (i < AIvector.size())
{
CHECK:
j = 0;
while (j < TRvector.size())
{
if (AIvector[i][0] == TRvector[j][0])
{
linevector.clear();
// Add the necessary data from both vectors to Combo_outputvector
for (x = 0; x < AIvector[i].size(); x++)
{
linevector.push_back(AIvector[i][x]); // add AI info
}
for (x = 3; x < TRvector[j].size(); x++) // Don't need the the first three elements; so start with x=3.
{
linevector.push_back(TRvector[j][x]); // add TR info
}
Combo_outputvector.push_back(linevector); // build the combo vector
// then erase these two current rows/elements from their respective vectors, this revises the AI and TR vectors
AIvector.erase(AIvector.begin() + i);
TRvector.erase(TRvector.begin() + j);
goto CHECK; // jump from here because the erase will have changed the two increments
}
j++;
}
i++;
}

As already discussed, your goto jumps to the wrong position. Simply moving it out of the first while loop should solve your problems. But can we do better?
Erasing from a vector can be done cleanly with std::remove and std::erase for cheap-to-move objects, which vector and string both are. After some thought, however, I believe this isn't the best solution for you because you need a function that does more than just check if a certain row exists in both containers and that is not easily expressed with the erase-remove idiom.
Retaining the current structure, then, we can use iterators for the loop condition. We have a lot to gain from this, because std::vector::erase returns an iterator to the next valid element after the erased one. Not to mention that it takes an iterator anyway. Conditionally erasing elements in a vector becomes as simple as
auto it = vec.begin()
while (it != vec.end()) {
if (...)
it = vec.erase(it);
else
++it;
}
Because we assign erase's return value to it we don't have to worry about iterator invalidation. If we erase the last element, it returns vec.end() so that doesn't need special handling.
Your second loop can be removed altogether. The C++ standard defines functions for searching inside STL containers. std::find_if searches for a value in a container that satisfies a condition and returns an iterator to it, or end() if it doesn't exist. You haven't declared your types anywhere so I'm just going to assume the rows are std::vector<std::string>>.
using row_t = std::vector<std::string>;
auto AI_it = AIVector.begin();
while (AI_it != AIVector.end()) {
// Find a row in TRVector with the same first element as *AI_it
auto TR_it = std::find_if (TRVector.begin(), TRVector.end(), [&AI_it](const row_t& row) {
return row[0] == (*AI_it)[0];
});
// If a matching row was found
if (TR_it != TRVector.end()) {
// Copy the line from AIVector
auto linevector = *AI_it;
// Do NOT do this if you don't guarantee size > 3
assert(TR_it->size() >= 3);
std::copy(TR_it->begin() + 3, TR_it->end(),
std::back_inserter(linevector));
Combo_outputvector.emplace_back(std::move(linevector));
AI_it = AIVector.erase(AI_it);
TRVector.erase(TR_it);
}
else
++AI_it;
}
As you can see, switching to iterators completely sidesteps your initial problem of figuring out how not to access invalid indices. If you don't understand the syntax of the arguments for find_if search for the term lambda. It is beyond the scope if this answer to explain what they are.
A few notable changes:
linevector is now encapsulated properly. There is no reason for it to be declared outside this scope and reused.
linevector simply copies the desired row from AIVector rather than push_back every element in it, as long as Combo_outputvector (and therefore linevector) contains the same type than AIVector and TRVector.
std::copy is used instead of a for loop. Apart from being slightly shorter, it is also more generic, meaning you could change your container type to anything that supports random access iterators and inserting at the back, and the copy would still work.
linevector is moved into Combo_outputvector. This can be a huge performance optimization if your vectors are large!
It is possible that you used an non-encapsulated linevector because you wanted to keep a copy of the last inserted row outside of the loop. That would prohibit moving it, however. For this reason it is faster and more descriptive to do it as I showed above and then simply do the following after the loop.
auto linevector = Combo_outputvector.back();

Expanding vector while iterating through it

I have a function that iterates through a vector and calls another function to execute its contents in some manner. As a result of that execution new elements could be added to the vector. Function code is as follows:
void foo() {
for (std::vector<Item*>::iterator it = item_list.begin(); it != item_list.end(); ++it ) {
if (/*some condition*/) {
bar(it);
}
}
}
While I was googling this problem I saw that iterator might get invalidated if resize happens, but the writer was not specific on why nor when or what is the proper way of handling this problem.

As a vector is random access, you can store the distance temporarily and re-create the iterator afterwards:
void foo() {
for (std::vector<Item*>::iterator it = item_list.begin(); it != item_list.end(); ++it ) {
if (/*some condition*/) {
const auto d = std::distance( item_list.begin(), it );
bar(it);
it = item_list.begin();
std::advance( it, d );
}
}
}
The answer assumes that new elements are added after the current position, e.g., at the end. It also assumes that it is desirable that the new elements are also part of the iteration, i.e., they will also be checked against some condition and bar will be called if they do match.

Instead of inserting more items on that vector within a interaction, simply create another vector to store all the items and do the resizing while you iterate in this one, then you discard the current one and swap for that updated vector.
That's the standard way to do that.

According to the documentation, if inserting an element changes the capacity of the vector, all iterators are invalidated.
To work around this, add new items to a temporary vector and merge the two after you're done.

c++ vector object .erase

I have been struggling to put a vector object into a project im doing
I have read what little i could find about doing this and decided to give it a go.
std::vector<BrickFalling> fell;
BrickFalling *f1;
I created the vector. This next piece works fine until i get to the erase
section.
if(brickFall == true){
f1 = new BrickFalling;
f1->getBrickXY(brickfallx,brickfally);
fell.push_back(*f1);
brickFall = false;
}
// Now setup an iterator loop through the vector
vector<BrickFalling>::iterator it;
for( it = fell.begin(); it != fell.end(); ++it ) {
// For each BrickFalling, print out their info
it->printBrickFallingInfo(brick,window,deadBrick);
//This is the part im doing wrong /////
if(deadBrick == true)// if dead brick erase
{
BrickFalling[it].erase;//not sure what im supposed to be doing here
deadBrick = false;
}
}

You can totally avoid the issue by using std::remove_if along with vector::erase.
auto it =
std::remove_if(fell.begin(), fell.end(), [&](BrickFalling& b)
{ bool deadBrick = false;
b.printBrickFallingInfo(brick,window,deadBrick);
return deadBrick; });
fell.erase(it, fell.end());
This avoids the hand-writing of the loop.
In general, you should strive to write erasure loops for sequence containers in this fashion. The reason is that it is very easy to get into the "invalid iterator" scenario when writing the loop yourself, i.e. not remembering to reseat your looping iterator each time an erase is done.
The only issue with your code which I do not know about is the printBrickFallingInfo function. If it throws an exception, you may introduce a bug during the erasure process. In that case, you may want to protect the call with a try/catch block to ensure you don't leave the function block too early.
Edit:
As the comment stated, your print... function could be doing too much work just to determine if a brick is falling. If you really are attempting to print stuff and do even more things that may cause some sort of side-effect, another approach similar in nature would be to use std::stable_partition.
With std::stable_partition you can "put on hold" the erasure and just move the elements to be erased at one position in the container (either at the beginning or at the end) all without invalidating those items. That's the main difference -- with std::stable_partition, all you would be doing is move the items to be processed, but the items after movement are still valid. Not so with std::remove and std::remove_if -- moved items are just invalid and any attempt to use those items as if they are still valid is undefined behavior.
auto it =
std::stable_partition(fell.begin(), fell.end(), [&](BrickFalling& b)
{ bool deadBrick = false;
b.printBrickFallingInfo(brick,window,deadBrick);
return deadBrick; });
// if you need to do something with the moved items besides
// erasing them, you can do so. The moved items start from
// fell.begin() up to the iterator it.
//...
//...
// Now we erase the items since we're done with them
fell.erase(fell.begin(), it);
The difference here is that the items we will eventually erase will lie to the left of the partitioning iterator it, so our erase() call will remove the items starting from the beginning. In addition to that, the items are still perfectly valid entries, so you can work with them in any way you wish before you finally erase them.

The other answer detailing the use of remove_if should be used whenever possible. If, however, your situations does not allow you to write your code using remove_if, which can happen in more complicated situations, you can use the following:
You can use vector::erase with an iterator to remove the element at that spot. The iterator used is then invalidated. erase returns a new iterator that points to the next element, so you can use that iterator to continue.
What you end up with is a loop like:
for( it = fell.begin(); it != fell.end(); /* iterator updated in loop */ )
{
if (shouldDelete)
it = fell.erase(it);
else
++it;
}

Is it at all possible to erase from a vector with C++11's for loops?

Alright. For the sake of other (more simple but not explanatory enough) questions that this might look like, I am not asking if this is possible or impossible (because I found that out already), I am asking if there is a lighter alternative to my question.
What I have is what would be considered a main class, and in that main class, there is a variable that references to a 'World Map' class. In essence, this 'WorldMap' class is a container of other class variables. The main class does all of the looping and updates all of the respective objects that are active. There are times in this loop that I need to delete an object of a vector that is deep inside a recursive set of containers (As shown in the code provided). It would be extremely tedious to repeatedly have to reference the necessary variable as a pointer to another pointer (and so on) to point to the specific object I need, and later erase it (this was the concept I used before switching to C++11) so instead I have a range for loop (also shown in the code). My example code shows the idea that I have in place, where I want to cut down on the tedium as well as make the code a lot more readable.
This is the example code:
struct item{
int stat;
};
struct character{
int otherStat;
std::vector<item> myItems;
};
struct charContainer{
std::map<int, character> myChars;
};
int main(){
//...
charContainer box;
//I want to do something closer to this
for(item targItem: box.myChars[iter].myItems){
//Then I only have to use targItem as the reference
if(targItem.isFinished)
box.myChars[iter].myItems.erase(targItem);
}
//Instead of doing this
for(int a=0;a<box.myChars[iter].myItems.size();a++){
//Then I have to repeatedly use box.myChars[iter].myItems[a]
if(box.myChars[iter].myItems[a].isFinished)
box.myChars[iter].myItems.erase(box.myChars[iter].myItems[a]);
}
}
TLDR: I want to remove the tedium of repeatedly calling the full reference by using the new range for loops shown in C++11.
EDIT: I am not trying to delete the elements all at once. I am asking how I would delete them in the matter of the first loop. I am deleting them when I am done with them externally (via an if statement). How would I delete specific elements, NOT all of them?

If you simply want to clear an std::vector, there is a very simple method you can use:
std::vector<item> v;
// Fill v with elements...
v.clear(); // Removes all elements from v.
In addition to this, I'd like to point out that [1] to erase an element in a vector requires the usage of iterators, and [2] even if your approach was allowed, erasing elements from a vector inside a for loop is a bad idea if you are not careful. Suppose your vector has 5 elements:
std::vector<int> v = { 1, 2, 3, 4, 5 };
Then your loop would have the following effect:
First iteration: a == 0, size() == 5. We remove the first element, then the vector will contain {2, 3, 4, 5}
Second iteration: a == 1, size() == 4. We then remove the second element, then the vector will contain {2,4,5}
Third iteration: a == 2, size() == 3. We remove the third element, and we are left with the final result {2,4}.
Since this does not actually empty the vector, I suppose it is not what you were looking for.
If instead you have some particular condition that you want to apply to remove the elements, it is very easily applied in C++11 in the following way:
std::vector<MyType> v = { /* initialize vector */ };
// The following is a lambda, which is a function you can store in a variable.
// Here we use it to represent the condition that should be used to remove
// elements from the vector v.
auto isToRemove = [](const MyType & value){
return /* true if to remove, false if not */
};
// A vector can remove multiple elements at the same time using its method erase().
// Erase will remove all elements within a specified range. We use this method
// together with another method provided by the standard library: remove_if.
// What it does is it deletes all elements for which a particular predicate
// returns true within a range, and leaves the empty spaces at the end.
v.erase( std::remove_if( std::begin(v), std::end(v), isToRemove ), std::end(v) );
// Done!

I am deleting them when I am done with them externally (via an if statement). How would I delete specific elements, NOT all of them?
In my opinion, you're looking at this the wrong way. Writing loops to delete items from a sequence container is always problematic and not recommended. Strive to stay away from removing items in this fashion.
When you work with containers, you should strategically set up your code so that you place the deleted or "about to be deleted" items in a part of the container that is easily accessed, away from the items in the container that you do not want to delete. At the time you actually do want to remove them, you know where they are and thus can call some function to expel them from the container.
One answer was already given, and that is to use the erase-remove(if) idiom. When you call remove or remove_if, the items that are "bad" are moved to the end of the container. The return value for remove(_if) is the iterator to the start of the items that will be removed. Then you feed this iterator to the vector::erase method to delete these items permanently from the container.
The other solution (but probably less used) is the std::partition algorithm. The std::partition also can move the "bad" items to the end of the container, but unlike remove(_if), the items are still valid (i.e. you can leave them at the end of the container and still use them safely). Then later on, you can remove them as you wish in a separate step since std::partition also returns an iterator.

Why not have a standard iterator iterating over a vector. That way you can delete the element by passing an iterator. Then .erase() will return the next available iterator. And if your next iterator is iterator::end() then your loop will exit.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Iterating through STL containers and removing/adding multiple items - c++

Related

Constraining remove_if on only part of a C++ list

How to avoid out of range exception when erasing vector in a loop?

Expanding vector while iterating through it

c++ vector object .erase

Is it at all possible to erase from a vector with C++11's for loops?

Categories

Resources