In my application I need the ability to traverse a doubly linked list starting from any arbitrary member of the list, continuing past end(), wrapping around to begin(), and carrying on until the traversal reaches the element where it started.
I decided to use std::list for the underlying data structure and wrote a circulate routine to achieve this. However, it shows unexpected behavior when it wraps around from end() to begin(). Here's my implementation:
template <class Container, class BiDirIterator>
void circulate(Container container, BiDirIterator cursor,
std::function<void(BiDirIterator current)> processor)
{
BiDirIterator start = cursor;
do {
processor(cursor);
cursor++;
if (cursor == container.end()) {
cursor = container.begin(); // [A]
}
} while (cursor != start);
}
// ...
typedef int T;
typedef std::list<T> TList;
typedef TList::iterator TIter;
int count = 0;
TList l;
l.push_back(42);
circulate<TList, TIter>(
l, l.begin(),
[&](TIter cur) {
std::cout << *cur << std::endl;
count++;
}
);
The output is:
42
-842150451
When I step through the code I see that the line marked [A] is never reached. The cursor is never equal to container.end(). Surprisingly, invoking ++ on that cursor, takes it to container.begin() in next pass automatically. (I suppose that's specific to this STL implementation).
How can I fix this behavior?
The issue here is that you are taking Container by value. This makes a copy, so the iterators returned by container.end() and container.begin() belong to a different list than the iterator passed to the function. If you pass Container by reference instead, the code works correctly.
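A minimal sketch of the by-reference version, assuming the cursor passed in is dereferenceable (not end()). The helper visit_from_second is hypothetical and exists only to show the wrap-around; it assumes the list has at least two elements.

```cpp
#include <functional>
#include <iterator>
#include <list>
#include <vector>

// Takes the container by reference, so begin() and end() belong to the
// same list as `cursor`.
template <class Container, class BiDirIterator>
void circulate(Container& container, BiDirIterator cursor,
               std::function<void(BiDirIterator)> processor)
{
    if (container.empty()) return;
    BiDirIterator start = cursor;
    do {
        processor(cursor);
        ++cursor;
        if (cursor == container.end()) {
            cursor = container.begin();  // wrap around
        }
    } while (cursor != start);
}

// Hypothetical helper: records the visiting order when starting at the
// second element of the list.
std::vector<int> visit_from_second(std::list<int>& l)
{
    std::vector<int> seen;
    circulate<std::list<int>, std::list<int>::iterator>(
        l, std::next(l.begin()),
        [&](std::list<int>::iterator cur) { seen.push_back(*cur); });
    return seen;
}
```

With this change the comparison against container.end() involves iterators of one and the same list, so the line marked [A] in the question is reached as intended.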
I have a bunch of elements stored in some container. Their order does not matter for me.
I iterate over my container and check some predicate - P for each of the elements. If P is true - remove the element from the container. If P is false - just go to the next one.
If at least one element was deleted during the iteration I repeat this process. There is a chance that on a new iteration the P will be true for the elements for which it was false during previous iterations.
I've written a code for this
std::unordered_map<T, T> container;
auto it = container.begin();
while (it != container.end()) {
if (predicate(*it)) {
it = container.erase(it);
} else {
it++;
}
}
I have a question:
Is there a better way to do this (both in terms of clean code and its time efficiency), considering I have about 500 elements in my container?
Use std::erase_if() in a loop:
while (std::erase_if(your_set, your_predicate))
/**/;
If you don't have C++20, don't despair. Cppreference.com gives an example implementation too.
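For compilers without C++20, a rough stand-in might look like the sketch below. The names erase_if_compat and prune_dangling are hypothetical, and the example predicate (drop entries whose value refers to a key that is no longer present) is only an assumption chosen to show why repeated passes can be needed.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>

// Hypothetical pre-C++20 replacement for std::erase_if on associative
// containers: erases matching elements and returns how many were removed.
template <class Container, class Pred>
std::size_t erase_if_compat(Container& c, Pred pred)
{
    const std::size_t old_size = c.size();
    for (auto it = c.begin(); it != c.end();) {
        if (pred(*it))
            it = c.erase(it);  // erase returns the next valid iterator
        else
            ++it;
    }
    return old_size - c.size();
}

// Repeats the pass until nothing is removed. An entry is "dangling" when
// its value names a key that is not in the map.
std::size_t prune_dangling(std::unordered_map<int, int>& m)
{
    std::size_t total = 0;
    while (std::size_t n = erase_if_compat(
               m, [&m](const std::pair<const int, int>& kv) {
                   return m.find(kv.second) == m.end();
               }))
        total += n;
    return total;
}
```

Removing one entry can make another one dangling, which is exactly the situation where a single pass is not enough.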
If it proves to be a bottleneck, hand-rolling your own all_erase_if() with a specialization for node-based containers might be useful:
template <class T>
constexpr bool has_node_type = requires { typename T::node_type; };
template <class T>
constexpr bool is_node_based = has_node_type<T>;
template <class C, class F>
auto all_erase_if(C& c, F f) requires is_node_based<C> {
const auto old_size = std::size(c);
if (!old_size)
return old_size;
auto it = std::begin(c), stop = std::begin(c);
do {
while (f(*it)) {
it = stop = c.erase(it);
if (it != std::end(c))
/**/;
else if (std::empty(c))
return old_size;
else
it = stop = std::begin(c);
}
if (++it == std::end(c))
it = std::begin(c);
} while (it != stop);
return old_size - std::size(c);
}
template <class C, class F>
auto all_erase_if(C& c, F f) requires (!is_node_based<C>) {
const auto old_size = std::size(c);
while (std::erase_if(c, std::ref(f)))
/**/;
return old_size - std::size(c);
}
You want to circle-iterate over the container until you have done a complete pass where you don't remove anything.
template<class C, class F>
void multi_pass_erase( C& c, F&& f )
{
auto stop_at = c.end();
auto it = c.begin();
while (true)
{
if (c.empty())
return;
if (f(*it))
{
it = c.erase(it);
if (it == c.begin())
stop_at = c.end();
else
stop_at = it;
}
else
{
++it;
if (it == stop_at)
return;
}
if (it == c.end())
it = c.begin();
}
}
At the start of the loop, it refers to the next element to test, and only refers to end if the container is empty.
So if the container is empty, return.
stop_at tracks the element that, if we reach it, we have gone over the entire container and not found something to filter.
If we remove something, we note that the proper place to stop is after the element we erase.
If we don't remove something, we advance the iterator, and check if we should stop.
Then, if we have reached the end of the container, we go back to the start.
There is a bit of careful work where we test for "stop" before we move it from end back to begin, so we should never store stop_at as referring to begin().
Now let's compare it to
while (true) {
if (!std::erase_if( set, test ))
break;
}
Imagine if each cycle removes one element. This could require O(n^2) time.
multi_pass_erase doesn't do amazingly better in this case. If each element causes the previous one to be erased, multi_pass_erase doesn't shave off any visits whatsoever; in both cases, you have to visit every non-removed node before you find the next node to remove.
Basically, all of the fanciness of multi_pass_erase shaves off an average of half a pass over the set whenever there is at least one erase: if we assume the last erase is randomly positioned, we skip an average of half of the container on the final pass.
The added complexity is probably not worth it.
But can we write something more complex and more efficient?
Usually when you delete something that could cause other things to need deletion, you can get information about those other things.
Consider tracking that information, and only looking at those elements, instead of the entire list again.
template<class C, class Test, class Dependencies>
void dependent_erase( C& c, Test&& t, Dependencies&& d ) {
auto it = c.begin();
using key_type = typename C::key_type;
std::vector<key_type> todo_list;
while (it != c.end())
{
if (t(*it)) {
d( *it, &todo_list );
it = c.erase(it);
} else {
++it;
}
}
std::vector<key_type> next_todo_list;
while (!todo_list.empty()) {
// remove duplicates: better to shrink the list and call t(x) less often
std::sort(todo_list.begin(), todo_list.end());
todo_list.erase( std::unique(todo_list.begin(), todo_list.end()), todo_list.end() );
for (auto&& todo : todo_list) {
auto it = c.find( todo );
// the key may already have been erased, or may never have been in c
if (it != c.end() && t(*it))
{
d( *it, &next_todo_list );
c.erase(it);
}
}
todo_list = std::move( next_todo_list );
next_todo_list.clear();
}
}
Here we have our Test t (do we want to delete this item?). If we do, we call d( item, vector* ) and store there any direct dependencies we'd want to retest.
We then go over the container, removing things as needed. Then we go over the dependencies and remove anything mentioned that should go away, repeatedly until we no longer find new items to delete.
If we assume your code is a bunch of nodes with references to other nodes, and you are doing garbage collection, this should be far better in many cases.
...
Code not tested or even compiled. But I've done this before, so it might work. It should at least work as pseudo-code.
If you expect a LOT of dependencies per deleted node, and lots of overlap, a set-based todo list might be better than a vector one. I.e., if you have N elements removed, each with M dependencies, that vector grows to N*M entries. But if M tends to be small and does not overlap much with other elements, then the vector will be much faster than a node-based set.
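To make the dependency-tracking idea concrete, here is a small self-contained sketch for the garbage-collection case. It is a different, simplified formulation rather than the dependent_erase template above: the Node type, its refs counter, and the collect function are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical node type: `refs` counts the live nodes pointing at this
// one, `children` lists the keys this node points at.
struct Node {
    int refs;
    std::vector<int> children;
};

// Erases every node with a zero reference count. After each erase only the
// children of the erased node are re-tested, not the whole container.
std::size_t collect(std::unordered_map<int, Node>& nodes)
{
    std::vector<int> todo;
    for (const auto& kv : nodes)
        if (kv.second.refs == 0)
            todo.push_back(kv.first);

    std::size_t erased = 0;
    while (!todo.empty()) {
        const int key = todo.back();
        todo.pop_back();
        auto it = nodes.find(key);
        if (it == nodes.end() || it->second.refs != 0)
            continue;  // already gone, or still referenced
        for (int child : it->second.children) {
            auto c = nodes.find(child);
            if (c != nodes.end() && --c->second.refs == 0)
                todo.push_back(child);  // newly unreferenced: test it later
        }
        nodes.erase(it);
        ++erased;
    }
    return erased;
}
```

Each node is visited a number of times proportional to its incoming references, instead of once per full pass over the container.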
This throws when trying to remove element from deque via iterator. The error is "can not seek value-initialized iterator" using VS2017. I wonder why this is happening, isn't std::deque a doubly linked list that does not invalidate iterators on push_front() / push_back()?
class deque2 {
public:
bool enqueue(int val) {
if (mp.find(val) != mp.end()) {
return false;
}
dq.push_front(val);
mp[val] = dq.begin();
return true;
}
int dequeue() {
if (dq.size() == 0) {
return -1;
}
int res = dq.back();
mp.erase(res);
dq.pop_back();
return res;
}
void erase(int val) {
auto it = mp.find(val);
if (it != mp.end()) {
dq.erase(it->second); // exception
mp.erase(val);
}
}
private:
deque<int> dq;
unordered_map<int, deque<int>::iterator> mp;
};
isn't std::deque a doubly linked list
No, it is not. As stated in the documentation:
std::deque (double-ended queue) is an indexed sequence container that allows fast insertion and deletion at both its beginning and its end. In addition, insertion and deletion at either end of a deque never invalidates pointers or references to the rest of the elements.
Emphasis is mine. Note that it says pointers and references are not invalidated; it does not say that about iterators. The documentation for std::deque::push_front() states this clearly:
All iterators, including the past-the-end iterator, are invalidated. No references are invalidated.
As for the logic you are trying to implement, I would recommend using boost::multi_index, as it provides a single container with several access criteria, so you do not have to keep two containers in sync. See the Boost.MultiIndex documentation for details.
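If pulling in Boost is not an option, one alternative (not the approach recommended above) is to swap std::deque for std::list, whose iterators stay valid across insertions and across erasure of other elements. The class name ListQueue below is hypothetical; the interface mirrors the deque2 class from the question.

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>

// Sketch of the question's class on top of std::list. Storing list
// iterators in the map is safe because push_front/pop_back never
// invalidate iterators to the remaining elements.
class ListQueue {
public:
    bool enqueue(int val) {
        if (index_.count(val)) return false;
        items_.push_front(val);
        index_[val] = items_.begin();
        return true;
    }
    int dequeue() {
        if (items_.empty()) return -1;
        int res = items_.back();
        index_.erase(res);
        items_.pop_back();
        return res;
    }
    void erase(int val) {
        auto it = index_.find(val);
        if (it != index_.end()) {
            items_.erase(it->second);  // stored iterator is still valid
            index_.erase(it);
        }
    }
    std::size_t size() const { return items_.size(); }
private:
    std::list<int> items_;
    std::unordered_map<int, std::list<int>::iterator> index_;
};
```

The trade-off is one allocation per element and no random access, which the original class did not use anyway.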
In my application I have an (unbalanced) tree data structure. This tree is simply made of "std::list of std::lists": each node holds an arbitrary "list" of sub-nodes. Using this instead of a single list made the rest of the application a lot easier. (The program is about moving nodes from one tree to another tree / another part of the tree / to its own tree.)
Now an obvious task is to find a subtree inside a "tree". For non-recursive searches it is simple enough:
subtree_iterator find_subtree(const N& n) {
auto iter(subtrees.begin());
auto e(subtrees.end());
while (iter != e) {
if ((*iter)->name == n) {
return iter;
}
++iter;
}
return e;
}
This returns an iterator to the subtree position. The problem, however, starts when I try to implement a multi-level search. I.e., I wish to search for hello.world.test, where the dots mark a new level.
Searching worked alright
subtree_iterator find_subtree(const pTree_type& otree, std::string identify) const {
pTree_type tree(otree);
boost::char_separator<char> sep(".");
boost::tokenizer<boost::char_separator<char> > tokens(identify, sep);
auto token_iter(tokens.begin());
auto token_end(tokens.end());
subtree_iterator subtree_iter;
for (auto token_iter(tokens.begin()); token_iter != token_end; ++token_iter) {
std::string subtree_string(*token_iter);
subtree_iter = tree->find_subtree_if(subtree_string);
if (subtree_iter == tree->subtree_end()) {
return otree->subtree_end();
} else {
tree = *subtree_iter;
}
}
return subtree_iter;
}
At first glance it seemed to work "correctly"; however, when I try to use it, it fails. Using it would look like
auto tIn(find_subtree(ProjectTree, "hello.world.test"));
if (tIn != ProjectTree->subtree_end()) {
//rest
}
However, that gives a debug assertion error, "list iterators not compatible". This isn't too surprising: I'm comparing iterators from different lists with each other. But how could I implement such a thing? My "backup" option would be to return a std::pair<bool,iterator> where the boolean part indicates whether the subtree actually exists. Is there another method, short of making the whole tree a single list?
You should not work on iterators internally. Use nodes instead.
template <typename T>
struct Node {
T item;
Node<T>* next;
};
Then encapsulate your Node in an iterator facade like this :
template<typename T>
class iterator {
private:
Node<T>* node;
public:
...
};
Then use a generic invalid node (node is nullptr) for the iterator that is returned whenever end() is reached or requested.
Note that what I suggest is a singly linked list (not a doubly linked list like the standard one). This is because you can't step back from a generic invalid end() iterator that points to a null node.
If you don't use iterator operator--() in your algorithms, this should be fine.
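In the same spirit of using a null node as the shared "not found" value, a multi-level lookup could return a node pointer rather than an iterator. This is only a sketch: the Tree struct and find_path function are hypothetical simplifications of the tree in the question, and plain std::string searching stands in for the boost tokenizer.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <string>

// Hypothetical simplified tree node.
struct Tree {
    std::string name;
    std::list<std::shared_ptr<Tree>> subtrees;
};

// Follows a dot-separated path such as "hello.world.test" and returns the
// node it ends at, or nullptr if any level is missing. A null pointer is
// comparable no matter which sub-list the search stopped in.
std::shared_ptr<Tree> find_path(std::shared_ptr<Tree> tree,
                                const std::string& path)
{
    std::size_t pos = 0;
    while (tree && pos <= path.size()) {
        const std::size_t dot = path.find('.', pos);
        const std::string token = path.substr(
            pos, dot == std::string::npos ? std::string::npos : dot - pos);
        std::shared_ptr<Tree> next;  // stays null if no child matches
        for (const auto& sub : tree->subtrees) {
            if (sub->name == token) {
                next = sub;
                break;
            }
        }
        tree = next;
        if (dot == std::string::npos) break;
        pos = dot + 1;
    }
    return tree;
}
```

The caller then tests the result against nullptr instead of against some particular list's end().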
std::vector<list_iterator> stack to traverse? Where the .back() of the stack is the only one allowed to be equal to end() of the previous one, and .front() is an iterator to the root list?
I couldn't find an instance of how to do this, so I was hoping someone could help me out. I have a map defined in a class as follows:
std::map<std::string, TranslationFinished> translationEvents;
TranslationFinished is a boost::function. I have a method as part of my class that iterates through this map, calling each of the functions like so:
void BaseSprite::DispatchTranslationEvents()
{
for(auto it = translationEvents.begin(); it != translationEvents.end(); ++it)
{
it->second(this);
}
}
However it's possible for a function called by it->second(this); to remove an element from the translationEvents map (usually itself) using the following function:
bool BaseSprite::RemoveTranslationEvent(const std::string &index)
{
bool removed = false;
auto it = translationEvents.find(index);
if (it != translationEvents.end())
{
translationEvents.erase(it);
removed = true;
}
return removed;
}
Doing this causes a debug assertion failure when DispatchTranslationEvents() tries to increment the iterator. Is there a way to iterate through a map safely when a function called during the iteration may remove an element from the map?
Thanks in advance
EDIT: Accidentally copy-pasted the wrong Remove Event code. Fixed now.
map::erase invalidates the iterator being deleted (obviously), but not the rest of the map.
This means that:
if you delete any element other than the current one, you're safe, and
if you delete the current element, you must first get the next iterator, so you can continue iterating from that. (That's why the erase function of most containers returns the next iterator. std::map's did not before C++11, and here the erase happens inside the callback anyway, so you have to do this manually.)
Assuming you only ever delete the current element, then you could simply rewrite the loop like this:
for(auto it = translationEvents.begin(); it != translationEvents.end();)
{
auto next = it;
++next; // get the next element
it->second(this); // process (and maybe delete) the current element
it = next; // skip to the next element
}
Otherwise (if the function may delete any element) it may get a bit more complicated.
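A self-contained sketch of the "grab the next iterator first" loop follows. The Dispatcher struct is a hypothetical stand-in for BaseSprite and std::function replaces boost::function. One extra precaution, which is an addition and not part of the loop above: the callback is copied before it is invoked, so it is not destroyed while it is still running when it erases its own map entry.

```cpp
#include <functional>
#include <iterator>
#include <map>
#include <string>

// Hypothetical stand-in for BaseSprite.
struct Dispatcher {
    std::map<std::string, std::function<void(Dispatcher*)>> events;
    int calls = 0;

    // Safe as long as a callback erases at most its own entry.
    void dispatch() {
        for (auto it = events.begin(); it != events.end();) {
            auto next = std::next(it);   // remember the successor first
            auto callback = it->second;  // copy: outlives a self-erase
            callback(this);              // may erase *it, but not *next
            it = next;
        }
    }
};
```

If a callback could erase the entry that next refers to, this loop would still break, which is the more complicated case mentioned above.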
Generally speaking it is frowned upon to modify the collection during iteration. Many collections invalidate the iterator when the collection is modified, including many of the containers in C# (I know you're in C++). You can create a vector of events you want removed during the iteration and then remove them afterwards.
After reading all other answers, I am at an advantage here... But here it goes.
However it's possible for a function called by it->second(this); to remove an element from the translationEvents map (usually itself)
If this is true, that is, a callback can remove any element from the container, you cannot possibly resolve this issue from the loop itself.
Deleting the current callback
In the simpler case where the callback can only remove itself, you can use different approaches:
// [1] Let the callback actually remove itself
for ( iterator it = m.begin(), next = it; it != m.end(); it = next ) {
++next;
it->second(this);
}
// [2] Have the callback tell us whether we should remove it
for ( iterator it = m.begin(); it != m.end(); ) {
if ( !it->second(this) ) { // false means "remove me"
m.erase( it++ );
} else {
++it;
}
}
Among these two options, I would clearly prefer [2], as you are decoupling the callback from the implementation of the handler. That is, the callback in [2] knows nothing at all about the container in which it is held. [1] has a higher coupling (the callback knows about the container) and is harder to reason about as the container is changed from multiple places in code. Some time later you might even look back at the code, think that it is a weird loop (not remembering that the callback removes itself) and refactor it into something more sensible as for ( auto it = m.begin(), end = m.end(); it != end; ++it ) it->second(this);
Deleting other callbacks
For the more complex problem, where a callback can remove any other callback, it all depends on the compromises that you can make. In the simple case, where the removal only needs to take effect after the complete iteration, you can provide a separate member function that records the elements to remove, and then remove them all at once after the loop completes:
void removeElement( std::string const & name ) {
to_remove.push_back(name);
}
...
for ( iterator it = m.begin(); it != m.end(); ++it ) {
it->second( this ); // callback will possibly add the element to remove
}
// actually remove
for ( auto it = to_remove.begin(); it != to_remove.end(); ++it ) {
m.erase( *it );
}
If removal of the elements need to be immediate (i.e. they should not be called even in this iteration if they have not yet been called), then you can modify that approach by checking whether it was marked for deletion before executing the call. The mark can be done in two ways, the generic of which would be changing the value type in the container to be a pair<bool,T>, where the bool indicates whether it is alive or not. If, as in this case, the contained object can be changed you could just do that:
void removeElement( std::string const & name ) {
auto it = m.find( name ); // add error checking...
it->second = TranslationFinished(); // empty functor
}
...
for ( auto it = m.begin(); it != m.end(); ++it ) {
if ( !it->second.empty() )
it->second(this);
}
for ( auto it = m.begin(); it != m.end(); ) { // [3]
if ( it->second.empty() )
m.erase( it++ );
else
++it;
}
Note that since a callback can remove any element in the container, you cannot erase as you go, as the current callback could remove an already visited iterator. Then again, you might not care about leaving the empty functors for a while, so it might be ok just to ignore it and perform the erase as you go. Elements already visited that are marked for removal will be cleared in the next pass.
My solution is to first create a temporary container and swap it with the original one. Then you can iterate through the temporary container and insert the elements you want to keep back into the original container.
void BaseSprite::DispatchTranslationEvents()
{
typedef std::map<std::string, TranslationFinished> container_t;
container_t tempEvents;
tempEvents.swap(translationEvents);
for(auto it = tempEvents.begin(); it != tempEvents.end(); ++it)
{
if (true == it->second(this))
translationEvents.insert(*it);
}
}
The TranslationFinished functions should return true if they want to be kept and false if they want to be removed.
bool BaseSprite::RemoveTranslationEvent(const std::string &index)
{
bool keep = false;
return keep;
}
There should be a way to erase an element during your iteration, though it is a little tricky.
for(auto it = translationEvents.begin(); it != translationEvents.end();)
{
//remove the "erase" logic from second call
it->second(this);
//do erase and increase the iterator here, NOTE: ++ action is very important
translationEvents.erase(it++);
}
The iterator becomes invalid once its element is removed, so you cannot increment it after the erase. However, removing an element from a map does not affect iterators to the other elements, IIRC. The postfix ++ copies the iterator, advances the original, and returns the copy, so the iterator has already moved on before erase runs; this should be safe for your requirement.
You could defer the removal until the dispatch loop:
typedef boost::function< some stuff > TranslationFunc;
bool BaseSprite::RemoveTranslationEvent(const std::string &index)
{
bool removed = false;
auto it = translationEvents.find(index);
if (it != translationEvents.end())
{
it->second = TranslationFunc(); // a null function indicates invalid event for later
removed = true;
}
return removed;
}
Then protect against invoking an invalid event in the loop itself, and clean up any "removed" events:
void BaseSprite::DispatchTranslationEvents()
{
for(auto it = translationEvents.begin(); it != translationEvents.end();)
{
// here we invoke the event if it exists
if(!it->second.empty())
{
it->second(this);
}
// if the event reset itself in the map, then we can cleanup
if(it->second.empty())
{
translationEvents.erase(it++); // post increment saves hassles
}
else
{
++it;
}
}
}
One obvious caveat: if an event has already been iterated over and is then removed by a later callback, the current dispatch loop will not visit it again to erase it.
This means the actual deletion of that event is deferred until the next time the dispatch loop runs.
The problem is ++it follows the possible erasure. Would this work for you?
void BaseSprite::DispatchTranslationEvents()
{
for(auto it = translationEvents.begin(), next = it;
it != translationEvents.end(); it = next)
{
next=it;
++next;
it->second(this);
}
}
I recently finished fixing a bug in the following function, and the answer surprised me. I have the following function (written as it was before I found the bug):
void Level::getItemsAt(vector<item::Item>& vect, const Point& pt)
{
vector<itemPtr>::iterator it; // itemPtr is a typedef for a std::tr1::shared_ptr<item::Item>
for(it=items.begin(); it!=items.end(); ++it)
{
if((*it)->getPosition() == pt)
{
item::Item item(**it);
items.erase(it);
vect.push_back(item);
}
}
}
This function finds all Item objects in the 'items' vector that has a certain position, removes them from 'items', and puts them in 'vect'. Later, a function named putItemsAt does the opposite, and adds items to 'items'. The first time through, getItemsAt works fine. After putItemsAt is called, though, the for loop in getItemsAt will run off the end of 'items'. 'it' will point at an invalid Item pointer, and getPosition() segfaults. On a hunch, I changed it!=items.end() to it<items.end(), and it worked. Can anyone tell me why? Looking around SO suggests it might involve erase invalidating the iterator, but it still doesn't make sense why it would work the first time through.
I'm also curious because I plan to change 'items' from a vector to a list, since list's erase is more efficient. I know I'd have to use != for a list, as it doesn't have a < operator. Would I run into the same problem using a list?
When you call erase(), that iterator becomes invalidated. Since that is your loop iterator, calling the '++' operator on it after invalidating it is undefined behavior. erase() returns a new valid iterator that points to the next item in the vector. You need to use that new iterator from that point onwards in your loop, i.e.:
void Level::getItemsAt(vector<item::Item>& vect, const Point& pt)
{
vector<itemPtr>::iterator it = items.begin();
while( it != items.end() )
{
if( (*it)->getPosition() == pt )
{
item::Item item(**it);
it = items.erase(it);
vect.push_back(item);
}
else
++it;
}
}
You're invoking undefined behavior. Every iterator at or after the erased position is invalidated when you call erase on that vector, and that includes your loop iterator. Once that happens, an implementation is free to do whatever it wants.
When you call items.erase(it);, it is now invalid. To conform to the standard, you must now assume that it is dead.
The following call to vect.push_back(item) is fine by itself, because item is a copy that was made before the erase.
You invoke undefined behavior again by using it as the tracking variable of your for loop.
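You can make your code valid by copying the matching items out first and then removing them with std::remove_if (the erase-remove idiom). std::remove_copy_if does not fit here, because it copies the elements that do not match and leaves the source untouched.
class ItemIsAtPoint
{
    Point pt;
public:
    explicit ItemIsAtPoint(const Point& inPt) : pt(inPt) {}
    // 'items' holds itemPtr (shared_ptr) elements, so that is the argument type
    bool operator()(const itemPtr& input) const
    {
        return input->getPosition() == pt;
    }
};
void Level::getItemsAt(vector<item::Item>& vect, const Point& pt)
{
    ItemIsAtPoint atPoint(pt);
    // Copy the matching items out first; nothing is erased yet, so 'it' stays valid.
    for (vector<itemPtr>::const_iterator it = items.begin(); it != items.end(); ++it)
    {
        if (atPoint(*it))
            vect.push_back(**it);
    }
    // Then drop them from 'items' in one linear pass.
    items.erase(std::remove_if(items.begin(), items.end(), atPoint), items.end());
}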
You can make this a lot prettier if you are using boost::bind, but this works.
I'll go with Remy Lebeau's explanation about iterator invalidation, and just add that you can make your code valid and asymptotically faster (linear time, instead of quadratic time) by using a std::list instead of a std::vector. (std::list deletions only invalidate the iterator that was deleted, and insertions don't invalidate any iterators.)
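As a rough illustration of the list-based variant, with a hypothetical simplified Item type and free function in place of the Level member:

```cpp
#include <list>
#include <memory>
#include <vector>

// Hypothetical simplified item; the real item::Item has a position object.
struct Item {
    int x;
    int y;
};

// Moves copies of all items at (x, y) into the result and removes them
// from the list. Each erase is O(1) and only invalidates the erased
// iterator, which is replaced by erase's return value.
std::vector<Item> take_items_at(std::list<std::shared_ptr<Item>>& items,
                                int x, int y)
{
    std::vector<Item> taken;
    for (auto it = items.begin(); it != items.end();) {
        if ((*it)->x == x && (*it)->y == y) {
            taken.push_back(**it);
            it = items.erase(it);
        } else {
            ++it;
        }
    }
    return taken;
}
```

The loop has the same shape as the corrected vector version; only the cost of each erase changes.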
You can also predictably identify iterator invalidation while debugging by activating your STL implementation's debug mode. On GCC, you do this with the compiler flag -D_GLIBCXX_DEBUG (see some caveats there).