How can I point to a member of a std::set in such a way that I can tell if the element has been removed? - c++

An iterator into a std::set becomes invalidated if the item it's pointing to is erased. (It does not get invalidated if the set is modified in any other way, which is nice.) However, there is no way to detect whether an iterator has been invalidated or not.
I'm implementing an algorithm that requires me to be able to keep track of members of a std::set in such a way that I can erase them in constant time, but without risking undefined behaviour if I try to delete the same one twice. If I have two iterators pointing to the same member of a set, Bad Things will happen if I try to erase both of them.
My question is, how can I avoid this? Is there some way to implement something that behaves like an iterator into a set, but which knows when it has been invalidated?
Incidentally, I'm using std::set because this is a performance critical situation and I need the complexity guarantees that set provides. I'm happy to accept answers that suggest a different data structure, but only if it allows me to (a) access and remove the smallest element in constant time, (b) remove the pointed-to elements in constant time, and (c) insert elements in O(log(N)) time or better. C++11 is OK.

You could keep a set of shared pointers. And every time you store an iterator, pair it with a weak pointer to the element. When you want to erase the element, first check the weak pointer to see if the object still exists.
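A minimal sketch of that idea, assuming the elements are plain ints and that the set is the only owner of each shared_ptr (otherwise the weak pointer would not expire on erase). The names ByValue, Handle, insert_tracked and erase_once are made up for illustration:

```cpp
#include <cassert>
#include <memory>
#include <set>

// Orders the shared pointers by the value they point to, not by address.
struct ByValue {
    bool operator()(const std::shared_ptr<int>& a,
                    const std::shared_ptr<int>& b) const {
        return *a < *b;
    }
};

using Set = std::set<std::shared_ptr<int>, ByValue>;

// Hypothetical handle: an iterator paired with a weak_ptr that tells us
// whether the element the iterator referred to still exists.
struct Handle {
    Set::iterator it;
    std::weak_ptr<int> alive;
};

// Inserts a value and returns a handle to it (or to the equal element
// that was already present).
Handle insert_tracked(Set& s, int value) {
    Set::iterator it = s.insert(std::make_shared<int>(value)).first;
    return Handle{it, *it};
}

// Erases the element unless it is already gone.
// Returns true only if an erase actually happened.
bool erase_once(Set& s, const Handle& h) {
    if (h.alive.expired()) return false;  // already erased: never touch h.it
    assert(!s.empty());                   // a live handle implies a non-empty set
    s.erase(h.it);                        // erase by iterator: amortized constant time
    return true;
}
```

Copies of a Handle can be passed around freely: whichever copy reaches erase_once first removes the element, and every later call sees the expired weak pointer and does nothing. The smallest element is still available through s.begin().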

Related

std::map without parent pointers?

libstdc++, as an example, implements std::map using a red-black binary tree with parent pointers in the nodes. This means that iterators can just be pointers to a node.
Is it possible for a standard library to implement std::map without storing parent pointers in the nodes? I think this would mean that iterators would need to contain a stack of parent pointers, and as such would need to dynamically allocate a logarithmic amount of memory. Would this violate standard performance constraints on iterators? Would not having parent pointers violate any other performance constraints on the rest of the interface?
What about the new node stuff/interface in C++17?
No, they may not. std::map guarantees that erasing a key-value pair invalidates no iterators other than those to the erased pair.
If iterators stored a stack of parents and one of those parents were erased, those iterators would be invalidated as well, and the guarantee would no longer hold.
Is it possible? Possibly :-) Is it a good idea? Almost certainly not. Most things are possible, if you throw more storage or speed at them :-)
In terms of just getting rid of the parent pointers, you could, for example, maintain within the map a monotonic value that is incremented each time the map structure is changed. In essence, it's a version identifier of the map structure. So, adding or deleting elements in the map increments this value, while merely changing the data within the map does not.
The iterator would then contain:
a pointer to the map itself (to get the current version);
the stack of pointers; and
the version matching the last time the stack above was created.
The idea would basically be to, before doing anything with the iterator, detect when the map version is different to the iterator one and, if it is, rebuild the stack and update the iterator version before carrying on with whatever operation you're trying to perform.
Now, while that makes it possible to iterate without parent pointers, it unfortunately violates some other requirements of iterators, such as being able to action them in constant time. Anything that has to rebuild a data structure, based on the data within the map, will violate that restriction.
In any case, there's no way anyone in their right mind would implement such a horrid scheme when it's far simpler to have parent pointers, but the intent here is simply to show that it's possible.
Hence my advice would be to just stick with the parent pointers. The use of such parent pointers makes the process of finding the next/previous element a rather simple one, based only on the current item in the iterator.
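To illustrate why parent pointers keep iteration simple, here is a sketch of in-order successor lookup on a bare binary search tree node. The Node layout and the name successor are assumptions for illustration only; real library nodes also carry a colour and the stored pair, and use a sentinel rather than nullptr for end().

```cpp
#include <cassert>

// Hypothetical node layout, reduced to what iteration needs.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
};

// Returns the in-order successor of n, or nullptr if n is the last node.
// Only the current node is needed: no stack, no reference to the tree.
Node* successor(Node* n) {
    assert(n != nullptr);
    if (n->right) {
        // The successor is the leftmost node of the right subtree.
        n = n->right;
        while (n->left) n = n->left;
        return n;
    }
    // Otherwise climb until we leave a left subtree.
    while (n->parent && n == n->parent->right) n = n->parent;
    return n->parent;
}
```

Because the function reads nothing but the nodes reachable from n, erasing or inserting unrelated nodes cannot leave it with stale state, which is what the invalidation guarantee requires.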

insert in C++ vector

I want to insert a bit from a bitset in the beginning of a vector. I am having a hard time understanding how to do that. Here is how I think I can do it:
keyRej.insert(x, inpSeq[0]);
I don't know what to put in the place of x?
"I don't know what to put in the place of x?"
An iterator to the position you want to insert in:
keyRej.insert(keyRej.begin(), inpSeq[0]);
Semantically, the inserted element goes before the iterator passed as first argument. But this will result in all elements of the vector having to be moved across one position, and may also incur a re-allocation of the vector's internal data storage block. It also means that all iterators or references to the vector's elements are invalidated.
See this reference for std::vector::insert for more information.
Note that there are containers, such as std::deque, for which inserting elements at the front is cheap, and reference (but not iterator) validity is maintained.
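A small sketch of both options. The element type int, the bitset width of 8 and the function names are assumptions, since the question does not show the declarations of keyRej and inpSeq:

```cpp
#include <bitset>
#include <cassert>
#include <deque>
#include <vector>

// Inserts bit 0 of `bits` at the front of the vector.
// Linear time: every existing element shifts by one position.
void prepend_bit(std::vector<int>& v, const std::bitset<8>& bits) {
    v.insert(v.begin(), bits[0]);
}

// Pushes `value` at the front of a non-empty deque and reports whether the
// old front element still lives at the same address afterwards.
bool front_reference_survives(std::deque<int>& d, int value) {
    assert(!d.empty());
    const int* old_front = &d.front();
    d.push_front(value);        // invalidates iterators, but not references
    return old_front == &d[1];  // the old front is now the second element
}
```

If elements are prepended often, the deque version avoids the repeated shifting that the vector version pays for on every call.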
According to the documentation, x is an iterator; the new object is inserted just before it.
keyRej.insert(keyRej.begin(), inpSeq[0]);

Get pointer to node in std::list or std::forward_list

I am planning to use std::list in my code. I decided not to use std::forward_list because (I figured) deletions would require traversing the whole list, giving O(N) complexity for std::forward_list (being a singly linked list). However, when I looked into the documentation, I noticed that both STL containers have O(N) complexity for removing an item.
http://www.cplusplus.com/reference/forward_list/forward_list/remove/
http://www.cplusplus.com/reference/list/list/remove/
After some thinking I figured out why (I think). It's because in both cases, the whole list has to be scanned to find the node first, and then delete it. Is this right?
I then looked into the "erase" and "erase_after" methods, and their complexity is "Linear in the number of elements erased (destructions).". That's because I am passing an iterator to the node (which is kind of like a "pointer"). However, I cannot (or prefer not to) pass this iterator around in my code to access the data in the node. I am not sure if this iterator will be valid if the list is modified? Thoughts?
My question is, is there a way I can get a pointer to the node in the list? That way, I know it will be valid throughout the lifetime of my program, and I can pass it around and just look into it to get access to my data.
"However, I cannot (or prefer not to) pass this iterator around in my code to access the data in the node."
Why not? Iterators are easy to use and are quite lightweight. A pointer isn't better in any way.
"I am not sure if this iterator will be valid if the list is modified?"
For list, any iterator will remain valid even if the list is modified, except, of course, if you erase the particular element that the iterator points to. But that's kind of obvious: you can't expect to have an iterator (or pointer) to something that doesn't exist any more.
(vector is more dangerous. One small change to a vector can invalidate all its iterators.)
You can take a pointer to any individual element in the list.
list<int>::iterator it = find(l.begin(), l.end(), 7); // get an iterator
int * ptr = &*it; // get a pointer to the same element.
The pointer is similar to the iterator in many respects. But the iterator is a little more powerful. An iterator can be incremented or decremented, to access neighbouring elements in the list. And an iterator can be used to delete an element from the list. A pointer cannot do either of those things.
Both the iterator and pointer remain valid as long as that particular element isn't removed.
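As a rough demonstration of that last point, the sketch below takes an iterator and a pointer to one element, modifies the rest of the list, and checks that both still refer to the same element. The helper names find_ptr and survives_changes are invented for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Returns a pointer to the first element equal to `value`, or nullptr.
int* find_ptr(std::list<int>& l, int value) {
    std::list<int>::iterator it = std::find(l.begin(), l.end(), value);
    return it == l.end() ? nullptr : &*it;
}

// Modifies the list around an element and reports whether the iterator and
// the pointer taken beforehand still refer to that same element.
bool survives_changes(std::list<int>& l, int value) {
    std::list<int>::iterator it = std::find(l.begin(), l.end(), value);
    assert(it != l.end());       // the element must be present
    int* ptr = &*it;
    l.push_front(value - 1);     // insertions elsewhere
    l.push_back(value + 1);
    l.sort();                    // relinks nodes, does not move elements
    return *it == value && ptr == &*it;
}
```

None of these operations erase the element itself, so neither the iterator nor the pointer is invalidated. Erasing that element would invalidate both.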
"I am not sure if this iterator will be valid if the list is modified"
Yeah, in the general case, storing iterators is risky unless you keep a close eye on the operations performed on your container.
Problem is, this is just the same for a pointer. In fact, for many containers, iterators are implemented as pointers.
So either store an iterator or a pointer if you like but, either way, keep an eye on the iterator invalidation rules:
Iterator invalidation rules
For lists, an iterator is valid even if other items in the list are erased. It becomes garbage when that item the iterator references in the list is removed.
So, as long as you know the iterator you're passing around isn't being removed by some other piece of code, they're safe to hold onto. This seems fragile though.
Even if there was a construct outside of iterators to reference a node in the list, it would suffer from the same fragility.
However, you can have each node contain an std::shared_ptr to the data it stores instead of the object itself and then pass around std::weak_ptr's to those objects and check for expired before accessing those weak_ptr's.
For example, instead of
std::list<MyClass> foo;
you would have
std::list<std::shared_ptr<MyClass>> foo;
have a look here for info on weak_ptr's
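A short sketch of how that might look in use. MyClass here is a stand-in with a single name member, and add and name_or_empty are hypothetical helpers; names are assumed to be non-empty so that an empty string can signal "gone":

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <string>

struct MyClass {  // placeholder for the real element type
    std::string name;
};

using Items = std::list<std::shared_ptr<MyClass>>;

// Appends an item and hands back a non-owning observer to it.
std::weak_ptr<MyClass> add(Items& items, const std::string& name) {
    assert(!name.empty());  // "" is reserved to mean "item no longer exists"
    items.push_back(std::make_shared<MyClass>(MyClass{name}));
    return items.back();
}

// Returns the item's name, or "" if the item has been removed from the list.
std::string name_or_empty(const std::weak_ptr<MyClass>& w) {
    if (std::shared_ptr<MyClass> p = w.lock()) {
        return p->name;     // p keeps the object alive while we read it
    }
    return std::string();
}
```

This only detects removal if the list holds the sole shared_ptr to each object. Any other long-lived shared_ptr copy would keep the object alive after it leaves the list.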
"is there a way I can get a pointer to the node in the list"
Yes, in your particular implementation.
No, in a standard-compliant way.
If you look at the std::list documentation, there is not a single word about a node. While it is hard to imagine a different way to implement the std::list other than using a doubly linked list, there is nothing that prevents it.
You should almost never come into any contact with undocumented internals of libraries.
Adding, removing and moving the elements within the list or across several lists does not invalidate the iterators or references. An iterator is invalidated only when the corresponding element is deleted.
Source: https://en.cppreference.com/w/cpp/container/list
So a std::list<>::iterator is only invalidated when the corresponding element is deleted. So yes, as long as you make sure that the corresponding element exists (which you will anyway have to do in your scenario of storing/passing around a pointer to anything) you can save and/or pass around the iterator throughout the lifetime of your program.
Now, an iterator behaves much like a pointer. So, if you prefer to save/pass around the corresponding pointer instead of the iterator, you can always first convert the iterator to a pointer, as @Aaron McDaid suggested.
int * ptr = &*it; // get a pointer to the same element.

C++ map allocator stores items in a vector?

Here is the problem I would like to solve: in C++, iterators for map, multimap, etc are missing two desirable features: (1) they can't be checked at run-time for validity, and (2) there is no operator< defined on them, which means that they can't be used as keys in another associative container. (I don't care whether the operator< has any relationship to key ordering; I just want there to be some < available at least for iterators to the same map.)
Here is a possible solution to this problem: convince map, multimap, etc to store their key/data pairs in a vector, and then have the iterators be a small struct that contain a pointer to the vector itself and a subscript index. Then two iterators, at least for the same container, could be compared (by comparing their subscript indices), and it would be possible to test at run time whether an iterator is valid.
Is this solution achievable in standard C++? In particular, could I define the 'Allocator' for the map class to actually put the items in a vector, and then define the Allocator::pointer type to be the small struct described in the last paragraph? How is the iterator for a map related to the Allocator::pointer type? Does the Allocator::pointer have to be an actual pointer, or can it be anything that supports a dereference operation?
UPDATE 2013-06-11: I'm not understanding the responses. If the (key,data) pairs are stored in a vector, then it is O(1) to obtain the items given the subscript, only slightly worse than if you had a direct pointer, so there is no change in the asymptotics. Why does a responder say map iterators are "not kept around"? The standard says that iterators remain valid as long as the item to which they refer is not deleted. As for the 'real problem': say I use a multimap for a symbol table (variable name->storage location; it is a multimap rather than a map because the variable names in an inner scope may shadow variables with the same name), and say now I need a second data structure keyed by variables. The apparently easiest solution is to use as the key for the second map an iterator to the specific instance of the variable's name in the first map, which would work if only iterators had an operator<.
I think not.
If you were somehow able to "convince" map to store its pairs in a vector, you would fundamentally change certain (at least two) guarantees on the map:
insert, erase and find would no longer be logarithmic in complexity.
insert would no longer be able to guarantee the validity of unaffected iterators, as the underlying vector would sometimes need to be reallocated.
Taking a step back though, two things suggest to me that you are trying to "solve" the wrong problem.
First, it is unusual to need to have a vector of iterators.
Second, it is unusual to need to check an iterator for validity, as iterators are not generally kept around.
I wonder what the real problem is that you are trying to solve?
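For the symbol-table use case in the update, one possible workaround that needs no custom allocator is to give the second map its own comparator that orders iterators by the address of the element they point to. This is only a sketch under assumptions: the types Symbols and Notes, the comparator IterLess and the function annotate are all invented here, and the approach works only for dereferenceable iterators (never end()) into containers whose elements do not move, such as map and multimap.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Assumed symbol table: variable name -> storage location.
using Symbols = std::multimap<std::string, int>;

// Orders iterators by the address of the element they refer to.
// std::less on pointers is guaranteed to give a total order.
struct IterLess {
    bool operator()(Symbols::const_iterator a, Symbols::const_iterator b) const {
        return std::less<const Symbols::value_type*>()(&*a, &*b);
    }
};

// Second structure, keyed by one specific declaration in the symbol table.
using Notes = std::map<Symbols::const_iterator, std::string, IterLess>;

// Attaches a note to the declaration `sym` of `symbols`.
void annotate(Notes& notes, const Symbols& symbols,
              Symbols::const_iterator sym, const std::string& text) {
    assert(sym != symbols.end());  // end() must never be dereferenced
    notes[sym] = text;
}
```

The resulting order has nothing to do with key order, which the question says is acceptable. It still gives no run-time validity check: an entry in Notes must be removed before the declaration it refers to is erased from Symbols.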

Storing iterators inside containers

I am building a DLL that another application would use. I want to store the current state of some data globally in the DLL's memory before returning from the function call so that I could reuse state on the next call to the function.
For doing this, I'm having to save some iterators. I'm using a std::stack to store all other data, but I wasn't sure if I could do that with the iterators also.
Is it safe to put list iterators inside container classes? If not, could you suggest a way to store a pointer to an element in a list so that I can use it later?
I know using a vector to store my data instead of a list would have allowed me to store the subscript and reuse it very easily, but unfortunately I'm having to use only an std::list.
Iterators to list are invalidated only if the list is destroyed or the "pointed" element is removed from the list.
Yes, it'll work fine.
Since so many other answers go on about this being a special quality of list iterators, I have to point out that it'd work with any iterators, including vector ones. The fact that vector iterators get invalidated if the vector is modified is hardly relevant to a question of whether it is legal to store iterators in another container -- it is. Of course the iterator can get invalidated if you do anything that invalidates it, but that has nothing to do with whether or not the iterator is stored in a stack (or any other data structure).
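A minimal sketch of keeping list iterators in a std::stack between calls. The State struct and the functions remember and resume are hypothetical names, not anything from the question's DLL:

```cpp
#include <cassert>
#include <list>
#include <stack>

using Data = std::list<int>;

// Hypothetical saved state: the list plus positions to come back to later.
struct State {
    Data data;
    std::stack<Data::iterator> saved;
};

// Saves the position of the first element equal to `value`, if there is one.
bool remember(State& st, int value) {
    for (Data::iterator it = st.data.begin(); it != st.data.end(); ++it) {
        if (*it == value) {
            st.saved.push(it);
            return true;
        }
    }
    return false;
}

// Pops the most recently saved position and returns the element there.
// The caller must not have erased that element in the meantime.
int resume(State& st) {
    assert(!st.saved.empty());
    Data::iterator it = st.saved.top();
    st.saved.pop();
    return *it;
}
```

Inserting or erasing other elements between remember and resume is fine for a list. Copying a State, however, would copy the list while the saved iterators kept pointing into the original, so such a struct is best left non-copied.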
It should be no problem to store the iterators, just make sure you don't use them on a copy of the list -- an iterator is bound to one instance of the list, and cannot be used on a copy.
That is, if you do:
std::list<int>::iterator it = myList.begin ();
std::list<int> c = myList;
c.insert (it, ...); // Error
As noted by others: Of course, you should also not invalidate the iterator by removing the pointed-to element.
This might be offtopic, but just a hint...
Be aware that your function(s)/data structure would probably be thread-unsafe for read operations. There is a kind of basic thread safety where read operations do not require synchronization. If you are going to store the state of how much the caller has read from your structure, it will make the whole concept thread-unsafe and a bit unnatural to use, because nobody assumes a read to be a stateful operation.
If two threads are going to call it they will either need to synchronize the calls or your data structure might end-up in a race condition. The problem in such a design is that both threads must have access to a common synchronization variable.
I would suggest making two overloaded functions. Both are stateless, but one of them should accept a hint iterator indicating where to start the next read/search/retrieval etc. This is, e.g., how Allocator in the STL is implemented: you can pass the allocator a hint pointer (default 0) so that it finds a new memory chunk more quickly.
Regards,
Ovanes
Storing the iterator for the list should be fine. It will not get invalidated unless you remove from the list the element for which you have stored the iterator. The following quote is from the SGI site:
"Lists have the important property that insertion and splicing do not invalidate iterators to list elements, and that even removal invalidates only the iterators that point to the elements that are removed."
However, note that the previous and next element of the stored iterator may change. But the iterator itself will remain valid.
The same rule applies to an iterator stored in a local variable as in a longer lived data structure: it will stay valid as long as the container allows.
For a list, this means: as long as the node it points to is not deleted, the iterator stays valid. Obviously the node gets deleted when the list is destructed...