I understand that a vector's elements get moved if you push more elements than it has capacity for, but what happens to a std::list if one of its elements gets moved for reasons unrelated to the list itself? For instance, to make space for a vector?
Will the list get invalidated because the elements around the moved element no longer point to it? Or is the list prepared for such eventuality?
If it is the latter, what happens to pointers that point to the moved element?
As a concrete use case, I want to make a node map, which of course means that every node has to point to other nodes. But I also need to have a list of the nodes so I can search them easily.
So I wanted to have a list where every object in the list has pointers to some other elements of the same list (this is separate from the normal std::list back and forth pointers). But I got worried about how std::list would handle one of its elements getting moved, and how I could handle my own pointers in that case.
I have already discarded vectors because the documentation states that if a vector reallocates, all pointers and references to its elements are invalidated. If my approach of using std::list cannot work, my second choice would be to keep the nodes in a vector and make the nodes reference each other by index (which I can do because once built the vector won't change its size).
but what happens to a std::list if one of its elements gets moved for
reasons unrelated to the list itself?
In C++, the runtime environment is not allowed to unilaterally move objects in that way, for exactly the reason you imagine -- there is no reliable or efficient way to find all pointers to the object's old location and modify them to point to a new location, so any attempt to surreptitiously move an object would run the risk of creating dangling pointers, which would lead to undefined behavior. (btw this is one reason why garbage collectors don't work very well in C++)
So having your list-node objects moved behind your back is not something you need to worry about. Since your list container has sole ownership of the nodes, the only way for this to happen would be if the list itself decided to explicitly move one of its nodes for some reason (and in general there's no reason why it would need or want to do that).
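To make that concrete, here is a minimal sketch of nodes in a std::list pointing at each other. The GraphNode type and the function name are made up for illustration, they are not from the question. It only checks that the addresses taken before the list grows are still the right ones afterwards.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical node type: each node keeps raw pointers to other nodes
// that live in the same std::list.
struct GraphNode {
    std::string name;
    std::vector<GraphNode*> neighbours;
};

// Returns true if pointers taken early are still correct after the list
// has grown a lot and an unrelated element has been erased.
bool list_addresses_stay_stable() {
    std::list<GraphNode> nodes;
    nodes.push_back({"a", {}});
    nodes.push_back({"b", {}});

    GraphNode* a = &nodes.front();
    GraphNode* b = &nodes.back();
    a->neighbours.push_back(b);  // a -> b
    b->neighbours.push_back(a);  // b -> a

    // Growing a list allocates new nodes; existing elements are not relocated.
    for (int i = 0; i < 1000; ++i) {
        nodes.push_back({"extra" + std::to_string(i), {}});
    }
    // Erasing invalidates only pointers/iterators to the erased element.
    nodes.erase(std::next(nodes.begin(), 2));

    return a == &nodes.front() && a->neighbours[0] == b && b->name == "b";
}
```

The only pointers you have to clean up yourself are the ones that refer to a node you erase.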
My implementation uses std::map to store data. When I started my code it seemed like the best option. Now I came to a point where I have to change the key values of all objects inside the map.
The problem is that each object points to another object inside the map:
class AND : public node {
    vector<node*> inputs;
    vector<node*> outputs;
};
And the map is declared like this:
map<unsigned int, AND> all_ANDs;
My question is: If I use map::extract from C++17 to modify the key values in all_ANDs map, will my pointers (E.g. the ones inside the attribute inputs) keep pointing to the right places?
In other words: if I change the value of the "first" element with extract, will the address of "second" stay intact?
I noticed from this link that the string "papaya" stays the same (and works gracefully). But I wanted to be sure about pointers.
YES
The reference you have already quoted in your posts clearly states that no elements are copied or moved. (This assumes that node in your code snippet does not refer to map::node_type).
The same holds for the insert operation of the map-node (after modifying its key):
If the insertion is successful, pointers and references to the element obtained while it is held in the node handle are invalidated, and pointers and references obtained to that element before it was extracted become valid. (since C++17)
However, accessing the object between extract()ion and re-insert()ion has undefined behaviour and its address taken whilst in the extracted state is of limited use. Quoting from the standard:
The extract members invalidate only iterators to the removed element;
pointers and references to the removed element remain valid. However,
accessing the element through such pointers and references while the
element is owned by a node_type is undefined behavior. References and
pointers to an element obtained while it is owned by a node_type are
invalidated if the element is successfully inserted.
Explanation
Essentially, a map<> is implemented as a tree of nodes, each holding a key and T (which are exposed as pair<const Key, T> to the user). Nodes are allocated (typically) on the heap and the address of your object is related to that of a node. map::extract() un-links a node from its tree and returns a node handle (an object holding a pointer to a map node), but AFAIK the node itself is not re-allocated, moved, or copied. Upon map::insert(handle), the node is re-linked into the tree according to its (new) key. Again, this involves no re-allocation, move, or copy of the node.
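As a quick check of the above, here is a small sketch (C++17) that re-keys one entry and compares the address of the mapped value before and after. The function name and the sample values are mine. Note that it never dereferences the old pointer while the node is detached.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Re-keys one entry with extract()/insert() and reports whether the mapped
// value is still at the same address afterwards.
bool rekey_preserves_address() {
    std::map<unsigned, std::string> m;
    m.emplace(1u, "papaya");
    m.emplace(2u, "mango");

    const std::string* before = &m.at(1u);

    auto handle = m.extract(1u);   // unlink the node from the tree
    handle.key() = 42u;            // change the key while the node is detached
    m.insert(std::move(handle));   // relink the same node under the new key

    const std::string* after = &m.at(42u);
    return before == after && *after == "papaya" && m.count(1u) == 0;
}
```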
Remark
The above is a rough sketch. How things are actually done is likely more complex and also implementation defined. As explained here, a node_handle does allow you to alter the key through the member function
Key &node_handle::key() const;
How this is done under the hood is not specified and I speculate that the implementation uses a union or some cast to allow this. Of course, the map has to present to users a pair<const Key,T> in order to prevent them from changing the key and hence breaking the map, but this is not of any concern for an element extracted from the map.
My above answer addresses your immediate question. However, as I have suggested in a comment, this appears to be a XY problem. What I suspect:
You have some structure of AND objects which are interlinked via their inputs and outputs fields. This linkage must not be broken by any re-allocation, so you cannot store them in a growing vector<AND> with re-allocation.
You also want to order these objects according to some key and have therefore stored them in a map<Key,AND>, which indeed does not re-allocate when grown.
You now want to order them according to another key (and/or change all the keys).
(If you are actually not interested in ordering but merely in finding your objects by their key, you should have used unordered_map instead of map, which supports find() in O(1) on average rather than O(log(n)) operations.)
I suggest a different layout of your data:
You store your AND objects in a way that allows growing their number without re-allocation. An obvious choice here is deque<AND>, since
insertion and deletion at either end of a deque never invalidates
pointers or references to the rest of the elements
You may also make AND non-copyable and non-movable, ensuring that once allocated their address never changes (and pointers to them remain valid until destruction).
You can support any find-by-key or order-by-key operations by actually working on pointers to the stored objects, either by sorting a vector of pair<key,AND*> or by using a map<key,AND*> or unordered_map<key,AND*>. You can even simultaneously have various keys per object (and a map for each).
When you must re-key all objects, simply forget the old map and make a new one: since the map only stores pointers and not the objects, this does not affect your linkages.
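A rough sketch of that layout follows. Gate stands in for your AND class, and Circuit, add, find and rekey are names I invented for the example, so treat it as one possible shape rather than a finished design. It relies on deque::emplace_back returning a reference, which needs C++17.

```cpp
#include <cassert>
#include <deque>
#include <unordered_map>
#include <vector>

// Stand-in for AND: non-copyable (and therefore non-movable), so its
// address cannot change once it has been constructed.
struct Gate {
    explicit Gate(unsigned id_) : id(id_) {}
    Gate(const Gate&) = delete;
    Gate& operator=(const Gate&) = delete;

    unsigned id;
    std::vector<Gate*> inputs;
    std::vector<Gate*> outputs;
};

class Circuit {
public:
    // Stores the object in the deque and indexes it by key.
    Gate& add(unsigned key) {
        Gate& g = storage_.emplace_back(key);  // push at the end: others stay put
        index_[key] = &g;
        return g;
    }

    // Returns nullptr if no gate has this key.
    Gate* find(unsigned key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : it->second;
    }

    // Re-keys everything by throwing the old index away and building a new one.
    template <class F>
    void rekey(F new_key) {
        std::unordered_map<unsigned, Gate*> fresh;
        for (Gate& g : storage_) {
            g.id = new_key(g.id);
            fresh[g.id] = &g;
        }
        index_.swap(fresh);
    }

private:
    std::deque<Gate> storage_;                   // owns the objects
    std::unordered_map<unsigned, Gate*> index_;  // only points at them
};

// Small self-check: links between gates survive growth and re-keying.
bool circuit_links_survive_rekey() {
    Circuit c;
    Gate& a = c.add(1);
    Gate& b = c.add(2);
    a.outputs.push_back(&b);
    b.inputs.push_back(&a);
    for (unsigned k = 3; k < 500; ++k) c.add(k);
    c.rekey([](unsigned old_key) { return old_key + 1000; });
    return c.find(1) == nullptr && c.find(1001) == &a &&
           a.outputs[0] == &b && b.inputs[0] == &a;
}
```

This sketch never removes gates. Erasing from the middle of a deque does invalidate pointers, so deletion would need a different strategy.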
Your map holds actual AND objects, not pointers to objects. So, if the AND* pointers stored inside your vectors are pointing at the map's AND objects, then those pointers WILL become invalid once those objects are erased from the map.
However, extraction merely unlinks a specified node from the map, the node and thus its key and value are still valid in memory. The node can be re-inserted into a map without affecting the addresses of the node's key and value. In this regard, the pointers in the vectors WILL NOT become invalid (although it is undefined to dereference them while the node is detached from the container).
Another option is to change your map to hold AND* pointers instead. Or better, consider using std::shared_ptr<AND> in the map and std::shared_ptr<node> in the vectors, instead of raw pointers. Then it won't matter whether the map entries are erased or extracted, the AND objects will remain valid as long as there are active shared_ptr references to them.
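A minimal sketch of the shared_ptr variant, with a made-up SharedGate type in place of AND and node. One caveat that I am adding myself: if gates refer to each other in both directions through shared_ptr, they form a cycle and are never freed, so one of the two directions would probably have to be a std::weak_ptr.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical gate type holding shared ownership of its inputs only.
struct SharedGate {
    std::vector<std::shared_ptr<SharedGate>> inputs;
    int value = 0;
};

// Returns true if a gate erased from the map is still usable through
// another gate that references it.
bool gate_outlives_map_entry() {
    std::map<unsigned, std::shared_ptr<SharedGate>> all;
    all[1] = std::make_shared<SharedGate>();
    all[2] = std::make_shared<SharedGate>();
    all[1]->value = 7;
    all[2]->inputs.push_back(all[1]);  // gate 2 shares ownership of gate 1

    all.erase(1);  // the map entry is gone, the object is not

    const std::shared_ptr<SharedGate>& kept = all[2]->inputs[0];
    return kept->value == 7 && kept.use_count() == 1;
}
```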
libstdc++, as an example, implements std::map using a red-black binary tree with parent pointers in the nodes. This means that iterators can just be pointers to a node.
Is it possible for a standard library to implement std::map without storing parent pointers in the nodes? I think this would mean that iterators would need to contain a stack of parent pointers, and as such would need to dynamically allocate a logarithmic amount of memory. Would this violate standard performance constraints on iterators? Would not having parent pointers violate any other performance constraints on the rest of the interface?
What about the new node stuff/interface in C++17?
They may not do so. std::map guarantees that removing a key-value pair from it won't invalidate any iterators other than to the pair being removed.
If iterators stored a stack of parents and one of those parents were removed, those iterators would be invalidated as well, and the guarantee would no longer hold.
Is it possible? Possibly :-) Is it a good idea? Almost certainly not. Most things are possible, if you throw more storage or speed at them :-)
In terms of just getting rid of the parent pointers, you could, for example, maintain within the map a monotonic value that is incremented each time the map structure is changed. In essence, it's a version identifier of the map structure. So, adding or deleting elements in the map increments this value, while merely changing the data within the map does not.
The iterator would then contain:
a pointer to the map itself (to get the current version);
the stack of pointers; and
the version matching the last time the stack above was created.
The idea would basically be to, before doing anything with the iterator, detect when the map version is different to the iterator one and, if it is, rebuild the stack and update the iterator version before carrying on with whatever operation you're trying to perform.
Now, while that makes it possible to iterate without parent pointers, it unfortunately violates some other requirements of iterators, such as being able to action them in constant time. Anything that has to rebuild a data structure, based on the data within the map, will violate that restriction.
In any case, there's no way anyone in their right mind would implement such a horrid scheme when it's far simpler to have parent pointers, but the intent here is simply to show that it's possible.
Hence my advice would be to just stick with the parent pointers. The use of such parent pointers makes the process of finding the next/previous element a rather simple one, based only on the current item in the iterator.
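For illustration, this is roughly what "find the next element" looks like with parent pointers. It is a textbook in-order successor on a plain binary search tree, not the actual libstdc++ code, and TreeNode, successor and walk_small_tree are names I picked for the sketch.

```cpp
#include <cassert>
#include <vector>

// Simplified tree node with a parent pointer (no red-black colouring).
struct TreeNode {
    int key = 0;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
    TreeNode* parent = nullptr;
};

// In-order successor using only the current node, no extra stack needed.
TreeNode* successor(TreeNode* n) {
    if (n->right) {
        // Leftmost node of the right subtree.
        n = n->right;
        while (n->left) n = n->left;
        return n;
    }
    // Climb until we arrive at a parent from its left child.
    TreeNode* p = n->parent;
    while (p && n == p->right) {
        n = p;
        p = p->parent;
    }
    return p;  // nullptr once the last element has been passed
}

// Builds a 7-node tree with keys 1..7 and walks it from the smallest key.
std::vector<int> walk_small_tree() {
    TreeNode n[7];
    for (int i = 0; i < 7; ++i) n[i].key = i + 1;
    auto link = [&n](int parent, int left, int right) {
        n[parent].left = &n[left];
        n[left].parent = &n[parent];
        n[parent].right = &n[right];
        n[right].parent = &n[parent];
    };
    link(3, 1, 5);  // key 4 is the root with children 2 and 6
    link(1, 0, 2);  // key 2 has children 1 and 3
    link(5, 4, 6);  // key 6 has children 5 and 7

    std::vector<int> out;
    for (TreeNode* p = &n[0]; p != nullptr; p = successor(p)) {
        out.push_back(p->key);
    }
    return out;
}
```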
I am currently coding a 2D geometry editor in C++. I am having the user place nodes. Lines and arcs can be drawn by selecting 2 nodes.
Right now, I am storing the nodes in a std::deque container (same thing for the lines and arcs) because I would like to store the address of the node in each line/arc. This makes things very convenient coding-wise when I implement a feature to move the node. If I were to store the actual node inside each line/arc, then whenever I wanted to move a node I would have to iterate through the entire line and arc structure to find the node that I just moved and update the parameters. That is not an option for me. Hence the need to be able to store the address of the node inside each line/arc.
However, I am running into an issue when I need to delete a node. Looking at the reference manual, it seems that all pointers are invalidated when you erase an element from the deque (unless that element is at the beginning or the end; for the sake of discussion, I will not be considering this case). This causes problems with erasing because now all of my lines/arcs reconnect themselves to different nodes or are not drawn at all when a node is erased, and the program eventually crashes.
Continuing to look online, I come across std::list which (from my understanding of reading the documentation) does not invalidate any pointers or references when one of the elements is erased. This seems to be a very nice solution to my problem.
However, I have been looking a little bit on Stack Overflow to see what the benefits/disadvantages of using a list vs a deque are. And it seems like there is more of a preference for a deque than a list. It seems as though the list is slower to access than the deque. This is not good because I am not sure how many nodes a user would like to draw. For all I know, there could be 10,000+ nodes in the geometry and if the user wants to move a node, I don't want the user to have to wait 30 sec for the program to iterate through all of the elements to find the node(s) to erase.
So on one hand, deques are a lot faster but as soon as an element is removed, all of the pointers and references are invalidated. On the other hand, std::list allows me to erase whatever element I want without invalidating any of the other pointers and references but is slower compared to a deque.
I am considering to switch to a list because even if the list is slower, if I can't erase an element without invalidating the pointers and references, then there isn't much of a benefit speed wise if the program doesn't work.
However, is using a list the best choice in my situation? Is there any way to use a deque? Or is there a third option that I haven't considered?
Edit:
I forgot to mention. One thing that I am not too fond of with lists is the inability to get an element's data directly (in std::deque and vector, I can use the at function to access elements). This isn't a huge deal breaker with my code. But it does make things convenient. For example, when a user selects a node when they want to create a line/arc, the code iterates over the entire node list to find out which one was selected and then, for the first selection, stores the index into a variable (called firstNodeIndex). For the second node, it does the same thing but when both variables (firstNodeIndex and secondNodeIndex) are viable numbers, then the function for creating the line/arc is called and the function uses the two stored indexes to re-access the node list to grab an address to the node. If I were to use the list, I would have to store the address of the two nodes in variables and then create some additional logic to make sure that the two variables containing the addresses to the two nodes are viable options.
Another alternate solution would be to reiterate through the entire node list again to grab the nodes that are selected (I would have a variable inside each node to indicate that it is selected). But I am afraid that this might not be a good idea given std::list limitations.
I am kind of in favor of my first way but I am open to change if need be or if there is a better method
So your problem is that you don't want your iterators invalidated when you insert or erase element, but you want your data structure to be fast.
A linked list is only slow when you have to iterate over all elements frequently. It does not take advantage of contiguous data access like vector or deque. Also, linear search in a list is slow.
I had similar situations. Here are some options:
Use list and try to avoid linear searches. See if memory access speed of linked list affect your performance significantly and if it doesn't - use it.
Use map or set. Same cons as list except search, which is O(log n). Or you can use the unordered versions if you don't care about sorting elements.
Use non-standard data structure like plf::colony. If you don't care about order of insertion, this is probably your best option.
Create your own deque-like data structure that does not invalidate iterators (using skipfields or storing free elements somewhere). I wouldn't recommend it since you will probably end up writing something like plf::colony anyway.
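For the first option, one way to avoid linear searches is to keep list iterators as handles instead of indexes. A small sketch follows, where Point, NodeList and Line are names I made up for the editor's node and line types.

```cpp
#include <cassert>
#include <list>

// Hypothetical editor types: a node is a point, a line refers to two nodes.
struct Point {
    double x;
    double y;
};
using NodeList = std::list<Point>;
struct Line {
    NodeList::iterator first;
    NodeList::iterator second;
};

// Returns true if erasing one node leaves handles to the other nodes usable.
bool erase_keeps_other_handles_valid() {
    NodeList nodes;
    // insert() returns an iterator to the new element; keep it as the handle.
    NodeList::iterator a = nodes.insert(nodes.end(), Point{0.0, 0.0});
    NodeList::iterator b = nodes.insert(nodes.end(), Point{1.0, 0.0});
    NodeList::iterator c = nodes.insert(nodes.end(), Point{1.0, 1.0});

    Line bc{b, c};

    nodes.erase(a);  // constant time, no search, b and c are untouched
    b->x = 5.0;      // "moving" a node is visible through every line using it

    return nodes.size() == 2 && bc.first->x == 5.0 && bc.second->y == 1.0;
}
```

Any line that refers to the erased node still has to be removed or updated by your own code. The list only guarantees that handles to the other nodes stay valid.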
A rule of thumb:
will I want to add and delete items at random?
set, list, map, multimap and unordered versions of same
will I want to be able to name individual items and find them quickly?
map, set, multimap and unordered versions of same
does the thing I am storing have mutable data, or is it more detailed than just its name (key)?
map, multimap, unordered versions thereof
do I need the items to stay in order?
yes: map, no: unordered_map
What are recommended methods of avoiding errors when iterating through a vector, where any number of elements in the vector may (directly or indirectly) cause insertions or removals of elements, invalidating the iterators?
Specifically, I'm asking in relation to games programming, with elements in the vector being game objects; some of which can spawn other objects, and some of which will be killed and need to be removed when they are updated. Because of this, reserving a large capacity before iteration is also not possible, as it's entirely unknown how many elements can be expected to be added.
It's not possible to say in general. Part of the reason that the iterators are invalidated is that there's no way to read your mind to know what element you would like the invalidated iterators to refer to.
For example, I have an iterator to element 3 of 5. I erase element 2. Should the iterator point to the new element 2 (because it's "the same value moved down"), or should it point to the new element 3 (because it's "the same element of the vector with a different value moved down into it")?
Practically, your options are:
Do not insert/erase elements while someone else has iterators. This means altering your code to make the changes at another time.
Use indexes instead of iterators. This results in the second option above ("index refers to the same element with a new value"). Beware that indexes can also become invalid, if you erase enough elements that the index is off the end.
Use a different data structure with different iterator invalidation rules. For example a std::list would give you the first option above ("iterator refers to the same element in a new position")
The iterator that you're using to iterate should never be harmfully invalidated by insertions/erases of a single element at that position. It is invalidated, but both functions return a new iterator that you can use to continue iterating. See their documentation for what element that new iterator refers to. That's why the problem only arises when you're using multiple iterators on the same container.
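A short sketch of that single-iterator case, using the iterator returned by vector::erase to carry on (remove_negatives is just an example name):

```cpp
#include <cassert>
#include <vector>

// Erases while iterating, continuing from the iterator that erase() returns.
std::vector<int> remove_negatives(std::vector<int> v) {
    for (auto it = v.begin(); it != v.end(); /* advanced in the body */) {
        if (*it < 0) {
            it = v.erase(it);  // refers to the element after the erased one
        } else {
            ++it;
        }
    }
    return v;
}
```

Any other iterator into v held elsewhere would still be invalidated by the erase.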
Here are some ideas:
Iterate over the vector using an integer index. This won't get invalidated, but you need to take care to adjust the current index when inserting/removing elements.
Iterate over a copy of the vector.
Instead of making changes to the vector as you go along, keep track of what needs to be inserted/removed, and apply the changes after you've finished iterating.
Use a different container, for example a linked list.
Vector might not be the best container to support intensive insertion/removal (especially when the size is large).
But if you have to use vector, IMHO the safest way would be to work not with an iterator but with an index and keep track (and adjust the index) of items being inserted/deleted before the current position. This way you will at least avoid problems with reallocations.
And recalculate v.size()/v.end() after each iteration.
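Something along these lines, assuming a made-up Entity type where spawning appends at the end and a dead entity is erased in place:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical game object for the sketch.
struct Entity {
    int hp;
    bool spawns_child;
};

// Index-based update loop: the vector is always re-indexed, and no reference
// or iterator is kept across a push_back or erase.
void update_all(std::vector<Entity>& world) {
    for (std::size_t i = 0; i < world.size(); /* advanced in the body */) {
        if (world[i].spawns_child) {
            world[i].spawns_child = false;
            world.push_back(Entity{1, false});  // may reallocate, i is still fine
        }
        if (world[i].hp <= 0) {
            // The next element slides into position i, so do not advance.
            world.erase(world.begin() + static_cast<std::ptrdiff_t>(i));
        } else {
            ++i;
        }
    }
}
```

Because world.size() is re-read on every pass, entities spawned during the loop get updated in the same pass. Whether that is what you want depends on the game.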
Use a separate vector and std::move the elements which don't get removed to it along with any created new elements. Then drop the old vector.
This way you don't have to remove/insert anything from/to the original vector and the iterators don't get invalidated.
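A minimal sketch of this, with an invented Unit type and function name. Survivors are moved into the new vector and newly created objects are appended next to them.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical game object for the sketch.
struct Unit {
    std::string name;
    bool alive;
    bool spawns;
};

// Builds the next frame's vector; the old vector is only read and moved from.
std::vector<Unit> next_generation(std::vector<Unit>& current) {
    std::vector<Unit> next;
    next.reserve(current.size());
    for (Unit& u : current) {
        if (!u.alive) continue;  // "removal" is simply not carrying it over
        const bool spawns = u.spawns;
        const std::string child_name = u.name + "-child";
        u.spawns = false;
        next.push_back(std::move(u));  // u is moved-from after this line
        if (spawns) next.push_back(Unit{child_name, true, false});
    }
    return next;
}
```

Typical use would be `world = next_generation(world);` once per frame.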
You can, of course, go around the problem by using a container that doesn't invalidate the iterators on insertion and removal.
I'd like to recommend a different approach. Essentially you're iterating over the vector to create a new state in the game world. Instead of changing the state of the world as you traverse the vector, store what needs to change in a new vector. Then, after you've completely examined the old state, apply the changes you stored in the new vector.
This approach has one conceptual advantage: every object's new state depends on the old state only. If you apply the changes as you traverse the old state, and decisions about one object can depend on other objects, the order of traversal can affect the outcome, which can give "earlier" or "later" objects an unfair advantage.
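Roughly like this, with a hypothetical Actor type. The first loop only reads the old state and records what should happen, and the changes are applied afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical game object for the sketch.
struct Actor {
    int id;
    int hp;
    bool wants_spawn;
};

// One simulation step with deferred changes.
void step(std::vector<Actor>& world) {
    std::vector<Actor> spawned;

    // Phase 1: examine the old state only; the vector is not modified here.
    for (const Actor& a : world) {
        if (a.wants_spawn) spawned.push_back(Actor{a.id * 10, 1, false});
    }

    // Phase 2: apply the recorded changes.
    for (Actor& a : world) a.wants_spawn = false;
    world.erase(std::remove_if(world.begin(), world.end(),
                               [](const Actor& a) { return a.hp <= 0; }),
                world.end());
    world.insert(world.end(), spawned.begin(), spawned.end());
}
```

Note that in this sketch an actor that dies in the same step still gets to spawn, because every decision is based on the old state.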
I am building a DLL that another application would use. I want to store the current state of some data globally in the DLL's memory before returning from the function call so that I could reuse state on the next call to the function.
For doing this, I'm having to save some iterators. I'm using a std::stack to store all other data, but I wasn't sure if I could do that with the iterators also.
Is it safe to put list iterators inside container classes? If not, could you suggest a way to store a pointer to an element in a list so that I can use it later?
I know using a vector to store my data instead of a list would have allowed me to store the subscript and reuse it very easily, but unfortunately I'm having to use only an std::list.
Iterators to list are invalidated only if the list is destroyed or the "pointed" element is removed from the list.
Yes, it'll work fine.
Since so many other answers go on about this being a special quality of list iterators, I have to point out that it'd work with any iterators, including vector ones. The fact that vector iterators get invalidated if the vector is modified is hardly relevant to a question of whether it is legal to store iterators in another container -- it is. Of course the iterator can get invalidated if you do anything that invalidates it, but that has nothing to do with whether or not the iterator is stored in a stack (or any other data structure).
It should be no problem to store the iterators, just make sure you don't use them on a copy of the list -- an iterator is bound to one instance of the list, and cannot be used on a copy.
That is, if you do:
std::list<int>::iterator it = myList.begin ();
std::list<int> c = myList;
c.insert (it, ...); // Error
As noted by others: Of course, you should also not invalidate the iterator by removing the pointed-to element.
This might be offtopic, but just a hint...
Be aware that your function(s)/data structure would probably be thread unsafe for read operations. There is a kind of basic thread safety where read operations do not require synchronization. If you are going to store the state of how much the caller has read from your structure, it will make the whole concept thread unsafe and a bit unnatural to use, because nobody assumes a read to be a stateful operation.
If two threads are going to call it they will either need to synchronize the calls or your data structure might end-up in a race condition. The problem in such a design is that both threads must have access to a common synchronization variable.
I would suggest making two overloaded functions. Both are stateless, but one of them should accept a hint iterator saying where to start the next read/search/retrieval etc. This is e.g. how Allocator in STL is implemented. You can pass the allocator a hint pointer (default 0) so that it finds a new memory chunk more quickly.
Regards,
Ovanes
Storing the iterator for the list should be fine. It will not get invalidated unless you remove the same element from the list for which you have stored the iterator. Following quote from SGI site:
Lists have the important property that
insertion and splicing do not
invalidate iterators to list elements,
and that even removal invalidates only
the iterators that point to the
elements that are removed
However, note that the previous and next element of the stored iterator may change. But the iterator itself will remain valid.
The same rule applies to an iterator stored in a local variable as in a longer lived data structure: it will stay valid as long as the container allows.
For a list, this means: as long as the node it points to is not deleted, the iterator stays valid. Obviously the node gets deleted when the list is destructed...
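As a small sketch of the above, here is a list iterator kept in a std::stack while the list is modified around it. The function name and the values are made up for the example.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <stack>

// Saves a list iterator in a stack, modifies the list elsewhere, then resumes.
int resume_from_saved_position() {
    std::list<int> data{10, 20, 30};
    std::stack<std::list<int>::iterator> saved;

    saved.push(std::next(data.begin()));  // remember the element holding 20

    // None of these touch the remembered element, so the iterator stays valid.
    data.push_front(5);
    data.push_back(40);
    data.erase(data.begin());  // erases 5

    std::list<int>::iterator it = saved.top();
    saved.pop();
    assert(*std::next(it) == 30);  // the neighbours are still reachable
    return *it;
}
```

Erasing the element that holds 20 before calling saved.top() would leave a dangling iterator in the stack, so whatever erases from the list must also drop the saved iterators to that element.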