std::map without parent pointers? - c++

libstdc++, as an example, implements std::map using a red-black binary tree with parent pointers in the nodes. This means that iterators can just be pointers to a node.
Is it possible for a standard library to implement std::map without storing parent pointers in the nodes? I think this would mean that iterators would need to contain a stack of parent pointers, and as such would need to dynamically allocate a logarithmic amount of memory. Would this violate standard performance constraints on iterators? Would not having parent pointers violate any other performance constraints on the rest of the interface?
What about the new node stuff/interface in C++17?

No, they may not. std::map guarantees that erasing a key-value pair invalidates only the iterators (and references) to the pair being erased.
If iterators stored a stack of parents and one of those parents were erased, those iterators would be invalidated as well, and the guarantee would no longer hold.
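To make that guarantee concrete, here is a minimal sketch; the helper name iterator_survives_erase is made up for illustration. It takes an iterator to one element, erases a different one, and checks that the iterator still refers to the same object.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: erases `victim` and reports whether an iterator obtained
// beforehand for `kept` still refers to the same element afterwards.
bool iterator_survives_erase(std::map<int, std::string>& m, int kept, int victim) {
    assert(kept != victim);
    auto it = m.find(kept);
    if (it == m.end()) return false;
    const std::string* address_before = &it->second;
    m.erase(victim);  // the tree may rebalance internally, nodes are not moved
    return it->first == kept && &it->second == address_before;
}
```

With a stack of parents inside the iterator, the erase above could remove a node that the iterator for `kept` still has on its stack.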

Is it possible? Possibly :-) Is it a good idea? Almost certainly not. Most things are possible, if you throw more storage or speed at them :-)
In terms of just getting rid of the parent pointers, you could, for example, maintain within the map a monotonic value that is incremented each time the map structure is changed. In essence, it's a version identifier of the map structure. So, adding or deleting elements in the map increments this value, while merely changing the data within the map does not.
The iterator would then contain:
a pointer to the map itself (to get the current version);
the stack of pointers; and
the version matching the last time the stack above was created.
The idea would basically be to, before doing anything with the iterator, detect when the map version is different to the iterator one and, if it is, rebuild the stack and update the iterator version before carrying on with whatever operation you're trying to perform.
Now, while that makes it possible to iterate without parent pointers, it unfortunately violates other requirements on iterators, such as their operations running in (amortized) constant time. Anything that has to rebuild a data structure, based on the data within the map, will violate that restriction.
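A rough sketch of the versioned-iterator idea, under heavy assumptions: Tree, Node and Iterator are made-up names, the tree is a plain unbalanced BST of ints that only supports insertion, and nothing here resembles a real standard library implementation. In a real red-black tree, rotations are what would make the stored stack stale; here the rebuild step only demonstrates the mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical node type: note that there is no parent pointer.
struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

class Tree {
public:
    void insert(int key) {
        std::unique_ptr<Node>* slot = &root_;
        while (*slot) {
            if (key == (*slot)->key) return;  // already present, no change
            slot = key < (*slot)->key ? &(*slot)->left : &(*slot)->right;
        }
        *slot = std::make_unique<Node>(key);
        ++version_;  // the structure changed
    }
    const Node* root() const { return root_.get(); }
    std::size_t version() const { return version_; }

private:
    std::unique_ptr<Node> root_;
    std::size_t version_ = 0;
};

class Iterator {
public:
    // Starts at the smallest key (or at the end if the tree is empty).
    explicit Iterator(const Tree& t) : tree_(&t), version_(t.version()) {
        descend_left(t.root());
    }
    bool at_end() const { return path_.empty(); }
    int key() const { assert(!at_end()); return path_.back()->key; }

    void next() {
        assert(!at_end());
        if (version_ != tree_->version()) rebuild();  // not constant time
        const Node* current = path_.back();
        path_.pop_back();
        descend_left(current->right.get());
    }

private:
    // Pushes n and its chain of left children; the top becomes the current node.
    void descend_left(const Node* n) {
        for (; n; n = n->left.get()) path_.push_back(n);
    }
    // Re-creates the stack by searching for the current key from the root.
    // Assumes the current node still exists (this sketch has no erase).
    void rebuild() {
        const int k = path_.back()->key;
        path_.clear();
        for (const Node* n = tree_->root(); n;) {
            if (k < n->key) { path_.push_back(n); n = n->left.get(); }
            else if (k > n->key) { n = n->right.get(); }
            else { path_.push_back(n); break; }
        }
        version_ = tree_->version();
    }

    const Tree* tree_;            // pointer to the map itself
    std::vector<const Node*> path_;  // the stack of pointers
    std::size_t version_;         // version the stack was built for
};
```

The rebuild() call is the search from the root that breaks the constant-time requirement.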
In any case, there's no way anyone in their right mind would implement such a horrid scheme when it's far simpler to have parent pointers, but the intent here is simply to show that it's possible.
Hence my advice would be to just stick with the parent pointers. The use of such parent pointers makes the process of finding the next/previous element a rather simple one, based only on the current item in the iterator.
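For comparison, finding the next element with parent pointers might look like the sketch below. The Node layout is an assumption for illustration, not libstdc++'s actual node type.

```cpp
#include <cassert>

// Hypothetical node layout with a parent pointer.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
};

// Returns the in-order successor of n, or nullptr if n holds the largest key.
const Node* successor(const Node* n) {
    assert(n != nullptr);
    if (n->right) {
        // Smallest node of the right subtree.
        n = n->right;
        while (n->left) n = n->left;
        return n;
    }
    // Otherwise climb until we arrive at a parent from its left child.
    const Node* p = n->parent;
    while (p && n == p->right) {
        n = p;
        p = p->parent;
    }
    return p;
}
```

Everything the function needs is reachable from the node itself, so the iterator can be a single pointer.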

Related

Is it safe to have pointers to elements in Data Structures? (c++ with QT)

I have the following structure on the software I am developing:
ClassA:
QHash<int, ClassB>
ClassB:
QHash<int, ClassC>
ClassC:
QMap<ID, QSharedPointer<ClassD> > (this is because I need to have the items ordered)
QHash<int, QSharedPointer<ClassD> > (this exists so I can access an item via id)
My question is if it is safe to have a pointer, that will be edited, to an element inside a data structure. I have been getting errors while trying to debug in which the debugger is unable to stop at a break point and I get a SIGTRAP error, but I am not sure if it is related to a memory issue on this.
To give a better example, related to the software I'm developing, I have a QHash<int, Take> that represents a list of videos. The user will be editing only one video at a time, so I have a pointer to the current video, which is a Take inside the hash. Each Take has a bunch of parameters that can be edited, but the most common is a QMap of Notes. Is it safe to do something like this?
Take *currentTake = &takes[currentTakeId];
----//---
currentTake->addNote(new Note());
currentTake->changeSomeParameter();
etc
Whether (or how long) it is safe to keep a pointer/reference to an element of a collection is up to that collection. For example, a std::vector invalidates all pointers into it on reallocation, and removal/insertion without reallocation invalidates (well, changes what they point to) all pointers beyond the insertion/removal point. A std::list on the other hand is stable; pointers only get invalidated if the specific element they point to is removed.
A collection should generally document its invalidation behavior. Unfortunately, the Qt collections don't. Reading their documentation tells us that QMap is a red-black balanced binary tree, and QHash is a separate chaining hash table, so they should both have the std::list behavior for invalidation, but there's no guarantee of that. (For example, QHash could store the head entry directly in the bucket list, which would mean that rehashing would invalidate most pointers, and removing an element could invalidate pointers to elements in the same bucket.)
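The std::vector versus std::list difference is easy to observe with the standard containers. This sketch says nothing about what the Qt containers do, and the function names are made up. Addresses are compared as integers so that no invalidated pointer is ever read.

```cpp
#include <cstdint>
#include <list>
#include <vector>

// Address of an object as an integer, so it can still be compared after the
// object has been moved or destroyed.
template <class T>
std::uintptr_t address_of(const T& x) {
    return reinterpret_cast<std::uintptr_t>(&x);
}

// True if the first element sits at a different address after the vector
// has grown past its capacity.
bool vector_element_moves() {
    std::vector<int> v{1};
    const std::uintptr_t before = address_of(v.front());
    const std::vector<int>::size_type capacity = v.capacity();
    while (v.capacity() == capacity) v.push_back(0);  // force one reallocation
    return address_of(v.front()) != before;
}

// True if the first element keeps its address while the list grows.
bool list_element_stays() {
    std::list<int> l{1};
    const std::uintptr_t before = address_of(l.front());
    for (int i = 0; i < 1000; ++i) l.push_back(i);
    return address_of(l.front()) == before;
}
```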
Use a reference. The container in your example stores the objects by value, and takes[currentTakeId] returns a reference to the stored object.
You can do this:
auto it = takes.find(currentTakeId);
// unlike 'takes[currentTakeId]', find() does not create an empty element
// if 'currentTakeId' has no element
if (it != takes.end()) {
Take &currentTake = it.value(); // reference to the element stored in 'takes'
----//---
currentTake.addNote(new Note());
currentTake.changeSomeParameter();
// do not change 'takes' as long as you use 'currentTake'
}
Your idea is fine (and more or less the same): you can take the address and work through the pointer as long as the container is not modified while you are using that pointer. The container might copy your element when it resizes, or delete it, and then the pointer would be invalid.
If your objects Take are quite big you can lose performance because of the copy when the container is resized or passed by value (and copied). Then storing pointer can be a solution, or use the implicit sharing pattern of Qt on your data class.

How can I point to a member of a std::set in such a way that I can tell if the element has been removed?

An iterator into a std::set becomes invalidated if the item it's pointing to is erased. (It does not get invalidated if the set is modified in any other way, which is nice.) However, there is no way to detect whether an iterator has been invalidated or not.
I'm implementing an algorithm that requires me to be able to keep track of members of a std::set in such a way that I can erase them in constant time, but without risking undefined behaviour if I try to delete the same one twice. If I have two iterators pointing to the same member of a set, Bad Things will happen if I try to erase both of them.
My question is, how can I avoid this? Is there some way to implement something that behaves like an iterator into a set, but which knows when it has been invalidated?
Incidentally, I'm using std::set because this is a performance critical situation and I need the complexity guarantees that set provides. I'm happy to accept answers that suggest a different data structure, but only if it allows me to (a) access and remove the smallest element in constant time, (b) remove the pointed-to elements in constant time, and (c) insert elements in O(log(N)) time or better. C++11 is OK.
You could keep a set of shared pointers. And every time you store an iterator, pair it with a weak pointer to the element. When you want to erase the element, first check the weak pointer to see if the object still exists.
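A minimal sketch of that suggestion, assuming the set holds the only shared_ptr to each element (otherwise the weak pointer would outlive the erase). The names Handle, insert_tracked and erase_once are made up for illustration.

```cpp
#include <memory>
#include <set>

// Orders the shared pointers by the values they point to.
struct ByValue {
    bool operator()(const std::shared_ptr<int>& a,
                    const std::shared_ptr<int>& b) const {
        return *a < *b;
    }
};
using Set = std::set<std::shared_ptr<int>, ByValue>;

// An iterator paired with a weak pointer that tells us whether the element
// the iterator refers to is still in the set.
struct Handle {
    Set::iterator pos;
    std::weak_ptr<int> alive;
};

// Inserts value (or finds the existing equal element) and returns a handle.
Handle insert_tracked(Set& s, int value) {
    auto result = s.insert(std::make_shared<int>(value));
    return Handle{result.first, *result.first};
}

// Erases the element unless it is already gone; the iterator is only
// touched while the weak pointer says the element is still alive.
bool erase_once(Set& s, const Handle& h) {
    if (h.alive.expired()) return false;
    s.erase(h.pos);  // amortized constant time
    return true;
}
```

The smallest element is still *s.begin(), and insertion stays O(log N); the cost is one extra allocation per element.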

When the data structure is a template parameter, how can I tell if an operation will invalidate an iterator?

Specifically, I have a class which currently uses a vector and push_back. There is an element inside the vector I want to keep track of. Pushing back on the vector may invalidate the iterator, so I keep its index around. It's cheap to find the iterator again using the index. I can't reserve the vector as I don't know how many items will be inserted.
I've considered making the data structure a template parameter, and perhaps list may be used instead. In that case, finding an iterator from the index is not a trivial operation. Since pushing back on a list doesn't invalidate iterators to existing elements, I could just store this iterator instead.
But how can I code a generic class which handles both cases easily?
If I can find out whether push_back will invalidate the iterator, I could store the iterator and update it after each push_back by storing the distance from the beginning before the operation.
You should probably try to avoid this flexibility. Quote from Item 2 "Beware the illusion of container-independent code" from Effective STL by Scott Meyers:
Face the truth: it's not worth it. The different containers are different, and they have strengths and weaknesses that vary in significant ways. They're not designed to be interchangeable, and there's little you can do to paper that over. If you try, you're merely tempting fate, and fate doesn't like to be tempted.
If you really, positively, definitely have to maintain valid iterators, use std::list. If you also need to have random access, try Boost.MultiIndex (although you'll lose contiguous memory access).
If you look at the standard container adaptors (std::stack, std::queue) you see that they support the intersection of the adaptable containers' interfaces, not their union.
I'd create a second class, whose responsibility would be to return the iterator you are interested in.
It should be parametrized with the same template parameter, and then you can specialize it for any type you want (vector, list, etc.). Inside each specialization you can use whatever method suits that container.
So it's some traits-based solution.
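One way such a traits-style helper might look; Tracker and tracked_after_growth are hypothetical names, and only the vector and list cases from the question are covered.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Primary template left undefined: unsupported containers fail to compile.
template <class Container>
struct Tracker;

// std::vector: push_back may invalidate iterators, so remember the index.
template <class T>
struct Tracker<std::vector<T>> {
    std::size_t index = 0;
    void track(std::vector<T>& c, typename std::vector<T>::iterator it) {
        index = static_cast<std::size_t>(it - c.begin());
    }
    typename std::vector<T>::iterator get(std::vector<T>& c) const {
        return c.begin() + static_cast<std::ptrdiff_t>(index);
    }
};

// std::list: push_back keeps iterators valid, so store the iterator itself.
template <class T>
struct Tracker<std::list<T>> {
    typename std::list<T>::iterator it;
    void track(std::list<T>&, typename std::list<T>::iterator i) { it = i; }
    typename std::list<T>::iterator get(std::list<T>&) const { return it; }
};

// Generic code: tracks the second element, grows the container, reads it back.
template <class Container>
int tracked_after_growth(Container c) {
    Tracker<Container> tracker;
    tracker.track(c, std::next(c.begin()));
    for (int i = 0; i < 1000; ++i) c.push_back(i);
    return *tracker.get(c);
}
```

The generic class only ever calls track() and get(), so the container-specific knowledge stays in the specializations.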
If you really want to stick with the vector and have that functionality, take a look at the capacity function:
http://en.cppreference.com/w/cpp/container/vector/capacity. Wrap your push_back calls in a function or, even better, wrap the whole std::vector in your own class, and before each push_back compare capacity() against size() to check whether a reallocation will happen.
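A small sketch of the capacity check; push_back_checked is a made-up name. A push_back reallocates exactly when the vector is full, that is when size() == capacity().

```cpp
#include <vector>

// Appends value and returns true if the vector had to reallocate, in which
// case all iterators, pointers and references into it were invalidated.
template <class T>
bool push_back_checked(std::vector<T>& v, const T& value) {
    const bool will_reallocate = v.size() == v.capacity();
    v.push_back(value);
    return will_reallocate;
}
```

The caller can then refresh its stored iterator only when the function returns true.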

Should I return an iterator or a pointer to an element in a STL container?

I am developing an engine for porting existing code to a different platform. The existing code has been developed using a third party API, and my engine will redefine those third party API functions in terms of my new platform.
The following definitions come from the API:
typedef unsigned long shape_handle;
shape_handle make_new_shape( int type );
I need to redefine make_new_shape and I have the option to redefine shape_handle.
I have defined this structure ( simplified ):
struct Shape
{
int type;
};
The Caller of make_new_shape doesn't care about the underlying structure of Shape, it just needs a "handle" to it so that it can call functions like:
set_shape_color( myshape, RED );
where myshape is the handle to the shape.
My engine will manage the memory for the Shape objects and other requirements dictate that the engine should be storing Shape objects in a list or other iterable container.
My question is, what is the safest way to represent this handle - if the Shape itself is going to be stored in a std::list - an iterator, a pointer, an index?
Both iterators and pointers will do bad stuff if you try to access them after the object has been deleted, so neither is intrinsically safer. The advantage of an iterator is that it can be used to access other members of your collection.
So, if you just want to access your Shape then a pointer will be simplest. If you want to iterate through your list then use an iterator.
An index is useless in a list since std::list does not overload the [] operator.
The answer depends on your representation:
for std::list, use an iterator (not a pointer), because an iterator allows you to remove the element without walking the whole list.
for std::map or boost::unordered_map, use the Key (of course)
Your design would be much stronger if you used an associative container, because associative containers give you the ability to query for the presence of the object, rather than invoking Undefined Behavior.
Try benchmarking both map and unordered_map to see which one is faster in your case :)
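A sketch of the key-as-handle approach, using std::unordered_map in place of boost::unordered_map. ShapeStore, the color field and destroy_shape are assumptions made for illustration; only shape_handle, make_new_shape and set_shape_color come from the question.

```cpp
#include <unordered_map>

typedef unsigned long shape_handle;  // as in the original API

struct Shape {
    int type;
    int color;  // hypothetical field, only here to have something to set
};

class ShapeStore {
public:
    shape_handle make_new_shape(int type) {
        const shape_handle handle = next_handle_++;
        shapes_.emplace(handle, Shape{type, 0});
        return handle;
    }
    // Returns false instead of touching memory when the handle is stale.
    bool set_shape_color(shape_handle handle, int color) {
        auto it = shapes_.find(handle);
        if (it == shapes_.end()) return false;
        it->second.color = color;
        return true;
    }
    // Hypothetical removal function; returns false if the handle is unknown.
    bool destroy_shape(shape_handle handle) {
        return shapes_.erase(handle) == 1;
    }

private:
    std::unordered_map<shape_handle, Shape> shapes_;
    shape_handle next_handle_ = 1;
};
```

Because handles are never reused in this sketch, a handle to a destroyed shape is simply not found.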
If the internal representation will be a list of Shapes, then pointers and iterators are safe. Once an element is allocated, no relocation will ever occur. I wouldn't recommend an index for obvious access performance reasons: lookup by index is O(n) in a list.
If you were using a vector, then don't use iterators or pointers, because elements can be relocated when you exceed the vector's capacity, and your pointers/iterators would become invalid.
If you want a representation that is safe regardless of the internal container, then create a container (list/vector) of pointers to your shapes, and return the shape pointer to your client. Even if the container is moved around in memory, the Shape objects will stay in the same location.
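A sketch of the container-of-pointers idea, assuming std::unique_ptr for ownership; ShapePool is a made-up name. The vector may move its unique_ptr elements around when it grows, but each Shape stays where it was allocated.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Shape {
    int type;
};

class ShapePool {
public:
    // The returned pointer stays valid until the pool itself is destroyed,
    // no matter how often the vector reallocates.
    Shape* make(int type) {
        shapes_.push_back(std::make_unique<Shape>(Shape{type}));
        return shapes_.back().get();
    }
    std::size_t size() const { return shapes_.size(); }

private:
    std::vector<std::unique_ptr<Shape>> shapes_;
};
```

The engine can still iterate over the shapes through the vector, while clients hold plain Shape pointers as handles.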
Iterators aren't safer than pointers, but they have much better diagnostics than raw pointers if you're using a checked STL implementation!
For example, in a debug build, if you return a pointer to a list element, then erase that list element, you have a dangling pointer. If you access it you get a crash and all you can see is junk data. That can make it difficult to work out what went wrong.
If you use an iterator and you have a checked STL implementation, as soon as you access the iterator to an erased element, you get a message something like "iterator was invalidated". That's because you erased the element it points to. Boom, you just saved yourself potentially a whole lot of debugging effort.
So, no indices, because of the O(n) access. Between pointers and iterators - always iterators!

Storing iterators inside containers

I am building a DLL that another application would use. I want to store the current state of some data globally in the DLL's memory before returning from the function call so that I could reuse state on the next call to the function.
For doing this, I'm having to save some iterators. I'm using a std::stack to store all other data, but I wasn't sure if I could do that with the iterators also.
Is it safe to put list iterators inside container classes? If not, could you suggest a way to store a pointer to an element in a list so that I can use it later?
I know using a vector to store my data instead of a list would have allowed me to store the subscript and reuse it very easily, but unfortunately I'm having to use only an std::list.
Iterators to list are invalidated only if the list is destroyed or the "pointed" element is removed from the list.
Yes, it'll work fine.
Since so many other answers go on about this being a special quality of list iterators, I have to point out that it'd work with any iterators, including vector ones. The fact that vector iterators get invalidated if the vector is modified is hardly relevant to a question of whether it is legal to store iterators in another container -- it is. Of course the iterator can get invalidated if you do anything that invalidates it, but that has nothing to do with whether or not the iterator is stored in a stack (or any other data structure).
It should be no problem to store the iterators, just make sure you don't use them on a copy of the list -- an iterator is bound to one instance of the list, and cannot be used on a copy.
That is, if you do:
std::list<int>::iterator it = myList.begin ();
std::list<int> c = myList;
c.insert (it, ...); // Error
As noted by others: Of course, you should also not invalidate the iterator by removing the pointed-to element.
This might be offtopic, but just a hint...
Be aware that your function(s)/data structure would probably be thread unsafe for read operations. There is a kind of basic thread safety where read operations do not require synchronization. If you are going to store the state of how much the caller has read from your structure, it will make the whole concept thread unsafe and a bit unnatural to use, because nobody assumes a read to be a stateful operation.
If two threads are going to call it they will either need to synchronize the calls or your data structure might end-up in a race condition. The problem in such a design is that both threads must have access to a common synchronization variable.
I would suggest making two overloaded functions. Both are stateless, but one of them should accept a hint iterator, where to start next read/search/retrieval etc. This is e.g. how Allocator in STL is implemented. You can pass to allocator a hint pointer (default 0) so that it quicker finds a new memory chunk.
Regards,
Ovanes
Storing the iterator for the list should be fine. It will not get invalidated unless you remove the same element from the list for which you have stored the iterator. Following quote from SGI site:
Lists have the important property that insertion and splicing do not invalidate iterators to list elements, and that even removal invalidates only the iterators that point to the elements that are removed.
However, note that the previous and next element of the stored iterator may change. But the iterator itself will remain valid.
The same rule applies to an iterator stored in a local variable as in a longer lived data structure: it will stay valid as long as the container allows.
For a list, this means: as long as the node it points to is not deleted, the iterator stays valid. Obviously the node gets deleted when the list is destructed...
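As a small illustration of the answers above, list iterators kept in a std::stack stay usable while the list is modified elsewhere. The names IntList and save_positions are made up for this sketch.

```cpp
#include <list>
#include <stack>

using IntList = std::list<int>;

// Saves an iterator to every element of l; the iterator to the last element
// ends up on top of the stack.
std::stack<IntList::iterator> save_positions(IntList& l) {
    std::stack<IntList::iterator> saved;
    for (auto it = l.begin(); it != l.end(); ++it) saved.push(it);
    return saved;
}
```

After a call such as auto saved = save_positions(l), push_front, push_back and insert on l leave every saved iterator valid; only erasing an element invalidates the iterator that was saved for it.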