Implementing Dijkstra's shortest path algorithm using C++ and STL

I am trying to implement Dijkstra's shortest path algorithm using C++ and STL. Since STL's priority queues do not support a decrease-key operation, I decided to use regular ordered sets. My algorithm is almost identical to this one.
However, I have some concerns. Namely, the ordering of the edges in the set will depend both on the vertex number of the destination and on the weight (as the usual relational operators of std::pair will be used). I believe it should depend only on the weight. If I were to declare the set with a custom comparator that compares only the weights, how would I make std::set::erase work, since it is needed to erase the edges to the same vertex but with greater weight?
Are there any other flaws that you guys can think of? Or do you perhaps have some better ideas than using std::set?
Have a great Sunday, everyone.

Your question seems to confuse the technical implementation with the algorithm.
First, on the technical side: for std::set you seem to need both a special ordering and the erasure of certain elements. The ordering can be changed with a custom comparator, for example see here. However, I would not order only by the weights, as there might be duplicates. Just put the weight in the component of std::pair that has the higher priority (the first component).
Next, in order to erase an element, you must first be sure which one, which is done by providing an iterator pointing to that element. This step is not at all influenced by your custom comparison function.
Quickly summarizing: you should (i) find out exactly which elements need to be erased, (ii) find the corresponding iterators via std::set::find, and (iii) erase them. To me it seems the first point is the real problem here.
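To make that concrete, here is a minimal sketch of Dijkstra's algorithm with a std::set used as the priority structure, roughly in the spirit of the question; the names (dijkstra, adj) and the adjacency-list representation are illustrative assumptions, not taken from the original post. The "decrease-key" becomes an erase of the old (distance, vertex) pair followed by an insert of the improved one.

#include <climits>
#include <set>
#include <utility>
#include <vector>

// Sketch: Dijkstra with std::set<std::pair<dist, vertex>> acting as the
// priority structure. "Decrease-key" is an erase of the old (dist, v) pair
// followed by an insert of the improved one. Names (dijkstra, adj) are
// illustrative, not from the original post.
std::vector<long long> dijkstra(
    const std::vector<std::vector<std::pair<int, int>>>& adj, int source)
{
    const long long INF = LLONG_MAX / 4;
    std::vector<long long> dist(adj.size(), INF);
    std::set<std::pair<long long, int>> queue;   // ordered by (dist, vertex)

    dist[source] = 0;
    queue.insert({0, source});

    while (!queue.empty()) {
        int u = queue.begin()->second;           // vertex with smallest distance
        queue.erase(queue.begin());

        for (auto [v, w] : adj[u]) {
            if (dist[u] + w < dist[v]) {
                // The old entry is findable because we know its exact key.
                if (dist[v] != INF)
                    queue.erase({dist[v], v});
                dist[v] = dist[u] + w;
                queue.insert({dist[v], v});
            }
        }
    }
    return dist;
}

Because the distance sits in the first member of the pair, the default std::pair ordering already sorts by weight first and only falls back to the vertex number to break ties, so no custom comparator is needed and erase can locate the exact entry to remove.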

Related

iterate ordered versus unordered containers

I want to know which data structures are more efficient for iterating through their elements: std::set and std::map, or std::unordered_set and std::unordered_map.
I searched through SO and I found this question. The answers either propose to copy the elements in a std::vector or to use Boost.Container, which IMHO don't answer my question.
My purpose is to keep a large number of unique elements in a container that I mostly want to iterate through. Insertions and removals are rarer. I want to avoid std::vector in combination with std::unique.
Let's consider set vs unordered_set.
The main difference here is the 'nature' of the iteration: traversing a set gives you the elements in order, while traversing a range in an unordered set gives you a bunch of values in no particular order.
Suppose you want to traverse a range [it1, it2]. If we exclude the lookup time needed to find the elements it1 and it2, there can be no direct mapping from one case to the other, since the elements in between are not guaranteed to be the same, even if you used the same elements to construct the container.
There are cases however where something like this has meaning, e.g. when you want to traverse a fixed number of elements (regardless of what they are) or when you need to traverse the whole container. In such cases you need to consider implementation mechanics:
Sets are usually implemented as red-black trees (a form of binary search tree). Like all binary search trees they allow efficient in-order traversal (left, root, right) of their elements. That is, to traverse you pay the cost of pointer chasing (just like traversing a list).
Unordered sets on the other hand are hash tables, and to my knowledge the STL implementations use hashing with chaining. That means (at a very high level) that the structure is a (contiguous) buffer where each element is the head of a chain (list) that contains the elements. The way the elements are laid out across those chains (buckets) and across the buffer will affect the traversal time; you'll be chasing pointers once again, this time jumping through different lists as well. I don't think it will vary significantly from the tree case, but it won't be any better for sure.
In any case micro tuning and benchmarking will give you the answer for your particular application.
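If you do benchmark, a rough sketch of timing a full traversal with <chrono> might look like the following; the container sizes and the accumulate "work" are arbitrary placeholders, not taken from the answer above.

#include <chrono>
#include <cstdio>
#include <numeric>
#include <set>
#include <unordered_set>

// Rough traversal-benchmark sketch; the volatile sink keeps the optimizer
// from removing the loop entirely.
template <typename Container>
long long time_full_traversal(const Container& c)
{
    auto start = std::chrono::steady_clock::now();
    volatile long long sum = std::accumulate(c.begin(), c.end(), 0LL);
    auto stop = std::chrono::steady_clock::now();
    (void)sum;
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

int main()
{
    std::set<int> ordered;
    std::unordered_set<int> unordered;
    for (int i = 0; i < 1'000'000; ++i) { ordered.insert(i); unordered.insert(i); }

    std::printf("set:           %lld us\n", time_full_traversal(ordered));
    std::printf("unordered_set: %lld us\n", time_full_traversal(unordered));
}

As always, compile with optimizations and measure with data shaped like your real workload.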
The difference does not lie in the ordering or the lack of one, but in the backing container. If it is contiguous memory it should be fast to iterate over, due to the simple iterator implementation and cache friendliness.
Unordered containers are usually stored as a vector of vectors (or something similar), while ordered containers are implemented using trees, but this is left to the implementation after all. That would suggest that iterating over the unordered version should be faster. However, since it is left to the implementation, I have seen implementations (which bent the rules a little, to be fair) with different behaviour.
Generally speaking, container performance is quite a complex topic and usually has to be tested in the actual application to get a reliable answer. There is plenty of implementation-defined stuff that might affect the performance. I'd go with hash_set if I had to go in blind. Copying into a vector might also turn out to be a good option.
EDIT: As #TonyD said in his comment, there is a rule that disallows invalidating iterators when adding elements as long as max_load_factor() is not exceeded; this practically rules out backing containers that are contiguous in memory.
Thus, copying everything into a vector seems like an even more reasonable option. If you need to remove duplicates, a feasible option might be to use http://en.cppreference.com/w/cpp/algorithm/sort and then skip the duplicates easily. I have heard that using a vector plus sort to get a sorted array (or vector) is quite a commonly used option when you need a container that has to be sorted and is iterated over more often than it is modified.
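As a sketch of that vector-based idea (assuming the elements are simple ints here), the sort-then-unique step could look like this:

#include <algorithm>
#include <vector>

// Sketch of the vector + sort approach: keep everything in a vector,
// sort once before iterating, and drop duplicates with std::unique.
void sort_and_dedupe(std::vector<int>& values)
{
    std::sort(values.begin(), values.end());
    values.erase(std::unique(values.begin(), values.end()), values.end());
    // values is now sorted and unique; iterating it is a linear scan
    // over contiguous memory.
}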
Iterating from fastest to slowest should be: set > map > unordered_set > unordered_map.
set is a little lighter than map, and both are ordered following the binary-tree rule, so they should be faster than the unordered_ containers.

C++11 Associative Container that keeps insertion order?

Is there, in the C++ Standard Library, any "associative" (i.e. key-value) container/data structure that has the ability to preserve order by order of insertion?
I have seen several topics on this; however, it seems most are from before C++11.
Some suggest using boost::multi_index, but, if at all possible, I would rather use standard containers/structures.
I see that C++11 has several, apparently "unordered", associative containers (link).
Are any of these, in some way, "configurable" such that they are ordered only by insertion order?
Thanks!
No.
You are mixing linear access with random access. Not very good bedfellows.
Just use both a vector/list (i.e. the order of insertion) and a map holding an index into the former.
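A minimal sketch of that combination; the type name InsertionOrderedMap and the key/value types are purely illustrative assumptions:

#include <string>
#include <unordered_map>
#include <vector>

// Sketch: the map gives fast lookup, the vector remembers insertion order.
// InsertionOrderedMap is an illustrative name, not a standard type.
struct InsertionOrderedMap {
    std::unordered_map<std::string, int> values;
    std::vector<std::string> insertion_order;

    void insert(const std::string& key, int value)
    {
        if (values.emplace(key, value).second)   // only record first insertion
            insertion_order.push_back(key);
        else
            values[key] = value;                 // overwrite existing value
    }

    template <typename Fn>
    void for_each_in_order(Fn fn) const
    {
        for (const auto& key : insertion_order)
            fn(key, values.at(key));
    }
};

int main()
{
    InsertionOrderedMap m;
    m.insert("b", 2);
    m.insert("a", 1);
    m.for_each_in_order([](const std::string& k, int v) { (void)k; (void)v; });
}

Erasure would also have to update the vector of keys, which is the main cost of this approach.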
No; such capability was apparently sacrificed in the name of performance.
The order of equivalent items is required to be preserved across operations including rehashes, but there's no way to specify the original order. You could, in theory, use std::rotate or the like to permute the objects into the desired order after each insertion. Obviously impractical, but it proves the lack of capability is a little arbitrary.
Your best bet is to keep the subsequences in inner containers. You may use an iterator adaptor to iterate over such a "deep" container as if it were a single sequence. Such a utility can probably be found in Boost.
No.
In unordered maps too, elements are not stored according to the order of insertion.
You can use a vector to keep track of the keys!

which std container to use in A* algorithm's openSet?

I'm implementing the A* algorithm using std::priority_queue for the openSet. At some point in the algorithm, as in the Wikipedia pseudo-code:
else if tentative_g_score < g_score[neighbor]
tentative_is_better := true
followed by
if tentative_is_better = true
came_from[neighbor] := current
g_score[neighbor] := tentative_g_score
f_score[neighbor] := g_score[neighbor] + h_score[neighbor]
means that one has to search the priority_queue and change the value of one of its elements, which is not possible (as far as I understood).
Also, on this line:
if neighbor not in openset
one cannot search a priority_queue, so this if cannot be implemented with a priority_queue either. I solved that by creating a std::set which only tells us which elements are in the openSet (so that when I add/remove an element to/from the openSet, I add/remove it in both the std::set and the std::priority_queue).
So, I wonder how can I avoid the first problem, or which std::container should one really use for this particular (yet general A*) implementation.
More generically, I wonder which is an efficient approach to A* using std containers?
I implemented the A* algorithm with the STL before and ran into roughly the same situation.
I ended up working with std::vector only, using standard algorithms like push_heap and pop_heap (which are what priority_queue uses internally) to keep it in order.
To be clear: you should implement it with vectors and use the algorithms to manipulate the vectors and keep them in a good state. It's far easier, and potentially more efficient, than doing it with some of the alternatives.
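A minimal sketch of that approach; the Node type and its fields are illustrative assumptions:

#include <algorithm>
#include <vector>

// Sketch: keep the open set in a std::vector and maintain the heap property
// with std::push_heap / std::pop_heap.
struct Node {
    int id;
    double f;   // g + h
};

struct GreaterF {
    bool operator()(const Node& a, const Node& b) const { return a.f > b.f; }
};

int main()
{
    std::vector<Node> open;

    // push a node
    open.push_back({42, 3.5});
    std::push_heap(open.begin(), open.end(), GreaterF{});

    // pop the node with the smallest f
    std::pop_heap(open.begin(), open.end(), GreaterF{});
    Node best = open.back();
    open.pop_back();
    (void)best;

    // after changing a node's f in place, you can restore the invariant
    // with std::make_heap (O(n)), which is the simple but not optimal fix
    std::make_heap(open.begin(), open.end(), GreaterF{});
}

std::make_heap after an in-place score change is the simple O(n) fix; a hand-rolled sift-up would be needed to get true O(log n) decrease-key behaviour.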
Update:
Today I would certainly try some of the Boost containers, like these: http://www.boost.org/doc/libs/1_55_0/doc/html/heap.html But only if I'm allowed to use Boost (for my own code, for example).
You can solve this by relying on the algorithm's behavior. Use a standard priority_queue, but instead of an increase/decrease-key operation, insert a new entry for the node into the priority queue. Both entries now live in the priority queue. The one with the better priority will be taken out first, expanded, and added to the closed list. When the duplicate entry with the worse priority is taken out later, the node is already closed and is therefore discarded.
Unfortunately, the std:: containers don't currently support the operations you require - what's really needed is an "indexed" priority queue that supports decrease/increase_key style operations.
One option is to roll your own container (based on an augmented binary heap) that does this, if this sounds like too much work, you can almost fake it by making use of an augmented std::set data structure - both options are discussed in more detail here.
As others have said, another option is to just remove the priority queue entirely and try to maintain a sorted std::vector. This approach will work for sure, and might require the least coding on your part, but it does have significant implications for the asymptotic scaling of the overall algorithm - it will no longer be possible to achieve the fast O(log(n)) updates of the queue while maintaining sorted order.
Hope this helps.
Without decrease_key, you can instead just re-add the node to the open set. Whenever you pop a node off the open set, check to see whether its key was greater than that node's current score; if so, continue without processing the node. That compromises the efficiency proof of A*, but in practice it isn't a serious issue.
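A sketch of that lazy-deletion idea with a plain std::priority_queue; the Entry alias, the best_f array, and the scores pushed are illustrative assumptions rather than anything from the answers above:

#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Sketch of the "lazy deletion" trick: never decrease a key, just push an
// improved (f, node) pair and skip entries that are out of date when popped.
// g_score / f_score bookkeeping from the surrounding A* loop is omitted.
using Entry = std::pair<double, int>;   // (f value, node id)

int main()
{
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    std::vector<double> best_f(100, 1e18);   // illustrative size

    // relaxing node 7 twice just pushes two entries
    open.push({4.0, 7});  best_f[7] = 4.0;
    open.push({2.5, 7});  best_f[7] = 2.5;   // improved path found later

    while (!open.empty()) {
        auto [f, node] = open.top();
        open.pop();
        if (f > best_f[node])
            continue;                        // stale entry, already improved
        (void)node;
        // ... expand `node` here ...
    }
}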

What's a good and stable C++ tree implementation?

I'm wondering if anyone can recommend a good C++ tree implementation, hopefully one that is STL-compatible if at all possible.
For the record, I've written tree algorithms many times before, and I know it can be fun, but I want to be pragmatic and lazy if at all possible. So an actual link to a working solution is the goal here.
Note: I'm looking for a generic tree, not a balanced tree or a map/set; the structure itself and the connectivity of the tree is important in this case, not only the data within.
So each branch needs to be able to hold arbitrary amounts of data, and each branch should be separately iterable.
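To make the requirement concrete, a minimal hand-rolled sketch of that kind of n-ary structure might look like the following (purely illustrative, and nowhere near a full STL-compatible container):

#include <memory>
#include <string>
#include <vector>

// Minimal n-ary tree sketch: each node owns arbitrary data plus any number
// of children, and the children of a node can be iterated independently.
template <typename T>
struct TreeNode {
    T data;
    std::vector<std::unique_ptr<TreeNode>> children;

    explicit TreeNode(T value) : data(std::move(value)) {}

    TreeNode& add_child(T value)
    {
        children.push_back(std::make_unique<TreeNode>(std::move(value)));
        return *children.back();
    }
};

int main()
{
    TreeNode<std::string> root("root");
    auto& docs = root.add_child("docs");
    docs.add_child("readme.txt");
    root.add_child("src");

    // iterate one branch's children separately
    for (const auto& child : root.children) { (void)child; }
}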
I don't know about your requirements, but wouldn't you be better off with a graph (implementations for example in Boost Graph) if you're interested mostly in the structure and not so much in tree-specific benefits like speed through balancing? You can 'emulate' a tree through a graph, and maybe it'll be (conceptually) closer to what you're looking for.
Take a look at this.
The tree.hh library for C++ provides an STL-like container class for n-ary trees, templated over the data stored at the nodes. Various types of iterators are provided (post-order, pre-order, and others). Where possible the access methods are compatible with the STL or alternative algorithms are available.
HTH
I am going to suggest using std::map instead of a tree.
The complexity characteristics of a tree are:
Insert: O(log n)
Removal: O(log n)
Find: O(log n)
These are the same characteristics that std::map guarantees.
As a result, most implementations of std::map use a tree (a red-black tree) under the covers (though technically this is not required).
If you don't have (key, value) pairs, but simply keys, use std::set. That uses the same Red-Black tree as std::map.
OK folks, I found another tree library: stlplus.ntree. But I haven't tried it out yet.
Let's suppose the question is about balanced (in some form, mostly red-black) binary trees, even if that is not quite the case.
Balanced binary trees, like vector, let you maintain an ordering of elements without any need for a key (e.g. by inserting elements anywhere, as in a vector), but:
With optimal O(log n) or better complexity for every single-element modification (add/remove at the beginning, at the end, and before or after any iterator).
With persistence of iterators through any modification except the direct destruction of the element pointed to by the iterator.
Optionally, one may support access by index as in a vector (at a cost of one size_t per element), with O(log n) complexity. If used, the iterators would be random access.
Optionally, an order can be enforced by some comparison function, but the persistence of iterators allows the use of a non-repeatable comparison scheme (e.g. arbitrary lane changes during a traffic jam).
In practice, a balanced binary tree can offer the interface of vector, list, doubly linked list, map, multimap, deque, queue, priority_queue... while attaining the theoretically optimal O(log n) complexity for all single-element operations.
<sarcastic> this is probably why the C++ STL does not offer it </sarcastic>
Individuals may not implement a general balanced tree by themselves, due to the difficulty of getting the balancing right, especially during element extraction.
There is no widely available implementation of such a balanced binary tree because the state-of-the-art red-black tree implementation (at this time the best type of balanced tree, thanks to its fixed number of costly tree reorganizations during removal), slavishly copied by every implementer from the structure inventor's original code, does not allow iterator persistence. That is probably the reason for the absence of a fully functional tree template.

Dynamically sorted STL containers

I'm fairly new to the STL, so I was wondering whether there are any dynamically sortable containers? At the moment my current thinking is to use a vector in conjunction with the various sort algorithms, but I'm not sure whether there's a more appropriate selection given the (presumably) linear complexity of inserting entries into a sorted vector.
To clarify "dynamically": I am looking for a container whose sort order I can modify at runtime - e.g. sort it in ascending order, then later re-sort it in descending order.
You'll want to look at std::map
std::map<keyType, valueType>
The map is sorted based on the < operator provided for keyType.
Or
std::set<valueType>
Also sorted on the < operator of the template argument, but does not allow duplicate elements.
There's
std::multiset<valueType>
which does the same thing as std::set but allows identical elements.
I highly recommend "The C++ Standard Library" by Josuttis for more information. It is the most comprehensive overview of the standard library, very readable, and chock full of obscure and not-so-obscure information.
Also, as mentioned by 17 of 26, Effective STL by Meyers is worth a read.
If you know you're going to be sorting on a single value ascending and descending, then set is your friend. Use a reverse iterator when you want to "sort" in the opposite direction.
If your objects are complex and you're going to be sorting in many different ways based on the member fields within the objects, then you're probably better off using a vector and sort. Try to do your inserts all at once and then call sort once. If that isn't feasible, then deque may be a better option than vector for large collections of objects.
I think that if you're interested in that level of optimization, you had better be profiling your code using actual data. (Which is probably the best advice anyone here can give: it may not matter that you call sort after each insert if you're only doing it once in a blue moon.)
It sounds like you want a multi-index container. This allows you to create a container and tell that container the various ways you may want to traverse the items in it. The container then keeps multiple lists of the items, and those lists are updated on each insert/delete.
If you really want to re-sort the container, you can call the std::sort function on any std::deque, std::vector, or even a simple C-style array. That function takes an optional third argument to determine how to sort the contents.
The stl provides no such container. You can define your own, backed by either a set/multiset or a vector, but you are going to have to re-sort every time the sorting function changes by either calling sort (for a vector) or by creating a new collection (for set/multiset).
If you just want to change from increasing sort order to decreasing sort order, you can use the reverse iterator on your container by calling rbegin() and rend() instead of begin() and end(). Both vector and set/multiset are reversible containers, so this would work for either.
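A small sketch of both options mentioned above (re-sorting a vector with a different comparator, and walking a set backwards with its reverse iterators); the values are arbitrary:

#include <algorithm>
#include <functional>
#include <set>
#include <vector>

int main()
{
    // Re-sorting a vector with a different comparator at runtime.
    std::vector<int> v{3, 1, 4, 1, 5};
    std::sort(v.begin(), v.end());                        // ascending
    std::sort(v.begin(), v.end(), std::greater<int>());   // descending

    // A std::set stays sorted; "descending" is just reverse iteration.
    std::set<int> s(v.begin(), v.end());
    for (auto it = s.rbegin(); it != s.rend(); ++it) { (void)*it; }
}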
std::set is basically a sorted container.
You should definitely use a set/map. Like hazzen says, you get O(log n) insert/find. You won't get this with a sorted vector; you can get O(log n) find using binary search, but insertion is O(n) because inserting (or deleting) an item may cause all existing items in the vector to be shifted.
It's not that simple. In my experience insert/delete is used less often than find. The advantage of a sorted vector is that it takes less memory and is more cache-friendly. If you happen to have a version that is compatible with STL maps (like the one I linked before), it's easy to switch back and forth and use the optimal container for each situation.
In theory an associative container (set, multiset, map, multimap) should be your best solution.
In practice it depends on the average number of elements you are putting in it.
For fewer than 100 elements a vector is probably the best solution, due to:
- avoiding continuous allocation/deallocation
- cache friendliness thanks to data locality
These advantages will probably outweigh the cost of continuously re-sorting.
Obviously it also depends on how many insertions/deletions you have to do. Are you going to do per-frame insertions/deletions?
More generally: are you talking about a performance-critical application?
Remember not to optimize prematurely...
The answer is as always it depends.
set and multiset are appropriate for keeping items sorted, but are generally optimised for a balanced mix of add, remove and fetch. If you mainly have lookup operations, then a sorted vector may be more appropriate, and you can then use lower_bound to look up an element.
Also, your second requirement of re-sorting in a different order at runtime actually means that set and multiset are not appropriate, because the predicate cannot be modified at run time.
I would therefore recommend a sorted vector. But remember to pass the same predicate to lower_bound that you passed to the previous sort, as the results would otherwise be undefined and most likely wrong.
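As a sketch of that advice (the descending comparator and the int values are just placeholders):

#include <algorithm>
#include <vector>

// Sketch: a vector kept sorted with one predicate; the *same* predicate must
// be passed to lower_bound, otherwise the result is undefined.
int main()
{
    auto by_descending = [](int a, int b) { return a > b; };

    std::vector<int> v{5, 1, 9, 3};
    std::sort(v.begin(), v.end(), by_descending);              // 9 5 3 1

    // insert 4 while keeping the descending order
    auto pos = std::lower_bound(v.begin(), v.end(), 4, by_descending);
    v.insert(pos, 4);                                          // 9 5 4 3 1

    // lookup
    auto hit = std::lower_bound(v.begin(), v.end(), 5, by_descending);
    bool found = (hit != v.end() && *hit == 5);
    (void)found;
}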
Set and multiset use an underlying binary tree; you can define your own comparison operator for them (it must be a strict weak ordering, such as <). These containers keep themselves sorted, so they may not be the best choice if you are switching sort parameters. Vectors and lists are probably best if you are going to be re-sorting quite a bit; in general, list has its own sort (usually a merge sort), and you can use the STL binary search algorithms on vectors. If inserts dominate, list outperforms vector.
STL maps and sets are both sorted containers.
I second Doug T's book recommendation - the Josuttis STL book is the best I've ever seen as both a learning and reference book.
Effective STL is also an excellent book for learning the inner details of STL and what you should and shouldn't do.
For "STL compatible" sorted vector see A. Alexandrescu's AssocVector from Loki.