which std container to use in A* algorithm's openSet? - c++

I'm implementing the A* algorithm using std::priority_queue for the openSet. At some point in the algorithm, as in the Wikipedia pseudo-code:
else if tentative_g_score < g_score[neighbor]
tentative_is_better := true
followed by
if tentative_is_better = true
came_from[neighbor] := current
g_score[neighbor] := tentative_g_score
f_score[neighbor] := g_score[neighbor] + h_score[neighbor]
means that one has to search the priority_queue and change the value of one of its elements, which is not possible (as far as I understood).
Also, on this line:
if neighbor not in openset
one cannot search a priority_queue, so this if cannot be implemented with a priority_queue alone. I solved that by creating a std::set which only tells us which elements are in the openSet (so whenever I add/remove an element to/from the openSet, I add/remove it in both the std::set and the std::priority_queue).
So, I wonder how I can avoid the first problem, or which std container one should really use for this particular (yet general A*) implementation.
More generally, what is an efficient approach to A* using std containers?

I have implemented the A* algorithm with the STL before and ran into roughly the same situation.
I ended up just working with std::vector only, using standard algorithms like push_heap and pop_heap (which are what priority_queue uses) to keep them in order.
To be clear: you should implement it with vectors and use the heap algorithms to manipulate them and keep them in a valid state. Doing it that way is far easier and potentially more efficient than some of the alternatives.
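For illustration, here's a minimal sketch of that approach (the node type, field names, and min-heap comparator are my own assumptions, not taken from the question):

#include <algorithm>
#include <vector>

// Hypothetical open-set entry; the fields are just for illustration.
struct OpenNode {
    int id;     // node identifier
    float f;    // f = g + h
};

// Inverted comparison so the std heap algorithms (max-heap by default)
// behave as a min-heap on f.
struct ByF {
    bool operator()(const OpenNode& a, const OpenNode& b) const {
        return a.f > b.f;
    }
};

void push_open(std::vector<OpenNode>& open, const OpenNode& n) {
    open.push_back(n);
    std::push_heap(open.begin(), open.end(), ByF());
}

OpenNode pop_open(std::vector<OpenNode>& open) {
    std::pop_heap(open.begin(), open.end(), ByF());  // moves the best node to the back
    OpenNode best = open.back();
    open.pop_back();
    return best;
}

The point is that the vector itself is the heap, so you keep full access to its elements while push_heap/pop_heap maintain the ordering.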
Update:
Today I would certainly try some of the Boost containers, like these: http://www.boost.org/doc/libs/1_55_0/doc/html/heap.html - but only if I'm allowed to use Boost (as in my own code, for example).

You can solve this by relying on the algorithm's behavior. Use a standard priority_queue, but instead of performing increase/decrease_key operations, insert a new copy of the node into the priority queue. Both copies now live in the queue. The one with the better priority will be taken out first, expanded, and added to the closed list. When the other copy, with the worse priority, is taken out later, the node is already closed and is therefore discarded.
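A rough sketch of what that main loop looks like (the node layout, comparator, and closed-set representation are assumptions of mine, not from the answer above):

#include <queue>
#include <unordered_set>
#include <vector>

struct Node { int id; float f; };
struct ByF {
    bool operator()(const Node& a, const Node& b) const { return a.f > b.f; }
};

// Duplicates are allowed into the open queue; a popped node that is
// already closed is simply skipped.
void search(std::priority_queue<Node, std::vector<Node>, ByF>& open,
            std::unordered_set<int>& closed)
{
    while (!open.empty()) {
        Node current = open.top();
        open.pop();
        if (closed.count(current.id))   // stale duplicate, already expanded
            continue;
        closed.insert(current.id);
        // ... expand 'current' here, pushing improved copies of its
        // neighbours into 'open' instead of decreasing their keys
    }
}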

Unfortunately, the std:: containers don't currently support the operations you require - what's really needed is an "indexed" priority queue that supports decrease/increase_key style operations.
One option is to roll your own container (based on an augmented binary heap) that does this. If that sounds like too much work, you can almost fake it by making use of an augmented std::set data structure - both options are discussed in more detail here.
As others have said, another option is to just remove the priority queue entirely and try to maintain a sorted std::vector. This approach will work for sure, and might require the least coding on your part, but it does have significant implications for the asymptotic scaling of the overall algorithm - it will no longer be possible to achieve the fast O(log(n)) updates of the queue while maintaining sorted order.
Hope this helps.

Without decrease_key, you can instead just re-add the node to the open set. Whenever you pop a node off the open set, check to see whether its key was greater than that node's current score; if so, continue without processing the node. That compromises the efficiency proof of A*, but in practice it isn't a serious issue.
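In code, the stale-entry check might look roughly like this (the entry layout and the g_score container are assumed names, not from the question):

#include <unordered_map>

// Each queued entry remembers the g value it was pushed with. If a better
// path to that node has been found since, the stored value is out of date
// and the popped entry can be skipped.
struct Entry {
    int id;
    float g;   // g value at the time of pushing
    float f;   // priority used by the open set
};

bool is_stale(const Entry& e, const std::unordered_map<int, float>& g_score) {
    std::unordered_map<int, float>::const_iterator it = g_score.find(e.id);
    return it != g_score.end() && e.g > it->second;
}

// In the main loop:
//   Entry current = open.top(); open.pop();
//   if (is_stale(current, g_score)) continue;   // skip without processing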

Related

iterate ordered versus unordered containers

I want to know which data structures are more efficient for iterating through their elements: std::set and std::map, or std::unordered_set and std::unordered_map.
I searched through SO and I found this question. The answers either propose to copy the elements in a std::vector or to use Boost.Container, which IMHO don't answer my question.
My purpose is to keep a big number of unique elements in a container, which most of the time I just want to iterate through. Insertions and removals are rarer. I want to avoid std::vector in combination with std::unique.
Let's consider set vs unordered_set.
The main difference here is the 'nature' of the iteration: traversing the set will give you the elements in order, while traversing a range in an unordered set will give you a bunch of values in no particular order.
Suppose you want to traverse a range [it1, it2]. If we exclude the lookup time needed to find the elements it1 and it2, there can be no direct mapping from one case to the other, since the elements in between are not guaranteed to be the same even if you've used the same elements to construct the container.
There are cases, however, where such a comparison is meaningful, e.g. when you want to traverse a fixed number of elements (regardless of what they are) or when you need to traverse the whole container. In such cases you need to consider the implementation mechanics:
Sets are usually implemented as red-black trees (a form of binary search tree). Like all binary search trees, they allow efficient in-order traversal (left, root, right) of their elements. That is, to traverse them you pay the cost of pointer chasing (just like traversing a list).
Unordered sets, on the other hand, are hash tables, and to my knowledge the STL implementations use hashing with chaining. That means (at a very high level) that the structure is a (contiguous) buffer of buckets, where each bucket is the head of a chain (list) that contains the elements. The way the elements are laid out across those chains (buckets) and across the buffer will affect the traversal time; however, you'll be chasing pointers once again, this time jumping through different lists as well. I don't think it'll differ significantly from the tree case, but it certainly won't be any better.
In any case micro tuning and benchmarking will give you the answer for your particular application.
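If you do want to benchmark it for your own data, here is a rough sketch (C++11; the element count and element type are arbitrary choices of mine):

#include <chrono>
#include <cstdio>
#include <set>
#include <unordered_set>

// Sum the elements so the traversal can't be optimized away entirely.
template <class Container>
long long sum_all(const Container& c) {
    long long sum = 0;
    for (int x : c) sum += x;
    return sum;
}

int main() {
    std::set<int> s;
    std::unordered_set<int> us;
    for (int i = 0; i < 1000000; ++i) { s.insert(i); us.insert(i); }

    using clock = std::chrono::steady_clock;
    clock::time_point t0 = clock::now();
    long long r1 = sum_all(s);
    clock::time_point t1 = clock::now();
    long long r2 = sum_all(us);
    clock::time_point t2 = clock::now();

    std::printf("set:           %lld ms (sum %lld)\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count(), r1);
    std::printf("unordered_set: %lld ms (sum %lld)\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count(), r2);
    return 0;
}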
The difference does not lie in the ordering or lack of it, but in the backing container. If it's contiguous memory, it should be fast to iterate over, due to the simple iterator implementation and cache friendliness.
Unordered containers are usually stored as a vector of vectors (or something similar), while ordered containers are implemented using trees. This would suggest that iterating over the unordered version should be faster. However, it is ultimately left to the implementation, and I have seen implementations (which bent the rules a little, to be fair) with different behaviour.
Generally speaking, container performance is quite a complex topic and usually has to be tested in the actual application to get a reliable answer. There is plenty of implementation-defined stuff that might affect the performance. I'd go with the hash set if I had to go in blind. Copying into a vector might also turn out to be a good option.
EDIT: As @TonyD said in their comment, there is a rule that disallows invalidating iterators during the addition of an element as long as max_load_factor() is not exceeded; this practically rules out backing containers which are contiguous in memory.
Thus, copying everything into a vector seems like an even more reasonable option. If you need to remove duplicates, a feasible option might be to use http://en.cppreference.com/w/cpp/algorithm/sort and then skip the duplicates easily. I have heard that a vector plus sort is quite often the preferred option when a container needs to be sorted and is iterated over more often than it is modified.
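For example, the sort-then-dedupe step is just a couple of lines:

#include <algorithm>
#include <vector>

// Sort the vector, then drop adjacent duplicates in one pass.
void sort_and_dedupe(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
}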
Iteration from fastest to slowest should be: set > map > unordered_set > unordered_map.
set is a little lighter than map, and both are ordered as binary trees, so they should be faster to iterate than the unordered_ containers.

Implementing Dijkstra's shortest path algorithm using C++ and STL

I am trying to implement Dijkstra's shortest path algorithm using C++ and STL. Since STL's priority queues do not support a decrease-key operation, I decided to use regular ordered sets. My algorithm is almost identical to this one.
However, I have some concerns. Namely, the ordering of the edges in the set will depend both upon the vertex number of the destination and the weight (as the regular relational operators of std::pair will be used). I do believe that it should only depend on the weight. If I were to declare the set by using a custom comparator which would compare only the weights, how would I make std::set::erase work, as it is needed to erase the edges between the same vertices but with greater weight?
Are there any other flaws that you guys can think of? Or do you perhaps have some better ideas than using std::set?
Have a great Sunday, everyone.
Your question seems to confuse the technical implementation with the algorithm.
First, on the technical side: for std::set you seem to need a special ordering as well as erasure of certain elements. The ordering can be changed with a custom comparator, for example see here. However, I would not order only by the weights, as there might be duplicates. Just put the weights in the component of std::pair which has higher priority in the comparison (the first component).
Next, in order to erase an element, you must first identify which one, which is done by providing an iterator pointing to that element. This step is not influenced at all by your custom comparison function.
Quickly summarizing: you should (i) find out which elements need to be erased exactly, (ii) find the corresponding iterators via std::set::find and (iii) erase them. To me it seems as if the first point would be the problem here.
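For what it's worth, the usual erase-then-reinsert step with a set of (weight, vertex) pairs looks roughly like this (the dist array and function name are my own illustration):

#include <set>
#include <utility>
#include <vector>

// The set orders pairs lexicographically, so the weight dominates and the
// vertex number merely breaks ties; every element is unique, which makes
// erasing the stale entry straightforward.
void relax(std::set<std::pair<int, int> >& queue, std::vector<int>& dist,
           int v, int new_dist)
{
    if (new_dist < dist[v]) {
        queue.erase(std::make_pair(dist[v], v));   // no-op if not queued yet
        dist[v] = new_dist;
        queue.insert(std::make_pair(new_dist, v));
    }
}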

Does changing a priority queue element result in resorting the queue?

I have a priority_queue, and I want to modify some of its contents (the priority value). Will the queue be re-sorted then?
It depends on whether it re-sorts on push/pop (more probable, because then you just need to "insert" rather than re-sort the whole thing) or when accessing top or pop.
I really want to change some elements in the queue. Something like that:
priority_queue<int> q;
int a=2,b=3,c=5;
int *ca=&a, *cb=&b, *cc=&c;
q.push(a);
q.push(b);
q.push(c); //q is now {2,3,5}
*ca=4;
//what happens to q?
// 1) {3,4,5}
// 2) {4,2,5}
// 3) crash
priority_queue copies the values you push into it. Your assignment at the end there will have zero effect on the order of the priority queue, nor the values stored inside of it.
Unfortunately, the std::priority_queue class doesn't support the increase/decrease_key operations that you're looking for. Of course it's possible to find the element within the heap you want to update, and then call make_heap to restore the binary heap invariants, but this can't be done as efficiently as it should be with the std:: container/algorithms. Scanning the heap to find the item is O(N) and then make_heap is O(N) on top of that - it should be possible to do increase/decrease_key in O(log(N)) for binary heaps that properly support updates.
Boost provides a set of priority queue implementations, which are potentially more efficient than the std::priority_queue (pairing heaps, Fibonacci heaps, etc) and also offer mutability, so you can efficiently perform dynamic updates. So all round, using the boost containers is potentially a much better option.
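For instance, with boost::heap::fibonacci_heap the push operation hands back a handle that can later be used to update the element's priority (a rough sketch; check the Boost.Heap documentation for the exact interface and guarantees):

#include <iostream>
#include <boost/heap/fibonacci_heap.hpp>

int main() {
    boost::heap::fibonacci_heap<int> heap;   // max-heap with the default comparator
    boost::heap::fibonacci_heap<int>::handle_type h = heap.push(2);
    heap.push(3);
    heap.push(5);

    heap.update(h, 7);                 // change the element and restore heap order
    std::cout << heap.top() << '\n';   // prints 7
    return 0;
}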
Okay, after searching a bit I found out how to "re-sort" the queue: after each priority value change you need to call:
std::make_heap(const_cast<Type**>(&queue.top()),
               const_cast<Type**>(&queue.top()) + queue.size(),
               ComparerClass());
And the queue must then be declared as:
std::priority_queue<Type*,vector<Type*>,ComparerClass> queue;
Hope this helps.
I stumbled on this issue while considering the use of priority queues for an A* algorithm.
Basically, C++ priority queues are a very limited toy.
Dynamically changing the priority of a queued element requires manually performing a complete reconstruction of the underlying heap, which is not even guaranteed to work on a given STL implementation and is grossly inefficient.
Besides, reconstructing the heap requires butt-ugly code, which would have to be hidden in yet another obfuscated class/template.
As for so many other things in C++, you'll have to reinvent the wheel, or find whatever fashionable library that reinvented it for you.

Need some advice to choose the proper container

I'm trying to design a task scheduler for a game engine. A task could be an animation, a trigger controller, etc.
My problem is what container to choose. The idea is: when you insert a new task, the container must reorder itself and put the task in the proper place. Once executed, a task could change and be scheduled again, or be deleted. So it's mainly push and pop.
But, if possible, it would be nice to have random access to an element, though that's not vital. It doesn't matter whether the container supports one or more elements with the same key.
I think that a priority queue fits my needs, but I saw that it is based on a vector by default, and I think this container should somehow be optimized for push and pop.
Opinions?
[image] (source: adrinael.net; original source: Liam Devine)
A priority queue seems to be the best option for you.
As you can see, accessing the top element is constant time, while push and pop are logarithmic in time.
std::vector is pretty good for this task, especially if the "steady-state" size of the container remains reasonably constant (the number of tasks in the queue doesn't vary widely).
If you need an updatable queue (and std::priority_queue is not one), I would suggest the d_ary_heap_indirect (which can be found in the Boost.Graph "detail" folder). It is a priority queue used a lot for Dijkstra and A* algorithms, which require an updatable priority queue. Random access is necessary anyway. Also, the indirection makes insertion into and deletion from the queue quite efficient. Finally, you can choose your underlying container (as a template argument), but it has to be random-access (so you can try either vector or deque). Examining the top element is constant-time, push and/or update is log-time, and the proper choice of container makes insertion into the underlying container amortized constant (and the d_ary_heap_indirect amortizes a second time on top of that, so I wouldn't worry about it).
The vector is optimized for push and pop at one end. :-)
To prioritize you will have to sort the tasks. A vector isn't that bad, if the number of objects is reasonably small, even if it means copying objects during the sort.
Other containers, like linked lists, instead suffer from the need to allocate a new node for each object.
You can specify the container type you want with std::priority_queue.
However: you're presumably storing pointers (since it sounds like what you're storing is polymorphic and has identity), so copying is cheap. You're managing it as a heap (that's what std::priority_queue does), so insertions are done using push_back plus a number of swaps (lg(n) at most). I can't see any reason to even consider another structure than std::vector.
std::priority_queue does hide all of the direct access operators (e.g. operator[]). It does this because if you modify an entry, you're likely to invalidate the heap property (which is an invariant of the class). If you do want to provide direct read access, however, the underlying container is only protected, not private, so you can derive from std::priority_queue and add the operators you want. I'd very much limit it to const operators, however.
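For example, a minimal sketch of that derivation (the class name is mine; note that the elements come back in heap order, not sorted order):

#include <functional>
#include <queue>
#include <vector>

// std::priority_queue keeps its underlying container in the protected
// member 'c', so a derived class can expose read-only access to it.
template <class T,
          class Container = std::vector<T>,
          class Compare = std::less<typename Container::value_type> >
class inspectable_priority_queue
    : public std::priority_queue<T, Container, Compare>
{
public:
    typedef typename Container::const_reference const_reference;
    typedef typename Container::size_type size_type;

    const_reference operator[](size_type i) const { return this->c[i]; }
};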
It depends on how often you're going to be adding tasks and pulling tasks off (and presumably executing them), and how many there are.
If you're going to have tons of little tasks, then prefer the priority queue, because the cost of node allocation will probably not hurt you as much as the asymptotic n log n growth of the sort.
If you're going to have a small number of tasks that constantly keep changing priority, then sorting a vector might be reasonable, but you want to use a sorting algorithm that works well when the list is almost sorted.
Scheduling is an art, though, and you're going to have to profile it once you build it. There's probably too little information at this point to say. I'd lean towards a priority queue, but keep other options in mind if performance isn't adequate.

C++ boost - Is there a container working like a queue with direct key access?

I was wondering about a queue-like container which also has key access, like a map.
My goal is simple: I want a FIFO queue, but if I insert an element whose key is already present in the queue, I want the new element to replace the one already in the queue. For example, a map ordered by insertion time would work.
If there is no container like that, do you think it can be implemented by using both a queue and a map?
Boost multi-index provides this kind of container.
To implement it myself, I'd probably go for a map whose values consist of a linked list node plus a payload. The list node could be hand-rolled, or could be Boost intrusive.
Note that the main point of the queue adaptor is to hide most of the interface of Sequence, but you want to mess with the details it hides. So I think you should aim to reproduce the interface of queue (slightly modified with your altered semantics for push) rather than actually use it.
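A rough sketch of the multi-index route (the Item type, field names, and the in-place replace policy are my own assumptions):

#include <string>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/sequenced_index.hpp>

namespace bmi = boost::multi_index;

struct Item {
    int key;
    std::string payload;
};

// Index 0 keeps FIFO (insertion) order; index 1 gives hashed lookup by key.
typedef bmi::multi_index_container<
    Item,
    bmi::indexed_by<
        bmi::sequenced<>,
        bmi::hashed_unique< bmi::member<Item, int, &Item::key> >
    >
> KeyedQueue;

void push_or_replace(KeyedQueue& q, const Item& item) {
    KeyedQueue::nth_index<1>::type& by_key = q.get<1>();
    KeyedQueue::nth_index<1>::type::iterator it = by_key.find(item.key);
    if (it != by_key.end())
        by_key.replace(it, item);    // same key: overwrite, keeping its queue position
    else
        q.get<0>().push_back(item);  // new key: append at the back of the FIFO
}

Popping is then just reading and erasing q.get<0>().front().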
Obviously what you want can be done with just the queue-like container, but then you would have to spend O(n) time on every insertion to determine whether the element is already present. If you base your queue on something like a std::vector kept sorted by key, you could use binary search and basically speed that check up to O(log n) (insertion would still require O(n) operations when elements are shifted or the memory is reallocated).
If this is fine, just stick to it. The variant with an additional container might give you a performance boost, but it's also more error-prone to write; if the first solution is sufficient, just use it.
In the second scenario you might want to store your elements twice, in different containers - the original queue and something like a map (sometimes a hash map may perform better). The map is used only to determine whether the element is already present in the container - and if it is, you will have to update it in your queue.
Basically that gives us expected O(1) complexity for the hash-map lookups (in the real world this might get uglier because of collisions), O(1) insertion time when no update is required, and O(n) insertion time when an update is needed.
Based on the percentage of the actual update operations, the actual insertion performance may vary from O(1) to O(n), but this scheme will definitely outperform the first one if the number of updates is small enough.
Still, you have to insert your elements into two containers simultaneously, and the same has to be done whenever an element is deleted, so I would think twice: "do I really need that performance boost?"
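For completeness, a bare-bones sketch of that two-container idea (class and member names are mine; it replaces the payload in place and keeps the original queue position):

#include <deque>
#include <string>
#include <unordered_map>

// FIFO order lives in the deque of keys; the map tells us in expected O(1)
// whether a key is already queued and stores its current payload.
class KeyedFifo {
public:
    void push(int key, const std::string& value) {
        std::unordered_map<int, std::string>::iterator it = values_.find(key);
        if (it != values_.end()) {
            it->second = value;        // already queued: just replace the payload
        } else {
            order_.push_back(key);
            values_[key] = value;
        }
    }

    bool pop(int& key, std::string& value) {
        if (order_.empty()) return false;
        key = order_.front();
        order_.pop_front();
        value = values_[key];
        values_.erase(key);
        return true;
    }

private:
    std::deque<int> order_;
    std::unordered_map<int, std::string> values_;
};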
I see an easy way of doing this with a queue and, optionally, a map.
Define some sort of == operator for your elements.
Then simply have a queue and search for your element every time you want to insert it.
You could optimize this by having a map of element locations to elements instead of searching the queue every time.