I am using an STL queue to implement a BFS (breadth-first search) on a graph. I need to push a node onto the queue only if it is not already in the queue. However, the STL queue does not allow iteration through its elements, so I cannot use the STL find function.
I could use a flag for each node, marking nodes when they are visited and pushing them only when the flag is false. However, I need to run BFS multiple times, and after each run I would have to reset all the flags, so I ended up using a counter instead of a flag. I would still like to know if there is a standard way of finding an item in a queue.
I assume you're implementing the concept of a "closed set" in your BFS? The standard way of doing that is to simply maintain a separate std::set or std::unordered_set of elements already encountered. That way, you get O(lg n) or O(1) lookup, while iterating through a queue, if it were supported, would take O(n) time.
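A minimal sketch of that idea, assuming nodes are plain int indices into an adjacency list (the names Graph, bfs, frontier and seen are made up for illustration):

```cpp
#include <queue>
#include <unordered_set>
#include <vector>

// Graph as an adjacency list: adj[u] holds the neighbours of node u.
using Graph = std::vector<std::vector<int>>;

// Returns the nodes reachable from `start` in BFS order.
std::vector<int> bfs(const Graph& adj, int start) {
    std::vector<int> order;
    std::queue<int> frontier;
    std::unordered_set<int> seen;  // every node that has ever been pushed

    frontier.push(start);
    seen.insert(start);
    while (!frontier.empty()) {
        int u = frontier.front();
        frontier.pop();
        order.push_back(u);
        for (int v : adj[u]) {
            // insert() reports whether v was new, so one call does
            // both the lookup and the marking.
            if (seen.insert(v).second) {
                frontier.push(v);
            }
        }
    }
    return order;
}
```

Because the set is local to the function, nothing has to be reset between runs; each call simply starts with an empty set.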
the accepted answer is silly.
each of the items in a BFS search will go through three states: not-visited, visited-but-not-complete, visited. For search purposes you can trim this down to visited or not-visited...
you simply need to use a flag...
the flag just alternates.
well, with the first traversal of the tree each node will start at false (not-visited) and go to true (visited). At the second traversal it will switch to true (not-visited) and go to false (visited).
so all you have to do is to keep a separate flag that simply changes state on each traversal...
then your logic is
if ( visitedFlag ^ alternatingFlagStateOfTree )
{
visitedFlag ^= 1;
}
basically the alternatingFlagStateOfTree is just used to indicate if a true is visited or a false is visited state. Each run alternates so we just swap them around.
This entirely eliminates the need for the set(), all the memory overhead, etc., and also eliminates any need to reset the flag values between runs.
This technique can be used for more complex states, so long as there is a consistent ending state across all items being traversed. You simply do some math to reset what the flag value is to return the state back to base-state.
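Here is a rough sketch of the alternating flag inside a BFS. FlaggedGraph and visitedMeans are hypothetical names, and the sketch assumes every run reaches every node (for example a connected undirected graph). If a run leaves some nodes untouched, their flags are out of step on the next run, and a counter like the one mentioned in the question is the safer choice.

```cpp
#include <queue>
#include <utility>
#include <vector>

struct FlaggedGraph {
    std::vector<std::vector<int>> adj;  // adjacency list
    std::vector<bool> flag;             // per-node flag, never reset
    bool visitedMeans = false;          // flag value that means "visited" in the current run

    explicit FlaggedGraph(std::vector<std::vector<int>> a)
        : adj(std::move(a)), flag(adj.size(), false) {}

    // Runs one BFS from `start` and returns how many nodes it reached.
    int bfs(int start) {
        visitedMeans = !visitedMeans;  // swap the meaning of the flag for this run
        int reached = 0;
        std::queue<int> q;
        flag[start] = visitedMeans;
        q.push(start);
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            ++reached;
            for (int v : adj[u]) {
                if (flag[v] != visitedMeans) {  // same test as the XOR in the answer
                    flag[v] = visitedMeans;
                    q.push(v);
                }
            }
        }
        return reached;
    }
};
```

Calling bfs() repeatedly on the same object needs no reset step, since each call only flips visitedMeans.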
Consider I have a min priority queue with the smallest value on the top, I hope to reduce the value on the top so that the property of the queue would still be maintained and time complexity would be O(1). How to do that?
I have seen the question here how to change the value of std::priority_queue top()?
There, taking the value out, modifying it and pushing it back would be O(log N). Since I am only reducing the value, can I take advantage of that so that there is no need to push the value back again?
The standard priority queue doesn't support changing the keys.
What you are looking for is something similar to another data structure called an Indexed Priority Queue, often used by Dijkstra's algorithm.
The Indexed Priority Queue supports two more methods in its API, increaseKey and decreaseKey, which allow modifying the keys themselves.
The STL doesn't define an indexed priority queue. You'd probably need to implement one yourself or look for a third-party implementation.
I see the point of this question differently from others. Based on this question,
I have a min priority queue with the smallest value on the top, I hope to reduce the value on the top so that the property of the queue would still be maintained and time complexity would be O(1).
With std::priority_queue, you can't. We may try a hack like const_cast<int &>(pq.top()) = lesser_value_than_top. It could work in some implementations, but it is NOT safe.
So, I would like to suggest building your own PQ with an array-based heap, and you can just change the top's value without log(N) work, as long as it is a value equal or less than the current top.
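A minimal sketch of such a heap, assuming int keys and using the standard heap algorithms on a std::vector (MinHeap and decreaseTop are names I made up, not a standard API):

```cpp
#include <algorithm>
#include <functional>
#include <stdexcept>
#include <vector>

// Min-heap stored in a vector and maintained with the standard heap algorithms.
class MinHeap {
public:
    void push(int value) {
        data_.push_back(value);
        std::push_heap(data_.begin(), data_.end(), std::greater<int>());
    }
    void pop() {
        std::pop_heap(data_.begin(), data_.end(), std::greater<int>());
        data_.pop_back();
    }
    int top() const { return data_.front(); }
    bool empty() const { return data_.empty(); }

    // O(1): the root is already <= every other element, so making it
    // smaller (or leaving it equal) cannot break the heap property.
    void decreaseTop(int value) {
        if (data_.empty() || value > data_.front()) {
            throw std::invalid_argument("new value must not exceed the current top");
        }
        data_.front() = value;
    }

private:
    std::vector<int> data_;
};
```

The check in decreaseTop is what keeps this safe: a larger value would need a sift-down, which is the O(log N) case again.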
I am tasked with programming an A* Search Algorithm for an assignment that involves solving an '8-Puzzle'.
One of the algorithm's steps is to:
Add all the extended paths to Q. If the descendant state is already in Q, keep only the shorter path to state in Q (where Q is a Priority Queue (PQ)).
As such, I will need to search the PQ if an identical state exists but has a shorter path. If an identical state already exists but it has a longer path, I will need to delete this state from the PQ.
I have been directed to use an STL PQ, not my own implementation. I have managed to implement other types of searches using the below to create a Min PQ - which works as desired.
auto cmp = [](Puzzle* a, Puzzle* b) {
return a->getHCost() > b->getHCost();
};
std::priority_queue<Puzzle*, std::vector<Puzzle*>, decltype(cmp)> Q(cmp);
How can I extend my implementation so that...
I can perform a brute force search - looping through each element of the STL PQ?
I can delete an element somewhere in the STL PQ by its index? - shuffling elements 'upwards' if appropriate.
You could have a secondary array named shortest[] where shortest[i] would be the shortest known path to the state i. Then whenever you get in the top of the PQ an element with state x, you check shortest[x] if it is indeed the shortest found and do whatever you want to it, else delete the element from the top of the PQ.
However, given that the states are from an 8-puzzle, you'd have to come up with an efficient way to give them a unique identifying number and ability to get it back efficiently.
It is possible to do both such things in O(1). I'm not sure if I should spoil my personal idea yet, given it is after all an assignment and you should undertake such a challenge.
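For what it's worth, here is one possible encoding, not necessarily the idea the answer has in mind. It assumes the board is stored as nine tiles with values 0..8 (0 being the blank) and packs each tile into 4 bits; Board, encode and decode are hypothetical names.

```cpp
#include <array>
#include <cstdint>

using Board = std::array<std::uint8_t, 9>;  // tiles 0..8, 0 is the blank

// Pack nine 4-bit tiles into one integer; distinct boards give distinct keys.
std::uint64_t encode(const Board& b) {
    std::uint64_t key = 0;
    for (std::uint8_t tile : b) {
        key = (key << 4) | tile;
    }
    return key;
}

// Inverse of encode(): peel the tiles off again, last tile first.
Board decode(std::uint64_t key) {
    Board b{};
    for (int i = 8; i >= 0; --i) {
        b[i] = static_cast<std::uint8_t>(key & 0xF);
        key >>= 4;
    }
    return b;
}
```

Both directions take a fixed nine steps. The keys span 36 bits, so they are too sparse to index a plain shortest[] array; a std::unordered_map keyed on the encoded value would play that role instead.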
I have found some similar questions on this subject, but I wanted to ask again in order to get a clearer answer. I am writing a graph matching algorithm, where each node of the graph is assigned to a priority set depending on the matching of its neighbours. Details are not really important, but I am using an std::priority_queue in order to match the highest priority nodes first. Here is the tricky point: each time a new match is introduced, the priorities of the neighbours of the matching nodes have to be updated.
This algorithm is referenced from a paper and although I have implemented the exact same algorithm, I couldn't reach the same matching percentage. I naturally suspected that std::priority_queue may not be reordered as I wanted on priority updates, so I have run some tests and then I found out other questions asking the same thing:
How to tell a std::priority_queue to refresh its ordering?
Does changing a priority queue element result in resorting the queue?
My question naturally is, how can I update the order on new matchings? Can I enforce it? Or is there any other data structure (max heap for example) that can serve to this purpose? Note that, pushing new elements into the queue is not a valid solution for me. Here is the code piece I am using (matchFace() function updates the element priorities):
while (priorityQueue.size() != 0) {
    // Take the face at the top of the queue and check if it is already matched
    FaceData* currentFace = priorityQueue.top();
    // Pop the face at the top in any case
    priorityQueue.pop();
    // If the face is not already matched, try to find a matching
    if (!currentFace->matched) {
        // Try to match the face with one of its neighbors, add it to the unmatched faces list if it fails
        int neighborId = matchFace(currentFace);
        if (neighborId == -1) {
            unmatchedFaces.push_back(currentFace);
        } else {
            matchingMap[currentFace->id] = neighborId;
        }
    }
}
Using the comments that I received on the problem, I decided to answer it myself. I found out there are three possible ways to overcome this problem:
Implement your own updatable priority queue or use external libraries. Boost might have some additional data structures for this purpose. I also found an Updatable Priority Queue source code here.
Use a vector to store the values and call the std::make_heap function provided in the algorithm library each time an update is received. This is the easiest way, but it is very slow because every call rebuilds the whole heap in O(n).
Remove and re-insert the elements. If this is not a valid approach, use a map to store the element ids and instead of removing the elements, mark the elements on the map so if you encounter them multiple times you can just ignore them. An alternative strategy is to alter the items by adding a flag and marking the elements by turning the flag on.
One step in the A* pathfinding algorithm requires searching the list of open nodes for the node you're currently interacting with, and adding that node to the list if it isn't already there, or updating its value and parent, if it's present but with a higher weight than the current version of the node.
These behaviors aren't supported in the STL priority_queue structure. How should I implement that step?
Updates since this question is getting a lot of views:
std::priority_queue may look like a good choice for this, but it isn't.
Implementing A* yourself is an enormous confidence-booster, but after you've done it, you should try to switch to using the one provided by boost. I was nervous about installing it when I asked this question, but installation is very easy and won't produce any complications; and A* isn't the only useful functionality that boost provides. (In particular, if you don't use their string-processing functionality, you'll end up writing your own copy of it; I speak from personal experience...)
You can use a plain vector or array to store the elements and then use std::make_heap, std::push_heap, std::pop_heap, std::sort_heap, std::is_heap and std::is_heap_until to manage it.
This allows you to break containment and implement custom operations on a priority queue, without having to implement the standard operations yourself.
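A small sketch of what that can look like for the open list, assuming each node has an integer id and a cost (Node, OpenList, costlier and pushOrImprove are illustrative names, and the parent pointer that A* also updates is left out):

```cpp
#include <algorithm>
#include <vector>

struct Node {
    int id;
    int cost;
};

// Comparator that keeps the cheapest node at the front (min-heap).
inline bool costlier(const Node& a, const Node& b) { return a.cost > b.cost; }

class OpenList {
public:
    void push(Node n) {
        heap_.push_back(n);
        std::push_heap(heap_.begin(), heap_.end(), costlier);
    }
    Node pop() {
        std::pop_heap(heap_.begin(), heap_.end(), costlier);
        Node n = heap_.back();
        heap_.pop_back();
        return n;
    }
    bool empty() const { return heap_.empty(); }

    // Custom operation: insert the node, or lower the cost of an existing entry.
    // The linear search plus make_heap is O(n): simple rather than fast.
    void pushOrImprove(Node n) {
        auto it = std::find_if(heap_.begin(), heap_.end(),
                               [&](const Node& m) { return m.id == n.id; });
        if (it == heap_.end()) {
            push(n);
        } else if (n.cost < it->cost) {
            it->cost = n.cost;
            std::make_heap(heap_.begin(), heap_.end(), costlier);
        }
    }

private:
    std::vector<Node> heap_;
};
```

Since the vector is yours, the search and the in-place update are ordinary container code; only the re-heapify step uses the library.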
If you are limited to the STL, you could use an STL set, erasing and re-inserting an element (with its new priority) whenever its priority changes.
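#include <set>
#include <utility>
std::set< std::pair<int,int> > s; // < priority,value >
s.insert( std::make_pair(0,5) );
// Change-key operation: erase the old pair, insert the updated one //
s.erase( s.find( std::make_pair(0,5) ) );
s.insert( std::make_pair(1,5) );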
Time complexity is still O(log N) but it will probably take more time for large sets.
The STL priority_queue is not well suited to an A* implementation. You need a heap structure that supports the increase operation to change the priority of already inserted items. Use Boost.Heap for an implementation of many classical heaps.
EDIT: Boost.Graph library has an implementation of A* search too.
Here is the solution I used for this if you really want to use std::priority_queue:
When you need to update a node that is already in the priority queue, just insert a new node having the same state and a new cost value and parent into the queue. The most recently updated copy of this node will come off the queue first and be added to your visited set. To deal with the older duplicates, check any node coming off the queue against your visited set before processing it. If it is in the visited set then the lowest-cost path through this node has already been seen, so just ignore it and process the next node.
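A minimal sketch of this duplicate-and-skip approach. To keep it short it uses a zero heuristic (so it is really uniform-cost search) and int node ids; WeightedGraph and cheapestPath are hypothetical names.

```cpp
#include <functional>
#include <queue>
#include <unordered_set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // (neighbour, weight)
using WeightedGraph = std::vector<std::vector<Edge>>;

// Returns the cost of the cheapest path from `start` to `goal`, or -1 if none exists.
int cheapestPath(const WeightedGraph& adj, int start, int goal) {
    using Entry = std::pair<int, int>;  // (cost so far, node)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    std::unordered_set<int> visited;

    open.push(Entry(0, start));
    while (!open.empty()) {
        Entry cur = open.top();
        open.pop();
        int cost = cur.first;
        int u = cur.second;
        // A node already in the visited set is an older, costlier duplicate: skip it.
        if (!visited.insert(u).second) continue;
        if (u == goal) return cost;
        for (const Edge& e : adj[u]) {
            if (visited.count(e.first) == 0) {
                // Push a fresh copy instead of updating an entry inside the queue.
                open.push(Entry(cost + e.second, e.first));
            }
        }
    }
    return -1;
}
```

The queue can hold several copies of the same node, so it uses more memory than an updatable heap, but every operation stays within what std::priority_queue offers.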
There are three likely solutions to this:
Track the list of nodes currently open independently of the priority queue. Try creating a list of nodes in the same manner you do for closed nodes.
Create a map of nodes (by coordinate) to open-closed state.
Install the Boost library, which includes a templated implementation of A* (I think in <graph>).
I have got this problem:
Find the first element in a list, for which a given condition holds.
Unfortunately, the list is quite long (100,000 elements), and evaluating the condition for every element takes about 30 seconds in total on a single thread.
Is there a way to cleanly parallelize this problem? I have looked through all the tbb patterns, but could not find any fitting.
UPDATE: for performance reason, I want to stop as early as possible when an item is found and stop processing the rest of the list. That's why I believe I cannot use parallel_while or parallel_do.
I'm not too familiar with libraries for this, but just thinking aloud, could you not have a group of threads iterating at the same stride from different starting points?
Say you decide to have n threads (= number of cores or whatever). Each thread is given its own starting offset below n: the first thread starts on begin() and the next item it compares is begin() + n, and so on; the second thread starts on begin() + 1 and its next comparison is also n further along, etc.
This way you can have a group of threads iterating in parallel through the list, the iteration itself is presumably not expensive - just the comparison. No node will be compared more than once and you can have some condition which is set when a match is made by any of the threads and all should check this condition before iterating/comparing..
I think it's pretty straightforward to implement(?)
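Something along these lines, as a sketch with plain std::thread. It assumes the elements have been copied into a std::vector (a linked list has no cheap begin() + n) and that the predicate is safe to call from several threads at once; parallelFindFirst is a made-up name. Instead of a simple found flag it keeps the smallest matching index seen so far, so the result is the first match in list order.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Returns the index of the first element satisfying `pred`, or items.size() if none does.
template <typename T, typename Pred>
std::size_t parallelFindFirst(const std::vector<T>& items, Pred pred, unsigned numThreads) {
    if (numThreads == 0) numThreads = 1;
    const std::size_t n = items.size();
    std::atomic<std::size_t> best(n);  // smallest matching index seen so far

    auto worker = [&](std::size_t first) {
        // Thread k checks k, k + numThreads, k + 2 * numThreads, ...
        // Indices at or beyond `best` cannot improve the answer, so stop there.
        for (std::size_t i = first; i < best.load(); i += numThreads) {
            if (pred(items[i])) {
                std::size_t cur = best.load();
                // Lower `best` to i unless another thread already stored a smaller index.
                while (i < cur && !best.compare_exchange_weak(cur, i)) {
                }
                break;
            }
        }
    };

    std::vector<std::thread> threads;
    for (unsigned k = 0; k < numThreads; ++k) {
        threads.emplace_back(worker, static_cast<std::size_t>(k));
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return best.load();
}
```

A thread that is still working on a later index when the match is found finishes that one evaluation and then stops, so a little work is wasted but never more than one element per thread.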
I think the best way to solve this problem with TBB is parallel_pipeline.
There should be (at least) two stages in the pipeline. The 1st stage is serial; it just reads the next element from the list and passes it to the 2nd stage. This 2nd stage is parallel; it evaluates the condition of interest for a given element. As soon as the condition is met, the second stage sets a flag (which should be either atomic or protected with a lock) to indicate that a solution is found. The first stage must check this flag and stop reading the list once the solution is found.
Since condition evaluation is performed in parallel for a few elements, it can happen that a found element is not the first suitable one in the list. If this is important, you also need to keep an index of the element, and when a suitable solution is found you detect whether its index is less than that of a previously known solution (if any).
HTH.
ok, I have done it this way:
Put all elements into a tbb::concurrent_bounded_queue<Element> elements.
Create an empty tbb::concurrent_vector<Element> results.
Create a boost::thread_group, and create several threads that run this logic:
logic to run in parallel:
Element e;
while (results.empty() && elements.try_pop(e)) {
    if (slow_and_painfull_check(e)) {
        results.push_back(e);
    }
}
So when the first element is found, all other threads will stop processing the next time they check results.empty().
It is possible that two or more threads are working on an element for which slow_and_painfull_check returns true, so I just put the result into a vector and deal with this outside of the parallel loop.
After all threads in the thread group have finished, I check all elements in the results and use the one that comes first.
You can take a look at http://gcc.gnu.org/onlinedocs/libstdc++/manual/parallel_mode.html for implementations of parallel algorithms.
In particular, you need the find_if algorithm: http://www.cplusplus.com/reference/algorithm/find_if/
I see two opportunities for parallelism here: evaluating one element on multiple threads, or evaluating multiple elements at once on different threads.
There isn't enough information to determine the difficulty or the effectiveness of evaluating one element on multiple threads. If this is easy, the 30-second total time could be reduced.
I do not see a clean fit into TBB for this problem. There are issues with lists not having random access iterators, determining when to stop, and guaranteeing the first element is found. There may be some games you can play with the ranges to get it to work though.
You could use some lower level thread constructs to implement this yourself as well, but there are a number of places for incorrect results to be returned. To prevent such errors, I would recommend using an existing algorithm. You could convert the list to an array (or some other structure with random access iterators) and use the experimental libstdc++ Parallel Mode find_if algorithm user383522 referenced.
If it's a linked list, a parallel search isn't going to add much speed. However, linked lists tend to perform poorly with caches. You may get a tiny performance increase if you have two threads: one does the find_first_element, and one simply iterates through the list, making sure not to get more than X (100?) ahead of the first thread. The second thread doesn't do any comparisons, but will ensure that the items are cached as well as possible for the first thread. This may help your time, or it might make little difference, or it might hinder. Test everything.
Can't you transform the list to a balanced tree or similar? Such data structures are easier to process in parallel, and usually you get back the overhead you may have paid in making it balanced in the first place... For example, if you write functional-style code, check this paper: Balanced trees inhabiting functional parallel programming
If you are using GCC, GNU OpenMP provides parallel std functions
I've never heard of the Intel tbb library but a quick open and scan of the Tutorial led me to parallel_for which seems like it will do the trick.