I am starting to work with the Boost Graph Library. I need a best-first search, which I could implement using astar_search by having zero costs. (Please correct me if I'm wrong.)
However, I wonder if there is another possibility of doing this? If the costs are not considered, the algorithm should be slightly more efficient.
EDIT: Sorry for the unclear description. I am actually implementing a potential field search, so I don't have any costs/weights associated with the edges but rather need to do a steepest-descent-search (which can overcome local minima).
Thanks for any hints!
You could definitely use A* to tackle this; you'd need h(x) to be 0 though, not g(x). A* rates nodes based on F which is defined by
F(n) = g(n) + h(n).
TotalCost = PathCost + Heuristic.
g(n) = Path cost, the distance from the initial to the current state
h(n) = Heuristic, the estimation of cost from current state to end state.
From Wikipedia:
Dijkstra's algorithm, as another example of a best-first search algorithm, can be viewed as a special case of A* where h(x) = 0 for all x.
If you are comfortable with C++, I would suggest trying out YAGSBPL.
As suggested by Aphex's answer, you might want to use Dijkstra's algorithm; one way to set the edge weights is to set w(u, v) to potential(v) - potential(u), assuming that is nonnegative. Dijkstra's algorithm assumes that edge weights are nonnegative, so that distances never decrease as you move away from the source node. If you are searching for the smallest potential, flip the sides of the subtraction; if you have potentials that go both up and down, you might need to use something like Bellman-Ford, which is not best-first.
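For the potential-field case in the question, a plain greedy best-first search may be all that is needed. Here is a minimal sketch in standard C++ (not BGL), assuming the graph is an adjacency list and every vertex has a known potential; `bestFirstByPotential`, `adjacency` and `potential` are names made up for this example. It always expands the open vertex with the lowest potential, so it backs out of a local minimum once nothing better is left there.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Greedy best-first search: always expand the open vertex with the lowest
// potential. Returns the vertices in expansion order, stopping at `goal`.
// `adjacency` and `potential` are hypothetical inputs for this sketch.
std::vector<int> bestFirstByPotential(const std::vector<std::vector<int>>& adjacency,
                                      const std::vector<double>& potential,
                                      int start, int goal) {
    assert(adjacency.size() == potential.size());
    using Entry = std::pair<double, int>;  // (potential, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    std::vector<bool> seen(adjacency.size(), false);
    std::vector<int> order;

    open.push(Entry(potential[start], start));
    seen[start] = true;
    while (!open.empty()) {
        const int u = open.top().second;
        open.pop();
        order.push_back(u);
        if (u == goal) break;
        for (int v : adjacency[u]) {
            if (!seen[v]) {  // each vertex enters the queue at most once
                seen[v] = true;
                open.push(Entry(potential[v], v));
            }
        }
    }
    return order;
}
```

No edge weights are involved at all here; only the potential decides the expansion order.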
Related
Good day, dear friends.
I want to find the shortest path in a random graph. I use the Boost Graph Library. As I understand it, I need to build the graph using the existing distances between points. After that I need to use some algorithm...
As far as I can see, Dijkstra's algorithm actually finds the paths from one point to all others. (Should that be slow?)
A* wants some additional data (not only distances).
How can I find the shortest path between 2 points? I saw many shortest path algorithm headers in the bgl folder, but I didn't find examples of how to use them.
I can also precompute something for the graph if needed.
What should I do?
It depends on how many nodes you have. As you mentioned, you have around O(10^4) nodes and O(10^4) edges, which is fine.
The Boost library docs say the time complexity is O(V log V + E). With V = 10^4 and E = 10^4 that is on the order of 10^5 operations, which runs in less than 1 second on a normal computer, so you can use it.
A* can run faster than Dijkstra, but it needs a heuristic function that is monotonic and admissible, and such a function may be hard to find depending on your problem.
So I think Dijkstra would be good enough for your case.
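If only one source/target pair matters, Dijkstra can stop as soon as the target is taken off the queue. A minimal sketch in plain C++ (not the BGL API), assuming nonnegative weights; `Edge` and `shortestDistance` are names invented for this example:

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge { int to; double weight; };  // hypothetical edge record

// Dijkstra with early exit: returns the distance from source to target,
// or infinity if target is unreachable. Weights must be nonnegative.
double shortestDistance(const std::vector<std::vector<Edge>>& graph,
                        int source, int target) {
    assert(source >= 0 && target >= 0);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(graph.size(), inf);
    using Entry = std::pair<double, int>;  // (distance, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    dist[source] = 0.0;
    open.push(Entry(0.0, source));
    while (!open.empty()) {
        const Entry top = open.top();
        open.pop();
        const double d = top.first;
        const int u = top.second;
        if (d > dist[u]) continue;   // stale queue entry, already improved
        if (u == target) return d;   // target settled: no need to go on
        for (const Edge& e : graph[u]) {
            const double candidate = d + e.weight;
            if (candidate < dist[e.to]) {
                dist[e.to] = candidate;
                open.push(Entry(candidate, e.to));
            }
        }
    }
    return inf;
}
```

The binary heap gives O(E log V) rather than the O(V log V + E) quoted from the docs, which should make little difference at this graph size.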
Dijkstra's algorithm takes O(E log N) time, where E = #edges and N = #nodes.
It should be fast enough. Please comment on approximate values of E and N.
In some cases (e.g. a Social graph), the following is faster:
- assuming edge weights are 1, N is very large, degree of nodes is small (few hundreds):
Do a 2 level BFS from node1, 2 level BFS from node2 and intersect the sets. If there's a path length of <= 4, you'll find it.
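The two-sided BFS idea could look roughly like this, assuming an undirected graph stored as an adjacency list; `withinHops` and `connectedWithinFour` are hypothetical helper names:

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Collect every vertex within `depth` hops of `start` (start included).
std::unordered_set<int> withinHops(const std::vector<std::vector<int>>& adj,
                                   int start, int depth) {
    assert(depth >= 0);
    std::unordered_set<int> reached = {start};
    std::vector<int> frontier = {start};
    for (int level = 0; level < depth; ++level) {
        std::vector<int> next;
        for (int u : frontier)
            for (int v : adj[u])
                if (reached.insert(v).second) next.push_back(v);  // first visit
        frontier.swap(next);
    }
    return reached;
}

// True if some path of at most four edges joins a and b (unit edge weights).
bool connectedWithinFour(const std::vector<std::vector<int>>& adj, int a, int b) {
    const std::unordered_set<int> fromA = withinHops(adj, a, 2);
    const std::unordered_set<int> fromB = withinHops(adj, b, 2);
    for (int v : fromA)
        if (fromB.count(v)) return true;  // the two searches meet at v
    return false;
}
```

With node degree d, each side touches about d^2 vertices instead of the d^4 a one-sided 4-level search would.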
I've tried to figure out what the A* algorithm looks like when the heuristic function does not satisfy the monotonicity condition, where
h(u) <= e(u,v) + h(v), for every u,v such that there is an edge between u and v
is the condition for monotonicity, where h is the heuristic function, u and v are vertices in the search graph, and the function e gives the edge cost between u and v (The search graph is undirected). However, wikipedia (here) does not give the algorithm for this, nor do other sources like Norvig's book on Artificial Intelligence.
Is there a good source to study this? Pseudo code would be great!
Also, I do not wish to solve this by converting the non-monotonic heuristic function to a monotonic one.
Assuming the heuristic function is still admissible, the A* algorithm will work fine.
However, for non-monotonic heuristic functions, you might need to update an already 'closed' node, and you should allow this behavior.
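One way to allow this, as a sketch: keep no closed set at all and simply push a vertex again whenever a cheaper path to it turns up. The names `Arc` and `aStarWithReopening` are made up here, and `h` is assumed to be admissible with h(goal) = 0.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <tuple>
#include <vector>

struct Arc { int to; double cost; };  // hypothetical edge record

// A* that tolerates a non-monotonic (but admissible) heuristic h: a vertex
// that was already expanded is pushed again when a cheaper path reaches it.
double aStarWithReopening(const std::vector<std::vector<Arc>>& graph,
                          const std::vector<double>& h,
                          int start, int goal) {
    assert(graph.size() == h.size());
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> best(graph.size(), inf);    // cheapest g found so far
    using Entry = std::tuple<double, double, int>;  // (f, g, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    best[start] = 0.0;
    open.push(Entry(h[start], 0.0, start));
    while (!open.empty()) {
        const Entry top = open.top();
        open.pop();
        const double g = std::get<1>(top);
        const int u = std::get<2>(top);
        if (g > best[u]) continue;  // stale entry, a better g is known
        if (u == goal) return g;
        for (const Arc& a : graph[u]) {
            const double candidate = g + a.cost;
            if (candidate < best[a.to]) {  // also true for "closed" vertices
                best[a.to] = candidate;
                open.push(Entry(candidate + h[a.to], candidate, a.to));
            }
        }
    }
    return inf;
}
```

On the graph S->A (1), S->B (3), A->B (1), B->G (5) with h(A) = 5 and h = 0 elsewhere, B is first expanded with g = 3 and later again with g = 2, so the result is 7 rather than the 8 a never-reopen closed set would return.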
In the case of tree search, the heuristic does not need to be consistent for A* to be optimal. For graph search (with a closed set that is never reopened), A* is optimal only if the heuristic is both admissible and consistent.
In this picture you can see an example of a non consistent heuristic: the A* algorithm doesn't find the right path.
Maybe you can modify the standard A* algorithm as @amit said, but in that case you need to reconsider already closed states, so the search will be less efficient: it may find the optimal path, but it will expand more nodes than it would with a consistent heuristic.
How can I use the A star algorithm to find the first 100 shortest paths?
The problem of finding k'th shortest path is NP-Hard, so any modification to A-Star that will do what you are after - will be exponential in the size of the input.
Proof:
(Note: I will show on simple paths)
Assume you had a polynomial-time algorithm that returns the length of the k-th shortest path; call it A(G,k).
The maximal number of paths is n!, and by applying binary search on the range [1,n!] to find a shortest path of length n, you need O(log(n!)) = O(n log n) invocations of A.
If you have found there is a path of length n - it is a hamiltonian path.
By repeating the process for each source and target in the graph (O(n^2) of those), you can solve the Hamiltonian Path Problem polynomially, assuming such A exists.
QED
From this we can conclude, that unless P=NP (and it is very unlikely according to most CS researchers), the problem cannot be solved polynomially.
An alternative is using a variation of Uniform Cost Search without maintaining visited/closed set. You might be able to modify A* as well, by disabling the closed nodes, and yielding/generating solutions once encountered instead of returning them and finishing, but I cannot think of a way to prove it for A* at the moment.
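A sketch of the uniform-cost variant, with one loud caveat: without a closed set it enumerates walks, so a result may revisit vertices and is not necessarily a simple path as in the proof above. `Arc` and `kShortestWalkLengths` are hypothetical names; weights are assumed nonnegative.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct Arc { int to; double weight; };  // hypothetical edge record

// Uniform-cost search without a closed set. Every time the target is taken
// off the queue, one more walk length is reported, in nondecreasing order.
std::vector<double> kShortestWalkLengths(const std::vector<std::vector<Arc>>& graph,
                                         int source, int target, int k) {
    assert(k >= 0);
    using Entry = std::pair<double, int>;  // (length so far, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    std::vector<int> pops(graph.size(), 0);
    std::vector<double> found;

    open.push(Entry(0.0, source));
    while (!open.empty() && static_cast<int>(found.size()) < k) {
        const Entry top = open.top();
        open.pop();
        const double d = top.first;
        const int u = top.second;
        if (pops[u] >= k) continue;  // no vertex is needed more than k times
        ++pops[u];
        if (u == target) found.push_back(d);
        for (const Arc& a : graph[u])
            open.push(Entry(d + a.weight, a.to));
    }
    return found;  // shorter than k if fewer walks exist
}
```

The cap of k pops per vertex is what keeps the queue from growing without bound on graphs with cycles.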
Besides this problem being NP-hard, it is impossible to do this with A* or Dijkstra without major modifications. Here are some major reasons:
First of all, the algorithm keeps at every step only the best path so far. Consider the following Graph:
  A
 / \
S   C-E
 \ /
  B
Assume distances d(S,A)=1, d(S,B)=2, d(A,C)=d(B,C)=d(C,E)=10.
When visiting C you will pick the path via A, but you will nowhere store the path via B. So you'd have to keep this information.
But, secondly, you don't even consider every path possible, assume the following graph:
S------A--E
\ /
B--C
Assume distances d(S,A)=1, d(S,B)=2, d(B,C)=1, d(A,E)=3. Your visiting order will be {S,A,B,C,E}. So when visiting A you can't even save the detour via B and C because you don't know of it. You'd have to add something like a "potential path via C" for every unvisited neighbor.
Thirdly, you'd have to incorporate loops and cul-de-sacs, because yes, it is perfectly possible that a path with a loop in it ends up being one of your 100 shortest paths. Of course, you might want to constrain this away, but it is a generic possibility. Consider for example graphs like this:
S-A--D--E
  |  |
  B--C
It's clear you can easily start looping here, unless you disallow 'going back' (e.g. forbid D->A if A->D already in path). Actually this is even a problem without an obvious graphical loop, because in the generic case you can always ping-pong between two neighbors (path A-B-A-B-A-...).
And now I'm probably even forgetting some issues.
Note that most of these things make it also very hard to develop a generic algorithm, certainly the last part because with loops it is hard to constrain your number of possible paths ('endless loop').
This is not an NP-hard problem: the link below describes Yen's algorithm, which computes the K shortest paths in a graph in polynomial time.
Yen's algorithm link
Use A* search without a closed set: when the destination is popped from the queue for the k-th time, the path that reached it is the k-th shortest path.
I wonder what the advantages and disadvantages of these two algorithms are. I want to write an AddEmUp solver in C++, but I'm not sure which algorithm (IDA* or DFID) I should use.
The best article I found is this one, but it seems too old - '93. Any newer?
I think IDA* would be better, but.. ? Any other ideas?
Any ideas and info would be helpful.
Thanks ! (:
EDIT: Some good article about IDA* and good explanation of the algorithm?
EDIT2: Or some good heuristic function for that game? I have no idea how to think of some :/
The Russell & Norvig book is an excellent reference on these algorithms, and I'll give larsmans a virtual high-five for suggesting it; however I disagree that IDA* is in any appreciable way harder to program than A*. I've done it for a project where I had to write an AI to solve a sliding-block puzzle - the familiar problem of having an N x N grid of numbered tiles, and using the single free space to slide tiles around until they are in ascending order.
Recall:
F(n) = g(n) + h(n).
TotalCost = PathCost + Heuristic.
g(n) = Path cost, the distance from the initial to the current state
h(n) = Heuristic, the estimation of cost from current state to end state. To be an admissible heuristic (and thus ensure A*'s optimality), you cannot in any case overestimate the cost. See this question for more info on the effects of overestimating/underestimating heuristics on A*.
Remember that Iterative Deepening A* is just A* with a limit on the F value of nodes you are allowed to traverse. This FLimit increases with each outer iteration; with each iteration you are deepening the search.
Here's my C++ code implementing both A* and IDA* to solve the aforementioned sliding block puzzle. You can see that I use a std::priority_queue with a custom Comparator to store Puzzle states in the queue prioritized by their F value. You will also note that the only difference between A* and IDA* is the addition of an FLimit check and an outer loop that increments this FLimit. I hope this helps shed some light on this subject.
Check out Russell & Norvig, chapters 3 and 4, and realize that IDA* is hard to program correctly. You might want to try recursive best first search (RBFS), also described by R&N, or plain old A*. The latter can be implemented using an std::priority_queue.
IIRC, R&N described IDA* in the first edition, then replaced it with RBFS in the second. I haven't seen the third edition yet.
As regards your second edit, I haven't looked into the game, but a good procedure for deriving heuristics is that of relaxed problems. You take away the rules of the game until you derive a version for which the heuristic is easily expressed and implemented (and cheap to compute). Or, following a bottom-up approach, you check the main rules to see which one admits an easy heuristic, then try that and add in other rules as you need them.
DFID is just a special case of IDA* where the heuristic function is the constant 0; in other words, it has no provision for introducing heuristics. If the problem is not small enough that it can be solved without using heuristics, it seems you have no choice but to use IDA* (or some other member of the A* family). That said, IDA* is really not that hard: the implementation provided by the authors of AIMA is only about half a page of Lisp code; I imagine a C++ implementation shouldn't take more than twice that.
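To make the relationship concrete, here is a rough C++ sketch of IDA* on an explicit graph; it is not the AIMA code, and `Arc`, `Heuristic`, `boundedDfs` and `idaStar` are names invented for the example. Passing a heuristic that always returns 0 turns it into DFID (on edge costs rather than plain depth).

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

struct Arc { int to; double cost; };  // hypothetical edge record
using Heuristic = std::function<double(int)>;
const double kInf = std::numeric_limits<double>::infinity();

// Depth-first search bounded by f = g + h. Sets `found` and returns the goal
// cost on success; otherwise returns the smallest f that exceeded `limit`.
double boundedDfs(const std::vector<std::vector<Arc>>& graph, const Heuristic& h,
                  int u, int goal, double g, double limit,
                  std::vector<bool>& onPath, bool& found) {
    const double f = g + h(u);
    if (f > limit) return f;
    if (u == goal) { found = true; return g; }
    double nextLimit = kInf;
    onPath[u] = true;
    for (const Arc& a : graph[u]) {
        if (onPath[a.to]) continue;  // do not loop along the current path
        const double r = boundedDfs(graph, h, a.to, goal, g + a.cost,
                                    limit, onPath, found);
        if (found) { onPath[u] = false; return r; }
        nextLimit = std::min(nextLimit, r);
    }
    onPath[u] = false;
    return nextLimit;
}

// Outer loop: raise the f limit to the smallest value that was cut off.
// Assumes h is admissible and h(goal) == 0.
double idaStar(const std::vector<std::vector<Arc>>& graph, const Heuristic& h,
               int start, int goal) {
    assert(!graph.empty());
    std::vector<bool> onPath(graph.size(), false);
    double limit = h(start);
    while (limit < kInf) {
        bool found = false;
        const double r = boundedDfs(graph, h, start, goal, 0.0, limit, onPath, found);
        if (found) return r;
        limit = r;
    }
    return kInf;  // goal unreachable
}
```

Memory use is only the current path, which is the main reason to prefer this over A* with a priority queue on large state spaces.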
I recently integrated the third version of Dijkstra's single-source shortest-path algorithm into my project.
I realize that there are many different implementations which vary strongly in performance and also in the quality of the results on large graphs. With my data set (> 100.000 vertices) the runtime varies from 20 minutes to a few seconds. The shortest paths also vary by 1-2%.
Which is the best implementation you know?
EDIT:
My data is a hydraulic network, with 1 to 5 vertices per node. It's comparable to a street map. I made some modifications to an already accelerated algorithm (using a sorted list for all remaining nodes) and now get the same results in a fraction of the time. I have searched for such a thing for quite a while. I wonder if such an implementation already exists.
I cannot explain the slight differences in results. I know that Dijkstra is not heuristic, but all the implementations seem to be correct. The faster solutions have the results with shorter paths. I use double precision math exclusively.
EDIT 2:
I found out that the differences in the found path are indeed my fault. I had inserted special handling for some vertices (only valid in one direction) and forgot about that in the other implementation.
BUT I'm still more than surprised that Dijkstra can be accelerated dramatically by the following change:
In general a Dijkstra algorithm contains a loop like:
MyListType toDoList; // List sorted by smallest distance
InsertAllNodes(toDoList); // every node is in the list from the start
while(! toDoList.empty())
{
    MyNodeType *node = *toDoList.first();
    toDoList.erase(toDoList.first());
    ...
}
If you change this a little bit, it works the same, but performs better:
MyListType toDoList; // List sorted by smallest distance
toDoList.insert(startNode); // only the start node is in the list at first
while(! toDoList.empty())
{
    MyNodeType *node = *toDoList.first();
    toDoList.erase(toDoList.first());
    for(MyNeigborType *x = node->Neigbors; x != NULL; x++)
    {
        ...
        toDoList.insert(x->Node); // a node enters the list once it is reached
    }
}
It seems that this modification reduces the runtime by more than an order of magnitude: it cut my runtime from 30 seconds to less than 2. I cannot find this modification in any literature. It's also very clear that the reason lies in the sorted list: insert/erase performs much worse with 100.000 elements than with a handful.
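For reference, the second loop written out with a std::set as the sorted list might look like the sketch below. This is only my reading of the elided parts, not the original project code; `Link` and `dijkstraOnDemand` are invented names.

```cpp
#include <cassert>
#include <limits>
#include <set>
#include <utility>
#include <vector>

struct Link { int to; double length; };  // hypothetical edge record

// Dijkstra where the sorted to-do list starts with the source only and a
// vertex is inserted when it is first reached (or re-inserted when improved).
std::vector<double> dijkstraOnDemand(const std::vector<std::vector<Link>>& graph,
                                     int start) {
    assert(start >= 0);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(graph.size(), inf);
    std::set<std::pair<double, int>> toDo;  // ordered by (distance, vertex)

    dist[start] = 0.0;
    toDo.emplace(0.0, start);
    while (!toDo.empty()) {
        const int u = toDo.begin()->second;
        toDo.erase(toDo.begin());
        for (const Link& l : graph[u]) {
            const double candidate = dist[u] + l.length;
            if (candidate < dist[l.to]) {
                // Remove the old entry, if any; a no-op for unseen vertices.
                toDo.erase(std::make_pair(dist[l.to], l.to));
                dist[l.to] = candidate;
                toDo.emplace(candidate, l.to);
            }
        }
    }
    return dist;
}
```

The set only ever holds the current frontier, so each insert and erase works on far fewer elements than a list pre-filled with all nodes.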
ANSWER:
After a lot of googling I found it myself. The answer is clearly:
the Boost Graph Library. Amazing - I had not found this for quite a while. If you think that there is no performance variation between Dijkstra implementations, see Wikipedia.
The best implementations known for road networks (>1 million nodes) have query times expressed in microseconds. See for more details the 9th DIMACS Implementation Challenge(2006). Note that these are not simply Dijkstra, of course, as the whole point was to get results faster.
Maybe I am not answering your question. My point is: why use Dijkstra when there are much more efficient algorithms for your problem? If your graph fulfills the triangle inequality (it is a Euclidean graph)
|ab| + |bc| > |ac|
(the distance from node a to node b plus the distance from node b to node c is bigger than the distance from node a to node c) then you can apply the A* algorithm.
This algorithm is pretty efficient. Otherwise consider using heuristics.
The implementation is not the major issue. The algorithm to be used does matter.
Two points I'd like to make:
1) Dijkstra vs A*
Dijkstra's algorithm is a dynamic programming algorithm, not a heuristic. A* is a heuristic because it also uses a heuristic function (let's say h(x)) to "estimate" how close a point x is getting to the end point. This information is exploited in subsequent decisions of which nodes to explore next.
For cases such as a Euclidean graph, A* works well because the heuristic function is easy to define (one can simply use the Euclidean distance, for example). However, for non-Euclidean graphs it may be harder to define the heuristic function, and a wrong definition can lead to a non-optimal path.
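As an illustration of the Euclidean case, here is a sketch of A* where both the edge lengths and h are straight-line distances between vertex coordinates. It assumes every vertex has known 2D coordinates; `Point`, `euclid` and `aStarEuclidean` are names invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <limits>
#include <queue>
#include <tuple>
#include <vector>

struct Point { double x; double y; };  // hypothetical vertex coordinates

// Straight-line distance; used both as edge length and as the heuristic h.
double euclid(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// A* on a graph whose edge lengths are the Euclidean distances between the
// endpoint coordinates. Returns infinity if the goal is unreachable.
double aStarEuclidean(const std::vector<Point>& pos,
                      const std::vector<std::vector<int>>& adjacency,
                      int start, int goal) {
    assert(pos.size() == adjacency.size());
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> best(pos.size(), inf);
    using Entry = std::tuple<double, double, int>;  // (f, g, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    best[start] = 0.0;
    open.push(Entry(euclid(pos[start], pos[goal]), 0.0, start));
    while (!open.empty()) {
        const Entry top = open.top();
        open.pop();
        const double g = std::get<1>(top);
        const int u = std::get<2>(top);
        if (g > best[u]) continue;  // stale entry
        if (u == goal) return g;
        for (int v : adjacency[u]) {
            const double candidate = g + euclid(pos[u], pos[v]);
            if (candidate < best[v]) {
                best[v] = candidate;
                open.push(Entry(candidate + euclid(pos[v], pos[goal]), candidate, v));
            }
        }
    }
    return inf;
}
```

Because of the triangle inequality, the straight-line distance never overestimates the remaining path length here, so the result stays optimal.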
Therefore, Dijkstra has the advantage over A* that it works for any general graph (although A* is faster in some cases). It could well be that certain implementations use these algorithms interchangeably, resulting in different results.
2) Dijkstra's algorithm (and others such as A*) uses a priority queue to obtain the next node to explore. A good implementation may use a heap instead of a queue, and an even better one may use a Fibonacci heap. This could explain the different run times.
The last time I checked, Dijkstra's Algorithm returns an optimal solution.
All "true" implementations of Dijkstra's should return the same result each time.
Similarly, asymptotic analysis shows us that minor optimisations to particular implementations are not going to affect performance significantly as the input size increases.
It's going to depend on a lot of things. How much do you know about your input data? Is it dense, or sparse? That will change which versions of the algorithm are the fastest.
If it's dense, just use a matrix. If it's sparse, you might want to look at more efficient data structures for finding the next closest vertex. If you have more information about your data set than just the graph connectivity, then see if a different algorithm, like A*, would work better.
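For the dense case, the matrix version can be as simple as the sketch below: a linear scan picks the next closest vertex, giving O(V^2) overall with no heap. The convention that infinity in `weight[u][v]` means "no edge" and the name `denseDijkstra` are assumptions of this example.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Array-based Dijkstra for dense graphs. weight[u][v] holds the edge length,
// or infinity when there is no edge from u to v (convention of this sketch).
std::vector<double> denseDijkstra(const std::vector<std::vector<double>>& weight,
                                  int source) {
    const int n = static_cast<int>(weight.size());
    assert(source >= 0 && source < n);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(n, inf);
    std::vector<bool> done(n, false);

    dist[source] = 0.0;
    for (int round = 0; round < n; ++round) {
        // Linear scan for the closest vertex that is not settled yet.
        int u = -1;
        for (int v = 0; v < n; ++v)
            if (!done[v] && (u == -1 || dist[v] < dist[u])) u = v;
        if (u == -1 || dist[u] == inf) break;  // the rest is unreachable
        done[u] = true;
        for (int v = 0; v < n; ++v) {
            if (weight[u][v] < inf && dist[u] + weight[u][v] < dist[v])
                dist[v] = dist[u] + weight[u][v];
        }
    }
    return dist;
}
```

When E is close to V^2 this can beat heap-based versions, since O(E log V) is then worse than O(V^2).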
Problem is, there isn't "one fastest" version of the algorithm.