boost shortest path finding algorithm - c++

Good day, dear friends.
I want to find the shortest path in a random graph using the Boost Graph Library. As I understand it, I need to build the graph from the existing distances between points and then run some algorithm on it.
As far as I can see, Dijkstra's algorithm actually finds the paths from one point to all others. (Wouldn't that be slow?)
A* needs some additional data (not only distances).
How can I find the shortest path between two points? I saw many shortest-path algorithm headers in the BGL folder, but I didn't find examples of how to use them.
I can also precompute something for the graph if needed.
What should I do?

It depends on how many nodes you have. As you mentioned, you have around 10^4 nodes and 10^4 edges, which is fine.
The Boost library docs say the time complexity is O(V log V + E). With V = 10^4 and E = 10^4 that is on the order of 10^5 operations, which runs in well under a second on a normal computer, so you can use it.
A* can run faster than Dijkstra, but it needs a heuristic function which must be monotonic and admissible, and such a function might be hard to find depending on your problem.
So I think Dijkstra would be good enough for your case.
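To make the "point to point" case concrete, here is a minimal sketch of Dijkstra that stops as soon as the target is settled. It uses only the standard library, not BGL; the names Graph and shortest_path are made up for this illustration, and edge weights are assumed to be non-negative.

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, non-negative edge weight) pairs.
using Graph = std::vector<std::vector<std::pair<int, double>>>;

// Returns the vertices of a shortest path from source to target,
// or an empty vector if target is unreachable.
std::vector<int> shortest_path(const Graph& adj, int source, int target) {
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(adj.size(), inf);
    std::vector<int> parent(adj.size(), -1);
    using Item = std::pair<double, int>;  // (tentative distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> open;

    dist[source] = 0.0;
    open.push({0.0, source});
    while (!open.empty()) {
        const double d = open.top().first;
        const int u = open.top().second;
        open.pop();
        if (d > dist[u]) continue;  // stale entry: a shorter route was found later
        if (u == target) break;     // target settled, its distance is final
        for (const auto& edge : adj[u]) {
            const int v = edge.first;
            const double nd = d + edge.second;
            if (nd < dist[v]) {
                dist[v] = nd;
                parent[v] = u;
                open.push({nd, v});
            }
        }
    }
    if (dist[target] == inf) return {};

    // Walk the parent links back from the target and reverse them.
    std::vector<int> path;
    for (int v = target; v != -1; v = parent[v]) path.push_back(v);
    std::reverse(path.begin(), path.end());
    return path;
}
```

The early break does not change the worst case, but on many inputs it avoids exploring most of the graph.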

Dijkstra's algorithm takes O(E log N) time with a binary heap, where E = #edges and N = #nodes.
It should be fast enough. Please comment with the approximate values of E and N.
In some cases (e.g. a social graph), the following is faster, assuming edge weights are 1, N is very large, and the degree of each node is small (a few hundred):
Do a 2-level BFS from node1 and a 2-level BFS from node2, then intersect the two sets. If there is a path of length <= 4, you'll find it.
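A rough sketch of that two-sided BFS, assuming an undirected, unweighted adjacency list (for a directed graph the second search would have to follow reversed edges). The names bounded_bfs and short_path_length are invented for this example.

```cpp
#include <unordered_map>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // undirected, unweighted

// Hop count of every vertex within max_depth hops of start.
std::unordered_map<int, int> bounded_bfs(const Graph& adj, int start, int max_depth) {
    std::unordered_map<int, int> depth{{start, 0}};
    std::vector<int> frontier{start};
    for (int d = 1; d <= max_depth && !frontier.empty(); ++d) {
        std::vector<int> next;
        for (int u : frontier)
            for (int v : adj[u])
                if (depth.emplace(v, d).second) next.push_back(v);  // first visit only
        frontier.swap(next);
    }
    return depth;
}

// Length of the shortest path between a and b if it is at most
// 2 * levels hops, otherwise -1.
int short_path_length(const Graph& adj, int a, int b, int levels = 2) {
    const auto from_a = bounded_bfs(adj, a, levels);
    const auto from_b = bounded_bfs(adj, b, levels);
    int best = -1;
    for (const auto& entry : from_a) {
        const auto it = from_b.find(entry.first);
        if (it == from_b.end()) continue;  // vertex not seen from b's side
        const int len = entry.second + it->second;
        if (best == -1 || len < best) best = len;
    }
    return best;
}
```

Each search touches at most roughly degree^2 vertices, which is why this beats a full traversal when N is huge and degrees are small.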


Boost Graph Library: shortest cycle with resource constraints

Problem
In a directed graph with arbitrary arc lengths (travel times, costs) find the shortest (fastest, cheapest) cycle (or closed walk without repeated vertices). Or, alternatively, find the shortest cycle through a given vertex.
Toward Solution
The r_c_shortest_paths of the Boost Graph Library solves this exact question for... shortest paths. The example demonstrates its usage clearly.
Despite several attempted approaches, it does not seem possible to use r_c_shortest_paths efficiently for the problem described above.
Question
Is it possible to use the r_c_shortest_paths to solve this problem? If so, how?
Another BGL algorithm?
Another C++ Graph library?
Thanks
If you do not want to write something yourself, such as a complete traversal of the graph reachable from the origin (which may not be that bad or that hard), the path of least resistance here, especially considering your graph is directed, would probably be to just run r_c_shortest_paths from each of the origin's out-neighbors back to the origin. Assuming some implementation, it would be something like this (pseudocode: Path, Nodes, zip, origin.neighbors(), origin.weights() and the two-argument r_c_shortest_paths call are placeholders, not the real BGL interface):
std::vector<Path<Nodes>> best_paths;
double first_step = 0;  // weight of the arc from the origin to the best neighbor so far
for (auto&& [neighbor, weight] : zip(origin.neighbors(), origin.weights())) {
    auto paths = r_c_shortest_paths(neighbor, origin);  // paths from neighbor back to origin
    if (!paths.empty() && (best_paths.empty() || paths[0].cost + weight < best_paths[0].cost + first_step)) {
        best_paths = paths;
        first_step = weight;
    }
}
Turns out that the excellent Boost library has the solution: the r_c_shortest_paths function is able to do exactly that. I just needed to spend enough time on the documentation, I guess.
Cycles
As can be seen from the pseudo code in the documentation, the algorithm stops extending a path when the resource constraint or domination function hits. That the end-node was reached is checked in a next step. So using the same vertex as start and end allows for listing cycles.
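For the simpler variant without resource constraints, the shortest cycle through a given vertex can be sketched with a single Dijkstra run: the answer is the minimum of dist(origin, u) + w(u, origin) over all arcs entering the origin. This is plain standard-library C++ and does not use r_c_shortest_paths; it assumes non-negative arc lengths, and shortest_cycle_through is a made-up name.

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Directed graph: adj[u] holds (head vertex, non-negative arc length) pairs.
using Graph = std::vector<std::vector<std::pair<int, double>>>;

// Length of the shortest directed cycle through origin,
// or infinity if no such cycle exists.
double shortest_cycle_through(const Graph& adj, int origin) {
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(adj.size(), inf);
    using Item = std::pair<double, int>;  // (distance from origin, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> open;

    double best = inf;
    dist[origin] = 0.0;
    open.push({0.0, origin});
    while (!open.empty()) {
        const double d = open.top().first;
        const int u = open.top().second;
        open.pop();
        if (d > dist[u]) continue;  // stale queue entry
        for (const auto& arc : adj[u]) {
            const int v = arc.first;
            const double nd = d + arc.second;
            if (v == origin) {
                best = std::min(best, nd);  // this arc closes a cycle
            } else if (nd < dist[v]) {
                dist[v] = nd;
                open.push({nd, v});
            }
        }
    }
    return best;
}
```

Calling it once per vertex and taking the minimum gives the shortest cycle of the whole graph, at the cost of one Dijkstra run per vertex.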

How to Modify the A* algorithm in order to find the first k(let's say 50) shortest paths? [duplicate]

How can I use the A star algorithm to find the first 100 shortest paths?
The problem of finding the k-th shortest path is NP-hard, so any modification to A* that does what you are after will be exponential in the size of the input.
Proof:
(Note: I will show it for simple paths.)
Assume you had an algorithm A(G,k) that runs in polynomial time and returns the length of the k-th shortest path.
The maximal number of paths is n!, and by applying binary search on the range [1,n!] to find a path of length n, you need O(log(n!)) = O(n log n) invocations of A.
If you have found that there is a path of length n, it is a Hamiltonian path.
By repeating the process for each source and target in the graph (O(n^2) of those), you can solve the Hamiltonian Path Problem polynomially, assuming such an A exists.
QED
From this we can conclude that, unless P=NP (which most CS researchers consider very unlikely), the problem cannot be solved polynomially.
An alternative is to use a variation of Uniform Cost Search without maintaining a visited/closed set. You might be able to modify A* as well, by disabling the closed set and yielding solutions as they are encountered instead of returning the first one and finishing, but I cannot think of a way to prove it for A* at the moment.
Besides this problem being NP-hard, it is impossible to do this with A* or Dijkstra without major modifications. Here are some major reasons:
First of all, the algorithm keeps at every step only the best path so far. Consider the following Graph:
  A
 / \
S   C-E
 \ /
  B
Assume distances d(S,A)=1, d(S,B)=2, d(A,C)=d(B,C)=d(C,E)=10.
When visiting C you will pick the path via A, but you will not store the path via B anywhere. So you'd have to keep this information.
But, secondly, you don't even consider every path possible, assume the following graph:
S------A--E
\ /
B--C
Assume distances d(S,A)=1, d(S,B)=2, d(B,C)=1, d(A,E)=3. Your visiting order will be {S,A,B,C,E}. So when visiting A you can't even save the detour via B and C because you don't know of it. You'd have to add something like a "potential path via C" for every unvisited neighbor.
Thirdly, you'd have to account for loops and cul-de-sacs, because yes, it is perfectly possible that a path with a loop in it ends up being one of your 100 shortest paths. You might of course want to constrain this away, but it is a generic possibility. Consider for example graphs like this:
S-A--D--E
  |  |
  B--C
It's clear you can easily start looping here, unless you disallow 'going back' (e.g. forbid D->A if A->D already in path). Actually this is even a problem without an obvious graphical loop, because in the generic case you can always ping-pong between two neighbors (path A-B-A-B-A-...).
And now I'm probably even forgetting some issues.
Note that most of these things make it also very hard to develop a generic algorithm, certainly the last part because with loops it is hard to constrain your number of possible paths ('endless loop').
This is not an NP-hard problem; Yen's algorithm computes the k shortest paths in a graph in polynomial time.
Yen's algorithm link
Use A* search without a closed set: when the destination is popped from the queue for the k-th time, that is the k-th shortest path.
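The "no closed set, count how often the destination comes out of the queue" idea can be sketched as follows. This is uniform-cost search (A* with h = 0) in plain standard-library C++; it returns the lengths of the k shortest walks, which may repeat vertices, so it is not Yen's algorithm and does not give simple paths only. Weights are assumed non-negative, and k_shortest_lengths is a name invented for this example.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, non-negative edge weight) pairs.
using Graph = std::vector<std::vector<std::pair<int, double>>>;

// Lengths of up to k shortest walks from source to target, in ascending order.
std::vector<double> k_shortest_lengths(const Graph& adj, int source, int target, int k) {
    std::vector<int> pops(adj.size(), 0);  // how often each vertex has been expanded
    std::vector<double> found;
    using Item = std::pair<double, int>;   // (walk length, end vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> open;

    open.push({0.0, source});
    while (!open.empty() && found.size() < static_cast<std::size_t>(k)) {
        const double d = open.top().first;
        const int u = open.top().second;
        open.pop();
        if (++pops[u] > k) continue;          // more than k walks to u are never needed
        if (u == target) found.push_back(d);  // j-th pop of target = j-th shortest walk
        for (const auto& edge : adj[u])
            open.push({d + edge.second, edge.first});
    }
    return found;
}
```

Capping the number of expansions per vertex at k keeps the queue bounded even when the graph has cycles.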

Finding path from A to B using a minimal spanning tree - C/C++

Say we find a minimal spanning tree. Now, we just need a path from A to Z in the MST. How can we do this in O(n^2) time?
We start at root A. then we look at all edges in the tree of the form Ax (where x is any vertex).
Then, say we find: AB, AC, AD, etc...
Then for each one, we look for edges of form: Bx, Cx, Dx...this is clearly not O(n^2).
So what is a better / efficient way to find path A -> Z given a MST?
Thanks
Depth-first search will be sufficient; in the worst case it is O(|V| + |E|), and a tree has only |V| - 1 edges. The fact that your input is an MST means that you don't have to worry about loop detection as you would in a general graph (apart from not stepping straight back to the vertex you came from).
Look up Minimum Spanning Tree and you will find that it is a minimum-weight subgraph that connects all the vertices together. That means that every edge will be used at most once. You can just use either a DFS or a BFS to find the desired path, without the need to check for cycles, since you already have the MST.
During MST creation you could fill parent[], so after that using simple backtracking you would be able to find path without DFS.
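A minimal sketch of the DFS-plus-parent[] idea, assuming the MST is already available as an undirected adjacency list. Tree and tree_path are hypothetical names; the parent array doubles as the "already visited" marker so the search never walks back up an edge.

```cpp
#include <algorithm>
#include <vector>

using Tree = std::vector<std::vector<int>>;  // undirected adjacency list of the MST

// Returns the unique path from `from` to `to` in the tree,
// or an empty vector if they are not connected.
std::vector<int> tree_path(const Tree& tree, int from, int to) {
    std::vector<int> parent(tree.size(), -1);
    std::vector<int> stack{from};
    parent[from] = from;  // the root is its own parent
    while (!stack.empty()) {
        const int u = stack.back();
        stack.pop_back();
        if (u == to) break;
        for (int v : tree[u]) {
            if (parent[v] != -1) continue;  // already reached, includes u's own parent
            parent[v] = u;
            stack.push_back(v);
        }
    }
    if (parent[to] == -1) return {};

    // Backtrack from the destination to the root, then reverse.
    std::vector<int> path;
    for (int v = to; v != from; v = parent[v]) path.push_back(v);
    path.push_back(from);
    std::reverse(path.begin(), path.end());
    return path;
}
```

Every vertex is pushed at most once and every tree edge is looked at twice, so the whole thing is linear in the number of vertices.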
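If you think about it, Prim's algorithm for finding an MST has the same structure as Dijkstra's; only the key used in the priority queue differs. The path between two vertices in the MST is unique, so a DFS as described above will find it, but note that it is not necessarily the shortest A-to-Z path in the original graph.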

Best-First search in Boost Graph Library

I am starting to work with boost graph library. I need a best-first search, which I could implement using astar_search by having zero costs. (Please correct me if I'm wrong.)
However, I wonder if there is another possibility of doing this? If the costs are not considered, the algorithm should be slightly more efficient.
EDIT: Sorry for the unclear description. I am actually implementing a potential field search, so I don't have any costs/weights associated with the edges but rather need to do a steepest-descent-search (which can overcome local minima).
Thanks for any hints!
You could definitely use A* to tackle this; you'd need h(x) to be 0 though, not g(x). A* rates nodes based on F which is defined by
F(n) = g(n) + h(n).
TotalCost = PathCost + Heuristic.
g(n) = Path cost, the distance from the initial to the current state
h(n) = Heuristic, the estimation of cost from current state to end state.
From Wikipedia:
Dijkstra's algorithm, as another example of a best-first search algorithm, can be viewed as a special case of A* where h(x) = 0 for all x.
If you are comfortable with C++, I would suggest trying out YAGSBPL.
As suggested by Aphex's answer, you might want to use Dijkstra's algorithm; one way to set the edge weights is to set w(u, v) to potential(v) - potential(u), assuming that is nonnegative. Dijkstra's algorithm assumes that edge weights are nonnegative, so that distances never decrease as you move away from the source node. If you are searching for the smallest potential, flip the sides of the subtraction; if you have potentials that go both up and down, you might need to use something like Bellman-Ford, which is not best-first.
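For the potential-field case in the question, a pure greedy best-first search can also be written directly: the open list is ordered by the potential of each vertex alone and path cost is ignored. The sketch below is standard-library C++ rather than BGL, and best_first_path and the potential vector are assumptions made for the illustration.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // unweighted adjacency list

// Greedy best-first search: always expand the open vertex with the lowest
// potential. Returns the path it found from start to goal (not necessarily
// the shortest one), or an empty vector if goal is unreachable.
std::vector<int> best_first_path(const Graph& adj, const std::vector<double>& potential,
                                 int start, int goal) {
    std::vector<int> parent(adj.size(), -1);
    using Item = std::pair<double, int>;  // (potential, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> open;

    parent[start] = start;
    open.push({potential[start], start});
    while (!open.empty()) {
        const int u = open.top().second;
        open.pop();
        if (u == goal) {
            std::vector<int> path;
            for (int v = goal; v != start; v = parent[v]) path.push_back(v);
            path.push_back(start);
            std::reverse(path.begin(), path.end());
            return path;
        }
        for (int v : adj[u]) {
            if (parent[v] != -1) continue;  // already discovered
            parent[v] = u;
            open.push({potential[v], v});
        }
    }
    return {};
}
```

Because discovered but unexpanded vertices stay in the open list, the search can back out of a local minimum of the potential instead of getting stuck there.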

What is the fastest Dijkstra implementation you know (in C++)?

I recently added a third version of Dijkstra's single-source shortest path algorithm to my project.
I realize that there are many different implementations which vary strongly in performance and also vary in the quality of the result on large graphs. With my data set (> 100,000 vertices) the runtime varies from 20 minutes to a few seconds. The shortest paths also vary by 1-2%.
Which is the best implementation you know?
EDIT:
My data is a hydraulic network, with 1 to 5 vertices per node. It's comparable to a street map. I made some modifications to an already accelerated algorithm (using a sorted list for all remaining nodes) and now get the same results in a fraction of the time. I have searched for such a thing for quite a while. I wonder if such an implementation already exists.
I cannot explain the slight differences in results. I know that Dijkstra is not a heuristic, but all the implementations seem to be correct. The faster solutions give the results with shorter paths. I use double precision math exclusively.
EDIT 2:
I found out that the differences in the found path are indeed my fault. I had inserted special handling for some vertices (only valid in one direction) and forgot about that in the other implementation.
BUT I'm still more than surprised that Dijkstra can be accelerated dramatically by the following change:
In general a Dijkstra algorithm contains a loop like:
MyListType toDoList; // List sorted by smallest distance
InsertAllNodes(toDoList);
while (!toDoList.empty())
{
    MyNodeType *node = *toDoList.first();
    toDoList.erase(toDoList.first());
    ...
}
If you change this a little bit, it works the same, but performs better:
MyListType toDoList; // List sorted by smallest distance
toDoList.insert(startNode); // only the start node, not every node
while (!toDoList.empty())
{
    MyNodeType *node = *toDoList.first();
    toDoList.erase(toDoList.first());
    // Neigbors is assumed here to be a NULL-terminated array of neighbor pointers
    for (MyNeigborType **x = node->Neigbors; *x != NULL; x++)
    {
        ...
        toDoList.insert((*x)->Node);
    }
}
It seems that this modification reduces the runtime dramatically, by far more than a small constant factor. It reduced my runtime from 30 seconds to less than 2. I cannot find this modification in any literature. It's also very clear that the reason lies in the sorted list: insert/erase performs much worse with 100,000 elements than with a handful.
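The second loop can be written out with std::set playing the role of the sorted to-do list. This is only a sketch of the idea, not the poster's code: Graph and dijkstra_distances are names chosen for the example, and edge weights are assumed non-negative.

```cpp
#include <limits>
#include <set>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, non-negative edge weight) pairs.
using Graph = std::vector<std::vector<std::pair<int, double>>>;

// Distances from source to every vertex (infinity if unreachable).
std::vector<double> dijkstra_distances(const Graph& adj, int source) {
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(adj.size(), inf);
    std::set<std::pair<double, int>> todo;  // sorted by smallest distance

    dist[source] = 0.0;
    todo.insert(std::make_pair(0.0, source));  // only the start node goes in up front
    while (!todo.empty()) {
        const int u = todo.begin()->second;
        todo.erase(todo.begin());
        for (const auto& edge : adj[u]) {
            const int v = edge.first;
            const double nd = dist[u] + edge.second;
            if (nd < dist[v]) {
                // Re-key v: drop its old entry (a no-op if it was never queued).
                todo.erase(std::make_pair(dist[v], v));
                dist[v] = nd;
                todo.insert(std::make_pair(nd, v));
            }
        }
    }
    return dist;
}
```

The set only ever holds vertices on the current frontier, so each insert and erase works on a much smaller container than one preloaded with every node.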
ANSWER:
After a lot of googling I found it myself. The answer is clearly:
the Boost Graph Library. Amazing, I had not found this for quite a while. If you think that there is no performance variation between Dijkstra implementations, see Wikipedia.
The best implementations known for road networks (>1 million nodes) have query times expressed in microseconds. See for more details the 9th DIMACS Implementation Challenge(2006). Note that these are not simply Dijkstra, of course, as the whole point was to get results faster.
Maybe I am not answering your question. My point is: why use Dijkstra when there are considerably more efficient algorithms for your problem? If your graph fulfils the triangle inequality (it is a Euclidean graph)
|ab| + |bc| >= |ac|
(the distance from node a to node b plus the distance from node b to node c is at least the distance from node a to node c) then you can apply the A* algorithm.
This algorithm is pretty efficient. Otherwise consider using heuristics.
The implementation is not the major issue. The algorithm used is what matters.
Two points I'd like to make:
1) Dijkstra vs A*
Dijkstra's algorithm is a dynamic programming algorithm, not a heuristic. A* is a heuristic search because it also uses a heuristic function (let's say h(x)) to "estimate" how close a point x is to the end point. This information is exploited in subsequent decisions about which nodes to explore next.
For cases such as a Euclidean graph, A* works well because the heuristic function is easy to define (one can simply use the Euclidean distance, for example). However, for non-Euclidean graphs it may be harder to define the heuristic function, and a wrong definition (one that overestimates the remaining distance) can lead to a non-optimal path.
Therefore, Dijkstra has the advantage over A* that it works for any general graph with non-negative edge weights (although A* is faster in some cases). It could well be that certain implementations use these algorithms interchangeably, resulting in different results.
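As an illustration of the Euclidean case, here is a small A* sketch that uses the straight-line distance to the target as h(x). It assumes every vertex has coordinates and every edge weight is at least the straight-line distance between its endpoints, which makes the heuristic admissible and consistent. Point, Graph and a_star_length are names made up for this example.

```cpp
#include <cmath>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Point { double x, y; };

// adj[u] holds (neighbour, edge weight) pairs; each weight is assumed to be
// at least the straight-line distance between the two endpoints.
using Graph = std::vector<std::vector<std::pair<int, double>>>;

// Length of a shortest path from source to target, or infinity if unreachable.
double a_star_length(const Graph& adj, const std::vector<Point>& pos, int source, int target) {
    const double inf = std::numeric_limits<double>::infinity();
    // h(v): straight-line distance from v to the target.
    auto h = [&](int v) {
        return std::hypot(pos[v].x - pos[target].x, pos[v].y - pos[target].y);
    };
    std::vector<double> g(adj.size(), inf);      // best known cost from source
    std::vector<bool> closed(adj.size(), false);
    using Item = std::pair<double, int>;         // (f = g + h, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> open;

    g[source] = 0.0;
    open.push({h(source), source});
    while (!open.empty()) {
        const int u = open.top().second;
        open.pop();
        if (closed[u]) continue;  // an earlier, better entry was already expanded
        closed[u] = true;
        if (u == target) return g[u];
        for (const auto& edge : adj[u]) {
            const int v = edge.first;
            const double ng = g[u] + edge.second;
            if (!closed[v] && ng < g[v]) {
                g[v] = ng;
                open.push({ng + h(v), v});
            }
        }
    }
    return inf;
}
```

With h(x) = 0 for all x the same loop degenerates into Dijkstra's algorithm, which is why the two can give identical results while exploring very different numbers of vertices.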
2) Dijkstra's algorithm (and others such as A*) uses a priority queue to obtain the next node to explore. A good implementation may use a binary heap rather than a sorted list as that queue, and one with even better asymptotic bounds may use a Fibonacci heap. This could explain the different run times.
The last time I checked, Dijkstra's Algorithm returns an optimal solution.
All "true" implementations of Dijkstra's should return the same result each time.
Similarly, asymptotic analysis shows us that minor optimisations to particular implementations are not going to affect performance significantly as the input size increases.
It's going to depend on a lot of things. How much do you know about your input data? Is it dense, or sparse? That will change which versions of the algorithm are the fastest.
If it's dense, just use a matrix. If it's sparse, you might want to look at more efficient data structures for finding the next closest vertex. If you have more information about your data set than just the graph connectivity, then see if a different algorithm, such as A*, would work better.
Problem is, there isn't "one fastest" version of the algorithm.