I've been reading about the list ranking algorithm from several sources, such as
http://www.cs.cmu.edu/~scandal/alg/listrank.html
I found that it is useful in parallel tree contraction, Euler tours of trees, and so on, but I don't see how list ranking is actually used in those applications. Does anyone have an idea of how list ranking is useful in these or other algorithms?
The first thing I can imagine is that you can easily compute the height of all nodes in a tree. It's not exactly an algorithm, but can be pretty useful in some cases.
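For reference, here is a minimal sequential sketch of list ranking by pointer jumping, the technique that a parallel version would run with one processor per node. It assumes the list is given as a successor array in which the tail points to itself; the function name `listRank` is made up for this illustration.

```cpp
#include <vector>

// next[i] is the successor of node i; the tail is assumed to point to itself.
// Returns rank[i] = number of links between node i and the tail.
std::vector<int> listRank(std::vector<int> next) {
    const int n = static_cast<int>(next.size());
    std::vector<int> rank(n);
    for (int i = 0; i < n; ++i) rank[i] = (next[i] == i) ? 0 : 1;

    while (true) {
        // Finished once every pointer has reached the tail.
        bool done = true;
        for (int i = 0; i < n; ++i)
            if (next[next[i]] != next[i]) done = false;
        if (done) break;

        // One pointer-jumping round. Every node reads the old arrays and
        // writes new ones, which is what a synchronous parallel step does.
        std::vector<int> newRank(n), newNext(n);
        for (int i = 0; i < n; ++i) {
            newRank[i] = rank[i] + rank[next[i]];
            newNext[i] = next[next[i]];
        }
        rank.swap(newRank);
        next.swap(newNext);
    }
    return rank;
}
```

Each round doubles the distance every pointer spans, so about log2(n) rounds are enough. Run over the Euler tour of a tree with suitable weights instead of 1, the same scheme yields per-node quantities such as depth.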
I "found" an algorithm for finding maximum flow in undirected graph which I think isn't correct, but I can't find my mistake. Here is my algorithm:
We construct a new directed graph in the following way: for every edge $\{u,v\}$ we create edges $(u,v)$ and $(v,u)$ with $c((u,v))=c((v,u))=c(\{u,v\})$. Then we apply the Ford-Fulkerson algorithm to the new graph. Now we build a flow in the original graph in the following way: if $f((u,v))\ge f((v,u))$, then we direct edge $\{u,v\}$ from $u$ to $v$ and take $f'((u,v))=f((u,v))-f((v,u))$. This is a maximum flow for the undirected graph, because otherwise we could construct a larger flow for the corresponding directed graph, which is a contradiction.
The reason I think I have missed something is that there is an article on the Internet about this problem, and I don't think anybody would write an article about such a trivial problem.
And this is the article: http://www.inf.ufpr.br/pos/techreport/RT_DINF003_2004.pdf
Thanks!
Ford-Fulkerson is not the best algorithm to find maximum flow in a directed graph, and in the undirected case it is possible to do much better (close to linear-time if I recall correctly).
You don't cite the article you are talking about, but most likely it describes an algorithm that is much better than yours.
For many problems in optimization, it is not difficult to find an algorithm that gives an optimal solution; nevertheless there is a lot of work to find the most efficient ones.
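As a rough illustration of the reduction described in the question, here is a minimal C++ sketch. It assumes a small graph stored as a dense capacity matrix and uses BFS augmenting paths (the Edmonds-Karp variant of Ford-Fulkerson). The names `FlowNetwork` and `addUndirectedEdge` are invented for this example. The `flow` matrix stores the net flow, so `flow[u][v]` already equals $f((u,v))-f((v,u))$.

```cpp
#include <algorithm>
#include <climits>
#include <queue>
#include <vector>

// Hypothetical helper type for illustration only.
struct FlowNetwork {
    int n;
    std::vector<std::vector<long long>> cap;   // capacity of arc (u,v)
    std::vector<std::vector<long long>> flow;  // net flow, flow[u][v] == -flow[v][u]

    explicit FlowNetwork(int nodes)
        : n(nodes),
          cap(nodes, std::vector<long long>(nodes, 0)),
          flow(nodes, std::vector<long long>(nodes, 0)) {}

    // Undirected edge {u,v} becomes arcs (u,v) and (v,u) with the same capacity.
    void addUndirectedEdge(int u, int v, long long c) {
        cap[u][v] += c;
        cap[v][u] += c;
    }

    // Ford-Fulkerson with shortest (BFS) augmenting paths.
    long long maxFlow(int s, int t) {
        if (s == t) return 0;
        long long total = 0;
        while (true) {
            std::vector<int> parent(n, -1);
            parent[s] = s;
            std::queue<int> q;
            q.push(s);
            while (!q.empty() && parent[t] == -1) {
                int u = q.front();
                q.pop();
                for (int v = 0; v < n; ++v) {
                    if (parent[v] == -1 && cap[u][v] - flow[u][v] > 0) {
                        parent[v] = u;
                        q.push(v);
                    }
                }
            }
            if (parent[t] == -1) break;  // no augmenting path is left

            long long push = LLONG_MAX;  // bottleneck residual capacity
            for (int v = t; v != s; v = parent[v])
                push = std::min(push, cap[parent[v]][v] - flow[parent[v]][v]);
            for (int v = t; v != s; v = parent[v]) {
                flow[parent[v]][v] += push;
                flow[v][parent[v]] -= push;
            }
            total += push;
        }
        return total;
    }
};
```

After `maxFlow` returns, an undirected edge $\{u,v\}$ carries `flow[u][v]` units from `u` to `v` when that value is positive, and never more than its capacity. The dense matrix makes each BFS cost O(n^2), so this is only meant for small examples.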
Recently I asked a question on Stack Overflow asking for help to solve a problem. It is a travelling salesman problem where I have up to 40,000 cities but I only need to visit 15 of them.
I was advised to use Dijkstra with a priority queue to build a distance matrix for the 15 cities I need to visit, and then solve TSP on that matrix with DP. I had previously only used the O(n^2) version of Dijkstra. After working out how to implement the priority-queue version, I finally got it running (well enough to go from 240 seconds to 0.6 seconds for 40,000 cities). But now I am stuck at the TSP part.
Here are the materials I used for learning TSP:
Quora
GeeksForGeeks
I sort of understand the algorithm (but not completely), and I am having trouble implementing it. Until now I have only done dynamic programming with arrays of the form dp[int] or dp[int][int]. Now that my dp matrix has to be dp[subset][int], I have no idea how to do this.
My questions are:
How do I handle the subsets with dynamic programming? (an example in C++ would be appreciated)
Do the algorithms I linked to allow visiting cities more than once, and if they don't what should I change?
Should I perhaps use another TSP algorithm instead? (I noticed there are several ways to do it). Keep in mind that I must get the exact value, not approximate.
Edit:
After some more research I stumbled across some competitive programming contest lectures from Stanford and managed to find TSP here (slides 26-30). The key is to represent the subset as a bitmask. This still leaves my other questions unanswered though.
Can any changes be made to that algorithm to allow visiting a city more than once? If so, what are those changes? Otherwise, what should I try?
I think you can use the dynamic programming solution and, for each pair of nodes, add a second edge whose weight is the shortest path between them. See also this question: Variation of TSP which visits multiple cities.
Here is a TSP implementation; you will find the link to the implemented problem in the post.
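The shortest-path edges mentioned here could be computed with Dijkstra and a priority queue, run once from each city that has to be visited. This is a minimal sketch assuming an adjacency-list graph with non-negative weights; the `Graph` alias and the function name are choices made for this example.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, weight) pairs; weights are assumed non-negative.
using Graph = std::vector<std::vector<std::pair<int, long long>>>;

// Returns the shortest distance from source to every node.
// Unreachable nodes keep the value numeric_limits<long long>::max().
std::vector<long long> dijkstra(const Graph& adj, int source) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adj.size(), INF);

    using Item = std::pair<long long, int>;  // (distance, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;

    dist[source] = 0;
    pq.push(Item(0, source));
    while (!pq.empty()) {
        const Item top = pq.top();
        pq.pop();
        const long long d = top.first;
        const int u = top.second;
        if (d > dist[u]) continue;  // stale queue entry, already improved
        for (const auto& edge : adj[u]) {
            const int v = edge.first;
            const long long nd = d + edge.second;
            if (nd < dist[v]) {
                dist[v] = nd;
                pq.push(Item(nd, v));
            }
        }
    }
    return dist;
}
```

Running this from each of the 15 cities and keeping only the entries for the other 14 gives the 15 x 15 matrix that the DP step works on.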
The algorithms you linked don't allow visiting cities more than once.
For your third question, I think Phpdna's answer was good.
Can cities be visited more than once? Yes and no. In your first step, you reduce the problem to the 15 relevant cities. This results in a complete graph, i.e. one where every node is connected to every other node. The connection between two such nodes might involve multiple cities on the original map, including some of the relevant ones, but that shouldn't be relevant to your algorithm in the second step.
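To make the second step concrete, here is a minimal sketch of the bitmask DP over the reduced complete graph. It assumes the tour starts and ends at city 0 (the question does not say whether the route has to return to the start) and that `dist` already holds the shortest-path distances between the relevant cities. The function name `shortestTour` is made up for this example.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// dist[i][j]: precomputed shortest-path distance between relevant cities i and j.
// Returns the length of the cheapest closed tour that starts and ends at city 0.
// Intended for small n (around 15 to 20), since the table has 2^n * n entries.
long long shortestTour(const std::vector<std::vector<long long>>& dist) {
    const int n = static_cast<int>(dist.size());
    if (n <= 1) return 0;
    const long long INF = std::numeric_limits<long long>::max() / 4;
    const unsigned full = (1u << n) - 1;

    // dp[mask][last]: cheapest route that starts at city 0, visits exactly
    // the cities whose bits are set in mask, and currently stands at last.
    std::vector<std::vector<long long>> dp(1u << n, std::vector<long long>(n, INF));
    dp[1][0] = 0;

    for (unsigned mask = 1; mask <= full; ++mask) {
        if (!(mask & 1u)) continue;  // every partial route contains city 0
        for (int last = 0; last < n; ++last) {
            if (!(mask & (1u << last)) || dp[mask][last] >= INF) continue;
            for (int next = 0; next < n; ++next) {
                if (mask & (1u << next)) continue;  // already visited
                const unsigned nmask = mask | (1u << next);
                dp[nmask][next] = std::min(dp[nmask][next],
                                           dp[mask][last] + dist[last][next]);
            }
        }
    }

    long long best = INF;
    for (int last = 1; last < n; ++last)
        best = std::min(best, dp[full][last] + dist[last][0]);
    return best;
}
```

The subset is simply an unsigned integer used as the first index, which is what replaces the dp[subset][int] idea from the question. Because the entries of `dist` are shortest paths, a leg of the tour may pass through other cities on the original map, so revisits are handled implicitly.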
As for whether to use a different algorithm, I would perhaps do a depth-first search through the graph. Using a minimum spanning tree, you can compute upper and lower bounds on the cost of visiting the remaining cities, and use them to pick promising solutions and to discard hopeless ones (aka pruning). A lot of research has also been done on this topic; just search the web. For example, in cases where the map is actually Cartesian (i.e. the travelling costs are the distances between points on a plane), you can exploit this to improve the algorithms a bit.
Lastly, if you really intend to increase the number of visited cities, you will find that the time for computing it increases vastly, so you will have to abandon your requirement for an exact solution.
OpenCV has an implementation of a max-flow algorithm (class GCGraph in file gcgraph.hpp). It's available here.
Does anyone know which particular max-flow algorithm is implemented by this class?
I am not 100% confident about this, but I believe that the algorithm is based on this research paper describing max-flow algorithms for computer vision. Specifically, Section 3 describes a new algorithm for computing maximum flows.
I haven't lined up every detail of the paper's algorithm with the implementation of the algorithm, but many details seem to match:
The algorithm described works by using a bidirectional search from both s and t, which the implementation is doing as well: for example, there's a comment reading // grow S & T search trees, find an edge connecting them.
The algorithm described keeps track of a set of orphaned nodes, which the variable std::vector<Vtx*> orphans seems to track in the implementation.
The algorithm described works by building up a set of trees and reusing them; the algorithm implementation keeps track of a tree associated with each node.
I hope this helps!
I apologize if this question is a bit broad, but I'm having a difficult time trying to understand how I would create a minimum cost spanning tree. This is in C++ if it matters at all.
From what I understand, you would use Kruskal's to select the minimum cost edges for building the spanning tree. My thinking is to read the edges into a minheap and that way you can remove from the top in order to get the edge with the minimum cost.
So far I've only been able to implement the minheap and the sets for union-find. I am still unsure what purpose union-find and a sorting algorithm serve in creating a spanning tree.
I would greatly appreciate any advice.
EDIT: I am not limited to union-find, a minheap, Kruskal's, and a sorting algorithm, nor am I required to use any of them. These were just the items suggested by the instructor.
These two structures serve different purposes in the algorithm. Kruskal's algorithm works by adding the cheapest possible edge at each point that doesn't form a cycle. It can be shown using some not particularly complex math that this guarantees that the resulting spanning tree is minimal.
The intuition behind why this works is as follows. Suppose that Kruskal's algorithm is not optimal and that there is a cheaper spanning tree. Sort all of the edges in that tree by weight, then compare those edges in sorted order to the edges chosen by Kruskal's algorithm in sorted order. Since we assume for contradiction that Kruskal's algorithm isn't optimal, there must be some place in the sequences where there's a disagreement.
If in this disagreement Kruskal's algorithm has a lighter edge than the optimal solution, then we can make the optimal solution even better by adding that edge in, finding the cycle it creates, then deleting the heaviest edge in the cycle. That edge can't be the edge we just added, because otherwise that would have created a cycle in the MST produced by Kruskal's algorithm and Kruskal's algorithm never adds an edge that creates a cycle.
So this means that Kruskal's algorithm must have diverged from the optimal solution by not adding some light edge. But the only reason Kruskal's algorithm skips an edge is if it creates a cycle, and this means that there must be a cycle in the optimal MST, also a contradiction. This means that our assumption was wrong and that Kruskal's algorithm must be optimal.
Hopefully, this motivates why Kruskal's algorithm needs the heap and the union-find structure. We need the heap so that we can get back all the edges in sorted order. If we don't visit the edges in this order, then the above proof breaks down and all bets are off. Interestingly, you don't actually need a heap; you just need some way of visiting all the edges in sorted order. If you want, you can just dump all the edges into a giant array and then sort the array. This doesn't change the runtime of the algorithm from the binary heap case if you use a fast sort.
The union-find structure is a bit trickier. At each point in Kruskal's algorithm you need to be able to tell whether adding an edge would create a cycle in the graph. One way to do this is to store a structure that keeps track of which nodes are already connected to one another. That way, when adding an edge, you can check whether the endpoints are already connected. If they are, then the edge would form a cycle and should be ignored. The union-find structure is a way of maintaining this information efficiently, and its two operations map directly onto the algorithm. The union step corresponds to connecting two distinct groups of nodes that were previously not connected, as happens when you add an edge that joins two trees in different parts of the spanning forest. The find step gives you a way to check whether two nodes are already connected; if so, you should skip the current edge.
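Putting the two pieces together, a minimal sketch of Kruskal's algorithm might look like the following. It sorts an edge array instead of using a heap, and the names `Edge`, `DisjointSets` and `kruskal` are chosen for this example rather than taken from any particular library.

```cpp
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

struct Edge {
    int u, v;
    long long w;
};

// Union-find with path halving and union by size.
struct DisjointSets {
    std::vector<int> parent, size;

    explicit DisjointSets(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);  // each node is its own set
    }

    // Representative of the set that contains x.
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    // Merges the sets of a and b; returns false if they were already joined.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);
        parent[b] = a;
        size[a] += size[b];
        return true;
    }
};

// Returns the edges of a minimum spanning tree of a connected graph with
// nodes 0..n-1 (a minimum spanning forest if the graph is disconnected).
std::vector<Edge> kruskal(int n, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.w < b.w; });
    DisjointSets sets(n);
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        // unite() fails exactly when the edge would close a cycle.
        if (sets.unite(e.u, e.v)) tree.push_back(e);
    }
    return tree;
}
```

Swapping the sort for a min-heap that is popped edge by edge would produce the same tree; the union-find part stays unchanged.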
Hope this helps!
Is there any fast way to determine the size of the largest strongly connected component in a graph?
The obvious approach would be to determine every SCC (which could be done using two DFS passes, I suppose) and then loop through them and take the maximum.
I'm pretty sure there has to be some better approach if I only need to have the size of that component and only the largest one, but I can't think of a good solution. Any ideas?
Thanks.
Let me answer your question with another question -
How can you determine which value in a set is the largest without examining all of the values?
Firstly, you could use Tarjan's algorithm, which needs only one DFS instead of two. The SCCs form a DAG, and the algorithm finds them in reverse topological order. So if you have a sense of the graph (such as a visual representation) and you know that relatively big SCCs occur at the end of the DAG, you could stop the algorithm once the first few SCCs are found.
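For illustration, here is a minimal recursive sketch of Tarjan's algorithm that only tracks the size of the largest component. It assumes the graph is an adjacency list over nodes 0..n-1, and the function name `largestSccSize` is invented for this example. It does not implement the early stop, since that depends on knowledge of the particular graph.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// adj[u] lists the nodes reachable from u by a single arc.
// Returns the number of nodes in the largest strongly connected component.
int largestSccSize(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> index(n, -1);  // DFS discovery order, -1 = unvisited
    std::vector<int> low(n, 0);     // smallest index reachable via the stack
    std::vector<int> stack;         // nodes of components not yet closed
    std::vector<bool> onStack(n, false);
    int counter = 0;
    int best = 0;

    std::function<void(int)> dfs = [&](int u) {
        index[u] = low[u] = counter++;
        stack.push_back(u);
        onStack[u] = true;
        for (int v : adj[u]) {
            if (index[v] == -1) {
                dfs(v);
                low[u] = std::min(low[u], low[v]);
            } else if (onStack[v]) {
                low[u] = std::min(low[u], index[v]);
            }
        }
        if (low[u] == index[u]) {  // u is the root of a component
            int size = 0;
            while (true) {
                const int v = stack.back();
                stack.pop_back();
                onStack[v] = false;
                ++size;
                if (v == u) break;
            }
            best = std::max(best, size);
        }
    };

    for (int u = 0; u < n; ++u)
        if (index[u] == -1) dfs(u);
    return best;
}
```

The recursion depth can reach the number of nodes, so very large graphs would need an explicit stack instead.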