What algorithm opencv GCGRAPH (max flow) is based on? - c++

opencv has an implementation of max-flow algorithm (class GCGRAPH in file gcgraph.hpp). It's available here.
Does anyone know which particular max-flow algorithm is implemented by this class?

I am not 100% confident about this, but I believe that the algorithm is based on this research paper describing max-flow algorithms for computer vision. Specifically, Section 3 describes a new algorithm for computing maximum flows.
I haven't lined up every detail of the paper's algorithm with the implementation of the algorithm, but many details seem to match:
The algorithm described works by using a bidirectional search from both s and t, which the implementation is doing as well: for example, there's a comment reading // grow S & T search trees, find an edge connecting them.
The algorithm described keeps track of a set of orphaned nodes, which the variable std::vector<Vtx*> orphans seems to track in the implementation.
The algorithm described works by building up a set of trees and reusing them; the algorithm implementation keeps track of a tree associated with each node.
I hope this helps!

Related

How to implement TSP with dynamic in C++

Recently I asked a question on Stack Overflow asking for help to solve a problem. It is a travelling salesman problem where I have up to 40,000 cities but I only need to visit 15 of them.
I was pointed to use Dijkstra with a priority queue to make a connectivity matrix for the 15 cities I need to visit and then do TSP on that matrix with DP. I had previously only used Dijkstra with O(n^2). After trying to figure out how to implement Dijkstra, I finally did it (enough to optimize from 240 seconds to 0.6 for 40,000 cities). But now I am stuck at the TSP part.
Here are the materials I used for learning TSP :
Quora
GeeksForGeeks
I sort of understand the algorithm (but not completely), but I am having troubles implementing it. Before this I have done dynamic programming with arrays that would be dp[int] or dp[int][int]. But now when my dp matrix has to be dp[subset][int] I don't have any idea how should I do this.
My questions are :
How do I handle the subsets with dynamic programming? (an example in C++ would be appreciated)
Do the algorithms I linked to allow visiting cities more than once, and if they don't what should I change?
Should I perhaps use another TSP algorithm instead? (I noticed there are several ways to do it). Keep in mind that I must get the exact value, not approximate.
Edit:
After some more research I stumbled across some competitive programming contest lectures from Stanford and managed to find TSP here (slides 26-30). The key is to represent the subset as a bitmask. This still leaves my other questions unanswered though.
Can any changes be made to that algorithm to allow visiting a city more than once. If it can be done, what are those changes? Otherwise, what should I try?
I think you can use the dynamic solution and add to each pair of node a second edge with the shortest path. See also this question:Variation of TSP which visits multiple cities.
Here is a TSP implementation, you will find the link of the implemented problem in the post.
The algorithms you linked don't allow visiting cities more than once.
For your third question, I think Phpdna answer was good.
Can cities be visited more than once? Yes and no. In your first step, you reduce the problem to the 15 relevant cities. This results in a complete graph, i.e. one where every node is connected to every other node. The connection between two such nodes might involve multiple cities on the original map, including some of the relevant ones, but that shouldn't be relevant to your algorithm in the second step.
Whether to use a different algorithm, I would perhaps do a depth-first search through the graph. Using a minimum spanning tree, you can give an upper and lower bound to the remaining cities, and use that to pick promising solutions and to discard hopeless ones (aka pruning). There was also a bunch of research done on this topic, just search the web. For example, in cases where the map is actually carthesian (i.e. the travelling costs are the distance between two points on a plane), you can exploit this info to improve the algorithms a bit.
Lastly, if you really intend to increase the number of visited cities, you will find that the time for computing it increases vastly, so you will have to abandon your requirement for an exact solution.

K-nearest neighbour C/C++ implementation

Where can I find an serial C/C++ implementation of the k-nearest neighbour algorithm?
Do you know of any library that has this?
I have found openCV but the implementation is already parallel.
I want to start from a serial implementation and parallelize it with pthreads openMP and MPI.
Thanks,
Alex
How about ANN? http://www.cs.umd.edu/~mount/ANN/. I have once used the kdtree implementation, but there are other options.
Quoting from the website: "ANN is a library written in C++, which supports data structures and algorithms for both exact and approximate nearest neighbor searching in arbitrarily high dimensions."
I wrote a C++ implementation for a KD-tree with nearest neighbor search. You can easily extend it for K-nearest neighbors by adding a priority queue.
Update: I added support for k-nearest neighbor search in N dimensions
The simplest way to implement this is to loop through all elements and store K nearest. (just comparing). Complexity of this is O(n) which is not so good but no preprocessing is needed. So now really depends on your application. You should use some spatial index to partition area where you search for knn. For some application grid based spatial structure is just fine (just divide your world into fixed block and search only within closes blocks first). This is good when your entities are evenly distributed. Better approach is to use some hierarchical structure like kd-tree... It really all depends on what you need
for more information including pseudocode look in these presentations:
http://www.ulozto.net/xCTidts/dpg06-pdf
http://www.ulozto.net/xoh6TSD/dpg07-pdf

Can I implement potential field/depth first method for obstacle avoidance using boost graph?

I implemented an obstacle avoidance algorithm in Matlab which assigns every node in a graph a potential and tries to descend this potential (the goal of the pathplanning is in the global minimum). Now there might appear local minima, so the (global) planning needs a way to get out of these. I used the strategy to have a list of open nodes which are reachable from the already visited nodes. I visit the open node which has the smallest potential next.
I want to implement this in C++ and I am wondering if Boost Graph has such algorithms already. If not - is there any benefit from using this library if I have to write the algorithm myself and I will also have to create my own graph class because the graph is too big to be stored as adjacency list/edge list in memory.
Any advice appreciated!
boost::graph provides a list of Shortest Paths / Cost Minimization Algorithms. You might be interested in the followings: Dijkstra Shortest path, A*.
The algorithms can be easily customized. If that doesn't exactly fit your needs, take a look at the visitor concepts. It allows you to customize your algorithm at some predefined event point.
Finally Distributed BGL handles huge graph (potentially millions of nodes). It will work for you if your graph does not fit in memory.
You can find good overview of the Boost Graph Library here.
And of course, do not hesitate to ask more specific question about BGL on stackoverflow.
To my mind, boost::graph is really awesome for implementing new algorithms, because it provides various data holders, adaptors and commonly used stuff (which can obviously be used as parts of the newly constructed algorithms).
Last ones are also customizable due to usage of visitors and other smart patterns.
Actually, boost::graph may take some time to get used to, but in my opinion it's really worth it.

Boost Graph Library: Is there a neat algorithm built into BGL for community detection?

Anybody out there using BGL for large production servers?
How many node does your network consist of?
How do you handle community detection
Does BGL have any cool ways to detect communities?
Sometimes two communities might be linked together by one or two edges, but these edges are not reliable and can fade away. Sometimes there are no edges at all.
Could someone speak briefly on how to solve this problem.
Please open my mind and inspire me.
So far I have managed to work out if two nodes are on an island (in a community)
in a lest expensive manner, but now I need to work out which two nodes on separate islands are closest to each other. We can only make minimal use of unreliable geographical data.
If we figuratively compare it to a mainland and an island and take it out of social distance context. I want to work out which two bits of land are the closest together across a body of water.
I've used the BGL for graphs with millions of nodes, but the size of the graph you can use depends on what algorithm you are trying to run. You can quickly compute distances between nodes. There are 4 shortest path algorithms which are most applicable depending on your data: (single pairs of points, for all pairs of points, sparse and dense graphs,...).
As for community detection, there aren't any algorithms built-into the BGL specifically for that (but maybe you can contribute one when you are finished with your project). There are a few algorithms that might be helpful in building a community detection algorithm. The max-flow/min-cut algorithms are typically used in community detection (if there is a lot of flow possible between two nodes, then they are likely to be in the same community, if there isn't much flow, then the min-cut is likely to represent roads between communities). There are also heuristics to order the nodes of the graph to reduce bandwidth. Nodes making up "communities" are likely to be close to each other in such an ordering.
As far as I know BGL doesn't have any algorithms specifically for community detection.
By "island" do you mean a disconnected subgraph?
Also, graphs do not have any notion of 'distance'.
This 'social distance' is something that you are going to have to define. Once you've done that a large part of the work is done.
There are numerous methods listed on the page you linked to, most of those only require you to define something like a 'distance' metric, and then plug your definitions into the algorithm.
# David Nehme
Graphs without edge-weights are only about connectedness, they have no notion of distance. If you want to talk about a network then you can talk about distance. But a graph with no edge-weights does not have any distance, unless you want to assume an implied edge-weight of 1 for all edges. But this is really just turning the graph into a network.
Also, he is talking about the distance between two disconnected graphs. To model this, you have to introduce an external concept for distance between nodes, separate from the edge-distance.

HOT(Heap On Top) Queues

Could anyone point me to an example implementation of a HOT Queue or give some pointers on how to approach implementing one?
Here is a page I found that provides at least a clue toward what data structures you might use to implement this. Scroll down to the section called "Making A* Scalable." It's unfortunate that the academic papers on the subject mention having written C++ code but don't provide any.
Here is the link to the paper describing HOT queues. It's very abstract, that's why i wanted
to see a coded example (I'm still trying to fin my way around it).
http://www.star-lab.com/goldberg/pub/neci-tr-97-104.ps
The "cheapest", sort to speak variant of this is a two-level heap queue (maybe this sounds more familiar). What i wanted to do is to improve the running time of the Dijkstra's shortest path algorithm.
What i wanted to do is to improve the
running time of the Dijkstra's
shortest path algorithm.
Have you considered using the Boost Graph Library?
If you are using your own implementation of the algorithm you might already get better results using the one the BGL provides.
However It might be nontrivial to modify your code so it works with the BGL.
Of course speed-up could also be gained by not using Dijkstra at all but another algorithm.