I need to find an optimal solution for a Travelling Salesman Problem on graphs with a small number of vertices (< 10). Since this is an NP-hard problem, I am ready to use a brute-force approach; for so few vertices it should run in very little time.
I have slightly modified conditions, giving 2 problems:
(A)
The graph is bidirectional, with different weights in each direction.
All vertices are connected to all.
(Nice to have condition) You can visit the same vertices more than once, and travel the same paths more than once (however for eventual completeness you should not loop infinitely)
(B)
In addition to the conditions of (A), here you need to visit only a subset of the vertices, while you are still allowed to travel through all the other vertices of the graph (if that gives a better solution).
A while back I implemented a brute-force solution and some heuristics like Lin–Kernighan (using a simple matrix of weights), but I have never used graph data structures like those in Boost. I was wondering if there is an existing implementation that I could use, or a set of algorithms that could help me get an optimal solution. I would also appreciate advice on how to get part (B) right.
Thanks!
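For illustration, here is a minimal sketch of how (A) and (B) could be brute-forced with a plain weight matrix rather than a Boost graph. It rests on assumptions that are not stated in the question: the tour is closed (it returns to the first required vertex), the diagonal of the matrix is zero, and weights are non-negative. The names `Matrix`, `shortest_paths` and `brute_force_tour` are made up for this example. Running Floyd–Warshall first means that revisiting vertices and passing through non-required ones is already priced into each entry, so the permutation loop only has to order the required vertices.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Floyd-Warshall: replace each w[i][j] with the cost of the cheapest i -> j
// walk, so detours through other vertices (and revisits) are already included.
Matrix shortest_paths(Matrix w) {
    const std::size_t n = w.size();
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                w[i][j] = std::min(w[i][j], w[i][k] + w[k][j]);
    return w;
}

// Cheapest closed tour that starts at required[0], visits every vertex in
// `required` at least once, and may pass through any other vertex.
// Assumption: w[i][i] == 0 and all weights are non-negative.
double brute_force_tour(const Matrix& w, std::vector<int> required) {
    assert(!required.empty());
    const Matrix d = shortest_paths(w);
    double best = std::numeric_limits<double>::infinity();
    // Keep the start fixed and try every order of the remaining vertices.
    std::sort(required.begin() + 1, required.end());
    do {
        double cost = 0.0;
        for (std::size_t i = 0; i + 1 < required.size(); ++i)
            cost += d[required[i]][required[i + 1]];
        cost += d[required.back()][required.front()];  // close the tour
        best = std::min(best, cost);
    } while (std::next_permutation(required.begin() + 1, required.end()));
    return best;
}
```

Passing all vertices as `required` gives case (A); passing only the subset gives case (B). With fewer than 10 vertices this tries at most 8! orderings.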
Related
I am pretty new to the Boost Graph Library. I need to implement a visitor that is able to revisit the same vertices multiple times. For undirectedS graphs (or directed graphs with 2 edges per pair) this would result in an infinite search depth, but I am planning to put a finite limit on the maximum depth.
I am considering using breadth_first_search(), and I think what I need to do is un-mark visited vertices in vis.black_target(). Would you know how to do that?
Thanks!
EDIT: Technically I would like to explore all possible paths of a constant depth from a starting vertex. (even if some of them revisit the same vertices several times)
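One way to get all walks of a fixed depth, sketched here without BGL visitors, is a depth-limited recursive search that simply keeps no "visited" set. This is a swapped-in technique, not an answer to the colour-map question: it uses plain adjacency lists, and the names `Graph`, `Walk`, `extend` and `walks_of_depth` are hypothetical.

```cpp
#include <cassert>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // adjacency lists
using Walk = std::vector<int>;                // sequence of vertex ids

// Depth-limited DFS with no "visited" set: vertices may repeat, and the
// depth bound alone guarantees termination.
void extend(const Graph& g, Walk& current, int remaining, std::vector<Walk>& out) {
    if (remaining == 0) {
        out.push_back(current);
        return;
    }
    for (int next : g[current.back()]) {
        current.push_back(next);
        extend(g, current, remaining - 1, out);
        current.pop_back();  // backtrack
    }
}

// All walks with exactly `depth` edges that start at `start`.
std::vector<Walk> walks_of_depth(const Graph& g, int start, int depth) {
    assert(depth >= 0 && start >= 0 && start < static_cast<int>(g.size()));
    std::vector<Walk> out;
    Walk current{start};
    extend(g, current, depth, out);
    return out;
}
```

The number of walks grows exponentially with the depth (roughly the average degree raised to the depth), so this is only practical for small depth limits.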
I have a large graph (the number of vertices can be in the range of 50,000-100,000, the adjacency matrix need not be sparse). Edges in the graph can be removed/added, and I want to update the resulting connected components structure after such changes. I have implemented this in a straightforward fashion with a BFS search myself in C++ (keeping track of unordered_maps of vertices to connected component ids and updating them), but I am wondering if there is a more efficient way to do this using Boost's graph library.
I was able to find some similar questions here on Stack Overflow, and came to know of filtered_graph (and the connected_components function), but I am worried about the overhead of creating such filtered instances every time we add or remove an edge. (Or should this not be a concern at all?)
I believe your solution is essentially the best possible. If you are only allowed to add edges, then I believe the algorithm can be improved by keeping track of which vertices belong to each connected component; when an edge is added, you check whether its two endpoints belong to different components, and if so you merge those two components. This reduces the cost per added edge from quadratic to, in the best case, constant time. However, if edges can be both inserted and deleted, I don't see any asymptotically faster way to solve the problem other than what you described.
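For the add-only case, the bookkeeping described above is usually done with a union-find (disjoint-set) structure. The following is a minimal hand-rolled sketch, with vertices assumed to be numbered 0..n-1; the class name `DisjointSets` and its member names are made up here. Boost also ships a `boost::disjoint_sets` class that could play the same role.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

class DisjointSets {
public:
    explicit DisjointSets(int n) : parent_(n), size_(n, 1), components_(n) {
        assert(n >= 0);
        std::iota(parent_.begin(), parent_.end(), 0);  // each vertex is its own root
    }

    // Representative of the component containing v (with path halving).
    int find(int v) {
        while (parent_[v] != v) {
            parent_[v] = parent_[parent_[v]];
            v = parent_[v];
        }
        return v;
    }

    // Record edge (u, v). Returns true if it merged two components.
    bool add_edge(int u, int v) {
        int a = find(u), b = find(v);
        if (a == b) return false;             // already connected
        if (size_[a] < size_[b]) std::swap(a, b);
        parent_[b] = a;                        // attach smaller tree to larger
        size_[a] += size_[b];
        --components_;
        return true;
    }

    bool connected(int u, int v) { return find(u) == find(v); }
    int components() const { return components_; }

private:
    std::vector<int> parent_, size_;
    int components_;
};
```

This structure cannot undo a merge, which is why it does not help once edges may also be deleted.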
There are algorithms for maintaining connectivity under edge insertions and deletions that are faster than recalculating. This is called "dynamic graph connectivity". Here is a paper on experimental evaluations (some newer theoretical results have been found since, but it is unclear whether they have practical relevance).
Recently I asked a question on Stack Overflow asking for help to solve a problem. It is a travelling salesman problem where I have up to 40,000 cities but I only need to visit 15 of them.
I was advised to use Dijkstra with a priority queue to build a connectivity matrix for the 15 cities I need to visit, and then do TSP on that matrix with DP. I had previously only used the O(n^2) version of Dijkstra. After working out how to implement the priority-queue version, I finally got it done (enough to go from 240 seconds to 0.6 seconds for 40,000 cities). But now I am stuck at the TSP part.
Here are the materials I used for learning TSP :
Quora
GeeksForGeeks
I sort of understand the algorithm (but not completely), and I am having trouble implementing it. So far I have only done dynamic programming with arrays of the form dp[int] or dp[int][int]. Now that my dp matrix has to be dp[subset][int], I have no idea how to do this.
My questions are :
How do I handle the subsets with dynamic programming? (an example in C++ would be appreciated)
Do the algorithms I linked to allow visiting cities more than once, and if they don't what should I change?
Should I perhaps use another TSP algorithm instead? (I noticed there are several ways to do it). Keep in mind that I must get the exact value, not approximate.
Edit:
After some more research I stumbled across some competitive programming contest lectures from Stanford and managed to find TSP here (slides 26-30). The key is to represent the subset as a bitmask. This still leaves my other questions unanswered though.
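To make the bitmask idea concrete, here is a minimal sketch of the subset DP (often called Held–Karp). It is not taken from the linked slides. It assumes a closed tour that starts and ends at city 0, and the function name `tsp_bitmask` and the matrix layout are choices made for this example. Bit i of `mask` is set when city i has already been visited, so dp[subset][int] becomes an ordinary two-dimensional array indexed by an integer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// dist[i][j]: cost of going from city i to city j (need not be symmetric).
// Returns the cost of the cheapest tour that starts and ends at city 0 and
// visits every city exactly once.
long long tsp_bitmask(const std::vector<std::vector<long long>>& dist) {
    const int n = static_cast<int>(dist.size());
    assert(n >= 1 && n <= 20);  // the table has 2^n * n entries
    const long long INF = std::numeric_limits<long long>::max() / 4;

    // dp[mask][last]: cheapest way to start at 0, visit exactly the cities
    // in `mask`, and currently stand at `last`.
    std::vector<std::vector<long long>> dp(1u << n, std::vector<long long>(n, INF));
    dp[1][0] = 0;

    for (std::uint32_t mask = 1; mask < (1u << n); ++mask) {
        if (!(mask & 1u)) continue;  // every partial tour contains city 0
        for (int last = 0; last < n; ++last) {
            if (!(mask & (1u << last)) || dp[mask][last] == INF) continue;
            for (int next = 0; next < n; ++next) {
                if (mask & (1u << next)) continue;  // already visited
                long long& cell = dp[mask | (1u << next)][next];
                cell = std::min(cell, dp[mask][last] + dist[last][next]);
            }
        }
    }

    const std::uint32_t full = (1u << n) - 1;
    long long best = INF;
    for (int last = 0; last < n; ++last)
        if (dp[full][last] != INF)
            best = std::min(best, dp[full][last] + dist[last][0]);  // return home
    return best;
}
```

For 15 cities the table has 2^15 * 15 entries, which is small. The running time is O(2^n * n^2).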
Can any changes be made to that algorithm to allow visiting a city more than once? If so, what are those changes? Otherwise, what should I try?
I think you can use the dynamic programming solution and, for each pair of nodes, add a second edge weighted with the shortest path between them. See also this question: Variation of TSP which visits multiple cities.
Here is a TSP implementation; you will find the link to the problem it solves in the post.
The algorithms you linked don't allow visiting cities more than once.
For your third question, I think Phpdna's answer was good.
Can cities be visited more than once? Yes and no. In your first step, you reduce the problem to the 15 relevant cities. This results in a complete graph, i.e. one where every node is connected to every other node. The connection between two such nodes might involve multiple cities on the original map, including some of the relevant ones, but that shouldn't be relevant to your algorithm in the second step.
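The reduction step described above could look roughly like the following: run Dijkstra with a priority queue once from each relevant city and keep only the distances to the other relevant cities. This is a sketch under assumptions (non-negative costs, adjacency lists, C++17 for structured bindings), and the names `Edge`, `Roads`, `dijkstra` and `reduce_to_required` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge { int to; long long cost; };
using Roads = std::vector<std::vector<Edge>>;  // adjacency lists
const long long kInf = std::numeric_limits<long long>::max();

// Shortest distance from `source` to every city (kInf if unreachable).
std::vector<long long> dijkstra(const Roads& g, int source) {
    assert(source >= 0 && source < static_cast<int>(g.size()));
    std::vector<long long> dist(g.size(), kInf);
    using Item = std::pair<long long, int>;  // (distance, city)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[source] = 0;
    pq.push({0, source});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;  // stale queue entry
        for (const Edge& e : g[u]) {
            if (d + e.cost < dist[e.to]) {
                dist[e.to] = d + e.cost;
                pq.push({dist[e.to], e.to});
            }
        }
    }
    return dist;
}

// m[i][j] = shortest distance from required[i] to required[j] in the full map.
std::vector<std::vector<long long>> reduce_to_required(const Roads& g,
                                                       const std::vector<int>& required) {
    const std::size_t k = required.size();
    std::vector<std::vector<long long>> m(k, std::vector<long long>(k, kInf));
    for (std::size_t i = 0; i < k; ++i) {
        const std::vector<long long> dist = dijkstra(g, required[i]);
        for (std::size_t j = 0; j < k; ++j) m[i][j] = dist[required[j]];
    }
    return m;
}
```

Because each entry of the reduced matrix is already a shortest path, a tour over the reduced graph may correspond to a walk in the original map that passes through some cities several times. Unreachable pairs stay at `kInf` and would need to be handled before summing costs.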
As for whether to use a different algorithm: I would perhaps do a depth-first search through the graph. Using a minimum spanning tree, you can bound the cost of the remaining cities from above and below, and use those bounds to pick promising solutions and to discard hopeless ones (aka pruning). A lot of research has also been done on this topic; just search the web. For example, in cases where the map is actually Cartesian (i.e. the travelling costs are the distances between points on a plane), you can exploit this to improve the algorithms a bit.
Lastly, if you really intend to increase the number of visited cities, you will find that the time for computing it increases vastly, so you will have to abandon your requirement for an exact solution.
Given a huge collection of points (float64) in 2d space...
Is there a way to determine the nearest neighbour using a feature of OpenGL or DirectX?
I've implemented a kd-tree, which is still not fast enough.
A kd-tree should work just fine. But here's some hints.
I once implemented a kd-tree for a data set of a million points. Here's what I learned from it:
Did you try profiling your code? You might find that there are easy optimizations to make such as common helper functions needing to be forced inline.
Did you actually test your code to validate that it was culling tree branches for partitions that are easily identified as "too far away"? If you aren't careful, you can easily have a bug that does needless distance computations on points too far away.
Easiest thing: when comparing distances between points, you don't need to take the square root of (x2-x1)^2 + (y2-y1)^2. Comparing the squared distances gives the same ordering.
Most of the time spent in my code was just building the tree from the original data set, including multiple full sorts on each iteration deciding which axis was the best to partition on. An easier algorithm would be to just alternate between partitioning on the x and y axis for each tree branch and to cache the sorting order for each axis. It may not build the most optimal search tree, but the overall savings can be enormous.
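The hints above can be sketched in a few dozen lines. This is not the code described in the answer, just a minimal illustration with invented names (`Point`, `squared_distance`, `build`, `search`, `nearest`). It alternates the split axis at each level, uses `std::nth_element` to find the median instead of a full sort (a substitution for the cached sort order mentioned above), compares squared distances only, and skips the far branch whenever the splitting line is already farther away than the best point found so far.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <limits>
#include <vector>

using Point = std::array<double, 2>;

inline double squared_distance(const Point& a, const Point& b) {
    const double dx = a[0] - b[0], dy = a[1] - b[1];
    return dx * dx + dy * dy;  // no sqrt needed when only comparing
}

// Implicit kd-tree stored in the vector itself: the median of pts[lo, hi)
// ends up at the midpoint, smaller coordinates before it, larger after it.
void build(std::vector<Point>& pts, std::size_t lo, std::size_t hi, int axis) {
    if (hi - lo <= 1) return;
    const std::size_t mid = lo + (hi - lo) / 2;
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [axis](const Point& a, const Point& b) { return a[axis] < b[axis]; });
    build(pts, lo, mid, 1 - axis);      // alternate x / y at each level
    build(pts, mid + 1, hi, 1 - axis);
}

void search(const std::vector<Point>& pts, std::size_t lo, std::size_t hi, int axis,
            const Point& q, std::size_t& best, double& best_d2) {
    if (lo >= hi) return;
    const std::size_t mid = lo + (hi - lo) / 2;
    const double d2 = squared_distance(pts[mid], q);
    if (d2 < best_d2) { best_d2 = d2; best = mid; }
    const double delta = q[axis] - pts[mid][axis];  // signed distance to the split
    if (delta < 0) {  // query lies on the left side: search it first
        search(pts, lo, mid, 1 - axis, q, best, best_d2);
        if (delta * delta < best_d2)  // otherwise the far side cannot win
            search(pts, mid + 1, hi, 1 - axis, q, best, best_d2);
    } else {
        search(pts, mid + 1, hi, 1 - axis, q, best, best_d2);
        if (delta * delta < best_d2)
            search(pts, lo, mid, 1 - axis, q, best, best_d2);
    }
}

// Index (into the reordered pts) of the point nearest to q.
std::size_t nearest(const std::vector<Point>& pts, const Point& q) {
    assert(!pts.empty());
    std::size_t best = 0;
    double best_d2 = std::numeric_limits<double>::infinity();
    search(pts, 0, pts.size(), 0, q, best, best_d2);
    return best;
}
```

Call `build(pts, 0, pts.size(), 0)` once, then `nearest(pts, q)` for each query. Counting how often the far-side `search` call is actually taken is a quick way to check that pruning is happening.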
Anybody out there using BGL for large production servers?
How many nodes does your network consist of?
How do you handle community detection?
Does BGL have any cool ways to detect communities?
Sometimes two communities might be linked together by one or two edges, but these edges are not reliable and can fade away. Sometimes there are no edges at all.
Could someone speak briefly on how to solve this problem?
Please open my mind and inspire me.
So far I have managed to work out whether two nodes are on the same island (in a community) in the least expensive manner, but now I need to work out which two nodes on separate islands are closest to each other. We can only make minimal use of unreliable geographical data.
To put it figuratively, with a mainland and an island, and leaving the social-distance context aside: I want to work out which two bits of land are closest together across a body of water.
I've used the BGL for graphs with millions of nodes, but the size of the graph you can use depends on what algorithm you are trying to run. You can quickly compute distances between nodes. There are 4 shortest path algorithms which are most applicable depending on your data: (single pairs of points, for all pairs of points, sparse and dense graphs,...).
As for community detection, there aren't any algorithms built into the BGL specifically for that (but maybe you can contribute one when you are finished with your project). There are a few algorithms that might be helpful in building a community detection algorithm. The max-flow/min-cut algorithms are typically used in community detection: if a lot of flow is possible between two nodes, they are likely to be in the same community; if there isn't much flow, the min-cut is likely to represent the roads between communities. There are also heuristics to order the nodes of the graph to reduce bandwidth. Nodes making up "communities" are likely to be close to each other in such an ordering.
As far as I know BGL doesn't have any algorithms specifically for community detection.
By "island" do you mean a disconnected subgraph?
Also, graphs do not have any notion of 'distance'.
This 'social distance' is something that you are going to have to define. Once you've done that a large part of the work is done.
There are numerous methods listed on the page you linked to, most of those only require you to define something like a 'distance' metric, and then plug your definitions into the algorithm.
@David Nehme:
Graphs without edge-weights are only about connectedness, they have no notion of distance. If you want to talk about a network then you can talk about distance. But a graph with no edge-weights does not have any distance, unless you want to assume an implied edge-weight of 1 for all edges. But this is really just turning the graph into a network.
Also, he is talking about the distance between two disconnected graphs. To model this, you have to introduce an external concept for distance between nodes, separate from the edge-distance.
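To illustrate the "implied edge-weight of 1" reading, here is a small sketch of hop-count distances computed by breadth-first search on plain adjacency lists (the function name `hop_distances` and the -1 convention are assumptions of this example, not BGL API). Nodes on a different island come back as -1, which is exactly the case where an external distance measure has to be supplied.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hop counts from `source`, treating every edge as having weight 1.
// A result of -1 means the node is unreachable (on a different island).
std::vector<int> hop_distances(const std::vector<std::vector<int>>& adj, int source) {
    assert(source >= 0 && source < static_cast<int>(adj.size()));
    std::vector<int> dist(adj.size(), -1);
    std::queue<int> frontier;
    dist[source] = 0;
    frontier.push(source);
    while (!frontier.empty()) {
        const int u = frontier.front();
        frontier.pop();
        for (int v : adj[u]) {
            if (dist[v] == -1) {  // first time we reach v
                dist[v] = dist[u] + 1;
                frontier.push(v);
            }
        }
    }
    return dist;
}
```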