I want to induce a small graph on a large graph. Both graphs are shown below. Vertices with the same color are equivalent.
small graph (subject to be induced)
large graph (The smaller graph is induced on)
The problem is that none of the red vertices in the large graph has 4 neighbors, as the red vertex in the small graph does. So boost::vf2_subgraph_iso fails to embed the small graph in the large one. But if the matching algorithm were more tolerant (a maximal matching instead of matching all vertices), it could yield a closest match.
On the other hand, boost::mcgregor_common_subgraphs takes exponentially long to complete. A further consequence is that applying a maximum-common-subgraph algorithm to two completely incompatible graphs wastes a huge amount of CPU time. McGregor's algorithm is based on finding cliques: it starts with a 1-vertex subgraph and progresses toward larger subgraphs. For a subgraph with more than 10 vertices it takes an unacceptable amount of time to reach the obvious solution.
So what is the solution in this scenario? Is there a more tolerant algorithm for inducing a subgraph?
I have a directed graph with two weights between vertices, time and cost. The goal is to minimize time while keeping the cost under a maximum value given by the user. I was told to modify the Bellman-Ford algorithm by maintaining an ordered list based on cost, instead of a single distance, for each vertex of the graph. I can correctly implement the Bellman-Ford algorithm when considering only time as a factor, but what modifications to the algorithm do I need to keep cost under a maximum value as well?
I need to parallelize Kruskal's algorithm. The serial version uses the union-find algorithm to detect cycles in the undirected graph. Is there any way to parallelize this part of the code?
Well, it can be parallelized to some extent. It works as follows:
Initially, all the edges are sorted in ascending order. A main thread scans each edge from the beginning and decides whether adding the current edge forms a cycle. Our main aim in parallelizing the algorithm is to make these checks parallel.
This is where the worker threads come in. Each thread is given a certain number of edges to examine, and after every iteration (an iteration being the main thread adding a new edge) each thread checks whether its edges form a cycle with the current representation. As the main thread keeps adding edges, some threads see that certain of their edges already form a cycle with the current representation.
Such edges are marked as discarded. When the main thread reaches such an edge, it simply moves on to the next one without checking it.
Thus the checks are made in parallel, and the algorithm runs faster.
In fact, there is a nice paper that uses the same idea described above.
EDIT:
If you are concerned about the overall running time, you can also use a parallel sorting algorithm for the initial sort, as @Jarod42 suggested.
I need to write a program on breadth-first search. I have a good idea of the algorithm and can implement it, but I have a small problem: in my homework I have been asked to generate random connectivity among the nodes. My idea was to generate a random number between 0 and the number of all possible edges, representing the total number of edges, and then for each of those edges randomly select two nodes. But this doesn't sound good. I need help.
I have an implementation of Dijkstra's Algorithm, based on the code on this website. Basically, I have a number of nodes (say 10000), and each node can have 1 to 3 connections to other nodes.
The nodes are generated randomly within a 3D space. The connections are also randomly generated, but each node always tries to find connections to its closest neighbors first and slowly increases the search radius. Each connection is given a distance of one. (I doubt any of this matters, but it's just background.)
In this case, then, the algorithm is just being used to find the shortest number of hops from the starting point to all the other nodes. And it works well for 10,000 nodes. The problem I have is that, as the number of nodes increases, say toward 2 million, I use up all of my computer's memory when trying to build the graph.
Does anyone know of an alternative way of implementing the algorithm to reduce the memory footprint, or is there another algorithm out there that uses less memory?
According to your comment above, you are representing the edges of the graph with a distance matrix long dist[GRAPHSIZE][GRAPHSIZE]. This will take O(n^2) memory, which is too much for large values of n. It is also not a good representation in terms of execution time when you only have a small number of edges: it will cause Dijkstra's algorithm to take O(n^2) time (where n is the number of nodes) when it could potentially be faster, depending on the data structures used.
Since in your case each node is connected to at most 3 other nodes, you shouldn't use this matrix: instead, for each node store a list of the nodes it is connected to. Then, when you want to go over the neighbors of a node, you just iterate over this list.
In some specific cases you don't even need to store this list because it can be calculated for each node when needed. For example, when the graph is a grid and each node is connected to the adjacent grid nodes, it's easy to find a node's neighbors on the fly.
If you really cannot afford the memory, even after minimizing your graph representation, you could develop a variation of Dijkstra's algorithm using a divide-and-conquer approach.
The idea is to split the data into small chunks, so that you can run Dijkstra's algorithm within each chunk, for each of the points inside it.
Each solution generated for these small chunks is then treated as a single node in another chunk of data, from which you start another run of Dijkstra.
For example, consider the points below:
.B .C
.E
.A .D
.F .G
You can select the points closest to a given node, say within two hops, and then use the solution as part of the extended graph, treating the former points as a single set of points whose distance is the distance produced by the Dijkstra run.
Say you start from D:
select the closest points to D within a given number of hops;
use Dijkstra's algorithm upon the selected entries, commencing from D;
use the solution as a graph with the central node D and the last nodes in the shortest paths as nodes directly linked to D;
extend the graph, repeating the algorithm until all the nodes have been considered.
Although this adds costly extra processing, it lets you get around the memory limitation, and if you have other machines you can even distribute the work.
Please note this is just the idea of the process; what I've described is not necessarily the best way to do it. You may find something interesting by searching for distributed Dijkstra's algorithms.
I like boost::graph a lot. Its memory consumption is very decent (I've used it on road networks with 10 million nodes and 2 GB of RAM).
It has a Dijkstra implementation, but if the goal is to implement and understand it by yourself, you can still use their graph representation (I suggest adjacency list) and compare your result with theirs to be sure your result is correct.
Some people mentioned other algorithms. I don't think this will play a big role in memory usage, but more likely in speed. With 2M nodes, if the topology is close to a street network, the running time from one node to all others will be less than a second.
http://www.boost.org/doc/libs/1_52_0/libs/graph/doc/index.html