Minimum path to connect k nodes in a graph with n nodes - c++

Given a graph with n nodes and n-1 edges, find the lowest amount of edges needed to connect k nodes. You are given m numbers k <= n for which you have to solve the problem. The edges are unweighted and the graph does not necessarily describe a binary tree.
For a graph with n=5 and k=4, a possible path could be 1-2-5-2-3.
My approach was a greedy method in which I start from the node with the highest rank and try to add edges from there, but I did not have much success in trying to formulate an algorithm.

Try to find longest path in your tree, you will take every edge on that route once, than choose random node that has an edge, that connect that node to already choosen elements. So find longest path in your tree if that path has k or more nodes it is the end, you choose element on that path, every edge will be taken once Else choose a random, untaken already node, that is connected by any edge directly to one of already taken nodes, as long as you do not have k nodes in your collection.

Related

Find minimum steps to reach the end of graph

Given a directed graph which has nodes from 1 to N where each node has a color(blue/red) which is given to us. Now nodes are connected and each edge has some weight. We have to reach N from 1 such that in the path abs(x-y)<=k where x is the number of red nodes and y is the number of blue nodes and k is some integer value.
I tried solving using the Dijkstra shortest path algorithm but that will work only when there is no condition on no. of blue and red balls. So how should things proceed with Dijkstra or this requires something else?

how to find S or less nodes in tree with minimum distance? [duplicate]

Given an unoriented tree with weightless edges with N vertices and N-1 edges and a number K find K nodes so that every node from a tree is within S distance of at least one of the K nodes. Also, S has to be the smallest possible S, so that if there were S' < S at least one node would be unreachable in S' steps.
I tried solving this problem, however, I feel that my supposed solution is not very fast.
My solution:
set x=1
find nodes which are x distance from every node
let the node which has the most nodes in its distance be one of the K nodes.
recompute for every node whilst not counting already covered nodes.
do this till I find K number of K nodes. Then if every node is covered we are done else increase x.
This problem is called p-center, and you can find several papers online about it such as this. It is indeed NP for general graphs, but polynomial on trees, both weighted and unweighted.
For me it looks like a clustering problem. Try it with the k-Means (wikipedia) algorithm where k equals to your K. Since you have a tree and all vertices are connected, you can use as distance measurement the distance/number of edges between your vertices.
When the algorithm converts you get the K nodes which should be found. Then you can determine S by iterating through all k clusters. There you calculate the maximum distance for every node in the cluster to the center node. And the overall max should be S.
Update: But actually I see that the k-means algorithm does not produce a global optimum, so this algorithm wouldn't also produce the best result ...
You say N nodes and N-1 vertices so your graph is a tree. You are actually looking for a connected K-subset of nodes minimizing the longest edge.
A polynomial algorithm may be:
Sort all your edges increasing distance.
Then loop on edges:
if none of the 2 nodes are in a group, create a new group.
else if one node is in 1 existing goup, add the other to the group
else both nodes are in 2 different groups, then fuse the groups
When a group reach K, break the loop and you have your connected K-subset.
Nevertheless, you have to note that your group can contain more than K nodes. You can imagine the problem of having 4 nodes, closed two by two. There would be no exact 3-subset solution of your problem.

Graph theory: Breadth First Search

There are n vertices connected by m edges. Some of the vertices are special and others are ordinary. There is atmost one path to move from one vertex to another.
First Query:
I need to find out how many pairs of special vertices exists which are connected directly or indirectly.
My approach:
I"ll apply BFS (via queue )to see how many nodes are connected to each other somehow. Let number of special vertices I discover in this be n, then answer to my query would be nC2. I'll repeat this till all vertices are visited.
Second Query:
How many vertices lie on path between any two special vertices.
My approach:
In my approach for query 1, I'll apply BFS to find out path between any two special vertices and then backtrack and mark the vertices lying on the path.
Problem:
Number of vertices can be as high as 50,000. So, applying BFS and then I guess, backtracking would be slower for my time constraint (2 seconds).
I have list of all vertices and their adjacency list. Now while pushing vertices in my queue while BFS, can I somehow calculate answer to query 2 also? Is there a better approach one can use to solve the problem? Input format will be such that I'll be told whether a vertex is special or not one by one and then I'll be given info about i th pathway which connects two vertices.There is atmost one path to move from one vertex to another.
The first query is solved by splitting your forest in trees.
Starting with the full set of vertices, pick the one, then visit every node you can from there, until you cannot visit any more vertex. This is one tree. Repeat for each tree.
You now have K bags of vertices, each containing 0-j special ones. That answers the first question.
For the second question, I suppose a trivial solution is indeed to BFS the path between a vertex to another for each pair in their sub-graph.
You could also take advantage of the tree nature of your sub-graph. This question: How to find the shortest simple path in a Tree in a linear time? mentions it. (I have not really dug into this yet, though)
For the first query, one round of BFS and some simple calculation as you have described is optimal.
For the second query, assuming the worst case where all vertices are special and the graph is a tree, doing a BFS per query is going to give O(Q|V|) complexity, where Q is the number of queries. You are going to run into trouble if Q is larger than 104 and |V| is also larger than 104.
In the worst case, we are basically solving the all pairs shortest path problem, but on a tree/forest. When |V| is small, we can do BFS on all nodes, which results in O(|V|2) algorithm. However, there is a faster algorithm:
Read all second type queries and store all the pairs in a set S.
For each tree in the forest:
Choose a root node for the current tree. Calculate the distance from the root node to all the other nodes in the current tree (regardless it is special or not).
Calculating lowest common ancestor (LCA) for all pairs of nodes being queried (which are stored in set S). This can be done with Tarjan's offline LCA algorithm.
Calculate the distance between a pair of node by: dist(root, a) + dist(root, b) - dist(root, lca(a,b))
Let the arr be bool array where arr[i] is 1 if it is special and 0 otherwise.
find-set(i) returns the root node of the tree. So any nodes lying in the same tree returns the same number.
for(int i=1; i<n; i++){
for(int j=i+1; j<=n; j++){
if(arr[i]==1 && arr[j]==1){ //If both are special
if(find-set(i)==find-set(j)){ //and both i and j belong to the same tree
//k++ where k is answer to the first query.
//bfs(i,j) and find the intermediate vertices and do ver[i]=1 for the corresponding intermediate vertex/node.
}
}
}
}
finally count no of 1's in ver matrix which is the answer to the second query.

topological sort on boost graph for vertices that are connected through a given vertex only

I have a large graph which represents a set of dependencies.
A -->B
A -->C
B -->F
C -->G
D -->E
E -->L
F -->H
F -->I
H -->K
J -->K
Given B as a starting node, result needs to be : B F I H K
I do not need any other nodes since they cannot be reached from B.
A user can specify a node and I need to collect all the paths and the nodes in those paths and print the topological sort of those nodes.
Currently I am implementing this by
Create a new graph
Extract all the outgoing edges from the given node.
Add those edges in a graph. From those edges, get the nodes and also add them to the new graph.
For each node in step 3, repeat (2) and (3)
On that new graph, perform a topological sort and get the order of the nodes
Is there any other better way to do this or known algorithm for finding the topological sort of a subset of nodes?
I would look into the depth_first_visit() function, and the Visitor Pattern provided by boost. The Visitor Pattern allows you to inspect vertices and edges as the tree is searched. The depth_first_visit() function allows you to start a DFS search starting at a specific vertex.

Subset of vertices

I have a homework problem and i don't know how to solve it. If you could give me an idea i would be very grateful.
This is the problem:
"You are given a connected undirected graph which has N vertices and N edges. Each vertex has a cost. You have to find a subset of vertices so that the total cost of the vertices in the subset is minimum, and each edge is incident with at least one vertex from the subset."
Thank you in advance!
P.S: I have tought about a solution for a long time, and the only ideas i came up with are backtracking or an minimum cost matching in bipartite graph but both ideas are too slow for N=100000.
This may be solved in linear time using dynamic programming.
A connected graph with N vertices and N edges contains exactly one cycle. Start with detecting this cycle (with the help of depth-first search).
Then remove any edge on this cycle. Two vertices incident to this edge are u and v. After this edge removal, we have a tree. Interpret it as a rooted tree with the root u.
Dynamic programming recurrence for this tree may be defined this way:
w0[k] = 0 (for leaf nodes)
w1[k] = vertex_cost (for leaf nodes)
w0[k] = w1[k+1] (for nodes with one descendant)
w1[k] = vertex_cost + min(w0[k+1], w1[k+1]) (for nodes with one descendant)
w0[k] = sum(w1[k+1], x1[k+1], ...) (for branch nodes)
w1[k] = vertex_cost + sum(min(w0[k+1], w1[k+1]), min(x0[k+1], x1[k+1]), ...)
Here k is the node depth (distance from root), w0 is cost of the sub-tree starting from node w when w is not in the "subset", w1 is cost of the sub-tree starting from node w when w is in the "subset".
For each node only two values should be calculated: w0 and w1. But for nodes that were on the cycle we need 4 values: wi,j, where i=0 if node v is not in the "subset", i=1 if node v is in the "subset", j=0 if current node is not in the "subset", j=1 if current node is in the "subset".
Optimal cost of the "subset" is determined as min(u0,1, u1,0, u1,1). To get the "subset" itself, store back-pointers along with each sub-tree cost, and use them to reconstruct the subset.
Due to the number of edges are strict to the same number of vertices, so it's not the common Vertex cover problem which is NP-Complete. I think there's a polynomial solution here:
An N vertices and (N-1) edges graph is a tree. Your graph has N vertices and N edges. Firstly find the awful edge causing a loop and make the graph to a tree. You could use DFS to find the loop (O(N)). Removing any one of the edges in the loop would make a possible tree. In extreme condition you would get N possible trees (the raw graph is a circle).
Apply a simple dynamic planning algorithm (O(N)) to each possible tree (O(N^2)), then find the one with the least cost.