Graph theory: Breadth First Search - c++

There are n vertices connected by m edges. Some of the vertices are special and others are ordinary. There is atmost one path to move from one vertex to another.
First Query:
I need to find out how many pairs of special vertices exists which are connected directly or indirectly.
My approach:
I"ll apply BFS (via queue )to see how many nodes are connected to each other somehow. Let number of special vertices I discover in this be n, then answer to my query would be nC2. I'll repeat this till all vertices are visited.
Second Query:
How many vertices lie on path between any two special vertices.
My approach:
In my approach for query 1, I'll apply BFS to find out path between any two special vertices and then backtrack and mark the vertices lying on the path.
Problem:
Number of vertices can be as high as 50,000. So, applying BFS and then I guess, backtracking would be slower for my time constraint (2 seconds).
I have list of all vertices and their adjacency list. Now while pushing vertices in my queue while BFS, can I somehow calculate answer to query 2 also? Is there a better approach one can use to solve the problem? Input format will be such that I'll be told whether a vertex is special or not one by one and then I'll be given info about i th pathway which connects two vertices.There is atmost one path to move from one vertex to another.

The first query is solved by splitting your forest in trees.
Starting with the full set of vertices, pick the one, then visit every node you can from there, until you cannot visit any more vertex. This is one tree. Repeat for each tree.
You now have K bags of vertices, each containing 0-j special ones. That answers the first question.
For the second question, I suppose a trivial solution is indeed to BFS the path between a vertex to another for each pair in their sub-graph.
You could also take advantage of the tree nature of your sub-graph. This question: How to find the shortest simple path in a Tree in a linear time? mentions it. (I have not really dug into this yet, though)

For the first query, one round of BFS and some simple calculation as you have described is optimal.
For the second query, assuming the worst case where all vertices are special and the graph is a tree, doing a BFS per query is going to give O(Q|V|) complexity, where Q is the number of queries. You are going to run into trouble if Q is larger than 104 and |V| is also larger than 104.
In the worst case, we are basically solving the all pairs shortest path problem, but on a tree/forest. When |V| is small, we can do BFS on all nodes, which results in O(|V|2) algorithm. However, there is a faster algorithm:
Read all second type queries and store all the pairs in a set S.
For each tree in the forest:
Choose a root node for the current tree. Calculate the distance from the root node to all the other nodes in the current tree (regardless it is special or not).
Calculating lowest common ancestor (LCA) for all pairs of nodes being queried (which are stored in set S). This can be done with Tarjan's offline LCA algorithm.
Calculate the distance between a pair of node by: dist(root, a) + dist(root, b) - dist(root, lca(a,b))

Let the arr be bool array where arr[i] is 1 if it is special and 0 otherwise.
find-set(i) returns the root node of the tree. So any nodes lying in the same tree returns the same number.
for(int i=1; i<n; i++){
for(int j=i+1; j<=n; j++){
if(arr[i]==1 && arr[j]==1){ //If both are special
if(find-set(i)==find-set(j)){ //and both i and j belong to the same tree
//k++ where k is answer to the first query.
//bfs(i,j) and find the intermediate vertices and do ver[i]=1 for the corresponding intermediate vertex/node.
}
}
}
}
finally count no of 1's in ver matrix which is the answer to the second query.

Related

When adding a vertex to a weighted undirected graph, which weight stays?

I'm doing an adjacency list implementation of a graph class in C++ (but that's kinda irrelevant). I have a weighted directionless graph and I am writing the addVertex method. I have basically everything down, but I'm not sure what I should do when I add a vertex in between two others that were already there and had their own corresponding weights.
Should I just throw away the old weight that the vertex stored? Should I use the new one that was passed in? A mix of both? Or does it not matter at all what I pick?
I just wanted to make sure that I was not doing something I shouldn't.
Thanks!
I guess it depends on what you want to achieve. Usually, an adjacency list is a nested list whereby each row i indicates the i-th node's neighbourhood. To be precise, each entry in the i-th node's neighbourhood represents an outgoing connection from node i to j. The adjacency list does not comprise edge or arc weights.
Hence, adding a vertex n should do not affect the existing adjacency list's entries but adds a new empty row n to the adjacency list. However, adding or removing edges alter the adjacency list's entries. Thus, adding a vertex n "between two other [nodes i and j] that were already there and had their own corresponding weights" implies that you remove the existing connection between i and j and eventually add two new connections (i,n) and (n,j). If there are no capacity restrictions on the edges and the sum of distances (i,n) and (n,j) dominates the distance (I,j) this could be fine. However, if the weights represent capacities (e.g. max-flow problem) you should keep both connections.
So your question seems to be incomplete or at least unprecise. I assume that your goal is to calculate the shortest distances between each pair of nodes within an undirected graph. I suggest keeping all the different connections in your graph. Shortest path algorithms can calculate the shortest connections between each node pair after you have finished your graph's creation.

how to find S or less nodes in tree with minimum distance? [duplicate]

Given an unoriented tree with weightless edges with N vertices and N-1 edges and a number K find K nodes so that every node from a tree is within S distance of at least one of the K nodes. Also, S has to be the smallest possible S, so that if there were S' < S at least one node would be unreachable in S' steps.
I tried solving this problem, however, I feel that my supposed solution is not very fast.
My solution:
set x=1
find nodes which are x distance from every node
let the node which has the most nodes in its distance be one of the K nodes.
recompute for every node whilst not counting already covered nodes.
do this till I find K number of K nodes. Then if every node is covered we are done else increase x.
This problem is called p-center, and you can find several papers online about it such as this. It is indeed NP for general graphs, but polynomial on trees, both weighted and unweighted.
For me it looks like a clustering problem. Try it with the k-Means (wikipedia) algorithm where k equals to your K. Since you have a tree and all vertices are connected, you can use as distance measurement the distance/number of edges between your vertices.
When the algorithm converts you get the K nodes which should be found. Then you can determine S by iterating through all k clusters. There you calculate the maximum distance for every node in the cluster to the center node. And the overall max should be S.
Update: But actually I see that the k-means algorithm does not produce a global optimum, so this algorithm wouldn't also produce the best result ...
You say N nodes and N-1 vertices so your graph is a tree. You are actually looking for a connected K-subset of nodes minimizing the longest edge.
A polynomial algorithm may be:
Sort all your edges increasing distance.
Then loop on edges:
if none of the 2 nodes are in a group, create a new group.
else if one node is in 1 existing goup, add the other to the group
else both nodes are in 2 different groups, then fuse the groups
When a group reach K, break the loop and you have your connected K-subset.
Nevertheless, you have to note that your group can contain more than K nodes. You can imagine the problem of having 4 nodes, closed two by two. There would be no exact 3-subset solution of your problem.

MST from an array of 2D triangles, but with a little twist

Here's an illustration of the steps taken thus far:
Pseudo-random rectangle generation
"Central node" insertion, rect separation, and final node selections
Delaunay triangulation (shown with previously selected nodes)
Rendering of triangle edges
At this point (Step 5), I would like to use this data to form a Minimum Spanning Tree, but there's a slight catch...
Somewhere in the graph (likely near the center, but not always) will be a node that requires between 3-5 connections to it from other unique nodes. This complicates things, since every other node should only contain a single connection, and the data structures being used make it difficult to determine "what's connected to what" in a solid, traversable format.
So, given an array of triangles in the above format, and a random vertex to use as the "root node", how could I properly traverse the network to create an MST where there are at least 3 connections to our "central node", but no more than 5 connections to it? Is this possible?
Since it's rare to have a vertex in a Delaunay triangulation have much more than 6 edges, you can use brute force: there are only 20+15+6 ways to select 3, 4, or 5 edges out of 6 (respectively), so try all of them. For each of the 41 (up to 336 for degree 9) small trees (the root and a few edges) thus created, run either Kruskal's algorithm or Prim's algorithm starting with that tree already "found" to be part of the MST. (Ignore the root's other edges so as not to increase its degree further.) Then just pick the best one (including the cost of the seed tree).
As for the general neighbor information problem, it seems you just need to build a standard graph representation first. For example, you can make an adjacency list by scanning over all your Edge objects and storing each endpoint in a list associated with the other (in a map<Vector2<T>,vector<Vector2<T>>> or an equivalent based on whatever identifiers for your vertices).
I've taken a workaround approach...
After step 3 of my algorithm, I simply remove all edges which connect to the "central node", keeping track of which edges form the "ring" (aka "edge loop") around it, and run the MST on all remaining edges.
For the MST, I went with the boost graph library.
That made it easy to loop through the triangles I had, adding each of its three edges to an adjacency_list. Then a simple call to whichever boost-provided MST algorithm took care of the rest.
Finally, I readd the edges that were previously taken out. The shortest path is whatever it was in the previous step, plus the length of whichever readded edge that connects to another edge on the "ring" is shortest.
I can then add (or remove) an arbitrary number of previous edges to ensure there are between 3 to 5 edges connecting from the edge loop to the "central node".
Doing things in this order also allows me to know as soon as step 3 if we'll even have a valid number of edges, so we don't waste cycles running a MST.

Efficient algorithm to find weights of all cycles in an undirected weighted graph

My aim is to find all the cycles and their respective weights in an weighted undirected graph. The weight of a cycle is defined as sum of the weights of the paths that constitute the cycle. My preset algorithm does the following:
dfs(int start, int now,int val)
{
if(visited[now])
return;
if(now==start)
{
v.push_back(val);// v is the vector of all weights
return;
}
dfs through all nodes neighbouring to now;
}
I call dfs() from each start point:
for(int i=0;i<V;++i)
{
initialise visited[];
for(int j=0;j<adj[i].size();++j)// adj is the adjacency matrix
dfs(i,adj[i][j].first,adj[i][j].second);
// adj is a vector of vector of pairs
// The first element of the pair is the neighbour index and the second element is the weight
}
So the overall complexity of this algorithm is O(V*E)(I think so). Can anyone suggest a better approach?
Since not everyone defines it the same way, I assume...
the weigths are on the edges, not vertices
the only vertex in a cycle that is visited more than one time is the start/end vertex
a single vertex, or two connected vertices, are no cycle, ie. it needs at least three vertices
between two vertices in your graph, there can't be more than one edge (no "multigraph")
Following steps can determine if (at least) one odd-weighted cycle exists:
Remove all vertices that have only 0 or 1 connected edges (not really necessary, but it might be faster with it).
Split every even-weighted edge (only them, not the odd-weighted ones!) by inserting a new vertex. Eg. if the egde between vertex A and B has weight 4, it should become A-Z 2 and Z-B 2, or A-Z 3 and Z-B 1, or something like that.
The actual weight distribution is not important, you don't even need to save it. Because, starting after this step, all weights are not necessary anymore.
What did this actually do? Think like every odd weight is 1, and every even one is 2. (This doesn't change if there is a odd-weighted cycle: If 3+4+8 is odd then 1+2+2 is too). Now you're splitting all 2 into two 1. Since now the only existing weight is 1, determining if the sum is odd is the same as determining if the edge "count" is odd.
Now, for checking bipartiteness / 2coloring:
You can use a modified DFS here
A vertex can be unknown, 0, or 1. When starting, assign 0 to a single vertex, all others are unknown. The unknown neighbors of a 0-vertex always get 1, and the ones of a 1-vertex always get 0.
While checking neighbors of a vertex if they were already visited, check too if the number is different from the vertex you're processing now. If not, you just found out that your graph has odd-weigthed cycles and you can stop everything.
If you reach then end of DFS without finding that, there are no odd-weighted cycles.
For the implementation, note that you could reach the "end" of DFS while there are still unvisited vertices, namely if you have a disconnected graph. If so, you'll need to set one of the remaining vertices to a known number (0) and continue DFS from there on.
Complexity O(V + E) (this time really, instead of a exponential thing or not-working solutions).

Subset of vertices

I have a homework problem and i don't know how to solve it. If you could give me an idea i would be very grateful.
This is the problem:
"You are given a connected undirected graph which has N vertices and N edges. Each vertex has a cost. You have to find a subset of vertices so that the total cost of the vertices in the subset is minimum, and each edge is incident with at least one vertex from the subset."
Thank you in advance!
P.S: I have tought about a solution for a long time, and the only ideas i came up with are backtracking or an minimum cost matching in bipartite graph but both ideas are too slow for N=100000.
This may be solved in linear time using dynamic programming.
A connected graph with N vertices and N edges contains exactly one cycle. Start with detecting this cycle (with the help of depth-first search).
Then remove any edge on this cycle. Two vertices incident to this edge are u and v. After this edge removal, we have a tree. Interpret it as a rooted tree with the root u.
Dynamic programming recurrence for this tree may be defined this way:
w0[k] = 0 (for leaf nodes)
w1[k] = vertex_cost (for leaf nodes)
w0[k] = w1[k+1] (for nodes with one descendant)
w1[k] = vertex_cost + min(w0[k+1], w1[k+1]) (for nodes with one descendant)
w0[k] = sum(w1[k+1], x1[k+1], ...) (for branch nodes)
w1[k] = vertex_cost + sum(min(w0[k+1], w1[k+1]), min(x0[k+1], x1[k+1]), ...)
Here k is the node depth (distance from root), w0 is cost of the sub-tree starting from node w when w is not in the "subset", w1 is cost of the sub-tree starting from node w when w is in the "subset".
For each node only two values should be calculated: w0 and w1. But for nodes that were on the cycle we need 4 values: wi,j, where i=0 if node v is not in the "subset", i=1 if node v is in the "subset", j=0 if current node is not in the "subset", j=1 if current node is in the "subset".
Optimal cost of the "subset" is determined as min(u0,1, u1,0, u1,1). To get the "subset" itself, store back-pointers along with each sub-tree cost, and use them to reconstruct the subset.
Due to the number of edges are strict to the same number of vertices, so it's not the common Vertex cover problem which is NP-Complete. I think there's a polynomial solution here:
An N vertices and (N-1) edges graph is a tree. Your graph has N vertices and N edges. Firstly find the awful edge causing a loop and make the graph to a tree. You could use DFS to find the loop (O(N)). Removing any one of the edges in the loop would make a possible tree. In extreme condition you would get N possible trees (the raw graph is a circle).
Apply a simple dynamic planning algorithm (O(N)) to each possible tree (O(N^2)), then find the one with the least cost.