Find distance from a node to the one farthest from it BOOST - c++

I need to find the distance from every node to the node farthest from it in the minimum spanning tree. I have done this so far, but I have no clue how to find the longest distance from a node.
#include<iostream>
#include<boost/config.hpp>
#include<boost/graph/adjacency_list.hpp>
#include<boost/graph/kruskal_min_spanning_tree.hpp>
#include<boost/graph/prim_minimum_spanning_tree.hpp>
using namespace std;
using namespace boost;
int main()
{
typedef adjacency_list< vecS, vecS, undirectedS, property <vertex_distance_t,int>, property< edge_weight_t, int> > Graph;
int test=0,m,a,b,c,w,d,i,no_v,no_e,arr_w[100],arr_d[100];
cin>>test;
m=0;
while(m!=test)
{
cin>>no_v>>no_e;
Graph g(no_v);
property_map <Graph, edge_weight_t>:: type weightMap=get(edge_weight,g);
bool bol;
graph_traits<Graph>::edge_descriptor ed;
for(i=0;i<no_e;i++)
{
cin>>a>>b>>c;
tie(ed,bol)=add_edge(a,b,g);
weightMap[ed]=c;
}
property_map<Graph,edge_weight_t>::type weightM=get(edge_weight,g);
property_map<Graph,vertex_distance_t>::type distanceMap=get(vertex_distance,g);
property_map<Graph,vertex_index_t>::type indexMap=get(vertex_index,g);
vector< graph_traits<Graph>::edge_descriptor> spanning_tree;
kruskal_minimum_spanning_tree(g,back_inserter(spanning_tree));
vector<graph_traits<Graph>::vertex_descriptor>p(no_v);
prim_minimum_spanning_tree(g,0,&p[0],distanceMap,weightMap,indexMap,default_dijkstra_visitor());
w=0;
for(vector<graph_traits<Graph>::edge_descriptor>::iterator eb=spanning_tree.begin();eb!=spanning_tree.end();++eb) //spanning tree weight
{
w=w+weightM[*eb];
}
arr_w[m]=w;
d=0;
graph_traits<Graph>::vertex_iterator vb,ve;
for(tie(vb,ve)=vertices(g),.
arr_d[m]=d;
m++;
}
for( i=0;i<test;i++)
{
cout<<arr_w[i]<<endl;
}
return 0;
}
If I have a spanning tree with nodes 1 2 3 4, I need to find the longest distance from each of 1 2 3 4 in the spanning tree (and the longest distance can consist of many edges, not only one).

I won't give you exact code for this, but I'll give you an idea of how to do it.
First, the result of an MST (minimum spanning tree) algorithm is a tree. Think about the definition. A tree is a graph in which there is a path from every node to every other node and there are no cycles. Alternatively, a graph is a tree iff there is exactly one path from vertex u to vertex v for every u and v.
Based on this definition, you can define the following:
function DFS_Farthest (Vertex u, Vertices P)
begin
define farthest is 0
define P0 as empty set
add u to P
foreach v from neighbours of u and v is not in P do
begin
( len, Ps ) = DFS_Farthest(v, P)
if L(u, v) + len > farthest then
begin
P0 is Ps union P
farthest is len + L(u, v)
end
end
return (farthest, P0)
end
Then, for every vertex v in the graph, you call DFS_Farthest(v, empty set), and it gives you (farthest, P), where farthest is the distance to the farthest node and P is the set of vertices from which you can reconstruct the path from v to the farthest vertex.
So now to describe what it is doing. First, the signature. The first parameter is the vertex from which you want to find the farthest one. The second parameter is a set of banned vertices. So it says "Hey, give me the longest path from u to the farthest vertex such that the vertices from P are not on that path".
Next there is the foreach loop. There you look for the farthest vertices from the current vertex without visiting vertices already in P (the current vertex is already there). When you find a path longer than the best found so far, record it in farthest and P0. Note that L(u, v) is the length of the edge {u, v}.
At the end you return the length and the banned vertices (this is the path to the farthest vertex).
This is just a simple DFS (depth first search) algorithm where you remember already visited vertices.
Now about time complexity. Suppose you can get the neighbours of a given vertex in O(1) each (this depends on the data structure you have). The function visits every vertex exactly once, so a single call takes O(N). To know the farthest vertex from every vertex you have to call this function for every vertex, which gives this solution to your problem a time complexity of O(N^2).
My guess is that a better solution might be possible using dynamic programming, but this is just a guess. In general graphs, finding the longest path is an NP-hard problem, which makes me suspect that there might not be any significantly better solution. But that is another guess.
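The idea above can be sketched in C++ roughly as follows. This is a minimal sketch, not Boost code: the `Tree` alias and the function names `dfs_farthest` and `all_farthest` are assumptions made for illustration, and the spanning tree is assumed to be already stored as an adjacency list. Because the input is a tree, skipping the vertex we came from is enough to play the role of the banned set P, and only the distance is returned, not the path.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical representation: tree[u] holds (neighbour, edge weight) pairs.
using Tree = std::vector<std::vector<std::pair<int, int>>>;

// Length of the longest path that starts at u and does not go back to parent.
// In a tree, excluding the parent is sufficient to avoid revisiting vertices.
int dfs_farthest(const Tree& tree, int u, int parent) {
    int farthest = 0;
    for (const auto& edge : tree[u]) {
        const int v = edge.first;
        const int w = edge.second;
        if (v == parent) continue;
        farthest = std::max(farthest, w + dfs_farthest(tree, v, u));
    }
    return farthest;
}

// One DFS per vertex: O(N) per call, O(N^2) overall.
std::vector<int> all_farthest(const Tree& tree) {
    std::vector<int> result(tree.size());
    for (int u = 0; u < static_cast<int>(tree.size()); ++u) {
        result[u] = dfs_farthest(tree, u, -1);
    }
    return result;
}
```

To use it with the question's code, one would first copy the edges in `spanning_tree` into such an adjacency list, then read the answer for vertex u from `all_farthest(tree)[u]`.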

Related

How to calculate the total distance between various vertices in a graph?

Let's say I have a weighted, undirected, acyclic graph with no negative value weights, comprised of n vertices and n-1 edges. If I want to calculate the total distance between every single one of them (using edge weight) and then add it up, which algorithm should I use? If for example a graph has 4 vertices, connected like a-b, a-c, c-d then the program should output the total distance needed to go from a-d, a-c, a-b, b-c, b-d and so on. You could call it every possible path between the given vertices. The language I am using is C++.
I have tried using Dijkstra's and Prim's algorithms, but neither has worked for me. I have thought about using normal or multisource DFS, but I have been struggling with it for some time now. Is there really a fast way to calculate it, or have I misunderstood the problem entirely?
Since you have an acyclic graph, there is only one possible path between any two points. This makes things a lot simpler to compute and you don't need to use any real pathfinding algorithms.
Let's say we have an edge E that connects nodes A and B. Calculate how many nodes can be reached from node A, not using edge E (including A). Multiply that by the number of nodes that can be reached from node B, not using edge E (including B). Now you have the number of paths that travel through edge E. Multiply this by the weight of edge E, and you have the total contribution of edge E to the sum.
Do the same thing for every edge and add up the results.
To make the algorithm more efficient, each edge can store cached values that say the number of nodes that are reachable on each side of the edge.
You don't have to use a depth first search. Here is some pseudocode showing how you calculate the number of nodes reachable on a side of edge E very fast taking advantage of caching:
int count_nodes_reachable_on_edge_side(Edge e, Node a) {
// assume edge e directly connects to node a
if (answer is already cached in e) { return the answer; }
answer = 1; // a is reachable
for each edge f connected to node a {
if (f is not e) {
let b be other node f touches (not a)
answer += count_nodes_reachable_on_edge_side(f, b)
}
}
cache the answer in edge e;
return answer;
}
I already presented an O(N^2) algorithm in my other answer, but I think you can actually do this in O(N) time with this pseudo code:
let root be an arbitrary node on the graph;
let total_count be the total number of nodes;
let total_cost be 0;
process(root, null);
// Returns the number of nodes reachable from node n without going
// through edge p. Also adds to total_cost the contribution from
// all edges touching node n, except for edge p.
int process(Node n, Edge p)
{
count = 1
for each edge q that touches node n {
if (q != p) {
let m be the other node connected to q (not n)
sub_count = process(m, q)
total_cost += weight(q) * sub_count * (total_count - sub_count)
count += sub_count
}
}
return count
}
The run time of this is O(N), where N is the number of nodes, because process will be called exactly once for each node.
(For the detail-oriented readers: the loop inside process does not matter: there are O(N) iterations that call process, because process is called on each node exactly once. There are O(N) iterations that don't do anything (because q == p), because such an iteration can happen only once per process call.)
Every edge will also be visited. After we recursively count the number of nodes on one side of the edge, we can do a simple subtraction (total_count - sub_count) to get the number of nodes on the other side of the edge. When we have these two node counts, we can just multiply them together to get the total number of paths going through the edge, then multiply that by the weight, and add it to the total cost.
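The pseudocode above might translate to C++ along these lines. This is a sketch under assumptions: the `Tree` alias and the name `sum_of_pairwise_distances` are made up for illustration, the graph is assumed to be a connected tree stored as an adjacency list, and the parent node is tracked instead of the parent edge (equivalent when there are no parallel edges).

```cpp
#include <utility>
#include <vector>

// Hypothetical representation: tree[n] holds (neighbour, edge weight) pairs.
using Tree = std::vector<std::vector<std::pair<int, long long>>>;

// Returns the number of nodes reachable from n without going back to parent.
// Adds to total_cost the contribution of every edge below n.
long long process(const Tree& tree, int n, int parent, long long& total_cost) {
    const long long total_count = static_cast<long long>(tree.size());
    long long count = 1;
    for (const auto& edge : tree[n]) {
        const int m = edge.first;
        if (m == parent) continue;
        const long long sub_count = process(tree, m, n, total_cost);
        // sub_count nodes lie on one side of the edge, the rest on the other.
        total_cost += edge.second * sub_count * (total_count - sub_count);
        count += sub_count;
    }
    return count;
}

long long sum_of_pairwise_distances(const Tree& tree) {
    long long total_cost = 0;
    if (!tree.empty()) process(tree, 0, -1, total_cost);
    return total_cost;
}
```

For the a-b, a-c, c-d example with weights 1, 2 and 3, the edge contributions are 1*1*3, 2*2*2 and 3*1*3, which sum to 20. The recursion depth equals the height of the tree, so a very long chain may need an explicit stack instead.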

How does this Dijkstra code return minimum value (and not maximum)?

I am solving this question on LeetCode.com called Path With Minimum Effort:
You are given heights, a 2D array of size rows x columns, where heights[row][col] represents the height of cell (row, col). The aim is to go from the top left to the bottom right. You can move up, down, left, or right, and you wish to find a route that requires the minimum effort. A route's effort is the maximum absolute difference in heights between two consecutive cells of the route. Return the minimum effort required to travel from the top-left cell to the bottom-right cell. For example, if heights = [[1,2,2],[3,8,2],[5,3,5]], the answer is 2 (in green).
The code I have is:
class Solution {
public:
vector<pair<int,int>> getNeighbors(vector<vector<int>>& h, int r, int c) {
vector<pair<int,int>> n;
if(r+1<h.size()) n.push_back({r+1,c});
if(c+1<h[0].size()) n.push_back({r,c+1});
if(r-1>=0) n.push_back({r-1,c});
if(c-1>=0) n.push_back({r,c-1});
return n;
}
int minimumEffortPath(vector<vector<int>>& heights) {
int rows=heights.size(), cols=heights[0].size();
using arr=array<int, 3>;
priority_queue<arr, vector<arr>, greater<arr>> pq;
vector<vector<int>> dist(rows, vector<int>(cols, INT_MAX));
pq.push({0,0,0}); //weight,r,c: weight goes first so the queue is ordered by effort
dist[0][0]=0;
//Dijkstra
while(pq.size()) {
auto [wt,r,c]=pq.top();
pq.pop();
if(wt>dist[r][c]) continue;
vector<pair<int,int>> neighbors=getNeighbors(heights, r, c);
for(auto n: neighbors) {
int u=n.first, v=n.second;
int curr_cost=abs(heights[u][v]-heights[r][c]);
if(dist[u][v]>max(curr_cost,wt)) {
dist[u][v]=max(curr_cost,wt);
pq.push({dist[u][v],u,v});
}
}
}
return dist[rows-1][cols-1];
}
};
This gets accepted, but I have two questions:
a. Since we update dist[u][v] if it is greater than max(curr_cost,wt), how does it guarantee that in the end we return the minimum effort required? That is, why don't we end up returning the effort of the one in red above?
b. Some solutions such as this one, short-circuit and return immediately when we reach the bottom right the first time (ie, if(r==rows-1 and c==cols-1) return wt;) - how does this work? Can't we possibly get a shorter dist when we revisit the bottom right node in future?
The problem statement requires that we find the path with the minimum "effort".
And "effort" is defined as the maximum difference in heights between adjacent cells on a path.
The expression max(curr_cost, wt) takes care of the maximum part of the problem statement. When moving from one cell to another, the distance to the new cell is either the same as the distance to the old cell, or it's the difference in heights, whichever is greater. Hence max(difference_in_heights, distance_to_old_cell).
And Dijkstra's algorithm takes care of the minimum part of the problem statement, where instead of using a distance from the start node, we're using the "effort" needed to get from the start node to any given node. Dijkstra's attempts to minimize the distance, and hence it minimizes the effort.
Dijkstra's has two closely related concepts: visited and explored. A node is visited when any incoming edge is used to arrive at the node. A node is explored when its outgoing edges are used to visit its neighbors. The key design feature of Dijkstra's is that after a node has been explored, additional visits to that node will never improve the distance to that node. That's the reason for the priority queue. The priority queue guarantees that the node being explored has the smallest distance of any unexplored nodes.
In the sample grid, the red path will be explored before the green path because the red path has effort 1 until the last move, whereas the green path has effort 2. So the red path will set the distance to the bottom right cell to 3, i.e. dist[2][2] = 3.
But when the green path is explored, and we arrive at the 3 at row=2, col=1, we have
dist[2][2] = 3
curr_cost=2
wt=2
So dist[2][2] > max(curr_cost, wt), and dist[2][2] gets reduced to 2.
The answers to the questions:
a. The red path does set the bottom right cell to a distance of 3, temporarily. But the result of the red path is discarded in favor of the result from the green path. This is the natural result of Dijkstra's algorithm searching for the minimum.
b. When the bottom right node is ready to be explored, i.e. it's at the head of the priority queue, then it has the best distance it will ever have, so the algorithm can stop at that point. This is also a natural result of Dijkstra's algorithm. The priority queue guarantees that after a node has been explored, no later visit to that node will reduce its distance.
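As an illustration of point b, here is a minimal standalone sketch of the same search with the early exit. The function name `minimum_effort` is an assumption, not LeetCode's signature, and the queue entries are deliberately ordered as {effort, row, col} so that the priority queue really pops the cell with the smallest effort first, which is what the early return relies on.

```cpp
#include <algorithm>
#include <array>
#include <climits>
#include <cstdlib>
#include <functional>
#include <queue>
#include <vector>

int minimum_effort(const std::vector<std::vector<int>>& heights) {
    const int rows = static_cast<int>(heights.size());
    const int cols = static_cast<int>(heights[0].size());
    using Entry = std::array<int, 3>;  // {effort, row, col}
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
    std::vector<std::vector<int>> dist(rows, std::vector<int>(cols, INT_MAX));
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};
    dist[0][0] = 0;
    pq.push({0, 0, 0});
    while (!pq.empty()) {
        const Entry top = pq.top();
        pq.pop();
        const int wt = top[0], r = top[1], c = top[2];
        if (wt > dist[r][c]) continue;  // stale entry, a better one was processed
        // The target is being explored, so its effort is already final.
        if (r == rows - 1 && c == cols - 1) return wt;
        for (int k = 0; k < 4; ++k) {
            const int u = r + dr[k], v = c + dc[k];
            if (u < 0 || u >= rows || v < 0 || v >= cols) continue;
            const int effort = std::max(wt, std::abs(heights[u][v] - heights[r][c]));
            if (effort < dist[u][v]) {
                dist[u][v] = effort;
                pq.push({effort, u, v});
            }
        }
    }
    return dist[rows - 1][cols - 1];
}
```

On the sample grid, the bottom right cell is first pushed with effort 3 via the red path and later with effort 2 via the green path; the entry with effort 2 is popped first, so the function returns 2.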

How to find largest bi-partite subgraph in the given graph?

Given an undirected, unweighted graph: it may be cyclic, and each vertex has a given value, as shown in the image.
Find the size of the largest bipartite subgraph (largest means the maximum number of connected vertices in that graph).
Answer:
The largest graph is the orange-coloured one, so the answer is 8.
My approach:
#define loop(i,n) for(int i=0;i<n;i++)
int vis[N+1];
vector<int> adj[N+1]; // graph in adjacency vector list
int dfs(int current_vertex,int parent,int original_value,int other_value){
int ans=0;
vis[current_vertex]=1; // mark as visited
// map for adding values from neighbours having same value
map<int,int> mp;
// if curr vertex has value original_value then look for the neighbours
// having value as other,but if other is not defined define it
if(value[current_vertex]==original_value){
loop(i,adj[current_vertex].size()){
int v=adj[current_vertex][i];
if(v==parent)continue;
if(!vis[v]){
if(value[v]==other_value){
mp[value[v]]+=dfs(v,current_vertex,original_value,other_value);
}
else if(other_value==-1){
mp[value[v]]+=dfs(v,current_vertex,original_value,value[v]);
}
}
}
}
//else if the current_vertex has other value than look for original_value
else{
loop(i,adj[current_vertex].size()){
int v=adj[current_vertex][i];
if(v==parent)continue;
if(!vis[v]){
if(value[v]==original_value){
mp[value[v]]+=dfs(v,current_vertex,original_value,other_value);
}
}
}
}
// find maximum length that can be found from neighbours of curr_vertex
map<int,int> ::iterator ir=mp.begin();
while(ir!=mp.end()){
ans=max(ans,ir->second);
ir++;
}
return ans+1;
}
calling :
// N is the number of vertices in original graph : n=|V|
for(int i=0;i<N;i++){
ans=max(ans,dfs(i,-1,value[i],-1));
memset(vis,0,sizeof(vis));
}
But I'd like to improve this to run in O(|V|+|E|) time, where |V| is the number of vertices and |E| is the number of edges. How do I do that?
This doesn't seem hard. Traverse the edge list and add each one to a multimap keyed by vertex label canonical pairs (the 1,2,3 in your diagram, e.g. with the lowest vertex label value first in the pair).
Now for each value in the multimap - which is a list of edges - accumulate the corresponding vertex set.
The biggest vertex set corresponds to the edges of the biggest bipartite graph.
This algorithm traverses each edge twice, doing a fixed number of map and set operations per edge. So its amortized run time and space is indeed O(|V|+|E|).
Note that it's probably simpler to implement this algorithm with an adjacency list representation than with a matrix because the list gives the edge explicitly. The matrix requires a more careful traversal (like a DFS) to avoid Omega(|V|^2) performance.
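A literal reading of the grouping step could look like the sketch below. Everything here is an assumption made for illustration: the function name, the edge-list input, and the choice to skip edges whose two endpoints carry the same value. It uses std::map and std::set for brevity, which costs a logarithmic factor; hashed containers would be needed to approach the stated bound. Note also that, as described, the grouping does not by itself check that the collected vertices are connected to each other.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// value[v] is the label of vertex v; edge_list holds (u, v) vertex pairs.
std::size_t largest_label_pair_vertex_set(
    const std::vector<int>& value,
    const std::vector<std::pair<int, int>>& edge_list) {
    // Canonical label pair (smaller label first) -> vertices touched by such edges.
    std::map<std::pair<int, int>, std::set<int>> vertices_by_pair;
    for (const auto& e : edge_list) {
        const int a = value[e.first];
        const int b = value[e.second];
        if (a == b) continue;  // assumption: same-label edges are not usable
        const std::pair<int, int> key(std::min(a, b), std::max(a, b));
        vertices_by_pair[key].insert(e.first);
        vertices_by_pair[key].insert(e.second);
    }
    std::size_t best = 0;
    for (const auto& kv : vertices_by_pair) {
        best = std::max(best, kv.second.size());
    }
    return best;
}
```

If connectivity matters, each group of edges would additionally need a traversal or a union-find pass to split its vertex set into connected components before taking the maximum.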

If edges are not inserted in the deque in sorted order of weights, does 0-1 BFS produce the right answer?

The general trend of 0-1 BFS algorithms is: if the edge is encountered having weight = 0, then the node is pushed to the front of the deque and if the edge's weight = 1, then it will be pushed to the back of the deque.
If we push the edges randomly, can 0-1 BFS still calculate the right answer? What if the edges are not entered into the deque in sorted order of their weights?
This is the general 0-1 BFS algorithm. If I skip out the last if and else parts and randomly push the edges, then what will happen?
To me, it should work, but then why is this algorithm made in this way?
void bfs (int start)
{
std::deque<int> Q; // double ended queue
Q.push_back(start);
distance[start] = 0;
while(!Q.empty())
{
int v = Q.front();
Q.pop_front();
for(int i = 0 ; i < edges[v].size(); i++)
{
// if distance of neighbour of v from start node is greater than sum of
// distance of v from start node and edge weight between v and its
// neighbour (distance between v and its neighbour of v) ,then change it
if(distance[edges[v][i].first] > distance[v] + edges[v][i].second)
{
distance[edges[v][i].first] = distance[v] + edges[v][i].second;
// if edge weight between v and its neighbour is 0
// then push it to front of
// double ended queue else push it to back
if(edges[v][i].second == 0)
{
Q.push_front(edges[v][i].first);
}
else
{
Q.push_back(edges[v][i].first);
}
}
}
}
}
It is all a matter of performance. While random insertion still finds the shortest path, you have to consider a lot more paths (exponential in the size of the graph). So basically, the structured insertion guarantees a linear time complexity. Let's start with why the 0-1 BFS guarantees this complexity.
The basic idea is the same as the one of Dijkstra's algorithm. You visit nodes ordered by their distance from the start node. This ensures that you won't discover an edge that would decrease the distance to a node observed so far (which would require you to compute the entire subgraph again).
In 0-1 BFS, you start with the start node and the distances in the queue are just:
d = [ 0 ]
Then you consider all neighbors. If the edge weight is zero, you push it to the front, if it is one, then to the back. So you get a queue like this:
d = [ 0 0 0 1 1]
Now you take the first node. It may have neighbors for zero-weight edges and neighbors for one-weight edges. So you do the same and end up with a queue like this (new nodes are marked with *):
d = [ 0* 0* 0 0 1 1 1*]
So as you see, the nodes are still ordered by their distance, which is essential. Eventually, you will arrive at this state:
d = [ 1 1 1 1 1 ]
Going from the first node over a zero-weight edge produces a total path length of 1. Going over a one-weight edge results in two. So doing 0-1 BFS, you will get:
d = [ 1* 1* 1 1 1 1 2* 2*]
And so on... So concluding, the procedure is required to make sure that you visit nodes in order of their distance to the start node. If you do this, you will consider every edge only twice (once in the forward direction, once in the backward direction). This is because when visiting a node, you know that you cannot get to the node again with a smaller distance. And you only consider the edges emanating from a node when you visit it. So even if the node is added to the queue again by one of its neighbors, you will not visit it because the resulting distance will not be smaller than the current distance. This guarantees the time complexity of O(E), where E is the number of edges.
So what would happen if you did not visit nodes ordered by their distance from the start node? Actually, the algorithm would still find the shortest path. But it will consider a lot more paths. So assume that you have visited a node and that node is put in the queue again by one of its neighbors. This time, we cannot guarantee that the resulting distance will not be smaller. Thus, we might need to visit it again and put all its neighbors in the queue again. And the same applies to the neighbors, so in the worst case this might propagate through the entire graph and you end up visiting nodes over and over again. You will find a solution eventually because you always decrease the distance. But the time needed is far more than for the smart BFS.
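For reference, a self-contained version of the structured insertion might look like this. It is a sketch: the `Graph` alias and the name `zero_one_bfs` are assumptions, and the distances are returned rather than stored in a global array as in the question.

```cpp
#include <climits>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical representation: graph[v] holds (neighbour, weight) pairs,
// where every weight is 0 or 1.
using Graph = std::vector<std::vector<std::pair<int, int>>>;

std::vector<int> zero_one_bfs(const Graph& graph, int start) {
    std::vector<int> dist(graph.size(), INT_MAX);  // INT_MAX marks "unreached"
    std::deque<int> q;
    dist[start] = 0;
    q.push_back(start);
    while (!q.empty()) {
        const int v = q.front();
        q.pop_front();
        for (const auto& edge : graph[v]) {
            const int to = edge.first;
            const int w = edge.second;
            if (dist[v] + w < dist[to]) {
                dist[to] = dist[v] + w;
                // A zero-weight edge keeps the distance, so the neighbour goes
                // to the front; a one-weight edge goes to the back. This keeps
                // the deque ordered by distance.
                if (w == 0) {
                    q.push_front(to);
                } else {
                    q.push_back(to);
                }
            }
        }
    }
    return dist;
}
```

Replacing the final if/else with an unconditional push_back would still produce correct distances, since a node is re-queued whenever its distance improves, but nodes could then be processed many times.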

Dijkstra's algorithm - vertex as coordinate

I went through Dijkstra's shortest path algorithm. While practicing, I encountered a question in which a vertex is not a single number (say 1, 2, 3, and so on) but a pair, given as (x, y) coordinates. I have never done this type of question, nor have I seen one. Can you please help me with how to approach such a question? An O(V^2) solution is perfectly welcome.
Map the coordinates to integer vertices using a hashmap. Now you have a graph with nodes as single numbers. Apply Dijkstra's algorithm. Time complexity: O(V) for converting to integer vertices, O(V^2) for running Dijkstra's algorithm. Therefore O(V^2) total complexity.
Pseudo code:
int cntr = 0;
for(Edge e : graph){
int from = e.from;
int to= e.to;
if(!map.contains(from)){
map.put(from, cntr++);
}
if(!map.contains(to)){
map.put(to, cntr++);
}
}
Each vertex would still have an id (which you could assign, if not given). The Cartesian coordinates are just additional attributes of the vertex, which could be used to compute distances between connected vertices. (sqrt(delta_x^2 + delta_y^2))
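Putting both suggestions together, a rough C++ sketch could look like this. The names `PointIndex`, `euclidean` and `dijkstra` are hypothetical, and the graph is assumed to be given as an adjacency matrix in which a negative entry means "no edge". A std::map is used instead of a hash map because std::pair has no standard hash, which adds a logarithmic factor to the id lookup only.

```cpp
#include <cmath>
#include <limits>
#include <map>
#include <utility>
#include <vector>

using Point = std::pair<int, int>;  // (x, y)

// Assigns consecutive ids 0, 1, 2, ... to coordinates in order of first use.
struct PointIndex {
    std::map<Point, int> id_of;
    std::vector<Point> point_of;

    int id(const Point& p) {
        const auto it = id_of.find(p);
        if (it != id_of.end()) return it->second;
        const int fresh = static_cast<int>(point_of.size());
        id_of[p] = fresh;
        point_of.push_back(p);
        return fresh;
    }
};

// Edge length between two connected coordinates.
double euclidean(const Point& a, const Point& b) {
    return std::hypot(a.first - b.first, a.second - b.second);
}

// O(V^2) Dijkstra over an adjacency matrix w; w[u][v] < 0 means no edge.
std::vector<double> dijkstra(const std::vector<std::vector<double>>& w, int source) {
    const int n = static_cast<int>(w.size());
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(n, inf);
    std::vector<bool> done(n, false);
    dist[source] = 0.0;
    for (int step = 0; step < n; ++step) {
        int u = -1;  // closest vertex not yet finalized
        for (int i = 0; i < n; ++i) {
            if (!done[i] && (u == -1 || dist[i] < dist[u])) u = i;
        }
        if (u == -1 || dist[u] == inf) break;  // the rest is unreachable
        done[u] = true;
        for (int v = 0; v < n; ++v) {
            if (w[u][v] >= 0 && dist[u] + w[u][v] < dist[v]) {
                dist[v] = dist[u] + w[u][v];
            }
        }
    }
    return dist;
}
```

A caller would run every endpoint of every edge through `PointIndex::id`, fill the matrix with `euclidean` lengths (or with whatever weights the problem supplies), and then call `dijkstra` with the id of the start coordinate.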