[Question](http://imgur.com/KHBuDcf)
[Attempted Answer](http://imgur.com/aO0lblA)
Since it's a DFS traversal of a graph, we use a stack. I visited A first, as given in the question, then went to B (it's a directed graph), then to C. C doesn't lead anywhere else, so I have to go back through my stack, i.e. to B. From B I went to D. D either leads to C or I have to move back in my stack, so I moved back to B (since I already visited C). B is exhausted, so I went back to A, and A leads me to F (that edge goes both ways). G and H don't even have a link to the rest, so is it correct to ignore them, or should I visit them as well? What should be the correct DFS traversal answer?
The concept behind DFS/BFS is that while traversing you only visit the nodes reachable from the start, not the disconnected ones. So the set of nodes you visited is correct, and the order in which you visited them is also correct. However, you should try to make the stack representation a little clearer, since it's hard to make out how you followed the stack in the image of your attempt.
The best way to represent a DFS traversal is by building the corresponding spanning tree.
The graph of the figure in your question can be represented using adjacency lists that way:
A: B, C, F
B: C, D
C:
D: A, C
E: C, G
F: A, C
G: E
A DFS starting from A will only visit A, B, C, D and F and yield the following spanning tree:
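Reconstructing it from the adjacency lists above (visiting successors in the order listed), the spanning tree would be:

    A
    ├── B
    │   ├── C
    │   └── D
    └── F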
You can add to this tree the unused edges (those ignored because they lead to already visited vertices) and even give their classification (forward edges, backward edges and cross edges), but the spanning tree is probably enough.
For more general consideration: a DFS is a recursive procedure (which can be simulated using a stack) that can be described this way:
    DFS(g, cur, mark):
        mark[cur] = True
        foreach s in successors of cur in g:
            if not mark[s]:
                DFS(g, s, mark)
But you should already know ...
EDIT: here is a simple implementation in Python that produces the DOT used to build the figure:
https://gist.github.com/slashvar/d4954d04352fc38356f774198e3aa86b
I am trying to figure out a way to implement a DFS that finds the length of the longest path from a given node in a directed graph represented as an adjacency list, using a stack and not using recursion. Specifically, I want to implement the DFS so that as it runs, the stack gets populated as shown in the picture below.
If that isn't clear, this video is sort of how I want the stack to be built up as my program runs (DFS starts at around 12:45): https://www.youtube.com/watch?v=pcKY4hjDrxk
But I'm struggling to find a way to achieve this, as I am still pretty new to programming in general. My current code represents the graph as an unordered map, with each entry containing a vector of all the nodes that point to it, i.e.:
std::unordered_map<long, std::vector<long>> nodes;
And basically, I want to implement a DFS from all nodes with a key value of -1 in that unordered_map, as shown in the picture and in the video, with the stack getting populated as shown. I was thinking that way I can just record when the stack reaches its maximum size, and that would be the longest path.
One thing to note about the graphs in this specific problem is that each node will only have one outgoing edge, as shown in the picture. Is this possible, or will I have to use some sort of recursion to do what I want? Thanks in advance for any help.
You can probably use a task list instead of recursion. If you use the task list in FIFO order like a queue, you get a breadth-first search; if you use it LIFO like a stack, you get depth-first behavior.
However, note that it is possible for a DAG with N nodes to have O(2^(N/2)) possible paths! You should not need to evaluate all possible paths to solve your problem, though, so be careful not to write an algorithm that can take exponential time.
In order to do that, you will need to mark which nodes you have processed. Also, since you are looking for the longest path, you'll need to track per-node information about the longest path found so far.
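To make that concrete, here is a minimal sketch of the idea, not your exact code: a depth-first search driven by an explicit stack, with the per-node answer memoised in `longestFrom` so the running time stays linear rather than exponential. The names `adj`, `longestFrom` and the sample graph are made up, and the sketch assumes the graph is acyclic.

    #include <algorithm>
    #include <iostream>
    #include <stack>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    long longestPathFrom(long start,
                         const std::unordered_map<long, std::vector<long>>& adj,
                         std::unordered_map<long, long>& longestFrom) {
        // each stack entry is (node, "have its successors already been pushed?")
        std::stack<std::pair<long, bool>> st;
        st.push({start, false});
        while (!st.empty()) {
            auto [v, expanded] = st.top();
            st.pop();
            if (longestFrom.count(v)) continue;            // answer already known
            auto it = adj.find(v);
            if (expanded) {
                // all successors are solved, so combine their answers
                long best = 0;
                if (it != adj.end())
                    for (long w : it->second)
                        best = std::max(best, 1 + longestFrom[w]);
                longestFrom[v] = best;
            } else {
                st.push({v, true});                        // revisit v after its successors
                if (it != adj.end())
                    for (long w : it->second)
                        if (!longestFrom.count(w)) st.push({w, false});
            }
        }
        return longestFrom[start];
    }

    int main() {
        // tiny hypothetical DAG: 1 -> 2 -> 3 and 1 -> 3
        std::unordered_map<long, std::vector<long>> adj{{1, {2, 3}}, {2, {3}}};
        std::unordered_map<long, long> longestFrom;
        std::cout << longestPathFrom(1, adj, longestFrom) << "\n";   // prints 2
    }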
Although it's possible to achieve this without recursion, as an alternative I would suggest designing the function this way: it requires you to write less code and, since you're a beginner, it gives a nice intuition for the algorithm. You also won't need to create a stack of your own.
    #include <bits/stdc++.h>
    using namespace std;

    const int n = 100;
    vector<int> graph[n];
    int ans = 0, level = 0;
    int vis[n];

    void dfs(int src) {
        level++;                   // we are one level deeper
        ans = max(level, ans);     // remember the deepest level reached so far
        for (int x : graph[src]) {
            if (vis[x]) continue;  // skip nodes that were already visited
            vis[x] = 1;
            dfs(x);
        }
        level--;                   // undo the increment when leaving src
    }
I hard-coded the value of n and the graph; you can change them as needed for your structure.
This is where we take advantage of the call stack the program creates for the recursion tree.
This function will work in O(V+E) for a given graph of V nodes and E edges.
Note that if your graph is so large that the default call stack created by the program can't handle the recursion depth, then you will still have to write your own stack to handle the recursion.
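For example, a minimal driver for the function above (the edges here are made up) could mark each start node and then call it like this:

    int main() {
        // hypothetical edges: 0 -> 1 -> 2 and 0 -> 3
        graph[0] = {1, 3};
        graph[1] = {2};
        // start a traversal from every node that has not been visited yet
        for (int s = 0; s < 4; s++) {
            if (vis[s]) continue;
            vis[s] = 1;          // mark the start node before descending
            dfs(s);
        }
        cout << ans << "\n";     // longest chain of nodes seen: 3 (0 -> 1 -> 2)
    }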
"Arranging the vertices of a DAG according to increasing pre-number results in a topological sort." is not a true statement apparently, but I'm not seeing why it isn't. If the graph is directed and doesn't have cycles, then shouldn't the order in which we visit the vertices necessarily be the correct order in which we sort it topologically?
Arranging by increasing pre-number does not guarantee a valid topological sort. Consider this graph:
A
↓
B → C → D
The two valid topological orders of this graph are:
A, B, C, D
B, A, C, D
If you were to visit the nodes beginning with C, one possible pre-number order would be:
C, D, A, B
That is not a valid topological order. An even simpler example is this graph:
B → A
There is clearly one valid topological order, but if we were to visit A first and sort by pre-number, the resulting order would be backwards.
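To make the counterexample concrete, here is a small sketch (my own illustration, not part of the original answer) that records pre-numbers for the first graph above when the outer DFS loop happens to start at C:

    #include <iostream>
    #include <map>
    #include <set>
    #include <vector>

    std::map<char, std::vector<char>> g = {
        {'A', {'B'}}, {'B', {'C'}}, {'C', {'D'}}, {'D', {}}};
    std::vector<char> preorder;
    std::set<char> seen;

    void dfs(char v) {
        seen.insert(v);
        preorder.push_back(v);               // pre-number = position of first visit
        for (char w : g[v])
            if (!seen.count(w)) dfs(w);
    }

    int main() {
        for (char v : {'C', 'A', 'B', 'D'})  // an unlucky order for the outer loop
            if (!seen.count(v)) dfs(v);
        for (char v : preorder) std::cout << v << ' ';
        std::cout << '\n';                   // prints: C D A B  (edge B -> C is violated)
    }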
I am going through the book The Design and Analysis of Computer Algorithms. Reading the graph chapter, I am trying to implement DFS. The definition of the algorithm says that, for a graph G=(V,E), it partitions the edges in E into two sets T and B. An edge (v,w) is placed in set T if vertex w has not been previously visited when we are at vertex v considering edge (v,w); otherwise edge (v,w) is placed in set B.
Basically, this DFS algorithm will give me a new graph G=(V,T). I want to know how one would implement this in C++.
I tried using an adjacency list, but I am confused: is there a need to store the edges separately, or should just a map of lists be fine?
In VTK, edges are stored in a vector, and each entry is a pair (v,w). Alongside this vector there are two other vectors of vectors that store the in- and out-edges of the graph's nodes. When a new edge is added, it is appended to the edge vector, and its endpoints (v,w) are added to the in- and out-edge vectors of vectors, too.
I am not quite clear about what your exact question is. I assume you are asking how to maintain the two sets T and B to distinguish edges that have been visited from edges that have not been visited during the DFS. I think the easiest way to do so is to add a bool field "visited" to the node struct in your adjacency list, with an initial value of "false" for all nodes. In the case above, when the DFS comes to v and the edge (v,w) has not been visited, the node on v's list that corresponds to w will still have "visited" set to "false" at that time; otherwise it will be "true".
I think the author is just trying to give you the idea that, at the end of the DFS, the edges will be categorized into two kinds: visited and not visited. But I don't think keeping two explicit sets for those two kinds of edges is necessary. You can always print the visited edges after the DFS according to their updated "visited" value.
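If it helps, here is a minimal sketch (my own illustration, not the book's code) of that idea, keeping the two sets T and B explicitly while the DFS runs; the example graph is made up:

    #include <functional>
    #include <iostream>
    #include <utility>
    #include <vector>

    int main() {
        int n = 5;
        // hypothetical directed graph as an adjacency list
        std::vector<std::vector<int>> adj = {{1, 2}, {3}, {0}, {2}, {}};
        std::vector<bool> visited(n, false);
        std::vector<std::pair<int, int>> T, B;   // tree edges and the remaining edges

        // recursive DFS written as a lambda so the whole example fits in main
        std::function<void(int)> dfs = [&](int v) {
            visited[v] = true;
            for (int w : adj[v]) {
                if (!visited[w]) {
                    T.push_back({v, w});         // (v, w) becomes a tree edge
                    dfs(w);
                } else {
                    B.push_back({v, w});         // w already visited: (v, w) goes into B
                }
            }
        };
        for (int v = 0; v < n; v++)
            if (!visited[v]) dfs(v);

        for (auto [v, w] : T) std::cout << "T: " << v << " -> " << w << "\n";
        for (auto [v, w] : B) std::cout << "B: " << v << " -> " << w << "\n";
    }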
One question I came across,
In a circular linked list, find the node at the beginning of the loop.
EXAMPLE Input: A -> B -> C -> D -> E -> C [the same C as earlier] Output: C
Can one of the solutions be to check whether the addresses of the values stored at these nodes are the same?
So something like &(A->value) would give us an address, and then we check whether any address repeats; if it does, that node is the beginning of the loop?
You could do so, but it is not efficient in terms of space complexity, since you will need to store all the nodes you 'saw along the way'.
A better solution would probably be Floyd's cycle finding algorithm, as described in this post
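For reference, here is a minimal sketch of Floyd's algorithm on the example list from the question (`Node` and `findLoopStart` are illustrative names of my own): the slow/fast pointers first meet somewhere inside the loop, and after resetting one pointer to the head and advancing both one step at a time they meet exactly at the start of the loop.

    #include <iostream>

    struct Node { char value; Node* next; };

    Node* findLoopStart(Node* head) {
        Node* slow = head;
        Node* fast = head;
        while (fast && fast->next) {
            slow = slow->next;
            fast = fast->next->next;
            if (slow == fast) {              // the two pointers met inside the loop
                slow = head;
                while (slow != fast) {       // now advance both one step at a time
                    slow = slow->next;
                    fast = fast->next;
                }
                return slow;                 // they meet at the beginning of the loop
            }
        }
        return nullptr;                      // no loop at all
    }

    int main() {
        // A -> B -> C -> D -> E -> C  (the example from the question)
        Node e{'E', nullptr}, d{'D', &e}, c{'C', &d}, b{'B', &c}, a{'A', &b};
        e.next = &c;
        Node* start = findLoopStart(&a);
        std::cout << (start ? start->value : '-') << "\n";   // prints C
    }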
Please suggest some algorithm to find the node in a tree whose distance to its farthest node is minimum among all the nodes.
It is not a graph and it is not weighted.
Choose an arbitrary node v in the tree T.
Run BFS making v as the root of T.
BFS outputs the distances from v to all the other nodes of T.
Now choose a node u that is farthest from v.
Run again BFS making u as the root.
On the new distance output, find a node w that is farthest from u.
Consider the path between u and w.
This is the longest path in the tree T.
The node in the middle of the path is the center of the tree T.
Note that there may exist two centers in a tree. If so, they are neighbours.
Performance: O(n), where n is the number of nodes of T.
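A minimal sketch of these steps (the node numbering and the sample tree are made up):

    #include <iostream>
    #include <queue>
    #include <vector>

    // BFS from `root`: fills dist/par and returns the node farthest from root
    int bfs(const std::vector<std::vector<int>>& adj, int root,
            std::vector<int>& dist, std::vector<int>& par) {
        int n = adj.size();
        dist.assign(n, -1);
        par.assign(n, -1);
        std::queue<int> q;
        dist[root] = 0;
        q.push(root);
        int far = root;
        while (!q.empty()) {
            int v = q.front(); q.pop();
            if (dist[v] > dist[far]) far = v;
            for (int w : adj[v])
                if (dist[w] == -1) {
                    dist[w] = dist[v] + 1;
                    par[w] = v;
                    q.push(w);
                }
        }
        return far;
    }

    int main() {
        // hypothetical tree: the path 0-1-2-3 with an extra leaf 4 attached to node 2
        std::vector<std::vector<int>> adj = {{1}, {0, 2}, {1, 3, 4}, {2}, {2}};
        std::vector<int> dist, par;
        int u = bfs(adj, 0, dist, par);   // farthest node from an arbitrary start
        int w = bfs(adj, u, dist, par);   // farthest node from u: the other end of the longest path
        std::vector<int> path;            // recover the u--w path via parents
        for (int v = w; v != -1; v = par[v]) path.push_back(v);
        std::cout << "center: " << path[path.size() / 2] << "\n";   // prints 2 (1 and 2 are both centers)
    }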
Proof
Claim: a leaf (u) that is furthest from some node v lies on the longest path.
If we prove it, then the algorithm is correct, since it first finds u and, because u is one end of the longest path, the second traversal finds the path itself.
Proof of the claim: let's use reductio ad absurdum. Assume u---r is the longest path in the tree, and that for some node v neither v---u nor v---r is the longest path from v; instead, the longest path from v is v---k. We have two cases:
a) u---r and v---k have a common node o. Then v---o---u and v---o---r are shorter than v---o---k. Then o---u and o---r are shorter than o---k. Then u---o---r is not the longest path in the graph, because u---o---k is longer. This contradicts our assumption.
b) u---r and v---k don't have common nodes. But since the graph is connected, there are nodes o1 and o2, one on each of these paths, such that the path o1---o2 between them doesn't contain any other nodes of those two paths. The contradiction to the assumption is the same as in case a), but with o1---o2 instead of the single node o (in fact, case a is just a special case of b, where o1 = o2).
This proves the claim and hence the correctness of the algorithm.
(This proof was written by Pavel Shved; the original author might have a shorter one.)
Remove the leaves. If more than 2 nodes are left, repeat. The node (or 2 nodes) left will be the node you are looking for.
Why this works:
The node(s) found are in the middle of the longest path P in the tree. Their maximum distance to any node is at most half the length of P (otherwise P would not be the longest path). Any other node on P will obviously have a greater distance to the farther end of P than the found node(s). Any node n not on P has its farthest node at distance at least (distance from n to the closest node c on P) + (distance from c to the farther end of P), so again more than the node(s) found by the algorithm.
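A minimal sketch of the leaf-stripping idea (the example tree is made up): repeatedly remove all current leaves; the last one or two remaining nodes are the center(s).

    #include <iostream>
    #include <vector>

    int main() {
        // hypothetical tree: the path 0-1-2-3 with an extra leaf 4 attached to node 2
        std::vector<std::vector<int>> adj = {{1}, {0, 2}, {1, 3, 4}, {2}, {2}};
        int n = adj.size(), remaining = n;

        std::vector<int> degree(n), leaves;
        for (int v = 0; v < n; v++) {
            degree[v] = adj[v].size();
            if (degree[v] <= 1) leaves.push_back(v);   // the current outermost leaves
        }
        while (remaining > 2) {
            remaining -= leaves.size();
            std::vector<int> next;                     // leaves of the next round
            for (int v : leaves)
                for (int w : adj[v])
                    if (--degree[w] == 1) next.push_back(w);
            leaves = next;
        }
        for (int v : leaves) std::cout << v << ' ';    // prints the center(s): 1 2
        std::cout << '\n';
    }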
You can use Johnson's algorithm for sparse graphs, but otherwise use the Floyd-Warshall algorithm simply because it is trivial to implement.
Essentially you want to find the distance from every node to every other node, and then just trivially search for the property you want.
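As a sketch of that (made-up example tree, unit edge weights since the tree is unweighted): Floyd-Warshall fills the all-pairs distance table, and the answer is the node whose largest distance (eccentricity) is smallest.

    #include <algorithm>
    #include <iostream>
    #include <utility>
    #include <vector>

    int main() {
        const int INF = 1000000000;
        // hypothetical unweighted tree, given as an edge list
        std::vector<std::pair<int, int>> edges = {{0, 1}, {1, 2}, {2, 3}, {2, 4}};
        int n = 5;
        std::vector<std::vector<int>> d(n, std::vector<int>(n, INF));
        for (int v = 0; v < n; v++) d[v][v] = 0;
        for (auto [a, b] : edges) d[a][b] = d[b][a] = 1;   // unit weight per edge

        for (int k = 0; k < n; k++)                        // Floyd-Warshall relaxation
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    d[i][j] = std::min(d[i][j], d[i][k] + d[k][j]);

        // pick the node whose largest distance (eccentricity) is smallest
        int best = 0, bestEcc = INF;
        for (int v = 0; v < n; v++) {
            int ecc = *std::max_element(d[v].begin(), d[v].end());
            if (ecc < bestEcc) { bestEcc = ecc; best = v; }
        }
        std::cout << best << " (eccentricity " << bestEcc << ")\n";   // prints 1 (eccentricity 2)
    }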
You could use Dijkstra's algorithm (http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm) on each node in turn, to find all the distances from that node to every other node; scan the resulting list to get the distance to the farthest node. Once you've Dijkstra'd every node, another scan will give you the minimum of that maximal distances.
Dijkstra is usually regarded as having runtime O(v^2), where v is the number of nodes; you'd be running it once per node, which will increase the time to O(v^3) in a naive implementation. You may be able to make gains by storing the results of earlier nodes' Dijkstra runs and using them as known values in later runs.
As others have said in comments:
A tree is a graph - an undirected connected acyclic graph to be exact - see "Tree" (Graph theory).