Topological sort orderings on a disconnected DAG - directed-acyclic-graphs

If the numbers 0, 1, 2 are nodes in a directed acyclic graph and we have only one edge, 1 -> 2, then all valid orderings are:
1,2,0
0,1,2
1,0,2
Am I correct? I'm only unsure about the last ordering, 1,0,2. Is it valid?

Yes, you are correct.
According to the definition, the only condition for a topological ordering is that for every directed edge u -> v, u comes before v. It does not say that u must come immediately before v.
Consider the vertices to represent tasks to be performed, say you getting ready.
Say 0 is putting on a tie, 1 is wearing a pair of socks, and 2 is wearing shoes. Thus 1 comes before 2 (1 -> 2). As you can see, the last ordering you have written can be considered a topological order (wear your socks, then your tie, and then your shoes).
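A brute-force check of the example confirms this; the function name here is just illustrative:

```python
from itertools import permutations

def all_topological_orders(nodes, edges):
    # keep a permutation iff every edge u -> v places u before v
    orders = []
    for perm in permutations(nodes):
        pos = {v: i for i, v in enumerate(perm)}
        if all(pos[u] < pos[v] for u, v in edges):
            orders.append(perm)
    return orders

# the example: nodes 0, 1, 2 with the single edge 1 -> 2
print(all_topological_orders([0, 1, 2], [(1, 2)]))
# → [(0, 1, 2), (1, 0, 2), (1, 2, 0)]
```

Exactly the three orderings listed in the question.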

Related

About directionality in a topological graph

I am solving this question on LeetCode.com. A statement in the question says:
Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: [0,1]
My aim is to come up with a graphical representation. My question is, as per the above statement, should I create a graph from:
a. 0 -> 1; or
b. 1 -> 0?
The reason I am confused is, if I come up with the former, I would actually do the opposite of what is required - I would visit 0 before I do the prerequisite 1. On the other hand, if I do it the latter way, what if there's a scenario wherein to take course 0, I have to take multiple prerequisite courses, say, 1 and 2? Using the latter representation, I would end up completing course 0 from 1 (thanks to the edge), without first doing course 2.
How should I create the directional edge?
It doesn't matter. If you reverse all your edges and topologically sort, the result you get will be the reverse of some topological ordering of the original graph. Do it in whichever way makes the most sense to you.
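A concrete sketch using Kahn's algorithm (the function name is mine) on the [0,1] example: representation (b) yields a valid course order directly, and representation (a) yields its reverse:

```python
from collections import defaultdict, deque

def topo_sort(n, edges):
    # Kahn's algorithm: repeatedly remove nodes with no incoming edges
    adj = defaultdict(list)
    indeg = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    order, q = [], deque(i for i in range(n) if indeg[i] == 0)
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    return order

pairs = [(0, 1)]  # [course, prerequisite]: take 1 before 0
b = topo_sort(2, [(p, c) for c, p in pairs])  # prerequisite -> course
a = topo_sort(2, [(c, p) for c, p in pairs])  # course -> prerequisite
print(b, a)  # → [1, 0] [0, 1]
```

Note that when a graph has several valid orderings, reversing the edges gives you the reverse of *some* valid ordering, not necessarily the one you would have obtained from the original graph.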

Minimum Mean Weight Cycle - Intuitive Explanation

In a directed graph, we are looking for the cycle that has the lowest average edge weight. For instance, a graph with nodes 1 and 2, an edge from 1 to 2 of weight 2, and an edge from 2 to 1 of weight 4, would have a minimum mean cycle of (2 + 4) / 2 = 3.
I'm not looking for a complicated method (Karp's algorithm), but for a simple backtracking-with-pruning solution. An explanation is given as "Solvable with backtracking with important pruning when current running mean is greater than the best found mean weight cycle cost."
However, why does this method work? If we are halfway through a cycle and the running mean is more than the best found mean, isn't it possible that with small-weight edges our current cycle can still end up lower than the best found mean?
Edit: Here is a sample problem: http://uva.onlinejudge.org/index.php?option=onlinejudge&page=show_problem&problem=2031
Let the optimal solution for the given graph be a cycle with average edge weight X.
There is some optimal cycle with edges e_1, e_2, ..., e_n such that avg(e_i) = X.
For this proof, I take all indices modulo n, so e_(n+1) is e_1.
Suppose our heuristic cannot find this solution. That means: for each i (whatever edge we took first) there exists a j (we followed all the edges from i to j so far) such that the average edge weight of the sequence e_i ... e_j is greater than X (the heuristic prunes this solution).
Then we can show that the average edge weight cannot equal X. Take the longest contiguous subsequence that is not pruned by the heuristic (i.e., every prefix of it has average edge weight not greater than X). At least one e_i <= X, so such a subsequence exists. For the first element e_k of that subsequence, there is a p such that avg(e_k ... e_p) > X; take the first such p. Now let k' = p + 1 and find a p' the same way. Repeat this process until we come back around to the initial k. The final p cannot run past the initial k, because then the final subsequence would contain the initial [e_k, ..., e_(p-1)], which contradicts our construction for e_k. So the sequence e_1 ... e_n is completely covered by non-overlapping subsequences e_k ... e_p, e_k' ... e_p', and so on, each with average edge weight greater than X. That contradicts avg(e_i) = X.
As for your question:
If we are halfway through a cycle and the weight is more than the best found mean, isn't it possible that with small weight edges we can reach a situation where our current cycle can go lower than the best found mean?
Of course it is. But we can safely prune this solution, as later we will discover the same cycle starting from another edge, which will not be pruned. My proof states that if we consider every possible cycle in the graph, sooner or later we will find an optimal cycle.
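A sketch of that backtracking search with the pruning rule; the input format is an assumption (`adj` maps each node to a list of `(neighbor, weight)` pairs):

```python
def min_mean_cycle(adj):
    best = float('inf')

    def dfs(start, node, total, length, visited):
        nonlocal best
        for nxt, w in adj.get(node, []):
            mean = (total + w) / (length + 1)
            # the pruning step: abandon any path whose running mean already
            # meets or exceeds the best mean found; by the argument above,
            # the optimal cycle is still discovered starting from another edge
            if mean >= best:
                continue
            if nxt == start:
                best = mean
            elif nxt not in visited:
                visited.add(nxt)
                dfs(start, nxt, total + w, length + 1, visited)
                visited.remove(nxt)

    for start in adj:
        dfs(start, start, 0, 0, {start})
    return best

# the two-node example: 1 -> 2 of weight 2, 2 -> 1 of weight 4
print(min_mean_cycle({1: [(2, 2)], 2: [(1, 4)]}))  # → 3.0
```

Note that starting the search from every node makes the proof's guarantee apply: every cycle is tried from each of its edges, and at least one of those starting points survives the pruning.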

Find maximal independent set [duplicate]

This question already has answers here:
Algorithm to find 'maximal' independent set in a simple graph
What would be the basic naive approach to find a maximal independent set of an undirected graph, given its adjacency matrix? What would be its complexity?
Like if we have 3 vertices and matrix is :
0 1 0
1 0 1
0 1 0
Here the answer will be 2, as the maximal independent set is {1,3}.
Also, how can the naive approach be improved?
My approach: select the node with the minimum number of edges and eliminate all its neighbors. From the rest of the nodes, select the node with the minimum number of edges and repeat these steps until the whole graph is covered.
Is this correct?
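For reference, the greedy rule described above can be sketched as follows, assuming the graph is given as a neighbor-set dictionary:

```python
def greedy_mis(adj):
    # adj: dict node -> set of neighbors (an assumed input format)
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    independent = set()
    while remaining:
        # pick the node with the fewest surviving neighbors
        v = min(remaining, key=lambda x: len(remaining[x]))
        independent.add(v)
        dead = {v} | remaining[v]       # v and its neighbors leave the graph
        for u in dead:
            remaining.pop(u, None)
        for nbrs in remaining.values():
            nbrs -= dead
    return independent

# the 3-vertex path from the question: 1 - 2 - 3
print(greedy_mis({1: {2}, 2: {1, 3}, 3: {2}}))  # → {1, 3}
```

This always returns a *maximal* independent set, since every vertex ends up either selected or adjacent to a selected one; the min-degree choice is only a heuristic for making the set large, so it does not guarantee a *maximum* independent set.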
Finding a Maximal Independent Set (MIS):
Parallel MIS algorithms use randomization to gain concurrency (e.g., Luby's algorithm).
Initially, each node is in the candidate set C. Each node generates a (unique) random number and communicates it to its neighbors.
If a node's number exceeds those of all its neighbors, it joins the independent set I, and all of its neighbors are removed from C.
This process continues until C is empty.
On average, this algorithm converges after O(log |V|) such steps.
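A sequential sketch of that randomized, Luby-style process, again assuming a neighbor-set dictionary:

```python
import random

def luby_mis(adj):
    independent, candidates = set(), set(adj)
    while candidates:
        # each candidate draws a random number...
        r = {v: random.random() for v in candidates}
        # ...and joins I if it beats every candidate neighbor
        winners = {v for v in candidates
                   if all(r[v] > r[u] for u in adj[v] if u in candidates)}
        independent |= winners
        # winners and their neighbors leave the candidate set
        removed = set(winners)
        for v in winners:
            removed |= adj[v]
        candidates -= removed
    return independent
```

Each round makes progress (the candidate holding the globally largest number always wins), and the expected number of rounds is O(log |V|).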

Prim's algorithm for dynamic locations

Suppose you have an input file:
<total vertices>
<x-coordinate 1st location><y-coordinate 1st location>
<x-coordinate 2nd location><y-coordinate 2nd location>
<x-coordinate 3rd location><y-coordinate 3rd location>
...
How can Prim's algorithm be used to find the MST for these locations? I understand this problem is typically solved using an adjacency matrix. Any references would be great if applicable.
If you already know Prim's algorithm, it is easy: create an adjacency matrix with adj[i][j] = distance between location i and location j.
I'm just going to describe some implementations of Prim's and hopefully that gets you somewhere.
First off, your question doesn't specify how edges are input to the program. You have a total number of vertices and the locations of those vertices. How do you know which ones are connected?
Assuming you have the edges (and the weights of those edges. Like #doomster said above, it may be the planar distance between the points since they are coordinates), we can start thinking about our implementation. Wikipedia describes three different data structures that result in three different run times: http://en.wikipedia.org/wiki/Prim's_algorithm#Time_complexity
The simplest is the adjacency matrix. As you might guess from the name, the matrix describes nodes that are "adjacent". To be precise, there are |v| rows and columns (where |v| is the number of vertices). The value at adjacencyMatrix[i][j] varies depending on the usage. In our case it's the weight of the edge (i.e. the distance) between node i and j (this means that you need to index the vertices in some way. For instance, you might add the vertices to a list and use their position in the list).
Now using this adjacency matrix our algorithm is as follows:
Create a dictionary which contains all of the vertices and is keyed by "distance". Initially the distance of all of the nodes is infinity.
Create another dictionary to keep track of "parents". We use this to generate the MST. It's more natural to keep track of edges, but it's actually easier to implement by keeping track of "parents". Note that if you root a tree (i.e. designate some node as the root), then every node (other than the root) has precisely one parent. So by producing this dictionary of parents we'll have our MST!
Create a new list with a randomly chosen node v from the original list.
Remove v from the distance dictionary and add it to the parent dictionary with a null as its parent (i.e. it's the "root").
Go through the row in the adjacency matrix for that node. For any node w that is connected (for non-connected nodes you have to set their adjacency matrix value to some special value. 0, -1, int max, etc.) update its "distance" in the dictionary to adjacencyMatrix[v][w]. The idea is that it's not "infinitely far away" anymore... we know we can get there from v.
While the dictionary is not empty (i.e. while there are nodes we still need to connect to)
Look over the dictionary and find the vertex with the smallest distance x
Add it to our new list of vertices
For each of its neighbors, update their distance to min(adjacencyMatrix[x][neighbor], distance[neighbor]) and also update their parent to x. Basically, if there is a faster way to get to neighbor then the distance dictionary should be updated to reflect that; and if we then add neighbor to the new list we know which edge we actually added (because the parent dictionary says that its parent was x).
We're done. Output the MST however you want (everything you need is contained in the parents dictionary)
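Putting the steps above together for the coordinate input in the question (complete graph, Euclidean weights; arrays are used here where the text says "dictionary", which plays the same role):

```python
import math

def prim_mst(points):
    # points: list of (x, y) coordinates; every pair is connected,
    # weighted by planar distance
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    in_tree = [False] * n
    best = [math.inf] * n     # the "distance dictionary"
    parent = [None] * n       # the "parent dictionary"
    best[0] = 0.0             # root the tree at vertex 0
    edges = []
    for _ in range(n):
        # find the closest vertex not yet in the tree: an O(|V|) scan
        v = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[v] = True
        if parent[v] is not None:
            edges.append((parent[v], v))
        for w in range(n):
            if not in_tree[w] and dist[v][w] < best[w]:
                best[w] = dist[v][w]   # a cheaper edge into the tree
                parent[w] = v
    return edges

print(prim_mst([(0, 0), (1, 0), (2, 0)]))  # → [(0, 1), (1, 2)]
```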
I admit there is a bit of a leap from the wikipedia page to the actual implementation as outlined above. I think the best way to approach this gap is to just brute force the code. By that I mean, if the pseudocode says "find the min [blah] such that [foo] is true" then write whatever code you need to perform that, and stick it in a separate method. It'll definitely be inefficient, but it'll be a valid implementation. The issue with graph algorithms is that there are 30 ways to implement them and they are all very different in performance; the wikipedia page can only describe the algorithm conceptually. The good thing is that once you implement it some way, you can find optimizations quickly ("oh, if I keep track of this state in this separate data structure, I can make this lookup way faster!"). By the way, the runtime of this is O(|V|^2). I'm too lazy to detail that analysis, but loosely it's because:
All initialization is O(|V|) at worst
We do the loop O(|V|) times and take O(|V|) time to look over the dictionary to find the minimum node. So basically the total time to find the minimum node multiple times is O(|V|^2).
The time it takes to update the distance dictionary is O(|E|) because we only process each edge once. Since |E| is O(|V|^2) this is also O(|V|^2)
Keeping track of the parents is O(|V|)
Outputting the tree is O(|V| + |E|) = O(|E|) at worst
Adding all of these (none of them should be multiplied except within the second point) we get O(|V|^2)
The implementation with a heap is O(|E| log |V|) and it's very, very similar to the above. The only difference is that updating the distance is O(log |V|) instead of O(1) (because it's a heap), BUT finding/removing the min element is O(log |V|) instead of O(|V|) (because it's a heap). The analysis is quite similar and you end up with something like O(|V| log |V| + |E| log |V|) = O(|E| log |V|) as desired.
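A lazy-deletion sketch of the heap variant (stale heap entries are skipped on pop instead of being decreased in place, which keeps the same O(|E| log |V|) bound since the heap holds O(|E|) entries):

```python
import heapq
import math

def prim_mst_heap(points):
    n = len(points)
    in_tree = [False] * n
    edges = []
    # heap entries: (weight of edge into the tree, vertex, parent);
    # -1 is a sentinel parent for the root
    heap = [(0.0, 0, -1)]
    while heap:
        w, v, parent = heapq.heappop(heap)
        if in_tree[v]:
            continue          # stale entry: v was reached more cheaply already
        in_tree[v] = True
        if parent >= 0:
            edges.append((parent, v))
        for u in range(n):
            if not in_tree[u]:
                heapq.heappush(heap, (math.dist(points[v], points[u]), u, v))
    return edges

print(prim_mst_heap([(0, 0), (1, 0), (2, 0)]))  # → [(0, 1), (1, 2)]
```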
Actually... I'm a bit confused why the adjacency matrix implementation cares about it being an adjacency matrix. It could just as well be implemented using an adjacency list. I think the key part is how you store the distances. I could be way off in my implementation outlined above, but I am pretty sure it implements Prim's algorithm and satisfies the time complexity constraints outlined by wikipedia.

Algorithm for Enumerating Hamiltonian Cycles of a Complete Graph (Permutations where loops, reverses, wrap-arounds or repeats don't count)

I want to generate all the Hamiltonian cycles of a complete undirected graph (permutations of a set where rotations and reverses count as duplicates and are left out).
For example, permutations of {1,2,3} are
Standard Permutations:
1,2,3
1,3,2
2,1,3
2,3,1
3,1,2
3,2,1
What I want the program/algorithm to print for me:
1,2,3
Since 3,2,1 is just 1,2,3 backward, 3,1,2 is just 1,2,3 rotated one place, etc.
I see a lot of discussion on the number of these cycles a given set has, and algorithms to find if a graph has a Hamiltonian cycle or not, but nothing on how to enumerate them in a complete, undirected graph (i.e. a set of numbers that can be preceded or succeeded by any other number in the set).
I would really like an algorithm or C++ code to accomplish this task, or if you could direct me to where there is material on the topic. Thanks!
You can place some restrictions on the output to eliminate the unwanted permutations. Let's say we want to permute the numbers 1, ..., N. To avoid some special cases, assume that N > 2.
To eliminate simple rotations we can require that the first place is 1. This is true, because an arbitrary permutation can always be rotated into this form.
To eliminate reverses we can require that the number at the second place must be smaller than the number at the last place. This is true, because from the two permutations starting with 1 that are reverses of each other, exactly one has this property.
So a very simple algorithm could enumerate all permutations and leave out the invalid ones. Of course there are optimisations possible. For example permutations that do not start with 1 can easily be avoided during the generation step.
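Both restrictions together give a direct enumeration. Sketched here in Python rather than C++; fixing 1 in the first place skips the non-1-first permutations entirely:

```python
from itertools import permutations

def hamiltonian_cycles(n):
    # enumerate cycles on vertices 1..n of the complete graph:
    # fixing 1 in the first place removes rotations; requiring the
    # second entry to be smaller than the last removes reversals
    for rest in permutations(range(2, n + 1)):
        if rest[0] < rest[-1]:
            yield (1,) + rest

print(list(hamiltonian_cycles(3)))  # → [(1, 2, 3)]
```

For n vertices this yields (n-1)!/2 cycles, as expected for a complete undirected graph.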
An uber-lazy way to check if a path is the same cycle starting at a different point (i.e., the same loop, or the reverse of the same loop) is this:
1: Decide that by convention all cycles will start from the lowest vertex number and continue in the direction of the lower-numbered of its two neighbors.
Hence, all of the above paths would be described in the same way.
The second useful bit of information here:
If you would like to check that two paths are the same, you can concatenate one with itself and check whether it contains either the second path or the reverse of the second path.
That is,
1 2 3 1 2 3
contains all of the above paths or their reverses. Since the process of finding all Hamiltonian cycles seems much, much slower than the slight inefficiency of this check, I felt I could throw it in :)
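The concatenation trick as a small helper (the name is mine; this is correct for Hamiltonian cycles, where all vertices are distinct):

```python
def same_cycle(a, b):
    # b describes the same cycle as a iff b, or b reversed, appears as a
    # length-n window of a concatenated with itself
    if len(a) != len(b):
        return False
    n = len(a)
    doubled = list(a) + list(a)
    windows = {tuple(doubled[i:i + n]) for i in range(n)}
    return tuple(b) in windows or tuple(reversed(b)) in windows

print(same_cycle([1, 2, 3], [3, 1, 2]))  # → True (a rotation)
print(same_cycle([1, 2, 3], [3, 2, 1]))  # → True (the reverse)
print(same_cycle([1, 2, 4], [1, 3, 2]))  # → False
```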