Connected Component Counting - C++

In the standard algorithm for connected component counting, a disjoint-set data structure called union-find is used.
Why is this data structure used? I've written code that just scans the image linearly, maintaining two linear buffers that store the component labels for the current and adjacent line of connected pixels by examining four neighbors (E, SE, S, SW); whenever a connection is found, it updates the connection map to join the higher-numbered component to the lower-numbered one.
Once done, I search for all non-joined components and report the count.
I just can't see why this approach is less efficient than using union-find.
Here's my code. The input file has been reduced to 0s and 1s. The program outputs the number of connected components formed from 0s.
import sys

def CompCount(fname):
    fin = open(fname)
    b, l = fin.readline().split()
    b, l = int(b), int(l) + 1
    inbuf = '1'*l + fin.read()
    prev = curr = [sys.maxint]*l
    nextComp = 0
    tree = dict()
    for i in xrange(1, b+1):
        curr = [sys.maxint]*l
        for j in xrange(0, l-1):
            curr[j] = sys.maxint
            if inbuf[i*l+j] == '0':
                p = [prev[j+n] for m,n in [(-l+1,1),(-l,0),(-l-1,-1)] if inbuf[i*l + j+m] == '0']
                curr[j] = min([curr[j]] + p + [curr[j-1]])
                if curr[j] == sys.maxint:
                    nextComp += 1
                    curr[j] = nextComp
                    tree[curr[j]] = 0
                else:
                    if curr[j] < prev[j+1]: tree[prev[j+1]] = curr[j]
                    if curr[j] < prev[j]: tree[prev[j]] = curr[j]
                    if curr[j] < prev[j-1]: tree[prev[j-1]] = curr[j]
                    if curr[j] < curr[j-1]: tree[curr[j-1]] = curr[j]
        prev = curr
    return len([x for x in tree if tree[x] == 0])

I didn't completely understand your question; you would really gain from writing it up clearly and structuring it.
What I understand is that you want to do connected component labeling in a 0-1 image using the 8-neighborhood. If that is so, your assumption that the resulting neighborhood graph is planar is wrong: you have crossings at the "diagonals". It should be easy to construct a K_{3,3} or K_{5} minor in such an image.

Your algorithm is flawed. Consider this example:
11110
01010
10010
11101
Your algorithm reports 2 components, whereas the image has only 1.
To test, I used this slightly-modified version of your code.
import sys

def CompCount(image):
    l = len(image[0])
    b = len(image)
    prev = curr = [sys.maxint]*(l+1)
    nextComp = 0
    tree = dict()
    for i in xrange(b):
        curr = [sys.maxint]*(l+1)
        for j in xrange(l):
            curr[j] = sys.maxint
            if image[i][j] == '0':
                p = [prev[j+n] for m,n in [(1,1),(-1,0),(-1,-1)] if 0<=i+m<b and 0<=j+n<l and image[i+m][j+n] == '0']
                curr[j] = min([curr[j]] + p + [curr[j-1]])
                if curr[j] == sys.maxint:
                    nextComp += 1
                    curr[j] = nextComp
                    tree[curr[j]] = 0
                else:
                    if curr[j] < prev[j+1]: tree[prev[j+1]] = curr[j]
                    if curr[j] < prev[j]: tree[prev[j]] = curr[j]
                    if curr[j] < prev[j-1]: tree[prev[j-1]] = curr[j]
                    if curr[j] < curr[j-1]: tree[curr[j-1]] = curr[j]
        prev = curr
    return len([x for x in tree if tree[x] == 0])

print CompCount(['11110', '01010', '10010', '11101'])
Let me try to explain your algorithm in words (in terms of a graph rather than a grid).
Let 'roots' be an empty set.
Iterate over the nodes in the graph.
For a node n, look at all its neighbours already processed. Call this set A.
If A is empty, pick a new value k, set v[n] to k, and add k to roots.
Otherwise, let k be the minimum of v[a] for a in A; set v[n] to k, and remove v[x] from roots for each x in A with v[x] != k.
The number of components is the number of elements of roots.
(Your tree is the same as my roots: note that you never use the value of tree[] elements, only whether they are 0 or not... this is just implementing a set)
It's like union-find, except that it assumes that when you merge two components, the one with the higher v[] value has never been previously merged with another component. In the counterexample this is exploited because the two 0s in the center column have been merged with the 0s to their left.
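For comparison, here is a minimal union-find sketch (illustrative, not your code) with path compression and union by rank. The point is that find() always follows the merge links all the way to a root before comparing, so merging a component that was already merged earlier is harmless:

def find(parent, x):
    # follow merge links to the root, compressing the path as we go
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, rank, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return
    # union by rank: attach the shallower tree under the deeper one
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra
    parent[rb] = ra
    if rank[ra] == rank[rb]:
        rank[ra] += 1

The component count is then the number of labels x with find(parent, x) == x, which stays correct no matter in which order the merges happen.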

My variant:
1. Split your entire graph into edges. Add each edge to a set.
2. On the next iteration, draw edges between the 2 outer nodes of the edges you made in step 1. This means adding new nodes (with their corresponding sets) to the set the original edge was from (basically set merging).
3. Repeat step 2 until the 2 nodes you're looking for are in the same set. You will also need to do a check after step 1 (just in case the 2 nodes are adjacent).
At first, each of your nodes will be in its own set,
o-o o-o o1-o3 o2 o3-o4
\ / |
o-o-o-o o2 o1-o3-o4
As the algorithm progresses and merges the sets, the input is roughly halved each pass.
In the example I am checking for components in some graph. After merging all edges into their maximum possible sets, I am left with 3 sets, giving 3 disconnected components.
(The number of components is the number of sets you are able to get when the algorithm finishes.)
A possible graph (for the tree above):
o-o-o o4 o2
| |
o o3
|
o1
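As a rough Python sketch of the set-merging idea (my reading of the variant above, with edges given as node pairs): start with every node in its own set, merge the endpoint sets of each edge, and count the distinct sets that remain.

def count_components(nodes, edges):
    # each node starts in its own set
    sets = {v: set([v]) for v in nodes}
    for a, b in edges:
        if sets[a] is not sets[b]:
            # merge the smaller set into the larger one and repoint its members
            small, big = sorted((sets[a], sets[b]), key=len)
            big |= small
            for v in small:
                sets[v] = big
    # the number of distinct set objects is the component count
    return len(set(id(s) for s in sets.values()))

For example, count_components(range(4), [(0, 1), (2, 3)]) returns 2.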

Levenshtein distance with substitution, deletion and insertion count

There's a great blog post here https://davedelong.com/blog/2015/12/01/edit-distance-and-edit-steps/ on Levenshtein distance. I'm trying to implement this to also include counts of subs, dels and ins when returning the Levenshtein distance. Just running a smell check on my algorithm.
def sum_list(lst):
    # helper assumed by the snippet below: total edits in an [ins, del, subs] triple
    return sum(lst)

def get_levenshtein_w_counts(s1: str, s2: str):
    row_dim = len(s1) + 1  # +1 for empty string
    height_dim = len(s2) + 1
    # tuple = [ins, del, subs]
    # Moving across a row is insertion
    # Moving down a column is deletion
    # Moving diagonally is substitution
    matrix = [[[n, 0, 0] for n in range(row_dim)] for m in range(height_dim)]
    for i in range(1, height_dim):
        matrix[i][0][1] = i
    for y in range(1, height_dim):
        for x in range(1, row_dim):
            left_scores = matrix[y][x - 1].copy()
            above_scores = matrix[y - 1][x].copy()
            diagonal_scores = matrix[y - 1][x - 1].copy()
            scores = [sum_list(left_scores), sum_list(diagonal_scores), sum_list(above_scores)]
            min_idx = scores.index(min(scores))
            if min_idx == 0:
                matrix[y][x] = left_scores
                matrix[y][x][0] += 1
            elif min_idx == 1:
                matrix[y][x] = diagonal_scores
                matrix[y][x][2] += (s1[x-1] != s2[y-1])
            else:
                matrix[y][x] = above_scores
                matrix[y][x][1] += 1
    return matrix[-1][-1]
So according to the blog post, you make a matrix where the rows correspond to the first word plus an empty string and the columns to the second word plus an empty string. You store the edit distance at each index. Then you take the smallest of the left, above and diagonal values. If the min is the diagonal then you know you're just adding 1 substitution, if the min is from the left then you're adding 1 insertion, and if the min is from above then you're deleting 1 character.
I think I did something wrong because get_levenshtein_w_counts("Frank", "Fran") returned [3, 2, 2].
The problem was that Python passes objects by reference, so I should have been cloning the lists into the variables rather than taking a direct reference.
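A minimal sketch of that aliasing pitfall (hypothetical values, not from the original code):

row = [1, 0, 0]
alias = row          # both names refer to the same list object
alias[0] += 1
print(row)           # [2, 0, 0]: mutating the alias changed row too

clone = row.copy()   # a genuine copy; row[:] works as well
clone[0] += 1
print(row)           # still [2, 0, 0]: changes to the clone stay local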

Detect rings/circuits of connected voxels

I have a skeletonized voxel structure that looks like this:
[figure: skeletonized voxel structure; the actual structure is significantly larger than this example]
Is there any way to find the closed rings in the structure?
I tried converting it to a graph and using graph-based approaches, but they all have the problem that a graph has no spatial information about node positions, and hence a graph can have multiple rings that are homologous.
It is not possible to find all the rings and then filter out the ones of interest since the graph is just too large. The size of the rings varies significantly.
Thanks for your help and contribution!
Any language approaches and pseudo-code are welcomed though I work mostly in Python and Matlab.
EDIT:
No, the graph is not planar.
The problem with the graph cycle basis is the same as with other simple graph-based approaches: the graph lacks any spatial information, and different spatial configurations can have the same cycle basis, hence the cycle basis does not necessarily correspond to the cycles or holes in the structure.
Here is the adjacency matrix in sparse format:
NodeID1 NodeID2 Weight
Pastebin with adjacency matrix
And here are the corresponding X,Y,Z coordinates for the Nodes of the graph:
X Y Z
Pastebin with node coordinates
(The actual structure is significantly larger than this example)
First I reduce the size of the problem considerably by contracting neighbouring nodes of degree 2 into hypernodes: each simple chain in the graph is substituted with a single node.
Then I find the cycle basis for which the maximum cost of the cycles in the basis set is minimal.
For the central part of the network, the solution can easily be plotted, as it is planar:
For some reason, I fail to correctly identify the cycle basis, but I think the following should definitely get you started, and maybe somebody else can chime in.
Recover data from posted image (as OP wouldn't provide some real data)
import numpy as np
import matplotlib.pyplot as plt
from skimage.morphology import medial_axis, binary_closing
from matplotlib.path import Path
from matplotlib.patches import PathPatch
import itertools
import networkx as nx

img = plt.imread("tissue_skeleton_crop.jpg")
# plt.hist(np.mean(img, axis=-1).ravel(), bins=255) # find a good cutoff
bw = np.mean(img, axis=-1) < 200
# plt.imshow(bw, cmap='gray')
closed = binary_closing(bw, selem=np.ones((50, 50))) # connect disconnected segments
# plt.imshow(closed, cmap='gray')
skeleton = medial_axis(closed)

fig, ax = plt.subplots(1, 1)
ax.imshow(skeleton, cmap='gray')
ax.set_xticks([])
ax.set_yticks([])
def img_to_graph(binary_img, allowed_steps):
    """
    Arguments:
    ----------
    binary_img -- 2D boolean array marking the position of nodes
    allowed_steps -- list of allowed steps; e.g. [(0, 1), (1, 1)] signifies that
                     from node with position (i, j) nodes at position (i, j+1)
                     and (i+1, j+1) are accessible

    Returns:
    --------
    g -- networkx.Graph() instance
    pos_to_idx -- dict mapping (i, j) position to node idx (for testing if path exists)
    idx_to_pos -- dict mapping node idx to (i, j) position (for plotting)
    """
    # map array indices to node indices and vice versa
    node_idx = range(np.sum(binary_img))
    node_pos = zip(*np.where(np.rot90(binary_img, 3)))
    pos_to_idx = dict(zip(node_pos, node_idx))

    # create graph
    g = nx.Graph()
    for (i, j) in node_pos:
        for (delta_i, delta_j) in allowed_steps: # try to step in all allowed directions
            if (i+delta_i, j+delta_j) in pos_to_idx: # i.e. target node also exists
                g.add_edge(pos_to_idx[(i, j)], pos_to_idx[(i+delta_i, j+delta_j)])

    idx_to_pos = dict(zip(node_idx, node_pos))

    return g, idx_to_pos, pos_to_idx
allowed_steps = set(itertools.product((-1, 0, 1), repeat=2)) - set([(0,0)])
g, idx_to_pos, pos_to_idx = img_to_graph(skeleton, allowed_steps)
fig, ax = plt.subplots(1,1)
nx.draw(g, pos=idx_to_pos, node_size=1, ax=ax)
NB: These are not red lines, these are lots of red dots corresponding to nodes in the graph.
Contract Graph
def contract(g):
    """
    Contract chains of neighbouring vertices with degree 2 into one hypernode.

    Arguments:
    ----------
    g -- networkx.Graph or networkx.DiGraph instance

    Returns:
    --------
    h -- networkx.Graph or networkx.DiGraph instance
        the contracted graph
    hypernode_to_nodes -- dict: int hypernode -> [v1, v2, ..., vn]
        dictionary mapping hypernodes to nodes
    """
    # create subgraph of all nodes with degree 2
    is_chain = [node for node, degree in g.degree() if degree == 2]
    chains = g.subgraph(is_chain)

    # contract connected components (which should be chains of variable length) into single node
    components = list(nx.components.connected_component_subgraphs(chains))

    hypernode = g.number_of_nodes()
    hypernodes = []
    hyperedges = []
    hypernode_to_nodes = dict()
    false_alarms = []
    for component in components:
        if component.number_of_nodes() > 1:
            hypernodes.append(hypernode)
            vs = [node for node in component.nodes()]
            hypernode_to_nodes[hypernode] = vs

            # create new edges from the neighbours of the chain ends to the hypernode
            component_edges = [e for e in component.edges()]
            for v, w in [e for e in g.edges(vs) if not ((e in component_edges) or (e[::-1] in component_edges))]:
                if v in component:
                    hyperedges.append([hypernode, w])
                else:
                    hyperedges.append([v, hypernode])

            hypernode += 1
        else: # nothing to collapse as there is only a single node in the component
            false_alarms.extend([node for node in component.nodes()])

    # initialise new graph with all other nodes
    not_chain = [node for node in g.nodes() if not node in is_chain]
    h = g.subgraph(not_chain + false_alarms)
    h.add_nodes_from(hypernodes)
    h.add_edges_from(hyperedges)

    return h, hypernode_to_nodes
h, hypernode_to_nodes = contract(g)

# set position of hypernode to position of centre of chain
for hypernode, nodes in hypernode_to_nodes.items():
    chain = g.subgraph(nodes)
    first, last = [node for node, degree in chain.degree() if degree == 1]
    path = nx.shortest_path(chain, first, last)
    centre = path[len(path)/2]
    idx_to_pos[hypernode] = idx_to_pos[centre]

fig, ax = plt.subplots(1, 1)
nx.draw(h, pos=idx_to_pos, node_size=20, ax=ax)
Find cycle basis
cycle_basis = nx.cycle_basis(h)

fig, ax = plt.subplots(1, 1)
nx.draw(h, pos=idx_to_pos, node_size=10, ax=ax)
for cycle in cycle_basis:
    vertices = [idx_to_pos[idx] for idx in cycle]
    path = Path(vertices)
    ax.add_artist(PathPatch(path, facecolor=np.random.rand(3)))
TODO:
Find the correct cycle basis (I might be confused about what a cycle basis is, or networkx might have a bug).
EDIT
Holy crap, this was a tour de force. I should never have delved into this rabbit hole.
So the idea is now that we want to find the cycle basis for which the maximum cost of the cycles in the basis is minimal. We set the cost of a cycle to its length in edges, but one could imagine other cost functions. To do so, we find an initial cycle basis, and then we combine cycles in the basis until we find the set of cycles with the desired property.
def find_holes(graph, cost_function):
    """
    Find the cycle basis that minimises the maximum individual cost of the cycles in the basis set.
    """
    # get cycle basis
    cycles = nx.cycle_basis(graph)

    # find new basis set that minimises maximum cost
    old_basis = set()
    new_basis = set(frozenset(cycle) for cycle in cycles) # only frozensets are hashable
    while new_basis != old_basis:
        old_basis = new_basis
        for cycle_a, cycle_b in itertools.combinations(old_basis, 2):
            if len(frozenset.union(cycle_a, cycle_b)) >= 2: # maybe should check if they share an edge instead
                cycle_c = _symmetric_difference(graph, cycle_a, cycle_b)
                new_basis = new_basis.union([cycle_c])
        new_basis = _select_cycles(new_basis, cost_function)

    ordered_cycles = [_order_nodes_in_cycle(graph, nodes) for nodes in new_basis]
    return ordered_cycles
def _symmetric_difference(graph, cycle_a, cycle_b):
    # get edges
    edges_a = list(graph.subgraph(cycle_a).edges())
    edges_b = list(graph.subgraph(cycle_b).edges())

    # also get reverse edges as the graph is undirected
    edges_a += [e[::-1] for e in edges_a]
    edges_b += [e[::-1] for e in edges_b]

    # find edges that are in either but not in both
    edges_c = set(edges_a) ^ set(edges_b)

    cycle_c = frozenset(nx.Graph(list(edges_c)).nodes())
    return cycle_c
def _select_cycles(cycles, cost_function):
    """
    Select cover of nodes with cycles that minimises the maximum cost
    associated with all cycles in the cover.
    """
    cycles = list(cycles)
    costs = [cost_function(cycle) for cycle in cycles]
    order = np.argsort(costs)

    nodes = frozenset.union(*cycles)
    covered = set()
    basis = []

    # greedy; start with lowest cost
    for ii in order:
        cycle = cycles[ii]
        if cycle <= covered:
            pass
        else:
            basis.append(cycle)
            covered |= cycle
        if covered == nodes:
            break

    return set(basis)
def _get_cost(cycle, hypernode_to_nodes):
    cost = 0
    for node in cycle:
        if node in hypernode_to_nodes:
            cost += len(hypernode_to_nodes[node])
        else:
            cost += 1
    return cost

def _order_nodes_in_cycle(graph, nodes):
    order, = nx.cycle_basis(graph.subgraph(nodes))
    return order
from functools import partial

holes = find_holes(h, cost_function=partial(_get_cost, hypernode_to_nodes=hypernode_to_nodes))

fig, ax = plt.subplots(1, 1)
nx.draw(h, pos=idx_to_pos, node_size=10, ax=ax)
for ii, hole in enumerate(holes):
    if len(hole) > 3:
        vertices = np.array([idx_to_pos[idx] for idx in hole])
        path = Path(vertices)
        ax.add_artist(PathPatch(path, facecolor=np.random.rand(3)))
        xmin, ymin = np.min(vertices, axis=0)
        xmax, ymax = np.max(vertices, axis=0)
        x = xmin + (xmax - xmin) / 2.
        y = ymin + (ymax - ymin) / 2.
        # ax.text(x, y, str(ii))

Finding the two closest numbers in a list using sorting

If I am given a list of integers/floats, how would I find the two closest numbers using sorting?
Such a method will do what you want:
def minDistance(lst):
    lst = sorted(lst)
    index = -1
    distance = max(lst) - min(lst)
    for i in range(len(lst)-1):
        if lst[i+1] - lst[i] < distance:
            distance = lst[i+1] - lst[i]
            index = i
    for i in range(len(lst)-1):
        if lst[i+1] - lst[i] == distance:
            print lst[i], lst[i+1]
In the first for loop we find the minimum distance, and in the second loop we print all the pairs with this distance. It works as below:
>>> lst = (1,2,3,6,12,9,1.4,145,12,83,53,12,3.4,2,7.5)
>>> minDistance(lst)
2 2
12 12
12 12
>>>
There could be more than one possibility. Consider this list:
[0, 1, 20, 25, 30, 200, 201]
[0, 1] and [200, 201] are equally close.
Jose has a valid point. However, you could just consider these cases equal and not care about returning one or the other.
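For what it's worth, once the list is sorted the closest two numbers must be adjacent, so a compact O(n log n) sketch (returning one arbitrary winner among tied pairs) could look like this:

def closest_pair(lst):
    s = sorted(lst)
    # the closest two numbers are adjacent after sorting
    return min(zip(s, s[1:]), key=lambda pair: pair[1] - pair[0])

For example, closest_pair([0, 1, 20, 25, 30, 200, 201]) returns (0, 1), silently preferring the first of the tied pairs.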
I don't think you need a sorting algorithm, per se, but maybe just a sort of 'champion' algorithm like this one:
import sys
import math

def smallestDistance(arr):
    championI = -1
    championJ = -1
    champDistance = sys.maxint
    i = 0
    while i < len(arr):
        j = i + 1
        while j < len(arr):
            if math.fabs(arr[i] - arr[j]) < champDistance:
                championI = i
                championJ = j
                champDistance = math.fabs(arr[i] - arr[j])
            j += 1
        i += 1
    r = [arr[championI], arr[championJ]]
    return r
This function will return a sub-array with the two values that are closest together. Note that this only works for an array at least two elements long; otherwise it will raise an error.
I think the popular sorting algorithm known as bubble sort would do this quite well, though it runs in O(n^2) time, if that kind of thing matters to you...
Here is a standard bubble sort that sorts an array of numbers in ascending order.
def bubblesort(A):
    for i in range(len(A)):
        for k in range(len(A) - 1, i, -1):
            if A[k] < A[k - 1]:
                swap(A, k, k - 1)

def swap(A, x, y):
    tmp = A[x]
    A[x] = A[y]
    A[y] = tmp
You can just modify the algorithm slightly to fit your purposes if you insist on doing this using a sorting algorithm. However, I think the initial function works as well...
hope that helps.

Enumeration all possible matrices with constraints

I'm attempting to enumerate all possible matrices of size r by r with a few constraints.
Row and column sums must be in non-ascending order.
Starting from the top left element and moving down the main diagonal, each row and column subset starting from that diagonal entry must be made up of combinations with replacement of the values from 0 to that diagonal entry (inclusive).
The row and column sums must all be less than or equal to a predetermined n value.
The main diagonal must be in non-ascending order.
An important note: I need every combination to be stored somewhere, or, if written in C++, to be run through a few more functions after being found.
r and n are values that range from 2 to, say, 100.
I've tried a recursive way to do this, along with an iterative one, but keep getting hung up on keeping track of column and row sums, and on keeping all the data manageable.
I have attached my most recent attempt (which is far from complete), but it may give you an idea of what is going on.
The function first_section() builds row zero and column zero correctly, but other than that I don't have anything successful.
I need more than a push to get this going; the logic is a pain in the butt and is swallowing me whole. I need to have this written in either Python or C++.
import numpy as np
from itertools import combinations_with_replacement

global r
global n
r = 4
n = 8
global myarray
myarray = np.zeros((r,r))
global arraysums
arraysums = np.zeros((r,2))
printing = False  # debug flag referenced below (assumed; missing from the original snippet)

def first_section():
    bigData = []
    myarray = np.zeros((r,r))
    arraysums = np.zeros((r,2))
    for i in reversed(range(1,n+1)):
        myarray[0,0] = i
        stuff = list(combinations_with_replacement(range(i),r-1))
        for j in range(len(stuff)):
            myarray[0,1:] = list(reversed(stuff[j]))
            arraysums[0,0] = sum(myarray[0,:])
            for k in range(len(stuff)):
                myarray[1:,0] = list(reversed(stuff[k]))
                arraysums[0,1] = sum(myarray[:,0])
                if arraysums.max() > n:
                    break
                bigData.append(np.hstack((myarray[0,:],myarray[1:,0])))
                if printing: print 'myarray \n%s' %(myarray)
    return bigData
def one_more_section(bigData, index):
    newData = []
    for item in bigData:
        if printing: print 'item = %s' %(item)
        upperbound = int(item[index-1]) # will need to have logic worked out
        if printing: print 'upperbound = %s' % (upperbound)
        for i in reversed(range(1,upperbound+1)):
            myarray[index,index] = i
            stuff = list(combinations_with_replacement(range(i),r-1))
            for j in range(len(stuff)):
                myarray[index,index+1:] = list(reversed(stuff[j]))
                arraysums[index,0] = sum(myarray[index,:])
                for k in range(len(stuff)):
                    myarray[index+1:,index] = list(reversed(stuff[k]))
                    arraysums[index,1] = sum(myarray[:,index])
                    if arraysums.max() > n:
                        break
                    if printing: print 'index = %s' %(index)
                    newData.append(np.hstack((myarray[index,index:],myarray[index+1:,index])))
                    if printing: print 'myarray \n%s' %(myarray)
    return newData

bigData = first_section()
bigData = one_more_section(bigData, 1)
A possible matrix could look like this:
r = 4, n >= 6
|3 2 0 0| = 5
|3 2 0 0| = 5
|0 0 2 1| = 3
|0 0 0 1| = 1
6 4 2 2
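To make the constraints concrete, here is a small checker sketch (my reading of the rules above: "non-ascending" taken as non-increasing, and constraint 2 taken to mean that every entry to the right of and below the diagonal entry (i, i) is at most that entry; the function name is illustrative):

import numpy as np

def satisfies_constraints(m, n):
    r = m.shape[0]
    row_sums = m.sum(axis=1)
    col_sums = m.sum(axis=0)
    diag = np.diag(m)
    # constraints 1 and 4: row sums, column sums and main diagonal non-increasing
    if not (all(row_sums[:-1] >= row_sums[1:]) and
            all(col_sums[:-1] >= col_sums[1:]) and
            all(diag[:-1] >= diag[1:])):
        return False
    # constraint 3: all row and column sums bounded by n
    if row_sums.max() > n or col_sums.max() > n:
        return False
    # constraint 2: entries right of and below m[i, i] are at most m[i, i]
    return all(m[i, i:].max() <= m[i, i] and m[i:, i].max() <= m[i, i]
               for i in range(r))

The example matrix above passes with n >= 6.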
Here's a solution in numpy and Python 2.7. Note that all the rows and columns are in non-increasing order, because you only specified that they should be combinations with replacement, and not their sortedness (and generating combinations is simplest with sorted lists).
The code could be optimized somewhat by keeping row and column sums around as arguments instead of recomputing them.
import numpy as np

r = 2     # matrix dimension
maxs = 5  # maximum sum of row/column

def generate(r, maxs):
    # We create an extra row and column for the starting "dummy" values.
    # Filling in the matrix becomes much simpler when we do not have to treat cells with
    # one or two zero indices in a special way. Thus, we start iteration from the
    # (1, 1) index.
    m = np.zeros((r + 1, r + 1), dtype=np.int32)
    m[0] = m[:, 0] = maxs + 1

    def go(n, i, j):
        # Once we have stepped past the last cell, yield a copy of the non-dummy parts.
        if i == r + 1:
            yield m[1:, 1:].copy()
            return

        # We compute the next indices in row-major order (the choice is arbitrary).
        (i2, j2) = (i + 1, 1) if j == r else (i, j + 1)

        # Compute the maximum possible value for the current cell.
        max_val = min(
            maxs - m[i, 1:].sum(),
            maxs - m[1:, j].sum(),
            m[i, j-1],
            m[i-1, j])

        for n2 in xrange(max_val, -1, -1):
            m[i, j] = n2
            for matrix in go(n2, i2, j2):
                yield matrix

    return go(maxs, 1, 1) # note that this is a generator object

# testing
for matrix in generate(r, maxs):
    print
    print matrix
If you'd like to have all the valid permutations in the rows and columns, this code below should work.
def generate(r, maxs):
    m = np.zeros((r + 1, r + 1), dtype=np.int32)
    rows = [0]*(r+1) # We avoid recomputing row/col sums for each cell.
    cols = [0]*(r+1)
    rows[0] = cols[0] = m[0, 0] = maxs

    def go(i, j):
        # Once we have stepped past the last cell, yield a copy of the non-dummy parts.
        if i == r + 1:
            yield m[1:, 1:].copy()
            return

        (i2, j2) = (i + 1, 1) if j == r else (i, j + 1)
        max_val = min(rows[i-1] - rows[i], cols[j-1] - cols[j])
        if i == j:
            max_val = min(max_val, m[i-1, j-1])
        if (i, j) != (1, 1):
            max_val = min(max_val, m[1, 1])

        for n in xrange(max_val, -1, -1):
            m[i, j] = n
            rows[i] += n
            cols[j] += n
            for matrix in go(i2, j2):
                yield matrix
            rows[i] -= n
            cols[j] -= n

    return go(1, 1)
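Both versions return generator objects, so a quick usage sketch (small parameters chosen arbitrarily) looks like this:

for matrix in generate(2, 3):
    print matrix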

Generating random graphs

I need to generate random single-source/single-sink flow networks of different dimensions so that I can measure the performance of some algorithms such as the Ford-Fulkerson and Dinic.
Is the Kruskal algorithm a way to generate such graphs?
To create a generic flow network you just need to create an adjacency matrix:
adj[u][v] = capacity from node u to node v
So, you just have to fill this matrix randomly. For example, if n is the number of vertices that you want (you could make that random too):
for u in 0..n-1:
    for v in 0..u-1:
        if (rand() % 2 and u != sink and v != source or u == source):
            adj[u][v] = rand()
            adj[v][u] = 0
        else:
            adj[u][v] = 0
            adj[v][u] = rand()
Himadri's answer is partly correct. I had to add some constraints to make sure that the single-source/single-sink property is satisfied.
For a single source, one column of the adjacency matrix has to be all 0, and likewise one row for a single sink.
import numpy as np

def random_dag(n):
    adj = np.zeros((n, n))
    sink = n - 1
    source = 0
    for u in range(0, n):
        for v in range(u):
            if (u != sink and v != source or u == source):
                adj[u, v] = np.random.randint(0, 2)
                adj[v, u] = 0
            else:
                adj[u, v] = 0
                adj[v, u] = np.random.randint(0, 2)
    # Additional constraints to make sure single-source/single-sink
    # May be further randomized (but fixed my issues so far)
    for u in range(0, n):
        if sum(adj[u]) == 0:
            adj[u, -1] = 1
            adj[-1, u] = 0
        if sum(adj.T[u]) == 0:
            adj.T[u, 0] = 1
            adj.T[0, u] = 0
    return adj
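Ford-Fulkerson and Dinic need capacities rather than 0/1 arcs, so one might scale the adjacency matrix by random integers. A sketch on top of random_dag, where max_capacity is an assumed parameter rather than part of the original answer:

import numpy as np

def random_flow_network(n, max_capacity=20):
    adj = random_dag(n)
    # keep the same arcs, but give each one a random capacity in [1, max_capacity]
    return adj * np.random.randint(1, max_capacity + 1, size=(n, n))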
You can visualize with the following code:
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

def show_graph_with_labels(adjacency_matrix, mylabels):
    rows, cols = np.where(adjacency_matrix == 1)
    edges = zip(rows.tolist(), cols.tolist())
    gr = nx.DiGraph()
    gr.add_edges_from(edges)
    nx.draw(gr, node_size=500, labels=mylabels, with_labels=True)
    plt.show()

n = 4
show_graph_with_labels(random_dag(n), {i: i for i in range(n)})