Generate all possible combinations for a vector of dimension 9 - combinations

Let v = {x: x in {-1,0,1}} such that the dimension of v is |v| = 9.
Every element x in vector v can take 3 possible values: -1, 0 or 1.
How can I generate all the possible combinations of vector v?
Example: v = {1,0,-1,0,0,1,1,1,0}, v = {-1,0,-1,1,0,0,1,1,0}, etc.
Will I have 3^9 combinations?
Thank you.

If you are using Python, you can simply do this:
import itertools
v = itertools.product([-1, 0, 1], repeat=9)
# v is a lazy iterator, so it can be consumed only once
# to have the whole list as tuples
list_v = list(v)
# Verify the number of combinations (count the list, because v is now exhausted)
print(len(list_v))
And it gives you: 19683, or 3^9

The idea is this:
You have a position for each element of v (9 in your case):
- - - - - - - - -
Each position can hold three different values (-1 | 0 | 1), so the total number of combinations is equal to 3 * 3 * 3 * 3 * 3 * 3 * 3 * 3 * 3 = 3^9.
To generate these combinations, just simulate this process, for example with for loops, e.g. for three positions:
values[] = {-1, 0, 1};
for (i = 0; i < 3; i++)
for (j = 0; j < 3; j++)
for (k = 0; k < 3; k++)
print values[i], values[j], values[k]
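In your case you need nine nested loops! An easier implementation uses recursion, although it is sometimes harder to understand. Here is the idea anyway:
int values[] = {-1, 0, 1};
int current[9]; // the vector being built
void generate(int position, int n)
{
    if (position == n) {
        // all n positions are filled: print the complete vector
        for (int i = 0; i < n; i++)
            printf("%d ", current[i]);
        printf("\n");
        return;
    }
    for (int i = 0; i < 3; i++) {
        current[position] = values[i]; // fix this position
        generate(position + 1, n);     // then fill the remaining ones
    }
}
// call the function with
generate(0, 9);
This other answer explains in a little more detail how a recursive generator works.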
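The recursive idea can also be sketched in Python as a generator. This is only an illustration: the function name `generate` and its `prefix` parameter are chosen here, not taken from the answers. Each call fixes one more position and passes the partial vector down, so every yielded tuple is complete.

```python
def generate(values, length, prefix=()):
    """Yield every tuple of `length` items drawn from `values`."""
    if length == 0:
        # No positions left to fill: the prefix is a complete vector.
        yield prefix
        return
    for value in values:
        # Fix the next position, then recurse on the remaining ones.
        yield from generate(values, length - 1, prefix + (value,))


vectors = list(generate((-1, 0, 1), 9))
print(len(vectors))  # 19683, i.e. 3^9
```

Because it is a generator, the vectors can also be consumed one at a time without building the whole list.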

Related

Zero Subsequences problem - What's wrong with my C++ solution?

Problem Statement:
Given an array arr of n integers, count the number of non-empty subsequences of the given array such that the product of their maximum element and minimum element is zero. Since this number can be huge, compute it modulo 10^9 + 7.
A subsequence of an array is defined as the sequence obtained by deleting several elements from the array (possibly none) without changing the order of the remaining elements.
Example
Given n = 3, arr = [1, 0, -2].
There are 7 subsequences of arr:
[1]: minimum = 1, maximum = 1, min * max = 1.
[1, 0]: minimum = 0, maximum = 1, min * max = 0.
[1, 0, -2]: minimum = -2, maximum = 1, min * max = -2.
[0]: minimum = 0, maximum = 0, min * max = 0.
[0, -2]: minimum = -2, maximum = 0, min * max = 0.
[1, -2]: minimum = -2, maximum = 1, min * max = -2.
[-2]: minimum = -2, maximum = -2, min * max = 4.
There are 3 subsequences whose minimum * maximum = 0:
[1, 0], [0], [0, -2]. Hence the answer is 3.
I tried to come up with a solution, by counting the number of zeroes, positive numbers and negative numbers and then adding possible subsequences(2^n, per count) to an empty variable.
My answer is way off though, it's 10 when the expected answer is 3. Can someone please point out my mistake?
#include<bits/stdc++.h>
using namespace std;
#define int long long
int zeroSubs(vector<int> arr){
int x = 0, y = 0, z = 0, ans = 0;
for(int i = 0; i < arr.size(); i++){
if(arr[i] == 0) z++;
else if(arr[i] < 0) x++;
else y++;
}
ans += ((int)pow(2, z))*((int)pow(2, x));
ans += ((int)pow(2, y))*((int)pow(2, z));
ans += ((int)pow(2, z));
return ans;
}
int32_t main()
{
//directly passed the sample test case as an array
cout<<zeroSubs({1, 0, -2});
return 0;
}
I made this slight change to the logic, so that the empty selection from each group is no longer counted:
ans += ((1<<z)-1)*((1<<x)-1);
ans += ((1<<y)-1)*((1<<z)-1);
ans += ((1<<z)-1);
Thanks a lot to everyone for the valuable feedback. It works now.
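The corrected count can be checked with a small Python sketch. The names `zero_subsequences` and `brute_force` are made up for this illustration. The three terms of the fix add up to (2^z - 1) * (2^x + 2^y - 1), where z, x and y are the numbers of zeros, negatives and positives: a qualifying subsequence needs at least one zero and must not contain both a negative and a positive element. The sketch also applies the modulo from the problem statement, which the C++ snippet leaves out.

```python
from itertools import combinations

MOD = 10**9 + 7


def zero_subsequences(arr):
    """Count non-empty subsequences with min * max == 0, modulo MOD."""
    z = sum(1 for a in arr if a == 0)  # zeros
    x = sum(1 for a in arr if a < 0)   # negatives
    y = sum(1 for a in arr if a > 0)   # positives
    # At least one zero, combined with any set of negatives or any set
    # of positives (the zeros-only case is counted once, hence the -1).
    return (pow(2, z, MOD) - 1) * (pow(2, x, MOD) + pow(2, y, MOD) - 1) % MOD


def brute_force(arr):
    """Enumerate every non-empty subsequence; only usable for tiny inputs."""
    count = 0
    for size in range(1, len(arr) + 1):
        for idx in combinations(range(len(arr)), size):
            sub = [arr[i] for i in idx]
            if min(sub) * max(sub) == 0:
                count += 1
    return count % MOD


print(zero_subsequences([1, 0, -2]))  # 3
```

The original attempt returned 10 for this input because 2^z * 2^x + 2^y * 2^z + 2^z also counts selections in which no zero is chosen.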

Dynamic programming state calculations

Question:
Fox Ciel is writing an AI for the game Starcraft and she needs your help.
In Starcraft, one of the available units is a mutalisk. Mutalisks are very useful for harassing Terran bases. Fox Ciel has one mutalisk. The enemy base contains one or more Space Construction Vehicles (SCVs). Each SCV has some amount of hit points.
When the mutalisk attacks, it can target up to three different SCVs.
The first targeted SCV will lose 9 hit points.
The second targeted SCV (if any) will lose 3 hit points.
The third targeted SCV (if any) will lose 1 hit point.
If the hit points of a SCV drop to 0 or lower, the SCV is destroyed. Note that you may not target the same SCV twice in the same attack.
You are given a int[] HP containing the current hit points of your enemy's SCVs. Return the smallest number of attacks in which you can destroy all these SCVs.
Constraints-
- x will contain between 1 and 3 elements, inclusive.
- Each element in x will be between 1 and 60, inclusive.
And the solution is:
int minimalAttacks(vector<int> x)
{
int dist[61][61][61];
memset(dist, -1, sizeof(dist));
dist[0][0][0] = 0;
for (int total = 1; total <= 180; total++) {
for (int i = 0; i <= 60 && i <= total; i++) {
for (int j = max(0, total - i - 60); j <= 60 && i + j <= total; j++) {
// j >= max(0, total - i - 60) ensures that k <= 60
int k = total - (i + j);
int & res = dist[i][j][k];
res = 1000000;
// one way to avoid doing repetitive work in enumerating
// all options is to use c++'s next_permutation,
// we first create a vector:
vector<int> curr = {i,j,k};
sort(curr.begin(), curr.end()); //needs to be sorted
// which will be permuted
do {
int ni = max(0, curr[0] - 9);
int nj = max(0, curr[1] - 3);
int nk = max(0, curr[2] - 1);
res = std::min(res, 1 + dist[ni][nj][nk] );
} while (next_permutation(curr.begin(), curr.end()) );
}
}
}
// get the case's respective hitpoints:
while (x.size() < 3) {
x.push_back(0); // add zeros for missing SCVs
}
int a = x[0], b = x[1], c = x[2];
return dist[a][b][c];
}
As far as I understand, this solution first calculates the best outcome for every possible state and then simply looks up the queried state and returns the result. But I don't understand the way this code is written. I cannot see the value of dist[i][j][k] being assigned anywhere; by default it is -1. So how come I get a different value when I query any dist[i][j][k]?
Can someone explain the code to me, please?
Thank you!
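One point that may help: in the C++ code, `int & res = dist[i][j][k];` declares a reference, so every assignment to `res` writes directly into `dist[i][j][k]`. The same recurrence can be sketched in Python with memoization instead of a filled table. The names `minimal_attacks` and `_attacks` are chosen for this sketch, and the state is kept sorted only to reduce the number of cached entries.

```python
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def _attacks(a, b, c):
    """Fewest attacks needed to bring hit points (a, b, c) down to zero."""
    if a == 0 and b == 0 and c == 0:
        return 0
    best = float("inf")
    # Try every way of assigning the damage values 9, 3 and 1 to the SCVs.
    for da, db, dc in permutations((9, 3, 1)):
        nxt = tuple(sorted((max(0, a - da), max(0, b - db), max(0, c - dc))))
        best = min(best, 1 + _attacks(*nxt))
    return best


def minimal_attacks(hp):
    # Pad with zeros for missing SCVs, as the C++ solution does.
    padded = tuple(sorted(list(hp) + [0] * (3 - len(hp))))
    return _attacks(*padded)


print(minimal_attacks([12, 10, 4]))  # 2
```

Every attack lowers the total hit points by at least 1, so the recursion depth stays at or below 180 for the given constraints.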

Diagonally Sorting a Two Dimensional Array in C++ [duplicate]

I'm building a heatmap-like rectangular array interface and I want the 'hot' location to be at the top left of the array, and the 'cold' location to be at the bottom right. Therefore, I need an array to be filled diagonally like this:
       0    1    2    3
    |----|----|----|----|
  0 |  0 |  2 |  5 |  8 |
    |----|----|----|----|
  1 |  1 |  4 |  7 | 10 |
    |----|----|----|----|
  2 |  3 |  6 |  9 | 11 |
    |----|----|----|----|
So actually, I need a function f(x,y) such that
f(0,0) = 0
f(2,1) = 7
f(1,2) = 6
f(3,2) = 11
(or, of course, a similar function f(n) where f(7) = 10, f(9) = 6, etc.).
Finally, yes, I know this question is similar to the ones asked here, here and here, but the solutions described there only traverse and don't fill a matrix.
Interesting problem if you are limited to go through the array row by row.
I divided the rectangle in three regions. The top left triangle, the bottom right triangle and the rhomboid in the middle.
For the top left triangle, the values in the first column (x=0) can be calculated using the common arithmetic series 1 + 2 + 3 + .. + n = n*(n+1)/2. Fields in that triangle with the same x+y value are on the same diagonal, and their value is that sum for the first column plus x.
The same approach works for the bottom right triangle, but instead of x and y, w-x and h-y are used, where w is the width and h the height of the rectangle. That value has to be subtracted from the highest value in the array, w*h-1.
There are two cases for the rhomboid in the middle. If the width of the rectangle is greater than (or equal to) the height, then the bottom left field of the rectangle is the field with the lowest value in the rhomboid, and it can be calculated using the sum from before for h-1. From there on you can imagine that the rhomboid is a rectangle with an x-value of x+y and a y-value of y from the original rectangle. So calculating the remaining values in that new rectangle is easy.
In the other case, when the height is greater than the width, the field at x=w-1 and y=0 can be calculated using that arithmetic sum, and the rhomboid can be imagined as a rectangle with x-value x and y-value y-(w-x-1).
The code can be optimised, for example by precalculating values. I think there is also a single formula covering all these cases. Maybe I will think about it later.
inline static int diagonalvalue(int x, int y, int w, int h) {
if (h > x+y+1 && w > x+y+1) {
// top/left triangle
return ((x+y)*(x+y+1)/2) + x;
} else if (y+x >= h && y+x >= w) {
// bottom/right triangle
return w*h - (((w-x-1)+(h-y-1))*((w-x-1)+(h-y-1)+1)/2) - (w-x-1) - 1;
}
// rhomboid in the middle
if (w >= h) {
return (h*(h+1)/2) + ((x+y+1)-h)*h - y - 1;
}
return (w*(w+1)/2) + ((x+y)-w)*w + x;
}
for (y=0; y<h; y++) {
for (x=0; x<w; x++) {
array[x][y] = diagonalvalue(x,y,w,h);
}
}
Of course if there is not such a limitation, something like that should be way faster:
n = w*h;
x = 0;
y = 0;
for (i=0; i<n; i++) {
array[x][y] = i;
if (y <= 0 || x+1 >= w) {
y = x+y+1;
if (y >= h) {
x = (y-h)+1;
y -= x;
} else {
x = 0;
}
} else {
x++;
y--;
}
}
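For comparison, the unrestricted fill can be sketched in Python by walking the anti-diagonals x + y = s directly. The function name `diagonal_fill` and the `grid[y][x]` layout are choices made for this sketch, not part of the answer above.

```python
def diagonal_fill(w, h):
    """Fill an h-by-w grid along anti-diagonals, starting at the top left."""
    grid = [[0] * w for _ in range(h)]
    n = 0
    for s in range(w + h - 1):  # s = x + y identifies one diagonal
        # Clamp x so that both x and y = s - x stay inside the grid.
        for x in range(max(0, s - h + 1), min(s, w - 1) + 1):
            grid[s - x][x] = n
            n += 1
    return grid


for row in diagonal_fill(4, 3):
    print(row)
# [0, 2, 5, 8]
# [1, 4, 7, 10]
# [3, 6, 9, 11]
```

The output matches the table in the question, with each diagonal running from bottom left to top right.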
What about this (having an NxN matrix):
count = 1;
for( int k = 0; k < 2*N-1; ++k ) {
int max_i = std::min(k,N-1);
int min_i = std::max(0,k-N+1);
for( int i = max_i, j = min_i; i >= min_i; --i, ++j ) {
M.at(i).at(j) = count++;
}
}
Follow the steps in the 3rd example -- this gives the indexes (in order to print out the slices) -- and just set the value with an incrementing counter:
int x[3][3];
int n = 3;
int pos = 1;
for (int slice = 0; slice < 2 * n - 1; ++slice) {
int z = slice < n ? 0 : slice - n + 1;
for (int j = z; j <= slice - z; ++j)
x[j][slice - j] = pos++;
}
At a M*N matrix, the values, when traversing like in your stated example, seem to increase by n, except for border cases, so
f(0,0)=0
f(1,0)=f(0,0)+2
f(2,0)=f(1,0)+3
...and so on up to f(N,0). Then
f(0,1)=1
f(0,2)=3
and then
f(m,n)=f(m-1,n)+N, where m,n are index variables
and
f(M,N)=f(M-1,N)+2, where M,N are the last indexes of the matrix
This is not conclusive, but it should give you something to work with. Note, that you only need the value of the preceding element in each row and a few starting values to begin.
If you want a simple function, you could use a recursive definition.
H = height
def get_point(x,y)
if x == 0
if y == 0
return 0
else
return get_point(y-1,0)+1
end
else
return get_point(x-1,y) + H
end
end
This takes advantage of the fact that any value is H+the value of the item to its left. If the item is already at the leftmost column, then you find the cell that is to its far upper right diagonal, and move left from there, and add 1.
This is a good chance to use dynamic programming, and "cache" or memoize the functions you've already accomplished.
If you want something "strictly" done by f(n), you could use the relationship:
n = ( n % W , n / W ) [integer division, with no remainder/decimal]
And work your function from there.
Alternatively, if you want a purely array-populating-by-rows method, with no recursion, you could follow these rules:
If you are on the first cell of the row, "remember" the item in the cell (R-1) (where R is your current row) of the first row, and add 1 to it.
Otherwise, simply add H to the cell you last computed (ie, the cell to your left).
Pseudo-code: (Assuming array is indexed by arr[row,column])
arr[0,0] = 0
for R from 0 to H
if R > 0
arr[R,0] = arr[0,R-1] + 1
end
for C from 1 to W
arr[R,C] = arr[R,C-1] + H
end
end

Efficient C/C++ algorithm on 2-dimensional max-sum window

I have a c[N][M] matrix where I apply a max-sum operation over a (K+1)² window. I am trying to reduce the complexity of the naive algorithm.
In particular, here's my code snippet in C++:
int N,M,K;
std::cin >> N >> M >> K;
std::pair< unsigned , unsigned > opt[N][M];
unsigned c[N][M];
// Read values for c[i][j]
// Initialize all opt[i][j] at (0,0).
for ( int i = 0; i < N; i ++ ) {
for ( int j = 0; j < M ; j ++ ) {
unsigned max = 0;
int posX = i, posY = j;
for ( int ii = i; (ii >= i - K) && (ii >= 0); ii -- ) {
for ( int jj = j; (jj >= j - K) && (jj >= 0); jj -- ) {
// Ignore the (i,j) position
if (( ii == i ) && ( jj == j )) {
continue;
}
if ( opt[ii][jj].second > max ) {
max = opt[ii][jj].second;
posX = ii;
posY = jj;
}
}
}
opt[i][j].first = opt[posX][posY].second;
opt[i][j].second = c[i][j] + opt[posX][posY].first;
}
}
The goal of the algorithm is to compute opt[N-1][M-1].
Example: for N = 4, M = 4, K = 2 and:
c[N][M] = 4 1 1 2
6 1 1 1
1 2 5 8
1 1 8 0
... the result should be opt[N-1][M-1] = {14, 11}.
The running complexity of this snippet is however O(N M K²). My goal is to reduce the running time complexity. I have already seen posts like this, but it appears that my "filter" is not separable, probably because of the sum operation.
More information (optional): this is essentially an algorithm which develops the optimal strategy in a "game" where:
Two players lead a single team in a N × M dungeon.
Each position of the dungeon has c[i][j] gold coins.
Starting position: (N-1,M-1) where c[N-1][M-1] = 0.
The active player chooses the next position to move the team to, from position (x,y).
The next position can be any of (x-i, y-j), i <= K, j <= K, i+j > 0. In other words, they can move only left and/or up, up to a step K per direction.
The player who just moved the team gets the coins in the new position.
The active player alternates each turn.
The game ends when the team reaches (0,0).
Optimal strategy for both players: maximize their own sum of gold coins, if they know that the opponent is following the same strategy.
Thus, opt[i][j].first represents the coins of the player who will now move from (i,j) to another position. opt[i][j].second represents the coins of the opponent.
Here is a O(N * M) solution.
Let's fix the lower row (r). If the maximum for all rows between r - K and r is known for every column, this problem can be reduced to the well-known sliding window maximum problem. So it is possible to compute the answer for a fixed row in O(M) time.
Let's iterate over all rows in increasing order. For each column, the maximum for all rows between r - K and r is the sliding window maximum problem, too. Processing each column takes O(N) time for all rows.
The total time complexity is O(N * M).
However, there is one issue with this solution: it does not exclude the (i, j) element. It is possible to fix it by running the algorithm described above twice (with K * (K + 1) and (K + 1) * K windows) and then merging the results (a (K + 1) * (K + 1) square without a corner is a union of two rectangles of size K * (K + 1) and (K + 1) * K).
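The sliding window maximum building block can be sketched in Python with a deque of indices. This covers only the one-dimensional step, not the full two-pass solution, and the name `sliding_window_max` and the `width` parameter are chosen for this sketch. For each position i it returns the maximum of the last `width` values ending at i, with shorter windows near the start, which mirrors the clipping at index 0 in the question.

```python
from collections import deque


def sliding_window_max(values, width):
    """Maximum of values[max(0, i - width + 1) .. i] for every index i."""
    window = deque()  # indices whose values are strictly decreasing
    result = []
    for i, v in enumerate(values):
        # Drop indices that can never be the maximum again.
        while window and values[window[-1]] <= v:
            window.pop()
        window.append(i)
        # Drop the front index once it falls out of the window.
        if window[0] <= i - width:
            window.popleft()
        result.append(values[window[0]])
    return result


print(sliding_window_max([1, 3, 2, 5, 4], 2))  # [1, 3, 3, 5, 5]
```

Each index is pushed and popped at most once, so a row of length M is processed in O(M) time regardless of the window width.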

Generating a random DAG

I am solving a problem on directed acyclic graph.
But I am having trouble testing my code on some directed acyclic graphs. The test graphs should be large, and (obviously) acyclic.
I tried a lot to write code for generating acyclic directed graphs. But I failed every time.
Is there some existing method to generate acyclic directed graphs I could use?
I cooked up a C program that does this. The key is to 'rank' the nodes, and only draw edges from lower ranked nodes to higher ranked ones.
The program I wrote prints in the DOT language.
Here is the code itself, with comments explaining what it means:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MIN_PER_RANK 1 /* Nodes/Rank: How 'fat' the DAG should be. */
#define MAX_PER_RANK 5
#define MIN_RANKS 3 /* Ranks: How 'tall' the DAG should be. */
#define MAX_RANKS 5
#define PERCENT 30 /* Chance of having an Edge. */
int main (void)
{
int i, j, k,nodes = 0;
srand (time (NULL));
int ranks = MIN_RANKS
+ (rand () % (MAX_RANKS - MIN_RANKS + 1));
printf ("digraph {\n");
for (i = 0; i < ranks; i++)
{
/* New nodes of 'higher' rank than all nodes generated till now. */
int new_nodes = MIN_PER_RANK
+ (rand () % (MAX_PER_RANK - MIN_PER_RANK + 1));
/* Edges from old nodes ('nodes') to new ones ('new_nodes'). */
for (j = 0; j < nodes; j++)
for (k = 0; k < new_nodes; k++)
if ( (rand () % 100) < PERCENT)
printf (" %d -> %d;\n", j, k + nodes); /* An Edge. */
nodes += new_nodes; /* Accumulate into old node set. */
}
printf ("}\n");
return 0;
}
And here is the graph generated from a test run:
The answer to https://mathematica.stackexchange.com/questions/608/how-to-generate-random-directed-acyclic-graphs applies: if you have an adjacency matrix representation of the edges of your graph, then if the matrix is strictly lower triangular (zeros on the diagonal), it's a DAG by necessity.
A similar approach would be to take an arbitrary ordering of your nodes, and then consider edges from node x to y only when x < y. That constraint should also get your DAGness by construction. Memory comparison would be one arbitrary way to order your nodes if you're using structs to represent nodes.
Basically, the pseudocode would be something like:
for(i = 0; i < N; i++) {
for (j = i+1; j < N; j++) {
maybePutAnEdgeBetween(i, j);
}
}
where N is the number of nodes in your graph.
The pseudocode suggests that the number of DAGs it can produce, for a fixed ordering of the N nodes, is
2^(N*(N-1)/2),
since there are
N*(N-1)/2
pairs (i, j) with i < j ("N choose 2"), and we can choose either to have the edge between them or not.
So, to try to put all these reasonable answers together:
(In the following, I used V for the number of vertices in the generated graph, and E for the number of edges, and we assume that E ≤ V(V-1)/2.)
Personally, I think the most useful answer is in a comment, by Flavius, who points at the code at http://condor.depaul.edu/rjohnson/source/graph_ge.c. That code is really simple, and it's conveniently described by a comment, which I reproduce:
To generate a directed acyclic graph, we first
generate a random permutation dag[0],...,dag[v-1].
(v = number of vertices.)
This random permutation serves as a topological
sort of the graph. We then generate random edges of the
form (dag[i],dag[j]) with i < j.
In fact, what the code does is generate the requested number of edges by repeatedly doing the following:
generate two numbers in the range [0, V);
reject them if they're equal;
swap them if the first is larger;
reject them if it has generated them before.
The problem with this solution is that as E gets close to the maximum number of edges V(V-1)/2, the algorithm becomes slower and slower, because it has to reject more and more edges. A better solution would be to make a vector of all V(V-1)/2 possible edges, randomly shuffle it, and select the first E edges in the shuffled list.
The reservoir sampling algorithm lets us do this in space O(E), since we can deduce the endpoints of the kth edge from the value of k. Consequently, we don't actually have to create the source vector. However, it still requires O(V^2) time.
Alternatively, one can do a Fisher-Yates shuffle (or Knuth shuffle, if you prefer), stopping after E iterations. In the version of the FY shuffle presented in Wikipedia, this will produce the trailing entries, but the algorithm works just as well backwards:
// At the end of this snippet, a consists of a random sample of the
// integers in the half-open range [0, V(V-1)/2). (They still need to be
// converted to pairs of endpoints).
vector<int> a;
int N = V * (V - 1) / 2;
for (int i = 0; i < N; ++i) a.push_back(i);
for (int i = 0; i < E; ++i) {
    int j = i + rand() % (N - i); // random index in [i, N)
    swap(a[i], a[j]);
}
a.resize(E);
The shuffle loop requires only O(E) time, but the vector a requires O(V^2) space (and time to fill). In fact, this can be improved to O(E) space with some trickery, but an SO code snippet is too small to contain the result, so I'll provide a simpler one in O(E) space and O(E log E) time. I assume that there is a class DAG with at least:
class DAG {
public:
// Construct an empty DAG with v vertices
explicit DAG(int v);
// Add the directed edge i->j, where 0 <= i, j < v
void add(int i, int j);
};
Now here goes:
// Return a randomly-constructed DAG with V vertices and E edges.
// It's required that 0 < E < V(V-1)/2.
template<typename PRNG>
DAG RandomDAG(int V, int E, PRNG& prng) {
using dist = std::uniform_int_distribution<int>;
// Make a random sample of size E
std::vector<int> sample;
sample.reserve(E);
int N = V * (V - 1) / 2;
dist d(0, N - E); // uniform_int_distribution is closed range
// Random vector of integers in [0, N-E]
for (int i = 0; i < E; ++i) sample.push_back(dist(prng));
// Sort them, and make them unique
std::sort(sample.begin(), sample.end());
for (int i = 1; i < E; ++i) sample[i] += i;
// Now it's a unique sorted list of integers in [0, N-E+E-1]
// Randomly shuffle the endpoints, so the topological sort
// is different, too.
std::vector<int> endpoints;
endpoints.reserve(V);
for (int i = 0; i < V; ++i) endpoints.push_back(i);
std::shuffle(endpoints.begin(), endpoints.end(), prng);
// Finally, create the dag
DAG rv(V);
for (auto& v : sample) {
int tail = int(0.5 + sqrt((v + 1) * 2));
int head = v - tail * (tail - 1) / 2;
// Map through the shuffled labels so the topological order is random
rv.add(endpoints[head], endpoints[tail]);
}
return rv;
}
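The same sampling idea can be sketched in Python. This is a hedged illustration, not a translation of the C++ above: `random_dag_edges` is a name made up here, and it relies on `random.sample` over a `range`, which draws E distinct edge indices without materialising all V(V-1)/2 of them. Each index k is decoded to a pair (head, tail) with head < tail, and the vertices are relabelled through a random permutation so that the topological order varies too.

```python
import math
import random


def random_dag_edges(v, e, rng=random):
    """Return (edges, labels): e distinct edges of a random DAG on v vertices."""
    n = v * (v - 1) // 2  # number of possible edges
    if not 0 <= e <= n:
        raise ValueError("e must be between 0 and v*(v-1)/2")
    labels = list(range(v))
    rng.shuffle(labels)  # labels[i] is the vertex at topological position i
    edges = []
    for k in rng.sample(range(n), e):
        # tail is the largest t with t*(t-1)/2 <= k; head is the offset.
        tail = (1 + math.isqrt(8 * k + 1)) // 2
        head = k - tail * (tail - 1) // 2
        edges.append((labels[head], labels[tail]))
    return edges, labels


edges, labels = random_dag_edges(6, 10, random.Random(1))
print(len(edges))  # 10
```

Every edge goes from an earlier to a later position in `labels`, so `labels` is a valid topological order of the result.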
You could generate a random directed graph, and then do a depth-first search for cycles. When you find a cycle, break it by deleting an edge.
I think this is worst case O(VE). Each DFS takes O(V), and each one removes at least one edge (so max E)
If you generate the directed graph by uniformly random selecting all V^2 possible edges, and you DFS in random order and delete a random edge - this would give you a uniform distribution (or at least close to it) over all possible dags.
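A minimal Python sketch of the cycle-breaking idea follows. It makes some assumptions: the graph is generated with an independent edge probability `p`, and the edge deleted from each cycle is always the back edge that the DFS finds, not a randomly chosen one. All names are invented for the illustration. A single DFS is enough, because every cycle in a directed graph contains at least one back edge of that DFS.

```python
import random


def random_dag_by_cycle_breaking(n, p, rng=random):
    """Generate a random digraph on n vertices and drop its DFS back edges."""
    adj = {u: [w for w in range(n) if w != u and rng.random() < p]
           for u in range(n)}
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = [WHITE] * n
    kept = []

    def visit(u):
        color[u] = GREY
        for w in adj[u]:
            if color[w] == GREY:
                continue  # back edge: it closes a cycle, so delete it
            kept.append((u, w))
            if color[w] == WHITE:
                visit(w)
        color[u] = BLACK

    for u in range(n):
        if color[u] == WHITE:
            visit(u)
    return kept


dag = random_dag_by_cycle_breaking(30, 0.2, random.Random(7))
```

The recursion depth can reach n, so an explicit stack would be needed for graphs with many thousands of vertices.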
A very simple approach is:
Randomly assign edges by iterating over the indices of a lower diagonal matrix (as suggested by a link above: https://mathematica.stackexchange.com/questions/608/how-to-generate-random-directed-acyclic-graphs)
This will give you a DAG with possibly more than one component. You can use a Disjoint-set data structure to give you the components that can then be merged by creating edges between the components.
Disjoint-sets are described here: https://en.wikipedia.org/wiki/Disjoint-set_data_structure
Edit: I initially found this post while I was working with a scheduling problem named flexible job shop scheduling problem with sequencing flexibility where jobs (the order in which operations are processed) are defined by directed acyclic graphs. The idea was to use an algorithm to generate multiple random directed graphs (jobs) and create instances of the scheduling problem to test my algorithms. The code at the end of this post is a basic version of the one I used to generate the instances. The instance generator can be found here.
I translated to Python and integrated some functionalities to create a transitive set of the random DAG. In this way, the graph generated has the minimum number of edges with the same reachability.
The transitive graph can be visualized at http://dagitty.net/dags.html by pasting the output in Model code (in the right).
Python version of the algorithm
import random


class Graph:
    def __init__(self):
        # Per-instance lists (class-level lists would be shared by all graphs)
        self.nodes = []
        self.edges = []
        self.removed_edges = []

    def remove_edge(self, x, y):
        e = (x, y)
        try:
            self.edges.remove(e)
            # print("Removed edge %s" % str(e))
            self.removed_edges.append(e)
        except ValueError:
            # The edge is not in the graph: nothing to remove
            return

    def Nodes(self):
        return self.nodes


def get_random_dag():
    MIN_PER_RANK = 1  # Nodes/Rank: How 'fat' the DAG should be
    MAX_PER_RANK = 2
    MIN_RANKS = 6  # Ranks: How 'tall' the DAG should be
    MAX_RANKS = 10
    PERCENT = 0.3  # Chance of having an Edge
    nodes = 0
    ranks = random.randint(MIN_RANKS, MAX_RANKS)
    adjacency = []
    for i in range(ranks):
        # New nodes of 'higher' rank than all nodes generated till now
        new_nodes = random.randint(MIN_PER_RANK, MAX_PER_RANK)
        # Edges from old nodes ('nodes') to new ones ('new_nodes')
        for j in range(nodes):
            for k in range(new_nodes):
                if random.random() < PERCENT:
                    adjacency.append((j, k + nodes))
        nodes += new_nodes
    # Compute transitive graph
    G = Graph()
    # Append nodes
    for i in range(nodes):
        G.nodes.append(i)
    # Append adjacencies
    for i in range(len(adjacency)):
        G.edges.append(adjacency[i])
    N = G.Nodes()
    for x in N:
        for y in N:
            for z in N:
                if (x, y) != (y, z) and (x, y) != (x, z):
                    if (x, y) in G.edges and (y, z) in G.edges:
                        G.remove_edge(x, z)
    # Print graph
    for i in range(nodes):
        print(i)
    print()
    for value in G.edges:
        print(str(value[0]) + ' ' + str(value[1]))


get_random_dag()
Below, you can see in the figure the random DAG with many redundant edges generated by the Python code above.
I adapted the code to generate the same graph (same reachability) but with the least possible number of edges. This is also called transitive reduction.
def get_random_dag():
    MIN_PER_RANK = 1  # Nodes/Rank: How 'fat' the DAG should be
    MAX_PER_RANK = 3
    MIN_RANKS = 15  # Ranks: How 'tall' the DAG should be
    MAX_RANKS = 20
    PERCENT = 0.3  # Chance of having an Edge
    nodes = 0
    node_counter = 0
    ranks = random.randint(MIN_RANKS, MAX_RANKS)
    adjacency = []
    rank_list = []
    for i in range(ranks):
        # New nodes of 'higher' rank than all nodes generated till now
        new_nodes = random.randint(MIN_PER_RANK, MAX_PER_RANK)
        rank_nodes = []  # renamed from 'list' to avoid shadowing the builtin
        for j in range(new_nodes):
            rank_nodes.append(node_counter)
            node_counter += 1
        rank_list.append(rank_nodes)
        print(rank_list)
        # Edges from the previous rank only to the new nodes ('new_nodes')
        if i > 0:
            for j in rank_list[i - 1]:
                for k in range(new_nodes):
                    if random.random() < PERCENT:
                        adjacency.append((j, k + nodes))
        nodes += new_nodes
    for i in range(nodes):
        print(i)
    print()
    for edge in adjacency:
        print(str(edge[0]) + ' ' + str(edge[1]))
    print()
    print()
Result:
Create a graph with n nodes and an edge between each pair of node n1 and n2 if n1 != n2 and n2 % n1 == 0.
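A short Python sketch of this construction, assuming the nodes are numbered 1..n (a node 0 would make n2 % n1 undefined) and using the made-up name `divisibility_dag`. The result is deterministic rather than random, but it is acyclic by construction, since every edge goes from a smaller number to a larger one.

```python
def divisibility_dag(n):
    """Edges a -> b over nodes 1..n whenever a != b and b % a == 0."""
    return [(a, b)
            for a in range(1, n + 1)
            for b in range(2 * a, n + 1, a)]  # proper multiples of a


print(divisibility_dag(6))
# [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 4), (2, 6), (3, 6)]
```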
I recently tried re-implementing the accepted answer and found that it is nondeterministic. If you don't enforce the min_per_rank parameter, you could end up with a graph with 0 nodes.
To prevent this, I wrapped the for loops in a function and then checked, after each rank, that min_per_rank was satisfied. Here's the JavaScript implementation:
https://github.com/karissa/random-dag
And some pseudo-C code that would replace the accepted answer's main loop.
int pushed = 0;
int addRank (void)
{
    for (j = 0; j < nodes; j++)
        for (k = 0; k < new_nodes; k++)
            if ( (rand () % 100) < PERCENT) {
                printf (" %d -> %d;\n", j, k + nodes); /* An Edge. */
                pushed++; /* Count the edges emitted for this rank. */
            }
    if (pushed < min_per_rank) return addRank();
    else pushed = 0;
    return 0;
}
Generating a random DAG which might not be connected
Here's a simple algorithm for generating a random DAG that might not be connected.
const randomDAG = (x, n) => {
const length = n * (n - 1) / 2;
const dag = new Array(length);
for (let i = 0; i < length; i++) {
dag[i] = Math.random() < x ? 1 : 0;
}
return dag;
};
const dagIndex = (n, i, j) => n * i + j - (i + 1) * (i + 2) / 2;
const dagToDot = (n, dag) => {
let dot = "digraph {\n";
for (let i = 0; i < n; i++) {
dot += ` ${i};\n`;
for (let j = i + 1; j < n; j++) {
const k = dagIndex(n, i, j);
if (dag[k]) dot += ` ${i} -> ${j};\n`;
}
}
return dot + "}";
};
const randomDot = (x, n) => dagToDot(n, randomDAG(x, n));
new Viz().renderSVGElement(randomDot(0.3, 10)).then(svg => {
document.body.appendChild(svg);
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/viz.js/2.1.2/viz.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/viz.js/2.1.2/full.render.js"></script>
If you run this code snippet a couple of times, you might see a DAG which is not connected.
So, how does this code work?
A directed acyclic graph (DAG) is just a topologically sorted undirected graph. An undirected graph of n vertices can have a maximum of n * (n - 1) / 2 edges, not counting repeated edges or edges from a vertex to itself. Now, you can only have an edge from a lower vertex to a higher vertex. Hence, the direction of all the edges are predetermined.
This means that you can represent the entire DAG using a one dimensional array of n * (n - 1) / 2 edge weights. An edge weight of 0 means that the edge is absent. Hence, we just create a random array of zeros or ones, and that's our random DAG.
An edge from vertex i to vertex j in a DAG of n vertices, where i < j, has an edge weight at index k where k = n * i + j - (i + 1) * (i + 2) / 2.
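The index formula can be checked with a small Python sketch. The names mirror the JavaScript above, but the code is only an illustration: for n vertices, the pairs (i, j) with i < j should map one-to-one onto the positions 0 .. n*(n-1)/2 - 1 of the flat array.

```python
import random


def dag_index(n, i, j):
    """Position of edge i -> j (with i < j) in the flat edge array."""
    return n * i + j - (i + 1) * (i + 2) // 2


def random_dag(x, n, rng=random):
    """Flat array of n*(n-1)/2 entries; each edge is present with probability x."""
    return [1 if rng.random() < x else 0 for _ in range(n * (n - 1) // 2)]


def dag_edges(n, dag):
    """Decode the flat array back into a list of (i, j) edges."""
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if dag[dag_index(n, i, j)]]


indices = [dag_index(4, i, j) for i in range(4) for j in range(i + 1, 4)]
print(indices)  # [0, 1, 2, 3, 4, 5]
```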
Generating a connected DAG
Once you generate a random DAG, you can check if it's connected using the following function.
const isConnected = (n, dag) => {
const reached = new Array(n).fill(false);
reached[0] = true;
const queue = [0];
while (queue.length > 0) {
const x = queue.shift();
for (let i = 0; i < n; i++) {
if (i === x || reached[i]) continue; // skip x itself and vertices already reached
const j = i < x ? dagIndex(n, i, x) : dagIndex(n, x, i);
if (dag[j] === 0) continue;
reached[i] = true;
queue.push(i);
}
}
return reached.every(x => x); // return true if every vertex was reached
};
If it's not connected then its complement will always be connected.
const complement = dag => dag.map(x => x ? 0 : 1);
const randomConnectedDAG = (x, n) => {
const dag = randomDAG(x, n);
return isConnected(n, dag) ? dag : complement(dag);
};
Note that if we create a random DAG with 30% edges then its complement will have 70% edges. Hence, the only safe value for x is 50%. However, if you care about connectivity more than the percentage of edges then this shouldn't be a deal breaker.
Finally, putting it all together.
const randomDAG = (x, n) => {
const length = n * (n - 1) / 2;
const dag = new Array(length);
for (let i = 0; i < length; i++) {
dag[i] = Math.random() < x ? 1 : 0;
}
return dag;
};
const dagIndex = (n, i, j) => n * i + j - (i + 1) * (i + 2) / 2;
const isConnected = (n, dag) => {
const reached = new Array(n).fill(false);
reached[0] = true;
const queue = [0];
while (queue.length > 0) {
const x = queue.shift();
for (let i = 0; i < n; i++) {
if (i === x || reached[i]) continue; // skip x itself and vertices already reached
const j = i < x ? dagIndex(n, i, x) : dagIndex(n, x, i);
if (dag[j] === 0) continue;
reached[i] = true;
queue.push(i);
}
}
return reached.every(x => x); // return true if every vertex was reached
};
const complement = dag => dag.map(x => x ? 0 : 1);
const randomConnectedDAG = (x, n) => {
const dag = randomDAG(x, n);
return isConnected(n, dag) ? dag : complement(dag);
};
const dagToDot = (n, dag) => {
let dot = "digraph {\n";
for (let i = 0; i < n; i++) {
dot += ` ${i};\n`;
for (let j = i + 1; j < n; j++) {
const k = dagIndex(n, i, j);
if (dag[k]) dot += ` ${i} -> ${j};\n`;
}
}
return dot + "}";
};
const randomConnectedDot = (x, n) => dagToDot(n, randomConnectedDAG(x, n));
new Viz().renderSVGElement(randomConnectedDot(0.3, 10)).then(svg => {
document.body.appendChild(svg);
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/viz.js/2.1.2/viz.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/viz.js/2.1.2/full.render.js"></script>
If you run this code snippet a couple of times, you may see a DAG with a lot more edges than others.
Generating a connected DAG with a certain percentage of edges
If you care about both connectivity and having a certain percentage of edges then you can use the following algorithm.
Start with a fully connected graph.
Randomly remove edges.
After removing an edge, check if the graph is still connected.
If it's no longer connected then add that edge back.
It should be noted that this algorithm is not as efficient as the previous method.
const randomDAG = (x, n) => {
const length = n * (n - 1) / 2;
const dag = new Array(length).fill(1);
for (let i = 0; i < length; i++) {
if (Math.random() < x) continue;
dag[i] = 0;
if (!isConnected(n, dag)) dag[i] = 1;
}
return dag;
};
const dagIndex = (n, i, j) => n * i + j - (i + 1) * (i + 2) / 2;
const isConnected = (n, dag) => {
const reached = new Array(n).fill(false);
reached[0] = true;
const queue = [0];
while (queue.length > 0) {
const x = queue.shift();
for (let i = 0; i < n; i++) {
if (i === x || reached[i]) continue; // skip x itself and vertices already reached
const j = i < x ? dagIndex(n, i, x) : dagIndex(n, x, i);
if (dag[j] === 0) continue;
reached[i] = true;
queue.push(i);
}
}
return reached.every(x => x); // return true if every vertex was reached
};
const dagToDot = (n, dag) => {
let dot = "digraph {\n";
for (let i = 0; i < n; i++) {
dot += ` ${i};\n`;
for (let j = i + 1; j < n; j++) {
const k = dagIndex(n, i, j);
if (dag[k]) dot += ` ${i} -> ${j};\n`;
}
}
return dot + "}";
};
const randomDot = (x, n) => dagToDot(n, randomDAG(x, n));
new Viz().renderSVGElement(randomDot(0.3, 10)).then(svg => {
document.body.appendChild(svg);
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/viz.js/2.1.2/viz.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/viz.js/2.1.2/full.render.js"></script>
Hope that helps.
To test algorithms, I generated random graphs based on node layers. This is the Python script (it also prints the adjacency list). You can change the node connection probabilities or add layers to get slightly different or "taller" graphs:
# Weighted DAG generator by forward layers
import argparse
import random

parser = argparse.ArgumentParser("dag_gen2")
parser.add_argument(
    "--layers",
    help="DAG forward layers. Default=5",
    type=int,
    default=5,
)
args = parser.parse_args()
layers = [[] for _ in range(args.layers)]
edges = {}
node_index = -1
print(f"Creating {len(layers)} layers graph")


# Random horizontal connections -low probability-
def random_horizontal(layer):
    for node1 in layer:
        # Avoid cycles: only connect to higher-numbered nodes of the same
        # layer (excluding reverse edges alone would still allow 3-cycles)
        for node2 in filter(lambda n2: node1 < n2, layer):
            if random.randint(0, 100) < 10:
                w = random.randint(1, 10)
                edges[node1].append((node2, w))


# Connect two layers
def connect(layer1, layer2):
    random_horizontal(layer1)
    for node1 in layer1:
        for node2 in layer2:
            if random.randint(0, 100) < 30:
                w = random.randint(1, 10)
                edges[node1].append((node2, w))


# Start nodes 1 to 3
start_nodes = random.randint(1, 3)
start_layer = []
for sn in range(start_nodes + 1):
    node_index += 1
    start_layer.append(node_index)
# Gen nodes
for layer in layers:
    nodes = random.randint(2, 5)
    for n in range(nodes):
        node_index += 1
        layer.append(node_index)
# Connect all
layers.insert(0, start_layer)
for layer in layers:
    for node in layer:
        edges[node] = []
for i, layer in enumerate(layers[:-1]):
    connect(layer, layers[i + 1])
# Print in DOT language
print("digraph {")
for node_key in [node_key for node_key in edges.keys() if len(edges[node_key]) > 0]:
    for node_dst, weight in edges[node_key]:
        print(f" {node_key} -> {node_dst} [label={weight}];")
print("}")
print("---- Adjacency list ----")
print(edges)