How to crossover without repetition - scheduling

I am working on a static flow-shop scheduling problem with 10 tasks and 3 machines. As a permutation problem, it has many possible sequences in which the tasks can go through the machines.
Now consider the following two sequences. I would like to know how to cross them over so that the child chromosomes contain no repeated tasks.
first sequence = T3 T2 T5 T6 T9 T1 T4 T7 T8 T10
second sequence = T3 T2 T6 T8 T1 T5 T4 T7 T9 T10
How do I build a child sequence that has no repetition and still contains every task?

Your question does not include any code to review, so I assume you are after a general approach.
Possible Solution:
This would be the canonical approach for a position-based crossover. Genes that occupy the same position in both parents are not moved, whereas genes occupying different positions are exchanged with one another in a two-step transition (swap, then compensate):
1. clone: create child c1 as a copy of p1 and child c2 as a copy of p2
2. slice: compute the set of indexes at which c1 and c2 hold different values
3. randomize: pick a random index idx from this set, and remove it from the set
4. swap: if v1 is at position idx in c1 and v2 is at position idx in c2, swap the values so that v1 is now at position idx in c2 and v2 is at position idx in c1
5. compensate: find the index idy s.t. v2 is at position idy in p1 and the index idz s.t. v1 is at position idz in p2, and compensate the swap so that v1 is now at position idy in c1 and v2 is at position idz in c2. Remove both idy and idz from the index set of steps 2/3.
6. reiterate: with probability p, go back to step 3.
Example:
// idx = 8, indexes start from 0
|3 2| 9 6 5 8 |4 7| 1 |10| // c1
/ |
|3 2| 6 1 8 9 |4 7| 5 |10| // c2
=
|3 2| 9 6 1 8 |4 7| 5 |10| // c1
|3 2| 6 5 8 9 |4 7| 1 |10| // c2
Considerations:
As you can see, the crossover degenerates into a guided mutation, and this is due to the constraint of not introducing repetitions within your genetic code. However, it is still different from mutation, as the latter would also affect genes that are in the same position for both parents.
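The steps above can be sketched in Python roughly as follows. This is only a minimal sketch: the names exchange_at and position_crossover and the rng parameter are made up for illustration, and the compensation step looks up v1 and v2 in the current children rather than in the parents, which gives the same positions as long as used indexes are removed from the set.

```python
import random

def exchange_at(c1, c2, idx):
    """Swap the genes at idx between c1 and c2, then compensate (in place)."""
    v1, v2 = c1[idx], c2[idx]
    idy = c1.index(v2)  # where v2 currently sits in c1
    idz = c2.index(v1)  # where v1 currently sits in c2
    c1[idx], c1[idy] = v2, v1  # v2 moves to idx, v1 fills the hole at idy
    c2[idx], c2[idz] = v1, v2  # v1 moves to idx, v2 fills the hole at idz
    return idy, idz

def position_crossover(p1, p2, p=0.5, rng=random):
    """Return two children that are permutations of the parents' genes."""
    c1, c2 = list(p1), list(p2)                                  # clone
    differing = {i for i, (a, b) in enumerate(zip(c1, c2)) if a != b}  # slice
    while differing:
        idx = rng.choice(sorted(differing))                      # randomize
        differing.discard(idx)
        idy, idz = exchange_at(c1, c2, idx)                      # swap + compensate
        differing -= {idy, idz}
        if rng.random() >= p:                                    # reiterate with probability p
            break
    return c1, c2

# The example from above: idx = 8, indexes start from 0
c1 = [3, 2, 9, 6, 5, 8, 4, 7, 1, 10]
c2 = [3, 2, 6, 1, 8, 9, 4, 7, 5, 10]
exchange_at(c1, c2, 8)
print(c1)  # [3, 2, 9, 6, 1, 8, 4, 7, 5, 10]
print(c2)  # [3, 2, 6, 5, 8, 9, 4, 7, 1, 10]
```

Because every change to a child is an exchange of two of its own positions, each child stays a permutation of the original tasks, and genes shared by both parents at the same position are never touched.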

Shortest Path between two matrices

I have two distance matrices with overlapping variable names.
dfA:
Start A1 A2 A3 A4 … A150
Location
A 12 4 12 2 9
B 5 2 19 4 3
C 1 4 8 7 12
dfB:
A B C
X 4 12 32
Y 1 6 12
Z 2 8,5 11
So from the starts A1, A2, etc. there are paths through A, B and C to X, Y and Z.
I would like to find the shortest path for an item, for example for the combination A1 -> Z. I programmed this by loading CSVs with the distance matrices and unstacking them. Then, with df.iterrows() and two for loops, I loop through the possible combinations and find the smallest total for the combination A1 -> Z.
But since I have to do this for around 30000 items, it takes way too long.
Does anybody know how to do this in a vectorized way?
I added a location D so that the axis lengths differ (dfB is then not a square matrix). This is only for my convenience; it works with square matrices too.
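import pandas as pd
import numpy as np
# Whitespace-separated files; decimal="," parses values such as 8,5
df_a = pd.read_csv('dfA.csv', sep=r'\s+', index_col=0, decimal=",")
df_b = pd.read_csv('dfB.csv', sep=r'\s+', index_col=0, decimal=",")
mat_a = df_a.values  # shape (locations, starts)
mat_b = df_b.values  # shape (ends, locations)
# Bring both matrices to the common shape (locations, starts, ends)
mat_a2 = np.expand_dims(mat_a, axis=2)
mat_b2 = np.expand_dims(mat_b.T, axis=1)
mat_a3 = np.tile(mat_a2, (1, 1, mat_b.shape[0]))
mat_b3 = np.tile(mat_b2, (1, mat_a.shape[1], 1))
# Total distance start -> location -> end for every combination
tot = mat_a3 + mat_b3
# Index of the cheapest intermediate location for each (end, start) pair
ind = np.argmin(tot, axis=0).T
df_c = pd.DataFrame(df_b.columns.values[ind], columns=df_a.columns, index=df_b.index)
print(df_c)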
dfA:
Start_Location A1 A2 A3 A4 A150
A 12 4 12 2 9
B 5 2 19 4 3
C 1 4 8 7 12
D 5 2 9 11 4
dfB:
A B C D
X 4 12 32 11,4
Y 1 6 2 9,3
Z 2 8,5 11 1,4
dfC:
A1 A2 A3 A4 A150
X A A A A A
Y C A C A B
Z D D D A D
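If the shortest distance itself is needed and not only the intermediate location, the same idea can be written with NumPy broadcasting instead of np.tile. The following is a sketch under the assumption that the matrices are already loaded as arrays; the variable names (starts, via, ends, best_via, best_dist) are made up for illustration and the numbers are the sample data above.

```python
import numpy as np

starts = ["A1", "A2", "A3", "A4", "A150"]
via = ["A", "B", "C", "D"]
ends = ["X", "Y", "Z"]

# Rows: intermediate locations A..D, columns: starts (dfA above)
a = np.array([[12, 4, 12, 2, 9],
              [5, 2, 19, 4, 3],
              [1, 4, 8, 7, 12],
              [5, 2, 9, 11, 4]], dtype=float)
# Rows: ends X..Z, columns: intermediate locations A..D (dfB above)
b = np.array([[4, 12, 32, 11.4],
              [1, 6, 2, 9.3],
              [2, 8.5, 11, 1.4]])

# Broadcast to shape (ends, via, starts): total[e, v, s] = b[e, v] + a[v, s]
total = b[:, :, None] + a[None, :, :]
best_via = total.argmin(axis=1)   # index of the cheapest intermediate location
best_dist = total.min(axis=1)     # length of the corresponding shortest path

e, s = ends.index("Z"), starts.index("A1")
print(via[best_via[e, s]], best_dist[e, s])
```

Broadcasting avoids materialising the tiled copies, which may matter with 30000 items, although the summed array still has one entry per (end, location, start) combination.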

OpenOffice Calc: Finding max value

Let's say that I have an array:
|A B C D
---------------
1 |2 8 6 3
2 |1 2 5 2
The first row stands for "Goals Scored" and the second row for "Goals Lost". Columns stand for games/matches.
I want to find the maximum total number of goals scored and lost in one match. In the case above it would be 11 (C1 + C2).
I don't want to use
I spent a few days trying functions like MAX, ADDRESS, CELL, SUBTOTAL, SUM, MMULT, TRANSPOSE, etc., and even combinations of them, but I didn't get a satisfying result.
The MAX function should work:
Returns the maximum of a list of arguments, ignoring text entries.
Example:
MAX(B1:B3)
where cells B1, B2, B3 contain 1.1, 2.2, and apple, returns 2.2.
Select cell A3 and set the formula to =A1+A2. Then click the square in the lower-right corner of the selected cell and drag to D3. This gives:
|A B C D
---------------
1 |2 8 6 3
2 |1 2 5 2
3 |3 10 11 5
Now =MAX(A3:D3) produces 11.

Finding the neighbors of a node/vertex in a 2D mesh

I have a 2D mesh defined by nodes and elements.
Structure of a node: Node ID, X position, Y position
Structure of an element: Element ID, Node 1, Node 2, Node 3, Node 4
Example of a 2x2 elements mesh:
Nodes:
ID X Y
1 0 0
2 0 1
3 0 2
4 1 0
5 1 1
6 1 2
7 2 0
8 2 1
9 2 2
Elements:
ID N1 N2 N3 N4
1 1 2 4 5
2 2 3 5 6
3 4 5 7 8
4 5 6 8 9
N7-----N8-----N9
| | |
| E3 | E4 |
| | |
N4-----N5-----N6
| | |
| E1 | E2 |
| | |
N1-----N2-----N3
I'm storing both nodes and elements in linked lists.
My question: How can I find the neighbors (nodes) for an arbitrary selected node?
The neighbors of N5, for example, would be N2, N4, N6 and N8.
*Note: This 2x2 element mesh is a simplified example for explanation purposes; the meshes I'm dealing with may contain several thousand nodes and elements.
I also have been looking at some concepts of graph theory, but I'm not sure which may be the right way to go.
It would be good to have each element's vertices ordered so that they form a closed polygon. The vertices [1, 2, 4, 5] do not uniquely define the first element. From your description it can be seen that you mean a polygon with four vertices in the order (1, 2, 5, 4). But without the picture it could also be the degenerate quad (1, 2, 4, 5).
Like:
Elements:
ID N1 N2 N3 N4
1 1 2 5 4
2 2 3 6 5
3 4 5 8 7
4 5 6 9 8
If you are not sure about the vertex order, then you have to check each element for self-intersection and reorder its vertices to resolve the intersections.
With that kind of data it is easy to find all neighbours of a given node. Pass through all elements; if an element contains the given node, then there are two neighbours in that element: the vertices before and after it in the (cyclic) list.
For node 5, in the first element the neighbours are 2 and 4, in the second element the neighbours are 6 and 2, ...
If there will be a lot of queries of this kind, then it is better to extract the connectivity information into a separate structure. That can be a map from each node to the set of its neighbours. To build it, pass through all elements and, for each vertex of an element, add its two neighbours to that node's set.
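A minimal Python sketch of such a connectivity map, assuming the elements are available as a mapping from element ID to a vertex list in polygon order (the names elements and build_neighbours are made up for illustration):

```python
from collections import defaultdict

# Element ID -> vertices in polygon order (the reordered table above)
elements = {
    1: [1, 2, 5, 4],
    2: [2, 3, 6, 5],
    3: [4, 5, 8, 7],
    4: [5, 6, 9, 8],
}

def build_neighbours(elements):
    """Map each node to the set of nodes it shares an element edge with."""
    neighbours = defaultdict(set)
    for nodes in elements.values():
        n = len(nodes)
        for i, node in enumerate(nodes):
            neighbours[node].add(nodes[i - 1])        # vertex before (wraps around)
            neighbours[node].add(nodes[(i + 1) % n])  # vertex after (wraps around)
    return neighbours

neighbours = build_neighbours(elements)
print(sorted(neighbours[5]))  # [2, 4, 6, 8]
```

Building the map is a single pass over the elements; afterwards each neighbour query is a dictionary lookup, so it should scale to meshes with several thousand nodes.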

Split Data knowing its common ID

I want to split this data,
ID x y
1 2.5 3.5
1 85.1 74.1
2 2.6 3.4
2 86.0 69.8
3 25.8 32.9
3 84.4 68.2
4 2.8 3.2
4 24.1 31.8
4 83.2 67.4
I was able to match the rows with their partners, like this:
ID x y ID x y
1 2.5 3.5 1 85.1 74.1
2 2.6 3.4 2 86.0 69.8
3 25.8 32.9
4 24.1 31.8
However, as you can see, the new rows for ID 4 were placed wrongly, because they were simply appended to the next few rows. I want to split them properly without resorting to the complex logic I am already using. Can someone give me an algorithm or an idea?
It should look like this:
ID x y ID x y ID x y
1 2.5 3.5 1 85.1 74.1 3 25.8 32.9
2 2.6 3.4 2 86.0 69.8 4 24.1 31.8
4 2.8 3.2 3 84.4 68.2
4 83.2 67.4
It seems that your question is really about clustering, and that the ID column has nothing to do with determining which points belong together.
A common algorithm to achieve that would be k-means clustering. However, your question implies that you don't know the number of clusters in advance. This complicates matters, and there have been already a lot of questions asked here on StackOverflow regarding this issue:
Kmeans without knowing the number of clusters?
compute clustersize automatically for kmeans
How do I determine k when using k-means clustering?
How to optimal K in K - Means Algorithm
K-Means Algorithm
Unfortunately, there is no "right" solution for this. Two clusters in one specific problem could be indeed considered as one cluster in another problem. This is why you'll have to decide that for yourself.
Nevertheless, if you're looking for something simple (and probably inaccurate), you can use Euclidean distance as a measure. Compute the distances between points (e.g. using pdist), and group points where the distance falls below a certain threshold.
Example
%// Sample input
A = [1, 2.5, 3.5;
1, 85.1, 74.1;
2, 2.6, 3.4;
2, 86.0, 69.8;
3, 25.8, 32.9;
3, 84.4, 68.2;
4, 2.8, 3.2;
4, 24.1, 31.8;
4, 83.2, 67.4];
%// Cluster points
pairs = nchoosek(1:size(A, 1), 2); %// Rows of pairs
d = sqrt(sum((A(pairs(:, 1), :) - A(pairs(:, 2), :)) .^ 2, 2)); %// d = pdist(A)
thr = d < 10; %// Distances below threshold
kk = 1;
idx = 1:size(A, 1);
C = cell(size(idx)); %// Preallocate memory
while any(idx)
x = unique(pairs(pairs(:, 1) == find(idx, 1) & thr, :));
C{kk} = A(x, :);
idx(x) = 0; %// Remove indices from list
kk = kk + 1;
end
C = C(~cellfun(@isempty, C)); %// Remove empty cells
The result is a cell array C, each cell representing a cluster:
C{1} =
1.0000 2.5000 3.5000
2.0000 2.6000 3.4000
4.0000 2.8000 3.2000
C{2} =
1.0000 85.1000 74.1000
2.0000 86.0000 69.8000
3.0000 84.4000 68.2000
4.0000 83.2000 67.4000
C{3} =
3.0000 25.8000 32.9000
4.0000 24.1000 31.8000
Note that this simple approach has the flaw of restricting the cluster radius to the threshold. However, you wanted a simple solution, so bear in mind that it gets complicated as you add more "clustering logic" to the algorithm.
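For comparison, here is a small Python sketch of threshold-based grouping. It is not a translation of the MATLAB code above: it links any two points closer than the threshold and takes the connected groups (single linkage), and it uses only the x and y columns. The function name threshold_clusters and the threshold of 10 are assumptions for illustration.

```python
from math import dist

# (x, y) pairs from the question, in row order; the ID column is ignored
points = [(2.5, 3.5), (85.1, 74.1), (2.6, 3.4), (86.0, 69.8), (25.8, 32.9),
          (84.4, 68.2), (2.8, 3.2), (24.1, 31.8), (83.2, 67.4)]

def threshold_clusters(points, threshold):
    """Group row indexes whose points are linked by distances below threshold."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative (with path halving)
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) < threshold:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

print(threshold_clusters(points, 10))  # [[0, 2, 6], [1, 3, 5, 8], [4, 7]]
```

With this sample the three groups match the clusters shown above, but the result still depends entirely on the chosen threshold.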

Calculating a boundary around several linked rectangles

I am working on a project where I need to create a boundary around a group of rectangles.
Let's use this picture as an example of what I want to accomplish.
EDIT: Couldn't get the image tag to work properly, so here is the full link:
http://www.flickr.com/photos/21093416@N04/3029621742/
We have rectangles A and C, which are linked by a special link rectangle B. You could think of this as two nodes in a graph (A, C) and the edge between them (B). That means the rectangles have pointers to each other in the following manner: A->B, A<-B->C, C->B
Each rectangle has four vertices stored in an array where index 0 is bottom left, and index 3 is bottom right.
I want to "traverse" this linked structure and calculate the vertices making up the boundary (red line) around it. I already have some small ideas around how to accomplish this, but want to know if some of you more mathematically inclined have some neat tricks up your sleeves.
The reason I post this here is just that someone might have solved a similar problem before and have some ideas I could use. I don't expect anyone to sit down and think this through long and hard. I'm going to work on a solution in parallel as I wait for answers.
Any input is greatly appreciated.
Using the example, where the rectangles are axis-aligned and can therefore each be represented by four values (two x coordinates and two y coordinates):
1 2 3 4 5 6
1 +---+---+
| |
2 + A +---+---+
| | B |
3 + + +---+---+
| | | | |
4 +---+---+---+---+ +
| |
5 + C +
| |
6 +---+---+
1) collect all the x coordinates (both left and right) into a list, then sort it and remove duplicates
1 3 4 5 6
2) collect all the y coordinates (both top and bottom) into a list, then sort it and remove duplicates
1 2 3 4 6
3) create a 2D array sized by the number of gaps between the unique x coordinates * the number of gaps between the unique y coordinates. It only needs one bit per cell, so in C++ a vector<bool> will likely give you a very memory-efficient version of this
4 * 4
4) paint all the rectangles into this grid
1 3 4 5 6
1 +---+
| 1 | 0 0 0
2 +---+---+---+
| 1 | 1 | 1 | 0
3 +---+---+---+---+
| 1 | 1 | 1 | 1 |
4 +---+---+---+---+
0 0 | 1 | 1 |
6 +---+---+
5) for each cell in the grid, for each edge, if the cell beside it in that cardinal direction is not painted, draw the boundary line for that edge
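Steps 1) to 5) can be sketched in Python as follows, assuming each rectangle is given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2. The function name boundary_edges is made up for illustration, and the result is an unordered list of boundary segments, not yet a traversable polygon.

```python
def boundary_edges(rects):
    """Return (grid, edges): the painted grid and the outer boundary segments."""
    xs = sorted({x for x1, _, x2, _ in rects for x in (x1, x2)})  # step 1
    ys = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})  # step 2
    nx, ny = len(xs) - 1, len(ys) - 1
    grid = [[False] * nx for _ in range(ny)]                      # step 3
    for x1, y1, x2, y2 in rects:                                  # step 4
        for j in range(ys.index(y1), ys.index(y2)):
            for i in range(xs.index(x1), xs.index(x2)):
                grid[j][i] = True

    def painted(i, j):
        return 0 <= i < nx and 0 <= j < ny and grid[j][i]

    edges = []
    for j in range(ny):                                           # step 5
        for i in range(nx):
            if not grid[j][i]:
                continue
            if not painted(i - 1, j):  # left side is open
                edges.append(((xs[i], ys[j]), (xs[i], ys[j + 1])))
            if not painted(i + 1, j):  # right side is open
                edges.append(((xs[i + 1], ys[j]), (xs[i + 1], ys[j + 1])))
            if not painted(i, j - 1):  # side at the smaller y is open
                edges.append(((xs[i], ys[j]), (xs[i + 1], ys[j])))
            if not painted(i, j + 1):  # side at the larger y is open
                edges.append(((xs[i], ys[j + 1]), (xs[i + 1], ys[j + 1])))
    return grid, edges

# Rectangles A, B and C from the drawing above
rects = [(1, 1, 3, 4), (3, 2, 5, 4), (4, 3, 6, 6)]
grid, edges = boundary_edges(rects)
length = sum(abs(bx - ax) + abs(by - ay) for (ax, ay), (bx, by) in edges)
print(length)  # 20
```

To obtain the vertices in order, the segments would still have to be chained end to end, and collinear neighbours merged.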
In the question, the rectangles are described as being four vectors where each represents a corner. If each rectangle can be at arbitrary and different rotation from others, then the approach I've outlined above won't work. The problem of finding the path around a complex polygon is regularly solved by vector graphics rasterizers, and a good approach to solving the problem is using a library such as Cairo to do the work for you!
The generalized solution to this problem is to implement boolean operations in terms of a scanline. You can find a brief discussion here to get you started. From the text:
"The basis of the boolean algorithms is scanlines. For the basic principles the book: Computational Geometry an Introduction by Franco P. Preparata and Michael Ian Shamos is very good."
I own this book, though it's at the office now, so I can't look up the page numbers you should read, though chapter 8, on the geometry of rectangles is probably the best starting point.
Calculate the sum of the boundaries of all 3 rectangles separately
Calculate the overlapping rectangle of A and B, and subtract its boundary from the sum
Do the same for the overlapping rectangle of B and C
(to get the overlapping rectangle of A and B, take the middle 2 X positions together with the middle 2 Y positions)
Example (x1,y1) - (x2,y2):
Rectangle A: (1,1) - (3,4)
Rectangle B: (3,2) - (5,4)
Rectangle C: (4,3) - (6,6)
Calculation:
10 + 8 + 10 = 28
X coords ordered = 1,3,3,5 middle two are 3 and 3
Y coords ordered = 1,2,4,4 middle two are 2 and 4
so: (3,2) - (3,4) : boundary = 4
X coords ordered = 3,4,5,6 middle two are 4 and 5
Y coords ordered = 2,3,4,6 middle two are 3 and 4
so: (4,3) - (5,4) : boundary = 4
28 - 4 - 4 = 20
This is my example visualized:
1 2 3 4 5 6
1 +---+---+
| |
2 + A +---+---+
| | B |
3 + + +---+---+
| | | | |
4 +---+---+---+---+ +
| |
5 + C +
| |
6 +---+---+
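The calculation above can be sketched in Python like this. It is only a sketch under the assumptions of the example: rectangles are given as (x1, y1, x2, y2), each rectangle overlaps or touches only the next one in the chain, and the result is the length of the boundary, not its vertices. The function names are made up for illustration.

```python
def perimeter(rect):
    """Boundary length of an axis-aligned rectangle (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    return 2 * ((x2 - x1) + (y2 - y1))

def overlap(a, b):
    """Overlapping rectangle of two touching or overlapping rectangles:
    the middle two X positions together with the middle two Y positions."""
    xs = sorted([a[0], a[2], b[0], b[2]])
    ys = sorted([a[1], a[3], b[1], b[3]])
    return (xs[1], ys[1], xs[2], ys[2])

def chain_boundary_length(rects):
    """Sum of all perimeters minus the perimeters of consecutive overlaps."""
    total = sum(perimeter(r) for r in rects)
    for a, b in zip(rects, rects[1:]):
        total -= perimeter(overlap(a, b))
    return total

A, B, C = (1, 1, 3, 4), (3, 2, 5, 4), (4, 3, 6, 6)
print(overlap(A, B))                     # (3, 2, 3, 4)
print(chain_boundary_length([A, B, C]))  # 20
```

Note that the middle-two trick only yields the real overlap when the two rectangles actually touch or intersect; for disjoint rectangles it would describe the gap between them.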
A simple trick should be:
Create a region from the first rectangle
Add the other rectangles to the region
Get the boundary of the region (somehow? :P)
After some thinking I might end up doing something like this:
Pseudo code:
LinkRectsConnectedTo(Rectangle rectangle, Edge startEdge) // Edge can be West, North, East, South
    for each edge in rectangle, starting with the edge facing the last rectangle
        add the vertices of the edge to the final boundary polygon
        if edge is connected to another rectangle
            if edge not equals startEdge
                recursively call LinkRectsConnectedTo(connected rectangle, edge facing this rectangle)
Obviously this pseudo code would have to be refined a bit and might not cover all cases, but I think I might have solved my own problem.
I haven't thought this out completely, but I wonder if you couldn't do something like:
Make a list of all the edges.
Get all the edges where P1.X = P2.X
In that list, get the pairs where X are equal
For each pair, replace with one or two edges for the parts where they DON'T overlap
Do something clever to get the edges in the right order
Will your rectangles always be horizontally aligned? If not, you'd need to do the same thing for Y too.
And are they always guaranteed to be touching? If not the algorithm wouldn't be broken, but the 'right order' wouldn't be definable.