MPI Fortran order of operations (division) - fortran

I have an MPI code in which I have to loop over matrices.
The data represents a 3D domain, but it's saved in a 1D array. The decomposition is done in the following fashion:
Suppose dims = (/2,2,1/).
Number of datapoints in the I, J, K directions:
NI = 10, NJ=10, NK=10. Then the decomposition would split the domain equally in only the I and J direction:
First domain:
NI = 5 + 2 (2 extra cells for boundary information from neighbor)
NJ = 5 + 2 ( same here )
Second domain:
NI = 2 + 5 (Now the extra boundary cells are in front of the 5 datapoints which represent data belonging to this domain)
NJ = 2 + 5
Now for the loop:
U_loop_K : DO K=SK_P,EK_P
  LKK=LK_P(K)
  U_loop_I : DO I=SI_P,EI_P
    LIK=LKK+LI_P(I)
    U_loop_J : DO J=SJ_P,EJ_P
      INP=LIK+J
      AP(INP)=AE(INP)+AW(INP)+AN(INP)+AS(INP)+AT(INP)+AB(INP)+AP(INP)
      AP(INP)=AP(INP)*URFRS
      SU(INP)=SU(INP)+URFMS*AP(INP)*U(INP)
      APU(INP)=1./(AP(INP)+SMALL)
    END DO U_loop_J
  END DO U_loop_I
END DO U_loop_K
I'm doing a multi-directional domain decomposition. The SK_P and EK_P are the starting and ending indices in the K direction respectively. The I and J are done in a similar manner.
If I decompose the data and execute this code, a small error appears in the APU variable, and only in that variable.
How does this happen? Is there a round-off error I'm not aware of that changes when the do-loops are split up?
I have checked the values of the 'input arrays' so to speak (AE, AW, etc.) and they are all exactly the same with or without MPI. The first and only difference appears at the division: APU(INP)=1./(AP(INP)+SMALL).
The variables are all of the real type.

Related

MPI Matrix Multiplication - Task sharing when size(of procs) < rows of matrix

I am trying to perform matrix-matrix multiplication in MPI using c++.
I have coded the case where number_of_processes = number_of_rows_of_matrix_A (the rows of matrix_A are sent across all processes, matrix_B is broadcast to all processes to perform the subset calculation, and the partial results are sent back to the root process to be accumulated into matrix_C). I have also coded the case where number_of_processes > number_of_rows_of_matrix_A.
I have no idea how to approach the case where number_of_processes < number_of_rows_of_matrix_A.
Let's say I have 4 processes and 8 * 8 matrices matrix_A and matrix_B. I can easily allocate the first 4 rows to the respective ranks, i.e. 0,1,2,3. How should I allocate the remaining rows so that I won't mess up the synchronization of the results I get back from the respective processes?
Side note of my implementation:
I have used only MPI_Recv, MPI_Send for all the coding part which I have done.
Thanks in advance.
Let N be the number of rows and P the number of processes; then process p starts at row floor( p*N/P ). Try it. This gives a beautifully even distribution.
Following the suggestions from people here, I came to the solution below. The number of rows assigned to process j is:
floor(N * (j + 1)/P) - floor(N * j/P)
Where :
N : Number of rows in matrix
P : Total number of processes available
j : jth process. (i.e if P = 4, j = 0,1,2,3)
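For illustration, here is a minimal sketch of that partitioning in Python rather than C++/MPI. The function name row_range is made up for this example, and integer division stands in for floor, which is valid for non-negative values.

```python
def row_range(j, n_rows, n_procs):
    """Return (first_row, row_count) for process j (hypothetical helper).

    first_row = floor(j * N / P)
    row_count = floor(N * (j + 1) / P) - floor(N * j / P)
    """
    start = (j * n_rows) // n_procs
    end = ((j + 1) * n_rows) // n_procs
    return start, end - start


# 8 rows over 4 processes: every rank gets 2 consecutive rows.
for rank in range(4):
    print(rank, row_range(rank, 8, 4))
```

Because each block ends exactly where the next one starts, the root can place the rows received from rank j at offset first_row in matrix_C without any extra bookkeeping.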

2-D plane division

Here is the problem statement:
*Chef is working with lines on a 2-D plane. He knows that every line on a plane can be clearly defined by three coefficients A, B and C: any point (x, y) lies on the line if and only if A * x + B * y + C = 0. Let's call a set of lines to be perfect if there does not exist a point that belongs to two or more distinct lines of the set. He has a set of lines on a plane and he wants to find out the size of the largest perfect subset of this set.
Input
The first line of input contains one integer T denoting the number of test cases. Each test case consists of one integer N denoting the number of lines. The next N lines contain 3 space-separated integers each, denoting coefficients A, B and C respectively.
Output
For each test case output the cardinality of the largest perfect subset in a single line.
Constraints
Input:
1
5
1 1 0
1 2 3
3 4 5
30 40 0
30 40 50
Output:
2
Explanation
Lines 3*x + 4*y + 5 = 0 and 30*x + 40*y + 0 = 0 form a biggest perfect subset.*
So if the ratios of the As and Bs are the same, the lines are parallel, which satisfies the problem statement. For example: if A[1] / B[1] == A[2] / B[2] then line one and line two are parallel. But this equation also holds when the two lines in question are the same line, which means they have an infinite number of common points, and that is not what the problem wants. So we need to use C to determine whether the lines are the same or not (i.e. A[1]/A[2] == B[1]/B[2] == C[1]/C[2]). But the code I wrote with these ideas is very inefficient. Can you suggest a more time-efficient solution?
You can write a linear algorithm for this.
The idea is to have a map, where the key is a direction and the value is a set.
For each direction, the set contains only the distinct lines that have the given direction. The answer is then the size of the largest set.
The direction of a line Ax + By + C = 0 is A/B. The problem is that if B=0 it won't quite work as a key.
You can have a special set for the case B=0, which you keep separate and don't insert into the map.
The values that you insert into the set for a given line Ax + By + C = 0, should be C/B.
In the special case, when B = 0, you should use C/A.
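A minimal Python sketch of this map-of-sets idea follows. It uses exact rationals (fractions.Fraction) instead of floating-point division so that equal ratios compare equal; that choice, the function name largest_perfect_subset, and the "vertical" key are assumptions of this example, not part of the answer above. It also assumes A and B are never both zero.

```python
from collections import defaultdict
from fractions import Fraction


def largest_perfect_subset(lines):
    """lines: iterable of (A, B, C) integer triples (hypothetical helper)."""
    groups = defaultdict(set)  # direction -> set of distinct offsets
    for a, b, c in lines:
        if b != 0:
            key = Fraction(a, b)      # direction A/B
            offset = Fraction(c, b)   # distinguishes parallel lines: C/B
        else:
            key = "vertical"          # special bucket for B = 0
            offset = Fraction(c, a)   # C/A
        groups[key].add(offset)
    # The largest group of distinct parallel lines is the answer.
    return max((len(s) for s in groups.values()), default=0)


sample = [(1, 1, 0), (1, 2, 3), (3, 4, 5), (30, 40, 0), (30, 40, 50)]
print(largest_perfect_subset(sample))  # prints 2
```

On the sample input, 3x + 4y + 5 = 0 and 30x + 40y + 50 = 0 reduce to the same (direction, offset) pair, so they count once, and 30x + 40y + 0 = 0 adds a second distinct line in that direction.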

Issue when generate random vectors with limits on matlab

I have a problem: I want to generate a table with 4 columns and 1 row, filled with integers in the range 0 to 9, without repeats and random each time it is run.
I got as far as the code below, but it always generates a 0 in the first element, and I don't know how to limit the values to 0-9.
Can anyone help me?
Code of Function:
function [ n ] = generar( )
    n = [-1 -1 -1 -1];
    for i = 1:4
        r=abs(i);
        dig=floor((r-floor(r))*randn);
        while find (n == dig)
            r=r+1;
            dig=dig+floor(r-randn);
        end
        n(i)=dig;
    end
end
And the results:
generar()
ans =
0 3 9 6
generar()
ans =
0 2 4 8
I don't know if this post is a duplicate, but I need help with my specific problem.
So assuming you want matlab, because the code you supplied is matlab, you can simply do this:
randperm(10, 4) - 1
This will give you 4 unique random numbers from 0-9.
Another way of getting there is randsample(n, k): when n is an integer, a random sample of size k is drawn from the population 1:n (as a column vector). So for your case, you would get the result by:
randsample(10, 4)' - 1
It draws 4 random numbers from the population without replacement and all with the same weights. This might be slower than randperm(10, 4) - 1, as its real strength is the ability to pass in population vectors for more sophisticated examples.
Alternatively, one can call it as randsample(pop, k), where pop is the population vector from which you want to draw a random sample of size k. So for your case, one would do:
randsample(0:9, 4)
The result will have the same singleton dimension as the population-vector, which in this case is a row vector.
Just to offer another solution and get you in touch with randsample().

To make an array non-decreasing using dynamic programing

I came across this question in a programming contest. I think it can be solved by DP but cannot think of a solution, so please help. Here's the question:
There are n stacks of coins placed linearly, labelled from 1 to n. You also have a sack containing infinitely many coins. All the coins in the stacks and the sack are identical. All you have to do is make the heights of the stacks non-decreasing.
You select two stacks i and j and place one coin on each of the stacks from stack i to stack j (inclusive). This complete operation is considered one move. You have to minimize the number of moves needed to make the heights non-decreasing.
No. of Test Cases < 50
1 <= n <= 10^5
0 <= hi <= 10^9
Input Specification :
There will be a number of test cases. Read till EOF. First line of each test case will contain a single integer n, second line contains n heights (h[i]) of stacks.
Output Specification :
Output single integer denoting the number of moves for each test case.
for eg: H={3,2,1}
answer is 2
step1: i=2, j=3, H = {3,3,2}
step2: i=3, j=3, H = {3,3,3}
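No answer was posted for this one, so the following is only a sketch under a stated assumption: that the minimum number of moves equals the sum of all drops h[i] - h[i+1] between adjacent stacks. The reasoning is that the gap h[i+1] - h[i] grows only when a move starts exactly at stack i+1, so each drop of size d needs at least d moves, and moves of the form (i+1, n) achieve that bound. The function name min_moves is made up for this example.

```python
def min_moves(heights):
    """Sum of adjacent drops; assumed to be the minimum number of moves."""
    total = 0
    for left, right in zip(heights, heights[1:]):
        if left > right:
            total += left - right  # each unit of drop costs one move
    return total


print(min_moves([3, 2, 1]))  # prints 2, matching the example above
```

This runs in O(n) per test case, which fits n up to 10^5 comfortably, and no DP table is needed.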

All possible combinations of length 8 in a 2d array

I've been trying to solve a combinations problem. I have a 6x6 matrix and I'm trying to find all combinations of length 8 in the matrix.
I have to move from neighbor to neighbor, starting from each row,column position. I wrote a recursive program which generates the combinations, but the problem is that it generates a lot of duplicates as well and is therefore inefficient. I would like to know how I could avoid calculating duplicates and save time.
#include <stdio.h>

int a[6][6]={{1,2,3,4,5,6},
             {8,9,1,2,3,4},
             {5,6,7,8,9,1},
             {2,3,4,5,6,7},
             {8,9,1,2,3,4},
             {5,6,7,8,9,1},
            };

/* Extends the sequence in combi by the digit at (row,col), then recurses
   into every in-bounds neighbor; prints combi once 8 digits are collected. */
void genSeq(int row,int col,int length,int combi)
{
    if(length==8)
    {
        printf("%d\n",combi);
        return;
    }
    combi = (combi * 10) + a[row][col];
    if((row-1)>=0)
        genSeq(row-1,col,length+1,combi);
    if((col-1)>=0)
        genSeq(row,col-1,length+1,combi);
    if((row+1)<6)
        genSeq(row+1,col,length+1,combi);
    if((col+1)<6)
        genSeq(row,col+1,length+1,combi);
    if((row+1)<6&&(col+1)<6)
        genSeq(row+1,col+1,length+1,combi);
    if((row-1)>=0&&(col+1)<6)
        genSeq(row-1,col+1,length+1,combi);
    if((row+1)<6&&(col-1)>=0)
        genSeq(row+1,col-1,length+1,combi);
    if((row-1)>=0&&(col-1)>=0)
        genSeq(row-1,col-1,length+1,combi);
}
I was also thinking of writing a dynamic program, basically recursion with memoization. Is that a better choice? If yes, then I'm not clear on how to implement it in the recursion. Have I really hit a dead end with this approach?
Thank you
Edit
Example result:
12121212,12121218,12121219,12121211,12121213.
The restrictions are that you have to move to a neighbor from any point, and you have to start from each position in the matrix, i.e. each row,col. You can move one step at a time, i.e. right, left, up, down and along both diagonals. Check the if conditions.
i.e.
if you're in (0,0) you can move to either (1,0) or (1,1) or (0,1), i.e. three neighbors.
if you're in (2,2) you can move to eight neighbors.
and so on...
To eliminate duplicates you can convert the 8-digit sequences into 8-digit integers and put them in a hashtable.
Memoization might be a good idea. You can memoize for each cell in the matrix all possible combinations of length 2-7 that can be achieved from it. Going backwards: first generate for each cell all sequences of 2 digits. Then based on that of 3 digits etc.
UPDATE: code in Python
# original matrix
lst = [
    [1,2,3,4,5,6],
    [8,9,1,2,3,4],
    [5,6,7,8,9,1],
    [2,3,4,5,6,7],
    [8,9,1,2,3,4],
    [5,6,7,8,9,1]]
# working matrix; wrk[i][j] contains the set of all possible paths of the current length which start at lst[i][j]
wrk = [[set() for i in range(6)] for j in range(6)]
# for the first (0th) iteration initialize with single step paths
for i in range(0, 6):
    for j in range(0, 6):
        wrk[i][j].add(lst[i][j])
# run iterations 1 through 7
for k in range(1,8):
    # create new empty wrk matrix for the next iteration
    nw = [[set() for i in range(6)] for j in range(6)]
    for i in range(0, 6):
        for j in range(0, 6):
            # the next gen. wrk[i][j] is going to be based on the current wrk paths of its neighbors
            ns = set()
            if i > 0:
                for p in wrk[i-1][j]:
                    ns.add(10**k * lst[i][j] + p)
            if i < 5:
                for p in wrk[i+1][j]:
                    ns.add(10**k * lst[i][j] + p)
            if j > 0:
                for p in wrk[i][j-1]:
                    ns.add(10**k * lst[i][j] + p)
            if j < 5:
                for p in wrk[i][j+1]:
                    ns.add(10**k * lst[i][j] + p)
            nw[i][j] = ns
    wrk = nw
# now build final set to eliminate duplicates
result = set()
for i in range(0, 6):
    for j in range(0, 6):
        result |= wrk[i][j]
print(len(result))
print(result)
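Note that the Python code above only steps to the four orthogonal neighbors, while the question also allows the four diagonal moves. Below is a hedged sketch of the same set-per-cell iteration with diagonals included. The function name sequences and its parameters are made up for this example, and it assumes every cell holds a single non-zero digit so that each path maps to a unique integer.

```python
def sequences(grid, length, diagonals=True):
    """Return the set of distinct digit sequences (as ints) of the given
    length that can be read by stepping between neighboring cells."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonals:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    # paths[i][j]: all sequences of the current length that start at (i, j)
    paths = [[{grid[i][j]} for j in range(cols)] for i in range(rows)]
    for k in range(1, length):
        nxt = [[set() for _ in range(cols)] for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                lead = grid[i][j] * 10**k  # this cell becomes the leading digit
                for di, dj in steps:
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        nxt[i][j].update(lead + p for p in paths[ni][nj])
        paths = nxt
    # Merge the per-cell sets; duplicates collapse automatically.
    return set().union(*(cell for row in paths for cell in row))


print(sorted(sequences([[1, 2], [3, 4]], 2)))
```

Calling sequences(lst, 8) on the 6x6 matrix should give the full answer with diagonals; passing diagonals=False restricts it to the four orthogonal moves.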
There are LOTS of ways to do this. Going through every combination is a perfectly reasonable first approach. It all depends on your requirements. If your matrix is small, and this operation isn't time sensitive, then there's no problem.
I'm not really an algorithms guy, but I'm sure there are really clever ways of doing this that someone will post after me.
Also, in Java when using CamelCase, method names should start with a lowercase character.
int a[6][6]={{1,2,3,4,5,6},
             {8,9,1,2,3,4},
             {5,6,7,8,9,1},
             {2,3,4,5,6,7},
             {8,9,1,2,3,4},
             {5,6,7,8,9,1},
            };
By length, do you mean combinations of matrix elements that sum to 8, i.e. elements that sum to 8 within a row and together with elements of the other rows? From row 1 = { {2,6}, {3,5}, }, then row 1 elements with row 2, and so on. Is that what you are expecting?
You can treat your matrix as a one-dimensional array; the layout doesn't matter here ("place" the rows one after another). For a one-dimensional array you can write a function like this (assuming you should print the combinations):
f(i, n) prints all combinations of length n using elements a[i] ... a[last].
It should skip k elements, a[i] to a[i + k - 1] (for all possible k), print a[i + k] and make a recursive call f(i + k + 1, n - 1).
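The recursive scheme f(i, n) described above could be sketched in Python as follows. Instead of printing, this version collects the combinations in a list so they can be inspected. The function name combos and the prefix argument are additions for this example.

```python
def combos(a, n, i=0, prefix=()):
    """All length-n combinations of a[i:], each preceded by prefix."""
    if n == 0:
        return [prefix]
    found = []
    # Skip k elements, take a[i + k], then recurse on what follows it.
    for k in range(len(a) - i - n + 1):
        found += combos(a, n - 1, i + k + 1, prefix + (a[i + k],))
    return found


print(combos([1, 2, 3], 2))  # prints [(1, 2), (1, 3), (2, 3)]
```

Note that this enumerates ordinary combinations of array elements and ignores the neighbor-to-neighbor restriction from the question.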