I have a 2D array, and each process runs some computation on a subset of its columns. Afterwards, I need to gather all the computed columns back on the root process. I'm currently partitioning the columns in a round-robin (cyclic) manner. In pseudocode, the main loop looks like:
DO i = mpi_rank + 1, num_columns, mpi_size
array(:,i) = do work here
END DO
After this is completed, I need to gather these columns into the correct indices on the root process. What is the best way to do this? It looks like MPI_GATHERV could do what I want if the partitioning scheme were different. However, I'm not sure what the best way to partition would be, since num_columns is not necessarily divisible by mpi_size.
I suggest the following approach:
Cut the 2D array into chunks of "almost equal" size, i.e. with local number of columns close to num_columns / mpi_size.
Gather chunks with mpi_gatherv, which operates with chunks of different size.
To get an "almost equal" number of columns, set the local number of columns to the integer quotient num_columns / mpi_size and add one only on the first mod(num_columns,mpi_size) MPI tasks.
The following table demonstrates the partitioning of a (10,12) matrix across 5 MPI processes:
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
Here the first digit is the id of the process, and the second digit is the index of the column within that process's local chunk.
As you can see, processes 0 and 1 got 3 columns each, while all other processes got only 2 columns each.
Below you can find working example code that I wrote.
The trickiest part is generating the rcounts and displs arrays for MPI_Gatherv. The table above is the output of this code.
program mpi2d
implicit none
include 'mpif.h'
integer myid, nprocs, ierr
integer,parameter:: m = 10 ! global number of rows
integer,parameter:: n = 12 ! global number of columns
integer nloc ! local number of columns
integer array(m,n) ! global m-by-n, i.e. m rows and n columns
integer,allocatable:: loc(:,:) ! local piece of global 2d array
integer,allocatable:: rcounts(:) ! array of nloc's (for mpi_gatherv)
integer,allocatable:: displs(:) ! array of displacements (for mpi_gatherv)
integer i,j
! Initialize
call mpi_init(ierr)
call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr)
call mpi_comm_size(MPI_COMM_WORLD, nprocs, ierr)
! Partition, i.e. get local number of columns
nloc = n / nprocs
if (mod(n,nprocs)>myid) nloc = nloc + 1
! Compute partitioned array
allocate(loc(m,nloc))
do j=1,nloc
loc(:,j) = myid*10 + j
enddo
! Build arrays for mpi_gatherv:
! rcounts contains all nloc's
! displs contains displacements of partitions in terms of columns
allocate(rcounts(nprocs),displs(nprocs))
displs(1) = 0
do j=1,nprocs
rcounts(j) = n / nprocs
if(mod(n,nprocs).gt.(j-1)) rcounts(j)=rcounts(j)+1
if((j-1).ne.0)displs(j) = displs(j-1) + rcounts(j-1)
enddo
! Convert from number of columns to number of integers
nloc = m * nloc
rcounts = m * rcounts
displs = m * displs
! Gather array on root
call mpi_gatherv(loc,nloc,MPI_INTEGER,array,
& rcounts,displs,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
! Print array on root
if(myid==0)then
do i=1,m
do j=1,n
write(*,'(I04.2)',advance='no') array(i,j)
enddo
write(*,*)
enddo
endif
! Finish
call mpi_finalize(ierr)
end
What about gathering in chunks of size mpi_size?
To keep this short, I'll assume that num_columns is a multiple of mpi_size. In your case the gathering should look something like this (lda is the first dimension of array):
DO i = 1, num_columns/mpi_size
IF (rank == 0) THEN
CALL MPI_GATHER(MPI_IN_PLACE, lda, [TYPE], array(1,(i-1)*mpi_size+1), lda, [TYPE], 0, MPI_COMM_WORLD, ierr)
ELSE
CALL MPI_GATHER(array(1, rank + (i-1)*mpi_size + 1), lda, [TYPE], array(1,(i-1)*mpi_size+1), lda, [TYPE], 0, MPI_COMM_WORLD, ierr)
END IF
ENDDO
I'm not entirely sure about the indices or whether this works exactly as written, but you should get the idea.
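The index arithmetic can be checked without MPI. The sketch below (Python, 0-based indices, hypothetical function name) simulates the rounds of gathers for the cyclic distribution from the question. It assumes, as above, that num_columns is a multiple of mpi_size, and that a gather places the blocks in the receive buffer in rank order.

```python
def simulate_round_gather(num_columns, mpi_size):
    """Return, for each column slot on the root, the global column received.

    In round i every rank contributes one column. With the cyclic loop
    from the question, rank r owns global column i * mpi_size + r.
    """
    assert num_columns % mpi_size == 0, "sketch assumes an even split"
    received = [None] * num_columns
    for i in range(num_columns // mpi_size):
        recv_start = i * mpi_size  # first column of the root's receive buffer
        for rank in range(mpi_size):
            sent_column = rank + i * mpi_size  # column sent by `rank` this round
            received[recv_start + rank] = sent_column  # blocks arrive in rank order
    return received


# Every column ends up at its own index on the root.
print(simulate_round_gather(12, 4) == list(range(12)))  # True
```

Any leftover columns when num_columns is not a multiple of mpi_size would need separate handling, which this sketch does not cover.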
Related
I have integer numbers in a text file to be assigned to an array C[5][100].
My data is in this format:
17 40 35 24 50 15 31 38 48 18 16 44
41 10 26 50 48 20 24 12 48 24 34 39
...............
I am trying the code below but the error I get is this:
ValueError: cannot copy sequence with size 1005 to array axis with dimension 100
text_file = open("c051001.txt", "r")
C=np.zeros((5,100))
for i in range(agent):
C[i,]=map(int, (value for value in text_file.read().split()))
The file contains more than 500 integers, and I want to assign the remaining numbers to another array.
You need to divide the data into appropriate chunks. A simple way to do this could be:
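import numpy as np

# Read all whitespace-separated tokens from the file once
with open("c051001.txt", "r") as text_file:
    data = text_file.read().split()

agent = 5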
resource = 1
sz = 100
C = np.zeros((agent, sz))
idx = 0
chunk = sz
for i in range(agent):
C[i, ] = list(map(int, data[idx:idx + chunk]))
idx += chunk
# Assign the following 500 integers into another array of A[5,100,1]
A = np.zeros((agent, sz, resource))
for k in range(resource):
for i in range(agent):
A[i, :, k] = list(map(int, data[idx:idx + chunk]))
idx += chunk
trailing_data = data[idx:]
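If the layout is regular, the same split can probably also be written with slicing and reshape instead of explicit loops. This is only a sketch: synthetic data stands in for the file contents (1005 tokens, matching the size in the error message), and it assumes resource = 1, so that the reshape fills A in the same order as the loops above.

```python
import numpy as np

agent, sz, resource = 5, 100, 1

# Synthetic stand-in for the tokens read from the file (assumption: 1005 integers).
data = [str(v) for v in range(1005)]
values = np.array(list(map(int, data)))

n_c = agent * sz             # number of values that go into C
n_a = agent * sz * resource  # number of values that go into A

C = values[:n_c].reshape(agent, sz)
A = values[n_c:n_c + n_a].reshape(agent, sz, resource)
trailing_data = values[n_c + n_a:]

print(C.shape, A.shape, trailing_data.shape)
```

Note that this version produces integer arrays, whereas np.zeros in the loop version creates float arrays.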
Each processor has its own unique matrix, A, of size Nx2, where N varies with the processor. I want to collect all these matrices into one single buff matrix of size (NxP)x2, where P is the number of processors.
Size wise in Fortran they are allocated like,
A(N,2)
buff(N*P,2)
As an example, let P = 2 and the A matrices for each processor be,
for Proc-1
10 11
10 11
for Proc-2
20 21
20 21
To this end I use MPI_GATHERV and save the individual matrices in the buff matrix. If I do this then buff will look like this,
10 20
10 20
11 21
11 21
But what I want is the matrix to look like this,
10 11
10 11
20 21
20 21
In memory (column-major), I think the buff I want would be: |10 , 10 , 20 , 20 , 11 , 11 , 21 , 21|
Sample code is below,
...
! size = 2
root = 0
ALLOCATE ( count(size), num(size) )
! -----------------------------------------------------------
! Mock data
! -----------------------------------------------------------
IF(rank.eq.0) THEN
m = 2
mm = m*2
allocate(A(m,2))
A(1,1) = 10
A(1,2) = 11
A(2,1) = 10
A(2,2) = 11
ELSE
m = 2
mm = m*2
allocate(A(m,2))
A(1,1) = 20
A(1,2) = 21
A(2,1) = 20
A(2,2) = 21
END IF
! -----------------------------------------------------------
! send number of elements
! -----------------------------------------------------------
CALL MPI_GATHER(mm,1,MPI_INTEGER,count,1,MPI_INTEGER,root,cworld,ierr)
! -----------------------------------------------------------
! Figure out displacement vector needed for gatherv
! -----------------------------------------------------------
if(rank.eq.0) THEN
ALLOCATE (buff(SUM(count)/2,2), disp(size), rdisp(size))
rdisp = count
disp(1) = 0
DO i = 2,size
disp(i) = disp(i-1) + count(i-1)
END DO
END IF
! -----------------------------------------------------------
! Rank-0 gathers msg
! -----------------------------------------------------------
CALL MPI_GATHERV(A,mm,MPI_INTEGER,buff,rdisp,disp,MPI_INTEGER,root,cworld,ierr)
! -----------------------------------------------------------
! Print buff
! -----------------------------------------------------------
if(rank.eq.0) THEN
DO i = 1,sum(count)/2
print*, buff(i,:)
end do
END IF
I have looked at Using Gatherv for 2d Arrays in Fortran but am a little confused with the explanation.
I’m not very familiar with the MPI details, but is there a "simple" way to gather all the matrices and place them in the correct memory position in buff?
**** Edit ****
Following what Gilles Gouaillardet suggested, I'm trying to figure out how to do that.
The derived type for sending the rows should look something like this (I think),
CALL MPI_TYPE_vector(2,1,2,MPI_INTEGER,MPI_ROWS,ierr)
CALL MPI_TYPE_COMMIT(MPI_ROWS,ierr)
Then I extend,
call MPI_Type_size(MPI_INTEGER, msg_size, ierr)
lb = 0
extent = 2*msg_size
call MPI_Type_create_resized(MPI_ROWS, lb, extent , MPI_ROWS_extend, ierr)
CALL MPI_TYPE_COMMIT(MPI_ROWS_extend,ierr)
I’m trying to understand why I need the second derived type for receiving. I’m not sure what that one should look like.
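One way to see why a plain gather interleaves the columns: Fortran stores arrays column by column, and a gather with a basic datatype simply concatenates each rank's memory. The sketch below simulates this without MPI, in Python, using the two mock matrices from the example. The helper names are made up for illustration. It also shows that gathering one column at a time produces the wanted layout. That is one possible alternative to derived types, not necessarily the approach that was suggested.

```python
def to_column_major(matrix):
    """Flatten a list-of-rows matrix the way Fortran stores it in memory."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[i][j] for j in range(cols) for i in range(rows)]


def from_column_major(flat, rows, cols):
    """Rebuild a list-of-rows matrix from Fortran-ordered memory."""
    return [[flat[j * rows + i] for j in range(cols)] for i in range(rows)]


a_rank0 = [[10, 11], [10, 11]]
a_rank1 = [[20, 21], [20, 21]]
local_matrices = [a_rank0, a_rank1]
total_rows = sum(len(a) for a in local_matrices)

# Plain gather: each rank's whole matrix is appended to the receive buffer.
memory = []
for a in local_matrices:
    memory += to_column_major(a)
plain = from_column_major(memory, total_rows, 2)

# Column-wise gather: one gather per column, each filling one column of buff.
memory = []
for col in range(2):
    for a in local_matrices:
        memory += [row[col] for row in a]
wanted = from_column_major(memory, total_rows, 2)

print(plain)   # [[10, 20], [10, 20], [11, 21], [11, 21]]
print(wanted)  # [[10, 11], [10, 11], [20, 21], [20, 21]]
```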
I have created a Fortran array, say
real, dimension(4, 4) :: A
Being a matrix
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
And I want to pass it to a subroutine in form
call MySoubroutine(A(2,2))
And inside my subroutine get this array and modify some of its elements
real, dimension(:), intent(inout) :: A
A(1,1) = 91
A(1, 2) = 92
A(2, 1) = 93
A(2, 2) = 94
So after calling the function in my main program the array A is
1 2 3 4
5 91 92 8
9 93 94 12
13 14 15 16
What is the best and most efficient way to achieve this behaviour?
In detail my questions are:
Is there a better way of using a subarray inside the subroutine?
How shall I declare the array in the subroutine? I want just to pass a pointer to the first element, so may not know the dimension of the subarray.
In my Fortran 90 code, I have created the following array (called array) of integers:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
I wish to extract the first column, and save it in a four-element vector called time. I have the following code:
PROGRAM test
IMPLICIT NONE
INTEGER, PARAMETER :: numrows=4, numcols=10
INTEGER :: i, j, k
INTEGER, DIMENSION(:,:), ALLOCATABLE :: array, time
ALLOCATE(array(numrows,numcols))
ALLOCATE(time(numrows))
k=1
DO i=1,numrows
DO j=1,numcols
array(i,j)=k
k=k+1
END DO
END DO
DO i=1,numrows
WRITE(*,"(100(3X,I3))") (array(i,j), j=1,numcols)
END DO
time=array(:,1)
END PROGRAM test
But, I get the following error message (when compiling in gfortran):
test.f90:8.15:
ALLOCATE(time(numrows))
1
Error: Rank mismatch in array reference at (1) (1/2)
test.f90:22.2:
time=array(:,1)
1
Error: Incompatible ranks 2 and 1 in assignment at (1)
Why is this the case? The error message seems to suggest that the array array(:,1) is of rank 2, not rank 1. Is there any way that I can convert array(:,1) to an array of rank 1? Do I need to use RESHAPE to somehow squeeze the array? Or is the problem that by using array(:,1), I am specifying a column vector rather than a row vector? Thank you very much for your time.
You are specifying a rank-2 allocatable array called time:
INTEGER, DIMENSION(:,:), ALLOCATABLE :: array, time
and then attempting to allocate it as a rank-1 array:
ALLOCATE(time(numrows))
Don't do that: declare time as a rank-1 allocatable array instead. This works perfectly fine:
PROGRAM test
IMPLICIT NONE
INTEGER, PARAMETER :: numrows=4, numcols=10
INTEGER :: i, j, k
INTEGER, DIMENSION(:,:), ALLOCATABLE :: array
INTEGER, DIMENSION(:), ALLOCATABLE :: time
ALLOCATE(array(numrows,numcols))
ALLOCATE(time(numrows))
k=1
DO i=1,numrows
DO j=1,numcols
array(i,j)=k
k=k+1
END DO
END DO
DO i=1,numrows
WRITE(*,"(100(3X,I3))") (array(i,j), j=1,numcols)
END DO
time=array(:,1)
END PROGRAM test
Is there an intrinsic in Fortran that generates an array containing a sequence of numbers from a to b, similar to Python's range()?
>>> range(1,5)
[1, 2, 3, 4]
>>> range(6,10)
[6, 7, 8, 9]
No, there isn't.
You can, however, initialize an array with a constructor that does the same thing,
program arraycons
implicit none
integer :: i
real :: a(10) = (/(i, i=2,20, 2)/)
print *, a
end program arraycons
If you need to support floats, here is a Fortran subroutine similar to linspace in NumPy and MATLAB.
! Generates evenly spaced numbers from `from` to `to` (inclusive).
!
! Inputs:
! -------
!
! from, to : the lower and upper boundaries of the numbers to generate
!
! Outputs:
! -------
!
! array : Array of evenly spaced numbers
!
subroutine linspace(from, to, array)
real(dp), intent(in) :: from, to
real(dp), intent(out) :: array(:)
real(dp) :: range
integer :: n, i
n = size(array)
range = to - from
if (n == 0) return
if (n == 1) then
array(1) = from
return
end if
do i=1, n
array(i) = from + range * (i - 1) / (n - 1)
end do
end subroutine
Usage:
real(dp) :: array(5)
call linspace(from=0._dp, to=1._dp, array=array)
Outputs the array
[0., 0.25, 0.5, 0.75, 1.]
Here dp is
integer, parameter :: dp = selected_real_kind(p = 15, r = 307) ! Double precision
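For comparison, here is the same formula in Python. This is a sketch that mirrors the Fortran subroutine above (with 0-based indexing), not NumPy's implementation.

```python
def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop (inclusive)."""
    if n == 0:
        return []
    if n == 1:
        return [start]
    span = stop - start
    # Same formula as the Fortran loop, with i running from 0 to n - 1.
    return [start + span * i / (n - 1) for i in range(n)]


print(linspace(0.0, 1.0, 5))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```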
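It is possible to create a function that reproduces most of the functionality of range in Python. Note that, unlike Python's range, this version includes the upper bound n2 and returns an empty array for a non-positive step: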
module mod_python_utils
implicit none
contains
pure function range(n1,n2,dn_)
integer, intent(in) :: n1,n2
integer, optional, intent(in) :: dn_
integer, allocatable :: range(:)
integer :: dn, i
dn=1; if(present(dn_))dn=dn_
if(dn<=0)then
allocate(range(0))
else
allocate(range(1+(n2-n1)/dn))
range=[(i,i=n1,n2,dn)]
endif
end function range
end module mod_python_utils
program testRange
use mod_python_utils
implicit none
integer, allocatable :: v(:)
v=range(51,70)
print"(*(i0,x))",v
v=range(-3,30,2)
print"(*(i0,x))",v
print"(*(i0,x))",range(1,100,3)
print"(*(i0,x))",range(1,100,-3)
end program testRange
The output of the above is
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
-3 -1 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100
Notice that:
the last line of output (from range(1,100,-3)) is empty: Fortran handles zero-length arrays gracefully.
allocatable variables are automatically deallocated once they go out of scope.
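To make the difference from Python's built-in range explicit, the Fortran function's semantics can be mirrored in Python as follows. This is a sketch, and inclusive_range is a hypothetical name.

```python
def inclusive_range(n1, n2, dn=1):
    """Mimic the Fortran function: n2 is included, non-positive steps give []."""
    if dn <= 0:
        return []
    # Python's range excludes the stop value, so shift it by one.
    return list(range(n1, n2 + 1, dn))


print(inclusive_range(51, 70) == list(range(51, 71)))  # True
print(inclusive_range(1, 100, -3))                     # []
```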