Related
PROGRAM ShareNeighbors
IMPLICIT REAL (a-h,o-z)
INCLUDE "mpif.h"
PARAMETER (m = 500, n = 500)
DIMENSION a(m,n), b(m,n)
DIMENSION h(m,n)
INTEGER istatus(MPI_STATUS_SIZE)
INTEGER iprocs, jprocs
PARAMETER (ROOT = 0)
integer dims(2),coords(2)
logical periods(2)
data periods/2*.false./
integer status(MPI_STATUS_SIZE)
integer comm2d,req,source
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
! Get a new communicator for a decomposition of the domain.
! Let MPI find a "good" decomposition
dims(1) = 0
dims(2) = 0
CALL MPI_DIMS_CREATE(nprocs,2,dims,ierr)
if (myrank.EQ.Root) then
print *,nprocs,'processors have been arranged into',dims(1),'X',dims(2),'grid'
endif
CALL MPI_CART_CREATE(MPI_COMM_WORLD,2,dims,periods,.true., &
comm2d,ierr)
! Get my position in this communicator
CALL MPI_COMM_RANK(comm2d,myrank,ierr)
! Get the decomposition
CALL fnd2ddecomp(comm2d,m,n,ista,iend,jsta,jend)
! print *,ista,jsta,iend,jend
ilen = iend - ista + 1
jlen = jend - jsta + 1
CALL MPI_Cart_get(comm2d,2,dims,periods,coords,ierr)
iprocs = dims(1)
jprocs = dims(2)
myranki = coords(1)
myrankj = coords(2)
DO j = jsta, jend
DO i = ista, iend
a(i,j) = myrank+1
ENDDO
ENDDO
! Send data from each processor to Root
call MPI_ISEND(ista,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(iend,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(jsta,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(jend,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(a(ista:iend,jsta:jend),(ilen)*(jlen),MPI_REAL, &
Root,1,MPI_COMM_WORLD,req,ierr )
! Receive the results from the other processors
if (myrank.EQ.Root) then
do source = 0,nprocs-1
call MPI_RECV(ista,1,MPI_INTEGER,source, &
1,MPI_COMM_WORLD,status,ierr )
call MPI_RECV(iend,1,MPI_INTEGER,source, &
1,MPI_COMM_WORLD,status,ierr )
call MPI_RECV(jsta,1,MPI_INTEGER,source, &
1,MPI_COMM_WORLD,status,ierr )
call MPI_RECV(jend,1,MPI_INTEGER,source, &
1,MPI_COMM_WORLD,status,ierr )
ilen = iend - ista + 1
jlen = jend - jsta + 1
call MPI_RECV(a(ista:iend,jsta:jend),(ilen)*(jlen),MPI_REAL, &
source,1,MPI_COMM_WORLD,status,ierr)
! print the results
call ZMINMAX(m,n,ista,iend,jsta,jend,a(:,:),amin,amax)
print *, 'myid=',source,amin,amax
call MPI_Wait(req, status, ierr)
enddo
endif
CALL MPI_FINALIZE(ierr)
END
subroutine fnd2ddecomp(comm2d,m,n,ista,iend,jsta,jend)
integer comm2d
integer m,n,ista,jsta,iend,jend
integer dims(2),coords(2),ierr
logical periods(2)
! Get (i,j) position of a processor from Cartesian topology.
CALL MPI_Cart_get(comm2d,2,dims,periods,coords,ierr)
! Decomposition in first (ie. X) direction
CALL MPE_DECOMP1D(m,dims(1),coords(1),ista,iend)
! Decomposition in second (ie. Y) direction
CALL MPE_DECOMP1D(n,dims(2),coords(2),jsta,jend)
return
end
SUBROUTINE MPE_DECOMP1D(n,numprocs,myid,s,e)
integer n,numprocs,myid,s,e,nlocal,deficit
nlocal = n / numprocs
s = myid * nlocal + 1
deficit = mod(n,numprocs)
s = s + min(myid,deficit)
! Give one more slice to processors
if (myid .lt. deficit) then
nlocal = nlocal + 1
endif
e = s + nlocal - 1
if (e .gt. n .or. myid .eq. numprocs-1) e = n
return
end
SUBROUTINE ZMINMAX(IX,JX,SX,EX,SY,EY,ZX,ZXMIN,ZXMAX)
INTEGER :: IX,JX,SX,EX,SY,EY
REAL :: ZX(IX,JX)
REAL :: ZXMIN,ZXMAX
ZXMIN=1000.
ZXMAX=-1000.
DO II=SX,EX
DO JJ=SY,EY
IF(ZX(II,JJ).LT.ZXMIN)ZXMIN=ZX(II,JJ)
IF(ZX(II,JJ).GT.ZXMAX)ZXMAX=ZX(II,JJ)
ENDDO
ENDDO
RETURN
END
When I run the above code with 4 processors, Root receives garbage values, whereas with 15 processors the data transfer is proper. How can I tackle this?
I guess it is related to the buffer, a point which is not clear to me. How do I have to handle the buffer properly?
Problem 1
You are doing multiple sends
call MPI_ISEND(ista,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(iend,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(jsta,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(jend,1,MPI_INTEGER,Root,1, &
MPI_COMM_WORLD,req,ierr)
call MPI_ISEND(a(ista:iend,jsta:jend),(ilen)*(jlen),MPI_REAL, &
Root,1,MPI_COMM_WORLD,req,ierr )
and all of them use the same request variable req. That cannot work.
Problem 2
You are using a subarray a(ista:iend,jsta:jend) in non-blocking MPI. That is not allowed*. You need to copy the array into a contiguous temporary buffer or use an MPI derived subarray datatype (probably too hard for you at this stage).
The reason for the problem is that the compiler will create a temporary copy just for the call to ISend. The ISend will remember the address, but will not send anything yet. Then the temporary is deleted and the address becomes invalid. When MPI_Wait later tries to use that address, it will fail.
Problem 3
Your MPI_Wait is in the wrong place. It must come after the sends, outside of any if conditions, so that it is always executed (provided you always send).
You must collect all requests separately and then wait for all of them. It is best to store them in an array and wait for all of them at once using MPI_Waitall.
Remember, ISend typically does not actually send anything when the buffer is large; the exchange often happens during the Wait operation, at least for larger arrays.
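A minimal sketch of how the sending part could look with all three problems addressed. The temporary buffer atmp and the request array reqs are names introduced here, they are not in the original code; the declarations belong with the others at the top of the program:
integer :: reqs(5)
real, allocatable :: atmp(:,:)
! copy the subarray into a contiguous buffer that stays valid until the wait
allocate(atmp(ilen,jlen))
atmp = a(ista:iend,jsta:jend)
! every non-blocking call gets its own request
call MPI_ISEND(ista, 1, MPI_INTEGER, Root, 1, MPI_COMM_WORLD, reqs(1), ierr)
call MPI_ISEND(iend, 1, MPI_INTEGER, Root, 1, MPI_COMM_WORLD, reqs(2), ierr)
call MPI_ISEND(jsta, 1, MPI_INTEGER, Root, 1, MPI_COMM_WORLD, reqs(3), ierr)
call MPI_ISEND(jend, 1, MPI_INTEGER, Root, 1, MPI_COMM_WORLD, reqs(4), ierr)
call MPI_ISEND(atmp, ilen*jlen, MPI_REAL, Root, 1, MPI_COMM_WORLD, reqs(5), ierr)
! ... the receive loop on Root goes here, unchanged ...
call MPI_WAITALL(5, reqs, MPI_STATUSES_IGNORE, ierr)   ! outside the if (myrank.EQ.Root) block
deallocate(atmp)
All messages keep tag 1, so the existing receive order on Root still matches: messages between one sender and one receiver with the same tag do not overtake each other.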
Recommendation:
Take a simple example and try to exchange just two small arrays with MPI_IRecv and MPI_ISend between two processes, as simple a test problem as you can manage. Learn from it and take small steps. No offence, but your current understanding of non-blocking MPI is too weak to write full-scale programs. MPI is hard, and non-blocking MPI is even harder.
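For example, such a two-process exchange could be as small as the following sketch (the program and variable names are mine, not from any of the codes above; run it with exactly two processes):
program exchange_two
implicit none
include "mpif.h"
integer, parameter :: n = 4
real :: sendbuf(n), recvbuf(n)
integer :: myrank, ierr, reqs(2), stats(MPI_STATUS_SIZE,2)
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
sendbuf = real(myrank + 1)
recvbuf = -1.0
! ranks 0 and 1 exchange their arrays with each other
call MPI_IRECV(recvbuf, n, MPI_REAL, 1-myrank, 0, MPI_COMM_WORLD, reqs(1), ierr)
call MPI_ISEND(sendbuf, n, MPI_REAL, 1-myrank, 0, MPI_COMM_WORLD, reqs(2), ierr)
call MPI_WAITALL(2, reqs, stats, ierr)
print *, 'rank', myrank, 'received', recvbuf
call MPI_FINALIZE(ierr)
end program exchange_two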
* Not allowed when using the interface available up to MPI-2. MPI-3 brings a new interface, available via use mpi_f08, where it is possible. But learn the basics first.
I am fairly new to writing and running parallel code. Currently I am experimenting with basic tutorials to get a feel for the process. My computer runs Ubuntu with MPICH.
I am attempting to run the code entitled "The complete parallel program to sum a vector" on this page: http://condor.cc.ku.edu/~grobe/docs/intro-MPI.shtml
and am encountering the following error upon execution, after being prompted for and entering a number:
Fatal error in MPI_Send: Invalid tag, error stack:
MPI_Send(174): MPI_Send(buf=0x7ffeab0f2d3c, count=1, MPI_INT, dest=1, tag=1157242880, MPI_COMM_WORLD) failed
MPI_Send(101): Invalid tag, value is 1157242880
I also receive the warning while compiling:
sumvecp.f90:41:23:
call mpi_send(vector(start_row),num_rows_to_send, mpi_real, an_id, send_data_tag, mpi_comm_world,ierr)
1
Warning: Legacy Extension: REAL array index at (1)
This is my code
program sumvecp
include '/usr/include/mpi/mpif.h'
parameter (max_rows = 10000000)
parameter (send_data_tag = 2001, return_data_tag = 2002)
integer my_id, root_proces, ierr, status(mpi_status_size)
integer num_procs, an_id, num_rows_to_receive
integer avg_rows_per_process, num_rows,num_rows_to_send
real vector(max_rows), vector2(max_rows), partial_sum, sum
root_process = 0
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world,my_id,ierr)
call mpi_comm_size(mpi_comm_world,num_procs,ierr)
if (my_id .eq. root_process) then
print *, "please enter the number of numbers to sum: "
read *, num_rows
if (num_rows .gt. max_rows) stop "Too many numbers."
avg_rows_per_process = num_rows / num_procs
do ii = 1,num_rows
vector(ii) = float(ii)
end do
do an_id = 1, num_procs -1
start_row = (an_id*avg_rows_per_process) +1
end_row = start_row + avg_rows_per_process -1
if (an_id .eq. (num_procs - 1)) end_row = num_rows
num_rows_to_send = end_row - start_row + 1
call mpi_send(num_rows_to_send, 1, mpi_int, an_id, send_data_tag, mpi_comm_world,ierr)
call mpi_send(vector(start_row),num_rows_to_send, mpi_real, an_id, send_data_tag, mpi_comm_world,ierr)
end do
summ = 0.0
do ii = 1, avg_rows_per_process
summ = summ + vector(ii)
end do
print *,"sum", summ, "calculated by the root process."
do an_id =1, num_procs -1
call mpi_recv(partial_sum, 1, mpi_real, mpi_any_source, mpi_any_tag, mpi_comm_world, status, ierr)
sender = status(mpi_source)
print *, "partial sum", partial_sum, "returned from process", sender
summ = summ + partial_sum
end do
print *, "The grand total is: ", sum
else
call mpi_recv(num_rows_to_receive, 1, mpi_int, root_process, mpi_any_tag, mpi_comm_world,status,ierr)
call mpi_recv(vector2,num_rows_to_received, mpi_real,root_process,mpi_any_tag,mpi_comm_world,status,ierr)
num_rows_received = num_rows_to_receive
partial_sum = 0.0
do ii=1,num_rows_received
partial_sum = partial_sum + vector2(ii)
end do
call mpi_send(partial_sum,1,mpi_real,root_process,return_data_tag,mpi_comm_world,ierr)
endif
call mpi_finalize(ierr)
stop
end
You are missing IMPLICIT NONE and you have a large number of undeclared variables.
The reported error is because
send_data_tag = 2001, return_data_tag = 2002
are implicitly real variables and not integers. But you probably have many more problems.
First add IMPLICIT NONE and declare all variables. I also highly recommend putting use mpi instead of the include '/usr/include/mpi/mpif.h'; it is likely to help you find more problems.
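A sketch of how the declaration part could then look (the variable list is taken from the names actually used in your posted code; check it against the rest of the program):
program sumvecp
use mpi
implicit none
integer, parameter :: max_rows = 10000000
! the tags must be integer parameters; implicitly typed they are reals
integer, parameter :: send_data_tag = 2001, return_data_tag = 2002
integer :: my_id, root_process, ierr, status(MPI_STATUS_SIZE)
integer :: num_procs, an_id, num_rows_to_receive, num_rows_received
integer :: avg_rows_per_process, num_rows, num_rows_to_send
integer :: ii, start_row, end_row, sender
real :: vector(max_rows), vector2(max_rows), partial_sum, summ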
Now I see the code is copied from some website. I wouldn't trust this website, because the codes are clearly wrong.
I have a simple program, which is supposed to gather a number of small arrays into one big one using MPI.
PROGRAM main
include 'mpif.h'
integer ierr, i, myrank, thefile, n_procs
integer, parameter :: BUFSIZE = 3
complex*16, allocatable :: loc_arr(:), glob_arr(:)
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, n_procs, ierr)
allocate(loc_arr(BUFSIZE))
loc_arr = 0.7 * myrank - cmplx(0.3, 0, kind=8)
allocate(glob_arr(n_procs* BUFSIZE))
write (*,*) myrank, shape(glob_arr)
call MPI_Gather(loc_arr, BUFSIZE, MPI_DOUBLE_COMPLEX,&
glob_arr, n_procs * BUFSIZE, MPI_DOUBLE_COMPLEX,&
0, MPI_COMM_WORLD, ierr)
write (*,*) myrank,"Errorcode:" , ierr
call MPI_FINALIZE(ierr)
END PROGRAM main
I have some experience with MPI in C, but for Fortran 90 nothing seems to work. Here is how I compile (I use ifort) and run it:
mpif90 test.f90 -check all && mpirun -np 4 ./a.out
1 12
3 12
3 Errorcode: 0
1 Errorcode: 0
0 12
2 12
2 Errorcode: 0
0 Errorcode: 0
*** Error in `./a.out': free(): invalid pointer: 0x0000000000a25790 ***
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 10889 RUNNING AT LenovoX1kabel
= EXIT CODE: 6
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 10889 RUNNING AT LenovoX1kabel
= EXIT CODE: 6
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
What am I doing wrong? Sometimes I get this pointer problem, sometimes I get a segmentation fault, but it doesn't look like any of the ifort checks complain.
All the error codes are 0, so I'm not sure where I go wrong.
You should never specify the number of processes in MPI collectives. That is a simple rule of thumb.
Therefore the receive count n_procs * BUFSIZE is clearly wrong.
And indeed the manual states: recvcount is the number of elements for any single receive (integer, significant only at root).
You should just use BUFSIZE. This is the same for C and Fortran.
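In other words, keep everything else and change the call to:
call MPI_Gather(loc_arr, BUFSIZE, MPI_DOUBLE_COMPLEX,&
glob_arr, BUFSIZE, MPI_DOUBLE_COMPLEX,&
0, MPI_COMM_WORLD, ierr)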
Hey, I am trying to get my LAPACK libraries to work, and I have searched and searched but I can't seem to figure out what I am doing wrong.
I try running my code, and I get the following error
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7FFB23D405F7
#1 0x7FFB23D40C3E
#2 0x7FFB23692EAF
#3 0x401ED1 in sgesv_
#4 0x401D0B in MAIN__ at CFDtest.f03:? Segmentation fault (core dumped)
I will paste my main code here, hopefully someone can help me with this problem.
****************************************************
PROGRAM CFD_TEST
USE MY_LIB
IMPLICIT DOUBLE PRECISION (A-H,O-Z)
DIMENSION ET(0:10), VN(0:10), WT(0:10)
DIMENSION SO(0:10), FU(0:10), DMA(0:10,0:10)
DIMENSION DMA2(0:10,0:10), QN(0:10), WKSPCE(0:10)
INTEGER*8 :: pivot(10), inf
INTEGER*8 :: N
EXTERNAL SGESV
!SET THE PARAMETERS
SIGMA1 = 0.D0
SIGMA2 = 0.D0
TAU = 1.D0
EF = 1.D0
EXP = 2.71828182845904509D0
COST = EXP/(1.D0+EXP*EXP)
DO 1 N=2, 10
!COMPUTATION OF THE NODES, WEIGHTS AND DERIVATIVE MATRIX
CALL ZELEGL(N,ET,VN)
CALL WELEGL(N,ET,VN,WT)
CALL DMLEGL(N,10,ET,VN,DMA)
!CONSTRUCTION OF THE MATRIX CORRESPONDING TO THE
!DIFFERENTIAL OPERATOR
DO 2 I=0, N
DO 2 J=0, N
SUM = 0.D0
DO 3 K=0, N
SUM = SUM + DMA(I,K)*DMA(K,J)
3 CONTINUE
OPER = -SUM
IF(I .EQ. J) OPER = -SUM + TAU
DMA2(I,J) = OPER
2 CONTINUE
!CHANGE OF THE ENTRIES OF THE MATRIX ACCORDING TO THE
!BOUNDARY CONDITIONS
DO 4 J=0, N
DMA2(0,J) = 0.D0
DMA2(N,J) = 0.D0
4 CONTINUE
DMA2(0,0) = 1.D0
DMA2(N,N) = 1.D0
!CONSTRUCTION OF THE RIGHT-HAND SIDE VECTOR
DO 5 I=1, N-1
FU(I) = EF
5 CONTINUE
FU(0) = SIGMA1
FU(N) = SIGMA2
!SOLUTION OF THE LINEAR SYSTEM
N1 = N + 1
CALL SGESV(N,N,DMA2,pivot,FU,N,inf)
DO 6 I = 0, N
FU(I) = SO(I)
6 CONTINUE
PRINT *, pivot
1 CONTINUE
RETURN
END PROGRAM CFD_TEST
*****************************************************
The commands I run to compile are
gfortran -c MY_LIB.f03
gfortran -c CFDtest.f03
gfortran MY_LIB.o CFDtest.o -o CFDtest -L/usr/local/lib -llapack -lblas
I ran the command
-fbacktrace -g -Wall -Wextra CFDtest
CFDtest: In function `_fini':
(.fini+0x0): multiple definition of `_fini'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crti.o:/build/buildd/glibc-2.19/csu/../sysdeps/x86_64/crti.S:80: first defined here
CFDtest: In function `data_start':
(.data+0x0): multiple definition of `data_start'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o:(.data+0x0): first defined here
CFDtest: In function `data_start':
(.data+0x8): multiple definition of `__dso_handle'
/usr/lib/gcc/x86_64-linux-gnu/4.9/crtbegin.o:(.data+0x0): first defined here
CFDtest:(.rodata+0x0): multiple definition of `_IO_stdin_used'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o:(.rodata.cst4+0x0): first defined here
CFDtest: In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o:(.text+0x0): first defined here
CFDtest: In function `_init':
(.init+0x0): multiple definition of `_init'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crti.o:/build/buildd/glibc-2.19/csu/../sysdeps/x86_64/crti.S:64: first defined here
/usr/lib/gcc/x86_64-linux-gnu/4.9/crtend.o:(.tm_clone_table+0x0): multiple definition of `__TMC_END'
CFDtest:(.data+0x10): first defined here
/usr/bin/ld: error in CFDtest(.eh_frame); no .eh_frame_hdr table will be created.
collect2: error: ld returned 1 exit status
You haven't posted your code for MY_LIB.f03 so we cannot compile CFDtest.f03 exactly as you have supplied it.
(As an aside, the usual naming convention is that the f90 in a .f90 file name is not supposed to imply the language version being targeted. Rather, .f90 denotes free format while .f is used for fixed format. By extension, your .f03 files would be better, i.e. more portably, named .f90.)
I commented out the USE MY_LIB line and ran your code through nagfor -u -c cfd_test.f90. The output, broken down, is
Extension: cfd_test.f90, line 13: Byte count on numeric data type
detected at *#8
Extension: cfd_test.f90, line 15: Byte count on numeric data type
detected at *#8
Byte counts are not portable. The kind value for an 8-byte integer is selected_int_kind(18). (Similarly you might like to use a kind(0.0d0) kind value for your double precision data.)
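For example, a portable way to declare those kinds would be the following sketch (the kind names i8 and dp are just illustrative):
integer, parameter :: i8 = selected_int_kind(18)   ! at least an 8-byte integer
integer, parameter :: dp = kind(0.0d0)             ! double precision real
integer(kind=i8)   :: pivot(10), inf, n
real(kind=dp)      :: dma2(0:10,0:10)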
Error: cfd_test.f90, line 48: Implicit type for I
detected at 2#I
Error: cfd_test.f90, line 50: Implicit type for J
detected at 2#J
Error: cfd_test.f90, line 54: Implicit type for K
detected at 3#K
Error: cfd_test.f90, line 100: Implicit type for N1
detected at N1#=
You have these implicitly typed, which implies they are 4-byte (default) integers. You should probably declare these explicitly as 8-byte integers (using the 8-byte integer kind value above) if that's what you intend.
Questionable: cfd_test.f90, line 116: Variable COST set but never referenced
Questionable: cfd_test.f90, line 116: Variable N1 set but never referenced
Warning: cfd_test.f90, line 116: Unused local variable QN
Warning: cfd_test.f90, line 116: Unused local variable WKSPCE
You need to decide what you intend to do with these, or whether they are just deletable cruft.
With the implicit integers declared explicitly, there is further output
Warning: cfd_test.f90, line 116: Variable SO referenced but never set
This looks bad.
Obsolescent: cfd_test.f90, line 66: 2 is a shared DO termination label
Your DO loops would probably be better using the modern END DO terminators (not shared!)
Error: cfd_test.f90, line 114: RETURN is only allowed in SUBROUTINEs and FUNCTIONs
This is obviously easy to fix.
For the LAPACK call, one source of explicit interfaces for these routines is the NAG Fortran Library (through the nag_library module). Since your real data is not single precision, you should be using dgesv instead of sgesv. Adding USE nag_library, ONLY: dgesv and switching to call dgesv instead of sgesv, then recompiling as above, reveals
Incorrect data type INTEGER(KIND=4) (expected INTEGER) for argument N (no. 1) of DGESV
so you should indeed be using default (4-byte) integers, at least for the LAPACK build on your system, which will almost certainly use 4-byte integers. Thus you might want to forget all about kinding your integers and just use the default integer type throughout. Correcting this gives
Array supplied for scalar argument LDA (no. 4) of DGESV
so you do need to add this argument. Maybe pass size(DMA2,1)?
With this argument added to the call the code compiles successfully, but without the definitions for your *LEGL functions I couldn't go through any run-time testing.
Here is my modified (and pretty-printed) version of your program
Program cfd_test
! Use my_lib
! Use nag_library, Only: dgesv
Implicit None
Integer, Parameter :: wp = kind(0.0D0)
Real (Kind=wp) :: ef, oper, sigma1, sigma2, sum, tau
Integer :: i, inf, j, k, n
Real (Kind=wp) :: dma(0:10, 0:10), dma2(0:10, 0:10), et(0:10), fu(0:10), &
so(0:10), vn(0:10), wt(0:10)
Integer :: pivot(10)
External :: dgesv, dmlegl, welegl, zelegl
Intrinsic :: kind, size
! SET THE PARAMETERS
sigma1 = 0._wp
sigma2 = 0._wp
tau = 1._wp
ef = 1._wp
Do n = 2, 10
! COMPUTATION OF THE NODES, WEIGHTS AND DERIVATIVE MATRIX
Call zelegl(n, et, vn)
Call welegl(n, et, vn, wt)
Call dmlegl(n, 10, et, vn, dma)
! CONSTRUCTION OF THE MATRIX CORRESPONDING TO THE
! DIFFERENTIAL OPERATOR
Do i = 0, n
Do j = 0, n
sum = 0._wp
Do k = 0, n
sum = sum + dma(i, k)*dma(k, j)
End Do
oper = -sum
If (i==j) oper = -sum + tau
dma2(i, j) = oper
End Do
End Do
! CHANGE OF THE ENTRIES OF THE MATRIX ACCORDING TO THE
! BOUNDARY CONDITIONS
Do j = 0, n
dma2(0, j) = 0._wp
dma2(n, j) = 0._wp
End Do
dma2(0, 0) = 1._wp
dma2(n, n) = 1._wp
! CONSTRUCTION OF THE RIGHT-HAND SIDE VECTOR
Do i = 1, n - 1
fu(i) = ef
End Do
fu(0) = sigma1
fu(n) = sigma2
! SOLUTION OF THE LINEAR SYSTEM
Call dgesv(n, n, dma2, size(dma2,1), pivot, fu, n, inf)
Do i = 0, n
fu(i) = so(i)
End Do
Print *, pivot
End Do
End Program
In general, your development experience will be most pleasant if you use as good a checking compiler as you can get your hands on and ask it to diagnose as much as it can for you.
As far as I can tell, there could be a number of problems:
Your integers declared as INTEGER*8 might be too long; INTEGER*4 or simply INTEGER would probably be better.
You call SGESV on double precision arguments instead of DGESV.
Your LDA argument is missing, so your code should perhaps look like CALL DGESV(N,N,DMA2,N,pivot,FU,N,inf) but you need to check whether this is what you want.
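Putting those points together, and noting that DMA2 and FU are declared with index range 0:10 (leading dimension 11) while FU is a single right-hand side, the call might end up looking roughly like the sketch below, with all integers left at default kind as suggested above. Verify it against your problem before using it:
! sketch only: N+1 equations, 1 right-hand side, LDA = LDB = 11, default-integer pivot and inf
integer :: pivot(11), inf
call DGESV(N+1, 1, DMA2, 11, pivot, FU, 11, inf)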
I am trying to reproduce this C example in Fortran. My code so far:
use mpi
implicit none
integer, parameter :: maxn = 8
integer, allocatable :: xlocal(:,:)
integer :: i, j, lsize, errcnt, toterr, buff
integer :: ierror, nproc, pid, root = 0, nreq = 0
integer, allocatable :: request(:), status(:,:)
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, pid, ierror)
if (mod(maxn, nproc) /= 0) then
write(*,*) 'Array size (maxn) should be a multiple of the number of processes'
call MPI_ABORT(MPI_COMM_WORLD, 1, ierror)
end if
lsize = maxn/nproc
allocate(xlocal(0:lsize+1, maxn))
allocate(request(nproc))
allocate(status(MPI_STATUS_SIZE,nproc))
xlocal(0,:) = -1
xlocal(1:lsize,:) = pid
xlocal(lsize+1,:) = -1
! send down unless on bottom
if (pid < nproc-1) then
nreq = nreq + 1
call MPI_ISEND(xlocal(lsize,:), maxn, MPI_INTEGER, &
pid+1, 0, MPI_COMM_WORLD, request(nreq), ierror)
write(*,'(2(A,I1),A)') 'process ', pid, ' sent to process ', pid+1, ':'
write(*,*) xlocal(lsize,:)
end if
if (pid > 0) then
nreq = nreq + 1
call MPI_IRECV(xlocal(0,:), maxn, MPI_INTEGER, &
pid-1, 0, MPI_COMM_WORLD, request(nreq), ierror)
write(*,'(2(A,I1),A)') 'process ', pid, ' received from process ', pid-1, ':'
write(*,*) xlocal(0,:)
end if
! send up unless on top
if (pid > 0) then
nreq = nreq + 1
call MPI_ISEND(xlocal(1,:), maxn, MPI_INTEGER, &
pid-1, 1, MPI_COMM_WORLD, request(nreq), ierror)
write(*,'(2(A,I1),A)') 'process ', pid, ' sent to process ', pid-1, ':'
write(*,*) xlocal(1,:)
end if
if (pid < nproc-1) then
nreq = nreq + 1
call MPI_IRECV(xlocal(lsize+1,:), maxn, MPI_INTEGER, &
pid+1, 1, MPI_COMM_WORLD, request(nreq), ierror)
write(*,'(2(A,I1),A)') 'process ', pid, ' received from process ', pid+1, ':'
write(*,*) xlocal(lsize+1,:)
end if
call MPI_WAITALL(nreq, request, status, ierror)
! check results
errcnt = 0
do i = 1, lsize
do j = 1, maxn
if (xlocal(i,j) /= pid) errcnt = errcnt + 1
end do
end do
do j = 1, maxn
if (xlocal(0,j) /= pid-1) errcnt = errcnt + 1
if ((pid < nproc-1) .and. (xlocal(lsize+1,j) /= pid+1)) errcnt = errcnt + 1
end do
call MPI_REDUCE(errcnt, toterr, 1, MPI_INTEGER, MPI_SUM, 0, MPI_COMM_WORLD)
if (pid == root) then
if (toterr == 0) then
write(*,*) "no errors found"
else
write(*,*) "found ", toterr, " errors"
end if
end if
deallocate(xlocal)
deallocate(request)
deallocate(status)
call MPI_FINALIZE(ierror)
but I am running into segmentation faults and cannot figure out why. I have a feeling it is due to the request array. Can someone explain the correct way of using the request array in Fortran? None of the references I found clarify this.
Thanks in advance.
In case you haven't already done so, consider compiling your program with some flags that will help you in debugging, e.g. with gfortran, you can use -O0 -g -fbounds-check (if that does not help, you might add -fsanitize=address for versions >= 4.8). Other compilers have similar options for debugging.
Doing that, and running with 2 processes, your program crashes at the MPI_Reduce line. If you look up the specification (e.g. OpenMPI 1.8) you can see that this subroutine requires one more argument: you forgot to add the ierror argument at the end.
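That is, the call should read:
call MPI_REDUCE(errcnt, toterr, 1, MPI_INTEGER, MPI_SUM, 0, MPI_COMM_WORLD, ierror)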
It is a bit tragic that, even though the subprograms from the mpi module are accessible through use association and thus should be checked for argument consistency to avoid these trivial errors, not all subprograms are necessarily in that module. I don't know which MPI implementation you use, but I checked my local MPICH installation and it does not have most subroutines in the module, so no explicit interfaces exist for them and such mistakes go undetected. You are probably in a similar situation, and other implementations may suffer the same fate. You could compare it to a C header file missing the function prototype for MPI_Reduce. The reason is probably that originally there was only a Fortran 77 interface for most implementations.
Some final comments: be careful not to just copy and paste the C code. The array sections you pass are not contiguous, so a temporary copy will be passed to the MPI routines, which is very inefficient (not that it really matters in this case).
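If you want the exchanged slices to be contiguous, one option (a sketch that changes only the array layout, not your algorithm) is to swap the two indices of xlocal so each halo slice is a whole column:
! columns of a Fortran array are contiguous, so xlocal(:,lsize) can be passed without a copy
allocate(xlocal(maxn, 0:lsize+1))
xlocal(:,0)       = -1
xlocal(:,1:lsize) = pid
xlocal(:,lsize+1) = -1
call MPI_ISEND(xlocal(:,lsize), maxn, MPI_INTEGER, &
pid+1, 0, MPI_COMM_WORLD, request(nreq), ierror)
The remaining sends, receives, and checks would be adapted the same way, with the row index of the original code becoming the second index.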