Why does this MPI_SENDRECV deadlock? - Fortran

This is my first post; thank you in advance for your kind help. Here is a very short code for testing MPI_SENDRECV, which I do not understand well. It deadlocks; can anybody tell me why?
PROGRAM sendrecv
INCLUDE "mpif.h"
INTEGER ibuf(20)
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
a=1
b=2
if (myid == 0) then
call mpi_sendrecv(a,1,mpi_real,1,0,
. b,1,mpi_real,1,0,
. MPI_COMM_WORLD, status,ierr)
elseif (myid == 1) then
call mpi_sendrecv(b,1,mpi_real,0,0,
. a,1,mpi_real,0,0,
. MPI_COMM_WORLD,status,ierr)
end if
if (myid.eq.0) then
write(*,*) a
endif
if (myid.eq.1) then
write(*,*) b
endif
CALL MPI_FINALIZE(ierr)
END

As pointed out by @SteveBlackwell, you're using myid instead of myrank. If you use:
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
instead, then you'll be halfway there. The deadlock arises because both processes (probably) have myid = 0. (Note that this behavior is compiler dependent: another compiler might set myid to some strange number, in which case your program will appear to work but no messages will be passed.)
Your second problem is that status is implicitly declared as a real variable, but MPI expects an integer array of size MPI_STATUS_SIZE. This could have all sorts of effects: you could get a segfault or, worse, some strange memory error, since MPI is writing to a buffer that it shouldn't. (Alternatively, you could use MPI_STATUS_IGNORE, since you're not doing anything with the status anyway.)
As pointed out by @HighPerformanceMark, it is best practice to explicitly type everything in your program and use IMPLICIT NONE to avoid these kinds of problems. In other words, you should have a really good reason if your subroutine/module/function/main program doesn't have an IMPLICIT NONE in its declarations.
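For reference, a corrected version might look like the sketch below: it switches to the mpi module with IMPLICIT NONE, tests the variable that MPI_COMM_RANK actually sets, declares the buffers explicitly, and gives status the required size (MPI_STATUS_IGNORE would work just as well).
PROGRAM sendrecv
  USE mpi
  IMPLICIT NONE
  INTEGER :: ierr, nprocs, myrank
  INTEGER :: status(MPI_STATUS_SIZE)
  REAL :: a, b

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)

  a = 1.0
  b = 2.0

  IF (myrank == 0) THEN
    ! rank 0: send a to rank 1, receive from rank 1 into b
    CALL MPI_SENDRECV(a, 1, MPI_REAL, 1, 0, &
                      b, 1, MPI_REAL, 1, 0, &
                      MPI_COMM_WORLD, status, ierr)
    WRITE(*,*) a
  ELSE IF (myrank == 1) THEN
    ! rank 1: send b to rank 0, receive from rank 0 into a
    CALL MPI_SENDRECV(b, 1, MPI_REAL, 0, 0, &
                      a, 1, MPI_REAL, 0, 0, &
                      MPI_COMM_WORLD, status, ierr)
    WRITE(*,*) b
  END IF

  CALL MPI_FINALIZE(ierr)
END PROGRAM sendrecv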

Related

MPI: Is there a simple way to get a single value from another rank?

I am running a parallel Fortran 90 code using MPI. On every rank there is an allocatable 3D array "u" that stores double precision floating point numbers; it has the same size on all ranks.
Some parts of the code are skipped by a large number of the MPI processes, since up to now they were not involved in those computations. The code in the main program looks somewhat like this:
[...] !previous code
if(number_of_samples(rank).gt.0) then
do i=1,number_of_samples(rank)
call perform_some_action() !only called by ranks with more than zero samples
enddo
endif
[...] !subsequent code
The subroutine "perform_some_action" does some computations, which up to now only needed information that was stored within the respective MPI process. With my latest modification however, I sometimes need access to one single value of the array "u" which is stored in a neighbor rank, and not in the rank that performs the calculations within the subroutine "perform_some_action".
My question is: is there some simple way to, within "perform_some_action", retrieve a single value of array "u" in another rank? I know exactly in which rank the required value is stored, and I know the position of the value within "u" in this rank.
My first idea was to use MPI_SEND and MPI_RECV, but I think this is not possible if the sending rank never reaches that part of the code. That is, since I only call "perform_some_action" from ranks with number_of_samples(rank).gt.0, the subroutine is never entered on ranks with number_of_samples(rank)=0, yet it is exactly the array "u" of these ranks that I sometimes need to access.
My latest idea was to use MPI_SENDRECV. My subroutine looks like this:
subroutine perform_some_action()
implicit none
include 'mpif.h'
real(8) :: saveValueHere
[...]
call MPI_SENDRECV(u(i,j,k),1,MPI_DOUBLE_PRECISION,rank,0,&
saveValueHere,1,MPI_DOUBLE_PRECISION,sendingRank,0,&
MPI_COMM_Cart,status,ierr)
[...]
end subroutine
However, using this approach, I always get "MPI_ERR_TRUNCATE: message truncated".
Another idea I had was to create, within the subroutine, a new MPI communicator containing only the receiving rank and call MPI_BCAST to send the single value. However, MPI_BCAST is obviously not designed for this kind of point-to-point communication.
Edit: I tried to provide an example for reproduction:
PROGRAM mpitest
USE mpi
IMPLICIT NONE
INTEGER :: ierr, npe, rank, win
INTEGER :: is,ie,js,je,ks,ke
REAL(8), dimension(:,:,:), allocatable :: u
REAL(8) :: saveValueHere
CALL MPI_INIT( ierr )
CALL MPI_COMM_RANK( MPI_COMM_WORLD, rank, ierr )
CALL MPI_COMM_SIZE( MPI_COMM_WORLD, npe, ierr )
if(rank.eq.0) then
is=1
ie=2
js=1
je=2
ks=1
ke=2
else if(rank.eq.1) then
is=3
ie=4
js=3
je=4
ks=3
ke=4
endif
allocate(u(is:ie,js:je,ks:ke))
u=0.d0
if(rank.eq.0) then
u(is,js,ks)=1.d0
u(is,js,ke)=2.d0
u(is,je,ks)=3.d0
u(is,je,ke)=4.d0
u(ie,js,ks)=5.d0
u(ie,js,ke)=6.d0
u(ie,je,ks)=7.d0
u(ie,je,ke)=8.d0
else if(rank.eq.1) then
u(is,js,ks)=11.d0
u(is,js,ke)=12.d0
u(is,je,ks)=13.d0
u(is,je,ke)=14.d0
u(ie,js,ks)=15.d0
u(ie,js,ke)=16.d0
u(ie,je,ks)=17.d0
u(ie,je,ke)=18.d0
endif
if(rank.eq.0) then
write(*,*) 'get u(3,4,3)=13.d0 from rank 1 and save in saveValueHere'
endif
CALL MPI_FINALIZE(ierr)
END PROGRAM mpitest
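One way to fetch such a value without the owning rank ever reaching a matching send or receive is MPI one-sided communication (RMA). This is not from the original post, just a sketch of how it could be grafted onto the example above: the window creation is collective and would go where all ranks still execute the same code (e.g. right after "u" is filled), and afterwards only the rank that needs the value calls MPI_GET. The names win, rank, ierr, u and saveValueHere are taken from the example; the displacement 2 assumes the 2x2x2, column-major block shown.
! extra declarations to add next to the existing ones
INTEGER :: sizedouble
INTEGER(KIND=MPI_ADDRESS_KIND) :: winsize, disp

! collective: every rank exposes its local u in a window
CALL MPI_TYPE_SIZE(MPI_DOUBLE_PRECISION, sizedouble, ierr)
winsize = size(u)*sizedouble
CALL MPI_WIN_CREATE(u, winsize, sizedouble, MPI_INFO_NULL, &
                    MPI_COMM_WORLD, win, ierr)

IF (rank.eq.0) THEN
  ! u(3,4,3) on rank 1 is the third element of its 2x2x2 block
  ! in column-major order, i.e. displacement 2 in units of sizedouble
  disp = 2
  CALL MPI_WIN_LOCK(MPI_LOCK_SHARED, 1, 0, win, ierr)
  CALL MPI_GET(saveValueHere, 1, MPI_DOUBLE_PRECISION, 1, disp, &
               1, MPI_DOUBLE_PRECISION, win, ierr)
  CALL MPI_WIN_UNLOCK(1, win, ierr)
  WRITE(*,*) 'got u(3,4,3) from rank 1: ', saveValueHere
END IF

CALL MPI_WIN_FREE(win, ierr)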

Code compiles with include 'mpif.h' but fails when switching to use mpi

I am trying to switch to use mpi in some old Fortran code I have. I get the strange error
Error: There is no specific subroutine for the generic 'mpi_type_indexed' at (1)
when I switch to 'use mpi' in the code. If I use include 'mpif.h' instead, the program compiles and runs correctly.
I have written a compact example that reproduces the problem. Both the original code and the example are compiled with gcc/8.1.0 and openmpi/3.1.2.
program bt
use mpi
implicit none
!include 'mpif.h'
contains
subroutine read_me()
implicit none
integer :: my_n, my_disp, my_type
integer :: ierr
my_n = 2
my_disp = 4
call MPI_Type_indexed(1, my_n, my_disp, MPI_INTEGER, my_type, ierr)
end subroutine
end program
Compile it with no flags: mpif90 bt.F90
With use mpi commented out and include 'mpif.h' uncommented, everything works fine.
With use mpi uncommented and include 'mpif.h' commented out, I get an error saying
bt.F90:23:67:
call MPI_Type_indexed(1, my_n, my_disp, MPI_INTEGER, my_type, ierr)
1
Error: There is no specific subroutine for the generic 'mpi_type_indexed' at (1)
As indicated in the comments, the "problem" is that because you have used the module rather than the include file, an explicit interface is now in scope, and the compiler can detect that you are calling MPI_Type_indexed with incorrect arguments: the 2nd and 3rd arguments should be arrays. Take a look at https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node79.htm#Node79 to see what the interface should be.
Looking at your example, it seems the original author assumed that a scalar and a one-element array are the same thing. They are not: the former has rank 0 and the latter rank 1. I say this because the first argument specifies how big the arrays should be, and in your case it has the value 1. Thus the 2nd and 3rd arguments should be one-element arrays rather than the scalars you have. The simplest solution, as these arguments are Intent( In ), is to wrap them in square brackets, which act as an array constructor:
call MPI_Type_indexed(1, [ my_n ], [ my_disp ], MPI_INTEGER, my_type, ierr)
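Equivalently, since you control the declarations in the example, you could declare the two arguments as one-element arrays so the call matches the interface without any constructors. A minimal sketch, keeping the names from the example:
subroutine read_me()
  implicit none
  ! block length and displacement as one-element arrays, as the
  ! MPI_Type_indexed interface requires
  integer :: my_n(1), my_disp(1), my_type
  integer :: ierr
  my_n(1) = 2
  my_disp(1) = 4
  call MPI_Type_indexed(1, my_n, my_disp, MPI_INTEGER, my_type, ierr)
end subroutine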

Fortran 77 -> Fortran 90+: COMMON blocks, MPI_Bcast

I am refactoring an F77 program to a more recent Fortran standard (90 or even newer).
I have a module where some variables are defined. These variables are currently placed in a common block, because in an external subroutine they are all broadcast with a single MPI_BCAST call that exploits the contiguous storage of the common block.
module foo
implicit none
integer :: a,b,c,d
real*8 :: r,t,p
common/com/ a,b,c,d,r,t,p
end module foo
subroutine bar
...
com_length = 4*4 + 3*8 ! 4 integers + 3 real(8)
! bcasting 'com' common block, i.e. all variables at once
call mpi_bcast(a,com_length,mpi_byte,root,comm,ierr)
...
end subroutine bar
The problem is that the length of the common block, com_length, is calculated by hand and is error prone. If a variable is missing from the COMMON block definition, debugging will take ages, because even valgrind won't notice the out-of-bounds access.
On the other hand, calling MPI_BCAST separately for each variable would hurt performance.
I would appreciate your suggestions on how to refactor this.
You could do it in two MPI_BCAST calls, one per type. Since the common block keeps the four integers and the three reals contiguous, each group can be broadcast starting from its first variable:
CALL MPI_BCAST(a, 4, MPI_INTEGER, root, MPI_COMM_WORLD, ierr)
CALL MPI_BCAST(r, 3, MPI_DOUBLE_PRECISION, root, MPI_COMM_WORLD, ierr)
The 4 and 3 may not be exactly right for your real code, but the idea is the same: group your like-typed variables together and broadcast each group with a single call. (Note that passing an array constructor such as [a, b, c, d] as the buffer would not work here: the constructor is a temporary, so values received on non-root ranks would never reach a, b, c, d.)
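If you are free to change the module itself, a further option (only a sketch; the array names ints and reals and the helper bcast_foo are made up for illustration) is to drop the COMMON block and keep each group of variables in an array, so the broadcast counts come from size() and nothing has to be counted by hand, at the cost of giving up the individual names a, b, c, d, r, t, p:
module foo
  use mpi
  implicit none
  integer :: ints(4)   ! formerly a, b, c, d
  real(8) :: reals(3)  ! formerly r, t, p
contains
  subroutine bcast_foo(root, comm)
    integer, intent(in) :: root, comm
    integer :: ierr
    ! counts come from the arrays themselves, so nothing can get out of sync
    call MPI_BCAST(ints,  size(ints),  MPI_INTEGER,          root, comm, ierr)
    call MPI_BCAST(reals, size(reals), MPI_DOUBLE_PRECISION, root, comm, ierr)
  end subroutine bcast_foo
end module foo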

standard output in Fortran MPI code

I have a parallel fortran code in which I want only the rank=0 process to be able to write to stdout, but I don't want to have to litter the code with:
if(rank==0) write(*,*) ...
so I was wondering if doing something like the following would be a good idea, or whether there is a better way?
program test
use mpi
implicit none
integer :: ierr
integer :: nproc
integer :: rank
integer :: stdout
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world, rank, ierr)
call mpi_comm_size(mpi_comm_world, nproc, ierr)
select case(rank)
case(0)
stdout = 6
case default
stdout = 7
open(unit=stdout, file='/dev/null')
end select
write(stdout,*) "Hello from rank=", rank
call mpi_finalize(ierr)
end program test
This gives:
$ mpirun -n 10 ./a.out
Hello from rank= 0
Thanks for any advice!
There are two disadvantages to your solution:
This "clever" solution actually obscures the code, since it lies: stdout isn't stdout any more. If someone reads the code he/she will think that all processes are writing to stdout, while in reality they aren't.
If you want all processes to write to stdout at some point, what will you do then? Add more tricks?
If you really want to stick with this trick, please don't use "stdout" as the variable name for the unit number; call it e.g. "master" or anything else that indicates you're not actually writing to stdout. Furthermore, be aware that unit 6 isn't always stdout. Fortran 2003 lets you look up the actual unit number of stdout, so use that if you can.
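For completeness, the Fortran 2003 way to get the real stdout unit is the OUTPUT_UNIT constant from the intrinsic module ISO_FORTRAN_ENV, so nothing needs to be hard-coded to 6:
program show_stdout
  use iso_fortran_env, only: output_unit
  implicit none
  write(output_unit,*) "this goes to whatever unit really is stdout"
end program show_stdout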
My advice would be to stay with the if(rank==0) statements. They clearly indicate what happens in the code. If you have lots of similar I/O statements, you could write subroutines that write only on rank 0, or on all processes; these can have meaningful names that indicate the intended usage, as sketched below.
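A minimal sketch of such a helper (the module name master_io and the routine name write_master are made up here); call sites then read call write_master("Hello"), which documents the intent at a glance.
module master_io
  use mpi
  implicit none
contains
  subroutine write_master(msg)
    ! write msg to stdout on rank 0 only; all other ranks do nothing
    character(len=*), intent(in) :: msg
    integer :: rank, ierr
    call mpi_comm_rank(mpi_comm_world, rank, ierr)
    if (rank == 0) write(*,*) msg
  end subroutine write_master
end module master_io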
mpirun comes with an option to redirect stdout from each process into separate files. For example, -output-filename out results in out.1.0, out.1.1, ..., which you can then monitor however you like (I use tail -f). Next to if(rank.eq.0) this is the cleanest solution, I think.
I am not so concerned with the two disadvantages mentioned by steabert. We can work around them by introducing another unit variable whose name clearly indicates that it is stdout only on the master process, e.g. stdout -> stdout0.
But here is my concern: /dev/null will work in a UNIX-like environment. Will it work on Windows? How about the funky BlueGene systems?