While trying to use a type for blockdiagonal matrices in Fortran code,
I stumbled over a surprising bug in the following piece of code using the following compilers:
GNU Fortran (SUSE Linux) 7.4.0
ifort (IFORT) 18.0.5 20180823
ifort (IFORT) 16.0.1 20151021
If I compile
gfortran -Wall -Werror --debug ifort_bug.f && valgrind ./a.out
I have no errors reported from valgrind.
If I compile
ifort -warn all,error -debug -stacktrace ifort_bug.f && valgrind ./a.out
I get a segmentation fault for ifort_18 in my code and "only" memory leaks for ifort_16.
Is this a bug in the intel compilers, or does gfortran silently fix my bad code?
module blockdiagonal_matrices
implicit none
private
public :: t_blockdiagonal, new, delete,
& blocksizes, operator(.mult.), mult
save
integer, parameter :: dp = kind(1.d0)
type :: t_blockdiagonal
real(dp), allocatable :: block(:, :)
end type
interface new
module procedure block_new
end interface
interface delete
module procedure block_delete
end interface
interface operator (.mult.)
module procedure mult_blocks
end interface
contains
subroutine block_new(blocks, blocksizes)
type(t_blockdiagonal), intent(out) :: blocks(:)
integer, intent(in) :: blocksizes(:)
integer :: i, L
do i = 1, size(blocks)
L = blocksizes(i)
allocate(blocks(i)%block(L, L))
end do
end subroutine
subroutine block_delete(blocks)
type(t_blockdiagonal) :: blocks(:)
integer :: i
do i = 1, size(blocks)
deallocate(blocks(i)%block)
end do
end subroutine
function blocksizes(A) result(res)
type(t_blockdiagonal), intent(in) :: A(:)
integer :: res(size(A))
integer :: i
res = [(size(A(i)%block, 1), i = 1, size(A))]
end function
function mult_blocks(A, B) result(C)
type(t_blockdiagonal), intent(in) :: A(:), B(:)
type(t_blockdiagonal) :: C(size(A))
integer :: i
call new(C, blocksizes=blocksizes(A))
do i = 1, size(A)
C(i)%block = matmul(A(i)%block, B(i)%block)
end do
end function
subroutine mult(A, B, C)
type(t_blockdiagonal), intent(in) :: A(:), B(:)
type(t_blockdiagonal) :: C(:)
integer :: i
do i = 1, size(A)
C(i)%block = matmul(A(i)%block, B(i)%block)
end do
end subroutine
end module blockdiagonal_matrices
program time_blockdiagonal
use blockdiagonal_matrices
integer, parameter :: n_blocks = 2, L_block = 10**2
type(t_blockdiagonal) :: A(n_blocks), B(n_blocks), C(n_blocks)
integer :: i
integer :: start, finish, rate
call system_clock(count_rate=rate)
call new(A, blocksizes=[(L_block, i = 1, n_blocks)])
call new(B, blocksizes=[(L_block, i = 1, n_blocks)])
call new(C, blocksizes=[(L_block, i = 1, n_blocks)])
do i = 1, n_blocks
call random_number(A(i)%block)
call random_number(B(i)%block)
end do
call system_clock(start)
C = A .mult. B
call system_clock(finish)
write(6,*) 'Elapsed Time in seconds:',
& dble(finish - start) / dble(rate)
call system_clock(start)
call mult(A, B, C)
call system_clock(finish)
write(6,*) 'Elapsed Time in seconds:',
& dble(finish - start) / dble(rate)
call delete(A)
call delete(B)
call delete(C)
end program time_blockdiagonal
The ifort_18 segmentation fault is:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
a.out 00000000004134BD Unknown Unknown Unknown
libpthread-2.26.s 00007FF720EB6300 Unknown Unknown Unknown
a.out 000000000040ABB8 Unknown Unknown Unknown
a.out 000000000040B029 Unknown Unknown Unknown
a.out 0000000000407113 Unknown Unknown Unknown
a.out 0000000000402B4E Unknown Unknown Unknown
libc-2.26.so 00007FF720B0AF8A __libc_start_main Unknown Unknown
a.out 0000000000402A6A Unknown Unknown Unknown
When I went into it with gdb it turned out, that the segfault is raised upon returning from function mult_blocks.
blockdiagonal_matrices::mult_blocks (c=..., a=..., b=...) at ifort_bug.f:63
63 do i = 1, size(A)
(gdb) s
64 C(i)%block = matmul(A(i)%block, B(i)%block)
(gdb) s
s
s
65 end do
(gdb) s
64 C(i)%block = matmul(A(i)%block, B(i)%block)
(gdb) s
65 end do
(gdb) s
66 end function
(gdb) s
Program received signal SIGSEGV, Segmentation fault.
0x000000000040abc4 in do_deallocate_all ()
(gdb) q
A debugging session is active.
Even with this information I cannot find a bug in my code.
EDIT
I found a fix, although I don't understand why it works.
Compilation fix
If I use -heap-arrays for compilation it works. So at first glance it seems to run into a Stackoverflow. If I do ulimit -s unlimited it does not solve the problem though.
Code fix
If I explicitly allocate in the code it solves the issue.
subroutine new(blocks, blocksizes)
type(t_blockdiagonal), allocatable, intent(out) :: blocks(:)
integer, intent(in) :: blocksizes(:)
integer :: i, L
allocate(blocks(size(blocksizes)))
do i = 1, size(blocks)
L = blocksizes(i)
allocate(blocks(i)%block(L, L))
end do
end subroutine
subroutine delete(blocks)
type(t_blockdiagonal), allocatable :: blocks(:)
integer :: i
do i = 1, size(blocks)
deallocate(blocks(i)%block)
end do
deallocate(blocks)
end subroutine
function mult_blocks(A, B) result(C)
type(t_blockdiagonal), intent(in) :: A(:), B(:)
type(t_blockdiagonal), allocatable :: C(:)
integer :: i
call new(C, blocksizes=blocksizes(A))
do i = 1, size(A)
C(i)%block = matmul(A(i)%block, B(i)%block)
end do
end function
This way of writing is IMHO actually better than before and not a "dirty hack".
It is not anymore possible to call new with a differing size of blocksize and blocks.
Open question
Speaking C the type(t_blockdiagonal) :: blocks(n)
Should be a float** blocks[n] so just a vector of pointers to pointers.
The allocation of the actual blocks happened also in the first version on the heap. Hence I don't get the Stackoverflow for a vector that contains ca. 10 pointers.
IntelĀ® Fortran Compiler - Increased stack usage of 8.0 or higher compilers causes segmentation fault (via archive.org)
According to this article, ulimit -s unlimited does not mean that the stack will be literally "unlimited":
The size of "unlimited" varies by Linux configuration, so you may need to specify a larger, specific number to ulimit (for example, 999999999). On Linux also note that many 32bit Linux distributions ship with a pthread static library (libpthread.a) that at runtime will fix the stacksize to 2093056 bytes regardless of the ulimit setting.
It is most likely that you do indeed have a stack overflow, given that -heap-arrays solves the issue.
Related
Consider the following Fortran program
program test_prg
use iso_fortran_env, only : real64
use mpi_f08
implicit none
real(real64), allocatable :: arr_send(:), arr_recv(:)
integer :: ierr
call MPI_Init(ierr)
allocate(arr_send(3), arr_recv(3))
arr_send = 1
print *, lbound(arr_recv)
call MPI_Gatherv(arr_send, size(arr_send), MPI_DOUBLE_PRECISION, arr_recv, [size(arr_send)], [0], MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)
print *, lbound(arr_recv)
call MPI_Finalize(ierr)
end program
Execution of this program on 1 processor (compiled with gfortran 9.3.0 and mpich 3.3.2), prints:
1
0
So arr_recv has changed its lower bound after the call to MPI_Gatherv. If I use arr_recv(1) instead of arr_recv in the call to MPI_Gatherv, then it doesn't change. If I replace mpi_f08 module with mpi, then using either arr_recv(1) or arr_recv doesn't change the lower bound.
Why is lower bound changing in this program?
At this stage, I believe this is a bug in gfortran affecting the MPI Fortran 2018 bindings (e.g. use mpi_f08) and I reported it at https://gcc.gnu.org/pipermail/fortran/2020-September/055068.html.
All gfortran versions are affected (I tried 9.2.0, 10.2.0 and the latest master branch, versions 8 and earlier do not support dimension(..).
The reproducer below can be used to evidence the issue
MODULE FOO
INTERFACE
SUBROUTINE dummyc(x0) BIND(C, name="sync")
type(*), dimension(..) :: x0
END SUBROUTINE
END INTERFACE
contains
SUBROUTINE dummy(x0)
type(*), dimension(..) :: x0
call dummyc(x0)
END SUBROUTINE
END MODULE
PROGRAM main
USE FOO
IMPLICIT NONE
integer :: before(2), after(2)
INTEGER, parameter :: n = 1
DOUBLE PRECISION, ALLOCATABLE :: buf(:)
DOUBLE PRECISION :: buf2(n)
ALLOCATE(buf(n))
before(1) = LBOUND(buf,1)
before(2) = UBOUND(buf,1)
CALL dummy (buf)
after(1) = LBOUND(buf,1)
after(2) = UBOUND(buf,1)
if (before(1) .NE. after(1)) stop 1
if (before(2) .NE. after(2)) stop 2
before(1) = LBOUND(buf2,1)
before(2) = UBOUND(buf2,1)
CALL dummy (buf2)
after(1) = LBOUND(buf2,1)
after(2) = LBOUND(buf2,1)
if (before(1) .NE. after(1)) stop 3
if (before(2) .NE. after(2)) stop 4
END PROGRAM
FWIW, Intel ifort compiler (I tried 18.0.5) works fine with the reproducer.
This question is a bit old, but I had the same issue on the last days with GNU Fortran (GCC) 11.2.0 when moving from use mpi to use mpi_f08 - but in my case, it was with MPI_Allreduce. It's clearly a bug on the MPI functions, but one workaround is to send the argument with the bounds defined, as var(1:end). This worked for me. In your case, you could try:
call MPI_Gatherv(arr_send(1:3), size(arr_send(1:3)), MPI_DOUBLE_PRECISION, arr_recv(1:3), [size(arr_send(1:3))], [0], MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)
I am writing a generic subroutine in fortran90 that will read in a column of data (real values). The subroutine should first check to see that the file exists and can be opened, then it determines the number of elements (Array_Size) in the column by reading the number of lines until end of file. Next the subroutine rewinds the file back to the beginning and reads in the data points and assigns each to an array (Column1(n)) and also determines the largest element in the array (Max_Value). The hope is that this subroutine can be written to be completely generic and not require any prior knowledge of the number of data points in the file, which is why the number of elements is first determined so the array, "Column1", can be dynamically allocated to contain "Array_Size" number of data points. Once the array is passed to the main program, it is transferred to another array and the initial dynamically allocated array is deallocated so that the routine can be repeated for multiple other input files, although this example only reads in one data file.
As written below, the program compiles just fine on the Intel fortran compiler; however, when it runs it gives me a severe (174): SIGSEV fault. I place the write(,) statements before and after the allocate statement in the subroutine and it prints the first statement "Program works here", but not the second, which indicates that the problem is occurring at the ALLOCATE (Column1(Array_Size)) statement, between the two write(,) statements. I re-compiled it with -C flag and ran the executable, which fails again and states severe (408): "Attempt to fetch from allocatable variable MISC_ARRAY when it is not allocated". The variable MISC_ARRAY is the dummy variable in the main program, which seems to indicate that the compiler wants the array allocated in the main program and not in the subprogram. If I statically allocate the array, the program works just fine. In order to make the program generic and not require any knowledge of the size of each file, it needs to be dynamically allocated and this should happen in the subprogram, not the main program. Is there a way to accomplish this that I am not seeing?
PROGRAM MAIN
IMPLICIT NONE
! - variable Definitions for MAIN program
INTEGER :: n
! - Variable Definitions for EXPENSE READER Subprograms
REAL, DIMENSION(:), ALLOCATABLE :: Misc_Array,MISC_DATA
INTEGER :: Size_Misc
REAL :: Peak_Misc_Value
! REAL :: Misc_Array(365)
CHARACTER(LEN=13) :: File_Name
File_Name = "Misc.txt"
CALL One_Column(File_Name,Size_Misc,Peak_Misc_Value,Misc_Array)
ALLOCATE (MISC_DATA(Size_Misc))
DO n = 1,Size_Misc ! Transfers array data
MISC_DATA(n) = Misc_Array(n)
END DO
DEALLOCATE (Misc_Array)
END PROGRAM MAIN
SUBROUTINE One_Column(File_Name,Array_Size,Max_Value,Column1)
IMPLICIT NONE
REAL, DIMENSION(:), ALLOCATABLE,INTENT(OUT) :: Column1
! REAL :: Column1(365)
REAL, INTENT(OUT) :: Max_Value
CHARACTER,INTENT(IN) :: File_Name*13
INTEGER, INTENT(OUT) :: Array_Size
INTEGER :: Open_Status,Input_Status,n
! Open the file and check to ensure it is properly opened
OPEN(UNIT=100,FILE = File_Name,STATUS = 'old',ACTION = 'READ', &
IOSTAT = Open_Status)
IF(Open_Status > 0) THEN
WRITE(*,'(A,A)') "**** Cannot Open ",File_Name
STOP
RETURN
END IF
! Determine the size of the file
Array_Size = 0
DO 300
READ(100,*,IOSTAT = Input_Status)
IF(Input_Status < 0) EXIT
Array_Size = Array_Size + 1
300 CONTINUE
REWIND(100)
WRITE(*,*) "Program works here"
ALLOCATE (Column1(Array_Size))
WRITE(*,*) "Program stops working here"
Max_Value = 0.0
DO n = 1,Array_Size
READ(100,*) Column1(n)
IF(Column1(n) .GT. Max_Value) Max_Value = Column1(n)
END DO
END SUBROUTINE One_Column
This is an educated guess: I think that the subroutine One_Column ought to have an explicit interface. As written the source code has 2 compilation units, a program (called main) and an external subroutine (called One_Column).
At compile-time the compiler can't figure out the correct way to call the subroutine from the program. In good-old (emphasis on old) Fortran style it takes a leap of faith and leaves it to the linker to find a subroutine with the right name and crosses its fingers (as it were) and hopes that the actual arguments match the dummy arguments at run-time. This approach won't work on subroutines returning allocated data structures.
For a simple fix move end program to the end of the source file, in the line vacated enter the keyword contains. The compiler will then take care of creating the necessary interface.
For a more scalable fix, put the subroutine into a module and use-associate it.
I think it is important to show the corrected code so that future users can read the question and also see the solution. I broke the subroutine into a series of smaller functions and one subroutine to keep the data as local as possible and implemented it into a module. The main program and module are attached. The main program includes a call to the functions twice, just to show that it can be used modularly to open multiple files.
PROGRAM MAIN
!
! - Author: Jonathan A. Webb
! - Date: December 11, 2014
! - Purpose: This code calls subprograms in module READ_COLUMNAR_FILE
! to determine the number of elements in an input file, the
! largest element in the input file and reads in the column of
! data as an allocatable array
!***************************************************************************
!***************************************************************************
!********************* **********************
!********************* VARIABLE DEFINITIONS **********************
!********************* **********************
!***************************************************************************
!***************************************************************************
USE READ_COLUMNAR_FILE
IMPLICIT NONE
CHARACTER(LEN=13) :: File_Name
INTEGER :: Size_Misc,Size_Bar,Unit_Number
REAL :: Peak_Misc_Value,Peak_Bar_Value
REAL, DIMENSION(:), ALLOCATABLE :: Misc_Array,Bar_Array
!***************************************************************************
!***************************************************************************
!********************* **********************
!********************* FILE READER BLOCK **********************
!********************* **********************
!***************************************************************************
!***************************************************************************
! - This section reads in data from all of the columnar input decks.
! User defines the input file name and number
File_Name = "Misc.txt"; Unit_Number = 100
! Determines the number of rows in the file
Size_Misc = File_Length(File_Name,Unit_Number)
! Yields the allocatable array and the largest element in the array
CALL Read_File(File_Name,Unit_Number,Misc_Array,Peak_Misc_Value)
File_Name = "Bar.txt"; Unit_Number = 100
Size_Bar = File_Length(File_Name,Unit_Number)
CALL Read_File(File_Name,Unit_Number,Bar_Array,Peak_Bar_Value)
END PROGRAM MAIN
MODULE READ_COLUMNAR_FILE
!***********************************************************************************
!***********************************************************************************
! ***
! Author: Jonathan A. Webb ***
! Purpose: Compilation of subprograms required to read in multi-column ***
! data files ***
! Drafted: December 11, 2014 ***
! ***
!***********************************************************************************
!***********************************************************************************
!
!-----------------------------------
! Public functions and subroutines for this module
!-----------------------------------
PUBLIC :: Read_File
PUBLIC :: File_Length
!-----------------------------------
! Private functions and subroutines for this module
!-----------------------------------
PRIVATE :: Check_File
!===============================================================================
CONTAINS
!===============================================================================
SUBROUTINE Check_File(Unit_Number,Open_Status,File_Name)
INTEGER,INTENT(IN) :: Unit_Number
CHARACTER(LEN=13), INTENT(IN) :: File_Name
INTEGER,INTENT(OUT) :: Open_Status
! Check to see if the file exists
OPEN(UNIT=Unit_Number,FILE = File_Name,STATUS='old',ACTION='read', &
IOSTAT = Open_Status)
IF(Open_Status .GT. 0) THEN
WRITE(*,*) "**** Cannot Open ", File_Name," ****"
STOP
RETURN
END IF
END SUBROUTINE Check_File
!===============================================================================
FUNCTION File_Length(File_Name,Unit_Number)
INTEGER :: File_Length
INTEGER, INTENT(IN) :: Unit_Number
CHARACTER(LEN=13),INTENT(IN) :: File_Name
INTEGER :: Open_Status,Input_Status
! Calls subroutine to check on status of file
CALL Check_File(Unit_Number,Open_Status,File_Name)
IF(Open_Status .GT. 0)THEN
WRITE(*,*) "**** Cannot Read", File_Name," ****"
STOP
RETURN
END IF
! Determine File Size
File_Length = 0
DO 300
READ(Unit_Number,*,IOSTAT = Input_Status)
IF(Input_Status .LT. 0) EXIT
File_Length = File_Length + 1
300 CONTINUE
CLOSE(Unit_Number)
END FUNCTION File_Length
!===============================================================================
SUBROUTINE Read_File(File_Name,Unit_Number,Column1,Max_Value)
INTEGER, INTENT(IN) :: Unit_Number
REAL, DIMENSION(:), ALLOCATABLE,INTENT(OUT) :: Column1
CHARACTER(LEN=13),INTENT(IN) :: File_Name
REAL, INTENT(OUT) :: Max_Value
INTEGER :: Array_Size,n
! Determines the array size and allocates the array
Array_Size = File_Length(File_Name,Unit_Number)
ALLOCATE (Column1(Array_Size))
! - Reads in columnar array and determines the element with
! the largest value
Max_Value = 0.0
OPEN(UNIT= Unit_Number,File = File_Name)
DO n = 1,Array_Size
READ(Unit_Number,*) Column1(n)
IF(Column1(n) .GT. Max_Value) Max_Value = Column1(n)
END DO
CLOSE(Unit_Number)
END SUBROUTINE Read_File
!===============================================================================
END MODULE READ_COLUMNAR_FILE
This is probably something really simple but I'm getting the error when compiling my little Fortran program. (The file is .f90) Is this something to do with fixed versus free line length? That seems to be all I could glean from a google search.
Here's the program:
program array
integer :: k, n, i, j, h, f, AllocateStatus
real*8, dimension(:, :, :), allocatable :: a
character, parameter :: "fname"
k = 5
n = 5
h = 1
allocate(a(n,k,h), stat = AllocateStatus)
if (AllocateStatus /= 0) stop "*** Not enough memory ***"
a(1,:,:) = 5
a(2,:,:) = 6
call writeArray(7,a,"testOutput")
deallocate(a)
end program array
subroutine writeArray(f,array,fname)
implicit none
integer :: f, i, j, k, n
character, parameter :: "fname"
real*8, dimension(:, :, :), allocatable :: array
open(unit = f, file="fname")
do, i=1,n
do, j=1,k
write(7,"(F5.2)") array(i,j,:)
if (j==k) write(7,"(A1)") "X"
enddo
enddo
!write(7,"(I5)") size(a)
close(f)
end subroutine writeArray
And the errors:
test.f90:4.29:
character, parameter :: "fname"
1
Error: Invalid character in name at (1)
test.f90:24.26:
character, parameter :: "fname"
1
Error: Invalid character in name at (1)
test.f90:21.35:
subroutine writeArray(f,array,fname)
1
Error: Symbol 'fname' at (1) has no IMPLICIT type
You cannot use quotation marks to denote an initialization. In your subroutine, you should have
CHARACTER(LEN=*) :: fname
in place of what you have there. You probably do not need the PARAMETER statement with the character declaration. The initialization of fname does not appear to be needed in the main program.
Another pair of things I noted in your code: (1) you don't need to declare array and ALLOCATABLE and (2) you ought to start file UNITs at values >= 10 because the single-digit numbers are occasionally associated with (reserved for?) standard out.
Another suggestion is that you should either put your writeArray subroutine in its own MODULE and USE it, or write the program as
PROGRAM Main
...
CONTAINS
SUBROUTINE writeArray
...
END SUBROUTINE
END PROGRAM
With either method, you will catch inconsistencies in the arguments. Not only that, you will also be able to use the variables n and k without issue.
I totally agree with #kyle. So in heeding those suggestions I would also declare the intent of the variables to the subroutine writeArray. Thus the program would be along the lines of:
program array
integer :: k, n, h, AllocateStatus
double precision, dimension(:, :, :), allocatable :: a
character(len=1024) :: fname
fname = "testOutput"
k = 5
n = 5
h = 1
allocate(a(n,k,h), stat = AllocateStatus)
if (AllocateStatus /= 0) stop "*** Not enough memory ***"
a(1,:,:) = 5
a(2,:,:) = 6
call writeArray(7,a,fname)
deallocate(a)
contains
subroutine writeArray(f,array,fname)
implicit none
integer, intent(in) :: f
integer :: i, j, k
character(len=*), intent(in) :: fname
double precision, dimension(:, :, :), intent(in) :: array
open(unit = f, file=fname)
i = size(array, 1)
k = size(array, 2)
do, i=1,n
do, j=1,k
write(7,"(F5.2)") array(i,j,:)..
if (j==k) write(7,"(A1)") "X"
enddo
enddo
!write(7,"(I5)") size(a)
close(f)
end subroutine writeArray
end program array
Also I don't like using real*8, I tend to either declare it as either real(kind=8) or double precision.
Lastly, depending on the compiler you use (and hence it's flags), Try to always be as pedantic and chatty as possible. For gfortran I typically use the options -Wall -pedantic when compiling.
Additional comments:
You definitely don't want parameter in the declaration of fname -- that designates that the "variable" is constant, which is inconsistent with a dummy argument.
You could declare the arguments as:
integer, intent (in) :: f
character (len=*), intent (in) :: fname
real*8, dimension(:, :, :), intent (in) :: array
The reason that you don't need to declare array as allocatable in the subroutine is that you don't change its allocation in the subroutine. You can obtain the values of n and k with the size intrinsic and so don't need to pass them as arguments.
I like to do this:
program main
implicit none
integer l
integer, allocatable, dimension(:) :: array
allocate(array(10))
array = 0
!$omp parallel do private(array)
do l = 1, 10
array(l) = l
enddo
!$omp end parallel do
print *, array
deallocate(array)
end
But I am running into error messages:
* glibc detected * ./a.out: munmap_chunk(): invalid pointer: 0x00007fff25d05a40 *
This seems to be a bug in ifort according to some discussions at intel forums but should be resolved in the version I am using (11.1.073 - Linux). This is a MASSIVE downscaled version of my code! I unfortunately can not use static arrays to have a workaround.
If I put the print into the loop, I get other errors:
* glibc detected ./a.out: double free or corruption (out): 0x00002b22a0c016f0 **
I didn't get the errors you're getting, but you have an issue with privatizing array in your OpenMP call.
[mjswartz#666-lgn testfiles]$ vi array.f90
[mjswartz#666-lgn testfiles]$ ifort -o array array.f90 -openmp
[mjswartz#666-lgn testfiles]$ ./array
0 0 0 0 0 0
0 0 0 0
[mjswartz#666-lgn testfiles]$ vi array.f90
[mjswartz#666-lgn testfiles]$ ifort -o array array.f90 -openmp
[mjswartz#666-lgn testfiles]$ ./array
1 2 3 4 5 6
7 8 9 10
First run is with private array, second is without.
program main
implicit none
integer l
integer, allocatable, dimension(:) :: array
allocate(array(10))
!$omp parallel do
do l = 1, 10
array(l) = l
enddo
print*, array
deallocate(array)
end program main
I just ran your code with ifort and openmp and it spewed 0d0's. I had to manually quit the execution. What is your expected output? I'm not a big fan of unnecessarily dynamically allocating arrays. You know what you're going to allocate your matrices as, so just make parameters and statically do it. I'll mess with some stuff and edit this response in a few.
Ok, so here's my edits:
program main
implicit none
integer :: l, j
integer, parameter :: lmax = 15e3
integer, parameter :: jmax = 25
integer, parameter :: nk = 300
complex*16, dimension(9*nk) :: x0, xin, xout
complex*16, dimension(lmax) :: e_pump, e_probe
complex*16 :: e_pumphlp, e_probehlp
character*25 :: problemtype
real*8 :: m
! OpenMP variables
integer :: myid, nthreads, omp_get_num_threads, omp_get_thread_num
x0 = 0.0d0
problemtype = 'type1'
if (problemtype .ne. 'type1') then
write(*,*) 'Problem type not specified. Quitting'
stop
else
! Spawn a parallel region explicitly scoping all variables
!$omp parallel
myid = omp_get_thread_num()
if (myid .eq. 0) then
nthreads = omp_get_num_threads()
write(*,*) 'Starting program with', nthreads, 'threads'
endif
!$omp do private(j,l,m,e_pumphlp,e_probehlp,e_pump,e_probe)
do j = 1, jmax - 1
do l = 1, lmax
call electricfield(0.0d0, 0.0d0, e_pumphlp, &
e_probehlp, 0.0d0)
! print *, e_pumphlp, e_probehlp
e_pump(l) = e_pumphlp
e_probe(l) = e_probehlp
print *, e_pump(l), e_probe(l)
end do
end do
!$omp end parallel
end if
end program main
Notice I removed your use of a module since it was unnecessary. You have an external module containing a subroutine, so just make it an external subroutine. Also, I changed your matrices to be statically allocated. Case statements are a fancy and expensive version of if statements. You were casing 15e3*25 times rather than once (expensive), so I moved those outside. I changed the OpenMP calls, but only semantically. I gave you some output so that you know what OpenMP is actually doing.
Here is the new subroutine:
subroutine electricfield(t, tdelay, e_pump, e_probe, phase)
implicit none
real*8, intent(in) :: t, tdelay
complex*16, intent(out) :: e_pump, e_probe
real*8, optional, intent (in) :: phase
e_pump = 0.0d0
e_probe = 0.0d0
return
end subroutine electricfield
I just removed the module shell around it and changed some of your variable names. Fortran is not case sensitive, so don't torture yourself by doing caps and having to repeat it throughout.
I compiled this with
ifort -o diffeq diffeq.f90 electricfield.f90 -openmp
and ran with
./diffeq > output
to catch the program vomiting 0's and to see how many threads I was using:
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
Starting program with 32 threads
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
Hope this helps!
It would appear that you are running into a compiler bug associated with the implementation of OpenMP 3.0.
If you can't update your compiler, then you will need to change your approach. There are a few options - for example you could make the allocatable arrays shared, increase their rank by one and have one thread allocate them such that the extent of the additional dimension is the number of workers in the team. All subsequent references to those arrays then need to be have the subscript for that additional rank be the omp team number (+ 1, depending on what you've used for the lower bound).
Explicit allocation of the private allocatable arrays inside the parallel construct (only) may also be an option.
My problem is that I don't know how to call subroutines when I use mpi scheme in Fortran.
I have written this small code named TRY.f90 in which there is a subroutine named CONCENTRATION.f90. How should I change CONCENTRATION.f90 in order to make the code works?
PROGRAM TRY
USE MPI
integer status(mpi_status_size)
INTEGER I, J, K, II, IERR, MY_ID, NUM_PROCS, PSP
INTEGER , PARAMETER :: GRIDX =64, GRIDY=64
REAL , DIMENSION(gridx,gridy) :: PSI
PSI=0
PRINT*, 'VARIABLE'
CALL MPI_INIT(IERR)
CALL MPI_COMM_RANK(MPI_COMM_WORLD,MY_ID,IERR)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NUM_PROCS,IERR)
CALL CONCENTRATION(GRIDX, GRIDY, NUM_PROCS, MY_ID , PSI)
IF (MY_ID .NE. 0) THEN
CALL mpi_send( PSI(1+MY_ID*GRIDX/NUM_PROCS:(MY_ID+1)*GRIDX/NUM_PROCS:1,1:GRIDY:1),&
(GRIDX/NUM_PROCS)*GRIDY,mpi_real, 0,10,mpi_comm_world,ierr)
END IF
IF (MY_ID .EQ. 0) THEN
DO II=1,NUM_PROCS-1
CALL mpi_recv(PSI(1+II*GRIDX/NUM_PROCS:(II+1)*GRIDX/NUM_PROCS:1,1:GRIDY:1),&
(GRIDX/NUM_PROCS)*GRIDY,mpi_real, &
II,10,mpi_comm_world,status,ierr)
END DO
END IF
CALL MPI_FINALIZE(IERR)
END PROGRAM TRY
I am using a subroutine named CONCENTRATION.f90 which is:
SUBROUTINE CONCENTRATION(GRIDX, GRIDY, NUM_PROCS, MY_ID , PSI)
implicit none
INTEGER*8, INTENT(IN) :: GRIDX, GRIDY
INTEGER , INTENT(IN) :: NUM_PROCS, MY_ID
REAL*8 , DIMENSION(GRIDX,GRIDY), INTENT(OUT) :: PSI
INTEGER*8 I, J
DO I=1+MY_ID*GRIDX/NUM_PROCS, (MY_ID+1)*GRIDX/NUM_PROCS
DO J=1,GRIDY
PSI(I,J)=2.0
END DO
END DO
END SUBROUTINE CONCENTRATION
The code currently gives me error since I think I should have made some changes on the subroutine CONCENTRATION.f90. Or I should also change the way I call the subroutine.
Could you please tell me what are those changes? Thanks for your helps in advance
Your program segfaults because of type mismatch. In the main program you have declared PSI as an array of REAL:
REAL , DIMENSION(gridx,gridy) :: PSI
while in the CONCENTRATION subroutine you use another type of REAL*8:
REAL*8 , DIMENSION(GRIDX,GRIDY), INTENT(OUT) :: PSI
By default REAL is 4 bytes long while REAL*8 (or DOUBLE PRECISION or REAL(KIND=8)) is 8 bytes long. So you are giving to CONCENTRATION an array that is 2 times smaller than what it believes to be and all ranks from NUM_PROCS/2 onwards write past the end of the PSI array and thus cause segfaults. If you run with one process only, then even rank 0 will segfault.
You should also read about MPI collective operations. MPI_GATHER and MPI_GATHERV do exactly what you are trying to achieve whith multiple sends and receives here.
The only change would be to declare concentration as reentrant. That could be the default for Fortran 90. (The bulk of my experience is F77, and reentrant is not the default there.)