OpenMP reductions inside subgrograms

OpenMP reductions inside subgrograms - fortran

The following Fortran code fails (random result), but replacing the call to mysum by abc=abc+1
gives the correct result. How to make OpenMP recognizing the reduction in a subprogram?
program reduc
use omp_lib
implicit none
integer :: abc=0, icount
call OMP_set_num_threads(8)
!$omp parallel private (icount) reduction(+:abc)
!$omp do
do icount = 1,8
!abc = abc + 1
call mysum(OMP_get_thread_num())
end do
!$omp end do
!$omp end parallel
print*,"abc at end: ",abc
contains
subroutine mysum(omp_rank)
integer :: omp_rank
abc = abc + 1
print*,"OMP rank: ", omp_rank, " abc: ", abc
endsubroutine mysum
end program reduc
I also tried to put !$omp threadprivate (abc) into mysum, which was rejected with
"Error: Symbol 'abc' at (1) has no IMPLICIT type.", which is of course not true.

Because of the reduction, the original variable abc is privatised in each thread, but abc in the contained subroutine always refers to the original, non privatised variable.
The solution is to pass it as an argument:
subroutine mysum(omp_rank,abc)
integer, intent(in) :: omp_rank
integer, intent(inout) :: abc
abc = abc + 1
print*,"OMP rank: ", omp_rank, " abc: ", abc
endsubroutine mysum

Related

Fortran OpenACC invoking a function on device using a function pointer

How can I access a function on device via a function pointer?
In below code I am trying to access init0 or init1 using function pointer init. The code does work as intended if OpenACC is not enabled during compilation. However, it fails when compiled with OpenACC. Below code is saved as stackOverflow2.f95:
module modTest2
use openacc
implicit none
type :: Container
sequence
integer :: n
integer, allocatable :: arr(:)
end type Container
interface Container
procedure :: new_Container
end interface
abstract interface
integer function function_template (i)
integer, intent (in) :: i
end function function_template
end interface
contains
type(Container) function new_Container(n)
integer, intent(in) :: n
allocate(new_Container%arr(n))
end function new_Container
end module modTest2
program test2
use modTest2
implicit none
integer :: n, x, i
type(Container) :: c
procedure(function_template), pointer :: init
print *, "Enter array size: "
read *, n
print *, "Allocating..."
c = Container(n)
print *, "Allocation complete!"
print *, "Enter initialization type (x): "
read *, x
print *, "Initializing..."
select case (x)
case (0)
init => init0
case default
init => init1
end select
!$acc data copyin(c) copyout(c%arr)
!$acc parallel loop present(c)
do i = 1, n
c%arr(i) = init(i)
end do
!$acc end data
print *, "Initialization complete..."
do i = 1, n
print *, i, c%arr(i)
end do
contains
integer function init0(i)
!$acc routine
integer, intent(in) :: i
init0 = 10*i
end function init0
integer function init1(i)
!$acc routine
integer, intent(in) :: i
init1 = 20*i
end function init1
end program test2
Correct output is seen without OpenACC:
$ gfortran -c stackOverflow2.f95
$ gfortran stackOverflow2.o -o a.out
$ ./a.out
Enter array size:
3
Allocating...
Allocation complete!
Enter initialization type (x):
0
Initializing...
Initialization complete...
1 10
2 20
3 30
Incorrect output is seen below with OpenACC (Note that NVIDIA compiler is used here):
$ /opt/nvidia/hpc_sdk/Linux_x86_64/22.1/compilers/bin/nvfortran stackOverflow2.f95 -acc; ./a.out
Enter array size:
3
Allocating...
Allocation complete!
Enter initialization type (x):
0
Initializing...
Initialization complete...
1 0
2 0
3 0

Sorry but function pointers (along with C++ virtual functions) are not yet supported on the device. Adding the compiler feedback flag (-Minfo=accel), you'll see the following message:
% nvfortran -acc -Minfo=accel test.f90
test2:
62, Generating copyout(c%arr(:)) [if not already present]
Generating copyin(c) [if not already present]
65, Accelerator restriction: Indirect function/procedure calls are not supported
The problem being that indirect functions require a device jump table and runtime dynamic linking which is currently unavailable. While I don't have a timeline, we are exploring options on how to offer this support in the future.

Using gfortran-11 with the below did the trick:
module modTest2
use openacc
implicit none
type :: Container
sequence
integer :: n
integer, allocatable :: arr(:)
end type Container
interface Container
procedure :: new_Container
end interface
abstract interface
integer function function_template (i)
integer, intent (in) :: i
end function function_template
end interface
contains
type(Container) function new_Container(n)
integer, intent(in) :: n
allocate(new_Container%arr(n))
end function new_Container
end module modTest2
program test2
use modTest2
implicit none
integer :: n, x, i
type(Container) :: c
procedure(function_template), pointer :: init
print *, "Enter array size: "
read *, n
print *, "Allocating..."
c = Container(n)
print *, "Allocation complete!"
print *, "Enter initialization type (x): "
read *, x
print *, "Initializing..."
select case (x)
case (0)
init => init0
case default
init => init1
end select
!$acc enter data copyin(c)
!$acc enter data create(c%arr)
!$acc parallel loop present(c)
do i = 1, n
c%arr(i) = init(i)
end do
!$acc exit data copyout(c%arr)
!$acc exit data delete(c)
print *, "Initialization complete..."
do i = 1, n
print *, i, c%arr(i)
end do
contains
integer function init0(i)
!$acc routine
integer, intent(in) :: i
init0 = 10*i
end function init0
integer function init1(i)
!$acc routine
integer, intent(in) :: i
init1 = 20*i
end function init1
end program test2
Here's the output:
$ gfortran-11 -fopenacc stackOverflow2.f95
$ gfortran-11 -fopenacc stackOverflow2.o -o stackOverflow2
$ ./stackOverflow2
Enter array size:
4
Allocating...
Allocation complete!
Enter initialization type (x):
0
Initializing...
Initialization complete...
1 10
2 20
3 30
4 40
$ ./stackOverflow2
Enter array size:
4
Allocating...
Allocation complete!
Enter initialization type (x):
9
Initializing...
Initialization complete...
1 20
2 40
3 60
4 80

OpenACC routine vector with intent out argument

I am currently accelerating a Fortran code where I have a main accelerated loop in subroutine sub. In the loop, I want to call subroutine subsub on the device with acc routine. The subroutine has an intent(out) argument val, which is private in the loop. As subsub has a loop itself, I want to use the vector clause:
module calc
implicit none
public :: sub
private
contains
subroutine sub()
integer :: i
integer :: array(10)
integer :: val
!$acc kernels loop independent private(val)
do i = 1, 10
call subsub(val)
array(i) = val
enddo
print "(10(i0, x))", array
endsubroutine
subroutine subsub(val)
!$acc routine vector
integer, intent(out) :: val
integer :: i
val = 0
!$acc loop independent reduction(+:val)
do i = 1, 10
val = val + 1
enddo
endsubroutine
endmodule
program test
use calc, only: sub
implicit none
call sub()
endprogram
When compiling with the PGI compiler version 20.9-0 and running the program, I get gibberish values in variable array. When I simply use acc routine for subsub, I get the correct behavior (10 in all values of array). What is wrong in my approach to parallelize this subroutine?

It does look like a compiler code generation issue on how val is getting handled in the main loop. Luckily the workaround is easy, just add the installation of val in the main loop.
% cat test.f90
module calc
implicit none
public :: sub
private
contains
subroutine sub()
integer :: i
integer :: array(10)
integer :: val
!$acc kernels loop independent private(val)
do i = 1, 10
val = 0
call subsub(val)
array(i) = val
enddo
print "(10(i0, x))", array
endsubroutine
subroutine subsub(val)
!$acc routine vector
integer, intent(out) :: val
integer :: i
val = 0
!$acc loop independent reduction(+:val)
do i = 1, 10
val = val + 1
enddo
endsubroutine
endmodule
program test
use calc, only: sub
implicit none
call sub()
endprogram
% nvfortran -acc -Minfo=accel test.f90 -V20.9 ; a.out
sub:
10, Generating implicit copyout(array(:)) [if not already present]
11, Loop is parallelizable
Generating Tesla code
11, !$acc loop gang ! blockidx%x
subsub:
18, Generating Tesla code
24, !$acc loop vector ! threadidx%x
Generating reduction(+:val)
Vector barrier inserted for vector loop reduction
24, Loop is parallelizable
10 10 10 10 10 10 10 10 10 10

Calling a function or subroutine

I'm quite new to fortran, i'm trying to execute a function/subroutine but i'm getting an error Explicit interface required
This is my code:
function printmat(m)
integer, dimension(:,:) :: m
integer :: row,col
row = size(m,1)
col = size(m,2)
do k=1,row
print *, m(k,1:col)
enddo
end function printmat
program test
integer, dimension(5, 5) :: mat
integer :: i,j
do i=1,5
do j=1,5
mat(j,i) = real(i)/real(j)
enddo
enddo
call printmat(mat)
end program test
But when i execute it i get:
Error: Explicit interface required for 'printmat' at (1): assumed-shape argument
Any idea of what could it be? I tried wrapping it into a module, but when i use "use modulename" in the program it gives me an error (tries to read it from a file with the same name)

Wrap it into a module and make it a subroutine if you want to use it with CALL.
module printmat_module
contains
subroutine printmat(m)
integer, dimension(:,:) :: m
integer :: row,col
row = size(m,1)
col = size(m,2)
do k=1,row
print *, m(k,1:col)
enddo
end subroutine printmat
end module printmat_module
program test
use printmat_module
integer, dimension(5, 5) :: mat
integer :: i,j
do i=1,5
do j=1,5
mat(j,i) = real(i)/real(j)
enddo
enddo
call printmat(mat)
end program test
Alternatively you can just do what the compiler tells you and add an explicit interface to the program.
subroutine printmat(m)
integer, dimension(:,:) :: m
integer :: row,col
row = size(m,1)
col = size(m,2)
do k=1,row
print *, m(k,1:col)
enddo
end subroutine printmat
program test
interface
subroutine printmat(m)
integer, dimension(:,:) :: m
end subroutine printmat
end interface
integer, dimension(5, 5) :: mat
integer :: i,j
do i=1,5
do j=1,5
mat(j,i) = real(i)/real(j)
enddo
enddo
call printmat(mat)
end program test

Modern Fortran: Calling an ancestor procedure from descendent

I am starting to use the OO features of Modern Fortran and am already familiar with OO in other languages. In Delphi (Object Pascal) it is common to call an ancestor version of a procedure in its overridden descendent procedure and there is even a language statement "inherited" that allows this. I can't find an equivalent Fortran construct - but am probably looking for the wrong thing. See simple example below. Any advice much appreciated.
type tClass
integer :: i
contains
procedure Clear => Clear_Class
end type tClass
type tSubClass
integer :: j
contains
procedure Clear => Clear_SubClass
end type tSubClass
subroutine Clear_Class
i = 0
end subroutine
subroutine Clear_SubClass
inherited Clear ! this is the Delphi way
j = 0
end subroutine

Here is some sample code that tries to implement the comment by #HighPerformanceMark (i.e., a child type has a hidden component that refers to the parent type).
module testmod
implicit none
type tClass
integer :: i = 123
contains
procedure :: Clear => Clear_Class
endtype
type, extends(tClass) :: tSubClass
integer :: j = 456
contains
procedure :: Clear => Clear_SubClass
endtype
contains
subroutine Clear_Class( this )
class(tClass) :: this
this % i = 0
end
subroutine Clear_SubClass( this )
class(tSubClass) :: this
this % j = 0
call this % tClass % Clear() !! (*) calling a method of the parent type
end
end
program main
use testmod
implicit none
type(tClass) :: foo
type(tSubClass) :: subfoo
print *, "foo (before) = ", foo
call foo % Clear()
print *, "foo (after) = ", foo
print *, "subfoo (before) = ", subfoo
call subfoo % Clear()
print *, "subfoo (after) = ", subfoo
end
which gives (with gfortran-8.2)
foo (before) = 123
foo (after) = 0
subfoo (before) = 123 456
subfoo (after) = 0 0
If we comment out the line marked by (*), subfoo % i is kept unmodified:
foo (before) = 123
foo (after) = 0
subfoo (before) = 123 456
subfoo (after) = 123 0

Fortran + OpenMP code with a subroutine stops abruptly

I have a piece of experimental code that works perfectly with serial compilation and execution. When I compile it with openmp option on ifort (on ubuntu), the compilation goes on fine but the execution stops abruptly. The code is as follows:
!!!!!!!! module
module array
implicit none
real(kind=8),allocatable :: y(:)
end module array
module nonarray
implicit none
real(kind=8):: aa
end module nonarray
use nonarray; use array
implicit none
integer(kind=8):: iter,i
integer(kind=8),parameter:: id=1
real(kind=8),allocatable:: yt(:)
allocate(y(id)); allocate(yt(id)); y=0.d0; yt=0.d0
aa=4.d0 !!A SYSTEM PARAMETER
!$OMP PARALLEL PRIVATE(y,yt,iter,i)
!$OMP DO
loop1: do iter=1,20 !! THE INITIAL CONDITION LOOP
call random_number(y)!! RANDOM INITIALIZATION OF THE VARIABLE
loop2: do i=1,10000 !! ITERATION OF THE SYSTEM
call evolve(yt)
y=yt
enddo loop2 !! END OF SYSTEM ITERATION
write(1,*)aa,yt
enddo loop1 !!INITIAL CONDITION ITERATION DONE
!$OMP ENDDO
!$OMP END PARALLEL
stop
end
recursive subroutine evolve(yevl)
use nonarray; use array
implicit none
integer(kind=8),parameter:: id=1
real(kind=8):: xf
real(kind=8),intent(out):: yevl(id)
xf=aa*y(1)*(1.d0-y(1))
yevl(1)=xf
end subroutine evolve
For compilation I use the following command:
ifort -openmp -fpp test.f90.
test.f90 being the name of the program.
Any suggestions or help is highly appreciated.

I am not an OMP expert, but I think if the subroutine evolve should see a different (private) y in each thread, you should pass it directly from within the parallelized code block to the subroutine instead of importing it from an external module:
module common
use iso_fortran_env
implicit none
integer, parameter :: dp = real64
real(dp) :: aa
contains
subroutine evolve(y, yevl)
implicit none
real(dp), intent(in) :: y(:)
real(dp), intent(out):: yevl(:)
yevl(1) = aa * y(1) * (1.0_dp - y(1))
end subroutine evolve
end module common
program test
use common
implicit none
integer :: iter, i
real(dp), allocatable :: yt(:), y(:)
allocate(yt(1), y(1))
y(:) = 0.0_dp
yt(:) = 0.0_dp
aa = 4.0_dp
!$OMP PARALLEL DO PRIVATE(y,yt,iter,i)
loop1: do iter = 1, 20
call random_number(y)
loop2: do i = 1, 10000
call evolve(y, yt)
y = yt
end do loop2
write(*,*) aa, yt
end do loop1
!$OMP END PARALLEL DO
end program test
Just an additional warning: the code above worked with various compilers (nagfor 5.3.1, gfortran 4.6.3, ifort 13.0.1), but not with ifort 12.1.6. So, although I can't see any obvious problems with it, I may have messed up something.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

OpenMP reductions inside subgrograms - fortran

Related

Fortran OpenACC invoking a function on device using a function pointer

OpenACC routine vector with intent out argument

Calling a function or subroutine

Modern Fortran: Calling an ancestor procedure from descendent

Fortran + OpenMP code with a subroutine stops abruptly

Categories

Resources