Using the routine directive on a Fortran90 intrinsic - fortran

I have given OpenAcc access to the subroutine called "lang_force", and within that subroutine I call the Fortran90 intrinsic RANDOM_NUMBER().
subroutine lang_force(damp, temp, noise)
!$acc routine(lang_force)
implicit none
double precision, intent(in) :: damp, temp
double precision, dimension(1:3), intent(out) :: noise
double precision :: kb, a1, theta, phi, mag1, pi,r,s
integer :: i,j,k
double precision :: x,y,z
kb = 1.3806E-23
!kb = 1.0
pi=4.D0*DATAN(1.D0)
call random_number(a1)
call random_number(r)
call random_number(s)
mag1 = sqrt(-2.0*log(a1))
theta = r*pi
phi = 2.0*s*pi
x = mag1*cos(phi)*sin(theta)
y = mag1*sin(theta)*sin(phi)
z = mag1*cos(theta)
noise(1) = sqrt(2.0*kb*temp*damp)*x
noise(2) = sqrt(2.0*kb*temp*damp)*y
noise(3) = sqrt(2.0*kb*temp*damp)*z
end subroutine lang_force
When compiled with the latest version of pdf90, it tells me that it needs access to RANDOM_NUMBER(). How do I declare a routine directive to such a fortran90 intrinsic subroutine?

Not all Fortran intrinsics are supported within device code including RANDOM_NUMBER.
RANDOM_NUMBER in particular is not thread safe since all threads would share the same state. Instead, you need to use cuRand for which we ship examples with the compilers under the "2020/examples/CUDA-Libraries/cuRAND/test_rand_oacc_ftn" directory.
While this is in C, I wrote about is in more detail on the NVIDIA user forums:
https://forums.developer.nvidia.com/t/random-numbers-on-device-generation/135748/2

Related

Dummy argument not agreeing with actual argument when passing function

I'm trying to implement Newton's method but I'm getting a confusing error message. In my code you'll see I called external with f1 and f2 which I assumes tells the computer to look for the function but it's treating them as variables based on the error message. I've read the stack overflow posts similar to my issue but none of the solutions seem to work. I've tried with and without the external but the issue still persists. Hoping someone could see what I'm missing.
implicit none
contains
subroutine solve(f1,f2,x0,n, EPSILON)
implicit none
real(kind = 2), external:: f1, f2
real (kind = 2), intent(in):: x0, EPSILON
real (kind = 2):: x
integer, intent(in):: n
integer:: iteration
x = x0
do while (abs(f1(x))>EPSILON)
iteration = iteration + 1
print*, iteration, x, f1(x)
x = x - (f1(x)/f2(x))
if (iteration >= n) then
print*, "No Convergence"
stop
end if
end do
print*, iteration, x
end subroutine solve
end module newton
Program Lab10
use newton
implicit none
integer, parameter :: n = 1000 ! maximum iteration
real(kind = 2), parameter :: EPSILON = 1.d-3
real(kind = 2):: x0, x
x0 = 3.0d0
call solve(f(x),fp(x),x0,n, EPSILON)
contains
real (kind = 2) function f(x) ! this is f(x)
implicit none
real (kind = 2), intent(in)::x
f = x**2.0d0-1.0d0
end function f
real (kind = 2) function fp(x) ! This is f'(x)
implicit none
real (kind = 2), intent(in)::x
fp = 2.0d0*x
end function fp
end program Lab10```
You seem to be passing function results to your subroutine and not the functions themselves. Remove (x) when calling solve() and the problem will be resolved. But more importantly, this code is a prime example of how to not program in Fortran. The attribute external is deprecated and you better provide an explicit interface. In addition, what is the meaning of kind = 2. Gfortran does not even comprehend it. Even if it comprehends the kind, it is not portable. Here is a correct portable modern implementation of the code,
module newton
use iso_fortran_env, only: RK => real64
implicit none
abstract interface
pure function f_proc(x) result(result)
import RK
real(RK), intent(in) :: x
real(RK) :: result
end function f_proc
end interface
contains
subroutine solve(f1,f2,x0,n, EPSILON)
procedure(f_proc) :: f1, f2
real(RK), intent(in) :: x0, EPSILON
integer, intent(in) :: n
real(RK) :: x
integer :: iteration
x = x0
do while (abs(f1(x))>EPSILON)
iteration = iteration + 1
print*, iteration, x, f1(x)
x = x - (f1(x)/f2(x))
if (iteration >= n) then
print*, "No Convergence"
stop
end if
end do
print*, iteration, x
end subroutine solve
end module newton
Program Lab10
use newton
integer, parameter :: n = 1000 ! maximum iteration
real(RK), parameter :: EPSILON = 1.e-3_RK
real(RK) :: x0, x
x0 = 3._RK
call solve(f,fp,x0,n, EPSILON)
contains
pure function f(x) result(result) ! this is f(x)
real (RK), intent(in) :: x
real (RK) :: result
result = x**2 - 1._RK
end function f
pure function fp(x) result(result) ! This is f'(x)
real (RK), intent(in) :: x
real (RK) :: result
result = 2 * x
end function fp
end program Lab10
If you expect to pass nonpure functions to the subroutine solve(), then remove the pure attribute. Note the use of real64 to declare 64-bit (double precision) real kind. Also notice how I have used _RK suffix to assign 64-bit precision to real constants. Also, notice I changed the exponents from real to integer as it is multiplication is more efficient than exponentiation computationally. I hope this answer serves more than merely the solution to Lab10.

How to calculate several different random numbers from normal dist? using fortran

I have to find 'n' random numbers from a normal distribution given the mean and standard deviation. I have figure out how to get a random number, but when try to loop it to get several different random numbers, it gives me the same number x amount of times?
program prac10
implicit none
real :: mu, sigma
integer :: i
!print*, "mean and stdev?"
!read*, mu, sigma
mu=1.1
sigma=0.1
do i=1, 2 # here is the part I think I am stuck on??
call normal(mu,sigma)
enddo
end program
subroutine normal(mu,sigma)
implicit none
integer :: i
integer :: n
real :: u, v, z, randnum
real :: mu, sigma
real :: pi=3.141593
call RANDOM_SEED()
call RANDOM_NUMBER(u)
call RANDOM_NUMBER(v)
z=sqrt(-2*log(u))*cos(2*pi*v)
randnum=mu+sigma*z
print*, randnum
end subroutine
particularly the part where I should be looping/repeating. I used from 1:2, replacing n with 2 for now so that I wouldn't have to input it every time I try to run it
The most important fact is that you must not call RANOM_SEED repeatedly. It is supposed to be called only once.
It is also good to modify the subroutine to generate the number and pass it further. Notice also the change of formatting to make it more readable and the change in the value of pi.
program prac10
implicit none
real :: mu, sigma, x
integer :: i
call RANDOM_SEED()
mu = 1.1
sigma = 0.1
do i = 1, 2
call normal(x,mu,sigma)
print *, x
end do
end program
subroutine normal(randnum,mu,sigma)
implicit none
integer :: i
integer :: n
real :: u, v, z, randnum
real :: mu, sigma
real :: pi = acos(-1.0)
call RANDOM_NUMBER(u)
call RANDOM_NUMBER(v)
z = sqrt(-2*log(u)) * cos(2*pi*v)
randnum = mu + sigma*z
end subroutine

Numerical integration of an array in 3d spherical polar

I want to integrate a 3d array over space in r, theta and phi (spherical polar). For 1d I use Simpson's 1/3rd rule but I am confused about that for 3d. Also, would you like to suggest any other method for integration or subroutine? I am using Fortran 95.
I have written the Fortran code for integration in 3d, I thought I should share with you.
The code for calculating integration of a function is 3 dimension is:
!This program uses Simpson's 1/3 method to calulate volume
integral in r,theta & phi.
program SimpsonInteg3d
implicit none
integer::i,j,k
integer, parameter :: N=10,M=360,L=180
integer, parameter:: rmin=0,rmax=N,phimin=0,phimax=M,&
thetamin=0,thetamax=L
double precision,&
dimension(rmin:rmax,thetamin:thetamax,phimin:phimax)::U
real*8, parameter :: pi = 4*atan(1.0),dr=1./N,&
dtheta=pi/(L),dphi=2*pi/M
real*8 :: r(rmin:rmax)=(/(i*dr,i=rmin,rmax)/),&
theta(thetamin:thetamax)=(/(j*dtheta,j=thetamin,thetamax)/),&
p(phimin:phimax)=(/(k*dphi,k=phimin,phimax)/)
real*8::intg
do i=rmin,rmax
do j=thetamin, thetamax
do k=phimin,phimax
!The function which has to be integrated.
U(i,j,k)=r(i)* (sin((p(k)))**2) *sin(theta(j))
enddo
enddo
enddo
call Integration(Intg,U,r,theta,p)
print*,"Integration of function U using simpson's 1/3=", Intg
end program
!===============================================================!
!Subroutine for calculating integral of a function in 3d.
subroutine Integration(Intg,U,r,theta,p)
implicit none
integer::i,j,k
integer, parameter :: N=10,M=360,L=180
integer, parameter ::rmin=0,rmax=N,&
phimin=0,phimax=M,thetamin=0,thetamax=L
double precision,&
dimension(rmin:rmax,thetamin:thetamax,phimin:phimax):: U
real*8::
r(rmin:rmax),theta(thetamin:thetamax),p(phimin:phimax),Intg,Ia
double precision,dimension(rmin:rmax)::Itheta
real*8, parameter :: pi = 4*atan(1.0),dr=1./N,&
dtheta=pi/(L),dphi=2*pi/M
Intg=0
Ia=0
do i=rmin+1,rmax-1
call Integtheta(Itheta,i,U,r,theta,p)
if(mod(i,2).eq.0) then
Ia = Ia + 2*Itheta(i)*r(i)**2
else
Ia = Ia + 4*Itheta(i)*r(i)**2
endif
end do
call Integtheta(Itheta,rmin,U,r,theta,p)
call Integtheta(Itheta,rmax,U,r,theta,p)
Intg=(dr/3)*(Itheta(rmin)+Itheta(rmax)+ Ia)
end subroutine Integration
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!Subroutine for calculating integral of U along theta and phi
subroutine Integtheta(Itheta,i,U,r,theta,p)
implicit none
integer::i,j,k
integer, parameter :: N=10,M=360,L=180
integer, parameter ::rmin=0,rmax=N,&
phimin=0,phimax=M,thetamin=0,thetamax=L
double precision,&
dimension(rmin:rmax,thetamin:thetamax,phimin:phimax)::U
real*8:: r(rmin:rmax),theta(thetamin:thetamax),p(phimin:phimax)
double precision,dimension(rmin:rmax)::Itheta,Itha
double precision,dimension(rmin:rmax,thetamin:thetamax)::Ip
real*8, parameter :: pi = 4*atan(1.0),dr=1./N,&
dtheta=pi/(L),dphi=2*pi/M
Itheta(i)=0
Itha(i)=0
do j=thetamin+1,thetamax-1
call Integphi(Ip,i,j,U,r,theta,p)
if(mod(j,2).eq.0) then
Itha(i) = Itha(i) + 2*Ip(i,j)*sin(theta(j))
else
Itha(i) = Itha(i) + 4*Ip(i,j)*sin(theta(j))
endif
end do
call Integphi(Ip,i,thetamin,U,r,theta,p)
call Integphi(Ip,i,thetamax,U,r,theta,p)
Itheta(i)=(dtheta/3)*(Ip(i,thetamin)+Ip(i,thetamax)+ Itha(i))
end subroutine Integtheta
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!Subroutine for calculating integral of U along phi
subroutine Integphi(Ip,i,j,U,r,theta,p)
implicit none
integer::i,j,k
integer, parameter :: N=10,M=360,L=180
integer, parameter ::rmin=0,rmax=N,&
phimin=0,phimax=M,thetamin=0,thetamax=L
double precision,&
dimension(rmin:rmax,thetamin:thetamax,phimin:phimax)::U
real*8:: r(rmin:rmax),theta(thetamin:thetamax),p(phimin:phimax)
double precision,dimension(rmin:rmax,thetamin:thetamax)::Ip,Ipa
real*8, parameter :: pi = 4*atan(1.0),dr=1./N,&
dtheta=pi/(L),dphi=2*pi/M
Ipa(i,j)=0
do k=phimin+1,phimax-1
if(mod(k,2).eq.0) then
Ipa(i,j) = Ipa(i,j) + 2*U(i,j,k)
else
Ipa(i,j)= Ipa(i,j) + 4*U(i,j,k)
endif
end do
Ip(i,j)=(dphi/3)*(U(i,j,phimin)+U(i,j,phimax)+ Ipa(i,j))
end subroutine Integphi
It calculates the integration of the function U along phi first and then uses the function Ip to calculate integral along theta. Then finally the function Itheta is used to calculate integration along r.

Using MKL to solve a non-linear system of equations with an objective function stored in another module

I'm trying to use the MKL trust region algorithm to solve a nonlinear system of equations in a Fortran program. I started from the example provided online (ex_nlsqp_f90_x.f90 https://software.intel.com/en-us/node/501498) and everything works correctly. Now, because I have to use this in a much bigger program, I need the user defined objective function to be loaded from a separate module. Hence, I split the example into 2 separate files, but I'm not able to make it compile correctly.
So here is the code for module which contains user defined data structure and the objective function
module modFun
implicit none
private
public my_data, extended_powell
type :: my_data
integer a
integer sum
end type my_data
contains
subroutine extended_powell (m, n, x, f, user_data)
implicit none
integer, intent(in) :: m, n
real*8 , intent(in) :: x(n)
real*8, intent(out) :: f(m)
type(my_data) :: user_data
integer i
user_data%sum = user_data%sum + user_data%a
do i = 1, n/4
f(4*(i-1)+1) = x(4*(i-1)+1) + 10.0 * x(4*(i-1)+2)
f(4*(i-1)+2) = 2.2360679774998 * (x(4*(i-1)+3) - x(4*(i-1)+4))
f(4*(i-1)+3) = ( x(4*(i-1)+2) - 2.0 * x(4*(i-1)+3) )**2
f(4*(i-1)+4) = 3.1622776601684 * (x(4*(i-1)+1) - x(4*(i-1)+4))**2
end do
end subroutine extended_powell
end module modFun
and here the portion of the main program calling it
include 'mkl_rci.f90'
program EXAMPLE_EX_NLSQP_F90_X
use MKL_RCI
use MKL_RCI_type
use modFun
! user's objective function
! n - number of function variables
! m - dimension of function value
integer n, m
parameter (n = 4)
parameter (m = 4)
! precisions for stop-criteria (see manual for more details)
real*8 eps(6)
real*8 x(n)
real*8 fjac(m*n)
! number of iterations
integer fun
! Additional users data
type(my_data) :: m_data
m_data%a = 1
m_data%sum = 0
rs = 0.0
fun = djacobix(extended_powell,n,m,fjac,x,eps(1),%val(loc(m_data)))
end program EXAMPLE_EX_NLSQP_F90_X
Also djacobix code
INTERFACE
INTEGER FUNCTION DJACOBIX(fcn, n, m, fjac, x, eps, user_data)
USE, INTRINSIC :: ISO_C_BINDING
INTEGER, INTENT(IN) :: n
INTEGER, INTENT(IN) :: m
DOUBLE PRECISION, INTENT(IN) :: eps
DOUBLE PRECISION, INTENT(IN), DIMENSION(*) :: x
DOUBLE PRECISION, INTENT(OUT), DIMENSION(m, *) :: fjac
INTEGER(C_INTPTR_T) :: user_data
INTERFACE
SUBROUTINE fcn(m, n, x, f, user_data)
USE, INTRINSIC :: ISO_C_BINDING
INTEGER, INTENT(IN) :: n
INTEGER, INTENT(IN) :: m
DOUBLE PRECISION, INTENT(IN), DIMENSION(*) :: x
DOUBLE PRECISION, INTENT(OUT), DIMENSION(*) :: f
INTEGER(C_INTPTR_T), INTENT(IN) :: user_data
END SUBROUTINE
END INTERFACE
END FUNCTION
END INTERFACE
When i compile the following errors are generated:
mpiifort -g -t -mkl -I/apps/rhel6/intel/composer_xe_2015.3.187/mkl/include/intel64/lp64 -c modFun.f90
mpiifort -g -t -mkl -I/apps/rhel6/intel/composer_xe_2015.3.187/mkl/include/intel64/lp64 -c main.f90
main.f90(30): error #7065: The characteristics of dummy argument 5 of the associated actual procedure differ from the characteristics of dummy argument 5 of the dummy procedure. [EXTENDED_POWELL]
fun = djacobix(extended_powell,n,m,fjac,x,eps(1),%val(loc(m_data)))
-------------------^
I have the feeling I have to create an interface to override the check on the m_data, but I can't figure out where and how. Can anyone help me with this problem providing a working example?
I guess the reason is that the function djacobix passes the pointer instead of the true value of variable user_data.
You can check the manual at https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/nonlinear-optimization-problem-solvers/jacobian-matrix-calculation-routines/jacobix.html where a sentence shows that "You need to declare fcn as extern in the calling program."

OpenBLAS slower than intrinsic function dot_product

I need make a dot product in Fortran. I can do with the intrinsic function dot_product from Fortran or use ddot from OpenBLAS. The problem is the ddot is slower. This is my code:
With BLAS:
program VectorBLAS
! time VectorBlas.e = 0.30s
implicit none
double precision, dimension(3) :: b
double precision :: result
double precision, external :: ddot
integer, parameter :: LargeInt_K = selected_int_kind (18)
integer (kind=LargeInt_K) :: I
DO I = 1, 10000000
b(:) = 3
result = ddot(3, b, 1, b, 1)
END DO
end program VectorBLAS
With dot_product
program VectorModule
! time VectorModule.e = 0.19s
implicit none
double precision, dimension (3) :: b
double precision :: result
integer, parameter :: LargeInt_K = selected_int_kind (18)
integer (kind=LargeInt_K) :: I
DO I = 1, 10000000
b(:) = 3
result = dot_product(b, b)
END DO
end program VectorModule
The two codes are compiled using:
gfortran file_name.f90 -lblas -o file_name.e
What am I doing wrong? BLAS not have to be faster?
While BLAS, and especially the optimized versions, are generally faster for larger arrays, the built-in functions are faster for smaller sizes.
This is especially visible from the linked source code of ddot, where additional work is spent on further functionality (e.g., different increments). For small array lengths, the work done here outweighs the performance gain of the optimizations.
If you make your vectors (much) larger, the optimized version should be faster.
Here is an example to illustrate this:
program test
use, intrinsic :: ISO_Fortran_env, only: REAL64
implicit none
integer :: t1, t2, rate, ttot1, ttot2, i
real(REAL64), allocatable :: a(:),b(:),c(:)
real(REAL64), external :: ddot
allocate( a(100000), b(100000), c(100000) )
call system_clock(count_rate=rate)
ttot1 = 0 ; ttot2 = 0
do i=1,1000
call random_number(a)
call random_number(b)
call system_clock(t1)
c = dot_product(a,b)
call system_clock(t2)
ttot1 = ttot1 + t2 - t1
call system_clock(t1)
c = ddot(100000,a,1,b,1)
call system_clock(t2)
ttot2 = ttot2 + t2 - t1
enddo
print *,'dot_product: ', real(ttot1)/real(rate)
print *,'BLAS, ddot: ', real(ttot2)/real(rate)
end program
The BLAS routines are quite a bit faster here:
OMP_NUM_THREADS=1 ./a.out
dot_product: 0.145999998
BLAS, ddot: 0.100000001