Related
In Newton's method, to solve a nonlinear system of equations we need to find the Jacobian matrix and the determinant of the inverse of the Jacobian matrix.
Here are my component functions,
real function f1(x,y)
parameter (pi = 3.141592653589793)
f1 = log(abs(x-y**2)) - sin(x*y) - sin(pi)
end function f1
real function f2(x,y)
f2 = exp(x*y) + cos(x-y) - 2
end function f2
For the 2x2 case I am computing the Jacobian matrix and determinant of the inverse of Jacobian matrix like this,
x = [2,2]
h = 0.00001
.
.
! calculate approximate partial derivative
! you can make it more accurate by reducing the
! value of h
j11 = (f1(x(1)+h,x(2))-f1(x(1),x(2)))/h
j12 = (f1(x(1),x(2)+h)-f1(x(1),x(2)))/h
j21 = (f2(x(1)+h,x(2))-f2(x(1),x(2)))/h
j22 = (f2(x(1),x(2)+h)-f2(x(1),x(2)))/h
! calculate the Jacobian
J(1,:) = [j11,j12]
J(2,:) = [j21,j22]
! calculate inverse Jacobian
inv_J(1,:) = [J(2,2),-J(1,2)]
inv_J(2,:) = [-J(2,1),J(1,1)]
DET=J(1,1)*J(2,2) - J(1,2)*J(2,1)
inv_J = inv_J/DET
.
.
How do I in Fortran extend this to evaluate a Jacobian for m functions evaluated at n points?
Here is a more flexible jacobian calculator.
The results with the 2×2 test case are what you expect
arguments (x)
2.00000000000000
2.00000000000000
values (y)
1.44994967586787
53.5981500331442
Jacobian
0.807287239448229 3.30728724371454
109.196300248300 109.196300248300
I check the results against a symbolic calculation for the given inputs of
Console.f90
program Console1
use ISO_FORTRAN_ENV
implicit none
! Variables
integer, parameter :: wp = real64
real(wp), parameter :: pi = 3.141592653589793d0
! Interfaces
interface
function fun(x,n,m) result(y)
import
integer, intent(in) :: n,m
real(wp), intent(in) :: x(m)
real(wp) :: y(n)
end function
end interface
real(wp) :: h
real(wp), allocatable :: x(:), y(:), J(:,:)
! Body of Console1
x = [2d0, 2d0]
h = 0.0001d0
print *, "arguments"
print *, x(1)
print *, x(2)
y = test(x,2,2)
print *, "values"
print *, y(1)
print *, y(2)
J = jacobian(test,x,2,h)
print *, "Jacobian"
print *, J(1,:)
print *, J(2,:)
contains
function test(x,n,m) result(y)
! Test case per original question
integer, intent(in) :: n,m
real(wp), intent(in) :: x(m)
real(wp) :: y(n)
y(1) = log(abs(x(1)-x(2)**2)) - sin(x(1)*x(2)) - sin(pi)
y(2) = exp(x(1)*x(2)) + cos(x(1)-x(2)) - 2
end function
function jacobian(f,x,n,h) result(u)
procedure(fun), pointer, intent(in) :: f
real(wp), allocatable, intent(in) :: x(:)
integer, intent(in) :: n
real(wp), intent(in) :: h
real(wp), allocatable :: u(:,:)
integer :: j, m
real(wp), allocatable :: y1(:), y2(:), e(:)
m = size(x)
allocate(u(n,m))
do j=1, m
e = element(j, m) ! Get kronecker delta for j-th value
y1 = f(x-e*h/2,n,m)
y2 = f(x+e*h/2,n,m)
u(:,j) = (y2-y1)/h ! Finite difference for each column
end do
end function
function element(i,n) result(e)
! Kronecker delta vector. All zeros, except the i-th value.
integer, intent(in) :: i, n
real(wp) :: e(n)
e(:) = 0d0
e(i) = 1d0
end function
end program Console1
I will answer about evaluation in different points. This is quite simple. You just need an array of points, and if the points are in some regular grid, you may not even need that.
You may have an array of xs and array of ys or you can have an array of derived datatype with x and y components.
For the former:
real, allocatable :: x(:), y(:)
x = [... !probably read from some data file
y = [...
do i = 1, size(x)
J(i) = Jacobian(f, x(i), y(i))
end do
If you want to have many functions at the same time, the problem is always in representing functions. Even if you have an array of function pointers, you need to code them manually. A different approach is to have a full algebra module, where you enter some string representing a function and you can evaluate such function and even compute derivatives symbolically. That requires a parser, an evaluator, it is a large task. There are libraries for this. Evaluation of such a derivative will be slow unless further optimizing steps (compiling to machine code) are undertaken.
Numerical evaluation of the derivative is certainly possible. It will slow the convergence somewhat, depending on the order of the approximation of the derivative. You do a difference of two points for the numerical derivative. You can make an interpolating polynomial from values in multiple points to get a higher-order approximation (finite difference approximations), but that costs machine cycles.
Normally we can use auto difference tools as #John Alexiou mentioned. However in practise I prefer using MATLAB to analytically solve out the Jacobian and then use its build-in function fortran() to convert the result to a f90 file. Take your function as an example. Just type these into MATLAB
syms x y
Fval=sym(zeros(2,1));
Fval(1)=log(abs(x-y^2)) - sin(x*y) - sin(pi);
Fval(2)=exp(x*y) + cos(x-y) - 2;
X=[x;y];
Fjac=jacobian(Fval,X);
fortran(Fjac)
which will yield
Fjac(1,1) = -y*cos(x*y)-((-(x-y**2)/abs(-x+y**2)))/abs(-x+y**2)
Fjac(1,2) = -x*cos(x*y)+(y*((-(x-y**2)/abs(-x+y**2)))*2.0D0)/abs(-
&x+y**2)
Fjac(2,1) = -sin(x-y)+y*exp(x*y)
Fjac(2,2) = sin(x-y)+x*exp(x*y)
to you. You just get an analytical Jacobian fortran function.
Meanwhile, it is impossible to solve the inverse of a mxn matrix because of rank mismatching. You should simplify the system of equations to get a nxn Jacobin.
Additionally, when we use Newton-Raphson's method we do not solve the inverse of the Jacobin which is time-consuming and inaccurate for a large system. An easy way is to use dgesv in LAPACK for dense Jacobin. As we only need to solve the vector x from system of linear equations
Jx=-F
dgesv use LU decomposition and Gaussian elimination to solve above system of equations which is extremely faster than solving inverse matrix.
If the system of equations is large, you can use UMFPACK and its fortran interface module mUMFPACK to solve the system of equations in which J is a sparse matrix. Or use subroutine ILUD and LUSOL in a wide-spread sparse matrix library SPARSEKIT2.
In addition to these, there are tons of other methods which try to solve the Jx=-F faster and more accurate such as Generalized Minimal Residual (GMRES) and Stabilized Bi-Conjugate Gradient (BICGSTAB) which is a strand of literature.
I'm trying to implement Newton's method but I'm getting a confusing error message. In my code you'll see I called external with f1 and f2 which I assumes tells the computer to look for the function but it's treating them as variables based on the error message. I've read the stack overflow posts similar to my issue but none of the solutions seem to work. I've tried with and without the external but the issue still persists. Hoping someone could see what I'm missing.
implicit none
contains
subroutine solve(f1,f2,x0,n, EPSILON)
implicit none
real(kind = 2), external:: f1, f2
real (kind = 2), intent(in):: x0, EPSILON
real (kind = 2):: x
integer, intent(in):: n
integer:: iteration
x = x0
do while (abs(f1(x))>EPSILON)
iteration = iteration + 1
print*, iteration, x, f1(x)
x = x - (f1(x)/f2(x))
if (iteration >= n) then
print*, "No Convergence"
stop
end if
end do
print*, iteration, x
end subroutine solve
end module newton
Program Lab10
use newton
implicit none
integer, parameter :: n = 1000 ! maximum iteration
real(kind = 2), parameter :: EPSILON = 1.d-3
real(kind = 2):: x0, x
x0 = 3.0d0
call solve(f(x),fp(x),x0,n, EPSILON)
contains
real (kind = 2) function f(x) ! this is f(x)
implicit none
real (kind = 2), intent(in)::x
f = x**2.0d0-1.0d0
end function f
real (kind = 2) function fp(x) ! This is f'(x)
implicit none
real (kind = 2), intent(in)::x
fp = 2.0d0*x
end function fp
end program Lab10```
You seem to be passing function results to your subroutine and not the functions themselves. Remove (x) when calling solve() and the problem will be resolved. But more importantly, this code is a prime example of how to not program in Fortran. The attribute external is deprecated and you better provide an explicit interface. In addition, what is the meaning of kind = 2. Gfortran does not even comprehend it. Even if it comprehends the kind, it is not portable. Here is a correct portable modern implementation of the code,
module newton
use iso_fortran_env, only: RK => real64
implicit none
abstract interface
pure function f_proc(x) result(result)
import RK
real(RK), intent(in) :: x
real(RK) :: result
end function f_proc
end interface
contains
subroutine solve(f1,f2,x0,n, EPSILON)
procedure(f_proc) :: f1, f2
real(RK), intent(in) :: x0, EPSILON
integer, intent(in) :: n
real(RK) :: x
integer :: iteration
x = x0
do while (abs(f1(x))>EPSILON)
iteration = iteration + 1
print*, iteration, x, f1(x)
x = x - (f1(x)/f2(x))
if (iteration >= n) then
print*, "No Convergence"
stop
end if
end do
print*, iteration, x
end subroutine solve
end module newton
Program Lab10
use newton
integer, parameter :: n = 1000 ! maximum iteration
real(RK), parameter :: EPSILON = 1.e-3_RK
real(RK) :: x0, x
x0 = 3._RK
call solve(f,fp,x0,n, EPSILON)
contains
pure function f(x) result(result) ! this is f(x)
real (RK), intent(in) :: x
real (RK) :: result
result = x**2 - 1._RK
end function f
pure function fp(x) result(result) ! This is f'(x)
real (RK), intent(in) :: x
real (RK) :: result
result = 2 * x
end function fp
end program Lab10
If you expect to pass nonpure functions to the subroutine solve(), then remove the pure attribute. Note the use of real64 to declare 64-bit (double precision) real kind. Also notice how I have used _RK suffix to assign 64-bit precision to real constants. Also, notice I changed the exponents from real to integer as it is multiplication is more efficient than exponentiation computationally. I hope this answer serves more than merely the solution to Lab10.
I want to calculate z value as the coordinate in range of x:-50~50 and y:-50~50 like below code.
program test
implicit none
! --- [local entities]
real*8 :: rrr,th,U0,amp,alp,Ndiv
real*8 :: pi,alpR,NR,Rmin,Rmax,z
integer :: ir, i, j
do i=0, 50
do j=0, 50
th=datan2(i,j)
pi=datan(1.d0)*4.d0
!
Ndiv= 24.d0 !! Number of circumferential division
alp = 90.d0/180.d0*pi !! phase [rad]
U0 = 11.4d0 !! average velocity
amp = 0.5d0 !! amplitude of velocity
Rmin = 10 !! [m]
Rmax = 50 !! [m]
NR = 6.d0 !! Number of radial division
!
rrr=dsqrt(i**2+j**2)
ir=int((rrr-Rmin)/(Rmax-Rmin)*NR)
alpR=2.d0*pi/dble(Ndiv)*dble(mod(ir,2))
z=U0*(1.d0+amp*dsin(0.5d0*Ndiv*th+alp+alpR))
write(*,*) 'i, j, z'
write(*,*) i, j, z
end do
end do
stop
end program test
But I couldn't make it work like below error. I think because i, j are in datan(i,j). How should I change these code?
test.f90:10.16:
th=datan2(i,j)
1
Error: 'y' argument of 'datan2' intrinsic at (1) must be REAL
test.f90:21.16:
rrr=dsqrt(i**2+j**2)
1
Error: 'x' argument of 'dsqrt' intrinsic at (1) must be REAL
Inspired by the comments of #Rodrigo Rodrigues, #Ian Bush, and #Richard, here is a suggested rewrite of the code segment from #SW. Kim
program test
use, intrinsic :: iso_fortran_env, only : real64
implicit none
! --- [local entities]
! Determine the kind of your real variables (select one):
! for specifying a given numerical precision
integer, parameter :: wp = selected_real_kind(15, 307) !15 digits, 10**307 range
! for specifying a given number of bits
! integer, parameter :: wp = real64
real(kind=wp), parameter :: pi = atan(1._wp)*4._wp
real(kind=wp) :: rrr, th, U0, amp, alp, Ndiv
real(kind=wp) :: alpR, NR, Rmin, Rmax, z
integer :: ir, i, j
do i = 0, 50
do j = 0, 50
th = atan2(real(i, kind=wp), real(j, kind=wp))
!
Ndiv= 24._wp !! Number of circumferential division
alp = 90._wp/180._wp*pi !! phase [rad]
U0 = 11.4_wp !! average velocity
amp = 0.5_wp !! amplitude of velocity
Rmin = 10 !! [m]
Rmax = 50 !! [m]
NR = 6._wp !! Number of radial division
!
rrr = sqrt(real(i, kind=wp)**2 + real(j, kind=wp)**2)
ir = int((rrr - Rmin) / (Rmax - Rmin) * NR)
alpR = 2._wp * pi / Ndiv * mod(ir, 2)
z = U0 * (1._wp + amp * sin(0.5_wp * Ndiv * th + alp + alpR))
!
write(*,*) 'i, j, z'
write(*,*) i, j, z
end do
end do
stop
end program test
Specifically, the following changes were made with respect to the original code posted:
Minimum change to answer the question: casting integer variables i and j to real values for using them in the real valued functions datan and dsqrt.
Using generic names for intrinsic procedures, i.e sqrt instead of dsqrt, atan instead of datan, and sin instead of dsin. One benefit of this approach, is that the kind of working precision wp can be changed in one place, without requiring explicit changes elsewhere in the code.
Defining the kind of real variables and calling it wp. Extended discussion of this topic, its implications and consequences can be found on this site, for example here and here. Also #Steve Lionel has an in depth post on his blog, where his general advice is to use selected_real_kind.
Defining pi as a parameter calculating its value once, instead of calculating the same value repeatedly within the nested for loops.
I am attempting to write a program that calculates the discrete fourier transform of a set of given data. I've sampled a sine wave, so my set is (pi/2,2*pi,3*pi/2,2*pi). Here is my program:
program DFT
implicit none
integer :: k, N, x, y, j, r, l
integer, parameter :: dp = selected_real_kind(15,300)
real, allocatable,dimension(:) :: h, rst
integer, dimension(:,:), allocatable :: W
real(kind=dp) :: pi
open(unit=100, file="dft.dat",status='replace')
N = 4
allocate(h(N))
allocate(rst(N))
allocate(W(-N/2:N/2,1:N))
pi = 3.14159265359
do k=1,N
h(k) = k*(pi*0.5)
end do
do j = -N/2,N/2
do k = 1, N
W(j,k) = EXP((2.0_dp*pi*cmplx(0.0_dp,1.0_dp)*j*k)/N)
end do
end do
rst = matmul(W,h)
!print *, h, w
write(100,*) rst
end program
And this prints out the array rst as:
0.00000000 0.00000000 15.7079639 0.00000000 0.00000000
Using an online calculator, the results should be:
15.7+0j -3.14+3.14j -3.14+0j -3.14-3.14j
I'm not sure why rst is 1 entry too long either.
Can anyone spot why it's printing out 0 for 3/4 of the results? I notice that 15.7 appears in both the actual answers and my result.
Thank you
Even though the question has been answered and accepted, the program given has so many problems that I had to say...
The input given is not a sine wave, it's a linear function of time. Kind of like a 1-based ramp input.
For DFTs the indices normally are considered to go from 0:N-1, not 1:N.
For W the Nyquist frequency is represented twice, as -N/2 and N/2. Again it would have been normal to number the rows 0:N-1, BTW, this is why you have an extra output in your rst vector.
pi is double precision but only initialized to 12 significant figures. It's hard to tell if there's a typo in your value of pi which is why many would use 4*atan(1.0_dp) or acos(-1.0_dp).
Notice that h(N) is actually going to end up as the zero time input, which is one reason the whole world indices DFT vectors from zero.
The expression cmplx(0.0_dp,1.0_dp) is sort of futile because the CMPLX intrinsic always returns a single precision result if the third optional KIND= argument is not present. As a complex literal, (0.0_dp,1.0_dp) would be double precision. However, you could as well use (0,1) because it's exactly representable in single precision and would be converted to double precision when it gets multiplied by the growing product on its left. Also 2.0_dp could have been represented successfully as 2 with less clutter.
The expression EXP((2.0_dp*pi*cmplx(0.0_dp,1.0_dp)*j*k)/N) is appropriate for inverse DFT, disregarding normalization. Thus I would have written the whole thing more cleanly and correctly as EXP(-2*pi*(0,1)*j*k/N). Then the output should have been directly comparable to what the online calculator printed out.
Fortran does complex numbers for you but you must declare the appropriate variables as complex. Try
complex, allocatable,dimension(:) :: rst
complex, dimension(:,:), allocatable :: W
The gfortran compiler gives wrong answer, when I run a parallel program using OpenMP. In the same time, ifort gives exact result.
This is the whole compilable code.
!_______________________________________________________________ !
!____________MODULE SECTION_____________________________________ !
MODULE MATRIC
IMPLICIT NONE
INTEGER , PARAMETER :: NG = 40
DOUBLE PRECISION,SAVE :: Z , PA , PB ,CMU
DOUBLE PRECISION , PARAMETER :: PI=2.0D0*ACOS(0.0D0) , &
FPI=4.0D0*PI , SQFPI = SQRT(FPI), DLAM=1.0D0
DOUBLE PRECISION , DIMENSION(450) :: DEL1, DEL2, X, R , SNLO
DOUBLE PRECISION :: XG(60) , WG(60)
END MODULE MATRIC
!_________________________________________________________________________!
! MODULE SECTION
!__________________________________________________________________________!
MODULE POTDATA
IMPLICIT NONE
INTEGER :: IA , IB , ID
DOUBLE PRECISION :: RA , RB , R1s(450)
END MODULE POTDATA
!__________________________________________________________________________!
!______________________________________________________________________!
program check
use matric
use potdata
implicit double precision(a-h,o-z)
pa = 0.72D0 ; pb = 0.19D0
mesh = 441 ; noint= 40 ; z = 2.0d0
CALL GAULEG(-1.d0,1.d0)
NB = MESH/NOINT
I = 1
X(I) = 0.0D+00
DELTAX = 0.0025D+00*40.0D+00/DBLE(NOINT)
DO J=1,NB
IMK = (J-1)*NOINT + 1
DO K=1,NOINT
AK=K
I=I+1
X(I)=X(IMK)+AK*DELTAX
END DO
DELTAX=2.0D+00*DELTAX
END DO
CMU=(9.0D00*PI*PI/(128.0D00*Z))**THIRD
R(1)=0.0D+00 ; SNLO(1) = 0.D00
DO I=2,MESH
R(I)=CMU*X(I)
SNLO(I) = R(I)*dexp(-Z*R(I))
R1S(I) = SNLO(I)/(SQFPI*R(I))
END DO
call EFFPOT(MESH,NOINT)
end program check
subroutine EFFPOT(MESH,NOINT)
USE OMP_LIB
USE MATRIC
USE POTDATA
implicit none
integer, intent(in) :: MESH, NOINT
double precision::anorm(450)
double precision, external :: funct
double precision :: asum, fac, cnorm
!$omp parallel do default(none) private(del1,ia,asum,ib,ra,rb,fac) &
!$omp shared(id,mesh,r,anorm,NOINT,del2,R1s)
do ia = 2,mesh
ra = r(ia)
if(R1s(ia).lt.1.D-7.and.R1s(ia).ge.1.D-8)id = ia
do ib = 2,mesh
rb = r(ib)
call QGAUSS(funct,-1.d0,1.d0,fac)
del1(ib) = rb**2*fac*R1s(ib)**2
end do
CALL NCDF(del1,ASUM,r(2),mesh,NOINT)
anorm(ia) = 2.0d0*pi*asum
del2(ia) = 2.0d0*pi*asum*(ra*R1s(ia))**2
end do
!$omp end parallel do
CALL NCDF(del2,ASUM,r(2),mesh,NOINT)
cnorm = 1.0/dsqrt(4.*pi*ASUM)
write(6,*)'cnorm =',cnorm
return
end
double precision function funct(x)
USE POTDATA , ONLY : RA , RB
USE MATRIC , ONLY : PA , PB , DLAM
implicit none
double precision, intent(in) :: x
double precision :: f1, f2, ramrb
ramrb = dsqrt(ra**2+rb**2-2.d0*ra*rb*x)
f1 = dcosh(pa*ra)+dcosh(pa*rb)
f2 = 1.d0+0.5*dlam*ramrb*dexp(-pb*ramrb)
funct = (f1*f2)**2
return
end
SUBROUTINE QGAUSS(func,aa,bb,ss)
USE OMP_LIB
USE MATRIC , ONLY : XG ,WG , NG
IMPLICIT DOUBLE PRECISION(A-H,O-Z)
external func
xm = 0.5d0*(bb+aa)
xl = 0.5d0*(bb-aa)
ss = 0.d0
do j=1,ng
dx = xl*xg(j)
ss = ss + wg(j)*(func(xm+dx)+func(xm-dx))
end do
ss = xl*ss/2.0
return
END
SUBROUTINE GAULEG(x1,x2)
USE MATRIC , ONLY : XG ,WG ,NG , PI
IMPLICIT DOUBLE PRECISION(A-H,O-Z)
eps = 1.d-14
m = (ng+1)/2
xm = 0.5D0*(x1+x2)
xl = 0.5D0*(x2-x1)
do i=1,m
z = dcos(pi*(dfloat(i)-0.25d0)/(dfloat(ng)+0.5d0))
1 continue
p1 = 1.d0
p2 = 0.d0
do j=1,ng
p3 = p2
p2 = p1
p1 = ((2.d0*dfloat(j)-1.d0)*z*p2 &
- (dfloat(j)-1.d0)*p3)/dfloat(j)
end do
pp = dfloat(ng)*(z*p1-p2)/(z*z-1.d0)
z1 = z
z = z1 - p1/pp
if (dabs(z-z1).gt.eps) go to 1
xg(i) = xm - xl*z
xg(ng+1-i) = xm + xl*z
wg(i) = 2.d0*xl/((1.d0-z*z)*pp*pp)
wg(ng+1-i) = wg(i)
end do
return
end
SUBROUTINE NCDF(F,ASUM,H,KKK,NOINT)
IMPLICIT DOUBLE PRECISION (A-H,O-Z)
DIMENSION F(450)
NBLOCK = (KKK-2)/NOINT + 1
C2HO45 = 2.0D+00*H/45.0D+00
ASUM = 0.0D+00
DO J=1,NBLOCK
ISTAR = NOINT*(J-1)+5
IEND = NOINT*J + 1
IEND = MIN0(KKK,IEND)
DO I=ISTAR,IEND,4
ASUM = ASUM + C2HO45*(7.0D+00*(F(I-4)+F(I)) &
+32.0D+00*(F(I-3)+F(I-1)) + 12.0D+00*F(I-2))
END DO
IF(IEND.EQ.KKK) GO TO 4
C2HO45 = 2.0D+00*C2HO45
4 END DO
RETURN
END
Thanks everybody specially #Vladimir who has taken interest in my problem. Finally i got the right answer by removing ra and rb from the module potdata and defined function as funct(x, ra, rb) and then removing ra and rb from the loop. Because i was writing ra, rb then reading their values in the above code so loop was having flow dependence. Now i get exact result from both compiler (which is 8.7933767516) parallelly, sequentially both. Exact way is this
subroutine EFFPOT(MESH,NOINT)
USE OMP_LIB
USE MATRIC
USE POTDATA
implicit none
integer, intent(in) :: MESH, NOINT
double precision::anorm(450)
double precision, external :: funct
double precision :: asum, fac, cnorm
!$omp parallel do default(none) private(del1,ia,asum,ib,fac) &
!$omp shared(id,mesh,r,anorm,NOINT,del2,R1s)
do ia = 2,mesh
if(R1s(ia).lt.1.D-7.and.R1s(ia).ge.1.D-8)id = ia
do ib = 2,mesh
call QGAUSS(funct,-1.d0,1.d0,fac,r(ia),r(ib))
del1(ib) = r(ib)**2*fac*R1s(ib)**2
end do
CALL NCDF(del1,ASUM,r(2),mesh,NOINT)
anorm(ia) = 2.0d0*pi*asum
del2(ia) = 2.0d0*pi*asum*(r(ia)*R1s(ia))**2
end do
!$omp end parallel do
CALL NCDF(del2,ASUM,r(2),mesh,NOINT)
cnorm = 1.0/dsqrt(4.*pi*ASUM)
write(6,*)'cnorm =',cnorm
return
end
double precision function funct(x,ra,rb)
USE MATRIC , ONLY : PA , PB , DLAM
implicit none
double precision, intent(in) :: x, ra, rb
double precision :: f1, f2, ramrb
ramrb = dsqrt(ra**2+rb**2-2.d0*ra*rb*x)
f1 = dcosh(pa*ra)+dcosh(pa*rb)
f2 = 1.d0+0.5*dlam*ramrb*dexp(-pb*ramrb)
funct = (f1*f2)**2
return
end
SUBROUTINE QGAUSS(func,aa,bb,ss,ra,rb)
USE OMP_LIB
USE MATRIC , ONLY : XG ,WG , NG
IMPLICIT DOUBLE PRECISION(A-H,O-Z)
external func
xm = 0.5d0*(bb+aa)
xl = 0.5d0*(bb-aa)
ss = 0.d0
do j=1,ng
dx = xl*xg(j)
ss = ss + wg(j)*(func(xm+dx,ra,rb)+func(xm-dx,ra,rb))
end do
ss = xl*ss/2.0
return
END
The cause of your problem is that the OpenMP standard does not specify what happens if a private list item is accessed in the region but outside of the construct. See example private.2f (found on page 135 of the OpenMP standard supplement) for a short version of the same problem.
Specifically, the module variables ra and rb are declared private in the OpenMP parallel region inside EFFPOT and also accessed from funct. funct is in the dynamic scope of the parallel region but (lexically) outside of it and therefore it is not specified whether ra and rb referenced by funct are the original module variables or their private copies (most compilers would go for the original variables).
You have already found one of the solutions. The other one would be to declare ra and rb threadprivate since they are only used to pass data from EFFPOT to funct:
MODULE POTDATA
IMPLICIT NONE
INTEGER :: IA , IB , ID
DOUBLE PRECISION :: RA , RB , R1s(450)
!$OMP THREADPRIVATE(RA,RB)
END MODULE POTDATA
You should then also remove ra and rb from the list of the private clause of the parallel region within EFFPOT.
On some platforms, e.g. OS X, using threadprivate with GCC (i.e. gfortran) could be slower than actually passing around the two variables as arguments because of the emulated TLS.
Note that this semantic error is really hard to detect and many OpenMP tools won't actually spot it.
First of all, it is very difficult to say something specific without seeing the actual code. However, I do have some comments on your situation and the conclusions you are drawing.
The fact that your program runs fine both in parallel and sequential execution when compiled with "ifort" doesn't mean that your program is correct. Since compiler bugs leading to programs giving wrong answers are very rare, but on the other hand manual parallel programming is very error-prone, we should assume a problem with the way you parallelized your code. We are probably talking about a race condition.
And no, the problem you are having is not at all unusual. When you have a race condition, it happens often that the sequential execution works everywhere and your parallel execution works in some environments and fails in others. It's even common that your code gives different answers every time you call it (not only depending on the compiler, but on many other factors that can change over time).
What I suggest you should do, is to get a parallel debugger, like for example TotalView that will help you keep track of the various threads and their states. Try to find a simple test environment (as few threads as possible) that fails reliably.