Here is the Main Program:
PROGRAM integration
EXTERNAL funct
DOUBLE PRECISION funct, a , b, sum, h
INTEGER n, i
REAL s
PARAMETER (a = 0, b = 10, n = 200)
h = (b-a)/n
sum = 0.0
DO i = 1, n
sum = sum+funct(i*h+a)
END DO
sum = h*(sum-0.5*(funct(a)+funct(b)))
PRINT *,sum
CONTAINS
END
And below is the Function funct(x)
DOUBLE PRECISION FUNCTION funct(x)
IMPLICIT NONE
DOUBLE PRECISION x
INTEGER K
Do k = 1,10
funct = x ** 2 * k
End Do
PRINT *, 'Value of funct is', funct
RETURN
END
I would like the 'Sum' in the Main Program to print 10 different sums over 10 different values of k in Function funct(x).
I have tried the above program but it just compiles the last value of Funct() instead of 10 different values in sum.
Array results require an explicit interface. You would also need to adjust funct and sum to actually be arrays using the dimension statement. Using an explicit interface requires Fortran 90+ (thanks for the hints by #francescalus and #VladimirF) and is quite tedious:
PROGRAM integration
INTERFACE funct
FUNCTION funct(x) result(r)
IMPLICIT NONE
DOUBLE PRECISION r
DIMENSION r( 10 )
DOUBLE PRECISION x
END FUNCTION
END INTERFACE
DOUBLE PRECISION a , b, sum, h
DIMENSION sum( 10)
INTEGER n, i
PARAMETER (a = 0, b = 10, n = 200)
h = (b-a)/n
sum = 0.0
DO i = 1, n
sum = sum+funct(i*h+a)
END DO
sum = h*(sum-0.5*(funct(a)+funct(b)))
PRINT *,sum
END
FUNCTION funct(x)
IMPLICIT NONE
DOUBLE PRECISION funct
DIMENSION funct( 10)
DOUBLE PRECISION x
INTEGER K
Do k = 1,10
funct(k) = x ** 2 * k
End Do
PRINT *, 'Value of funct is', funct
RETURN
END
If you can, you should switch to a more modern Standard such as Fortran 90+, and use modules. These provide interfaces automatically, which makes the code much simpler.
Alternatively, you could take the loop over k out of the function, and perform the sum element-wise. This would be valid FORTRAN 77:
PROGRAM integration
c ...
DIMENSION sum( 10)
c ...
INTEGER K
c ...
DO i = 1, n
Do k = 1,10
sum(k)= sum(k)+funct(i*h+a, k)
End Do
END DO
c ...
Notice that I pass k to the function. It needs to be adjusted accordingly:
DOUBLE PRECISION FUNCTION funct(x,k)
IMPLICIT NONE
DOUBLE PRECISION x
INTEGER K
funct = x ** 2 * k
PRINT *, 'Value of funct is', funct
RETURN
END
This version just returns a scalar and fills the array in the main program.
Apart from that I'm not sure it is wise to use a variable called sum. There is an intrinsic function with the same name. This could lead to some confusion...
I'm doing a program in Fortran90 which do a sum from i=1 to i=n where nis given. The sum is sum_{i=1}^{i=n}1/(i*(i+1)*(i+2)). This sum converges to 0.25. This is the code:
PROGRAM main
INTEGER n(4)
DOUBLE PRECISION s(4)
INTEGER i
OPEN(11,FILE='input')
OPEN(12,FILE='output')
DO i=1,4
READ(11,*) n(i)
END DO
PRINT*,n
CALL suma(n,s)
PRINT*, s
END
SUBROUTINE suma(n,s)
INTEGER n(4),j,k
DOUBLE PRECISION s(4),add
s=0
DO k=1,4
DO j=1,n(k)
add=1./(j*(j+1)*(j+2))
s(k)=s(k)+add
END DO
END DO
END SUBROUTINE
input
178
1586
18232
142705
The output file is now empty, I need to code it. I'm just printing the results, which are:
0.249984481688 0.249999400246 0.248687836759 0.247565846142
The problem comes with the variable add. When j is bigger, add turns negative, and the sum doesn't converge well. How can I fix it?
The problem is an integer overflow. 142705142706142707 is a number that is too large for a 4-byte integer.
What happens then is that the number overflows and loops back to negative numbers.
As #albert said in his comment, one solution is to convert it to double precision every step of the way: ((1.d0/j) / (j+1)) / (j+2). That way, it is calculating with floating point values.
Another option would be to use 8-byte integers:
integer, parameter :: int64 = selected_int_kind(17)
integer(kind=int64) :: j
You should be very careful with your calculations, though. Finer is not always better. I recommend that you look at how floating point arithmetic is performed by a computer, and what issues this can create. See for example here on wikipedia.
This is likely a better way to achieve what you want. I did remove the IO. The output from the program is
% gfortran -o z a.f90 && ./z
178 0.249984481688392
1586 0.249999801599584
18232 0.249999998496064
142705 0.249999999975453
program main
implicit none ! Never write a program without this statement
integer, parameter :: knd = kind(1.d0) ! double precision kind
integer n(4)
real(knd) s(4)
integer i
n = [178, 1586, 18232, 142705]
call suma(n, s)
do i = 1, 4
print '(I6,F18.15)', n(i), s(i)
end do
contains
!
! Recursively, sum a(j+1) = j * a(j) / (j + 1)
!
subroutine suma(n, s)
integer, intent(in) :: n(4)
real(knd), intent(out) :: s(4)
real(knd) aj
integer j, k
s = 0
do k = 1, 4
aj = 1 / real(1 * 2 * 3, knd) ! a(1)
do j = 1, n(k)
s(k) = s(k) + aj
aj = j * aj / (j + 3)
end do
end do
end subroutine
end program main
Here is the Main Program:
PROGRAM integration
EXTERNAL funct
DOUBLE PRECISION funct, a , b, sum, h
INTEGER n, i
REAL s
PARAMETER (a = 0, b = 10, n = 200)
h = (b-a)/n
sum = 0.0
DO i = 1, n
sum = sum+funct(i*h+a)
END DO
sum = h*(sum-0.5*(funct(a)+funct(b)))
PRINT *,sum
CONTAINS
END
And below is the Function funct(x)
DOUBLE PRECISION FUNCTION funct(x)
IMPLICIT NONE
DOUBLE PRECISION x
INTEGER K
Do k = 1,10
funct = x ** 2 * k
End Do
PRINT *, 'Value of funct is', funct
RETURN
END
I would like the 'Sum' in the Main Program to print 10 different sums over 10 different values of k in Function funct(x).
I have tried the above program but it just compiles the last value of Funct() instead of 10 different values in sum.
Array results require an explicit interface. You would also need to adjust funct and sum to actually be arrays using the dimension statement. Using an explicit interface requires Fortran 90+ (thanks for the hints by #francescalus and #VladimirF) and is quite tedious:
PROGRAM integration
INTERFACE funct
FUNCTION funct(x) result(r)
IMPLICIT NONE
DOUBLE PRECISION r
DIMENSION r( 10 )
DOUBLE PRECISION x
END FUNCTION
END INTERFACE
DOUBLE PRECISION a , b, sum, h
DIMENSION sum( 10)
INTEGER n, i
PARAMETER (a = 0, b = 10, n = 200)
h = (b-a)/n
sum = 0.0
DO i = 1, n
sum = sum+funct(i*h+a)
END DO
sum = h*(sum-0.5*(funct(a)+funct(b)))
PRINT *,sum
END
FUNCTION funct(x)
IMPLICIT NONE
DOUBLE PRECISION funct
DIMENSION funct( 10)
DOUBLE PRECISION x
INTEGER K
Do k = 1,10
funct(k) = x ** 2 * k
End Do
PRINT *, 'Value of funct is', funct
RETURN
END
If you can, you should switch to a more modern Standard such as Fortran 90+, and use modules. These provide interfaces automatically, which makes the code much simpler.
Alternatively, you could take the loop over k out of the function, and perform the sum element-wise. This would be valid FORTRAN 77:
PROGRAM integration
c ...
DIMENSION sum( 10)
c ...
INTEGER K
c ...
DO i = 1, n
Do k = 1,10
sum(k)= sum(k)+funct(i*h+a, k)
End Do
END DO
c ...
Notice that I pass k to the function. It needs to be adjusted accordingly:
DOUBLE PRECISION FUNCTION funct(x,k)
IMPLICIT NONE
DOUBLE PRECISION x
INTEGER K
funct = x ** 2 * k
PRINT *, 'Value of funct is', funct
RETURN
END
This version just returns a scalar and fills the array in the main program.
Apart from that I'm not sure it is wise to use a variable called sum. There is an intrinsic function with the same name. This could lead to some confusion...
The gfortran compiler gives wrong answer, when I run a parallel program using OpenMP. In the same time, ifort gives exact result.
This is the whole compilable code.
!_______________________________________________________________ !
!____________MODULE SECTION_____________________________________ !
MODULE MATRIC
IMPLICIT NONE
INTEGER , PARAMETER :: NG = 40
DOUBLE PRECISION,SAVE :: Z , PA , PB ,CMU
DOUBLE PRECISION , PARAMETER :: PI=2.0D0*ACOS(0.0D0) , &
FPI=4.0D0*PI , SQFPI = SQRT(FPI), DLAM=1.0D0
DOUBLE PRECISION , DIMENSION(450) :: DEL1, DEL2, X, R , SNLO
DOUBLE PRECISION :: XG(60) , WG(60)
END MODULE MATRIC
!_________________________________________________________________________!
! MODULE SECTION
!__________________________________________________________________________!
MODULE POTDATA
IMPLICIT NONE
INTEGER :: IA , IB , ID
DOUBLE PRECISION :: RA , RB , R1s(450)
END MODULE POTDATA
!__________________________________________________________________________!
!______________________________________________________________________!
program check
use matric
use potdata
implicit double precision(a-h,o-z)
pa = 0.72D0 ; pb = 0.19D0
mesh = 441 ; noint= 40 ; z = 2.0d0
CALL GAULEG(-1.d0,1.d0)
NB = MESH/NOINT
I = 1
X(I) = 0.0D+00
DELTAX = 0.0025D+00*40.0D+00/DBLE(NOINT)
DO J=1,NB
IMK = (J-1)*NOINT + 1
DO K=1,NOINT
AK=K
I=I+1
X(I)=X(IMK)+AK*DELTAX
END DO
DELTAX=2.0D+00*DELTAX
END DO
CMU=(9.0D00*PI*PI/(128.0D00*Z))**THIRD
R(1)=0.0D+00 ; SNLO(1) = 0.D00
DO I=2,MESH
R(I)=CMU*X(I)
SNLO(I) = R(I)*dexp(-Z*R(I))
R1S(I) = SNLO(I)/(SQFPI*R(I))
END DO
call EFFPOT(MESH,NOINT)
end program check
subroutine EFFPOT(MESH,NOINT)
USE OMP_LIB
USE MATRIC
USE POTDATA
implicit none
integer, intent(in) :: MESH, NOINT
double precision::anorm(450)
double precision, external :: funct
double precision :: asum, fac, cnorm
!$omp parallel do default(none) private(del1,ia,asum,ib,ra,rb,fac) &
!$omp shared(id,mesh,r,anorm,NOINT,del2,R1s)
do ia = 2,mesh
ra = r(ia)
if(R1s(ia).lt.1.D-7.and.R1s(ia).ge.1.D-8)id = ia
do ib = 2,mesh
rb = r(ib)
call QGAUSS(funct,-1.d0,1.d0,fac)
del1(ib) = rb**2*fac*R1s(ib)**2
end do
CALL NCDF(del1,ASUM,r(2),mesh,NOINT)
anorm(ia) = 2.0d0*pi*asum
del2(ia) = 2.0d0*pi*asum*(ra*R1s(ia))**2
end do
!$omp end parallel do
CALL NCDF(del2,ASUM,r(2),mesh,NOINT)
cnorm = 1.0/dsqrt(4.*pi*ASUM)
write(6,*)'cnorm =',cnorm
return
end
double precision function funct(x)
USE POTDATA , ONLY : RA , RB
USE MATRIC , ONLY : PA , PB , DLAM
implicit none
double precision, intent(in) :: x
double precision :: f1, f2, ramrb
ramrb = dsqrt(ra**2+rb**2-2.d0*ra*rb*x)
f1 = dcosh(pa*ra)+dcosh(pa*rb)
f2 = 1.d0+0.5*dlam*ramrb*dexp(-pb*ramrb)
funct = (f1*f2)**2
return
end
SUBROUTINE QGAUSS(func,aa,bb,ss)
USE OMP_LIB
USE MATRIC , ONLY : XG ,WG , NG
IMPLICIT DOUBLE PRECISION(A-H,O-Z)
external func
xm = 0.5d0*(bb+aa)
xl = 0.5d0*(bb-aa)
ss = 0.d0
do j=1,ng
dx = xl*xg(j)
ss = ss + wg(j)*(func(xm+dx)+func(xm-dx))
end do
ss = xl*ss/2.0
return
END
SUBROUTINE GAULEG(x1,x2)
USE MATRIC , ONLY : XG ,WG ,NG , PI
IMPLICIT DOUBLE PRECISION(A-H,O-Z)
eps = 1.d-14
m = (ng+1)/2
xm = 0.5D0*(x1+x2)
xl = 0.5D0*(x2-x1)
do i=1,m
z = dcos(pi*(dfloat(i)-0.25d0)/(dfloat(ng)+0.5d0))
1 continue
p1 = 1.d0
p2 = 0.d0
do j=1,ng
p3 = p2
p2 = p1
p1 = ((2.d0*dfloat(j)-1.d0)*z*p2 &
- (dfloat(j)-1.d0)*p3)/dfloat(j)
end do
pp = dfloat(ng)*(z*p1-p2)/(z*z-1.d0)
z1 = z
z = z1 - p1/pp
if (dabs(z-z1).gt.eps) go to 1
xg(i) = xm - xl*z
xg(ng+1-i) = xm + xl*z
wg(i) = 2.d0*xl/((1.d0-z*z)*pp*pp)
wg(ng+1-i) = wg(i)
end do
return
end
SUBROUTINE NCDF(F,ASUM,H,KKK,NOINT)
IMPLICIT DOUBLE PRECISION (A-H,O-Z)
DIMENSION F(450)
NBLOCK = (KKK-2)/NOINT + 1
C2HO45 = 2.0D+00*H/45.0D+00
ASUM = 0.0D+00
DO J=1,NBLOCK
ISTAR = NOINT*(J-1)+5
IEND = NOINT*J + 1
IEND = MIN0(KKK,IEND)
DO I=ISTAR,IEND,4
ASUM = ASUM + C2HO45*(7.0D+00*(F(I-4)+F(I)) &
+32.0D+00*(F(I-3)+F(I-1)) + 12.0D+00*F(I-2))
END DO
IF(IEND.EQ.KKK) GO TO 4
C2HO45 = 2.0D+00*C2HO45
4 END DO
RETURN
END
Thanks everybody specially #Vladimir who has taken interest in my problem. Finally i got the right answer by removing ra and rb from the module potdata and defined function as funct(x, ra, rb) and then removing ra and rb from the loop. Because i was writing ra, rb then reading their values in the above code so loop was having flow dependence. Now i get exact result from both compiler (which is 8.7933767516) parallelly, sequentially both. Exact way is this
subroutine EFFPOT(MESH,NOINT)
USE OMP_LIB
USE MATRIC
USE POTDATA
implicit none
integer, intent(in) :: MESH, NOINT
double precision::anorm(450)
double precision, external :: funct
double precision :: asum, fac, cnorm
!$omp parallel do default(none) private(del1,ia,asum,ib,fac) &
!$omp shared(id,mesh,r,anorm,NOINT,del2,R1s)
do ia = 2,mesh
if(R1s(ia).lt.1.D-7.and.R1s(ia).ge.1.D-8)id = ia
do ib = 2,mesh
call QGAUSS(funct,-1.d0,1.d0,fac,r(ia),r(ib))
del1(ib) = r(ib)**2*fac*R1s(ib)**2
end do
CALL NCDF(del1,ASUM,r(2),mesh,NOINT)
anorm(ia) = 2.0d0*pi*asum
del2(ia) = 2.0d0*pi*asum*(r(ia)*R1s(ia))**2
end do
!$omp end parallel do
CALL NCDF(del2,ASUM,r(2),mesh,NOINT)
cnorm = 1.0/dsqrt(4.*pi*ASUM)
write(6,*)'cnorm =',cnorm
return
end
double precision function funct(x,ra,rb)
USE MATRIC , ONLY : PA , PB , DLAM
implicit none
double precision, intent(in) :: x, ra, rb
double precision :: f1, f2, ramrb
ramrb = dsqrt(ra**2+rb**2-2.d0*ra*rb*x)
f1 = dcosh(pa*ra)+dcosh(pa*rb)
f2 = 1.d0+0.5*dlam*ramrb*dexp(-pb*ramrb)
funct = (f1*f2)**2
return
end
SUBROUTINE QGAUSS(func,aa,bb,ss,ra,rb)
USE OMP_LIB
USE MATRIC , ONLY : XG ,WG , NG
IMPLICIT DOUBLE PRECISION(A-H,O-Z)
external func
xm = 0.5d0*(bb+aa)
xl = 0.5d0*(bb-aa)
ss = 0.d0
do j=1,ng
dx = xl*xg(j)
ss = ss + wg(j)*(func(xm+dx,ra,rb)+func(xm-dx,ra,rb))
end do
ss = xl*ss/2.0
return
END
The cause of your problem is that the OpenMP standard does not specify what happens if a private list item is accessed in the region but outside of the construct. See example private.2f (found on page 135 of the OpenMP standard supplement) for a short version of the same problem.
Specifically, the module variables ra and rb are declared private in the OpenMP parallel region inside EFFPOT and also accessed from funct. funct is in the dynamic scope of the parallel region but (lexically) outside of it and therefore it is not specified whether ra and rb referenced by funct are the original module variables or their private copies (most compilers would go for the original variables).
You have already found one of the solutions. The other one would be to declare ra and rb threadprivate since they are only used to pass data from EFFPOT to funct:
MODULE POTDATA
IMPLICIT NONE
INTEGER :: IA , IB , ID
DOUBLE PRECISION :: RA , RB , R1s(450)
!$OMP THREADPRIVATE(RA,RB)
END MODULE POTDATA
You should then also remove ra and rb from the list of the private clause of the parallel region within EFFPOT.
On some platforms, e.g. OS X, using threadprivate with GCC (i.e. gfortran) could be slower than actually passing around the two variables as arguments because of the emulated TLS.
Note that this semantic error is really hard to detect and many OpenMP tools won't actually spot it.
First of all, it is very difficult to say something specific without seeing the actual code. However, I do have some comments on your situation and the conclusions you are drawing.
The fact that your program runs fine both in parallel and sequential execution when compiled with "ifort" doesn't mean that your program is correct. Since compiler bugs leading to programs giving wrong answers are very rare, but on the other hand manual parallel programming is very error-prone, we should assume a problem with the way you parallelized your code. We are probably talking about a race condition.
And no, the problem you are having is not at all unusual. When you have a race condition, it happens often that the sequential execution works everywhere and your parallel execution works in some environments and fails in others. It's even common that your code gives different answers every time you call it (not only depending on the compiler, but on many other factors that can change over time).
What I suggest you should do, is to get a parallel debugger, like for example TotalView that will help you keep track of the various threads and their states. Try to find a simple test environment (as few threads as possible) that fails reliably.
I'm using Fortran90 to solve a simple integration problem and calculating speed differences when run in parallel. I'm having trouble getting the correct result when paralleling the process using openMP.
program midpoint
use omp_lib
implicit none
integer :: beginning, rate, end, iteration
double precision :: sum, div, x, sum2
integer ::a,b, n
n = 100000000
a = 10
b = 0
div = dble(a-b)/n
x=b+div/2
sum = 0.0
call system_clock(beginning, rate)
do iteration=1,n
sum = sum + sqrt(x)*div ! evaluating sqrt(x) function
x = x + div
end do
call system_clock(end)
print *, "Computation from single core: ", sum
print *, "elapsed time from single core: ", real(end - beginning) / real(rate)
x=b+div/2
sum = 0.0
sum2 = 0.0
call system_clock(beginning, rate)
!$omp parallel private(iteration, sum) shared(sum2, x)
!$omp do
do iteration=1,n
sum = sum + sqrt(x)*div ! evaluating sqrt(x) function
x = x + div
end do
!$omp end do
sum2 = sum2 + sum
!$omp end parallel
call system_clock(end)
print *, "Computation from multiple cores: ", sum2
print *, "elapsed time from multiple cores: ", real(end - beginning) / real(rate)
end program
Thanks
You've programmed a race condition. In the line
sum2 = sum2 + sum
you've given threads the authority to read and write to a shared variable (sum2) with no control over the sequencing of operations. The same problem arises with the next line x = x + div too.
Continue reading your OpenMP tutorial until you encounter the reduction clause which is designed for what you seem to be doing. Learn too about the firstprivate clause which will initialise a thread-local variable with the value of the variable of the same name when the parallel region is first encountered.
I haven't checked the syntax carefully but it should be something like this:
!$omp parallel do private(iteration) firstprivate(x) shared(div) reduction(+:sum)
do iteration=1,n
sum = sum + sqrt(x)*div ! evaluating sqrt(x) function
x = x + div
end do
!$omp end parallel do
! at this point the value of sum will have been 'reduced' across all threads
print *, "Computation from multiple cores: ", sum