Fortran issue: 0 + 0 not equal to 0

I decided to post here since I'm facing a very weird problem with my Fortran 90 code.
I declared double precision variables as follows:
double precision, dimension(-1:20,-1:20) :: a,b,c
Then I initialize all variables to zero,
a(:,:) = 0.d0
b(:,:) = 0.d0
c(:,:) = 0.d0
And finally I do my computations,
a(1:17,2:18) = a(1:17,2:18) + b(1:17,2:18) + 0.5d0*c(1:17,2:18)
I checked each variable and they are all equal to zero before this computation, but afterwards I obtain,
abs(maxval(a(1:17,2)) - minval(a(1:17,2))) = 4.336808689942018E-019
This makes no sense and I have no idea where the problem comes from. Could someone help me out with this?
Regards
P.S : I'm using ifort with the following options : "-O3 -xHost -vec-report0 -implicitnone -warn truncated_source -warn argument_checking -warn unused -warn declarations -warn alignments -warn ignore_loc -warn usage -check nobounds -ftz"

I ran your snippet on MS Visual Studio 2005 (Intel Fortran 11.0.3452.2005) and got 0.000000000000000E+000. Same results on GNU Fortran 4.9.0.
Not sure it makes a difference in your case, since ABS is generic and already accepts double precision arguments, but you could try DABS instead of ABS.
Fausto
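For anyone who wants to reproduce this, a minimal complete program assembled from the snippets in the question might look like the sketch below. The program name and the extra all(a == 0.d0) check are additions for illustration and are not part of the original post.

```fortran
program zero_sum_check
  implicit none
  double precision, dimension(-1:20,-1:20) :: a, b, c

  ! Initialize everything to zero, as in the question.
  a(:,:) = 0.d0
  b(:,:) = 0.d0
  c(:,:) = 0.d0

  ! The computation from the question: a sum of zeros.
  a(1:17,2:18) = a(1:17,2:18) + b(1:17,2:18) + 0.5d0*c(1:17,2:18)

  ! The quantity reported in the question.
  print *, abs(maxval(a(1:17,2)) - minval(a(1:17,2)))

  ! Extra check (an addition): is every element still exactly zero?
  print *, all(a == 0.d0)
end program zero_sum_check
```

Building it once without optimization and once with the options listed in the P.S. is one way to see whether the flags are involved, or whether the nonzero value comes from code that is not shown in the question.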

Related

Fortran performance for complex vs real variable

So, I was wondering if it is preferable to work on the real and imaginary part of the array separately instead of a complex variable for performance gain. For example,
program test
implicit none
integer,parameter :: n = 1e8
real(kind=8),parameter :: pi = 4.0d0*atan(1.0d0)
complex(kind=8),parameter :: i_ = (0.0d0,1.0d0)
double complex :: s
real(kind=8) :: th(n),sz, t1,t2, s1,s2
integer :: i
sz = 2.0d0*pi/n
do i=1,n
th(i) = sz*i
enddo
call cpu_time(t1)
s= sum(exp(th*i_))
call cpu_time(t2)
print *, t2-t1
call cpu_time(t1)
s1 = sum(cos(th))
s2 = sum(sin(th))
call cpu_time(t2)
print *, t2-t1
end program test
And the time it takes
3.7041089999999999
2.6299830000000002
So, the split calculation does take less time. This was a very simple calculation. But I have some long calculations, and using complex variables improves the readability and takes fewer lines of code. Will it sacrifice the performance of my code? Or is it always advisable to work on the real and imaginary parts separately?
It is better to understand what kinds of tricks the compiler can do for you. Generally, splitting the real and imaginary parts by hand is not worth the effort nowadays. Create a little script to study the CPU time of your code:
#!/bin/bash
src=a.f90
for fcc in gfortran ifort; do
$fcc --version
for flag in "-O0" "-O1" "-O2" "-O3"; do
fexe=$fcc$flag
echo $fcc $src -o "$fcc$flag" $flag
$fcc $src -o $fexe $flag
echo "run $fexe ..."
./$fexe
done
done
You will notice that some of the CPU times come out very close to 0, because the compiler is clever enough to discard computations whose results are never used. Print the results as well, so that the compiler cannot optimize the computation away:
print *, t2-t1, s
print *, t2-t1, s1, s2
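Putting those two print statements into the test program gives something like the following sketch. The array size n is reduced and th is made allocatable here (both are assumptions for illustration, to keep memory use modest), and the labels in the print statements are added as well.

```fortran
program complex_vs_split
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 10**6            ! assumption: smaller than the question's 1e8
  real(dp), parameter :: pi = 4.0_dp*atan(1.0_dp)
  complex(dp), parameter :: i_ = (0.0_dp, 1.0_dp)
  complex(dp) :: s
  real(dp), allocatable :: th(:)
  real(dp) :: sz, t1, t2, s1, s2
  integer :: i

  allocate(th(n))
  sz = 2.0_dp*pi/n
  do i = 1, n
     th(i) = sz*i
  end do

  ! Complex version: the result s is printed, so it cannot be discarded.
  call cpu_time(t1)
  s = sum(exp(th*i_))
  call cpu_time(t2)
  print *, 'complex:', t2 - t1, s

  ! Split version: s1 and s2 are printed for the same reason.
  call cpu_time(t1)
  s1 = sum(cos(th))
  s2 = sum(sin(th))
  call cpu_time(t2)
  print *, 'split:  ', t2 - t1, s1, s2
end program complex_vs_split
```

Printing the sums also makes it possible to compare the two results digit by digit, not just the timings.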
Here are the results with ifort. Besides the speed, notice the ACCURACY: speed comes at a price.
ifort (IFORT) 14.0.2
ifort a.f90 -o ifort-O0 -O0
run ifort-O0 ...
3.57999900000000 (-2.319317404797516E-009,7.034712528404704E-009)
4.07666600000000 -2.319317404797516E-009 7.034712528404704E-009
ifort a.f90 -o ifort-O1 -O1
run ifort-O1 ...
3.30333300000000 (-2.319317404797516E-009,7.034712528404704E-009)
3.54666700000000 -2.319317404797516E-009 7.034712528404704E-009
ifort a.f90 -o ifort-O2 -O2
run ifort-O2 ...
3.08000000000000 (-2.319317404797516E-009,7.034712528404704E-009)
1.13666600000000 -6.304215927066537E-009 1.737099880017717E-009
ifort a.f90 -o ifort-O3 -O3
run ifort-O3 ...
3.08333400000000 (-2.319317404797516E-009,7.034712528404704E-009)
1.13666600000000 -6.304215927066537E-009 1.737099880017717E-009
sum 31.999 3.496 0:35.82 99.0% 0
You may wonder what happens between the -O1 and -O2 flags. If you check the compiled object file, the math functions it links against have changed from:
U cexp
U cos
U sin
to :
U __svml_cos2
U __svml_sin2
U cexp
SVML stands for Short Vector Math Library. Some discussion of the trade-off between speed and accuracy can be found under "Fixed-Accuracy Arithmetic Functions" in the Intel IPP library documentation.

Mandated vectorization for gfortran compiler

I want to execute a Fortran loop in vectorized form on a vector-capable processor (Intel Xeon). I recently learned that with the Intel compiler ifort this can be requested by adding !DIR$ SIMD before the loop.
But with the gfortran compiler, I find that all vectorization is decided automatically. For example,
      PROGRAM MAIN1
      IMPLICIT NONE
      DOUBLE PRECISION :: X(100)
      INTEGER :: NELEM = 100, NELMAX = 100, LV = 4
      INTEGER :: IKLE(100), I, IB, IELEM
      DOUBLE PRECISION :: W(100)
      DOUBLE PRECISION :: MASKEL(100)
      LOGICAL :: MSK = .FALSE.
      DO I = 1, 100
        X(I) = I
        IKLE(I) = I
        W(I) = 0
      END DO
      DO IB = 1,(NELEM+LV-1)/LV
!------------loop to vectorize------------------
        DO IELEM = 1+(IB-1)*LV , MIN(NELEM,IB*LV)
          X(IKLE(IELEM)) = X(IKLE(IELEM)) + W(IELEM)
        ENDDO ! IELEM
!-----------------------------------------------
      ENDDO ! IB
      PRINT *, X
      END PROGRAM
Part of the output of gfortran main1.f -O3 -fopt-info-optimized is printed below
main1.f:18:0: note: not vectorized: not suitable for gather load _33 = x[_32];
main1.f:18:0: note: bad data references.
main1.f:18:0: note: not vectorized: not enough data-refs in basic block.
main1.f:18:0: note: not vectorized: not enough data-refs in basic block.
Since the program output X is right when the loop is compiled by ifort in a mandated vectorization mode, I wonder if there's also a similar way for gfortran.
In this case with scatter stores, forcing vectorization by directive could change the results when there are repeated entries in the index array IKLE(:), as it doesn't preserve the sequence of memory access. As far as I know, the only directive of this nature available in gfortran is !$omp simd, which gfortran is free to ignore. The !$omp simd directive is active only when the corresponding compile option (-fopenmp or -fopenmp-simd) is set.
ifort offers (-opt-report4 in recent versions) an assessment of peak speedup possible by vectorization. I don't know whether that assessment is based on the declared array sizes. If there is a speedup, it would be achieved more by changing the operation sequence than by actual SIMD parallelism.
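As a rough illustration of the !$omp simd route, here is a minimal free-form sketch of the indexed update. It assumes that IKLE contains no repeated entries (here it is simply the identity map), which is the condition under which asking for vectorization does not change the result. The program name, the array size and the values of W are made up for the example.

```fortran
program simd_sketch
  implicit none
  integer, parameter :: nelem = 100
  double precision :: x(nelem), w(nelem)
  integer :: ikle(nelem), ielem

  do ielem = 1, nelem
     x(ielem) = ielem
     ikle(ielem) = ielem      ! assumption: identity map, so no repeated indices
     w(ielem) = 1.0d0
  end do

  ! Ask the compiler to vectorize; it is still free to ignore the request.
  ! Only safe because ikle has no repeated entries.
  !$omp simd
  do ielem = 1, nelem
     x(ikle(ielem)) = x(ikle(ielem)) + w(ielem)
  end do

  print *, sum(x)
end program simd_sketch
```

With gfortran this could be built with something like gfortran -O3 -fopenmp-simd -fopt-info-vec simd_sketch.f90, and the optimization report then shows whether the loop was actually vectorized. Without -fopenmp or -fopenmp-simd the directive is treated as an ordinary comment.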

gfortran does not allow character arrays with varying component lengths

See the example below
program test
character(10),dimension(5):: models = (/"feddes.swp", "jarvis89.swp", "jarvis10.swp" , "pem.swp", "van.swp"/)
end
The following error is returned:
Different CHARACTER lengths (10/12) in array constructor at (1)
There is no error with ifort compiler. Why does it happen with gfortran and is there any way to circumvent this problem?
You have some strings of length 12 in the constructor, so it may be better to declare length 12.
Also, give the array constructor an explicit type specification:
character(len=12), dimension(5) :: models = [character(len=12) :: "feddes.swp", &
"jarvis89.swp", "jarvis10.swp", "pem.swp", "van.swp"]
Possibly even better, if you have compiler support and the array is a named constant (parameter), is
character(len=*), dimension(*), parameter :: ...
The original code is accepted by ifort, but it is not standard Fortran, hence the error from gfortran. If you supply the option -std to ifort, it will print warnings when the compiler allows extensions such as this.
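A complete version of that last form might look like the sketch below. It assumes the list of models never changes at run time, because the len=* and dimension(*) combination is only allowed for a named constant. The program name and the print loop are additions for illustration.

```fortran
program char_array_demo
  implicit none
  ! The type spec inside the constructor pads every string to length 12;
  ! len=* and dimension(*) then take length and size from the constructor.
  character(len=*), dimension(*), parameter :: models = &
       [character(len=12) :: "feddes.swp", "jarvis89.swp", "jarvis10.swp", &
                             "pem.swp", "van.swp"]
  integer :: i

  do i = 1, size(models)
     ! trim() removes the trailing blanks added by the padding.
     print '(a,1x,i0)', trim(models(i)), len(models(i))
  end do
end program char_array_demo
```

Every element has length 12 after padding, so trim() is needed wherever the bare file name is wanted.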

Variable strangely takes the value zero after the call of a subroutine

I have been facing some issues trying to port code previously compiled with Compaq Visual Fortran 6.6 to gfortran.
Here is a specific problem I have met with gfortran:
There is a variable called "et" which takes the value 3E+10. Then the program calls a subroutine. "et" doesn't appear in the subroutine, but after coming back to the main program it has now the value 0.
When compiling with Compaq Visual Fortran I didn't have this problem.
The code I am working on is a huge scientific program, so I put below only a small part of it :
c
c     calculate load/unload modulus
c
  500 t=(s1-s3)/2.
      aa=1.00
      if(iconeps.ne.1)bb=1.00
      if(smean.lt.ap1) smean=ap1
      if(xn.gt.0.000001) aa=(smean/atmp)**xn
      if(iconeps.eq.1)go to 220
      if(xm.gt.0.000001) bb=(smean/atmp)**xm
  220 if(t.ge.0.99*sm1) go to 600
      et=xku*aa*atmp+tt*tm1
      if(iconeps.ne.1)bt=xkb*atmp*bb
      go to 900
  600 et=(xkl*aa*atmp+tt*tm1)*(1.0-rf*sr)**2
      if(iconeps.ne.1)bt=xkb*atmp*bb
  900 continue
      btmax=17.0*et
      btmin=0.33*et
      if(iconeps.ne.1)then
      tbt=(alf1+alf3*dtt)*dtt*(1.+vide)*tm2
      btf=bt+tbt
      bt=btf
      endif
      if(bt.lt.btmin) bt=btmin
      if(bt.gt.btmax) bt=btmax
      if(iconeps.eq.1)go to 1100
 1000 continue
 1050 if(mt.eq.mtyp4c)goto 1100
      s=0.0
      t=0.0
      call shap4n(s,t,f,pfs,pft) ! Modification by NHV
      call thick4n(s,t,xe,ye,thick)
      call bmat4n(xe,ye,f,pfs,pft,b,detj,thick)
c     calculate incremental strains
      do 1300 i=1,4
      temp=0.0
      do 1200 j=1,8
 1200 temp=temp+b(i,j)*disp(j)
 1300 depi(i)=temp
      epsv=0.0
      do 1400 i=1,2
 1400 epsv=epsv+depi(i)
      epsv=epsv+depi(4)
      ev=vide-(1.+vide)*epsv
      if(ev.lt.0.0)ev=vide*.01
 1100 continue
      call perm(permws,xkw,coef,rw,tvisc,ev,vide,tt,pp)
"et" keeps the correct value until just before the call to the subroutine "perm". Just after this subroutine returns, it has the value zero.
"et" isn't in any common block
This piece of code is part of a subroutine called by several different subroutines. What is even stranger is that when it is called from other parts of the code I don't have this problem ("et" keeps its value).
So if someone has ever met this kind of problem or has any idea about it, I will be very grateful.
Perhaps you have a memory access error, such as an array bounds violation, or a mismatch between actual and dummy arguments. Are the interfaces of the subroutines explicit, such as being "used" from a module? Also try turning on compiler debugging options ... obviously subscript checking, but others might catch something. An extensive set for gfortran 4.5 or 4.6 is:
-O2 -fimplicit-none -Wall -Wline-truncation -Wcharacter-truncation -Wsurprising -Waliasing -Wimplicit-interface -Wunused-parameter -fwhole-file -fcheck=all -std=f2008 -pedantic -fbacktrace
Subscript checking is included in -fcheck=all.
I had this problem. In my main program, I was using double precision but the numbers I calculated with in my subroutine were single precision. After I changed them to double it fixed the problem and I got actual values instead of 0.
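To illustrate the point about explicit interfaces: when a subroutine lives in a module, the compiler can check every call against the dummy argument declarations, so a single/double precision mismatch becomes a compile-time error instead of silent memory corruption. The sketch below uses made-up names (scale_mod, scale_value) and is not taken from the program in the question.

```fortran
module scale_mod
  implicit none
contains
  ! Hypothetical routine: both arguments are double precision.
  subroutine scale_value(x, factor)
    double precision, intent(inout) :: x
    double precision, intent(in)    :: factor
    x = x*factor
  end subroutine scale_value
end module scale_mod

program explicit_interface_demo
  use scale_mod, only: scale_value
  implicit none
  double precision :: et

  et = 3.0d10
  call scale_value(et, 2.0d0)     ! kinds match: accepted

  ! The following call passes a default (single precision) real.
  ! With the explicit interface from the module, the compiler rejects it:
  ! call scale_value(et, 2.0)

  print *, et
end program explicit_interface_demo
```

For old fixed-form code where moving everything into modules is not practical, compiling all files together with -Wall and -fcheck=all, as suggested above, is a way to get some of the same checks.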

Fortran double precision program with a simple MKL BLAS routine

In trying to mix precision in a simple program - using both real and double - and use the ddot routine from BLAS, I'm coming up with incorrect output for the double precision piece. Here's the code:
program test
!! adding this statement narrowed the issue down to ddot being considered real(4)
implicit none
integer, parameter :: dp = kind(1.0d0)
!! The following 2 lines were added for the calls to the BLAS routines.
!! This fixed the issue.
real(dp), external :: ddot
real, external :: sdot
real, dimension(3) :: a,b
real(dp), dimension(3) :: d,e
integer :: i
do i = 1,3
a(i) = 1.0*i
b(i) = 3.5*i
d(i) = 1.0d0*i
e(i) = 3.5d0*i
end do
write (*,200) "sdot real(4) = ", sdot(3,a,1,b,1) ! should work and return 49.0
write (*,200) "ddot real(4) = ", ddot(3,a,1,b,1) ! should not work
write (*,200) "sdot real(8) = ", sdot(3,d,1,e,1) ! should not work
write (*,200) "ddot real(8) = ", ddot(3,d,1,e,1) ! should work and return 49.0
200 format(a,f5.2)
end program test
I've tried compiling with both gfortran and ifort using the MKL BLAS libraries as follows:
ifort -lmkl_intel_lp64 -lmkl_sequential -lmkl_core main.f90
gfortran -lmkl_intel_lp64 -lmkl_sequential -lmkl_core main.f90
The output is:
sdot real(4) = 49.00
ddot real(4) = 0.00
sdot real(8) = 4.10
ddot real(8) = 0.00
How can I get the ddot routine to correctly process the double precision values?
Additionally, adding the -autodouble flag (ifort) or -fdefault-real-8 (gfortran) flag makes both of the ddot routines work, but the sdot routines fail.
Edit:
I added the implicit none statement, and the two type statements for the ddot and sdot functions. Without the type specified for the function calls the ddot was being typed implicitly as single precision real.
I haven't used MKL, but perhaps you need a "use" statement so that the compiler knows the interface to the functions? Or to otherwise declare the functions. They are not declared so the compiler is probably assuming that return of ddot is single precision and mis-interpreting the bits.
Turning on the warning option causes the compiler to tell you about the problem. With gfortran, try:
-fimplicit-none -Wall -Wline-truncation -Wcharacter-truncation -Wsurprising -Waliasing -Wimplicit-interface -Wunused-parameter -fwhole-file -fcheck=all -std=f2008 -pedantic -fbacktrace
Passing incorrect kind variables is a case of interface mismatch (which is illegal, so in principle the compiler might do anything including starting WW III), so maybe this is messing up the stack and hence following calls also return incorrect results. Try to comment out those incorrect calls (your lines marked with "should not work") and see if that helps.
Also, enable all kinds of debug options you can find, as e.g. the answer by M.S.B. shows for gfortran.
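An alternative to the external declarations is to write an explicit interface block for the BLAS function, so that the compiler checks the argument kinds as well as the result kind. The sketch below does this for ddot only. The dummy argument names follow the reference BLAS, and the program name is made up. It has to be linked against a BLAS library (for example with the MKL options from the question, or -lblas where a reference BLAS is installed).

```fortran
program ddot_interface_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Explicit interface for the external BLAS function ddot.
  interface
     function ddot(n, dx, incx, dy, incy) result(res)
       import :: dp
       integer  :: n, incx, incy
       real(dp) :: dx(*), dy(*)
       real(dp) :: res
     end function ddot
  end interface

  real(dp), dimension(3) :: d, e
  integer :: i

  do i = 1, 3
     d(i) = 1.0_dp*i
     e(i) = 3.5_dp*i
  end do

  ! 1*3.5 + 2*7.0 + 3*10.5 = 49.0
  write (*, '(a,f5.2)') "ddot real(8) = ", ddot(3, d, 1, e, 1)
end program ddot_interface_sketch
```

With this interface in scope, a call such as ddot(3,a,1,b,1) with single precision arrays a and b is rejected at compile time, which turns the "should not work" lines of the question into compiler errors.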