Cannot run Fortran code

My program runs with errors and I really have no idea why, will you help me please? The question is as below:
The flow problem is a classic example of viscous diffusion. The governing equation for this problem is derived using boundary-layer theory to reduce the full Navier-Stokes equations to a single parabolic PDE,
with the necessary initial and boundary conditions,
t = 0: u(0) = 0, u(0.04 m) = 0;
t > 0: u(0) = 40.0 m/s, u(0.04 m) = U = 0.0 m/s.
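(The governing PDE itself is not reproduced in the post; for this configuration it is presumably the one-dimensional diffusion equation ∂u/∂t = ν ∂²u/∂y², which is consistent with the diffusion number r = ν·τ/h² used in the code below.)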
This problem may be described physically as transient viscous-driven flow between two plates of infinite extent separated by a distance of 0.04 m. Initially both plates are at rest. After time t = 0, the upper plate is set in motion in the positive x-direction with a velocity of 40.0 m/s. Due to the viscosity of the fluid filling the space between the plates, successive laminae of fluid are set in motion as time elapses. Eventually the system reaches a “quasi-steady state”, as the velocity profile becomes more or less constant in time. The governing equation lends itself nicely to the use of finite-difference techniques to solve the problem in the transient domain.
Fortran computer code for the DuFort-Frankel scheme:
! Homework 1: DuFort-Frankel scheme
! Program computes the numerical solution to the
! Transient Flow Problem.
! The following initial and boundary conditions are applied:
!   t = 0:  u = 0 everywhere
!   t > 0:  u(y=0) = 40.0 m/s,  u(y=0.04 m) = 0.0 m/s
      parameter(maxn=41,eps=1.0e-3)
      integer i,j,k,m,mm,count
      real*8 u_old(1081,maxn),u_new(1081,maxn),y(maxn)
      real*8 t,tau,h,r,tmax,u_init,nu,sum,error
!
      data h,m,u_init,nu,r,tmax /0.001,41,0.0,2.17e-4,0.217,2.5e5/
!
      open(unit=1,file='hw1_dufort.out',status='unknown')
      tau=r*h**2/nu              ! time step from diffusion number r = nu*tau/h**2
      mm=m-1
      error=1.0
!
      count=0
      k=1
      t=0.0
      y(1)=0.0
!
      do 2 i=2,m
         y(i)=y(i-1)+h           ! grid points across the 0.04 m gap
    2 continue
!
!     initial condition: fluid at rest everywhere
      do i=1,m
         u_old(1,i)=u_init
      end do
!
      write(1,*)'Velocity Results:'
      write(1,10)t,(u_old(k,j),j=1,m)
      do while ((error.gt.eps).and.(count.lt.1080))
         count=count+1
         sum=0.0
         t=t+tau
!        boundary conditions for t > 0
         u_old(k,1)=40.0
         u_old(k,m)=0.0
         u_new(k,1)=40.0
         u_new(k,m)=0.0
         do i=2,mm
            if (k.lt.2) then
!              first step: level k-1 is the zero initial field
               u_new(k,i)=(2.0*r/(1.0+2.0*r))*(u_old(k,i+1)+u_old(k,i-1))
            else
               u_new(k,i)=(2.0*r/(1.0+2.0*r))*(u_old(k,i+1)+u_old(k,i-1))  &
                          +((1.0-2.0*r)/(1.0+2.0*r))*u_old(k-1,i)
            end if
            sum=sum+abs(u_new(k,i)-u_old(k,i))
         end do
         error=sum               ! change between successive time levels
!        advance one time level
         do i=1,m
            u_old(k+1,i)=u_new(k,i)
         end do
         write(1,10)t,(u_new(k,j),j=1,m)
         k=k+1
      end do
!
   10 format(2x,f10.3,2x,41f8.4)
      write(1,'(" Number of steps for convergence = ",i4)')count
      end
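For reference, the update implemented by the inner loop is the standard DuFort-Frankel discretization of the diffusion equation,
u(i,n+1) = [2r/(1+2r)]*(u(i+1,n)+u(i-1,n)) + [(1-2r)/(1+2r)]*u(i,n-1), where r = nu*tau/h**2,
which matches the relation tau = r*h**2/nu used in the code.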

Related

How can we get the output of a variable at each step in the RK4 adaptive step-size method?

I am using Fortran 77 with Intel Fortran. I want to ask how we can take the output of a variable at each step when using the RK4 adaptive step-size method.
I have different subroutines in the code. If we calculate the root of a variable by Brent's method in another subroutine, and then for that root we want to take the output at each time step of the RK4 adaptive step-size method, how can we do that?
For example:
      call adaptivestep_size(x,t,t+dt,epx,dt,hmin,nok,nbad,ad,bsstep,
     &                       pwE,pwB,gamma,ep,zlamda)
Here zlamda is the subroutine that calculates the root of the desired variable by Brent's method in another subroutine, as follows:
      SUBROUTINE zlamda(lamdaroot)
      INTEGER N,NBMAX
      REAL X1,X2
! take X2=pi/2 to find only the smallest root out of multiple roots
      PARAMETER(N=100,NBMAX=20,X1=0.0,X2=3.141592/2.)
      INTEGER i,nb
      REAL zbrent,root,tol,xb1(NBMAX),xb2(NBMAX)
      EXTERNAL zbrak,fzlamda
      REAL eL,z,fzlamda,lamdaroot,lamda
      COMMON /flamda/ eL,z
      nb=NBMAX
      call zbrak(fzlamda,X1,X2,N,xb1,xb2,nb)
      do 200 i=1,nb
        tol=(1.0e-6)*(xb1(i)+xb2(i))/2.0
! here you get the latitude "lamdaroot" in radians from the given "z"
        lamdaroot=zbrent(fzlamda,xb1(i),xb2(i),tol)
  200 continue
      return
      END

      FUNCTION fzlamda(lamda)
      REAL fzlamda,z,eL,lamda
! eL (normalized L value) and z are supposed to be provided;
! fzlamda is then a function of lamda only
      COMMON /flamda/ eL,x(3,1)
      lamda=9.*3.1415926535/180.0
! taking abs(z) means lamda increases from the equator as abs(z) increases
      fzlamda=0.5*sin(lamda)*sqrt(1.+3.*sin(lamda)**2)+(0.5/sqrt(3.))
     +        *log(abs(sqrt(3.)*sin(lamda)+sqrt(1.+3.*sin(lamda)**2)))
     +        -abs(x(3,1))/4685.0
      return
      END
Now, lamdaroot is actually the variable for which I want to take the output at each time step, so how should I set up the subroutine zlamda so that it gives me output throughout the whole calculation?

Retrospectively closing a NetCDF file created with Fortran

I'm running a distributed model stripped to its bare minimum below:
integer, parameter :: &
    nx = 1200, &   ! Number of columns in grid
    ny = 1200, &   ! Number of rows in grid
    nt = 6000      ! Number of timesteps
integer :: it      ! Loop counter
real :: var1(nx,ny), var2(nx,ny), var3(nx,ny), etc(nx,ny)

! Create netcdf to write model output
call check( nf90_create(path="out.nc", cmode=nf90_clobber, ncid=nc_out_id) )

! Loop over time
do it = 1,nt
    ! Calculate a lot of variables
    ...
    ! Write some variables in out.nc at each timestep
    call check( nf90_put_var(ncid=nc_out_id, varid=var1_varid, values=var1, &
                start = (/ 1, 1, it /), count = (/ nx, ny, 1 /)) )
    ! Close the netcdf otherwise it is not readable:
    if (it == nt) call check( nf90_close(nc_out_id) )
enddo
I'm in the development stage of the model, so it inevitably crashes at unexpected points (usually at the "Calculate a lot of variables" stage). This means that if the model crashes at timestep it = 3000, 2999 timesteps will have been written to the netCDF output file, but I will not be able to read the file because it has not been closed. Still, the data have been written: I currently have a 2 GB out.nc file that I can't read. When I ncdump the file it shows:
netcdf out.nc {
dimensions:
x = 1400 ;
y = 1200 ;
time = UNLIMITED ; // (0 currently)
variables:
float var1 (time, y, x) ;
data:
}
My questions are: (1) Is there a way to close the file retrospectively, even outside Fortran, to be able to read the data that have already been written? (2) Alternatively, is there another way to write the file in Fortran that would make the file readable even without closing it?
When nf90_close is called, buffered output is written to disk and the file ID is relinquished so it can be reused. The problem is most likely due to buffered output not having been written to the disk when the program terminates due to a crash, meaning that only the changes you made in "define mode" are present in the file (as shown by ncdump).
You therefore need to force the data to be written to the disk more often. There are three ways of doing this (as far as I am aware).
1. nf90_sync, which synchronises the buffered data to disk when called. This gives you the most control over when to output data (every loop step, or every n loop steps, for example), which can allow you to optimize for speed vs robustness, but introduces more programming and checking overhead for you. A minimal sketch follows this list.
2. Creating or opening the file using the nf90_share flag (thanks to @RussF for this idea). This is the recommended approach if the netCDF file is intended to be used by multiple readers/writers simultaneously. It is essentially an automatic implementation of nf90_sync for writing data. It gives less control, but also less programming overhead. Note that this only applies to netCDF-3 classic or 64-bit offset files.
3. Finally, an option I wouldn't recommend, but am including for completeness (and I guess there may be situations where this is the best option, although none spring to mind): closing and reopening the file. I don't recommend this because it will slow down your program and adds a greater possibility of causing errors.
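For example, a minimal sketch of option 1, assuming the same variable names as the model above and a hypothetical sync interval nsync:
integer, parameter :: nsync = 100   ! hypothetical: flush every 100 timesteps
do it = 1,nt
    ! ... calculate a lot of variables ...
    call check( nf90_put_var(ncid=nc_out_id, varid=var1_varid, values=var1, &
                start = (/ 1, 1, it /), count = (/ nx, ny, 1 /)) )
    ! force buffered data to disk so the file stays readable after a crash
    if (mod(it, nsync) == 0) call check( nf90_sync(nc_out_id) )
enddo
call check( nf90_close(nc_out_id) )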

Strange behavior while calling properties from REFPROP FORTRAN files

I am trying to use REFPROP's HSFLSH subroutine to compute properties for steam.
When the same state point is calculated over multiple iterations (fixed enthalpy and entropy: enthalpy = 50000 J/mol and entropy = 125 J/mol), the time taken by HSFLSH increases to about 0.15 ms on every 4th or 5th iteration, against a negligible amount of time for the other iterations. This is problematic because my program calls this subroutine several thousand times, leading to abnormally long run times.
The program used to generate the above log is here:
C     refprop check
      program time_check
      parameter(ncmax=20)
      dimension x(ncmax)
      real hkj,skj
      character hrf*3, herr*255
      character*255 hf(ncmax),hfmix
C
C     SETUP FOR WATER
C
      nc=1                     ! Number of components
      hf(1)='water.fld'        ! Fluid name
      hfmix='hmx.bnc'          ! Mixture file name
      hrf='DEF'                ! Reference state (DEF means default)
      call setup(nc,hf,hfmix,hrf,ierr,herr)
      if (ierr.ne.0) write (*,*) herr
      call INFO(1,wm,ttp,tnbp,tc,pc,dc,zc,acf,dip,rgas)
      write(*,*) 'Mol weight ', wm
      h = 50000.0
      s = 125.0
C
      DO I=1,NCMAX
        x(I) = 0
      END DO
C     ******************************************************
C     THIS IS THE ACTUAL CALL PLACE
C     ******************************************************
      do I=1,100
        call cpu_time(tstrt)
        CALL HSFLSH(h,s,x,T_TEMP,P_TEMP,RHO_TEMP,dl,dv,xliq,xvap,
     &              WET_TEMP,e,
     &              cv,cp,VS_TEMP,ierr,herr)
        call cpu_time(tstop)
        write(*,*) I,' time taken to run hsflsh routine= ',tstop-tstrt
      end do
      stop
      end
(of course you will need the FORTRAN FILES, which unfortunately I cannot share since REFPROP isn't open source)
Can someone help me figure out why this is happening?
P.S.: The above code was compiled using gfortran -fdefault-real-8
UPDATE
I tried using system_clock to time my computations as suggested by @Ross below. The results are uniform across the loop (image below). I will have to find alternative ways to improve computation speed, I guess (sigh!).
I don't have a concrete answer, but this sort of behaviour looks like what I would expect if all calls really took around 3 ms, but your call to CPU_TIME doesn't register anything below around 15 ms. Do you see any output with a time taken of less than, say, 10 ms? Of particular interest to me is the approximately even spacing between calls that return nonzero time: it's roughly every 5 calls.
CPU timing can be a tricky business. I recommended in a comment that you try system_clock, which can be higher precision than CPU_TIME. You said it doesn't work, but I'm unconvinced. Did you pass a long integer to system_clock? What was the count_rate for your system? Were all the times still either 15 or 0 ms?
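For reference, a minimal sketch of timing the same loop with system_clock instead of cpu_time (the integer kind and variable names here are illustrative assumptions):
      integer*8 cstart, cstop, crate
      call system_clock(count_rate=crate)
      do I=1,100
        call system_clock(cstart)
        CALL HSFLSH(h,s,x,T_TEMP,P_TEMP,RHO_TEMP,dl,dv,xliq,xvap,
     &              WET_TEMP,e,cv,cp,VS_TEMP,ierr,herr)
        call system_clock(cstop)
        write(*,*) I,' elapsed = ',dble(cstop-cstart)/dble(crate),' s'
      end do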

Generate normally distributed data within a range using Fortran

I would like to generate an array of normally distributed data around a given mean and within given upper and lower bounds.
I found a function here that uses the Marsaglia polar method (a.k.a. the polar form of the Box-Muller transformation).
How might I modify this to generate points within -10 and +10 integer values around the mean?
function normal(mean, sigma)
    ! mean  : mean of the distribution
    ! sigma : standard deviation of the distribution
    implicit none
    integer, parameter :: b8 = selected_real_kind(14)
    real(b8), parameter :: pi = 3.141592653589793239_b8
    real(b8) normal
    real rand_num
    real(b8) tmp
    real(b8) mean
    real(b8) sigma
    real(b8) fac
    real(b8) gsave
    real(b8) rsq
    real(b8) r1
    real(b8) r2
    integer flag
    save flag
    save gsave
    data flag /0/
    if (flag.eq.0) then
        rsq=2.0_b8
        do while(rsq.ge.1.0_b8.or.rsq.eq.0.0_b8) ! reject points outside the unit circle
            call random_number(rand_num)
            r1=2.0_b8*rand_num-1.0_b8
            call random_number(rand_num)
            r2=2.0_b8*rand_num-1.0_b8
            rsq=r1*r1+r2*r2
        enddo
        fac=sqrt(-2.0_b8*log(rsq)/rsq)
        gsave=r1*fac        ! save the second deviate for the next call
        tmp=r2*fac
        flag=1
    else
        tmp=gsave
        flag=0
    endif
    normal=tmp*sigma+mean
    return
end function normal
There are different ways to sample from a truncated normal distribution. The "best" way depends on your mean and variance.
A simple way is just plain rejection: generate a deviate and if it's too big or small throw it away and repeat. If your parameters are such that you don't reject many this is a good method.
For the routine specified, doing this rejection is simple: sticking the whole calculation part (including the flag test) into a do...end do loop with an if (abs(normal-mean).lt.10.0_b8) exit should do it.
[This isn't elegant, and extra bookkeeping may be required, with samples being generated as pairs. But, again, if rejection happens rarely that's not too bad. Tidying is left as an exercise for the reader.]
That's how one may change the routine but, as @george comments, that probably isn't the best approach, even with the rejection. Certainly, having a function called normal which takes mean and sigma and returns a deviate which isn't from a normal distribution could be confusing.
function truncated_normal(mu, sigma, lim_a, lim_b)    ! Very side-effecty
    ... declarations ...
    if ( ... test for wanting the rejection method ... ) then
        do
            truncated_normal = normal(mu, sigma)
            if (truncated_normal.gt.lim_a.and.truncated_normal.lt.lim_b) exit
        end do
    else
        ... the other magical method when rejection is too likely ...
    end if
end
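A minimal, concrete version of the rejection branch alone, assuming the normal(mean, sigma) function from the question is available and that rejection happens rarely enough for this to be efficient:
function truncated_normal(mu, sigma, lim_a, lim_b)
    implicit none
    integer, parameter :: b8 = selected_real_kind(14)
    real(b8) :: truncated_normal
    real(b8) :: mu, sigma, lim_a, lim_b
    real(b8), external :: normal
    do
        ! draw deviates until one falls inside (lim_a, lim_b)
        truncated_normal = normal(mu, sigma)
        if (truncated_normal > lim_a .and. truncated_normal < lim_b) exit
    end do
end function truncated_normal
For the original request (values within 10 of the mean), this might be called as truncated_normal(mean, sigma, mean-10.0_b8, mean+10.0_b8) in a loop that fills the array.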

accelerator directives not working

This is the code for a matrix multiplication
program ex
  implicit none
  real :: a(256,256),b(256,256),c(256,256),t1,t2
  integer i,j,k,sum
  sum=0
  do j = 1,256
    do i = 1,256
      a(i,j) = 1
      b(i,j) = 1
      c(i,j) = 0.0
    enddo
  enddo
  call cpu_time(t1)
!$acc region do
  do i=1,256
    do j=1,256
      sum=0
      do k=1,256
        sum=sum+a(i,k)*b(k,j)
        c(i,j)=sum
      end do
    end do
  end do
!$acc end region
  call cpu_time(t2)
  print*,"cpu time=",t2-t1
  print*,c
end program ex
When I execute this, the execution time is 75 ms using the accelerator directives and the PGI compiler. But when I run the same matrix multiplication with a CUDA Fortran implementation, the execution time is only 5 ms. So there is a big difference even though I used the accelerator directives, and I doubt that my accelerator directives are working properly.
I tried to accelerate your program using the very similar OpenHMPP accelerator directives. Note that I moved one of your lines, which was probably erroneously placed in the innermost loop. Also note that I had to advise the compiler of the reduction taking place, and I renamed the reduction variable because it shadowed the sum intrinsic function.
The performance is not good because of the overhead of starting the GPU kernel and because of the memory transfers. You need orders of magnitude more work for GPU use to be profitable.
For example, when I used 2000 x 2000 matrices, the CPU execution time was 41 seconds, but the GPU execution time was only 8 s.
program ex
  implicit none
  real :: a(256,256),b(256,256),c(256,256),t1,t2
  integer i,j,k,sm
  sm=0
  do j = 1,256
    do i = 1,256
      a(i,j) = 1
      b(i,j) = 1
      c(i,j) = 0.0
    enddo
  enddo
  call cpu_time(t1)
!$hmpp region, target = CUDA
!$hmppcg gridify, reduce(+:sm)
  do i=1,256
    do j=1,256
      sm=0
      do k=1,256
        sm=sm+a(i,k)*b(k,j)
      end do
      c(i,j)=sm
    end do
  end do
!$hmpp endregion
  call cpu_time(t2)
  print*,"cpu time=",t2-t1
  print*,sum(c)
end program ex
Edit: it would probably be better not to use reduce(+:sm), but just private(sm).
FYI, the OP also posted this question on the PGI User Forum (http://www.pgroup.com/userforum/viewtopic.php?t=3081). We believe the original issue was the result of pilot error. When we profiled his code using CUDA Prof, the CUDA Fortran kernel execution time was 205 ms versus 344 ms using the PGI Accelerator Model. Also, if I fix his code so that "c(i,j)=sum" is placed outside of the inner "k" loop, the PGI Accelerator Model time reduces to 123 ms. It's unclear how he gathered his timings.
Thanks to those that tried to help.
- Mat