I have a Fortran subroutine that is called several times in a main program (which I don't have access to). In my subroutine, I wish to read data from one of several (~10^4) files in every iteration based on an input argument. Each of the files has one line of data; and the format of my data is as follows:
0.97014199999999995 0.24253600000000000 0.0000000000000000
I'm using the following lines of code to open and read the files:
program test_read
implicit none
integer :: i, iopen_status, iread_status
real :: gb
CHARACTER(len=25) :: filename
CHARACTER(*), PARAMETER :: fileplace =
& "/home/ajax/hexmesh_readn/G3/"
dimension gb(3)
i = 5
WRITE(filename,'(a,I0,a)')'GBn_',i,'.txt'
open(unit=15,
& file=fileplace//filename,IOSTAT=iopen_status)
read (15,*,IOSTAT=iread_status) gb
print *,"gb",gb(1),gb(2),gb(3)
close(15)
end program test_read
In the main program, i is a variable, but I have a file for all possible values of i.
Now, this code works perfectly well when I run on my local machine. But, when I submit it along with the main program, it behaves somewhat weirdly. Specifically, it reads some of the files, but not others.
When I print out the IOSTAT for open and read, I see that the IOSTAT for open is 0 for all the files, whereas that for the read command is 0 for some, -1 for some and 29 for others! I looked up what the error code 29 means and I learned that it might indicate that the file is not found in the path. But the file is most definitely there.
Also, I don't see anything different about the files that it isn't able to read. In fact, I have even seen the same file giving an IOSTAT value of 0 and 29!
One thing to note is that I'm running the main program on several cores. Could this have anything to do with the error?
Are you running several instances of the same program simultaneously? On some operating systems, different programs can't simultaneously open the same file. Specifying that you want read-only access might allow access by multiple programs. On the Fortran open statement: action='read'.
If you are running a multi-threaded program, then different threads might be doing IO simultaneously on different files ... different unit numbers should be used by each thread to avoid conflicts.
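A minimal sketch of both suggestions, assuming a compiler that supports the Fortran 2008 newunit= specifier (recent gfortran and Intel Fortran do). The file name GBn_demo.txt is made up for the example, and the program writes that file first only so that it runs on its own. newunit= lets the runtime pick a free unit number instead of hard-coding 15, and action='read' asks for read-only access. Whether this cures the iostat values in the question is not certain; it only removes two possible sources of conflict.

```fortran
program read_only_open
  implicit none
  integer :: u, ios
  double precision :: gb(3)
  character(len=*), parameter :: fname = 'GBn_demo.txt'  ! hypothetical file name

  ! Create a one-line demo file so the example is self-contained.
  open(newunit=u, file=fname, status='replace', action='write', iostat=ios)
  if (ios /= 0) stop 'could not create demo file'
  write(u, *) 0.970142d0, 0.242536d0, 0.0d0
  close(u)

  ! newunit= asks the runtime for a unit number that is not in use;
  ! action='read' requests read-only access to an existing file.
  open(newunit=u, file=fname, status='old', action='read', iostat=ios)
  if (ios /= 0) then
    print *, 'open failed, iostat = ', ios
    stop 1
  end if

  read(u, *, iostat=ios) gb
  if (ios /= 0) then
    print *, 'read failed, iostat = ', ios
  else
    print *, 'gb = ', gb
  end if
  close(u)
end program read_only_open
```

Checking the iostat from open before attempting the read, as done here, also makes it easier to tell an open problem from a read problem.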
I am currently "optimizing" a scientific modelling program developed in Fortran 95. The program performs heavy 3D computations to solve some equations. In addition, numerous variables have to be saved and reused: about 50 tables with sizes like (50; 50; 10000), and even some 5D tables with sizes like (6;6;15;15;10000), which are kept in order to reduce the computation time.
I developed a perfectly working version of this code using a Python 3 interface to control my runs. Basically, Python calls a Fortran module containing my code to obtain all the results of my modelling. The problem with this method is that I cannot parallelize my code in some time-consuming regions. Moreover, I would benefit from Fortran's speed for post-processing of the models, which is currently done partly in Python because of the interface.
In the first part of my optimization campaign, I want to add a way to control the runs from Fortran. A program would call the module containing my code to obtain all the necessary, heavy variables. The Python interface would still be available; the switch between Fortran and Python run control is made at compile time, directly in the Makefile. This Makefile is already done, everything compiles well, and the Python interface still works perfectly.
My trouble concerns the Fortran control program and, I assume, its management of allocated memory. Because the sizes of my tables are not known in advance and determining them requires opening some files, I have to declare all my variables as ALLOCATABLE. I then allocate them with the correct sizes before calling the module containing my code. When my code is called, memory-related errors appear, with the message "Program received signal SIGSEGV: Segmentation fault - invalid memory reference". The error occurs when I set a table to 0d0; if I reduce the size/precision of my modelling, the program proceeds a bit further before crashing, which is why I suspect a memory problem. I think I am doing something wrong in how the variables are passed between my control program and my modelling module. Maybe some variables are stored in the wrong memory space. For reference, I am using gfortran on Ubuntu 22.04.1.
I have different options for trying to solve this issue: using derived types and pointers, or simply breaking up my modelling module. Before going into these heavy structural modifications, I wanted to know whether someone has experienced an equivalent problem and what the solutions were.
Here is a schema of the structure of my code:
Run program:
program run_model
use coordinates
use file
use mathematical
use modelling_module
implicit none
integer :: n_x, n_y, n_z
real(8),dimension(:), ALLOCATABLE:: x,y,z
! + all other output variables in 3D
! .
! .
! .
! Some operations and file opening
ALLOCATE(x(n_x),y(n_y),z(n_z))
! + all other variables
CALL modelling(n_x, n_y, n_z, output variables)
end program run_model
Modelling module in a separated file:
module modelling_module
use coordinates
use file
use mathematical
implicit none
private
public :: modelling
contains
subroutine modelling(n_x, n_y, n_z, output variables)
integer, intent(in):: n_x, n_y, n_z
real(8),dimension(n_x), intent(out):: x
real(8),dimension(n_y), intent(out):: y
real(8),dimension(n_z), intent(out):: z
! + all output variables
! Computation of the model
! .
! .
! .
end subroutine modelling
end module modelling_module
Thank you in advance for your answers !
I was running some tests with the code below when I noticed strange behavior in my program. When I call the intrinsic subroutine "sleep", nothing is written to the file testing.dat. If I remove the call, it works fine and the numbers are written. I tried the same code (with the call to "sleep") with Intel Fortran and it worked fine as well.
It seems to me that, with the program compiled using gfortran, the sleep subroutine somehow halts execution before the file is written, a behavior that does not occur with Intel Fortran. I'm not a computer science expert, but that is my guess; does anyone have a better one?
I tried all the flags below and nothing changed:
gfortran -g file.f90 -o executable
gfortran file.f90 -o executable
gfortran -O3 file.f90 -o executable
I am using a xubuntu 18.01 OS.
program test
implicit none
integer :: i, j, k
open(34, file="testing.dat")
do i=1,9999999
do j=1,9999999
do k=1,9999999
print*, i, j, k
write(34,'(3I8)') i, j, k
call sleep (1)
end do
end do
end do
end program
File output can be buffered. That means that the characters or bytes to be written to the external file are first gathered somewhere in memory and then written to the external file in larger chunks. That can speed up file output. If you look at the external file at a random moment, it does not have to contain the output from all write statements that were executed; some of it may still be in the buffers. The flush(unit) statement makes the data visible to external processes by flushing the buffer. The gfortran manual for the older flush intrinsic subroutine states:
The FLUSH intrinsic and the Fortran 2003 FLUSH statement have
identical effect: they flush the runtime library's I/O buffer so that
the data becomes visible to other processes. This does not guarantee
that the data is committed to disk.
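As a small illustration, here is a sketch that uses the Fortran 2003 flush statement after every write. The loop bound and the newunit= specifier are choices made for this example, not taken from the question. With the flush in place, another process watching testing.dat (for instance tail -f) should see each line as soon as it is written, although, as the manual says, that does not guarantee it has reached the disk.

```fortran
program flush_demo
  implicit none
  integer :: u, i

  open(newunit=u, file='testing.dat', status='replace', action='write')
  do i = 1, 5
    write(u, '(I8)') i
    ! Hand the runtime library's buffered output to the operating system
    ! so that other processes can see this line right away.
    flush(u)
  end do
  close(u)
end program flush_demo
```

Closing the unit, or normal program termination, also empties the buffers, which is why the file looks complete once a program has finished.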
File buffering can typically also be controlled through compiler or runtime-library settings, using compiler flags or environment variables. For gfortran you can find the runtime variables at https://gcc.gnu.org/onlinedocs/gfortran/Runtime.html#Runtime
There are four variables you might be interested in:
GFORTRAN_UNBUFFERED_ALL: Do not buffer I/O for all units
GFORTRAN_UNBUFFERED_PRECONNECTED: Do not buffer I/O for preconnected units.
GFORTRAN_FORMATTED_BUFFER_SIZE: Buffer size for formatted files
GFORTRAN_UNFORMATTED_BUFFER_SIZE: Buffer size for unformatted files
In the program I am using, there is a subroutine open(name,reclen,etc) to open files in a standardized manner.
The program uses a lot of disk IO for multiple reasons, and files that are written somewhere are often opened multiple times in different subroutines.
Is it possible to print out the current call path in a subroutine?
Something like
subroutine open()
call print_path()
end subroutine
which would print something like a stack trace without killing the program:
this instance of open() was called at:
program line routine/program/function
=========================================
calc 157 subrout1.f90
calc 112 parentrout.f90
calc 20 calc.f90
The opened file here has the name ABC.txt
So in this instance I know that the file ABC.txt was opened in subrout1 at line 157 which was called in parentrout at line 112 in the program calc at line 20.
You can get a backtrace at any time by calling subroutine backtrace() in gfortran and tracebackqq() in Intel Fortran (see also answers to how to stop a fortran program abnormally). These are compiler specific. I don't know of any standard solution nor a solution that would be at least common to these two compilers.
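A minimal gfortran-only sketch of this idea follows. The module name open_tracing and the wrapper traced_open are invented for illustration; only backtrace itself is the GNU extension mentioned above. It will not compile with a compiler that lacks that extension, and it should be built with -g if file names and line numbers are wanted in the trace.

```fortran
module open_tracing
  implicit none
contains
  ! Hypothetical wrapper: reports the file name, then prints a backtrace.
  subroutine traced_open(name)
    character(len=*), intent(in) :: name
    print *, 'opening file: ', name
    call backtrace()   ! GNU extension; execution continues afterwards
  end subroutine traced_open
end module open_tracing

program calc
  use open_tracing
  implicit none
  call traced_open('ABC.txt')
  print *, 'still running after the backtrace'
end program calc
```

With Intel Fortran, the corresponding call would be tracebackqq from the ifcore module; as far as I know it terminates the program by default, and passing user_exit_code=-1 makes it return to the caller instead, but check the Intel documentation for your version.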
I know this may sound like a stupid question: is there any difference between
write(*,*)
and
write(6,*)
?
I am running a complicated code on the supercomputer in my institute which outputs a data file via a unit number different than 6, and apparently the Fortran code compiled with the ONLY difference being the above code gives me a different data file (i.e., data do not match).
I know the (*,*) format goes to standard output, while the (6,*) renders on screen, however I am really confused by why this has any effect on my actual data. Any ideas about how this works would be appreciated!
The unit denoted by * is the "standard output" (not a true Fortran standard term). It is usually pre-connected as unit number 6, but it can be connected to a different one - compiler options control that. You can check this using the constant OUTPUT_UNIT in the module iso_fortran_env
OUTPUT_UNIT:
Identifies the preconnected unit identified by the asterisk (*) in WRITE statement.
(from gfortran documentation)
Most often the results can be expected to be the same for both. If that is not the case for you, you have to show us what the differences look like.
If you use some other unit number and you opened it yourself in your own code, anything can happen. You must check the options you used when opening the file, i.e. the open statement and the compiler options in place.
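The check against OUTPUT_UNIT can be sketched as follows. The program name and messages are made up; output_unit and the iso_fortran_env module are standard Fortran 2003. The write to unit 6 is only executed when 6 really is the unit behind *, so the program behaves the same whatever the compiler's choice is.

```fortran
program show_output_unit
  use, intrinsic :: iso_fortran_env, only: output_unit
  implicit none

  ! Report which unit number the asterisk is connected to.
  print *, 'unit behind *: ', output_unit

  write(*, *) 'written via *'
  write(output_unit, *) 'written via output_unit'
  if (output_unit == 6) then
    write(6, *) 'written via 6 (same unit here)'
  end if
end program show_output_unit
```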
I have downloaded the following fortran program dragon.f at http://www.iamg.org/documents/oldftp/VOL32/v32-10-11.zip
I need to do a minor modification to the program which requires the program to be translated to fortran90 (see below to confirm if this is truly needed).
I have managed to do this (translation only) by three different methods:
1) replacing comment line indicators (c for !) and line continuation indicators (* in column 6 for & at the end of the previous line)
2) using convert.f90 (see https://wwwasdoc.web.cern.ch/wwwasdoc/WWW/f90/convert.f90)
3) using f2f.pl (see https://bitbucket.org/lemonlab/f2f/downloads)
Both 1) and 3) worked (i.e. managed to compile program) while 2) didn't work straight away.
However, after testing the program I found that the results are different.
With the fortran77 program, I get the "expected" results for the example provided with the program (the program comes with an example data "grdata.txt", and its example output "flm.txt" and "check.txt"). However, after running the translated (fortran90) program the results I get are different.
I suspect there are some issues with the way some variables are declared.
Can you give me recommendations in how to properly translate this program so I get the exact same results?
The reason I need to do it in fortran90 is that I need to input the parameters via a text file instead of modifying the program. This shouldn't be an issue for most of the parameters involved, except for the declaration of the last one, whose size is determined from parameters that the program does not know a priori (see below):
implicit double precision(a-h,o-z)
parameter(lmax=90,imax=45,jmax=30)
parameter(dcta=4.0d0,dfai=4.0d0)
parameter(thetaa=0.d0,thetab=180.d0,phaia=0.d0,phaib=120.d0)
dimension f(0:imax,0:jmax),coe(imax,jmax,4),coew(4),fw(4)
So for example, I will read lmax, imax, jmax, dcta, dfai, thetaa, thetab, phaia, and phaib and the program needs to declare f and coe but as far as I read after googling this issue, they cannot be declared with an unknown size in fortran77.
Edit: This was my attempt to do this modification:
character fname1*100
call getarg(1,fname1)
open(10,file=fname1)
read(10,*)lmax,imax,jmax,dcta,dfai,thetaa,thetab,phaia,phaib
close(10)
So the program will read these constants from a file (e.g. params.txt), where the name of the file is supplied as an argument when invoking the program. The problem when I do this is that I do not know how to modify the line
dimension f(0:imax,0:jmax)...
in order to declare this array when the values imax and jmax are not known when compiling the program (they depend on the size of the data that the user will use).
As has been pointed out in the comments above, parameters cannot be read from file since they are set at compile time. Read them in as integer, declare the arrays as allocatable, and then allocate.
integer imax,jmax
real(8), allocatable :: f(:,:),coe(:,:,:)
read(10,*) imax,jmax
allocate(f(0:imax,0:jmax),coe(imax,jmax,4))
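Put together with the reading code from the question, a complete program might look like the sketch below. It assumes the parameter file holds the nine values in the order given in the question. The program name, the error messages and the zero initialisation are additions for the example, and get_command_argument is used as the standard (Fortran 2003) counterpart of getarg.

```fortran
program alloc_from_file
  implicit none
  integer :: u, ios, lmax, imax, jmax
  double precision :: dcta, dfai, thetaa, thetab, phaia, phaib
  double precision, allocatable :: f(:,:), coe(:,:,:)
  character(len=100) :: fname1

  ! File name comes from the first command-line argument.
  call get_command_argument(1, fname1)
  if (len_trim(fname1) == 0) stop 'usage: program params.txt'

  open(newunit=u, file=trim(fname1), status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open parameter file'
  read(u, *, iostat=ios) lmax, imax, jmax, dcta, dfai, thetaa, thetab, phaia, phaib
  close(u)
  if (ios /= 0) stop 'cannot read parameters'

  ! The sizes are only known now, so the arrays are allocated at run time.
  allocate(f(0:imax, 0:jmax), coe(imax, jmax, 4))
  f = 0.0d0
  coe = 0.0d0
  print *, 'shape(f)   = ', shape(f)
  print *, 'shape(coe) = ', shape(coe)
end program alloc_from_file
```

Because the former parameters are now ordinary variables, they can no longer appear in declarations that need compile-time constants, so any other array dimensioned with them has to become allocatable in the same way.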
I found out that the differences in the results were attributed to using different compilers.
PS I ended up adding a lot more code than I intended at the beginning to allow reading data from netcdf files. This program in particular is really helpful for spherical harmonic expansion.