Reading variable length data in FORTRAN

I have an input file whose format I cannot alter. One line in particular can contain either 6 or 7 reals, and I have no way of knowing which ahead of time.
After some reading, my understanding of the list-directed read statement is that if I attempt to read 7 reals from a line containing 6, it will continue reading from the next line. The author of the code says that when it was written, it would read the 6 reals and then default the 7th to 0. I assume he relied on some compiler-specific behavior, because I cannot find this behavior mentioned anywhere.
I am using gfortran as my compiler. Is there a way to specify this behavior? Or is there a good way to count the number of inputs on a line and then rewind to read the correct number?

Here is a little trick to accomplish that:
character*100 line
real array(7)
read(unit,'(a)') line      ! read the whole line as a string
line=trim(line)//' 0'      ! append a zero to the string
read(line,*) array         ! list-directed read from the string
If the input line had only 6 values, the appended zero becomes the seventh.
If there were 7 to begin with, the extra zero is simply ignored.
I try to avoid using format specifiers on input as much as possible.
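For the "count and then read the correct number" part of the question, here is a minimal sketch of a related approach: keep the line in a string, try a list-directed internal read of 7 values, and fall back to 6 if IOSTAT reports a problem. The sample line and the program name are made up for illustration.

```fortran
program read_six_or_seven
  implicit none
  character(len=100) :: line
  real :: array(7)
  integer :: ios, n

  ! Hypothetical input line holding only six values.
  line = '1.0 2.0 3.0 4.0 5.0 6.0'

  ! Try seven values first. An internal list-directed read runs out of
  ! data when the string holds fewer items, and IOSTAT becomes nonzero.
  array = 0.0
  n = 7
  read(line, *, iostat=ios) array
  if (ios /= 0) then
    ! Fall back to six values; the seventh element stays at zero.
    n = 6
    array = 0.0
    read(line, *, iostat=ios) array(1:6)
  end if
  if (ios /= 0) error stop 'could not parse the line'

  print '(a,i0)', 'values found: ', n
  print '(7f6.1)', array
end program read_six_or_seven
```

Because the read is done on a string, nothing is consumed from the file by the failed attempt, so no rewind or backspace is needed.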

You could use the IOSTAT specifier to detect a failed read when you attempt to read 7 values and there are only 6, and the ADVANCE specifier so that you can retry reading the same line.
READ(LU,'(7F10.3)', IOSTAT=iError, ADVANCE='NO') MyArray(1:7)
IF(iError > 0) THEN
! Error when trying to read 7 values => try to read 6 !
READ(LU, '(6F10.3)') MyArray(1:6)
ELSEIF(iError == 0) THEN
READ(LU, *) ! For skipping the line read with success with 7 values
ENDIF
IOSTAT takes a negative value when, for example, the end of the file is reached, a positive value when a read error occurs (typically a formatting error), and 0 when the read succeeds. See this link for a complete list of gfortran error codes: http://www.hep.manchester.ac.uk/u/samt/misc/gfortran_errors.html
Another way is to read the line as a string and manipulate the string to get the vector values:
CHARACTER(LEN=1000) :: sLine
...
READ(LU, '(A)') sLine
READ(sLine,'(7F10.3)', IOSTAT=iError) MyArray(1:7)
IF(iError > 0) THEN
! Error when trying to read 7 values => try to read 6 !
READ(sLine, '(6F10.3)') MyArray(1:6)
ENDIF
If the values are written in fixed format, you can determine the length of the vector by testing the length of the line:
CHARACTER(LEN=1000) :: sLine
INTEGER :: nbValues
CHARACTER(LEN=2) :: sNbValues
...
READ(LU, '(A)') sLine
nbValues = LEN_TRIM(sLine) / 10 ! If format is like '(F10.x)'
WRITE(sNbValues, '(I2)') nbValues
READ(sLine, '('//TRIM(sNbValues)//'(F10.3))') MyArray(1:nbValues)

Related

How do I skip lines when some conditions are met with Fortran? [duplicate]

It is my understanding that Fortran, when reading data from a file, will skip lines starting with an asterisk (*), assuming that they are comments. Well, I seem to be having a problem achieving this behavior with a very simple program I created. This is my simple Fortran program:
1 program test
2
3 integer dat1
4
5 open(unit=1,file="file.inp")
6
7 read(1,*) dat1
8
9
10 end program test
This is "file.inp":
*Hello
1
I built my simple program with
gfortran -g -o test test.f90
When I run, I get the error:
At line 7 of file test.f90 (unit = 1, file = 'file.inp')
Fortran runtime error: Bad integer for item 1 in list input
When I run the input file with the comment line deleted, i.e.:
1
The code runs fine. So it seems to be a problem with Fortran correctly interpreting that comment line. It must be something exceedingly simple I'm missing here, but I can't turn up anything on google.
Fortran doesn't automatically skip comment lines in input files. You can do this easily enough by first reading the line into a string, then checking the first character for your comment symbol (or searching the string for that symbol), and, if the line is not a comment, doing an "internal read" of the string to obtain the numeric value.
Something like:
use, intrinsic :: iso_fortran_env
character (len=200) :: line
integer :: dat1, RetCode
read_loop: do
read (1, '(A)', iostat=RetCode) line
if ( RetCode == iostat_end) exit read_loop
if ( RetCode /= 0 ) then
! ... read error
exit read_loop
end if
if ( index (line, "*") /= 0 ) cycle read_loop
read (line, *) dat1
end do read_loop
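A self-contained sketch of the same idea follows. To keep it runnable without an input file, it first writes a few lines to a scratch file; the file contents, the variable names, and the choice to test only the first non-blank character are assumptions made for this illustration.

```fortran
program skip_comment_lines
  use, intrinsic :: iso_fortran_env, only: iostat_end
  implicit none
  character(len=200) :: line
  integer :: u, ios, dat1, total

  ! Build a small scratch file so the sketch needs no external input.
  open(newunit=u, status='scratch', action='readwrite')
  write(u, '(a)') '*Hello'
  write(u, '(a)') '1'
  write(u, '(a)') '* another comment'
  write(u, '(a)') '41'
  rewind(u)

  total = 0
  do
    read(u, '(a)', iostat=ios) line
    if (ios == iostat_end) exit            ! normal end of file
    if (ios /= 0) error stop 'read error'
    line = adjustl(line)
    if (len_trim(line) == 0) cycle         ! skip blank lines
    if (line(1:1) == '*') cycle            ! skip comment lines
    read(line, *) dat1                     ! internal read of a data line
    total = total + dat1
  end do
  close(u)

  print '(a,i0)', 'sum of data lines: ', total
end program skip_comment_lines
```

Testing only the first character, rather than searching the whole line with index, keeps data lines that happen to contain an asterisk later on.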
Fortran does not ignore anything by default, unless you are using namelists and in that case comments start with an exclamation mark.
I found the use of the backspace statement to be a lot more intuitive than the proposed solutions. The following subroutine skips the line when a comment character, "#" is encountered at the beginning of the line.
subroutine skip_comments(fileUnit)
integer, intent(in) :: fileUnit
character(len=1) :: firstChar
firstChar = '#'
do while (firstChar .eq. '#')
read(fileUnit, '(A)') firstChar
enddo
backspace(fileUnit)
end subroutine skip_comments
This subroutine may be used in programs before the read statement like so:
open(unit=10, file=filename)
call skip_comments(10)
read(10, *) a, b, c
call skip_comments(10)
read(10, *) d, e
close(10)
Limitations for the above implementation:
It will not work if the comment is placed between the values of a variable spanning multiple lines, say an array.
It is very inefficient for large input files since the entire file is re-read from the beginning till the previous character when the backspace statement is encountered.
Can only be used for sequential access files, i.e. typical ASCII text files. Files opened with the direct or append access types will not work.
However, I find it a perfect fit for short files used for providing user-parameters.

READ from array instead of file

I have an existing Fortran 77 program where the input values are read from an input file.
read (unit=*, fmt=*) value
The read statement automatically jumps to the next line in the file every time it is called.
Is it possible to replace the "reference" index with another data container, like an array?
For example:
read (myarray, fmt=*) value
I tried it but it always reads the first array-element and does not jump automatically to the next element.
I would have to change every read(unit=*, ...) to read(array(i), ...) and increase the i separately to get to the next element.
Since the program is huge, I am looking for a way to keep the existing read statements and just change the source of the data.
So the unit wouldn't be an integer value but an array where every element is a line from my input file.
Does anybody have an idea?
I tried to describe the problem in code:
(input_file.input is just 15 lines with the numbers 1 to 15)
program FortranInput
implicit none
! Variables
integer :: i, inpid
character*130, dimension(100) :: inp_values
character*130 :: value
inpid = 20
! Open File
open(inpid, file='input_file.input')
! Read from file ------------------------------------------------
do i = 1, 15
! read always takes the next line in the file
read(inpid,'(a130)') value
! write each line to new array-element
inp_values(i) = value
! output each line from file to screen
write(*,*) value
end do
close (inpid)
! Read from array -----------------------------------------------
do i = 1, 15
! read always takes the first line in the array
read(inp_values,'(a130)') value
write(*,*) value
end do
end program FortranInput
Yes, in your example you have to always read from the appropriate array element using the (i) syntax. I can't see another way.
However, you can often use a character array as an internal file with multiple records, without using the element index. Consider this:
integer :: i, n=15
character*130, dimension(100) :: inp_values
character*130 :: value
integer :: values(100)
do i = 1, n
write(value,*) i
inp_values(i) = value
end do
read(inp_values,'(*(i130,/))') values(1:n)
write(*,*) values(1:n)
or even
read(inp_values,*) values(1:n)
It is important to remember that an internal file does not keep track of the position at which it is opened. The position is only valid within each write or read statement.
Internal files, unlike external files, have no concept of persistent position (between input/output statements). In this regard if you want one read statement to transfer from one record and the next read from another record you will have to reference these records directly.
However, you don't show how you really want to use the input. If you can re-write the input to use a single read statement then the appropriate records will be the source.
For example, if you can rewrite
do i=1,5
read(unit, '(I5)') x(i)
end do
as
read(unit, '(I5)') x(1:5)
(format reversion moves to the next record for each element, so no explicit slash is needed), then you can easily switch to using an internal file.
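The difference between the cases can be seen in a small sketch. The array contents and names here are invented for the demonstration; the point is only that separate READ statements on the whole array restart at the first record, while a single READ, or explicit indexing with a counter, reaches the later records.

```fortran
program internal_file_records
  implicit none
  character(len=20) :: lines(5)
  integer :: values(5), a, b, i, next

  ! Fill the internal file: record i holds the number 10*i.
  do i = 1, 5
    write(lines(i), '(i0)') 10*i
  end do

  ! Two separate READs on the whole array both start at record 1.
  read(lines, *) a
  read(lines, *) b
  print '(a,i0,1x,i0)', 'whole array, two reads: ', a, b

  ! A single READ with several items walks through successive records.
  read(lines, *) values
  print '(a,5(1x,i0))', 'whole array, one read:', values

  ! Separate READs need an explicit record counter.
  next = 1
  read(lines(next), *) a
  next = next + 1
  read(lines(next), *) b
  next = next + 1
  print '(a,i0,1x,i0)', 'indexed, two reads: ', a, b
end program internal_file_records
```

For a large program, the counter could live in a module so that each read statement only has to change its unit to lines(next) and bump the counter afterwards.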

Fortran is reading beyond endfile record

I'm trying to read some data from a file, and detecting the endfile record is important for stopping the read. However, depending on the dimensions of the array used to read the data, I cannot properly detect the endfile record and my Fortran program stops.
The program is below:
!integer, dimension(3) :: x ! line 1.1
!integer, dimension(3,10) :: x ! line 1.2
integer, dimension(10,3) :: x ! line 1.3
integer :: status,i=1
character(len=100) :: error
open( 30, file='data.dat', status='old' )
do
print *,i
!read( 30, *, iostat=status, iomsg=error ) x ! line 2.1
!read( 30, *, iostat=status, iomsg=error ) x(:,i) ! line 2.2
read( 30, *, iostat=status, iomsg=error ) x(i,:) ! line 2.3
if ( status < 0 ) then
print *,'EOF'
print *,'total of ',i-1,' lines read.'
exit
else if ( status > 0 ) then
print *,'error cod: ',status
print *,'error message: ', error
stop
else if ( status == 0 ) then
print *,'reading ok.'
i = i + 1
end if
end do
With the 'data.dat' file being:
10 20 30
30 40 50
When lines 1.3 and 2.3 are uncommented the mentioned error appears:
error cod: 5008
error message: Read past ENDFILE record
However, using lines 1.1 and 2.1, or 1.2 and 2.2, the program works, detecting endfile record.
So, I would like some help on understanding why I cannot use lines 1.3 and 2.3 to read properly this file, since I'm giving the correct number of array elements for read command.
I'm using gfortran compiler, version 6.3.0.
EDIT: simpler example
the following produces a 5008 "Read past ENDFILE record" error:
implicit none
integer x(2,2),s
open(20,file='noexist')
read(20,*,iostat=s)x
write(*,*)s
end
If we make x a scalar or a one-dimensional array (of any size), we get the expected -1 EOF flag. It doesn't matter whether the file doesn't exist or is empty. If the file contains some, but not enough, data, it's hard to make sense of which return value you might get.
I am not sure if I am expressing myself correctly but it has to do with the way fortran is reading and storing 2d-arrays. When you are using this notation: x(:,i), the column i is virtually expanded in-line and the items are read using this one line of code. In the other case where x(i,:) is used, the row i is read as if you called read multiple times.
You may use implied loops if you want to stick with a specific shape and size. For example, you could use something like this:
read( 30, *, iostat=status, iomsg=error ) (x(i,j), j=1,3)
In any case you should check that your data are stored properly (as expected at least) in variable x.
Please note this is only a guess. Remember that Fortran stores arrays in column major order. When gfortran compiles read() x(:,i), the 3 memory locations are next to each other so in the executable, it produces a single call to the operating system to read in 3 values from the file.
Now when read() x(i,:) is compiled, the three data elements x(i,1), x(i,2) and x(i,3) are not in contiguous memory. So I am guessing the executable actually has 3 read calls to the operating system. The first one would trap the EOF but the 2nd one gives you the read past end of file error.
UPDATE: I have confirmed that this does not occur with Intel's ifort. gfortran seems to have had a similar problem before: Bad IOSTAT values when readings NAMELISTs past EOF. Whether this is a bug or not is debatable. The code certainly looks like it should trap an EOF.
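One possible workaround, sketched here under the assumption (reported above) that reading into a one-dimensional array returns the expected EOF status, is to read each line into a small contiguous buffer and then copy it into the row of x. The scratch file stands in for data.dat, and the names row and n are made up for this example.

```fortran
program read_rows_with_buffer
  implicit none
  integer :: x(10,3), row(3), u, ios, n

  ! Scratch file standing in for data.dat from the question.
  open(newunit=u, status='scratch', action='readwrite')
  write(u, '(a)') '10 20 30'
  write(u, '(a)') '30 40 50'
  rewind(u)

  x = 0
  n = 0
  do
    if (n == size(x, 1)) exit          ! array is full
    read(u, *, iostat=ios) row         ! contiguous 1-D buffer
    if (ios /= 0) exit                 ! EOF or error ends the loop
    n = n + 1
    x(n, :) = row                      ! copy the buffer into row n of x
  end do
  close(u)

  if (ios > 0) error stop 'read error'
  print '(a,i0,a)', 'total of ', n, ' lines read.'
  print '(3(1x,i0))', x(1, :)
  print '(3(1x,i0))', x(2, :)
end program read_rows_with_buffer
```

The read statement never sees the non-contiguous section x(i,:), so whatever gfortran does differently for that case is avoided.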

overwrite a file using fortran

I am using a Fortran 90 program that writes a file. The first line of this file is supposed to indicate the number of lines in the remaining file. The file is written by the program when a certain criterion is met and that can't be determined beforehand. Basically, I will know the total number of lines only after the run is over.
I want to do it in the following manner:
1) open the file and write the first line with some text say, "Hello"
2) Write rows in the file as desired and keep a counter for number of rows.
3) Once the run is over and just before closing the file, replace the first line string ("Hello") with the counter.
The problem is in step 3. I don't know how to replace the first line.
Another option that I can think of is to write to 2 files. First, write a file as above without the counter. Once the run is over, close the file and write another file and this time, I know the value of the counter.
I believe there is a way to proceed with the first approach. Can someone please help me with this?
Fortran supports three forms of file access - DIRECT, STREAM (F2003+) and SEQUENTIAL. Both DIRECT and STREAM access support being able to rewrite earlier parts of a file, SEQUENTIAL access does not (a rewrite to an earlier record truncates the file at the rewritten record).
With direct access, all the records in the file are the same length. An arbitrary record can be [must be] accessed by any input/output statement by simply specifying the relevant record number in the statement. Note though, that the typical disk format of a direct access file may not match your idea of a file with "lines".
With formatted stream access, the current position in the file can be captured using an INQUIRE statement, and then a later input/output statement can begin data transfer at that position by using a POS specifier. The typical disk format of a formatted stream access file usually matches with what people expect of a text file with lines.
Stream access is likely what you want. Examples for both approaches are shown below.
Direct access:
PROGRAM direct
IMPLICIT NONE
INTEGER :: unit
REAL :: r
INTEGER :: line
OPEN( NEWUNIT=unit, &
FILE='direct.txt', &
STATUS='REPLACE', &
ACCESS='DIRECT', &
RECL=15, & ! The fixed record length.
FORM='FORMATTED' )
CALL RANDOM_SEED()
! No need to write records in order - we just leave off
! writing the first record until the end.
line = 0
DO
CALL RANDOM_NUMBER(r)
IF (r < 0.05) EXIT
line = line + 1
PRINT "('Writing line ',I0)", line
! All the "data" records are offset by one, to allow the
! first record to record the line count.
WRITE (unit, "('line ',I10)", REC=line+1) line
END DO
! Now update the first record with the number of following "lines".
WRITE (unit, "(I10)", REC=1) line
CLOSE(unit)
END PROGRAM direct
Stream access:
PROGRAM stream
IMPLICIT NONE
INTEGER :: unit
REAL :: r
INTEGER :: line
INTEGER :: pos
OPEN( NEWUNIT=unit, &
FILE='stream.txt', &
STATUS='REPLACE', &
ACCESS='STREAM', &
POSITION='REWIND', &
FORM='FORMATTED' )
CALL RANDOM_SEED()
! Remember where we are. In this case, the position
! is the first file storage unit in the file, but
! it doesn't have to be.
INQUIRE(unit, POS=pos)
! Leave some space in the file for later overwriting
! with the number of lines. We'll stick the number
! zero in there for now.
WRITE (unit, "(I10)") 0
! Write out the varying number of lines.
line = 0
DO
CALL RANDOM_NUMBER(r)
IF (r < 0.05) EXIT
line = line + 1
PRINT "('Writing line ',I0)", line
WRITE (unit, "('line ',I10)") line
END DO
! Now update the space at the start with the number of following "lines".
WRITE (unit, "(I10)", POS=pos) line
CLOSE(unit)
END PROGRAM stream
Going back on a sequential access file is tricky, because lines can vary in length. And if you change the length of one line, you'd have to move all the stuff behind.
What I recommend is to write your output to a scratch file while counting the number of lines. Then, once you're finished, rewind the scratch file, write the number of lines to your output file, and copy the contents of the scratch file to that output file.
Here's what I did:
program var_file
implicit none
character(len=*), parameter :: filename = 'delme.dat'
integer :: n, io_stat
character(len=300) :: line
open(unit=200, status='SCRATCH', action="READWRITE")
n = 0
do
read(*, '(A)') line
if (len_trim(line) == 0) exit ! Empty line -> finished
n = n + 1
write(200, '(A)') trim(line)
end do
rewind(200)
open(unit=100, file=filename, status="unknown", action="write")
write(100, '(I0)') n
do
read(200, '(A)', iostat=io_stat) line
if (io_stat /= 0) exit
write(100, '(A)') trim(line)
end do
close(200)
close(100)
end program var_file

Write and read access=stream files in fortran

I have a shell script from which I pass a binary file to a fortran program such that
Mth=$1
loop=1
it=1
while test $it -le 12
do
Mth=`expr $Mth + $loop`
file="DataFile"$Mth".bin"
./fort_exe ${Yr} ${nt} ${it}
# Increment loop
it=`expr $it + 1`
done
This script is used to pass 12 files within a do loop to the fortran program. In the fortran program, I read the binary file passed from the shell script and I am trying to write a 2nd file which would compile in a single file all the data that was read from the consecutive files e.g.
!Open binary file passed from shell script
open(1,file='Datafile'//TRIM(Mth)//'.bin',action='read',form='unformatted',access='direct', &
recl=4*x*y, status='old')
! Open write file for t 1. The status is different in t 1 and t > 1 so I open it twice: I guess there is a more elegant way to do this...
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append', status='replace')
irec = 0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,rec=irec) val(:,:)
! write output file
irecW= irec + (imonth-1)*nt
if ( t .eq. 1) write(2,pos=irecW) val(:,:)
! Close file after t = 1, update the status to old and reopen.
if ( t .eq. 2) then
close (2)
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append',status='old')
endif
if ( t .ge. 2) write(2,pos=irecW) val(:,:)
enddo
I can read the binary data from the first file no problem but when I try and read from another program the binary data from the file that I wrote in the first program such that
open(1,file='Newfile.bin',action='read',form='unformatted', &
access='stream', status='old')
irec=0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,pos=irec) val(:,:)
write(*,*) val(:,:)
enddo
val(:,:) is nothing but a list of zeros. This is the first time I use access=stream which I believe is the only way I can use position='append'. I have tried compiling with gfortran and ifort but I do not get any error messages.
Does anyone have any idea why this is happening?
Firstly, I do not think you need to close and reopen your output file as you are doing. The status specifier is only relevant to the open statement in which it appears: replace will delete Newfile.bin if it exists at that time, before opening a new file with the same name. The status is implicitly changed to old, but this does not affect any operations done to the file.
However, since your Fortran code does not know you run it 12 times, you should have a way of making sure the file is only replaced the first time and opened as old afterwards; otherwise, Newfile.bin will only contain the information from the last file processed.
As for reading in the wrong values, this most likely occurs because of the difference between direct access (where you can choose a record length) and stream access (where you cannot). With stream access, data is stored as a sequence of "file storage units". Their size is in general compiler-dependent, but is available through the module iso_fortran_env as file_storage_size; it is usually 8 bits. This means that each entry will usually occupy multiple storage units, so you have to take care that a read or write with the pos = specifier does not access the wrong storage units.
Edit:
Some example code writing and reading with stream access:
program stream
use, intrinsic :: iso_fortran_env
implicit none
integer :: i, offset
real(real32), dimension(4,6) :: val, nval
open(unit=2, file='Newfile.bin', action='readwrite', form='unformatted', &
access='stream', status='replace')
do i = 1,2
call random_number(val)
write(2) val
enddo
! The file now contains two sequences of 24 reals, each element of which
! occupies the following number of storage units:
offset = storage_size(val) / file_storage_size
! Retrieve the second sequence and compare:
read(2, pos = 1 + offset*size(val)) nval
print*, all(nval == val)
close(2)
end program
The value T (true) should be printed to the screen.
Note also that it's not strictly necessary to specify a pos while writing your data to the file, because the file will automatically be positioned beyond the last record read or written.
That said, direct or stream access is most beneficial if you need to access the data in a non-sequential manner. If you only need to combine input files into one, it could be easier to write the output file with sequential access, for which you can also specify recl and position = 'append'.
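As a rough sketch of the "replace the first time, append afterwards" idea, the loop below stands in for the 12 separate runs launched by the shell script, with i playing the role of the counter it passes in. The file name, the array size, and the read-back check are all assumptions for the demonstration.

```fortran
program append_blocks
  implicit none
  character(len=*), parameter :: fname = 'append_demo.bin'  ! hypothetical name
  integer :: u, i
  real :: val(4), check(4)

  ! Each pass mimics one run of the program: open, write one block, close.
  do i = 1, 3
    val = real(i)
    if (i == 1) then
      ! First run: start a fresh file.
      open(newunit=u, file=fname, form='unformatted', access='stream', &
           status='replace', action='write')
    else
      ! Later runs: keep the existing data and position at the end.
      open(newunit=u, file=fname, form='unformatted', access='stream', &
           status='old', position='append', action='write')
    end if
    write(u) val            ! no POS= needed, the file is already positioned
    close(u)
  end do

  ! Read the blocks back in order and check their contents.
  open(newunit=u, file=fname, form='unformatted', access='stream', &
       status='old')
  do i = 1, 3
    read(u) check
    if (any(check /= real(i))) error stop 'unexpected data'
  end do
  close(u, status='delete')  ! remove the demo file again

  print '(a)', 'all three blocks read back correctly'
end program append_blocks
```

No pos= specifier appears anywhere, which avoids the storage-unit arithmetic entirely when the data is only ever written and read in order.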
You can check for the existence of a file in standard Fortran, by using the inquire statement:
logical :: exist
inquire(file="test.dat", exist=exist)
if (exist) then
print *, "File test.dat exists"
else
print *, "File test.dat does not exist"
end if
Alternatively you can have a look at the modFileSys library which provides libc like file manipulation routines.
As for appending and streams: Appending files is also possible when you use "classical" record based fortran files, you do not have to use streams for that.