Problemas reading a CSV file, IOSTAT=5010, gfortran - fortran

I'm in trouble while reading a CSV file with about 2 Gb:
OPEN(unit=11,status='OLD',file=fin,form='formatted')
IEND=0
n=0
DO
n=n+1
READ(11,*,IOSTAT=iEND1) RECORD
IF (IEND==0) THEN
deal with RECORD
ELSEIF (IEND>0) THEN
PRINT *, IEND
ELSE
STOP 'END OF FILE'
ENDIF
ENDDO
CLOSE(UNIT=11)
...
RECORD is an own defined type.
For some reason, everything goes fine up to reading record number 24200. There are about 21 million records in the file. I'm getting an IOSTAT=5010 message and RECORD comes back empty.
I carefully looked into the formatted file with vi -b file.csv and everything seems fine.
I'm using GNU Fortran (GCC) 5.2.0.
What does this IOSTAT coded really mean?

Related

How to handle an optional group in a Fortran Namelist

I am working with a code originally written in Fortran 77 that makes use of namelists (supported by compiler extension at the time of its writing - this feature only became standard in Fortran 90) for reading input files. The namelist input files have groups of namelist variables in between (multiple) plain text headers and footers (see example.nml). Some groups of namelist variables are only read if certain conditions are met for previously read variables.
When reading all the namelist groups in a file in sequence, executables compiled with gfortran, ifort and nagfor all behave the same and give the expected output. However, when a given namelist group in the input file is to be skipped (the optional reading), gfortran and ifort executables handle this as desired, while the executable compiled with nagfor raises a runtime error:
Runtime Error: reader.f90, line 27: Expected NAMELIST group /GRP3/ but found /GRP2/
Program terminated by I/O error on unit 15 (File="example.nml",Formatted,Sequential)
As a minimal working example reproducing the problem, consider the namelist file example.nml and driver program reader.f90 given below, in which NUM2 from namelist group GRP2 should only be read if NUM1 from namelist group GRP1 equals 1:
example.nml:
this is a header
&GRP1 NUM1=1 /
&GRP2 NUM2=2 /
&GRP3 NUM3=3 /
this is a footer
reader.f90:
program reader
implicit none
character(len=40) :: hdr, ftr
integer :: num1, num2, num3, icode
! namelist definition
namelist/grp1/num1
namelist/grp2/num2
namelist/grp3/num3
! open input file
open(unit=15, file='example.nml', form='formatted', status='old', iostat=icode)
! read input data from namelists
read(15, '(a)') hdr
print *, hdr
read(15, grp1)
print *, num1
if (num1 == 1) then
read(15, grp2)
print *, num2
end if
read(15,grp3)
print *, num3
read(15, '(a)') ftr
print *, ftr
! close input file
close(unit=15)
end program reader
All executables give this expected output when NUM1=1:
this is a header
1
2
3
this is a footer
However, when e.g. NUM1=0, the executables compiled with gfortran and ifort give the desired output:
this is a header
0
3
this is a footer
while the executable compiled with nagfor (which is known for being strictly standard conforming), reads the header and first namelist group:
this is a header
0
but then terminates with the previously mentioned runtime error.
As indicated by the error message, example.nml is accessed sequentially, and if that is the case, /GRP2/ is the next record to be read, not /GRP3/ as the program logic asks for, so the error message makes sense to me.
So my question is this:
Can the shown behaviour be attributed to standard (non-)conformance enforced by nagfor and not gfortran and ifort?
If so, does this mean that the non-sequential reading observed with gfortran and ifort is due to extensions supported by these compilers (and not nagfor)? Can this be turned on/off using compiler flags?
The simplest workaround I can think of (minimal change to a large existing program), would be to add a dummy read(15,*) in an else branch for the if statement in reader.f90. This seems to work with all the mentioned compilers. Would this make the code standard conforming (Fortran 90 or later)?
These are the compiler versions and options that were used to compile the executables:
GNU Fortran (Ubuntu 9.1.0-2ubuntu2~18.04) 9.1.0:
gfortran -Wall -Wextra -fcheck=all -g -Og -fbacktrace reader.f90
Intel(R) Visual Fortran, Version 16.0 Build 20160415:
ifort -Od -debug:all -check:all -traceback reader.f90
NAG Fortran Compiler Release 6.1(Tozai) Build 6116:
nagfor -O0 -g -C reader.f90
When namelist formatting is requested on an external file, the namelist record is taken to commence at the record at the current position of the file.
The structure of a namelist input record is well defined by the language specification (see Fortran 2018 13.11.3.1, for example). In particular, this does not allow a mismatching namelist group name. nagfor complaining about this does so legitimately.
Several compilers do indeed appear to continue skipping over records until the namelist group is identified in a record, but I'm not aware of compiler flags available to control that behaviour. Historically, the case was generally that multiple namelists would be specified using distinct files.
Coming to your "simple workaround": this is, alas, not sufficient in the general case. Namelist input may consume several records of an external file. read(15,*) will advance the file position by only a single record. You will want to advance to after the terminating record for the namelist.
When you know the namelist is just that single record then the workaround is good.
#francescalus' answer and comment on that answer, clearly explained the first two parts of my question, while pointing out a flaw in the third part. In the hope that it may be useful to others that stumble upon a similar problem with a legacy code, here is the workaround I ended up implementing:
In essence, the solution is to ensure that the file record marker is always positioned correctly before attempting any namelist group read. This positioning is done in a subroutine that rewinds an input file, reads through records until it finds one with a matching group name (if not found, an error/warning can be raised) and then rewinds and repositions the file record marker to be ready for a namelist read.
subroutine position_at_nml_group(iunit, nml_group, status)
integer, intent(in) :: iunit
character(len=*), intent(in) :: nml_group
integer, intent(out) :: status
character(len=40) :: file_str
character(len=:), allocatable :: test_str
integer :: i, n
! rewind file
rewind(iunit)
! define test string, i.e. namelist group we're looking for
test_str = '&' // trim(adjustl(nml_group))
! search for the record containing the namelist group we're looking for
n = 0
do
read(iunit, '(a)', iostat=status) file_str
if (status /= 0) then
exit ! e.g. end of file
else
if (index(adjustl(file_str), test_str) == 1) then
! backspace(iunit) ?
exit ! i.e. found record we're looking for
end if
end if
n = n + 1 ! increment record counter
end do
! can possibly replace this section with "backspace(iunit)" after a
! successful string compare, but not sure that's legal for namelist records
! thus, the following:
if (status == 0) then
rewind(iunit)
do i = 1, n
read(iunit, '(a)')
end do
end if
end subroutine position_at_nml_group
Now, before reading any (possibly optional) namelist group, the file is positioned correctly first:
program new_reader
implicit none
character(len=40) :: line
integer :: num1, num2, num3, icode
! namelist definitions
namelist/grp1/num1
namelist/grp2/num2
namelist/grp3/num3
! open input file
open(unit=15, file='example.nml', access='sequential', &
form='formatted', status='old', iostat=icode)
read(15, '(a)') line
print *, line
call position_at_nml_group(15, 'GRP1', icode)
if (icode == 0) then
read(15, grp1)
print *, num1
end if
if (num1 == 1) then
call position_at_nml_group(15, 'GRP2', icode)
if (icode == 0) then
read(15, grp2)
print *, num2
end if
end if
call position_at_nml_group(15, 'GRP3', icode)
if (icode == 0) then
read(15, grp3)
print *, num3
end if
read(15, '(a)') line
print *, line
! close input file
close(unit=15)
contains
include 'position_at_nml_group.f90'
end program new_reader
Using this approach eliminates uncertainty in how different compilers treat not finding matching namelist groups at the current record in a file, generating the desired output for all compilers tested (nagfor, gfortran, ifort).
Note: For brevity, only a bare minimum of error checking is done in the code snippet shown here, this (and case insensitive string comparison!) should probably be added.

Fortran is reading beyond endfile record

I'm trying to read some data from a file, and the endfile record detection is important to stop reading. However, depending of the array dimensions of the array used to read data, I cannot detect properly the endfile record and my Fortran program stops.
The program is below:
!integer, dimension(3) :: x ! line 1.1
!integer, dimension(3,10) :: x ! line 1.2
integer, dimension(10,3) :: ! line 1.3
integer :: status,i=1
character(len=100) :: error
open( 30, file='data.dat', status='old' )
do
print *,i
!read( 30, *, iostat=status, iomsg=error ) x ! line 2.1
!read( 30, *, iostat=status, iomsg=error ) x(:,i) ! line 2.2
read( 30, *, iostat=status, iomsg=error ) x(i,:) ! line 2.3
if ( status < 0 ) then print *,'EOF'
print *,'total of ',i-1,' lines read.'
exit
else if ( status > 0 ) then
print *,'error cod: ',status
print *,'error message: ', error
stop
else if ( status == 0 ) then
print *,'reading ok.'
i = i + 1
end if
end do
With 'data.dat' file been:
10 20 30
30 40 50
When lines 1.3 and 2.3 are uncommented the mentioned error appears:
error cod: 5008
error message: Read past ENDFILE record
However, using lines 1.1 and 2.1, or 1.2 and 2.2, the program works, detecting endfile record.
So, I would like some help on understanding why I cannot use lines 1.3 and 2.3 to read properly this file, since I'm giving the correct number of array elements for read command.
I'm using gfortran compiler, version 6.3.0.
EDIT: simpler example
the following produces a 5008 "Read past ENDFILE record" error:
implicit none
integer x(2,2),s
open(20,file='noexist')
read(20,*,iostat=s)x
write(*,*)s
end
if we make x a scalar or a one-d array ( any size ) we get the expected -1 EOF flag. It doesn't matter if the file actually doesn't exist or is empty. If the file contains some, but not enough, data its hard to make sense of which return value you might get.
I am not sure if I am expressing myself correctly but it has to do with the way fortran is reading and storing 2d-arrays. When you are using this notation: x(:,i), the column i is virtually expanded in-line and the items are read using this one line of code. In the other case where x(i,:) is used, the row i is read as if you called read multiple times.
You may use implied loops if you want to stick with a specific shape and size. For example you could use something like that:read( 30, *, iostat=status, iomsg=error ) (x(i,j), j=1,3)
In any case you should check that your data are stored properly (as expected at least) in variable x.
Please note this is only a guess. Remember that Fortran stores arrays in column major order. When gfortran compiles read() x(:,i), the 3 memory locations are next to each other so in the executable, it produces a single call to the operating system to read in 3 values from the file.
Now when read() x(i,:) is compiled, the three data elements x(i,1), x(i,2) and x(i,3) are not in contiguous memory. So I am guessing the executable actually has 3 read calls to the operating system. The first one would trap the EOF but the 2nd one gives you the read past end of file error.
UPDATE: I have confirmed that this does not occur with Intel's ifort. gfortran seems to have had a similar problem before: Bad IOSTAT values when readings NAMELISTs past EOF. Whether this is a bug or not is debatable. The code certainly looks like it should trap an EOF.

Unable to read a number from a text file

I am using Fortran to make a subroutine to use in a CFD shallow water software.
I have written this code to read and use the values stored.
PROGRAM hieto
! Calcula la precipitacion efectiva en funcion del tiempo
!IMPLICIT NONE
real::a
!Abrir CSV
!OPEN(UNIT=10,FILE="datos.txt",FORM="formatted",STATUS="replace",ACTION="readwrite",ACCESS='sequential')
open(unit=10, file='datos.txt')
!Leer el archivo
read(10, *, iostat=ios)a
print*,ios
print*, a
close (UNIT=10)
END PROGRAM hieto
My text file datos, looks like this
1
2
3
When I run the code as is, I get the following output
-1
0.0000000000
Process return 0 (0x0) execution time: 0.002 s
the first number in the row one is one not zero, so I don't know why this happens.
And if I remove the iostat=ios from the read statement, I get the following error:
At ine 13 (the line od the read stament) of file /home/Dropbox/scripts_tesis/fortran/hieto_telemac.f90 (unit=10, file=datos.txt')
Fortran runtime error: end of file.
Proceess returned 2 (0x2)
I have read some answers here so I tried adding end=3 in the read statement, and also to end my text file with a blank line at the end.
The end=3 gives an error saying 3 is not a defined label and putting a blank row in the text file does nothing.
I am using ubuntu 16.04 LTS and Gfortran compiler.
What happens is that your file is empty.
Make sure that there is indeed a file called datos.txt in that directory. Pay attention to the exact name. datos.txt and just datos is not the same thing.
If you tried to open it before with the commented command that includes STATUS="replace" your old file would have been replaced.
And because the file is empty, you didn't real anything useful. If iostat is non-zero, and your is -1, then the value of the variable being read is undefined. So your a is undefined. Again, because your file is empty.
Additionally, you cannot just blindly put end=3 in your code because you saw it somewhere on Stack Overflow. You must first understand what it is supposed to do. There is no reason to combine iostat= and end=. The iostat is perfectly sufficient.

Unexpected end of file with the read command

I'm trying to read data from a mesh file in Fortran 2003, but I'm getting an unexpected end of file runtime error. Some lines in the file seem to be skipped by the read command. For example, with this sample.txt file :
1 2 2 0 1 1132 1131 1165
2 2 2 0 2 1099 1061 1060
I want to read the first integer from each line, so my program is :
program read_file
implicit none
integer :: ierr, i, j
open(unit=10,file='sample.txt',status='old',action='read',iostat=ierr)
read(10,*) i
read(10,*) j
write(*,*) i, j
end program read_file
And at runtime, I'm getting
Fortran runtime error: End of file
What is odd is that if I force a carriage return at the end of the file, the program will read the two integers just fine.
If you really need to fix this on the read side (ie. properly terminating the last line of the file is not practical for some reason ) you might try reading each line into a string, then internal reading from the string:
character*80 line
integer i
do ..
read(unit,'(a)')line
read(line,*)i
enddo
Of course this may or may not work depending on the compiler as well..
Obviously fixing the file is the best option ( Whatever program is creating this file should be fixed )
Every record in a sequential file must be properly terminated. The records in text files are the lines. They must be properly terminated. In some editors that means you must add an empty line to the end. Every line containing data must be terminated.
Some compilers are less sensitive to this issue than others and will terminate the last record for you.

Write and read access=stream files in fortran

I have a shell script from which I pass a binary file to a fortran program such that
Mth=$1
loop=1
it=1
while test $it -le 12
do
Mth=`expr $Mth + $loop`
file="DataFile"$Mth".bin"
./fort_exe ${Yr} ${nt} ${it}
# Increment loop
it=`expr $it + 1`
done
This script is used to pass 12 files within a do loop to the fortran program. In the fortran program, I read the binary file passed from the shell script and I am trying to write a 2nd file which would compile in a single file all the data that was read from the consecutive files e.g.
!Open binary file passed from shell script
open(1,file='Datafile'//TRIM{Mth)//.bin',action='read',form='unformatted',access='direct', &
recl=4*x*y, status='old')
! Open write file for t 1. The status is different in t 1 and t > 1 so I open it twice: I guess there is a more elegant way to do this...
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append', status='replace')
irec = 0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,rec=irec) val(:,:)
! write output file
irecW= irec + (imonth-1)*nt
if ( t .eq. 1) write(2,pos=irecW) val(:,:)
! Close file after t = 1, update the status to old and reopen.
if ( t .eq. 2) then
close (2)
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append',status='old')
endif
if ( t .ge. 2) write(2,pos=irecW) val(:,:)
enddo
I can read the binary data from the first file no problem but when I try and read from another program the binary data from the file that I wrote in the first program such that
open(1,file='Newfile.bin',action='read',form='unformatted', &
access='stream', status='old')
irec=0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,pos=irec) val(:,:)
write(*,*) val(:,:)
enddo
val(:,:) is nothing but a list of zeros. This is the first time I use access=stream which I believe is the only way I can use position='append'. I have tried compiling with gfortran and ifort but I do not get any error messages.
Does anyone have any idea why this is happening?
Firstly, I do not think you need to close and reopen your output file as you are doing. The status specifier is only relevant to the open statement in which it appears: replace will delete Newfile.bin if it exists at that time, before opening a new file with the same name. The status is implicitly changed to old, but this does not affect any operations done to the file.
However, since your Fortran code does not know you run it 12 times, you should have a way of making sure the file is only replaced the first time and opened as old afterwards; otherwise, Newfile.bin will only contain the information from the last file processed.
As for reading in the wrong values, this most likely occurs because of the difference between direct access (where you can choose a record length) and stream access (where you cannot). With stream access, data is stored as a sequence of "file storage units". Their size is in general compiler-dependent, but is available through the module iso_fortran_env as file_storage_size; it is usually 8 bits. This means that each entry will usually occupy multiple storage units, so you have to take care that a read or write with the pos = specifier does not access the wrong storage units.
Edit:
Some example code writing and reading with stream access:
program stream
use, intrinsic :: iso_fortran_env
implicit none
integer :: i, offset
real(real32), dimension(4,6) :: val, nval
open(unit=2, file='Newfile.bin', action='readwrite', form='unformatted', &
access='stream', status='replace')
do i = 1,2
call random_number(val)
write(2) val
enddo
! The file now contains two sequences of 24 reals, each element of which
! occupies the following number of storage units:
offset = storage_size(val) / file_storage_size
! Retrieve the second sequence and compare:
read(2, pos = 1 + offset*size(val)) nval
print*, all(nval == val)
close(2)
end program
The value true should be printed to the screen.
Note also that it's not strictly necessary to specify a pos while writing your data to the file, because the file will automatically be positioned beyond the last record read or written.
That said, direct or stream access is most beneficial if you need to access the data in a non-sequential manner. If you only need to combine input files into one, it could be easier to write the output file with sequential access, for which you can also specify recl and position = 'append'.
You can check for the existence of a file in standard Fortran, by using the inquire statement:
logical :: exist
inquire(file="test.dat", exist=exist)
if (exist) then
print *, "File test.dat exists"
else
print *, "File test.dat does not exist"
end if
Alternatively you can have a look at the modFileSys library which provides libc like file manipulation routines.
As for appending and streams: Appending files is also possible when you use "classical" record based fortran files, you do not have to use streams for that.