I have a data file with 84480 lines. In a subroutine, I split it into 20 files of 4224 lines each. Now I want to use these files one by one in another subroutine and do some analysis, but when I try I get the runtime error: end of file.
Here is the structure of the main program
real (kind = 8) :: x(84480),y(84480),x1(4424),y1(4424)
open(1000,file='datafile.txt',status='old')
n = 20 ! number of configurations
m = 84480 ! total number of lines in all configurations
p = 4224 ! number of lines in a single configuration
no = 100 ! starting file number configurations
do i=1,m
read(1000,*) x(i),y(i)
end do
call split(x,y,m,n)
do i = 1,20
open(no)
do j = 1,p
read(no,*) x1(j),y1(j) ! error is occurring in here
end do
no = no + 1
end do
end
Here is the subroutine
subroutine split(x,y,m,n)
integer , intent (in) :: m,n
real (kind = 8) , intent(in) :: x(m),y(m)
integer :: i,k,j,p
p = 100
do i=0,n-1
k = i*4224
do j = k+1,k+4224
write(p,*) x(j),y(j)
end do
p = p + 1
end do
end subroutine split
This subroutine is producing output files fort.100 to fort.119 correctly. But it shows the following error
unit = 100, file = 'fort.100'
Fortran runtime error: End of file
Where am I going wrong?
Of interest here is file connection. The program here uses two forms of connection: preconnection and the open statement. We ignore the connection to datafile.txt here.
We see preconnection in the subroutine with
write(p,*) x(j),y(j)
where the unit number p hasn't previously been in an open statement. This is where the default filename fort.100 (etc.) comes about.
After the subroutine has been called those 20 preconnected units have each had data written. Each of those connections is positioned at the end of the file. This is the notable part.
When, after the subroutine, we come to the loop with
open(no)
we are, because we haven't closed the connection, opening a connection with a unit number which is already connected to a file. This is perfectly acceptable. But we have to understand what this means.
The statement open(no) has no file specifier which means that the unit remains connected to the file it was connected to previously. As there is no other specifier given, nothing about the connection is changed. In particular, the connection is not repositioned: we are still at the end of each file.
So, come the read, we are attempting to read from the file when we are positioned at its end. Result: an end of file error.
Now, how to solve this?
One way is to reposition the connection. Although we may want to open(no, position='rewind') we can't do that. There is, however
rewind no ! An unfortunate unit number name; could also be rewind(no).
Alternatively, as suggested in the comments on the question, we could close each connection, and reopen in the loop (with an explicit position='rewind') for the reading.
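The rewind approach can be sketched as a small self-contained program. The sizes nfiles and nlines are hypothetical stand-ins for the 20 files of 4224 lines, and the sketch assumes, as the question does, that the compiler connects an unopened unit to a default file such as fort.100 (gfortran does this; other compilers may behave differently).

```fortran
program rewind_before_read
  implicit none
  integer, parameter :: nfiles = 3, nlines = 4   ! small hypothetical sizes
  integer :: u, i, j
  real(kind=8) :: x, y, total

  ! Write through units that were never opened. This relies on the compiler
  ! connecting each one to a default file (fort.100, ... with gfortran).
  do i = 1, nfiles
    u = 99 + i
    do j = 1, nlines
      write(u, *) real(j, kind=8), real(i*j, kind=8)
    end do
  end do

  ! Each unit is still connected and positioned at the end of its file.
  do i = 1, nfiles
    u = 99 + i
    rewind(u)                 ! without this, the first read hits end of file
    total = 0.0d0
    do j = 1, nlines
      read(u, *) x, y
      total = total + y
    end do
    close(u, status='delete') ! remove the demo file again
    print '(A,I0,A,F8.2)', 'unit ', u, ': sum of y = ', total
  end do
end program rewind_before_read
```

Commenting out the rewind line should reproduce the end-of-file error from the question.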
It is my understanding that Fortran, when reading data from a file, will skip lines starting with an asterisk (*), assuming that they are comments. Well, I seem to be having a problem achieving this behavior with a very simple program I created. This is my simple Fortran program:
1 program test
2
3 integer dat1
4
5 open(unit=1,file="file.inp")
6
7 read(1,*) dat1
8
9
10 end program test
This is "file.inp":
1 *Hello
2 1
I built my simple program with
gfortran -g -o test test.f90
When I run, I get the error:
At line 7 of file test.f90 (unit = 1, file = 'file.inp')
Fortran runtime error: Bad integer for item 1 in list input
When I run the input file with the comment line deleted, i.e.:
1 1
The code runs fine. So it seems to be a problem with Fortran correctly interpreting that comment line. It must be something exceedingly simple I'm missing here, but I can't turn up anything on google.
Fortran doesn't automatically skip comment lines in input files. You can do this easily enough by first reading the line into a string, then checking the first character for your comment symbol (or searching the whole string for that symbol), and, if the line is not a comment, doing an "internal read" of the string to obtain the numeric value.
Something like:
use, intrinsic :: iso_fortran_env
character (len=200) :: line
integer :: dat1, RetCode
read_loop: do
read (1, '(A)', iostat=RetCode) line
if ( RetCode == iostat_end) exit read_loop
if ( RetCode /= 0 ) then
! ... handle the read error
exit read_loop
end if
if ( index (line, "*") /= 0 ) cycle read_loop
read (line, *) dat1
end do read_loop
Fortran does not ignore anything by default, unless you are using namelists and in that case comments start with an exclamation mark.
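For the namelist case, a minimal sketch follows. The group name inputs and the file name demo.nml are made up for illustration; the program writes its own input file so that it can be run as is.

```fortran
program namelist_comments
  implicit none
  integer :: dat1, u
  namelist /inputs/ dat1      ! hypothetical group name

  ! Create a small namelist file that contains "!" comments.
  open(newunit=u, file='demo.nml', status='replace', action='write')
  write(u, '(A)') '&inputs'
  write(u, '(A)') '  ! a comment on a line of its own'
  write(u, '(A)') '  dat1 = 1   ! a trailing comment'
  write(u, '(A)') '/'
  close(u)

  ! Namelist input skips the comments without any extra code.
  dat1 = -1
  open(newunit=u, file='demo.nml', status='old', action='read')
  read(u, nml=inputs)
  close(u, status='delete')

  print *, 'dat1 =', dat1
end program namelist_comments
```

This only helps if the input file can be changed to namelist syntax; a plain list-directed read still fails on the *Hello line.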
I found the use of the backspace statement to be a lot more intuitive than the proposed solutions. The following subroutine skips the line when a comment character, "#" is encountered at the beginning of the line.
subroutine skip_comments(fileUnit)
integer, intent(in) :: fileUnit
character(len=1) :: firstChar
firstChar = '#'
do while (firstChar .eq. '#')
read(fileUnit, '(A)') firstChar
enddo
backspace(fileUnit)
end subroutine skip_comments
This subroutine may be used in programs before the read statement like so:
open(unit=10, file=filename)
call skip_comments(10)
read(10, *) a, b, c
call skip_comments(10)
read(10, *) d, e
close(10)
Limitations for the above implementation:
It will not work if the comment is placed between the values of a variable spanning multiple lines, say an array.
It is very inefficient for large input files since the entire file is re-read from the beginning till the previous character when the backspace statement is encountered.
Can only be used for sequential access files, i.e. typical ASCII text files. Files opened with the direct or append access types will not work.
However, I find it a perfect fit for short files used for providing user-parameters.
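A variant that avoids backspace altogether is to hand the first non-comment line back as a string and let the caller do an internal read on it. This is only a sketch: the names comment_reader and next_data_line, the file name, and the 200-character buffer are assumptions, and unlike the subroutine above it also skips blank lines.

```fortran
module comment_reader
  implicit none
contains
  ! Return the next line whose first non-blank character is not '#'.
  ! ios holds the iostat of the last read (non-zero at end of file or on error).
  subroutine next_data_line(unit, line, ios)
    integer, intent(in) :: unit
    character(len=*), intent(out) :: line
    integer, intent(out) :: ios
    do
      read(unit, '(A)', iostat=ios) line
      if (ios /= 0) return
      if (len_trim(line) == 0) cycle               ! skip blank lines
      if (index(adjustl(line), '#') /= 1) return   ! not a comment: hand it back
    end do
  end subroutine next_data_line
end module comment_reader

program demo_comment_reader
  use comment_reader
  implicit none
  integer :: u, ios
  real :: a, b, c
  character(len=200) :: line   ! longer input lines would be truncated

  ! Build a small demo file so the example is self-contained.
  open(newunit=u, file='params_demo.txt', status='replace', action='readwrite')
  write(u, '(A)') '# three parameters follow'
  write(u, '(A)') '1.0 2.0 3.0'
  rewind(u)

  call next_data_line(u, line, ios)
  if (ios == 0) then
    read(line, *) a, b, c      ! internal read from the returned string
    print *, a, b, c
  end if
  close(u, status='delete')
end program demo_comment_reader
```

Because the file is only ever read forwards, the cost of backspace does not arise, but the values for one read must sit on a single line.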
I have an existing Fortran 77 program where the input values are read from an input file.
read (unit=*, fmt=*) value
The read statement automatically jumps to the next line in the file every time it is called.
Is it possible to replace the "reference" index with another data container, like an array?
For example:
read (myarray, fmt=*) value
I tried it but it always reads the first array-element and does not jump automatically to the next element.
I would have to change every read(unit=*, ...) to read(array(i), ...) and increase the i separately to get to the next element.
Since the program is huge, I am looking for a way to keep the existing read statements and just change the source of the data.
So the unit wouldn't be an integer value but an array where every element is a line from my input file.
Does anybody have an idea?
I tried to describe the problem in code:
(the input_file.input is just 15 lines with the numbers 1 to 15)
program FortranInput
implicit none
! Variables
integer :: i, inpid
character*130, dimension(100) :: inp_values
character*130 :: value
inpid = 20
! Open File
open(inpid, file='input_file.input')
! Read from file ------------------------------------------------
do i = 1, 15
! read always takes the next line in the file
read(inpid,'(a130)') value
! write each line to new array-element
inp_values(i) = value
! output each line from file to screen
write(*,*) value
end do
close (inpid)
! Read from array -----------------------------------------------
do i = 1, 15
! read always takes the first line in the array
read(inp_values,'(a130)') value
write(*,*) value
end do
end program FortranInput
Yes, in your example you have to always read from the appropriate array element using the (i) syntax. I can't see another way.
However, a character array can often be used as an internal file with multiple records, without indexing individual elements. Consider this:
integer :: i, n=15
character*130, dimension(100) :: inp_values
character*130 :: value
integer :: values(100)
do i = 1, n
write(value,*) i
inp_values(i) = value
end do
read(inp_values,'(*(i130,/))') values(1:n)
write(*,*) values(1:n)
or even
read(inp_values,*) values(1:n)
It is important to remember that an internal file is never opened and keeps no position between statements. A position exists only while a single read or write statement is executing.
Internal files, unlike external files, have no concept of persistent position (between input/output statements). In this regard if you want one read statement to transfer from one record and the next read from another record you will have to reference these records directly.
However, you don't show how you really want to use the input. If you can re-write the input to use a single read statement then the appropriate records will be the source.
For example, if you can rewrite
do i=1,5
read(unit, '(I5)') x(i)
end do
as
read(unit, '(I5)') x(1:5) ! format reversion starts a new record for each item
then you can easily switch to using an internal file.
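Both options can be sketched with a small character array standing in for the lines of the input file. The names records and next are made up for the example. The single read uses a plain '(I5)' format and relies on format reversion to move to the next array element for each value; the loop shows the explicit bookkeeping needed when the separate read statements have to stay.

```fortran
program internal_file_demo
  implicit none
  integer, parameter :: n = 5
  character(len=20) :: records(n)   ! each element is one record
  integer :: x(n), i, next

  ! Fill the "file": one integer per record.
  do i = 1, n
    write(records(i), '(I5)') 10*i
  end do

  ! One read statement over the whole array: each time the format is
  ! exhausted, format reversion moves on to the next element.
  read(records, '(I5)') x
  print '(5(I0,1X))', x

  ! Separate read statements: no position is remembered between them,
  ! so the record to read has to be tracked by hand.
  x = 0
  next = 1
  do i = 1, n
    read(records(next), '(I5)') x(i)
    next = next + 1
  end do
  print '(5(I0,1X))', x
end program internal_file_demo
```

Both print statements should show the same five values.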
I have a Fortran routine that opens a lot of text files write data from a time loop. This routine uses open with the newunit option, this unit is stored in an object in order to write things in files later. This works fine most of the time but when the program needs to open a large number N of files at the same time I get the following error:
**forrtl: severe (104): incorrect STATUS= specifier value for connected file, unit -1, file CONOUT$**
referring to the first open statement in the createFiles subroutine. This error occurs whether the file already exists or not. I don't know if this might help, but at this stage the new unit that should be generated would be -32768.
I include a minimal code sample with a "timeSeries" class including a routine that creates two files:
the first file fileName1 is opened and closed directly after writing stuff inside
the second file fileName2 is kept open in order to write things computed in a time loop later and closed at the end of the time loop
The example is composed of the two following files. It breaks for i=32639.
main.f90 :
program writeFiles
use TS
logical :: stat
integer :: i, istep, N, NtimeSteps
character(len=16) :: fileName1, fileName2
character(len=300) :: path
type(timeSeries), dimension(:), allocatable :: myTS
call getcwd( path )
path = trim(path) // '\Output_files'
inquire(directory = trim(path), exist = stat )
if (.not. stat) call system("mkdir " // '"' // trim(path) // '"' )
N = 50000
NtimeSteps = 100
allocate(myTS(N))
do i = 1, N
write(fileName1,'(a6,i6.6,a4)') 'file1_', i, '.txt'
write(fileName2,'(a6,i6.6,a4)') 'file2_', i, '.txt'
call myTS(i)%createFiles(trim(path),fileName1,fileName2)
end do
do istep = 1, NtimeSteps
!
! compute stuff
!
do i = 1, N
write(myTS(i)%fileUnit,*) 'stuff'
end do
end do
do i = 1, N
close(myTS(i)%fileUnit)
end do
end program writeFiles
module.f90 :
module TS
type timeSeries
integer :: fileUnit
contains
procedure :: createFiles => timeSeries_createFiles
end type timeSeries
contains
subroutine timeSeries_createFiles(this,dir,fileName1,fileName2)
class(timeSeries) :: this
character(*) :: dir, fileName1, fileName2
open(newunit = this%fileUnit , file = dir // '\' // fileName1, status = 'replace') !error occurs here after multiple function calls
write(this%fileUnit,*) 'Write stuff'
close(this%fileUnit)
open(newunit = this%fileUnit , file = dir // '\' // fileName2, status = 'replace')
end subroutine timeSeries_createFiles
end module
Any idea about the reason for this error? Is there a limitation for the number of files opened at the same time? Could it be related to a memory issue?
I'm using Intel(R) Visual Fortran Compiler 17.0.4.210
Windows has this interesting habit of not releasing all the resources for a closed file for a short time after you do a close. I have seen this sort of problem on and off for decades. My usual recommendation is to put a call to SLEEPQQ with a duration of half a second after a CLOSE when you intend to do another OPEN soon after on the same file. But you're not doing that here.
There's more here that is puzzling. The error message referring to unit -1 and CONOUT$ should not occur when opening an explicit file and using NEWUNIT. In Intel's implementation, NEWUNIT numbers start at -129 and go more negative from there. Unit -1 is used for PRINT or WRITE(*), and CONOUT$ is the console. STATUS='REPLACE' would not be valid for a unit connected to the console. That the newunit number would be -32768 is telling and suggests an internal limit for NEWUNIT in the Intel libraries.
I did a test of my own and see that if you use NEWUNIT and close the unit, the unit numbers go as low as -16384 before cycling back to -129. That's ok if indeed you're closing the units, but you're never closing the second file you open, so you're at least hitting a maximum number of NEWUNIT files open. I would recommend figuring out a different way of approaching the problem that didn't require leaving thousands of files open.
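One possible way to avoid keeping thousands of units connected, shown only as a sketch and not as something the answer above prescribes, is to reopen each file with position='append' for every write and close it again straight away. The sizes and the program name are hypothetical; the file names follow the question's pattern.

```fortran
program reopen_per_write
  implicit none
  integer, parameter :: nfiles = 5, nsteps = 3   ! small hypothetical sizes
  integer :: i, istep, u
  character(len=32) :: fname

  ! Create (or replace) every file once and close it immediately.
  do i = 1, nfiles
    write(fname, '(A,I6.6,A)') 'file2_', i, '.txt'
    open(newunit=u, file=trim(fname), status='replace', action='write')
    write(u, *) 'Write stuff'
    close(u)
  end do

  ! In the time loop, only one file is connected at any moment.
  do istep = 1, nsteps
    do i = 1, nfiles
      write(fname, '(A,I6.6,A)') 'file2_', i, '.txt'
      open(newunit=u, file=trim(fname), status='old', &
           position='append', action='write')
      write(u, *) 'stuff for step', istep
      close(u)
    end do
  end do
end program reopen_per_write
```

The trade-off is an open and a close per write, which can be slow for 50000 files, and the remark above about Windows releasing closed files late may apply. Buffering the results in memory and writing each file once at the end would be another option.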
I am using a Fortran 90 program that writes a file. The first line of this file is supposed to indicate the number of lines in the remaining file. The file is written by the program when a certain criterion is met and that can't be determined beforehand. Basically, I will know the total number of lines only after the run is over.
I want to do it in the following manner:
1) open the file and write the first line with some text say, "Hello"
2) Write rows in the file as desired and keep a counter for number of rows.
3) Once the run is over and just before closing the file, replace the first line string ("Hello") with the counter.
The problem is in step 3. I don't know how to replace the first line.
Another option that I can think of is to write to 2 files. First, write a file as above without the counter. Once the run is over, close the file and write another file and this time, I know the value of the counter.
I believe there is a way to proceed with the first approach. Can someone please help me with this?
Fortran supports three forms of file access - DIRECT, STREAM (F2003+) and SEQUENTIAL. Both DIRECT and STREAM access support being able to rewrite earlier parts of a file, SEQUENTIAL access does not (a rewrite to an earlier record truncates the file at the rewritten record).
With direct access, all the records in the file are the same length. An arbitrary record can be [must be] accessed by any input/output statement by simply specifying the relevant record number in the statement. Note though, that the typical disk format of a direct access file may not match your idea of a file with "lines".
With formatted stream access, the current position in the file can be captured using an INQUIRE statement, and then a later input/output statement can begin data transfer at that position by using a POS specifier. The typical disk format of a formatted stream access file usually matches with what people expect of a text file with lines.
Stream access is likely what you want. Examples for both approaches are shown below.
Direct access:
PROGRAM direct
IMPLICIT NONE
INTEGER :: unit
REAL :: r
INTEGER :: line
OPEN( NEWUNIT=unit, &
FILE='direct.txt', &
STATUS='REPLACE', &
ACCESS='DIRECT', &
RECL=15, & ! The fixed record length.
FORM='FORMATTED' )
CALL RANDOM_SEED()
! No need to write records in order - we just leave off
! writing the first record until the end.
line = 0
DO
CALL RANDOM_NUMBER(r)
IF (r < 0.05) EXIT
line = line + 1
PRINT "('Writing line ',I0)", line
! All the "data" records are offset by one, to allow the
! first record to record the line count.
WRITE (unit, "('line ',I10)", REC=line+1) line
END DO
! Now update the first record with the number of following "lines".
WRITE (unit, "(I10)", REC=1) line
CLOSE(unit)
END PROGRAM direct
Stream access:
PROGRAM stream
IMPLICIT NONE
INTEGER :: unit
REAL :: r
INTEGER :: line
INTEGER :: pos
OPEN( NEWUNIT=unit, &
FILE='stream.txt', &
STATUS='REPLACE', &
ACCESS='STREAM', &
POSITION='REWIND', &
FORM='FORMATTED' )
CALL RANDOM_SEED()
! Remember where we are. In this case, the position
! is the first file storage unit in the file, but
! it doesn't have to be.
INQUIRE(unit, POS=pos)
! Leave some space in the file for later overwriting
! with the number of lines. We'll stick the number
! zero in there for now.
WRITE (unit, "(I10)") 0
! Write out the varying number of lines.
line = 0
DO
CALL RANDOM_NUMBER(r)
IF (r < 0.05) EXIT
line = line + 1
PRINT "('Writing line ',I0)", line
WRITE (unit, "('line ',I10)") line
END DO
! Now update the space at the start with the number of following "lines".
WRITE (unit, "(I10)", POS=pos) line
CLOSE(unit)
END PROGRAM stream
Going back on a sequential access file is tricky, because lines can vary in length. And if you change the length of one line, you'd have to move all the stuff behind.
What I recommend is to write your output to a scratch file while counting the number of lines. Then, once you're finished, rewind the scratch file, write the number of lines to your output file, and copy the contents of the scratch file to that output file.
Here's what I did:
program var_file
implicit none
character(len=*), parameter :: filename = 'delme.dat'
integer :: n, io_stat
character(len=300) :: line
open(unit=200, status='SCRATCH', action="READWRITE")
n = 0
do
read(*, '(A)') line
if (len_trim(line) == 0) exit ! Empty line -> finished
n = n + 1
write(200, '(A)') trim(line)
end do
rewind(200)
open(unit=100, file=filename, status="unknown", action="write")
write(100, '(I0)') n
do
read(200, '(A)', iostat=io_stat) line
if (io_stat /= 0) exit
write(100, '(A)') trim(line)
end do
close(200)
close(100)
end program var_file
I have a shell script from which I pass a binary file to a fortran program such that
Mth=$1
loop=1
it=1
while test $it -le 12
do
Mth=`expr $Mth + $loop`
file="DataFile"$Mth".bin"
./fort_exe ${Yr} ${nt} ${it}
# Increment loop
it=`expr $it + 1`
done
This script is used to pass 12 files within a do loop to the fortran program. In the fortran program, I read the binary file passed from the shell script and I am trying to write a 2nd file which would compile in a single file all the data that was read from the consecutive files e.g.
!Open binary file passed from shell script
open(1,file='Datafile'//TRIM(Mth)//'.bin',action='read',form='unformatted',access='direct', &
recl=4*x*y, status='old')
! Open write file for t = 1. The status is different for t = 1 and t > 1 so I open it twice: I guess there is a more elegant way to do this...
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append', status='replace')
irec = 0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,rec=irec) val(:,:)
! write output file
irecW= irec + (imonth-1)*nt
if ( t .eq. 1) write(2,pos=irecW) val(:,:)
! Close file after t = 1, update the status to old and reopen.
if ( t .eq. 2) then
close (2)
open(2,file='Newfile.bin',action='write',form='unformatted', &
access='stream', position='append',status='old')
endif
if ( t .ge. 2) write(2,pos=irecW) val(:,:)
enddo
I can read the binary data from the first file no problem but when I try and read from another program the binary data from the file that I wrote in the first program such that
open(1,file='Newfile.bin',action='read',form='unformatted', &
access='stream', status='old')
irec=0
do t = 1, nt
! Read input file
irec = irec + 1
read(1,pos=irec) val(:,:)
write(*,*) val(:,:)
enddo
val(:,:) is nothing but a list of zeros. This is the first time I use access=stream which I believe is the only way I can use position='append'. I have tried compiling with gfortran and ifort but I do not get any error messages.
Does anyone have any idea why this is happening?
Firstly, I do not think you need to close and reopen your output file as you are doing. The status specifier is only relevant to the open statement in which it appears: replace will delete Newfile.bin if it exists at that time, before opening a new file with the same name. The status is implicitly changed to old, but this does not affect any operations done to the file.
However, since your Fortran code does not know you run it 12 times, you should have a way of making sure the file is only replaced the first time and opened as old afterwards; otherwise, Newfile.bin will only contain the information from the last file processed.
As for reading in the wrong values, this most likely occurs because of the difference between direct access (where you can choose a record length) and stream access (where you cannot). With stream access, data is stored as a sequence of "file storage units". Their size is in general compiler-dependent, but is available through the module iso_fortran_env as file_storage_size; it is usually 8 bits. This means that each entry will usually occupy multiple storage units, so you have to take care that a read or write with the pos = specifier does not access the wrong storage units.
Edit:
Some example code writing and reading with stream access:
program stream
use, intrinsic :: iso_fortran_env
implicit none
integer :: i, offset
real(real32), dimension(4,6) :: val, nval
open(unit=2, file='Newfile.bin', action='readwrite', form='unformatted', &
access='stream', status='replace')
do i = 1,2
call random_number(val)
write(2) val
enddo
! The file now contains two sequences of 24 reals, each element of which
! occupies the following number of storage units:
offset = storage_size(val) / file_storage_size
! Retrieve the second sequence and compare:
read(2, pos = 1 + offset*size(val)) nval
print*, all(nval == val)
close(2)
end program
The value true should be printed to the screen.
Note also that it's not strictly necessary to specify a pos while writing your data to the file, because the file will automatically be positioned beyond the last record read or written.
That said, direct or stream access is most beneficial if you need to access the data in a non-sequential manner. If you only need to combine input files into one, it could be easier to write the output file with sequential access, for which you can also specify recl and position = 'append'.
You can check for the existence of a file in standard Fortran, by using the inquire statement:
logical :: exist
inquire(file="test.dat", exist=exist)
if (exist) then
print *, "File test.dat exists"
else
print *, "File test.dat does not exist"
end if
Alternatively you can have a look at the modFileSys library which provides libc like file manipulation routines.
As for appending and streams: Appending files is also possible when you use "classical" record based fortran files, you do not have to use streams for that.
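Putting the two remarks together, a minimal sketch of combining data from several runs into one record-based (sequential, unformatted) file could look like this. The file name combined_demo.bin, the array shape, and the loop that imitates three separate runs are assumptions for illustration.

```fortran
program append_sequential
  use, intrinsic :: iso_fortran_env, only: real32
  implicit none
  character(len=*), parameter :: fname = 'combined_demo.bin'  ! hypothetical name
  integer, parameter :: nruns = 3
  real(real32) :: val(4,6), back(4,6)
  integer :: u, k
  logical :: exists

  ! Start the demo from a clean state.
  inquire(file=fname, exist=exists)
  if (exists) then
    open(newunit=u, file=fname, form='unformatted', status='old')
    close(u, status='delete')
  end if

  ! Each pass imitates one run of the program: create the file the first
  ! time, append to it on every later run.
  do k = 1, nruns
    inquire(file=fname, exist=exists)
    if (exists) then
      open(newunit=u, file=fname, form='unformatted', access='sequential', &
           status='old', position='append', action='write')
    else
      open(newunit=u, file=fname, form='unformatted', access='sequential', &
           status='new', action='write')
    end if
    val = real(k, real32)
    write(u) val               ! one record per run
    close(u)
  end do

  ! Read the records back in order and check them.
  open(newunit=u, file=fname, form='unformatted', access='sequential', &
       status='old', action='read')
  do k = 1, nruns
    read(u) back
    print *, 'run', k, 'matches:', all(back == real(k, real32))
  end do
  close(u)
end program append_sequential
```

Sequential unformatted files carry compiler-specific record markers, so a file written this way is best read back by a program built with the same compiler.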