FORTRAN parsing file with varying line format

I have only limited experience with FORTRAN and I need to parse files with a structure similar to this:
H s 13.010000 0.019685
1.962000 0.137977
0.444600 0.478148
s 0.122000 1.000000
p 0.727000 1.000000
***
He s 38.360000 0.023809
5.770000 0.154891
1.240000 0.469987
s 0.297600 1.000000
p 1.275000 1.000000
***
I need to search for a label (e.g. He) and then read the corresponding block into an array.
I know I can parse a file by specifying the format each line is supposed to have, but here several different line formats are possible.
In Python I would just split each line on whitespace and handle it depending on the number of columns. How should I approach this in FORTRAN?

You can read each line as a character string and then process it. If, as it seems, the format is fixed (element symbol in first two characters, orbital letter in sixth character, etc.), the following program could serve you as inspiration:
program elms
implicit none
integer, parameter :: MAX_LEN = 40
character(len=MAX_LEN) :: line_el, line
integer :: u
integer :: is
integer :: nlin
character(len=2) :: element = 'He'
integer, parameter :: MAX_LINES = 20
real, dimension(MAX_LINES) :: e, f
open(newunit=u, file='elms.dat', status='old', action='read')
main_loop: do
! Read line
read(u, '(a)', iostat=is) line_el
if (eof_iostat(is)) exit main_loop
! Check first two characters of the line vs. chemical element.
if (line_el(1:2) .eq. element) then
! This is the beginning of an element block
nlin = 0
line = line_el
do
if (line .ne. '') then
! Line is not empty or only spaces.
nlin = nlin + 1
if (line(6:6) .ne. ' ') then
! Line contains an orbital letter - process it.
end if
! Read the real values in the rest of the line
read(line(7:),*) e(nlin), f(nlin)
end if
! Read next line
read(u, '(a)', iostat=is) line
if (eof_iostat(is)) exit main_loop
if (line(1:2) .ne. ' ') then
! Finished processing element block.
exit main_loop
end if
end do
end if
end do main_loop
! Close file
close(u)
contains
logical function eof_iostat(istat)
! Returns true if the end of file has been reached
use, intrinsic :: iso_fortran_env, only: IOSTAT_END
implicit none
integer, intent(in) :: istat
select case (istat)
case (0) ! No error
eof_iostat = .false.
case (IOSTAT_END) ! End of file reached
eof_iostat = .true.
case default ! Error
STOP
end select
end function eof_iostat
end program
You will probably need to turn the program into a subroutine, make element an intent(in) dummy argument, process the orbital symbols, etc.
Note that, if possible, it would be easier to read all the data from the file in one go and then search for the relevant data in memory (e.g., using an array of chemical symbols).
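For files where the columns are not at fixed positions, the Python-style approach from the question (split on whitespace, then branch on the number of fields) can be approximated with a small field counter followed by a list-directed internal read. The following is only a sketch: count_fields and the three sample lines are made up for illustration, and it assumes fields never contain embedded blanks, commas, or slashes.

```fortran
program split_demo
  implicit none
  character(len=40) :: lines(3)
  character(len=2)  :: label
  character(len=1)  :: orb
  real    :: e, c
  integer :: i, n

  ! Sample lines modelled on the question's data (hypothetical test input)
  lines(1) = 'He s 38.360000 0.023809'
  lines(2) = '5.770000 0.154891'
  lines(3) = 's 0.297600 1.000000'

  do i = 1, size(lines)
    n = count_fields(lines(i))
    select case (n)
    case (4)   ! label, orbital letter, exponent, coefficient
      read(lines(i), *) label, orb, e, c
    case (3)   ! orbital letter, exponent, coefficient
      read(lines(i), *) orb, e, c
    case (2)   ! exponent, coefficient (continuation of the previous orbital)
      read(lines(i), *) e, c
    case default
      cycle    ! e.g. the '***' separator or a blank line
    end select
    print '(i0, a, f10.6, 1x, f9.6)', n, ' fields:', e, c
  end do

contains

  ! Counts the blank-separated fields in a string
  integer function count_fields(str) result(n)
    character(len=*), intent(in) :: str
    integer :: k
    logical :: in_field
    n = 0
    in_field = .false.
    do k = 1, len_trim(str)
      if (str(k:k) /= ' ') then
        if (.not. in_field) n = n + 1
        in_field = .true.
      else
        in_field = .false.
      end if
    end do
  end function count_fields

end program split_demo
```

The field count decides which variables appear in the internal read, so no fixed column positions are needed.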

Related

Reading three specific numbers in a text file and writing them out

I want to print only three specific numbers from a file that has no fixed format. I don't know how to read them: if I start the loop at a higher i (e.g. i = 4), the read does not start at line 4 of the file. I need only the numbers 88.98, 65.50, and 30.
text
678 people
450 girls
22 old people
0 cats
0 dogs
4 girls blond
1 boy blond
1 old man
0 88.9814 xo xi
0 65.508 yo yi
0 30 zo zi
I tried this, but this is not working at all.
program souradnice
implicit none
integer :: i, k
character*100 :: yo, zo, line, name, text
real :: xo
open(10,file="text.dat", status='old')
do i=20,20
read(10,fmt='(a)') line
read(unit=line, fmt='(a100)') text
if(name=="xo") then
print *, trim(text)
endif
enddo
close(10)
end program souradnice
You need to read the whole file line by line, and check each line to see if it's the one you want, e.g. by using the index intrinsic. For example,
program souradnice
implicit none
character(100) :: line
character(5) :: matches(3)
real :: numbers(3)
character(10) :: dummy
integer :: i, ierr
! Substrings to match to find the relevant lines
matches = ["xo xi", "yo yi", "zo zi"]
open(10,file="text.dat", status='old')
do
! Read a line from the file, and exit the loop if the file end is reached.
read(10,fmt='(a)',iostat=ierr) line
if (ierr<0) then
exit
endif
do i=1,3
! Check if `line` contains the i'th substring we are looking for.
if (index(line, matches(i))>0) then
! If it matches, read the relevant number into `numbers`.
read(line,*) dummy, numbers(i)
endif
enddo
enddo
write(*,*) numbers
end program

"Fortran runtime error: End of file" problem when creating a linked list with pointer (READ in DO loop)

I am creating a linked list from an input file. Each node in the linked list holds the information from one block of the input.
To get the values from the input file, I read several lines at a time (4 lines per block) into a 1D array "temp". I think the DO loop I used inside the "input: DO" loop is wrong, but I do not know how to fix it.
I compile with gfortran, and compilation produces no errors.
I added a WRITE inside the "input: DO" loop for testing. The result shows that the input file is opened successfully.
PROGRAM read
IMPLICIT NONE
INTEGER,PARAMETER :: nat=4
character(len=20) :: filename
!Derived types to store atom data
TYPE :: atom
CHARACTER(LEN=2) :: atom_name
REAL, DIMENSION(3) :: coord
END TYPE atom
!The array info stores info of all atom in one time step
type :: atom_seq
type(atom),dimension(nat):: info
type(atom_seq),pointer :: p
end type atom_seq
TYPE (atom_seq), POINTER :: head
TYPE (atom_seq), POINTER :: tail
type(atom), dimension(nat) :: temp
! Declare variable
INTEGER :: istat
INTEGER :: i=0, n=0
! Open input data file
WRITE(*,*) 'ENTER the file name with the data to be read: '
READ(*,'(A20)') filename
NULLIFY(head)
OPEN( UNIT=9, FILE=TRIM(filename), STATUS="OLD", ACTION="READ", IOSTAT=istat)
! Was the open successful
fileopen: IF (istat == 0) THEN
input: DO
!WRITE(*,*) "OPEN done " ! for testing
READ(9,*) ! <--when run, error is in this line
READ(9,*)
DO i = 1, nat
READ(9,*,IOSTAT=istat) temp(i)%atom_name, temp(i)%coord(1), temp(i)%coord(2), temp(i)%coord(3)
ENDDO
IF (istat /= 0) EXIT
n = n + 1 ! Bump count
IF (.NOT. ASSOCIATED(head) ) THEN ! No values in list
ALLOCATE(head, STAT=istat) ! Allocate new value
tail => head ! Tail points to new value
NULLIFY(tail%p) ! Nullify p in new value
DO i = 1, nat ! Store number
tail%info(i)%atom_name = temp(i)%atom_name
tail%info(i)%coord(1) = temp(i)%coord(1)
tail%info(i)%coord(2) = temp(i)%coord(2)
tail%info(i)%coord(3) = temp(i)%coord(3)
ENDDO
ELSE ! Values already in list
ALLOCATE(tail%p, STAT=istat) ! Allocate new value
tail => tail%p
NULLIFY(tail%p)
DO i = 1, nat ! Store number
tail%info(i)%atom_name = temp(i)%atom_name
tail%info(i)%coord(1) = temp(i)%coord(1)
tail%info(i)%coord(2) = temp(i)%coord(2)
tail%info(i)%coord(3) = temp(i)%coord(3)
ENDDO
END IF
END DO input
ELSE fileopen
WRITE(*,1030) istat
1030 FORMAT ('File open failed --status = ', I6)
END IF fileopen
END PROGRAM read
The input file: inp
4
Particles:1_0
O 0.8050005000 0.7000000000 3.2350000000
H 1.4750005000 1.2800000000 2.8650000000
H 0.8550005000 -0.0900000000 2.7150000000
O 0.4050005000 0.7500000000 -4.1350000000
4
Particles:1_5
O 0.8799478358 0.6383317306 3.1569568025
H 1.4046592860 1.2232485826 2.4978364665
H 1.1472783731 -0.2687458123 3.0229092093
O 0.5392992531 0.6047144782 -4.0811918365
4
Particles:1_10
O -3.8021765454 3.1600783692 -4.5455655916
H -4.5320715486 3.0937504111 4.9514896261
H -3.5088238380 4.0613340230 -4.5394597924
O -3.3469012765 -0.7064128847 1.2465212113
and the error is
hg#xi /home/hg/pole $ ./read
ENTER the file name with the data to be read:
inp
At line XXX of file read.f95 (unit = 9, file = 'inp')
Fortran runtime error: End of file
Error termination. Backtrace:
#0 0x7f1c1fdbb31a
#1 0x7f1c1fdbbec5
#2 0x7f1c1fdbc68d
#3 0x7f1c1ff32a33
#4 0x7f1c1ff364b7
#5 0x7f1c1ff365b8
#6 0x5566d3dc9daf
#7 0x5566d3dca9ed
#8 0x7f1c1f9d0b96
#9 0x5566d3dc9a79
#10 0xffffffffffffffff
I hope to fix the problem. If my approach is wrong, please suggest a better data structure for storing the input data. (The input file may have thousands of blocks instead of 3; it is big, and the number of blocks is unknown before running the code.)
The first READ executed in the loop does not test IOSTAT, so reaching the end of the file there is a fatal runtime error. Add IOSTAT to that READ and exit the loop when the result is nonzero, i.e. change:
fileopen: IF (istat == 0) THEN
input: DO
!WRITE(*,*) "OPEN done " ! for testing
READ(9,*) ! <--when run, error is in this line
READ(9,*)
to
fileopen: IF (istat == 0) THEN
input: DO
!WRITE(*,*) "OPEN done " ! for testing
READ(9,*,IOSTAT=istat)
IF (istat /=0) EXIT
READ(9,*)
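A minimal sketch of the same idea applied to every READ in the loop, not just the first one. Note that in the question the IOSTAT of the atom lines is only tested after the inner DO has finished, so only the last read of a block is actually checked. The scratch file, the shortened coordinates, and nat = 2 are assumptions made to keep the example self-contained; the linked-list bookkeeping is left out.

```fortran
program read_blocks
  implicit none
  integer, parameter :: nat = 2      ! atoms per block (assumed fixed, as in the question)
  character(len=2) :: name(nat)
  real    :: coord(3, nat)
  integer :: u, istat, i, nblocks

  ! Build a small scratch file with two blocks so the sketch runs on its own
  open(newunit=u, status='scratch', action='readwrite')
  write(u, '(a)') '2', 'Particles:1_0', 'O 0.8 0.7 3.2', 'H 1.4 1.2 2.8'
  write(u, '(a)') '2', 'Particles:1_5', 'O 0.9 0.6 3.1', 'H 1.5 1.3 2.5'
  rewind(u)

  nblocks = 0
  input: do
    read(u, *, iostat=istat)             ! skip the atom-count line
    if (istat /= 0) exit input           ! the normal end of file lands here
    read(u, *, iostat=istat)             ! skip the title line
    if (istat /= 0) exit input
    do i = 1, nat
      read(u, *, iostat=istat) name(i), coord(:, i)
      if (istat /= 0) exit input         ! truncated block: stop without storing it
    end do
    nblocks = nblocks + 1                ! a complete block was read; store it here
  end do input

  close(u)
  print '(a, i0)', 'blocks read: ', nblocks
end program read_blocks
```

Exiting the named loop from inside the inner DO means an incomplete block is never appended to the list.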

Trouble reading reals from unknown length character string in Fortran

This is a small portion of the data I am trying to read:
01/06/2009,Tom Sanders,,264,220,73,260
01/08/2009,Adam Apple,158,,260,,208
01/13/2009,Lori Freeman,230,288,218,282,234
01/15/2009,Diane Greenberg,170,,250,321,197
01/20/2009,Adam Apple,257,,263,256,190
01/21/2009,Diane Greenberg,201,,160,195,142
01/27/2009,Tom Sanders,267,,143,140,206
01/29/2009,Tina Workman,153,,124,155,140
02/03/2009,Tina Workman,233,,115,,163
02/03/2009,Adam Apple,266,130,310,,310
The numbers between the commas come from different sites.
Two consecutive commas mean a missing value, and a trailing comma means the fifth data point is missing.
My goal is to organize the data into a table after calculating the average for each site and person, hence my two-dimensional arrays.
I want my output to look something like the following
(with neater formatting, obviously, but a table nonetheless):
                 Average Observed TDS (mg/l)
Name               Site 1   Site 2   Site 3   Site 4   Site 5
-------------------------------------------------------------
Tom Sanders         251.0    172.5    251.7    160.0    229.0
Adam Apple          227.0    130.0    277.7    256.0    236.0
Lori Freeman        194.0    288.0    216.7    279.0    202.7
Diane Greenberg     185.5    190.0    205.0    258.0    169.5
Tina Workman        193.0    140.0    119.5    155.0    163.0
This is my program so far:
program name_finder
implicit none
integer, parameter :: wp = selected_real_kind(15)
real(wp) :: m, tds
real(wp), dimension(20,5) :: avg_site, site_sum
integer, dimension(20) :: nobs
integer, dimension(5) :: x
integer :: ierror, i, nemp, cp, non, ni, n
character(len=40), dimension(20) :: names
character(len=200) :: line, aname
character(len=20) :: output, filename
character(len=3), parameter :: a = "(A)"
do
write(*,*) "Enter file to open."
read(*,*) filename
open(unit=10,file = filename, status = "old", iostat = ierror)
if (ierror==0) exit
end do
write(*,*) "File, ",trim(filename)," has been opened."
non = 0
outer: do
read(10,a, iostat = ierror) line
if (ierror/=0) exit
cp = index(line(12:),",") + 11
aname = line(12:cp-1)
n=0
middle: do
read(line,'(Tcp,f4.2)') tds
write(*,*) "tds=", tds
n=n+1
if (n>10) exit
i = 1
inner: do
if (i > non) then
non = non +1
names(non) = trim(aname)
!ni = non
exit
end if
if (aname == names(i)) then
!ni = i
!cycle outer
exit inner
end if
i = i + 1
end do inner
end do middle
end do outer
write(*,*)
write(*,*) "Names:"
do i = 1,non
write(*,*) i, names(i)
end do
close(10)
close(20)
STOP
end program name_finder
TL;DR: I am having trouble reading the per-site values that follow the name on each line of the file shown at the top.
Suggestions? Thanks!
I hope the following is helpful. I have omitted any easily assumed declarations or any further data manipulation or writing to another file. The code is used just to read the data line by line.
character(150) :: word
read(fileunit, '(A)') word ! read the entire line
comma_ind = index(word,',') ! find the position of first comma
! Find the position of next comma
data_begin = index(word(comma_ind+1:),',')
! Save the name
thename = word(comma_ind+1:comma_ind+data_begin-1)
! Define next starting point
data_begin = comma_ind+data_begin
! Read the rest of the data
outer: do
if (word(data_begin+1:data_begin+1) == ',') then
! decide what to do when missing an entry
data_begin = data_begin + 1
cycle outer
else if (word(data_begin+1:data_begin+1) == ' ') then
! Missing last entry
exit outer
else
! Use it to find the length of current entry
st_ind = index(word(data_begin+1:),',')
if (st_ind == 0) then
! You reached the last entry, read it and exit
read(word(data_begin+1:), *) realData
exit outer
else
! Read current entry
read(word(data_begin+1: data_begin+st_ind-1),*) realData
end if
! Update starting point
data_begin = data_begin + st_ind
end if
end do outer
There could be a more elegant way to do it but I cannot think of any at the moment.
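One possible way to put these steps together is to walk the line from comma to comma and treat an empty field as missing. This is a sketch under assumptions, not the code above: the variable names (start, pos, has_value) are illustrative, the line is hard-coded instead of read from the file, and it assumes a date and a name followed by five site values, as in the question's data.

```fortran
program csv_fields
  implicit none
  integer, parameter :: nsite = 5
  character(len=*), parameter :: line = '01/06/2009,Tom Sanders,,264,220,73,260'
  character(len=40) :: name
  real    :: tds(nsite)
  logical :: has_value(nsite)
  integer :: start, pos, field, site, ios

  has_value = .false.
  tds = 0.0
  name = ''
  start = 1
  field = 0
  do
    ! Offset of the next comma relative to start (0 if none is left)
    pos = index(line(start:), ',')
    if (pos == 0) pos = len_trim(line) - start + 2   ! last field runs to end of line
    field = field + 1
    if (field == 2) name = line(start:start+pos-2)   ! field 1 is the date, field 2 the name
    site = field - 2                                 ! fields 3..7 are sites 1..5
    if (site >= 1 .and. site <= nsite .and. pos > 1) then
      ! Non-empty field: convert it with a list-directed internal read
      read(line(start:start+pos-2), *, iostat=ios) tds(site)
      has_value(site) = (ios == 0)
    end if
    start = start + pos                              ! step past the comma
    if (start > len_trim(line) + 1) exit
  end do

  print '(a)', trim(name)
  do site = 1, nsite
    if (has_value(site)) then
      print '(a, i0, a, f6.1)', 'site ', site, ': ', tds(site)
    else
      print '(a, i0, a)', 'site ', site, ': missing'
    end if
  end do
end program csv_fields
```

Keeping a has_value mask next to the values makes it possible to average each site over only the entries that were actually present.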

Call a subroutine for a list of points instead of a single point

I have a piece of Fortran code that takes 'pq' and a single point as input from the user. Instead, I want to read a set of points from a file, points.txt, and run the code for all of those points rather than for a single user input. Is that possible? The code is as follows:
program prop
use module
implicit none
character(len=80) :: ErrorMsg
character(2) :: xy
real(8) :: Conc(20) = 0.d0
character(len=20) :: fn, fl
real(8) :: Mmolar, Tcritical, Pcritical, Tmininimum, Tmaximum, x, y
call Init_module()
write(*,*) 'Insert the gas name:'
read(*,*) fn
write(*,*) 'Insert the gas library:'
read(*,*) fl
write(*,*) 'Insert the couple pq:'
read(*,*) pq
write(*,*) 'Insert the value of ', pq(1:1)
read(*,*) x
write(*,*) 'Insert the value of ', pq(2:2)
read(*,*) y
write(*,*) 'Pres = ', Pres( pq, x, y, ErrorMsg)
write(*,*) 'Temp = ', Temperature( pq, x, y, ErrorMsg)
call ReleaseObjects()
end program prop
Instead of reading pq as a single point x,y from the user in the above code, I want to read a set of points from file.txt, for example 50 points and then run subroutines Pres and Temperature.
Each line of the file contains one point x,y and x and y in each line are separated by a few space characters.
The first few lines of file.txt are:
Ts
500
0.04781564 159.81587875
0.20396084 165.46398084
0.08159885 166.81382894
0.03879184 164.17497877
0.12585959 165.37000305
0.09895530 165.95997769
0.10389518 170.74235496
It must be noted that the length and the sign of the floating numbers can vary. The file.txt is originally written through python with the formatting for x, y being '%-12.8f %-12.8f\n'%. I have the following code to try and read the file but am not able to read from the 3rd line onwards:
real, allocatable :: x(:),y(:)
integer :: np
open(12,file=trim('file.txt'),status='old', &
access='sequential', form='formatted', action='read' )
read(12,*)pq
write(*,*)'pq:', pq
read(12,*)np
write(*,*)'number of points:',np
allocate (x(np))
allocate (y(np))
do i=1,np
read(12,*)x(i),y(i)
write(*,*)x(i),y(i)
enddo
Instead of using the READ statement with an asterisk (*) as the first argument, which reads from user input, pass a unit number connected to your file. You first need to OPEN the file containing the set of points, assuming it is a plain-text (ASCII) file:
OPEN(UNIT=10,FILE='file.txt',ACTION='read',STATUS='old')
The arguments of this statement are fairly self-explanatory.
Then assuming your file contains multiple lines with x and y values, you can read each line of your file by doing :
READ(10,*) x,y
If you have multiple points to read, just use a DO if you know the number of points to read, a DO WHILE otherwise. To take your example with 50 points, something like this should work :
OPEN(UNIT=10,FILE='file.txt',ACTION='read',STATUS='old') ! Open file
DO i=1,50
READ(10,*) x,y
write(*,*) 'Pres = ', Pres( pq, x, y, ErrorMsg)
write(*,*) 'Temp = ', Temperature( pq, x, y, ErrorMsg)
END DO
CLOSE(10) ! Close file
EDIT
Your attempt is almost correct. You forgot to declare pq as character(len=2); because of that, you should not have been able to get past line 1.
As I said, the space separator is handled naturally by list-directed input (the asterisk used as the format). If you nevertheless want to match the format exactly, use the same one with which you wrote your data. Judging from your Python format, you wrote two floats, each in a field 12 characters wide, separated by one space, which you can confirm by counting the characters:
0.04781564   159.81587875
^^^^^^^^^^^^|^^^^^^^^^^^^
1         12|1         12
            |
          space
which gives the following format in Fortran :
read(12,'(f12.8,1X,f12.8)') x(i),y(i)
1X skips one character position in a Fortran format, here the space separator.
Then you can write your data to the screen with the same format to check:
write(*,'(f12.8,1X,f12.8)') x(i),y(i)
It gives :
pq:Ts
number of points: 500
0.04781564 159.81587219
0.20396084 165.46397400
0.08159885 166.81382751
0.03879184 164.17497253
0.12585959 165.37001038
0.09895530 165.95997620
0.10389518 170.74235535
You may have noticed that you lost precision in the last digits. This is because you declared x and y as default real (4 bytes). Switch to 8-byte reals with real(kind=8) or real*8, depending on your compiler (be aware that this is not the right way to do it: it is not portable, but it is sufficient in your case).
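A more portable alternative to real(kind=8) or real*8 is to take the kind from the intrinsic module iso_fortran_env (Fortran 2008). The short sketch below reads the same text into a real32 and a real64 variable so the difference in the last digits can be seen; the program and variable names are only illustrative.

```fortran
program kind_demo
  use, intrinsic :: iso_fortran_env, only: real32, real64
  implicit none
  real(real32) :: y4
  real(real64) :: y8
  character(len=*), parameter :: text = '159.81587875'

  ! Same input text, two different precisions
  read(text, *) y4
  read(text, *) y8

  print '(a, f13.8)', 'real32: ', y4   ! last digits differ from the input
  print '(a, f13.8)', 'real64: ', y8   ! all eight decimals are preserved
end program kind_demo
```

In the program from the question, this would amount to declaring x and y as real(real64), allocatable instead of plain real.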
Do not forget to close your file when you are done with it:
close(12)

error in fortran example - allocating memory error 1

I'm new to Fortran and I'm trying to compile and run a simple example that I found at this address, which reads a simple two-column text file:
Date Water-flow
717976 7.140
717977 6.570
717978 6.040
717979 5.780
717980 5.530
The program is pasted below, but when I try to compile and run it I don't understand the resulting error. Could you please advise me?
PROGRAM READHPOP
IMPLICIT NONE
INTEGER, PARAMETER :: lun = 10
INTEGER :: res,i
CHARACTER(len=80) :: cbuffer
INTEGER :: flength
INTEGER,ALLOCATABLE,DIMENSION(:) :: dates
REAL,ALLOCATABLE,DIMENSION(:) :: water_flow
INTEGER :: c_position,string_length
OPEN(UNIT=lun,FILE="st.dat",FORM="FORMATTED",IOSTAT=res)
IF(res/=0) THEN
PRINT*,'error in opening file, status: ',res
STOP
END IF
READ(UNIT=lun,FMT='(A)',IOSTAT=res) cbuffer
IF(res /=0) THEN
PRINT *,'ERROR IN READING file, status: ',res
CLOSE(UNIT=lun)
STOP
END IF
string_length=LEN_TRIM(cbuffer)
c_position=INDEX(cbuffer,':')
READ(cbuffer(c_position+1:string_length),FMT='(A10)') flength
ALLOCATE(dates(flength),STAT=res)
IF (res/=0) THEN
PRINT*,'ERROR IN ALLOCATING MEMORY, status:',res
CLOSE(UNIT=lun)
STOP
END IF
READ(UNIT=lun,FMT='(A)',IOSTAT=res) cbuffer
DO i=1,cbuffer
READ(UNIT=lun,FMT='(I6,1X,F6.3)') dates(i),water_flow(i)
PRINT*,'DIAS ',dates(i)
END DO
end PROGRAM readhpop
EDIT: thanks to all for the inputs, just to close the issue, I paste below the working code, regards!
PROGRAM READHPOP
IMPLICIT NONE
INTEGER, PARAMETER :: lun = 10
INTEGER :: res,i
CHARACTER(len=80) :: cbuffer
INTEGER :: flength
INTEGER,ALLOCATABLE,DIMENSION(:) :: dates
REAL,ALLOCATABLE,DIMENSION(:) :: water_flow
INTEGER :: c_position,string_length
OPEN(UNIT=lun,FILE="st.dat",FORM="FORMATTED",IOSTAT=res)
IF(res/=0) THEN
PRINT*,'error in opening file, status: ',res
STOP
END IF
READ(UNIT=lun,FMT='(A)',IOSTAT=res) cbuffer
IF(res /=0) THEN
PRINT *,'ERROR IN READING file, status: ',res
CLOSE(UNIT=lun)
STOP
END IF
string_length=LEN_TRIM(cbuffer)
c_position=INDEX(cbuffer,':')
READ(cbuffer(c_position+1:string_length),FMT='(I10)') flength
ALLOCATE(dates(flength),water_flow(flength),STAT=res)
IF (res/=0) THEN
PRINT*,'ERROR IN ALLOCATING MEMORY, status:',res
CLOSE(UNIT=lun)
STOP
END IF
DO i=1,flength
READ(UNIT=lun,FMT='(I6,F6.3)') dates(i),water_flow(i)
PRINT*,'LINE OF FILE',i,' DAYS ',dates(i),' WATERFLOW ',water_flow(i)
END DO
PAUSE
END PROGRAM READHPOP
Okay, here is a working version:
PROGRAM READHPOP
IMPLICIT NONE
INTEGER, PARAMETER :: lun = 10
INTEGER :: res,i
CHARACTER(len=80) :: cbuffer
INTEGER :: flength
INTEGER,ALLOCATABLE,DIMENSION(:) :: dates
REAL,ALLOCATABLE,DIMENSION(:) :: water_flow
INTEGER :: c_position,string_length
OPEN(UNIT=lun,FILE="st.dat",FORM="FORMATTED",IOSTAT=res)
IF(res/=0) THEN
PRINT*,'error in opening file, status: ',res
STOP
END IF
READ(UNIT=lun,FMT='(A)',IOSTAT=res) cbuffer
IF(res /=0) THEN
PRINT *,'ERROR IN READING file, status: ',res
CLOSE(UNIT=lun)
STOP
END IF
string_length=LEN_TRIM(cbuffer)
c_position=INDEX(cbuffer,':')
READ(cbuffer(c_position+1:string_length),FMT='(I10)') flength
ALLOCATE(dates(flength),water_flow(flength),STAT=res)
IF (res/=0) THEN
PRINT*,'ERROR IN ALLOCATING MEMORY, status:',res
CLOSE(UNIT=lun)
STOP
END IF
DO i=1,flength
READ(UNIT=lun,FMT='(I6,F6.3)') dates(i),water_flow(i)
PRINT*,'DIAS ',dates(i)
END DO
end PROGRAM readhpop
Issues:
water_flow needs to be allocated.
When determining flength: flength is an integer, so read it with an integer edit descriptor (here: '(I10)'). The number of dates is expected after a colon, so change st.dat to:
Date Water-flow: 5
717976 7.140
717977 6.570
717978 6.040
717979 5.780
717980 5.530
Loop over flength instead of cbuffer.
One READ statement too many: the second READ into cbuffer consumes the first data line, so the loop ends up trying to read beyond the end of the file.
In this line
READ(cbuffer(c_position+1:string_length),FMT='(A10)') flength
you read an integer value under a character edit descriptor. This may well return a value in flength which exceeds any reasonable value for your allocatable array. For example, in a quick test I got a value of 538981169. Change that line to use an integer edit descriptor with an explicit width, which standard Fortran requires on input:
READ(cbuffer(c_position+1:string_length),'(I10)') flength
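Another option, shown here only as a sketch, is a list-directed internal read, which needs no field width at all, combined with a check of IOSTAT before the value is used in ALLOCATE. The header string is the modified first line of st.dat suggested above; the program name and the error messages are made up.

```fortran
program header_count
  implicit none
  character(len=80) :: cbuffer
  integer :: c_position, flength, res

  cbuffer = 'Date Water-flow: 5'     ! header line of the modified st.dat
  c_position = index(cbuffer, ':')
  if (c_position == 0) error stop 'no colon in header line'

  ! List-directed internal read: leading blanks are skipped, no width needed
  read(cbuffer(c_position+1:), *, iostat=res) flength
  if (res /= 0 .or. flength < 1) error stop 'could not read a positive count'

  print '(a, i0)', 'flength = ', flength
end program header_count
```

Rejecting a failed or non-positive count at this point avoids passing a garbage size to ALLOCATE.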