Fortran read mixed text and numbers - fortran

I am using Fortran 90 to read a file that contains data in the following format
number# 125 var1= 2 var2= 1 var3: 4
.
.
.
.
number# 234 var1= 3 var2= 5 var3: 1
I tried the following command and works fine
read (2,*) tempstr , my_param(1), tempstr , my_param(2), tempstr , my_param(3)
Problem is when the numbers become larger and there is no space between string and number, i.e. the data looks as following:
number# 125 var1= 2 var2=124 var3: 4
I tried
read (2,512) my_param(1), my_param(2), my_param(3)
512 format('number#', i, 'var1=', i, 'var2=', i, 'var3:', i)
It reads all number as zero
I can't switch to some other language. The data set is huge, so I can't pre-process it. Also, the delimiters are not the same every time.
Can someone please help with the problem?
Thanks in advance

First up, 720 thousand lines is not too much for pre-processing. Tools like sed and awk work mostly on a line-by-line basis, so they scale really well.
What I have actually done was to convert the data in such a way that I could use namelists:
$ cat preprocess.sed
# Add commas between values
# Space followed by letter -> insert comma
s/ \([[:alpha:]]\)/ , \1/g
# "number" is a key word in Fortran, so replace it with num
s/number/num/g
# Replace all possible data delimitors with the equals character
s/[#:]/=/g
# add the '&mydata' namelist descriptor to the beginning
s/^/\&mydata /1
# add the namelist closing "/" character to the end of the line:
s,$,/,1
$ sed -f preprocess.sed < data.dat > data.nml
Check that the data was correctly preprocessed:
$ tail -3 data.dat
number#1997 var1=114 var2=130 var3:127
number#1998 var1=164 var2=192 var3: 86
number#1999 var1=101 var2= 48 var3:120
$ tail -3 data.nml
&mydata num=1997 , var1=114 , var2=130 , var3=127/
&mydata num=1998 , var1=164 , var2=192 , var3= 86/
&mydata num=1999 , var1=101 , var2= 48 , var3=120/
Then you can read it with this fortran program:
program read_mixed
implicit none
integer :: num, var1, var2, var3
integer :: io_stat
namelist /mydata/ num, var1, var2, var3
open(unit=100, file='data.nml', status='old', action='read')
do
read(100, nml=mydata, iostat=io_stat)
if (io_stat /= 0) exit
print *, num, var1, var2, var3
end do
close(100)
end program read_mixed

While I still stand with my original answer, particularly because the input data is already so close to what a namelist file would look like, let's assume that you really can't make any preprocessing of the data beforehand.
The next best thing is to read in the whole line into a character(len=<enough>) variable, then extract the values out of that with String Manipulation. Something like this:
program mixed2
implicit none
integer :: num, val1, val2, val3
character(len=50) :: line
integer :: io_stat
open(unit=100, file='data.dat', action='READ', status='OLD')
do
read(100, '(A)', iostat=io_stat) line
if (io_stat /= 0) exit
call get_values(line, num, val1, val2, val3)
print *, num, val1, val2, val3
end do
close(100)
contains
subroutine get_values(line, n, v1, v2, v3)
implicit none
character(len=*), intent(in) :: line
integer, intent(out) :: n, v1, v2, v3
integer :: idx
! Search for "number#"
idx = index(line, 'number#') + len('number#')
! Get the integer after that word
read(line(idx:idx+3), '(I4)') n
idx = index(line, 'var1') + len('var1=')
read(line(idx:idx+3), '(I4)') v1
idx = index(line, 'var2') + len('var3=')
read(line(idx:idx+3), '(I4)') v2
idx = index(line, 'var3') + len('var3:')
read(line(idx:idx+3), '(I4)') v3
end subroutine get_values
end program mixed2
Please note that I have not included any error/sanity checking. I'll leave that up to you.

Related

Reading three specific numbers in a text file and writing them out

I have a problem how to print only specific three numbers, which are in a file with no format. I have no idea how to read it and print because if I use read from higher i, it does not start reading e.g. for i = 4, the line 4. I need only numbers 88.98, 65.50, and 30.
text
678 people
450 girls
22 old people
0 cats
0 dogs
4 girls blond
1 boy blond
1 old man
0 88.9814 xo xi
0 65.508 yo yi
0 30 zo zi
I tried this, but this is not working at all.
program souradnice
implicit none
integer :: i, k
character*100 :: yo, zo, line, name, text
real :: xo
open(10,file="text.dat", status='old')
do i=20,20
read(10,fmt='(a)') line
read(unit=line, fmt='(a100)') text
if(name=="xo") then
print *, trim(text)
endif
enddo
close(10)
end program souradnice
You need to read the whole file line by line, and check each line to see if it's the one you want, e.g. by using the index intrinsic. For example,
program souradnice
implicit none
character(100) :: line
character(5) :: matches(3)
real :: numbers(3)
character(10) :: dummy
integer :: i, ierr
! Substrings to match to find the relevant lines
matches = ["xo xi", "yo yi", "zo zi"]
open(10,file="text.dat", status='old')
do
! Read a line from the file, and exit the loop if the file end is reached.
read(10,fmt='(a)',iostat=ierr) line
if (ierr<0) then
exit
endif
do i=1,3
! Check if `line` matches any of the i'th line we want.
if (index(line, matches(i))>0) then
! If it matches, read the relevant number into `numbers`.
read(line,*) dummy, numbers(i)
endif
enddo
enddo
write(*,*) numbers
end program

Trouble reading reals from unknown length character string in Fortran

This is a small portion of the data I am trying to read:
01/06/2009,Tom Sanders,,264,220,73,260
01/08/2009,Adam Apple,158,,260,,208
01/13/2009,Lori Freeman,230,288,218,282,234
01/15/2009,Diane Greenberg,170,,250,321,197
01/20/2009,Adam Apple,257,,263,256,190
01/21/2009,Diane Greenberg,201,,160,195,142
01/27/2009,Tom Sanders,267,,143,140,206
01/29/2009,Tina Workman,153,,124,155,140
02/03/2009,Tina Workman,233,,115,,163
02/03/2009,Adam Apple,266,130,310,,310
the numbers between each comma are from a different location
Where two commas would represent missing data and a trailing comma would mean the fifth data point is missing
My goal is to organize the data into a table after calculating the average of each site and person, hence my two dim arrays
I want my output to look something like the following:
(obviously neater formatting but a table nonetheless)
Average Observed TDS (mg/l)
Name Site 1 Site 2 Site 3 Site 4 Site 5
------------------------------------------------------
Tom Sanders 251.0 172.5 251.7 160.0 229.0
Adam Apple 227.0 130.0 277.7 256.0 236.0
Lori Freeman 194.0 288.0 216.7 279.0 202.7
Diane Greenberg 185.5 190.0 205.0 258.0 169.5
Tina Workman 193.0 140.0 119.5 155.0 163.0
This is my program so far:
program name_finder
implicit none
integer, parameter :: wp = selected_real_kind(15)
real(wp) :: m, tds
real(wp), dimension(20,5) :: avg_site, site_sum
integer, dimension(20) :: nobs
integer, dimension(5) :: x
integer :: ierror, i, nemp, cp, non, ni, n
character(len=40), dimension(20) :: names
character(len=200) :: line, aname
character(len=20) :: output, filename
character(len=3), parameter :: a = "(A)"
do
write(*,*) "Enter file to open."
read(*,*) filename
open(unit=10,file = filename, status = "old", iostat = ierror)
if (ierror==0) exit
end do
write(*,*) "File, ",trim(filename)," has been opened."
non = 0
outer: do
read(10,a, iostat = ierror) line
if (ierror/=0) exit
cp = index(line(12:),",") + 11
aname = line(12:cp-1)
n=0
middle: do
read(line,'(Tcp,f4.2)') tds
write(*,*) "tds=", tds
n=n+1
if (n>10) exit
i = 1
inner: do
if (i > non) then
non = non +1
names(non) = trim(aname)
!ni = non
exit
end if
if (aname == names(i)) then
!ni = i
!cycle outer
exit inner
end if
i = i + 1
end do inner
end do middle
end do outer
write(*,*)
write(*,*) "Names:"
do i = 1,non
write(*,*) i, names(i)
end do
close(10)
close(20)
STOP
end program name_finder
TLDR; I am having trouble reading the data from the file shown at the top of each site after the names.
Suggestions? Thanks!
I hope the following is helpful. I have omitted any easily assumed declarations or any further data manipulation or writing to another file. The code is used just to read the data line by line.
character(150) :: word
read(fileunit, '(A)') word ! read the entire line
comma_ind = index(word,',') ! find the position of first comma
! Find the position of next comma
data_begin = index(word(comma_ind+1:),',')
! Save the name
thename = word(comma_ind+1:comma_ind+data_begin-1)
! Define next starting point
data_begin = comma_ind+data_begin
! Read the rest of the data
outer: do
if (word(data_begin+1:data_begin+1) == ',') then
! decide what to do when missing an entry
data_begin = data_begin + 1
cycle outer
else if (word(data_begin+1:data_begin+1) == ' ') then
! Missing last entry
exit outer
else
! Use it to find the length of current entry
st_ind = index(word(data_begin+1:),',')
if (st_ind == 0) then
! You reached the last entry, read it and exit
read(word(data_begin+1:), *) realData
exit outer
else
! Read current entry
read(word(data_begin+1: data_begin+st_ind-1),*) realData
end if
! Update starting point
data_begin = data_begin + st_ind
end if
end do outer
There could be a more elegant way to do it but I cannot think of any at the moment.

data entrance error Fortran

I'm learning how to programming with fortran90 and i need receive data from a txt file by the command prompt (something like that:
program.exe"<"data.txt).
at the Input txt file I'll always have a single line with at least 6 numbers till infinity.
if the data was wrote line by line it runs fine but as single line I'm receiving the error: "traceback:not available,compile with - ftrace=frame or - ftrace=full fortran runtime error:end file"
*note: i'm using Force fortran 2.0
here is example of data:
0 1 0.001 5 3 1 0 -9 3
edit: just clarifying: the code is working fine itself except for the read statement, which is a simple "read*,". I want know how To read a entire line from a txt once the entrance will be made by the promt command with stream direction.
( you can see more about that here: https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/redirection.mspx?mfr=true).
there is no need to read the code, i've posted it just for knowledge.
I'm sorry about the whole inconvenience.
here is the code so far:
program bissecao
implicit none
integer::cont,int,e,k,intc,t1,t2,t3
doubleprecision::ii,is,pre,prec,erro,somaa,somab,xn
doubleprecision,dimension(:),allocatable::co
t1=0
t2=0
t3=0
! print*,"insira um limite inf da funcao"
read*,ii
!print*,"insira o limite superior da func"
read*,is
! print*,"insira a precisÆo admissivel"
read*,pre
if (erro<=0) then !elimina criterio de parada negativo ou zero
Print*,"erro"
go to 100
end if
!print*,"insira a qtd iteracoes admissiveis"
read*,int
!print*,"insira o grau da f(x)"
read*,e
if (e<=0) then ! elimina expoente negativo
e=(e**2)**(0.5)
end if
allocate(co(e+1))
!print*, "insira os coeficientes na ordem:&
! &c1x^n+...+(cn-1)x^1+cnx^0"
read(*,*)(co(k),k=e+1,1,-1)
somab=2*pre
intc=0
do while (intc<int.and.(somab**2)**0.5>pre.and.((is-ii)**2)**0.5>pre)
somab=0
somaa=0
xn =(ii+is)/2
do k=1,e+1,1
if (ii /=0) then
somaa=ii**(k-1)*co(k)+somaa
else
somaa=co(1)
end if
! print*,"somaa",k,"=",somaa
end do
do k=1,(e+1),1
if (xn/=0) then
somab=xn**(k-1)*co(k)+somab
else
somab=co(1)
end if
!print*,"somab",k,"=",somab
end do
if ((somaa*somab)<0) then
is=xn
else if((somaa*somab)>0)then
ii=xn
else if ((somaa*somab)==0) then
xn=(ii+is)/2
go to 100
end if
intc =intc+1
prec=is-ii
if ((((is-ii)**2)**.5)< pre) then
t3=1
end if
if (((somab**2)**.5)< pre) then
t2=1.
end if
if (intc>=int) then
t1=1
end if
end do
somab=0
xn=(ii+is)/2
do k=1,(e+1),1
if (xn/=0) then
somab=xn**(k-1)*co(k)+somab
else
somab=co(1)
end if
end do
100 write(*,'(A,F20.15,A,F20.15,A,A,F20.15,A,F20.15,A,I2)'),"I:[",ii,",",is,"]","raiz:",xn,"Fraiz:",somab,"Iteracoes:",intc
end program !----------------------------------------------------------------------------
In your program, you are using the "list-directed input" (i.e., read *, or read(*,*))
read *, ii
read *, is
read *, pre
read *, int
read *, e
read *, ( co( k ), k = e+1, 1, -1 )
which means that the program goes to the next line of the data file after each read statement (by neglecting any remaining data in the same line). So, the program works if the data file (say "multi.dat") consists of separate lines (as suggested by OP):
0
1
0.001
5
3
1 0 -9 3
But now you are trying to read an input file containing only a single line (say "single.dat")
0 1 0.001 5 3 1 0 -9 3
In this case, we need to read all the values with a single read statement (if list-directed input is to be used).
A subtle point here is that the range of array co depends on e, which also needs to be read by the same read statement. A workaround might be to just pre-allocate co with a sufficiently large number of elements (say 100) and read the data in a single line, e.g.,
integer :: k
allocate( co( 100 ) )
read *, ii, is, pre, int, e, ( co( k ), k = e+1, 1, -1 )
For completeness, here is a test program where you can choose method = 1 or 2 to read "multi.dat" or "single.dat".
program main
implicit none
integer :: int, e, k, method
double precision :: ii, is, pre
double precision, allocatable :: co(:)
allocate( co( 1000 ) )
method = 1 !! 1:multi-line-data, 2:single-line-data
if ( method == 1 ) then
call system( "cat multi.dat" )
read*, ii
read*, is
read*, pre
read*, int
read*, e
read*, ( co( k ), k = e+1, 1, -1 )
else
call system( "cat single.dat" )
read*, ii, is, pre, int, e, ( co( k ), k = e+1, 1, -1 )
endif
print *, "Input data obtained:"
print *, "ii = ", ii
print *, "is = ", is
print *, "pre = ", pre
print *, "int = ", int
print *, "e = ", e
do k = 1, e+1
print *, "co(", k, ") = ", co( k )
enddo
end program
You can pass the input file from standard input as
./a.out < multi.dat (for method=1)
./a.out < single.dat (for method=2)
Please note that "multi.dat" can also be read directly by using "<".

Program to create spatial grid, average values that fall within grid, write to table

So I'm trying to come up with a clever way to make this program read a catalog and take anything falling within specific spatial "grid" boxes and average the data in that box together. I'll paste my horrid attempt below and hopefully you'll see what I'm trying to do. I can't get the program to work correctly (it gets stuck in a loop somewhere that I haven't debugged), and before I bang my head against it anymore I want to know if this looks like a logical set of operations for what I'm looking to do, or if there is a better way to accomplish this.
Edit: To clarify, the argument section is for the trimming parameters---"lmin lmax bmin bmax" set the overall frame, and "deg" sets the square-degree increments.
program redgrid
implicit none
! Variable declarations and settings:
integer :: ncrt, c, i, j, k, count, n, iarg, D, db, cn
real :: dsun, pma, pmd, epma, epmd, ra, dec, degbin
real :: V, Per, Amp, FeH, EBV, Dm, Fi, FeHav, EBVav
real :: lmin, lmax, bmin, bmax, l, b, deg, lbin, bbin
real :: bbinmax, bbinmin, lbinmax, lbinmin
character(len=60) :: infile, outfile, word, name
parameter(D=20000)
dimension :: EBV(D), FeH(D), lbinmax(D), bbinmax(D)
dimension :: bbinmin(D), lbinmin(D)
103 format(1x,i6,4x,f6.2,4x,f6.2,4x,f7.2,3x,f6.2,4x,f5.2,4x,f5.2,4x,f5.2,4x,f6.4)
3 continue
iarg=iargc()
if(iarg.lt.7) then
print*, 'Usage: redgrid infile outfile lmin lmax bmin bmax square_deg'
stop
endif
call getarg(1, infile)
call getarg(2, outfile)
call getarg(3, word)
read(word,*) lmin
call getarg(4, word)
read(word,*) lmax
call getarg(5, word)
read(word,*) bmin
call getarg(6, word)
read(word,*) bmax
call getarg(7, word)
read(word,*) deg
open(unit=1,file=infile,status='old',err=3)
open(unit=2,file=outfile,status='unknown')
write(2,*)"| l center | b center | [Fe/H] avg | E(B-V) avg | "
FeHav = 0.0
EBVav = 0.0
lbinmin(1) = lmin
bbinmin(1) = bmin
degbin = (bmax-bmin)/deg
db = NINT(degbin)
do j = 1, db
bbinmax(j) = bbinmin(j) + deg
lbinmax(j) = lbinmin(j)*cos(bbinmax(j))
print*, lbinmin(j), bbinmin(j), db
cn = 1
7 continue
read(1,*,err=7,end=8) ncrt, ra, dec, l, b,&
V, dsun, FeH(cn), EBV(cn)
if(b.ge.bbinmin(j).and.b.lt.bbinmax(j)) then
if(l.ge.lbinmin(j).and.l.lt.lbinmax(j)) then
FeHav = FeHav + FeH(cn)
EBVav = EBVav + EBV(cn)
cn = cn + 1
end if
end if
goto 7
8 continue
FeHav = FeHav/cn
EBVav = EBVav/cn
write(2,*) lbinmax(j), bbinmax(j), FeHav, EBVav
bbinmin(j+1) = bbinmin(j) + deg
lbinmin(j+1) = lbinmin(j) + deg
end do
close(1)
close(2)
end program redgrid
Below is a small section of the table I'm working with. "l" and "b" are the two coordinates I am working with---they are angular, hence the need to make the grid components "b" and "l*cos(b)." For each 0.5 x 0.5 degree section, I need to have averages of E(B-V) and [Fe/H] within that block. When I write the file all I need are four columns: the two coordinates where the box is located, and the two averages for that box.
| Ncrt | ra | dec | l | b | V | dkpc | [Fe/H] | E(B-V) |
7888 216.53 -43.85 -39.56 15.78 15.68 8.90 -1.19 0.1420
7889 217.49 -43.13 -38.61 16.18 16.15 10.67 -1.15 0.1750
7893 219.16 -43.26 -37.50 15.58 15.38 7.79 -1.40 0.1580
Right now, the program gets stuck somewhere in the loop cycle. I've pasted the terminal output that happens when I run it, along with the command line I'm running it with. Please let me know if I can help clarify. This is a pretty complex problem for a Fortran rookie such as myself---perhaps I'm missing some fundamental knowledge that would make it much easier. Anyways, thanks in advance.
./redgrid table2.above redtest.trim -40 0 15 30 0.5
-40.0000000 15.0000000 30 0.00000000 0.00000000
-39.5000000 15.5000000 30 -1.18592596 0.353437036
^it gets stuck after two lines.
I assume that the program does what you want it to do, but you are looking for a few things to tidy the code up.
Well first up, I'd fix up the indentation.
Secondly, I'd not use unit numbers below 10.
INTEGER, PARAMETER :: in_unit = 100
INTEGER, PARAMETER :: out_unit = 101
...
OPEN(unit=in_unit, file=infile, status='OLD")
...
READ(in_unit, *) ...
...
CLOSE(in_unit)
Thirdly, I'd not use GOTOs and labels. You can do that in a loop far easier:
INTEGER :: read_status
DO j = 1, db
...
read_loop : DO
READ(in_unit, *, IOSTAT=read_status) ...
IF (read_status == -1) THEN ! EOF
EXIT read_loop
ELSEIF (read_status /= 0) THEN
CYCLE read_loop
ENDIF
...
END DO read_loop
...
END DO
There are a few dangers in your code, and even in this one above: It can lead to infinite loops. For example, if the opening of infile fails (e.g. the file doesn't exist), it loops back to label 3, but nothing changes, so it will eventually again try to open the same file, and probably have the same error.
Same above: If READ repeatedly fails without advancing, and without the error being an EOF, then the read loop will not terminate.
You have to think about what you want your program to do when something like this happens, and code it in. (For example: Print an error message and STOP if it can't open the file.)
You have a very long FORMAT statement. You can leave it like that, though I'd probably try to shorten it a bit:
103 FORMAT(I7, 2F10.2, F11.2, 4F9.2, F10.4)
This should be the same line, as numbers are usually right-aligned. You can also use strings as a format, so you could also do something like this:
CHARACTER(LEN=*), PARAMETER :: data_out_form = &
'(I7, 2F10.2, F11.2, 4F9.2, F10.4)'
WRITE(*, data_out_form) var1, var2, var3, ...
and again, that's one less label.

Adapting a Fortran Code to write a file, run an executable and read in arrays from a file

I am new to Fortran but I am trying to adapt a Fortran code and I am having trouble doing something which I think is probably quite simple.
I want to adapt a Fortran file called original.f so that it makes an input file called input.inp and populates it with 4 integers calculated earlier in original.f so that input.inp looks like, for example:
&input
A = 1
B = 2
C = 3
D = 4
&end
I know how to write this format:
OPEN(UNIT=10,FILE='input.inp')
WRITE (10,00001) 1,2,3,4
...
...
...
00001 Format (/2x,'&input',
& /2x,'A = ',i4,
& /2x,'B = ',i4,
& /2x,'C = ',i4,
& /2x,'D = ',i4,
& /2x,'&end')
(or something like this that I can fiddle with when I get it working) but I am not sure how to create the input.inp file write this into it and then use this input file.
The input file needs to be used to run an executable called "exec". I would run this in bash as:
./exec < input.inp > output.out
Where output.out contains two arrays called eg(11) and ai(11,6,2) (with dimensions given) like:
eg(1)= 1
eg(2)= 2
...
...
...
eg(11)= 11
ai(1,1,1)= 111
ai(1,2,1)= 121
...
...
...
ai(11,6,2)=1162
Finally I need to read these inputs back into original.f so that they can be used further down in file. I have defined these arrays at the beginning of original.f as:
COMMON /DATA / eg(11),ai(11,6,2)
But I am not sure of the Fortran to read data line by linw from output.out to populate these arrays.
Any help for any of the stages in this process would be hugely appreciated.
Thank you very much
James
Since you have shown how you create the input file, I assume the question is how to read it. The code shows how "a" and "b" can be read from successive lines after skipping the first line. On Windows, if the resulting executable is a.exe, the commands a.exe < data.txt or type data.txt | a.exe will read from data.txt.
program xread
implicit none
character (len=10) :: words(3)
integer, parameter :: iu = 5 ! assuming unit 5 is standard input
integer :: a,b
read (iu,*) ! skip line with &input
read (iu,*) words ! read "a", "=", and "1" into 3 strings
read (words(3),*) a ! read integer from 3rd string
read (iu,*) words ! read "b", "=", and "1" into 3 strings
read (words(3),*) b ! read integer from 3rd string
print*,"a =",a," b =",b
end program xread
If I understand the expanded question correctly, you have to work with an output file, produced by some other code you did not write, with lines like eg(1) = ....
For the simplest case where you know the number of elements and their ordering beforehand, you can simply search each line for the equals sign from behind:
program readme
implicit none
character(100) :: buffer
integer :: i, j, k, pos, eg(11), ai(11,6,2)
do i = 1,11
read*, buffer
pos = index(buffer, '=', back = .true.)
read(buffer(pos+1:), *) eg(i)
enddo
! I have assumed an arbitrary ordering here
do k = 1,2
do i = 1,11
do j = 1,6
read*, buffer
pos = index(buffer, '=', back = .true.)
read(buffer(pos+1:), *) ai(i,j,k)
enddo
enddo
enddo
end program
Assuming here for simplicity that the data are provided to standard input.