Read text file where the columns have specific format

I am working with Fortran and I need to read a file that has 3 columns. The problem is that the 3rd column is a run of single-digit integers, e.g. 120120101, and I need to separate each digit into its own column.
Usually, I manually remove the first 2 columns so the file would look like:
Info
0120012545
1254875541
0122110000
2254879933
To read this file where each single value is in a different column, I can use the following Fortran subroutine:
subroutine readF(imp, m, n)
implicit none
integer :: n,m,i,imp(n,m)
open(unit=100, file='file.txt', status='old', action='read')
do i=2,n
read(100,'(*(i1))') imp(i,1:m)
end do
close(unit=100)
end subroutine readF
I wonder if it is possible to read a file with the following content:
IDs Idx Info
ID001 1 125478521111
ID002 1 525478214147
ID003 2 985550004599
ID004 2 000478520002
and get a result that looks like this:
ID001 1 1 2 5 4 7 8 5 2 1 1 1 1
ID002 1 5 2 5 4 7 8 2 1 4 1 4 7
ID003 2 9 8 5 5 5 0 0 0 4 5 9 9
ID004 2 0 0 0 4 7 8 5 2 0 0 0 2
where the values in the 3rd column are split into m columns.
The first row is the header, but I don't need it, so I start reading from the second line.
I tried to use the following subroutine, but it didn't work:
subroutine readF(imp, ind, m, n)
implicit none
integer :: n,m,i,imp(n,m),ind(n),chip(n)
open(unit=100, file='file.txt', status='old', action='read')
do i=2,n
read(100,'(i8,i1,*(i1))') ind(i),chip(i),imp(i,1:m)
end do
close(unit=100)
end subroutine readF
Does anyone know how I could read that file without manually removing the first two columns?
Thank you.

I am going to guess what each of the variables mean and also try to explain some apparent mistakes.
I believe your do i=2,n is a mistake because I have seen some of my students make this mistake. Starting i at 2 does not mean you are reading in from the second line, it is just the value of i. Then, assuming you have n data lines, you will miss the last data line because you are reading in n-1 lines. What you want is a blank read statement before the loop. This skips the header line. Then you want i to go from 1 to n.
From the order of the variables in the read statement, I assume ind is the ID number, chip is the Idx number, and imp has the Info numbers of 1 integer each up to m of them.
Your i8 will take the first 8 characters of the line and try to interpret them as an integer. In the first data line those characters are ID001, a blank, 1, and another blank, which is not an integer. You need to skip the 'ID' and read '001' into ind, then skip 1 character and read 1 integer into chip, then skip 1 more character and read the Info digits, 1 integer at a time. The x edit descriptor skips characters: 1x skips one, 2x skips two.
For each integer to go into imp separately, you need an implied do loop that goes from 1 to m. I used j there for that. If you do not know about implied do loops, please google it. It is quite standard in Fortran.
This code snippet will do just that:
open(unit=100, file='file.txt', status='old', action='read')
read(100,*) ! This skips the header line.
do i=1,n ! Read in n data lines.
read(100,'(2x,i3,1x,i1,1x,*(i1))') ind(i),chip(i),(imp(i,j),j=1,m)
end do
close(unit=100)
Additional answer to address the comment. I see two options. The first is to get into line parsing. I would not choose this.
The second option is to read the line using list-directed input (* as the format). List-directed input uses blanks to separate the input items. I would make the third item a character variable long enough to hold m characters. This character variable can then itself be read with Fortran's read statement. This is called reading from an internal file. You would read each integer as before. This is what it would look like:
character(len=m) :: Info
character(len=32) :: Dumb ! fixed length: a read does not allocate a deferred-length string
open(unit=100, file='file.txt', status='old', action='read')
read(100,*) ! This skips the header line.
do i=1,n ! Read in n data lines.
read(100,*) Dumb, chip(i), Info
read(Info,'(*(i1))') (imp(i,j),j=1,m)
end do
close(unit=100)
The first read statement in the do loop reads from the file. It puts the first column into Dumb (which must be declared long enough to hold it), the second column into chip(i), and the entire 3rd column into a character string named Info.
The second read statement reads from the internal file Info. You can use a read statement on a character string. Here I use the format specifiers and the implied do loop to extract 1 integer at a time.
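For reference, the second option can be put together as a small stand-alone program. This is only a sketch under stated assumptions: the file name sample_ids.txt, the sizes n = 4 and m = 12, and the 16-character id buffer are made up for illustration, and the program writes the sample data itself so that it can be run as is.

```fortran
program split_digits
  implicit none
  integer, parameter :: n = 4, m = 12   ! assumed sizes, matching the sample data
  integer :: i, j, u
  integer :: chip(n), imp(n, m)
  character(len=16) :: id(n)            ! hypothetical buffer for the ID column
  character(len=m)  :: info             ! holds the digit string of one line

  ! Write the sample data so the sketch is self-contained
  ! (sample_ids.txt is a made-up file name).
  open(newunit=u, file='sample_ids.txt', status='replace', action='write')
  write(u, '(a)') 'IDs Idx Info'
  write(u, '(a)') 'ID001 1 125478521111'
  write(u, '(a)') 'ID002 1 525478214147'
  write(u, '(a)') 'ID003 2 985550004599'
  write(u, '(a)') 'ID004 2 000478520002'
  close(u)

  open(newunit=u, file='sample_ids.txt', status='old', action='read')
  read(u, *)                                      ! skip the header line
  do i = 1, n
    read(u, *) id(i), chip(i), info               ! list-directed: split on blanks
    read(info, '(*(i1))') (imp(i, j), j = 1, m)   ! internal read: one digit per integer
  end do
  close(u)

  ! Print each line with the digits separated by blanks.
  do i = 1, n
    write(*, '(a,1x,i0,*(1x,i1))') trim(id(i)), chip(i), (imp(i, j), j = 1, m)
  end do
end program split_digits
```

Keeping the ID as a character variable avoids having to know how many characters precede the numeric part of the ID.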

Related

Compile error for a simple Fortran 77 program

I copied and pasted in Sublime Text the following program from a Fortran 77 tutorial:
program circle
real r, area
c This program reads a real number r and prints
c the area of a circle with radius r.
write (*,*) 'Give radius r:'
read (*,*) r
area = 3.14159*r*r
write (*,*) 'Area = ', area
stop
end
I saved it as circle.f and compiled from the Terminal (macOS Sierra):
gfortran circle.f
It returned the error message:
circle.f:1:1:
program circle
1
Error: Non-numeric character in statement label at (1)
circle.f:1:1:
program circle
1
Error: Unclassifiable statement at (1)
How can I fix it? (The answer for another similar question does not solve the problem.)

Fortran 77 has fixed form source. Only columns 7 through 72 can be used for statements. (The first 6 columns are used to declare the whole line a comment, to hold numeric labels, or to mark the line as a continuation of the previous one.) Characters in column 73 and beyond are simply ignored.
Inside this range, spaces are ignored. So the following lines would be identical:
column   1    1    2    2    3    3    4    4
1   5    0    5    0    5    0    5    0    5
-----------------------------------------------
      if (i .le. 10) call my_sub(i)
      if(i.le.10)callmy_sub(i)
      i f ( i. le .10) cal lmy_ sub(i)
I leave it up to you to decide which one is easiest to read.
But if you start at the first character, even with the starting "program" statement, the compiler will complain. It expected a c, C, ! (to declare the whole line a comment) or a digit as the beginning of a numeric label.
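Applied to the program from the question, that means indenting every statement so that it starts in column 7. A sketch of the result follows; the only change beyond the indentation is that the comment lines use ! instead of c, which assumes a compiler that accepts Fortran 90 style comments (gfortran does).

```fortran
      program circle
      real r, area
! This program reads a real number r and prints
! the area of a circle with radius r.
      write (*,*) 'Give radius r:'
      read (*,*) r
      area = 3.14159*r*r
      write (*,*) 'Area = ', area
      stop
      end
```

Saved as circle.f, this should get past the "Non-numeric character in statement label" error, because columns 1 to 5 now hold only blanks.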

How to read the lines of the input in arbitrary order?

I would like to ask how I can read the lines of the input in arbitrary order. In other words: how do I read a given line of the input? I have written the following test program:
program main
implicit integer*4(i-n)
dimension ind(6)
do i=1,6
ind(i)=6-i
end do
open(7,file='test.inp',status='old')
do i=0,5
call fseek(7,ind(i+1),0)
read(7,*) m
write(*,*) m
call fseek(7,0,0)
end do
end
where test.inp contains:
1
2
3
4
5
6
The output I get is:
4
5
6
2
3
4
What is the problem? I would expect
6
5
4
3
2
1

For a text file the simplest thing is to use an empty read to advance past lines. This reads the nth line of the file opened on unit iu:
rewind(iu)
do i=1,n-1
read(iu,*)
enddo
read(iu,*)data
Note that if you are doing a bunch of reads from the same file, you should consider reading the whole file into a character array; then you can access lines very simply by index.
Here is an example of reading in a whole file:
implicit none
integer::iu=20,i,n,io
character(len=:),allocatable::line(:)
real::x,y
open(iu,file='filename')
n=0
do while(.true.) ! pass through once to count the lines
read(iu,*,iostat=io)
if(io.ne.0)exit
n=n+1
enddo
write(*,*)'lines in file=',n
!allocate the character array. Here I'm hard coding a max line length
!of 130 characters (that can be fixed if its a problem.)
allocate(character(130)::line(n))
rewind(iu)
!read in entire file
do i=1,n
read(iu,'(a)')line(i)
enddo
!now we can random access the lines using internal reads:
read(line(55),*)x,y
! ( obviously use whatever format you need on the read )
write(*,*)x,y
end
One obvious drawback is that you cannot read data that spans multiple lines the way you could when reading directly from the file.
Edit: my old version of gfortran doesn't like that allocatable character syntax.
This works:
character(len=130),allocatable::line(:)
...
allocate(line(n))
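Applied to the six-line test.inp from the question, the whole-file approach could look like the sketch below. The file name test_order.inp, the fixed line length of 130, and the variable names are assumptions for illustration; the program creates the file itself so that it runs stand-alone.

```fortran
program read_any_order
  implicit none
  integer, parameter :: nlines = 6          ! assumed number of lines
  integer :: order(nlines), i, u, val
  character(len=130) :: line(nlines)        ! assumed maximum line length

  ! Create the six-line input file (made-up name), then rewind to read it back.
  open(newunit=u, file='test_order.inp', status='replace', action='readwrite')
  do i = 1, nlines
    write(u, '(i0)') i
  end do
  rewind(u)

  ! Read the whole file into the character array, one line per element.
  do i = 1, nlines
    read(u, '(a)') line(i)
  end do
  close(u, status='delete')

  ! Visit the lines in any order; here: last line first.
  order = [(nlines + 1 - i, i = 1, nlines)]
  do i = 1, nlines
    read(line(order(i)), *) val             ! internal read from the stored line
    write(*, *) val
  end do
end program read_any_order
```

This prints 6, 5, 4, 3, 2, 1, the order the question expected, without calling fseek at all.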

reading input file in fortran

I want to read a file like this:
NE 32 0
IBZINT 2
NKTAB 936
XC-POT VWN
ITER 29
MIX 2.00000000000000E-01
TOL 1.00000000000000E-05
I thought the index intrinsic was what I was looking for, and wrote my code accordingly:
EDIT The code is updated,
Implicit None
integer ::i,pos
character(50) :: name
character(len=16),dimension(100)::key,val
key(1)="NE"
open(12,file="FeRh/FeRh.pot_new",status="old")
do i=1,100
read(12,*)name
if (name(1:2)==key(1))then
write(*,*)"find NE"
write(*,*)name(1:2)
write(*,*)name(index("NE","")+21)
endif
end do
close(12)
!write(*,*)index(key(1),"")
End Program readpot
I am expecting to have 32 in the 3rd write statement.
Something must have gone horribly wrong somewhere. Can you kindly help?

When you want to read a line from the file you are using list-directed (* as the format) input. This isn't what you want as there will be some limited parsing by the run-time.
That is, read(12,*) name on the first record will result in "NE" padded with lots of spaces in the variable name as the record will be split on the spaces.
As you want the entire line in name, use the format '(A)' in the read.
Once you have that line, you can then do your further parsing. However, from what you show index doesn't seem to be helping, especially as you are checking against an empty substring. You know the length of the key (using len_trim) so if you have a match you know the location of the first separator.
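A minimal sketch of that approach: read each whole line with '(A)', compare its start against the key, and then read the values from the rest of the line. The file name sample.pot, the 80-character line buffer, and the names int1 and int2 are assumptions for illustration; the program writes a few sample lines itself so that it can be run as is.

```fortran
program find_key
  implicit none
  character(len=*), parameter :: key = 'NE'
  character(len=80) :: line                 ! assumed maximum line length
  integer :: u, ios, klen, int1, int2

  ! Write a few sample lines (sample.pot is a made-up file name).
  open(newunit=u, file='sample.pot', status='replace', action='readwrite')
  write(u, '(a)') 'NE 32 0'
  write(u, '(a)') 'IBZINT 2'
  write(u, '(a)') 'NKTAB 936'
  rewind(u)

  klen = len(key)
  do
    read(u, '(a)', iostat=ios) line         ! '(a)' keeps the entire line
    if (ios /= 0) exit                      ! end of file or read error
    ! Match the key only when it is followed by a blank separator.
    if (line(1:klen) == key .and. line(klen+1:klen+1) == ' ') then
      read(line(klen+1:), *) int1, int2     ! internal list-directed read
      write(*, *) 'found ', key, ':', int1, int2
    end if
  end do
  close(u, status='delete')
end program find_key
```

Checking for the blank after the key keeps a key such as NE from matching a longer keyword that merely begins with the same letters.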

If I wanted to read a line such as
NE 32 0
I'd write a statement such as
read(12,*) name, int1, int2
and expect my processor to set name to NE, int1 to 32 and int2 to 0, if, that is, I'd declared int1 and int2 to be integers.
I'm puzzled that you seem to want to read a line of text and then parse it, all the while ignoring the benefits of list-directed input. If you do want to parse it into something other than a character variable and two integers, let us know.

Starting reading from specific line numbers in Fortran

I have a file with 1000s of numbers like:
0000
0032
1201
: :
: :
: :
2324
Depending on an input parameter "n", I want to read "m" numbers from this file from line numbers "n" to "n+m-1".
Any ideas how can I do this in Fortran?

I don't know if you have tried it yourself, but here is a minimal example.
Say your input file looks like this:
0000
0032
1201
1234
4567
7890
2324
Use this code (after reading it):
Program jhp
Implicit None
integer :: i
integer, parameter :: &
m=7, & ! total number of lines in the file
n=4, & ! lines to skip
p=3 ! lines to read
integer,dimension(m)::arr ! values read from the file
open(12,file='file_so',status='old')
do i=1,n
read(12,*) ! empty read: skip one line
end do
do i=1,p
read(12,*)arr(i)
write(*,*)arr(i)
end do
close(12)
End Program jhp
This skips the first n lines and reads the p lines after that.
Hope that helps

Maybe
open (unit, file ...)
do i=1,n-1 ! skip the n-1 lines before line n
read(unit,*) crap
end do
do i =n,n+m-1 ! read lines n to n+m-1
read(unit,*) whatever
end do
close(unit)
is what you are looking for. This is untested, but it may get you going.
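Edit: I just realised that although the loop above is the easiest approach, it is not the preferred one; direct access is better for this type of job.
You can open the file in direct access mode and read record n directly. RECL must match the actual length of one record in the file, and a formatted direct access read needs an explicit format rather than *:
OPEN( unit, file, ACCESS='DIRECT', RECL=100, FORM='FORMATTED')
READ( unit, '(I4)', REC=n, ERR=10 ) x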
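Here is a self-contained sketch of the direct access idea. To keep it portable it creates its own file of fixed-length 4-character records; the file name numbers.dat, the record length, and the sample values are assumptions for illustration. Such a file has no line terminators between records, so for an ordinary text file RECL would also have to account for the newline characters, which is platform dependent.

```fortran
program direct_read
  implicit none
  integer, parameter :: nrec = 7, n = 4, m = 3   ! assumed sizes
  integer :: vals(nrec) = [0, 32, 1201, 1234, 4567, 7890, 2324]
  integer :: u, i, x

  ! Create a file of fixed-length 4-character records (made-up name).
  open(newunit=u, file='numbers.dat', access='direct', form='formatted', &
       recl=4, status='replace', action='readwrite')
  do i = 1, nrec
    write(u, '(i4.4)', rec=i) vals(i)
  end do

  ! Jump straight to records n .. n+m-1 without reading the earlier ones.
  do i = n, n + m - 1
    read(u, '(i4)', rec=i) x
    write(*, '(i4.4)') x
  end do
  close(u, status='delete')
end program direct_read
```

With the sample values this prints 1234, 4567 and 7890, i.e. records 4 to 6.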

FORTRAN looping lines and character positions?

I'm trying to loop through all the lines in a document using FORTRAN 77 and comparing particular line positions to strings and then editing it.
E.g.:
|BXK |00640.3A |AWP |1.01|
|BUCKEYE MUNICIPAL AIRPORT |08794|
I want to change the 08794 to 0871994 in the second line.
This is what I have so far:
PROGRAM CONVERSION
IMPLICIT NONE
CHARACTER(LEN=120) :: ROW
CHARACTER(LEN=2) :: DATE1='19', DATE2='20'
INTEGER :: DATENUMBER
INTEGER :: J
OPEN(UNIT=1, FILE='BXK__96B.TXT', STATUS ='OLD')
OPEN(UNIT=2, FILE='BXK__96B_MODIFIED.TXT', STATUS='UKNOWN')
DO J=1,10000
READ(1,'(A)') ROW
IF (J==2) THEN
DATENUMBER = ICHAR(ROW(76))
IF ((DATENUMBER.LE.9) .AND. (DATENUMBER.GE.2)) THEN
WRITE(2, '(A)' ROW(1:75), DATE1, ROW(76:120))
ELSE
WRITE(2, '(A)' ROW(1:75), DATE2, ROW(76:120))
ENDIF
END IF
END DO
CONTINUE
CLOSE(1)
CLOSE(2)
END

Ahh, so what you mean is that you want to convert the 2-digit representation of the year found at the right end of line 2 into its 4-digit representation. You seem to have already figured out how to find the position of the leading digit of the year, i.e. 76. Rather easier than what you have written would be
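integer :: year
character(len=4) :: year4
.
.
.
read(line(76:77),'(i2)') year ! this reads year from the characters in positions 76,77
if (20<=year.and.year<=99) then ! mirrors the question's test on the leading digit (2 to 9); adjust if needed
year = year+1900
else
year = year+2000
end if
write(year4,'(i4)') year ! internal write: format the 4-digit year as text
line = line(:75)//year4//line(78:) ! insert it in place of the 2-digit year without overwriting what follows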
I haven't gone to the trouble of integrating this into the rest of your code, that should be straightforward, if not ask for more help.
Actually, I suppose you probably haven't figured out how to find the column at which you want to start reading the year from line 2. Precisely how you do this depends on what the format of your file really is. The functions you need to familiarise yourself with are, as one of the comments tells you INDEX and SCAN.
If you are looking for the 4th character after the 2nd occurrence of | in line 2 you could do it this way:
integer :: posn_of_2nd_vertical_bar
.
.
.
posn_of_2nd_vertical_bar = scan(row,'|') + scan(row(scan(row,'|')+1:),'|') ! the inner scan is relative to the substring, so add the first bar's position
and then replace your constant 76 with posn_of_2nd_vertical_bar+4
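Putting the pieces together, a sketch of the whole conversion for a single line might look like this. The sample line is a shortened version of the second line from the question (so the bar sits at a different column than in the real file), the cutoff of 20 for the century follows the question's test on the leading digit, and the variable names are assumptions for illustration.

```fortran
program expand_year
  implicit none
  character(len=120) :: row
  character(len=4)   :: year4
  integer :: bar1, bar2, pos, year

  ! Shortened sample of line 2 from the question.
  row = '|BUCKEYE MUNICIPAL AIRPORT |08794|'

  bar1 = scan(row, '|')                    ! position of the first bar
  bar2 = bar1 + scan(row(bar1+1:), '|')    ! absolute position of the second bar
  pos  = bar2 + 4                          ! 4th character after the second bar

  read(row(pos:pos+1), '(i2)') year        ! 2-digit year as an integer
  if (year >= 20) then                     ! assumed cutoff, taken from the question
    year = year + 1900
  else
    year = year + 2000
  end if

  write(year4, '(i4)') year                ! 4-digit year as text
  row = row(:pos-1) // year4 // row(pos+2:)   ! replace 2 digits with 4
  write(*, '(a)') trim(row)
end program expand_year
```

For the sample line this turns 08794 into 0871994, which is the change the question asks for.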
