I am working with Fortran and I need to read a file that has 3 columns. The problem is that the 3rd column is a combination of integers, e.g. 120120101, and I need to separate each single value into a different column.
Usually, I manually remove the first 2 columns so the file would look like:
Info
0120012545
1254875541
0122110000
2254879933
To read this file where each single value is in a different column, I can use the following Fortran subroutine:
subroutine readF(imp, m, n)
implicit none
integer :: n,m,i,imp(n,m)
open(unit=100, file='file.txt', status='old', action='read')
do i=2,n
read(100,'(*(i1))') imp(i,1:m)
end do
close(unit=100)
end subroutine readF
I wonder if it is possible to read a file with the following content:
IDs Idx Info
ID001 1 125478521111
ID002 1 525478214147
ID003 2 985550004599
ID004 2 000478520002
and the result would look like:
ID001 1 1 2 5 4 7 8 5 2 1 1 1 1
ID002 1 5 2 5 4 7 8 2 1 4 1 4 7
ID003 2 9 8 5 5 5 0 0 0 4 5 9 9
ID004 2 0 0 0 4 7 8 5 2 0 0 0 2
where the value in the 3rd column is split into m columns.
The first row is the header, but I don't need it, so I start reading from the second line.
I tried to use the following subroutine, but it didn't work:
subroutine readF(imp, ind, m, n)
implicit none
integer :: n,m,i,imp(n,m),ind(n),chip(n)
open(unit=100, file='file.txt', status='old', action='read')
do i=2,n
read(100,'(i8,i1,*(i1))') ind(i),chip(i),imp(i,1:m)
end do
close(unit=100)
end subroutine readF
Does anyone know how I could read that file without manually removing the first two columns?
Thank you.
I am going to guess what each of the variables mean and also try to explain some apparent mistakes.
I believe your do i=2,n is a mistake because I have seen some of my students make this mistake. Starting i at 2 does not mean you are reading in from the second line, it is just the value of i. Then, assuming you have n data lines, you will miss the last data line because you are reading in n-1 lines. What you want is a blank read statement before the loop. This skips the header line. Then you want i to go from 1 to n.
From the order of the variables in the read statement, I assume ind is the ID number, chip is the Idx number, and imp has the Info numbers of 1 integer each up to m of them.
Your i8 will take the first 8 columns of information and try to interpret them as an integer. Well, 'ID001 1 ' (including the trailing blank) is the first 8 columns of the first data line and this is not an integer. You need to skip the 'ID' and read '001' into ind. Then skip 1 character and read 1 integer into chip, then skip 1 more character and read in the Info, 1 integer at a time. The x format specifier skips 1 character.
For each integer to go into imp separately, you need an implied do loop that goes from 1 to m. I used j there for that. If you do not know about implied do loops, please google it. It is quite standard in Fortran.
This code snippet will do just that:
open(unit=100, file='file.txt', status='old', action='read')
read(100,*) ! This skips the header line.
do i=1,n ! Read in n data lines.
read(100,'(2x,i3,1x,i1,1x,*(i1))') ind(i),chip(i),(imp(i,j),j=1,m)
end do
close(unit=100)
Additional answer to address the comment. I see you would have two options. First, get into line parsing. I would not choose this.
The second option is to read the line using list-directed input (the * format). List-directed input uses blanks to separate the input items. I would make the third item a character variable long enough to accommodate a length of m. This character variable can then be read with Fortran's read statement. This is called reading from an internal file. You would read each integer as before. This is what this would look like:
character(len=m) :: Info
character(len=32) :: Dumb ! fixed length (32 is an arbitrary choice); a read cannot fill an unallocated deferred-length string
open(unit=100, file='file.txt', status='old', action='read')
read(100,*) ! This skips the header line.
do i=1,n ! Read in n data lines.
read(100,*) Dumb, chip(i), Info
read(Info,'(*(i1))') (imp(i,j),j=1,m)
end do
close(unit=100)
The first read statement in the do loop is reading from the file. It puts the first column into Dumb (truncated if it is longer than Dumb), the second column into chip(i), and the entire 3rd column into a character string named Info.
The second read statement is reading from the internal file Info. You can use a read statement on a character string. Here I use the format specifiers and the implied do loop to extract 1 integer at a time.
I would like to ask how I can read the lines of the input in arbitrary order. In other words: how to read a given line of the input? I have written the next test program:
program main
implicit integer*4(i-n)
dimension ind(6)
do i=1,6
ind(i)=6-i
end do
open(7,file='test.inp',status='old')
do i=0,5
call fseek(7,ind(i+1),0)
read(7,*) m
write(*,*) m
call fseek(7,0,0)
end do
end
where test.inp contains:
1
2
3
4
5
6
My output given is:
4
5
6
2
3
4
What is the problem? I would expect
6
5
4
3
2
1
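fseek positions the file by byte offset, not by line number, which is why your output looks shuffled. For a text file the simplest thing is to just use an empty read to advance lines. This will read the nth line of a file opened with unit=iu: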
rewind(iu)
do i=1,n-1
read(iu,*)
enddo
read(iu,*)data
Note if you are doing a bunch of reads from the same file you should consider reading the whole file into a character array, then you can very simply access lines by index.
Here is an example of reading in a whole file:
implicit none
integer::iu=20,i,n,io
character(len=:),allocatable::line(:)
real::x,y
open(iu,file='filename')
n=0
do while(.true.) ! pass through once to count the lines
read(iu,*,iostat=io)
if(io.ne.0)exit
n=n+1
enddo
write(*,*)'lines in file=',n
!allocate the character array. Here I'm hard coding a max line length
!of 130 characters (that can be fixed if its a problem.)
allocate(character(130)::line(n))
rewind(iu)
!read in entire file
do i=1,n
read(iu,'(a)')line(i)
enddo
!now we can random access the lines using internal reads:
read(line(55),*)x,y
! ( obviously use whatever format you need on the read )
write(*,*)x,y
end
One obvious drawback to this is you can not read data that spans multiple lines the same as if you were reading from the file.
Edit: my old version of gfortran doesn't like that allocatable character syntax.
This works:
character(len=130),allocatable::line(:)
...
allocate(line(n))
Consider the following data:
Class Gender Condition Tenis
A Male Fail Fail 33
A Female Fail NotFail 23
S Male Yellow 14
BC Male Happy Elephant 44
I have a comma separated value with unformatted tabulation (it varies among tabs and whitespaces).
In one specific column I have compound words in which I would like to eliminate the space. In the above example, I would like to replace "Fail " with "Fail_" and "Happy " with "Happy_".
The result would be the following:
Class Gender Condition Tenis
A Male Fail_Fail 33
A Female Fail_NotFail 23
S Male Yellow 14
BC Male Happy_Elephant 44
I already managed to do that in two steps:
:%s/Fail /Fail_/g
:%s/Happy /Happy_/g
Question: As I'm very new to gVim, I am trying to perform these replacements in a single command, but I could not find how to do that.
After this step, I will tabulate my data with the following:
:%s/\s\+/,/g
And get the final result:
Class,Gender,Condition,Tenis
A,Male,Fail_Fail,33
A,Female,Fail_NotFail,23
S,Male,Yellow,14
BC,Male,Happy_Elephant,44
On SO, I searched for [vim] :%s two is:question and some variations, but I could not find a related thread, so I guess I am lacking the correct terminology.
Edit: This is the actual data (with more than 1 million rows). The problem starts in the 12th column (e.g. "Fail Planting" should be "Fail_Planting").
SP1 51F001 3 1 1 2 3 2001 52 52 H Normal 17,20000076 23,39999962 NULL NULL
SP1 51F001 3 1 1 2 3 2001 53 53 F Fail Planting 0 0 NULL NULL
SP1 51F001 3 1 1 2 3 2001 54 54 N Normal 13,89999962 0 NULL NULL
You can use an expression on the right hand side of the substitution.
:%s/\(Fail\|Happy\) \|\s\+/\= submatch(0) =~# '^\s\+$' ? ',' : submatch(1).'_'/g
So this finds Fail or Happy followed by a space, or a run of whitespace, and then checks whether the matched part is entirely whitespace. If it is, it is replaced by a comma; if it is not, the captured part is used with an underscore appended. submatch(0) is the whole match and submatch(1) is the first capture group.
Take a look at :h sub-replace-expression. If you want to do something very complex, you can define a function.
Very magic version
:%s/\v(Fail|Happy) |\s+/\= submatch(0) =~# '^\v\s+$' ? ',' : submatch(1).'_'/g
You have all the parts you just need to combine them together with |. Example:
:%s/\>\s\</_/g|%s/\s\+/,/g
I am using \> and \< to find words that only have one space between them so we can replace it with _.
For more help see:
:h /\>
:h :range
:h :bar
You could perhaps try a macro if there are certain conditions that are true (or write a vimscript, but my vimscript is very rusty). I will show a sample macro you could use:
Go to first line in file after the headings
press q to begin recording a macro
press t to choose the register t for recording to (I use t for "temp")
press ^ to move to the beginning of the line
press 2w to move to the third word (move 2 words to the right)
press e to move to the end of the word
press l (letter l) to move right one character (to the space)
press r to enter replace single character mode
press _ to enter an underscore
press j to move down a line
press q to stop recording the macro
Now that you have the macro stored in register t you can run the macro on every line in the file. If there are 100 lines in the file, you have already done 1 and there is a header, so you would type the following to run it on the remaining 98 lines:
98@t
These two commands:
:%s/\(\a\) \(\a\)/\1_\2/g
:%s/\s\+/,/g
seem to work on your sample:
SP1,51F001,3,1,1,2,3,2001,52,52,H,Normal,17,20000076,23,39999962,NULL,NULL
SP1,51F001,3,1,1,2,3,2001,53,53,F,Fail_Planting,0,0,NULL,NULL
SP1,51F001,3,1,1,2,3,2001,54,54,N,Normal,13,89999962,0,NULL,NULL
but you have decimal numbers here with a comma as separator that will mess with the "comma-separated-ness" of your data. Changing those commas into periods beforehand might be a good idea:
:%s/,/./g
SP1,51F001,3,1,1,2,3,2001,52,52,H,Normal,17.20000076,23.39999962,NULL,NULL
SP1,51F001,3,1,1,2,3,2001,53,53,F,Fail_Planting,0,0,NULL,NULL
SP1,51F001,3,1,1,2,3,2001,54,54,N,Normal,13.89999962,0,NULL,NULL
I'm trying to loop through all the lines in a document using FORTRAN 77 and comparing particular line positions to strings and then editing it.
E.g.:
|BXK |00640.3A |AWP |1.01|
|BUCKEYE MUNICIPAL AIRPORT |08794|
I want to change the 08794 to 0871994 in the second line.
This is what I have so far:
PROGRAM CONVERSION
IMPLICIT NONE
CHARACTER(LEN=120) :: ROW
CHARACTER(LEN=2) :: DATE1='19', DATE2='20'
INTEGER :: DATENUMBER
INTEGER :: J
OPEN(UNIT=1, FILE='BXK__96B.TXT', STATUS ='OLD')
OPEN(UNIT=2, FILE='BXK__96B_MODIFIED.TXT', STATUS='UKNOWN')
DO J=1,10000
READ(1,'(A)') ROW
IF (J==2) THEN
DATENUMBER = ICHAR(ROW(76))
IF ((DATENUMBER.LE.9) .AND. (DATENUMBER.GE.2)) THEN
WRITE(2, '(A)' ROW(1:75), DATE1, ROW(76:120))
ELSE
WRITE(2, '(A)' ROW(1:75), DATE2, ROW(76:120))
ENDIF
END IF
END DO
CONTINUE
CLOSE(1)
CLOSE(2)
END
Ahh, so what you mean is, you want to convert the 2-digit representation of the year found at the right end of line 2 into its 4-digit representation. You seem already to have figured out how to find the position of the leading digit of the year, ie 76. Rather easier than what you have written would be
integer :: year
.
.
.
read(line(76:77),'(i2)') year ! this reads year from the characters in positions 76,77
if (20<=year.and.year<=90) then ! not sure if this is precisely your test
year = year+1900
else
year = year+2000
end if
line(80:) = line(78:) ! shift the rest of the line right by two columns to make room
write(line(76:79),'(i4)') year
I haven't gone to the trouble of integrating this into the rest of your code, that should be straightforward, if not ask for more help.
Actually, I suppose you probably haven't figured out how to find the column at which you want to start reading the year from line 2. Precisely how you do this depends on what the format of your file really is. The functions you need to familiarise yourself with are, as one of the comments tells you, INDEX and SCAN.
If you are looking for the 4th character after the 2nd occurrence of | in line 2 you could do it this way:
integer :: posn_of_2nd_vertical_bar
.
.
.
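posn_of_2nd_vertical_bar = scan(row,'|') + scan(row(scan(row,'|')+1:),'|') ! the inner scan counts from the start of the substring, so add the position of the first bar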
and then replace your constant 76 with posn_of_2nd_vertical_bar+4
I was reading an exercise of UVA, which I need to simulate a deterministic pushdown automaton, to see
if certain strings are accepted or not by PDA on a given entry in the following format:
The first line of input will be an integer C, which indicates the number of test cases. The first line of each test case contains five integers E, T, F, S and C, where E represents the number of states in the automaton, T the number of transitions, F represents the number of final states, S the initial state and C the number of test strings respectively. The next line will contain F integers, which represent the final states of the automaton. Then come T lines, each with 2 integers I and J and 3 strings, L, T and A, where I and J (0 ≤ I, J < E) represent the state of origin and destination of a transition state respectively. L represents the character read from the tape into the transition, T represents the symbol found at the top of the stack and A the action to perform with the top of the stack at the end of this transition (the character used to represent the bottom of the pile is always Z. to represent the end of the string, or unstack the action of not taking into account the top of the stack for the transition character is used <alt+156> £). The alphabet of the stack will be capital letters. For chain A, the symbols are stacked from right to left (in the same way that the program JFlap, ie, the new top of the stack will be the character that is to the left). Then come C lines, each with an input string. The input strings may contain lowercase letters and numbers (not necessarily present in any transition).
The output in the first line of each test case must display the following string "Case G:", where G represents the number of test case (starting at 1). Then C lines on which to print the word "OK" if the automaton accepts the string or "Reject" otherwise.
For example:
Input:
2
3 5 1 0 5
2
0 0 1 Z XZ
0 0 1 X XX
0 1 0 X X
1 1 1 X £
1 2 £ Z Z
111101111
110111
011111
1010101
11011
4 6 1 0 5
3
1 2 b A £
0 0 a Z AZ
0 1 a A AAA
1 0 a A AA
2 3 £ Z Z
2 2 b A £
aabbb
aaaabbbbbb
c1bbb
abbb
aaaaaabbbbbbbbb
this is the output:
Output:
Case 1:
Accepted
Rejected
Rejected
Rejected
Accepted
Case 2:
Accepted
Accepted
Rejected
Rejected
Accepted
I need some help, or any idea of how I can simulate this PDA. I am not asking for code that solves the problem, because I want to write my own code (the idea is to learn, right?), but I need some help (some idea or pseudocode) to begin the implementation.
You first need a data structure to keep the transitions. You can use a vector of transition structs, each holding one transition quintuple. Alternatively, you can use the fact that states are integers and create a vector that keeps the transitions from state 0 at index 0, the transitions from state 1 at index 1, and so on. This way you reduce the time spent searching for the correct transition.
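A minimal sketch of the second layout, indexed by source state, might look like the following. The names Transition, TransitionTable, makeTable and addTransition are hypothetical, and storing the symbols as std::string is an assumption made so that the multi-byte £ marker fits without special handling.

```cpp
#include <string>
#include <vector>

// One transition leaving a state. The source state is not stored here
// because it is the index into the outer vector.
struct Transition {
    int to;            // destination state J
    std::string read;  // symbol L read from the tape (or the "nothing" marker)
    std::string top;   // symbol T expected on top of the stack
    std::string push;  // string A that replaces the top of the stack
};

// table[s] holds every transition that leaves state s.
using TransitionTable = std::vector<std::vector<Transition>>;

// Creates an empty table for an automaton with numStates states.
TransitionTable makeTable(int numStates) {
    return TransitionTable(numStates);
}

// Files a transition under its source state.
void addTransition(TransitionTable& table, int from, const Transition& t) {
    table[from].push_back(t);
}
```

With this layout, looking for a transition from the current state only scans table[current] instead of all T transitions.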
For the stack itself you can simply use the stack from the STL. You also need a search function, which will change depending on your implementation. If you use the first method, you can use a function like:
int findIndex(vector<quintuple> v)//which finds the index of correct transition otherwise returns -1
then use the return value to get newstate and newstack symbol.
Or you can use a for loop over the vector and bool flag which represents transition is found or not.
With the second method you can use a function that takes references to the new state and the new stack symbol and sets them if it finds an appropriate transition.
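As a rough illustration of that reference-based search, here is a sketch under the assumption that the transitions leaving the current state are already grouped in one vector. The names Transition and findTransition are hypothetical, and the function only performs the lookup; it does not handle the £ marker or drive the simulation.

```cpp
#include <string>
#include <vector>

// One transition leaving the current state (hypothetical layout).
struct Transition {
    int to;            // destination state
    std::string read;  // symbol read from the tape
    std::string top;   // symbol expected on top of the stack
    std::string push;  // string that replaces the top of the stack
};

// Looks for a transition that matches the input symbol and the stack top.
// On success, writes the destination state and the replacement stack
// string through the references and returns true; otherwise leaves both
// untouched and returns false.
bool findTransition(const std::vector<Transition>& fromState,
                    const std::string& input, const std::string& top,
                    int& newState, std::string& newStack) {
    for (const Transition& t : fromState) {
        if (t.read == input && t.top == top) {
            newState = t.to;
            newStack = t.push;
            return true;
        }
    }
    return false;
}
```

Returning a bool plays the role of the "found" flag mentioned above, so the caller can reject the string as soon as no transition applies.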
For the inputs you can use something like vector or vector, depending on personal taste. You can implement your main method with for loops, but if you want an extra challenge you can implement a recursive function. May it be easy.