How can I get specific lines from a file and add them to an array?
For example: I want to get lines 200-300 and put them in an array, and while doing that, count the total number of lines in the file. The file can be quite big.
File.each_line is a good reference for this:
lines = [] of String
index = 0
range = 200..300
File.each_line(file, chomp: true) do |line|
  index += 1
  if range.includes?(index)
    lines << line
  end
end
Now lines holds the lines in range and index is the number of total lines in the file.
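To avoid reading the entire file and allocating a new array for all of its content, you can use the File.each_line iterator and break once the range has been collected. Note that, because of the early break, this version does not count the total number of lines: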
lines = [] of String
File.each_line(file, chomp: true).with_index(1) do |line, idx|
  case idx
  when 1...200 then next # omit lines before line 200 (note exclusive range)
  when 200..300 then lines << line # collect lines 200-300
  else break # early break, to be efficient
  end
end
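For comparison, here is a minimal Python sketch of the first approach (collect the range and count every line in a single pass). The function name collect_range and its parameters are made up for this illustration; it accepts any iterable of lines, such as an open file object, so the whole file is never held in memory.

```python
def collect_range(lines, first, last):
    """Return (selected, total) for a 1-based, inclusive line range.

    `lines` is any iterable of strings, for example an open file object.
    `collect_range` is a hypothetical helper name, not a library function.
    """
    selected = []
    total = 0
    for total, line in enumerate(lines, start=1):
        if first <= total <= last:
            selected.append(line.rstrip("\n"))  # drop the trailing newline
    # After the loop, `total` holds the number of the last line seen.
    return selected, total


# Possible usage, assuming a file named 'big.txt' exists:
# with open('big.txt') as f:
#     lines, total = collect_range(f, 200, 300)
```

Because the total is wanted as well, the loop cannot break early; it only avoids storing lines outside the range.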
I'm writing in Fortran 90. My program must read file1, do something with every line of it, and write the result to file2. The problem is that file1 has some unneeded information in its first line.
How can I skip a line from input file using Fortran?
The code:
open (18, file='m3dv.dat')
open (19, file='m3dv2.dat')
do
read(18,*) x
tmp = sqrt(x**2 + 1)
write(19, *) tmp
end do
First line is a combination of text and numbers.
One possible solution has already been presented to you which uses a "dummy variable", but I just wanted to add that you don't even need a dummy variable, just a blank read statement before entering the loop is enough:
open(18, file='m3dv.dat')
read(18,*)
do
...
The other answers are correct but this can improve conciseness and (thus) readability of your code.
Perform a read operation before the do loop that reads whatever is on the first line into a "dummy" variable.
program linereadtest
implicit none
character (LEN=75) ::firstline
integer :: temp,n
!
!
!
open(18,file='linereadtest.txt')
read(18,*) firstline
do n=1,4
read(18,'(i3)') temp
write(*,*) temp
end do
stop
end program linereadtest
Datafile:
This is a test of 1000 things that 10
of which do not exist
50
100
34
566
!ignore the space in between the line and the numbers, I can't get it to format
open (18, file='m3dv.dat')
open (19, file='m3dv2.dat')
read(18,*) x ! <---
do
read(18,*) x
tmp = sqrt(x**2 + 1)
write(19, *) tmp
end do
The added line just reads the first line into x, which is then overwritten by the second line on the first iteration of the loop.
I have a .txt file with dozens of columns and hundreds of rows. I want to write the results of the entirety of two specific columns into two variables. I don't have a great deal of experience with for loops but here is my attempt to loop through the file.
a = open('file.txt', 'r') #<--This puts the file in read mode
header = a.readline() #<-- This skips the strings in the 0th row indicating the labels of each column
for line in a:
    line = line.strip() #removes '\n' characters in text file
    columns = line.split() #Splits the white space between columns
    x = float(columns[0]) # the 1st column of interest
    y = float(columns[1]) # the 2nd column of interest
print(x, y)
a.close()
Outside of the loop, printing x or y only displays the last value of the text file. I want it to have all the values of the specified columns of the file. I know of the append command but I am unsure how to apply it in this situation within the for loop.
Does anyone have any suggestions or easier methods on how to do this?
Make two lists x and y before you start the loop and append to them in the loop:
a = open('file.txt', 'r') #<--This puts the file in read mode
header = a.readline() #<-- This skips the strings in the 0th row indicating the labels of each column
x = []
y = []
for line in a:
    line = line.strip() #removes '\n' characters in text file
    columns = line.split() #Splits the white space between columns
    x.append(float(columns[0])) # the 1st column of interest
    y.append(float(columns[1])) # the 2nd column of interest
a.close()
print('all x:')
print(x)
print('all y:')
print(y)
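The same append pattern can be wrapped in a small function. This is only a sketch: the name read_two_columns and the col_x/col_y parameters are invented here, and skipping blank lines is an assumption about the data rather than something stated in the question.

```python
def read_two_columns(lines, col_x=0, col_y=1):
    """Collect two whitespace-separated columns as lists of floats.

    `lines` is any iterable of strings (e.g. an open file object) whose
    first item is a header row. Name and parameters are hypothetical.
    """
    rows = iter(lines)
    next(rows, None)  # skip the header row, if there is one
    xs, ys = [], []
    for line in rows:
        columns = line.split()
        if not columns:
            continue  # assumption: blank lines carry no data
        xs.append(float(columns[col_x]))
        ys.append(float(columns[col_y]))
    return xs, ys


# Possible usage; the with-statement closes the file automatically:
# with open('file.txt') as f:
#     x, y = read_two_columns(f)
```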
Your code only keeps the value of the last row, because x and y are rebound on every iteration. I'm not sure that is your entire code, but if you want to keep all the values of a column, I would suggest appending them to a list and then printing it outside of the loop.
listx = []
listy = []
a = open('openfile', 'r')
next(a)  # skip the header
for line in a:
    columns = line.split()  # split the line
    # set the x and y variables
    x = float(columns[0])
    y = float(columns[1])
    listx.append(x)
    listy.append(y)
a.close()
# print outside of the loop
print(listx)
print(listy)
The basic outline of this problem is to read the file, look for integers using the re.findall(), looking for a regular expression of [0-9]+ and then converting the extracted strings to integers and summing up the integers.
I am having trouble appending to the list. With my code below, it only appends the first (index 0) match of a line. Please help me. Thank you.
import re
hand = open('a.txt')
lst = list()
for line in hand:
    line = line.rstrip()
    stuff = re.findall('[0-9]+', line)
    if len(stuff) != 1: continue
    num = int(stuff[0])
    lst.append(num)
print(sum(lst))
import re
ls = []
text = open('C:/Users/pvkpu/Desktop/py4e/file1.txt')
for line in text:
    line = line.rstrip()
    l = re.findall('[0-9]+', line)
    if len(l) == 0:
        continue
    ls += l
for i in range(len(ls)):
    ls[i] = int(ls[i])
print(sum(ls))
Great, thank you for including the whole txt file! Your main problem was the if len(stuff)... line, which skipped a line not only when stuff was empty but also when it contained 2, 3, or more matches. You were only keeping stuff lists of length 1. I put comments in the code, but please ask if something is unclear.
import re
hand = open('a.txt')
str_num_lst = list()
for line in hand:
    line = line.rstrip()
    stuff = re.findall('[0-9]+', line)
    #If we didn't find anything on this line then continue
    if len(stuff) == 0: continue
    #if len(stuff)!= 1: continue #<-- This line was wrong as it skips lists with more than 1 element
    #If we did find something, stuff will be a list of strings:
    #(i.e. stuff = ['9607', '4292', '4498'] or stuff = ['4563'])
    #For now let's just add this list onto our str_num_lst
    #without worrying about converting to int.
    #We use '+=' instead of 'append' since both stuff and str_num_lst are lists
    str_num_lst += stuff
#Print out the str_num_lst to check if everything's ok
print(str_num_lst)
#Get an overall sum by looping over the string numbers in the str_num_lst
#Can convert to int inside the loop
overall_sum = 0
for str_num in str_num_lst:
    overall_sum += int(str_num)
#Print sum
print('Overall sum is:')
print(overall_sum)
EDIT:
You are right, reading in the entire file as one line is a good solution, and it's not difficult to do. Check out this post. Here is what the code could look like.
import re
hand = open('a.txt')
all_lines = hand.read() #Reads in all lines as one long string
all_str_nums_as_one_line = re.findall('[0-9]+',all_lines)
hand.close() #<-- can close the file now since we've read it in
#Go through all the matches to get a total
tot = 0
for str_num in all_str_nums_as_one_line:
    tot += int(str_num)
print('Overall sum is:',tot) #editing to add ()
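The whole-file version can be condensed further by feeding the matches straight into sum(). A minimal sketch, where sum_integers is a name made up for this example and the text is passed in as a string so the function does not depend on any particular file:

```python
import re


def sum_integers(text):
    """Sum every run of digits found in `text`.

    `sum_integers` is a hypothetical helper; it uses the same [0-9]+
    pattern as the code above.
    """
    return sum(int(match) for match in re.findall(r"[0-9]+", text))


# Possible usage, assuming 'a.txt' exists:
# with open('a.txt') as hand:
#     print('Overall sum is:', sum_integers(hand.read()))
```

Note that [0-9]+ treats "4x5" as two numbers and ignores signs and decimal points, exactly like the original pattern.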
I am trying to read a file that looks as follows:
Data Sampling Rate: 256 Hz
*************************
Channels in EDF Files:
**********************
Channel 1: FP1-F7
Channel 2: F7-T7
Channel 3: T7-P7
Channel 4: P7-O1
File Name: chb01_02.edf
File Start Time: 12:42:57
File End Time: 13:42:57
Number of Seizures in File: 0
File Name: chb01_03.edf
File Start Time: 13:43:04
File End Time: 14:43:04
Number of Seizures in File: 1
Seizure Start Time: 2996 seconds
Seizure End Time: 3036 seconds
So far I have this code:
fid1= fopen('chb01-summary.txt')
data=struct('id',{},'stime',{},'etime',{},'seizenum',{},'sseize',{},'eseize',{});
if fid1 ==-1
error('File cannot be opened ')
end
tline= fgetl(fid1);
while ischar(tline)
i=1;
disp(tline);
end
I want to use regexp to find the expressions and so I did:
line1 = '(.*\d{2} (\.edf)'
data{1} = regexp(tline, line1);
tline=fgetl(fid1);
time = '^Time: .*\d{2]}: \d{2} :\d{2}' ;
data{2}= regexp(tline,time);
tline=getl(fid1);
seizure = '^File: .*\d';
data{4}= regexp(tline,seizure);
if data{4}>0
stime = '^Time: .*\d{5}';
tline=getl(fid1);
data{5}= regexp(tline,seizure);
tline= getl(fid1);
data{6}= regexp(tline,seizure);
end
I tried using a loop to find the line at which file name starts with:
for (firstline<1) || (firstline>1 )
firstline= strfind(tline, 'File Name')
tline=fgetl(fid1);
end
and now I'm stumped.
Suppose that I am at the line at which the information is there, how do I store the information with regexp? I got an empty array for data after running the code once...
Thanks in advance.
I find it the easiest to read the lines into a cell array first using textscan:
%// Read lines as strings
fid = fopen('input.txt', 'r');
C = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
and then apply regexp on it to do the rest of the manipulations:
%// Parse field names and values
C = regexp(C{:}, '^\s*([^:]+)\s*:\s*(.+)\s*', 'tokens');
C = [C{:}]; %// Flatten the cell array
C = reshape([C{:}], 2, []); %// Reshape into name-value pairs
Now you have a cell array C of field names and their corresponding (string) values, and all you have to do is plug it into struct using the correct syntax (a comma-separated list in this case). Note that the field names have spaces in them, so this needs to be taken care of before they can be used (e.g., replace them with underscores):
C(1, :) = strrep(C(1, :), ' ', '_'); %// Replace spaces with underscores
data = struct(C{:});
Here's what I get for your input file:
data =
Data_Sampling_Rate: '256 Hz'
Channel_1: 'FP1-F7'
Channel_2: 'F7-T7'
Channel_3: 'T7-P7'
Channel_4: 'P7-O1'
File_Name: 'chb01_03.edf'
File_Start_Time: '13:43:04'
File_End_Time: '14:43:04'
Number_of_Seizures_in_File: '1'
Seizure_Start_Time: '2996 seconds'
Seizure_End_Time: '3036 seconds'
Of course, it is possible to prettify it even more by converting all relevant numbers to numerical values, grouping the 'channel' fields together and such, but I'll leave this to you. Good luck!
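The output above keeps only one set of File_* fields, because every file in the summary reuses the same field names. As a rough illustration of one way to keep a record per file, here is a Python sketch (not MATLAB) of the same name/value parsing. The names FIELD and parse_summary are invented for this example, and starting a new record at each "File Name" line is an assumption based on the sample file shown in the question.

```python
import re

# Same idea as the regexp above: "name : value", surrounding spaces trimmed.
FIELD = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def parse_summary(lines):
    """Split 'name: value' lines into a header dict and per-file records.

    Hypothetical helper: a new record starts at every 'File Name' line
    (an assumption based on the sample summary file).
    """
    header = {}
    records = []
    current = None
    for line in lines:
        match = FIELD.match(line)
        if not match:
            continue  # separators such as '*****' or lines without a value
        name = match.group(1).replace(" ", "_")
        value = match.group(2)
        if name == "File_Name":
            current = {}
            records.append(current)
        target = header if current is None else current
        target[name] = value
    return header, records
```

With the sample file, records would hold one dictionary per .edf file, and header would hold the sampling rate and channel names.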
I have a list with a set of strings and another dynamic list:
arr = ['sample1','sample2','sample3']
applist=[]
I am reading a text file line by line, and if a line starts with any of the strings in arr, then I append it to applist, as follows:
for line in open('test.txt').readlines():
    for word in arr:
        if line.startswith(word):
            applist.append(line)
Now, if I do not have a line with any of the strings in the arr list, then I want to append 'NULL' to applist instead. I tried:
for line in open('test.txt').readlines():
    for word in arr:
        if line.startswith(word):
            applist.append(line)
        elif word not in 'test.txt':
            applist.append('NULL')
But it obviously doesn't work (it inserts many unnecessary NULLs). How do I go about it? Also, there are other lines in the text file besides the three lines starting with the strings in arr. But I want to append only these three lines. Thanks in advance!
for line in open('test.txt').readlines():
    found = False
    for word in arr:
        if line.startswith(word):
            applist.append(line)
            found = True
            break
    if not found: applist.append('NULL')
I think this might be what you are looking for:
found1 = 'NULL'
found2 = 'NULL'
found3 = 'NULL'
for line in open('test.txt').readlines():
    if line.startswith(arr[0]):
        found1 = line
    elif line.startswith(arr[1]):
        found2 = line
    elif line.startswith(arr[2]):
        found3 = line
applist = [found1, found2, found3]
You could clean that up and make it better looking, but that should give you the logic you're going for.
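One possible cleanup, sketched with a dictionary so it works for any number of prefixes. The name first_matches and the missing parameter are made up for this example. Two choices here differ from the code above and are assumptions about what is wanted: the first matching line for each prefix is kept (not the last), and the trailing newline is stripped.

```python
def first_matches(lines, prefixes, missing="NULL"):
    """Return one entry per prefix: its first matching line, or `missing`.

    Hypothetical helper; `lines` is any iterable of strings, such as an
    open file object.
    """
    found = {}
    for line in lines:
        for prefix in prefixes:
            if prefix not in found and line.startswith(prefix):
                found[prefix] = line.rstrip("\n")
                break  # a line is assigned to at most one prefix
    return [found.get(prefix, missing) for prefix in prefixes]


# Possible usage, assuming 'test.txt' exists:
# with open('test.txt') as f:
#     applist = first_matches(f, ['sample1', 'sample2', 'sample3'])
```

Lines that match none of the prefixes are simply ignored, so the result always has exactly one entry per prefix.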