Change values in Python file (tab-delimited list) - python-2.7

I have read a *.INP file into Python. Here is the code I used:
import csv
r = csv.reader(open('T_JAC.INP')) # Here your csv file
lines = [l for l in r]
print lines[23]
print lines[26]
The first print statement produces ['7E21\t\texthere (text) text alphabets text alphanumeric'].
The second print statement produces ['4E15\t\texthere (text) text alphabets text alphanumeric'].
I need to change the numbers 7E21 and 4E15 to the values from a list fil_replace = [9E21, 6E15], i.e. I need to replace 7E21 with 9E21 and 4E15 with 6E15.
Is there a way to replace these numbers?

Something with str.replace should work (as long as you read the file in as a single string, here called r), although it is not the most efficient solution. Note that str.replace returns a new string, so the result has to be assigned:
r = r.replace('7E21', '9E21')
out = open('YAC.IN', 'w')
out.write(r)
out.close()
If you're looking for a way to just replace the values 'in place' in the file, that is unfortunately not possible in general. The entire file has to be read in, modified, then rewritten.
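As a rough sketch of the read, modify, rewrite cycle described above, the following replaces only the leading value on chosen lines instead of every occurrence in the file. The function name replace_leading_values and the sample lines are made up for illustration, and it assumes the value to replace is followed by a tab, as in the question's output.

```python
def replace_leading_values(lines, replacements):
    """Return a copy of lines with the first tab-separated field replaced.

    lines: list of text lines (newline included)
    replacements: dict mapping zero-based line index -> new value as text
    Assumes each targeted line has a tab after the value.
    """
    updated = list(lines)
    for index, new_value in replacements.items():
        # partition splits at the first tab only: (value, tab, remainder)
        head, sep, rest = updated[index].partition('\t')
        updated[index] = new_value + sep + rest
    return updated


# Made-up sample standing in for the lines of the input file
sample = ['7E21\t\ttext here (text)\n', 'unchanged line\n', '4E15\t\tmore text\n']
result = replace_leading_values(sample, {0: '9E21', 2: '6E15'})
print(result)
```

For the file in the question, the lines could come from readlines(), the mapping would presumably be {23: '9E21', 26: '6E15'}, and the result could be written back with writelines(). Passing the new values as strings keeps the original notation, since formatting the float 9E21 would give 9e+21.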

Related

Making a text file which will contain my list items and applying regular expression to it

I am supposed to write code that reads a text file containing words with some common linguistic features, applies some regular expressions to all of the words, and writes a file containing the changed words.
For now, let's say my text file named abcd.txt has these words:
king
sing
ping
cling
booked
looked
cooked
packed
My first question starts here: how should I write these words in my simple text file to get the results mentioned above? Should I write them line-separated or comma-separated?
This is the code provided by user palvarez:
import re
with open("new_abcd", "w+") as new, open("abcd") as original:
    for word in original:
        new_word = re.sub("ing$", "xyz", word)
        new.write(new_word)
Can I add something like -
with open("new_abcd", "w+") as new, open("abcd") as original:
    for word in original:
        new_aword = re.sub("ed$", "abcd", word)
        new.write(new_aword)
in the same code file? I want something like -
kabc
sabc
pabc
clabc
bookxyz
lookxyz
cookxyz
packxyz
PS - I don't know whether mentioning this is necessary or not, but I am supposed to do this for a Unicode supported script Devanagari. I didn't use it here in my examples because many of us here can't read the script. Additionally that script uses some diacritics. eg. 'का' has one consonant character 'क' and one vowel symbol 'ा' which together make 'का'. In my regular expression I need to condition the diacritics.
I think your approach of one word per line is better, since you don't have to trouble yourself with delimiters and stripping.
With a file like this:
king
sing
ping
cling
booked
looked
cooked
packed
And code like this, using re.sub to replace a pattern:
import re
with open("new_abcd.txt", "w") as new, open("abcd.txt") as original:
    for word in original:
        new_word = re.sub("ing$", "xyz", word)
        new_word = re.sub("ed$", "abcd", new_word)
        new.write(new_word)
It creates a resulting file:
kxyz
sxyz
pxyz
clxyz
bookabcd
lookabcd
cookabcd
packabcd
I tried out with the diacritic you gave us and it seems to work fine:
print(re.sub("ा$", "ing", "का"))
>>> कing
EDIT: added multiple replacements. You can put your replacements into a list and iterate over it, calling re.sub for each one, as follows.
import re
# List of pairs: the first item is the pattern, the second is the replacement string
replacements = [("ing$", "xyz"), ("ed$", "abcd")]
with open("new_abcd.txt", "w") as new, open("abcd.txt") as original:
    for word in original:
        new_word = word
        for pattern, replacement in replacements:
            new_word = re.sub(pattern, replacement, word)
            if new_word != word:
                break
        new.write(new_word)
This limits each word to one modification: only the first replacement that changes the word is applied.
For starters, it is recommended to use the with context manager to open your file; this way you do not need to explicitly close the file once you are done with it.
Another advantage is that you can process the file line by line, which is very useful if you are working with larger sets of data. Whether to write the words one per line or in CSV format depends on the requirements of your output and how you want to process them further.
As an example, to read from a file and substitute a substring, you can use re.sub.
import re
with open('abcd.txt', 'r') as f:
    for line in f:
        # do something here
        print(re.sub("ing$", 'ring', line.strip()))
>>
kring
sring
pring
clring
Another nifty trick is to manage both the input and output utilizing the same context manager like:
import re
with open('abcd.txt', 'r') as f, open('out_abcd.txt', 'w') as o:
    for line in f:
        # notice that we add '\n' to write each output on a new line
        o.write(re.sub("ing$", 'ring', line.strip()) + '\n')
This creates an output file with your new contents in a very memory-efficient way.
If you'd like to write to a CSV file or any other specific format, I highly suggest you spend some time understanding Python's input and output functions. If linguistic text processing is what you are after, then also learn how text in different languages is encoded and study Python's regex operations further.
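Since the question mentions Devanagari, here is a minimal sketch of the same line-by-line approach with an explicit UTF-8 encoding. The rule (replace a word-final vowel sign ा with "ing"), the function name transform_file, and the file names are assumptions chosen to mirror the one-line test shown earlier; io.open is used because it accepts an encoding argument on both Python 2 and Python 3.

```python
import io
import os
import re
import tempfile

# Hypothetical rule: a word-final vowel sign AA (U+093E) becomes "ing"
SUFFIX_RULE = re.compile(u"\u093e$")


def transform_file(src_path, dst_path):
    """Apply SUFFIX_RULE to every line of src_path and write dst_path."""
    # io.open decodes and encodes UTF-8 explicitly
    with io.open(src_path, encoding="utf-8") as src, \
            io.open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            word = line.rstrip(u"\n")
            dst.write(SUFFIX_RULE.sub(u"ing", word) + u"\n")


# Demonstration on a throwaway file with two made-up words
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "words.txt")
dst_path = os.path.join(workdir, "new_words.txt")
with io.open(src_path, "w", encoding="utf-8") as f:
    # U+0915 U+093E is the syllable "kaa"; U+0915 U+0940 is "kii"
    f.write(u"\u0915\u093e\n\u0915\u0940\n")

transform_file(src_path, dst_path)
with io.open(dst_path, encoding="utf-8") as f:
    result = f.read()
```

Only the first word ends in the targeted vowel sign, so only that word changes. Because a consonant and its vowel sign are separate code points, the pattern can target the sign on its own.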

Python - using raw_input() to search a text document

I am trying to write a simple script in which a user can enter what he/she wants to search for in a specified txt file. If the word they are searching for is found, it should be written to a new text file. This is what I have so far.
import re
import os
os.chdir("C:\Python 2016 Training")
patterns = open("rtr.txt", "r")
what_directory_am_i_in = os.getcwd()
print what_directory_am_i_in
search = raw_input("What you looking for? ")
for line in patterns:
    re.findall("(.*)search(.*)", line)
    fo = open("test", "wb")
    fo.write(line)
    fo.close
This successfully creates a file called test, but the output is nothing close to what word was entered into the search variable.
Any advice appreciated.
First of all, note that
patterns = open("rtr.txt", "r")
is a file object and not the content of the file. Iterating over it yields one line at a time; to get all the lines as a list you can use
patterns.readlines()
Secondly, re.findall returns a list of matched strings, so you would want to store that. Your regex is also not correct, as pointed out by Hani: "search" inside the quotes is matched as a literal word rather than as your variable. It should be
matched = re.findall("(.*)" + search + "(.*)", line)
or, if you want the complete line,
matched = re.findall(".*" + search + ".*", line)
or simply
matched = line if search in line else None
Thirdly, you don't need to keep opening your output file in the for loop. You are overwriting your file every time through the loop, so it will capture only the last result. Also remember to actually call the close method on the files: fo.close without parentheses does nothing.
Hope this helps
Here you are searching for all lines that contain the literal word "search".
You need to get the lines that contain the text you entered in the shell,
so change this line
re.findall("(.*)search(.*)", line)
to
re.findall("(.*)"+search+"(.*)", line)
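Putting the advice from both answers together, a sketch of the whole script might look like the following. The function name search_file and the sample file contents are made up for illustration; re.escape is an addition not mentioned in the answers, used here so that characters such as "." or "(" in the user's input are matched literally.

```python
import os
import re
import tempfile


def search_file(in_path, out_path, search):
    """Copy every line of in_path that contains search to out_path."""
    # re.escape treats special regex characters in the input literally
    pattern = re.compile(re.escape(search))
    count = 0
    # Both files are opened once, outside the loop, and closed automatically
    with open(in_path) as source, open(out_path, "w") as target:
        for line in source:
            if pattern.search(line):
                target.write(line)
                count += 1
    return count


# Demonstration on a throwaway file with made-up contents
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "rtr.txt")
out_path = os.path.join(workdir, "test")
with open(in_path, "w") as f:
    f.write("router one\nswitch two\nrouter three\n")

found = search_file(in_path, out_path, "router")
with open(out_path) as f:
    written = f.read()
print(found)
```

In the original script the search text would come from raw_input() (input() on Python 3) instead of being hard-coded.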

How do I print specific lines of a file in python?

I'm trying to print everything in a file with Python. But whenever I use Python's built-in readline() function, it only prints the first line of my text file. Here's my code:
File = open("test.txt", 'r', 0)
line = File.readline()[:]
print line
Thank you to everyone who answers.
To make my question clearer: every time I run the code, it prints only "word list food".
Is this what you are looking for?
printline = 6
lineCounter = 0
with open('anyTxtFile.txt','r') as f:
    for line in f:
        lineCounter += 1
        if lineCounter == printline:
            print(line, end='')
This opens the text file in the working directory and prints line number printline.
File.readlines()
will, as emre. said, return a list of all the lines in your file. If you'd like to produce a similar result using readline(), call it in a loop until it returns an empty string:
s=File.readline()
while s!="":
    print s
    s=File.readline()
Both methods above leave a newline at the end of each string, except possibly the last string.
Another alternative would be:
for s in File:
    print s
To search for a specific string, or a specific line number, I'd say the first method is best. Looking for a specific line number would be as simple as:
File.readlines()[i]
where i is the zero-based index of the line you are interested in (the first line is index 0). Looking for a string is a bit more work, but looping through the list would not be too challenging. Something like:
L=File.readlines()
s="yourStringHere"
i=0
while i<len(L):
    if L[i].find(s)!=-1:
        break
    i+=1
print i
would give you the zero-based index of the first line that contains the string you are looking for (or len(L) if no line contains it).
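The same search can be sketched with enumerate instead of a manual counter. The function name find_line_number and the sample lines are hypothetical; unlike the loop above, this version counts lines from 1 and returns None when nothing matches.

```python
def find_line_number(lines, text):
    """Return the 1-based number of the first line containing text, or None."""
    for number, line in enumerate(lines, 1):
        if text in line:
            return number
    return None


# Made-up sample standing in for File.readlines()
sample = ["word list food\n", "apple\n", "banana bread\n"]
first_hit = find_line_number(sample, "banana")
no_hit = find_line_number(sample, "cherry")
print(first_hit)
print(no_hit)
```

Because the function only iterates, an open file object can be passed directly in place of the list, which avoids loading the whole file into memory.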
A more Pythonic version:
print_line = 6
with open('input_txt_file.txt', 'r') as f:
    # start=1 makes i the 1-based line number
    for i, line in enumerate(f, start=1):
        if i == print_line:
            print(line, end='')
            break

read a file into python and remove values

I have the following code that reads in a file and stores the values in a list. For each line of the file, if it sees a '(' it splits the line there and discards the '(' and everything that follows it on the line.
with open('filename', 'r') as f:
    list1 = [line.strip().split('(')[0].split() for line in f]
Is there a way that I can change this to split not only at a '(' but also at a '#'?
Use re.split.
With a sample of your data's format, it may be possible to do better than this, but without context we can still show how to use this with the code you have provided.
To use re.split():
import re
with open('filename', 'r') as f:
    list1 = [re.split('[(#]+', line.strip())[0].split() for line in f]
Notice that the first parameter in re.split() is a regular expression to split on, and the second parameter is the string to apply this operation to.
General idea from: Splitting a string with multiple delimiters in Python
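To see what the re.split expression above does to individual lines, here is a small sketch run on made-up sample lines, since the real data format is not shown. The helper name leading_fields is hypothetical.

```python
import re


def leading_fields(line):
    """Return the whitespace-separated fields before the first '(' or '#'."""
    # Same expression as in the list comprehension above
    return re.split('[(#]+', line.strip())[0].split()


# Made-up sample lines covering both delimiters and the edge cases
samples = ["a b (comment)", "c d # note", "e f", "(only comment)"]
results = [leading_fields(s) for s in samples]
for fields in results:
    print(fields)
```

A line that starts with a delimiter yields an empty list, so such entries may need to be filtered out of list1 afterwards.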

Easiest way to cross-reference a CSV file with a text file for common strings

I have a list of strings in a CSV file, and another text file that I would like to search for these strings. The CSV file has just the strings that I am interested in, but the text file has a bunch of other text interspersed among the strings of interest (the strings I am interested in are ID numbers for a database of proteins). What would the easiest way of going about this be? I want to check the text file for the presence of every string in the CSV file. I am working in a research lab at a top university, so you would be aiding cutting-edge research!
Thanks :)
I would use Python for this. To print the matching lines, you could do this:
import csv
with open("strings.csv") as csvfile:
    reader = csv.reader(csvfile)
    searchstrings = {row[0] for row in reader} # Construct a set of keywords
with open("text.txt") as txtfile:
    for number, line in enumerate(txtfile):
        for needle in searchstrings:
            if needle in line:
                print("Line {0}: {1}".format(number, line.strip()))
                break # only necessary if there are several matches per line
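If the goal is to check every string from the CSV for presence in the text, rather than to print matching lines, a variation along these lines may be closer. The function name split_found_missing and the IDs are made up for illustration, and the sketch assumes the text file is small enough to read into memory in one go.

```python
def split_found_missing(search_strings, text):
    """Return (found, missing): which search strings occur in text."""
    found = {s for s in search_strings if s in text}
    missing = set(search_strings) - found
    return found, missing


# Made-up IDs and text standing in for the CSV column and the text file
ids = {"P12345", "Q67890", "A00001"}
text = "header P12345 filler\nmore text Q67890 end\n"
found, missing = split_found_missing(ids, text)
print(sorted(found))
print(sorted(missing))
```

With the files from the answer above, ids would be the searchstrings set and text could come from txtfile.read(). Plain substring matching also reports an ID that appears inside a longer token, so a stricter check may be needed if IDs can be prefixes of one another.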