Python not able to find string in file - python-2.7

I am writing a program to find specific lines in a dump of the uninstall registry and then write those lines to a new text file. Here is the code:
fileName = "export.txt"
outputFileName = input("Enter the Output File Name")
inputFile = open(fileName, "r")
outputFile = open(outputFileName, "w")
displayName = ""
displayVersion = ""
publisher = ""
for line in inputFile:
    if "DisplayName" in line:
        lst = line.split("=")
        displayName = lst[1][1:len(lst[1])-1]
    if "DisplayVersion" in line:
        lst = line.split("=")
        displayVersion = lst[1][1:len(lst[1])-1]
    if "Publisher" in line:
        lst = line.split("=")
        publisher = lst[1][1:len(lst[1])-1]
    if displayName != "" or displayVersion != "" or publisher != "":
        outputFile.write(displayName + "\t" + displayVersion + "\t" + publisher + "\n")
        displayName = ""
        displayVersion = ""
        publisher = ""
inputFile.close()
outputFile.close()
For some reason, the first three if statements are not being entered. Here is a snippet from the export.txt text file.
[HKEY_LOCAL_MACHINE\SoftWare\Microsoft\Windows\CurrentVersion\Uninstall\Matlab R2016b]
"DisplayName"="MATLAB R2016b"
"UninstallString"="C:\\Program Files\\MATLAB\\R2016b\\uninstall\\bin\\win64\\uninstall.exe C:\\Program Files\\MATLAB\\R2016b"
"DisplayIcon"="C:\\Program Files\\MATLAB\\R2016b\\bin\\win64\\matlab.ico"
"InstallLocation"="C:\\Program Files\\MATLAB\\R2016b"
"DisplayVersion"="9.1"
"URLInfoAbout"="www.mathworks.com"
"Publisher"="MathWorks"
"HelpLink"="www.mathworks.com/support"
"Comments"=" "

The logic of your last if statement is completely reversed. It should be or instead of and in order for your statement to work properly.
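To see how the choice between `or` and `and` in that last condition changes what gets written, here is a minimal sketch. The helper names `parse_reg_value` and `extract_rows` and the `require_all` flag are hypothetical, introduced only for illustration; the sample lines are taken from the export snippet in the question. It matches keys exactly instead of using a substring test, and strips the quotes instead of slicing by position.

```python
WANTED = ("DisplayName", "DisplayVersion", "Publisher")

def parse_reg_value(line):
    """Return (key, value) for a line like "Key"="Value", or None."""
    line = line.strip()
    if "=" not in line:
        return None
    key, value = line.split("=", 1)  # split only on the first "="
    return key.strip('"'), value.strip('"')

def extract_rows(lines, require_all=True):
    """Collect tab-separated rows from registry export lines.

    require_all=True  behaves like the `and` condition (one row per entry).
    require_all=False behaves like the `or` condition (one row per field).
    """
    rows = []
    current = {}
    for line in lines:
        if line.strip().startswith("["):
            current = {}  # a new registry key starts a new entry
            continue
        parsed = parse_reg_value(line)
        if parsed is None or parsed[0] not in WANTED:
            continue
        current[parsed[0]] = parsed[1]
        if len(current) == len(WANTED) or not require_all:
            rows.append("\t".join(current.get(name, "") for name in WANTED))
            current = {}
    return rows

sample = [
    r'[HKEY_LOCAL_MACHINE\SoftWare\Microsoft\Windows\CurrentVersion\Uninstall\Matlab R2016b]' + "\n",
    '"DisplayName"="MATLAB R2016b"\n',
    r'"InstallLocation"="C:\\Program Files\\MATLAB\\R2016b"' + "\n",
    '"DisplayVersion"="9.1"\n',
    '"Publisher"="MathWorks"\n',
]

for row in extract_rows(sample):
    print(row)
```

With `require_all=False` the same input yields three partly empty rows, one for each matched line.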

What about checking in a different way using find()?
if line.find("DisplayName") != -1:
    # do stuff
I ran this fine, here's the code:
fileName = "export.txt"
outputFileName = input("Enter the Output File Name")
inputFile = open(fileName, "r")
outputFile = open(outputFileName, "w")
displayName = ""
displayVersion = ""
publisher = ""
for line in inputFile:
    print line
    if line.find("DisplayName") != -1:
        lst = line.split("=")
        displayName = lst[1][1:len(lst[1])-2]
    if line.find("DisplayVersion") != -1:
        print "here2"
        lst = line.split("=")
        displayVersion = lst[1][1:len(lst[1])-2]
    if line.find("Publisher") != -1:
        print "here3"
        lst = line.split("=")
        publisher = lst[1][1:len(lst[1])-2]
    if displayName != "" and displayVersion != "" and publisher != "":
        print "Here4"
        print displayName + "\t" + displayVersion + "\t" + publisher
        outputFile.write(displayName + "\t" + displayVersion + "\t" + publisher)
        displayName = ""
        displayVersion = ""
        publisher = ""
inputFile.close()
outputFile.close()
produces:
MATLAB R2016b 9.1 MathWorks
The output while running the script looks like this:
Enter the Output File Name"out.txt"
[HKEY_LOCAL_MACHINE\SoftWare\Microsoft\Windows\CurrentVersion\Uninstall\Matlab R2016b]
"DisplayName"="MATLAB R2016b"
"UninstallString"="C:\Program Files\MATLAB\R2016b\uninstall\bin\win64\uninstall.exe C:\Program Files\MATLAB\R2016b"
"DisplayIcon"="C:\Program Files\MATLAB\R2016b\bin\win64\matlab.ico"
"InstallLocation"="C:\Program Files\MATLAB\R2016b"
"DisplayVersion"="9.1"
here2
"URLInfoAbout"="www.mathworks.com"
"Publisher"="MathWorks"
here3
Here4
MATLAB R2016b 9.1 MathWorks
"HelpLink"="www.mathworks.com/support"
"Comments"=" "

Related

Do not write line that begins with certain string

I'm trying to omit the lines that begin with "KO", but when I run the code those lines are still written to the output file. I tried evaluating a boolean expression to see if "KO" was in geneData, and it comes back as True. I'm stuck on just that part.
#Read in hsa links
hsa = []
with open ('/users/skylake/desktop/pathway-HSAs.txt', 'r') as file:
    for line in file:
        line = line.strip()
        hsa.append(line)
#Import Modules | Create KEGG Variable
from bioservices.kegg import KEGG
import re
k = KEGG()
##Data Parsing | Writing to File
#for i in range(len(hsa)):
data = k.get(hsa[2])
dict_data = k.parse(data)
#Prep title of file
nameData = re.sub("\[u'", "", str(dict_data['NAME']))
nameData = re.sub(" - Homo sapiens(human)']", "", nameData)
f = open('/Users/Skylake/Desktop/pathway-info/' + nameData + '.txt' , 'w')
#Prep gene data format
geneData = re.sub("', u'", "',\n", str(dict_data['GENE']))
geneData = re.sub("': u'", ": ", geneData)
geneData = re.sub("{u'", "", geneData)
geneData = re.sub("'}", "", geneData)
geneData = re.sub("\[KO", "\nKO", geneData)
f.write("Genes\n")
f.writelines([line for line in geneData if 'KO' not in line])
#Prep compound data format
if 'COMPOUND' in dict_data:
    compData = re.sub("\"", "'", str(dict_data['COMPOUND']))
    compData = re.sub("', u'", "\n", compData)
    compData = re.sub("': u'", ": ", compData)
    compData = re.sub("{u'", "", compData)
    compData = re.sub("'}", "", compData)
    f.write("\nCompounds\n")
    f.write(compData)
#Close file
f.close()
Your geneData variable is a single string. When you iterate over it, you get the individual characters of the string, so your line variable is badly misnamed. The two-character string 'KO' can never be contained in a single character, so the condition 'KO' not in line is always True and every character is written.
With no example input data, nor any expected output data, I can't tell what you're trying to do well enough to suggest a solution.
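As an illustration only, here is a sketch of the difference between iterating over a string and iterating over its lines. The contents of `gene_data` are made up, since the real input is not shown, and the sketch assumes the goal is to drop whole lines that begin with "KO".

```python
# Hypothetical stand-in for geneData after the re.sub() calls.
gene_data = "1234: GENE1; description\nKO: K00001\n5678: GENE2; description"

# Iterating over a string yields single characters, so nothing is filtered.
chars = [ch for ch in gene_data if "KO" not in ch]

# Splitting into lines first makes the filter operate on whole lines.
kept = [line for line in gene_data.splitlines() if not line.startswith("KO")]

print("\n".join(kept))
```

Writing `"\n".join(kept)` with `f.write()` would then give one gene per line without the "KO" entries.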

Trying to edit a txt file from a range of user inputs in python

Me: I am very new to coding.
What I'm trying to do: allow the user to change a text file's data, e.g. the name of a person, the email of a person, etc.
Problem: the code accepts my inputs, but it does not change the text file.
Code I've written so far:
L = open("players.txt","r+")
edit_name = raw_input ("Enter the name of the person you wish to edit: ")
for line in L:
    s = line.strip()
    strings = s.split(",")
    if edit_name == strings[0]:
        print strings[:8]
        print " \t 1 - Forename \n"
        print " \t 2 - Surname \n"
        print " \t 3 - Email Address \n"
        print " \t 4 - Phone Number \n"
        print " \t 5 - Division \n"
        print " \t 6 - Points in the new division\n"
        print " \t 7 - Old division\n"
        print " \t 8 - Old points\n"
        option = raw_input("Enter the number of what you would like to edit: ")
        if option == "1":
            updated_forename = raw_input ("New forename: ")
            strings[0] = updated_forename
        elif option == "2":
            updated_surname = raw_input ("New surname: ")
            strings[1] = updated_surname
        elif option == "3":
            updated_email = raw_input("New email: ")
            strings[2] = updated_email
        elif option == "4":
            updated_phone_number = raw_input("New phonenumber: ")
            strings[3] = updated_phone_number
        elif option == "5":
            updated_division = raw_input("New division: ")
            strings[4] = updated_division
        elif option == "6":
            updated_points_new_div = raw_input("New points in division: ")
            strings[5] = updated_points_new_div
        elif option == "7":
            updated_olddivision = raw_input("Old divison: ")
            strings[6] = updated_olddivision
        elif option == "8":
            updated_oldpoints = raw_input("Old Points: ")
            strings[7] = updated_oldpoints
        print "Updated information"
        print strings[:8]
L.close() #Closes the file to free up usage space.
The text file I want to edit is players.txt.
I'm guessing I need to save over the existing text file with the new data that has been entered. The question is how?
Any help would be appreciated.
P.S. This is my first time posting, so I cannot post pictures as I don't have 10 reputation. My apologies.
You are never actually writing to the file:
https://docs.python.org/2/tutorial/inputoutput.html
Open the file in write mode ("w") and call write() on the file object to write the new data. Because write mode truncates the file, you have to write back the lines you don't want to change as well as the modified line.
Pseudo-code:
Open file
for line in file:
    if line.name == selectedname:
        write_row_edited(something)
    else:
        write_line_unedited()
close file
I took the time to fill in the missing pseudo-code:
#we need to load file into memory, so we can edit it (rewrite it modified)
file = open("players.txt","r")
data = file.read()
file.close()
#splitlines() avoids the empty trailing element that split("\n") would produce
datalines = data.splitlines()
#now we have the file "line-by-line" in memory so we can edit it
edit_name = raw_input ("Enter the name of the person you wish to edit: ")
file = open("players.txt","w")
for line in datalines:
    s = line.strip()
    strings = s.split(",")
    if edit_name == strings[0]:
        print strings[:8]
        print " \t 1 - Forename \n"
        print " \t 2 - Surname \n"
        print " \t 3 - Email Address \n"
        print " \t 4 - Phone Number \n"
        print " \t 5 - Division \n"
        print " \t 6 - Points in the new division\n"
        print " \t 7 - Old division\n"
        print " \t 8 - Old points\n"
        option = raw_input("Enter the number of what you would like to edit: ")
        if option == "1":
            updated_forename = raw_input ("New forename: ")
            strings[0] = updated_forename
        elif option == "2":
            updated_surname = raw_input ("New surname: ")
            strings[1] = updated_surname
        elif option == "3":
            updated_email = raw_input("New email: ")
            strings[2] = updated_email
        elif option == "4":
            updated_phone_number = raw_input("New phonenumber: ")
            strings[3] = updated_phone_number
        elif option == "5":
            updated_division = raw_input("New division: ")
            strings[4] = updated_division
        elif option == "6":
            updated_points_new_div = raw_input("New points in division: ")
            strings[5] = updated_points_new_div
        elif option == "7":
            updated_olddivision = raw_input("Old divison: ")
            strings[6] = updated_olddivision
        elif option == "8":
            updated_oldpoints = raw_input("Old Points: ")
            strings[7] = updated_oldpoints
        print "Updated information"
        print strings[:8]
        #merge string so we can write it back
        newline = ",".join(strings)
        file.write(newline+"\n")
    else:
        file.write(line+"\n")
file.close()
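The same read, modify, write-back pattern can be expressed more compactly by passing the field index directly instead of using a chain of elif branches. This is only a sketch: `update_record` and `edit_file` are hypothetical names, and the sample records are invented because the real players.txt is not shown.

```python
def update_record(lines, name, field_index, new_value):
    """Return a copy of lines with one comma-separated field replaced
    in every record whose first field equals name."""
    updated = []
    for line in lines:
        fields = line.strip().split(",")
        if fields[0] == name:
            fields[field_index] = new_value  # IndexError if the record is too short
            updated.append(",".join(fields))
        else:
            updated.append(line.rstrip("\n"))
    return updated

def edit_file(path, name, field_index, new_value):
    """Load the whole file, change it in memory, then overwrite the file."""
    with open(path, "r") as handle:
        lines = handle.read().splitlines()
    lines = update_record(lines, name, field_index, new_value)
    with open(path, "w") as handle:
        handle.writelines(line + "\n" for line in lines)

# Invented records: forename, surname, email.
records = ["Ann,Lee,ann@example.org", "Bob,Ray,bob@example.org"]
print(update_record(records, "Bob", 2, "new@example.org"))
```

A menu option such as "3" from the original script would map to `field_index = int(option) - 1`.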

if...elif statement in python/pandas

I am working on a script that sorts people's names. I had this working using the csv module, but as this is going to be tied to a larger pandas project, I thought I would convert it.
I need to split a single name field into fields for first, middle and last. The original field has the first name first. ex: Richard Wayne Van Dyke.
I split the names but want "Van Dyke" to be the last name.
Here is my code for the csv module that works:
import csv
with open('inputfil.csv') as inf:
    docs = csv.reader(inf)
    next(docs, None)  # skip the header row
    for i in docs:
        #print i
        fullname = i[1]  # it's the second column in the input file
        namelist = fullname.split(' ')
        firstname = namelist[0]
        middlename = namelist[1]
        if len(namelist) == 2:
            lastname = namelist[1]
            middlename = ''
        elif len(namelist) == 3:
            lastname = namelist[2]
        elif len(namelist) == 4:
            lastname = namelist[2] + " " + namelist[3]  # gets Van Dyke in lastname
        print "First: " + firstname + " middle: " + middlename + " last: " + lastname
Here is my pandas-based code that I'm struggling with:
df = pd.DataFrame({'Name':['Richard Wayne Van Dyke','Gary Del Barco','Dave Allen Smith']})
df = df.fillna('')
df = df.astype(unicode)
splits = df['Name'].str.split(' ', expand=True)
df['firstName'] = splits[0]
if splits[2].notnull and splits[3].isnull:#this works for Bret Allen Cardwell
    df['lastName'] = splits[2]
    df['middleName'] = splits[1]
    print "Case 1: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']
elif splits[2].all() == 'Del':#trying to get last name of "Del Barco"
    print 'del'
    df['middleName'] = ''
    df['lastName'] = splits[2] + " " + splits[3]
    print "Case 2: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']
elif splits[3].notnull: #trying to get last name of "Van Dyke"
    df['middleName'] = splits[1]
    df['lastName'] = splits[2] + " " + splits[3]
    print "Case 3: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']
There is something basic that I'm missing.
if len(splits) >= 3:  # splits is the list of name parts; assume only one middle name
    firstname = splits[0]
    middlename = splits[1]
    lastnames = splits[2:]  # catch all last names in a list
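One reason the pandas version in the question misbehaves is that `splits[2].notnull` without parentheses is a bound method, which is always truthy, so the first branch runs for the whole column instead of per row. A per-name function avoids that. The sketch below is pure Python; `split_name` and the `PARTICLES` set are hypothetical, and treating "Van" and "Del" as the start of the last name is an assumption based on the examples in the question.

```python
# Assumed set of lowercase surname particles; extend as needed.
PARTICLES = {"van", "del", "de", "von"}

def split_name(full_name):
    """Split 'First [Middle ...] Last' into (first, middle, last)."""
    parts = full_name.split()
    if not parts:
        return ("", "", "")
    first, rest = parts[0], parts[1:]
    if not rest:
        return (first, "", "")
    # A particle anywhere before the final word starts the last name.
    for i, word in enumerate(rest[:-1]):
        if word.lower() in PARTICLES:
            return (first, " ".join(rest[:i]), " ".join(rest[i:]))
    # Otherwise the final word is the last name and the rest is the middle.
    return (first, " ".join(rest[:-1]), rest[-1])

for name in ["Richard Wayne Van Dyke", "Gary Del Barco", "Dave Allen Smith"]:
    print(split_name(name))
```

In a DataFrame, such a function could be applied row by row with `df['Name'].apply(split_name)` and the resulting tuples unpacked into the three columns.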

Python : count function does not work

I am stuck on an exercise from a Coursera Python course, this is the question:
"Open the file mbox-short.txt and read it line by line. When you find a line that starts with 'From ' like the following line:
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
You will parse the From line using split() and print out the second word in the line (i.e. the entire address of the person who sent the message). Then print out a count at the end.
Hint: make sure not to include the lines that start with 'From:'.
You can download the sample data at http://www.pythonlearn.com/code/mbox-short.txt"
Here is my code:
fname = raw_input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
for line in fh:
    words = line.split()
    if len(words) > 2 and words[0] == 'From':
        print words[1]
        count = count + 1
    else:
        continue
print "There were", count, "lines in the file with From as the first word"
The output should be a list of emails and the sum of them, but it doesn't work and I don't know why: actually the output is "There were 0 lines in the file with From as the first word"
I used your code and downloaded the file from the link. And I am getting this output:
There were 27 lines in the file with From as the first word
Have you checked that the downloaded file is in the same location as the code file?
fname = input("Enter file name: ")
counter = 0
fh = open(fname)
for line in fh :
    line = line.rstrip()
    if not line.startswith('From '): continue
    words = line.split()
    print (words[1])
    counter +=1
print ("There were", counter, "lines in the file with From as the first word")
fname = input("Enter file name: ")
fh = open(fname)
count = 0
for line in fh :
    if line.startswith('From '): # consider the lines which start from the word "From "
        y=line.split() # we split the line into words and store it in a list
        print(y[1]) # print the word present at index 1
        count=count+1 # increment the count variable
print("There were", count, "lines in the file with From as the first word")
I have written all the comments if anyone faces any difficulty, in case you need help feel free to contact me. This is the easiest code available on internet. Hope you benefit from my answer
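fname = input('Enter the file name:')
fh = open(fname)
count = 0
for line in fh:
    if line.startswith('From '):  # trailing space excludes the 'From:' lines
        linesplit = line.split()
        print(linesplit[1])
        count = count + 1
print("There were", count, "lines in the file with From as the first word")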
fname = input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
for i in fh:
    i=i.rstrip()
    if not i.startswith('From '): continue
    word=i.split()
    count=count+1
    print(word[1])
print("There were", count, "lines in the file with From as the first word")
fname = input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
for line in fh:
    if line.startswith('From'):
        line=line.rstrip()
        lt=line.split()
        if len(lt)==2:
            print(lt[1])
            count=count+1
print("There were", count, "lines in the file with From as the first word")
My code looks like this and works like a charm:
fname = input("Enter file name: ")
if len(fname) < 1:
    fname = "mbox-short.txt"
fh = open(fname)
count = 0 #initialize the counter to 0 for the start
for line in fh: #iterate the document line by line
    words = line.split() #split the lines in words
    if not len(words) < 2 and words[0] == "From": #check for lines starting with "From" and if the line is longer than 2 positions
        print(words[1]) #print the words on position 1 from the list
        count += 1 # count
    else:
        continue
print("There were", count, "lines in the file with From as the first word")
It is a nice exercise from the course of Dr. Chuck
There is also another way. You can store the found words in a separate, initially empty list and then print the length of the list. It delivers the same result.
My tested code as follows:
fname = input("Enter file name: ")
if len(fname) < 1:
    fname = "mbox-short.txt"
fh = open(fname)
newl = list()
for line in fh:
    words = line.split()
    if not len(words) < 2 and words[0] == 'From':
        newl.append(words[1])
    else:
        continue
print(*newl, sep = "\n")
print("There were", len(newl), "lines in the file with From as the first word")
I did pass the exercise with it as well. Enjoy and keep the good work. Python is so much fun to me even though i always hated programming.
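The point all of these answers depend on is telling 'From ' lines apart from 'From:' lines. Here is a small sketch that shows this on in-memory lines rather than the real file. The function name `sender_addresses` is hypothetical, and the sample lines are modeled on the one quoted in the exercise.

```python
def sender_addresses(lines):
    """Return the second word of every line that starts with 'From '."""
    senders = []
    for line in lines:
        # The trailing space excludes header lines that start with 'From:'.
        if line.startswith("From "):
            words = line.split()
            if len(words) > 1:  # guard against a bare 'From ' line
                senders.append(words[1])
    return senders

sample = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008\n",
    "Subject: hypothetical subject line\n",
    "From: stephen.marquard@uct.ac.za\n",
]

found = sender_addresses(sample)
for address in found:
    print(address)
print("There were", len(found), "lines in the file with From as the first word")
```

Passing an open file object instead of the list works the same way, since both yield one line per iteration.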

Error in mapper and reducer with python

There is a problem in the mapper.py file when I run it in the cluster. The error is " unexpected syntax before line" in "strl = line.strip()".
There is no error when I test it locally. I want to get the words of text file stored and change their format and count them and send to the output in s3 bucket.
Guidance most welcome. Thanks
mapper:
import sys, re
for line in sys.stdin:
    strl = line.strip()
    words = strl.split()
    for word in words:
        word = word.lower()
        result = ""
        charref = re.compile("[a-f]")
        match = charref.search(word[0])
        if match:
            result+= "TR2234J"
        else:
            result+= ""
        print result, "\t"
reducer:
import sys
for line in sys.stdin:
    line = line.strip()
    new_word =""
    words = line.split("\t")
    final_count = len(words)
    my_num = final_count / 6
    for i in range (my_num):
        new_word = "".join(words[i*6:10+(i*6)])
        print new_word, "\t"