Iterating through a .txt file in an odd way - list

What I am trying to do is write a program that opens a .txt file with movie reviews where the rating is a number from 0-4 followed by a short review of the movie. The program then prompts the user to open a second text file with words that will be matched against the reviews and given a number value based on the review.
For example, here are two sample reviews as they would appear in the .txt file:
4 A comedy-drama of nearly epic proportions rooted in a sincere performance by the title character undergoing midlife crisis .
2 Massoud 's story is an epic , but also a tragedy , the record of a tenacious , humane fighter who was also the prisoner -LRB- and ultimately the victim -RRB- of history .
So, if I were looking for the word "epic", it would increment the count for that word by 2 (which I already have figured out) since it appears twice, and then append the values 4 and 2 to a list of ratings for that word.
How do I append those ints to a list or dictionary related to that word? Keep in mind that I need to create a new list or dictionary key for every word in a list of words.
Please and thank you. And sorry if this was poorly worded, programming isn't my forte.
All of my code:
def menu_validate(prompt, min_val, max_val):
    """ produces a prompt, gets input, validates the input and returns a value. """
    while True:
        try:
            menu = int(input(prompt))
            if menu >= min_val and menu <= max_val:
                return menu
                break
            elif menu.lower == "quit" or menu.lower == "q":
                quit()
            print("You must enter a number value from {} to {}.".format(min_val, max_val))
        except ValueError:
            print("You must enter a number value from {} to {}.".format(min_val, max_val))

def open_file(prompt):
    """ opens a file """
    while True:
        try:
            file_name = str(input(prompt))
            if ".txt" in file_name:
                input_file = open(file_name, 'r')
                return input_file
            else:
                input_file = open(file_name+".txt", 'r')
                return input_file
        except FileNotFoundError:
            print("You must enter a valid file name. Make sure the file you would like to open is in this programs root folder.")

def make_list(file):
    lst = []
    for line in file:
        lst2 = line.split(' ')
        del lst2[-1]
        lst.append(lst2)
    return lst

def rating_list(lst):
    '''iterates through a list of lists and appends the first value in each list to a second list'''
    rating_list = []
    for list in lst:
        rating_list.append(list[0])
    return rating_list

def word_cnt(lst, word : str):
    cnt = 0
    for list in lst:
        for word in list:
            cnt += 1
    return cnt

def words_list(file):
    lst = []
    for word in file:
        lst.append(word)
    return lst

##def sort(words, occurrences, avg_scores, std_dev):
##    '''sorts and prints the output'''
##    menu = menu_validate("You must choose one of the valid choices of 1, 2, 3, 4 \n Sort Options\n 1. Sort by Avg Ascending\n 2. Sort by Avg Descending\n 3. Sort by Std Deviation Ascending\n 4. Sort by Std Deviation Descending", 1, 4)
##    print ("{}{}{}{}\n{}".format("Word", "Occurence", "Avg. Score", "Std. Dev.", "="*51))
##    if menu == 1:
##        for i in range (len(word_list)):
##            print ("{}{}{}{}".format(cnt_list.sorted[i],)

def make_odict(lst1, lst2):
    '''makes an ordered dictionary of keys/values from 2 lists of equal length'''
    dic = OrderedDict()
    for i in range (len(word_list)):
        dic[lst2[i]] = lst2[i]
    return dic

cnt_list = []
while True:
    menu = menu_validate("1. Get sentiment for all words in a file? \nQ. Quit \n", 1, 1)
    if menu == True:
        ratings_file = open("sample.txt")
        ratings_list = make_list(ratings_file)
        word_file = open_file("Enter the name of the file with words to score \n")
        word_list = words_list(word_file)
        for word in word_list:
            cnt = word_cnt(ratings_list, word)
            cnt_list.append(word_cnt(ratings_list, word))
Sorry, I know it's messy and very incomplete.

I think you mean:
import collections
counts = collections.defaultdict(int)
word = 'epic'
counts[word] += 1
Obviously, you can do more with word than I have, but you aren't showing us any code, so ...
EDIT
Okay, looking at your code, I'd suggest you make the separation between rating and text explicit. Take this:
def make_list(file):
    lst = []
    for line in file:
        lst2 = line.split(' ')
        del lst2[-1]
        lst.append(lst2)
    return lst

And convert it to this:

def parse_ratings(file):
    """
    Given a file of lines, each with a numeric rating at the start,
    parse the lines into score/text tuples, one per line. Return the
    list of parsed tuples.
    """
    ratings = []
    for line in file:
        text = line.strip().split()
        if text:
            score = text[0]
            ratings.append((score,text[1:]))
    return ratings

Then you can compute both values together:

def match_reviews(word, ratings):
    cnt = 0
    scores = []
    for score,text in ratings:
        n = text.count(word)
        if n:
            cnt += n
            scores.append(score)
    return (cnt, scores)
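
To tie this back to the original question (one list of ratings per word), a minimal sketch could collect the scores in a dictionary keyed by word. The name ratings_by_word and the sample data are assumptions made for illustration, and scores are converted to int here, whereas the code above keeps them as strings.

```python
def ratings_by_word(words, ratings):
    """Map each word to the scores of the reviews that contain it.

    `ratings` is a list of (score, tokens) pairs, the shape that
    parse_ratings returns. Every word gets a key, even with no matches.
    """
    found = {word: [] for word in words}
    for score, tokens in ratings:
        for word in words:
            if word in tokens:
                # One score per matching review, as in match_reviews.
                found[word].append(int(score))
    return found


# Hypothetical sample data shaped like the reviews in the question.
sample = [
    ("4", "A comedy-drama of nearly epic proportions".split()),
    ("2", "Massoud 's story is an epic , but also a tragedy".split()),
]
result = ratings_by_word(["epic", "tragedy", "boring"], sample)
print(result)  # {'epic': [4, 2], 'tragedy': [2], 'boring': []}
```

The length of each list also gives the number of matching reviews, so averages and standard deviations can be computed from the same dictionary.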

For-loop error: list index out of range

So I am rather new to programming and just recently started with classes. We are supposed to make a phonebook that can be loaded from separate text files.
However, I keep running into a problem in this section when I get into the for-loop. It hits a brick wall on
if storage[2] == permaStorage[i].number:
and tells me "IndexError: list index out of range". I am almost certain it is because permaStorage starts out empty, but even when I attempt to fill it with temporary instances of Phonebook it tells me it is out of range. The main reason the loop is there is to check whether a phone number already exists within permaStorage.
Anyone got a good tip on how to solve this or work around it?
(Sorry if the text is badly written. Just joined this site and not sure on the style)
class Phonebook():
    def __init__(self):
        self.name = ''
        self.number = ''

def Add(name1, number1):
    y = Phonebook()
    y.name = name1
    y.number = number1
    return y

def Main():
    permaStorage = []
    while True:
        print " add name number\n lookup name\n alias name newname\n change name number\n save filename\n load filename\n quit\n"
        choices = raw_input ("What would you like to do?: ")
        storage = choices.split(" ")
        if storage[0] == "add":
            for i in range(0, len(permaStorage)+1):
                if storage[2] == permaStorage[i].number:
                    print "This number already exists. No two people can have the same phonenumber!\n"
                    break
                if i == len(permaStorage):
                    print "hej"
                    try:
                        tempbox = Add(storage[1], storage[2])
                        permaStorage.append(tempbox)
                    except:
                        raw_input ("Remember to write name and phonenumber! Press any key to continue \n")
I think the problem is that permaStorage is an empty list, and then this:
for i in range(0, len(permaStorage)+1):
    if storage[2] == permaStorage[i].number:
causes an error because permaStorage has 0 items but you are trying to get the first item (i=0, permaStorage[0]).
I think you should put the second if clause before the first one:
for i in range(0, len(permaStorage)+1):
    if i == len(permaStorage):
        print "hej"
        try:
            tempbox = Add(storage[1], storage[2])
            permaStorage.append(tempbox)
        except:
            raw_input ("Remember to write name and phonenumber! Press any key to continue \n")
    elif storage[2] == permaStorage[i].number:
        print "This number already exists. No two people can have the same phonenumber!\n"
        break
So in this case, if permaStorage is empty, the new entry is appended before anything is indexed, and the elif keeps the entry that was just added from being compared with itself.
Indexing starts at zero in Python. Hence, a list of length 5 has indices 0 through 4. Change the range to range(0, len(permaStorage)).
You should iterate up to the last element of the list, not beyond.
Try -
for i in range(0, len(permaStorage)):
The list of numbers produced in range() is from the start, but not including the end, so range(3) == [0, 1, 2].
So if your list x has length 10, range(0, len(x)) will give you 0 through 9, which are the correct indices of the elements of your list.
Adding 1 to len(x) will produce the range 0 through 10, and when you try to access x[10], it will fail.
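
Another way to avoid the index arithmetic altogether is to loop over the entries themselves. The following is only a sketch in Python 3 (the question's code is Python 2). The helper name add_entry is hypothetical, and entries are stored as plain (name, number) tuples rather than the question's Phonebook class to keep the example short.

```python
def add_entry(book, name, number):
    """Append (name, number) to book unless the number is already stored.

    Returns True if the entry was added, False if the number is a duplicate.
    """
    # any() never indexes the list, so an empty book is handled naturally.
    if any(existing_number == number for _, existing_number in book):
        return False
    book.append((name, number))
    return True


book = []
print(add_entry(book, "anna", "5551234"))  # True: the book was empty
print(add_entry(book, "bert", "5551234"))  # False: the number is taken
print(book)                                # [('anna', '5551234')]
```

Because nothing is indexed, there is no off-by-one case to get wrong when the list is empty.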

hackerrank day 8 python

SPOILER This question is about the Hackerrank Day 8 challenge, in case you want to try it yourself first.
This is the question they give:
Given n names and phone numbers, assemble a phone book that maps
friends' names to their respective phone numbers. You will then be
given an unknown number of names to query your phone book for. For
each name queried, print the associated entry from your phone book
on a new line in the form name=phoneNumber; if an entry for name is not
found, print Not found instead.
Note: Your phone book should be a Dictionary/Map/HashMap data
structure.
The first line contains an integer, n, denoting the number of
entries in the phone book. Each of the n subsequent lines describes
an entry in the form of 2 space-separated values on a single line. The
first value is a friend's name, and the second value is an 8-digit
phone number.
After the n lines of phone book entries, there are an unknown number
of lines of queries. Each line (query) contains a name to look up,
and you must continue reading lines until there is no more input.
Note: Names consist of lowercase English alphabetic letters and are
first names only.
They then give the input:
3
sam 99912222
tom 11122222
harry 12299933
sam
edward
harry
which expects the output:
sam=99912222
Not found
harry=12299933
I am having trouble with the unknown number of names to query. I tried using a try/except block to stop at an EOFError but I keep timing out on their test cases 1, 2 and 3. It works on two of the other test cases but not those and I assume it must be because I am stuck in a kind of infinite loop using my while True statement? This is what I wrote:
phonebook = {}
entries = int(raw_input())
for n in range(entries):
    name, num = raw_input().strip().split(' ')
    name, num = [str(name), int(num)]
    phonebook[name] = num
while True:
    try:
        search = str(raw_input())
        if search in phonebook.keys():
            output = ''.join('%s=%r' % (search, phonebook[search]))
            print output
        else:
            print "Not found"
    except EOFError:
        break
I am still fairly new to python so maybe I'm not using the try/except or break methods correctly? I would appreciate if anyone could tell me where I went wrong or what I can do to improve my code?
The only mistake is that you are using phonebook.keys(). In Python 2, keys() builds a new list on every call, so each membership test scans that list from the start.
You can test membership on the dictionary itself, without .keys(). That is a hash lookup and it will save time.
phonebook = {}
entries = int(raw_input())
for n in range(entries):
    name, num = raw_input().strip().split(' ')
    name, num = [str(name), int(num)]
    phonebook[name] = num
while True:
    try:
        search = str(raw_input())
        if search in phonebook:
            output = ''.join('%s=%r' % (search, phonebook[search]))
            print output
        else:
            print "Not found"
    except EOFError:
        break
The above code will work with all the test cases.
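
For Python 3, one alternative to catching EOFError is to iterate over the input lines directly, since such a loop ends by itself when the input is exhausted. The sketch below is an illustration and not taken from any answer here. The name answer_queries is hypothetical, and the function accepts any iterable of lines so it can be tried with a list instead of piped input.

```python
def answer_queries(lines):
    """Build the phone book from `lines` and return one answer per query."""
    lines = iter(lines)
    count = int(next(lines))
    phonebook = {}
    for _ in range(count):
        name, number = next(lines).split()
        phonebook[name] = number
    answers = []
    for line in lines:  # runs until the input is exhausted
        query = line.strip()
        if not query:
            continue  # ignore blank lines, e.g. a trailing newline
        if query in phonebook:
            answers.append("{}={}".format(query, phonebook[query]))
        else:
            answers.append("Not found")
    return answers


# Sample input from the question, given as a list of lines.
sample = ["3", "sam 99912222", "tom 11122222", "harry 12299933",
          "sam", "edward", "harry"]
for answer in answer_queries(sample):
    print(answer)
# With real input, sys.stdin (after `import sys`) can be passed instead of
# the list, because a file object is also an iterable of lines.
```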
In Python 3:
# n: number of records to insert into the dict
n = int(input())
d = dict()
# enter name and number separated by a space
for i in range(0, n):
    name, number = input().split()
    d[name] = number
# print(d)  # print dict, if needed
# enter a name in order to get its phone number
for i in range(0, n):
    try:
        name = input()
        if name in d:
            print(f"{name}={d[name]}")
        else:
            print("Not found")
    except:
        break
Input:
3
sam 99912222
tom 11122222
harry 12299933
sam
edward
harry
Output:
sam=99912222
Not found
harry=12299933
n = int(input())
d = dict()
for i in range(0, n):
    name, number = input().split()
    d[name] = number
#print(d) Check if your dictionary is ready
for i in range(0, n):
    name = input()
    if name in d:
        print(f'{name}={d[name]}')
    else:
        print("Not found")
Try this, It'll work.
run this code to pass all the test cases:
n = int(input())
d = {}
for i in range(n):
    tp = input()
    a, b = tp.split()
    d.update({a: b})
inputs = []
input1 = input().strip()
try:
    while len(input1) > 0:
        inputs.append(input1)
        input1 = input().strip()
except:
    pass
for i in inputs:
    if i in d.keys():
        c = 1
        print(i + "=" + d[i])
    else:
        print('Not found')
Lets make life easy
Hacker rank 30 Day Code - Day no 8 (#Murtuza Chawala)
n = int(input())
i = 0
book = dict()  # Declare a dictionary
while(i < n):
    name , number = input().split()  # Split input for name, number
    book[name] = number  # Add key, value pair to the dictionary
    i+=1
while True:  # Run until the input ends
    try:
        # Trick - if there is no more input, stop the program
        query = input()
    except:
        break
    val = book.get(query, 0)  # Returns 0 if name is not in the dictionary
    if val != 0:
        print(query + "=" + book[query])
    else:
        print("Not found")
n = int(input())
PhoneBook = dict(input().split() for x in range(n))
try:
    for x in range(n):
        key = input()
        if key in PhoneBook:
            print (key,'=',PhoneBook[key],sep='')
        else:
            print('Not found')
except:
    exit()
n= int(input())
dct={}
for i in range(n):
    info=input().split()
    dct[info[0]]=info[1]
while 1:
    try:
        query=input().lower()
        if query in dct:
            print(query+'='+dct[query])
        else:
            print('Not found')
    except EOFError:
        break
Below snippet works for me.
noOfTestCases = int(input())
phoneDict = {}
for i in range(noOfTestCases):
    name, phoneNumber = input().split()
    phoneDict[name] = phoneNumber
for i in range(noOfTestCases):
    try:
        name = input()
        if name in phoneDict:
            print(name+'='+phoneDict[name])
        else:
            print("Not found")
    except:
        break
Input
3
sam 99912222
tom 11122222
harry 12299933
sam
edward
harry
Output
sam=99912222
Not found
harry=12299933
# Enter your code here. Read input from STDIN. Print output to STDOUT
entries = int( input() )
# print(entries)
data = {}
for i in range(entries):
    # print("i=",i)
    name, num = input().strip().split(" ")
    # print(name)
    # print(num)
    data[name]=num
# print(data)
while True:
    try:
        search = input()
        if search in data.keys():
            print(search,"=",data[search], sep="")
        else:
            print("Not found")
    except EOFError:
        break
Input (stdin)
3
sam 99912222
tom 11122222
harry 12299933
sam
edward
harry
Your Output (stdout)
sam=99912222
Not found
harry=12299933
Expected Output
sam=99912222
Not found
harry=12299933
#Using Setter and Getter
n = int(input())
d = {}
while n:
    x, y = input().split()
    d.setdefault(x,y)
    n -= 1
while True:
    try:
        inp = input()
        if d.get(inp):
            print(f"{inp}={d.get(inp)}")
        else:
            print(f"Not found")
    except EOFError:
        break
n=int(input())
d=dict()
for i in range(n):
    name,number = input().split()
    d.update({name:number})
for i in range(n):
    name=input()
    if name in d:print(name +"="+d[name])
    else:
        print("Not found")

Why won't my csv list replace my blank values with "N"?

I'm attempting to create a function which reads a specific column of a csv file which currently alternates between empty values and "1", pops them into a list and then replaces them with an "N" for the empty value and "B" for the "1"'s. I'm pretty new to python, as well as programming in general, so any tips and all help is welcome. This is what I have so far, and it does process, but only replaces my "1"'s with "B"'s. I've double checked my csv and the position is definitely empty and does not contain spaces. I've also looked at other responses and tried to emulate some similar logic that appeared to be behind them, but something still doesn't seem to work. If someone could point me in the right direction it would be very much appreciated.
#sample data (for 195 entries):
["Header0","Header1","Foundation","Header3"],
["abc1","a12n","","123"],
["def2","d13b","1","456"],
["ghi3","g12n","","789"],
def Foundation( csv_file_path, Remove_Header = False, Remove_SubHeader = False ):
    delineator = ','
    raw_file = file(csv_file_path, 'r')
    return_List = []
    n = 0
    #Process lines in file
    for line in raw_file.readlines():
        #Check if to include or remove header
        if (n == 0 ) and (Remove_Header == True):
            n = n + 1
            continue
        #Check if to include or remove sub header
        if (n == 1) and (Remove_SubHeader == True):
            n = n + 1
            continue
        sList2 = line.replace("\n","").strip().split( delineator )
        col_2 = str(sList2.pop(2))
        for n in col_2:
            if n == "1":
                col_2 = col_2.replace("1", "B")
            elif n == "":
                col_2 = col_2.replace("", "N")
        print col_2
        return_List.append(sList2) #add my secondary list back to my main List? right?
        sList2.insert(0, col_2)# insert back to my secondary list where it went
        n = n + 1 #add to counter and move down the line
    raw_file.close()
    #Return the list
    return return_List
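
One likely cause, offered as a reading of the code above and not a confirmed diagnosis: for n in col_2 iterates over the characters of the cell, so for an empty string the loop body never runs and no "N" is written. The same loop also overwrites the line counter n. A minimal Python 3 sketch that compares the whole cell instead is shown below. It assumes the column sits at index 2, and the name relabel_foundation is hypothetical.

```python
import csv
import io


def relabel_foundation(rows, column=2):
    """Return copies of rows with "" -> "N" and "1" -> "B" in one column."""
    labels = {"": "N", "1": "B"}
    relabeled = []
    for row in rows:
        row = list(row)  # copy so the caller's rows are not modified
        cell = row[column].strip()
        # Cells that are neither empty nor "1" are left unchanged.
        row[column] = labels.get(cell, row[column])
        relabeled.append(row)
    return relabeled


# Sample data modeled on the question; csv.reader handles the splitting.
text = "abc1,a12n,,123\ndef2,d13b,1,456\nghi3,g12n,,789\n"
rows = list(csv.reader(io.StringIO(text)))
for row in relabel_foundation(rows):
    print(row)
```

For a real file, the rows could come from csv.reader over an opened file, with any header rows skipped before the call.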

Appending individual lists created from a list comprehension using values from input()

I created a list comprehension to provide me the following:
listoflists = [[] for i in range(252*5)]
I then simplified the list in variable newlists to contain only the number of lists in range(weeks) which is a dynamic variable.
I want to fill these lists in the following loop, moving on to the next list once the current one has reached a specified length. The values are generated from an input function. For instance, once the first list in newlists reaches a length of 5, I want the values after the 5th loop to be appended to the next list, and so on. The code I currently have is:
p = 0
singlist = []
listoflists = [[] for i in range(252*5)]
newlists= [listoflists[i] for i in range(weeks)]
while p<(int(people)*weeks): #fix appending process
    for i in range(int(people)*weeks):
        weekly =input("Put your hours: ")
        singlist.append(int(weekly))
        p += 1
        if weekly.isalpha() == True:
            print("Not a valid amount of time")
for i in range(0,weeks):
    while len(newlists[i])<int(people):
        newlists[i].append(singlist[i])
This code however appends the same values to all lists in range weeks. What is the most efficient way to fix this? Thank you!
For example, if singlist = [10,15,20,25],
the desired output for newlists is: [[10,15],[20,25]]
How I've structured the program:
import sys
import numpy as np
import pandas as pd
from datetime import tzinfo,timedelta,datetime
import matplotlib.pyplot as plt
import itertools as it
from itertools import count,islice

team = []
y = 0
while y == 0:
    try:
        people = input("How many people are on your engagement? ")
        if people.isdigit() == True:
            y += 1
    except:
        print("Not a number try again")
z= 0
while z<int(people):
    for i in range(int(people)):
        names = input("Name: ")
        if names.isalpha() == False:
            team.append(names)
            z+=1
        elif names.isdigit() == True:
            print("Not a name try again")
ties = [] # fix looping for more than one person
e = 0
while e<int(people):
    for i in range(int(people)):
        title = input("What is their title: ")
        if title.isdigit() == True:
            print("Not a title try again")
        else:
            ties.append(title)
            e+=1
values = [] #fix looping for more than one person
t= 0
while t <int(people):
    for i in range(int(people)):
        charge = input("How much are you charging for them: ")
        if charge.isalpha() == True:
            print("Not a valid rate")
        else:
            values.append(int(charge))
            t +=1
weeks = int(input("How many weeks are you including: "))
days = []
x = 0
while x<weeks: #include a parameter for dates of a 7 day difference to only be permitted
    try:
        for i in range(int(weeks)):
            dates = input("Input the dates (mm/dd/yy): ")
            dt_start = datetime.strptime(dates,'%m/%d/%y')
            days.append(dates)
            x+=1
    except:
        print("Incorrect format")
p = 0
singlist = []
listoflists = [[] for i in range(252*5)]
newlists= [listoflists[i] for i in range(weeks)]
while p<(int(people)*weeks): #fix appending process
    for i in range(int(people)*weeks):
        weekly =input("Put your hours: ")
        singlist.append(int(weekly))
        p += 1
        if weekly.isalpha() == True:
            print("Not a valid amount of time")

def func(items,n):
    items = iter(items)
    for i in it.count():
        out = it.islice(items,weeks*i,weeks*i+n)
        if not out:
            break

output = list(func(singlist,weeks))
# items = [1,2,3,...n]
# output = [[1,2],[3,4],..], n = 2 elements each
items_ = iter(items)
outiter = iter(lambda: [next(items_) for i in range(n)],[])
outlist = list(outiter)
You can do the same thing using a while loop in place of count() and the [a:b] slice operation on the list instead of islice(), but iterators avoid building intermediate copies of the data.
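
For comparison, the slice-based variant might look like the following sketch. The name chunk is hypothetical, and size stands for the number of values per inner list (people in the question).

```python
def chunk(items, size):
    """Split items into consecutive lists of length size.

    The last list is shorter when len(items) is not a multiple of size.
    """
    return [items[start:start + size] for start in range(0, len(items), size)]


singlist = [10, 15, 20, 25]
newlists = chunk(singlist, 2)
print(newlists)  # [[10, 15], [20, 25]]
```

Unlike the iter()-with-sentinel version above, this keeps a trailing partial group instead of dropping it.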

find all occurrences inside a list

I'm trying to implement a function to find occurrences in a list, here's my code:
def all_numbers():
    num_list = []
    c.execute("SELECT * FROM myTable")
    for row in c:
        num_list.append(row[1])
    return num_list

def compare_results():
    look_up_num = raw_input("Lucky number: ")
    occurrences = [i for i, x in enumerate(all_numbers()) if x == look_up_num]
    return occurrences
I keep getting an empty list instead of the occurrences, even when I enter a number that is in the mentioned list.
Your code does the following:
1. It fetches everything from the database. Each row is a sequence.
2. Then, it takes all these results and adds them to a list.
3. It returns this list.
4. Next, your code goes through each item in the list (remember, it's a sequence, like a tuple) and fetches the item and its index (this is what enumerate does).
5. Next, you attempt to compare the sequence with a string, and if it matches, return it as part of a list.
At #5, the script fails because you are comparing a tuple to a string. Here is a simplified example of what you are doing:
>>> def all_numbers():
...     return [(1,5), (2,6)]
...
>>> lucky_number = 5
>>> for i, x in enumerate(all_numbers()):
...     print('{} {}'.format(i, x))
...     if x == lucky_number:
...         print 'Found it!'
...
0 (1, 5)
1 (2, 6)
As you can see, on each iteration x is the whole tuple, so it will never equal 5, even though the row actually exists.
You can have the database do your dirty work for you, by returning only the number of rows that match your lucky number:
def get_number_count(lucky_number):
    """ Returns the number of times the lucky_number
    appears in the database """
    c.execute('SELECT COUNT(*) FROM myTable WHERE number_column = %s', (lucky_number,))
    result = c.fetchone()
    return result[0]

def get_input_number():
    """ Get the number to be searched in the database """
    lookup_num = raw_input('Lucky number: ')
    return get_number_count(lookup_num)
raw_input is returning a string. Try converting it to a number.
occurrences = [i for i, x in enumerate(all_numbers()) if x == int(look_up_num)]
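
Put together, a sketch of the lookup with the conversion applied could look like this. The database call is replaced by a plain list so the example is self-contained, the stored values are assumed to be integers, and find_occurrences is a hypothetical name.

```python
def find_occurrences(numbers, raw_value):
    """Return the indices at which int(raw_value) appears in numbers."""
    try:
        target = int(raw_value)
    except ValueError:
        return []  # input that is not a number cannot match anything
    return [index for index, value in enumerate(numbers) if value == target]


stored = [7, 5, 12, 5]  # stand-in for the values read from the table
print(find_occurrences(stored, "5"))  # [1, 3]
print(find_occurrences(stored, "x"))  # []
```

The try/except is an addition for robustness. The one-line answer above would raise ValueError on non-numeric input.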