For fun, I'm trying to write some Python code that maps a random number to a letter of the alphabet or a punctuation mark and adds that character to a list. I then want the code to keep making new lists of random characters until one of them spells "to be or not to be, that is the question." Finally, I want to print that list and see how many attempts it took. This is what I have so far.
from random import *

alphabet = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',',',' ','.']
sentence = []
numbers = []

def random(x):
    randval = x
    return randval

count = 0
for i in range(1000): # trying to place an upper bound on how many times to try
    for i in range(41): # the number of characters in the sentence
        randomness = random(randint(0,28)) # the number of entries in the alphabet list
        numbers.append(randomness)
    for i in numbers:
        count += 1
        sentence.append(alphabet[i])
    if sentence!=['t','o',' ','b','e',' ','o','r',' ','n','o','t',' ','t','o',' ','b','e',',',' ','t','h','a','t',' ','i','s',' ','t','h','e',' ','q','u','e','s','t','i','o','n','.']:
        sentence = [] ### This is supposed to empty the list if it gets the wrong order, but doesn't quite do that.
    if sentence == ['t','o',' ','b','e',' ','o','r',' ','n','o','t',' ','t','o',' ','b','e',',',' ','t','h','a','t',' ','i','s',' ','t','h','e',' ','q','u','e','s','t','i','o','n','.']:
        print sentence
        print count
        break
new_sentence = ''.join(sentence)
print new_sentence
I'm not sure what I'm doing wrong. The list size keeps blowing up instead of staying at a length of 41. Any suggestions?
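One way to keep the length fixed is to build a fresh string on every attempt instead of appending to lists that live outside the loop. Below is a minimal sketch of that idea, not a drop-in fix: the names ALPHABET, random_sentence and attempts_until_match are made up for illustration, and it is written so that it should run on Python 3 as well as 2.7. Keep in mind that with 29 symbols and 41 positions there are 29**41 possible strings, so the full sentence will realistically never come up; the usage line below therefore tries a short target.

```python
import random
import string

# 26 lowercase letters plus comma, space and full stop: 29 symbols in total.
ALPHABET = string.ascii_lowercase + ", ."

def random_sentence(length, rng):
    """Build one brand-new random string of the given length."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def attempts_until_match(target, max_tries, rng=None):
    """Return (attempts_used, sentence) on success, or (max_tries, None)."""
    rng = rng or random.Random()
    for attempt in range(1, max_tries + 1):
        # A new candidate is created each time, so nothing accumulates.
        candidate = random_sentence(len(target), rng)
        if candidate == target:
            return attempt, candidate
    return max_tries, None
```

For example, attempts_until_match("to", 100000) will almost certainly succeed, since there are only 29**2 = 841 two-character strings to hit.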
I have written the code below. It runs without errors, but when two words in a sentence are repeated the same number of times, it does not return the one that comes first in alphabetical order. Can anyone suggest an alternative? This code is going to be evaluated in Python 2.7.
"""Quiz: Most Frequent Word"""

def most_frequent(s):
    """Return the most frequently occurring word in s."""
    """ Step 1 - The following assumptions have been made:
        - Space is the default delimiter
        - There are no other punctuation marks that need removing
        - Convert all letters into lower case"""
    word_list_array = s.split()

    """Step 2 - sort the list alphabetically"""
    word_sort = sorted(word_list_array, key=str.lower)

    """Step 3 - count the number of times word has been repeated in the word_sort array.
        create another array containing the word and the frequency in which it is repeated"""
    wordfreq = []
    freq_wordsort = []
    for w in word_sort:
        wordfreq.append(word_sort.count(w))
    freq_wordsort = zip(wordfreq, word_sort)

    """Step 4 - output the array having the maximum first index variable and output the word in that array"""
    max_word = max(freq_wordsort)
    word = max_word[-1]
    result = word
    return result

def test_run():
    """Test most_frequent() with some inputs."""
    print most_frequent("london bridge is falling down falling down falling down london bridge is falling down my fair lady") # expected: 'down'
    print most_frequent("betty bought a bit of butter but the butter was bitter") # expected: 'butter'

if __name__ == '__main__':
    test_run()
Without changing your code too much, a good solution is to use the list index method.
After finding the highest frequency (max_word, which despite its name now holds a count), call index on wordfreq with max_word as the argument. index returns the position of the first match, and because word_sort is sorted alphabetically, that position belongs to the alphabetically first word among any ties. Then return the word at that position in word_sort.
Code example is below (I removed the zip function as it is not needed anymore, and added two simpler examples):
"""Quiz: Most Frequent Word"""

def most_frequent(s):
    """Return the most frequently occurring word in s."""
    """ Step 1 - The following assumptions have been made:
        - Space is the default delimiter
        - There are no other punctuation marks that need removing
        - Convert all letters into lower case"""
    word_list_array = s.split()

    """Step 2 - sort the list alphabetically"""
    word_sort = sorted(word_list_array, key=str.lower)

    """Step 3 - count the number of times word has been repeated in the word_sort array.
        create another array containing the word and the frequency in which it is repeated"""
    wordfreq = []
    # freq_wordsort = []
    for w in word_sort:
        wordfreq.append(word_sort.count(w))
    # freq_wordsort = zip(wordfreq, word_sort)

    """Step 4 - output the array having the maximum first index variable and output the word in that array"""
    max_word = max(wordfreq)
    word = word_sort[wordfreq.index(max_word)] # <--- solution!
    result = word
    return result

def test_run():
    """Test most_frequent() with some inputs."""
    print(most_frequent("london bridge is falling down falling down falling down london bridge is falling down my fair lady")) # output: 'down'
    print(most_frequent("betty bought a bit of butter but the butter was bitter")) # output: 'butter'
    print(most_frequent("a a a a b b b b")) #output: 'a'
    print(most_frequent("z z j j z j z j")) #output: 'j'

if __name__ == '__main__':
    test_run()
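If you are open to a different approach, here is a rough sketch using collections.Counter, which is available from Python 2.7 on. It is not the method above, just an alternative: the function name most_frequent_counter is made up, and lowercasing the input is my reading of the Step 1 assumptions. Ties are broken by picking the smallest (negative count, word) pair, so the highest count wins first and alphabetical order decides among equals.

```python
from collections import Counter

def most_frequent_counter(s):
    """Return the most frequent word in s; ties go to the alphabetically first."""
    counts = Counter(s.lower().split())
    # Smallest key wins: a higher count gives a more negative first element,
    # and among equal counts the alphabetically smaller word comes first.
    return min(counts, key=lambda word: (-counts[word], word))
```

Note that an empty string has no words, so min() raises ValueError there; handle that case separately if it can occur.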
I have this question: Write a program that will calculate the average word length of a text stored in a file (i.e. the sum of the lengths of all the word tokens in the text, divided by the number of word tokens).
My code:
allword = 0
words = 0
average = 0
with open('/home/......', 'r') as f:
    for i in f:
        me = i.split()
        allword += len(me)
        words += len(i)
    average += allword / float(words)
    print average
So, I have 4 lines and 55 characters not counting whitespace, and I get an average of 27.54, which I don't think is right.
Can anybody tell me in simple words where the problem is?
Thanks very much!
#mustaccio
Maybe 27.54 is too high. Here is the code with a small change:
allword = 0
words = 0
average = 0
with open('/home/....', 'r') as f:
    for i in f:
        me = "".join(i.split(" "))
        allword += len(me)
        words += len(i)
    average += allword / float(words)
    print average
Now I get 4.32.
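For comparison, the definition in the exercise (sum of the word lengths divided by the number of words) could be sketched roughly like this. The function name average_word_length is made up, and it assumes that a "word token" is anything separated by whitespace, so attached punctuation counts toward a word's length. It takes any iterable of lines, which means an open file object works too.

```python
def average_word_length(lines):
    """Return total characters in all words divided by the number of words."""
    total_chars = 0
    total_words = 0
    for line in lines:
        tokens = line.split()  # split on any whitespace, drops the newline
        total_words += len(tokens)
        total_chars += sum(len(token) for token in tokens)
    if total_words == 0:
        return 0.0  # avoid dividing by zero on an empty text
    return total_chars / float(total_words)
```

With a file you would open it as in the question and pass f to the function. The key difference from the snippets above is that both totals are collected over the whole file first and the division happens only once at the end.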
I am trying to write a program that inserts specific numbers before parts of an input. For example, given the input "171819-202122-232425", I would like to split it into two-digit pieces, with the dash acting as a delimiter between groups. I have split up the number using list(str(input)) but have no idea how to insert the appropriate numbers. It has to work for any number. Thanks for the help.
Output =
(number)17
(number)18
(number)19
(number+1)20
(number+1)21
(number+1)22
(number+2)23
(number+2)24
(number+2)25
You could use split and regexps to dig out lists of your numbers:
Code
import re

mynum = "171819-202122-232425"
start_number = 5
groups = mynum.split('-') # list of numbers separated by "-"
number_of_groups = xrange(start_number, start_number + len(groups))
for (i, number_group) in zip(number_of_groups, groups):
    numbers = re.findall(r"\d{2}", number_group) # return list of two-digit numbers
    for x in numbers:
        print "(%s)%s" % (i, x)
Result
(5)17
(5)18
(5)19
(6)20
(6)21
(6)22
(7)23
(7)24
(7)25
Try this:
Code:
mInput = "171819-202122-232425"
number = 9 # Just an example
result = ""
i = 0
for n in mInput:
    if n == '-': # To handle dash case
        number += 1
        continue
    i += 1
    if i % 2 == 1: # Each two digits
        result += "\n(" + str(number) + ")"
    result += n # Add current digit
print result
Output:
(9)17
(9)18
(9)19
(10)20
(10)21
(10)22
(11)23
(11)24
(11)25
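Both answers can also be folded into a small reusable function. The sketch below is only one way to do it: the name label_groups and the width parameter are made up for illustration, and it assumes each dash-separated group should be cut into fixed-width pieces (two digits by default). It returns the lines as a list instead of printing them, so you can print or join them however you like.

```python
def label_groups(text, start, width=2):
    """Return '(label)piece' strings; the label goes up by one per dash group."""
    lines = []
    # enumerate(..., start) pairs the first group with start, the next with start + 1, etc.
    for label, group in enumerate(text.split("-"), start):
        for pos in range(0, len(group), width):
            lines.append("(%d)%s" % (label, group[pos:pos + width]))
    return lines
```

Calling label_groups("171819-202122-232425", 5) gives the same nine lines as the first result listing above.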
I have an input with words and their frequency for each line, but I would like a total count of word frequency across all lines. I know there are many solutions for calculating word frequency from a file as a whole, but my input has brackets around each line and parentheses around each word. I have not been able to extract the word and count because each line has a different number of words. Any help would be greatly appreciated!
A sample input:
[('Company', 1)]
[('Tax', 1), ('Service', 1)]
[('"Birchwood', 1), ('LLC"', 1), ('Enterprise,', 1)]
[("Wendy's", 1), ('Salon', 1)]
Code I have been trying:
from collections import defaultdict

def wordCountTotals (fh):
    d = defaultdict(int)
    for line in fh:
        word, count = line.split()
        d[word] += count
    return d[word], count
I have also tried using :
re.search("\((\w+)\, [0-9]+)", s)
but still no results
Because there are brackets and parentheses, this code does not work: there are too many values to unpack. If anyone could help with this, I would be so grateful!
Each line of your input is a list of tuples written in exactly the syntax Python itself uses, so we can exploit that with ast.literal_eval.
>>> import ast
>>> ast.literal_eval(" [('Company', 1)]".strip())
[('Company', 1)]
So, something along the lines of:
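import ast
from collections import defaultdict

def wordCountTotals(fh):
    d = defaultdict(int) # defaultdict needs a callable such as int, not 0
    for line in fh:
        val = ast.literal_eval(line.strip())
        for s, c in val:
            d[s] += c
    return d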
would be enough. I have not tried this, might need some fixes.
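To make the idea easier to try out, here is a small self-contained sketch of the same approach. The function name word_count_totals is made up, it uses collections.Counter instead of defaultdict purely for convenience, and it assumes every non-blank line is a valid Python list of (word, count) tuples like the sample input.

```python
import ast
from collections import Counter

def word_count_totals(lines):
    """Sum the per-line (word, count) pairs into one Counter."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, literal_eval would reject them
        # literal_eval safely parses the list-of-tuples text into real objects.
        for word, count in ast.literal_eval(line):
            totals[word] += count
    return totals
```

It accepts any iterable of lines, so an open file handle can be passed in directly. A malformed line makes ast.literal_eval raise ValueError or SyntaxError, which you may want to catch if the input is not fully trusted to be well-formed.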