How can i correctly print out this dictionary in a way i have each word sorted by the number of times(frequency) in the text?
slova = dict()
for line in text:
line = re.split('[^a-z]',text)
line[i] = filter(None,line)
i =+ 1
i = 0
for line in text:
for word in line:
if word not in slova:
slova[word] = i
i += 1
I'm not sure what your text looks like, and you also haven't provided example output, but here is what my guess is. If this doesn't help please update your question and I'll try again. The code makes use of Counter from collections to do all the heavy lifting. First all of the words in all of the lines of the text are flattened to a single list, then this list is simply passed to Counter. The keys of the Counter (the words) are then sorted by their counts and printed out.
CODE:
from collections import Counter
import re
text = ['hello hi hello yes hello',
'hello hi hello yes hello']
all_words = [w for l in text for w in re.split('[^a-z]',l)]
word_counts = Counter(all_words)
sorted_words = sorted(word_counts.keys(),
key=lambda k: word_counts[k],
reverse = True)
#Print out the word and counts
for word in sorted_words:
print word,word_counts[word]
OUTPUT:
hello 6
yes 2
hi 2
I have some data that I've pulled from a website. This is the code I used to grab it (my actual code is much longer but I think this about sums it up).
lid_restrict_save = []
for t in range(10000,10020):
address = 'http://www.tspc.oregon.gov/lookup_application/' + lines2[t]
page = requests.get(address)
tree = html.fromstring(page.text)
#District Restriction
dist_restrict = tree.xpath('//tr[11]//text()')
if u"District Restriction" in dist_restrict:
lid_restrict_save.append(id2)
I'm trying to export this list:
print lid_restrict_save
[['5656966VP65', '5656966RR68', '56569659965', '56569658964']]
to a text file.
f = open('dis_restrict_no_uniqDOB2.txt', 'r+')
for j in range(0,len(lid_restrict_save)):
s = ( (unicode(lid_restrict_save[j]).encode('utf-8') + ' \n' ))
f.write(s)
f.close()
I want the text to come out looking like this:
5656966VP65
5656966RR68
56569659965
56569658964
This code worked but only when I started the range from 0.
f = open('dis_restrict.txt', 'r+')
for j in range(0,len(ldob_restrict)):
f.write( ldob_restrict[j].encode("utf-8") + ' \n' )
f.close()
When I've tried changing the code I keep getting this error:
"AttributeError: 'list' object has no attribute 'encode'."
I've tried the suggestions from here, here, and here but to no avail.
If anyone has any hints it would be greatly appreciated.
lid_restrict_save is a nested list so you can't encode the first element because it is not a string.
You could write to the txt file using this:
lid_restrict_save = [['5656966VP65', '5656966RR68', '56569659965', '56569658964']]
lid_restrict_save = lid_restrict_save[0] # remove the outer list
with open('dis_restrict.txt', 'r+') as f:
for i in lid_restrict_save:
f.write(str(i) + '\n')
You are given an integer NN on one line. The next line contains NN space separated integers. Create a tuple of those NN integers. Let's call it TT.
Compute hash(T) and print it.
Note: Here, hash() is one of the functions in the __builtins__ module.
Input Format
The first line contains NN. The next line contains NN space separated integers.
Output Format
Print the computed value.
Sample Input
2
1 2
Sample Output
3713081631934410656
My code
a=int(raw_input())
b=()
i=0
for i in range (0,a):
x=int(raw_input())
c = b + (x,)
i=i+1
hash(b)
Error:
invalid literal for int() with base 10: '1 2'
There are three errors that I can spot:
First, your for-loop is not indented.
Second, you should not be adding 1 to i - the for-loop does this automatically.
Thirds - and this is where the error is thrown - is that raw_input reads the entire line. If you are reading the line '1 2', you cannot convert this to an int.
To fix this problem, I suggest doing:
line = tuple(map(int,raw_input().split(' ')))
This takes the raw input, splits it into an list, makes this list into ints, then turns this list into a tuple.
In fact, you can scrap the entire for loop. You could answer this problem in two lines of code:
raw_input()#To get rid of the first line, which we do not need
print hash(tuple(map(int,raw_input().split(' '))))
The input format
next line contains NN space separated integers
eg: 1 2 3, is not an integer (because of the spaces), that is why when you try int(raw_input()) your code throws an error. You should use split(' ') as the other answer has suggested, to separate each integer. This will remove the error.
Also, there is no need to use i=i+1 as the loop will take care of it
Try the below code:
if __name__ == '__main__':
n = int(input())
integer_list = map(int, input().split())
t = tuple(integer_list)
print(hash(t))
Try This code for Python-3
if __name__ == '__main__':
n = int(input())
integer_list = map(int, input().split())
input_list = [int(x) for x in integer_list]
t = tuple(input_list)``
print(hash(t))
I have a code here on Python 2.7 that is supposed to tell me the frequency of a letter or word within a single text file.
def frequency_a_in_text(textfile, a):
"""Counts how many "a" letters are in the text file.
"""
try:
f = open(textfile,'r')
lines = f.readlines()
f.close()
except IOError:
return -1
tot = 0
for line in lines:
split = str(line.split())
k = split.count(s)
tot = tot + k
return tot
print frequency_a_in_text("RandomTextFile.txt", "a")
There's a little bit of extra coding in there - the "try" and "except", but that's just telling me that if I can't open the text file, then it'll return a "-1" to me.
Whenever I run it, it seems to just read the first line and tell me how many "a" letters there are.
You are returning out of the function after the first iteration of your loop.
The return statement should be outside of the loop.
for line in lines:
split = str(line.split())
k = split.count(s)
tot = tot + k
return tot
I am beginner for real programming and have the ff problem
I want to read many instances stored in a file/csv/txt/excel
like the folloing
find<S>ing<G>s<p>
Then when I read this file it goes through each character and start from the six position and continue until the 11 position-the max size of a single row is 12
-,-,-,-,-,f,i,n,d,i,n,0
-,-,-,-,f,i,n,d,i,n,g,0
-,-,-,f,i,n,d,i,n,g,s,0
-,-,f,i,n,d,i,n,g,s,-,S//there is an S value next to the letter d
-,f,i,n,d,i,n,g,s,-,-,0
f,i,n,d,i,n,g,s,-,-,-,0
i,n,d,i,n,g,s,-,-,-,-,G // there is a G value here at th end of g
n,d,i,n,g,s,-,-,-,-,-,P */// there is a P value here at th end of s
Here is the code that I tried in python. but can be possible in c++, java, dotNet.
import sys
import os
f = open('/home/mm/exprimentdata/sample3.csv')// can be txt file
string = f.read()
a = []
b = []
i = 0
while (i < len(string)):
if (string[i] != '\n '):
n = string[i]
if (string[i] == ""):
print ' = '
if (string[i] = upper | numeric)
print rep(char).rjust(12),delimiter=','
a.append(n)
i = (i+1)
print (len(a))
print a
my question is how can I compare each string and assign a single char at the rightmost part (position 12 like above G,P,S)
how can I push one step back after aligning the first row?
how can i fix the length
please anyone see fragment and adjust to solve the above case
I don't understand your question.
But some advice:
Firstly, you should be closing the file after you open it.
f = open('/home/mm/exprimentdata/sample3.csv')// can be txt file
string = f.read()
**f.close()**
Secondly, your indentation is problematic. Whitespace matters in Python. (Maybe your real code is indented properly and it's just a StackOverflow thing.)
Thirdly, instead of using a while loop and incrementing, you should be writing:
for i range(len(string)):
# loop code
Fourthly, this line will never evaluate to True:
if (string[i] == ""):
string[i] will always be some character (or cause an out of bounds error).
I advise you read a Python tutorial before you try and write this program.