Need help counting only certain words (Python) [duplicate] - if-statement

I have this code, where I am trying to count the number of:
Lines of code in a .py script
for_loops ("for ")
while_loops ("while ")
if_statements ("if ")
function definitions ("def ")
multiplication signs ("*")
division signs ("/")
addition signs ("+")
subtraction signs ("-")
The code works for the mathematical signs, but when it looks for if statements it returns 2 when there is only one. That is the main problem, and it makes me think I have written the for loop incorrectly, which could cause more problems later. I am also not sure how to print the Author line, which comes out as [] instead of the name of the author.
The code:
from collections import Counter
FOR_=0
WHILE_=0
IF_=0
DEF_=0
x =input("Enter file or directory: ")
print ("Enter file or directory: {0}".format(x))
print ("Filename {0:>20}".format(x))
b= open(x)
c=b.readlines()
d=b.readlines(2)
print ("Author {0:<18}".format(d))
print ("lines_of_code {0:>8}".format((len (c))))
counter = Counter(str(c))
for line in c:
    if ("for ") in line:
        FOR_+=1
print ("for_loops {0:>12}".format((FOR_)))
for line in c:
    if ("while ") in line:
        WHILE_+=1
print ("while_loops {0:>10}".format((WHILE_)))
for line in c:
    if ("if ") in line:
        IF_+=1
a=IF_
print ("if_statements {0:>8}".format((a)))
for line in c:
    if ("def ") in line:
        DEF_+=1
print ("function_definitions {0}".format((DEF_)))
print ("multiplications {0:>6}".format((counter['*'])))
print ("divisions {0:>12}".format((counter['/'])))
print ("additions {0:>12}".format((counter['+'])))
print ("subtractions {0:>9}".format((counter['-'])))
The file being read from:
'''Dumbo
Author: Hector McTavish'''
for for for # Should count as 1 for statement
while_im_alive # Shouldn't count as a while
while blah # But this one should
if defined # Should be an if but not a def
def if # Should be a def but not an if
x = (2 * 3) + 4 * 2 * 7 / 1 - 2 # Various operators
Any help would be much appreciated

Instead of treating the source code as a string, use the ast module to parse it and then just walk through the nodes:
import ast
from collections import Counter
tree = ast.parse('''
"""
Author: Nobody
"""
def foo(*args, **kwargs):
    for i in range(10):
        if i != 2**2:
            print(i * 2 * 3 * 2)
def bar():
    pass
''')
counts = Counter(node.__class__ for node in ast.walk(tree))
print('The docstring says:', repr(ast.get_docstring(tree)))
print('You have', counts[ast.Mult], 'multiplication signs.')
print('You have', counts[ast.FunctionDef], 'function definitions.')
print('You have', counts[ast.If], 'if statements.')
It's pretty straightforward and handles all of your corner cases:
The docstring says: 'Author: Nobody'
You have 3 multiplication signs.
You have 2 function definitions.
You have 1 if statements.
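The same idea extends to the other items listed in the question. The following is a minimal sketch, written for Python 3, not a definitive implementation: the sample source, the NODE_TYPES table and the count_constructs name are made up for illustration, while the ast node classes themselves are part of the standard library.

```python
import ast
from collections import Counter

# Hypothetical sample source, used only for this illustration.
SOURCE = '''
def total(values):
    result = 0
    for v in values:
        if v > 0:
            result = result + v * 2 - 1
    while result > 100:
        result = result / 2
    return result
'''

# Map each label from the question to the ast node class that represents it.
NODE_TYPES = {
    "for_loops": ast.For,
    "while_loops": ast.While,
    "if_statements": ast.If,
    "function_definitions": ast.FunctionDef,
    "multiplications": ast.Mult,
    "divisions": ast.Div,
    "additions": ast.Add,
    "subtractions": ast.Sub,
}


def count_constructs(source):
    """Parse the source and count the node classes listed in NODE_TYPES."""
    counts = Counter(type(node) for node in ast.walk(ast.parse(source)))
    return {label: counts[node_type] for label, node_type in NODE_TYPES.items()}


report = count_constructs(SOURCE)
for label, value in report.items():
    print("{0:<22}{1}".format(label, value))
```

Note that this only works when the file is valid Python, so it would reject the test file from the question (a line such as for for for does not parse).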

if ("if ") in line is a plain substring test, so it will also count the line def if #. That is why you get 2 instead of 1.
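If you want to stay with line-by-line matching, one option is to test only the start of each line instead of searching the whole line. This is a rough sketch under that assumption (Python 3); count_leading_keywords and find_author are hypothetical helper names, and the approach still ignores cases such as elif or keywords that appear later on a line. It also reads the author from the lines already loaded, since the second b.readlines(2) call in the question returns [] once the first readlines() has consumed the file.

```python
import re

KEYWORDS = ("for", "while", "if", "def")

# The sample file from the question, one string per line.
SAMPLE_LINES = [
    "'''Dumbo",
    "Author: Hector McTavish'''",
    "for for for # Should count as 1 for statement",
    "while_im_alive # Shouldn't count as a while",
    "while blah # But this one should",
    "if defined # Should be an if but not a def",
    "def if # Should be a def but not an if",
    "x = (2 * 3) + 4 * 2 * 7 / 1 - 2 # Various operators",
]


def count_leading_keywords(lines):
    """Count lines whose first word is exactly one of KEYWORDS."""
    counts = dict.fromkeys(KEYWORDS, 0)
    for line in lines:
        for keyword in KEYWORDS:
            # \b stops "while" from matching "while_im_alive".
            if re.match(r"\s*" + keyword + r"\b", line):
                counts[keyword] += 1
    return counts


def find_author(lines):
    """Return the text after 'Author:' on the first line that contains it."""
    for line in lines:
        if "Author:" in line:
            return line.split("Author:", 1)[1].strip().strip("'\"")
    return None


print(count_leading_keywords(SAMPLE_LINES))
print(find_author(SAMPLE_LINES))
```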

Related

Reading mailing addresses of varying length from a text file using regular expressions

I am trying to read a text file and collect addresses from it. Here's an example of one of the entries in the text file:
Electrical Vendor Contact: John Smith Phone #: 123-456-7890
Address: 1234 ADDRESS ROAD Ship To:
Suite 123 ,
Nowhere, CA United States 12345
Phone: 234-567-8901 E-Mail: john.smith@gmail.com
Fax: 345-678-9012 Web Address: www.electricalvendor.com
Acct. No: 123456 Monthly Due Date: Days Until Due
Tax ID: Fed 1099 Exempt Discount On Assets Only
G/L Liab. Override:
G/L Default Exp:
Comments:
APPROVED FOR ELECTRICAL THINGS
I cannot wrap my head around how to search for and store the address for each of these entries when the number of lines in the address varies. Currently, I have a generator that reads each line of the file. The get_addrs() method then looks for markers such as the Address: and Ship keywords to detect when an address needs to be stored, and I use a regular expression to search for zip codes in the line following a line with the Address: keyword. I think I've figured out how to successfully save the second line for all addresses using that method. However, in a few addresses there is a suite number or other piece of information that makes the address three lines instead of two. I'm not sure how to account for this. I tried expanding my save_previous() method to three lines, but I can't get it quite right. Here's the code with which I was able to save all of the two-line addresses:
import re
class GetAddress():
    def __init__(self):
        self.line1 = []
        self.line2 = []
        self.s_line1 = []
        self.addr_index = 0
        self.ship_index = 0
        self.no_ship = False
        self.addr_here = False
        self.prev_line = []
        self.us_zip = ''
    # Check if there is a shipping address.
    def set_no_ship(self, line):
        try:
            self.no_ship = line.index(',') == len(line) - 1
        except ValueError:
            pass
    # Save two lines at a time to see whether or not the previous
    # line contains 'Address:' and 'Ship'.
    def save_previous(self, line):
        self.prev_line += [line]
        if len(self.prev_line) > 2:
            del self.prev_line[0]
    def get_addrs(self, line):
        self.addr_here = 'Address:' in line and 'Ship' in line
        self.po_box = False
        self.no_ship = False
        self.addr_index = 0
        self.ship_index = 0
        self.zip1_index = 0
        self.set_no_ship(line)
        self.save_previous(line)
        # Check if 'Address:' and 'Ship' are in the previous line.
        self.prev_addr = (
            'Address:' in self.prev_line[0]
            and 'Ship' in self.prev_line[0])
        if self.addr_here:
            self.po_box = 'Box' in line or 'BOX' in line
            self.addr_index = line.index('Address:') + 1
            self.ship_index = line.index('Ship')
            # Get the contents of the line between 'Address:' and
            # 'Ship' if both words are present in this line.
            # Compare the indices by value (!=, ==) rather than by identity.
            if self.addr_index != self.ship_index:
                self.line1 += [' '.join(line[self.addr_index:self.ship_index])]
            elif self.addr_index == self.ship_index:
                self.line1 += ['']
        if len(self.prev_line) > 1 and self.prev_addr:
            self.po_box = 'Box' in line or 'BOX' in line
            self.us_zip = re.search(r'(\d{5}(\-\d{4})?)', ' '.join(line))
            if self.us_zip and not self.po_box:
                self.zip1_index = line.index(self.us_zip.group(1))
            if self.no_ship:
                self.line2 += [' '.join(line[:line.index(',')])]
            elif self.zip1_index and not self.no_ship:
                self.line2 += [' '.join(line[:self.zip1_index + 1])]
            elif len(self.line1) > 0 and not self.line1[-1]:
                self.line2 += ['']
# Create a generator to read each line of the file.
def read_gen(infile):
    with open(infile, 'r') as file:
        for line in file:
            yield line.split()
infile = 'Vendor List.txt'
info = GetAddress()
for i, line in enumerate(read_gen(infile)):
    info.get_addrs(line)
I am still a beginner in Python so I'm sure a lot of my code may be redundant or unnecessary. I'd love some feedback as to how I might make this simpler and shorter while capturing both two and three line addresses.
I also posted this question to Reddit, and u/Binary101010 pointed out that the text file is fixed-width, so it may be possible to slice each line in a way that selects only the necessary address information. Using this idea I added some functionality to the generator function, and I was able to produce the desired effect with the following code:
infile = 'Vendor List.txt'
# Create a generator with differing modes to read the specified lines of the file.
def read_gen(infile, mode=0, start=0, end=0, rows=[]):
    lines = list()
    with open(infile, 'r') as file:
        for i, line in enumerate(file):
            # Set end to correct value if no argument is given.
            if end == 0:
                end = len(line)
            # Mode 0 gives all lines of the file
            if mode == 0:
                yield line[start:end]
            # Mode 1 gives specific lines from the file using the rows keyword
            # argument. Make sure rows is formatted as [start_row, end_row].
            # rows list should only ever be length 2.
            elif mode == 1:
                if rows:
                    # Create a list for indices between specified rows.
                    for element in range(rows[0], rows[1]):
                        lines += [element]
                # Return the current line if the index falls between the
                # specified rows.
                if i in lines:
                    yield line[start:end]
class GetAddress:
    def __init__(self):
        # Allow access to infile for use in set_addresses().
        global infile
        self.address_indices = list()
        self.phone_indices = list()
        self.addresses = list()
        self.count = 0
    def get(self, i, line):
        # Search for appropriate substrings and set indices accordingly.
        if 'Address:' in line[18:26]:
            self.address_indices += [i]
        if 'Phone:' in line[18:24]:
            self.phone_indices += [i]
        # Add address to list if both necessary indices have been collected.
        if i in self.phone_indices:
            self.set_addresses()
    def set_addresses(self):
        self.address = list()
        start = self.address_indices[self.count]
        end = self.phone_indices[self.count]
        # Create a generator that only yields substrings for rows between given
        # indices.
        self.generator = read_gen(
            infile,
            mode=1,
            start=40,
            end=91,
            rows=[start, end])
        # Collect each line of the address from the generator and remove
        # unnecessary spaces.
        for element in range(start, end):
            self.address += [next(self.generator).strip()]
        # This document has a header on each page and a portion of that is
        # collected in the address substring. Search for the header substring
        # and remove the corresponding elements from self.address.
        if len(self.address) > 3 and not self.address[-1]:
            self.address = self.address[:self.address.index('header text')]
        self.addresses += [self.address]
        self.count += 1
info = GetAddress()
for i, line in enumerate(read_gen(infile)):
    info.get(i, line)
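The fixed-width idea can also be done in a single pass, without reopening the file for every address. The sketch below (Python 3) is only an illustration under assumptions: extract_addresses and make_row are hypothetical names, and the column layout (a 10-character label column followed by a 40-character address column) is invented for the sample data, so the slice positions would need to be adjusted to the real file.

```python
def extract_addresses(lines, text_start, text_end,
                      start_marker="Address:", end_marker="Phone:"):
    """Collect the address column of every line from start_marker up to end_marker."""
    addresses = []
    current = None  # None means we are not inside an address block
    for line in lines:
        label = line[:text_start]
        if start_marker in label:
            current = []
        elif end_marker in label and current is not None:
            addresses.append(current)
            current = None
        if current is not None:
            text = line[text_start:text_end].strip()
            if text:  # skip rows whose address column is empty
                current.append(text)
    return addresses


def make_row(label, text, extra=""):
    """Build one fixed-width row for the made-up sample layout."""
    return "{0:<10}{1:<40}{2}".format(label, text, extra)


SAMPLE = [
    make_row("Vendor:", "Electrical Vendor", "Contact: John Smith"),
    make_row("Address:", "1234 ADDRESS ROAD", "Ship To:"),
    make_row("", "Suite 123", ","),
    make_row("", "Nowhere, CA United States 12345"),
    make_row("Phone:", "234-567-8901", "E-Mail:"),
    make_row("Address:", "1 MAIN ST", "Ship To:"),
    make_row("", "Anytown, NY 10001"),
    make_row("Phone:", "555-000-1111"),
]

addresses = extract_addresses(SAMPLE, text_start=10, text_end=50)
for address in addresses:
    print(address)
```

Because each address is simply the list of non-empty rows between the two markers, two-line and three-line addresses are handled the same way.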

How to convert command line argument into string

I am practising questions for an exam that I will have in two weeks and whenever I try to attempt this question I become lost. I have tried putting args into int(args) but get "ValueError: invalid literal for int() with base 10:".
I am not allowed to use any for loops or any functions that would make this task simple.
import sys
args = sys.argv[1]
total = 0
i = 0
while i < len(args):
    total = total + args[i]
print total
You could go with "join" option.
import sys
' '.join(sys.argv[1:])
This will join your arguments with blank spaces between.
If you do this:
import sys
print " ".join(sys.argv[1:]) # skip the program's name, which is given as argv[0]
it will print all your arguments separated by single spaces.
Example:
python yourScriptName.py one two three four
will print
one two three four
To sum up your "numeric" command line params, you can use this:
import sys
def floatOrZero(tmp):
    f = 0.0
    try:
        f = float(tmp) # make a float. Lots of things are floats: 1.3e9
    except ValueError:
        f = 0.0 # this happens for non-floats
    return f
# sum all convertible parameters and print result
# using a list comprehension to convert args (strings) into
# floats or 0.0 if not convertible
print "Sum of numeric entries: " , sum([floatOrZero(num) for num in sys.argv[1:]])
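The question does not say exactly what has to be computed, so the following sketch assumes the goal is to add up the digits of the first argument using only a while loop. It is written for Python 3, and digit_sum is a hypothetical name. It addresses the two problems in the posted loop: each character has to be converted with int() on its own, and i has to be incremented or the loop never ends.

```python
def digit_sum(text):
    """Add up the digit characters in text using only a while loop."""
    total = 0
    i = 0
    while i < len(text):
        # Convert one character at a time; skip anything that is not a digit.
        if text[i].isdigit():
            total = total + int(text[i])
        i = i + 1  # without this increment the loop would never finish
    return total


print(digit_sum("1234"))
```

In a script you would call it as digit_sum(sys.argv[1]) after importing sys.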

PyCharm shows "PEP8: expected 2 blank lines, found 1"

Consider the following code:
def add_function(a, b):
    c = str(a) + b
    print "c is %s" % c

def add_int_function(c, d):
    e = c + d
    print "the vaule of e is %d" % e

if __name__ =="__main__":
    add_function(59906, 'kugrt5')
    add_int_function(1, 2)
It always shows me "expected 2 blank lines, found 1" at add_int_function, but not at add_function.
When I add two spaces in front of def add_int_function(c, d): I get an error at the end of add_function saying that the unindent does not match any outer indentation level.
Just add another blank line between your function definitions, so that there are two blank lines instead of one.
This is a pretty common question within the Python community. PEP 8, the Python style guide, says that top-level function and class definitions should be surrounded by two blank lines. As such:
def yadayada():
    print("two lines between the functions")


def secondyadayada():
    print("this is the proper formatting")
So, you should never do it like:
def yadayada():
    print("two lines between the functions")

def secondyadayada():
    print("this is the proper formatting")
Or else PyCharm will throw that error at you.
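To make the rule concrete, here is a rough sketch (Python 3) that reports top-level def and class lines with fewer than two blank lines above them. It is only an approximation for illustration, not how PyCharm or any PEP 8 checker actually implements the check: defs_missing_blank_lines is a hypothetical name, and decorators or comments directly above a definition are not handled.

```python
def defs_missing_blank_lines(source):
    """Return (line_number, blank_lines_found) for each top-level def/class
    that has fewer than two blank lines above it."""
    lines = source.splitlines()
    problems = []
    for index, line in enumerate(lines):
        if not line.startswith(("def ", "class ")):
            continue
        # Count the consecutive blank lines directly above this definition.
        blanks = 0
        cursor = index - 1
        while cursor >= 0 and lines[cursor].strip() == "":
            blanks += 1
            cursor -= 1
        # cursor < 0 means nothing precedes the definition, so it is fine.
        if cursor >= 0 and blanks < 2:
            problems.append((index + 1, blanks))
    return problems


ONE_BLANK = (
    "def add_function(a, b):\n"
    "    return str(a) + b\n"
    "\n"
    "def add_int_function(c, d):\n"
    "    return c + d\n"
)
TWO_BLANKS = ONE_BLANK.replace("\n\ndef", "\n\n\ndef")

print(defs_missing_blank_lines(ONE_BLANK))
print(defs_missing_blank_lines(TWO_BLANKS))
```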
Further clarification on @kennet-celeste's and @shreyshrey's answers:
Each top-level function or class definition needs 2 blank lines above it and 2 blank lines below it, unless the function is the last item in the script, in which case the file should simply end with a single newline. So:
# some code followed by 2 blank lines


def function1():
    pass


def function2():
    pass


def function3():
    pass
For people who wonder why it requires two blank lines: if you were to write this in another language, it would be:
fun doSth() {
    print()
}

fun doSth1() {
    print()
}
But if you were to delete all the curly braces from the code, you would see two blank lines between the methods:
fun doSth()
    print()
#
#
fun doSth1()
    print()
#

python 2.7 - trying to print a string and the (printed) output of function in the same line

I have the following function defined:
def displayHand(hand):
    """
    Displays the letters currently in the hand.
    For example:
    >>> displayHand({'a':1, 'x':2, 'l':3, 'e':1})
    Should print out something like:
    a x x l l l e
    The order of the letters is unimportant.
    hand: dictionary (string -> int)
    """
    for letter in hand.keys():
        for j in range(hand[letter]):
            print letter, # print all on the same line
    print '' # print an empty line
Now, I want to print the following:
Current hand: a b c
To do this, I try to do:
print "Current hand: ", displayHand({'a':1, 'b':1, 'c':1})
And I get:
Current hand: a b c
None
I know that None is printed because I am printing the result of displayHand(hand), which doesn't return anything.
Is there any way to get rid of that "None" without modifying displayHand(hand)?
If you want to use your function in a print statement, it should return a string rather than print something itself (and return None), just as you would do in the __str__ method of a class. Something like:
def displayHand(hand):
    ret = ''
    for letter in hand.keys():
        for j in range(hand[letter]):
            ret += '{} '.format(letter) # append each letter followed by a space
    # ret += '\n'
    return ret
or even
def displayHand(hand):
    return ''.join(n*'{} '.format(k) for k,n in hand.items() )
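For comparison, here is a Python 3 sketch of the same return-a-string idea. format_hand is a hypothetical name rather than part of the original assignment, and joining a list avoids the trailing space that the versions above leave at the end of the string.

```python
def format_hand(hand):
    """Return the letters in hand as one space-separated string."""
    letters = []
    for letter, count in hand.items():
        letters.extend([letter] * count)  # repeat each letter count times
    return " ".join(letters)


# The caller decides how to print the result.
print("Current hand:", format_hand({'a': 1, 'b': 1, 'c': 1}))
```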
When you end a print statement with a trailing comma, the next print continues on the same line, so you can just make the two calls on separate lines, as in:
def printStuff():
    print "Current hand: ",
    displayHand({'a':1, 'b':1, 'c':1})
Of course you could just adapt this and create a method like:
def printCurrentHand(hand):
    print "Current hand: ",
    displayHand(hand)
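In Python 3, where the trailing-comma trick is not available, another way to leave the printing function untouched is to capture what it prints. This is only a sketch: display_hand below is a stand-in for the unmodified function, and captured_output is a hypothetical helper built on io.StringIO and contextlib.redirect_stdout from the standard library.

```python
import io
from contextlib import redirect_stdout


def display_hand(hand):
    """Stand-in for the unmodified function: prints the letters, returns None."""
    for letter, count in hand.items():
        for _ in range(count):
            print(letter, end=" ")
    print()


def captured_output(func, *args):
    """Call func(*args) and return whatever it printed, stripped of whitespace."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue().strip()


print("Current hand:", captured_output(display_hand, {'a': 1, 'b': 1, 'c': 1}))
```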
The only way to do this (or I believe the only way to do this) is to use return instead of print in your displayHand() function. Sorry if I didn't answer your question.
Your function 'displayHand' does not have to print the output,
it has to return a string.
def displayHand(hand):
    mystring=''
    for letter in hand.keys():
        for j in range(hand[letter]):
            mystring+= letter # concatenate all on the same line
    return mystring
Note, however, that the order returned by .keys() is not guaranteed to match the order of the input (a/b/c), so check the documentation for dict.keys().

Python Help - Learn Python The Hard Way exercise 41 "phrase = PHRASES[snippet]"

I am learning Python, which is my first programming language, and I am a little confused about one line of the code. The full code can also be found at http://learnpythonthehardway.org/book/ex41.html
import random
from urllib import urlopen
import sys
WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []
PHRASES = {
    "class %%%(%%%):":
      "Make a class named %%% that is-a %%%.",
    "class %%%(object):\n\tdef __init__(self, ***)" :
      "class %%% has-a __init__ that takes self and *** parameters.",
    "class %%%(object):\n\tdef ***(self, ###)":
      "class %%% has-a function named *** that takes self and ### parameters.",
    "*** = %%%()":
      "Set *** to an instance of class %%%.",
    "***.***(###)":
      "From *** get the *** function, and call it with parameters self, ###.",
    "***.*** = '***'":
      "From *** get the *** attribute and set it to '***'."
}
# do they want to drill phrases first
PHRASE_FIRST = False
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True
# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(word.strip())
def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("%%%"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []
    for i in range (0, snippet.count("###")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(random.sample(WORDS, param_count)))
    for sentence in snippet, phrase:
        result = sentence[:]
        # fake class names: these fill the %%% placeholders
        for word in class_names:
            result = result.replace("%%%", word, 1)
        # fake other names
        for word in other_names:
            result = result.replace("***", word, 1)
        # fake parameter lists
        for word in param_names:
            result = result.replace("###", word, 1)
        results.append(result)
    return results
# keep going until they hit CTRL-D
try:
    while True:
        snippets = PHRASES.keys()
        # returns a randomly shuffled dictionary keys list
        random.shuffle(snippets)
        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question
            print question
            raw_input("> ")
            print "ANSWER: %s\n\n" % answer
except EOFError:
    print "\nBye"
It is the line phrase = PHRASES[snippet], near the bottom of the code, that I don't quite understand. Since snippet in for snippet in snippets: is looping through the shuffled list of keys of the PHRASES dictionary, why can't the code simply be phrase = snippet? Thanks in advance for any help.
Cheers - Darren
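PHRASES[snippet] gets the value stored under the key snippet in the dictionary. The key (snippet) is the code template and the value (phrase) is its English description, so phrase = snippet would just give you the code template twice.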
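A small sketch (Python 3) of the difference, using two entries copied from the PHRASES dictionary in the question; the pairs list is only there for illustration:

```python
PHRASES = {
    "class %%%(%%%):": "Make a class named %%% that is-a %%%.",
    "*** = %%%()": "Set *** to an instance of class %%%.",
}

pairs = []
for snippet in PHRASES:          # iterating a dict yields its keys
    phrase = PHRASES[snippet]    # look up the value stored under that key
    pairs.append((snippet, phrase))

for snippet, phrase in pairs:
    print("key:  ", snippet)
    print("value:", phrase)
```

With phrase = snippet, both names would refer to the key, and the English description would never be used.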