How to convert command line argument into string - python-2.7

I am practising questions for an exam that I will have in two weeks and whenever I try to attempt this question I become lost. I have tried putting args into int(args) but get "ValueError: invalid literal for int() with base 10:".
I am not allowed to use any for loops or any functions that would make this task simple.
import sys
args = sys.argv[1]
total = 0
i = 0
while i < len(args):
    total = total + args[i]
print total

You could use the "join" option.
import sys
' '.join(sys.argv[1:])
This joins your arguments with a single space between them.

If you do this:
import sys
print " ".join(sys.argv[1:]) # skip the program's name, which is given as argv[0]
it will print all your arguments separated by a single " ".
Example:
python yourScriptName.py one two three four
will print
one two three four
To sum up your "numeric" command line params, you can use this:
import sys
def floatOrZero(tmp):
    f = 0.0
    try:
        f = float(tmp) # make a float; lots of things are floats, e.g. 1.3e9
    except ValueError:
        f = 0.0 # this happens for non-floats
    return f
# sum all convertible parameters and print the result,
# using a list comprehension to convert args (strings) into
# floats or 0.0 if not convertible; argv[0] (the script name) is skipped
print "Sum of numeric entries: " , sum([floatOrZero(num) for num in sys.argv[1:]])
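If the exercise is to add up the digits of the first argument without a for loop (an assumption, since the question does not spell the task out), a minimal sketch could look like the following. The name digit_sum is made up for illustration. The loop in the question fails for two reasons: it never increments i, and it adds a one-character string to an int.

```python
def digit_sum(text):
    """Add up the decimal digits in text using only a while loop."""
    total = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if '0' <= ch <= '9':   # skip anything that is not a decimal digit
            total += int(ch)   # convert one character at a time
        i += 1                 # advance, otherwise the loop never ends
    return total
```

In a script this would be called as digit_sum(sys.argv[1]); the argument is already a string, so no conversion is needed before indexing it.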

python 2.7 - trying to print a string and the (printed) output of function in the same line

I have the following function defined:
def displayHand(hand):
    """
    Displays the letters currently in the hand.
    For example:
    >>> displayHand({'a':1, 'x':2, 'l':3, 'e':1})
    Should print out something like:
    a x x l l l e
    The order of the letters is unimportant.
    hand: dictionary (string -> int)
    """
    for letter in hand.keys():
        for j in range(hand[letter]):
            print letter, # print all on the same line
    print '' # print an empty line
Now, I want to print the following:
Current hand: a b c
To do this, I try to do:
print "Current hand: ", displayHand({'a':1, 'b':1, 'c':1})
And I get:
Current hand: a b c
None
I know that None is printed because I am calling print on the result of displayHand(hand), which doesn't return anything.
Is there any way to get rid of that "None" without modifying displayHand(hand)?
If you want to use your function in a print statement, it should return a string rather than print something itself (and return None), as you would do in a __str__ method of a class. Something like:
def displayHand(hand):
    ret = ''
    for letter in hand.keys():
        for j in range(hand[letter]):
            ret += '{} '.format(letter) # append all on the same line
    # ret += '\n'
    return ret
or even
def displayHand(hand):
    return ''.join(n*'{} '.format(k) for k,n in hand.items() )
When you end a print statement with a trailing comma, the next print continues on the same line, so you can simply call the two things on separate lines, as in:
def printStuff():
    print "Current hand: ",
    displayHand({'a':1, 'b':1, 'c':1})
Of course you could just adapt this and create a function like:
def printCurrentHand(hand):
    print "Current hand: ",
    displayHand(hand)
The only way to do this (or I believe the only way to do this) is to use return instead of print in your displayHand() function. Sorry if I didn't answer your question.
Your function 'displayHand' does not have to print the output;
it has to return a string.
def displayHand(hand):
    mystring=''
    for letter in hand.keys():
        for j in range(hand[letter]):
            mystring+= letter # concatenate all on the same line
    return mystring
Note, however, that you should check the documentation for '.keys', as the order of the input (a/b/c) may not be preserved.
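The "return a string" approach from the answers above can be sketched as follows. The helper name hand_to_string is hypothetical, and sorting the keys is an assumption made only to get a predictable order; the question itself says the order is unimportant.

```python
def hand_to_string(hand):
    """Return the letters of hand as one space-separated string."""
    letters = []
    for letter in sorted(hand):                  # sorted only for a stable order
        letters.extend([letter] * hand[letter])  # repeat each letter by its count
    return ' '.join(letters)

# Building the whole line first avoids the stray None.
line = "Current hand: " + hand_to_string({'a': 1, 'b': 1, 'c': 1})
```

Because the function returns its text instead of printing it, the caller decides where the line goes, and nothing is printed as a side effect.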

regex for detecting subtitle errors

I'm having some issues with subtitles and need a way to detect specific errors. I think regular expressions would help, but I need help figuring this one out. In this example of an SRT-formatted subtitle, line #13 ends at 00:01:10,130 and line #14 begins at 00:01:10,129.
13
00:01:05,549 --> 00:01:10,130
some text here.
14
00:01:10,129 --> 00:01:14,109
some other text here.
The problem is that the next line can't begin before the current one is over; the embedding algorithm doesn't work when that happens. I need to check my SRT files and correct this, but looking for it manually in about 20 videos, each an hour long, just isn't an option. Especially since I need it 'yesterday' (:
Format for SRT subtitles is very specific:
XX
START --> END
TEXT
EMPTY LINE
[line number (digits)][new line character]
[start and end times in 00:00:00,000 format, separated by _space__minusSign__minusSign__greaterThanSign__space_][new line character]
[text - can be any character - letter, digit, punctuation sign.. pretty much anything][new line character]
[new line character]
I need to check if the END time is greater than the START time of the following subtitle. Help would be appreciated.
PS. I can work with Notepad++, Eclipse (Aptana), python or javascript...
Regular expressions can be used to achieve what you want; that being said, they can't do it on their own. Regular expressions match patterns, not numerical ranges.
If I were you, I would do the following:
Parse the file and place the start-end times in one data structure (call it DS_A) and the text in another (call it DS_B).
Sort DS_A in ascending order. This should guarantee that you will not have overlapping ranges. (This previous SO post should point you in the right direction).
Iterate over both and write the following to your file: j, then DS_A[i] --> DS_A[i + 1], then <newline>, then DS_B[j], where i is a loop counter for DS_A and j is a loop counter for DS_B.
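The detection half of this can be sketched as below: a regular expression pulls out the timing lines, and plain integer comparison does the range check that a regex cannot. The names TIMING, to_ms and find_overlaps are made up for illustration, and the sketch assumes timing lines follow the HH:MM:SS,mmm --> HH:MM:SS,mmm layout described in the question.

```python
import re

# One SRT timing line: start and end, each as HH:MM:SS,mmm
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)

def to_ms(hours, minutes, seconds, millis):
    """Convert one timestamp to milliseconds so it can be compared."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)

def find_overlaps(srt_text):
    """Return the timing lines that start before the previous subtitle ends."""
    overlaps = []
    previous_end = 0
    for match in TIMING.finditer(srt_text):
        parts = match.groups()
        start = to_ms(*parts[:4])
        end = to_ms(*parts[4:])
        if start < previous_end:
            overlaps.append(match.group(0))
        previous_end = end
    return overlaps
```

Run on the two subtitles from the question, this reports the timing line of subtitle 14, since 00:01:10,129 is earlier than the previous end time 00:01:10,130.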
I ended up writing a short script to fix this. Here it is:
# -*- coding: utf-8 -*-
from datetime import datetime
import getopt, re, sys
count = 0
def fix_srt(inputfile):
    global count
    parsed_file, errors_file = '', ''
    try:
        with open( inputfile , 'r') as f:
            srt_file = f.read()
        parsed_file, errors_file = parse_srt(srt_file)
    except:
        pass
    finally:
        outputfile1 = ''.join( inputfile.split('.')[:-1] ) + '_fixed.srt'
        outputfile2 = ''.join( inputfile.split('.')[:-1] ) + '_error.srt'
        with open( outputfile1 , 'w') as f:
            f.write(parsed_file)
        with open( outputfile2 , 'w') as f:
            f.write(errors_file)
        print 'Detected %s errors in "%s". Fixed file saved as "%s" (Errors only as "%s").' % ( count, inputfile, outputfile1, outputfile2 )
previous_end_time = datetime.strptime("00:00:00,000", "%H:%M:%S,%f")
def parse_times(times):
    global previous_end_time
    global count
    _error = False
    _times = []
    for time_code in times:
        t = datetime.strptime(time_code, "%H:%M:%S,%f")
        _times.append(t)
    if _times[0] < previous_end_time:
        _times[0] = previous_end_time
        count += 1
        _error = True
    previous_end_time = _times[1]
    _times[0] = _times[0].strftime("%H:%M:%S,%f")[:12]
    _times[1] = _times[1].strftime("%H:%M:%S,%f")[:12]
    return _times, _error
def parse_srt(srt_file):
    parsed_srt = []
    parsed_err = []
    for srt_group in re.sub('\r\n', '\n', srt_file).split('\n\n'):
        lines = srt_group.split('\n')
        if len(lines) >= 3:
            times = lines[1].split(' --> ')
            correct_times, error = parse_times(times)
            if error:
                clean_text = map( lambda x: x.strip(' '), lines[2:] )
                srt_group = lines[0].strip(' ') + '\n' + ' --> '.join( correct_times ) + '\n' + '\n'.join( clean_text )
                parsed_err.append( srt_group )
            parsed_srt.append( srt_group )
    return '\r\n'.join( parsed_srt ), '\r\n'.join( parsed_err )
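def main(argv):
    inputfile = None
    try:
        options, arguments = getopt.getopt(argv, "hi:", ["input="])
    except getopt.GetoptError:
        print 'Usage: test.py -i <input file>'
        sys.exit(2) # options is undefined at this point, so stop here
    for o, a in options:
        if o == '-h':
            print 'Usage: test.py -i <input file>'
            sys.exit()
        elif o in ['-i', '--input']:
            inputfile = a
    if inputfile is None: # no -i given, nothing to fix
        print 'Usage: test.py -i <input file>'
        sys.exit(2)
    fix_srt(inputfile)
if __name__ == '__main__':
    main( sys.argv[1:] )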
If someone needs it, save the code as srtfix.py, for example, and use it from the command line:
python srtfix.py -i "my srt subtitle.srt"
I was lazy and used the datetime module to process timecodes, so I'm not sure the script will work for subtitles longer than 24h (: I'm also not sure when milliseconds were added to Python's datetime module; I'm using version 2.7.5, so it's possible the script won't work on earlier versions because of this...

Need help counting only certain words (Python) [duplicate]

I have this code, where I am trying to count the number of:
Lines of code in a .py script
for_loops ("for ")
while_loops ("while ")
if_statements ("if ")
function definitions ("def ")
multiplication signs ("*")
division signs ("/")
addition signs ("+")
subtraction signs ("-")
The code works for the mathematical signs, but when it looks for if statements it returns 2 when there is only one. That is the main problem, and it makes me think I have written the for loop incorrectly, which could cause more problems later. I am also not sure how to print the Author line, which comes up as [] instead of the name of the author.
The code:
from collections import Counter
FOR_=0
WHILE_=0
IF_=0
DEF_=0
x =input("Enter file or directory: ")
print ("Enter file or directory: {0}".format(x))
print ("Filename {0:>20}".format(x))
b= open(x)
c=b.readlines()
d=b.readlines(2)
print ("Author {0:<18}".format(d))
print ("lines_of_code {0:>8}".format((len (c))))
counter = Counter(str(c))
for line in c:
    if ("for ") in line:
        FOR_+=1
print ("for_loops {0:>12}".format((FOR_)))
for line in c:
    if ("while ") in line:
        WHILE_+=1
print ("while_loops {0:>10}".format((WHILE_)))
for line in c:
    if ("if ") in line:
        IF_+=1
a=IF_
print ("if_statements {0:>8}".format((a)))
for line in c:
    if ("def ") in line:
        DEF_+=1
print ("function_definitions {0}".format((DEF_)))
print ("multiplications {0:>6}".format((counter['*'])))
print ("divisions {0:>12}".format((counter['/'])))
print ("additions {0:>12}".format((counter['+'])))
print ("subtractions {0:>9}".format((counter['-'])))
The file being read from:
'''Dumbo
Author: Hector McTavish'''
for for for # Should count as 1 for statement
while_im_alive # Shouldn't count as a while
while blah # But this one should
if defined # Should be an if but not a def
def if # Should be a def but not an if
x = (2 * 3) + 4 * 2 * 7 / 1 - 2 # Various operators
Any help would be much appreciated
Instead of treating the source code as a string, use the ast module to parse it and then just walk through the nodes:
import ast
from collections import Counter
tree = ast.parse('''
"""
Author: Nobody
"""
def foo(*args, **kwargs):
    for i in range(10):
        if i != 2**2:
            print(i * 2 * 3 * 2)
def bar():
    pass
''')
counts = Counter(node.__class__ for node in ast.walk(tree))
print('The docstring says:', repr(ast.get_docstring(tree)))
print('You have', counts[ast.Mult], 'multiplication signs.')
print('You have', counts[ast.FunctionDef], 'function definitions.')
print('You have', counts[ast.If], 'if statements.')
It's pretty straightforward and handles all of your corner cases:
The docstring says: 'Author: Nobody'
You have 3 multiplication signs.
You have 2 function definitions.
You have 1 if statements.
The test if ("if ") in line also matches the line def if #..., because "if " appears in the middle of it. That is why the count comes out as 2.
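If the sample file has to be handled as plain text, one way to avoid the double count is to match the keyword only at the start of a line. This is a rough sketch rather than a real parser; the function name count_statements is hypothetical, and it assumes each statement begins its own line.

```python
import re

def count_statements(lines, keyword):
    """Count lines whose first token is keyword."""
    # ^\s* allows indentation; \b stops "while" from matching "while_im_alive"
    pattern = re.compile(r"^\s*" + re.escape(keyword) + r"\b")
    return sum(1 for line in lines if pattern.match(line))

sample = [
    "for for for # Should count as 1 for statement\n",
    "while_im_alive # Shouldn't count as a while\n",
    "while blah # But this one should\n",
    "if defined # Should be an if but not a def\n",
    "def if # Should be a def but not an if\n",
]
if_count = count_statements(sample, "if")
```

With the lines from the question, each of the four keywords is counted once, because "def if" starts with def and "while_im_alive" has no word boundary after while.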

removing punctuation then counting the no of every word occurance using python

Hello everybody, I am new to Python and need to write a program that eliminates punctuation and then counts the number of words in a string. So far I have this:
import sys
import string
def removepun(txt):
    for punct in string.punctuation:
        txt = txt.replace(punct,"")
    print txt
    mywords = {}
    for i in range(len(txt)):
        item = txt[i]
        count = txt.count(item)
        mywords[item] = count
    return sorted(mywords.items(), key = lambda item: item[1], reverse=True)
The problem is that it returns letters and their counts, not words as I hoped. Can you help me with this?
How about this?
>>> import string
>>> from collections import Counter
>>> s = 'One, two; three! four: five. six##$,.!'
>>> occurrence = Counter(s.translate(None, string.punctuation).split())
>>> print occurrence
Counter({'six': 1, 'three': 1, 'two': 1, 'four': 1, 'five': 1, 'One': 1})
After removing the punctuation:
numberOfWords = len(txt.split(" "))
This assumes one space between words.
EDIT:
a={}
for w in txt.split(" "):
    if w in a:
        a[w] += 1
    else:
        a[w] = 1
How it works:
a is set to be a dict
the words in txt are iterated
if there is already an entry a[w] in the dict, add one to it
if there is no entry, set one up, initialized to 1
The output is the same as Haidro's excellent answer: a dict with keys of the words and values of the count of each word.
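Putting the two steps together (strip punctuation, then count whole words), a minimal sketch could look like this. The function name count_words is made up, and the character filter is used here instead of s.translate(None, ...) so the same code also runs on Python 3, where that two-argument form is not available for str.

```python
import string
from collections import Counter

def count_words(text):
    """Remove punctuation, then count how often each word occurs."""
    cleaned = ''.join(ch for ch in text if ch not in string.punctuation)
    # split() with no argument also copes with repeated spaces
    return Counter(cleaned.split())

counts = count_words('One, two; two! three##$')
```

The key difference from the code in the question is the split(): iterating over a string yields characters, while iterating over txt.split() yields words.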

NZEC in python on spoj for AP2

I wrote the following two programs:
FCTRL2.py
import sys;
def fact(x):
    res = 1
    for i in range (1,x+1):
        res=res*i
    return res;
t = int(raw_input());
for i in range (0,t):
    print fact(int(raw_input()));
and
AP2.py
import sys;
t = int(raw_input());
for i in range (0,t):
    x,y,z = map(int,sys.stdin.readline().split())
    n = (2*z)/(x+y)
    d = (y-x)/(n-5)
    a = x-(2*d)
    print n
    for j in range(0,n):
        sys.stdout.write(a+j*d)
        sys.stdout.write(' ')
    print' '
FCTRL2.py is accepted on SPOJ, whereas AP2.py gives an NZEC error. Both work fine on my machine, and I do not see much difference between them with regard to returning values. Please explain what the difference is and how I can avoid the NZEC error for AP2.py.
There may be extra white spaces in the input. A good problem setter would ensure that the input satisfies the specified format. But since spoj allows almost anyone to add problems, issues like this sometimes arise. One way to mitigate white space issues is to read the input at once, and then tokenize it.
import sys; # Why use ';'? It's so non-pythonic.
inp = sys.stdin.read().split() # Take whitespaces as delimiter
t = int(inp[0])
readAt = 1
for i in range (0,t):
    x,y,z = map(int,inp[readAt:readAt+3]) # Read the next three elements
    n = (2*z)/(x+y)
    d = (y-x)/(n-5)
    a = x-(2*d)
    print n
    #for j in range(0,n):
    #    sys.stdout.write(a+j*d)
    #    sys.stdout.write(' ')
    #print ' '
    print ' '.join([str(a+ti*d) for ti in xrange(n)]) # More compact and faster
    readAt += 3 # Increment the index from which to start the next read
The n used in line 10 can be a float (under Python 3, / is true division), but the range function expects an integer. Hence the program exits with an exception.
I tested this on Windows with values:
>ap2.py
23
4 7 9
1.6363636363636365
Traceback (most recent call last):
File "C:\martin\ap2.py", line 10, in <module>
for j in range(0,n):
TypeError: 'float' object cannot be interpreted as an integer
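One way to sidestep the float issue is to use floor division, which yields an int for int operands on both Python 2 and Python 3. The sketch below keeps the formulas from AP2.py unchanged; the function name ap_series is hypothetical, and it assumes the input describes a valid series so that the divisions are exact and n is not 5.

```python
def ap_series(x, y, z):
    """Return (n, terms) using the same formulas as AP2.py."""
    n = (2 * z) // (x + y)   # number of terms; // keeps this an int
    d = (y - x) // (n - 5)   # common difference (assumes n != 5)
    a = x - 2 * d            # first term
    return n, [a + j * d for j in range(n)]

n, terms = ap_series(3, 8, 55)
line = ' '.join(str(term) for term in terms)
```

For the input 3 8 55 this gives n = 10 and the terms 1 through 10. Converting each term with str() before joining also avoids passing an int to sys.stdout.write, which only accepts strings.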