Python - write filename / filepath into a textfile - python-2.7

I have some filenames defined in a Python script with tkinter, which look like this:
def save_file(self):
    self.filename = tkFileDialog.asksaveasfilename(title="Save...", filetypes=([("Excel Workbook","*.xlsx")]))
At the end the filename of these variables should be written out into a text file:
text_out = open('output.txt', 'w')
text_out.write("The first filename is " + self.filename + " + '\n')
text_out.write("The first pathname is " + self.filelocation + '\n')
text_out.close()
But it doesn't work. Has anyone any ideas? I have also tried it with:
text_out.write("The first filename is " + str(self.filename) + " + '\n')
but without the expected result.

I tried it from the command line and got this error message:
SyntaxError: EOL while scanning string literal
This error comes from the stray double quote before + '\n' in the first write call: it opens a string literal that is never closed before the end of the line. Removing that quote fixes the statement. Alternatively, treat the items as a list and join them, like this:
text_out = open('output.txt', 'w')
filewrite = ''.join([
'The first filename is ', self.filename, '\n',
'The first pathname is ', self.filelocation, '\n'])
text_out.write(filewrite)
text_out.close()
I got this idea from the Python Cookbook, and it works for me.
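For comparison, here is a minimal sketch of the original approach with the stray quote removed. It assumes that the pathname the question calls self.filelocation is simply the directory part of the chosen filename; write_file_info is a hypothetical helper name used only for illustration.

```python
import os


def write_file_info(filename, out_path):
    """Write the full file name and its directory to a text file."""
    # Assumption: the "pathname" is the directory part of the full path.
    location = os.path.dirname(filename)
    # The with-block closes the file even if a write fails.
    with open(out_path, 'w') as text_out:
        text_out.write("The first filename is " + filename + '\n')
        text_out.write("The first pathname is " + location + '\n')
```

Inside the tkinter class this could be called as write_file_info(self.filename, 'output.txt') once the save dialog has returned.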

Related

UTF-8 to EBCDIC using iconv in Python-script on USS

I am trying to convert utf-8 files from a directory listing on USS into ebcdic files BEFORE getting them into z/OS datasets.
Using a helper function which I found on stackoverflow (thanks for this!) I can issue shell-commands from within the python script:
def r(cmd_line):
    return Popen(cmd_line.split(), stdout=PIPE).communicate()[0]
With this I can allocate and populate mainframe datasets from USS-files, using
r("tso alloc DSNAME(...) etc.") # to allocate a mainframe DS and
r("tso oget ...") # to populate the mainframe DS
However: some files need to be converted first, which in a shellscript I would simply code with
iconv -f UTF-8 -t IBM-1141 $utf8_file > $ebcdic_file
and I am totally at a loss as to how to do this in Python (2.7).
Can't ask anybody in my shop since python was newly installed and I am currently the only one interested in it. Anyone an idea? Thanks a lot in advance!
Although not in the true spirit of python, you can do what you want by wrapping USS commands in a python script. Here is an example:
#!/bin/env python
import os
import subprocess
def r(cmd):
    # run a command (given as a list) and return what it wrote to stdout
    return subprocess.Popen(cmd, stdout=subprocess.PIPE).communicate()[0]
def allocate_dataset(dsName):
    name = "'" + dsName + "'"
    out = r(['/bin/tso', 'alloc', 'ds(' + name + ')', 'space(6000 2000)', 'track', 'lrecl(80)',
             'dsntype(library)', 'blksize(3200)', 'recfm(f b)', 'dir(2)', 'new'])
    for line in out.split():
        print line
def not_allocated(dsName):
    name = "'" + dsName + "'"
    out = r(['/bin/tsocmd', 'listds ' + name])
    # the data set is missing if the listds output contains this message
    return "NOT IN CATALOG" in out
def ascii_to_ebcdic(from_codepage, to_codepage, fileName):
    # writes the converted copy to ebcdic_<fileName>
    os.system('iconv -f' + from_codepage + ' -t' + to_codepage + ' <' + fileName + ' >ebcdic_' + fileName)
def copy_to_dataset(fileName, dsName, memberName):
    dsn = "//'" + dsName + '(' + memberName + ")'"
    os.system('cp -T ' + fileName + ' "' + dsn + '"')
def main():
    dsName = "HLQ.MY.PYTHON"
    if not_allocated(dsName):
        print("Allocating '" + dsName + "' data set")
        allocate_dataset(dsName)
    ascii_to_ebcdic("UTF-8", "IBM-1047", "test.txt")
    copy_to_dataset("ebcdic_test.txt", "HLQ.MY.PYTHON", "TXT")
    member = "//'HLQ.MY.PYTHON(TXT)'"
    os.system('cat -v "' + member + '"')
main()
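If you would rather not shell out to iconv, a pure-Python conversion might look roughly like this. It is only a sketch: convert_encoding is a hypothetical helper, and cp1140 is used as a stand-in target because, as far as I know, the standard CPython codecs do not include IBM-1141. The Python build on your system may offer more EBCDIC codecs, and line-end handling may differ from what iconv produces, so compare the result with iconv on a sample file before relying on it.

```python
import io


def convert_encoding(src_path, dst_path, from_codec="utf-8", to_codec="cp1140"):
    """Re-encode a text file: read it as from_codec, write it as to_codec."""
    # newline="" switches off newline translation in both directions.
    with io.open(src_path, "r", encoding=from_codec, newline="") as src:
        text = src.read()
    # Assumption: to_codec names a codec that this Python installation provides.
    with io.open(dst_path, "w", encoding=to_codec, newline="") as dst:
        dst.write(text)
```

A call such as convert_encoding("test.txt", "ebcdic_test.txt") would then take the place of the ascii_to_ebcdic step.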

Number stored as text when converted CSV to Xlsx

I wrote code to convert/copy a CSV file into an Xlsx file. It copied the data successfully, but all the data is stored as text.
Now Excel shows an exclamation mark on each cell, with the warning "Number stored as text".
Can anyone please help me get the data stored as numbers? I want to do calculations on the data.
Here is the code:
import csv
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
with open(plotDir + '\\' + file, 'r') as f:
    for row in csv.reader(f):
        ws.append(row)
wb.save(plotDir + '\\' + file[:-4] + '.xlsx')
You can use set_explicit_value:
with open(plotDir + '\\' + file, 'r') as f:
    for x, row in enumerate(csv.reader(f), start=1):
        for y, val in enumerate(row, start=1):
            ws.cell(row=x, column=y).set_explicit_value(value=val, data_type='n') # Cell.TYPE_NUMERIC
wb.save(plotDir + '\\' + file[:-4] + '.xlsx')
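Another option is to convert each value to a real Python number before it reaches the worksheet, so that text cells stay text and numeric cells become numbers. The sketch below is plain Python; to_number and convert_row are hypothetical helper names, and the conversion rule (try int, then float, otherwise keep the string) is an assumption about what the CSV contains.

```python
def to_number(text):
    """Return an int or float if text looks numeric, else the original string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return text


def convert_row(row):
    """Convert every cell of one CSV row with to_number."""
    return [to_number(cell) for cell in row]
```

In the loop from the question this would be used as ws.append(convert_row(row)) instead of ws.append(row).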

How do I dynamically search for text in a file and write to another file

I'll try to be as specific as I can. Keep in mind I just started learning this language last week so I'm not a professional. I'm trying to make a program that will read a vocabulary file that I created and write the definition for the word to another preexisting file with a different format.
Example of the two formats and what I'm trying to do here:
Word 1 - Definition
Word 1 (page 531) - Definition from other file
What I'm currently doing with it is I'm opening both files and searching a word based on user input, which isn't working. What I want to do is I want the program to go into the output file and find the word, then find the same word in the input file, get the definition only, and paste it into the output file. Then move to the next word and loop until it finds the end of file. I really don't know how to do that so I'm currently stuck. How would you python pros here on stackoverflow handle this?
Also, for those who are suspicious of my reasons for this program: I'm not trying to cheat on an assignment. I'm trying to get some of my college work done ahead of time, and I don't want to run into conflicts with my formatting being different from the teacher's. This is just to save me time so I don't have to do the same assignment twice.
Edit 1
Here is the full code pasted from my program currently.
import os
print("Welcome to the Key Terms Finder Program. What class is this for?\n[A]ccess\n[V]isual Basic")
class_input = raw_input(">>")
if class_input == "A" or class_input == "a":
    class_input = "Access"
    chapter_num = 11
elif class_input == "V" or class_input == "v":
    class_input = "Visual Basic"
    chapter_num = 13
else:
    print("Incorrect Input")
print("So the class is " + class_input)
i = 1
for i in range(1, chapter_num + 1):
    try:
        os.makedirs("../Key Terms/" + class_input + "/Chapter " + str(i) + "/")
    except WindowsError:
        pass
print("What Chapter is this for? Enter just the Chapter number. Ex: 5")
chapter_input = raw_input(">>")
ChapterFolder = "../Key Terms/" + class_input + "/Chapter " + str(chapter_input) + "/"
inputFile = open(ChapterFolder + "input.txt", "r")
outputFile = open(ChapterFolder + "output.txt", "w")
line = inputFile.readlines()
i = 0
print("Let's get down to business. Enter the word you are looking to add to the file.")
print("To stop entering words, enter QWERTY")
word_input = ""
while word_input != "QWERTY":
    word_input = raw_input(">>")
    outputArea = word_input
    linelen = len(line)
    while i < linelen:
        if line[i] == word_input:
            print("Word Found")
            break
        else:
            i = i + 1
    print(i)
    i = 0
inputFile.close()
outputFile.close()
I'm not a Python pro, but I will try to answer your question.
output = []
word = []
definition = []
with open('input.txt', 'r') as f:
    for line in f:
        if "-" not in line:
            continue  # skip lines that have no "word - definition" pair
        # split on the first hyphen and trim the whitespace around each part
        parts = line.strip().split("-", 1)
        word.append(parts[0].strip())
        definition.append(parts[1].strip())
with open('output.txt', 'r') as f:
    for line in f:
        new_line = line.strip()
        try:
            index = word.index(new_line)
            print index
            meaning = definition[index]
            print meaning
            output.append(new_line + " - " + meaning)
        except ValueError as e:
            output.append(new_line + " - meaning not found")
            print e
f = open("output.txt", "w")
f.write("\n".join(output))
f.close()
Here, input.txt is the file that contains the words and their definitions.
output.txt is the file that has only words (it was unclear to me what output.txt contained, so I assumed only words).
The code above reads each word from output.txt, looks it up in the data from input.txt, and appends the definition if one is found; otherwise it writes "meaning not found".
The assumption is that word and definition are separated by -.
Does this help?
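The same lookup can also be sketched with a dictionary, which handles the "Word 1 (page 531) - ..." format from the question. The function names are hypothetical, and the sketch assumes that word and definition are separated by " - " (with spaces) and that the page note always looks like "(page N)".

```python
import re


def load_definitions(lines):
    """Map each word to its definition from 'Word - Definition' lines."""
    definitions = {}
    for line in lines:
        word, sep, meaning = line.strip().partition(" - ")
        if sep:  # skip lines without the separator
            definitions[word.strip()] = meaning.strip()
    return definitions


def fill_definitions(lines, definitions):
    """Replace the definition part of 'Word (page N) - ...' lines."""
    filled = []
    for line in lines:
        head = line.rstrip("\n").partition(" - ")[0]
        # Drop a trailing "(page 531)" style note to get the bare word.
        word = re.sub(r"\s*\(page \d+\)\s*$", "", head)
        if word in definitions:
            filled.append(head + " - " + definitions[word])
        else:
            filled.append(line.strip())  # unknown word: keep the line as it is
    return filled
```

Both functions take any iterable of lines, so an open file object can be passed in directly, and the returned list can be written back with "\n".join(...).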

Python CSV export writing characters to new lines

I have been using multiple code snippets to create a solution that will allow me to write a list of players in a football team to a csv file.
import csv
data = []
string = input("Team Name: ")
fName = string.replace(' ', '') + ".csv"
print("When you have entered all the players, press enter.")
# while loop that will continue allowing entering of players
done = False
while not done:
    a = input("Name of player: ")
    if a == "":
        done = True
    else:
        string += a + ','
        string += input("Age: ") + ','
        string += input("Position: ")
print (string)
file = open(fName, 'w')
output = csv.writer(file)
for row in string:
    tempRow = row
    output.writerow(tempRow)
file.close()
print("Team written to file.")
I would like the exported csv file to look like this:
player1,25,striker
player2,27,midfielder
and so on. However, when I check the exported csv file it looks more like this:
p
l
a
y
e
r
,
2
5
and so on.
Does anyone have an idea of where I'm going wrong?
Many thanks
Karl
Your string is a single string. It is not a list of strings. You are expecting it to be a list of strings when you are doing this:
for row in string:
When you iterate over a string, you are iterating over its characters. Which is why you are seeing a character per line.
Declare a list of strings. And append every string to it like this:
done = False
strings_list = []
while not done:
    string = ""
    a = input("Name of player: ")
    if a == "":
        done = True
    else:
        string += a + ','
        string += input("Age: ") + ','
        string += input("Position: ") + '\n'
        strings_list.append(string)
Now iterate over this strings_list and print to the output file. Since you are putting the delimiter (comma) yourself in the string, you do not need a csv writer.
a_file = open(fName, 'w')
for row in strings_list:
    print(row)
    a_file.write(row)
a_file.close()
Note:
string is the name of a standard module in Python, so it is wise not to use it as a variable name in your program. The same goes for your variable file, which is the name of a built-in type in Python 2.
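If you would rather keep the csv module, one possible variant is to collect each player as a list or tuple of fields and let the writer add the commas. This is a sketch with made-up sample data; write_players is a hypothetical helper, and an in-memory buffer stands in for the real file.

```python
import csv
import io


def write_players(players, stream):
    """Write one CSV row per player; each player is (name, age, position)."""
    writer = csv.writer(stream, lineterminator='\n')
    for player in players:
        # writerow expects a sequence of fields, not one pre-joined string
        writer.writerow(player)


buffer = io.StringIO()
write_players([("player1", "25", "striker"), ("player2", "27", "midfielder")], buffer)
print(buffer.getvalue())
```

With a real file in Python 3, the csv documentation recommends opening it with newline='' so the writer controls the line endings, for example open(fName, 'w', newline='').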

regex for detecting subtitle errors

I'm having some issues with subtitles and need a way to detect specific errors. I think regular expressions would help, but I need help figuring this one out. In this example of an SRT-formatted subtitle, line #13 ends at 00:01:10,130 and line #14 begins at 00:01:10,129.
13
00:01:05,549 --> 00:01:10,130
some text here.
14
00:01:10,129 --> 00:01:14,109
some other text here.
The problem is that the next line can't begin before the current one is over; the embedding algorithm doesn't work when that happens. I need to check my SRT files and correct this, but looking for it manually in about 20 videos, each an hour long, just isn't an option. Especially since I need it 'yesterday' (:
Format for SRT subtitles is very specific:
XX
START --> END
TEXT
EMPTY LINE
[line number (digits)][new line character]
[start and end times in 00:00:00,000 format, separated by _space__minusSign__minusSign__greaterThanSign__space_][new line character]
[text - can be any character - letter, digit, punctuation sign.. pretty much anything][new line character]
[new line character]
I need to check if the END time is greater than the START time of the following subtitle. Help would be appreciated.
PS. I can work with Notepad++, Eclipse (Aptana), python or javascript...
Regular expressions can be used to achieve what you want; that being said, they can't do it on their own. Regular expressions match patterns, not numerical ranges.
If I were you, what I would do is the following:
Parse the file and place the start-end times in one data structure (call it DS_A) and the text in another (call it DS_B).
Sort DS_A in ascending order. This should guarantee that you will not have overlapping ranges. (This previous SO post should point you in the right direction).
Iterate over both and write the following to your file: DS_A[i] --> DS_A[i + 1] <newline> DS_B[j], where i is a loop counter for DS_A and j is a loop counter for DS_B.
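To show how the regex part and the numeric part fit together, here is a minimal sketch that only detects the overlaps (it does not sort or rewrite anything, so it is a simpler check than the steps above). The names TIMING, to_ms and find_overlaps are hypothetical, and the sketch assumes that the subtitle text itself never contains something that looks like a timing line.

```python
import re

# Matches an SRT timing line such as "00:01:05,549 --> 00:01:10,130".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3}) --> (\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert timecode fields (strings) to integer milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def find_overlaps(srt_text):
    """Return (previous_end_ms, start_ms) pairs where a cue starts too early."""
    overlaps = []
    previous_end = 0
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        start, end = to_ms(*fields[:4]), to_ms(*fields[4:])
        # The regex finds the times; the comparison is done on plain numbers.
        if start < previous_end:
            overlaps.append((previous_end, start))
        previous_end = end
    return overlaps
```

Because the times are compared as plain integers, hour values above 23 are not a problem here.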
I ended up writing a short script to fix this. Here it is:
# -*- coding: utf-8 -*-
from datetime import datetime
import getopt, re, sys
count = 0
def fix_srt(inputfile):
    global count
    parsed_file, errors_file = '', ''
    try:
        with open( inputfile , 'r') as f:
            srt_file = f.read()
            parsed_file, errors_file = parse_srt(srt_file)
    except:
        pass
    finally:
        outputfile1 = ''.join( inputfile.split('.')[:-1] ) + '_fixed.srt'
        outputfile2 = ''.join( inputfile.split('.')[:-1] ) + '_error.srt'
        with open( outputfile1 , 'w') as f:
            f.write(parsed_file)
        with open( outputfile2 , 'w') as f:
            f.write(errors_file)
        print 'Detected %s errors in "%s". Fixed file saved as "%s" (Errors only as "%s").' % ( count, inputfile, outputfile1, outputfile2 )
previous_end_time = datetime.strptime("00:00:00,000", "%H:%M:%S,%f")
def parse_times(times):
    global previous_end_time
    global count
    _error = False
    _times = []
    for time_code in times:
        t = datetime.strptime(time_code, "%H:%M:%S,%f")
        _times.append(t)
    # a subtitle that starts before the previous one ended gets moved forward
    if _times[0] < previous_end_time:
        _times[0] = previous_end_time
        count += 1
        _error = True
    previous_end_time = _times[1]
    # %f yields microseconds, so cut the string back to milliseconds
    _times[0] = _times[0].strftime("%H:%M:%S,%f")[:12]
    _times[1] = _times[1].strftime("%H:%M:%S,%f")[:12]
    return _times, _error
def parse_srt(srt_file):
    parsed_srt = []
    parsed_err = []
    for srt_group in re.sub('\r\n', '\n', srt_file).split('\n\n'):
        lines = srt_group.split('\n')
        if len(lines) >= 3:
            times = lines[1].split(' --> ')
            correct_times, error = parse_times(times)
            if error:
                clean_text = map( lambda x: x.strip(' '), lines[2:] )
                srt_group = lines[0].strip(' ') + '\n' + ' --> '.join( correct_times ) + '\n' + '\n'.join( clean_text )
                parsed_err.append( srt_group )
        parsed_srt.append( srt_group )
    return '\r\n'.join( parsed_srt ), '\r\n'.join( parsed_err )
def main(argv):
    inputfile = None
    try:
        options, arguments = getopt.getopt(argv, "hi:", ["input="])
    except getopt.GetoptError:
        print 'Usage: test.py -i <input file>'
        sys.exit(2)
    for o, a in options:
        if o == '-h':
            print 'Usage: test.py -i <input file>'
            sys.exit()
        elif o in ['-i', '--input']:
            inputfile = a
    fix_srt(inputfile)
if __name__ == '__main__':
    main( sys.argv[1:] )
If someone needs it, save the code as srtfix.py, for example, and use it from the command line:
python srtfix.py -i "my srt subtitle.srt"
I was lazy and used the datetime module to process timecodes, so I'm not sure the script will work for subtitles longer than 24h (: I'm also not sure when milliseconds were added to Python's datetime module. I'm using version 2.7.5; it's possible the script won't work on earlier versions because of this...