How to modify elements of a list in Python

Given a list of filenames, we want to rename all the files with the extension .hpp to the extension .h. To do this, we would like to generate a new list called newfilenames, consisting of the new filenames. Fill in the blanks in the code using any of the methods you've learned thus far, like a for loop or a list comprehension.
filenames = ["program.c", "stdio.hpp", "sample.hpp", "a.out", "math.hpp", "hpp.out"]
# Generate newfilenames as a list containing the new filenames
# using as many lines of code as your chosen method requires.
print(newfilenames)
# Should be ["program.c", "stdio.h", "sample.h", "a.out", "math.h", "hpp.out"]

newfilenames = [e.replace('.hpp','.h') for e in filenames]

filenames = ["program.c", "stdio.hpp", "sample.hpp", "a.out", "math.hpp", "hpp.out"]
newfilenames=[]
for filename in filenames:
    if filename.endswith(".hpp"):
        filename = filename.replace(".hpp", ".h")
        newfilenames.append(filename)
    else:
        newfilenames.append(filename)
print(newfilenames)

filenames = ["program.c", "stdio.hpp", "sample.hpp", "a.out", "math.hpp", "hpp.out"]
# Generate newfilenames as a list containing the new filenames
# using as many lines of code as your chosen method requires.
newfilenames = [word.replace("hpp","h") if word.endswith("hpp") else word for word in filenames ]
print(newfilenames)
# Should be ["program.c", "stdio.h", "sample.h", "a.out", "math.h", "hpp.out"]

filenames = ["program.c", "stdio.hpp", "sample.hpp", "a.out", "math.hpp", "hpp.out"]
newfilenames = []
for names in filenames:
    if names.endswith('.hpp'):
        newfilenames.append(names[:-2])
        continue
    newfilenames.append(names)
print(newfilenames)

filenames = ["program.c", "stdio.hpp", "sample.hpp", "a.out", "math.hpp", "hpp.out"]
newfilenames=[]
for x in filenames:
    if x.endswith(".hpp"):
        newfilenames.append(x.replace(".hpp",".h"))
    else:
        newfilenames.append(x)
print(newfilenames)

filenames = ["program.c", "stdio.hpp", "sample.hpp", "a.out", "math.hpp", "hpp.out"]
newfilenames=[]
for filename in filenames:
    if '.hpp' in filename:
        # cut the name at the first dot and append the new extension
        index=filename.index(".")
        newfile=filename[:index]+".h"
        newfilenames.append(newfile)
    else:
        newfilenames.append(filename)
print(newfilenames)

newfilenames = [ f.rstrip("pp") if f.endswith(".hpp") else f for f in filenames ]
Instead of str.replace(), I prefer using str.rstrip() and str.endswith() to take care of potential edge cases.
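To make those edge cases concrete, here is a minimal sketch (Python 3) that compares a plain replace() with an extension-aware rename. The helper name rename_hpp and the extra sample filenames are made up for illustration, and os.path.splitext is swapped in as one possible alternative that only ever looks at the final extension.

```python
import os.path

def rename_hpp(name):
    """Return name with a trailing .hpp extension changed to .h (hypothetical helper)."""
    root, ext = os.path.splitext(name)
    return root + ".h" if ext == ".hpp" else name

# Illustrative inputs, not part of the original exercise.
tricky = ["stdio.hpp", "hpp.out", "lib.hpp.bak", "app.hpp"]

# Only names whose final extension is .hpp are changed.
print([rename_hpp(n) for n in tricky])

# A plain replace() also rewrites ".hpp" in the middle of a name.
print([n.replace(".hpp", ".h") for n in tricky])
```

With these inputs the two approaches differ only for lib.hpp.bak, which replace() turns into lib.h.bak while the splitext version leaves it alone.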

newfilenames = []
for names in filenames:
    if names.endswith('.hpp'):
        newfilenames.append(names[:-2])
        continue
    newfilenames.append(names)

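def pig_latin(text):
    v = text.split()
    t = []
    for x in v:
        # move the first letter to the end of the word and add "ay"
        i = x[1:] + x[0] + 'ay'
        t.append(i)
    # join the words back into one string, as the expected output requires
    return ' '.join(t)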
print(pig_latin("hello how are you")) # Should be "ellohay owhay reaay ouyay"
print(pig_latin("programming in python is fun")) # Should be "rogrammingpay niay ythonpay siay unfay"

Related

How do I copy from keyword to keyword in Python 2.7?

How might I copy lines keyword by keyword in one file fileA to another file fileB in Python 2.7? The query should include the keywords in the output.
If your question is how to copy the part of the original file that lies between startWord and endWord, you can use the following, which is slightly modified from the answer to "Extract subset of file in bash or Python":
begin = 'BEGINSTRING'
end = 'ENDSTRING'
f = 'fileA'  # path of the input file (placeholder name)
with open(f, 'r') as input_file:
    tmp = []
    flag = False
    for line in input_file:
        if not flag and begin in line:
            flag = True
            # keep the line from the start keyword onwards
            line = line[line.find(begin):]
        if flag:
            # test for the end keyword before copying, otherwise it is never reached
            if end in line:
                index = line.find(end)
                # keep everything up to and including the end keyword
                tmp.append(line[:index + len(end)])
                break
            tmp.append(line)
with open(f + '_new', 'w') as output_file:
    for line in tmp:
        output_file.write(line)
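For files small enough to read in one go, the same idea can be sketched with str.find on the whole text (Python 3). The function name extract_between and the sample text are assumptions made for illustration; this is a sketch, not the method of the linked answer.

```python
def extract_between(text, begin, end):
    """Return the part of text from begin through end, keywords included.

    Returns an empty string if either keyword is missing.
    """
    start = text.find(begin)
    if start == -1:
        return ""
    # Look for the end keyword only after the start keyword.
    stop = text.find(end, start + len(begin))
    if stop == -1:
        return ""
    return text[start:stop + len(end)]

# Made-up sample standing in for the contents of fileA.
sample = "header\nfoo BEGINSTRING one\ntwo\nthree ENDSTRING bar\nfooter\n"
print(extract_between(sample, "BEGINSTRING", "ENDSTRING"))
```

The result could then be written to fileB with a single write call.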

extract text between a[- and -] using python

I am writing a script to extract data from a file and split the data into multiple files. The contents for each file are separated by five "#"s.
Example:
#####
hello
#####
world
#####
in this case, "hello" should be in one file and "world" should be in another file
I am using python
If I understand your requirements correctly, you want to be able to take input from a file with a delimiter of #####
#####
hello
#####
world
#####
and this would generate one file for each block between the delimiters: one containing hello and one containing world.
You can use re.split to get the splits
splits = re.split("[#]{5}\n", input_buffer)
would give something like (note: above assumes the split also includes a newline)
['', 'hello\n', 'world\n', '']
and to get only the splits with actual text (assuming that trailing new lines are to be removed)
[i.strip() for i in splits if i]
The output filename was not specified either, so I used
for index, val in enumerate([i.strip() for i in splits if i]):
    with open("output%d"%index, "w+") as f:
to create files named output0 through outputN.
import io
import re
input_text = '''#####
hello
#####
world
#####
'''
# io.StringIO stands in for a real file here (Python 3; Python 2 used the StringIO module)
string_file = io.StringIO(input_text)
input_buffer = string_file.read()
splits = re.split("[#]{5}\n", input_buffer)
for index, val in enumerate([i.strip() for i in splits if i]):
    with open("output%d"%index, "w+") as f:
        f.write(val)
This is just a helper; you can obviously use a different regular expression to split on, change the output name to something more suitable, etc.
Also, if, as the title of this question says, the text sits between [- and -], the splits could be obtained using re.findall instead:
input_text = '''[-hello-]
[-world-]
'''
string_file = io.StringIO(input_text)
input_buffer = string_file.read()
# non-greedy match so that several [-...-] groups on one line stay separate
splits = re.findall(r"\[-(.*?)-\]", input_buffer)
for index, val in enumerate(splits):
    with open("output%d"%index, "w+") as f:
        f.write(val)
This could do the trick:
with open('a.txt') as r: #open source file and assign it to variable r
    parts = r.read().split('#####') #read the contents and break it into list of elements separated by '#####'
new = [item.strip() for item in parts if item.strip()] #clean empty rows from the list
for i, item in enumerate(new): #iterate through new list and assign a number to each iteration starting with 0 (default)
    with open('a%s.txt' % (i+1), 'w') as w: #create new file for each element from the list that will be named 'a' + 'value of i + 1' + '.txt'
        w.write(item) #writing contents of current element into file
This will read your file that I called 'a.txt' and produce files named a1.txt, a2.txt ... an.txt
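The splitting and the writing can also be kept apart, which makes each step easy to check on its own. The following is a minimal sketch (Python 3); the function names split_blocks and write_blocks, the output prefix, and the use of a temporary directory are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

def split_blocks(text, delimiter="#####"):
    """Split text on the delimiter and drop blocks that are empty or whitespace."""
    return [block.strip() for block in text.split(delimiter) if block.strip()]

def write_blocks(blocks, directory, prefix="output"):
    """Write each block to <directory>/<prefix><n>.txt, numbering from 1."""
    paths = []
    for number, block in enumerate(blocks, start=1):
        path = Path(directory) / f"{prefix}{number}.txt"
        path.write_text(block)
        paths.append(path)
    return paths

sample = "#####\nhello\n#####\nworld\n#####\n"
blocks = split_blocks(sample)

# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    written = write_blocks(blocks, tmp)
    names = [p.name for p in written]
    contents = [p.read_text() for p in written]

print(blocks)
print(names)
```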

Python - Sort files based on timestamp

I have a list of file names that I want to sort by timestamp; the timestamp is built into each file name.
Note: in the file name Hello_Hi_2015-02-20T084521_1424543480.tar.gz, the part 2015-02-20T084521 is the timestamp in the form "year-month-dayTHHMMSS", and this is what I want to sort on.
Input list below:
file_list = ['Hello_Hi_2015-02-20T084521_1424543480.tar.gz',
'Hello_Hi_2015-02-20T095845_1424543481.tar.gz',
'Hello_Hi_2015-02-20T095926_1424543481.tar.gz',
'Hello_Hi_2015-02-20T100025_1424543482.tar.gz',
'Hello_Hi_2015-02-20T111631_1424543483.tar.gz',
'Hello_Hi_2015-02-20T111718_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112502_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112633_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113427_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113456_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113608_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113659_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113809_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113901_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113955_1424543485.tar.gz',
'Hello_Hi_2015-03-20T114122_1424543485.tar.gz',
'Hello_Hi_2015-02-20T114532_1424543486.tar.gz',
'Hello_Hi_2015-02-20T120045_1424543487.tar.gz',
'Hello_Hi_2015-02-20T120146_1424543487.tar.gz',
'Hello_WR_2015-02-20T084709_1424543480.tar.gz',
'Hello_WR_2015-02-20T113016_1424543486.tar.gz']
Output should be:
file_list = ['Hello_Hi_2015-02-20T084521_1424543480.tar.gz',
'Hello_WR_2015-02-20T084709_1424543480.tar.gz',
'Hello_Hi_2015-02-20T095845_1424543481.tar.gz',
'Hello_Hi_2015-02-20T095926_1424543481.tar.gz',
'Hello_Hi_2015-02-20T100025_1424543482.tar.gz',
'Hello_Hi_2015-02-20T111631_1424543483.tar.gz',
'Hello_Hi_2015-02-20T111718_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112502_1424543483.tar.gz',
'Hello_Hi_2015-02-20T112633_1424543484.tar.gz',
'Hello_WR_2015-02-20T113016_1424543486.tar.gz',
'Hello_Hi_2015-02-20T113427_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113456_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113608_1424543484.tar.gz',
'Hello_Hi_2015-02-20T113659_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113809_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113901_1424543485.tar.gz',
'Hello_Hi_2015-02-20T113955_1424543485.tar.gz',
'Hello_Hi_2015-02-20T114532_1424543486.tar.gz',
'Hello_Hi_2015-02-20T120045_1424543487.tar.gz',
'Hello_Hi_2015-02-20T120146_1424543487.tar.gz',
'Hello_Hi_2015-03-20T114122_1424543485.tar.gz']
Below is the code which I have tried.
import glob
import os
def sort( dir ):
    os.chdir( dir )
    file_list = glob.glob('Hello_*')
    file_list.sort(key=os.path.getmtime)
    print("\n".join(file_list))
    return 0
Thanks in advance!!
So this worked for me; it sorted files that did not have the timestamp in the name by their modification time:
import os
files = [file for file in os.listdir(".") if file.lower().endswith('.gz')]
for file in sorted(files, key=os.path.getmtime):
    print(file)
Would this work?
You could write the list contents to a file line by line and read the file back (open_file is the path of that file):
lines = sorted(open(open_file).readlines(), key=lambda line: line.split("_")[2])
Then you could print out lines.
Your code is trying to sort based on the filesystem-stored modified time, not the filename time.
Since your filename encoding is slightly sane :-) if you want to sort based on filename alone, you may use:
sorted(os.listdir(dir), key=lambda s: s[9:])
That will do, but only because the timestamp encoding in the filename is sane: fixed-length prefix, zero-padded, constant-width numbers, going in sequence from biggest time reference (year) to the lowest one (second).
If your prefix is not fixed, you can try something with RegExp like this (which will sort by the value after the second underscore):
import re
pat = re.compile('_.*?(_)')
sorted(os.listdir(dir), key=lambda s: s[pat.search(s).end():])
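If you would rather not depend on the position of the timestamp at all, one option is to parse it into a datetime and sort on that. This is a minimal sketch (Python 3) under the assumption that every name contains a stamp of the form year-month-dayTHHMMSS; the names STAMP and stamp_key are made up, and the three file names are taken from the question.

```python
import re
from datetime import datetime

# Matches e.g. 2015-02-20T084521 anywhere in the name.
STAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{6}")

def stamp_key(name):
    """Parse the embedded timestamp; names without one sort first."""
    match = STAMP.search(name)
    if match is None:
        return datetime.min
    return datetime.strptime(match.group(), "%Y-%m-%dT%H%M%S")

names = [
    "Hello_Hi_2015-03-20T114122_1424543485.tar.gz",
    "Hello_WR_2015-02-20T113016_1424543486.tar.gz",
    "Hello_Hi_2015-02-20T084521_1424543480.tar.gz",
]
ordered = sorted(names, key=stamp_key)
print("\n".join(ordered))
```

Parsing costs a little more than slicing, but it also rejects malformed stamps with a ValueError instead of silently sorting them as text.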

Bulk Search/replacing of filenames using python

I have:
An excel file as A1:B2.
A folder with 200 jpeg files.
I'm trying to search the folder for filenames that match a value in column A and, if found, replace the name with the value in column B, without changing the extensions of the files in the folder.
I am stuck here; I tried various scripts to do this but failed. Here's my code:
import os
import xlrd
path = r'c:\users\c_thv\desktop\x.xls'
#collect the files in folder
path1 = r'c:\users\c_thv\desktop'
data = []
for name in os.listdir(path1):
    if os.path.isfile(os.path.join(path1, name)):
        fileName, fileExtension = os.path.splitext(name)
        if fileExtension == '.py':
            data.append(fileName)
#print data
#collect the filenames for changing
book = xlrd.open_workbook(path)
sheet = book.sheet_by_index(0)
cell = sheet.cell(0,0)
cells = sheet.row_slice(rowx=0,start_colx=0,end_colx=2)
excel = []
#collect the workable data in an list
for cell in cells:
    excel.append(cell)
#print excel
#compare list for matches
for i,j in enumerate(excel):
    if j in data[:]:
        os.rename(excel[i],data[i])
Try adding a print "Match found" after if j in data[:]: just to check whether the condition is ever met. My guess is there will be no match, because the list data is full of Python filenames (if fileExtension == '.py') and you are looking for jpeg files in the excel list.
Besides, old is not defined.
EDIT:
If I understand correctly, this may help:
import os, xlrd
path = 'c:/users/c_thv/desktop' #path to jpg files
path1 = 'c:/users/c_thv/desktop/x.xls'
data =[] #list of jpg filenames in folder
#lets create a filenames list without the jpg extension
for name in os.listdir(path):
    fileName, fileExtension = os.path.splitext(name)
    if fileExtension =='.jpg':
        data.append(fileName)
#lets create a list of old filenames in the excel column a
book = xlrd.open_workbook(path1)
sheet = book.sheet_by_index(0)
oldNames =[]
for row in range(sheet.nrows):
    oldNames.append(sheet.cell_value(row,0))
#lets create a list with the new names in column b
newNames =[]
for row in range(sheet.nrows):
    newNames.append(sheet.cell_value(row,1))
#now create a dictionary with the old name in a and the corresponding new name in b
fileNames = dict(zip(oldNames,newNames))
print(fileNames)
#lastly rename your jpg files
for f in data:
    if f in fileNames:
        os.rename(os.path.join(path, f+'.jpg'), os.path.join(path, fileNames[f]+'.jpg'))
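The renaming step can be tried out without the spreadsheet by passing the old-to-new mapping in as a plain dictionary. The following is a minimal sketch (Python 3); the function name rename_by_mapping, the sample file names, and the mapping values are invented for illustration, and a temporary folder stands in for the real one.

```python
import os
import tempfile

def rename_by_mapping(folder, mapping, extension=".jpg"):
    """Rename files in folder whose base name is a key of mapping.

    The extension is kept; returns the list of (old, new) names renamed.
    """
    renamed = []
    # sorted() takes a snapshot of the listing before any file is renamed.
    for name in sorted(os.listdir(folder)):
        base, ext = os.path.splitext(name)
        if ext.lower() == extension and base in mapping:
            new_name = mapping[base] + ext
            os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
            renamed.append((name, new_name))
    return renamed

# Stand-in for the two spreadsheet columns (made-up values).
mapping = {"IMG_001": "beach", "IMG_002": "forest"}

with tempfile.TemporaryDirectory() as folder:
    for name in ("IMG_001.jpg", "IMG_002.jpg", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    renamed = rename_by_mapping(folder, mapping)
    remaining = sorted(os.listdir(folder))

print(renamed)
print(remaining)
```

The sketch does not check whether a new name already exists in the folder, so a real run would want that check before calling os.rename.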

Python: Copy several files with one column into one file with multi-column

I have the following question in Python 2.7:
I have 20 different txt-files, each with exactly one column of numbers. Now - as an output - I would like to have one file with all those columns together. How can I concatenate one-column files in Python ? I was thinking about using the fileinput module, but I fear, I have to open all my different txt files at once ?
My idea:
filenames = ['input1.txt','input2.txt',...,'input20.txt']
import fileinput
with open('/path/output.txt', 'w') as outfile:
    for line in fileinput.input(filenames):
        outfile.write(line)
Any suggestions on that ?
Thanks for any help !
A very simple (naive?) solution is
filenames = ['a.txt', 'b.txt', 'c.txt', 'd.txt']
columns = []
for filename in filenames:
    lines = []
    for line in open(filename):
        lines.append(line.strip('\n'))
    columns.append(lines)
rows = zip(*columns)
with open('output.txt', 'w') as outfile:
    for row in rows:
        outfile.write("\t".join(row))
        outfile.write("\n")
But on *nix (including OS X terminal and Cygwin), it's easier to
$ paste a.txt b.txt c.txt d.txt
from the command line.
My suggestion: a little functional approach. Using list comprehension to zip the file being read, to the accumulated columns, and then join them to be a string again, one column (file) at a time:
filenames = ['input1.txt','input2.txt','input20.txt']
outputfile = 'output.txt'
#maybe you need to separate each column:
separator = " "
separator_list = []
output_list = []
for f in filenames:
    with open(f,'r') as inputfile:
        # strip the line endings so the columns can be joined on one line
        input_list = [line.rstrip('\n') for line in inputfile]
        if len(output_list) == 0:
            output_list = input_list
            separator_list = [ separator for x in range(0, len(output_list))]
        else:
            output_list = [ ''.join(x) for x in zip(output_list, separator_list, input_list)]
with open(outputfile,'w') as output:
    # put the line endings back when writing
    output.writelines(line + '\n' for line in output_list)
It will keep in memory the accumulator for the result (output_list), and one file at a time (the one being read, which is also the only file open for reading), but may be a little slower, and, of course, it is not fail-proof.
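One way it can fail is when the files have different numbers of lines, because zip stops at the shortest input. A minimal sketch (Python 3) of one possible remedy uses itertools.zip_longest to pad the short columns; the function name merge_columns and the literal sample columns are assumptions made for illustration.

```python
from itertools import zip_longest

def merge_columns(columns, separator="\t", fill=""):
    """Join per-file lists of values row by row; short columns are padded with fill."""
    return [separator.join(row) for row in zip_longest(*columns, fillvalue=fill)]

# In practice each column would come from reading one file, for example
# [line.rstrip("\n") for line in open(name)]; literal lists keep the sketch self-contained.
columns = [["1", "2", "3"], ["10", "20"], ["100", "200", "300"]]
rows = merge_columns(columns)
print("\n".join(rows))
```

The second column is one value short, so the last row gets an empty cell in the middle instead of being dropped.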