I recently wrote a Python script that selects certain files in a directory and saves them to a new archive in that same directory. The script works, except that it puts a duplicate of the archive inside the new archive. I think it has something to do with the arcname I used and the loop, but I'm really not sure. As is probably obvious from my code, I am a beginner, so I'm sure there is plenty of room for improvement. Any ideas as to where the problem is? If you have any suggestions for improving the code, I'm all ears.
import os,arcpy,zipfile
inputfc = arcpy.GetParameterAsText(0) # User Inputs Feature Class Path
desc = arcpy.Describe(inputfc)
fcname = desc.basename
zname = fcname + ".zip"
gpath = os.path.dirname(inputfc)
zpath = os.path.join(gpath,zname)
zfile = zipfile.ZipFile(zpath, "w")
for f in os.listdir(gpath):
    fpath = os.path.join(gpath, f)
    if f.startswith(fcname):
        zfile.write(fpath, f, compress_type=zipfile.ZIP_DEFLATED)
zfile.close()
Edit: After aruisdante answered my question I decided to just change the zname variable to
zname = "zip" + fcname + ".zip" #ugly but it worked thanks
This:
zfile = zipfile.ZipFile(zpath, "w")
Creates a new Zip file at zpath
for f in os.listdir(gpath):
Iterates through all of the files in gpath. Since zpath lives inside gpath, the zip file you just created is one of those files, and because its name starts with fcname it passes the startswith check and gets added to itself. You will need to exclude it:
for f in (filename for filename in os.listdir(gpath) if filename != zname):
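The same idea can be sketched without arcpy. This is only an illustration under assumptions: the function name `zip_by_prefix` and its parameters are hypothetical and not part of the original script. It lists the directory first, skips the archive by name, and stores each file under its bare name.

```python
import os
import zipfile


def zip_by_prefix(directory, prefix):
    """Zip every file in `directory` whose name starts with `prefix`.

    The archive is written into the same directory, so it is skipped
    explicitly to keep it from being added to itself.
    (Hypothetical helper, not taken from the original script.)
    """
    zip_name = prefix + ".zip"
    zip_path = os.path.join(directory, zip_name)
    # Collect the candidates and drop the archive's own name.
    candidates = [
        name for name in sorted(os.listdir(directory))
        if name.startswith(prefix) and name != zip_name
    ]
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in candidates:
            full_path = os.path.join(directory, name)
            if os.path.isfile(full_path):
                # arcname=name stores the bare file name, not the full path.
                archive.write(full_path, arcname=name)
    return zip_path
```

Using a `with` block also closes the archive even if one of the writes fails, which the explicit `zfile.close()` call does not guarantee.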
I have searched a lot for this but have not found anything. I am very new to MATLAB and to regex in general.
My problem: I have a directory path 'dir' with only one .txt file in it, but I do not know the name of that file. I want to load it.
I have tried multiple things but cannot find the solution.
foo = load(fullfile(dir, '-regexp', '*.txt'))
Thank you for your help!
That syntax isn't valid for fullfile, and dir is a built-in function that you appear to be using as a variable name. Here is something a little clearer, which should work when there is a single .txt file in a given folder:
folder = 'my\folder\path\';
files = dir( fullfile( folder, '*.txt' ) );
if numel( files ) ~= 1
    error( 'More or less than one .txt file found!' );
end
filepath = fullfile( files(1).folder, files(1).name );
foo = load( filepath ); % load is designed for .mat files, if your .txt contains anything
% non-numeric then you may want something more like readtable here...
I am trying to make sure that when I create a zip file, it does not contain the file's whole directory path when it is unzipped.
I have done some research and found a lot of content about using arcname in zip.write; however, every solution I try results in the whole server being zipped!
I have tried adding arcname = os.path.basename(file) and other possible solutions with no luck.
This is my code below:
all_order_files = glob.glob("/directory/"+str(order_submission.id)+"-*")
zip = zipfile.ZipFile("/directory/" + str(order_submission.id) + '-Order-Summary.zip', 'w')
for file in all_order_files:
    zip.write(file)
zip.close()
After reading this answer: Create .zip in Python?
I adapted the code to read the following which solved the issue for me.
all_order_files = glob.glob("/directory/"+str(order_submission.id)+"-*")
zip = zipfile.ZipFile("/directory/" + str(order_submission.id) + '-Order-Summary.zip', 'w')
path = "/directory/"
for file in all_order_files:
    file_name = file.split('/')[-1]
    absname = os.path.abspath(os.path.join(path, file_name))
    # path already ends with '/', so slice at len(path); adding 1 would drop the first character of the name
    arcname = absname[len(path):]
    zip.write(absname, arcname)
zip.close()
Note the extra argument passed to the write function: it sets the name stored in the archive, which changes the directory structure when the zip file is unzipped.
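A shorter way to get the same flat layout is to pass `os.path.basename` of each path as the archive name. The sketch below is an illustration only: the helper name `zip_flat` and its parameters are hypothetical, and it assumes every match should sit at the top level of the archive.

```python
import glob
import os
import zipfile


def zip_flat(pattern, zip_path):
    """Zip the files matching `pattern`, storing only their base names.

    (Hypothetical helper for illustration.)
    """
    # Resolve the matches up front and keep regular files only.
    matches = [p for p in sorted(glob.glob(pattern)) if os.path.isfile(p)]
    target = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w") as archive:
        for path in matches:
            if os.path.abspath(path) == target:
                continue  # an archive left over from an earlier run
            # arcname drops the directory part, so nothing unzips into /directory/...
            archive.write(path, arcname=os.path.basename(path))
    return zip_path
```

For the question's layout this might be called as `zip_flat("/directory/42-*", "/directory/42-Order-Summary.zip")`, where 42 stands in for the order id.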
The code below searches a directory for PDFs and moves each one it finds into the corresponding folder, whose name has '_folder' appended.
Could it be expressed in simpler terms? It's practically unreadable. Also if it can't find the folder, it destroys the PDF!
import os
import shutil
for root, dirs, files in os.walk(folder_path_variable):
    for file1 in files:
        if file1.endswith('.pdf') and not file1.startswith('.'):
            filenamepath = os.path.join(root, file1)
            name_of_file = file1.split('-')[0]
            folderDest = filenamepath.split('/')[:9]
            folderDest = '/'.join(folderDest)
            folderDest = folderDest + '/' + name_of_file + '_folder'
            shutil.move(filenamepath, folderDest)
Really I want to traverse the same directory after constructing the variable name_of_file and if that variable is in a folder name, it performs the move. However I came across issues trying to nest another for loop...
I would try something like this:
for root, dirs, files in os.walk(folder_path_variable):
    for filename in files:
        if filename.endswith('.pdf') and not filename.startswith('.'):
            filepath = os.path.join(root, filename)
            filename_prefix = filename.split('-')[0]
            dest_dir = os.path.join(root, filename_prefix + '_folder')
            if not os.path.isdir(dest_dir):
                os.mkdir(dest_dir)
            os.rename(filepath, os.path.join(dest_dir, filename))
The answer by John Zwinck is correct, except that it contains a bug: if the destination folder already exists, another folder is created inside it and the PDF is moved there. I fixed this by adding a 'break' statement within the inner for loop (for filename in files).
The code below now executes correctly. It looks for a folder named after the PDF's prefix (the part before the first '-') with '_folder' appended. If the folder exists, the PDF is moved into it. If it doesn't, the folder is created and the PDF is moved into it.
for root, dirs, files in os.walk(folder_path_variable):
    for filename in files:
        if filename.endswith('.pdf') and not filename.startswith('.'):
            filepath = os.path.join(root, filename)
            filename_prefix = filename.split('-')[0]
            dest_dir = os.path.join(root, filename_prefix + '_folder')
            if not os.path.isdir(dest_dir):
                os.mkdir(dest_dir)
            os.rename(filepath, os.path.join(dest_dir, filename))
        break
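A different way to avoid the nested folder, offered as a sketch rather than as either poster's method, is to stop os.walk from descending into the destination folders at all. Assigning to `dirs[:]` in place during a top-down walk prunes the directories that will be visited. The function name `sort_pdfs` is hypothetical, and the sketch assumes that any directory ending in '_folder' is a destination and should be left alone.

```python
import os
import shutil


def sort_pdfs(top):
    """Move each PDF under `top` into a sibling '<prefix>_folder' directory.

    (Hypothetical helper; assumes '*_folder' directories are destinations.)
    """
    moved = []
    for root, dirs, files in os.walk(top):
        # Editing dirs in place tells os.walk not to descend into
        # destination folders, so already-sorted PDFs are left alone.
        dirs[:] = [d for d in dirs if not d.endswith("_folder")]
        for filename in files:
            if not filename.endswith(".pdf") or filename.startswith("."):
                continue
            prefix = filename.split("-")[0]
            dest_dir = os.path.join(root, prefix + "_folder")
            # Create the folder only if it is missing.
            os.makedirs(dest_dir, exist_ok=True)
            dest = os.path.join(dest_dir, filename)
            shutil.move(os.path.join(root, filename), dest)
            moved.append(dest)
    return moved
```

Because no loop is cut short, every PDF in a directory is handled, not only the first one.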
These are my files:
2015125_0r89_PEO.txt
2015125_0r89_PED.txt
2015125_0r89_PEN.txt
2015126_0r89_PEO.txt
2015126_0r89_PED.txt
2015126_0r89_PEN.txt
2015127_0r89_PEO.txt
2015127_0r89_PED.txt
2015127_0r89_PEN.txt
and I want to rename them to this:
US.CAR.PEO.D.2015.125.txt
US.CAR.PED.D.2015.125.txt
US.CAR.PEN.D.2015.125.txt
US.CAR.PEO.D.2015.126.txt
US.CAR.PED.D.2015.126.txt
US.CAR.PEN.D.2015.126.txt
US.CAR.PEO.D.2015.127.txt
US.CAR.PED.D.2015.127.txt
US.CAR.PEN.D.2015.127.txt
This is my code so far:
import os
paths = (os.path.join(root, filename)
         for root, _, filenames in os.walk('C:\\data\\MAX\\')  # location of the files
         for filename in filenames)
for path in paths:
    a = path.split("_")
    b = a[2].split(".")
    c = "US.CAR." + b[0] + ".D." + a[0]
    print(c)
When I run the script it does not raise any error, but it also does not rename the .txt files, which is what it is supposed to do.
Any help?
Working on the full path gives bad results here, because the first piece of the split still contains the directory, and the script only prints the new name without ever calling os.rename. It is better to take just the file name, build the new name from it, and then rename the file itself, like this:
for root, _, filenames in os.walk('C:\\data\\MAX\\'):
    for name in filenames:
        print("original: " + name)
        a = name.split("_")
        b = a[2].split(".")
        # a[0] is e.g. "2015125": the year (first 4 characters) followed by the day of year
        new = "US.CAR.{}.D.{}.{}.{}".format(b[0], a[0][:4], a[0][4:], b[1])  # don't forget the file extension
        print("new: " + new)
        os.rename(os.path.join(root, name), os.path.join(root, new))
String formatting is also generally easier to read than chained string concatenation.
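The name mapping can be separated from the renaming so it can be checked on its own. This is a sketch under assumptions: the helper names `new_name` and `rename_all` are hypothetical, and the input names are assumed to follow the `<year><day>_<code>_<channel>.<ext>` pattern shown in the question, with a four-digit year.

```python
import os


def new_name(old_name):
    """Map '2015125_0r89_PEO.txt' to 'US.CAR.PEO.D.2015.125.txt'.

    Raises ValueError if the name does not have the expected shape.
    (Hypothetical helper for illustration.)
    """
    stem, extension = old_name.rsplit(".", 1)
    date_part, _, channel = stem.split("_")
    # The first four characters are the year, the rest is the day of year.
    year, day_of_year = date_part[:4], date_part[4:]
    return "US.CAR.{}.D.{}.{}.{}".format(channel, year, day_of_year, extension)


def rename_all(top):
    """Rename every matching file under `top` in place."""
    for root, _, filenames in os.walk(top):
        for name in filenames:
            try:
                target = new_name(name)
            except ValueError:
                continue  # name does not follow the expected pattern
            os.rename(os.path.join(root, name), os.path.join(root, target))
```

Files that were already renamed contain no underscore, so `new_name` raises ValueError for them and a second run leaves them untouched.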
I use the following code for downloading two files in a folder from a website.
I want to download only the files on the page whose names contain "MOD09GA.A2008077.h22v05.005.2008080122814.hdf" and "MOD09GA.A2008077.h23v05.005.2008080122921.hdf". But I don't know how to select these files. The code below downloads all the files, but I only need two of them.
Does anyone have any ideas?
URL = 'http://e4ftl01.cr.usgs.gov/MOLT/MOD09GA.005/2008.03.17/';
% Local path on your machine
localPath = 'E:/myfolder/';
% Read html contents and parse file names with ending *.hdf
urlContents = urlread(URL);
ret = regexp(urlContents, '"\S+.hdf.xml"', 'match');
% Loop over all files and download them
for k=1:length(ret)
    filename = ret{k}(2:end-1);
    filepathOnline = strcat(URL, filename);
    filepathLocal = fullfile(localPath, filename);
    urlwrite(filepathOnline, filepathLocal);
end
Try the regexp with tokens instead:
localPath = 'E:/myfolder/';
urlContents = 'aaaa "MOD09GA.A2008077.h22v05.005.2008080122814.hdf.xml" and "MOD09GA.A2008077.h23v05.005.2008080122921.hdf.xml" aaaaa';
ret = regexp(urlContents , '"(\S+)(?:\.\d+){2}(\.hdf\.xml)"', 'tokens');
%// Loop over each file name
for k=1:length(ret)
    filename = [ret{k}{:}];
    filepathLocal = fullfile(localPath, filename)
end
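For comparison, the selection step can be sketched in Python rather than MATLAB. This is an illustration under assumptions, not a translation of the answer above: the function name `select_hdf_files` is hypothetical, the page is assumed to list each file name inside double quotes, and the two wanted files are identified by their tile ids (h22v05 and h23v05) taken from the names in the question. It only selects names and does not download anything.

```python
import re


def select_hdf_files(html, tiles):
    """Return the quoted *.hdf names in `html` that belong to one of `tiles`.

    (Hypothetical helper; assumes file names appear inside double quotes.)
    """
    # The closing quote must follow '.hdf' directly, so '.hdf.xml' is skipped.
    names = re.findall(r'"([^"\s]+\.hdf)"', html)
    wanted = []
    for name in names:
        matches_tile = any("." + tile + "." in name for tile in tiles)
        if matches_tile and name not in wanted:
            wanted.append(name)  # keep page order, drop repeated links
    return wanted
```

Calling `select_hdf_files(page_text, ["h22v05", "h23v05"])` would return only the names for those two tiles, which could then be joined to the base URL for download.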