Python shutil file move in os walk for loop - python-2.7

The code below searches a directory for PDFs and moves each one it finds into the corresponding folder, whose name is the PDF's prefix with '_folder' appended.
Could it be expressed in simpler terms? It's practically unreadable. Also, if it can't find the folder, it destroys the PDF!
import os
import shutil
for root, dirs, files in os.walk(folder_path_variable):
    for file1 in files:
        if file1.endswith('.pdf') and not file1.startswith('.'):
            filenamepath = os.path.join(root, file1)
            name_of_file = file1.split('-')[0]
            folderDest = filenamepath.split('/')[:9]
            folderDest = '/'.join(folderDest)
            folderDest = folderDest + '/' + name_of_file + '_folder'
            shutil.move(filenamepath, folderDest)
Really I want to traverse the same directory after constructing the variable name_of_file and if that variable is in a folder name, it performs the move. However I came across issues trying to nest another for loop...

I would try something like this:
for root, dirs, files in os.walk(folder_path_variable):
    for filename in files:
        if filename.endswith('.pdf') and not filename.startswith('.'):
            filepath = os.path.join(root, filename)
            filename_prefix = filename.split('-')[0]
            dest_dir = os.path.join(root, filename_prefix + '_folder')
            if not os.path.isdir(dest_dir):
                os.mkdir(dest_dir)
            os.rename(filepath, os.path.join(dest_dir, filename))

The answer by John Zwinck is correct, except that it contains a bug: if the destination folder already exists, os.walk later descends into it, a second folder is created inside it, and the PDF is moved there. I fixed this by adding a 'break' statement after the inner for loop (for filename in files), so that only the top-level directory is processed.
The code below now executes correctly. It looks for a folder named after the PDF's prefix (the part of the file name before the first '-') with '_folder' appended. If the folder exists, the PDF is moved into it. If it doesn't, the folder is created first and the PDF is then moved into it.
for root, dirs, files in os.walk(folder_path_variable):
    for filename in files:
        if filename.endswith('.pdf') and not filename.startswith('.'):
            filepath = os.path.join(root, filename)
            filename_prefix = filename.split('-')[0]
            dest_dir = os.path.join(root, filename_prefix + '_folder')
            if not os.path.isdir(dest_dir):
                os.mkdir(dest_dir)
            os.rename(filepath, os.path.join(dest_dir, filename))
    # stop after the top-level directory so the walk never enters the '_folder' directories
    break
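If the goal is what the question describes at the end (move a PDF only when a matching folder already exists, and leave it alone otherwise), something along these lines may be enough. This is only a sketch: it assumes the PDFs and their '_folder' directories sit directly inside one directory, so it uses os.listdir instead of os.walk, and the function name move_pdfs_into_existing_folders is made up for illustration. It should run on both Python 2.7 and 3.

```python
import os
import shutil


def move_pdfs_into_existing_folders(top):
    """Move each PDF in `top` into '<prefix>_folder' if that folder exists.

    Returns the list of file names that were moved. PDFs without a
    matching folder are left where they are.
    """
    moved = []
    for filename in sorted(os.listdir(top)):
        filepath = os.path.join(top, filename)
        # Skip directories, hidden files and anything that is not a PDF.
        if not os.path.isfile(filepath):
            continue
        if filename.startswith('.') or not filename.endswith('.pdf'):
            continue
        prefix = filename.split('-')[0]
        dest_dir = os.path.join(top, prefix + '_folder')
        # Only move when the destination is a real directory; handing
        # shutil.move a path that does not exist renames the PDF to that
        # path instead of moving it into a folder.
        if os.path.isdir(dest_dir):
            shutil.move(filepath, os.path.join(dest_dir, filename))
            moved.append(filename)
    return moved
```

Because the destination is checked with os.path.isdir before anything is moved, a PDF with no matching folder simply stays in place.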

Related

Traversing multiple folders for searching the same file in multiple folders in python

I want to search for the same file in multiple folders.
I have tried os.walk(path), but it does not seem to traverse the nested folders.
for current_root, folders, file_names in os.walk(self.path, topdown=True):
    for i in folders:
        print i
        for filename in file_names:
            count+= 1
            file_path = os.path.join(current_root + '\\' + filename)
            #print file_path
            self.location_dictionary[file_path] = filename
My code prints all the folders, but it does not enter the nested folders recursively.
Example: I have subdir, subdir1 and subdir2, and inside subdir I have another directory called abc.
Both subdir and abc contain a file with the same name, and I want to read that file.
os.walk does not work that way.
For each current_root it traverses, it provides the lists of directories and files directly under it.
You're nesting the loops, which does ... well, I don't know... At the very least, the file loop is repeated once per subfolder and is skipped entirely in directories that have no subfolders.
Here you don't need the folders (so just ignore that value with _). current_root already contains that information for your files:
for current_root, _, file_names in os.walk(self.path, topdown=True):
    for filename in file_names:
        count+= 1
        file_path = os.path.join(current_root,filename)
        #print file_path
        self.location_dictionary[file_path] = filename
Aside: a dictionary with the full path as key and the file name as value looks, well, not like what you want. The same information could be stored in a set or list, with os.path.basename used to compute the file name. Maybe you want the reverse (filename => full path), provided that there are no duplicate file names.
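Since the question says the same file name appears in more than one folder, the reverse mapping would need to hold several paths per name. A minimal sketch of that idea, assuming a plain function instead of the class from the question (index_files_by_name is a hypothetical name):

```python
import os
from collections import defaultdict


def index_files_by_name(top):
    """Map each file name under `top` to the list of full paths that have it."""
    locations = defaultdict(list)
    for current_root, _, file_names in os.walk(top):
        for filename in file_names:
            # current_root is the directory being visited, so this is the
            # real location of the file, however deeply it is nested.
            locations[filename].append(os.path.join(current_root, filename))
    return locations
```

Looking up a name in the result then gives every location of that file, including the copies in nested directories such as subdir/abc.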

Moving only Files in Directories

I have looked extensively on this site and I can't see an example that fits the bill. I have 4 directories each of which contains a number of files and another directory called 'Superseded'. I am trying to write a script that will move all files in each folder into the 'Superseded' folder but I'm not having any luck.
import os, shutil
source = r'U:\Data\All\Python_Test\Exports\GLA'
dest = r'U:\Data\All\Python_Test\Exports\Superseded'
listofFiles = os.listdir(source)
for f in listofFiles:
    fullPath = source + "/" + f
    shutil.move(fullPath, dest)
I can only get this to work for one directory, and even then only when the destination directory is outside the GLA directory, if that makes sense.
I know there is an os.path.isfile() function that would let me move only the files, but I can't seem to get it to work. Does anybody have any ideas?
This works for me:
import os
#from:
# https://stackoverflow.com/questions/1158076/implement-touch-using-python
# I use this to create some empty files to move around later
def touch(fname, times=None):
    fhandle = open(fname, 'a')
    try:
        os.utime(fname, times)
    finally:
        fhandle.close()
# this function is only to create the folders and files to be moved
def create_files_in_known_folders():
    nameList=["source_dir_{:02d}".format(x) for x in range(4)]
    for name in nameList:
        path=os.path.expanduser(os.path.join("~",name))
        if not os.path.exists(path):
            os.mkdir(path)
        ssPath=os.path.join(path,"superseded")
        if not os.path.exists(ssPath):
            os.mkdir(ssPath)
        for i in range(3):
            filename="{}_{:02d}.dat".format(name,i)
            filepath=os.path.join(path, filename)
            if not os.path.exists(filepath):
                touch(filepath)
# THIS is actually the function doing what the OP asked for
# there are many details that can be tweaked
def move_from_known_to_dest():
    # here my given names from above
    nameList=["source_dir_{:02d}".format(x) for x in range(4)]
    # and my destination path
    destPath=os.path.expanduser(os.path.join("~","dest"))
    # not interested in files that are in subfolders
    # if those would exist change to os.walk and
    # exclude the destination folder with according if...:
    for name in nameList:
        path=os.path.expanduser(os.path.join("~",name))
        dirList=os.listdir(path)
        print path
        for fileName in dirList:
            filePath=os.path.join(path, fileName)
            print filePath
            if os.path.isfile(filePath):
                destPath=os.path.join(path,"superseded",fileName)
                print destPath
                #alternatively you can choose to 1) overwrite (might not work) 2) delete first 3) not copy
                # another option is to check for existence and if
                # present add a number to the dest-file-name
                # use while loop to check for first non-present number
                assert not os.path.exists(destPath), "file {} already exists".format(destPath)
                #https://stackoverflow.com/questions/8858008/how-to-move-a-file-in-python
                os.rename( filePath, destPath)
if __name__=="__main__":
    create_files_in_known_folders()
    #break here and check that filestructure and files have been created
    move_from_known_to_dest()
But think carefully about what to do if the file already exists in your destination folder.
os.walk might also be something you want to look at.
Implementing several options for the copy behaviour may look like this:
import warnings
#from:
# https://stackoverflow.com/questions/2187269/python-print-only-the-message-on-warnings
formatwarning_orig = warnings.formatwarning
warnings.formatwarning = lambda message, category, filename, lineno, line=None: \
    formatwarning_orig(message, category, filename, lineno, line='')
def move_from_known_to_dest_extra(behaviour='overwrite'):
    assert behaviour in ['overwrite','leave','accumulate'], "unknown behaviour: {}".format(behaviour)
    nameList=["source_dir_{:02d}".format(x) for x in range(4)]
    destPath=os.path.expanduser(os.path.join("~","dest"))
    for name in nameList:
        path=os.path.expanduser(os.path.join("~",name))
        dirList=os.listdir(path)
        for fileName in dirList:
            filePath=os.path.join(path, fileName)
            if os.path.isfile(filePath):
                destPath=os.path.join(path,"superseded",fileName)
                # simplest case...does not exist so move
                if not os.path.exists(destPath):
                    os.rename( filePath, destPath)
                else:
                    if behaviour=='leave':
                        warnings.warn( "Warning! Not copying file: {}; file {} already exists!".format(filePath, destPath))
                    elif behaviour =='overwrite':
                        os.remove(destPath)
                        # documentation states:
                        # On Windows, if dst already exists, OSError will be raised even if it is a file.
                        os.rename( filePath, destPath)
                        warnings.warn( "Warning! Overwriting file: {}.".format(destPath))
                    elif behaviour=='accumulate': #redundant but OK
                        addPost=0
                        while True:
                            newDestPath=destPath+"{:04d}".format(addPost)
                            if not os.path.exists(newDestPath):
                                break
                            addPost+=1
                            assert addPost < 10000, "Clean up the mess!"
                        os.rename( filePath, newDestPath)
                    else:
                        assert 0, "Unknown copy behaviour requested."
Additionally one might check for file permissions as, e.g., os.remove() may raise an exception. In this case, however, I assume that permissions are properly set by the OP.
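The os.walk variant hinted at in the comments (walk the tree, but exclude the destination folder) could be sketched as follows. The names move_files_to_superseded and dest_name are assumptions made for illustration, and the sketch hard-codes the 'leave' behaviour, i.e. it never overwrites an existing file. It should run on both Python 2.7 and 3.

```python
import os
import shutil


def move_files_to_superseded(top, dest_name='Superseded'):
    """Move the files of every directory under `top` into its own
    `dest_name` subfolder, where such a subfolder exists.

    Returns the list of destination paths of the files that were moved.
    """
    moved = []
    for current_root, dirs, files in os.walk(top):
        # Prune in place so os.walk never descends into a destination folder.
        dirs[:] = [d for d in dirs if d != dest_name]
        dest_dir = os.path.join(current_root, dest_name)
        if not os.path.isdir(dest_dir):
            continue  # no destination folder here, leave the files alone
        for filename in files:
            target = os.path.join(dest_dir, filename)
            if os.path.exists(target):
                continue  # 'leave' behaviour: never overwrite
            shutil.move(os.path.join(current_root, filename), target)
            moved.append(target)
    return moved
```

Assigning to dirs[:] instead of rebinding dirs matters here: os.walk reads the same list object to decide where to go next, so only an in-place change stops it from entering the destination folders.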

Python script to move specific files from one folder to another

I am trying to write a script (python 2.7) that will use a regex to identify specific files in a folder and move them to another folder. When I run the script, however, the source folder is moved to the target folder instead of just the files within it.
import os, shutil, re
src = "C:\\Users\\****\\Desktop\\test1\\"
#src = os.path.join('C:', os.sep, 'Users','****','Desktop','test1\\')
dst = "C:\\Users\\****\\Desktop\\test2\\"
#dst = os.path.join('C:', os.sep, 'Users','****','Desktop','test2')
files = os.listdir(src)
#regexCtask = "CTASK"
print files
#regex =re.compile(r'(?<=CTASK:)')
files.sort()
#print src, dst
regex = re.compile('CTASK*')
for f in files:
    if regex.match(f):
        filescr= os.path.join(src, files)
        shutil.move(filesrc,dst)
#shutil.move(src,dst)
So basically there are files in "test1" folder that I want to move to "test2", but not all the files, just the ones that contain "CTASK" at the beginning.
The **** in the path is to protect my work username.
Sorry if it is messy, I am still trying a few things out.
On each loop iteration you need to assign the path of the individual file (f) to the filescr variable, not a path built from files (files is a list!).
Try the code below:
import os
from os import path
import shutil
src = "C:\\Users\\****\\Desktop\\test1\\"
dst = "C:\\Users\\****\\Desktop\\test2\\"
files = [i for i in os.listdir(src) if i.startswith("CTASK") and path.isfile(path.join(src, i))]
for f in files:
    # shutil.copy leaves the originals in place; use shutil.move to move them instead
    shutil.copy(path.join(src, f), dst)
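Two details of the question's code may be worth spelling out. The pattern 'CTASK*' means "CTAS followed by any number of K", so it also matches names that merely start with "CTAS", and the question asks for a move rather than a copy. A hedged sketch that keeps the regex approach (move_matching_files and CTASK_RE are names invented here; dst is assumed to exist already):

```python
import os
import re
import shutil

# match() anchors at the start of the string, so no '^' or '*' is needed.
CTASK_RE = re.compile(r'CTASK')


def move_matching_files(src, dst, pattern=CTASK_RE):
    """Move regular files in `src` whose names match `pattern` into `dst`.

    Returns the names of the files that were moved.
    """
    moved = []
    for name in sorted(os.listdir(src)):
        full_path = os.path.join(src, name)  # path of this one file
        if pattern.match(name) and os.path.isfile(full_path):
            shutil.move(full_path, os.path.join(dst, name))
            moved.append(name)
    return moved
```

Joining src with the single name inside the loop is the part the original code got wrong; passing the whole files list to os.path.join cannot produce a file path.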
I wanted to move the following folders: 1.1, 1.2, 1.45, 1.7 into a folder named '1'.
I have posted my solution below:
import shutil
import os
src_path = '/home/user/Documents/folder1'
dest_path='/home/user/Documents/folder2/'
source = os.listdir(src_path)
for folder in source :
    #folder = '1.1 -anything'
    newf = folder.split('.')[0]
    #newf is the name of the new folder you want to move into
    #change the folder name as per your requirement
    destination = dest_path+newf
    if not os.path.exists(destination):
        os.makedirs(destination)
    shutil.move(src_path+'/'+folder,destination) #change move to copy if you want to copy instead of moving
print 'done moving'

File exists - no such file

import os
myDir = "C:\\temp\\a"
for root, dirs, files in os.walk(myDir):
    for file in files:
        # fname = os.path.join(root, file) # this works fine, yeah!
        fname = os.path.join(myDir, file)
        print ("%r" % (fname))
        src = os.path.isfile(fname)
        if src == False:
            print ("%r :Fail" % (fname))
        f = open(fname,"r")
        f.close()
I expected the two versions of fname to be the same, but I've found out the hard way that the above code doesn't work. I just want to know why, that's all.
The problem is that os.walk(myDir) walks all the subdirectories, recursively! When walk descends into a subdirectory, root will be that directory, while myDir is still the root directory the search started in.
Let's say you have a file C:\temp\a\b\c\foo.txt. When os.walk descends into c, myDir is still C:\temp\a and root is C:\temp\a\b\c. Then os.path.join(root, file) will yield C:\temp\a\b\c\foo.txt, while os.path.join(myDir, file) gives C:\temp\a\foo.txt.
You might want to rename your myDir variable to root, and root to current, respectively, so it's less confusing.
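The difference can be reproduced with a throwaway tree. The sketch below builds a temporary directory containing b/c/foo.txt (the layout from the explanation above, with a temporary directory standing in for C:\temp\a) and checks both ways of joining the path:

```python
import os
import tempfile

# Build a throwaway tree: <top>/b/c/foo.txt
top = tempfile.mkdtemp()
nested = os.path.join(top, 'b', 'c')
os.makedirs(nested)
open(os.path.join(nested, 'foo.txt'), 'w').close()

results = []
for current, dirs, files in os.walk(top):
    for name in files:
        right = os.path.join(current, name)  # the directory being visited
        wrong = os.path.join(top, name)      # always the starting directory
        results.append((os.path.isfile(right), os.path.isfile(wrong)))

print(results)  # [(True, False)]
```

Only the path built from the directory currently being visited points at a file that exists.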

using os.walk cannot open the file from the list

My problem is to read '.csv' files in a directory tree and do some calculations on them.
I have the calculations working, but my for loop does not seem to work the way I want it to.
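import os
import numpy as np
import pandas as pd
# raw string without a trailing backslash (a trailing backslash would escape the closing quote)
d = r'F:\MArcin\Experiments\csvCollection'
for dirname, dirs, files in os.walk(d):
    for i in files:
        if i.endswith('.csv'):
            data1 = pd.read_csv(i, sep=",")
            data = data1['x'][:, np.newaxis]
            target = data1['y']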
The error I am getting is:
IOError: File 1.csv does not exist
files is a list of all the '.csv' files inside dirname.
i is a str containing 1.csv (the first of the files in the directory).
Any ideas why this is not working?
Thanks for any help.
Because 1.csv is somewhere else on the filesystem, and when you call read_csv() with a bare file name it opens the file relative to the current working directory.
Just open it using the full path:
data1 = pd.read_csv(os.path.join(dirname, i), sep=",")
dirname in os.walk is the directory in which the file 1.csv is actually located.
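One way to keep the path handling separate from the calculations is a small generator that yields full paths. This is only a sketch, and iter_csv_paths is a hypothetical helper name, not part of os or pandas:

```python
import os


def iter_csv_paths(top):
    """Yield the full path of every '.csv' file below `top`."""
    for dirname, dirs, files in os.walk(top):
        for name in files:
            if name.endswith('.csv'):
                # Join with dirname, the directory that actually holds the file.
                yield os.path.join(dirname, name)
```

Each yielded path could then be passed to pd.read_csv(path, sep=",") in place of the bare file name.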