using os.walk cannot open the file from the list - python-2.7

I need to read '.csv' files from a directory tree and run some calculations on them.
The calculations work, but my for loop does not behave as I expect.
d = r'F:\MArcin\Experiments\csvCollection'
for dirname, dirs, files in os.walk(d):
    for i in files:
        if i.endswith('.csv'):
            data1 = pd.read_csv(i, sep=",")
            data = data1['x'][:, np.newaxis]
            target = data1['y']
The error I am getting is:
IOError: File 1.csv does not exist
files is a list of all the '.csv' files inside dirname.
i is a str containing 1.csv (the first of the files in the directory).
Any ideas why this is not working?
Thanks for any help.

os.walk gives you bare file names, and read_csv() resolves a bare name like 1.csv relative to the current working directory, not relative to the directory being walked.
Build the full path before opening the file:
data1 = pd.read_csv(os.path.join(dirname, i), sep=",")
dirname in os.walk is the directory where the file 1.csv is actually located.
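Here is a minimal sketch of the same idea as a reusable function. It uses the standard csv module instead of pandas only to stay self-contained; with pandas you would pass the same joined path to pd.read_csv. The names read_csv_files and top are made up for illustration.

```python
import csv
import os


def read_csv_files(top):
    """Yield (full_path, rows) for every .csv file found under top."""
    for dirname, dirs, files in os.walk(top):
        for name in files:
            if name.endswith('.csv'):
                # os.walk yields bare file names, so join them with dirname
                full_path = os.path.join(dirname, name)
                with open(full_path) as f:
                    rows = list(csv.reader(f))
                yield full_path, rows
```

Calling list(read_csv_files(d)) would then give one entry per CSV file, wherever the script itself is run from.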

Related

Traversing multiple folders for searching the same file in multiple folders in python

I want to search for the same file in multiple folders.
I have tried os.walk(path), but it does not seem to traverse the nested folders:
for current_root, folders, file_names in os.walk(self.path, topdown=True):
    for i in folders:
        print i
        for filename in file_names:
            count+= 1
            file_path = os.path.join(current_root + '\\' + filename)
            #print file_path
            self.location_dictionary[file_path] = filename
My code prints all the folders, but it does not enter the nested folders recursively.
Example: I have subdir, subdir1 and subdir2, and inside subdir there is another directory called abc.
Both subdir and abc contain a file with the same name, and I want to read that file in each of them.
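os.walk does not work that way.
For each current_root it traverses, it provides the lists of directories and files directly under it, and it descends into the subdirectories on its own.
Because your file loop is nested inside the folder loop, the files of a directory are processed once per subfolder, and not at all in a directory that has no subfolders.
Here you don't need the folders (so just mute that argument with _). current_root already contains that info for your files: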
for current_root, _, file_names in os.walk(self.path, topdown=True):
    for filename in file_names:
        count+= 1
        file_path = os.path.join(current_root,filename)
        #print file_path
        self.location_dictionary[file_path] = filename
Aside: a dictionary with the full path as key and the file name as value is probably not what you want. The same information could be stored in a set or list, with os.path.basename used to compute the file name. Maybe you meant the reverse (filename => full path), which only works if there are no duplicate file names.
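Since the question is about the same file name appearing in several folders, one possible layout is a mapping from file name to a list of full paths. This is only a sketch; index_by_filename and top are hypothetical names, not part of the original code.

```python
import os
from collections import defaultdict


def index_by_filename(top):
    """Map each file name to the list of full paths where it occurs under top."""
    locations = defaultdict(list)
    for current_root, _, file_names in os.walk(top):
        for filename in file_names:
            # Duplicate names in different folders simply extend the list
            locations[filename].append(os.path.join(current_root, filename))
    return locations
```

With the subdir/abc layout from the question, looking up the shared file name would return both paths, so each copy can be opened in turn.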

Executing an .exe file on files in another folder

I have Python code that runs a C++ executable, which takes files in another folder as input.
My code is in folder A and the input files are in folder B, and I have been trying this:
path = 'C:/pathToInputFiles'
dirs = os.listdir(path)
for path in dirs:
    proc = subprocess.Popen([fullPathtoCppCode, inputFiles])
However, I keep receiving WindowsError: [Error 2] The system cannot find the file specified
The only way it works is when I put the C++ executable in the same folder as the input files, which I want to avoid.
How can I make Python read the file path properly?
Try using os.path.join after your for statement.
path = os.path.join(directory, filename)
for example
def test(directory):
    for filename in os.listdir(directory):
        # os.listdir returns bare names, so build the full path first
        filename = os.path.join(directory, filename)
        proc = subprocess.Popen([fullPathtoCppCode, filename])
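A slightly fuller sketch of that approach, assuming the executable takes one input file as its only argument (the question does not say how the program is invoked). build_commands and run_on_each are hypothetical helper names.

```python
import os
import subprocess


def build_commands(executable, input_dir):
    """Return one argument list per regular file found in input_dir."""
    commands = []
    for name in sorted(os.listdir(input_dir)):
        full_path = os.path.join(input_dir, name)
        if os.path.isfile(full_path):  # skip subdirectories
            commands.append([executable, full_path])
    return commands


def run_on_each(executable, input_dir):
    """Run the executable once per input file and return the exit codes."""
    return [subprocess.call(cmd) for cmd in build_commands(executable, input_dir)]
```

Because every argument is a full path, the executable can stay in folder A while the inputs stay in folder B.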

Python shutil file move in os walk for loop

The code below searches within a directory for any PDFs and for each one it finds it moves into the corresponding folder which has '_folder' appended.
Could it be expressed in simpler terms? It's practically unreadable. Also if it can't find the folder, it destroys the PDF!
import os
import shutil
for root, dirs, files in os.walk(folder_path_variable):
    for file1 in files:
        if file1.endswith('.pdf') and not file1.startswith('.'):
            filenamepath = os.path.join(root, file1)
            name_of_file = file1.split('-')[0]
            folderDest = filenamepath.split('/')[:9]
            folderDest = '/'.join(folderDest)
            folderDest = folderDest + '/' + name_of_file + '_folder'
            shutil.move(filenamepath, folderDest)
Really I want to traverse the same directory after constructing the variable name_of_file and if that variable is in a folder name, it performs the move. However I came across issues trying to nest another for loop...
I would try something like this:
for root, dirs, files in os.walk(folder_path_variable):
    for filename in files:
        if filename.endswith('.pdf') and not filename.startswith('.'):
            filepath = os.path.join(root, filename)
            filename_prefix = filename.split('-')[0]
            dest_dir = os.path.join(root, filename_prefix + '_folder')
            if not os.path.isdir(dest_dir):
                os.mkdir(dest_dir)
            os.rename(filepath, os.path.join(dest_dir, filename))
The answer by John Zwinck is correct, except for one bug: if the destination folder already exists, os.walk later descends into it, so a second folder is created inside it and the PDF is moved there. I fixed this by adding a 'break' statement after the inner for loop (for filename in files), so that only the top-level directory is processed.
The code below now executes correctly. It looks for a folder named after the PDF's prefix (the part of the file name before the first '-') with '_folder' appended. If the folder exists, the PDF is moved into it. If it doesn't, the folder is created first and the PDF is then moved into it.
for root, dirs, files in os.walk(folder_path_variable):
    for filename in files:
        if filename.endswith('.pdf') and not filename.startswith('.'):
            filepath = os.path.join(root, filename)
            filename_prefix = filename.split('-')[0]
            dest_dir = os.path.join(root, filename_prefix + '_folder')
            if not os.path.isdir(dest_dir):
                os.mkdir(dest_dir)
            os.rename(filepath, os.path.join(dest_dir, filename))
    # stop after the top-level directory so existing '_folder' dirs are not walked
    break
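If you do want to keep recursing into ordinary subfolders, an alternative to the break is to prune the destination folders from dirs in place, which os.walk honours when walking top-down. This is only a sketch under that assumption; sort_pdfs and its suffix parameter are made-up names.

```python
import os
import shutil


def sort_pdfs(top, suffix='_folder'):
    """Move each PDF under top into a sibling '<prefix>_folder' directory."""
    for root, dirs, files in os.walk(top):
        # Prune destination folders in place so os.walk never descends into them
        dirs[:] = [d for d in dirs if not d.endswith(suffix)]
        for filename in files:
            if filename.endswith('.pdf') and not filename.startswith('.'):
                prefix = filename.split('-')[0]
                dest_dir = os.path.join(root, prefix + suffix)
                if not os.path.isdir(dest_dir):
                    os.mkdir(dest_dir)
                shutil.move(os.path.join(root, filename),
                            os.path.join(dest_dir, filename))
```

Giving shutil.move the full destination file path, rather than only the folder, also avoids the case from the question where a missing folder makes the PDF get renamed to the folder's name.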

Python finds a string in multiple files recursively and returns the file path

I'm learning Python and would like to search for a keyword in multiple files recursively.
I have an example function which should find the *.doc extension in a directory.
Then, the function should open each file with that file extension and read it.
If a keyword is found while reading the file, the function should identify the file path and print it.
Else, if the keyword is not found, python should continue.
To do that, I have defined a function which takes two arguments:
def find_word(extension, word):
    # define the path for os.walk
    for dname, dirs, files in os.walk('/rootFolder'):
        #search for file name in files:
        for fname in files:
            #define the path of each file
            fpath = os.path.join(dname, fname)
            #open each file and read it
            with open(fpath) as f:
                data=f.read()
                # if data contains the word
                if word in data:
                    #print the file path of that file
                    print (fpath)
                else:
                    continue
Could you give me a hand to fix this code?
Thanks,
def find_word(extension, word):
    for root, dirs, files in os.walk('/DOC'):
        # filter files for given extension:
        files = [fi for fi in files if fi.endswith(".{ext}".format(ext=extension))]
        for filename in files:
            path = os.path.join(root, filename)
            # open each file and read it
            with open(path) as f:
                # split() will create list of words and set will
                # create list of unique words
                words = set(f.read().split())
                if word in words:
                    print(path)
.doc files are binary Word documents, not plain text, so they cannot be read meaningfully with a simple text editor or Python's open. For .docx files you can use a module such as python-docx; note that it reads the newer .docx format, not the legacy .doc format.
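For plain-text files, the walk, filter and search steps can be sketched as a function that returns the matching paths instead of printing them. This assumes Python 3 (for the errors argument to open) and does a substring match like the question's code; the top parameter is an addition for illustration, replacing the hard-coded root folder.

```python
import os


def find_word(top, extension, word):
    """Return paths of files under top with the given extension that contain word."""
    matches = []
    suffix = '.' + extension
    for root, dirs, files in os.walk(top):
        for filename in files:
            if not filename.endswith(suffix):
                continue
            path = os.path.join(root, filename)
            # errors='ignore' skips bytes that cannot be decoded as text
            with open(path, errors='ignore') as f:
                if word in f.read():
                    matches.append(path)
    return matches
```

Returning a list keeps the search separate from the printing, so the caller decides what to do with the paths.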
Update
For doc files (previous to Word 2007) you can also use other tools such as catdoc or antiword. Try the following.
import subprocess
def doc_to_text(filename):
    # Pass the arguments as a list so the file name needs no shell quoting
    return subprocess.Popen(
        ['catdoc', '-w', filename],
        stdout=subprocess.PIPE
    ).communicate()[0]
print(doc_to_text('fixtures/doc.doc'))
If you are trying to read a .doc file in your code, then this won't work. You will have to change the part where you read the file.
Here are some links for reading a .doc file in python.
extracting text from MS word files in python
Reading/Writing MS Word files in Python

Zipping a file from a directory and placing it in another Directory

I am trying to set up a program that puts my Minecraft server world into a zip and places it in another directory on another drive (/media/500gb/MinecraftWorldBackups).
But I keep getting this error
although the folder doesn't contain a folder or file called 'h'.
What do I need to do to fix this? I believe it has to do with how files and folders are handled.
#!/usr/bin/env python
import time, zipfile
while True:
    FileName = 'MinecraftBackup_' + str(int(time.time()))
    Path = '/home/bertie/Desktop/FeedTheBeastServer/world/'
    print(FileName)
    Zip = zipfile.ZipFile('/media/500gb/MinecraftWorldBackups/'+FileName+'.zip','w')
    for each in Path:
        print(each)
        try: Zip.write(Path + each)
        except IOError: None
    Zip.Close()
    print('Done')
    time.sleep(60)
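Path is a string, so "for each in Path" iterates over its characters one at a time, which is where the 'h' comes from. A minimal sketch of one way to add the real files, using os.walk to list them; zip_directory and its parameter names are hypothetical.

```python
import os
import zipfile


def zip_directory(source_dir, zip_path):
    """Write every file under source_dir into a new zip archive at zip_path."""
    # The with block closes the archive even if a write fails
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store names relative to source_dir, not absolute paths
                archive.write(full_path, os.path.relpath(full_path, source_dir))
```

In the loop above this could be called as zip_directory(Path, '/media/500gb/MinecraftWorldBackups/' + FileName + '.zip'). It assumes the zip file is written outside the folder being archived, as in the question.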