I have looked a lot for this but have not found anything. I am very new to MATLAB and regex in general.
My problem is that I have a directory path 'dir' with only one .txt file in it. However, I do not know the filename of the .txt file. I want to load this file.
I have tried multiple things but cannot find the solution.
foo = load(fullfile(dir, '-regexp', '*.txt'))
Thank you for your help!
That syntax isn't valid for fullfile, and dir is an in-built function which it appears you're using as a variable... Here is something a little clearer which should work when you have a single txt file within a given folder
folder = 'my\folder\path\';
files = dir( fullfile( folder, '*.txt' ) );
if numel( files ) ~= 1
    error( 'More or less than one .txt file found!' );
end
filepath = fullfile( files(1).folder, files(1).name );
foo = load( filepath ); % load is designed for .mat files, if your .txt contains anything
% non-numeric then you may want something more like readtable here...
Related
I'm learning Python and would like to search for a keyword in multiple files recursively.
I have an example function which should find the *.doc extension in a directory.
Then, the function should open each file with that file extension and read it.
If a keyword is found while reading the file, the function should identify the file path and print it.
Otherwise, if the keyword is not found, Python should continue.
To do that, I have defined a function which takes two arguments:
def find_word(extension, word):
    # define the path for os.walk
    for dname, dirs, files in os.walk('/rootFolder'):
        # search for file name in files:
        for fname in files:
            # define the path of each file
            fpath = os.path.join(dname, fname)
            # open each file and read it
            with open(fpath) as f:
                data = f.read()
                # if data contains the word
                if word in data:
                    # print the file path of that file
                    print(fpath)
                else:
                    continue
Could you give me a hand to fix this code?
Thanks,
import os

def find_word(extension, word):
    for root, dirs, files in os.walk('/DOC'):
        # filter files for given extension:
        files = [fi for fi in files if fi.endswith(".{ext}".format(ext=extension))]
        for filename in files:
            path = os.path.join(root, filename)
            # open each file and read it
            with open(path) as f:
                # split() creates a list of words and set() reduces it
                # to the unique words
                words = set(f.read().split())
                if word in words:
                    print(path)
.doc files are binary Word documents rather than plain text, i.e. they won't open usefully in a simple text editor or with Python's open function. In this case, you can use other Python modules; python-docx, for example, reads the newer .docx format.
Update
For doc files (previous to Word 2007) you can also use other tools such as catdoc or antiword. Try the following.
import subprocess

def doc_to_text(filename):
    # requires the catdoc command-line tool to be installed;
    # passing the arguments as a list avoids shell quoting problems
    return subprocess.check_output(['catdoc', '-w', filename])

print(doc_to_text('fixtures/doc.doc'))
If you are trying to read a .doc file in your code, then this won't work. You will have to change the part where you read the file.
Here are some links for reading a .doc file in python.
extracting text from MS word files in python
Reading/Writing MS Word files in Python
I have an output file from one program whose name always ends in "_x.txt". I want to chain two programs so that the second one uses this file as its input and adds more data to it. The final result should be named "blabla_x_f.txt".
I am trying to work it out as below, but it does not seem to be correct and I could not solve it. Please help:
inf = str(raw_input(*+"_x.txt"))
with open(inf+'_x.txt') as fin, open(inf+'_x_f.txt','w') as fout:
    ....(other operations)
The main problem is that the "blabla" part of the filename is a random string that can change every time, so the code needs to be flexible and just search for whatever ends with "_x.txt".
Have a look at Python's glob module:
import glob
files = glob.glob('*_x.txt')
gives you a list of all files ending in _x.txt. Continue with
for path in files:
    newpath = path[:-4] + '_f.txt'
    # "in" is a reserved word in Python, so use other names for the handles
    with open(path) as fin:
        with open(newpath, 'w') as fout:
            pass  # do something
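The same idea can be wrapped in two small functions. This is only a sketch: the names derive_output_path and process_all are made up for illustration, and the copy step is a placeholder for whatever the second program really adds to the file.

```python
import glob
import os


def derive_output_path(path, old_suffix="_x.txt", new_suffix="_x_f.txt"):
    """Return path with old_suffix swapped for new_suffix (hypothetical helper)."""
    if not path.endswith(old_suffix):
        raise ValueError("expected a name ending in " + old_suffix)
    return path[:-len(old_suffix)] + new_suffix


def process_all(directory="."):
    """Write a matching *_x_f.txt file for every *_x.txt file in directory."""
    written = []
    for path in sorted(glob.glob(os.path.join(directory, "*_x.txt"))):
        newpath = derive_output_path(path)
        with open(path) as fin, open(newpath, "w") as fout:
            fout.write(fin.read())  # placeholder for the real processing
        written.append(newpath)
    return written
```

Because the pattern is "*_x.txt", files that already end in "_x_f.txt" are not picked up again on a second run.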
My problem is to read '.csv' files in a set of directories and do some calculations on them.
I have the calculations working, but my for loop does not seem to work the way I want it to.
d = r'F:\MArcin\Experiments\csvCollection'
for dirname, dirs, files in os.walk(d):
    for i in files:
        if i.endswith('.csv'):
            data1 = pd.read_csv(i, sep=",")
            data = data1['x'][:, np.newaxis]
            target = data1['y']
The error I am getting is:
IOError: File 1.csv does not exist
files is a list of all '.csv' files inside dirname
i is a str containing 1.csv (that is, the first of the files in the directory)
Any ideas why this is not working?
Thanks for any help.
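Because os.walk only gives you the bare filename, 1.csv, while the file itself lives in some other directory. When you call read_csv() with that name, it looks for the file relative to the current working directory and does not find it.
Just open it using the full path: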
data1 = pd.read_csv(os.path.join(dirname, i), sep=",")
dirname in os.walk is the actual directory where the file 1.csv is located.
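A minimal sketch of the same fix, collecting the joined paths first so the loop can be checked without pandas. The function name find_csv_files is made up for this example.

```python
import os


def find_csv_files(top):
    """Return the full paths of all .csv files at or below top."""
    matches = []
    for dirname, dirs, files in os.walk(top):
        for name in files:
            if name.endswith(".csv"):
                # join the directory being walked with the bare filename
                matches.append(os.path.join(dirname, name))
    return matches
```

Each returned path can then be passed straight to pd.read_csv, whatever the current working directory is.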
So I want to read in a text file and then use some of that to write to another file that doesn't exist in the same directory. So for instance if I have a file named text.txt, I want to write a script that reads it and then creates another file, text2.txt which has some of its contents determined by what was in text.txt.
To read the file I'm using the command,
with open(inpath, 'r') as f:
    ...
But then what is the preferred way to create a new file and start writing to it? If I had to guess, I'd think it would be
with open(inpath, 'r') as f:
    outtext = open(outpath, 'w')
    ...
where the variable outpath stores the directory of the file to be written. If I understand all this correctly, if the directory outpath happens to exist, running this script would destroy it or at least append to it. But if it doesn't exist, then Python would create the file. Is that accurate? And is there a better, more elegant way to do this?
I assume inpath and outpath are paths to directories rather than to files. If so, you cannot do:
with open(inpath, 'r') as f:
    ...
It will throw an IOError exception. The open function expects a file path, but since you are providing the path to a directory, an exception occurs. The same applies to outpath. Now let's assume the values of inpath and outpath are:
input_path = '/Users/avi/inputs'
output_path = '/Users/avi/outputs'
Now, to read a file, you could do:
input_file_path = os.path.join(input_path, 'input.txt')
The input_file_path will now be /Users/avi/inputs/input.txt
and to open this:
with open(input_file_path, 'r') as f:
    ...
Now coming to the second question: yes, if the file already exists, opening it with mode 'w' will overwrite it. If it does not exist, Python creates a new one. So you can first check whether the file exists. If it does, you can write to a different file instead:
output_path_file = os.path.join(output_path, 'output.txt')
if os.path.isfile(output_path_file):
    # file already exists
    # do something else like create another file
    output_path_file = os.path.join(output_path, 'new_output.txt')
# now write to output file
with open(output_path_file, 'w') as f:
    ...
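As for a more elegant way to read one file and write another, both files can be opened in a single with statement. The sketch below is one possible approach, not the only one: the function name copy_upper and the upper-casing step are made up for illustration. It uses mode 'x' (Python 3.3 and later), which creates the file and raises FileExistsError if it is already there, so nothing gets overwritten silently.

```python
def copy_upper(in_file, out_file):
    """Read in_file and write its upper-cased text to a new out_file."""
    # both files are closed automatically when the block ends
    with open(in_file, "r") as fin, open(out_file, "x") as fout:
        for line in fin:
            fout.write(line.upper())  # stand-in for the real transformation
```

With mode 'w' in place of 'x', an existing output file would be truncated instead of raising an error.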
I recently wrote a python script to select certain files within a directory and save them to a new archive within that directory. The script works with the exception that it creates a duplicate archive within the new archive. I think it has something to do with the arcname I used and the loop but I'm really not sure. As I'm sure is obvious by looking at my code I am a beginner so I am sure there is plenty of room for improvement here. Any ideas as to where the problem is? Also if you have any suggestions for improving the code I'm all ears.
import os, arcpy, zipfile
inputfc = arcpy.GetParameterAsText(0) # User Inputs Feature Class Path
desc = arcpy.Describe(inputfc)
fcname = desc.basename
zname = fcname + ".zip"
gpath = os.path.dirname(inputfc)
zpath = os.path.join(gpath, zname)
zfile = zipfile.ZipFile(zpath, "w")
for f in os.listdir(gpath):
    fpath = os.path.join(gpath, f)
    if f.startswith(fcname):
        zfile.write(fpath, f, compress_type=zipfile.ZIP_DEFLATED)
zfile.close()
Edit: After aruisdante answered my question I decided to just change the zname variable to
zname = "zip" + fcname + ".zip" #ugly but it worked thanks
This:
zfile = zipfile.ZipFile(zpath, "w")
Creates a new Zip file at zpath
for f in os.listdir(gpath):
Iterates through all of the files at gpath. Since gpath is also the directory that contains zpath, the zip file you just created is one of the files in gpath. Its name starts with fcname, so it passes the startswith check and gets included in the archive. You will need to exclude it:
for f in (filename for filename in os.listdir(gpath) if filename != zname):
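Put together without arcpy, the fix could look like the sketch below. The function name zip_matching and its folder/prefix arguments are made up for illustration; prefix plays the role of fcname and folder the role of gpath.

```python
import os
import zipfile


def zip_matching(folder, prefix):
    """Zip every entry in folder whose name starts with prefix, except the archive itself."""
    zname = prefix + ".zip"
    zpath = os.path.join(folder, zname)
    # the with block closes the archive even if writing fails
    with zipfile.ZipFile(zpath, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(os.listdir(folder)):
            # skip the archive being written, which also starts with prefix
            if name.startswith(prefix) and name != zname:
                zf.write(os.path.join(folder, name), arcname=name)
    return zpath
```

Excluding by name keeps the archive in the same folder; writing the zip file to a different directory would avoid the problem as well.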