Python script to run recursively but print only once - python-2.7

I am new to the scripting world. I have multiple files in a directory, and I am trying to write a script that uses os.walk to search the directory for a file starting with a given string, but prints only once at the end, stating whether the file was found.
import os
for path, currentDirectory, files in os.walk("/root/kin"):
    for file in files:
        if file.startswith('aglk') and file.endswith('.txt'):
            print("File Exists:", file)
            break
        else:
            print("File Doesn't Exists")
In my case, if the file doesn't exist, the message prints multiple times, once for every file checked, until the whole directory has been searched, which is not what I need. I want the script to search all the files in the directory and print only once at the end, stating found or not found. Please suggest.
I tried changing the if and else statements but couldn't find the right indentation to use.
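A minimal sketch of one fix (the /root/kin path and the aglk prefix are taken from the question): track a found flag so the walk can stop early and the "not found" message prints only once, after the whole walk:

import os

found = False
for path, dirs, files in os.walk("/root/kin"):
    for name in files:
        if name.startswith('aglk') and name.endswith('.txt'):
            print("File Exists: " + name)
            found = True
            break
    if found:
        break  # stop walking once a match is found

if not found:
    print("File Doesn't Exist")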

Related

error in indirect file load in Informatica

I am trying indirect file load in Informatica.
I put the below files in $PmSrcFilesDir (the directory from which the workflow task picks up the files):
-list.txt
-production_plan_20210906.csv
-production_plan_20210907.csv
The list.txt file contains the csv file names only.
I configured the below options:
Source filetype- Indirect
Source filename- list.txt
Source file directory- $PMSourceFileDir
After running the workflow, it shows this error:
FR_3000 Error Opening File...No such file or directory
You can give absolute paths in list.txt:
/Path/to/file/production_plan_20210906.csv
/Path/to/file/production_plan_20210907.csv
You can use a command task or a shell script to generate the absolute paths and file names, as in the sketch below.
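A minimal Python sketch of that idea (the source directory and the production_plan_*.csv pattern are assumptions based on the question; substitute the real $PMSourceFileDir value):

import glob
import os

# Assumed source file directory; replace with the actual $PMSourceFileDir path.
src_dir = "/path/to/SrcFiles"

# Rewrite list.txt so that each line holds the absolute path of one csv file.
with open(os.path.join(src_dir, "list.txt"), "w") as f:
    for csv_path in sorted(glob.glob(os.path.join(src_dir, "production_plan_*.csv"))):
        f.write(os.path.abspath(csv_path) + "\n")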
Please check the session log to see which file it can't read - the list file or the main file. If it is the list file, make sure $PMSourceFileDir is set correctly in the parameter file.
Also make sure the Informatica user (the user that runs the Informatica server) has read access to the data and list folders and files. An admin can help with this.

Python glob.glob to work with optional subdirectories?

Is there a way to make glob.glob (not glob2) search for files in optional subdirectories in Python 2.7?
I want to search for files ending with "_stats.txt", in these two paths:
/starting_path/Data/Intensities/BaseCalls/Primary_Analysis_Results/results/FASTQ_1mm_currentDate/Project_1/trimming_currentDate/cutadapt_S1_stats.txt
/starting_path/Data/Intensities/BaseCalls/Primary_Analysis_Results/results/FASTQ_1mm_currentDate/trimming_currentDate/cutadapt_S1_stats.txt
As you can see, the "Project_1" subdirectory doesn't always exist in the file paths. So far I have tried the following code:
stats_paths = glob.glob("/starting_path/Data/Intensities/BaseCalls/Primary_Analysis_Results/results/FASTQ_*/**/trimming_*/*_stats.txt")
but it only works when the "Project_1" subdirectory exists. When it is not in the path, I get an empty list.
Thanks in advance!
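One workaround, since glob in Python 2.7 has no recursive ** (each * matches exactly one directory level): run one glob per possible depth and combine the results. A sketch based on the paths in the question:

import glob

base = "/starting_path/Data/Intensities/BaseCalls/Primary_Analysis_Results/results"

# "**" is not recursive in Python 2.7's glob, so cover both depths
# explicitly: without and with the optional Project_* level.
stats_paths = (
    glob.glob(base + "/FASTQ_*/trimming_*/*_stats.txt") +
    glob.glob(base + "/FASTQ_*/*/trimming_*/*_stats.txt")
)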

Grep with regex from file in a bash script without including subfolders

I have a file containing various paths such as:
/home/user/Desktop/Bash/file1.txt
/home/user/Desktop/Bash/file2.txt
/home/user/Desktop/Bash/file3.txt
/home/user/Desktop/Bash/moreFiles/anotherFile.txt
/home/user/Desktop/text/prettyFile.txt
And I receive an input from the user that contains a directory, such as:
/home/user/Desktop/Bash/
I usually turn this input into a regex and use grep to find all the files in the directory. However, if the folder contains more folders, grep includes their files as well, and I only want the files directly inside the directory the user entered. My desired output should be this:
/home/user/Desktop/Bash/file1.txt
/home/user/Desktop/Bash/file2.txt
/home/user/Desktop/Bash/file3.txt
But it keeps including
/home/user/Desktop/Bash/moreFiles/anotherFile.txt
which I don't want, and I need to do it inside a bash script.
You can use this grep command to get just the files directly under the given path, skipping sub-directories ([^/]+ matches a single path component with no further /, so files nested in deeper folders are excluded):
s='/home/user/Desktop/Bash/'
grep -E "$s[^/]+/?$" file
/home/user/Desktop/Bash/file1.txt
/home/user/Desktop/Bash/file2.txt
/home/user/Desktop/Bash/file3.txt

Locating project-specific configuration files from imported modules

Project structure:
/lib/modules/mod1.py
/mod2.py
/subdir1/subdir2/mod3.py
/configs/config.yaml
mod3.py imports mod2.py. mod2.py imports mod1.py. mod1.py loads configuration files that are at a relative path to mod2.py using os.getcwd().
The problem is that when mod3.py imports mod2.py, mod1.py attempts to load the config files from a path relative to mod3.py (i.e. /subdir1/subdir2/configs/config.yaml instead of /configs/config.yaml), which, of course, doesn't work.
I believe I understand why this isn't working (os.getcwd() returns the working directory of the originally executed script, not the location of the imported module).
How can I fix this so that mod1.py will use a path relative to mod2.py even when mod2.py is imported from mod3.py?
I haven't been able to find a built-in way to do this in Python, so what I ended up doing is this:
mod1.py:
import os

configs_list = os.getcwd().split('/')
for x in configs_list:
    # Check each directory in the list, bottom up. pop() the list on
    # each failure. Assign the variable and break the loop when the configs path is found.
    if not os.path.exists('/'.join(configs_list) + '/configs'):
        configs_list.pop()
    else:
        configs_path = '/'.join(configs_list) + '/configs'
        break
configs_path is then used to prefix the specific configuration file name(s) in mod1.py. Since every call to mod1.py will occur from within a project's directory structure, and every project has only one configs directory, this should (and so far has) correctly identify the configs directory regardless of where in the project the given script is run from.
I'm open to better or more Pythonic ways of doing this, if anyone has input.
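A more Pythonic alternative (a sketch assuming the layout above, where mod1.py lives in /lib/modules/ two levels below the project root): anchor the path to the module file itself via __file__ rather than the working directory, so it no longer matters which script started the process:

import os

# Directory containing mod1.py, independent of the caller's working directory.
MODULE_DIR = os.path.dirname(os.path.abspath(__file__))

# Climb from /lib/modules up to the project root, then point at configs/.
PROJECT_ROOT = os.path.dirname(os.path.dirname(MODULE_DIR))
configs_path = os.path.join(PROJECT_ROOT, 'configs')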

Read and write files to different directories in Python

I want my code to go into a subdirectory, perform some operations, and save the output to a file one level up, in the main directory.
Main directory ---> sub_directory
I would appreciate solutions which do not require "hardcoding" the path of the main directory. Is there a way I can write my file output directly to the main directory without doing an os.chdir() every iteration? Something like just giving the path of the file to read and write?
For example:
# example
import os
for i in xrange(10):
    # code to read and operate on some file in this sub dir, one by one (ten files)
    # write output file to the previous directory
    # without hardcoding the path
    # code to write files to main directory (ten files)
You probably want to check the directory the file is operating within or check the current working directory:
import os
cur_dir = os.getcwd()
top_dir = os.path.dirname(cur_dir)
# perform operations in current directory
# do some stuff in top directory
Assuming you start in the main directory, and you know the (relative) path to the subdirectories, just do
open(os.path.join(subdir, filename))
to access a path in a subdirectory without actually changing the current directory.
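Putting the two ideas together, a minimal sketch (the sub_directory name and the uppercasing step are placeholders for the real files and operation):

import os

sub_dir = "sub_directory"  # hypothetical subdirectory name

# Read each file in the subdirectory and write the result one level up
# (the main directory), without ever calling os.chdir().
for name in os.listdir(sub_dir):
    in_path = os.path.join(sub_dir, name)
    if not os.path.isfile(in_path):
        continue  # skip nested directories
    with open(in_path) as src:
        data = src.read().upper()  # placeholder for the real operation
    out_path = os.path.join(sub_dir, os.pardir, "out_" + name)
    with open(out_path, "w") as dst:
        dst.write(data)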