Grep with regex from file in bash script without inclusion of more folders - regex

I have a file containing various paths such as:
/home/user/Desktop/Bash/file1.txt
/home/user/Desktop/Bash/file2.txt
/home/user/Desktop/Bash/file3.txt
/home/user/Desktop/Bash/moreFiles/anotherFile.txt
/home/user/Desktop/text/prettyFile.txt
And I receive an input from the user that contains the directory, such as:
/home/user/Desktop/Bash/
I usually save this expression into a regex and use grep to find all the files in the directory. However, if the folder contains subfolders, their files are matched as well, and I only want the files directly inside the directory entered by the user. My desired output is:
/home/user/Desktop/Bash/file1.txt
/home/user/Desktop/Bash/file2.txt
/home/user/Desktop/Bash/file3.txt
But it keeps including
/home/user/Desktop/Bash/moreFiles/anotherFile.txt
which I don't want. I need to do this inside a bash script.

You can use this grep command to get just the files directly under the given path, skipping subdirectories (the leading ^ anchors the match to the start of the line):
s='/home/user/Desktop/Bash/'
grep -E "^$s[^/]+/?$" file
/home/user/Desktop/Bash/file1.txt
/home/user/Desktop/Bash/file2.txt
/home/user/Desktop/Bash/file3.txt
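One caveat with the pattern above: the user's directory is interpolated into the regex, so characters such as . in the path act as regex operators. Here is a minimal sketch of an alternative that compares the directory as a literal string with awk. The function name list_direct_children and the DIR_PREFIX environment variable are made up for illustration, and unlike the /?$ in the grep pattern this version does not accept entries that end in a slash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lines of LISTFILE that name entries directly
# inside DIR, treating DIR as a literal string rather than as a regex.
list_direct_children() {
    local dir=${1%/}/    # make sure the directory ends with a single slash
    # Pass the directory through the environment so awk does not process
    # backslash escapes in it (which awk -v would do).
    DIR_PREFIX=$dir awk '
        BEGIN { d = ENVIRON["DIR_PREFIX"] }
        index($0, d) == 1 {                     # line starts with the directory
            rest = substr($0, length(d) + 1)    # what follows the directory
            if (rest != "" && index(rest, "/") == 0) print
        }' "$2"
}
```

With the file from the question, list_direct_children /home/user/Desktop/Bash/ file should print only file1.txt, file2.txt and file3.txt with their full paths.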

Using regular expression in lftp to ignore some strings from file name

I want to get specific files with names like abc_yyyymmdd_hhmmss.csv from a directory using mget.
Example files in a folder:
abc_20221202_145911.csv
abc_20221202_145921.csv
abc_20221202_145941.csv
abc_20181202_145941.csv
But I want to ignore the hhmmss part and get all files matching abc_20221202_*.csv.
How do I include * in mget?
My code is below:
File=abc_
Date=20221202
Filename=$File$Date"_*".csv
# Assume I have an sftp connection established and I am in the directory
# where files with the above naming convention are present, as I can
# download the file when hard-coding the exact file name during testing.
conn=`lftp $protocol://$user:$password@$sftp_server -p $port <<EOF>/error.log
cd $path
mget $Filename
EOF`
The script is able to find the file but is not able to retrieve it from the server.
But if I remove the * and provide the entire file name, abc_20221202_145941.csv, it downloads the file. Why is the * causing a problem when retrieving the file?
Assuming mget actually accepts regex:
As a regex, your current pattern matches abc_20221202 followed by any number of underscores, then any single character and csv.
Just add a . before the * so that it matches any character after the underscore, any number of times, before the .csv.
Like so:
Filename=$File$Date"_.*".csv
If mget doesn't actually support regex, just use wget instead. Note that -A takes a shell-style wildcard pattern rather than a regex:
wget -r -np -nH -A "abc_20221202_*.csv" --ftp-user=user --ftp-password=psd ftp://ip/*
I think the backtick was causing the problem when using *. Once I removed the ` (backtick) and used the command below, it worked fine.
lftp -p $port $protocol://$user:$password@$sftp_server <<EOF>/error.log
cd $path
lcd $targetPath
mget $Filename
EOF
You probably missed an underscore between File and Date. A good way to debug such problems is to enable debug output (the “debug” command) and command logging (set cmd:trace true).
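For what it's worth, the lftp manual describes mget as getting files "with expanded wildcards", which suggests shell-style globs rather than regexes, so abc_20221202_*.csv would already be a valid pattern. Below is a small local sketch for checking which names a glob selects without touching the server. It assumes bash's own pattern matching is close enough to what the server-side expansion does, and select_by_glob is a hypothetical helper, not an lftp command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each NAME that matches the shell glob PATTERN.
select_by_glob() {
    local pattern=$1 name
    shift
    for name in "$@"; do
        # An unquoted right-hand side of == inside [[ ]] is treated as a glob
        [[ $name == $pattern ]] && printf '%s\n' "$name"
    done
    return 0
}

File=abc_
Date=20221202
Filename="${File}${Date}_*.csv"    # a glob: * stands for any run of characters

select_by_glob "$Filename" \
    abc_20221202_145911.csv abc_20221202_145921.csv \
    abc_20221202_145941.csv abc_20181202_145941.csv
```

This prints the three 20221202 names and skips abc_20181202_145941.csv. With the regex-style _.* variant, the same check would only match names that contain a literal dot right after the underscore.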

Solaris: Regex how to select files with certain filename

First of all, the server runs Solaris.
The context of my question is Informatica PowerCenter.
I need to list the files located in the Inbox directory. Basically, the outcome should be one file list per file type. The different file types are distinguished by the file name. I don't want to update the script every time a new file type appears, so I was thinking of a parameterized shell script that takes the regex, the inbox directory and the file list.
An example:
/Inbox/ABC.DEFGHI.PAC.AE.1236547.49566
/Inbox/ABC.DEFGHI.PAC.AE.9876543.21036
/Inbox/DEF.JKLMNO.PAC.AI.1236547.49566
... has to result in 2 list files containing the path and file name of the listed files:
/Inbox/PAC.AE.FILELIST
-->/Inbox/ABC.DEFGHI.PAC.AE.1236547.49566
-->/Inbox/ABC.DEFGHI.PAC.AE.9876543.21036
/Inbox/PAC.AI.FILELIST
-->/Inbox/DEF.JKLMNO.PAC.AI.1236547.49566
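Assuming all input files follow the convention you indicate (when splitting on dots, the 3rd and 4th columns determine the type), this script might do the trick:
#!/usr/bin/env bash
# First parameter, or the current directory if not given
INPUTDIR=${1:-.}
# Second parameter, or the input directory if not given
OUTPUTDIR=${2:-$INPUTDIR}
# List the entries and filter out directories (ls -p marks them with a /)
INPUTFILES=$(ls -p "$INPUTDIR" | grep -v "/")
echo "Input: $INPUTDIR, output: $OUTPUTDIR"
# The unquoted expansion splits on whitespace, so file names must not contain spaces
for FILE in $INPUTFILES; do
    # The 3rd and 4th dot-separated fields identify the file type, e.g. PAC.AE
    FILETYPE=$(echo "$FILE" | cut -d. -f3,4)
    COLLECTION_FILENAME="$OUTPUTDIR/${FILETYPE:-UNKNOWN}.FILELIST"
    # Record the path together with the file name, as in the example lists
    echo "$INPUTDIR/$FILE" >> "$COLLECTION_FILENAME"
done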
Usage:
./script.sh Inbox Inbox/collections
This reads all files (not directories) from Inbox and writes the collection files to Inbox/collections. File names inside the collections should be sorted alphabetically.
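To see how the type extraction in the script behaves on different names, here is a small sketch that isolates the cut step. The function name filetype_of is made up for illustration. One detail worth knowing: cut passes a line without any delimiter through unchanged, so a name with no dots becomes its own type instead of UNKNOWN.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the script's type extraction:
# fields 3 and 4 of the dot-separated name, or UNKNOWN if that is empty.
filetype_of() {
    local ft
    ft=$(printf '%s\n' "$1" | cut -d. -f3,4)
    printf '%s\n' "${ft:-UNKNOWN}"
}

filetype_of ABC.DEFGHI.PAC.AE.1236547.49566   # PAC.AE
filetype_of DEF.JKLMNO.PAC.AI.1236547.49566   # PAC.AI
filetype_of A.B                               # UNKNOWN (there is no 3rd field)
filetype_of README                            # README (no delimiter, line passed through)
```

If names without dots should also land in UNKNOWN.FILELIST, adding -s to cut suppresses lines that contain no delimiter.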

Copy all Files in a List to a Unique Directory

I am trying to take a text file that contains a list of files and copy them all to a directory. Within this directory, they will have unique directory names. An example of the text file's structure can be seen below:
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000003/s01_2011_11_01/a_.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000003/s01_2011_11_01/a_1.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000003/s02_2011_11_11/a_.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000003/s02_2011_11_11/a_1.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s01_2009_02_13/a_.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s02_2010_10_02/a_.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s03_2010_10_02/a_.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s04_2010_10_03/a_.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s04_2010_10_03/a_1.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s04_2010_10_03/a_2.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s04_2010_10_03/a_3.edf
/data/isip/data/tuh_eeg/v0.6.0/edf/001/00000005/s04_2010_10_03/a_4.edf
I need a shell command or an Emacs macro to go through this list and copy the files to unique directories within the current working directory. The unique directory will depend on the file; for example, for the first two files, the directory would be
/001/00000003/s01_2011_11_01/
I have tried doing this with an Emacs macro, but I was not able to get it to work. Either a shell command or an Emacs macro would do.
Something as simple as:
cat list | sed "s/^.*edf\/\(.*\)\/\(.*\)$/mkdir -p root_dir\/\1 \&\& cp \0 root_dir\/\1\/\2/" | sh
If you are on OS X, install gnu-sed and use gsed instead of sed. Run the command without | sh first to see what it will do. Make sure to adjust root_dir, of course.
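If generating commands and piping them to sh feels risky (a path containing spaces or quotes would break the generated command line), the same idea can be sketched as a plain read loop. This is only a sketch under the assumption that everything after the last /edf/ component should be recreated under the destination; the function name copy_preserving_tail and its three arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every file named in LIST to ROOT, keeping the part
# of its path that follows the last "/MARKER/" component (e.g. /edf/).
copy_preserving_tail() {
    local list=$1 marker=$2 root=$3 src rel
    while IFS= read -r src; do
        [ -n "$src" ] || continue             # skip empty lines
        rel=${src##*/"$marker"/}              # e.g. 001/00000003/s01_2011_11_01/a_.edf
        [ "$rel" != "$src" ] || continue      # marker not found, skip the line
        mkdir -p "$root/$(dirname "$rel")" && cp -- "$src" "$root/$rel"
    done < "$list"
}
```

For the list in the question this would be called as copy_preserving_tail list edf . from the target directory.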

Shell script to create directories and files from a list of file names

I'm (still) not a shell-wizard, but I'm trying to find a way to create directories and files from a list of file names.
Let's take this source file (source.txt) as an example:
README.md
foo/index.html
foo/bar/README.md
foo/bar/index.html
foo/baz/README.md
I'll use this command to remove empty lines and trim useless spaces:
$ more source.txt | sed '/^$/d;s/^ *//;s/ *$//'
It will give me this list:
README.md
foo/index.html
foo/bar/README.md
foo/bar/index.html
foo/baz/README.md
Now I'm trying to loop over every line and create the related file (if it doesn't already exist), along with its parent directories.
How could I do this?
Ideally, I would put this script in an alias to quickly use it.
As always, posting a question brings me to the end of the problem...
I came to a satisfying solution, using dirname and basename in a for .. in loop:
for i in $(sed '/^$/d;s/^ *//;s/ *$//' source.txt); do
    # Create the parent directories, then the (empty) file itself
    mkdir -p "$(dirname "$i")"
    touch "$(dirname "$i")/$(basename "$i")"
done
This one-line command will:
read the file names list
create directories
create empty files in their own directory
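A for .. in loop over command output splits on every space, so a name like "foo/my notes.txt" would be turned into two files. Here is a sketch of a variant that reads the list line by line instead and does the trimming in the shell. The function name create_from_list is made up, and it assumes one path per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create every file named in LIST (one path per line),
# together with its parent directories, relative to the current directory.
create_from_list() {
    local list=$1 line
    # The extra test keeps a last line that has no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line#"${line%%[! ]*}"}    # strip leading spaces
        line=${line%"${line##*[! ]}"}    # strip trailing spaces
        [ -n "$line" ] || continue       # skip empty lines
        mkdir -p "$(dirname "$line")" && touch "$line"
    done < "$list"
}
```

To use it as a quick command, the function could go into ~/.bashrc and be called as create_from_list source.txt.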

Regex to add an extension to a directory full of files

I am new to regular expressions.
I have many irregularly numbered ascii files with no extension: g000554, g000556, g000558, g000561, g000563 ... g001979 etc
I would like to type a regex at the terminal (or in a short script) to add a .dat to all of these files.
So I would like to change them to become: g000554.dat, g000556.dat, g000558.dat, g000561.dat, g000563.dat ... g001979.dat etc
P.S. Sorry, I should have provided more info: by terminal I meant a Mac terminal, and I cannot use the 'rename' command.
I assume you're using a Linux system, so here is a bash solution. It works only if your files start with g and there are no other files in that directory apart from the ones you want to rename.
for i in g*; do mv "$i" "$i.dat"; done
The following adds a .dat extension to every file in the current directory:
for i in *; do mv "$i" "$i.dat"; done
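Since the question asks about regexes: if the directory may contain other files, the loop can check each name against a pattern before renaming. This is a sketch that assumes the names are exactly g followed by digits, as in the examples. The function name add_dat_extension is made up, and mv -n is used so that an existing .dat file is not overwritten.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append .dat to files in DIR (default: current directory)
# whose names consist of "g" followed only by digits.
add_dat_extension() {
    local dir=${1:-.} f
    for f in "$dir"/g*; do
        [ -f "$f" ] || continue                      # skip directories and an unmatched glob
        [[ ${f##*/} =~ ^g[0-9]+$ ]] || continue      # skip names like gnotes or g000556.dat
        mv -n -- "$f" "$f.dat"
    done
}
```

Running add_dat_extension in the directory renames g000554 to g000554.dat and so on, and leaves everything else alone.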