Was wondering if someone could help me out with regular expressions and bash.
I'm trying to execute a set of commands on files that only have a certain extension, in this case: mpg, mpeg, avi, and mkv.
I've actually found a solution here; however, it doesn't seem to work. If someone can tell me why, I'd appreciate it.
#!/bin/bash
# Configuration
TARGETDIR="$1"
TARGETEXT="(mpg|mpeg|avi|mkv)"
for d in `find $1 -type d`
do
    echo "Searching directory: $d"
    for f in "$d"/*
    do
        if [ -d "${f}" ];
        then
            # File is a directory, do not perform
            echo "$f is a directory, not performing ..."
        elif [ -f "${f}" ];
        then
            filename=$(basename "$f")
            extension="${filename##*.}"
            if [ "$extension" == "$TARGETEXT" ];
            then
                echo "Match"
            else
                echo "Mismatch - $f - $extension"
            fi
        fi
    done
done
Again, any assistance is appreciated.
This can probably be done using only the find command.
find "$TARGETDIR" -regex ".*\\.$TARGETEXT" -type f -exec your_command {} \;
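One caveat: with GNU find the default -regex dialect is emacs-style, where (a|b) is not treated as alternation, so the pattern above may silently match nothing. A hedged sketch that selects the dialect explicitly (your_command is just the placeholder from above):
find "$TARGETDIR" -regextype posix-extended -type f \
    -regex ".*\.$TARGETEXT" -exec your_command {} \;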
Instead of a direct string comparison
if [ "$extension" == "$TARGETEXT" ];
use Bash's regex matching syntax:
if [[ "$extension" =~ $TARGETEXT ]];
Note the double brackets [[ ]] and the unquoted $TARGETEXT.
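Keep in mind that =~ does a substring match, so an extension such as mpga would also count as a hit. If you want an exact match, a small sketch that anchors the pattern, reusing the variable names from the question:
TARGETEXT="^(mpg|mpeg|avi|mkv)$"
if [[ "$extension" =~ $TARGETEXT ]]; then
    echo "Match"
else
    echo "Mismatch - $f - $extension"
fi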
You can do this in bash (4.0 or newer, for globstar) without regular expressions, using just file patterns:
shopt -s globstar nullglob
for f in **/*.{mpg,mpeg,avi,mkv}; do
    if [[ -f "$f" ]]; then
        # do something with the file:
        echo "$f"
    fi
done
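If you want to start from the directory passed as $1, as the original script does, the same idea works with a prefix on the pattern; a small sketch:
shopt -s globstar nullglob
for f in "$1"/**/*.{mpg,mpeg,avi,mkv}; do
    if [[ -f "$f" ]]; then
        echo "$f"    # replace with whatever you want to run on the file
    fi
done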
I have written a small script that loops through directories (starting from a given argument directory) and prints the directories that have an XML file inside. Here is my code:
#! /bin/bash
process()
{
    LIST_ENTRIES=$(find $1 -mindepth 1 -maxdepth 1)
    regex="\.xml"
    if [[ $LIST_ENTRIES =~ $regex ]]; then
        echo "$1"
    fi
    # Process found entries
    while read -r line
    do
        if [[ -d $line ]]; then
            process $line
        fi
    done <<< "$LIST_ENTRIES"
}
process $1
This code works fine. However, if I change the regex to \.xml$ to indicate that it should match at the end of the line, the result is different, and I do not get all the right directories.
Is there something wrong with this?
Your variable LIST_ENTRIES holds the entire find output as one multi-line string, and bash's =~ anchors $ at the end of that whole string, not at the end of each line. So \.xml$ only matches when the very last entry happens to end in .xml.
To see what the string looks like, try echo "$LIST_ENTRIES".
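A minimal illustration of that behaviour, with made-up values:
entries=$'sub/a.xml\nsub/b.txt'
re='\.xml$'
[[ $entries =~ $re ]] && echo "anchored: match"      # prints nothing, the string ends in b.txt
re='\.xml'
[[ $entries =~ $re ]] && echo "unanchored: match"    # prints "unanchored: match"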
To overcome this, put a for loop around your if:
process()
{
    LIST_ENTRIES=$(find $1 -mindepth 1 -maxdepth 1)
    regex="\.xml$"
    for each in $LIST_ENTRIES; do
        if [[ $each =~ $regex ]]; then
            echo "$1"
        fi
    done
    # Process found entries
    while read -r line
    do
        if [[ -d $line ]]; then
            process $line
        fi
    done <<< "$LIST_ENTRIES"
}
process $1
I'm looking for a fast, short and portable way to check if a file matching the regex (env(ironment)?|requirements).ya?ml exists in the current working directory and if so assign its basename to a variable $FILE for further processing.
Basically, I'd like to combine getting the handle in
for FILE in environment.yml env.yml requirements.yml environment.yaml env.yaml requirements.yaml; do
    if [ -e $FILE ]; then
        ...
    fi
done
with using a regex as in
if test -n "$(find -E . -maxdepth 1 -regex '.*(env(ironment)?|requirements).ya?ml' -print -quit)"
then
    ...
fi
Stick it in a variable (note that -E is BSD find's flag for extended regexes; GNU find spells it -regextype posix-extended):
file="$(find -E . -maxdepth 1 -regex '.*(env(ironment)?|requirements).ya?ml' -print -quit)"
if [ -n "$file" ]
then
    echo "I found $file"
else
    echo "No such file."
fi
Alternatively, you can keep your loop and shorten it using brace expansion:
for file in {env{,ironment},requirements}.{yml,yaml}
do
    if [ -e "$file" ]
    then
        echo "Found $file"
    else
        echo "There is no $file"
    fi
done
or match files directly using bash's extglob:
shopt -s extglob nullglob
for file in @(env?(ironment)|requirements).y?(a)ml
do
    echo "Found $file"
done
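To stop at the first hit and keep the name in $FILE, as the question asks, a small sketch reusing the brace-expansion list above:
FILE=
for f in {env{,ironment},requirements}.{yml,yaml}; do
    if [ -e "$f" ]; then
        FILE=$f
        break
    fi
done
[ -n "$FILE" ] && echo "Found $FILE"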
I'm currently trying to loop through all files in a certain directory using bash. If the file matches the following regular expression, it outputs the filename. If it doesn't, it outputs 'not' and then the filename. The regular expression is supposed to filter out any files that have a '.' in them.
for f in * ; do
    if [[ $f =~ "^[^\.]+$" ]]; then
        echo "$f"
    else
        echo "not $f"
    fi
done
It correctly loops through all the files, but for a reason that has stumped me for quite a while, I cannot get it to only exclude files with a '.' in them. For example, in a directory with the following files:
bashrc
gitconfig
install.sh
README.md
vimrc
the output of the script is such:
not bashrc
not gitconfig
not install.sh
not README.md
not vimrc
I validated the regular expression here. Any thoughts?
Don't quote the right-hand side of your expression.
if [[ $f =~ ^[^.]+$ ]]; then
Quotes make the string a literal substring, rather than a regular expression.
For better portability across bash versions, put your regex in a variable (single-quoted, so the shell leaves its contents alone):
re='^[^.]+$'
if [[ $f =~ $re ]]; then
That said, you could do this with an extglob as well:
shopt -s extglob # enable extended globs
for f in +([!.]); do
    printf 'Matched %q\n' "$f"
done
...or with a general-purpose pattern match:
for f in *; do
    if [[ $f = *.* ]]; then
        printf '%q contains a dot\n' "$f"
    else
        printf '%q does not contain a dot\n' "$f"
    fi
done
So I'm writing a bash script that counts the number of files in a directory and outputs a number. The function takes a directory argument as well as an optional file-type extension argument.
I am using the following lines to set the dir variable to the directory and ext variable to a regular expression that will represent all the file types to count.
dir=$1
[[ $# -eq 2 ]] && ext="*.$2" || ext="*"
The problem I am encountering occurs when I attempt to run the following line:
echo $(find $dir -maxdepth 1 -type f -name $ext | wc -l)
Running the script from the terminal works when I provide the second file-type argument but fails when I don't.
harrison@Luminous:~$ bash Documents/howmany.sh Documents/ sh
3
harrison@Luminous:~$ bash Documents/howmany.sh Documents/
find: paths must precede expression: Desktop
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]
0
I have searched for this error and I know it's an issue with the shell expanding my wildcard as explained here. I've tried experimenting with single quotes, double quotes, and backslashes to escape the asterisk but nothing seems to work. What's particularly interesting is that when I try running this directly through the terminal, it works perfectly fine.
harrison@Luminous:~$ echo $(find Documents/ -maxdepth 1 -type f -name "*" | wc -l)
6
Simplified:
dir=${1:-.}        # if $1 is not set, use .
name=${2+*.$2}     # if $2 is set, use *.$2 for name
name=${name:-*}    # if name still isn't set, use *
find "$dir" -name "$name" -print    # use quotes
or
name=${2+*.$2}     # if $2 is set, use *.$2 for name
find "${1:-.}" -name "${name:-*}" -print    # use quotes
Also, as @John Kugelman says, you could use:
name=${2+*.$2}
find "${1:-.}" ${name:+-name "$name"} -print
find . -name "*" -print is the same as find . -print, so if $name isn't set, there's no need to specify -name "*".
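For reference, a quick illustration of the parameter expansions used above (the values are made up):
set -- Documents sh
echo "${2+*.$2}"     # prints *.sh because $2 is set
set -- Documents
echo "${2+*.$2}"     # prints an empty line because $2 is unset
name=
echo "${name:-*}"    # prints * because name is empty, so the default kicks in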
Try this: keep the double quotes when you build the pattern, and quote the variable where you use it so the shell doesn't expand the glob:
dir="$1"
[[ $# -eq 2 ]] && ext="*.$2" || ext='*'
echo $(find "$dir" -maxdepth 1 -type f -name "$ext" | wc -l)
If that doesn't work, you can just switch to an if statement, where you use the -name pattern in a branch and you don't in the other.
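A sketch of that if-statement variant, keeping the wc -l count from the question:
dir="$1"
if [[ $# -eq 2 ]]; then
    find "$dir" -maxdepth 1 -type f -name "*.$2" | wc -l
else
    find "$dir" -maxdepth 1 -type f | wc -l
fi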
A couple more points:
Those are not regular expressions, but rather shell patterns.
echo $(command) is just equivalent to command.
This is a part of my shell script, which I use to perform a recursive find and replace in the working directory. Backup and other utilities are in other functions, which are irrelevant to my problem.
#!/bin/bash
# backup function goes here
# @param $1 The find pattern.
# @param $2 The replace pattern.
function findAndReplace {
    bufferFile=/tmp/tmp.$$
    filesToReplace=`find . -type f | grep -vi cvs | grep -v '#'`
    sedPattern="s/$1/$2/g"
    echo "Using pattern $sedPattern"
    for f in $filesToReplace; do
        echo "sedding file $f"
        sed "$sedPattern" "$f" > "$bufferFile"
        exitCode=$?
        if [ $exitCode -ne 0 ] ; then
            echo "sed $sedPattern exited with $exitCode"
            exit 1
        fi
        chown --reference=$f $bufferFile
        mv $bufferFile $f
    done
}
backup
findAndReplace "$1" "$2"
Here's a sample usage: recursive-replace.sh "function _report" "function report".
It works, but there is one problem. It uses sed on ALL files in the working directory. I would like to sed only those files that contain the find pattern.
Then, I modified the line:
filesToReplace=`find . -type f | grep -vi cvs | grep -v '#'`
to:
filesToReplace=`grep -rl "$1" . | grep -vi cvs | grep -v '#'`
And it works too, but not for all find patterns. E.g. for the pattern \$this->report\((.*)\) I receive the error: grep: Unmatched ( or \(. This pattern is correct for sed, but not for grep.
Regex syntaxes for grep and sed differ. What can I do?
Use grep -E (the "extended" regexp option, also available on many systems as egrep); it usually solves this kind of problem.
Also, why not keep using find?
filesToReplace=`find . -name CVS -prune -o -type f -exec grep -l "$1" {} \; | grep -v '#'`
Also note the -i option of sed, which allows in-place changes in files and the removal of the bufferFile/chown/mv logic.
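A rough sketch of what that simplification could look like, assuming GNU sed (BSD sed wants -i '' instead); like the original, it still assumes filenames without spaces:
function findAndReplace {
    sedPattern="s/$1/$2/g"
    echo "Using pattern $sedPattern"
    # only touch files that actually contain the find pattern
    for f in `find . -name CVS -prune -o -type f -exec grep -l "$1" {} \; | grep -v '#'`; do
        echo "sedding file $f"
        sed -i "$sedPattern" "$f" || exit 1
    done
}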
Why not compare source and buffer files before overwriting the source file:
#!/bin/bash
# backup function goes here
# @param $1 The find pattern.
# @param $2 The replace pattern.
function findAndReplace {
    bufferFile=/tmp/tmp.$$
    filesToReplace=`find . -type f | grep -vi cvs | grep -v '#'`
    sedPattern="s/$1/$2/g"
    echo "Using pattern $sedPattern"
    for f in $filesToReplace; do
        echo "sedding file $f"
        sed "$sedPattern" "$f" > "$bufferFile"
        exitCode=$?
        if [ $exitCode -ne 0 ] ; then
            echo "sed $sedPattern exited with $exitCode"
            exit 1
        fi
        cmp -s $f $bufferFile
        if [ $? -ne 0 ]; then
            chown --reference=$f $bufferFile
            mv $bufferFile $f
        fi
    done
}
}
backup
findAndReplace "$1" "$2"