Renaming directories based on a pattern in Bash - regex

I have a Bash script that works well for renaming directories that match a criterion.
for name in *\[*\]\ -\ *; do
if [[ -d "$name" ]] && [[ ! -e "${name#* - }" ]]; then
mv "$name" "${name#* - }"
fi
done
Currently if the directory looks like:
user1 [files.sentfrom.com] - Directory-Subject
It renames the directory and only the directory to look like
Directory-Subject (this could have different type of text)
How can I change the script / search criteria to now search for
www.ibm.com - Directory-Subject
and rename the directory and only the directory to
Directory-Subject

You could write your code this way so that it covers both the cases:
for dir in *\ -\ *; do
[[ -d "$dir" ]] || continue # skip if not a directory
sub="${dir#* - }"
if [[ ! -e "$sub" ]]; then
mv "$dir" "$sub"
fi
done
Before running the script:
$ ls -1d */
user1 [files.sentfrom.com] - Directory-Subject/
www.ibm.com - Directory-Subject/
After:
$ ls -1d */
Directory-Subject/
www.ibm.com - Directory-Subject/ # didn't move because directory existed already
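The step doing the real work in both loops is the expansion `${dir#* - }`, which removes the shortest leading match of `* - `. Here is a minimal sketch of how it behaves on a few sample names; `strip_prefix` and the third sample name are made up for illustration and are not part of the original script.

```shell
#!/usr/bin/env bash
# strip_prefix is a hypothetical helper, not part of the original script.
# It removes everything up to and including the first " - ".
strip_prefix() {
    printf '%s\n' "${1#* - }"
}

strip_prefix 'user1 [files.sentfrom.com] - Directory-Subject'
strip_prefix 'www.ibm.com - Directory-Subject'
# With a single '#', only the first " - " is consumed, so a later one survives:
strip_prefix 'www.ibm.com - Part A - Part B'
```

Using `##` instead of `#` would strip up to the last " - ", which may or may not be what you want when the subject itself contains that separator.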

A simple answer would be to change *\[*\]\ -\ * to *\ -\ *
for name in *\ -\ *; do
if [[ -d "$name" ]] && [[ ! -e "${name#* - }" ]]; then
mv "$name" "${name#* - }"
fi
done
For more information, please read glob and wildcards
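To see which names each glob accepts without touching the file system, the same patterns can be tried in a `case` statement. This is only a sketch; `matches_old` and `matches_new` are illustrative names.

```shell
#!/usr/bin/env bash
# matches_old / matches_new are illustrative names only.
# They test a string against the original and the relaxed glob.
matches_old() { case $1 in *\[*\]\ -\ *) return 0 ;; *) return 1 ;; esac; }
matches_new() { case $1 in *\ -\ *) return 0 ;; *) return 1 ;; esac; }

for name in 'user1 [files.sentfrom.com] - Directory-Subject' \
            'www.ibm.com - Directory-Subject' \
            'Directory-Subject'; do
    if matches_old "$name"; then old=yes; else old=no; fi
    if matches_new "$name"; then new=yes; else new=no; fi
    printf '%-50s old=%s new=%s\n' "$name" "$old" "$new"
done
```

The relaxed pattern accepts both directory styles, while a name without " - " is rejected by both.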

Related

Bash script to check script file extension and add an extension

I have written the following Bash script. Its role is to check its own name and, if the extension is missing, to append ".sh" with sed. Still, I get the error "missing target file..."
#!/bin/bash
FILE_NAME="$0"
EXTENSION=".sh"
FILE_NAME_MOD="$FILE_NAME$EXTENSION"
if [[ "$0" != "FILE_NAME_MOD" ]]; then
echo mv -v "$FILENAME" "$FILENAME$EXTENSION"
cp "$0" | sed 's/\([^.sh]\)$/\1.sh/g' $0
fi
#!/bin/bash
file="$0"
extension=".sh"
if [ "$(echo -n "$file" | tail -c 3)" != "$extension" ]; then
mv -v "$file" "$file$extension"
fi
Important stuff:
The -n flag suppresses the trailing newline, so we can test the last 3 characters instead of 4.
When in doubt, always use set -x to debug your scripts.
Try this Shellcheck-clean code:
#! /bin/bash -p
file=${BASH_SOURCE[0]}
extension=.sh
[[ $file == *"$extension" ]] || mv -i -- "$file" "$file$extension"
See choosing between $0 and BASH_SOURCE for details of why ${BASH_SOURCE[0]} is better than $0.
See Correct Bash and shell script variable capitalization for details of why file is better than FILE and extension is better than EXTENSION. (In short, ALL_UPPERCASE names are dangerous because there is a danger that they will clash with names that are already used for something else.)
The -i option to mv means that you will be prompted to continue if the new filename is already in use.
See Should I save my scripts with the .sh extension? before adding .sh extensions to your shell programs.
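The suffix test `[[ $file == *"$extension" ]]` can be tried on its own before letting it rename anything. The sketch below only prints the name a file would end up with; `with_extension` and the sample file names are hypothetical.

```shell
#!/usr/bin/env bash
# with_extension is a hypothetical helper: it prints the name a file
# should have, appending the extension only when it is missing.
with_extension() {
    local file=$1 extension=$2
    # The quoted "$extension" is matched literally; only the leading * is a wildcard.
    if [[ $file == *"$extension" ]]; then
        printf '%s\n' "$file"
    else
        printf '%s\n' "$file$extension"
    fi
}

with_extension ./myscript .sh
with_extension ./myscript.sh .sh
with_extension ./notes.sh.bak .sh
```

Note that the last name gets `.sh` appended, because the test looks only at the end of the string.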
Just for fun, here is a way to do it just with GNU sed:
#!/usr/bin/env bash
sed --silent '
# match FILENAME only if it does not end with ".sh"
/\.sh$/! {
# change "FILENAME" to "mv -v FILENAME FILENAME.sh"
s/.*/mv -v & &.sh/
# execute the command
e
}
' <<<"$0"
You can also make the above script output useful messages:
#!/usr/bin/env bash
sed --silent '
/\.sh$/! {
s/.*/mv -v & &.sh/
e
# exit with code 0 immediately after the change has been made
q0
}
# otherwise exit with code 1
q1
' <<<"$0" && echo 'done' || echo 'no changes were made'

Linking directories inside directories in Bash

My script processes files.lst and it has a loop that looks like this
while read src_column dest_column; do
if [[ -d $src ]]; then
src="../../default/$src_column/*"
else
src="../../default/$src_column"
fi
pushd $dest
ln -s $src .
popd
done < files.lst
files.lst
#~source~ ~destination~
data dir1
default/def1.txt new1.txt
data dir2/dir22/dir222
default/def1.txt dir2/dir22/dir222/new1.txt
default dir2/dir22
default/def2.txt dir2/dir22/ne2.txt
The cases should be like this:
if destinations are dir2/dir22/dir222 or dir2/dir22/dir222/new1.txt
the starting prefix of $src should be ../../../../default
if destinations are dir2/dir22 or dir2/dir22/new2.txt
the starting prefix of $src should be ../../../default
if destinations are dir2 or dir2/new2.txt
the starting prefix of $src should be ../../default
The problem is that I don't know how to count how deep the directories are. What approach should I take? I am thinking of regex, but I have no idea how to use it.
Using sed to calculate the paths...:
while read src_column dest_column; do
if [[ -d $src ]]; then
dest_column="$dest_column/"
fi
src_prefix="$(sed -r 's|/[^/]*$|/|; s|//*|/|g; s|[^/]+|..|g' <<< "./$dest_column")default"
# sed command details:
# 1st expression: strip any file name (e.g. file.txt) from the end of $dest_column
# 2nd expression: collapse duplicate slashes into one (e.g. a/b//c// becomes a/b/c/)
# 3rd expression: replace every path component with `..`
# Finally, "default" is appended; the leading "./" supplies the extra "../".
if [[ -d $src ]]; then
src="$src_prefix/$src_column/*"
else
src="$src_prefix/$src_column"
fi
pushd $dest
ln -s $src .
popd
done < files.lst
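The same prefix calculation can also be sketched without sed, by splitting the path on "/" and emitting one "../" per component. This assumes, as in the sed version, that a directory destination is marked with a trailing slash; `default_prefix` is a hypothetical name.

```shell
#!/usr/bin/env bash
# default_prefix is a hypothetical helper mirroring the sed expression above.
# Directory destinations must end in "/" (the sed answer adds it the same way).
default_prefix() {
    local path="./$1" prefix='' part
    local -a parts
    path=${path%/*}/                 # drop the final file name, keep the directories
    IFS=/ read -ra parts <<< "$path" # split into components; "//" yields empty ones
    for part in "${parts[@]}"; do
        if [[ -n $part ]]; then
            prefix+=../              # one level up per non-empty component
        fi
    done
    printf '%sdefault\n' "$prefix"
}

default_prefix 'dir2/dir22/dir222/new1.txt'   # ../../../../default
default_prefix 'dir2/dir22/'                  # ../../../default
default_prefix 'dir2/new2.txt'                # ../../default
```

The three calls correspond to the three cases listed in the question.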

Shell script to rename multiple files from their parent folders

I'm looking for a script for below structure:
Before :
/Description/TestCVin/OpenCVin/NameCv/.....
/Description/blacVin/baka/NameCv_hubala/......
/Description/CVintere/oldCvimg/NameCv_add/.....
after:
/Description/TestaplCVin/OpenaplCVin/NameaplCv/.....
/Description/blaapcVlin/baka/NameaplCv_hubala/......
/Description/aplCVintere/oldaplCvimg/NameaplCv_add/.....
I want to rename " Cv or CV or cV " >> "aplCv or aplCV or aplcV" in all folder by regular expression...
My script does look like:
#!/bin/sh
printf "Input your Directory path: -> "
read DIR
cd "$DIR"
FILECASE=$(find . -iname "*cv*")
LAST_DIR_NAME=""
for fdir in $FILECASE
do
if [[ -d $fdir ]];
then
LAST_DIR_NAME=$fdir
fi
FILE=$(echo $fdir | sed -e "s/\([Cc][Vv]\)/arpl\1/g")
echo "la file $FILE"
if ([[ -f $fdir ]] && [[ "$fdir" =~ "$LAST_DIR_NAME" ]]);
then
FILECASE=$(find . -iname "*cv*")
tmp=$(echo $LAST_DIR_NAME | sed -e "s/\([Cc][Vv]\)/arpl\1/g")
fdir=$(echo $fdir | sed -e 's|'$LAST_DIR_NAME'|'$tmp'|g')
fi
mv -- "$fdir" "$FILE"
done
But it throws an error ..:(
How could I write it to rename the files according to their folder names?
You can do it like this:
#!/bin/sh
printf "Input your Directory path: -> "
read DIR
cd "$DIR"
MYARRAY=$(find . -iname "*cv*" )
touch "tmpfile"
for fdir in $MYARRAY
do
echo "$fdir" >> "tmpfile"
done
MYARRAY=$(tac "tmpfile")
for fdir in $MYARRAY
do
cd "$fdir"
prev=$(cd -)
base=$(basename $fdir)
cd ..
nDIR=$(echo "$base" | sed -e "s/\([Cc][Vv]\)/arpl\1/g")
mv "$base" "$nDIR"
cd $prev
done
rm -f "tmpfile"
Also, one issue: I think the tac command is not included in Mac OS X. Instead of tac, use tail -r, like MYARRAY=$(tail -r "tmpfile").
Always make a backup before playing with this kind of script.
You can try the following:
find . -iname '*cv*' -exec echo 'mv {} $(echo $(dirname {})/$(basename {}|sed s/cv/apl/gi))' \;|tac|xargs -i bash -c 'eval {}'
This uses -exec to print commands for renaming.
The second arguments are generated by using shell substitutions to replace cv with apl in the last part of the path.
tac is used to reverse the order of the commands, so that we do not rename a directory before working with its contents.
Finally, we eval the commands with bash.
Also, do not use -exec in a permanent script. Please read the security warnings about exec in the find man-page.
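An alternative that avoids both eval and the reversing step is to let find list entries depth-first with `-depth` and rename each one by its last path component only. The following is a sketch rather than a drop-in replacement: `rename_cv` and the `DRY_RUN` variable are invented names, and it inserts "apl" in front of each match as described in the question.

```shell
#!/usr/bin/env bash
# rename_cv is a hypothetical function. It renames every file and directory
# below "$1" whose own name contains "cv" (any case), inserting "apl" in
# front of each occurrence. Set DRY_RUN=1 (also hypothetical) to only print.
rename_cv() {
    local root=$1 path dir base new
    local -a paths=()
    # -depth lists a directory's contents before the directory itself;
    # -mindepth 1 leaves the starting directory alone.
    while IFS= read -r -d '' path; do
        paths+=("$path")
    done < <(find "$root" -mindepth 1 -depth -iname '*cv*' -print0)

    for path in "${paths[@]}"; do
        dir=${path%/*}
        base=${path##*/}
        new=$(printf '%s\n' "$base" | sed 's/[Cc][Vv]/apl&/g')
        if [[ -e $dir/$new ]]; then
            printf 'skipping %s: target exists\n' "$path" >&2
        elif [[ -n ${DRY_RUN-} ]]; then
            printf 'mv %q %q\n' "$path" "$dir/$new"
        else
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

Running it with `DRY_RUN=1 rename_cv /some/dir` first shows the planned moves. The function is not idempotent: a second run would prefix the already renamed entries again.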

Splitting all txt files in a folder into smaller files based on a regular expression using bash

I have a folder containing large text files. Each file is a collection of 1000 files separated by [[ file name ]]. I want to split the files and make 1000 files out of them and put them in a new folder. Is there a way in bash to do it? Any other fast method will also do.
for f in $(find . -name '*.txt')
do mkdir $f
mv
cd $f
awk '/[[.*]]/{g++} { print $0 > g".txt"}' $f
cd ..
done
You are trying to create a folder with the same name as the already existing file.
for f in $(find . -name '*.txt')
do mkdir $f
Here, "find" will list the files in the current path, and for each of these files you will try to create a directory with exactly the same name. One way of doing it would be first creating a temporary folder:
for f in $(find . -name '*.txt')
do mkdir temporary # create a temporary folder
mv $f temporary # move the file into the folder
mv temporary $f # rename the temporary folder to the name of the file
cd $f # enter the folder and go on....
awk '/[[.*]]/{g++} { print $0 > g".txt"}' $f
cd ..
done
Note that all your folders will have the ".txt" extension. If you don't want that, you can cut it out before creating the folder; that way, you won't need the temporary folder, because the folder you're trying to create has a different name from the .txt file.
Example:
for f in $(find . -name '*.txt' | rev | cut -b 5- | rev)
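The extension can also be stripped with the parameter expansion `${f%.txt}`, and awk can be told which directory to write into. The sketch below combines both; `split_all` is a hypothetical name, the sections are numbered 1.txt, 2.txt, ... as in the question's awk attempt, and each `[[ ... ]]` header line is kept as the first line of its output file.

```shell
#!/usr/bin/env bash
# split_all is a hypothetical function. For every NAME.txt in "$1" it
# creates the directory NAME and writes one numbered file per section.
split_all() {
    local f out
    for f in "$1"/*.txt; do
        [[ -f $f ]] || continue      # also skips the unexpanded pattern
        out=${f%.txt}                # strip the extension without rev/cut
        mkdir -p -- "$out"
        awk -v dir="$out" '
            # a header line starts a new output file
            /^\[\[.*\]\]/ { if (file != "") close(file); n++; file = dir "/" n ".txt" }
            # lines before the first header are ignored
            file != ""    { print > file }
        ' "$f"
    done
}
```

Calling `split_all .` in the folder with the large files would leave the originals in place and create one sibling directory per file.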
Although it is not awk, here is a Python version. It was written by a drunk person, so it is not guaranteed to work.
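import re
import sys

def main():
    pattern = re.compile(r'\[\[(.+)]]')
    fname = None  # current output file; None until the first header is seen
    lines = []
    with open(sys.argv[1]) as f:
        for line in f:
            m = pattern.search(line)
            if m:
                # flush the previous section before starting a new one
                if fname is not None:
                    with open(fname, 'w') as g:
                        g.writelines(lines)
                fname = m.group(1).strip()
                lines = []
            elif fname is not None:
                lines.append(line)
    # flush the last section
    if fname is not None:
        with open(fname, 'w') as g:
            g.writelines(lines)

if __name__ == '__main__':
    main()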
Write a bash script. Here, I've done it for you.
Notice the structure and features of this script:
explain what it does in a usage() function, which is used for the -h option.
provide a set of standard options: -h, -n, -v.
use getopts to do option processing
do lots of error checking on the arguments
be careful about filename parsing (notice that blanks surrounding the file names are ignored).
hide details within functions. Notice the 'talk', 'qtalk', 'nvtalk' functions? Those are from a bash library I've built to make this kind of scripting easy to do.
explain what is going on to the user if in $verbose mode.
provide the user the ability to see what would be done without actually doing it (the -n option, for $norun mode).
never run commands directly, but use the run function, which pays attention to the $norun, $verbose, and $quiet variables.
I'm not just fishing for you, but teaching you how to fish.
Good luck with your next bash script.
Alan S.
#!/bin/bash
# split-collections IN-FOLDER OUT-FOLDER
PROG="${0##*/}"
usage() {
cat 1>&2 <<EOF
usage: $PROG [OPTIONS] IN-FOLDER OUT-FOLDER
This script splits a collection of files within IN-FOLDER into
separate, named files into the given OUT-FOLDER. The created file
names are obtained from formatted text headers within the input
files.
The format of each input file is a set of HEADER and BODY pairs,
where each HEADER is a text line formatted as:
[[input-filename1]]
text line 1
text line 2
...
[[input-filename2]]
text line 1
text line 2
...
Normal processing will show the filenames being read, and file
names being created. Use the -v (verbose) option to show the
number of text lines being written to each created file. Use
-v twice to show the actual lines of text being written.
Use the -n option to show what would be done, without actually
doing it.
Options
-h Show this help
-n Dry run -- do NOT create any files or make any changes
-o Overwrite existing output files.
-q Be quiet
-v Be verbose
EOF
exit
}
talk() { echo 1>&2 "$@" ; }
chat() { [[ -n "$norun$verbose" ]] && talk "$@" ; }
nvtalk() { [[ -n "$verbose" ]] || talk "$@" ; }
qtalk() { [[ -n "$quiet" ]] || talk "$@" ; }
nrtalk() { talk "${norun:+(norun) }$@" ; }
error() {
local code=2
case "$1" in [0-9]*) code=$1 ; shift ;; esac
echo 1>&2 "$@"
exit $code
}
talkf() { printf 1>&2 "$@" ; }
chatf() { [[ -n "$norun$verbose" ]] && talkf "$@" ; }
nvtalkf() { [[ -n "$verbose" ]] || talkf "$@" ; }
qtalkf() { [[ -n "$quiet" ]] || talkf "$@" ; }
nrtalkf() { talkf "${norun:+(norun) }$@" ; }
errorf() {
local code=2
case "$1" in [0-9]*) code=$1 ; shift ;; esac
printf 1>&2 "$@"
exit $code
}
# run COMMAND ARGS ...
qrun() {
( quiet=1 run "$@" )
}
run() {
if [[ -n "$norun" ]]; then
if [[ -z "$quiet" ]]; then
nrtalk "$@"
fi
else
if [[ -n "$verbose" ]]; then
talk ">> $*"
fi
# on failure, return the command's own exit status
eval "$@" || return $?
fi
return 0
}
show_line() {
talkf "%s:%d: %s\n" "$in_file" "$lines_in" "$line"
}
# given an input filename, read it and create
# the output files as indicated by the contents
# of the text in the file
split_collection() {
in_file="$1"
out_file=
lines_in=0
lines_out=0
skipping=
while IFS= read -r line ; do
: $(( lines_in++ ))
(( verbose_count > 1 )) && show_line
# if a line with the format of "[[foo]]" occurs,
# close the current output file, and open a new
# output file called "foo"
if [[ "$line" =~ ^\[\[[[:blank:]]*([^ ]+.*[^ ]|[^ ])[[:blank:]]*\]\][[:blank:]]*$ ]] ; then
new_file="${BASH_REMATCH[1]}"
# close out the current file, if any
if [[ "$out_file" ]]; then
nrtalkf "%d lines written to %s\n" $lines_out "$out_file"
fi
# check the filename for bogosities
case "$new_file" in
*..*|*/*)
(( verbose_count < 2 )) && show_line
error "Badly formatted filename"
;;
esac
out_file="$out_folder/$new_file"
if [[ -e "$out_file" ]]; then
if [[ -n "$overwrite" ]]; then
nrtalk "Overwriting existing '$out_file'"
qrun "cat /dev/null >'$out_file'"
else
error "$out_file already exists."
fi
else
nrtalk "Creating new output file: '$out_file' ..."
qrun "touch '$out_file'"
fi
lines_out=0
elif [[ -z "$out_file" ]]; then
# apparently, there are text lines before the filename
# header; ignore them (out loud)
if [[ ! "$skipping" ]]; then
talk "Text preceding first filename ignored.."
skipping=1
fi
else # next line of input for the file
# %q-quote the line so that eval inside run() writes it verbatim
qrun "printf '%s\n' $(printf '%q' "$line") >>'$out_file'"
: $(( lines_out++ ))
fi
done
}
norun=
verbose=
verbose_count=0
overwrite=
quiet=
while getopts 'hnoqv' opt ; do
case "$opt" in
h) usage ;;
n) norun=1 ;;
o) overwrite=1 ;;
q) quiet=1 ;;
v) verbose=1 ; : $(( verbose_count++ )) ;;
esac
done
shift $(( OPTIND - 1 ))
in_folder="${1:?Missing IN-FOLDER; see $PROG -h for details}"
out_folder="${2:?Missing OUT-FOLDER; see $PROG -h for details}"
# validate the input and output folders
#
# It might be reasonable to create the output folder for the
# user, but that's left as an exercise for the user.
in_folder="${in_folder%/}" # remove trailing slash, if any
out_folder="${out_folder%/}"
[[ -e "$in_folder" ]] || error "$in_folder does not exist"
[[ -d "$in_folder" ]] || error "$in_folder is not a directory."
[[ -e "$out_folder" ]] || error "$out_folder does not exist."
[[ -d "$out_folder" ]] || error "$out_folder is not a directory."
for collection in "$in_folder"/* ; do
talk "Reading $collection .."
split_collection "$collection" <"$collection"
done
exit

Determining correct extension in bash with Regular Expressions

Was wondering if someone could help me out with regular expressions and bash.
I'm trying to execute a set of commands only on files that have certain extensions, in this case: mpg, mpeg, avi, and mkv.
I've actually found a solution here, however, it doesn't seem to work. If someone can tell me why, I'd appreciate it.
#!/bin/bash
# Configuration
TARGETDIR="$1"
TARGETEXT="(mpg|mpeg|avi|mkv)"
for d in `find $1 -type d`
do
echo "Searching directory: $d"
for f in "$d"/*
do
if [ -d "${f}" ];
then
# File is a directory, do not perform
echo "$f is a directory, not performing ..."
elif [ -f "${f}" ];
then
filename=$(basename "$f")
extension="${filename##*.}"
if [ "$extension" == "$TARGETEXT" ];
then
echo "Match"
else
echo "Mismatch - $f - $extension"
fi
fi
done
done
Again, any assistance is appreciated.
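This can probably be done using only the find command. With GNU find, -regextype posix-extended is needed so that the (a|b) alternation in $TARGETEXT is understood:
find "$TARGETDIR" -regextype posix-extended -regex ".*\\.$TARGETEXT" -type f -exec your_command {} \;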
Instead of direct string comparison
if [ "$extension" == "$TARGETEXT" ];
use Bash regex matching syntax
if [[ "$extension" =~ $TARGETEXT ]];
Note the double [[ ]] and the non-quoted $TARGETEXT.
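One caveat with the unanchored test is that `=~` matches anywhere in the string, so an extension such as "mpg4" would still count as a match. A possible refinement, sketched here with a hypothetical helper `has_target_ext` and made-up file names, is to anchor the pattern with `^` and `$`.

```shell
#!/usr/bin/env bash
# has_target_ext is a hypothetical helper built on the =~ test above.
TARGETEXT='(mpg|mpeg|avi|mkv)'

has_target_ext() {
    local filename=${1##*/}
    local extension=${filename##*.}
    # Require a dot in the name, then match the whole extension:
    # ^ and $ anchor the regex, so "mpg4" no longer matches "mpg".
    [[ $filename == *.* && $extension =~ ^${TARGETEXT}$ ]]
}

for f in movie.mpg clip.mpg4 notes.txt mkv; do
    if has_target_ext "$f"; then
        echo "Match - $f"
    else
        echo "Mismatch - $f"
    fi
done
```

Only movie.mpg is reported as a match; a bare name like "mkv" is rejected because it contains no dot.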
You can do this in bash without regular expressions, just file patterns:
shopt -s globstar nullglob
for f in **/*.{mpg,mpeg,avi,mkv}; do
if [[ -f "$f" ]]; then
# do something with the file:
echo "$f"
fi
done