Bash script to check its own file extension and add one if missing - regex

I have written the following Bash script. Its role is to check its own name and, if the name has no extension, to append ".sh" with sed. However, I still get the error "missing target file...".
#!/bin/bash
FILE_NAME="$0"
EXTENSION=".sh"
FILE_NAME_MOD="$FILE_NAME$EXTENSION"
if [[ "$0" != "FILE_NAME_MOD" ]]; then
echo mv -v "$FILENAME" "$FILENAME$EXTENSION"
cp "$0" | sed 's/\([^.sh]\)$/\1.sh/g' $0
fi

#!/bin/bash
file="$0"
extension=".sh"
if [ "$(echo -n "$file" | tail -c 3)" != "$extension" ]; then
mv -v "$file" "$file$extension"
fi
Important stuff:
The -n flag suppresses the trailing newline, so we can test the last 3 characters instead of 4.
When in doubt, use set -x to debug your scripts.

Try this Shellcheck-clean code:
#! /bin/bash -p
file=${BASH_SOURCE[0]}
extension=.sh
[[ $file == *"$extension" ]] || mv -i -- "$file" "$file$extension"
See choosing between $0 and BASH_SOURCE for details of why ${BASH_SOURCE[0]} is better than $0.
See Correct Bash and shell script variable capitalization for details of why file is better than FILE and extension is better than EXTENSION. (In short, ALL_UPPERCASE names are dangerous because there is a danger that they will clash with names that are already used for something else.)
The -i option to mv means that you will be prompted to continue if the new filename is already in use.
See Should I save my scripts with the .sh extension? before adding .sh extensions to your shell programs.
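The suffix test used in the Shellcheck-clean version can be tried out without renaming anything. A minimal sketch, assuming a hypothetical helper named with_extension (not part of the original answer) that only prints the name the script would end up with:

```bash
#!/usr/bin/env bash
# with_extension NAME [EXT] -- hypothetical helper, for illustration only.
# Prints NAME unchanged if it already ends in EXT, otherwise NAME followed by EXT.
with_extension() {
    local name=$1 ext=${2:-.sh}
    if [[ $name == *"$ext" ]]; then
        printf '%s\n' "$name"
    else
        printf '%s\n' "$name$ext"
    fi
}

with_extension deploy        # prints: deploy.sh
with_extension deploy.sh     # prints: deploy.sh
with_extension notes.sh.bak  # prints: notes.sh.bak.sh
```

Quoting "$ext" on the right-hand side of == makes it match literally, while the unquoted * stays a wildcard.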

Just for fun, here is a way to do it just with GNU sed:
#!/usr/bin/env bash
sed --silent '
# match FILENAME only if it does not end with ".sh"
/\.sh$/! {
# change "FILENAME" to "mv -v FILENAME FILENAME.sh"
s/.*/mv -v & &.sh/
# execute the command
e
}
' <<<"$0"
You can also make the above script output useful messages:
#!/usr/bin/env bash
sed --silent '
/\.sh$/! {
s/.*/mv -v & &.sh/
e
# exit with code 0 immediately after the change has been made
q0
}
# otherwise exit with code 1
q1
' <<<"$0" && echo 'done' || echo 'no changes were made'

Related

Grab an argument as a regex pattern inside a shell script

This is a simple script to run ls with a filter:
sh myscript.sh ".pyc"
myscript.sh :
echo "---------------------------"
for i in `ls | grep '.*\.pyc'`; do
echo "$i"
done
It runs ls and shows only *.pyc files. Now I want to pass that pattern as an argument:
sh myscript.sh ".pyc"
and modify the script :
echo "---------------------------"
for i in `ls | grep '.*\$1'`; do
echo "$i"
done
But this doesn't work; it returns an empty result. How do I properly insert $1 into the regex inside the shell script?
Replace everything with this: printf '%s\n' *"$1".
Or alternatively just run one of printf '%s\n' *.pyc, ls *.pyc, ls -d *.pyc, etc.
You probably want *.pyc (a shell glob/wildcard which expands to all files ending .pyc), as opposed to using grep.
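The attempt in the question returns nothing because single quotes stop the shell from expanding $1, so grep looks for a literal dollar sign followed by 1. A minimal sketch of the glob approach wrapped in a function; list_by_suffix is a hypothetical name chosen for this example:

```bash
#!/usr/bin/env bash
# list_by_suffix DIR SUFFIX -- hypothetical helper, for illustration only.
# Prints the base name of every entry in DIR whose name ends in SUFFIX.
list_by_suffix() {
    local dir=$1 suffix=$2 f
    for f in "$dir"/*"$suffix"; do
        [[ -e $f ]] || continue   # the glob matched nothing: skip the literal pattern
        printf '%s\n' "${f##*/}"  # strip the directory part
    done
}

list_by_suffix . .pyc
```

Because the suffix is quoted and the * is not, names containing spaces are handled and no regex escaping is needed.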

Correcting file numbers using bash

I have a bunch of file names in a folder like this:
test_07_ds.csv
test_08_ds.csv
test_09_ds.csv
test_10_ds.csv
...
I want to decrease the number of every file, so that these become:
test_01_ds.csv
test_02_ds.csv
test_03_ds.csv
test_04_ds.csv
...
Here's what I came up with:
for i in $1/*; do
n=${i//[^0-9]/};
n2=`expr $n - 6`;
if [ $n2 -lt 10 ]; then
n2="0"$n2;
fi
n3=`echo $i | sed -r "s/[0-9]+/$n2/"`
echo $n3;
cp $i "fix/$n3";
done;
Is there a cleaner way of doing this?
This might help:
shopt -s extglob
for i in test_{07..10}_ds.csv; do
IFS=_ read s m e <<<"$i"; # echo "Start=$s Middle=$m End=$e"
n=${m#+(0)} # Remove leading zeros to
# avoid interpretation as octal number.
n=$((n-6)) # Subtract 6.
n=$(printf '%02d' "$n") # Format `n` with a leading 0.
# comment out the next echo to actually execute the copy.
echo \
cp "$i" "fix/${s}_${n}_${e}";
done;
Or collapsing it all together
#!/bin/bash
shopt -s extglob
for i in "${1:-.}"/*; do # $1 defaults to the current directory `.`
IFS=_ read -r s m e <<<"${i##*/}"; # split the base name, not the path
n=$(printf '%02d' "$((${m#+(0)}-6))")
cp "$i" "fix/${s}_${n}_${e}";
done;
You can use awk for simplification:
for f in *.csv; do
mv "$f" "$(awk 'BEGIN{FS=OFS="_"} {$2 = sprintf("%02d", $2-6)} 1' <<< "$f")"
done
Could you please try the following code and let me know if it helps?
awk 'FNR==1{OLD=FILENAME;split(FILENAME, A,"_");A[2]=sprintf("%02d", A[2]-6);NEW=A[1]"_"A[2]"_"A[3];system("mv " OLD " " NEW);close(OLD)}' *.csv
I have assumed that your file numbers always start at 07, so I subtract 6 from each of them. You could also put a complete path in the mv command passed to awk's built-in system() function, which would move the files to another place too. Let me know how it goes.
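The renumbering step shared by the answers above can also be written as a small pure-Bash function. This is only a sketch: renumber is a hypothetical name, and it uses the 10# prefix to force base 10 (an alternative to stripping leading zeros with extglob, so that 08 and 09 are not read as invalid octal numbers).

```bash
#!/usr/bin/env bash
# renumber NAME OFFSET -- hypothetical helper, for illustration only.
# Expects NAME of the form prefix_NN_rest and prints it with NN reduced by OFFSET.
renumber() {
    local name=$1 offset=$2 s m e
    IFS=_ read -r s m e <<<"$name"      # e receives everything after the second "_"
    printf '%s_%02d_%s\n' "$s" "$((10#$m - offset))" "$e"
}

renumber test_08_ds.csv 6   # prints: test_02_ds.csv
renumber test_10_ds.csv 6   # prints: test_04_ds.csv
```

The function only prints the new name; a caller would pass the result to cp or mv, for example cp "$f" "fix/$(renumber "$f" 6)".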

Bash Script sed command not working correctly with file passed through command line

Problem
I am trying to write a script to rename a large number of files according to a regex. The command works when I type it in iTerm2, but the same command fails inside the script.
Some of my file names also include Chinese and Korean characters (I don't know whether that is the problem or not).
Code
My code takes three inputs: the old regex, the new regex, and the files that need to be renamed.
Here is the code:
#!/bin/bash
# we have less than 3 arguments. Print the help text:
if [ $# -lt 3 ] ; then
cat << HELP
ren -- renames a number of files using sed regular expressions
USAGE: ren 'regexp' 'replacement' files...
EXAMPLE: rename all *.HTM files into *.html:
ren 'HTM' 'html' *.HTM
HELP
exit 0
fi
OLD="$1"
NEW="$2"
# The shift command removes one argument from the list of
# command line arguments.
shift
shift
# "$@" now contains all the files:
for file in "$@"; do
if [ -f "$file" ] ; then
newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
if [ -f "$newfile" ]; then
echo "ERROR: $newfile exists already"
else
echo "renaming $file to $newfile ..."
mv "$file" "$newfile"
fi
fi
done
I register the bash command in the .profile as:
alias ren="bash /pathtothefile/ren.sh"
Test
The original file name is "제01과.mp3" and I want it to become "第01课.mp3".
So with my script I use:
$ ren "제\([0-9]*\)과" "第\1课" *.mp3
And it seems that the sed in the script has not worked successfully.
But the following, which is exactly the same, does replace the name:
$ echo "제01과.mp3" | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
Any thoughts? Thx
Print the result
I have made the following change to the script so that it prints what it is doing:
newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
echo "The ${file} is changed to ${newfile}"
And the result for my test is:
The 제01과.mp3 is changed to 제01과.mp3
ERROR: 제01과.mp3 exists already
So there is no format problem.
Update (all done under bash 4.2.45(2), Mac OS 10.9)
Testing
I tried to execute the commands directly in bash, including the for loop, and found something interesting. I first stored all the names in a files.txt file using:
$ ls | grep mp3 > files.txt
and then ran sed on them. A single command in interactive mode, like:
$ file="제01과.mp3"
$ echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
gives
第01课.mp3
However, the following loop, also in interactive mode:
files=`cat files.txt`
for file in $files
do
echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
done
gives no changes!
And by now:
echo $file
gives:
$ 제30과.mp3
(There are only 30 files)
Problem Part
And I tried the first command which worked before:
$ echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
It gives no changes as:
$ 제30과.mp3
So I create a new newfile and tried again as:
$ newfile="제30과.mp3"
$ echo $newfile | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
And it gives correctly:
$第30课.mp3
WOW ORZ... Why! Why! Why! I then checked whether file and newfile are the same and, of course, they are not:
if [[ $file == "$newfile" ]]; then
echo True
else
echo False
fi
gives:
False
My guess
I guess there is some encoding problem, but I have found no reference for it. Could anyone help? Thx again.
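One way to test the encoding guess is to compare the raw bytes of the two strings. The sketch below assumes the difference, if any, lies in the byte sequences (for example two Unicode spellings of the same visible character); the thread does not confirm that this is the cause. show_bytes is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# show_bytes STRING -- hypothetical helper: print the bytes of STRING as hex digits.
show_bytes() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
    echo
}

composed=$'\xc3\xa9'      # "é" stored as one code point (U+00E9)
decomposed=$'e\xcc\x81'   # "e" followed by a combining acute accent (U+0301)
show_bytes "$composed"    # prints: c3a9
show_bytes "$decomposed"  # prints: 65cc81
```

If a name coming from the directory listing and the same name typed at the prompt print different hex strings, a pattern typed at the prompt cannot match the file name byte for byte, which would be consistent with the behaviour described.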
Update 2
I now understand that there is a big difference between a string typed in the script and a file name. To be specific, if I directly set a variable like:
file="제30과.mp3"
in the script, sed works fine. However, if the value is passed in through "$@", or the variable is set like:
file=./*mp3
then sed fails to work. I don't know why. By the way, Mac sed has no -r option, and on Ubuntu -r does not solve the problem described above.
Some errors combined:
To write groups as plain ( ) in a regex, you need extended regex: -r (GNU) or -E in sed, -E in grep. In basic regex, groups must be written as \( \)
Escaping correctly is a beast :)
Example
files="제2과.mp3 제30과.mp3"
for file in $files
do
echo $file | sed -r 's/제([0-9]*)과\.mp3/第\1课.mp3/g'
done
outputs
第2课.mp3
第30课.mp3
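Capture groups exist in both regex flavours; only the spelling differs. A minimal sketch comparing the two, using made-up ASCII file names and two hypothetical helper functions. -E is used rather than -r because both GNU sed and BSD (Mac) sed accept it.

```bash
#!/usr/bin/env bash
# Both helpers are hypothetical and only illustrate the two regex syntaxes.

# Basic regex (default): groups are written \( \)
rename_bre() { printf '%s\n' "$1" | sed 's/lesson\([0-9]*\)\.mp3/part\1.mp3/'; }

# Extended regex (-E): groups are written ( )
rename_ere() { printf '%s\n' "$1" | sed -E 's/lesson([0-9]*)\.mp3/part\1.mp3/'; }

rename_bre lesson01.mp3   # prints: part01.mp3
rename_ere lesson01.mp3   # prints: part01.mp3
```

Single-quoting the whole sed expression, as here, avoids most of the escaping trouble that comes from mixing shell quoting with regex escapes.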
If you are not doing this as a programming project, but want to skip ahead to the part where it just works, I found these resources listed at http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/html/x4055.htm:
MMV (and MCP, MLN, ...) utilities use a specialized syntax to perform bulk file operations on paths. (http://linux.maruhn.com/sec/mmv.html)
mmv before\*after.mp3 Before\#1After.mp3
Esomaniac, a Java alternative that also works on Windows, is apparently dead (home page is parked).
rename is a perl script you can download from CPAN: https://metacpan.org/release/File-Rename
rename 's/\.JPG$/.jpg/' *.JPG

Renaming files with sed, escaping issues

I'm trying to write a bash script to remove spaces, underscores and dots and replace them with dashes. I also set to lowercase and remove brackets. That's the (long) second sed command, which seems to work.
The first sed call escapes the original names with spaces with '\ ' like when I tab complete, and this is the issue I think.
If I replace 'mv -i' with 'echo' I get what I think I want: the original filename escaped with backslashes and then the new name. If I paste this into the terminal it works, but with mv in the script the spaces cause problems. The escaping doesn't work.
#!/bin/bash
for a in "$@"; do
mv -i $(echo "$a" | sed -e 's/ /\\\ /g') $(echo "$a" | sed -e 's/\(.*\)/\L\1/' -e 's/_/-/g' -e 's/ /-/g' -e 's/---/--/g' -e 's/(//g' -e 's/)//g' -e 's/\[//g' -e 's/\]//g' -e 's/\./-/g' -e 's/-\([^-]*\)$/\.\1/')
done
The other solution is to put quotes around the names, but I can't work out how I would do this. I feel like I've got close, but I'm stumped.
I've also considered the 'rename' command, but you cannot do multiple operations like you can with sed.
Please point out any other issues, this is one of my first scripts. I'm not sure I got the "$@" or "$a" bits completely correct.
Cheers.
edit:
sample input filename
I am a Badly [named] (file) - PLEASE.rename_me.JPG
should become
i-am-a-badly-named-file--please-rename-me.jpg
edit2: my solution, tweaked from gniourf_gniourf's really helpful pure bash answer:
#!/bin/bash
for a in "$@"; do
b=${a,,} #lowercase
b=${b//[_[:space:]\.]/-} #subst dot,space,underscore with dash
b=${b//---/--} #remove triple dash
b=${b//[()\[\]]/} #remove brackets
if [ "${b%-*}" != "$b" ]; then #if there is a dash (prevents filename.filename)
b=${b%-*}.${b##*-} #replace final dash with a dot for extension
fi
if [ "$a" != "$b" ]; then #if there has been a change
echo '--->' "$b" #
#mv -i -- "$a" "$b" #rename
fi
done
This only fails if the file had spaces etc. and no extension (e.g. this BAD_filename becomes this-bad.filename). But these are media files and should have an extension, so I would have to sort them anyway.
Again, corrections and improvements welcome. I'm new at this stuff.
Try doing this with rename:
rename -n 's/[_\s\.]/-/g' *files
from the shell prompt. It's very useful; you can put some Perl code inside if needed.
You can remove the -n (dry-run switch) once your tests give the expected results.
There are other tools with the same name which may or may not be able to do this, so be careful.
If you run the following command (linux)
$ file "$(readlink -f "$(type -p rename)")"
and you have a result like
.../rename: Perl script, ASCII text executable
then this seems to be the right tool =)
If not, to make it the default (usually already the case) on Debian and derivatives like Ubuntu:
$ sudo update-alternatives --set rename /path/to/rename
(replace /path/to/rename with the path of your Perl rename command).
Last but not least, this tool was originally written by Larry Wall, Perl's dad.
Just for the records, look:
$ a='I am a Badly [named] (file) - PLEASE.rename_me.JPG'
$ # lowercase that
$ echo "${a,,}"
i am a badly [named] (file) - please.rename_me.jpg
$ # Cool! let's save that somewhere
$ b=${a,,}
$ # substitution 's/[_ ]/-/g:
$ echo "${b//[_ ]/-}"
i-am-a-badly-[named]-(file)---please.rename-me.jpg
$ # or better, yet:
$ echo "${b//[_[:space:]]/-}"
i-am-a-badly-[named]-(file)---please.rename-me.jpg
$ # Cool! let's save that somewhere
$ c=${b//[_[:space:]]/-}
$ # substitution 's/---/--/g' (??)
$ echo "${c//---/--}"
i-am-a-badly-[named]-(file)--please.rename-me.jpg
$ d=${c//---/--}
$ # substitution 's/()[]//g':
$ echo "${d//[()\[\]]/}"
i-am-a-badly-named-file--please.rename-me.jpg
$ e="${d//[()\[\]]/}"
$ # substitution 's/\./-/g':
$ echo "${e//\./-}"
i-am-a-badly-named-file--please-rename-me-jpg
$ f=${e//\./-}
$ # substitution 's/-\([^-]*\)$/\.\1/':
$ echo "${f%-*}.${f##*-}"
i-am-a-badly-named-file--please-rename-me.jpg
$ # Done!
Now, here's a 100% bash implementation of what you're trying to achieve:
#!/bin/bash
for a in "$@"; do
b=${a,,}
b=${b//[_[:space:]]/-}
b=${b//---/--}
b=${b//[()\[\]]/}
b=${b//\./-}
b=${b%-*}.${b##*-}
mv -i -- "$a" "$b"
done
yeah, done!
All of this is standard and is known as shell parameter expansion.
Remark. For a more robust script, you could check whether a has an extension (read: a period in its name); otherwise the last substitution of the algorithm fails a little bit. For this, put the following line just below the for statement:
[[ $a != *.* ]] && { echo "Oh no, file \`$a' has no extension..."; continue; }
(and isn't the *.* part of this line so cute?)
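The substitutions above can be collected into one function that prints the new name instead of renaming, which makes them easy to try on sample input. A minimal sketch: sanitize_name is a hypothetical name, and skipping the extension step for names without a period is a design choice made for this example, not part of the original script. It needs bash 4 or later for ${a,,}.

```bash
#!/usr/bin/env bash
# sanitize_name NAME -- hypothetical helper, for illustration only.
# Prints the cleaned-up name; it does not touch any file.
sanitize_name() {
    local a=$1 b
    b=${a,,}                    # lowercase
    b=${b//[_[:space:]]/-}      # underscores and whitespace become dashes
    b=${b//---/--}              # collapse triple dashes
    b=${b//[()\[\]]/}           # drop brackets
    if [[ $a == *.* ]]; then    # only when the name has a period
        b=${b//./-}             # periods become dashes ...
        b=${b%-*}.${b##*-}      # ... and the last dash becomes the extension dot
    fi
    printf '%s\n' "$b"
}

sanitize_name 'I am a Badly [named] (file) - PLEASE.rename_me.JPG'
# prints: i-am-a-badly-named-file--please-rename-me.jpg
```

A caller could then rename with mv -i -- "$a" "$(sanitize_name "$a")" once the printed names look right.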

Splitting all txt files in a folder into smaller files based on a regular expression using bash

I have a folder containing large text files. Each file is a collection of 1000 files separated by [[ file name ]]. I want to split the files and make 1000 files out of them and put them in a new folder. Is there a way in bash to do it? Any other fast method will also do.
for f in $(find . -name '*.txt')
do mkdir $f
mv
cd $f
awk '/[[.*]]/{g++} { print $0 > g".txt"}' $f
cd ..
done
You are trying to create a folder with the same name as the already existing file.
for f in $(find . -name '*.txt')
do mkdir $f
Here, "find" will list the files in the current path, and for each of these files you will try to create a directory with exactly the same name. One way of doing it would be first creating a temporary folder:
for f in $(find . -name '*.txt')
do mkdir temporary # create a temporary folder
mv "$f" temporary # move the file into the folder
mv temporary "$f" # rename the temporary folder to the name of the file
cd "$f" # enter the folder and go on....
awk '/\[\[.*\]\]/{g++} { print $0 > (g ".txt") }' "$f" # brackets escaped so they match literally
cd ..
done
Note that all your folders will have the ".txt" extension. If you don't want that, you can cut it out before creating the folder; that way, you won't need the temporary folder, because the folder you're trying to create has a different name from the .txt file.
Example:
for f in $(find . -name '*.txt' | rev | cut -b 5- | rev)
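In awk, square brackets have to be escaped to match literally. The following is a minimal sketch of splitting one collection file on its [[ name ]] header lines, naming each piece after its header. split_on_headers is a hypothetical name, the output directory is assumed to exist already, and header names are assumed not to contain slashes.

```bash
#!/usr/bin/env bash
# split_on_headers IN_FILE OUT_DIR -- hypothetical helper, for illustration only.
# Writes the lines following each "[[ name ]]" header to OUT_DIR/name.
split_on_headers() {
    local in_file=$1 out_dir=$2
    awk -v dir="$out_dir" '
        /^\[\[.*\]\]/ {
            if (out != "") close(out)        # avoid running out of open files
            name = $0
            sub(/^\[\[[ \t]*/, "", name)     # strip "[[" and leading blanks
            sub(/[ \t]*\]\].*$/, "", name)   # strip trailing blanks and "]]"
            out = dir "/" name
            next
        }
        out != "" { print > out }            # lines before the first header are skipped
    ' "$in_file"
}

# Usage (paths are examples): split_on_headers big.txt out_folder
```

Closing each output file before opening the next matters here because every collection holds about 1000 pieces, which can exceed the limit on simultaneously open files in some awk implementations.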
Here is a Python version. It is not awk, and it was written by a drunk person, so it is not guaranteed to work.
import re
import sys

def main():
    # a header line looks like [[ file name ]]
    pattern = re.compile(r'\[\[(.+)]]')
    with open(sys.argv[1]) as f:
        for line in f:
            m = re.search(pattern, line)
            if m:
                # flush the previous piece, if there is one
                try:
                    with open(fname, 'w+') as g:
                        g.writelines(lines)
                except NameError:
                    pass
                fname = m.group(1)
                lines = []
            else:
                lines.append(line)
    # flush the last piece
    with open(fname, 'w+') as g:
        g.writelines(lines)

if __name__ == '__main__':
    main()
Write a bash script. Here, I've done it for you.
Notice the structure and features of this script:
explain what it does in a usage() function, which is used for the -h option.
provide a set of standard options: -h, -n, -v.
use getopts to do option processing
do lots of error checking on the arguments
be careful about filename parsing (notice that blanks surrounding the file names are ignored).
hide details within functions. Notice the 'talk', 'qtalk', 'nvtalk' functions? Those are from a bash library I've built to make this kind of scripting easy to do.
explain what is going on to the user if in $verbose mode.
provide the user the ability to see what would be done without actually doing it (the -n option, for $norun mode).
never run commands directly, but use the run function, which pays attention to the $norun, $verbose, and $quiet variables.
I'm not just fishing for you, but teaching you how to fish.
Good luck with your next bash script.
Alan S.
#!/bin/bash
# split-collections IN-FOLDER OUT-FOLDER
PROG="${0##*/}"
usage() {
cat 1>&2 <<EOF
usage: $PROG [OPTIONS] IN-FOLDER OUT-FOLDER
This script splits a collection of files within IN-FOLDER into
separate, named files into the given OUT-FOLDER. The created file
names are obtained from formatted text headers within the input
files.
The format of each input file is a set of HEADER and BODY pairs,
where each HEADER is a text line formatted as:
[[input-filename1]]
text line 1
text line 2
...
[[input-filename2]]
text line 1
text line 2
...
Normal processing will show the filenames being read, and file
names being created. Use the -v (verbose) option to show the
number of text lines being written to each created file. Use
-v twice to show the actual lines of text being written.
Use the -n option to show what would be done, without actually
doing it.
Options
-h Show this help
-n Dry run -- do NOT create any files or make any changes
-o Overwrite existing output files.
-v Be verbose
EOF
exit
}
talk() { echo 1>&2 "$@" ; }
chat() { [[ -n "$norun$verbose" ]] && talk "$@" ; }
nvtalk() { [[ -n "$verbose" ]] || talk "$@" ; }
qtalk() { [[ -n "$quiet" ]] || talk "$@" ; }
nrtalk() { talk "${norun:+(norun) }$@" ; }
error() {
local code=2
case "$1" in [0-9]*) code=$1 ; shift ;; esac
echo 1>&2 "$@"
exit $code
}
talkf() { printf 1>&2 "$@" ; }
chatf() { [[ -n "$norun$verbose" ]] && talkf "$@" ; }
nvtalkf() { [[ -n "$verbose" ]] || talkf "$@" ; }
qtalkf() { [[ -n "$quiet" ]] || talkf "$@" ; }
nrtalkf() { talkf "${norun:+(norun) }$@" ; }
errorf() {
local code=2
case "$1" in [0-9]*) code=$1 ; shift ;; esac
printf 1>&2 "$@"
exit $code
}
# run COMMAND ARGS ...
qrun() {
( quiet=1 run "$@" )
}
run() {
if [[ -n "$norun" ]]; then
if [[ -z "$quiet" ]]; then
nrtalk "$@"
fi
else
if [[ -n "$verbose" ]]; then
talk ">> $*"
fi
# on failure, return the command's own exit status
eval "$@" || return $?
fi
return 0
}
show_line() {
talkf "%s:%d: %s\n" "$in_file" "$lines_in" "$line"
}
# given an input filename, read it and create
# the output files as indicated by the contents
# of the text in the file
split_collection() {
in_file="$1"
out_file=
lines_in=0
lines_out=0
skipping=
while read line ; do
: $(( lines_in++ ))
[[ $verbose_count > 1 ]] && show_line
# if a line with the format of "[[foo]]" occurs,
# close the current output file, and open a new
# output file called "foo"
if [[ "$line" =~ ^\[\[[[:blank:]]*([^ ]+.*[^ ]|[^ ])[[:blank:]]*\]\][[:blank:]]*$ ]] ; then
new_file="${BASH_REMATCH[1]}"
# close out the current file, if any
if [[ "$out_file" ]]; then
nrtalkf "%d lines written to %s\n" $lines_out "$out_file"
fi
# check the filename for bogosities
case "$new_file" in
*..*|*/*)
[[ $verbose_count < 2 ]] && show_line
error "Badly formatted filename"
;;
esac
out_file="$out_folder/$new_file"
if [[ -e "$out_file" ]]; then
if [[ -n "$overwrite" ]]; then
nrtalk "Overwriting existing '$out_file'"
qrun "cat /dev/null >'$out_file'"
else
error "$out_file already exists."
fi
else
nrtalk "Creating new output file: '$out_file' ..."
qrun "touch '$out_file'"
fi
lines_out=0
elif [[ -z "$out_file" ]]; then
# apparently, there are text lines before the filename
# header; ignore them (out loud)
if [[ ! "$skipping" ]]; then
talk "Text preceding first filename ignored.."
skipping=1
fi
else # next line of input for the file
qrun "echo \"$line\" >>'$out_file'"
: $(( lines_out++ ))
fi
done
}
norun=
verbose=
verbose_count=0
overwrite=
quiet=
while getopts 'hnoqv' opt ; do
case "$opt" in
h) usage ;;
n) norun=1 ;;
o) overwrite=1 ;;
q) quiet=1 ;;
v) verbose=1 ; : $(( verbose_count++ )) ;;
esac
done
shift $(( OPTIND - 1 ))
in_folder="${1:?Missing IN-FOLDER; see $PROG -h for details}"
out_folder="${2:?Missing OUT-FOLDER; see $PROG -h for details}"
# validate the input and output folders
#
# It might be reasonable to create the output folder for the
# user, but that's left as an exercise for the user.
in_folder="${in_folder%/}" # remove trailing slash, if any
out_folder="${out_folder%/}"
[[ -e "$in_folder" ]] || error "$in_folder does not exist"
[[ -d "$in_folder" ]] || error "$in_folder is not a directory."
[[ -e "$out_folder" ]] || error "$out_folder does not exist."
[[ -d "$out_folder" ]] || error "$out_folder is not a directory."
for collection in "$in_folder"/* ; do
talk "Reading $collection .."
split_collection "$collection" <"$collection"
done
exit
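The dry-run idea behind the run function can be shown in isolation. The sketch below is a simplified variant, not the author's library: it executes its arguments directly as "$@" instead of passing a string to eval, so it cannot handle redirections inside the command, but arguments containing quotes or spaces need no extra escaping.

```bash
#!/usr/bin/env bash
# Simplified, hypothetical variant of a dry-run wrapper.
norun=

# run COMMAND ARGS... -- print the command in dry-run mode, otherwise execute it.
run() {
    if [[ -n $norun ]]; then
        printf '(norun) %s\n' "$*" >&2
        return 0
    fi
    "$@"    # the exit status of the command becomes the status of run
}

norun=1
run mkdir "example dir"    # prints "(norun) mkdir example dir" on stderr, creates nothing
norun=
run echo hello             # prints: hello
```

With this shape, a -n option only has to set norun=1 and every command routed through run becomes a preview.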