Parsing OPTARG inside case with regular expression and [[ ]]'s

Parsing OPTARG inside case with regular expression and [[ ]]'s - regex

So, I'm setting up a bash script and want to parse arguments to certain flags using getopts. For a minimal example, consider the a script which has a flag, -M, and it takes y or n as an argument. If I use the following code:
#!/bin/bash
# minimalExample.sh
while getopts "M:" OPTION;
do
case ${OPTION} in
M)
RMPI=${OPTARG}
if ! [[ "$RMPI" =~ "^[yn]$" ]]
then
echo "-M must be followed by either y or n."
exit 1
fi
;;
esac
done
I get the following:
$ ./minimalExample.sh -M y
-M must be followed by either y or n.
FAIL: 1
However, if I use the following code instead
#!/bin/bash
# minimalExample2.sh
while getopts "M:" OPTION;
do
case ${OPTION} in
M)
RMPI=${OPTARG}
if [ -z $(echo $RMPI | grep -E "^[yn]$") ]
then
echo "-M must be followed by either y or n."
exit 1
else
echo "good"
fi
;;
esac
done
I get:
$ ./minimalExample2.sh -M y
good
Why doesn't minimalExample.sh work?

quoting regexp in this context forces a string comparison.
change to
if ! [[ "$RMPI" =~ ^[yn]$ ]]
check following post for more details,
bash regex with quotes?

Why do you need regex here at all? -M y is not the same as -M n, is it? So you definitely will use some statement (case or if) to distinguish one from another.
#!/bin/bash
while getopts "M:" OPTION; do
case ${OPTION} in
M)
case ${OPTARG} in
y)
# do what must be done if -M y
;;
n)
# do what must be done if -M n
;;
*)
echo >&2 "-M must be followed by either y or n."
exit 1
;;
;;
esac
done
Please note >&2 – error messages should be output to STDERR, not to STDOUT.

Related

Regular expressions don't work as expected in bash if-else block's condition

My pattern defined to match in if-else block is :
pat="17[0-1][0-9][0-9][0-9].AUG"
nln=""
In my script, I'm taking user input which needs to be matched against the pattern, which if doesn't match, appropriate error messages are to be shown. Pretty simple, but giving me a hard time though. My code block from the script is this:
echo "How many days' AUDIT Logs need to be searched?"
read days
echo "Enter file name(s)[For multiple files, one file per line]: "
for(( c = 0 ; c < $days ; c++))
do
read elements
if [[ $elements =~ $pat ]];
then
array[$c]="$elements"
elif [[ $elements =~ $nln ]];
then
echo "No file entered.Run script again. Exiting"
exit;
else
echo "Invalid filename entered: $elements.Run script again. Exiting"
exit;
fi
done
The format I want from the user for filenames to be entered is this:
170402.AUG
So basically yymmdd.AUG (where y-year,m-month,d-day), with trailing or leading spaces is fine. Anything other than that should throw "Invalid filename entered: $elements.Run script again. Exiting" message. Also I want to check if if it is a blank line with a "Enter" hit, it should give an error saying "No file entered.Run script again. Exiting"
However my code, even if I enter something like "xxx" as filename, which should be throwing "Invalid filename entered: $elements.Run script again. Exiting", is actually checking true against a blank line, and throwing "No file entered.Run script again. Exiting"
Need some help with handling the regular expressions' check with user input, as otherwise rest of my script works just fine.

I think as discussed in the comments you are confusing with the glob match and a regEx match, what you have defined as pat is a glob match which needs to be equated with the == operator as,
pat="17[0-1][0-9][0-9][0-9].AUG"
string="170402.AUG"
[[ $string == $pat ]] && printf "Match success\n"
The equivalent ~ match would be to something as
pat="17[[:digit:]]{4}\.AUG"
[[ $string =~ $pat ]] && printf "Match success\n"
As you can see the . in the regex syntax has been escaped to deprive of its special meaning ( to match any character) but just to use as a literal dot. The POSIX character class [[:digit:]] with a character count {4} allows you to match 4 digits followed by .AUG
And for the string empty check do as suggested by the comments from Cyrus, or by Benjamin.W
[[ $elements == "" ]]
(or)
[[ -z $elements ]]

I would not bug the user with how many days (who want count 15 days or like)? Also, why only one file per line? You should help the users, not bug them like microsoft...
For the start:
show_help() { cat <<'EOF'
bla bla....
EOF
}
show_files() { echo "${#files[#]} valid files entered: ${files[#]}"; }
while read -r -p 'files? (h-help)> ' line
do
case "$line" in
q) echo "quitting..." ; exit 0 ;;
h) show_help ; continue;;
'') (( ${#files} )) && show_files; continue ;;
l) show_files ; continue ;;
p) (( ${#files} )) && break || { echo "No files enterd.. quitting" ; exit 1; } ;; # go to processing
esac
# select (grep) the valid patterns from the entered line
# and append them into the array
# using the -P (if your grep know it) you can construct very complex regexes
files+=( $(grep -oP '17\d{4}.\w{3}' <<< "$line") )
done
echo "processing files ${files[#]}"
Using such logic you can build really powerful and user-friendly app. Also, you can use -e for the read enable the readline functions (cursor keys and like)...
But :) Consider just create a simple script, which accepts arguments. Without any dialogs and such. example:
myscript -h
same as above, or some longer help text
myscript 170402.AUG 170403.AUG 170404.AUG 170405.AUG
will do whatever it should do with the files. Main benefit, you could use globbing in the filenames, like
myscript 1704*
and so on...
And if you really want the dialog, it could show it when someone runs the script without any argument, e.g.:
myscript
will run in interactive mode...

Using --include=GLOB vs ~/path/*.{x,y} with grep [duplicate]

I am confused by the usage of brackets, parentheses, curly braces in Bash, as well as the difference between their double or single forms. Is there a clear explanation?

In Bash, test and [ are shell builtins.
The double bracket, which is a shell keyword, enables additional functionality. For example, you can use && and || instead of -a and -o and there's a regular expression matching operator =~.
Also, in a simple test, double square brackets seem to evaluate quite a lot quicker than single ones.
$ time for ((i=0; i<10000000; i++)); do [[ "$i" = 1000 ]]; done
real 0m24.548s
user 0m24.337s
sys 0m0.036s
$ time for ((i=0; i<10000000; i++)); do [ "$i" = 1000 ]; done
real 0m33.478s
user 0m33.478s
sys 0m0.000s
The braces, in addition to delimiting a variable name are used for parameter expansion so you can do things like:
Truncate the contents of a variable
$ var="abcde"; echo ${var%d*}
abc
Make substitutions similar to sed
$ var="abcde"; echo ${var/de/12}
abc12
Use a default value
$ default="hello"; unset var; echo ${var:-$default}
hello
and several more
Also, brace expansions create lists of strings which are typically iterated over in loops:
$ echo f{oo,ee,a}d
food feed fad
$ mv error.log{,.OLD}
(error.log is renamed to error.log.OLD because the brace expression
expands to "mv error.log error.log.OLD")
$ for num in {000..2}; do echo "$num"; done
000
001
002
$ echo {00..8..2}
00 02 04 06 08
$ echo {D..T..4}
D H L P T
Note that the leading zero and increment features weren't available before Bash 4.
Thanks to gboffi for reminding me about brace expansions.
Double parentheses are used for arithmetic operations:
((a++))
((meaning = 42))
for ((i=0; i<10; i++))
echo $((a + b + (14 * c)))
and they enable you to omit the dollar signs on integer and array variables and include spaces around operators for readability.
Single brackets are also used for array indices:
array[4]="hello"
element=${array[index]}
Curly brace are required for (most/all?) array references on the right hand side.
ephemient's comment reminded me that parentheses are also used for subshells. And that they are used to create arrays.
array=(1 2 3)
echo ${array[1]}
2

A single bracket ([) usually actually calls a program named [; man test or man [ for more info. Example:
$ VARIABLE=abcdef
$ if [ $VARIABLE == abcdef ] ; then echo yes ; else echo no ; fi
yes
The double bracket ([[) does the same thing (basically) as a single bracket, but is a bash builtin.
$ VARIABLE=abcdef
$ if [[ $VARIABLE == 123456 ]] ; then echo yes ; else echo no ; fi
no
Parentheses (()) are used to create a subshell. For example:
$ pwd
/home/user
$ (cd /tmp; pwd)
/tmp
$ pwd
/home/user
As you can see, the subshell allowed you to perform operations without affecting the environment of the current shell.
(a) Braces ({}) are used to unambiguously identify variables. Example:
$ VARIABLE=abcdef
$ echo Variable: $VARIABLE
Variable: abcdef
$ echo Variable: $VARIABLE123456
Variable:
$ echo Variable: ${VARIABLE}123456
Variable: abcdef123456
(b) Braces are also used to execute a sequence of commands in the current shell context, e.g.
$ { date; top -b -n1 | head ; } >logfile
# 'date' and 'top' output are concatenated,
# could be useful sometimes to hunt for a top loader )
$ { date; make 2>&1; date; } | tee logfile
# now we can calculate the duration of a build from the logfile
There is a subtle syntactic difference with ( ), though (see bash reference) ; essentially, a semicolon ; after the last command within braces is a must, and the braces {, } must be surrounded by spaces.

Brackets
if [ CONDITION ] Test construct
if [[ CONDITION ]] Extended test construct
Array[1]=element1 Array initialization
[a-z] Range of characters within a Regular Expression
$[ expression ] A non-standard & obsolete version of $(( expression )) [1]
[1] http://wiki.bash-hackers.org/scripting/obsolete
Curly Braces
${variable} Parameter substitution
${!variable} Indirect variable reference
{ command1; command2; . . . commandN; } Block of code
{string1,string2,string3,...} Brace expansion
{a..z} Extended brace expansion
{} Text replacement, after find and xargs
Parentheses
( command1; command2 ) Command group executed within a subshell
Array=(element1 element2 element3) Array initialization
result=$(COMMAND) Command substitution, new style
>(COMMAND) Process substitution
<(COMMAND) Process substitution
Double Parentheses
(( var = 78 )) Integer arithmetic
var=$(( 20 + 5 )) Integer arithmetic, with variable assignment
(( var++ )) C-style variable increment
(( var-- )) C-style variable decrement
(( var0 = var1<98?9:21 )) C-style ternary operation

I just wanted to add these from TLDP:
~:$ echo $SHELL
/bin/bash
~:$ echo ${#SHELL}
9
~:$ ARRAY=(one two three)
~:$ echo ${#ARRAY}
3
~:$ echo ${TEST:-test}
test
~:$ echo $TEST
~:$ export TEST=a_string
~:$ echo ${TEST:-test}
a_string
~:$ echo ${TEST2:-$TEST}
a_string
~:$ echo $TEST2
~:$ echo ${TEST2:=$TEST}
a_string
~:$ echo $TEST2
a_string
~:$ export STRING="thisisaverylongname"
~:$ echo ${STRING:4}
isaverylongname
~:$ echo ${STRING:6:5}
avery
~:$ echo ${ARRAY[*]}
one two one three one four
~:$ echo ${ARRAY[*]#one}
two three four
~:$ echo ${ARRAY[*]#t}
one wo one hree one four
~:$ echo ${ARRAY[*]#t*}
one wo one hree one four
~:$ echo ${ARRAY[*]##t*}
one one one four
~:$ echo $STRING
thisisaverylongname
~:$ echo ${STRING%name}
thisisaverylong
~:$ echo ${STRING/name/string}
thisisaverylongstring

The difference between test, [ and [[ is explained in great details in the BashFAQ.
(Note: The link shows many examples for comparison)
To cut a long story short: test implements the old, portable syntax of
the command. In almost all shells (the oldest Bourne shells are the
exception), [ is a synonym for test (but requires a final argument of
]). Although all modern shells have built-in implementations of [,
there usually still is an external executable of that name, e.g.
/bin/[.
[[ is a new, improved version of it, and it is a keyword, not a program.
This has beneficial effects on the ease of use, as shown below. [[ is
understood by KornShell and BASH (e.g. 2.03), but not by the older
POSIX or BourneShell.
And the conclusion:
When should the new test command [[ be used, and when the old one [?
If portability/conformance to POSIX or the BourneShell is a concern, the old syntax should
be used. If on the other hand the script requires BASH, Zsh, or KornShell,
the new syntax is usually more flexible.

Parentheses in function definition
Parentheses () are being used in function definition:
function_name () { command1 ; command2 ; }
That is the reason you have to escape parentheses even in command parameters:
$ echo (
bash: syntax error near unexpected token `newline'
$ echo \(
(
$ echo () { command echo The command echo was redefined. ; }
$ echo anything
The command echo was redefined.

Some common and handy uses for brackets, parenthesis, and braces
As mentioned above, sometimes you want a message displayed without losing the return value. This is a handy snippet:
$ [ -f go.mod ] || { echo 'File not found' && false; }
This produced no output and a 0 (true) return value if the file go.mod exists in the current directory. Test the result:
$ echo $?
0
If the file does not exist, you get the message but also a return value of 1 (false), which can also be tested:
$ [ -f fake_file ] || { echo 'File not found'; false; }
File not found
$ echo $?
1
You can also simply create a function to check if a file exists:
fileexists() { [ -f "$1" ]; }
or if a file is readable (not corrupted, have permissions, etc.):
canread() { [ -r "$1" ]; }
or if it is a directory:
isdir() { [ -d "$1" ]; }
or is writable for the current user:
canwrite() { [ -w "$1" ]; }
or if a file exists and is not empty (like a log file with content...)
isempty() { [ -s "$1" ]; }
There are more details at: TLDP
You can also see if a program exists and is available on the path:
exists () { command -v $1 > /dev/null 2>&1; }
This is useful in scripts, for example:
# gitit does an autosave commit to the current
# if Git is installed and available.
# If git is not available, it will use brew
# (on macOS) to install it.
#
# The first argument passed, if any, is used as
# the commit message; otherwise the default is used.
gitit() {
$(exists git) && {
git add --all;
git commit -m "${1:-'GitBot: dev progress autosave'}";
git push;
} || brew install git;
}

Additional info about How to use parentheses to group and expand expressions:
(it is listed on the link syntax-brackets)
Some main points in there:
Group commands in a sub-shell: ( )
(list)
Group commands in the current shell: { }
{ list; }
Test - return the binary result of an expression: [[ ]]
[[ expression ]]
Arithmetic expansion
The format for Arithmetic expansion is:
$(( expression ))
The format for a simple Arithmetic Evaluation is:
(( expression ))
Combine multiple expressions
( expression )
(( expr1 && expr2 ))

Truncate the contents of a variable
$ var="abcde"; echo ${var%d*}
abc
Make substitutions similar to sed
$ var="abcde"; echo ${var/de/12}
abc12
Use a default value
$ default="hello"; unset var; echo ${var:-$default}
hello

If statement string comparison

What am I'm doing wrong over here?
The script by default enters this IF statement & displays the echo statement to exit.
#!/bin/ksh
server=$1
dbname=$2
IFS="
"
if [[ "${dbname}" != "abc_def_data" || "${dbname}" != "abc_def01_data" ]]; then
echo "Msg: Triggers can only be applied to CMS_JAD:abc_def_data/abc_def01_data!"
exit 0
fi

chaining of != conditions requires some inversion of thinking.
I much prefer a clearer path to testing these conditions by using the case ... esac structure.
case "${dbname}" in
abc_def_data|abc_def01_data )
#dbg echo "matched, but for real code replace with just a ':' char"
:
;;
* )
echo "didn_t match any expected values for \$dbname"
echo exit 1
;;
esac
Note that as you're really trying to find the *) case, the actions for the abc_def_data (etc) match can be anything, but to just skip to the next section of code, you would only need the shell's null cmd : .
Edit 1
Note that I have echo exit 1, just so if you copy/paste this to a command line, your shell won't exit. In real code, remove the echo and expect the exit to work.
Edit 2
Also, note that the | char in the case match (abc_def_data**|**abc_def01_data) is essentially an OR (I think it is called something else in the "case match" context).
IHTH

Did you, by any chance, meant to write this?
if [[ "${dbname}" != "abc_def_data" && "${dbname}" != "abc_def01_data" ]]; then
echo "Msg: Triggers can only be applied to CMS_JAD:abc_def_data/abc_def01_data!"
exit 0
fi

try this man, it should work just fine you should have seperated the conditions with "[ ]" and used -o instead of ||....
btw it worked for me fine...
server=$1
dbname=$2
IFS=""
if [ "${dbname}" != "abc_def_data" ] -o [ "${dbname}" != "abc_def01_data" ]
then
echo "Msg: Triggers can only be applied to CMS_JAD:abc_def_data/abc_def01_data!"
exit 0
fi

How to get more than one char in getopt? eg -abc option -bcd another_option [duplicate]

I'm trying to make a getopt command such that when I pass the "-ab" parameter to a script,
that script will treat -ab as a single parameter.
#!/bin/sh
args=`getopt "ab":fc:d $*`
set -- $args
for i in $args
do
case "$i" in
-ab) shift;echo "You typed ab $1.";shift;;
-c) shift;echo "You typed a c $1";shift;;
esac
done
However, this does not seem to work. Can anyone offer any assistance?

getopt doesn't support what you are looking for. You can either use single-letter (-a) or long options (--long). Something like -ab is treated the same way as -a b: as option a with argument b. Note that long options are prefixed by two dashes.

i was struggling with this for long - then i got into reading about getopt and getopts
single char options and long options .
I had similar requirement where i needed to have number of multichar input arguments.
so , i came up with this - it worked in my case - hope this helps you
function show_help {
echo "usage: $BASH_SOURCE --input1 <input1> --input2 <input2> --input3 <input3>"
echo " --input1 - is input 1 ."
echo " --input2 - is input 2 ."
echo " --input3 - is input 3 ."
}
# Read command line options
ARGUMENT_LIST=(
"input1"
"input2"
"input3"
)
# read arguments
opts=$(getopt \
--longoptions "$(printf "%s:," "${ARGUMENT_LIST[#]}")" \
--name "$(basename "$0")" \
--options "" \
-- "$#"
)
echo $opts
eval set --$opts
while true; do
case "$1" in
h)
show_help
exit 0
;;
--input1)
shift
empId=$1
;;
--input2)
shift
fromDate=$1
;;
--input3)
shift
toDate=$1
;;
--)
shift
break
;;
esac
shift
done
Note - I have added help function as per my requirement, you can remove it if not needed

That's not the unix way, though some do it e.g. java -cp classpath.
Hack: instead of -ab arg, have -b arg and a dummy option -a.
That way, -ab arg does what you want. (-b arg will too; hopefully that's not a bug, but a shortcut feature...).
The only change is your line:
-ab) shift;echo "You typed ab $1.";shift;;
becomes
-b) shift;echo "You typed ab $1.";shift;;

GNU getopt have --alternative option
-a, --alternative
Allow long options to start with a single '-'.
Example:
#!/usr/bin/env bash
SOPT='a:b'
LOPT='ab:'
OPTS=$(getopt -q -a \
--options ${SOPT} \
--longoptions ${LOPT} \
--name "$(basename "$0")" \
-- "$#"
)
if [[ $? > 0 ]]; then
exit 2
fi
A=
B=false
AB=
eval set -- $OPTS
while [[ $# > 0 ]]; do
case ${1} in
-a) A=$2 && shift ;;
-b) B=true ;;
--ab) AB=$2 && shift ;;
--) ;;
*) ;;
esac
shift
done
printf "Params:\n A=%s\n B=%s\n AB=%s\n" "${A}" "${B}" "${AB}"
$ ./test.sh -a aaa -b -ab=test
Params:
A=aaa
B=true
AB=test

getopt supports long format. You can search SO for such examples.
See here, for example

Shell: Checking if argument exists and matches expression

I'm new to shell scripting and trying to write the ability to check if an argument exists and if it matches an expression. I'm not sure how to write expressions, so this is what I have so far:
#!/bin/bash
if [[ -n "$1"] && [${1#*.} -eq "tar.gz"]]; then
echo "Passed";
else
echo "Missing valid argument"
fi
To run the script, I would type this command:
# script.sh YYYY-MM.tar.gz
I believe what I have is
if the YYYY-MM.tar.gz is not after script.sh it will echo "Missing valid argument" and
if the file does not end in .tar.gz it echo's the same error.
However, I want to also check if the full file name is in YYYY-MM.tar.gz format.

if [[ -n "$1" ]] && [[ "${1#*.}" == "tar.gz" ]]; then
-eq: (equal) for arithmetic tests
==: to compare strings
See: help test

You can also use:
case "$1" in
*.tar.gz) ;; #passed
*) echo "wrong/missing argument $1"; exit 1;;
esac
echo "ok arg: $1"

As long as the file is in the correct YYYY-MM.tar.gz format, it obviously is non-empty and ends in .tar.gz as well. Check with a regular expression:
if ! [[ $1 =~ [0-9]{4}-[0-9]{1,2}.tar.gz ]]; then
echo "Argument 1 not in correct YYYY-MM.tar.gz format"
exit 1
fi
Obviously, the regular expression above is too general, allowing names like 0193-67.tar.gz. You can adjust it to be as specific as you need it to be for your application, though. I might recommend
[1-9][0-9]{3}-([1-9]|10|11|12).tar.gz
to allow only 4-digit years starting with 1000 (support for the first millennium ACE seems unnecessary) and only months 1-12 (no leading zero).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Parsing OPTARG inside case with regular expression and [[ ]]'s - regex

quoting regexp in this context forces a string comparison. change to if ! [[ "$RMPI" =~ ^[yn]$ ]] check following post for more details, bash regex with quotes?

Related

Regular expressions don't work as expected in bash if-else block's condition

Using --include=GLOB vs ~/path/*.{x,y} with grep [duplicate]

If statement string comparison

How to get more than one char in getopt? eg -abc option -bcd another_option [duplicate]

Shell: Checking if argument exists and matches expression

Categories

Resources