test that argument looks like a date - regex

I'm modifying a script used to test a report writer. The report writer takes optional --from and --to flags to specify the start and end dates. I'd like to modify the script function that starts up the report writer so that its date arguments are also optional.
Sadly, there are already optional arguments to the function, so I'm trying to test whether an argument is in the right format for a date (we use nn/nn/nnnn).
So, I'm echoing the candidate string and checking with grep whether it is in the correct format. Except it doesn't work.
Here is an extract from the function
# If the next argument looks like a date, consume it and use it to define
# the report start date
looksLikeDate=$(echo $1 | grep -e '[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]')
echo from -
echo \$1: \"$1\"
echo looksLikeDate: \"$looksLikeDate\"
if [ -n $looksLikeDate ]
then
echo "-n: true"
FROMFLAG="--from $1"
shift 1
else
echo "-n : false"
FROMFLAG=""
fi
# If the next argument looks like a date, consume it and use it to define
# the report end date
looksLikeDate=$(echo $1 | grep -e '[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]')
echo to -
echo \$1: \"$1\"
echo looksLikeDate: \"$looksLikeDate\"
if [ -n $looksLikeDate ]
then
echo "-n: true"
TOFLAG="--to $1"
shift 1
else
echo "-n: false"
TOFLAG=""
fi
...and here is the output with dates...
from -
$1: "09/02/2018"
looksLikeDate: "09/02/2018"
-n: true
to -
$1: "09/02/2018"
looksLikeDate: "09/02/2018"
-n: true
...and without...
from -
$1: ""
looksLikeDate: ""
-n: true
to -
$1: ""
looksLikeDate: ""
-n: true
...what have I missed? I'd expect that, since looksLikeDate is demonstrably empty, [ -n $looksLikeDate ] would return false and the code would take the else branch of the if statement.
Update:
Since posting, it has occurred to me that the easiest thing is not to look at the arguments in the function at all, and instead to get callers to pass --from and --to along with the dates, so that I can simply pass $* to the report writer, as is done for the existing optional arguments.
Thank you very much for reading; I'm still curious as to why the posted code doesn't work.

That's because you're not quoting your variables! Use more quotes!
So here's what's happening: when Bash sees
[ -n $looksLikeDate ]
it performs parameter expansion, glob expansion, quote removal, etc., and finally sees this (I put one token on each line):
[
-n
]
and you see that the $looksLikeDate part is missing: the unquoted parameter $looksLikeDate expands to the empty string, and an unquoted expansion that produces nothing is removed entirely instead of being passed as an empty argument. Then Bash executes the builtin [, and with the closing ], this is equivalent to the following command:
test -n
Now looking at the reference manual for the test builtin, you'll read:
1 argument
The expression is true if, and only if, the argument is not null.
And here, the argument is -n, hence not null, hence the expression is true.
So remember:
Use more quotes! Quote all your variable expansions!
This specific line should look like:
[ -n "$looksLikeDate" ]
Another possibility is to use the [[ keyword:
[[ -n $looksLikeDate ]]
But anyway, quote all your expansions!
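To see the difference side by side, here is a minimal sketch that runs the three forms against an empty variable. The result variable names (unquoted, quoted, keyword) are only for illustration.

```shell
#!/bin/bash
looksLikeDate=""

# Unquoted: the empty expansion vanishes, leaving `[ -n ]`, which is true.
if [ -n $looksLikeDate ]; then unquoted=true; else unquoted=false; fi

# Quoted: `[` receives an empty string as its argument, so -n is false.
if [ -n "$looksLikeDate" ]; then quoted=true; else quoted=false; fi

# [[ ]] does not word-split its operands, so the unquoted form is false too.
if [[ -n $looksLikeDate ]]; then keyword=true; else keyword=false; fi

echo "unquoted: $unquoted, quoted: $quoted, [[ ]]: $keyword"
# prints: unquoted: true, quoted: false, [[ ]]: false
```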
Also, you don't need the external tool grep: you can use Bash's internal regex engine or, better yet, a glob pattern match:
if [[ $1 = [[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]][[:digit:]][[:digit:]] ]]; then
which is a bit long, so use a variable:
date_pattern="[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]][[:digit:]][[:digit:]]"
if [[ $1 = $date_pattern ]]; then
(and here you mustn't quote the right hand side $date_pattern, or it will be compared as a literal string instead of being used as a pattern).
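Putting this together for the function in the question, a rough sketch could look like the following. The function name parse_dates, the REST array and the --verbose argument are made up for the example; the flags are kept in arrays so that they can later be passed on as separate words.

```shell
#!/bin/bash
date_pattern="[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]][[:digit:]][[:digit:]]"

# Hypothetical helper: consume up to two leading date arguments.
# Results are left in the arrays FROMFLAG, TOFLAG and REST.
parse_dates() {
    FROMFLAG=()
    TOFLAG=()
    if [[ $1 = $date_pattern ]]; then
        FROMFLAG=(--from "$1")
        shift
    fi
    if [[ $1 = $date_pattern ]]; then
        TOFLAG=(--to "$1")
        shift
    fi
    # Whatever is left belongs to the existing optional arguments.
    REST=("$@")
}

parse_dates 09/02/2018 10/02/2018 --verbose
printf '%s\n' "${FROMFLAG[*]} ${TOFLAG[*]} ${REST[*]}"
# prints: --from 09/02/2018 --to 10/02/2018 --verbose
```

As in the original code, a single date is always taken as the start date.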


how to enforce a date format

I want to use the date command to output a day of week from user input.
I want to force the input to be of the format MM/DD/YYYY.
For example, at the command line I give
./programname MM/DD/YYYY MM/DD/YYYY
Snippets from the script itself
#!/bin/bash
DATE_FORMAT="^[0-9][0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]$" #MM/DD/YYYY
DATE1="$1"
DATE2="$2"
... followed by
if [ "$DATE1" != "$DATE_FORMAT" ] || [ "$DATE2" != "$DATE_FORMAT" ]; then
echo -e "Please follow the valid format MM/DD/YYYY.\n" 1>&2
exit 1
Now the problem is even when I enter correct date formats,
./programname 11/22/2014 11/23/2014
I still get the error message that I set up, which means that the condition of the if evaluates to true even when I input a valid format. Any suggestions as to why this is happening?
This script seems to work:
#!/bin/bash
DATE_FORMAT="^[01][0-9][/][0-3][0-9][/][0-9][0-9][0-9][0-9]$" #MM/DD/YYYY
DATE1="$1"
DATE2="$2"
if [[ "$DATE1" =~ $DATE_FORMAT ]] && [[ "$DATE2" =~ $DATE_FORMAT ]]
then echo "Both dates ($DATE1 and $DATE2) are OK"
else echo "Please follow the valid format MM/DD/YYYY ($DATE1 or $DATE2 is wrong)."
fi
It uses the =~ operator for a positive regex match inside Bash's [[ test command. The documentation doesn't mention a !~ for negative matching (though that's what Awk and Perl use). With the single-bracket [ test command, there is no regex matching. Note that the regular expression must not be enclosed in double quotes:
Any part of the pattern may be quoted to force the quoted portion to be matched as a string. Bracket expressions in regular expressions must be treated carefully, since normal quoting characters lose their meanings between brackets. If the pattern is stored in a shell variable, quoting the variable expansion forces the entire pattern to be matched as a string.
The test is also more stringent, rejecting 23/45/2091, amongst other invalid date strings.
$ bash dt19.sh 11/22/2014 11/23/2014
Both dates (11/22/2014 and 11/23/2014) are OK
$ bash dt19.sh 31/22/2014 11/43/2014
Please follow the valid format MM/DD/YYYY (31/22/2014 or 11/43/2014 is wrong).
$
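Since there is no !~, a negative match can be written by negating the whole [[ ... ]] test. The sketch below also shows what happens when the pattern variable is quoted. The helper names is_bad_date and literal_match are invented for the example, and the quoting behaviour assumes Bash 3.2 or later.

```shell
#!/bin/bash
DATE_FORMAT='^[01][0-9]/[0-3][0-9]/[0-9][0-9][0-9][0-9]$'

# Hypothetical helper: succeeds when the argument does NOT match the regex.
is_bad_date() { ! [[ $1 =~ $DATE_FORMAT ]]; }

# Quoting the expansion turns the whole pattern into a literal string, so
# this only matches text that contains the pattern's characters verbatim.
literal_match() { [[ $1 =~ "$DATE_FORMAT" ]]; }

if is_bad_date "31/22/2014"; then echo "31/22/2014 rejected"; fi
if ! is_bad_date "11/22/2014"; then echo "11/22/2014 accepted"; fi
if ! literal_match "11/22/2014"; then echo "quoted pattern did not match"; fi
```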
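Corrected code:
#!/bin/bash
DATE1="$1"
DATE2="$2"
# Anchor the pattern with ^ and $ so that extra characters are rejected
DATE_FORMAT='^[0-9][0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]$'
if echo "$DATE1" | grep -q -E "$DATE_FORMAT" && echo "$DATE2" | grep -q -E "$DATE_FORMAT"
then
    echo "Do whatever you want here"
else
    # Report the problem on stderr and exit with a failure status
    echo "Invalid date" 1>&2
    exit 1
fi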

Using --include=GLOB vs ~/path/*.{x,y} with grep [duplicate]

I am confused by the usage of brackets, parentheses, curly braces in Bash, as well as the difference between their double or single forms. Is there a clear explanation?
In Bash, test and [ are shell builtins.
The double bracket, which is a shell keyword, enables additional functionality. For example, you can use && and || instead of -a and -o and there's a regular expression matching operator =~.
Also, in a simple test, double square brackets seem to evaluate quite a lot quicker than single ones.
$ time for ((i=0; i<10000000; i++)); do [[ "$i" = 1000 ]]; done
real 0m24.548s
user 0m24.337s
sys 0m0.036s
$ time for ((i=0; i<10000000; i++)); do [ "$i" = 1000 ]; done
real 0m33.478s
user 0m33.478s
sys 0m0.000s
The braces, in addition to delimiting a variable name, are used for parameter expansion so you can do things like:
Truncate the contents of a variable
$ var="abcde"; echo ${var%d*}
abc
Make substitutions similar to sed
$ var="abcde"; echo ${var/de/12}
abc12
Use a default value
$ default="hello"; unset var; echo ${var:-$default}
hello
and several more
Also, brace expansions create lists of strings which are typically iterated over in loops:
$ echo f{oo,ee,a}d
food feed fad
$ mv error.log{,.OLD}
(error.log is renamed to error.log.OLD because the brace expression
expands to "mv error.log error.log.OLD")
$ for num in {000..2}; do echo "$num"; done
000
001
002
$ echo {00..8..2}
00 02 04 06 08
$ echo {D..T..4}
D H L P T
Note that the leading zero and increment features weren't available before Bash 4.
Thanks to gboffi for reminding me about brace expansions.
Double parentheses are used for arithmetic operations:
((a++))
((meaning = 42))
for ((i=0; i<10; i++))
echo $((a + b + (14 * c)))
and they enable you to omit the dollar signs on integer and array variables and include spaces around operators for readability.
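A small runnable sketch of these forms, with arbitrary variable names and values:

```shell
#!/bin/bash
a=5 b=3 c=2
nums=(10 20)

(( a++ ))                               # a is now 6; no $ needed
total=$(( a + b + (14 * c) ))           # 6 + 3 + 28
(( sum = nums[0] + nums[1] ))           # array elements without ${...}

# (( )) can also be used directly as a condition.
if (( total > 30 )); then size=large; else size=small; fi

echo "a=$a total=$total sum=$sum size=$size"
# prints: a=6 total=37 sum=30 size=large
```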
Single brackets are also used for array indices:
array[4]="hello"
element=${array[index]}
Curly braces are required for (most/all?) array references on the right hand side.
ephemient's comment reminded me that parentheses are also used for subshells. And that they are used to create arrays.
array=(1 2 3)
echo ${array[1]}
2
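The following sketch, with made-up element values, shows what the braces change when reading from an array, including the easily confused ${#array} and ${#array[@]}:

```shell
#!/bin/bash
array=(alpha beta gamma)

first=${array[0]}          # alpha
count=${#array[@]}         # 3: number of elements
first_len=${#array}        # 5: length of element 0, not the element count
no_braces=$array[1]        # "alpha[1]": $array is element 0, [1] is plain text

echo "$first $count $first_len $no_braces"
# prints: alpha 3 5 alpha[1]
```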
A single bracket ([) is a builtin in Bash, although an external program named [ usually exists as well; see man test or man [ for more info. Example:
$ VARIABLE=abcdef
$ if [ $VARIABLE == abcdef ] ; then echo yes ; else echo no ; fi
yes
The double bracket ([[) does basically the same thing as a single bracket, but is a Bash keyword rather than a command.
$ VARIABLE=abcdef
$ if [[ $VARIABLE == 123456 ]] ; then echo yes ; else echo no ; fi
no
Parentheses, ( ), are used to create a subshell. For example:
$ pwd
/home/user
$ (cd /tmp; pwd)
/tmp
$ pwd
/home/user
As you can see, the subshell allowed you to perform operations without affecting the environment of the current shell.
(a) Braces ({}) are used to unambiguously identify variables. Example:
$ VARIABLE=abcdef
$ echo Variable: $VARIABLE
Variable: abcdef
$ echo Variable: $VARIABLE123456
Variable:
$ echo Variable: ${VARIABLE}123456
Variable: abcdef123456
(b) Braces are also used to execute a sequence of commands in the current shell context, e.g.
$ { date; top -b -n1 | head ; } >logfile
# 'date' and 'top' output are concatenated
# (could be useful sometimes to hunt for a top loader)
$ { date; make 2>&1; date; } | tee logfile
# now we can calculate the duration of a build from the logfile
There is a subtle syntactic difference with ( ), though (see bash reference) ; essentially, a semicolon ; after the last command within braces is a must, and the braces {, } must be surrounded by spaces.
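Besides the syntax, the practical difference between the two groupings is where the commands run. A minimal sketch, with arbitrary variable names:

```shell
#!/bin/bash
x=1

( x=2 )                 # subshell: the assignment is lost when it exits
after_subshell=$x

{ x=3; }                # current shell: the assignment sticks
after_group=$x

echo "after ( ): $after_subshell, after { }: $after_group"
# prints: after ( ): 1, after { }: 3
```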
Brackets
if [ CONDITION ]                            Test construct
if [[ CONDITION ]]                          Extended test construct
Array[1]=element1                           Array initialization
[a-z]                                       Range of characters within a Regular Expression
$[ expression ]                             A non-standard & obsolete version of $(( expression )) [1]
[1] http://wiki.bash-hackers.org/scripting/obsolete

Curly Braces
${variable}                                 Parameter substitution
${!variable}                                Indirect variable reference
{ command1; command2; . . . commandN; }     Block of code
{string1,string2,string3,...}               Brace expansion
{a..z}                                      Extended brace expansion
{}                                          Text replacement, after find and xargs

Parentheses
( command1; command2 )                      Command group executed within a subshell
Array=(element1 element2 element3)          Array initialization
result=$(COMMAND)                           Command substitution, new style
>(COMMAND)                                  Process substitution
<(COMMAND)                                  Process substitution

Double Parentheses
(( var = 78 ))                              Integer arithmetic
var=$(( 20 + 5 ))                           Integer arithmetic, with variable assignment
(( var++ ))                                 C-style variable increment
(( var-- ))                                 C-style variable decrement
(( var0 = var1<98?9:21 ))                   C-style ternary operation
I just wanted to add these from TLDP:
~:$ echo $SHELL
/bin/bash
~:$ echo ${#SHELL}
9
~:$ ARRAY=(one two three)
~:$ echo ${#ARRAY}
3
~:$ echo ${TEST:-test}
test
~:$ echo $TEST
~:$ export TEST=a_string
~:$ echo ${TEST:-test}
a_string
~:$ echo ${TEST2:-$TEST}
a_string
~:$ echo $TEST2
~:$ echo ${TEST2:=$TEST}
a_string
~:$ echo $TEST2
a_string
~:$ export STRING="thisisaverylongname"
~:$ echo ${STRING:4}
isaverylongname
~:$ echo ${STRING:6:5}
avery
~:$ echo ${ARRAY[*]}
one two one three one four
~:$ echo ${ARRAY[*]#one}
two three four
~:$ echo ${ARRAY[*]#t}
one wo one hree one four
~:$ echo ${ARRAY[*]#t*}
one wo one hree one four
~:$ echo ${ARRAY[*]##t*}
one one one four
~:$ echo $STRING
thisisaverylongname
~:$ echo ${STRING%name}
thisisaverylong
~:$ echo ${STRING/name/string}
thisisaverylongstring
The difference between test, [ and [[ is explained in great detail in the BashFAQ.
(Note: The link shows many examples for comparison)
To cut a long story short: test implements the old, portable syntax of
the command. In almost all shells (the oldest Bourne shells are the
exception), [ is a synonym for test (but requires a final argument of
]). Although all modern shells have built-in implementations of [,
there usually still is an external executable of that name, e.g.
/bin/[.
[[ is a new, improved version of it, and it is a keyword, not a program.
This has beneficial effects on the ease of use, as shown below. [[ is
understood by KornShell and BASH (e.g. 2.03), but not by the older
POSIX or BourneShell.
And the conclusion:
When should the new test command [[ be used, and when the old one [?
If portability/conformance to POSIX or the BourneShell is a concern, the old syntax should
be used. If on the other hand the script requires BASH, Zsh, or KornShell,
the new syntax is usually more flexible.
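As one illustration of that flexibility, here is a small sketch comparing a suffix check written with [[, with [, and with the portable case statement. The file name and the result variable names are invented for the example.

```shell
#!/bin/bash
file="my report.txt"

# [[ ]]: $file is not word-split, and an unquoted right-hand side of =
# is treated as a glob pattern.
if [[ $file = *.txt ]]; then keyword=match; else keyword=nomatch; fi

# [ ]: the quotes are required, and = is a plain string comparison,
# so "*.txt" is compared literally.
if [ "$file" = "*.txt" ]; then bracket=match; else bracket=nomatch; fi

# Portable pattern matching without [[ uses case.
case $file in
    *.txt) portable=match ;;
    *)     portable=nomatch ;;
esac

echo "[[ ]]: $keyword, [ ]: $bracket, case: $portable"
# prints: [[ ]]: match, [ ]: nomatch, case: match
```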
Parentheses in function definition
Parentheses () are used in function definitions:
function_name () { command1 ; command2 ; }
That is the reason you have to escape parentheses even in command parameters:
$ echo (
bash: syntax error near unexpected token `newline'
$ echo \(
(
$ echo () { command echo The command echo was redefined. ; }
$ echo anything
The command echo was redefined.
Some common and handy uses for brackets, parenthesis, and braces
As mentioned above, sometimes you want a message displayed without losing the return value. This is a handy snippet:
$ [ -f go.mod ] || { echo 'File not found' && false; }
This produces no output and a 0 (true) return value if the file go.mod exists in the current directory. Test the result:
$ echo $?
0
If the file does not exist, you get the message but also a return value of 1 (false), which can also be tested:
$ [ -f fake_file ] || { echo 'File not found'; false; }
File not found
$ echo $?
1
You can also simply create a function to check if a file exists:
fileexists() { [ -f "$1" ]; }
or if a file is readable (it exists and the current user has permission to read it):
canread() { [ -r "$1" ]; }
or if it is a directory:
isdir() { [ -d "$1" ]; }
or is writable for the current user:
canwrite() { [ -w "$1" ]; }
or if a file exists and is not empty (like a log file with content...)
isnotempty() { [ -s "$1" ]; }
There are more details at: TLDP
You can also see if a program exists and is available on the path:
exists () { command -v "$1" > /dev/null 2>&1; }
This is useful in scripts, for example:
# gitit does an autosave commit in the current
# repository if Git is installed and available.
# If git is not available, it will use brew
# (on macOS) to install it.
#
# The first argument passed, if any, is used as
# the commit message; otherwise the default is used.
gitit() {
    # if/else avoids running brew when a git command merely fails
    if exists git; then
        git add --all
        # No inner single quotes: they would become part of the message
        git commit -m "${1:-GitBot: dev progress autosave}"
        git push
    else
        brew install git
    fi
}
Additional info about How to use parentheses to group and expand expressions:
(it is listed on the link syntax-brackets)
Some main points in there:
Group commands in a sub-shell: ( )
(list)
Group commands in the current shell: { }
{ list; }
Test - return the binary result of an expression: [[ ]]
[[ expression ]]
Arithmetic expansion
The format for Arithmetic expansion is:
$(( expression ))
The format for a simple Arithmetic Evaluation is:
(( expression ))
Combine multiple expressions
( expression )
(( expr1 && expr2 ))

bash regular expression format

My code has a problem comparing a variable with a regular expression.
The main problem is here:
if [[ “$alarm” =~ ^[0-2][0-9]\:[0-5][0-9]$ ]]
This "if" is never true and I don't know why. Even if I pass "$alarm" a value like 13:00 or 08:19, it is always false and prints "invalid clock format".
When I try ^[0-2][0-9]:[0-5][0-9]$ on a site for testing regular expressions it works; for example, I compared it with 12:20.
I start my script with the command ./alarm 11:12
Below is the whole code:
#!/bin/bash
masa="`date +%k:%M`"
mp3="$HOME/Desktop/alarm.mp3" #change this
echo "lol";
if [ $# != 1 ]; then
echo "please insert alarm time [24hours format]"
echo "example ./alarm 13:00 [will ring alarm at 1:00pm]"
exit;
fi
alarm=$1
echo "$alarm"
#fix me with better regex >_<
if [[ “$alarm” =~ ^[0-2][0-9]\:[0-5][0-9]$ ]]
then
echo "time now $masa"
echo "alarm set to $alarm"
echo "will play $mp3"
else
echo "invalid clock format"
exit;
fi
while [ $masa != $alarm ];do
masa="`date +%k:%M`" #update time
sleep 1 #dont overload the cpu cycle
done
echo $masa
if [ $masa = $alarm ];then
echo ringggggggg
play $mp3 > /dev/null 2> /dev/null &
fi
exit
I can see a couple of issues with your test.
Firstly, it looks like you may be using the wrong kind of double quotes around your variable (“ ”, rather than "). These "fancy quotes" are being concatenated with your variable, which I assume is what causes your pattern to fail to match. You could change them but within bash's extended tests (i.e. [[ instead of [), there's no need to quote your variables anyway, so I would suggest removing them entirely.
Secondly, your regular expression allows some invalid times at the moment (29:00, for example). I would suggest using something like this:
re='^([01][0-9]|2[0-3]):[0-5][0-9]$'
if [[ $alarm =~ $re ]]
I have deliberately chosen to use a separate variable to store the pattern, as this is the most widely compatible way of working with bash regexes.
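As a quick check of the pattern, here is a small sketch that wraps it in a function (the name is_valid_time is made up) and tries a few candidates, including one wrapped in the "fancy quotes" from the question:

```shell
#!/bin/bash
re='^([01][0-9]|2[0-3]):[0-5][0-9]$'

# Hypothetical helper: succeeds when $1 is a valid HH:MM 24-hour time.
is_valid_time() { [[ $1 =~ $re ]]; }

for candidate in 13:00 08:19 24:00 9:30 "“13:00”"; do
    if is_valid_time "$candidate"; then
        echo "$candidate: valid"
    else
        echo "$candidate: invalid"
    fi
done
```

Only 13:00 and 08:19 should be reported as valid; the last candidate fails because the curly quotes are part of the string being tested.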

Shell: Checking if argument exists and matches expression

I'm new to shell scripting and trying to write the ability to check if an argument exists and if it matches an expression. I'm not sure how to write expressions, so this is what I have so far:
#!/bin/bash
if [[ -n "$1"] && [${1#*.} -eq "tar.gz"]]; then
echo "Passed";
else
echo "Missing valid argument"
fi
To run the script, I would type this command:
# script.sh YYYY-MM.tar.gz
I believe what I have is
if the YYYY-MM.tar.gz is not after script.sh it will echo "Missing valid argument" and
if the file does not end in .tar.gz it echo's the same error.
However, I want to also check if the full file name is in YYYY-MM.tar.gz format.
if [[ -n "$1" ]] && [[ "${1#*.}" == "tar.gz" ]]; then
-eq: (equal) for arithmetic tests
==: to compare strings
See: help test
You can also use:
case "$1" in
*.tar.gz) ;; #passed
*) echo "wrong/missing argument $1"; exit 1;;
esac
echo "ok arg: $1"
As long as the file is in the correct YYYY-MM.tar.gz format, it obviously is non-empty and ends in .tar.gz as well. Check with a regular expression:
if ! [[ $1 =~ ^[0-9]{4}-[0-9]{1,2}\.tar\.gz$ ]]; then
echo "Argument 1 not in correct YYYY-MM.tar.gz format"
exit 1
fi
Obviously, the regular expression above is too general, allowing names like 0193-67.tar.gz. You can adjust it to be as specific as you need it to be for your application, though. I might recommend
^[1-9][0-9]{3}-([1-9]|10|11|12)\.tar\.gz$
to allow only 4-digit years starting with 1000 (support for the first millennium ACE seems unnecessary) and only months 1-12 (no leading zero).
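If the month is always written with two digits, as the YYYY-MM in the question suggests, a variant along these lines might fit better. This is only a sketch: the zero-padded month is an assumption, and the names archive_re and parse_archive_name are invented. It stores the pattern in a variable and uses BASH_REMATCH to pull out the year and month.

```shell
#!/bin/bash
# Anchored, with the dots escaped so they only match a literal ".".
archive_re='^([1-9][0-9]{3})-(0[1-9]|1[0-2])\.tar\.gz$'

# Hypothetical helper: on success, sets the variables year and month.
parse_archive_name() {
    [[ $1 =~ $archive_re ]] || return 1
    year=${BASH_REMATCH[1]}
    month=${BASH_REMATCH[2]}
}

if parse_archive_name "2014-07.tar.gz"; then
    echo "year=$year month=$month"
fi
# prints: year=2014 month=07
```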

Getting the index of the substring on solaris

How can I find the index of a substring which matches a regular expression on solaris10?
Assuming that what you want is to find the location of the first match of a wildcard in a string using bash, the following bash function returns just that, or empty if the wildcard doesn't match:
function match_index()
{
local pattern=$1
local string=$2
local result=${string/${pattern}*/}
[ ${#result} = ${#string} ] || echo ${#result}
}
For example:
$ echo $(match_index "a[0-9][0-9]" "This is a a123 test")
10
If you want to allow full-blown regular expressions instead of just wildcards, replace the "local result=" line with
local result=$(echo "$string" | sed 's/'"$pattern"'.*$//')
but then you're exposed to the usual shell quoting issues.
The go-to options for me are bash, awk and perl. I'm not sure what you're trying to do, but any of the three would likely work well. For example:
f=somestring
string=$(expr match "$f" '.*\(expression\).*')
echo $string
You tagged the question as bash, so I'm going to assume you're asking how to do this in a bash script. Unfortunately, the built-in regular expression matching doesn't save string indices. However, if you're asking this in order to extract the match substring, you're in luck:
# The regex must be unquoted; quoting it forces a literal string match
if [[ $var =~ $regex ]]; then
    n=${#BASH_REMATCH[*]}
    i=0
    while [[ $i -lt $n ]]
    do
        echo "capture[$i]: ${BASH_REMATCH[$i]}"
        let i++
    done
fi
This snippet will output in turn all of the submatches. The first one (index 0) will be the entire match.
You might like your awk options better, though. There's a function match which gives you the index you want. Documentation can be found here. It'll also store the length of the match in RLENGTH, if you need that. To implement this in a bash script, you could do something like:
match_index=$(echo "$var_to_search" | \
awk '{
where = match($0, '"$regex_to_find"')
if (where)
print where
else
print -1
}')
There are a lot of ways to deal with passing the variables in to awk. This combination of piping output and directly embedding one into the awk one-liner is fairly common. You can also give awk variable values with the -v option (see man awk).
Obviously you can modify this to get the length, the match string, whatever it is you need. You can capture multiple things into an array variable if necessary:
match_data=($( ... awk '{ ... print where,RLENGTH,match_string ... }'))
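For comparison, here is a sketch of the -v variant that also captures the match length into an array. The function name match_info is made up, and it assumes an awk that supports -v and match(); on Solaris 10 that may mean nawk or /usr/xpg4/bin/awk instead of the default awk.

```shell
#!/bin/bash
# Hypothetical helper: print "START LENGTH" (START is 1-based) for the first
# match of regex $1 in the first line of string $2, or "-1 -1" if none.
match_info() {
    printf '%s\n' "$2" | awk -v re="$1" '{
        if (match($0, re))
            print RSTART, RLENGTH
        else
            print -1, -1
        exit
    }'
}

# Capture both numbers into an array, as described above.
match_data=($(match_info 'a[0-9][0-9]' 'This is a a123 test'))
echo "start=${match_data[0]} length=${match_data[1]}"
# prints: start=11 length=3
```

Note that awk processes backslash escapes in values passed with -v, so a regex containing backslashes may need them doubled.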
If you use bash 4.x you can source oobash, a string library written in bash in an OO style:
http://sourceforge.net/projects/oobash/
String is the constructor function:
String a abcda
a.indexOf a
0
a.lastIndexOf a
4
a.indexOf da
3
There are many more "methods" for working with strings in your scripts:
-base64Decode -base64Encode -capitalize -center
-charAt -concat -contains -count
-endsWith -equals -equalsIgnoreCase -reverse
-hashCode -indexOf -isAlnum -isAlpha
-isAscii -isDigit -isEmpty -isHexDigit
-isLowerCase -isSpace -isPrintable -isUpperCase
-isVisible -lastIndexOf -length -matches
-replaceAll -replaceFirst -startsWith -substring
-swapCase -toLowerCase -toString -toUpperCase
-trim -zfill