Shell REGEX for date - regex

I am trying to fix the users input if they input an incorrect date format that is not (YYYY-MM-DD) but I cant figure it out. Here is what I have:
while [ "$startDate" != "^[0-9]{4}-[0-9]{2}-[0-9]{2}$" ]
do
echo "Please retype the start date (YYYY-MM-DD):"
read startDate
done

Instead of !=, you have to use ! $var =~ regex to perform regex comparisons:
[[ $date =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]
^^
So that your script can be like this:
date=""
while [[ ! $date =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; do
echo "enter date (YYYY-MM-DD)"
read $date
done

Why not convert the incorrect format to desired one using date program:
$ date -d "06/38/1992" +"%Y-%m-%d"
1992-06-28
You can also check for conversion failure by checking return value.
$ date -d "06/38/1992" +"%Y-%m-%d"
date: invalid date ‘06/38/1992’
$ [ -z $? ] || echo "Error, failed to parse date"

Related

Bash regex not recognizing a single space " "

I'm trying to solve a problem that appeared in my script which doesn't let me match the date+time (YYYY-MM-DD HH:MM:SS) inside a for loop
list='"dt_txt":"2022-06-03 21:00:00"},'
regex_datehour='"dt_txt":"([0-9,-]*.[0-9,:]*)'
for i in $list; do
[[ $i =~ $regex_datehour ]] && echo "${BASH_REMATCH[1]}"
done
It seems that the "." between the two pair of brackets it's not recognizing the space! that's because inside of the list, if I replace the empty space between the date and the time by a _, it works as intended! list='"dt_txt":"2022-06-03_21:00:00"},'
desired output:
2022-06-03 21:00:00
what I get:
2022-06-03
The problem here is one that catches a lot of people, and that is whitespace breaking. In the for loop, your $list variable is not quoted, and it contains a space:
$ list='"dt_txt":"2022-06-03 21:00:00"},'
$ for i in $list ; do echo "i = $i" ; done ;
i = "dt_txt":"2022-06-03
i = 21:00:00"},
Make sure to put double-quotes around all strings that contain variables except regexes:
Using an array for list, which is what makes sense when using the for loop from your original code, it would look something like this:
#!/usr/bin/env bash
# filename: re.sh
list=(
'"dt_txt":"2022-06-03 21:00:00"},'
'"dt_txt":"2022-06-03 22:00:00"},'
'"dt_txt":"2022-06-03 23:00:00"},'
)
regex_datehour='"dt_txt":"([0-9,-]*.[0-9,:]*)'
for i in "${list[#]}" ; do
[[ "$i" =~ $regex_datehour ]] && echo "${BASH_REMATCH[1]}"
done
$ ./re.sh
2022-06-03 21:00:00
2022-06-03 22:00:00
2022-06-03 23:00:00

preg_match_all equivalent for BASH?

I have a string like this
foo:collection:indexation [options] [--] <text> <text_1> <text_2> <text_3> <text_4>
And i want to use bash regex to get an array or string that I can split to get this in order to check if the syntax is correct
["text", "text_1", "text_2", "text_3", "text_4"]
I have tried to do this :
COMMAND_OUTPUT=$($COMMAND_HELP)
# get the output of the help
# regex
ARGUMENT_REGEX="<([^>]+)>"
GOOD_REGEX="[a-z-]"
# get all the arguments
while [[ $COMMAND_OUTPUT =~ $ARGUMENT_REGEX ]]; do
ARGUMENT="${BASH_REMATCH[1]}"
# bad syntax
if [[ ! $ARGUMENT =~ $GOOD_REGEX ]]; then
echo "Invalid argument '$ARGUMENT' for the command $FILE"
echo "Must only use characters [a-z:-]"
exit 5
fi
done
But the while does not seem to be appropriate since I always get the first match.
How can I get all the matches for this regex ?
Thanks !
The loop doesn't work because every time you're just testing the same input string against the regexp. It doesn't know that it should start scanning after the match from the previous iteration. You'd need to remove the part of the string up to and including the previous match before doing the next test.
A simpler way is to use grep -o to get all the matches.
$COMMAND_HELP | grep -o "$ARGUMENT_REGEX" | while read ARGUMENT; do
if [[ ! $ARGUMENT =~ $GOOD_REGEX ]]; then
echo "Invalid argument '$ARGUMENT' for the command $FILE"
echo "Must only use characters [a-z:-]"
exit 5
fi
done
Bash doesn't have this directly, but you can achieve a similar effect with a slight modification.
string='foo...'
re='<([^>]+)>'
while [[ $string =~ $re(.*) ]]; do
string=${BASH_REMATCH[2]}
# process as before
done
This matches the regex we want and also everything in the string after the regex. We keep shortening $string by assigning only the after-our-regex portion to it on every iteration. On the last iteration, ${BASH_REMATCH[2]} will be empty so the loop will terminate.

Regex validation for grouping in Bash Scripting

I have tried creating a regex validation for Bash and have been doing this. It's working only for the first digit, the second one no. Can you help me out here?
while [[ $usrInput =~ [^[1-9]|[0-2]{1}$] ]]
do
echo "This is not a valid option. Please type an integer between 1 and 12"
read usrInput
done
You can't nest ranges. You want something like
while ! [[ $usrInput =~ ^[0-9]|11|12$ ]]; do
although in general it would be simpler to compare a digit string numerically:
min=1
max=12
until [[ $usrInput =~ ^[0-9]+$ ]] &&
(( usrInput >= min && usrInput <= max )); do
I believe you (or bash) are not grouping the expressions correctly. Either way, this'll work:
while read usrInput
do
if [[ "$usrInput" =~ ^([1-9]|1[0-2])$ ]]
then
break
else
echo "Number not between 1 and 12"
fi
done

Bash regex to match substring with exact integer range

I need to match a string $str that contains any of
foo{77..93}
and capture the above substring in a variable.
So far I've got:
str=/random/string/containing/abc-foo78_efg/ # for example
if [[ $str =~ (foo[7-9][0-9]) ]]; then
id=${BASH_REMATCH[1]}
fi
echo $id # gives foo78
but this also captures ids outside of the target range (e.g. foo95).
Is there a way to restrict the regex to an exact integer range? (tried foo[77-93] but that doesn't work.
Thanks
If you want to use a regex, you're going to have to make it slightly more complex:
if [[ $str =~ foo(7[7-9]|8[0-9]|9[0-3]) ]]; then
id=${BASH_REMATCH[0]}
fi
Note that I have removed the capture group around the whole pattern and am now using the 0th element of the match array.
As an aside, for maximum compatibility with older versions of bash, I would recommend assigning the pattern to a variable and using in the test like this:
re='foo(7[7-9]|8[0-9]|9[0-3])'
if [[ $str =~ $re ]]; then
id=${BASH_REMATCH[0]}
fi
An alternative to using a regex would be to use an arithmetic context, like this:
if (( "${str#foo}" >= 77 && "${str#foo}" <= 93 )); then
id=$str
fi
This strips the "foo" part from the start of the variable so that the integer part can be compared numerically.
Sure is easy to do with Perl:
$ echo foo{1..100} | tr ' ' '\n' | perl -lne 'print $_ if m/foo(\d+)/ and $1>=77 and $1<=93'
foo77
foo78
foo79
foo80
foo81
foo82
foo83
foo84
foo85
foo86
foo87
foo88
foo89
foo90
foo91
foo92
foo93
Or awk even:
$ echo foo{1..100} | tr ' ' '\n' | awk -F 'foo' '$2>=77 && $2<=93
{print}'
foo77
foo78
foo79
foo80
foo81
foo82
foo83
foo84
foo85
foo86
foo87
foo88
foo89
foo90
foo91
foo92
foo93

function to validate dates using regex in bash

I have function like following to be able to validate two dates passed to it:
function validate_dates() {
# validate date format to be yyyy-mm-dd
local regex="^[[:digit]]{4}-[[:digit]]{2}-[[:digit]]{2}$"
local dates=( "$1" "$2" )
printf "%s\n" "${dates[#]}"
for __date in "$dates"
do
echo "$__date"
[[ $__date =~ $regex ]] || error_exit "One of dates is malformed!" # error_exit is just function helper to exit with message
done
}
However when I call function -
validate_dates "2013-05-23" "2014-07-28"
I get:
2013-05-23
2014-07-28
2013-05-23
One of dates is malformed!
Why does it break on correctly formatted date?
"^[[:digit]]{4}-[[:digit]]{2}-[[:digit]]{2}$"
probably should be:
"^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}$"
You missed some colons.
Also, for __date in "$dates" perhaps should be for __date in "${dates[#]}". "$dates" would only expand to the first element of the array.
If you want to have more than just 2 parameters or have variable parameters, change
local dates=( "$1" "$2" )
to
local dates=("$#")
My version:
function validate_dates {
local regex='^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}$'
printf "%s\n" "$#"
for __; do
[[ $__ =~ $regex ]] || error_exit "$__ is an invalid date."
done
}
Please note you cannot "validate date" using only regex. A second pass using date might be useful:
for __date in "$dates"
do
echo "$__date"
{ [[ $__date =~ $regex ]] && (date -d "$__date" > /dev/null 2>&1) } ||
error_exit "One of dates is malformed!" # error_exit is just function helper to exit with message
done
Using grep with -P makes the regex much easier to read, and you can simplify your function as below:
function validate_dates {
for i in $#; do
grep -qP '^\d{4}-\d{2}-\d{2}$' <<< $i ||\
error_exit "One of dates is malformed!"
done
}