Bash Regex comparison not working - regex

keyFileName=$1;
for fileExt in "${validTypes[@]}"
do
echo $fileExt;
if [[ $keyFileName == *.$fileExt ]]; then
keyStatus="true";
fi
done;
I am trying to check the file extension of a file passed in against an array of multiple file extensions. However it doesn't seem to be working properly. Any help?

validTypes=(".txt" ".mp3")
keyFileName="$1"
for fileExt in "${validTypes[@]}"
do
echo $fileExt;
if [[ $keyFileName =~ ^.*$fileExt$ ]]; then
keyStatus="true";
echo "Yes"
fi
done;
Effectively, you could change your if statement to either:
if [[ $keyFileName == ?*$fileExt ]] # Glob pattern case, ? denotes single char
or:
if [[ $keyFileName =~ .*$fileExt ]] # Regex case, . denotes single char
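As a minimal sketch of the glob variant, the loop can be wrapped in a small function. The name has_valid_ext and the sample file names are made up for illustration; quoting "$ext" keeps the dot literal, and ?* requires at least one character before the extension.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the file name ends in one of the given extensions.
has_valid_ext() {
    local name=$1; shift
    local ext
    for ext in "$@"; do
        # Quoting "$ext" makes it a literal string; ?* needs at least one char before it.
        if [[ $name == ?*"$ext" ]]; then
            return 0
        fi
    done
    return 1
}

validTypes=(".txt" ".mp3")
if has_valid_ext "song.mp3" "${validTypes[@]}"; then
    echo "song.mp3: valid"
fi
if ! has_valid_ext "notes.txt.bak" "${validTypes[@]}"; then
    echo "notes.txt.bak: invalid"
fi
```

Because a glob in [[ ]] always has to match the whole string, no explicit end anchor is needed here, unlike in the regex form.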

Looping over the array to do a regex match on each element seems rather inefficient. You're using regex; it's easy to combine the expressions and avoid looping at all.
Mangling the array into a valid regex is not entirely trivial, though. Here's my attempt:
validTypes=('\.txt' '\.mp3')
fileExtRe=$(printf '|%s' "${validTypes[@]}")
# Trim off the first alternation, add parens and anchor
fileExtRe="(${fileExtRe#?})$"
if [[ $keyFileName =~ $fileExtRe ]]; then
: # matched; handle the valid extension here
fi
Notice how the elements in validTypes are regular expressions now, with the dot escaped to only match a literal dot.

Related

bash IF not matching variable that contains regex numbers

DPHPV=/usr/local/nginx/conf/php81-remi.conf
I am unable to figure out how to match a string that contains any 2 digits:
if [[ "$DPHPV" =~ *"php[:digit:][:digit:]-remi.conf"* ]]
You are not using the right regex here as * is a quantifier in regex, not a placeholder for any text.
Actually, you do not need a regex, you may use a mere glob pattern like
if [[ "$DPHPV" == *php[[:digit:]][[:digit:]]-remi.conf ]]
Note
== - enables glob matching
*php[[:digit:]][[:digit:]]-remi.conf - matches any text with *, then php, then two digits (note that POSIX character classes must be used inside bracket expressions), and then -remi.conf at the end of the string.
See the online demo:
#!/bin/bash
DPHPV='/usr/local/nginx/conf/php81-remi.conf'
if [[ "$DPHPV" == *php[[:digit:]][[:digit:]]-remi.conf ]]; then
echo yes;
else
echo no;
fi
Output: yes.
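If a regex is still preferred, for example to extract the digits, a rough equivalent with =~ might look like the sketch below. The function name php_version is an assumption made for this example; the pattern is kept in a variable and anchored with $ so that it only matches at the end of the path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the two-digit PHP version found in the path, or fails.
php_version() {
    # Keeping the regex in a variable avoids shell quoting surprises.
    local re='php([[:digit:]]{2})-remi\.conf$'
    if [[ $1 =~ $re ]]; then
        # BASH_REMATCH[1] holds the text matched by the first parenthesised group.
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

DPHPV='/usr/local/nginx/conf/php81-remi.conf'
php_version "$DPHPV"
```

Note that no leading or trailing * is used: an unanchored regex already matches anywhere in the string.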

Bash regular expression with quotes

I am writing a script and I want to check a variable for a format. This is the function I use :
check_non_numeric() {
#re='^\".*\"$'
re='\[^\]*\'
if ! [[ $1 =~ $re ]] ; then
echo "'$1' is not a valid format - \"[name]\" "
exit 1
fi
}
I want the regular expression to match a string with anything but a quotation mark inside and quotation marks around it ("a" or "string" or "dsfo!^$**#"). The problem is that the regular expressions I came up with don't work. I have used a very similar function to check whether a variable is an integer or float, and it worked there. Could you please tell me what the regular expression in question should be?
Thank you very much
I'm assuming you meant you want to match anything that is not a string surrounded by quotes. It's easier to write the regex for the quoted form and then negate the test with !. Here are a couple of ways to do it.
if [[ ! $(expr "$string" : '\".*\"') -gt 0 ]]; then echo "expr good"; fi
if [[ ! "$string" =~ \".*\" ]]; then echo "test good"; fi
Make sure you quote your variable you are testing with expr (which is there for edification purposes only).
As you want to match anything except string with quotation marks, you just target the quotation mark:
re='["]'
if [[ ! $1 =~ $re ]] ; then
Actually you don't need regex for this. Globbing will be enough:
if [[ ! $1 = *\"* ]]; then
...
fi
Your regex is very, very far off. \[ matches a literal left square bracket, and ^ (outside a character class) matches beginning of line.
Something like '^"[^"]*"' should work, if that's really what you want.
However, I kind of doubt that. In order to pass a value in literal double quotes, you would need something like
yourprogram '"value"'
or
yourprogram "\"value\""
which I would certainly want to avoid if I were you.
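For the stricter reading of the question (the whole value must be a quoted string with no quotation marks inside), a sketch could look like this. The function name is_quoted is hypothetical, and the trailing $ anchor is an addition assumed here so that text after the closing quote is rejected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the whole argument is "..." with no inner quotes.
is_quoted() {
    # Single quotes keep the double quotes and ^ $ literal for the regex engine.
    local re='^"[^"]*"$'
    [[ $1 =~ $re ]]
}

for s in '"string"' 'string' '"a"b"'; do
    if is_quoted "$s"; then
        echo "$s: ok"
    else
        echo "$s: rejected"
    fi
done
```

Storing the pattern in a variable and leaving $re unquoted on the right-hand side is what makes bash treat it as a regex rather than a literal string.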

Using Bash regex match to test membership

Below, I use ALLOWED as container to test a token.
I am using a Bash regex match syntax =~ where the right hand side should be an extended regular expression.
In Bash's regular expression matching with the operator =~, the left hand side operand is matched against the extended regular expression (ERE) on the right hand side. Check a related question on using date regex.
But I can't see str1 as a regex and I don't know why ALLOWED matches a string which is present inside it. Even though this works in this case, having a regex (str1) as the test string leaves it open to tricky bugs in the future.
export ALLOWED="str0 str1 strn"
export STR1="str1"
export STR2="str2"
if [[ $ALLOWED =~ ${STR1} ]]; then
echo "how does it this work?"
fi
if [[ $ALLOWED =~ ${STR2} ]]; then
:
else
echo "does not work."
fi
Questions:
Why/ How does this work?
What's a better way to test for an element in a list in bash?
The syntax is content =~ regex. For example, think about how this simple phone number validation works:
$ phone="555-443-2321"; if [[ $phone =~ [0-9-]+ ]]; then echo PASS; fi
As in your example, the right hand side is the regular expression and the left hand side is the content.
Your regex can also be a string literal, in which case the check is whether the content contains that substring:
$ phone="555-443-2321"; if [[ $phone =~ "555" ]]; then echo PASS; fi
If it helps, think of that as the regex .*555.*
If I understand right, the confusion is because $a =~ $b checks whether there's a match for $b in $a, not whether $a as a whole matches. [[ "str0 str1 strn" =~ str1 ]] succeeds because there's a match for the (trivial) regex str1 somewhere in "str0 str1 strn".
If you want to check for a match to the entire string, you need to anchor the regex with a ^ at the beginning, and $ at the end: [[ $ALLOWED =~ ^${STR1}$ ]]
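For the membership question itself, one possible sketch avoids regex entirely: pad both the list and the token with spaces and compare with a quoted glob. The function name is_allowed is hypothetical, and the approach assumes tokens are separated by single spaces and contain no whitespace themselves.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if token is a whole word in the space-separated list.
is_allowed() {
    local token=$1 list=$2
    # The surrounding spaces stop "str" from matching inside "str1";
    # the quotes make the token a literal string, not a pattern.
    [[ " $list " == *" $token "* ]]
}

ALLOWED="str0 str1 strn"
if is_allowed "str1" "$ALLOWED"; then
    echo "str1 is allowed"
fi
if ! is_allowed "str" "$ALLOWED"; then
    echo "str is not allowed"
fi
```

With the plain =~ test from the question, str would have been accepted because it occurs as a substring of every element.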

Bash scripting, regex in if statement

I'm pretty new to bash scripting and regexp and have a question.
I want to check to see if my variable $name starts with a-d, e-h, i-l etc and do some stuff accordingly. If the string starts with "the." or "The." it should check the first letter after the period.
My problem is that if $name consists of "the.anchor" both the a-d0-9 and q-t will be true. Do you guys have any idea what's wrong?
if [[ $name =~ ^([tT]he\.)?[a-dA-D0-9]+ ]]; then
do some stuff
fi
if [[ $name =~ ^([tT]he\.)?[e-hE-H]+ ]]; then
do some stuff
fi
if [[ $name =~ ^([tT]he\.)?[i-lI-L]+ ]]; then
do some stuff
fi
if [[ $name =~ ^([tT]he\.)?[m-pM-P]+ ]]; then
do some stuff
fi
if [[ $name =~ ^([tT]he\.)?[q-tQ-T]+ ]]; then
do some stuff
fi
if [[ $name =~ ^([tT]he\.)?[u-wU-W]+ ]]; then
do some stuff
fi
if [[ $name =~ ^([tT]he\.)?[x-zX-Z]+ ]]; then
do some stuff
fi
Thanks in advance!
Your first part is optional:
([tT]he\.)?
So the.anchor matches the pattern ^([tT]he\.)?[a-dA-D0-9]+ because the. matches ^([tT]he\.)? and the a matches [a-dA-D0-9]+. It also matches ^([tT]he\.)?[q-tQ-T]+ because ^([tT]he\.)? is optional and the leading t matches [q-tQ-T]+. Note that neither pattern consumes the whole input; the second one grabs only the first character.
You can verify this by having bash echo the match:
echo "${BASH_REMATCH[0]}"
Which should print the.a in the first case (the n is not in [a-dA-D0-9], so the match stops there) and t in the second.
You do not have an end anchor on the pattern so only part of the input needs to be matched. If you made the second pattern ^([tT]he\.)?[q-tQ-T]+$ then it would not match.
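The matching behaviour described above can be checked with a short script. This is only an illustrative sketch; the helper name show_match is made up, and it simply prints whatever part of the input the regex matched.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the matched portion of name for the given regex, or fails.
show_match() {
    local name=$1 re=$2
    if [[ $name =~ $re ]]; then
        # BASH_REMATCH[0] is the part of the string matched by the whole regex.
        printf '%s\n' "${BASH_REMATCH[0]}"
    else
        return 1
    fi
}

show_match "the.anchor" '^([tT]he\.)?[a-dA-D0-9]+'                    # prints the.a
show_match "the.anchor" '^([tT]he\.)?[q-tQ-T]+'                       # prints t
show_match "the.anchor" '^([tT]he\.)?[q-tQ-T]+$' || echo "no match"   # prints no match
```

The second call shows why the q-t branch fires: the optional group matches nothing and the leading t of "the" satisfies the character class.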
Alternatively, in regex engines that support possessive quantifiers (such as PCRE or Java), you could make the first part possessive: ^([tT]he\.)?+. The engine then does not give back the. once it has matched it, so when [q-tQ-T]+ fails the whole match fails. Note that the POSIX extended regular expressions used by bash's =~ do not define possessive quantifiers, so this does not apply inside [[ ]].
I figured out a way to fix my problem by using elif statements and putting the q-t part as the last one.
I think the ? can be removed as the if statement is already doing the test. The + matches the preceding item at least once and would only be needed if you want to match more than one instance of the letters.
You can do it like this:
if [[ $name =~ ^[tT]he\.[a-dA-D0-9] ]]; then
do some stuff
fi
The condition will only return true if the first character after ^[tT]he\. is [a-dA-D0-9].
However, I tend to think case is a cleaner solution than if statements when matching lists of characters against variables.
case $name in
[tT]he\.[a-dA-D0-9]*)
do some stuff
;;
esac
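Extending the case idea to all the ranges from the question, one way to avoid the double match is to strip the optional prefix first and then branch on the first remaining character. This is a sketch under assumptions: the function name bucket and the printed labels are placeholders for "do some stuff".

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints which letter range the name falls into.
bucket() {
    local name=$1
    # Drop a leading "the." or "The." so the next character decides the bucket.
    name=${name#[tT]he.}
    case $name in
        [a-dA-D0-9]*) echo "a-d" ;;
        [e-hE-H]*)    echo "e-h" ;;
        [i-lI-L]*)    echo "i-l" ;;
        [m-pM-P]*)    echo "m-p" ;;
        [q-tQ-T]*)    echo "q-t" ;;
        [u-wU-W]*)    echo "u-w" ;;
        [x-zX-Z]*)    echo "x-z" ;;
        *)            echo "other" ;;
    esac
}

bucket "the.anchor"
bucket "The.Quest"
bucket "theory"
```

Since case stops at the first matching branch, only one action runs per name, and a name like theory (no dot after "the") is still classified by its own first letter.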

How should I get bash 3.2 to find a pattern between wildcards

Trying to compare input to a file containing alert words,
read MYINPUT
alertWords=( `cat "AlertWordList" `)
for X in "${alertWords[@]}"
do
# the wildcards in my expression do not work
if [[ $MYINPUT =~ *$X* ]]
then
echo "#1 matched"
else
echo "#1 nope"
fi
done
The =~ operator deals with regular expressions, and so to do a wildcard match like you wanted, the syntax would look like:
if [[ $MYINPUT =~ .*$X.* ]]
However, since this is regex, that's not needed, as it's implied that the match could be anywhere in the string (unless it's anchored using ^ and/or $), so this should suffice:
if [[ $MYINPUT =~ $X ]]
Be mindful that if your "words" happen to contain regex metacharacters, then this might do strange things.
I'd avoid =~ here because as FatalError points out, it will interpret $X as a regular expression and this can lead to surprising bugs (especially since it's an extended regular expression, so it has more special characters than standard grep syntax).
Instead, you can just use == because bash treats the RHS of == as a globbing pattern:
read MYINPUT
alertWords=($(<"AlertWordList"))
for X in "${alertWords[@]}"
do
# the wildcards in my expression do work :-)
if [[ $MYINPUT == *"$X"* ]]
then
echo "#1 matched"
else
echo "#1 nope"
fi
done
I've also replaced cat in your alertWords assignment with $(<"AlertWordList"), which keeps the file reading inside the shell instead of spawning another process to do it.
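A variation on the same idea, sketched here with a while read loop so it should also run on bash 3.2, wraps the check in a function. The name contains_alert_word is hypothetical, and the sketch assumes one alert word per line in the file (the original splits on any whitespace).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if input contains any word from the file (one per line).
contains_alert_word() {
    local input=$1 file=$2 word
    # "|| [[ -n $word ]]" also handles a last line without a trailing newline.
    while IFS= read -r word || [[ -n $word ]]; do
        [[ -z $word ]] && continue          # skip empty lines
        # Quoting "$word" makes it a literal substring, not a glob pattern.
        if [[ $input == *"$word"* ]]; then
            return 0
        fi
    done < "$file"
    return 1
}

list=$(mktemp)
printf '%s\n' "fire" "a.c" > "$list"
contains_alert_word "there is a fire" "$list" && echo "matched"
contains_alert_word "abc" "$list" || echo "nope"
rm -f "$list"
```

The second call illustrates the difference from =~: as a regex, a.c would match abc, while the quoted glob only matches the literal text a.c.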
If you want to use patterns, not regexes for matching, you can use case:
read MYINPUT
alertWords=( `cat "AlertWordList" `)
for X in "${alertWords[@]}"
do
# the wildcards in my expression do not work
case "$MYINPUT" in
*$X* ) echo "#1 matched" ;;
* ) echo "#1 nope" ;;
esac
done