Bash regular expression with quotes - regex

I am writing a script and I want to check a variable for a format. This is the function I use :
check_non_numeric() {
#re='^\".*\"$'
re='\[^\]*\'
if ! [[ $1 =~ $re ]] ; then
echo "'$1' is not a valid format - \"[name]\" "
exit 1
fi
}
I want the regular expression to match a string that is surrounded by quotation marks and contains anything but a quotation mark inside ("a" or "string" or "dsfo!^$**#"). The problem is that the regular expressions I came up with don't work. I have used a very similar function to check whether a variable is an integer or a float, and it worked there. Could you please tell me what the regular expression in question should be?
Thank you very much

I'm assuming you meant you want to match anything that is not a string surrounded by quotes. It's easier to write the regex for what you do want to match and let the bash test negate it with !. Here are a couple of ways to do it.
if [[ ! $(expr "$string" : '\".*\"') -gt 0 ]]; then echo "expr good"; fi
if [[ ! "$string" =~ \".*\" ]]; then echo "test good"; fi
Make sure you quote your variable you are testing with expr (which is there for edification purposes only).

As you want to match anything except string with quotation marks, you just target the quotation mark:
re='["]'
if [[ ! $1 =~ $re ]] ; then
Actually you don't need regex for this. Globbing will be enough:
if [[ ! $1 = *\"* ]]; then
...
fi
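A minimal sketch of the glob check above, wrapped in a helper so it can be reused. The function name contains_dquote is hypothetical and not part of the question.

```bash
# Hypothetical helper: succeeds if the argument contains a double quote.
contains_dquote() {
    # The unquoted * are glob wildcards; \" is a literal double quote.
    [[ $1 == *\"* ]]
}

contains_dquote 'say "hi"' && echo "has a quote"
contains_dquote 'plain'    || echo "no quote"
```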

Your regex is very, very far off. \[ matches a literal left square bracket, and ^ (outside a character class) matches beginning of line.
Something like '^"[^"]*"$' should work, if that's really what you want.
However, I kind of doubt that. In order to pass a value in literal double quotes, you would need something like
yourprogram '"value"'
or
yourprogram "\"value\""
which I would certainly want to avoid if I were you.
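If the literal-quotes format really is what you want, the check could look roughly like this. It is a sketch that assumes the whole value must match, so the pattern is anchored at both ends; the function name is_quoted_name is made up for illustration, and it returns a status instead of calling exit so the caller decides what to do.

```bash
# Hypothetical helper: succeeds only for "..." with no double quote inside.
is_quoted_name() {
    # Opening quote, any number of non-quote characters, closing quote.
    local re='^"[^"]*"$'
    [[ $1 =~ $re ]]
}

for candidate in '"a"' '"dsfo!^$**#"' 'bare' '"a"b"'; do
    if is_quoted_name "$candidate"; then
        echo "ok:      $candidate"
    else
        echo "invalid: $candidate"
    fi
done
```

Keeping the pattern in a single-quoted variable and expanding it unquoted on the right of =~ avoids most quoting surprises.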

Related

Mix of regex and non-regex in bash if-statement

Inside of my $foo variable I have this data (please pay close attention to the .s and ,s):
,example.com,de.wikipedia.org,reddit,stackoverflow.com.,amazon.,
I am trying to write an if statement in bash that basically works like this:
if [[ "${foo}" =~ *','[a-z0-9]','* || "${foo}" =~ *','[a-z0-9]'.,'* ]]; then
echo "Invalid input detected"
else
echo "OK"
fi
It would echo Invalid input detected since reddit and amazon. are in $foo.
If I change the contents of $foo to be:
,example.com,de.wikipedia.org,www.reddit.com,stackoverflow.com.,amazon.com,
Then it would echo OK.
I am using bash 3.2.57(1)-release on OS X 10.11.6 El Capitan.
Try:
if [[ $foo =~ ,[a-z0-9]*, || $foo =~ ,[a-z0-9]*\., ]]; then
echo "Invalid input detected"
else
echo "OK"
fi
Notes:
=~ is a regular expression operator. The right-hand-side needs to be a regular expression, not a glob.
, is not a shell-active character. Thus, it does not need any special quoting.
[a-z0-9] matches exactly one alphanumeric. Since we want to allow any number of them, use [a-z0-9]*
In regular expressions, ','* matches zero or more commas. This is not what you want. One might write ,.* which, because . is a wildcard, matches a comma followed by zero or more of anything. Since the regex is not anchored to the end, adding a final .* makes no difference.
Inside of [[...]] there is no word splitting. So shell variables do not need the double-quoting that they need elsewhere.
Note that, in [a-z0-9], the exact characters that match a-z or 0-9 depend on the collation order in the locale.
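To try the two patterns against both sample values of $foo, they can be put in variables and wrapped in a function. This is only a sketch; the names has_invalid_entry, no_dot and trailing_dot are assumptions made for the example.

```bash
# Hypothetical helper: succeeds if some comma-delimited entry is made of
# alphanumerics only, or of alphanumerics followed by one trailing dot.
has_invalid_entry() {
    local no_dot=',[a-z0-9]*,'          # e.g. ",reddit,"
    local trailing_dot=',[a-z0-9]*\.,'  # e.g. ",amazon.,"
    [[ $1 =~ $no_dot || $1 =~ $trailing_dot ]]
}

foo=',example.com,de.wikipedia.org,reddit,stackoverflow.com.,amazon.,'
if has_invalid_entry "$foo"; then
    echo "Invalid input detected"
else
    echo "OK"
fi
```

Note that an empty entry (two adjacent commas) also counts as invalid here, because [a-z0-9]* matches zero characters.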

Bash Regex comparison not working

keyFileName=$1;
for fileExt in "${validTypes[@]}"
do
echo $fileExt;
if [[ $keyFileName == *.$fileExt ]]; then
keyStatus="true";
fi
done;
I am trying to check the file extension of a file passed in against an array of multiple file extensions. However it doesn't seem to be working properly. Any help?
validTypes=(".txt" ".mp3")
keyFileName="$1"
for fileExt in "${validTypes[@]}"
do
echo $fileExt;
if [[ $keyFileName =~ ^.*$fileExt$ ]]; then
keyStatus="true";
echo "Yes"
fi
done;
Effectively, you could change your if statement to either:
if [[ $keyFileName == ?*$fileExt ]] # Glob pattern case, ? denotes single char
or:
if [[ $keyFileName =~ .*$fileExt ]] # Regex case, . denotes single char
Looping over the array to do a regex match on each element seems rather inefficient. You're using regex; it's easy to combine the expressions and avoid looping at all.
Mangling the array into a valid regex is not entirely trivial, though. Here's my attempt:
validTypes=('\.txt' '\.mp3')
fileExtRe=$(printf '|%s' "${validTypes[@]}")
# Trim off the first alternation, add parens and anchor
fileExtRe="(${fileExtRe#?})$"
if [[ $keyFileName =~ $fileExtRe ]]; then
:
fi
Notice how the elements in validTypes are regular expressions now, with the dot escaped to only match a literal dot.
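The same idea as a self-contained function, in case you want to test it in isolation. This is a sketch: has_valid_ext is a hypothetical name, and it assumes every element of validTypes is already a valid regex fragment.

```bash
validTypes=('\.txt' '\.mp3')

# Hypothetical helper: succeeds if the file name ends in one of validTypes.
has_valid_ext() {
    local re
    re=$(printf '|%s' "${validTypes[@]}")  # gives |\.txt|\.mp3
    re="(${re#?})\$"                        # gives (\.txt|\.mp3)$
    [[ $1 =~ $re ]]
}

if has_valid_ext "song.mp3"; then
    keyStatus="true"
fi
echo "keyStatus=${keyStatus:-false}"
```

Because the dot is escaped, a name such as atxt does not pass, and because of the $ anchor neither does notes.txt.bak.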

bash. regexp using bash_rematch

I have a var which can have single quotes or spaces before and/or after:
var="' /path/to/somewhere '"
var=" /path/to/somewhere " # <- here is a space at the end
var="/path/to/somewhere'"
var=" /path/to/somewhere '"
I need a regexp using bash rematch to clean possible single quotes or blank spaces before and after
I know it can be done in this way:
var=${var##+([ \'])}
var=${var%%+([ \'])}
But I need it with BASH_REMATCH (long to explain xd). I'm trying with:
[[ ${var} =~ ^([\' ]*)?(.+)([\' ])?$ ]] && var="${BASH_REMATCH[1]}"
But it doesn't work. Probably .+ is getting the rest of the string. How can I get only the interesting part? Thanks.
The .+ in your example is what's causing the problem, as it is greedy, so it will consume the rest of the line.
In this case, you can prevent it from doing so by requiring that the part in the middle ends in something other than a space or ', like this:
re='^[ '"'"']*(.*[^ '"'"'])[ '"'"']*$'
[[ $var =~ $re ]] && echo "${BASH_REMATCH[1]}"
The nasty-looking '"'"' are needed to insert a literal ' into a single-quoted string. When working with regular expressions in bash, the recommended method is to define a variable containing the pattern in single quotes and then to use it, unquoted (this method works in all versions of bash that support regular expressions).
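If the '"'"' sequences are too hard to read, the same pattern can be written in a double-quoted string, where ' needs no escaping and only the final $ has to be protected. A sketch, with a made-up function name:

```bash
# Hypothetical helper: prints the argument without leading/trailing
# spaces and single quotes; fails if nothing else is left.
trim_quotes_and_spaces() {
    # Double-quoted so ' is literal; \$ keeps the end anchor from being
    # read as the start of an expansion.
    local re="^[ ']*(.*[^ '])[ ']*\$"
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

var="' /path/to/somewhere '"
var=$(trim_quotes_and_spaces "$var")
echo "[$var]"
```

Spaces and quotes in the middle of the value are kept, since only the two ends are constrained.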

shell script odd regex

I have some regexes that behave oddly in my shell script. They are stored in variables, and I have tried every which way to get them to behave, but they don't seem to do any regex matching. I know my regex quite well thanks to regex101. Here is what a sample looks like:
fname="direcheck"
FIND="*"
if [[ $fname =~ $FIND ]]; then
echo "no quotes"
fi
if [[ "$fname" =~ "$FIND" ]]; then
echo "with quotes"
fi
right now it will display nothing
if i change find to
FIND="[9]*"
then it prints no quotes
if i say
FIND="[a-z]*"
then it prints no quotes
if i say
FIND="dircheck"
then nothing prints
if i say
FIND="*ck"
then nothing prints
I don't get how this regex is working
how do i use these variables, and what is the proper syntax?
* and *ck are invalid regular expressions. It would work (with no quotes) if you were comparing with ==, not =~. If you want to use the same functionality that you get in == for them, the equivalent regexps are .* and .*ck.
[9]* is any number (including zero) of characters that are 9. There are zero 9 characters in your direcheck, so it matches. (Edited from brainfart, thanks chepner)
dircheck is not found in direcheck, so not printing anything is hardly surprising.
[a-z]* is any number of characters that are between a and z (i.e. any number of lowercase letters). This will match, assuming it's not quoted.
I finally figured it out, and why it was working so oddly
[a-z]* and [9]* and [anythinghere]* they all match because it matches zero or more times. so "direcheck" has [9] zero or more times.
so
if [[ "$fname" =~ $FIND ]]; then
or
if [[ $fname =~ $FIND ]]; then
are both correct, and
if [[ "$fname" =~ "$FIND" ]]; then
matches only when the string matches exactly because $FIND is matched as a literal string not regex
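A small sketch that shows the difference side by side, assuming bash 3.2 or later (where quoting the right-hand side of =~ forces a literal match). The two function names are invented for the example.

```bash
# Unquoted right-hand side: $2 is interpreted as a regular expression.
regex_match()   { [[ $1 =~ $2 ]]; }

# Quoted right-hand side: the contents of $2 are matched as a literal string.
literal_match() { [[ $1 =~ "$2" ]]; }

fname="direcheck"
FIND='[a-z]*ck'

regex_match   "$fname" "$FIND" && echo "no quotes"
literal_match "$fname" "$FIND" || echo "with quotes: no match"
```

The literal form only succeeds when the text of $FIND itself, brackets and all, occurs somewhere in the string.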

How should I get bash 3.2 to find a pattern between wildcards

Trying to compare input to a file containing alert words,
read MYINPUT
alertWords=( `cat "AlertWordList" `)
for X in "${alertWords[@]}"
do
# the wildcards in my expression do not work
if [[ $MYINPUT =~ *$X* ]]
then
echo "#1 matched"
else
echo "#1 nope"
fi
done
The =~ operator deals with regular expressions, and so to do a wildcard match like you wanted, the syntax would look like:
if [[ $MYINPUT =~ .*$X.* ]]
However, since this is regex, that's not needed, as it's implied that the match could be anywhere in the string (unless it's anchored using ^ and/or $), so this should suffice:
if [[ $MYINPUT =~ $X ]]
Be mindful that if your "words" happen to contain regex metacharacters, then this might do strange things.
I'd avoid =~ here because as FatalError points out, it will interpret $X as a regular expression and this can lead to surprising bugs (especially since it's an extended regular expression, so it has more special characters than standard grep syntax).
Instead, you can just use == because bash treats the RHS of == as a globbing pattern:
read MYINPUT
alertWords=($(<"AlertWordList"))
for X in "${alertWords[@]}"
do
# the wildcards in my expression do work :-)
if [[ $MYINPUT == *"$X"* ]]
then
echo "#1 matched"
else
echo "#1 nope"
fi
done
I've also removed a use of cat in your alertWords assignment, as it keeps the file reading inside the shell instead of spawning another process to do it.
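To see the kind of surprise a regex metacharacter can cause, compare the two operators on a word that contains a dot. This is a sketch with hypothetical function names and a made-up alert word:

```bash
# Glob match: the quoted "$2" is taken literally, the * are wildcards.
contains_word_glob()  { [[ $1 == *"$2"* ]]; }

# Regex match: $2 is interpreted as an extended regular expression.
contains_word_regex() { [[ $1 =~ $2 ]]; }

word='a.c'   # hypothetical entry from AlertWordList

contains_word_regex 'abc' "$word" && echo "regex: abc matched (. is a wildcard)"
contains_word_glob  'abc' "$word" || echo "glob:  abc not matched"
contains_word_glob  'xa.cy' "$word" && echo "glob:  xa.cy matched"
```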
If you want to use patterns, not regexes for matching, you can use case:
read MYINPUT
alertWords=( `cat "AlertWordList" `)
for X in "${alertWords[@]}"
do
# case compares $MYINPUT against glob patterns, so * works as a wildcard here
case "$MYINPUT" in
*$X* ) echo "#1 matched" ;;
* ) echo "#1 nope" ;;
esac
done