Why does this regexp match almost everything? - regex

I can't really explain it, but check out the following:
name=$1
pat="\b[0-9a-zA-Z_]+\b"
if [[ $name =~ $pat ]]; then
echo "$name is ok as user name"
else
echo "$name is not ok as user name"
exit 1
fi
Test run:
./script test_user+
test_user+ is ok as user name
The username with a + sign should not match that regexp.

First of all:
\b is a PCRE extension; it isn't available in ERE, which the =~
operator in bash's [[ ]] syntax uses.
(From Bash regex match with word boundary)
Second, you don't want word boundaries (\b) if you wish to force the entire string to match. You want to match the start (^) and end ($):
pat="^[0-9a-zA-Z_]+\$"
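As a minimal sketch of the anchored approach, the check can be wrapped in a small function. The name is_valid_username is hypothetical and only used for illustration; the pattern is the one from the answer above, stored in a variable so that quoting does not interfere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the WHOLE string consists of
# letters, digits, or underscores (at least one character).
is_valid_username() {
  local pat='^[0-9a-zA-Z_]+$'   # ^ and $ anchor the match to the entire string
  [[ $1 =~ $pat ]]              # unquoted $pat is treated as a regex
}

for name in test_user 'test_user+' ''; do
  if is_valid_username "$name"; then
    echo "'$name' is ok as user name"
  else
    echo "'$name' is not ok as user name"
  fi
done
```

Without the anchors, =~ succeeds as soon as any substring matches, which is why test_user+ was accepted by the original pattern.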

If you don't want word boundaries (a guess, since you are trying to match a username), use:
^[0-9a-zA-Z_]+$

Contrary to the OP's experience and the other answer, \b does seem to be supported as a word boundary on Ubuntu 14.04 with bash 4.3.11. Here is a sample:
re='\bb[0-9]+\b'
[[ 'b123' =~ $re ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ $re ]] && echo "matched" || echo "nope"
nope
\< and \> also work fine as word boundaries:
re='\<b[0-9]+\>'
[[ 'b123' =~ $re ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ $re ]] && echo "matched" || echo "nope"
nope
However, support for \b is not universal: bash hands the pattern to the system's regex library, so what works depends on the OS. For example, on OS X the following works as a word boundary instead:
[[ 'b123' =~ [[:\<:]]b[0-9]+[[:\>:]] ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ [[:\<:]]b[0-9]+[[:\>:]] ]] && echo "matched" || echo "nope"
nope
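Since none of these boundary syntaxes is available everywhere, one possible workaround is to emulate the boundary with plain POSIX ERE constructs. The sketch below assumes that a "word character" means [[:alnum:]_], and the function name has_b_number_word is made up for the example. Note that, unlike a true boundary, the emulation consumes the neighbouring character, so BASH_REMATCH[0] may include it.

```shell
#!/usr/bin/env bash
# Emulate word boundaries using only POSIX ERE:
#   left boundary  = start of string OR a non-word character
#   right boundary = a non-word character OR end of string
has_b_number_word() {
  local re='(^|[^[:alnum:]_])b[0-9]+([^[:alnum:]_]|$)'
  [[ $1 =~ $re ]]
}

for s in 'b123' 'b123_' 'x b42 y' 'ab123'; do
  if has_b_number_word "$s"; then
    echo "$s: matched"
  else
    echo "$s: nope"
  fi
done
```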

Related

What's wrong with this bash regex comparison? [duplicate]

I saw in bash regex match string that I should compare regexes with =~.
Tried the following:
if [[ "____[9 / 101] Linking" =~ "[0-9]*" ]]; then echo "YES"; fi
And nothing is printed...
Tried without the quotes:
if [[ "____[9 / 101] Linking" =~ [0-9]* ]]; then echo "YES"; fi
And it works fine. But what should I do if my regex contains whitespace (so quotes are required)?
Put your regex in a variable. You are free to use quotes when defining the variable:
$ re="[0-9]*" ; [[ "____[9 / 101] Linking" =~ $re ]] && echo "YES"
YES
$ re="9 /" ; [[ "____[9 / 101] Linking" =~ $re ]] && echo "YES"
YES
Since the reference to $re inside [[...]] is unquoted, the value of $re is treated as a regex. Anything on the right-hand side of =~ that is quoted, however, will be treated as a literal string.
Notes
In regular expressions, as opposed to globs, * means zero or more of the preceding. Thus [0-9]* is considered a match even if zero characters are matching:
$ re="[0-9]*" ; [[ "____[a / bcd] Linking" =~ $re ]] && echo "YES"
YES
$ re="[0-9]" ; [[ "____[a / bcd] Linking" =~ $re ]] && echo "YES"
$
If you want to match one or more digits, use [0-9]+.
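The difference between a quoted and an unquoted right-hand side can be seen side by side in the following sketch. It assumes bash 3.2 or newer with default compatibility settings; the two function names are hypothetical.

```shell
#!/usr/bin/env bash
# Assumes bash 3.2+ (default settings), where quoting the right-hand
# side of =~ forces a literal substring match.
match_as_regex()   { local re=$2; [[ $1 =~ $re ]]; }    # unquoted: regex
match_as_literal() { local re=$2; [[ $1 =~ "$re" ]]; }  # quoted: literal text

line='____[9 / 101] Linking'

# The pattern contains spaces, but needs no escaping inside the variable.
match_as_regex "$line" '[0-9]+ / [0-9]+' && echo "regex: YES"

# As a literal, the same text does not occur in the line.
match_as_literal "$line" '[0-9]+ / [0-9]+' || echo "literal: NO"
```

Using + rather than * here also avoids the zero-length match described in the notes above.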
Precede the whitespace with a \:
if [[ "____[9 / 101] Linking" =~ [0-9]*\ /\ [0-9]* ]]; then echo "YES"; fi

Regular expression with if condition for matching index and index, in shell programming

I want to match two strings, index and index1, in an if condition in a shell script.
I tried the following:
if [[ $1 == [iI][nN][dD][eE][xX][1]? ]]; then
echo "matched"
fi
But it is not working. Basically, I want the regular expression to say that 1 should occur either zero or one time.
Thanks in advance!
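Inside [[ ]], the == operator does glob matching, where ? means "any single character" rather than "the preceding item is optional". You need to use the =~ operator to match a regex, and make sure to use the anchors ^ and $ to avoid matching unwanted text: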
[[ 'index1' =~ ^[iI][nN][dD][eE][xX]1?$ ]] && echo "ok" || echo "nope"
ok
[[ 'index' =~ ^[iI][nN][dD][eE][xX]1?$ ]] && echo "ok" || echo "nope"
ok
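As an alternative to spelling out every letter in both cases, the nocasematch shell option can be used. This is only a sketch: the function name is_index_name is hypothetical, and the body runs in a subshell so that the option does not leak into the rest of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: accepts "index" or "index1" in any letter case.
# The function body is a subshell ( ... ), so enabling nocasematch here
# does not affect the caller's shell options.
is_index_name() (
  shopt -s nocasematch
  re='^index1?$'      # 1? means the digit 1 occurs zero or one time
  [[ $1 =~ $re ]]
)

for arg in index INDEX1 InDeX index2 index11; do
  if is_index_name "$arg"; then
    echo "$arg: ok"
  else
    echo "$arg: nope"
  fi
done
```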

Regex to allow spaces in string - bash

I can't get a string with spaces to validate. It works without spaces, but when I include a space in the string it fails. I have googled furiously but can't get it to work.
if [[ $string =~ ^"[A-Za-z ]"$ ]]; then
# true
else
# false
fi
I'm not sure what I'm missing here...
Use a variable to store your regex:
re='^[A-Za-z ]+$'
Then use it as:
[[ "string" =~ $re ]] && echo "matched" || echo "nope"
matched
[[ "string with spaces" =~ $re ]] && echo "matched" || echo "nope"
matched
If you want inline regex then use:
[[ "string with spaces" =~ ^[A-Za-z\ ]+$ ]] && echo "matched" || echo "nope"
matched
Or else use [[:blank:]] property:
[[ "string with spaces" =~ ^[A-Za-z[:blank:]]+$ ]] && echo "matched" || echo "nope"
matched
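The three variants above differ in which whitespace they accept, which the following sketch tries to make visible. The function names are invented for the example; $'...' is bash's ANSI-C quoting, used here to build strings containing a tab and a newline.

```shell
#!/usr/bin/env bash
# Three hypothetical validators that differ only in the whitespace they allow.
only_letters_and_spaces()      { local re='^[A-Za-z ]+$';          [[ $1 =~ $re ]]; }
only_letters_and_blanks()      { local re='^[A-Za-z[:blank:]]+$';  [[ $1 =~ $re ]]; }
only_letters_and_space_class() { local re='^[A-Za-z[:space:]]+$';  [[ $1 =~ $re ]]; }

report() {  # report <label> <string>
  local fn
  for fn in only_letters_and_spaces only_letters_and_blanks only_letters_and_space_class; do
    if "$fn" "$2"; then echo "$1 / $fn: matched"; else echo "$1 / $fn: nope"; fi
  done
}

report 'space'   'a b'
report 'tab'     $'a\tb'    # [:blank:] is space and tab
report 'newline' $'a\nb'    # [:space:] also includes newline and similar
```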
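If the separator can be any whitespace, keep the pattern unquoted (quoting makes it a literal string) and use [[:space:]] rather than \s, which is not part of POSIX ERE:
re='^[A-Za-z[:space:]]+$'
if [[ $string =~ $re ]]; then
echo "valid"
else
echo "invalid"
fi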
Cheers :)

bash regex in 4.1

The following code works fine on bash 3.5 but not on 4.1:
regex='^WORD\-([^(WORD2)][^[:space:]]{1,}$)|(WORD2[[:space:]][^[:space:]]{2,}$)'
if ! [[ $appname =~ $regex ]]
then
printf "no match"
ct_dev_error=$((ct_dev_error+1))
fi
Any solutions or ideas?
Your regex can be simplified to this:
regex='^WORD-(WORD2[[:space:]][^[:space:]]{2,}|[^[:space:]]+)$'
Test it:
appname='WORD-APP' && [[ $appname =~ $regex ]] && echo "${BASH_REMATCH[0]}"
WORD-APP
appname='WORD-BUD APP' && [[ $appname =~ $regex ]] && echo "${BASH_REMATCH[0]}"
appname='WORD-WORD2 APP' && [[ $appname =~ $regex ]] && echo "${BASH_REMATCH[0]}"
WORD-WORD2 APP
[^(WORD2)] does not negate a match of WORD2. It is a negated character class: it matches a single character that is NOT one of the characters in the list, which here means (, W, O, R, D, 2 and ).
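The effect of the negated character class can be checked directly. In the sketch below, both function names are hypothetical. Because ERE has no negative lookahead, the second function shows one possible way to exclude a literal prefix: combine the regex test with a glob test. This is an assumption about what the original pattern was trying to achieve, not something the question confirms.

```shell
#!/usr/bin/env bash
# [^(WORD2)] matches ONE character that is not '(', 'W', 'O', 'R', 'D', '2' or ')'.
starts_ok_per_class() {
  local re='^WORD-[^(WORD2)]'
  [[ $1 =~ $re ]]
}

# One way to say "no whitespace, and does not start with WORD-WORD2":
# a regex test plus a separate glob test (unquoted right side of != is a glob).
not_word2_name() {
  local re='^WORD-[^[:space:]]+$'
  [[ $1 =~ $re && $1 != WORD-WORD2* ]]
}

for s in WORD-APP WORD-DATA WORD-WORD2X; do
  starts_ok_per_class "$s" && echo "$s: class test passes" || echo "$s: class test fails"
  not_word2_name "$s"      && echo "$s: accepted"          || echo "$s: rejected"
done
```

WORD-DATA fails the class test only because D happens to be in the list, even though it has nothing to do with WORD2.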

match leading dots in bash if using regex

Say I want to match the leading dot in a string ".a"
So I type
[[ ".a" =~ ^\. ]] && echo "ha"
ha
[[ "a" =~ ^\. ]] && echo "ha"
ha
Why am I getting the same result here?
You need to escape the dot; it has meaning beyond just a period, because it is a metacharacter in regex.
[[ "a" =~ ^\. ]] && echo "ha"
Make the change in the other example as well.
Check your bash version - you need 4.0 or higher I believe.
There are some compatibility issues with =~ between Bash versions after 3.0. The safest way to use =~ in Bash is to put the RE pattern in a var:
$ pat='^\.foo'
$ [[ .foo =~ $pat ]] && echo yes || echo no
yes
$ [[ foo =~ $pat ]] && echo yes || echo no
no
$
For more details, see E14 on the Bash FAQ page.
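Another way to sidestep the backslash question entirely is a bracket expression, since a dot inside [...] is always literal. The sketch below combines that with the pattern-in-a-variable advice; the name has_leading_dot is made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the string starts with a literal dot.
has_leading_dot() {
  local pat='^[.]'   # inside a bracket expression, . needs no escaping
  [[ $1 =~ $pat ]]
}

for s in .a a ..x; do
  if has_leading_dot "$s"; then
    echo "$s: yes"
  else
    echo "$s: no"
  fi
done
```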
Probably it's because bash tries to treat \. as a single escape sequence, like \n, \r etc.
To make it see \ and . as two separate characters, try:
[[ "a" =~ ^\\. ]] && echo ha