Regex in bash: word class not generating a match - regex

I have been agonizing over this for hours. Why is this happening?
$ if [[ "test" =~ \w+ ]]; then echo "yes"; else echo "no"; fi
no
$ if [[ "t" =~ \w+ ]]; then echo "yes"; else echo "no"; fi
no
$ if [[ "w" =~ \w+ ]]; then echo "yes"; else echo "no"; fi
yes
$ if [[ "wwwww" =~ \w+ ]]; then echo "yes"; else echo "no"; fi
yes
It's as if the escaping backslash is doing nothing. Why is that so? I noticed wrapping the regex in quotes doesn't help:
$ if [[ "test" =~ "\w+" ]]; then echo "yes"; else echo "no"; fi
no
If it matters, this line is in a bash function, sourced from an .sh file and run on zsh.

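Bash represents its character classes a little differently from what you're expecting: \w is not part of POSIX ERE, which is what =~ uses. Documentation available here. The following should work (keep the pattern unquoted or in a variable, since bash 3.2 and later match a quoted pattern as a literal string):
re='^[[:alnum:]]+$'
if [[ "test" =~ $re ]]; then echo "yes"; else echo "no"; fi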
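As a hedged sketch of the same idea: \w is commonly understood to mean "letters, digits, and underscore", which POSIX ERE spells as [[:alnum:]_]. The helper name is_word below is made up for illustration and is not part of bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the whole argument consists of "word"
# characters. [[:alnum:]_] is the POSIX ERE spelling of what \w usually means.
is_word() {
  local re='^[[:alnum:]_]+$'
  [[ $1 =~ $re ]]   # $re stays unquoted so it is treated as a regex
}

for s in test t w "two words"; do
  if is_word "$s"; then echo "$s: yes"; else echo "$s: no"; fi
done
```

Storing the pattern in a variable sidesteps the shell's own backslash and quote processing, which is what turned \w into a plain w in the question.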

Related

Regex is not matching even though match exact value in bash

echo "Hello World";
string="v13.2.exe"
pattern='^v[0-9]*\.[0-9]*\.exe$'
if [[ $str =~ pattern ]]; then
echo "found"
else
echo "not found"
fi
It always prints "not found". What is wrong?
In one case ($str) you are not using the variable you have defined ($string). In the other (pattern), you're missing the $ sign ($pattern). Try
string="v13.2.exe"
pattern='^v[0-9]*\.[0-9]*\.exe$'
if [[ $string =~ $pattern ]]; then
echo "found"
else
echo "not found"
fi
found
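One possible way to catch this kind of typo earlier, sketched here under the assumption that the script can tolerate bash's nounset option: with set -u, expanding a variable that was never assigned is an error instead of an empty string. The subshells only keep the demo itself from exiting.

```shell
#!/usr/bin/env bash
string="v13.2.exe"
pattern='^v[0-9]*\.[0-9]*\.exe$'

# Run each test in a subshell with nounset (-u) enabled: a misspelled variable
# name then aborts the subshell with an "unbound variable" error instead of
# silently expanding to "".
( set -u; [[ $string =~ $pattern ]] ) && echo "found"
( set -u; [[ $str =~ $pattern ]] ) || echo "not found, or a variable name is misspelled"
```

The second test prints bash's own error message on stderr, which points directly at the misspelled name.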

Why does this regexp match almost everything?

I can't really explain it, but check out the following:
name=$1
pat="\b[0-9a-zA-Z_]+\b"
if [[ $name =~ $pat ]]; then
echo "$name is ok as user name"
else
echo "$name is not ok as user name"
exit 1
fi
Test run:
./script test_user+
test_user+ is ok as user name
The username with a + sign should not match that regexp.
First of all:
\b is a PCRE extension; it isn't available in ERE, which the =~
operator in bash's [[ ]] syntax uses.
(From Bash regex match with word boundary)
Second, you don't want word boundaries (\b) if you wish to force the entire string to match. You want to match the start (^) and end ($):
pat="^[0-9a-zA-Z_]+\$"
If you don't want a word boundary (I am guessing so, since you are trying to match a username), please use:
^[0-9a-zA-Z_]+$
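A minimal sketch of the anchored check wrapped in a function, assuming the allowed characters are exactly those in the pattern above. The name is_valid_username is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the character set is taken from the pattern above.
is_valid_username() {
  local pat='^[0-9a-zA-Z_]+$'   # single quotes: no need to escape the $
  [[ $1 =~ $pat ]]
}

for name in test_user test_user+ ""; do
  if is_valid_username "$name"; then
    echo "'$name' is ok as user name"
  else
    echo "'$name' is not ok as user name"
  fi
done
```

Because both ends are anchored, test_user+ is rejected: the + cannot be consumed by the character class, and nothing else is allowed before the end of the string.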
Contrary to the OP's experience and the other answer, it seems \b is supported as a word boundary on Ubuntu 14.04 with bash 4.3.11. Here is a sample:
re='\bb[0-9]+\b'
[[ 'b123' =~ $re ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ $re ]] && echo "matched" || echo "nope"
nope
Even \< and \> also work fine as word boundaries:
re='\<b[0-9]+\>'
[[ 'b123' =~ $re ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ $re ]] && echo "matched" || echo "nope"
nope
However, support for \b is specific to certain operating systems. For example, on OS X the following works as a word boundary instead:
[[ 'b123' =~ [[:\<:]]b[0-9]+[[:\>:]] ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ [[:\<:]]b[0-9]+[[:\>:]] ]] && echo "matched" || echo "nope"
nope

Regex to allow spaces in string - bash

I can't get a string with spaces to validate. It works without spaces, but when I include a space in the string it fails. I have googled furiously but can't get it to work.
if [[ $string =~ ^"[A-Za-z ]"$ ]]; then
# true
else
# false
fi
I'm not sure what I'm missing here...
Use a variable to store your regex:
re='^[A-Za-z ]+$'
Then use it as:
[[ "string" =~ $re ]] && echo "matched" || echo "nope"
matched
[[ "string with spaces" =~ $re ]] && echo "matched" || echo "nope"
matched
If you want inline regex then use:
[[ "string with spaces" =~ ^[A-Za-z\ ]+$ ]] && echo "matched" || echo "nope"
matched
Or else use [[:blank:]] property:
[[ "string with spaces" =~ ^[A-Za-z[:blank:]]+$ ]] && echo "matched" || echo "nope"
matched
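Note that the patterns above also accept a string made only of spaces. If that matters, a slightly stricter variant could look like the sketch below; the helper name is_letters_and_spaces and the "at least one letter" rule are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: letters and blanks only, with at least one letter.
is_letters_and_spaces() {
  local re='^[[:blank:]]*[A-Za-z][A-Za-z[:blank:]]*$'
  [[ $1 =~ $re ]]
}

for s in "string" "string with spaces" "   " "abc1"; do
  if is_letters_and_spaces "$s"; then echo "'$s': matched"; else echo "'$s': nope"; fi
done
```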
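If it is always whitespace, I should instead use the following regex. The pattern has to stay unquoted, because bash matches a quoted pattern literally, and [[:space:]] replaces \s, which POSIX ERE does not define:
if [[ $string =~ ^[A-Za-z[:space:]]+$ ]]; then
echo "true"
else
echo "false"
fi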
Cheers :)

Shell script to validate hex value

How can I validate whether a given input string is a valid hex value using a regex in a shell script?
For example:
Input var="ff:ff:fe:ff"
There is a : delimiter between the values.
I want to use this regex for any input string, for example:
var="ff:ff:fe:ff:fe"
var="ff:ff:fe:ff:fe:fe:ff:ff"
\b0[xX][0-9a-fA-F]+\b
#!/bin/bash -x
var="fe:fe:fe:fe"
regex="/^([0-9A-F]+:?){4}$/"
if [[ $var =~ $regex ]]; then
echo "valid"
fi
Better version (thanks to chepner):
^([[:xdigit:]]{2})(:[[:xdigit:]]{2})*$
Test
if [[ "ff:af:ff:23:a2:ad" =~ ^([[:xdigit:]]{2})(:[[:xdigit:]]{2})*$ ]]; then
echo "match";
fi
Old Answer:
^([0-9A-Fa-f]{2})(:[0-9A-Fa-f]{2})*$
Test
$ if [[ "ff:af:ff:23:a2:ad" =~ ^([0-9A-Fa-f]{2})(:[0-9A-Fa-f]{2})*$ ]]; then
echo "match";
fi
$ match
$ if [[ "definitlynottherightformat" =~ ^([0-9A-Fa-f]{2})(:[0-9A-Fa-f]{2})*$ ]]; then
echo "match";
fi
$
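The same pattern can be wrapped in a small function and run against the inputs from the question. This is only a sketch; the name is_hex_bytes is hypothetical, and it assumes each field must be exactly two hex digits.

```shell
#!/usr/bin/env bash
# Hypothetical helper: colon-separated pairs of hex digits, no trailing colon.
is_hex_bytes() {
  local re='^[[:xdigit:]]{2}(:[[:xdigit:]]{2})*$'
  [[ $1 =~ $re ]]
}

for var in "ff:ff:fe:ff" "ff:ff:fe:ff:fe" "ff:ff:fe:ff:fe:fe:ff:ff" \
           "fe:fe:" "0xff" "definitlynottherightformat"; do
  if is_hex_bytes "$var"; then echo "$var: valid"; else echo "$var: invalid"; fi
done
```

Keeping the regex in a variable avoids having to escape the parentheses and braces for the shell.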

Regex match a string with spaces (use quotes?) in an if statement

How would I do a regex match as shown below, but with quotes around the pattern ("^This")? In the real world, "This" will be a string that can contain spaces.
#!/bin/bash
text="This is just a test string"
if [[ "$text" =~ ^This ]]; then
echo "matched"
else
echo "not matched"
fi
I want to do something like
if [[ "$text" =~ "^This is" ]]; then
but this doesn't match.
You can use \ before spaces.
#!/bin/bash
text="This is just a test string"
if [[ "$text" =~ ^This\ is\ just ]]; then
echo "matched"
else
echo "not matched"
fi
I did not manage to inline the expression like this:
if [[ "$text" =~ "^ *This " ]]; then
but if you put the expression in a variable you could use normal regex syntax like this:
pat="^ *This "
if [[ $text =~ $pat ]]; then
Note that quoting $text is unnecessary inside [[ ]], and $pat must not be quoted, because a quoted pattern is matched as a literal string.
Edit:
A convenient one-liner during development:
pat="^ *This is "; [[ " This is just a test string" =~ $pat ]]; echo $?
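Another option, sketched here for bash 3.2 and later, is to quote only the literal part of the pattern and leave the anchor unquoted. The quoted portion is matched as a plain string, spaces included. The variable name prefix is just an illustration.

```shell
#!/usr/bin/env bash
text="This is just a test string"

# Quote only the literal part; the unquoted ^ keeps its meaning as an anchor.
if [[ $text =~ ^"This is" ]]; then echo "matched"; else echo "not matched"; fi

# Same idea with the literal text in a variable (hypothetical name `prefix`).
prefix="This is"
if [[ $text =~ ^"$prefix" ]]; then echo "matched"; else echo "not matched"; fi
```

Because the quoted part is literal, regex metacharacters inside it (such as . or *) lose their special meaning, which is usually what is wanted for real-world strings.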
Can you make your problem description clearer?
text="This is just a test string"
case "$text" in
"This is"*) echo "match";;
esac
The above assumes you want to match "This is" exactly at the start of the string.
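The same prefix test can also be written with the glob matching of [[ ]] instead of case. A minimal sketch follows; the function name starts_with is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper. $1: text, $2: literal prefix.
# The quoted part is matched literally; the unquoted trailing * is a glob.
starts_with() {
  [[ $1 == "$2"* ]]
}

text="This is just a test string"
starts_with "$text" "This is" && echo "match"
starts_with "$text" "is just" || echo "no match"
```

No regex is involved here, so spaces in the prefix need no special treatment beyond ordinary quoting.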
Have you tried:
^[\s]*This