How to check whether a string has at least one alphabetic character? - regex

I want to check whether a string contains at least one alphabetic character.
A regex for this could look like:
"^.*[a-zA-Z].*$"
I want to use it in a conditional, something like:
if [ it contains at least one alphabetic character];then
...
else
...
fi
but I'm at a loss as to how to use the regex there.
I tried
if [ "$x"=~[a-zA-Z]+ ];then echo "yes"; else echo "no" ;fi
or
if [ "$x"=~"^.*[a-zA-Z].*$" ];then echo "yes"; else echo "no" ;fi
and tested with x="1234". Both of the above print "yes", so they are wrong.
How can I achieve my goal? Thanks!

Try this:
#!/bin/bash
x="1234"
y="a1234"
if [[ "$x" =~ [A-Za-z] ]]; then
    echo "$x has at least one alphabetic character"
fi
if [[ "$y" =~ [A-Za-z] ]]; then
    echo "Y is $y and has at least one alphabetic character"
fi
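
As a minimal sketch of why the attempts in the question always print "yes" (assuming bash 3.2 or later): without spaces around =~, [ receives a single non-empty word and only tests that it is non-empty. The variable names one_word and regex_match are made up for this illustration.

```shell
#!/bin/bash
x="1234"

# No spaces around =~: [ sees the single word '1234=~[a-zA-Z]+'.
# A one-argument test is true whenever that argument is non-empty.
if [ "$x"=~"[a-zA-Z]+" ]; then one_word="yes"; else one_word="no"; fi

# [[ ]] with spaces around =~ performs a real regex match.
if [[ "$x" =~ [A-Za-z] ]]; then regex_match="yes"; else regex_match="no"; fi

echo "single-word test: $one_word"    # prints: single-word test: yes
echo "regex test: $regex_match"       # prints: regex test: no
```

The first result is "yes" for any value of x, which matches the behaviour described in the question.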

If you want to be portable, I'd call /usr/bin/grep with [A-Za-z].

Use the [:alpha:] character class that respects your locale, with a regular expression
[[ $str =~ [[:alpha:]] ]] && echo has alphabetic char
or a glob-style pattern
[[ $str == *[[:alpha:]]* ]] && echo has alphabetic char

It's quite common in sh scripts to use grep in an if clause. You can find many such examples in /etc/rc.d/.
if printf '%s\n' "$theinputstring" | grep -q '[a-zA-Z]'; then
    echo yes
else
    echo no
fi
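
If spawning grep is not wanted, a plain sh script can apply the glob-style pattern from the earlier answer through case. This is only a sketch; has_alpha is a hypothetical helper name, and [[:alpha:]] follows the current locale.

```shell
#!/bin/sh
# has_alpha: hypothetical helper, succeeds if $1 contains an alphabetic character.
has_alpha() {
  case $1 in
    *[[:alpha:]]*) return 0 ;;   # at least one alphabetic character somewhere
    *)             return 1 ;;   # none found (including the empty string)
  esac
}

for s in "1234" "a1234" ""; do
  if has_alpha "$s"; then
    echo "'$s': yes"
  else
    echo "'$s': no"
  fi
done
```

Because case is a shell builtin construct, no external process is started for each check.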

Related

Match two consecutive lines using Regex and Bash features only

What regular expression(s) can you use to match two consecutive lines?
The aim is to use only pure regex inside a shell script, without tools like awk or sed.
For example, I would like to ensure the word "hello" is immediately followed by "world" on the next line.
Acceptance criteria:
"hello" must not have any spaces before it.
"world" must have one or more spaces before it.
#!/bin/bash
file=./myfile.txt
regex='^hello'
[[ $(cat "$file") =~ $regex ]] && echo "yes" || echo "no"
myfile.txt
abc is def
hello
world
cde is efg
Here is pure bash way:
file='./myfile.txt'
[[ $(<$file) =~ hello$'\n'[[:blank:]]*world ]] && echo "yes" || echo "no"
yes
Here $'\n' matches a new line and [[:blank:]]* matches 0+ tabs or spaces.
If you want to be more precise then use:
[[ $(<"$file") =~ (^|$'\n')hello$'\n'[[:blank:]]*world($'\n'|$) ]] && echo "yes" || echo "no"
However grep or awk are much better tools for this job.
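
The question's criteria ask for at least one space before "world", which the patterns above relax to zero or more. A sketch that enforces it by using + instead of *, with the regex kept in a variable, might look like this. The names nl, re and hello_then_world are illustrative, and the text is held in a string rather than read from a file so the example is self-contained.

```shell
#!/bin/bash
nl=$'\n'
# "hello" at the start of a line, then a newline, 1+ blanks, then "world".
re="(^|$nl)hello$nl[[:blank:]]+world"

# hello_then_world: hypothetical helper, succeeds if $1 matches the rule above.
hello_then_world() {
  [[ $1 =~ $re ]]
}

text=$'abc is def\nhello\n  world\ncde is efg'
hello_then_world "$text" && echo "yes" || echo "no"    # prints: yes
```

With a file, the same helper could be called as hello_then_world "$(<"$file")".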

Extract integers from string with bash

How can I extract, from a variable, the integers that appear in the format *\d+.\d+.\d+* (e.g. 4.12.3123) using bash?
filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
I have tried:
filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
if [[ "$filename" =~ (.*)(\d+.\d+.\d+)(.*) ]]; then
echo ${BASH_REMATCH}
echo ${BASH_REMATCH[1]}
echo ${BASH_REMATCH[2]}
echo ${BASH_REMATCH[3]}
else
echo 'nej'
fi
which does not work.
The easiest way to work with regexes in Bash, in terms of consistency between Bash versions and escaping, is to put the regex into a single-quoted variable and then use it unquoted, as below:
re='[0-9]+\.[0-9]+\.[0-9]+'
[[ $filename =~ $re ]] && printf '%s\n' "${BASH_REMATCH[0]}"
The main issue with your approach was that you were using the "Perl-style" \d, which bash's regex matching does not support, so in fact you could make your code work with:
if [[ "$filename" =~ (.*)([0-9]+\.[0-9]+\.[0-9]+)(.*) ]]; then
echo "${BASH_REMATCH[2]}"
fi
But this unnecessarily creates 3 capture groups, when you don't even need one. Note that I also changed . (any character) to \. (a literal .).
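
If the individual numbers are wanted as well, capture groups can be placed around each part. The following is a sketch along the same lines; the variable names version, major, minor and patch are chosen here for illustration only.

```shell
#!/bin/bash
filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"

# Regex stored in a single-quoted variable and used unquoted.
re='([0-9]+)\.([0-9]+)\.([0-9]+)'

if [[ $filename =~ $re ]]; then
  version=${BASH_REMATCH[0]}   # whole match
  major=${BASH_REMATCH[1]}     # first group
  minor=${BASH_REMATCH[2]}     # second group
  patch=${BASH_REMATCH[3]}     # third group
  echo "$version -> $major $minor $patch"    # prints: 4.12.3123 -> 4 12 3123
else
  echo "no version found"
fi
```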
One way to extract it, assuming GNU grep (for -P):
grep -oP '\d+\.\d+\.\d+' <<< "$filename"
Another way, assuming GNU awk (the three-argument match() is a gawk extension):
$ filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
$ awk '{ if (match($0, /[0-9]+\.[0-9]+\.[0-9]+/, m)) print m[0] }' <<< "$filename"
4.12.3123

Why does this regexp match almost everything?

I can't really explain it, but check out the following:
name=$1
pat="\b[0-9a-zA-Z_]+\b"
if [[ $name =~ $pat ]]; then
echo "$name is ok as user name"
else
echo "$name is not ok as user name"
exit 1
fi
Test run:
./script test_user+
test_user+ is ok as user name
The username with a + sign should not match that regexp.
First of all:
\b is a PCRE extension; it isn't available in ERE, which the =~
operator in bash's [[ ]] syntax uses.
(From Bash regex match with word boundary)
Second, you don't want word boundaries (\b) if you wish to force the entire string to match. You want to match the start (^) and end ($):
pat="^[0-9a-zA-Z_]+\$"
If you don't want word boundaries (I am guessing so, since you are trying to match a username), please use
^[0-9a-zA-Z_]+$
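
Put together, the anchored pattern can be wrapped in a small check. This is a sketch only; is_valid_username is a hypothetical helper name, not something from the original script.

```shell
#!/bin/bash
# Anchored at both ends, so the whole string must consist of the allowed characters.
pat='^[0-9a-zA-Z_]+$'

# is_valid_username: hypothetical helper, succeeds if $1 matches the pattern.
is_valid_username() {
  [[ $1 =~ $pat ]]
}

for name in "test_user" "test_user+"; do
  if is_valid_username "$name"; then
    echo "$name is ok as user name"
  else
    echo "$name is not ok as user name"
  fi
done
```

With the anchors in place, test_user+ is rejected because the + cannot be matched by the bracket expression.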
Contrary to the OP's experience and the other answer, it seems \b is supported as a word boundary on Ubuntu 14.04 with bash 4.3.11. Here is a sample:
re='\bb[0-9]+\b'
[[ 'b123' =~ $re ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ $re ]] && echo "matched" || echo "nope"
nope
Even \< and \> work fine as word boundaries:
re='\<b[0-9]+\>'
[[ 'b123' =~ $re ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ $re ]] && echo "matched" || echo "nope"
nope
However, support for \b is OS-specific. For example, on OS X the following works as a word boundary:
[[ 'b123' =~ [[:\<:]]b[0-9]+[[:\>:]] ]] && echo "matched" || echo "nope"
matched
[[ 'b123_' =~ [[:\<:]]b[0-9]+[[:\>:]] ]] && echo "matched" || echo "nope"
nope
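
Since these boundary operators differ between platforms, one way to avoid them is to spell the boundary out with plain ERE constructs: either the start or end of the string, or a character that is not a word character. This is a sketch under that assumption; re and has_token are illustrative names. Note that the match in BASH_REMATCH[0] then includes the neighbouring non-word characters.

```shell
#!/bin/bash
# (^|non-word char) b digits (non-word char|$), using only standard ERE syntax.
re='(^|[^[:alnum:]_])b[0-9]+([^[:alnum:]_]|$)'

# has_token: hypothetical helper, succeeds if $1 contains b<digits> as a whole word.
has_token() {
  [[ $1 =~ $re ]]
}

has_token 'b123'  && echo "matched" || echo "nope"    # prints: matched
has_token 'b123_' && echo "matched" || echo "nope"    # prints: nope
```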

match leading dots in bash if using regex

Say I want to match the leading dot in a string ".a"
So I type
[[ ".a" =~ ^\. ]] && echo "ha"
ha
[[ "a" =~ ^\. ]] && echo "ha"
ha
Why am I getting the same result here?
You need to escape the dot. It has a meaning beyond just a period: it is a metacharacter in regex.
[[ "a" =~ ^\. ]] && echo "ha"
Make the change in the other example as well.
Check your bash version - you need 4.0 or higher I believe.
There are some compatibility issues with =~ between Bash versions after 3.0. The safest way to use =~ in Bash is to put the RE pattern in a var:
$ pat='^\.foo'
$ [[ .foo =~ $pat ]] && echo yes || echo no
yes
$ [[ foo =~ $pat ]] && echo yes || echo no
no
$
For more details, see E14 on the Bash FAQ page.
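
Another way to sidestep backslash handling altogether is a bracket expression, [.], which matches a literal dot without any escaping. A glob comparison also works for a simple prefix test. Both are sketched below; the helper names are made up for the example.

```shell
#!/bin/bash
# [.] is a bracket expression containing only a dot, so no backslash is needed.
pat='^[.]'

# Hypothetical helpers: both succeed if $1 starts with a dot.
has_leading_dot_regex() { [[ $1 =~ $pat ]]; }
has_leading_dot_glob()  { [[ $1 == .* ]]; }

for s in ".a" "a"; do
  has_leading_dot_regex "$s" && echo "regex: '$s' starts with a dot"
  has_leading_dot_glob  "$s" && echo "glob:  '$s' starts with a dot"
done
```

Only the ".a" input produces output here, once for each helper.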
Probably it's because bash tries to treat "\." as a single escaped character, like \n, \r, etc.
To make bash see \ and . as two separate characters, try
[[ "a" =~ ^\\. ]] && echo ha

Bash - correct way to escape dollar in regex

What is the correct way to escape a dollar sign in a bash regex? I am trying to test whether a string begins with a dollar sign. Here is my code, in which I double escape the dollar within my double quotes expression:
echo -e "AB1\nAB2\n\$EXTERNAL_REF\nAB3" | while read value;
do
if [[ ! $value =~ "^\\$" ]];
then
echo $value
else
echo "Variable found: $value"
fi
done
This does what I want for one box which has:
GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
And the verbose output shows
+ [[ ! $EXTERNAL_REF =~ ^\$ ]]
+ echo 'Variable found: $EXTERNAL_REF'
However, on another box which uses
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
The comparison is expanded as follows
+ [[ ! $EXTERNAL_REF =~ \^\\\$ ]]
+ echo '$EXTERNAL_REF'
Is there a standard/better way to do this that will work across all implementations?
Many thanks
Why do you use a regular expression here? A glob is enough:
#!/bin/bash
while read value; do
if [[ "$value" != \$* ]]; then
echo "$value"
else
echo "Variable found: $value"
fi
done < <(printf "%s\n" "AB1" "AB2" '$EXTERNAL_REF' "AB3")
Works here with shopt -s compat32.
The regex doesn't need any quotes at all. This should work:
if [[ ! $value =~ ^\$ ]];
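
The difference between the two boxes in the question is consistent with how bash 3.2 and later (without a compat31 setting) treat a quoted right-hand side of =~: it is matched as a literal string rather than as a regex. A sketch comparing the variants, with variable names chosen for illustration:

```shell
#!/bin/bash
value='$EXTERNAL_REF'

# Unquoted: ^ is an anchor and \$ is a literal dollar sign.
if [[ $value =~ ^\$ ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted: the pattern is taken as the literal characters ^\$ .
if [[ $value =~ "^\\$" ]]; then quoted="match"; else quoted="no match"; fi

# Regex kept in a single-quoted variable and expanded unquoted.
re='^\$'
if [[ $value =~ $re ]]; then from_var="match"; else from_var="no match"; fi

echo "unquoted: $unquoted"    # prints: unquoted: match
echo "quoted:   $quoted"      # prints: quoted:   no match
echo "variable: $from_var"    # prints: variable: match
```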
I would replace the double quotes with single quotes and remove one backslash, so that
$value =~ "^\\$"
can also be written as
$value =~ '^\$'
I never found the solution either, but for my purposes, I settled on the following workaround:
if [[ "$value" =~ ^(.)[[:alpha:]_][[:alnum:]_]+\\b && ${BASH_REMATCH[1]} == '$' ]]; then
echo "Variable found: $value"
else
echo "$value"
fi
Rather than trying to "quote" the dollar-sign, I instead match everything around it and I capture the character where the dollar-sign should be to do a direct-string comparison on. A bit of a kludge, but it works.
Alternatively, I've taken to using variables, but just for the backslash character (I don't like storing the entire regex in a variable because I find it confusing for the regex to not appear in the context where it's used):
bs="\\"
string="test\$test"
if [[ "$string" =~ $bs$ ]]; then
echo "output \"$BASH_REMATCH\""
fi