how to write conditions in regex

I'm trying to write a regex for the following strings:
JIRAID-12314 >> should match
JIRAID-21312 test >> should match
JIRAID-12312-test >> should not match
if [[ $MESSAGE =~ ^$JIRAID-[0-9]{4,6}[\s\w]* ]];
then
echo "string matched"
exit 0
fi
How can I stop the third string from matching?

You may use this regex in bash:
re='^JIRAID-[0-9]{4,6}( [[:alnum:]]+)?$'
RegEx Details:
^: Start
JIRAID-: Match JIRAID- text
[0-9]{4,6}: Match 4 to 6 digits
( [[:alnum:]]+)?: Optional group that matches a space followed by one or more alphanumeric characters
$: End
Code:
re='^JIRAID-[0-9]{4,6}( [[:alnum:]]+)?$'
for s in 'JIRAID-12314' 'JIRAID-21312 test' 'JIRAID-12312-test'; do
[[ $s =~ $re ]] && echo "$s matched" || echo "$s didn't match"
done
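
The question's pattern used $JIRAID as a variable, while the answer above hard-codes the prefix. A minimal sketch, assuming the prefix is held in a variable and contains no regex metacharacters, could build the regex like this (the function name matches_ticket and its arguments are hypothetical, not from the original post):

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): succeeds when the
# message is "<prefix>-<4 to 6 digits>", optionally followed by one space
# and a single alphanumeric word.
matches_ticket() {
  local prefix=$1 message=$2
  # Keep the regex in a variable and expand it unquoted inside [[ ]];
  # a quoted right-hand side would be matched as a literal string.
  local re="^${prefix}-[0-9]{4,6}( [[:alnum:]]+)?\$"
  [[ $message =~ $re ]]
}

for s in 'JIRAID-12314' 'JIRAID-21312 test' 'JIRAID-12312-test'; do
  if matches_ticket JIRAID "$s"; then
    echo "$s matched"
  else
    echo "$s didn't match"
  fi
done
```

The trailing $ anchor is what rejects the third string: without it, the pattern only has to match a prefix of the message.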

Related

Bash regex using bash_rematch not capturing as expected

I'm trying to capture BAR_BAR in FOO_FOO_FOO_BAR_BAR using the following regex: (?:.*?_){3}(.*).
The regular expression works when using a validator such as RegExr or regex101, but Bash doesn't return anything when I run:
text="FOO_FOO_FOO_BAR_BAR"
regex="(?:.*?_){3}(.*)"
[[ $text =~ $regex ]] && echo "${BASH_REMATCH[1]}"
When I run the following example regex it works perfectly (returning b):
text="abcdef"
regex="(b)(.)(d)e"
[[ $text =~ $regex ]] && echo "${BASH_REMATCH[1]}"
I'm new to using regex in Bash, what am I missing here?

POSIX regular expressions support neither non-capturing groups nor lazy quantifiers. Bash uses POSIX ERE, so you can use a capturing group with a negated bracket expression instead:
text="FOO_FOO_FOO_BAR_BAR"
regex="([^_]*_){3}(.*)"
[[ $text =~ $regex ]] && echo "${BASH_REMATCH[2]}"
# => BAR_BAR
Here,
([^_]*_){3} - matches three occurrences (Group 1) of zero or more chars other than _, each followed by a _ char
(.*) - the rest of the string (Group 2).
Because a capturing group now serves as the grouping construct at the beginning, the required value is held in "${BASH_REMATCH[2]}".
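
The same idea can be wrapped in a small function so the group index is kept in one place. This is only a sketch: the name after_third_underscore is hypothetical, and the ^ and $ anchors are an addition that makes the intent explicit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints whatever follows the third underscore,
# or returns 1 when the input has fewer than three underscores.
after_third_underscore() {
  local re='^([^_]*_){3}(.*)$'
  [[ $1 =~ $re ]] || return 1
  # Group 1 is only the grouping construct; group 2 holds the remainder.
  printf '%s\n' "${BASH_REMATCH[2]}"
}

after_third_underscore "FOO_FOO_FOO_BAR_BAR"   # prints BAR_BAR
```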

Regex match validation for less than n or n times

Suppose I have the following regex: [a-z]{1,28}
It matches the string below in two separate matches:
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz
Match 1
Full match 0-28 abcdefghijklmnopqrstuvwxyzab
Match 2
Full match 28-52 cdefghijklmnopqrstuvwxyz
I want to match only strings of 28 characters or fewer. That means if my string is longer than 28 characters, validation should fail.
Please advise. I ran into this problem while defining an XSD validation pattern (xs:pattern value="[a-z]{1,28}").
Thanks in advance

Use word boundaries (\b, a GNU extension) to delimit the sequence you want:
echo "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz" | grep -E '\b[a-z]{1,28}\b'
# finds no match
echo "abcdefghijklmnopqrstuvwxyzab abc" | grep -E -o '\b[a-z]{1,28}\b'
Outputs:
abcdefghijklmnopqrstuvwxyzab
abc

Anchor the pattern to the beginning and end of the string.
str="abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz"
# Your pattern: unanchored, so it matches the first 28 characters.
if [[ $str =~ [a-z]{1,28} ]]; then
echo "First match"
fi
# Anchored pattern: the complete line must match, so this string is rejected.
if [[ $str =~ ^[a-z]{1,28}$ ]]; then
echo "Second match"
fi
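
For validation, the anchored pattern can be wrapped in a function whose exit status is the result. A minimal sketch, where the name is_valid_name is hypothetical and the 28-letter limit comes from the question:

```shell
#!/usr/bin/env bash
# Hypothetical validator: succeeds only for 1 to 28 characters from [a-z].
is_valid_name() {
  local re='^[a-z]{1,28}$'
  [[ $1 =~ $re ]]
}

if is_valid_name "abcdefghijklmnopqrstuvwxyzab"; then
  echo "28 letters: valid"
fi
if ! is_valid_name "abcdefghijklmnopqrstuvwxyzabc"; then
  echo "29 letters: rejected"
fi
```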

regex issue Bash

I'm studying bash programming, in particular regexes, and I found this code:
numpat='^[+-]([0-9]+)$'
strpat='^([a-z]*)\1$'
read stringa
if [[ $stringa =~ $numpat ]]
then
echo "numero"
echo numero > output
exit ${BASH_REMATCH[1]}
elif [[ $stringa =~ $strpat ]]
then
echo "echo"
echo echo > output
exit 11
fi
I don't understand what \1 means in this line:
strpat='^([a-z]*)\1$'

\1 is a backreference. It matches whatever was matched by the first capture group ([a-z]*).
So the pattern ^([a-z]*)\1$ matches a string that is built from a substring repeated twice, such as foofoo. The capture group matches the first foo, and the backreference matches the second foo. But if the string is foobar, the backreference never matches anything, because no initial substring is immediately followed by a copy of itself that reaches the end of the string.
You can allow any number of repetitions by using the + quantifier after \1. This matches it one or more times.
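
A short sketch of that variant, assuming the regex library behind bash accepts backreferences in extended regular expressions (glibc does; POSIX does not require it). The helper name is_repeated is hypothetical:

```shell
#!/usr/bin/env bash
# \1+ lets the captured substring repeat one or more additional times.
# Assumption: the platform's regex library supports backreferences in ERE.
strpat='^([a-z]*)\1+$'
is_repeated() { [[ $1 =~ $strpat ]]; }

for s in foofoo foofoofoo foobar; do
  if is_repeated "$s"; then
    echo "$s: repeated"
  else
    echo "$s: not repeated"
  fi
done
```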

On Cygwin, which uses newlib, \1 is not treated as a backreference and matches only a literal 1:
if [[ a1 =~ $strpat ]]; then echo YES; fi # YES
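
Because backreference support depends on the regex library, one portable alternative is to drop the backreference and compare the two halves of the string directly. This is a different technique from the one in the question, sketched under the assumption that the goal is "a string of [a-z] characters made of the same substring twice"; the name is_doubled is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical portable check: no backreference needed.
is_doubled() {
  local s=$1
  local half=$(( ${#s} / 2 ))
  local letters='^[a-z]*$'            # same character class as the original pattern
  [[ $s =~ $letters ]] || return 1
  (( ${#s} % 2 == 0 )) || return 1    # an odd length cannot be two equal halves
  [[ ${s:0:half} == "${s:half}" ]]    # quoted right side: literal comparison
}

for s in foofoo foobar a1; do
  if is_doubled "$s"; then echo "$s: YES"; else echo "$s: NO"; fi
done
```

Like the original pattern, this accepts the empty string, since an empty capture followed by an empty repetition is still a match.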

Regex star produces no match

If memory serves I used to be able to do this
$ [[ abc123 =~ ([0-9]*) ]]
$ echo ${BASH_REMATCH[1]}
As you can see, it gives no output with the star * character. Now it only works if I use the plus + character.
$ [[ abc123 =~ ([0-9]+) ]]
$ echo ${BASH_REMATCH[1]}
123
Edit: more strangeness. It will match digits at the start of the string, but not at the end.
$ [[ 123abc =~ ([0-9]*) ]]
$ echo ${BASH_REMATCH[1]}
123

Your regex returns the first match that it finds. That is at position 0, before the "a", where it matches the empty string.
The * quantifier is tricky: if it covers the whole expression, the expression can match the empty string, and therefore it matches at every position where there is no digit to match.
So in the string "abc123" it matches 4 times!
a b c 123
^ ^ ^ ^..
The first 3 times it is happy to match the empty string, and at the fourth position it matches the series of digits. Bash reports only the first of these matches, which is why the capture is empty.
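
One way to observe this, and to still get the trailing digits with *, is to check the length of the overall match and then anchor the pattern to the end of the string. A minimal sketch (trailing_digits is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# The unanchored pattern succeeds at position 0 with an empty match.
[[ abc123 =~ ([0-9]*) ]]
echo "status=$? length=${#BASH_REMATCH[0]}"   # prints status=0 length=0

# Anchoring to the end of the string rules out the empty matches
# before "a", "b" and "c", so the match starts at the digits.
trailing_digits() {
  local re='([0-9]*)$'
  [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}
trailing_digits abc123   # prints 123
```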

What does this match : bash regex

if [[ "$len" -lt "$MINLEN" && "$line" =~ \[*\.\] ]]
This is from Advanced bash scripting guide "Example 10-1. Inserting a blank line between paragraphs in a text file"
As I understand it, this matches "any string or a dot character". Right?

It matches zero or more open bracket characters (\[*), followed by a period and a close square bracket (\.\]). Note that it only requires that a match exist somewhere in "$line", not that the whole string match. Here's a demo:
$ showmatch() { [[ "$1" =~ \[*\.\] ]] && echo "matched: '${BASH_REMATCH[0]}'" || echo "no match"; }
$ showmatch "abc[.]def"
matched: '[.]'
$ showmatch "abc.]def"
matched: '.]'
$ showmatch "abc[[[[[[[.]def"
matched: '[[[[[[[.]'
$ showmatch "abc[[[[[[[xyz.]def"
matched: '.]'
$ showmatch "abc[[[[[[[.xyz]def"
no match
...and I'm pretty sure that's not what it's supposed to be doing in that example script.
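
If the whole line is meant to match rather than a substring, the pattern needs explicit anchors. A minimal sketch contrasting the two behaviours, where the helper names are hypothetical:

```shell
#!/usr/bin/env bash
unanchored='\[*\.\]'     # the pattern from the question
anchored='^\[*\.\]$'     # same pattern, but the whole string must match

matches_somewhere() { [[ $1 =~ $unanchored ]]; }
matches_whole()     { [[ $1 =~ $anchored ]]; }

for line in 'abc[.]def' '[[.]' '.]'; do
  if matches_somewhere "$line"; then echo "'$line' contains a match"; fi
  if matches_whole "$line"; then echo "'$line' matches as a whole"; fi
done
```

Storing each pattern in a variable also avoids having to reason about how bash parses backslashes in an unquoted regex.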

It matches any string that ends with a dot inside brackets, for example:
[.]
[abc.]

Update: +1 to Gordon Davisson, who has summed it up pretty well... so I've redacted my original post
In brief: You can test the result of a bash regex match like this:
[[ "[*.]" =~ \[*\.\] ]] ; echo ${BASH_REMATCH[0]}
