regex to match strings not preceded by a bang

In bash, I am trying to match valid attributes that are present in an array. Attributes may be 'disabled' by preceding them with a bang (exclamation mark, !), in which case they must not be matched. I have this:
[[ ${TESTS[@]} =~ [^\!]match ]]
which will return true if the word 'match' is in TESTS and not preceded by a !.
It works, except when the word match is in the first position in the array. The problem is the regexp is saying 'match preceded by something that isn't a !'. When it's the first item it is preceded by nothing and therefore does not match.
How do I modify the above to say 'match not preceded by !' ?
From reading answers to other questions I have tried (?<!!)match but this does not work.

Use this re:
([^\!]|^)match
Example of usage:
$ [[ match =~ (^|[^\!])match ]] && echo matches || echo "doesn't match"
matches
$ [[ xmatch =~ (^|[^\!])match ]] && echo matches || echo "doesn't match"
matches
$ [[ '!match' =~ (^|[^\!])match ]] && echo match || echo "doesn't match"
doesn't match
In general, a lookbehind assertion would also be correct here, but bash uses POSIX extended regular expressions, which have no assertions. With GNU grep, perl, or anything else that supports PCRE you can do it:
$ echo match | grep -qP '(?<!!)match' && echo matches || echo "doesn't match"
matches
$ echo xmatch | grep -qP '(?<!!)match' && echo matches || echo "doesn't match"
matches
$ echo '!match' | grep -qP '(?<!!)match' && echo matches || echo "doesn't match"
doesn't match
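For the array case from the question, the POSIX-compatible pattern can be wrapped in a small function. This is only a sketch: the name attr_enabled is made up for illustration, and it assumes the default IFS so that "$*" joins the elements with spaces.

```bash
#!/usr/bin/env bash

# attr_enabled ATTR ELEMENT...
# Hypothetical helper (not from the question): succeeds if ATTR occurs in
# the joined elements without a "!" directly in front of it.
attr_enabled() {
    local attr=$1
    shift
    local joined="$*"              # elements joined by spaces (default IFS)
    # "(^|[^!])" means: start of string, or any single character except "!".
    # Single quotes keep an interactive shell from history-expanding the "!".
    local re='(^|[^!])'"$attr"
    [[ $joined =~ $re ]]
}

TESTS=(match other)
attr_enabled match "${TESTS[@]}" && echo "enabled" || echo "disabled or absent"

TESTS=(other '!match')
attr_enabled match "${TESTS[@]}" && echo "enabled" || echo "disabled or absent"
```

Like the pattern above, this also accepts match as a substring of a longer word such as xmatch.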

Related

Match two consecutive lines using Regex and Bash features only

What Regular Expression(s) can you use to match two consecutive lines?
The aim is not to use any packages like awk or sed but only use pure RegExp inside a shell script.
Example, I would like to ensure the word "hello" is immediately followed by "world" in the next line.
Acceptance criteria:
"hello" is not to have any spaces before it
"world" must have at least one space before it.
#!/bin/bash
file=./myfile.txt
regex='^hello'
[[ $(cat "$file") =~ $regex ]] && echo "yes" || echo "no"
myfile.txt
abc is def
hello
world
cde is efg

Here is a pure bash way:
file='./myfile.txt'
[[ $(<$file) =~ hello$'\n'[[:blank:]]*world ]] && echo "yes" || echo "no"
yes
Here $'\n' matches a new line and [[:blank:]]* matches 0+ tabs or spaces.
If you want to be more precise then use:
[[ $(<$file) =~ (^|$'\n')hello$'\n'[[:blank:]]*world($'\n'|$) ]] && echo "yes" || echo "no"
However grep or awk are much better tools for this job.
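The stricter check can be sketched as a function. Two things here are assumptions rather than part of the answer: the name hello_then_world is hypothetical, and the pattern uses [[:blank:]]+ instead of [[:blank:]]* to enforce the question's requirement of at least one space before "world".

```bash
#!/usr/bin/env bash

# hello_then_world FILE
# Hypothetical helper: succeeds if FILE contains a line that is exactly
# "hello", directly followed by a line consisting of one or more blanks
# and then "world".
hello_then_world() {
    local nl=$'\n'
    # ^ and $ anchor the whole string here, so line boundaries are spelled
    # out as "start or newline" and "newline or end".
    local re="(^|${nl})hello${nl}[[:blank:]]+world(${nl}|\$)"
    [[ $(<"$1") =~ $re ]]
}

# Usage: hello_then_world ./myfile.txt && echo "yes" || echo "no"
```

Keeping the pattern in a variable avoids quoting surprises on the right-hand side of =~.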

Regex match strings in bash

I was curious whether a regex can match the 2nd character of a string against the 2nd character from the end. For the 1st and last it's pretty easy and straightforward, but can it be done for any string length? I was playing with it in bash for the last hour and none of my solutions seems to work.
^(.).*\1$ is the regex I have for the 1st and last char; it probably needs to be edited a little for this, but I have no idea how.
Can you help me with the other one?
Examples:
abcsba - match, as b (index 1) == b (index -2)
regex - match, as e (index 1) == e (index -2)
abba - match
unix - no match, as n (index 1) != i (index -2)
linux - no match, as i (index 1) != u (index -2)

Just skip an equal number of characters at the beginning and at the end:
$ pat='^.(.).*\1.$'
$ [[ abcsba =~ $pat ]] && echo yes || echo no
yes
$ [[ unix =~ $pat ]] && echo yes || echo no
no
Or, more generally, use {n} (e.g. to match the 3rd character with the 3rd from the end):
$ pat='^.{2}(.).*\1.{2}$'
$ [[ abcdef =~ $pat ]] && echo yes || echo no
no
$ [[ abccef =~ $pat ]] && echo yes || echo no
yes
The second example uses the duplication operator {m,n}, which POSIX defines for both basic and extended regular expressions (only the ERE form is relevant here, since bash's =~ operator uses ERE). The {n} form is a special case equal to {n,n}: repeat the preceding character (or group) exactly n times.
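The {n} form makes it easy to build the pattern for any position. A minimal sketch, assuming a hypothetical helper named nth_matches and, like the examples above, a regex library that accepts backreferences in =~ patterns:

```bash
#!/usr/bin/env bash

# nth_matches STRING N
# Hypothetical helper: succeeds if the N-th character of STRING equals the
# N-th character from the end. Intended for N >= 2; for the first and last
# character the simpler ^(.).*\1$ is enough.
nth_matches() {
    local str=$1 n=$2
    local skip=$((n - 1))
    # Builds e.g. ^.{1}(.).*\1.{1}$ for N=2.
    local re="^.{${skip}}(.).*\\1.{${skip}}\$"
    [[ $str =~ $re ]]
}

nth_matches abcsba 2 && echo yes || echo no
nth_matches abcdef 3 && echo yes || echo no
```

As with the hand-written patterns, the two positions must be distinct characters of the string, so very short strings do not match.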

As I understand it, you want to extract characters. You can do that with awk:
echo "praveen" | awk '{print substr($1,1,2)}'
pr
Here I am extracting characters 1 through 2 of the first field.

match leading dots in bash if using regex

Say I want to match the leading dot in a string ".a"
So I type
[[ ".a" =~ ^\. ]] && echo "ha"
ha
[[ "a" =~ ^\. ]] && echo "ha"
ha
Why am I getting the same result here?

You need to escape the dot: it is a metacharacter in regex, with a meaning beyond a literal period.
[[ "a" =~ ^\. ]] && echo "ha"
Make the change in the other example as well.
Check your bash version - you need 4.0 or higher I believe.

There are some compatibility issues with =~ between Bash versions after 3.0. The safest way to use =~ in Bash is to put the RE pattern in a var:
$ pat='^\.foo'
$ [[ .foo =~ $pat ]] && echo yes || echo no
yes
$ [[ foo =~ $pat ]] && echo yes || echo no
no
For more details, see E14 on the Bash FAQ page.
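One concrete reason the variable form is safer: in bash 3.2 and later, quoting any part of the right-hand side of =~ makes that part match literally instead of as a regex. A minimal sketch of the difference (the function names are illustrative only):

```bash
#!/usr/bin/env bash

pat='^\.'

# Unquoted expansion: $pat is interpreted as a regex (a leading literal dot).
starts_with_dot() { [[ $1 =~ $pat ]]; }

# Quoted expansion: the three characters ^ \ . are searched for literally.
contains_pattern_text() { [[ $1 =~ "$pat" ]]; }

starts_with_dot .a && echo "regex: .a starts with a dot"
starts_with_dot a  || echo "regex: a does not"
contains_pattern_text .a || echo "literal: .a does not contain the pattern text"
```

Storing the pattern in a variable and expanding it unquoted gives the regex behavior without having to think about how the shell parses backslashes on the right-hand side.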

Probably it's because bash tries to treat "\." as an escape sequence, like \n, \r, etc.
To make \ and . reach the regex as two separate characters, try
[[ "a" =~ ^\\. ]] && echo ha

What does this match : bash regex

if [[ "$len" -lt "$MINLEN" && "$line" =~ \[*\.\] ]]
This is from Advanced bash scripting guide "Example 10-1. Inserting a blank line between paragraphs in a text file"
As I understand it, this matches "any string or a dot character". Right?

It matches zero or more open bracket characters (\[*), followed by a period and a close square bracket (\.\]). Note that it only requires that a match exist somewhere in "$line", not that the whole string match. Here's a demo:
$ showmatch() { [[ "$1" =~ \[*\.\] ]] && echo "matched: '${BASH_REMATCH[0]}'" || echo "no match"; }
$ showmatch "abc[.]def"
matched: '[.]'
$ showmatch "abc.]def"
matched: '.]'
$ showmatch "abc[[[[[[[.]def"
matched: '[[[[[[[.]'
$ showmatch "abc[[[[[[[xyz.]def"
matched: '.]'
$ showmatch "abc[[[[[[[.xyz]def"
no match
...and I'm pretty sure that's not what it's supposed to be doing in that example script.
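If the script's intent was to detect lines that end with a period (this is a guess, not something the quoted line confirms), an anchored pattern would express that more directly. The function name below is hypothetical:

```bash
#!/usr/bin/env bash

# ends_with_period LINE
# Hypothetical helper: succeeds only if the last character of LINE is ".".
ends_with_period() {
    local re='\.$'      # escaped dot, anchored at the end of the string
    [[ $1 =~ $re ]]
}

ends_with_period "End of paragraph." && echo "ends with a period"
ends_with_period "abc.]def"          || echo "does not end with a period"
```

Unlike \[*\.\], this cannot match in the middle of the line.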

It means any string ending with a dot inside brackets, for example: [.] or [abc.]

Update: +1 to Gordon Davisson, who has summed it up pretty well... so I've redacted my original post
In brief: You can test the result of a bash regex match like this:
[[ "[*.]" =~ \[*\.\] ]] ; echo ${BASH_REMATCH[0]}

Regex in KornShell

I am trying to check whether a variable is exactly two numbers but I can not seem to figure it out.
How do you check regular expressions (regex) in KornShell (ksh)?
I have tried:
if [[ $month =~ "[0-9]{2}" ]]
if [[ $month = _[0-9]{2}_ ]]
I have not been able to find any docs on it.
Any insight?

case $month in
[0-9][0-9]) echo "ok";;
*) echo "no";;
esac
should work.
If you need full regexp search, you can use egrep like this:
if echo $month | egrep -q '^[0-9]{2}$'
then
echo "ok"
else
echo "no"
fi

Ksh has supported limited extended patterns since ksh88, using the
special '(' pattern ')'
syntax.
In ksh88, the 'special' character prefixes change the number of matches expected:
'*' for zero or more matches
'+' at least one match
'@' for exactly one match
'?' for zero or one matches
'!' for negation
In ksh93, this was expanded with
'{' min ',' max '}'
to express an exact range:
for w in 1423 12 "" abc 23423 9 33 3 333
do
[[ $w == {1,3}(\d) ]] && print $w has between 1 and three digits
[[ $w == {2}(\d) ]] && print $w has exactly two digits
done
And finally, you can have perl-like clutter with '~', which introduces a whole new class of extensions, including full regular expressions with:
'~(E)( regex )'
More examples can be found in Finnbarr P. Murphy's blog
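For readers trying this in bash rather than ksh: the ksh88-style prefix operators are available there as extended globs once the extglob option is enabled, while the ksh93 {min,max}(...) form is not, as far as I know. A sketch that mirrors the two tests above with hypothetical helper names:

```bash
#!/usr/bin/env bash
shopt -s extglob   # enable ?(...), *(...), +(...), @(...), !(...)

# Hypothetical helpers mirroring the two ksh93 tests above.
one_to_three_digits() { [[ $1 == [0-9]?([0-9])?([0-9]) ]]; }
exactly_two_digits()  { [[ $1 == @([0-9])@([0-9]) ]]; }

for w in 1423 12 "" abc 23423 9 33 3 333
do
    if one_to_three_digits "$w"; then echo "$w has between 1 and three digits"; fi
    if exactly_two_digits "$w"; then echo "$w has exactly two digits"; fi
done
```

The ranges are spelled out with ?(...) and @(...) because there is no counted repetition to lean on.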

Where I come from, this is more likely to validate numeric months:
if (( $month >= 1 && $month <= 12 ))
or
[[ $month =~ ^([1-9]|1[012])$ ]]
or to include a leading zero for single-digit months:
[[ $month =~ ^(0[1-9]|1[012])$ ]]
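The same range check can be sketched with a case statement, which uses only glob patterns and so should not depend on =~ support in the shell at hand. The function name valid_month is an assumption for illustration:

```bash
# valid_month VALUE
# Hypothetical helper: succeeds for 1..12, with or without a leading zero
# on single-digit months.
valid_month() {
    case $1 in
        [1-9] | 0[1-9] | 1[0-2]) return 0 ;;
        *) return 1 ;;
    esac
}

for m in 1 09 12 13 abc
do
    if valid_month "$m"; then echo "$m: ok"; else echo "$m: no"; fi
done
```

Because case patterns must match the whole word, no anchors are needed.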

ksh does not use regular expressions; it uses a simpler but still quite useful language called "shell globbing patterns". The key ideas are
Classes like [0-9] or [chly] match any character in the class.
The . is not a special character; it matches only a literal dot.
The ? matches any single character.
The * matches any sequence of characters.
Unlike regular expressions, shell globbing patterns must match the entire word, so a pattern behaves like a regexp that always starts with ^ and ends with $.
Globbing patterns are not as powerful as regular expressions, but they are much easier to read, and they are very convenient for matching filenames and simple words. The case construct is my favorite for matching but there are others.
As already noted by Alok you probably want
case $number in
[0-9][0-9]) success ;;
*) failure;;
esac
Although possibly you might prefer not to match a two-digit number with initial zero, so prefer [1-9][0-9].

You can try this as well:
$ month=100
$ [[ $month == {1,2}([0-9]) ]] && echo "ok" || echo "no"
no
$ [[ $month == [0-9][0-9] ]] && echo "ok" || echo "no"
no
$ month=10
$ [[ $month == {1,2}([0-9]) ]] && echo "ok" || echo "no"
ok
$ [[ $month == [0-9][0-9] ]] && echo "ok" || echo "no"
ok
