Do not match if nothing exists between optional parentheses - regex

I'm attempting to parse group names from /etc/security/login-access.conf. We have a mixed environment of LDAP & AD machines. AD groups are enclosed in parentheses ().
I have the following regex, which extracts only the group name. The only problem is that there is routinely a 'null' group, and for that entry the regex returns a null plus the ) character:
Current regex:
/(?<=\+\s:\s[#\(])(.*?)(?=[\)]?\s:)/
Sample /etc/security/login-access.conf:
+ : #ldapgroup1 : ALL
+ : #ldapgroup2 : ALL
+ : (#adgroup1) : ALL
+ : (#adgroup2) : ALL
+ : () : ALL # <---This is the problematic entry.
I'm not sure if or how to tune the regex to ignore an entry that contains nothing between the parentheses. Any help is appreciated.

Since your regex engine appears to have capture groups, I would just express your pattern as:
\+ : (\(#\S+\)|#\S+) : \S+
Here I use an alternation to cleanly match either the parenthesized or the non-parenthesized variant of the group names.
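As a rough illustration, the alternation pattern can be tried against the sample file from the question. This is a minimal Python sketch, not part of the original answer; the names `GROUP_RE`, `SAMPLE` and `extract_groups` are made up for the example, and stripping the parentheses and `#` afterwards is an assumption about what "only the group name" means.

```python
import re

# Pattern from the answer: group 1 is either "#name" or "(#name)".
GROUP_RE = re.compile(r"\+ : (\(#\S+\)|#\S+) : \S+")

# Sample entries from the question, including the empty "()" entry.
SAMPLE = """\
+ : #ldapgroup1 : ALL
+ : #ldapgroup2 : ALL
+ : (#adgroup1) : ALL
+ : (#adgroup2) : ALL
+ : () : ALL
"""

def extract_groups(text):
    """Return bare group names; lines such as '+ : () : ALL' do not match."""
    names = []
    for line in text.splitlines():
        m = GROUP_RE.search(line)
        if m:
            # Drop the surrounding parentheses (if any) and the leading '#'.
            names.append(m.group(1).strip("()").lstrip("#"))
    return names

print(extract_groups(SAMPLE))
```

The empty `()` entry is skipped because neither alternative can match without a `#` followed by at least one non-whitespace character.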

Might not be the most efficient, definitely ugly but it works:
(?<=\+\s:\s#|\(#)([a-zA-Z0-9_-]+)(?=[\)]?\s:)

If you are using perl, you can use a branch reset group:
\+\h:\h(?|#([\w-]+)|\(#([\w-]+)\))\h:
The pattern matches:
\+\h:\h Match + and a colon between horizontal whitespace chars
(?| Branch reset group
#([\w-]+) Match # and capture 1+ word chars or a hyphen in group 1
| Or
\(#([\w-]+)\) Match (#, capture 1+ word chars or a hyphen (also in group 1, because the branch reset group restarts the numbering in each alternative) and match )
)\h: Close the branch reset group, then match a horizontal whitespace char and a colon
Regex demo
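Python's built-in `re` module supports neither branch reset groups `(?|...)` nor `\h`, so a direct port is not possible. The sketch below is one possible workaround, assuming Python is an option: it uses two ordinary capture groups and takes whichever one participated in the match, with `[ \t]` standing in for `\h`. The names `ENTRY_RE` and `group_name` are hypothetical.

```python
import re

# Two plain capture groups instead of a branch reset group:
# group 1 holds an LDAP name, group 2 holds an AD name.
ENTRY_RE = re.compile(r"\+[ \t]:[ \t](?:#([\w-]+)|\(#([\w-]+)\))[ \t]:")

def group_name(line):
    """Return the group name from one entry, or None if there is none."""
    m = ENTRY_RE.search(line)
    if m is None:
        return None
    # Exactly one of the two groups is set when the pattern matches.
    return m.group(1) or m.group(2)

for entry in ("+ : #ldapgroup1 : ALL", "+ : (#adgroup1) : ALL", "+ : () : ALL"):
    print(entry, "->", group_name(entry))
```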

Related

Regex to get value from <key, value> by asserting conditions on the value

I have a regex which takes the value from the given key as below
Regex: .*key="([^"]*)".*
Input value: key="abcd-qwer-qaa-xyz-vwxc"
Output: abcd-qwer-qaa-xyz-vwxc
But on top of this, I need to validate that the value starts with abcd- and contains -xyz somewhere after it.
Thus, the inputs and outputs have to be as follows:
I tried the pattern below, which does not work as expected:
.*key="([^"]*)"?(/Babcd|-xyz).*
The key value pair is part of the large string as below:
object{one="ab-vwxc",two="value1",key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}
I think that by matching the key it takes the value, and that's why I used this: .*key="([^"]*)".*
Note:
It's a dashboard. You can refer to this link and search for Regex: /"([^"]+)"/. This regex is applied to the query result, which is the string I referred to. It works with the regex .*key="([^"]*)".* above. I'm trying to alter that regex group itself. Hope this helps.
Can anyone guide me on this, please? That would be helpful. Thanks!
Looks like you could do with:
\bkey="(abcd(?=.*-xyz\b)(?:-[a-z]+){4})"
See the demo online
\bkey=" - A word-boundary and literally match 'key="'
( - Open 1st capture group.
abcd - Literally match 'abcd'.
(?=.*-xyz\b) - Positive lookahead for zero or more characters (except newline) followed by literally '-xyz' and a word-boundary.
(?: - Open non-capturing group.
-[a-z]+ - Match a hyphen followed by at least a single lowercase letter.
){4} - Close non-capture group and match it 4 times.
) - Close 1st capture group.
" - Match a literal double quote.
I'm not 100% sure you'd only want to allow lowercase letters, so you can adjust that part if need be. The whole pattern validates the input value, and you can use capture group 1 to grab your key's value.
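To see the lookahead-based pattern in action, here is a small Python sketch (Python's `re` supports lookaheads). It only exercises the pattern on the string from the question and on two made-up failing inputs; `KEY_RE` and `key_value` are hypothetical names.

```python
import re

# Pattern from the answer above: the value must start with "abcd",
# "-xyz" must appear later on the line, and there are four "-word" parts.
KEY_RE = re.compile(r'\bkey="(abcd(?=.*-xyz\b)(?:-[a-z]+){4})"')

def key_value(text):
    """Return the validated value of key, or None if it does not qualify."""
    m = KEY_RE.search(text)
    return m.group(1) if m else None

line = 'object{one="ab-vwxc",two="value1",key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}'
print(key_value(line))                           # value starts with abcd and has -xyz
print(key_value('key="abcd-qwer-qaa-zzz-vwxc"'))  # no -xyz anywhere
print(key_value('key="wxyz-qwer-qaa-xyz-vwxc"'))  # does not start with abcd
```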
Update after edited question with new information:
Prometheus uses the RE2 engine in all regular expressions. Therefore the above suggestion won't work, because RE2 does not support lookarounds. A less restrictive but possible answer for OP could be:
\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"
See the online demo
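The lookaround-free pattern can be sanity-checked in Python as well. Note that Python's `re` is not RE2, so this sketch only shows what the pattern accepts and rejects; it does not prove RE2 compatibility. `PLAIN_KEY_RE` and `plain_key_value` are names invented for the example.

```python
import re

# Lookaround-free variant: "abcd", any "-word" parts, "-xyz", any "-word" parts.
PLAIN_KEY_RE = re.compile(r'\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"')

def plain_key_value(text):
    """Return the captured value, or None when the value does not qualify."""
    m = PLAIN_KEY_RE.search(text)
    return m.group(1) if m else None

print(plain_key_value('key="abcd-qwer-qaa-xyz-vwxc"'))  # accepted
print(plain_key_value('key="abcd-xyz"'))                # accepted, -xyz right after abcd
print(plain_key_value('key="abcd-qwer"'))               # rejected, no -xyz part
print(plain_key_value('key="abcd-xyzzy"'))              # rejected, xyz is not a whole part
```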
Will this work?
Pattern
\bkey="(abcd-[^"]*\bxyz\b[^"]*)"
You could use the following regular expression to verify the string has the desired format and to match the portion of the string that is of interest.
(?<=\bkey=")(?=.*-xyz(?=-|$))abcd(?:-[a-z]+)+(?=")
Start your engine!
Note there are no capture groups.
The regex engine performs the following operations.
(?<=\bkey=") : positive lookbehind asserts the current
position in the string is preceded by 'key="'
(?= : begin positive lookahead
.*-xyz : match 0+ characters, then '-xyz'
(?=-|$) : positive lookahead asserts the current position is
: followed by '-' or is at the end of the string
) : end positive lookahead
abcd : match 'abcd'
(?: : begin non-capture group
-[a-z]+ : match '-' followed by 1+ lowercase letters
)+ : end non-capture group and execute it 1+ times
(?=") : positive lookahead asserts the current position is
: followed by '"'

Using regex replacement in Sublime 3

I am trying to use replace in Sublime using regular expressions but I'm stuck. I tried various combinations but don't seem to be getting there.
This is the input and my desired output:
Input: N_BBP_c_46137_n
Output : BBP
I tried combinations of:
[^BBP]+\b
\*BBP*+\g
But none of the above (nor the many others I tried) seem to work.
To turn N_BBP_c_46137_n into BBP (according to the comment, you want the entire long name starting with N_BBP_ to be replaced by only BBP), you might use a capture group to keep BBP.
\bN_(BBP)_\S*
\bN_ Match N_ preceded by a word boundary
(BBP) Capture group 1, match BBP (or use [A-Z]+ to match 1+ uppercase chars)
_\S* Match _ followed by 0+ times a non whitespace char
In the replacement use the first capturing group $1
Regex demo
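Outside Sublime, the same capture-group replacement can be checked with a few lines of Python. This is only a sketch for verifying the pattern; Python writes the backreference as `\1` where Sublime uses `$1`, and `NAME_RE` and `shorten` are hypothetical names.

```python
import re

# Same pattern as above; group 1 keeps the part that should survive.
NAME_RE = re.compile(r"\bN_(BBP)_\S*")

def shorten(text):
    """Replace each long name with the contents of capture group 1."""
    return NAME_RE.sub(r"\1", text)

print(shorten("N_BBP_c_46137_n"))
```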
You may use
(N_)[^_]*(_c_\d+_n)
Replace with ${1}some new value$2.
Details
(N_) - Group 1 ($1 or ${1} if the next char is a digit): N_
[^_]* - any 0 or more chars other than _
(_c_\d+_n) - Group 2 ($2): _c_, 1 or more digits and then _n.
See the regex demo.
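For comparison, a Python sketch of the same two-group replacement. Python spells the unambiguous group reference `\g<1>` (the counterpart of `${1}`), which matters when the replacement text starts with a digit. `MIDDLE_RE` and `replace_middle` are names made up for this example.

```python
import re

# Group 1 is the "N_" prefix, group 2 the "_c_<digits>_n" suffix.
MIDDLE_RE = re.compile(r"(N_)[^_]*(_c_\d+_n)")

def replace_middle(text, new_value):
    """Swap the part between the prefix and the suffix for new_value."""
    # A function replacement avoids escaping issues inside new_value.
    return MIDDLE_RE.sub(lambda m: m.group(1) + new_value + m.group(2), text)

print(replace_middle("N_BBP_c_46137_n", "XYZ"))

# Template form: \g<1> keeps the group number separate from the digits "42".
print(MIDDLE_RE.sub(r"\g<1>42\g<2>", "N_BBP_c_46137_n"))
```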

Regex allows for repeating the pattern when I don't want it to

I'm trying to take a query parameter and verify if the syntax provided by the user is correct. Regex seems like the best choice for this, but I'm having trouble making it so the pattern doesn't allow for repeating itself.
The pattern I came up with is:
(^(\w+)(=|!=|>=|>|<=|<|~)((')(.*)('))(\s(AND|OR)\s)(\w+)(=|!=|>=|>|<=|<|~)((')(.*)('))$)
The syntax provided by the user should be:
[field][predicate][single quote][value][single quote][white space][logical operator][white space][field][predicate][single quote][value][single quote]
Where:
field is [any word]
predicate is [= | != | >= | > | <= | < | ~]
logical operator is [AND | OR (with a space on both sides)]
value is [any word wrapped by single quotes]
An example looks like this: field1='value1' OR field2='value2'
The problem I am having is that the pattern I created allows for things like this:
field1='value1' OR field2='value2field1='value' OR field2='value2'' [This shouldn't work but does]
field1='value1' OR field2='value2 field1='value' OR field2='value2'' [This shouldn't work but does]
field1='value1' OR field2='value2' AND field3='value3' OR field4='value4'' [This shouldn't work but does]
Any help would be appreciated making it so the pattern doesn't match if it repeats.
You might use:
^\w+(?:<=|>=|!=|[~<>=])'\w+'(?: (?:OR|AND) \w+(?:<=|>=|!=|[~<>=])'\w+')*$
^ Start of string
\w+ Match 1 or more word chars
(?: Non capture group
<=|>=|!=|[~<>=] Match one of the alternatives
) Close group
'\w+' Match 1 or more word chars between single quotes
(?: Non capture group
 (?:OR|AND) \w+ Match a space, either OR or AND, a space and 1+ word chars
(?:<=|>=|!=|[~<>=]) Match one of the alternatives
'\w+' Match 1 or more word chars between single quotes
)* Close group and repeat 0+ times to also match without AND or OR
$ End of string
If there should be at least a single AND or OR the quantifier of the last group could be + instead of *
The single chars in the predicate could be added to a character class [~<>=] to take out a few alternations.
Regex demo
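A quick way to check this kind of validator is to run it over the examples from the question. The Python sketch below uses the predicates listed in the question (`>=` and `<=`) and builds the pattern from one reusable clause; `CLAUSE`, `QUERY_RE` and `is_valid_query` are hypothetical names, and `fullmatch` takes the place of the `^` and `$` anchors.

```python
import re

# One "field predicate 'value'" clause, with the predicates from the question.
CLAUSE = r"\w+(?:<=|>=|!=|[~<>=])'\w+'"

# A clause followed by zero or more " OR clause" / " AND clause" parts.
QUERY_RE = re.compile(rf"{CLAUSE}(?: (?:OR|AND) {CLAUSE})*")

def is_valid_query(query):
    """Return True when the whole string follows the expected syntax."""
    return QUERY_RE.fullmatch(query) is not None

good = "field1='value1' OR field2='value2'"
bad = [
    "field1='value1' OR field2='value2field1='value' OR field2='value2''",
    "field1='value1' OR field2='value2 field1='value' OR field2='value2''",
    "field1='value1' OR field2='value2' AND field3='value3' OR field4='value4''",
]
print(is_valid_query(good), [is_valid_query(q) for q in bad])
```

Because the value is restricted to `\w+`, a quote or space inside a value can no longer be swallowed the way `.*` allowed in the original pattern.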

Regex for page sorting query string

I am trying to match
"SomeField:asc"
"SomeField:desc"
"SomeField:asc,SomeField:asc"
"SomeField:desc,SomeField:desc" ...
Does not match if
""
SomeField:desc,SomeField
SomeField,SomeField:asc
SomeField:desc,SomeField:des, (extra comma)
I have current regex [A-Za-z]+:(asc|desc), but I am stuck. I am sure it is really simple regex but I am new to this so please be patient! Thank you
Maybe you can use this regex ^(?:[A-Za-z]+:(?:asc|desc),?)+$
From the beginning of the string ^
Inside a non capturing group (?:
One or more characters [A-Za-z]+
Followed by a colon :
Inside a non capturing group (?:
asc or desc asc|desc
with an optional comma ,?
Repeat the outer group one or more times +
Until the end of the string $
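One thing worth checking with the pattern above is the trailing comma: because `,?` is optional inside the repeated group, a string ending in a comma is still accepted. The Python sketch below compares it with a stricter variant that requires a comma only between items. The stricter pattern is an alternative added for illustration, not the answer's own, and `LENIENT_RE`, `STRICT_RE` and `is_valid_sort` are hypothetical names.

```python
import re

# Pattern from the answer: the comma after each item is optional.
LENIENT_RE = re.compile(r"^(?:[A-Za-z]+:(?:asc|desc),?)+$")

# Stricter sketch: first item, then zero or more ",item" parts.
STRICT_RE = re.compile(r"[A-Za-z]+:(?:asc|desc)(?:,[A-Za-z]+:(?:asc|desc))*")

def is_valid_sort(value):
    """Return True for a comma-separated list of Field:asc / Field:desc items."""
    return STRICT_RE.fullmatch(value) is not None

for value in ("SomeField:asc", "SomeField:desc,SomeField:desc", "",
              "SomeField:desc,SomeField", "SomeField,SomeField:asc",
              "SomeField:desc,SomeField:desc,"):
    print(repr(value), bool(LENIENT_RE.search(value)), is_valid_sort(value))
```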
I think this will do the trick:
([\w+]+:(asc|desc))(,([\w+]+:(asc|desc)))*
It will match one or more fields, ignoring those that do not meet the spec.

Regex capture group multiple times and other groups

I'm trying to write a regular expression which captures multiple groups of data.
Here is some example data:
sampledata=X
B : xyz=1 FAB1_1=03 FAB2_1=01
A : xyz=1 FAB1_1=03 FAB2_1=01
I need to capture the X, which should appear once, and FAB1_1=03, FAB2_1=01, ... (all the strings which start with FAB).
So, I could capture all "FAB" like this :
/(FAB[0-9]_[0-9]=[0-9]*)/sg
But I could not include the capture of X using this expression :
/sampledata=(?<samplegroup>[0-9A-Z]).*(FAB[0-9]_[0-9]=[0-9]*)/sg
This regex only returns one group with X and the last match of the "FAB" group.
You can use
(?:sampledata=(\S+)|(?!^)\G)(?:(?!FAB[0-9]_[0-9]=).)*(FAB[0-9]_[0-9])=([0-9]*)
See the regex demo
The regex is based on the \G operator that matches either the start of string or the end of the previous successful match. We restrict it to match only in the latter case with a negative lookahead (?!^).
So:
(?:sampledata=(\S+)|(?!^)\G) - match a literal sampledata= and then match and capture into Group 1 one or more non-whitespace symbols -OR- match the end of the previous successful match
(?:(?!FAB[0-9]_[0-9]=).)* - match any text that is not FABn_n= (this is a tempered greedy token)
(FAB[0-9]_[0-9]) - Capture group 2, matching and capturing FAB followed by a digit, then a _, and one more digit
= - literal =
([0-9]*) - Capture group 3, matching and capturing zero or more digits
If you have 1 sampledata= block, you can safely unroll the tempered greedy token (demo) as
(?:sampledata=(\S+)|(?!^)\G)[^F]*(?:F(?!FAB[0-9]_[0-9]=)[^F]*)*?(FAB[0-9]_[0-9])=([0-9]*)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
That way, the expression will be more efficient.
If you have several sampledata blocks, enhance the tempered greedy token:
(?:sampledata=(\S+)|(?!^)\G)(?:(?!sampledata=|FAB[0-9]_[0-9]=).)*(FAB[0-9]_[0-9])=([0-9]*)
See another demo
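Python's built-in `re` module has no `\G` anchor, so the patterns above cannot be used there as written. If Python is the target, one alternative is a plain two-pass approach, which is a different technique from the `\G` solution: find `sampledata=` once, then collect every FAB pair after it. This is a sketch under the assumption of a single sampledata block; `SAMPLE_RE`, `FAB_RE` and `parse` are made-up names.

```python
import re

SAMPLE_RE = re.compile(r"sampledata=(\S+)")
FAB_RE = re.compile(r"(FAB[0-9]_[0-9])=([0-9]*)")

def parse(text):
    """Return (sample value, list of (FAB name, digits)) for one block."""
    m = SAMPLE_RE.search(text)
    if m is None:
        return None, []
    # Only look for FAB pairs after the sampledata= match.
    return m.group(1), FAB_RE.findall(text, m.end())

data = (
    "sampledata=X\n"
    "B : xyz=1 FAB1_1=03 FAB2_1=01\n"
    "A : xyz=1 FAB1_1=03 FAB2_1=01\n"
)
print(parse(data))
```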