This is my current regex:
/([A-Z])(?![A-Z])/gm
Below is how it's evaluated:
https://regex101.com/r/K1gvmr/1
In the screenshot you can see I'm getting:
[F] oo [B] ar XYBA [Z]
Instead, I need these matches:
[F] oo [B] ar [X] YBAZ
How can a negative lookahead (or another approach) restrict the match to the first character of each group only?
Try Regex: (?<![A-Z])[A-Z]
Explanation:
A negative lookbehind for A-Z followed by any character in A-Z, so a capital letter matches only when it is not preceded by another capital letter.
Demo
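To make the difference between the two patterns concrete, here is a minimal Python sketch. The sample string "Foo Bar XYBAZ" is an assumption reconstructed from the matches shown in the question, and Python's re module stands in for whatever engine the question actually uses.

```python
import re

# Assumed sample input, reconstructed from the matches shown in the question.
text = "Foo Bar XYBAZ"

# Original attempt: a capital letter NOT FOLLOWED by another capital.
# This picks the last capital of each run.
original = re.findall(r"([A-Z])(?![A-Z])", text)

# Suggested fix: a capital letter NOT PRECEDED by another capital.
# This picks the first capital of each run.
fixed = re.findall(r"(?<![A-Z])[A-Z]", text)

print(original)  # ['F', 'B', 'Z']
print(fixed)     # ['F', 'B', 'X']
```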
How can I make sure that part of the pattern (a keyword in this case) is present in the match, given that it can appear in different places? I want a match only when the keyword occurs at least once.
Regex:
\b(([0-9])(xyz)?([-]([0-9])(xyz)?)?)\b
We only want the value if there is a keyword: xyz
Examples:
1. 1xyz-2xyz - it's OK
2. 1-2xyz - it's OK
3. 1xyz - it's OK
4. 1-2 - there should be no match, at least one xyz missing
I tried a positive lookahead and lookbehind but this is not working in this case.
You can make use of a conditional construct:
\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b(?(2)|(?(4)|(?!)))
See the regex demo. Details:
\b - word boundary
([0-9]) - Group 1: a digit
(xyz)? - Group 2: an optional xyz string
(?:-([0-9])(xyz)?)? - an optional sequence of a -, a digit (Group 3), and an optional xyz string (Group 4)
\b - word boundary
(?(2)|(?(4)|(?!))) - a conditional: if Group 2 (first (xyz)?) matched, it is fine, return the match, if not, check if Group 4 (second (xyz)?) matched, and return the match if yes, else, fail the match.
See the Python demo:
import re
text = "1. 1xyz-2xyz - it's OK\n2. 1-2xyz - it's OK\n3. 1xyz - it's OK\n4. 1-2 - there should be no match"
pattern = r"\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b(?(2)|(?(4)|(?!)))"
print( [x.group() for x in re.finditer(pattern, text)] )
Output:
['1xyz-2xyz', '1-2xyz', '1xyz']
Indeed you could use a lookahead in the following way:
\b\d(?:xyz|(?=-\dxyz))(?:-\d(?:xyz)?)?\b
See this demo at regex101 (or using ^ start and $ end)
The first part matches either an xyz OR (if there is none) the lookahead ensures that the xyz occurs in the second, optional part. The second part is dependent on the previous condition.
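As a quick check of the lookahead pattern above, here is a small Python sketch run against the four examples from the question. The helper name first_match is only illustrative and not part of the original answer.

```python
import re

# Lookahead variant from the answer above.
lookahead_pattern = re.compile(r"\b\d(?:xyz|(?=-\dxyz))(?:-\d(?:xyz)?)?\b")

def first_match(s):
    """Return the first match in s, or None if there is no match (hypothetical helper)."""
    m = lookahead_pattern.search(s)
    return m.group() if m else None

samples = ["1xyz-2xyz", "1-2xyz", "1xyz", "1-2"]
lookahead_results = [first_match(s) for s in samples]
print(lookahead_results)  # ['1xyz-2xyz', '1-2xyz', '1xyz', None]
```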
Try this: \b(([0-9])?(xyz)+([-]([0-9])+(xyz)+)?)\b
Replace ? with +.
Basically, ? means zero or one, and in your case you want to match one or more,
which is +.
How about something as basic as this
(\dxyz-\dxyz|\dxyz-\d|\d-\dxyz|\dxyz)
You can add word boundary if needed
\b(\dxyz-\dxyz|\dxyz-\d|\d-\dxyz|\dxyz)\b
Just an OR
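The plain alternation can be checked the same way. This is a sketch in Python using the four examples from the question; note that the longest alternative is listed first, so the engine tries it before the shorter ones.

```python
import re

# Alternation variant with word boundaries, as given above.
alternation = re.compile(r"\b(\dxyz-\dxyz|\dxyz-\d|\d-\dxyz|\dxyz)\b")

samples = ["1xyz-2xyz", "1-2xyz", "1xyz", "1-2"]
alternation_results = []
for s in samples:
    m = alternation.search(s)
    alternation_results.append(m.group() if m else None)

print(alternation_results)  # ['1xyz-2xyz', '1-2xyz', '1xyz', None]
```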
I want to validate strings that have the form:
One underscore _, a group of letters in a, b, c in alphabetical order and another underscore _.
Examples of valid strings are _a_, _b_, _ac_, _abc_.
I can achieve the correct validation for most cases using the regex _a?b?c?_, but that is still matched by __, which I don't want to consider valid. How can I adapt this regex so that among my zero or one characters a?b?c?, at least one of them must be present?
You can add a (?!_) lookahead after the first _:
_(?!_)a?b?c?_
Details:
_ - an underscore
(?!_) - the next char cannot be a _
a? - an optional a
b? - an optional b
c? - an optional c
_ - an underscore.
See the regex demo.
You may use this regex with a positive lookahead:
_(?=[abc])a?b?c?_
RegEx Demo
RegEx Details:
_: Match a _
(?=[abc]): Positive lookahead to assert that there is a letter a or b or c
a?b?c?: Match an optional a, then an optional b, then an optional c
_: Match a _
PS: Positive lookahead assertions are often more efficient than negative ones (as the step counts on the regex101 website suggest for this case).
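Both lookahead variants can be compared side by side. The following is a minimal Python sketch; re.fullmatch is used here on the assumption that whole strings are being validated, and the test strings beyond those in the question ("_ca_", "_d_") are added for illustration.

```python
import re

# The two lookahead-based patterns suggested above.
negative = re.compile(r"_(?!_)a?b?c?_")     # next char must not be "_"
positive = re.compile(r"_(?=[abc])a?b?c?_")  # next char must be a, b or c

candidates = ["_a_", "_b_", "_ac_", "_abc_", "__", "_ca_", "_d_"]

negative_ok = [bool(negative.fullmatch(s)) for s in candidates]
positive_ok = [bool(positive.fullmatch(s)) for s in candidates]

print(negative_ok)  # [True, True, True, True, False, False, False]
print(positive_ok)  # [True, True, True, True, False, False, False]
```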
Thought I'd chip in an alternative for those using Python with PyPI's regex module, which supports what is called approximate “fuzzy” matching:
^_(abc){d<=2}_$
The pattern means:
^_ - Match start-line anchor and leading underscore;
(abc){d<=2} - Match 'abc' in order and allow for up to just two deletions;
_$ - Match trailing underscore and end-line anchor.
import regex as re
l = ["_a_", "_b_", "_ac_", "_abc_", "", "__", "_ca_"]
print([bool(re.search(r'^_(abc){d<=2}_$', s)) for s in l])
Prints:
[True, True, True, True, False, False, False]
Another option, if lookbehind is supported, is to look back after the match and assert that it is not __:
_a?b?c?_(?<!__)
Explanation
_ Match literally
a?b?c? Match an optional a, an optional b and an optional c, in that order
_ Match literally
(?<!__) Negative lookbehind, assert not __ directly to the left
Regex demo
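Python's re module supports fixed-width lookbehind, so the trailing-lookbehind idea can be sketched there as well. re.fullmatch is an assumption about how the validation would be wired up, and the variable names are illustrative.

```python
import re

# Match the whole token first, then look back and reject a bare "__".
trailing_lookbehind = re.compile(r"_a?b?c?_(?<!__)")

candidates = ["_a_", "_ac_", "_abc_", "__", "_ca_"]
lookbehind_ok = [bool(trailing_lookbehind.fullmatch(s)) for s in candidates]

print(lookbehind_ok)  # [True, True, True, False, False]
```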
If the (*SKIP)(*FAIL) verbs are supported, you can use them to get __ out of the way:
__(*SKIP)(*FAIL)|_a?b?c?_
Regex demo
I need a regex that matches the lines containing [m] and ignores the lines containing [F].
The regex needs to select lines 1 and 2 only:
1.[m]dfsd
2.[M]
3.[M]dfdfd[F]
4.[M]dfsd[f]
5.[m]dfd[F]
6.[m]fsdf[f]
I tried this:
(?=.[m])(?!=.[f])
Your attempt to use lookaheads is on the right track. I would use a negative lookahead to exclude [F], and then match [m] anywhere in the pattern:
^(?!.*\[[Ff]\]).*\[[Mm]\].*$
Demo
Note that I used [Mm] and [Ff] so that both markers are matched case-insensitively.
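Here is a short Python sketch of that pattern applied to the six sample lines from the question. The re.MULTILINE flag is an assumption about how the lines are fed in (as one multi-line string), so that ^ and $ anchor at each line.

```python
import re

sample = "\n".join([
    "1.[m]dfsd",
    "2.[M]",
    "3.[M]dfdfd[F]",
    "4.[M]dfsd[f]",
    "5.[m]dfd[F]",
    "6.[m]fsdf[f]",
])

# Reject any line containing [F]/[f], then require [M]/[m] somewhere in it.
line_pattern = re.compile(r"^(?!.*\[[Ff]\]).*\[[Mm]\].*$", re.MULTILINE)

selected = line_pattern.findall(sample)
print(selected)  # ['1.[m]dfsd', '2.[M]']
```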
I'm trying to build a regex, which will detect usernames mentioned in a string. The usernames can look like "username", "username[0-9]", "adm-username", "adm-username[0-9]".
As of now, I have this: \b(adm\-){0,1}username[0-9]{0,1}\b (link: https://regexr.com/4at34)
The problem is with adm-. If the prefix is wrong, as in aadm-username, the regex still detects 'username'; I want it to fail. Any tips on how to do that?
Thanks
You could replace \b with negative lookarounds on [\w-] in your case.
That way, the boundary characters are checked but not included in the match.
And finally, don't capture intermediate groups; make a single big group for your matches.
Demo
(?<![\w-])((?:adm-)?username\d?)(?![\w-])
[v] username
[v] username2
[v] adm-username
[v] adm-username2
[x] aadm-username
[x] aadm-username2
Explanation
(?<![\w-]) # negative lookbehind, only match if not preceded by a word character or hyphen
(
(?:adm-)? # non-capturing group containing adm- literally once or none, will be matched in the greater group
username\d? # literally matching username and a digit, once or none
)
(?![\w-]) # negative lookahead, only match if not followed by a word character or hyphen
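The pattern can be tried out in Python as follows. This is a sketch only: the sample sentence simply strings together the six cases listed above, and the question itself does not say which engine is in use.

```python
import re

username_pattern = re.compile(r"(?<![\w-])((?:adm-)?username\d?)(?![\w-])")

# The six cases from the list above, joined into one test string.
text = "username username2 adm-username adm-username2 aadm-username aadm-username2"

# findall returns the contents of the single capturing group.
found = username_pattern.findall(text)
print(found)  # ['username', 'username2', 'adm-username', 'adm-username2']
```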
I'm new to regex and I need an expression that matches on =, but not on ==.
So for example:
[x] == [y] // No match
[x] = [y] // Match
All my self-made regular expressions get a match on the first = in ==. I don't want that. I just want a match if the = is the only operator in the expression.
I'm working with delphi regular expressions.
Use negative lookaround:
(?<!=)=(?!=)
This matches an equals sign only if it is neither preceded nor followed by another equals sign.
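The question is about Delphi, but the same lookaround syntax can be illustrated in Python. The following is only a sketch using the two example expressions from the question.

```python
import re

# An "=" that has no "=" directly before or after it.
single_eq = re.compile(r"(?<!=)=(?!=)")

has_single_in_comparison = bool(single_eq.search("[x] == [y]"))
has_single_in_assignment = bool(single_eq.search("[x] = [y]"))

print(has_single_in_comparison)  # False
print(has_single_in_assignment)  # True
```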
You have to check that neither the predecessor nor the successor is =:
[^=]=[^=]
Have a look at this example. Here is a little interactive tutorial which covers the important cases.
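One difference between the character-class version and the lookaround version (?<!=)=(?!=) is worth seeing in code: [^=] consumes the neighbouring characters and requires them to exist. A minimal Python sketch, with made-up inputs "a=b" and "=b":

```python
import re

char_class = re.compile(r"[^=]=[^=]")
lookaround = re.compile(r"(?<!=)=(?!=)")

# The character class includes both neighbours in the match.
print(char_class.search("a=b").group())  # a=b
print(lookaround.search("a=b").group())  # =

# With "=" at the very start there is no preceding character to consume,
# so the character-class version fails while the lookaround still matches.
class_at_start = char_class.search("=b")
look_at_start = lookaround.search("=b")
print(class_at_start)         # None
print(look_at_start.group())  # =
```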
Adapting this answer should do the trick:
(?:[^=]+(=)[^=]+)
Explanation:
(?: // Do not capture group
[^=]+ // Match 1 or more occurrences of character other than [=]
(=) // Match and capture a `=`
[^=]+ // Match 1 or more occurrences of character other than [=]
) // End of group