Full match only if the capturing group encountered once - regex

The pattern:
(test):(thestring)
What I want is a full match only if there is exactly one test: before thestring:
test:thestring
But in this case there should be no full match:
test:test:thestring
I've tried a quantifier, but it didn't work.
Need help

Try this pattern: ^(?!.*((?(?<=^)|(?<=:))test(?=(:|$))).*(?1)).+$.
The main part is ((?(?<=^)|(?<=:))test(?=(:|$))), which matches test if it is preceded by a colon : or is at the beginning of the line, and is followed by a colon : or the end of the line.
(?(?<=^)|(?<=:)) is a workaround for (?<=(:|^)), which cannot be used because lookbehinds must have a fixed length.
Then (?1) is a subroutine call (not a backreference) that re-runs the pattern of the first capturing group, to see whether there is any other test.
This whole pattern is placed in a negative lookahead (?!...), so that .+ matches everything only if the pattern explained above (test occurring more than once) does not match.
Demo
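The conditional (?(...)...) and the subroutine call (?1) are not available in every engine; Python's built-in re module, for one, rejects both. Below is a minimal sketch of the same idea for such an engine, assuming it is acceptable to repeat the sub-pattern by hand. The names TOKEN, ONLY_ONE_TEST and has_single_test are made up for illustration and do not come from the answer above.

```python
import re

# "test" followed by a colon or the end of the string (the lookahead does not consume).
TOKEN = r"test(?=:|$)"

# Negative lookahead: fail if a first "test" (at the start or after a colon)
# is later followed by a second ":test". Otherwise .+ matches the whole string.
ONLY_ONE_TEST = re.compile(rf"^(?!.*(?:^|:){TOKEN}.*:{TOKEN}).+$")


def has_single_test(text: str) -> bool:
    """Return True if the string contains at most one delimited 'test'."""
    return ONLY_ONE_TEST.match(text) is not None


if __name__ == "__main__":
    for sample in ("test:thestring", "test:test:thestring"):
        print(sample, has_single_test(sample))
```

Because the second occurrence can never sit at the start of the string, the sketch only looks for it after a colon, which removes the need for the lookbehind workaround.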

for this very specific case:
(?<!.)(test:thestring)
Regex101
All it does is search for the string test:thestring and ensures that there are no characters before it. (Use Michał Turczyn's regex for an all purpose search!)

^((?!test:).)*(test:thestring)
See in action

If you want a full match and test: should occur only once before thestring, you might assert the start of the string ^ and use (?:(?!test:).)* which matches any character as long as what is directly to its right is not test:
Then match test:thestring, followed by (?:(?!test:thestring).)* which matches any character as long as what is directly to its right is not test:thestring, and assert the end of the string $
^(?:(?!test:).)*test:thestring(?:(?!test:thestring).)*$
Regex demo
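This pattern uses only lookaheads, so it should carry over to most engines. Here is a small sketch with Python's re module to check it against the two inputs from the question; the names SINGLE_PREFIX and is_full_match are illustrative only.

```python
import re

# Any characters not starting "test:", then the literal target,
# then any characters not starting another "test:thestring".
SINGLE_PREFIX = re.compile(
    r"^(?:(?!test:).)*test:thestring(?:(?!test:thestring).)*$"
)


def is_full_match(text: str) -> bool:
    """Return True if the whole string matches with a single 'test:' prefix."""
    return SINGLE_PREFIX.match(text) is not None


if __name__ == "__main__":
    print(is_full_match("test:thestring"))       # single prefix
    print(is_full_match("test:test:thestring"))  # doubled prefix
```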

Related

How to combine two or more regex and match if all are true

I want to run two regex on a string. If both match, give match.
I use this to check that the string contains /en/
(\b(\w*\/en\/\w*)\b)
'https://google.com/en/article/?mobile=true' gives match
'https://google.com/ru/article/?mobile=true' gives no match
I also use this to check that there is no ? in the string
(^[^\?]*$)
'https://google.com/en/article/' gives match since it doesn't include ?
'https://google.com/en/article/?mobile=true' does not give match
I tried adding them together like this:
(^[^\?]*$)(\b(\w*\/en\/\w*)\b)
However it produces no match in any case. I assume it has to do with pointer position and that the second () needs to specify that checking should start from the beginning?
You could turn the first pattern into a positive lookahead (?= if that is supported:
^(?=.*\/en\/)[^?]+$
^ Start of string
(?=.*\/en\/) Positive lookahead, asserts that what is on the right contains /en/
[^?]+ Match any character except ? one or more times
$ End of string.
Regex demo
(?=^[^\?]*$)(?=.*\/en\/.*).*
Both of your regexes can be placed into positive lookaheads to ensure the following string matches them. This can be repeated for any number of conditions you would like to be true.
Demo
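As a rough illustration of stacking conditions as lookaheads, here is a sketch in Python that checks the three URLs from the question. The names HAS_EN, NO_QUERY and matches_both are invented for this example.

```python
import re

HAS_EN = r"(?=.*/en/)"      # condition 1: the string contains /en/
NO_QUERY = r"(?=[^?]*$)"    # condition 2: no ? anywhere up to the end

# Both lookaheads are checked from the start of the string;
# .* then consumes the text once every condition holds.
BOTH = re.compile(rf"^{NO_QUERY}{HAS_EN}.*$")


def matches_both(url: str) -> bool:
    """Return True only if every lookahead condition holds."""
    return BOTH.match(url) is not None


if __name__ == "__main__":
    for url in (
        "https://google.com/en/article/",
        "https://google.com/en/article/?mobile=true",
        "https://google.com/ru/article/",
    ):
        print(url, matches_both(url))
```

Since lookaheads consume no characters, each one starts from the same position, which is why the concatenation attempted in the question could not work but this form does.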

How to create proper regular expression to find last character which I want to?

I need to create a regex to find the last underscore in a string like 012344_2.0224.71_3 or 012354_5.00123.AR_3.335_8
I wanted to find the last part with the expression [^.]+$ and then find the underscore within that part, but I cannot get it to work.
I hope you can help me :)
Just use a negated character class [^_], which matches any character except an underscore (this ensures that no other underscores follow), and the end-of-string anchor $.
The pattern would look like this:
(_)[^_]*$
The final underscore _ is in a capturing group, so you can work with the submatch: to replace the underscore, replace group 1.
See it live: Regex101
Notice the green highlighted portion on Regex101, this is your submatch and is what would be replaced.
The simplest solution I can imagine is using .*\K_, however not all regex flavours support \K.
If not, another idea would be to use _(?=[^_]*$)
You have a demo of the first and second option.
Explanation:
.*\K_: Matches any characters up to an underscore. Since the * quantifier is greedy, it will match up to the last underscore. Then \K discards what has been matched so far, and then we match the underscore.
_(?=[^_]*$): Matches an underscore followed only by non-underscore characters until the end of the line.
If you want nothing but the "net" (i.e., nothing matched except the last underscore), use positive lookahead to check that no more underscores are in the string:
/_(?=[^_]*$)/gm
Demo
The pattern [^.]+$ matches any character except a dot 1+ times and then asserts the end of the string. This will give you the matches 71_3 and 335_8.
What you want to match is an underscore when there are no more underscores following.
One way to do that is using a negative lookahead (?!.*_), if that is supported, which asserts that what is on the right is not any characters followed by an underscore:
_(?!.*_)
Pattern demo
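To compare the lookahead variants on the sample strings, here is a short Python sketch. Python's built-in re module does not support \K, so only the two lookahead patterns are used; the helper names are made up for illustration.

```python
import re

# Underscore followed only by non-underscore characters up to the end.
LAST_BY_POSITIVE = re.compile(r"_(?=[^_]*$)")
# Underscore with no further underscore anywhere to its right.
LAST_BY_NEGATIVE = re.compile(r"_(?!.*_)")

SAMPLES = ["012344_2.0224.71_3", "012354_5.00123.AR_3.335_8"]


def last_underscore_index(text: str, pattern: re.Pattern = LAST_BY_POSITIVE) -> int:
    """Return the index of the last underscore, or -1 if there is none."""
    match = pattern.search(text)
    return match.start() if match else -1


def replace_last_underscore(text: str, replacement: str) -> str:
    """Replace only the last underscore, leaving earlier ones untouched."""
    return LAST_BY_POSITIVE.sub(replacement, text)


if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, last_underscore_index(sample), replace_last_underscore(sample, "-"))
```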

Regex excluding matches that end with a dot

First of all, I don't need full e-mail address validation, my given task doesn't require it. I just want to upgrade my current regex code so that it won't match addresses ending with a dot.
My current code: [0-9A-Za-z.]+[#][0-9A-Za-z.]+
It catches both "user#example.com" and "user#example.com."
I'd like it to catch only the string that does not end with a dot: user#example.com
Example string:
dasd.fas#fsaf.dfas.dsa, zghs#gas.gsq, adg32.dsa12#cas, ksak#c.csa., gs32.basaa#scaa.upc.
I'd like to catch the strings marked as code in the example.
Edit: I have only one line with multiple e-mail addresses separated with a , and a space after them.
You might add [0-9A-Za-z] after your regex, so that the match ends with a character from your character class other than the dot, followed by a positive lookahead (?=, |$) that asserts that what follows is either a comma and a space or the end of the string.
[0-9A-Za-z.]+#[0-9A-Za-z.]+[0-9A-Za-z](?=, |$)
Regex Demo
([0-9A-Za-z.]+#(?:\.?[0-9A-Za-z]+)+)(?=,|$)
Try it here
Just slightly modify your pattern: [0-9A-Za-z.]+[#](?:[a-zA-Z]|\.(?=[a-zA-Z]))+.
It uses an alternation after # to match one or more of: a letter, or a dot if it is followed by another letter, thanks to the positive lookahead \.(?=[a-zA-Z]).
Demo
Try this one:
Require an alphanumeric character at the end, so the match cannot end with a dot, and then match either a comma or the end of the string:
[0-9A-Za-z.]+[#][0-9A-Za-z.]+[0-9A-Za-z](?:(,|$))
demo here
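For a quick check against the example string from the question, here is a sketch using the lookahead variant [0-9A-Za-z.]+#[0-9A-Za-z.]+[0-9A-Za-z](?=, |$) with Python's re.findall. The names NO_TRAILING_DOT and find_addresses are illustrative.

```python
import re

# The last character must be alphanumeric, and the match must be followed
# by ", " or the end of the string (the lookahead itself is not consumed).
NO_TRAILING_DOT = re.compile(r"[0-9A-Za-z.]+#[0-9A-Za-z.]+[0-9A-Za-z](?=, |$)")

TEXT = "dasd.fas#fsaf.dfas.dsa, zghs#gas.gsq, adg32.dsa12#cas, ksak#c.csa., gs32.basaa#scaa.upc."


def find_addresses(text: str) -> list[str]:
    """Return the addresses that do not end with a dot."""
    return NO_TRAILING_DOT.findall(text)


if __name__ == "__main__":
    print(find_addresses(TEXT))
    # ['dasd.fas#fsaf.dfas.dsa', 'zghs#gas.gsq', 'adg32.dsa12#cas']
```

The lookahead is what rejects ksak#c.csa. entirely: without it, the engine would backtrack and report the shorter ksak#c.csa as a match.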

Regex in middle of text doesn't match

I have a regex to find URLs in text:
^(?!:\/\/)([a-zA-Z0-9-_]+\.)*[a-zA-Z0-9][a-zA-Z0-9-_]+\.[a-zA-Z]{2,11}?$
However it fails when it is surrounded by text:
https://regex101.com/r/0vZy6h/1
I can't seem to grasp why it's not working.
Possible reasons why the pattern does not work:
^ and $ anchor the pattern to the start and end, so it can only match the entire string
(?!:\/\/) is a negative lookahead that fails the match if, immediately to the right of the current location, there is :// substring. But [a-zA-Z0-9-_]+ means there can't be any ://, so, you most probably wanted to fail the match if :// is present to the left of the current location, i.e. you want a negative lookbehind, (?<!:\/\/).
[a-zA-Z]{2,11}? would match only 2 chars once $ is removed, since {2,11}? is a lazy quantifier, and when such a pattern is at the end of the regex it always matches the minimum number of chars, here 2.
Use
(?<!:\/\/)([a-zA-Z0-9-_]+\.)*[a-zA-Z0-9][a-zA-Z0-9-_]+\.[a-zA-Z]{2,11}
See the regex demo. Add \b word boundaries if you need to match the substrings as whole words.
Note that in Python regex there is no need to escape /; you may replace (?<!:\/\/) with (?<!://).
The spaces are not being matched. Try adding space to the character sets checking for leading or trailing text.
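The three points above (anchors, lookahead versus lookbehind, lazy quantifier) can be tried out with a small Python sketch. It is only an approximation of the patterns in this thread: the repeated group is made non-capturing, the hyphen is moved to the end of the character class, and \b boundaries are added as suggested. HOST, ANCHORED and UNANCHORED are names invented for this example.

```python
import re

# Host part shared by both variants (same character set as in the question).
HOST = r"(?:[a-zA-Z0-9_-]+\.)*[a-zA-Z0-9][a-zA-Z0-9_-]+\.[a-zA-Z]{2,11}"

# Original shape: anchored at both ends, so surrounding text prevents a match.
ANCHORED = re.compile(rf"^(?!://){HOST}$")
# Suggested shape: no anchors, a negative lookbehind, and word boundaries.
UNANCHORED = re.compile(rf"(?<!://)\b{HOST}\b")


def find_host(text: str):
    """Return the first host-like substring, or None if there is none."""
    match = UNANCHORED.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(ANCHORED.search("see example.com now"))   # None: anchors need the whole string
    print(find_host("see example.com now"))         # example.com
    # A lazy quantifier at the end of a pattern stops at its minimum:
    print(re.search(r"\.[a-zA-Z]{2,11}?", "example.museum").group(0))  # .mu
```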

RegEx - String To Help Match

I read somewhere that it is possible to have a RegEx in which strings preceding and following are not to be matched, but instead help with ambiguities.
For example, I would like a RegEx that matches only "TESTING" from the second line ("defTESTINGghi") and nothing from line one and line three.
abcTESTINGdef
defTESTINGghi
ghiTESTINGjkl
If supported, you can use the \K escape sequence. \K resets the starting point of the reported match, and any previously consumed characters are no longer included. The positive lookahead asserts that the preceding text is followed by ghi.
def\KTESTING(?=ghi)
Live Demo
Or, depending on what you mean by the preceding and following strings not being matched, why not simply use a capturing group to capture only the desired subpattern?
def(TESTING)ghi
Live Demo
You could try the below regexes to match the string TESTING only on the second line,
Through positive lookahead and lookbehind,
(?<=def)TESTING(?=ghi)
Matches the string TESTING only if it is immediately preceded by def and followed by ghi.
Through positive lookahead,
TESTING(?=ghi)
Matches the string TESTING only if it's followed by ghi.
Through negative lookahead,
TESTING(?!def|jkl)
Matches the string TESTING if it's not followed by def or jkl.
Reference
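The lookaround and capturing-group variants can be compared side by side with a short Python sketch. Python's built-in re module does not support \K, so that variant is left out; the names LINES, PATTERNS and matching_lines are illustrative.

```python
import re

LINES = ["abcTESTINGdef", "defTESTINGghi", "ghiTESTINGjkl"]

PATTERNS = {
    "lookbehind + lookahead": r"(?<=def)TESTING(?=ghi)",
    "positive lookahead": r"TESTING(?=ghi)",
    "negative lookahead": r"TESTING(?!def|jkl)",
    "capturing group": r"def(TESTING)ghi",
}


def matching_lines(pattern: str) -> list[int]:
    """Return the 1-based numbers of the lines on which the pattern matches."""
    return [number for number, line in enumerate(LINES, start=1) if re.search(pattern, line)]


if __name__ == "__main__":
    for label, pattern in PATTERNS.items():
        print(label, matching_lines(pattern))
    # With the capturing group, the overall match includes def and ghi,
    # but group 1 holds only the wanted text.
    print(re.search(PATTERNS["capturing group"], LINES[1]).group(1))
```

With the lookarounds, the reported match is exactly TESTING; with the capturing group, the surrounding def and ghi are part of the match and the wanted text has to be read from group 1.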