Regex: Opposite of group match - regex

This expression
(^\+\d{2})_\1
would match
+32_+32
How can I make it match the following instead?
+32_+44

If you want the opposite, you can use a negative lookahead (?!\1) to assert that what follows is not the value of group 1, and then match a + and 2 digits:
^(\+\d{2})_(?!\1)\+\d{2}
Regex demo
If you only want to match a + and 2 digits, an underscore, and again a + and 2 digits, whatever their values, you don't need the capturing group and can match the second part directly:
^\+\d{2}_\+\d{2}
Regex demo
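As a quick check of the backreference and the negative lookahead, here is a minimal sketch in Python. The language is an assumption (the question does not name one), and the variable names are only illustrative.

```python
import re

# Pattern from the question: the part after "_" must repeat group 1.
same_code = re.compile(r"(^\+\d{2})_\1")
# Pattern with the negative lookahead: the part after "_" must differ from group 1.
different_code = re.compile(r"^(\+\d{2})_(?!\1)\+\d{2}")

results = {}
for text in ["+32_+32", "+32_+44"]:
    results[text] = (bool(same_code.search(text)), bool(different_code.search(text)))
    print(text, results[text])
# +32_+32 (True, False)
# +32_+44 (False, True)
```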

Related

Match with optional positive lookahead

I've got 2 strings in the format:
Some_thing_here_1234 Match Me 1 & 1234 Match Me 1_1
In both cases I want the resultant match to be 1234 Match Me 1
So far I've got (?<=^|_)\d{4}\s.+ which works but in the case of string 2 also captures the _1 at the end. I thought I could use a lookahead at the end with an optional such as (?<=^|_)\d{4}\s.+(?=_\d{1}$|$) but it always seems to revert to the second option and so the _1 gets through.
Any help would be great
You can use
(?<=^|_)\d{4}\s[^_]+
See the regex demo.
Details:
(?<=^|_) - a positive lookbehind that matches a location that is immediately preceded with either start of string or a _ char (equal to (?<![^_]))
\d{4} - four digits
\s - a whitespace
[^_]+ - one or more chars other than _.
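A minimal sketch of this pattern in Python, which is an assumption about the target language. Python's built-in re module only accepts fixed-width lookbehinds, so the sketch uses the equivalent (?<![^_]) mentioned above instead of (?<=^|_).

```python
import re

# (?<![^_]) is the fixed-width equivalent of (?<=^|_): the position is either
# the start of the string or directly preceded by "_".
pattern = re.compile(r"(?<![^_])\d{4}\s[^_]+")

samples = ["Some_thing_here_1234 Match Me 1", "1234 Match Me 1_1"]
matches = [pattern.search(s).group() for s in samples]
print(matches)  # ['1234 Match Me 1', '1234 Match Me 1']
```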
In your second pattern (?<=^|_)\d{4}\s.+(?=_\d{1}$|$), the .+ is greedy, so it first runs to the end of the string. There the second alternative $ of the lookahead succeeds, so the whole line is matched.
Note that you can omit {1}.
If you want to use an optional part in the lookahead, you can make the match non-greedy and optionally match _\d in the lookahead with (?:_\d)?, followed by the end of the string.
(?<=^|_)\d{4}\s.+?(?=(?:_\d)?$)
See a regex demo.
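The difference between the greedy and the non-greedy version can be seen in a small Python sketch. Python is an assumption here, and because its re module needs a fixed-width lookbehind, (?<![^_]) stands in for (?<=^|_).

```python
import re

text = "1234 Match Me 1_1"

# Greedy .+ runs to the end of the string, where the "$" alternative succeeds.
greedy = re.search(r"(?<![^_])\d{4}\s.+(?=_\d{1}$|$)", text).group()
# Lazy .+? stops at the first position where the optional _\d plus "$" fits.
lazy = re.search(r"(?<![^_])\d{4}\s.+?(?=(?:_\d)?$)", text).group()

print(greedy)  # 1234 Match Me 1_1
print(lazy)    # 1234 Match Me 1
```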

Negative Lookahead not match suffix

I have an expression that is matching something, but am trying to get this not to match if it's followed by the suffix: one or more spaces, three dashes, one or more spaces, one or more digits, a slash, and finally one or more digits. Here is the expression:
(?<=(^|\s+))[A-Z]+[ ]+([0-9]+(\.[0-9]{1,3})?)/([0-9]+(\.[0-9]{1,3})?)(?!(\s+\-\-\-\s+[0-9]+/[0-9]+))
And here is the text:
January 10.5/13.5 --- 22/26 ---
It's matching January 10.5/13, but I don't want it to match anything.
As lookarounds are supported, you can change the positive lookbehind at the start to a negative lookbehind (?<!\S), which asserts a whitespace boundary to the left.
In the lookahead, you can start with .* so that it scans the rest of the line, instead of starting with one or more whitespace chars \s+. That way the assertion still fails after the pattern backtracks to a shorter match such as 10.5/13.
The negative lookahead (?!.*\s-{3}\s+[0-9]+/[0-9]) asserts that the suffix does not occur anywhere to the right.
You can omit the quantifier + after the last character class: a single digit after the slash is enough to recognise the suffix, so it does not matter whether more digits follow.
Note that in the current pattern, the decimal part is an optional capturing group 2. If you only need the whole value, which is already in group 1, you can make the decimal part an optional non-capturing group (?:...)? instead.
(?<!\S)[A-Z]+[ ]+([0-9]+(\.[0-9]{1,3})?)/([0-9]+(\.[0-9]{1,3})?)(?!.*\s-{3}\s+[0-9]+/[0-9])
Regex demo
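To see the backtracking problem and the fix side by side, here is a hedged Python sketch. Two things are assumptions: the re.IGNORECASE flag (without it [A-Z]+ cannot match "January"), and the use of (?<!\S) in both patterns, because Python's re module rejects the variable-width lookbehind from the question.

```python
import re

# Shared part of both patterns; re.IGNORECASE is an assumption, see the lead-in.
prefix = r"(?<!\S)[A-Z]+[ ]+([0-9]+(\.[0-9]{1,3})?)/([0-9]+(\.[0-9]{1,3})?)"
original = re.compile(prefix + r"(?!(\s+\-\-\-\s+[0-9]+/[0-9]+))", re.IGNORECASE)
fixed = re.compile(prefix + r"(?!.*\s-{3}\s+[0-9]+/[0-9])", re.IGNORECASE)

line = "January 10.5/13.5 --- 22/26 ---"
print(original.search(line).group())              # January 10.5/13
print(fixed.search(line))                         # None
print(fixed.search("January 10.5/13.5").group())  # January 10.5/13.5
```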

Is it possible to compare two values in a row and fetch the desired one, but both the values matches the regex written

text = "Happy 4/20 from the team! 13/10 congrats..after so many contents"
I want to fetch only 13/10 which is the rating. I have written regex
(\d+\.\d+|\d+)/(((?=10)10)|([1-9]\d+))
but it fetches the first one(4/20).
Is this possible to achieve using regex?
In this part of your pattern (?=10)10 you can omit the positive lookahead because that says if what is on the right is 10, then match 10. This part [1-9]\d+ matches 10 and above so 10 is already in the range.
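To reach the second occurrence, you could wrap the pattern in a group and repeat that group with the quantifier {2}; a repeated capturing group keeps the value of its last iteration.
Your pattern can also be written as \d+(?:\.\d+)?/[1-9]\d+
To get the second match in group 1, you might use: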
^(?:.*?(\d+(?:\.\d+)?/[1-9]\d+)){2}
^ Start of the string
(?: Non capturing group
.*? Match any char non greedy
( Capturing group
\d+(?:\.\d+)? Match 1+ digits, optionally match a dot and 1+ digits
/ Match /
[1-9]\d+ Match 10 and above
) Close capturing group
){2} Close non capturing group and repeat 2 times
Regex demo
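Applied to the text from the question, this could look like the following Python sketch. The variable names are illustrative.

```python
import re

text = "Happy 4/20 from the team! 13/10 congrats..after so many contents"

# The non-capturing group runs twice; group 1 keeps the value of the last run.
pattern = re.compile(r"^(?:.*?(\d+(?:\.\d+)?/[1-9]\d+)){2}")

match = pattern.search(text)
rating = match.group(1) if match else None
print(rating)  # 13/10
```

If the text contains fewer than two such values, the pattern does not match and rating stays None.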

Regular expression to exclude group with 0 and more occurence issue

I need to extract 1234567 from below URLs
http://www.test.in/some--wonders-1234567---2
http://www.test.in/some--wonders-1234567
I tried with .*\-([0-9]+)(?:-{2,}2)?.
but for the first URL it returned 2, but this is in non-capturing group.
Please give me a solution. I am digging it for so long. not getting any idea.
Try this one:
.*?\-([0-9]+)(?:-{2,}2|$)
This makes the first .* lazy. You can also remove it altogether, with the same effect:
\-([0-9]+)(?:-{2,}2|$)
If your regex engine supports variable-length negative lookbehinds (some do not), you can do it this way:
(?<!\d+-+)\d+
It gives you any non-empty digit string that is not preceded by digits followed by one or more minus signs.
The big advantage is that you don't need groups here: the match itself is what you want.
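A small Python sketch comparing the pattern from the question with the group-based fix on both URLs. Python is an assumption; the lookbehind variant is left out because Python's built-in re module only accepts fixed-width lookbehinds.

```python
import re

urls = [
    "http://www.test.in/some--wonders-1234567---2",
    "http://www.test.in/some--wonders-1234567",
]

attempt = re.compile(r".*\-([0-9]+)(?:-{2,}2)?")  # pattern from the question
fixed = re.compile(r"\-([0-9]+)(?:-{2,}2|$)")     # fix without the leading .*

attempt_ids = [attempt.search(u).group(1) for u in urls]
fixed_ids = [fixed.search(u).group(1) for u in urls]
print(attempt_ids)  # ['2', '1234567']
print(fixed_ids)    # ['1234567', '1234567']
```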
You could match a - followed by one or more digits which you could capture in a group ([0-9]+). This group will contain the value you want to extract.
Then add an optional part (?:-{2,}[0-9]+)? that would match ---2, followed by $ to assert the end of the line.
-(\d+)(?:-{2,}\d+)?$
Explanation
- Match literally
(\d+) Capture one or more digits in a group
(?: Non capturing group
-{2,} Match 2 or more times -
\d+ Match one or more digits
)? close non capturing group and make it optional
$ Assert position at the end of the line
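The pattern explained above can be tried with a short Python sketch (the language and the variable names are assumptions):

```python
import re

# Group 1 captures the digits; the optional "--...digits" tail must end the line.
pattern = re.compile(r"-(\d+)(?:-{2,}\d+)?$")

urls = [
    "http://www.test.in/some--wonders-1234567---2",
    "http://www.test.in/some--wonders-1234567",
]
ids = [pattern.search(u).group(1) for u in urls]
print(ids)  # ['1234567', '1234567']
```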

Regex: Match dot and dash in a 5 digit number

I'm trying to use a Regex to match only dot and dash from a number that matches in this format:
00.000-0
I'd use a two step way: first checking if the number is in this format 00.000-0 and then matching only the dot and dash, which I'd use a regex pattern like [^\d] or [\.\-].
But I'm trying to use, in a single step, a regex pattern that matches the dot after the first two digits and the dash that comes after two digits, a dot and three digits.
First, I tried in regex101.com with a positive lookahead, something like (?=\d\d)\.(?=\d\d\d)\-, but it didn't work. Then I tried (?=\d\d)\. to match at least the dot and see whether the lookahead was working, but again it didn't work.
I read in Regular-Expressions.info and, apparently, the lookahead format I tried was correct.
Is there something else I can do so that it matches the dot and dash, only for this format: 00.000-0?
You might capture the dot and the dash in a capturing group ().
From the start of the string ^ match 2 digits [0-9]{2}, then capture the dot (\.) in capturing group 1, match 3 digits [0-9]{3}, capture the dash (-) in capturing group 2 and finally match a digit [0-9] at the end of the line $.
^[0-9]{2}(\.)[0-9]{3}(-)[0-9]$
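A minimal Python sketch of the capturing-group approach. Python is an assumption, and the sample numbers are made up for illustration.

```python
import re

# Group 1 holds the dot, group 2 holds the dash; the rest only validates the format.
pattern = re.compile(r"^[0-9]{2}(\.)[0-9]{3}(-)[0-9]$")

match = pattern.search("12.345-6")
separators = match.groups() if match else None
print(separators)                  # ('.', '-')
print(pattern.search("123.45-6"))  # None
```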
If your engine supports lookbehinds, another option is to match only the dot or the hyphen itself, using lookarounds to assert the expected pattern on the left side and on the right side.
(?<=^\d{2})\.(?=\d{3}-\d$)|(?<=^\d{2}\.\d{3})-(?=\d$)
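As a sketch of the lookaround version in Python (an assumption; both lookbehinds here are fixed-width, which is what Python's re module requires), the matches are the dot and the dash themselves, so they can for example be removed in one step. The sample numbers are made up.

```python
import re

pattern = re.compile(r"(?<=^\d{2})\.(?=\d{3}-\d$)|(?<=^\d{2}\.\d{3})-(?=\d$)")

found = pattern.findall("12.345-6")
print(found)                        # ['.', '-']
print(pattern.findall("123.45-6"))  # []
print(pattern.sub("", "12.345-6"))  # 123456
```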