Why does this regex not match?

I'm wondering why the following regex works for some strings and does not work for some others:
/^([0-3]+)(?!4|.*5)[0-9]+$/
1151 -> this does not match
1141 -> this does match, but why? since I can consider .* as empty and the regex becomes /^([0-3]+)(?!4|5)[0-9]+$/
I think that I'm misunderstanding the way the look-ahead works...

Let's look at how the regular expression would parse your string, step by step.
^([0-3]+)(?!4|.*5)[0-9]+$
First, some clarification. (?!4|.*5) is a negative look-ahead that checks if either 4 or .*5 follow the last consumed character. If it does, the current match fails and steps back. It could also be written as (?!(4|.*5)) if you wanted it to be slightly more clear about how exactly | affects it.
Let's start by looking at 1141
First, [0-3]+ consumes as many characters as possible, so it will consume up to and including the 11 in 1141. What's leftover is 41. The regular expression now checks to see if 4 is after the current characters, and since ?! is a negative look-ahead, the match will fail if it is found. Since 4 follows 11, the match fails and the regular expression steps backwards and tries again.
Instead of matching two 1s, [0-3]+ now matches a single 1, with 141 left over. The look-ahead checks that 4 is not the next character, and what do you know, it's not there; .*5 cannot match either, because 141 contains no 5. Since neither alternative matched, the negative look-ahead succeeds and the regex continues on to the rest of the regular expression. 141 is matched by the final [0-9]+, and thus the entire 1141 string is matched. Remember that look-arounds do not consume characters.
Now let's look at 1151
The same thing happens as last time: 11 is consumed and we have 51 left over. Now we look at the negative look-ahead, and evaluate the rest of the string against it. Obviously, 4 is nowhere in this string so we can ignore that, so let's look at .*5.
So the look-ahead .*5 tries to match 51. If it does end up matching, just as before the match will fail and the regular expression will step back. Now if you know any regex at all, it is obvious that .*5 will match the beginning of 51 since .* can evaluate to empty.
So we step back, and now we've matched a single 1 instead of both, and we're at the negative look-ahead again.
We have currently consumed 1, still have 151 left to match, and are on the (?!4|.*5) portion of the regex. Here, 4 is obviously nonexistent in our string so it is not going to match, so let's look at .*5 again.
.*5 will match a portion of 151 since .* will consume the first 1, and the 5 will finish off by matching 5. This should also be obvious if you know regex.
So we've made a match in a negative look-ahead again, which is bad... so we step back again. [0-3]+ has no more digits to give up, and since a + cannot match zero digits, the entire string fails to match the regular expression.
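The walk-through above can be checked with a short script. This is a minimal sketch in Python, assuming its re module treats this pattern the same way as the engine in the question (the question does not say which flavour is used); the helper name first_group is only illustrative.

```python
import re

# The pattern from the question, without the surrounding slashes.
pattern = re.compile(r'^([0-3]+)(?!4|.*5)[0-9]+$')

def first_group(text):
    """Return what ([0-3]+) captured, or None if the whole pattern fails."""
    m = pattern.match(text)
    return m.group(1) if m else None

print(first_group("1141"))  # 1     ([0-3]+ backtracked from "11" to "1")
print(first_group("1151"))  # None  (a 5 is ahead of every look-ahead position)
```

Looking at the captured group rather than just the yes/no result makes the backtracking visible: for 1141 the group holds only the first 1.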

1141 matches because the regular expression engine can backtrack from matching 11 with the [0-3]+ to just matching the first 1, leaving the remaining digits to be matched by the [0-9]+.
As the next character after the first 1 is 1 and not 4, and no 5 appears later in the string, the negative look-ahead does not prevent the match.
The 1151 does not match because the negative look-ahead with the added .* prevents it.
With the added .* put before the 5 the look-ahead now means 'don't match if the next character is 4, or after any number of any characters the next character is 5' (ignoring newlines).
So even if the engine backtracks to make [0-3]+ match just the first 1 of 1151, there is still a 5 ahead in the string, so a match is prevented.
Remember that look-aheads and look-behinds are zero-width.
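To see the effect of the added .* in isolation, the two look-aheads can be compared side by side. This is a sketch in Python, assuming its re module behaves like the engine in the question; the names with_dot_star, without_dot_star and matches are made up for the illustration.

```python
import re

# Same pattern twice; only the second alternative of the look-ahead differs.
with_dot_star = re.compile(r'^([0-3]+)(?!4|.*5)[0-9]+$')
without_dot_star = re.compile(r'^([0-3]+)(?!4|5)[0-9]+$')

def matches(pattern, text):
    """True if the pattern matches the whole text."""
    return pattern.match(text) is not None

for text in ("1141", "1151"):
    print(text, matches(with_dot_star, text), matches(without_dot_star, text))
# 1141 True True
# 1151 False True
```

Without the .*, the look-ahead only inspects the next character, so backtracking [0-3]+ to a single 1 rescues both strings. With it, any 5 later in the string blocks the match.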

If you want it to match 4 or 5 the best option would be
/^[0-3]+[45][0-9]+$/
but without a better explanation of what it's supposed to do, it's hard to suggest anything more than that...

What regex flavour is that?
/^([0-3]+)(?!4|.*5)[0-9]+$/
Honestly the only way I would see it match 1141 and not 1151 is if the look-ahead part of the regex, (?!4|.*5), were evaluated as NOT 4 or .* followed by 5. If that were the case, then the regex engine would fail to find a match for 1141, as it would match the 4 but would miss the 5 to make the inner match complete.
However, usually the alternation would be understood as 4 or .*5 - which is still not equivalent to 4 or 5, because the expression .* can prove quite powerful in case when the engine wants to make a match work.
What are you testing the expression in?


regex to find word and alphanumeric pattern [duplicate]

I found this tutorial on regular expressions and while I intuitively understand what "greedy", "reluctant" and "possessive" quantifiers do, there seems to be a serious hole in my understanding.
Specifically, in the following example:
Enter your regex: .*foo // Greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.
Enter your regex: .*?foo // Reluctant quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.
Enter your regex: .*+foo // Possessive quantifier
Enter input string to search: xfooxxxxxxfoo
No match found.
The explanation mentions eating the entire input string, letters being consumed, the matcher backing off, the rightmost occurrence of "foo" being regurgitated, etc.
Unfortunately, despite the nice metaphors, I still don't understand what is eaten by whom... Do you know of another tutorial that explains (concisely) how regular expression engines work?
Alternatively, if someone can explain in somewhat different phrasing the following paragraph, that would be much appreciated:
The first example uses the greedy quantifier .* to find "anything", zero or more times, followed by the letters "f", "o", "o". Because the quantifier is greedy, the .* portion of the expression first eats the entire input string. At this point, the overall expression cannot succeed, because the last three letters ("f", "o", "o") have already been consumed [by whom?]. So the matcher slowly backs off [from right-to-left?] one letter at a time until the rightmost occurrence of "foo" has been regurgitated [what does this mean?], at which point the match succeeds and the search ends.
The second example, however, is reluctant, so it starts by first consuming [by whom?] "nothing". Because "foo" doesn't appear at the beginning of the string, it's forced to swallow [who swallows?] the first letter (an "x"), which triggers the first match at 0 and 4. Our test harness continues the process until the input string is exhausted. It finds another match at 4 and 13.
The third example fails to find a match because the quantifier is possessive. In this case, the entire input string is consumed by .*+ [how?], leaving nothing left over to satisfy the "foo" at the end of the expression. Use a possessive quantifier for situations where you want to seize all of something without ever backing off [what does back off mean?]; it will outperform the equivalent greedy quantifier in cases where the match is not immediately found.
I'll give it a shot.
A greedy quantifier first matches as much as possible. So the .* matches the entire string. Then the matcher tries to match the f following, but there are no characters left. So it "backtracks", making the greedy quantifier match one less character (leaving the "o" at the end of the string unmatched). That still doesn't match the f in the regex, so it backtracks one more step, making the greedy quantifier match one less character again (leaving the "oo" at the end of the string unmatched). That still doesn't match the f in the regex, so it backtracks one more step (leaving the "foo" at the end of the string unmatched). Now, the matcher finally matches the f in the regex, and the o and the next o are matched too. Success!
A reluctant or "non-greedy" quantifier first matches as little as possible. So the .*? matches nothing at first, leaving the entire string unmatched. Then the matcher tries to match the f following, but the unmatched portion of the string starts with "x" so that doesn't work. So the matcher backtracks, making the non-greedy quantifier match one more character (now it matches the "x", leaving "fooxxxxxxfoo" unmatched). Then it tries to match the f, which succeeds, and the o and the next o in the regex match too. Success!
In your example, it then starts the process over with the remaining unmatched portion of the string, "xxxxxxfoo", following the same process.
A possessive quantifier is just like the greedy quantifier, but it doesn't backtrack. So it starts out with .* matching the entire string, leaving nothing unmatched. Then there is nothing left for it to match with the f in the regex. Since the possessive quantifier doesn't backtrack, the match fails there.
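The three behaviours can be reproduced in a few lines. This is a sketch in Python rather than the tutorial's own test harness. Because older versions of Python's re module do not accept the .*+ syntax, the possessive case is emulated here with a look-ahead plus a backreference, which is an assumption of this example and not something from the tutorial.

```python
import re

text = "xfooxxxxxxfoo"

def spans(pattern):
    """Return (start, end) for every non-overlapping match in text."""
    return [m.span() for m in re.finditer(pattern, text)]

greedy = spans(r'.*foo')
reluctant = spans(r'.*?foo')
# Emulated possessive .*+ : the look-ahead captures as much as .* can take,
# and \1 then consumes exactly that text with no way to give characters back.
possessive_like = spans(r'(?=(.*))\1foo')

print(greedy)           # [(0, 13)]
print(reluctant)        # [(0, 4), (4, 13)]
print(possessive_like)  # []
```

The spans line up with the indices printed by the tutorial: one match for the greedy form, two for the reluctant form, none for the possessive form.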
I haven't heard the exact terms 'regurgitate' or 'backing off' before; the phrase that would replace these is "backtracking", but 'regurgitate' seems like as good a phrase as any for "the content that had been tentatively accepted before backtracking threw it away again".
The important thing to realize about most regex engines is that they are backtracking: they will tentatively accept a potential, partial match, while trying to match the entire contents of the regex. If the regex cannot be completely matched at the first attempt, then the regex engine will backtrack on one of its matches. It will try matching *, +, ?, alternation, or {n,m} repetition differently, and try again. (And yes, this process can take a long time.)
The first example uses the greedy
quantifier .* to find "anything", zero
or more times, followed by the letters
"f" "o" "o". Because the quantifier is
greedy, the .* portion of the
expression first eats the entire input
string. At this point, the overall
expression cannot succeed, because the
last three letters ("f" "o" "o") have
already been consumed (by whom?).
The last three letters, f, o, and o were already consumed by the initial .* portion of the rule. However, the next element in the regex, f, has nothing left in the input string. The engine will be forced to backtrack on its initial .* match, and try matching all-but-the-last character. (It might be smart and backtrack to all-but-the-last-three, because it has three literal terms, but I'm unaware of implementation details at this level.)
So the matcher
slowly backs off (from right-to-left?) one letter at a time
until the rightmost occurrence of
"foo" has been regurgitated (what does this mean?), at which
This means the foo had tentatively been included when matching .*. Because that attempt failed, the regex engine tries accepting one fewer character in .*. If there had been a successful match before the .* in this example, then the engine would probably try shortening the .* match (from right-to-left, as you pointed out, because it is a greedy quantifier), and if it was still unable to match the entire input, then it might be forced to re-evaluate what it had matched before the .* in my hypothetical example.
point the match succeeds and the
search ends.
The second example, however, is
reluctant, so it starts by first
consuming (by whom?) "nothing". Because "foo"
The initial nothing is consumed by .*?, which will consume the shortest possible amount of anything that allows the rest of the regex to match.
doesn't appear at the beginning of the
string, it's forced to swallow (who swallows?) the
Again the .*? consumes the first character, after backtracking on the initial failure to match the entire regex with the shortest possible match. (In this case, the regex engine is extending the match for .*? from left-to-right, because .*? is reluctant.)
first letter (an "x"), which triggers
the first match at 0 and 4. Our test
harness continues the process until
the input string is exhausted. It
finds another match at 4 and 13.
The third example fails to find a
match because the quantifier is
possessive. In this case, the entire
input string is consumed by .*+, (how?)
A .*+ will consume as much as possible, and will not backtrack to find new matches when the regex as a whole fails to find a match. Because the possessive form does not perform backtracking, you probably won't see many uses with .*+, but rather with character classes or similar restrictions: account: [[:digit:]]*+ phone: [[:digit:]]*+.
This can drastically speed up regex matching, because you're telling the regex engine that it should never backtrack over potential matches if an input doesn't match. (If you had to write all the matching code by hand, this would be similar to never using ungetc(3) to 'push back' an input character. It would be very similar to the naive code one might write on a first try. Except regex engines are way better than a single character of push-back, they can rewind all the way back to zero and try again. :)
But more than potential speed-ups, this also can let you write regexes that match exactly what you need to match. I'm having trouble coming up with an easy example :) but writing a regex using possessive vs greedy quantifiers can give you different matches, and one or the other may be more appropriate.
leaving nothing left over to satisfy
the "foo" at the end of the
expression. Use a possessive
quantifier for situations where you
want to seize all of something without
ever backing off (what does back off mean?); it will outperform
"Backing off" in this context means "backtracking" -- throwing away a tentative partial match to try another partial match, which may or may not succeed.
the equivalent greedy quantifier in
cases where the match is not
immediately found.
http://swtch.com/~rsc/regexp/regexp1.html
I'm not sure that's the best explanation on the internet, but it's reasonably well written and appropriately detailed, and I keep coming back to it. You might want to check it out.
If you want a higher-level (less detailed) explanation, for simple regular expressions such as the one you're looking at, a regular expression engine works by backtracking. Essentially, it chooses ("eats") a section of the string and tries to match the regular expression against that section. If it matches, great. If not, the engine alters its choice of the section of the string and tries to match the regexp against that section, and so on, until it's tried every possible choice.
This process is used recursively: in its attempt to match a string with a given regular expression, the engine will split the regular expression into pieces and apply the algorithm to each piece individually.
The difference between greedy, reluctant, and possessive quantifiers enters when the engine is making its choices of what part of the string to try to match against, and how to modify that choice if it doesn't work the first time. The rules are as follows:
A greedy quantifier tells the engine to start with the entire string (or at least, all of it that hasn't already been matched by previous parts of the regular expression) and check whether it matches the regexp. If so, great; the engine can continue with the rest of the regexp. If not, it tries again, but trimming one character (the last one) off the section of the string to be checked. If that doesn't work, it trims off another character, etc. So a greedy quantifier checks possible matches in order from longest to shortest.
A reluctant quantifier tells the engine to start with the shortest possible piece of the string. If it matches, the engine can continue; if not, it adds one character to the section of the string being checked and tries that, and so on until it finds a match or the entire string has been used up. So a reluctant quantifier checks possible matches in order from shortest to longest.
A possessive quantifier is like a greedy quantifier on the first attempt: it tells the engine to start by checking the entire string. The difference is that if it doesn't work, the possessive quantifier reports that the match failed right then and there. The engine doesn't change the section of the string being looked at, and it doesn't make any more attempts.
This is why the possessive quantifier match fails in your example: the .*+ gets checked against the entire string, which it matches, but then the engine goes on to look for additional characters foo after that - but of course it doesn't find them, because you're already at the end of the string. If it were a greedy quantifier, it would backtrack and try making the .* only match up to the next-to-last character, then up to the third to last character, then up to the fourth to last character, which succeeds because only then is there a foo left after the .* has "eaten" the earlier part of the string.
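The three rules can be sketched by hand for the special case of .* followed by a literal. This is a toy model written for illustration, not how any real engine is implemented. It assumes the text contains no newlines (so . matches every character) and that the match is only attempted at index 0. The function name and the mode strings are invented for the example.

```python
def try_dot_star_then(literal, text, mode):
    """Toy model of matching `.*` followed by `literal` at the start of text.

    Returns (lengths tried for .*, end index of the match or None).
    """
    n = len(text)
    if mode == "greedy":
        lengths = range(n, -1, -1)   # longest first, then give back one char at a time
    elif mode == "reluctant":
        lengths = range(0, n + 1)    # shortest first, then take one more char each time
    elif mode == "possessive":
        lengths = [n]                # longest only, never reconsidered
    else:
        raise ValueError(mode)
    tried = []
    for k in lengths:
        tried.append(k)
        # .* has eaten text[:k]; the literal must start right after it.
        if text.startswith(literal, k):
            return tried, k + len(literal)
    return tried, None

for mode in ("greedy", "reluctant", "possessive"):
    print(mode, try_dot_star_then("foo", "xfooxxxxxxfoo", mode))
# greedy ([13, 12, 11, 10], 13)
# reluctant ([0, 1], 4)
# possessive ([13], None)
```

The list of tried lengths is the order of attempts described above: longest to shortest for greedy, shortest to longest for reluctant, and a single attempt for possessive.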
Here is my take using Cell and Index positions (a Cell holds one character, and an Index is a boundary between cells, so cells 0, 1 and 2 lie between index 0 and index 3).
Greedy - Match as much as possible to the greedy quantifier and the entire regex. If there is no match, backtrack on the greedy quantifier.
Input String: xfooxxxxxxfoo
Regex: .*foo
The above Regex has two parts:
(i)'.*' and
(ii)'foo'
Each of the steps below will analyze the two parts. Additional comments explaining why a match is a 'Pass' or 'Fail' are given in parentheses.
Step 1:
(i) .* = xfooxxxxxxfoo - PASS ('.*' is a greedy quantifier and will use the entire Input String)
(ii) foo = No character left to match after index 13 - FAIL
Match failed.
Step 2:
(i) .* = xfooxxxxxxfo - PASS (Backtracking on the greedy quantifier '.*')
(ii) foo = o - FAIL
Match failed.
Step 3:
(i) .* = xfooxxxxxxf - PASS (Backtracking on the greedy quantifier '.*')
(ii) foo = oo - FAIL
Match failed.
Step 4:
(i) .* = xfooxxxxxx - PASS (Backtracking on the greedy quantifier '.*')
(ii) foo = foo - PASS
Report MATCH
Result: 1 match(es)
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.
Reluctant - Match as little as possible to the reluctant quantifier and match the entire regex. If there is no match, add characters to the reluctant quantifier.
Input String: xfooxxxxxxfoo
Regex: .*?foo
The above regex has two parts:
(i) '.*?' and
(ii) 'foo'
Step 1:
.*? = '' (blank) - PASS (Match as little as possible to the reluctant quantifier '.*?'. Index 0 having '' is a match.)
foo = xfo - FAIL (Cell 0,1,2 - i.e index between 0 and 3)
Match failed.
Step 2:
.*? = x - PASS (Add characters to the reluctant quantifier '.*?'. Cell 0 having 'x' is a match.)
foo = foo - PASS
Report MATCH
Step 3:
.*? = '' (blank) - PASS (Match as little as possible to the reluctant quantifier '.*?'. Index 4 having '' is a match.)
foo = xxx - FAIL (Cell 4,5,6 - i.e index between 4 and 7)
Match failed.
Step 4:
.*? = x - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4.)
foo = xxx - FAIL (Cell 5,6,7 - i.e index between 5 and 8)
Match failed.
Step 5:
.*? = xx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 5.)
foo = xxx - FAIL (Cell 6,7,8 - i.e index between 6 and 9)
Match failed.
Step 6:
.*? = xxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 6.)
foo = xxx - FAIL (Cell 7,8,9 - i.e index between 7 and 10)
Match failed.
Step 7:
.*? = xxxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 7.)
foo = xxf - FAIL (Cell 8,9,10 - i.e index between 8 and 11)
Match failed.
Step 8:
.*? = xxxxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 8.)
foo = xfo - FAIL (Cell 9,10,11 - i.e index between 9 and 12)
Match failed.
Step 9:
.*? = xxxxxx - PASS (Add characters to the reluctant quantifier '.*?'. Cell 4 thru 9.)
foo = foo - PASS (Cell 10,11,12 - i.e index between 10 and 13)
Report MATCH
Step 10:
.*? = '' (blank) - PASS (Match as little as possible to the reluctant quantifier '.*?'. Index 13 is blank.)
foo = No character left to match - FAIL (There is nothing after index 13 to match)
Match failed.
Result: 2 match(es)
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.
Possessive - Match as much as possible to the possessive quantifier and match the entire regex. Do NOT backtrack.
Input String: xfooxxxxxxfoo
Regex: .*+foo
The above regex has two parts: '.*+' and 'foo'.
Step 1:
.*+ = xfooxxxxxxfoo - PASS (Match as much as possible to the possessive quantifier '.*+')
foo = No character left to match - FAIL (Nothing to match after index 13)
Match failed.
Note: Backtracking is not allowed.
Result: 0 match(es)
Greedy: "match the longest possible sequence of characters"
Reluctant: "match the shortest possible sequence of characters"
Possessive: This is a bit strange as it does NOT (in contrast to greedy and reluctant) try to find a match for the whole regex.
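By the way: a regex pattern matcher does not have to use backtracking. Automaton-based implementations such as RE2 run in time roughly linear in the length of the input, nearly independent of the complexity of the regular expression. The engines in Java, Perl, PCRE and Python do backtrack, however, which is why they can offer features like possessive quantifiers.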
Greedy Quantification involves pattern matching using all of the remaining unvalidated characters of a string during an iteration. Unvalidated characters start in the active sequence. Every time a match does not occur, the character at the end is quarantined and the check is performed again.
When only leading conditions of the regex pattern are satisfied by the active sequence, an attempt is made to validate the remaining conditions against the quarantine. If this validation is successful, matched characters in the quarantine are validated and residual unmatched characters remain unvalidated and will be used when the process begins anew in the next iteration.
The flow of characters is from the active sequence into the quarantine. The resulting behavior is that as much of the original sequence is included in a match as possible.
Reluctant Quantification is mostly the same as greedy quantification except the flow of characters is the opposite--that is, they start in the quarantine and flow into the active sequence. The resulting behavior is that as little of the original sequence is included in a match as possible.
Possessive Quantification does not have a quarantine and includes everything in a fixed active sequence.

regex to match one and only one digit

I need to match a single digit, 1 through 9. For example, 3 should match but 34 should not.
I have tried:
\d
\d{1}
[1-9]
[1-9]{1}
[1-9]?
They all match 3 and 34. I am using regex for this because it is part of a much larger expression in which I am using alternation.
The problem with all of your examples, of course, is that they match the digit, but don't keep themselves from matching multiple digits next to each other.
In the following example:
Some text with a 3 and a 34 and what about b5 and 64b?
This regex will match only the lone 3. It uses word boundaries, a handy feature.
\b[1-9]\b
It gets more complicated if you want to match single digits inside words, like the 5 in my example, but you didn't specify if you'd want that, so I'll leave that out for now.
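The word-boundary pattern can be tried on the sample sentence directly. This is a sketch in Python. The second pattern, which uses look-arounds to also pick up lone digits inside words, is only one possible approach and is an assumption of this example, not something the question asked for.

```python
import re

text = "Some text with a 3 and a 34 and what about b5 and 64b?"

# Word boundaries: the digit must not touch any other word character.
lone_digits = re.findall(r'\b[1-9]\b', text)

# Assumed alternative: only require that no other digit is adjacent,
# so a single digit glued to letters (the 5 in "b5") also counts.
not_next_to_digit = re.findall(r'(?<!\d)[1-9](?!\d)', text)

print(lone_digits)        # ['3']
print(not_next_to_digit)  # ['3', '5']
```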

Regex - Why does the question mark behave like this?

I'm learning regex. When I match this:
\d[^\w]\d
on this
30-01-2003 15:20
I get 4 matches: 0-0, 1-2, 3 1, and 5:2.
When I try adding a question mark at the end of the regex (\d[^\w]\d?), my matches don't change.
When I move the question mark to after the square bracket (\d[^\w]?\d), the matches are now 30, 01, 20, 03, 15, and 20.
When I move the question mark to before the square bracket (\d?[^\w]\d), my matches are the same as in the first case.
Why is this? I know the ? operator marks the preceding character as optional, so I expected the behaviour in the second case, but not in the first or third.
Because ? is a greedy match. It will attempt to consume as much as possible. So, if a \d is present, it will always grab it.
Think of the ? at the end as defining two regexes: \d[^\w]\d and \d[^\w]. In your test case, you never have a match where the first regex doesn't match and the second one does (without overlaps, again, it's greedy). That's why your matches never change. If, however, you changed your test case to this:
30-01-2003 15:20/
You'll get an extra match of 0/ depending on whether or not you include the question mark at the end of the regex.
Your first and third cases produce the same results as the original only because of the particular string you're searching - they are NOT equivalent searches in general. Specifically, every occurrence of \d[^\w] in your string happens to be followed by a digit, so making the trailing digit optional does not change any of the matches. Likewise, every occurrence of [^\w]\d happens to be preceded by a digit. If your string had two spaces together, or a doubled punctuation mark somewhere, the results would differ for each case.
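The four variants, and the inputs on which they start to differ, can be compared with a short script. This is a sketch in Python, assuming its re module matches these simple patterns the same way as the tool used in the question; the string "15  20" with a doubled space is a made-up test input.

```python
import re

text = "30-01-2003 15:20"

original = re.findall(r'\d[^\w]\d', text)
trailing_optional = re.findall(r'\d[^\w]\d?', text)
middle_optional = re.findall(r'\d[^\w]?\d', text)
leading_optional = re.findall(r'\d?[^\w]\d', text)

print(original)           # ['0-0', '1-2', '3 1', '5:2']
print(trailing_optional)  # ['0-0', '1-2', '3 1', '5:2']
print(middle_optional)    # ['30', '01', '20', '03', '15', '20']
print(leading_optional)   # ['0-0', '1-2', '3 1', '5:2']

# A trailing slash gives the optional last digit something to skip.
with_slash = re.findall(r'\d[^\w]\d?', text + "/")
print(with_slash)         # ['0-0', '1-2', '3 1', '5:2', '0/']

# A doubled space separates all three variants (hypothetical input).
doubled = "15  20"
print(re.findall(r'\d[^\w]\d', doubled))   # []
print(re.findall(r'\d[^\w]\d?', doubled))  # ['5 ']
print(re.findall(r'\d?[^\w]\d', doubled))  # [' 2']
```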
You just need this.
Two solutions:
1. Regex:
\d+
1. Explanation:
\d => digits
+ => 1 or more
2. Regex:
[0-9]+
2. Explanation:
[0-9] => digits
+ => 1 or more
Either solution will match all the numbers.
Original text:
30-01-2003 15:20
Result:
30
01
2003
15
20
Enjoy.
See: https://regex101.com/r/xXaLgN/6

What is the point of having * in a regular expression

Recently I have been wondering why we need * in regular expressions. For example, if we want to represent A0,A1..,Z99, we can do:
[A-Z][0-9][0-9]*
But A0A (which is not what we want) is also valid according to the above. What benefit does the * give me?
* is just a quantifier, matching between zero and unlimited times.
[A-Z][0-9][0-9]* matches A0,A1..,Z99 and also A10000,Z123456789...
Remember that if you don't use ^ and $ as anchors, the processor will match the specified part and return true even if the input contains more characters, because you haven't said that you want a positive result ONLY if the entire input matches the regex.
If your goal is to match just A0,A1..,Z99, the regex should be:
^[A-Z][0-9][0-9]?$
Or simply:
^[A-Z]\d{1,2}$
\d means 'digit', and is the same as [0-9].
{1,2} means at least 1 time and no more than 2 times.
? also is a quantifier, matching 0 or 1 time.
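The effect of the anchors is easy to check. This is a sketch in Python, assuming re.search behaves like the "processor" described above, which looks for the pattern anywhere in the input; the variable names are illustrative.

```python
import re

unanchored = re.compile(r'[A-Z][0-9][0-9]*')
anchored = re.compile(r'^[A-Z][0-9][0-9]?$')

# Without anchors the engine is satisfied by the "A0" part of "A0A".
print(unanchored.search("A0A").group())  # A0

# With anchors the whole input has to fit the pattern.
print(anchored.search("A0A"))            # None
print(anchored.search("Z99").group())    # Z99
print(anchored.search("A100"))           # None
```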
But A0A (which is not what we want) is also valid
No it is not valid, you just need to use anchors:
^[A-Z][0-9][0-9]*$
^ will ensure this matches at line start and $ ensures it matches till line end.
Also if only 2nd digit is optional then better to use:
^[A-Z][0-9][0-9]?$
Since * matches 0 or more times whereas ? matches 0 or 1 time.
It seems you're trying to match strings that start with an uppercase letter followed by a number ranging from 0 to 99.
^[A-Z][1-9]?[0-9]$
^ asserts that we are at the start and $ asserts that we are at the end. So this helps to do an exact string match. It won't match at the middle or start or at the end of a string or line. That is, [A-Z][1-9]?[0-9] will match A10 in fooA10 string but ^[A-Z][1-9]?[0-9]$ won't produce a match in fooA10 string.

About this regular expression (?<=\d)\d{4}

I use (?<=\d)\d{4} to match 1234567890, and the result is 2345 6789.
Why isn't it 2345 7890?
In the second match, it starts from 6 and 6 is matched by (?<=\d), so I think the result should be 7890 rather than 6789.
Besides, what happens when using ((?<=\d)\d{3})+ to match 1234567890?
Look behinds are non consuming, so the 5 is being "reused" in the second match (even though the first match consumed it).
If you want to start at 6, consume but don't capture:
\d(\d{4})
And use group 1, or if your regex engine supports it, use a negative look-ahead for \G, which matches at the end of the previous match:
(?!\G)(?<=\d)\d{4}
(?<=\d) is Zero-Length Assertion, assertions do not consume characters in the string, but only assert whether a match is possible or not.
It matches this way as the first match finishes at 5 so the next group can be matched from 6. (?<=\d) matches 5 in this case and the match is on 6789, starting with 6.
(?<=\d) doesn't belong to the match, it doesn't consume a character, it's just asserting what is in front of the match.
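The difference between asserting a digit and consuming one shows up directly in the match lists. This is a sketch in Python. The \G variant is left out because, as far as I know, Python's built-in re module does not support \G.

```python
import re

text = "1234567890"

# Look-behind: the preceding digit is checked but not consumed,
# so the second match may start right where the first one ended.
asserted = re.findall(r'(?<=\d)\d{4}', text)

# Consuming the leading digit instead: findall returns group 1 only.
consumed = re.findall(r'\d(\d{4})', text)

print(asserted)  # ['2345', '6789']
print(consumed)  # ['2345', '7890']
```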
(?<=\d)\d{4}
?<= Lookbehind. Makes sure a digit precedes the text to be matched.
What text are we matching? \d{4}. So the meaning is: match those 4 digits which are preceded by one digit.
In 1234567890 such a match is 2345, as it is preceded by 1. Now we have got one match, and the string to be matched is still 1234567890. Checking the regex condition again tells the engine to find a group of four digits that has a digit as a prefix. Since 2345 has already been matched, the next successful match is 6789, which is preceded by 5, satisfying the regex conditions.
Coming to (?<=\d)\d{3}: it does the same thing as before, only it makes groups of 3. To get the regex you mentioned, we wrap the whole thing in a capture group, ((?<=\d)\d{3}), and say one or more of this: ((?<=\d)\d{3})+. A repeated capturing group will only capture the last iteration.
So the overall match is 234567890, and 890 is what the capture group returns.
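The repeated group can be inspected by printing the overall match and the group separately. This is a sketch in Python, assuming its re module handles repeated capture groups like the engine in the question.

```python
import re

m = re.search(r'((?<=\d)\d{3})+', "1234567890")

print(m.group(0))  # 234567890  (three iterations: 234, 567, 890)
print(m.group(1))  # 890        (only the last iteration is kept)
```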