Regex - Nested matches - regex

Given the regex \d\[\w*] and the input string asd3[bc]de, the match is 3[bc].
When given input such as 3[bc4[de]] that has nested matches, it matches the inner pattern 4[de] and not the outer one. Why is this so? Is there a way to force the regex to match the outer pattern?

\w won't match a '['.
The \d\[ matches the 3[, then \w* matches bc4 but cannot match the inner '[', and the ] in the pattern cannot match it either. So the regex engine backtracks, gives up on that starting position, and looks for another match for \d\[. That matches the 4[, \w* matches de, and then the ] matches.
Some regex engines, such as PCRE and Perl, support recursive patterns that can match nested items; JavaScript's built-in RegExp does not.
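The backtracking described above can be observed directly. The following is a minimal sketch in JavaScript; the variable names are illustrative only and not part of the question.

```javascript
// The pattern from the question: a digit, '[', word characters, ']'.
const pattern = /\d\[\w*]/;

// Flat input: the first candidate start (the 3) succeeds.
const flat = "asd3[bc]de".match(pattern);

// Nested input: the attempt starting at 3[ fails at the inner '[',
// so the engine moves on and succeeds at the 4[ further right.
const nested = "3[bc4[de]]".match(pattern);

console.log(flat[0], flat.index);     // 3[bc] 3
console.log(nested[0], nested.index); // 4[de] 4
```

The index of the second match shows that the engine did not pick the inner pattern by preference; the outer attempt simply failed first.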

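// Handles exactly one level of nesting; group 1 captures the inner item.
const re = /\d\[\w*]?(\d\[\w*])]/;
const str = "3[bc4[de]]";
console.log([...str.match(re)]); // [ '3[bc4[de]]', '4[de]' ]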

Related

regex for matching phrases even if its within a bracket

I want to match 'marks', 'marks-for-days', 'reactions(98%)' and 'fun-for me'.
All of those should match even without the brackets. For example:
marksman - no match
marksreaction - no match
but
marks (98%) - match
reactions 98% - match
fun for me - match
there are no fun only marks - match
I tried basic word matching, but it doesn't work: \w*(marks|reactions|fun for me)\w*.
You are matching only spaces between the words, not the -.
A character class [- ] matches either a space or a hyphen, and word boundaries (\b) prevent partial matches such as marksman.
\b(?:marks|reactions|fun[- ]for[- ]me)\b
If you want to match the whole line you can add .* before and after the pattern:
.*\b(?:marks|reactions|fun[- ]for[- ]me)\b.*
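To check the word-boundary pattern against the examples from the question, a small JavaScript sketch may help. The sample list and variable names are assumptions made for illustration.

```javascript
// Word boundaries stop partial hits; [- ] accepts a space or a hyphen.
const phrase = /\b(?:marks|reactions|fun[- ]for[- ]me)\b/;

const samples = [
  "marksman",
  "marksreaction",
  "marks (98%)",
  "reactions 98%",
  "fun for me",
  "fun-for me",
  "marks-for-days",
  "there are no fun only marks",
];

// No g flag is used, so test() keeps no state between calls.
const results = samples.map((s) => [s, phrase.test(s)]);
for (const [s, ok] of results) {
  console.log(`${s} -> ${ok ? "match" : "no match"}`);
}
```

Only the first two samples report no match, because marks is followed directly by another word character there.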

How to match the closest pattern on a capture group excluding overlap? [duplicate]

Given an input string fooxxxxxxfooxxxboo I am trying to write a regex that matches fooxxxboo i.e. starting from the second foo till the last boo.
I tried the following
foo.*?boo matches the complete string fooxxxxxxfooxxxboo
foo.*boo also matches the complete string fooxxxxxxfooxxxboo
I read this Greedy vs. Reluctant vs. Possessive Quantifiers and I understand their difference, but I am trying to match the shortest string from the end which matches the regex i.e. something like the regex to be evaluated from back.
Is there any way I can match only the last portion?
Use a negative lookahead assertion.
foo(?:(?!foo).)*?boo
(?:(?!foo).)*? - A non-greedy match of zero or more characters, none of which may begin the string foo. That is, before matching each character, the lookahead checks that the text at that position is not the letter f followed by two o's. Only if that check passes is the character matched.
Why does the regex foo.*?boo match the complete string fooxxxxxxfooxxxboo?
Because the regex engine scans from left to right and reports the leftmost match. The foo in your regex matches the first foo in the input, and the following .*? then extends one character at a time up to the first boo. Non-greediness only limits how far a match extends to the right; it does not move the start of the match. The shorter candidate fooxxxboo lies inside that first match, so the engine never reports it.
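The difference between the lazy pattern and the lookahead version can be seen side by side. This is a minimal JavaScript sketch, and the variable names are illustrative.

```javascript
const sample = "fooxxxxxxfooxxxboo";

// Lazy quantifier alone: the match still starts at the leftmost foo.
const lazy = /foo.*?boo/;

// Each consumed character is guarded by (?!foo), so the match
// cannot run across a second foo and must start at the last one.
const tempered = /foo(?:(?!foo).)*?boo/;

const lazyMatch = sample.match(lazy)[0];
const temperedMatch = sample.match(tempered)[0];

console.log(lazyMatch);     // fooxxxxxxfooxxxboo
console.log(temperedMatch); // fooxxxboo
```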
.*(foo.*?boo)
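Try this and grab the capture group, i.e. $1 or \1. The greedy .* consumes as much of the input as it can, so the group starts at the last foo that is still followed by boo.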
See demo.
https://regex101.com/r/nL5yL3/9
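The greedy-prefix approach can be sketched in JavaScript as follows. Note that the full match is still the whole string; only capture group 1 holds the wanted portion. The variable names are assumptions for illustration.

```javascript
const text = "fooxxxxxxfooxxxboo";

// The greedy .* takes everything, then backtracks just far enough
// for foo.*?boo to match, which lands on the last foo.
const prefixed = /.*(foo.*?boo)/;

const m = text.match(prefixed);
const whole = m[0];
const captured = m[1];

console.log(whole);    // fooxxxxxxfooxxxboo
console.log(captured); // fooxxxboo
```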

Regex match two words or at least one

I have a problem with my regex. I have two kinds of input strings, as follows:
2.3.8.2.2.1.2.3.4.12345 = WORDS: "String to capture"
2.3.8.2.2.1.2.3.4.12345 = ""
Regex:
1\.2\.3\.4\.(\d+) = WORDS: (?|"([^"]*)|([^:]*))
https://regex101.com/r/kQ3wT5/10 - matching
https://regex101.com/r/kQ3wT5/9 - Not matching
This regex matches only the first string and not the second, where the value is an empty string. The regex has to match in both scenarios. One more thing: I really don't want to use a "global" match.
Please help me with this.
You need to make WORDS:<space> optional by enclosing it with an optional non-capturing group:
1\.2\.3\.4\.(\d+) = (?:WORDS: )?(?|"([^"]*)|([^:]*))
The (?:WORDS: )? matches one or zero occurrences (due to the ? quantifier) of the substring WORDS: followed by a space.
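The effect of the optional group can be sketched in JavaScript, with one caveat: JavaScript's RegExp has no branch-reset group (?|...), so this sketch swaps in a plain non-capturing group and reads whichever of groups 2 and 3 participated. The function name parseLine and the returned object shape are hypothetical, not part of the original answer.

```javascript
// Original idea, WORDS: required (branch reset replaced by (?:...)).
const strictRe = /1\.2\.3\.4\.(\d+) = WORDS: (?:"([^"]*)|([^:]*))/;

// Fixed idea: "WORDS: " wrapped in an optional non-capturing group.
const lineRe = /1\.2\.3\.4\.(\d+) = (?:WORDS: )?(?:"([^"]*)|([^:]*))/;

// Hypothetical helper: returns the id and the value, or null.
function parseLine(line, re = lineRe) {
  const m = line.match(re);
  if (m === null) return null;
  // Without branch reset, the value is in group 2 or group 3.
  return { id: m[1], value: m[2] ?? m[3] };
}

const withWords = '2.3.8.2.2.1.2.3.4.12345 = WORDS: "String to capture"';
const emptyValue = '2.3.8.2.2.1.2.3.4.12345 = ""';

console.log(parseLine(withWords));            // value: "String to capture"
console.log(parseLine(emptyValue));           // value: ""
console.log(parseLine(emptyValue, strictRe)); // null
```

In an engine that supports branch reset, such as PCRE, the value would always be in group 2 and the fallback to group 3 would be unnecessary.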

regex pattern selects alternates matches

Regex pattern
/("[^:=,]+":")(.*?)("}*\]*}*,")/
String :
"foo":""fooooooooooooooooooo"foooo","bar":"barrrrrrrrr""barrrrrr","fooo":"foooooo","bar":"barrrrrr","
Matches the first and the third pattern
http://rubular.com/r/S5fbsSfCjy
String:
"bar":"barrrrrrrrr""barrrrrr","fooo":"foooooo","bar":"barrrrrr","foo":""fooooooooooooooooooo"foooo","
Matches the first and the third pattern
http://rubular.com/r/hDfcBCkB2o
How do I make it match all 4 patterns in either of the strings above?
That's because the ," at the end of your regex pattern consumes the opening quote of the following key, so the next match cannot start there. In effect, the regex matches only every other candidate.
You need to use a look-ahead, which checks for the ," without consuming it:
/("[^:=,]+":")(.*?)("}*\]*}*(?=,"))/
http://rubular.com/r/6v2OjPtmVM
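The original thread uses Rubular (Ruby), but the same behaviour can be reproduced in JavaScript. The sketch below runs both patterns over the first string from the question with the g flag; the helper keysOf is a hypothetical name introduced for illustration.

```javascript
const data =
  '"foo":""fooooooooooooooooooo"foooo","bar":"barrrrrrrrr""barrrrrr","fooo":"foooooo","bar":"barrrrrr","';

// Trailing ," is consumed, taking the next key's opening quote with it.
const consuming = /("[^:=,]+":")(.*?)("}*\]*}*,")/g;

// Trailing ," is only asserted, so the next key stays available.
const lookahead = /("[^:=,]+":")(.*?)("}*\]*}*(?=,"))/g;

// Hypothetical helper: collect group 1 (the key part) of every match.
const keysOf = (re) => [...data.matchAll(re)].map((m) => m[1]);

const consumedKeys = keysOf(consuming);
const lookaheadKeys = keysOf(lookahead);

console.log(consumedKeys.length);  // 2
console.log(lookaheadKeys.length); // 4
```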