Regex: Getting Variable from Substring - regex

How would I go about using Regex to extract the number from the following file:
abc_defg123_100aaa_abc_defg123
Where I want the 100 from the substring '_100aaa_'?
The closest I have gotten is:
[0-9](?!(aaa_))*\w
but this matches up to the first underscore found!
Many thanks!

Try this:
(?<=_)\d+(?=aaa_)
See live demo.
This regex uses lookarounds to assert, without consuming or capturing, the delimiters on either side of the target.
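As a minimal sketch of the lookaround approach, assuming Python's `re` module (the question does not name an engine) and a hypothetical helper called `extract_number`:

```python
import re
from typing import Optional

# The lookbehind requires a preceding "_" and the lookahead requires "aaa_"
# to follow. Neither is part of the match, so only the digits are returned.
PATTERN = re.compile(r"(?<=_)\d+(?=aaa_)")


def extract_number(name: str) -> Optional[str]:
    """Return the digits between "_" and "aaa_", or None if absent."""
    match = PATTERN.search(name)
    return match.group() if match else None


print(extract_number("abc_defg123_100aaa_abc_defg123"))  # 100
```

The digits in `defg123` are skipped because they are preceded by a letter rather than an underscore.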

Related

Perl Regex: Match this group, but not having this pattern

I need to extract text that starts and ends with a double quote ("), but it should not be extracted if it contains multiple double quotes.
This is my example
I tried using different look-arounds, positive/negative look-aheads and look-behinds, but it leads to an error.
In my example above, I would like to exclude the data
"XxXXXXX - "" """"XX""""""",
and
"XxXXXXX - ""XXXXX XXXXXXXX 1.4.90 """"X2""""""",
from being matched.
I saw some other answers here, but I get an error whenever I use a negative look-behind. Positive and negative look-aheads raise no error, but they don't work either.
Edit:
I've added some examples regex in the link provided, and also more example data.
However, I still don't want the data above to be matched by the current regex.
What about using this:
"([^"]+?)"(,|$)
You can see it here
and also here
Thanks for this one. Strange, I think I tried this before but didn't get the result I expected. Maybe it's because I didn't wait for it to be matched again.
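A rough way to check the suggested pattern, assuming Python's `re` module instead of Perl and applying the regex to one line at a time. The helper `quoted_fields` and the first and last sample rows are made up for illustration; the two middle rows are the ones from the question.

```python
import re

# The pattern from the answer: a quote, one or more non-quote characters,
# a closing quote, then either a comma or the end of the line.
FIELD = re.compile(r'"([^"]+?)"(,|$)')


def quoted_fields(lines):
    """Collect the text of every cleanly quoted field, line by line."""
    fields = []
    for line in lines:
        fields.extend(m.group(1) for m in FIELD.finditer(line))
    return fields


rows = [
    '"plain value",',                                      # hypothetical
    '"XxXXXXX - "" """"XX""""""",',                        # from the question
    '"XxXXXXX - ""XXXXX XXXXXXXX 1.4.90 """"X2""""""",',   # from the question
    '"a","b"',                                             # hypothetical
]
print(quoted_fields(rows))  # ['plain value', 'a', 'b']
```

The rows with doubled quotes produce no match because every closing quote in them is followed by another quote, never directly by a comma or the end of the line.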

Regex - some matches are missing

I am trying to solve a really simple problem, but I can't find a solution.
My string looks like this: "...0.0..0.0..."
My regex is: 0[.]{1,3}0
I am expecting 3 matches: 0.0, 0..0, 0.0
But instead of that I am getting only two matches: 0.0 and 0.0. Can You please tell me what
The problem is that when the regex matches the first time, it consumes the characters of the input string that it matched. So the first match is:
...0.0..0.0...
   ^^^
For the next match it only considers the remainder of the string, which is
..0.0...
and there, as you can see, it will find only a single match.
One way around this issue is to use a zero-width lookahead assertion, provided that your regex engine supports it. Your regex would then look like
0[.]{1,3}(?=0)
The lookahead checks that a 0 follows but does not consume it, so that 0 is still available as the start of the next match. The drawback is that the trailing 0 is not included in the matches. One solution is to append the 0 yourself afterwards.
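The difference can be seen in a short sketch, assuming Python's `re` module (the question does not say which engine is in use). The variable names are illustrative only.

```python
import re

text = "...0.0..0.0..."

# Consuming pattern: the closing 0 of one match cannot open the next one.
consuming = re.findall(r"0[.]{1,3}0", text)

# Lookahead pattern: the closing 0 is only checked, not consumed,
# so it is appended by hand to each match afterwards.
overlapping = [m + "0" for m in re.findall(r"0[.]{1,3}(?=0)", text)]

print(consuming)    # ['0.0', '0.0']
print(overlapping)  # ['0.0', '0..0', '0.0']
```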

Is there any upper limit for number of groups used or the length of the regex in Notepad++?

I am new to using regex. I am trying to use the regex find and replace option in Notepad++.
I have used the following regex:
((?:)|(\+)|(-))(\d)((?:)|(\+)|(-))(/)((?:)|(\+)|(-))(\d)((?:)|(\+)|(-))
For the following text:
2/2
+2/+2
-2/-2
2+/2+
2-/2-
But I am able to get matches only for the first three. For the last two it gives only partial matches, excluding the trailing "+" and "-". I am wondering if there is an upper limit on the number of groups that can be used (which I doubt) or on the length of the regex. I am not sure why my regex is failing. If there is anything wrong with my regex, please correct it.
This is not an issue with Notepad++'s regex engine. The problem is that when you have alternations like (?:)|(\+)|(-), the regex engine will attempt to match the different options in the order they are specified. Since you specified an empty group first, it will attempt to match an empty string first, only matching the + or - if it needs to backtrack. This essentially makes the alternation lazy: it will never match a character unless it has to.
vks's answer works perfectly well, but just in case you actually needed those capturing groups separated out, you can do the same thing just by rewriting your alternations like this:
((\+)|(-)|(?:))(\d)((\+)|(-)|(?:))(/)((\+)|(-)|(?:))(\d)((\+)|(-)|(?:))
or even more simply, like this:
((\+)|(-)|)(\d)((\+)|(-)|)(/)((\+)|(-)|)(\d)((\+)|(-)|)
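The effect of alternation order can be reproduced outside Notepad++. The sketch below assumes Python's `re` module, which also tries alternatives left to right. The names `EMPTY_FIRST`, `EMPTY_LAST` and `matched_text` are made up for illustration.

```python
import re

# Original pattern: the empty alternative comes first in every group.
EMPTY_FIRST = re.compile(
    r"((?:)|(\+)|(-))(\d)((?:)|(\+)|(-))(/)((?:)|(\+)|(-))(\d)((?:)|(\+)|(-))"
)
# Rewritten pattern: the empty alternative comes last.
EMPTY_LAST = re.compile(
    r"((\+)|(-)|)(\d)((\+)|(-)|)(/)((\+)|(-)|)(\d)((\+)|(-)|)"
)


def matched_text(pattern, line):
    """Return the text of the first match in line, or None."""
    m = pattern.search(line)
    return m.group(0) if m else None


for line in ["2/2", "+2/+2", "-2/-2", "2+/2+", "2-/2-"]:
    print(line, matched_text(EMPTY_FIRST, line), matched_text(EMPTY_LAST, line))
```

With the empty alternative first, the final group is satisfied by the empty string and nothing forces it to backtrack, so the trailing sign in `2+/2+` and `2-/2-` is left out of the match.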
You can use this simple regex to match all cases:
([-+]?)(\d)([-+]?)(/)([-+]?)(\d)([-+]?)
See here: https://www.regex101.com/r/fG5pZ8/19
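A quick check of the character-class version against the five sample lines, again assuming Python's `re` module as a stand-in for Notepad++'s engine (`SIGNED_FRACTION` is a hypothetical name):

```python
import re

# [-+]? is greedy: it takes a sign when one is present, otherwise nothing.
SIGNED_FRACTION = re.compile(r"([-+]?)(\d)([-+]?)(/)([-+]?)(\d)([-+]?)")

samples = ["2/2", "+2/+2", "-2/-2", "2+/2+", "2-/2-"]
fully_matched = [bool(SIGNED_FRACTION.fullmatch(s)) for s in samples]

print(fully_matched)                                # all True
print(SIGNED_FRACTION.fullmatch("2+/2+").groups())  # the seven groups
```

This version keeps seven capture groups; a sign position without a sign captures an empty string.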

Repeating groups regex url path, node.js

I am trying to extract express route named parameters with regex.
So, for example:
www.test.com/something/:var/else/:var2
I am trying with this regex:
.*\/?([:]+\w+)+
but I am getting only the last matched group.
Does anyone know how to match both :var and :var2?
The first problem is that .* is greedy, and will therefore bypass all matches until the final one is found. This means that the first :var is bypassed.
However, as you are searching for a variable number of capture groups (with thanks to @MichaelTang), I recommend using two regexes in sequence. First, use
^(?:.*?\/?\:\w+)+$
to detect which lines contain colon-elements...
...and then search that line repeatedly for, simply
\/:(\w+)
This places the text post-colon into capture group one.
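The two-step approach can be sketched as follows. The question is about node.js, so using Python's `re` module here is an assumption made for illustration, and `route_params` is a hypothetical helper.

```python
import re

# Step 1: does the line contain any ":name" element at all?
HAS_PARAMS = re.compile(r"^(?:.*?\/?\:\w+)+$")
# Step 2: pull out each name that follows "/:".
PARAM = re.compile(r"\/:(\w+)")


def route_params(route):
    """Return the parameter names in route, without the leading colon."""
    if not HAS_PARAMS.match(route):
        return []
    return PARAM.findall(route)


route = "www.test.com/something/:var/else/:var2"
print(route_params(route))                      # ['var', 'var2']

# The original attempt: greedy .* leaves only the last parameter to capture.
print(re.findall(r".*\/?([:]+\w+)+", route))    # [':var2']
```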
Here is how you can match both of them:
'www.test.com/something/:var/else/:var2'.match(/\:(\w+)/g)
[":var", ":var2"]

Regex matching character within

Looking to match WS-810-REFERENCE-1 where the string must have -'s within it
And can't think of something to work perfectly
[a-zA-Z0-9\-]+
That will match but will also match words that do not have the - character
Thought of maybe this ([a-zA-Z0-9\-]+\-)+
But that will match WS-810-REFERENCE- missing the final segment.
Thoughts?
Used a modified version of the second attempt just to grab that extra missing section
((?:[a-zA-Z0-9]+\-)+[a-zA-Z0-9]+)
I believe you're looking for a lookahead to make sure a hyphen is present in the string. You can use:
\b(?=\w*?-)[a-zA-Z0-9-]+(?= |$)
Online Demo: http://regex101.com/r/pZ6hV6
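The two suggestions above can be compared side by side. This sketch assumes Python's `re` module (the question names no engine), and the sample sentence is made up.

```python
import re

# Segment pattern: one or more "chunk-" pieces followed by a final chunk.
SEGMENTS = re.compile(r"(?:[a-zA-Z0-9]+-)+[a-zA-Z0-9]+")
# Lookahead pattern: assert a hyphen occurs in the word, then take the word.
LOOKAHEAD = re.compile(r"\b(?=\w*?-)[a-zA-Z0-9-]+(?= |$)")

text = "ship WS-810-REFERENCE-1 with plain words"
print(SEGMENTS.findall(text))   # ['WS-810-REFERENCE-1']
print(LOOKAHEAD.findall(text))  # ['WS-810-REFERENCE-1']

# The two differ on a trailing hyphen: only the lookahead version accepts it.
print(SEGMENTS.findall("abc-"))   # []
print(LOOKAHEAD.findall("abc-"))  # ['abc-']
```

Which behavior is preferable depends on whether a token ending in a hyphen should count as valid.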