Regex to match n times for helm - regex

To match these examples:
1-10-1
1-7-3
10-8-5
1-7-14
11-10-12
This regex works:
^[\\d]{1,2}-[\\d]{1,2}-[\\d]{1,2}$
How could this be written in a way that just matches something like "[\d]{1,2}-?" three (n) times?

You may use:
^\d\d?(?:-\d\d?){2}$
^ - Start-of-string anchor.
\d\d? - A single digit followed by an optional one (the same as \d{1,2}).
(?:-\d\d?){2} - A non-capture group starting with a hyphen followed by the same construct as above, one or two digits. The group is repeated exactly two times.
$ - End-of-string anchor.
The idea here is to avoid the optional hyphen from your attempts, since it would also allow quite different inputs such as "123" and "123456". It's more appropriate to match the first element of the delimited string and then use the non-capture group to match a delimiter plus one more element exactly n-1 times.
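A minimal Python sketch of that idea, assuming the pattern is built programmatically for a given n. The helper name delimited_digits_pattern and its sep parameter are made up for illustration and are not part of the answer above; Python's re module is used only to demonstrate the pattern.

```python
import re

def delimited_digits_pattern(n: int, sep: str = "-") -> re.Pattern:
    """Build a pattern for n groups of one or two digits joined by sep."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # First element, then (separator + element) repeated exactly n - 1 times.
    return re.compile(rf"^\d\d?(?:{re.escape(sep)}\d\d?){{{n - 1}}}$")

pattern = delimited_digits_pattern(3)
samples = ["1-10-1", "1-7-3", "10-8-5", "1-7-14", "11-10-12", "123", "123456", "1-2"]
results = {s: bool(pattern.match(s)) for s in samples}

for sample, matched in results.items():
    print(f"{sample!r}: {matched}")
```

The five examples from the question match, while "123", "123456" and "1-2" are rejected because the hyphen is mandatory inside the repeated group.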

Related

RegEx: how not to match a repetition

I have the following strings:
test_abc123_firstrow
test_abc1564_secondrow
test_abc123_abc234_thirdrow
test_abc1663_fourthrow
test_abc193_abc123_fifthrow
I want to get the abc plus the following number from each row.
But only the first one if a row has more than one.
My current pattern looks like this: ([aA][bB][cC]\w\d+[a-z]*)
But this doesn't match only the first one.
If somebody could help me implement that, that would be great.
You can use
^.*?([aA][bB][cC]\d+[a-z]*)
Note the removed \w: it matches any letter, digit or underscore, so it looks redundant in your pattern.
The ^.*? added at the start makes the match begin at the start of the string and stop at the first occurrence:
^ - start of string
.*? - zero or more chars other than line break chars, as few as possible
([aA][bB][cC]\d+[a-z]*) - Capturing group 1: a or A, b or B, c or C, then one or more digits and then zero or more lowercase ASCII letters.
Use the following regex:
^.*?([aA][bB][cC]\d+)
Use ^ to begin at the start of the input
.*? matches zero or more characters (except line breaks) as few times as possible (lazy approach)
The rest is then captured in the capturing group as expected.
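As a rough illustration, here is how the anchored, lazy pattern behaves on the rows from the question in Python. The variable names are made up for this sketch.

```python
import re

rows = [
    "test_abc123_firstrow",
    "test_abc1564_secondrow",
    "test_abc123_abc234_thirdrow",
    "test_abc1663_fourthrow",
    "test_abc193_abc123_fifthrow",
]

# Anchored, lazy prefix: group 1 holds the first abc+digits on each row.
first_abc = re.compile(r"^.*?([aA][bB][cC]\d+)")

firsts = [m.group(1) for row in rows if (m := first_abc.match(row))]
print(firsts)  # ['abc123', 'abc1564', 'abc123', 'abc1663', 'abc193']
```

Each row is matched separately here, so ^ always refers to the start of that row.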

Regex for finding string after nth occurrence of hyphen

I have the multiple rows of strings that look like the following:
irm-eap-edp-refined-nonprod
irm-eap-edp-reporting-prod
irm-eap-edp-development-nonprod
I need to extract the nonprod or prod string from each; it will always come after the 4th hyphen and be the last substring of the entire string.
What's a simple regex for this situation?
If you want the last substring after - you can do:
.*-(.*)
You could replace the first four [^-]+- chunks with nothing:
/^(?:[^-]+-){4}//gm
^ - start of line anchor
[^-]+- - match on anything but - 1 or more times followed by a -
(?: ... ) - non capturing group
{4} - four times
It looks like what you want is prod or nonprod after a hyphen, and at the end of a string, so:
-(prod|nonprod)$
This has the benefit of the prod/nonprod being in the capture group.
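The three suggestions above can be compared side by side. This is only a sketch in Python, with variable names chosen for illustration; on the sample rows all three approaches give the same result.

```python
import re

rows = [
    "irm-eap-edp-refined-nonprod",
    "irm-eap-edp-reporting-prod",
    "irm-eap-edp-development-nonprod",
]

# 1) Greedy prefix: group 1 is whatever follows the last hyphen.
last_segment = [re.fullmatch(r".*-(.*)", row).group(1) for row in rows]

# 2) Remove the first four "text-" chunks and keep the remainder.
stripped = [re.sub(r"^(?:[^-]+-){4}", "", row) for row in rows]

# 3) Match only the two allowed values, anchored at the end of the string.
env = [re.search(r"-(prod|nonprod)$", row).group(1) for row in rows]

print(last_segment, stripped, env)
```

The third variant is the strictest: re.search returns None for a row that ends in anything other than prod or nonprod, so real input may need a None check.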

Regex for two of any digit then three of another then four of another?

Regex is great, but I can't for the life of me figure out how I'd express the following constraint, without spelling out the whole permutation:
2 of any digit [0-9]
3 of any other digit [0-9] excluding the above
4 of any third digit [0-9] excluding the above
I've got this monster, which is clearly not a good way of doing this, as it grows exponentially with each additional set of digits:
^(001112222|001113333|001114444|001115555|001116666|0001117777|0001118888|0001119999|0002220000|...)$
OR
^(0{2}1{3}2{4}|0{2}1{3}3{4}|0{2}1{3}4{4}|0{2}1{3}5{4}|0{2}1{3}6{4}|0{2}1{3}7{4}|0{2}1{3}8{4}|...)$
Looks like the following will work:
^((\d)\2(?!.+\2)){2}\2(\d)\3{3}$
It may look a bit tricky, since it uses backreferences inside a repeated group, but it looks more intimidating than it really is.
^ - Start string anchor.
( - Open 1st capture group:
(\d) - A 2nd capture group that captures a single digit (0-9).
\2 - A backreference to what is captured in this 2nd group.
(?!.+\2) - Negative lookahead to prevent 1+ characters followed by a backreference to the 2nd group's match.
){2} - Close the 1st capture group and match this two times.
\2 - A backreference to what is most recently captured in the 2nd capture group.
(\d) - A 3rd capture group holding a single digit.
\3{3} - Exactly three backreferences to the 3rd capture group's match.
$ - End string anchor.
EDIT:
Looking at your alternations it looks like you are also allowing numbers like "002220000" as long as the digits in each sequence are different to the previous sequence of digits. If that is the case you can simplify the above to:
^((\d)\2(?!.\2)){2}\2(\d)\3{3}$
The main difference is that the "+" quantifier has been taken out of the lookahead, which allows the same digit to be used again further on.
Depending on whether your target environment/framework/language supports lookaheads, you could do something like:
^(\d)\1(?!\1)(\d)\2\2(?!\1|\2)(\d)\3\3\3$
The first capture group ((\d)) lets us enforce "two identical consecutive digits" by referencing the captured value (\1) as the next match, after which the negative lookahead ensures the next sequence doesn't start with the previous digit. We then repeat the same construct for the remaining sequences.
Note: If you want to exclude only the digit used in the immediately preceding sequence, change (?!\1|\2) to just (?!\2)
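To see how the three patterns differ, here is a small Python check. It is a sketch only: the candidate strings other than "001112222" and "002220000" are invented for illustration, and the dictionary keys are just labels.

```python
import re

patterns = {
    # No digit may reappear in a later run.
    "all_distinct": re.compile(r"^((\d)\2(?!.+\2)){2}\2(\d)\3{3}$"),
    # Only neighbouring runs must differ.
    "adjacent_distinct": re.compile(r"^((\d)\2(?!.\2)){2}\2(\d)\3{3}$"),
    # Spelled-out version with one capture group per run.
    "explicit": re.compile(r"^(\d)\1(?!\1)(\d)\2\2(?!\1|\2)(\d)\3\3\3$"),
}

candidates = ["001112222", "002220000", "001111111", "000001111"]

matches = {
    name: [c for c in candidates if pattern.match(c)]
    for name, pattern in patterns.items()
}

for name, accepted in matches.items():
    print(name, accepted)
```

Only the variant with (?!.\2) accepts "002220000", where the first and third runs share a digit.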

Regex: how to repeat a group?

I am trying to write a regex that extracts only the amount from a string. What I have works, but I want to optimise the expression. For example, I have
125.250.230,55
This is my regex:
\d{1,3}[\,\.]{1}\d{1,3}[\,\.]{1}\d{1,3}[\,\.]{1}\d{1,3}
I want to rewrite it using a repeated group, like this, but it doesn't work for me:
(\d{1,3}[\,\.]{1}){6}\d{1,3}
It's unclear from your example whether you always want to match 4 sets of digits separated by a comma or dot, or a variable number of sets.
For exactly 4 sets of digits, use this:
(?:\d{1,3}[.,]){3}\d{1,3}
For a variable number of sets of digits, use this:
(?:\d{1,3}[.,])+\d{1,3}
If you want to properly match sets of 3 digits, with variable number of digits at the beginning and end, such as:
1,123,123.1
12,123,123.12
123,123,123.123
1,123,123,123,123.1
Then use this:
\d{1,3}[.,](?:\d{3}[.,])*\d{1,3}
Explanation of regex:
\d{1,3} - one to three digits (0...9)
[.,] - followed by a dot or a comma
(?: ... )* - followed by a non-capturing group; the * means zero or more repeats
\d{3}[.,] - inside the non-capturing group, expect three digits, followed by a dot or comma
\d{1,3} - followed by one to three digits
You need to make the group in your example non-capturing and repeat it three times instead of six. See https://www.regular-expressions.info/captureall.html
Here is your regex:
(?:\d{1,3}[\,\.]){3}\d{1,3}
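A short Python sketch of the patterns suggested above. The surrounding sentence in text is hypothetical; only the amount itself comes from the question. The last lines show why a repeated capturing group is of limited use: it keeps only its final repetition.

```python
import re

# Hypothetical surrounding text; only the amount comes from the question.
text = "Total due: 125.250.230,55 EUR"

patterns = {
    "fixed_four": re.compile(r"(?:\d{1,3}[.,]){3}\d{1,3}"),
    "variable": re.compile(r"(?:\d{1,3}[.,])+\d{1,3}"),
    "grouped": re.compile(r"\d{1,3}[.,](?:\d{3}[.,])*\d{1,3}"),
}
amounts = {name: p.search(text).group(0) for name, p in patterns.items()}
print(amounts)

# The thousands-style samples listed in the answer all fit the "grouped" form.
samples = ["1,123,123.1", "12,123,123.12", "123,123,123.123", "1,123,123,123,123.1"]
all_grouped = all(patterns["grouped"].fullmatch(s) for s in samples)

# A repeated capturing group retains only its last repetition.
last_repeat = re.search(r"(\d{1,3}[.,]){3}\d{1,3}", text).group(1)
print(last_repeat)  # 230,
```

All three patterns return the full amount here; they differ only in how strictly they constrain the number and size of the digit sets.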

How to search for a combination of words inside a nested bullet list in regex?

I would like to search for a combination of two words (let’s say term1 and term2) inside a bullet list. The condition is that these two words have to be from the same nested list and that the search has to be for any combination of these two words. Nested lists are separated by two new lines \n.
The input text
- Example nested list 1
    - This point contains term 2
    - This point contains term 1

- Example nested list 2
    - This line contains term 2
In order to achieve this I first tried to capture different nested lists in different capture groups.
- ([\s\S]*?)\n\n
Now, I would like to search inside each of these capture groups but I can’t seem to find out how.
An example regex is at https://regex101.com/r/OC6OI5/12
Edit:
For people who are wondering why this might be useful, I’m trying to build a Roam Research like Linked references filtering in a markdown editor
If each match should be a single nested list in which both term 1 and term 2 are present, you could use two positive lookahead assertions, (?=...).
You could start the match at the start of the string, and match all lines that start with -.
Then you can assert what follows are lines that all begin with 1 or more whitespace chars without a newline followed by -.
Use 2 positive lookaheads to assert both terms, and match all following lines that have that same indentation.
^(?:-.*\r?\n)+(?=(?:[^\S\r\n]+-.*\r?\n)*[^\S\r\n]+-.*(term 1))(?=(?:[^\S\r\n]+-.*\r?\n)*[^\S\r\n]+-.*(term 2))(?:[^\S\r\n]+-.*(?:\r?\n|$))+
Explanation
^ Start of the string
(?:-.*\r?\n)+ Match 1+ times a line that starts with -
(?= Positive lookahead, assert what is on the right is
(?:[^\S\r\n]+-.*\r?\n)* Optionally repeat lines that start with whitespaces without newlines, then - and the rest of the line
[^\S\r\n]+-.*(term 1) Match a line that starts with 1+ whitespaces followed by - and contains term 1
) Close lookahead
(?=(?:[^\S\r\n]+-.*\r?\n)*[^\S\r\n]+-.*(term 2)) The same lookahead mechanism again for term 2
(?: Non capture group
[^\S\r\n]+-.*(?:\r?\n|$) Match a line that starts with 1+ whitespaces followed by - and the rest of the line
)+ Close the group and repeat 1+ times
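A sketch of how this could be run in Python, under two assumptions that are not stated in the answer: the nested items are indented with spaces (the indentation in the sample below is a guess), and re.MULTILINE is used so that ^ can match at the start of every line. The pattern requires a - after the indentation in every branch, as the explanation describes.

```python
import re

# Assumed layout: top-level items at column 0, nested items indented,
# and a blank line between the two nested lists.
text = (
    "- Example nested list 1\n"
    "    - This point contains term 2\n"
    "    - This point contains term 1\n"
    "\n"
    "- Example nested list 2\n"
    "    - This line contains term 2\n"
)

pattern = re.compile(
    r"^(?:-.*\r?\n)+"                                            # top-level line(s)
    r"(?=(?:[^\S\r\n]+-.*\r?\n)*[^\S\r\n]+-.*(term 1))"          # some nested line has term 1
    r"(?=(?:[^\S\r\n]+-.*\r?\n)*[^\S\r\n]+-.*(term 2))"          # some nested line has term 2
    r"(?:[^\S\r\n]+-.*(?:\r?\n|$))+",                            # consume the nested lines
    re.MULTILINE,
)

found = [m.group(0) for m in pattern.finditer(text)]
print(len(found))
print(found[0])
```

Only the first list is reported, because the second one contains term 2 but not term 1.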