What does this specific regex do? [closed] - regex

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
The regex is:
/^\/node?(?:\/(\d+)(?:\.\.(\d+))?)?/
I understand that the / at the beginning and the end are regex delimiters, and that ^\/node matches a string starting with /node. What happens after that beats me!

You should look into getting a tool like RegexBuddy. It will explain everything in a given regex, as well as how it compiles and how it branches.
Assuming PCRE or similar:
/ //begin
^ //start of string
\/ //literal /
node? //matches nod or node: the ? applies only to the preceding e
//? makes the preceding token optional; (?:node)? would make the whole word optional
(
?: //non-capturing group (think of it like dont capture <this>)
\/ //literal /
(\d+) // one or more digits, 0-9
(
?: // another non-capturing group
\.\. // literal ..
(\d+) // one or more digits 0-9
)
? // optional once more
)
? // make the previous group optional
/ // end

? makes the token just before it (here the e of node) optional
(?: non-capturing group
\/ escaped /
(\d+) - one or more digits - also in a capture group "()"
(?: again
\. - escaped .
\. - again
(\d+) - same as before
)?)? - not sure - what flavour of regex is this?

You are correct, the / at the start and end are pattern delimiters. Let's remove those for simplicity:
^\/node?(?:\/(\d+)(?:\.\.(\d+))?)?
The (?:...) is a non-capturing group. This is a group that does not get grabbed into a match group. This is an optimisation; let's remove the ?: to make the pattern clearer (note that doing so renumbers the capture groups):
^\/node?(\/(\d+)(\.\.(\d+))?)?
The \ is an escape character, so \/ is actually just a /, but as slashes denote the start and end of the pattern they need to be escaped. The . matches (almost) any character, so it needs to be escaped too.
The ? makes the preceding pattern optional, so ()? means whatever is in the brackets appears zero or one times.
^ denotes the start of the string
\/node? matches /node or /nod
\/(\d+) matches / followed by one or more digits (the \d+). The digits are captured into the first match group
(\.\.(\d+))? matches .. followed by one or more digits (the \d+). The digits are captured into the second match group
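To see the breakdown above in action, here is a minimal sketch using Python's built-in re module. The helper name parse_node_path and the sample paths are made up for illustration; the pattern is the one from the question with the / delimiters dropped (Python needs no escaping of /).

```python
import re

# Pattern from the question, without the surrounding / delimiters.
NODE_RE = re.compile(r"^/node?(?:/(\d+)(?:\.\.(\d+))?)?")

def parse_node_path(path):
    """Return (matched_text, group1, group2), or None when nothing matches."""
    m = NODE_RE.match(path)
    if m is None:
        return None
    return m.group(0), m.group(1), m.group(2)

# Hypothetical sample inputs; "/nodes/5" shows that only "/node" is consumed,
# because the optional group must start with "/" right after "/node".
for sample in ["/node", "/nod", "/node/12", "/node/12..34", "/nodes/5", "/other"]:
    print(sample, "->", parse_node_path(sample))
```

Because the pattern has no end anchor, it matches a prefix of the input; trailing text such as the s in /nodes/5 is simply left unmatched.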

Related

Regex (PCRE): Match all digits in a line following a line which includes a certain string

Using PCRE, I want to capture only and all digits in a line which follows a line in which a certain string appears. Say the string is "STRING99". Example:
car string99 house 45b
22 dog 1 cat
women 6 man
In this case, the desired result is:
221
I asked a similar question some time ago; back then, however, I was trying to capture the numbers in the SAME line where the string appears ( Regex (PCRE): Match all digits conditional upon presence of a string ). While the question is similar, I don't think the answer, if there is one at all, will be similar. The approach using the start-of-line anchor ^ does not work in this case.
I am looking for a single regular expression without any other programming code. It would be easy to accomplish with two consecutive regex operations, but this is not what I'm looking for.
Maybe you could try:
(?:\bstring99\b.*?\n|\G(?!^))[^\d\n]*\K\d
See the online demo
(?: - Open non-capture group:
\bstring99\b - Literally match "string99" between word-boundaries.
.*?\n - Lazy match up to (including) nearest newline character.
| - Or:
\G(?!^) - Asserts position at the end of the previous match; the negative lookahead (?!^) stops \G from also matching at the start of the string on the first attempt.
) - Close non-capture group.
[^\d\n]* - Match 0+ non-digit/newline characters.
\K - Resets the starting point of the reported match.
\d - Match a digit.
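Python's built-in re module supports neither \G nor \K, so the pattern above cannot be run there as written. As a cross-check of the expected result only, here is a two-step sketch in plain Python. It is deliberately not the single-regex solution the question asks for, and the function name digits_after_marker is an assumption made for this example.

```python
import re

def digits_after_marker(text, marker="string99"):
    """Collect every digit on a line that directly follows a line containing marker."""
    lines = text.splitlines()
    found = []
    # Pair each line with the one after it.
    for previous, current in zip(lines, lines[1:]):
        if re.search(r"\b" + re.escape(marker) + r"\b", previous):
            found.extend(re.findall(r"\d", current))
    return found

sample = "car string99 house 45b\n22 dog 1 cat\nwomen 6 man"
print("".join(digits_after_marker(sample)))
```

On the example from the question this yields the digits 2, 2 and 1, which is the result the PCRE pattern is meant to produce.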

regex for certain characters and rules [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I am trying to build a regex to ensure a given string only contains these 13 certain characters/rules. I am having a bit of trouble. Any help would be appreciated.
Allowed Characters:
a-z
A-Z
0-9
! (Exclamation point)
- (Hyphen)
_ (Underscore)
. (Period)
* (Asterisk)
' (Single Quote)
( (Open parenthesis)
) (Close parenthesis)
(No consecutive spaces)
*. (CANNOT end with a period)
So Far I have this
/^[+\-0-9().!-_*' ]+$/g
But not getting expected results. Thank you in advance.
EDIT:
Sorry first time posting here. Here are some test cases(JS). Second one should not pass because it has consecutive spaces and ends with period.:
let testOne = "Testing Regex - 2021 Th*i)s_On(e_pa!'ss.es.end";
let testTwo = "Testing Regex - 2021 Th*i)s_On(e_pa!'ss.es.end but  shouldn't.";
testOne.match(/^[+\-\w().!-_*' ]+$/g);
testTwo.match(/^[+\-\w().!-_*' ]+$/g);
Some issues:
Your regex does not allow for Latin letters: you didn't list them. (Uppercase letters only slip through by accident, via the range described next.)
Your regex allows for some additional characters (including $, # and %) because of !-_, which specifies a range from ! to _.
There is no provision for not allowing more than a single space.
There is no provision for not allowing a dot as last character
The g modifier has little purpose when you have end-of-string markers requiring that a match will always match the whole input.
From your regular expression it seems you also require that the input has at least 1 character.
Taking all that together, we get this (the first lookahead contains two consecutive spaces):
/^(?!.*?  )(?!.*?\.$)[\w+\-().!*' ]+$/
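The points above can be checked with a short sketch in Python's re module. This is an illustration under a few assumptions: the names ORIGINAL, FIXED and is_valid are invented here, re.ASCII is used so that \w means [a-zA-Z0-9_] as it does in JavaScript, and the double space in the first lookahead is written as " {2}", which is equivalent to two literal spaces.

```python
import re

# The asker's first attempt: "!-_" inside the class is a range from "!" to "_".
ORIGINAL = re.compile(r"^[+\-0-9().!-_*' ]+$")

# The corrected pattern; " {2}" stands for two consecutive spaces.
FIXED = re.compile(r"^(?!.*? {2})(?!.*?\.$)[\w+\-().!*' ]+$", re.ASCII)

def is_valid(s):
    """True when s uses only allowed characters, has no double space, no final period."""
    return FIXED.search(s) is not None

test_one = "Testing Regex - 2021 Th*i)s_On(e_pa!'ss.es.end"
# Built with explicit concatenation so the double space is easy to see.
test_two = test_one + " but" + "  " + "shouldn't."

print(bool(ORIGINAL.search("$#%")))   # the accidental range lets these through
print(bool(ORIGINAL.search("abc")))   # lowercase letters are rejected
print(is_valid(test_one), is_valid(test_two))
```

The second test string fails for two independent reasons (the double space and the trailing period), so either lookahead alone would already reject it.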
You can try this:
^(?!.*  )[\w!()\-*'\s.]+[\w!()\-*'\s]$
https://regex101.com/r/kTcJUN/3
And if you don't want to allow space character at the end of string then:
^(?!.*  )[\w!()\-*'\s.]+[\w!()\-*']$
Explanation:
(?!.*  ) - negative lookahead (note the two spaces): rejects any string containing two consecutive spaces
\w - any word character. Matches any letter, digit or underscore. Equivalent to [a-zA-Z0-9_].
! - literally !
( - literally (
) - literally )
- - literally -
* - literally *
' - literally '
\s - any whitespace character (space, but also tab and newline)
. - literally .
+ - quantifier. Matches between one and unlimited times.
[\w!()\-*'\s] - Allow a single character from the list. Putting this just before $ (end of line) makes this character last in string.

RegEx for matching operation sequences

I have a numbers operation like this:
-2-28*95+874-1545*-5+36
I need to use a regex to extract the operands that are not part of a multiplication:
-2
+874
+36
I tried things like that without success:
[\+,-]\d+(?=\+|-|$)
This regex matches -5, too, and
(?(?=\d+)[\+,-]|^)\d+(?=\+|-|$)
matches nothing.
How do I solve this problem?
You may use
(?<!\*)[-+]\d*\.?\d+(?![*\d])
See the regex demo
Details
(?<!\*) - a negative lookbehind making sure the current position is not immediately preceded by a * char
[-+] - - or +
\d* - 0 or more digits
\.? - an optional . char
\d+ - 1+ digits
(?![*\d]) - not immediately followed by a * or a digit char.
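As a quick check of the details above, here is a minimal sketch in Python, whose re module supports both lookbehind and lookahead. The variable names are chosen for this example only.

```python
import re

EXPR = "-2-28*95+874-1545*-5+36"

# Signed number, not preceded by "*" and not followed by "*" or another digit.
STANDALONE = re.compile(r"(?<!\*)[-+]\d*\.?\d+(?![*\d])")

standalone_terms = STANDALONE.findall(EXPR)
print(standalone_terms)
```

The trailing (?![*\d]) matters: without the \d part, the engine could backtrack and report -2 out of -28*95. Note that the pattern requires a sign, so a leading operand written without one (for example the 7 in 7+36) would not be picked up.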
This RegEx might help you to capture your undesired pattern in one group (), then it would leave your desired output:
(((-|\+|)\d+\*(-|\+|)\d+))
You can also use PCRE backtracking control verbs such as (*SKIP)(*FAIL) or (*SKIP)(*F), where the regex flavor supports them, to skip the multiplications and get the desired output:
((((-|\+|)\d+\*(-|\+|)\d+))(*SKIP)(*FAIL)|([\s\S]))
You can also DRY your expression, if you wish, and remove unnecessary groups that you may not need.
Another option could be to match what you don't want and capture in a group what you want to keep. Your values are then in the first capturing group:
[+-]?\d+(?:\*[+-]?\d+)+|([+-]?\d+)
Explanation
[+-]?\d+ Optional + or - followed by 1+ digits
(?:\*[+-]?\d+)+ Repeat the previous pattern 1+ times with an * prepended
| Or
([+-]?\d+) Capture in group 1 matching an optional + or - and 1+ digits
Regex demo
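The "match what you don't want, capture what you do" idea can be sketched in Python as follows. The names PRODUCT_OR_TERM and kept are assumptions for this example; the pattern is the one given above.

```python
import re

EXPR = "-2-28*95+874-1545*-5+36"

# First alternative swallows whole multiplications without capturing;
# second alternative captures a lone operand in group 1.
PRODUCT_OR_TERM = re.compile(r"[+-]?\d+(?:\*[+-]?\d+)+|([+-]?\d+)")

# Keep only the matches where group 1 took part.
kept = [m.group(1) for m in PRODUCT_OR_TERM.finditer(EXPR) if m.group(1) is not None]
print(kept)
```

The filtering step is what makes this work: every multiplication still produces a match, but its group 1 is empty, so it is discarded.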

Regex to match specific string + optional space + 8 digits [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I need a regular expression to validate strings with the prefix 'CON' followed by an optional space followed by 8 digits.
I've tried various expressions, I got tangled up and now I'm lost.
^(CON+s\?d{8})$
\bCON\b\S?D{8}
Syntax is off a bit
^(CON\s?\d{8})
( starts a capturing group
CON is exactly matched
\s matches any white space character and the ? makes it optional
\d{8} matches 8 digits
) ends the capturing group
You were pretty well off to start, Hope this helps :)
Keep in mind that if there is no space, then there shouldn't be 8 more digits:
^CON(\ \d{8})?
If the string you are looking for can be part of a larger string (note that in this case it may be preceded or followed by anything, even other digits):
CON\s?\d{8}
If the string must match in full, use ^$ to designate that:
^CON\s?\d{8}$
You can add variations to it: if, say, you want it to begin or end at a word boundary, use \b to indicate that. If you want it to be followed by non-digits, use \D+ at the end instead of $.
Finally, if you want the string to end with an EOL or a non-digit, you may use an expression like this:
CON\s?\d{8}(\D+|$) or the same with a non-capturing group: CON\s?\d{8}(?:\D+|$)
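The difference between the anchored and unanchored variants can be seen in a small Python sketch. The helper names is_con_code and find_con_codes, and the sample strings, are invented for illustration; fullmatch plays the role of the ^...$ anchors.

```python
import re

CON_RE = re.compile(r"CON\s?\d{8}")

def is_con_code(s):
    """Whole-string check, equivalent to ^CON\\s?\\d{8}$."""
    return CON_RE.fullmatch(s) is not None

def find_con_codes(text):
    """Unanchored search: finds the pattern anywhere inside a larger string."""
    return CON_RE.findall(text)

for sample in ["CON12345678", "CON 12345678", "CON1234567", "CON123456789"]:
    print(sample, "->", is_con_code(sample))

print(find_con_codes("id CON 12345678, ref XCON123456789"))
```

The last call illustrates the caveat mentioned above: without anchors or boundaries, the pattern happily matches inside XCON123456789 and simply stops after the eighth digit.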

Using regex to find portions of a path where first portion is n digits only

I would like a regex expression that will match the first 4 digit part of a path and then the next two parts. So it would find:
/58A2/D456-509F-4905-A473/FCAD1612CEDB/
in both of these lines
pyramid:/58A2/D456-509F-4905-A473/FCAD1612CEDB/filename1.tif
cache:/ThumbCache/58A2/D456-509F-4905-A473/FCAD1612CEDB/filename2.jpg
I tried
/.{4}/.*?/.*?/
and this works for the first one... but not the second one. Apparently the /ThumbCache/ is simply not matching because it's not 4 digits.
* UPDATE *
Ok... so this actually works, in this simplified example... in my actual code I had an extra /.*?/
It won't let me delete this post, because people posted answers. Not sure what to do.
Try this regex:
\/[0-9A-Z]{4}\/(?:[0-9A-Z]{4}-?){4}\/[0-9A-Z]{12}\/
Regex explanation:
\/ # Slash literal
[0-9A-Z] # Match digit or uppercase letter
{4} # Match previous exactly 4 times (So 4 digits/uppercase letters)
\/
(?: # Start non-capturing group (Don't store the result of group)
[0-9A-Z]
{4}
-? # Optionally a hyphen (Last sequence doesn't have one)
) # End non-capturing group
{4}
\/
[0-9A-Z]
{12}
\/
You will also need the global (g) flag to find every match if you have multi-line input.
For input
pyramid:/58A2/D456-509F-4905-A473/FCAD1612CEDB/filename1.tif
cache:/ThumbCache/58A2/D456-509F-4905-A473/FCAD1612CEDB/filename2.jpg
this will match
/58A2/D456-509F-4905-A473/FCAD1612CEDB/
/58A2/D456-509F-4905-A473/FCAD1612CEDB/
I didn't add any capturing groups as you haven't specified anything that needs to be captured separately. You get the full match output.
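For comparison, here is a minimal Python sketch that runs both the strict pattern from this answer and the simpler pattern from the question against the two sample lines. The names STRICT, LOOSE, strict_hits and loose_hits are chosen for this example; Python needs no escaping of /.

```python
import re

LINES = [
    "pyramid:/58A2/D456-509F-4905-A473/FCAD1612CEDB/filename1.tif",
    "cache:/ThumbCache/58A2/D456-509F-4905-A473/FCAD1612CEDB/filename2.jpg",
]

# Strict: fixed segment lengths, digits and uppercase letters only.
STRICT = re.compile(r"/[0-9A-Z]{4}/(?:[0-9A-Z]{4}-?){4}/[0-9A-Z]{12}/")
# Loose: the pattern from the question (any 4 characters, then two more segments).
LOOSE = re.compile(r"/.{4}/.*?/.*?/")

strict_hits = [STRICT.search(line).group(0) for line in LINES]
loose_hits = [LOOSE.search(line).group(0) for line in LINES]

print(strict_hits)
print(loose_hits)
```

On these two lines both patterns find the same substring, since /ThumbCache/ cannot satisfy "exactly four characters between slashes". The strict version is the safer choice if other four-character path segments can appear earlier in a path.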