Regex expression for 2 identical strings in a row - regex

I am trying to create a regular expression for the following template:
"[alphaNumeric]String/String.xcl"
So
[a1B2c3]Hello/Hello.xcl would pass
a1B2c3]hello/hello.xcl fails
[a1B2c3]Hello/hello.xcl fails
[a1B2c3]hello/hello.xc fails
I have tried the following so far:
\[[\da-zA-Z]+\][a-z]+\/[a-z]+\.xcl$
How do I check if the middle strings are identical?

Use a backreference:
\[[a-zA-Z0-9]+\]([^/]+)/\1\.xcl
The term in parentheses captures the first part of your path. We may then refer to it later in the regex using \1.
Depending on how you plan to use this regex, you might need optional starting and closing anchors (^ and $).
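To illustrate the point about anchors, here is a minimal sketch in Python. Python's re module is an assumption on my part, since the question does not name a regex flavor, and the sample strings in the usage note are made up. re.search looks for the pattern anywhere in the input, while re.fullmatch behaves as if ^ and $ had been added.

```python
import re

# Pattern from the answer above: group 1 captures the first name,
# and \1 requires exactly the same text again after the slash.
PATTERN = re.compile(r"\[[a-zA-Z0-9]+\]([^/]+)/\1\.xcl")


def found_anywhere(text: str) -> bool:
    """True if the pattern occurs somewhere inside text (no anchors)."""
    return PATTERN.search(text) is not None


def matches_whole(text: str) -> bool:
    """True only if the entire text matches (as if wrapped in ^ and $)."""
    return PATTERN.fullmatch(text) is not None
```

With these helpers, a string such as xx[a1B2c3]Hello/Hello.xclzz is accepted by found_anywhere but rejected by matches_whole, which is usually what you want for validation.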

You may capture the part after brackets and use a backreference after /:
^\[[\da-zA-Z]+]([A-Za-z]+)\/\1\.xcl$
               ^^^^^^^^^^^  ^^
See the regex demo
Details
^ - start of the string
\[ - a [
[\da-zA-Z]+ - 1+ alphanumeric chars
] - a ] char
([A-Za-z]+) - Capturing group 1: one or more letters
\/ - a slash
\1 - a backreference to capturing group 1 value
\.xcl - .xcl substring
$ - end of string.
NOTE: If you do not care about what kind of chars there can be inside brackets, you may replace [\da-zA-Z]+ with [^\]]+.
NOTE2: If you want to match any chars on both ends of /, replace ([A-Za-z]+) with ([^\/]+).
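A quick way to check the anchored pattern against the four examples from the question is a small Python script. This is only a sketch: Python's re module is assumed (the question does not say which engine is used), and is_valid is a hypothetical helper name.

```python
import re

# Anchored pattern from the answer: letters only on both sides of the slash,
# with \1 forcing the second name to repeat the first one exactly.
STRICT = re.compile(r"^\[[\da-zA-Z]+]([A-Za-z]+)\/\1\.xcl$")

SAMPLES = [
    "[a1B2c3]Hello/Hello.xcl",
    "a1B2c3]hello/hello.xcl",
    "[a1B2c3]Hello/hello.xcl",
    "[a1B2c3]hello/hello.xc",
]


def is_valid(path: str) -> bool:
    """Return True if path follows the [alphaNumeric]String/String.xcl template."""
    return STRICT.search(path) is not None


for sample in SAMPLES:
    print(sample, "->", "pass" if is_valid(sample) else "fail")
```

Only the first sample passes: the second lacks the opening bracket, the third differs in case (backreferences compare the captured text exactly), and the fourth has the wrong extension.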

Related

How to capture second match from the given text using regex

I tried to capture the second match from the given text, i.e.,
hash=e1467eb30743fb0a180ed141a26c58f7&token=a62ef9cf-2b4e-4a99-9335-267b6224b991:IO:OPCA:117804471:OPI:false:en:opsdr:117804471&providerId=paytm
In the above text, I want to capture the second 9-digit number (117804471).
I tried the following, but it didn't work, so please help me resolve this.
https://regex101.com/r/vBJceR/1
You can use
^(?:.*?\K\b[0-9]{9}\b){2}
See the regex demo.
Details:
^ - start of string
(?: - start of a non-capturing group:
.*? - any zero or more chars other than line break chars (as few as possible) followed with
\K - match reset operator discarding text matched so far
\b[0-9]{9}\b - a 9-digit number as a whole word
){2} - two occurrences of the pattern sequence defined above.
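Note that \K is a PCRE-style feature and is not supported by Python's built-in re module. If you happen to be in Python (an assumption, the question does not say), a similar effect can be sketched with a capturing group instead: the group is overwritten on each repetition, so after {2} it holds the second number. The helper name below is hypothetical.

```python
import re

TEXT = (
    "hash=e1467eb30743fb0a180ed141a26c58f7&token=a62ef9cf-2b4e-4a99-9335-"
    "267b6224b991:IO:OPCA:117804471:OPI:false:en:opsdr:117804471&providerId=paytm"
)

# Same idea as the \K pattern, but the number is captured into group 1.
# A repeated group keeps only its last iteration, i.e. the second hit here.
SECOND_NINE_DIGITS = re.compile(r"^(?:.*?\b([0-9]{9})\b){2}")


def second_nine_digit_number(text: str):
    """Return (value, start_index) of the second 9-digit number, or None."""
    m = SECOND_NINE_DIGITS.search(text)
    return (m.group(1), m.start(1)) if m else None
```

Returning the start index as well makes it easy to confirm that the second occurrence was picked, since both numbers in the sample text are identical.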

Find regex expression

I am trying to find a regex expression to match strings with 4 repeating digits and optional hyphens in between, such as:
-3-3-3-3-
-1111-
2222-
0-0-00
Currently I have:
\-?(\d(\-*))\1{3}\-?
which matches the first two but not the last two. Any suggestions?
You may use
^-?(\d)(?:-?\1){3}-?$
See the regex demo. To find the pattern in a larger string, remove the ^ and $ anchors:
-?(\d)(?:-?\1){3}-?
If the pattern is a part of a longer pattern, you might have to adjust the backreference number (if there are other capturing groups in the pattern).
Details
^ - start of string
-? - an optional -
(\d) - Group 1: any digit
(?:-?\1){3} - three occurrences of an optional - and then the same value as captured in Group 1
-? - an optional hyphen
$ - end of the string.
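Both variants can be tried out with a short Python sketch (Python's re module is an assumption, and the strings other than the four from the question are made up for illustration):

```python
import re

# Anchored form: the whole string must be four identical digits
# with optional hyphens between and around them.
WHOLE = re.compile(r"^-?(\d)(?:-?\1){3}-?$")

# Unanchored form for finding the pattern inside a larger string.
INSIDE = re.compile(r"-?(\d)(?:-?\1){3}-?")


def is_repeated_digit(text: str) -> bool:
    """True if text consists of one digit repeated four times, hyphens optional."""
    return WHOLE.search(text) is not None


for sample in ["-3-3-3-3-", "-1111-", "2222-", "0-0-00", "1234", "-3-3-3-"]:
    print(sample, "->", is_repeated_digit(sample))
```

The four strings from the question are accepted, while 1234 (different digits) and -3-3-3- (only three digits) are rejected.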

Matching Word Regex

Hello, I want to match this word with a regex:
(Parc Installé)
from this text:
31/1/2017 17:19:23,4245986,ct0001#Intotel.int,Parc Installé,100.100.30.100
I wrote this regex: ',[A-Za-zA-zÀ-ú+ \/\w+0-9._%+-]+,'
But the result is: 4245986 and Parc Installé.
How can I match only Parc Installé?
You may try a regex based on a lookahead that will require a comma and digits/dots after it up to the end of string:
[^,]+(?=\s*,[\d.]+$)
See this regex demo
Details:
[^,]+ - 1 or more chars other than ,
(?=\s*,[\d.]+$) - a lookahead requiring
\s* - zero or more whitespaces
, - a comma
[\d.]+ - 1+ digits or dots up to...
$ - ... the end of string
To make it a bit more restrictive, you may replace the lookahead with (?=\s*,\d+(?:\.\d+){3}$) to require 4 sequences of dot-separated 1+ digits. See this regex demo.
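Here is a small sketch of both lookahead variants in Python (an assumed language, since the question does not name one). The second test line, a,b,1.2, is made up to show where the two patterns differ.

```python
import re

LINE = "31/1/2017 17:19:23,4245986,ct0001#Intotel.int,Parc Installé,100.100.30.100"

# Loose version: the field must be followed by a comma and digits/dots up to the end.
LOOSE = re.compile(r"[^,]+(?=\s*,[\d.]+$)")

# Stricter version: the last field must be four dot-separated groups of digits.
STRICT = re.compile(r"[^,]+(?=\s*,\d+(?:\.\d+){3}$)")


def field_before_tail(pattern: re.Pattern, line: str):
    """Return the field matched right before the trailing numeric field, or None."""
    m = pattern.search(line)
    return m.group(0) if m else None


print(field_before_tail(LOOSE, LINE))
print(field_before_tail(STRICT, LINE))
```

Both patterns return Parc Installé for the sample line. On a line like a,b,1.2 the loose pattern still returns b, while the strict one finds nothing because 1.2 is not a four-part address.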
If a lookahead is not supported (as is the case with the RE2 engine), you might want to use a capturing-group-based solution:
([^,]+)\s*,[\d.]+$
Here, the part within (...) will be captured into Group 1 and will be accessible via a backreference or a function like =REGEXEXTRACT in Google Spreadsheets, which only retrieves the contents of a capturing group if one is present in the pattern.
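The capturing-group variant works the same way in any engine that exposes groups. As a rough sketch, here it is in Python (Python's re is not RE2 and does support lookaheads, so this only illustrates how Group 1 is read; the helper name is hypothetical):

```python
import re

LINE = "31/1/2017 17:19:23,4245986,ct0001#Intotel.int,Parc Installé,100.100.30.100"

# No lookahead: the trailing ",digits/dots" part is consumed by the match,
# and only the wanted field is captured into group 1.
CAPTURE = re.compile(r"([^,]+)\s*,[\d.]+$")


def extract_field(line: str):
    """Return the contents of group 1, or None if the line does not match."""
    m = CAPTURE.search(line)
    return m.group(1) if m else None


print(extract_field(LINE))
```

The full match here also contains the trailing ,100.100.30.100, so it is important to read group 1 rather than the whole match.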

Repeated capturing group PCRE

Can't get why this regex (regex101)
/[\|]?([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g
captures all the input, while this (regex101)
/[\|]+([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g
captures only |Func
Input string is |Func(param1, param2, param32, param54, param293, par13am, param)|
Also, how can I match a repeated capturing group properly? E.g., I have this regex
/\(\(\s*([a-z\_]+){1}(?:\s+\,\s+(\d+)*)*\s*\)\)/gui
And input string is (( string , 1 , 2 )).
Regex101 says "a repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations...". I've tried to follow this tip, but it didn't help me.
Your /[\|]+([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g regex does not match because you did not define a pattern to match the words inside parentheses. You might fix it as \|+([a-z0-9A-Z]+)(?:\(?(\w+(?:\s*,\s*\w+)*)\)?)?\|?, but all the values inside parentheses would be matched into one single group that you would have to split later.
It is not possible to get an arbitrary number of captures with a PCRE regex, as in case of repeated captures only the last captured value is stored in the group buffer.
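The "last iteration only" behavior is not specific to PCRE and is easy to observe. The sketch below uses Python's re module (an assumption, the question is about PCRE) and a simplified, hypothetical variant of the second pattern from the question. It also shows what the regex101 tip amounts to: wrap the whole repeated part in one group and split it in a second step.

```python
import re

SAMPLE = "(( string , 1 , 2 ))"

# Group 2 sits inside a repeated group, so every iteration overwrites it
# and only the last number survives.
REPEATED = re.compile(r"\(\(\s*([a-z_]+)(?:\s*,\s*(\d+))*\s*\)\)", re.I)

# Workaround: capture the whole repeated part once, then pick it apart.
WRAPPED = re.compile(r"\(\(\s*([a-z_]+)((?:\s*,\s*\d+)*)\s*\)\)", re.I)


def last_only(text: str):
    """Value left in the repeated group after matching (last iteration)."""
    m = REPEATED.search(text)
    return m.group(2) if m else None


def all_numbers(text: str):
    """All numbers, recovered from the single wrapping group."""
    m = WRAPPED.search(text)
    return re.findall(r"\d+", m.group(2)) if m else []


print(last_only(SAMPLE))
print(all_numbers(SAMPLE))
```

For the sample input, last_only yields only 2, while all_numbers recovers both 1 and 2.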
What you may do is get multiple matches with preg_match_all capturing the initial delimiter.
So, to match the second string, you may use
(?:\G(?!\A)\s*,\s*|\|+([a-z0-9A-Z]+)\()\K\w+
See the regex demo.
Details:
(?:\G(?!\A)\s*,\s*|\|+([a-z0-9A-Z]+)\() - either the end of the previous match (\G(?!\A)) and a comma enclosed with 0+ whitespaces (\s*,\s*), or 1+ | symbols (\|+), followed with 1+ alphanumeric chars (captured into Group 1, ([a-z0-9A-Z]+)) and a ( symbol (\()
\K - omit the text matched so far
\w+ - 1+ word chars.
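The \G and \K operators used above are available in PCRE but not in Python's built-in re module. If you need the same result in Python (an assumption, the question targets PCRE), a two-step approach is a reasonable substitute: match the call once, then extract the parameters from the captured argument list. This is a different technique from the \G pattern, and the names below are hypothetical.

```python
import re

INPUT = "|Func(param1, param2, param32, param54, param293, par13am, param)|"

# Step 1: function name in group 1, raw argument list in group 2.
CALL = re.compile(r"\|+([a-z0-9A-Z]+)\(([^)]*)\)")


def parse_call(text: str):
    """Return (function_name, [parameters]) or None if no call is found."""
    m = CALL.search(text)
    if m is None:
        return None
    # Step 2: pull each word out of the argument list.
    return m.group(1), re.findall(r"\w+", m.group(2))


print(parse_call(INPUT))
```

This gives the function name and a list of all seven parameters, at the cost of running two regexes instead of one.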

Regular Expression to parse some special cases of C Code

I am trying to check generated C Code with a regular expression.
Actually the lines I need to check always start the same way
R_Wrt_somename(V_var)
or
R_Wrt_othername((int64) (V_var2 * 3))
I already got an expression for the first one, but I am not able to get a fitting expression for the second possibility of function call.
Is there someone able to help me out with this problem? I also would appreciate a regular expression with explanation as I just started working with them.
The expression for the first function type:
R_Wrt_(\w+)\((\s*(V_)[a-zA-Z_0-9\[\] ]+)
Here is a regex that should give you the expected results:
R_Wrt_(\w+)\((?:\((\w+)\)\s*)?\(?(\s*(V_)[a-zA-Z_0-9\[\]* ]+)\)*
See demo
The regex matches:
R_Wrt_ - literal R_Wrt_
(\w+) - 1 or more English letters, digits or underscore (captured into Group 1)
\( - a literal opening parenthesis
(?:\((\w+)\)\s*)? - an optional non-capturing group (so as not to mess the groups) that matches...
\( - a literal opening parenthesis
(\w+) - 1 or more English letters, digits or underscore (captured into Group 2)
\)\s* - a literal closing parenthesis with optional whitespace
\(? - a literal optional opening parenthesis
(\s*(V_)[a-zA-Z_0-9\[\]* ]+) - a capturing group 3 (from your original regex) matching...
\s* - optional whitespace
(V_) - literal V_ (captured into Group 4)
[a-zA-Z_0-9\[\]* ]+ - 1 or more characters from the set
\)* - 0 or more literal closing parentheses.
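To see which group ends up holding what, the regex can be run against both sample lines from the question. The sketch below uses Python's re module, which is an assumption (the question does not say which tool runs the check), and parse_call is a hypothetical helper name.

```python
import re

# Regex from the answer, unchanged.
CALL = re.compile(
    r"R_Wrt_(\w+)\((?:\((\w+)\)\s*)?\(?(\s*(V_)[a-zA-Z_0-9\[\]* ]+)\)*"
)


def parse_call(line: str):
    """Return the four groups (name, cast type, expression, V_ prefix) or None."""
    m = CALL.search(line)
    return m.groups() if m else None


for line in ["R_Wrt_somename(V_var)", "R_Wrt_othername((int64) (V_var2 * 3))"]:
    print(line, "->", parse_call(line))
```

For the first line, Group 2 is empty (None in Python) because there is no cast. For the second, Group 2 holds int64 and Group 3 holds V_var2 * 3.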