How to utilize a URL regex match inside of the last grouping?

This is my regex
(?:!\(([elementType | inlineType]+),([\w\d\s\#]+),([\w\d\s]+),([\w\d\s\-]+),([\s\w\d\.\:\/\-\_\?\=]+)\))
This regex has 5 groups separated by commas, and the last one needs to match any URL. I'm having some difficulty with it: I can't seem to paste in a URL matcher that I find on the internet. It seems to match only one thing at a time.

Some notes about your pattern:
\w also matches \d, so listing \d next to \w is redundant
[elementType | inlineType]+ is a character class: it matches 1+ of the listed characters in any order, not one of the two words. As you also want to match block, you can use \w for that
You don't have to escape these characters in a character class: . : _ ? =
Using 5 capture groups, where the 5th one matches a URL-like pattern:
!\(([\w#-]+),\s*([\w#-]+),\s*([\w-]+),\s*([\w-]+),\s*(https?:\/\/\S+)\)
Explanation
! Match literally
\( Match (
([\w#-]+),\s*([\w#-]+),\s*([\w-]+),\s*([\w-]+),\s* 4 capture groups, each followed by a comma and optional whitespace, where a character class such as [\w#-]+ matches 1+ occurrences of the listed characters
(https?:\/\/\S+) Capture group 5, match the protocol with an optional s, then :// and 1+ non whitespace chars
\) Match )
Regex demo
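As a rough sketch of how the five groups come out, here is the pattern above run through Python's re module. Python is only an assumption, since the question names no language, and the sample input string is made up for illustration.

```python
import re

# Pattern from the answer above; "/" needs no escaping in Python.
PATTERN = re.compile(
    r"!\(([\w#-]+),\s*([\w#-]+),\s*([\w-]+),\s*([\w-]+),\s*(https?://\S+)\)"
)

# Hypothetical input, not taken from the question.
sample = "!(elementType, #header, main, top-nav, https://example.com/path?x=1)"

match = PATTERN.search(sample)
groups = match.groups() if match else ()
print(groups)
```

In this sample \S+ is greedy, so it first consumes the closing parenthesis as well and then backtracks one character so that \) can match.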

Related

Regex match specific strings

I want to capture all the strings from multi-line data. Below is the expected result, and here is my pattern, which does not work.
Pattern: ^XYZ/[0-9|ALL|P] I’m lost with this part. Can anyone help?
Result
XYZ/1
XYZ/1,2-5
XYZ/5,7,8-9
XYZ/2-4,6-8,9
XYZ/ALL
XYZ/P1
XYZ/P2,3
XYZ/P4,5-7
XYZ/P1-4,5-7,8-9
Changed to
XYZ/1
XYZ/1,2-5
XYZ/5,7,8-9
XYZ/2-4,6-8,9
XYZ/A12345 after the slash limited to 6 alphanumeric chars
XYZ/LH-1234567890 after the /LH- limited to 10 numeric chars

The pattern could be:
^XYZ\/(?:ALL|P?[0-9]+(?:-[0-9]+)?(?:,[0-9]+(?:-[0-9]+)?)*)$
The pattern in parts matches:
^ Start of string
XYZ\/ Match XYZ/ (You don't have to escape the / depending on the pattern delimiters)
(?: Outer non capture group for the alternatives
ALL Match literally
| Or
P? Match an optional P
[0-9]+(?:-[0-9]+)? Match 1+ digits with an optional - and 1+ digits
(?: Non capture group to match as a whole
,[0-9]+(?:-[0-9]+)? Match , and 1+ digits and an optional - and 1+ digits
)* Close the non capture group and optionally repeat it
) Close the outer non capture group
$ End of string
Regex demo
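A minimal sketch of this pattern in Python (an assumed language, as none is tagged), using the multiline flag so that ^ and $ anchor at each line. The three lines at the end of the data are extra inputs added here to show what gets rejected; XYZ/A12345 comes from the "Changed to" list, which this pattern does not cover.

```python
import re

PATTERN = re.compile(
    r"^XYZ/(?:ALL|P?[0-9]+(?:-[0-9]+)?(?:,[0-9]+(?:-[0-9]+)?)*)$",
    re.MULTILINE,
)

expected = [
    "XYZ/1", "XYZ/1,2-5", "XYZ/5,7,8-9", "XYZ/2-4,6-8,9", "XYZ/ALL",
    "XYZ/P1", "XYZ/P2,3", "XYZ/P4,5-7", "XYZ/P1-4,5-7,8-9",
]
# Lines that should NOT match: bare P, trailing comma, changed format.
rejected = ["XYZ/P", "XYZ/1,", "XYZ/A12345"]

data = "\n".join(expected + rejected)
# No capture groups in the pattern, so findall returns the full matches.
matches = PATTERN.findall(data)
print(matches)
```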

You can use this regex pattern to match those lines
^XYZ\/(?:P|ALL|[0-9])[0-9,-]*$
Use the global g and multiline m flags.
Btw, [P|ALL] doesn't match the word "ALL".
It only matches a single character that's a P or A or L or |.
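The difference between the character class and the alternation can be checked with a short Python sketch (Python is an assumption; findall with re.MULTILINE plays the role of the g and m flags). The last two input lines are made up to show that this shorter pattern is more permissive about the order of digits, commas and hyphens.

```python
import re

# A character class matches exactly one character out of the set.
class_hits = re.findall(r"[P|ALL]", "ALL|P")
class_full = re.fullmatch(r"[P|ALL]", "ALL")    # None: "ALL" is 3 chars
group_full = re.fullmatch(r"(?:P|ALL)", "ALL")  # alternation matches the word

LOOSE = re.compile(r"^XYZ/(?:P|ALL|[0-9])[0-9,-]*$", re.MULTILINE)
data = "XYZ/1,2-5\nXYZ/ALL\nXYZ/P4,5-7\nXYZ/1,,--\nXYZ/Q1"
loose_hits = LOOSE.findall(data)

print(class_hits)
print(loose_hits)
```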

What is the proper regex for capturing everything after "String" and between two delimiters ('=' and non-alphanumeric)

Details={
AwsEc2SecurityGroup={GroupName=m.com-rds, OwnerId=123, VpcId=vpc-123,
IpPermissions=[{FromPort=3306, ToPort=3306, IpProtocol=tcp, IpRanges=[{CidrIp=1.1.1.1/32}, {CidrIp=2.2.2.2/32}, {CidrIp=0.0.0.0/0}, {CidrIp=3.3.3.3/32}],
UserIdGroupPairs=[{UserId=123, GroupId=sg-123abc}]}], IpPermissionsEgress=[{IpProtocol=-1, IpRanges=[{CidrIp=0.0.0.0/0}]}], GroupId=sg-123abc}},
Region=us-east-1, Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc}]
}
I want to capture exactly arn:aws:ec2:us-east-1:123:security-group/sg-123abc in this example. Generically, I want to capture the value of Id regardless of placement. My current solution is /Details={.*Id=(.*\w)/, but this only works if it's the last object in the data. How can I take into account the following potential scenario:
Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]

Your pattern uses .* twice. The first .* matches to the end of the line or string (depending on whether the dot matches a newline) and then backtracks to the last position where this part of the pattern, Id=(.*\w), can match. That is why it only finds the last occurrence.
If you want to use a capture group, you can make the format and the allowed characters a bit more specific:
\bId=(\w+(?:[:\/-]\w+)+)
The pattern in parts
\b A word boundary to prevent a partial word match
Id= Match literally
( Capture group 1
\w+ Match 1+ word chars
(?:[:\/-]\w+)+ Repeat 1+ times either : / - and 1+ word chars
) Close group 1
Regex demo
Or if you know that it starts with Id=arn:
\bId=(arn:[\w:\/-]+)
Regex demo
Note that you only have to escape the / as \/ when the delimiters of the regex are forward slashes, but there is no language tagged.
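A small sketch in Python (an assumption, since no language is tagged) showing that the word boundary skips OwnerId=, VpcId= and GroupId=, and that the capture stops at the first character outside the allowed set. Both input strings are shortened versions of the data in the question.

```python
import re

ID_PATTERN = re.compile(r"\bId=(\w+(?:[:/-]\w+)+)")

# Id is the last key.
text_last = (
    "Details={AwsEc2SecurityGroup={GroupName=m.com-rds, OwnerId=123, "
    "VpcId=vpc-123, GroupId=sg-123abc}}, Region=us-east-1, "
    "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc}]}"
)
# Id is followed by another key.
text_middle = (
    "Region=us-east-1, "
    "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]"
)

id_last = ID_PATTERN.search(text_last).group(1)
id_middle = ID_PATTERN.search(text_middle).group(1)
print(id_last)
print(id_middle)
```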

You can use look-behind to check that there is the Id= prefix, and then match anything that is not a space, comma or closing brace:
(?<=\bId=)[^,}\s]*
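The look-behind variant could be tried like this in Python (assumed language). Python's re requires a fixed-width look-behind, which \bId= satisfies. Here the value is the whole match, so no capture group is needed; the input is a shortened version of the question's data.

```python
import re

LOOKBEHIND = re.compile(r"(?<=\bId=)[^,}\s]*")

text = (
    "OwnerId=123, GroupId=sg-123abc, "
    "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]"
)

found = LOOKBEHIND.search(text)
value = found.group(0) if found else None
print(value)
```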

regex pattern to highlight all the matches for the punctuation in VBA

I need an expression that allows only the pattern below:
end word(dot)(space)start word [e.g. end. start]
In other words:
no space before a colon, semicolon or dot
one space after a colon, semicolon or dot
All other patterns need to be captured so that they can be identified, such as
end.start || end . start || end .start
I used
"([\s{0,}][\.]|[\.][\s{2,}a-z]|[\.][\s{0,}a-z])"
but it is not working as I expected. I need your support, please.

You could match 1+ word characters using \w+ and match either a colon or a semicolon using a character class [;:] between optional spaces (a space followed by ?).
After that, match again 1+ word characters.
\w+ ?[;:] ?\w+
Regex demo
To match the dot followed by a single space variant, you don't need a character class but you could match the dot only using \.
\w+\. \w+
Regex demo
Edit
To highlight all the matches for the punctuations:
(?: [.:;]|[.:;] {2,}|(?<=\S)[;:.](?=\S))
Explanation
(?: Non capture group
(space)[.:;] Match a space followed by either . : or ;
| Or
[.:;] {2,} Match one of the listed followed by 2 or more spaces
| Or
(?<=\S)[;:.](?=\S) Match one of the listed surrounded by non whitespace chars
) Close group
Regex demo
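To see which spacing variants this pattern flags, here is a sketch in Python rather than VBA (an assumption made for testability). As far as I know, the VBScript RegExp engine commonly used from VBA does not support look-behind, so the (?<=\S) part would likely need adapting there. The sample strings follow the end/start examples from the question, plus two made up for illustration.

```python
import re

BAD_PUNCT = re.compile(r"(?: [.:;]|[.:;] {2,}|(?<=\S)[;:.](?=\S))")

samples = [
    "end. start",   # correct spacing, nothing flagged
    "end.start",    # no space after the dot
    "end . start",  # space before the dot
    "end .start",   # space before the dot
    "end.  start",  # two spaces after the dot (made-up sample)
    "key:value",    # no space after the colon (made-up sample)
]

# Map each sample to the list of offending substrings found in it.
flagged = {s: BAD_PUNCT.findall(s) for s in samples}
for sample, hits in flagged.items():
    print(repr(sample), hits)
```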

Exclude curly brace matches

I have the following strings:
logger.debug('123', 123)
logger.debug(`123`,123)
logger.debug('1bc','test')
logger.debug('1bc', `test`)
logger.debug('1bc', test)
logger.debug('1bc', {})
logger.debug('1bc',{})
logger.debug('1bc',{test})
logger.debug('1bc',{ test })
logger.debug('1bc',{ test})
logger.debug('1bc',{test })
Instead of debug there can be other calls like warn, fatal etc.
All quote pairs can be "", '' or ``.
I need to create a regular expression which matches cases 1 - 5 but not 6 - 11.
That's what I've come up with:
logger.*\(['`].*['`],\s*.([^{.*}])
This also matches 8 - 11, so I suspect this part is wrong: ([^{.*}]), but I don't understand why.

You can try this
logger\.[^(]+\((?:"(?:\\"|[^"])*"|'(?:\\'|[^'])*'|`(?:\\`|[^`])*`),[^{}]*?\)
Regex Demo
P.S. This pattern can be shortened if we are sure the quotes are never mismatched and there are no escaped quotes inside the strings.
If there are no escaped quotes:
logger\.[^(]+\((?:"[^"]*"|'[^']*'|`[^`]*`),[^{}]*?\)
If there are no quotes inside the strings, i.e. no strings like "mr's jhon:
logger\.[^(]+\(([`"'])[^"'`]*\1,[^{}]*?\)

If there are no quotes between the quoted parts, you could make use of a capturing group to match one of the quote types (['`"]) and use a backreference \1 to match the closing quote type.
The \r\n in the negated character class is to not cross newline boundaries.
The pattern will match either the quoted parts or 1+ times a word character for the first part.
The second part matches any char except { or } or ) using a negated character class.
logger\.[^(\r\n]+\((?:(['`"])[^'`"]+\1|\w+),[^{})\r\n]+\)
That will match
logger\. Match logger.
[^(\r\n]+ Match 1+ times any char except ( or a newline
\( Match (
(?: Non capture group
(['`"]) Capture group 1
[^'`"]+\1 Match 1+ times any char except the quote types, then the backreference \1 matches the same quote type that was captured
| or
\w+ Match 1+ word chars
), Close non capture group and match ,
[^{})\r\n]+ Match 1+ times any char except { } ) or a newline
\) Match )
Regex demo
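The eleven lines from the question can be run against this pattern with a short Python sketch (Python is an assumption; the strings themselves look like JavaScript). It collects the 1-based numbers of the lines that match.

```python
import re

# Triple-quoted raw string so that ' ` and " can all appear unescaped.
PATTERN = re.compile(
    r"""logger\.[^(\r\n]+\((?:(['`"])[^'`"]+\1|\w+),[^{})\r\n]+\)"""
)

calls = """logger.debug('123', 123)
logger.debug(`123`,123)
logger.debug('1bc','test')
logger.debug('1bc', `test`)
logger.debug('1bc', test)
logger.debug('1bc', {})
logger.debug('1bc',{})
logger.debug('1bc',{test})
logger.debug('1bc',{ test })
logger.debug('1bc',{ test})
logger.debug('1bc',{test })""".splitlines()

# Case numbers (as in the question) whose line contains a match.
matched_cases = [n for n, line in enumerate(calls, 1) if PATTERN.search(line)]
print(matched_cases)
```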

Regex which does not allow leading space and any character from (^\\/:*?"<>|)

I have a regex like "^[a-zA-Z]:(\\\\+[^\\/:*?"<>|]+)*([\\\\]+)?$" which is responsible for file path validation.
It successfully validates paths like C:\Users\data and C:\\Users\\data
I want the string which comes after "C:\" to not start with space and not have (^\\/:*?"<>|) characters in it.

You could match the start of the string up to the colon and use your negated character class right after it to reject the unwanted characters. You could add a space or \s to that character class to reject a leading space as well.
You might also use a capturing group and a backreference to remember which backslash variant is used, \\ or \, so that the same variant has to be used for the rest of the path.
After that you could use a repeating pattern and specify which characters to allow for the rest of the string.
^[a-zA-Z]:(\\+)(?:[^\\/:*?"<>|\s][\w&]+(?: [\w&]+)*(?:\1[a-zA-Z&]+)*)?$
Regex demo
That will match:
^ Start of the string
[a-zA-Z]: Match a single char a-zA-Z and a colon
(\\+) Capture in a group 1+ times a backslash to reference it
(?: Non capturing group
[^\\/:*?"<>|\s] Negated character class to match a single char that is not listed (Added \s but you could also just use a space)
[\w&]+(?: [\w&]+)* Match 1+ times a word char and repeat 0+ times matching a space and 1+ times a word char. Note that you can extend the character class to match what you want.
(?: Non capturing group
\1[a-zA-Z&]+ Match backreference to what is captured in group 1 followed by 1+ times a-zA-Z (You can add to the character class what you would like to match as well)
)* Close non capturing group and repeat it 0+ times
)? Close non capturing group and make it optional
$ End of the string
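A hedged sketch of how this pattern behaves in Python (an assumed language; the original regex string looks like it is escaped for Java or C#). All test paths except the first two are made up. Note that, as written, the first folder name needs at least two characters and later folder names allow only letters and &, so the character classes may need extending for real paths.

```python
import re

# Single-quoted raw string because the pattern contains a double quote.
PATTERN = re.compile(
    r'^[a-zA-Z]:(\\+)(?:[^\\/:*?"<>|\s][\w&]+(?: [\w&]+)*(?:\1[a-zA-Z&]+)*)?$'
)

paths = [
    r"C:\Users\data",     # single backslashes
    r"C:\\Users\\data",   # doubled backslashes, used consistently
    r"C:\My Docs\data",   # space inside the first folder name
    r"C:\ Users",         # leading space after the backslash
    r"C:\Users\da:ta",    # forbidden character
    r"C:\\Users\data",    # mixed backslash variants
    r"C:\a",              # one-character folder name
]

results = {p: bool(PATTERN.match(p)) for p in paths}
for path, ok in results.items():
    print(path, ok)
```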

As said here
Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u)
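So, in regex flavors that support conditionals with a lookaround as the condition (PCRE, for example), you can combine it with an if-then-else construct like (?(?!your_pattern_in_regex)match_then|match_else)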
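The q(?!u) point from the quote can be illustrated with a small Python sketch (assumed language, made-up sample text). Python's re only supports conditionals that test a capture group, so the if-then-else form with a lookahead as the condition is not shown here.

```python
import re

text = "Iraq quit qat"

# Negative lookahead: match a q that is not followed by u, consuming only q.
lookahead_positions = [m.start() for m in re.finditer(r"q(?!u)", text)]

# Negated class: also consumes the following character, so it needs one.
negated_class_hits = re.findall(r"q[^u]", text)

# At the end of the string only the lookahead version can match.
end_lookahead = re.search(r"q(?!u)", "Iraq")
end_negated = re.search(r"q[^u]", "Iraq")

print(lookahead_positions, negated_class_hits)
```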
