Regex to match specific domain and its subfolder

I want to match a particular domain and its subdomain, no matter how it's entered. In the following example, I want to match test.com if nothing comes after it (only a slash or a query string) OR if a specific folder follows it, in this case named subfolder. Again, the subfolder could have a / or a query string after it.
Domain                                   Match
test.com                                 match
https://test.com                         match
https://test.com?foo=bar                 match
https://test.com/                        match
https://test.com/?foo=bar                match
https://www.test.com                     match
https://www.test.com/subfolder           match
https://www.test.com/subfolder/          match
https://www.test.com/subfolder/?foo=bar  match
test.com/subfolder                       match
https://www.test.com/foo                 no match
test.com/foo                             no match
https://www.test.com/jason               no match
https://www.test.com/jason?foo=bar       no match
Right now I have the following regex:
^(?:\S+://)?[^/]+/?$
The problem, though, is that it matches ANY domain, which is not what I need. I want to match a specific domain and a specific subfolder.
How is this possible?

You may use this regex:
^(?:https?://)?(?:www\.)?test\.com(?:/subfolder)?/?(?:\?\S*)?$
RegEx Demo
RegEx Details:
^: Start
(?:https?://)?: optionally match http:// or https://
(?:www\.)?: optionally match www.
test\.com: match test.com
(?:/subfolder)?: optionally match /subfolder
/?: optionally match a trailing /
(?:\?\S*)?: optionally match query string starting with ?
$: End
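As a quick sanity check, the pattern above can be run against the samples from the question. This is a minimal sketch that assumes Python's built-in re module behaves like the flavour used in the demo for this pattern; the names PATTERN, SAMPLES and matches are hypothetical and only used for illustration.

```python
import re

# Pattern copied from the answer above.
PATTERN = re.compile(r"^(?:https?://)?(?:www\.)?test\.com(?:/subfolder)?/?(?:\?\S*)?$")

# Sample inputs from the question, paired with the outcome the question asks for.
SAMPLES = {
    "test.com": True,
    "https://test.com": True,
    "https://test.com?foo=bar": True,
    "https://test.com/": True,
    "https://test.com/?foo=bar": True,
    "https://www.test.com": True,
    "https://www.test.com/subfolder": True,
    "https://www.test.com/subfolder/": True,
    "https://www.test.com/subfolder/?foo=bar": True,
    "test.com/subfolder": True,
    "https://www.test.com/foo": False,
    "test.com/foo": False,
    "https://www.test.com/jason": False,
    "https://www.test.com/jason?foo=bar": False,
}


def matches(url: str) -> bool:
    """Return True when the whole input is accepted by PATTERN."""
    return PATTERN.search(url) is not None


# Actual outcome for every sample; it should equal SAMPLES.
results = {url: matches(url) for url in SAMPLES}
for url, ok in results.items():
    print(f"{url:42} {'match' if ok else 'no match'}")
```

To target a different domain or folder, the literal parts test\.com and /subfolder are the only pieces that need to change.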

With your shown samples, please try the following.
^(?:(?:https?:\/\/)(?:www\.)?)?test\.com(?:(?:(?:\/)?(?:\/subfolder\/?)?(?:\/\?\S+\/?)?)?(?:\?\S+)?)?(?:\/)?$
Online demo for above regex
Explanation: Adding detailed explanation for above.
^ ##Start of the match, anchored by the caret.
(?: ##Start of the first non-capturing group.
(?:https?:\/\/) ##Non-capturing group matching http:// or https://.
(?:www\.)? ##Non-capturing group matching www., kept optional.
)? ##Close the first non-capturing group and make it optional.
test\.com ##Match the literal string test.com.
(?: ##Start of a non-capturing group (called 1 for explanation purposes only).
(?: ##Start of one more non-capturing group (called 2 for explanation purposes only).
(?:\/)? ##Optionally match a / in a non-capturing group.
(?:\/subfolder\/?)? ##Match /subfolder followed by an optional /; the whole non-capturing group is optional.
(?:\/\?\S+\/?)? ##Match /? and 1+ non-space characters followed by an optional /; the whole non-capturing group is optional.
)? ##Close non-capturing group 2 and make it optional.
(?: ##Start of a non-capturing group.
\?\S+ ##Match ? followed by 1+ non-space characters.
)? ##Close the non-capturing group and make it optional.
)? ##Close non-capturing group 1 and make it optional.
(?: ##Start of a non-capturing group.
\/ ##Match a single /.
)? ##Close the non-capturing group and make it optional.
$ ##End of the value (match).
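The line-by-line breakdown above maps closely onto a commented, multi-line pattern. The sketch below assumes Python's re module with the re.VERBOSE flag (which ignores whitespace and treats # as a comment marker); ONE_LINE, VERBOSE_PATTERN and SAMPLES are hypothetical names, and the check only confirms that both spellings agree on the question's samples.

```python
import re

# One-line pattern from the answer above (split across two string literals).
ONE_LINE = re.compile(
    r"^(?:(?:https?:\/\/)(?:www\.)?)?test\.com"
    r"(?:(?:(?:\/)?(?:\/subfolder\/?)?(?:\/\?\S+\/?)?)?(?:\?\S+)?)?(?:\/)?$"
)

# Same pattern, laid out with comments.
VERBOSE_PATTERN = re.compile(
    r"""
    ^
    (?:                          # optional scheme plus optional www.
        (?:https?:\/\/)
        (?:www\.)?
    )?
    test\.com                    # the literal domain
    (?:                          # optional path and query part (group 1 above)
        (?:                      # group 2 above
            (?:\/)?
            (?:\/subfolder\/?)?
            (?:\/\?\S+\/?)?
        )?
        (?:\?\S+)?
    )?
    (?:\/)?                      # optional trailing slash
    $
    """,
    re.VERBOSE,
)

SAMPLES = [
    "test.com",
    "https://test.com?foo=bar",
    "https://test.com/?foo=bar",
    "https://www.test.com/subfolder/?foo=bar",
    "test.com/subfolder",
    "https://www.test.com/foo",
    "test.com/foo",
    "https://www.test.com/jason?foo=bar",
]

# True when both spellings give the same verdict for every sample.
agreement = all(
    bool(ONE_LINE.search(s)) == bool(VERBOSE_PATTERN.search(s)) for s in SAMPLES
)
print(agreement)
```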

Related

Pick only the alphabets and not the description from a given string

I am a newbie to Regex and require help with the following:
I have strings like: B - Comp-Band Disk,C - Check Oncoming Private,D - DL Procurement Outer. Is there a regex which I could use to change the string to B,C,D?
You can use
(?:(?:^|,)(\w))
Regex Explanation
(?: Non-capturing group
(?: Non-capturing group
^|, Match start of the string or ,
) Close non-capturing group
( Capturing group
\w Match any word character
) Close group
) Close non-capturing group
See the demo
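For illustration, here is one way the pattern could be applied to produce the B,C,D result. It is a small sketch assuming Python's re module; the variable names are hypothetical.

```python
import re

text = "B - Comp-Band Disk,C - Check Oncoming Private,D - DL Procurement Outer"

# findall returns the content of the single capturing group for each match,
# i.e. the word character right after the start of the string or after a comma.
letters = re.findall(r"(?:(?:^|,)(\w))", text)

result = ",".join(letters)
print(result)
```

Joining the captured characters with commas rebuilds the short form asked for in the question.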

Regex to capture string with multiple optional words

I'm using Overpass API's regex. Unsure which flavour it uses.
I'm wishing to capture these strings:
"Footpath"
"Public Footpath"
"Footpath No. 27001"
"Public Footpath No. 125"
"Footpath #424"
"Public Footpath #5"
This fails to return the first two options.
^(Public)?Footpath (No\. |#)?[0-9]
How do I make the No./# part optional?
I've tried variations on wrapping them in brackets, but to no avail, e.g.
^(Public)?Footpath ((No\. |#)?[0-9])?
I'm afraid I'm out of my depth.
You may use this regex with multiple optional non-capturing groups:
^(?:Public )?Footpath(?: No\.)?(?: #?[0-9]+)?$
RegEx Demo
RegEx Details:
^: Start
(?:Public )?: Match Public followed by a space in an optional non-capturing group
Footpath: Match Footpath
(?: No\.)?: Match a space followed by No. in an optional non-capturing group
(?: #?[0-9]+)?: Match space followed by optional # and 1+ digits in an optional non-capturing group
$: End
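The pattern can be checked against the six names from the question. This sketch uses Python's re module as a stand-in, which is an assumption: the question does not say which flavour Overpass uses, so behaviour there may differ. The names PATTERN, NAMES and REJECTED are hypothetical, and the rejected examples are made up for illustration.

```python
import re

# Pattern copied from the answer above.
PATTERN = re.compile(r"^(?:Public )?Footpath(?: No\.)?(?: #?[0-9]+)?$")

# The six strings the question wants to capture.
NAMES = [
    "Footpath",
    "Public Footpath",
    "Footpath No. 27001",
    "Public Footpath No. 125",
    "Footpath #424",
    "Public Footpath #5",
]

# Made-up strings that should be rejected.
REJECTED = ["Public Bridleway #5", "Footpath 12a", "Private Footpath"]

matched = [name for name in NAMES if PATTERN.search(name)]
wrongly_matched = [name for name in REJECTED if PATTERN.search(name)]

print(matched)
print(wrongly_matched)
```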

regex matching conditional strings

For example, if I have the following strings:
99%89 (should match)
99%? (should match)
?%99 (should match)
?%? (should not match)
?%99%99 (should match)
99%99%99%? (should match)
Essentially, the first or second element can be a ? or a number, but both elements cannot be ?. I tried something like:
[0-9]*|[?](?!\?)[%][0-9]*|[?]
But this does not yield the correct answer. Any help would be appreciated.
With your shown samples, please try the following.
^(?:(?:\?(?:(?:%\d+){1,})?)|(?:(?:(?:\d+%){1,})?\?(?:(?:%\d+){1,})?)|(?:\d+%\d+))$
Online demo for above regex
Explanation: Adding detailed explanation for above.
^(?: ##Match from the start of the value; open the outer non-capturing group.
(?:\? ##Open non-capturing group one (numbered for explanation only) and match a literal ?.
(?:(?:%\d+){1,})? ##Optionally match one or more occurrences of % followed by 1+ digits.
)| ##Close group one; OR
(?: ##Open non-capturing group two.
(?:(?:\d+%){1,})?\? ##Optionally match one or more occurrences of 1+ digits followed by %, then a literal ?.
(?:(?:%\d+){1,})? ##Optionally match one or more occurrences of % followed by 1+ digits.
)| ##Close group two; OR
(?:\d+%\d+) ##In a non-capturing group, match 1+ digits, %, and 1+ digits.
)$ ##Close the outer non-capturing group at the end of the value.
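To see which of the three alternatives accepts each sample, the alternatives can be tried one at a time. This is a rough sketch assuming Python's re module; the branch labels and the helper matching_branch are hypothetical names added for this illustration, and re.fullmatch stands in for the ^...$ anchors of the full pattern.

```python
import re

# The three alternatives of the pattern above, in the order they are tried.
BRANCHES = [
    ("leading ?", r"\?(?:(?:%\d+){1,})?"),
    ("numbers before ?", r"(?:(?:\d+%){1,})?\?(?:(?:%\d+){1,})?"),
    ("number%number", r"\d+%\d+"),
]


def matching_branch(value):
    """Return the label of the first alternative that matches the whole value."""
    for label, pattern in BRANCHES:
        if re.fullmatch(pattern, value):
            return label
    return None


SAMPLES = ["99%89", "99%?", "?%99", "?%?", "?%99%99", "99%99%99%?"]
branch_for = {value: matching_branch(value) for value in SAMPLES}
for value, label in branch_for.items():
    print(f"{value:12} {label}")
```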
Not sure if I am reading the question right, but as you tried using a negative lookahead, you could assert that the string does not consist only of % and/or ? characters:
^(?![%?]+$)[\d?%]+$
Regex demo
Or without a lookahead:
^[%?]*\d[%?\d]*$
Regex demo
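Both of these shorter patterns accept a string made of digits, % and ? only when it contains at least one digit. A minimal sketch, assuming Python's re module, that runs them against the question's samples (WITH_LOOKAHEAD, WITHOUT_LOOKAHEAD and SAMPLES are hypothetical names):

```python
import re

WITH_LOOKAHEAD = re.compile(r"^(?![%?]+$)[\d?%]+$")
WITHOUT_LOOKAHEAD = re.compile(r"^[%?]*\d[%?\d]*$")

# Samples from the question with the outcome it asks for.
SAMPLES = {
    "99%89": True,
    "99%?": True,
    "?%99": True,
    "?%?": False,
    "?%99%99": True,
    "99%99%99%?": True,
}

lookahead_results = {s: bool(WITH_LOOKAHEAD.search(s)) for s in SAMPLES}
plain_results = {s: bool(WITHOUT_LOOKAHEAD.search(s)) for s in SAMPLES}

print(lookahead_results == SAMPLES, plain_results == SAMPLES)
```

Note that neither pattern checks the position of the % separators, so they are looser than the question's description if inputs such as %%9 can occur.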

Validate string # followed by digits but # increases after every occurrence

I have a string that looks like this:
#123##1234###2356####69
It starts with # followed by digits. Every time # appears, the number of # characters increases: 1 the first time, 2 the second time, and so on.
It's similar to this regex, but since I don't know how long the pattern goes on, that regex is not very useful.
^#\d+##\d+###\d+$
I'm using the PCRE regex engine, which allows recursion (?R), conditions (?(1)...), etc.
Is there a regex to validate this pattern?
Valid
#123
#12##235
#1234##12###368
#1234##12###368####22235#####723356
Invalid
##123
#123###456
#123##456##789
I tried ^(?(1)(?|(#\1)|(#))\d+)+$ but it doesn't seem to work at all
You can do this using PCRE conditional sub-pattern matching:
^(?:((?(1)\1)#)\d+)++$
RegEx Demo
RegEx Details:
^: Start
(?:: Start non-capture group
(: Start capture group #1
(?(1)\1): if/then/else directive that means match back-reference \1 only if 1st capture group is available otherwise match null
#: Match an additional #
): End capture group #1
\d+: Match 1+ digits
)++: End non-capture group. Match 1+ of this non-capture group.
$: End
One option could be optionally matching a backreference to group 1 inside group 1 using a possessive quantifier \1?+# adding # on every iteration.
^(?:(\1?+#)\d+)++$
^ Start of string
(?: Non capture group
(\1?+#)\d+ Capture group 1, match an optional possessive backreference to what is already captured in group 1 and add matching a # followed by 1+ digits
)++ Close the non capture group and repeat 1+ times possessively
$ End of string
Regex demo
I think you can use forward-referencing here:
^(?:((?:\1(?!^)|^)#)\d+)+$
See the regex demo.
Details:
^ - start of string
(?:((?:\1(?!^)|^)#)\d+)+ - one or more occurrences of
((?:\1(?!^)|^)#) - Group 1 (the \1 value): start of string or an occurrence of the Group 1 value if it is not at the string start position
\d+ - one or more digits
$ - end of string.
NOTE: This technique does not work in regex flavors that do not support forward referencing, like ECMAScript based flavors (e.g. JavaScript, VBA, C++ std::regex)
Although there are already working answers, and inspired by Wiktor's answer, I came up with this idea:
(?:(^#|#\1)\d+)+$
It is also quite short and effective (and also works in non-PCRE environments).
See the test cases
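The patterns above target PCRE. As a flavour-independent cross-check, the same rule can be verified with two simple regexes plus a counting loop. This is not one of the answers' techniques but a plain alternative, sketched in Python; is_valid, VALID and INVALID are hypothetical names.

```python
import re


def is_valid(value: str) -> bool:
    """Check that the n-th run of '#' has exactly n characters."""
    # The whole string must consist of '#' runs, each followed by digits.
    if not re.fullmatch(r"(?:#+\d+)+", value):
        return False
    # Collect every '#' run in order and compare its length with its position.
    runs = re.findall(r"(#+)\d+", value)
    return all(len(run) == position for position, run in enumerate(runs, start=1))


# Valid and invalid samples from the question.
VALID = [
    "#123",
    "#12##235",
    "#1234##12###368",
    "#1234##12###368####22235#####723356",
]
INVALID = ["##123", "#123###456", "#123##456##789"]

for sample in VALID + INVALID:
    print(f"{sample:40} {is_valid(sample)}")
```

This can be handy for testing a PCRE pattern against generated inputs, since the expected verdict is computed without any backreference.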

Regex doesn't ignore the optional groups

I'm trying to create a regex to capture my URL and its optional groups. The regex works fine if the URL is complete, but the optional groups are not optional at all.
Regex :
\/(.+)(?:\/(.+))(?:(?:\?(.+)))
Urls to catch :
/taxi
/taxi/lyon
/taxi/lyon?coordinates=7542
https://regex101.com/r/NKFkwq/4/
As you can see, the third line is matched, but I'd like the first and second to match too.
I thought ?: would be enough to do that, but I missed something...
Thanks a lot for your help!
Cheers
EDIT and answer
Thanks to the comments for helping me. Here is the working regex (the one I expected): https://regex101.com/r/NKFkwq/8
Indeed, ?: is about not capturing a group, not about making it optional.
Your pattern consists of capturing and non capturing groups. The (?: denotes a non capturing group.
If you want to match all 3 lines, you could match the part starting from the first forward slash and make the part starting from the second forward slash optional.
^/[^\s/]+(?:/[^\s/]+)?$
^ Start of string
/[^\s/]+ Match / and match 1+ times any char except a whitespace or /
(?: Non capturing group
/[^\s/]+ Match / and match 1+ times any char except a whitespace or /
)? Close non capturing group and make it optional
$ End of string
Regex demo
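The difference between a non-capturing group and an optional group can be seen by running the question's pattern and the suggested one side by side. A small sketch assuming Python's re module; ORIGINAL, SUGGESTED and URLS are hypothetical names.

```python
import re

# Pattern from the question: the (?:...) groups are non-capturing but still required.
ORIGINAL = re.compile(r"\/(.+)(?:\/(.+))(?:(?:\?(.+)))")

# Pattern from the answer: the trailing ? after the group makes it optional.
SUGGESTED = re.compile(r"^/[^\s/]+(?:/[^\s/]+)?$")

URLS = ["/taxi", "/taxi/lyon", "/taxi/lyon?coordinates=7542"]

original_hits = [url for url in URLS if ORIGINAL.search(url)]
suggested_hits = [url for url in URLS if SUGGESTED.search(url)]

print(original_hits)
print(suggested_hits)
```

Because [^\s/] also accepts ? and =, the suggested pattern treats the query string as part of the last path segment.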
If you want to have capturing groups but don't want to match /taxi?coordinates=7542, you could nest the groups and make them optional as well.
^/\w+(/\w+(\?\S*)?)?$
^ Start of string
/\w+ Match / and 1+ word chars
( Capture group 1
/\w+ Match / and 1+ word chars
( Capture group 2
\?\S* Match ? and 0+ times a non whitespace char
)? Close group 2
)? Close group 1
$ End of string
Regex demo
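To illustrate what the nested optional groups capture, the sketch below prints the group contents for each URL. It assumes Python's re module, where a group that did not take part in the match is reported as None; NESTED, URLS and captures are hypothetical names.

```python
import re

NESTED = re.compile(r"^/\w+(/\w+(\?\S*)?)?$")

URLS = [
    "/taxi",
    "/taxi/lyon",
    "/taxi/lyon?coordinates=7542",
    "/taxi?coordinates=7542",
]

# Map each URL to its (group 1, group 2) tuple, or None when there is no match.
captures = {}
for url in URLS:
    match = NESTED.search(url)
    captures[url] = match.groups() if match else None

for url, groups in captures.items():
    print(f"{url:30} {groups}")
```

Group 1 contains group 2, so when a query string is present it appears in both captures.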