I'm trying to work out a URL pattern that will match domain.com/about-us/ and domain.com/home/.
I have a URL regex:
^(?P<page>\w+)/$
but it won't match the URL with the - in it.
I've tried
^(?P<page>\.)/$
^(?P<page>\*)/$
but nothing seems to work.
Try:
^(?P<page>[-\w]+)/$
[-\w] accepts letters, digits (0-9), underscores, and dashes.
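As a quick check, here is a minimal Python sketch of the suggested pattern, run with the standard re module outside any web framework. The helper name page_slug is made up for illustration.

```python
import re

# Pattern from the answer above: one or more word characters or dashes,
# followed by a trailing slash.
PAGE_RE = re.compile(r"^(?P<page>[-\w]+)/$")

def page_slug(path):
    """Return the captured page name, or None if the path does not match."""
    match = PAGE_RE.match(path)
    return match.group("page") if match else None

print(page_slug("about-us/"))  # about-us
print(page_slug("home/"))      # home
print(page_slug("about us/"))  # None (a space is not allowed)
```

The original \w+ version returns no match for about-us/ because \w does not include the dash.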
I'm currently using the regex (?<=\/movie\/)[^\/]+, but it only matches the username in the second URL. I know I could branch in my code (if the URL contains /movie/, use this regex, otherwise use another one), but I'm trying to do this directly in a single regex.
http://example.com:80/username/token/30000
http://example.com:80/movie/username/token/30000.mp4
To complement Tensibai's answer: if the URL has no port, you can use the last dot before the path as the starting point of the regex:
\.[^\/\.]+\/(?:movie\/)?([^\/]+)
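A small Python sketch can confirm how the dot-anchored pattern behaves. The two port-free URLs are made-up variants of the question's examples, and extract_username is a hypothetical helper name.

```python
import re

# Dot-anchored pattern from the answer above; group 1 is the username.
USERNAME_RE = re.compile(r"\.[^\/\.]+\/(?:movie\/)?([^\/]+)")

def extract_username(url):
    """Return the first path segment after the host, skipping an optional movie/."""
    match = USERNAME_RE.search(url)
    return match.group(1) if match else None

print(extract_username("http://example.com/username/token/30000"))
print(extract_username("http://example.com/movie/username/token/30000.mp4"))
```

Both calls should print username, since the search stops at the first dot that is followed by a slash-terminated segment.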
You can use something like this to make movie/ optional and capture the username in a named group:
\d[/](?:movie\/)?(?<username>[^/]+)[/]
It uses \d/ to anchor the start of the match at the last digit of the port, just before the path begins.
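For reference, here is a hedged Python version of the same idea. Python's re module spells named groups as (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly; find_username is a hypothetical helper name.

```python
import re

# Same pattern as above, with the named group written in Python syntax.
USERNAME_RE = re.compile(r"\d/(?:movie/)?(?P<username>[^/]+)/")

def find_username(url):
    """Return the username segment that follows host:port, or None."""
    match = USERNAME_RE.search(url)
    return match.group("username") if match else None

print(find_username("http://example.com:80/username/token/30000"))
print(find_username("http://example.com:80/movie/username/token/30000.mp4"))
```

Both URLs from the question yield username. The pattern relies on a digit directly before the first path slash, so it assumes the port is always present.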
I'm trying to extract part of a URL, ignoring the http(s)://www. prefix.
These URLs come from a form that users fill in, so multiple formats and errors are expected. Here's a sample:
http://www.akashicbooks.com
https://deliciouselsalvador.com
http://altaonline.com
http://https://www.amtb-la.org/
http://https://www.amovacations.com/
http://dornsife.usc.edu/jep
I've tried the REGEXEXTRACT formula in Google Sheets and Airtable:
=REGEXEXTRACT({URL},"[^/]+$")
Unfortunately, I can't make it work for all the cases.
Any ideas on how to make it work?
You can use
^(?:https?://(?:www\.)?)*(.*)
Details:
^ - start of string
(?:https?://(?:www\.)?)* - zero or more occurrences of:
    https?:// - http:// or https://
    (?:www\.)? - an optional www. sequence
(.*) - Group 1: the rest of the string.
With REGEXEXTRACT, the output value is the text captured with Group 1.
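The same pattern can be tried outside the spreadsheet. The sketch below uses Python's re module as a stand-in for REGEXEXTRACT (an assumption, since the two engines are not identical, though this pattern uses only syntax common to both); strip_prefix is a hypothetical helper name.

```python
import re

# Repeated scheme/www prefix, then Group 1 with the rest of the string.
PREFIX_RE = re.compile(r"^(?:https?://(?:www\.)?)*(.*)")

def strip_prefix(url):
    """Return the URL without any leading http(s):// and www. parts."""
    return PREFIX_RE.match(url).group(1)

samples = [
    "http://www.akashicbooks.com",
    "https://deliciouselsalvador.com",
    "http://https://www.amtb-la.org/",
    "http://dornsife.usc.edu/jep",
]
for url in samples:
    print(strip_prefix(url))
```

The * on the prefix group is what handles the doubled http://https:// entries: each repetition consumes one scheme.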
I need to fix my url pattern:
/^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]/
I thought this regex was OK, but it does not work for URLs like https://xx.xx (without www). 'www' should be optional ((www\.)?). Where is the bug?
The problem is not in the (www\.)? part but in the parts after it.
Take a look at the [^\\\/#?] and the [^\.,;:\?\!\#\^\$ -] parts.
Each of them must consume one character. So a match needs https://xx.xx, then one character that is none of \ / # ?, then at least one more character that is none of . , ; : ? ! # ^ $ space or -. The URL only matches once you add such characters, for example https://xx.xx11.
I advise you not to write your own URL regex, because you are missing a lot!
For example, TLDs like .amsterdam are valid. And why are you capturing so many groups?
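The behaviour described above can be reproduced with a short Python sketch. The pattern is the asker's, copied unchanged without the surrounding / delimiters; Python's re module accepts the same syntax here. The helper name matches is made up for illustration.

```python
import re

# The asker's pattern, split over two string literals only for line length.
URL_RE = re.compile(
    r"^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])"
    r"[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]"
)

def matches(url):
    """Return True if the pattern matches at the start of the URL."""
    return URL_RE.search(url) is not None

print(matches("https://xx.xx"))      # False: nothing left for the two trailing classes
print(matches("https://www.xx.xx"))  # True: "www" is read as the domain, ".xx" as the TLD
print(matches("https://xx.xx11"))    # True: "1" and "1" feed the two trailing classes
```

This also shows why the www. variant looked fine: the extra label leaves enough characters after the first .xx for the trailing classes to consume.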
You can visualize your regex as a diagram with https://www.debuggex.com/.
Could you please help me find a regex that matches all YouTube URLs except user account and channel URLs?
I am using this regex:
https?:\/\/((www|m)\.)?(youtube\.com|youtu.be)\/((^channel\/|^user\/){0}(embed\/|(watch)?(\?|\/)?v(=|\/)?))(\S+)?
It works fine, but a YouTube URL in the format "https://youtu.be/abcdefgh" is not matched.
Thanks
Use
https?:\/\/((www|m)\.)?(youtube\.com|youtu\.be)\/(?!user\/|channel\/)(embed\/|(watch)?[?\/]?v[=\/]?)?(\S*)
Note that the (embed\/|(watch)?[?\/]?v[=\/]?)? part is now optional thanks to the trailing ? quantifier, so short youtu.be links match.
The (?!user\/|channel\/) negative lookahead disallows user/ and channel/.
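A minimal Python sketch of the corrected pattern follows. The channel and user paths in the examples are made up, and is_video_url is a hypothetical helper name; the sketch only checks whether a URL matches, not which video ID is captured.

```python
import re

# Pattern from the answer above, unchanged.
YOUTUBE_RE = re.compile(
    r"https?:\/\/((www|m)\.)?(youtube\.com|youtu\.be)\/"
    r"(?!user\/|channel\/)(embed\/|(watch)?[?\/]?v[=\/]?)?(\S*)"
)

def is_video_url(url):
    """Return True if the URL matches and is not a user/ or channel/ page."""
    return YOUTUBE_RE.match(url) is not None

print(is_video_url("https://youtu.be/abcdefgh"))                 # True
print(is_video_url("https://www.youtube.com/watch?v=abcdefgh"))  # True
print(is_video_url("https://www.youtube.com/embed/abcdefgh"))    # True
print(is_video_url("https://www.youtube.com/user/someone"))      # False
print(is_video_url("https://www.youtube.com/channel/UC123"))     # False
```

Because the lookahead sits right after the host's slash, it rejects the URL before the optional path prefix is even tried.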
I am using Google App Engine and can't map this URL: "user/test#example.com"
application = webapp.WSGIApplication([(r'/user/(\w+)', UsersSubPath)], debug=True)
I don't know why this expression doesn't work. Any ideas?
You'll have to widen the scope of your regex. \w only matches [A-Za-z0-9_], which excludes the special characters # and . (dot). For this example you could use:
'/user/([A-Za-z0-9#.]*)'
or
'/user/(\S*)'
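The difference between the three patterns can be checked with plain Python, without App Engine. The sketch assumes the route has to match the whole path, which re.fullmatch imitates here; captured_user is a hypothetical helper name.

```python
import re

PATH = "/user/test#example.com"

def captured_user(pattern, path=PATH):
    """Return group 1 if the pattern matches the whole path, else None."""
    match = re.fullmatch(pattern, path)
    return match.group(1) if match else None

print(captured_user(r"/user/(\w+)"))             # None: \w stops at '#'
print(captured_user(r"/user/([A-Za-z0-9#.]*)"))  # test#example.com
print(captured_user(r"/user/(\S*)"))             # test#example.com
```

The explicit character class is the stricter choice, while \S* accepts any run of non-whitespace characters, including further slashes.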