Regex to match any domain except two domains

In my .htaccess I'm trying to set the document root for all parked domains to a specific path, except for two main domains. So basically I need a regex that matches any domain except two domains.
I found something like this:
^(?!foo$|bar$).*
and this:
(?>[\w-]+)(?<!tea|nuka-cola)
but I cannot get either to work in my situation, because the domain name also contains a dot and a TLD, and I want the regex to cover those too.
Here is my current regex:
^(.*?)\.(com|net)$
I want to build the exception into the (.*?) part.

Use a negative lookbehind:
^(.*?)(?<!(foo)|(bar))\.(com|net)$
Not sure exactly what you want, but this regex will not match hostnames whose name ends in foo or bar, such as foo.com or bar.net (it also rejects names like myfoo.com).
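As a rough sketch of the same idea, here is a variant that uses a negative lookahead anchored at the start, so that only the exact excluded names are rejected. It is written for Python's re module purely for illustration (the question targets Apache, whose regex engine may behave differently), and foo and bar are the placeholder names from the question. The function name is hypothetical.

```python
import re

# foo and bar are the placeholder names from the question; swap in the real main domains.
# The negative lookahead rejects exactly foo.com, foo.net, bar.com and bar.net.
PARKED_DOMAIN = re.compile(r"^(?!(?:foo|bar)\.(?:com|net)$)(.*?)\.(com|net)$")


def is_parked_domain(host: str) -> bool:
    """Return True when host ends in .com/.net and is not one of the excluded domains."""
    return PARKED_DOMAIN.match(host) is not None


if __name__ == "__main__":
    for host in ("example.com", "foo.com", "bar.net", "myfoo.com"):
        print(host, is_parked_domain(host))
```

Unlike the lookbehind version, this one still accepts a name such as myfoo.com, because the lookahead compares the whole hostname rather than only its ending.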

Related

Match url with uppercase letters except if it contains a filename like .jpg,.css,.js etc

I need a regular expression that matches URLs containing uppercase letters, but does not match when the URL contains a filename such as .jpg, .css, .js etc.
I want to redirect all uppercase URLs to lowercase, but only when they do not point to a file resource.

Try using a regex visualizer like regexpal.com.
Here's an example of a regular expression that approximates what you're trying to do:
\w+\.(?:com|net)(?:/[A-Z]+){1,}[/]?(?:\.jpg|\.png|\.JPG|\.PNG){0}$
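\w+\.(?:com|net) matches a domain of the form word.com or word.net. (You'll need to add other TLDs or improve this if you want to match subdomains as well.)
(?:/[A-Z]+){1,}[/]? matches all-caps directories like /FOO/BAR/ with an optional trailing slash.
(?:\.jpg|\.png|\.JPG|\.PNG){0}$ is meant to allow exactly zero of the extensions listed. Note that {0} makes the group match nothing at all, so the extensions are really kept out by the preceding part, which accepts only uppercase letters and slashes before the end of the string. To exclude extensions explicitly, a negative lookahead such as (?!.*\.(?:jpg|png)$) is the usual tool, and you would extend its list of extensions as needed.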
But perhaps rethink your routing; it's better form to keep all assets in devoted directories on your server, so that you can simply pass any request to mysite.com/assets/ along unchanged while handling other URLs.
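If the check can be done in application code rather than in a single regex, splitting it into two tests is often easier to read. The following is a minimal Python sketch under that assumption; the extension list and the function name are illustrative, not taken from the question.

```python
import re

# Illustrative list of asset extensions; extend it to match your site.
ASSET_SUFFIX = re.compile(r"\.(?:jpg|png|css|js)$", re.IGNORECASE)
HAS_UPPERCASE = re.compile(r"[A-Z]")


def needs_lowercase_redirect(path: str) -> bool:
    """True when the path has an uppercase letter and does not end in an asset extension."""
    return bool(HAS_UPPERCASE.search(path)) and not ASSET_SUFFIX.search(path)


if __name__ == "__main__":
    for path in ("/FOO/BAR/", "/Images/Logo.JPG", "/foo/bar"):
        print(path, needs_lowercase_redirect(path))
```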

Regex remove www from URL

I hope someone can help, this is driving me crazy!
I am attempting to modify Logstash Grok filters to parse a domain name.
Currently the regex is:
\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
and it correctly separates the domain. However, I need to add an additional check to remove the leading www.
This is what I have come up with so far:
\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(^(?<!www$).*$?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
I can only seem to keep the www. part of the domain, and not the domain itself.
Example of what I need to achieve:
www.stackoverflow.com should be stackoverflow.com.
I need to remove specifically www. and not the entire subdomain.
Thank you in advance!
UPDATE
Example inputs to expected outputs (using this post as an example):
In its current state:
https://stackoverflow.com/questions/37070358/ returns www.stackoverflow.com
What I need is for it to return stackoverflow.com

You can add the negative lookaheads (?!www\.) and (?!http:\/\/www\.) right after the first \b, so that a match cannot start at www. or http://www.:
\b(?!www\.)(?!http:\/\/www\.)(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(?:\.?|\b)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
You may add more negative lookaheads to exclude https:// or ftp/ftps links.
ALTERNATIVE:
\b(?!(?:https?|ftps?):\/\/)(?!www\.)(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(?:\.?|\b)
The (?!(?:https?|ftps?):\/\/) and (?!www\.) lookaheads will just let you skip the protocol and www parts of the URLs.
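To see how the lookaheads shift the start of the match, here is a small sketch that runs the alternative pattern through Python's re module. This is only an approximation of the Grok setup in the question, since Logstash uses a different regex engine, and the helper name is hypothetical.

```python
import re

# The ALTERNATIVE pattern from the answer, split across lines for readability.
DOMAIN = re.compile(
    r"\b(?!(?:https?|ftps?):\/\/)(?!www\.)"
    r"(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})"
    r"(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(?:\.?|\b)"
)


def extract_domain(text: str):
    """Return the first domain-like match, skipping a protocol and a leading www."""
    match = DOMAIN.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(extract_domain("www.stackoverflow.com"))
    print(extract_domain("https://www.stackoverflow.com/questions/37070358/"))
```

The lookaheads consume nothing. They only forbid a match from starting at the protocol or at www., so the engine moves on to the next word boundary where the pattern can succeed.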

This will match the part after www. if the URL starts with www.:
(?!www\.)\b(?:(?!-)[0-9A-Za-z-]{1,63})(?:\.(?:(?!-)[0-9A-Za-z-]{1,63}))*(\.?|\b)
I also simplified the rest of your regex by using a negative lookahead for - at the start of each label.

Regex to match all except URLs that contain specific directory?

I need a regular expression for IIS URL Rewrite that will process the rule only when the expression matches any bit of the URL EXCEPT a specific sub-root directory.
Example:
www.mysite.com/wordpress - process rule on any URL that starts with /wordpress after the domain name
www.mysite.com/inventory - do not process rule on any URL that starts with /inventory after the domain name
Tried .*(?<!^\/inventory\/.*) but it still matches the entire string.

You need a lookahead rather than lookbehind. Something like this I think:
^([^/]*/){1}(?!inventory\b)
Change {1} to {2} when the excluded directory sits one level deeper, and so on.
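A quick way to check the lookahead's behaviour is to try it on the example URLs. The sketch below uses Python's re module and assumes the input string has the form shown in the question (host, slash, path); the string that IIS URL Rewrite actually passes to a rule may look different. The function name is hypothetical.

```python
import re

# Pattern from the answer: skip one "segment/" prefix, then refuse "inventory".
NOT_INVENTORY = re.compile(r"^([^/]*/){1}(?!inventory\b)")


def rule_applies(url: str) -> bool:
    """True when the first directory after the host is not 'inventory'."""
    return NOT_INVENTORY.search(url) is not None


if __name__ == "__main__":
    for url in ("www.mysite.com/wordpress/hello", "www.mysite.com/inventory/cars"):
        print(url, rule_applies(url))
```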

Regex needed to match a domain name for django view

I'm trying to match a url with a domain like:
Testing.com
testing.com
Testing.net
testing.net
Testing.org
testing.org
and other extensions as well.
I'm trying to formulate a regex to use in a django view like:
(r'^Account/Testing/d=([a-z]{1,50})$', TestApp),
I tried ^[A-za-z]{2,50}$ but that doesn't match a domain with capital letter in the beginning
Any help?
Thank you!

You can use this:
/^(?:http(?:s)?:\/\/)?(?:w{3})\.([a-z_0-9-]+\.\w{2,3}(?:\.\w{2})?)/i
It will match links like these:
http://www.site.com
https://www.site.com
http://www.site.co.uk
https://www.site.co.uk
http://www.site.com.br
https://www.site.com.br
http://www.site-site.com.br
https://www.site-site.com
http://www.site-site.co.uk
https://www.site-site.co.uk
www.site-site.com
www.site-site.co.uk
www.site-site.com.br
www.site.com
and a lot of other variations.
Even if the URL has a path, as in
www.site.com/news
the capture group will contain only "site.com".
The /i modifier makes the whole pattern case-insensitive.
If only the domain name should be matched case-insensitively, use an inline modifier instead:
/^(?:http(?:s)?:\/\/)?(?:w{3})\.((?i:[a-z_0-9-])+\.\w{2,3}(?:\.\w{2})?)/
Here (?i:[a-z_0-9-]) applies case-insensitivity to the domain name only.
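For illustration, here is roughly how the first pattern behaves when compiled with Python's re module, where re.IGNORECASE plays the role of the /i modifier. The helper name is hypothetical, and as written the pattern requires the www. prefix.

```python
import re

# First pattern from the answer; group 1 holds the domain without "www.".
WWW_DOMAIN = re.compile(
    r"^(?:http(?:s)?:\/\/)?(?:w{3})\.([a-z_0-9-]+\.\w{2,3}(?:\.\w{2})?)",
    re.IGNORECASE,
)


def bare_domain(url: str):
    """Return the captured domain, or None when the URL does not start with www."""
    match = WWW_DOMAIN.match(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    for url in ("http://www.site.co.uk", "www.site.com/news", "site.com"):
        print(url, bare_domain(url))
```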

Fortunately, this wasn't that bad after all. This is one way to match a domain with varying extensions:
^[A-Za-z]{2,50}\.[a-z]{1,3}$
It matches .com, .org, .net, etc.
If you have a domain like me2.com, it's better to use this:
(^[A-Za-z0-9]{2,50}\.[a-z]{1,3})$
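A minimal sketch of checking such names in Python follows, assuming the letter range is written A-Za-z (the range A-z also admits characters such as [ and _) and the dot is escaped so that it matches only a literal dot. The function name is hypothetical.

```python
import re

# Letters and digits, a literal dot, then a short lowercase extension.
DOMAIN_NAME = re.compile(r"^[A-Za-z0-9]{2,50}\.[a-z]{1,3}$")


def looks_like_domain(name: str) -> bool:
    """True for names such as Testing.com, testing.net or me2.com."""
    return DOMAIN_NAME.match(name) is not None


if __name__ == "__main__":
    for name in ("Testing.com", "me2.com", "testingXcom", "te_st.com"):
        print(name, looks_like_domain(name))
```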

Regex domain validation

The following code
/^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}(:[0-9]{1,5})?(\/.*)?$/ix
validates all types of domains.
I would like to validate only one domain or subdomain (for example .cu.cc or .co.cc).

You can just add this to the end of your domain regex:
(?<=\.cu\.cc)$
It's a positive lookbehind: it requires the text immediately before the end of the string to be .cu.cc.

The final \.[a-z]{2,6} is what matches a top-level domain. Change it to whatever specific TLD you want to match.
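As a sketch of that substitution, the version below replaces the generic TLD part with the two suffixes mentioned in the question and runs it through Python's re module. The function name is hypothetical, and re.IGNORECASE stands in for the i modifier.

```python
import re

# Same structure as the original pattern, with \.[a-z]{2,6} narrowed to .cu.cc / .co.cc.
CC_URL = re.compile(
    r"^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*"
    r"\.(?:cu|co)\.cc(:[0-9]{1,5})?(\/.*)?$",
    re.IGNORECASE,
)


def is_cc_url(url: str) -> bool:
    """True when the URL's host ends in .cu.cc or .co.cc."""
    return CC_URL.match(url) is not None


if __name__ == "__main__":
    for url in ("http://example.cu.cc", "https://shop.example.co.cc:8080/path", "http://example.com"):
        print(url, is_cc_url(url))
```

Because the suffix sits before the optional port and path, this form still accepts URLs that carry a path, which a lookbehind placed at the very end of the string would not.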
