Regex to match all https URLs except a certain path - regex

I need a regex that will match all https URLs except for a certain path.
e.g.
Match
https://www.domain.com/blog
https://www.domain.com
Do Not Match
https://www.domain.com/forms/*
This is what I have so far:
<rule name="Redirect from HTTPS to HTTP excluding /forms" enabled="true" stopProcessing="true">
<match url=".*" />
<conditions>
<add input="{URL}" pattern="^https://[^/]+(/(?!(forms/|forms$)).*)?$" />
</conditions>
<action type="Redirect" url="http://{HTTP_HOST}/{R:0}" redirectType="Permanent" />
</rule>
But it doesn't work

The way the redirect module works, you should simply use:
<rule name="Redirect from HTTPS to HTTP excluding /forms" stopProcessing="true">
<match url="^forms/?" negate="true" />
<conditions>
<add input="{HTTPS}" pattern="^ON$" />
</conditions>
<action type="Redirect" url="http://{HTTP_HOST}/{R:0}" />
</rule>
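The rule triggers the redirect to HTTP only if the request was made over HTTPS and the path does not start with forms or forms/ (that is what negate="true" does). The match url is tested against the request path without its leading slash, which is why the pattern starts with ^forms rather than ^/forms.
You could also add a condition requiring the host to be www.example.com, as follows: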
<rule name="Redirect from HTTPS to HTTP excluding /forms" stopProcessing="true">
<match url="^forms/?" negate="true" />
<conditions>
<add input="{HTTPS}" pattern="^ON$" />
<add input="{HTTP_HOST}" pattern="^www\.example\.com$" />
</conditions>
<action type="Redirect" url="http://{HTTP_HOST}/{R:0}" />
</rule>
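The decision this rule makes can be sketched outside IIS. The snippet below is only an approximation in Python: the function name is hypothetical, Python's `re` stands in for the ECMAScript-style engine that URL Rewrite uses, and case-insensitive matching is assumed.

```python
import re

# Hypothetical stand-in for <match url="^forms/?" negate="true" />.
# The input is the request path WITHOUT its leading slash.
FORMS_PREFIX = re.compile(r"^forms/?", re.IGNORECASE)

def should_redirect_to_http(path: str, https: str) -> bool:
    """Return True when the rule would fire for this path and {HTTPS} value."""
    path_is_forms = FORMS_PREFIX.search(path) is not None
    https_is_on = re.search(r"^ON$", https, re.IGNORECASE) is not None
    # negate="true" inverts the path test; the condition must also hold.
    return (not path_is_forms) and https_is_on

if __name__ == "__main__":
    for path in ["blog", "", "forms", "forms/contact", "forms_instructions"]:
        print(repr(path), should_redirect_to_http(path, "on"))
```

Note that `^forms/?` also matches a path such as `forms_instructions`, so that page would stay on HTTPS too. If that is not wanted, a tighter pattern such as `^forms(/|$)` is one option.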

Does this give you the behavior you're looking for?
https?://[^/]+($|/(?!forms)/?.*$)
After the www.domain.com bit, it's looking for either the end of the string, or for a slash and then something that ISN'T forms.
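A quick way to see what this pattern accepts is to run it against the URLs from the question. This is a sketch using Python's `re` as a stand-in for the IIS engine; the `matches` helper and the extra test URLs are assumptions made for illustration.

```python
import re

PATTERN = re.compile(r"https?://[^/]+($|/(?!forms)/?.*$)")

def matches(url: str) -> bool:
    """True when the pattern matches starting at the beginning of the URL."""
    return PATTERN.match(url) is not None

if __name__ == "__main__":
    for url in [
        "https://www.domain.com/blog",
        "https://www.domain.com",
        "https://www.domain.com/forms/contact",
        "https://www.domain.com/forms_instructions",
    ]:
        print(url, matches(url))
```

Because the lookahead is just `(?!forms)`, any path that begins with the letters forms is rejected, including `/forms_instructions`.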

I came up with the following pattern: ^https://[^/]+(/(?!forms/|forms$).*)?$
Explanation:
^ : match the beginning of the string
https:// : match https://
[^/]+ : match anything except a forward slash, one or more times
( : start matching group 1
/ : match /
(?! : negative lookahead
forms/ : check that forms/ does not follow
| : or
forms$ : check that forms does not follow at the end of the string
) : end negative lookahead
.* : match any character, zero or more times
) : end matching group 1
? : make the previous token optional
$ : match the end of the string
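The lookahead shape explained above can be checked against the question's URLs. The sketch below writes it with `forms`, as in the question, and uses Python's `re` as an approximation of the IIS engine; the helper name and sample URLs are illustrative assumptions.

```python
import re

# Lookahead rejects only "forms/" or a path that is exactly "forms".
PATTERN = re.compile(r"^https://[^/]+(/(?!forms/|forms$).*)?$")

def matches(url: str) -> bool:
    """True when the whole URL is accepted by the pattern."""
    return PATTERN.match(url) is not None

if __name__ == "__main__":
    for url in [
        "https://www.domain.com",
        "https://www.domain.com/blog",
        "https://www.domain.com/forms",
        "https://www.domain.com/forms/contact",
        "https://www.domain.com/forms_instructions",
    ]:
        print(url, matches(url))
```

Unlike a bare `(?!forms)`, this version still matches `/forms_instructions`, because neither `forms/` nor `forms` at the end of the string follows the slash.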

I see two issues in the posted pattern http://[^/]+($|/(?!forms)/?.*$):
It fails to redirect URLs such as https://domain.com/forms_instructions, because the lookahead (?!forms) rejects any path that merely begins with forms.
I believe you have http and https reversed between the pattern and the URL. The pattern should have https and the redirect URL http.
Perhaps this will work as you intend:
<rule name="Redirect from HTTPS to HTTP excluding /forms" enabled="true" stopProcessing="true">
<match url="^https://[^/]+(/(?!(forms/|forms$)).*)?$" />
<action type="Redirect" url="http://{HTTP_HOST}{R:1}" redirectType="Permanent" />
</rule>
Edit: I've moved the pattern into the match tag itself, since matching everything with .* and then filtering with an additional condition seems unnecessary. I've also changed the redirect URL to use {R:1}, the part of the input URL captured by the parentheses in the match.

Related

IIS URL Rewrite: Add trailing slash in url only once

I have a rule in IIS that appends a slash at the end of the URL (if there is none). It works fine, but in my case I only need it the first time. I have a reverse proxy on IIS that forwards requests to another server, and with this rule the slash is appended every time.
How can I modify the rule so that it appends the slash only when nothing follows a keyword like 'myapp', i.e. only for a URL like http://myserver/myapp?
<rule name="AddTrailingSlashRule1" stopProcessing="true">
<match url="(.*[^/])$" />
<conditions>
<add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
</conditions>
<action type="Redirect" url="{R:1}/" />
</rule>
I tried changing the regular expression in url="(.*myapp[^/])$" but it does not work.
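As far as I know, [^/] means "one character that is not '/'". So (.*myapp[^/])$ requires one more character after myapp and will not match a URL that ends in myapp.
The right regex is (.*myapp)$. Keep the parentheses so that {R:1} in the action still refers to a captured group.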
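The difference between the two patterns is easy to see on a few sample paths. This is a sketch in Python, used as a stand-in for the IIS regex engine; `append_slash` is a hypothetical helper that mimics the `{R:1}/` action with a capturing version of the suggested pattern.

```python
import re
from typing import Optional

ATTEMPT = re.compile(r"(.*myapp[^/])$")   # needs one extra non-slash character
SUGGESTED = re.compile(r"(.*myapp)$")     # path must end exactly in "myapp"

def append_slash(path: str) -> Optional[str]:
    """Return the redirect target, or None when the rule would not fire."""
    m = SUGGESTED.search(path)
    if m is None:
        return None
    return m.group(1) + "/"   # corresponds to url="{R:1}/"

if __name__ == "__main__":
    for path in ["myapp", "myapp/", "myapp/page"]:
        print(repr(path), append_slash(path))
```

With the capturing form, `myapp` is redirected once to `myapp/`, and anything that already has a slash or further segments after `myapp` is left alone.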

Matching strings not ending with .html

I want to redirect my website users when they hit a REST path without the trailing slash.
Example.
http://mywebsite.my/it/products/brand/name => http://mywebsite.my/it/products/brand/name/
http://mywebsite.my/it/products => http://mywebsite.my/it/products/
http://mywebsite.my => http://mywebsite.my/
http://mywebsite.my/it/products/brand/name/code.html => ???
I don't want the last one to be rewritten: no trailing slash should be added when the URL ends with .html.
I'm working with URL rewrite module of IIS7, and this is my "slash-rule".
<rule name="SLASHFINALE" stopProcessing="true">
<match url="(.*[^/])$" />
<conditions>
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
<add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
</conditions>
<action type="Redirect" redirectType="Permanent" url="{R:1}/" />
</rule>
In other words, if the input url matches that regex (everything not ending with a slash), I rewrite the same URL adding the trailing slash.
So my rule would be the same, but with that little addition: rewrite all URLs, except the ones (already) having the trailing slash or the ones ending with ".html".
I wrote this
(.*(?<!html)[^\/])$
but I can't understand why it's not working.
The JavaScript-flavored (ECMAScript) regex parser used by IIS does not support lookbehind assertions such as (?<!html).
I ended up with this:
<match url="(.*[^\.]...[^\/]$)" />
Enough for me.
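The workaround pattern can be tried on the example paths from the question. The sketch below uses Python's `re` as an approximation of the IIS engine. The second pattern, built on a negative lookahead, is not from the original post; it is a hypothetical alternative that assumes lookahead is available in the rewrite module.

```python
import re

# Pattern from the post: the 5th character from the end must not be a dot,
# and the last character must not be a slash.
POSTED = re.compile(r"(.*[^\.]...[^\/]$)")

# Hypothetical alternative: reject ".html" endings with a negative lookahead.
LOOKAHEAD = re.compile(r"^(?!.*\.html$)(.*[^/])$")

def needs_slash(pattern: re.Pattern, path: str) -> bool:
    """True when the rule would fire and append a trailing slash."""
    return pattern.search(path) is not None

if __name__ == "__main__":
    for path in ["it/products", "it/products/", "it/products/brand/name/code.html", "it"]:
        print(repr(path), needs_slash(POSTED, path), needs_slash(LOOKAHEAD, path))
```

One limitation of the posted pattern is that it needs at least five characters, so a short path such as `it` never gets its slash. The lookahead variant does not have that restriction.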

Rewrite Rule Url To Lowercase Except for Querystring

I have the basic rewrite rule for my web.config file that transforms my urls to lowercase.
Perfect except for one issue: I pass tokens that are case sensitive in my emails that allow users to change their username/emails
How can I make a rewrite rule that makes my url lowercase while having the querystring remain case sensitive?
Example:
<rule name="Convert to lower case" stopProcessing="true">
<match url=".*[A-Z].*" ignoreCase="false" />
<action type="Redirect" url="{ToLower:{R:0}}" redirectType="Permanent" />
</rule>
Makes this Url:
http://resources.championscentre.org/ConfirmChangeEmail/abcDEfGhIJKlmn
Into This:
http://resources.championscentre.org/confirmchangeemail/abcdefghijklmn
But needs to be:
http://resources.championscentre.org/confirmchangeemail/abcDEfGhIJKlmn
The regular expression should be
^[\w:\/\.]*\/
\w is [a-zA-Z0-9_].
^ anchors the beginning.
^[\w:\/\.]* matches any run of word characters, /, : or .
The \/ at the end ensures the match stops at the last / (assuming your URL doesn't end with /).
See the example:
<rule name="Convert to lower case" stopProcessing="true">
<match url="^(.*[A-Z].*)(\/.*)" ignoreCase="false" />
<action type="Redirect" url="{ToLower:{R:1}}{R:2}" redirectType="Permanent" />
</rule>
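What the rule above does to a path can be sketched in Python. This is only an approximation: the function name is hypothetical, `str.lower()` stands in for `{ToLower:...}`, and the input is assumed to be the path without a leading slash or query string.

```python
import re
from typing import Optional

# Group 1: everything before the last slash (must contain an uppercase letter).
# Group 2: the last slash and whatever follows it (left untouched).
RULE = re.compile(r"^(.*[A-Z].*)(/.*)")

def lowercase_all_but_last_segment(path: str) -> Optional[str]:
    """Return the redirect target, or None when the rule would not fire."""
    m = RULE.search(path)
    if m is None:
        return None
    return m.group(1).lower() + m.group(2)

if __name__ == "__main__":
    print(lowercase_all_but_last_segment("ConfirmChangeEmail/abcDEfGhIJKlmn"))
    print(lowercase_all_but_last_segment("confirmchangeemail/abcDEfGhIJKlmn"))
```

Because the first group is greedy, it extends to the last slash, so only the final segment (the token) keeps its case. A path whose leading part is already lowercase does not match, which avoids a redirect loop.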

What does the following regex match?

I am trying to use IIS URL rewrite to take a user to a WWW domain instead of a non-WWW. I came across an article which uses the following regex to match domain names:
^[^\.]+\.[^\.]+$
I can't figure out what sort of domain is being matched with this regex. Here is the complete piece of code:
<rule name="www redirect" enabled="true" stopProcessing="true">
<match url="." />
<conditions>
<add input="{HTTP_HOST}" pattern="^[^\.]+\.[^\.]+$" />
<add input="{HTTPS}" pattern="off" />
</conditions>
<action type="Redirect" url="http://www.{HTTP_HOST}/{R:0}" />
</rule>
<rule name="www redirect https" enabled="true" stopProcessing="true">
<match url="." />
<conditions>
<add input="{HTTP_HOST}" pattern="^[^\.]+\.[^\.]+$" />
<add input="{HTTPS}" pattern="on" />
</conditions>
<action type="Redirect" url="https://www.{HTTP_HOST}/{R:0}" />
</rule>
^ # anchor the pattern to the beginning of the string
[^\.] # negated character class: matches any character except periods
+ # one or more of those characters
\. # matches a literal period
[^\.] # negated character class: matches any character except periods
+ # one or more of those characters
$ # anchor the pattern to the end of the string
The anchors are important to make sure that there is nothing around the domain that is not allowed.
As Tim Pietzker mentioned, the periods do not need to be escaped inside the character class.
To answer your question, the most basic way: what does this match? Any string that contains exactly one ., which is neither the first nor the last character.
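The "exactly one period, not at either end" reading can be confirmed with a few host names. This is a small sketch in Python; the sample hosts and the helper name are assumptions chosen for illustration.

```python
import re

BARE_DOMAIN = re.compile(r"^[^\.]+\.[^\.]+$")

def is_bare_domain(host: str) -> bool:
    """True for hosts with exactly one dot that is neither first nor last."""
    return BARE_DOMAIN.match(host) is not None

if __name__ == "__main__":
    for host in ["example.com", "www.example.com", "localhost", ".com", "example."]:
        print(repr(host), is_bare_domain(host))
```

In the rewrite rules this means a host like `example.com` gets the `www.` prefix, while `www.example.com` (two dots) is left alone. A bare domain under a two-part suffix such as `domain.co.uk` also has two dots, so it would not be redirected either.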

Regex URL match on anything but www

I'm using IIS7 and the URL Rewrite module.
I would like to use regex to match any subdomain apart from www.
So...
frog.domain.co.uk = Match
m.domain.co.uk = Match
anything.domain.co.uk = Match
www.domain.co.uk = No match
This way I can redirect any subdomain that someone types in back to www.
You can use a 301 redirect in .htaccess for this.
This will match what you want:
^(?!www\.).*
This is a negative lookahead for www. at the start of the host. Not sure if you need the trailing .*
Use this rule -- it will redirect to the www.example.com domain if the domain is different:
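A negative lookahead is written `(?!...)`. The similar-looking `(?!=www\.)` instead means "not followed by a literal `=www.`", which is true for almost every host. The sketch below shows the difference using Python's `re`, whose lookahead syntax is the same. The helper name is a hypothetical one.

```python
import re

NOT_WWW = re.compile(r"^(?!www\.).*")       # negative lookahead for "www."
WITH_EQUALS = re.compile(r"^(?!=www\.).*")  # negative lookahead for "=www."

def is_non_www(host: str) -> bool:
    """True when the host does not start with 'www.'."""
    return NOT_WWW.match(host) is not None

if __name__ == "__main__":
    for host in ["frog.domain.co.uk", "m.domain.co.uk", "www.domain.co.uk"]:
        print(host, is_non_www(host), WITH_EQUALS.match(host) is not None)
```

The form with the extra `=` matches `www.domain.co.uk` as well, so it cannot be used to exclude the www host.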
<system.webServer>
<rewrite>
<rules>
<rule name="Force www" stopProcessing="true">
<match url="(.*)$" />
<conditions>
<add input="{HTTP_HOST}" pattern="^www\.example\.com" negate="true" />
</conditions>
<action type="Redirect" url="http://www.example.com/{R:1}" />
</rule>
</rules>
</rewrite>
</system.webServer>
You can optimize it a bit if you do not want to type the domain name (example.com) twice -- but that is a very minor thing, and depending on your circumstances/configuration it may be undesirable.
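The logic of the "Force www" rule can be sketched as a small function. This is a Python approximation, not IIS code: the function name is hypothetical, case-insensitive host matching is assumed, and `path` is the request path without its leading slash.

```python
import re
from typing import Optional

# Mirrors <add input="{HTTP_HOST}" pattern="^www\.example\.com" negate="true" />
CANONICAL_HOST = re.compile(r"^www\.example\.com", re.IGNORECASE)

def force_www(host: str, path: str) -> Optional[str]:
    """Return the redirect target, or None when the host is already canonical."""
    if CANONICAL_HOST.search(host):
        return None                      # negate="true": condition fails, no redirect
    captured = re.search(r"(.*)$", path).group(1)   # <match url="(.*)$" /> gives {R:1}
    return "http://www.example.com/" + captured

if __name__ == "__main__":
    print(force_www("frog.example.com", "about"))
    print(force_www("www.example.com", "about"))
```

Any host other than the canonical one, including the bare `example.com`, ends up at the same path on `www.example.com`.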