Matching strings not ending with .html - regex

I want to redirect my website users when they hit a REST path without the trailing slash.
Examples:
http://mywebsite.my/it/products/brand/name => http://mywebsite.my/it/products/brand/name/
http://mywebsite.my/it/products => http://mywebsite.my/it/products/
http://mywebsite.my => http://mywebsite.my/
http://mywebsite.my/it/products/brand/name/code.html => ???
I don't want the last one to be rewritten: there should be no trailing slash when the URL ends with .html.
I'm working with the URL Rewrite module of IIS 7, and this is my "slash rule":
<rule name="SLASHFINALE" stopProcessing="true">
<match url="(.*[^/])$" />
<conditions>
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
<add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
</conditions>
<action type="Redirect" redirectType="Permanent" url="{R:1}/" />
</rule>
In other words, if the input url matches that regex (everything not ending with a slash), I rewrite the same URL adding the trailing slash.
So my new rule should be the same, with one small addition: rewrite all URLs except the ones that already have the trailing slash and the ones ending with ".html".
I wrote this
(.*(?<!html)[^\/])$
but I can't understand why it's not working.

The regex engine used by IIS URL Rewrite follows ECMAScript (JavaScript-flavored) syntax, which does not support lookbehind assertions such as (?<!html).
I ended up with this:
<match url="(.*[^\.]...[^\/]$)" />
Enough for me.
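To see what that final pattern accepts, here is a small sketch that runs it through Python's `re` module. It is only an approximation: IIS uses its own ECMAScript-style engine, the sample paths come from the question (written without host or leading slash), and the `LOOKAHEAD` pattern and the `needs_slash` helper are hypothetical additions that were not part of the answer and have not been tested in IIS.

```python
import re

# Pattern from the answer: the last character must not be "/", and the
# fifth character from the end must not be "." (which skips ".html").
FINAL = re.compile(r"(.*[^\.]...[^\/]$)")

# Hypothetical alternative: a negative lookahead instead of a lookbehind.
LOOKAHEAD = re.compile(r"^(?!.*\.html$)(.*[^/])$")

def needs_slash(pattern, path):
    """Return True when the rule would fire and append a trailing slash."""
    return pattern.search(path) is not None

samples = [
    "it/products/brand/name",
    "it/products/brand/name/",
    "it/products/brand/name/code.html",
    "it",
]
for path in samples:
    print(path, needs_slash(FINAL, path), needs_slash(LOOKAHEAD, path))
```

The fixed-width trick also skips every other four-character extension (for example .jpeg) and never fires for paths shorter than five characters such as `it`, which is presumably the trade-off behind "enough for me".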

Related

IIS Rewrite - redirect site to new domain capturing the language embedded in the URL

I am trying to write a regex to redirect the URL to a new domain. I wrote IIS Rewrite rule for this:
<rule name="Redirect to new domain" stopProcessing="true">
<match url="^(.*)" />
<conditions>
<add input="{HTTP_HOST}" pattern="^(www\.)?my-www-en\.sites\.company\.net(\/([a-zA-Z]{2,3}-[a-zA-Z]{2,3}|en)\/?)?(.*$)" />
</conditions>
<action type="Redirect" redirectType="Permanent" url="https://my-new-domain.com/en-us/{C:4}" appendQueryString="true" />
</rule>
It works fine when the language is not part of the initial URL. However, some of the pages have the language added after the domain, which results in the language appearing twice in the final URL.
So basically I would like to redirect things like:
my-www-en.sites.company.net/some-page/another/page/
www.my-www-en.sites.company.net/some-page/another/page/
my-www-en.sites.company.net/de-de/some-page/another/page/
www.my-www-en.sites.company.net/de-de/some-page/another/page/
my-www-en.sites.company.net/en/some-page/another/page/
to redirect to:
https://my-new-domain.com/en-us/some-page/another/page/
My current regex does not capture these groups correctly (even though it does when I test the pattern in the IIS rewrite tool) and I struggle to make it work. Right now everything gets redirected to the homepage instead of to the specific pages. Could you please help?
Please try this rule. The regular expressions can match all urls above.
<rule name="test">
<match url=".*" />
<conditions>
<add input="{HTTP_HOST}" pattern="^(www\.)?my-www-en\.sites\.company\.net$" />
<add input="{REQUEST_URI}" pattern="(/.*)?(/some-page/another/page/)" />
</conditions>
<action type="Rewrite" url="https://my-new-domain.com/en-us{C:2}" />
</rule>
You can change the action type from Rewrite to Redirect.
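One likely reason the original rule always lands on the homepage is that {HTTP_HOST} holds only the Host header, so a condition pattern that also describes the path never captures anything for it. The sketch below uses Python's `re` module to try out a path-only pattern that drops an optional language prefix. The `LANG_PREFIX` pattern and the `redirect_target` helper are illustrative assumptions, not part of the answer above, and the IIS regex engine may differ in details.

```python
import re

# Hypothetical path-only pattern: an optional "xx-yy/" or "en/" prefix,
# followed by the rest of the path, which is captured in group 1.
LANG_PREFIX = re.compile(r"^(?:(?:[a-zA-Z]{2,3}-[a-zA-Z]{2,3}|en)/)?(.*)$")

def redirect_target(path):
    """Build the new URL from a request path given without a leading slash."""
    rest = LANG_PREFIX.match(path).group(1)
    return "https://my-new-domain.com/en-us/" + rest

for path in ("some-page/another/page/",
             "de-de/some-page/another/page/",
             "en/some-page/another/page/"):
    print(path, "->", redirect_target(path))
```

A first path segment that merely looks like a language code (for example `my-app/`) would also be stripped by this pattern, so a real rule might need an explicit list of languages.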

IIS URL Rewrite: Add trailing slash in url only once

I have a rule in IIS to append a slash at the end of the URL (if there is none). It is working fine, but in my case I only need it the first time. I have a reverse proxy on IIS that forwards the request to another server, and with this rule it appends the slash all the time.
How can I modify the rule to append the slash only when nothing follows a keyword like 'myapp', so that it appends the slash for a URL like http://myserver/myapp?
<rule name="AddTrailingSlashRule1" stopProcessing="true">
<match url="(.*[^/])$" />
<conditions>
<add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
</conditions>
<action type="Redirect" url="{R:1}/" />
</rule>
I tried changing the regular expression in url="(.*myapp[^/])$" but it does not work.
As far as I know, [^/] means "one character that is not '/'". Your pattern .*myapp[^/] therefore requires an extra character after myapp, so it does not match a URL that ends with myapp itself.
The right regex is .*myapp$.
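The difference between the two patterns can be checked quickly with Python's `re` module. This is a sketch only: the sample paths are made up for illustration, and IIS evaluates the pattern with its own engine.

```python
import re

# Attempt from the question: needs one extra non-slash character after "myapp".
ATTEMPT = re.compile(r"(.*myapp[^/])$")
# Suggested pattern: the path must end exactly with "myapp".
SUGGESTED = re.compile(r".*myapp$")

for path in ("myapp", "myapp/", "myapp/page", "myappx"):
    print(path, bool(ATTEMPT.search(path)), bool(SUGGESTED.search(path)))
```

Because `.*myapp$` has no capture group, the redirect action would presumably use {R:0} (the whole match) instead of {R:1}.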

Redirect bookmarked URLS via IIS Rewrite Module is not working properly even though URL pattern matches in testing

I've read many forum threads and tried whatever I could. My outbound rule works for rewriting the URL for SEO purposes, but my redirect rule, which is meant to handle the old URLs saved in our users' bookmarks, does not work.
I am using IIS 10.0.
The URL that needs changing:
http://agmodel.com/files/content/insights/publishing/e_clouds.pdf
To:
http://agmodel.com/assets/content/insights/publishing/e_clouds.pdf
So only thing I am changing is the string "files" to "assets".
Here is what I've tried:
Attempt 1:
<rule name="Redirect" stopProcessing="true">
<match url="(https?:\/\/[^\/]+)\/" />
<conditions trackAllCaptures="true">
<add input="{QUERY_STRING}" pattern="(https?:\/\/[^\/]+)\/files\/(.*)" />
<add input="{REQUEST_URI}" pattern="^/assets" negate="true" />
</conditions>
<action type="Redirect" url="{R:1}/assets/{C:2}" appendQueryString="false" redirectType="Found" />
</rule>
I tried to make sure that the first group is always the domain and that the second part is "files".
Attempt 2:
<rule name="assets-to-files" stopProcessing="true">
<match url="(https?:\/\/[^\/]+)\/files\/(.*)" />
<action type="Redirect" url="{R:1}/assets/{C:1}" appendQueryString="false" logRewrittenUrl="true" />
<conditions>
<add input="{QUERY_STRING}" pattern="\/files\/(.*)" />
</conditions>
</rule>
Whenever I test whether the bookmarked old URL changes to the new one, it does not work, even though the pattern gets a green light during pattern match testing in IIS 10.
What am I doing wrong here?
You may use a very simple rule here:
<rule name="assets-to-files">
<match url="^files/(.*)" />
<action type="Rewrite" url="assets/{R:1}" />
</rule>
The URL you want to match is http://agmodel.com/files/content/insights/publishing/e_clouds.pdf. The url attribute in match node will receive files/content/insights/publishing/e_clouds.pdf as input, so you want
^files/(.*)
It will match files/ at the start of the string and then will capture into {R:1} any 0 or more chars other than newline.
In the action node url attribute, all you need is to specify the assets/ new path and append what you captured into {R:1}.
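A quick way to convince yourself of this is to run both patterns against the path-only input in Python's `re` module. This is a sketch under the assumption stated above (the match input is the path without scheme, host, or leading slash); the `rewrite` helper is hypothetical.

```python
import re

# Pattern from the answer, applied to the path-only input.
RULE = re.compile(r"^files/(.*)")
# Pattern from the question's second attempt, which expects a full URL.
ATTEMPT = re.compile(r"(https?:\/\/[^\/]+)\/files\/(.*)")

def rewrite(path):
    """Return the rewritten path, or the input unchanged if the rule does not match."""
    m = RULE.search(path)
    return "assets/" + m.group(1) if m else path

path = "files/content/insights/publishing/e_clouds.pdf"
print(rewrite(path))         # the path with "files/" replaced by "assets/"
print(ATTEMPT.search(path))  # no match: the input has no scheme or host
```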

Regex to match all https URLs except a certain path

I need a regex that will match all https URLs except for a certain path.
e.g.
Match
https://www.domain.com/blog
https://www.domain.com
Do Not Match
https://www.domain.com/forms/*
This is what I have so far:
<rule name="Redirect from HTTPS to HTTP excluding /forms" enabled="true" stopProcessing="true">
<match url=".*" />
<conditions>
<add input="{URL}" pattern="^https://[^/]+(/(?!(forms/|forms$)).*)?$" />
</conditions>
<action type="Redirect" url="http://{HTTP_HOST}/{R:0}" redirectType="Permanent" />
</rule>
But it doesn't work
The way the redirect module works, you should simply use:
<rule name="Redirect from HTTPS to HTTP excluding /forms" stopProcessing="true">
<match url="^forms/?" negate="true" />
<conditions>
<add input="{HTTPS}" pattern="^ON$" />
</conditions>
<action type="Redirect" url="http://{HTTP_HOST}/{R:0}" />
</rule>
The rule will trigger the redirect to HTTP only if the request was HTTPS and if the path wasn't starting with forms/ or forms (using the negate="true" option).
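The decision this rule makes can be modelled in a few lines of Python. This is a rough sketch, not IIS behaviour verified in detail: `should_redirect` is a hypothetical helper, the `STRICT` pattern is an assumption added for comparison, and matching is made case-insensitive on the assumption that IIS patterns ignore case by default.

```python
import re

# Pattern from the rule; with negate="true" the rule fires when it does NOT match.
FORMS = re.compile(r"^forms/?", re.IGNORECASE)
# Hypothetical stricter variant: "forms" must be a whole path segment.
STRICT = re.compile(r"^forms(/|$)", re.IGNORECASE)

def should_redirect(https, path, pattern=FORMS):
    """Mimic the rule: HTTPS must be ON and the path must not match the pattern."""
    return https.upper() == "ON" and pattern.search(path) is None

for path in ("blog", "", "forms", "forms/contact", "forms_instructions"):
    print(repr(path), should_redirect("ON", path), should_redirect("ON", path, STRICT))
```

Note that `^forms/?` is a prefix test, so a path such as `forms_instructions` is excluded from the redirect as well; the stricter variant only excludes `forms` and anything below `forms/`.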
You could also add a condition for the host to match www.example.com, as follows (note the escaped dots in the pattern):
<rule name="Redirect from HTTPS to HTTP excluding /forms" stopProcessing="true">
<match url="^forms/?" negate="true" />
<conditions>
<add input="{HTTPS}" pattern="^ON$" />
<add input="{HTTP_HOST}" pattern="^www\.example\.com$" />
</conditions>
<action type="Redirect" url="http://{HTTP_HOST}/{R:0}" />
</rule>
Does this give you the behavior you're looking for?
https?://[^/]+($|/(?!forms)/?.*$)
After the www.domain.com bit, it's looking for either the end of the string, or for a slash and then something that ISN'T forms.
I came up with the following pattern: ^https://[^/]+(/(?!forms/|forms$).*)?$
Explanation:
^ : match the beginning of the string
https:// : match https://
[^/]+ : match anything except a forward slash, one or more times
( : start matching group 1
/ : match /
(?! : negative lookahead
forms/ : check that forms/ does not follow
| : or
forms$ : check that forms does not follow at the end of the string
) : end negative lookahead
.* : match everything zero or more times
) : end matching group 1
? : make the previous token optional
$ : match the end of the string
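The breakdown above can be checked against the URLs from the question with Python's `re` module. This sketch uses the `forms` path from the question and treats the input as a full URL string, which is an assumption about how the pattern would be applied; the `forms_instructions` sample is made up for illustration.

```python
import re

# Negative lookahead: after the host, the path may not be "forms" or start with "forms/".
PATTERN = re.compile(r"^https://[^/]+(/(?!forms/|forms$).*)?$")

urls = [
    "https://www.domain.com/blog",
    "https://www.domain.com",
    "https://www.domain.com/forms",
    "https://www.domain.com/forms/contact",
    "https://www.domain.com/forms_instructions",
]
for url in urls:
    print(url, bool(PATTERN.match(url)))
```

Only the two URLs at or below /forms are rejected; a path that merely starts with the same letters, such as /forms_instructions, still matches.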
I see two issues in the posted pattern http://[^/]+($|/(?!forms)/?.*$)
It misses redirecting URLs such as https://domain.com/forms_instructions, since the pattern fails to match those also.
I believe you have http and https reversed between the pattern and the URL. The pattern should have https and the URL http.
Perhaps this will work as you intend:
<rule name="Redirect from HTTPS to HTTP excluding /forms" enabled="true" stopProcessing="true">
<match url="^https://[^/]+(/(?!(forms/|forms$)).*)?$" />
<action type="Redirect" url="http://{HTTP_HOST}{R:1}" redirectType="Permanent" />
</rule>
Edit: I've moved the pattern to the match tag itself, since matching everything with .* and then using an additional condition seems unnecessary. I've also changed the redirection URL to use the part of the input URL captured by the parentheses in the match.
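One caveat worth testing: another answer in this collection states that the url attribute of the match node receives only the path, without scheme, host, or leading slash. If that holds here, a pattern that begins with https:// can never match there. The sketch below compares the pattern above with a hypothetical path-only variant (`PATH_ONLY` is an assumption, not something from the answers) using Python's `re` module.

```python
import re

# Pattern from the rule above, which expects a full URL.
FULL_URL = re.compile(r"^https://[^/]+(/(?!(forms/|forms$)).*)?$")
# Hypothetical path-only variant: reject "forms" and anything below "forms/".
PATH_ONLY = re.compile(r"^(?!forms(/|$)).*$")

for path in ("blog", "", "forms", "forms/contact", "forms_instructions"):
    print(repr(path), bool(FULL_URL.search(path)), bool(PATH_ONLY.search(path)))
```

With a path-only pattern, the HTTPS check would have to come from a separate condition such as the {HTTPS} one shown in the first answer.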

IIS7.5 URL Rewrite Regex matching when it shouldn't

I have split some pages between subdomains and want to do a URL rewrite to different pages on different subdomains in certain cases. Everything is a rewrite rule except for the final two rules in the file. Those last two rules determine which subdomain the rewritten path is routed to.
The way I am doing it is if I prepend the path with an underscore (_) then it stays on subdomain A. If I prepend the path with a tilde (~) then it is redirected to subdomain B.
So I have this rule:
<rule name="Login rule" stopProcessing="false">
<match url="(.*?)/?old-path/Login\.aspx$" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false">
<add input="{HTTP_METHOD}" pattern="GET" />
</conditions>
<action type="Rewrite" url="~new-path/login.aspx" />
</rule>
Please notice there is an aspx at the end of the URL. Processing continues, and I have a generic rewrite rule at the end of the list, right before the redirect ones. This is to remove all ASPX extensions on subdomain A (www), but I want to leave the ASPX extension for subdomain B. (Please don't suggest removing the extension on the second subdomain. Thanks :)
<rule name="Remove ASPX" stopProcessing="false">
<match url="^([^www\.]+)\.aspx$" />
<conditions>
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
<add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
</conditions>
<action type="Rewrite" url="_{R:1}" />
</rule>
The problem is that this won't work because all the URLs have www at the beginning. I am not that good with regex, but I am guessing I need to apply this rule to all URLs that have a tilde in them. I tried this, but it's not really working either:
<match url="^_+\.aspx$" />
Basically I want this rule to ignore URLs that I have rewritten to have a ~ in them, but remove the ASPX if I placed the _ at the start of the path.
Any suggestions?
If I understood your problem correctly, you have the URL "~new-path/login.aspx" and you want to redirect it to "~new-path/login", right?
Then your regex should look like this:
^(.*~.*)\.aspx$
Note: "www" is part of the domain name and is not included in the matching.
So if your full URL is "http://www.mysite.com/~new-path/login.aspx", only the "~new-path/login.aspx" piece takes part in the regex matching.
The back-reference {R:1} will then contain the value of the first group (the parentheses): "~new-path/login"
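The capture can be reproduced with Python's `re` module. This is only a sketch: the `NO_TILDE` pattern is a hypothetical mirror image for the case described in the question (strip .aspx only when the path contains no tilde), `first_group` is an illustrative helper, and `products/list.aspx` is a made-up sample path.

```python
import re

# Pattern from the answer: the path must contain "~" and end with ".aspx".
TILDE = re.compile(r"^(.*~.*)\.aspx$")
# Hypothetical opposite: match ".aspx" paths that contain no "~" anywhere.
NO_TILDE = re.compile(r"^(?!.*~)(.*)\.aspx$")

def first_group(pattern, path):
    """Return what {R:1} would hold, or None if the pattern does not match."""
    m = pattern.search(path)
    return m.group(1) if m else None

for path in ("~new-path/login.aspx", "products/list.aspx"):
    print(path, first_group(TILDE, path), first_group(NO_TILDE, path))
```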